
TinyRL Documentation

Welcome to the TinyRL documentation! This site provides comprehensive guides and an API reference for the TinyRL deep reinforcement learning framework.

What is TinyRL?

TinyRL is a lightweight, header-only C++17 framework designed for real-time reinforcement learning on microcontrollers and embedded systems. It consists of two primary components:

  • Autograd Core — Tensors, reverse-mode automatic differentiation, neural network layers, and optimizers
  • Stream-X Module — Streaming RL algorithms: StreamAC (Actor-Critic), StreamQ (Q-learning), and StreamSARSA

The overall project goals are clarity, a minimal footprint, and suitability for real-time learning on embedded hardware.

Quick Navigation

  • Getting Started
  • Core Components
  • Advanced Features
  • Development



Overview

Core Design Philosophy

TinyRL is built around these core principles:

  • Efficiency — Cache-friendly algorithms and zero-copy operations
  • Safety — RAII design and strong type checking
  • Flexibility — Modular architecture for easy extension
  • Performance — Optimized for both training and inference
  • Educational — Clean, readable code for learning

Technical Highlights

// Reshape (creates a reshaped copy, not a view)
ag::Tensor reshaped = tensor.reshape({batch_size, feature_dim, 1, 1});

// Cache-efficient matrix multiplication
auto result = a.matmul(b);  // Uses tiled algorithm

// Backward pass (free graph after use)
loss.backward();
loss.clear_graph();

// Dynamic computational graphs
ag::draw_graph(loss, "computation_graph.dot");

Performance Optimizations

  • Block-based matrix operations for cache efficiency
  • SIMD-ready data structures for vectorized operations
  • Smart memory reuse to minimize allocations
  • Efficient broadcasting implementation for shape compatibility
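To make the first bullet concrete, here is a minimal sketch of block-based (tiled) matrix multiplication in plain C++ — illustrative only, not TinyRL's actual implementation, whose tile size and loop order may differ. Processing `BS × BS` tiles keeps the working set of `A` and `B` resident in cache:

```cpp
#include <vector>
#include <algorithm>

// Multiply an m x k matrix A by a k x n matrix B (row-major), accumulating
// one BS x BS tile pair at a time so each tile stays cache-resident.
std::vector<double> blocked_matmul(const std::vector<double>& A,
                                   const std::vector<double>& B,
                                   int m, int k, int n, int BS = 32) {
    std::vector<double> C(m * n, 0.0);
    for (int i0 = 0; i0 < m; i0 += BS)
        for (int p0 = 0; p0 < k; p0 += BS)
            for (int j0 = 0; j0 < n; j0 += BS)
                for (int i = i0; i < std::min(i0 + BS, m); ++i)
                    for (int p = p0; p < std::min(p0 + BS, k); ++p) {
                        double a = A[i * k + p];  // reused across the j loop
                        for (int j = j0; j < std::min(j0 + BS, n); ++j)
                            C[i * n + j] += a * B[p * n + j];
                    }
    return C;
}
```

The result is numerically identical to the naive triple loop; only the traversal order changes.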

Features

🔧 Automatic Differentiation Engine

  • Dynamic computational graph construction with automatic memory management
  • Reverse-mode automatic differentiation for efficient gradient computation
  • Efficient memory management with shared pointers and RAII
  • Graph visualization tools for debugging and understanding
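The idea behind reverse-mode differentiation can be shown with a toy tape over scalars — a teaching sketch, far simpler than TinyRL's tensor-valued graph. Each node records its parents and the local partial derivatives; one backward sweep over the tape propagates adjoints from the output to the leaves:

```cpp
#include <vector>

// Minimal tape-based reverse-mode AD over scalars.
struct Tape {
    struct Node { int p1, p2; double d1, d2; };  // parents + local partials
    std::vector<Node> nodes;
    std::vector<double> vals;

    int leaf(double v) {
        nodes.push_back({-1, -1, 0.0, 0.0});
        vals.push_back(v);
        return static_cast<int>(vals.size()) - 1;
    }
    int add(int a, int b) {            // d(a+b)/da = 1, d(a+b)/db = 1
        nodes.push_back({a, b, 1.0, 1.0});
        vals.push_back(vals[a] + vals[b]);
        return static_cast<int>(vals.size()) - 1;
    }
    int mul(int a, int b) {            // d(a*b)/da = b, d(a*b)/db = a
        nodes.push_back({a, b, vals[b], vals[a]});
        vals.push_back(vals[a] * vals[b]);
        return static_cast<int>(vals.size()) - 1;
    }
    // Reverse sweep: seed the output adjoint with 1, then accumulate.
    std::vector<double> grad(int out) const {
        std::vector<double> g(vals.size(), 0.0);
        g[out] = 1.0;
        for (int i = out; i >= 0; --i) {
            if (nodes[i].p1 >= 0) g[nodes[i].p1] += nodes[i].d1 * g[i];
            if (nodes[i].p2 >= 0) g[nodes[i].p2] += nodes[i].d2 * g[i];
        }
        return g;
    }
};
```

For z = x·y + x with x = 2, y = 3, the sweep yields dz/dx = y + 1 = 4 and dz/dy = x = 2 — the same quantities `loss.backward()` computes over tensors.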

📊 Tensor Operations

  • N-dimensional array support (focus on 2D and 4D for efficiency)
  • Hardware-optimized matrix multiplication with tiled algorithms
  • Efficient broadcasting implementation for shape compatibility
  • Comprehensive math operations suite (element-wise, reductions, etc.)
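Broadcasting lets shapes like (1, n) combine with (m, n) by virtually repeating the smaller operand. A plain-C++ sketch of the row-vector case (illustrative; TinyRL's general rule covers more shape combinations):

```cpp
#include <vector>
#include <cstddef>

// Add a length-n bias row to every row of an m x n row-major matrix,
// mimicking the broadcast (1, n) + (m, n) -> (m, n).
std::vector<double> broadcast_add(const std::vector<double>& mat,
                                  const std::vector<double>& row,
                                  std::size_t m, std::size_t n) {
    std::vector<double> out(mat);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < n; ++j)
            out[i * n + j] += row[j];
    return out;
}
```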

🧠 Neural Network Components

  • Basic layers: Linear, Conv2D, LayerNorm with efficient fused operations
  • Activation functions: ReLU, LeakyReLU, Tanh, Softmax, Softplus
  • Sequential model container for easy layer composition
  • Custom layer support for extensibility

🎯 Optimizers

  • Stochastic Gradient Descent (SGD) - Core optimizer with ESP32 optimizations
  • RMSProp - Adaptive learning rates with moving average
  • Overshooting-bounded Gradient Descent (ObGD) - For online streaming RL
  • User-defined learning rate decay (examples provided)
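One common user-defined decay schedule is exponential step decay; the sketch below is a generic example, not one of the framework's shipped helpers:

```cpp
#include <cmath>

// Exponential step decay: lr_t = lr0 * decay^(t / step_size).
// With decay = 0.5 and step_size = 100, the rate halves every 100 updates.
double decayed_lr(double lr0, double decay, int step_size, int t) {
    return lr0 * std::pow(decay, static_cast<double>(t) / step_size);
}
```

At each training step you would recompute the rate and hand it to the optimizer; the exact hook (a setter or reconstructing the optimizer) depends on the optimizer's API.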

🤖 Reinforcement Learning

  • Actor-Critic networks for policy gradient methods
  • Data normalization utilities for stable training
  • Reward scaling and advantage estimation
  • Streaming Deep RL algorithms for real-time learning
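Advantage estimation in its simplest one-step TD form can be written down directly. This sketch takes critic values as plain numbers; in the framework they would come from the critic network, and the streaming algorithms may use richer estimators (e.g. eligibility traces):

```cpp
// One-step TD advantage: A(s, a) = r + gamma * V(s') * (1 - done) - V(s).
// The bootstrap term is dropped on terminal transitions (done == true).
double td_advantage(double r, double gamma, double v_s, double v_next, bool done) {
    double bootstrap = done ? 0.0 : gamma * v_next;
    return r + bootstrap - v_s;
}
```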

🛠️ Developer Tools

  • Computational graph visualization for debugging
  • Gradient checking utilities for validation
  • Comprehensive test suite with edge case coverage
  • Performance benchmarks for optimization
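Gradient checking typically compares analytic gradients against central finite differences. A generic sketch of the technique (the framework's own utilities may differ in interface):

```cpp
#include <functional>
#include <vector>
#include <cmath>
#include <cstddef>

// Central-difference check: for each parameter, compare the analytic
// gradient against (f(x + h) - f(x - h)) / (2h), accurate to O(h^2).
bool check_gradients(const std::function<double(const std::vector<double>&)>& f,
                     std::vector<double> x,
                     const std::vector<double>& analytic,
                     double h = 1e-5, double tol = 1e-6) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        double orig = x[i];
        x[i] = orig + h; double fp = f(x);
        x[i] = orig - h; double fm = f(x);
        x[i] = orig;                       // restore the perturbed entry
        double numeric = (fp - fm) / (2.0 * h);
        if (std::fabs(numeric - analytic[i]) > tol) return false;
    }
    return true;
}
```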

Installation

C++ (Header-Only)

git clone https://github.com/mohmdelsayed/TinyRL.git
cd TinyRL

# Option 1: CMake build
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j

# Option 2: Direct inclusion (header-only)
# Just include the headers in your project

Python Bindings

Build from source (there is no PyPI package yet; the module is named autograd, distinct from the unrelated Python-only package of the same name):

./install.sh --with-bindings

The resulting autograd.so appears in examples/python/ and is installed into your active environment if permissions allow.

Quick Start

Tensor Operations

#include "autograd.h"

// Create tensors with automatic differentiation
ag::Tensor x = ag::Tensor(ag::Matrix::Random(2, 3), true, "x");
ag::Tensor y = ag::Tensor(ag::Matrix::Random(3, 2), true, "y");

// Perform operations
auto z = x.matmul(y);        // Matrix multiplication
auto w = ag::relu(z);        // Activation function
auto loss = ag::sum(w);      // Reduction

// Compute gradients
loss.backward();

Neural Networks

#include "layers.h"
#include "optimizer.h"

// Define a simple network
nn::Sequential model;
model.add(nn::Linear(784, 128));
model.add(nn::ReLU());
model.add(nn::Linear(128, 10));

// Training utilities
ag::SGD optimizer(0.01);
optimizer.add_parameters(model.layers());

// Single training step
ag::Tensor output = model.forward(input);
ag::Tensor loss = compute_loss(output, target);
optimizer.zero_grad();
loss.backward();
optimizer.step();

C++ Example

#include "autograd.h"
#include "layers.h"
#include "optimizer.h"

// Set random seed for reproducibility
ag::manual_seed(42);

// Create tensors
ag::Matrix X_data = ag::Matrix::Random(1, 3);
ag::Tensor X(X_data, false, "x");

// Define model using proper layers
auto model = nn::Linear(3, 2);

// Forward pass
ag::Tensor output = model.forward(X);
ag::Tensor loss = ag::sum(ag::pow(output, 2.0));

// Backward pass
loss.backward();

Python Example

import autograd
import numpy as np

# Create tensors
X = autograd.Tensor(np.random.rand(1, 3), requires_grad=False)

# Define model
model = autograd.Linear(3, 2)

# Forward pass
output = model.forward(X)
loss = autograd.sum(autograd.pow(output, 2.0))

# Backward pass
loss.backward()

Advanced Features

1. Dynamic Computational Graphs

// Graph visualization
ag::draw_graph(loss, "computation_graph.dot");

// Dynamic tensor shapes
ag::Tensor adaptive = model.forward(input);  // Shape adapts automatically

2. Memory Management

// Automatic resource cleanup
{
    ag::Tensor temp = heavy_computation();
    // Resources freed when temp goes out of scope
}

// Manual cleanup in training loops
loss.clear_graph();  // Free graph memory after backward()

3. Reinforcement Learning (Stream-X)

// Build with: ./install.sh --with-stream-x
#include "stream_x/stream_ac_continuous.h"

// Create agent (model is provided via set_model)
ContinuousStreamAC agent(11, 1.0f, 0.99f, 0.8f, 2.0f, 2.0f);

nn::Sequential actor_backbone;
actor_backbone.add(nn::Linear(11, 128));
actor_backbone.add(nn::ReLU());

nn::Sequential mu_head;
mu_head.add(nn::Linear(128, 3));

nn::Sequential std_head;
std_head.add(nn::Linear(128, 3));
std_head.add(nn::Softplus());

nn::Sequential critic;
critic.add(nn::Linear(11, 128));
critic.add(nn::ReLU());
critic.add(nn::Linear(128, 1));

agent.set_model(actor_backbone, mu_head, std_head, critic);

// Training step
ag::Matrix norm_s = agent.normalize_observation(state);
ag::Tensor s(norm_s, false);
ag::Tensor action = agent.sample_action(s);
ag::Float scaled_r = agent.scale_reward(reward, done);
ag::Matrix norm_sn = agent.normalize_observation(next_state);
ag::Tensor sn(norm_sn, false);
ag::Tensor r(ag::Matrix::Constant(1,1,scaled_r), false);
agent.update(s, action, r, sn, done);

Documentation Structure

  • Tensor Operations — Core tensor manipulation
  • Autograd — Automatic differentiation
  • Neural Networks — Layers and models
  • Optimizers — SGD, RMSProp, ObGD
  • Reinforcement Learning — Stream-X algorithms
  • Build Guide — Installation and configuration
  • ESP32 Guide — Embedded development
  • Python Bindings — Using from Python
  • API Reference — Complete API documentation
  • Examples — Practical code samples

Architecture Summary

TinyRL
├── Autograd Core (installed API)
│   ├── ag::Tensor        — Differentiable tensors
│   ├── ag::Matrix        — Underlying data storage
│   ├── nn::Linear        — Dense layers
│   ├── nn::Conv2D        — Convolutional layers
│   ├── nn::Sequential    — Model container
│   ├── ag::SGD           — Stochastic gradient descent
│   └── ag::RMSProp       — Adaptive learning rates
│
└── Stream-X Module (optional, -DAUTOGRAD_BUILD_STREAM_X=ON)
    ├── ContinuousStreamAC  — Actor-Critic (continuous actions)
    ├── DiscreteStreamAC    — Actor-Critic (discrete actions)
    ├── StreamQ             — Online Q-learning
    ├── StreamSARSA         — On-policy SARSA
    └── ObGD                — Overshooting-bounded optimizer

TinyRL uses modern C++ (C++17) to create dynamic computational graphs with automatic differentiation. For questions or contributions, see CONTRIBUTING.md.