Neural Networks

The nn namespace provides neural network building blocks: layers, activations, and containers. All layers integrate seamlessly with the autograd system for automatic gradient computation.

Quick Example

#include "layers.h"
#include "optimizer.h"

// Build a simple MLP
nn::Sequential model;
model.add(nn::Linear(784, 128));  // Input: 784 features
model.add(nn::ReLU());
model.add(nn::Linear(128, 10));   // Output: 10 classes

// Setup optimizer
ag::SGD optimizer(0.01f);
optimizer.add_parameters(model.layers());

// Training step
ag::Tensor output = model.forward(input);
ag::Tensor loss = compute_loss(output, target);
optimizer.zero_grad();
loss.backward();
optimizer.step();

Layer Types

Linear (Fully Connected)

Implements: \(y = xW + b\)

// Standard initialization (LeCun uniform)
nn::Linear(int input_dim, int output_dim);

// Sparse initialization (fraction of weights set to zero)
nn::Linear(int input_dim, int output_dim, float sparsity);

Example:

nn::Linear fc(784, 128);  // 784 inputs → 128 outputs
auto y = fc.forward(x);   // x: (batch, 784) → y: (batch, 128)
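The page names LeCun uniform as the default weight initialization but does not spell it out. A common formulation draws each weight from \(U(-\sqrt{3/\text{fan\_in}}, \sqrt{3/\text{fan\_in}})\); the sketch below is standalone standard C++ and the bound formula is an assumption about the scheme, not taken from the library source:

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Sketch of LeCun-style uniform initialization: bound = sqrt(3 / fan_in).
// The exact bound used by nn::Linear is an assumption here.
std::vector<float> lecun_uniform(int fan_in, int fan_out, unsigned seed = 42) {
    const float bound = std::sqrt(3.0f / static_cast<float>(fan_in));
    std::mt19937 gen(seed);
    std::uniform_real_distribution<float> dist(-bound, bound);
    std::vector<float> w(static_cast<std::size_t>(fan_in) * fan_out);
    for (float& v : w) v = dist(gen);
    return w;
}
```

Scaling the bound by fan-in keeps the variance of pre-activations roughly constant across layers, which is why the input dimension (784 above) drives the bound.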

Conv2D (2D Convolution)

For image data with shape (N, C, H, W).

nn::Conv2D(int in_channels, int out_channels, int kernel_size, 
           int stride = 1, int padding = 0);

Example:

nn::Conv2D conv(3, 32, 3, 1, 1);  // 3→32 channels, 3×3 kernel, stride=1, pad=1
// Input: (N, 3, 32, 32) → Output: (N, 32, 32, 32)
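The output spatial size in the comment follows standard convolution arithmetic, \(H_{out} = \lfloor (H + 2p - k)/s \rfloor + 1\). A small helper for sanity-checking shapes (hypothetical, not part of the library):

```cpp
#include <cassert>

// Standard convolution output-size arithmetic:
// out = floor((in + 2*padding - kernel) / stride) + 1
int conv_out_size(int in, int kernel, int stride, int padding) {
    return (in + 2 * padding - kernel) / stride + 1;
}
```

For the example above: `conv_out_size(32, 3, 1, 1)` gives 32, matching the documented (N, 32, 32, 32) output.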

LayerNorm

Normalizes activations across all features (no learnable parameters).

nn::LayerNorm(float eps = 1e-5f);

Example:

model.add(nn::Linear(64, 64));
model.add(nn::LayerNorm());  // Stabilizes training
model.add(nn::ReLU());
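With no learnable parameters, layer normalization reduces to \(\hat{x} = (x - \mu)/\sqrt{\sigma^2 + \epsilon}\) computed per sample across the feature dimension. A plain-C++ sketch of that computation, independent of `nn::LayerNorm` (population variance and the documented `eps` default assumed):

```cpp
#include <cmath>
#include <vector>

// Normalize one sample's features to zero mean / unit variance.
// eps matches the documented default of 1e-5f.
std::vector<float> layer_norm(const std::vector<float>& x, float eps = 1e-5f) {
    float mean = 0.0f;
    for (float v : x) mean += v;
    mean /= x.size();
    float var = 0.0f;
    for (float v : x) var += (v - mean) * (v - mean);
    var /= x.size();
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        out[i] = (x[i] - mean) / std::sqrt(var + eps);
    return out;
}
```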

Flatten

Reshapes 4D tensors to 2D for transitioning from conv to linear layers.

nn::Flatten();  // (N, C, H, W) → (N, C*H*W)

Activation Functions

| Layer | Function | Description |
|---|---|---|
| `nn::ReLU` | \(\max(0, x)\) | Standard rectifier |
| `nn::LeakyReLU` | \(\max(\alpha x, x)\) | Leaky rectifier (default \(\alpha = 0.01\)) |
| `nn::Sigmoid` | \(\frac{1}{1+e^{-x}}\) | Squashes to (0, 1) |
| `nn::Tanh` | \(\tanh(x)\) | Squashes to (-1, 1) |
| `nn::Softmax` | \(\frac{e^{x_i}}{\sum_j e^{x_j}}\) | Probability distribution over all elements |
| `nn::Softplus` | \(\log(1 + e^x)\) | Smooth approximation of ReLU |
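Softmax as written can overflow for large inputs; implementations typically subtract the row maximum first, which leaves the result unchanged since \(e^{x_i - m}/\sum_j e^{x_j - m} = e^{x_i}/\sum_j e^{x_j}\). A standalone sketch of the stable form (whether `nn::Softmax` does this internally is not documented here):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Numerically stable softmax: shift by the max before exponentiating.
std::vector<float> softmax(const std::vector<float>& x) {
    float m = *std::max_element(x.begin(), x.end());
    std::vector<float> out(x.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = std::exp(x[i] - m);
        sum += out[i];
    }
    for (float& v : out) v /= sum;
    return out;
}
```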

All activations can be used as layers or free functions:

// As layer
model.add(nn::ReLU());

// As function
auto y = ag::relu(x);


Sequential Container

nn::Sequential chains layers into a single model.

nn::Sequential model;
model.add(nn::Linear(10, 64));
model.add(nn::LayerNorm());
model.add(nn::ReLU());
model.add(nn::Linear(64, 1));

// Forward pass through all layers
ag::Tensor output = model.forward(input);

// Get all trainable parameters
auto layers = model.layers();

CNN Example

nn::Sequential cnn;

// Convolutional layers
cnn.add(nn::Conv2D(1, 32, 3, 1, 1));   // (N,1,28,28) → (N,32,28,28)
cnn.add(nn::ReLU());
cnn.add(nn::Conv2D(32, 64, 3, 2, 1));  // (N,32,28,28) → (N,64,14,14)
cnn.add(nn::ReLU());

// Flatten and classify
cnn.add(nn::Flatten());                 // (N,64,14,14) → (N,12544)
cnn.add(nn::Linear(64*14*14, 10));      // (N,12544) → (N,10)

ag::Tensor logits = cnn.forward(images);
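A quick parameter count helps validate an architecture like the one above: a Conv2D layer has \(C_{out}(C_{in}k^2 + 1)\) parameters and a Linear layer has \(d_{in}d_{out} + d_{out}\), assuming each carries a bias term (this page does not state whether biases are present, so treat that as an assumption). Sketch:

```cpp
#include <cassert>

// Parameter counts, assuming every layer has a bias term.
long conv2d_params(int in_ch, int out_ch, int k) {
    return static_cast<long>(out_ch) * (static_cast<long>(in_ch) * k * k + 1);
}
long linear_params(int in_dim, int out_dim) {
    return static_cast<long>(in_dim) * out_dim + out_dim;
}
```

Under those assumptions, the final `nn::Linear(64*14*14, 10)` dominates the model at 125,450 parameters, which is typical when flattening directly into a classifier head.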

Layer Interface

All layers inherit from nn::Layer and implement:

| Method | Description |
|---|---|
| `forward(input)` | Compute the output tensor |
| `get_parameters()` | Return pointers to trainable tensors |
| `has_parameters()` | Returns `true` if the layer has weights |
| `zero_grad()` | Zero all parameter gradients |

Custom Layers

Create custom layers by inheriting from nn::Layer:

class ScaledLinear : public nn::Layer {
public:
    ScaledLinear(int in_dim, int out_dim, float scale)
        : linear(in_dim, out_dim), scale_(scale) {}

    ag::Tensor forward(const ag::Tensor& x) override {
        return linear.forward(x) * scale_;
    }

    std::vector<ag::Tensor*> get_parameters() override {
        return linear.get_parameters();
    }

    bool has_parameters() const override { return true; }

private:
    nn::Linear linear;
    float scale_;
};

See Also