# Neural Networks

The `nn` namespace provides neural network building blocks: layers, activations, and containers. All layers integrate seamlessly with the autograd system for automatic gradient computation.
## Quick Example

```cpp
#include "layers.h"
#include "optimizer.h"

// Build a simple MLP
nn::Sequential model;
model.add(nn::Linear(784, 128)); // Input: 784 features
model.add(nn::ReLU());
model.add(nn::Linear(128, 10));  // Output: 10 classes

// Set up the optimizer
ag::SGD optimizer(0.01f);
optimizer.add_parameters(model.layers());

// Training step
ag::Tensor output = model.forward(input);
ag::Tensor loss = compute_loss(output, target);
optimizer.zero_grad();
loss.backward();
optimizer.step();
```
## Layer Types

### Linear (Fully Connected)
Implements: \(y = xW + b\)
```cpp
// Standard initialization (LeCun uniform)
nn::Linear(int input_dim, int output_dim);

// Sparse initialization (fraction of weights set to zero)
nn::Linear(int input_dim, int output_dim, float sparsity);
```
Example:
```cpp
nn::Linear fc(784, 128); // 784 inputs → 128 outputs
auto y = fc.forward(x);  // x: (batch, 784) → y: (batch, 128)
```
### Conv2D (2D Convolution)

For image data with shape `(N, C, H, W)`.
Example:
```cpp
nn::Conv2D conv(3, 32, 3, 1, 1); // 3→32 channels, 3×3 kernel, stride=1, pad=1
// Input: (N, 3, 32, 32) → Output: (N, 32, 32, 32)
```
### LayerNorm

Normalizes activations across all features (no learnable parameters).
Example:
```cpp
model.add(nn::Linear(64, 64));
model.add(nn::LayerNorm()); // Stabilizes training
model.add(nn::ReLU());
```
### Flatten

Reshapes 4D tensors `(N, C, H, W)` to 2D `(N, C*H*W)`, for transitioning from convolutional to linear layers.
## Activation Functions
| Layer | Function | Description |
|---|---|---|
| `nn::ReLU` | \(\max(0, x)\) | Standard rectifier |
| `nn::LeakyReLU` | \(\max(\alpha x, x)\) | Leaky rectifier (default \(\alpha = 0.01\)) |
| `nn::Sigmoid` | \(\frac{1}{1+e^{-x}}\) | Squashes to (0, 1) |
| `nn::Tanh` | \(\tanh(x)\) | Squashes to (-1, 1) |
| `nn::Softmax` | \(\frac{e^{x_i}}{\sum_j e^{x_j}}\) | Probability distribution over all elements |
| `nn::Softplus` | \(\log(1 + e^x)\) | Smooth approximation of ReLU |
All activations can be used either as layers or as free functions.
## Sequential Container

`nn::Sequential` chains layers into a single model.
```cpp
nn::Sequential model;
model.add(nn::Linear(10, 64));
model.add(nn::LayerNorm());
model.add(nn::ReLU());
model.add(nn::Linear(64, 1));

// Forward pass through all layers
ag::Tensor output = model.forward(input);

// Get all trainable parameters
auto layers = model.layers();
```
### CNN Example
```cpp
nn::Sequential cnn;

// Convolutional layers
cnn.add(nn::Conv2D(1, 32, 3, 1, 1));  // (N,1,28,28) → (N,32,28,28)
cnn.add(nn::ReLU());
cnn.add(nn::Conv2D(32, 64, 3, 2, 1)); // (N,32,28,28) → (N,64,14,14)
cnn.add(nn::ReLU());

// Flatten and classify
cnn.add(nn::Flatten());               // (N,64,14,14) → (N,12544)
cnn.add(nn::Linear(64*14*14, 10));    // (N,12544) → (N,10)

ag::Tensor logits = cnn.forward(images);
```
## Layer Interface

All layers inherit from `nn::Layer` and implement:
| Method | Description |
|---|---|
| `forward(input)` | Compute the output tensor |
| `get_parameters()` | Return pointers to trainable tensors |
| `has_parameters()` | Returns `true` if the layer has weights |
| `zero_grad()` | Zero all parameter gradients |
## Custom Layers

Create custom layers by inheriting from `nn::Layer`:
```cpp
class ScaledLinear : public nn::Layer {
public:
    ScaledLinear(int in_dim, int out_dim, float scale)
        : linear(in_dim, out_dim), scale_(scale) {}

    ag::Tensor forward(const ag::Tensor& x) override {
        // Delegate to the wrapped Linear, then rescale the output
        return linear.forward(x) * scale_;
    }

    std::vector<ag::Tensor*> get_parameters() override {
        return linear.get_parameters();
    }

    bool has_parameters() const override { return true; }

private:
    nn::Linear linear;
    float scale_;
};
```
## See Also
- Optimizers — Training with SGD, RMSProp
- Tensor Operations — Underlying tensor operations
- Examples — Complete training examples