A visual, mathematical guide to AI. From neurons to Transformers and Agents, explained through equations, interactive visualizations, and first principles.
Chapters
The weighted sum
The atomic unit of intelligence. Scalar operations, weights, biases, and activation functions like ReLU and Sigmoid.
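The weighted sum described above can be sketched in a few lines of plain Python; the weights and bias here are illustrative, not from the chapter:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs plus bias, then ReLU."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return max(0.0, z)  # ReLU clips negative pre-activations to zero

def sigmoid(z):
    """Squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))
```

ReLU and Sigmoid are interchangeable activation choices at this stage; the chapter explores when each is appropriate.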
The shape of data
Moving from scalars to matrices. Understanding shapes, broadcasting, and why GPUs love linear algebra.
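A minimal sketch of the two operations this chapter builds on, written with nested lists rather than a tensor library so the shapes stay explicit:

```python
def matmul(A, B):
    """(m x k) @ (k x n) matrix product over nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def add_broadcast(M, v):
    """Broadcast a length-n vector across every row of an m x n matrix,
    the way a bias vector is added to a batch of outputs."""
    return [[x + b for x, b in zip(row, v)] for row in M]
```

GPUs execute the inner multiply-adds of `matmul` in parallel, which is why linear algebra dominates deep learning workloads.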
Layers of abstraction
Connecting neurons into layers. The forward pass as a series of matrix transformations (MLP).
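The forward pass can be sketched as a chain of matrix transformations; the 2-2-1 weights below are hand-picked for illustration only:

```python
def relu(v):
    return [max(0.0, x) for x in v]

def layer(x, W, b):
    """One dense layer applied to a single input vector: W @ x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def mlp(x):
    # hypothetical 2-2-1 MLP: hidden layer with ReLU, then a linear output
    h = relu(layer(x, [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]))
    return layer(h, [[1.0, 2.0]], [0.1])
```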
Learning from mistakes
How machines learn. Loss functions, the chain rule, and visualizing gradient descent in 3D.
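Gradient descent in its simplest form, on a toy one-parameter loss L(w) = (w - 3)²; the learning rate and step count are arbitrary choices:

```python
def gradient_descent(lr=0.1, steps=100):
    """Minimize L(w) = (w - 3)^2 by repeatedly stepping against the gradient."""
    w = 0.0
    for _ in range(steps):
        grad = 2 * (w - 3)  # dL/dw via the chain rule
        w -= lr * grad      # step downhill
    return w
```

The same loop, generalized to millions of parameters, is what training a neural network amounts to.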
From scratch to 97%
Build and train a neural network from scratch using WebGPU compute shaders, entirely in your browser.
The token
Turning text into numbers. Vocabulary, tokenization strategies, and the lookup table.
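The lookup-table idea can be shown with a toy character-level tokenizer; real systems use subword strategies like BPE, but the encode/decode round trip is the same:

```python
def build_vocab(text):
    """Map each unique character to an integer id (a toy tokenizer)."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}

def encode(text, vocab):
    return [vocab[ch] for ch in text]

def decode(ids, vocab):
    inv = {i: ch for ch, i in vocab.items()}
    return "".join(inv[i] for i in ids)
```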
The attention
The mechanism that changed everything. Query, Key, Value matrices, causal masking, and the scaled dot-product.
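Causal scaled dot-product attention for a single head, in plain Python so each step is visible; Q, K, V are assumed to be already-projected nested lists:

```python
import math

def attention(Q, K, V):
    """Causal scaled dot-product attention: each position attends
    only to itself and earlier positions."""
    d = len(Q[0])
    out = []
    for i, q in enumerate(Q):
        # scores against keys up to position i (the causal mask)
        scores = [sum(a * b for a, b in zip(q, K[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        m = max(scores)                      # numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([sum(w * V[j][k] for j, w in enumerate(weights))
                    for k in range(len(V[0]))])
    return out
```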
The architecture
Building the Transformer block. LayerNorm, Residual connections, and Multi-Head mechanics.
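The residual wiring of a Transformer block can be sketched as follows; this uses pre-norm ordering and omits the learned scale/shift of LayerNorm for brevity:

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean, unit variance (learned gain/bias omitted)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def block(x, sublayer):
    """Pre-norm residual wiring: x + sublayer(LayerNorm(x)).
    In a real block, sublayer is attention or the MLP."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]
```

The residual connection guarantees the input passes through unchanged when the sublayer contributes nothing, which keeps deep stacks trainable.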
The model
Putting it all together. Positional encodings, stacking blocks, and the full GPT architecture.
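Sinusoidal positional encodings, one of the schemes this chapter covers, can be generated directly from the position and model dimension:

```python
import math

def positional_encoding(pos, d_model):
    """Sinusoidal position vector: sin/cos pairs at geometrically
    spaced frequencies, so each position gets a unique signature."""
    pe = []
    for i in range(0, d_model, 2):
        angle = pos / (10000 ** (i / d_model))
        pe.append(math.sin(angle))
        pe.append(math.cos(angle))
    return pe[:d_model]
```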
The training
Making the model useful. Pre-training objectives, Fine-tuning, and RLHF (Reinforcement Learning from Human Feedback).
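The pre-training objective reduces to a simple quantity per position: the cross-entropy of the model's predicted distribution against the actual next token. A minimal sketch:

```python
import math

def next_token_loss(probs, target):
    """Cross-entropy for one position: -log p(correct next token).
    Lower is better; a uniform guess over N tokens costs log(N)."""
    return -math.log(probs[target])
```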
The generation
From probabilities to text. Sampling strategies (Temperature, Top-k/p) and the KV Cache optimization.
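Temperature and top-k sampling can be sketched together; this is a toy sampler over raw logits, not any particular library's API:

```python
import math
import random

def sample(logits, temperature=1.0, top_k=None, rng=random):
    """Sample a token id: scale logits by temperature, optionally keep
    only the top-k, softmax, then draw from the distribution."""
    scaled = [l / temperature for l in logits]
    if top_k is not None:
        cutoff = sorted(scaled, reverse=True)[top_k - 1]
        scaled = [s if s >= cutoff else float("-inf") for s in scaled]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    r = rng.random()
    acc = 0.0
    for i, e in enumerate(exps):
        acc += e / total
        if r < acc:
            return i
    return len(logits) - 1
```

With `top_k=1` this degenerates to greedy decoding; raising the temperature flattens the distribution and increases diversity.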
The agent
Breaking the closed loop. The ReAct pattern, tool calling, and structured outputs.
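The ReAct loop's shape can be sketched with a stand-in for the model; here `llm` is any callable returning either an action tuple or a final answer, an interface invented for illustration:

```python
def react_loop(llm, tools, question, max_steps=5):
    """Toy ReAct loop. The hypothetical `llm` callable returns either
    ("act", tool_name, tool_input) or ("answer", text)."""
    transcript = [question]
    for _ in range(max_steps):
        step = llm(transcript)
        if step[0] == "answer":
            return step[1]
        _, name, arg = step
        observation = tools[name](arg)              # tool call
        transcript.append(f"Observation: {observation}")  # feed result back
    return None  # budget exhausted without an answer
```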
The system
Orchestration at scale. RAG (Retrieval Augmented Generation), vector databases, and multi-agent systems.
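The retrieval half of RAG reduces to nearest-neighbor search over embeddings; a minimal sketch with cosine similarity, assuming vectors are already embedded:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, store, k=1):
    """Return the k documents whose embeddings are most similar to the
    query. `store` is a list of (embedding, document) pairs; a vector
    database does this at scale with approximate indexes."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[0]),
                    reverse=True)
    return [doc for _, doc in ranked[:k]]
```

The retrieved documents are then stuffed into the model's context, which is the "augmented generation" half of RAG.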