Chapter 13
Summary
All of Ch01–Ch12 in one neural network diagram.
Deep learning diagram by chapter
As you complete each chapter, the diagram below fills in. This is the structure so far.
Chapter 01 Vector Dot Product
The left nodes X1, X2, X3 and the right nodes Y1, Y2, Y3 are connected by lines. Each right node is the dot product of the left nodes with a set of weights.
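As a minimal sketch, one right node can be computed in plain Python; the weight values below are made-up illustration numbers, not taken from the diagram.

    # Dot product of the input vector X with one weight vector (illustrative numbers).
    X = [1.0, 2.0, 3.0]
    w = [0.4, 0.3, 0.2]   # hypothetical weights for one output node
    y1 = sum(x_i * w_i for x_i, w_i in zip(X, w))
    print(y1)             # 1*0.4 + 2*0.3 + 3*0.2 = 1.6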
Chapter 02 Matrix Multiplication
The left nodes are one row of matrix A; each right node Y1–Y3 is the dot product of that row with one column of B. Together these dot products form the matrix product A·B.
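A small NumPy sketch of the same row-times-column idea; the values of A and B are assumptions chosen for illustration.

    import numpy as np

    # Each output entry is the dot product of a row of A with a column of B.
    A = np.array([[1.0, 2.0, 3.0]])          # one row of A (1x3)
    B = np.array([[0.4, 0.1, 0.0],
                  [0.3, 0.2, 0.1],
                  [0.2, 0.3, 0.2]])          # 3x3 matrix of made-up values
    Y = A @ B                                # same as row-by-column dot products
    print(Y)                                 # [[1.6 1.4 0.8]]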
Chapter 03 Linear Layer (Weights and Bias)
This block is a linear layer. The whole input is mapped to the next layer in one step as Y = W·X + b.
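A sketch of Y = W·X + b in NumPy; the shapes and numbers are assumptions for illustration.

    import numpy as np

    # Linear layer: Y = W @ X + b.
    W = np.array([[0.4, 0.3, 0.2],
                  [0.1, 0.2, 0.3],
                  [0.5, 0.0, 0.1]])   # 3x3 weights (made-up)
    b = np.array([0.1, 0.2, 0.3])     # one bias per output node
    X = np.array([1.0, 2.0, 3.0])
    Y = W @ X + b
    print(Y)                          # [1.7 1.6 1.1]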
Chapter 04 Activation (Nonlinear)
Representative activation functions, where the output Y changes nonlinearly with the input X (shown as a simplified 3-level quantized version).
Node values change nonlinearly as they pass through ReLU or σ (sigmoid). The output nodes Y1, Y2, Y3 are computed from those activated values.
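For reference, ReLU and the sigmoid σ can be written in a few lines; this is a sketch, not the book's exact code.

    import numpy as np

    def relu(x):
        return np.maximum(0.0, x)          # negative values become 0

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))    # squashes values into (0, 1)

    x = np.array([-2.0, 0.0, 2.0])
    print(relu(x))      # [0. 0. 2.]
    print(sigmoid(x))   # roughly [0.12 0.5 0.88]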
Chapter 05 Artificial Neuron (Weighted Sum and Activation)
Inside the dashed circle is one artificial neuron. The input (X) is multiplied by the weights and the bias is added (w·x + b), then ReLU is applied to give the output (Y).
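One neuron as code, a minimal sketch with made-up numbers:

    # One artificial neuron: weighted sum plus bias, then ReLU.
    x = [1.0, 2.0, 3.0]
    w = [0.4, -0.3, 0.2]
    b = -0.1
    z = sum(xi * wi for xi, wi in zip(x, w)) + b   # w·x + b
    y = max(0.0, z)                                # ReLU
    print(y)                                       # 0.4 - 0.6 + 0.6 - 0.1 = 0.3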
Chapter 06 Batch (Compute All at Once)
When we merge the inputs into one table (a batch), the output Y also comes out as one table, computed all at once.
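A sketch of the batch idea in NumPy: stack the examples as rows of one matrix and push them through the layer together. The values and shapes are illustrative assumptions.

    import numpy as np

    # A batch: each row of X is one input example; all rows go through the layer at once.
    X = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])       # 2 examples, 3 features each
    W = np.array([[0.4, 0.3, 0.2],
                  [0.1, 0.2, 0.3],
                  [0.5, 0.0, 0.1]])
    b = np.array([0.1, 0.2, 0.3])
    Y = X @ W.T + b                       # output is also one table: 2 rows, 3 columns
    print(Y.shape)                        # (2, 3)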
Chapter 07 Weight Connections
Circles are values; each line between layers is a weight (w). Multiply the inputs by the weights, sum them, and add the bias (b) to get the next layer Y.
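As a sketch, each line in the diagram can be read as one entry of the weight matrix; here W[i, j] is assumed to connect input node j to output node i.

    import numpy as np

    W = np.array([[0.4, 0.3, 0.2],
                  [0.1, 0.2, 0.3],
                  [0.5, 0.0, 0.1]])   # each entry is one line in the diagram
    X = np.array([1.0, 2.0, 3.0])
    b = np.array([0.1, 0.2, 0.3])
    # Output node 0 sums its three incoming lines, then adds its bias:
    y0 = W[0, 0] * X[0] + W[0, 1] * X[1] + W[0, 2] * X[2] + b[0]
    print(y0)                         # same as (W @ X + b)[0]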
Chapter 08 Hidden Layers (Invisible Layers)
We only see the input (X) and the output (Y). The layer in between is used only inside the network, so it is called the hidden layer.
input (visible) → H (hidden) → output (visible)
Values flow input → hidden → output. The hidden layer is an internal representation we don’t see.
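A short sketch of values flowing input → hidden → output; the weights are random and purely illustrative.

    import numpy as np

    # Input -> hidden -> output: the hidden values H exist only inside the network.
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # input (3) -> hidden (4)
    W2, b2 = rng.normal(size=(3, 4)), np.zeros(3)   # hidden (4) -> output (3)

    X = np.array([1.0, 2.0, 3.0])
    H = np.maximum(0.0, W1 @ X + b1)   # hidden layer (ReLU), not shown to the user
    Y = W2 @ H + b2                    # output layer
    print(Y)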
Chapter 09 Depth (Deep Network)
Deep = many hidden layers (middle steps). The “deep” in deep learning is this depth.
More steps mean a deeper network. Deeper networks can learn more refined patterns.
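A sketch of depth as a loop over stacked layers; the sizes and weights below are made-up.

    import numpy as np

    # Depth: stack several hidden layers; each extra layer is one more step input -> ... -> output.
    rng = np.random.default_rng(0)
    layers = [(rng.normal(size=(4, 4)) * 0.5, np.zeros(4)) for _ in range(3)]  # 3 hidden layers

    h = np.ones(4)                      # a made-up input of width 4
    for W, b in layers:
        h = np.maximum(0.0, W @ h + b)  # one step deeper per layer
    print(h)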
Chapter 10 Width (Number of Neurons per Layer)
The number of neurons in one layer is the width. Wider layers can handle more features at once.
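A sketch of how width shows up in the shapes; the choice of 8 neurons is an arbitrary example.

    import numpy as np

    # Width: the number of neurons in a layer sets the size of its weight matrix.
    n_in, n_hidden = 3, 8               # a wider hidden layer: 8 neurons
    rng = np.random.default_rng(0)
    W = rng.normal(size=(n_hidden, n_in))
    b = np.zeros(n_hidden)
    X = np.array([1.0, 2.0, 3.0])
    H = np.maximum(0.0, W @ X + b)
    print(H.shape)                      # (8,) -- one value per neuron in the layer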
Chapter 11 Softmax (Turn into Probabilities)
Softmax turns the output scores into probabilities: raise a base to each score, then divide by the sum over all output nodes so the results add up to 1. In the diagram's example, 3 raised to the 3rd power is 27 (3^3), and that node's probability is 27/31 (27 ÷ 31, the node's value divided by the total).
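The diagram appears to use powers of 3 to keep the arithmetic simple; the standard softmax uses e as the base. A minimal sketch of the standard form, with made-up scores:

    import numpy as np

    # Softmax: exponentiate each score, then divide by the sum of the exponentials.
    scores = np.array([3.0, 1.0, 0.0])        # made-up output scores
    exp_scores = np.exp(scores)
    probs = exp_scores / exp_scores.sum()
    print(probs, probs.sum())                 # three probabilities summing to 1 (up to rounding)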
Chapter 12 Gradient (Backpropagation)
Gradients flow backward through the network: Y → H → X.
By the last chapter you'll see the full picture: forward → loss → backward → update.
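A tiny chain-rule sketch of the backward pass for one neuron with a squared-error loss; the numbers are illustrative.

    # Gradient by the chain rule for a tiny network: y = relu(w*x + b), loss = (y - t)^2.
    x, w, b, t = 2.0, 0.5, 0.1, 3.0
    z = w * x + b                      # forward: 1.1
    y = max(0.0, z)                    # forward: 1.1
    loss = (y - t) ** 2                # forward: 3.61

    dloss_dy = 2 * (y - t)             # backward: -3.8
    dy_dz = 1.0 if z > 0 else 0.0      # ReLU passes the gradient only where z > 0
    dloss_dw = dloss_dy * dy_dz * x    # gradient w.r.t. the weight
    dloss_db = dloss_dy * dy_dz        # gradient w.r.t. the bias
    print(dloss_dw, dloss_db)          # about -7.6 and -3.8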
Summary
The diagram below collects everything from Ch01–Ch12 into one network: input X → hidden layers (A, B, C, D) → output Y, with weights (W), activation (ReLU, etc.), batch, and gradient (∇) shown.
Real training repeats forward pass (compute output) → loss → backward pass (gradients) → update weights. After this course you can follow that flow in the math.
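A minimal end-to-end sketch of that loop for a single linear layer; the data, targets, and learning rate are assumptions, not from the book.

    import numpy as np

    # Training loop: forward -> loss -> backward -> update, repeated.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(32, 3))           # 32 examples, 3 features
    T = X @ np.array([1.0, -2.0, 0.5])     # made-up targets from a known rule
    W = np.zeros(3)

    lr = 0.1
    for step in range(200):
        Y = X @ W                          # forward pass (compute output)
        loss = ((Y - T) ** 2).mean()       # loss
        grad = 2 * X.T @ (Y - T) / len(X)  # backward pass (gradient of the loss w.r.t. W)
        W -= lr * grad                     # update weights
    print(W)                               # approximately [1.0, -2.0, 0.5]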