Chapter 13

Summary

All of Ch01–Ch12 in one neural network diagram.

Deep learning diagram by chapter

As you complete each chapter, the diagram below fills in. This is the structure so far.

Chapter 01 Vector Dot Product

[Diagram: left nodes X1–X3 connected by weight lines to right nodes Y1–Y3; each right node shows a dot product result]

The left nodes X1, X2, X3 and the right nodes Y1, Y2, Y3 are connected by lines. Each right node is the dot product of the left values with a set of weights.
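
In code, a dot product is just multiply-and-add. A minimal Python sketch (the numbers are made up for illustration):

    # Dot product: multiply matching elements, then add them up.
    x = [1.0, 2.0, 3.0]      # inputs X1, X2, X3 (made-up values)
    w = [0.5, -1.0, 2.0]     # weights (made-up values)
    y = sum(xi * wi for xi, wi in zip(x, w))
    print(y)                 # 0.5 - 2.0 + 6.0 = 4.5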

Chapter 02 Matrix Multiplication

[Diagram: X1–X3 and Y1–Y3 connected by weight lines; together the dot products form the matrix product A·B]

The left side is one row of matrix A; the right nodes Y1–Y3 are its dot products with the columns of B. Together these dot products form the matrix product A·B.
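
A small NumPy sketch of the same idea (matrices invented for illustration): each entry of A·B is one row of A dotted with one column of B.

    import numpy as np

    A = np.array([[1, 2, 3]])      # one row of A (made-up values)
    B = np.array([[1, 0, 2],
                  [0, 1, 1],
                  [1, 1, 0]])      # B, read column by column (made-up values)

    # Entry (i, j) of the product is row i of A dotted with column j of B.
    Y = A @ B
    print(Y)                       # [[4 5 4]]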

Chapter 03 Linear Layer (Weights and Bias)

[Diagram: linear layer computing weight·input + bias, followed by ReLU, producing result Y]

This block is a linear layer. The whole input is mapped to the next layer at once as Y = W·X + b.
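
A minimal sketch of that one line, Y = W·X + b, with made-up numbers:

    import numpy as np

    W = np.array([[0.2, -0.5, 1.0],
                  [1.5,  0.3, 0.0],
                  [-1.0, 2.0, 0.5]])   # 3 outputs x 3 inputs (made-up)
    X = np.array([1.0, 2.0, 3.0])
    b = np.array([0.1, -0.2, 0.3])

    Y = W @ X + b                      # the whole layer in one line
    print(Y)                           # [2.3 1.9 4.8]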

Chapter 04 Activation (Nonlinear)

Representative activation functions where output Y changes nonlinearly with input X. (3-level quantized version)

[Plots: Y = Sigmoid(X) with ticks 0, 0.5, 1; Y = ReLU(X) with tick 0; Y = Tanh₃(X) with ticks -1, 0, 1]

Node values change nonlinearly as they pass through ReLU or the sigmoid (σ). The last-layer values Y1, Y2, Y3 come out of these activations.
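
The three curves above in their plain (unquantized) form, as a minimal sketch (inputs made up):

    import numpy as np

    x = np.array([-2.0, 0.0, 2.0])    # sample inputs (made-up)

    sigmoid = 1 / (1 + np.exp(-x))    # squashes into (0, 1)
    relu    = np.maximum(0, x)        # 0 for negatives, x for positives
    tanh    = np.tanh(x)              # squashes into (-1, 1)

    print(sigmoid, relu, tanh)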

Chapter 05 Artificial Neuron (Weighted Sum and Activation)

[Diagram: one neuron with inputs X1–X3, weights w1–w3 and bias B, computing w·x + b, then ReLU, producing output Y]

Inside the dashed circle is one artificial neuron. The inputs (X) are multiplied by weights and summed with the bias (w·x + b); ReLU then produces the output (Y).
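
The whole neuron is two lines of code. A sketch with made-up numbers:

    import numpy as np

    w = np.array([0.4, -0.6, 0.9])   # weights w1, w2, w3 (made-up)
    x = np.array([1.0, 2.0, 3.0])    # inputs X1, X2, X3 (made-up)
    b = 0.5                          # bias B

    z = w @ x + b                    # weighted sum: 0.4 - 1.2 + 2.7 + 0.5 = 2.4
    y = max(0.0, z)                  # ReLU
    print(y)                         # 2.4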

Chapter 06 Batch (Compute All at Once)

[Diagram: input table X (rows X1–X3, one column per sample: Sample 1–3) → same W, b applied in one computation → output table Y (rows Y1–Y3, Sample 1–3)]

When we merge the inputs into one table, the output Y also comes out as one table, computed all at once.
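
A sketch of the batch trick (values made up): stack the samples as columns of X, and one matrix multiply handles all of them.

    import numpy as np

    W = np.array([[0.2, -0.5, 1.0],
                  [1.5,  0.3, 0.0],
                  [-1.0, 2.0, 0.5]])
    b = np.array([[0.1], [-0.2], [0.3]])

    X = np.array([[1.0, 4.0, 7.0],     # rows X1..X3,
                  [2.0, 5.0, 8.0],     # columns Sample 1..3
                  [3.0, 6.0, 9.0]])

    Y = W @ X + b                      # same W, b applied to every sample at once
    print(Y.shape)                     # (3, 3): rows Y1..Y3, one column per sample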

Chapter 07 Weight Connections

Each line between layers is a weight (w). Multiply input by weights, add them, then add bias (b) to get the next layer Y.

[Diagram: nine weight lines (w) connecting X1–X3 to Y1–Y3, plus bias (b)]

Circles are values, lines are weights (w). Add bias (b) to the weighted sum to get the next layer Y.
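
Counting those lines in code (shapes only, random values for illustration): a 3-to-3 connection needs one weight per line, so a 3×3 matrix, plus one bias per output node.

    import numpy as np

    n_in, n_out = 3, 3
    rng = np.random.default_rng(0)
    W = rng.standard_normal((n_out, n_in))   # W[i, j] is the line from input j to output i
    b = np.zeros(n_out)                      # one bias per output node

    print(W.size)                            # 9 lines (weights) between the layers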

Chapter 08 Hidden Layers (Invisible Layers)

We only see the input (X) and the output (Y). The layer in between is used only inside the network, so it is called the hidden layer.

[Diagram: visible input X1–X3 → hidden layer H1–H3 (W₁·X + b₁ → ReLU, not visible from outside) → visible output Y1–Y3 (W₂·H + b₂ → ReLU)]

Values flow input → hidden → output. The hidden layer is an internal representation we don’t see.
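
The same picture as a sketch (random weights, purely illustrative): the hidden value H exists only between the two lines of code.

    import numpy as np

    def relu(z):
        return np.maximum(0, z)

    rng = np.random.default_rng(0)
    W1, b1 = rng.standard_normal((3, 3)), np.zeros(3)
    W2, b2 = rng.standard_normal((3, 3)), np.zeros(3)

    X = np.array([1.0, 2.0, 3.0])
    H = relu(W1 @ X + b1)    # hidden layer: used only inside the network
    Y = relu(W2 @ H + b2)    # output layer: what we see
    print(Y)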

Chapter 09 Depth (Deep Network)

Deep = many hidden layers (middle steps). The “deep” in deep learning is this depth.

[Diagram: six layers of three nodes each: X (Layer 1) → A (Layer 2) → B (Layer 3) → C (Layer 4) → D (Layer 5) → Y (Layer 6)]

More steps mean a deeper network. Deeper networks can learn more refined patterns.
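
Depth in code is just a loop. A sketch of the six-layer diagram above (random weights, illustrative only):

    import numpy as np

    def relu(z):
        return np.maximum(0, z)

    rng = np.random.default_rng(0)
    # Five weight blocks connect the six layers X -> A -> B -> C -> D -> Y.
    layers = [(rng.standard_normal((3, 3)) * 0.5, np.zeros(3)) for _ in range(5)]

    h = np.array([1.0, 2.0, 3.0])    # X
    for W, b in layers:
        h = relu(W @ h + b)          # one more step of depth per iteration
    print(h)                         # Y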

Chapter 10 Width (Number of Neurons per Layer)

[Diagram: layers of width 1 (1 neuron), width 2 (2 neurons), width 4 (4 neurons), and width 8 (8 neurons)]

The number of neurons in one layer is the width. Wider layers can handle more features at once.
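
In code, width is just the number of rows in W. A shapes-only sketch (random values, illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal(3)

    for width in (1, 2, 4, 8):
        W = rng.standard_normal((width, 3))   # one row per neuron
        b = np.zeros(width)
        h = np.maximum(0, W @ x + b)
        print(width, h.shape)                 # wider layer -> more outputs at once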

Chapter 11 Softmax (Turn into Probabilities)

Softmax turns scores into probabilities (the example uses e ≈ 3 to keep the arithmetic easy):

Score | Raise 3 to the score | Divide by the sum (27 + 3 + 1 = 31) | Probability
3     | 3³ = 27              | 27 ÷ 31                             | 27/31
1     | 3¹ = 3               | 3 ÷ 31                              | 3/31
0     | 3⁰ = 1               | 1 ÷ 31                              | 1/31
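
The real softmax uses e ≈ 2.718 rather than 3; otherwise the steps are exactly those of the table above. A minimal sketch:

    import numpy as np

    def softmax(scores):
        # Subtracting the max first is a standard numerical-stability trick;
        # it doesn't change the result.
        e = np.exp(scores - np.max(scores))
        return e / e.sum()

    print(softmax(np.array([3.0, 1.0, 0.0])))
    # about [0.84, 0.11, 0.04] -- close to the 27/31, 3/31, 1/31 of the base-3 example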

Chapter 12 Gradient (Backpropagation)

[Diagram: forward pass X → H → Y; gradients flow backward Y → H → X]

By the last chapter you'll see the full picture: forward → loss → backward → update.
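
A tiny sketch of that backward flow on a chain X → H → Y (made-up scalars, squared-error loss):

    # Forward: H = w1*X, Y = w2*H, loss L = (Y - target)^2
    X, w1, w2, target = 2.0, 0.5, 3.0, 4.0

    H = w1 * X                  # forward: 1.0
    Y = w2 * H                  # forward: 3.0
    L = (Y - target) ** 2       # loss: 1.0

    dL_dY  = 2 * (Y - target)   # backward, in the order Y -> H -> X
    dL_dw2 = dL_dY * H
    dL_dH  = dL_dY * w2
    dL_dw1 = dL_dH * X
    print(dL_dw1, dL_dw2)       # gradients that will update w1 and w2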

Summary

The diagram below collects everything from Ch01–Ch12 into one network: input X → hidden layers (A, B, C, D) → output Y, with weights (W), activation (ReLU, etc.), batch, and gradient (∇) shown.

Real training repeats forward pass (compute output) → loss → backward pass (gradients) → update weights. After this course you can follow that flow in the math.
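
As a closing sketch, here is that whole cycle on one linear neuron with toy data (everything made up; gradient descent on a squared-error loss):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 3))       # 100 samples, 3 features (toy data)
    true_w = np.array([1.0, -2.0, 0.5])
    y_true = X @ true_w + 0.3

    w, b, lr = np.zeros(3), 0.0, 0.1
    for step in range(200):
        y_pred = X @ w + b                  # forward pass (compute output)
        err = y_pred - y_true
        loss = (err ** 2).mean()            # loss
        grad_w = 2 * X.T @ err / len(X)     # backward pass (gradients)
        grad_b = 2 * err.mean()
        w -= lr * grad_w                    # update weights
        b -= lr * grad_b
    print(w, b)                             # close to [1.0, -2.0, 0.5] and 0.3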