Chapter 05

Artificial Neuron (Weighted Sum and Activation)

A unit that computes a weighted sum of inputs and applies an activation function.

Deep learning diagram by chapter

As you complete each chapter, the diagram below fills in. This is the structure so far.

[Diagram: inputs X1, X2, X3 × weights w1, w2, w3 → w·x + b → ReLU → output Y]

Inside the dashed circle is one artificial neuron. Input (X) times weights (w·x+b), then ReLU, gives output (Y).

Artificial neuron in deep learning

An artificial neuron is the smallest computational unit of deep learning. It does exactly two steps: ① compute the weighted sum Z = W·X + b, ② apply an activation function Y = ReLU(Z) or Sigmoid(Z).
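The two steps can be written as a short function. The weights, input, and bias below are hypothetical, chosen only to show one pass through the neuron:

```python
import numpy as np

def neuron(X, W, b):
    """One artificial neuron: step 1 weighted sum, step 2 ReLU activation."""
    Z = np.dot(W, X) + b        # step 1: Z = W·X + b
    return np.maximum(0, Z)     # step 2: Y = ReLU(Z)

# Made-up numbers: Z = 2×1 + (-1)×2 + 1×3 + 1 = 4, and ReLU(4) = 4
Y = neuron(X=[1, 2, 3], W=[2, -1, 1], b=1)
print(Y)  # 4
```

Swapping `np.maximum(0, Z)` for a Sigmoid would give the same structure with a different step 2.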

It's inspired by biological neurons: real neurons receive multiple signals, weight each one differently, sum them up, and fire if the total exceeds a threshold. The artificial neuron is a mathematical simplification of this process.

Summary: Input (X) → Weights and bias (Z = W·X + b) → Activation (Y = f(Z)) → Output (Y). That's everything an artificial neuron does.

AI models like ChatGPT, image classifiers, and recommendation systems are built by connecting thousands to billions of these neurons. Understand one neuron, and you can read the entire model's behavior.

Training means gradually adjusting each neuron's weights (W) and bias (b) so the output gets closer to the correct answer. Knowing how W and b affect the output is key to understanding learning.
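A minimal sketch of that adjustment, under stated assumptions: one linear neuron (activation omitted for simplicity), a made-up target and starting weights, and plain gradient descent on squared error:

```python
import numpy as np

# Hypothetical training loop: nudge W and b so Y moves toward the target.
X = np.array([3.0, 1.0])
W = np.array([0.5, 0.5])   # made-up starting weights
b = 0.0
target = 3.0
lr = 0.05                  # learning rate

for _ in range(20):
    Y = np.dot(W, X) + b        # forward pass: Y = W·X + b
    grad = 2 * (Y - target)     # gradient of squared error (Y - target)^2
    W -= lr * grad * X          # adjust each weight in proportion to its input
    b -= lr * grad              # adjust the bias

print(np.dot(W, X) + b)  # now very close to 3.0
```

Each update moves the output a little closer to the answer, which is exactly the "gradually adjusting W and b" described above.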

A single neuron combines dot product + bias + activation, so it unifies everything from the previous chapters: dot product, matrix multiplication, linear layer, and activation function all come together here.

Real-life analogy—exam pass prediction: Compute 'Math×0.4 + Science×0.4 + English×0.2 + 5 = 75' (weighted sum), then 'if ≥60 → pass (1), else fail (0)' (activation). That's exactly one neuron's operation.
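The analogy maps directly to code. The three scores below are hypothetical, chosen so the weighted sum lands on the chapter's 75:

```python
# Exam-pass neuron: made-up scores, weights 0.4/0.4/0.2, bias 5,
# and a threshold of 60 acting as a step activation.
math_s, science_s, english_s = 75, 70, 60                  # hypothetical scores
z = math_s * 0.4 + science_s * 0.4 + english_s * 0.2 + 5   # weighted sum ≈ 75
y = 1 if z >= 60 else 0                                    # 1 = pass, 0 = fail
print(z, y)
```

The threshold rule is the activation; replace it with ReLU or Sigmoid and the structure is unchanged.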

One neuron in image recognition: It takes a specific region of pixels, computes weighted sum + bias, passes through ReLU to get a 'is there a horizontal line here?' score. Thousands of such neurons together can determine 'dog or cat.'

Chatbots, translators, speech recognition: Each part of a sentence or sound is converted to numbers, neurons score 'what patterns are present,' and those scores flow to the next layer's neurons to grasp increasingly complex meaning.

Step 1—Weighted sum (Z): Compute Z = W·X + b. Dot product W's row with X, then add b. If the blank is in Z, fill it at this step.

Step 2—Activation (Y): Apply the given activation to Z. ReLU: Y = Z if Z > 0, Y = 0 if Z ≤ 0. Sigmoid: check the table to see which interval Z falls in.
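Both activations from this step, as a minimal sketch:

```python
import math

def relu(z):
    """ReLU: pass positive Z through, clamp everything else to 0."""
    return z if z > 0 else 0.0

def sigmoid(z):
    """Sigmoid: squash any Z into the interval (0, 1)."""
    return 1 / (1 + math.exp(-z))

print(relu(3.0), relu(-1.0))   # 3.0 0.0
print(sigmoid(0.0))            # 0.5
```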

Blank in W or b: If Y and X are given, reverse the activation to find Z first, then solve Z = W·X + b for the blank. The key is to work backwards one step at a time.
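Working backwards can be sketched with hypothetical numbers: suppose Y = 0.5 came out of a Sigmoid, W = [1, -1] and X = [3, 1] are known, and b is the blank:

```python
import math

Y = 0.5
WX = 1 * 3 + (-1) * 1       # known part of the weighted sum: W·X = 2
Z = math.log(Y / (1 - Y))   # invert the sigmoid: Z = ln(Y / (1 - Y)) = 0
b = Z - WX                  # solve Z = W·X + b for the blank
print(b)  # -2.0
```

Note that ReLU can only be reversed when Y > 0 (then Z = Y); if Y = 0, you only know Z ≤ 0.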

An artificial neuron computes the weighted sum Z = W·X + b, then applies an activation (e.g. ReLU, Sigmoid, or Tanh) to produce output Y.

W = [1, -1]   ·   X = [3, 1]   +   b = 1   =   Z = 3   → ReLU →   Y = 3

Step by step: multiply to get W·X, add b to get Z, then apply ReLU(Z) to get Y.

Z = (W·X) + b = (1×3 + (-1)×1) + 1 = 2 + 1 = 3
Y = ReLU(Z) = max(0, 3) = 3
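The worked example above can be checked in a few lines of NumPy:

```python
import numpy as np

# Re-checking the worked example: W = [1, -1], X = [3, 1], b = 1
Z = np.dot([1, -1], [3, 1]) + 1   # 1×3 + (-1)×1 + 1 = 3
Y = np.maximum(0, Z)              # ReLU(3) = 3
print(Z, Y)  # 3 3
```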

Problem

Artificial neuron: apply the given activation (ReLU, Sigmoid, or Tanh) to get Y, and fill in the blank (?).

X = [0, -1]   ·   W = [-2, -1]   +   b = -2   =   Z = -1   → ReLU →   Y = ?
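To check an answer, the problem's numbers can be run through the same two steps (this computes the solution, so try it by hand first):

```python
import numpy as np

# The problem's numbers: X = [0, -1], W = [-2, -1], b = -2
Z = np.dot([0, -1], [-2, -1]) + (-2)   # 0×(-2) + (-1)×(-1) - 2 = -1
Y = np.maximum(0, Z)                   # ReLU clamps negative Z to 0
print(Z, Y)
```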