Chapter 01
Vector dot product: Finding similarity between data
The most basic operation: combining two vectors into a single number that reflects their magnitudes and how aligned their directions are.
Deep learning diagram by chapter
As you complete each chapter, the diagram below fills in. This is the structure so far.
Left nodes X1, X2, X3 and right nodes Y1, Y2, Y3 are connected by lines. Each right node is the dot product of the left nodes with a set of weights.
Dot product in deep learning
The dot product multiplies same-position elements of two vectors and sums the results into a single number. For example, [2, 3] · [4, 1] = 2×4 + 3×1 = 11.
It also measures how aligned two vectors are: a large positive dot product means similar direction, zero means perpendicular (unrelated), and negative means opposite direction. That's why it's great for measuring similarity.
In formula form: a · b = a₁b₁ + a₂b₂ + … + aₙbₙ. Both vectors must have the same number of elements for the dot product to work.
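The formula above can be sketched as a few lines of plain Python (the function name `dot` is our own choice, not a library call):

```python
# A minimal dot product in plain Python.
def dot(a, b):
    # Both vectors must have the same number of elements.
    assert len(a) == len(b), "vectors must have equal length"
    # Multiply same-position elements and sum the products.
    return sum(x * y for x, y in zip(a, b))

print(dot([2, 3], [4, 1]))  # 2*4 + 3*1 = 11
```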
In real AI systems, dot products are computed between vectors with hundreds or thousands of dimensions. Computers do this instantly, so we can compare “how similar two texts are” or “how well an image matches a caption” with one number.
In deep learning, one neuron's output is computed as a dot product between its weights and the input. Multiply same-position values and sum them up—that gives the neuron's "response score" for that input.
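As a minimal sketch of that idea, a neuron's raw "response score" is just weights dotted with the input (the weight and input values below are made-up illustrations, not from any trained model):

```python
# One neuron's raw output = weights . input.
def neuron_score(weights, inputs):
    # Multiply same-position values and sum them up.
    return sum(w * x for w, x in zip(weights, inputs))

weights = [0.5, -1.0, 2.0]   # learned parameters (hypothetical values)
inputs  = [1.0,  0.0, 3.0]   # one input example (hypothetical values)
print(neuron_score(weights, inputs))  # 0.5 + 0.0 + 6.0 = 6.5
```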
The dot product is the most fundamental operation in deep learning because matrix multiplication is just many dot products bundled together. Linear layers, attention, embedding comparison—all rely on repeated dot products.
It also serves as a similarity measure: for example, Netflix-style recommenders compute the dot product of a user vector and a movie vector to get a "match score." When both vectors are first scaled to length 1, this dot product is called cosine similarity.
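The relationship between the raw dot product and cosine similarity can be sketched as follows (the user and movie vectors are made-up illustrations):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Cosine similarity = dot product divided by both vectors' lengths,
    # i.e. the dot product of the unit (length-1) versions of a and b.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

user  = [2.0, 3.0]   # hypothetical user preference vector
movie = [4.0, 1.0]   # hypothetical movie feature vector
print(dot(user, movie))                 # raw match score: 11.0
print(cosine_similarity(user, movie))   # always between -1 and 1
```

Cosine similarity ignores vector length, so it compares direction only; the raw dot product grows with both alignment and magnitude.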
Recommendation systems (Netflix, YouTube): Compute the dot product of a user vector and a content vector to get a "how much this user would like this content" score. Higher score = higher recommendation rank.
Search engines & chatbots: Convert queries and documents to vectors, then rank by dot product (similarity). ChatGPT uses the same principle when finding the most relevant information for your question.
Attention mechanism: In translators and chatbots, word vectors are dotted to compute "relevance scores"—the model focuses more on words with high scores.
Translation & summarization: The model compares the current token to others with dot products to get relevance scores—this is how it decides which words to attend to in context.
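The attention-scoring idea in the last two items can be sketched with toy vectors. This is a simplified illustration, not a real model: all vectors and the word list are made up, and real attention also uses learned projections and scaling.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# The current token's vector ("query") is dotted with every other token's
# vector ("keys"); a higher score means the model focuses more on that word.
query = [1.0, 0.0, 1.0]
keys = {"cat": [0.9, 0.1, 0.8],
        "sat": [0.0, 1.0, 0.0],
        "mat": [0.8, 0.0, 0.7]}

scores = {word: dot(query, k) for word, k in keys.items()}
# Softmax turns raw scores into attention weights that sum to 1.
total = sum(math.exp(s) for s in scores.values())
weights = {word: math.exp(s) / total for word, s in scores.items()}
print(scores)  # "cat" and "mat" score higher than "sat"
```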
How to compute: Multiply same-position elements, then add all the products. Example: [1, 2, 3] · [4, 5, 6] = 1×4 + 2×5 + 3×6 = 4 + 10 + 18 = 32.
Finding a blank: If the total dot product and the other products are given, sum the known products first, then subtract from the total to get the missing product. Divide by the known element to find the blank.
Watch out: Both vectors must have the same number of elements. Also, make sure to include every pair of elements—checking off each pair one by one helps avoid mistakes.
Double-check: Missing one pair changes the sum. After forming all products, re-add them or add in a fixed order to catch slips.
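The computation steps and the blank-finding trick above can be sketched together (the blank problem below is a made-up instance built from the text's [1, 2, 3] · [4, 5, 6] example):

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Worked example from the text: every pair contributes one product.
print(dot([1, 2, 3], [4, 5, 6]))  # 4 + 10 + 18 = 32

# Finding a blank: suppose [1, 2, ?] . [4, 5, 6] = 32.
total = 32
known = 1 * 4 + 2 * 5            # sum of the known products = 14
missing_product = total - known  # 32 - 14 = 18
blank = missing_product / 6      # divide by the known element
print(blank)  # 3.0
```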
a = [2, 3], b = [4, 1] → a·b = sum of element-wise products = 2×4 + 3×1 = 11
Problem
Find the dot product of the vectors below.