Ch.00
Intermediate Math and AI: Multivariable Space and Uncertainty
Intermediate math is where the language of AI becomes more precise. Instead of treating data as just numbers, this course views it as vectors and matrices, and studies the rules that map one to another as linear transformations. You’ll also interpret how learning behaves by using Jacobians (how outputs change with many inputs) and Hessians (curvature information), so you can understand why training can be fast, slow, or unstable.
What you learn in Ch01–Ch20
Intermediate math deepens the language you use to understand AI. You learn how data is represented and transformed using vectors, matrices, and linear transformations, then quantify similarity and direction with dot products and projections. Next, you interpret change and curvature using Jacobians and Hessians, which lets you understand the shape of the loss landscape. Finally, you design learning procedures more robustly with Taylor series and convex optimization, and describe uncertainty with Bayes' theorem, covariance, and the multivariate normal distribution.
- Ch.01 Vectors and Vector Space: Magnitude and Direction Beyond Scalars
- Ch.02 Dot Product and Projection: Angle and Similarity Between Data
- Ch.03 Matrices and Data: Structural Representation of Many Vectors
- Ch.04 Matrix Multiplication and Linear Transformation: Math That Manipulates Space
- Ch.05 Inverse and Determinant: Inverse of Transformation and Change in Volume
- Ch.06 Linear Independence and Rank: Redundancy and Effective Dimension
- Ch.07 Eigenvalues and Eigenvectors: Principal Axes Unchanged by Transformation
- Ch.08 Directional Derivative and Gradient: Steepest Ascent in Multidimensional Space
- Ch.09 Jacobian Matrix: First Derivatives of Multivariable Vector Functions
- Ch.10 Hessian Matrix: Second Derivatives and Curvature of Surfaces
- Ch.11 Taylor Series: Approximating Complex Functions with Polynomials
- Ch.12 Convex Optimization: Conditions for Finding the Minimum
- Ch.13 Conditional Probability and Dependence: Probabilistic Relations Between Variables
- Ch.14 Bayes' Theorem: Updating Probability with Observed Data
- Ch.15 Covariance and Correlation: Measuring Linear Association Between Two Variables
- Ch.16 Multivariate Normal Distribution: Joint Probability Model for Many Variables
- Ch.17 Maximum Likelihood Estimation (MLE): Inferring Parameters from Observations
- Ch.18 Entropy: Quantifying Uncertainty via Information Theory
- Ch.19 Cross-Entropy and KL Divergence: Measuring Difference Between Two Distributions
- Ch.20 Intermediate Math Summary: Linear Algebra and Probability Combined
Vectors, matrices, and sensitivity: how intermediate math explains AI
Vector spaces give a framework for describing data by both direction and magnitude. For example, an image can be represented as coordinates of learned features.
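As a minimal sketch of this idea, the NumPy snippet below represents two data points as feature vectors (the values are illustrative, not real features) and separates each into a magnitude and a direction; cosine similarity then compares direction alone.

```python
import numpy as np

# Two data points represented as feature vectors (hypothetical values).
a = np.array([3.0, 4.0])
b = np.array([1.0, 0.0])

# Magnitude (Euclidean norm) and direction (unit vector).
magnitude = np.linalg.norm(a)   # 5.0
direction = a / magnitude       # unit vector pointing the same way

# Cosine similarity compares direction regardless of magnitude.
cos_sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(magnitude, direction, cos_sim)
```

Splitting a vector into norm and unit direction is the same decomposition used later when comparing embeddings by angle rather than length.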
A matrix represents transformations of vectors. In particular, linear transformations provide consistent rules for how coordinates change—this is exactly how each layer in a neural network can be expressed mathematically.
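To make this concrete, here is a toy sketch (the weight values are assumed, not learned) showing a linear layer as nothing more than a matrix applied to an input vector.

```python
import numpy as np

# A linear layer is just a matrix: it maps input coordinates to output
# coordinates by a fixed rule. Weights here are illustrative, not learned.
W = np.array([[2.0, 0.0],
              [0.0, 0.5]])   # stretches axis 0, shrinks axis 1
x = np.array([1.0, 4.0])

y = W @ x                    # linear transformation of the input vector
print(y)                     # [2. 2.]
```

Stacking linear layers composes their matrices: `W2 @ (W1 @ x)` equals `(W2 @ W1) @ x`, which is why a purely linear network collapses to a single transformation.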
Jacobians and Hessians are maps of sensitivity. Jacobians answer “how much the output changes when the inputs change,” while Hessians describe the curvature of the loss landscape. With these maps, you can design learning updates more intelligently.
Training is essentially repeated computation that reduces error. To understand why error decreases, you need multivariable change (gradients and sensitivity), which is the core of intermediate math.
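The loop below sketches this idea on an assumed toy loss, L(w) = w₀² + 10·w₁², rather than a real model: repeatedly stepping against the gradient drives the error down.

```python
import numpy as np

# Minimal gradient descent on a two-variable quadratic loss
# L(w) = w0^2 + 10*w1^2 (a toy example, not a real model).
def grad(w):
    return np.array([2.0 * w[0], 20.0 * w[1]])

w = np.array([5.0, 5.0])
lr = 0.05
for _ in range(200):
    w = w - lr * grad(w)   # step against the gradient to reduce error

print(w)                   # approaches the minimum at [0, 0]
```

Note that the two coordinates shrink at different rates because their curvatures differ; that asymmetry is exactly what the Hessian discussion below quantifies.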
Linear algebra helps interpret representation. Many ideas (like embeddings and component analysis) reduce to “how vectors are rearranged.” Once you know the math, the results become explainable.
Understanding Hessians helps you see why learning is slow near some regions and faster near others. Second-order information also underpins methods such as Newton’s method and trust-region approaches.
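A small sketch of this, using an assumed diagonal Hessian for a quadratic loss: the eigenvalues give the curvature along each principal axis, and their ratio (the condition number) indicates how unevenly gradient descent will progress.

```python
import numpy as np

# For the quadratic loss L(w) = 0.5 * w^T H w, the Hessian H is constant.
# Eigenvalues give the curvature along each principal axis: a large spread
# means some directions are steep and others nearly flat, so a single
# learning rate cannot suit both. Values here are illustrative.
H = np.array([[10.0, 0.0],
              [0.0,  0.1]])

eigenvalues = np.linalg.eigvalsh(H)                      # sorted ascending
condition_number = eigenvalues.max() / eigenvalues.min()
print(eigenvalues, condition_number)                     # large ratio -> ill-conditioned
```

Second-order methods such as Newton's method use exactly this information, rescaling the step by the inverse Hessian so flat and steep directions progress evenly.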
In the forward pass, input vectors are transformed by matrix multiplications and linear rules. This determines which features are emphasized and which are suppressed.
In the backward pass, you need to know how changes propagate, and Jacobians play that role. The chain rule becomes a language for tracking how small changes reach the output, enabling accurate gradient computation.
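As a sketch of the chain rule in Jacobian form (a two-layer toy setup, assumed for illustration): when each layer is linear, its Jacobian is just its weight matrix, and the end-to-end Jacobian is their matrix product.

```python
import numpy as np

# y = f(g(x)) where g and f are linear maps, so their Jacobians are the
# weight matrices themselves, and the end-to-end Jacobian is their product.
W1 = np.array([[1.0, 2.0],
               [0.0, 1.0]])   # Jacobian of the first layer
W2 = np.array([[3.0, 0.0],
               [1.0, 1.0]])   # Jacobian of the second layer

J = W2 @ W1                   # chain rule: Jacobians compose by matrix product

# Verify against applying the composed map directly.
x = np.array([1.0, 1.0])
assert np.allclose(J @ x, W2 @ (W1 @ x))
print(J)
```

Backpropagation is this same composition run in reverse, multiplying Jacobians from the output back toward the inputs.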
During optimization, curvature information (Hessians) can improve stability. Hessians tell you whether the loss surface is flat or steep, shaping the update step.
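The contrast can be sketched on the same assumed quadratic loss: a fixed learning rate treats every direction alike, while a Newton step rescales by the inverse Hessian and, for a quadratic, reaches the minimum in one step.

```python
import numpy as np

# Comparing a plain gradient step with a curvature-scaled (Newton) step on
# the toy quadratic L(w) = 0.5 * w^T H w (values assumed for illustration).
H = np.array([[10.0, 0.0],
              [0.0,  0.1]])
w = np.array([1.0, 1.0])
g = H @ w                                 # gradient of the quadratic at w

gd_step = w - 0.05 * g                    # fixed learning rate ignores curvature
newton_step = w - np.linalg.solve(H, g)   # rescales by the inverse Hessian

print(gd_step)       # uneven progress: fast along the steep axis, slow along the flat one
print(newton_step)   # lands on the minimum [0, 0] in one step for a quadratic
```

In practice the exact Hessian is too expensive for large models, which is why quasi-Newton and adaptive methods approximate this curvature correction instead.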
| Topic | Role in AI | Intermediate concept |
|---|---|---|
| Similarity & direction | Bring similar features closer | Dot product, projection |
| How a layer operates | How one layer transforms vectors | Matrices, linear transformations |
| Sensitivity (change) | How output changes when inputs change | Jacobians, gradients |
| Learning curvature | How fast optimization proceeds | Hessians, eigenvalues |
| Uncertainty language | Describe joint behavior of multiple variables | Covariance, multivariate normal |