Chapter 06

Derivative and Derivative Function: Instantaneous Slope, the Compass of Learning

Differentiation gives the instantaneous rate of change (slope) at a point. The derivative as a function is the basis for gradient descent and backprop in deep learning.


Left: Tangent line — the line that touches the curve at exactly one point. Its slope is the derivative there. Right: Line through two points — the line through two points on the curve. As the points get closer, this line approaches the tangent, and that limit is the derivative.

[Figure: the curve y = x², with the point (2, 4) marked.]

Pick one point (2, 4) on the curve y = x². We want to measure the slope at this point.

The derivative is the slope of the tangent line. The limit of the "line through two points" slopes (as the points get closer) is the tangent slope.

What are differentiation and the derivative?

The derivative is the instantaneous rate of change at a point on the curve: how steep the graph is there. Geometrically it is the slope of the tangent line at that point. Just as a speedometer shows the speed at each instant, the derivative tells you how sensitively the output y responds when the input x changes by a tiny amount.
Mathematically, this slope is obtained by taking the limit of the average slope between two very close points as the distance between them goes to zero:

f′(a) = lim_{h→0} (f(a+h) − f(a)) / h

(the denominator h is the distance between the two points; the numerator is the resulting change in f). The value obtained through this process is the derivative of f at a and is written f′(a). A large value means a steep slope; zero means flat.
The derivative function is the new function that assigns to each x the slope at that point. So instead of computing the slope at every point by hand, you plug x into the derivative formula: it is a "slope generator." The process of finding it is called differentiation.
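The limit definition can be checked numerically: as h shrinks, the two-point slope approaches the tangent slope. A minimal Python sketch (the helper name `difference_quotient` is illustrative, not from the text):

```python
def difference_quotient(f, a, h):
    """Average slope between x = a and x = a + h."""
    return (f(a + h) - f(a)) / h

# f(x) = x^2 at a = 2: the true tangent slope is f'(2) = 4.
f = lambda x: x * x
for h in [0.1, 0.01, 0.001]:
    print(h, difference_quotient(f, 2.0, h))  # slopes approach 4
```

Shrinking h further brings the printed slope arbitrarily close to 4, which is exactly the limit the formula describes.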
The table below lists common differentiation formulas; the prime symbol ′ means the derivative.
Common differentiation formulas
  • Constant: derivative is 0
  • Power x^n: bring down n, exponent becomes n−1, i.e. (x^n)′ = n·x^(n−1)
  • Exponential e^x: derivative is still e^x
  • Exponential a^x: (a^x)′ = a^x ln a
  • Natural log ln x: (ln x)′ = 1/x
  • Log base a: (log_a x)′ = 1/(x ln a)
  • sin x: derivative is cos x
  • cos x: derivative is −sin x
  • tan x: derivative is 1/cos² x
  • Sum/difference: differentiate each part and add or subtract
  • Constant multiple: keep the constant, differentiate the rest
  • Product rule: (first′ × second) + (first × second′)
  • Quotient rule: (top′ × bottom − top × bottom′) / bottom²
  • Intuition (product): in the limit, the change coming from the first factor and the change coming from the second add up, so you get these two terms.
  • Intuition (quotient): write the quotient as top × (1/bottom) and use the product rule; differentiating (1/bottom) gives the usual formula.
Numeric examples and solution steps
  • Constant: (5)′ = 0, (−3)′ = 0. Step: derivative is 0.
  • x^n: (x³)′ = 3x², at x = 2: 12. Step: bring down n, exponent n−1.
  • e^x: derivative at x = 0: 1; at x = 1: e. Step: unchanged by differentiation.
  • a^x: (2^x)′ = 2^x ln 2. Step: multiply by ln a.
  • ln x: derivative at x = 5: 1/5. Step: derivative is 1/x.
  • sin x: derivative at x = 0: 1. Step: sin → cos.
  • cos x: derivative at x = 0: 0. Step: cos → −sin.
  • Sum: (x² + x)′ = 2x + 1. Step: differentiate each term and add.
  • Constant multiple: (5x²)′ = 10x. Step: keep the constant, differentiate the rest.
  • Product: (x·e^x)′ = e^x(1 + x). Step: (first′ × second) + (first × second′).
  • Quotient: x/(x² + 1), derivative at x = 1: 0. Step: (top′ × bottom − top × bottom′) / bottom².
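Each row above can be verified with a central-difference approximation of the derivative. A minimal Python sketch (the helper `numeric_derivative` is an illustrative name, not from the text):

```python
import math

def numeric_derivative(f, a, h=1e-6):
    # Central difference: a close approximation of f'(a) for small h.
    return (f(a + h) - f(a - h)) / (2 * h)

checks = [
    (lambda x: x ** 3,        2.0, 12.0),    # (x^3)' = 3x^2 -> 12 at x = 2
    (lambda x: math.exp(x),   1.0, math.e),  # (e^x)' = e^x  -> e at x = 1
    (lambda x: math.log(x),   5.0, 0.2),     # (ln x)' = 1/x -> 1/5 at x = 5
    (lambda x: math.sin(x),   0.0, 1.0),     # (sin x)' = cos x -> 1 at x = 0
    (lambda x: x / (x**2 + 1), 1.0, 0.0),    # quotient rule -> 0 at x = 1
]
for f, a, expected in checks:
    assert abs(numeric_derivative(f, a) - expected) < 1e-4
print("all rows check out")
```

This is also a handy habit in practice: when you derive a formula by hand, a quick finite-difference check catches sign and algebra slips.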
In everyday terms, the derivative is a compass for finding the optimum. When you want to go from a mountain top to the lowest valley, you follow the slope under your feet downward. The derivative computes that slope in exact numbers. Where the slope is zero you are at a peak (maximum) or a valley floor (minimum), so the derivative is essential for finding minima and maxima.
Deep learning models must minimize the error (loss) between the correct answer and the prediction. The gradient tells us how to change each of the model's many weights w so that the error decreases. By differentiating, we learn whether slightly increasing a given weight would increase or decrease the error, and we update the weights in the direction that reduces the error fastest.
Backpropagation applies this derivative idea in reverse for efficient learning. It works backward from the output, step by step, computing how much each step contributed to the final error. To differentiate through stacked functions we use differentiation of composite functions (chain rule); the basic derivative formulas in this chapter are exactly what make that work.
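The chain rule that backpropagation relies on can be spot-checked numerically. A minimal Python sketch (the composite y = sin(x²) and the point x = 1.3 are illustrative choices, not from the text):

```python
import math

# Chain rule: for y = sin(x^2),
#   dy/dx = cos(x^2) * 2x
# (derivative of the outer function at the inner value,
#  times the derivative of the inner function).
def y(x):
    return math.sin(x ** 2)

def dy_dx(x):
    return math.cos(x ** 2) * 2 * x

# Compare the formula against a two-point (central difference) slope.
x, h = 1.3, 1e-6
numeric = (y(x + h) - y(x - h)) / (2 * h)
assert abs(numeric - dy_dx(x)) < 1e-4
print(dy_dx(x))
```

Backprop applies this same "outer derivative times inner derivative" step repeatedly, once per layer, working backward from the loss.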
In general, the derivative is used for sensitivity analysis: how much does the whole result swing when one variable is nudged? In economics we ask how demand changes when price changes a little; in physics we measure how position changes over time (velocity). So anywhere we quantify how a small change in a cause affects the outcome, we use the derivative.
In AI training, every parameter update depends on derivative values. When we use libraries like PyTorch or TensorFlow, the loss is differentiated with respect to each weight in an instant. Moving the weights in the opposite direction of that derivative is gradient descent. The formulas in this chapter are the first key to understanding how AI gets smarter through computation.
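Gradient descent itself fits in a few lines. A minimal sketch on a toy one-weight loss L(w) = (w − 3)², whose derivative is 2(w − 3) (the loss function, starting point, and learning rate here are illustrative choices, not from the text):

```python
# Toy loss L(w) = (w - 3)^2 has its minimum at w = 3,
# and its derivative is L'(w) = 2 * (w - 3).
def grad(w):
    return 2 * (w - 3)

w = 0.0     # initial weight
lr = 0.1    # learning rate
for _ in range(100):
    w -= lr * grad(w)   # step opposite the derivative
print(w)  # converges toward 3
```

Libraries like PyTorch compute `grad` automatically for millions of weights at once, but the update rule is exactly this: move each weight against its derivative.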
To find the derivative, identify which rules apply (power, exponential, log, trig, product, quotient, chain), then apply them and simplify.
Example problems and solutions are in the table below.
  • Ex 1. f(x) = x³ − 2x. Power and sum rules: f′(x) = 3x² − 2.
  • Ex 2. g(x) = e^x ln x. Product rule: g′(x) = e^x ln x + e^x · (1/x).
  • Ex 3. h(x) = sin x / x. Quotient rule: h′(x) = (x cos x − sin x) / x².
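The three example solutions can be confirmed against a finite-difference slope. A minimal Python sketch (the helper name and the test points 2.0 and 1.5 are illustrative choices):

```python
import math

def numeric_derivative(f, a, h=1e-6):
    # Central difference approximation of f'(a).
    return (f(a + h) - f(a - h)) / (2 * h)

# Ex 1: f(x) = x^3 - 2x, claimed f'(x) = 3x^2 - 2
assert abs(numeric_derivative(lambda x: x**3 - 2*x, 2.0) - (3*2**2 - 2)) < 1e-4

# Ex 2: g(x) = e^x ln x, claimed g'(x) = e^x ln x + e^x / x (product rule)
g_prime = lambda x: math.exp(x) * math.log(x) + math.exp(x) / x
assert abs(numeric_derivative(lambda x: math.exp(x) * math.log(x), 2.0)
           - g_prime(2.0)) < 1e-4

# Ex 3: h(x) = sin(x)/x, claimed h'(x) = (x cos x - sin x) / x^2 (quotient rule)
h_prime = lambda x: (x * math.cos(x) - math.sin(x)) / x**2
assert abs(numeric_derivative(lambda x: math.sin(x) / x, 1.5)
           - h_prime(1.5)) < 1e-4

print("all three examples verified")
```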
Problem types and how to solve
  • Power: f(x) = x^n. f′(a) = n·a^(n−1): bring down the exponent, subtract 1, then plug in a.
  • Linear: f(x) = mx + b. f′(a) = m: the slope is the derivative, so the answer is m regardless of a.
  • Quadratic: f(x) = a₂x² + a₁x + a₀. f′(a) = 2a₂·a + a₁: twice the x² coefficient times a, plus the linear coefficient.
  • Constant × power: f(x) = c·x^n. f′(a) = c·n·a^(n−1): keep c, multiply by the derivative of x^n.
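All four problem types are polynomials term by term, so one small routine covers them. A minimal Python sketch (the function name `poly_derivative_at` and coefficient convention are illustrative choices):

```python
def poly_derivative_at(coeffs, a):
    """Given coefficients [c0, c1, c2, ...] of c0 + c1*x + c2*x^2 + ...,
    return f'(a) by applying the power rule to each term:
    the x^n term contributes n * c_n * a^(n-1)."""
    return sum(n * c * a ** (n - 1) for n, c in enumerate(coeffs) if n > 0)

# Quadratic f(x) = x^2 + 2x + 1  ->  f'(3) = 2*3 + 2 = 8
print(poly_derivative_at([1, 2, 1], 3))  # 8

# Constant * power f(x) = 2x^4  ->  f'(1) = 8*1^3 = 8
print(poly_derivative_at([0, 0, 0, 0, 2], 1))  # 8
```

Note how the linear case falls out automatically: only the n = 1 term survives, contributing the slope m regardless of a.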

Example (power)
For f(x) = x³, find the derivative at x = 2, i.e. f′(2).
Solution
f′(x) = 3x², so f′(2) = 3 × 2² = 12. → Answer: 12

Example (linear)
For f(x) = 3x + 1, find f′(5).
Solution
The derivative of a linear function is its slope, so f′(x) = 3 at every point; in particular f′(5) = 3. → Answer: 3

Example (quadratic)
For f(x) = x² + 2x + 1, find f′(3).
Solution
f′(x) = 2x + 2, so f′(3) = 2 × 3 + 2 = 8. → Answer: 8

Example (constant × power)
For f(x) = 2x⁴, find f′(1).
Solution
f′(x) = 2 × 4 × x³ = 8x³, so f′(1) = 8 × 1 = 8. → Answer: 8