Ch.04

Logistic Regression: Pass or Fail?

Where linear regression predicts a 'score', logistic regression is the specialist for yes/no classification—e.g. "Will this score mean pass (1) or fail (0)?" It uses the sigmoid function to turn a score into a probability between 0 and 1.


The larger the linear score $z$, the closer $\sigma(z)$ is to 1, so we classify as class 1. $z = 0$ is the decision boundary.

Sigmoid: $\sigma(z) = \frac{1}{1+e^{-z}}$. When $z > 0$, $\hat{y} = 1$; when $z \le 0$, $\hat{y} = 0$.

How to read the formula — When $z$ is large and negative, $e^{-z}$ is large, so $\sigma(z) \approx 0$. When $z = 0$, $\sigma(0) = 0.5$. When $z$ is large and positive, $e^{-z} \approx 0$, so $\sigma(z) \approx 1$. So the formula squeezes any $z$ into a probability in $(0, 1)$.
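The three regimes described above can be checked numerically. A minimal sketch (the function name `sigmoid` is our own choice, not from the chapter):

```python
import math

def sigmoid(z):
    """Squash any real z into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Large negative z: e^{-z} is huge, so the output is near 0.
# z = 0: e^{0} = 1, so the output is exactly 1 / 2.
# Large positive z: e^{-z} vanishes, so the output is near 1.
print(sigmoid(-10.0))  # ~0.0000454
print(sigmoid(0.0))    # 0.5
print(sigmoid(10.0))   # ~0.9999546
```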


The S-curve: sigmoid — The score $z$ from a linear model can be any real number, arbitrarily large or negative. Probabilities must lie between 0 and 1. The sigmoid $\sigma(z) = \frac{1}{1+e^{-z}}$ maps any real $z$ into $(0, 1)$.
Decision boundary — When the sigmoid outputs e.g. "probability of pass = 0.7", we need a rule. Usually we use 0.5: if probability ≥ 0.5 we predict 1 (yes), otherwise 0 (no).
Same core as linear regression — Logistic regression still computes a score $z = wx + b$ first; the only difference is passing that score through the sigmoid to get a probability.
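Putting the two steps together — linear score, then sigmoid, then the 0.5 rule — gives the whole prediction pipeline. A minimal sketch; the weights here are made-up numbers for a hypothetical "hours studied → pass" example, not values from the chapter:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b, threshold=0.5):
    z = w * x + b        # same linear score as linear regression
    p = sigmoid(z)       # probability of class 1 ("pass")
    label = 1 if p >= threshold else 0
    return label, p

# Hypothetical weights: each study hour adds 1.2 to the score.
w, b = 1.2, -4.0
label, p = predict(5.0, w, b)  # z = 1.2 * 5 - 4 = 2.0
print(label, round(p, 2))      # predicts 1 with probability ~0.88
```

Note that the threshold is a design choice: 0.5 is the default, but a spam filter might demand a higher probability before discarding mail.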
Many real problems are yes/no — Spam or not? Disease or not? Will the user buy? Binary classification is everywhere; logistic regression is the standard baseline.
Confidence as a number — Saying "pass with 98% probability" is more useful than just "pass". Logistic regression gives a probability, which supports better decisions.
Bridge to deep learning — A single neuron in a neural network behaves much like logistic regression. Mastering this makes deep learning easier later.
Spam filter — Compute "probability this email is spam" from features; if above a threshold, send to spam.
Medical AI — From X-rays or lab values, predict "probability of disease" to support diagnosis.
Marketing and recommendations — Predict "will this user churn?" or "will they click?" for targeting and ads.
Logistic regression summary — It is for binary classification (yes/no, pass/fail). We compute a linear score $z = w_1 x_1 + w_2 x_2 + \cdots + b$, then apply the sigmoid $\sigma(z) = \frac{1}{1+e^{-z}}$ to get a probability. We predict $\hat{y} = 1$ if the probability is ≥ 0.5, else $\hat{y} = 0$ ($z = 0$ is the decision boundary). It is important because many real tasks are binary; it also gives confidence (a probability), and it is the basis for understanding neurons in deep learning. Used in spam filters, medical decision support, and marketing (churn, click prediction). Solution flow: compute $z$ → apply $\sigma(z)$ → if $z > 0$ then $\hat{y} = 1$, else $\hat{y} = 0$.
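The full flow, including learning $w$ and $b$, can be sketched end to end. This is a from-scratch gradient-descent sketch on a made-up pass/fail dataset (the data, learning rate, and iteration count are all our own assumptions, not the chapter's):

```python
import math

# Hypothetical toy data: x = hours studied, y = 1 for pass, 0 for fail.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0,   0,   0,   1,   1,   1]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b, lr = 0.0, 0.0, 0.1
for _ in range(5000):
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)   # predicted probability of pass
        # Gradient of the cross-entropy loss: (p - y) * x for w, (p - y) for b.
        w -= lr * (p - y) * x
        b -= lr * (p - y)

# The decision boundary is where w*x + b = 0, i.e. x = -b / w.
# For this data it lands between the fail group (≤3) and the pass group (≥4).
print(round(-b / w, 1))
```

This mirrors the summary's flow: the learned $w, b$ define the score $z$, the sigmoid turns it into a probability, and $z = 0$ is the boundary.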