Data and Features: The Start of Machine Learning
Machine learning starts with data. We turn images, text, and numbers into features—numeric representations that let the model learn patterns. The world of numbers and functions from Basic Math Ch00 becomes reality here.
What are Data and Features?
Data is the raw material of machine learning — as we learned in Basic Math Ch00, deep learning and machine learning turn images, text, and sound into numbers. These numeric inputs, paired with labels (correct answers), form data. For example, 'cat image + label cat' is one data point, and thousands of such pairs become the material the model learns from.
Features are the numeric essence of data — A photo we see is just a pile of tens of thousands of pixel numbers to a computer. Features are the useful information—like ear shape, eye size, fur color—extracted and expressed as numbers. Mathematically they are vectors, extracted from raw data through functions. The 'functions that define input-output rules' from Ch00 handle this transformation.
In short — Data is a collection of (input, label) pairs; features are the result of turning that input into numeric vectors the model can understand. Good features lead to better learning; bad features hurt performance even with lots of data. The start of machine learning is deciding what data to use and what features to extract.
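The idea above can be sketched in code. This is a minimal, illustrative example: the tiny 2x2 "image" and the two features (mean brightness, contrast) are assumptions chosen for clarity, not a standard recipe.

```python
import numpy as np

# A data point is an (input, label) pair. Here the "input" is a toy
# 2x2 grayscale image (pixel intensities in [0, 1]) and the label is its class.
image = np.array([[0.9, 0.8],
                  [0.1, 0.2]])
label = "cat"

# A feature extractor is just a function: raw input -> numeric vector.
def extract_features(img):
    mean_brightness = img.mean()          # average pixel intensity
    contrast = img.max() - img.min()      # spread of intensities
    return np.array([mean_brightness, contrast])

x = extract_features(image)   # the feature vector
print(x)                      # [0.5 0.8]
print((x, label))             # one (feature vector, label) data point
```

Swapping in different features (edge counts, color histograms, and so on) changes what the model can learn, without changing the pipeline's shape.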
Why it matters
Without data, learning is impossible — Every decision a model makes is the result of numbers and functions. As in Ch00, to follow the AI computation we need data expressed as numbers. If data is scarce or labels are wrong, the model learns the wrong patterns.
Feature design sets the model's limits — Deciding which information to turn into numbers is called feature engineering. Using only 'yesterday's closing price' vs. adding 'moving average, volume, volatility' for stock prediction leads to very different results. Vectors and matrices bundle many features for batch computation—a core part of the Ch00 roadmap—and the quality of features drives model performance.
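The stock example can be made concrete. The prices below are made-up numbers and the window size is an arbitrary choice; the point is only that one raw series can yield a richer feature vector than 'yesterday's close' alone.

```python
import numpy as np

# Hypothetical daily closing prices (illustrative numbers only).
closes = np.array([100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0])

def make_features(prices, window=3):
    """Feature vector for the most recent day: last close,
    moving average, and volatility (std) over the last `window` days."""
    recent = prices[-window:]
    return np.array([prices[-1], recent.mean(), recent.std()])

baseline = np.array([closes[-1]])   # only yesterday's close
features = make_features(closes)    # close + 3-day MA + 3-day volatility
print(baseline)
print(features)
```

The two representations feed the same kind of model, but the engineered vector gives it trend and risk information the single number cannot carry.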
Bridge to the next chapters — Ch02 KNN, Ch03 Linear Regression, Ch05 Logistic Regression, and all ML algorithms take feature vectors as input. Understanding data and features is needed to interpret why a model made a given prediction, and the later chapters on differentiation and probability build on this foundation.
How it is used
Input → feature extraction → model → prediction — The ML pipeline matches the input → numeric conversion → repeated functions → output structure from Ch00. Feature extraction is the 'numeric conversion' step; models (linear regression, KNN, etc.) are sets of functions. Differentiation is used to reduce error during training; probability expresses uncertainty in predictions like '90% chance this image is a cat'.
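The pipeline above can be sketched end to end. The linear model and its weights here are assumptions for illustration; in practice the weights are learned by minimizing a loss with gradient-based optimization (covered in the differentiation chapters).

```python
import numpy as np

def extract_features(raw):
    # numeric conversion: raw measurements -> feature vector
    return np.asarray(raw, dtype=float)

def model(x, w, b):
    # a model is a function; here a simple linear one: f(x) = w . x + b
    return w @ x + b

raw_input = [2.0, 3.0]          # raw data
x = extract_features(raw_input) # feature extraction
w = np.array([0.5, 1.0])        # assumed (not learned) weights
b = -1.0
prediction = model(x, w, b)     # 0.5*2 + 1.0*3 - 1 = 3.0
print(prediction)
```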
Explanation for solving the problems
Data and features — Data are (input x, target y) pairs; features encode observations as numbers that form the feature vector x. The target y is the value you want to predict; the model learns a function f with f(x) ≈ y; evaluation uses a loss and metrics.
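Evaluation can be shown with one worked number. Mean squared error is used here as an illustrative loss; the targets and predictions are made-up values.

```python
import numpy as np

# A loss measures how far predictions are from targets.
y_true = np.array([3.0, 5.0, 2.0])   # targets y
y_pred = np.array([2.5, 5.0, 3.0])   # model outputs f(x)

# Mean squared error: average of squared differences.
mse = np.mean((y_pred - y_true) ** 2)
print(mse)   # (0.25 + 0.0 + 1.0) / 3
```

A lower loss means predictions sit closer to the targets; training is the process of adjusting the model to drive this number down.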
Example (concept)
Which best describes a feature vector?
① labels only
② numeric encoding of inputs
③ the loss function
Features are the numeric vector built from inputs. → Answer: ②
Example (analogy table)
| Concept | Real-estate analogy | ML |
|---|---|---|
| Data | Past transactions | (input, label) pairs |
| Feature | Size, location | Input vector |
| Target | Sale price | Label |
| Model | "Price per unit area" rule | Function f(x) |
| Evaluation | Compare estimate vs. actual | Loss / metric |