Data and Features: The Start of Machine Learning
Machine learning starts with data. We turn images, text, and numbers into features—numeric representations that let the model learn patterns. The world of numbers and functions from Basic Math Ch00 becomes reality here.
What are Data and Features?
Data is the raw material of machine learning — as we learned in Basic Math Ch00, deep learning and machine learning turn images, text, and sound into numbers. These numeric inputs, paired with labels (correct answers), form data. For example, 'cat image + label cat' is one data point, and thousands of such pairs become the material the model learns from.
Features are the numeric essence of data — A photo we see is just a pile of tens of thousands of pixel numbers to a computer. Features are the useful information—like ear shape, eye size, fur color—extracted and expressed as numbers. Mathematically they are vectors, extracted from raw data through functions. The 'functions that define input-output rules' from Ch00 handle this transformation.
In short — Data is a collection of (input, label) pairs; features are the result of turning that input into numeric vectors the model can understand. Good features lead to better learning; bad features hurt performance even with lots of data. The start of machine learning is deciding what data to use and what features to extract.
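The idea above can be sketched in code. This is a minimal, illustrative example: the tiny 2x2 "image" and the two features (mean brightness, contrast) are assumptions chosen for clarity, not a standard recipe.

```python
import numpy as np

# A data point is an (input, label) pair. Here the "input" is a toy
# 2x2 grayscale image (pixel intensities in [0, 1]) and the label is its class.
image = np.array([[0.9, 0.8],
                  [0.1, 0.2]])
label = "cat"

# A feature extractor is just a function: raw input -> numeric vector.
def extract_features(img):
    mean_brightness = img.mean()          # average pixel intensity
    contrast = img.max() - img.min()      # spread of intensities
    return np.array([mean_brightness, contrast])

x = extract_features(image)   # the feature vector
print(x)                      # [0.5 0.8]
print((x, label))             # one (feature vector, label) data point
```

Swapping in different features (edge counts, color histograms, and so on) changes what the model can learn, without changing the pipeline's shape.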
Why it matters
Without data, learning is impossible — Every decision a model makes is the result of numbers and functions. As in Ch00, to follow the AI computation we need data expressed as numbers. If data is scarce or labels are wrong, the model learns the wrong patterns.
Feature design sets the model's limits — Deciding which information to turn into numbers is called feature engineering. Using only 'yesterday's closing price' vs. adding 'moving average, volume, volatility' for stock prediction leads to very different results. Vectors and matrices bundle many features for batch computation—a core part of the Ch00 roadmap—and the quality of features drives model performance.
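The stock example can be made concrete. The prices below are made-up numbers and the window size is an arbitrary choice; the point is only that one raw series can yield a richer feature vector than 'yesterday's close' alone.

```python
import numpy as np

# Hypothetical daily closing prices (illustrative numbers only).
closes = np.array([100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0])

def make_features(prices, window=3):
    """Feature vector for the most recent day: last close,
    moving average, and volatility (std) over the last `window` days."""
    recent = prices[-window:]
    return np.array([prices[-1], recent.mean(), recent.std()])

baseline = np.array([closes[-1]])   # only yesterday's close
features = make_features(closes)    # close + 3-day MA + 3-day volatility
print(baseline)
print(features)
```

The two representations feed the same kind of model, but the engineered vector gives it trend and risk information the single number cannot carry.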
Bridge to the next chapters — Ch02 KNN, Ch03 Linear Regression, Ch05 Logistic Regression, and all ML algorithms take feature vectors as input. Understanding data and features is needed to interpret why a model made a given prediction, and the later chapters on differentiation and probability build on this foundation.
How it is used
Input → feature extraction → model → prediction — The ML pipeline matches the input → numeric conversion → repeated functions → output structure from Ch00. Feature extraction is the 'numeric conversion' step; models (linear regression, KNN, etc.) are sets of functions. Differentiation is used to reduce error during training; probability expresses uncertainty in predictions like '90% chance this image is a cat'.
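The pipeline above can be sketched end to end. The linear model and its weights here are assumptions for illustration; in practice the weights are learned by minimizing a loss with gradient-based optimization (covered in the differentiation chapters).

```python
import numpy as np

def extract_features(raw):
    # numeric conversion: raw measurements -> feature vector
    return np.asarray(raw, dtype=float)

def model(x, w, b):
    # a model is a function; here a simple linear one: f(x) = w . x + b
    return w @ x + b

raw_input = [2.0, 3.0]          # raw data
x = extract_features(raw_input) # feature extraction
w = np.array([0.5, 1.0])        # assumed (not learned) weights
b = -1.0
prediction = model(x, w, b)     # 0.5*2 + 1.0*3 - 1 = 3.0
print(prediction)
```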
Explanation for solving the problems
Data and features — Data are (input x, target y) pairs; features encode observations as numbers that form the feature vector x. The target y is the value you want to predict; the model learns a function f with f(x) ≈ y; evaluation uses a loss and metrics.
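Evaluation can be shown with one worked number. Mean squared error is used here as an illustrative loss; the targets and predictions are made-up values.

```python
import numpy as np

# A loss measures how far predictions are from targets.
y_true = np.array([3.0, 5.0, 2.0])   # targets y
y_pred = np.array([2.5, 5.0, 3.0])   # model outputs f(x)

# Mean squared error: average of squared differences.
mse = np.mean((y_pred - y_true) ** 2)
print(mse)   # (0.25 + 0.0 + 1.0) / 3
```

A lower loss means predictions sit closer to the targets; training is the process of adjusting the model to drive this number down.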
Example (concept)
Which best describes a feature vector?
① labels only
② numeric encoding of inputs
③ the loss function
Features are the numeric vector built from inputs. → Answer: ②
Example (analogy table)
| Concept | Real-estate analogy | ML |
|---|---|---|
| Data | Past transactions | (input, label) pairs |
| Feature | Size, location | Input vector |
| Target | Sale price | Label |
| Model | "Price per unit area" rule | Function f(x) |
| Evaluation | Compare estimate vs. actual | Loss / metric |