Ch.09
Cross Validation: Practice Tests and the Real Exam
Cross validation is essential so that models do not become "frogs in a well," good only at the exercises they have memorized. Just as students use practice tests to gauge their real level and the final exam to confirm it, we do not score machine learning models on training data alone; we evaluate them on validation and test data they have never seen. This chapter covers cross-validation strategies (hold-out, K-Fold, and others) and how to make performance estimates reliable.
Split data into train/validation/test; in K-Fold, take turns validating and estimate performance by the mean score.
Cross validation: practice tests (validation) to estimate skill, final exam (test) to confirm.
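The K-Fold idea above, where each fold takes one turn as the validation set and performance is the mean of the fold scores, can be sketched in plain Python. The function name `k_fold_indices` is illustrative (libraries such as scikit-learn provide a `KFold` class for the same purpose):

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs: each fold takes one turn as validation."""
    # Distribute samples as evenly as possible across the k folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, val_idx
        start += size

# With 10 samples and k=5, we get 5 folds of 2 validation samples each.
folds = list(k_fold_indices(n_samples=10, k=5))

# In practice you would fit on train_idx and score on val_idx in each round,
# then report the mean of the k validation scores as the performance estimate.
all_val = sorted(i for _, val in folds for i in val)
```

Every sample lands in exactly one validation fold, so the mean of the k scores uses the whole dataset for validation without ever validating on data the model trained on in that round.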
| Data type | Metaphor | Role and use | Typical ratio |
|---|---|---|---|
| Training (Train) | Textbook / practice set | Main data used to learn patterns and update weights. | ~70–80% |
| Validation | Practice exam | Used mid-learning to check performance and tune hyperparameters. | ~10–15% |
| Test | Final exam | Used only once after all learning to report final performance. | ~10–15% |
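The ratios in the table can be sketched as a simple hold-out split in plain Python. The function name `train_val_test_split` is illustrative (scikit-learn's `train_test_split` is the common library equivalent), and the 70/15/15 split follows the table above:

```python
import random

def train_val_test_split(data, val_ratio=0.15, test_ratio=0.15, seed=42):
    """Shuffle once, then slice into train (~70%), validation (~15%), test (~15%)."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split
    n = len(shuffled)
    n_test = int(n * test_ratio)
    n_val = int(n * val_ratio)
    test = shuffled[:n_test]                   # final exam: touch only once
    val = shuffled[n_test:n_test + n_val]      # practice exam: tune on this
    train = shuffled[n_test + n_val:]          # textbook: learn from this
    return train, val, test

train, val, test = train_val_test_split(list(range(100)))
```

Shuffling before slicing matters: if the data is ordered (by class, date, etc.), plain slices would give the three sets systematically different distributions.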