Stop relying on a single train-test split. Learn how K-Fold cross-validation provides a stable, reliable evaluation of your machine learning models.
Previously in this course, we discussed the importance of training and testing data splits to estimate how well a model performs on unseen data. However, a single split is often a "lucky" or "unlucky" roll of the dice—your evaluation score depends heavily on which specific rows ended up in your test set.
In this lesson, we move beyond the single split to cross-validation, a technique that systematically rotates your data to give you a more honest, stable assessment of your model's predictive power.
When you perform a standard train-test split, you might find that your model performs exceptionally well on one test set but poorly on another. This sensitivity to data partitioning is a sign of instability. If your dataset is relatively small, or if there is underlying noise in your data, a single split doesn't capture the full picture of how your model will perform in production.
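To see this sensitivity for yourself, the following minimal sketch repeats a standard train-test split with ten different random seeds and records the test MSE each time. The synthetic dataset from `make_regression` and the names `split_mse` and `seed` are assumptions made for illustration; substitute your own `X` and `y`.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; replace with your own X and y)
X, y = make_regression(n_samples=60, n_features=5, noise=25.0, random_state=0)

# Repeat the same 80/20 split with different seeds and record the test MSE
split_mse = []
for seed in range(10):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = LinearRegression().fit(X_train, y_train)
    split_mse.append(mean_squared_error(y_test, model.predict(X_test)))

split_mse = np.array(split_mse)
print(f"MSE per split: {np.round(split_mse, 1)}")
print(f"Min: {split_mse.min():.1f}, Max: {split_mse.max():.1f}")
```

The gap between the smallest and largest MSE shows how much a single split can flatter or penalize the same model.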
Cross-validation solves this by partitioning the data into multiple subsets, or "folds." The model is trained and evaluated multiple times, ensuring that every data point gets a turn in the test set. This produces a distribution of scores rather than a single point estimate, allowing you to gauge the stability of your model.
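In K-Fold cross-validation, the process follows these steps:
1. Split the dataset into K folds of roughly equal size.
2. Hold out one fold as the test set and train the model on the remaining K-1 folds.
3. Score the model on the held-out fold.
4. Repeat until every fold has served as the test set exactly once, then summarize the K scores, typically with their mean and standard deviation.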
A common choice for K is 5 or 10. With 5-fold cross-validation, your model is evaluated five times, and you get five different accuracy or error scores.
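The K-Fold procedure can be written out by hand with scikit-learn's `KFold` splitter, which may help clarify what happens on each iteration. This is a sketch under assumptions: the data is synthetic, shuffling is enabled with a fixed seed, and the names `fold_mse` and `tested` are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

# Synthetic stand-in data (an assumption; replace with your own X and y)
X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

# Shuffle once, then cut the rows into 5 folds
kf = KFold(n_splits=5, shuffle=True, random_state=42)

fold_mse = []
tested = []  # row indices that have appeared in a test fold
for train_idx, test_idx in kf.split(X):
    # Train a fresh model on the other 4 folds
    model = LinearRegression()
    model.fit(X[train_idx], y[train_idx])

    # Score on the held-out fold
    preds = model.predict(X[test_idx])
    fold_mse.append(mean_squared_error(y[test_idx], preds))
    tested.extend(test_idx)

fold_mse = np.array(fold_mse)
print(f"MSE per fold: {np.round(fold_mse, 2)}")
print(f"Mean: {fold_mse.mean():.2f}, Std: {fold_mse.std():.2f}")
```

After the loop, every row index appears in `tested` exactly once, which is the rotation property described above.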
Scikit-learn makes this straightforward with the cross_val_score function. It handles the splitting, training, and scoring internally, returning an array of scores.
```python
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression
import numpy as np

# Assuming X and y are already preprocessed
model = LinearRegression()

# Perform 5-fold cross-validation
# We use 'neg_mean_squared_error' as an example metric
scores = cross_val_score(model, X, y, cv=5, scoring='neg_mean_squared_error')

# cross_val_score returns negative values for error metrics
# to ensure higher is always better for scikit-learn
mse_scores = -scores

print(f"Scores for each fold: {mse_scores}")
print(f"Mean MSE: {mse_scores.mean():.4f}")
print(f"Standard Deviation: {mse_scores.std():.4f}")
```
Using the project dataset you set up in the project dataset initialization lesson, apply 5-fold cross-validation to your baseline model.
1. Import cross_val_score from sklearn.model_selection.
2. Use a Pipeline to ensure that scaling happens inside the cross-validation loop for each fold.

Cross-validation is your primary tool for assessing model stability. By using K-Fold techniques, you move away from the volatility of a single train-test split and gain a statistical understanding of your model's performance. Always pair this with Pipeline objects to avoid data leakage and ensure your evaluation is truly representative of how your model will handle new data in the real world.
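To make the Pipeline advice concrete, here is a minimal sketch that wraps a `StandardScaler` and a `LinearRegression` baseline in one pipeline and passes it to `cross_val_score`. The synthetic data and the choice of scaler are assumptions for illustration; your project dataset and preprocessing steps may differ.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption; replace with your project dataset)
X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

# The scaler is refit on the training folds of every split,
# so no statistics from the held-out fold leak into training
pipeline = make_pipeline(StandardScaler(), LinearRegression())

scores = cross_val_score(pipeline, X, y, cv=5, scoring="neg_mean_squared_error")
mse_scores = -scores  # flip the sign back to ordinary MSE

print(f"MSE per fold: {mse_scores}")
print(f"Mean MSE: {mse_scores.mean():.4f}")
print(f"Standard Deviation: {mse_scores.std():.4f}")
```

Fitting the scaler on the full dataset before calling `cross_val_score` would let information from each test fold influence training, which is the leakage the pipeline avoids.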
Up next: Diagnosing Model Weaknesses by analyzing where your model fails.