Learn to diagnose overfitting and underfitting by mastering the concepts of bias and variance. Build models that generalize, not just memorize.
Previously in this course, we explored training error vs generalization error. While that lesson taught you how to compare performance metrics, it didn't explain why those gaps appear. In this lesson, we dive into the root causes: the relationship between model complexity, bias, and variance.
Every machine learning model you build is an attempt to map input features to a target outcome. However, no model is perfect. For squared-error loss, the expected error a model makes on unseen data can be decomposed into three parts: squared bias, variance, and irreducible noise.
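Written out for squared-error loss, the decomposition looks as follows. The notation is an assumption introduced here for illustration, not something the lesson defines: the target is $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, and $\hat{f}_D$ is the model obtained by training on a randomly drawn training set $D$.

```latex
\mathbb{E}_{D,\varepsilon}\!\left[\big(y - \hat{f}_D(x)\big)^2\right]
  = \underbrace{\big(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\big(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)]\big)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The first two terms depend on the model you choose; the last one does not, which is why no amount of tuning can push the expected error below $\sigma^2$.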
Bias measures how far a model's average prediction, taken over many possible training sets, falls from the true value. A model with high bias makes strong, simplifying assumptions about the data and therefore misses real patterns.
Variance measures how much the model's predictions change when you train it on a different sample of data drawn from the same source. A model with high variance is overly sensitive to the specific noise or "quirks" in your training set.
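These two definitions can be checked numerically on a toy problem. The sketch below is a minimal simulation under stated assumptions, not part of the lesson's own code: the true function (a sine wave), the noise level, the sample size, and the degrees 1 and 9 are all made up for illustration. To keep things simple, the training inputs stay fixed and only the noise is redrawn for each simulated training set.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # Hypothetical ground truth; in real problems this is unknown.
    return np.sin(2 * np.pi * x)

def bias_variance(degree, n_train=40, n_repeats=200, noise_sd=0.3):
    """Estimate squared bias and variance of a polynomial fit by simulation."""
    x_train = np.linspace(0.0, 1.0, n_train)   # fixed training inputs
    x_eval = np.linspace(0.0, 1.0, 100)        # points where we compare predictions
    preds = np.empty((n_repeats, x_eval.size))
    for i in range(n_repeats):
        # A fresh noisy training set each time.
        y_train = true_fn(x_train) + rng.normal(0.0, noise_sd, n_train)
        coefs = np.polyfit(x_train, y_train, degree)
        preds[i] = np.polyval(coefs, x_eval)
    avg_pred = preds.mean(axis=0)
    # Squared bias: gap between the average prediction and the truth.
    bias_sq = np.mean((avg_pred - true_fn(x_eval)) ** 2)
    # Variance: spread of predictions across training sets.
    variance = np.mean(preds.var(axis=0))
    return bias_sq, variance

results = {degree: bias_variance(degree) for degree in (1, 9)}
for degree, (bias_sq, variance) in results.items():
    print(f"degree={degree}: bias^2={bias_sq:.4f}, variance={variance:.4f}")
```

With settings like these, the straight line should show the larger squared bias and the degree-9 polynomial the larger variance, which is the trade-off the definitions describe.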
To diagnose your model, you must compare performance metrics across your training and testing sets.
| Condition | Training Error | Test Error | Complexity |
|---|---|---|---|
| Underfitting | High | High | Too Low |
| Overfitting | Low | High | Too High |
| Ideal Model | Low | Low | Balanced |
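The table can be turned into a small rule-of-thumb helper. This is only a sketch: `diagnose_fit` and its `acceptable_error` threshold are hypothetical names introduced here, and what counts as a "high" error depends entirely on your problem and metric.

```python
def diagnose_fit(train_error, test_error, acceptable_error):
    """Label a model using the rough rules from the table.

    acceptable_error is a hypothetical, problem-specific cutoff:
    errors above it count as "high", errors at or below it as "low".
    """
    train_high = train_error > acceptable_error
    test_high = test_error > acceptable_error
    if train_high and test_high:
        return "underfitting"
    if not train_high and test_high:
        return "overfitting"
    if not train_high and not test_high:
        return "balanced"
    # High training error with low test error is not covered by the table;
    # it usually points to a data or evaluation problem.
    return "unclear"

# Illustrative numbers only.
print(diagnose_fit(100, 50_000, acceptable_error=1_000))
print(diagnose_fit(20_000, 20_000, acceptable_error=1_000))
print(diagnose_fit(400, 450, acceptable_error=1_000))
```

A single cutoff is a deliberate simplification. In practice you would also look at the size of the gap between the two errors, not just which side of a threshold each one falls on.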
Imagine we are predicting house prices. We use a polynomial regression model.
```python
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Assumes X_train, X_test, y_train, y_test are already defined.
# Low complexity: degree 1 (underfitting)
# High complexity: degree 15 (overfitting)
poly = PolynomialFeatures(degree=15)
X_poly = poly.fit_transform(X_train)

model = LinearRegression()
model.fit(X_poly, y_train)

# Assessing the gap
train_preds = model.predict(X_poly)
test_preds = model.predict(poly.transform(X_test))

print(f"Train MSE: {mean_squared_error(y_train, train_preds)}")
print(f"Test MSE: {mean_squared_error(y_test, test_preds)}")
```
If your Train MSE is 100 but your Test MSE is 50,000, you are looking at a classic case of overfitting. The model has chased the noise in the training data, losing its ability to generalize. Conversely, if both errors are high and close together, say around 20,000 each, your model hasn't captured enough signal: that is underfitting.
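To see both failure modes side by side, the same train/test comparison can be repeated across several polynomial degrees. The sketch below is self-contained and uses synthetic data, since the lesson's house-price arrays are not shown; the data-generating function, noise level, split, and the degrees 1, 4, and 15 are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for a real dataset: one feature, noisy nonlinear target.
rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, size=(60, 1))
y = np.sin(3 * X[:, 0]) + rng.normal(0.0, 0.2, size=60)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

results = {}
for degree in (1, 4, 15):
    # The pipeline applies the same polynomial expansion at fit and predict time.
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    results[degree] = (train_mse, test_mse)
    print(f"degree={degree:2d}  train MSE={train_mse:.4f}  test MSE={test_mse:.4f}")
```

Training error can only go down as the degree rises, because each higher-degree model contains the lower-degree ones. The test error is the number to watch: on data like this you would typically expect it to be worst at the extremes and best somewhere in between.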
Take the baseline model you created in our previous session on training the baseline linear model.
Overfitting and underfitting represent the two ends of the model complexity spectrum. By monitoring the performance gap between training and testing data, you can diagnose whether your model suffers from high bias (underfitting) or high variance (overfitting). Your goal is to find the "sweet spot" where the model is complex enough to capture the signal but simple enough to ignore the noise.
Up next: We will learn how to quantify these errors using specific Regression Evaluation Metrics like RMSE and R-squared to make your diagnostic process more precise.