Master error analysis plots to move beyond aggregate metrics. Learn to visualize residuals for regression and classification errors to find model blind spots.
Previously in this course, we discussed Regression Evaluation Metrics and the Confusion Matrix, which provide high-level summaries of how well your model performs. However, aggregate numbers like RMSE or accuracy hide the why behind a failure.
Error analysis is the diagnostic phase of machine learning where we look at individual predictions to understand where the model struggles. If your model is failing, it's rarely failing everywhere equally; it’s usually failing on specific subsets of data. Today, we’ll use error analysis and visualization to uncover those patterns.
A residual is simply the difference between the actual value and the predicted value: $\text{residual} = y_{\text{actual}} - y_{\text{predicted}}$. If your model were perfect, all residuals would be zero.
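As a quick worked example of the formula, here is a minimal sketch using four made-up values. The numbers and the names `y_actual` and `y_predicted` are illustrative assumptions, not part of the course dataset.

```python
import numpy as np

# Hypothetical values for illustration only (not from the course dataset)
y_actual = np.array([200.0, 150.0, 320.0, 275.0])
y_predicted = np.array([210.0, 140.0, 300.0, 275.0])

# residual = actual - predicted
residuals = y_actual - y_predicted

print(residuals.tolist())  # [-10.0, 10.0, 20.0, 0.0]
```

With this sign convention, a negative residual means the model over-predicted and a positive residual means it under-predicted.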
A residual plot displays the predicted values on the x-axis and the residuals on the y-axis. In a well-behaved linear model, you want to see a random "cloud" of points centered around zero. If you see a pattern, such as a U-shape or a funnel, your model is failing to capture a systematic relationship in the data.
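To see how a U-shape can arise, the sketch below fits a straight line to synthetic data whose true relationship is quadratic, then summarizes the residuals in three groups ordered by predicted value. The data is generated purely for illustration, and the three-way split is just one simple way to put numbers on what the plot would show.

```python
import numpy as np

# Synthetic data, for illustration only: the true relationship is quadratic
rng = np.random.default_rng(0)
x = np.linspace(0, 6, 300)
y = x**2 + rng.normal(scale=0.3, size=x.size)

# Fit a straight line (np.polyfit returns the highest-degree coefficient first)
slope, intercept = np.polyfit(x, y, deg=1)
y_pred = slope * x + intercept
residuals = y - y_pred

# Split residuals into three equal groups, ordered by predicted value
order = np.argsort(y_pred)
low, mid, high = np.array_split(residuals[order], 3)

print(f"mean residual, low predictions:  {low.mean():+.2f}")
print(f"mean residual, mid predictions:  {mid.mean():+.2f}")
print(f"mean residual, high predictions: {high.mean():+.2f}")
```

The outer groups should come out with positive mean residuals and the middle group with a negative one, which is the U-shape expressed as numbers: the line under-predicts at the extremes and over-predicts in between.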
Using our ongoing project dataset, let’s visualize the residuals to see if our linear model is missing non-linear patterns.
```python
import matplotlib.pyplot as plt
import numpy as np

# Assuming 'y_test' and 'y_pred' are your numpy arrays
residuals = y_test - y_pred

plt.figure(figsize=(8, 5))
plt.scatter(y_pred, residuals, alpha=0.5)
plt.axhline(y=0, color='r', linestyle='--')
plt.xlabel("Predicted Values")
plt.ylabel("Residuals")
plt.title("Residual Plot: Checking for Systematic Bias")
plt.show()
```
If the points form a megaphone shape (getting wider as predictions increase), this indicates heteroscedasticity: the variance of your model's errors is not constant. This often happens when the model fails to account for the increasing scale of your target variable, so errors grow along with the predictions.
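One rough way to back up the visual impression of a megaphone is to compare the spread of residuals across bins of predicted values. The sketch below is an informal check under assumptions, not a formal statistical test: `residual_spread_by_bin` is a hypothetical helper, and the demo data is synthetic, with noise deliberately scaled to grow with the prediction.

```python
import numpy as np

def residual_spread_by_bin(y_pred, residuals, n_bins=4):
    """Std of residuals within equal-sized bins, ordered by predicted value.

    Hypothetical helper for illustration; an informal check, not a formal test.
    """
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    order = np.argsort(y_pred)
    chunks = np.array_split(residuals[order], n_bins)
    return [float(chunk.std()) for chunk in chunks]

# Synthetic demo: the noise scale is 10% of the predicted value
rng = np.random.default_rng(42)
y_pred = np.linspace(10, 100, 400)
residuals = rng.normal(scale=0.1 * y_pred)

spreads = residual_spread_by_bin(y_pred, residuals)
print([round(s, 2) for s in spreads])
```

If the spread climbs steadily from the first bin to the last, the error variance is growing with the predictions. Roughly equal values across bins are what you would expect when the variance is constant.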
For classification, we don't have residuals, but we do have "false" predictions. We want to know: What do the examples the model gets wrong have in common?
If you are building a binary classifier, take your test set and filter for the rows where the model was wrong (False Positives and False Negatives). Then, compare the distribution of features for these "error" rows against the rest of the dataset.
```python
# Assuming 'X_test' is a DataFrame and 'y_test', 'y_pred' are arrays
errors = X_test[y_test != y_pred]
correct = X_test[y_test == y_pred]

# Compare the mean of a specific feature for errors vs correct predictions
print(f"Mean of feature 'Age' in errors: {errors['Age'].mean()}")
print(f"Mean of feature 'Age' in correct: {correct['Age'].mean()}")
```
If the mean "Age" differs noticeably between your error set and your correct set, you may have found a blind spot. Perhaps your model performs poorly on older individuals because they are underrepresented in the training data.
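A single mean can hide where the errors concentrate, so it can help to look at the error rate within segments of a feature and to separate false positives from false negatives. The following is a minimal sketch under assumptions: `error_rate_by_segment` is a hypothetical helper, the age bins are arbitrary, and the ten rows of data are made up for illustration.

```python
import numpy as np
import pandas as pd

def error_rate_by_segment(feature, y_true, y_pred, bins):
    """Summarize binary-classification errors per bin of one numeric feature.

    Hypothetical helper for illustration; assumes labels are 0 and 1.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    frame = pd.DataFrame({
        "segment": pd.cut(np.asarray(feature), bins=bins),
        "is_error": y_true != y_pred,
        "false_pos": (y_pred == 1) & (y_true == 0),
        "false_neg": (y_pred == 0) & (y_true == 1),
    })
    return frame.groupby("segment", observed=False).agg(
        n=("is_error", "count"),
        error_rate=("is_error", "mean"),
        false_pos=("false_pos", "sum"),
        false_neg=("false_neg", "sum"),
    )

# Made-up data for illustration only
ages = [22, 25, 31, 38, 45, 52, 67, 71, 74, 80]
y_true = [0, 1, 0, 1, 0, 1, 1, 0, 1, 1]
y_pred = [0, 1, 0, 1, 0, 0, 0, 1, 1, 0]

summary = error_rate_by_segment(ages, y_true, y_pred, bins=[18, 40, 60, 90])
print(summary)
```

In this toy data the error rate rises from 0.0 in the youngest bin to 0.75 in the oldest, and most of the errors in the oldest bin are false negatives. With real data, check the `n` column before drawing conclusions, since a high error rate in a bin with very few rows may just be noise.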
Aggregate metrics are your starting point, but error analysis is your compass. By plotting residuals, you can spot systematic bias in regression. By isolating and profiling classification errors, you can identify specific segments of your population where the model is failing. Visualization turns abstract loss numbers into actionable insights about your data.
Up next: We will dive into Introduction to Cross-Validation to ensure our error estimates are robust and not just a fluke of a single train-test split.