Mahamudul Hasan Rubel
Lesson 42 of the Intermediate Machine Learning: Real-World Pipelines course
AI/ML · June 26, 2026 · 4 min read

Monitoring Data Drift: A Practical Guide for ML Engineers

Data drift occurs when production data shifts away from your training baseline. Learn to calculate the Population Stability Index and set up alerts to catch it.

Tags: MLOps, Data Drift, Production, Monitoring, Python, ai, machine-learning

Previously in this course, we covered Input Validation and Schema Enforcement for ML Pipelines, which ensures your model receives the correct data types. Now, we move beyond schema structure to address the content of that data: data drift.

Even if your API receives perfectly formatted JSON, your model can still fail if the underlying distribution of the features changes. This is often called "silent failure": nothing raises an error, but predictions get worse. It happens when the world changes: user behavior shifts, seasonal trends emerge, or upstream systems change their data generation process. In this lesson, we will implement methods to detect these shifts quantitatively.

What is Data Drift?

Data drift (or covariate shift) is the change in the distribution of input features ($P(X)$) between the training set and the production set.

If your model was trained on data where the average customer age was 35, but your production traffic shifts to an average of 55, the model is now operating in a region of the feature space it hasn't "seen" during training. Because the model's decision boundaries were optimized for the 35-year-old distribution, performance will likely degrade.
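To make the age example concrete, here is a minimal sketch with simulated data. The sample sizes, the standard deviation of 8, and the 1st to 99th percentile band are illustrative assumptions, not values from a real dataset. It measures how much of the production traffic falls outside the range the model mostly saw during training.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical ages: training centered on 35, production centered on 55.
train_age = rng.normal(loc=35, scale=8, size=5_000)
prod_age = rng.normal(loc=55, scale=8, size=5_000)

# Band that covers the central 98% of the training data.
low, high = np.percentile(train_age, [1, 99])

# Share of production rows that fall outside that band.
outside = np.mean((prod_age < low) | (prod_age > high))

print(f"training mean: {train_age.mean():.1f}, production mean: {prod_age.mean():.1f}")
print(f"production rows outside the training 1-99% band: {outside:.0%}")
```

A large share of production rows landing outside the training band is a quick sanity signal, but it is not a substitute for a proper distribution comparison.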

The Population Stability Index (PSI)

The Population Stability Index (PSI) is a widely used metric for measuring how much a variable's distribution has changed over time. It quantifies the difference between two distributions by binning the data and comparing the percentage of samples in each bin.

The formula for PSI is: $$PSI = \sum (Actual\% - Expected\%) \times \ln\left(\frac{Actual\%}{Expected\%}\right)$$

  • PSI < 0.1: No significant change.
  • 0.1 <= PSI < 0.25: Moderate change; requires investigation.
  • PSI >= 0.25: Significant change; likely requires model retraining or adjustment.
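As a worked example of the formula and the thresholds above, the sketch below computes PSI directly from per-bin percentages. The three-bin percentages and the helper names `psi_from_percentages` and `interpret_psi` are made up for illustration.

```python
import numpy as np

def psi_from_percentages(expected_pct, actual_pct):
    """PSI from two arrays of per-bin proportions (each summing to 1)."""
    expected_pct = np.asarray(expected_pct, dtype=float)
    actual_pct = np.asarray(actual_pct, dtype=float)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

def interpret_psi(psi):
    """Map a PSI value to the bands listed above."""
    if psi < 0.1:
        return "no significant change"
    if psi < 0.25:
        return "moderate change"
    return "significant change"

# Hypothetical bin shares: mass moves from the first bin to the last one.
expected_pct = [0.5, 0.3, 0.2]
actual_pct = [0.3, 0.3, 0.4]

psi = psi_from_percentages(expected_pct, actual_pct)
print(f"PSI = {psi:.4f} -> {interpret_psi(psi)}")
# prints "PSI = 0.2408 -> moderate change"
```

The unchanged middle bin contributes exactly zero; the two bins that moved contribute about 0.102 and 0.139.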

Implementing Drift Checks in Python

To implement this, we need to compare a "reference" dataset (usually your training set) against a "current" dataset (the latest batch of production data). We'll use numpy to bin the data and compute the PSI.

PYTHON
import numpy as np

def calculate_psi(expected, actual, buckets=10):
    """Population Stability Index of `actual` relative to `expected`."""
    def get_bin_percentages(data, bins):
        # Clip so values outside the reference range fall into the outermost
        # bins instead of being silently dropped by np.histogram.
        clipped = np.clip(data, bins[0], bins[-1])
        counts, _ = np.histogram(clipped, bins=bins)
        return counts / len(data)

    # Define bins based on the reference (expected) distribution
    _, bins = np.histogram(expected, bins=buckets)

    expected_pct = get_bin_percentages(expected, bins)
    actual_pct = get_bin_percentages(actual, bins)

    # Floor each percentage at a small epsilon to avoid division by zero or log(0)
    expected_pct = np.clip(expected_pct, 0.0001, None)
    actual_pct = np.clip(actual_pct, 0.0001, None)

    psi_values = (actual_pct - expected_pct) * np.log(actual_pct / expected_pct)
    return np.sum(psi_values)

# Example usage (seeded so the result is reproducible)
rng = np.random.default_rng(42)
train_data = rng.normal(0, 1, 1000)
prod_data = rng.normal(0.2, 1.1, 1000)  # Slightly shifted

psi = calculate_psi(train_data, prod_data)
print(f"PSI: {psi:.4f}")

Setting Up Alerts

In a production environment, you shouldn't manually run this script. You need an automated monitoring loop.

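  1. Reference Baseline: During your Project Milestone: Tuning the Champion Model, save the distribution statistics (mean, std, min, max, and quantiles) of your training features into a JSON config file. Also store the bin edges and per-bin percentages, since PSI is computed from binned proportions rather than from summary statistics.
  2. Batch Monitoring: Run a daily job that pulls a sample of production logs and computes the PSI of each monitored feature against the saved baseline.
  3. Thresholding: If the PSI exceeds 0.2, a cutoff inside the "moderate change" band above, trigger a warning through your alerting channel (e.g., a Grafana dashboard or a Slack message).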
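The three steps above can be sketched roughly as follows. This is a minimal illustration, not a production job: `build_baseline` and `check_batch` are hypothetical helper names, the baseline is kept as an in-memory JSON string rather than a file, and the production batches are simulated. A real job would load logged features and send the alert to your own tooling.

```python
import json
import numpy as np

def build_baseline(values, buckets=10):
    """Step 1: capture bin edges and per-bin shares of the reference data."""
    counts, edges = np.histogram(values, bins=buckets)
    return {"edges": edges.tolist(), "expected_pct": (counts / counts.sum()).tolist()}

def check_batch(baseline, batch, threshold=0.2, eps=1e-4):
    """Steps 2 and 3: compute PSI for one batch and compare it to the threshold."""
    edges = np.asarray(baseline["edges"])
    expected_pct = np.clip(np.asarray(baseline["expected_pct"]), eps, None)
    # Clip so values outside the training range land in the outermost bins.
    clipped = np.clip(batch, edges[0], edges[-1])
    counts, _ = np.histogram(clipped, bins=edges)
    actual_pct = np.clip(counts / counts.sum(), eps, None)
    psi = float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))
    return psi, psi > threshold

rng = np.random.default_rng(1)

# What you would write to the JSON config file at training time.
baseline_json = json.dumps(build_baseline(rng.normal(0, 1, 5_000)))

# What the daily job would do: load the baseline, then score each batch.
baseline = json.loads(baseline_json)
batches = {
    "stable": rng.normal(0, 1, 2_000),     # same distribution as training
    "shifted": rng.normal(1.0, 1, 2_000),  # mean moved by one standard deviation
}
results = {}
for name, batch in batches.items():
    results[name] = check_batch(baseline, batch)
    psi, alert = results[name]
    print(f"{name}: PSI={psi:.3f} alert={alert}")
```

Storing the bin edges alongside the percentages matters: reusing the training-time edges keeps every daily PSI value comparable to the same baseline.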

Hands-on Exercise

Using the calculate_psi function provided above, perform the following:

  1. Generate two datasets: baseline (normal distribution) and drifted (normal distribution with a different mean).
  2. Calculate the PSI.
  3. Write a simple conditional check that prints "ALERT: Drift detected" if the PSI is greater than 0.1.
  4. Experiment with different buckets values; how does changing the number of bins affect the sensitivity of the PSI?

Common Pitfalls

  • Ignoring Categorical Data: The binning shown here assumes a numeric feature. For categorical features, treat each category as its own bin and apply the same formula, use the Jensen-Shannon Divergence, or simply compare frequency counts directly.
  • Too Much Data, Too Little Signal: If you monitor every single feature, you will suffer from "alert fatigue." Focus your monitoring on high-impact features—those with the highest importance scores identified in Interpreting Complex Ensembles.
  • The "Feedback Loop" Trap: If your model influences the data (e.g., a recommendation system), the drift might be caused by your own model's behavior. Always distinguish between feature drift (the world changed) and model-induced shift.
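For the categorical case in the first pitfall, here is a small sketch that compares category frequencies and computes the Jensen-Shannon divergence by hand with numpy. The `channel` values and the helper names `category_shares` and `js_divergence` are hypothetical, and the epsilon floor for unseen categories is an assumption you may want to tune.

```python
from collections import Counter
import numpy as np

def category_shares(values, categories, eps=1e-4):
    """Share of each category, floored at eps so unseen categories do not produce log(0)."""
    counts = Counter(values)
    shares = np.array([counts.get(c, 0) for c in categories], dtype=float) / len(values)
    shares = np.clip(shares, eps, None)
    return shares / shares.sum()  # renormalize after flooring

def kl_divergence(a, b):
    """Kullback-Leibler divergence in bits."""
    return float(np.sum(a * np.log2(a / b)))

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical distributions, at most 1."""
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical categorical feature: the channel a request came from.
reference = ["web"] * 60 + ["ios"] * 30 + ["android"] * 10
current = ["web"] * 30 + ["ios"] * 30 + ["android"] * 40
categories = sorted(set(reference) | set(current))

p = category_shares(reference, categories)
q = category_shares(current, categories)

for name, before, after in zip(categories, p, q):
    print(f"{name}: {before:.2f} -> {after:.2f}")
print(f"JS divergence: {js_divergence(p, q):.3f}")
```

Unlike PSI, the Jensen-Shannon divergence is bounded, which can make a single alert threshold easier to reason about across features with different numbers of categories.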

Recap

Monitoring data drift is not just about tracking numbers—it's about maintaining the contract between your model and the reality it predicts. By using the Population Stability Index, you can provide a rigorous, statistical basis for when it's time to trigger a model refresh.

Up next: Tracking Performance Degradation — we will bridge the gap between input drift and actual model accuracy decay.
