Classification is the foundation of predictive AI. Learn the logic behind categorizing data, defining decision boundaries, and solving real-world problems.
Previously in this course, we explored the mechanics of linear regression, where we learned to predict numeric values like house prices or temperature. In this lesson, we shift our focus from "how much" to "which one." We are entering the world of classification, where our primary goal is to assign data points to discrete categories.
At its core, classification is the task of mapping input variables to a categorical output. While regression models output a continuous range, classification models output a label.
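The difference in output type can be sketched in a few lines. This is only an illustration: the function names, the coefficients, and the 200,000 cutoff are hypothetical and do not come from a trained model.

```python
def predict_price(square_meters: float) -> float:
    """Regression-style output: a continuous number (made-up coefficients)."""
    return 1500.0 * square_meters + 20000.0


def predict_is_expensive(square_meters: float) -> str:
    """Classification-style output: one label from a fixed set of categories."""
    # The 200,000 cutoff is an arbitrary, illustrative choice.
    return "expensive" if predict_price(square_meters) > 200_000 else "affordable"


print(predict_price(100))         # a numeric value
print(predict_is_expensive(100))  # a category label
```

The first function can return any value along a continuous range, while the second can only ever return one of two labels.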
The simplest form is binary classification, which involves exactly two possible outcomes. You are essentially asking a "Yes/No" or "This/That" question, such as whether an email is spam or whether a user will make a purchase.
The logic behind this is fundamentally different from regression. Instead of finding a line that minimizes the distance to data points, we are finding a way to draw a line (or a more complex shape) that separates our data into two distinct groups.
To separate classes, we use a decision boundary. Imagine you have a scatter plot where blue dots represent "Spam" and red dots represent "Not Spam." A decision boundary is the line, curve, or surface that acts as the dividing wall between these two sets.
In two dimensions (two features), a linear boundary is a line. In three dimensions, it becomes a plane. In higher dimensions, it is called a hyperplane. The effectiveness of your model depends heavily on how well this boundary partitions the feature space without misclassifying your training data.
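For a linear boundary, the same rule applies in any number of dimensions: a point falls on one side or the other depending on the sign of w·x + b. The sketch below assumes hand-picked weights `w` and bias `b` (hypothetical values, not learned from data), and `predict_side` is an illustrative helper name.

```python
import numpy as np


def predict_side(X: np.ndarray, w: np.ndarray, b: float) -> np.ndarray:
    """Return 1 for points where w·x + b > 0, else 0."""
    scores = X @ w + b
    return (scores > 0).astype(int)


# 2 features: the boundary x1 + x2 - 8 = 0 is a line.
w2, b2 = np.array([1.0, 1.0]), -8.0
print(predict_side(np.array([[1, 2], [6, 5]]), w2, b2))

# 3 features: the boundary x1 + x2 + x3 - 10 = 0 is a plane.
w3, b3 = np.array([1.0, 1.0, 1.0]), -10.0
print(predict_side(np.array([[1, 2, 3], [5, 4, 3]]), w3, b3))
```

Only the length of `w` changes between the two cases; the classification rule itself stays the same.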
Let’s use Python and NumPy to simulate a simple 2D classification scenario. We will define two features (e.g., "Time Spent on Site" and "Pages Visited") to predict if a user will "Buy" (1) or "Not Buy" (0).
```python
import numpy as np
import matplotlib.pyplot as plt

# Simulate data: 2 features, 2 classes
# Class 0: Lower activity, Class 1: Higher activity
X = np.array([[1, 2], [2, 1], [3, 4], [5, 6], [6, 5], [7, 8]])
y = np.array([0, 0, 0, 1, 1, 1])

# Plotting the points
plt.scatter(X[y == 0, 0], X[y == 0, 1], color='red', label='No Purchase')
plt.scatter(X[y == 1, 0], X[y == 1, 1], color='blue', label='Purchase')

# Defining a manual decision boundary: y = -x + 8
x_vals = np.linspace(0, 8, 100)
y_vals = -1 * x_vals + 8
plt.plot(x_vals, y_vals, 'k--', label='Decision Boundary')

plt.xlabel('Time Spent')
plt.ylabel('Pages Visited')
plt.legend()
plt.show()
```
In this code, the line y = -x + 8 acts as our decision boundary. Here x and y are the plot axes ("Time Spent" and "Pages Visited"), not the label array `y` from the code. Any point above this line belongs to the "Purchase" class, while any point below belongs to "No Purchase." In real-world machine learning, the model "learns" the coefficients of this line (the slope and intercept) automatically during training.
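The "above or below the line" rule can also be applied in code rather than read off the plot. The following is a minimal sketch that reuses the same toy data; `predict_purchase` is a hypothetical helper name, and the boundary is still the hand-picked line, not a learned one.

```python
import numpy as np

# Same toy data as the plotting example.
X = np.array([[1, 2], [2, 1], [3, 4], [5, 6], [6, 5], [7, 8]])
y = np.array([0, 0, 0, 1, 1, 1])


def predict_purchase(points: np.ndarray) -> np.ndarray:
    """Label a point 1 ("Purchase") if it lies above the line y = -x + 8."""
    time_spent, pages_visited = points[:, 0], points[:, 1]
    boundary_height = -1 * time_spent + 8
    return (pages_visited > boundary_height).astype(int)


predictions = predict_purchase(X)
accuracy = (predictions == y).mean()
print(predictions)  # one 0/1 prediction per row of X
print(accuracy)     # fraction of points on the correct side of the line
```

On this tiny dataset the manual line separates every point correctly, which is why it works as an illustration. Real data is rarely this clean.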
Using the logic from the example above, consider a dataset with two features: "Temperature" and "Humidity." You want to predict if it will "Rain" (1) or "Stay Sunny" (0).
1. Choose a boundary line (e.g., Humidity = 0.5 * Temperature + constant).
2. Write a function that takes a [Temperature, Humidity] list and returns the predicted class based on whether it is above or below your boundary line.

Classification allows us to map inputs to discrete categories. We achieve this by defining a decision boundary that partitions our feature space. By mastering this logic, you move from simply measuring trends to making actionable, categorical decisions, a skill essential for "LLM routing for production: dynamic task classification & scaling" and many other advanced AI workflows.
Up next: We will dive into the math of how models "learn" these boundaries by exploring Loss Functions and Model Objectives, specifically focusing on how we penalize incorrect classifications.
Mechanics of Classification