Hyperparameter Tuning Basics: Controlling Model Behavior

Master the difference between learned parameters and hyperparameters. Learn to identify tunable settings to optimize your machine learning models effectively.


Previously in this course, we explored the Bias-Variance Tradeoff to understand why models fail. Now that we know how to diagnose underfitting and overfitting, we need the tools to fix them. This lesson introduces hyperparameters, the "knobs" you turn to control how your model learns.

What Are Hyperparameters?

In machine learning, every model has two types of settings. First, there are learned parameters (often called weights or coefficients). These are the internal values the model adjusts automatically during training, such as the coefficients in a linear regression model. You don't set these; the training algorithm derives them from the data.

Second, there are hyperparameters. These are the configuration settings you define before training begins. They dictate the "strategy" the model uses to find its internal weights. If the model were a student, the learned parameters would be the knowledge they gain, while hyperparameters would be the study habits or the curriculum you assigned them.

Learned Parameters vs. Hyperparameters

The distinction is critical for your workflow. Learned parameters are fit under one specific hyperparameter configuration, so if you change a hyperparameter, you must retrain the model from scratch.

Feature	Learned Parameters	Hyperparameters
Origin	Derived from training data	Set by the user (you)
Adjustment	Automatic (via optimization)	Manual (via tuning)
Timing	During training	Before training
Examples	Regression coefficients, node weights	Learning rate, tree depth, regularization strength
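
The split in the table can be seen directly in code. The following is a minimal sketch, not part of the original lesson: it assumes a Ridge regression model (chosen because its alpha setting is a regularization strength) and a small synthetic dataset whose values are purely illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Illustrative synthetic data: y = 3*x0 - 2*x1 + 1 plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0 + rng.normal(scale=0.1, size=100)

# Hyperparameter: chosen by us and available before any training happens
model = Ridge(alpha=1.0)
hyperparams_before_fit = model.get_params()
has_coef_before_fit = hasattr(model, "coef_")  # learned parameters do not exist yet

# Learned parameters: created by fit() and derived from the data
model.fit(X, y)
print("alpha (hyperparameter):", hyperparams_before_fit["alpha"])
print("coef_ (learned):", model.coef_)
print("intercept_ (learned):", model.intercept_)

# A different hyperparameter value means a new fit and new learned parameters
strong = Ridge(alpha=1000.0).fit(X, y)
print("coef_ with alpha=1000:", strong.coef_)
```

Attributes that end in an underscore, such as coef_, are Scikit-Learn's convention for values that only exist after fitting.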

Identifying Tunable Parameters

Different algorithms have different sets of hyperparameters. As you progress through this course, you'll encounter various models, each requiring its own unique configuration. Here is how to identify them for common Scikit-Learn models:

1. Linear Regression

fit_intercept: A boolean (True/False) that controls whether the model estimates an intercept term. If False, the fit is forced through the origin, so the data should already be centered.
positive: When set to True, constrains the coefficients to be non-negative.

2. Decision Trees

max_depth: The maximum depth of the tree. A deeper tree is more complex and more prone to overfitting.
min_samples_split: The minimum number of samples required to split an internal node. Higher values restrict tree growth, which reduces overfitting but can cause underfitting if set too high.

3. K-Nearest Neighbors (KNN)

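n_neighbors: The number of neighbors consulted for each prediction. Small values follow the training data closely (higher variance); large values smooth the predictions (higher bias).
weights: Whether all neighbors count equally ("uniform") or closer neighbors count more ("distance").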
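
One quick way to list the tunable settings of any Scikit-Learn estimator is its get_params() method, which returns the constructor arguments and their current values. The sketch below uses the regressor variants of the three models above as an assumption; the classifier variants expose a similar interface.

```python
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

models = [LinearRegression(), DecisionTreeRegressor(), KNeighborsRegressor()]

# Collect each model's hyperparameters (constructor arguments) by class name
all_params = {type(m).__name__: m.get_params() for m in models}

for model_name, params in all_params.items():
    print(model_name)
    for name in sorted(params):
        print(f"  {name} = {params[name]!r}")
```

The printed values are the defaults, which is what you get when you create a model without arguments.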

Worked Example: Exploring Model Configuration

Let’s look at how we define these settings in code using a Decision Tree. Notice how we pass these values during initialization.


PYTHON
from sklearn.tree import DecisionTreeRegressor

# Configuring the model with specific hyperparameters
# We are manually setting the "strategy" for how the tree grows
model = DecisionTreeRegressor(
    max_depth=5, 
    min_samples_split=10, 
    random_state=42
)

# After setting these, the model is ready to learn its parameters
# model.fit(X_train, y_train)

By setting max_depth=5, we explicitly constrain the model's complexity. If we left it at the default (None), the tree would keep splitting until every leaf is pure or holds fewer than min_samples_split samples, which often leads to overfitting. Tuning these hyperparameters is how we trade off underfitting against overfitting.
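
To see what the constraint does in practice, here is a minimal sketch on synthetic data. The dataset from make_regression, the split sizes, and the names shallow and deep are assumptions for illustration, not part of the lesson's project.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Illustrative synthetic data with a held-out validation split
X, y = make_regression(n_samples=500, n_features=5, noise=20.0, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Same configuration as the worked example above
shallow = DecisionTreeRegressor(
    max_depth=5, min_samples_split=10, random_state=42
).fit(X_train, y_train)

# Default depth: the tree grows until it cannot split any further
deep = DecisionTreeRegressor(max_depth=None, random_state=42).fit(X_train, y_train)

for name, tree in [("max_depth=5", shallow), ("max_depth=None", deep)]:
    print(
        name,
        "| depth:", tree.get_depth(),
        "| leaves:", tree.get_n_leaves(),
        "| train R^2:", round(tree.score(X_train, y_train), 3),
        "| val R^2:", round(tree.score(X_val, y_val), 3),
    )
```

A large gap between the training and validation scores of the unconstrained tree is the usual sign of overfitting; how large the gap is depends on the data.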

Hands-on Exercise

Open your project notebook where you previously trained your baseline model.
Identify the model class you are using (e.g., RandomForestRegressor or LogisticRegression).
Consult the Scikit-Learn documentation for that class.
Find two hyperparameters related to model complexity (e.g., max_depth for trees or C for logistic regression).
Create two different instances of the model with different values for these hyperparameters and print them to the console to confirm they are set correctly.

Common Pitfalls

Tuning on the Test Set: Never use your test set to choose your hyperparameters. If you do, you are essentially "leaking" information about the test data into your model, which leads to overly optimistic performance results. Always use validation sets or cross-validation.
Over-tuning: It is easy to get lost in an endless loop of searching for the "perfect" configuration. Remember that a simple model that performs well is often better than a complex, highly-tuned model that is fragile in production.
Ignoring random_state: Many algorithms (like Random Forests) are stochastic. If you don't set a random_state, your results can change from run to run, making it hard to tell whether a change in performance is due to your tuning or just random luck.
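
The first and third pitfalls can be avoided together. The sketch below is one possible setup under stated assumptions: synthetic data, a hypothetical list of candidate depths, and 5-fold cross-validation on the training split only. The test set is scored once, after the choice is made, and every random_state is fixed so reruns give the same numbers.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Illustrative synthetic data; the test split is set aside immediately
X, y = make_regression(n_samples=400, n_features=5, noise=20.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical candidate values for a single hyperparameter
candidate_depths = [2, 4, 6, 8]

# Compare candidates using cross-validation on the training data only
cv_scores = {}
for depth in candidate_depths:
    model = DecisionTreeRegressor(max_depth=depth, random_state=0)
    scores = cross_val_score(model, X_train, y_train, cv=5)  # R^2 per fold
    cv_scores[depth] = scores.mean()

best_depth = max(cv_scores, key=cv_scores.get)
print("mean CV R^2 by depth:", cv_scores)
print("selected max_depth:", best_depth)

# Only now touch the test set, once, to estimate generalization
final_model = DecisionTreeRegressor(max_depth=best_depth, random_state=0)
final_model.fit(X_train, y_train)
test_score = final_model.score(X_test, y_test)
print("test R^2:", round(test_score, 3))
```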

Recap

Hyperparameters are the manual configuration settings that define how a model learns. Unlike learned parameters, which the model discovers during training, you define hyperparameters to control complexity and prevent overfitting. By identifying the critical "knobs" for your chosen algorithm, you gain the ability to steer your model's performance toward better generalization.

Up next: We will move beyond manual setting and look at Implementing Grid Search to automate the discovery of the best hyperparameter combinations.
