Hyperparameter Tuning in Machine Learning with Python

Machine learning has continued to evolve since its inception, and today it is a crucial component of many businesses and industries around the globe. One concept that often challenges both beginners and experienced developers is hyperparameter tuning. This article aims to demystify hyperparameters and the tuning process, with a special focus on Python implementation.

What are Hyperparameters?

In machine learning, a model’s performance depends on two kinds of values: parameters and hyperparameters. Parameters (for example, the weights of a neural network) are learned during the training process, while hyperparameters are set before the training process begins. Examples of hyperparameters include the learning rate, the number of hidden layers in a neural network, and the kernel type or regularization strength C in an SVM. The process of systematically searching for hyperparameters that give good performance is called hyperparameter tuning.
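The distinction can be illustrated with a minimal sketch. The choice of LogisticRegression and the Iris dataset here is an assumption made for illustration; any Scikit-learn estimator would show the same split between values we set and values the model learns.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Hyperparameters: chosen by us before training starts.
model = LogisticRegression(C=1.0, max_iter=1000)

# Parameters: learned from the data during fit().
model.fit(X, y)

print("hyperparameter C:", model.get_params()["C"])
print("shape of learned coefficient matrix:", model.coef_.shape)
```

Changing C and refitting would produce different learned coefficients, which is why the choice of hyperparameters affects the final model.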

Why is Hyperparameter Tuning Necessary?

The objective of any machine learning model is to make the most accurate predictions possible. By tuning a model’s hyperparameters, we are essentially refining and adjusting the model to better fit the data and the problem at hand.

Selecting good hyperparameters can be a game-changer for your machine learning model. Well-chosen hyperparameters can reduce overfitting (the model performs noticeably better on training data than on unseen test data) and underfitting (the model performs poorly even on training data), and can improve the accuracy, precision, and recall of your model.
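As a rough illustration of how one hyperparameter moves a model between underfitting and overfitting, the sketch below compares training and test accuracy of a decision tree at several depths. The dataset (breast cancer), the model, and the depth values are assumptions chosen for the example, not part of the original discussion.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

scores = {}
for depth in (1, 3, None):  # None lets the tree grow until its leaves are pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    # Store (training accuracy, test accuracy) for this depth
    scores[depth] = (tree.score(X_train, y_train), tree.score(X_test, y_test))
    print(f"max_depth={depth}: train={scores[depth][0]:.3f}, test={scores[depth][1]:.3f}")
```

A large gap between the training and test columns suggests overfitting, while low values in both suggest underfitting.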

Techniques of Hyperparameter Tuning

There are three commonly used methods for Hyperparameter tuning:

Grid Search: You manually specify a finite set of candidate values for each hyperparameter. Grid search then trains a model for every combination of these values and evaluates each one on held-out data (typically through cross-validation) to determine which combination performs best.

Random Search: Random search samples hyperparameter values from distributions that you specify over fixed ranges. Each candidate setting is sampled independently, and only a fixed number of settings is evaluated.

Bayesian Optimization: A more advanced method, Bayesian optimization uses the results of previous trials to choose which hyperparameters to try next. It runs a sequence of experiments in which a surrogate model of the objective, built from past trials, guides the choice of future ones.
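The difference between the first two strategies comes down to how candidate settings are generated. The sketch below uses Scikit-learn's ParameterGrid and ParameterSampler helpers to list candidates without training anything; the search space and the log-uniform range for C are assumptions made for illustration.

```python
from scipy.stats import loguniform
from sklearn.model_selection import ParameterGrid, ParameterSampler

# Grid search: every combination of the listed values (2 x 2 = 4 candidates).
grid = list(ParameterGrid({"kernel": ["linear", "rbf"], "C": [1, 10]}))
print(len(grid), "grid candidates")
for candidate in grid:
    print("  ", candidate)

# Random search: a fixed number of candidates drawn from distributions.
sampled = list(
    ParameterSampler(
        {"kernel": ["linear", "rbf"], "C": loguniform(1e-2, 1e2)},
        n_iter=5,
        random_state=0,
    )
)
print(len(sampled), "random candidates")
for candidate in sampled:
    print("  ", candidate)
```

The grid grows multiplicatively with every hyperparameter added, whereas the number of random candidates stays at whatever n_iter is set to.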

Let’s delve into the Python implementation of these.

Python Packages for Hyperparameter Tuning

Among the multitude of machine learning packages available for Python, Scikit-learn and Keras Tuner are two frequently used ones for hyperparameter tuning. Scikit-learn supports grid search and random search, while Keras Tuner is used for tuning neural network hyperparameters, such as the learning rate or the number of layers in a deep network.

The sections below use Scikit-learn for grid search and random search, and the bayesian-optimization package for Bayesian optimization.

Python Implementation: Grid Search

After importing the necessary libraries and your dataset, and after doing necessary preprocessing, we can fit a model and use GridSearchCV.

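from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV

# load the example dataset
iris = datasets.load_iris()

# candidate values: 2 kernels x 2 values of C = 4 combinations
parameters = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}
svc = svm.SVC()

# evaluate every combination with cross-validation
clf = GridSearchCV(svc, parameters)
clf.fit(iris.data, iris.target)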

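GridSearchCV takes the model (svc in the code) and the hyperparameter grid (parameters) as arguments and uses cross-validation (5-fold by default) to evaluate the performance of each combination of hyperparameters. After fitting, it refits the model on the full dataset using the best combination found.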
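Once the search has been fit, its results can be inspected through attributes such as best_params_, best_score_, and cv_results_. The following is a minimal sketch that repeats the search above so it can run on its own; the rounding and print layout are only illustrative choices.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
clf = GridSearchCV(svm.SVC(), {"kernel": ("linear", "rbf"), "C": [1, 10]})
clf.fit(iris.data, iris.target)

# Best combination and its mean cross-validated accuracy
print("best params:", clf.best_params_)
print("best mean CV accuracy:", round(clf.best_score_, 3))

# Mean cross-validated accuracy for every combination that was tried
results = list(zip(clf.cv_results_["params"], clf.cv_results_["mean_test_score"]))
for params, mean_score in results:
    print(params, round(mean_score, 3))
```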

Python Implementation: Random Search

Random Search in Python can also be implemented with Scikit-learn using RandomizedSearchCV. Instead of an exhaustive set of every possible combination of hyperparameters, a fixed number of parameter settings is sampled from the specified distributions. This strategy can oftentimes be more cost-effective than a Grid Search.

from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import uniform, truncnorm, randint

model_params = {
    'n_estimators': randint(4,200),
    'max_features': truncnorm(a=0, b=1, loc=0.25, scale=0.1),
    'min_samples_split': uniform(0.01, 0.199)
}

# create random forest classifier model
rf_model = RandomForestClassifier()

# set up random search meta-estimator
rf_random = RandomizedSearchCV(rf_model, model_params, n_iter=100, cv=5, random_state=1)
iris = datasets.load_iris()

# train the random search meta-estimator to find the best model
rf_random.fit(iris.data, iris.target)

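Once the instance of RandomizedSearchCV is fit, it functions like a regular Scikit-learn model. The instance (rf_random) selects the best set of hyperparameters by cross-validation on the data passed to fit, then refits the model with those hyperparameters. Note that this example fits on the full dataset; to estimate how well the tuned model generalizes, keep a separate test set out of the search.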
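One way to keep the evaluation honest is to split off a test set before searching. The sketch below assumes a 75/25 stratified split and a smaller, simpler search space than the one above so that it runs quickly; these choices are illustrative, not recommendations.

```python
from scipy.stats import randint
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, stratify=iris.target, random_state=1
)

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=1),
    {"n_estimators": randint(10, 100), "max_depth": randint(2, 8)},
    n_iter=10,
    cv=5,
    random_state=1,
)
# Cross-validation happens inside the training split only
search.fit(X_train, y_train)

# The fitted search object predicts and scores with the refit best model
test_accuracy = search.score(X_test, y_test)
print("best params:", search.best_params_)
print("held-out test accuracy:", round(test_accuracy, 3))
```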

Python Implementation: Bayesian Optimization

Bayesian optimization methods are a set of sequential design strategies for global optimization of black-box functions. They use previous evaluations of the objective function (here, the cross-validated score) to select the next input values to try. A common Python package for Bayesian optimization is bayesian-optimization, which is imported as bayes_opt. Here is an example of its implementation:

from bayes_opt import BayesianOptimization
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn import datasets

iris = datasets.load_iris()

def black_box_function(n_estimators, max_depth):
    n_estimators = int(n_estimators)
    max_depth = int(max_depth)

    rf = RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth, random_state=1)
    val = cross_val_score(rf, iris.data, iris.target, cv=5).mean()

    return val

bounds = {'n_estimators': (30, 500), 'max_depth': (3,10)}

optimizer = BayesianOptimization(
    f=black_box_function,
    pbounds=bounds,
    random_state=1,
)

optimizer.probe(
    params={'n_estimators': 50, 'max_depth': 5},
    lazy=True,
)

optimizer.maximize(
    init_points=3,
    n_iter=10,
)

print(optimizer.max)

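BayesianOptimization takes the function to maximize (f) and a dictionary (pbounds) that maps each hyperparameter to its (lower, upper) range. Because the optimizer proposes continuous values, black_box_function casts n_estimators and max_depth to integers before building the model. You can optionally probe a specific point in the search space before running the optimization; with lazy=True, the point is queued and evaluated the next time maximize is called.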
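The result in optimizer.max is a dictionary with a 'target' entry (the best score) and a 'params' entry holding the continuous values the optimizer proposed. To train a final model from it, the same integer casts used in black_box_function are needed. The sketch below shows one way to do this; params_to_model is a hypothetical helper, and example_best is a made-up placeholder shaped like optimizer.max, not the output of a real run.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier


def params_to_model(best):
    """Build a classifier from a result dict shaped like optimizer.max.

    Hypothetical helper: applies the same int() casts as black_box_function.
    """
    params = best["params"]
    return RandomForestClassifier(
        n_estimators=int(params["n_estimators"]),
        max_depth=int(params["max_depth"]),
        random_state=1,
    )


# Placeholder values for illustration only, not results from an actual run
example_best = {"target": 0.0, "params": {"n_estimators": 123.7, "max_depth": 4.2}}

iris = datasets.load_iris()
final_model = params_to_model(example_best)
final_model.fit(iris.data, iris.target)
print(final_model.get_params()["n_estimators"], final_model.get_params()["max_depth"])
```

In practice you would pass optimizer.max directly instead of the placeholder dictionary.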

Conclusion

Hyperparameter tuning can be a powerful way to boost your model’s performance, but it can also be a challenging and demanding task. It is essential to go beyond the default configurations and experiment with different hyperparameters and tuning strategies.

Python packages such as Scikit-learn, Keras Tuner, and bayesian-optimization greatly facilitate tuning by providing built-in methods for grid search, randomized search, and Bayesian optimization.

However, it is also important to note that hyperparameter tuning is one part of a broader machine learning workflow that involves preprocessing, feature selection, model selection, and evaluation. Remember, an effective model is the result of a carefully followed workflow rather than just a set of optimal hyperparameters.
