Automated hyper-parameter tuning with TensorFlow Decision Forests
Hyper-parameters control how a machine learning model is trained and affect its quality. Finding good hyper-parameters is therefore one of the important stages of machine learning modeling.
Learning algorithms come with default hyper-parameters. These values give reasonable results in most situations, so they are a good starting point when we begin the modeling process. TensorFlow Decision Forests also exposes hyper-parameter templates (for example, hyperparameter_template="benchmark_rank1").
There are two main approaches to tuning hyper-parameters.
- Manual tuning: we try different values by hand and keep the ones that perform best.
- Automated tuning: we use a tuning algorithm to find the best hyper-parameter values automatically.
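To make the idea of automated tuning concrete, here is a minimal sketch of random search in plain Python. It is not how TF-DF implements its tuner. The `toy_score` function is a made-up stand-in for "train a model and measure its accuracy", and `random_search` is a hypothetical helper written only for this illustration.

```python
import random

# Candidate values for each hyper-parameter (same values as used later in this post).
SEARCH_SPACE = {
    "max_depth": [4, 5, 6, 7],
    "num_trees": [300, 400],
}


def toy_score(params):
    """Made-up stand-in for training a model and returning its validation score."""
    # Assumption for illustration only: deeper trees and more trees score slightly higher.
    return 0.80 + 0.01 * params["max_depth"] + 0.0001 * params["num_trees"]


def random_search(space, score_fn, num_trials, seed=0):
    """Sample num_trials random configurations and return the best one with its score."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(num_trials):
        # Draw one value per hyper-parameter, independently and uniformly.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(SEARCH_SPACE, toy_score, num_trials=20)
print(best_params, round(best_score, 4))
```

In a real run, each trial would train and evaluate a model, which is why the number of trials directly drives the tuning time.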
We can train a model with automated hyper-parameter tuning while defining the hyper-parameters to search manually.
My code is available on GitHub.
The tuner object contains the full configuration of the tuner: the search space, the optimizer, the number of trials, and the objective.
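import tensorflow_decision_forests as tfdf
# Create a Random Search tuner with 20 trials and a manually defined search space.
tuner = tfdf.tuner.RandomSearch(num_trials=20)
tuner.choice("max_depth", [4, 5, 6, 7]).choice("num_trees", [300, 400])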
Then train the model
# Specify the model.
model_1 = tfdf.keras.RandomForestModel(verbose=2, tuner=tuner)
# Add an evaluation metric, then train the model.
model_1.compile(metrics=["accuracy"])
model_1.fit(train_ds)
Alternatively, we can train a model with automated hyper-parameter tuning and automatic definition of the hyper-parameters (the recommended approach).
# Create a Random Search tuner with 50 trials and automatic hp configuration.
tuner_2 = tfdf.tuner.RandomSearch(num_trials=50, use_predefined_hps=True)
# Define and train the model.
model_2 = tfdf.keras.RandomForestModel(tuner=tuner_2)
model_2.fit(train_ds, verbose=2)
Summary
When we define the search space manually, we have to add every hyper-parameter to the tuner ourselves. Some hyper-parameters are only valid for specific values of others.
For example:
- The max_depth parameter is mostly useful when growing_strategy=LOCAL, while max_num_nodes is better suited when growing_strategy=BEST_FIRST_GLOBAL.
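The dependency between these parameters can be sketched as a conditional search space. The snippet below is plain Python, not the TF-DF tuner API: `CONDITIONAL_SPACE` and `sample_config` are hypothetical names, and the candidate values are illustrative assumptions. The point is only that a child parameter is sampled after its parent value has been chosen.

```python
import random

# Each growing strategy owns the one parameter that makes sense for it.
# The candidate values are illustrative assumptions, not recommendations.
CONDITIONAL_SPACE = {
    "LOCAL": {"max_depth": [3, 4, 5, 6, 8]},
    "BEST_FIRST_GLOBAL": {"max_num_nodes": [16, 32, 64, 128, 256]},
}


def sample_config(space, rng):
    """Pick a growing strategy first, then sample only the parameters valid for it."""
    strategy = rng.choice(sorted(space))
    config = {"growing_strategy": strategy}
    for name, values in space[strategy].items():
        config[name] = rng.choice(values)
    return config


rng = random.Random(0)
samples = [sample_config(CONDITIONAL_SPACE, rng) for _ in range(5)]
for config in samples:
    print(config)
```

Every sampled configuration contains either max_depth or max_num_nodes, never both, so no trial is wasted on a combination where one of the two values has no effect.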
When we use automated hyper-parameter tuning, the tuner searches for the combination of hyper-parameters that results in the highest model performance. This can save a lot of time and effort compared to tuning hyper-parameters manually.
Resources