Automated hyper-parameter tuning with TensorFlow Decision Forests

Rukmal Senavirathne
2 min read · Apr 20, 2023

Hyper-parameters control how a machine learning model is trained and have a direct impact on its quality. Finding good hyper-parameter values is therefore one of the important stages of machine learning modeling.

Learning algorithms come with default hyper-parameters. Those values give reasonable results in most situations and are a good starting point when we begin the modeling process. TensorFlow Decision Forests also exposes predefined hyper-parameter templates (for example, hyperparameter_template="benchmark_rank1").
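As a minimal sketch of using such a template, the model can be created with the template name instead of the raw defaults. Which templates are available depends on the model class and library version, so it is worth listing them first with predefined_hyperparameters():

import tensorflow_decision_forests as tfdf

# Sketch: create a model from a predefined hyper-parameter template instead of
# the raw defaults. Check which templates this model class supports in your version.
print(tfdf.keras.RandomForestModel.predefined_hyperparameters())
model = tfdf.keras.RandomForestModel(hyperparameter_template="benchmark_rank1")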

There are two main approaches to tuning hyper-parameters.

  1. Manual tuning — we try different values ourselves and keep the combination that performs best (see the sketch after this list).
  2. Automated tuning — We can use tuning algorithms to find the best hyper-parameter values automatically.
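To make option 1 concrete, here is a rough sketch of manual tuning, assuming train_ds and valid_ds datasets already exist: train one model per candidate value and keep the most accurate one.

import tensorflow_decision_forests as tfdf

# Manual tuning sketch (train_ds and valid_ds are assumed to exist):
# train one model per candidate max_depth and keep the most accurate one.
best_accuracy, best_model = 0.0, None
for max_depth in [4, 6, 8]:
    model = tfdf.keras.RandomForestModel(max_depth=max_depth, verbose=0)
    model.compile(metrics=["accuracy"])
    model.fit(train_ds)
    _, accuracy = model.evaluate(valid_ds, verbose=0)
    if accuracy > best_accuracy:
        best_accuracy, best_model = accuracy, model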

First, we can train a model with automated hyper-parameter tuning and a manually defined hyper-parameter search space.

My code is available on GitHub.
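The snippets below assume a training dataset called train_ds. As a sketch, one common way to build it with TF-DF is to load a pandas DataFrame and convert it; the file name and label column here are placeholders.

import pandas as pd
import tensorflow_decision_forests as tfdf

# Hypothetical dataset preparation: load a CSV with pandas and convert it to the
# tf.data.Dataset format that TF-DF models consume. "train.csv" and "target" are placeholders.
train_df = pd.read_csv("train.csv")
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_df, label="target")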

The tuner object contains all the configuration of the tuner (search space, optimizer, number of trials, and objective).

# Create a Random Search tuner with 20 trials.
tuner = tfdf.tuner.RandomSearch(num_trials=20)

# Define the search space: each hyper-parameter gets a list of candidate values.
tuner.choice("max_depth", [4, 5, 6, 7])
tuner.choice("num_trees", [300, 400])

Then compile and train the model:

# Specify the model.
model_1 = tfdf.keras.RandomForestModel(verbose=2, tuner=tuner)

# Train the model.
model_1.compile(metrics=["accuracy"])
model_1.fit(train_ds)
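Once training finishes, the tuning trials can be inspected through the model inspector, following the pattern from the TF-DF tuning tutorial; the row flagged as best holds the selected hyper-parameters.

# Inspect the tuning trials; the row with best=True contains the winning hyper-parameters.
tuning_logs = model_1.make_inspector().tuning_logs()
print(tuning_logs[tuning_logs.best].iloc[0])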

We can also train a model with automated hyper-parameter tuning and an automatic definition of the hyper-parameters (the recommended approach).

# Create a Random Search tuner with 50 trials and automatic hp configuration.
tuner_2 = tfdf.tuner.RandomSearch(num_trials=50, use_predefined_hps=True)

# Define and train the model.
model_2 = tfdf.keras.RandomForestModel(tuner=tuner_2)
model_2.fit(train_ds, verbose=2)
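To check what the tuning achieved, the tuned model can be evaluated on a held-out dataset; this sketch assumes a test_ds prepared in the same way as train_ds.

# Evaluate the tuned model (test_ds is assumed to be prepared like train_ds).
model_2.compile(metrics=["accuracy"])
evaluation = model_2.evaluate(test_ds, return_dict=True, verbose=0)
print(evaluation)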

Summary

When we define the search space manually, we have to add each hyper-parameter ourselves, and some parameters are only valid for specific values of others.

Example:

  1. The max_depth parameter is only meaningful when growing_strategy=LOCAL, while max_num_nodes is only used when growing_strategy=BEST_FIRST_GLOBAL (see the sketch below).
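This dependency can be expressed directly in the tuner by nesting conditional choices, along the lines of the TF-DF tuning tutorial; the candidate values below are illustrative.

tuner = tfdf.tuner.RandomSearch(num_trials=50)

# Hyper-parameters that apply to every trial.
tuner.choice("num_trees", [300, 400])

# max_depth is only explored when growing_strategy="LOCAL".
local_space = tuner.choice("growing_strategy", ["LOCAL"])
local_space.choice("max_depth", [4, 5, 6, 7])

# max_num_nodes is only explored when growing_strategy="BEST_FIRST_GLOBAL".
# merge=True adds this value to the already-defined growing_strategy parameter.
global_space = tuner.choice("growing_strategy", ["BEST_FIRST_GLOBAL"], merge=True)
global_space.choice("max_num_nodes", [16, 32, 64, 128])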

With automated hyper-parameter tuning, the tuner searches for the combination of hyper-parameters that gives the best model performance. This technique can save a lot of time and effort compared to manually tuning hyper-parameters.

Resources

  1. https://www.tensorflow.org/decision_forests/tutorials/automatic_tuning_colab
  2. https://www.tensorflow.org/decision_forests/api_docs/python/tfdf/keras/RandomForestModel
  3. https://www.tensorflow.org/decision_forests/api_docs/python/tfdf/tuner/RandomSearch
