Oracle Data Science Hyperparameter Tuning
Many machine learning models have parameters that control the learning process. These are called hyperparameters, and they are not learned directly from the data. Hyperparameter tuning is the process of searching for good hyperparameter values by setting the values and then training and evaluating a model.
This is repeated for many different combinations of values, and then the best set of hyperparameters is chosen. The ADSTuner has several hyperparameter search strategies that plug into common model architectures. It supports user-defined search spaces and strategies, and it works with any ML library that doesn't include its own hyperparameter tuning.
The ADSTuner generates a tuning report that lists its trials, best performers, and statistics.
ADSTuner: Hyperparameter Tuning
- Has several hyperparameter search strategies that plug into common model architectures like scikit-learn
- Supports user-defined search spaces and strategies
- Works with any ML library that doesn't include its own hyperparameter tuning
- Generates a tuning report that lists its trials, best performing hyperparameters, and performance statistics
Cross-Validation
ADSTuner() performs the hyperparameter search using cross-validation. You can specify the number of folds with the cv parameter.
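A minimal sketch of this (the dataset, fold count, and trial budget below are illustrative choices, not from the original):

import logging

from ads.hpo.search_cv import ADSTuner
from ads.hpo.stopping_criterion import NTrials
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)  # illustrative dataset

# cv=5 runs five-fold cross-validation for every trial; the default
# strategy ('perfunctory') supplies the search space.
tuner = ADSTuner(LogisticRegression(), cv=5)
tuner.tune(X, y, exit_criterion=[NTrials(10)], synchronous=True, loglevel=logging.WARNING)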
ADSTuner Search Spaces
ADSTuner() needs a search space in which to tune the hyperparameters, and you define it with the strategy parameter. This parameter can be set in two ways: you can specify detailed search criteria, or you can use the built-in defaults.
For the supported model classes, ADSTuner provides perfunctory and detailed search spaces that are optimized for the class of model that is being used.
- The perfunctory option is optimized for a small search space so that the most important hyperparameters are tuned. Generally, this option is used early in your search as it reduces the computational cost and allows you to assess the quality of the model class that you are using.
- The detailed search space instructs ADSTuner to cover a broad search space by tuning more hyperparameters. Typically, you would use it when you have determined what class of model is best suited for the data set and type of problem you are working on.
- If you have experience with the data set and a good idea of what the best hyperparameter values are, you can specify the search space explicitly by passing a dictionary that defines it to the strategy parameter. Instead of using the perfunctory or detailed strategy, you define a custom search space strategy.
To see what search space is being used for your model class when the strategy is perfunctory or detailed, call the search_space() method.
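For example, this sketch asks for the broad built-in search space and then prints it (assuming the strategy names and search_space() behavior described above):

from ads.hpo.search_cv import ADSTuner
from sklearn.linear_model import LogisticRegression

# Built-in defaults: 'perfunctory' (small space) or 'detailed' (broad space).
tuner = ADSTuner(LogisticRegression(), strategy='detailed', cv=3)

# Inspect the hyperparameter distributions the strategy will search.
print(tuner.search_space())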
Defining a Custom Search Space and Score
The next cell creates a LogisticRegression() model instance and then defines a custom search space strategy for three LogisticRegression() hyperparameters: C, solver, and max_iter. You can also define a custom scoring parameter (see Optimizing a scikit-learn Pipeline()), though this example uses the standard weighted-average F1 score, f1_score.
import logging
from ads.hpo.distributions import CategoricalDistribution, IntUniformDistribution, LogUniformDistribution
from ads.hpo.search_cv import ADSTuner
from ads.hpo.stopping_criterion import NTrials
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, make_scorer

# X and y are the features and labels prepared earlier in the notebook.
tuner = ADSTuner(
    LogisticRegression(),
    strategy={
        'C': LogUniformDistribution(low=1e-05, high=1),
        'solver': CategoricalDistribution(['saga']),
        'max_iter': IntUniformDistribution(500, 2000, 50),
    },
    scoring=make_scorer(f1_score, average='weighted'),
    cv=3,
)
tuner.tune(X, y, exit_criterion=[NTrials(5)], synchronous=True, loglevel=logging.WARNING)
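Once tuning finishes, the report's contents can be inspected on the tuner itself. The attribute names here follow the ADS documentation and should be read as a sketch:

print(tuner.best_score)   # best cross-validated weighted F1 found
print(tuner.best_params)  # hyperparameter values of the best trial
print(tuner.trials)       # per-trial summary backing the tuning report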
Changing the Search Space Strategy
You can change the search space in the following three ways, as the sketch after this list illustrates:
- Add new hyperparameters
- Remove existing hyperparameters
- Modify the range of existing non-categorical hyperparameters
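A sketch of all three changes, applied to the tuner built above. This assumes, per the ADS documentation, that search_space() also accepts a dictionary that redefines the space:

# Assumption: passing a dictionary to search_space() redefines the space.
tuner.search_space(
    {
        'C': LogUniformDistribution(low=1e-05, high=1),        # range kept as-is
        'max_iter': IntUniformDistribution(500, 3000, 50),     # range widened
        'tol': LogUniformDistribution(low=1e-06, high=1e-02),  # new hyperparameter
        # 'solver' is omitted, which removes it from the search
    }
)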
Tuning Process
The tune() method starts a tuning process. It has a synchronous and an asynchronous mode, selected with the synchronous parameter. When synchronous is set to False, the tuning process runs asynchronously: it runs in the background and allows you to continue your work in the notebook. When synchronous is set to True, the notebook is blocked until tune() finishes running.
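A sketch of the asynchronous mode (tuner.status and tuner.wait() are taken from the ADS documentation and treated here as assumptions):

from ads.hpo.stopping_criterion import TimeBudget

# Run the search in the background so the notebook stays usable.
tuner.tune(X, y, exit_criterion=[TimeBudget(60)], synchronous=False)

# ...continue working, then check on the background process.
print(tuner.status)  # assumption: reports the tuner's current state
tuner.wait()         # assumption: blocks until background tuning finishes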
The ADSTuner object needs to know when to stop tuning. The exit_criterion parameter accepts a list of criteria that cause the tuning to finish; if any of the criteria are met, the tuning process stops.
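For example, ADS ships stopping criteria such as NTrials, TimeBudget, and ScoreValue. This sketch stops at whichever is met first (the specific budgets are illustrative):

from ads.hpo.stopping_criterion import NTrials, ScoreValue, TimeBudget

# Stop after 20 trials, 120 seconds of wall-clock time, or a
# cross-validated score of at least 0.95, whichever happens first.
tuner.tune(
    X,
    y,
    exit_criterion=[NTrials(20), TimeBudget(120), ScoreValue(0.95)],
    synchronous=True,
)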