Randomized Grid Search
Manual hyperparameter searching? No way. Scikit-learn has an excellent randomized search class, RandomizedSearchCV, that can point us toward good parameters: instantiate the class, hand it a dictionary of candidate parameter values, and let it fly. The example below uses a K-Nearest Neighbours model. After the randomized search is done, you can pull the best parameters for your model, as well as take a look at the history of the parameter combinations that were tried.
Import Preliminaries
# Import modules
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier
# Import iris dataset
iris = load_iris()
X, y = iris.data, iris.target
# Assign classifier
classifier = KNeighborsClassifier(n_neighbors=5, weights='uniform',
metric ='minkowski', p=2)
# Initiate a grid dictionary
grid = {'n_neighbors':list(range(1,11)), 'weights':['uniform', 'distance'],
'p':[1,2], }
# Declare randomized search on model using our param grid
random_search = RandomizedSearchCV(estimator=classifier,
param_distributions = grid,
n_iter = 10, scoring = 'accuracy',
n_jobs=1, refit=True,
cv = 10,
return_train_score=True)
# Fit the randomized search model with our data
random_search.fit(X,y)
# Print the best parameters and their cross-validated accuracy score
print('Best parameters: %s'%random_search.best_params_)
print('CV Accuracy of best parameters: %.3f'%random_search.best_score_)
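- This method is more computationally feasible than a full grid search, since it evaluates only n_iter sampled combinations (10 here) rather than every one
- The result will change each time the model is fitted, because the combinations are sampled at random, unless random_state is set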
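Both notes can be checked directly. The sketch below is not part of the original notebook: it counts the combinations in the same grid with `ParameterGrid`, then passes `random_state` so that two separate fits sample the same candidates. The helper `run_search` and the seed value 42 are hypothetical choices made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import ParameterGrid, RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

grid = {'n_neighbors': list(range(1, 11)),
        'weights': ['uniform', 'distance'],
        'p': [1, 2]}

# A full grid search would evaluate every combination: 10 * 2 * 2 = 40
n_combinations = len(ParameterGrid(grid))
print('Combinations in the full grid: %d' % n_combinations)


def run_search(seed):
    # Hypothetical helper: same search as above, but with a fixed seed
    search = RandomizedSearchCV(estimator=KNeighborsClassifier(),
                                param_distributions=grid,
                                n_iter=10, scoring='accuracy',
                                cv=10, random_state=seed)
    search.fit(X, y)
    return search


first = run_search(42)
second = run_search(42)

# With the same seed, both runs draw the same 10 candidates in the same order
same_candidates = first.cv_results_['params'] == second.cv_results_['params']
print('Same candidates sampled: %s' % same_candidates)
```

Leaving `random_state` unset restores the behaviour described above, where each fit may try a different set of 10 combinations.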
Baseline Cross Validation Score
# Print our current accuracy score using our current parameters
print ('Baseline with default parameters: %.3f' %np.mean(
cross_val_score(classifier, X, y, cv=10, scoring='accuracy', n_jobs=1)))
Viewing Randomized Grid Score
# The grid_scores_ attribute was deprecated in scikit-learn 0.18 and removed in 0.20,
# so read it defensively; on current versions this returns None
getattr(random_search, 'grid_scores_', None)
# The newer cv_results_ attribute returns our results as a dictionary of arrays
# Throw it in a dataframe to make some sense of it
json_df = pd.DataFrame(random_search.cv_results_).head(3)
json_df
# Here is the raw dictionary output
random_search.cv_results_
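To read the history of tried combinations more easily, one option is to keep only a few columns of `cv_results_` and sort them by rank. This is a minimal sketch rather than part of the original notebook; it refits its own search so it runs on its own, and the names `search`, `columns`, and `history` and the seed 0 are assumptions made for illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

grid = {'n_neighbors': list(range(1, 11)),
        'weights': ['uniform', 'distance'],
        'p': [1, 2]}

search = RandomizedSearchCV(estimator=KNeighborsClassifier(),
                            param_distributions=grid,
                            n_iter=10, scoring='accuracy',
                            cv=10, random_state=0)
search.fit(X, y)

# Keep the sampled parameters and their cross-validated scores,
# ordered so the best-ranked combination comes first
columns = ['params', 'mean_test_score', 'std_test_score', 'rank_test_score']
history = (pd.DataFrame(search.cv_results_)[columns]
           .sort_values('rank_test_score')
           .reset_index(drop=True))
print(history)
```

The first row of `history` holds a rank-1 combination, and its `mean_test_score` matches `best_score_`.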
Author: Kavi Sekhon