Screen Link:

https://app.dataquest.io/m/213/guided-project%3A-predicting-bike-rentals/1/introduction-to-the-data-set

While working through this guided project, I was forced to leave `max_features` out of the set of parameters optimized by the randomized search estimator. `max_features` is listed as a parameter of `RandomForestRegressor` in the scikit-learn documentation, and it also appears in the output of the `get_params` method, which is supposed to list all of a model's tunable parameters. Yet regardless of whether the `RandomForestRegressor` instance I was trying to optimize had `bootstrap` set to `True`, adding a dictionary item with `max_features` as a key caused my code to raise the error:

```
ValueError: Invalid parameter max_samples for estimator RandomForestRegressor. Check the list of available parameters with `estimator.get_params().keys()`.
```
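For reference, here is a minimal, self-contained sketch (using the same parameter names as my full code below) of how the search space can be compared against the keys that `get_params` actually reports, before handing the dictionary to `RandomizedSearchCV`:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Search space mirroring the one in my full code below.
parameter_values = {
    "max_depth": range(1, 25),
    "max_features": np.linspace(start=0.10, stop=1.0, num=10),
    "max_samples": np.linspace(start=0.20, stop=1.0, num=10),
}

model = RandomForestRegressor(bootstrap=True)

# get_params() reports every constructor argument the installed
# scikit-learn version accepts for this estimator.
valid_keys = set(model.get_params().keys())

# Split the search space into keys the installed version supports
# and keys it does not, so any mismatch is visible up front.
supported = {k: v for k, v in parameter_values.items() if k in valid_keys}
unsupported = sorted(set(parameter_values) - valid_keys)
print("supported:", sorted(supported))
print("unsupported:", unsupported)
```

Any name that lands in `unsupported` here is one that `RandomizedSearchCV` would reject with a `ValueError` like the one above.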

Here is the entire code I used when attempting to run randomized search optimization on a `RandomForestRegressor` instance. It is a Python notebook exported as a plain Python script.

```
# ### Conclusions About Results of Decision Tree Regressor
#
# The Hyper-parameters of the randomized search stuck on either minimum or maximum will have to be explored. The range of values will be correspondingly increased or decreased. This is to avoid any case where the hyper-parameter score has stopped increasing because it reached the limit of values but must still increase or decrease to get a higher accuracy score.
# ## Modeling Using a Random Forest Regressor
# In[102]:
# Imports needed by this excerpt (imported in earlier notebook cells).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, cross_val_score
# In[103]:
# For reference, the previous hyper-parameter value ranges are copied.
parameter_values = {
    "max_depth": range(1, 25),
    "max_features": np.linspace(start=0.10, stop=1.0, num=10),
    "min_samples_leaf": np.linspace(start=0.05, stop=0.50, num=10),
    "min_samples_split": np.linspace(start=0.05, stop=0.50, num=10),
    #"max_samples": [n for n in range(500, len(df_rentals), 100)]
    "max_samples": np.linspace(start=0.20, stop=1.0, num=10)
}
mdl_random_forest_optimized = RandomForestRegressor(bootstrap=True)
#The best scoring hyper-parameter settings from the simple decision tree:
cv_random_search_dtree.best_params_
# In[104]:
#Add item for hyper-parameter choices:
for param, val_array in parameter_values.items():
    print(param, val_array)
cv_randomized_search_random_forest = RandomizedSearchCV(
    estimator=mdl_random_forest_optimized,
    param_distributions=parameter_values,
    n_iter=64,
    n_jobs=-1,
    cv=5
)
mdl_random_forest_optimized.get_params()
# In[105]:
cv_randomized_search_random_forest.fit(X=df_rentals.drop(columns=["cnt"]), y=df_rentals["cnt"])
# In[ ]:
cv_randomized_search_random_forest.best_params_
# In[ ]:
mdl_random_forest_reg_optimized = cv_randomized_search_random_forest.best_estimator_
# ### Cross-Validation on Random Forest Regressor With Best Hyper-Parameters
# In[ ]:
ndar_crossval_scores = cross_val_score(
    estimator=mdl_random_forest_reg_optimized,
    X=df_rentals.drop(columns=["cnt"]),
    y=df_rentals["cnt"],
    n_jobs=-1,
    scoring="neg_mean_squared_error",
    cv=5
)
# In[ ]:
np.sqrt(np.abs(ndar_crossval_scores))
# In[ ]:
np.std(np.sqrt(np.abs(ndar_crossval_scores)))
# In[ ]:
rmse = np.mean(np.sqrt(np.abs(ndar_crossval_scores)))
print(rmse)
```
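In case it matters, here is a minimal check of the installed scikit-learn version and whether it knows about `max_samples` at all. My understanding (an assumption on my part, not something confirmed by the traceback) is that `max_samples` was only added to `RandomForestRegressor` in scikit-learn 0.22, so an older installation would explain the error:

```python
import sklearn
from sklearn.ensemble import RandomForestRegressor

print(sklearn.__version__)

# On versions before 0.22 this prints False, which would explain the
# "Invalid parameter max_samples" ValueError from RandomizedSearchCV.
print("max_samples" in RandomForestRegressor().get_params())
```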