Golden Features and Golden Models

I. Feature Selection

1. Permutation Importance

# Shuffle one column of the validation data at a time and measure how much the loss increases;
# the larger the increase, the more important that feature is to the model.
import eli5
from eli5.sklearn import PermutationImportance

perm = PermutationImportance(my_model, random_state=1).fit(val_X, val_y)
eli5.show_weights(perm, feature_names = val_X.columns.tolist())
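
To make the shuffling idea concrete, here is a minimal from-scratch sketch using only NumPy. It is not how eli5 implements PermutationImportance internally; the helper names (permutation_importance, mse, toy_predict) and the toy data are assumptions made up for illustration.

```python
import numpy as np

def permutation_importance(predict, X, y, loss, n_repeats=5, seed=0):
    """Mean increase in loss after shuffling each column of X in turn."""
    rng = np.random.default_rng(seed)
    baseline = loss(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Break the link between column j and the target by shuffling it
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(loss(y, predict(X_perm)) - baseline)
        importances[j] = np.mean(increases)
    return importances

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

# Toy data (hypothetical): y depends strongly on column 0,
# weakly on column 1, and not at all on column 2.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(500, 3))
y_demo = 3.0 * X_demo[:, 0] + 0.5 * X_demo[:, 1]

def toy_predict(X):
    # Stand-in for a fitted model's predict method
    return 3.0 * X[:, 0] + 0.5 * X[:, 1]

scores = permutation_importance(toy_predict, X_demo, y_demo, mse)
print(scores)  # one importance score per column
```

The column the model ignores gets a score of zero, because shuffling it cannot change any prediction. Averaging over several shuffles (n_repeats) reduces the noise that a single random permutation introduces.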

2. Partial Dependence Plots

A partial dependence plot shows how the model's prediction changes, on average, as a single feature is varied while all other features keep their observed values.

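from matplotlib import pyplot as plt
from pdpbox import pdp

# first_model is a fitted model; base_features is the list of feature names it was trained on
feat_name = 'pickup_longitude'
# Note: pdp_isolate is the function-style API of pdpbox releases before 0.3
pdp_dist = pdp.pdp_isolate(model=first_model, dataset=val_X, model_features=base_features, feature=feat_name)

pdp.pdp_plot(pdp_dist, feat_name)
plt.show()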
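
The calculation behind a partial dependence curve can be sketched by hand: force the chosen feature to a fixed value in every row, average the predictions, and repeat over a grid of values. The sketch below uses only NumPy; partial_dependence, toy_predict and the toy data are hypothetical names for illustration, not part of pdpbox.

```python
import numpy as np

def partial_dependence(predict, X, feature_idx, grid):
    """Average prediction when column `feature_idx` is forced to each grid value."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature_idx] = value  # overwrite the feature in every row
        averages.append(predict(X_mod).mean())
    return np.array(averages)

# Toy data and a stand-in "model" (both hypothetical)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))

def toy_predict(X):
    # Quadratic in feature 0, linear in feature 1
    return X[:, 0] ** 2 + 2.0 * X[:, 1]

grid = np.linspace(-2.0, 2.0, 5)  # the values -2, -1, 0, 1, 2
pd_curve = partial_dependence(toy_predict, X_demo, 0, grid)
print(pd_curve)
```

Because the toy model is additive in its two features, the curve for feature 0 is the parabola grid**2 shifted by a constant (the average contribution of feature 1). With a real model that has interactions, the curve only shows the average effect and can hide differences between individual rows.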

3. SHAP Values

SHAP values break down a single prediction to show how much each feature contributed to it.

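import shap  # package used to calculate SHAP values

# Use a single row here; multiple rows also work
data_for_prediction = val_X.iloc[0, :]

# Create an explainer that can compute SHAP values for a tree-based model
explainer = shap.TreeExplainer(my_model)
shap_values = explainer.shap_values(data_for_prediction)

# Load the JavaScript needed to render the interactive plot in a notebook
shap.initjs()
# When the explainer returns one entry per class, index 0 selects the first class
# and expected_value[0] is the baseline (average) output for that class
shap.force_plot(explainer.expected_value[0], shap_values[0], data_for_prediction)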

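# Summarize how each feature contributes across many rows
# (small_val_X is assumed to be a subset of the validation rows)
explainer = shap.TreeExplainer(my_model)
shap_values = explainer.shap_values(small_val_X)

# Index 1 selects the SHAP values for the second class
shap.summary_plot(shap_values[1], small_val_X)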
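
To show what "breaking down a prediction" means, the sketch below computes exact Shapley values by brute force over all feature subsets, using only NumPy and the standard library. It is not the algorithm that shap.TreeExplainer uses (that one exploits tree structure to be fast), and it is only practical for a handful of features. The names exact_shapley, toy_predict, background and base_value are assumptions for illustration.

```python
import itertools
import math
import numpy as np

def exact_shapley(predict, x, background):
    """Exact Shapley values for one row x; 'missing' features are
    filled in with their values from the background rows."""
    n = len(x)

    def value(subset):
        # Average prediction when the features in `subset` take x's values
        # and all other features keep their background values.
        X_mix = background.copy()
        for j in subset:
            X_mix[:, j] = x[j]
        return predict(X_mix).mean()

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi

# Toy setup (hypothetical): feature 0 acts additively, features 1 and 2 interact
rng = np.random.default_rng(0)
background = rng.normal(size=(50, 3))

def toy_predict(X):
    return 2.0 * X[:, 0] + X[:, 1] * X[:, 2]

x = np.array([1.0, 2.0, 3.0])
phi = exact_shapley(toy_predict, x, background)
base_value = toy_predict(background).mean()   # plays the role of expected_value
prediction = toy_predict(x[None, :])[0]
print(phi, base_value + phi.sum(), prediction)
```

The per-feature values sum to the difference between the prediction and the base value. This is the same additivity that a force plot visualizes: it starts at the base value and each feature pushes the output up or down until it reaches the actual prediction.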

posted @ 2022-07-06 22:03 失控D大白兔
