Summary:
What is Data Leakage? Data leakage is one of the most important issues for a data scientist to understand. If you don't know how to prevent it, leakag...
Summary:
The Cross-Validation Procedure: In cross-validation, we run our modeling process on different subsets of the data to get multiple measures of model qua...
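The procedure in this excerpt can be sketched as follows. This is a minimal illustration, not the original post's code: the helper name `k_fold_mae`, the synthetic data, and the choice of a least-squares linear model scored by mean absolute error are all assumptions made for the example.

```python
import numpy as np

def k_fold_mae(X, y, n_folds=5):
    """Return one mean-absolute-error score per fold (hypothetical helper)."""
    indices = np.arange(len(X))
    # Split row indices into n_folds contiguous groups (no shuffling here).
    folds = np.array_split(indices, n_folds)
    scores = []
    for held_out in folds:
        # Train on every row that is not in the held-out fold.
        train = np.setdiff1d(indices, held_out)
        A_train = np.column_stack([X[train], np.ones(len(train))])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        # Score the fitted model on the held-out fold only.
        A_val = np.column_stack([X[held_out], np.ones(len(held_out))])
        preds = A_val @ coef
        scores.append(float(np.mean(np.abs(preds - y[held_out]))))
    return scores

# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

scores = k_fold_mae(X, y, n_folds=5)
print(scores)           # one quality measure per fold
print(np.mean(scores))  # averaged into a single estimate
```

Each row is used for validation exactly once, so the five scores together use all of the data for evaluation.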
Summary:
# Most scikit-learn objects are either transformers or models. # Transformers are for pre-processing before modeling. The Imputer class (for filling i...
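A small sketch of the transformer/model distinction described here, assuming a recent scikit-learn release. The excerpt names the `Imputer` class; that class was removed in later versions, so this example uses `SimpleImputer` from `sklearn.impute` instead. The toy arrays and the choice of `LinearRegression` are assumptions for illustration only.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Toy training data with missing values (assumed for this example).
X_train = np.array([[1.0, 2.0], [np.nan, 3.0], [3.0, np.nan], [4.0, 5.0]])
y_train = np.array([3.0, 5.0, 7.0, 9.0])

# The imputer is a transformer (pre-processing); the regressor is the model.
pipeline = make_pipeline(SimpleImputer(strategy="mean"), LinearRegression())
pipeline.fit(X_train, y_train)

# New rows pass through the same imputation before prediction.
X_new = np.array([[2.0, np.nan]])
prediction = pipeline.predict(X_new)
print(prediction)
```

Bundling both steps in one pipeline means the imputation statistics are learned only from the data passed to `fit`.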
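Summary:
# Partial dependence plots # Show how changing a single variable affects the final prediction. # First fit a model, then take one row and repeatedly change one feature to see its effect on the prediction. # However, a single row is not representative, # so apply the same procedure to every row and take the mean.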
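The averaging described above can be sketched by hand as follows. The names `partial_dependence` and `toy_predict` are hypothetical; `toy_predict` stands in for an already fitted model so the example stays self-contained, and the data and grid values are assumed.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Average prediction over all rows for each grid value of one feature."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        # Overwrite the chosen feature in every row with the same value.
        X_mod[:, feature] = value
        # Average over rows so no single row dominates the curve.
        averages.append(float(np.mean(predict(X_mod))))
    return averages

def toy_predict(X):
    # Stand-in for a fitted model's predict method (an assumption).
    return 2.0 * X[:, 0] + X[:, 1] ** 2

X = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]])
grid = [0.0, 1.0, 2.0]

pd_values = partial_dependence(toy_predict, X, feature=0, grid=grid)
print(pd_values)
```

Because `toy_predict` is linear in feature 0 with slope 2, consecutive values in `pd_values` differ by 2, which is the shape a partial dependence plot would show for that feature.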