Summary: What is Data Leakage? Data leakage is one of the most important issues for a data scientist to understand. If you don't know how to prevent it, leakag...
posted @ 2018-04-14 15:22 cbattle Views(620) Comments(0) Recommendations(0)
Summary: The Cross-Validation Procedure: In cross-validation, we run our modeling process on different subsets of the data to get multiple measures of model qua...
posted @ 2018-04-14 11:12 cbattle Views(336) Comments(0) Recommendations(0)
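
The idea in the cross-validation summary above (run the modeling process on different subsets and collect one quality measure per subset) can be sketched as follows. This is a minimal illustration, not the code from the post: the helper name `k_fold_scores`, the synthetic data, the least-squares model, and the choice of mean absolute error are all assumptions made for the example.

```python
import numpy as np

def k_fold_scores(X, y, k=5, seed=0):
    """Return one mean-absolute-error score per fold for a least-squares model."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))      # shuffle rows once
    folds = np.array_split(indices, k)     # k roughly equal held-out sets
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit on the k-1 training folds (extra column of ones = intercept)
        A_train = np.column_stack([X[train_idx], np.ones(len(train_idx))])
        coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        # Score on the held-out fold
        A_test = np.column_stack([X[test_idx], np.ones(len(test_idx))])
        scores.append(np.abs(A_test @ coef - y[test_idx]).mean())
    return np.array(scores)

# Synthetic data, made up for illustration only
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

scores = k_fold_scores(X, y, k=5)
print(scores, scores.mean())
```

Each row is held out exactly once, so the mean of the five scores uses all of the data for validation, at the cost of fitting the model five times.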
Summary: # Most scikit-learn objects are either transformers or models. # Transformers are for pre-processing before modeling. The Imputer class (for filling i...
posted @ 2018-04-14 11:00 cbattle Views(151) Comments(0) Recommendations(0)
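Summary: # Partial dependence plots # Show how changing a single variable affects the final prediction. # First fit a model, then take one row and repeatedly change one feature to see its effect on the prediction. # A single row is not representative, however, # so repeat this for every row and average the results.
posted @ 2018-04-14 10:45 cbattle Views(399) Comments(0) Recommendations(0)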
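
The partial dependence procedure summarized above (fit a model, force one feature to a given value in every row, average the predictions, repeat over a grid of values) can be written out by hand. This is a rough sketch under stated assumptions: the helper `partial_dependence_1d`, the toy data, and the least-squares model are hypothetical stand-ins, not the model or library call used in the post.

```python
import numpy as np

def partial_dependence_1d(predict, X, feature, grid):
    """Average prediction over all rows with one feature forced to each grid value."""
    X = np.asarray(X, dtype=float)
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value          # overwrite the feature in every row
        averages.append(predict(X_mod).mean())
    return np.array(averages)

# Toy data, made up for illustration: y = 3*x0 - 2*x1 + 1 with no noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0

# "Fit a model": ordinary least squares with an intercept column
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(X_new):
    return np.column_stack([X_new, np.ones(len(X_new))]) @ coef

grid = np.linspace(-2.0, 2.0, 5)
pd_values = partial_dependence_1d(predict, X, feature=0, grid=grid)
print(pd_values)
```

For this linear toy model the curve is a straight line whose slope equals the fitted coefficient of feature 0; with a nonlinear model the same loop traces out whatever shape the model has learned for that feature.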