Feature Selection
2018-01-29 12:41 xplorerthik
Filter methods
These include simple statistical tests that determine whether a feature is statistically significant, for example using the p-value of a t-test: if the null hypothesis (no relationship between the feature and the target) cannot be rejected, the feature is dropped. This does not take feature interactions into account and is generally not a recommended way of doing feature selection, as it can lead to loss of information.
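To make the filter idea concrete, here is a minimal sketch that scores each feature on its own with a two-sample Welch t-test and keeps the ones below a significance threshold. Everything in it is an assumption for illustration: the helper names (`welch_t`, `approx_p_value`, `filter_features`), the toy data, the 0.05 threshold, and the use of a normal approximation for the p-value so that only the standard library is needed. With samples this small a real t distribution (for instance `scipy.stats.ttest_ind`) would be the more appropriate choice.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    if se == 0:
        raise ValueError("cannot test a feature with zero variance in both groups")
    return (mean(group_a) - mean(group_b)) / se


def approx_p_value(t_stat):
    """Two-sided p-value from a normal approximation (assumption: rough for small n)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


def filter_features(columns, labels, alpha=0.05):
    """Test every feature independently; keep those with p < alpha."""
    p_values = {}
    for name, values in columns.items():
        group0 = [v for v, y in zip(values, labels) if y == 0]
        group1 = [v for v, y in zip(values, labels) if y == 1]
        p_values[name] = approx_p_value(welch_t(group0, group1))
    kept = [name for name, p in p_values.items() if p < alpha]
    return kept, p_values


# Hypothetical toy data: "signal" differs between the classes, "noise" does not.
labels = [0] * 6 + [1] * 6
columns = {
    "signal": [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 2.0, 2.1, 1.9, 2.2, 2.0, 1.8],
    "noise": [5.0, 3.0, 4.0, 6.0, 5.0, 4.0, 4.0, 5.0, 6.0, 3.0, 5.0, 5.0],
}
kept, p_values = filter_features(columns, labels)
print(kept)
```

Note that each feature is judged in isolation, which is exactly why a filter like this cannot see interactions between features.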
Wrapper-based methods
Tree-based models like RandomForest are also robust against issues like multicollinearity, missing values, and outliers, and they are able to discover some interactions between features. However, this can be rather computationally expensive.
A simple wrapper method is Forward Feature Selection (FFS): features are added step by step, one feature per iteration.
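The loop behind forward selection can be sketched roughly as follows. This is only an illustration under stated assumptions: the scoring model (a nearest-centroid classifier evaluated with leave-one-out accuracy), the function names `loo_accuracy` and `forward_selection`, the toy data, and the rule of stopping when no candidate improves the score are all my own choices, not something prescribed by FFS itself.

```python
from statistics import mean


def loo_accuracy(rows, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier (assumed scorer)."""
    correct = 0
    for i, (x, y) in enumerate(zip(rows, labels)):
        # Build class centroids from every sample except the held-out one.
        centroids = {}
        for label in set(labels):
            members = [r for j, (r, l) in enumerate(zip(rows, labels))
                       if j != i and l == label]
            if members:
                centroids[label] = [mean(col) for col in zip(*members)]
        # Predict the class whose centroid is closest (squared Euclidean distance).
        predicted = min(
            sorted(centroids),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )
        correct += predicted == y
    return correct / len(rows)


def forward_selection(columns, labels, score=loo_accuracy):
    """Greedily add one feature per iteration while the score keeps improving."""
    selected, history = [], []
    best_score = float("-inf")
    remaining = list(columns)
    while remaining:
        trials = []
        for name in remaining:
            names = selected + [name]
            rows = list(zip(*(columns[n] for n in names)))
            trials.append((score(rows, labels), name))
        # max() returns the first maximal item, so ties keep the earlier candidate.
        top_score, top_name = max(trials, key=lambda t: t[0])
        if top_score <= best_score:
            break  # no candidate improves the current subset
        selected.append(top_name)
        remaining.remove(top_name)
        best_score = top_score
        history.append(top_score)
    return selected, history


# Hypothetical toy data: "a" and "b" are each partly informative, "c" is not.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "a": [0, 0, 2, 1, 3, 3, 1, 2],
    "b": [0, 2, 0, 1, 3, 1, 3, 2],
    "c": [5, 1, 5, 1, 1, 5, 1, 5],
}
selected, history = forward_selection(columns, labels)
print(selected, history)
```

Every iteration refits and rescores the model once per remaining feature, which is where the computational cost of wrapper methods comes from.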
Feature engineering is a superset of activities that includes feature extraction, feature construction, and feature selection. Each of the three is an important step and none should be ignored. As a generalization from my experience, though, their relative importance would be feature construction > feature extraction > feature selection.