[MACHINE LEARNING] Can we predict voting outcomes?
1. CART Tree
library(rpart)
library(rpart.plot)
CTree = rpart(Party ~ . -USER_ID, data = train, method = "class")
PredTest = predict(CTree, newdata = test, type = "class") # the score from this default tree was poor
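A quick way to judge a tree before uploading anything is to hold back part of the labelled data and score the predictions locally. The following is only a sketch: it uses the kyphosis data that ships with rpart as a stand-in for the competition data, and the 70/30 split and the names trn, valid, confMat and baseline are my own assumptions, not part of the original workflow.

```r
library(rpart)

# Stand-in data: kyphosis ships with rpart (factor outcome Kyphosis).
set.seed(100)
idx   <- sample(nrow(kyphosis), size = round(0.7 * nrow(kyphosis)))
trn   <- kyphosis[idx, ]
valid <- kyphosis[-idx, ]

# Fit a classification tree with rpart's default cp (0.01).
fit  <- rpart(Kyphosis ~ ., data = trn, method = "class")
pred <- predict(fit, newdata = valid, type = "class")

# Confusion matrix and accuracy on the held-out rows.
confMat  <- table(actual = valid$Kyphosis, predicted = pred)
accuracy <- sum(diag(confMat)) / sum(confMat)

# Baseline: always predict the most frequent class seen in training.
baseline <- mean(valid$Kyphosis == names(which.max(table(trn$Kyphosis))))

print(confMat)
print(c(tree = accuracy, baseline = baseline))
```

If the tree does not clearly beat the baseline on the held-out rows, it is unlikely to do well after upload either.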
2. Cross validation
library(e1071)
library(caret)
set.seed(100)
numFolds = trainControl(method = "cv", number = 10)
cpGrid = expand.grid(.cp = seq(0.01, 0.50, 0.01))
tr = train(Party ~ . - USER_ID, method = "rpart", data = train, trControl = numFolds, tuneGrid = cpGrid, na.action = na.pass)
Tip: the na.action = na.pass argument is there to deal with missing (NA) values. # cross-validation picks cp = 0.04
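The chosen cp can be read straight from the object that train() returns instead of being copied from the printed output. A minimal sketch, assuming iris as stand-in data (the names ctrl, grid, cvFit and bestCp are hypothetical); it uses the same 10-fold setup and cp grid as above.

```r
library(rpart)
library(caret)

set.seed(100)
ctrl <- trainControl(method = "cv", number = 10)
grid <- expand.grid(cp = seq(0.01, 0.50, 0.01))

# iris is only a stand-in for the competition data.
cvFit <- train(Species ~ ., data = iris, method = "rpart",
               trControl = ctrl, tuneGrid = grid)

# Cross-validated accuracy for the first few cp values in the grid.
print(head(cvFit$results[, c("cp", "Accuracy")]))

# The cp value that caret selected (highest cross-validated accuracy).
bestCp <- cvFit$bestTune$cp
print(bestCp)
```

bestCp can then be passed on as the cp argument of rpart(), which avoids typing the value by hand.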
3. CART Tree (with the tuned cp)
CTree = rpart(Party ~ . -USER_ID, data = train, method = "class", cp = 0.04)
PredTest = predict(CTree, newdata = test, type = "class")
# After uploading, the accuracy is 0.61207. It is my first attempt, and the score is higher than the default logistic regression (0.57902).
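rpart.plot is loaded at the top but never used, so here is a small sketch of how a pruned tree could be inspected with it. This again assumes the kyphosis data from rpart as a stand-in; only cp = 0.04 is taken from the notes above.

```r
library(rpart)
library(rpart.plot)

# kyphosis (shipped with rpart) stands in for the competition data.
fit <- rpart(Kyphosis ~ ., data = kyphosis, method = "class", cp = 0.04)

# Complexity table: cp, number of splits and error estimates per tree size.
print(fit$cptable)

# Draw the fitted tree.
prp(fit)
```

A larger cp keeps fewer splits, so the plot is a quick check of how much of the tree survives the chosen value.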
P.S. I also tried a random forest.
library(randomForest)
# method = "rpart" and cp belong to caret/rpart, not to randomForest, so they are left out here
RFTree = randomForest(Party ~ . - USER_ID, data = train, ntree = 500, na.action = na.omit)
#The score is not good.
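A random forest can be checked without uploading, because it reports an out-of-bag (OOB) error estimate. This is only a sketch on iris as stand-in data; the names rf and oobError are my own.

```r
library(randomForest)

set.seed(100)
# iris stands in for the competition data; the response must be a factor
# for randomForest to run classification.
rf <- randomForest(Species ~ ., data = iris, ntree = 500, importance = TRUE)

# Out-of-bag error after all trees have been grown.
oobError <- rf$err.rate[rf$ntree, "OOB"]
print(1 - oobError)   # OOB accuracy estimate

print(rf$confusion)   # OOB confusion matrix with per-class error
varImpPlot(rf)        # variable importance plot
```

One thing that may matter here: na.action = na.omit drops every row that contains an NA, which can shrink the training set a lot. The randomForest package also provides na.roughfix (median/mode imputation) as an alternative, though I have not verified that it improves the score on this data.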
2017/3/20 I think I need to learn how to plot complex data structures. ggplot2 looks like a good way for me to do that.
posted on 2017-03-21 00:35 xiaojin693