Victoria's workshop

[MACHINE LEARNING] Can we predict voting outcomes?

1. CART Tree

library(rpart)
library(rpart.plot)
CTree = rpart(Party ~ . -USER_ID, data = train, method = "class")
PredTest = predict(CTree, newdata = test, type = "class")  # result is bad
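
Since the test set has no labels to check against, one way to see how bad (or good) a tree is before uploading is to hold out part of the training data. The sketch below is only an illustration: the data frame toy and its columns Q1 and Q2 are made-up stand-ins for the real survey data, and the 70/30 split is an arbitrary choice.

```r
library(rpart)

set.seed(1)
n <- 200

# Hypothetical toy data standing in for the real survey answers
toy <- data.frame(
  USER_ID = seq_len(n),
  Q1 = factor(sample(c("Yes", "No"), n, replace = TRUE)),
  Q2 = factor(sample(c("Yes", "No"), n, replace = TRUE))
)
# Party mostly follows Q1, with some label noise added on purpose
toy$Party <- factor(ifelse(toy$Q1 == "Yes", "Democrat", "Republican"))
flip <- sample(n, 20)
toy$Party[flip] <- ifelse(toy$Party[flip] == "Democrat", "Republican", "Democrat")

# 70/30 split: fit on one part, score on the held-out part
idx <- sample(n, round(0.7 * n))
fitSet <- toy[idx, ]
holdout <- toy[-idx, ]

fit <- rpart(Party ~ . - USER_ID, data = fitSet, method = "class")
pred <- predict(fit, newdata = holdout, type = "class")

# Confusion matrix and accuracy on the held-out rows
confusion <- table(actual = holdout$Party, predicted = pred)
accuracy <- sum(diag(confusion)) / sum(confusion)

# Accuracy of always guessing the most common party, for comparison
baseline <- max(table(holdout$Party)) / nrow(holdout)

print(confusion)
print(c(accuracy = accuracy, baseline = baseline))
```

Comparing accuracy with baseline shows whether the tree is doing better than simply predicting the majority class.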

2. Cross validation

library(e1071)
library(caret)
set.seed(100)
numFolds = trainControl(method = "cv", number = 10)
cpGrid = expand.grid(.cp = seq(0.01, 0.50, 0.01))
tr = train(Party ~ . - USER_ID, method = "rpart", data = train, trControl = numFolds, tuneGrid = cpGrid, na.action = na.pass)

Tip: na.action = na.pass is the part that deals with missing (NA) values. The cross validation picked cp = 0.04.
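
As a cross-check that does not need caret, rpart keeps its own cross-validation table (cptable) when xval is set. This is a different route from the train() call above, not a replacement for it. The sketch below uses made-up toy data (toy, Q1, Q2 are hypothetical) with some NA answers injected, and grows the tree with a small starting cp of 0.001, which is an arbitrary choice.

```r
library(rpart)

set.seed(100)
n <- 300

# Hypothetical toy data; Party follows Q1
toy <- data.frame(
  Q1 = factor(sample(c("Yes", "No"), n, replace = TRUE)),
  Q2 = factor(sample(c("Yes", "No"), n, replace = TRUE))
)
toy$Party <- factor(ifelse(toy$Q1 == "Yes", "Democrat", "Republican"))

# Inject missing answers; rpart keeps rows with missing predictors by default
toy$Q1[sample(n, 30)] <- NA

# Grow a tree with 10-fold cross-validation built in
fit <- rpart(Party ~ ., data = toy, method = "class",
             control = rpart.control(cp = 0.001, xval = 10))

# cptable has one row per candidate cp, with the cross-validated error in "xerror"
cpTable <- fit$cptable
bestRow <- which.min(cpTable[, "xerror"])
bestCp <- cpTable[bestRow, "CP"]

# Prune the tree back to the cp with the lowest cross-validated error
pruned <- prune(fit, cp = bestCp)

print(cpTable)
print(bestCp)
```

The cp found this way will not necessarily match the one caret reports, because the folds and the candidate cp values differ.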

3. CART Tree with the tuned cp

CTree = rpart(Party ~ . -USER_ID, data = train, method = "class", cp = 0.04)
PredTest = predict(CTree, newdata = test, type = "class")  

# After uploading, the accuracy is 0.61207. This is my first attempt, and the score is higher than the default logistic regression baseline of 0.57902.

 

P.S. I also tried a random forest.

library(randomForest)
# method and cp are rpart settings, not randomForest arguments; Party must be a factor for classification
RFTree = randomForest(Party ~ . - USER_ID, data = train, ntree = 500, na.action = na.omit)

# The score was not good.
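
One thing worth checking here (a guess on my side, not something confirmed above) is how many rows na.omit throws away, because the forest only sees the rows that have no missing answers at all. The sketch below uses made-up data (toy, Q1 to Q3 and the helper fillMissing are all hypothetical) to count the dropped rows and to show one possible alternative: keeping every row by turning "missing" into its own answer category.

```r
set.seed(1)
n <- 100

# Hypothetical survey answers where about a third of each column is missing
toy <- data.frame(
  Q1 = sample(c("Yes", "No", NA), n, replace = TRUE),
  Q2 = sample(c("Yes", "No", NA), n, replace = TRUE),
  Q3 = sample(c("Yes", "No", NA), n, replace = TRUE),
  stringsAsFactors = TRUE
)

# na.omit keeps only the rows with no NA in any column
complete <- na.omit(toy)
rowsKept <- nrow(complete)
rowsDropped <- n - rowsKept

# How many answers are missing in each column
naPerColumn <- colSums(is.na(toy))

# Alternative: recode NA as an explicit "Missing" level so no row is lost
fillMissing <- function(x) {
  x <- as.character(x)
  x[is.na(x)] <- "Missing"
  factor(x)
}
filled <- as.data.frame(lapply(toy, fillMissing))

print(c(kept = rowsKept, dropped = rowsDropped))
print(naPerColumn)
print(nrow(filled))
```

The randomForest package also provides na.roughfix, which fills NA values with the column median or most frequent level and can be passed as na.action instead of na.omit.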

 

2017/3/20 I think I need to learn how to plot complex data structures with ggplot2. It looks like a good way for me to understand the data.

posted on 2017-03-21 00:35  xiaojin693