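Summary: (1) Characteristics of the C4.5 algorithm. Input variables (independent variables): categorical or continuous. Output variable (target variable): categorical. Handling of continuous variables: discretization into N equal bins. Branching type: multiway. Splitting criterion: information gain ratio (after the split, the target variable's values vary less, i.e. purity is high). Pre-pruning: whether the number of leaf nodes is below some threshold. Post-pruning: uses the confidence method and reduced...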
posted @ 2015-02-04 13:13 payton数据之旅
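To make the splitting criterion named in the summary above concrete, here is a minimal Python sketch of the information gain ratio for a single categorical feature. The function names `entropy` and `gain_ratio` and the toy data are illustrative assumptions, not taken from the post, and the sketch leaves out everything else C4.5 does (continuous attributes, pruning, missing values).

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of splitting on the feature, divided by split information."""
    n = len(labels)
    # Group the target labels by the feature value they occur with.
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the target after the split.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information is the entropy of the feature itself.
    split_info = entropy(feature_values)
    # A feature with a single value cannot split the data (hypothetical convention: return 0).
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data (made up for illustration).
labels = ["yes", "yes", "no", "no"]
perfect = ["a", "a", "b", "b"]   # separates the classes completely
useless = ["a", "b", "a", "b"]   # tells us nothing about the class
print(gain_ratio(perfect, labels), gain_ratio(useless, labels))
```

Dividing by the split information penalizes features with many distinct values, which plain information gain tends to favor.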
Summary: Joanna Zhao’s and Jenny Bryan’s R graph catalog is meant to be a complement to the physical book, Creating More Effective Graphs, but it’s a really nice ...
posted @ 2015-02-04 10:11 payton数据之旅
Summary: In preparation for an R Workgroup meeting, I started thinking about what would be my "Top 5 R Functions". I ruled out the functions for basic mechanics...
posted @ 2015-02-04 10:04 payton数据之旅
Summary: This slidify-based deck introduces the shiny package from RStudio and walks the reader through the development of an interactive application that presents us...
posted @ 2015-02-04 09:56 payton数据之旅