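Summary:
This article introduces some fine-grained techniques for the ggplot2 package in R. It is aimed at readers who already know the basics of plotting in R and need more polished charts, especially those who have just moved from Excel to ggplot2 and need to produce charts frequently. It does not cover flashy chart styles and focuses on practical business charts. It is organized as follows: 1. Preparation before plotting: a custom ggplot2 format painter 2. Read more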
Summary:
Every now and again someone comes along and writes an R package that I consider to be a ‘game changer’ for the language and its application to Data S Read more
Summary:
The glmnetUtils package provides a collection of tools to streamline the process of fitting elastic net models with glmnet. I wrote the package after Read more
Summary:
If the media coverage is anything to go by, people are desperate to know who will win the US election on November 8. Polls give us some indication of Read more
Summary:
If your regression model contains a categorical predictor variable, you commonly test the significance of its categories against a preselected referen Read more
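Summary:
Original article: 8 Tactics to Combat Imbalanced Classes in Your Machine Learning Dataset by Jason Brownlee. When you work with an imbalanced dataset, even if your classification model reaches 90% accuracy, a closer look will show that almost all of the predictions fall into a single class. C Read more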
Summary:
In the first installment of this series, we scraped reviews from Goodreads. In the second one, we performed exploratory data analysis and created new v Read more
Summary:
TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, Read more
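Summary:
Cluster analysis makes it easy to see how the samples in a dataset are distributed. Earlier articles on cluster analysis usually cover only continuous variables and say little about handling mixed data (for example, data that contains continuous, nominal, and ordinal variables together). This article uses Gower distance, PAM (partitioning around medoid Read more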
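Summary:
Abstract: This article first introduces the basic concepts of parallel computing and then briefly describes the relationship between R and parallel computing. The author then discusses two modes of parallel computing, implicit and explicit, from the perspective of an R user, with an example of each. The implicit mode offers a simple, clear way of working and hides the implementation details of parallel computing well, so users can focus on the problem itself. The explicit mode is more flexible and varied, Read more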