Post category - Data Processing and Analysis

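Data analysis with Python, centered on Pandas, NumPy, and Excel; mainly a record of features I use regularly at work and in study.
Summary: MultiIndex (hierarchical indexing)
Summary: a roundup of commonly used string-processing APIs in pandas
Summary: data mapping, discretization, outliers, resampling, one-hot encoding....
Summary: data cleaning - handling missing values (drop, fill)
Summary: ![](https://img2018.cnblogs.com/blog/1325660/201911/1325660-20191119230314019-587683946.jpg)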
pandas objects are equipped with a set of common mathematical and statistical methods. Most of these fall into the category of red
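To make the excerpt above concrete, here is a minimal sketch of a few of the built-in statistical methods on a DataFrame. The column names and values are made up for illustration and are not taken from the linked post.

```python
import numpy as np
import pandas as pd

# Made-up data; "one" contains a missing value on purpose.
df = pd.DataFrame({"one": [1.0, 2.0, np.nan], "two": [4.0, 5.0, 6.0]})

# Column-wise sum; missing values are skipped by default (skipna=True).
col_sums = df.sum()

# Row-wise mean (axis=1); the NaN in the last row is skipped as well.
row_means = df.mean(axis=1)

# Several summary statistics (count, mean, std, quartiles, ...) in one call.
summary = df.describe()

print(col_sums)
print(row_means)
print(summary)
```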
Pandas will be a major tool of interest throughout much of the rest of the book. It contains data structures and manipulation tools designed to ma
The numpy.random module supplements the built-in Python random with functions for efficiently generating whole arrays of sample values from many k
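As a rough illustration of generating whole arrays of samples at once, the sketch below uses NumPy's `default_rng` generator interface. That choice, the seed, and the array sizes are assumptions made for this example; the linked post may well use the older `np.random.normal` style functions instead.

```python
import numpy as np

# The seed value is arbitrary; it only makes the run repeatable.
rng = np.random.default_rng(seed=0)

# A 4x4 array of draws from the standard normal distribution.
normal_samples = rng.normal(loc=0.0, scale=1.0, size=(4, 4))

# 1000 draws from the uniform distribution on [0, 1).
uniform_samples = rng.uniform(low=0.0, high=1.0, size=1000)

print(normal_samples.shape)
print(uniform_samples.min(), uniform_samples.max())
```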
File Input and Output: NumPy is able to save and load data to and from disk either in text or binary format. In this section I only discuss NumPy's bui
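A small sketch of both formats mentioned in the excerpt: `np.save`/`np.load` for the binary `.npy` format and `np.savetxt`/`np.loadtxt` for plain text. The file names are hypothetical, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

import numpy as np

arr = np.arange(10)

with tempfile.TemporaryDirectory() as tmp:
    # Binary format: np.save writes an uncompressed .npy file.
    binary_path = os.path.join(tmp, "some_array.npy")  # hypothetical name
    np.save(binary_path, arr)
    loaded_binary = np.load(binary_path)

    # Text format: one integer per line.
    text_path = os.path.join(tmp, "some_array.txt")  # hypothetical name
    np.savetxt(text_path, arr, fmt="%d")
    loaded_text = np.loadtxt(text_path, dtype=int)

print(loaded_binary)
print(loaded_text)
```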
Using NumPy arrays enables you to express many kinds of data processing tasks as concise array expressions (that is, many data processes can be written with arrays instead of loops) that might otherwi
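The idea in the excerpt can be sketched by writing the same conditional selection twice: once as an explicit Python loop and once as a single array expression with `np.where`. The arrays are made-up values chosen for illustration.

```python
import numpy as np

xs = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
ys = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])

# Loop version: take the value from xs where cond is True, otherwise from ys.
looped = np.array([x if c else y for x, y, c in zip(xs, ys, cond)])

# Array-expression version: the same selection with no explicit loop.
vectorized = np.where(cond, xs, ys)

print(looped)
print(vectorized)
```

Both versions produce the same array; the `np.where` form is shorter and usually much faster on large inputs because the loop runs inside NumPy rather than in Python.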
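Summary: Why. Looking back on how I got into data analysis: at the very beginning it was SPSS + Excel, in early 2015, during the second semester of my freshman year. I was fairly good at statistics, and SPSS's point-and-click operation made it easy to pick up. But I gradually found that the worst thing about using such software is that it is not flexible enough; you cannot do whatever you want. A programming language is the most flexible option. At first I used R, which is command-driven, but that did not feel quite right either, so I started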
