Getting Started with Pandas, Part 10: Percentage Change and Correlation

import pandas as pd
import numpy as np



s = pd.Series([877,865,874,890,912])
s
0    877
1    865
2    874
3    890
4    912
dtype: int64



# Day-over-day percentage change: each value compared with the previous day's, i.e. (today - yesterday) / yesterday
s.pct_change()
0         NaN
1   -0.013683
2    0.010405
3    0.018307
4    0.024719
dtype: float64
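
To make the calculation explicit, here is a minimal sketch that reproduces the result by hand, assuming the default arguments of `pct_change()` (one period back). The names `auto` and `manual` are only illustrative.

```python
import pandas as pd

s = pd.Series([877, 865, 874, 890, 912])

# pct_change() with default arguments compares each value with the one before it
auto = s.pct_change()

# The same quantity written out by hand: (current - previous) / previous
previous = s.shift(1)
manual = (s - previous) / previous

# Side-by-side view; the first row is NaN because there is no previous value
print(pd.concat({'pct_change': auto, 'manual': manual}, axis=1))
```

The first element has nothing to compare against, which is why both columns start with `NaN`.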



# Covariance of two random series (the result differs on every run because no seed is set)
s1 = pd.Series(np.random.randn(10))
s2 = pd.Series(np.random.randn(10))
s1.cov(s2)
-0.3417718431113297
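
As a rough check on what `cov` computes, the sketch below compares it with the sample covariance formula (dividing by n - 1). The seeded generator is an assumption added only so the numbers are reproducible. It is not part of the original notebook, so the values will not match the output above.

```python
import numpy as np
import pandas as pd

# Seeded generator: an assumption for reproducibility, not in the original
rng = np.random.default_rng(0)
s1 = pd.Series(rng.standard_normal(10))
s2 = pd.Series(rng.standard_normal(10))

# Sample covariance: sum of products of deviations from the mean, divided by n - 1
manual_cov = ((s1 - s1.mean()) * (s2 - s2.mean())).sum() / (len(s1) - 1)

print(s1.cov(s2), manual_cov)
```

The sign of the covariance tells you the direction of the joint movement, but its size depends on the scale of the data, which is why correlation is usually easier to interpret.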



# Correlation: when one series changes, does the other change with it?
s1
0    0.070405
1    0.155567
2   -0.518001
3   -0.057693
4    0.411682
5    1.841240
6    0.759474
7    0.301355
8   -0.864013
9    0.642086
dtype: float64



s2 = s1*2



s1.corr(s2)
1.0
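
The result is exactly 1.0 because `s2` is a positive multiple of `s1`. A minimal sketch of why, assuming the default Pearson method: correlation is covariance divided by the product of the two standard deviations, so any scale factor cancels out. The helper `pearson` and the seeded data are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Seeded generator: an assumption for reproducibility, not in the original
rng = np.random.default_rng(0)
s1 = pd.Series(rng.standard_normal(10))

def pearson(a, b):
    """Hypothetical helper: Pearson correlation as cov / (std_a * std_b)."""
    return a.cov(b) / (a.std() * b.std())

doubled = s1 * 2       # positive scaling: correlation of 1
flipped = s1 * -2      # negative scaling: correlation of -1
shifted = s1 * 2 + 5   # adding a constant does not change the correlation

print(s1.corr(doubled), s1.corr(flipped), s1.corr(shifted))
print(pearson(s1, doubled))
```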



s3 = pd.Series(np.random.randn(10))



df = pd.DataFrame({
    's1':s1,
    's2':s2,
    's3':s3
})
df
         s1        s2        s3
0  0.070405  0.140811  0.771643
1  0.155567  0.311135  2.976528
2 -0.518001 -1.036002 -0.368043
3 -0.057693 -0.115387  0.273931
4  0.411682  0.823364  0.434022
5  1.841240  3.682480 -1.641432
6  0.759474  1.518949  0.682910
7  0.301355  0.602710  0.514268
8 -0.864013 -1.728025  0.023511
9  0.642086  1.284171  0.960029



df.corr()
          s1        s2        s3
s1  1.000000  1.000000 -0.278755
s2  1.000000  1.000000 -0.278755
s3 -0.278755 -0.278755  1.000000
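
One way to read this matrix: each cell is the pairwise correlation between two columns, so the matrix is symmetric with 1.0 on the diagonal. The sketch below checks that on freshly generated data. The seed is an assumption for reproducibility, so the numbers differ from the output above.

```python
import numpy as np
import pandas as pd

# Seeded generator: an assumption for reproducibility, not in the original
rng = np.random.default_rng(0)
s1 = pd.Series(rng.standard_normal(10))
s3 = pd.Series(rng.standard_normal(10))
df = pd.DataFrame({'s1': s1, 's2': s1 * 2, 's3': s3})

matrix = df.corr()  # Pearson correlation by default

# A single cell equals the Series-level correlation of the same two columns
pairwise = df['s1'].corr(df['s3'])
print(matrix.loc['s1', 's3'], pairwise)

# s1 and s2 have identical rows and columns in the matrix because s2 = 2 * s1
print(matrix)
```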
posted @ 2021-07-14 21:23  vv_869