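Testing whether a variable follows a normal distribution
Goal: check whether the CSI values (500 of them) from the same hotspot, the same sampling point, and the same channel follow a normal distribution, or whether they follow some other distribution.
Method: a Q-Q plot.
Reference: https://wenku.baidu.com/view/c661ebb365ce050876321319.html
To test with a Q-Q plot whether a sequence X = (x1, x2, …, xi, …, xN), N > 0, is normally distributed: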
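1. Sort the original sequence in ascending order: x1 <= x2 <= … <= xi <= … <= xN
2. Compute the Q-Q sequences:
The sample mean and standard deviation are avg = 1/N * sum(xi), std = np.sqrt(1/(N-1) * sum(np.square(xi - avg)))
The standardized sample quantile is Qi = (xi - avg) / std, and the corresponding cumulative probability is ti = (i - 0.5) / N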
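| Data sequence | x1 | … | xi | … | xN |
| --- | --- | --- | --- | --- | --- |
| Q | Q1 = (x1-avg)/std | … | Qi = (xi-avg)/std | … | QN = (xN-avg)/std |
| t | t1 = (1-0.5)/N | … | ti = (i-0.5)/N | … | tN = (N-0.5)/N |
| Q' | looked up from t1 | … | looked up from ti | … | looked up from tN |

Here "looked up" means reading the standard normal quantile for the probability ti from a standard normal table.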
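3. Plot Q against Q' (with Q' on the horizontal axis) and compare the points with a straight line y = kx + b. If the points roughly coincide with the line, the original sequence follows a normal distribution; if they do not form a straight line, it does not. When the raw sorted values xi are plotted instead of the standardized Qi, the intercept b estimates the mean and the slope k estimates the standard deviation, i.e. the sequence follows N(b, k^2); with the standardized Qi the reference line is simply y = x.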
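The three steps above can be sketched by hand in a few lines. This is only an illustration under stated assumptions: the helper name `qq_points` is made up for this example, `scipy.stats.norm.ppf` stands in for the table lookup of Q', and the input is synthetic data rather than the real CSI samples.

```python
import numpy as np
from scipy import stats


def qq_points(x):
    """Return (Q', Q): theoretical and standardized sample quantiles."""
    x = np.sort(np.asarray(x, dtype=float))  # step 1: ascending order
    n = x.size
    avg = x.mean()
    std = x.std(ddof=1)                      # sample std with the 1/(N-1) factor
    q = (x - avg) / std                      # Q_i = (x_i - avg) / std
    t = (np.arange(1, n + 1) - 0.5) / n      # t_i = (i - 0.5) / N
    q_theory = stats.norm.ppf(t)             # Q'_i, replaces the table lookup
    return q_theory, q


# Synthetic stand-in data (an assumption, not the CSI file): 500 draws from N(3, 1)
rng = np.random.default_rng(0)
demo = rng.normal(loc=3.0, scale=1.0, size=500)
q_theory, q = qq_points(demo)

# Step 3: for a normal sample the points (Q', Q) should lie close to y = x
slope, intercept = np.polyfit(q_theory, q, 1)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

Passing `q_theory` and `q` to `plt.scatter` would give the same kind of picture that a library Q-Q routine draws; the fitted slope and intercept are just a numeric way of checking closeness to the line.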
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm

# Read the data: copy the header line plus the first 500 samples into a smaller file
num_sample = 500
with open("data/clean_data/training_csi.csv", "rb") as fi:
    with open("data/clean_data/for_qq_plot.csv", "wb") as fo:
        fo.write(fi.readline())  # header row
        for i in range(num_sample):
            fo.write(fi.readline())

samples = pd.read_csv("data/clean_data/for_qq_plot.csv")
csi540 = np.array(samples["csi540"])

# Q-Q plot of the raw CSI values against the standard normal;
# line="45" draws the reference line y = x
sm.qqplot(csi540, line="45")
plt.show()
```
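The plot shows that the scatter points nearly coincide with the line y = x + 3, so the variable is approximately normally distributed. The intercept gives a mean of about 3, and the slope of 1 gives a standard deviation (and therefore a variance) of about 1.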
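Reading the slope and intercept off the plot by eye is approximate. One way to put numbers on it is `scipy.stats.probplot`, which returns the Q-Q points together with a least-squares line through them. The sketch below uses synthetic data drawn from N(3, 1) as a stand-in for `csi540` (an assumption, since the CSV is not available here), so the printed values illustrate the method and not the real measurement.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for csi540 (an assumption): 500 draws from N(3, 1)
rng = np.random.default_rng(0)
demo = rng.normal(loc=3.0, scale=1.0, size=500)

# probplot returns the Q-Q points and a least-squares fit (slope, intercept, r)
(theory_q, ordered_x), (slope, intercept, r) = stats.probplot(demo, dist="norm")

est_mean = intercept  # intercept of raw values vs. normal quantiles ~ mean
est_std = slope       # slope ~ standard deviation
print(f"mean~{est_mean:.2f}, std~{est_std:.2f}, r={r:.4f}")
```

An `r` close to 1 means the points are close to a straight line; the `dist` argument can be changed to compare the data against a different distribution.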