Processing data read from a CSV file: maximum, minimum, mean, and other statistics

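1. Problem description

Output basic information about the dataset, such as the maximum, minimum, and mean

Count the variables that have missing values and the number of samples with missing values

Identify outliers with a box plot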

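2. Maximum, minimum, and mean

Maximum:

import pandas as pd
data = pd.read_csv("C:\\Users\\Administrator\\Desktop\\catering_sale.csv")
# Take the maximum of the raw numeric columns directly,
# rather than the maximum of the describe() summary table
print(data.max(numeric_only=True))

Output (销量 is the sales column):

销量    9106.44
dtype: float64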

Minimum:

import pandas as pd
data = pd.read_csv("C:\\Users\\Administrator\\Desktop\\catering_sale.csv")
# Take the minimum of the raw numeric columns directly,
# rather than the minimum of the describe() summary table
print(data.min(numeric_only=True))

Output:

销量    22.0
dtype: float64

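Mean:

import pandas as pd
data = pd.read_csv("C:\\Users\\Administrator\\Desktop\\catering_sale.csv")
data1 = data.describe()
# Caution: data1 is the summary table, so data1.mean() averages its rows
# (count, mean, std, min, quartiles, max), not the raw sales values.
# data1.loc["mean"] or data["销量"].mean() gives the mean of the data itself.
print(data1.mean())

Output (the average of the summary rows, see the caution in the code):

销量    2621.079309
dtype: float64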
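
Because describe() returns a summary table, calling mean() on that table is not the same as taking the mean of the column. The sketch below illustrates the difference on a small made-up DataFrame. The five values are hypothetical and only the column name 销量 is taken from the original CSV.

```python
import pandas as pd

# Hypothetical sample values; only the column name mirrors the original CSV.
sales = pd.DataFrame({"销量": [10.0, 20.0, 30.0, 40.0, 50.0]})

summary = sales.describe()  # rows: count, mean, std, min, 25%, 50%, 75%, max

column_mean = sales["销量"].mean()          # mean of the raw values
summary_mean = summary.loc["mean", "销量"]  # the same mean, read from the summary table
mean_of_summary = summary["销量"].mean()    # average of the eight summary rows

print("mean of the column:      ", column_mean)
print("'mean' row of describe():", summary_mean)
print("mean of the summary rows:", mean_of_summary)
```

The first two numbers agree, while the third mixes count, std, and the quantiles into the average, so it should not be reported as the mean of the data.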

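3. Number of missing values

import pandas as pd
data = pd.read_csv("C:\\Users\\Administrator\\Desktop\\catering_sale.csv")
# isnull() flags every missing cell; sum() then counts the flags per column
data2 = data.isnull().sum()
print(data2)

Output (日期 is the date column, 销量 is the sales column):

日期    0
销量    1
dtype: int64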
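
The problem description also asks for the number of variables and samples that have missing values. A minimal sketch of one way to get both counts is shown here, assuming a DataFrame shaped like the original one. The rows are made up for illustration, and the names columns_with_missing and rows_with_missing are hypothetical.

```python
import pandas as pd

# Hypothetical rows with the same two columns as the original CSV.
data = pd.DataFrame({
    "日期": ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"],
    "销量": [100.0, float("nan"), 300.0, 400.0],
})

missing_per_column = data.isnull().sum()                     # missing cells in each column
columns_with_missing = int((missing_per_column > 0).sum())   # variables with at least one gap
rows_with_missing = int(data.isnull().any(axis=1).sum())     # samples with at least one gap

print(missing_per_column)
print("variables with missing values:", columns_with_missing)
print("samples with missing values:", rows_with_missing)
```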

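4. Identifying outliers with a box plot

import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv("C:\\Users\\Administrator\\Desktop\\catering_sale.csv")
plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render the Chinese column name
plt.rcParams['axes.unicode_minus'] = False    # keep the minus sign readable with this font
plt.figure()
p = data.boxplot(return_type='dict')  # draw the box plot
x = p['fliers'][0].get_xdata()  # x positions of the outlier markers
y = p['fliers'][0].get_ydata()  # outlier values
y.sort()
for i in range(len(x)):
    # Label each outlier; shift the label when the previous outlier is close,
    # so that neighbouring labels do not overlap
    if i > 0 and y[i] != y[i-1]:  # the inequality check avoids dividing by zero
        plt.annotate(y[i], xy=(x[i], y[i]), xytext=(x[i]+0.05 - 0.8/(y[i]-y[i-1]), y[i]))
    else:
        plt.annotate(y[i], xy=(x[i], y[i]), xytext=(x[i]+0.08, y[i]))
plt.show()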

Output: (the box plot figure is not included here)
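
The points a box plot draws as outliers can also be listed numerically. The sketch below assumes matplotlib's default whisker setting (whis=1.5), under which a value counts as an outlier when it lies more than 1.5 times the interquartile range below the first quartile or above the third quartile. The sample values are made up, and the names lower_fence and upper_fence are hypothetical.

```python
import pandas as pd

# Hypothetical sales values with one very low and one very high entry.
sales = pd.Series([22.0, 2400.0, 2500.0, 2600.0, 2700.0, 2800.0, 3000.0, 9106.44])

q1 = sales.quantile(0.25)  # first quartile
q3 = sales.quantile(0.75)  # third quartile
iqr = q3 - q1              # interquartile range

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values outside the fences are the ones the box plot would mark as fliers.
outliers = sales[(sales < lower_fence) | (sales > upper_fence)]
print(outliers.tolist())
```

Whether such points are truly erroneous is a separate judgment; the rule only flags values that are far from the bulk of the data.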

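5. Complete code

import pandas as pd
import matplotlib.pyplot as plt  # plotting library
data = pd.read_csv("C:\\Users\\Administrator\\Desktop\\catering_sale.csv")
# Basic statistics, computed on the raw numeric columns
print(data.max(numeric_only=True))
print(data.min(numeric_only=True))
print(data.mean(numeric_only=True))
# Number of missing values per column
data2 = data.isnull().sum()
print(data2)
plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render the Chinese column name
plt.rcParams['axes.unicode_minus'] = False    # keep the minus sign readable with this font
plt.figure()
p = data.boxplot(return_type='dict')  # draw the box plot
x = p['fliers'][0].get_xdata()  # x positions of the outlier markers
y = p['fliers'][0].get_ydata()  # outlier values
y.sort()
for i in range(len(x)):
    # Label each outlier; shift the label when the previous outlier is close
    if i > 0 and y[i] != y[i-1]:  # the inequality check avoids dividing by zero
        plt.annotate(y[i], xy=(x[i], y[i]), xytext=(x[i]+0.05 - 0.8/(y[i]-y[i-1]), y[i]))
    else:
        plt.annotate(y[i], xy=(x[i], y[i]), xytext=(x[i]+0.08, y[i]))
plt.show()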

 

posted @ 2021-03-12 15:49  彭仔