Visual Analysis of Data

4.2 Visual Analysis of Data

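4.2.1 Common Plotting Functions
matplotlib is Python's basic plotting package and serves as its general-purpose graphics framework. It provides a full set of command-style APIs similar to MATLAB's, which makes it well suited to drawing basic statistical charts.
# Some basic settings are needed before plotting
%config InlineBackend.figure_format='retina' # render inline figures more sharply (IPython/Jupyter magic)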
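
The `%config` line is an IPython magic, so it is not available in a plain Python script. As a rough sketch, a comparable increase in resolution can be requested through matplotlib's own `rcParams` dictionary. The two DPI values are illustrative assumptions, not settings from the original post.

```python
import matplotlib.pyplot as plt

# rcParams holds matplotlib's global defaults and works in scripts and notebooks.
plt.rcParams['figure.dpi'] = 150    # resolution of displayed figures (assumed value)
plt.rcParams['savefig.dpi'] = 300   # resolution of files written by savefig (assumed value)

print(plt.rcParams['figure.dpi'], plt.rcParams['savefig.dpi'])
```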

4.2.1.1 Charts for Count Data
X=['A','B','C','D','E','F','G']
Y=[1,4,7,3,2,5,6]
  • Bar chart
import matplotlib.pyplot as plt # load the basic plotting package
plt.bar(X,Y) # bar chart
<BarContainer object of 7 artists>

  • Pie chart
plt.pie(Y,labels=X) # pie chart
([<matplotlib.patches.Wedge at 0x2ddde98f190>,
  <matplotlib.patches.Wedge at 0x2ddde98f6d0>,
  <matplotlib.patches.Wedge at 0x2ddde98fbb0>,
  <matplotlib.patches.Wedge at 0x2ddde99c0d0>,
  <matplotlib.patches.Wedge at 0x2ddde99c5b0>,
  <matplotlib.patches.Wedge at 0x2ddde99ca90>,
  <matplotlib.patches.Wedge at 0x2ddde99cf10>],
 [Text(1.0930834302648262, 0.12316092919623813, 'A'),
  Text(0.8600146100749843, 0.6858388079261577, 'B'),
  Text(-0.3633070202274737, 1.038271645116746, 'C'),
  Text(-1.0930834374717984, 0.12316086523257752, 'D'),
  Text(-0.9910657227744873, -0.47727217930807914, 'E'),
  Text(-0.3633068744124594, -1.038271696139623, 'F'),
  Text(0.8600147063942706, -0.6858386871455828, 'G')])

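
A pie chart is easier to read when each wedge shows its share of the total. The sketch below reuses the same X and Y and adds percentage labels through the `autopct` parameter. The trailing semicolon is a notebook habit that hides the long list of Wedge and Text objects printed above.

```python
import matplotlib.pyplot as plt

X = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
Y = [1, 4, 7, 3, 2, 5, 6]

fig, ax = plt.subplots()
# autopct is a format string applied to each wedge's percentage of sum(Y)
ax.pie(Y, labels=X, autopct='%1.1f%%');

# The same percentages computed by hand, for checking the labels
shares = [100 * y / sum(Y) for y in Y]
print(dict(zip(X, shares)))
```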

4.2.1.2 Charts for Measurement Data
  • Line chart
plt.plot(X,Y)
[<matplotlib.lines.Line2D at 0x2ddde9dfb20>]

  • Histogram
import pandas as pd
BSdata=pd.read_excel('data/BSdata.xlsx',sheet_name='Sheet1');BSdata # read the data
Region/Country/Area Unnamed: 1 Year Series Value Footnotes Source
0 1 Total, all countries or areas 2010 Population mid-year estimates (millions) 6956.82 NaN United Nations Population Division, New York, ...
1 1 Total, all countries or areas 2010 Population mid-year estimates for males (milli... 3507.70 NaN United Nations Population Division, New York, ...
2 1 Total, all countries or areas 2010 Population mid-year estimates for females (mil... 3449.12 NaN United Nations Population Division, New York, ...
3 1 Total, all countries or areas 2010 Sex ratio (males per 100 females) 101.70 NaN United Nations Population Division, New York, ...
4 1 Total, all countries or areas 2010 Population aged 0 to 14 years old (percentage) 27.00 NaN United Nations Population Division, New York, ...
5 1 Total, all countries or areas 2010 Population aged 60+ years old (percentage) 11.00 NaN United Nations Population Division, New York, ...
6 1 Total, all countries or areas 2010 Population density 53.50 NaN United Nations Population Division, New York, ...
7 1 Total, all countries or areas 2015 Population mid-year estimates (millions) 7379.80 NaN United Nations Population Division, New York, ...
8 1 Total, all countries or areas 2015 Population mid-year estimates for males (milli... 3720.70 NaN United Nations Population Division, New York, ...
9 1 Total, all countries or areas 2015 Population mid-year estimates for females (mil... 3659.10 NaN United Nations Population Division, New York, ...
10 1 Total, all countries or areas 2015 Sex ratio (males per 100 females) 101.70 NaN United Nations Population Division, New York, ...
11 1 Total, all countries or areas 2015 Population aged 0 to 14 years old (percentage) 26.20 NaN United Nations Population Division, New York, ...
12 1 Total, all countries or areas 2015 Population aged 60+ years old (percentage) 12.20 NaN United Nations Population Division, New York, ...
13 1 Total, all countries or areas 2015 Population density 56.70 NaN United Nations Population Division, New York, ...
14 1 Total, all countries or areas 2015 Surface area (thousand km2) 136162.00 NaN United Nations Statistics Division, New York, ...
15 1 Total, all countries or areas 2019 Population mid-year estimates (millions) 7713.47 NaN United Nations Population Division, New York, ...
16 1 Total, all countries or areas 2019 Population mid-year estimates for males (milli... 3889.03 NaN United Nations Population Division, New York, ...
17 1 Total, all countries or areas 2019 Population mid-year estimates for females (mil... 3824.43 NaN United Nations Population Division, New York, ...
18 1 Total, all countries or areas 2019 Sex ratio (males per 100 females) 101.70 NaN United Nations Population Division, New York, ...
19 1 Total, all countries or areas 2019 Population aged 0 to 14 years old (percentage) 25.60 NaN United Nations Population Division, New York, ...
20 1 Total, all countries or areas 2019 Population aged 60+ years old (percentage) 13.20 NaN United Nations Population Division, New York, ...
21 1 Total, all countries or areas 2019 Population density 59.30 NaN United Nations Population Division, New York, ...
22 1 Total, all countries or areas 2019 Surface area (thousand km2) 130094.00 NaN United Nations Statistics Division, New York, ...
23 1 Total, all countries or areas 2021 Population mid-year estimates (millions) 7874.97 Projected estimate (medium fertility variant). United Nations Population Division, New York, ...
24 1 Total, all countries or areas 2021 Population mid-year estimates for males (milli... 3970.24 Projected estimate (medium fertility variant). United Nations Population Division, New York, ...
plt.hist(BSdata['Year']) # frequency histogram; density=False by default
(array([7., 0., 0., 0., 8., 0., 0., 0., 8., 2.]),
 array([2010. , 2011.1, 2012.2, 2013.3, 2014.4, 2015.5, 2016.6, 2017.7,
        2018.8, 2019.9, 2021. ]),
 <BarContainer object of 10 artists>)
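
With `density=True` the bar heights are rescaled so that the total bar area equals 1, instead of showing raw counts. The sketch below illustrates this on a list of years rebuilt from the counts printed above (7, 8, 8 and 2 rows). The choice of `bins=4` is an assumption made for the illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Year values reconstructed from the frequency output shown above
years = [2010] * 7 + [2015] * 8 + [2019] * 8 + [2021] * 2

fig, ax = plt.subplots()
heights, edges, _ = ax.hist(years, bins=4, density=True)  # density histogram

# Each bar's area is height * bin width; the areas sum to 1
area = float(np.sum(heights * np.diff(edges)))
print(area)
```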

plt.hist(BSdata.Series) # Series is a text column, so hist counts how often each label occurs
(array([4., 4., 3., 0., 3., 3., 0., 3., 3., 2.]),
 array([0. , 0.7, 1.4, 2.1, 2.8, 3.5, 4.2, 4.9, 5.6, 6.3, 7. ]),
 <BarContainer object of 10 artists>)
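
For a text column, counting the labels with pandas and drawing a bar chart usually gives a cleaner picture than a histogram, because no empty bins appear between categories. This is a minimal sketch on a small hypothetical DataFrame named `demo`, not the BSdata file itself.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for a text-valued column such as BSdata['Series']
demo = pd.DataFrame({'Series': ['Population density', 'Sex ratio',
                                'Population density', 'Surface area',
                                'Sex ratio', 'Population density']})

counts = demo['Series'].value_counts()   # label frequencies, largest first

fig, ax = plt.subplots()
ax.barh(counts.index, counts.values)     # horizontal bars keep long labels readable
ax.set_xlabel('count')
print(counts)
```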

  • Scatter plot (scatter)
plt.scatter(BSdata.Year,BSdata.Series)
<matplotlib.collections.PathCollection at 0x2dde10a7460>

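4.2.1.3 Setting Figure Parameters

Titles, labels, axis ranges, and colors

plt.plot(X,Y,c='red') # c (short for color) sets the line color; c='red' draws in red
plt.ylim(0,8) # plt.xlim, plt.ylim: set the x- and y-axis ranges
plt.xlabel('x');plt.ylabel('y'); # plt.xlabel, plt.ylabel: set the axis labels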
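
The same settings are also available as methods on an Axes object, which becomes convenient once a figure holds more than one plot. The sketch below repeats the example in that style and adds a title. The title text is an illustrative assumption.

```python
import matplotlib.pyplot as plt

X = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
Y = [1, 4, 7, 3, 2, 5, 6]

fig, ax = plt.subplots()
ax.plot(X, Y, color='red')       # same as plt.plot(X, Y, c='red')
ax.set_ylim(0, 8)                # same as plt.ylim(0, 8)
ax.set_xlabel('x')               # same as plt.xlabel('x')
ax.set_ylabel('y')               # same as plt.ylabel('y')
ax.set_title('Y by category')    # same as plt.title(...); title text is made up
```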

plt.plot(X,Y,linestyle='--',marker='.')
# linestyle: controls the line style ('-': solid, '--': dashed, ':': dotted)
# marker: controls the marker symbol, e.g. '.' draws small points and 'o' draws filled circles
[<matplotlib.lines.Line2D at 0x2dde117e4c0>]

plt.plot(X,Y,linestyle='-',marker='o')
[<matplotlib.lines.Line2D at 0x2dde11deb20>]

plt.plot(X,Y,linestyle=':',marker='o')
[<matplotlib.lines.Line2D at 0x2dde26b4100>]

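
Color, marker and line style can also be packed into one short format string passed as the third positional argument. As a small check, the sketch below draws the same line both ways and compares the resulting properties.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

X = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
Y = [1, 4, 7, 3, 2, 5, 6]

fig, ax = plt.subplots()
# Format string: 'r' = red, 'o' = circle marker, '--' = dashed line
(line_a,) = ax.plot(X, Y, 'ro--')
# The same appearance spelled out with keyword arguments
(line_b,) = ax.plot(X, Y, color='red', marker='o', linestyle='--')

same_style = (line_a.get_linestyle() == line_b.get_linestyle()
              and line_a.get_marker() == line_b.get_marker()
              and to_rgba(line_a.get_color()) == to_rgba(line_b.get_color()))
print(same_style)
```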

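Adding auxiliary elements to a plot

plt.plot(X,Y,'o--')
plt.axvline(x=1) # vertical line: plt.axvline draws a vertical line at horizontal coordinate x
plt.axhline(y=4) # horizontal line: plt.axhline draws a horizontal line at vertical coordinate y
<matplotlib.lines.Line2D at 0x2dde216a340>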
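
When the x values are strings, matplotlib places the categories at integer positions 0, 1, 2, ... in order of first appearance, so `x=1` above falls on category 'B'. The sketch below looks the position up by label instead of hard-coding it, and draws the horizontal line at the mean of Y. Marking the peak and the mean is an illustrative choice.

```python
import matplotlib.pyplot as plt

X = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
Y = [1, 4, 7, 3, 2, 5, 6]

fig, ax = plt.subplots()
ax.plot(X, Y, 'o--')

peak_pos = X.index('C')          # position of category 'C' on the x-axis
mean_y = sum(Y) / len(Y)         # average of Y

vline = ax.axvline(x=peak_pos, color='gray')                  # vertical line at 'C'
hline = ax.axhline(y=mean_y, color='gray', linestyle=':')     # horizontal line at the mean
print(peak_pos, mean_y)
```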

Text function: text(x,y,s,...) adds the string s at position (x,y)

plt.plot(X,Y);plt.text(2,7,'peak point');
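
If the label should point at the data rather than sit on top of it, `annotate` can draw an arrow from the text to a target point. The sketch below finds the maximum of Y and annotates it. The text offset of one category to the right is an arbitrary assumption.

```python
import matplotlib.pyplot as plt

X = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
Y = [1, 4, 7, 3, 2, 5, 6]

fig, ax = plt.subplots()
ax.plot(X, Y)

i = Y.index(max(Y))              # position of the largest value
ann = ax.annotate('peak point',
                  xy=(i, Y[i]),              # point the arrow ends at
                  xytext=(i + 1, Y[i]),      # where the text is placed (assumed offset)
                  arrowprops={'arrowstyle': '->'})
print(i, Y[i])
```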

Legend: after drawing a plot, call legend to add a legend to it

plt.plot(X,Y,label='line');plt.legend();

plt.plot(X,Y,'.',label='point');plt.legend()
<matplotlib.legend.Legend at 0x2dde2a9ba90>

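
A legend is most useful when several labeled series share one plot. The sketch below combines the two examples above in a single Axes. The `loc='upper left'` placement is an assumed preference, and by default matplotlib chooses a position itself.

```python
import matplotlib.pyplot as plt

X = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
Y = [1, 4, 7, 3, 2, 5, 6]

fig, ax = plt.subplots()
ax.plot(X, Y, label='line')        # connected line
ax.plot(X, Y, '.', label='point')  # markers only
leg = ax.legend(loc='upper left')  # one entry per labeled artist

entries = [t.get_text() for t in leg.get_texts()]
print(entries)
```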

Line plot with error bars

s=[0.1,0.4,0.7,0.3,0.2,0.5,0.6] # error values
plt.plot(X,Y);plt.errorbar(X,Y,yerr=s,fmt='o',capsize=4)
<ErrorbarContainer object of 3 artists>
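
In practice the error values usually come from repeated measurements rather than being typed in. As a sketch under that assumption, the code below computes the mean and the standard error of the mean for two hypothetical groups of three replicates and passes them to `errorbar`.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical replicate measurements: one row per category
labels = ['A', 'B']
reps = np.array([[0.9, 1.1, 1.0],
                 [3.6, 4.4, 4.0]])

means = reps.mean(axis=1)
# Standard error of the mean: sample standard deviation / sqrt(n)
sem = reps.std(axis=1, ddof=1) / np.sqrt(reps.shape[1])

fig, ax = plt.subplots()
ax.errorbar(labels, means, yerr=sem, fmt='o', capsize=4)
print(means, sem)
```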

Bar chart with error bars

plt.bar(X,Y,yerr=s,capsize=4) # capsize=4 can also be passed as error_kw={'capsize': 4}
<BarContainer object of 7 artists>

4.2.1.4 Arranging and Drawing Multiple Plots
  • In matplotlib, one Figure object can contain several subplots (Axes); there are two ways to create them
  • subplot(numRows,numCols,plotNum)
  • fig,ax=plt.subplots(numRows,numCols,figsize=(width,height))
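
The two forms number the panels differently. `subplot` counts panels from 1, row by row, while `plt.subplots` returns an array of Axes indexed from 0 by row and column. The sketch below labels each panel of a 2x3 grid with its `plotNum` to show the correspondence. The grid and figure size are arbitrary choices.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(2, 3, figsize=(9, 4))
print(ax.shape)                  # a 2-D array of Axes, one per panel

# ax.flat walks the grid row by row, the same order subplot() counts in
for k, a in enumerate(ax.flat, start=1):
    a.set_title(f'plotNum={k}')

# plt.subplot(2, 3, 4) would address the same panel as ax[1, 0]
print(ax[1, 0].get_title())
```

With a single row or a single column, for example `plt.subplots(1, 2)`, the returned array is one-dimensional and is indexed as `ax[0]`, `ax[1]`.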

Two plots in one row

plt.subplot(121);plt.bar(X,Y);
plt.subplot(122);plt.plot(Y);

Two plots in one column

plt.subplot(211);plt.bar(X,Y);
plt.subplot(212);plt.plot(Y);

Two plots sized to fit the page

fig,ax=plt.subplots(1,2,figsize=(10,4))
ax[0].bar(X,Y);ax[1].plot(X,Y);

Four plots on one page

fig,ax=plt.subplots(2,2,figsize=(10,8))
ax[0,0].bar(X,Y);
ax[0,1].pie(Y,labels=X);
ax[1,0].plot(Y);
ax[1,1].plot(Y,'.-',linewidth=3);
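
A multi-panel figure is often exported for a report. The sketch below rebuilds the four-panel figure, tightens the spacing with `tight_layout`, and writes a PNG into an in-memory buffer. The overall title and the DPI value are assumptions made for the illustration.

```python
import io
import matplotlib.pyplot as plt

X = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
Y = [1, 4, 7, 3, 2, 5, 6]

fig, ax = plt.subplots(2, 2, figsize=(10, 8))
ax[0, 0].bar(X, Y)
ax[0, 1].pie(Y, labels=X)
ax[1, 0].plot(Y)
ax[1, 1].plot(Y, '.-', linewidth=3)

fig.suptitle('Four views of the same data')  # overall title (made-up text)
fig.tight_layout()                           # reduce overlap between panels

buf = io.BytesIO()
fig.savefig(buf, format='png', dpi=150)      # a file path works here as well
png_bytes = buf.getvalue()
print(len(png_bytes) > 0)
```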

posted @ 2022-10-17 21:08  LUNA2333