kaggle2 - Data Visualization

# Imports used by the snippets in these notes
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

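# Set index_col to the name of the first column ("Date", the value in cell A1 when the file is opened in Excel); parse_dates=True makes pandas read the row labels as dates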
fifa_data = pd.read_csv(fifa_filepath, index_col="Date", parse_dates=True)
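
To see what `index_col` and `parse_dates` do without the course file, here is a minimal sketch that reads a small in-memory CSV instead. The name `fifa_demo` and every value in `csv_text` are made up for illustration; only the `pd.read_csv` arguments come from the note above.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file; the values are invented for illustration.
csv_text = """Date,ARG,BRA
1993-08-08,5,8
1993-09-23,12,1
1993-10-22,9,1
"""

# Same arguments as above: use the "Date" column as the index and parse it as dates.
fifa_demo = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

# The index is now a DatetimeIndex rather than a column of plain strings.
print(type(fifa_demo.index).__name__)

# Rows can be looked up by timestamp.
print(fifa_demo.loc[pd.Timestamp("1993-09-23"), "BRA"])
```

With a date index, Seaborn line charts place the points on a proper time axis instead of treating the dates as arbitrary labels.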
# Plot the data with Seaborn

plt.figure(figsize=(14,6))

plt.title("Daily Global Streams of Popular Songs in 2017-2018")

sns.lineplot(data=spotify_data)
# List all column names
list(spotify_data.columns)
# Rotate the x-axis tick labels
plt.xticks(rotation=-45)
# Plot a single column

# Line chart showing daily global streams of 'Shape of You'
sns.lineplot(data=spotify_data['Shape of You'], label="Shape of You")

# Draw a bar chart

plt.figure(figsize=(10,6))

# Bar chart showing average arrival delay for Spirit Airlines flights by month
sns.barplot(x=flight_data.index, y=flight_data['NK'])
# Pass ci=None to remove the error bars (seaborn 0.12 and later use errorbar=None instead)
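
Error bars only appear when a category has several observations. The sketch below uses a made-up table, `flights_long`, with three delay values per month (hypothetical numbers) and switches the error bars off. Because the argument name depends on the installed Seaborn version, the code checks the function signature first; that check is a convenience for this sketch, not something the course uses.

```python
import inspect

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: three delay observations per month for one airline.
flights_long = pd.DataFrame(
    {
        "Month": [1, 1, 1, 2, 2, 2, 3, 3, 3],
        "NK": [11.0, 9.5, 12.5, 7.0, 8.0, 6.0, 3.0, 4.5, 2.5],
    }
)

# seaborn 0.12+ spells the option errorbar=None; older releases use ci=None.
if "errorbar" in inspect.signature(sns.barplot).parameters:
    no_error_bars = {"errorbar": None}
else:
    no_error_bars = {"ci": None}

plt.figure(figsize=(10, 6))
# Each bar shows the mean delay for that month; no error bars are drawn.
ax = sns.barplot(x=flights_long["Month"], y=flights_long["NK"], **no_error_bars)
plt.ylabel("Arrival delay (in minutes)")
```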


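# Heatmap

sns.heatmap - This tells the notebook that we want to create a heatmap.

data=data_airplot - This tells the notebook to use all of the entries in the flight data to create the heatmap.

annot=True - This ensures that the value of each cell appears on the chart. (Leaving this out removes the numbers from each of the cells!)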
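
A minimal sketch of the three pieces above in one call. The table `delay_demo` and its numbers are invented for illustration; the airline codes are just column labels.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical table of average delays: rows are months, columns are airlines.
delay_demo = pd.DataFrame(
    {"AA": [6.9, 7.1, -1.0], "NK": [11.4, 5.3, 3.2], "UA": [6.4, 1.0, 0.1]},
    index=pd.Index([1, 2, 3], name="Month"),
)

plt.figure(figsize=(8, 4))
# annot=True writes each cell's value on top of its colored square.
ax = sns.heatmap(data=delay_demo, annot=True)
plt.xlabel("Airline")
```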

# Return the index of the largest value in a list

np.argmax(alist)
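
A short sketch of how the returned index is typically used: to look up the label that belongs to the largest value. The lists `months` and `delays` are hypothetical. When the maximum occurs more than once, `np.argmax` returns the position of the first occurrence.

```python
import numpy as np

# Hypothetical labels and values; the maximum (12.4) appears twice.
months = ["Jan", "Feb", "Mar", "Apr"]
delays = [6.9, 12.4, 3.1, 12.4]

best = np.argmax(delays)  # position of the first maximum
print(best, months[best], delays[best])
```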

# Draw a scatter plot

sns.scatterplot(x = candy_data['sugarpercent'] ,y=candy_data['winpercent'])

To color-code the points by a category, add the hue argument:

hue=candy_data['chocolate']
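
Here is the scatter plot with `hue` added to the same call, as a sketch. `candy_demo` is a hypothetical six-row stand-in for `candy_data`; the numbers are invented.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for candy_data; the numbers are invented for illustration.
candy_demo = pd.DataFrame(
    {
        "sugarpercent": [0.10, 0.25, 0.40, 0.55, 0.70, 0.90],
        "winpercent": [35.0, 42.0, 66.0, 48.0, 73.0, 60.0],
        "chocolate": ["No", "No", "Yes", "No", "Yes", "Yes"],
    }
)

plt.figure(figsize=(8, 5))
# hue assigns one color per category and adds a legend for it.
ax = sns.scatterplot(
    x=candy_demo["sugarpercent"],
    y=candy_demo["winpercent"],
    hue=candy_demo["chocolate"],
)
```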

# Scatter plot with a regression line

sns.regplot(x=candy_data['sugarpercent'], y=candy_data['winpercent'])

# Scatter plot with two regression lines (one per group)

sns.lmplot(x="bmi", y="charges", hue="smoker", data=insurance_data)
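
The difference between the two calls is easiest to see side by side. In this sketch `insurance_demo` is synthetic data built so that the two groups follow different lines. `sns.regplot` draws on the current axes and fits one line to all points, while `sns.lmplot` creates its own figure (it returns a `FacetGrid`) and fits one line per `hue` group.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for insurance_data: smokers get a higher baseline charge.
rng = np.random.default_rng(0)
n = 40
bmi = rng.uniform(18, 40, size=n)
smoker = np.where(np.arange(n) % 2 == 0, "yes", "no")
charges = 200 * bmi + np.where(smoker == "yes", 15_000, 2_000) + rng.normal(0, 500, size=n)
insurance_demo = pd.DataFrame({"bmi": bmi, "smoker": smoker, "charges": charges})

# One regression line fitted to all points.
plt.figure(figsize=(8, 5))
ax_single = sns.regplot(x=insurance_demo["bmi"], y=insurance_demo["charges"])

# One regression line per group; lmplot takes column names plus data=.
grid = sns.lmplot(x="bmi", y="charges", hue="smoker", data=insurance_demo)
ax_grouped = grid.axes.flat[0]
```

Note the calling convention: `lmplot` expects column names together with `data=`, whereas `regplot` also accepts the columns themselves.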

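# Draw a categorical scatter plot (the points cluster like small flowers; it works best when the x-axis has two categories, such as Yes or No)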

sns.swarmplot(x=candy_data['chocolate'],y=candy_data['winpercent'])

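# Draw a histogram

sns.distplot(a=iris_data['Petal Length (cm)'], kde=False)

kde=False: controls whether a kernel density estimate curve is drawn over the histogram (False leaves it out). Note that sns.distplot has been deprecated since seaborn 0.11; sns.histplot is its replacement for histograms.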

# Kernel density estimate (KDE) plot (think of it as a smoothed histogram)

# KDE plot

sns.kdeplot(data=iris_data['Petal Length (cm)'], shade=True)

# Two-dimensional KDE plot

# 2D KDE plot

sns.jointplot(x=iris_data['Petal Length (cm)'], y=iris_data['Sepal Width (cm)'], kind="kde")

# Chart categories

Since it's not always easy to decide how to best tell the story behind your data, we've broken the chart types into three broad categories to help with this.

Trends - A trend is defined as a pattern of change.
- sns.lineplot - Line charts are best to show trends over a period of time, and multiple lines can be used to show trends in more than one group.
Relationship - There are many different chart types that you can use to understand relationships between variables in your data.
- sns.barplot - Bar charts are useful for comparing quantities corresponding to different groups.
- sns.heatmap - Heatmaps can be used to find color-coded patterns in tables of numbers.
- sns.scatterplot - Scatter plots show the relationship between two continuous variables; if color-coded, we can also show the relationship with a third categorical variable.
- sns.regplot - Including a regression line in the scatter plot makes it easier to see any linear relationship between two variables.
- sns.lmplot - This command is useful for drawing multiple regression lines, if the scatter plot contains multiple, color-coded groups.
- sns.swarmplot - Categorical scatter plots show the relationship between a continuous variable and a categorical variable.
Distribution - We visualize distributions to show the possible values that we can expect to see in a variable, along with how likely they are.
- sns.distplot - Histograms show the distribution of a single numerical variable.
- sns.kdeplot - KDE plots (or 2D KDE plots) show an estimated, smooth distribution of a single numerical variable (or two numerical variables).
- sns.jointplot - This command is useful for simultaneously displaying a 2D KDE plot with the corresponding KDE plots for each individual variable.

# Seaborn themes

sns.set_style("dark")

Seaborn has five themes: (1) "darkgrid", (2) "whitegrid", (3) "dark", (4) "white", and (5) "ticks". To switch, use a command like the one in the code cell above, filling in the name of the theme you want.
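
One visible difference between the five themes is whether a background grid is drawn. As a small sketch, the loop below applies each theme in turn and records the matplotlib `axes.grid` setting that results. The dictionary name `grid_on` is just a label chosen for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import seaborn as sns

grid_on = {}
for style in ["darkgrid", "whitegrid", "dark", "white", "ticks"]:
    sns.set_style(style)  # updates matplotlib's global settings
    grid_on[style] = bool(plt.rcParams["axes.grid"])

# Only the two "...grid" themes switch the grid on.
print(grid_on)
```

Because `sns.set_style` changes global settings, call it before creating the figure it should apply to.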

posted @ 2019-06-24 21:42 childhood_2
