
Python Tutorial: Basic graphing and plotting functions

2017-12-12 14:13  nuswgg
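import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# The examples below assume `student` is a pandas DataFrame that has already
# been loaded and has "Height", "Weight", and "Sex" columns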

  • Visualize a single continuous variable by producing a histogram.

# Notice the labeling of the axes
plt.hist(student["Weight"], bins=[40,60,80,100,120,140,160])
plt.xlabel('Weight')
plt.ylabel('Frequency')
plt.show()
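
To see what the `bins` list does, the counts behind a histogram can be computed directly with `np.histogram`. This is only a minimal sketch: the `weights` values are made up for illustration and are not taken from the `student` data set.

```python
import numpy as np

# Hypothetical weights, for illustration only (not from the student data)
weights = np.array([50.5, 77.0, 83.0, 84.0, 98.0, 99.5, 102.5, 112.0, 112.5, 150.0])

# Same bin edges as in the plt.hist() call above
edges = [40, 60, 80, 100, 120, 140, 160]

# counts[i] is the number of values in [edges[i], edges[i+1]);
# the last bin also includes its right edge
counts, _ = np.histogram(weights, bins=edges)
print(counts)
```

Each entry of `counts` corresponds to the height of one bar in the plotted histogram.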

  • Visualize a single continuous variable by producing a boxplot.

# showmeans=True tells Matplotlib to mark the mean of the variable on the boxplot
plt.boxplot(student["Weight"], showmeans=True)
# Remove the x-axis tick, which would otherwise be labeled "1"
plt.xticks([])
plt.ylabel('Weight')
plt.show()

  • Visualize two continuous variables by producing a scatterplot.

# Notice here you specify the x variable, followed by the y variable
plt.scatter(student["Height"], student["Weight"])
plt.xlabel("Height")
plt.ylabel("Weight")
plt.show()

  • Visualize a relationship between two continuous variables by producing a scatterplot and a plotted line of best fit.

x = student["Height"]
y = student["Weight"]
# np.polyfit() with degree 1 fits Weight as a linear function of Height and
# returns the slope (m) and the intercept (b)
m, b = np.polyfit(x, y, 1)
plt.scatter(x, y)
# plt.text() prints the equation of the line of best fit; the first two
# arguments are the x and y coordinates of the text. Each "%f" is replaced by a
# floating-point number taken from the tuple that follows the "%" operator
plt.text(51, 140, "Line: y = %f x + %f"% (m,b))
plt.plot(x, m*x + b)
plt.xlabel("Height")
plt.ylabel("Weight")
plt.show()
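
The order of the values returned by `np.polyfit` is easy to get wrong, so here is a small sketch that checks it on points lying exactly on a known line. The data are hypothetical and chosen only so that the expected slope and intercept are obvious.

```python
import numpy as np

# Hypothetical points lying exactly on y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Degree-1 fit: coefficients come back highest power first,
# so the slope is first and the intercept second
m, b = np.polyfit(x, y, 1)

# np.polyval evaluates the fitted polynomial; equivalent to m * x + b
fitted = np.polyval([m, b], x)
print("Line: y = %.2f x + %.2f" % (m, b))
```

Using `%.2f` instead of `%f` limits the printed coefficients to two decimal places, which may read better on a plot.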

  • Visualize a categorical variable by producing a bar chart.

# Get the counts of Sex 
counts = pd.crosstab(index=student["Sex"], columns="count")
# len() returns the number of categories of Sex (2)
# np.arange(n) returns the integers 0 to n-1, used here as the bar positions
num = np.arange(len(counts))
# alpha = 0.5 changes the transparency of the bars
plt.bar(num, counts["count"], align='center', alpha=0.5)
# Set the xticks to be the indices of counts
plt.xticks(num, counts.index)
plt.xlabel("Sex")
plt.ylabel("Frequency")
plt.show()
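
As an alternative to `pd.crosstab()`, the category counts can also be obtained with `value_counts()`. The sketch below uses a hypothetical stand-in DataFrame named `student_demo`, since the real `student` data are not shown here.

```python
import pandas as pd

# Hypothetical stand-in for the student data set
student_demo = pd.DataFrame({"Sex": ["F", "M", "M", "F", "M"]})

# value_counts() counts each category; sort_index() orders the categories
# alphabetically so the bars appear in a predictable order
counts = student_demo["Sex"].value_counts().sort_index()
print(counts)
```

The resulting Series could be passed to `plt.bar()` in the same way as `counts["count"]` above, with `counts.index` supplying the tick labels.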

  • Visualize a continuous variable, grouped by a categorical variable, by producing side-by-side boxplots.
    • Simple side-by-side boxplot without color.

# Subset data set to return only female weights, and then only male weights 
Weight_F = np.array(student.query('Sex == "F"')["Weight"])
Weight_M = np.array(student.query('Sex == "M"')["Weight"])
Weights = [Weight_F, Weight_M]
# plt.boxplot() draws one box per array when given a list of arrays, so the
# two groups appear side by side
# Note: Matplotlib 3.9 and later name the labels parameter tick_labels
plt.boxplot(Weights, showmeans=True, labels=('F', 'M'))
plt.xlabel('Sex')
plt.ylabel('Weight')
plt.show()
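
The subsetting step above writes one `query()` call per category. With more categories, a `groupby()` could build the same list of arrays in one pass. This is a sketch under assumptions: `student_demo` and its values are hypothetical and stand in for the `student` data.

```python
import pandas as pd

# Hypothetical stand-in for the student data set
student_demo = pd.DataFrame({
    "Sex": ["F", "M", "M", "F", "M"],
    "Weight": [84.0, 112.5, 99.5, 102.5, 83.0],
})

# One array of weights per category, keyed by the category label;
# groupby() sorts the group keys by default
groups = {sex: grp["Weight"].to_numpy() for sex, grp in student_demo.groupby("Sex")}

labels = list(groups.keys())      # category labels, in sorted order
weights = list(groups.values())   # list of 1-D arrays, one per label
print(labels, [len(w) for w in weights])
```

`weights` and `labels` could then be passed to `plt.boxplot()` in place of the hand-built `Weights` list and the `('F', 'M')` tuple.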

    • More advanced side-by-side boxplot with color.

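import seaborn as sns
# hue="Sex" colors each box by category
sns.boxplot(x="Sex", y="Weight", hue="Sex", data=student, showmeans=True)
# seaborn draws onto the current Matplotlib figure, so display it with
# plt.show(); the sns.plt alias is not available in recent seaborn releases
plt.show()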