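Python Study Notes (5)
5. Introduction to Common Libraries
Learning NumPy
Q: What is NumPy?
A: A powerful N-dimensional array object (ndarray), a mature library of (broadcasting) functions, and a set of tools for integrating C/C++ and Fortran code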
——————————————————————————————————————————
ndarray
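- An N-dimensional array object: a collection of elements that all share one type (a basic type or a structured type), indexed starting from 0
How to create an ndarray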
np.array([[1,2,3],[4,5,6]], dtype='f8')
# structured (compound) type
stu = np.dtype([('name','U20'), ('age', 'i1'), ('marks', 'f4')])
x = np.array([('Bob',21,50),('Amy',18,75)], dtype=stu) # 1-D array of 2 stu records, shape (2,)
x['name']
np.empty([6,7], dtype='u4') # 6x7 array, not initialized
np.zeros([3,5], dtype='f4') # 3x5 array, initialized to 0
np.ones([3,4], dtype='f4') # 3x4 array, initialized to 1
np.asarray([1,2,3,4,5], dtype='u8') # similar to np.array, fewer parameters; no copy if the input is already a matching ndarray
np.fromiter(range(100), dtype='U3') # 1-D array of 100 elements, values from range, as str
np.arange(1, 50, 2, dtype='i2') # 1-D array of 25 elements: 1, 3, ..., 49
np.full((3,5), 7) # 3x5 array, filled with 7
np.eye(5) # 5x5 array, 1 on the diagonal, 0 elsewhere
np.random.random((3,4)) # 3x4 array, filled with random numbers in [0.0, 1.0)
np.linspace(4, 6, 8) # 1-D array of 8 evenly spaced numbers from 4 to 6 inclusive
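A quick way to check what each constructor returns is to look at shape and dtype. The sketch below reuses a few of the calls above; the variable names are only for illustration.

```python
import numpy as np

# Structured dtype from the notes: each element is one (name, age, marks) record.
stu = np.dtype([('name', 'U20'), ('age', 'i1'), ('marks', 'f4')])
x = np.array([('Bob', 21, 50), ('Amy', 18, 75)], dtype=stu)
print(x.shape)            # (2,)  -> one dimension, two records
print(x['name'])          # field access returns an array of the names

z = np.zeros([3, 5], dtype='f4')
r = np.arange(1, 50, 2, dtype='i2')
s = np.linspace(4, 6, 8)

print(z.shape, z.dtype)   # (3, 5) float32
print(r.shape, r[-1])     # (25,) 49
print(s[0], s[-1])        # 4.0 6.0  -> both endpoints included
```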
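About array slicing
a[dim1, dim2, dim3, ...]
Each dimension can be indexed in one of two ways:
1. [start : stop] (a range; start is included, stop is excluded)
2. [idx1, idx2, idx3] (a list of indices)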
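The two index forms can be tried on a small array. This is a minimal sketch; the array a and its 3x4 shape are chosen just for the example.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # rows 0..2, columns 0..3

# Range form on both dimensions: rows 0-1, columns 1-2 (stop excluded).
block = a[0:2, 1:3]
print(block)              # [[1 2]
                          #  [5 6]]

# Index-list form on the first dimension: rows 0 and 2, all columns.
rows = a[[0, 2], :]
print(rows)

# Mixed: a range for the rows, an index list for the columns.
mixed = a[0:2, [0, 3]]
print(mixed)              # [[0 3]
                          #  [4 7]]
```

One thing to watch: when index lists are used on two dimensions at once, NumPy pairs them element by element, so a[[0, 2], [1, 3]] picks a[0, 1] and a[2, 3] rather than a 2x2 block.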
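About array broadcasting
Two arrays with different shapes can still be combined in an operation, provided the following conditions hold:
- Compare the shapes from right to left, starting at the last dimension. In every dimension the two lengths must be equal, or one of them must be 1. If one array has fewer dimensions, its missing leading dimensions are treated as length 1.
- The shape of the result is the maximum length along each dimension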
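The rules can be checked by looking at result shapes. The following is a small sketch with made-up arrays (col, row, cube, vec are names used only here).

```python
import numpy as np

col = np.arange(3).reshape(3, 1)   # shape (3, 1)
row = np.arange(4).reshape(1, 4)   # shape (1, 4)

# Lengths are compared right to left: 1 vs 4, then 3 vs 1.
# Each pair contains a 1, so the result takes the larger length.
print((col + row).shape)           # (3, 4)

# A missing leading dimension is treated as length 1.
cube = np.ones((2, 3, 4))
vec = np.ones(4)                   # behaves like shape (1, 1, 4)
print((cube + vec).shape)          # (2, 3, 4)

# Incompatible: trailing lengths 4 and 3 differ and neither is 1.
try:
    np.ones((3, 4)) + np.ones(3)
    compatible = True
except ValueError:
    compatible = False
print(compatible)                  # False
```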
Exercises:
1. Given two arrays:
a=np.arange(1,25).reshape(2,1,3,4)
b=np.arange(1,25).reshape(4,6)
How can we add them together (i.e. a+b)? Please show how the numbers are paired.
# reshape b to (2,3,4) so that it broadcasts against a's shape (2,1,3,4)
b.shape=2,3,4
print(a+b)
# how numbers are paired:
for x,y in np.nditer([a,b]): print(x,y)
2. Define a structure named toy with the following fields:
name: 10-character string
price: float64
toy=np.dtype([('name','U10'),('price',"f8")])
a=np.array([('doll',12.34),('lego',56.78),('car',90.12)], dtype=toy)
a=np.append(a, np.array([('chess',56.78),('ball',56.78)], dtype=toy))
a.sort(order=['price', 'name'])
Draw price using line, scatter, bar, pie, and histogram charts:
import numpy as np
from matplotlib import pyplot as plt
x=np.arange(1, len(a)+1)
y=a['price']
fig, axes = plt.subplots(5,1)
fig.set_size_inches(10, 35)
line, scatter, bar, pie, histogram = axes
line.plot(x, y, label='price')
line.legend()
scatter.scatter(x, y, s=200, c='r', marker='*', label='price')
scatter.legend()
bar.bar(x, y, label='price')
bar.legend()
pie.pie(y, labels=a['name'], autopct='%1.1f%%')
pie.axis('equal')
pie.legend()
histogram.hist(y, bins=3, label=['price'])
histogram.legend()
3. Load a CSV file into a pandas DataFrame
import pandas as pd
url = 'http://samplecsvs.s3.amazonaws.com/Sacramentorealestatetransactions.csv'
df = pd.read_csv(url)
# alternatively, use the city column as the index
df = pd.read_csv(url, index_col='city')
4. List the cheapest building in each city.
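One possible approach is to group by city and pick the row with the lowest price in each group. The sketch below uses a few made-up rows instead of the real CSV, and it assumes the price column is named price and that city is a regular column (as with the first read_csv call, without index_col); both are assumptions to check against the actual file.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the real CSV.
df = pd.DataFrame({
    'street': ['1 A St', '2 B St', '3 C St', '4 D St'],
    'city':   ['SACRAMENTO', 'SACRAMENTO', 'ELK GROVE', 'ELK GROVE'],
    'price':  [68212, 59222, 120000, 95000],
})

# idxmin returns, for each city, the row label of the lowest price;
# loc then pulls those full rows out of the original frame.
cheapest = df.loc[df.groupby('city')['price'].idxmin()]
print(cheapest)
```

If the frame was loaded with index_col='city', calling df.reset_index() first gives back a regular city column so the same code applies.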