Numpy Study Notes

%matplotlib inline
import matplotlib.pyplot as plt

import numpy as np

1. Introduction

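Almost all scientific computing in Python uses \(Numpy\).
It is a package that provides data structures for vectors, matrices, and higher-dimensional data.
The \(array\) is the object used to represent vectors, matrices, and higher-dimensional data sets.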

2. Creating numpy arrays

An \(array\) can be created in several ways, for example from:

\(list\), \(tuple\)
functions dedicated to generating numpy arrays, \(arange\), \(linspace\) etc.
reading data from files(\(csv\) etc.)

2.1 From lists

We use the \(numpy.array\) function to convert a list (or nested lists) into an array.

v = np.array([1,2,3,4])
m = np.array([[1,2],[3,4]])

v, type(v)

(array([1, 2, 3, 4]), numpy.ndarray)

m, type(m)

(array([[1, 2],
        [3, 4]]),
 numpy.ndarray)

v.shape, m.shape

((4,), (2, 2))

np.shape(v), np.shape(m)

((4,), (2, 2))

v.size, m.size

(4, 4)

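ndarray is a class in the numpy module.
shape and size are two attributes of every ndarray instance.
np.shape(a) and np.size(a) are functions in the numpy module that return the shape and size attributes of an ndarray object.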

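The \(dtype\) attribute of an ndarray instance shows the type of the data stored in it.
A \(dtype\) argument can also be passed to the array function to convert the values of the original list or tuple to that type.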

M = np.array([[1,2],[3,4]], dtype = complex)
M

array([[1.+0.j, 2.+0.j],
       [3.+0.j, 4.+0.j]])

Common values for \(dtype\) include \(int, float, complex, bool, object\).
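As a small sketch of how \(dtype\) behaves in practice (the variable names below are illustrative and not part of the original notes): the type is inferred from the input unless it is given explicitly, and \(astype\) returns a converted copy.

```python
import numpy as np

# dtype is inferred from the input when it is not given explicitly
a = np.array([1, 2, 3])        # integer dtype (width depends on the platform)
b = np.array([1.0, 2, 3])      # a single float promotes the whole array to float64

# dtype can be forced when the array is created
c = np.array([1, 2, 3], dtype=float)

# astype returns a converted copy; float -> int truncates toward zero
d = np.array([1.7, -2.7, 3.2]).astype(int)

print(a.dtype, b.dtype, c.dtype)
print(d)   # [ 1 -2  3]
```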

2.2 Using array-generating functions

We can use functions to generate arrays of different forms

\(arange\)

Create a range

x = arange(start, stop, step)

x = np.arange(0, 10, 1)
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

x = np.arange(-1, 1, 0.1)
x

array([-1.00000000e+00, -9.00000000e-01, -8.00000000e-01, -7.00000000e-01,
       -6.00000000e-01, -5.00000000e-01, -4.00000000e-01, -3.00000000e-01,
       -2.00000000e-01, -1.00000000e-01, -2.22044605e-16,  1.00000000e-01,
        2.00000000e-01,  3.00000000e-01,  4.00000000e-01,  5.00000000e-01,
        6.00000000e-01,  7.00000000e-01,  8.00000000e-01,  9.00000000e-01])
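The entry -2.22044605e-16 in the output above, where 0 would be expected, comes from floating-point rounding with the step 0.1. When the number of points is known, \(linspace\) is a common alternative. A minimal sketch comparing the two (the variable \(y\) is introduced here for illustration):

```python
import numpy as np

# arange with a float step: the entry that should be 0 is only approximately 0
x = np.arange(-1, 1, 0.1)
print(x[10])

# linspace takes the number of points instead of the step;
# endpoint=False excludes the stop value, as arange does
y = np.linspace(-1, 1, 20, endpoint=False)

print(x.shape, y.shape)      # (20,) (20,)
print(np.allclose(x, y))     # True: equal up to rounding error
```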

\(linspace \& logspace\)

\(linspace\) returns a given number of evenly spaced points over a range (both endpoints are included by default)

x = np.linspace(start, stop, N)

\(logspace\) returns a given number of points whose exponents are evenly spaced over a range, using base as the base of the power

x = np.logspace(start, stop, N, base = a)

np.linspace(0, 10, 25)

array([ 0.        ,  0.41666667,  0.83333333,  1.25      ,  1.66666667,
        2.08333333,  2.5       ,  2.91666667,  3.33333333,  3.75      ,
        4.16666667,  4.58333333,  5.        ,  5.41666667,  5.83333333,
        6.25      ,  6.66666667,  7.08333333,  7.5       ,  7.91666667,
        8.33333333,  8.75      ,  9.16666667,  9.58333333, 10.        ])

np.logspace(0, 10, 11, base = 10)

array([1.e+00, 1.e+01, 1.e+02, 1.e+03, 1.e+04, 1.e+05, 1.e+06, 1.e+07,
       1.e+08, 1.e+09, 1.e+10])

\(mgrid\)

x, y = np.mgrid[0:5, 0:6]
x, y

(array([[0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1],
        [2, 2, 2, 2, 2, 2],
        [3, 3, 3, 3, 3, 3],
        [4, 4, 4, 4, 4, 4]]),
 array([[0, 1, 2, 3, 4, 5],
        [0, 1, 2, 3, 4, 5],
        [0, 1, 2, 3, 4, 5],
        [0, 1, 2, 3, 4, 5],
        [0, 1, 2, 3, 4, 5]]))

\(random \quad data\)

\(rand\) draws samples from a uniform distribution over \([0,1)\)

np.random.rand(5,5)

array([[0.66442015, 0.90629324, 0.5467012 , 0.00513714, 0.49277189],
       [0.17140787, 0.92461858, 0.67124238, 0.2545325 , 0.74407112],
       [0.39156052, 0.40204662, 0.47811055, 0.81849795, 0.28357586],
       [0.38806353, 0.5585088 , 0.36854102, 0.2059275 , 0.5687478 ],
       [0.17182169, 0.42607232, 0.31883796, 0.05032641, 0.11960338]])

\(randn\) draws samples from the standard normal distribution

np.random.randn(5,5)

array([[ 0.01381457, -1.19134824, -0.96232556, -0.81001151, -0.20093418],
       [ 0.73000021, -1.27875203,  0.71340573,  1.42708013,  1.91083278],
       [-0.26139491, -1.01805302, -2.37713149, -2.74976479, -0.72777884],
       [-0.36674878, -0.67727608,  1.31941451,  0.18738834,  0.29141163],
       [ 0.06743741, -0.67125098, -1.02038057, -0.68247009,  0.97729812]])

\(diag\)

\(diag(list)\) builds a diagonal matrix from the elements of \(list\); the optional \(k\) argument shifts them to the k-th off-diagonal

np.diag([1,2,3])

array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])

np.diag([1,2,3], k=1)

array([[0, 1, 0, 0],
       [0, 0, 2, 0],
       [0, 0, 0, 3],
       [0, 0, 0, 0]])

\(zeros\&ones\)

np.zeros((3,3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

np.ones((3,3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

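For a 2-D array the shape must be passed as a single tuple \((rows, cols)\), not as separate arguments; a single integer is accepted for a 1-D array.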

3. File I/O

3.1 Comma-separated values (CSV)

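Comma-separated value files are read and written as plain text.

\(CSV\) and \(TSV\) files: \(Comma-separated\) and \(Tab-separated\) values.
Use the \(np.genfromtxt\) function to read them.

Use \(savetxt\) to write an array as plain text.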

M = np.random.rand(3,3)
M

array([[0.28011205, 0.82406684, 0.539676  ],
       [0.48132395, 0.4149462 , 0.04837249],
       [0.91403304, 0.00245526, 0.41934376]])

np.savetxt('random.csv', M, fmt='%.5f', delimiter=',')

!random.csv
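The notes mention \(np.genfromtxt\) for reading but only show the writing step. A minimal round-trip sketch follows; the temporary directory and the name \(M\_loaded\) are choices made for this example. Note that \(savetxt\) separates columns with a space unless a delimiter is given.

```python
import os
import tempfile
import numpy as np

M = np.random.rand(3, 3)

# write to a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "random.csv")
    # delimiter="," writes real comma-separated values
    np.savetxt(path, M, fmt="%.5f", delimiter=",")
    # the same delimiter must be given when reading the file back
    M_loaded = np.genfromtxt(path, delimiter=",")

print(M_loaded.shape)                        # (3, 3)
# the values agree only up to the 5 decimals kept by fmt
print(np.allclose(M, M_loaded, atol=1e-5))   # True
```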

3.2 Numpy's native file format

Use \(np.save\) and \(np.load\) to save and load \(.npy\) files.

np.save("random.npy", M)

N = np.load("random.npy")
N

array([[0.28011205, 0.82406684, 0.539676  ],
       [0.48132395, 0.4149462 , 0.04837249],
       [0.91403304, 0.00245526, 0.41934376]])

4. More properties of the numpy arrays

bytes per element

M.itemsize

number of bytes

M.nbytes

number of dimensions

M.ndim
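The three properties above are related: nbytes equals itemsize times size. A short sketch, assuming a 3x3 array of the default float64 type:

```python
import numpy as np

M = np.random.rand(3, 3)   # float64 by default

print(M.itemsize)   # bytes per element: 8 for float64
print(M.size)       # number of elements: 9
print(M.nbytes)     # total bytes: itemsize * size = 72
print(M.ndim)       # number of dimensions: 2
```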

5. Manipulating arrays

5.1 Indexing

M[0]

array([0.28011205, 0.82406684, 0.539676  ])

M[0,:]

array([0.28011205, 0.82406684, 0.539676  ])

M[1,1]

0.4149462008918443

M[:,1]

array([0.82406684, 0.4149462 , 0.00245526])

M[0,0]

0.2801120490316811

M[1,:] = 0
M

array([[0.28011205, 0.82406684, 0.539676  ],
       [0.        , 0.        , 0.        ],
       [0.91403304, 0.00245526, 0.41934376]])

M[:,2] = -1
M

array([[ 0.28011205,  0.82406684, -1.        ],
       [ 0.        ,  0.        , -1.        ],
       [ 0.91403304,  0.00245526, -1.        ]])

5.2 Index slicing

Index slicing extracts part of an array with the syntax \(M[lower:upper:step]\); the element at upper is not included

A = np.arange(1,6)
A

array([1, 2, 3, 4, 5])

A[1:3]

array([2, 3])

A[0:5:2]

array([1, 3, 5])

A[1:3] = [-2,-3]
A

array([ 1, -2, -3,  4,  5])

A[-1]

A[-3:]

array([-3,  4,  5])

A = np.array([n+m*10 for n in range(5) for m in range(4)])
A

array([ 0, 10, 20, 30,  1, 11, 21, 31,  2, 12, 22, 32,  3, 13, 23, 33,  4,
       14, 24, 34])

A = np.array([[n+m*10 for n in range(5)] for m in range(4)])
A

array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34]])

A[1:4, 1:4]

array([[11, 12, 13],
       [21, 22, 23],
       [31, 32, 33]])

5.3 Fancy Indexing

Fancy indexing means using a \(list\) or an \(array\) (of integers or booleans) as the index

row_indices = [1,2,3]
col_indices = [1,2,-1]

A[row_indices, col_indices]

array([11, 22, 34])

A[row_indices]

array([[10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34]])

B = np.array([n for n in range(5)])
B

array([0, 1, 2, 3, 4])

row_mask = np.array([True, False, True, False, False])

B[row_mask]

array([0, 2])

row_mask = np.array([1,0,1,0,0], dtype = bool)
B[row_mask]

array([0, 2])

x = np.arange(0, 10, 0.5)
x

array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. ,
       6.5, 7. , 7.5, 8. , 8.5, 9. , 9.5])

mask = (5 < x) & (x < 7.5)
mask

array([False, False, False, False, False, False, False, False, False,
       False, False,  True,  True,  True,  True, False, False, False,
       False, False])

x[mask]

array([5.5, 6. , 6.5, 7. ])

6. Functions for extracting data from arrays and creating arrays

\(where\)

\(where\) returns the indices of the True values in a boolean array

indices = np.where(mask)
indices

(array([11, 12, 13, 14], dtype=int64),)

x[indices]

array([5.5, 6. , 6.5, 7. ])

\(diag\)

Extract the main diagonal of an array (or, with a second argument, an offset diagonal)

array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34]])

np.diag(A)

array([ 0, 11, 22, 33])

np.diag(A, -1)

array([10, 21, 32])

\(take\)

array([0, 1, 2, 3, 4])

B.take([0,2,4])

array([0, 2, 4])

np.take([0,1,2,3,4],[0,2,4])

array([0, 2, 4])

\(choose\)

which = [1, 0, 1, 0]
choices = [[-2,-2,-2,-2],[5,5,5,5]]

np.choose(which, choices)

array([ 5, -2,  5, -2])

which = [1,2,0,1]
choices = [[-2,-2,-2,-2], [5,5,5,5], [6,6,6,6]]

np.choose(which, choices)

array([ 5,  6, -2,  5])

\(which\) decides, for each position, which of the lists in \(choices\) the element is taken from
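One way to read \(choose\) is as a fancy-indexing lookup: position j of the result is taken from row which[j], column j of the choices. The sketch below checks that reading on the second example; the names \(picked\) and \(same\) are introduced here for illustration.

```python
import numpy as np

which = np.array([1, 2, 0, 1])
choices = np.array([[-2, -2, -2, -2],
                    [5, 5, 5, 5],
                    [6, 6, 6, 6]])

picked = np.choose(which, choices)

# the same selection with fancy indexing:
# for column j, take the element in row which[j]
same = choices[which, np.arange(len(which))]

print(picked)   # [ 5  6 -2  5]
print(same)     # [ 5  6 -2  5]
```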

7. Linear Algebra

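Linear algebra in \(Numpy\) can be done with either of two classes. The key to linear algebra computation is to express the operations in terms of vectors and matrices.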

\(array\)
\(matrix\)

7.1 Scalar-array operation

v1 = np.arange(0,5)

v1*2

array([0, 2, 4, 6, 8])

v1+2

array([2, 3, 4, 5, 6])

A*2, A+2

(array([[ 0,  2,  4,  6,  8],
        [20, 22, 24, 26, 28],
        [40, 42, 44, 46, 48],
        [60, 62, 64, 66, 68]]),
 array([[ 2,  3,  4,  5,  6],
        [12, 13, 14, 15, 16],
        [22, 23, 24, 25, 26],
        [32, 33, 34, 35, 36]]))

Broadcasting is used here: the scalar is applied to every element of the array.

7.2 Element-wise array-array operations

Arithmetic operators applied between arrays act on the elements at corresponding positions.
This is called element-wise operation.

A * A

array([[   0,    1,    4,    9,   16],
       [ 100,  121,  144,  169,  196],
       [ 400,  441,  484,  529,  576],
       [ 900,  961, 1024, 1089, 1156]])

v1*v1

array([ 0,  1,  4,  9, 16])

A.shape, v1.shape

((4, 5), (5,))

A*v1

array([[  0,   1,   4,   9,  16],
       [  0,  11,  24,  39,  56],
       [  0,  21,  44,  69,  96],
       [  0,  31,  64,  99, 136]])

A, v1

(array([[ 0,  1,  2,  3,  4],
        [10, 11, 12, 13, 14],
        [20, 21, 22, 23, 24],
        [30, 31, 32, 33, 34]]),
 array([0, 1, 2, 3, 4]))

This is also broadcasting: \(v1\), with shape (5,), is applied to each of the 4 rows of \(A\).
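Broadcasting compares shapes from the right: two dimensions are compatible when they are equal or when one of them is 1. A minimal sketch of the cases, reusing \(A\) and \(v1\) from above (the names \(col\), \(row\_scaled\), and \(col\_scaled\) are introduced for this example):

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(4)])  # shape (4, 5)
v1 = np.arange(0, 5)                                               # shape (5,)

# v1 is treated as shape (1, 5) and repeated along the 4 rows
row_scaled = A * v1

# a (4, 1) column is repeated along the 5 columns instead
col = np.arange(4).reshape(4, 1)
col_scaled = A * col

print(row_scaled.shape, col_scaled.shape)   # (4, 5) (4, 5)

# shapes (4, 5) and (4,) do not line up from the right, so this fails
try:
    A * np.arange(4)
except ValueError as err:
    print("broadcast error:", err)
```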

7.3 Matrix algebra

Matrix operations can be done in the following two ways:

\(array\)
\(matrix\)

\(array\)

np.dot(A,v1)

array([ 30, 130, 230, 330])

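Note that a vector created from a flat list is a 1-D array of shape (5,); it is neither a row vector nor a column vector.
In np.dot(A, v1) the product sums over the last axis of A and the only axis of v1, so v1 acts like a column vector here and the result is a 1-D array of shape (4,).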
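The sketch below shows how the shape of the operand affects the result of the product with plain arrays. It uses the @ operator, which performs matrix multiplication on arrays; the NumPy documentation now recommends regular arrays over np.matrix for linear algebra. The names \(col\), \(r1\), and \(r2\) are introduced for this example.

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(4)])  # shape (4, 5)
v1 = np.arange(0, 5)                                               # shape (5,)

# 1-D operand: the result is 1-D as well
r1 = np.dot(A, v1)

# explicit column vector: the result is a (4, 1) column
col = v1.reshape(5, 1)
r2 = A @ col

# transposing a 1-D array changes nothing
print(v1.T.shape)      # (5,)
print(r1, r1.shape)    # [ 30 130 230 330] (4,)
print(r2.shape)        # (4, 1)
```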

\(matrix\)

M = np.matrix(A)
v = np.matrix(v1).T

M, v

(matrix([[ 0,  1,  2,  3,  4],
         [10, 11, 12, 13, 14],
         [20, 21, 22, 23, 24],
         [30, 31, 32, 33, 34]]),
 matrix([[0],
         [1],
         [2],
         [3],
         [4]]))

M*v

matrix([[ 30],
        [130],
        [230],
        [330]])

#M+v

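This step raises an error: the shapes (4, 5) and (5, 1) are neither equal nor broadcast-compatible, so the addition is undefined.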

v.T * v

matrix([[30]])

This computes the inner (dot) product of \(v\) with itself.

np.shape(v), np.shape(M)

((5, 1), (4, 5))

7.4 Matrix computations

Inverse

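This computes the inverse matrix. \(C\) here is 4x5 and therefore not square, so \(C.I\) returns the Moore-Penrose pseudo-inverse rather than an ordinary inverse.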

C = M
C

matrix([[ 0,  1,  2,  3,  4],
        [10, 11, 12, 13, 14],
        [20, 21, 22, 23, 24],
        [30, 31, 32, 33, 34]])

C.I

matrix([[-0.158, -0.086, -0.014,  0.058],
        [-0.082, -0.044, -0.006,  0.032],
        [-0.006, -0.002,  0.002,  0.006],
        [ 0.07 ,  0.04 ,  0.01 , -0.02 ],
        [ 0.146,  0.082,  0.018, -0.046]])
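With plain arrays, the corresponding functions are np.linalg.inv for a square, non-singular matrix and np.linalg.pinv for the pseudo-inverse of a non-square one. A minimal sketch; the 2x2 matrix \(S\) is made up for illustration, while \(C\) holds the same values as the 4x5 matrix above.

```python
import numpy as np

# a small invertible matrix (chosen for illustration)
S = np.array([[2.0, 1.0],
              [1.0, 3.0]])
S_inv = np.linalg.inv(S)
print(np.allclose(S @ S_inv, np.eye(2)))   # True

# the 4x5 matrix from the notes has no ordinary inverse
C = np.array([[n + m * 10 for n in range(5)] for m in range(4)], dtype=float)
C_pinv = np.linalg.pinv(C)
print(C_pinv.shape)                        # (5, 4)

# defining property of the pseudo-inverse: C @ C_pinv @ C equals C
print(np.allclose(C @ C_pinv @ C, C))
```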

Determinant

Det = v * v.T

np.linalg.det(Det)

0.0

7.5 Data Processing

\(mean: mean(data[:,3])\)

\(standard\quad deviations: std(data[:,3])\)

\(variance: var(data[:, 3])\)

\(min: data[:,3].min()\)

\(max: data[:,3].max()\)

\(sum: sum(matrix)\)

\(prod: prod(matrix)\)

\(cumsum: cumsum(matrix)\) cumulative (prefix) sum

\(cumprod: cumprod(matrix)\) cumulative (prefix) product

\(trace: trace(matrix)\) trace (sum of the main diagonal)
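The notes do not define \(data\) or \(matrix\), so the sketch below uses two small made-up arrays to show what each function in the list returns. The column layout of \(data\) is an assumption made only for this example.

```python
import numpy as np

# hypothetical data set: 4 rows; column 3 holds the values of interest
data = np.array([[2020, 1, 1, 2.0],
                 [2020, 1, 2, 4.0],
                 [2020, 2, 1, 4.0],
                 [2020, 2, 2, 6.0]])
col = data[:, 3]

print(np.mean(col))           # 4.0
print(np.var(col))            # 2.0 (population variance, ddof=0)
print(np.std(col))            # square root of the variance
print(col.min(), col.max())   # 2.0 6.0

m = np.array([[1, 2],
              [3, 4]])
print(np.sum(m))       # 10, sum over all elements
print(np.prod(m))      # 24, product over all elements
print(np.cumsum(m))    # [ 1  3  6 10], running sum of the flattened array
print(np.cumprod(m))   # [ 1  2  6 24], running product of the flattened array
print(np.trace(m))     # 5, sum of the main diagonal
```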

7.6 Computations on subsets of arrays

\(unique: unique(data[:,1])\)

\(mask\_feb = data[:, 1] == 2\)
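A sketch of how the two expressions above might be used together. The array \(data\) and its column layout (year, month, day, value) are hypothetical; the name \(mask\_feb\) suggests that column 1 holds the month, which is the assumption made here.

```python
import numpy as np

# hypothetical data set: columns are year, month, day, value
data = np.array([[2020, 1, 30, 1.5],
                 [2020, 2, 1, 2.0],
                 [2020, 2, 2, 4.0],
                 [2020, 3, 1, 3.5]])

# distinct values in the month column, sorted
months = np.unique(data[:, 1])
print(months)                            # [1. 2. 3.]

# boolean mask: True for the rows whose month is 2
mask_feb = data[:, 1] == 2

# mean of the value column restricted to the February rows
feb_mean = np.mean(data[mask_feb, 3])
print(feb_mean)                          # 3.0
```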

Posted 2020-02-29 12:54 by Xiaojian_xiang
