PyTorch_2.2_Data Structures

2.2 Data Structures

torch.Tensor is the main tool for storing and transforming data.

2.2.1 Creating a Tensor

First, import PyTorch:

import torch

Create an uninitialized 5x3 Tensor (its values are whatever happened to be in the allocated memory):

x = torch.empty(5,3)
print(x)

tensor([[9.2755e-39, 1.0561e-38, 1.0929e-38],
        [1.0102e-38, 9.7347e-39, 4.2246e-39],
        [1.0286e-38, 1.0653e-38, 1.0194e-38],
        [8.4490e-39, 1.0469e-38, 9.3674e-39],
        [9.9184e-39, 8.7245e-39, 9.2755e-39]])

Create a randomly initialized 5x3 Tensor:

x = torch.rand(5,3)
print(x)

tensor([[0.8809, 0.5242, 0.8139],
        [0.1677, 0.4035, 0.3680],
        [0.4707, 0.9393, 0.5377],
        [0.7676, 0.7518, 0.4257],
        [0.7233, 0.4705, 0.3036]])

Create a 5x3 Tensor of all zeros with dtype long:

x = torch.zeros(5,3,dtype = torch.long)
print(x)
print(x[1][1])
print(type(x[1][1]))

tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])
tensor(0)
<class 'torch.Tensor'>

A Tensor can also be created directly from data:

x = torch.tensor([6.6,8])
print(x)

tensor([6.6000, 8.0000])

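A new Tensor can be created from an existing one. By default it reuses properties of the existing Tensor, such as dtype, unless they are overridden:

y = x.new_ones(5,3,dtype = torch.float64)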
print(y)
z = torch.randn_like(y,dtype = torch.float)
print(z)

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)
tensor([[ 0.3913,  1.5179,  1.6415],
        [ 1.2150, -0.1650, -0.7815],
        [-0.2489,  0.5252, -1.2605],
        [ 0.5305,  0.1410, -0.6449],
        [-1.4771,  0.1270,  0.6471]])
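
As a small sketch of what "reusing properties" means here, the following compares the dtypes produced with and without an explicit override. It assumes PyTorch's default floating-point dtype (torch.float32) has not been changed; the variable names a, b, c, d are illustrative only.

```python
import torch

x = torch.tensor([6.6, 8])                   # default dtype: torch.float32

# new_* methods inherit dtype (and device) from the source tensor
# unless they are overridden explicitly.
a = x.new_ones(2, 3)                         # float32, like x
b = x.new_ones(2, 3, dtype=torch.float64)    # explicit override

# *_like functions inherit the shape and, by default, the dtype.
c = torch.randn_like(b)                      # float64, shape (2, 3)
d = torch.randn_like(b, dtype=torch.float)   # float32, shape (2, 3)

print(a.dtype, b.dtype, c.dtype, d.dtype)
print(c.shape)
```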

The shape of a Tensor can be obtained with shape or size():

print(y.size())
print(y.shape)
# torch.Size is a subclass of tuple

torch.Size([5, 3])
torch.Size([5, 3])
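
Because torch.Size behaves like a tuple, it can be unpacked and indexed in the usual way. A minimal sketch (the names rows, cols and is_tuple are illustrative):

```python
import torch

y = torch.ones(5, 3)

is_tuple = isinstance(y.shape, tuple)   # torch.Size subclasses tuple
rows, cols = y.shape                    # so it unpacks like one
print(is_tuple, rows, cols)             # True 5 3
print(y.size(0), y.numel())             # 5 15
```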

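Function	Description
Tensor(*sizes)	Basic constructor
tensor(data,)	Constructor similar to np.array
ones(*sizes)	Tensor of all ones
zeros(*sizes)	Tensor of all zeros
eye(*sizes)	Ones on the diagonal, zeros elsewhere
arange(s,e,step)	From s to e (e excluded) with step size step
linspace(s,e,steps)	steps evenly spaced values from s to e (both included)
rand/randn(*sizes)	Uniform on [0, 1) / standard normal distribution
normal(mean,std)/uniform(from,to)	Normal distribution / uniform distribution
randperm(m)	Random permutation of the integers 0 to m-1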
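
A few of the constructors in the table, as a short sketch. The concrete arguments are chosen only for illustration:

```python
import torch

r = torch.arange(1, 10, 3)      # 1, 4, 7 (the end value 10 is excluded)
l = torch.linspace(0, 1, 5)     # 5 evenly spaced values, both ends included
e = torch.eye(3)                # 3x3 identity matrix
p = torch.randperm(5)           # random permutation of 0..4

print(r)
print(l)
print(e)
print(p)
```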

2.2.2 Operations

Arithmetic operations

Addition can be written directly with the + operator:

x = torch.rand(5,3)
y = torch.rand(5,3)
print(x + y)

tensor([[1.4692, 0.7374, 0.7234],
        [0.8925, 0.7805, 1.3061],
        [0.8631, 1.8786, 1.5237],
        [0.9812, 0.9703, 1.0981],
        [1.1590, 0.7849, 1.1890]])

print(torch.add(x,y))

tensor([[1.4692, 0.7374, 0.7234],
        [0.8925, 0.7805, 1.3061],
        [0.8631, 1.8786, 1.5237],
        [0.9812, 0.9703, 1.0981],
        [1.1590, 0.7849, 1.1890]])

An output Tensor can also be specified:

result = torch.empty(5,3)
torch.add(x,y,out = result)
print(result)
# or simply assign the return value
result = torch.add(x,y)
print(result)

tensor([[1.4692, 0.7374, 0.7234],
        [0.8925, 0.7805, 1.3061],
        [0.8631, 1.8786, 1.5237],
        [0.9812, 0.9703, 1.0981],
        [1.1590, 0.7849, 1.1890]])
tensor([[1.4692, 0.7374, 0.7234],
        [0.8925, 0.7805, 1.3061],
        [0.8631, 1.8786, 1.5237],
        [0.9812, 0.9703, 1.0981],
        [1.1590, 0.7849, 1.1890]])

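Another form of addition is the method call. Note that y.add(x) returns a new Tensor and leaves y unchanged, which is why the output below still shows the original y; the in-place version is y.add_(x).

y.add(x)  # returns x + y as a new Tensor; y itself is not modified
print(y)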

tensor([[0.5895, 0.4489, 0.3473],
        [0.0495, 0.6829, 0.5846],
        [0.2628, 0.9495, 0.9024],
        [0.3175, 0.9222, 0.5868],
        [0.1993, 0.7389, 0.7718]])
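
The difference between the out-of-place method add() and the in-place method add_() can be sketched with small tensors whose values are easy to follow (the tensors and the name y_unchanged are illustrative):

```python
import torch

x = torch.ones(2, 2)
y = torch.zeros(2, 2)

z = y.add(x)                                   # out-of-place: returns a new tensor
y_unchanged = torch.equal(y, torch.zeros(2, 2))
print(y_unchanged)                             # True: y is still all zeros
print(z)                                       # all ones

y.add_(x)                                      # in-place (trailing underscore)
print(y)                                       # y is now all ones
```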

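Indexing

Part of a Tensor can be accessed with NumPy-style indexing. The result of basic indexing (integers and slices) shares memory with the original data, so modifying one modifies the other.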

y = x[0,:]
y += 1
print(y)
print(x[0,:])
# modifying the indexed result also changed the original data

tensor([1.8798, 1.2886, 1.3761])
tensor([1.8798, 1.2886, 1.3761])

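Some other selection functions

Function	Description
index_select(input,dim,index)	Selects entries (for example certain rows or columns) along dimension dim
masked_select(input, mask)	Selects with a mask, e.g. a[a>0] (a BoolTensor in current PyTorch; older versions used a ByteTensor)
nonzero(input)	Indices of the nonzero elements
gather(input, dim, index)	Gathers values along dimension dim according to index; the output has the same size as index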
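
The selection functions above can be tried on a small integer matrix. This is a sketch with made-up data; the variable names are illustrative:

```python
import torch

a = torch.tensor([[1, -2, 3],
                  [-4, 5, -6]])

# index_select: pick columns 0 and 2 along dim=1
cols = torch.index_select(a, 1, torch.tensor([0, 2]))

# masked_select: 1-D tensor of the elements where the mask is True
positives = torch.masked_select(a, a > 0)   # same result as a[a > 0]

# nonzero: (row, col) indices of the nonzero (here: True) entries
where_pos = torch.nonzero(a > 0)

# gather along dim=1: out[i][j] = a[i][index[i][j]]
index = torch.tensor([[0, 0], [2, 1]])
picked = torch.gather(a, 1, index)

print(cols)        # [[1, 3], [-4, -6]]
print(positives)   # [1, 3, 5]
print(where_pos)   # [[0, 0], [0, 2], [1, 1]]
print(picked)      # [[1, 1], [-6, 5]]
```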

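Changing the shape of a Tensor

view() changes the shape of a Tensor, that is, how its elements are arranged into rows and columns, without changing the data:

y = x.view(15) # flatten the 5x3 matrix into a 1-D Tensor of 15 elements
z = x.view(-1,5) # -1 means this dimension is inferred; with 5 columns it must be 3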
print(x.size(),y.size(),z.size())

torch.Size([5, 3]) torch.Size([15]) torch.Size([3, 5])

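The new Tensor shares memory with the original one here, so changing either changes both.

If an independent copy is needed, call clone() first and then view():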

x_cp = x.clone().view(15)
x -= 1
print(x)
print(x_cp)

tensor([[ 0.8798,  0.2886,  0.3761],
        [-0.1569, -0.9025, -0.2784],
        [-0.3997, -0.0710, -0.3788],
        [-0.3363, -0.9519, -0.4887],
        [-0.0403, -0.9539, -0.5827]])
tensor([1.8798, 1.2886, 1.3761, 0.8431, 0.0975, 0.7216, 0.6003, 0.9290, 0.6212,
        0.6637, 0.0481, 0.5113, 0.9597, 0.0461, 0.4173])
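
The sharing behaviour of view() and the independence of a clone() can also be checked directly. The sketch below uses data_ptr(), which returns the address of a tensor's first element; the small tensors and the name shares are illustrative:

```python
import torch

x = torch.zeros(2, 3)

v = x.view(6)                  # view: same storage as x
v[0] = 7.0                     # write through the view
shares = x.data_ptr() == v.data_ptr()
print(shares, x[0, 0].item())  # True 7.0

c = x.clone().view(6)          # clone() copies the data first
c[1] = -1.0                    # only the copy changes
print(x[0, 1].item())          # 0.0
```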

item() converts a single-element Tensor to a Python number:

x = torch.randn(1)
print(x)
print(x.item())

tensor([-0.5851])
-0.585087776184082

torch.rand() vs. torch.randn()

torch.rand() samples from the uniform distribution on [0, 1).
torch.randn() samples from the standard normal distribution (mean 0, variance 1).
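
A quick empirical check of the two distributions, as a sketch: the sample size and the fixed seed are arbitrary choices made only to keep the run repeatable.

```python
import torch

torch.manual_seed(0)               # fixed seed, only for repeatability

u = torch.rand(100_000)            # uniform on [0, 1)
n = torch.randn(100_000)           # standard normal

print(u.min().item(), u.max().item())    # both inside [0, 1)
print(u.mean().item())                   # close to 0.5
print(n.mean().item(), n.std().item())   # close to 0 and 1
```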

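Linear algebra

Function	Description
trace	Sum of the diagonal elements (the trace of the matrix)
diag	Diagonal elements
triu/tril	Upper/lower triangular part of a matrix; an offset can be specified
mm/bmm	Matrix multiplication / batched matrix multiplication
addmm/addbmm/addmv/addr/baddbmm..	Matrix operations (products combined with an addition)
t	Transpose
dot/cross	Dot (inner) product / cross product
inverse	Matrix inverse
svd	Singular value decomposition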

x = torch.rand(3,3)
print(x)
print('Trace of x: {}'.format(x.trace()))
print('Diagonal of x: {}'.format(x.diag()))
print('Upper triangle of x: {}'.format(x.triu()))
print('Transpose of x: {}'.format(x.t()))

tensor([[0.8447, 0.7846, 0.1641],
        [0.5459, 0.4649, 0.9806],
        [0.1267, 0.0251, 0.9043]])
Trace of x: 2.2139716148376465
Diagonal of x: tensor([0.8447, 0.4649, 0.9043])
Upper triangle of x: tensor([[0.8447, 0.7846, 0.1641],
        [0.0000, 0.4649, 0.9806],
        [0.0000, 0.0000, 0.9043]])
Transpose of x: tensor([[0.8447, 0.5459, 0.1267],
        [0.7846, 0.4649, 0.0251],
        [0.1641, 0.9806, 0.9043]])
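
The multiplication and inverse functions from the table can be sketched on small matrices whose products are easy to verify by hand. The matrices A and B are made up for this illustration:

```python
import torch

A = torch.tensor([[2., 1.],
                  [1., 3.]])
B = torch.tensor([[1., 0.],
                  [4., 1.]])

AB = torch.mm(A, B)              # matrix product, same as A @ B
d = torch.dot(A[0], B[0])        # inner product of two 1-D tensors: 2*1 + 1*0
A_inv = torch.inverse(A)         # det(A) = 5, so A is invertible
I = torch.mm(A, A_inv)           # approximately the identity matrix

batch = torch.stack([A, B])      # shape (2, 2, 2): a batch of two matrices
BB = torch.bmm(batch, batch)     # multiplies the matrices pairwise

print(AB)                        # [[6, 1], [13, 3]]
print(d)                         # 2
print(I)
print(BB.shape)                  # torch.Size([2, 2, 2])
```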

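2.2.3 Broadcasting

Broadcasting means that when an elementwise operation is applied to two Tensors of different shapes, they are automatically expanded to a common shape (provided the shapes are compatible) before the operation is carried out.

For example: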

x = torch.arange(1,3).view(1,2)
print(x)
y = torch.arange(1,4).view(3,1)
print(y)
print(x + y)

tensor([[1, 2]])
tensor([[1],
        [2],
        [3]])
tensor([[2, 3],
        [3, 4],
        [4, 5]])
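
To make the implicit expansion visible, the two operands can be broadcast explicitly with torch.broadcast_tensors. A minimal sketch using the same x and y as above (xb and yb are illustrative names):

```python
import torch

x = torch.arange(1, 3).view(1, 2)    # shape (1, 2)
y = torch.arange(1, 4).view(3, 1)    # shape (3, 1)

# Both operands are expanded to the common shape (3, 2).
xb, yb = torch.broadcast_tensors(x, y)
print(xb)    # rows repeated:    [[1, 2], [1, 2], [1, 2]]
print(yb)    # columns repeated: [[1, 1], [2, 2], [3, 3]]

same = torch.equal(xb + yb, x + y)
print(same)  # True
```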

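2.2.4 Memory cost of operations

Python's built-in id function can be used to check whether an operation creates a new object or reuses the existing one: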

x = torch.tensor([1,2])
y = torch.tensor([3,4])
print(id(y))
y = y + x  # the result is a new Tensor assigned to y, so new memory is allocated
print(id(y))
y += x  # in place, no new memory; y.add_(x) does the same
print(id(y))
print(y)
torch.add(x,y,out = y)
# y.add_(x)
print(id(y))
print(y)

1714379452136
1714379449544
1714379449544
tensor([5, 8])
1714379449544
tensor([ 6, 10])

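Distinguish torch.add(x, y, out=y) from the method form: the former is a function in the torch namespace, while the in-place method is y.add_(x), with a trailing underscore (plain y.add(x) returns a new Tensor).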
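
id() identifies the Python object rather than the underlying buffer, so a complementary check is data_ptr(), which returns the address of the tensor's first element. The sketch below compares that address across the three kinds of update; the names ptr and same_after_* are illustrative:

```python
import torch

x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
ptr = y.data_ptr()                  # address of y's first element

y += x                              # in-place: y is now [4, 6]
same_after_iadd = (y.data_ptr() == ptr)

y[:] = y + x                        # result copied back into y's storage: [5, 8]
same_after_slice = (y.data_ptr() == ptr)

y = y + x                           # new tensor bound to the name y: [6, 10]
same_after_rebind = (y.data_ptr() == ptr)

print(same_after_iadd, same_after_slice, same_after_rebind)  # True True False
print(y)
```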

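2.2.5 Converting between Tensor and NumPy

numpy() and from_numpy() convert between a Tensor and a NumPy array.

After conversion, the Tensor and the array share memory, so a change to one is visible in the other.

Tensor to NumPy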

a = torch.ones(5)
b = a.numpy()
print(a,b)

a += 1
print(a,b)
b += 1
print(a,b)

tensor([1., 1., 1., 1., 1.]) [1. 1. 1. 1. 1.]
tensor([2., 2., 2., 2., 2.]) [2. 2. 2. 2. 2.]
tensor([3., 3., 3., 3., 3.]) [3. 3. 3. 3. 3.]

NumPy to Tensor

import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
print(a,b)

a += 1
print(a,b)
b += 1
print(a,b)

[1. 1. 1. 1. 1.] tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
[2. 2. 2. 2. 2.] tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
[3. 3. 3. 3. 3.] tensor([3., 3., 3., 3., 3.], dtype=torch.float64)
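
When shared memory is not wanted, torch.tensor() can be used instead of from_numpy(), since it copies the data. A short sketch contrasting the two (shared and copied are illustrative names):

```python
import numpy as np
import torch

a = np.ones(3)
shared = torch.from_numpy(a)   # shares memory with a
copied = torch.tensor(a)       # copies the data

a += 1                         # modify the NumPy array in place
print(shared)                  # follows the change: all 2s
print(copied)                  # unaffected: all 1s
```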

[Summary]
This section covered how to create and use Tensors.

posted on 2020-01-31 16:18 wangxiaobei2019
