[Dive into Deep Learning] Tensor Data Operations

Tensor Data Operations

These are study notes on the book Dive into Deep Learning (《动手学深度学习》) by Mu Li.

1. Creating Tensors

# Import some commonly used libraries
import torch
from IPython import display
from matplotlib import pyplot as plt
import numpy as np
import random
import torch.nn as nn
import torch.optim as optim  # torch.optim provides many common optimization algorithms, such as SGD, Adam, and RMSProp
from torch.nn import init
import torch.utils.data as Data

x = torch.zeros(2, 3)  # tensor([[0., 0., 0.], [0., 0., 0.]])
y = torch.rand(2, 3)  # random values, e.g. tensor([[0.4825, 0.5103, 0.4058], [0.0201, 0.4958, 0.5905]])
z = torch.tensor([5.5, 2])  # tensor([5.5000, 2.0000])

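A tensor can also be created from an existing one. By default this reuses some attributes of the input tensor, such as its data type, unless you specify them explicitly.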

x = torch.eye(5, 3)
# new_* methods inherit x's torch.dtype and torch.device by default; here the dtype is overridden
y = x.new_ones(5, 3, dtype=torch.float64)
# *_like functions reuse x's shape, dtype, and device; a dtype can be passed explicitly (torch.float is already x's dtype here)
z = torch.randn_like(x, dtype=torch.float)
# print(f"{x}\n{y}\n{z}")
print(x)  # tensor([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.], [0., 0., 0.], [0., 0., 0.]])
print(y)  # tensor([[1., 1., 1.], [1., 1., 1.], [1., 1., 1.], [1., 1., 1.], [1., 1., 1.]], dtype=torch.float64)
print(z)  # random values, e.g. tensor([[ 0.4024, -0.2069, -1.3955], [ 0.2060, -0.0344, -0.7559], [-0.3694,  1.6153, -0.0104], [-0.1180,  0.3266, -0.4363], [ 1.1623,  0.0751,  0.9289]])
print(x.size())  # torch.Size([5, 3])
print(x.shape)  # torch.Size([5, 3])
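
As a minimal sketch of the attribute reuse described above, the following compares what is inherited when no dtype is passed with what happens when one is given. The tensor name `base` and the chosen shapes are illustrative assumptions, not part of the original notes.

```python
import torch

# Hypothetical source tensor with a non-default dtype
base = torch.eye(5, 3, dtype=torch.float64)

ones = base.new_ones(2, 2)                          # no dtype given: inherits float64 and base's device
noise = torch.randn_like(base)                      # same shape, dtype, and device as base
ones32 = base.new_ones(2, 2, dtype=torch.float32)   # an explicit dtype overrides the inherited one

print(ones.dtype, noise.dtype, ones32.dtype)  # torch.float64 torch.float64 torch.float32
print(noise.shape)                            # torch.Size([5, 3])
print(ones.device == base.device)             # True
```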

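2. Indexing

The result of an indexing operation shares memory with the original data: modifying one also modifies the other. This applies to basic indexing and slicing, as in the code that follows; indexing with an index tensor or a boolean mask returns a copy instead.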

y = x[0, :]
y += 1
print(y)  # tensor([2., 1., 1.])
# The source tensor has been modified as well
print(x[0, :])  # tensor([2., 1., 1.])
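
To make the memory sharing concrete, here is a small sketch contrasting basic indexing (a view) with indexing by an index tensor (a copy). The variable names are made up for illustration.

```python
import torch

x = torch.zeros(3, 3)

row_view = x[0, :]                   # basic indexing: a view of x's first row
row_copy = x[torch.tensor([0]), :]   # indexing with an index tensor: a copy, shape (1, 3)

row_view += 1    # also changes x
row_copy += 10   # does not change x

print(x[0])                                  # tensor([1., 1., 1.])
print(row_copy)                              # tensor([[10., 10., 10.]])
print(row_view.data_ptr() == x.data_ptr())   # True: same underlying memory
```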

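3. Reshaping

Note that the tensor returned by view() may have a different size from the source tensor, but the two share the same data: changing one changes the other. Even so, view() returns a new Tensor object (a tensor carries other attributes besides its data), so the two have different ids (the addresses of the two Python objects).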

# x still holds the values left by the indexing example in section 2
y = x.view(15)
z = x.view(-1, 5)  # the dimension given as -1 is inferred from the other dimensions
print(x.size(), y.size(), z.size())  # torch.Size([5, 3]) torch.Size([15]) torch.Size([3, 5])
x += 1
print(x)  # tensor([[3., 2., 2.], [1., 2., 1.], [1., 1., 2.], [1., 1., 1.], [1., 1., 1.]])
print(y)  # tensor([3., 2., 2., 1., 2., 1., 1., 1., 2., 1., 1., 1., 1., 1., 1.])

x_cp = x.clone().view(15)  # clone() first, then view(): a reshaped copy that does not share data with x
x -= 1
print(x)  # tensor([[2., 1., 1.], [0., 1., 0.], [0., 0., 1.], [0., 0., 0.], [0., 0., 0.]])
print(x_cp)  # tensor([3., 2., 2., 1., 2., 1., 1., 1., 2., 1., 1., 1., 1., 1., 1.])
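
The difference between "same data" and "same object" can be checked directly. The sketch below uses `Tensor.data_ptr()`, which returns the address of a tensor's first element, alongside Python's `is`; the tensor names are illustrative.

```python
import torch

x = torch.arange(6.0).view(2, 3)
flat = x.view(6)            # new Tensor object, same underlying data
copy = x.clone().view(6)    # new Tensor object, independent data

print(flat is x)                          # False: two distinct Python objects
print(flat.data_ptr() == x.data_ptr())    # True: they share memory
print(copy.data_ptr() == x.data_ptr())    # False: clone() allocated new memory

x[0, 0] = 100.0
print(flat[0].item(), copy[0].item())     # 100.0 0.0
```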

x = torch.randn(1)  # item() converts a one-element tensor to a Python number
print(x)  # e.g. tensor([-0.6769])
print(x.item())  # e.g. -0.6768880486488342
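
Since item() only works on tensors with exactly one element, a tensor with several elements needs a different route. A short sketch, with an illustrative tensor `t`:

```python
import torch

t = torch.tensor([1.5, 2.5])

first = t[0].item()   # index down to a single element, then convert it
values = t.tolist()   # convert the whole tensor to a (nested) Python list

print(first)    # 1.5
print(values)   # [1.5, 2.5]
```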

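4. Broadcasting

# When two tensors of different shapes are combined elementwise, broadcasting may be triggered: elements are first replicated as needed so that both tensors have the same shape, and the operation is then applied elementwise.
x = torch.arange(1, 3).view(1, 2)
print(x)  # tensor([[1, 2]])
y = torch.arange(1, 4).view(3, 1)
print(y)  # tensor([[1], [2], [3]])
print(x + y)  # tensor([[2, 3], [3, 4], [4, 5]])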
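
The "replicate, then operate elementwise" description can be spelled out by hand. This sketch uses `expand()` to make both operands (3, 2) explicitly and checks that the result matches the broadcast sum; the names `x_big` and `y_big` are illustrative.

```python
import torch

x = torch.arange(1, 3).view(1, 2)   # shape (1, 2)
y = torch.arange(1, 4).view(3, 1)   # shape (3, 1)

x_big = x.expand(3, 2)   # the single row repeated 3 times (a view, no data copied)
y_big = y.expand(3, 2)   # the single column repeated 2 times

print(x_big)   # tensor([[1, 2], [1, 2], [1, 2]])
print(y_big)   # tensor([[1, 1], [2, 2], [3, 3]])
print(torch.equal(x + y, x_big + y_big))   # True
```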

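5. Memory Behavior of Operations

Indexing does not allocate new memory, whereas an operation such as y = x + y allocates new memory for the result and then points y at it. To demonstrate this, we can use Python's built-in id function: if two instances have the same id, they are the same object at the same memory address; otherwise they are not.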

x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
id_before = id(y)
y = y + x
print(id(y) == id_before)  # False

x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
id_before = id(y)
y[:] = y + x  # assigning through an index writes the result into y's existing memory
print(id(y) == id_before)  # True

x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
id_before = id(y)
torch.add(x, y, out=y)  # y += x and y.add_(x) also write the result into y's existing memory
print(id(y) == id_before)  # True
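
id() compares Python objects. As a complementary check, this sketch compares the address of the tensor data itself with `Tensor.data_ptr()`. It illustrates the same point from a different angle and is not how the original notes test it.

```python
import torch

x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
ptr_before = y.data_ptr()

y += x                               # in-place: result written into y's existing memory
print(y.data_ptr() == ptr_before)    # True

z = y + x                            # out-of-place: a new tensor with newly allocated memory
print(z.data_ptr() == ptr_before)    # False
print(y, z)                          # tensor([4, 6]) tensor([5, 8])
```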

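6. Converting Between Tensor and NumPy

[Tensor to NumPy]: the NumPy array produced by numpy() shares memory with the (CPU) tensor, which is why the conversion is fast, so changing one also changes the other!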

a = torch.ones(5)
b = a.numpy()
print(a, b)  # tensor([1., 1., 1., 1., 1.]) [1. 1. 1. 1. 1.]
a += 1
print(a, b)  # tensor([2., 2., 2., 2., 2.]) [2. 2. 2. 2. 2.]
b += 1
print(a, b)  # tensor([3., 3., 3., 3., 3.]) [3. 3. 3. 3. 3.]
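
One practical caveat, sketched under the assumption that the tensor may require gradients or live on a GPU: numpy() can only be called on a CPU tensor that does not require grad, so a common pattern is to detach and move to the CPU first. The names `w` and `arr` are illustrative.

```python
import torch

w = torch.ones(3, requires_grad=True)

# w.numpy() would raise an error here because w requires grad
arr = w.detach().cpu().numpy()   # detach from the graph, make sure it is on the CPU, then convert

arr[0] = 5.0   # for a CPU tensor, the array still shares memory with w
print(w)       # tensor([5., 1., 1.], requires_grad=True)
```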

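[NumPy to Tensor]: use from_numpy() or tensor(). The tensor produced by from_numpy() shares memory with the NumPy array, which is why the conversion is fast, so changing one also changes the other! In contrast, tensor() always copies the data, so the returned tensor no longer shares memory with the original array.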

import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
print(a, b)  # [1. 1. 1. 1. 1.] tensor([1., 1., 1., 1., 1.], dtype=torch.float64)

a += 1
print(a, b)  # [2. 2. 2. 2. 2.] tensor([2., 2., 2., 2., 2.], dtype=torch.float64)

b += 1
print(a, b)  # [3. 3. 3. 3. 3.] tensor([3., 3., 3., 3., 3.], dtype=torch.float64)

c = torch.tensor(a)
a += 1
print(a, c)  # [4. 4. 4. 4. 4.] tensor([3., 3., 3., 3., 3.], dtype=torch.float64)
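
Rather than inferring sharing from printed values, it can be checked directly. This sketch uses NumPy's `np.shares_memory`; the names `shared` and `copied` are illustrative.

```python
import numpy as np
import torch

a = np.ones(3)
shared = torch.from_numpy(a)   # shares memory with a
copied = torch.tensor(a)       # copies the data

print(np.shares_memory(a, shared.numpy()))   # True
print(np.shares_memory(a, copied.numpy()))   # False

a += 1
print(shared)   # tensor([2., 2., 2.], dtype=torch.float64)
print(copied)   # tensor([1., 1., 1.], dtype=torch.float64)
```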

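7. Gradients

Setting a tensor's .requires_grad attribute to True starts tracking every operation performed on it, so that gradients can be propagated with the chain rule.

Once the computation is finished, calling .backward() computes all the gradients. The gradient of this tensor is accumulated into its .grad attribute.

The grad_fn attribute tells whether the tensor was produced by a tracked operation: if so, grad_fn returns an object associated with that operation; otherwise it is None.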

x = torch.ones(2, 2, requires_grad=True)
print(x)  # tensor([[1., 1.], [1., 1.]], requires_grad=True)
print(x.grad_fn)  # None

Some examples:

y = x + 2
print(y)  # tensor([[3., 3.], [3., 3.]], grad_fn=<AddBackward0>)

# y was created by an addition, so its grad_fn is <AddBackward0>
print(y.grad_fn)  # e.g. <AddBackward0 object at 0x000002ADB0796048>

z = y * y * 3
out = z.mean()
print(z, out)  # tensor([[27., 27.], [27., 27.]], grad_fn=<MulBackward0>) tensor(27., grad_fn=<MeanBackward0>)

# Tensors created directly, like x, are called leaf nodes; a leaf node's grad_fn is None
print(x.is_leaf, y.is_leaf)  # True False

a = torch.randn(2, 2)  # requires_grad defaults to False when not specified
a = ((a * 3) / (a - 1))
print(a.requires_grad)  # False

a.requires_grad_(True)  # requires_grad_() changes the requires_grad attribute in place
print(a.requires_grad)  # True

b = (a * a).sum()
print(b.grad_fn)  # e.g. <SumBackward0 object at 0x0000016BBB329208>

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()
out.backward()  # equivalent to out.backward(torch.tensor(1.))
print(x.grad)  # tensor([[4.5000, 4.5000], [4.5000, 4.5000]])

# Backpropagate once more; note that grad accumulates
out2 = x.sum()
out2.backward()
print(x.grad)  # tensor([[5.5000, 5.5000], [5.5000, 5.5000]])

out3 = x.sum()
x.grad.zero_()  # zero the gradient before backpropagating, otherwise it keeps accumulating
out3.backward()
print(x.grad)  # tensor([[1., 1.], [1., 1.]])
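
The value 4.5 can be checked by hand. Since out is the mean of 3(x_i + 2)^2 over four elements, the derivative with respect to each x_i is (3/4) * 2 * (x_i + 2) = 1.5 * (x_i + 2), which is 4.5 at x_i = 1. A minimal sketch comparing autograd with this formula (the name `analytic` is illustrative):

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
out = (3 * (x + 2) ** 2).mean()
out.backward()

# Hand-derived gradient: d(out)/dx_i = (3/4) * 2 * (x_i + 2)
analytic = 1.5 * (x.detach() + 2)

print(x.grad)                              # tensor([[4.5000, 4.5000], [4.5000, 4.5000]])
print(torch.allclose(x.grad, analytic))    # True
```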

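x = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)
y = 2 * x
z = y.view(2, 2)
print(z)  # tensor([[2., 4.], [6., 8.]], grad_fn=<ViewBackward0>)
# Differentiating a tensor with respect to a tensor is not allowed, only a scalar with respect to a tensor; the result has the same shape as the variable being differentiated
v = torch.tensor([[1.0, 0.1], [0.01, 0.001]], dtype=torch.float)
z.backward(v)  # z is not a scalar, so backward() needs a weight tensor of the same shape as z; the weighted sum of z is the scalar that is differentiated
print(x.grad)  # tensor([2.0000, 0.2000, 0.0200, 0.0020])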
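
The "weighted sum" reading of backward(v) can be verified. This sketch computes the gradient twice, once by passing v to backward() and once by building the scalar (z * v).sum() explicitly; `x1` and `x2` are illustrative names for two independent copies of the input.

```python
import torch

v = torch.tensor([[1.0, 0.1], [0.01, 0.001]])

# Variant 1: pass the weights to backward()
x1 = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)
z1 = (2 * x1).view(2, 2)
z1.backward(v)

# Variant 2: form the weighted sum explicitly, then differentiate the scalar
x2 = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)
z2 = (2 * x2).view(2, 2)
(z2 * v).sum().backward()

print(x1.grad)                             # tensor([2.0000, 0.2000, 0.0200, 0.0020])
print(torch.allclose(x1.grad, x2.grad))    # True
```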

Stopping gradient tracking

x = torch.tensor(1.0, requires_grad=True)
y1 = x ** 2
with torch.no_grad():
    y2 = x ** 3
y3 = y1 + y2

print(x.requires_grad)  # True
print(y1, y1.requires_grad)  # tensor(1., grad_fn=<PowBackward0>) True
print(y2, y2.requires_grad)  # tensor(1.) False
print(y3, y3.requires_grad)  # tensor(2., grad_fn=<AddBackward0>) True
y3.backward()
# Only y1 contributes (2 * x = 2); y2 was computed under no_grad, so no gradient flows back through it
print(x.grad)  # tensor(2.)
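
For comparison, here is a sketch of the same computation with and without the cut, using detach() as an alternative to the no_grad block. With full tracking the gradient is 2x + 3x^2 = 5 at x = 1; with y2 cut off it is 2. The names `x_full` and `x_cut` are illustrative.

```python
import torch

# Full tracking: both terms contribute to the gradient
x_full = torch.tensor(1.0, requires_grad=True)
(x_full ** 2 + x_full ** 3).backward()
print(x_full.grad)   # tensor(5.)

# detach() cuts y2 out of the graph, similar in effect to computing it under torch.no_grad()
x_cut = torch.tensor(1.0, requires_grad=True)
y1 = x_cut ** 2
y2 = (x_cut ** 3).detach()
(y1 + y2).backward()
print(x_cut.grad)    # tensor(2.)
```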

Posted 2022-01-23 22:12 by ccql
