PyTorch 0.4 Migration Guide

  • Tensors and Variables have merged
  • Tensors support 0-dimensional (scalar) values
  • The volatile flag has been deprecated
  • dtypes, devices, and NumPy-style creation functions
  • Writing device-agnostic code
  • New edge-case constraints on the names of submodules, parameters, and buffers in nn.Module

1. Merging the Tensor and Variable classes

torch.autograd.Variable and torch.Tensor are now the same class. More precisely, torch.Tensor can track history and perform automatic differentiation just like the old Variable; Variable still works as before, but it returns an object of type torch.Tensor. This means you no longer need the Variable wrapper in your code.
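A minimal sketch of the merged behavior (variable names are illustrative): wrapping a tensor in Variable still works, but the result is just a torch.Tensor, and autograd now operates on plain tensors directly.

```python
import torch
from torch.autograd import Variable

# Old style: wrapping in Variable still works after the merge...
x = Variable(torch.ones(2), requires_grad=True)
# ...but the result is a plain torch.Tensor, not a separate Variable class
print(type(x))          # <class 'torch.Tensor'>

# New style: autograd works on tensors directly, no wrapper needed
y = torch.ones(2, requires_grad=True)
(x * y).sum().backward()
print(x.grad)           # tensor([1., 1.])
```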

1.1 type() of a Tensor has changed

type() no longer reflects the data type of a tensor. Use isinstance() or x.type() instead:

>>> x = torch.DoubleTensor([1, 1, 1])
>>> print(type(x))  # was torch.DoubleTensor
"<class 'torch.Tensor'>"
>>> print(x.type())  # OK: 'torch.DoubleTensor'
'torch.DoubleTensor'
>>> print(isinstance(x, torch.DoubleTensor))  # OK: True
True

1.2 When does autograd start tracking?

requires_grad, the central flag for autograd, is now an attribute on Tensors. Let's see how this plays out in code.
autograd uses the same rules previously applied to Variables: it starts tracking a tensor as soon as requires_grad=True is set. For example,

>>> x = torch.ones(1)  # create a tensor with requires_grad=False (default)
>>> x.requires_grad
False
>>> y = torch.ones(1)  # another tensor with requires_grad=False
>>> z = x + y
>>> # both inputs have requires_grad=False. so does the output
>>> z.requires_grad
False
>>> # then autograd won't track this computation. let's verify!
>>> z.backward()
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
>>>
>>> # now create a tensor with requires_grad=True
>>> w = torch.ones(1, requires_grad=True)
>>> w.requires_grad
True
>>> # add to the previous result that has requires_grad=False
>>> total = w + z
>>> # the total sum now requires grad!
>>> total.requires_grad
True
>>> # autograd can compute the gradients as well
>>> total.backward()
>>> w.grad
tensor([ 1.])
>>> # and no computation is wasted to compute gradients for x, y and z, which don't require grad
>>> z.grad == x.grad == y.grad == None
True

1.3 Manipulating the requires_grad flag

Besides setting the attribute directly, you can modify the flag in place on an existing tensor with my_tensor.requires_grad_(requires_grad=True), or, as in the example above, pass it as an argument at creation time (it defaults to False). For example:

>>> existing_tensor.requires_grad_()
>>> existing_tensor.requires_grad
True
>>> my_tensor = torch.zeros(3, 4, requires_grad=True)
>>> my_tensor.requires_grad
True

1.4 About .data

.data was the way to get the underlying Tensor out of a Variable. After the merge, calling y = x.data still has similar semantics: y is a Tensor that shares the same data with x, is detached from x's computation history, and has requires_grad=False.

However, .data can be unsafe in some cases: any changes on x.data are not tracked by autograd, so the gradients computed in a backward pass will be silently incorrect if the original values of x were needed. A safer alternative is x.detach(), which also returns a Tensor that shares data with x and has requires_grad=False, but any in-place changes to it will be reported by autograd if x is needed in the backward pass.

Below is an example of the difference between x.data and x.detach() (and why we recommend using detach in general).

If you use Tensor.detach(), the gradient computation is guaranteed to be correct:

>>> a = torch.tensor([1, 2, 3.], requires_grad=True)
>>> out = a.sigmoid()
>>> c = out.detach()
>>> c.zero_()
tensor([ 0.,  0.,  0.])

>>> out  # modified by c.zero_() !!
tensor([ 0.,  0.,  0.])

>>> out.sum().backward()  # Requires the original value of out, but that was overwritten by c.zero_()
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
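For contrast, a sketch of the same sequence using .data instead of detach(): the in-place zeroing goes unnoticed by autograd, so backward() succeeds but computes a wrong gradient, because sigmoid's backward uses the saved output out * (1 - out), which is now all zeros.

```python
import torch

a = torch.tensor([1., 2., 3.], requires_grad=True)
out = a.sigmoid()

c = out.data       # shares storage with out, but autograd does not know about c
c.zero_()          # silently zeroes out in-place; no error is raised

out.sum().backward()  # succeeds, but uses the corrupted value of out
print(a.grad)         # tensor([0., 0., 0.]) -- silently wrong
```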

posted @ 2018-12-24 17:57  DreamBoy_张亚飞