PyTorch 0.4 Migration Guide

  • Tensors and Variables have merged
  • Tensors support 0-dimensional (scalar) values
  • The volatile flag has been deprecated
  • dtypes, devices, and NumPy-style creation functions
  • Writing device-agnostic code
  • New edge-case constraints on the names of submodules, parameters, and buffers in nn.Module

1. Merging the Tensor and Variable classes

torch.autograd.Variable and torch.Tensor are now the same class. More precisely, torch.Tensor can track history and perform automatic differentiation just like the old Variable; Variable still works as before, but it returns an object of type torch.Tensor. This means you no longer need the Variable wrapper in your code.
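A minimal sketch of the merged behavior (variable names are illustrative): wrapping a tensor in Variable still works, but the result is just a torch.Tensor, and autograd now operates on plain tensors directly.

```python
import torch
from torch.autograd import Variable

# Old style: wrapping in Variable still works after the merge...
x = Variable(torch.ones(2), requires_grad=True)
# ...but the result is a plain torch.Tensor, not a separate Variable class
print(type(x))          # <class 'torch.Tensor'>

# New style: autograd works on tensors directly, no wrapper needed
y = torch.ones(2, requires_grad=True)
(x * y).sum().backward()
print(x.grad)           # tensor([1., 1.])
```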

1.1 type() of a Tensor has changed

type() no longer reflects the data type of a tensor. Use isinstance() or x.type() instead:

>>> x = torch.DoubleTensor([1, 1, 1])
>>> print(type(x))  # was torch.DoubleTensor
"<class 'torch.Tensor'>"
>>> print(x.type())  # OK: 'torch.DoubleTensor'
'torch.DoubleTensor'
>>> print(isinstance(x, torch.DoubleTensor))  # OK: True
True

1.2 When does autograd start tracking?

requires_grad, the central flag for autograd, is now an attribute on Tensors. Let's see how this plays out in code.
autograd uses the same rules previously applied to Variables: it starts tracking a tensor as soon as requires_grad=True is set. For example,

>>> x = torch.ones(1)  # create a tensor with requires_grad=False (default)
>>> x.requires_grad
False
>>> y = torch.ones(1)  # another tensor with requires_grad=False
>>> z = x + y
>>> # both inputs have requires_grad=False. so does the output
>>> z.requires_grad
False
>>> # then autograd won't track this computation. let's verify!
>>> z.backward()
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
>>>
>>> # now create a tensor with requires_grad=True
>>> w = torch.ones(1, requires_grad=True)
>>> w.requires_grad
True
>>> # add to the previous result that has requires_grad=False
>>> total = w + z
>>> # the total sum now requires grad!
>>> total.requires_grad
True
>>> # autograd can compute the gradients as well
>>> total.backward()
>>> w.grad
tensor([ 1.])
>>> # and no computation is wasted to compute gradients for x, y and z, which don't require grad
>>> z.grad == x.grad == y.grad == None
True

1.3 Manipulating the requires_grad flag

Besides setting the attribute directly, you can modify the flag in place on an existing tensor with my_tensor.requires_grad_(requires_grad=True), or, as in the example above, pass it as an argument at creation time (it defaults to False). For example:

>>> existing_tensor.requires_grad_()
>>> existing_tensor.requires_grad
True
>>> my_tensor = torch.zeros(3, 4, requires_grad=True)
>>> my_tensor.requires_grad
True

1.4 About .data

.data was the way to get the underlying Tensor out of a Variable. After the merge, calling y = x.data still has similar semantics: y is a Tensor that shares the same data with x, is detached from x's computation history, and has requires_grad=False.

However, .data can be unsafe in some cases: any changes on x.data are not tracked by autograd, so the gradients computed in a backward pass will be silently incorrect if the original values of x were needed. A safer alternative is x.detach(), which also returns a Tensor that shares data with x and has requires_grad=False, but any in-place changes to it will be reported by autograd if x is needed in the backward pass.

Below is an example of the difference between x.data and x.detach() (and why we recommend using detach in general).

If you use Tensor.detach(), the gradient computation is guaranteed to be correct:

>>> a = torch.tensor([1, 2, 3.], requires_grad=True)
>>> out = a.sigmoid()
>>> c = out.detach()
>>> c.zero_()
tensor([ 0.,  0.,  0.])

>>> out  # modified by c.zero_() !!
tensor([ 0.,  0.,  0.])

>>> out.sum().backward()  # Requires the original value of out, but that was overwritten by c.zero_()
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
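For contrast, a sketch of the same sequence using .data instead of detach(): the in-place zeroing goes unnoticed by autograd, so backward() succeeds but computes a wrong gradient, because sigmoid's backward uses the saved output out * (1 - out), which is now all zeros.

```python
import torch

a = torch.tensor([1., 2., 3.], requires_grad=True)
out = a.sigmoid()

c = out.data       # shares storage with out, but autograd does not know about c
c.zero_()          # silently zeroes out in-place; no error is raised

out.sum().backward()  # succeeds, but uses the corrupted value of out
print(a.grad)         # tensor([0., 0., 0.]) -- silently wrong
```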

posted @ 2018-12-24 17:57  DreamBoy_张亚飞