A Few PyTorch Pitfalls

1.
tensor.detach() creates a tensor that shares storage with tensor but does not require grad, removing it from the computation graph.
tensor.clone() creates a copy of tensor that inherits the original tensor's requires_grad value. The copy is still part of the computation graph it came from, so gradients that flow through the copy propagate back to the original.
tensor.data returns a new tensor that shares storage with tensor, but it always has requires_grad=False, and in-place changes made through it are not tracked by autograd, which is why detach() is the safer choice.
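
A minimal runnable sketch of the three behaviors (the tensor names here are made up for illustration):

```python
import torch

a = torch.ones(3, requires_grad=True)

c = a.clone()    # new storage; inherits requires_grad=True, stays in the graph
d = a.detach()   # shares storage with a; requires_grad=False, out of the graph
t = a.data       # shares storage with a; always requires_grad=False

print(c.requires_grad, d.requires_grad, t.requires_grad)  # True False False

# clone() keeps the graph: gradients through the copy reach the original.
c.sum().backward()
print(a.grad)    # tensor([1., 1., 1.])

# detach()/.data share storage: an in-place edit silently changes a too.
d[0] = 42.0
print(a[0].item())  # 42.0
```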
2.
The gradient can be understood as a first-order approximation: f(x + Δx) ≈ f(x) + ∇f(x)·Δx, so the gradient with respect to a variable measures how much the output changes for a small change in that variable.
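
As a quick numerical check of this view (using f(x) = x² purely as an example), the gradient computed by autograd predicts f(x + Δx) up to first order:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 2            # f(x) = x^2, chosen only as an example
y.backward()          # x.grad now holds f'(x) = 2x = 4

dx = 1e-3
first_order = y.item() + x.grad.item() * dx   # f(x) + f'(x) * dx
exact = (x.item() + dx) ** 2                  # f(x + dx)
print(first_order, exact)  # 4.004 vs 4.004001 — agree up to O(dx^2)
```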

posted @ 2019-02-13 06:18  林小奚