Abstract:
import torch

x = torch.tensor(2.0)
x.requires_grad_(True)
y = 2 * x
z = 5 * x
w = y + z.detach()
w.backward()
print(x.grad)   # => tensor(2.)

The gradient of x should have been 7, but detach() cuts gradient propagation along the z branch, so the 5 never reaches x.grad. Read more
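For contrast, a minimal sketch (same setup, just without detach()) in which both branches stay in the graph and the gradient comes out as 7:

import torch

x = torch.tensor(2.0, requires_grad=True)
y = 2 * x
z = 5 * x
w = y + z          # no detach(): gradient flows through both y and z
w.backward()
print(x.grad)      # => tensor(7.)  (2 from the y branch + 5 from the z branch)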
Abstract:
def step(self):
    "Update parameters and rate"
    self._step += 1
    rate = self.rate()
    for p in self.optimizer.param_groups:
        p['lr'] = rate
    self._rate = rate
Read more
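The abstract only shows step(); for context, here is a minimal sketch of the surrounding wrapper in the style of the Annotated Transformer's NoamOpt. The constructor arguments (model_size, factor, warmup), the rate() formula, and the trailing optimizer.step() call are assumptions, not shown in the abstract above.

import torch

class NoamOpt:
    "Sketch of an optimizer wrapper that recomputes the learning rate on every step."
    def __init__(self, model_size, factor, warmup, optimizer):
        self.optimizer = optimizer
        self._step = 0
        self.warmup = warmup          # assumed: number of warmup steps
        self.factor = factor          # assumed: overall scale factor
        self.model_size = model_size  # assumed: model dimension d_model
        self._rate = 0

    def step(self):
        "Update parameters and rate"
        self._step += 1
        rate = self.rate()
        for p in self.optimizer.param_groups:
            p['lr'] = rate
        self._rate = rate
        self.optimizer.step()         # assumed: the wrapped optimizer is stepped after the lr update

    def rate(self, step=None):
        "Assumed Noam schedule: linear warmup, then inverse-square-root decay."
        if step is None:
            step = self._step
        return self.factor * (self.model_size ** (-0.5) *
                              min(step ** (-0.5), step * self.warmup ** (-1.5)))

Under these assumptions a typical instantiation would be NoamOpt(512, 2, 4000, torch.optim.Adam(model.parameters(), lr=0, betas=(0.9, 0.98), eps=1e-9)), so the lr passed to Adam is a placeholder that step() overwrites on every call.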