PyTorch Learning (1)

PyTorch official site
PyTorch official tutorials
PyTorch official documentation
PyTorch Chinese documentation/tutorials
Dive into Deep Learning (PyTorch edition)

Introduction

I ran a small test and found that on CPU, PyTorch is much faster than TensorFlow. I also noticed that TensorFlow installed via conda runs faster than the pip install, while for PyTorch there is no obvious difference. I had previously seen people say that the conda build of TensorFlow is optimized, and that appears to be true.

Find the minimum of the following function:
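This is the Himmelblau function, a standard optimization test function:

f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2

All four of its minima have value 0; the runs below converge to the one at (3, 2).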

The test code (the same script is run in both environments):

import torch
import tensorflow as tf
import time
import numpy as np

def himmelblau(x):
    return (x[0]**2 + x[1] - 11)**2 + (x[0] + x[1]**2 - 7)**2

import plotly.graph_objects as go
x = np.arange(-6, 6, 0.1)
y = np.arange(-6, 6, 0.1)
# print('x,y range:', x.shape, y.shape)
X, Y = np.meshgrid(x, y)
fig = go.Figure(data=go.Surface(z=himmelblau([X,Y])))
fig.write_image('figure2.svg')
fig.write_html('first_figure.html', auto_open=True)


tic = time.time()
x = torch.tensor([0., 0.], requires_grad=True)
optimizer = torch.optim.Adam([x], lr=1e-3)
for step in range(20000):
    pred = himmelblau(x)
    optimizer.zero_grad() # clear the accumulated gradients
    pred.backward()
    optimizer.step() # each call to step() applies one parameter update using the current gradients
    
    if step % 2000 == 0:
        print('step{}: x = {}, f(x) = {}'.format(step, x.detach().numpy(), pred.item()))
toc = time.time()
print('time:',toc-tic)

tic = time.time()
x = tf.Variable([0., 0.])  # to get gradients from GradientTape, this must be a tf.Variable
optimizer = tf.optimizers.Adam(lr=1e-3)
for step in range(20000):
    with tf.GradientTape() as tape:
        tape.watch([x])
        pred = himmelblau(x)
        
    grads = tape.gradient(pred, [x])
    optimizer.apply_gradients(zip(grads, [x])) # unlike PyTorch, TF gathers all the gradients and applies them in a single call
    # x -= 0.001*grads
    
    if step % 2000 == 0:
        print('step{}: x = {}, f(x) = {}'.format(step, x.numpy(), pred.numpy()))
toc = time.time()
print('time:',toc-tic)

conda version:

step0: x = [0.001 0.001], f(x) = 170.0
step2000: x = [2.3331807 1.9540695], f(x) = 13.730916023254395
step4000: x = [2.982008 2.0270984], f(x) = 0.014858869835734367
step6000: x = [2.9999835 2.0000222], f(x) = 1.1074007488787174e-08
step8000: x = [2.9999938 2.0000083], f(x) = 1.5572823031106964e-09
step10000: x = [2.9999979 2.0000029], f(x) = 1.8189894035458565e-10
step12000: x = [2.9999993 2.000001 ], f(x) = 1.6370904631912708e-11
step14000: x = [2.9999998 2.0000002], f(x) = 1.8189894035458565e-12
step16000: x = [3. 2.], f(x) = 0.0
step18000: x = [3. 2.], f(x) = 0.0
time: 8.470422983169556
step0: x = [0.001 0.001], f(x) = 170.0
step2000: x = [2.3331852 1.9540718], f(x) = 13.730728149414062
step4000: x = [2.9820085 2.0270977], f(x) = 0.01485812570899725
step6000: x = [2.9999835 2.0000222], f(x) = 1.1074007488787174e-08
step8000: x = [2.9999938 2.0000083], f(x) = 1.5572823031106964e-09
step10000: x = [2.9999979 2.0000029], f(x) = 1.8189894035458565e-10
step12000: x = [2.9999995 2.0000007], f(x) = 9.322320693172514e-12
step14000: x = [3. 2.0000002], f(x) = 9.094947017729282e-13
step16000: x = [3. 2.], f(x) = 0.0
step18000: x = [3. 2.], f(x) = 0.0
time: 43.112674951553345

pip version:

step0: x = [0.001 0.001], f(x) = 170.0
step2000: x = [2.3331807 1.9540695], f(x) = 13.730916023254395
step4000: x = [2.982008 2.0270984], f(x) = 0.014858869835734367
step6000: x = [2.9999835 2.0000222], f(x) = 1.1074007488787174e-08
step8000: x = [2.9999938 2.0000083], f(x) = 1.5572823031106964e-09
step10000: x = [2.9999979 2.0000029], f(x) = 1.8189894035458565e-10
step12000: x = [2.9999993 2.000001 ], f(x) = 1.6370904631912708e-11
step14000: x = [2.9999998 2.0000002], f(x) = 1.8189894035458565e-12
step16000: x = [3. 2.], f(x) = 0.0
step18000: x = [3. 2.], f(x) = 0.0
time: 8.337981462478638
step0: x = [0.001 0.001], f(x) = 170.0
step2000: x = [2.3331852 1.9540718], f(x) = 13.730728149414062
step4000: x = [2.9820085 2.0270977], f(x) = 0.01485812570899725
step6000: x = [2.9999835 2.0000222], f(x) = 1.1074007488787174e-08
step8000: x = [2.9999938 2.0000083], f(x) = 1.5572823031106964e-09
step10000: x = [2.9999979 2.0000029], f(x) = 1.8189894035458565e-10
step12000: x = [2.9999995 2.0000007], f(x) = 9.322320693172514e-12
step14000: x = [3. 2.0000002], f(x) = 9.094947017729282e-13
step16000: x = [3. 2.], f(x) = 0.0
step18000: x = [3. 2.], f(x) = 0.0
time: 54.814427614212036

Installation

Create a new environment:

conda create --name torch python=3.7
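Then activate it before installing anything into it (standard conda usage):

conda activate torch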

Install some packages you may need (optional, depending on your own situation):

conda install numpy
conda install spyder
conda install jupyter notebook

Install PyTorch:

conda install pytorch torchvision cpuonly -c pytorch # CPU version

The command for the GPU version depends on your CUDA version; you can look up the install command here.
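As an illustration only (the exact command depends on your CUDA version, so use the selector on the PyTorch site), the CUDA 10.2 variant at the time of writing looked like:

conda install pytorch torchvision cudatoolkit=10.2 -c pytorch # GPU version, CUDA 10.2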

Autograd

requires_grad

If you set a tensor's attribute .requires_grad to True, all operations on that tensor will be tracked.
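Before the failing example below, here is a minimal sketch (the variable names are only illustrative) of what gets recorded when requires_grad is set:

import torch

a = torch.ones(2, requires_grad=True)
b = a * 3
print(b.requires_grad)  # True
print(b.grad_fn)        # <MulBackward0 ...>, the operation that produced b

The following example shows what happens when requires_grad is not set: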

import torch
import torch.nn.functional as F

x = torch.ones(1)
w = torch.full([1],2)  # should be: w = torch.full([1], 2, requires_grad=True)
mse = F.mse_loss(x, x+w)
torch.autograd.grad(mse, [w])

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

PyTorch builds the computation graph on the fly as each operation runs, so by the time mse is computed its graph has already been recorded without gradient tracking. If you only call the requires_grad_() method afterwards:

x = torch.ones(1)
w = torch.full([1], 2)
mse = F.mse_loss(x, x+w)
w.requires_grad_()
torch.autograd.grad(mse, [w])

it will still raise the error:

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

because the graph for mse has already been built. It has to be built again, i.e. the creation of mse must come after w.requires_grad_():

x = torch.ones(1)
w = torch.full([1], 2)
w.requires_grad_()
mse = F.mse_loss(x, x+w)
grad = torch.autograd.grad(mse, [w])  # returns a tuple containing the gradient for each listed variable
print(grad)

(tensor([4.]),)
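A small sketch (the variables a and b are only illustrative) showing that torch.autograd.grad returns one gradient per listed variable:

a = torch.tensor([1.], requires_grad=True)
b = torch.tensor([2.], requires_grad=True)
loss = (a * b + b ** 2).sum()
ga, gb = torch.autograd.grad(loss, [a, b])
print(ga, gb)  # tensor([2.]) tensor([5.]), i.e. d(loss)/da = b and d(loss)/db = a + 2*b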

backward()

You can also compute all the gradients automatically by calling .backward(). The gradients will then be accumulated into each variable's .grad attribute.

x = torch.ones(1)
w = torch.full([1], 2)
w.requires_grad_()
mse = F.mse_loss(x, x+w)
# grad = torch.autograd.grad(mse, [w]) is equivalent to the line below
mse.backward()   # returns nothing; instead it stores the gradient in each variable's .grad attribute
print(w.grad)

tensor([4.])

After backward() is called, PyTorch frees the graph's buffers, so calling backward() a second time raises an error:

RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.

To keep the graph's buffers, set retain_graph=True:

torch.autograd.grad(mse, [w], retain_graph=True)
or
mse.backward(retain_graph=True)
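A minimal sketch (assuming the imports above) showing both points at once: with retain_graph=True a second backward() succeeds, and the gradients accumulate in w.grad, which is exactly why training loops call optimizer.zero_grad():

x = torch.ones(1)
w = torch.full([1], 2.)
w.requires_grad_()
mse = F.mse_loss(x, x + w)
mse.backward(retain_graph=True)  # keep the buffers so we can backpropagate again
print(w.grad)  # tensor([4.])
mse.backward()                   # second pass; the new gradient is added to w.grad
print(w.grad)  # tensor([8.])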

detach()

To stop a tensor from having its history tracked, call .detach() to separate it from the computation history and prevent future computation on it from being tracked.
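A minimal sketch of detach() (the names are only illustrative):

a = torch.ones(1, requires_grad=True)
b = (a * 2).detach()    # b shares the same data but is cut off from the graph
print(b.requires_grad)  # False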

To avoid tracking history (and using memory) altogether, you can also wrap a block of code in with torch.no_grad():. This is especially useful when evaluating a model, because the model may have trainable parameters with requires_grad=True, yet we do not need gradients for them during evaluation.
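A minimal sketch of torch.no_grad() (standing in for real evaluation code):

a = torch.ones(1, requires_grad=True)
with torch.no_grad():
    out = a * 2           # no graph is recorded inside this block
print(out.requires_grad)  # False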
