xinyu04

Deep Learning Week1 Notes

1. Tensors

\(\text{A tensor is a generalized matrix:}\)

\(\text{an element of }\mathbb{R}^3 \text{ is a 3-dimensional vector, but it is a 1-dimensional tensor.}\)

\(\large \text{The 'dimension' of a tensor is the number of indices.}\)
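A minimal sketch of the distinction, assuming a standard PyTorch install (the tensors `v`, `m`, and `t` are made up for illustration): `dim()` counts the indices, while `shape` gives the size along each index.

```python
import torch

v = torch.tensor([1.0, 2.0, 3.0])   # an element of R^3
m = torch.zeros(2, 3)               # a 2 x 3 matrix
t = torch.zeros(4, 2, 3)            # a stack of four 2 x 3 matrices

# dim() is the number of indices; shape is the size along each index.
print(v.dim(), tuple(v.shape))   # 1 (3,)
print(m.dim(), tuple(m.shape))   # 2 (2, 3)
print(t.dim(), tuple(t.shape))   # 3 (4, 2, 3)
```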

2. PyTorch operations

@ \(\text{ corresponds to matrix/vector or matrix/matrix multiplication.}\)

* \(\text{ is component-wise product.}\)

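lstsq:\(\text{ least squares: finds } q \text{ minimizing } \lVert mq - y \rVert_2 \text{, so that } mq = y \text{ whenever an exact solution exists.}\)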

>>> import torch
>>> y = torch.randn(3)
>>> y
tensor([ 1.3663, -0.5444, -1.7488])
>>> m = torch.randn(3, 3)
>>> q = torch.linalg.lstsq(m, y).solution
>>> m @ q
tensor([ 1.3663, -0.5444, -1.7488])
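The three operations can be sketched together as follows. The matrices `A`, `B`, `M` and the seed are made up for illustration. In the square example above, `m @ q` reproduces `y` because a random square matrix is almost surely invertible. With more equations than unknowns there is in general no exact solution, and the least-squares solution instead makes the residual orthogonal to the columns of `M`.

```python
import torch

A = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
B = torch.tensor([[0.0, 1.0], [1.0, 0.0]])
v = torch.tensor([1.0, -1.0])

print(A @ B)   # matrix/matrix product: [[2., 1.], [4., 3.]]
print(A @ v)   # matrix/vector product: [-1., -1.]
print(A * B)   # component-wise product: [[0., 2.], [3., 0.]]

# Overdetermined system: 10 equations, 3 unknowns.
torch.manual_seed(0)
M = torch.randn(10, 3)
y = torch.randn(10, 1)
q = torch.linalg.lstsq(M, y).solution   # shape (3, 1)

# Normal equations: M^T (M q - y) should vanish up to rounding.
residual = M @ q - y
print((M.T @ residual).abs().max())
```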

3. Data Sharing

>>> a = torch.full((2, 3), 1)
>>> a
tensor([[1, 1, 1],
        [1, 1, 1]])
>>> b = a.view(-1)
>>> b
tensor([1, 1, 1, 1, 1, 1])
>>> a[1, 1] = 2
>>> a
tensor([[1, 1, 1],
        [1, 2, 1]])
>>> b
tensor([1, 1, 1, 1, 2, 1])
>>> b[0] = 9
>>> a
tensor([[9, 1, 1],
        [1, 2, 1]])
>>> b
tensor([9, 1, 1, 1, 2, 1])

\(\large \text{Note: many operations return a new tensor that shares the underlying storage of the original tensor, so changing the values of one also changes the other:}\) view, transpose, squeeze, unsqueeze, expand, permute.
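One way to check whether two tensors share storage is to compare `data_ptr()`, the memory address of a tensor's first element. The sketch below uses made-up tensors and contrasts two sharing operations with `clone()`, which makes an independent copy.

```python
import torch

a = torch.zeros(2, 3)
b = a.view(-1)   # view: shares storage with a
c = a.t()        # transpose: also shares storage with a
d = a.clone()    # clone: independent copy

print(b.data_ptr() == a.data_ptr())   # True
print(c.data_ptr() == a.data_ptr())   # True
print(d.data_ptr() == a.data_ptr())   # False

# A write through a is visible in b and c, but not in the clone d.
a[0, 1] = 5.0
print(b[1].item(), c[1, 0].item(), d[0, 1].item())   # 5.0 5.0 0.0
```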

4. Einstein summation convention

torch.einsum:
\(\text{Matrix multiplication:}\)

>>> p = torch.rand(2, 5)
>>> q = torch.rand(5, 4)
>>> torch.einsum('ij,jk->ik', p, q)
tensor([[2.0833, 1.1046, 1.5220, 0.4405],
        [2.1338, 1.2601, 1.4226, 0.8641]])
>>> p @ q
tensor([[2.0833, 1.1046, 1.5220, 0.4405],
        [2.1338, 1.2601, 1.4226, 0.8641]])

\(\text{Matrix-Vector product:}\)

w = torch.einsum('ij,j->i', m, v)

\(\text{Component-wise Product:}\)

# here p and q must have the same shape
m = torch.einsum('ij,ij->ij', p, q)

\(\text{Diagonal extraction:}\)

v = torch.einsum('ii->i', m)

\(\text{Trace (sum of the diagonal):}\)

t = torch.einsum('ii->', m)
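As a sanity check, each einsum pattern can be compared with the corresponding built-in operation. The tensors and the seed below are made up for illustration. Note that `'ii->i'` keeps the index `i` and therefore returns the diagonal, whereas `'ii->'` sums over it and returns the trace.

```python
import torch

torch.manual_seed(0)
p = torch.rand(2, 5)
q = torch.rand(5, 4)
r = torch.rand(2, 5)   # same shape as p, for the component-wise product
s = torch.rand(3, 3)   # square, for diagonal and trace
v = torch.rand(5)

checks = {
    "matmul":   torch.allclose(torch.einsum('ij,jk->ik', p, q), p @ q),
    "matvec":   torch.allclose(torch.einsum('ij,j->i', p, v), p @ v),
    "hadamard": torch.allclose(torch.einsum('ij,ij->ij', p, r), p * r),
    "diagonal": torch.allclose(torch.einsum('ii->i', s), torch.diagonal(s)),
    "trace":    torch.allclose(torch.einsum('ii->', s), torch.trace(s)),
}
print(checks)
```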

5. Storage

>>> x = torch.zeros(2, 4)
>>> x.storage()
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
[torch.FloatStorage of size 8]
>>> q = x.storage()
>>> q[4] = 1.0
>>> x
tensor([[ 0., 0., 0., 0.],
        [ 1., 0., 0., 0.]])
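Writing to storage element 4 changes `x[1, 0]` because a 2 x 4 tensor is laid out row by row. The sketch below (the names `row` and `sub` are made up) shows this through `storage_offset()` and `stride()`, which describe where a tensor starts in the storage and how far apart consecutive elements are along each index.

```python
import torch

x = torch.zeros(2, 4)
row = x[1]                # second row
sub = x.narrow(1, 1, 2)   # columns 1 and 2 of every row

# The second row starts 4 elements into the storage.
print(row.storage_offset(), row.stride())   # 4 (1,)
# The narrowed tensor starts at element 1 and jumps 4 elements per row.
print(sub.storage_offset(), sub.stride())   # 1 (4, 1)

row[0] = 1.0              # writes storage element 4, i.e. x[1, 0]
print(x)
```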

\(\large \text{The main idea of functions like }\)view, narrow, transpose, \(\large\text{ etc. and of operations involving broadcasting is to never replicate data in memory, but to “play” with the offsets and strides of the underlying storage.}\)
\(\text{Therefore:}\)

>>> x = torch.empty(100, 100)
>>> x.t().view(-1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: invalid argument 2: view size is not compatible with
input tensor's size and stride (at least one dimension spans across
two contiguous subspaces). Call .contiguous() before .view()

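x.t() \(\text{ shares its storage with }\)x \(\text{ but is not contiguous, so it cannot be flattened to 1D with }\)view(). \(\text{We can use }\)reshape()\(\text{ instead, which copies the data when a view is not possible.}\)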
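A small sketch of what happens to the strides, using a made-up 2 x 3 tensor so the numbers are easy to follow. The transpose only swaps the strides, `reshape()` has to copy here because no stride pattern can express the flattened transpose, and `expand()` uses a stride of 0 to repeat data without replicating it.

```python
import torch

x = torch.arange(6).view(2, 3)   # [[0, 1, 2], [3, 4, 5]]
xt = x.t()

print(x.stride(), x.is_contiguous())     # (3, 1) True
print(xt.stride(), xt.is_contiguous())   # (1, 3) False
print(xt.data_ptr() == x.data_ptr())     # True: same storage

flat = xt.reshape(-1)                    # copies, since a view is impossible
print(flat)                              # tensor([0, 3, 1, 4, 2, 5])
print(flat.data_ptr() == x.data_ptr())   # False: new storage

# Broadcasting via expand: the repeated index has stride 0.
e = torch.zeros(3, 1).expand(3, 4)
print(e.stride())                        # (1, 0)
```

`xt.contiguous().view(-1)` gives the same result as `xt.reshape(-1)` in this case.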

posted on 2022-04-27 04:10 by Blackzxy