TRANSCRIPT
ELEG5491: Introduction to Deep Learning
PyTorch Basics Tutorial
GUO Xiaoyang
4 Feb 2021
PyTorch Installation
• Installation: https://pytorch.org/get-started/locally/
• Pip: pip install torch torchvision torchaudio
• Conda: conda install pytorch torchvision torchaudio -c pytorch
• Google Colab: a Jupyter-notebook-like environment (free GPU to use)
• https://colab.research.google.com/
• https://colab.research.google.com/notebooks/intro.ipynb
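After installing, a quick sanity check (a standard snippet, not from the slides) confirms the version and whether a GPU is visible:
import torch
print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True if a CUDA GPU is usable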
Tensors
A torch.Tensor is a multi-dimensional matrix containing elements of a single data type.
# How to construct a new tensor?
# See: https://pytorch.org/docs/stable/tensors.html
import torch
import numpy as np
# Construct a 2x3 matrix, uninitialized
x1 = torch.empty(2, 3)
# Construct a randomly initialized matrix
x2 = torch.rand(2, 3)
# Construct a matrix filled with zeros, of dtype long
x3 = torch.zeros(3, 2, dtype=torch.long)
# Construct a tensor directly from data
x4 = torch.tensor([5.5, 3])
# Construct a tensor directly from numpy array
x5 = torch.from_numpy(np.array([5.5, 3], dtype=np.float32))
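A tensor's shape, dtype, and device can be inspected directly; the snippet below is a small sketch (not from the slides) that also shows the *_like creation ops.
print(x5.shape)    # torch.Size([2])
print(x5.dtype)    # torch.float32
print(x5.device)   # cpu
# *_like ops take the shape (and by default the dtype) from an existing tensor
x6 = torch.zeros_like(x2)                    # 2x3 matrix of zeros
x7 = torch.rand_like(x3, dtype=torch.float)  # 3x2 random matrix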
Computational Graph
[Figure: computational graph illustration. Image from https://blog.paperspace.com/pytorch-101-understanding-graphs-and-automatic-differentiation/]
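PyTorch builds this graph dynamically as operations execute: each result tensor records the operation that produced it in its grad_fn attribute. A minimal sketch (not from the slides):
import torch
a = torch.rand(3, requires_grad=True)
b = a * 2          # adds a multiplication node to the graph
c = b.sum()        # adds a sum node
print(b.grad_fn)   # <MulBackward0 ...>
print(c.grad_fn)   # <SumBackward0 ...>
# c.backward() walks this graph from c back to the leaf a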
Basic operations
# Pointwise Ops
abs, acos, add, asin, bitwise_or/and/not, ceil, clamp, exp …
# Reduction Ops
argmax/min, mean, median, unique, prod, norm …
# Comparison Ops
allclose, argsort, equal, greater, isnan, isfinite, topk …
# BLAS Ops
dot, eig, det, matmul, qr, svd
Basic operations
# Basic operations
x = torch.eye(3) * 10
y = torch.full((3, 4), 2.5)
z = torch.rand(3)
print("x", x)
print("y", y)
print("z", z)
# addition
# broadcast rule: https://pytorch.org/docs/stable/notes/broadcasting.html
print("x + z:", x + z) # [3, 3] + [3]
# matmul
print("x @ y:", x @ y)
print("x @ x^-1:", x @ torch.inverse(x))
Device
# transfer between devices
x = torch.rand(3)
print(x.device)
# transfer to gpu
x_gpu1 = x.cuda()
x_gpu2 = x.to("cuda")
print("to gpu:", x_gpu1.device, x_gpu2.device)
# transfer to cpu
x_cpu1 = x_gpu1.cpu()
x_cpu2 = x_gpu1.to("cpu")
print("to cpu:", x_cpu1.device, x_cpu2.device)
# transfer to device
device = "cuda" if torch.cuda.is_available() else "cpu"
x_device = x.to(device)
print("to device:", x_device.device)
# convert tensor to numpy array (only for cpu tensor)
print("to numpy: ", type(x_cpu1.numpy()))
Visualize gradient
import torch
# create a tensor which requires gradient
x = torch.ones(2, 2, requires_grad=True)
# or
x = torch.ones(2, 2)
x.requires_grad_(True) # in-place
# Do some operations:
y = x + 2
z = y * y * 3
out = z.mean()
# Let’s backpropagate:
out.backward()
print(x.grad)
# By default, gradients are only retained for leaf variables.
# However, y is an intermediate variable, so its gradient is not kept.
print(y.grad)  # None
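For reference (a quick derivation, not on the slide): out = 3 · mean((x + 2)^2), so ∂out/∂x_ij = 6 (x_ij + 2) / 4 = 4.5 at x_ij = 1, which matches the printed x.grad. A one-line check, continuing the code above:
expected = 6 * (torch.ones(2, 2) + 2) / 4   # all entries 4.5
print(torch.allclose(x.grad, expected))     # True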
Visualize gradient
# print intermediate gradient
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y ** 2
out = z.sum() # sum(([1, 1; 1, 1] + 2) ^ 2)
# register a hook which will be called in backpropagation
y.register_hook(lambda grad: print("gradient of y is", grad))
out.backward()
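Two related details (standard PyTorch APIs, though not shown on the slide): register_hook returns a handle that can be removed, and retain_grad() is an alternative way to keep a non-leaf tensor's gradient:
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
y.retain_grad()   # keep y's gradient even though y is not a leaf
handle = y.register_hook(lambda grad: print("gradient of y is", grad))
(y ** 2).sum().backward()
print(y.grad)     # same values the hook printed: 2 * y = all 6s
handle.remove()   # detach the hook once it is no longer needed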
Avoid gradient backpropagation
# How to avoid backpropagation?
# method 1: use detach
x = torch.ones(2, 2, requires_grad=True)
x = x.detach()  # returns a new tensor detached from the computational graph
print(x.requires_grad)  # False
# method 2: use no_grad context manager
x = torch.ones(2, 2, requires_grad=True)
y = torch.full([2, 2], 2., requires_grad=True)
with torch.no_grad():
    z = x ** 2  # computed without tracking gradients w.r.t. x
z = z + y       # outside the context: gradients flow to y
z.sum().backward()
print(x.requires_grad, y.requires_grad, z.requires_grad)  # True True True
print("x's grad: ", x.grad)  # None: the no_grad block cut x out of the graph
print("y's grad: ", y.grad)  # all ones
Commonly used APIs
# basic operations, cat/view/sin/cos/add/reshape
import torch
# neural networks modules, Linear/ConvXd/MaxPoolxd/BatchNorm/ReLU/Loss
import torch.nn as nn
# neural network functional interface: layers, activation functions (not modules)
import torch.nn.functional as F
# optimizers, e.g. Adam, SGD
import torch.optim as optim
nn.Module
# nn.Module is the base class for all neural network modules.
# Modules can also contain other Modules, allowing them to be nested
# in a tree structure. You can assign the submodules as regular attributes:
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # submodules which contain neural network parameters
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        # neural network operations to build the computational graph
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))
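A quick usage sketch (not from the slides; the 28x28 input size is an illustrative assumption): calling the module invokes forward, and parameters of the assigned submodules are registered automatically.
model = Model()
out = model(torch.randn(1, 1, 28, 28))   # one 1-channel 28x28 image
print(out.shape)                         # torch.Size([1, 20, 20, 20])
print(sum(p.numel() for p in model.parameters()))  # total parameter count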
MNIST Classification Example
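The example itself appeared as code screenshots on the slides. Below is a minimal sketch of what such an MNIST training loop typically looks like, assuming torchvision for the dataset; the network, hyperparameters, and data path are illustrative assumptions, not the slides' exact code.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(x.size(0), -1)        # flatten images to vectors
        x = F.relu(self.fc1(x))
        return self.fc2(x)               # logits for 10 classes

device = "cuda" if torch.cuda.is_available() else "cpu"
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST("./data", train=True, download=True,
                   transform=transforms.ToTensor()),
    batch_size=64, shuffle=True)

model = Net().to(device)
optimizer = optim.SGD(model.parameters(), lr=0.01)

model.train()
for epoch in range(2):
    for data, target in train_loader:
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()            # clear old gradients
        output = model(data)             # forward pass
        loss = F.cross_entropy(output, target)
        loss.backward()                  # backpropagate
        optimizer.step()                 # update the weights
    print("epoch", epoch, "last batch loss", loss.item())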
Summary
How to create the computational graph?
1. Define the neural network that has some learnable parameters/weights (nn.Module)
2. Process input through the network (defined in forward function)
3. Compute the loss (how far is the output from being correct)
4. Propagate gradients back into the network’s parameters (loss.backward)
5. Update the weights of the network, typically using a simple update rule: weight = weight - learning_rate * gradient (SGD) (optimizer.step())
6. Repeat from step 2
Google Colab
• If you do not have GPUs, use Google Colab
• https://colab.research.google.com/
• Colab allows you to write and execute Python in your browser, with:
• Zero configuration required
• Free access to GPUs
• Easy sharing
• It is a Jupyter-notebook-like environment that allows:
• Simple debugging
• Easy visualization
Remember to change the runtime type to GPU for faster training
Copy the example and run it in Colab
Resources
• Tutorials: https://github.com/pytorch/tutorials
• Examples: https://github.com/pytorch/examples
• Docs: http://pytorch.org/docs/
• Cheat Sheet: https://pytorch.org/tutorials/beginner/ptcheat.html
• Discussions: https://discuss.pytorch.org/
Questions?