# PyTorch Tutorials – Understanding and Implementing ResNet

## Deep Convolution Neural Network

In a previous article we saw how a simple convolutional neural network works. A deep convolutional neural network is a network with many hidden layers, for example AlexNet, which consists of 8 layers where the first 5 are convolutional and the last 3 are fully connected, or VGG16, which stacks 16 weight layers.

The problem with these deep networks is that as we add layers we start seeing a degradation problem: as the depth of the network increases, accuracy saturates and then degrades rapidly. During back-propagation, the repeated multiplication of small derivatives makes the gradient reaching the early layers vanishingly small (or, in the opposite case, explosively large), which causes this degradation. This is often called the vanishing/exploding gradient problem.
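A toy illustration of the vanishing part (not the article's model): back-propagating through many sigmoid layers multiplies the gradient by the sigmoid's derivative, which is at most 0.25, at every layer, so the signal reaching the earliest layers shrinks exponentially with depth.

```python
import math

def sigmoid_derivative(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

grad = 1.0
for layer in range(30):              # 30 stacked sigmoid layers
    grad *= sigmoid_derivative(0.0)  # best case: the derivative is 0.25 at z = 0
print(grad)  # ~8.7e-19 -- effectively zero by the time it reaches the first layer
```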

## ResNet (Residual Network)

ResNet solves this degradation problem with skip (or shortcut) connections. Consider an input x passed through a stack of neural network layers to produce f(x); this f(x) is then added to the original input x, so the output of the block is:

H(x) = f(x) + x

So, instead of directly mapping x -> y with a function H(x), we define a residual function f(x) = H(x) – x, which can be reframed as H(x) = f(x) + x, where f(x) represents the stack of non-linear layers and x represents the identity. If the identity mapping is optimal, the network can easily make f(x) = 0 simply by driving the weights to 0. This f(x) is what the authors call the residual function.

This mapping ensures that a deeper network will perform at least as well as its shallower counterpart, and not worse.
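The "f(x) = 0 gives the identity" argument can be checked directly. Below, a hypothetical two-layer residual function f has all of its weights and biases zeroed, so the block reduces to H(x) = x:

```python
import torch
import torch.nn as nn

# Hypothetical residual function f(x); with every parameter zeroed, f(x) = 0.
f = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 4))
for p in f.parameters():
    nn.init.zeros_(p)

x = torch.randn(2, 4)
h = f(x) + x              # H(x) = f(x) + x
print(torch.equal(h, x))  # True: the block implements the identity "for free"
```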

## Implementing ResNet

Now let’s implement the model. Here I will be using ResNet18, which consists of 18 layers. The dataset is Dogs vs. Cats, which I downloaded from the Kaggle website. Our model will classify images of dogs and cats.

### Architecture

In the above diagram we first take an input image with 3 channels (RGB), pass it to a convolutional layer with kernel_size = 3, and get a 64-channel output. The convolution blocks between the curved arrows represent a residual block, which consists of:

convolution layer -> Batch Normalization -> ReLU activation -> convolution layer -> Batch Normalization.

The output of the residual block is then added to the block's original input (i.e. x). The sum is then passed through a ReLU activation on its way to the next layer.

The dotted arrow indicates that the output dimensions of the residual block have changed, so we also have to change the dimensions of the input x that is added to it, because the addition is only possible if the dimensions match.
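The standard fix for this mismatch is a 1×1 convolution on the skip path. A small sketch (the sizes 64 → 128 at stride 2 are just an illustration):

```python
import torch
import torch.nn as nn

# When a block halves the spatial size (stride 2) and doubles the channels,
# x must be projected with a 1x1 convolution before the addition can happen.
x = torch.randn(1, 64, 56, 56)                                   # input to the block
f = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1, bias=False)
proj = nn.Conv2d(64, 128, kernel_size=1, stride=2, bias=False)   # the dotted arrow

out = f(x) + proj(x)   # shapes now match
print(out.shape)       # torch.Size([1, 128, 28, 28])
```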

The last layer of this architecture is a linear layer, which takes that output and tells us whether the image is a dog or a cat.

### Code

Let’s first import the necessary libraries:

```python
from PIL import Image
import torch.optim as optim
from tqdm import tqdm
from torchvision import transforms
import torch.nn.functional as F
import torch.nn as nn
import torchvision.datasets as dt
import torch
import os

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

PREPROCESS = transforms.Compose([transforms.Resize(256),
                                 transforms.CenterCrop(224),
                                 transforms.ToTensor(),
                                 transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                                      std=[0.229, 0.224, 0.225])])
```

PyTorch provides a very handy `transforms` module for modifying and transforming images. `transforms.Compose` is used to chain different transformations together, building a transformation pipeline.

Now let’s get our dataset:

```python
def get_dataset(train=True):
    # Wrap the ImageFolder dataset in a DataLoader so we can iterate in
    # batches (the batch size of 32 here is an arbitrary choice).
    if train:
        dataset = dt.ImageFolder(root="./train/", transform=PREPROCESS)
    else:
        dataset = dt.ImageFolder(root="./test/", transform=PREPROCESS)
    return torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=train)
```

Next let’s write our Residual Block:

```python
class ResidualBlock(nn.Module):
    expansion = 1

    def __init__(self, inchannel, outchannel, stride=1):
        super(ResidualBlock, self).__init__()
        # First conv may downsample (stride > 1) and/or change the channel count.
        self.conv1 = nn.Sequential(
            nn.Conv2d(inchannel, outchannel, kernel_size=3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(outchannel),
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(outchannel, outchannel, kernel_size=3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(outchannel),
        )
        # Identity by default; a 1x1 projection when the dimensions change.
        self.skip = nn.Sequential()
        if stride != 1 or inchannel != self.expansion * outchannel:
            self.skip = nn.Sequential(
                nn.Conv2d(inchannel, self.expansion * outchannel, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(self.expansion * outchannel),
            )

    def forward(self, X):
        out = F.relu(self.conv1(X))
        out = self.conv2(out)
        out += self.skip(X)
        out = F.relu(out)
        return out
```

In the last article I explained why we subclass `nn.Module`, so I am going to skip that part here.

We have created two convolutional layers, `self.conv1` and `self.conv2`, just like in the diagram. `self.skip` is our shortcut branch, whose output is added to the output of `self.conv2`.

The `if` in the `__init__()` method checks whether the output dimensions of `self.conv2` will differ from the block's input. If they do, we project the input to the new dimensions with a 1×1 `nn.Conv2d` layer. The `forward()` method then simply describes how the data flows through the block.
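The shape logic can be sanity-checked with a condensed, self-contained restatement of the block (the sizes 64 → 64 and 64 → 128 here are just illustrations):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    expansion = 1
    def __init__(self, inchannel, outchannel, stride=1):
        super().__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(inchannel, outchannel, 3, stride, 1, bias=False),
            nn.BatchNorm2d(outchannel))
        self.conv2 = nn.Sequential(
            nn.Conv2d(outchannel, outchannel, 3, 1, 1, bias=False),
            nn.BatchNorm2d(outchannel))
        self.skip = nn.Sequential()  # identity unless the dimensions change
        if stride != 1 or inchannel != self.expansion * outchannel:
            self.skip = nn.Sequential(
                nn.Conv2d(inchannel, self.expansion * outchannel, 1, stride, bias=False),
                nn.BatchNorm2d(self.expansion * outchannel))
    def forward(self, x):
        out = self.conv2(F.relu(self.conv1(x)))
        return F.relu(out + self.skip(x))

same = ResidualBlock(64, 64, stride=1)   # identity skip: nn.Sequential() is a no-op
down = ResidualBlock(64, 128, stride=2)  # projected skip: 1x1 conv + BatchNorm
x = torch.randn(1, 64, 56, 56)
print(same(x).shape)  # torch.Size([1, 64, 56, 56])
print(down(x).shape)  # torch.Size([1, 128, 28, 28])
```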

Now let’s write our Model class or ResNet class:

```python
class Model(nn.Module):
    def __init__(self, ResidualBlock, num_classes):
        super(Model, self).__init__()
        self.inchannel = 64
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(64),
        )
        self.layer1 = self.make_layer(ResidualBlock, 64,  2, stride=1)
        self.layer2 = self.make_layer(ResidualBlock, 128, 2, stride=2)
        self.layer3 = self.make_layer(ResidualBlock, 256, 2, stride=2)
        self.layer4 = self.make_layer(ResidualBlock, 512, 2, stride=2)
        self.fc = nn.Linear(512 * ResidualBlock.expansion, num_classes)

    def make_layer(self, block, channels, num_blocks, stride):
        # Only the first block in a layer downsamples; the rest use stride 1.
        strides = [stride] + [1] * (num_blocks - 1)
        layers = []
        for stride in strides:
            layers.append(block(self.inchannel, channels, stride))
            self.inchannel = channels * block.expansion
        return nn.Sequential(*layers)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        # Average over the remaining spatial dimensions: one value per channel.
        out = F.avg_pool2d(out, out.size()[2:])
        out = torch.flatten(out, 1)
        out = self.fc(out)
        return out
```

In the `__init__()` method, `self.conv1` is the layer that takes our 3-channel (RGB) input image and produces 64 output channels. We then create 4 layers using the `make_layer` method, each consisting of 2 `ResidualBlock`s. The last layer, `self.fc`, is our `Linear` layer, which outputs whether the image is a dog or a cat.
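The `strides` list inside `make_layer` is worth a quick look on its own: only the first block of each layer downsamples, and the rest preserve the feature-map size.

```python
# For layer2 of the model above: 2 blocks, first with stride 2.
num_blocks, stride = 2, 2
strides = [stride] + [1] * (num_blocks - 1)
print(strides)  # [2, 1] -- the first block halves the feature map, the second keeps it
```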

In the `forward` method, before passing the output to the `self.fc` layer, we first `flatten` (reshape) each feature map into a 1-D vector.
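A minimal illustration of that reshape, using the shapes produced by the model above:

```python
import torch

# After the average pool each image is a (512, 1, 1) volume; flattening from
# dim 1 onward turns the batch into (batch_size, 512) for the Linear layer.
out = torch.randn(8, 512, 1, 1)
flat = torch.flatten(out, 1)
print(flat.shape)  # torch.Size([8, 512])
```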

Now let’s define our loss function and optimizer:

```python
if __name__ == '__main__':
    resnet = Model(ResidualBlock, num_classes=2)
    if torch.cuda.is_available():
        resnet.cuda()
    print(resnet)

    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(resnet.parameters(), lr=0.01)
```

I have used `CrossEntropyLoss()` as the loss function and `SGD()` as the optimizer.
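To make the loss concrete, here is `CrossEntropyLoss` applied to some made-up logits for a batch of 4 images over our 2 classes (the class indices, cat = 0 and dog = 1, are just an assumption for the example):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Dummy model outputs (logits) and true labels for 4 images.
logits = torch.tensor([[2.0, -1.0], [0.5, 1.5], [3.0, 0.0], [-1.0, 2.0]])
target = torch.tensor([0, 1, 0, 1])

loss = criterion(logits, target)
print(loss.item())  # a small positive number; 0 would mean perfectly confident predictions
```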

### Training

Let’s train our model:

```python
    train = get_dataset(train=True)

    for epoch in tqdm(range(10)):
        for i, (images, target) in enumerate(train):
            images = images.to(device)
            target = target.to(device)

            out = resnet(images)
            loss = criterion(out, target)

            # Back-propagation
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            _, pred = torch.max(out.data, 1)
            correct = (pred == target).sum().item()

            if i % 100 == 0:
                torch.save(resnet.state_dict(), "model")
                print(f"epoch: {epoch}\tloss: {loss.data}\tAccuracy: {(correct / target.size(0)) * 100}%")
```

I have trained the model for 10 epochs. The `optimizer.zero_grad()` method resets the gradients to 0 before each step. Next, we call `backward()` on our loss variable to perform back-propagation. Once the gradients have been calculated, we update the model's weights with the `optimizer.step()` method.
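The zero_grad → backward → step cycle can be seen in isolation on a toy model (a single `nn.Linear`, standing in for the full ResNet):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

x, y = torch.randn(8, 4), torch.randint(0, 2, (8,))

optimizer.zero_grad()            # clear gradients left over from the previous step
loss = criterion(model(x), y)
loss.backward()                  # compute gradients for every parameter
optimizer.step()                 # update the weights using those gradients
print(model.weight.grad.shape)   # torch.Size([2, 4]) -- one gradient per weight
```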

### Testing

```python
    test = get_dataset(train=False)

    correct = 0
    total = 0
    with torch.no_grad():
        for i, (images, target) in tqdm(enumerate(test)):
            images = images.to(device)
            target = target.to(device)

            out = resnet(images)
            _, pred = torch.max(out.data, 1)
            total += target.size(0)
            correct += (pred == target).sum().item()
    print(f"Accuracy: {(correct / total) * 100}")
```

Since we don’t need to compute gradients while testing the model, we wrap the loop in `torch.no_grad()`. The rest is the same as training.
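The effect of `torch.no_grad()` is easy to verify: inside the context, autograd stops recording, so the output carries no gradient history.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
x = torch.randn(1, 4)

# Autograd is disabled inside the context, making inference cheaper.
with torch.no_grad():
    out = model(x)
print(out.requires_grad)  # False -- no computation graph was built
```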

After 10 epochs I got an accuracy of 93.23%.
