The power of neural networks is no mystery. As data and compute keep growing, the tools we use for deep learning matter more and more, and it is important to pick the right one. PyTorch is not everyone’s cup of tea, but it checks many of the boxes for building capable neural networks out of the box.
PyTorch is a library developed by Facebook for GPU-accelerated tensor computation. It is heavily optimized for these kinds of tensor operations (much as NumPy is optimized for mathematical operations on arrays) and can run seamlessly on a GPU for vastly faster training. Facebook knows that PyTorch users will likely also use NumPy, so the library provides easy ways to convert between the two.
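For instance, converting in either direction is a one-liner (a minimal sketch; the array values are arbitrary):

import numpy as np
import torch

# Create a NumPy array and convert it to a PyTorch tensor
arr = np.array([1.0, 2.0, 3.0])
tensor = torch.from_numpy(arr)  # shares memory with the original array

# Convert the tensor back to a NumPy array
back_to_numpy = tensor.numpy()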
Background
At its core, PyTorch operates on tensors, which makes them the most basic data type in PyTorch; a tensor is very similar to the ndarray data type in NumPy. As tensors get passed between layers, PyTorch's gradient machinery keeps track of the operations applied to each tensor so that gradients can easily be computed later.
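A minimal sketch of this gradient tracking, known as autograd (the values here are arbitrary):

import torch

# requires_grad=True tells autograd to track operations on this tensor
x = torch.tensor([2.0, 3.0], requires_grad=True)

# Build a small computation; autograd records the graph as we go
y = (x ** 2).sum()

# Backpropagate to populate x.grad with dy/dx (here, 2 * x)
y.backward()
print(x.grad)  # tensor([4., 6.])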
Sequential models describe a standard feed-forward network in which the output of each layer serves as the input to the next layer.
Functional layers are the alternative to sequential layers and are available in PyTorch as well as in other deep learning libraries such as Keras. A functional layer can be connected to any number of other layers in the network, not just the immediately previous and next ones. Models with functional architectures can even share layers.
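A sketch of such an architecture in PyTorch, subclassing nn.Module so the forward pass can wire a skip connection (the layer sizes are illustrative):

import torch
import torch.nn as nn

class SkipNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(16, 16)
        self.fc2 = nn.Linear(16, 16)

    def forward(self, x):
        # The output of fc1 feeds fc2 but is also added back in,
        # so fc1 connects to more than just the next layer
        h = torch.relu(self.fc1(x))
        return self.fc2(h) + h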
Highly connected models are prone to overfitting, because a network with enough parameters can effectively memorize its training data. Dropout has become a commonplace remedy in deep learning: rather than keeping every neuron fully connected to the next layer, some connections are intentionally dropped during training so the network cannot rely on any single pathway and overfit.
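In PyTorch this is provided by the nn.Dropout module, which randomly zeroes a fraction of its inputs during training (p=0.5 below is just a common choice):

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)  # each element is zeroed with probability 0.5

x = torch.ones(4)
drop.train()    # dropout is active in training mode
print(drop(x))  # roughly half the elements zeroed, survivors scaled by 1/(1-p)
drop.eval()     # dropout is a no-op in evaluation mode
print(drop(x))  # tensor([1., 1., 1., 1.])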
Implementing with PyTorch
One of the basic building blocks in PyTorch is the module. Each module typically defines a layer of a model and subclasses torch.nn.Module, the base class for all of the different types of neural networks possible in PyTorch.
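A minimal custom module might look like this (the layer and sizes are illustrative):

import torch.nn as nn

class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()  # initialize the nn.Module machinery
        self.linear = nn.Linear(8, 2)

    def forward(self, x):
        # forward() defines how input tensors flow through the module
        return self.linear(x)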
Implementing a sequential neural network is quite simple. As described above, a sequential model simply involves stacking different layers on top of each other.
In Torch, there exists a container that can be initialized with:
torch.nn.Sequential(*args)
The *args argument to the Sequential container should be different Torch modules, such as 2-D convolutional layers. Typically, an activation function such as ReLU follows each convolutional layer.
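The example from the PyTorch documentation (lightly commented) looks like this:

import torch.nn as nn

# Layers run in the order they are listed
model = nn.Sequential(
    nn.Conv2d(1, 20, 5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU()
)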
The Sequential container can also be constructed from an ordered dictionary in order to name each of the layers so they can easily be referenced later.
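The equivalent named-layer example, again following the PyTorch documentation:

from collections import OrderedDict
import torch.nn as nn

# Named layers can be retrieved later as attributes, e.g. model.conv1
model = nn.Sequential(OrderedDict([
    ('conv1', nn.Conv2d(1, 20, 5)),
    ('relu1', nn.ReLU()),
    ('conv2', nn.Conv2d(20, 64, 5)),
    ('relu2', nn.ReLU())
]))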
Another really nice thing about the Sequential container in PyTorch is that additional modules (layers or functions) can be appended to it later on.
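For example, using nn.Module's add_module method (the name 'drop1' is arbitrary):

import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 20, 5),
    nn.ReLU()
)

# Append a dropout layer to the existing container after construction
model.add_module('drop1', nn.Dropout(p=0.5))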
In the example code above from the PyTorch documentation, a ReLU activation follows each convolutional layer to introduce non-linearity. Dropout, as appended in the last snippet, is an extremely good measure to avoid overfitting a model to your training data.
Concluding Remarks
With so many neural network libraries available, PyTorch is certainly one of the best. Adding a layer to a network is as simple as adding another line to the Sequential constructor, so implementing sequential neural networks in PyTorch is a breeze. Layers can also be connected non-sequentially by subclassing nn.Module and defining a custom forward pass.
Additional Resources
http://www.goldsborough.me/ml/ai/python/2018/02/04/20-17-20-a_promenade_of_pytorch/