Coding Self-Attention Transformer Networks in PyTorch for Question Classification

Hi everyone,
I just published a series of blog posts on self-attention transformers. The series walks through coding a self-attention transformer in PyTorch and then using that model for question classification. The classification problem has two label categories, each with a different number of classes. The posts also explain two different ways to solve the classification problem. Please have a look at the series here:

Part - 1: https://thevatsalsaglani.medium.com/question-classification-using-self-attention-transformer-part-1-33e990636e76