What Is Generative Pre-Trained Transformer 3 (GPT-3)?

Does it seem uncanny that artificial intelligence (AI) can mimic human-written text?

Well, as uncanny as it sounds, it is real. OpenAI, a research firm co-founded by Elon Musk, has introduced perhaps one of the most advanced and useful forms of AI to date: GPT-3.

Simply put, GPT-3 stands for Generative Pre-Trained Transformer 3. The model generates text using parameters learned during pre-training. To be precise, it was fed nearly 570 GB of text gathered by crawling the internet – a public dataset known as Common Crawl – along with curated texts selected by OpenAI and even Wikipedia.

With the introduction of this tool, 2020 looked like a tipping point for AI. Now that we have entered 2021, the technology is poised to power a wave of new startups and applications. Similar models have already demonstrated what AI can do in the hands of people willing to experiment, and the results of those experiments are astonishing.

The model was trained on hundreds of billions of words and is a 175-billion-parameter transformer – the third generation of OpenAI's GPT series. It is extraordinary in its capability to generate human-like text, and even human-like responses when needed. Strange as it sounds, when a user prompts the model with text, it can respond in the form of tweets, emails, and more.
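As a concrete illustration of what "prompting" means in practice, here is a minimal sketch of the request one would send to a GPT-3-style completion API. The field names follow OpenAI's public Completions API; the prompt text and settings are purely illustrative, and no request is actually sent (that would require an API key).

```python
# Sketch: assemble the JSON payload that would be POSTed to a GPT-3-style
# completions endpoint (e.g. https://api.openai.com/v1/completions).
# The prompt and parameter values below are illustrative assumptions.

def build_completion_request(prompt: str,
                             model: str = "davinci",
                             max_tokens: int = 60,
                             temperature: float = 0.7) -> dict:
    """Build (but do not send) a completion request payload."""
    return {
        "model": model,
        "prompt": prompt,              # the text the model continues
        "max_tokens": max_tokens,      # upper bound on generated tokens
        "temperature": temperature,    # higher values -> more varied text
    }

payload = build_completion_request(
    "Write a short thank-you email to a colleague:"
)
print(payload["model"])
```

The model simply continues the prompt, which is why the same mechanism can produce tweets, emails, or essays: only the prompt changes.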

Did you know that this model is the largest AI language model to date, with 175 billion parameters – roughly 10x more than Microsoft's Turing NLG? If you didn't, now you do.

OpenAI has been in this race for quite some time now, so it is worth looking at the model's features, advantages, and even its shortcomings.

GPT-3 uses a transformer-based architecture that is largely the same as that of its predecessor, GPT-2. The authors trained it at several model sizes, ranging from 125 million to 175 billion parameters, to measure the correlation between model size and benchmark performance.
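To see where headline figures like "175 billion parameters" come from, one can use the standard back-of-the-envelope formula for decoder-only transformers: roughly 12 × n_layers × d_model² weights in the attention and feed-forward blocks, ignoring embeddings and biases. The layer counts and widths below are those reported for the largest GPT-3 model (96 layers, width 12288) and the largest GPT-2 model (48 layers, width 1600); the formula itself is an approximation, not OpenAI's exact accounting.

```python
# Rough transformer parameter count: ~12 * n_layers * d_model**2,
# counting attention and feed-forward weights but not embeddings or
# biases (which matter more for small models).

def approx_params(n_layers: int, d_model: int) -> int:
    """Approximate parameter count of a decoder-only transformer."""
    return 12 * n_layers * d_model ** 2

gpt3 = approx_params(n_layers=96, d_model=12288)  # ~1.74e11, i.e. ~175B
gpt2 = approx_params(n_layers=48, d_model=1600)   # ~1.47e9, i.e. ~1.5B
print(f"GPT-3 ~ {gpt3 / 1e9:.0f}B, GPT-2 ~ {gpt2 / 1e9:.1f}B")
```

The estimate lands within about 1% of the quoted 175B and 1.5B figures, which shows that most of the growth from GPT-2 to GPT-3 comes from doubling the depth and widening each layer by roughly 8x.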

GPT-3, powered by a neural network, can write articles, poems, and essays, and can even work with code – which is why the world is excited about it, even as many fear the conundrums such AI raises.

GPT-3 Use Cases

To be precise, GPT-2 had nearly 1.5 billion parameters while GPT-3 has 175 billion. In short, a Ferrari will always be a Ferrari – and there is little excitement in watching Apple upgrade the iPhone camera from 12 MP to 14 MP. Moreover, even before GPT-3, transformer technology was already performing well; 2018, for instance, was dubbed NLP's "ImageNet moment."

As a report in Forbes put it, "in October 2012, a deep neural network achieved an error rate of only 16% in the ImageNet Large Scale Visual Recognition Challenge, a significant improvement over the 25% error rate achieved by the best entry the year before."

In a nutshell, once the error rate dropped from 25 percent to 16 percent, machine learning became the most viable tool for image classification, since no other approach offered such a low error rate.

The rise of this tool caught the public's attention in part because it was well marketed. When was the last time the public paid any attention to a language model? Perhaps only when GPT-2 was introduced in February 2019.

Although we continue to see models released by Google, Uber, Microsoft, Salesforce, and others, transformer technology still beats them all, especially in the field of natural language processing (NLP).