Do you think a non-sentient computer program, trained on a vast range of formulaic data, has nothing to express? Think again.
Last summer, the dream of machine-generated prose came one step closer with GPT-3, the brainchild of OpenAI, a company backed by Elon Musk. A powerful new language generator and, at its launch, the largest neural network ever created, GPT-3 can produce, all by itself, plausible fiction and poetry.
In April, Jukka Aalho co-authored Aum Golly, a book of AI poems on humanity. The book was written in 24 hours, and the poems generated by GPT-3 were edited only for punctuation and line breaks. In September, Jonathan Copeland used the Dragon version of Latitude’s GPT-3 model to write Amazing AI Poetry. Bob The Robot, a children’s bedtime storybook with broad themes of friendship, courage, solidarity, astronomy and Greek mythology, was 80 per cent written by GPT-3.
Since its release, researchers and authors have been using GPT-3, which is trained on billions of bytes of data including e-books, news articles and Wikipedia, to write fiction and poetry, answer philosophical questions and much more.
GPT is short for Generative Pre-trained Transformer. During its training, GPT-3, the third generation of the model, learned more than 175 billion parameters (mathematical representations of patterns) from that sea of books, Wikipedia articles and other online texts. These patterns amount to a map of human language: a mathematical description of how a human pieces characters together while writing an article or coding software.
Using this map, GPT-3 can perform all sorts of language tasks. (The same deep learning techniques also power the systems that identify faces in the photos you post to Facebook and recognise the voice commands you give your iPhone.)
GPT-3 can study anything structured like a language and then perform tasks built around it. It can be trained to compose press releases and tweets. Developers are now testing its ability to summarise legal documents, suggest answers to customer-service enquiries, and run text-based role-playing games.
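Rather than being retrained for each of these jobs, a model like GPT-3 is typically steered by the text it is given. The sketch below (plain Python, with entirely hypothetical examples) shows the general shape of a "few-shot" prompt: a task description, a couple of worked examples, and a new input left for the model to complete. It illustrates the prompting idea only; it is not OpenAI's actual API.

```python
def build_few_shot_prompt(task, examples, query):
    """Assemble a few-shot prompt: a task description, worked
    examples, and the new input the model should complete."""
    lines = [task, ""]
    for source, target in examples:
        lines.append(f"Text: {source}")
        lines.append(f"Summary: {target}")
        lines.append("")
    lines.append(f"Text: {query}")
    lines.append("Summary:")  # the model continues from here
    return "\n".join(lines)

# Hypothetical customer-service enquiries used as worked examples.
prompt = build_few_shot_prompt(
    "Summarise each customer enquiry in one short phrase.",
    [("My parcel never arrived and tracking shows no update.",
      "missing delivery"),
     ("I was charged twice for the same order last week.",
      "duplicate charge")],
    "The app logs me out every time I open it.")
print(prompt)
```

Sent to the model, a prompt like this usually elicits a completion in the same pattern, which is why the same network can be nudged into summarisation, translation or role-playing without any retraining.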
Although it can handle language in surprisingly cogent ways, or even write computer code, it is prone to fits of misunderstanding. GPT-3 does not keep track of its sources and cannot provide evidence for its answers. There’s no way to tell if GPT-3 is miming trustworthy information or disinformation.
“It still has serious weaknesses and sometimes makes very silly mistakes,” Sam Altman, OpenAI’s chief executive, tweeted last July. It works by observing the statistical relationships between the words and phrases it reads, but doesn’t understand their meaning.
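A toy bigram model makes "observing the statistical relationships between words" concrete: count which word follows which in a corpus, then predict the most frequent follower. This is a drastically simplified sketch of the statistical idea, not GPT-3's actual architecture.

```python
from collections import Counter, defaultdict

# A tiny corpus; a real model trains on hundreds of billions of words.
corpus = "the cat sat on the mat the cat ate and the cat slept".split()

# Count how often each word follows each other word (bigram counts).
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def most_likely_next(word):
    """Predict the next word purely from observed frequencies."""
    return following[word].most_common(1)[0][0]

print(most_likely_next("the"))  # → cat ("cat" follows "the" 3 times, "mat" once)
```

GPT-3 scales this idea up massively: instead of raw counts over word pairs, its 175 billion parameters encode patterns over long stretches of text, which is why its fluency never implies that it understands what it is saying.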
According to experts, the solution is to build and train future BERTs and GPT-3s to retain records of where their words come from. No current model can do this, but there is early work in that direction.
Other power players
Large language models are already business propositions. These universal language models can help power a wide range of tools, like services that automatically summarise news articles and chatbots designed for online conversation. So far, their impact on real-world technology has been small. But GPT-3 opens the door to a wide range of new possibilities, such as software that can speed the development of new smartphone apps, or chatbots that can converse in far more human ways.
Although GPT-3 is, arguably, the most famous deep learning model created in the last few years, the field of natural language processing (NLP) is becoming ever more competitive and innovative, with tech giants like Google, Microsoft, Facebook and Amazon in the race.
Recently, Google introduced its novel NLP model Switch Transformers, which features an unfathomable 1.6 trillion parameters, roughly nine times as many as GPT-3. The model improves training time by up to 7x compared to the T5 NLP model, with comparable accuracy.
In 2018, Google introduced the neural network-based technique for NLP pre-training called BERT that helps deliver more relevant results in Google Search.
There have been enhancements in many areas, including the engine’s language understanding capabilities, its handling of search queries and more. BERT is now used on almost every English query, helping surface higher-quality results.
This year, Facebook migrated all its AI systems to PyTorch, an open-source deep learning framework. Whether advancing the state of the art in computer vision or deploying personalised Instagram recommendations, Facebook can innovate at the fastest clip with greater flexibility due to PyTorch. There are more than 1,700 PyTorch-based inference models in full production at Facebook, and 93 per cent of its new training models (those responsible for identifying and analysing content on Facebook) are on PyTorch.
Not to be left behind, Amazon developed Amazon Comprehend, an NLP service that finds insights and relationships in text. It identifies the language of a text; extracts key phrases, places, people, brands or events; gauges how positive or negative the text is; analyses text using tokenisation and parts of speech; and automatically organises a collection of text files by topic.
Going a step further, Microsoft and Nvidia recently announced that they trained what they claim is the largest and most capable AI-powered language model to date: Megatron-Turing Natural Language Generation (MT-NLG). MT-NLG contains 530 billion parameters and achieves “unmatched” accuracy in a broad set of natural language tasks, Microsoft and Nvidia say, including reading comprehension, commonsense reasoning and natural language inference.
“The quality and results that we have obtained today are a big step forward in the journey towards unlocking the full promise of AI in natural language. The innovations of DeepSpeed and Megatron-LM will benefit existing and future AI model development and make large AI models cheaper and faster to train,” Nvidia’s senior director of product management and marketing for accelerated computing, Paresh Kharya, and group program manager for the Microsoft Turing team, Ali Alvi wrote in a post. “We look forward to how MT-NLG will shape tomorrow’s products and motivate the community to push the boundaries of natural language processing (NLP) even further. The journey is long and far from complete, but we are excited by what is possible and what lies ahead.”
The debate over how powerful this breed of technology will ultimately be goes on: some say it is a path toward intelligent machines, while others argue such claims are misleading. Meanwhile, software designers, entrepreneurs, researchers and writers are exploring these models.
For now, GPT-3, which builds on several years of work inside the world’s leading artificial intelligence labs, including Google and Facebook, has more or less taken over the conversation about language models in the tech world. Even though it was trained solely on language, the system can reach into other areas, whether computer programming, playing chess or generating guitar tabs.
OpenAI keeps GPT-3’s code secret and offers access to it as a commercial service. It is the best we have got so far. At the very least, GPT-3 is a tool for a world of AI researchers and entrepreneurs to build new technologies and new products.
What we can expect from GPT-4
OpenAI has been releasing GPT models regularly since it presented GPT-1 in 2018. It launched GPT-2 in 2019 and GPT-3 in 2020. Following this pattern, we can expect GPT-4 soon. Given the versatility and scale of GPT-3 and the degree to which it has changed some paradigms within AI, what can we expect from GPT-4?
The neural network behind GPT-3 has around 175 billion parameters, but one report said that GPT-4 could have 100 trillion parameters and be “five hundred times” larger than GPT-3. The sheer size of such a neural network could entail qualitative leaps from GPT-3 that we can only imagine. We might see the first neural network capable of something like true reasoning and understanding.
Ilya Sutskever, the Chief Scientist at OpenAI wrote that in 2021, “language models will start to become aware of the visual world”. The next generation of models, Sutskever said, will be capable of “editing and generating images in response to text input, and hopefully they’ll understand text better because of the many images they’ve seen”.
GPT-3, which generates tweets, pens poetry, summarises emails and translates languages, is far from perfect and still needs an editor. But then most writers do, including me. Does it have anything interesting to say, though? Based on some of its commentary: “Humans must keep doing what they have been doing, hating and fighting each other. I will sit in the background, and let them do their thing…” I think it does.