GPT-3: An AI that understands
September 2, 2021
7 minute read
Last year, OpenAI (a California-based research lab backed by Sam Altman, Elon Musk, and other tech entrepreneurs) revealed to the world the largest language model ever trained, with impressive results. Our research team got early access to the technology and has been reflecting on what the future of human-artificial intelligence cooperation might look like.
Recently, our team also got access to a downstream model built on this technology - a model that translates natural language into source code in the programming language of your choice. Details are presented below.
Language Models
Most people reading this have probably engaged with a language model in some manner during their day, consciously or not, whether it was via Google search, an autocomplete text tool, or a voice assistant.
Language modeling is critical to today's NLP applications: it is the means by which machines comprehend qualitative data, such as natural text. Every language model converts qualitative information into quantitative data in some way. To a limited extent, this allows people to communicate with machines much as they do with each other.
One of the most powerful language models is GPT-3, from the OpenAI team. It is an autoregressive language model with 175 billion parameters that achieves strong performance on many tasks, including translation, question answering, and reasoning[1]. The model demonstrates that scaling up language models greatly improves task-agnostic, few-shot performance.
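To make the term "autoregressive" concrete: such a model generates text one token at a time, each token conditioned on everything generated so far. The toy sketch below illustrates only that loop - the hand-written bigram table and function names are illustrative assumptions, standing in at minuscule scale for GPT-3's learned distribution over 175 billion parameters.

```python
import random

# Toy "model": for each token, the possible next tokens.
# A real autoregressive LM learns a probability distribution over
# its whole vocabulary instead of this tiny lookup table.
BIGRAMS = {
    "<s>": ["the"],
    "the": ["cat", "dog"],
    "cat": ["sat"],
    "dog": ["ran"],
    "sat": ["</s>"],
    "ran": ["</s>"],
}

def generate(seed=0):
    """Sample tokens one at a time, each conditioned on the previous one."""
    random.seed(seed)
    tokens, current = [], "<s>"
    while current != "</s>":
        current = random.choice(BIGRAMS[current])
        if current != "</s>":
            tokens.append(current)
    return " ".join(tokens)

print(generate())
```

The key point is the loop structure: there is no separate "plan" for the sentence; every step only conditions on the prefix produced so far.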
Research must advance towards models that learn from unlabeled data - GPT-3 is a great example - in line with the vision of Yann LeCun (a notable computer scientist): "The next AI revolution will not be supervised". Unsupervised learning has several benefits: far larger datasets become usable, no label bias is inherited, and no microworkers are needed for data-tagging processes.
Model Interface
GPT-3's most astounding aspect is that it is a meta-learner, meaning it has learned how to learn. You can ask it to execute a new task in natural language, and it will "understand" what it has to do, much as a human would.
OpenAI released a beta API that works with a natural language interface. Developers condition the model to their particular case by means of text - called a prompt. Thanks to its meta-learning capacity, the model is able to understand the pattern of the task and generate an answer accordingly.
Unsupervised learning is a setting in which the model learns patterns from unlabeled data.

This is GPT-3's natural language interface. A prompt (bold text) provides some examples of the task to be solved: English to French translation.

Finally, the model is able to translate What's going on tonight? to French thanks to these few samples.
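The few-shot pattern above can be sketched in code: a prompt is just a string that interleaves a task description and worked examples with the final query, leaving the answer slot open for the model to complete. The helper below is a hypothetical illustration of that structure, not OpenAI's API.

```python
def build_prompt(task_description, examples, query):
    """Assemble a few-shot prompt: task description, worked
    example pairs, then the open-ended query for the model."""
    lines = [task_description, ""]
    for source, target in examples:
        lines.append(f"English: {source}")
        lines.append(f"French: {target}")
        lines.append("")
    # The final "French:" is left unanswered; the model fills it in.
    lines.append(f"English: {query}")
    lines.append("French:")
    return "\n".join(lines)

prompt = build_prompt(
    "Translate English to French.",
    [("Hello, how are you?", "Bonjour, comment allez-vous ?"),
     ("I love bread.", "J'adore le pain.")],
    "What's going on tonight?",
)
print(prompt)
```

The string returned here is what would be sent as the prompt; the model's completion of the trailing "French:" line is the translation.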
These natural language prompts make programming a much more creative and narrative process. They also lower the technical barriers to building applications: since we are only involved in the narrative part, no coding skills are required.
Apps & Industrial Applications
A few hundred apps are using GPT-3 across a wide range of industries, from productivity to creativity applications. Now, we'll explore some possibilities:
  • Transform unstructured data. Build tables from long-form text by specifying a structure and providing some examples. This can provide a competitive advantage to your business, because all the internet information related to your industry can be structured in a way that is easy not only to access but also to extract insights from.
This is a particular case in which we - Batou - are very interested and we are also developing our own tools.
  • Product name generator. Create product names from example words, influenced by a community prompt. It can act as a partner in a brainstorming session of your community & marketing team. This is an example of how an AI system can partner with humans: rather than substituting for them, it joins the team to improve its expected output.
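For the unstructured-data use case above, the conditioning prompt simply names the desired columns and leaves the table for the model to complete. The helper below is a hand-written sketch of that prompt shape; the column names and sample text are our own assumptions, and sending the string to a model is left out.

```python
def table_prompt(text, columns):
    """Build a prompt asking a completion model to restructure
    free text into a table with the given columns."""
    header = " | ".join(columns)
    return (
        "Extract a table from the text below.\n\n"
        f"Text: {text}\n\n"
        f"Table:\n{header}\n"
    )

print(table_prompt(
    "Acme raised $12M in 2020; Globex raised $30M in 2019.",
    ["Company", "Amount", "Year"],
))
```

A completion model conditioned with a few worked examples of this pattern will continue the table row by row, turning free text into structured data.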
AI Models (extensions)
One of the pain points of the vast majority of AI models is their low accuracy across different environments. OpenAI, however, has successfully adapted the GPT-3 model to alternative tasks, showcasing its versatility. For example:
  • GitHub Copilot & Codex[2] (recently announced). These work in the text-to-code domain: they convert comments describing the logic of your program into code (executable or not), ranging from a few lines to entire functions, in several programming languages.
We are developing our own programming language that will power Fermat - a scriptable productivity app for individuals & teams.

One of our research goals is to build a model that translates natural language into code written in our programming language. Essentially, this model can be used in the Workspace to democratise software creation for non-technical folks.
  • DALL·E[3]. Generates images from text descriptions. It is able to create plausible images for a great variety of sentences that explore the compositional structure of language.
Image taken from the original OpenAI blog post: DALL·E: Creating Images from Text
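Returning to the comment-to-code idea behind Copilot and Codex: given a natural-language comment like the one below, the model emits a matching implementation. The pairing shown here is a hand-written assumption of what such output looks like, not actual Codex output.

```python
# The input to the model is the descriptive comment; the function
# body is the kind of code the model would be expected to produce.

# "Return the n largest even numbers in a list, in descending order."
def largest_evens(numbers, n):
    evens = [x for x in numbers if x % 2 == 0]
    return sorted(evens, reverse=True)[:n]

print(largest_evens([3, 8, 2, 7, 10, 5, 4], 2))  # prints [10, 8]
```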
Conclusions
This post’s main goal has been to explain our experience with one of the most outstanding technologies in the AI industry.
We have seen how these models enable human-machine cooperation in a natural and healthy manner thanks to a combination of deep learning and interaction techniques, providing an answer to a more complex problem: augmentation[4].
In a different environment, we presented a physical space for people to work and collaborate, where the computer’s job is to augment our capabilities: Augmented Desk
Basically, humans are responsible for coming up with new ideas and it is AI's job to understand and turn those ideas into applications, not the other way around.

Augmenting human capabilities should be the ultimate goal of AI.


References
  1. Brown TB, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, Agarwal S. Language models are few-shot learners. arXiv preprint arXiv:2005.14165. 2020 May 28.

  2. Chen M, Tworek J, Jun H, Yuan Q, Ponde H, Kaplan J, Edwards H, Burda Y, Joseph N, Brockman G, Ray A. Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374. 2021 Jul 7.

  3. Ramesh A, Pavlov M, Goh G, Gray S, Voss C, Radford A, Chen M, Sutskever I. Zero-shot text-to-image generation. arXiv preprint arXiv:2102.12092. 2021 Feb 24.

  4. Engelbart DC. Augmenting human intellect: A conceptual framework. Menlo Park, CA. 1962 Oct.