What's the big deal with Generative AI? Is it the future or the present?

Part 1 of "Generative AI is.. Not Enough?"
It is almost impossible to ignore the astounding progress in artificial intelligence these days. From the new generation of generative chatbots, to the models that can generate (almost) any picture (or very soon, video), the pace of development in the AI field has been nothing short of phenomenal. This is especially true in the field of generative AI, where we see a growing number of impressive generative models that can create images, text, video, and music.
These developments have captured the popular imagination, and businesses are rushing to build AI into their products, services, and processes, hoping to find their AI unicorn. Some are still working out how to use AI in their organizations, while others are finding that the current AI landscape is complex and difficult to navigate.
In this series of articles, we explore the importance of these generative AI models and discuss useful perspectives to view and deploy them. This first article introduces the current state of generative AI, and outlines how we should approach it. In the next article, we map the AI technology and value stack to better understand where generative AI fits in. Finally, we discuss how we can better harness its power to create a new generation of intelligent systems.
The summary above (created with Cohere's text generation model, with some human editing) is a great introduction to this series of articles, which crystallizes a lot of what we've learned over the last few years about Generative AI and how to think about its models, products, and industries.
Let's jump right in!
What's the big deal with Generative AI? Is it the future or the present?
In the first article in the series, we cover four points:
1- Recent AI developments are awe-inspiring and promise to change the world. But when?
2- Make a distinction between impressive 🍒 cherry-picked demos, and reliable use cases that are ready for the marketplace
3- Think of models as components of intelligent systems, not minds
4- Generative AI alone is only the tip of the iceberg
Let's now look at each one in more detail.
1- Recent AI developments are awe-inspiring and promise to change the world. But when?
Text Generation: Software that generates coherent human language

The ability of language models to produce coherent text feels like a turning point in human technology. Just as impressive is these models’ ability to capture the meaning and context of text (e.g. articles, messages, documents), letting software deal with text more intelligently.
Without even knowing it, we experience the power of large language models on a daily basis. Think Google Translate, Google Search, and text generation models. Thousands of applications and features in your favorite products use large language models to manipulate language better than ever before – and they're getting faster, more efficient, and more accurate every day.
These models aren’t only enabling new features and products. In fact, entire new sectors of companies are based on these models as their foundation. One clear example here is the growing list of companies building AI writing assistants. This includes companies like HyperWrite, Jasper, Writer, copy.ai, and others. Another example is companies weaving model generations into interactive experiences like Latitude, Character AI, and Hidden Door.
Image Generation: Name a thing then see it manifest in front of your eyes
AI image generation is another exciting area in the Generative AI space. In that domain, models like DALL-E, MidJourney, and Stable Diffusion have taken the world by storm.
AI image generation is not particularly new to the scene. Models like GANs (Generative Adversarial Networks) have enabled generating images of people, art, and even homes for about nine years now. But each of those models was trained specifically for the type of object it generated, and it took a long time to generate a single image.
In the current batch of AI image generation models, a single model can generate a vast number of image types. These models also give users the ability to control what they generate by describing it in text.
It’s often difficult to temper your excitement when these tools exceed your expectations of what software can produce with a simple text prompt. In my case, as well as others I would suspect, these models evoke a deep sense that something has changed. Some shift in the world as we know it has occurred and is expected to have a lasting impact on products, industries, and economies. The potential appears clear as day.
That potential is the very reason why caution is warranted.
Tempering excitement with care
As social media gets swept up in posts that claim “I made model X do impossible task Y 🤯”, it’s important to arm oneself with a discerning eye to filter these claims. One of the key questions to ask is whether a demonstrated capability is a 🍒 cherry-picked example that a model produces 40% of the time, or if it points to robust and reliable model behavior.
Reliability is key for an AI capability to become part of a customer-facing product.
Take, for instance, the many capabilities attributed to large GPT models over the last few years. One example, floated in some 2020 demos, is a model's ability to generate the code for a website from just a text prompt. Three years later, that still isn't how we build websites.

Code generation with language models is almost certain to change how software is written (ask users of Replit, Tabnine, and GitHub Copilot). The timeline, however, is less certain. The “nearly” in the tweet above can be anywhere from two to five years.
There's a saying attributed to Bill Gates that applies here: “Most people overestimate what they can achieve in a year and underestimate what they can achieve in ten years.” The same can be said about people's expectations of some new technologies.
The last time the tech industry was swept up in a deep-learning-induced frenzy, we were promised self-driving cars by 2020.
They’re still not here.
One key takeaway here is to:
2- Make a distinction between impressive 🍒 cherry-picked demos, and reliable use cases that are ready for the marketplace
Large text generation models are able to answer many questions correctly. But can they do it reliably?
Stack Overflow doesn’t think so.
The popular forum where software developers ask questions has banned machine-generated answers from being posted on the site “because the average rate of getting correct answers from ChatGPT is too low”. This is an example of a use case where some people expected the model to reliably generate exactly correct answers to a complex set of problems.
AI use cases that are reliable now
There are, however, other use cases (and workflows) where these models produce much more reliable results. Key among them are neural search (more on that in point #4 below), auto-categorization of text (classification), and copywriting suggestions and brainstorming workflows for generation models (discussed in more detail in part three of this series).
The amazing demos will keep rolling in. They’re part of a community discovery process for the limits and new possibilities of these models (more on community discovery of a model’s generative space and its product/economic value in part two). It will pay, however, to keep asking the cherry-picking question, recognize that timelines are uncertain, and invest in the robustness and reliability of AI systems and models.
3- Think of models as components of intelligent systems, not minds
The capability of language models to generate coherent text will only continue to improve. The first claims that a language model is sentient are already a thing of the past.
A more useful framing is to think of language models as language understanding and language generation components of a software system. They make it a little more intelligent and capable of behaviors beyond what software was traditionally able to do – especially when it comes to language and vision.
In a context like this, the term language understanding does not mean human-level understanding and reasoning. But these models can extract much more of the information and meaning behind text, increasing the usefulness of software.

Once we think of a model as a component, we can start to compose more advanced systems that use multiple steps or models (Part three of this series is entirely dedicated to this topic).
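As an illustrative sketch of this framing (the function names and behavior below are hypothetical stand-ins, not a real SDK), a system might compose a classification component and a generation component into one pipeline:

```python
# A minimal sketch of treating models as components in a software system.
# classify() and generate() stand in for calls to hosted language models;
# they are toy placeholders, illustrative only.

def classify(text: str) -> str:
    """Toy intent classifier. A real system would call a language model
    (or a fine-tuned classifier) here instead of this heuristic."""
    return "question" if "?" in text else "statement"

def generate(prompt: str) -> str:
    """Toy text generator. A real system would call a generation model."""
    return f"Draft response to: {prompt}"

def handle_message(message: str) -> str:
    """Compose the two components: route by intent, then generate."""
    intent = classify(message)
    if intent == "question":
        return generate(message)
    return "Thanks for sharing!"
```

The point of the sketch is the shape, not the placeholder logic: each model is one step in a larger system, and steps can be swapped, chained, or checked independently.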
4- Generative AI alone is only the tip of the iceberg
From a technical standpoint, text and image generation models aren’t distinct enough to deserve their own type or sub-area of “AI”. The same models can be used for a variety of other use cases with little to no adjustments. The concern with drawing an arbitrary line around generation is that some may miss other more mature AI capabilities that are reliably powering more and more systems in the industry.

Generative AI is only possible because larger, better models trained on massive datasets can build better numeric representations of text and images. For builders, it’s important to know that those representations enable a wide variety of possibilities in addition to generation. One key possibility is neural search.
Neural search is the new breed of search systems that use language models to improve on simple keyword search.
They enable searching by meaning.
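To make “searching by meaning” concrete, here is a minimal sketch. A real neural search system uses dense embeddings produced by a language model; this toy version substitutes word-count vectors and cosine similarity, so it only matches on shared words rather than true meaning (real embeddings would also match synonyms and paraphrases):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector.
    A real system would use a language model's dense vectors instead."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query: str, documents: list) -> str:
    """Return the document most similar to the query."""
    q = embed(query)
    return max(documents, key=lambda d: cosine(q, embed(d)))
```

For example, `search("forgot my password", ["how to reset a password", "pricing for the enterprise plan"])` returns the password document. The structure (embed everything, rank by vector similarity) is the same in production systems; only the embedding function changes.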
Neural search sits alongside text classification as a use case where AI already produces reliable results for many industry applications (though some areas, like sarcasm classification, remain challenging).
Coming up next
In the upcoming articles in this series, we'll look more closely at the tech and value stack of Generative AI. We will also discuss a number of design patterns for applications that use these models as building blocks to build the next generation of intelligent systems.
Stay tuned! Follow @CohereAI on Twitter and join the Discord community to learn when the next parts are published.