Generative AI with Cohere: Part 1 - Model Prompting

In this multi-part guide, we will go through everything that you need to know about generative AI with Cohere’s large language models (LLMs). In Part 1, we talk about model prompting.

Generative AI with Cohere: Part 1

What is Generative AI?

Generative AI is a type of artificial intelligence that focuses on creating or generating new content or data. This can be in the form of language, images, videos, and more.

Its market potential is significant as it has the potential to revolutionize many industries and drive innovation in a wide range of fields. For example, in creative arts, generative AI can be used to generate unique and engaging content, such as music or visual art, with minimal need for human input. In the business world, generative AI can be used to generate reports, presentations, and other business documents, reducing the need for manual data analysis and enhancing productivity.

At Cohere, our focus is on language. We want to enable developers to add language AI to their technology stack and build impactful applications with it.

In this multi-part guide, we will go through everything that you need to know about generative AI with Cohere’s large language models (LLMs).

Here is what we’ll cover throughout this series:

  • Part 1: Prompting the Model
  • Part 2: Use Case Ideation
  • Part 3: Working with the Generate Endpoint
  • Part 4: Training Your Custom Model
  • Part 5: Building Your Applications
This series covers everything that you need to know about generative AI with Cohere’s LLMs

In this Part 1 article, we will cover the following topics:

  1. Getting Started with the Cohere Playground
  2. Prompting the Models
  3. Controlling the Model Output
  4. Saving and Sharing Your Prompts

Getting Started with the Cohere Playground

Throughout the series, we will cover the full spectrum of working with generative AI to enable you to build applications with it. But to start with, let’s take the no-code route: we’ll show you how AI text generation works, and how you can experiment with it in the Cohere Playground.

First, sign up for a Cohere account and then visit the Playground.

The Playground UI consists of a few sections. The main window is where you enter your prompt and where the output, or response, is generated. A menu of saved prompts, or presets, is shown in the left-hand pane, and model parameter controls are located in the right-hand pane.

A snapshot of the Cohere Playground

We’ll cover presets and parameters later in this article, but now, let’s start with the fun part: prompt design.

Prompting the Models

Prompt Design

Prompting is at the heart of working with LLMs. The prompt provides a context for the text that we want the model to generate. The prompts we create can be anything from simple instructions to more complex pieces of text, and they are used to encourage the model to produce a specific type of output.

Coming up with a good prompt is a bit of both science and art. On the one hand, we know the broad patterns that enable us to construct a prompt that will generate the output that we want. But on the other hand, there is so much room for creativity and imagination in coming up with prompts that can get the best out of a model.

To understand how model prompting works, let’s start by entering the following prompt into the playground.

Once upon a time in a magical land called

Clicking Generate gives us a continuation of the text (model-generated text is in bold).

Generation for a simple prompt entered

This is the most basic form of prompting, which is simply asking the model to complete the text that we have entered. But this type of prompt is rather open-ended, and in more practical applications, you will need to make the prompt tighter, so that the output generated will be more predictable.

With that, let’s now dive into how you can design more effective prompts.

The Two Types of Generative Models

How a prompt is designed is dependent on the type of model you are using. There are two types of generative models available on the Cohere Platform: one where you prompt by instruction and another where you prompt by example.

In summary, this is how the two types of models differ:

  • Prompting by Instruction

    • Works best with the Command-Xlarge model
    • You can think of this as a “tell” type of prompt
    • The output you get is more open-ended and the format can vary
  • Prompting by Example

    • Works best with the XLarge and Medium models
    • You can think of this as a “show” type of prompt
    • The output you get is more closed-ended and the format is more predictable

Now let’s learn more about designing these two types of prompts.

Prompting by Instruction

The Command-Xlarge model works best when we provide an instruction-based prompt. One way to do this is by using imperative verbs to tell the model what to do, for example: generate, write, list, provide, and other variations.

This model is in beta.
Our number one priority is to ensure that we are making our large language models safe, accessible, and useful. Our Command model is still in beta and we are aware of some of the limitations around safety. As you experiment with the model and run into issues, please help us by flagging it to our team by emailing us at team@cohere.com.
Prompting by instruction

Let’s say that we are creating social media ad copy for a wireless earbuds product. We can write the prompt as follows.
Generate a social ad copy for the product: Wireless Earbuds.

At this point, ensure that you select command-xlarge in the MODEL dropdown in the right pane. Then, click on Generate.

This generates the following output.

A simple, one-line prompt and its output

That’s not bad. With a simple, one-line prompt, we already have a piece of ad copy that will make a digital marketer proud!
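Everything the Playground does maps onto Cohere’s Generate endpoint, which Part 3 of this series covers in detail. As a rough sketch (the `build_request` helper and the model name string here are illustrative assumptions, not something the Playground exposes), the same request could be assembled in code like this:

```python
# Sketch only: gathers the settings the Playground configures through its UI.
# The build_request helper and the "command-xlarge-nightly" model name are
# illustrative assumptions.

def build_request(prompt: str, model: str = "command-xlarge-nightly",
                  max_tokens: int = 100) -> dict:
    """Collect the parameters a Generate call would need."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}

params = build_request(
    "Generate a social ad copy for the product: Wireless Earbuds."
)

# With the cohere Python SDK installed and an API key, the call itself
# would look roughly like:
#   import cohere
#   co = cohere.Client("YOUR_API_KEY")
#   response = co.generate(**params)
#   print(response.generations[0].text)
```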

But perhaps we want to be more specific in terms of what we want the output to look like. For this, we can layer additional instructions onto the model in the prompt.

Let’s say that we want the model to provide the ad copy in the form of a well-known copywriting technique called the AIDA Framework (which stands for Attention, Interest, Desire, and Action). In this case, we can append this specific instruction to the prompt as follows.

Generate an ad copy for the product: Wireless Earbuds. The copy consists of four parts (Attention, Interest, Desire, Action), following the AIDA Framework.

A more specific prompt and its output

The model picks up what the AIDA Framework is, and duly returns an output following the format that we wanted. Wonderful!

The prompt can be constructed as a combination of an instruction and some context. Let’s see this in action with another example: emails. We can create a simple instruction to write an email as follows.

An instruction-only prompt and its output

Or we can create an instruction to summarize an email, and that email now becomes part of the prompt, acting as the context.

An instruction prompt with context and its output

This is just a taste of the kinds of prompts you can design. You can keep layering your instructions to be as specific as you want and see what the model generates. There is no single right or wrong way to design a prompt; it’s about starting with an idea and iterating on the prompt until you get the outcome you are looking for.

Prompting by Example

Sometimes you want the model output to be more attuned to a specific pattern or nuance that you have in mind. For this, a better option might be to use the XLarge or Medium model and prompt it by example instead of by instruction.

For example, say you want to generate product descriptions for a long list of products. You want each of the descriptions to be of similar length. And for each product description, you want to be able to input a few keywords to guide the content of the descriptions.

A basic prompt format that generally works well is as follows.

  • A short description of the overall context
  • A few examples of prompts and completions; usually two to three examples are sufficient but for more challenging tasks, you will need more
  • A short sequence of characters or “stop sequences” to guide the model towards writing a complete passage and then stopping.
Prompting by example

With our product description case, here’s what the prompt looks like:

Product description prompt

At this point, ensure that you select xlarge or medium in the MODEL dropdown in the right pane.

And this is what the output looks like:

Generated product description

Notice that since the examples we provided were concise, the model generated a similarly concise description for this new product. If you want the output to be more elaborate, formatted as a list, or in any other style, simply alter your examples, and the model will return outputs that follow suit.
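To make this format concrete, here is a minimal sketch of how such a prompt string could be assembled. The context line, the two example products, and their descriptions are all invented for illustration; "--" doubles as the stop sequence separating the examples.

```python
# Sketch: build a prompt-by-example string. The products and descriptions
# below are made up for illustration; "--" acts as the stop sequence.
CONTEXT = ("This program generates a product description "
           "given a product name and keywords.")

EXAMPLES = [
    ("Monitor Stand", "adjustable, aluminum",
     "An adjustable aluminum stand that raises your monitor to eye level."),
    ("Desk Lamp", "LED, dimmable",
     "A dimmable LED lamp that keeps your desk bright without the glare."),
]

def build_prompt(product: str, keywords: str) -> str:
    parts = [CONTEXT, ""]
    for name, kw, desc in EXAMPLES:
        parts += [f"Product: {name}",
                  f"Keywords: {kw}",
                  f"Description: {desc}",
                  "--"]
    # End with the new product and leave "Description:" open for the model.
    parts += [f"Product: {product}", f"Keywords: {keywords}", "Description:"]
    return "\n".join(parts)

prompt = build_prompt("Wireless Earbuds", "bluetooth, noise-canceling")
```

The prompt ends mid-pattern on `Description:`, so the model’s most natural continuation is exactly the piece we want it to write.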

Let’s take another example. This one is for generating haikus, a type of short-form poetry originally from Japan. Say we want the haiku to be precisely three lines, each with a similar length. The Command-Xlarge model might still work for this, but over many generations, the output format could vary. This is where we can use the XLarge model to provide examples of the type of output we want.

Here’s what the prompt looks like, where we give a few haiku examples:

Haiku prompt

And this is what the output looks like:

Generated haiku

Perfect, just as we want it!

Controlling the Model Output

Other than the prompt design, there is another way to control the kind of output we want, that is, by adjusting the model parameters. These parameters are available on the right pane of the Playground. They are applicable to both types of prompting we discussed earlier.

Let’s now see what we can do with these parameters.

When to Stop

There are a couple of parameters that let you decide when the model should stop.

  • Number of tokens — One English word roughly translates to 1-2 tokens. The model will stop generating text once it reaches the maximum number of tokens specified by this parameter.
  • Stop sequences — You can define any character or sequence of characters to tell the model when to stop. This is useful when you are showing the model a few generation examples and want to mark exactly where each example ends. For example, you can use “--” to separate the examples, or simply a new line (Enter key) or a period to tell the model to stop once it finishes a sentence.
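Conceptually, a stop sequence simply cuts generation off at the first match. Here is a minimal sketch of that behavior (the `truncate_at_stop` helper is illustrative, not part of any API):

```python
def truncate_at_stop(text: str, stop_sequences: list) -> str:
    """Return text up to the first occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

# A raw generation that ran past the "--" separator gets trimmed back:
raw = "A great pair of earbuds.\n--\nProduct: Desk Lamp"
clean = truncate_at_stop(raw, ["--"])
```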

Being Creative vs. Predictable

Probably the most useful set of parameters are the ones that control the randomness of the output. The beauty of working with LLMs is that, for the same prompt, the next generated token will not be the same every time. Rather, it is sampled from a long list of possible tokens. This is where the creative aspect of LLMs comes from, allowing them to generate a variety of outputs given the exact same prompt.

But depending on your application, you may want to reduce, or increase, this level of randomness. You can do this by adjusting a number of parameters.

But before looking at the parameters, it’s worth taking the time to understand how the model selects the next token to generate. It does this by assigning a likelihood number to every possible next token. For example, the model would see that the token cookies has a much higher likelihood than chair of appearing after the phrase I like to bake.

The model assigns a likelihood number to every possible next token

During text generation, there’s still a chance that chair would appear, but its probability is much lower than that of cookies. The parameters we are going to see now can change this behavior.
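To make this concrete, here is a toy sketch. The scores are invented, but they show how raw per-token scores become a probability for each candidate next token via a softmax:

```python
import math

# Invented scores for candidate tokens after "I like to bake".
logits = {"cookies": 4.0, "bread": 3.0, "chair": 0.5}

def softmax(scores: dict) -> dict:
    """Turn raw scores into probabilities that sum to 1."""
    exps = {tok: math.exp(s) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

probs = softmax(logits)
# "cookies" ends up far more likely than "chair", but "chair" keeps a
# small nonzero probability of being sampled.
```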

There are three parameters we can adjust for this.

  • Temperature — controls how the model chooses its next token. A lower temperature makes the output more predictable, while a higher temperature makes it more creative. It is a number between 0 and 2, and in most cases, somewhere between 0 and 1 works fine.
  • Top-k — limits the pool of tokens the model can choose from. The default is 0, which means the model will consider all possible next tokens out of the thousands of possible tokens. If you set it to, say, 100, the model will only consider the 100 most probable next tokens at each step.
  • Top-p — is similar to top-k, except the pool is limited not by a token count but by the cumulative sum of token probabilities. The default is 0.75. If you’d like to understand more about top-k and top-p, head over to this documentation page, which contains a more in-depth explanation and examples.
Increasing the temperature makes generating tokens with lower likelihoods more probable, and vice versa.
Increasing the top-k increases the number of tokens the model can choose from, and vice versa. The same concept applies in top-p, but uses probabilities instead of count.
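As a toy sketch with invented scores, here is how temperature, top-k, and top-p each reshape a next-token distribution (the numbers are made up; real implementations operate on the model’s actual logits):

```python
import math

# Invented scores for candidate next tokens.
logits = {"cookies": 4.0, "bread": 3.0, "cake": 2.0, "chair": 0.5}

def softmax(scores: dict, temperature: float = 1.0) -> dict:
    """Dividing scores by the temperature sharpens (<1) or flattens (>1)
    the resulting probability distribution."""
    exps = {t: math.exp(s / temperature) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

def top_k(probs: dict, k: int) -> dict:
    """Keep only the k most likely tokens, then renormalize."""
    kept = dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k])
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}

def top_p(probs: dict, p: float) -> dict:
    """Keep the smallest set of tokens whose probabilities sum to >= p."""
    kept, running = {}, 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = prob
        running += prob
        if running >= p:
            break
    total = sum(kept.values())
    return {t: q / total for t, q in kept.items()}

base = softmax(logits)
cool = softmax(logits, temperature=0.5)  # sharper: favors "cookies" even more
warm = softmax(logits, temperature=2.0)  # flatter: "chair" gets a bigger share
```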

Reducing Repetition

Two parameters allow you to control the amount of repetition in the generated text.

  • Frequency penalty — penalizes tokens that have already appeared in the preceding text (including the prompt), and scales based on how many times that token has appeared. So a token that has already appeared 10 times gets a higher penalty (which reduces its probability of appearing) than a token that has appeared only once.
  • Presence penalty — applies the penalty regardless of frequency. As long as the token has appeared once before, it will get penalized.
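A toy sketch makes the difference clear. The scores, counts, and penalty values below are invented; the idea is that penalties are subtracted from a token’s score before sampling:

```python
def apply_penalties(scores: dict, counts: dict,
                    frequency_penalty: float = 0.0,
                    presence_penalty: float = 0.0) -> dict:
    """Lower the scores of tokens that already appeared in the text.

    counts maps each token to how many times it has appeared so far.
    The frequency penalty scales with that count; the presence penalty
    is a flat deduction for any token seen at least once.
    """
    adjusted = {}
    for tok, score in scores.items():
        count = counts.get(tok, 0)
        score -= frequency_penalty * count
        if count > 0:
            score -= presence_penalty
        adjusted[tok] = score
    return adjusted

scores = {"great": 2.0, "amazing": 1.8}  # invented scores
counts = {"great": 10, "amazing": 1}     # appearances so far

freq = apply_penalties(scores, counts, frequency_penalty=0.1)
pres = apply_penalties(scores, counts, presence_penalty=0.1)
```

Under the frequency penalty, "great" (seen 10 times) is penalized ten times harder than "amazing" (seen once); under the presence penalty, both receive the same flat deduction.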

Saving and Sharing Your Prompts

You can save the prompts that you have created by clicking on the Save button on the bottom right of the Playground. Once you have saved a prompt, it will appear as a preset on the Playground’s left pane.

You can also share these prompts with others. To do this, click on the Share button on the top right of the Playground. You will get a link which you can share with anyone!

Conclusion

In this article, we covered how to prompt a model — probably the most important, and definitely the most fun, part of working with large language models. If you’d like to dive deeper, the Cohere documentation is a good place to continue reading.

In Part 2, we will explore the range of use cases and areas where generative AI in language can be applied. And along the way, we’ll see the different ways a prompt can be designed.

Ready to get started with Generative AI? Sign up for a free Cohere account to start building.