Generative AI with Cohere: Part 1 - Model Prompting
In this multi-part guide, we will go through everything that you need to know about generative AI with Cohere’s large language models (LLMs). In Part 1, we talk about model prompting.

What is Generative AI?
Generative AI is a type of artificial intelligence that focuses on creating or generating new content or data. This can be in the form of language, images, videos, and more.
Its market potential is significant, as it could revolutionize many industries and drive innovation in a wide range of fields. For example, in creative arts, generative AI can be used to generate unique and engaging content, such as music or visual art, with minimal need for human input. In the business world, generative AI can be used to generate reports, presentations, and other business documents, reducing the need for manual data analysis and enhancing productivity.
At Cohere, our focus is on language. We want to enable developers to add language AI to their technology stack and build impactful applications with it.
Here is what we’ll cover throughout this series:
- Part 1: Model Prompting
- Part 2: Use Case Ideation
- Part 3: The Generate Endpoint
- Part 4: Creating Custom Models
- Part 5: Chaining Prompts

In this Part 1 article, we will cover the following topics:
- Getting Started with the Cohere Playground
- Prompting the Models
- Controlling the Model Output
- Saving and Sharing Your Prompts
Getting Started with the Cohere Playground
Throughout the series, we will cover the full spectrum of working with generative AI to enable you to build applications with it. But to start with, let’s take the no-code route: we’ll show you how AI text generation works, and how you can experiment with it in the Cohere Playground.
First, sign up for a Cohere account and then visit the Playground.
The Playground UI consists of a few sections. The main window is where you enter your prompt and where the output, or response, is generated. A menu of saved prompts, or presets, is shown in the left-hand pane, and model parameter controls are located in the right-hand pane.

We’ll cover presets and parameters later in this article, but now, let’s start with the fun part: prompt design.
Prompting the Models
Prompt Design
Prompting is at the heart of working with LLMs. The prompt provides a context for the text that we want the model to generate. The prompts we create can be anything from simple instructions to more complex pieces of text, and they are used to encourage the model to produce a specific type of output.
Coming up with a good prompt is a bit of both science and art. On the one hand, we know the broad patterns that enable us to construct a prompt that will generate the output that we want. But on the other hand, there is so much room for creativity and imagination in coming up with prompts that can get the best out of a model.
To understand how model prompting works, let’s start by entering the following prompt into the playground.
Once upon a time in a magical land called
Clicking Generate gives us a continuation of the text (model-generated text is in bold).

This is the most basic form of prompting, which is simply asking the model to complete the text that we have entered. But this type of prompt is rather open-ended, and in more practical applications, you will need to make the prompt tighter, so that the output generated will be more predictable.
With that, let’s now dive into how you can design more effective prompts.
The Two Types of Generative Models
How a prompt is designed is dependent on the type of model you are using. There are two types of generative models available on the Cohere Platform: one where you prompt by instruction and another where you prompt by example.
In summary, this is how the two types of models differ:
- Prompting by Instruction
  - Works best with the Command-Xlarge model
  - You can think of this as a “tell” type of prompt
  - The output you get is more open-ended and the format can vary
- Prompting by Example
  - Works best with the XLarge and Medium models
  - You can think of this as a “show” type of prompt
  - The output you get is more close-ended and the format is more predictable
Now let’s learn more about designing these two types of prompts.
Prompting by Instruction
The Command-Xlarge model works best when we provide an instruction-based prompt. One way to do this is by using imperative verbs to tell the model what to do, for example: generate, write, list, provide, and other variations.
Our number one priority is to ensure that we are making our large language models safe, accessible, and useful. Our Command model is still in beta and we are aware of some of the limitations around safety. As you experiment with the model and run into issues, please help us by flagging them to our team at team@cohere.com.

Let’s say that we are creating social media ad copy for a wireless earbuds product. We can write the prompt as follows.
Generate a social ad copy for the product: Wireless Earbuds.
At this point, ensure that you select command-xlarge in the MODEL dropdown in the right pane. Then, click on Generate.
This generates the following output.

That’s not bad. With a simple, one-line prompt, we already have a piece of ad copy that will make a digital marketer proud!
But perhaps we want to be more specific in terms of what we want the output to look like. For this, we can layer additional instructions onto the model in the prompt.
Let’s say that we want the model to provide the ad copy in the form of a well-known copywriting technique, called the AIDA Framework (which stands for Attention, Interest, Desire, and Action). In this case, we can append this specific instruction in the prompt as follows.
Generate an ad copy for the product: Wireless Earbuds. The copy consists of four parts (Attention, Interest, Desire, Action), following the AIDA Framework.

The model picks up what the AIDA Framework is, and duly returns an output following the format that we wanted. Wonderful!
The prompt can be constructed as a combination of an instruction and some context. Let’s see this in action with another example: emails. We can create a simple instruction to write an email as follows.

Or we can create an instruction to summarize an email, and that email now becomes part of the prompt, acting as the context.

This is just a taste of what kinds of prompts you can design. You can keep layering your instructions to be as specific as you want, and see the output generated by the model. And there is really no right or wrong way to design a prompt. It’s really about applying an idea and continuing to iterate the prompt until you get the outcome you are looking for.
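If you prefer to think about this in code, an instruction-plus-context prompt is ultimately just string composition. Here is a minimal sketch; the function name, template, and example email are our own illustrative choices, not part of any Cohere API.

```python
# A sketch of composing an instruction-plus-context prompt as a plain string.
# The helper name and the blank-line separator are hypothetical conventions.

def build_prompt(instruction: str, context: str = "") -> str:
    """Combine an instruction with optional context into a single prompt."""
    if context:
        return f"{instruction}\n\n{context}"
    return instruction

# An instruction on its own, like the ad copy example above.
simple = build_prompt("Generate a social ad copy for the product: Wireless Earbuds.")

# An instruction plus context: here, a made-up email becomes the context
# for a summarization instruction.
email = "Hi team, the Q3 report is attached. Please review the numbers before Friday."
summarize = build_prompt("Summarize the following email in one sentence.", email)

print(summarize)
```

Either string can then be pasted into the Playground (or, later in this series, sent to the Generate endpoint) as-is.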
Prompting by Example
Sometimes you want the model output to be more attuned to a specific pattern or nuance that you have in mind. For this, a better option might be to use the XLarge or Medium model and prompt it by example instead of by instruction.
For example, say you want to generate product descriptions for a long list of products. You want each of the descriptions to be of similar length. And for each product description, you want to be able to input a few keywords to guide the content of the descriptions.
A basic prompt format that generally works well is as follows.
- A short description of the overall context
- A few examples of prompts and completions; usually two to three examples are sufficient but for more challenging tasks, you will need more
- A short sequence of characters, or “stop sequence,” to guide the model towards writing a complete passage and then stopping
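As a sketch, the format above can be assembled programmatically. The products, keywords, and descriptions below are invented placeholders, and the “--” separator doubling as the stop sequence is one common convention, not a requirement.

```python
# Assemble a few-shot "prompt by example" as a plain string: a short context,
# a few example completions, and a trailing incomplete entry for the model
# to finish. All product data here is made up for illustration.

SEPARATOR = "--"  # also used as the stop sequence

context = "This program generates a product description given a product and keywords."

examples = [
    ("Monitor Stand", "adjustable, aluminum",
     "An adjustable aluminum stand that raises your monitor to eye level."),
    ("Desk Lamp", "LED, dimmable",
     "A dimmable LED lamp that keeps your desk evenly lit at any hour."),
]

def build_few_shot_prompt(product: str, keywords: str) -> str:
    parts = [context]
    for name, kw, desc in examples:
        parts.append(f"Product: {name}\nKeywords: {kw}\nDescription: {desc}")
    # End with the new product, leaving the description for the model to fill in.
    parts.append(f"Product: {product}\nKeywords: {keywords}\nDescription:")
    return f"\n{SEPARATOR}\n".join(parts)

prompt = build_few_shot_prompt("Wireless Earbuds", "bluetooth, noise-cancelling")
print(prompt)
```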

With our product description case, here’s what the prompt looks like:

At this point, ensure that you select xlarge or medium in the MODEL dropdown in the right pane.
And this is what the output looks like:

Notice that since the examples we provided were concise, the model generated a similarly concise description for this new product. If you want the output to be more elaborate, presented as a list, or in any other format, simply alter your examples, and the model will return outputs in a similar style.
Let’s take another example. This one is for generating haikus, a type of short-form poetry originally from Japan. Say we want the haiku to be precisely three lines, each with a similar length. The Command-XLarge model might still work for this, but over many generations, the output format could vary. This is where we can use the XLarge model to provide examples of the type of output we want.
Here’s what the prompt looks like, where we give a few haiku examples:

And this is what the output looks like:

Perfect, just as we want it!
Controlling the Model Output
Beyond prompt design, there is another way to control the kind of output we want: adjusting the model parameters. These parameters are available on the right pane of the Playground, and they are applicable to both types of prompting we discussed earlier.
Let’s now see what we can do with these parameters.
When to Stop
There are a couple of parameters that let you decide when the model should stop.
- Number of tokens — One English word roughly translates to one to two tokens. The model will stop generating text once it reaches the maximum number of tokens specified by this parameter.
- Stop sequences — You can define any character or sequence of characters to tell the model when to stop. This is useful when you are showing the model a few generation examples and you want to show exactly where one example ends. For example, you can use “--” to split the examples, or simply a new line (the Enter key) or a period character to tell the model to stop once it finishes a sentence.
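Conceptually, a stop sequence works like truncation: generation ends as soon as the sequence appears. The model does this internally, but we can sketch the effect with a small, purely illustrative helper.

```python
# A conceptual sketch of a stop sequence: simulate the model stopping by
# truncating an already-generated string at the first occurrence of the
# sequence. This mimics the behavior; it is not how the model is implemented.

def apply_stop_sequence(text: str, stop: str) -> str:
    """Return text up to (but not including) the first occurrence of stop."""
    index = text.find(stop)
    return text if index == -1 else text[:index]

# Without the stop sequence, the model might run on into the next example.
generated = "A sleek pair of earbuds with all-day battery life.\n--\nProduct: Desk"
print(apply_stop_sequence(generated, "--"))
```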
Being Creative vs. Predictable
Probably the most useful set of parameters are the ones that we can tune to control the randomness of the output. The beauty of working with LLMs is that, for the same prompt, the next generated token will not be the same every time. Rather, it is sampled from a long list of possible tokens. This is where the creative aspect of LLMs comes from, allowing us to generate a variety of outputs given the exact same prompt.
But depending on your application, you may want to reduce, or increase, this level of randomness. You can do this by adjusting a number of parameters.
But before looking at the parameters, it’s worth taking the time to understand how the model selects the next token to generate. It does this by assigning a likelihood number to every possible next token. For example, the model would see that the token cookies has a much higher likelihood than chair of appearing after the phrase I like to bake. During text generation, there’s still a probability that chair would appear, but it is much lower than that of cookies. The parameters we are going to see now can change this behavior.
There are three parameters we can adjust for this.
- Temperature — controls how the model chooses its next token. A lower temperature causes the model to output text that is more predictable, while a higher temperature means that the output will be more creative. It is a number between 0 and 2, and in most cases, somewhere between 0 and 1 works fine.
- Top-k — limits the pool of tokens the model can choose from. The default is 0, which means the model will consider all possible next tokens out of the thousands available. If you change it to another number, say 100, the model will only consider that many of the most probable tokens.
- Top-p — is similar to top-k, except the pool is limited not by a maximum token count, but by a maximum cumulative sum of token probabilities. The default is 0.75. If you’d like to understand more about top-k and top-p, head over to this documentation page, which contains more in-depth explanations and examples.
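To make these three parameters concrete, here is a toy illustration over a made-up four-token distribution. Real models sample over tens of thousands of tokens and the scores below are invented for demonstration, but the mechanics of temperature, top-k, and top-p are the same.

```python
# Toy demonstration of temperature, top-k, and top-p on an invented
# next-token distribution for the phrase "I like to bake ...".
import math

logits = {"cookies": 4.0, "bread": 3.0, "cake": 2.5, "chair": 0.5}

def softmax(logits, temperature=1.0):
    """Turn scores into probabilities; temperature sharpens or flattens them."""
    exps = {t: math.exp(l / temperature) for t, l in logits.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

def top_k(probs, k):
    """Keep only the k most probable tokens, then renormalize."""
    kept = dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k])
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}

def top_p(probs, p):
    """Keep the smallest set of tokens whose cumulative probability reaches p."""
    kept, cumulative = {}, 0.0
    for token, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(kept.values())
    return {t: pr / total for t, pr in kept.items()}

cold = softmax(logits, temperature=0.2)  # sharp: "cookies" dominates
hot = softmax(logits, temperature=2.0)   # flatter: "chair" gets a real chance
print(top_k(softmax(logits), 2))         # only the two most likely tokens remain
```

Sampling from the filtered, renormalized distribution then picks the actual next token.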

Reducing Repetition
Two parameters allow you to control the amount of repetition in the generated text.
- Frequency penalty — penalizes tokens that have already appeared in the preceding text (including the prompt), and scales based on how many times that token has appeared. So a token that has already appeared 10 times gets a higher penalty (which reduces its probability of appearing) than a token that has appeared only once.
- Presence penalty — applies the penalty regardless of frequency. As long as the token has appeared once before, it will get penalized.
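A toy sketch can show how the two penalties differ. The penalty rule and numbers below are illustrative assumptions, not the model’s actual internal formula.

```python
# An illustrative sketch of frequency vs. presence penalties: both lower a
# token's score once it has appeared, but only the frequency penalty scales
# with the repeat count. The formula here is an assumption for demonstration.

def penalize(score: float, count: int,
             frequency_penalty: float = 0.0,
             presence_penalty: float = 0.0) -> float:
    """Lower a token's score based on how often it has already appeared."""
    if count > 0:
        score -= frequency_penalty * count  # scales with each repetition
        score -= presence_penalty           # flat, applied once it has appeared
    return score

# "the" has appeared 10 times, "cookies" once, and "chair" never.
counts = {"the": 10, "cookies": 1, "chair": 0}
scores = {t: penalize(5.0, c, frequency_penalty=0.3, presence_penalty=0.5)
          for t, c in counts.items()}
print(scores)
```

The heavily repeated token ends up with the lowest score, while the unseen token is untouched.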
Saving and Sharing Your Prompts
You can save the prompts that you have created by clicking on the Save button on the bottom right of the Playground. Once you have saved a prompt, it will appear as a preset in the Playground’s left pane.
You can also share these prompts with others. To do this, click on the Share button on the top right of the Playground. You will get a link which you can share with anyone!
Conclusion
In this article, we covered how to prompt a model — probably the most important, and definitely the most fun part of working with large language models. If you’d like to dive deeper into it, here are some resources you can go to for further reading:
- Documentation on designing a prompt
- Documentation on text generation
- The Generate API reference
- More prompt tips and tricks
In Part 2, we will explore the range of use cases and areas where generative AI in language can be applied. And along the way, we’ll see the different ways a prompt can be designed.