How To Train Your Pet LLM: Prompt Engineering

The prompt is the input you give to a Large Language Model and is the best way to influence its output. In this post, we cover building the perfect prompt and mistakes to watch out for.

One of the reasons why Large Language Models have taken the world by storm is their ability to manipulate or generate text for a wide variety of purposes without much instruction or training.

Having been trained on large amounts of content already, the model can often produce the right outcomes based on a simple task description. In the world of LLMs and Generative AI, this task description is called a prompt.

The prompt is one of the best ways you can influence the outcome of the LLM, and in this article, we’ll share some tips and tricks on how to get your prompts right.

Prompts 101

It’s quite expensive to build and train your own Large Language Models. Most people prefer to use a pre-trained model, like the ones Cohere offers, which you can access through our API.

When calling the API, you need to pass in some parameters, like how random you want the output to be, how long you want it to be, and so on. One of those parameters is called the prompt and this is where you can describe to the model what you want it to do.

Think of it like giving somebody an instruction, except in this case that somebody is a Language AI. So in this screenshot, I gave it the instruction “Write a sentence using the word ocean” and it responded with (the part in bold) “The ocean is vast and beautiful.”

A basic prompt in the Cohere Playground

That was a pretty basic example. When you’re talking to a human, you can often be vague with your instructions and expect the other person to understand. With a Language Model, you sometimes need to be a little clearer with your instructions or word them in a certain way to get the best outcome. This is called prompt engineering.

Zero-Shot Learning

In the example above, I gave the model a simple instruction as a prompt and it gave me the expected output. This is called zero-shot learning. We didn’t need to train the model on writing sentences using the word “ocean”. We just told it to do so and it figured it out.

Another example of prompts with zero-shot learning would be asking the model to translate a sentence from one language to another.

I’ve often found that zero-shot learning works best for simple tasks, like writing a sentence, translating something, making a list, and so on. However, in the real world, tasks aren’t always that simple. And to truly get the most out of a powerful AI like Cohere, we need to give the model a few examples.

Few-Shot Learning

Let’s say we do marketing for a shoe company. We’ve designed a new shoe for long-distance runners and want a catchy description we could use on our product page and ads. I could simply try a zero-shot approach and ask the model to generate a product description for the shoe.

Generating product descriptions in Cohere

Ok, not bad, but it’s pretty dry. I think I want something a little more... Mad Men.

This is where we need to “train” the model with a few examples. This is called few-shot learning, where we prompt the model with a few examples so that it picks up on the pattern and style we’re going for.

Generating using few-shot learning

Ok, now we’re talking! As you can see, I’m using the task description from earlier but I also added two examples that I pulled from Casper, the mattress company, and Glossier. These serve as guidelines for the AI, which takes cues from the writing style and presents our shoe in a similar fashion.
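If you want to build a prompt like this in code, here is a minimal sketch. The function name, the "Product:" and "Description:" labels, and the "--" separator are my own assumptions for illustration, not a format the model requires, and the example descriptions are placeholders you would swap for real copy.

```python
SEPARATOR = "\n--\n"  # assumed delimiter between examples; any consistent marker works


def build_few_shot_prompt(task, examples, new_product):
    """Join a task description, worked examples, and an open slot for the model."""
    blocks = [task]
    for product, description in examples:
        blocks.append(f"Product: {product}\nDescription: {description}")
    # End on an empty "Description:" so the model continues the pattern.
    blocks.append(f"Product: {new_product}\nDescription:")
    return SEPARATOR.join(blocks)


# Placeholder examples; paste in descriptions whose style you want to imitate.
examples = [
    ("A mattress", "<paste a mattress description you like>"),
    ("A skincare set", "<paste a skincare description you like>"),
]

prompt = build_few_shot_prompt(
    "Write a catchy product description.",
    examples,
    "A shoe for long-distance runners",
)
print(prompt)
```

The important part is the last block: the prompt stops right where the new description should begin.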

I really like this description but there’s one problem. My shoe is not called the Nike Zoom Pegasus Ultra, although it sounds really cool.

Context Stuffing

We need to add more context to our prompt to get better outputs. To be honest, though Cohere was able to generate an interesting description, our basic prompt is pretty vague. If you asked a copywriting expert to write a product description for a long-distance running shoe, they too would want more context. What’s the company, what’s the product, what’s unique about it, and so on? Let’s see what happens when we feed this data in.

Adding context to your prompts

Perfect! It got the name right and also added more details. We can see it made up the point about having a carbon fiber plate but, with more context, we can solve that. I can simply add a line like this to each example:

Materials: Made with <insert respective materials here>.

The more context you can add to your prompt, the better the output is, especially if details matter to you. Of course, you don’t want to go overboard and put in so much context that you end up limiting the AI or making more work for yourself.

I’ve found that 3-4 lines of context are the sweet spot. You can also add a little more detail to the context itself than I did. I also used just two examples but sometimes three might be better.
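One way to keep the context consistent across every example is to render each product from the same set of labeled fields. This is only a sketch: the field names and the helper functions below are hypothetical, and the values are placeholders.

```python
def render_item(fields, description=""):
    """Render one product as 'Label: value' lines followed by its description."""
    lines = [f"{label}: {value}" for label, value in fields.items()]
    # An empty description leaves a bare "Description:" for the model to complete.
    lines.append(f"Description: {description}".rstrip())
    return "\n".join(lines)


def build_context_prompt(task, examples, new_item):
    """examples is a list of (fields, description) pairs; new_item is a fields dict."""
    blocks = [task]
    blocks += [render_item(fields, description) for fields, description in examples]
    blocks.append(render_item(new_item))
    return "\n\n".join(blocks)


# Hypothetical fields; three or four lines of context per item.
example_fields = {
    "Company": "<example company>",
    "Product": "<example product>",
    "Materials": "Made with <insert respective materials here>.",
}
shoe_fields = {
    "Company": "<your company>",
    "Product": "<your shoe's name>",
    "Materials": "Made with <insert respective materials here>.",
}

prompt = build_context_prompt(
    "Write a catchy product description.",
    [(example_fields, "<example description>")],
    shoe_fields,
)
print(prompt)
```

Because every item uses the same labels in the same order, the model sees one clear pattern and is less likely to invent details you already supplied.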

Tips And Tricks

Experiment With It

You’re not going to get your prompts right the first time. You’ll need to try out a few different types of prompts, change your few-shot examples, and test them to see what’s giving you the best outputs. The Cohere Playground, which you can see in the screenshots above, is the best place to do this.

I’d also suggest running the same prompt a few times and judging the overall quality across those runs rather than relying on a single output. If you’re putting it into production, you can also generate n outputs and serve only the best one to the client. For example, you may want to eliminate outputs that are too similar to the input or have too much repetition.
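Here is a rough sketch of that filtering step in Python. The thresholds and the repetition measure are assumptions I picked for illustration, so tune them on your own outputs. The candidates would come from calling the model several times with the same prompt.

```python
from difflib import SequenceMatcher


def too_similar(candidate, prompt, threshold=0.8):
    """True if the candidate mostly echoes the prompt (threshold is an assumption)."""
    ratio = SequenceMatcher(None, candidate.lower(), prompt.lower()).ratio()
    return ratio >= threshold


def repetition_score(text):
    """Fraction of words that repeat an earlier word (0.0 means no repeats)."""
    words = text.lower().split()
    if not words:
        return 1.0
    return 1 - len(set(words)) / len(words)


def pick_best(candidates, prompt, max_repetition=0.5):
    """Drop empty, echoed, or repetitive outputs, then return the least repetitive."""
    kept = [
        c for c in candidates
        if c.strip()
        and not too_similar(c, prompt)
        and repetition_score(c) <= max_repetition
    ]
    return min(kept, key=repetition_score) if kept else None


prompt = "Write a sentence using the word ocean."
candidates = [
    "Write a sentence using the word ocean.",   # echoes the input
    "The ocean the ocean the ocean the ocean",  # too repetitive
    "The ocean is vast and beautiful.",
]
print(pick_best(candidates, prompt))
```

In a real app you would probably add your own quality checks on top, such as length limits or banned phrases.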

Make The Prompt Dynamic

In the example above, I added descriptions from Casper and Glossier in my prompt. Both are D2C eCommerce companies so they were good examples for my made-up shoe company.

However, if you’re building an app with the Cohere API, you may have users who want to generate product descriptions for their SaaS product. Hard-coding eCommerce examples into your prompt may not be the best idea. Try building a database of different examples you can use in your prompt and dynamically insert them based on the use case.
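As a sketch of what that could look like, the snippet below keeps examples in a dictionary keyed by use case and picks the matching set when building the prompt. The dictionary stands in for a real database, and the category names, helper names, and placeholder descriptions are all assumptions.

```python
# Stand-in for a database table of examples, keyed by use case (hypothetical data).
EXAMPLE_BANK = {
    "ecommerce": [
        ("A mattress", "<example description goes here>"),
        ("A skincare set", "<example description goes here>"),
    ],
    "saas": [
        ("A project management app", "<example description goes here>"),
        ("An invoicing tool", "<example description goes here>"),
    ],
}


def select_examples(use_case, limit=2, default="ecommerce"):
    """Return up to `limit` examples for the use case, falling back to a default set."""
    key = use_case.strip().lower()
    return EXAMPLE_BANK.get(key, EXAMPLE_BANK[default])[:limit]


def build_prompt(use_case, new_product):
    """Assemble a few-shot prompt from the examples that match the use case."""
    blocks = ["Write a catchy product description."]
    for product, description in select_examples(use_case):
        blocks.append(f"Product: {product}\nDescription: {description}")
    blocks.append(f"Product: {new_product}\nDescription:")
    return "\n\n".join(blocks)


print(build_prompt("SaaS", "A customer support chatbot"))
```

The fallback matters: if a user's use case isn't in your bank yet, they still get a working prompt instead of an error.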

Provide Guardrails

Let’s go back to the translation task from earlier. I’m going to give it the same prompt but with a very small change.

Prompts gone wrong

Uh oh, what happened here? At its core, a language model tries to predict the next word in a sequence. You may give it a task description that makes sense to a human but the language model doesn’t actually understand it. It’s simply continuing the pattern.

That’s why few-shot works so well. You’re building a pattern that the model can follow. In this new translation example, it’s also following a pattern, just not the pattern you were expecting.

To avoid this, write your prompt in a way that forces the model to perform the task rather than simply continue the instruction. You can do it the way I did it earlier, or an even better prompt would be:

Translate the following sentence into Spanish.

English: How are you?

Spanish:

This will almost guarantee that the next set of words generated is the Spanish translation of the English sentence.
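If you reuse this structure a lot, a small template function keeps it consistent. This is just a sketch, and the function name and default languages are my own choices.

```python
def translation_prompt(sentence, source="English", target="Spanish"):
    """Build a prompt that ends on the target-language label, ready to be completed."""
    return (
        f"Translate the following sentence into {target}.\n\n"
        f"{source}: {sentence}\n\n"
        f"{target}:"
    )


print(translation_prompt("How are you?"))
```

Ending the prompt on the "Spanish:" label is the guardrail: the most natural continuation is the translation itself.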

Small Changes In Prompts Can Lead To Vastly Different Outputs

As you noticed in the translation example above, a small change in the prompt led to vastly different outputs. Sometimes, if your prompt is not working, it might not be because of the way you've worded it but instead the way you've structured it.

Avoid Obvious Patterns

Speaking of patterns, try not to use examples in your prompts that are too similar unless you’re going for a very specific style. I used a mattress company and a beauty brand with very different copywriting styles. Had I gone with two mattress examples, my shoe may have ended up sounding like a nice pillow.

Another pattern the model could pick up on is repetition or spelling mistakes. So if you use a certain word often in the prompt, it’s likely to show up in the output. If your prompt is filled with spelling errors or poor grammar, those may also end up in the output.
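A quick way to spot accidental repetition before you send a prompt is to count its words. The sketch below is a rough check under my own assumptions: the stopword list is deliberately tiny and the cutoff of three occurrences is arbitrary.

```python
import re
from collections import Counter

# A tiny, assumed stopword list; extend it for real use.
STOPWORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "for", "with", "is", "it"}


def overused_words(prompt, min_count=3):
    """Return non-stopwords that appear at least `min_count` times in the prompt."""
    words = re.findall(r"[a-z']+", prompt.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return {word: n for word, n in counts.items() if n >= min_count}


print(overused_words("Soft pillow. Soft mattress. Soft sheets for a soft night."))
```

If a word shows up here and you don't want it dominating the output, vary the wording in your examples.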

Limit Output Size

Since the language model is predicting the next word in a sequence, it may predict something that makes sense but then goes off on a weird tangent. If you want to avoid that, don’t let the model generate for too long. Have it generate short amounts of text by limiting it with the “number of tokens” parameter.

If you want longer outputs, you can feed the short output back into the model and have it continue from there. A better way to do this is to run some sort of post-processing on the first output, like removing repetition or irrelevant content, and then feed the cleaned version back in.
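Here is a minimal sketch of that generate, clean, and continue loop. The `generate` argument is a hypothetical callable that takes a prompt and a token limit and returns text, so you would wrap your actual API call to match it. The sentence splitting is deliberately naive, and the stub model exists only to make the example runnable.

```python
def drop_repeated_sentences(text):
    """Remove exact repeated sentences (naive split on '.', ignores abbreviations)."""
    seen, kept = set(), []
    for sentence in text.split("."):
        cleaned = sentence.strip()
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    return " ".join(s + "." for s in kept)


def generate_long(generate, prompt, rounds=3, max_tokens=50):
    """Generate in short chunks, cleaning the text before feeding it back in."""
    text = ""
    for _ in range(rounds):
        chunk = generate(f"{prompt} {text}".strip(), max_tokens)
        text = drop_repeated_sentences(f"{text} {chunk}")
    return text


def fake_generate(prompt, max_tokens):
    """Stub standing in for a real model call; always returns the same text."""
    return "The ocean is vast. The ocean is vast. It is also deep."


print(generate_long(fake_generate, "Describe the ocean.", rounds=2))
```

You can swap `drop_repeated_sentences` for any cleanup step that fits your use case, such as trimming off-topic sentences.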

Try The Playground

Playing around with prompts and finding new ways to produce the right outputs is a lot of fun. The Cohere Playground is the perfect place to try this out. It’s a visual interface where you can test prompts before you put them into production.

Getting started with the Playground is really easy. You can sign up here and within a couple of minutes, you’ll be able to play with our Generate models. If you want to learn more about how the Playground works, you can see the documentation here.