Hackathons are a great place to get hands-on with new tools and technologies. At Cohere, we encourage developers to learn more about natural language processing (NLP), explore its potential, and get to know the Cohere Platform. That’s why we’re partnering with lablab.ai to host a series of online hackathons focused on language AI. During our initial event in August, the first-place winner was a chat-based customer support system called “Turing Test.”

TL;DR:

In this article, we go over how the Turing team used text generation with embedding endpoints to build a chatbot for customer support.

This article’s title and TL;DR have been generated with Cohere.

Team Turing, consisting of Bence Gadanyi, Edwin Holst, Jonathan Fernandes, and Artur Gasparyan, presented their project as an NLP tool for customer chat support agents. The program uses a company’s knowledge base to generate automated responses to customer queries. When implemented successfully, this type of app could substantially reduce the costs and time associated with customer support training and increase overall staff efficiency.

In this article, we’ll go over the details of this application from frontend to backend, including how this creative team used the Cohere Generate and Cohere Embed endpoints to develop a practical business solution for customer support chatbots.

Enhance Customer Chatbot Support with LLMs

Contacting customer support often takes a long time because high request volumes, from both customers and internal staff, lead to extended wait times. Additionally, because many of these queries relate to specific product information, support agents need to research the queries manually, undergo continual training to familiarize themselves with inventory products and specifications, or, in many cases, both.

The Turing team’s app aims to solve this issue by providing agents with automated responses to these product-specific questions. This allows agents to reply to customers more rapidly, enabling them to resolve more requests in the same amount of time.

The step-by-step process behind this NLP-powered app is relatively simple. First, the app extracts product information from the company website and stores it in a knowledge base as text. It then uses the Cohere Generate endpoint to create questions and answers (Q&As) from that text, and the Cohere Embed endpoint to turn them into embeddings that can be searched.

Based on input from the customer support chatbot, the tool creates three appropriate responses to the customer’s message. The support agent can either reply using one of the responses as-is, edit one before sending it as a response, or answer the query manually.

That is the app’s process in a nutshell. Next, let’s explore each step in more detail, starting with the knowledge base.

Generating the Knowledge Base

The app generates a knowledge base by extracting the product description from the company’s website and creating a vector embedding. It will use this embedding later for comparing the semantic similarity between the knowledge base and the customer’s message.

For this project, the developers used information from Two Wests & Elliott, an eCommerce store selling equipment for greenhouses and gardens, to generate the knowledge base. Below is the webpage for the “Halls Standard Cold Frame” item, which lists product information, including the price, dimensions, materials, and delivery options.

The app includes a prompt instructing the Generate endpoint to create Q&As using the product’s description. It also provides sample Q&As that better enable the model to understand the downstream task.

In this example, the app uses the pictured product description to create a prompt with the instruction: “Generate questions from this text.” The prompt contains the product information and a few sample Q&As that show the model the kind of output it should generate.

Chatbot support question and answer prompts example
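
A prompt of this kind can be assembled with plain string formatting. The sketch below is only illustrative: the function name `build_qa_prompt`, the `--` separator, the `Text:`/`Question:`/`Answer:` labels, and the sample product text and Q&A are all assumptions, not Team Turing's actual prompt.

```python
def build_qa_prompt(description, examples):
    """Build a few-shot prompt asking the model for Q&As about a product.

    `examples` is a list of (question, answer) pairs shown to the model.
    """
    lines = ["Generate questions from this text.", "", f"Text: {description}", "--"]
    for question, answer in examples:
        lines.append(f"Question: {question}")
        lines.append(f"Answer: {answer}")
        lines.append("--")
    # End with an open "Question:" so the model continues the pattern.
    lines.append("Question:")
    return "\n".join(lines)


# Hypothetical product text and sample Q&A, for illustration only.
prompt = build_qa_prompt(
    "Example cold frame with an aluminium frame.",
    [("What is the frame made of?", "Aluminium.")],
)
print(prompt)
```

Ending the prompt on an open `Question:` line nudges the model to continue with a new question and answer in the same format as the samples.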

From this example, you can see that the app has collected everything it needs to generate responses for the most common types of customer support questions. But how exactly does it do that?

How the App Works

The backend of the app consists of two parts. The first uses semantic search to compare the message that the customer enters with the existing information in the knowledge base. To do so, it calls the Cohere Embed endpoint, which returns an embedding for the customer’s query: a list of floating point numbers that represents the meaning of the text.

The app also uses Annoy, a library with Python bindings for approximate nearest neighbor search over points in space, to build an index that stores the embeddings and retrieves the ones closest to a query.

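import cohere
import numpy as np
from annoy import AnnoyIndex

# Embed every question in the knowledge base (`dataframe` holds the generated Q&As)
co = cohere.Client('{apiKey}')
embeds = co.embed(texts=list(dataframe['question']), model="large", truncate="LEFT").embeddings

# Store the embeddings in an Annoy index for nearest neighbor search
embeds = np.array(embeds)
num_entries, num_dimensions = embeds.shape
search_index = AnnoyIndex(num_dimensions, 'angular')
for i in range(len(embeds)):
    search_index.add_item(i, embeds[i])
# Annoy requires build() before the index can be queried; 10 trees is an assumed value
search_index.build(10)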
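
Once the index exists, answering a query means embedding the customer's message and looking up its nearest neighbors. Annoy's 'angular' metric is the Euclidean distance between normalized vectors, sqrt(2(1 - cos)), so ranking by it is equivalent to ranking by cosine similarity. The brute-force sketch below shows that lookup in plain Python, with made-up three-dimensional vectors standing in for real embeddings. The function names are illustrative, and Annoy itself uses an approximate tree-based search rather than this exhaustive scan.

```python
import math


def angular_distance(u, v):
    """Distance used by Annoy's 'angular' metric: sqrt(2 * (1 - cos(u, v)))."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = dot / (norm_u * norm_v)
    # Clamp to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 2.0 * (1.0 - cosine)))


def nearest_neighbors(query, vectors, n=2):
    """Return the indices of the n vectors closest to the query."""
    ranked = sorted(range(len(vectors)), key=lambda i: angular_distance(query, vectors[i]))
    return ranked[:n]


# Toy "embeddings" (assumed values) for three knowledge-base questions.
knowledge_base = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
query_embedding = [1.0, 0.05, 0.0]
closest = nearest_neighbors(query_embedding, knowledge_base, n=2)
print(closest)
```

With these toy values, the first two entries point in nearly the same direction as the query, so they are returned ahead of the third, which is orthogonal to it.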

The second part of the tool handles cases where the answer to the customer’s question is not in the knowledge base or when the question is not relevant to the product. To do this, the tool uses the Generate endpoint to provide a reasonable response to the customer’s query.

In this example, the customer’s statement has not specified the product for which they need information.

prompt_text = f"""You are a customer support agent responding to a customer.
--
Customer: Hello.
Agent: Hello, what can I help you with today?
--
Customer: I'm looking for information on a product
Agent:"""

response = co.generate(model='xlarge',
            prompt=prompt_text,
            max_tokens=15,
            temperature=0.3,
            k=0,
            p=0.75,
            frequency_penalty=0,
            presence_penalty=0,
            stop_sequences=["--"],
            return_likelihoods='NONE')

In this instance, Cohere Generate returns the following response:

cohere.Generations {
	generations: [cohere.Generation {
	text:  I'm sorry, what product are you looking for?
--
	likelihood: None
	token_likelihoods: None
}]
	return_likelihoods: NONE
}
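
The prompt above hard-codes one exchange. In a live chat, the same pattern could be built from example turns plus the latest customer message, and the generated text could be tidied before it is shown to the agent, since the returned text above includes a leading space and the `--` stop sequence. The helpers below are a sketch with hypothetical names (`build_agent_prompt`, `clean_generation`), not code from the project.

```python
def build_agent_prompt(example_turns, customer_message):
    """Build a few-shot prompt that ends with an open 'Agent:' line."""
    parts = ["You are a customer support agent responding to a customer."]
    for customer, agent in example_turns:
        parts.append("--")
        parts.append(f"Customer: {customer}")
        parts.append(f"Agent: {agent}")
    parts.append("--")
    parts.append(f"Customer: {customer_message}")
    parts.append("Agent:")
    return "\n".join(parts)


def clean_generation(text, stop_sequence="--"):
    """Drop the stop sequence (and anything after it) and trim whitespace."""
    return text.split(stop_sequence, 1)[0].strip()


prompt_text = build_agent_prompt(
    [("Hello.", "Hello, what can I help you with today?")],
    "I'm looking for information on a product",
)
reply = clean_generation(" I'm sorry, what product are you looking for?\n--")
print(prompt_text)
print(reply)
```

Called with the exchange shown earlier, `build_agent_prompt` reproduces the same prompt text, and `clean_generation` reduces the returned generation to the sentence the agent would actually send.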

Building the Chatbot Interface

While the backend is written in Python, the two frontend interfaces (one for the customer and one for the support agent) are written in JavaScript.

Customer Support Chatbot Agent Interface

The support agent can access the interface by navigating to {url}/customer-support. Then, they can connect with a customer in the chatbot window to send and receive messages.

The code for this connection looks as follows:

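var socket = new WebSocket("ws://127.0.0.1:8000/api/chat/Customer Support");
socket.onmessage = function (event) {
  // Container element that holds the chat messages (selected with jQuery)
  var parent = $("#messages");
  // Each message arrives as JSON with a "sender" field
  var data = JSON.parse(event.data);
  var sender = data["sender"];
  // Label the current user's own messages as "You"
  if (sender === current_user) sender = "You";
  // ... (the excerpt ends here; rendering into `parent` is not shown)
};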

Customer Chatbot Interface

The customer interface is accessible at the path {url}/customer-chat. The customer initiates a chat with an agent and can send and receive messages when the agent connects.

Here’s the code for connecting from the customer’s side:

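var socket = new WebSocket("ws://127.0.0.1:8000/api/chat/Customer Support");
socket.onmessage = function (event) {
  // Container element that holds the chat messages (selected with jQuery)
  var parent = $("#messages");
  // Each message arrives as JSON with a "sender" field
  var data = JSON.parse(event.data);
  var sender = data["sender"];
  // Label the current user's own messages as "You"
  if (sender === current_user) sender = "You";
  // ... (the excerpt ends here; rendering into `parent` is not shown)
};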

The Chatbot Application in Action

Now that you’ve thoroughly explored the application, let’s see how all the pieces come together.

First, the customer initiates a chat by entering a message into the chat window. Then, the customer support agent joins. The customer can then ask for product-specific information by entering their query. The app converts this query into a vector embedding and measures the semantic similarity between the query and the information from the knowledge base.

The semantic search over these embeddings then returns the two closest matches from the knowledge base as potential responses to the query. Cohere Generate also returns a product-nonspecific response as the third automated response.

From there, the agent can answer the query using any of these responses. The agent can also choose to edit the automated response before sending it.
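
Put together, the suggestion step could look roughly like the sketch below: take the answers attached to the two nearest knowledge-base questions and add one generated fallback. The function name, the `search` and `generate` callables, and the placeholder data are assumptions for illustration; in the real app they would wrap the Annoy lookup and the Cohere Generate call.

```python
def suggest_responses(message, answers, search, generate, n_matches=2):
    """Return n_matches knowledge-base answers plus one generated reply.

    search(message, n) -> list of indices into `answers`
    generate(message)  -> a product-nonspecific reply string
    """
    suggestions = [answers[i] for i in search(message, n_matches)]
    suggestions.append(generate(message))
    return suggestions


# Stand-ins for the real semantic search and Generate call (assumptions).
def fake_search(message, n):
    return list(range(n))


def fake_generate(message):
    return "Could you tell me which product you mean?"


answers = ["Answer A", "Answer B", "Answer C"]
options = suggest_responses("How wide is it?", answers, fake_search, fake_generate)
print(options)
```

Passing the search and generation steps in as callables keeps the assembly logic easy to test without calling any external API.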

Conclusion

While NLP is relatively new as a widely available tool, it is already changing how technology and businesses operate. Chat-based customer support systems can benefit substantially from implementing NLP-based functionality. They represent just one of an abundance of NLP applications, and the potential of NLP extends as far as its users’ aspirations. The developers of the customer support app had just three days to create it, so imagine what you can create with just a little more time.

Moreover, these tools are well within your reach. Cohere brings this technology right to your fingertips, providing access to advanced NLP functionality via an innovative and intuitive platform.
Check out Team Turing’s project code or watch their hackathon presentation to learn more about their customer support app powered by Cohere. Maybe you’ll even feel inspired to try out your own programming prompts in the Cohere Playground. Or better yet, join the next Cohere/lablab.ai hackathon starting on December 2, 2022!