Augmenting Personal Knowledge Management with Language AI

We’ll explore how developers can build the next generation of digital note-taking tools, powered by language AI.

What’s Next in Personal Knowledge Management?

The internet has made the world a smaller place and enabled us to collectively produce a massive amount of information. But turning this firehose of raw data into useful knowledge is challenging.

Because of this, the topic of Personal Knowledge Management (PKM) has been getting a lot of attention in recent years. It looks into how we can efficiently capture, organize, and ultimately make use of all the information around us.

The Personal Knowledge Management space is trending up [Source: Google Trends].

So, it’s not surprising that digital note-taking applications are now proliferating, coming in different shapes and flavors. They let us conveniently capture and organize our notes, access them when we need them, search for information easily, and share notes with others.

Having said that, these tools still require users to invest significant time and effort to make them work, and not everyone has the patience and discipline to keep a PKM system running.

What if we could leverage language AI technology to make the whole experience more frictionless, and even more enjoyable? Imagine fusing language AI into these note-taking tools and taking them to the next level: organizing, condensing, connecting, and even creating new information.

There exists an opportunity for developers and entrepreneurs to build innovative products that help users get even more out of what they consume.

Language AI can take PKM up another level.

Folders vs. Graphs

In this article, we’ll focus on one particular PKM use case: how we make sense of our notes. The premise is that, while we typically organize information in files and folders, our brains don’t actually work that way. Our creativity comes from connecting seemingly disconnected ideas, and a folder structure for organizing information does not help us do that.

This is where the graph approach comes in. The idea is to connect information in a network-like fashion rather than in folders, which frees up information instead of having it stuck in compartments.

The graph approach frees up information instead of having it stuck in compartments.

The graph is analogous to how the World Wide Web is organized. One web page contains links pointing to other web pages, which in turn contain more links pointing to other web pages. As a website’s information expands, its number of pages increases, and eventually they all form a web of information connected via links.

The graph works in a similar manner, except now the context is your personal collection of notes. When your notes are connected in this way (instead of being buried somewhere and never to be found again), they become more discoverable. This facilitates the serendipitous generation of ideas from the notes that you have been curating all the while.

There are already a number of digital note-taking tools that support the creation of such graphs. But as mentioned earlier, making them work requires users to invest a substantial amount of effort in building the links manually over time.

What if we could build an AI assistant that makes these connections automatically? And what if this AI assistant could also help generate notes and surface new ideas collaboratively with users?

Let’s go through a demo to explore this idea.

A Quick Demo

In this demo, we’ll explore two specific ways that language AI can be used to enhance the experience of using note-taking applications, namely:

  • Automatically building links from existing notes
  • Generating ideas when writing new notes

The Generate Endpoint

Large language models (LLMs) are pre-trained on a massive collection of text, which makes them capable of capturing the patterns of how humans use language. The outcome: given just a simple prompt, these models can generate impressively original and coherent text.

With Cohere’s API, this capability is served by the Generate endpoint. Given an input (called a “prompt”), the endpoint will generate a new stream of text. The purpose of a prompt is to provide a context for the text that we want the model to generate.

A prompt format for text generation that generally works well.

A basic format that generally works well contains a short description of the overall context, followed by a few examples of prompts and completions. These few examples establish a pattern, acting as a small set of “training” data that tells the model what kind of text to generate next. Read this documentation if you would like to learn more.
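
To make the format concrete, here is a minimal sketch of how such a prompt could be assembled in Python. The helper name build_few_shot_prompt and the shortened example texts are our own illustrations, not part of the Cohere SDK; the "--" separator mirrors the one used in the prompts later in this article.

```python
def build_few_shot_prompt(description, examples, input_label, output_label, separator="--"):
    """Assemble a few-shot prompt: a description, worked examples, then an open slot."""
    blocks = [
        f"{input_label}: {text}\n{output_label}: {completion}\n"
        for text, completion in examples
    ]
    # The final block leaves the input label open so a new input can be appended
    blocks.append(f"{input_label}:")
    return f"{description}\n\n" + f"\n{separator}\n".join(blocks)


# Shortened, illustrative examples (hypothetical, not the full notes used in the demo)
examples = [
    ("Rock climbing is a fun way to stay fit.", "fitness, rock climbing"),
    ("A good grinder makes a uniform grind.", "coffee grinder"),
]
prompt = build_few_shot_prompt(
    "This program extracts the key concepts from a note.",
    examples,
    "Note",
    "Key Concepts",
)
print(prompt)
```

Keeping the description, the examples, and the labels as separate arguments makes it easy to reuse the same helper for a different task by swapping in new examples.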

Use Case #1: Build Links from Notes

The Generate endpoint suggesting potential new topics.

The Generate endpoint can be applied in many different use cases, and one of them is in extracting information from a piece of text. The key is in the prompt, where we need to show a few examples of a piece of text and the kind of information to extract from the text.

We’ll build an extraction step that takes each note (let’s call this the “parent” note) and suggests parts of that note that can be expanded into their own notes. These new “child” notes would then be linked back to the parent note.

The prompt we’ll use is as follows. It contains a few examples of a note and the corresponding topics or key concepts within the note.

This program extracts the key concepts from a note.

Note: For a healthy, fun way to stay fit, rock climbing is an excellent option. Rock climbing is a fun and rewarding sport that's suitable for people of all ages and fitness levels. It can help us stay fit and reduce stress, and it can also be a great family activity.
Choosing the right equipment is essential. There are many different climbing areas and equipment options, but the first thing needed is a harness and a helmet.
Key Concepts: fitness, rock climbing, choosing the right fitness equipment, places to go rock climbing

--
Note: The brewing process for coffee can be a little bit confusing, but with the right equipment, it isn't that difficult. There are many different techniques available, but the most important thing to know is how to make the coffee evenly throughout the day.
One cool skill to have is knowing how to make our own coffee grounds. We'll need a grinder, which is available at a good store or online. There are many different types of grinders, but one key feature to look for is those that make a uniformly fine grind.
Key Concepts: coffee brewing techniques, make our own coffee grounds, coffee grinder


--
Note: Graphic design is a type of visual art that uses pictures, symbols, and text to create a visual representation of an idea. Graphic design can be challenging but also a rewarding career option. Many different types of businesses use graphic design, from retail stores to advertising companies. It can take a lot of time to master, but it can also be rewarding once someone gets good at it.
Most people who are interested in graphic design will start by taking a basic graphic design course. These courses will give all the knowledge needed in order to enter the field, and they'll also teach about different options.
Key Concepts: graphic design, rewarding career option

--
Note:

Let’s test it with a couple of short notes, as follows.

Software Engineering:

Software engineering is a very broad field, but it’s also one of the fastest growing professions in the world. Software engineering is the application of engineering principles to software development.

Software engineering is about more than just writing code; it’s about designing and developing software that meets the needs of customers and users. Software engineers are responsible for creating new programs and applications, as well as maintaining and fixing existing ones.

Artificial Intelligence:

Artificial intelligence (AI) is the simulation of human intelligence processes by machines, especially computer systems.

Artificial intelligence (AI) is used in many applications today, including voice recognition, self-driving cars, and even some household appliances. AI is also used in video games, where it can control the behavior of non-player characters in order to create more realistic interactions between the player and the game world.

We call the Generate endpoint via the co.generate() method to generate the potential links for each of these two notes.

We won’t cover the details here, but if you’d like to understand what the parameters used in this method mean, you can read about them in the API documentation or this blog post.

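import cohere

# Assumes a valid API key; replace the placeholder with your own
co = cohere.Client('YOUR_API_KEY')

# note_prompt holds the prompt above followed by the note to analyze
prediction = co.generate(
    model='xlarge',
    prompt=note_prompt,
    max_tokens=30,          # topic lists are short
    temperature=0.2,        # low randomness suits extraction
    k=0,
    p=0.75,
    frequency_penalty=0.1,
    presence_penalty=0,
    stop_sequences=["--"])  # stop at the example separator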
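
The call above expects note_prompt to already contain the few-shot prompt followed by the note we want to analyze. The article doesn’t show that step, so here is one plausible way to do it. The helper name make_note_prompt and the shortened base prompt are assumptions for illustration only.

```python
def make_note_prompt(base_prompt, note_text):
    """Append a new note to the few-shot prompt and open the 'Key Concepts' slot."""
    # The base prompt ends with "Note:", so the note goes right after it
    return f"{base_prompt.rstrip()} {note_text.strip()}\nKey Concepts:"


# Shortened stand-in for the full prompt shown earlier (the examples are omitted here)
base_prompt = "This program extracts the key concepts from a note.\n\n--\nNote:"

note = "Software engineering is the application of engineering principles to software development."
note_prompt = make_note_prompt(base_prompt, note)
print(note_prompt)
```

Ending the prompt with "Key Concepts:" nudges the model to continue with a comma-separated list, following the pattern set by the examples.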

And using our two example notes, the endpoint suggests the following topics:

Software Engineering:
Software Development, Software Engineers

Artificial Intelligence:
Video Games, Non-Player Characters, Realistic Interactions
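
The endpoint returns the topics as a single string, so we need a small post-processing step before we can use them as links. Below is a minimal sketch, assuming the completion is a comma-separated list that may be followed by the "--" stop sequence; the function name parse_topics is hypothetical.

```python
def parse_topics(raw_completion, stop_sequence="--"):
    """Turn a raw completion such as ' a, b\\n--' into a clean list of topics."""
    # Drop the stop sequence and anything after it, in case it is echoed back
    text = raw_completion.split(stop_sequence)[0].strip()
    # Keep only the first line, which holds the comma-separated concepts
    first_line = text.splitlines()[0] if text else ""

    topics = []
    seen = set()
    for part in first_line.split(","):
        topic = part.strip()
        # Skip empty fragments and case-insensitive duplicates
        if topic and topic.lower() not in seen:
            seen.add(topic.lower())
            topics.append(topic)
    return topics


raw = " Video Games, Non-Player Characters, Realistic Interactions\n\n--"
print(parse_topics(raw))
```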

Notice how these suggested topics were taken from the note passage. Each one then becomes a new note that is linked back to the original note.
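
One simple way to record these links is an adjacency map from each note title to the set of titles it connects to. This is only a sketch of a possible data structure, not how any particular note-taking tool stores its graph; link_children is a name we made up.

```python
from collections import defaultdict


def link_children(graph, parent, children):
    """Record an undirected link between a parent note and each child topic."""
    for child in children:
        if child == parent:
            continue  # a note should not link to itself
        graph[parent].add(child)
        graph[child].add(parent)  # the child links back to its parent
    return graph


graph = defaultdict(set)
link_children(graph, "Software Engineering",
              ["Software Development", "Software Engineers"])
link_children(graph, "Artificial Intelligence",
              ["Video Games", "Non-Player Characters", "Realistic Interactions"])

print(sorted(graph["Software Engineering"]))
print(sorted(graph["Video Games"]))
```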

There are still many ways we can enhance this further. One of them is to add more sophistication to the link suggestions. We used a simple prompt consisting of a few examples of straightforward extractions in this demo. But sometimes we want the model to identify deeper concepts within a text, and that’s when it needs to see more examples to do its job well.

We can do this by finetuning a model. Finetuning uses a custom dataset to retrain a model, so it can specialize in performing a specific task. Finetuning with the Cohere API is a simple process, which you can read more about in the docs.

Use Case #2: Generate Note Ideas

The Generate endpoint turning a new topic into a note.

When building a personal knowledge base, the notes we create come uniquely from our own experiences. We record what we’ve learnt and what we’ve been thinking about, and these build over time. The main ideas and inspiration come from no one else but ourselves.

But what if we could get the help of an AI assistant to make the whole process more effective? Whenever you are stuck, the assistant comes to the rescue and suggests new ideas to be expanded upon or new areas to be studied.

Let’s see how we might do this. We can use the same Generate endpoint but with a different prompt. The prompt we’ll use contains a few examples of a topic name and its corresponding note.

This program will generate a note given the note title.
--
Note Title: Rock Climbing
Note: For a healthy, fun way to stay fit, rock climbing is an excellent option. Rock climbing is a fun and rewarding sport that's suitable for people of all ages and fitness levels. It can help us stay fit and reduce stress, and it can also be a great family activity.
Choosing the right equipment is essential. There are many different climbing areas and equipment options, but the first thing needed is a harness and a helmet.

--
Note Title: Brewing Coffee at Home
Note: The brewing process for coffee can be a little bit confusing, but with the right equipment, it isn't that difficult. There are many different techniques available, but the most important thing to know is how to make the coffee evenly throughout the day.
One cool skill to have is knowing how to make our own coffee grounds. We'll need a grinder, which is available at a good store or online. There are many different types of grinders, but one key feature to look for is those that make a uniformly fine grind.

--
Note Title: Career in Graphic Design
Note: Graphic design is a type of visual art that uses pictures, symbols, and text to create a visual representation of an idea or message.
Graphic design can be challenging but also a rewarding career option. Many different types of businesses use graphic design, from retail stores to advertising companies. It can take a lot of time to master, but it can also be rewarding once someone gets good at it.
Most people who are interested in graphic design will start by taking a basic graphic design course. 

--
Note Title:

We call the Generate endpoint again, this time raising max_tokens from 30 to 100 and temperature from 0.2 to 0.4 (here’s the documentation again), to generate a new note given a topic.

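# note_prompt holds the prompt above followed by the new note title
prediction = co.generate(
    model='xlarge',
    prompt=note_prompt,
    max_tokens=100,         # a note needs more room than a topic list
    temperature=0.4,        # slightly more randomness for freer writing
    k=0,
    p=0.75,
    frequency_penalty=0.1,
    presence_penalty=0,
    stop_sequences=["--"])  # stop at the example separator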
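
Since the two calls differ only in max_tokens and temperature, one option is to keep the shared settings in a single place and override just those two values per use case. The sketch below does that with plain dictionaries; the names SHARED_SETTINGS, LINK_SETTINGS, NOTE_SETTINGS, and make_title_prompt are our own, and the base prompt is shortened for brevity.

```python
# Settings shared by both use cases, taken from the two calls in this article
SHARED_SETTINGS = {
    "model": "xlarge",
    "k": 0,
    "p": 0.75,
    "frequency_penalty": 0.1,
    "presence_penalty": 0,
    "stop_sequences": ["--"],
}

# Link extraction: short output, low randomness
LINK_SETTINGS = {**SHARED_SETTINGS, "max_tokens": 30, "temperature": 0.2}
# Note generation: longer output, slightly more randomness
NOTE_SETTINGS = {**SHARED_SETTINGS, "max_tokens": 100, "temperature": 0.4}


def make_title_prompt(base_prompt, title):
    """Append a note title to the few-shot prompt and open the 'Note' slot."""
    return f"{base_prompt.rstrip()} {title.strip()}\nNote:"


# Show which settings actually differ between the two use cases
changed = {
    key: (LINK_SETTINGS[key], NOTE_SETTINGS[key])
    for key in NOTE_SETTINGS
    if LINK_SETTINGS[key] != NOTE_SETTINGS[key]
}
print(changed)

# Shortened stand-in for the full prompt shown above (the examples are omitted here)
base_prompt = "This program will generate a note given the note title.\n--\nNote Title:"
title_prompt = make_title_prompt(base_prompt, "Software Development")
print(title_prompt)
```

With an initialized client, the call would then look like co.generate(prompt=title_prompt, **NOTE_SETTINGS).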

As an example, let’s take one of the suggested topics we got in the previous section: Software Development. Below is the generated note.

Software Development:

Software development is the process of conceiving, specifying, designing, programming, documenting, testing, and bug fixing involved in creating and maintaining applications, frameworks, or other software components.

Software engineering is an engineering branch associated with development of software in a systematic method. The systematic method is known as software development life cycle (SDLC).

While this short note is unlikely to be the final version we’d keep, notice that it already contains many concepts and ideas that we could now expand on. So, instead of leaving us staring at a blank screen, an AI assistant gives us a welcome head start in growing a knowledge base.

Growing a Knowledge Base

We’ve now covered the two use cases: building links and generating note ideas. But their benefits will only become evident when there is some scale involved. So, let’s complete this demo with a hypothetical example of growing a knowledge base from scratch and visualizing the resulting graph.

We’ll start by seeding a few initial topics: Computer Science, Software Engineering, Programming, and Artificial Intelligence. We want the model to generate a short note for each of these topics, suggest new links from each note, and then repeat this cycle for a few rounds.
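
The cycle described above can be sketched as a simple loop. To keep the example runnable without an API key, the two model-backed steps are passed in as functions and replaced here with canned stand-ins; in a real run, write_note and suggest_links would wrap the two Generate calls from the previous sections. All names and the canned topics are hypothetical.

```python
def grow_knowledge_base(seed_topics, write_note, suggest_links, rounds=2):
    """Alternate between writing notes and extracting new topics for a few rounds."""
    notes = {}     # title -> note text
    links = set()  # (parent title, child title) pairs
    frontier = list(seed_topics)

    for _ in range(rounds):
        next_frontier = []
        for title in frontier:
            if title in notes:
                continue  # this note was already written
            notes[title] = write_note(title)
            for child in suggest_links(notes[title]):
                if child != title:
                    links.add((title, child))
                # Queue topics that have no note yet for the next round
                if child not in notes and child not in next_frontier:
                    next_frontier.append(child)
        frontier = next_frontier
    return notes, links


# Canned stand-ins for the two Generate calls (illustrative data only)
CANNED_LINKS = {"Programming": ["Algorithms", "Debugging"], "Algorithms": ["Sorting"]}


def fake_write_note(title):
    return f"{title}: a short placeholder note."


def fake_suggest_links(note_text):
    title = note_text.split(":")[0]
    return CANNED_LINKS.get(title, [])


notes, links = grow_knowledge_base(["Programming"], fake_write_note, fake_suggest_links)
print(sorted(notes))
print(sorted(links))
```

Topics suggested in the final round (such as "Sorting" here) appear as links but do not get a note yet; running more rounds would fill them in.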

Repeating the cycle of creating new topics and notes.

Once that’s done, we may want to visualize the graph of this knowledge base. With Python, we can use libraries such as Pyvis (which is what we are using here).
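
As a rough illustration of the Pyvis step, the sketch below turns a set of (parent, child) links into an interactive HTML graph. It assumes Pyvis is installed (pip install pyvis) and that links are stored as title pairs; the function names, sample links, and output file name are placeholders, and this is not necessarily how the graph in this demo was produced.

```python
def nodes_from_links(links):
    """Collect every unique note title that appears in a link."""
    return sorted({title for pair in links for title in pair})


def build_graph_html(links, output_path="knowledge_base.html"):
    """Render (parent, child) links as an interactive HTML graph with Pyvis."""
    # Imported here so the rest of the example runs even without Pyvis installed
    from pyvis.network import Network

    net = Network(height="600px", width="100%")
    # Pyvis expects both nodes to exist before an edge between them is added
    for title in nodes_from_links(links):
        net.add_node(title, label=title)
    for parent, child in sorted(links):
        net.add_edge(parent, child)
    net.save_graph(output_path)
    return output_path


# Illustrative links only
links = {
    ("Computer Science", "Programming"),
    ("Programming", "Algorithms"),
    ("Artificial Intelligence", "Video Games"),
}
print(nodes_from_links(links))
```

Calling build_graph_html(links) would then write the HTML file, which can be opened in a browser.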

After a few cycles, this is what the graph looks like.

The graph view of the knowledge base built in this demo.

Each dot represents a note and is labeled with the note’s title. Each connecting line between two notes indicates that one of the notes mentions the other note’s topic in its contents. And as we traverse the graph, we can trace how one topic relates to another.
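
Tracing how one topic relates to another amounts to finding a path between two nodes. Here is a minimal sketch using breadth-first search, which returns a shortest chain of notes when one exists. The links are illustrative and the function name find_path is our own.

```python
from collections import deque


def find_path(links, start, goal):
    """Return a shortest chain of note titles from start to goal, or None."""
    # Build an undirected adjacency map from the (parent, child) pairs
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    if start not in neighbors or goal not in neighbors:
        return None

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        # Sorting keeps the result deterministic when several paths tie
        for nxt in sorted(neighbors[path[-1]]):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # the two notes are in separate parts of the graph


# Illustrative links only
links = {
    ("Computer Science", "Programming"),
    ("Programming", "Algorithms"),
    ("Artificial Intelligence", "Video Games"),
}
print(find_path(links, "Computer Science", "Algorithms"))
print(find_path(links, "Computer Science", "Video Games"))
```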

Review

In the demo, we used the Generate endpoint to help us build a personal knowledge base. We looked at a couple of ways to utilize this endpoint: first, to extract links from an existing text, and second, to generate paragraphs of text.

However, there are so many other possible ways to build with this endpoint. For example, you can summarize a long piece of text into a condensed format, rewrite a text to follow a specific tone, build a conversational agent, create a question answering interface, and much more.

And the nice thing is, you don’t have to have a lot of training data to start prototyping with the model — just start with a short prompt. This blog post shares a few more ideas of what you can build with this endpoint.

To try out the Cohere platform, sign up for an account today!