NLP in Finance: Speaking the Language of Capital Markets

NLP is becoming crucial in capital markets for making sense of textual data. Cohere enables firms to customize models for their specific use cases.

Good investment ideas can be found just about anywhere. This presents an opportunity, but also a challenge, since the sheer volume of public data available can be overwhelming. Every minute of every day, new material is published about any given company or industry, any of which may contain relevant information for investors. Some examples include:

  • Annual and quarterly reports, S-1s, and other statements and disclosures
  • Earnings call transcripts, press releases, and investor day presentations
  • Analyst reports, research, and academic papers
  • Corporate sustainability reports and other ESG data sources
  • News feeds, including Bloomberg, Thomson Reuters, and the Wall Street Journal
  • Blogs
  • Social media

Asset managers and analysts struggle to ingest and parse all this data efficiently using manual processes. The result is not only information overload, but also the thing they fear most: missing out.

NLP Makes Sense of Finance Data

Enter natural language processing (NLP), a language AI capability that’s fast gaining ground in the finance industry. The following is a small sampling of the use cases firms apply NLP to today:

  • Classification: Perform sentiment analysis, ESG scoring, and a range of other sorting tasks
  • Topic Modeling: Scan documents to cluster similar text together and uncover patterns or key themes
  • Summarization: Synthesize text from very long documents, or from multiple sources, into a cohesive summary
  • Semantic Search: Go beyond keyword search and achieve Google-like search performance on an internal set of documents
  • Content Generation: Automate copywriting for client newsletters or marketing emails
  • Contextual Entity Extraction: Parse derivatives contracts to extract the rates, or commercial loan agreements to extract the covenants
  • Error Correction: Improve the quality of automated voice-to-text transcriptions or translations
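
To make the semantic search item above more concrete, here is a minimal sketch of the ranking step: turn the query and each document into a vector, then order documents by cosine similarity. The function `toy_embed` is a hypothetical stand-in (a plain bag-of-words count) so the example runs without any model; it only captures word overlap, whereas a real system would swap in embeddings from a language model to capture meaning. The sample documents are invented for illustration.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def toy_embed(text: str) -> Counter:
    # Hypothetical placeholder for a real embedding model:
    # counts lowercase word tokens. It does NOT capture meaning.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, documents: List[str], top_k: int = 2) -> List[Tuple[float, str]]:
    # Score every document against the query and return the best matches.
    q = toy_embed(query)
    scored = [(cosine(q, toy_embed(d)), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Invented sample snippets, for illustration only.
documents = [
    "Quarterly revenue rose on strong cloud demand",
    "The company disclosed new loan covenants in its annual report",
    "Management raised full-year revenue guidance on the earnings call",
]

results = search("revenue guidance", documents, top_k=1)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

The pipeline shape (embed, compare, rank) stays the same when the placeholder is replaced by model-generated embeddings; only the quality of the vectors changes.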

Cohere Delivers Performance and Simplicity

NLP has been around for decades, so why the sudden flood of interest and investment among hedge funds and asset managers? The simple answer is that Transformer-based large language models (LLMs) have enabled incredible breakthroughs in performance: it’s a difference in kind, not just a difference in degree. Smaller models produce limited results that are often wildly off-base, whereas LLMs have the depth and scope to truly support the complex use cases that capital markets participants demand.

The challenge is that these models are powered by neural networks with billions of parameters trained on terabytes of text. Simply holding these networks in memory requires multiple cutting-edge GPUs, and training these models requires supercomputer clusters well beyond the reach of all but the largest organizations.
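
A rough back-of-the-envelope calculation shows why a single GPU is not enough. The sketch below estimates the memory needed just to store a model's weights. The 50-billion-parameter model size and the 80 GB per-GPU capacity are assumptions chosen for illustration, not figures for any particular model or device, and the estimate ignores activations and other runtime overhead.

```python
import math


def weights_memory_gb(num_parameters: float, bytes_per_parameter: int = 2) -> float:
    # Memory to hold the weights alone, in gigabytes (10**9 bytes).
    return num_parameters * bytes_per_parameter / 1e9


# Assumed model size: the text above only says "billions of parameters".
params = 50e9

fp16_gb = weights_memory_gb(params, bytes_per_parameter=2)  # 16-bit weights
fp32_gb = weights_memory_gb(params, bytes_per_parameter=4)  # 32-bit weights

# Assumed per-device memory, for illustration only.
gpu_memory_gb = 80
gpus_needed = math.ceil(fp16_gb / gpu_memory_gb)

print(f"16-bit weights: {fp16_gb:.0f} GB")
print(f"32-bit weights: {fp32_gb:.0f} GB")
print(f"GPUs needed just for 16-bit weights: {gpus_needed}")
```

Under these assumptions the weights alone exceed one device's memory even at 16-bit precision, before counting anything needed to actually run or train the model.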

Take it from Evan Schnidman, an expert who analyzes the intersection of finance, economics, and AI, and a cofounder of MarketReader, a market intelligence platform. When asked about the future of NLP, Schnidman said, "The dissemination of NLP technology in recent years is a result of advances in AI modeling techniques coupled with massive increases in data availability and compute power. In asset management (and finance more broadly), this has meant moving away from models trained exclusively with domain-specific data and toward using generalized LLMs trained on an enormous corpus."

From Schnidman's point of view, these LLMs generalize across domains and can produce outputs of comparable quality to those of domain-specific models, while requiring far less training and being less prone to overfitting. Schnidman continues, "The primary barrier to using these models is that they can be expensive to run due to the need for both enormous amounts of training data and high-end compute power, so those teams looking to use modern NLP tools should consider working with experts in NLP and LLM providers."

That’s why at Cohere, we train and serve large language models via a simple API, and provide an interface for firms to take advantage of what modern NLP has to offer, including customizing models for their specific use cases. You no longer have to worry about updating models to give them an understanding of current events (e.g., COVID), or worry about provisioning GPU clusters for inference — we’ll handle this for you, all for a fraction of what it would otherwise cost to develop and manage them yourself.

Want to explore how Cohere could help your team? Get in touch and I’ll be happy to set up a short meeting.