2 Model Releases for 2x the Fun
We are excited to announce that the Cohere team has released a new suite of Representation models. We have released large Representation Models and will now be offering these models as our Baseline Representation Models. The small model has also been updated. In addition to releasing new models, we have expanded the maximum token length for our Representation models to 1024 tokens.
Cohere’s Large and Medium Representation models outperform state-of-the-art (SOTA) Representation models, and Cohere’s updated Small Representation model is in line with SOTA. For comparison, we used SentEval, a standard academic benchmark for representation models.
Embedding Max Tokens Have Increased
We have increased the maximum number of tokens per text from 512 to 1024. Any text longer than 1024 tokens is split into chunks, and the embeddings of the chunks are averaged and returned.
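To make the averaging behavior concrete, here is a minimal Python sketch of chunk-and-average embedding. It is an illustration under stated assumptions, not Cohere's implementation: the function names, the fixed-size non-overlapping chunking, the unweighted mean, and the toy_embed stand-in are all hypothetical. The chunk size is a parameter that defaults to the 1024-token maximum.

```python
from typing import Callable, List, Sequence

MAX_TOKENS = 1024  # maximum tokens per text stated in this announcement


def chunk_tokens(tokens: Sequence[int], max_tokens: int = MAX_TOKENS) -> List[List[int]]:
    """Split a token sequence into consecutive chunks of at most max_tokens."""
    return [list(tokens[i:i + max_tokens]) for i in range(0, len(tokens), max_tokens)]


def embed_long_text(
    tokens: Sequence[int],
    embed_chunk: Callable[[List[int]], List[float]],
    max_tokens: int = MAX_TOKENS,
) -> List[float]:
    """Embed each chunk separately and return the element-wise mean.

    Assumption: every chunk counts equally, regardless of its length.
    """
    chunks = chunk_tokens(tokens, max_tokens)
    if not chunks:
        raise ValueError("cannot embed an empty token sequence")
    vectors = [embed_chunk(chunk) for chunk in chunks]
    dim = len(vectors[0])
    return [sum(vec[d] for vec in vectors) / len(vectors) for d in range(dim)]


def toy_embed(chunk: List[int]) -> List[float]:
    """Hypothetical stand-in for a real embedding model (2-dimensional output)."""
    return [float(len(chunk)), float(sum(chunk))]
```

A text that fits within the limit yields a single chunk, so its embedding is returned unchanged. Longer texts contribute one vector per chunk to the mean, which means a short trailing chunk carries the same weight as a full one in this sketch.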
Upgrading to Larger Embeds
The updated small model is available as small-20220217. Cohere’s previous “Small” Representation Model will still be available via small-20211115, and the small alias has redirected to small-20220217 since February 28th. See our pricing page for updated pricing.
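The difference between requesting the unversioned name and pinning a dated version can be sketched as a simple lookup. This is only an illustration of the redirect described above: the MODEL_ALIASES table and resolve_model function are hypothetical and are not part of any Cohere SDK.

```python
# Hypothetical alias table mirroring the redirect described above.
MODEL_ALIASES = {
    "small": "small-20220217",
}


def resolve_model(name: str) -> str:
    """Return the dated model an alias points to; dated names pass through unchanged."""
    return MODEL_ALIASES.get(name, name)


current = resolve_model("small")            # follows the redirect
pinned = resolve_model("small-20211115")    # stays on the previous model
```

Pinning a dated name is the safer choice when embeddings from different requests must stay comparable, because an alias can start pointing at a different model after a release.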