How to generate Cohere embeddings on an entire dataset


Cohere’s embeddings work well for a variety of use cases, including multilingual datasets. In this tutorial, we’ll show how to get started with Cohere embeddings on an entire dataset, using Relevance AI as our data store and compute layer. We’ll store the data and the vectors in a Dataset and use a Workflow to generate the embeddings. You can complete the whole process within 10 minutes without touching a line of code.

Creating a Cohere account

To use Cohere’s embeddings API, we need an account and an API key. Head to the Cohere dashboard, set up your account and copy your API key from the settings.

Creating a Relevance AI account

Relevance AI will be our database, computation and API, so we’ll need an account to get started. Head to the Relevance AI dashboard and set up your account.

Creating your first dataset

Grab your dataset. This can be a CSV, PDF, video file or more. If it’s a CSV, make sure every column has a header, since the header defines the name of the field. Once you have it ready, drag and drop it into the dashboard and name your dataset.
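If you want to check a CSV before uploading it, here is an optional minimal sketch using only Python's standard library. It confirms that every column has a non-empty header and shows how each header becomes a field name on each row. The function name `read_rows` and the sample data are made up for illustration and are not part of Relevance AI.

```python
import csv
import io


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    names = reader.fieldnames
    # Reject files with no header row or with blank column names.
    if not names or any(not (name or "").strip() for name in names):
        raise ValueError("every column needs a non-empty header")
    return list(reader)


# Hypothetical sample data: two columns, two rows.
sample = (
    "title,description\n"
    "Red shoes,Lightweight running shoes\n"
    "Blue mug,Ceramic mug\n"
)
rows = read_rows(sample)
print(rows[0]["title"])  # the "title" header is now the field name
```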

Running your first workflow

Workflows are at-scale transformations of Datasets. Relevance AI has a large list of workflows that you can run. For this tutorial, we’ll be using the workflow that vectorises text using Cohere. Head to the Workflows tab, search for Cohere and select the “Vectorise with Cohere” workflow. In the form, select the fields to vectorise, choose which Cohere model to use and enter your Cohere API key. Click run and you’ll see the progress indicator.
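You don't need to write any of this yourself, but for the curious, a workflow like this can be pictured roughly as the loop below: read the chosen field from each row in batches, send each batch to an embedding function, and write the returned vector back onto the row. This is a sketch under assumptions, not Relevance AI's actual implementation. The `_vector_` field suffix, the batch size, and the `fake_embed` stand-in are all illustrative; in practice `embed_fn` would wrap a call to Cohere's embed endpoint using your API key.

```python
from typing import Callable, Dict, List


def vectorise_rows(
    rows: List[Dict],
    field: str,
    embed_fn: Callable[[List[str]], List[List[float]]],
    batch_size: int = 32,  # assumed value; check the provider's per-call limit
) -> List[Dict]:
    """Add a vector field to each row by embedding `field` in batches."""
    vector_field = f"{field}_vector_"  # assumed naming convention
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        texts = [str(row.get(field, "")) for row in batch]
        vectors = embed_fn(texts)
        if len(vectors) != len(batch):
            raise ValueError("embed_fn must return one vector per text")
        for row, vector in zip(batch, vectors):
            row[vector_field] = vector
    return rows


def fake_embed(texts: List[str]) -> List[List[float]]:
    """Stand-in for a real embedding call: [length, vowel count] per text."""
    return [
        [float(len(t)), float(sum(c in "aeiou" for c in t.lower()))]
        for t in texts
    ]


docs = [{"title": "Red shoes"}, {"title": "Blue mug"}]
vectorise_rows(docs, "title", fake_embed, batch_size=1)
print(sorted(docs[0].keys()))
```

Passing the embedding call in as a function keeps the batching logic separate from any one provider and makes it easy to try out without an API key.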

Next steps

Now you have a Dataset with your data and a field on each row holding its Cohere embedding. You can run other workflows that make use of the embeddings, or use the API to make vector search queries or get instant answers. Read more about how to get Q&A out of your newly created dataset here.
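To give a feel for what a vector search query does with those embeddings, here is a minimal sketch in plain Python: it scores every row by cosine similarity against a query vector and returns the closest matches. It illustrates the idea only and is not the Relevance AI API. The function names, the `text_vector_` field and the tiny two-dimensional vectors are invented for the example; real Cohere embeddings have many more dimensions, and the query would be embedded with the same model as the dataset.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as matching nothing
    return dot / (norm_a * norm_b)


def search(rows, query_vector, vector_field, top_k=3):
    """Return the top_k (score, row) pairs, highest similarity first."""
    scored = [
        (cosine_similarity(query_vector, row[vector_field]), row)
        for row in rows
        if vector_field in row
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical rows with toy 2-dimensional vectors.
items = [
    {"text": "running shoes", "text_vector_": [1.0, 0.0]},
    {"text": "coffee mug", "text_vector_": [0.0, 1.0]},
    {"text": "trail sneakers", "text_vector_": [1.0, 1.0]},
]
for score, row in search(items, [1.0, 0.1], "text_vector_", top_k=2):
    print(round(score, 3), row["text"])
```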

February 12, 2023
Daniel Vassilev
Vector Embeddings