Relevance AI supports three models for converting audio to text, two of which can also convert video to text. Each model is available as its own Tool step:

  • Deepgram (audio and video)
  • AssemblyAI (audio and video)
  • OpenAI (only audio)

By using these steps, you can convert your audio and video into readable text for other Tools to use.
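
If you pick a step from your own scripts or notes, the support list above can be captured as a small lookup. This is a minimal sketch: `SUPPORTED_MEDIA` and `providers_for` are hypothetical names used for illustration and are not part of any Relevance AI API.

```python
# Hypothetical lookup table built from the list above; not a Relevance AI API.
SUPPORTED_MEDIA = {
    "Deepgram": {"audio", "video"},
    "AssemblyAI": {"audio", "video"},
    "OpenAI": {"audio"},
}


def providers_for(media_type: str) -> list[str]:
    """Return the providers whose Tool step accepts the given media type."""
    return [name for name, kinds in SUPPORTED_MEDIA.items() if media_type in kinds]


print(providers_for("audio"))  # ['Deepgram', 'AssemblyAI', 'OpenAI']
print(providers_for("video"))  # ['Deepgram', 'AssemblyAI']
```

For a video file, the lookup leaves only the Deepgram and AssemblyAI steps, matching the list above.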

Add a ‘Convert audio/video to text’ Tool step to your Tool

To add the ‘Convert audio/video to text’ Tool step to your Tool:

  1. Create a new Tool, then search for one of the ‘Convert audio/video to text’ Tool steps
  2. Click ‘Expand’ to see the full Tool step
  3. Upload the file you want to convert
  4. Click ‘Run step’ to see the Tool step’s output

Deepgram steps

If you use the Deepgram Tool step, you can select an option to recognise speaker changes.

OpenAI steps

This Tool step requires an OpenAI API key.

If you use the OpenAI Tool step, you can select a Response Type. There are two options: ‘Transcript only’ and ‘Transcript + advanced metadata’.

Common errors