File to text
Upload files and extract their text
What is a File to text input
File to text is an input component that lets you upload a file. It then automatically extracts all the text in the file or, for audio, transcribes it. This text can later be used as the source of truth in tasks like question-answering or summarization.
When to use the File to text component
The File to text input component is suitable when you need to upload a file and work with its content. For instance, you can provide the content of a 10-page report to a Large Language Model (LLM) and have it summarize the information in 2 pages, or have an LLM list all the items discussed in an audio recording.
Supported formats:
- PDF (no OCR)
- Word (.doc, .docx)
- Excel (.xlsx, .xls)
- CSV
- Audio (.mp3, .mp4, .mp2, .aac, .wav, .flac, .pcm, .m4a, .ogg, .opus, .webm)
- JSON
- TXT
Note: Audio transcription is done using the Deepgram model.
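If you want to check a file name against the list above before uploading, a minimal sketch could look like the following. The helper name isSupportedFile is hypothetical and not part of the product, and the .pdf, .csv, .json and .txt extensions are assumed from the format names; the remaining extensions are copied from the list.

```javascript
// Extensions from the supported formats list.
// Assumption: .pdf, .csv, .json and .txt are inferred from the format names.
const SUPPORTED_EXTENSIONS = new Set([
  ".pdf", ".doc", ".docx", ".xlsx", ".xls", ".csv", ".json", ".txt",
  ".mp3", ".mp4", ".mp2", ".aac", ".wav", ".flac", ".pcm", ".m4a",
  ".ogg", ".opus", ".webm",
]);

// Hypothetical helper: returns true when the file name ends in a listed extension.
function isSupportedFile(fileName) {
  const dot = fileName.lastIndexOf(".");
  // No extension, a leading dot only, or a trailing dot: treat as unsupported.
  if (dot <= 0 || dot === fileName.length - 1) return false;
  return SUPPORTED_EXTENSIONS.has(fileName.slice(dot).toLowerCase());
}
```

The check is case-insensitive and only looks at the file name, so it says nothing about whether the file content is valid or whether a PDF contains selectable text.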
How to add a File to text input to my custom Tool
The File to text input component is listed under User inputs. You can add it to your Tool using the main section or the sidebar.
File to text settings
- Title: the title of your input component.
- Description: describes what this input is or what it will be used for.
- Variable name: located on the bottom left and marked in green. You can rename the variable and use the name to access the data in your Tool.
- Optional/Required: located on the top right. Unlike an optional component, a required component must be provided before your analysis starts.
Access the extracted text
To access the extracted text output, use the variable mode {{}} and the input step name. Under the default setting, it is accessible via {{file_text}} (or params.file_text in a JavaScript step).
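As an illustration of working with params.file_text in a JavaScript step, here is a minimal sketch. It assumes the default variable name file_text and that the extracted text arrives as a string. The function name describeExtractedText and the exampleParams object are hypothetical: the latter stands in for the params object so the snippet can run on its own, and how a step returns its result is not covered here.

```javascript
// Hypothetical helper: basic checks on the extracted text before passing it on.
function describeExtractedText(params) {
  // Assumption: file_text is a string; an optional input may be missing or empty.
  const text = typeof params.file_text === "string" ? params.file_text : "";
  const trimmed = text.trim();
  const words = trimmed === "" ? [] : trimmed.split(/\s+/);
  return {
    isEmpty: trimmed === "",        // true when no text was extracted
    wordCount: words.length,        // rough size of the extracted text
    preview: trimmed.slice(0, 200), // first 200 characters, for a quick look
  };
}

// Stand-in for the params object available inside a JavaScript step.
const exampleParams = { file_text: "Quarterly report: revenue grew in Q3." };
const result = describeExtractedText(exampleParams);
```

A check like isEmpty can be useful when the input is optional, or when a PDF contains only scanned images, since OCR is not applied.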