Extract data from PDF
One of the frequently used templates at Relevance is Extract data from PDF. This tool enables you to extract specific information from a PDF file.
How to use the Tool
Locate the Tool on the template page and click Use template.
You can use the Tool as is or clone it.
Tool inputs and output
The Tool requires two inputs:
- A PDF file (PDF)
- A list of topics to extract (Data to extract)
Provide the input data and click Run once. Within a few seconds you will see the LLM response, similar to what is shown in the image below. Click the Export button at the bottom right to save the output.
The output is a CSV with columns representing the extracted data.
Tool execution
Tools and templates can be
- tested on individually provided inputs
- set to fetch the data from a dataset and apply the analysis to the whole dataset
Tool components
If you clone a template, or make a Tool from scratch, you will have access to the Build tab. Build is where you put together different components to build a Tool that suits your needs.
User inputs
- File to URL: An easy-to-use, one-step component that takes care of everything needed to upload a file for further analysis.
- Text list input: A text input component suitable for entering a list of items, such as a list of topics or examples.
Tool steps
There are 5 components under the Tool steps in this analysis flow. These components take care of three tasks: converting PDF to text, the LLM step, and formatting for CSV export.
Converting PDF to text
PDF to text is a Tool step supported by Relevance AI, which receives a file URL and extracts the text from the file. This component supports OCR.
Large Language Model (LLM)
A large language model component is set up to give you access to GPT (and many other LLMs). In the prompt section, provide the required information along with instructions on what the LLM is expected to do. When writing the prompt:
- Be short and precise in your instructions to the LLM
- Include formatting instructions when necessary
- Specify the scope using ", """, or similar delimiters
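The prompt guidelines above can be illustrated with a minimal sketch. It assumes the extracted PDF text and the topic list are available as plain JavaScript values; the function name `buildExtractionPrompt`, its parameters, and the sample inputs are hypothetical and are not Relevance AI identifiers. The prompt keeps the instruction short, states the expected output format, and wraps the document in triple quotes to mark its scope.

```javascript
// Hypothetical helper: builds an extraction prompt from PDF text and a topic list.
// The names used here are illustrative only and do not come from Relevance AI.
function buildExtractionPrompt(pdfText, topics) {
  // One bullet per topic keeps the request short and explicit.
  const topicList = topics.map((topic) => `- ${topic}`).join("\n");

  return [
    "Extract the following items from the document delimited by triple quotes.",
    topicList,
    // Formatting instruction: ask for a table so the result maps onto CSV columns.
    "Return the result as a Markdown table with one column per item.",
    "If an item is not present in the document, write N/A.",
    // Triple quotes mark where the document starts and ends.
    '"""',
    pdfText.trim(),
    '"""',
  ].join("\n");
}

// Made-up sample input, for illustration only.
const samplePrompt = buildExtractionPrompt(
  "Invoice no. 42, total due 100 USD",
  ["Invoice number", "Total amount"]
);
console.log(samplePrompt);
```

Asking for a Markdown table here is a design choice made for this sketch, so that the LLM output lines up with a later Markdown-to-CSV conversion; the prompt shipped with the template may be worded differently.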
Formatting for CSV export
- Markdown to CSV: A JavaScript code component is available to run JavaScript code when necessary. In this Tool, the code snippet converts the Markdown output to CSV.
- A temporary downloadable file: In many analyses, the generated output needs to be turned into a downloadable file. At Relevance this can be done with a ready-to-use component. The component takes data that can be converted to CSV format, exports it to a downloadable CSV, and returns a URL for accessing the file.
- Export to CSV: A JavaScript code component is available to run JavaScript code when necessary. In this Tool, the second code snippet simply returns the downloadable URL of the output file.
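As an illustration of the Markdown-to-CSV step, here is a minimal sketch of how a Markdown table could be turned into CSV text in plain JavaScript. It is not the snippet shipped with the template: the function names are hypothetical, the input is assumed to be a pipe-delimited Markdown table, and escaped pipes (`\|`) inside cells are not handled.

```javascript
// Split one Markdown table row into trimmed cell strings.
function splitRow(line) {
  return line
    .trim()
    .replace(/^\|/, "") // drop the leading pipe
    .replace(/\|$/, "") // drop the trailing pipe
    .split("|")
    .map((cell) => cell.trim());
}

// True for the separator row under the header, e.g. | --- | ---: |
function isSeparatorRow(cells) {
  return cells.length > 0 && cells.every((cell) => /^:?-+:?$/.test(cell));
}

// Quote a field when it contains a comma, a double quote, or a line break.
function escapeCsvField(value) {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Hypothetical converter: keeps table rows, drops the separator row, joins cells with commas.
function markdownTableToCsv(markdown) {
  return markdown
    .split(/\r?\n/)
    .filter((line) => line.trim().startsWith("|"))
    .map(splitRow)
    .filter((cells) => !isSeparatorRow(cells))
    .map((cells) => cells.map(escapeCsvField).join(","))
    .join("\n");
}

// Made-up sample table, for illustration only.
const sampleTable = [
  "| Company | Total |",
  "| --- | ---: |",
  "| Acme, Inc. | 100 |",
].join("\n");

console.log(markdownTableToCsv(sampleTable));
// Company,Total
// "Acme, Inc.",100
```

Lines that do not start with a pipe are ignored, so any commentary the LLM adds around the table does not end up in the CSV. A data row consisting only of dashes would be mistaken for the separator row; that trade-off keeps the sketch short.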