Jina Reader is a powerful web content extraction and transformation service that converts web pages, PDFs, and HTML content into LLM-friendly formats. With Relevance AI, you can harness this capability to create dynamic, data-driven applications powered by AI Agents.



Jina Reader excels at converting web content into LLM-friendly formats, while Relevance AI empowers you to leverage that content with intelligent AI Agents that can analyze and act on the data.
Document Intelligence Mastery
Empowers agents with advanced document understanding and data extraction capabilities across multiple formats
Precision Data Harvesting
Achieves 98%+ accuracy in extracting and structuring information from unstructured documents
Scale Amplification
Processes documents 5-10x faster than traditional methods while maintaining high accuracy
Relevance AI seamlessly integrates with Jina Reader to enhance your workflows by automating content extraction and transformation.
What you’ll need
You don't need to be a developer to set up this integration. To get started, you'll need:
- A Relevance AI account
- A Jina Reader account with the credentials and permissions the integration requires
- Authorization (you'll connect securely using OAuth, so no sensitive info is stored manually)
Security & Reliability
Jina Reader converts web pages, PDFs, and HTML content into formats suitable for LLMs. Through this integration, developers can access and process web content via a RESTful API, with capabilities such as content formatting, selective extraction, and streaming.
Key benefits include automated web content extraction and cleaning, support for multiple output formats (markdown, HTML, text, screenshots), customizable content selection and filtering, and the ability to handle authenticated and proxy-enabled requests. Additionally, it offers a streaming mode for processing large content efficiently.
To get started, ensure you have a Jina Reader account with the necessary OAuth credentials and permissions. Your environment should support HTTPS and REST API calls, with adequate storage for content processing and network access to r.jina.ai.
Authentication setup is straightforward, requiring your Jina Reader account ID and the appropriate OAuth permission type. Basic API configuration involves setting the base URL and headers for your requests. You can customize your environment settings in a configuration file, specifying content format, timeout, and response preferences.
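As a rough illustration of the configuration step described above, the sketch below turns a small settings object into request headers. The `ReaderSettings` class and its field names are hypothetical, and the header names (`X-Return-Format`, `X-Timeout`, `Authorization: Bearer ...`) are assumptions that should be checked against the current Jina Reader API documentation before use.

```python
from dataclasses import dataclass

# Output formats listed in the key benefits above.
ALLOWED_FORMATS = {"markdown", "html", "text", "screenshot"}


@dataclass
class ReaderSettings:
    """Hypothetical settings object; the field names are illustrative only."""
    base_url: str = "https://r.jina.ai/"
    content_format: str = "markdown"
    timeout_seconds: int = 30
    access_token: str = ""


def build_headers(settings: ReaderSettings) -> dict:
    """Translate settings into request headers (header names are assumptions)."""
    if settings.content_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported content format: {settings.content_format!r}")
    if settings.timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    headers = {
        "X-Return-Format": settings.content_format,
        "X-Timeout": str(settings.timeout_seconds),
    }
    # Only send credentials when a token has actually been configured.
    if settings.access_token:
        headers["Authorization"] = f"Bearer {settings.access_token}"
    return headers
```

Validating the format and timeout before any request is sent surfaces configuration mistakes early, rather than as an unexpected output type later.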
For a quick start, you can convert a URL to an LLM-friendly format with a simple API call. Advanced content extraction allows for selective content retrieval using target and excluded selectors, ensuring you get the precise information you need.
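The quick start and selective extraction described above might look like the following minimal sketch, which uses only the Python standard library. It assumes the page URL is appended to the `r.jina.ai` base URL and that the target and excluded selectors are passed as `X-Target-Selector` and `X-Remove-Selector` headers; the function names are hypothetical, and the header names should be confirmed in the API documentation.

```python
import urllib.request
from typing import Optional

READER_BASE_URL = "https://r.jina.ai/"


def build_reader_request(
    page_url: str,
    target_selector: Optional[str] = None,
    excluded_selector: Optional[str] = None,
) -> urllib.request.Request:
    """Prepare a GET request for a page, optionally narrowed by CSS selectors."""
    headers = {}
    if target_selector:
        # Assumed header: keep only content matching this selector.
        headers["X-Target-Selector"] = target_selector
    if excluded_selector:
        # Assumed header: drop content matching this selector.
        headers["X-Remove-Selector"] = excluded_selector
    return urllib.request.Request(READER_BASE_URL + page_url, headers=headers)


def fetch_content(request: urllib.request.Request, timeout_seconds: float = 30.0) -> str:
    """Send the prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.read().decode("utf-8")
```

A call such as `fetch_content(build_reader_request("https://example.com/", target_selector="main", excluded_selector="nav, footer"))` would then request only the main content of the page. Keeping request construction separate from sending makes the selectors easy to inspect before any network traffic occurs.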
In case of issues, common troubleshooting steps include verifying authentication details, adjusting timeout values for large pages, and ensuring the correct response format. For restricted content, using a proxy server can help bypass access limitations.
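One way to apply the timeout advice above is to retry a slow page with progressively longer timeouts. The sketch below is an assumption about how that could be structured, not part of the Jina Reader API: `fetch_with_growing_timeout` and its `fetch` parameter are hypothetical names, and `fetch` stands for any function that takes a URL and a timeout and raises `TimeoutError` when the page takes too long.

```python
from typing import Callable, Optional, Sequence


def fetch_with_growing_timeout(
    fetch: Callable[[str, float], str],
    page_url: str,
    timeouts: Sequence[float] = (10.0, 30.0, 60.0),
) -> str:
    """Try each timeout in order and return the first successful result."""
    last_error: Optional[TimeoutError] = None
    for timeout_seconds in timeouts:
        try:
            return fetch(page_url, timeout_seconds)
        except TimeoutError as error:
            # Remember the failure and retry with the next, longer timeout.
            last_error = error
    raise TimeoutError(
        f"{page_url} timed out after {len(timeouts)} attempts"
    ) from last_error
```

Only timeouts trigger a retry here; authentication or format errors propagate immediately, since a longer wait would not fix them.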
Best practices for effective content extraction involve using specific target selectors, implementing robust error handling, and caching responses when feasible. Securely storing credentials and managing token refresh logic are crucial for maintaining authentication integrity.
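The caching and error-handling practices above could be combined along the following lines. This is a minimal in-memory sketch under stated assumptions: `CachedReader` is a hypothetical wrapper, `fetch` is any function that returns the extracted content for a URL, and network failures are assumed to surface as `OSError` (which covers `urllib.error.URLError` and `TimeoutError` in the standard library).

```python
import time
from typing import Callable, Dict, Tuple


class CachedReader:
    """Cache extracted content per URL and fall back to stale data on failure."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps URL -> (time stored, content).
        self._cache: Dict[str, Tuple[float, str]] = {}

    def get(self, page_url: str) -> str:
        now = self._clock()
        cached = self._cache.get(page_url)
        # Serve from cache while the entry is younger than the TTL.
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        try:
            content = self._fetch(page_url)
        except OSError:
            # Network-level failure: prefer stale content over no content.
            if cached is not None:
                return cached[1]
            raise
        self._cache[page_url] = (now, content)
        return content
```

Serving stale content on failure is a design choice that suits pages that change slowly; for time-sensitive content it may be better to re-raise the error instead.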
For further assistance or specific use cases, refer to the comprehensive API documentation or reach out to Jina Reader support.
No training on your data
Your data remains private and is never utilized for model training purposes.
Security first
We never store anything we don’t need to. The inputs or outputs of your tools are never stored.

To get the most out of the Jina Reader + Relevance AI integration without writing code:
- Start with clear content selection: Use specific target selectors to ensure you extract only the necessary content.
- Utilize streaming mode: For large web pages, enable streaming to handle content efficiently without timeouts.
- Secure your credentials: Store OAuth credentials securely and implement token refresh logic to maintain access.
- Test with sample URLs: Validate your configurations with test URLs before applying them to production data.
- Monitor response formats: Ensure your contentFormat is set correctly to avoid unexpected output types.