LLMs can be enhanced with your own context and data
[Screenshot: Add Data modal for Knowledge]
To reference your knowledge in a prompt, use `{{ knowledge }}`. This will inject information from the dataset into the prompt based on the settings you have configured. If you have multiple knowledge datasets selected, you can reference them individually with `{{ knowledge.dataset_name }}`.
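For example, a prompt template might combine both forms (the dataset name `support_docs` below is just an illustrative placeholder for one of your own dataset names):

```
Answer the question using the context below.

Context from all selected datasets:
{{ knowledge }}

Context from one specific dataset:
{{ knowledge.support_docs }}
```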
We recommend `text-embedding-3-large` (3072 dimensions) from OpenAI and `embed-english-v3.0` (1024 dimensions; default) from Cohere, as we've found that they perform the best with `gpt-4o` and other models. The open-source `all-mpnet-base-v2` model (768 dimensions) is also supported and is used by default for both embedding and retrieval. You can specify both at runtime with the advanced knowledge tool step.

For other default knowledge interactions, such as uploading knowledge or using the 'knowledge search' tool, the `all-mpnet-base-v2` model is used by default.
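As a rough illustration of how retrieval over embedded knowledge works, the sketch below ranks documents by cosine similarity to a query vector. The 4-dimensional vectors are toy stand-ins for what a model such as `all-mpnet-base-v2` (768 dimensions) would actually produce, and the function names are hypothetical, not part of any platform API:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k(query_vec, doc_vecs, k=1):
    """Return the indices of the k documents most similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)[:k]

# Toy vectors; in practice each comes from embedding a chunk of your dataset.
docs = [np.array([1.0, 0.0, 0.0, 0.0]),
        np.array([0.0, 1.0, 0.0, 0.0]),
        np.array([0.7, 0.7, 0.0, 0.0])]
query = np.array([0.9, 0.1, 0.0, 0.0])
print(top_k(query, docs, k=2))  # → [0, 2]: doc 0 is closest, then doc 2
```

Retrieval and embedding can use different models only if their vector spaces match, which is why the dimension counts above matter: a 3072-dimension query vector cannot be compared against a 768-dimension index.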
By default, vectorization happens when knowledge is uploaded. The advanced knowledge step, however, embeds on the fly and then caches the results.
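The embed-on-the-fly-then-cache behavior can be sketched as a content-addressed cache: text that was already embedded is looked up, and anything new is embedded once and stored. The `embed` function here is a hypothetical stand-in for a real embedding-model call, not the platform's actual implementation:

```python
import hashlib

_cache: dict[str, list[float]] = {}

def embed(text: str) -> list[float]:
    # Hypothetical stand-in for a real embedding-model call.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]

def embed_cached(text: str) -> list[float]:
    """Embed on the fly, caching by content hash so repeated text is free."""
    key = hashlib.sha256(text.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = embed(text)  # computed only on the first request
    return _cache[key]

v1 = embed_cached("hello knowledge")
v2 = embed_cached("hello knowledge")  # served from the cache, no recompute
assert v1 == v2 and len(_cache) == 1
```

Embedding at upload time front-loads this cost, while on-the-fly embedding pays it on first use; caching makes the two converge after the first request for a given chunk.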