Document AI QnA

The Document QnA capability combines OCR with a large language model so that you can query document content in natural language, extracting information and insights simply by asking questions.

The workflow consists of two main steps:

  1. Document Processing: OCR extracts text, structure, and formatting, creating a machine-readable version of the document.

  2. Language Model Understanding: The extracted document content is analyzed by a large language model. You can ask questions or request information in natural language. The model understands context and relationships within the document and can provide relevant answers based on the document content.

Key capabilities:

  • Question answering about specific document content
  • Information extraction and summarization
  • Document analysis and insights
  • Multi-document queries and comparisons
  • Context-aware responses that consider the full document
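As a rough illustration of a multi-document query, the sketch below builds a single user message that carries one question and several documents. It assumes the API accepts more than one `document_url` chunk in the same message, which this page does not confirm; `build_multi_document_message` is a hypothetical helper, not part of the SDK, and the second URL is only a placeholder document.

```python
def build_multi_document_message(question, document_urls):
    """Return one user message holding a question and several documents.

    Hypothetical helper: assumes several document_url chunks may share a message.
    """
    if not document_urls:
        raise ValueError("at least one document URL is required")
    # The text chunk comes first, followed by one document_url chunk per file.
    content = [{"type": "text", "text": question}]
    for url in document_urls:
        content.append({"type": "document_url", "document_url": url})
    return {"role": "user", "content": content}


message = build_multi_document_message(
    "Summarize the main differences between these two papers.",
    ["https://arxiv.org/pdf/1805.04770", "https://arxiv.org/pdf/1409.2329"],
)
print(len(message["content"]))  # prints 3: one text chunk and two document chunks
```

The returned dictionary has the same shape as the entries of the `messages` list passed to the chat completion call.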

Common use cases:

  • Analyzing research papers and technical documents
  • Extracting information from business documents
  • Processing legal documents and contracts
  • Building document Q&A applications
  • Automating document-based workflows

The examples below show how to interact with a PDF document using natural language:

```python
import os
from mistralai import Mistral

# Retrieve the API key from environment variables
api_key = os.environ["MISTRAL_API_KEY"]

# Specify model
model = "mistral-small-latest"

# Initialize the Mistral client
client = Mistral(api_key=api_key)

# If local document, upload and retrieve the signed url
# uploaded_pdf = client.files.upload(
#     file={
#         "file_name": "uploaded_file.pdf",
#         "content": open("uploaded_file.pdf", "rb"),
#     },
#     purpose="ocr"
# )
# signed_url = client.files.get_signed_url(file_id=uploaded_pdf.id)

# Define the messages for the chat
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "what is the last sentence in the document"
            },
            {
                "type": "document_url",
                "document_url": "https://arxiv.org/pdf/1805.04770"
                # "document_url": signed_url.url
            }
        ]
    }
]

# Get the chat response
chat_response = client.chat.complete(
    model=model,
    messages=messages
)

# Print the content of the response
print(chat_response.choices[0].message.content)

# Output:
# The last sentence in the document is:
#
# "Zaremba, W., Sutskever, I., and Vinyals, O. Recurrent neural network regularization. arXiv:1409.2329, 2014."
```
```typescript
import { Mistral } from "@mistralai/mistralai";
// import fs from 'fs';

// Retrieve the API key from environment variables
const apiKey = process.env["MISTRAL_API_KEY"];

// Initialize the Mistral client
const client = new Mistral({
  apiKey: apiKey,
});

// If local document, upload and retrieve the signed url
// const uploaded_file = fs.readFileSync('uploaded_file.pdf');
// const uploaded_pdf = await client.files.upload({
//   file: {
//     fileName: "uploaded_file.pdf",
//     content: uploaded_file,
//   },
//   purpose: "ocr"
// });
// const signedUrl = await client.files.getSignedUrl({
//   fileId: uploaded_pdf.id,
// });

// Ask a question about the document
const chatResponse = await client.chat.complete({
  model: "mistral-small-latest",
  messages: [
    {
      role: "user",
      content: [
        {
          type: "text",
          text: "what is the last sentence in the document",
        },
        {
          type: "document_url",
          documentUrl: "https://arxiv.org/pdf/1805.04770",
          // documentUrl: signedUrl.url
        },
      ],
    },
  ],
});

// Print the content of the response (plain text, not JSON)
console.log("Chat:", chatResponse.choices[0].message.content);
```

Upload the Document File

```bash
curl https://api.mistral.ai/v1/files \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -F purpose="ocr" \
  -F file="@uploaded_file.pdf"
```

Get the Signed URL

```bash
curl -X GET "https://api.mistral.ai/v1/files/$id/url?expiry=24" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer $MISTRAL_API_KEY"
```
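For readers scripting this step, here is a minimal sketch of how the signed-URL endpoint in the request above is put together. The helper `signed_url_endpoint` and the placeholder ID `my-file-id` are hypothetical, and treating `expiry` as a number of hours is an assumption; the request on this page only shows the value `24`.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.mistral.ai/v1"


def signed_url_endpoint(file_id, expiry=24):
    """Build the GET endpoint that returns a signed URL for an uploaded file.

    `expiry` is assumed to be expressed in hours.
    """
    if expiry <= 0:
        raise ValueError("expiry must be positive")
    query = urlencode({"expiry": expiry})
    # Percent-encode the ID so it is safe to place in the URL path.
    return f"{API_BASE}/files/{quote(file_id, safe='')}/url?{query}"


# "my-file-id" stands in for the `id` field returned by the upload request.
endpoint = signed_url_endpoint("my-file-id")
print(endpoint)
# https://api.mistral.ai/v1/files/my-file-id/url?expiry=24
```

The resulting URL is requested with the same `Authorization` header as the upload call.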

Chat Completion

```bash
curl https://api.mistral.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${MISTRAL_API_KEY}" \
  -d '{
    "model": "mistral-small-latest",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "what is the last sentence in the document"
          },
          {
            "type": "document_url",
            "document_url": "<url>"
          }
        ]
      }
    ],
    "document_image_limit": 8,
    "document_page_limit": 64
  }'
```
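The request body above can also be assembled programmatically. The sketch below builds the same JSON with only the standard library; `build_chat_payload` is a hypothetical helper, and its defaults simply mirror the values shown in the request. This page does not define `document_image_limit` and `document_page_limit`; they presumably cap how many images and pages of the document are processed.

```python
import json


def build_chat_payload(question, document_url, model="mistral-small-latest",
                       image_limit=8, page_limit=64):
    """Assemble the JSON body for a Document QnA chat completion request."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "document_url", "document_url": document_url},
                ],
            }
        ],
        # Limits copied from the request above; their exact semantics are assumed.
        "document_image_limit": image_limit,
        "document_page_limit": page_limit,
    }


payload = build_chat_payload(
    "what is the last sentence in the document",
    "https://arxiv.org/pdf/1805.04770",
)
print(json.dumps(payload, indent=2))
```

The serialized payload can then be sent as the body of a POST to `https://api.mistral.ai/v1/chat/completions` with any HTTP client.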

Cookbooks

For a simple worked example of Document QnA, see the Document QnA Cookbook.

FAQ

Q: Are there any limits regarding the Document QnA API?
A: Yes. Uploaded document files must not exceed 50 MB in size or 1,000 pages in length.
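A pre-flight check against these limits might look like the sketch below. `check_document_limits` is a hypothetical helper, and interpreting 50 MB as 50 × 1024 × 1024 bytes is an assumption; the page count has to come from a PDF library of your choice, which is not shown here.

```python
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MAX_PAGES = 1000


def check_document_limits(size_bytes, page_count):
    """Return a list of limit violations; the list is empty if the file is acceptable."""
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"{size_bytes} bytes exceeds the {MAX_FILE_BYTES}-byte limit")
    if page_count > MAX_PAGES:
        problems.append(f"{page_count} pages exceeds the {MAX_PAGES}-page limit")
    return problems


# A 10 MB file with 1,200 pages fails only the page limit.
print(check_document_limits(10 * 1024 * 1024, 1200))
# ['1200 pages exceeds the 1000-page limit']
```

For a local file, `os.path.getsize(path)` supplies `size_bytes` before the upload is attempted.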