MistralClient.API.OCR (mistralex_ai v0.1.0)
OCR (Optical Character Recognition) API operations.
This module processes documents and images with Mistral's OCR capabilities, extracting text and structured data from various document formats.
Features
- Document URL processing
- Image URL processing
- Page-specific processing
- Image extraction with base64 encoding
- Structured annotation formats
- Bounding box annotations
Usage
# Process a document URL
document = MistralClient.Models.DocumentURLChunk.new("https://example.com/document.pdf")
request = MistralClient.Models.OCRRequest.new("pixtral-12b-2024-12-19", document)
{:ok, response} = MistralClient.API.OCR.process(config, request)
# Process an image URL
image_url = MistralClient.Models.ImageURLChunkImageURL.new("data:image/png;base64,...")
image_chunk = MistralClient.Models.ImageURLChunk.new(image_url)
request = MistralClient.Models.OCRRequest.new("pixtral-12b-2024-12-19", image_chunk)
{:ok, response} = MistralClient.API.OCR.process(config, request)
# Process specific pages with options
document = MistralClient.Models.DocumentURLChunk.new("https://example.com/document.pdf")
request = MistralClient.Models.OCRRequest.new("pixtral-12b-2024-12-19", document,
  pages: [0, 1, 2],
  include_image_base64: true,
  image_limit: 10
)
{:ok, response} = MistralClient.API.OCR.process(config, request)
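The image example above passes a `data:image/png;base64,...` URL without showing how to build one. The following is a minimal sketch of one way to do that with the Elixir standard library. The module `DataURLSketch` and its functions are hypothetical helpers, not part of mistralex_ai, and the default MIME type is an assumption the caller should override for other image formats.

```elixir
defmodule DataURLSketch do
  @moduledoc false

  # Hypothetical helper, not part of mistralex_ai: builds a base64 data URL
  # from raw image bytes and a MIME type chosen by the caller.
  @spec from_binary(binary(), String.t()) :: String.t()
  def from_binary(bytes, mime_type \\ "image/png") when is_binary(bytes) do
    "data:" <> mime_type <> ";base64," <> Base.encode64(bytes)
  end

  # Reads a file from disk and encodes it. A failed read is passed through
  # unchanged as {:error, reason}.
  @spec from_file(Path.t(), String.t()) :: {:ok, String.t()} | {:error, File.posix()}
  def from_file(path, mime_type \\ "image/png") do
    with {:ok, bytes} <- File.read(path) do
      {:ok, from_binary(bytes, mime_type)}
    end
  end
end
```

The resulting string can be handed to `MistralClient.Models.ImageURLChunkImageURL.new/1` in place of the elided data URL shown in the usage example.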
Summary
Functions
- process/2 - Process a document or image using OCR.
- process/4 - Process a document or image using OCR with direct parameters.
Functions
@spec process(MistralClient.Config.t(), MistralClient.Models.OCRRequest.t()) :: {:ok, MistralClient.Models.OCRResponse.t()} | {:error, term()}
Process a document or image using OCR.
Parameters
- config - Client configuration
- request - OCR request with model, document, and options
Options
- :id - Request identifier
- :pages - List of specific page numbers to process (0-indexed)
- :include_image_base64 - Include base64-encoded images in response
- :image_limit - Maximum number of images to extract
- :image_min_size - Minimum size (height and width) for image extraction
- :bbox_annotation_format - Structured output format for bounding boxes
- :document_annotation_format - Structured output format for the document
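Because `:pages` is 0-indexed while most PDF viewers number pages from 1, an off-by-one mistake is easy to make. The sketch below shows one way to convert 1-based page numbers before building a request. `PageIndexSketch` is a hypothetical helper, not part of mistralex_ai; removing duplicates and sorting are design choices of this sketch, not requirements stated by the library.

```elixir
defmodule PageIndexSketch do
  @moduledoc false

  # Hypothetical helper, not part of mistralex_ai: converts 1-based page
  # numbers to the 0-based indices that the :pages option expects.
  # Raises FunctionClauseError for anything that is not an integer >= 1.
  @spec to_zero_based([pos_integer()] | Range.t()) :: [non_neg_integer()]
  def to_zero_based(pages) do
    pages
    |> Enum.map(fn page when is_integer(page) and page >= 1 -> page - 1 end)
    |> Enum.uniq()
    |> Enum.sort()
  end
end

PageIndexSketch.to_zero_based(1..3)
# => [0, 1, 2]

PageIndexSketch.to_zero_based([5, 2, 2])
# => [1, 4]
```

The returned list can then be passed as the `:pages` option.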
Returns
- {:ok, OCRResponse.t()} - Successful OCR processing
- {:error, term()} - Error occurred during processing
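The examples in this module pattern-match on `{:ok, response}` directly, which raises a `MatchError` when the call returns `{:error, term()}`. As a rough sketch of handling both return shapes, the wrapper below retries a failing call a fixed number of times. `OCRResultSketch.with_retry/2` is hypothetical and not part of mistralex_ai. It retries immediately, without any delay, and it retries on every error; which error terms are actually worth retrying depends on what the library returns, which is not documented here.

```elixir
defmodule OCRResultSketch do
  @moduledoc false

  # Hypothetical wrapper, not part of mistralex_ai. `fun` is any zero-arity
  # function returning {:ok, value} | {:error, reason}, for example
  # fn -> MistralClient.API.OCR.process(config, request) end.
  @spec with_retry((-> {:ok, term()} | {:error, term()}), pos_integer()) ::
          {:ok, term()} | {:error, term()}
  def with_retry(fun, attempts \\ 3) when is_function(fun, 0) and attempts >= 1 do
    case fun.() do
      # Success: return the tuple untouched.
      {:ok, _} = ok -> ok
      # Failure with attempts left: call again with one fewer attempt.
      {:error, _} when attempts > 1 -> with_retry(fun, attempts - 1)
      # Failure on the last attempt: return the final error.
      {:error, _} = error -> error
    end
  end
end
```

A caller would typically wrap the result in a `case` expression and branch on `{:ok, response}` and `{:error, reason}` rather than matching on `{:ok, response}` alone.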
Examples
# Basic document processing
document = MistralClient.Models.DocumentURLChunk.new("https://example.com/doc.pdf")
request = MistralClient.Models.OCRRequest.new("pixtral-12b-2024-12-19", document)
{:ok, response} = MistralClient.API.OCR.process(config, request)
# Image processing with options
image_url = MistralClient.Models.ImageURLChunkImageURL.new("data:image/png;base64,...")
image_chunk = MistralClient.Models.ImageURLChunk.new(image_url)
request = MistralClient.Models.OCRRequest.new("pixtral-12b-2024-12-19", image_chunk,
  include_image_base64: true,
  image_limit: 5
)
{:ok, response} = MistralClient.API.OCR.process(config, request)
@spec process(
  MistralClient.Config.t(),
  String.t(),
  MistralClient.Models.OCRRequest.document_type(),
  keyword()
) :: {:ok, MistralClient.Models.OCRResponse.t()} | {:error, term()}
Process a document or image using OCR with direct parameters.
This is a convenience function that creates an OCRRequest internally.
Parameters
- config - Client configuration
- model - Model to use for OCR processing
- document - Document or image to process
- opts - Additional options (see process/2 for details)
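Since `opts` is a plain keyword list, a misspelled key such as `page:` instead of `pages:` may go unnoticed. The sketch below shows one way to check option keys before calling the function, using `Keyword.validate!/2` (available from Elixir 1.13). `OCROptsSketch` is a hypothetical helper, not part of mistralex_ai; the key list is copied from the options documented for process/2, and whether the library performs its own validation is not stated here.

```elixir
defmodule OCROptsSketch do
  @moduledoc false

  # Option keys copied from the Options list documented for process/2.
  @known_keys [
    :id,
    :pages,
    :include_image_base64,
    :image_limit,
    :image_min_size,
    :bbox_annotation_format,
    :document_annotation_format
  ]

  # Hypothetical helper, not part of mistralex_ai: raises ArgumentError on
  # an unknown key, otherwise returns the validated options. The order of
  # the returned keyword list is not guaranteed to match the input.
  @spec validate!(keyword()) :: keyword()
  def validate!(opts) when is_list(opts) do
    Keyword.validate!(opts, @known_keys)
  end
end
```

For example, `OCROptsSketch.validate!(pages: [0, 1], include_image_base64: true)` returns the options, while `OCROptsSketch.validate!(page: [0])` raises `ArgumentError`.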
Examples
# Process document URL
document = MistralClient.Models.DocumentURLChunk.new("https://example.com/doc.pdf")
{:ok, response} = MistralClient.API.OCR.process(config, "pixtral-12b-2024-12-19", document)
# Process with options
{:ok, response} = MistralClient.API.OCR.process(
  config,
  "pixtral-12b-2024-12-19",
  document,
  pages: [0, 1],
  include_image_base64: true
)