ReqLLM.OCR (ReqLLM v1.14.0)

Optical Character Recognition for ReqLLM.

Extracts rich markdown from documents (PDF, images) using OCR models. Currently supports Mistral OCR on Google Vertex AI.

Examples

model = ReqLLM.model!(%{provider: :google_vertex, id: "mistral-ocr-2505"})

# Process a PDF binary
{:ok, result} = ReqLLM.ocr(model, pdf_binary,
  provider_options: [region: "europe-west4"]
)
result.markdown  #=> "# Title\n\nExtracted text with ![images](data:...)..."
result.pages     #=> [%{index: 0, markdown: "...", images: [...]}]

# Process a file
{:ok, result} = ReqLLM.ocr_file(model, "doc.pdf",
  provider_options: [region: "europe-west4"]
)

Response

On success, the non-raising functions return {:ok, %{markdown: String.t(), pages: [map()]}}; on failure they return {:error, term()}. In the success map:

  • markdown — concatenated page markdowns with --- separators
  • pages — list of %{index: integer, markdown: String.t(), images: [map()]}
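The relationship between the two fields can be sketched with plain Elixir. The example below is a minimal illustration, not ReqLLM's implementation: `OcrResultSketch`, its functions, and the exact separator string `"\n\n---\n\n"` are assumptions, and the sample map is hand-written data in the documented shape rather than real OCR output.

```elixir
defmodule OcrResultSketch do
  # Hypothetical helper, not part of ReqLLM: rebuilds a combined document
  # from the per-page entries, ordered by :index.
  # The default separator is an assumption about how pages are joined.
  def join_pages(pages, separator \\ "\n\n---\n\n") do
    pages
    |> Enum.sort_by(& &1.index)
    |> Enum.map_join(separator, & &1.markdown)
  end

  # Counts the extracted images across all pages.
  def image_count(pages) do
    pages |> Enum.map(&length(&1.images)) |> Enum.sum()
  end
end

# Hand-written sample in the documented shape (not real OCR output).
result = %{
  markdown: "# Title\n\n---\n\nSecond page",
  pages: [
    %{index: 1, markdown: "Second page", images: [%{}]},
    %{index: 0, markdown: "# Title", images: []}
  ]
}

IO.puts(OcrResultSketch.join_pages(result.pages))
IO.inspect(OcrResultSketch.image_count(result.pages), label: "images")
```

Working from `pages` rather than `markdown` is useful when page boundaries matter, for example when indexing each page separately.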

Summary

Functions

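ocr(model_spec, document_binary, opts \\ [])

Process a document binary through an OCR model.

ocr!(model_spec, document_binary, opts \\ [])

Process a document binary through an OCR model. Raises on error.

ocr_file(model_spec, path, opts \\ [])

Process a file at the given path through an OCR model.

ocr_file!(model_spec, path, opts \\ [])

Process a file through an OCR model. Raises on error.

validate_model(model_spec)

Validates that a model supports OCR operations.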

Types

ocr_result()

@type ocr_result() :: %{markdown: String.t(), pages: [map()]}

Functions

ocr(model_spec, document_binary, opts \\ [])

@spec ocr(String.t() | struct(), binary(), keyword()) ::
  {:ok, ocr_result()} | {:error, term()}

Process a document binary through an OCR model.

Parameters

  • model_spec — Model specification (e.g., %{provider: :google_vertex, id: "mistral-ocr-2505"})
  • document_binary — Raw document bytes (PDF, PNG, JPEG, etc.)
  • opts — Options:
    • :include_images — extract images as base64 in markdown (default true)
    • :document_type — MIME type hint (default "application/pdf")
    • :provider_options — provider-specific options (e.g., region, access_token)

Examples

pdf_bytes = File.read!("document.pdf")
model = ReqLLM.model!(%{provider: :google_vertex, id: "mistral-ocr-2505"})
{:ok, result} = ReqLLM.ocr(model, pdf_bytes)
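For non-PDF input and for error handling, a sketch along the following lines may help. It uses only the option names listed under Parameters; `OcrCallSketch` and its functions are hypothetical, and whether `"image/png"` is accepted as a `:document_type` value is an assumption based on the "MIME type hint" description. `run/2` needs the `:req_llm` dependency and valid credentials, so it is defined here but not invoked.

```elixir
defmodule OcrCallSketch do
  # Builds options for an image input. Only option names documented for
  # ocr/3 are used; the MIME type value itself is an assumption.
  def image_opts(mime_type) do
    [document_type: mime_type, include_images: false]
  end

  # Turns either documented return shape of ocr/3 into a short log line.
  def describe({:ok, %{pages: pages}}), do: "ok: #{length(pages)} page(s)"
  def describe({:error, reason}), do: "error: #{inspect(reason)}"

  # Requires :req_llm and provider credentials; not called in this sketch.
  def run(model, png_bytes) do
    model
    |> ReqLLM.ocr(png_bytes, image_opts("image/png"))
    |> describe()
  end
end

# The pure helpers can be exercised without any network access.
IO.puts(OcrCallSketch.describe({:error, :timeout}))
```

Matching on both tuples keeps the caller in control of failures, which is the main reason to prefer `ocr/3` over the raising variant in long-running processes.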

ocr!(model_spec, document_binary, opts \\ [])

@spec ocr!(String.t() | struct(), binary(), keyword()) :: ocr_result()

Process a document binary through an OCR model. Raises on error.
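The difference from `ocr/3` follows the usual Elixir bang convention, which can be sketched as below. This is an illustration of the convention only: `BangSketch` is hypothetical, and the exception that `ocr!/3` actually raises is not documented here, so `RuntimeError` is a stand-in.

```elixir
defmodule BangSketch do
  # Hypothetical illustration of the bang convention: unwrap {:ok, _},
  # raise on {:error, _}. The real ocr!/3 may raise a more specific exception.
  def unwrap!({:ok, result}), do: result

  def unwrap!({:error, reason}) do
    raise RuntimeError, "OCR failed: #{inspect(reason)}"
  end

  # Runs a zero-arity function and converts any raised exception back into
  # an error tuple, for callers that must not crash.
  def safe(fun) when is_function(fun, 0) do
    {:ok, fun.()}
  rescue
    e -> {:error, Exception.message(e)}
  end
end

# Simulated failure; no OCR request is made.
IO.inspect(BangSketch.safe(fn -> BangSketch.unwrap!({:error, :timeout}) end))
```

The raising variant suits scripts and supervised tasks where a crash is an acceptable way to report failure.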

ocr_file(model_spec, path, opts \\ [])

@spec ocr_file(String.t() | struct(), String.t(), keyword()) ::
  {:ok, ocr_result()} | {:error, term()}

Process a file at the given path through an OCR model.

Reads the file, detects document type from extension, and delegates to ocr/3.

Examples

model = ReqLLM.model!(%{provider: :google_vertex, id: "mistral-ocr-2505"})
{:ok, result} = ReqLLM.ocr_file(model, "report.pdf")
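The read, detect, and delegate steps described above could look roughly like this. It is a sketch under stated assumptions rather than the library's code: `DocTypeSketch`, the extension table, and the fallback to `"application/pdf"` for unknown extensions are all hypothetical, and the extensions ReqLLM actually recognizes may differ.

```elixir
defmodule DocTypeSketch do
  # Hypothetical extension table; ReqLLM's real mapping may differ.
  @types %{
    ".pdf" => "application/pdf",
    ".png" => "image/png",
    ".jpg" => "image/jpeg",
    ".jpeg" => "image/jpeg"
  }

  # Maps a path to a MIME type by its (case-insensitive) extension.
  # Falling back to "application/pdf" mirrors the documented
  # :document_type default and is an assumption for unknown extensions.
  def from_path(path) do
    ext = path |> Path.extname() |> String.downcase()
    Map.get(@types, ext, "application/pdf")
  end

  # Reads the file and returns {:ok, bytes, opts} ready to pass to ocr/3.
  # An explicit :document_type in opts takes precedence over detection.
  def prepare(path, opts \\ []) do
    with {:ok, bytes} <- File.read(path) do
      {:ok, bytes, Keyword.put_new(opts, :document_type, from_path(path))}
    end
  end
end

IO.inspect(DocTypeSketch.from_path("scans/page.PNG"))
```

Using `Keyword.put_new/3` lets a caller override detection when a file has a misleading or missing extension.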

ocr_file!(model_spec, path, opts \\ [])

@spec ocr_file!(String.t() | struct(), String.t(), keyword()) :: ocr_result()

Process a file through an OCR model. Raises on error.

validate_model(model_spec)

@spec validate_model(ReqLLM.model_input()) ::
  {:ok, LLMDB.Model.t()} | {:error, term()}

Validates that a model supports OCR operations.
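One plausible use is to validate before sending any document bytes, chaining the two calls with `with`. The sketch below is hypothetical: `ValidateThenOcrSketch` is not part of ReqLLM, and passing the validated model struct on to `ocr/3` is an assumption based on the `struct()` in its spec. The two library calls are injected as functions so the flow can be exercised with stubs; the defaults point at `ReqLLM.OCR.validate_model/1` and `ReqLLM.ocr/3` and require the `:req_llm` dependency.

```elixir
defmodule ValidateThenOcrSketch do
  # Default dependencies call into ReqLLM; they are only resolved when
  # invoked, so stubs can replace them in tests.
  def default_deps do
    %{validate: &ReqLLM.OCR.validate_model/1, ocr: &ReqLLM.ocr/3}
  end

  # Validates the model first, then runs OCR and returns only the markdown.
  # Any {:error, reason} from either step is returned unchanged.
  def run(model_spec, bytes, opts \\ [], deps \\ default_deps()) do
    with {:ok, model} <- deps.validate.(model_spec),
         {:ok, result} <- deps.ocr.(model, bytes, opts) do
      {:ok, result.markdown}
    end
  end
end

# Stubbed dependencies stand in for the real calls (no network access).
stub = %{
  validate: fn spec -> {:ok, spec} end,
  ocr: fn _model, _bytes, _opts -> {:ok, %{markdown: "# Hi", pages: []}} end
}

IO.inspect(ValidateThenOcrSketch.run("stub-model", "bytes", [], stub))
```

Failing fast on an unsupported model avoids reading or uploading a large document only to have the request rejected.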