Modules
Normalises supported OCR inputs into a Vix.Vips.Image.t() and an associated
raw pixel buffer suitable for handing to the Tesseract NIF.
Translation between user-facing language identifiers and Tesseract's trained-data filename codes.
A NimblePool-backed pool of Image.OCR instances for concurrent OCR.
Helpers for resolving and managing Tesseract trained-data (tessdata) files.
Mix Tasks
Downloads <language>.traineddata from the upstream tessdata_* GitHub
repository into the configured trained-data directory.
Lists every <language>.traineddata file in the resolved trained-data
directory along with provenance from the manifest.
Deletes one or more <language>.traineddata files and their manifest
entries.
Re-fetches every trained-data file recorded in the manifest, picking up the latest commit on each language's recorded branch.