Helpers for resolving and managing Tesseract trained-data (tessdata) files.
Trained-data files (<lang>.traineddata) live in a directory that Tesseract
reads at initialisation time. Image.OCR resolves that directory in the
following order:
1. The :datapath option passed to Image.OCR.new/1.
2. The :tessdata_path application environment value:
       config :image_ocr, tessdata_path: "/var/lib/image_ocr/tessdata"
3. The TESSDATA_PREFIX operating-system environment variable.
4. The vendored fallback at priv/tessdata/ inside the :image_ocr package.
See Mix.Tasks.Image.Ocr.Tessdata.Add and friends for managing the contents
of a configured directory.
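The resolution order described above can be sketched as a "first configured value wins" lookup. The module below is hypothetical and is not the code Image.OCR uses internally: the name TessdataResolutionSketch, the vendored argument, and the choice to skip empty strings are all assumptions made for illustration.

```elixir
defmodule TessdataResolutionSketch do
  @moduledoc false
  # Hypothetical helper: mirrors the documented lookup order only.
  # It is not Image.OCR's actual implementation.

  # `vendored` stands in for the package's priv/tessdata directory.
  def resolve(options \\ [], vendored \\ "priv/tessdata") do
    candidates = [
      # 1. Explicit :datapath option
      Keyword.get(options, :datapath),
      # 2. Application environment value
      Application.get_env(:image_ocr, :tessdata_path),
      # 3. Operating-system environment variable
      System.get_env("TESSDATA_PREFIX"),
      # 4. Vendored fallback
      vendored
    ]

    candidates
    # Assumption: nil and "" both count as "not configured".
    |> Enum.find(fn path -> is_binary(path) and path != "" end)
    |> Path.expand()
  end
end
```

Calling TessdataResolutionSketch.resolve(datapath: "/tmp/custom") picks the explicit option regardless of what the application environment or TESSDATA_PREFIX contain, which is the behaviour the list above describes for :datapath.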
Summary
Functions
Returns the absolute path to the directory that trained-data files are read from and written to.
Returns true when language has a trained-data file in the resolved
trained-data directory.
Returns the list of language codes installed in the resolved trained-data directory.
Returns the absolute path to the trained-data file for language inside the
resolved trained-data directory.
Returns the absolute path to the directory of trained-data shipped with the
image_ocr package.
Functions
Returns the absolute path to the directory that trained-data files are read from and written to.
Arguments
- options is an optional keyword list. See the options below.
Options
- :datapath is an explicit path that overrides every other lookup. When nil (the default) the standard resolution order is used.
Returns
- A string containing the absolute path to the trained-data directory.
Examples
iex> path = Image.OCR.Tessdata.datapath()
iex> File.dir?(path)
true
Returns true when language has a trained-data file in the resolved
trained-data directory.
Returns the list of language codes installed in the resolved trained-data directory.
Arguments
- options is an optional keyword list. See datapath/1 for the supported options.
Returns
- A list of language code strings (for example ["eng", "fra"]) sorted alphabetically. Returns [] when the directory does not exist.
Examples
iex> "eng" in Image.OCR.Tessdata.installed_languages()
true
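As a rough illustration of the behaviour documented above (sorted language codes, and an empty list when the directory is missing), here is a minimal sketch built only on the standard File and Path modules. InstalledLanguagesSketch and list/1 are hypothetical names, and the library's own implementation may differ.

```elixir
defmodule InstalledLanguagesSketch do
  @moduledoc false
  # Hypothetical helper: one plausible way to derive language codes
  # from the files in a trained-data directory.

  def list(dir) do
    case File.ls(dir) do
      {:ok, entries} ->
        entries
        # Keep only <lang>.traineddata files.
        |> Enum.filter(fn name -> Path.extname(name) == ".traineddata" end)
        # Strip the extension to recover the language code.
        |> Enum.map(fn name -> Path.basename(name, ".traineddata") end)
        |> Enum.sort()

      {:error, _reason} ->
        # A missing or unreadable directory yields an empty list.
        []
    end
  end
end
```

Files that do not end in .traineddata are ignored in this sketch, so stray files in the directory do not show up as languages.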
Returns the absolute path to the trained-data file for language inside the
resolved trained-data directory.
Arguments
- language is a language code string such as "eng" or "fra".
- options is an optional keyword list. See datapath/1 for the supported options.
Returns
- A string containing the absolute path. The file is not guaranteed to exist.
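Because the returned path is not guaranteed to exist, callers typically pair path construction with an existence check. The sketch below shows that pairing with plain Path and File calls. TraineddataPathSketch and its functions are hypothetical, and they take the directory as an explicit argument rather than resolving it.

```elixir
defmodule TraineddataPathSketch do
  @moduledoc false
  # Hypothetical helper: builds <dir>/<language>.traineddata and checks for it.

  # Returns the absolute path; nothing is read from disk here.
  def path(dir, language) when is_binary(language) do
    dir
    |> Path.expand()
    |> Path.join(language <> ".traineddata")
  end

  # True only when the path points at an existing regular file.
  def installed?(dir, language) do
    dir |> path(language) |> File.regular?()
  end
end
```

Keeping the two steps separate means the path can still be used as a download or copy target when the file is absent.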
@spec vendored_path() :: String.t()
Returns the absolute path to the directory of trained-data shipped with the
image_ocr package.
Returns
- A string containing the absolute path to the vendored trained-data directory.
Examples
iex> Image.OCR.Tessdata.vendored_path() |> String.ends_with?("priv/tessdata")
true