ExtrText (extr_text v0.2.1)

ExtrText is an Elixir library for extracting text and metadata from .docx, .xlsx, and .pptx files.

Summary

Functions

get_metadata(data)

Extracts properties (metadata) from the specified OOXML data.

get_texts(data)

Extracts plain text from the body of the specified OOXML data.

Functions

get_metadata(data)

Specs

get_metadata(binary()) :: {:ok, ExtrText.Metadata.t()} | {:error, String.t()}

Extracts properties (metadata) from the specified OOXML data.
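
A minimal usage sketch based on the spec above: the function takes the raw bytes of an OOXML file, so the file is read first and the binary is passed in. The module name MetadataExample, the helper read_metadata/1, and the path "sample.docx" are hypothetical and used only for illustration; the sketch also assumes the extr_text package is available as a dependency.

```elixir
defmodule MetadataExample do
  # Hypothetical helper, not part of ExtrText.
  # Reads the file at `path` and passes its bytes to ExtrText.get_metadata/1.
  # Returns whatever get_metadata/1 returns, or the File.read/1 error tuple.
  def read_metadata(path) do
    with {:ok, data} <- File.read(path) do
      ExtrText.get_metadata(data)
    end
  end
end

# "sample.docx" is a placeholder path.
case MetadataExample.read_metadata("sample.docx") do
  {:ok, metadata} -> IO.inspect(metadata, label: "metadata")
  {:error, reason} -> IO.inspect(reason, label: "could not extract metadata")
end
```

Note that the error reason is an atom such as :enoent when reading the file fails and a string when get_metadata/1 itself fails, so IO.inspect/2 is used to print either form.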

get_texts(data)

Extracts plain text from the body of the specified OOXML data.

The return value is a doubly nested list of strings, that is, a list of lists of strings.

Each element of the outer list represents one sheet of .xlsx data or one slide of .pptx data. For .docx data, the outer list has only one element.

Each element of an inner list is a string representing a paragraph or, for a spreadsheet, a line.
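
To illustrate the shape described above, the following sketch walks a hand-written nested list and prints each inner list under a numbered heading. The sample values are made up and are not real library output. No spec for get_texts/1 is shown here, so whether the list arrives wrapped in an {:ok, ...} tuple is an assumption to verify before pattern matching on the result.

```elixir
# Hypothetical sample shaped like the documented result:
# the outer list holds two sheets, each inner list holds that sheet's lines.
texts = [
  ["Name, Qty", "Apples, 3"],
  ["Total, 3"]
]

# Pair each inner list with its 1-based position, then join its strings
# with newlines under a simple heading.
sections =
  texts
  |> Enum.with_index(1)
  |> Enum.map(fn {lines, index} ->
    "--- #{index} ---\n" <> Enum.join(lines, "\n")
  end)

IO.puts(Enum.join(sections, "\n\n"))
```

For .docx data the same code applies unchanged, since the outer list then simply has a single element.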