ExtrText (extr_text v0.3.1)
ExtrText is an Elixir library for extracting text and metadata from .docx, .xlsx, and .pptx files.
Summary
Functions
get_metadata(data)
Extracts properties (metadata) from the specified OOXML data.
get_texts(data)
Extracts plain text from the body of the specified OOXML data.
Functions
get_metadata(data)
Specs
get_metadata(binary()) :: {:ok, ExtrText.Metadata.t()} | {:error, String.t()}
Extracts properties (metadata) from the specified OOXML data.
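A minimal usage sketch, assuming the `extr_text` package is available as a dependency. The module `MetadataExample`, the function `read_metadata/1`, and the path `report.docx` are hypothetical names introduced for illustration; only `ExtrText.get_metadata/1` and its return shape come from the spec above.

```elixir
defmodule MetadataExample do
  @doc """
  Reads the file at `path` and passes its bytes to `ExtrText.get_metadata/1`.
  Returns `{:ok, metadata}` or `{:error, message}`, where `message` is a string.
  """
  def read_metadata(path) do
    case File.read(path) do
      {:ok, data} ->
        # Per the spec: {:ok, ExtrText.Metadata.t()} | {:error, String.t()}
        ExtrText.get_metadata(data)

      {:error, posix} ->
        # File.read/1 reports POSIX atoms such as :enoent; convert them to a
        # string so that both failure paths share the same shape.
        {:error, "cannot read #{path}: #{:file.format_error(posix)}"}
    end
  end
end

# Hypothetical input path; replace with a real .docx, .xlsx, or .pptx file.
case MetadataExample.read_metadata("report.docx") do
  {:ok, metadata} -> IO.inspect(metadata, label: "metadata")
  {:error, message} -> IO.puts("failed: " <> message)
end
```

Wrapping the file-read error in a string keeps the caller's `case` to two clauses, since the library already reports its own failures as strings.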
get_texts(data)
Extracts plain text from the body of the specified OOXML data.
The return value is a doubly nested list of strings.
Each element of the outer list represents a sheet of .xlsx data or a slide of .pptx data.
For .docx data, the outer list has only one element.
Each element of an inner list represents a paragraph or a line of a spreadsheet.
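The nested shape described above can be flattened into a single string with a small helper. This is only a sketch: `TextsExample` and `to_plain_text/1` are hypothetical names, and the sample data is hand-written in the documented shape rather than real library output. This page does not say whether `get_texts/1` wraps its result in an `{:ok, ...}` tuple, so the helper takes the nested list itself.

```elixir
defmodule TextsExample do
  @doc """
  Joins the strings of each inner list with a newline and separates the
  outer elements (sheets, slides, or the single document body) with a
  blank line.
  """
  def to_plain_text(texts) when is_list(texts) do
    Enum.map_join(texts, "\n\n", fn lines -> Enum.join(lines, "\n") end)
  end
end

# Hand-written sample standing in for the text of a two-slide .pptx file.
sample = [
  ["Slide 1 title", "First bullet"],
  ["Slide 2 title"]
]

IO.puts(TextsExample.to_plain_text(sample))
```

For .docx data the outer list has a single element, so the same helper returns the paragraphs joined by newlines with no blank-line separators.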