xlsx_reader v0.3.0 XlsxReader

Opens an XLSX workbook and reads its worksheets.

Example

{:ok, package} = XlsxReader.open("test.xlsx")

XlsxReader.sheet_names(package)
# ["Sheet 1", "Sheet 2", "Sheet 3"]

{:ok, rows} = XlsxReader.sheet(package, "Sheet 1")
# [
#   ["Date", "Temperature"],
#   [~D[2019-11-01], 8.4],
#   [~D[2019-11-02], 7.5],
#   ...
# ]

Sheet contents

Sheets are loaded on-demand by sheet/3 and sheets/2.

The sheet contents are returned as a list of rows, where each row is a list of cell values:

[
  ["A1", "B1", "C1" | _],
  ["A2", "B2", "C2" | _],
  ["A3", "B3", "C3" | _],
  | _
]

The behavior of the sheet parser can be customized for each individual sheet; see sheet/3.
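
Because the rows are plain nested lists, they can be reshaped with the standard Enum functions. The following is a minimal sketch, assuming the first row holds column headers; RowsExample and to_maps/1 are hypothetical names used for illustration and are not part of XlsxReader.

```elixir
defmodule RowsExample do
  # Hypothetical helper, not part of XlsxReader: converts rows whose first
  # row contains column headers into a list of maps keyed by header.
  def to_maps([]), do: []

  def to_maps([headers | data_rows]) do
    Enum.map(data_rows, fn row ->
      # Enum.zip/2 stops at the shorter list, so a row with fewer cells
      # than there are headers yields a map with fewer keys.
      headers |> Enum.zip(row) |> Map.new()
    end)
  end
end

rows = [
  ["Date", "Temperature"],
  [~D[2019-11-01], 8.4],
  [~D[2019-11-02], 7.5]
]

records = RowsExample.to_maps(rows)
# records is a list of two maps with the keys "Date" and "Temperature"
```

Keeping the conversion outside the reader leaves the choice of row representation (lists, maps, structs) to the caller.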

Summary

Types

  • error() - Error tuple with message describing the cause of the error
  • row() - List of cell values
  • rows() - List of rows
  • sheet_name() - Sheet name
  • source() - Source for the XLSX file: file system (:path) or in-memory (:binary)
  • source_option() - Option to specify the XLSX file source

Functions

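  • async_sheets(package, sheet_options \\ [], task_options \\ []) - Loads all the sheets in the workbook concurrently.
  • open(file, options \\ []) - Opens an XLSX file located on the file system (default) or from memory.
  • sheet(package, sheet_name, options \\ []) - Loads the sheet with the given name (see sheet_names/1).
  • sheet_names(package) - Lists the names of the sheets in the package's workbook.
  • sheets(package, options \\ []) - Loads all the sheets in the workbook.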

Types

error()

error() :: {:error, String.t()}

Error tuple with message describing the cause of the error
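
Since the error carries a plain string, callers can pattern match on the tuple and use the message directly. The sketch below assumes a result shaped like the return value of open/2; ErrorExample and the message text are hypothetical placeholders, not actual library output.

```elixir
defmodule ErrorExample do
  # Hypothetical helper, not part of XlsxReader: turns an {:ok, _} or
  # {:error, message} result into a log-friendly string.
  def describe({:ok, _value}), do: "loaded"

  def describe({:error, message}) when is_binary(message) do
    "could not load workbook: " <> message
  end
end

# The message below is a made-up placeholder for illustration only.
summary = ErrorExample.describe({:error, "placeholder message"})
```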

row()

List of cell values

rows()

List of rows

sheet_name()

sheet_name() :: String.t()

Sheet name

source()

source() :: :path | :binary

Source for the XLSX file: file system (:path) or in-memory (:binary)

source_option()

source_option() :: {:source, source()}

Option to specify the XLSX file source

Functions

async_sheets(package, sheet_options \\ [], task_options \\ [])

Loads all the sheets in the workbook concurrently.

On success, returns {:ok, [{sheet_name, rows}, ...]}.

When processing files with multiple sheets, async_sheets/3 is roughly 3x faster than sheets/2, with one caveat: it uses Task.async_stream/3 under the hood, so each concurrent task runs with a timeout. If you expect a sheet to be large, consider raising the timeout from its default of 10_000 ms (see "Concurrency options" below).

If the order in which the sheets are returned is not relevant for your application, you can pass ordered: false (see "Concurrency options" below) for a modest speed gain.

Filtering options

See sheets/2.

Sheet options

See sheet/3.

Concurrency options

  • max_concurrency - maximum number of tasks to run at the same time (default: System.schedulers_online/0)
  • ordered - maintain order consistent with sheet_names/1 (default: true)
  • timeout - maximum duration in milliseconds to process a sheet (default: 10_000)

open(file, options \\ [])

open(String.t() | binary(), [source_option()]) ::
  {:ok, XlsxReader.Package.t()} | error()

Opens an XLSX file located on the file system (default) or from memory.

Examples

Opening an XLSX file on the file system

{:ok, package} = XlsxReader.open("test.xlsx")

Opening an XLSX file from memory

blob = File.read!("test.xlsx")

{:ok, package} = XlsxReader.open(blob, source: :binary)

Options

  • source: :path (on the file system, default) or :binary (in memory)

sheet(package, sheet_name, options \\ [])

sheet(XlsxReader.Package.t(), sheet_name(), Keyword.t()) :: {:ok, rows()}

Loads the sheet with the given name (see sheet_names/1)

Options

  • type_conversion - boolean; whether cell values are converted to Elixir types (default: true)
  • blank_value - placeholder value for empty cells (default: "")
  • empty_rows - include empty rows (default: true)
  • number_type - type used for numeric conversion: Integer, Decimal or Float (default: Float)

The Decimal type requires the decimal library.
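
As a sketch of how these options are passed, the module below calls sheet/3 with blank_value and empty_rows set. SheetOptionsExample and its loader argument are hypothetical: the loader defaults to XlsxReader.sheet/3 and exists only so the snippet can be exercised without a workbook on disk.

```elixir
defmodule SheetOptionsExample do
  # Options taken from the list above: use nil for empty cells and
  # leave out empty rows.
  @options [blank_value: nil, empty_rows: false]

  # Hypothetical wrapper, not part of XlsxReader. The loader argument is a
  # test seam; by default it is the library's sheet/3.
  def load_compact(package, sheet_name, loader \\ &XlsxReader.sheet/3) do
    loader.(package, sheet_name, @options)
  end
end

# Stub loader that echoes the options it receives, standing in for sheet/3.
echo = fn _package, _sheet_name, options -> {:ok, options} end
{:ok, passed_options} = SheetOptionsExample.load_compact(:fake_package, "Sheet 1", echo)
```

In an application with xlsx_reader installed, the direct form would be XlsxReader.sheet(package, "Sheet 1", blank_value: nil, empty_rows: false).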

sheet_filter_option(options, key)

sheet_names(package)

sheet_names(XlsxReader.Package.t()) :: [sheet_name()]

Lists the names of the sheets in the package's workbook

sheets(package, options \\ [])

sheets(XlsxReader.Package.t(), Keyword.t()) ::
  {:ok, [{sheet_name(), rows()}]} | error()

Loads all the sheets in the workbook.

On success, returns {:ok, [{sheet_name, rows}, ...]}.

Filtering options

  • only - include the sheets whose name matches the filter
  • except - exclude the sheets whose name matches the filter

Sheets can be filtered by name using:

  • a string (e.g. "Exact Match")
  • a regex (e.g. ~r/Sheet +/)
  • a list of strings and/or regexes (e.g. ["Parameters", ~r/Sheet [12]/])
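
The three filter forms can be pictured with the sketch below. It is an illustration of the matching rules as described above, assuming strings match exactly and regexes match anywhere in the name; FilterExample is a hypothetical module and not XlsxReader's implementation. With the library, the same filter would be passed as XlsxReader.sheets(package, only: ["Parameters", ~r/Sheet [12]/]).

```elixir
defmodule FilterExample do
  # A string filter matches only the identical sheet name.
  def matches?(name, filter) when is_binary(filter), do: name == filter

  # A regex filter matches when the pattern occurs in the sheet name.
  def matches?(name, %Regex{} = filter), do: Regex.match?(filter, name)

  # A list matches when any of its filters matches.
  def matches?(name, filters) when is_list(filters) do
    Enum.any?(filters, &matches?(name, &1))
  end
end

names = ["Parameters", "Sheet 1", "Sheet 2", "Sheet 3"]
filter = ["Parameters", ~r/Sheet [12]/]

# Behaves like `only`: keep the names that match.
kept = Enum.filter(names, &FilterExample.matches?(&1, filter))

# Behaves like `except`: drop the names that match.
dropped = Enum.reject(names, &FilterExample.matches?(&1, filter))
```

Note that an unanchored regex such as ~r/Sheet [12]/ would also match a name like "Sheet 12"; anchor the pattern with ^ and $ when an exact form is required.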

Sheet options

See sheet/3.