ExArrow.Schema.Mapper
(ex_arrow v0.6.0)
Bidirectional mapping between Arrow type representations and external type systems.
ExArrow interacts with several Elixir libraries that have their own type systems: Explorer and Nx today, with ExZarr and Dataset planned. This module is the single authority for converting between Arrow dtype strings (used by the NIF layer) and each external representation, so bridge modules do not duplicate mapping logic.
Arrow dtype strings
The NIF layer identifies column types with short string codes:
| Code | Arrow type |
|---|---|
| "s8" | Int8 |
| "s16" | Int16 |
| "s32" | Int32 |
| "s64" | Int64 |
| "u8" | UInt8 |
| "u16" | UInt16 |
| "u32" | UInt32 |
| "u64" | UInt64 |
| "f32" | Float32 |
| "f64" | Float64 |
| "bool" | Boolean |
| "utf8" | Utf8 |
These strings are the canonical internal representation. Every public conversion function either accepts one of them as input or returns one as output.
Extensibility
New external targets (e.g. ExZarr, Dataset) can be added by introducing
new target_dtype_to_arrow/1 and arrow_dtype_to_target/1 clause groups.
The existing targets are grouped by module section below.
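A clause group for a new target might look like the following sketch. It is illustrative only: the module name `MapperSketch` and the target dtype strings ("int64", "float64") are hypothetical and do not describe the actual dtype representation of ExZarr or Dataset. The shape follows the convention used on this page, with one function clause per supported dtype and a final catch-all clause that returns an error tuple.

```elixir
defmodule MapperSketch do
  # Hypothetical illustration only; not part of ExArrow.
  # Assumes an imaginary target whose dtypes are strings such as "int64".

  @type arrow_dtype :: String.t()

  # Clause group: hypothetical target dtype -> Arrow dtype string.
  @spec target_dtype_to_arrow(String.t()) :: {:ok, arrow_dtype()} | {:error, String.t()}
  def target_dtype_to_arrow("int64"), do: {:ok, "s64"}
  def target_dtype_to_arrow("float64"), do: {:ok, "f64"}
  def target_dtype_to_arrow("bool"), do: {:ok, "bool"}
  def target_dtype_to_arrow(other), do: {:error, "unsupported target dtype: #{inspect(other)}"}

  # Clause group: Arrow dtype string -> hypothetical target dtype.
  @spec arrow_dtype_to_target(arrow_dtype()) :: {:ok, String.t()} | {:error, String.t()}
  def arrow_dtype_to_target("s64"), do: {:ok, "int64"}
  def arrow_dtype_to_target("f64"), do: {:ok, "float64"}
  def arrow_dtype_to_target("bool"), do: {:ok, "bool"}
  def arrow_dtype_to_target(other), do: {:error, "unsupported Arrow dtype: #{inspect(other)}"}
end
```

Keeping the catch-all clause last in each group means unsupported dtypes produce an `{:error, message}` tuple rather than a `FunctionClauseError`.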
Summary
Functions
arrow_dtype_to_explorer/1
Convert an Arrow dtype string to an Explorer dtype atom.
arrow_dtype_to_nx/1
Convert an Arrow dtype string to an Nx dtype tuple.
arrow_dtype_to_type_atom/1
Convert an Arrow dtype string to an Arrow type atom.
arrow_type_atom_to_dtype/1
Convert an Arrow type atom (as returned by ExArrow.Schema.fields/1) to an Arrow dtype string.
explorer_dtype_to_arrow/1
Convert an Explorer dtype atom to an Arrow dtype string.
numeric?/1
Returns true if the given Arrow dtype string maps to a numeric Nx dtype, false otherwise.
nx_dtype_to_arrow/1
Convert an Nx dtype tuple to an Arrow dtype string.
Types
@type arrow_dtype() :: String.t()
@type nx_dtype() :: {:s, 8 | 16 | 32 | 64} | {:u, 8 | 16 | 32 | 64} | {:f, 32 | 64}
Functions
@spec arrow_dtype_to_explorer(arrow_dtype()) :: {:ok, atom()} | {:error, String.t()}
Convert an Arrow dtype string to an Explorer dtype atom.
Returns {:ok, explorer_dtype} or {:error, message}.
Integer dtypes (s8–s64, u8–u64) all map to :integer because Explorer
does not distinguish integer widths in its dtype system. Float dtypes (f32,
f64) map to :float.
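Because the width is dropped, a round trip through Explorer does not necessarily return the original dtype. The sketch below is a stand-in, not the ExArrow implementation: the hypothetical module `ExplorerRoundTripSketch` copies only the integer and float mappings documented on this page and shows "s32" coming back as "s64".

```elixir
defmodule ExplorerRoundTripSketch do
  # Stand-in for illustration; mirrors only mappings documented on this page.

  @integers ~w(s8 s16 s32 s64 u8 u16 u32 u64)
  @floats ~w(f32 f64)

  # Arrow dtype string -> Explorer dtype atom (integers and floats only).
  def arrow_dtype_to_explorer(dtype) when dtype in @integers, do: {:ok, :integer}
  def arrow_dtype_to_explorer(dtype) when dtype in @floats, do: {:ok, :float}
  def arrow_dtype_to_explorer(other), do: {:error, "not covered by this sketch: #{inspect(other)}"}

  # Explorer dtype atom -> Arrow dtype string.
  def explorer_dtype_to_arrow(:integer), do: {:ok, "s64"}
  def explorer_dtype_to_arrow(:float), do: {:ok, "f64"}
  def explorer_dtype_to_arrow(other), do: {:error, "not covered by this sketch: #{inspect(other)}"}

  # Arrow -> Explorer -> Arrow; the original width is not recoverable.
  def round_trip(dtype) do
    with {:ok, explorer_dtype} <- arrow_dtype_to_explorer(dtype) do
      explorer_dtype_to_arrow(explorer_dtype)
    end
  end
end

ExplorerRoundTripSketch.round_trip("s32")
# => {:ok, "s64"}
```

Callers that need to preserve the original width would have to carry the Arrow dtype string alongside the Explorer dtype rather than rely on the reverse mapping.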
@spec arrow_dtype_to_nx(arrow_dtype()) :: {:ok, nx_dtype()} | {:error, String.t()}
Convert an Arrow dtype string to an Nx dtype tuple.
Returns {:ok, nx_dtype} or {:error, message}.
Boolean columns ("bool") map to {:u, 8} because Nx represents booleans
as unsigned 8-bit integers with values 0 and 1.
@spec arrow_dtype_to_type_atom(arrow_dtype()) :: {:ok, atom()} | {:error, String.t()}
Convert an Arrow dtype string to an Arrow type atom.
Returns {:ok, type_atom} or {:error, message}.
Examples
iex> ExArrow.Schema.Mapper.arrow_dtype_to_type_atom("s64")
{:ok, :int64}
iex> ExArrow.Schema.Mapper.arrow_dtype_to_type_atom("bool")
{:ok, :boolean}
@spec arrow_type_atom_to_dtype(atom()) :: {:ok, arrow_dtype()} | {:error, String.t()}
Convert an Arrow type atom (as returned by ExArrow.Schema.fields/1) to an
Arrow dtype string.
Returns {:ok, dtype_string} or {:error, message}.
This bridges the NIF schema representation (atoms like :int64) to the
dtype strings used by column buffer NIFs ("s64").
Examples
iex> ExArrow.Schema.Mapper.arrow_type_atom_to_dtype(:int64)
{:ok, "s64"}
iex> ExArrow.Schema.Mapper.arrow_type_atom_to_dtype(:boolean)
{:ok, "bool"}
iex> ExArrow.Schema.Mapper.arrow_type_atom_to_dtype(:timestamp)
{:error, "unsupported Arrow type atom for dtype mapping: timestamp"}
@spec explorer_dtype_to_arrow(atom()) :: {:ok, arrow_dtype()} | {:error, String.t()}
Convert an Explorer dtype atom to an Arrow dtype string.
Returns {:ok, dtype_string} or {:error, message}.
Supported Explorer dtypes
| Explorer dtype | Arrow dtype | Notes |
|---|---|---|
| :integer | "s64" | Explorer stores as 64-bit int |
| :float | "f64" | Explorer stores as 64-bit float |
| :boolean | "bool" | Arrow Boolean column |
| :string | "utf8" | Arrow Utf8 column |
Explorer dtypes :date, :datetime, :duration, and :nil are not yet
mapped and return an error. These will be added as the NIF layer gains
support for the corresponding Arrow types.
@spec numeric?(arrow_dtype()) :: boolean()
Returns true if the given Arrow dtype string maps to a numeric Nx dtype,
false otherwise.
Examples
iex> ExArrow.Schema.Mapper.numeric?("s64")
true
iex> ExArrow.Schema.Mapper.numeric?("bool")
true
iex> ExArrow.Schema.Mapper.numeric?("utf8")
false
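One plausible use is selecting the columns that can be handed to Nx. The sketch below assumes a hypothetical list of `{name, dtype}` pairs and uses a local stand-in predicate built from the dtype codes listed on this page; real code would call ExArrow.Schema.Mapper.numeric?/1 instead of the stand-in.

```elixir
defmodule NumericFilterSketch do
  # Stand-in predicate; real code would call ExArrow.Schema.Mapper.numeric?/1.
  # The list mirrors the codes this page documents as mapping to an Nx dtype.
  @numeric ~w(s8 s16 s32 s64 u8 u16 u32 u64 f32 f64 bool)

  def numeric?(dtype), do: dtype in @numeric

  # Keep only the names of columns whose dtype is numeric, preserving order.
  def numeric_columns(columns) do
    for {name, dtype} <- columns, numeric?(dtype), do: name
  end
end

# Hypothetical schema as {column_name, arrow_dtype} pairs.
columns = [{"id", "s64"}, {"label", "utf8"}, {"score", "f32"}, {"active", "bool"}]

NumericFilterSketch.numeric_columns(columns)
# => ["id", "score", "active"]
```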
@spec nx_dtype_to_arrow(nx_dtype()) :: {:ok, arrow_dtype()} | {:error, String.t()}
Convert an Nx dtype tuple to an Arrow dtype string.
Returns {:ok, dtype_string} or {:error, message}.
Supported Nx dtypes
| Nx dtype | Arrow dtype |
|---|---|
| {:s, 8} | "s8" |
| {:s, 16} | "s16" |
| {:s, 32} | "s32" |
| {:s, 64} | "s64" |
| {:u, 8} | "u8" |
| {:u, 16} | "u16" |
| {:u, 32} | "u32" |
| {:u, 64} | "u64" |
| {:f, 32} | "f32" |
| {:f, 64} | "f64" |
Nx does not have a dedicated boolean dtype; booleans are represented as
{:u, 8} with values 0 and 1. Arrow Boolean columns map to {:u, 8} via
arrow_dtype_to_nx/1.
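One consequence of the table and the note above is that the Boolean mapping is not reversible: "bool" becomes {:u, 8}, and {:u, 8} maps back to "u8". The sketch below shows this with a hypothetical stand-in module, `NxBoolSketch`, that reproduces only the clauses needed for the example. The "u8" clause in the Arrow-to-Nx direction is assumed by symmetry with the table rather than stated on this page.

```elixir
defmodule NxBoolSketch do
  # Stand-in for illustration; only the clauses needed for the example.

  # Arrow dtype string -> Nx dtype tuple.
  def arrow_dtype_to_nx("bool"), do: {:ok, {:u, 8}}
  # Assumed by symmetry with the Nx -> Arrow table.
  def arrow_dtype_to_nx("u8"), do: {:ok, {:u, 8}}
  def arrow_dtype_to_nx(other), do: {:error, "not covered by this sketch: #{inspect(other)}"}

  # Nx dtype tuple -> Arrow dtype string.
  def nx_dtype_to_arrow({:u, 8}), do: {:ok, "u8"}
  def nx_dtype_to_arrow(other), do: {:error, "not covered by this sketch: #{inspect(other)}"}
end

{:ok, nx_dtype} = NxBoolSketch.arrow_dtype_to_nx("bool")

NxBoolSketch.nx_dtype_to_arrow(nx_dtype)
# => {:ok, "u8"}
```

Code that must write a tensor back as an Arrow Boolean column would need to remember the original "bool" dtype separately, since the Nx dtype alone cannot distinguish it from UInt8.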