ExArrow. Broadway. BatchBuilder
(ex_arrow v0.7.0)
View Source
Assemble ExArrow.RecordBatch values from Broadway messages.
Each Broadway message is expected to carry one of:
- an
ExArrow.RecordBatch.t()handle inmessage.data(the common case for Arrow-aware producers), or - a
{names, binaries, dtypes, length}tuple describing raw Arrow columns, which is converted to a batch viaExArrow.RecordBatch.from_columns/4.
from_messages/1 returns {:ok, schema, batches} where schema is the
schema of the first batch and batches is the list of assembled batches.
from_messages/2 accepts options:
:rows_per_batch— split the assembled batches into smaller batches of at most this many rows by re-chunking the column buffers. Defaults to:infinity(no splitting).
Example
{:ok, schema, batches} =
ExArrow.Broadway.BatchBuilder.from_messages(messages)
Summary
Functions
Extract the ExArrow.RecordBatch handles from a list of Broadway messages,
without resolving the schema.
Build a list of ExArrow.RecordBatch values from a list of Broadway
messages, returning the shared schema and the batch list.
Functions
@spec extract_batches([term()]) :: {:ok, [ExArrow.RecordBatch.t()]} | {:error, String.t()}
Extract the ExArrow.RecordBatch handles from a list of Broadway messages,
without resolving the schema.
Returns {:ok, [batch, ...]} or {:error, message}.
@spec from_messages([term()]) :: {:ok, ExArrow.Schema.t(), [ExArrow.RecordBatch.t()]} | {:error, String.t()}
Build a list of ExArrow.RecordBatch values from a list of Broadway
messages, returning the shared schema and the batch list.
Returns {:ok, schema, [batch, ...]} or {:error, message}.
@spec from_messages( [term()], keyword() ) :: {:ok, ExArrow.Schema.t(), [ExArrow.RecordBatch.t()]} | {:error, String.t()}