ExArrow.Broadway.ParquetSink
(ex_arrow v0.7.0)
Write assembled Arrow batches to a Parquet file from a Broadway batch handler.
Intended to be called from a Broadway handle_batch/4 callback. The
batches are written in a single ExArrow.Parquet.Writer.to_file/3 call so
the output is one Parquet file with one row group per batch (subject to the
writer's chunking).
Emits a [:ex_arrow, :parquet, :write] telemetry event with :rows,
:batch_count, and %{destination: path, source: :broadway} metadata.
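To observe these writes, a handler can be attached to the event. The following is a minimal sketch, not part of ex_arrow: the module name ParquetWriteLogger and the handler id are hypothetical, the :telemetry package is assumed to be available, and :rows and :batch_count are assumed to arrive in the measurements map (the text above does not say whether they are measurements or metadata).

```elixir
defmodule ParquetWriteLogger do
  @moduledoc false
  # Hypothetical helper for illustration; not part of ex_arrow.
  require Logger

  @event [:ex_arrow, :parquet, :write]

  # Attach once at application start. Requires the :telemetry dependency.
  def attach do
    :telemetry.attach("parquet-write-logger", @event, &__MODULE__.handle_event/4, nil)
  end

  # Standard :telemetry handler arity: event, measurements, metadata, config.
  def handle_event(@event, measurements, metadata, _config) do
    Logger.info(format(measurements, metadata))
  end

  # Assumption: :rows and :batch_count are measurements. Missing keys
  # are rendered as "?" instead of raising.
  def format(measurements, metadata) do
    rows = Map.get(measurements, :rows, "?")
    batches = Map.get(measurements, :batch_count, "?")

    "wrote #{rows} rows in #{batches} batches to #{metadata[:destination]} " <>
      "(source: #{inspect(metadata[:source])})"
  end
end
```

Calling ParquetWriteLogger.attach() once, for example from the application's start callback, is enough. Passing the handler as a remote capture (&__MODULE__.handle_event/4) rather than an anonymous function follows the recommendation in the :telemetry documentation.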
Example
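    def handle_batch(:parquet, messages, _info, _ctx) do
      {:ok, schema, batches} = ExArrow.Broadway.BatchBuilder.from_messages(messages)

      case ExArrow.Broadway.ParquetSink.write("/data/out.parquet", schema, batches) do
        :ok -> messages
        {:error, reason} -> Enum.map(messages, &Broadway.Message.failed(&1, reason))
      end
    end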
Functions
write(path, schema, batches)
@spec write(Path.t(), ExArrow.Schema.t(), [ExArrow.RecordBatch.t()]) :: :ok | {:error, String.t()}
Write schema and batches to a Parquet file at path.
Returns :ok on success, or {:error, message} on failure, where message is a string.