DeltaQuery.Client (DeltaQuery v0.2.3)


HTTP client for the Delta Sharing REST API.

Implements the Delta Sharing Protocol for reading shared Delta tables. See: https://github.com/delta-io/delta-sharing/blob/main/PROTOCOL.md

Parquet Files

Delta Sharing returns data as Parquet files, a columnar storage format optimized for analytical queries. This client downloads Parquet files and parses them using Explorer.

See: https://parquet.apache.org/docs/

Predicates

Predicates are SQL-like filter expressions used to reduce data transfer and improve performance. They work at two levels:

  1. Partition filtering - Server-side filtering that excludes entire Parquet files based on partition values, reducing the number of files downloaded.

  2. Row filtering - Client-side filtering applied after downloading Parquet files to further narrow results to matching rows.

Example predicates: ["book_id = 123", "genre = 'Fiction'", "publication_date > '2024-01-01'"]
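The two levels can be illustrated with a small conceptual sketch over in-memory data. This is not DeltaQuery's implementation: the module name, the file shape (a `:partition` map plus `:rows`), and the equality-only matching are all assumptions made for illustration.

```elixir
defmodule PredicateLevelsSketch do
  @moduledoc false
  # Conceptual illustration only; not how DeltaQuery filters internally.

  # Each hypothetical "file" carries its partition values and its rows.
  def sample_files do
    [
      %{
        partition: %{"genre" => "Fiction"},
        rows: [
          %{"book_id" => 123, "genre" => "Fiction"},
          %{"book_id" => 456, "genre" => "Fiction"}
        ]
      },
      %{
        partition: %{"genre" => "History"},
        rows: [%{"book_id" => 789, "genre" => "History"}]
      }
    ]
  end

  # Level 1 (partition filtering): drop whole files whose partition
  # value does not match, so they would never be downloaded.
  def prune_files(files, column, value) do
    Enum.filter(files, fn file -> Map.get(file.partition, column) == value end)
  end

  # Level 2 (row filtering): keep only matching rows from the
  # files that survived level 1.
  def filter_rows(files, column, value) do
    files
    |> Enum.flat_map(& &1.rows)
    |> Enum.filter(fn row -> Map.get(row, column) == value end)
  end
end

files = PredicateLevelsSketch.sample_files()
kept = PredicateLevelsSketch.prune_files(files, "genre", "Fiction")
rows = PredicateLevelsSketch.filter_rows(kept, "book_id", 123)

IO.inspect(length(kept))
IO.inspect(rows)
```

Here the partition predicate removes the History file before any rows are read, and the row predicate then narrows the remaining file to a single row.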

Summary

Functions

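from_config(config)

Create a new client from a Config struct.

list_schemas(client, share)

List schemas in a share.

list_tables(client, share, schema)

List tables in a schema.

new(endpoint, bearer_token, opts \\ [])

Create a new client from endpoint and bearer token.

parse_parquet_files(client, files, opts \\ [])

Download and parse Parquet files from a Delta Sharing query response.

query_table(client, share, schema, table, opts \\ [])

Query table data with optional predicates and limits.

table_metadata(client, share, schema, table)

Get table metadata including schema (column names and types).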

Types

t()

@type t() :: %DeltaQuery.Client{download_req: Req.Request.t(), req: Req.Request.t()}

Functions

from_config(config)

@spec from_config(DeltaQuery.Config.t()) :: t()

Create a new client from a Config struct.

list_schemas(client, share)

@spec list_schemas(t(), String.t()) :: {:ok, [map()]} | {:error, term()}

List schemas in a share.

Returns a list of schema metadata maps.

list_tables(client, share, schema)

@spec list_tables(t(), String.t(), String.t()) :: {:ok, [map()]} | {:error, term()}

List tables in a schema.

Returns a list of table metadata maps.
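A minimal sketch of walking a share with list_schemas/2 and list_tables/3 might look like the following. The module and the `name_of/1` helper are hypothetical, and the assumption that each metadata map exposes its name under a `"name"` (or `:name`) key comes from the Delta Sharing protocol's JSON shape rather than from this documentation, so verify it against an actual response.

```elixir
defmodule ShareDiscoverySketch do
  @moduledoc false
  alias DeltaQuery.Client

  # Assumption: metadata maps carry the name under "name" or :name.
  def name_of(%{"name" => name}), do: name
  def name_of(%{name: name}), do: name

  # Returns {:ok, %{schema_name => [table_metadata_map]}},
  # or the first {:error, reason} encountered.
  def tables_by_schema(client, share) do
    with {:ok, schemas} <- Client.list_schemas(client, share) do
      Enum.reduce_while(schemas, {:ok, %{}}, fn schema, {:ok, acc} ->
        schema_name = name_of(schema)

        case Client.list_tables(client, share, schema_name) do
          {:ok, tables} -> {:cont, {:ok, Map.put(acc, schema_name, tables)}}
          {:error, _reason} = error -> {:halt, error}
        end
      end)
    end
  end
end
```

Stopping at the first error keeps the return shape consistent with the `{:ok, _} | {:error, _}` convention used by both functions.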

new(endpoint, bearer_token, opts \\ [])

@spec new(String.t(), String.t(), keyword()) :: t()

Create a new client from endpoint and bearer token.

Options

Any additional options are passed to Req.new/1 (e.g., :retry, :connect_options).
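As a sketch of passing Req options through new/3, assuming the options are forwarded unchanged: the module name, endpoint URL, and environment variable below are placeholders, and the specific values (`retry: :transient`, a 5 second connect timeout) are illustrative choices rather than recommendations from the library.

```elixir
defmodule ClientSetupSketch do
  @moduledoc false

  # Req options to forward; both keys are standard Req.new/1 options.
  def req_opts do
    [retry: :transient, connect_options: [timeout: 5_000]]
  end

  # Builds a client from an endpoint and bearer token plus the options above.
  def build(endpoint, bearer_token) do
    DeltaQuery.Client.new(endpoint, bearer_token, req_opts())
  end
end

# Usage (placeholder endpoint and token source):
# client =
#   ClientSetupSketch.build(
#     "https://sharing.example.com/delta-sharing",
#     System.fetch_env!("DELTA_SHARING_TOKEN")
#   )
```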

parse_parquet_files(client, files, opts \\ [])

@spec parse_parquet_files(t(), [map()], keyword()) :: {:ok, Explorer.DataFrame.t()}

Download and parse Parquet files from a Delta Sharing query response.

Returns an Explorer DataFrame, enabling joins, grouping, and aggregations. Use Explorer.DataFrame.to_rows/1 to convert to a list of maps if needed.

Options

  • :predicates - List of SQL-like filter strings (e.g., ["genre = 'Fiction'", "book_id = 123"])
  • :columns - List of column names to return (nil = all columns)
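The options above might be combined as in this sketch. The module name is hypothetical, the column names are borrowed from the example predicates in this document, and `files` stands for the list of file maps taken from a query response (how that list is extracted from the response map is not specified here).

```elixir
defmodule ParseSketch do
  @moduledoc false
  alias DeltaQuery.Client

  # Row filter plus column projection, using the documented option keys.
  def parse_opts do
    [predicates: ["genre = 'Fiction'"], columns: ["book_id", "genre"]]
  end

  # Downloads and parses the given files, then converts the DataFrame
  # to a list of maps with Explorer.DataFrame.to_rows/1.
  def fiction_rows(client, files) do
    {:ok, df} = Client.parse_parquet_files(client, files, parse_opts())
    Explorer.DataFrame.to_rows(df)
  end
end
```

Keeping the result as a DataFrame is usually preferable for joins or aggregations; converting to rows is shown only because it is the simplest shape to inspect.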

query_table(client, share, schema, table, opts \\ [])

@spec query_table(t(), String.t(), String.t(), String.t(), keyword()) ::
  {:ok, map()} | {:error, term()}

Query table data with optional predicates and limits.

Options

  • :limit - Maximum number of rows to return. Sent to the server as a hint, so the response may still cover more rows.
  • :predicate_hints - SQL-like predicates sent to the server as filtering hints (e.g., ["date > '2024-01-01'"])
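A hedged sketch of calling query_table/5 with both options and handling either result follows. The module name, the `:query_failed` error tag, and the limit of 1,000 are illustrative assumptions; the `publication_date` column comes from the example predicates earlier in this document.

```elixir
defmodule QuerySketch do
  @moduledoc false
  alias DeltaQuery.Client

  # Builds the documented options; `since` is an ISO 8601 date string.
  def query_opts(since) do
    [limit: 1_000, predicate_hints: ["publication_date > '#{since}'"]]
  end

  # Passes the response map through on success and tags errors so
  # callers can tell query failures apart from other error sources.
  def recent(client, share, schema, table, since) do
    case Client.query_table(client, share, schema, table, query_opts(since)) do
      {:ok, response} -> {:ok, response}
      {:error, reason} -> {:error, {:query_failed, reason}}
    end
  end
end
```

Because the predicate is built by string interpolation, `since` should come from a trusted or validated source.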

table_metadata(client, share, schema, table)

@spec table_metadata(t(), String.t(), String.t(), String.t()) ::
  {:ok, map()} | {:error, term()}

Get table metadata including schema (column names and types).

Returns the protocol and metadata from the table, including the schema string.
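In the Delta Sharing protocol the schema string is a JSON-serialized struct type, so once it has been decoded with a JSON library the column names and types can be listed as sketched below. The module is hypothetical, the sample schema is made up for illustration, and where the schema string sits inside the returned map is not documented here.

```elixir
defmodule SchemaSketch do
  @moduledoc false

  # Takes an already-decoded schema (a struct type with a "fields" list)
  # and returns {column_name, type} pairs in declaration order.
  # Nested types appear as maps and are returned unchanged.
  def columns(%{"type" => "struct", "fields" => fields}) do
    Enum.map(fields, fn %{"name" => name, "type" => type} -> {name, type} end)
  end
end

# Made-up example of a decoded schema string.
decoded = %{
  "type" => "struct",
  "fields" => [
    %{"name" => "book_id", "type" => "long", "nullable" => false, "metadata" => %{}},
    %{"name" => "genre", "type" => "string", "nullable" => true, "metadata" => %{}}
  ]
}

IO.inspect(SchemaSketch.columns(decoded))
# prints [{"book_id", "long"}, {"genre", "string"}]
```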