QuackDB connects Elixir applications to a remote DuckDB process through DuckDB's experimental Quack protocol. The client talks to the Quack HTTP endpoint, decodes DuckDB result chunks, and exposes the connection through DBConnection.
Requirements
- Elixir 1.19 or newer
- DuckDB 1.5.3 or newer for the current Quack extension behavior
- A running Quack server
Install
Add :quackdb to your dependencies:
def deps do
[
{:quackdb, "~> 0.3.0"}
]
endOptional integrations are compiled only when their packages are available. Add Explorer when you want dataframe handoff helpers:
def deps do
[
{:quackdb, "~> 0.3.0"},
{:explorer, "~> 0.11"}
]
endThen fetch dependencies:
mix deps.get
Start DuckDB with Quack
For local development, QuackDB can supervise DuckDB's external CLI process for you:
children =
QuackDB.Server.child_specs(
server: [name: MyApp.DuckDB, duckdb: :managed, endpoint: "quack:localhost:9494"],
client: [name: MyApp.QuackDB, pool_size: System.schedulers_online()]
)child_specs/1 generates one shared random token and injects the matching URI/token into both child specs. Pass :token on either side when you want to provide it yourself.
duckdb: :managed downloads and caches DuckDB's official CLI binary when the local server starts. Managed downloads are checksum-verified for QuackDB.Binary.default_version/0; other DuckDB versions require passing an explicit :sha256. For explicit setup, run:
mix quackdb.install
mix quackdb.install --print-path
Use QUACKDB_BINARY_PATH or pass duckdb: "/path/to/duckdb" when you want QuackDB to use a system or custom executable instead. See the managed DuckDB guide for cache, checksum, and target-prefetch options.
Or start DuckDB manually with the quack extension loaded:
duckdb -interactive -init /dev/null \
-cmd "LOAD quack; CALL quack_serve('quack:localhost', token='super_secret');"
On some systems, quack:localhost binds to IPv6 localhost. If http://localhost:9494 does not connect, use http://[::1]:9494.
Connect from Elixir
{:ok, conn} =
QuackDB.start_link(
uri: "http://[::1]:9494",
token: "super_secret"
)QuackDB.start_link/1 starts a DBConnection process. You can pass the connection to QuackDB.ping/2, QuackDB.query/4, QuackDB.prepare_execute/4, QuackDB.stream/4, or DBConnection APIs.
Run a query
:ok = QuackDB.ping(conn)
{:ok, result} = QuackDB.query(conn, "SELECT 1 AS n")
result.columns
#=> ["n"]
result.rows
#=> [[1]]
result.num_rows
#=> 1rows are row-oriented lists. This shape is convenient for DBConnection and future Ecto integration.
QuackDB formats positional parameters as DuckDB SQL literals client-side because the current Quack request path does not expose server-side bind parameters:
{:ok, result} = QuackDB.query(conn, "SELECT ? AS name, ? AS n", ["duck", 42])
result.rows
#=> [["duck", 42]]Placeholders inside strings and comments are ignored while formatting, and unsupported parameter values raise explicit errors.
Decode nested values
DuckDB nested result types decode to ordinary Elixir values:
{:ok, result} =
QuackDB.query(conn, """
SELECT
[1, 2, 3] AS xs,
{'name': 'duck', 'count': 2} AS obj,
array_value(1, 2, 3) AS arr,
map(['a', 'b'], [1, 2]) AS m,
[{'a': 1}, {'a': 2}] AS nested
""")
result.rows
#=> [
#=> [
#=> [1, 2, 3],
#=> %{"name" => "duck", "count" => 2},
#=> [1, 2, 3],
#=> %{"a" => 1, "b" => 2},
#=> [%{"a" => 1}, %{"a" => 2}]
#=> ]
#=> ]Query files and lakehouse sources
DuckDB can scan files, object stores, and lakehouse table formats directly. QuackDB.Source builds safe table-function fragments that can be used from Ecto queries or direct SQL:
use QuackDB.Ecto
alias QuackDB.Source
source = Source.csv("events.csv", header: true)
MyApp.AnalyticsRepo.all(
from event in source,
where: event.id > 1,
select: %{id: event.id, name: event.name}
)See guides/sources.md for HTTP(S), S3/R2/GCS, Azure/ADLS, Hugging Face, Delta, Iceberg, extensions, and DuckDB secrets.
Stream large result sets
QuackDB.query/4 materializes the full result. QuackDB fetches additional result chunks when DuckDB reports that more rows are available, but for large analytical results prefer streaming helpers.
Use QuackDB.stream/4 to process %QuackDB.Result{} batches lazily:
row_count =
conn
|> QuackDB.stream("SELECT i FROM range(0, 50_000) t(i)")
|> Enum.reduce(0, fn result, count -> count + result.num_rows end)
row_count
#=> 50_000Use QuackDB.rows/4 for row-level streaming:
conn
|> QuackDB.rows("SELECT i FROM range(0, ?) t(i)", [50_000])
|> Enum.take(3)
#=> [[0], [1], [2]]Use QuackDB.maps/4 for row maps keyed by column names. Duplicate column names are disambiguated with suffixes such as _2 and _3:
conn
|> QuackDB.maps("SELECT i AS n FROM range(0, ?) t(i)", [50_000])
|> Enum.take(2)
#=> [%{"n" => 0}, %{"n" => 1}]Use QuackDB.columnar/4 when an analytical workflow wants vectors plus column order and metadata:
{:ok, columns} = QuackDB.columnar(conn, "SELECT id, name FROM events ORDER BY id")
columns.names
#=> ["id", "name"]
columns["id"]
#=> [1, 2]QuackDB.columns/4 returns just the column map:
{:ok, columns} = QuackDB.columns(conn, "SELECT id, name FROM events ORDER BY id")
columns
#=> %{"id" => [1, 2], "name" => ["duck", "goose"]}For large results, QuackDB.columnar_batches/4 streams QuackDB.Columns fetch batches without materializing the whole result set. QuackDB.column_batches/4 returns just the map from each batch:
conn
|> QuackDB.column_batches("SELECT i AS n FROM range(0, 50_000) t(i)", [], max_rows: 1_000)
|> Enum.take(1)
#=> [%{"n" => [0, 1, 2, ...]}]This is not Arrow IPC yet, but it exposes a column-oriented shape that can back future Arrow integration without changing the query API.
Convert results to Explorer DataFrames
When :explorer is available, QuackDB exposes optional helpers for building Explorer.DataFrame values from query results:
Ecto queries can be passed directly when you already have schemas or source helpers:
import Ecto.Query
alias QuackDB.Explorer, as: QuackExplorer
alias QuackDB.Source
source = Source.csv("events.csv", header: true)
query =
from event in source,
where: event.id > ^1,
select: %{id: event.id, name: event.name}
{:ok, df} = QuackExplorer.dataframe(conn, query)The Explorer integration materializes query results in Elixir before constructing a dataframe. It is useful for interactive analysis and downstream Explorer pipelines, but it is not a zero-copy Arrow IPC path yet.
Explorer dataframes can also be appended through Quack's native column-oriented append path:
alias QuackDB.Explorer, as: QuackExplorer
QuackExplorer.insert_dataframe!(conn, "events_copy", df, batch_size: 10_000)You can also convert existing results:
alias QuackDB.Explorer, as: QuackExplorer
{:ok, result} = QuackDB.query(conn, "SELECT 1 AS id, 'duck' AS name")
{:ok, df} = QuackExplorer.from_result(result)Work with command results
DuckDB returns affected counts as a Count result column for DML statements. QuackDB normalizes those into num_rows:
alias QuackDB.{DDL, DML}
{:ok, _} = QuackDB.query(conn, DDL.create_table("events", [id: :integer], temporary: true))
{:ok, result} = QuackDB.query(conn, DML.insert_into("events", [[id: 1], [id: 2]]))
result.command
#=> :insert
result.num_rows
#=> 2
result.columns
#=> nil
result.rows
#=> nilThe raw DuckDB count result stays available for debugging:
result.metadata[:duckdb_columns]
#=> ["Count"]
result.metadata[:duckdb_rows]
#=> [[2]]Append rows
QuackDB.insert_rows/4 uses Quack's append protocol to send a DuckDB DataChunk directly to a table:
alias QuackDB.DDL
QuackDB.query!(conn, DDL.create_table("events", [id: :integer, name: :varchar, active: :boolean], temporary: true))
{:ok, result} =
QuackDB.insert_rows(conn, "events", [
[id: 1, name: "duck", active: true],
[id: 2, name: "goose", active: false]
])
result.command
#=> :insert
result.num_rows
#=> 2Keyword rows preserve append order and allow QuackDB to infer the column list from the first row. Map rows are also accepted, but pass :columns for stable append order and types. Explicit columns are still required for empty batches or all-null columns. Use batch_size: n to split large inputs across multiple append requests while returning the total inserted row count.
Native append columns can be declared with scalar QuackDB.Type specs and nested specs such as {:list, :varchar}, {:struct, [source: :varchar, count: :integer]}, {:array, :integer, 3}, and {:map, :varchar, :varchar}. Temporal append values are normalized through Elixir's Calendar-aware Date, Time, NaiveDateTime, and DateTime conversion APIs before encoding.
Full-text search helpers
DuckDB's fts extension can index text columns and expose BM25 ranking.
alias QuackDB.FTS
MyApp.AnalyticsRepo.query!(FTS.install())
MyApp.AnalyticsRepo.query!(FTS.load())
MyApp.AnalyticsRepo.query!(
FTS.create_index("documents", :id, [:title, :body], overwrite: true)
)
from doc in "documents",
where: match_bm25("fts_main_documents", doc.id, ^"duckdb analytics") > 0,
order_by: [desc: match_bm25("fts_main_documents", doc.id, ^"duckdb analytics")],
select: %{id: doc.id, title: doc.title}See the full-text search guide for direct SQL and Ecto usage.
Spatial helpers
DuckDB's spatial extension can be loaded with SQL helpers. Use Ecto spatial helpers when you want to keep spatial expressions inside Ecto queries:
use QuackDB.Ecto
alias QuackDB.Spatial
QuackDB.query!(conn, Spatial.load())
point = %Geo.Point{coordinates: {1.0, 2.0}, srid: nil}
from(place in "places",
where: intersects(place.geom, ^point),
select: as_text(place.geom)
)DuckDB GEOMETRY values decode as WKB-compatible bytes. Add optional {:geo, "~> 4.1"} when you want to convert those bytes to Geo structs with QuackDB.Geometry.decode_wkb!/1, or pass %Geo.*{} structs as Ecto parameters.
Inspect output in IEx
QuackDB implements compact inspection for common structs so manual review stays readable:
QuackDB.query!(conn, "SELECT i FROM range(0, 4) t(i)")
#QuackDB.Result<command: :select, columns: ["i"], rows: 4, preview: [[0], [1], [2], :...], connection_id: "...", needs_more_fetch?: false>The actual rows are still available through result.rows.
Transactions
QuackDB implements DBConnection transaction callbacks with SQL statements:
alias QuackDB.{DDL, DML}
DBConnection.transaction(conn, fn tx ->
QuackDB.query!(tx, DDL.create_table("tx_events", [id: :integer], temporary: true))
QuackDB.query!(tx, DML.insert_into("tx_events", id: 1))
end)Ecto raw SQL
QuackDB includes an initial Ecto SQL adapter for raw SQL queries. The Ecto adapter is compiled when ecto_sql is available, so add Ecto SQL if your app does not already depend on it:
def deps do
[
{:quackdb, "~> 0.3.0"},
{:ecto_sql, "~> 3.13"}
]
endThen define a repo:
defmodule MyApp.AnalyticsRepo do
use Ecto.Repo,
otp_app: :my_app,
adapter: Ecto.Adapters.QuackDB
endConfigure the repo with the same options accepted by QuackDB.start_link/1:
config :my_app, MyApp.AnalyticsRepo,
uri: "http://[::1]:9494",
token: "super_secret"Generated DDL and setup-oriented DML can participate in Ecto transactions:
{:ok, :committed} =
MyApp.AnalyticsRepo.transaction(fn ->
MyApp.AnalyticsRepo.query!(
QuackDB.DDL.create_table("events", [id: :integer], temporary: true)
)
MyApp.AnalyticsRepo.query!(QuackDB.DML.insert_into("events", [[id: 1], [id: 2]]))
:committed
end)Use Repo.rollback/1 to abort transaction work:
{:error, :rolled_back} =
MyApp.AnalyticsRepo.transaction(fn ->
MyApp.AnalyticsRepo.query!(QuackDB.DML.insert_into("events", id: 3))
MyApp.AnalyticsRepo.rollback(:rolled_back)
end)Read-only Ecto queries against table names are also supported, including CTEs, window functions, joins, grouping, having, distinct, aggregate FILTER, arithmetic expressions, in/2, predicates, ordering, limits, aggregates, fragments, and DuckDB analytical helpers:
use QuackDB.Ecto
MyApp.AnalyticsRepo.all(
from event in "events",
where: event.id > ^min_id and like(event.name, "d%"),
group_by: event.category,
select: %{
category: event.category,
median_score: median(event.score),
p95_score: quantile_cont(event.score, 0.95),
scores: duckdb_list(event.score)
}
)Ecto insert/2 and insert_all/3 are supported for straightforward row inserts. DuckDB RETURNING works through the SQL insert path:
MyApp.AnalyticsRepo.insert!(%Event{id: 1, name: "duck"})
MyApp.AnalyticsRepo.insert_all(
"events",
[[id: 2, name: "goose"]],
returning: [:id]
)Use insert_method: :append to opt into Quack's native append protocol for plain insert_all workloads. This fast path does not support query inserts, :returning, placeholders, or upserts.
MyApp.AnalyticsRepo.insert_all(
"events",
[[id: 1, name: "duck"], [id: 2, name: "goose"]],
insert_method: :append,
chunk_every: 10_000
)For temporary analytical setup, QuackDB.DDL.create_table/3 builds quoted DuckDB CREATE TABLE and CREATE TABLE AS statements:
MyApp.AnalyticsRepo.query!(
QuackDB.DDL.create_table("events",
[payload: :json, occurred_at: :timestamp],
temporary: true
)
)
source = QuackDB.Source.parquet("s3://bucket/events/*.parquet")
query =
from event in source,
select: %{id: event.id, name: event.name}
MyApp.AnalyticsRepo.query!(
QuackDB.DDL.create_table("events_from_parquet", as: query, temporary: true)
)DDL.create_table/2 rejects parameterized Ecto queries in :as because DDL helpers return SQL iodata, not {sql, params} tuples.
Ecto support covers analytical reads and common write/setup flows. Repo.query/3, schema-backed reads, combinations, inserts/upserts, schema update/delete callbacks, update_all / delete_all mutations, EXPLAIN, transactions, and basic migrator-backed DDL work; advanced migration features and DuckDB-specific SQL should still use Repo.query/3.
Basic Ecto migrations
QuackDB implements the Ecto migration DDL callbacks needed for common analytical setup migrations:
defmodule MyApp.Repo.Migrations.CreateEvents do
use Ecto.Migration
def change do
create table(:events, primary_key: false) do
add(:id, :integer, primary_key: true)
add(:name, :string, null: false)
add(:score, :integer, default: 0)
end
create(index(:events, [:name]))
create(constraint(:events, :positive_score, check: "score >= 0"))
end
endSupported DDL includes create/drop/alter table, add/modify/drop columns, references, ordinary and unique indexes, primary keys, composite primary keys, check constraints, and table/column renames. DuckDB-incompatible options such as concurrent indexes, covering indexes, exclude constraints, constraint comments, and NOT VALID constraints raise explicit QuackDB errors.
Current limitations
- Server-side bind parameters are not exposed by this Quack client path yet. QuackDB formats supported parameter values as DuckDB SQL literals client-side.
- Native appends support row and column batches but not Arrow IPC or automatic local-file/data staging yet.
- Ecto coverage focuses on analytical reads and common write/setup workflows, not every relational adapter feature.
- Quack is experimental and may change with DuckDB releases.
Supervision and connection options
Use QuackDB under your application supervisor when you want a long-lived connection pool:
children = [
{QuackDB,
uri: "http://[::1]:9494",
token: "super_secret",
name: MyApp.QuackDB,
pool_size: 5}
]The client accepts QuackDB options such as :uri, :token, and :transport, plus DBConnection pool options such as :name, :pool_size, :queue_target, :queue_interval, and per-call :timeout.
QuackDB.query(MyApp.QuackDB, "SELECT 1", [], timeout: 10_000)For local development, tests, or notebooks, QuackDB can also supervise a local DuckDB Quack server process with MuonTrap:
children =
QuackDB.Server.child_specs(
server: [
name: MyApp.DuckDB,
duckdb: :managed,
endpoint: "quack:localhost:9494",
settings: [threads: System.schedulers_online()],
global_settings: [quack_fetch_batch_chunks: 4]
],
client: [name: MyApp.QuackDB, pool_size: System.schedulers_online()]
)QuackDB.Server starts the external duckdb executable and serves the Quack protocol. It is a convenience process supervisor, not an embedded DuckDB driver and not required for remote DuckDB servers. Use duckdb: :managed for local development convenience, or omit it to use duckdb from PATH.
The local server defaults to settings: [threads: System.schedulers_online()] and global_settings: [quack_fetch_batch_chunks: 4]. Use a smaller client pool_size such as 1..4 for heavy analytical scans, because DuckDB parallelizes each query internally. Use System.schedulers_online() for many small concurrent queries.
Running QuackDB's integration tests
With a server running locally:
QUACKDB_TEST_URI='http://[::1]:9494' \
QUACKDB_TEST_TOKEN=super_secret \
mix test --include integration