Zvex.Collection.Schema (zvex v0.4.0)

Schema builder for zvec collections.

The schema is a pure Elixir struct: no NIF resources are held while it is being built. It is materialized into C objects only when Zvex.Collection.create/3 is called.

Example

Zvex.Collection.Schema.new("my_collection")
|> Zvex.Collection.Schema.add_field("id", :string, primary_key: true)
|> Zvex.Collection.Schema.add_field("embedding", :vector_fp32,
     dimension: 768,
     index: [type: :hnsw, metric: :cosine, m: 16, ef_construction: 200])
|> Zvex.Collection.Schema.add_field("category", :string,
     nullable: true, index: [type: :invert])

Types

field()

@type field() :: %{
  name: String.t(),
  data_type: atom(),
  primary_key: boolean(),
  nullable: boolean(),
  dimension: non_neg_integer(),
  index: Zvex.Collection.Schema.IndexParams.t() | nil
}

A single field definition within a schema.

  • :name — field name (must be unique within the schema).
  • :data_type — one of the atoms from Zvex.Types.data_types/0.
  • :primary_key — whether this field is the primary key. Exactly one field per schema must set this to true, and that field must be of type :string.
  • :nullable — whether the field accepts null values.
  • :dimension — required for vector types, must be > 0.
  • :index — optional Zvex.Collection.Schema.IndexParams for the field.

t()

@type t() :: %Zvex.Collection.Schema{
  fields: [field()],
  max_doc_count_per_segment: pos_integer() | nil,
  name: String.t()
}

A collection schema definition.

  • :name — collection name, must be a non-empty string.
  • :fields — ordered list of field definitions.
  • :max_doc_count_per_segment — optional cap on documents per segment.
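
Because the schema is a plain struct, it can be inspected with ordinary Elixir code before anything is created on the native side. The following is a minimal sketch, assuming zvex is already a dependency of the current Mix project; the collection and field names are made up for illustration.

```elixir
alias Zvex.Collection.Schema

schema =
  Schema.new("docs")
  |> Schema.add_field("id", :string, primary_key: true)
  |> Schema.add_field("body", :string, nullable: true)

# Each entry in :fields is a map with the keys listed under field().
field_names = Enum.map(schema.fields, & &1.name)

# Look up the primary key field by its flag.
primary_key = Enum.find(schema.fields, & &1.primary_key)

IO.inspect(field_names, label: "fields")
IO.inspect(primary_key.name, label: "primary key")
```

Since :fields is documented as an ordered list and add_field/4 appends, the names should come back in the order the fields were added.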

Functions

add_field(schema, name, data_type, opts \\ [])

@spec add_field(t(), String.t(), atom(), keyword()) :: t()

Appends a field to the schema.

Options

  • :primary_key — marks this field as the primary key (default false). Exactly one :string field must be the primary key.
  • :nullable — allows null values (default false, forced false for primary keys).
  • :dimension — vector dimension, required for vector data types (default 0).
  • :index — keyword list of index parameters forwarded to Zvex.Collection.Schema.IndexParams.from_opts/1. Must include :type.
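
As a rough sketch of how these options combine, the snippet below builds a three-field schema. It assumes zvex is already a dependency of the current Mix project. The collection and field names are hypothetical, and only data types and index settings that already appear on this page are used.

```elixir
alias Zvex.Collection.Schema

schema =
  Schema.new("articles")
  # Primary key: must be a :string field; :nullable is forced to false.
  |> Schema.add_field("id", :string, primary_key: true)
  # Vector field: :dimension is required and must be > 0.
  # The :index keyword list must include :type.
  |> Schema.add_field("embedding", :vector_fp32,
    dimension: 768,
    index: [type: :hnsw, metric: :cosine, m: 16, ef_construction: 200]
  )
  # No options given: the documented defaults apply
  # (not a primary key, not nullable, dimension 0).
  |> Schema.add_field("title", :string)
```

Aliasing the module keeps the pipeline short; each call returns a new schema struct, so the steps can be chained freely.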

max_doc_count_per_segment(schema, count)

@spec max_doc_count_per_segment(t(), pos_integer()) :: t()

Sets the maximum number of documents per segment. The count argument must be a positive integer.
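
A minimal usage sketch, assuming zvex is already a dependency of the current Mix project. The collection name and the cap of 100_000 are arbitrary illustrative values, not recommendations.

```elixir
alias Zvex.Collection.Schema

schema =
  Schema.new("events")
  |> Schema.add_field("id", :string, primary_key: true)
  # 100_000 is an arbitrary example value.
  |> Schema.max_doc_count_per_segment(100_000)

# The struct has a field of the same name, which is where the cap is kept.
IO.inspect(schema.max_doc_count_per_segment, label: "segment cap")
```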

new(name)

@spec new(String.t()) :: t()

Creates a new empty schema with the given collection name.

validate(schema)

@spec validate(t()) :: :ok | {:error, Zvex.Error.t()}

Validates the schema, checking:

  • Name is a non-empty string
  • At least one field is defined
  • Exactly one :string primary key field
  • All field names are unique
  • All data types are recognized
  • Vector fields have dimension > 0
  • Index types are compatible with their field's data type
  • max_doc_count_per_segment (if set) is a positive integer
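
The checks above can be exercised before creating a collection. This is a sketch under the assumption that zvex is already a dependency of the current Mix project; the names are illustrative, and the shape of Zvex.Error is not described on this page, so the error is only inspected rather than pattern matched.

```elixir
alias Zvex.Collection.Schema

schema =
  Schema.new("articles")
  |> Schema.add_field("id", :string, primary_key: true)
  |> Schema.add_field("embedding", :vector_fp32,
    dimension: 768,
    index: [type: :hnsw, metric: :cosine, m: 16, ef_construction: 200]
  )

case Schema.validate(schema) do
  :ok ->
    IO.puts("schema is valid")

  {:error, error} ->
    # error is a Zvex.Error.t(); print it as-is since its fields
    # are not documented here.
    IO.puts("invalid schema: #{inspect(error)}")
end
```

Going by the list of checks, a schema that has no fields yet, such as the direct result of new/1, would be expected to take the {:error, error} branch.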