Vettore (Vettore v0.1.11)

The Vettore library is designed for fast, in-memory operations on vector (embedding) data.

All vectors (embeddings) are stored in a Rust data structure (a HashMap), accessed via a shared resource (using Rustler’s ResourceArc with a Mutex). Core operations include:

  • Creating a collection: A named set of embeddings with a fixed dimension and a chosen similarity metric ("hnsw", "binary", "euclidean", "cosine", or "dot").

  • Inserting an embedding: Add a new embedding (with ID, vector, and optional metadata) to a specific collection.

  • Retrieving embeddings: Fetch all embeddings from a collection or look up a single embedding by its unique ID.

  • Similarity search: Given a query vector, calculate a “score” for every embedding in the collection and return the top‑k results (e.g. the smallest distances or largest similarities).

Usage Example

db = Vettore.new_db()
{:ok, "my_collection"} = Vettore.create_collection(db, "my_collection", 3, "euclidean")

# Insert an embedding via struct:
embedding = %Vettore.Embedding{id: "my_id", vector: [1.0, 2.0, 3.0], metadata: %{"note" => "hello"}}
{:ok, "my_id"} = Vettore.insert_embedding(db, "my_collection", embedding)

# Retrieve it back:
{:ok, returned_emb} = Vettore.get_embedding_by_id(db, "my_collection", "my_id")
IO.inspect(returned_emb.vector, label: "Retrieved vector")

# Perform a similarity search:
{:ok, top_results} = Vettore.similarity_search(db, "my_collection", [1.5, 1.5, 1.5], limit: 2)
IO.inspect(top_results, label: "Top K search results")

Summary

Functions

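create_collection(db, collection_name, dimension, distance, opts \\ [])
Creates a new collection in the database with the given name, dimension, and distance metric.

delete_collection(db, name)
Deletes a collection by its name.

delete_embedding_by_id(db, collection_name, id)
Deletes a single embedding by its id.

get_embedding_by_id(db, collection_name, id)
Looks up a single embedding by its id, within the given collection.

get_embeddings(db, collection)
Returns all embeddings from a given collection.

insert_embedding(db, collection_name, embedding)
Inserts a single embedding (as a %Vettore.Embedding{} struct) into a specified collection.

insert_embeddings(db, collection_name, embeddings)
Batch-inserts a list of embeddings (each a %Vettore.Embedding{}) into the specified collection.

mmr_rerank(db, collection_name, initial_results, opts \\ [])
Re-ranks a list of {id, score} results using Maximal Marginal Relevance (MMR).

new_db()
Returns a new database resource (wrapped in a Rustler ResourceArc).

similarity_search(db, collection_name, query, opts \\ [])
Similarity search with optional limit (defaults to 10) and optional filter map.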

Functions

create_collection(db, collection_name, dimension, distance, opts \\ [])

@spec create_collection(
  db :: any(),
  collection_name :: String.t(),
  dimension :: integer(),
  distance :: String.t(),
  opts :: keyword()
) :: {:ok, String.t()} | {:error, String.t()}

Creates a new collection in the database with the given name, dimension, and distance metric.

  • db is the database resource (created with new_db/0).
  • collection_name is the name of the collection.
  • dimension is the number of dimensions in the vector.
  • distance can be one of: "euclidean", "cosine", "dot", "hnsw", or "binary".

Returns {:ok, name} on success, or {:error, reason} if the collection already exists or if the distance is invalid.

Examples

{:ok, "my_collection"} = Vettore.create_collection(db, "my_collection", 3, "euclidean")

delete_collection(db, name)

@spec delete_collection(db :: any(), collection_name :: String.t()) ::
  {:ok, String.t()} | {:error, String.t()}

Deletes a collection by its name.

  • db is the database resource (created with new_db/0).
  • name is the name of the collection.

Returns {:ok, name} if the collection was found and deleted, or {:error, reason} otherwise.

Examples

{:ok, "my_collection"} = Vettore.delete_collection(db, "my_collection")

delete_embedding_by_id(db, collection_name, id)

@spec delete_embedding_by_id(
  db :: any(),
  collection_name :: String.t(),
  id :: String.t()
) ::
  {:ok, String.t()} | {:error, String.t()}

Deletes a single embedding by its id.

  • db is the database resource (created with new_db/0).
  • collection_name is the name of the collection.
  • id is the ID of the embedding.

Returns {:ok, id} if the embedding was found and deleted, or {:error, reason} otherwise.

Examples

Vettore.delete_embedding_by_id(db, "my_collection", "my_id")
# => {:ok, "my_id"}

get_embedding_by_id(db, collection_name, id)

@spec get_embedding_by_id(
  db :: any(),
  collection_name :: String.t(),
  id :: String.t()
) ::
  {:ok, Vettore.Embedding.t()} | {:error, String.t()}

Looks up a single embedding by its id, within the given collection.

If found, returns {:ok, %Vettore.Embedding{}}. If not found, returns {:error, reason}.

Examples

Vettore.get_embedding_by_id(db, "my_collection", "my_id")
# => {:ok, %Vettore.Embedding{id: "my_id", vector: [1.0, 2.0, 3.0], metadata: %{foo: "bar"}}}

get_embeddings(db, collection)

@spec get_embeddings(db :: any(), collection_name :: String.t()) ::
  {:ok, [{String.t(), [float()], map() | nil}]} | {:error, String.t()}

Returns all embeddings from a given collection.

Each embedding is returned as an {id, vector, metadata} tuple in a list. To convert each item into a %Vettore.Embedding{}, map over the list manually or write a small helper function.
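A minimal sketch of such a conversion helper, assuming the {id, vector, metadata} tuple shape from the spec above. EmbeddingRow and EmbeddingTuples are hypothetical names, not part of Vettore; EmbeddingRow is a stand-in struct with the same three fields so the snippet runs on its own. In an application you would likely pass Vettore.Embedding as the second argument instead.

```elixir
# Hypothetical stand-in with the same fields as %Vettore.Embedding{}.
defmodule EmbeddingRow do
  defstruct [:id, :vector, :metadata]
end

# Hypothetical helper module (not part of Vettore).
defmodule EmbeddingTuples do
  # Converts [{id, vector, metadata}] into a list of structs.
  # `struct_module` is any module whose struct has :id, :vector and :metadata fields.
  def to_structs(tuples, struct_module \\ EmbeddingRow) do
    Enum.map(tuples, fn {id, vector, metadata} ->
      struct(struct_module, id: id, vector: vector, metadata: metadata)
    end)
  end
end

rows =
  EmbeddingTuples.to_structs([
    {"emb1", [1.0, 2.0, 3.0], %{"info" => "test"}},
    {"emb2", [3.14, 2.71, 1.62], nil}
  ])

IO.inspect(rows, label: "Converted embeddings")
```

Passing the struct module as an argument keeps the helper independent of any particular struct definition.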

Examples

Vettore.get_embeddings(db, "my_collection")
# => {:ok, [
#   {"emb1", [1.0, 2.0, 3.0], %{"info" => "test"}},
#   {"emb2", [3.14, 2.71, 1.62], nil},
# ]}

insert_embedding(db, collection_name, embedding)

@spec insert_embedding(
  db :: any(),
  collection_name :: String.t(),
  embedding :: Vettore.Embedding.t()
) ::
  {:ok, String.t()} | {:error, String.t()}

Inserts a single embedding (as a %Vettore.Embedding{} struct) into a specified collection.

If the collection doesn't exist, you'll get {:error, "Collection '...' not found"}. If another embedding with the same :id is already in the collection, you’ll get an error. Also, if the :vector length does not match the collection's configured dimension, you’ll get a dimension mismatch error.

Examples

embedding = %Vettore.Embedding{id: "my_id", vector: [1.0, 2.0, 3.0], metadata: %{foo: "bar"}}
{:ok, "my_id"} = Vettore.insert_embedding(db, "my_collection", embedding)

insert_embeddings(db, collection_name, embeddings)

@spec insert_embeddings(
  db :: any(),
  collection_name :: String.t(),
  embeddings :: [Vettore.Embedding.t()]
) :: {:ok, [String.t()]} | {:error, String.t()}

Batch-inserts a list of embeddings (each a %Vettore.Embedding{}) into the specified collection.

It returns {:ok, ["id1", "id2", ...]} if all embeddings inserted successfully. If any embedding fails (e.g., dimension mismatch, ID conflict, or missing collection), the function returns {:error, reason} and stops immediately (does not insert the rest).
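Because a failing embedding stops the batch partway, it can help to check a batch on the caller's side before sending it. The following is a rough sketch of such a pre-check, not something Vettore provides: BatchCheck and validate/2 are hypothetical names, and the check only covers dimension mismatches and duplicate IDs within the batch itself (it cannot see IDs already stored in the collection).

```elixir
# Hypothetical client-side pre-check (not part of Vettore).
defmodule BatchCheck do
  # Returns :ok, or {:error, reason} for the first offending embedding.
  # Works on any map or struct that has :id and :vector keys.
  def validate(embeddings, dimension) do
    result =
      Enum.reduce_while(embeddings, MapSet.new(), fn %{id: id, vector: vector}, seen ->
        cond do
          MapSet.member?(seen, id) ->
            {:halt, {:error, "duplicate id '#{id}' in batch"}}

          length(vector) != dimension ->
            {:halt, {:error, "dimension mismatch for '#{id}'"}}

          true ->
            # Remember the ID and move on to the next embedding.
            {:cont, MapSet.put(seen, id)}
        end
      end)

    case result do
      {:error, _reason} = error -> error
      _seen_ids -> :ok
    end
  end
end

batch = [
  %{id: "e1", vector: [1.0, 2.0, 3.0]},
  %{id: "e2", vector: [4.5, 6.7, 8.9]}
]

IO.inspect(BatchCheck.validate(batch, 3), label: "Batch check")
```

The error strings here are illustrative and will not match the messages Vettore itself returns.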

Examples

embs = [
  %Vettore.Embedding{id: "e1", vector: [1.0, 2.0, 3.0], metadata: nil},
  %Vettore.Embedding{id: "e2", vector: [4.5, 6.7, 8.9], metadata: %{"info" => "test"}}
]
{:ok, ["e1", "e2"]} = Vettore.insert_embeddings(db, "my_collection", embs)

mmr_rerank(db, collection_name, initial_results, opts \\ [])

@spec mmr_rerank(
  db :: any(),
  collection_name :: String.t(),
  initial_results :: [{String.t(), number()}],
  opts :: keyword()
) :: {:ok, [{String.t(), number()}]} | {:error, String.t()}

Re-rank a list of {id, score} results using Maximal Marginal Relevance (MMR).

Given a database resource db, a collection name, and an initial_results list of {id, score} tuples (usually obtained from similarity_search/4), this function applies an MMR formula to select up to :limit items that maximize both relevance (the score) and diversity among the selected items.

The alpha parameter (0.0 to 1.0) balances relevance vs. redundancy:

  • alpha close to 1.0 → prioritizes the raw score (similarity to query).
  • alpha close to 0.0 → heavily penalizes items similar to already-selected ones, thus promoting diversity.

We automatically convert Euclidean or Binary distance into a “higher is better” similarity by negating the distance (i.e. similarity = -distance). For Cosine, Dot Product, or HNSW approaches, the score is already in a higher-is-better format.

Returns {:ok, [{id, mmr_score}, ...]} on success or {:error, reason} if the collection is not found.
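To make the role of alpha concrete, here is a sketch of the textbook greedy MMR selection in plain Elixir. It is an illustration under assumptions, not Vettore's internal code: MMRSketch is a hypothetical module, candidates carry their vectors directly as {id, relevance, vector} tuples, relevance is already in higher-is-better form, and redundancy is measured with cosine similarity between non-zero vectors.

```elixir
# Illustrative greedy MMR (hypothetical module, not Vettore's implementation).
defmodule MMRSketch do
  # candidates: [{id, relevance, vector}], higher relevance is better.
  # Returns up to `limit` {id, mmr_score} tuples in selection order.
  def rerank(candidates, alpha, limit), do: pick(candidates, [], alpha, limit)

  # Stop when nothing is left or enough items have been selected.
  defp pick(remaining, picked, _alpha, limit)
       when remaining == [] or length(picked) >= limit do
    picked |> Enum.reverse() |> Enum.map(fn {id, score, _vec} -> {id, score} end)
  end

  # Otherwise select the candidate with the highest MMR score and recurse.
  defp pick(remaining, picked, alpha, limit) do
    {id, _relevance, vector} = best = Enum.max_by(remaining, &mmr_score(&1, picked, alpha))
    score = mmr_score(best, picked, alpha)
    pick(List.delete(remaining, best), [{id, score, vector} | picked], alpha, limit)
  end

  # alpha weights relevance; (1 - alpha) weights the redundancy penalty.
  defp mmr_score({_id, relevance, vector}, picked, alpha) do
    alpha * relevance - (1.0 - alpha) * redundancy(vector, picked)
  end

  # Redundancy is the highest similarity to any already-selected item.
  defp redundancy(_vector, []), do: 0.0

  defp redundancy(vector, picked) do
    picked |> Enum.map(fn {_id, _score, chosen} -> cosine(vector, chosen) end) |> Enum.max()
  end

  defp cosine(a, b), do: dot(a, b) / (:math.sqrt(dot(a, a)) * :math.sqrt(dot(b, b)))

  defp dot(a, b), do: a |> Enum.zip(b) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()
end

candidates = [
  {"a", 0.9, [1.0, 0.0]},
  {"b", 0.85, [1.0, 0.0]},
  {"c", 0.5, [0.0, 1.0]}
]

IO.inspect(MMRSketch.rerank(candidates, 0.5, 2), label: "alpha = 0.5")
IO.inspect(MMRSketch.rerank(candidates, 1.0, 2), label: "alpha = 1.0")
```

With alpha at 0.5 the near-duplicate "b" is passed over in favor of the less relevant but dissimilar "c"; with alpha at 1.0 the penalty vanishes and the order follows the raw scores.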

Examples

After calling similarity_search/4:

{:ok, initial_results} = Vettore.similarity_search(db, "my_collection", query_vec, limit: 50)

You can re-rank:

{:ok, mmr_list} =
  Vettore.mmr_rerank(db, "my_collection", initial_results,
    limit: 10,
    alpha: 0.7
  )

mmr_list then gives a smaller set (up to 10 items) in MMR order, each with a new score (mmr_score) that reflects their final MMR weighting.

new_db()

@spec new_db() :: any()

Returns a new database resource (wrapped in a Rustler ResourceArc).

The database resource is a handle for the underlying Rust data structure.

Examples

db = Vettore.new_db()
is_reference(db)
# => true

similarity_search(db, collection_name, query, opts \\ [])

@spec similarity_search(
  db :: any(),
  collection_name :: String.t(),
  query :: [float()],
  opts :: keyword()
) :: {:ok, [{String.t(), float()}]} | {:error, String.t()}

Similarity search with optional limit (defaults to 10) and optional filter map.

Performs a similarity (or distance) search in the given collection using the provided query vector, returning the top-k results as a list of {embedding_id, score} tuples.

  • For "euclidean", lower scores are better (distance).
  • For "cosine", higher scores are better (cosine similarity).
  • For "dot", higher scores are also better (dot product).
  • For "binary", the score is the Hamming distance, so lower is more similar.
  • For "hnsw", an approximate nearest-neighbor search is used.
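The score directions above can be illustrated with a brute-force sketch in plain Elixir. This is only an approximation of the idea under stated assumptions: ScoreSketch and its functions are hypothetical names, embeddings are passed in as {id, vector} tuples, and only the "euclidean", "cosine", and "dot" metrics are covered (the "binary" and "hnsw" paths are left out because their details are not described here).

```elixir
# Illustrative brute-force scoring (hypothetical module, not Vettore's implementation).
defmodule ScoreSketch do
  # Euclidean distance: lower is better.
  def euclidean(a, b) do
    a
    |> Enum.zip(b)
    |> Enum.map(fn {x, y} -> (x - y) * (x - y) end)
    |> Enum.sum()
    |> :math.sqrt()
  end

  # Dot product: higher is better.
  def dot(a, b), do: a |> Enum.zip(b) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()

  # Cosine similarity of non-zero vectors: higher is better.
  def cosine(a, b), do: dot(a, b) / (:math.sqrt(dot(a, a)) * :math.sqrt(dot(b, b)))

  # Scores every {id, vector} against the query and keeps the k best,
  # sorting ascending for distances and descending for similarities.
  def top_k(embeddings, query, metric, k) do
    {score_fun, order} =
      case metric do
        "euclidean" -> {&euclidean/2, :asc}
        "cosine" -> {&cosine/2, :desc}
        "dot" -> {&dot/2, :desc}
      end

    embeddings
    |> Enum.map(fn {id, vector} -> {id, score_fun.(query, vector)} end)
    |> Enum.sort_by(fn {_id, score} -> score end, order)
    |> Enum.take(k)
  end
end

embeddings = [{"emb1", [1.0, 2.0, 3.0]}, {"emb2", [4.0, 6.0, 3.0]}]
query = [1.0, 2.0, 3.0]

IO.inspect(ScoreSketch.top_k(embeddings, query, "euclidean", 2), label: "euclidean")
IO.inspect(ScoreSketch.top_k(embeddings, query, "dot", 2), label: "dot")
```

For this data the Euclidean ranking puts "emb1" first (it equals the query), while the dot-product ranking puts "emb2" first because of its larger magnitude, which is why the metric chosen at collection creation matters when reading scores.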

Examples:

Vettore.similarity_search(db, "my_collection", [1.0, 2.0, 3.0], limit: 2, filter: %{"category" => "test"})
Vettore.similarity_search(db, "my_collection", [1.0, 2.0, 3.0], limit: 2)
Vettore.similarity_search(db, "my_collection", [1.0, 2.0, 3.0])

# => {:ok, [{"emb1", 0.0}, {"emb2", 1.23}]}