Ragex.VectorStore (Ragex v0.8.0)

Vector similarity search for code embeddings.

Provides efficient cosine similarity search over code entity embeddings stored in the graph store. Supports filtering by entity type, similarity thresholds, and result limits.

Summary

Functions

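child_spec(init_arg)
Returns a specification to start this module under a supervisor.

cosine_similarity(vec1, vec2)
Calculates cosine similarity between two embedding vectors.

nearest_neighbors(query_embedding, k, opts \\ [])
Finds the k nearest neighbors to a query embedding.

search(query_embedding, opts \\ [])
Searches for similar code entities based on a query embedding.

stats()
Returns statistics about the vector store.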

Functions

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.
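
As an illustration of how a child specification is typically used, here is a minimal sketch. It assumes the module behaves like a standard GenServer, which this page does not confirm; DemoStore is a hypothetical stand-in, not Ragex.VectorStore itself.

```elixir
defmodule DemoStore do
  # `use GenServer` defines a default child_spec/1 that points at start_link/1.
  use GenServer

  # Hypothetical stand-in for a store process; registers under the module name.
  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(opts), do: {:ok, opts}
end

# Listing the module as a child makes the supervisor call DemoStore.child_spec([]).
{:ok, sup} = Supervisor.start_link([DemoStore], strategy: :one_for_one)

IO.inspect(DemoStore.child_spec([]))
IO.inspect(Supervisor.which_children(sup))
```

In an application, the module would normally appear in the children list of the application's own supervision tree instead of a supervisor started by hand.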

cosine_similarity(vec1, vec2)

Calculates cosine similarity between two embedding vectors.

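Returns a float between -1.0 and 1.0, where 1.0 means the vectors point in the same direction, 0.0 means they are orthogonal, and -1.0 means they point in opposite directions. For unit-normalized embeddings (like ours), this is equivalent to the dot product.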
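
The relationship between cosine similarity and the dot product can be sketched as follows. This is an illustrative implementation over plain lists of floats, not the code Ragex uses; the module name CosineSketch and the choice to return 0.0 for a zero vector are assumptions made for this example.

```elixir
defmodule CosineSketch do
  # Dot product of two equal-length lists of numbers.
  def dot(a, b) when length(a) == length(b) do
    a
    |> Enum.zip(b)
    |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
  end

  # Euclidean (L2) norm of a vector.
  def norm(v), do: :math.sqrt(dot(v, v))

  # Cosine similarity: dot product divided by the product of the norms.
  # Returns 0.0 when either vector is all zeros (a choice made for this sketch).
  def cosine_similarity(a, b) do
    denom = norm(a) * norm(b)
    if denom == 0.0, do: 0.0, else: dot(a, b) / denom
  end

  # Scales a non-zero vector to unit length.
  def normalize(v) do
    n = norm(v)
    Enum.map(v, &(&1 / n))
  end
end

a = CosineSketch.normalize([3.0, 4.0])
b = CosineSketch.normalize([4.0, 3.0])

# For unit vectors the two values agree up to floating-point rounding.
IO.inspect(CosineSketch.cosine_similarity(a, b))
IO.inspect(CosineSketch.dot(a, b))
```

Because both norms equal 1.0 after normalization, the division becomes a no-op, which is why a store holding normalized embeddings can skip it.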

nearest_neighbors(query_embedding, k, opts \\ [])

Finds the k nearest neighbors to a query embedding.

Similar to search/2, but returns exactly k results whenever at least k embeddings exist, and fewer otherwise.
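
A brute-force top-k selection can be sketched as below. The in-memory list of {id, embedding} tuples and the KnnSketch module are hypothetical; how Ragex reads embeddings from the graph store is not described on this page.

```elixir
defmodule KnnSketch do
  # Cosine similarity for non-zero vectors of equal length.
  def cosine(a, b) do
    dot = fn x, y ->
      x |> Enum.zip(y) |> Enum.reduce(0.0, fn {p, q}, acc -> acc + p * q end)
    end

    dot.(a, b) / (:math.sqrt(dot.(a, a)) * :math.sqrt(dot.(b, b)))
  end

  # Scores every entry, sorts by descending score, and keeps at most k.
  # Enum.take/2 returns fewer than k items when the list is shorter.
  def nearest(entries, query, k) do
    entries
    |> Enum.map(fn {id, emb} -> {id, cosine(query, emb)} end)
    |> Enum.sort_by(fn {_id, score} -> score end, :desc)
    |> Enum.take(k)
  end
end

# Hypothetical entries: {id, embedding}.
entries = [
  {"sum/1", [1.0, 0.0]},
  {"avg/1", [0.8, 0.6]},
  {"log/1", [0.0, 1.0]}
]

IO.inspect(KnnSketch.nearest(entries, [1.0, 0.1], 2))
IO.inspect(length(KnnSketch.nearest(entries, [1.0, 0.1], 10)))
```

Sorting the full list costs O(n log n) per query, which is acceptable for small stores; larger collections usually call for an approximate index.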

search(query_embedding, opts \\ [])

Searches for similar code entities based on a query embedding.

Parameters

  • query_embedding: List of floats representing the query vector
  • opts: Keyword list of options:
    • :limit - Maximum results to return (default: 10)
    • :threshold - Minimum similarity score 0.0-1.0 (default: 0.0)
    • :node_type - Filter by node type (:module, :function, etc.)

Returns

List of results sorted by similarity (highest first), each containing:

  • :node_type - Type of the entity
  • :node_id - ID of the entity
  • :score - Similarity score (0.0 to 1.0)
  • :text - Original text description
  • :embedding - The embedding vector
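
How the options combine can be sketched over an in-memory list. This is an assumption-laden illustration, not the Ragex implementation: SearchSketch, the sample store, and the shape of :node_id are hypothetical, and embeddings are assumed to be unit-normalized so that the dot product serves as the similarity score.

```elixir
defmodule SearchSketch do
  # Filters by node type, scores, applies the threshold, sorts, then limits.
  def search(store, query, opts \\ []) do
    limit = Keyword.get(opts, :limit, 10)
    threshold = Keyword.get(opts, :threshold, 0.0)
    node_type = Keyword.get(opts, :node_type)

    store
    |> Enum.filter(fn e -> is_nil(node_type) or e.node_type == node_type end)
    |> Enum.map(fn e -> Map.put(e, :score, dot(query, e.embedding)) end)
    |> Enum.filter(fn e -> e.score >= threshold end)
    |> Enum.sort_by(& &1.score, :desc)
    |> Enum.take(limit)
  end

  # Dot product; equals cosine similarity for unit-length vectors.
  defp dot(a, b) do
    a |> Enum.zip(b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
  end
end

# Hypothetical store entries shaped like the result maps, minus :score.
store = [
  %{node_type: :function, node_id: {Math, :sum, 1}, text: "sums a list", embedding: [1.0, 0.0]},
  %{node_type: :function, node_id: {Math, :avg, 1}, text: "averages a list", embedding: [0.6, 0.8]},
  %{node_type: :module, node_id: Math, text: "math helpers", embedding: [0.8, 0.6]}
]

results = SearchSketch.search(store, [1.0, 0.0], threshold: 0.7, node_type: :function)
IO.inspect(Enum.map(results, &{&1.node_id, &1.score}))
```

Filtering by node type before scoring avoids computing similarities for entities that would be discarded anyway.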

Example

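alias Ragex.VectorStore
# Assumes Bumblebee is aliased to the project's embedding module and returns {:ok, embedding}.
{:ok, query_emb} = Bumblebee.embed("function to calculate sum")
results = VectorStore.search(query_emb, limit: 5, threshold: 0.7)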

start_link(opts \\ [])

stats()

Returns statistics about the vector store.