ExMemvid
ExMemvid is a proof-of-concept library for storing and retrieving large amounts of text data by encoding it into a video file composed of QR code frames. It leverages modern Elixir libraries for machine learning, video processing, and vector search to provide a unique solution for data storage and semantic retrieval.
How it Works
The core idea is to treat video frames as a data storage medium. Each frame in the video contains a QR code that holds a chunk of text. A separate search index is created using text embeddings to allow for fast, semantic searching of the content stored in the video.
Encoding Process
- Text Chunking: The input text is divided into smaller, manageable chunks.
- Embedding: A sentence transformer model from Hugging Face (via Bumblebee) generates a vector embedding for each text chunk.
- QR Code Generation: Each text chunk is serialized (optionally with Gzip compression) and encoded into a QR code image.
- Video Encoding: The QR code images are compiled into a video file, where each image becomes a single frame. The library uses Xav and Evision (OpenCV bindings) for this.
- Index Creation: The vector embeddings are stored in an HNSWLib (Hierarchical Navigable Small World) index for efficient similarity search. This index maps the embeddings to their corresponding frame numbers in the video.
- Saving: The final video file and the search index are saved to disk.
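The chunking and serialization steps above can be sketched with the standard library alone. This is a minimal illustration, not ExMemvid's implementation: the module ChunkSketch, its function names, the fixed-size grapheme chunking, and the "GZ:" prefix that marks compressed payloads are all assumptions made for the example. QR rendering and video encoding are left out.

```elixir
defmodule ChunkSketch do
  # Hypothetical helper module; none of these names come from ExMemvid.

  # Split text into chunks of at most `size` graphemes (no overlap).
  def chunk(text, size) when is_binary(text) and size > 0 do
    text
    |> String.graphemes()
    |> Enum.chunk_every(size)
    |> Enum.map(&Enum.join/1)
  end

  # Turn a chunk into the string that would be placed in a QR code.
  # With compression on, the chunk is gzipped, Base64-encoded, and tagged
  # with an assumed "GZ:" prefix so the decoder knows to reverse the steps.
  def serialize(chunk, compress?) do
    if compress? do
      "GZ:" <> Base.encode64(:zlib.gzip(chunk))
    else
      chunk
    end
  end

  # Reverse of serialize/2. Caveat of this sketch: a plain chunk that
  # happens to start with "GZ:" would be misread as compressed.
  def deserialize("GZ:" <> encoded) do
    encoded |> Base.decode64!() |> :zlib.gunzip()
  end

  def deserialize(plain), do: plain
end

payloads =
  "OTP provides battle-tested abstractions."
  |> ChunkSketch.chunk(16)
  |> Enum.map(&ChunkSketch.serialize(&1, true))

# Decoding every payload and joining the pieces restores the input text.
restored = Enum.map_join(payloads, "", &ChunkSketch.deserialize/1)
IO.puts(restored)
```

For short chunks, gzip plus Base64 usually produces a longer payload than the raw text, which is one reason compression would be optional.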
Retrieval Process
- Search Query: The user provides a text query.
- Query Embedding: The query is converted into a vector embedding using the same model as the encoding process.
- Semantic Search: The HNSWLib index is queried to find the text chunks with embeddings most similar to the query's embedding.
- Frame Identification: The search results from the index provide the frame numbers where the relevant text chunks are stored.
- Frame Decoding: The Retriever seeks to the specific frames in the video file, reads the QR codes, and decodes them to retrieve the original text chunks.
- Result Aggregation: The retrieved text chunks are returned to the user.
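The semantic search and frame identification steps can be illustrated with a toy index. In this sketch a brute-force cosine-similarity scan stands in for HNSWLib, and the two-dimensional vectors are made up rather than produced by an embedding model. The module SearchSketch and its functions are hypothetical names for the example, not part of ExMemvid.

```elixir
defmodule SearchSketch do
  # Hypothetical module: brute-force nearest-neighbour search over a
  # list of {frame_number, embedding} pairs. HNSWLib answers the same
  # question approximately and much faster on large indexes.

  # Cosine similarity of two equal-length, non-zero vectors.
  def cosine(a, b) do
    dot = a |> Enum.zip(b) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()
    dot / (norm(a) * norm(b))
  end

  defp norm(v), do: :math.sqrt(Enum.reduce(v, 0.0, fn x, acc -> acc + x * x end))

  # Return the frame numbers of the `k` embeddings most similar to `query`.
  def top_k_frames(index, query, k) do
    index
    |> Enum.map(fn {frame, embedding} -> {frame, cosine(embedding, query)} end)
    |> Enum.sort_by(fn {_frame, score} -> score end, :desc)
    |> Enum.take(k)
    |> Enum.map(fn {frame, _score} -> frame end)
  end
end

# Toy index: frame number => made-up embedding of the chunk in that frame.
index = [
  {0, [1.0, 0.0]},
  {1, [0.0, 1.0]},
  {2, [0.7, 0.7]}
]

frames = SearchSketch.top_k_frames(index, [1.0, 0.1], 2)
IO.inspect(frames)
```

The returned frame numbers are what the retriever would then seek to in the video and decode.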
Features
- Data Archiving: Store large text corpora in a compressed video format.
- Semantic Search: Go beyond keyword matching with state-of-the-art text embeddings.
- Configurable: Easily configure everything from the video codec and QR code version to the embedding model.
- Concurrent: Utilizes Elixir's concurrency to parallelize embedding and frame decoding tasks.
- Extensible: The Embedding behaviour allows for swapping out the embedding implementation.
- Supervised: Built-in supervisors for managing encoder and retriever processes.
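As a rough picture of how a swappable embedding and parallel embedding fit together, the sketch below defines a stand-in behaviour and maps it over chunks with Task.async_stream. The behaviour EmbedderSketch, its embed/1 callback, and the toy LengthEmbedder are assumptions for illustration only; the callbacks of the real Embedding behaviour may look different, so consult the module documentation before implementing it.

```elixir
defmodule EmbedderSketch do
  # Hypothetical behaviour; not the callback set of ExMemvid's Embedding.
  @callback embed(text :: String.t()) :: [float()]
end

defmodule LengthEmbedder do
  @behaviour EmbedderSketch

  # Toy "embedding": [grapheme count, word count]. Not a real model.
  @impl true
  def embed(text) do
    [String.length(text) * 1.0, length(String.split(text)) * 1.0]
  end
end

defmodule ParallelEmbed do
  # Embed every chunk in its own task. Results come back in input order,
  # which is the default behaviour of Task.async_stream.
  def embed_all(chunks, embedder) do
    chunks
    |> Task.async_stream(fn chunk -> embedder.embed(chunk) end)
    |> Enum.map(fn {:ok, vector} -> vector end)
  end
end

vectors = ParallelEmbed.embed_all(["abc def", "hi"], LengthEmbedder)
IO.inspect(vectors)
```

Because the embedder is passed as a module, any other module implementing the same callback could be substituted without touching ParallelEmbed.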
Installation
Add ex_memvid to your list of dependencies in mix.exs:
def deps do
  [
    {:ex_memvid, "~> 0.1.1"}
  ]
end
You will also need ffmpeg installed on your system for some of the underlying video operations.
Quick Start
Basic Usage
# 1. Configure with Hugging Face embeddings
config = ExMemvid.Config.validate!([])
# 2. Start the embedding supervisor
{:ok, _} = ExMemvid.Embedding.Supervisor.start_link(config)
# 3. Create and populate an encoder
{:ok, encoder} = ExMemvid.Encoder.new(config)
# Add your text data
texts = [
  "The Elixir programming language is designed for building maintainable and scalable applications.",
  "Phoenix LiveView enables rich, real-time user experiences with server-rendered HTML.",
  "OTP provides battle-tested abstractions for building fault-tolerant systems.",
  "Ecto is a database wrapper and query generator for Elixir.",
  "GenServers are the building blocks of stateful processes in Elixir applications."
]
encoder = ExMemvid.Encoder.add_chunks(encoder, texts)
# 4. Build the video and index
video_path = "knowledge_base.mp4"
index_path = "knowledge_base.hnsw"
{:ok, stats} = ExMemvid.Encoder.build_video(encoder, video_path, index_path)
IO.puts("Encoded #{stats.frame_count} frames into video")
# 5. Search the content
{:ok, retriever} = ExMemvid.Retriever.start_link(video_path, index_path, config)
{:ok, results} = ExMemvid.Retriever.search(retriever, "What is LiveView?", top_k: 2)
Enum.each(results, &IO.puts/1)
Using Supervisors
# Start the retriever supervisor
{:ok, _} = ExMemvid.RetrieverSupervisor.start_link([])
# Start multiple retrievers for different video archives
{:ok, docs_retriever} = ExMemvid.RetrieverSupervisor.start_retriever(
  "documentation.mp4",
  "documentation.hnsw",
  config,
  name: :docs_retriever
)
{:ok, blog_retriever} = ExMemvid.RetrieverSupervisor.start_retriever(
  "blog_posts.mp4",
  "blog_posts.hnsw",
  config,
  name: :blog_retriever
)
# Query different knowledge bases
{:ok, docs} = ExMemvid.Retriever.search(:docs_retriever, "How to use GenServers?")
{:ok, blogs} = ExMemvid.Retriever.search(:blog_retriever, "Real-world Elixir stories")
# Check active retrievers
ExMemvid.RetrieverSupervisor.count_retrievers()
#=> 2
# Get info about a specific retriever
{:ok, info} = ExMemvid.RetrieverSupervisor.get_retriever_info(:docs_retriever)
#=> %{video_path: "documentation.mp4", cache_size: 5, ...}
Contributing
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.