Ragex.Embeddings.FileTracker (Ragex v0.9.1)

Tracks file metadata to enable incremental embedding updates.

This module maintains a registry of analyzed files with their content hashes, modification times, and associated entities (modules, functions). It enables smart diff detection to determine which embeddings need regeneration when files change.

Strategy

  1. Content Hashing: SHA256 hash of file content for reliable change detection
  2. Entity Tracking: Map files to their contained entities (modules, functions)
  3. Incremental Updates: Only regenerate embeddings for changed files
  4. Performance: A typical single-file change regenerates fewer than 5% of all embeddings
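
The content-hashing step can be sketched as follows. This is a minimal illustration, not Ragex's implementation: the module `HashSketch` and its functions are hypothetical names, and only `:crypto.hash/2` is a standard library call.

```elixir
defmodule HashSketch do
  # Hypothetical helper, not part of Ragex: SHA-256 digest of a binary.
  @spec content_hash(binary()) :: binary()
  def content_hash(content) when is_binary(content) do
    :crypto.hash(:sha256, content)
  end

  # A file counts as changed when the digest of its current content
  # differs from the digest stored at tracking time.
  @spec changed?(binary(), binary()) :: boolean()
  def changed?(stored_hash, current_content) do
    stored_hash != content_hash(current_content)
  end
end

old = HashSketch.content_hash("defmodule A do\nend\n")
IO.inspect(byte_size(old))                                        # 32
IO.inspect(HashSketch.changed?(old, "defmodule A do\nend\n"))     # false
IO.inspect(HashSketch.changed?(old, "defmodule B do\nend\n"))     # true
```

Comparing digests rather than modification times means a file that was touched or re-saved without edits is not treated as changed.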

Usage

alias Ragex.Embeddings.FileTracker

# Track a file after analysis
FileTracker.track_file("/path/to/file.ex", analysis_result)

# Check if file has changed
FileTracker.has_changed?("/path/to/file.ex")

# Get entities that need regeneration
FileTracker.get_stale_entities()

# Clear tracking for deleted files
FileTracker.untrack_file("/path/to/file.ex")

Summary

Functions

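  • clear_all() - Clears all tracked files.
  • export() - Exports tracking data for persistence.
  • get_stale_entities() - Returns entities from files that have changed.
  • has_changed?(file_path) - Checks if a file has changed since it was last tracked.
  • import(data) - Imports tracking data from persistence.
  • init() - Initializes the file tracker ETS table.
  • list_tracked_files() - Returns a list of all tracked files.
  • stats() - Returns statistics about tracked files.
  • track_file(file_path, analysis_result) - Tracks a file with its metadata and associated entities.
  • untrack_file(file_path) - Removes tracking for a file.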

Types

entity_ref()

@type entity_ref() :: {:module, term()} | {:function, term()}

file_metadata()

@type file_metadata() :: %{
  path: String.t(),
  content_hash: binary(),
  mtime: integer(),
  size: integer(),
  entities: [entity_ref()],
  analyzed_at: integer()
}
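
As an illustration of how a map of this shape might be assembled for a file on disk, here is a minimal sketch. `MetadataSketch.build/2` is a hypothetical helper, not a Ragex function, and the choice of POSIX seconds for `mtime` and `analyzed_at` is an assumption: the type only specifies `integer()`.

```elixir
defmodule MetadataSketch do
  # Hypothetical builder, not Ragex's code: assembles a map with the
  # keys listed in file_metadata() for a file on disk.
  def build(path, entities) do
    with {:ok, content} <- File.read(path),
         {:ok, stat} <- File.stat(path, time: :posix) do
      {:ok,
       %{
         path: path,
         content_hash: :crypto.hash(:sha256, content),
         # Assumption: times are POSIX seconds; the type only says integer().
         mtime: stat.mtime,
         size: stat.size,
         entities: entities,
         analyzed_at: System.system_time(:second)
       }}
    end
  end
end

# Usage with a temporary file.
path = Path.join(System.tmp_dir!(), "metadata_sketch_example.ex")
File.write!(path, "defmodule Example do\nend\n")
{:ok, meta} = MetadataSketch.build(path, [{:module, Example}])
IO.inspect(Map.keys(meta))
File.rm!(path)
```

If the file cannot be read, the `with` expression returns the `{:error, reason}` tuple from `File.read/1` or `File.stat/2` unchanged.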

Functions

clear_all()

Clears all tracked files.

Used when performing a full refresh or clearing the cache.

export()

Exports tracking data for persistence.

Returns a map that can be serialized and stored alongside embeddings.

get_stale_entities()

Returns entities from files that have changed.

Used to determine which embeddings need to be regenerated. Returns a list of {entity_type, entity_id} tuples, the same shape as entity_ref().
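
The idea of collecting entities from changed files can be sketched with plain maps. Everything here is hypothetical: `StaleSketch`, the `tracked` and `current_hashes` arguments, and the `{module, name, arity}` shape used for function ids are assumptions for illustration, not Ragex's storage or API.

```elixir
defmodule StaleSketch do
  # Hypothetical: `tracked` maps path => metadata, `current_hashes` maps
  # path => hash of the content currently on disk.
  # Assumption: a tracked path missing from `current_hashes` is treated
  # as changed, so its entities are reported as stale too.
  def stale_entities(tracked, current_hashes) do
    tracked
    |> Enum.filter(fn {path, meta} ->
      Map.get(current_hashes, path) != meta.content_hash
    end)
    |> Enum.flat_map(fn {_path, meta} -> meta.entities end)
    |> Enum.uniq()
  end
end

tracked = %{
  "a.ex" => %{content_hash: "h1", entities: [{:module, A}, {:function, {A, :run, 0}}]},
  "b.ex" => %{content_hash: "h2", entities: [{:module, B}]}
}

# Only b.ex differs, so only its entities come back.
IO.inspect(StaleSketch.stale_entities(tracked, %{"a.ex" => "h1", "b.ex" => "edited"}))
```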

has_changed?(file_path)

Checks if a file has changed since it was last tracked.

Returns {:changed, old_metadata} if the file has changed, {:unchanged, metadata} if it hasn't, or {:new, nil} if the file was never tracked.
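
The three-way result lends itself to a `case` expression. The sketch below reproduces the documented return shapes with a hypothetical stand-in, `ChangeSketch.has_changed?/3`, which takes a plain map and the current content as arguments; the real function takes only a file path and may decide differently.

```elixir
defmodule ChangeSketch do
  # Hypothetical stand-in: `tracked` is a plain map of path => metadata,
  # used here instead of the module's real storage.
  def has_changed?(tracked, path, current_content) do
    current_hash = :crypto.hash(:sha256, current_content)

    case Map.fetch(tracked, path) do
      :error -> {:new, nil}
      {:ok, %{content_hash: ^current_hash} = meta} -> {:unchanged, meta}
      {:ok, old_meta} -> {:changed, old_meta}
    end
  end
end

tracked = %{
  "a.ex" => %{content_hash: :crypto.hash(:sha256, "v1"), entities: [{:module, A}]}
}

for {path, content} <- [{"a.ex", "v1"}, {"a.ex", "v2"}, {"b.ex", "v1"}] do
  case ChangeSketch.has_changed?(tracked, path, content) do
    {:unchanged, _meta} -> IO.puts("#{path}: up to date")
    {:changed, old} -> IO.puts("#{path}: regenerate #{length(old.entities)} entities")
    {:new, nil} -> IO.puts("#{path}: analyze for the first time")
  end
end
```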

import(data)

Imports tracking data from persistence.

Restores file tracking state from data previously returned by export().
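
One way to persist an exported map and read it back is Erlang's external term format. This is a sketch under assumptions: `PersistSketch` is a hypothetical module, and the actual shape of the data handled by export() and import(data) is not specified here, so an arbitrary map stands in for it.

```elixir
defmodule PersistSketch do
  # Hypothetical helpers around an exported map; not part of Ragex.
  def save(data, path), do: File.write(path, :erlang.term_to_binary(data))

  def load(path) do
    with {:ok, bin} <- File.read(path) do
      # :safe makes decoding fail instead of creating new atoms.
      {:ok, :erlang.binary_to_term(bin, [:safe])}
    end
  end
end

path = Path.join(System.tmp_dir!(), "persist_sketch_example.bin")
data = %{"a.ex" => %{content_hash: <<1, 2, 3>>, entities: [{:module, :a}]}}

:ok = PersistSketch.save(data, path)
{:ok, restored} = PersistSketch.load(path)
IO.inspect(restored == data)
File.rm!(path)
```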

init()

Initializes the file tracker ETS table.

Called automatically by the application supervisor.
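
For readers unfamiliar with ETS, the following sketch shows what an ETS-backed registry of this kind could look like. The table name, options, and function names are all assumptions chosen for illustration; the source states only that an ETS table is used.

```elixir
defmodule TableSketch do
  # Hypothetical table name, not the one Ragex uses.
  @table :file_tracker_sketch

  # Create the table only if it is absent, so repeated calls are harmless.
  def init do
    case :ets.whereis(@table) do
      :undefined ->
        :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
        :ok

      _ref ->
        :ok
    end
  end

  def put(path, meta), do: :ets.insert(@table, {path, meta})

  def get(path) do
    case :ets.lookup(@table, path) do
      [{^path, meta}] -> {:ok, meta}
      [] -> :error
    end
  end

  def delete(path), do: :ets.delete(@table, path)
  def clear, do: :ets.delete_all_objects(@table)
end

:ok = TableSketch.init()
TableSketch.put("/path/to/file.ex", %{size: 42})
IO.inspect(TableSketch.get("/path/to/file.ex"))
```

An ETS table is destroyed when its owning process exits, which is one reason to create it from a long-lived process such as one started under the supervision tree.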

list_tracked_files()

Returns a list of all tracked files.

stats()

Returns statistics about tracked files.

track_file(file_path, analysis_result)

Tracks a file with its metadata and associated entities.

Parameters

  • file_path - Absolute path to the file
  • analysis_result - Analysis result containing modules and functions

Returns

  • :ok on success
  • {:error, reason} on failure
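
To make the link between an analysis result and the tracked entity_ref() values concrete, here is a hedged sketch. The input shape (`:modules` and `:functions` lists with `:name`, `:module`, and `:arity` keys), the `{module, name, arity}` function id, and the `EntitySketch` module itself are assumptions; the docs say only that the analysis result contains modules and functions.

```elixir
defmodule EntitySketch do
  # Assumed input shape (not confirmed by the docs):
  #   %{modules: [%{name: atom}],
  #     functions: [%{module: atom, name: atom, arity: integer}]}
  def entity_refs(%{modules: modules, functions: functions}) do
    module_refs = Enum.map(modules, fn m -> {:module, m.name} end)

    function_refs =
      Enum.map(functions, fn f -> {:function, {f.module, f.name, f.arity}} end)

    {:ok, module_refs ++ function_refs}
  end

  # Anything without both keys is rejected rather than guessed at.
  def entity_refs(_other), do: {:error, :invalid_analysis_result}
end

analysis = %{
  modules: [%{name: MyApp.Worker}],
  functions: [%{module: MyApp.Worker, name: :run, arity: 1}]
}

IO.inspect(EntitySketch.entity_refs(analysis))
```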

untrack_file(file_path)

Removes tracking for a file.

Used when files are deleted or need to be re-analyzed from scratch.