# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.5.2] - 2025-12-28

### Added
- `CrucibleFramework.repo/0` and `repo!/0` accessors

### Changed
- **Oban-style repo injection**: Configure `repo: MyApp.Repo` instead of relying on an auto-started internal Repo
- **`start_repo` replaces `enable_repo`**: Defaults to `false`; set it to `true` for the legacy behavior
- Bumped `crucible_trace` to `~> 0.3.1` and `telemetry` to `~> 1.3`
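
A minimal configuration sketch for the repo injection in this release. The `repo` and `start_repo` keys come from the entries above; placing them under the `:crucible_framework` application and the `MyApp.Repo` module name are assumptions for illustration.

```elixir
# config/config.exs (sketch; MyApp.Repo stands in for your own Ecto repo)
import Config

config :crucible_framework,
  # Assumed location of the key: the framework uses this repo instead of starting its own
  repo: MyApp.Repo,
  # Defaults to false; true restores the legacy auto-started internal Repo
  start_repo: false
```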

## [0.5.1] - 2025-12-27

### Added
- Refreshed examples to match the current pipeline and IR
- Added `examples/run_all.sh` to run all examples at once
- New `guides/` directory with HexDocs-friendly documentation:
  - `guides/getting_started.md` - Installation and quick start
  - `guides/stages.md` - Creating custom stages with schema specification
  - `guides/configuration.md` - Registry, adapters, and optional dependencies

### Changed
- Made `crucible_bench` and `crucible_trace` optional dependencies to keep the core slim
- Guarded optional dependencies: bench stage fails fast when `crucible_bench` is missing; tracing disables with a warning when `crucible_trace` is missing
- Normalized stage options to an empty map when omitted to prevent nil option crashes
- Report rendering now sanitizes metrics/outputs for JSON encoding
- Bumped `crucible_bench` to `~> 0.4.0`
- Raised `postgrex` minimum version to `>= 0.21.1`
- Made persistence integration tests opt-in via `CRUCIBLE_DB_ENABLED=true` in test config
- Updated `mix.exs` doc configuration to use `guides/` directory structure
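
The optional-dependency guards and option normalization listed above could look roughly like the sketch below. `OptionalDepsSketch` and its function names are hypothetical, and the real stages may detect missing packages differently; only `Code.ensure_loaded?/1` and `Logger.warning/1` are standard Elixir.

```elixir
defmodule OptionalDepsSketch do
  @moduledoc false
  require Logger

  # True when the optional dependency's module can be loaded.
  def available?(module), do: Code.ensure_loaded?(module)

  # Fail fast: return an error tuple when the dependency is missing.
  def run_bench(bench_module, fun) do
    if available?(bench_module) do
      {:ok, fun.()}
    else
      {:error, {:missing_dependency, bench_module}}
    end
  end

  # Degrade gracefully: log a warning and skip tracing when the dependency is missing.
  def maybe_trace(trace_module, fun) do
    if available?(trace_module) do
      fun.()
    else
      Logger.warning("tracing disabled: #{inspect(trace_module)} is not available")
      :disabled
    end
  end

  # Omitted (nil) stage options become an empty map.
  def normalize_opts(nil), do: %{}
  def normalize_opts(opts) when is_map(opts), do: opts
end
```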

### Removed
- Removed stale root documentation files:
  - `ADVERSARIAL_ROBUSTNESS.md`, `DATASETS.md`, `ENSEMBLE_GUIDE.md`,
    `HEDGING_GUIDE.md`, `INSTRUMENTATION.md`, `STATISTICAL_TESTING.md`,
    `CAUSAL_TRANSPARENCY.md` (moved to respective packages)
  - `GETTING_STARTED.md`, `ARCHITECTURE.md`, `RESEARCH_METHODOLOGY.md` (replaced by `guides/`)
  - `FAQ.md`, `PUBLICATIONS.md`, `CONTRIBUTING.md` (stale umbrella-era docs)

## [0.5.0] - 2025-12-27

### Added

#### Schema Infrastructure
- **`Crucible.Stage.Schema`**: Canonical schema definition module with:
  - `validate/1` - Validates schema conformance
  - `valid_type_spec?/1` - Type specification validation
  - Complete type system: primitives, structs, enums, lists, maps, functions, unions, tuples

- **`Crucible.Stage.Schema.Normalizer`**: Legacy schema conversion module
  - Converts `:stage` key to `:name`
  - Converts string names to atoms
  - Adds missing `required`, `optional`, `types` fields
  - Moves non-core fields to `__extensions__`

- **`Crucible.Stage.Validator`**: Runtime options validation
  - Validates required options presence
  - Type-checks option values against schema
  - Supports all type specifications from `Schema`
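
To make the normalizer and validator behavior above concrete, here is a self-contained sketch under stated assumptions. `NormalizerSketch` and `ValidatorSketch` are hypothetical stand-ins, not the framework's modules; the set of core keys, the error tuples, and the `:boolean`, `:atom`, and `{:list, spec}` type specs are assumptions.

```elixir
defmodule NormalizerSketch do
  @moduledoc false
  # Assumed set of core schema keys; everything else moves to :__extensions__.
  @core [:name, :description, :required, :optional, :types]

  def normalize(schema) when is_map(schema) do
    # The legacy :stage key becomes :name.
    {stage, schema} = Map.pop(schema, :stage)
    schema = if stage != nil, do: Map.put_new(schema, :name, stage), else: schema

    # String names become atoms (only safe for trusted schema definitions).
    schema =
      case schema do
        %{name: name} when is_binary(name) -> %{schema | name: String.to_atom(name)}
        _ -> schema
      end

    {core, extra} = Map.split(schema, @core)

    core
    |> Map.put_new(:required, [])
    |> Map.put_new(:optional, [])
    |> Map.put_new(:types, %{})
    |> Map.put(:__extensions__, extra)
  end
end

defmodule ValidatorSketch do
  @moduledoc false

  # Returns :ok or {:error, reasons} for an options map and a normalized schema.
  def validate(opts, schema) when is_map(opts) do
    missing =
      for key <- Map.get(schema, :required, []), not Map.has_key?(opts, key) do
        {:missing_required, key}
      end

    type_errors =
      for {key, spec} <- Map.get(schema, :types, %{}),
          Map.has_key?(opts, key),
          not valid?(Map.fetch!(opts, key), spec) do
        {:invalid_type, key, spec}
      end

    case missing ++ type_errors do
      [] -> :ok
      errors -> {:error, errors}
    end
  end

  defp valid?(v, :string), do: is_binary(v)
  defp valid?(v, :integer), do: is_integer(v)
  defp valid?(v, :boolean), do: is_boolean(v)
  defp valid?(v, :atom), do: is_atom(v)
  defp valid?(v, {:enum, values}), do: v in values
  defp valid?(v, {:list, spec}), do: is_list(v) and Enum.all?(v, &valid?(&1, spec))
  defp valid?(v, {:struct, mod}), do: is_struct(v, mod)
  # Unknown specs are rejected in this sketch.
  defp valid?(_v, _unknown), do: false
end
```

Keeping normalization separate from validation means the validator only ever sees one schema shape, whatever format a legacy stage returned.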

#### Registry Enhancements
- **`Crucible.Registry.list_stages_with_schemas/0`**: Returns all stages with their schemas
- **`Crucible.Registry.stage_schema/1`**: Gets normalized schema for a specific stage
- **`Crucible.Registry.list_stages/0`**: Lists all registered stage names

#### Pipeline Runner Validation
- **`validate_options` option**: Opt-in validation mode for `CrucibleFramework.run/2`
  - `:off` (default) - No validation
  - `:warn` - Log warnings but continue
  - `:error` - Fail on validation errors
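
One way the three modes above could be dispatched, as a sketch only. `ValidationModeSketch` and the `{:invalid_options, errors}` error shape are hypothetical; the runner's actual return values are not documented here.

```elixir
defmodule ValidationModeSketch do
  @moduledoc false
  require Logger

  # `errors` is a list of validation problems; `mode` is :off | :warn | :error.
  def apply_mode(_errors, :off), do: :ok
  def apply_mode([], _mode), do: :ok

  # :warn logs the problems and lets the pipeline continue.
  def apply_mode(errors, :warn) do
    Logger.warning("stage options failed validation: #{inspect(errors)}")
    :ok
  end

  # :error stops the pipeline by returning an error tuple.
  def apply_mode(errors, :error), do: {:error, {:invalid_options, errors}}
end
```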

#### Mix Task
- **`mix crucible.stages`**: CLI for stage discovery
  - Lists all registered stages with descriptions
  - `--name <stage>` shows detailed schema for a stage
  - Shows required/optional fields and type specifications

#### Conformance Testing
- **`Crucible.Stage.ConformanceTest`**: Comprehensive tests for all framework stages
  - Existence tests (`describe/1`, `run/2`)
  - Schema structure validation
  - Type coherence checks
  - Required/optional overlap detection

### Changed

- **`describe/1` is now REQUIRED** - Removed from `@optional_callbacks`
- **`Crucible.Stage` moduledoc** - Updated to reflect required `describe/1`

### Breaking Changes

- All stages **must** implement `describe/1` callback
- Stages that omit `describe/1` now trigger a compile-time warning for the missing required callback

### Migration Guide

#### Add `describe/1` to Your Stages

**Before (0.4.x):**
```elixir
defmodule MyStage do
  @behaviour Crucible.Stage

  @impl true
  # The options are unused here, so the argument is underscore-prefixed to avoid a compiler warning
  def run(ctx, _opts), do: {:ok, ctx}
  # describe/1 was optional
end
```

**After (0.5.0):**
```elixir
defmodule MyStage do
  @behaviour Crucible.Stage

  @impl true
  def run(ctx, _opts), do: {:ok, ctx}

  @impl true
  def describe(_opts) do
    %{
      name: :my_stage,
      description: "What this stage does",
      required: [],
      optional: [:option1],
      types: %{option1: :string}
    }
  end
end
```

#### Enable Options Validation (Optional)

```elixir
# Warn on invalid options
CrucibleFramework.run(experiment, validate_options: :warn)

# Fail on invalid options
CrucibleFramework.run(experiment, validate_options: :error)
```

## [0.4.1] - 2025-12-26

### Added

#### Stage Contract Enforcement
- **`Crucible.Stage` Behaviour Documentation**: Comprehensive documentation for the stage contract including:
  - Runner location clarification (`crucible_framework` owns execution, `crucible_ir` defines specs only)
  - Required `run/2` callback semantics
  - Policy-required `describe/1` callback with schema specification
  - Type specifications for option schemas (`:string`, `:integer`, `{:struct, Module}`, `{:enum, [values]}`, etc.)

- **Pipeline Runner Documentation**: Enhanced `Crucible.Pipeline.Runner` moduledoc clarifying:
  - Authoritative runner location in `crucible_framework`
  - Pipeline execution flow and stage resolution
  - Trace integration for observability

#### Built-in Stage Schemas
All built-in stages now implement proper `describe/1` schemas:
- `Crucible.Stage.Validate` - validation options schema
- `Crucible.Stage.Bench` - statistical testing options schema
- `Crucible.Stage.DataChecks` - data validation options schema
- `Crucible.Stage.Guardrails` - guardrail adapter options schema
- `Crucible.Stage.Report` - report generation options schema (new)

### Changed
- **`describe/1` Schema Format**: Updated all built-in stages to return standardized schema:
  ```elixir
  %{
    name: :stage_name,
    description: "Human-readable description",
    required: [:key1, :key2],
    optional: [:key3, :key4],
    types: %{key1: :string, key2: {:struct, Module}}
  }
  ```

### Ecosystem Updates
The following external repositories were updated to implement `describe/1`:

- **crucible_train**: SupervisedTrain, Distillation, DPOTrain, RLTrain stages
- **crucible_model_registry**: Register, Promote stages
- **crucible_deployment**: Deploy, Promote, Rollback stages (also added `@behaviour Crucible.Stage`)
- **crucible_feedback**: CheckTriggers, ExportFeedback stages

### Notes
- The `describe/1` callback remains optional at the behaviour level but is **required by policy**
- Stages own their options schema and validation; IR remains opaque
- External stages (crucible_bench, crucible_ensemble, crucible_hedging, ExFairness) already had `describe/1`

## [0.4.0] - 2025-12-23

### Changed
- **BREAKING**: Now depends on `crucible_ir` package for shared IR structs
- All internal IR definitions removed in favor of `crucible_ir` dependency
- Ensemble config field renamed from `members` to `models` to match CrucibleIR
- Hedging config field renamed from `max_extra_requests` to `max_hedges` to match CrucibleIR
- **Pipeline Runner**: Now automatically marks stages as complete during execution
- **Context Module**: Enhanced with comprehensive documentation and 20+ helper functions (fully backward compatible)

### Added

#### CrucibleIR Migration
- Backwards-compatible `Crucible.IR` module with aliases to `CrucibleIR` structs
- Override declaration for `crucible_ir` dependency to support local path development

#### Enhanced Context Ergonomics
- **Metrics Management**: Added `put_metric/3`, `get_metric/3`, `update_metric/3`, `merge_metrics/2`, and `has_metric?/2` helper functions for cleaner metric manipulation
- **Output Management**: Added `add_output/2` and `add_outputs/2` for ergonomic output collection
- **Artifact Management**: Added `put_artifact/3`, `get_artifact/3`, and `has_artifact?/2` for artifact storage and retrieval
- **Assigns Management**: Added Phoenix-style `assign/2` and `assign/3` functions for flexible context assignments
- **Query Functions**: Added `has_data?/1`, `has_backend_session?/2`, and `get_backend_session/2` for querying context state
- **Stage Tracking**: Added `mark_stage_complete/2`, `stage_completed?/2`, and `completed_stages/1` for pipeline progress tracking
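
The Phoenix-style helper pattern described above, shown on a stand-in struct. `ContextSketch`, its fields, and the context-first argument order are assumptions for illustration; the real `Crucible.Context` may be shaped differently.

```elixir
defmodule ContextSketch do
  @moduledoc false
  defstruct metrics: %{}, assigns: %{}, completed_stages: []

  # Store a metric under a key.
  def put_metric(%__MODULE__{} = ctx, key, value),
    do: %{ctx | metrics: Map.put(ctx.metrics, key, value)}

  # Read a metric, falling back to a default.
  def get_metric(%__MODULE__{} = ctx, key, default \\ nil),
    do: Map.get(ctx.metrics, key, default)

  # Phoenix-style assignment of arbitrary values.
  def assign(%__MODULE__{} = ctx, key, value),
    do: %{ctx | assigns: Map.put(ctx.assigns, key, value)}

  # Record a finished stage in completion order.
  def mark_stage_complete(%__MODULE__{} = ctx, stage),
    do: %{ctx | completed_stages: ctx.completed_stages ++ [stage]}

  def stage_completed?(%__MODULE__{} = ctx, stage), do: stage in ctx.completed_stages
end
```

Because each helper returns the updated struct, calls chain with `|>`, which is where most of the boilerplate reduction comes from.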

#### Pre-Flight Validation
- **`Crucible.Stage.Validate`**: New validation stage for catching configuration errors before pipeline execution
  - Backend registration validation
  - Pipeline stage module resolution
  - Dataset provider verification
  - Reliability configuration validation
  - Output specification validation
  - Strict mode for warnings-as-errors
  - Configurable validation skip options
- **Validation Metrics**: Validation results stored in `context.metrics.validation` with detailed error/warning information

### Removed
- `lib/crucible/ir/` directory (all IR structs now from `crucible_ir` package)
  - Removed: `experiment.ex`, `dataset_ref.ex`, `backend_ref.ex`, `stage_def.ex`, `output_spec.ex`
  - Removed: `reliability_config.ex`, `ensemble_config.ex`, `hedging_config.ex`
  - Removed: `stats_config.ex`, `fairness_config.ex`, `guardrail_config.ex`

### Documentation
- Added comprehensive inline documentation for all Context helper functions
- Added design document in `docs/20251125/enhancements_design.md` detailing v0.4.0 enhancements
- Updated README.md with v0.4.0 feature highlights

### Testing
- Added 180+ new tests covering all enhancements
- `test/crucible/context_test.exs`: 50+ tests for Context helper functions
- `test/crucible/stage/validate_test.exs`: 30+ tests for validation stage
- All tests passing with zero compilation warnings

### Developer Experience Improvements
- Reduced boilerplate code by 40-60% for common context operations
- Clearer error messages from validation stage
- Better debugging via stage completion tracking
- Phoenix-style context manipulation patterns

### Notes
- **Backwards Compatible Aliases**: `Crucible.IR.*` aliases provided for smooth migration
- **Performance**: Helper functions have negligible overhead (<1% measured)

### Migration Guide

#### Update Imports

**Old:**
```elixir
alias Crucible.IR.Experiment
alias Crucible.IR.{BackendRef, DatasetRef}
```

**New (recommended):**
```elixir
alias CrucibleIR.Experiment
alias CrucibleIR.{BackendRef, DatasetRef}
```

**Backwards compatible (deprecated):**
```elixir
# Still works but will be removed in v1.0.0
alias Crucible.IR.Experiment
```

#### Update Config References

**Ensemble config:**
```elixir
# Old
%EnsembleConfig{members: [...]}

# New
%CrucibleIR.Reliability.Ensemble{models: [...]}
```

**Hedging config:**
```elixir
# Old
%HedgingConfig{max_extra_requests: 2}

# New
%CrucibleIR.Reliability.Hedging{max_hedges: 2}
```

#### Update Reliability Config

**Old:**
```elixir
alias Crucible.IR.{ReliabilityConfig, EnsembleConfig, HedgingConfig}

%ReliabilityConfig{
  ensemble: %EnsembleConfig{...},
  hedging: %HedgingConfig{...}
}
```

**New:**
```elixir
alias CrucibleIR.Reliability.{Config, Ensemble, Hedging}

%Config{
  ensemble: %Ensemble{...},
  hedging: %Hedging{...}
}
```

## [0.3.0] - 2025-11-23

### Added
- Declarative Experiment IR (`Crucible.IR.*`) with serializable structs for datasets, stages, backends, and outputs.
- `Crucible.Backend` behaviour and a mockable Tinkex backend implementation that delegates to the `tinkex` SDK via swappable clients.
- Ecto/Postgres persistence layer (experiments, runs, artifacts) plus a turnkey bootstrap script, `scripts/setup_db.sh`.
- `examples/tinkex_live.exs`, a live end-to-end demo using the new pipeline and IR.

### Changed
- Replaced the legacy harness/runner with a stage-based pipeline engine (`Crucible.Pipeline.Runner`) and core stages for data loading, checks, guardrails, backend calls, CNS metrics, bench hooks, and reporting.

## [0.2.1] - 2025-11-21

### Fixed
- **AdaptiveRouting init args** - `Crucible.Hedging.AdaptiveRouting.start_link/1` and `init/1` now normalize maps and keyword lists so `Supertester.OTPHelpers.setup_isolated_genserver/3` can forward `:init_args` unchanged without double-wrapping, keeping the GenServer init contract stable.
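
A sketch of the normalization idea behind this fix: accept either a map or a keyword list and settle on one shape before `init/1` stores it. `InitArgsSketch` is hypothetical, and normalizing to a map (rather than a keyword list) is an assumption about the real module.

```elixir
defmodule InitArgsSketch do
  @moduledoc false
  use GenServer

  def start_link(args), do: GenServer.start_link(__MODULE__, normalize(args))

  @impl true
  # Normalizing again in init/1 keeps direct callers (such as test helpers) safe.
  def init(args), do: {:ok, normalize(args)}

  # Maps pass through unchanged; keyword lists are converted to maps.
  def normalize(args) when is_map(args), do: args
  def normalize(args) when is_list(args), do: Map.new(args)
end
```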

## [0.2.0] - 2025-11-21

### Added

#### Tinkex Integration - Unified ML Training API
- **Crucible.Tinkex Adapter**: Complete integration with Tinkex SDK for LoRA fine-tuning
  - `Crucible.Tinkex.Config` - API credentials, retry policies, default LoRA hyperparameters, quality targets
  - `Crucible.Tinkex.Experiment` - Declarative experiment structure for datasets, sweeps, checkpoints, and replications
  - `Crucible.Tinkex.QualityValidator` - CNS3-derived schema/citation/entailment quality gates
  - `Crucible.Tinkex.Results` - Training/eval aggregation with CSV export and best-run selection
  - `Crucible.Tinkex.Telemetry` - Standardized `[:crucible, :tinkex, ...]` events

#### LoRA Training Interface
- **Crucible.Lora**: High-level adapter-agnostic training interface
  - `create_experiment/1` - Create new training experiments with configuration
  - `train/3` - Run LoRA fine-tuning with automatic checkpointing and quality targets
  - `evaluate/3` - Evaluate trained models against test datasets
  - `resume/2` - Resume training from checkpoint
  - `batch_dataset/2` - Efficient dataset batching
  - `format_training_data/1` - Format data for training backend
  - `checkpoint_name/2` - Deterministic artifact naming
- **Crucible.Lora.Adapter**: Behaviour for implementing custom training backends
  - Swap adapters via `config :crucible_framework, :lora_adapter, MyAdapter`

#### Ensemble Inference with LoRA Adapters
- **Crucible.Ensemble.create/1**: Create ensembles from multiple fine-tuned LoRA adapters
- **Crucible.Ensemble.infer/3**: Run ensemble inference with voting and hedging
- **Crucible.Ensemble.batch_infer/3**: Batch processing for multiple prompts
- Support for weighted adapter configurations in ensemble voting

#### Configuration Architecture
- Hierarchical configuration: application-level, component-level, per-experiment
- Environment variable support via `{:system, "VAR_NAME"}` syntax
- Per-experiment configuration overrides at runtime
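
A sketch of how `{:system, "VAR_NAME"}` tuples and layered overrides could be resolved. `SystemEnvSketch` is hypothetical, and the precedence order (experiment over component over application) is an assumption inferred from the list above.

```elixir
defmodule SystemEnvSketch do
  @moduledoc false

  # Resolve {:system, "VAR"} to the variable's runtime value; pass other values through.
  def resolve({:system, var}) when is_binary(var), do: System.get_env(var)
  def resolve(value), do: value

  # Layered lookup: later maps win, so per-experiment settings override
  # component settings, which override application defaults.
  def fetch(key, app_cfg, component_cfg, experiment_cfg) do
    app_cfg
    |> Map.merge(component_cfg)
    |> Map.merge(experiment_cfg)
    |> Map.get(key)
    |> resolve()
  end
end
```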

#### New Telemetry Events
- `[:crucible, :training, :start | :stop | :exception]` - Training lifecycle
- `[:crucible, :inference, :start | :stop | :exception]` - Inference lifecycle
- `[:crucible, :checkpoint, :save | :load]` - Checkpoint operations
- `[:crucible, :tinkex, :forward_backward | :optim_step | :save_weights]` - Low-level Tinkex operations

#### Documentation
- Updated README with LoRA training workflow quick start
- Updated ARCHITECTURE.md with Tinkex integration layer diagrams
- Updated GETTING_STARTED.md with complete training walkthrough
- Added data flow diagrams for training and inference paths

### Changed
- **mix.exs**: Added `tinkex` (`~> 0.1.1`) as a core dependency
- **Version**: Bumped to 0.2.0 to reflect significant new functionality
- **Error handling**: Unified structured errors via `Crucible.Error` across all components
- **Telemetry**: Enhanced instrumentation with experiment context propagation

### Migration Guide from 0.1.x

#### 1. Add Tinkex Configuration

```elixir
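# config/config.exs
# Note: System.get_env/1 here is evaluated at compile time; move this to
# config/runtime.exs if the key should be read when the app boots.
config :crucible_framework, Crucible.Tinkex,
  api_key: System.get_env("TINKEX_API_KEY"),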
  base_url: "https://api.tinker.example.com",
  timeout: 60_000,
  pool_size: 10

config :crucible_framework,
  lora_adapter: Crucible.Tinkex,
  telemetry_backend: :ets,
  default_hedging: :percentile_75
```

#### 2. Update Experiment Creation

```elixir
# Old approach (0.1.x)
experiment = %{name: "my-experiment", ...}

# New approach (0.2.0)
{:ok, experiment} = Crucible.Lora.create_experiment(
  name: "my-experiment",
  config: %{
    base_model: "llama-3-8b",
    lora_rank: 16,
    learning_rate: 1.0e-4
  }
)
```

#### 3. Update Ensemble Usage

```elixir
# Old approach (using crucible_ensemble directly)
{:ok, result} = CrucibleEnsemble.vote(models, prompt, strategy)

# New approach (unified API with adapters)
{:ok, ensemble} = Crucible.Ensemble.create(
  adapters: [
    %{name: "adapter-v1", weight: 0.4},
    %{name: "adapter-v2", weight: 0.3},
    %{name: "adapter-v3", weight: 0.3}
  ],
  strategy: :weighted_majority
)
{:ok, result} = Crucible.Ensemble.infer(ensemble, prompt)
```

#### 4. Telemetry Handler Updates

```elixir
# New events to handle
:telemetry.attach_many(
  "my-handler",
  [
    [:crucible, :training, :stop],
    [:crucible, :inference, :stop],
    [:crucible, :checkpoint, :save]
  ],
  &MyApp.TelemetryHandler.handle_event/4,
  nil
)
```
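
The `MyApp.TelemetryHandler` referenced above is not defined in this changelog. A minimal sketch of such a handler follows; the logging format is an assumption, while the four-argument callback shape is what `:telemetry` passes to handlers.

```elixir
defmodule MyApp.TelemetryHandler do
  @moduledoc false
  require Logger

  # :telemetry invokes handlers with (event_name, measurements, metadata, config).
  def handle_event([:crucible, kind, action], measurements, metadata, _config) do
    details = inspect(%{measurements: measurements, metadata: metadata})
    Logger.info("crucible #{kind} #{action}: " <> details)
    :ok
  end

  # Ignore anything that is not a three-segment :crucible event.
  def handle_event(_event, _measurements, _metadata, _config), do: :ok
end
```

Handlers should avoid raising, since `:telemetry` detaches a handler that fails.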

## [0.1.5] - 2025-11-21

### Fixed
- **mix.exs metadata** - Corrected a small bug in `mix.exs` so the package version and documentation source references align for the v0.1.5 release.

## [0.1.4] - 2025-11-12

### Changed
- **Tinkex overlay configuration namespace** - Moved API auth, config, job queue/runner, and the related documentation and tests to read application env from `:crucible_framework` instead of `:crucible_tinkex`, so credentials and hooks resolve through the framework's app configuration.

## [0.1.3] - 2025-11-21

### Added
- **Tinkex Integration Layer**
  - `Crucible.Tinkex`, `Config`, `Experiment`, `QualityValidator`, `Results`, and `Telemetry` modules for orchestrating LoRA fine-tuning, telemetry capture, and report generation
  - Helpers for batching datasets, formatting training data, checkpoint naming, and sampling parameter management
  - Quality validation reports and monitoring callbacks aligned with CNS3 targets
  - Experiment management primitives for sweeps, run generation, and lifecycle transitions
  - Result aggregation utilities with CSV export, best-run selection, and report data production
- **LoRA Adapter Abstraction**
  - Added `Crucible.Lora` facade plus `Crucible.Lora.Adapter` behaviour so Crucible can target any fine-tuning backend
  - Default adapter (`Crucible.Tinkex`) now implements the behaviour and can be swapped via `config :crucible_framework, :lora_adapter, MyAdapter`
- **Comprehensive Test Coverage**
  - 6 new ExUnit files spanning configuration, experiments, results, telemetry, and top-level helpers
  - Property-based fixtures via `stream_data` and mocking hooks via `mox`
- **Dependency Support**
  - Added `tinkex`, `mox`, and `stream_data` to `mix.exs` along with the corresponding lock entries

### Changed
- Updated README with MIT licensing, the new LoRA adapter layer overview, and reproducibility metadata for v0.1.3
- Expanded GETTING_STARTED guide with the adapter architecture, refreshed version metadata, and Hex dependency snippets
- Set package license metadata to MIT and documented the change across docs

## [0.1.2] - 2025-10-29

### Added
- **Core Library Implementation** - Added practical Elixir modules for framework usage
  - `CrucibleFramework` module with version info, component status, and system information
  - `CrucibleFramework.Experiment` module for defining and validating experiments
  - `CrucibleFramework.Statistics` module with fundamental statistical functions (mean, median, std dev, variance, percentiles)
- **Comprehensive Test Suite** - 72 tests (24 doctests + 48 unit tests), all passing
  - Full test coverage for all modules and functions
  - Doctest examples in all public functions
  - Edge case testing and validation
- **Working Examples** - Four complete, runnable examples in `examples/` directory
  - `01_basic_usage.exs` - Framework information and component status
  - `02_statistics.exs` - Statistical analysis of experimental data
  - `03_experiment_definition.exs` - Experiment configuration and validation
  - `04_statistical_analysis.exs` - Complete research workflow with cost-benefit analysis
  - `examples/README.md` - Comprehensive guide for all examples
- **Enhanced Documentation**
  - Detailed module documentation with examples
  - Clear learning path for new users
  - Troubleshooting guides
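
For readers unfamiliar with the functions `CrucibleFramework.Statistics` is described as providing, the sketch below shows the underlying calculations. `StatsSketch` is a hypothetical module, not the library's implementation; in particular, using sample variance (dividing by n - 1) is an assumption.

```elixir
defmodule StatsSketch do
  @moduledoc false

  # Arithmetic mean of a non-empty list.
  def mean([_ | _] = xs), do: Enum.sum(xs) / length(xs)

  # Middle value, or the average of the two middle values for even-length lists.
  def median([_ | _] = xs) do
    sorted = Enum.sort(xs)
    n = length(sorted)
    mid = div(n, 2)

    if rem(n, 2) == 1 do
      Enum.at(sorted, mid)
    else
      (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
    end
  end

  # Sample variance (divides by n - 1); needs at least two values.
  def variance([_, _ | _] = xs) do
    m = mean(xs)
    Enum.sum(Enum.map(xs, fn x -> (x - m) * (x - m) end)) / (length(xs) - 1)
  end

  def std_dev(xs), do: :math.sqrt(variance(xs))
end
```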

### Changed
- Transformed from documentation-only package to functional library with working code
- Updated package structure to include `lib/` and `test/` directories
- Enhanced mix.exs configuration for better code organization

## [0.1.1] - 2025-10-28

### Added
- **ADVERSARIAL_ROBUSTNESS.md** - Comprehensive adversarial defense guide covering the complete security stack
  - Documentation for 21 attack types across 5 categories (character, word, semantic, prompt injection, jailbreak)
  - Defense mechanisms: detection, filtering, and sanitization with risk scoring
  - Integration guide for 4-layer security stack: CrucibleAdversary, LlmGuard, ExFairness, ExDataCheck
  - Fairness metrics and EEOC 80% rule compliance checking
  - Data quality validation with 22 expectations and drift detection (KS test, PSI)
  - Complete production security pipeline examples with defense-in-depth patterns
  - Performance benchmarks and best practices for adversarial robustness
  - Links to all 4 component GitHub repositories with technical deep dives
- Updated README.md with "Security & Adversarial Robustness" section
- Added adversarial robustness documentation to HexDocs configuration

### Changed
- Organized documentation to highlight adversarial defense capabilities alongside other framework components
- Enhanced documentation navigation with adversarial robustness in Component Guides section

## [0.1.0] - 2024-10-09

### Added
- Initial release of Crucible documentation framework
- Migrated from Spectra umbrella project to independent organization
- Complete guide collection for all Crucible components
- Comprehensive documentation hub for the Crucible framework
- Architecture documentation
- Research methodology guides
- Component-specific guides (Ensemble, Hedging, Statistical Testing, etc.)
- Contribution guidelines
- FAQ and publications
