Rulestead explainability is for support, operators, and incident response. The goal is not to dump raw internal traces but to give a bounded answer to "why did this subject get this result in this environment?"

Two Explain Paths

Use the path that matches where you are standing:

  • Rulestead.explain(flag_payload, context) when you already have the authored flag payload and want a pure, human-readable explanation
  • Rulestead.explain_flag(flag_key, environment_key, context, opts \\ []) when an operator needs the mounted admin-safe runtime seam for one live flag

The root explain/2 call stays payload-first. The admin-safe explain_flag/4 adds environment lookup, authorization, and redaction at the package boundary.

What A Good Explanation Tells You

A useful explanation answers:

  • which flag and environment were evaluated
  • whether a rule matched or the default applied
  • which rule matched
  • whether deterministic bucketing affected the outcome
  • what the final value or variant decision was

That is enough for support and operator workflows without exposing raw actor payloads.

Keep Context Bounded

Explain requests should carry only the bounded context fields the runtime uses:

context =
  Rulestead.Context.new(
    targeting_key: "user_123",
    environment: "prod",
    attributes: %{country: "US", plan: "pro"}
  )

Avoid passing whole application structs or raw user payloads. The explain path is designed around explicit context and redacted metadata.
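One way to keep the context bounded is to reduce the host's user record to an explicit allowlist before building the context. The sketch below is host-side code, not part of Rulestead: the module name, the allowlist, and the shape of the user map are all assumptions made for illustration.

```elixir
defmodule BoundedContextSketch do
  # Hypothetical allowlist: list only the attributes your rules actually read.
  @allowed_attributes [:country, :plan]

  # Reduce an arbitrary user map (or struct) to the fields an explain request needs.
  # Everything outside the allowlist is dropped before it can reach the explain path.
  def bounded_fields(user) when is_map(user) do
    %{
      targeting_key: "user_#{Map.fetch!(user, :id)}",
      attributes: Map.take(user, @allowed_attributes)
    }
  end
end

# Hypothetical host record carrying fields that must not reach an explain request.
user = %{
  id: 123,
  email: "a@example.com",
  country: "US",
  plan: "pro",
  password_hash: "not-a-real-hash"
}

fields = BoundedContextSketch.bounded_fields(user)

# The bounded fields can then feed the context shown above, for example:
#   Rulestead.Context.new(
#     targeting_key: fields.targeting_key,
#     environment: "prod",
#     attributes: fields.attributes
#   )
```

Because the allowlist is explicit, adding a new attribute to explain requests becomes a reviewed change rather than a side effect of the user record growing.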

Operator Workflow

From the mounted admin package, the stable explain route is:

  • /admin/flags/:key/simulate?env=:environment

Use it like this:

  1. choose the flag
  2. choose the environment through ?env=
  3. enter the bounded targeting context
  4. read the explanation and matched-rule outcome
  5. share the operator-facing URL or summarize the explanation in a ticket

The URL and environment convention are stable. Internal LiveView implementation details are not.
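When a ticket template or chat tool needs to generate that URL, a small helper keeps the encoding consistent. This is a minimal sketch that assumes the admin package is mounted at /admin as in the route above; the module name and the example flag key are hypothetical.

```elixir
defmodule SimulateLinkSketch do
  # Build the operator-facing simulate path for one flag and environment.
  # The flag key is percent-encoded as a path segment; the environment goes in ?env=.
  def simulate_path(flag_key, environment)
      when is_binary(flag_key) and is_binary(environment) do
    encoded_key = URI.encode(flag_key, &URI.char_unreserved?/1)
    "/admin/flags/#{encoded_key}/simulate?" <> URI.encode_query(%{"env" => environment})
  end
end

# "checkout_v2" is a made-up flag key used only for illustration.
path = SimulateLinkSketch.simulate_path("checkout_v2", "prod")
```

The helper depends only on the documented route and the ?env= convention, so it does not break when the LiveView internals change.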

Lifecycle Evidence For Support And SRE

Support and SRE should not use explainability in isolation when lifecycle questions appear. Use three bounded surfaces together:

  • explain output for one decision path
  • lifecycle evidence from mounted review or mix rulestead.lifecycle
  • audit history for who changed what and why

That combination answers the real operator questions:

  • is the flag still expected to be active?
  • was it an archive candidate or blocked by missing evidence?
  • did a recent cleanup or owner handoff happen?
  • who changed the lifecycle posture?

This keeps lifecycle evidence, explain traces, and audit history aligned for support handoff without turning explainability into a second lifecycle system.

Redaction Rules

Explain and simulation workflows should stay redacted by default:

  • do not surface raw traits or PII unless the host explicitly allowlists a bounded key
  • prefer targeting_key and a small set of business-safe attributes
  • keep screenshots and support notes focused on the explanation, not the full input payload

The admin-safe explain seam returns redacted context metadata alongside the explanation so operators can confirm what was actually used without dumping the full trait bag.
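The same allowlist idea applies when a host prepares attributes for a support note. The sketch below is an illustration of allowlist-based redaction on the host side, not Rulestead's own redaction implementation; the module name and the returned map shape are assumptions.

```elixir
defmodule RedactionSketch do
  # Keep only allowlisted attributes. Other keys are reported by name, never by value,
  # so a reader can tell that something was withheld without seeing it.
  def redact(attributes, allowlist) when is_map(attributes) and is_list(allowlist) do
    {visible, hidden} = Map.split(attributes, allowlist)
    %{visible: visible, redacted_keys: hidden |> Map.keys() |> Enum.sort()}
  end
end

redacted =
  RedactionSketch.redact(
    %{country: "US", plan: "pro", email: "a@example.com"},
    [:country, :plan]
  )
```

If key names themselves are sensitive in your domain, drop the redacted_keys list and report only a count.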

Audience Trace In Explain Output

Explain and simulate output includes Audience trace steps for reusable targeting: matched, missed, missing from snapshot, and archived. Resolution is snapshot-local: audience evaluation performs no live database reads, mounted-admin lookups, host identity resolution, or observability queries.
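For a ticket summary, it can help to count the trace steps by outcome. This guide does not show Rulestead's actual trace shape, so the step maps and status atoms below are stand-ins that mirror the four outcomes named above; adapt the field names to whatever your explain output really contains.

```elixir
defmodule AudienceTraceSketch do
  # Hypothetical status atoms mirroring the four documented outcomes.
  @statuses [:matched, :missed, :missing_from_snapshot, :archived]

  # Count steps per outcome, reporting zero for outcomes that did not occur.
  def summarize(steps) when is_list(steps) do
    counts = Enum.frequencies_by(steps, & &1.status)
    Map.new(@statuses, fn status -> {status, Map.get(counts, status, 0)} end)
  end
end

# Stand-in trace steps; audience names are invented for illustration.
steps = [
  %{audience: "beta_testers", status: :matched},
  %{audience: "eu_customers", status: :missed},
  %{audience: "legacy_plan", status: :archived}
]

summary = AudienceTraceSketch.summarize(steps)
```

A non-zero count for missing from snapshot or archived is usually the cue to look at the dependency inventory and audit history rather than the rules themselves.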

Support-safe explain permalinks include only the flag, environment, tenant, and targeting key. They never include raw traits.

When audience questions exceed one explain call, escalate by combining explain output, the dependency inventory, and audit history. Rulestead does not provide built-in observability dashboards or package-owned metrics for this path.

Simulation And Explain Belong Together

Simulation is the operator workflow for asking "what would happen for this context right now?" Explainability is the readable trace that answers it.

Use that pair when:

  • support needs to answer a customer report
  • an operator wants to verify a rollout step before publishing
  • on-call needs to understand whether a flag or rule caused an incident

Escalation Boundary

If an explanation is not enough, escalate to:

  • the timeline route for change history
  • lifecycle evidence from mix rulestead.lifecycle or the mounted queue
  • telemetry for aggregate runtime signals
  • the authored ruleset itself for exact rule order and conditions

Do not escalate by depending on RulesteadAdmin.Live.* internals. That would couple your workflow to implementation details the package does not stabilize.