Decision Explanation

Every routing decision in Floopy already carries a structured decision_trace: candidates, weights, filtered reasons, winner, confidence, and (since v2) evidence. That is the machine-readable record. The Decision Explanation feature renders the same record into one short paragraph of plain prose, in the customer’s language, on demand.
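The shape of that record can be sketched as a small typed structure. This is a hypothetical illustration only — field names follow the prose above (candidates, weights, filtered reasons, winner, confidence, evidence), and the second model name is a made-up placeholder; the real wire schema is Floopy's and is not reproduced here:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DecisionTrace:
    """Illustrative sketch of the decision_trace record; not the real schema."""
    candidates: tuple[str, ...]     # models that were considered
    weights: dict[str, float]       # per-candidate scores
    filtered: dict[str, str]        # candidate -> reason it was filtered out
    winner: str
    confidence: float
    evidence: tuple[str, ...] = ()  # present since v2

trace = DecisionTrace(
    candidates=("openai/gpt-5.4-mini", "example/model-b"),  # second name is hypothetical
    weights={"openai/gpt-5.4-mini": 0.78, "example/model-b": 0.71},
    filtered={},
    winner="openai/gpt-5.4-mini",
    confidence=0.78,
)
```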

It is intentionally not an LLM-generated explanation. There is a closed taxonomy of templates, a closed schema of typed parameters, and a deterministic render function. The same decision rendered twice produces the same prose byte-for-byte.

When you read a decision via GET /v1/decisions/{request_id} or POST /v1/routing/explain, the response now includes:

{
  "explanation": {
    "text": "Floopy routed this request to openai/gpt-5.4-mini based on 412 historical samples and a moderate confidence of 0.78. The next candidate scored within 0.07 points and outcome variance has been stable. One regression was recorded in the last 7 days.",
    "template_id": "feedback_driven_moderate_confidence"
  }
}

And the response carries an HTTP header:

Content-Language: en

The locale is echoed as a header rather than a JSON field so the wire shape stays compact and the locale is communicated through the standard HTTP mechanism.

Explanations are composed from a closed set of 15 templates. Every decision picks exactly one template based on its strategy and outcome:

  • cache_hit: The response was served from cache; no router was invoked.
  • fallback_only: All candidates were filtered; the request fell through to the default model.
  • no_router_invoked: A short-circuit path took over (legacy model, single-candidate rule).
  • feedback_driven_high_confidence: The Feedback-Driven router selected with confidence >= 0.8.
  • feedback_driven_moderate_confidence: The Feedback-Driven router selected with confidence in [0.5, 0.8).
  • feedback_driven_low_confidence: The Feedback-Driven router selected with confidence < 0.5.
  • smart_cost_selected: The Smart Cost router picked the cheapest candidate that met the quality bar.
  • constraint_rejected_max_cost_increase: The intended winner was filtered by max_cost_increase.
  • constraint_rejected_max_regression: Filtered by max_regression.
  • constraint_rejected_min_samples: Filtered by min_samples_before_promotion.
  • constraint_rejected_cost_drop_requires_validation: Filtered by max_cost_drop_without_validation.
  • constraint_rejected_high_variance: Filtered by max_outcome_variance.
  • constraint_rejected_shadow_required: Filtered by require_shadow_before_live.
  • firewall_blocked: The LLM firewall blocked the request before routing completed.
  • fallback: A generic fallback path took over (provider error, circuit breaker, rate limit).

The template_id enum is closed and stable. Adding a new template is a deliberate, versioned change. A decision whose strategy is not yet covered is a compile error in the gateway, not a runtime fallback to “I don’t know”.
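As a sketch of how the confidence bands above map to the Feedback-Driven templates, one might write (the enum shows only the three Feedback-Driven members, and the selection function is an illustration of the thresholds in the table, not the gateway's actual code):

```python
from enum import Enum

class TemplateId(Enum):
    """Closed enum mirroring the taxonomy above (Feedback-Driven subset only)."""
    FEEDBACK_DRIVEN_HIGH_CONFIDENCE = "feedback_driven_high_confidence"
    FEEDBACK_DRIVEN_MODERATE_CONFIDENCE = "feedback_driven_moderate_confidence"
    FEEDBACK_DRIVEN_LOW_CONFIDENCE = "feedback_driven_low_confidence"

def feedback_driven_template(confidence: float) -> TemplateId:
    # Bands from the table: >= 0.8 high, [0.5, 0.8) moderate, < 0.5 low.
    if confidence >= 0.8:
        return TemplateId.FEEDBACK_DRIVEN_HIGH_CONFIDENCE
    if confidence >= 0.5:
        return TemplateId.FEEDBACK_DRIVEN_MODERATE_CONFIDENCE
    return TemplateId.FEEDBACK_DRIVEN_LOW_CONFIDENCE
```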

The locale used for the rendered text is resolved at read time, on every request, in the following order:

  1. The HTTP Accept-Language request header.
  2. If absent, malformed, or otherwise unparseable: en.

There is no organisation-wide default locale and no per-user preference baked into the response. The choice is driven purely by the request header. The resolved locale is then echoed via the Content-Language response header so callers can verify which locale they got.

The Accept-Language parser is hardened against abuse:

  • The header is capped at 256 bytes.
  • ASCII-only — non-ASCII bytes trigger a fallback to en.
  • A strict allowlist regex rejects malformed quality factors and unknown locales.
  • The supported locales are en and pt; any other locale falls back to en.

The render function is an exhaustive match over (template_id, locale). Every supported template has a string for every supported locale, enforced at compile time. Adding a pt translation that is missing or stale is a build failure, not a runtime fallback.

The current set of locales is {en, pt}. Adding a new locale is an explicit gateway release, not a configuration toggle.
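In Python there is no compile-time exhaustiveness check, but the closest analogue to the gateway's exhaustive match is a lookup table keyed by (template_id, locale) with a completeness assertion at import time. The table below is a hypothetical sketch with two templates, and the pt strings are our own illustrative translations, not Floopy's:

```python
TEMPLATE_IDS = {"cache_hit", "fallback_only"}   # illustrative subset of the 15
LOCALES = {"en", "pt"}

RENDER_TABLE: dict[tuple[str, str], str] = {
    ("cache_hit", "en"): "This response was served from cache; no router was invoked.",
    ("cache_hit", "pt"): "Esta resposta foi servida do cache; nenhum roteador foi invocado.",
    ("fallback_only", "en"): "All candidates were filtered; the request fell through to the default model.",
    ("fallback_only", "pt"): "Todos os candidatos foram filtrados; a solicitação caiu no modelo padrão.",
}

# Every (template, locale) pair must have a string. A missing pt translation
# fails here, at import time -- the Python stand-in for a build failure.
_missing = {(t, l) for t in TEMPLATE_IDS for l in LOCALES} - RENDER_TABLE.keys()
assert not _missing, f"missing translations: {_missing}"

def render(template_id: str, locale: str) -> str:
    # A KeyError here would be a bug, never a runtime fallback.
    return RENDER_TABLE[(template_id, locale)]
```

Because the table is static, the same decision rendered twice yields the same prose byte-for-byte, as the contract above requires.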

This is the most important contract: the explanation text never references prompt content. The render function takes a (template_id, params) pair where params is a closed sub-schema of typed scalars — sample counts, confidence values, the gap to the runner-up, the bucketed regression count, and provider/model identifiers. None of these fields ever carry caller-controlled text.
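A minimal sketch of that closed params sub-schema, with hypothetical field names derived from the list above (the real schema is not reproduced here) — every field is a typed scalar or a sanitized identifier, never free text:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExplanationParams:
    """Illustrative closed params sub-schema: typed scalars only."""
    sample_count: int       # historical samples behind the decision
    confidence: float
    runner_up_gap: float    # score gap to the next candidate
    regression_bucket: int  # bucketed regression count (e.g. last 7 days)
    provider: str           # identifier; passes the sanitizer before interpolation
    model: str              # identifier; passes the sanitizer before interpolation

params = ExplanationParams(
    sample_count=412,
    confidence=0.78,
    runner_up_gap=0.07,
    regression_bucket=1,
    provider="openai",
    model="gpt-5.4-mini",
)
```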

Provider and model strings pass through a sanitizer before they are interpolated into the template:

  • Anything outside [a-zA-Z0-9._/-] is dropped.
  • The result is bounded so a hostile model name cannot blow up the prose.

The output is also defensively truncated to 600 characters at the edge of the renderer, fuzz-tested for control characters, and gated against Markdown control characters. The dashboard renders the text inside a <p> element as plain text — no HTML, no Markdown.
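The sanitizer and the defensive truncation can be sketched like this. The 64-character identifier bound is an assumption on our part (the prose only says the result is "bounded"); the 600-character explanation cap and the [a-zA-Z0-9._/-] allowlist come from the text above:

```python
import re

_DISALLOWED = re.compile(r"[^a-zA-Z0-9._/-]")
MAX_IDENTIFIER_LEN = 64       # assumed bound; the exact limit is not documented here
MAX_EXPLANATION_LEN = 600     # stated truncation limit at the renderer's edge

def sanitize_identifier(raw: str) -> str:
    """Drop everything outside the allowlist, then bound the length."""
    return _DISALLOWED.sub("", raw)[:MAX_IDENTIFIER_LEN]

def truncate_explanation(text: str) -> str:
    """Defensive truncation applied to the fully rendered prose."""
    return text[:MAX_EXPLANATION_LEN]
```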

Why render at read time instead of persisting the prose? Two reasons.

First, security and storage: if explanation prose were persisted in decision_trace, every byte of every decision would carry an attack surface for stored-text injection forever. By keeping the persisted artifacts strictly typed (explanation_template_id plus explanation_params), the audit row stays small, machine-friendly, and free of free text.

Second, internationalisation: rendering at read time means a customer who reads the same decision in en today and pt tomorrow gets two different paragraphs from one stored decision. There is no migration step when we ship a new locale or refine a template — every existing decision picks up the new prose immediately.

The explanation field is not added to:

  • GET /v1/decisions (the list endpoint) — too verbose for a paginated list view.
  • GET /v1/export/decisions — exports are machine-consumed; the persisted template_id is intentionally excluded from the export allowlist so the wire shape is stable across customers regardless of locale support.

If you want the explanation, fetch the single decision: GET /v1/decisions/{request_id}.

The Floopy onboarding “shadow setup” step uses our internal model catalog to pick a sensible cheaper alternative for the model the customer just connected. When that step’s experiment writes its first decisions, the explanation template will typically be one of the Smart Cost or Feedback-Driven branches, depending on whether the customer is already running Feedback-Driven routing.