Decision Explanation

Every routing decision in Floopy already carries a structured decision_trace: candidates, weights, filtered reasons, winner, confidence, and (since v2) evidence. That is the machine-readable record. The Decision Explanation feature renders the same record into one short paragraph of plain prose, in the customer’s language, on demand.
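The shape of that record can be sketched as a small typed structure. This is a hypothetical illustration only — field names follow the prose above (candidates, weights, filtered reasons, winner, confidence, evidence), and the second model name is a made-up placeholder; the real wire schema is Floopy's and is not reproduced here:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DecisionTrace:
    """Illustrative sketch of the decision_trace record; not the real schema."""
    candidates: tuple[str, ...]     # models that were considered
    weights: dict[str, float]       # per-candidate scores
    filtered: dict[str, str]        # candidate -> reason it was filtered out
    winner: str
    confidence: float
    evidence: tuple[str, ...] = ()  # present since v2

trace = DecisionTrace(
    candidates=("openai/gpt-5.4-mini", "example/model-b"),  # second name is hypothetical
    weights={"openai/gpt-5.4-mini": 0.78, "example/model-b": 0.71},
    filtered={},
    winner="openai/gpt-5.4-mini",
    confidence=0.78,
)
```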

It is intentionally not an LLM-generated explanation. There is a closed taxonomy of templates, a closed schema of typed parameters, and a deterministic render function. The same decision rendered twice produces the same prose byte-for-byte.

When you read a decision via GET /v1/decisions/{request_id} or POST /v1/routing/explain, the response now includes:

{
  "explanation": {
    "text": "Floopy routed this request to openai/gpt-5.4-mini based on 412 historical samples and a moderate confidence of 0.78. The next candidate scored within 0.07 points and outcome variance has been stable. One regression was recorded in the last 7 days.",
    "template_id": "feedback_driven_moderate_confidence"
  }
}

And the response carries an HTTP header:

Content-Language: en

The locale is echoed as a header rather than a JSON field so the wire shape stays compact and the locale is communicated through the standard HTTP mechanism.

Explanations are composed from a closed set of 15 templates. Every decision picks exactly one template based on its strategy and outcome:

  • cache_hit: The response was served from cache; no router was invoked.
  • fallback_only: All candidates were filtered; the request fell through to the default model.
  • no_router_invoked: A short-circuit path took over (legacy model, single-candidate rule).
  • feedback_driven_high_confidence: The Feedback-Driven router selected with confidence >= 0.8.
  • feedback_driven_moderate_confidence: The Feedback-Driven router selected with confidence in [0.5, 0.8).
  • feedback_driven_low_confidence: The Feedback-Driven router selected with confidence < 0.5.
  • smart_cost_selected: The Smart Cost router picked the cheapest candidate that met the quality bar.
  • constraint_rejected_max_cost_increase: The intended winner was filtered by max_cost_increase.
  • constraint_rejected_max_regression: Filtered by max_regression.
  • constraint_rejected_min_samples: Filtered by min_samples_before_promotion.
  • constraint_rejected_cost_drop_requires_validation: Filtered by max_cost_drop_without_validation.
  • constraint_rejected_high_variance: Filtered by max_outcome_variance.
  • constraint_rejected_shadow_required: Filtered by require_shadow_before_live.
  • firewall_blocked: The LLM firewall blocked the request before routing completed.
  • fallback: A generic fallback path took over (provider error, circuit breaker, rate limit).

The template_id enum is closed and stable. Adding a new template is a deliberate, versioned change. A decision whose strategy is not yet covered is a compile error in the gateway, not a runtime fallback to “I don’t know”.
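As a sketch of how the confidence bands above map to the Feedback-Driven templates, one might write (the enum shows only the three Feedback-Driven members, and the selection function is an illustration of the thresholds in the table, not the gateway's actual code):

```python
from enum import Enum

class TemplateId(Enum):
    """Closed enum mirroring the taxonomy above (Feedback-Driven subset only)."""
    FEEDBACK_DRIVEN_HIGH_CONFIDENCE = "feedback_driven_high_confidence"
    FEEDBACK_DRIVEN_MODERATE_CONFIDENCE = "feedback_driven_moderate_confidence"
    FEEDBACK_DRIVEN_LOW_CONFIDENCE = "feedback_driven_low_confidence"

def feedback_driven_template(confidence: float) -> TemplateId:
    # Bands from the table: >= 0.8 high, [0.5, 0.8) moderate, < 0.5 low.
    if confidence >= 0.8:
        return TemplateId.FEEDBACK_DRIVEN_HIGH_CONFIDENCE
    if confidence >= 0.5:
        return TemplateId.FEEDBACK_DRIVEN_MODERATE_CONFIDENCE
    return TemplateId.FEEDBACK_DRIVEN_LOW_CONFIDENCE
```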

The locale used for the rendered text is resolved at read time, on every request, in the following order:

  1. The HTTP Accept-Language request header.
  2. If absent, malformed, or otherwise unparseable: en.

There is no organisation-wide default locale and no per-user preference baked into the response. The choice is driven purely by the request header. The resolved locale is then echoed via the Content-Language response header so callers can verify which locale they got.

The Accept-Language parser is hardened against abuse:

  • The header is capped at 256 bytes.
  • ASCII-only — non-ASCII bytes trigger a fallback to en.
  • A strict allowlist regex rejects malformed quality factors and unknown locales.
  • The supported locales are en and pt; any other locale falls back to en.

The render function is an exhaustive match over (template_id, locale). Every supported template has a string for every supported locale, enforced at compile time. Adding a pt translation that is missing or stale is a build failure, not a runtime fallback.

The current set of locales is {en, pt}. Adding a new locale is an explicit gateway release, not a configuration toggle.
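In Python there is no compile-time exhaustiveness check, but the closest analogue to the gateway's exhaustive match is a lookup table keyed by (template_id, locale) with a completeness assertion at import time. The table below is a hypothetical sketch with two templates, and the pt strings are our own illustrative translations, not Floopy's:

```python
TEMPLATE_IDS = {"cache_hit", "fallback_only"}   # illustrative subset of the 15
LOCALES = {"en", "pt"}

RENDER_TABLE: dict[tuple[str, str], str] = {
    ("cache_hit", "en"): "This response was served from cache; no router was invoked.",
    ("cache_hit", "pt"): "Esta resposta foi servida do cache; nenhum roteador foi invocado.",
    ("fallback_only", "en"): "All candidates were filtered; the request fell through to the default model.",
    ("fallback_only", "pt"): "Todos os candidatos foram filtrados; a solicitação caiu no modelo padrão.",
}

# Every (template, locale) pair must have a string. A missing pt translation
# fails here, at import time -- the Python stand-in for a build failure.
_missing = {(t, l) for t in TEMPLATE_IDS for l in LOCALES} - RENDER_TABLE.keys()
assert not _missing, f"missing translations: {_missing}"

def render(template_id: str, locale: str) -> str:
    # A KeyError here would be a bug, never a runtime fallback.
    return RENDER_TABLE[(template_id, locale)]
```

Because the table is static, the same decision rendered twice yields the same prose byte-for-byte, as the contract above requires.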

This is the most important contract: the explanation text never references prompt content. The render function takes a (template_id, params) pair where params is a closed sub-schema of typed scalars — sample counts, confidence values, the gap to the runner-up, the bucketed regression count, and provider/model identifiers. None of these fields ever carry caller-controlled text.
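A minimal sketch of that closed params sub-schema, with hypothetical field names derived from the list above (the real schema is not reproduced here) — every field is a typed scalar or a sanitized identifier, never free text:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExplanationParams:
    """Illustrative closed params sub-schema: typed scalars only."""
    sample_count: int       # historical samples behind the decision
    confidence: float
    runner_up_gap: float    # score gap to the next candidate
    regression_bucket: int  # bucketed regression count (e.g. last 7 days)
    provider: str           # identifier; passes the sanitizer before interpolation
    model: str              # identifier; passes the sanitizer before interpolation

params = ExplanationParams(
    sample_count=412,
    confidence=0.78,
    runner_up_gap=0.07,
    regression_bucket=1,
    provider="openai",
    model="gpt-5.4-mini",
)
```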

Provider and model strings pass through a sanitizer before they are interpolated into the template:

  • Anything outside [a-zA-Z0-9._/-] is dropped.
  • The result is bounded so a hostile model name cannot blow up the prose.

The output is also defensively truncated to 600 characters at the edge of the renderer, fuzz-tested for control characters, and gated against Markdown control characters. The dashboard renders the text inside a <p> element as plain text — no HTML, no Markdown.
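The sanitizer and the defensive truncation can be sketched like this. The 64-character identifier bound is an assumption on our part (the prose only says the result is "bounded"); the 600-character explanation cap and the [a-zA-Z0-9._/-] allowlist come from the text above:

```python
import re

_DISALLOWED = re.compile(r"[^a-zA-Z0-9._/-]")
MAX_IDENTIFIER_LEN = 64       # assumed bound; the exact limit is not documented here
MAX_EXPLANATION_LEN = 600     # stated truncation limit at the renderer's edge

def sanitize_identifier(raw: str) -> str:
    """Drop everything outside the allowlist, then bound the length."""
    return _DISALLOWED.sub("", raw)[:MAX_IDENTIFIER_LEN]

def truncate_explanation(text: str) -> str:
    """Defensive truncation applied to the fully rendered prose."""
    return text[:MAX_EXPLANATION_LEN]
```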

Why render at read time instead of persisting the prose? Two reasons.

First, security and storage: if explanation prose were persisted in decision_trace, every byte of every decision would carry an attack surface for stored-text injection forever. By keeping the persisted artifacts strictly typed (explanation_template_id plus explanation_params), the audit row stays small, machine-friendly, and free of free text.

Second, internationalisation: rendering at read time means a customer who reads the same decision in en today and pt tomorrow gets two different paragraphs from one stored decision. There is no migration step when we ship a new locale or refine a template — every existing decision picks up the new prose immediately.

The explanation field is not added to:

  • GET /v1/decisions (the list endpoint) — too verbose for a paginated list view.
  • GET /v1/export/decisions — exports are machine-consumed; the persisted template_id is intentionally excluded from the export allowlist so the wire shape is stable across customers regardless of locale support.

If you want the explanation, fetch the single decision: GET /v1/decisions/{request_id}.

The Floopy onboarding “shadow setup” step uses our internal model catalog to pick a sensible cheaper alternative for the model the customer just connected. When that step’s experiment writes its first decisions, the explanation template will typically be one of the Smart Cost or Feedback-Driven branches, depending on whether the customer is already running Feedback-Driven routing.