POST /v1/routing/explain

Dry-run the routing selection for a candidate request body. The gateway runs the full strategy chain — including the LLM firewall — and returns the candidates, weights, the model that would have been selected, and the router’s confidence. No provider is called. No decision_trace row is emitted. No rate-limit budget is spent on the chat path.

Designed for CI checks (“does this prompt route the way we expect?”), pre-sale demos, and constraint-tuning experiments.

POST https://api.floopy.ai/v1/routing/explain
Authorization: Bearer <your-floopy-api-key>
Content-Type: application/json
  • Permission: write_permission. The endpoint accepts a JSON payload (hence write), but it is non-mutating — no provider call, no log row, no counter increment.
  • Pro plan (requires Feedback-Driven routing).
  • We are rolling this out per organization while we validate quality. Contact support if your organization is not yet enabled.

Body cap: 64 KiB. Larger payloads return 400 body_too_large.

{
  "request": {
    "model": "gpt-5.4",
    "messages": [
      { "role": "user", "content": "Summarise this incident report in 3 bullets." }
    ],
    "temperature": 0.3
  },
  "headers": {
    "floopy-session-id": "sess_demo_1"
  }
}
| Field | Type | Required | Constraints |
| --- | --- | --- | --- |
| request | object | Yes | A standard OpenAI-compatible chat-completion body (same shape as POST /v1/chat/completions). |
| headers | object | No | A subset of headers to forward to the routing context. Reserved keys are dropped server-side (see Header sanitisation). |

Unknown top-level keys return 400 invalid_body (the request schema is deny_unknown_fields).

Before the headers map is passed into the dry-run engine, entries whose lowercase key matches the following are dropped:

  • exact match: authorization, cookie, host
  • prefix match: x-forwarded-, x-real-ip

This keeps the explain endpoint from being abused as a header-relay or fingerprinting tool.
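The drop rule above is easy to mirror client-side, for example to predict which of your headers will actually reach the routing context. A minimal sketch; the function name is ours, not part of any Floopy SDK:

```python
# Mirror of the documented server-side sanitisation: drop exact matches
# (authorization, cookie, host) and the x-forwarded- / x-real-ip prefixes,
# matching on the lowercased key.
DROPPED_EXACT = {"authorization", "cookie", "host"}
DROPPED_PREFIXES = ("x-forwarded-", "x-real-ip")

def sanitize_headers(headers: dict[str, str]) -> dict[str, str]:
    out = {}
    for key, value in headers.items():
        k = key.lower()
        if k in DROPPED_EXACT or k.startswith(DROPPED_PREFIXES):
            continue  # reserved key: the gateway drops it server-side
        out[key] = value
    return out
```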

{
  "dry_run": true,
  "strategy_id": "feedback_driven",
  "phase": "auto",
  "weights": { "session": 0.5, "auto": 0.3, "manual": 0.1, "benchmark": 0.1 },
  "candidates": [
    { "provider": "openai", "model": "gpt-5.4-mini", "score": 0.81 },
    { "provider": "anthropic", "model": "claude-sonnet-4", "score": 0.78 }
  ],
  "filtered": [
    { "provider": "openai", "model": "gpt-5.4", "reason": "constraint_max_cost_increase", "score": 0.86 }
  ],
  "would_select": { "provider": "openai", "model": "gpt-5.4-mini" },
  "reason": "dispatched",
  "confidence": 0.78,
  "confidence_reason": "ok",
  "exploration_rate_effective": 0.05,
  "used_shared_pool_prior": false,
  "evidence": {
    "samples": 412,
    "top2_score_gap": 0.07,
    "outcome_variance": 0.04,
    "recent_regressions": { "kind": "exact", "exact": 1 },
    "last_regression_at": "2026-05-04T18:25:00Z"
  },
  "explanation": {
    "text": "Floopy would route this request to openai/gpt-5.4-mini based on 412 historical samples and a moderate confidence of 0.78. The next candidate scored within 0.07 points and outcome variance has been stable. One regression was recorded in the last 7 days.",
    "template_id": "feedback_driven_moderate_confidence"
  }
}

| Field | Type | Description |
| --- | --- | --- |
| dry_run | boolean | Always true. |
| strategy_id | string | Strategy that ran (feedback_driven, smart_cost, etc.). |
| phase | string \| null | Feedback-Driven phase. |
| weights | object \| null | Signal weights (Feedback-Driven only). |
| candidates | array | Considered models and their scores. |
| filtered | array | Removed candidates. reason is a closed enum (see GET /v1/decisions/{request_id}). |
| would_select | object \| null | The (provider, model) the gateway would dispatch to. null when the firewall blocked the request. |
| reason | string | dispatched, exhausted, or no_enabled_targets. |
| confidence | number \| null | See Confidence methodology. |
| confidence_reason | string \| null | Closed enum: ok, cap_day0, cap_shared, no_router_invoked, insufficient_samples, single_candidate. |
| exploration_rate_effective | number | Effective exploration rate. |
| used_shared_pool_prior | boolean | Whether cross-tenant priors influenced the score. |
| evidence | object \| null | Inputs that drove the confidence number, surfaced for audit (Feedback-Driven and Smart-Cost only). null when no router was invoked. See Confidence methodology. |
| evidence.samples | integer | Rolling sample count. |
| evidence.top2_score_gap | number | Score gap between winner and runner-up. |
| evidence.outcome_variance | number | Rolling outcome variance for the winner. |
| evidence.recent_regressions | tagged union | Bucketed regression count over the winner's (provider, model) in the last 7 days. kind: "exact" with an exact field for n < 10; kind: "at_least" with an at_least field at the 10 and 50 thresholds. |
| evidence.last_regression_at | ISO 8601 \| null | Most recent regression in the 7-day window, rounded to the nearest 5-minute boundary. null when none. |
| explanation | object \| null | Human-readable explanation of the dry-run decision. See Decision Explanation. |
| explanation.text | string | Plain-prose paragraph (≤ 600 chars). Never references prompt content. |
| explanation.template_id | string | Closed-enum template id. |
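Because evidence.recent_regressions is a tagged union, consumers should branch on its kind field rather than assume a plain integer. A defensive reading sketch; the helper name is ours:

```python
def regressions_lower_bound(recent: dict) -> int:
    """Minimum regression count implied by the documented bucket shapes.

    kind == "exact"    -> the exact count (n < 10)
    kind == "at_least" -> the threshold (10 or 50)
    """
    kind = recent.get("kind")
    if kind == "exact":
        return int(recent["exact"])
    if kind == "at_least":
        return int(recent["at_least"])
    raise ValueError(f"unknown recent_regressions kind: {kind!r}")
```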

The response carries:

  • Content-Language: en or Content-Language: pt — the locale used to render explanation.text. Resolved from the request’s Accept-Language header, falling back to en. See Decision Explanation.

The LLM firewall runs on the dry-run path. If it blocks, the response is still 200, with:

{
  "filtered": [
    { "provider": "...", "model": "...", "reason": "firewall_blocked", "score": 0.0 }
  ],
  "would_select": null,
  "reason": "no_enabled_targets"
}

This is intentional — the dry-run faithfully reflects what would happen in production, including the firewall.
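In a CI check it is worth distinguishing a firewall block from an ordinary selection, since both return 200. Per the shape above, a block shows up as would_select: null plus a firewall_blocked entry in filtered. A small detection sketch (function name is ours):

```python
def firewall_blocked(resp: dict) -> bool:
    # A blocked dry-run still returns HTTP 200; detect it from the body.
    if resp.get("would_select") is not None:
        return False
    return any(f.get("reason") == "firewall_blocked"
               for f in resp.get("filtered", []))
```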

| Status | Error code | When |
| --- | --- | --- |
| 400 | invalid_body | Body fails schema validation or is not JSON. |
| 400 | body_too_large | Body exceeds 64 KiB. |
| 403 | write_permission / plan_required | Permission or plan flag missing. |
| 404 | no_route | No routing rule applies to the org and model. |
| 429 | rate_limited | Exceeded per-org or per-key budget. Carries Retry-After. |
| 5xx | internal | Upstream failure. |
curl -s -X POST \
  -H "Authorization: Bearer $FLOOPY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "request": {
      "model": "gpt-5.4",
      "messages": [{"role": "user", "content": "Summarise this incident report in 3 bullets."}],
      "temperature": 0.3
    },
    "headers": {"floopy-session-id": "sess_demo_1"}
  }' \
  "https://api.floopy.ai/v1/routing/explain" | jq .
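The same call from Python, split into a request builder and a CI-style assertion on the parsed body so the pure parts are testable. Stdlib only; the helper names are ours:

```python
import json
import urllib.request

EXPLAIN_URL = "https://api.floopy.ai/v1/routing/explain"

def build_explain_request(api_key: str, body: dict) -> urllib.request.Request:
    # Same payload and headers as the curl example above.
    return urllib.request.Request(
        EXPLAIN_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def assert_routes_to(resp: dict, provider: str, model: str) -> None:
    # CI check: fail loudly if the dry-run would not pick the expected model.
    selected = resp.get("would_select") or {}
    assert (selected.get("provider"), selected.get("model")) == (provider, model), \
        f"unexpected route: {selected!r}"
```

To execute, pass the built request to urllib.request.urlopen and json-decode the response body before calling assert_routes_to.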

  • 30 requests / minute / organization (floopy_explain_rpm_budget, default).
  • 10 requests / minute / API key.

Both windows are evaluated atomically: if either budget is exhausted, the request is rejected with 429.

The dual cap exists so a leaked key cannot probe routing internals at high cadence.
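Clients that issue many explain calls (for example a CI matrix) should back off on 429 using the Retry-After header the endpoint returns. A minimal sketch with the sender injected so it can be exercised without the network; the names are ours:

```python
import time

def post_with_backoff(send, max_attempts: int = 3, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send() must return (status, headers, body); on 429 we honor the
    Retry-After header (seconds, defaulting to 1) before retrying.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(float(headers.get("Retry-After", 1)))
    return status, body
```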