POST /v1/routing/explain
Dry-run the routing selection for a candidate request body. The gateway runs the full strategy chain — including the LLM firewall — and returns the candidates, weights, the model that would have been selected, and the router’s confidence. No provider is called. No decision_trace row is emitted. No rate-limit budget is spent on the chat path.
Designed for CI checks (“does this prompt route the way we expect?”), pre-sale demos, and constraint-tuning experiments.
Endpoint

    POST https://api.floopy.ai/v1/routing/explain

Authentication

    Authorization: Bearer <your-floopy-api-key>
    Content-Type: application/json

- Permission: write_permission. The endpoint accepts a JSON payload (hence write), but it is non-mutating: no provider call, no log row, no counter increment.
- Pro plan (requires Feedback-Driven routing).
- We are rolling this out per organization while we validate quality. Contact support if your organization is not yet enabled.
Request body
Body cap: 64 KiB. Larger payloads return 400 body_too_large.

    {
      "request": {
        "model": "gpt-5.4",
        "messages": [
          { "role": "user", "content": "Summarise this incident report in 3 bullets." }
        ],
        "temperature": 0.3
      },
      "headers": { "floopy-session-id": "sess_demo_1" }
    }

| Field | Type | Required | Constraints |
|---|---|---|---|
| request | object | Yes | A standard OpenAI-compatible chat-completion body (same shape as POST /v1/chat/completions). |
| headers | object | No | A subset of headers to forward to the routing context. Reserved keys are dropped server-side (see Header sanitisation). |

Unknown top-level keys return 400 invalid_body (the request schema is deny_unknown_fields).
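
Both rejections can be caught client-side before a request is sent. The following is a minimal sketch, not the gateway's validator: the helper name `preflight_explain_body` is hypothetical, and it assumes the 64 KiB cap is measured on the UTF-8 encoded JSON payload.

```python
import json

MAX_BODY_BYTES = 64 * 1024  # 64 KiB body cap stated in this section
ALLOWED_TOP_LEVEL_KEYS = {"request", "headers"}


def preflight_explain_body(body: dict) -> list[str]:
    """Return the problems the gateway would likely reject (empty list if none)."""
    problems = []
    unknown = sorted(set(body) - ALLOWED_TOP_LEVEL_KEYS)
    if unknown:
        problems.append(f"invalid_body: unknown top-level keys {unknown}")
    if not isinstance(body.get("request"), dict):
        problems.append("invalid_body: 'request' is required and must be an object")
    if "headers" in body and not isinstance(body["headers"], dict):
        problems.append("invalid_body: 'headers' must be an object")
    # Assumption: the cap applies to the serialized UTF-8 bytes.
    size = len(json.dumps(body).encode("utf-8"))
    if size > MAX_BODY_BYTES:
        problems.append(f"body_too_large: {size} bytes > {MAX_BODY_BYTES}")
    return problems
```

Running this in CI before calling the endpoint keeps malformed fixtures from spending explain rate-limit budget.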
Header sanitisation
Before the headers map is passed into the dry-run engine, entries whose lowercase key matches any of the following are dropped:
- exact match: authorization, cookie, host
- prefix match: x-forwarded-, x-real-ip
This keeps the explain endpoint from being abused as a header-relay or fingerprinting tool.
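
The filtering rule above can be sketched in a few lines. This is an illustration of the described behaviour, not the server's code: the function name `sanitise_headers` is hypothetical, and keeping the original key casing for surviving entries is an assumption.

```python
EXACT_DROP = {"authorization", "cookie", "host"}
PREFIX_DROP = ("x-forwarded-", "x-real-ip")


def sanitise_headers(headers: dict[str, str]) -> dict[str, str]:
    """Drop reserved headers, matching on the lowercased key."""
    kept = {}
    for key, value in headers.items():
        lowered = key.lower()
        # str.startswith accepts a tuple, so one call covers every prefix.
        if lowered in EXACT_DROP or lowered.startswith(PREFIX_DROP):
            continue
        kept[key] = value  # assumption: surviving keys keep their original casing
    return kept
```

Applying the same filter client-side makes it clear which entries of the headers map will actually reach the routing context.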
Response (200)

    {
      "dry_run": true,
      "strategy_id": "feedback_driven",
      "phase": "auto",
      "weights": { "session": 0.5, "auto": 0.3, "manual": 0.1, "benchmark": 0.1 },
      "candidates": [
        { "provider": "openai", "model": "gpt-5.4-mini", "score": 0.81 },
        { "provider": "anthropic", "model": "claude-sonnet-4", "score": 0.78 }
      ],
      "filtered": [
        { "provider": "openai", "model": "gpt-5.4", "reason": "constraint_max_cost_increase", "score": 0.86 }
      ],
      "would_select": { "provider": "openai", "model": "gpt-5.4-mini" },
      "reason": "dispatched",
      "confidence": 0.78,
      "confidence_reason": "ok",
      "exploration_rate_effective": 0.05,
      "used_shared_pool_prior": false,
      "evidence": {
        "samples": 412,
        "top2_score_gap": 0.07,
        "outcome_variance": 0.04,
        "recent_regressions": { "kind": "exact", "exact": 1 },
        "last_regression_at": "2026-05-04T18:25:00Z"
      },
      "explanation": {
        "text": "Floopy would route this request to openai/gpt-5.4-mini based on 412 historical samples and a moderate confidence of 0.78. The next candidate scored within 0.07 points and outcome variance has been stable. One regression was recorded in the last 7 days.",
        "template_id": "feedback_driven_moderate_confidence"
      }
    }

The response also carries a Content-Language header echoing the locale used to render explanation.text. See the Decision Explanation feature.

| Field | Type | Description |
|---|---|---|
| dry_run | boolean | Always true. |
| strategy_id | string | Strategy that ran (feedback_driven, smart_cost, etc.). |
| phase | string \| null | Feedback-Driven phase. |
| weights | object \| null | Signal weights (Feedback-Driven only). |
| candidates | array | Considered models and their scores. |
| filtered | array | Removed candidates. reason is a closed enum (see GET /v1/decisions/{request_id}). |
| would_select | object \| null | The (provider, model) the gateway would dispatch to. null when the firewall blocked the request. |
| reason | string | dispatched, exhausted, or no_enabled_targets. |
| confidence | number \| null | See Confidence methodology. |
| confidence_reason | string \| null | Closed enum: ok, cap_day0, cap_shared, no_router_invoked, insufficient_samples, single_candidate. |
| exploration_rate_effective | number | Effective exploration rate. |
| used_shared_pool_prior | boolean | Whether cross-tenant priors influenced the score. |
| evidence | object \| null | Inputs that drove the confidence number, surfaced for audit (Feedback-Driven and Smart-Cost only). null when no router was invoked. See Confidence methodology. |
| evidence.samples | integer | Rolling sample count. |
| evidence.top2_score_gap | number | Score gap between winner and runner-up. |
| evidence.outcome_variance | number | Rolling outcome variance for the winner. |
| evidence.recent_regressions | tagged union | Bucketed regression count for the winner’s (provider, model) over the last 7 days. kind: "exact" with an exact field when n < 10; kind: "at_least" with an at_least field at the 10 and 50 thresholds. |
| evidence.last_regression_at | ISO8601 \| null | Most recent regression in the 7-day window, rounded to the nearest 5-minute boundary. null when none. |
| explanation | object \| null | Human-readable explanation of the dry-run decision. See Decision Explanation. |
| explanation.text | string | Plain-prose paragraph (≤ 600 chars). Never references prompt content. |
| explanation.template_id | string | Closed-enum template id. |

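
For a CI check, the fields above are usually enough to assert on. The sketch below shows one way to read them; the helper names `regression_floor` and `check_route` are hypothetical, and the thresholds passed in are the caller's choice rather than anything the gateway prescribes.

```python
def regression_floor(recent: dict) -> int:
    """Lower bound on the regression count encoded by the tagged union."""
    kind = recent["kind"]
    if kind == "exact":
        return recent["exact"]
    if kind == "at_least":
        return recent["at_least"]
    raise ValueError(f"unknown recent_regressions kind: {kind!r}")


def check_route(response: dict, expected_model: str, min_confidence: float) -> list[str]:
    """Return human-readable failures; an empty list means the check passed."""
    failures = []
    selected = response.get("would_select")
    if selected is None:
        failures.append(f"no model selected (reason={response.get('reason')})")
    elif selected["model"] != expected_model:
        failures.append(f"expected {expected_model}, got {selected['model']}")
    confidence = response.get("confidence")
    if confidence is None:
        reason = response.get("confidence_reason")
        failures.append(f"no confidence (confidence_reason={reason})")
    elif confidence < min_confidence:
        failures.append(f"confidence {confidence} below {min_confidence}")
    return failures
```

Handling null for would_select and confidence explicitly matters because both are nullable in the schema, so a bare comparison would raise instead of producing a readable failure.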
Response headers
The response carries:
- Content-Language: en or Content-Language: pt, the locale used to render explanation.text. Resolved from the request’s Accept-Language header, falling back to en. See Decision Explanation.
Firewall behaviour
The LLM firewall runs on the dry-run path. If it blocks, the response is still 200, with:

    {
      "filtered": [
        { "provider": "...", "model": "...", "reason": "firewall_blocked", "score": 0.0 }
      ],
      "would_select": null,
      "reason": "no_enabled_targets"
    }

This is intentional: the dry-run faithfully reflects what would happen in production, including the firewall.
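
Because a firewall block arrives as a 200, callers cannot rely on the status code to detect it. A minimal sketch of a check on the parsed body, assuming the response shape shown in this section (the function name `firewall_blocked` is hypothetical):

```python
def firewall_blocked(response: dict) -> bool:
    """True when a dry-run response reports a firewall block (still HTTP 200)."""
    if response.get("would_select") is not None:
        return False
    # A block shows up as a filtered entry with reason "firewall_blocked".
    return any(
        entry.get("reason") == "firewall_blocked"
        for entry in response.get("filtered", [])
    )
```

Checking the filtered reason as well as would_select distinguishes a firewall block from other cases where nothing was selected.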
Errors
| Status | error code | When |
|---|---|---|
| 400 | invalid_body | Body fails schema validation or is not JSON. |
| 400 | body_too_large | Body exceeds 64 KiB. |
| 403 | write_permission / plan_required | Permission or plan flag missing. |
| 404 | no_route | No routing rule applies to the org and model. |
| 429 | rate_limited | Exceeded per-org or per-key budget. Carries Retry-After. |
| 5xx | internal | Upstream failure. |

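
One way a client might act on the table above is to retry only 429 and 5xx responses. The sketch below is an assumption-heavy illustration: the function name `retry_delay_seconds` is hypothetical, it assumes Retry-After is sent as whole seconds, and the one-second fallback is an arbitrary choice, not a documented value.

```python
def retry_delay_seconds(status: int, headers: dict, default: float = 1.0):
    """Return how long to wait before retrying, or None if a retry won't help."""
    if status == 429:
        # Assumption: Retry-After carries delta-seconds (e.g. "12").
        raw = str(headers.get("Retry-After", "")).strip()
        return float(raw) if raw.isdigit() else default
    if 500 <= status <= 599:
        return default
    # 400, 403 and 404 describe the request itself; resending it unchanged fails again.
    return None
```

Returning None for the 4xx codes other than 429 keeps a CI job from looping on a payload or permission problem that needs a human fix.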
Curl example

    curl -s -X POST \
      -H "Authorization: Bearer $FLOOPY_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "request": {
          "model": "gpt-5.4",
          "messages": [{"role":"user","content":"Summarise this incident report in 3 bullets."}],
          "temperature": 0.3
        },
        "headers": {"floopy-session-id":"sess_demo_1"}
      }' \
      "https://api.floopy.ai/v1/routing/explain" | jq .

Rate limits
Both windows are evaluated atomically; if either denies, the request is rejected with 429.
- 30 requests / minute / organization (floopy_explain_rpm_budget, default).
- 10 requests / minute / API key.
The dual cap exists so a leaked key cannot probe routing internals at high cadence.
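
The interaction of the two caps can be modelled with a small in-memory limiter. This is a sketch for reasoning about client behaviour, not the gateway's implementation: the class name `DualWindowLimiter` is hypothetical, fixed one-minute windows are an assumption (the window type is not specified here), and so is the choice that a denied request consumes no budget.

```python
from collections import defaultdict

ORG_LIMIT = 30  # requests / minute / organization (default)
KEY_LIMIT = 10  # requests / minute / API key


class DualWindowLimiter:
    """Toy model of the dual cap; counters are never evicted."""

    def __init__(self):
        self._counts = defaultdict(int)  # (scope, id, minute) -> requests seen

    def allow(self, org_id: str, key_id: str, now_seconds: float) -> bool:
        minute = int(now_seconds // 60)  # assumption: fixed one-minute windows
        org_slot = ("org", org_id, minute)
        key_slot = ("key", key_id, minute)
        # Check both windows before touching either counter, so a denial
        # by one window never spends budget in the other.
        if self._counts[org_slot] >= ORG_LIMIT or self._counts[key_slot] >= KEY_LIMIT:
            return False
        self._counts[org_slot] += 1
        self._counts[key_slot] += 1
        return True
```

Under this model a single key exhausts its own budget after 10 calls in a minute, while three busy keys together exhaust the organization's 30.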
See also
- GET /v1/decisions/{request_id}: same shape for real requests.
- GET /v1/constraints: set the constraints the dry-run respects.
- Confidence methodology.