Headers Reference
Floopy uses custom HTTP headers to control gateway behavior on a per-request basis. Pass these headers alongside your standard Authorization and Content-Type headers when calling the gateway.
Request Headers
Cache Headers
Control how the gateway caches requests and responses.
| Header | Type | Description | Example |
|---|---|---|---|
| Floopy-Cache-Enabled | boolean | Enable or disable exact cache for this request. | true |
| Floopy-Cache-Seed | string | Isolates exact cache entries by seed value. Semantic and advanced tiers match by embedding similarity and are not affected by the seed. | deterministic-seed-abc |
| Floopy-Cache-Bucket-Max-Size | integer | Maximum number of cached responses stored per cache key. | 3 |
| Floopy-Cache-Ignore-Keys | string | Comma-separated list of message keys to ignore when computing the cache key. | timestamp,request_id |
| floopy-cache-advanced | boolean | Enable advanced semantic cache (Qdrant-backed). Matches requests by meaning rather than exact content. | true |
| cache-control | string | Standard HTTP cache-control header. Used to override the default TTL. | max-age=3600 |
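As a rough sketch, the cache headers above could be assembled with a small helper like the one below. The function name `cache_headers` and its parameters are hypothetical, and encoding booleans as the lowercase strings `true` and `false` is an assumption based on the example column.

```python
def cache_headers(enabled=True, seed=None, bucket_max_size=None,
                  ignore_keys=(), advanced=False, max_age=None):
    """Build Floopy cache headers for one request (hypothetical helper).

    HTTP header values are strings, so every value is converted here.
    """
    # Assumption: booleans are sent as lowercase "true" / "false".
    headers = {"Floopy-Cache-Enabled": "true" if enabled else "false"}
    if seed is not None:
        headers["Floopy-Cache-Seed"] = seed
    if bucket_max_size is not None:
        headers["Floopy-Cache-Bucket-Max-Size"] = str(bucket_max_size)
    if ignore_keys:
        # Comma-separated list, as described in the table.
        headers["Floopy-Cache-Ignore-Keys"] = ",".join(ignore_keys)
    if advanced:
        headers["floopy-cache-advanced"] = "true"
    if max_age is not None:
        # Standard cache-control syntax; overrides the default TTL.
        headers["cache-control"] = f"max-age={max_age}"
    return headers


headers = cache_headers(
    seed="deterministic-seed-abc",
    bucket_max_size=3,
    ignore_keys=["timestamp", "request_id"],
    max_age=3600,
)
```

Merge the returned dictionary with your Authorization and Content-Type headers before sending the request.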
Prompt Headers
Reference managed prompts from the Floopy prompt library.
| Header | Type | Description | Example |
|---|---|---|---|
| floopy-prompt-id | string (UUID) | The UUID of a prompt from the prompt library. The gateway resolves and injects it at request time. | e51a2820-8ab5-4d6a-96a0-cc7bb4759371 |
| floopy-prompt-version | integer | Pin a specific prompt version. If omitted, the gateway uses the latest version. | 2 |
Body Field: inputs
When using a managed prompt with floopy-prompt-id, you can pass an inputs object in the JSON request body to fill template variables. The gateway substitutes {{key}} placeholders in the prompt template with the corresponding values from inputs.
    {
      "model": "gpt-4o",
      "messages": [{"role": "user", "content": "placeholder"}],
      "inputs": {
        "language": "English",
        "topic": "quantum computing"
      }
    }

Resolution order: inputs values take priority, then template defaults from the prompt config, then the placeholder is left as-is. The inputs field is stripped before the request reaches the LLM provider.
See the Prompt Management guide for full examples.
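The resolution order for template variables can be illustrated with a minimal sketch. This is not the gateway's implementation: the function names, the `defaults` argument, and the assumption that placeholder keys consist of word characters only are all hypothetical.

```python
import re

# Assumption: placeholders look like {{key}} with no inner whitespace.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_prompt(template, inputs, defaults):
    """Fill {{key}} placeholders: inputs first, then defaults, else unchanged."""
    def resolve(match):
        key = match.group(1)
        if key in inputs:
            return str(inputs[key])
        if key in defaults:
            return str(defaults[key])
        return match.group(0)  # leave the placeholder as-is

    return PLACEHOLDER.sub(resolve, template)


def strip_inputs(body):
    """Return a copy of the request body without the inputs field."""
    return {k: v for k, v in body.items() if k != "inputs"}


rendered = render_prompt(
    "Explain {{topic}} in {{language}} for a {{audience}} at {{level}} level.",
    inputs={"language": "English", "topic": "quantum computing"},
    defaults={"language": "French", "level": "beginner"},
)
# rendered == "Explain quantum computing in English for a {{audience}} at beginner level."
```

Here `language` comes from inputs even though a default exists, `level` falls back to the default, and `audience` stays unresolved.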
Security Headers
Enable the LLM firewall to scan prompts for injection attacks and unsafe content.
| Header | Type | Description | Example |
|---|---|---|---|
| floopy-llm-security-enabled | boolean | Run the LLM firewall for this request. The firewall sends the prompt to a safety-tuned model (configured via FIREWALL_MODEL) and blocks any prompt classified as unsafe. A Qdrant verdict cache short-circuits repeat unsafe prompts. | true |
Token Handling Headers (Coming Soon)
Control how the gateway handles requests that exceed a model’s context window. This feature is planned and not yet available.
| Header | Type | Description | Example |
|---|---|---|---|
| Floopy-Token-Limit-Exception-Handler | string | Strategy to apply when the request exceeds the model’s token limit. | truncate |
The three planned strategies are:
- truncate: Removes messages from the beginning of the conversation to fit within the model’s token limit. The system message and the most recent messages are preserved.
- middle-out: Keeps the first and last messages in the conversation and removes messages from the middle. Useful when both the initial context and the latest user message are important.
- fallback: Switches to a model with a larger context window instead of modifying the messages. The gateway selects an appropriate model from the same provider.
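Because this feature is not yet available, the following is only an illustrative sketch of how the truncate and middle-out strategies might behave. The function names are hypothetical, and `count_tokens` is a crude word-count stand-in for a real tokenizer.

```python
def count_tokens(message):
    # Stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(message["content"].split())


def total_tokens(messages):
    return sum(count_tokens(m) for m in messages)


def truncate(messages, limit):
    """Drop the oldest non-system messages until the conversation fits."""
    kept = list(messages)
    while total_tokens(kept) > limit:
        droppable = [i for i, m in enumerate(kept) if m["role"] != "system"]
        if len(droppable) <= 1:
            break  # always keep the most recent non-system message
        del kept[droppable[0]]
    return kept


def middle_out(messages, limit):
    """Keep the first and last messages; remove messages from the middle."""
    kept = list(messages)
    while total_tokens(kept) > limit and len(kept) > 2:
        del kept[len(kept) // 2]  # never index 0 or the last index
    return kept


conversation = [
    {"role": "system", "content": "You are terse."},
    {"role": "user", "content": "one two three four"},
    {"role": "assistant", "content": "five six seven"},
    {"role": "user", "content": "eight nine"},
]
trimmed = truncate(conversation, limit=8)      # drops the first user message
squeezed = middle_out(conversation, limit=8)   # keeps only first and last
```

Both functions return a new list and leave the original conversation untouched. Either one may still exceed the limit if the protected messages alone are too long.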
Routing Headers
Override the default routing behavior for a single request.
| Header | Type | Description | Example |
|---|---|---|---|
| floopy-model-override | string | Override the model without changing the request body. | gpt-4o-mini |
| floopy-routing-rule | string (UUID) | Override the routing rule applied to this request. | a3f1b2c4-5678-49de-8f01-23456789abcd |
| floopy-ab-test | string (UUID) | A/B test ID. The gateway resolves the assigned variant for this request. | b7e2c3d4-1234-5678-abcd-ef0123456789 |
| floopy-smart-select | string (UUID) | Smart Selector ID. The gateway picks the best model based on the selector’s configuration. | c8f3d4e5-2345-6789-bcde-f01234567890 |
Rate Limit Headers
Override the default rate limit policy for a single request.
| Header | Type | Description | Example |
|---|---|---|---|
| floopy-ratelimit-policy | string | Custom rate limit policy for this request. | 100;w=60;u=request;s=global |
The policy format is <limit>;w=<window>;u=<unit>;s=<segment>:
- limit: The maximum number of allowed units within the window.
- w (window): Time window in seconds. The minimum window is 60 seconds.
- u (unit): What to count, either request (number of requests) or cents (cost in cents).
- s (segment): How to segment the limit, one of global (shared across all users), user (per end-user via floopy-user-id), or custom (per custom key).
Example: 100;w=60;u=request;s=global means 100 requests per 60 seconds, applied globally.
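To make the policy format concrete, here is a minimal sketch of a client-side parser and formatter. `RateLimitPolicy` and `parse_policy` are hypothetical names, and the validation rules are taken only from the field descriptions above. The gateway may validate differently.

```python
from dataclasses import dataclass

UNITS = {"request", "cents"}
SEGMENTS = {"global", "user", "custom"}


@dataclass(frozen=True)
class RateLimitPolicy:
    limit: int
    window: int   # seconds
    unit: str
    segment: str

    def to_header(self):
        """Serialize as <limit>;w=<window>;u=<unit>;s=<segment>."""
        return f"{self.limit};w={self.window};u={self.unit};s={self.segment}"


def parse_policy(value):
    """Parse a floopy-ratelimit-policy value and check the documented rules."""
    limit_part, *params = value.split(";")
    # Each remaining part is key=value; a missing w, u, or s raises KeyError.
    fields = dict(p.split("=", 1) for p in params)
    policy = RateLimitPolicy(
        limit=int(limit_part),
        window=int(fields["w"]),
        unit=fields["u"],
        segment=fields["s"],
    )
    if policy.window < 60:
        raise ValueError("window must be at least 60 seconds")
    if policy.unit not in UNITS:
        raise ValueError(f"unknown unit: {policy.unit}")
    if policy.segment not in SEGMENTS:
        raise ValueError(f"unknown segment: {policy.segment}")
    return policy


policy = parse_policy("100;w=60;u=request;s=global")
header_value = policy.to_header()  # round-trips to the original string
```

Validating the policy before sending it surfaces formatting mistakes locally instead of relying on the gateway's response.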
Project Scoping
Segment requests by project for per-project cost tracking and analytics.
| Header | Type | Description | Example |
|---|---|---|---|
| floopy-project-id | string (UUID) | Project identifier. Tags the request with a project for cost allocation and dashboard filtering. If the API key is hard-locked to a project, this header is optional. A mismatched UUID returns 403. | a1b2c3d4-5678-9abc-def0-123456789abc |
See the Projects guide for the full fallback chain and environment model.
Session and Property Headers
Attach session metadata and custom properties to requests for tracking and analytics.
| Header | Type | Description | Example |
|---|---|---|---|
| floopy-user-id | string | End-user identifier. Used for per-user rate limiting and analytics. | user-alice-001 |
| floopy-session-id | string | Session identifier. Groups related requests together. | sess-abc123 |
| floopy-session-name | string | Human-readable session name for display in the dashboard. | math-tutoring |
| floopy-session-path | string | Session path or location within your application. | /dashboard/math |
| floopy-property-* | string | Custom property header. Any suffix after floopy-property- becomes the property key. | floopy-property-usertier: premium |
Custom properties appear in the observability dashboard and can be used to filter and group requests.
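One way to attach custom properties is to expand a dictionary into floopy-property-* headers, as in the sketch below. The helper name `property_headers` is hypothetical. Lowercasing keys and restricting them to letters, digits, and hyphens are assumptions made here to keep the header names valid, not documented gateway rules.

```python
import re

# Assumption: restrict keys to characters that are safe in header names.
VALID_KEY = re.compile(r"[A-Za-z0-9-]+")


def property_headers(properties):
    """Turn {"usertier": "premium"} into {"floopy-property-usertier": "premium"}."""
    headers = {}
    for key, value in properties.items():
        if not VALID_KEY.fullmatch(key):
            raise ValueError(f"invalid property key: {key!r}")
        # Header names are case-insensitive, so lowercasing is harmless.
        headers[f"floopy-property-{key.lower()}"] = str(value)
    return headers


headers = {
    "floopy-user-id": "user-alice-001",
    "floopy-session-id": "sess-abc123",
    **property_headers({"usertier": "premium", "Region": "eu"}),
}
```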
Response Headers
The gateway adds these headers to every response. They provide metadata about how the request was processed.
| Header | Description | Example |
|---|---|---|
| Floopy-Provider | The provider that handled the request. | OpenAI |
| Floopy-Model | The model that processed the request. | gpt-4o |
| Floopy-Fallback-Used | Whether a fallback provider was used because the primary was unavailable. | true |
| Floopy-Reasoning-Tokens | Number of reasoning tokens used (DeepSeek models). | 150 |
| Floopy-Queue-Time | Time the request spent in the provider queue, in seconds (Groq). | 0.5 |
| Floopy-Prompt-Time | Time spent processing the prompt, in seconds (Groq). | 0.2 |
| Floopy-Completion-Time | Time spent generating the completion, in seconds (Groq). | 1.3 |
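A client might collect these response headers into typed values along the following lines. The function name `read_gateway_metadata` and the result keys are hypothetical. The sketch also assumes that provider-specific headers (the DeepSeek and Groq ones) can be absent, and returns `None` for them in that case.

```python
def read_gateway_metadata(headers):
    """Extract Floopy response headers from a mapping of header names to values."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def optional(name, cast):
        raw = lowered.get(name)
        return cast(raw) if raw is not None else None

    return {
        "provider": lowered.get("floopy-provider"),
        "model": lowered.get("floopy-model"),
        # Assumption: a missing header means no fallback was used.
        "fallback_used": lowered.get("floopy-fallback-used", "false").lower() == "true",
        "reasoning_tokens": optional("floopy-reasoning-tokens", int),
        "queue_time_s": optional("floopy-queue-time", float),
        "prompt_time_s": optional("floopy-prompt-time", float),
        "completion_time_s": optional("floopy-completion-time", float),
    }


meta = read_gateway_metadata({
    "Floopy-Provider": "OpenAI",
    "Floopy-Model": "gpt-4o",
    "Floopy-Fallback-Used": "true",
    "Floopy-Queue-Time": "0.5",
})
```

Most HTTP clients expose response headers as a mapping, so their headers object can usually be passed in directly.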