Prompt Management
Overview
Floopy lets you manage prompts directly in the dashboard instead of hardcoding them in your application. Create a prompt once, reference it by name in your API calls, and update it at any time without redeploying your code. The gateway resolves the latest version at request time.
Creating Prompts
Go to Prompts in the dashboard and click Create Prompt. Give it a name and write the prompt body. You can create system prompts, user message templates, or full conversation templates.
Prompts support variables using double curly brace syntax:
```
Summarize the following {{document_type}} in {{language}}:
{{content}}
```

When the gateway processes a request that references this prompt, it substitutes the variable values provided in the API call. If a required variable is missing, the gateway returns an error.
Version History
Every time you edit a prompt, Floopy saves the previous version automatically. The version history shows:
- All previous versions with timestamps.
- Who made each change.
- A side-by-side diff comparison between any two versions.
You can roll back to a previous version at any time by selecting it and clicking Restore.
Live Resolution
Prompts are resolved by the gateway at request time. When you update a prompt in the dashboard, all future requests immediately use the new version — there is no cache delay or propagation period. This means you can iterate on prompt quality in production without touching your application code.
Feedback Tracking
Attach feedback to prompt responses to track quality over time. Floopy supports:
- Thumbs up / thumbs down — simple binary feedback from end users or reviewers.
- Custom dimensions — score responses on specific dimensions like accuracy, tone, helpfulness, or any custom criteria you define.
Feedback is linked to the specific prompt version that generated the response, so you can see how quality changes across versions.
Advanced feedback dimensions and analytics are available on the Pro plan and above.
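To make the two feedback types concrete, here is a rough TypeScript sketch of the data they describe. The type and field names are assumptions for illustration only, not Floopy's actual schema or API.

```typescript
// Illustrative only: an assumed shape for the feedback described above.
type PromptFeedback = {
  promptId: string;                      // which managed prompt produced the response
  promptVersion: number;                 // feedback is linked to the exact version
  rating?: "thumbs_up" | "thumbs_down";  // simple binary signal
  dimensions?: Record<string, number>;   // custom dimensions (Pro plan and above)
};

const example: PromptFeedback = {
  promptId: "e51a2820-8ab5-4d6a-96a0-cc7bb4759371",
  promptVersion: 2,
  rating: "thumbs_up",
  dimensions: { accuracy: 5, helpfulness: 4 },
};
```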
A/B Testing
Split traffic between two or more prompt versions to compare their performance. Set a percentage split and let Floopy route requests accordingly. Track feedback and metrics per variant to determine which version performs better.
See A/B Testing for full details on configuring experiments and applying winners.
Using Prompts via the Gateway
Instead of including the full prompt in your request body, reference a managed prompt by sending the `floopy-prompt-id` header. The gateway resolves the prompt template and substitutes variables using the `inputs` object in the request body.
| Header | Description |
|---|---|
| `floopy-prompt-id` | The UUID of the prompt to resolve |
| `floopy-prompt-version` | Pin a specific version (omit for latest) |

| Body Field | Description |
|---|---|
| `inputs` | Object with key-value pairs to substitute into `{{key}}` variables |

The `inputs` field is stripped from the request before it reaches the LLM provider — it is only used for variable substitution.
Variable Resolution Order
- `inputs` from the request body — first priority
- Template defaults from the prompt configuration in the dashboard — fallback
- No match — the `{{key}}` placeholder remains in the text as-is
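To make this order concrete, here is a minimal TypeScript sketch that mimics the gateway's behavior locally. The substitution logic and the example default value are illustrative assumptions, not Floopy's implementation.

```typescript
// Illustration only: a local stand-in for the gateway's variable resolution.
// Priority: request `inputs` > dashboard template defaults > leave placeholder as-is.
const template =
  "Summarize the following {{document_type}} in {{language}}:\n{{content}}";

// Hypothetical defaults configured on the prompt in the dashboard.
const templateDefaults: Record<string, string> = { language: "English" };

// Values supplied in the request body's `inputs` object.
const inputs: Record<string, string> = { document_type: "legal contract" };

const resolved = template.replace(/\{\{\s*(\w+)\s*\}\}/g, (placeholder, key) =>
  inputs[key] ?? templateDefaults[key] ?? placeholder, // unmatched keys stay verbatim
);

console.log(resolved);
// Summarize the following legal contract in English:
// {{content}}
```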
Example
Given a prompt template saved in the dashboard:
```
Summarize the following {{document_type}} in {{language}}:
{{content}}
```

Call the API with the `inputs` object to fill in the variables:
```typescript
import { OpenAI } from "openai";

const client = new OpenAI({
  baseURL: "https://api.floopy.ai/v1",
  apiKey: process.env.FLOOPY_API_KEY,
});

const response = await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "placeholder" }],
    inputs: {
      document_type: "legal contract",
      language: "English",
      content: "This agreement is entered into by and between...",
    },
  } as any,
  {
    headers: {
      "floopy-prompt-id": "e51a2820-8ab5-4d6a-96a0-cc7bb4759371",
      "floopy-prompt-version": "2",
    },
  },
);

console.log(response.choices[0].message.content);
```

```python
import os
import httpx

# Using httpx directly since the OpenAI SDK doesn't support "inputs"
response = httpx.post(
    "https://api.floopy.ai/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['FLOOPY_API_KEY']}",
        "Content-Type": "application/json",
        "floopy-prompt-id": "e51a2820-8ab5-4d6a-96a0-cc7bb4759371",
        "floopy-prompt-version": "2",
    },
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "placeholder"}],
        "inputs": {
            "document_type": "legal contract",
            "language": "English",
            "content": "This agreement is entered into by and between...",
        },
    },
)

print(response.json()["choices"][0]["message"]["content"])
```

```bash
curl https://api.floopy.ai/v1/chat/completions \
  -H "Authorization: Bearer $FLOOPY_API_KEY" \
  -H "Content-Type: application/json" \
  -H "floopy-prompt-id: e51a2820-8ab5-4d6a-96a0-cc7bb4759371" \
  -H "floopy-prompt-version: 2" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "placeholder"}],
    "inputs": {
      "document_type": "legal contract",
      "language": "English",
      "content": "This agreement is entered into by and between..."
    }
  }'
```

The gateway replaces `{{document_type}}`, `{{language}}`, and `{{content}}` in the prompt template with the values from `inputs`, then sends the final prompt to the LLM provider.
Using Prompts Without Inputs
If your prompt has no variables, you can omit the inputs field entirely. The gateway resolves the prompt messages and uses them directly:
```typescript
const response = await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "placeholder" }],
  },
  {
    headers: {
      "floopy-prompt-id": "e51a2820-8ab5-4d6a-96a0-cc7bb4759371",
    },
  },
);
```

```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "placeholder"}],
    extra_headers={
        "floopy-prompt-id": "e51a2820-8ab5-4d6a-96a0-cc7bb4759371",
    },
)
```

```bash
curl https://api.floopy.ai/v1/chat/completions \
  -H "Authorization: Bearer $FLOOPY_API_KEY" \
  -H "Content-Type: application/json" \
  -H "floopy-prompt-id: e51a2820-8ab5-4d6a-96a0-cc7bb4759371" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "placeholder"}]
  }'
```

Variable Syntax
The full format for prompt variables is {{ fl:name:type }} where:
- `fl:` is a required prefix that identifies the variable as a Floopy template variable.
- `name` is the variable name (e.g., `user_name`, `topic`, `language`).
- `type` is the data type: `string`, `number`, `boolean`, etc.
Examples:
```
Hello {{ fl:user_name:string }}, please summarize this text at a temperature of {{ fl:temperature:number }}.
Include references: {{ fl:include_refs:boolean }}
```

The prompt editor in the dashboard highlights variables automatically when you type them in this format. If you use the shorthand `{{name}}` syntax (without the `fl:` prefix and type), the gateway treats the variable as a string by default.
Model Configuration
When creating or editing a prompt, you can configure model parameters that are applied whenever the prompt is resolved by the gateway:
- Temperature (0 to 2) — Controls randomness. Lower values produce more deterministic outputs; higher values increase creativity and variation.
- Max Tokens — Maximum number of tokens the model can generate in the response.
- Top P (0 to 1) — Nucleus sampling threshold. The model samples only from the smallest set of tokens whose cumulative probability reaches this threshold.
- Frequency Penalty (-2 to 2) — Reduces repetition by penalizing tokens that have already appeared. Positive values decrease repetition; negative values encourage it.
- Presence Penalty (-2 to 2) — Encourages the model to talk about new topics by penalizing tokens that have appeared at all, regardless of frequency.
- Stop Sequences — Comma-separated strings that cause the model to stop generating when encountered.
- Reasoning Effort — Controls how much computation the model spends reasoning before responding. Options: None, Minimal, Low, Medium, High.
- Response Format — The format of the model’s response: Plain text or JSON mode. JSON mode constrains the output to valid JSON.
These parameters are saved with the prompt version. If you do not set a parameter, the model’s default value is used.
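For orientation, the settings above correspond to the standard sampling parameters most providers expose. A saved configuration might look roughly like the sketch below; the field names follow OpenAI-style parameters and are illustrative, not Floopy's storage format.

```typescript
// Illustrative only: roughly what a prompt version's model configuration covers.
// Field names follow OpenAI-style parameters; this is not Floopy's storage schema.
const modelConfig = {
  temperature: 0.3,          // low randomness for consistent output
  max_tokens: 1024,          // cap on tokens generated in the response
  top_p: 0.9,                // nucleus sampling threshold
  frequency_penalty: 0.5,    // discourage repeating tokens that already appeared
  presence_penalty: 0.0,     // neutral toward introducing new topics
  stop: ["###"],             // stop generating when this sequence appears
  reasoning_effort: "low",   // how much reasoning before responding
  response_format: { type: "json_object" }, // JSON mode; omit for plain text
};
```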
Best Practices
- Use descriptive variable names. Names like `{{customer_name}}` are clearer than `{{x}}` and make prompts easier to maintain.
- Version intentionally. Make one logical change per version so the diff is meaningful and rollbacks are clean.
- Collect feedback early. Even a simple thumbs up/down signal helps you identify regressions quickly.
- Test before deploying. Use the Playground to test prompt changes before making them live.