Skip to content
Get Started
Pricing

Transparent plans.
Scale with confidence.

Start on Free. When you need a real feedback loop, Starter at $29.90 ships the essentials. Pro scales to production. Enterprise covers compliance and isolation.

Free
$0/mo
Explore Floopy with a limited free plan.
Get started
  • 50,000 requests/month
  • 20 providers (OpenAI, Anthropic, Gemini…)
  • Exact cache + basic firewall (LLM Firewall)
  • 7-day log retention
Starter
$29.90/mo
Build something real with your own feedback signal.
Subscribe to Starter
  • 100k requests/month + 1k rpm rate limit
  • Feedback API (500/month)
  • Semantic cache
  • 90-day log retention
Enterprise
Custom
Compliance, isolation, SLA and dedicated support.
Contact us
  • SSO/SAML + SOC-2 + HIPAA
  • SLA + dedicated support (Slack + engineer)
  • Opt-out of shared model
  • Dedicated tenant isolation + custom retention
Savings calculator

What would Floopy pay back?

Estimate based on our median deployment. Actual savings depend on traffic mix and quality tolerance — most teams see 40–65%.

$ 40,000
50%
↓ Strict Balanced Aggressive ↑
5M
You'd save
$22,800
per month, net of Floopy fees

Floopy fee
$4,020/mo
Effective reduction
57%
Compare plans

Everything, side by side.

FeatureFreeStarterProEnterprise
Routing & Optimization
Feedback-driven routing Routes each request based on prior-session NPS, improving model choice as user feedback accumulates. + custom model
Smart selectors Selectors that pick the right model per task class (reasoning, extraction, summarization) without you hardcoding the rule.
Smart Cost Routing Swaps expensive models for cheaper alternatives when observed quality is equivalent, cutting cost without sacrificing performance.
A/B testing & experiments
Regression monitoring
Gateway & Providers
Supported providers20202020
OpenAI-compatible API
Automatic fallback
Rate Limit (RPS / 10s burst / RPM)5 / 50 / 10030 / 300 / 1000100 / 1,000 / 10,000custom
Cache & Performance
Exact cache
Semantic cache
Embeddings cache
LLM Firewall & Security
LLM Firewall
LLM Firewall (advanced)
Custom guardrails
Prompts & Datasets
Versioned prompts
Playground
Prompt experiments
Optimized dataset creator Builds training/eval datasets from real traffic, automatically deduplicating and prioritizing diverse examples.
Observability
Requests included50k100k100k + overagecustom
Log retention7 days90 days2 yearscustom
Alerts & webhooks
Feedback API500/mounlimitedunlimited
Collaboration & Limits
Seats135 (+$10/extra)unlimited
Organizations113unlimited
Tiered overage Tiered overage billing: the more requests you make in a month, the lower the price per extra request.
Enterprise & Compliance
SSO/SAML
SOC-2
HIPAA
Audit logs
SLA

Start saving in under 10 minutes.

Start free Talk to sales
FAQ

Frequently asked questions

The questions we actually get from teams evaluating Floopy.

What happens when I exceed 50,000 requests?

On the Free plan, requests are blocked. On Pro, automatic tiered pricing applies — the more you use, the cheaper per request.

Do I need to install an SDK?

No. Floopy works by changing the baseURL in the OpenAI SDK you already use. One line of code, zero dependencies.

Which AI providers are supported?

20 providers, full list: OpenAI, Anthropic, Google Gemini, Google Vertex AI, AWS Bedrock, Azure OpenAI, DeepSeek, Mistral, xAI, Groq, Cerebras, SambaNova, Together AI, Fireworks AI, Perplexity, Cohere, AI21 Labs, DeepInfra, Nebius, and Novita. All accessible through a single OpenAI-compatible endpoint — point your SDK's baseURL at Floopy and call any model from this list with the same shape. Streaming and tool calls are supported across the list; vision inputs are supported on OpenAI, Anthropic, Gemini, and Bedrock today, and rolling out elsewhere as providers ship the capability.

What about enterprise plans?

Contact us for enterprise needs including compliance, dedicated tenant isolation, and custom SLAs.