Every candidate route ships behind a canary and an eval bar. Regressions roll back in seconds, not sprints.
We started Floopy because our own AI bill scared us.
Floopy is the infrastructure layer between your agent and the providers — so the cheapest model that can do the job, does it. Every time. Without sacrificing what your users feel.
We were running AI features in production and the bill went from "line item" to "largest line item" in a few months. We tried what everyone tries: hand-written routing rules, token limits, aggressive caching, prompt compression. It worked, a little. It also broke constantly — the moment a new model shipped, our rules were stale. The moment a prompt changed, our caches were poisoned. The moment we relaxed anything, quality wobbled.
The pattern we wanted didn't exist as a product: route every call by a learned policy grounded in real user feedback, with canary-safe promotion and fast rollback. So we built it. Floopy is now an OpenAI-compatible AI gateway teams use to move traffic between models without moving their quality bar.
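The "learned policy grounded in real user feedback" idea can be sketched as a toy epsilon-greedy router. Everything below — model names, the reward signal, the exploration rate — is illustrative, not Floopy's actual engine, which the text describes only at a high level:

```python
import random

class ToyRouter:
    """Epsilon-greedy routing sketch: mostly send traffic to the model with
    the best observed reward, but keep exploring so new models get a shot."""

    def __init__(self, models, epsilon=0.1, seed=0):
        self.models = list(models)            # e.g. ["small", "large"]
        self.epsilon = epsilon                # fraction of calls that explore
        self.counts = {m: 0 for m in models}  # calls routed to each model
        self.rewards = {m: 0.0 for m in models}  # cumulative feedback
        self.rng = random.Random(seed)

    def route(self):
        # Explore until every model has been tried, then mostly exploit
        # the best average reward.
        if self.rng.random() < self.epsilon or not all(self.counts.values()):
            return self.rng.choice(self.models)
        return max(self.models, key=lambda m: self.rewards[m] / self.counts[m])

    def feedback(self, model, reward):
        # In practice the reward might blend user thumbs-ups with cost;
        # here it is just a number the caller supplies.
        self.counts[model] += 1
        self.rewards[model] += reward

# Usage: route a call, observe feedback, report it back.
router = ToyRouter(["small", "large"])
model = router.route()
router.feedback(model, reward=0.9)
```

A contextual bandit (the direction the v2.0 engine took, per the timeline below) extends this by conditioning the choice on per-request features instead of a single global average.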
We believe three things, strongly: optimization belongs in the request path, not in a dashboard; every routing decision should be explainable to a non-ML engineer on call at 3am; and teams should only pay for infrastructure that has measurably paid for itself.
Four rules we don't break.
These are the constraints we hold the product to. When something else has to give, these don't.
Floopy is a drop-in base URL for the OpenAI SDK you already ship. Ejecting is a one-line config change — point the baseURL back at OpenAI — not a migration project.
Every routed call has a reason string. Every promotion has a diff. Every rollback has a trace.
Measured baseline vs. post-routing cost is always visible in your dashboard. If we can't move your bill, you'll see it — and so will we.
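The drop-in rule above amounts to a one-line configuration change in the OpenAI SDK. The base URL is the one named in the timeline below; the API key names are placeholders:

```python
from openai import OpenAI

# Point the SDK at the gateway instead of api.openai.com.
client = OpenAI(
    base_url="https://api.floopy.ai/v1",  # Floopy's OpenAI-compatible endpoint
    api_key="YOUR_FLOOPY_KEY",            # placeholder
)

# Ejecting is the reverse one-line change — drop base_url and the SDK
# defaults back to api.openai.com:
# client = OpenAI(api_key="YOUR_OPENAI_KEY")
```

No prompts, request shapes, or response parsing change; only the endpoint the SDK talks to.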
Small by design.
Eleven people. Half infra, half ML, all shipping. We hire slowly and rarely — if you're reading this and it sounds like home, write us anyway.
Previously infra lead at Parallax. Built the original routing engine over one very long weekend.
Ex-Anthropic applied research. Spends most days in the bandit convergence code.
Previously staff at Stripe. Owns the SDK surface and refuses to break it.
Former Vertex research. Wrote the paper the eval harness is based on.
On call when the graphs are green; usually also on call when they're red.
Drove SOC 2 and ISO 27001. Previously at Cloudflare.
Runs the design system & docs. If something looks off, blame her but gently.
Closed deal #1. Still closes most of them.
A short history.
Two years in; roughly a billion calls routed in the last ninety days.
Internal prototype
Routing rules move from a cron job to an actual control plane. We stop breaking our own product.
OpenAI-compatible endpoint
We ship the first public api.floopy.ai/v1. One base-URL swap and routing is live — no SDK to adopt, no prompts to change.
Seed round · $8.4M
Led by Nimbus Capital, with angels from Anthropic, Anyscale, and Cloudflare.
v2.0 routing engine
Contextual bandits ship. Median convergence drops from 7 days to 48 hours.
~900 teams · ~1B calls/mo
Still small. Still hiring selectively. Still shipping every Tuesday.
Patient capital, picked carefully.
Want to work with us?
We're a small team building carefully. Send a note — the best conversations don't start from a listing.