Blog

Feedback-driven routing, session-level feedback, LLM cost optimization, and AI gateway engineering — plus product updates and guides.

engineering 12 min read

Four Signals, One Loop: How Multi-Source Feedback Routing Actually Works

Session NPS, LLM-as-judge, admin rating, and benchmarks each fail alone. Combining all four with dynamic weights is the only honest way to route.

engineering 7 min read

Self-Hosted vs Managed LLMOps: When to Choose Each

Self-hosted LLMOps gives you control. Managed LLMOps gives you cross-tenant intelligence. Here's the honest decision framework — neither is wrong.

product 5 min read

Per-Project Cost Allocation for AI Agents

Break down AI spend by agent, product feature, or team — so you can see which ones are worth the cost and which are quietly eating your budget.

guides 9 min read

How Hackers Exploit LLM APIs (And How to Stop Them)

Common attack vectors targeting LLM APIs — from prompt injection to cost attacks — and how real-time threat detection protects your app.

guides 6 min read

How to Build an Agentic Workflow with Floopy's MCP Integration

Build a production agentic loop with Floopy: plugin YAML config, MCP server connection, secret management, and full workflow testing.

product 4 min read

Floopy Now Supports MCP: Connect Any AI Tool to Your Gateway

Floopy adds Model Context Protocol support — expose your gateway as an MCP server or connect external MCP tools to your agentic workflows.

guides 5 min read

How to Reduce OpenAI API Costs by Up to 70%

Practical strategies to cut your OpenAI API bill — from prompt optimization and caching to model routing and usage monitoring.

engineering 7 min read

How Floopy Protects Your LLM Traffic

A deep dive into the security layers that protect your data, API keys, and prompts as they flow through the Floopy gateway.

product 4 min read

Smart Cost Routing: Cut AI Costs Up to 60%

Smart Cost Routing picks cheaper models for simple prompts, guarded by Floopy's session-level feedback loop. 40-60% savings without compromising quality.

engineering 6 min read

Why Floopy Stays Fast While Optimizing Your Agents

Gateway speed is table stakes now. The real question is whether your routing layer can make agents measurably better over time. Here's how Floopy does both.

engineering 9 min read

Agent Optimization vs AI Gateway: What's the Difference in 2026

Gateways route traffic. Agent optimization platforms learn from production feedback and make routing measurably better. Here's why that distinction matters.

guides 7 min read

How to Choose the Right AI Model for Your App (Hint: Stop Choosing)

Stop picking a model per endpoint. Let a feedback-driven router choose per prompt using session NPS, auto-scores, admin ratings, and public benchmarks.

guides 6 min read

How to Estimate and Control Your AI API Costs

Learn how to predict your OpenAI, Anthropic, or Google AI spending before it surprises you — with formulas, examples, and monitoring tips.

guides 5 min read

How to Protect Your AI App from Prompt Injection

A developer's guide to understanding and preventing prompt injection attacks in LLM-powered applications.

guides 5 min read

How to Cache AI API Requests and Save 30-40%

A practical guide to implementing caching for OpenAI, Claude, and other LLM API calls — from exact matching to semantic caching.