How Hackers Exploit LLM APIs (And How to Stop Them)
Common attack vectors targeting LLM APIs — from prompt injection to cost attacks — and how real-time threat detection protects your app.
Most teams find out their LLM API key was compromised the same way: a cloud bill that makes no sense. By the time the invoice arrives, the damage is done — thousands of dollars burned in hours, sometimes minutes. No alert fired. No dashboard turned red. The attacker was long gone.
LLMs are expensive by design. Every token costs real money. That makes LLM APIs one of the most attractive targets for abuse today. Unlike traditional APIs where a breach might leak data, a compromised LLM endpoint bleeds cash with every single request.
This post breaks down exactly how attackers exploit LLM APIs, what the real costs look like, and how to build a defense that actually works.
The Threat Landscape
Attackers targeting LLM APIs are not using sophisticated zero-day exploits. Most attacks rely on well-known techniques applied to a new surface area. Here is what they go after.
Credential Stuffing and Key Theft
The most common attack vector is also the simplest: stolen API keys. Keys end up exposed in public GitHub repositories, application logs, client-side JavaScript bundles, and .env files accidentally committed to version control. Automated bots scan GitHub in real time for patterns matching API key formats. Once a key is found, it can be exploited within seconds.
What makes this particularly dangerous with LLM APIs is that a single key often grants access to expensive models with no spending cap. Traditional SaaS API keys might let an attacker read some data. An LLM API key lets them generate thousands of dollars in compute costs.
Prompt Injection
Prompt injection is the SQL injection of the AI era. Attackers craft inputs designed to override your system prompt, extract hidden instructions, or manipulate the model into behaving in unintended ways. This can mean leaking your proprietary system prompt, bypassing content safety filters, or tricking the model into executing unauthorized tool calls.
The risk compounds when your LLM has access to tools, databases, or external APIs. A successful injection does not just produce bad output — it can trigger real actions in your system.
Cost Attacks (Denial of Wallet)
This is the attack vector that keeps engineering leads up at night. Instead of trying to take your service down with a DDoS, attackers aim to drain your budget. The playbook is straightforward: send rapid-fire requests using the most expensive model available, set max_tokens to the highest value allowed, and craft prompts designed to produce the longest possible outputs.
Unlike traditional denial-of-service attacks that trigger infrastructure alerts, denial-of-wallet attacks look like legitimate traffic. Each individual request is valid. The model responds normally. The only signal is the rate at which your credit balance drops.
Geographic Anomalies
When a key is stolen, it rarely gets used from the same location as its owner. A development team based in San Francisco suddenly sees requests originating from a country they have no operations in. This is one of the clearest signals of compromise, yet most teams have no way to detect it because standard LLM API providers do not expose geolocation data.
Model Abuse
Not all attacks aim for maximum damage. Some attackers use stolen keys to quietly access expensive models they do not want to pay for — using your gpt-4o access to run their own applications, switching to premium models your application was never intended to use, or running batch workloads during off-hours when nobody is watching.
The Real Cost of an Attack
Let’s do some quick math. A single GPT-4 request with maximum context can cost around $0.30. An attacker running 10,000 requests — which takes minutes with simple scripting — generates roughly $3,000 in charges. Scale that to a full day of undetected abuse and you are looking at six figures.
These are not theoretical numbers. Teams regularly report surprise bills in the thousands after a key leak. The OpenAI community forums and Reddit are full of these stories. And the providers are clear: you are responsible for usage on your key, regardless of who made the requests.
No rate limit means unlimited damage. No geographic restriction means global exposure. No real-time alerting means hours or days of undetected abuse.
Why Traditional Monitoring Fails
If you are relying on your cloud provider’s billing alerts, you are already too late. Here is why traditional approaches fall short:
Billing alerts are delayed. Cloud provider billing dashboards update every few hours at best. Some update daily. By the time an alert fires, the attacker has had a significant head start.
Log-based detection is reactive. Shipping logs to a SIEM and writing rules to detect anomalies is valuable, but it introduces latency. Logs need to be collected, parsed, indexed, and queried. The detection-to-response loop is measured in minutes or hours, not milliseconds.
No LLM-specific intelligence. Generic API monitoring tools do not understand LLM traffic patterns. They cannot distinguish between a legitimate burst of requests and a cost attack. They do not know what max_tokens abuse looks like. They cannot correlate geographic anomalies with model switching patterns.
You need monitoring that is purpose-built for LLM APIs — something that understands the specific ways these endpoints get abused.
A Better Approach: Real-Time Threat Detection
The key insight is that threat detection must happen on every request, in real time, not after the fact in a log pipeline. That is the approach we built into Floopy.
Every request that passes through the gateway increments Redis-based counters with sub-millisecond overhead. Six independent threat detectors run in parallel on every request, analyzing patterns as they happen — not minutes later.
When a threat is detected, the system assigns a severity level (low, medium, high, or critical) based on configurable thresholds. Alerts appear instantly in your dashboard with full context: the offending API key, the threat type, the geographic origin, and the specific pattern that triggered detection.
From the dashboard, you can revoke a compromised key with a single click. No digging through logs. No waiting for the next billing cycle. See the full technical breakdown in our Security Threat Detection documentation.
The 6 Threats Floopy Detects
The gateway monitors for six distinct threat patterns, each targeting a different attack vector:
Rapid Request Volume — Detects abnormal spikes in request frequency from a single API key. A sudden jump from 10 requests per minute to 500 is a strong signal of automated abuse or a compromised key being exploited by a script.
High Error Rate — Monitors the ratio of failed requests to successful ones. A high error rate often indicates credential stuffing attempts, where an attacker is probing your endpoint with variations of stolen or guessed keys.
Cost Spike — Tracks the estimated cost generated by each API key over rolling time windows. A key that normally generates $2/hour suddenly costing $200/hour triggers an immediate alert, even if every individual request looks legitimate.
Geographic Anomaly — Uses GeoIP data to flag requests originating from countries that differ from a key’s normal usage pattern. If your team operates from the US and Germany but a key starts making requests from an unexpected region, you will know immediately.
Model Switching — Detects when an API key suddenly begins requesting a different model than its established pattern. An application configured for gpt-4o-mini that starts hitting gpt-4o may indicate an attacker who has stolen the key and is using it for their own, more expensive workloads.
Max Token Exploitation — Identifies requests that consistently set max_tokens to unusually high values. This is a hallmark of cost attacks, where the goal is to generate the longest (and most expensive) responses possible.
Defense in Depth
No single security measure is sufficient. The most resilient approach layers multiple defenses so that each one catches what the others miss.
Layer 1: Rate Limiting — Your first line of defense against volume-based attacks. Floopy enforces rate limits at the gateway level using atomic Redis operations, ensuring consistency even across horizontally scaled deployments. This prevents the brute-force scenario where an attacker simply floods your endpoint.
Layer 2: LLM Firewall — Prevents prompt injection and content policy violations before they reach the model. Floopy runs a local ONNX-based classifier (Prompt Guard) on every request with minimal latency overhead. For details, see the LLM Firewall documentation.
Layer 3: Threat Detection — The pattern analysis layer described above. Even if an attacker stays within your rate limits and avoids triggering the firewall, threat detection catches the behavioral anomalies — the geographic shift, the model switch, the cost curve bending upward.
Layer 4: Alerts and Auto-Remediation — Detection without response is just expensive logging. The final layer ensures that when a threat is identified, the right people are notified instantly and can take action — revoking keys, tightening limits, or blocking regions — directly from the dashboard.
For a comprehensive walkthrough of setting up all four layers, see our Security Guide.
Getting Started
If you are already using OpenAI, Anthropic, or Google Gemini APIs, adding gateway-level security takes minutes:
1. Route your traffic through Floopy. Change your SDK’s baseURL to point at the gateway. Your application code stays the same — the gateway handles translation and routing transparently.
2. Configure threat detection thresholds. Every application has different traffic patterns. Set the thresholds that make sense for your usage — what counts as a “spike” for a high-traffic production app is different from a development environment.
3. Enable GeoIP detection. Provide a MaxMind GeoLite2 database path and the gateway will automatically enrich every request with geographic data, enabling country-based anomaly detection.
4. Set up alerts. Connect your dashboard notifications so the right team members get alerted when threats are detected.
The full setup guide is available in our documentation.
Conclusion
The economics of LLM APIs have created a new class of security risk. Every unprotected endpoint is a potential liability measured in dollars per second, not just data exposure. The cost of a single incident — a leaked key exploited over a weekend — can dwarf the cost of implementing proper security by orders of magnitude.
Traditional monitoring was not built for this. Billing alerts that fire hours after the fact and log pipelines that detect patterns in batch are not fast enough when an attacker can burn through your budget in minutes.
Real-time, LLM-aware threat detection is not a nice-to-have. For any team running AI in production, it is table stakes. The question is not whether you will face an attack — it is whether you will catch it before the bill arrives.