Security Threat Detection
Overview
The Floopy gateway monitors all API traffic in real time for suspicious patterns. Every request that passes through the gateway is evaluated against a set of threat detectors that track volume anomalies, authentication failures, geographic shifts, cost spikes, firewall violations, and unusual model usage.
When a threat is detected, the system generates an alert in the dashboard with severity classification, contextual metadata, and actionable remediation options. Detection runs asynchronously after request dispatch, so it adds no latency to the request's critical path.
Architecture
The detection system uses a two-layer approach: lightweight in-memory counters for hot detection, and full analytical queries for deep evaluation. This keeps per-request overhead to sub-millisecond levels while still enabling complex pattern analysis.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#5ee3ff', 'lineColor': '#5ee3ff', 'textColor': '#E0E0E0' }}}%%
flowchart TD
A[Incoming Request] --> B[Gateway Processing]
B --> C[Provider Dispatch]
C --> D[Response to Client]
C --> E[Async Post-Dispatch]
E --> F[Update Counters]
F --> G{Threshold Exceeded?}
G -->|No| H[Done]
G -->|Yes| I[Acquire Distributed Lock]
I --> J[Deep Evaluation]
J --> K[Run 6 Detectors in Parallel]
K --> L{Threats Found?}
L -->|No| H
L -->|Yes| M[Dedup Check]
M --> N[Insert Alert]
N --> O[Dashboard Shows Alert]
P[Scheduled Evaluation] --> J
style A fill:#1a1a2e,stroke:#5ee3ff
style C fill:#1a1a2e,stroke:#5ee3ff
style F fill:#0d2137,stroke:#5ee3ff
style J fill:#0d2137,stroke:#5ee3ff
style K fill:#0d2137,stroke:#5ee3ff
style N fill:#1a1a2e,stroke:#5ee3ff
style O fill:#1a1a2e,stroke:#5ee3ff
style P fill:#1a1a2e,stroke:#5ee3ff
Flow Summary
- Post-dispatch: After the gateway dispatches a request to the provider, it spawns an async task to update counters for the organization.
- Hot check: If any counter exceeds its threshold, the system acquires a distributed lock to prevent concurrent evaluations.
- Deep evaluation: With the lock held, the system runs all 6 detectors in parallel, querying historical data for patterns.
- Alert creation: Detected threats are deduplicated against existing open alerts, then inserted as new alerts.
- Scheduled evaluation: A scheduled job can also trigger the same evaluation pipeline on a regular interval, catching threats that accumulate between request-triggered checks.
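The post-dispatch step can be illustrated with a small `asyncio` sketch. This is not the gateway's actual code: `handle_request`, `dispatch`, and `update_counters` are hypothetical names, and the sketch only shows the general pattern of returning the response first and letting detection work run in the background.

```python
import asyncio
from typing import Awaitable, Callable

# Keeps references to in-flight background tasks (hypothetical helper state).
_background_tasks: set = set()


async def handle_request(
    org_id: str,
    dispatch: Callable[[], Awaitable[str]],
    update_counters: Callable[[str], Awaitable[None]],
) -> str:
    """Dispatch to the provider, then schedule counter updates in the background."""
    response = await dispatch()
    # Schedule detection without awaiting it, so the client response is not delayed.
    task = asyncio.create_task(update_counters(org_id))
    # Hold a reference until the task finishes so it is not garbage-collected early.
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return response
```

The response is returned before `update_counters` has run; the event loop picks up the background task afterwards.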
Threat Types
| Type | Detection Window | Threshold | Granularity |
|---|---|---|---|
| Volume Spike | 5 min vs 7-day avg | 3x | Per API key |
| Brute Force | 5 min | 50%+ error rate, min 10 req | Per org |
| Geo Anomaly | 30-day history | New country | Per API key |
| Cost Anomaly | 1 hour vs 7-day hourly avg | 5x | Per org |
| Firewall Threat | 30 min | 10+ blocked requests | Per org |
| Model Switching | 10 min | 5+ unique models | Per API key |
Volume Spike
Compares the current 5-minute request count for each API key against the 7-day average for the same time window. If the current volume exceeds 3x the historical average, an alert is raised. This catches sudden surges that may indicate a compromised key or automated abuse.
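A minimal sketch of this comparison, assuming the baseline is the mean of the request counts observed in the same 5-minute window over the previous 7 days. The function and parameter names are illustrative and not part of the gateway's API.

```python
from statistics import mean


def is_volume_spike(
    current_count: int,
    baseline_counts: list[int],
    multiplier: float = 3.0,
) -> bool:
    """Return True when the current count exceeds multiplier x the historical average."""
    if not baseline_counts:
        # Assumption: without history there is no baseline, so nothing is flagged.
        return False
    baseline = mean(baseline_counts)
    return current_count > multiplier * baseline
```

For example, with a baseline of 100 requests per window, 400 requests would be flagged while 300 would not, since 300 does not exceed 3x the average.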
Brute Force
Monitors the error rate within a 5-minute sliding window. If more than 50% of requests result in authentication or authorization errors and the total request count exceeds 10, the detector flags a brute force attempt. This catches credential stuffing and key enumeration attacks.
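The rule above can be sketched as a single predicate. The names are hypothetical, and the sketch treats both thresholds as strict ("more than 50%", "exceeds 10"), which is an assumption about how boundary values are handled.

```python
def is_brute_force(
    auth_errors: int,
    total_requests: int,
    max_error_rate: float = 0.5,
    min_requests: int = 10,
) -> bool:
    """Flag a window whose auth error rate is above the limit, given enough traffic."""
    if total_requests <= min_requests:
        # Too few requests in the window for the error rate to be meaningful.
        return False
    return auth_errors / total_requests > max_error_rate
```

The minimum request count keeps a handful of failed calls, such as 2 errors out of 3 requests, from raising an alert.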
Geo Anomaly
Tracks the set of countries from which each API key makes requests. If a request originates from a country that has not been seen for that key in the past 30 days, an alert is generated. This detects compromised keys being used from unexpected locations. Requires a GeoIP database (MaxMind GeoLite2).
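As a rough illustration of the lookback check, the sketch below assumes the country code has already been resolved from the client IP and that a per-key map of "country to last seen time" is available. Both the data shape and the function name are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def is_new_country(
    country: str,
    last_seen: dict[str, datetime],
    now: datetime,
    lookback_days: int = 30,
) -> bool:
    """Return True if this key has not been used from `country` within the lookback."""
    cutoff = now - timedelta(days=lookback_days)
    seen_at = last_seen.get(country)
    # Never seen, or seen only before the lookback window, counts as new.
    return seen_at is None or seen_at < cutoff


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
history = {"US": now - timedelta(days=1), "DE": now - timedelta(days=45)}
print(is_new_country("US", history, now))  # False
print(is_new_country("DE", history, now))  # True
```

Note that under this sketch a brand-new key has no history, so its first country would also count as new; how the gateway handles that case is not specified here.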
Cost Anomaly
Compares the current hour’s total cost against the 7-day average hourly cost. If the current cost exceeds 5x the historical average, an alert is raised. This catches unexpected cost spikes from expensive model usage, unusually long completions, or abuse.
Firewall Threat
Counts the number of requests blocked by the LLM Firewall within a 30-minute window. If more than 10 requests are blocked, an alert is raised. Sustained firewall blocks indicate a persistent attacker probing for prompt injection or jailbreak vectors.
Model Switching
Tracks the number of unique models requested by each API key within a 10-minute window. If more than 5 distinct models are used, an alert is generated. Rapid model switching is unusual in normal application usage and may indicate an attacker probing for model-specific vulnerabilities.
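One way to picture this detector is as a distinct count over recent events. The sketch below assumes each event is a `(timestamp, model)` pair for a single API key; the names and the strict "more than 5" comparison are assumptions for illustration.

```python
from datetime import datetime, timedelta


def distinct_models_in_window(
    events: list[tuple[datetime, str]],
    now: datetime,
    window_minutes: int = 10,
) -> set[str]:
    """Collect the distinct model names requested within the trailing window."""
    cutoff = now - timedelta(minutes=window_minutes)
    return {model for ts, model in events if cutoff <= ts <= now}


def is_model_switching(
    events: list[tuple[datetime, str]],
    now: datetime,
    max_models: int = 5,
) -> bool:
    """Flag the key when it used more than `max_models` distinct models recently."""
    return len(distinct_models_in_window(events, now)) > max_models
```

Repeated calls to the same model do not raise the count, so a high-traffic key that sticks to one or two models is unaffected.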
Detection Layers
Layer 1: Hot Detection
The hot detection layer runs after every request as part of the async post-dispatch task, with sub-millisecond overhead. It uses lightweight counters organized in fixed 5-minute windows with self-expiring keys.
- No blocking: Counter updates are fire-and-forget operations.
- Fixed windows: Each window is a discrete 5-minute bucket.
- Self-expiring: All counters have a TTL that ensures automatic cleanup with no garbage collection needed.
- Threshold check: After incrementing, the counter value is compared against the configured threshold. Only when exceeded does the system proceed to deep evaluation.
This layer acts as a cheap filter: the vast majority of requests increment a counter and move on without triggering any further work.
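The fixed-window counter idea can be sketched with an in-memory stand-in for the counter store. Everything here is an assumption for illustration: the `WindowCounters` class, the key layout, and the TTL of two windows are hypothetical, and a real store with native key expiry would remove old keys on its own rather than through the explicit purge shown.

```python
import time
from typing import Optional

WINDOW_SECONDS = 300  # fixed 5-minute buckets


class WindowCounters:
    """In-memory stand-in for a key-value store with per-key TTLs (hypothetical)."""

    def __init__(self, ttl_seconds: int = 2 * WINDOW_SECONDS) -> None:
        self.ttl_seconds = ttl_seconds
        self._counts: dict[str, tuple[int, float]] = {}  # key -> (count, expires_at)

    def increment(self, key_prefix: str, now: Optional[float] = None) -> int:
        """Increment the counter for the current window and return its new value."""
        now = time.time() if now is None else now
        # Drop expired entries; a store with native TTLs would do this on its own.
        self._counts = {k: v for k, v in self._counts.items() if v[1] > now}
        bucket = int(now // WINDOW_SECONDS)
        key = f"{key_prefix}:{bucket}"
        count, expires_at = self._counts.get(key, (0, now + self.ttl_seconds))
        self._counts[key] = (count + 1, expires_at)
        return count + 1


def needs_deep_evaluation(count: int, threshold: int) -> bool:
    """Only a counter above its threshold triggers the expensive evaluation."""
    return count > threshold
```

Because the bucket index is part of the key, a new window starts from zero without any reset step, and old buckets simply age out.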
Layer 2: Deep Evaluation
When hot detection fires, the system acquires a distributed lock and runs the full analytical evaluation. This layer queries historical data and runs all 6 detectors in parallel.
- Historical baselines: Queries pull 7-day averages, 30-day country sets, and hourly cost aggregations.
- Parallel execution: All 6 detectors run concurrently. The evaluation completes when the slowest detector finishes.
- Lock protection: A distributed lock prevents multiple gateway instances from running duplicate evaluations for the same organization simultaneously.
- Deduplication: Before inserting an alert, the system checks for existing open alerts of the same type and organization to avoid duplicate notifications.
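The lock, parallel run, and deduplication steps above can be sketched with `asyncio`. This is a single-process illustration under stated assumptions: an `asyncio.Lock` per organization stands in for the distributed lock, an in-memory set stands in for the table of open alerts, and the choice to skip (rather than wait) when an evaluation is already running is a guess, not documented behavior.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# A detector takes an org ID and returns a threat type, or None if nothing was found.
Detector = Callable[[str], Awaitable[Optional[str]]]

_locks: dict[str, asyncio.Lock] = {}           # stand-in for a distributed lock
_open_alerts: set[tuple[str, str]] = set()     # (org_id, threat_type) of open alerts


async def deep_evaluate(org_id: str, detectors: list[Detector]) -> list[str]:
    """Run all detectors concurrently and return the newly created alert types."""
    lock = _locks.setdefault(org_id, asyncio.Lock())
    if lock.locked():
        # Another evaluation for this org is in flight; skip rather than duplicate it.
        return []
    async with lock:
        # gather() finishes when the slowest detector finishes.
        results = await asyncio.gather(*(detect(org_id) for detect in detectors))
        created = []
        for threat in results:
            if threat is None or (org_id, threat) in _open_alerts:
                continue  # nothing found, or an open alert of this type already exists
            _open_alerts.add((org_id, threat))
            created.append(threat)
        return created
```

Running the same evaluation twice for one organization creates the alert only once, since the second run finds the open alert during the dedup check.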
Alert Lifecycle
Alerts follow a simple state machine from creation to resolution.
%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#5ee3ff', 'lineColor': '#5ee3ff', 'textColor': '#E0E0E0' }}}%%
stateDiagram-v2
[*] --> Open: Threat Detected
Open --> Acknowledged: User reviews alert
Acknowledged --> Resolved: User takes action
Acknowledged --> Dismissed: False positive
Open --> Dismissed: False positive
Resolved --> [*]
Dismissed --> [*]
- Open: A new alert has been created. It appears in the dashboard with a notification badge.
- Acknowledged: A team member has reviewed the alert and is investigating.
- Resolved: The alert has been addressed with a concrete action (e.g., key revocation, rate limit enabled).
- Dismissed: The alert was determined to be a false positive and requires no action.
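The transitions in the diagram can be written down as a lookup table. The sketch below is illustrative only; the lowercase state names and the `transition` helper are assumptions, not the gateway's data model.

```python
# Allowed transitions, taken from the state diagram above.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "open": {"acknowledged", "dismissed"},
    "acknowledged": {"resolved", "dismissed"},
    "resolved": set(),   # terminal
    "dismissed": set(),  # terminal
}


def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the diagram does not allow the move."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move alert from {current!r} to {target!r}")
    return target
```

Under this table an open alert cannot jump straight to resolved; it has to be acknowledged first, matching the diagram.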
Severity Classification
Severity is calculated automatically based on the ratio between the observed value and the expected baseline. This applies to volume, cost, and firewall threat types. Geo anomaly and model switching use fixed severity based on count thresholds.
| Ratio (Observed / Baseline) | Severity |
|---|---|
| Above 10x | Critical |
| Above 5x | High |
| Above 2x | Medium |
| Up to 2x | Low |
Critical and high severity alerts are highlighted in the dashboard and can be configured to trigger webhook notifications.
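The ratio table translates directly into a small function. This is a sketch that assumes strict "above" comparisons at each boundary and a positive baseline; the function name and lowercase labels are illustrative.

```python
def classify_severity(observed: float, baseline: float) -> str:
    """Map the observed/baseline ratio onto the severity levels in the table."""
    if baseline <= 0:
        # Assumption: a ratio is undefined without a positive baseline.
        raise ValueError("baseline must be positive")
    ratio = observed / baseline
    if ratio > 10:
        return "critical"
    if ratio > 5:
        return "high"
    if ratio > 2:
        return "medium"
    return "low"


print(classify_severity(1200, 100))  # critical
print(classify_severity(600, 100))   # high
print(classify_severity(300, 100))   # medium
```

A ratio of exactly 10x falls into "high" under this reading, since the table says "Above 10x" for critical.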
Configuration
The threat detection system is configured via environment variables on the gateway.
| Setting | Description | Default |
|---|---|---|
| Security API Key | Authenticates scheduled evaluation calls. Required for production. | — |
| GeoIP Database Path | Path to the MaxMind GeoLite2-Country.mmdb file. Required for geo anomaly detection. | — |
| Volume Threshold | Request count in a 5-minute window before volume spike detection activates. | 500 |
| Error Threshold | Error count in a 5-minute window before brute force detection activates. | 50 |
| Geo Threshold | Number of new countries that triggers a geo anomaly alert. | 3 |
If the GeoIP database path is not set, geo anomaly detection is silently disabled. All other detectors continue to function normally.
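For illustration, the settings above could be loaded as follows. The environment variable names (`FLOOPY_SECURITY_API_KEY` and so on) are hypothetical placeholders, since the actual names are not listed here; only the defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ThreatDetectionConfig:
    security_api_key: Optional[str]
    geoip_db_path: Optional[str]
    volume_threshold: int = 500
    error_threshold: int = 50
    geo_threshold: int = 3

    @property
    def geo_detection_enabled(self) -> bool:
        """Geo anomaly detection needs a readable database file."""
        path = self.geoip_db_path
        return bool(path) and os.path.isfile(path) and os.access(path, os.R_OK)


def load_config(env: Mapping[str, str] = os.environ) -> ThreatDetectionConfig:
    """Build the config from environment variables (names are hypothetical)."""
    return ThreatDetectionConfig(
        security_api_key=env.get("FLOOPY_SECURITY_API_KEY") or None,
        geoip_db_path=env.get("FLOOPY_GEOIP_DB_PATH") or None,
        volume_threshold=int(env.get("FLOOPY_VOLUME_THRESHOLD", "500")),
        error_threshold=int(env.get("FLOOPY_ERROR_THRESHOLD", "50")),
        geo_threshold=int(env.get("FLOOPY_GEO_THRESHOLD", "3")),
    )
```

With no variables set, the thresholds fall back to the documented defaults and `geo_detection_enabled` is false, while the other detectors are unaffected.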
GeoIP Setup
Geo anomaly detection requires a MaxMind GeoLite2-Country database file. To set it up:
- Create a free account at MaxMind.
- Download the GeoLite2-Country database in MMDB format from your account dashboard.
- Place the .mmdb file in an accessible path on the gateway server.
- Set the GeoIP database path environment variable to the absolute path of the file.
The database should be updated periodically (MaxMind releases updates weekly). If the file is missing or unreadable at startup, the gateway logs a warning and disables geo anomaly detection.
Dashboard Actions
When an alert appears in the Security section of the dashboard, users can take the following actions:
- Acknowledge: Mark the alert as under investigation. This prevents it from appearing as a new notification for other team members.
- Dismiss: Mark the alert as a false positive. The alert is archived and no further action is taken.
- Revoke Key: Immediately revoke the API key associated with the alert. The key is invalidated and all subsequent requests using it are rejected.
- Enable Rate Limit: Apply a rate limit to the affected API key or organization. This throttles the suspicious traffic without fully blocking it.
All actions are logged with the acting user and timestamp for audit purposes.