Security Threat Detection

Overview

The Floopy gateway monitors all API traffic in real time for suspicious patterns. Every request that passes through the gateway is evaluated against a set of threat detectors that track volume anomalies, authentication failures, geographic shifts, cost spikes, firewall violations, and unusual model usage.

When a threat is detected, the system generates an alert in the dashboard with severity classification, contextual metadata, and actionable remediation options. Detection runs asynchronously after request dispatch, so it adds no latency to the request's critical path.

Architecture

The detection system uses a two-layer approach: lightweight in-memory counters for hot detection, and full analytical queries for deep evaluation. This keeps per-request overhead to sub-millisecond levels while still enabling complex pattern analysis.

%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#5ee3ff', 'lineColor': '#5ee3ff', 'textColor': '#E0E0E0' }}}%%
flowchart TD
    A[Incoming Request] --> B[Gateway Processing]
    B --> C[Provider Dispatch]
    C --> D[Response to Client]
    C --> E[Async Post-Dispatch]

    E --> F[Update Counters]
    F --> G{Threshold Exceeded?}
    G -->|No| H[Done]
    G -->|Yes| I[Acquire Distributed Lock]
    I --> J[Deep Evaluation]
    J --> K[Run 6 Detectors in Parallel]
    K --> L{Threats Found?}
    L -->|No| H
    L -->|Yes| M[Dedup Check]
    M --> N[Insert Alert]
    N --> O[Dashboard Shows Alert]

    P[Scheduled Evaluation] --> J

    style A fill:#1a1a2e,stroke:#5ee3ff
    style C fill:#1a1a2e,stroke:#5ee3ff
    style F fill:#0d2137,stroke:#5ee3ff
    style J fill:#0d2137,stroke:#5ee3ff
    style K fill:#0d2137,stroke:#5ee3ff
    style N fill:#1a1a2e,stroke:#5ee3ff
    style O fill:#1a1a2e,stroke:#5ee3ff
    style P fill:#1a1a2e,stroke:#5ee3ff

Flow Summary

Post-dispatch: After the gateway dispatches a request to the provider, it spawns an async task to update counters for the organization.
Hot check: If any counter exceeds its threshold, the system acquires a distributed lock to prevent concurrent evaluations.
Deep evaluation: With the lock held, the system runs all 6 detectors in parallel, querying historical data for patterns.
Alert creation: Detected threats are deduplicated against existing open alerts, then inserted as new alerts.
Scheduled evaluation: A scheduled job can also trigger the same evaluation pipeline on a regular interval, catching threats that accumulate between request-triggered checks.
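
The post-dispatch step can be illustrated with a minimal Python sketch. It is not the gateway's code: `handle_request`, `update_counters`, and the `events` list are hypothetical names, and the provider call is replaced by a placeholder string. It only shows that the response is produced before the detection task runs.

```python
import asyncio

events: list[str] = []  # records ordering, for illustration only


async def update_counters(org_id: str) -> None:
    """Hypothetical stand-in for the counter update and threshold check."""
    await asyncio.sleep(0)  # yield control, as real I/O would
    events.append(f"counters:{org_id}")


async def handle_request(org_id: str, background: set) -> str:
    response = f"response:{org_id}"  # placeholder for the provider call
    # Schedule detection without awaiting it. A reference is kept so the
    # task is not garbage-collected before it finishes.
    task = asyncio.create_task(update_counters(org_id))
    background.add(task)
    task.add_done_callback(background.discard)
    events.append(f"responded:{org_id}")
    return response


async def main() -> list[str]:
    background: set = set()
    await handle_request("org_1", background)
    await asyncio.gather(*background)  # drain pending tasks before exit
    return events
```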

Threat Types

Type	Detection Window	Threshold	Granularity
Volume Spike	5 min vs 7-day avg	3x	Per API key
Brute Force	5 min	50%+ error rate, min 10 req	Per org
Geo Anomaly	30-day history	New country	Per API key
Cost Anomaly	1 hour vs 7-day hourly avg	5x	Per org
Firewall Threat	30 min	10+ blocked requests	Per org
Model Switching	10 min	5+ unique models	Per API key

Volume Spike

Compares the current 5-minute request count for each API key against the 7-day average for the same time window. If the current volume exceeds 3x the historical average, an alert is raised. This catches sudden surges that may indicate a compromised key or automated abuse.
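
As a rough sketch of this comparison, the function below reads "7-day average for the same time window" as the mean of the request counts seen in the same 5-minute slot on each of the previous 7 days. That reading, the function name, and the handling of a missing baseline are all assumptions.

```python
from statistics import fmean


def is_volume_spike(
    current_count: int,
    same_window_history: list[int],
    multiplier: float = 3.0,
) -> bool:
    """Return True when the current window exceeds multiplier x baseline."""
    if not same_window_history:
        return False  # assumption: no history means nothing to compare
    baseline = fmean(same_window_history)
    if baseline <= 0:
        return False  # assumption: a zero baseline never fires here
    return current_count > multiplier * baseline
```

With a baseline of 100 requests per window, 301 requests would be flagged and 300 would not.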

Brute Force

Monitors the error rate within a 5-minute sliding window. If more than 50% of requests result in authentication or authorization errors and the total request count exceeds 10, the detector flags a brute force attempt. This catches credential stuffing and key enumeration attacks.
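
The rule can be sketched as a small predicate. The function and parameter names are hypothetical. The strict comparisons follow the wording of the paragraph above, while the threat table writes the same limits as "50%+" and "min 10 req", so the behavior at exactly 50% or exactly 10 requests is an assumption.

```python
def is_brute_force(
    auth_errors: int,
    total_requests: int,
    min_requests: int = 10,
    max_error_rate: float = 0.5,
) -> bool:
    """Flag a window whose auth error rate is too high to be noise."""
    if total_requests <= min_requests:
        return False  # too few requests to judge (boundary is assumed)
    return auth_errors / total_requests > max_error_rate
```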

Geo Anomaly

Tracks the set of countries from which each API key makes requests. If a request originates from a country that has not been seen for that key in the past 30 days, an alert is generated. This detects compromised keys being used from unexpected locations. Requires a GeoIP database (MaxMind GeoLite2).
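
In essence this is a set-membership check against the countries seen for the key in the last 30 days. The sketch below assumes ISO country codes as the representation and assumes that an address the GeoIP lookup cannot resolve does not raise an alert; neither detail is stated in this document.

```python
from typing import Optional


def is_geo_anomaly(
    request_country: Optional[str],
    seen_countries: set[str],
) -> bool:
    """Return True when the country is new for this API key."""
    if request_country is None:
        return False  # assumption: unresolved addresses are ignored
    return request_country.upper() not in seen_countries
```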

Cost Anomaly

Compares the current hour’s total cost against the 7-day average hourly cost. If the current cost exceeds 5x the historical average, an alert is raised. This catches unexpected cost spikes from expensive model usage, unusually long completions, or abuse.

Firewall Threat

Counts the number of requests blocked by the LLM Firewall within a 30-minute window. If more than 10 requests are blocked, an alert is raised. Sustained firewall blocks can indicate a persistent attacker probing for prompt injection or jailbreak vectors.

Model Switching

Tracks the number of unique models requested by each API key within a 10-minute window. If more than 5 distinct models are used, an alert is generated. Rapid model switching is unusual in normal application usage and may indicate an attacker probing for model-specific vulnerabilities.
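
A minimal sketch of the distinct-model count, assuming each request is available as a `(timestamp, model)` pair. The event shape and function name are hypothetical, and whether exactly 5 models should fire is an assumption, since the threat table writes the limit as "5+".

```python
from datetime import datetime, timedelta


def is_model_switching(
    events: list[tuple[datetime, str]],
    now: datetime,
    window: timedelta = timedelta(minutes=10),
    max_models: int = 5,
) -> bool:
    """Count distinct models requested inside the trailing window."""
    cutoff = now - window
    models = {model for ts, model in events if cutoff < ts <= now}
    return len(models) > max_models
```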

Detection Layers

Layer 1: Hot Detection

The hot detection layer runs after every request, inside the async post-dispatch task, with sub-millisecond overhead. It uses lightweight counters organized in fixed 5-minute windows with self-expiring keys.

No blocking: Counter updates are fire-and-forget operations.
Fixed windows: Each window is a discrete 5-minute bucket.
Self-expiring: All counters have a TTL, so stale windows are cleaned up automatically and no separate cleanup job is needed.
Threshold check: After incrementing, the counter value is compared against the configured threshold. Only when exceeded does the system proceed to deep evaluation.

This layer acts as a cheap filter — the vast majority of requests increment a counter and move on without triggering any further work.
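
The fixed-window counters can be approximated with an in-process sketch. A real deployment would presumably keep them in a store shared by all gateway instances; this document does not name one, so the dictionary below is only a stand-in. The key layout, the TTL of two windows, and the class name are assumptions.

```python
import time
from typing import Optional

WINDOW_SECONDS = 300  # fixed 5-minute buckets


class WindowCounter:
    """In-memory stand-in for self-expiring per-window counters."""

    def __init__(self, ttl_seconds: int = 2 * WINDOW_SECONDS) -> None:
        self._counts: dict[str, tuple[int, float]] = {}
        self._ttl = ttl_seconds

    def incr(self, org_id: str, metric: str, now: Optional[float] = None) -> int:
        now = time.time() if now is None else now
        bucket = int(now // WINDOW_SECONDS)
        key = f"{metric}:{org_id}:{bucket}"
        # Drop expired keys, mimicking a store that expires them itself.
        self._counts = {k: v for k, v in self._counts.items() if v[1] > now}
        count, expires_at = self._counts.get(key, (0, now + self._ttl))
        self._counts[key] = (count + 1, expires_at)
        return count + 1


def should_deep_evaluate(count: int, threshold: int) -> bool:
    """Hot check: only an exceeded threshold triggers deep evaluation."""
    return count > threshold
```

Because the bucket index is part of the key, a new window starts from zero without any explicit reset.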

Layer 2: Deep Evaluation

When hot detection fires, the system acquires a distributed lock and runs the full analytical evaluation. This layer queries historical data and runs all 6 detectors in parallel.

Historical baselines: Queries pull 7-day averages, 30-day country sets, and hourly cost aggregations.
Parallel execution: All 6 detectors run concurrently. The evaluation completes when the slowest detector finishes.
Lock protection: A distributed lock prevents multiple gateway instances from running duplicate evaluations for the same organization simultaneously.
Deduplication: Before inserting an alert, the system checks for existing open alerts of the same type and organization to avoid duplicate notifications.
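
The combination of lock, parallel detectors, and deduplication can be sketched as follows. `LocalLock` is an in-process stand-in for the distributed lock, the detectors are hypothetical coroutines that return a threat type or `None`, and the set of open alerts stands in for the alert store. None of these names come from the gateway itself.

```python
import asyncio
from typing import Awaitable, Callable, Optional

Detector = Callable[[str], Awaitable[Optional[str]]]


class LocalLock:
    """In-process stand-in for a distributed lock keyed by organization."""

    def __init__(self) -> None:
        self._held: set[str] = set()

    def try_acquire(self, key: str) -> bool:
        if key in self._held:
            return False
        self._held.add(key)
        return True

    def release(self, key: str) -> None:
        self._held.discard(key)


async def deep_evaluate(
    org_id: str,
    detectors: list[Detector],
    lock: LocalLock,
    open_alerts: set[tuple[str, str]],
) -> list[str]:
    if not lock.try_acquire(org_id):
        return []  # another evaluation is already running for this org
    try:
        # All detectors run concurrently; gather waits for the slowest.
        results = await asyncio.gather(*(d(org_id) for d in detectors))
        created = []
        for threat_type in results:
            if threat_type is None or (org_id, threat_type) in open_alerts:
                continue  # nothing found, or an open alert already exists
            open_alerts.add((org_id, threat_type))
            created.append(threat_type)
        return created
    finally:
        lock.release(org_id)


async def spike(org_id: str) -> Optional[str]:  # example detector
    return "volume_spike"


async def quiet(org_id: str) -> Optional[str]:  # example detector
    return None
```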

Alert Lifecycle

Alerts follow a simple state machine from creation to resolution.

%%{init: {'theme': 'dark', 'themeVariables': { 'primaryColor': '#5ee3ff', 'lineColor': '#5ee3ff', 'textColor': '#E0E0E0' }}}%%
stateDiagram-v2
    [*] --> Open: Threat Detected
    Open --> Acknowledged: User reviews alert
    Acknowledged --> Resolved: User takes action
    Acknowledged --> Dismissed: False positive
    Open --> Dismissed: False positive
    Resolved --> [*]
    Dismissed --> [*]

Open: A new alert has been created. It appears in the dashboard with a notification badge.
Acknowledged: A team member has reviewed the alert and is investigating.
Resolved: The alert has been addressed with a concrete action (e.g., key revocation, rate limit enabled).
Dismissed: The alert was determined to be a false positive and requires no action.
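
The state machine above can be encoded as a transition table. This is an illustrative sketch with hypothetical names; it allows only the transitions drawn in the diagram.

```python
from enum import Enum


class AlertState(Enum):
    OPEN = "open"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"
    DISMISSED = "dismissed"


# Allowed transitions, copied from the state diagram.
TRANSITIONS: dict[AlertState, set[AlertState]] = {
    AlertState.OPEN: {AlertState.ACKNOWLEDGED, AlertState.DISMISSED},
    AlertState.ACKNOWLEDGED: {AlertState.RESOLVED, AlertState.DISMISSED},
    AlertState.RESOLVED: set(),   # terminal
    AlertState.DISMISSED: set(),  # terminal
}


def can_transition(current: AlertState, target: AlertState) -> bool:
    return target in TRANSITIONS[current]
```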

Severity Classification

Severity is calculated automatically based on the ratio between the observed value and the expected baseline. This applies to volume, cost, and firewall threat types. Geo anomaly and model switching use fixed severity based on count thresholds.

Ratio (Observed / Baseline)	Severity
Above 10x	Critical
Above 5x	High
Above 2x	Medium
Up to 2x	Low

Critical and high severity alerts are highlighted in the dashboard and can be configured to trigger webhook notifications.
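
The ratio table maps directly onto a short function. The sketch below is illustrative; the function name and the choice to reject a non-positive baseline are assumptions.

```python
def classify_severity(observed: float, baseline: float) -> str:
    """Map the observed/baseline ratio to a severity label."""
    if baseline <= 0:
        raise ValueError("baseline must be positive to compute a ratio")
    ratio = observed / baseline
    if ratio > 10:
        return "critical"
    if ratio > 5:
        return "high"
    if ratio > 2:
        return "medium"
    return "low"
```

Each band is strict ("above"), so a ratio of exactly 10x is classified as high and exactly 2x as low.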

Configuration

The threat detection system is configured via environment variables on the gateway.

Setting	Description	Default
Security API Key	Authenticates scheduled evaluation calls. Required for production.	—
GeoIP Database Path	Path to the MaxMind GeoLite2-Country.mmdb file. Required for geo anomaly detection.	—
Volume Threshold	Request count in a 5-minute window before volume spike detection activates.	500
Error Threshold	Error count in a 5-minute window before brute force detection activates.	50
Geo Threshold	Number of new countries that triggers a geo anomaly alert.	3

If the GeoIP database path is not set, geo anomaly detection is silently disabled. All other detectors continue to function normally.
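
A minimal sketch of reading these settings with their defaults. The environment variable names below are placeholders invented for this example, since this section does not list the actual names; only the defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ThreatDetectionConfig:
    security_api_key: Optional[str]
    geoip_db_path: Optional[str]
    volume_threshold: int
    error_threshold: int
    geo_threshold: int

    @property
    def geo_detection_enabled(self) -> bool:
        # Without a database path, geo anomaly detection stays off.
        return self.geoip_db_path is not None


def load_config(env: Mapping[str, str] = os.environ) -> ThreatDetectionConfig:
    # All variable names here are hypothetical placeholders.
    return ThreatDetectionConfig(
        security_api_key=env.get("EXAMPLE_SECURITY_API_KEY") or None,
        geoip_db_path=env.get("EXAMPLE_GEOIP_DB_PATH") or None,
        volume_threshold=int(env.get("EXAMPLE_VOLUME_THRESHOLD", "500")),
        error_threshold=int(env.get("EXAMPLE_ERROR_THRESHOLD", "50")),
        geo_threshold=int(env.get("EXAMPLE_GEO_THRESHOLD", "3")),
    )
```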

GeoIP Setup

Geo anomaly detection requires a MaxMind GeoLite2-Country database file. To set it up:

Create a free account at MaxMind.
Download the GeoLite2-Country database in MMDB format from your account dashboard.
Place the .mmdb file in an accessible path on the gateway server.
Set the GeoIP database path environment variable to the absolute path of the file.

The database should be updated periodically (MaxMind releases updates weekly). If the file is missing or unreadable at startup, the gateway logs a warning and disables geo anomaly detection.
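
The startup behavior described above could look roughly like the following check. The function name and logger name are hypothetical, and the sketch only verifies that the file exists and is readable; it does not open or validate the database.

```python
import logging
import os
from typing import Optional

logger = logging.getLogger("threat_detection")


def usable_geoip_path(path: Optional[str]) -> Optional[str]:
    """Return the path if geo detection can use it, otherwise None."""
    if not path:
        return None  # not configured: disabled without a warning
    if os.path.isfile(path) and os.access(path, os.R_OK):
        return path
    logger.warning(
        "GeoIP database %s is missing or unreadable; "
        "geo anomaly detection is disabled",
        path,
    )
    return None
```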

Dashboard Actions

When an alert appears in the Security section of the dashboard, users can take the following actions:

Acknowledge: Mark the alert as under investigation. This prevents it from appearing as a new notification for other team members.
Dismiss: Mark the alert as a false positive. The alert is archived and no further action is taken.
Revoke Key: Immediately revoke the API key associated with the alert. The key is invalidated and all subsequent requests using it are rejected.
Enable Rate Limit: Apply a rate limit to the affected API key or organization. This throttles the suspicious traffic without fully blocking it.

All actions are logged with the acting user and timestamp for audit purposes.