Distributed denial of service attacks work by overwhelming a target with requests. The defender's response has always been to identify and block the attacking sources. But what happens when each attacking source is making requests that are individually indistinguishable from legitimate usage?

This is the API abuse problem created by AI agents — and it's distinct enough from traditional DDoS that the term barely applies. Call it distributed legitimate-looking load: thousands of AI agents, each with a valid API key, each making requests within your rate limits, collectively consuming infrastructure at a scale you didn't provision for.

How it happens

Consider a public API that offers 1,000 requests per minute per API key. You've designed your infrastructure assuming a certain distribution of usage — most keys idle, a few active, peak load well within your provisioned capacity.

Now consider a scenario where your API is useful to AI agents. Maybe it's a search API, a pricing API, a data enrichment API, or simply a content API that an agent uses to gather information. Once one agent starts using it, others follow. Agent developers share tools. Usage explodes — not from a single source you can block, but from thousands of sources that are each behaving exactly as allowed.

You hit capacity limits. You add capacity. Costs escalate. The agents consume the new capacity. Your cost-per-request metric is fine; your total cost is not.

Why rate limits don't solve this

Rate limits were designed to protect against a single abusive user. They work by capping how much any one user can consume. But they do nothing about aggregate consumption from many individually compliant users.

The math is straightforward: if you have 10,000 API keys and each one makes 100 requests per minute (well within a typical 1,000/min limit), that's 1 million requests per minute — potentially 10x your provisioned capacity. Each individual key is compliant. The system is overwhelmed.
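
The arithmetic can be sketched directly. The per-key limit and provisioned capacity below are the article's illustrative numbers, not real provisioning figures:

```python
# Aggregate-load math: each key stays under its limit, but the fleet
# as a whole far exceeds provisioned capacity.
PER_KEY_LIMIT = 1_000    # requests/min allowed per API key (illustrative)
PROVISIONED = 100_000    # requests/min of provisioned capacity (illustrative)

def aggregate_load(num_keys: int, per_key_rate: int) -> int:
    """Total requests/min from a fleet of individually compliant keys."""
    assert per_key_rate <= PER_KEY_LIMIT, "every key is within its limit"
    return num_keys * per_key_rate

total = aggregate_load(num_keys=10_000, per_key_rate=100)
print(total)                 # 1000000 requests/min
print(total / PROVISIONED)   # 10.0 — ten times provisioned capacity
```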

This isn't a hypothetical. We've seen this pattern across multiple APIs in our detection network. The signature is: stable per-key metrics, but unusual growth in total key count; per-key request patterns that are perfectly regular (because agents aren't random); and request distributions across endpoints that cluster in ways inconsistent with exploratory human usage.

The agent-specific patterns

Even when agents are individually rate-compliant, their usage patterns diverge from human patterns in ways that are detectable:

Temporal clustering. Humans use APIs at unpredictable times. Agents often run on schedules or in response to triggers — pricing agents run at market open, data collection agents run on cron jobs, processing agents run when new data arrives. The result is periodic bursts that don't match human usage patterns.

Endpoint distribution. Humans explore APIs. They call endpoints they don't need, make mistakes, experiment. Agents are efficient — they call exactly the endpoints they need, in exactly the sequence they need them, with no exploratory calls. A usage pattern with zero variance in endpoint distribution across sessions is an agent signature.
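
One way to sketch this signal, assuming each session is logged as an ordered list of endpoint paths (the data shape and example sessions are illustrative):

```python
# Fraction of distinct endpoint sequences across one key's sessions.
# Near 1.0 = every session different (exploratory, human-like);
# near 1/len(sessions) = identical every time (agent signature).
def endpoint_variety(sessions: list[list[str]]) -> float:
    distinct = len({tuple(s) for s in sessions})
    return distinct / len(sessions)

human = [["/search", "/docs", "/item/1"], ["/item/2", "/search"], ["/docs"]]
agent = [["/list", "/item", "/export"]] * 3  # same sequence, every session

print(endpoint_variety(human))  # 1.0
print(endpoint_variety(agent))  # ~0.33 — zero variance in sequences
```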

Parameter space coverage. An agent tasked with extracting all available data will systematically cover the parameter space of your API — iterating through all IDs, paginating through all results, requesting all available fields. This exhaustive coverage is characteristic of agents and atypical of human usage.
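
A minimal sketch of measuring this, assuming the API's resources live in a known sequential ID range (the range and the example request logs are assumptions for illustration):

```python
# How much of a sequential ID space has a key covered? Exhaustive
# coverage is characteristic of extraction agents; humans revisit
# a handful of items.
def id_space_coverage(requested_ids: list[int], id_range: range) -> float:
    return len(set(requested_ids) & set(id_range)) / len(id_range)

agent_ids = list(range(1, 1001))     # sweeps every ID in order
human_ids = [17, 42, 42, 301, 17]    # a few items, some repeated

print(id_space_coverage(agent_ids, range(1, 1001)))  # 1.0
print(id_space_coverage(human_ids, range(1, 1001)))  # 0.003
```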

Error response patterns. When an API returns an error, humans read the error message, pause, and adjust. Agents retry immediately with exactly the same or slightly modified parameters. The inter-request timing after error responses is a strong classifier.

The cost visibility problem

What makes agent API abuse particularly insidious is that it's often invisible until costs materialize. Your monitoring shows healthy signals: no key near its rate limit, no error-rate spikes, no single source worth blocking. The only number that moves is aggregate volume, and few dashboards alert on that.

By the time the cost anomaly surfaces — at the end of the billing period, or when an alert finally triggers — you've already served the load. You've incurred the compute cost, the egress cost, the LLM inference cost (if your API is AI-powered), and potentially exceeded your capacity provider's committed-use discount tier.

Detection approach

Detecting agent API abuse requires moving the analysis unit from per-key to per-session and per-cohort:

Session-level analysis. Classify each API session (a sequence of calls from the same key over a time window) as agent-generated or human-generated. Agents produce highly regular, efficient sessions; humans produce irregular, exploratory sessions.
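
A minimal classification sketch, using the coefficient of variation (CV) of inter-request gaps as the regularity feature. The 0.1 cutoff is an assumption to tune against labeled traffic, not an established constant:

```python
# Agents produce near-constant inter-request gaps (CV near 0);
# human sessions are bursty (CV well above 0).
import statistics

def looks_like_agent(timestamps: list[float], cv_cutoff: float = 0.1) -> bool:
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return False  # not enough signal to classify
    cv = statistics.stdev(gaps) / statistics.mean(gaps)
    return cv < cv_cutoff

print(looks_like_agent([0.0, 1.0, 2.0, 3.0, 4.0]))   # True — metronomic
print(looks_like_agent([0.0, 0.2, 5.0, 5.1, 30.0]))  # False — bursty
```

In practice this would be one feature among several (endpoint variance, post-error timing), but regularity alone separates a surprising amount of traffic.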

Cohort analysis. Group API keys by behavioral similarity. Keys that exhibit agent-like behavior form natural cohorts. The aggregate capacity consumed by any cohort is more informative than individual key metrics.
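
A toy version of cohort grouping: bucket keys by a coarse behavioral vector. A real system would cluster far richer features; the feature choice and bucketing granularity here are assumptions for illustration:

```python
# Group API keys whose (requests/min, distinct endpoints, median
# post-error delay ms) land in the same coarse bucket. Keys running
# the same agent tool collapse into one cohort.
from collections import defaultdict

def cohorts(keys: dict[str, tuple[float, int, float]], grain: float = 10.0):
    groups = defaultdict(list)
    for key, (rpm, endpoints, retry_ms) in keys.items():
        sig = (round(rpm / grain), endpoints, round(retry_ms / grain))
        groups[sig].append(key)
    return dict(groups)

fleet = {
    "k1": (100.0, 3, 50.0),    # agent-like: same tool, same behavior
    "k2": (101.0, 3, 52.0),    # lands in the same cohort as k1
    "k3": (12.0, 17, 8000.0),  # human-like outlier
}
for sig, members in cohorts(fleet).items():
    print(sig, members)
```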

Infrastructure fingerprinting. Even without browser signals, API calls from agent frameworks leave TLS and HTTP/2 fingerprints. Python's httpx, Node's axios, and Go's standard library each have characteristic fingerprints that differ from browser-originated requests.
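
At its simplest, this reduces to matching a connection's fingerprint hash (for example, a JA3-style TLS hash) against a table of known non-browser clients. The hashes below are placeholders, not real fingerprint values; in practice the table would come from a maintained fingerprint database:

```python
# Placeholder fingerprint table: hash -> known client library.
# These hash strings are invented for illustration only.
KNOWN_AGENT_CLIENTS = {
    "aaaa1111": "python-httpx",
    "bbbb2222": "node-axios",
    "cccc3333": "go-net/http",
}

def classify_client(fingerprint_hash: str) -> str:
    return KNOWN_AGENT_CLIENTS.get(fingerprint_hash, "unknown/browser-like")

print(classify_client("aaaa1111"))  # python-httpx
print(classify_client("ffff9999"))  # unknown/browser-like
```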

What to do

For APIs that may be agent targets, the defensive posture should include:

  1. Detect and classify agent traffic — know what fraction of your API usage is agent-generated before you take action
  2. Segment your rate limits by client type — humans and agents can have different limits based on detection confidence
  3. Price accordingly — if agents are your primary consumer, your pricing model should reflect that; free tiers designed for individual developers are exploitable at scale
  4. Alert on cohort-level metrics — when a new cohort of agent-like keys reaches a threshold, that's an event worth investigating
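
Step 4 can be sketched as a simple threshold check on aggregate cohort consumption. The 20% share and the capacity figure are illustrative assumptions to tune for your infrastructure:

```python
# Fire an alert when any cohort's aggregate requests/min crosses a
# share of provisioned capacity — the cohort, not the key, is the unit.
PROVISIONED_RPM = 100_000   # illustrative capacity figure
ALERT_SHARE = 0.20          # illustrative threshold: 20% of capacity

def cohort_alerts(cohort_rpm: dict[str, float]) -> list[str]:
    return [
        name for name, rpm in cohort_rpm.items()
        if rpm >= ALERT_SHARE * PROVISIONED_RPM
    ]

print(cohort_alerts({"httpx-cron-cohort": 35_000, "misc-human": 4_000}))
# ['httpx-cron-cohort']
```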

The agent API abuse problem is nascent but growing rapidly as AI agent usage expands. The organizations that instrument for it now will have the data and tooling to respond when it reaches their scale. The ones that wait will be reacting to cost surprises.

Classify your API traffic by agent vs. human

Agent FP works on APIs as well as web pages. Integrate via script tag or direct API call.

Get early access