Every AI agent leaves a fingerprint. Not a single signal — a constellation of signals, each individually ambiguous but collectively highly diagnostic. Understanding what these signals are, where they come from, and why they're difficult to spoof is the foundation of reliable agent detection.
This is a technical walkthrough of the signal categories we use in Agent FP. We're publishing this because transparency about our methodology helps the security community and doesn't meaningfully help adversaries — these signals are either difficult to spoof without breaking functionality, or they're already known to sophisticated adversaries.
Category 1: TLS and network layer signals
Before your application sees a single byte of request data, the underlying network connection has already revealed something about the client. TLS fingerprinting has been used for years to identify browser types; the challenge for AI agent detection is that modern agents often use Playwright or Puppeteer to drive real browser instances, which produce real browser TLS fingerprints.
However, several TLS-adjacent signals remain diagnostic:
JA4 fingerprint with version correlation. A real Chrome browser on macOS 14 produces a JA4 fingerprint that is consistent with that specific browser version, operating system, and a plausible set of TLS extensions. Playwright-driven Chrome in a Linux container produces a different JA4 that is consistent with that environment. These fingerprints can be compared against expected distributions for the claimed User-Agent.
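To illustrate what that comparison looks like in practice, here is a minimal sketch of a lookup against expected fingerprints per claimed browser family. The JA4 strings and the table itself are placeholders, not values from a production dataset.

# Sketch: compare an observed JA4 fingerprint against the set expected for the
# claimed browser family. Fingerprint strings and the table are placeholders.
EXPECTED_JA4 = {
    "chrome-121-macos": {"t13d1516h2_000000000000_111111111111"},
    "chrome-121-linux": {"t13d1517h2_222222222222_333333333333"},
}

def ja4_consistent_with_claim(observed_ja4: str, claimed_family: str) -> bool:
    """True if the observed JA4 is one we expect for the claimed browser/OS family."""
    return observed_ja4 in EXPECTED_JA4.get(claimed_family, set())

# A request whose User-Agent claims desktop Chrome on macOS but whose TLS layer
# produces a containerized-Linux fingerprint fails this check.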
HTTP/2 SETTINGS frame. The initial HTTP/2 SETTINGS frame sent by a client advertises its header table size, initial window size, maximum concurrent streams, and other parameters that differ predictably between browser versions and HTTP client libraries. Python's httpx, JavaScript's undici, and Go's net/http each have characteristic SETTINGS values that differ from Chrome's defaults.
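A detector can do the analogous comparison on the SETTINGS frame itself. In the sketch below, the per-client parameter values are illustrative stand-ins, not a verified catalogue of each library's defaults.

# Sketch: classify a client by its advertised HTTP/2 SETTINGS parameters.
# The profile values below are illustrative, not verified library defaults.
KNOWN_SETTINGS_PROFILES = {
    "chrome": {"HEADER_TABLE_SIZE": 65536, "INITIAL_WINDOW_SIZE": 6291456,
               "MAX_HEADER_LIST_SIZE": 262144},
    "python-h2": {"HEADER_TABLE_SIZE": 4096, "INITIAL_WINDOW_SIZE": 65535,
                  "MAX_CONCURRENT_STREAMS": 100},
}

def classify_by_settings(observed: dict) -> str | None:
    """Return the first known profile whose parameters all match the observed frame."""
    for client, profile in KNOWN_SETTINGS_PROFILES.items():
        if all(observed.get(key) == value for key, value in profile.items()):
            return client
    return None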
HTTP/2 HEADERS frame ordering. The pseudo-headers (:method, :path, :scheme, :authority) and regular headers in HTTP/2 requests follow characteristic ordering that differs between browser implementations and HTTP libraries. Chrome has a specific ordering that has been stable across versions; deviation is significant.
# Chrome HTTP/2 header ordering (typical)
:method: GET
:authority: example.com
:scheme: https
:path: /
accept: text/html,application/xhtml+xml...
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
cache-control: max-age=0
...
# Python httpx HTTP/2 header ordering (typical)
:method: GET
:path: /
:scheme: https
:authority: example.com
accept: */*
accept-encoding: br, gzip, deflate
...
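Building on the listings above, the ordering check itself can be a simple relative-order comparison. A minimal sketch, assuming the frontend preserves header arrival order; the expected sequence is abbreviated to the headers shown above.

# Sketch: score how closely an observed header order tracks the expected order
# for the claimed client. Assumes the proxy/frontend preserves arrival order.
CHROME_ORDER = [":method", ":authority", ":scheme", ":path",
                "accept", "accept-encoding", "accept-language", "cache-control"]

def order_similarity(observed: list[str], expected: list[str] = CHROME_ORDER) -> float:
    """Fraction of adjacent expected-header pairs that appear in the same relative order."""
    positions = {header: i for i, header in enumerate(observed)}
    present = [header for header in expected if header in positions]
    if len(present) < 2:
        return 0.0
    in_order = sum(positions[a] < positions[b] for a, b in zip(present, present[1:]))
    return in_order / (len(present) - 1)

# Example: the httpx-style ordering shown above scores well below Chrome's 1.0.
print(order_similarity([":method", ":path", ":scheme", ":authority",
                        "accept", "accept-encoding"]))  # 0.6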
Category 2: Browser environment signals
When an agent uses a real browser for execution, network-layer signals become less reliable. The browser environment itself, however, provides a rich set of signals:
Hardware fingerprint plausibility. Real browsers on real hardware report values for navigator.hardwareConcurrency, navigator.deviceMemory, and screen geometry that vary with the underlying physical hardware. Headless browser instances in cloud environments cluster tightly around a handful of configurations (4–8 cores, 8 GB reported memory, exactly 1920×1080) with little of the variance seen in the organic desktop population and a strong resemblance to standard cloud VM shapes.
AudioContext fingerprint. The Web Audio API produces consistent but hardware-specific output when given a known input signal. The exact frequency response is determined by the underlying audio hardware and driver stack. Cloud VMs with virtual audio devices produce characteristic AudioContext outputs that cluster in a narrow range.
WebGL renderer and extensions. The WebGL renderer string, available extensions, and the output of specific shader programs form a composite fingerprint tied to physical GPU hardware. Cloud VMs use software rasterizers (LLVMpipe, SwiftShader) or shared GPU slices that produce characteristic values not seen in organic desktop populations.
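On the server side, the hardware and WebGL reports can be scored together for cloud-VM likeness. The sketch below is illustrative: the field names describe an assumed client report format, and the thresholds and renderer substrings are not production rules.

# Sketch: flag environment reports that match common cloud-VM profiles.
# Field names, thresholds, and substrings are illustrative, not production rules.
SOFTWARE_RENDERERS = ("swiftshader", "llvmpipe", "mesa offscreen")

def vm_likelihood_flags(report: dict) -> list[str]:
    """Return human-readable flags for values typical of headless cloud environments."""
    flags = []
    renderer = report.get("webgl_renderer", "").lower()
    if any(name in renderer for name in SOFTWARE_RENDERERS):
        flags.append("software WebGL rasterizer")
    if (report.get("hardware_concurrency"), report.get("device_memory")) in {(2, 8), (4, 8), (8, 8)}:
        flags.append("common cloud CPU/memory combination")
    if (report.get("screen_width"), report.get("screen_height")) == (1920, 1080) and report.get("device_pixel_ratio") == 1:
        flags.append("default headless viewport geometry")
    return flags

print(vm_likelihood_flags({
    "webgl_renderer": "Google SwiftShader",
    "hardware_concurrency": 4, "device_memory": 8,
    "screen_width": 1920, "screen_height": 1080, "device_pixel_ratio": 1,
}))

Each flag is ambiguous on its own; it is the co-occurrence of several that moves the composite score.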
Missing or modified browser APIs. Real browsers accumulate a large surface area of APIs through years of web standards development. Headless browser implementations sometimes have incomplete or subtly modified implementations of edge-case APIs. Some common tests:
// Notification API presence (missing in some headless environments)
'Notification' in window
// WebRTC ICE candidate behavior
// (candidate gathering differs in headless/VM environments and can expose the true IP)
const pc = new RTCPeerConnection({ iceServers: [] });
pc.onicecandidate = e => { /* analyze e.candidate: host vs. reflexive, mDNS obfuscation */ };
pc.createDataChannel('probe');  // a data channel is needed to start candidate gathering
pc.createOffer().then(offer => pc.setLocalDescription(offer));
// Permissions API consistency check
navigator.permissions.query({name: 'notifications'})
  .then(status => { /* compare status.state with Notification.permission; they disagree in some headless builds */ });
Category 3: Behavioral timing signals
This is the category most specific to LLM-powered agents and the one that's most novel from a detection standpoint.
When an LLM generates text in response to a prompt, it does so token by token. In a Playwright-based agent that types into a form field, this token generation is reflected in the keystroke timing: characters appear in clusters (one cluster per generated token) with brief pauses between clusters (one pause per token generation round).
This creates a distinctive timing signature when analyzed at the character level:
- Burst coefficient: The ratio of inter-cluster pauses to intra-cluster intervals. In LLM agents, this is typically 10–100x. In humans, it's typically 3–8x.
- Cluster size distribution: LLMs generate tokens of 1–4 characters most commonly. The resulting cluster sizes follow a distribution consistent with the tokenizer vocabulary, not with human cognitive chunking.
- Think-time distribution: The duration of pauses between bursts follows a log-normal distribution characteristic of transformer inference latency, whose parameters differ from those of the log-normal distribution of human cognitive pauses.
This signal is not easily spoofed because adding realistic human-like timing at the application layer would require either (a) buffering the LLM output and replaying with different timing, which adds significant latency and complexity, or (b) adding noise to the existing timing, which we detect separately.
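To make the burst coefficient concrete, the sketch below splits keystroke timestamps into bursts and compares the mean pause between bursts to the mean gap within them. The 80 ms clustering threshold and the 10x decision cutoff are illustrative assumptions, not production parameters.

# Sketch: estimate the burst coefficient from keystroke timestamps (in seconds).
# The 80 ms clustering threshold and the 10x cutoff are illustrative.
from statistics import mean

def burst_coefficient(timestamps: list[float], cluster_gap: float = 0.08) -> float | None:
    """Ratio of mean inter-burst pause to mean intra-burst gap, or None if undefined."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    intra = [g for g in gaps if g < cluster_gap]   # gaps within a typed burst
    inter = [g for g in gaps if g >= cluster_gap]  # pauses between bursts
    if not intra or not inter:
        return None
    return mean(inter) / mean(intra)

def looks_like_llm_typing(timestamps: list[float]) -> bool:
    bc = burst_coefficient(timestamps)
    return bc is not None and bc > 10  # humans typically land in the 3-8x range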
Category 4: Navigation and interaction patterns
The sequence of interactions an agent makes with a page tells its own story:
Tab navigation vs. click navigation. Agents frequently use tab-key navigation to move between form fields, which is statistically less common among humans, who typically click into fields. More specifically, the pattern of agent tab-key navigation (uniform timing, no key-repeat events, no modifier keys) is distinct from human tab navigation.
Scroll behavior. Humans scroll continuously and irregularly. Agents either don't scroll at all (when using JavaScript scrollIntoView) or scroll in discrete, even increments. The absence of any scroll events in a multi-screen-height page is itself a signal.
Focus/blur event patterns. Real users focus and blur elements in irregular patterns as they pause to think, look at other windows, or interact with other UI elements. Agent interactions show highly regularized focus/blur sequences that match the programmatic interaction pattern exactly.
Copy-paste detection. Agents frequently paste complete field values (email addresses, long text strings) in a single paste event, rather than typing them character by character. The combination of a single paste event with no preceding clipboard API calls and no delay between field focus and content appearance is a reliable signal.
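A couple of these interaction heuristics are simple to express in code. The sketch below checks tab-interval uniformity and paste-without-typing; the event field names and thresholds are assumptions made for illustration.

# Sketch: two interaction heuristics over a list of UI events, each a dict like
# {"type": "keydown" | "paste" | "focus", "key": "Tab", "t": <seconds>}.
# Field names and thresholds are illustrative assumptions.
from statistics import mean, pstdev

def uniform_tab_timing(events: list[dict], cv_threshold: float = 0.15) -> bool:
    """Flag Tab-key intervals that are suspiciously uniform (low coefficient of variation)."""
    tabs = [e["t"] for e in events if e["type"] == "keydown" and e.get("key") == "Tab"]
    gaps = [b - a for a, b in zip(tabs, tabs[1:])]
    if len(gaps) < 3:
        return False
    return pstdev(gaps) / mean(gaps) < cv_threshold

def paste_without_typing(events: list[dict]) -> bool:
    """Flag a paste that arrives with no keystrokes and almost no delay after field focus."""
    last_focus = None
    for e in events:
        if e["type"] == "focus":
            last_focus = e["t"]
        elif e["type"] == "keydown":
            last_focus = None  # typing happened; this field is not a pure paste
        elif e["type"] == "paste" and last_focus is not None and e["t"] - last_focus < 0.05:
            return True
    return False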
Category 5: Framework-specific fingerprints
Many agent frameworks leave framework-specific artifacts in addition to the general signals above:
LangChain + Playwright: Default User-Agent includes "HeadlessChrome" unless explicitly configured otherwise. The navigator.webdriver property is true by default.
OpenAI Swarm: Characteristic request batching pattern — multiple requests initiated with very similar timestamps, consistent with async task scheduling.
Anthropic Claude via API: When used directly (not through a browser), produces a distinctive set of HTTP headers including the anthropic-version header in some configurations.
Custom agent frameworks via Python: Python-based HTTP clients (requests, httpx, aiohttp) have characteristic behavior in connection pooling and reuse patterns that differ from browser connection management.
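Several of the artifacts above reduce to flag and string checks. A minimal sketch; the report field names are assumptions about how the collected client data is keyed.

# Sketch: cheap framework-artifact checks. Report field names are illustrative.
def framework_artifacts(user_agent: str, report: dict) -> list[str]:
    artifacts = []
    if "HeadlessChrome" in user_agent:
        artifacts.append("HeadlessChrome token in User-Agent")
    if user_agent.startswith("python-requests/"):
        artifacts.append("default python-requests User-Agent")
    if report.get("navigator_webdriver") is True:
        artifacts.append("navigator.webdriver is true")
    return artifacts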
The composite model
No single signal is a reliable classifier. An agent can hide one or two signals. But hiding all of them simultaneously, without breaking the agent's functionality, is extremely difficult. Each additional signal layer that an agent has to defeat adds cost, latency, and maintenance burden.
The Agent FP confidence score is produced by a gradient-boosted ensemble that weights 40+ signals based on their historical reliability and current prevalence in the agent traffic we observe. The model is updated weekly as new agent frameworks and evasion techniques emerge.
The result is a score that is robust to partial evasion: an agent that successfully hides 10 of our 40 signals still produces a high confidence score based on the 30 it didn't hide.
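To make the composite idea concrete, here is a toy version of how 40+ per-request signal scores might feed a gradient-boosted ensemble. It uses scikit-learn's GradientBoostingClassifier purely as an illustration; the features, labels, and model here are synthetic and are not the production pipeline.

# Toy sketch of composite scoring with a gradient-boosted ensemble.
# Features, labels, and the model are illustrative only.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n_signals = 40

# Synthetic training data: rows are requests, columns are per-signal scores in [0, 1].
X_agent = rng.uniform(0.4, 1.0, size=(500, n_signals))   # agents trip most signals
X_human = rng.uniform(0.0, 0.5, size=(500, n_signals))   # humans trip few
X = np.vstack([X_agent, X_human])
y = np.array([1] * 500 + [0] * 500)

model = GradientBoostingClassifier().fit(X, y)

# An agent that fully suppresses 10 of the 40 signals typically still scores
# well above 0.5, because the remaining 30 carry the classification.
evading_agent = rng.uniform(0.4, 1.0, size=(1, n_signals))
evading_agent[0, :10] = 0.0
print(model.predict_proba(evading_agent)[0, 1])  # probability the request is an agent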
See this in action on your traffic
Agent FP applies all 40+ signals to every request on your site or API. Free for 100K requests/month.
Get early access