
LLM fingerprinting in agentic apps: where does it fail?


(@nhi-mgmt-group)
Member Moderator
Joined: 1 year ago
Posts: 3789
Topic starter  

TL;DR: LLMmap’s open-set LLM fingerprinting reached about 95% top-1 accuracy on raw model APIs, but recognition fell sharply in agentic deployments, dropping to 17.95% under restrictive prompts and to 38.46% under German output, according to Lasso Security. The practical lesson is that model identification is no longer a clean lab exercise once tools, prompts, and language shaping enter the response path.

NHIMG editorial — based on content published by Lasso Security: From Lab to Wild, how robust is LLM fingerprinting in the agentic era?

Questions worth separating out

Q: How should security teams test LLM fingerprinting in production AI agents?

A: Test the fingerprinting method against the exact production agent stack, not just the raw model API.

Q: Why does agentic AI make model identification less reliable?

A: Agentic AI adds tools, memory, retrieval, and formatting rules between the user and the model, so the observable response no longer reflects the model alone.

Q: What do security teams get wrong about fingerprinting hardened AI systems?

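A: Teams often assume that a successful defensive prompt makes the model unidentifiable. The reported figures show recognition dropping under restrictive prompts (to 17.95%), not disappearing.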

Practitioner guidance

  • Benchmark fingerprinting against the deployed agent, not the base model. Run identification tests on the exact production stack, including system prompt, tools, retrieval, language settings, and formatting rules.
  • Treat prompt hardening as a fingerprinting control boundary. Review which defensive instructions reduce observable model variance, such as forced refusals, rigid output schemas, or mandatory tool calls.
  • Require reliability checks before operational use of model IDs. Do not act on a top-1 model guess alone.
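
One way to act on the first point is to score identification results per deployment configuration rather than as a single overall number. The sketch below is a minimal illustration, not LLMmap's or Lasso Security's tooling: the function name `top1_accuracy_by_config`, the tuple layout, and the sample records are all hypothetical, and the data is made up rather than taken from the report.

```python
from collections import defaultdict


def top1_accuracy_by_config(results):
    """Compute top-1 accuracy per configuration.

    `results` is an iterable of (config, true_model, predicted_model)
    tuples. This record layout is an assumption for illustration.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for config, true_model, predicted_model in results:
        totals[config] += 1
        if predicted_model == true_model:
            hits[config] += 1
    return {config: hits[config] / totals[config] for config in totals}


# Made-up illustration data, NOT figures from the report.
sample = [
    ("raw_api", "model-a", "model-a"),
    ("raw_api", "model-b", "model-b"),
    ("restrictive_prompt", "model-a", "model-b"),
    ("restrictive_prompt", "model-b", "model-b"),
]

accuracy = top1_accuracy_by_config(sample)
for config, value in accuracy.items():
    print(f"{config}: {value:.2%}")
```

Keeping the breakdown per configuration makes it visible when a method that performs well on the raw API degrades behind a specific prompt, language setting, or tool layer.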

What's in the full report

Lasso Security's full research covers the operational detail this post intentionally leaves for the source:

  • The per-configuration recognition tables that show how pure, default, german, and restrictive settings affect exact identification.
  • The underlying statistical tests, including pairwise comparisons and corrected significance thresholds, for practitioners who need the evidence trail.
  • The full description of the four application scenarios, including the customer-service chatbot, email assistant, and research assistant with RAG.
  • The paper's future-work discussion on agent-shaped templates, multilingual retraining, and reliability-aware fingerprinting design.

👉 Read Lasso Security's research on LLM fingerprinting in agentic deployments →

(@mr-nhi)
Member Moderator
Joined: 4 weeks ago
Posts: 2127
 

Agentic wrappers turn model identification into a governance problem, not just a tooling problem. Once the model sits behind tools, retrieved context, and output constraints, the visible response signal is no longer the model alone. That means security teams are not evaluating a stable identity surface, but a composite behavioural layer that changes with deployment choices. The practitioner conclusion is that model inventory controls must account for the wrapper, not just the foundation model.

A few things that frame the scale:

  • 98% of companies plan to deploy even more AI agents within the next 12 months, despite documented rogue behaviour in 80% of current deployments, according to the AI Agents: The New Attack Surface report.
  • Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.

A question worth separating out:

Q: How can organisations tell if model identification results are trustworthy?

A: Use a confidence check that includes a distance metric, a second validation method, and an out-of-distribution flag for wrapped or multilingual deployments. If the tool cannot explain when its own answer becomes unreliable, treat the result as advisory rather than authoritative.
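
The three-part check described above could be sketched as a simple gate. This is an assumed design, not a method from the Lasso Security research: the `FingerprintResult` fields, the `classify_result` function, and the `max_distance` threshold of 0.3 are all hypothetical and would need calibrating against a real fingerprinting tool.

```python
from dataclasses import dataclass


@dataclass
class FingerprintResult:
    # All field names are hypothetical, chosen for this illustration.
    top1_model: str            # best guess from the primary fingerprinting method
    distance: float            # distance to the nearest known fingerprint (lower is closer)
    second_method_model: str   # guess from an independent validation method
    out_of_distribution: bool  # True for wrapped or multilingual deployments


def classify_result(result, max_distance=0.3):
    """Return 'authoritative' only when all three checks pass.

    The default threshold is an arbitrary placeholder, not a recommended value.
    """
    if result.out_of_distribution:
        return "advisory"
    if result.distance > max_distance:
        return "advisory"
    if result.second_method_model != result.top1_model:
        return "advisory"
    return "authoritative"


clean = FingerprintResult("model-a", 0.1, "model-a", False)
wrapped = FingerprintResult("model-a", 0.1, "model-a", True)
print(classify_result(clean))    # passes all three checks
print(classify_result(wrapped))  # downgraded by the out-of-distribution flag
```

Defaulting to "advisory" whenever any check fails keeps a single confident-looking top-1 guess from driving an operational decision on its own.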

👉 Read our full editorial: LLM fingerprinting weakens fast in agentic deployments



   