LLM Availability Abuse

LLM availability abuse is a denial-of-service pattern that targets model uptime and responsiveness rather than data confidentiality. It can happen through legitimate interfaces when an attacker submits requests that are computationally expensive enough to slow or block other workload traffic.

Expanded Definition

LLM availability abuse is a service-degradation pattern that targets model uptime, queue depth, and response latency instead of stealing data. The attacker stays inside legitimate request paths, but crafts prompts, tool calls, or session patterns that force expensive inference, repeated retries, or excessive context processing.

In NHI and agentic AI environments, the risk grows when access is broad, rate limits are weak, or an agent can chain requests into downstream tools. The issue is not limited to public chat endpoints. It also appears in internal copilots, workflow agents, and API-backed LLM services where availability is now an operational dependency. Guidance in the industry is still evolving, but frameworks such as the OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework both treat resilience as a core governance concern rather than an afterthought.
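As an illustration of why request-count rate limits are weak here, the following minimal sketch charges each identity in estimated model tokens instead of requests, so one very expensive prompt costs more than many cheap ones. The class name TokenBudget, its parameters, and the injectable clock are assumptions made for this example, not part of any framework cited above.

```python
import time


class TokenBudget:
    """Token bucket charged in model tokens rather than request count (hypothetical helper)."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = float(capacity)              # most tokens an identity can hold
        self.refill_per_sec = float(refill_per_sec)  # tokens restored per second
        self._clock = clock                          # injectable so the logic can be tested
        self._level = self.capacity
        self._updated = clock()

    def try_spend(self, cost):
        """Admit a request costing `cost` tokens, or return False without charging."""
        now = self._clock()
        refill = (now - self._updated) * self.refill_per_sec
        self._level = min(self.capacity, self._level + refill)
        self._updated = now
        if cost > self._level:
            return False
        self._level -= cost
        return True


# One budget per non-human identity (service account, agent, API key).
budgets = {}


def admit(identity, estimated_tokens, capacity=1000, refill_per_sec=100):
    """Look up or create the identity's budget and try to charge it."""
    budget = budgets.setdefault(identity, TokenBudget(capacity, refill_per_sec))
    return budget.try_spend(estimated_tokens)
```

Keying the budget on the identity rather than on the source address means the limit follows the credential, which matters when an agent chains requests into downstream tools.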

The most common mistake is treating every slowdown as ordinary traffic growth. This happens when teams do not distinguish normal burst usage, where many requests arrive at typical cost, from deliberately expensive request patterns, where fewer requests each consume far more compute.
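One way to separate the two cases is to compare request rate and per-request cost against a baseline independently. The sketch below is a rough heuristic under stated assumptions: the function name, the baseline inputs, and the factor of 3 are illustrative choices, not values taken from any cited framework.

```python
from statistics import mean


def classify_window(request_costs, baseline_rate, baseline_cost, factor=3.0):
    """Label one monitoring window.

    request_costs: per-request token costs observed in the window.
    baseline_rate: typical number of requests per window.
    baseline_cost: typical token cost per request.
    factor:        how far above baseline counts as anomalous (assumed value).
    """
    if not request_costs:
        return "idle"
    rate_ratio = len(request_costs) / baseline_rate
    cost_ratio = mean(request_costs) / baseline_cost
    # Check cost first: unusually expensive requests are the abuse signal,
    # even when the request count looks unremarkable.
    if cost_ratio >= factor:
        return "expensive-requests"
    if rate_ratio >= factor:
        return "burst"
    return "normal"
```

A window of many ordinary requests is labelled a burst, while a window of few but very costly requests is flagged separately, which is the distinction that capacity dashboards based on request counts alone tend to miss.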

Examples and Use Cases

Controls against LLM availability abuse usually mean tighter quotas and more aggressive throttling, so organisations have to balance user convenience against service resilience. Typical abuse patterns and test scenarios include:

  • A malicious user submits long, nested prompts that maximize token processing and create backlog for other tenants.
  • An attacker repeatedly triggers tool-enabled agent workflows, causing downstream calls to external systems and amplifying compute load.
  • A botnet rotates through many low-volume sessions to avoid simple per-IP limits while still exhausting inference capacity.
  • A red team tests how quickly an internal copilot degrades when context windows are filled with unnecessary data and repeated follow-up queries.
  • In a shared environment, a single customer workload dominates GPU scheduling and creates a visible denial-of-service effect for other users.
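The shared-environment case in the last example can be mitigated by scheduling per tenant instead of first come, first served. The following is a minimal round-robin sketch; the FairQueue class and its method names are hypothetical, and a real inference scheduler would also weigh request cost and priority.

```python
from collections import OrderedDict, deque


class FairQueue:
    """Round-robin queue across tenants (hypothetical helper)."""

    def __init__(self):
        # tenant -> pending requests; dictionary order is the rotation order
        self._queues = OrderedDict()

    def put(self, tenant, request):
        self._queues.setdefault(tenant, deque()).append(request)

    def get(self):
        """Return (tenant, request) for the tenant at the front of the rotation, or None."""
        if not self._queues:
            return None
        tenant, pending = next(iter(self._queues.items()))
        request = pending.popleft()
        del self._queues[tenant]
        if pending:
            # Tenant still has work: move it to the back of the rotation.
            self._queues[tenant] = pending
        return tenant, request
```

With this ordering, a tenant that enqueues a large backlog is served one request per turn, so a second tenant's single request is not stuck behind the whole backlog.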

These patterns are closely related to broader agentic attack-surface issues described in NHIMG research such as the AI Agents: The New Attack Surface report and the AI LLM hijack breach, where abuse emerges through legitimate access rather than obvious intrusion. Standards are also converging on abuse resistance, notably the OWASP Agentic AI Top 10 and the NIST AI 600-1 Generative AI Profile.

Why It Matters in NHI Security

Availability abuse matters because LLMs are increasingly embedded in identity workflows, incident response, knowledge retrieval, and automation. When an attacker can stall the model, they can also stall the business process that depends on it. In NHI terms, the model endpoint, service account, tool credentials, and orchestration layer form a chain of trust that must be protected as a single operational surface.

NHIMG research shows how quickly agentic systems can escape governance boundaries: only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation, according to the AI Agents: The New Attack Surface report from SailPoint. That same visibility gap makes abuse harder to notice when the symptom is latency rather than exfiltration. The right response usually combines request shaping, token and tool budgets, tenant isolation, and monitoring aligned to the MITRE ATLAS adversarial AI threat matrix and the CSA MAESTRO agentic AI threat modeling framework.
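To make "token and tool budgets" concrete, the sketch below caps a single agent run by both total tokens and number of tool calls, and stops the loop when either cap would be exceeded. RunBudget, BudgetExceeded, and run_agent are names invented for this example, and the list of steps is a stand-in for a real agent loop.

```python
class BudgetExceeded(RuntimeError):
    """Raised when an agent run would exceed its token or tool-call cap."""


class RunBudget:
    """Hard per-run caps on tokens and tool calls (hypothetical helper)."""

    def __init__(self, max_tool_calls, max_tokens):
        self.max_tool_calls = max_tool_calls
        self.max_tokens = max_tokens
        self.tool_calls = 0
        self.tokens = 0

    def charge(self, tokens, tool_call=False):
        new_tokens = self.tokens + tokens
        new_calls = self.tool_calls + (1 if tool_call else 0)
        # Check before committing so a rejected step leaves the counters unchanged.
        if new_tokens > self.max_tokens:
            raise BudgetExceeded("token budget exhausted")
        if new_calls > self.max_tool_calls:
            raise BudgetExceeded("tool-call budget exhausted")
        self.tokens, self.tool_calls = new_tokens, new_calls


def run_agent(steps, budget):
    """Run steps until done or out of budget.

    steps: iterable of (tokens, is_tool_call) tuples standing in for agent actions.
    Returns (completed_step_count, status_message).
    """
    completed = 0
    try:
        for tokens, is_tool_call in steps:
            budget.charge(tokens, tool_call=is_tool_call)
            completed += 1
    except BudgetExceeded as exc:
        return completed, str(exc)
    return completed, "ok"
```

A hard per-run cap complements a refilling rate limit: the rate limit bounds sustained load from an identity, while the run budget bounds how far any single agent loop can amplify one request.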

Organisations typically encounter the operational impact only after a high-volume request flood or an expensive agent loop exhausts capacity, at which point addressing availability abuse becomes unavoidable.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A6 | Availability abuse maps to denial-of-service and resource exhaustion risks in agentic systems.
NIST AI RMF | GV.2 | AI RMF treats resilience and harmful abuse as governance issues requiring ongoing risk management.
CSA MAESTRO | TBD | MAESTRO addresses agentic workflow abuse that can cascade into service exhaustion.

Define resilience thresholds and monitor LLM capacity as an AI risk, not just an ops metric.
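A resilience threshold of this kind can be expressed as a simple check over recent latency samples and queue depth. The sketch below is illustrative only: the function names and the choice of p95 latency and queue depth as the two signals are assumptions, and the limits would come from an organisation's own risk appetite.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    # Integer form of ceil(0.95 * n), which avoids floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def breaches(latencies_ms, queue_depth, max_p95_ms, max_queue_depth):
    """Return the names of the resilience thresholds currently exceeded."""
    found = []
    if latencies_ms and p95(latencies_ms) > max_p95_ms:
        found.append("p95_latency")
    if queue_depth > max_queue_depth:
        found.append("queue_depth")
    return found
```

Recording each breach as a risk event, rather than only as an operations alert, is what ties the capacity signal back to the governance process described above.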