A control that restricts how many requests a caller can make within a defined period. It protects availability and reduces abuse, but it only works when paired with correct authentication, authorisation and endpoint design.
Expanded Definition
Rate limiting is a policy that caps how many requests a caller can make over a defined interval, such as per second, per minute, or per day. In NHI security, the caller may be a service account, API key, workload, or AI agent, not just a human user. It is often implemented at the API gateway, reverse proxy, service mesh, or application layer, and it should be treated as an availability and abuse-control measure, not as an identity control by itself. Guidance varies across vendors on whether rate limiting belongs in access management, application security, or resilience engineering, but the operational goal is consistent: prevent burst abuse, credential-stuffing-style automation, and runaway agent behaviour. The NIST Cybersecurity Framework 2.0 reinforces this kind of protective control as part of broader risk reduction and service resilience. In practice, rate limits should be paired with authentication strength, authorisation logic, and endpoint-specific quotas so legitimate workloads are not blocked while abusive traffic is contained. The most common misapplication is treating rate limiting as a substitute for authorisation: an exposed endpoint throttles traffic but still allows overprivileged callers to perform harmful actions.
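One common way to cap requests over an interval while still tolerating short bursts is a token bucket. The following is a minimal sketch, not a reference implementation: the class name `TokenBucket` and the parameters `capacity` and `refill_per_sec` are illustrative assumptions, and the clock is passed in explicitly so the behaviour is easy to reason about.

```python
import time
from typing import Optional


class TokenBucket:
    """Illustrative token bucket: bursts up to `capacity`, refilled at `refill_per_sec`."""

    def __init__(self, capacity: float, refill_per_sec: float, now: Optional[float] = None):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity  # start full so an initial burst is allowed
        self.updated_at = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True and consume one token if a request may proceed."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(capacity=3, refill_per_sec=1.0, now=0.0)
burst = [bucket.allow(now=0.0) for _ in range(4)]  # three allowed, fourth throttled
later = bucket.allow(now=1.0)                      # one token refilled after a second
```

A limiter like this only decides whether a request may proceed; whether the caller is permitted to perform the action is still a matter for authentication and authorisation.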
Examples and Use Cases
Implementing rate limiting rigorously often introduces operational friction, requiring organisations to balance abuse resistance against the risk of slowing legitimate automation and time-sensitive integrations.
- An API for invoice creation permits only a small burst rate per service account, reducing the impact of compromised API keys while preserving normal batch processing.
- A model-serving endpoint throttles repeated prompt submissions from an AI agent to prevent runaway loops and control cloud cost exposure.
- A secrets retrieval endpoint applies tighter quotas to legacy workloads after review shows frequent retries caused by misconfigured clients.
- After a spike in failed calls, teams correlate rate-limit events with service account behaviour and review exposure patterns documented in the Ultimate Guide to NHIs.
- A partner integration uses tiered request budgets so higher-trust callers receive larger limits, while unknown or newly onboarded callers are constrained until they are validated under NIST Cybersecurity Framework 2.0 aligned controls.
For NHI-heavy environments, rate limiting is most effective when applied per identity, per token, and per endpoint, because a single workload can fan out into many calls. It should also be reviewed alongside rotation and offboarding practices, since abandoned credentials can still consume capacity even after business ownership has changed. A well-tuned policy may slow abuse without creating a brittle control plane.
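Keying limits by identity and endpoint, with tiered budgets for callers at different trust levels, might look like the sketch below. It uses a simple fixed-window counter; the tier names, budget values, identities, and endpoint paths are hypothetical placeholders, not recommended settings.

```python
# Hypothetical tiers and per-window budgets; tune these to observed traffic.
TIER_BUDGETS = {"unverified": 10, "validated": 100}
WINDOW_SECONDS = 60


class KeyedFixedWindowLimiter:
    """Illustrative fixed-window limiter keyed by (identity, endpoint)."""

    def __init__(self, budgets=None, window_seconds=WINDOW_SECONDS):
        self.budgets = dict(TIER_BUDGETS if budgets is None else budgets)
        self.window_seconds = window_seconds
        # (identity, endpoint) -> (window index, requests counted in that window)
        self.state = {}

    def allow(self, identity, endpoint, tier, now):
        # Unknown tiers fall back to the smallest budget (an assumption of this sketch).
        budget = self.budgets.get(tier, min(self.budgets.values()))
        window = int(now // self.window_seconds)
        key = (identity, endpoint)
        seen_window, count = self.state.get(key, (window, 0))
        if seen_window != window:
            count = 0  # a new window has started, so reset the counter
        if count >= budget:
            self.state[key] = (window, count)
            return False
        self.state[key] = (window, count + 1)
        return True


limiter = KeyedFixedWindowLimiter(budgets={"unverified": 2, "validated": 5})
first = limiter.allow("svc-billing", "/invoices", "unverified", now=0)
second = limiter.allow("svc-billing", "/invoices", "unverified", now=1)
third = limiter.allow("svc-billing", "/invoices", "unverified", now=2)   # over budget
other = limiter.allow("svc-billing", "/secrets", "unverified", now=2)    # separate key
```

Because each (identity, endpoint) pair has its own counter, one workload fanning out across endpoints cannot borrow budget from another, and a credential that should have been offboarded shows up as its own key rather than being hidden in aggregate traffic.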
Why It Matters in NHI Security
Rate limiting matters because NHI abuse often looks like ordinary automation until volume or timing reveals the pattern. In the Ultimate Guide to NHIs, NHI Mgmt Group reports that 79% of organisations have experienced secrets leaks, with 77% of those incidents causing tangible damage, and that 97% of NHIs carry excessive privileges. When a leaked token, overused service account, or misbehaving agent is left unchecked, request throttling becomes one of the few controls that can slow exploitation long enough for responders to rotate secrets, revoke access, and isolate the affected path. It also reduces the blast radius of logic bugs, scraping, enumeration, and denial-of-wallet-style attacks against cost-bearing APIs. The control is not a cure-all, because a determined adversary can distribute requests across many identities or endpoints, but it still creates measurable friction and visibility. Organisations typically recognise the need for rate limiting only after a compromised credential, runaway agent, or partner integration has already exhausted resources, at which point the control can no longer be deferred.
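The visibility benefit can be made concrete by tallying throttled requests per identity and flagging the noisiest callers for review. This is a hedged sketch: the event shape (dictionaries with `identity` and `throttled` keys), the function name, and the threshold are all assumptions for illustration, not a format defined by any particular gateway.

```python
from collections import Counter


def identities_to_review(events, min_rejections=3):
    """Return identities with at least `min_rejections` throttled requests, most-throttled first."""
    rejections = Counter(e["identity"] for e in events if e["throttled"])
    return [identity for identity, n in rejections.most_common() if n >= min_rejections]


# Hypothetical rate-limit events, such as might be parsed from gateway logs.
events = (
    [{"identity": "svc-reporting", "throttled": True}] * 5
    + [{"identity": "agent-42", "throttled": True}] * 3
    + [{"identity": "svc-billing", "throttled": True}] * 1
    + [{"identity": "svc-billing", "throttled": False}] * 20
)
flagged = identities_to_review(events)
```

Identities surfaced this way are candidates for secret rotation, access review, or isolation; the count alone does not show whether the cause is compromise or a misconfigured client.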
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Non-Human Identity Top 10 | NHI-05 | Rate limits help contain abusive non-human callers and credential misuse. |
| NIST CSF 2.0 | PR.PT-5 | Protective technology controls include limiting service exposure and abuse patterns. |
| NIST Zero Trust (SP 800-207) | | Zero trust requires policy enforcement that can constrain each request path. |
Apply per-identity and per-endpoint throttles to limit abuse from service accounts and API keys.
Reviewed and updated by the NHIMG editorial team on June 23, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org