Runtime authorization is becoming the control plane for identity security

By NHI Mgmt Group Editorial Team | Published 2026-05-18 | Domain: Best Practices | Source: Cerbos

TL;DR: Static authorization preserves yesterday’s assumptions, while runtime authorization decides each request against live policy and context, a distinction Cerbos uses to frame the modern IAM stack. The control matters because stolen credentials, over-permissioned workloads, and AI agents all move faster than admin-time reviews can react. The broken assumption is that a decision made at grant or login time is still safe at the moment of use.

At a glance

What this is: An analysis of runtime authorization as the identity security layer that evaluates each request at the moment it hits a service, not at provisioning or login.

Why it matters: IAM, PAM, and IGA programmes still leave a decision gap at the point of use, which is where service accounts, workloads, and AI agents now create the highest-risk access paths.

By the numbers:

88% of basic web application attacks involved stolen credentials, according to Verizon's 2025 Data Breach Investigations Report.
95% of cloud identities use less than 3%
40% of enterprise applications will feature task-specific AI agents by 2026, up from less than 5% a year earlier, according to Gartner.

Context

Runtime authorization is the decision layer that evaluates whether a specific request should be allowed right now, based on the principal, the resource, the action, and the current context. That is the missing piece in many IAM programmes, because provisioning, authentication, and access reviews all describe access before use, while runtime authorization governs access at the exact point of impact.

The problem is not that those upstream controls are unnecessary. The problem is that static models preserve yesterday’s state, and identity security failures increasingly happen in live sessions, over API calls, and across delegated machine and agent interactions. For service accounts, workloads, and AI agents, the governance question is no longer who was granted access, but whether the request should still be allowed this instant.

Cerbos uses the runtime authorization category to describe that enforcement layer, including continuous evaluation when session conditions change mid-stream. For IAM teams, that means the control boundary moves from records of entitlement to live decisioning, which changes how least privilege, auditability, and zero trust have to be implemented.

Key questions

Q: How should security teams implement runtime authorization alongside IGA and PAM?

A: Treat IGA as the source of granted entitlement, PAM as the control for elevated access, and runtime authorization as the request-time decision layer. The practical goal is to ensure that a live request is evaluated against current policy and context before any application or API action proceeds. That keeps access reviews useful without assuming they are sufficient.
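The layering described above can be sketched in a few lines. This is a minimal illustration only, assuming hypothetical in-memory stand-ins for the IGA entitlement record and the PAM elevation state; the names IGA_ENTITLEMENTS, PAM_ACTIVE_ELEVATIONS, ELEVATED_ACTIONS, and authorize are invented for the example and do not come from Cerbos or any other product.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins: a real deployment would query the IGA and PAM systems.
IGA_ENTITLEMENTS = {
    ("svc-billing", "invoices", "read"),
    ("svc-billing", "invoices", "delete"),
}
PAM_ACTIVE_ELEVATIONS = set()  # (principal, action) pairs with a live elevation
ELEVATED_ACTIONS = {"delete"}


@dataclass(frozen=True)
class Request:
    principal: str
    resource: str
    action: str
    context: dict = field(default_factory=dict)


def authorize(req: Request) -> tuple[bool, str]:
    """Request-time decision layered on top of IGA and PAM state."""
    # 1. IGA: was this access ever granted?
    if (req.principal, req.resource, req.action) not in IGA_ENTITLEMENTS:
        return False, "no entitlement on record"
    # 2. PAM: elevated actions need a currently active elevation.
    if req.action in ELEVATED_ACTIONS and (req.principal, req.action) not in PAM_ACTIVE_ELEVATIONS:
        return False, "elevation not active"
    # 3. Runtime: does the live context still fit policy?
    if req.context.get("credential_revoked", False):
        return False, "credential revoked"
    return True, "allowed"
```

In this sketch the entitlement record is necessary but never sufficient: a request with a valid grant is still denied when the elevation has lapsed or the live context reports a revoked credential.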

Q: Why do service accounts and AI agents increase the need for runtime authorization?

A: Service accounts and AI agents act in dynamic request paths, often across tools, services, and delegated chains that change faster than provisioning records. Static authorization can describe what they were allowed to do, but not whether the current request still fits the policy. Runtime authorization matters because it checks the action at the moment it matters, when drift and misuse become real.

Q: What breaks when authorization is decided only at login or provisioning time?

A: The control breaks when the live request differs from the conditions assumed at login or provisioning. Tokens, roles, and approvals may all have been correct when issued but wrong for the current resource, context, or session state. That is how identity programmes end up with access that looks governed on paper but remains executable in production.

Q: How can organisations tell whether runtime authorization is actually working?

A: Look for three signs: decisions happen fast enough to stay inline, policies use live context instead of stale claims, and every allow or deny produces an auditable record. If teams cannot explain a specific decision after the fact, or if applications bypass the control because it is too slow, the runtime layer is not functioning as intended.
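One way to make every allow or deny explainable after the fact is to write the inputs, the matched rule, and the outcome into a single record at decision time. The sketch below assumes a hypothetical two-rule policy and an in-memory list as the audit sink; the function name decide_with_audit, the rule names, and the risk_score field are illustrative assumptions.

```python
import json
import time
import uuid


def decide_with_audit(principal: str, resource: str, action: str,
                      context: dict, audit_log: list) -> bool:
    """Evaluate one request and append an explainable record for it."""
    started = time.perf_counter()
    # Hypothetical policy: reads are allowed; other actions need a low risk score.
    if action == "read":
        allowed, rule = True, "allow-read"
    elif context.get("risk_score", 100) < 50:
        allowed, rule = True, "allow-low-risk-write"
    else:
        allowed, rule = False, "deny-default"
    audit_log.append(json.dumps({
        "decision_id": str(uuid.uuid4()),
        "principal": principal,
        "resource": resource,
        "action": action,
        "context": context,      # the live inputs the decision used
        "matched_rule": rule,    # why it was allowed or denied
        "allowed": allowed,
        "latency_ms": (time.perf_counter() - started) * 1000,
    }))
    return allowed
```

Because the record carries the context and the matched rule, a reviewer can reconstruct why a specific request was denied without replaying the session.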

Technical breakdown

Runtime authorization versus admin-time authorization

Admin-time authorization is the durable grant created during provisioning, role assignment, or approval. Runtime authorization is different because it evaluates each request at the moment it reaches a service, using live policy and live context rather than a frozen entitlement record. That distinction matters because the same user, workload, or agent can be validly authorised in the abstract and still be denied for a specific request if the context has changed. Continuous authorization extends that model by re-evaluating access when conditions shift after the initial decision. In practice, this is the layer that converts identity data into a security boundary instead of a historical record.

Practical implication: Move high-risk enforcement out of provisioning logic and into request-time policy evaluation.
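The difference between the two models can be shown side by side. The following is a sketch under stated assumptions: GRANTS is a hypothetical provisioning record, and the context keys source_network and token_expires_at (an epoch timestamp) are invented for illustration rather than taken from any specific product.

```python
import time

# Admin-time grant: a frozen record created at provisioning (hypothetical data).
GRANTS = {"svc-reporting": {"customer-db:read"}}


def admin_time_check(principal: str, permission: str) -> bool:
    """Answers only: was this permission granted at some point?"""
    return permission in GRANTS.get(principal, set())


def runtime_check(principal: str, permission: str, context: dict) -> bool:
    """Answers: should this request proceed now, given live context?"""
    if not admin_time_check(principal, permission):
        return False
    # Illustrative live conditions; real policies would be externalised.
    if context.get("source_network") != "internal":
        return False
    # A missing expiry is treated as already expired (fail closed).
    return context.get("token_expires_at", 0.0) > time.time()
```

The same principal passes admin_time_check in every case, yet runtime_check denies the request when it arrives from an unexpected network or with an expired token.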

Policy evaluation, PEPs, and PDPs in runtime enforcement

A runtime authorization platform usually separates policy decision points from policy enforcement points. The PEP intercepts the request, the PDP evaluates the policy, and supporting inputs from identity providers, data stores, and application state flow in as live context. This architecture keeps authorization close to the workload while preserving a central policy model. Stateless PDPs are important because they can scale horizontally and avoid becoming a bottleneck on the request path. If the decision engine cannot respond quickly enough, teams route around it and the control evaporates.

Practical implication: Place the decision engine near the workload and test latency under production load before expanding coverage.
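The PEP and PDP split can be illustrated in-process. This is a simplified sketch, not a description of how any particular platform implements it: pdp_decide stands in for a remote, stateless decision service, and the pep decorator, the role names, and the session_revoked flag are assumptions made for the example.

```python
import functools
from typing import Callable


def pdp_decide(principal: dict, resource: dict, action: str, context: dict) -> bool:
    """Stateless PDP: the decision depends only on the inputs passed in."""
    is_admin = "admin" in principal.get("roles", [])
    if action == "delete":
        return is_admin and not context.get("session_revoked", False)
    return is_admin or resource.get("owner") == principal.get("id")


def pep(action: str, pdp: Callable[..., bool] = pdp_decide):
    """PEP as a decorator: intercept the call, ask the PDP, then enforce."""
    def wrap(handler):
        @functools.wraps(handler)
        def guarded(principal: dict, resource: dict, context: dict):
            if not pdp(principal, resource, action, context):
                raise PermissionError(f"{action} denied for {principal.get('id')}")
            return handler(principal, resource, context)
        return guarded
    return wrap


@pep("delete")
def delete_document(principal: dict, resource: dict, context: dict) -> str:
    # Business logic contains no access rules of its own.
    return f"deleted {resource['id']}"
```

Because pdp_decide holds no state between calls, any number of copies can serve requests in parallel, which is the property that lets a decision engine scale horizontally.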

Open standards for runtime authorization and continuous signals

AuthZEN standardises the API between enforcement points and decision engines, which reduces bespoke integration work and makes runtime authorization more portable across stacks. Shared Signals and CAEP extend the model by letting systems react to mid-session changes such as revocation or risk events. For workloads, SPIFFE and SPIRE provide portable identity primitives that runtime policy can evaluate. The technical significance is that runtime authorization is no longer only an application pattern. It is becoming an interoperable identity layer that can span gateways, services, and machine identities.

Practical implication: Use standards-based integrations where possible so runtime decisions remain portable across services and vendors.
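As an illustration of what a standards-based call looks like, the sketch below builds a request body in the subject, resource, action, and context shape used by the OpenID AuthZEN access evaluation API, and parses a boolean decision from the response. The field names should be checked against the current AuthZEN specification before use; the helper names, the SPIFFE ID, and the resource values are hypothetical, and no network call is made.

```python
import json


def build_evaluation_request(subject_id: str, subject_type: str, action: str,
                             resource_type: str, resource_id: str,
                             context: dict) -> str:
    """Build an AuthZEN-style access evaluation request body as JSON."""
    return json.dumps({
        "subject": {"type": subject_type, "id": subject_id},
        "resource": {"type": resource_type, "id": resource_id},
        "action": {"name": action},
        "context": context,
    })


def parse_decision(response_body: str) -> bool:
    """Fail closed: anything other than an explicit true decision is a deny."""
    try:
        return json.loads(response_body).get("decision") is True
    except (json.JSONDecodeError, AttributeError):
        return False


# Example: a workload identified by a (hypothetical) SPIFFE ID asks to read an invoice.
example_body = build_evaluation_request(
    "spiffe://example.org/ns/payments/sa/billing", "workload",
    "read", "invoice", "inv-42", {"time": "2026-05-18T10:00:00Z"},
)
```

Treating malformed or missing responses as a deny keeps the enforcement point safe when the decision engine is unreachable or returns something unexpected.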

Threat narrative

Attacker objective: Turn a valid identity event into real-time access to systems and data that static controls would have allowed by default.

Entry occurs when a legitimate credential, workload identity, or agent session reaches a service with more access than the moment requires.
Escalation happens when the request is authorised by stale admin-time or session-time assumptions instead of current policy and context.
Impact follows when the attacker or mis-scoped workload can move laterally, access data, or trigger delegated actions that would have been blocked at runtime.

Cisco DevHub NHI breach — IntelBroker exploited exposed Cisco credentials, API tokens and keys in DevHub.
ASP.NET machine keys RCE attack — 3,000+ exposed ASP.NET machine keys enabled remote code execution.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Runtime authorization is the control layer that stops identity programmes from confusing granted access with safe access. Provisioning records, access reviews, and token claims all describe what was true earlier in the lifecycle. They do not answer the only question that matters when a request arrives at a service: should this action be allowed right now? For NHI, workload, and human access alike, that gap is where policy has to become executable. The practitioner implication is that request-time decisioning must be treated as a first-class identity control, not a convenience layer.

Standing privilege is the condition that runtime authorization exposes but does not itself create. The issue is not simply excess entitlement in a review report. The deeper problem is that a resource can remain reachable in the live request path long after the governance record has been signed off. That makes runtime checks the only point where unused access becomes visible as actual attack surface. The practitioner implication is to think in terms of enforced request boundaries, not only entitlement cleanup.

Admin-time authorization was designed for stable identity state. That assumption fails when the actor is an AI agent or a delegated machine workflow because the decision path, tool use, and request timing can all change within the active session. The implication is not just more control, but a rethink of where authorization truth lives when execution is no longer paced by human review.

Runtime authorization is becoming the practical meeting point between IGA, PAM, and zero trust. IGA tells you what was granted, PAM constrains elevated access, and runtime policy decides whether a specific action should proceed under current conditions. That makes the category less about product shape and more about closing the loop between governance intent and service enforcement. The practitioner implication is to align identity controls around decision moment, not around organisational convenience.

The market is moving from identity records to identity decisions. That shift matters because it changes what teams need to buy, design, and measure. Auditability, sub-millisecond policy execution, and live context have become architectural requirements, not optional features. The practitioner implication is to evaluate identity controls by how they behave under live traffic, not by how well they document historic access.

From our research:
79% of organisations have experienced secrets leaks, with 77% of these incidents resulting in tangible damage, according to the Ultimate Guide to NHIs.
Only 5.7% of organisations have full visibility into their service accounts, which is why request-time enforcement cannot depend on historical entitlement records alone.
Runtime authorization fits into the broader identity lifecycle problem described in the NHI Lifecycle Management Guide, where grant, rotation, and offboarding controls still need a live enforcement point.

What this signals

Runtime decisions are becoming the control plane for identity programmes that have outgrown static reviews. As service accounts, workloads, and AI agents take on more delegated access, IAM teams need policy that can answer the question at the moment of use, not after the next certification cycle. The practical shift is toward live enforcement, auditable decisions, and request-level context as the new operating baseline.

Standing privilege is the condition that runtime authorization is designed to expose in motion. When access is visible only at grant time, programmes miss the gap between entitlement and actual use. That is why the next maturity step is not more review cadence alone, but a policy layer that can interpret current identity state, current resource state, and current request state together.

For teams building toward zero trust, the important signal is whether request-time controls can operate across humans and non-humans without fragmenting into separate governance models. The more mixed the environment becomes, the more valuable it is to align policy, identity fabric, and audit output around a single decision point.

For practitioners

Map request-time enforcement gaps: Inventory where applications still rely on token claims, code-level checks, or provisioning-time entitlements instead of live policy evaluation at the service boundary.
Separate policy decision from policy enforcement: Use a PDP and PEP pattern so services can call a central decision layer without embedding access logic in application code.
Validate runtime latency before rollout: Measure sub-millisecond decision performance under peak traffic, then confirm that the control remains in path when workloads scale horizontally.
Feed live identity and context signals into policy: Connect identity providers, workload identity sources, and application state so the decision reflects current conditions rather than stale role data.
Extend authorization governance to non-human identities: Apply the same runtime policy model to service accounts, workloads, and AI agents so delegated access is checked at use, not just at grant.
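For the latency validation step, a simple harness can report percentile timings for a decision callable against a set of sample requests. This is a rough in-process sketch, not a substitute for a production load test: the function names, the repeat count, and the budget value are assumptions, and a real measurement would include network hops to the decision engine.

```python
import statistics
import time


def measure_decision_latency(decide, requests: list, repeats: int = 200) -> dict:
    """Time a decision callable and report p50 and p99 in milliseconds."""
    samples = []
    for _ in range(repeats):
        for req in requests:
            start = time.perf_counter()
            decide(req)
            samples.append((time.perf_counter() - start) * 1000)
    cuts = statistics.quantiles(samples, n=100)  # 99 percentile cut points
    return {"count": len(samples), "p50_ms": cuts[49], "p99_ms": cuts[98]}


def within_budget(report: dict, budget_ms: float) -> bool:
    """True when the p99 latency stays inside the agreed inline budget."""
    return report["p99_ms"] <= budget_ms
```

Tracking the p99 rather than the mean matters here, because it is the slow tail of decisions that tempts teams to route around the control.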

Key takeaways

Runtime authorization closes the gap between granted access and safe access by deciding each request at the moment it reaches a service.
The scale of the problem is already visible in stolen-credential attacks, excess cloud entitlement, and fast-growing AI agent adoption.
Practitioners should evaluate identity controls by live enforcement, auditability, and latency, not by how complete the provisioning record looks.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Non-Human Identity Top 10	NHI-03	Runtime controls help reduce standing credential and privilege risk.
NIST CSF 2.0	PR.AA-05	Dynamic access decisions support least-privilege enforcement across environments.
NIST Zero Trust (SP 800-207)	Section 2.1, tenets 4 and 6	Zero trust requires continuous evaluation at the point of request.

Place runtime authorization in the enforcement path so access is rechecked continuously.

Key terms

Runtime Authorization: Runtime authorization is the decision made at the exact moment a request reaches a service. It evaluates the principal, resource, action, and context in real time, which makes it fundamentally different from provisioning-time or login-time access decisions. In identity programmes, it is the enforcement layer that turns policy into a live control.
Policy Decision Point: A policy decision point is the component that evaluates whether a request should be allowed or denied. It takes live inputs from identity, resource, and contextual systems, then returns a decision for the enforcement point to act on. In runtime architectures, it must stay fast, stateless where possible, and auditable.
Policy Enforcement Point: A policy enforcement point is the place where a request is intercepted and the authorization decision is applied. It may sit in the application, sidecar, gateway, or mesh layer. Its job is to stop access from proceeding until the policy decision has been made and recorded.
Continuous Authorization: Continuous authorization is the practice of re-evaluating access after the initial decision has been made. If session risk, identity state, or resource state changes mid-session, the control can revoke or tighten access. It is especially relevant for workloads and AI agents that keep acting after the original grant.
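Continuous authorization can be sketched as session state that is updated by incoming signals and consulted on every request. The example below is a minimal in-memory illustration: the SessionAuthorizer class and the "risk-raised" event name are hypothetical, and "session-revoked" is only loosely modelled on the CAEP event of that name. In a real deployment such events would arrive as signed tokens over a Shared Signals stream rather than as plain strings.

```python
class SessionAuthorizer:
    """Re-evaluates a session whenever a new signal arrives."""

    def __init__(self) -> None:
        self._state = {}  # session_id -> latest known conditions

    def open_session(self, session_id: str, principal: str) -> None:
        self._state[session_id] = {"principal": principal, "revoked": False, "risk": "low"}

    def on_signal(self, session_id: str, event_type: str) -> None:
        """Apply a mid-session event (illustrative event names)."""
        state = self._state.get(session_id)
        if state is None:
            return
        if event_type == "session-revoked":
            state["revoked"] = True
        elif event_type == "risk-raised":
            state["risk"] = "high"

    def is_allowed(self, session_id: str, action: str) -> bool:
        """Checked on every request, so signals take effect on the next call."""
        state = self._state.get(session_id)
        if state is None or state["revoked"]:
            return False
        if state["risk"] == "high":
            return action == "read"  # tighten rather than fully revoke
        return True
```

The session is never trusted on the strength of its opening decision: a raised risk level narrows it to reads, and a revocation ends it, without waiting for the token to expire.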

Deepen your knowledge

Runtime authorization is a core topic in the NHI Foundation Level course, the industry's only accredited NHI security programme. If you are working to close the gap between entitlement and request-time enforcement, it is worth exploring.

This post draws on content published by Cerbos: Runtime Authorization Platform analysis and IAM stack positioning. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-05-18.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org