By NHI Mgmt Group Editorial Team
Published 2026-06-03
Domain: Agentic AI & NHIs
Source: WorkOS

TL;DR: AI agent audit logs must capture user identity, agent identity, delegated scope, tool-level actions, and approval context because application logs alone cannot reconstruct who authorised what, according to WorkOS. Without session-level accountability, enterprises cannot verify agent behaviour, satisfy compliance, or investigate incidents with confidence.


At a glance

What this is: This article explains why AI agent audit logs must go beyond application logs and capture identity, delegation, scope, and approval context.

Why it matters: IAM, PAM, and NHI programmes need records that prove who or what acted, what it was allowed to do, and whether a human approved it.

👉 Read WorkOS's analysis of why AI agent audit logs differ from application logs


Context

AI agent audit logging is the practice of recording not just what a system did, but who authorised it, which agent executed it, what it was allowed to access, and how that action flowed through the delegation chain. Existing application logs are built for request-response operations, so they miss the identity and approval detail that agentic systems now require.

For IAM and NHI teams, the gap is not in observability alone. It is in accountability across user, agent, and downstream tool calls. That means the logging model must support session scope, approval state, and identity-centric queries, or the organisation cannot prove whether an agent stayed inside its authorised boundary.


Key questions

Q: How should security teams log AI agent actions for audit and compliance?

A: Security teams should log AI agent actions as identity events, not just application events. Each record should include the human initiator, agent identity, approved session scope, tool invocation details, and any downstream delegation. That structure lets investigators prove whether the action stayed within authorised boundaries and gives compliance teams a defensible record of accountability.

Q: Why are application logs not enough for AI agent governance?

A: Application logs are built for operational troubleshooting, so they usually miss the approval chain, agent identity, and session scope that matter in agentic systems. An AI agent can perform many actions from one instruction, and those actions may span multiple tools and services. Without dedicated audit logging, teams cannot reconstruct responsibility with confidence.

Q: What should an AI agent audit trail include?

A: An audit trail should include who initiated the session, which agent executed it, what the agent was authorised to do, the exact tool calls made, the results returned, and whether a human approved the action. It should also preserve the delegation chain and time-bounded session context so the record is useful for incident response and compliance review.

Q: Who is accountable when an AI agent acts on behalf of a user?

A: The human initiator usually remains accountable for the task, but the organisation must be able to show how the agent was authorised and what it actually did. Accountability fails when logs flatten the agent into the user account and erase the approval trail. Strong audit records keep both identities visible and searchable.


Technical breakdown

Why application logs break down in agentic systems

Application logs are optimised for operational debugging: they record service, latency, status, and error context around a single request. Agent systems behave differently because one user action can fan out into many tool invocations, across multiple services, over a longer session window. A single application log entry usually captures only the final hop, which loses the causal chain and the actor distinction. In practice, that means you can see an API call succeed without knowing whether the agent was authorised, whether a human approved it, or which downstream tool actually executed it.

Practical implication: Treat application logs as operational telemetry and build a separate audit record for identity, scope, and approval.

What an agent audit log must record

An agent audit log needs to preserve four elements that application logs usually omit: the human initiator, the agent identity, the approved session scope, and the tool-level action details. It should also record delegation chain metadata so teams can reconstruct on-behalf-of activity across token exchange or downstream APIs. RFC 8693-style delegation conveys who is acting for whom, but it is not enough by itself because the audit record must also make the chain searchable, attributable, and time-bounded. Without that structure, compliance and incident response teams can see that something happened, but not whether it happened within the authorised session.

Practical implication: Design audit events around identity and authorisation fields first, then add operational metadata.
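The fields named in this section can be sketched as a single event type. This is a minimal illustration, not a schema from WorkOS or any standard: the class name `AgentAuditEvent` and every field name are assumptions chosen for readability, and identity and authorisation fields are deliberately listed before operational metadata.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional, Tuple


def _utc_now() -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class AgentAuditEvent:
    """Hypothetical audit event; all field names are illustrative."""

    # Identity and authorisation fields come first.
    session_id: str
    initiator_user_id: str             # human who started the session
    agent_id: str                      # agent that executed the action
    session_scope: Tuple[str, ...]     # what the agent was approved to do
    delegation_chain: Tuple[str, ...]  # initiator, agent, downstream service
    approved_by: Optional[str]         # None: no human approval recorded
    # Tool-level action details.
    tool: str
    action: str
    result_status: str
    # Operational metadata last.
    timestamp: str = field(default_factory=_utc_now)

    def to_json(self) -> str:
        """Serialise with sorted keys so the output is stable."""
        return json.dumps(asdict(self), sort_keys=True)


if __name__ == "__main__":
    event = AgentAuditEvent(
        session_id="sess-1",
        initiator_user_id="user:alice",
        agent_id="agent:helper",
        session_scope=("crm.read",),
        delegation_chain=("user:alice", "agent:helper", "svc:crm-api"),
        approved_by=None,
        tool="crm.lookup",
        action="read_contact",
        result_status="ok",
    )
    print(event.to_json())
```

Keeping the record frozen means an event cannot be modified in memory after it is created, which matches the intent that audit records are written once and never edited.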

Why tamper evidence and retention matter for agent accountability

Audit logs serve a different purpose from observability pipelines. They must not be sampled, rotated away aggressively, or aggregated in ways that destroy individual action fidelity. Agent audit records need to be complete, immutable, retained for the required regulatory period, and queryable by user ID, agent ID, and session ID. That is a governance problem as much as a logging problem because the record becomes part of the evidence chain for internal review, legal discovery, and regulator requests. A log that cannot prove integrity cannot prove accountability.

Practical implication: Route agent audit logs to a tamper-evident system with retention and identity-based search built in.
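One common way to make a log tamper-evident is to chain each record to the hash of the previous one, so that editing or deleting an earlier entry invalidates everything after it. The sketch below illustrates that general technique only; the source does not prescribe a mechanism, and the function names `append_record` and `verify_chain` are hypothetical. A production system would also need to protect the most recent hash, for example by anchoring it in a separate store.

```python
import hashlib
import json

# Placeholder "previous hash" for the first record in the chain.
GENESIS = "0" * 64


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous record's hash together with this event."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_record(chain: list, event: dict) -> None:
    """Append an event, linking it to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    )


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited or removed record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True


if __name__ == "__main__":
    log = []
    append_record(log, {"session_id": "sess-1", "tool": "crm.lookup"})
    append_record(log, {"session_id": "sess-1", "tool": "email.send"})
    print(verify_chain(log))           # chain is intact
    log[0]["event"]["tool"] = "other"  # simulate tampering
    print(verify_chain(log))           # chain no longer verifies
```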


NHI Mgmt Group analysis

Agent audit logging is a governance control, not an observability enhancement. The article is right to separate operational logging from accountability logging because agentic behaviour creates a second identity layer that traditional app logs were never designed to prove. Application telemetry can tell you that a tool call succeeded, but not whether the action was inside an approved session or whether the agent was the true executor. Practitioners should treat this as a distinct control plane for non-human identity.

Session scope is the named concept that closes the accountability gap. In agent systems, scope must describe what an agent may do during a bounded runtime session, not just what a user account is entitled to in general. That is a different governance problem from static entitlement review because the relevant evidence is the session record, approval state, and delegation chain. The implication is that identity governance needs to become session-aware for agents, not merely permission-aware.
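To make the idea of a bounded runtime session concrete, here is a small sketch of a scope check that an audit pipeline could run against each tool call. The `SessionScope` type and `is_within_scope` function are hypothetical names, and the model assumes scope can be expressed as a set of allowed tool names plus an expiry time, which is a simplification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet


@dataclass(frozen=True)
class SessionScope:
    """Hypothetical record of what one agent may do in one session."""

    session_id: str
    agent_id: str
    allowed_tools: FrozenSet[str]
    expires_at: datetime


def is_within_scope(
    scope: SessionScope, agent_id: str, tool: str, at: datetime
) -> bool:
    """True only if the right agent used an allowed tool before expiry."""
    return (
        agent_id == scope.agent_id
        and tool in scope.allowed_tools
        and at < scope.expires_at
    )


if __name__ == "__main__":
    start = datetime(2026, 6, 3, 12, 0, tzinfo=timezone.utc)
    scope = SessionScope(
        session_id="sess-1",
        agent_id="agent:helper",
        allowed_tools=frozenset({"crm.lookup"}),
        expires_at=start + timedelta(minutes=30),
    )
    # Allowed tool, inside the session window.
    print(is_within_scope(scope, "agent:helper", "crm.lookup", start))
    # Tool that was never approved for this session.
    print(is_within_scope(scope, "agent:helper", "email.send", start))
```

The check is evaluated against the session record rather than the user's standing entitlements, which is the distinction the paragraph above draws.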

Auditability for agents depends on distinguishing actor identity from initiator identity. A human may authorise the task, but the agent decides how to execute it, which tools to call, and in what order. That separation matters because accountability breaks if every action is flattened back into the user account alone. The practitioner conclusion is that agent identity must exist as a first-class subject in IAM and logging design.

Existing PAM and IGA assumptions are too coarse when execution is delegated to agents. Traditional governance assumes a relatively stable relationship between approval, access, and action. Agentic systems compress those steps and can generate many actions from one approval, which means review evidence must move from entitlement snapshots to session artefacts. The practical conclusion is that governance teams need records that prove who approved what, when, and for which agent session.

This topic signals a broader shift toward evidence-grade identity infrastructure. Enterprises are moving from logs that help operators debug systems to records that help security, compliance, and legal teams reconstruct delegated action. That is a material change for identity programmes because the control objective is no longer just visibility. It is provable responsibility across human, agent, and downstream service identities.

From our research:

  • Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap, according to The State of Secrets in AppSec.
  • The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to GitGuardian & CyberArk.
  • For a broader identity-control lens, see Top 10 NHI Issues for the access, lifecycle, and governance patterns that agent audit logging must support.

What this signals

Session scope is becoming the new unit of accountability. As agent features move from prototypes into production, teams will need evidence that binds a human authorisation to a bounded agent session and its downstream tool calls. That means logging design now sits alongside IAM design, not after it. Practitioners should expect audit requirements to influence how agent identity, approval flows, and retention policies are built.

The next governance gap will be between what the platform can trace and what the organisation can prove. A system can expose thousands of events and still fail an audit if the record does not tie action, approval, and delegation together. That is why evidence-grade logging should be treated as part of identity architecture rather than an optional security add-on.

AI agent programmes also need to align with the OWASP Top 10 for Agentic Applications 2026 and the NIST AI Risk Management Framework where governance, traceability, and accountability intersect. The practical signal is straightforward: if you cannot query actions by agent, session, and approver, you do not yet have a defensible control model.


For practitioners

  • Separate operational logs from audit logs: Keep application telemetry in the observability stack, but write agent actions to a dedicated audit store with immutable records, identity-centric indexing, and retention aligned to compliance needs.
  • Record the full delegation chain: Capture the human initiator, the agent identity, the approved scope, and the downstream tool invocation in one session record so investigators can reconstruct on-behalf-of activity.
  • Make approval state searchable: Store who approved an action, when approval happened, and when the session expires, so teams can verify whether a tool call was human-approved or autonomous.
  • Require identity-based queries for investigations: Index audit data by user ID, agent ID, and session ID so support, security, and compliance teams can answer questions without reconstructing events from fragments.
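The last two recommendations, searchable approval state and identity-based queries, can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions: the table name `audit_events`, its columns, and the helper functions are all hypothetical, and SQLite stands in for whatever audit store an organisation actually uses.

```python
import sqlite3


def open_audit_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create a hypothetical audit table indexed by identity fields."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS audit_events (
            id INTEGER PRIMARY KEY,
            user_id TEXT NOT NULL,
            agent_id TEXT NOT NULL,
            session_id TEXT NOT NULL,
            tool TEXT NOT NULL,
            approved_by TEXT,          -- NULL means no human approval
            approved_at TEXT,
            session_expires_at TEXT NOT NULL
        )
        """
    )
    # One index per identity field used in investigations.
    for column in ("user_id", "agent_id", "session_id"):
        conn.execute(
            f"CREATE INDEX IF NOT EXISTS idx_audit_{column} "
            f"ON audit_events ({column})"
        )
    return conn


def record_event(conn, user_id, agent_id, session_id, tool,
                 approved_by, approved_at, session_expires_at):
    """Insert one audit event using bound parameters."""
    conn.execute(
        "INSERT INTO audit_events (user_id, agent_id, session_id, tool, "
        "approved_by, approved_at, session_expires_at) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (user_id, agent_id, session_id, tool,
         approved_by, approved_at, session_expires_at),
    )
    conn.commit()


def unapproved_calls(conn, agent_id):
    """List tool calls by one agent that have no recorded approver."""
    rows = conn.execute(
        "SELECT session_id, tool FROM audit_events "
        "WHERE agent_id = ? AND approved_by IS NULL ORDER BY id",
        (agent_id,),
    )
    return rows.fetchall()
```

A query such as `unapproved_calls(conn, "agent:helper")` then answers "which of this agent's tool calls ran without human approval" directly, rather than by stitching fragments together from application logs.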

Key takeaways

  • AI agent audit logging is a governance requirement because application logs do not preserve the identity and approval context needed for accountability.
  • The critical evidence model includes human initiator, agent identity, session scope, tool invocation detail, and delegation chain, not just timestamp and status.
  • Teams should separate observability from auditability now, because agent-driven action chains need immutable, identity-centric records to satisfy incident response and compliance.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST AI RMF and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | | Agent audit logging directly supports agent identity and tool-use governance.
NIST AI RMF | | Accountability and traceability are central AI RMF governance concerns.
NIST CSF 2.0 | PR.AA-01 | Identity-aware logging supports accountable access and action traceability.

Use AI RMF governance processes to define ownership, approval, and evidence requirements for agent actions.


Key terms

  • Agent Audit Log: An agent audit log is a record of actions taken by an AI agent that preserves identity, authorisation, approval, and delegation context. It is designed for accountability rather than debugging, so it must support investigation, compliance review, and proof of who authorised what.
  • Delegation Chain: A delegation chain is the sequence of identities and approvals that connects an initiator to the actor that actually executed an action. In agentic systems, it shows how authority moved from a human to an agent and then to any downstream tool or service.
  • Session Scope: Session scope is the bounded set of actions, tools, and resources an agent may use during a specific runtime session. It matters because agent behaviour is often temporary and task-specific, so governance has to prove what was allowed at execution time, not just at provisioning time.
  • On-behalf-of Token: An on-behalf-of token is a delegation mechanism that lets one identity call downstream services with authority derived from another identity. It is useful for tracing provenance, but it is not enough on its own because audit records still need the full human, agent, and tool chain.
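As an illustration of how a delegation chain can be recovered from an on-behalf-of token, the sketch below flattens the nested `act` (actor) claim defined in RFC 8693, where the outermost `act` is the current actor and more deeply nested `act` claims are earlier actors. It assumes the token has already been validated and decoded into a dictionary; the function name `delegation_chain` and the example identifiers are hypothetical.

```python
def delegation_chain(claims: dict) -> list:
    """Return [subject, earliest actor, ..., current actor].

    Assumes `claims` is an already-validated, decoded token payload that
    uses the RFC 8693 `act` claim. Signature checks are out of scope here.
    """
    actors = []
    act = claims.get("act")
    # Walk from the current actor (outermost) to the earliest (innermost).
    while isinstance(act, dict):
        actors.append(act.get("sub"))
        act = act.get("act")
    # Reverse so the chain reads in the order authority was passed on.
    return [claims.get("sub")] + list(reversed(actors))


if __name__ == "__main__":
    claims = {
        "sub": "user:alice",
        "act": {
            "sub": "svc:crm-api",               # current actor
            "act": {"sub": "agent:helper"},     # prior actor
        },
    }
    print(delegation_chain(claims))
```

Writing the flattened chain into each audit record keeps it searchable after the token itself has expired, which is the gap the definition above points to.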


This post draws on content published by WorkOS: Why AI agent audit logs are different from application logs. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-06-03.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org