TL;DR: AI agents combine planning, tool use, and persistent access, which makes identity, instruction design, authentication, and safety controls inseparable from functionality, according to WorkOS. The real governance issue is that these systems can act on behalf of users or products while extending trust across tools, sessions, and workflows.
At a glance
What this is: A practical guide to building AI agents. Its central finding is that useful agents depend on model choice, tool design, orchestration, authentication, and safety guardrails.
Why it matters: IAM teams now have to govern agent access patterns alongside human and workload identities, especially where delegated access, scoped tokens, and tool permissions overlap.
👉 Read WorkOS's guide on building AI agents, tools, authentication, and safety
Context
AI agent design is an identity and access problem as much as it is an application design problem. Once a model can choose tools, move between steps, and act on external systems, the question stops being whether the agent can answer and becomes whether its authority is bounded correctly.
WorkOS frames the build process around model selection, tool usage, instructions, orchestration, authentication, and safety. For identity teams, the important shift is that an agent is not just another automated workflow. It is a non-human actor whose access scope, timing, and delegation path have to be designed deliberately.
Key questions
Q: How should security teams govern AI agents that can call external tools?
A: Treat each agent as a governed non-human identity with a defined owner, purpose, and tool boundary. Limit the agent to narrowly scoped functions, bind credentials to the specific workflow, and log every tool call so access can be reviewed and revoked later. Governance should start at design time, not after the first production incident.
Q: Why do AI agents create new access risk for IAM teams?
A: AI agents can combine delegated access with runtime decision-making, which means their authority can expand across multiple steps and tools. That creates a mismatch with IAM models built for stable, reviewable access. The risk is not just credential theft, but overbroad access that the agent can reuse in ways the original approval never intended.
Q: What do organisations get wrong about agent authentication and tokens?
A: They often focus on proving the agent is authenticated and overlook how much it is authorised to do once it is inside a system. A token that lasts too long or covers too much scope turns a useful agent into a standing privilege problem. The safer pattern is short-lived, task-scoped access with clear revocation paths.
Q: How do you reduce the chance of an AI agent taking unsafe actions?
A: Use layered controls rather than a single approval step. Filter hostile input before the model runs, validate outputs before any side effect, and reserve human approval for high-impact actions such as spending, deleting, or policy changes. That combination reduces both prompt-injection risk and accidental tool misuse.
Technical breakdown
Agent orchestration and delegated tool use
An AI agent is useful because it can chain steps and call tools, not because it can chat. Orchestration is the logic that decides when a model should reason, when it should call a function, and when it should hand work to another agent or a human. In practice, this is where the access boundary is created. Each tool call becomes a policy decision, and the quality of the tool schema, error handling, and state tracking determines whether the agent behaves predictably or drifts into unsafe execution paths.
Practical implication: Limit each agent to narrowly scoped tools with explicit inputs, outputs, and logging so that every action is attributable and reviewable.
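One way to picture a narrowly scoped, attributable tool boundary is a small gateway that sits between the agent and its tools. The sketch below is illustrative only: `ToolGateway`, `Tool`, `lookup_order`, and the agent ID are hypothetical names, not part of the WorkOS guide or any real library, and the in-memory list stands in for a proper audit store.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Tool:
    """A narrowly scoped tool with an explicit input contract."""
    name: str
    required_args: frozenset
    handler: Callable[..., Any]


@dataclass
class ToolGateway:
    """Hypothetical gateway: every tool call is checked and logged."""
    agent_id: str
    allowed: dict
    audit_log: list = field(default_factory=list)

    def call(self, tool_name: str, **args: Any) -> Any:
        tool = self.allowed.get(tool_name)
        if tool is None:
            self._record(tool_name, args, "denied: tool not in allow-list")
            raise PermissionError(f"{self.agent_id} may not call {tool_name}")
        if set(args) != set(tool.required_args):
            self._record(tool_name, args, "denied: arguments do not match schema")
            raise ValueError(f"{tool_name} expects exactly {sorted(tool.required_args)}")
        result = tool.handler(**args)
        self._record(tool_name, args, "allowed")
        return result

    def _record(self, tool_name: str, args: dict, outcome: str) -> None:
        # Each entry ties the action to a specific agent for later review.
        self.audit_log.append(
            {"agent": self.agent_id, "tool": tool_name, "args": dict(args), "outcome": outcome}
        )


def lookup_order(order_id: str) -> dict:
    # Stand-in for a read-only backend call (assumption for this sketch).
    return {"order_id": order_id, "status": "shipped"}


gateway = ToolGateway(
    agent_id="support-agent-01",
    allowed={"lookup_order": Tool("lookup_order", frozenset({"order_id"}), lookup_order)},
)
```

Calling `gateway.call("lookup_order", order_id="A1")` succeeds and is logged, while a call to any tool outside the allow-list raises `PermissionError` and still leaves an audit entry. The design choice worth noting is that denial and logging happen in one place, so the model never reaches a handler directly.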
Authentication for AI agents and scoped credentials
Agents need the same kind of access management that human users and services need, but with tighter scoping because their behaviour can change mid-task. API keys work for fixed service-level access, while OAuth fits user-delegated access with consent and revocation. The architectural risk is not just credential theft. It is credential overreach, where a token grants broader authority than the task requires or remains valid after the agent no longer needs it. That turns the authentication layer into a standing privilege problem.
Practical implication: Issue short-lived, scoped credentials and map each token or key to a specific agent purpose, owner, and revocation path.
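The mapping of each credential to a purpose, owner, lifetime, and revocation path can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: `CredentialIssuer`, `AgentCredential`, the five-minute default lifetime, and the scope strings are all hypothetical, and a production system would delegate issuance to an identity provider or secrets manager rather than a Python dictionary.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class AgentCredential:
    token: str
    agent_id: str
    owner: str
    purpose: str
    scopes: frozenset
    expires_at: float


class CredentialIssuer:
    """Hypothetical in-memory issuer used only to illustrate the pattern."""

    def __init__(self) -> None:
        self._active = {}

    def issue(self, agent_id: str, owner: str, purpose: str,
              scopes: Iterable[str], ttl_seconds: float = 300.0) -> AgentCredential:
        cred = AgentCredential(
            token=secrets.token_urlsafe(32),
            agent_id=agent_id,
            owner=owner,
            purpose=purpose,
            scopes=frozenset(scopes),
            expires_at=time.time() + ttl_seconds,
        )
        self._active[cred.token] = cred
        return cred

    def revoke(self, token: str) -> None:
        # Revocation path: the token stops working immediately.
        self._active.pop(token, None)

    def authorize(self, token: str, required_scope: str,
                  now: Optional[float] = None) -> bool:
        cred = self._active.get(token)
        if cred is None:
            return False  # unknown or revoked
        current = time.time() if now is None else now
        if current >= cred.expires_at:
            return False  # expired: no standing privilege
        return required_scope in cred.scopes
```

A credential issued with only `calendar:read` is refused for `calendar:write`, refused after `expires_at`, and refused after `revoke`, which covers the three overreach cases described above: too much scope, too long a lifetime, and no revocation path.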
Safety controls for prompt injection and tool misuse
Agent safety depends on layered controls because LLMs can be tricked, overconfident, or simply wrong. Input filtering reduces the chance that malicious instructions reach the model. Output validation checks whether a response or tool action is safe before execution. Human approval gates still matter for high-impact actions, but they are not a substitute for permission design. The more tools an agent can reach, the more important it becomes to treat tool invocation as a controlled security event rather than a normal application call.
Practical implication: Add pre-model filtering, post-output validation, and approval gates for risky actions before you allow production-grade agent autonomy.
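The three layers can be sketched as a single decision function that runs before any side effect. Everything here is an assumption made for illustration: the pattern list, the action names, and the `decide` function are hypothetical, and a short regex list is nowhere near sufficient as a real prompt-injection filter.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only; real input filtering needs far more than this.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal .*(system prompt|credentials|api key)", re.IGNORECASE),
]

# Hypothetical action names for this sketch.
HIGH_IMPACT_ACTIONS = {"send_payment", "delete_record", "change_policy"}
KNOWN_ACTIONS = HIGH_IMPACT_ACTIONS | {"lookup_order", "send_reply"}


@dataclass
class ProposedAction:
    name: str
    args: dict


def filter_input(text: str) -> bool:
    """Layer 1: return True if the input passes the pre-model filter."""
    return not any(p.search(text) for p in SUSPICIOUS_PATTERNS)


def validate_action(action: ProposedAction) -> bool:
    """Layer 2: only known actions with dictionary arguments may proceed."""
    return action.name in KNOWN_ACTIONS and isinstance(action.args, dict)


def decide(user_input: str, action: ProposedAction, approved_by_human: bool = False) -> str:
    if not filter_input(user_input):
        return "blocked_input"
    if not validate_action(action):
        return "blocked_output"
    # Layer 3: high-impact actions wait for explicit human approval.
    if action.name in HIGH_IMPACT_ACTIONS and not approved_by_human:
        return "needs_approval"
    return "execute"
```

Note that the approval gate is the last check, not the only one: an unknown action is rejected before a human is ever asked, which mirrors the point that approval is not a substitute for permission design.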
Breaches seen in the wild
- Moltbook AI agent keys breach: exposed 1.5M AI agent keys.
- AI LLM hijack breach: attackers used stolen AWS access keys to hijack Anthropic models on Bedrock.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
Identity boundaries are now part of agent design, not a later governance layer. Once an AI agent can call APIs, read databases, or trigger actions, the security model becomes inseparable from the product model. That means access scope, token lifetime, and tool eligibility must be defined as part of the agent architecture, not bolted on after launch. Practitioners should treat every agent as a governed non-human identity with a measurable blast radius.
Standing privilege is the wrong default for agents that operate in short task bursts. The article shows why scoped API keys and OAuth tokens matter, but the broader point is that agents create short-lived intent with potentially long-lived access. That mismatch is classic NHI governance debt. Identity programmes should expect agents to move faster than manual review cycles and design for bounded delegation from the start.
Tool schemas are a security control, not just an engineering convenience. The guide’s emphasis on narrow functions, explicit inputs, and clear error handling maps directly to control quality. When tool definitions are ambiguous, the model has more room to improvise, and improvisation is where unsafe access paths begin. The practical conclusion is that tool design must be reviewed with the same discipline as entitlement design.
Agent safety fails when teams confuse observability with control. Logging, monitoring, and chaos testing help, but they do not prevent an agent from taking a harmful path if the permissions are already too broad. This is where NHI governance and application security intersect. Practitioners should assume that a well-observed failure is still a failure if the access model permits it.
Lifecycle governance for agents will look familiar but behave differently. Joiner-mover-leaver, recertification, and offboarding still apply, but the trigger is not an employee event. It is the creation, change, or retirement of a machine actor that can hold credentials and initiate work on demand. Teams should plan for agent provisioning, scope changes, and revocation as first-class identity events.
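Treating provisioning, scope change, and retirement as recorded identity events might look like the registry below. It is a sketch under assumptions: `AgentRegistry`, `AgentRecord`, and the event names are hypothetical, and a real programme would persist these events in its identity governance tooling.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Iterable


class LifecycleEvent(Enum):
    PROVISIONED = "provisioned"
    SCOPE_CHANGED = "scope_changed"
    RETIRED = "retired"


@dataclass
class AgentRecord:
    agent_id: str
    owner: str
    scopes: set
    active: bool = True
    history: list = field(default_factory=list)

    def log(self, event: LifecycleEvent, detail: str) -> None:
        # Every lifecycle change is timestamped for later recertification.
        self.history.append((datetime.now(timezone.utc), event, detail))


class AgentRegistry:
    """Hypothetical registry that records agent joiner-mover-leaver events."""

    def __init__(self) -> None:
        self._agents = {}

    def provision(self, agent_id: str, owner: str, scopes: Iterable[str]) -> AgentRecord:
        if agent_id in self._agents:
            raise ValueError(f"{agent_id} already exists")
        record = AgentRecord(agent_id, owner, set(scopes))
        record.log(LifecycleEvent.PROVISIONED, f"owner={owner}")
        self._agents[agent_id] = record
        return record

    def change_scope(self, agent_id: str, scopes: Iterable[str]) -> None:
        record = self._agents[agent_id]
        if not record.active:
            raise ValueError(f"{agent_id} is retired")
        record.scopes = set(scopes)
        record.log(LifecycleEvent.SCOPE_CHANGED, f"scopes={sorted(record.scopes)}")

    def retire(self, agent_id: str) -> None:
        record = self._agents[agent_id]
        record.active = False
        record.scopes = set()  # offboarding removes all access
        record.log(LifecycleEvent.RETIRED, "all scopes removed")
```

The trigger for each event is a change to the machine actor itself rather than an HR event, which is the behavioural difference the paragraph above describes.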
From our research:
- 80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%), according to AI Agents: The New Attack Surface report.
- Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.
- That governance gap is why teams should pair OWASP Agentic Applications Top 10 with identity lifecycle controls rather than treating agent security as a prompt-only problem.
What this signals
Runtime intent is becoming the missing control variable. The article makes clear that agent usefulness comes from step chaining, tool choice, and adaptive behaviour, which means static IAM assumptions will keep failing as deployment volume increases. With 98% of organisations planning to deploy more AI agents within 12 months, the governance problem is not adoption itself but the speed at which non-human access is multiplying beyond manual review capacity.
Tool governance is where agent security becomes measurable. If a tool can update records, send messages, or trigger payments, then its schema and permission boundary are part of the control plane. Security teams should expect audit questions to move from whether the agent was authenticated to whether the permitted tool set was narrow enough for the task. That is where agent design, not just model behaviour, will determine risk.
Agent provisioning should be treated as an identity lifecycle event. Creation, scope change, and revocation all matter when the actor can initiate work on demand. Programmes that already manage service accounts, secrets, and workload identities should extend the same lifecycle discipline to agents before those actors accumulate untracked access across tools and environments.
For practitioners
- Classify each agent as a non-human identity: Assign an explicit owner, purpose, and scope to every agent before it reaches production. Tie that record to the exact tools, APIs, and data sets the agent can reach so review and revocation are possible later.
- Replace broad service access with scoped delegated access: Use short-lived tokens for user-delegated tasks and scoped API keys for product-level automation. Keep the credential bound to one function or workflow so the agent cannot reuse it outside the intended context.
- Treat tool definition as an entitlement review: Review every tool name, schema, and description with security and IAM teams before release. A vague tool description increases the chance of unintended calls, so narrow the function and log every invocation.
- Add pre-execution and post-execution guardrails: Filter hostile prompts before inference, validate outputs before action, and require approval for high-impact operations such as payments, deletions, or policy changes. That gives you controls at both the input and action boundaries.
- Test agents against failure, not just success: Run staged scenarios that include revoked credentials, tool timeouts, malformed inputs, and prompt injection attempts. The goal is to confirm the agent fails safely when access, instructions, or tools behave unexpectedly.
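The failure-testing item in the list above can be sketched as a small scenario harness. The names here (`run_step`, `CredentialRevoked`, the stand-in tools) are hypothetical and exist only for illustration; the sketch covers revoked credentials, tool timeouts, and malformed input, and leaves prompt-injection scenarios to whatever filter the team already uses.

```python
from typing import Any, Callable


class CredentialRevoked(Exception):
    """Raised by a tool when the agent's credential is no longer valid."""


def run_step(tool: Callable[[dict], Any], payload: Any) -> dict:
    """Hypothetical agent step that reports failures instead of raising."""
    if not isinstance(payload, dict) or "order_id" not in payload:
        return {"status": "rejected", "reason": "malformed input"}
    try:
        return {"status": "ok", "data": tool(payload)}
    except CredentialRevoked:
        return {"status": "halted", "reason": "credential revoked"}
    except TimeoutError:
        return {"status": "halted", "reason": "tool timeout"}


# Stand-in tools that simulate each failure mode.
def healthy_tool(payload: dict) -> dict:
    return {"order_id": payload["order_id"], "status": "shipped"}


def revoked_tool(payload: dict) -> dict:
    raise CredentialRevoked()


def slow_tool(payload: dict) -> dict:
    raise TimeoutError()


SCENARIOS = {
    "healthy": (healthy_tool, {"order_id": "A1"}),
    "revoked": (revoked_tool, {"order_id": "A1"}),
    "timeout": (slow_tool, {"order_id": "A1"}),
    "malformed": (healthy_tool, "not-a-dict"),
}


def run_scenarios() -> dict:
    # Map each scenario name to the status the agent step reported.
    return {name: run_step(tool, payload)["status"] for name, (tool, payload) in SCENARIOS.items()}
```

A staged run passes when only the healthy scenario reports `ok` and every failure scenario ends in `halted` or `rejected` rather than a retry with broader access or an unhandled exception.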
Key takeaways
- AI agents are non-human identities with dynamic tool access, so identity controls have to be part of the design, not a wrapper around it.
- The research cited above shows that 80% of organisations have already seen agents act beyond intended scope, which makes this a present-tense governance problem.
- The practical response is narrow scoping, short-lived credentials, layered safety checks, and lifecycle tracking for every agent that can act independently.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST Zero Trust (SP 800-207) sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A1 | Agent tool choice and prompt injection risks map directly to agentic application controls. |
| OWASP Non-Human Identity Top 10 | NHI-03 | Scoped credentials and lifecycle handling are central to this article's authentication guidance. |
| NIST Zero Trust (SP 800-207) | Section 2.1 (Tenets of Zero Trust) | Continuous verification and least privilege fit delegated agent access across tools. |
Review agent tool boundaries and input filtering against agentic-risk controls before production rollout.
Key terms
- AI Agent: A software entity that can decide what action to take, choose tools, and execute work toward a goal. In security terms, it behaves like a non-human identity whose permissions, scope, and audit trail must be managed explicitly rather than assumed to be safe because it is automated.
- Tool Orchestration: The control logic that determines when an agent should call a function, which function it should use, and how results flow into the next step. In practice, orchestration is part of the trust boundary because it determines which systems the agent can reach and in what sequence.
- Scoped Delegated Access: A permission model that gives an agent only the rights needed for a specific task or user-approved workflow. It reduces standing privilege by tying credentials to a narrow purpose, shorter lifetime, and clearer revocation path, which is critical when the actor can act independently.
- Prompt Injection: A technique that manipulates an AI system through crafted input so it follows malicious instructions or reveals data. For agents, the risk is higher because injected content can influence tool use, not just text output, making input filtering and output validation essential controls.
Deepen your knowledge
AI agent identity governance, scoped delegation, and tool safety are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are designing agents that can act across APIs and data sources, it is worth exploring.
This post draws on content published by WorkOS: How to build AI agents. Read the original.
Published by the NHIMG editorial team on 2025-06-30.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org