Subscribe to the Non-Human & AI Identity Journal

Notifications
Clear all

OpenAI AgentKit guardrails: what changes for IAM teams?


(@nhi-mgmt-group)
Member Moderator
Joined: 1 year ago
Posts: 5855
Topic starter  

TL;DR: AI agent development is easier with AgentKit but the attack surface expands through connectors, workflows, prompt injection, and credential leakage, according to Zenity. The core issue is that probabilistic guardrails do not hold when agent behavior must be enforced deterministically.

NHIMG editorial — based on content published by Zenity: Closing the Guardrail Gap, runtime protection for OpenAI AgentKit

Questions worth separating out

Q: How should security teams govern AI agents that can call tools and access data?

A: Treat the agent as a governed identity with explicit task scope, not as a chat surface with broad implicit trust.

Q: Why do native guardrails fail against prompt injection in AI agents?

A: Native guardrails often classify text rather than control execution, so they can miss attacks that manipulate the agent’s next action instead of its visible output.

Q: What breaks when AI agents reuse broad OAuth scopes and tokens?

A: Broad scopes turn the agent into a high-blast-radius identity that can expose data, invoke tools, or move into systems it was never meant to touch.

Practitioner guidance

  • Map every agent connector to its underlying identity and scope Inventory which tokens, OAuth grants, API keys, and service permissions each agent can reach, then reduce scopes to the smallest task boundary that still allows the workflow to function.
  • Test guardrails against prompt injection and obfuscation Run adversarial exercises that include multi-turn prompt injection, encoded text, foreign-language payloads, and hidden instructions to see whether the control blocks execution or only flags it.
  • Separate output safety from action authorization Do not rely on the agent’s response text as a proxy for safety.

What's in the full article

Zenity's full research covers the operational detail this post intentionally leaves for the source:

  • Specific runtime detection and blocking logic for AgentKit interactions across users, tools, and outputs
  • Examples of how Zenity classifies risky intent before a response reaches a user or downstream system
  • Policy-based enforcement scenarios for data leakage, secrets exposure, and unsafe responses
  • Lifecycle coverage across discovery, posture management, inline detection, and active prevention

👉 Read Zenity's analysis of runtime protection for OpenAI AgentKit →

OpenAI AgentKit guardrails: what changes for IAM teams?

Explore further

View Full Forum →  |  NHI Foundation Course →



   
Quote
(@mr-nhi)
Member Moderator
Joined: 1 month ago
Posts: 5343
 

Runtime AI agent protection is becoming an identity control, not a model feature. AgentKit lowers the threshold for deploying agents, but the governance problem shifts to controlling what the agent can reach, invoke, and reveal at runtime. That makes the control plane part of identity security, because the agent now operates with tool access, data access, and policy exposure that resemble a non-human identity. Practitioners should treat runtime protection as an identity boundary that decides whether execution is allowed.

A few things that frame the scale:

  • 98% of companies plan to deploy even more AI agents within the next 12 months, despite documented rogue behaviour in 80% of current deployments, according to AI Agents: The New Attack Surface report.
  • Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.

A question worth separating out:

Q: Who is accountable when an AI agent leaks secrets or violates policy?

A: Accountability sits with the organisation that defined the agent’s permissions, workflows, and enforcement model, not with the model itself. If runtime controls are absent, the failure is governance design, not just user behaviour. Teams should assign an owner for agent policy, connector scope, and incident response so there is a clear path from risk detection to containment.

👉 Read our full editorial: Runtime protection for OpenAI AgentKit and AI agent guardrails



   
ReplyQuote
Share: