Should organisations use security skill prompts instead of access controls for AI agents?

Why This Matters for Security Teams

Security skill prompts can shape agent behaviour, but they do not enforce who can act, what data can be touched, or which systems can be reached. That distinction matters because autonomous agents can chain tools, retry actions, and drift beyond the task a designer imagined. NHI Management Group’s AI Agents: The New Attack Surface report found that 80% of organisations say their AI agents have already performed actions beyond intended scope, including unauthorised system access and credential exposure.

This is why prompting alone is not a security boundary. A well-written prompt may reduce risky choices, but it cannot revoke a token, narrow a scope, or stop an overbroad connector from reading sensitive records. Governance still has to be anchored in identity, approval gates, and runtime enforcement, as reflected in guidance from the NIST AI Risk Management Framework and the OWASP Agentic AI Top 10. In practice, many security teams discover the gap only after an agent has already accessed something it was never meant to reach.

How It Works in Practice

Security skill prompts are best treated as behavioural guardrails. They are useful for teaching an agent to decline questionable requests, prefer safe tools, or summarise uncertainty before taking action. But for agentic systems, the control plane must sit outside the prompt. Access decisions need to be enforced by identity governance, workload identity, and short-lived credentials that limit what an agent can do at runtime.

A practical model looks like this:

Use prompts to influence decision quality, not to define permission.

Issue task-scoped credentials with short time-to-live values and automatic revocation on completion.

Bind the agent to workload identity so each action is tied to a cryptographic identity, not a reusable shared secret.

Evaluate policy at request time, using context such as task, data sensitivity, environment, and approval state.

Require approval gates for irreversible actions, high-risk data access, and connector onboarding.
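The model above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the names TaskCredential, ActionRequest, issue_credential, revoke and evaluate, the scope strings, and the 300-second default TTL are all hypothetical and do not correspond to any particular product or framework. The point it shows is that the allow/deny decision is made by code outside the prompt, on every request.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TaskCredential:
    """Hypothetical task-scoped credential: one agent, one task, a short TTL."""
    agent_id: str
    task_id: str
    scopes: frozenset
    expires_at: datetime
    revoked: bool = False


@dataclass(frozen=True)
class ActionRequest:
    """Hypothetical description of a single tool call the agent wants to make."""
    scope: str               # e.g. "crm:read" (illustrative scope name)
    task_id: str
    data_sensitivity: str    # "low" or "high"
    irreversible: bool
    approved: bool = False   # set by an approval gate, never by the agent itself


def issue_credential(agent_id, task_id, scopes, ttl_seconds=300, now=None):
    """Issue a credential that only covers one task and expires quickly."""
    now = now or datetime.now(timezone.utc)
    expires_at = now + timedelta(seconds=ttl_seconds)
    return TaskCredential(agent_id, task_id, frozenset(scopes), expires_at)


def revoke(cred):
    """Return a revoked copy, e.g. when the task completes."""
    return replace(cred, revoked=True)


def evaluate(cred, req, now=None):
    """Request-time policy check. Returns (allowed, reason)."""
    now = now or datetime.now(timezone.utc)
    if cred.revoked:
        return False, "credential revoked"
    if now >= cred.expires_at:
        return False, "credential expired"
    if req.task_id != cred.task_id:
        return False, "credential not issued for this task"
    if req.scope not in cred.scopes:
        return False, "scope not granted"
    if (req.irreversible or req.data_sensitivity == "high") and not req.approved:
        return False, "approval required"
    return True, "allowed"
```

In this sketch, a prompt can still make the agent less likely to attempt a risky call, but a request for an ungranted scope, an expired or revoked credential, or an unapproved high-risk action is denied regardless of what the prompt says.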

This aligns with current guidance from the OWASP Non-Human Identity Top 10 and the CSA MAESTRO agentic AI threat modeling framework, both of which emphasise that machine identities need explicit scope, rotation, and monitoring. The operational lesson is reinforced by NHIMG’s LLMjacking research, where exposed credentials were rapidly abused once discovered. Prompt-only controls tend to break down in high-autonomy environments where agents can self-chain tools across multiple SaaS systems, because the prompt cannot reliably constrain downstream side effects.

Common Variations and Edge Cases

Tighter prompt governance often improves safety but increases operational overhead, so organisations have to balance reduced misuse against slower execution and more review burden. That tradeoff is real, especially when teams want agents to move quickly across many tools without bottlenecking every action.

There is no universal standard for how much a prompt should control. Some teams use “security skills” as a first line of defence, then pair them with just-in-time approval for high-risk tasks. Others embed policy text directly into system prompts, but current guidance suggests that approach is only advisory unless the underlying controls also enforce scope. In regulated environments, prompt quality may support auditability, but it cannot replace access reviews, segregation of duties, or revocation workflows.

Edge cases are most common when agents operate with inherited human privileges, long-lived API keys, or broad service accounts. Those setups make prompt-based restraint especially fragile because a single misroute can expose large parts of the environment. NHIMG’s Ultimate Guide to NHIs and 52 NHI Breaches Analysis both show the same pattern: identity misuse, not bad prompting, is usually what turns an automation issue into a breach.
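One way to surface these fragile setups is a simple inventory check. The sketch below is illustrative only: the AgentIdentity fields, the risk_flags function, and the thresholds (a one-hour maximum TTL, five scopes, a "*" wildcard) are assumptions chosen for the example, not values taken from any standard or from the research cited above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AgentIdentity:
    """Hypothetical inventory record for an agent's identity."""
    name: str
    credential_ttl_hours: Optional[float]  # None means the key never expires
    inherits_human_privileges: bool
    scopes: Tuple[str, ...]


def risk_flags(identity, max_ttl_hours=1.0, max_scopes=5):
    """Return the edge-case conditions this identity matches (thresholds are assumed)."""
    flags = []
    ttl = identity.credential_ttl_hours
    if ttl is None or ttl > max_ttl_hours:
        flags.append("long-lived credential")
    if identity.inherits_human_privileges:
        flags.append("inherited human privileges")
    if "*" in identity.scopes or len(identity.scopes) > max_scopes:
        flags.append("broad service account")
    return flags
```

An identity that returns any flag here is one where prompt-based restraint is carrying more weight than it can bear, so it is a reasonable candidate for rescoping before prompt tuning.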

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A2	Prompts can't replace runtime controls for agent tool use and data access.
CSA MAESTRO	M3	MAESTRO addresses agent identity, authorization, and control-plane separation.
NIST AI RMF	GOVERN	AI RMF governance applies to accountability and risk ownership for agents.

Assign owners, define approval thresholds, and audit agent actions continuously.

