Subscribe to the Non-Human & AI Identity Journal
Home FAQ Agentic AI & Autonomous Identity What should organisations do before an AI assistant…
Agentic AI & Autonomous Identity

What should organisations do before an AI assistant can act on real systems?

← Back to all FAQ
By NHI Mgmt Group Editorial Team Updated June 10, 2026 Domain: Agentic AI & Autonomous Identity

Organisations should require hard boundaries between data and instructions, then restrict the assistant to the minimum set of actions needed for the workflow. They should also monitor tool calls and redact hostile or user-controlled text before the model sees it. That combination reduces the chance that poisoned content becomes operational action.

Why This Matters for Security Teams

Before an AI assistant can touch production systems, the core question is not simply whether it has a prompt filter or a user approval step. The real issue is whether the assistant can be trusted to separate instructions from data, stay within a narrow action scope, and resist being steered by content it encountered along the way. That is why this topic sits at the intersection of workload identity, secret handling, and AI governance rather than classic chatbot security. Guidance from the NIST Cybersecurity Framework 2.0 still applies, but AI assistants add a new operational problem: they can transform untrusted text into real-world actions through tools, APIs, and delegated permissions.

The practical risk is easy to underestimate. A model that can read tickets, emails, logs, or documents may also be exposed to injected instructions that try to override policy, expand scope, or exfiltrate secrets. NHIMG research on the DeepSeek breach shows how quickly sensitive material can escape once boundaries are weak, and that pattern matters even more when an assistant has operational authority. In practice, many security teams encounter unsafe tool use only after the assistant has already executed a harmful action, rather than through intentional pre-deployment testing.

How It Works in Practice

The safest deployment pattern is to treat the assistant as an untrusted workload until it proves otherwise. That means assigning a distinct workload identity to the agent, issuing ephemeral credentials per task, and forcing every tool call through real-time policy checks. Current guidance suggests that the model should not inherit broad human permissions, because autonomous systems do not behave like static roles. Their actions are goal-driven, context-sensitive, and often hard to predict in advance.

  • Use hard separation between instructions, retrieved content, and tool arguments so hostile text cannot masquerade as policy.
  • Apply least privilege to every connected system, then narrow it further with just-in-time access that expires after the task.
  • Evaluate tool requests at runtime with policy-as-code, rather than relying only on pre-approved roles or static allow lists.
  • Log every tool invocation, input source, and approval decision so later review can reconstruct what the assistant actually did.
  • Strip or neutralize user-controlled text before it reaches the model when that text could influence actions or retrieval.
This approach aligns with the operational direction described in The State of Secrets in AppSec, where secret exposure and fragmented controls remain persistent weaknesses. It also matches the access-control direction in the NIST Cybersecurity Framework 2.0, but with an AI-specific twist: the policy engine must decide not just who can act, but whether the requested action is safe in the current context. These controls tend to break down when the assistant is wired directly into legacy admin consoles with standing credentials, because the tool layer then becomes a shortcut around every intended boundary.

Common Variations and Edge Cases

Tighter control over an AI assistant often increases friction, latency, and integration effort, so organisations have to balance operational speed against the risk of delegated overreach. Best practice is evolving here, and there is no universal standard for every architecture yet.

One common edge case is read-only assistants that still have access to sensitive data. Even without write privileges, they can be dangerous if they ingest secrets, internal plans, or adversarial prompts that later influence other systems. Another is multi-step automation, where one agent gathers data and another takes action. That separation helps, but it only works if the boundary between the agents is enforced with distinct identities and scoped credentials, not shared tokens.

The other major exception is emergency access. Break-glass paths may be necessary, but they should be time-boxed, heavily monitored, and clearly segregated from normal assistant behaviour. This is where the lessons from the DeepSeek breach are especially relevant: once an assistant can cross from analysis into execution, poor boundaries become an incident multiplier rather than a minor configuration issue. In practice, these controls fail most often in environments that mix rapid prototyping, shared service accounts, and production tool access without a separate approval layer.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF set the governance and control requirements practitioners need to meet.

FrameworkControl / ReferenceRelevance
OWASP Agentic AI Top 10A03Addresses prompt/tool injection risks before agents touch real systems.
CSA MAESTROM1Covers identity, policy, and runtime guardrails for autonomous agents.
NIST AI RMFGovernance is required before deploying AI with operational authority.

Bind each agent to scoped identity and enforce contextual authorization at runtime.

NHIMG Editorial Note
Reviewed and updated by the NHIMG editorial team on June 10, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org