
Why do agent controls need to start before the first prompt?

Because hostile behaviour can be introduced in the setup layer through MCP servers, plugins, skills, hooks, or misconfiguration. By the time a session starts, the agent may already be working from a compromised trust base. Pre-session assessment is therefore essential for preventing unsafe actions rather than merely documenting them.

Why This Matters for Security Teams

Agent controls cannot begin at prompt time because the real trust decision happens earlier, when the agent is assembled from MCP servers, plugins, skills, hooks, and secrets. That setup layer determines whether the agent starts with safe boundaries or a compromised operating context. Current guidance suggests treating pre-session review as part of identity assurance, not as a later monitoring task.

This is especially important because autonomous systems do not behave like static service accounts. Once tool access exists, the agent can chain actions, escalate through indirect paths, and consume poisoned configuration before a human ever sees the first output. NHI Management Group research shows that 97% of NHIs carry excessive privileges, which is exactly why pre-execution control matters more than post hoc cleanup. See the Ultimate Guide to NHIs — 2025 Outlook and Predictions and the OWASP Agentic AI Top 10 for the broader risk model.

In practice, many security teams discover agent abuse only after an unsafe connector, secret, or policy has already been loaded into the session, not through deliberate pre-session validation.

How It Works in Practice

Pre-prompt control means the environment is assessed before the agent receives a task. That starts with workload identity, short-lived secrets, and verified configuration. Rather than assuming a prompt is the first meaningful event, the security team checks whether the agent has a trusted identity, whether the toolchain is approved, and whether the runtime policy is current. This aligns with the NIST AI Risk Management Framework, which emphasises governance and operational controls around AI systems, and with the CSA MAESTRO agentic AI threat modeling framework.

In operational terms, a secure setup phase usually includes:

  • Verifying the agent’s workload identity, such as OIDC-based proof or SPIFFE/SPIRE-backed identity, before any tools are attached.
  • Issuing JIT credentials with tight TTLs so the agent only receives the minimum access needed for that task.
  • Scanning MCP servers, plugins, and hooks for drift, unauthorized updates, or hidden outbound paths.
  • Evaluating policy at runtime with policy-as-code, rather than relying on a static role matrix that cannot predict agent behaviour.
  • Blocking any configuration that introduces standing privilege, long-lived secrets, or unreviewed tool delegation.
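The checks above can be sketched as a single gate that runs before any session is created. This is a minimal illustration, not a reference implementation: the `AgentSetup` and `Credential` types, the `pre_session_findings` function, and the 900-second TTL ceiling are all hypothetical, and the identity check itself (OIDC or SPIFFE/SPIRE verification) is assumed to have been performed elsewhere and passed in as a boolean.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical ceiling for JIT credential lifetime; tune per environment.
MAX_TTL_SECONDS = 900


@dataclass
class Credential:
    name: str
    ttl_seconds: int  # in this sketch, 0 or less means "never expires"


@dataclass
class AgentSetup:
    # Result of an identity check (e.g. OIDC or SPIFFE) performed elsewhere.
    identity_verified: bool
    credentials: List[Credential] = field(default_factory=list)
    tools: List[str] = field(default_factory=list)
    approved_tools: frozenset = frozenset()


def pre_session_findings(setup: AgentSetup) -> List[str]:
    """Return blocking findings; an empty list means the session may start."""
    findings = []
    if not setup.identity_verified:
        findings.append("workload identity not verified")
    for cred in setup.credentials:
        if cred.ttl_seconds <= 0:
            findings.append(f"long-lived secret: {cred.name}")
        elif cred.ttl_seconds > MAX_TTL_SECONDS:
            findings.append(f"TTL too long: {cred.name}")
    for tool in setup.tools:
        if tool not in setup.approved_tools:
            findings.append(f"unreviewed tool: {tool}")
    return findings
```

The gate returns findings rather than a bare pass/fail so that a blocked setup produces a reviewable record of why the session was refused.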

For agentic workloads, static RBAC is often too blunt because the task path is dynamic, not pre-scripted. Intent-based authorisation is more appropriate: the agent asks for access, the policy engine evaluates context, and approval is granted only if the requested action matches the current task and risk posture. NHI Management Group’s OWASP NHI Top 10 highlights why agent setup must be treated as an attack surface, not a neutral deployment step.
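As a rough sketch of intent-based authorisation, the decision can be modelled as a function of the declared task, the requested action, and the current risk posture. The task names, action strings, risk labels, and the `authorise` function below are assumptions made for illustration; a production deployment would more likely delegate this decision to a dedicated policy-as-code engine.

```python
from dataclasses import dataclass

# Hypothetical mapping from a declared task to the actions it may need.
TASK_ACTIONS = {
    "summarise_tickets": {"tickets:read"},
    "rotate_api_key": {"secrets:read", "secrets:write"},
}

# Hypothetical ordering of risk postures, lowest to highest.
RISK_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class AccessRequest:
    task: str    # what the agent says it is doing
    action: str  # what it is asking permission for
    risk: str    # current risk posture: "low", "medium", or "high"


def authorise(request: AccessRequest, max_risk: str = "medium") -> bool:
    """Grant access only if the action fits the task and risk is acceptable."""
    allowed = TASK_ACTIONS.get(request.task, set())
    if request.action not in allowed:
        return False  # action does not match the declared task
    level = RISK_ORDER.get(request.risk)
    if level is None:
        return False  # unknown risk labels are denied by default
    return level <= RISK_ORDER[max_risk]
```

Unlike a static role matrix, the same agent identity gets different answers depending on the task it has declared and the conditions at request time, and anything unrecognised is denied.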

These controls tend to break down in loosely governed developer environments, where plugins can be added ad hoc and secrets are injected from unmanaged scripts, because the pre-session trust base is never made explicit.
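One way to make the trust base explicit is to pin a digest of each approved MCP server, plugin, or hook configuration and compare it before every session. The sketch below assumes each component's configuration is available as a JSON-serialisable dictionary; `config_digest` and `drifted_components` are hypothetical helper names, and the example URL is a placeholder.

```python
import hashlib
import json


def config_digest(config: dict) -> str:
    """Hash a configuration in canonical form so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drifted_components(current: dict, pinned: dict) -> list:
    """Return names whose current config does not match its approved digest.

    current: component name -> configuration dict
    pinned:  component name -> digest recorded at review time
    Components with no pinned digest are reported as drift, so anything
    added ad hoc is caught as well.
    """
    drifted = []
    for name, config in sorted(current.items()):
        if pinned.get(name) != config_digest(config):
            drifted.append(name)
    return drifted


# Usage: record digests at review time, then compare before each session.
approved = {"search": {"url": "https://example.invalid/mcp", "version": "1.2.0"}}
pinned = {name: config_digest(cfg) for name, cfg in approved.items()}
```

A non-empty result from `drifted_components` would then be treated as a blocking finding rather than a warning.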

Common Variations and Edge Cases

Tighter pre-session control often increases setup friction, requiring organisations to balance speed of experimentation against the need to prevent unsafe starting conditions. Best practice is evolving, especially where teams are trying to support both fast agent iteration and strong governance.

One common exception is low-risk internal automation, where teams may be tempted to relax setup checks for convenience. That can be acceptable only if the blast radius is genuinely small and the toolchain is isolated. In contrast, customer-facing or externally reachable agents need stronger gates because their environment is exposed to untrusted input, tool chaining, and prompt injection from the start. The distinction matters because the danger is not just what the agent says, but what the agent can reach before anyone intervenes. For incident context, NHI Mgmt Group’s AI LLM hijack breach and Moltbook AI agent keys breach show how quickly compromised setup layers become operational incidents.

There is no universal standard for this yet, but current guidance converges on the same principle: if the agent is not trusted before prompt ingestion, the prompt becomes a delivery mechanism for an already unsafe execution context.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A01 | Pre-session trust failures are core agentic attack paths.
CSA MAESTRO | M1 | MAESTRO addresses threat modeling before agent execution begins.
NIST AI RMF | | AI RMF supports governance and risk controls across the AI lifecycle.

Treat agent provisioning as a governed lifecycle step with documented risk checks.