Should organisations let an LLM decide when an agent workflow is complete?

By NHI Mgmt Group Editorial Team | Updated May 30, 2026 | Domain: Agentic AI & Autonomous Identity

Organisations can let an LLM suggest completion, but they should not rely on it as the only stop condition. A hard turn limit and explicit termination rules are necessary because the model can fail to stop, stop too early, or optimize for output instead of safety. Completion should be bounded by workflow policy, not just model output.

Why This Matters for Security Teams

An LLM can be useful as a signal, but completion is a security decision, not a language preference. If an agent can keep calling tools, opening tickets, moving data, or chaining prompts after the model says it is “done,” then the organisation has delegated workflow termination to an unpredictable actor. That is especially risky in agentic systems where intent changes mid-task and access is wider than a single human session.

Current guidance from OWASP Agentic AI Top 10 and NIST AI Risk Management Framework points in the same direction: runtime behaviour needs bounded authority, policy checks, and explicit stop conditions. NHIMG research on the OWASP NHI Top 10 shows why autonomous systems must be governed as identities with execution power, not just as chat interfaces.

In practice, many security teams first learn that a workflow has overrun only after the agent has already overreached, because completion was never deliberately designed.

How It Works in Practice

The safest pattern is to treat the LLM as one input to completion, not the authority that ends the workflow. A solid implementation pairs the model’s “done” suggestion with hard controls such as maximum turns, a task-specific policy engine, and a required termination rule in the orchestrator. That aligns with the direction in CSA MAESTRO agentic AI threat modeling framework and the OWASP Top 10 for Agentic Applications 2026, both of which emphasize that agentic risk comes from runtime autonomy.

In practical terms, security teams should define:

an explicit stop condition tied to the business objective, not model confidence
a hard cap on tool calls, retries, and elapsed time
JIT credentials and short-lived secrets that expire whether or not the model claims success
workload identity and request-level authorisation so the agent can only do what the current task allows
audit logs that record why the orchestrator stopped, not just what the model said
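The controls listed above can be sketched as a small orchestrator loop. This is a minimal illustration, not a reference implementation: `run_workflow`, `Limits`, `StepResult`, and `StopReason` are hypothetical names, `step` stands in for one agent turn, and `objective_met` stands in for whatever business check the organisation defines. The model's "done" signal is recorded for the audit trail but never ends the loop by itself.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class StopReason(Enum):
    OBJECTIVE_MET = "objective_met"
    MAX_TURNS = "max_turns"
    MAX_TOOL_CALLS = "max_tool_calls"
    TIMEOUT = "timeout"


@dataclass
class Limits:
    # Illustrative defaults only; real values depend on the workflow's risk.
    max_turns: int = 10
    max_tool_calls: int = 25
    max_seconds: float = 300.0


@dataclass
class StepResult:
    tool_calls: int        # tool calls the agent made during this turn
    model_says_done: bool  # advisory signal only


def run_workflow(
    step: Callable[[int], StepResult],
    objective_met: Callable[[], bool],
    limits: Limits,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Run agent turns until an orchestrator-owned stop condition fires."""
    started = clock()
    turns = 0
    tool_calls = 0
    model_said_done = False
    while True:
        # Every check below belongs to the orchestrator, not the model.
        if objective_met():
            reason = StopReason.OBJECTIVE_MET
            break
        if turns >= limits.max_turns:
            reason = StopReason.MAX_TURNS
            break
        if tool_calls >= limits.max_tool_calls:
            reason = StopReason.MAX_TOOL_CALLS
            break
        if clock() - started >= limits.max_seconds:
            reason = StopReason.TIMEOUT
            break
        result = step(turns)
        turns += 1
        tool_calls += result.tool_calls
        model_said_done = model_said_done or result.model_says_done
    # Audit record: why the orchestrator stopped, alongside what the model said.
    return {
        "stop_reason": reason.value,
        "turns": turns,
        "tool_calls": tool_calls,
        "model_said_done": model_said_done,
    }
```

In this sketch the caps are checked between turns, so a single turn can overshoot the tool-call or time limit. A stricter design would also enforce the caps inside the tool-calling layer.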

This is where workload identity matters: an agent should present a cryptographic identity, such as SPIFFE or OIDC-backed workload credentials, and each action should be evaluated against context at runtime. That is more reliable than static RBAC alone because autonomous systems do not follow fixed user-like patterns. NHIMG’s AI LLM hijack breach coverage and DeepSeek breach analysis show why secrets exposure and overbroad tool access turn small control failures into major incidents.
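Request-level authorisation against a short-lived, task-scoped grant might look like the following sketch. It does not use any real SPIFFE or OIDC library: `TaskGrant` and `authorize` are hypothetical names, the SPIFFE-style ID in the usage example is made up, and verification of the workload's cryptographic identity is assumed to have happened upstream.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class TaskGrant:
    workload_id: str                      # identity already verified upstream (assumption)
    task_id: str
    allowed: FrozenSet[Tuple[str, str]]   # (action, resource prefix) pairs for this task
    expires_at: float                     # epoch seconds; expiry ignores model claims


def authorize(
    grant: TaskGrant, workload_id: str, action: str, resource: str, now: float
) -> Tuple[bool, str]:
    """Evaluate one action at request time; return (allowed, reason)."""
    if now >= grant.expires_at:
        return False, "grant_expired"
    if workload_id != grant.workload_id:
        return False, "identity_mismatch"
    for allowed_action, prefix in grant.allowed:
        if action == allowed_action and resource.startswith(prefix):
            return True, "allowed"
    return False, "out_of_task_scope"


# Hypothetical usage: a ticket-triage agent with a grant that expires at t=1000.
grant = TaskGrant(
    workload_id="spiffe://example.org/agent/ticket-triage",
    task_id="task-123",
    allowed=frozenset({("read", "tickets/"), ("comment", "tickets/")}),
    expires_at=1000.0,
)
decision = authorize(grant, grant.workload_id, "read", "tickets/42", now=900.0)
```

Because the grant carries its own expiry and scope, an agent that keeps running after the task should have ended loses access regardless of what the model reports.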

These controls tend to break down when agents are embedded in long-running, multi-system workflows with loosely defined success criteria, because termination logic becomes fragmented across services.

Common Variations and Edge Cases

Tighter completion control often increases orchestration overhead, so organisations have to balance safety against throughput and developer convenience. There is no universal standard for this yet, but current guidance suggests that higher-risk agents should use stricter termination rules than low-impact assistants.

Some teams let the model propose “complete,” then require a separate policy service or human approver to confirm completion for sensitive actions. That can work well for finance, customer data, or production changes, but it adds latency. In lower-risk environments, a simple rule set may be enough if the agent has very narrow scope and no sensitive tool access. The key is that the model should never be the only stop condition.
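The propose-then-confirm pattern described above could be expressed as a small decision function. This is a sketch under assumptions: the two-level `Risk` tiering, the `Decision` values, and the `confirm_completion` name are illustrative, and `rules_pass` stands in for the result of a separate policy service or rule set.

```python
from enum import Enum
from typing import Optional


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


class Decision(Enum):
    CONTINUE = "continue"
    PENDING_APPROVAL = "pending_approval"
    COMPLETE = "complete"


def confirm_completion(
    model_proposes_done: bool,
    rules_pass: bool,
    risk: Risk,
    approver_confirms: Optional[bool] = None,
) -> Decision:
    """Decide whether a proposed completion is accepted.

    Hard caps on turns, tool calls, and time are assumed to be enforced
    elsewhere; this gate only handles the model's completion proposal.
    """
    if not model_proposes_done:
        return Decision.CONTINUE
    if not rules_pass:
        # The model claimed completion but policy disagrees.
        return Decision.CONTINUE
    if risk is Risk.LOW:
        return Decision.COMPLETE
    # High-risk work also needs a human decision (None means not yet given).
    if approver_confirms is None:
        return Decision.PENDING_APPROVAL
    return Decision.COMPLETE if approver_confirms else Decision.CONTINUE
```

The `PENDING_APPROVAL` state makes the added latency explicit: the workflow is paused, not finished, until the approver responds.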

Another edge case is when the agent’s goal is intentionally open-ended, such as research or content drafting. In those cases, the workflow should still end on policy triggers such as time, token budget, or task boundary, not on the model’s self-assessment. For guidance on broader NHI hygiene, NHIMG’s Moltbook AI agent keys breach and LiteLLM PyPI package breach articles show how quickly static credentials and weak containment become operational liabilities.
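For the open-ended case, the policy triggers mentioned above (time, token budget, task boundary) can be checked with a few comparisons. The sketch below uses hypothetical names (`Budget`, `Usage`, `policy_trigger`) and treats the task boundary as a simple item count, such as sources reviewed or sections drafted; a real boundary would be defined per workflow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Budget:
    max_seconds: float
    max_tokens: int
    max_items: int  # task boundary, e.g. sources reviewed (assumption)


@dataclass
class Usage:
    seconds: float = 0.0
    tokens: int = 0
    items: int = 0


def policy_trigger(usage: Usage, budget: Budget) -> Optional[str]:
    """Return the first policy trigger that has fired, or None to keep going."""
    if usage.seconds >= budget.max_seconds:
        return "time_budget"
    if usage.tokens >= budget.max_tokens:
        return "token_budget"
    if usage.items >= budget.max_items:
        return "task_boundary"
    return None
```

The orchestrator would call `policy_trigger` after each turn and stop on any non-`None` result, whatever the model's self-assessment says.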

Best practice is evolving, but the safe default is consistent: let the LLM advise, let policy decide, and let the orchestrator enforce the stop.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A3	Agentic workflows need explicit stop and boundary controls.
CSA MAESTRO	TA2	MAESTRO addresses runtime threat modeling for autonomous agents.
NIST AI RMF	GOVERN	AI RMF governance covers accountability for autonomous workflow decisions.

Bind agent completion to policy, turn limits, and termination rules, not model self-reporting.

NHIMG Editorial Note
Reviewed and updated by the NHIMG editorial team on May 30, 2026.