What should organisations do first when securing LLMs and AI agents?

Why This Matters for Security Teams

The first security mistake with LLMs and AI agents is treating them like ordinary applications with a few extra prompts. These systems can retrieve data, invoke tools, and take actions across other services, so the real risk is not just model output but the permissions wrapped around the model. That is why NHI Management Group emphasises ownership, permitted actions, and access boundaries before adding monitoring or red-teaming. Both the OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework point to governance and context as the starting point, not an afterthought.

This matters because attackers do not need to break the model if they can abuse the surrounding identity and toolchain. NHIMG's AI agents: new attack surface report shows how quickly agent misuse becomes a business issue when oversight is weak, and the same pattern appears in breach write-ups such as the LLMjacking research. In practice, many security teams discover unsafe agent behaviour only after a tool call, data leak, or credential exposure has already occurred, rather than through deliberate review.

How It Works in Practice

The practical first step is to define the agent’s trust boundary in operational terms: who owns it, what it is allowed to do, which data sources it can query, and which tools it may call. For LLMs, that means documenting prompt handling, retrieval scope, and output destinations. For agents, it means treating tool permissions as privileged access, not as a product configuration detail. Current best practice is evolving, but the direction is clear: static role assignment is too blunt for goal-driven workloads that can chain actions in ways humans did not pre-plan.
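As a minimal sketch of what "operational terms" could look like, the trust boundary can be written down as a small, deny-by-default record that the agent runtime consults before every tool call or retrieval. The class name `TrustBoundary`, its fields, and the example agent, owner, tool, and source names are all hypothetical and chosen only for illustration; they do not come from any framework named in this guidance.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TrustBoundary:
    """Hypothetical record of who owns an agent and what it may touch."""

    agent_id: str
    business_owner: str
    technical_owner: str
    data_sources: frozenset = field(default_factory=frozenset)
    tools: frozenset = field(default_factory=frozenset)

    def allows_tool(self, tool: str) -> bool:
        # Deny by default: only tools listed explicitly are callable.
        return tool in self.tools

    def allows_source(self, source: str) -> bool:
        # Deny by default: only listed retrieval sources are queryable.
        return source in self.data_sources


# Example values are assumptions, not real systems.
boundary = TrustBoundary(
    agent_id="invoice-triage-agent",
    business_owner="finance-ops",
    technical_owner="platform-security",
    data_sources=frozenset({"invoices-readonly"}),
    tools=frozenset({"search_invoices"}),
)

print(boundary.allows_tool("search_invoices"))  # True
print(boundary.allows_tool("approve_payment"))  # False
print(boundary.allows_source("hr-shared-drive"))  # False
```

Keeping the record immutable and explicit means any widening of the agent's authority has to be a reviewed change to the boundary itself, rather than a side effect of a new prompt or plugin.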

Security teams should start with a simple control map:

Assign a business owner and a technical owner for each model or agent.

Classify every retrieval source, plugin, API, and downstream system.

Separate read-only use cases from those that can trigger writes, transfers, or approvals.

Prefer short-lived, task-scoped credentials over standing secrets whenever tool access is required.

Log prompts, tool calls, and data access together so investigation can reconstruct intent, not just events.
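Several items in the control map above can be sketched together: classifying tools by effect, issuing a short-lived credential scoped to one task and one tool, and writing a single audit record that ties the prompt to the tool call and the resource touched. This is an illustrative sketch, not a production credential service. The tool names, the `TOOL_EFFECT` table, `TaskCredential`, `issue_credential`, and `audit_record` are hypothetical, and a real deployment would more likely obtain tokens from an existing secrets manager or identity provider.

```python
import json
import secrets
import time
import uuid
from dataclasses import dataclass

# Hypothetical classification of tools into read-only and write effects.
TOOL_EFFECT = {"search_tickets": "read", "close_ticket": "write"}


@dataclass(frozen=True)
class TaskCredential:
    token: str
    task_id: str
    tool: str
    expires_at: float

    def is_valid(self, tool: str, now: float) -> bool:
        # Valid only for the tool it was issued for, and only until expiry.
        return tool == self.tool and now < self.expires_at


def issue_credential(task_id, tool, ttl_seconds=300, now=None):
    """Issue a random token bound to one task and one tool."""
    now = time.time() if now is None else now
    return TaskCredential(secrets.token_urlsafe(32), task_id, tool, now + ttl_seconds)


def audit_record(task_id, prompt, tool, resource, now=None):
    """Return one JSON line linking prompt, tool call, and data access."""
    now = time.time() if now is None else now
    return json.dumps({
        "task_id": task_id,
        "timestamp": now,
        "prompt": prompt,
        "tool": tool,
        "effect": TOOL_EFFECT.get(tool, "unclassified"),
        "resource": resource,
    })


task_id = str(uuid.uuid4())
cred = issue_credential(task_id, "search_tickets", ttl_seconds=300, now=1000.0)
print(cred.is_valid("search_tickets", now=1100.0))  # True
print(cred.is_valid("close_ticket", now=1100.0))    # False: wrong tool
print(cred.is_valid("search_tickets", now=1400.0))  # False: expired
print(audit_record(task_id, "Find open tickets for ACME", "search_tickets", "tickets-db", now=1100.0))
```

Because the same `task_id` appears on the credential and in every audit line, an investigator can follow one request from the original prompt through each tool call. Prompts can themselves contain sensitive data, so a real log pipeline would need its own access controls and retention rules.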

That approach aligns with the CSA MAESTRO agentic AI threat modeling framework and the NIST AI Risk Management Framework, both of which stress governance, mapped responsibilities, and risk treatment before expansion. NHIMG’s OWASP NHI Top 10 also reflects the same operational reality: secret exposure and overbroad access are recurring failure modes, not edge cases. These controls tend to break down when an agent is allowed to span multiple business systems with shared credentials and no per-action approval path.

Common Variations and Edge Cases

Tighter control over LLMs and agents often increases deployment overhead, requiring organisations to balance speed against assurance. That tradeoff becomes sharper in environments that rely on experimentation, rapid prompt iteration, or autonomous workflows that change weekly. There is no universal standard for this yet, so the right answer depends on whether the system is advisory, semi-autonomous, or allowed to execute transactions.
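One way to make that dependency explicit is a small policy table keyed on the system's autonomy tier and the effect of the requested action. The sketch below is only one possible policy under stated assumptions: the tier names follow the three categories above, while the `Decision` values, the `POLICY` table, and the choice to let transactional systems write without per-action approval are hypothetical and would differ between organisations.

```python
from enum import Enum


class Autonomy(Enum):
    ADVISORY = "advisory"
    SEMI_AUTONOMOUS = "semi-autonomous"
    TRANSACTIONAL = "transactional"


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require-approval"
    DENY = "deny"


# Hypothetical policy: reads are allowed at every tier, writes depend on tier.
POLICY = {
    (Autonomy.ADVISORY, "read"): Decision.ALLOW,
    (Autonomy.ADVISORY, "write"): Decision.DENY,
    (Autonomy.SEMI_AUTONOMOUS, "read"): Decision.ALLOW,
    (Autonomy.SEMI_AUTONOMOUS, "write"): Decision.REQUIRE_APPROVAL,
    (Autonomy.TRANSACTIONAL, "read"): Decision.ALLOW,
    (Autonomy.TRANSACTIONAL, "write"): Decision.ALLOW,
}


def decide(tier: Autonomy, effect: str) -> Decision:
    # Anything not listed, such as an unclassified effect, is denied.
    return POLICY.get((tier, effect), Decision.DENY)


print(decide(Autonomy.ADVISORY, "write").value)         # deny
print(decide(Autonomy.SEMI_AUTONOMOUS, "write").value)  # require-approval
print(decide(Autonomy.TRANSACTIONAL, "unknown").value)  # deny
```

Writing the policy as data rather than scattered conditionals keeps the speed-versus-assurance tradeoff visible: loosening a tier is a one-line change that an owner can review and approve.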

Edge cases usually fall into three buckets. First, internal copilots may feel low risk, but if they can reach sensitive documents or shared drives, the boundary problem is the same as for external-facing agents. Second, multi-agent workflows can blur accountability because one agent’s retrieval becomes another agent’s action. Third, some teams over-focus on model safety and ignore the identity layer that actually enables abuse. NHIMG breach analysis such as the Moltbook AI agent keys breach and the DeepSeek breach shows why secret hygiene and scope control must be established early, before broader AI rollout.

Where organisations move fastest, the safest first decision is not “How do we monitor everything?” but “What exactly is this system authorised to do, and who approves that authority?”

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A1	Covers overbroad agent actions and missing guardrails around tool use.
CSA MAESTRO	GOV-1	Governance-first design fits the need to assign ownership before deployment.
NIST AI RMF		AI RMF prioritises governance and risk framing before technical controls.

Establish AI governance, risk ownership, and control boundaries before expanding use.
