Who is accountable when an AI vendor changes an agent’s capabilities without notice?

Why This Matters for Security Teams

When a vendor changes an agent’s capabilities without notice, the issue is not just “unexpected functionality.” It is a control break in the enterprise identity graph, because the effective access model has changed without a corresponding approval event. That is why accountability cannot stop at the supplier boundary. Security teams need evidence of what the agent could do yesterday, what it can do today, and who accepted the residual risk. The governance gap is often exposed through audit, incident response, or a business exception rather than during planned review. The NIST AI Risk Management Framework and the OWASP Agentic AI Top 10 point to the same operational problem: capabilities must be governed as dynamic risk, not assumed stable after procurement.

NHIMG research on LLMjacking shows how quickly attackers move once AI credentials or agent access paths are exposed, which makes silent capability expansion especially dangerous for response teams. In practice, many security teams encounter the change only after the agent has already been used in production in a way nobody explicitly approved.

How It Works in Practice

Accountability starts with assigning ownership to the enterprise, but it has to be operationalised through controls that can detect and bound change. For autonomous agents, static RBAC is usually too slow and too coarse because capability shifts are not always tied to a human role change. Instead, organisations are moving toward runtime authorisation, recertification triggers, and short-lived permissions that can be revoked when behaviour changes. That means treating the agent as a workload identity with clearly scoped privileges, not as a permanent account with broad standing access.
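A minimal sketch of the runtime-authorisation idea above, assuming a hypothetical `Grant` record that ties a short-lived permission to the capability version it was approved against. The names `Grant`, `issue_grant`, `authorize`, the scope strings, and the 15-minute lifetime are illustrative assumptions, not part of any vendor API or standard.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """Short-lived permission for an agent workload identity (hypothetical shape)."""
    agent_id: str
    capability_version: str  # capability baseline the grant was approved against
    scopes: frozenset
    expires_at: datetime


def issue_grant(agent_id, capability_version, scopes, now, ttl_minutes=15):
    """Issue a grant that lapses on its own instead of standing indefinitely."""
    return Grant(agent_id, capability_version, frozenset(scopes),
                 now + timedelta(minutes=ttl_minutes))


def authorize(grant, action, observed_version, now):
    """Re-evaluate the grant at request time rather than trusting a static role."""
    if now >= grant.expires_at:
        return False, "grant expired"
    if observed_version != grant.capability_version:
        # The agent no longer matches what was approved: deny until recertified.
        return False, "capability version changed since approval"
    if action not in grant.scopes:
        return False, "action outside approved scope"
    return True, "allowed"


# Illustrative values only.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
grant = issue_grant("invoice-agent", "v1", {"read:invoices"}, now)
print(authorize(grant, "read:invoices", "v1", now))
print(authorize(grant, "read:invoices", "v2", now))
```

The design choice in this sketch is that a capability version mismatch denies the request even when the token is still valid, so a silent vendor change acts as a recertification trigger rather than inheriting the old approval.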

In practice, teams should require vendor changes to flow through a documented change-control path, even when the vendor frames them as “improvements.” The control set usually includes:

capturing the approved capability baseline for each agent version

binding the agent to workload identity and short-lived tokens rather than durable secrets

re-evaluating permissions at request time using policy-as-code

freezing or quarantining the agent when the capability delta cannot be assessed quickly

re-certifying high-risk tools and actions before re-enabling production use

This aligns with the direction of the CSA MAESTRO agentic AI threat modeling framework, which treats tool access, orchestration, and runtime trust as separate decision points. It also matches NHIMG’s OWASP NHI Top 10 coverage of identity and secret abuse in agentic environments, where the practical failure is usually not the model alone but the way credentials and permissions are inherited across tools.

These controls tend to break down when vendors can push capability updates directly into production agents without a customer-side approval gate, because the enterprise cannot prove which effective permissions were active at the time of use.
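The baseline and freeze controls listed above can be sketched as a customer-side gate that compares an approved capability manifest with what the agent currently exposes. This is an illustration under stated assumptions: the manifest shape (`version`, `tools`), the tool names, the `HIGH_RISK_TOOLS` set, and the decision labels are all hypothetical, and how the observed manifest is obtained from a given platform is left open.

```python
import hashlib
import json

# Hypothetical list of tools that always require recertification before use.
HIGH_RISK_TOOLS = {"shell.exec", "payments.transfer"}


def fingerprint(manifest):
    """Stable hash of a capability manifest, usable as audit evidence."""
    canonical = json.dumps(
        {"version": manifest["version"], "tools": sorted(manifest["tools"])},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def capability_delta(baseline, observed):
    """Tools added or removed relative to the approved baseline."""
    base, seen = set(baseline["tools"]), set(observed["tools"])
    return {"added": sorted(seen - base), "removed": sorted(base - seen)}


def decide(baseline, observed):
    """Return 'allow', 'hold_for_review', or 'quarantine' for the observed agent."""
    if fingerprint(baseline) == fingerprint(observed):
        return "allow"
    delta = capability_delta(baseline, observed)
    if any(tool in HIGH_RISK_TOOLS for tool in delta["added"]):
        return "quarantine"  # high-risk tool appeared without approval
    return "hold_for_review"  # any other unapproved change freezes the agent


# Illustrative values only.
baseline = {"version": "1.4", "tools": ["crm.read", "email.send"]}
observed = {"version": "1.5", "tools": ["crm.read", "email.send", "shell.exec"]}
print(capability_delta(baseline, observed))
print(decide(baseline, observed))
```

Storing the fingerprint alongside each approval gives the enterprise a record of which capability set was accepted at a given time, which is the evidence the paragraph above says is otherwise missing.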

Common Variations and Edge Cases

Tighter change control often increases operational overhead, requiring organisations to balance faster vendor iteration against stronger approval discipline. That tradeoff is real, especially for agents that depend on frequent model or tool updates. Best practice is evolving, and there is no universal standard for this yet, but the current direction is clear: if a vendor can alter what an agent does, the customer needs an enforceable freeze path, not just a contractual promise.

Edge cases usually appear in three places. First, hosted agent platforms may hide implementation changes behind the same API, which makes version pinning and evidence capture essential. Second, multi-agent workflows can amplify a small capability change into a broader blast radius because one agent’s new tool access becomes another agent’s trusted input. Third, emergency security patches can justify temporary exceptions, but those exceptions still need time-boxed approval and post-change recertification. NHIMG’s coverage of the AI LLM hijack breach and the DeepSeek breach reinforces the same lesson: when capability changes are invisible, accountability becomes contested after the fact.
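The third edge case, a time-boxed emergency exception, can be sketched as a record that expires on its own unless recertification is logged. The class name `EmergencyException`, its fields, the status labels, and the 24-hour window are hypothetical choices for illustration, not a prescribed process.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class EmergencyException:
    """Temporary approval to run a changed agent before full review (hypothetical)."""
    agent_id: str
    approved_by: str  # named owner who accepted the residual risk
    granted_at: datetime
    max_duration: timedelta
    recertified_at: Optional[datetime] = None

    def status(self, now):
        """'closed' once recertified, 'active' inside the window, else 'expired'."""
        if self.recertified_at is not None:
            return "closed"
        if now < self.granted_at + self.max_duration:
            return "active"
        return "expired"  # the agent should be frozen until recertification


# Illustrative values only.
granted = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
exc = EmergencyException("support-agent", "ciso@example.com", granted, timedelta(hours=24))
print(exc.status(granted + timedelta(hours=1)))
print(exc.status(granted + timedelta(hours=25)))
```

Recording the approver on the exception keeps the question of who accepted the risk answerable after the fact, even when the change itself was made under time pressure.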

The practical answer is not to trust vendor notice alone. It is to define who owns the identity graph, who can approve capability deltas, and what automation will stop an agent before unreviewed access becomes production behaviour.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A1	Agent capability drift creates runtime trust and access risks.
CSA MAESTRO	M1	MAESTRO addresses orchestration, tool trust, and change control.
NIST AI RMF		AI RMF covers governance, accountability, and change impact management.

Pin agent versions, gate tool access, and recertify permissions before re-enabling changed capabilities.
