
How can organisations govern third-party AI systems without losing accountability?

They need explicit ownership, control evidence, and review cadence for every external AI dependency. That includes suppliers, hosted platforms, and integrated services that influence model behaviour or data exposure. Without those controls, accountability becomes implied rather than auditable, which is a weak basis for enterprise governance.

Why This Matters for Security Teams

Third-party AI systems create a governance gap because the organisation can be accountable for outcomes without directly controlling the model, the hosting stack, or the data flows that shape those outcomes. That makes supplier risk, data exposure, and model behaviour inseparable from enterprise identity governance. Current guidance suggests this should be treated as an access and assurance problem, not just a procurement review. The control challenge becomes clearer when incidents show how quickly secrets and dependencies can be abused, as described in NHIMG’s 52 NHI breaches Report and the Ultimate Guide to NHIs for Regulatory and Audit Perspectives.

Security teams often assume a vendor contract or shared-responsibility statement is enough. It is not. Governance fails when no one can show who approved the dependency, what telemetry is available, whether the vendor can change behaviour unilaterally, or how evidence is reviewed over time. The issue is not whether the AI is external. The issue is whether enterprise accountability remains auditable when the system is external. In practice, many security teams discover this only after a supplier changes model behaviour, logging, or retention settings without a formal review path.

How It Works in Practice

Effective governance starts by assigning a named business owner, a technical owner, and a risk owner for each third-party AI dependency. Those roles need to be linked to the specific service, not the vendor in general. The organisation should also maintain an inventory of what the system can access, what data it ingests, what outputs it can influence, and which internal workflows depend on it. This is where identity governance intersects with vendor management. The OWASP Non-Human Identity Top 10 is useful here because many third-party AI risks are really NHI risks: over-privileged service accounts, opaque token use, weak rotation, and missing revocation paths.
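As an illustration of the inventory described above, the following is a minimal sketch of one record per third-party AI service, with a check for unassigned ownership roles. The class name, field names, and sample values (`AIDependency`, `summarisation-api`, `ExampleVendor`, and the named owners) are hypothetical and chosen for this example only; they do not come from any standard or product.

```python
from dataclasses import dataclass, field


@dataclass
class AIDependency:
    """One third-party AI service (not the vendor as a whole)."""
    service: str
    vendor: str
    business_owner: str = ""
    technical_owner: str = ""
    risk_owner: str = ""
    data_ingested: list[str] = field(default_factory=list)
    access_scopes: list[str] = field(default_factory=list)
    dependent_workflows: list[str] = field(default_factory=list)


def missing_owners(dep: AIDependency) -> list[str]:
    """Return the ownership roles that have no named person assigned."""
    roles = {
        "business_owner": dep.business_owner,
        "technical_owner": dep.technical_owner,
        "risk_owner": dep.risk_owner,
    }
    return [role for role, person in roles.items() if not person.strip()]


# Hypothetical example record: the technical owner has not been assigned.
dep = AIDependency(
    service="summarisation-api",
    vendor="ExampleVendor",
    business_owner="J. Rivera",
    technical_owner="",
    risk_owner="A. Chen",
    data_ingested=["support tickets"],
    access_scopes=["tickets:read"],
    dependent_workflows=["customer-support triage"],
)
print(missing_owners(dep))  # ['technical_owner']
```

Keying the record on the service rather than the vendor reflects the point that ownership must attach to the specific dependency. A real inventory would likely live in a GRC tool or CMDB rather than in code.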

Practitioners should require control evidence that can be reviewed on a cadence, not just at onboarding. That evidence may include access logs, data retention settings, incident notification SLAs, model update notices, and test results showing how the system behaves under policy constraints. The Top 10 NHI Issues and Lifecycle Processes for Managing NHIs both reinforce the same operational point: identities, secrets, and approvals need lifecycle control, not one-time approval.

  • Define whether the vendor is a data processor, model provider, or integrated service operator.
  • Map all secrets, API keys, and delegated tokens used by the third-party AI system.
  • Set review intervals for access, retention, and behavioural changes.
  • Require change notification before model updates or policy changes reach production.
  • Capture evidence of testing, logging, and revocation so accountability remains auditable.
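The review-interval step in the list above can be sketched as a simple overdue check. The interval values and control-area names in `REVIEW_INTERVALS` are assumptions made for illustration; appropriate cadences depend on the organisation's risk appetite and the system's criticality.

```python
from datetime import date, timedelta

# Hypothetical review intervals in days; real values depend on risk appetite.
REVIEW_INTERVALS = {
    "access": 90,
    "retention": 180,
    "behavioural_change": 30,
}


def overdue_reviews(last_reviewed: dict[str, date], today: date) -> list[str]:
    """Return the control areas whose last review is older than its interval.

    An area with no recorded review is treated as overdue.
    """
    overdue = []
    for area, interval_days in REVIEW_INTERVALS.items():
        reviewed_on = last_reviewed.get(area)
        if reviewed_on is None or today - reviewed_on > timedelta(days=interval_days):
            overdue.append(area)
    return overdue


# Hypothetical review history: behavioural changes have never been reviewed.
last_reviewed = {
    "access": date(2025, 1, 10),
    "retention": date(2025, 3, 1),
}
print(overdue_reviews(last_reviewed, today=date(2025, 6, 1)))
# ['access', 'behavioural_change']
```

Treating a missing review as overdue keeps the default on the side of auditability: evidence that was never collected is flagged rather than silently ignored.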

The NIST Cybersecurity Framework 2.0 is a useful baseline for structuring those controls across govern, identify, protect, detect, respond, and recover. These controls tend to break down when the third-party AI system is embedded through multiple sub-processors and downstream APIs because ownership, telemetry, and change notification become fragmented across several operators.

Common Variations and Edge Cases

Tighter oversight often increases procurement friction and review overhead, so organisations must balance speed against assurance. That tradeoff is unavoidable for high-impact AI, but current guidance suggests the burden should scale with the system’s data sensitivity, autonomy, and business criticality. Not every third-party AI dependency needs the same review depth, and because no universal standard yet defines how deep each review should go, risk-based tiering is the safest practical approach.
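One way to picture risk-based tiering is a function that maps the three factors named above to a review tier. Because there is no universal standard, everything in this sketch is an assumption: the 1 to 3 scoring scale, the tier names, and the rule that the highest single score decides the tier.

```python
def review_tier(data_sensitivity: int, autonomy: int, criticality: int) -> str:
    """Map three factor scores (1 = low, 2 = medium, 3 = high) to a review tier.

    Assumed rule: the highest single score decides the tier, so one high-risk
    factor is enough to require the deepest review.
    """
    scores = (data_sensitivity, autonomy, criticality)
    if any(score not in (1, 2, 3) for score in scores):
        raise ValueError("each score must be 1, 2, or 3")
    tiers = {1: "baseline", 2: "standard", 3: "enhanced"}
    return tiers[max(scores)]


print(review_tier(1, 1, 1))  # baseline
print(review_tier(2, 1, 2))  # standard
print(review_tier(3, 1, 1))  # enhanced
```

Using the maximum rather than an average is a deliberately conservative choice: a low-autonomy feature that handles highly sensitive data still lands in the top tier. An organisation could reasonably weight the factors differently.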

One edge case is a low-touch embedded AI feature inside a SaaS platform. The vendor may claim the feature is part of the standard service, but it can still change data handling, prompts, or output behaviour in ways that affect accountability. Another is a hosted model exposed through an internal proxy, where the enterprise owns the proxy but not the underlying model. In that case, the organisation must document where its control ends and the supplier’s begins. NHIMG’s coverage of the DeepSeek breach illustrates why exposure, retention, and third-party handling cannot be treated as abstract risk categories.

For organisations with regulated data or safety-critical workflows, annual reviews are usually too slow. Best practice is evolving toward continuous or event-driven reassessment when the vendor changes models, retraining inputs, hosting regions, or incident posture. In practice, accountability weakens fastest when the third-party AI becomes “just another feature” and no one rechecks the evidence after integration.
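Event-driven reassessment can be sketched as a filter over vendor change events. The event names below are hypothetical labels for the triggers listed above (model changes, retraining inputs, hosting regions, incident posture); a real feed would depend on what the vendor actually notifies.

```python
# Hypothetical event labels for the reassessment triggers named in the text.
REASSESSMENT_TRIGGERS = {
    "model_change",
    "retraining_inputs_change",
    "hosting_region_change",
    "incident",
}


def reassessment_reasons(vendor_events: list[str]) -> list[str]:
    """Return the vendor events that should trigger a reassessment, sorted."""
    return sorted(set(vendor_events) & REASSESSMENT_TRIGGERS)


events = ["ui_update", "model_change", "incident"]
reasons = reassessment_reasons(events)
print(reasons)  # ['incident', 'model_change']
if reasons:
    print("Reassessment required")
```

An empty result means none of the observed events matched a trigger; it does not replace the scheduled review cadence.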

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Non-Human Identity Top 10 | NHI-03 | Third-party AI often relies on unmanaged secrets and delegated access.
NIST CSF 2.0 | GV.OV-01 | Governance requires named ownership, oversight, and review of external AI risk.
NIST AI RMF | GOVERN | AI RMF governance is central to documenting accountability for third-party AI systems.

Track every vendor-issued secret, set rotation rules, and revoke access when the dependency changes.