Why do AI vendor assessments need more than a standard security questionnaire?

Why This Matters for Security Teams

AI vendor reviews fail when they focus only on baseline controls such as encryption, access reviews, and incident response. For AI services, the real risk is often in data handling, prompt retention, model reuse, output logging, and whether hidden features can change how information is processed after onboarding. A standard questionnaire can confirm that a supplier has policies; it cannot show how the service behaves under real workload conditions. NIST’s Cybersecurity Framework 2.0 is useful as a baseline, but AI vendors need additional scrutiny around AI-specific data flows and operational controls.

NHI Management Group’s research on the State of Secrets in AppSec shows how quickly secrets exposure and weak operational discipline can turn into real abuse. AI vendor due diligence must therefore test what happens to prompts, credentials, and outputs after the initial sale. In practice, many security teams discover those gaps only after sensitive content has already been ingested into a service or reused in a way the questionnaire never captured.

How It Works in Practice

A stronger assessment starts by separating ordinary SaaS questions from AI-specific controls. A vendor may answer yes to MFA, encryption, and logging while still retaining prompts for model improvement, exposing outputs to broad internal access, or changing feature behavior without clear customer notice. That is why practitioners should ask for explicit answers on data retention, training reuse, human review, output filtering, tenant isolation, and whether prompts or files are excluded from model training by default.

The review should also test evidence, not just policy. Ask the vendor to document:

Prompt ingestion, retention periods, and deletion workflows

Whether customer content is used to train or fine-tune models

How output monitoring detects leakage, harmful content, or policy drift

How feature flags or “silent” product changes are approved and communicated

Which personnel can access logs, traces, and prompt history
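
The evidence questions above can be tracked as a simple checklist that separates a policy answer from operational evidence. The following is a minimal Python sketch under stated assumptions: the `EvidenceItem` class, the `open_gaps` helper, and the sample answers are hypothetical and do not represent a standard schema or any real vendor.

```python
from dataclasses import dataclass


@dataclass
class EvidenceItem:
    """One AI-specific review question (topic names are illustrative)."""
    topic: str
    policy_stated: bool      # vendor gave an answer in the questionnaire
    evidence_provided: bool  # vendor supplied settings, logs, or a walkthrough


def open_gaps(items):
    """Return topics that still lack operational evidence,
    whether or not a policy answer was given."""
    return [item.topic for item in items if not item.evidence_provided]


# Hypothetical answers for a single vendor.
checklist = [
    EvidenceItem("prompt retention and deletion", True, True),
    EvidenceItem("training or fine-tuning on customer content", True, False),
    EvidenceItem("output monitoring", True, False),
    EvidenceItem("feature-change approval and notice", False, False),
    EvidenceItem("personnel access to logs and prompt history", True, True),
]

for topic in open_gaps(checklist):
    print("Evidence still required:", topic)
```

Keeping `policy_stated` and `evidence_provided` as separate fields makes the gap between a "yes" on a questionnaire and demonstrated behavior visible at a glance.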

This is where AI assessments differ from conventional security questionnaires. The control question is no longer only “is data protected,” but also “what is the service allowed to do with it after submission.” That distinction is central to the DeepSeek breach analysis, which illustrates how embedded secrets and exposed records can create risk far beyond a normal vendor questionnaire. Teams should also reference the NIST AI Risk Management Framework alongside the Cybersecurity Framework 2.0 when mapping responsibilities for governance, monitoring, and response.

These controls tend to break down when the vendor chains multiple model providers, retrieval layers, and telemetry services because data lineage and retention responsibilities become hard to verify.
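
One way to make that lineage problem concrete is to map where customer content flows and check each hop for retention evidence. The sketch below is illustrative only: the component names, the `flows` map, and the `retention_verified` set are assumptions invented for the example, not a description of any real vendor architecture.

```python
from collections import deque

# Hypothetical data-flow map: each component lists where it forwards customer content.
flows = {
    "vendor_app": ["retrieval_layer", "telemetry_service"],
    "retrieval_layer": ["model_provider_a"],
    "model_provider_a": [],
    "telemetry_service": ["analytics_subprocessor"],
    "analytics_subprocessor": [],
}

# Components for which retention evidence has been supplied (assumed values).
retention_verified = {"vendor_app", "model_provider_a"}


def unverified_downstream(start, flows, verified):
    """Breadth-first walk over every component reachable from `start`;
    return those without retention evidence, in visit order."""
    seen = {start}
    queue = deque([start])
    gaps = []
    while queue:
        node = queue.popleft()
        if node not in verified:
            gaps.append(node)
        for nxt in flows.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return gaps


print(unverified_downstream("vendor_app", flows, retention_verified))
```

Each component returned is a place where the reviewer would need to ask who holds the data, for how long, and under which contract.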

Common Variations and Edge Cases

Tighter AI vendor review often increases procurement time and legal overhead, requiring organisations to balance speed against the need for evidence-based assurance. Best practice is evolving, and there is no universal standard for this yet, so mature teams adapt the depth of review to the sensitivity of the use case. A low-risk summarisation tool should not be assessed the same way as a system that ingests regulated, confidential, or customer-facing data.
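
Adapting review depth to sensitivity can be expressed as a small lookup. This is a hedged sketch rather than a recommended policy: the two tier names, the `REVIEW_STEPS` contents, and the `review_tier` function are hypothetical, and real thresholds would follow the organisation’s own data classification.

```python
# Hypothetical tiers and steps; adjust to the organisation's own classification scheme.
REVIEW_STEPS = {
    "baseline": ["standard security questionnaire"],
    "extended": [
        "standard security questionnaire",
        "AI data-handling questions",
        "data-flow review",
        "contract clause review",
        "operational evidence request",
    ],
}


def review_tier(regulated, confidential, customer_facing):
    """Return 'extended' when the use case touches regulated, confidential,
    or customer-facing data; otherwise 'baseline'."""
    if regulated or confidential or customer_facing:
        return "extended"
    return "baseline"


# A summarisation tool working only on public material (assumed example).
tier = review_tier(regulated=False, confidential=False, customer_facing=False)
print(tier, REVIEW_STEPS[tier])
```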

Edge cases often include subcontracted model hosting, open-weight models deployed in the vendor’s environment, and “bring your own key” claims that do not clearly limit backend access. Standard questionnaires also struggle when vendors reserve the right to activate new features, retain telemetry for analytics, or use customer interactions to improve services unless the contract explicitly forbids it. In those cases, the right follow-up is a data-flow review, contract clause review, and a request for operational evidence such as retention settings, redaction behavior, and log access controls. NHI Management Group’s Ultimate Guide to NHIs — Standards is useful for aligning these checks with broader identity and secrets governance. The Ultimate Guide to NHIs — The NHI Market also helps teams understand why vendor risk now includes machine identities, service tokens, and model-adjacent workflows rather than only user access.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST CSF 2.0	GV.OV-01	AI vendors need governance oversight beyond baseline security questions.
NIST AI RMF	GOVERN	Vendor AI risk depends on accountability, documentation, and oversight.
OWASP Agentic AI Top 10	A2	Questionnaires miss prompt handling and output controls central to agentic AI risk.

Require evidence for AI data handling, retention, and monitoring under your vendor governance review.
