Who is accountable when untrusted project configuration changes what an AI assistant sees?

Accountability sits with the team that allowed repository-local configuration to auto-load without explicit trust controls. The OWASP Non-Human Identity (NHI) Top 10 and zero trust architecture both point to the same principle: provenance, review, and revocation must be part of the control design, not an afterthought.

Why This Matters for Security Teams

When repository-local configuration is allowed to auto-load, the assistant is no longer operating only on the prompt the user can see. It is also shaped by hidden files, workspace defaults, and inherited settings that may be unreviewed or attacker-controlled. That turns a simple prompt issue into a provenance and trust problem, which is why guidance from the NIST Cybersecurity Framework 2.0 matters here: asset trust, change control, and recovery all depend on knowing what influenced the system.

This is also why NHI Management Group treats untrusted project state as a governance failure, not just a developer convenience issue. The DeepSeek breach shows how quickly hidden exposure can become an operational security problem when secrets or sensitive context are present where they should not be. In practice, many security teams discover the problem only after an assistant has already acted on malicious configuration, not through a deliberate review of what the assistant is allowed to load.

How It Works in Practice

The accountable team is the one that owns the trust boundary around the repository, workspace, or build context. If an AI assistant reads local project files automatically, then configuration files, instruction files, and environment metadata become part of the control plane. That means the security design has to answer four questions: what is allowed to load, who approved it, how is it validated, and how is it revoked when risk changes.
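One way to make those four questions concrete is to record an answer to each of them for every assistant-facing file. The sketch below is illustrative only: the `TrustRecord` class and its field names are hypothetical, not part of any framework or tool named in this guidance.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class TrustRecord:
    """One approval decision for one assistant-facing file (hypothetical schema)."""

    path: str                              # what is allowed to load
    approved_by: str                       # who approved it
    sha256: str                            # how it is validated: hash of the reviewed content
    revoked_at: Optional[datetime] = None  # how it is revoked: set when risk changes

    def revoke(self) -> None:
        # Revocation is a single state change, so it can be fast.
        self.revoked_at = datetime.now(timezone.utc)

    def is_loadable(self, observed_sha256: str) -> bool:
        # Loadable only while unrevoked and unchanged since approval.
        return self.revoked_at is None and observed_sha256 == self.sha256
```

A record like this gives the control owner a named approver and a revocation path for each file, which is what the accountability question ultimately needs.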

Current best practice is to treat project-local configuration as untrusted until it passes explicit checks. That usually means:

Separating user-entered prompts from repository-supplied instructions.
Requiring provenance checks before auto-loading workspace configuration.
Using allowlists for files, paths, and settings the assistant may consume.
Logging when configuration changes affect tool access, data scope, or model behavior.
Revalidating trust after pull requests, dependency updates, or branch switches.
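The allowlist, provenance, and logging practices above can be sketched as a single gate that runs before any repository file reaches the assistant. This is a minimal sketch under stated assumptions: the `APPROVED` mapping, the `may_load` function, and the example path are hypothetical, and a pinned SHA-256 hash stands in for whatever provenance check a team actually uses.

```python
import hashlib
import logging
from pathlib import PurePosixPath
from typing import Mapping

logger = logging.getLogger("assistant.config")

# Hypothetical allowlist: repo-relative path -> SHA-256 of the reviewed content.
APPROVED = {
    "assistant/instructions.md": hashlib.sha256(
        b"Use the project style guide.\n"
    ).hexdigest(),
}


def may_load(rel_path: str, content: bytes,
             approved: Mapping[str, str] = APPROVED) -> bool:
    """Return True only for allowlisted files whose content matches the reviewed hash."""
    normalized = str(PurePosixPath(rel_path))  # drops a leading "./" and duplicate slashes
    expected = approved.get(normalized)
    if expected is None:
        logger.warning("blocked: %s is not on the allowlist", normalized)
        return False
    if hashlib.sha256(content).hexdigest() != expected:
        logger.warning("blocked: %s changed since it was reviewed", normalized)
        return False
    logger.info("loaded: %s", normalized)
    return True
```

Because the hash is recomputed on every call, running the gate again after a pull request, dependency update, or branch switch revalidates trust without extra machinery.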

For agentic or tool-using assistants, this is not just a content filter problem. An assistant with execution authority can chain a poisoned configuration into file access, tool calls, or secret discovery, so runtime policy has to be evaluated against the current context rather than a static role alone. That aligns with the State of Secrets in AppSec research, which shows how fragmented secret handling and slow remediation create a persistent exposure window. The control objective is simple: make trust explicit, make inherited state visible, and make revocation fast. These controls tend to break down when local development environments mix personal overrides, shared workspaces, and unattended assistant plugins, because the system can no longer reliably distinguish intentional project context from attacker-supplied instructions.
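The difference between a static role check and a context-aware check can be shown in a few lines. The sketch below assumes a hypothetical `Context` object and an illustrative policy (high-risk tools need verified configuration and no secrets in scope); real policies will differ, and none of these names come from a specific product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """State of the current task (hypothetical fields)."""

    role: str
    config_verified: bool   # did the loaded configuration pass provenance checks?
    secrets_in_scope: bool  # are secrets reachable from this workspace?


# Hypothetical set of tools that can cause harm outside the workspace.
HIGH_RISK_TOOLS = {"shell", "network"}


def allow_tool(tool: str, ctx: Context) -> bool:
    """Decide per call, using the current context rather than the role alone."""
    if ctx.role != "developer":
        return False
    if tool in HIGH_RISK_TOOLS:
        # The role is necessary but not sufficient for high-risk tools.
        return ctx.config_verified and not ctx.secrets_in_scope
    return True
```

The same developer role gets a different answer for `shell` depending on whether the configuration was verified, which is the property a static role mapping cannot express.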

Common Variations and Edge Cases

Tighter configuration controls often increase developer friction, so organisations have to balance safety against workflow speed. That tradeoff is real, especially in monorepos, ephemeral branches, and agentic coding tools, where frequent context changes can make heavy approval flows impractical. Current guidance suggests risk-based controls rather than one blanket rule, but no standard for this exists yet.

Edge cases matter. A trusted repository can still carry untrusted configuration if a compromised contributor modifies assistant-facing files. A sandboxed assistant can still be misled if it inherits secrets, cached context, or policy exceptions from a previous task. And in multi-agent pipelines, one agent may generate a file that another agent treats as authoritative without any human review. The safest pattern is to scope trust to the smallest viable unit, then require explicit renewal whenever the assistant crosses a boundary such as repository, environment, or privilege level.
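Scoping trust to the smallest viable unit can be sketched by binding each grant to the exact boundary it was issued for. The `Boundary` and `ScopedTrust` names below are hypothetical, and the sketch assumes repository, environment, and privilege level are the only dimensions that matter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Boundary:
    """The unit a trust grant applies to (hypothetical fields)."""

    repository: str
    environment: str
    privilege: str


class ScopedTrust:
    """Holds at most one grant, valid only for the boundary it was issued for."""

    def __init__(self) -> None:
        self._granted_for: Optional[Boundary] = None

    def grant(self, boundary: Boundary) -> None:
        # Explicit renewal: a new grant replaces the previous one.
        self._granted_for = boundary

    def is_valid(self, current: Boundary) -> bool:
        # Any change of repository, environment, or privilege invalidates the grant.
        return self._granted_for == current
```

With this shape, trust does not carry over from a previous task by accident: crossing any boundary leaves `is_valid` false until someone calls `grant` again.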

For teams mapping this to governance, the lesson is consistent: accountability rests with the control owner who approved auto-loading, not with the assistant that consumed the data. That principle is reinforced by the NIST Cybersecurity Framework 2.0 and by NHI Management Group's view that provenance and revocation must be designed into the workflow, not bolted on later.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST Zero Trust (SP 800-207) and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Non-Human Identity Top 10	NHI-01	Untrusted config changes the NHI trust boundary and control expectations.
NIST Zero Trust (SP 800-207)	Section 2.1 (Tenets of Zero Trust)	Zero trust requires explicit verification before inherited context is trusted.
NIST CSF 2.0	PR.AA-05	Access control must reflect who approved the configuration, not just who executed it.

Classify project-local assistant config as untrusted until provenance and approval are verified.
