Code provenance is the missing control for AI-generated commits

By NHI Mgmt Group Editorial Team | Published 2025-12-22 | Domain: Agentic AI & NHIs | Source: Beyond Identity

TL;DR: AI now generates or assists with 41% of code, and 82% of developers use AI tools weekly, according to the source article’s cited industry reports. At that scale, commit-level identity and provenance controls, rather than post-commit scanning alone, become the deciding factor in software supply chain security.

At a glance

What this is: This analysis argues that code provenance has become a core security requirement because AI-assisted development makes commit identity harder to trust and easier to exploit.

Why it matters: For IAM and NHI practitioners, the issue is that code commits now behave like high-impact non-human actions and need cryptographic identity controls, not just repository permissions.

By the numbers:

41% of all code is now AI-generated or AI-assisted, with 82% of developers using AI tools weekly.
25% of Google's code and 30% of Microsoft's code are written by AI.
Software supply chain attacks are projected to cost organizations $60 billion in 2025.

👉 Read Beyond Identity's analysis of code provenance and AI-generated commits

Context

Code provenance is the ability to verify who or what created a change, when it was made, and whether it was altered before release. In an AI-heavy development pipeline, that is no longer a nice-to-have control, because code is being generated faster than manual review can keep up. The primary IAM question shifts from who can push code to who can be trusted behind each commit.

The governance gap is familiar to NHI teams: a credential can prove possession, but not intent, authorship, or downstream accountability. That makes source control identities, signing keys, and developer authentication part of the broader NHI problem space. For a foundational reference, see the Ultimate Guide to NHIs.

Key questions

Q: How should security teams verify the identity behind AI-generated code commits?

A: Security teams should require every commit to be cryptographically signed and bound to a verified corporate identity before it can merge. That means pairing repository policy with identity assurance, hardware-backed credentials where possible, and audit trails that show who approved the change. Possession of a key is not enough when AI can generate code at scale.
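
As a rough illustration of that merge-time check, the sketch below parses output from `git log --format='%H|%G?|%GK|%ae'` and flags commits whose signature is missing, not fully verified, or made with a key that is not enrolled against a corporate identity. The `%G?` status letters are Git's own. The `ENROLLED_KEYS` mapping, the sample log, and the function names are hypothetical stand-ins for whatever your identity provider and repository platform expose.

```python
from dataclasses import dataclass

# Git's %G? codes: G = good, B = bad, U = good but unknown validity,
# X = good but expired signature, Y = good but made by an expired key,
# R = good but made by a revoked key, E = cannot be checked, N = no signature.
ACCEPTED_STATUSES = {"G"}

# Hypothetical registry: signing key ID -> verified corporate identity.
ENROLLED_KEYS = {
    "3AA5C34371567BD2": "alice@example.com",
}


@dataclass(frozen=True)
class CommitRecord:
    sha: str
    status: str
    key_id: str
    author_email: str


def parse_signature_log(text: str) -> list[CommitRecord]:
    """Parse lines produced by: git log --format='%H|%G?|%GK|%ae'."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        sha, status, key_id, author_email = line.strip().split("|")
        records.append(CommitRecord(sha, status, key_id, author_email))
    return records


def blocking_commits(records: list[CommitRecord]) -> list[tuple[str, str]]:
    """Return (sha, reason) for every commit that should block the merge."""
    blocked = []
    for rec in records:
        if rec.status not in ACCEPTED_STATUSES:
            blocked.append((rec.sha, f"signature status {rec.status}"))
        elif rec.key_id not in ENROLLED_KEYS:
            blocked.append((rec.sha, "signing key not enrolled"))
    return blocked


# Illustrative sample data, not real commits.
SAMPLE_LOG = """\
a1b2c3|G|3AA5C34371567BD2|alice@example.com
d4e5f6|N||bot@example.com
0718293|G|FFFFFFFFFFFFFFFF|mallory@example.com
"""

if __name__ == "__main__":
    for sha, reason in blocking_commits(parse_signature_log(SAMPLE_LOG)):
        print(sha, reason)
```

In this sketch only status `G` passes; whether to accept statuses such as `U` is a policy decision that depends on how your keyring or allowed-signers file is managed.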

Q: What is the difference between code signing and code provenance?

A: Code signing proves that a change or artifact was signed by a key, while code provenance proves where the code came from and who or what authored it. Provenance is broader because it includes identity, authorship, and trust in the full path from commit to build. For AI-era pipelines, provenance is the stronger control.

Q: Why does AI make software supply chain risk harder to control?

A: AI increases the amount of code produced, which reduces the time available for review and makes malicious or unauthorized changes harder to spot. It also introduces non-human actors into the development flow, so traditional assumptions about developer identity no longer hold. That combination expands the attack surface at the commit stage.

Q: When should organisations require signed commits for production code?

A: Organisations should require signed commits for any code that can influence build, release, or production systems. The stricter the environment, the less acceptable unsigned changes become, especially when AI tools or automation can create code faster than humans can inspect it. Signed commits should be a baseline, not an exception.

Technical breakdown

Why commit identity is weaker than code provenance

Version control systems were built to record change history, not to verify the real-world identity behind each change. Git author fields can be spoofed, SSH keys prove possession rather than identity, and neither control natively binds a commit to a corporate identity provider. In an AI-assisted workflow, that gap matters because the entity creating code may be a human, a model, or an automated agent acting at machine speed. Provenance closes that gap by linking commit metadata, cryptographic signatures, and trusted identity assertions into one auditable chain.

Practical implication: Treat commit signing and identity binding as part of IAM design, not as a developer convenience feature.
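
To make the spoofing point concrete: the author field is free text that anyone can set (for example with `git commit --author=...`), so a check has to compare it with the identity bound to the signing key. The sketch below does that comparison against a hypothetical identity-provider export named `IDP_KEY_BINDINGS`. The fingerprints, field names, and the `attribute` function are illustrative assumptions, and the sketch assumes the signature itself has already been cryptographically verified elsewhere.

```python
from typing import NamedTuple, Optional

# Hypothetical export from an identity provider:
# signing-key fingerprint -> email of the verified key owner.
IDP_KEY_BINDINGS = {
    "SHA256:key-alice": "alice@example.com",
    "SHA256:key-buildbot": "build-bot@example.com",
}


class Commit(NamedTuple):
    sha: str
    author_email: str                   # Git metadata, set freely by the committer
    signer_fingerprint: Optional[str]   # None when the commit is unsigned


def attribute(commit: Commit) -> str:
    """Classify a commit by comparing the claimed author with the verified signer."""
    if commit.signer_fingerprint is None:
        return "unsigned"
    owner = IDP_KEY_BINDINGS.get(commit.signer_fingerprint)
    if owner is None:
        return "unknown-signer"
    if owner.lower() != commit.author_email.lower():
        return "author-mismatch"
    return "attributed"


if __name__ == "__main__":
    # The author field claims Alice, but the key belongs to the build bot.
    print(attribute(Commit("c2", "alice@example.com", "SHA256:key-buildbot")))
```

An `author-mismatch` result is not always malicious (bots often commit on behalf of people), but it is the case that deserves an explicit policy rather than silent acceptance.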

Why AI changes the attack surface in software delivery

AI increases both code volume and code velocity, which compresses the time available for review and expands the number of change events that can slip through. The risk is no longer only vulnerable code after merge; it is unauthorized or malicious code at the point of commit, when trust is first established. That makes identity-based controls more important than reactive scanning alone. The security model has to assume that some commits will be generated, modified, or replayed by non-human actors with valid credentials.

Practical implication: Shift controls left into commit authentication, approval policy, and traceability for every code change.

How SLSA treats provenance as a supply chain control

SLSA, the Supply-chain Levels for Software Artifacts framework, elevates provenance from an audit feature to an assurance requirement. Its source and build expectations focus on verifiable authorship, tamper resistance, and repeatable build integrity so downstream consumers can trust the artifact lineage. That is especially relevant when AI-generated code enters pipelines, because the provenance record must prove more than build success. It must prove controlled origin, signed change intent, and a traceable path from contributor to artifact.

Practical implication: Map provenance controls to your software supply chain assurance model and make signed source a release gate.
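
As a loose illustration of a traceable path from contributor to artifact, the sketch below builds a simplified provenance record and checks an artifact against it. It is not the SLSA provenance schema: the field names, the `make_record` and `verify_artifact` helpers, and the sample values are assumptions chosen for readability.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    commit_sha: str        # source revision the build consumed
    signer_identity: str   # verified identity that signed that commit
    builder_id: str        # build system that produced the artifact
    artifact_sha256: str   # SHA-256 digest of the released artifact


def make_record(commit_sha: str, signer_identity: str,
                builder_id: str, artifact: bytes) -> ProvenanceRecord:
    """Create a record that ties an artifact digest to its source and builder."""
    digest = hashlib.sha256(artifact).hexdigest()
    return ProvenanceRecord(commit_sha, signer_identity, builder_id, digest)


def verify_artifact(record: ProvenanceRecord, artifact: bytes,
                    trusted_builders: set[str]) -> bool:
    """True only if the artifact bytes match the record and the builder is trusted."""
    digest_ok = hashlib.sha256(artifact).hexdigest() == record.artifact_sha256
    return digest_ok and record.builder_id in trusted_builders


if __name__ == "__main__":
    record = make_record("a1b2c3", "alice@example.com",
                         "ci.example.com/builder", b"release-1.0")
    print(verify_artifact(record, b"release-1.0", {"ci.example.com/builder"}))
```

A real deployment would also sign the record itself so that it cannot be rewritten after the fact; that step is omitted here to keep the sketch dependency-free.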

Threat narrative

Attacker objective: The attacker wants to move malicious or unauthorized code into trusted build and release paths without being reliably attributed or blocked.

Entry occurs when an attacker or malicious tool introduces code through a commit that appears to come from a legitimate contributor.
Escalation follows when stolen keys, spoofed author fields, or automated AI-generated changes bypass review and gain repository trust.
Impact is achieved when unverified code is merged into production and becomes part of the software supply chain, enabling downstream compromise.

Reviewdog GitHub Action supply chain attack: a compromise of reviewdog/action-setup exposed secrets.
CI/CD pipeline exploitation case study: full server takeover via an exposed .git directory and mismanaged CI/CD pipeline secrets.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Code provenance is now an identity problem, not just a software engineering problem. The key question is no longer whether the repository captured a change, but whether the identity behind that change was verified at the moment of commit. That shifts responsibility toward IAM, developer identity, and cryptographic assurance. Practitioners should treat provenance as a control plane for trust, not as a logging feature.

AI-generated code creates provenance debt. As automated code generation rises, organisations accumulate changes that were not authored through the same human review assumptions used in legacy pipelines. That debt shows up as weaker accountability, harder incident reconstruction, and more room for unauthorized changes to blend in. The practical response is stronger commit authentication and tighter release policy, not more hope in manual review.

Identity-bound signing is the minimum viable control for AI-era source integrity. A commit signature without identity binding is only partial evidence, and a strong identity without signing still leaves room for tampering. The discipline has to combine verified identity, hardware-backed credentials, and policy enforcement at the repository boundary. Practitioners should make signed, attributable commits the default for any code that can reach production.

Code provenance is becoming a governance requirement for software supply chains. Frameworks such as SLSA are pushing organisations toward demonstrable trust in authorship and artifact lineage, which means provenance must be auditable, repeatable, and enforceable. This is exactly where NHI governance intersects with software delivery. Teams that treat source identity as part of their security architecture will be better positioned to handle AI-assisted development.

Commit trust should be measured with the same seriousness as privileged access. If a service account or API key can change production systems, then the identity that signs code deserves equivalent scrutiny. That includes lifecycle management, offboarding, and revocation when a credential or automation path is no longer trusted. Practitioners should align code provenance with NHI lifecycle controls, not separate it from them.
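
One way to picture lifecycle-aware commit trust is a check that a signature was made while the key was enrolled and before it was revoked. The sketch below is a minimal version under stated assumptions: the `SigningKey` record and `key_trusted_at` function are hypothetical, and `signed_at` should come from a timestamp the server recorded, since commit dates in Git are set by the committer.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class SigningKey:
    fingerprint: str
    owner: str
    enrolled_at: datetime
    revoked_at: Optional[datetime] = None  # set at offboarding or suspected compromise


def key_trusted_at(key: SigningKey, signed_at: datetime) -> bool:
    """A signature counts only if made while the key was enrolled and not yet revoked."""
    if signed_at < key.enrolled_at:
        return False
    if key.revoked_at is not None and signed_at >= key.revoked_at:
        return False
    return True


if __name__ == "__main__":
    bot_key = SigningKey(
        fingerprint="SHA256:key-bot",
        owner="build-bot",
        enrolled_at=datetime(2025, 1, 1, tzinfo=timezone.utc),
        revoked_at=datetime(2025, 6, 1, tzinfo=timezone.utc),
    )
    # A signature made after revocation should no longer be trusted.
    print(key_trusted_at(bot_key, datetime(2025, 7, 1, tzinfo=timezone.utc)))
```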

From our research:
97% of NHIs carry excessive privileges, increasing unauthorised access and broadening the attack surface, according to the Ultimate Guide to NHIs.
Only 20% of organisations have formal processes for offboarding and revoking API keys, and even fewer have procedures for rotating them.
The 52 NHI Breaches Analysis shows how weak identity lifecycle controls turn routine access into repeatable compromise paths.

What this signals

Ephemeral code trust debt: AI-assisted development creates a growing gap between how quickly code is produced and how reliably its origin can be verified. For security teams, that means provenance checks must move into the same governance layer as secrets handling, approval workflows, and release control, because unsigned change is now a persistent programme risk.

The operational signal is clear: teams that still treat repository credentials as a narrow developer tooling issue will miss the broader NHI pattern. Committers, build bots, and signing services all need ownership, review, and revocation discipline. If a change can reach production, its identity path should be as visible as any privileged access path.

With only 5.7% of organisations reporting full visibility into their service accounts, per Ultimate Guide to NHIs, the same visibility gap is likely to affect code-signing and automation identities unless teams deliberately extend governance. That is where programme maturity will separate quickly: organisations that unify IAM, PAM, and software supply chain controls will have a defensible model, while everyone else will keep absorbing trust debt.

For practitioners

Bind commits to verified identity: Require cryptographic commit signing tied to a corporate identity and enforce it at merge time for all repositories that feed production. Use hardware-backed, phishing-resistant credentials where possible so the signing identity is harder to steal or replay.
Treat AI-assisted code as a governed NHI workflow: Classify automated code generation pipelines, build bots, and repository service accounts as non-human identities with explicit ownership, scoped access, and review paths. Apply the same lifecycle discipline you use for other secrets-bearing identities.
Move provenance checks into release gates: Block release approval unless source identity, signature validity, and build lineage are all verified. Pair that with policy that flags unsigned or unauthenticated changes before they reach the main branch.
Reduce trust in author fields and SSH alone: Do not rely on Git author metadata or possession-based SSH access as proof of who made a change. Use them as transport mechanisms, then layer identity assurance, signing, and audit evidence on top.
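
The release-gate recommendation above can be sketched as a function that refuses to proceed unless all three kinds of evidence are present. This is an assumption-laden outline: `ReleaseEvidence` and `release_blockers` are hypothetical names, and each boolean stands in for a real verification step performed elsewhere in the pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseEvidence:
    source_identity_verified: bool   # committers mapped to corporate identities
    signatures_valid: bool           # every commit in the release range verified
    build_lineage_verified: bool     # artifact traced to a trusted build of that source


def release_blockers(evidence: ReleaseEvidence) -> list[str]:
    """Return the reasons a release must be blocked; an empty list means proceed."""
    checks = {
        "source identity not verified": evidence.source_identity_verified,
        "invalid or missing commit signature": evidence.signatures_valid,
        "build lineage not verified": evidence.build_lineage_verified,
    }
    return [reason for reason, passed in checks.items() if not passed]


if __name__ == "__main__":
    blockers = release_blockers(ReleaseEvidence(True, False, True))
    print("blocked:" if blockers else "approved", blockers)
```

Returning the list of reasons, rather than a bare pass or fail, keeps the audit trail readable when a release is stopped.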

Key takeaways

AI-assisted development turns commit identity into a security control, not a metadata detail.
Code provenance matters because possession-based credentials do not prove authorship or intent.
Practitioners should enforce signed, identity-bound commits before production release paths are trusted.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10		AI-generated commits and automation identities fit agentic trust and authz risks.
OWASP Non-Human Identity Top 10	NHI-03	Commit signing and key lifecycle map to NHI credential rotation and trust.
NIST CSF 2.0	PR.AA-05 (PR.AC-4 in CSF 1.1)	Least-privilege access and authorization are central to repo trust and release control.

Restrict commit and release permissions to verified identities with documented approvals.

Key terms

Code provenance: Code provenance is the verifiable history of where code came from and who or what created it. In security practice, it combines authorship, timestamps, signatures, and build lineage so teams can prove a change was trusted before it reached production.
Commit signing: Commit signing is the process of attaching a cryptographic signature to a code change so it can be verified later. It is useful only when the signature is tied to a trusted identity and enforced by repository policy, otherwise it becomes weak evidence rather than strong assurance.
Supply chain assurance: Supply chain assurance is the discipline of proving that software artifacts were produced and modified through controlled, auditable steps. It extends beyond scanning code for bugs and focuses on the integrity of authorship, build process, and release lineage.
Non-human identity: A non-human identity is any credentialed software actor such as a service account, bot, token, certificate, workload, or AI agent. In code pipelines, these identities can create, sign, move, or release software, which makes their lifecycle and trust boundaries security-critical.

Deepen your knowledge

Code provenance and identity-bound commit controls are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are extending governance into software delivery and automation identities, it is worth exploring.

This post draws on content published by Beyond Identity: Why Is Code Provenance Non-Negotiable in the Age of AI? Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-12-22.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org