Data governance for AI agents is becoming production critical

By NHI Mgmt Group Editorial Team | Published 2026-06-16 | Domain: Governance & Risk | Source: Collibra

TL;DR: Enterprises are moving beyond AI experimentation toward production use cases where agents approve transactions, generate reports, and trigger workflows. Collibra’s post argues that trusted data context now determines whether those actions stay inside control boundaries. The real issue is governance, not AI novelty: if lineage, certification, and ownership are stale, the agent is effectively operating outside policy.

At a glance

What this is: Collibra’s view of why data governance has become a control layer for production AI, especially where agents act on business context.

Why it matters: IAM, data governance, and AI teams now have to treat context, lineage, and certification as part of the operating model that constrains machine and human decision-making.

By the numbers:

Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap.

👉 Read Collibra's post on why data governance is central to production AI

Context

Production AI changes the control problem. Once agents begin approving transactions, generating reports, or triggering workflows, the question is no longer whether the model can reason, but whether the data and business context behind each action are current, certified, and owned.

That makes data governance an access-control issue in practice, not just an information-management discipline. When the context is stale or lineage is incomplete, the AI system may still act, but it is acting on a governance foundation that no longer matches the business reality it is supposed to represent.

Key questions

Q: How should security teams govern AI actions that depend on business data context?

A: They should treat context as an enforcement signal, not a description. If a model or agent depends on ownership, classification, lineage, or certification to make a decision, those signals must be current before the action is allowed to complete. Otherwise the system may act on data that is technically accessible but operationally unfit for the decision.

Q: Why do lineage and certification matter for production AI?

A: Because they tell the system whether the data behind the action is trustworthy for that use case. Lineage shows origin and change history, while certification shows formal approval for a specific purpose. Without both, AI can still produce outputs, but the organisation cannot prove those outputs were based on governed inputs.

Q: What do organisations get wrong about AI governance and data governance?

A: They separate them too early. If AI is making business decisions from enterprise data, then the governance of the data, the policy on its use, and the accountability for the resulting action are part of one control problem. Splitting those disciplines creates gaps that are hard to audit and harder to contain.

Q: How can teams tell whether AI is operating inside governed boundaries?

A: Check whether the system can demonstrate the source, ownership, classification, and certification state of the data it used for the decision. If any of those are missing or stale, the action may still run, but it is outside a defensible governance boundary.

Technical breakdown

How trusted data context constrains AI-driven actions

AI systems that act on enterprise data do not need raw access alone. They need definitions, ownership metadata, quality certifications, and lineage to interpret whether a record, event, or report can be trusted. In governed environments, those context signals determine whether the system should use, delay, or reject an action. Without them, the model may produce outputs that are syntactically correct but operationally wrong. The technical issue is not model accuracy in isolation. It is whether the data platform can carry business meaning consistently across workflows, applications, and AI agents.

Practical implication: treat context metadata as part of the control plane for AI-enabled decisions.
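The use, delay, or reject choice described above can be sketched as a small policy function. This is a minimal illustration, not Collibra's implementation: the ContextSignals fields, the evaluate_context name, and the 30-day freshness window are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Decision(Enum):
    USE = "use"
    DELAY = "delay"
    REJECT = "reject"


@dataclass(frozen=True)
class ContextSignals:
    """Hypothetical context metadata attached to a data asset."""
    owner: Optional[str]           # accountable owner, None if unassigned
    classification: Optional[str]  # e.g. "confidential", None if unclassified
    certified: bool                # reviewed and approved for this use
    last_reviewed: datetime        # when the context was last confirmed


def evaluate_context(signals: ContextSignals, now: datetime,
                     max_age: timedelta = timedelta(days=30)) -> Decision:
    """Decide whether an agent may act on data carrying these signals."""
    # Missing ownership or classification means nobody is accountable: reject.
    if signals.owner is None or signals.classification is None:
        return Decision.REJECT
    # Uncertified data is not approved for the decision: reject.
    if not signals.certified:
        return Decision.REJECT
    # Complete but stale context: hold the action until it is re-reviewed.
    if now - signals.last_reviewed > max_age:
        return Decision.DELAY
    return Decision.USE
```

The point of the design is that the function returns a decision the workflow must obey, so context acts as a gate and not as a description attached after the fact.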

Why lineage and certification matter when agents trigger workflows

Lineage shows where data came from and how it changed. Certification shows whether a dataset has been reviewed and approved for a specific use case. When agents trigger workflows, those signals become operational guardrails because they determine whether downstream actions rest on validated inputs or on stale, partial, or uncertified records. This is especially important when the same dataset is reused across reports, approvals, and automated decisions. A governed platform must therefore expose not just data, but the confidence level attached to that data.

Practical implication: block agent actions that depend on uncertified or lineage-poor datasets.
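One way to picture such a guardrail is a check that runs before the workflow step and raises if the dataset is not certified for the stated purpose or if its lineage does not trace back to a known source. The sketch below assumes a hypothetical in-memory catalog; Dataset, GovernanceBlocked, lineage_is_complete, and require_governed are illustrative names, not part of any real governance product's API.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical catalog entry for a dataset."""
    name: str
    certified_for: set[str] = field(default_factory=set)  # approved purposes
    upstream: list[str] = field(default_factory=list)     # parent dataset names
    is_source: bool = False                               # system of record


class GovernanceBlocked(Exception):
    """Raised when an agent action rests on ungoverned data."""


def lineage_is_complete(name: str, catalog: dict[str, Dataset],
                        path: frozenset = frozenset()) -> bool:
    """True only if every upstream path ends at a known source dataset."""
    if name in path:            # a cycle never reaches a source
        return False
    ds = catalog.get(name)
    if ds is None:              # parent missing from the catalog
        return False
    if ds.is_source:
        return True
    if not ds.upstream:         # derived data with no recorded origin
        return False
    return all(lineage_is_complete(parent, catalog, path | {name})
               for parent in ds.upstream)


def require_governed(name: str, purpose: str,
                     catalog: dict[str, Dataset]) -> None:
    """Raise GovernanceBlocked unless the dataset is fit for this purpose."""
    ds = catalog.get(name)
    if ds is None or purpose not in ds.certified_for:
        raise GovernanceBlocked(f"{name!r} is not certified for {purpose!r}")
    if not lineage_is_complete(name, catalog):
        raise GovernanceBlocked(f"{name!r} has incomplete lineage")
```

Certification is modelled per purpose here because the article describes it as approval for a specific use case, so a dataset certified for reporting would still be blocked for payment approval.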

Enterprise governance across multiple data sources and AI systems

The article’s core architectural point is that governance cannot stop at one platform boundary. Enterprises usually operate across multiple systems, each with its own definitions, ownership records, and classification rules. If that context does not travel consistently, AI systems inherit inconsistent policy signals and can make divergent decisions from the same underlying facts. That creates risk in compliance, operational consistency, and auditability. Unified governance means the meaning of a data asset stays stable even as it moves between platforms, analytics, and AI execution layers.

Practical implication: standardise data context across systems before expanding autonomous or agentic workflows.
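A simple way to test whether context travels consistently is to compare what each platform records for the same asset and flag every field that disagrees. The following sketch assumes context has already been exported into plain dictionaries keyed by platform and asset ID; the field names and the find_context_drift function are hypothetical.

```python
from collections import defaultdict

# Context fields compared across platforms (an assumption for this sketch).
CONTEXT_FIELDS = ("definition", "owner", "classification", "certified")


def find_context_drift(records: dict) -> dict:
    """Report assets whose context differs between platforms.

    records maps platform name -> asset ID -> context dict.
    Returns {asset_id: {field: {platform: value}}} for disagreeing fields.
    """
    # Regroup the input so each asset lists what every platform says about it.
    by_asset = defaultdict(dict)
    for platform, assets in records.items():
        for asset_id, context in assets.items():
            by_asset[asset_id][platform] = context

    drift = {}
    for asset_id, per_platform in by_asset.items():
        for field_name in CONTEXT_FIELDS:
            values = {p: ctx.get(field_name) for p, ctx in per_platform.items()}
            # More than one distinct value means the platforms disagree.
            if len(set(values.values())) > 1:
                drift.setdefault(asset_id, {})[field_name] = values
    return drift
```

An empty result suggests the compared fields agree everywhere the asset appears; any entry is a candidate for reconciliation before agents are allowed to act on that asset in more than one system.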

NHI Mgmt Group analysis

AI governance fails when context is treated as metadata instead of a control. Collibra’s central point is that the business meaning of data now shapes whether AI actions remain trustworthy. When agents approve transactions or trigger workflows, stale definitions and uncertified datasets are not cosmetic defects. They are governance failures that alter the legitimacy of the action itself. Practitioners should treat context as part of the policy boundary, not a documentation layer.

Enterprise AI trust breaks at the point where lineage stops being operational. If lineage cannot tell the system where a datum came from and how it changed, then the AI layer cannot reliably decide whether to use it. That is a governance problem because automated decisions depend on provenance, not just availability. The field should stop describing lineage as reporting support and start treating it as an execution dependency for production AI.

Governed context portability: the real requirement is that business definitions, ownership, quality, and classification move with the data wherever AI uses it. That is the specific concept this post sharpens. Collibra’s argument is strongest where multiple systems and multiple AI workflows need the same trusted meaning. Practitioners should therefore evaluate whether their context model survives platform boundaries intact.

This is a data governance story, but its impact reaches identity governance. When machines can take action based on enterprise context, the programme must connect data access, policy inheritance, and decision accountability. Human IAM alone does not solve that problem, and neither does model governance in isolation. The practical conclusion is that governance teams need shared ownership of the decision path from data source to AI action.

Collibra and Databricks reflect a broader market shift toward governance-led AI control planes. The category is moving away from isolated tooling and toward integrated control over meaning, lineage, and execution context. For practitioners, this means they will increasingly be judged on whether they can prove why an AI system was allowed to act, not just whether the underlying model was accurate. The implication is to align governance architecture with auditable decision-making.

From our research:
The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap, according to The State of Secrets in AppSec.
For a broader governance lens on machine identities, see Ultimate Guide to NHIs and Lifecycle Processes for Managing NHIs, which cover lifecycle control patterns that matter when AI systems depend on trusted context.

What this signals

With 43% of security professionals already concerned that AI systems may learn and reproduce sensitive information patterns from codebases, the next governance question is whether context controls are strong enough to stop bad data from becoming policy. That is why the control plane now has to include provenance, classification, and certification, not just model oversight.

Context portability: if business meaning does not travel with the data, then AI workflows will make different decisions from the same record in different systems. That creates audit friction, policy drift, and inconsistent approvals across the stack.

For practitioners, the signal is clear: data governance is no longer a back-office discipline. It is becoming the mechanism that determines whether AI can act with authority, especially where business context and identity-based access meet in production systems.

For practitioners

Map AI actions to governed data dependencies: Identify which approvals, reports, and workflow triggers depend on certified datasets, business definitions, and lineage records. Then classify those dependencies as control inputs, not reference material.
Block execution on stale context signals: Prevent AI systems from using uncertified, outdated, or lineage-poor data for business actions. Put the decision gate before the workflow completes, not after the output is generated.
Unify ownership and classification across platforms: Standardise business definitions, data ownership, quality certification, and regulatory classification so the same asset carries consistent meaning across analytics and AI environments.
Create audit trails for AI-driven decisions: Record which data source, context label, and certification state supported each action. That evidence is essential when a workflow is disputed, reviewed, or investigated.
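The audit-trail step in the list above could look something like the following sketch, which records the data source, ownership, classification, and certification state behind each action. The DecisionEvidence fields are assumptions, and chaining each entry to the previous one with a SHA-256 digest is a design choice added here to make tampering detectable; the article itself does not prescribe a mechanism.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DecisionEvidence:
    """Hypothetical evidence captured at the moment an agent acts."""
    action: str
    agent_id: str
    data_source: str
    owner: str
    classification: str
    certification_state: str
    decided_at: str  # ISO 8601 timestamp


def _digest(previous_digest: str, evidence: dict) -> str:
    # Canonical JSON (sorted keys) so the same evidence always hashes the same.
    payload = json.dumps(evidence, sort_keys=True)
    return hashlib.sha256((previous_digest + payload).encode("utf-8")).hexdigest()


def to_audit_entry(evidence: DecisionEvidence, previous_digest: str = "") -> dict:
    """Build an audit entry linked to the digest of the previous entry."""
    body = asdict(evidence)
    return {"evidence": body,
            "previous_digest": previous_digest,
            "digest": _digest(previous_digest, body)}


def verify_chain(entries: list) -> bool:
    """Recompute every digest and confirm each entry links to the one before."""
    previous = ""
    for entry in entries:
        if entry["previous_digest"] != previous:
            return False
        if entry["digest"] != _digest(previous, entry["evidence"]):
            return False
        previous = entry["digest"]
    return True
```

In a review, verify_chain gives a quick signal about whether the recorded evidence has been altered since the action ran, which supports the kind of dispute and investigation scenarios the list mentions.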

Key takeaways

Production AI depends on governed context, because agents that act on stale or uncertified data are operating outside meaningful control boundaries.
The hard problem is not model output alone, but whether lineage, ownership, and certification can travel consistently across platforms and workflows.
Practitioners should align data governance, identity governance, and auditability so every AI action can be justified with current, trusted context.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

NIST CSF 2.0, NIST Zero Trust (SP 800-207) and NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST CSF 2.0	GV.OV-01	Governance and oversight fit the article's focus on trusted AI control boundaries.
NIST Zero Trust (SP 800-207)	PR.AC-4 (NIST CSF 1.1 identifier)	AI decisions depend on consistent access and policy context across systems.
NIST AI RMF	GOVERN	The post centres on accountability for AI actions based on enterprise data.

Assign clear ownership for AI decision-making and verify the provenance of inputs before action.

Key terms

Business Context: The meaning attached to data that tells a system how it should be used, interpreted, and trusted. In production AI, business context includes ownership, classification, quality status, and regulatory meaning. Without it, models may still process data, but they cannot reliably decide whether acting on that data is appropriate.
Data Lineage: A record of where data came from, how it changed, and where it was used. Lineage matters because AI decisions depend on provenance, not just availability. When lineage is incomplete, the organisation loses the ability to explain or defend why a system trusted a given input.
Certified Data: Data that has been reviewed and approved for a defined business purpose. Certification is not a generic quality label. It is a governance signal that says the asset is fit for a specific operational use, which becomes critical when AI systems use that asset to make or trigger decisions.
Governed Context: The combination of definitions, ownership, quality, lineage, and policy signals that travel with data as it moves through systems. For AI programmes, governed context is what prevents decisions from drifting away from the organisation's current rules, approvals, and accountability structure.


This post draws on content published by Collibra: Collibra named Databricks' 2026 Data Governance Partner of the Year. Read the original.

NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org