Open semantic interchange standardizes data semantics for AI governance

By NHI Mgmt Group Editorial Team | Published 2025-12-09 | Domain: Breaches & Incidents | Source: Collibra

TL;DR: Collibra’s participation in the Open Semantic Interchange points to a broader push for vendor-neutral semantic metadata, consistent definitions, and portable governed results across BI and AI tools, according to Collibra. The real issue is not tool interoperability alone, but whether governance can survive fragmented definitions without creating new trust and accountability gaps.

At a glance

What this is: An analysis of the Open Semantic Interchange and its push to standardize semantic metadata across tools, dashboards, and AI systems.

Why it matters: Identity, governance, and access decisions depend on consistent data context, and fragmented semantics can undermine controls across human, NHI, and AI-driven workflows.

👉 Read Collibra's announcement on the Open Semantic Interchange

Context

Open semantic interchange is about making data definitions portable, so the same business term means the same thing across dashboards, notebooks, and machine learning systems. In practice, that is a governance problem, not just a tooling problem, because inconsistent semantics create inconsistent decisions.

For IAM, NHI, and AI governance teams, the question is whether shared metadata can reduce ambiguity in policy, reporting, and automation without introducing another layer of hidden dependency. When definitions drift, access reviews, audit evidence, and machine-driven decisions all become harder to trust.

Key questions

Q: How should security teams govern shared data definitions across BI and AI tools?

A: Security teams should treat shared data definitions as governed assets with ownership, versioning, approval, and lineage. The key is not just making a definition portable, but proving which systems consume it and who can change it. That prevents silent drift from undermining reporting, automation, and audit evidence.

Q: Why do inconsistent semantics create risk for IAM and AI governance?

A: Inconsistent semantics cause different systems to make decisions from different interpretations of the same term. That breaks trust in access approvals, exception handling, and AI outputs because the control logic depends on the meaning of the data, not just the data itself. Governance must therefore cover definitions as well as systems.

Q: What breaks when data definitions are shared without ownership?

A: Without ownership, changes to definitions spread faster than governance can track them. Teams may continue using outdated logic in reports, policies, and models, which creates audit gaps and inconsistent decisions. A standard format helps exchange data, but it does not replace accountable change control.

Q: How do organisations know whether semantic governance is actually working?

A: Semantic governance is working when critical definitions are consistent across tools, changes are approved, and lineage can explain how a decision was derived. If users still argue about what a metric means, or if AI outputs vary because upstream terms drift, the governance layer is not effective.

Technical breakdown

Semantic metadata and governed interoperability

Semantic metadata is the layer that describes what data means, not just where it lives. An open interchange format lets systems exchange metric names, business definitions, lineage context, and governance tags in a common structure. That reduces the translation work between BI tools, AI platforms, and data catalogs, but it does not automatically solve ownership or quality. The hard part is ensuring that the shared definition is authoritative, versioned, and auditable across every consuming system.

Practical implication: treat semantic standards as governance infrastructure and assign explicit ownership for each business definition.
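One way to picture "authoritative, versioned, and auditable" is as a record that travels with each definition. The sketch below is illustrative only: the class, its field names, and the gap labels are hypothetical and are not taken from the Open Semantic Interchange specification or from Collibra.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GovernedDefinition:
    """One version of a business definition (all field names are hypothetical)."""
    term: str
    logic: str                          # e.g. an expression describing the metric
    version: int
    owner: str                          # accountable person or team
    approved_by: Optional[str] = None   # None until someone signs off
    consumers: tuple = ()               # systems known to consume this version

    def is_authoritative(self) -> bool:
        # Assumption: "authoritative" means owned, approved, and with known consumers.
        return bool(self.owner) and self.approved_by is not None and len(self.consumers) > 0


def governance_gaps(definition: GovernedDefinition) -> list:
    """List the reasons a definition should not yet be treated as authoritative."""
    gaps = []
    if not definition.owner:
        gaps.append("no owner")
    if definition.approved_by is None:
        gaps.append("not approved")
    if not definition.consumers:
        gaps.append("no recorded consumers")
    return gaps


# A draft with an owner but no approval and no recorded consumers.
draft = GovernedDefinition(
    term="active_customer",
    logic="orders_last_90d > 0",
    version=2,
    owner="finance-data",
)
print(governance_gaps(draft))
```

The point of the sketch is that portability alone fills in only `term` and `logic`; the remaining fields are what the organisation has to supply itself.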

Why fragmented definitions break AI and BI trust

When two tools use the same label for different logic, users see the same dashboard title but not the same underlying truth. That creates false confidence in reporting, policy evaluation, and model inputs. In AI workflows, semantic drift can propagate into prompts, features, and outputs, which means the problem is not just analytics inconsistency but decision contamination. Standardisation helps only if the organisation controls provenance and change management.

Practical implication: pair semantic standardisation with lineage, approvals, and change control for all business-critical terms.
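As a rough illustration of the "same label, different logic" problem, the following sketch fingerprints each tool's definition and reports terms with more than one variant. The tool names, terms, and expressions are invented for the example, and the comparison is purely textual.

```python
import hashlib
from collections import defaultdict


def fingerprint(logic: str) -> str:
    """Hash a definition's logic after normalising case and whitespace."""
    normalised = " ".join(logic.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()[:12]


def find_drift(definitions):
    """Return terms whose logic differs between tools.

    `definitions` is an iterable of (tool, term, logic) tuples.
    The result maps each drifting term to {fingerprint: [tools using it]}.
    """
    by_term = defaultdict(lambda: defaultdict(list))
    for tool, term, logic in definitions:
        by_term[term][fingerprint(logic)].append(tool)
    return {
        term: {fp: sorted(tools) for fp, tools in variants.items()}
        for term, variants in by_term.items()
        if len(variants) > 1  # more than one distinct logic for the same label
    }


# Hypothetical observations collected from three tools.
observed = [
    ("dashboard", "active_customer", "orders_last_90d > 0"),
    ("notebook", "active_customer", "ORDERS_LAST_90D  >  0"),
    ("ml_pipeline", "active_customer", "logins_last_30d > 0"),
    ("dashboard", "churn_rate", "lost / total"),
    ("notebook", "churn_rate", "lost / total"),
]

drift = find_drift(observed)
for term, variants in drift.items():
    print(term, list(variants.values()))
```

Here only `active_customer` is reported, because the dashboard and notebook agree once case and spacing are normalised while the pipeline uses different logic. A text fingerprint will also flag expressions that are written differently but mean the same thing, so a hit is a prompt for review rather than proof of drift.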

Governance context for machine and human decisioning

Data semantics sit upstream of both human decisions and automated actions. If an access policy, fraud rule, or AI prompt references an unstable definition, the downstream decision inherits that instability. This is why semantic governance intersects with identity governance: the reliability of the data context shapes whether controls are applied correctly and whether exceptions can be defended during audit. Standards improve portability, but governance still decides whether the definition is trusted.

Practical implication: map critical semantic terms to policy, audit, and automation dependencies before allowing broad reuse.
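The mapping exercise can be sketched as a simple lookup from each policy, rule, or prompt to the terms it references. The status labels and all names below are hypothetical; a real inventory would come from a catalog or lineage tool.

```python
def blocked_dependents(definition_status, dependencies):
    """Map each dependent (policy, rule, prompt) to the ungoverned terms it uses.

    definition_status: term -> "governed" or "unstable" (hypothetical labels)
    dependencies: dependent name -> list of terms it references
    Terms missing from definition_status are treated as ungoverned.
    """
    blocked = {}
    for dependent, terms in dependencies.items():
        ungoverned = sorted(t for t in terms if definition_status.get(t) != "governed")
        if ungoverned:
            blocked[dependent] = ungoverned
    return blocked


# Hypothetical inventory.
status = {
    "active_customer": "governed",
    "high_risk_account": "unstable",
}
deps = {
    "access_policy_7": ["active_customer"],
    "fraud_rule_12": ["high_risk_account", "active_customer"],
    "ai_prompt_summary": ["net_revenue"],  # term not in the inventory at all
}

for dependent, terms in blocked_dependents(status, deps).items():
    print(dependent, "depends on ungoverned terms:", terms)
```

Treating unknown terms as ungoverned is a deliberate choice in this sketch: a term nobody has inventoried is at least as risky as one known to be unstable.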

Threat narrative

Attacker objective: The objective is not compromise in the traditional sense, but operational confusion that weakens trust, auditability, and decision consistency.

Entry occurs when fragmented semantic definitions enter the environment through disconnected dashboards, notebooks, and machine learning pipelines using different business logic for the same metric.
Escalation follows as those inconsistent definitions are reused in reporting, policy logic, and AI outputs, amplifying ambiguity across teams and systems.
Impact appears when leaders, auditors, or automated workflows act on conflicting interpretations of the same data, reducing trust in governance and decision quality.

DeepSeek breach: exposed 1M+ log lines and sensitive secret keys.
Schneider Electric credentials breach: exposed credentials gave attackers access to Schneider Electric Jira, exfiltrating 40GB.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Semantic sprawl is becoming a governance risk, not just a data quality issue. When the same metric means different things across tools, teams lose the ability to prove that decisions were made on consistent evidence. That matters for audit, model governance, and access governance alike, because every control downstream depends on the integrity of the shared definition. Practitioners should treat semantic consistency as a control surface, not a documentation task.

Open standards can reduce translation friction, but they do not create authority by themselves. A vendor-neutral interchange format can make semantic metadata portable, yet portability without ownership simply moves ambiguity faster. The field should not confuse exchangeability with governance, because a shared schema is only useful when someone is accountable for versioning, approval, and change traceability. Practitioners need governance around the definition itself, not just around the system carrying it.

Machine-driven analytics magnify semantic drift into identity and access consequences. If AI systems, notebooks, and dashboards all consume unstable definitions, the resulting decisions can affect approvals, segmentation, reporting, and exception handling. That makes semantic governance adjacent to IAM and NHI governance, because policy engines and automated workflows are only as reliable as the context they consume. Practitioners should align semantic control with the identity and automation paths that depend on it.

Traceability is the real differentiator between a standard and a control. An interchange specification becomes operationally valuable only when organisations can trace what changed, who approved it, and which systems consumed it. Without that, the enterprise gains portability but not defensibility. Practitioners should evaluate semantic initiatives by their audit trail, not just by their interoperability claims.

Data confidence is now a prerequisite for AI confidence. The more organisations use semantic layers to standardise AI and BI, the more they need explicit governance over the terms that drive those systems. Otherwise, automation scales uncertainty instead of control. Practitioners should use semantic governance to narrow decision variance before extending AI use cases.

From our research:
The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap.
For the broader control model behind this discussion, see Ultimate Guide to NHIs, Standards for how NHI governance aligns with NIST, OWASP, and zero trust.

What this signals

Semantic standardisation will only help if organisations also harden the control plane around change, ownership, and lineage. When business definitions move into shared interchange layers, the governance challenge shifts from translation to accountability. Teams that already struggle with inconsistent secrets, distributed ownership, or weak audit evidence should expect the same failure pattern to appear in semantic governance unless they formalise approval paths and traceability.

Data confidence and identity confidence are converging control problems. If an enterprise cannot prove which definition drove a decision, it cannot fully defend the access, automation, or AI action that followed from it. That makes semantic governance a practical dependency for IAM and NHI teams, especially when machine-driven workflows consume business metrics directly.

A useful way to frame this is semantic drift debt: the accumulated cost of letting business terms diverge across tools until the enterprise no longer knows which version is authoritative. Once that debt exists, every new dashboard, model, or policy consumes uncertainty rather than reducing it.

For practitioners

Inventory the semantic terms that drive policy and reporting: Map the business definitions that feed dashboards, approvals, fraud rules, and AI prompts, then identify where the same term has different logic across systems.
Assign accountable owners for each governed definition: Require an owner, version history, and approval path for every critical metric or business term so changes cannot propagate silently into analytics or automation.
Tie semantic changes to audit evidence and lineage: Record which systems consume each definition, what changed, and when it changed so auditors and control owners can reconstruct decisions later.
Review automation dependencies before broad reuse: Check whether access policies, data products, or AI workflows rely on definitions that are still unstable, then restrict reuse until the terms are governed.
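To make the audit-evidence step concrete, the sketch below keeps an append-only change log and answers the question an auditor is likely to ask: which version of a term was in force when a decision was made, who approved it, and which systems consumed it. The record fields, names, and dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ChangeRecord:
    """One approved change to a definition (all fields are illustrative)."""
    term: str
    version: int
    logic: str
    approved_by: str
    effective_from: datetime
    consumers: tuple


def definition_in_force(log, term, at):
    """Return the latest ChangeRecord for `term` effective at or before `at`.

    Returns None if the term had no approved version at that time.
    """
    candidates = [r for r in log if r.term == term and r.effective_from <= at]
    return max(candidates, key=lambda r: r.effective_from, default=None)


# Hypothetical change history for one term.
log = [
    ChangeRecord("active_customer", 1, "orders_last_90d > 0", "cdo-office",
                 datetime(2025, 1, 1), ("dashboard",)),
    ChangeRecord("active_customer", 2, "orders_last_30d > 0", "cdo-office",
                 datetime(2025, 6, 1), ("dashboard", "ml_pipeline")),
]

# Which definition applied to a decision made on 15 March 2025?
record = definition_in_force(log, "active_customer", datetime(2025, 3, 15))
print(record.version, record.logic, record.approved_by, record.consumers)
```

If the lookup returns None, the decision was made against a term with no approved version at that time, which is itself a finding worth recording.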

Key takeaways

Open semantic interchange addresses a real governance gap: the same business term can otherwise produce different decisions across systems.
Standardised metadata improves interoperability, but authority still depends on ownership, versioning, and traceability.
Practitioners should govern semantic definitions with the same discipline they apply to policies, audit evidence, and automation inputs.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST CSF 2.0	GV.OV-01	Semantic governance affects oversight of trusted data and decision inputs.
NIST Zero Trust (SP 800-207)	PR.DS	Trusted data context supports policy decisions and downstream access logic.
OWASP Non-Human Identity Top 10	NHI-08	Shared context and metadata quality influence machine identity control reliability.

Track dependencies between semantic definitions and NHI-driven automation before broad reuse.

Key terms

Semantic Metadata: Semantic metadata is the layer that explains what data means, not just where it is stored. It includes business definitions, metric logic, lineage, and governance tags that help different systems interpret information consistently. In practice, it is what makes portability possible without turning meaning into guesswork.
Vendor-neutral Specification: A vendor-neutral specification is a shared format that lets multiple tools exchange information without locking the organisation into one platform’s internal model. For governance teams, the value is consistency and portability. The risk is assuming the format itself provides trust, when authority still depends on ownership and change control.
Lineage: Lineage is the record of how data, definitions, or decisions move through systems over time. It helps explain origin, transformation, and consumption, which is essential for audit and governance. Without lineage, teams can see a result but cannot reliably defend how that result was produced.
Semantic Drift: Semantic drift is the gradual divergence of meaning across tools, teams, or workflows. A metric, policy term, or model input may keep the same name while its logic changes. That creates hidden inconsistency, which is especially dangerous when analytics, automation, or AI rely on the same term in different systems.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity lifecycle are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or NHI governance in your organisation, it is worth exploring.

This post draws on content published by Collibra: the Open Semantic Interchange and semantic governance. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-12-09.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org