What do security and governance teams get wrong about data quality?

Why This Matters for Security Teams

Data quality is not just a reporting concern. For security and governance teams, it determines whether access decisions, audit evidence, risk scoring, and automated controls are trustworthy. When source records are incomplete, stale, duplicated, or inconsistently classified, downstream systems still produce outputs, but those outputs become brittle. That creates false confidence in dashboards, control attestations, and exception workflows. The NIST Cybersecurity Framework 2.0 treats governance as a core function, which is the right lens here: data quality is an operational control dependency, not a side issue.

That dependency shows up quickly in NHI and AI programmes. If an identity inventory is inaccurate, ownership and rotation checks fail. If policy inputs are noisy, approvals become inconsistent. If training or operational datasets are flawed, AI-driven recommendations amplify the error instead of correcting it. NHIMG’s Top 10 NHI Issues research is useful here because many of the same failure modes begin as data hygiene problems and end as governance failures. In practice, many security teams encounter control breakdowns only after an audit exception, access incident, or automation failure has already exposed the bad data.

How It Works in Practice

Good data quality work starts by treating critical records as governed assets with explicit ownership, validation rules, and measurable thresholds. For security teams, that usually means defining which fields are control-bearing, such as identity owner, system of record, last verified date, privilege scope, classification, and lifecycle state. It also means deciding what “good enough” means for each workflow, because there is no universal standard for every use case. A dataset that is acceptable for reporting may be unacceptable for automated access revocation or compliance evidence.
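As a minimal sketch of what a per-workflow threshold could look like, the Python below scores a set of records for completeness across the control-bearing fields named above and compares that score with a workflow-specific minimum. The field names, the `THRESHOLDS` values, and the helper names are illustrative assumptions, not figures drawn from any standard.

```python
# Hypothetical control-bearing fields, based on the examples in the text.
CONTROL_FIELDS = (
    "owner",
    "system_of_record",
    "last_verified",
    "privilege_scope",
    "classification",
    "lifecycle_state",
)

# Hypothetical minimum share of fully populated records per workflow.
# Automated revocation is held to a stricter bar than reporting.
THRESHOLDS = {"reporting": 0.90, "access_revocation": 0.99}


def completeness(records):
    """Return the fraction of records with every control-bearing field populated."""
    if not records:
        return 0.0
    complete = sum(
        1
        for record in records
        if all(record.get(field) not in (None, "") for field in CONTROL_FIELDS)
    )
    return complete / len(records)


def fit_for(workflow, records):
    """Return True if the dataset meets the threshold assumed for this workflow."""
    return completeness(records) >= THRESHOLDS[workflow]
```

With these assumed thresholds, a dataset in which 19 of 20 records are complete would pass for reporting but fail for automated access revocation, which is the distinction the paragraph above draws.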

In mature programmes, teams combine preventive and detective controls:

Validate mandatory fields at creation time and block incomplete records from entering authoritative systems.

Reconcile identity, asset, and secret inventories on a scheduled basis to find drift.

Use exception queues for unresolved conflicts rather than silently overwriting records.

Track freshness, completeness, and accuracy as control metrics, not just data operations KPIs.

Require human review for high-impact exceptions, especially where automation would change access or attestations.
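The first three controls in the list above can be sketched roughly as follows. This is an illustration under stated assumptions: the mandatory field names, the dictionary-based inventories keyed by identity ID, and both function names are hypothetical, and a real programme would reconcile against its own systems of record.

```python
# Hypothetical mandatory fields for a record entering an authoritative system.
REQUIRED = ("id", "owner", "classification")


def validate_new_record(record):
    """Preventive control: return the missing mandatory fields.

    An empty list means the record may be accepted; otherwise the caller
    should block it rather than store an incomplete record.
    """
    return [field for field in REQUIRED if not record.get(field)]


def reconcile(identity_inventory, secret_inventory):
    """Detective control: compare two inventories keyed by identity ID.

    Conflicts are returned as exceptions for review instead of one
    inventory silently overwriting the other.
    """
    exceptions = []
    for identity_id, secret in secret_inventory.items():
        identity = identity_inventory.get(identity_id)
        if identity is None:
            exceptions.append((identity_id, "secret has no matching identity"))
        elif identity.get("owner") != secret.get("owner"):
            exceptions.append((identity_id, "owner mismatch"))
    return exceptions
```

Returning the conflicts as a list, rather than resolving them in place, keeps the decision with a reviewer, which matters most where a correction would change access or attestations.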

For NHI governance, this matters because poor metadata undermines lifecycle controls. If an API key is linked to the wrong owner, rotation and revocation may never happen. If service accounts are not consistently classified, privileged access reviews miss them. NHIMG’s Lifecycle Processes for Managing NHIs is relevant because lifecycle control depends on accurate upstream records, not just policy intent. For broader governance framing, the 2024 ESG Report: Managing Non-Human Identities shows how common NHI compromise and suspicion of breach can be when visibility is incomplete and records are not trusted. These controls tend to break down in fast-moving environments with many ephemeral workloads, where ownership changes, tooling sprawl, and shadow automation create constant drift.
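To make the link between metadata and lifecycle control concrete, the sketch below flags API keys whose records would prevent rotation from being enforced. The 90-day maximum age, the record layout, and the `lifecycle_gaps` helper are assumptions made for illustration only.

```python
from datetime import date, timedelta

# Hypothetical rotation policy; the real value is an organisational decision.
MAX_KEY_AGE = timedelta(days=90)


def lifecycle_gaps(keys, known_owners, today):
    """Return (key_id, reason) pairs for keys whose metadata blocks lifecycle control."""
    gaps = []
    for key in keys:
        if key.get("owner") not in known_owners:
            # Nobody accountable, so rotation and revocation may never happen.
            gaps.append((key["id"], "owner unknown"))
        elif key.get("last_rotated") is None:
            gaps.append((key["id"], "no rotation date recorded"))
        elif today - key["last_rotated"] > MAX_KEY_AGE:
            gaps.append((key["id"], "rotation overdue"))
    return gaps
```

Passing `today` as an argument rather than reading the clock inside the function keeps the check repeatable, which helps when the result is later used as audit evidence.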

Common Variations and Edge Cases

Tighter data quality controls often increase operational overhead, so organisations have to balance trustworthiness against speed. That tradeoff becomes visible when governance teams try to standardise every field, every source, and every approval path at once. Best practice is evolving, but current guidance suggests focusing first on the records that directly affect access, evidence, and automated decisions, rather than trying to perfect all data everywhere.

There are also edge cases where “clean” data is the wrong goal. In incident response, incomplete records may still be useful if the team preserves provenance and timestamps. In AI governance, noisy source data may be acceptable if the model pipeline includes strong validation, sampling, and human oversight. The real risk is unexamined data drift, especially when upstream systems change schema, ownership, or semantics without notice. That is why the Regulatory and Audit Perspectives guidance is useful: auditors care less about perfection than about whether the organisation can prove control, lineage, and exception handling. Security and governance teams get into trouble when they assume downstream review can reliably compensate for upstream uncertainty, because at scale it usually cannot.
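One way to stop schema drift going unexamined is to compare incoming records with the shape a control expects. The sketch below is a simple illustration, assuming a hypothetical `EXPECTED_SCHEMA` and flat dictionary records; it catches missing, unexpected, and retyped fields, but it cannot detect a change in what a field means.

```python
# Hypothetical expected shape of an upstream identity record.
EXPECTED_SCHEMA = {"id": str, "owner": str, "classification": str}


def schema_drift(record, expected=EXPECTED_SCHEMA):
    """Report how a record differs from the expected field names and types."""
    missing = sorted(set(expected) - set(record))
    unexpected = sorted(set(record) - set(expected))
    retyped = sorted(
        field
        for field, expected_type in expected.items()
        if field in record and not isinstance(record[field], expected_type)
    )
    return {"missing": missing, "unexpected": unexpected, "retyped": retyped}
```

A non-empty result is best treated as an exception for review with the upstream owner, since semantic changes still need a human to confirm them.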

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and the NIST AI RMF describe the governance and control outcomes practitioners are expected to demonstrate.

Framework	Control / Reference	Relevance
NIST CSF 2.0	GV.OV	Data quality directly affects governance oversight and control validation.
NIST AI RMF	GOVERN	AI governance depends on trustworthy data inputs, lineage, and accountability.
OWASP Non-Human Identity Top 10	NHI-08	Poor identity data breaks lifecycle controls, ownership, and rotation.

Define data-quality thresholds for control-bearing records and review drift as a governance risk.
