Why do identity programmes need baselines before they claim savings?

Because without a pre-deployment baseline, there is no way to distinguish real change from noise. Baselines let teams compare the same workflows before and after rollout, isolate seasonality, and attach a confidence level to the result. That is what turns a savings claim into something finance and audit can test.

Why This Matters for Security Teams

Identity programmes often promise fewer standing privileges, less manual review, and lower operational overhead. Those claims are credible only if the team can demonstrate a measurable change on the same workload, rather than a comparison against a different month or a different mix of systems. Without a baseline, reductions in tickets, access events, or remediation effort can be driven by seasonality, policy drift, or reporting noise rather than real improvement. NIST’s Cybersecurity Framework 2.0 treats measurement and continuous improvement as part of governance, not an afterthought.

This is especially important for NHIs because their behaviour is often invisible until something breaks. NHIMG’s Ultimate Guide to NHIs makes the point that machine identities need lifecycle control, not just inventory, while the 52 NHI Breaches Analysis shows how quickly weak identity hygiene can become an enterprise incident. In practice, many security teams discover the true cost of “savings” only after audit challenges or incident response have already exposed the missing comparison point.

How It Works in Practice

A usable baseline starts by freezing the current state before any policy, tool, or workflow change goes live. That usually means capturing volume, timing, and effort across the same process steps you expect to improve: access requests, approvals, privileged session launches, secret rotations, exception handling, and support tickets. The goal is not perfect precision. The goal is a repeatable comparison that can survive finance review and internal audit.
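One way to make the "frozen" state concrete is to store the baseline as an immutable record with a fingerprint that reviewers can recheck later. The following is a minimal sketch, not a prescribed schema: the class name `BaselineSnapshot`, every field name, and the sample figures are hypothetical and chosen only to mirror the process steps listed above.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from datetime import date


# Hypothetical record layout; field names are illustrative assumptions.
@dataclass(frozen=True)
class BaselineSnapshot:
    window_start: date
    window_end: date
    scope: str                # workflows or business units being measured
    method: str               # how the counts were collected
    access_requests: int
    approvals: int
    privileged_sessions: int
    secret_rotations: int
    exceptions: int
    support_tickets: int

    def fingerprint(self) -> str:
        """Return a SHA-256 digest of a canonical JSON form of the snapshot."""
        payload = asdict(self)
        payload["window_start"] = self.window_start.isoformat()
        payload["window_end"] = self.window_end.isoformat()
        canonical = json.dumps(payload, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Made-up numbers, purely for illustration.
snapshot = BaselineSnapshot(
    window_start=date(2024, 1, 1),
    window_end=date(2024, 3, 31),
    scope="finance-apps",
    method="ticket export plus PAM session logs",
    access_requests=1240,
    approvals=1105,
    privileged_sessions=310,
    secret_rotations=96,
    exceptions=58,
    support_tickets=412,
)

recorded = snapshot.fingerprint()

# Any later change to the figures yields a different fingerprint.
altered = replace(snapshot, support_tickets=300)
print(recorded == snapshot.fingerprint())  # True
print(recorded == altered.fingerprint())   # False
```

Recording the fingerprint alongside the snapshot before rollout gives finance and audit a simple way to confirm that the comparison point was not adjusted after the fact.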

For identity programmes, the baseline should separate actual work from background noise. A monthly ticket count alone is too blunt. Better baselines track:

  • Average time to provision and revoke access
  • Number of manual interventions per workflow
  • Volume of standing privileges or dormant accounts
  • Exception rate by application, team, or environment
  • Remediation time for leaked or rotated secrets

That last metric matters because secrets management is often where programme cost hides. In The State of Secrets in AppSec, GitGuardian and CyberArk report that the average estimated time to remediate a leaked secret is 27 days, even though 75% of organisations express strong confidence in their secrets management capabilities. That gap between confidence and reality is exactly why baselines matter. Without pre-change data, a team can reduce one metric while making another worse and still call it success.
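Several of the metrics listed above can be derived from per-workflow records rather than a monthly ticket count. The sketch below assumes a hypothetical `WorkflowRecord` shape with made-up sample data; real field names will depend on whatever the ticketing or IAM system exports.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


# Hypothetical record shape; adapt to what your systems actually export.
@dataclass
class WorkflowRecord:
    application: str
    requested_at: datetime
    completed_at: datetime
    manual_interventions: int
    was_exception: bool


def baseline_metrics(records):
    """Compute average completion time, manual effort, and exception rate per app."""
    if not records:
        raise ValueError("at least one record is needed to form a baseline")

    hours = [
        (r.completed_at - r.requested_at).total_seconds() / 3600
        for r in records
    ]

    # Per-application tallies of (total workflows, exceptions).
    by_app = {}
    for r in records:
        total, exceptions = by_app.get(r.application, (0, 0))
        by_app[r.application] = (total + 1, exceptions + int(r.was_exception))

    return {
        "avg_hours_to_complete": mean(hours),
        "manual_interventions_per_workflow": mean(
            [r.manual_interventions for r in records]
        ),
        "exception_rate_by_application": {
            app: exceptions / total
            for app, (total, exceptions) in by_app.items()
        },
    }


# Illustrative data only.
records = [
    WorkflowRecord("payroll", datetime(2024, 3, 1, 9), datetime(2024, 3, 1, 13), 2, False),
    WorkflowRecord("payroll", datetime(2024, 3, 2, 9), datetime(2024, 3, 2, 11), 0, True),
    WorkflowRecord("crm", datetime(2024, 3, 3, 10), datetime(2024, 3, 3, 16), 1, False),
]

metrics = baseline_metrics(records)
print(metrics)
```

Keeping the metrics in one function means the same calculation can be rerun unchanged on post-rollout data, which helps avoid improving one number while another quietly gets worse.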

Current best practice is to define the baseline window long enough to capture normal variation, then compare like for like after rollout. The strongest claims use the same business units, the same workflows, and the same measurement method on both sides of the change. These controls tend to break down when organisations change tooling, approval paths, and reporting logic at the same time, because the result becomes impossible to attribute cleanly.
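A like-for-like comparison can carry an explicit confidence range instead of a single number. The sketch below uses a percentile bootstrap on the difference in means, which is one reasonable technique among several and not one the guidance itself prescribes. The function name, the weekly figures, and the 95% level are all assumptions for illustration.

```python
import random


def bootstrap_mean_reduction(before, after, n_resamples=2000,
                             confidence=0.95, seed=0):
    """Estimate the reduction in the mean (before minus after) with a
    percentile bootstrap interval. Assumes both samples were measured
    the same way over comparable windows."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    diffs = []
    for _ in range(n_resamples):
        b = [rng.choice(before) for _ in before]
        a = [rng.choice(after) for _ in after]
        diffs.append(sum(b) / len(b) - sum(a) / len(a))
    diffs.sort()

    tail = (1 - confidence) / 2
    lower = diffs[int(tail * n_resamples)]
    upper = diffs[int((1 - tail) * n_resamples) - 1]
    estimate = sum(before) / len(before) - sum(after) / len(after)
    return estimate, lower, upper


# Hypothetical weekly manual-review hours for the same business unit
# and workflow, before and after rollout.
before_hours = [42, 38, 45, 40, 44, 39, 41, 43]
after_hours = [30, 33, 28, 31, 35, 29, 32, 30]

estimate, lower, upper = bootstrap_mean_reduction(before_hours, after_hours)
print(f"estimated reduction: {estimate:.1f} hours/week")
print(f"95% interval: {lower:.1f} to {upper:.1f}")
```

If the interval includes zero, the observed reduction may be noise. Note that this only addresses sampling variation: it cannot untangle a result where tooling, approval paths, and reporting logic all changed together.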

Common Variations and Edge Cases

Tighter measurement often increases reporting overhead, so organisations have to balance evidence quality against the effort needed to collect it. Some programmes can tolerate a lightweight baseline; others need a formal measurement plan because savings will be used in budget decisions, board reporting, or audit responses.

There is no universal standard for this yet. In regulated environments, the safest approach is to document the baseline method, the observation window, and the assumptions behind any savings estimate. In fast-moving cloud or agentic environments, teams may need separate baselines for human access, workload identity, and automated secrets rotation because combining them hides real differences in control performance.
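The effect of combining identity classes can be seen with a small example. This sketch assumes hypothetical observations of hours-to-revoke, tagged with the three classes named above; the numbers are invented to show how a pooled average can describe none of the groups well.

```python
from collections import defaultdict
from statistics import mean

# Illustrative data only: (identity_class, hours_to_revoke).
observations = [
    ("human", 4.0), ("human", 6.0),
    ("workload", 30.0), ("workload", 42.0),
    ("secrets_rotation", 1.0), ("secrets_rotation", 2.0),
]


def baselines_by_class(rows):
    """Return the mean value per identity class."""
    grouped = defaultdict(list)
    for identity_class, value in rows:
        grouped[identity_class].append(value)
    return {cls: mean(values) for cls, values in grouped.items()}


per_class = baselines_by_class(observations)
pooled = mean([value for _, value in observations])

print(per_class)
print(f"pooled average: {pooled:.1f} hours")
```

Here the pooled figure sits well above the human and secrets-rotation baselines and well below the workload one, so a change in any single class could be masked or exaggerated if only the combined number were tracked.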

That distinction matters in NHI work because the real object being improved is often not the number of identities, but the amount of unmanaged privilege attached to them. NHIMG’s Top 10 NHI Issues and the JetBrains GitHub plugin token exposure both illustrate how quickly exposed credentials can invalidate assumed savings by creating urgent remediation work. Baselines should therefore be refreshed after major architecture changes, because a claim that was valid in one operating model may no longer hold after a platform migration or identity consolidation.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and the NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
NIST CSF 2.0 | GV.OV-01 | Baseline metrics support governance oversight and measurable improvement claims.
OWASP Non-Human Identity Top 10 | NHI-06 | NHI lifecycle and monitoring need a baseline to prove reduced exposure.
NIST AI RMF | - | AI RMF stresses measurable risk management, which depends on pre-deployment baselines.

Define pre-change metrics and track them through rollout so savings claims can be verified over time.