How should agencies measure whether cybersecurity modernisation is actually working?

They should measure whether controls reduce risk and workload, not just whether tools were deployed. Useful indicators include faster detection, lower intrusion volume, fewer manual analyst tasks, and clearer decision-making across teams. If a modernisation effort does not improve those outcomes, it is only changing the stack, not the security posture.

Why This Matters for Security Teams

Cybersecurity modernisation only matters if it changes outcomes in the environment, not if it merely changes procurement language. Agencies often report success when tools are deployed, dashboards are green, or policy documents are updated, yet the real test is whether intrusion paths are closed off, analyst effort drops, and decisions become faster and more consistent. That is especially important in identity-heavy environments where non-human identities quietly expand the attack surface; NHIMG research shows only 5.7% of organisations have full visibility into their service accounts, and 80% of identity breaches involve compromised non-human identities such as service accounts and API keys. See Ultimate Guide to NHIs — Why NHI Security Matters Now and CISA cyber threat advisories for the operational context behind that risk. Measurement should therefore track whether modernisation reduces exposure, improves recovery, and makes controls easier to run at scale. In practice, many security teams learn that a reported “modernisation success” was hollow only when a breach review reveals that the control changes did not materially alter attacker dwell time or analyst workload.

How It Works in Practice

A workable measurement model starts with baseline, target, and trend, not just a project checklist. Agencies should define a small set of outcome metrics that connect directly to risk reduction and operating cost, then measure them before and after each modernisation initiative. Useful measures include mean time to detect, mean time to contain, percentage of alerts requiring manual triage, privileged access exceptions, and the volume of dormant or over-privileged identities. For identity-centric programmes, compare those results against the findings in The 52 NHI Breaches Report and the broader patterns in Top 10 NHI Issues so the scorecard reflects actual attack paths, not generic IT service health.

  • Track control effectiveness: fewer successful phishing, API abuse, or credential replay events.
  • Track workload reduction: fewer repetitive tickets, fewer manual approvals, less swivel-chair administration.
  • Track decision quality: more consistent access decisions and cleaner escalation paths across teams.
  • Track resilience: faster restoration after compromise, misconfiguration, or outage.
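The baseline, target, and trend model described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed scorecard: the class name `OutcomeMetric`, its fields, the three-reading comparison window, and all sample figures are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean


@dataclass
class OutcomeMetric:
    """One outcome metric tracked against a baseline and a target (hypothetical model)."""
    name: str
    baseline: float             # value measured before the initiative
    target: float               # value the programme committed to reach
    observations: list[float]   # periodic readings after rollout, oldest first
    lower_is_better: bool = True

    def latest(self) -> float:
        return self.observations[-1]

    def improvement(self) -> float:
        """Fractional improvement of the latest reading over the baseline."""
        delta = self.baseline - self.latest()
        if not self.lower_is_better:
            delta = -delta
        return delta / self.baseline

    def target_met(self) -> bool:
        if self.lower_is_better:
            return self.latest() <= self.target
        return self.latest() >= self.target

    def regressing(self, window: int = 3) -> bool:
        """True if the last `window` readings average worse than the `window` before them."""
        if len(self.observations) < 2 * window:
            return False  # not enough history to compare two windows
        recent = mean(self.observations[-window:])
        prior = mean(self.observations[-2 * window:-window])
        return recent > prior if self.lower_is_better else recent < prior


# Illustrative figures only: mean time to detect, in hours.
mttd = OutcomeMetric(
    name="mean_time_to_detect_hours",
    baseline=40.0,
    target=20.0,
    observations=[30, 24, 20, 18, 19, 22, 23, 24],
)

print(f"{mttd.name}: improvement={mttd.improvement():.0%}, "
      f"target met={mttd.target_met()}, regressing={mttd.regressing()}")
```

In this made-up series the metric is still better than its baseline, yet the most recent readings are drifting the wrong way and the target is no longer met. A single before-and-after comparison would miss that, which is why the trend is tracked alongside the baseline and target.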

Modernisation should also be evaluated in terms of control coverage. If a new platform improves visibility into endpoints but leaves secrets, service accounts, and third-party integrations unmanaged, the measurement is incomplete. This is why agencies should correlate SOC metrics with identity hygiene, patch latency, and exception rates rather than rely on one vendor dashboard. The CISA cyber threat advisories model is useful here because it pushes teams to validate whether mitigation actions actually reduced operational risk, not just whether they were issued. Measurement itself tends to break down when agencies take readings only at programme milestones, because end-state dashboards can hide regression in daily operations.
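One way to make the coverage question concrete is to compute, per asset class, the share of assets the modernised controls actually manage. The sketch below assumes a hypothetical inventory format (dictionaries with `asset_class` and `managed` keys) and an arbitrary 90% threshold; neither comes from a specific tool or standard.

```python
from collections import Counter


def coverage_by_class(assets):
    """Return the fraction of managed assets for each asset class."""
    total = Counter()
    managed = Counter()
    for asset in assets:
        total[asset["asset_class"]] += 1
        if asset["managed"]:
            managed[asset["asset_class"]] += 1
    return {cls: managed[cls] / total[cls] for cls in total}


def coverage_gaps(coverage, minimum=0.9):
    """List asset classes whose coverage falls below the chosen threshold."""
    return sorted(cls for cls, ratio in coverage.items() if ratio < minimum)


# Illustrative inventory: endpoints are fully managed, identities are not.
inventory = (
    [{"asset_class": "endpoint", "managed": True}] * 4
    + [{"asset_class": "service_account", "managed": True}]
    + [{"asset_class": "service_account", "managed": False}] * 3
    + [{"asset_class": "secret", "managed": True},
       {"asset_class": "secret", "managed": False}]
    + [{"asset_class": "third_party_integration", "managed": False}] * 2
)

coverage = coverage_by_class(inventory)
for cls, ratio in sorted(coverage.items()):
    print(f"{cls}: {ratio:.0%} managed")
print("Below threshold:", coverage_gaps(coverage))
```

Reporting coverage per class rather than as one blended figure keeps a well-instrumented endpoint fleet from masking unmanaged secrets or service accounts.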

Common Variations and Edge Cases

Tighter measurement often increases reporting overhead, requiring agencies to balance rigour against staff capacity and data quality. Some environments can instrument almost everything, while others must rely on sampled evidence, especially where legacy systems, segmented networks, or multiple contractors make telemetry inconsistent. Best practice is still evolving on how much quantitative evidence is enough, but current guidance suggests agencies should avoid claiming “success” when the only proof is tool adoption or policy completion.

If the modernisation effort includes agentic AI, automation, or delegated execution, the bar should be higher because those systems can change behaviour faster than static review cycles can capture. In that case, agencies should also review the emerging threat patterns described by Anthropic — first AI-orchestrated cyber espionage campaign report and the MITRE ATLAS adversarial AI threat matrix when deciding what “working” means for autonomous tooling. The practical edge case is simple: a modernisation programme can improve visibility while still increasing complexity, and in highly federated agencies that complexity can offset gains unless the metric set is kept narrow, repeatable, and tied to mission outcomes.
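Where an agency can only sample, as discussed above for legacy or segmented environments, it helps to report the uncertainty around a sampled rate instead of a single number. The sketch below uses a 95% Wilson score interval, which is one common statistical choice and not something the guidance above prescribes; the function name and the sample figures are illustrative assumptions.

```python
import math


def wilson_interval(hits: int, sample_size: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a sampled proportion (z=1.96 gives roughly 95%)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= hits <= sample_size:
        raise ValueError("hits must be between 0 and sample_size")
    p = hits / sample_size
    denom = 1 + z ** 2 / sample_size
    centre = (p + z ** 2 / (2 * sample_size)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / sample_size + z ** 2 / (4 * sample_size ** 2)) / denom
    )
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Illustrative figures: 12 of 50 sampled service accounts were over-privileged.
low, high = wilson_interval(hits=12, sample_size=50)
print(f"Over-privileged rate: 24% observed, plausible range {low:.0%} to {high:.0%}")
```

A wide interval signals that the sample is too small to support a claim of improvement, which is a useful check before a sampled metric goes onto a scorecard.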