How can identity teams tell whether passkeys are working at scale?

Look beyond total enrolment and track platform-specific adoption, recovery success rates, help-desk override frequency, and post-login fraud signals. A healthy passkey programme shows low fallback usage, stable desktop behaviour, and no increase in session abuse after the primary factor is strengthened.

Why This Matters for Security Teams

Passkeys are often treated as a binary rollout milestone, but identity teams need operational proof that the new factor is actually reducing friction and risk. Adoption alone can be misleading if users enrol but still fall back to passwords, if recovery paths are brittle, or if help-desk overrides quietly become the real authentication control. The right question is not just whether passkeys exist in the tenant, but whether they are replacing weaker flows at scale in a way that is measurable and sustainable. Guidance from the NIST Cybersecurity Framework 2.0 emphasises outcome-based measurement, which is the right lens here. NHIMG’s Ultimate Guide to NHIs shows why identity programmes fail when visibility is incomplete: only 5.7% of organisations have full visibility into their service accounts, a useful reminder that unseen authentication paths are usually where programme risk hides. In practice, many security teams discover passkey failure only after recovery volume spikes or fraud patterns shift, rather than through intentional measurement.

How It Works in Practice

A passkey programme should be evaluated as a conversion funnel, not a single control. Identity teams need to segment data by platform, application, and journey type so they can see where passkeys are succeeding and where users are silently reverting to passwords, SMS, or legacy MFA. The most useful metrics are adoption by device class, authentication success rate, recovery completion rate, fallback frequency, help-desk-assisted override rate, and post-authentication abuse signals such as session hijacking or unusual token reuse.
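The funnel metrics above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `AuthEvent` record, its field names, and the method labels are all hypothetical, and a real identity provider's log schema will differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthEvent:
    """One authentication attempt; all field names are hypothetical."""
    platform: str   # e.g. "desktop", "mobile", "managed"
    method: str     # "passkey", "password", "otp", or "helpdesk_override"
    success: bool


def funnel_by_platform(events):
    """Return per-platform passkey share, passkey success, fallback and override rates."""
    buckets = defaultdict(list)
    for event in events:
        buckets[event.platform].append(event)

    report = {}
    for platform, items in buckets.items():
        total = len(items)
        passkey = [e for e in items if e.method == "passkey"]
        fallback = [e for e in items if e.method in ("password", "otp")]
        overrides = [e for e in items if e.method == "helpdesk_override"]
        report[platform] = {
            "attempts": total,
            "passkey_share": len(passkey) / total,
            # None when a platform has no passkey attempts at all.
            "passkey_success_rate": (
                sum(e.success for e in passkey) / len(passkey) if passkey else None
            ),
            "fallback_rate": len(fallback) / total,
            "override_rate": len(overrides) / total,
        }
    return report


# Invented sample data for illustration only.
sample_events = [
    AuthEvent("mobile", "passkey", True),
    AuthEvent("mobile", "passkey", True),
    AuthEvent("mobile", "passkey", False),
    AuthEvent("mobile", "password", True),
    AuthEvent("desktop", "passkey", True),
    AuthEvent("desktop", "password", True),
    AuthEvent("desktop", "otp", True),
    AuthEvent("desktop", "helpdesk_override", True),
]

report = funnel_by_platform(sample_events)
for platform, metrics in sorted(report.items()):
    print(platform, metrics)
```

In this invented sample, mobile looks healthy while desktop shows a low passkey share and a visible override rate, which is the kind of gap a single tenant-wide enrolment figure would hide.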

A practical review usually starts with four questions:

  • Are users completing passkey enrolment on desktop, mobile, and managed devices at similar rates?
  • Are successful passkey logins increasing while password and OTP fallback decrease?
  • Do recovery flows complete without excessive manual intervention or account lockouts?
  • Do fraud and session-abuse indicators remain flat or improve after stronger authentication is enabled?

Current guidance suggests that passkeys should be measured against the business flows that matter most, not against vanity metrics like total registrations. That means comparing login success, abandonment, and support burden before and after rollout. It also means checking whether the strongest authenticator is actually required for high-risk actions, not just initial sign-in. NHIMG’s Top 10 NHI Issues is a useful reminder that credential governance fails when controls exist on paper but are not enforced consistently; the same pattern shows up when passkeys are deployed unevenly across channels. These controls tend to break down in hybrid estates with shared devices, unmanaged endpoints, or applications that still rely on legacy authentication flows.
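A before-and-after comparison of this kind might look like the following sketch. The `PeriodStats` fields and every number are assumptions made for illustration; they are not drawn from any real rollout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodStats:
    """Aggregated counts for one reporting period; field names are hypothetical."""
    login_attempts: int
    login_successes: int
    abandoned_logins: int
    support_tickets: int


def rates(stats):
    """Convert raw counts into per-attempt rates."""
    if stats.login_attempts == 0:
        raise ValueError("period has no login attempts")
    return {
        "success_rate": stats.login_successes / stats.login_attempts,
        "abandonment_rate": stats.abandoned_logins / stats.login_attempts,
        "tickets_per_1000_logins": 1000 * stats.support_tickets / stats.login_attempts,
    }


def rollout_delta(before, after):
    """Return after-minus-before for each rate (positive means the rate went up)."""
    before_rates = rates(before)
    after_rates = rates(after)
    return {name: after_rates[name] - before_rates[name] for name in before_rates}


# Invented numbers for illustration only.
before = PeriodStats(login_attempts=10_000, login_successes=9_000,
                     abandoned_logins=600, support_tickets=150)
after = PeriodStats(login_attempts=10_000, login_successes=9_400,
                    abandoned_logins=300, support_tickets=200)

delta = rollout_delta(before, after)
for name, change in delta.items():
    print(f"{name}: {change:+.3f}")
```

Here login success improves and abandonment falls, yet support tickets per 1,000 logins rise. A result like that would suggest the rollout is shifting friction to the help desk rather than removing it, so the deltas are best read together.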

Common Variations and Edge Cases

Tighter passkey policy often increases recovery and support overhead, so organisations need to balance stronger authentication against user support capacity. Best practice is evolving for environments with contractors, kiosk devices, shared workstations, and regulated fallback requirements, because there is no universal standard for handling every exception yet.

The biggest edge case is recovery. If a passkey programme appears healthy but recovery is slow, manual, or prone to social engineering, the security gain may be offset by operational risk. Another common exception is cross-platform behaviour: some user groups may enrol passkeys but still prefer passwords on unmanaged devices, which can mask partial adoption if reporting is not segmented.

Identity teams should also watch for “successful” logins that do not improve security because downstream session handling remains weak. A passkey can reduce phishing risk at the front door while leaving token theft, replay, or privilege abuse untouched after authentication. For a broader identity-risk lens, the Ultimate Guide to NHIs and NHIMG’s breach-focused 52 NHI Breaches Analysis both show that identity control failures are often detected through downstream abuse, not through the control itself. The metric set should therefore include both authentication quality and post-login assurance, especially where help-desk resets or step-up exemptions are common.
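To make the segmentation point concrete, the sketch below compares an aggregate fallback rate for enrolled users with the same rate split by device type. The tuple layout, the device labels, and the sample data are hypothetical.

```python
from collections import defaultdict


def fallback_after_enrolment(logins, enrolled_users):
    """Per device type, share of logins by enrolled users that did not use a passkey."""
    totals = defaultdict(int)
    fallbacks = defaultdict(int)
    for user, device_type, method in logins:
        if user not in enrolled_users:
            continue  # users without a passkey cannot "fall back" from one
        totals[device_type] += 1
        if method != "passkey":
            fallbacks[device_type] += 1
    return {device: fallbacks[device] / totals[device] for device in totals}


# Invented sample: (user, device_type, method)
enrolled = {"alice", "bob"}
logins = [
    ("alice", "managed", "passkey"),
    ("alice", "managed", "passkey"),
    ("bob", "managed", "passkey"),
    ("bob", "unmanaged", "password"),
    ("alice", "unmanaged", "password"),
    ("alice", "unmanaged", "passkey"),
    ("carol", "unmanaged", "password"),  # not enrolled, so excluded
]

segmented = fallback_after_enrolment(logins, enrolled)

enrolled_logins = [entry for entry in logins if entry[0] in enrolled]
overall = sum(1 for _, _, method in enrolled_logins if method != "passkey") / len(enrolled_logins)

print("overall fallback rate:", round(overall, 2))
for device, rate in sorted(segmented.items()):
    print(device, round(rate, 2))
```

In this sample the overall fallback rate for enrolled users is one in three, but the split shows no fallback on managed devices and two in three on unmanaged ones. The aggregate figure alone would understate the problem on the segment where it matters.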

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
NIST CSF 2.0 | DE.CM-1 | Passkey scale should be measured with ongoing monitoring and outcome tracking.
NIST AI RMF | (none specified) | Identity programmes need measurable governance and risk outcomes, not vanity rollout stats.
OWASP Non-Human Identity Top 10 | NHI-06 | Fallbacks, recovery, and credential handling are core NHI governance failure points.

Define passkey success criteria around risk reduction, usability, and residual authentication exposure.