
How should security teams design Epic identity continuity when the primary IdP fails?

Use an orchestration layer that fronts Epic’s authentication endpoints, keeps Epic registration stable, and routes new logins to a healthy secondary provider when the primary is unavailable. The goal is controlled continuity, not application reconfiguration during an outage.

Why This Matters for Security Teams

Epic identity continuity is not just an uptime problem. When the primary IdP fails, the risk is that authentication control, registration state, and session trust all get disrupted at once, creating pressure to make unsafe changes during an incident. Security teams need a design that preserves Epic’s registration plane while redirecting only new authentication traffic to a trusted secondary path. That approach fits Zero Trust thinking and avoids reconfiguring the application under outage conditions, which is when mistakes become expensive.

Practitioners should treat this as a controlled failover pattern, not a generic disaster recovery exercise. The continuity layer must decide which identities are allowed to sign in, which provider is authoritative for the moment, and how to prevent duplicate or stale identity records from being created. The NIST Cybersecurity Framework 2.0 is useful here because it emphasizes resilience, identity governance, and recovery discipline rather than ad hoc exception handling. For broader NHI failure modes, the Ultimate Guide to NHIs shows how identity systems become brittle when credential state and operational continuity are not designed together.

In practice, many security teams discover an Epic identity outage only after users have already started bypassing controls to restore access.

How It Works in Practice

The most reliable pattern is an orchestration or broker layer that sits in front of Epic’s authentication endpoints and preserves a stable trust relationship with the application. Epic continues to point at the same registration and sign-in surface, while the broker routes authentication requests to the primary IdP when healthy and to a pre-approved secondary IdP when the primary is unavailable. The application should not need a code change or a new configuration push during the outage.
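The routing decision described above can be sketched in a few lines. This is a minimal illustration, not Epic's or any broker product's actual API: the `IdentityProvider` type, the `route_login` function, and all URLs are hypothetical names chosen for the example. The point it shows is that the Epic-facing endpoint stays constant while only the upstream provider changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentityProvider:
    name: str
    authorize_url: str


# Hypothetical endpoints; real values would come from each IdP's federation metadata.
PRIMARY = IdentityProvider("primary-idp", "https://primary-idp.example.org/authorize")
SECONDARY = IdentityProvider("secondary-idp", "https://secondary-idp.example.org/authorize")

# The single address Epic is registered against (assumed). It never changes during failover.
BROKER_ENDPOINT = "https://broker.example.org/authorize"


def route_login(primary_healthy: bool) -> dict:
    """Pick the upstream IdP for a new login without touching Epic's registration."""
    upstream = PRIMARY if primary_healthy else SECONDARY
    return {
        "epic_facing_endpoint": BROKER_ENDPOINT,
        "upstream": upstream.name,
        "redirect_to": upstream.authorize_url,
    }
```

Calling `route_login(False)` returns the secondary provider as the upstream, while `epic_facing_endpoint` is identical in both cases, which is what lets the application stay untouched during the outage.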

Operationally, that broker should enforce explicit rules for identity matching, attribute mapping, and session continuity. Current guidance suggests three design choices:

  • Keep Epic registration immutable so the application always sees the same issuer-facing endpoint.
  • Pre-stage federation metadata for the secondary IdP, including signing keys, claims mapping, and group logic.
  • Use short-lived sessions and re-authentication checkpoints so failover does not extend trust indefinitely.
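The third design choice, short-lived sessions with re-authentication checkpoints, can be illustrated with a small sketch. The `Session` type, the `needs_reauth` function, and both lifetimes are assumptions made for this example; actual values should come from the organisation's own session policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed lifetimes for illustration only.
NORMAL_TTL = timedelta(minutes=60)
FAILOVER_TTL = timedelta(minutes=15)  # shorter trust window for the secondary path


@dataclass(frozen=True)
class Session:
    subject: str
    issued_at: datetime
    via_failover: bool  # True if the secondary IdP authenticated this session


def needs_reauth(session: Session, now: datetime) -> bool:
    """Return True once the session has outlived the TTL for its issuing path."""
    ttl = FAILOVER_TTL if session.via_failover else NORMAL_TTL
    return now - session.issued_at >= ttl


issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
later = issued + timedelta(minutes=20)
failover_session = Session("clinician-1", issued, via_failover=True)
normal_session = Session("clinician-1", issued, via_failover=False)
```

With these assumed values, a session issued through the failover path hits its checkpoint at 20 minutes, while one issued by the primary does not, so trust granted during an outage expires sooner.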

Identity continuity also depends on secrets hygiene. If the broker, IdP connectors, or Epic integration uses long-lived credentials, an outage can turn into a credential exposure event. The NHI data in Top 10 NHI Issues is a reminder that weak secret handling remains common, and NHI compromise often spreads through identity infrastructure rather than the application itself. For implementation discipline, align the failover plan with the identity resilience and least-privilege expectations in 52 NHI Breaches Analysis and with the access-control intent described in NIST CSF 2.0.

These controls tend to break down when the secondary IdP uses different subject identifiers, claim formats, or approval workflows, because Epic cannot reliably reconcile the same person across two trust sources.
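One way to handle mismatched subject identifiers is a pre-staged link table that maps each provider's identifier to a single canonical ID before anything reaches Epic. The sketch below assumes such a table exists; `SUBJECT_LINKS`, `resolve_canonical_id`, and the sample identifiers are all hypothetical.

```python
from typing import Optional

# Hypothetical link table, populated before any outage:
# (provider name, provider-specific subject) -> canonical identity.
SUBJECT_LINKS = {
    ("primary", "jdoe@hospital.example"): "EMP-1001",
    ("secondary", "00u1abcd"): "EMP-1001",
}


def resolve_canonical_id(provider: str, subject: str) -> Optional[str]:
    """Map a provider-specific subject to one canonical ID.

    Returns None for unknown subjects so the caller can deny the login
    (fail closed) rather than create a duplicate identity record.
    """
    return SUBJECT_LINKS.get((provider, subject))
```

Both providers resolve to the same `EMP-1001` record here, and an unlinked subject yields `None`. Treating `None` as a denied login is a design choice in this sketch: it favours preventing duplicate or stale records over admitting an unmatched user.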

Common Variations and Edge Cases

Tighter continuity controls often increase operational overhead, requiring organisations to balance outage resilience against identity governance complexity.

One common edge case is partial outage rather than full failure. If the primary IdP is degraded but still intermittently reachable, the broker needs clear health thresholds so it does not oscillate between providers and create inconsistent sign-in results. Another edge case is regulated access, where the secondary IdP may be allowed for general clinical users but not for privileged administrators or high-risk workflows. In those cases, the failover path should support policy exceptions, not a universal pass-through.
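Both edge cases can be sketched together. The gate below uses consecutive-probe thresholds (a simple form of hysteresis) so that a flapping primary does not cause provider oscillation, and a small role check models the policy exception for privileged users. The class, function, thresholds, and role names are assumptions for illustration, not values from any standard or product.

```python
class FailoverGate:
    """Switch providers only after a run of consecutive contradicting probes."""

    def __init__(self, fail_after: int = 3, recover_after: int = 5):
        self.fail_after = fail_after        # failed probes needed to leave the primary
        self.recover_after = recover_after  # healthy probes needed to return to it
        self.use_secondary = False
        self._streak = 0  # consecutive probes that contradict the current state

    def record(self, probe_ok: bool) -> bool:
        """Record one health probe of the primary; return True if on the secondary."""
        contradicts = probe_ok if self.use_secondary else not probe_ok
        self._streak = self._streak + 1 if contradicts else 0
        needed = self.recover_after if self.use_secondary else self.fail_after
        if self._streak >= needed:
            self.use_secondary = not self.use_secondary
            self._streak = 0
        return self.use_secondary


# Hypothetical roles that must never authenticate through the failover path.
FAILOVER_DENIED_ROLES = {"privileged_admin"}


def may_use_secondary(role: str) -> bool:
    """Policy exception: some roles are denied rather than failed over."""
    return role not in FAILOVER_DENIED_ROLES
```

Requiring more healthy probes to return than failed probes to leave is a deliberate asymmetry in this sketch: it keeps the broker on the secondary until the primary has been stable for a while, at the cost of a slower return.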

Another practical issue is whether the secondary provider is truly equivalent. Best practice is evolving, but there is no universal standard for treating two IdPs as perfectly interchangeable. Security teams should validate that MFA strength, RBAC mappings, lifecycle joiner-mover-leaver rules, and audit logging remain consistent across both paths. The JetBrains GitHub plugin token exposure incident is a useful reminder that trusted identity tooling can fail through credential exposure as much as through software defects.
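The equivalence check can be run as a pre-outage validation step. The sketch below compares a handful of control attributes across both paths and reports the ones that differ. The attribute names and sample values are hypothetical; a real check would pull them from each provider's configuration.

```python
# Hypothetical control attributes that must match across both paths.
REQUIRED_MATCH = ("mfa_strength", "rbac_mapping_version", "jml_sync", "audit_logging")


def equivalence_gaps(primary: dict, secondary: dict) -> list[str]:
    """List controls that are missing on either side or differ between the two."""
    gaps = []
    for control in REQUIRED_MATCH:
        p, s = primary.get(control), secondary.get(control)
        if p is None or s is None or p != s:
            gaps.append(control)
    return gaps


primary_cfg = {
    "mfa_strength": "phishing-resistant",
    "rbac_mapping_version": "v7",
    "jml_sync": True,
    "audit_logging": True,
}
secondary_cfg = {
    "mfa_strength": "otp",           # weaker MFA on the secondary path
    "rbac_mapping_version": "v7",
    "jml_sync": True,
    # audit_logging not configured
}
```

For these sample values, `equivalence_gaps(primary_cfg, secondary_cfg)` flags `mfa_strength` and `audit_logging`. A control missing on either side counts as a gap, so an unconfigured setting is never treated as equivalent by default.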

For organisations mapping this pattern to formal governance, the control intent is straightforward: preserve service continuity without weakening authentication assurance. That aligns with the resilience focus in NIST Cybersecurity Framework 2.0 and with the NHI governance view in the Ultimate Guide to NHIs — What are Non-Human Identities.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
NIST CSF 2.0 | PR.AC-4 | Identity continuity depends on controlled access enforcement during provider failover.
NIST Zero Trust | SP 800-207 | Epic failover should preserve trust decisions without relying on network location or outage shortcuts.
OWASP Non-Human Identity Top 10 | NHI-05 | The broker and IdP connectors are NHI-adjacent components that need credential and secret control.

Treat each login as a fresh trust decision and route it through policy checks, not implicit network trust.