Identity continuity for Epic depends on provider failover

By NHI Mgmt Group Editorial Team
Published 2026-04-30
Domain: Governance & Risk
Source: Strata Identity

TL;DR: During identity provider outages, Maverics keeps Epic users, SMART on FHIR launches, and backend service tokens working by fronting OAuth and OIDC flows, normalizing claims, and failing over to a healthy connector in seconds, according to Strata Identity. The governance issue is not just resilience but preserving clinical uptime without turning Epic into a reconfiguration project.

At a glance

What this is: An analysis of identity continuity for Epic, showing how an orchestration layer can keep clinician and service logins working when the primary identity provider is unavailable.

Why it matters: IAM teams running EHR, NHI, and federation programmes need continuity patterns that preserve access without breaking clinical workflows or forcing repeated Epic reconfiguration.

👉 Read Strata Identity's analysis of Epic identity continuity and provider failover

Context

Identity continuity is the ability to keep authentication and token issuance working when the upstream identity provider fails, changes policy, or becomes unreachable. In Epic environments, that problem sits squarely in IAM and NHI governance because clinical access depends on external identity services as much as it depends on the EHR itself.

The article argues that Epic should not be coupled to one provider lifecycle. That framing is relevant for hospitals, health systems, and regulated operators that need federation resilience, break-glass access, and auditability without turning provider migration into an outage event.

Key questions

Q: How should security teams design Epic identity continuity when the primary IdP fails?

A: Use an orchestration layer that fronts Epic's authentication endpoints, keeps Epic registration stable, and routes new logins to a healthy secondary provider when the primary is unavailable. The goal is controlled continuity, not application reconfiguration during an outage.

Q: Why do healthcare identity failures create operational risk beyond login problems?

A: Because Epic authentication gates clinical work, backend services, and patient access. When identity fails, the outage affects care delivery, system-to-system integrations, and support workflows, so IAM resilience becomes an operational dependency rather than a convenience feature.

Q: What breaks when identity provider failover is not separated from the application?

A: The application inherits provider churn, which means migrations, outages, and policy changes force reconfiguration and can interrupt user authentication. That coupling turns identity maintenance into an incident every time the upstream provider changes.

Q: Who is accountable for emergency access during identity failover?

A: The identity and application owners are accountable for defining break-glass conditions, approving the deviation path, and preserving audit evidence. If emergency access is not documented and reviewable, resilience becomes an accountability gap instead of a control.

Technical breakdown

Epic OAuth and SMART on FHIR continuity

Epic delegates authentication through OAuth 2.0 and OpenID Connect, which means the EHR depends on the identity provider's availability, health, and policy state. In this pattern, the orchestration layer fronts the authorize and token endpoints so Epic sees one stable trust boundary while the upstream provider can change. That preserves launch context, claims normalization, and token issuance for clinicians and backend services. The technical point is simple: the continuity layer abstracts provider churn without changing the application registration.

Practical implication: keep Epic registrations immutable and place provider changes behind an orchestration layer rather than embedding provider dependency into the EHR.
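
The decoupling described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Maverics configuration or an Epic API: the class names, the `swap_upstream` method, and every `example.test` URL are hypothetical. It shows only that the endpoints the application is registered against stay fixed while the provider behind them changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoints:
    authorize_url: str
    token_url: str


class Orchestrator:
    """Hypothetical facade: stable endpoints in front, swappable upstream behind."""

    def __init__(self, facade: Endpoints, upstream: Endpoints) -> None:
        self.facade = facade       # what the application registration points at
        self._upstream = upstream  # current identity provider (assumed)

    def swap_upstream(self, upstream: Endpoints) -> None:
        # Only the upstream changes; the facade is never touched.
        self._upstream = upstream

    def resolve(self, facade_url: str) -> str:
        """Map a request on a stable endpoint to the current upstream endpoint."""
        if facade_url == self.facade.authorize_url:
            return self._upstream.authorize_url
        if facade_url == self.facade.token_url:
            return self._upstream.token_url
        raise ValueError(f"not a facade endpoint: {facade_url}")


# All URLs below are placeholders for illustration.
facade = Endpoints("https://id.hospital.example.test/authorize",
                   "https://id.hospital.example.test/token")
primary = Endpoints("https://idp-a.example.test/authorize",
                    "https://idp-a.example.test/token")
secondary = Endpoints("https://idp-b.example.test/authorize",
                      "https://idp-b.example.test/token")

orch = Orchestrator(facade, primary)
```

Calling `orch.swap_upstream(secondary)` changes where `orch.resolve(facade.token_url)` points, while `orch.facade` stays the same object the application was registered against.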

Health-check-driven failover for identity providers

The failover model polls each provider's discovery endpoint to decide which connector is healthy before the next login arrives. That is a proactive routing decision, not a retry after failure. In an active-standby design, the first healthy provider in the ordered list handles new requests, while claims are normalized so downstream applications do not detect a change. This matters because identity continuity depends on making provider state observable and routable, not merely available somewhere in the stack.

Practical implication: tie login routing to explicit health signals and define healthy and unhealthy thresholds before users encounter the outage.
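
A rough sketch of that routing logic follows. It is not the vendor's implementation: the `Connector` class, the threshold defaults (three failures, two successes), and the `probe` helper are assumptions chosen for illustration. The probe fetches an OIDC discovery document and treats a 200 response containing an `issuer` field as healthy.

```python
import json
import urllib.request
from dataclasses import dataclass, field
from typing import List, Optional


def probe(discovery_url: str, timeout: float = 2.0) -> bool:
    """Return True if the discovery document loads and names an issuer."""
    try:
        with urllib.request.urlopen(discovery_url, timeout=timeout) as resp:
            return resp.status == 200 and "issuer" in json.load(resp)
    except (OSError, ValueError):  # network errors, timeouts, bad JSON
        return False


@dataclass
class Connector:
    name: str
    unhealthy_threshold: int = 3  # consecutive failures before marking down (assumed)
    healthy_threshold: int = 2    # consecutive successes before marking up (assumed)
    healthy: bool = True
    _fails: int = field(default=0, init=False, repr=False)
    _oks: int = field(default=0, init=False, repr=False)

    def record_probe(self, ok: bool) -> None:
        """Update health state from one probe result."""
        if ok:
            self._oks += 1
            self._fails = 0
            if not self.healthy and self._oks >= self.healthy_threshold:
                self.healthy = True
        else:
            self._fails += 1
            self._oks = 0
            if self.healthy and self._fails >= self.unhealthy_threshold:
                self.healthy = False


def select_connector(ordered: List[Connector]) -> Optional[Connector]:
    """Active-standby: the first healthy connector in priority order wins."""
    for connector in ordered:
        if connector.healthy:
            return connector
    return None
```

In use, a background loop would call `connector.record_probe(probe(url))` on a fixed interval, and each new login would call `select_connector` on the ordered list. Requiring several consecutive results in either direction keeps a single dropped probe from flipping the route.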

DDIL and air-gapped identity continuity

Disconnected, disrupted, intermittent, and limited environments change the identity assumption entirely. Cloud authentication cannot be treated as always reachable, so local identity capability becomes part of the access design rather than an exception path. In those deployments, the continuity layer must support local authentication, later reconciliation, and audited deviation windows. This is where operational resilience and identity governance meet, because access still needs control even when connectivity is unstable.

Practical implication: design a local authentication and audit model for disconnected sites instead of assuming WAN-dependent federation will be enough.
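
One way to picture the audit side of a disconnected site is a deviation window that records local logins and reconciles them once connectivity returns. The sketch below is an assumption-laden illustration: `DeviationWindow`, `LocalAuthEvent`, and the `upload` callback are hypothetical names, and the central audit store is represented only by that callback.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, List, Optional


@dataclass
class LocalAuthEvent:
    subject: str
    at: datetime
    reconciled: bool = False


@dataclass
class DeviationWindow:
    """Period during which a site authenticates locally (hypothetical model)."""
    site: str
    opened_at: datetime
    closed_at: Optional[datetime] = None
    events: List[LocalAuthEvent] = field(default_factory=list)

    def record_login(self, subject: str, at: datetime) -> None:
        # Local logins are only valid while the window is open.
        if self.closed_at is not None:
            raise RuntimeError("deviation window is closed")
        self.events.append(LocalAuthEvent(subject, at))

    def close(self, at: datetime) -> None:
        self.closed_at = at

    def reconcile(self, upload: Callable[[LocalAuthEvent], bool]) -> int:
        """Send unreconciled events upstream; return how many were accepted."""
        accepted = 0
        for event in self.events:
            if not event.reconciled and upload(event):
                event.reconciled = True
                accepted += 1
        return accepted
```

Because `reconcile` skips events already marked as reconciled, it can be retried safely when the link drops partway through an upload.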

NHI Mgmt Group analysis

Identity continuity is now a governance problem, not just an availability pattern. The article treats Epic access as a clinical uptime dependency, which is the right framing for healthcare identity architecture. When authentication fails, the business impact is not limited to login friction. The governance question becomes how to preserve controlled access while avoiding app-level reconfiguration every time the upstream provider changes. Practitioners should treat continuity as part of identity operating model design.

Provider dependency without a continuity layer creates brittle clinical access. Epic delegates authentication to external identity services, so a single provider outage can become a workflow outage. That dependency is acceptable only if the surrounding identity fabric can absorb provider failure, normalize claims, and preserve audit state. The field implication is that application trust must be decoupled from provider lifecycle.

Break-glass continuity must remain attested and bounded. The article correctly distinguishes failover from bypass by preserving audit and defining a break-glass policy during degraded operation. That matters because emergency access without traceability becomes a governance exception that can outlive the incident. The practitioner conclusion is that resilience controls must not erase accountability.

DDIL healthcare environments force identity design to assume local control. Rural, tactical, and disaster-response sites cannot rely on uninterrupted cloud identity. Federated designs usually assume that remote authentication is reachable at all times, and that assumption breaks in these environments. Teams should treat local identity capability, state reconciliation, and logged deviation windows as core requirements for edge care.

From our research:
91.6% of secrets remain valid five days after the targeted organisation is notified, showing a critical gap in remediation procedures, according to Ultimate Guide to NHIs.
Only 5.7% of organisations have full visibility into their service accounts, which is why continuity designs need governance as much as routing logic.
For the broader control picture, see Top 10 NHI Issues for the access and lifecycle failures that often sit behind identity outages.

What this signals

Identity continuity is becoming a baseline requirement for regulated operations. Hospitals, federal care sites, and distributed clinical networks cannot treat cloud identity reachability as a constant. The programme signal is that identity architecture must include local fallback, explicit audit trails, and recovery design, not just federated login success in the steady state.

Continuity changes the IAM roadmap from migration planning to dependency management. Teams need to map which applications can tolerate provider churn, which cannot, and where the identity fabric requires an orchestration layer. The real programme risk is not a failed login alone, but a provider outage that cascades into clinical downtime or uncontrolled emergency access.

Identity uptime is a control objective, not an implementation detail. When access depends on external providers, resilience work belongs beside federation, lifecycle governance, and break-glass policy. The most mature programmes will measure how quickly they can restore controlled authentication, not just how quickly a login screen comes back.

For practitioners

Decouple Epic from provider lifecycles: Front Epic with an orchestration layer so the EHR keeps one registration, one JWKS, and one SMART configuration while upstream providers change behind it.
Define health-based failover thresholds: Set discovery polling intervals and unhealthy thresholds before production use so login routing changes only after a provider is explicitly marked unhealthy.
Document break-glass audit rules: Write down when failover is permitted, who approves it, and how the deviation window is recorded so emergency access stays attested.
Plan for disconnected clinical sites: Establish local authentication, state reconciliation, and offline recovery procedures for edge, rural, and air-gapped deployments where WAN identity is not reliable.
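
The break-glass rule in the list above (who approves, for how long, and with what record) can be expressed as a small policy check. This is a sketch only: the function and field names are hypothetical, and the four-hour default is an arbitrary placeholder, not a recommended value.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Set


@dataclass(frozen=True)
class BreakGlassGrant:
    """Immutable record of one approved emergency-access window."""
    requester: str
    approver: str
    reason: str
    granted_at: datetime
    max_duration: timedelta

    def is_active(self, now: datetime) -> bool:
        # The grant is bounded: it expires on its own after max_duration.
        return self.granted_at <= now < self.granted_at + self.max_duration


def approve_break_glass(
    requester: str,
    approver: str,
    reason: str,
    now: datetime,
    allowed_approvers: Set[str],
    max_duration: timedelta = timedelta(hours=4),  # placeholder default (assumed)
) -> BreakGlassGrant:
    """Create a grant only if the approval path satisfies the written policy."""
    if approver == requester:
        raise PermissionError("requester cannot self-approve")
    if approver not in allowed_approvers:
        raise PermissionError(f"{approver} is not an authorised approver")
    if not reason.strip():
        raise ValueError("a documented reason is required")
    return BreakGlassGrant(requester, approver, reason, now, max_duration)
```

Keeping the grant immutable and time-bounded means the record that authorised access is also the audit evidence for it.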

Key takeaways

Epic access becomes fragile when authentication is tied directly to a single upstream provider without continuity controls.
The practical control problem is provider dependency, not just outage handling, because login failure can stop clinical work and backend services together.
Governed failover with audit, local fallback, and stable application registration is what separates resilience from reconfiguration chaos.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST CSF 2.0	PR.AA-01	Identity continuity depends on authenticated access remaining available through provider failure.
NIST Zero Trust (SP 800-207)	Section 2.1 (Tenets of Zero Trust)	Continuous verification and policy-based access routing align with provider health checks.
OWASP Non-Human Identity Top 10	NHI7:2025	Backend JWT and service identity continuity depend on managed non-human credentials.

Review service account and token continuity under NHI governance before migrating Epic dependencies.

Key terms

Identity continuity: Identity continuity is the ability to keep authentication and token issuance working when an upstream identity provider is degraded, unavailable, or being changed. In healthcare, it preserves access to clinical and service workflows without forcing the application to be reconfigured every time the provider lifecycle changes.
Break-glass access: Break-glass access is an emergency access path used when normal identity controls cannot complete the request. It should be tightly bounded, explicitly approved, and fully audited so temporary resilience does not become permanent privilege drift.
Claims normalization: Claims normalization is the process of translating identity provider attributes into a stable set of application-facing fields. It allows downstream systems to see consistent identities and session context even when the upstream provider changes, which is critical for continuity architectures.
Disconnected, disrupted, intermittent, and limited environments: Disconnected, disrupted, intermittent, and limited environments are operating conditions where reliable WAN connectivity cannot be assumed. Identity design in these environments must include local authentication capability, recovery procedures, and auditability rather than assuming continuous cloud reachability.
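
As an illustration of the claims normalization term defined above, the sketch below maps two differently shaped provider payloads onto one stable claim set. The provider labels, the per-provider attribute names, and the three stable claims (`sub`, `email`, `roles`) are assumptions for the example, not a schema taken from Epic or any identity provider.

```python
from typing import Any, Dict

# Hypothetical per-provider maps: upstream attribute -> stable claim name.
CLAIM_MAPS: Dict[str, Dict[str, str]] = {
    "primary": {"sub": "sub", "email": "email", "groups": "roles"},
    "secondary": {"oid": "sub", "upn": "email", "memberOf": "roles"},
}

REQUIRED = {"sub", "email", "roles"}


def normalize(provider: str, raw: Dict[str, Any]) -> Dict[str, Any]:
    """Translate provider-specific attributes into the stable claim set."""
    mapping = CLAIM_MAPS[provider]
    claims = {stable: raw[src] for src, stable in mapping.items() if src in raw}
    missing = REQUIRED - claims.keys()
    if missing:
        # Fail closed rather than issue a token with partial identity.
        raise ValueError(f"missing claims from {provider}: {sorted(missing)}")
    return claims


from_primary = normalize("primary", {
    "sub": "u123", "email": "dr.a@example.test", "groups": ["clinician"],
})
from_secondary = normalize("secondary", {
    "oid": "u123", "upn": "dr.a@example.test", "memberOf": ["clinician"],
})
```

Both calls yield the same dictionary, which is the property that lets a downstream application stay unaware of which provider handled the login.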

Deepen your knowledge

Identity continuity for Epic and other federated applications is covered in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building resilience around provider failover, this is a useful place to start.

This post draws on content published by Strata Identity: identity continuity for Epic when the IdP goes down. Read the original.

NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org