Defenders who hunt only for phishing pages miss the real compromise path. In device code attacks, the user authenticates on a legitimate Microsoft page, so there may be no fake login artefact to detect. The durable risk is the token issued after that interaction, which can be reused from attacker infrastructure for mailbox access and fraud.
Why Defenders Miss the Real Compromise Path
Phishing-page hunting is useful only when an attacker actually stands up a fake login. Token replay changes the problem entirely: the user may authenticate on a legitimate identity provider page, but the session artefact becomes the weapon. That means detection logic tuned to look for typosquatted domains, cloned login forms, or suspicious branding can miss both the initial compromise and the later abuse of a valid token.
This is why incident patterns like the Salesloft OAuth token breach matter to defenders: the durable risk is not the page, it is the credential artefact that can be replayed from attacker infrastructure. CISA cyber threat advisories also repeatedly emphasise session theft, token abuse, and authenticated misuse as distinct from classic phishing, which means the detection model has to move beyond URL inspection and user training alone. In practice, many security teams discover token replay only after mailbox access, data export, or fraud has already occurred, rather than through intentional phishing-page detection.
How Token Replay Actually Bypasses Phishing-Centric Defences
Token replay works because the attacker does not need to impersonate the login page after the token exists. In device code and OAuth consent flows, the victim may complete authentication on a legitimate Microsoft page, so there is no fake page artefact to block. Once the token is issued, the attacker can reuse it until expiry or revocation, often from a different device, IP range, or automation path. The control problem is therefore about session trust, token lifetime, and post-authentication abuse.
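To make the bearer-token property concrete, here is a deliberately simplified model of token acceptance. It is a sketch, not how any particular identity provider validates tokens; the `BearerToken` class, its fields, and `is_accepted` are hypothetical names introduced for illustration. The point it shows is that, absent additional controls, acceptance depends only on expiry and revocation, not on who presents the token or from where.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BearerToken:
    """Hypothetical, simplified token record (not a real provider's schema)."""
    subject: str
    issued_at: datetime
    lifetime: timedelta
    revoked: bool = False


def is_accepted(token: BearerToken, now: datetime, presenter_ip: str) -> bool:
    """Simplified bearer check.

    presenter_ip is deliberately ignored: a plain bearer token is honoured
    for whoever presents it until it expires or is revoked.
    """
    expired = now >= token.issued_at + token.lifetime
    return not token.revoked and not expired


issued = datetime(2024, 1, 1, 12, 0)
token = BearerToken("alice", issued, timedelta(hours=1))

# Same token, two different networks: both requests pass the check.
print(is_accepted(token, issued + timedelta(minutes=5), "203.0.113.7"))
print(is_accepted(token, issued + timedelta(minutes=5), "198.51.100.9"))
```

In this model, only a shorter `lifetime` or setting `revoked` closes the replay window, which is why the controls discussed here focus on token lifetime and revocation.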
Operationally, defenders should treat token replay as a lifecycle issue:
- Detect abnormal token use, not just suspicious logins.
- Shorten token time-to-live where the business process allows it.
- Revoke refresh tokens and sessions when risk signals appear.
- Correlate user consent, device code entry, and mailbox or API activity.
- Instrument identity telemetry for impossible travel, atypical client apps, and rapid privilege jumps.
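The first and fourth items above can be sketched as a small correlation routine. This is a minimal illustration under stated assumptions: the `TokenEvent` record, its field names, and the one-hour `max_age` default are all hypothetical, and real identity logs would need to be mapped into this shape first. The sketch compares each token use against the context recorded at issuance and flags uses where the network, client app, or token age looks different.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TokenEvent:
    """One identity-log record. All field names are hypothetical."""
    token_id: str       # stable identifier for the token or session
    kind: str           # "issued" or "used"
    timestamp: datetime
    ip_prefix: str      # coarse network label, e.g. a /24 or an ASN
    client_app: str     # e.g. "Outlook", "python-requests"


def find_replay_candidates(events, max_age=timedelta(hours=1)):
    """Return (event, reason) pairs for uses that differ from issuance context."""
    issued = {}
    flagged = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.kind == "issued":
            issued[ev.token_id] = ev
            continue
        origin = issued.get(ev.token_id)
        if origin is None:
            # A use with no recorded issuance is worth reviewing on its own.
            flagged.append((ev, "no issuance record"))
            continue
        reasons = []
        if ev.ip_prefix != origin.ip_prefix:
            reasons.append("network changed")
        if ev.client_app != origin.client_app:
            reasons.append("client app changed")
        if ev.timestamp - origin.timestamp > max_age:
            reasons.append("token older than max_age")
        if reasons:
            flagged.append((ev, ", ".join(reasons)))
    return flagged
```

A flagged event is a candidate for review and revocation, not proof of compromise: legitimate users also change networks and clients, so the thresholds here would need tuning against real traffic.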
Research from the State of Secrets Sprawl 2026 is a useful reminder that detection without revocation is weak: 64% of valid secrets leaked in 2022 are still valid and exploitable today, which is the same failure pattern defenders face with long-lived tokens. Pair that with guidance from CISA cyber threat advisories on credential abuse and the lesson is clear: the control plane must move from page-level suspicion to token-level containment. These controls tend to break down in environments that allow long-lived refresh tokens and weak conditional access, because the replayed token remains valid even after the original user interaction is over.
Where the Standard Response Breaks Down
Tighter token controls often increase help desk load and integration friction, so organisations have to balance replay resistance against user experience and legacy app compatibility. That tradeoff becomes especially sharp in hybrid environments, where older mail clients, service desks, and third-party apps still depend on broad OAuth scopes or extended session lifetimes.
Current guidance suggests three places where phishing-first thinking fails most often. First, alerting that keys only on suspicious domains misses legitimate-page authentication entirely. Second, conditional access that trusts the first successful sign-in may not stop a stolen token from being replayed later. Third, organisations that do not monitor token issuance, consent grants, and token usage separately often cannot tell whether an account was phished, consented, or replayed.
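The third point can be illustrated with a small sketch that keeps the three evidence types separate. The function name, its boolean inputs, and the labels are hypothetical; each input is assumed to come from a different telemetry source (web or proxy logs, consent-grant audit logs, and token-usage logs respectively).

```python
def classify_evidence(fake_page_visit: bool,
                      new_consent_grant: bool,
                      context_mismatched_use: bool) -> set:
    """Map three independent evidence streams to candidate compromise paths.

    The paths are not mutually exclusive, so a set is returned rather than
    a single label.
    """
    paths = set()
    if fake_page_visit:
        paths.add("phished")
    if new_consent_grant:
        paths.add("consented")
    if context_mismatched_use:
        paths.add("replayed")
    return paths or {"undetermined"}


# No fake page was ever visited, yet two compromise paths are in play.
print(classify_evidence(False, True, True))
```

If only the first stream is collected, the example above collapses to "undetermined", which is the gap described in the paragraph.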
That is why case studies such as the Internet Archive breach and the Dropbox Sign breach remain relevant: once tokens or sessions are exposed, the attacker’s advantage is persistence, not page spoofing. The practical answer is to investigate token provenance, revoke affected sessions quickly, and tune detections for reuse patterns rather than phishing artefacts alone.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A03 | Token replay is a post-authentication abuse path that evades page-focused detection. |
| CSA MAESTRO | IAM-04 | Highlights runtime identity abuse and session misuse in autonomous workflows. |
| NIST AI RMF | | Supports governance for identity and misuse risks in dynamic AI-enabled environments. |
Build monitoring and response for post-authentication abuse, not only initial access.