TL;DR: MiniMax M2.5 is now the most-used model on OpenRouter by token volume, processing over 2.45 trillion tokens in a week as of February 2026, while Chinese AI models reached 61% market share and the model delivered near frontier performance at far lower cost, according to WorkOS. The governance issue is no longer model quality alone but routing, escalation, and control boundaries across AI agent workloads.
NHIMG editorial — based on content published by WorkOS: Why MiniMax M2.5 is the most popular model on OpenRouter right now
By the numbers:
- MiniMax M2.5 processes more tokens on OpenRouter than any other model, with over 2.45 trillion tokens processed in a single week as of February 2026.
- M2.5 scores 80.2% on SWE-Bench Verified, compared with 80.8% for Claude Opus 4.6.
Questions worth separating out
Q: How should security teams govern model routing in AI agent workflows?
A: Security teams should treat model routing as a policy decision, not a performance shortcut.
Q: Why does a cheap front-line model change IAM risk for AI systems?
A: A cheap front-line model makes it easier to route more work through a single decision layer, which concentrates access and data handling.
Q: What breaks when escalation from one model to another is implicit?
A: Implicit escalation breaks governance because teams cannot prove when a request crossed from low-risk handling into higher-trust reasoning.
Practitioner guidance
- Map routing decisions to governance owners Assign a named control owner for when the front-line model may answer directly and when it must escalate to a higher-trust model.
- Set escalation thresholds for agent workflows Document which tasks are safe for the low-cost model and which require specialist reasoning, especially where tool use or external side effects are possible.
- Extend NHI controls to self-hosted model paths If you run open-weight models locally, protect the model endpoint, orchestration layer, and service credentials with the same discipline used for other non-human identities.
What's in the full article
WorkOS's full article covers the operational detail this post intentionally leaves for the source:
- The benchmark comparisons and pricing breakdown that show where the model sits relative to frontier alternatives.
- The author’s own OpenClaw routing setup, including how requests are split between chat, reasoning, and implementation paths.
- The practical trade-offs of running open-weight models locally versus relying on hosted APIs for availability and cost control.
- The performance notes on tokens per second, context window size, and real-world production fit.
👉 Read WorkOS's analysis of MiniMax M2.5 and AI model routing →
MiniMax M2.5 and model routing: what it means for IAM teams?
Explore further
Routing layers are becoming identity control points, not just model optimization layers. Once an AI system decides which model handles each task, it is making an access decision with governance consequences. That decision determines cost, latency, data exposure, and whether a request is escalated into a higher-trust workflow. For IAM teams, the routing layer now sits close enough to privilege boundaries that it must be reviewed like any other control plane.
A few things that frame the scale:
- 28.65 million new hardcoded secrets were detected in public GitHub commits in 2025 alone, a 34% year-over-year increase and the largest single-year jump ever recorded, according to the State of Secrets Sprawl 2026.
- AI-related credential leaks surged 81.5% year-over-year in 2025, with the surrounding AI infrastructure leaking 5x faster than core LLM providers.
A question worth separating out:
Q: How do organisations decide between self-hosted open-weight models and hosted APIs?
A: Organisations should decide based on control requirements, not just price. Self-hosting improves configurability and availability control, but it also shifts responsibility for access management, logging, patching, and lifecycle governance onto the team operating the model path.
👉 Read our full editorial: MiniMax M2.5 shows why model routing now drives AI governance