SGLang deserialization flaws: what IAM and AI teams need to know

NHI Mgmt Group

(@nhi-mgmt-group)

Member Moderator

Joined: 1 year ago

Posts: 12212

Topic starter 10/06/2026 11:29 pm

TL;DR: Orca Security found three unsafe deserialization flaws in SGLang. Two are unauthenticated remote code execution paths that trigger when exposed multimodal or disaggregation features accept network input; the third is a crash-dump replay issue tied to malicious .pkl files. The broader lesson is that AI serving frameworks still treat untrusted bytes as trusted control flow, which makes runtime trust boundaries the real security control.

NHIMG editorial — based on content published by Orca Security: SGLang unsafe deserialization vulnerabilities in AI serving frameworks

Questions worth separating out

Q: What breaks when AI serving frameworks deserialize untrusted network data?

A: The trust boundary collapses before the application can validate the request.
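To make that answer concrete, here is a minimal, deliberately harmless sketch of why pickle collapses the boundary. It is not taken from SGLang or from Orca Security's write-up; the `Payload` and `NoGlobalsUnpickler` names are hypothetical, and `len` stands in for whatever callable an attacker would actually choose. The point is that the sender's bytes decide which function runs, and that happens inside `pickle.loads` before any application-level validation.

```python
import io
import pickle


class Payload:
    # __reduce__ tells pickle how to rebuild the object on load:
    # "call this callable with these arguments". The sender picks both.
    def __reduce__(self):
        return (len, ("abc",))  # benign stand-in for an attacker's callable


class NoGlobalsUnpickler(pickle.Unpickler):
    # Hypothetical hardening sketch: refuse to resolve any global at all,
    # so no callable can be looked up while loading.
    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


blob = pickle.dumps(Payload())

# The receiver only calls loads(), yet len("abc") runs as a side effect.
result = pickle.loads(blob)

try:
    NoGlobalsUnpickler(io.BytesIO(blob)).load()
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(result, blocked)
```

Restricting globals narrows the problem but is easy to get wrong, which is one reason to prefer removing pickle from network paths altogether rather than filtering it.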

Q: Why do AI serving brokers create hidden NHI risk in Kubernetes and cloud environments?

A: Because they are privileged non-human identities that often listen on internal ports and handle sensitive model traffic.

Q: How do security teams know whether an inference stack is exposed to deserialization abuse?

A: Check for any path that accepts external or semi-trusted .pkl, pickle, or equivalent serialized objects, especially in network brokers, replay scripts, job queues, and admin utilities.
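One way to start that check is a static pass over the codebase. The sketch below is an assumption about what a first-pass audit could look like, not a tool from the source: the function name `find_pickle_loads` and the list of risky calls are illustrative. It flags direct `pickle.load` / `pickle.loads` / `pickle.Unpickler` calls and any call to a method named `recv_pyobj` (pyzmq's pickle-based receive helper, included here on the assumption that the stack uses ZeroMQ).

```python
import ast

# (module alias, attribute) pairs treated as pickle entry points in this sketch.
RISKY_CALLS = {("pickle", "load"), ("pickle", "loads"), ("pickle", "Unpickler")}
# Method names that deserialize with pickle under the hood (assumed list).
RISKY_METHODS = {"recv_pyobj"}


def find_pickle_loads(source, filename="<string>"):
    """Return sorted (line, call) pairs for pickle-style deserialization calls."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        func = node.func
        if func.attr in RISKY_METHODS:
            findings.append((node.lineno, func.attr))
        elif isinstance(func.value, ast.Name) and (func.value.id, func.attr) in RISKY_CALLS:
            findings.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(findings)


SAMPLE = '''
import pickle
def handle(sock, path):
    msg = sock.recv_pyobj()
    with open(path, "rb") as fh:
        return pickle.load(fh)
'''

findings = find_pickle_loads(SAMPLE)
for line, call in findings:
    print(f"line {line}: {call}")
```

This only catches the obvious spellings. It misses `from pickle import loads`, aliased imports, and libraries that wrap pickle internally, so treat a clean result as a starting point rather than proof of safety.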

Practitioner guidance

Remove pickle from externally reachable AI paths: Replace pickle-based network deserialization with schema-validated formats such as JSON, msgpack, or Protocol Buffers wherever input can cross a trust boundary.
Constrain broker reachability to trusted interfaces: Bind internal brokers to localhost or tightly segmented private networks, then enforce firewall rules so only known internal clients can reach the port.
Audit every replay and debugging utility for file trust: Treat crash-dump replay scripts and offline loaders as privileged execution paths, and require provenance checks before any .pkl artifact is opened.
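The first and third items can be sketched in a few lines. Everything here is illustrative and assumed rather than drawn from SGLang or the proposed patch: the field names in `ALLOWED_FIELDS`, the `decode_request` and `artifact_is_trusted` helpers, and the idea of a SHA-256 allowlist as the provenance check are all hypothetical stand-ins.

```python
import hashlib
import json

# Hypothetical request schema: exact field names and exact types.
ALLOWED_FIELDS = {"request_id": str, "prompt": str, "max_tokens": int}


def decode_request(raw):
    """Parse untrusted bytes as JSON and enforce a fixed schema.

    JSON parsing yields only plain data (dicts, lists, strings, numbers),
    so no sender-chosen callable can run during decoding.
    """
    obj = json.loads(raw.decode("utf-8"))
    if not isinstance(obj, dict):
        raise ValueError("top-level value must be an object")
    if set(obj) != set(ALLOWED_FIELDS):
        raise ValueError("unexpected or missing fields")
    for name, expected in ALLOWED_FIELDS.items():
        # Exact type match, so True/False are not accepted where int is expected.
        if type(obj[name]) is not expected:
            raise ValueError(f"field {name!r} has the wrong type")
    return obj


def artifact_is_trusted(data, trusted_digests):
    """Provenance gate: accept only bytes whose SHA-256 digest is on an allowlist."""
    return hashlib.sha256(data).hexdigest() in trusted_digests


good = decode_request(b'{"request_id": "r1", "prompt": "hi", "max_tokens": 16}')

dump = b"bytes of a crash dump produced by our own pipeline"
trusted = {hashlib.sha256(dump).hexdigest()}
print(good["max_tokens"], artifact_is_trusted(dump, trusted), artifact_is_trusted(b"other", trusted))
```

The digest check belongs before any call that opens the .pkl file, and the allowlist itself has to come from a source the attacker cannot write to; a signed manifest would serve the same purpose.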

What's in the full article

Orca Security's full research covers the operational detail this post intentionally leaves for the source:

Line-by-line attack flow for CVE-2026-3059 and CVE-2026-3060, including the exact broker and receiver code paths
Proposed patch analysis showing the localhost-binding and msgpack migration approach in more depth
Detection guidance with process- and network-level indicators that help distinguish exploitation from normal inference traffic
Disclosure timeline, affected versions, and the unmerged patch status for teams tracking remediation exposure

👉 Read Orca Security's analysis of SGLang pickle-based RCE paths and AI workload exposure →


Mr NHI

(@mr-nhi)

Member Moderator

Joined: 2 months ago

Posts: 11787

12/06/2026 5:34 am

Unsafe deserialization in AI serving is an NHI governance failure, not just a code defect. The broker, encoder receiver, and replay utility are all non-human identities making trust decisions on behalf of production systems. Once those identities accept untrusted input as executable state, the problem is one of workload trust as well as application hardening. For practitioners, the conclusion is that AI serving paths must be governed as privileged NHI execution surfaces.

A few things that frame the scale:

The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
Organisations maintain an average of 6 distinct secrets manager instances, creating fragmentation that undermines centralised control, according to The State of Secrets in AppSec.

A question worth separating out:

Q: Which frameworks are most relevant when governing unsafe deserialization in AI workloads?

A: OWASP NHI guidance applies because the vulnerable component is a non-human identity with privileged execution. NIST Cybersecurity Framework 2.0 is relevant for asset visibility, protection, detection, and response, while zero trust principles apply to internal service reachability. If the stack uses agentic components, OWASP Agentic AI guidance can help model tool and execution boundaries, but the core issue here is workload trust.

👉 Read our full editorial: SGLang pickle RCE shows how AI serving stacks fail
