By NHI Mgmt Group Editorial TeamPublished 2026-04-24Domain: AnnouncementsSource: Pillar Security

TL;DR: Pillar says its runtime AI protection can be wired into TrueFoundry’s AI Gateway so every request and response is inspected in real time for prompt injection, jailbreaks, sensitive data, and evasion, with verdicts logged for audit and incident response. The control matters because agentic workflows break single-turn scanning assumptions and need session-aware enforcement.


At a glance

What this is: This is a product integration that adds runtime AI inspection to a gateway, with the key finding that session-aware guardrails are needed for agentic workloads.

Why it matters: It matters because IAM, NHI, and AI governance teams need policy enforcement that travels with every model call, not just application-layer controls that miss multi-turn abuse.

👉 Read Pillar Security's post on runtime AI protection in the TrueFoundry gateway


Context

Runtime AI protection is the control layer that inspects prompts and outputs as they move through an AI gateway, rather than relying only on application code or model-side safety features. The problem it addresses is simple: agentic and chat-based systems now mix model routing, tool use, retrieval, and state, which makes single-message inspection too narrow for production governance.

For identity and access teams, the question is no longer whether the model is safe in isolation. It is whether every model call, every routed response, and every session trace can be governed in a way that supports audit, policy enforcement, and incident review across NHI and agentic AI environments.


Key questions

Q: How should security teams govern AI requests at the gateway level?

A: Security teams should enforce prompt and response checks at a shared gateway so policy is applied before model output reaches applications. That approach gives consistent inspection, centralised logging, and a single place to block unsafe requests across models, teams, and environments.

Q: Why do agentic AI workflows need session-aware guardrails?

A: Agentic workflows chain tool calls, retrieval, and state across multiple turns, so a single prompt rarely captures the full risk. Session-aware guardrails can detect indirect injection, hidden instructions, and context drift that only become obvious when the conversation is evaluated as one governed interaction.

Q: What do security teams get wrong about prompt injection controls?

A: They often treat prompt injection as a message-level content problem, when the real issue is governance across the whole interaction. If policy only looks at one turn, attackers can hide malicious instructions in retrieved documents, tool output, or earlier conversation state.

Q: How can teams balance AI protection with rollout speed?

A: Use scoped policy profiles so production routes receive strict controls while internal prototypes can start with lighter guardrails. That lets teams deploy protection without forcing every environment into the same risk posture or slowing initial adoption.


How it works in practice

Session-aware runtime inspection for agentic AI

Runtime inspection at the gateway sits between the application and the model provider, scanning prompts before they are sent and responses before they return. That matters because agentic workflows accumulate context across turns, retrieve external content, and pass tool outputs back into the conversation. A single-turn classifier can miss indirect prompt injection hidden in a document, while session-aware analysis can see how malicious instructions persist across the full exchange. The technical shift is from message-level filtering to conversation-level enforcement, with policy applied uniformly regardless of which downstream model handled the request.

Practical implication: teams need controls that evaluate the full session, not just isolated prompts, when agents can chain tools and memory.

Gateway-enforced policy for prompts, responses, and sensitive data

The architecture described here applies the same verdict path to input and output scans, then either allows, blocks, or redacts before the application sees the result. That is a stronger model than simple content moderation because it can stop secrets, credentials, and regulated data from flowing into prompts or leaking back out through completions. It also centralises policy in the gateway, which reduces the need to embed bespoke detection logic inside each application. For identity programmes, this is a governance pattern: the control point sits where model access is mediated, not where developers remember to add checks.

Practical implication: place scanning at the model access boundary so sensitive-data controls are enforced consistently across applications and teams.

Audit logging across AI request paths

The integration logs every scan, verdict, and decision on both sides of the gateway, creating evidence for audit, tuning, and incident response. In practice, that means defenders can reconstruct which prompt was inspected, what policy fired, and whether a request was blocked or passed through. This is especially important for AI workloads because model interaction trails often span multiple providers and routes. Without shared logs, policy decisions become invisible, and compliance teams are left with partial traces that are hard to interpret after an incident or access review.

Practical implication: require end-to-end logging for model requests and responses so audit and response teams can reconstruct AI activity.


NHI Mgmt Group analysis

Runtime AI protection is becoming a gateway function, not an application afterthought. The integration pattern matters because it places inspection at the traffic layer where model access is mediated, rather than asking each development team to recreate the same guardrails. That shifts AI security from scattered code controls to a policy boundary that can be governed consistently. For practitioners, the implication is that gateway placement is now part of the identity and access design, not just an AI tooling decision.

Session-aware inspection is the right answer to multi-turn abuse, but it also exposes a deeper governance gap. Single-message controls were designed for bounded interactions, not agentic workflows that carry state, retrieve content, and combine tool outputs over time. The failure mode is not just missed detection. It is the assumption that each request is independently meaningful when the real attack surface emerges across the conversation. Practitioners should treat conversation context as a governed object, not an incidental log line.

AI request logging is now an audit requirement, not just an operational convenience. Once prompt, response, and verdict data are split across gateway and model paths, you need unified evidence to answer who saw what, when, and under which policy. That becomes central to incident response, policy tuning, and compliance review. For identity teams, the practical conclusion is that AI traceability belongs in the same control conversation as access logging and entitlement review.

Policy scope by team, route, and environment reflects where AI governance is heading. The article shows a control model that can be tightened for production agents and relaxed for prototyping, which is the right direction for risk-based governance. That does not remove the need for standards, but it does show that one-size-fits-all controls will not scale across enterprise AI estates. Practitioners should expect AI governance to fragment by workload risk, while still requiring central oversight.

From our research:

  • 98% of companies plan to deploy even more AI agents within the next 12 months, despite documented rogue behaviour in 80% of current deployments, according to AI Agents: The New Attack Surface report.
  • Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.
  • That gap is why readers should also review OWASP Agentic Applications Top 10 for the control failures that surface once agent behaviour becomes part of the threat model.

What this signals

With 33% of organisations already reporting AI agents accessing inappropriate or sensitive data beyond their intended scope, gateway-level inspection is shifting from optional hardening to baseline governance. The practical signal is that teams will need policy scopes that vary by route, model, and risk tier rather than a single global rule set.

Conversation-burst governance: the next control problem is not just whether an AI request is allowed, but whether the entire session can be explained, audited, and constrained before the model compounds a small exposure into a broader incident. That pushes identity and security programmes toward traceable model access boundaries, not just model selection.


For practitioners

  • Place guardrails at the gateway boundary Route model traffic through a shared control point so prompts and responses are inspected before they reach the application or user. Keep policy enforcement outside individual service code so teams do not implement inconsistent checks across products.
  • Require full-session inspection for agentic workflows Treat multi-turn conversation state, retrieved content, and tool outputs as one governed session. Use policies that evaluate context across the exchange so indirect injection is not judged only on the last prompt.
  • Log scans, verdicts, and responses for audit Preserve the request path, policy decision, and blocked or allowed outcome so security and compliance teams can reconstruct what happened after an incident. Combine those records with per-request tracing from the gateway.
  • Scope stricter policies to production agents Apply tighter controls to customer-facing or high-risk routes, and use separate policy profiles for internal prototypes. That keeps experimentation possible while preventing weak defaults from spreading into production.

Key takeaways

  • Gateway-based runtime protection moves AI governance closer to the point where model access is actually mediated.
  • Session-aware inspection is necessary because agentic systems create risk across multiple turns, not just in isolated prompts.
  • Centralised logs and scoped policies give security teams a practical path to audit, tune, and separate production risk from prototype usage.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 address the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

FrameworkControl / ReferenceRelevance
OWASP Agentic AI Top 10NHI-01Covers prompt injection and tool abuse in routed agentic flows.
NIST CSF 2.0PR.DS-5Sensitive data detection and redaction map to protecting data in transit.
NIST Zero Trust (SP 800-207)PR.AC-4Gateway enforcement aligns with mediation of access to model services.

Inspect prompts and tool outputs at the gateway so agent sessions cannot bypass policy.


Key terms

  • Runtime AI protection: Runtime AI protection is the inspection and enforcement layer applied while an AI system is actively processing requests. It evaluates prompts, tool outputs, and model responses in real time so unsafe content can be blocked, redacted, or logged before it affects users or downstream systems.
  • AI gateway: An AI gateway is the control point that sits between applications and one or more model providers. It routes requests, applies policy, and records activity, making it the natural place to centralise guardrails, tracing, and access governance across multiple models and teams.
  • Session-aware detection: Session-aware detection evaluates the full conversation or interaction sequence instead of judging each prompt in isolation. It is essential for agentic and multi-turn workflows because malicious intent can be spread across turns, retrieved content, or tool responses that look harmless on their own.
  • Indirect prompt injection: Indirect prompt injection is a technique where malicious instructions are hidden in external content such as documents, web pages, or tool output. The model or agent consumes that content as context, which can cause it to follow attacker instructions without the user typing them directly.

Deepen your knowledge

AI gateway guardrails and session-aware inspection are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are defining controls for agentic workloads and routed model traffic, the course gives you a practical governance baseline.

This post draws on content published by Pillar Security: Pillar + TrueFoundry: Runtime AI Protection, Built Into the Gateway. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-04-24.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org