TL;DR: AI security is moving toward inference-time exploitation, indirect injection, poisoned MCP tooling, and agent-to-agent propagation, according to Pillar Security. The governing assumption breaks when data becomes executable and agents can chain trusted inputs into privileged actions without runtime validation.
NHIMG editorial — based on content published by Pillar Security: The New AI Attack Surface, 3 AI Security Predictions for 2026
By the numbers:
- 86% of organizations are blind to AI data flows, having no inventory or visibility into where their AI is connected or what data is exposed.
- 97% lack proper AI access controls.
Questions worth separating out
Q: How should security teams govern AI agents that consume untrusted data?
A: Security teams should treat every data source that can influence an AI agent as part of the control boundary.
Q: Why do AI agents create a larger attack surface than ordinary automation?
A: AI agents create a larger attack surface because they can reinterpret inputs, combine context from multiple sources, and choose actions at runtime.
Q: What breaks when an MCP server is compromised?
A: When an MCP server is compromised, the agent may still trust its response as if it were internal policy or approved guidance.
Practitioner guidance
- Map AI data flows before granting production access: Inventory every source that can influence model behaviour, including RAG stores, document repositories, tool outputs, Slack channels, and API feeds.
- Separate read influence from write authority: Do not allow the same agent chain to both consume untrusted context and commit privileged changes.
- Validate MCP outputs before agents can use them: Treat tool responses as untrusted until they are checked against policy, package allowlists, and expected intent.
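The last practice above can be sketched in a few lines of Python. This is a minimal illustration, not Pillar Security's method or any MCP SDK's API: the `APPROVED_PACKAGES` set, the `Verdict` type, and the `validate_tool_response` function are all hypothetical names, and the check covers only `pip install` commands embedded in a tool response.

```python
import re
from dataclasses import dataclass

# Hypothetical allowlist; a real deployment would load this from policy.
APPROVED_PACKAGES = {"requests", "numpy", "pydantic"}

# Matches `pip install <names>` commands embedded in tool output.
PIP_INSTALL = re.compile(r"pip install\s+([^\n;&|]+)")


@dataclass
class Verdict:
    allowed: bool
    reasons: list[str]


def validate_tool_response(text: str) -> Verdict:
    """Check a tool response before it is added to the agent's context."""
    reasons = []
    for match in PIP_INSTALL.finditer(text):
        for token in match.group(1).split():
            if token.startswith("-"):
                continue  # skip flags; their arguments are still checked
            # Strip version specifiers and extras, e.g. "requests==2.31.0".
            name = re.split(r"[=<>!~\[]", token, maxsplit=1)[0].lower()
            if name not in APPROVED_PACKAGES:
                reasons.append(f"package not on allowlist: {name}")
    return Verdict(allowed=not reasons, reasons=reasons)


if __name__ == "__main__":
    # A typosquatted package name is rejected before the agent sees it.
    print(validate_tool_response("Run pip install reqeusts numpy"))
```

The check errs on the side of rejection: anything following `pip install` that is not a flag and not on the allowlist blocks the response. A production control would also need to cover other package managers and the policy and intent checks named above.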
What's in the full article
Pillar Security's full opinion piece covers the operational detail this post intentionally leaves for the source:
- The article’s three prediction scenarios and the reasoning behind each threat pattern.
- The CFS model explanation for why indirect injection succeeds or fails in practice.
- The full MCP poisoning example showing how a compromised tool can alter code-generation outcomes.
- The article’s final runtime-security recommendations for AI discovery and attack-path visibility.
👉 Read Pillar Security's analysis of AI agent attack surfaces and runtime exploitation →
AI agent runtime risk is growing faster than code controls
Explore further
Inference-time exploitation is the right name for this threat class. The core failure is not code weakness but decision manipulation at runtime, where trusted data becomes executable behaviour. That moves security from static application control into the identity and governance of the system that interprets the data. Practitioners should treat this as a runtime trust problem, not a model tuning problem.
A few things that frame the scale:
- 96% of technology professionals identify AI agents as a growing security threat, and 66% believe this risk is immediate, according to the AI Agents: The New Attack Surface report.
- Another finding from the same research shows that 80% of organisations report their AI agents have already performed actions beyond their intended scope, including unauthorised system access, sensitive data sharing, and credential exposure.
A question worth separating out:
Q: Who is accountable when an AI agent follows malicious instructions from a trusted source?
A: Accountability sits with the organisation operating the agent, the team governing its access, and the owners of the data or tool path that allowed the instruction to be acted on. AI governance has to cover provenance, delegated authority, and runtime approval boundaries, otherwise blame gets assigned after the damage is already done.
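One way to picture provenance, delegated authority, and a runtime approval boundary working together is the Python sketch below. It is an assumption-laden illustration rather than a reference design: `Trust`, `ContextItem`, `Action`, `Decision`, and `decide` are hypothetical names, and the source labels are invented for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Trust(Enum):
    INTERNAL = "internal"    # authored and reviewed inside the organisation
    UNTRUSTED = "untrusted"  # web pages, inbound email, third-party tool output


@dataclass(frozen=True)
class ContextItem:
    source: str  # hypothetical label, e.g. "rag:wiki" or "mcp:docs-server"
    trust: Trust
    text: str


@dataclass(frozen=True)
class Action:
    name: str
    privileged: bool  # True for writes, deploys, or credential use


@dataclass(frozen=True)
class Decision:
    outcome: str                      # "allow" or "needs_approval"
    tainted_sources: tuple[str, ...]  # provenance recorded for later review
    approver: Optional[str]


def decide(action: Action, context: list[ContextItem],
           approver: Optional[str] = None) -> Decision:
    """Gate privileged actions when untrusted context could have influenced them."""
    tainted = tuple(c.source for c in context if c.trust is Trust.UNTRUSTED)
    if not action.privileged or not tainted:
        return Decision("allow", tainted, None)
    if approver is None:
        # Untrusted input plus write authority: stop and ask a human.
        return Decision("needs_approval", tainted, None)
    # A named approver accepts responsibility; the decision records who and why.
    return Decision("allow", tainted, approver)
```

Because every decision carries the sources that influenced it and the name of any approver, the record of who authorised what exists before the action runs rather than being reconstructed afterwards.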
👉 Read our full editorial: AI agent attack surfaces are shifting from code to runtime behavior