Agentic chat shifts Venice toward tool-using AI workflows

By NHI Mgmt Group Editorial Team | Published 2026-05-19 | Domain: Announcements | Source: Venice

TL;DR: According to Venice, its chat now breaks tasks into steps, selects tools and models automatically, and chains text, image, video, search, and file analysis in one conversation. The company says Kimi K2.5 runs privately on its infrastructure and retains no server-side history. That changes the governance question from single-model output quality to runtime tool use, model selection, and privacy boundary control.

At a glance

What this is: Venice has made agentic chat its default, using tool selection and stepwise execution to handle multimodal tasks in a single conversation.

Why it matters: IAM and security teams need to understand how runtime tool choice, model switching, and conversation-state handling change access, privacy, and governance assumptions across agentic AI, NHI, and human workflows.

Context

Agentic chat changes the security model because the system no longer behaves like a single-response interface. Instead, it can choose tools, chain actions, and switch models within a session, which means the governance problem moves from prompt handling to runtime decision boundaries and data-flow control across the conversation.

For identity teams, the key issue is not whether the interface feels more capable. It is whether access, retention, delegation, and privacy controls still match a workflow that can search, generate, edit, and render content without a human choosing every step. That matters for agentic AI governance, workload identity, and the trust assumptions behind session-scoped access.

Key questions

Q: How should security teams govern agentic chat tools that can search, create, and render content in one session?

A: Treat each tool as a separate permission boundary, not as a feature bundle. Security teams should define which tools are available, which data classes each tool may touch, and what logging or approval applies before the agent can move from reasoning to action. The goal is to control the execution path, not only the final output.

Q: What changes when an AI chat system can switch between different models mid-conversation?

A: Model switching turns routing into a governance decision because different models may receive different context, retain different records, or sit behind different providers. Teams should decide which transitions are allowed, what minimum context each destination model gets, and whether a switch changes the privacy or compliance posture of the session.

Q: What breaks when conversation state is spread across local storage, proxies, and external model calls?

A: Auditability and retention controls become inconsistent when no single system owns the full conversation path. A session can no longer be treated as a simple server-side record, so teams need explicit rules for where state lives, how it moves, and which actors can reconstruct it later.

Q: What should organisations do before deploying agentic chat as the default interface?

A: They should perform a control review of the tools, models, and data paths the agent can reach, then align approvals to the highest-risk action in the chain. That means defining guardrails for search, file handling, image generation, video rendering, and provider switching before broad rollout.
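
As a rough illustration of aligning approvals to the highest-risk action in a chain, the sketch below rates each action and returns the strictest tier found. The risk ratings and the name `required_approval` are assumptions made for this example, not values published by Venice; a real control review would set them per organisation.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical ratings for illustration only; set these in your own control review.
ACTION_RISK = {
    "search": Risk.LOW,
    "image_generation": Risk.LOW,
    "video_rendering": Risk.MEDIUM,
    "file_handling": Risk.HIGH,
    "provider_switching": Risk.HIGH,
}


def required_approval(chain: list[str]) -> Risk:
    """Return the approval tier for a chain: the highest risk of any step.

    Actions missing from ACTION_RISK are rated HIGH so that unreviewed
    tools fail closed rather than slipping through at a low tier.
    """
    if not chain:
        return Risk.LOW
    return max(ACTION_RISK.get(action, Risk.HIGH) for action in chain)
```

Under these assumed ratings, a chain of search followed by file handling would need the HIGH tier even though search alone would not.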

How it works in practice

Runtime tool selection in agentic chat

Agentic chat is a control loop, not a single inference call. The agent receives a set of available tools, reasons about the task, chooses a tool, observes the result, and then decides whether to continue. That pattern matters because the security boundary moves from the model output to the sequence of tool calls, each with its own permissions, data exposure, and side effects. In practical terms, the risk is not just what the model says, but what it is allowed to do next with web search, file parsing, image generation, or video rendering.

Practical implication: inventory which tools are available to the agent at runtime and treat each as a separate control point.
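
One way to picture each tool as its own control point is a gate that the control loop must pass before every tool call. This is a minimal sketch under stated assumptions: `ToolPolicy`, `ToolGate`, the tool names, and the data classes are hypothetical and do not describe Venice's implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolPolicy:
    allowed_data_classes: frozenset[str]  # data classes this tool may touch
    requires_approval: bool = False       # whether a human must approve first


@dataclass
class ToolGate:
    policies: dict[str, ToolPolicy]
    audit_log: list[tuple[str, str, str]] = field(default_factory=list)

    def authorize(self, tool: str, data_class: str, approved: bool = False) -> bool:
        """Decide one tool call and log the decision, whether allowed or denied."""
        policy = self.policies.get(tool)
        if policy is None:
            decision = "deny:unknown_tool"      # tools outside the inventory fail closed
        elif data_class not in policy.allowed_data_classes:
            decision = "deny:data_class"
        elif policy.requires_approval and not approved:
            decision = "deny:needs_approval"
        else:
            decision = "allow"
        self.audit_log.append((tool, data_class, decision))
        return decision == "allow"


# Hypothetical inventory: two tools with different boundaries.
gate = ToolGate({
    "web_search": ToolPolicy(frozenset({"public"})),
    "file_parsing": ToolPolicy(frozenset({"public", "internal"}), requires_approval=True),
})
```

The design choice here is that the gate sits between reasoning and action, so the audit log records the execution path rather than only the final output.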

Model routing and privilege separation

The platform can select from multiple models based on task type, subscription tier, and user settings, and it can switch between free and premium model pools. That creates a routing layer above the model itself, which is an identity and access decision as much as a product feature. If the system can move from one model to another mid-session, the trust boundary changes. That is especially important when different providers receive different amounts of context, because model choice becomes part of data minimisation, logging exposure, and delegated access control.

Practical implication: define which model transitions are permitted, and review whether each model receives only the minimum context required.
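
A permitted-transition policy with context minimisation could be sketched as follows. The model names, provider labels, and context fields are invented for illustration and should not be read as Venice's routing rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    provider: str                   # "internal" = the platform's own infrastructure
    context_fields: frozenset[str]  # the minimum context this model may receive


# Hypothetical model pool.
MODELS = {
    "private-default": ModelInfo("internal", frozenset({"prompt", "history", "files"})),
    "external-premium": ModelInfo("third_party", frozenset({"prompt"})),
    "external-free": ModelInfo("third_party", frozenset({"prompt"})),
}

# Hypothetical allowlist: switches not listed here are refused.
ALLOWED_TRANSITIONS = {
    ("private-default", "external-premium"),
    ("external-premium", "private-default"),
}


def switch_model(src: str, dst: str, context: dict) -> dict:
    """Refuse unlisted transitions, then pass on only the fields dst may see."""
    if (src, dst) not in ALLOWED_TRANSITIONS:
        raise PermissionError(f"transition {src} -> {dst} is not permitted")
    allowed = MODELS[dst].context_fields
    return {key: value for key, value in context.items() if key in allowed}
```

In this sketch a switch to the external model forwards only the prompt, so history and files stay behind unless the policy is changed deliberately.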

Conversation privacy and session state

Venice describes a privacy model where default conversations on Kimi K2.5 remain on its infrastructure without server-side retention, while some external model calls can be proxied with stripped metadata. The governance issue is that privacy is no longer just about encryption in transit or storage policy. It is about where state lives, when it persists, and whether a session can be reconstructed elsewhere if the user changes model mid-conversation. For identity teams, this is a lifecycle question for conversation state and delegated context, not only a transport question.

Practical implication: document where session state is stored, when it leaves the platform boundary, and who can reconstruct it later.
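
To make that documentation concrete, a team might keep a simple ledger of where state sits at each step. The sketch below is an assumption-laden example: the location labels, the `PLATFORM_BOUNDARY` set, and the `StateLedger` class are hypothetical and are not a description of Venice's architecture.

```python
from dataclasses import dataclass, field

# Hypothetical set of locations treated as inside the platform boundary.
PLATFORM_BOUNDARY = {"browser_local", "platform"}


@dataclass(frozen=True)
class StateRecord:
    step: int
    location: str   # e.g. "browser_local", "platform", "external_proxy"
    retained: bool  # whether that location keeps the data after the call


@dataclass
class StateLedger:
    records: list[StateRecord] = field(default_factory=list)

    def record(self, location: str, retained: bool) -> None:
        """Append one step, numbering steps from 1 in the order they occur."""
        self.records.append(StateRecord(len(self.records) + 1, location, retained))

    def boundary_crossings(self) -> list[StateRecord]:
        """Steps where state left the platform boundary."""
        return [r for r in self.records if r.location not in PLATFORM_BOUNDARY]

    def reconstructable_at(self) -> list[str]:
        """Locations that retained state and could rebuild part of the session."""
        return sorted({r.location for r in self.records if r.retained})
```

A ledger like this answers the three questions in order: where state is stored, when it leaves the boundary, and which locations could reconstruct it later.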

NHI Mgmt Group analysis

Agentic chat turns runtime tool use into an identity governance problem. Once the system can search, generate, edit, and render content in one loop, access is no longer a static permission set assigned to a single model call. The operational question becomes which tools an agent may invoke, in what sequence, and under what context. That is a governance shift from output review to action-path control, and practitioners should treat the tool chain as the primary security surface.

Identity does not stay singular across multimodal agent sessions: the same conversation may touch internal infrastructure, third-party model endpoints, browser state, and file processing in one flow. The single-identity assumption was designed for one model, one response, one boundary. It fails when the actor can route itself across tools and providers mid-session, because context becomes distributed and the trust boundary fragments. The implication is that identity and access programmes must rethink how delegated execution is bounded across a single user session.

Privacy claims must be evaluated at the session boundary, not the marketing boundary. Venice's announcement describes zero-retention behaviour for some modes and proxying for others, which means the practical control question is where conversation state resides at each step. That is a lifecycle and data-handling issue as much as an AI issue. Practitioners should read this as a reminder that retention, proxying, and model switching are governance decisions, not just UI settings.

Model selection is now part of the security architecture. When the platform can route between free, premium, and external models, the model pool itself becomes an access decision. That changes how teams should think about approved providers, data classification, and session-level approvals. The lesson for identity leaders is that model routing is a policy surface, not an implementation detail.

Named concept: runtime delegation drift. This is the gap that appears when an agent can move across tools, models, and providers without a stable review point. The control assumption that one session maps to one governed execution path no longer holds. Practitioners should use this concept to test where their current policy language still assumes a single, bounded actor.
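
One possible way to test for runtime delegation drift is to compare the path a session actually took against the scope that was approved at the start. The sketch below assumes that each step can be logged as a tool, model, and provider triple; `Step`, `ApprovedScope`, and `find_drift` are hypothetical names introduced for this example.

```python
from typing import NamedTuple


class Step(NamedTuple):
    tool: str
    model: str
    provider: str


class ApprovedScope(NamedTuple):
    tools: frozenset
    models: frozenset
    providers: frozenset


def find_drift(path: list[Step], scope: ApprovedScope) -> list[tuple[int, str, str]]:
    """Return (step index, dimension, value) for each value outside the approved scope."""
    findings = []
    for index, step in enumerate(path):
        for dimension in ("tool", "model", "provider"):
            value = getattr(step, dimension)
            # Scope fields are the plural of the step fields: tools, models, providers.
            if value not in getattr(scope, dimension + "s"):
                findings.append((index, dimension, value))
    return findings
```

An empty result means the session stayed on a governed path; any finding marks the first point where a review would have been needed.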

From our research:
96% of technology professionals identify AI agents as a growing security threat, and 66% believe this risk is immediate, according to the AI Agents: The New Attack Surface report.
92% agree governing AI agents is critical to enterprise security, yet only 44% have implemented any policies to do so.
The governance gap is now an operational one, and the OWASP Agentic Applications Top 10 provides the control lens teams should use next.

What this signals

Runtime delegation drift: the real issue is not whether an AI interface can do more, but whether your governance model can follow it across tools, models, and providers in the same session. When execution paths are assembled dynamically, review cadences that assume a fixed workflow lose visibility at the exact point where control matters most.

Teams should expect policy work to move from model approval to action-path approval. That means defining who can invoke search, file processing, image generation, and video rendering, then deciding which of those actions can occur without human review. The programme signal is clear: agentic AI is becoming a control-plane issue, not just an application feature.

For practitioners

Map the agent tool chain by permission boundary. List every tool the chat agent can call, including search, file analysis, image generation, and video rendering, and assign an owner, a data classification, and a logging requirement to each one.
Define allowed model transitions. Specify which model switches are permitted within a session, what context each destination model can receive, and whether cross-provider switching requires a policy review.
Separate privacy claims from retention controls. Document where conversation state is stored, whether any part leaves the platform boundary, and how proxying or zero-retention claims are verified in practice.
Treat multimodal outputs as governed side effects. Apply approval and review rules to image, video, and file-processing actions, because the security impact comes from what the agent can create or transmit after it reasons about the task.
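
The first step above, mapping the tool chain, lends itself to a simple completeness check. This sketch assumes the inventory is kept as a dictionary keyed by tool name; the field names and the function `inventory_gaps` are illustrative choices, not an established schema.

```python
# Fields every inventory entry is expected to carry (assumed for this example).
REQUIRED_FIELDS = ("owner", "data_classification", "logging")


def inventory_gaps(inventory: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """Map each tool to the required fields that are missing or empty."""
    gaps = {}
    for tool, entry in inventory.items():
        missing = [name for name in REQUIRED_FIELDS if not entry.get(name)]
        if missing:
            gaps[tool] = missing
    return gaps
```

Running a check like this before rollout shows which tools still lack an owner, a classification, or a logging requirement.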

Key takeaways

Venice’s default chat shifts the security question from prompt quality to runtime control over tool use, model routing, and session state.
Agentic workflows expand the attack and governance surface because a single conversation can now touch multiple tools, providers, and data paths.
Practitioners should map permissions, model transitions, and retention boundaries before treating agentic chat as a default interface.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST AI RMF and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A1	Agentic tool use and model routing are central to this post.
NIST AI RMF		AI governance and accountability apply to dynamic session behaviour.
OWASP Non-Human Identity Top 10	NHI-01	The agent behaves like a non-human identity with delegated access to tools and data.
NIST CSF 2.0	PR.AA-05 (PR.AC-4 in CSF 1.1)	Access control and least privilege are directly affected by tool delegation.

Inventory the agent as an NHI, then constrain its permissions, context, and lifecycle like any other privileged identity.

Key terms

Agentic Chat: A chat interface that can decide how to complete a request by selecting tools, sequencing steps, and adapting its behaviour during the session. In governance terms, it behaves like a delegated non-human actor, so access, logging, and approvals must follow the execution path rather than the prompt alone.
Model Routing: The process of directing a request to one of several available models or providers based on task, settings, cost, or policy. For identity teams, routing is a control decision because different destinations can change what data is exposed, where state lives, and how accountability is assigned.
Session State: The information that persists across an interaction, including prompts, outputs, files, and intermediate context. In agentic systems, session state is a governance boundary because persistence, transfer, and reconstruction determine whether the conversation is auditable and whether data leaves the original trust domain.
Runtime Delegation Drift: A condition where an agent shifts across tools, models, or providers during a single session in ways that outgrow the original governance assumptions. The drift matters because the approved actor at the start of the session is no longer the only actor that can influence the final outcome.

Deepen your knowledge

Agentic chat governance is covered in the NHI Foundation Level course, the industry's only accredited NHI security programme. If you are defining controls for tool-using AI systems, it is a relevant starting point.

This post draws on content published by Venice: agentic chat now defaults for all users. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-05-19.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org