What should teams do when AI tools are used in security operations?

Why This Matters for Security Teams

AI-assisted security operations can improve speed, but speed is exactly where teams get into trouble. Once an AI tool is allowed to suggest triage steps, containment actions, or investigation paths, operators can start treating output as authoritative instead of advisory. That creates a real governance problem: the tool is not just reading data, it is shaping operational decisions, sometimes with access to sensitive telemetry, incidents, and secrets. Current guidance from the NIST Cybersecurity Framework 2.0 still points teams toward risk-based governance, but AI requires tighter guardrails around what the system can see and what it can influence.

For NHIMG readers, the issue is not whether AI is useful. It is whether security teams can prevent over-trust, data leakage, and unsafe automation when AI sits inside alerting, hunting, and response workflows. The pattern described in the State of Non-Human Identity Security also appears here: poor control over machine-driven access tends to become visible only after something has already been misused. In practice, many security teams discover unsafe AI-assisted actions only after a rushed analyst accepts a bad recommendation or a response workflow executes against the wrong context.

How It Works in Practice

The safest operating model is to treat AI in security operations as a bounded assistant, not a decision-maker. That means defining exactly which workflows may use AI, which data sources it can inspect, and which actions remain human-only. Security teams should separate read access from write access, because a model that can summarise incidents does not need permission to trigger containment, modify tickets, or query every sensitive log source.

Practical controls usually include:

Allowlisting specific use cases such as alert summarisation, enrichment, or draft report generation.

Blocking access to secrets, privileged session data, and highly sensitive operational context unless there is a clear business need.

Requiring human verification before any remediation, escalation, or enforcement action.

Logging prompts, outputs, and operator approvals so AI-assisted decisions can be audited later.

Using policy-as-code so access and action limits are enforced at request time, not only in documentation.
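As a rough illustration of how the controls above could be enforced at request time, the following Python sketch checks each AI tool request against an allowlist, a blocklist of data sources, and a human-approval requirement. The use-case names, data-source names, action names, and the Request and evaluate identifiers are all hypothetical; a real deployment would more likely express this in a dedicated policy engine.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical policy values, chosen only for illustration.
ALLOWED_USE_CASES = {"alert_summarisation", "enrichment", "draft_report"}
BLOCKED_DATA_SOURCES = {"secrets_vault", "privileged_sessions"}
WRITE_ACTIONS = {"contain_host", "modify_ticket", "escalate"}


@dataclass(frozen=True)
class Request:
    use_case: str
    data_sources: frozenset
    action: Optional[str] = None   # None means the request is read-only
    human_approved: bool = False


def evaluate(req: Request) -> Tuple[bool, str]:
    """Return (allowed, reason) for a single AI tool request."""
    if req.use_case not in ALLOWED_USE_CASES:
        return False, "use case not allowlisted"
    blocked = req.data_sources & BLOCKED_DATA_SOURCES
    if blocked:
        return False, f"blocked data source: {sorted(blocked)[0]}"
    if req.action is not None:
        if req.action not in WRITE_ACTIONS:
            return False, "unknown action"
        if not req.human_approved:
            return False, "human approval required"
    return True, "ok"
```

The check denies by default: anything not explicitly allowlisted is refused, and any state-changing action is refused until an approval flag is present.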

Where AI tools participate in security workflows, workload identity matters as much as user identity. Teams should prefer short-lived, scoped credentials and explicit service identities over shared tokens or persistent API keys. That approach aligns with emerging workload identity patterns used in agentic systems and avoids giving an assistant broad standing access simply because it is convenient. NHIMG’s research on the State of Secrets in AppSec shows why this matters: secrets sprawl and weak handling practices make every extra integration a potential exposure point.
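To make the idea of short-lived, scoped credentials concrete, here is a minimal in-memory sketch. It is not a real token service: the ScopedCredential, issue, and authorize names, the scope strings, and the five-minute default lifetime are assumptions made for illustration, and production systems would normally rely on an existing workload identity or secrets platform.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedCredential:
    service_identity: str   # explicit identity for the AI tool, not a shared account
    scopes: frozenset       # e.g. {"alerts:read"} (hypothetical scope naming)
    expires_at: float       # Unix timestamp after which the credential is invalid
    token: str


def issue(service_identity, scopes, ttl_seconds=300, now=None):
    """Mint a credential that expires ttl_seconds after issuance."""
    now = time.time() if now is None else now
    return ScopedCredential(
        service_identity, frozenset(scopes), now + ttl_seconds, secrets.token_urlsafe(32)
    )


def authorize(cred, required_scope, now=None):
    """Allow only unexpired credentials that carry the requested scope."""
    now = time.time() if now is None else now
    return now < cred.expires_at and required_scope in cred.scopes
```

For example, a credential issued with only "alerts:read" would fail an authorize check for "containment:write", and would fail every check once its lifetime has elapsed.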

Current best practice is to evaluate AI output in the context of the workflow, not the wording of the response alone. That means the model can propose, but the operator and policy engine must decide. These controls tend to break down when teams connect AI tools directly to ticketing, SIEM, or response automation with broad read-write permissions and no enforced approval step.
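The propose-then-decide split is only auditable if each proposal and decision is recorded together. The sketch below builds one audit entry per decision and chains entries by hash so that later tampering is easier to detect. The field names and the choice to store SHA-256 digests rather than raw prompt text are assumptions of this example; teams that need the full text would keep it in a separately access-controlled store.

```python
import hashlib
import json
import time


def audit_record(prompt, output, operator, decision, prev_hash="", now=None):
    """Build one audit entry linking a model proposal to an operator decision."""
    if decision not in {"approved", "rejected"}:
        raise ValueError("decision must be 'approved' or 'rejected'")
    entry = {
        "ts": time.time() if now is None else now,
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode()).hexdigest(),
        "operator": operator,
        "decision": decision,
        "prev_hash": prev_hash,   # hash of the previous entry, empty for the first
    }
    # Hash the entry contents deterministically, then attach the digest.
    payload = json.dumps(entry, sort_keys=True).encode()
    entry["entry_hash"] = hashlib.sha256(payload).hexdigest()
    return entry
```

Passing each entry's entry_hash as the next entry's prev_hash produces a simple chain that an auditor can re-verify from the first record onward.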

Common Variations and Edge Cases

Tighter AI controls often increase operational overhead, requiring organisations to balance analyst speed against verification, logging, and access restrictions. That tradeoff becomes sharper in high-volume SOC environments, where teams want automation to reduce alert fatigue but still need defensible decision-making. There is no universal standard for this yet, so current guidance suggests starting with low-risk tasks and expanding only after controls prove reliable.

Some teams use AI only for drafting summaries, while others allow retrieval over internal knowledge bases or incident history. The latter can be useful, but it introduces a sensitive-data problem: once the model can see investigations, threat intel, or privileged context, prompt leakage and accidental overexposure become realistic risks. This is especially true when shared prompts, broad connectors, or unmanaged browser-based copilots are involved.
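One way to reduce accidental overexposure is to redact obvious secrets from retrieved text before it reaches the model. The following sketch shows the shape of such a filter. The three patterns are illustrative assumptions and far from complete; a real deployment would need a maintained secret-detection ruleset and should not treat redaction as a substitute for restricting connectors.

```python
import re

# Illustrative patterns only, not a complete ruleset.
PATTERNS = [
    ("aws_access_key_id", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("bearer_token", re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]+=*")),
    ("password_assignment", re.compile(r"(?i)\b(password|passwd|pwd)\s*[=:]\s*\S+")),
]


def redact(text):
    """Replace matches with a labelled placeholder and report which rules fired."""
    fired = []
    for label, pattern in PATTERNS:
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        if count:
            fired.append(label)
    return text, fired
```

Returning the list of rules that fired lets the calling workflow log the event or block the request outright instead of silently forwarding redacted context.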

Another edge case is semi-autonomous response. If an AI tool can recommend actions but a separate automation layer can execute them, governance must cover both layers. Teams should not assume that a human review step somewhere in the process is enough unless the approval is technically enforced. Best practice is evolving, but the safest model remains narrow scope, least privilege, and explicit approval for any action that changes systems or containment state.
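A technically enforced approval can be sketched as a signature that the automation layer verifies before acting. In this example an approval service signs the exact approver, action, and target, and the executor refuses anything that does not match. The function names, the message format, and the shared-key arrangement are assumptions for illustration; key distribution, expiry, and replay protection are deliberately left out.

```python
import hashlib
import hmac


def sign_approval(key: bytes, approver: str, action: str, target: str) -> str:
    """Approval service: sign exactly what the human approved."""
    # Assumes field values never contain "|"; a real format needs unambiguous encoding.
    message = f"{approver}|{action}|{target}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def execute(key: bytes, approver: str, action: str, target: str, approval_sig: str) -> str:
    """Automation layer: act only if the signature matches this action and target."""
    expected = sign_approval(key, approver, action, target)
    if not hmac.compare_digest(expected, approval_sig):
        raise PermissionError("no valid approval for this action")
    return f"executed {action} on {target}"
```

Because the signature covers the target as well as the action, an approval to isolate one host cannot be reused to isolate a different one, which is the gap a purely procedural review step leaves open.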

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A02	Covers unsafe tool use and over-privileged agent actions in AI-assisted workflows.
CSA MAESTRO	GOV-1	Maps to governance for bounded autonomy and approval controls in agentic operations.
NIST AI RMF		Addresses governance, measurement, and accountability for AI-assisted decisions.

Restrict AI tools to approved actions and enforce human approval before any operational change.
