What should organisations do when employees use public LLMs for work tasks?

Organisations should treat public LLM use as a data-handling event and classify the content before submission. Sensitive code, transcripts, and customer information should be blocked from unmanaged tools, while sanctioned alternatives should be provided for lower-risk tasks. The aim is to reduce unsafe submission, not merely punish the user after the fact.

Why This Matters for Security Teams

Public LLM use is not just a productivity choice. It is a governance and data-loss question because prompts can contain code, customer data, incident details, regulated content, and credentials. Once that material leaves controlled systems, organisations often lose visibility into retention, secondary use, and downstream exposure. Current guidance suggests treating the prompt as a data transfer decision, not a casual search query, especially when employees move between approved tools and public services.

This is also where non-human identity risk begins to intersect with ordinary user behaviour. If employees paste API keys, tokens, or environment details into a public model, they may create the same kind of secret exposure documented in the LLMjacking research and the DeepSeek breach analysis. NIST’s AI 600-1 Generative AI Profile and the OWASP Agentic AI Top 10 both reinforce that AI use must be governed by context, sensitivity, and intended outcome. In practice, many security teams encounter the problem only after a user has already shared sensitive material with a public model.

How It Works in Practice

The practical response starts with classification. Organisations need a simple rule set that tells employees what can be pasted into a public model, what must stay inside sanctioned tooling, and what is prohibited entirely. That usually means blocking secrets, source code with embedded credentials, customer records, legal content, incident data, and anything governed by contractual or regulatory constraints. For lower-risk work, such as drafting generic prose or summarising publicly available text, approved tools with logging and retention controls are the safer path.
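
A rule set like this can be written down as a small lookup table. The sketch below is illustrative only: the category labels, the three dispositions, and the assignment of each category to a disposition are assumptions made for the example, not a recommended policy. It assumes content has already been labelled, applies the most restrictive rule when several categories are present, and fails closed on anything unlabelled or unrecognised.

```python
from enum import IntEnum


class Disposition(IntEnum):
    # Ordered so that a higher value is always more restrictive.
    PUBLIC_OK = 0        # may be pasted into a public model
    SANCTIONED_ONLY = 1  # must stay inside sanctioned tooling
    PROHIBITED = 2       # prohibited entirely


# Hypothetical category-to-disposition table; each organisation defines its own.
RULES = {
    "generic_prose": Disposition.PUBLIC_OK,
    "public_text": Disposition.PUBLIC_OK,
    "customer_record": Disposition.SANCTIONED_ONLY,
    "legal_content": Disposition.SANCTIONED_ONLY,
    "incident_data": Disposition.SANCTIONED_ONLY,
    "regulated_data": Disposition.SANCTIONED_ONLY,
    "secret": Disposition.PROHIBITED,
    "code_with_credentials": Disposition.PROHIBITED,
}


def classify(categories):
    """Return the most restrictive disposition across all categories present.

    Unknown categories and empty input are treated as PROHIBITED (fail closed).
    """
    return max(
        (RULES.get(category, Disposition.PROHIBITED) for category in categories),
        default=Disposition.PROHIBITED,
    )
```

For example, `classify(["generic_prose", "customer_record"])` resolves to `Disposition.SANCTIONED_ONLY`, because the customer record outranks the generic prose it travels with.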

Implementation works best when policy is paired with reduced workflow friction. If secure alternatives are hard to access, employees will route around them. Security teams should provide approved LLM services, browser controls, DLP rules, and user education that explains why a prompt can become a data-handling event. The CSA MAESTRO agentic AI threat modeling framework and NIST’s AI guidance both support runtime risk assessment rather than one-time policy statements. NHIMG’s Analysis of Claude Code Security and LLMjacking research illustrate why exposed secrets and prompt leakage are operational threats, not abstract concerns.

  • Classify content before submission, not after a breach review.
  • Block secrets, customer data, and sensitive code from unmanaged public LLMs.
  • Provide sanctioned alternatives for drafting, summarisation, and code assistance.
  • Log and review high-risk usage so policy violations are detectable.
  • Train users on what counts as sensitive input, including transcripts and embedded credentials.
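
The "block secrets" control can be pictured as a check that runs before a prompt leaves the managed environment. The following is a minimal sketch, not a DLP product: the three patterns, the `scan_prompt` and `allow_submission` names, and the decision to block on any single match are all assumptions chosen for illustration, and a real rule set would need far broader coverage and tuning against false positives.

```python
import re

# Illustrative patterns only; these are assumptions, not an exhaustive rule set.
SECRET_PATTERNS = {
    # Shape of an AWS access key ID: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM header of a private key block.
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    # A credential-like name assigned a value, e.g. API_KEY=... or password: ...
    "credential_assignment": re.compile(
        r"(?i)\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*\S+"
    ),
}


def scan_prompt(text):
    """Return the sorted names of every pattern that matches the prompt text."""
    return sorted(
        name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)
    )


def allow_submission(text):
    """Return (allowed, findings); any finding blocks the submission."""
    findings = scan_prompt(text)
    return (not findings, findings)
```

Returning the findings alongside the decision lets the same check feed logging and review, so a blocked prompt becomes a detectable event instead of a silent failure.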

These controls tend to break down in bring-your-own-device environments where browser-based LLM access bypasses managed endpoints and data inspection.

Common Variations and Edge Cases

Tighter LLM controls often increase friction for employees, requiring organisations to balance speed against confidentiality and auditability. That tradeoff is real, especially in engineering, support, sales, and research teams that rely on rapid drafting or summarisation. There is no universal standard for this yet, but current guidance suggests using risk tiers rather than a single blanket rule for every workflow.

One common edge case is public information that becomes sensitive when combined. A harmless prompt about a client name, repository structure, and recent outage can reveal more than the individual pieces imply. Another is regulated data embedded in screenshots, pasted logs, or copied email threads. Security teams should treat those inputs the same way they treat structured records. The OWASP Top 10 for Agentic Applications 2026 and NIST’s AI Risk Management Framework both point toward governance that scales with context, not one that assumes all model use is equal. The practical rule is simple: if the organisation would not want the content indexed, retained, or replayed outside its control, it should not go into a public LLM.
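
The combination problem can be sketched as a check that counts how many distinct kinds of context appear in a single prompt. Everything below is hypothetical: the signal names, the keyword lists (including the placeholder client "acme corp"), and the threshold of two are assumptions for illustration, and plain substring matching is far cruder than what a real control would use.

```python
# Hypothetical signal types and trigger terms; each is harmless on its own.
SIGNAL_TERMS = {
    "client_name": {"acme corp"},
    "repo_structure": {"src/", "services/"},
    "outage": {"outage", "incident"},
}


def detect_signals(prompt):
    """Return the set of signal types whose terms appear in the prompt."""
    lowered = prompt.lower()
    return {
        name
        for name, terms in SIGNAL_TERMS.items()
        if any(term in lowered for term in terms)
    }


def needs_review(prompt, threshold=2):
    """Flag a prompt when at least `threshold` distinct signal types co-occur."""
    return len(detect_signals(prompt)) >= threshold
```

A prompt that mentions only the client name passes, while one that names the client, a repository path, and an outage together is flagged, which mirrors the point that the combination reveals more than the pieces.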

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A01 | Addresses unsafe prompt handling and agentic data leakage risks.
CSA MAESTRO | T1 | Covers threat modeling for AI workflows and user data exposure.
NIST AI RMF | GOVERN | Supports governance for AI use cases, data handling, and accountability.

Map public LLM use into a threat model and define approved, logged workflows.