They often assume local processing means no governance is needed. In reality, the app still creates access decisions around features, content types, and user sharing. The governing question is not only where inference runs, but whether the surrounding identity and disclosure controls match the sensitivity of what users submit.
Why This Matters for Security Teams
On-device AI processing is often treated as a privacy shortcut, but that framing misses the real security boundary. Local inference can reduce data transfer, yet the application still decides what content is accepted, retained, shared, or forwarded to other services. That means identity, disclosure, and authorisation controls remain active even when the model never leaves the device. NIST’s Cybersecurity Framework 2.0 still applies because the asset at risk is the data and the workflow, not only the compute location.
Teams also underweight the persistence of risk after a prompt is entered. A local model can summarise regulated content, expose sensitive fragments through sharing features, or amplify mistakes made by users who assume “offline” means “safe.” NHIMG’s Ultimate Guide to NHIs — Lifecycle Processes for Managing NHIs is useful here because it frames access governance as a lifecycle problem, not a deployment-location problem. In practice, many security teams discover leakage only after a user has already pasted sensitive material into an approved local assistant, not during a design review.
How It Works in Practice
Security teams should evaluate on-device AI as a set of control points around the model, not as a special case exempt from governance. The first question is what data classes the application can ingest. The second is who can invoke specific capabilities, such as file access, clipboard access, contact lookup, or content export. The third is whether the app logs prompts, caches outputs, or synchronises derived content into cloud services. The DeepSeek breach is a reminder that implementation choices and disclosure paths can matter as much as the model itself.
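The three questions above can be captured as a simple review record. The following is a minimal sketch, not a reference implementation: the `LocalAIProfile` fields, the capability labels, and the finding messages are all hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical capability labels, taken from the examples in the text.
HIGH_RISK_CAPABILITIES = {
    "file_access", "clipboard_access", "contact_lookup", "content_export",
}


@dataclass
class LocalAIProfile:
    """Illustrative review record for one on-device AI application."""
    name: str
    data_classes: set[str] = field(default_factory=set)  # what the app can ingest
    capabilities: set[str] = field(default_factory=set)  # what users can invoke
    logs_prompts: bool = False
    caches_outputs: bool = False
    syncs_to_cloud: bool = False


def review_findings(profile: LocalAIProfile) -> list[str]:
    """Return one finding per control point that still needs a policy decision."""
    findings = []
    # Question 1: which data classes can the application ingest?
    if not profile.data_classes:
        findings.append("data classes not documented")
    # Question 2: which higher-risk capabilities can be invoked?
    for cap in sorted(profile.capabilities & HIGH_RISK_CAPABILITIES):
        findings.append(f"capability needs an access rule: {cap}")
    # Question 3: does anything persist on the device or leave it?
    if profile.logs_prompts:
        findings.append("prompt logs need a retention rule")
    if profile.caches_outputs:
        findings.append("output cache needs a retention rule")
    if profile.syncs_to_cloud:
        findings.append("cloud sync widens the trust boundary")
    return findings
```

An empty result does not mean the application is safe, only that none of the control points modelled in this sketch is missing a decision.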
In practical terms, teams should map local AI features to policy decisions:
- Restrict sensitive data entry by content type, not just by application name.
- Require user disclosure when outputs may be shared, copied, or uploaded elsewhere.
- Classify local AI logs, caches, and telemetry as sensitive by default.
- Apply device posture, app integrity, and account assurance checks before enabling higher-risk features.
- Review third-party SDKs and plug-ins that can move data off device without obvious user intent.
Current guidance suggests that runtime controls matter more than static deployment assumptions. If a local assistant can call tools, access enterprise files, or generate summaries from regulated material, then its effective trust boundary is wider than the device chassis. This is where policy-as-code and conditional access become useful, because the decision should reflect what the user is trying to do, what data is being processed, and whether export or sharing is allowed. These controls tend to break down in unmanaged BYOD environments because the organisation cannot reliably enforce device state, data loss prevention, or local telemetry collection.
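One way to picture such a decision is as a small policy function. This sketch assumes a hypothetical two-tier data classification, hypothetical action names, and two boolean posture signals; a real deployment would take these from its own classification scheme and device management tooling.

```python
from dataclasses import dataclass

# Hypothetical labels, assumed for illustration only.
SENSITIVE_CLASSES = {"regulated", "confidential"}
OFF_DEVICE_ACTIONS = {"export", "share", "upload"}


@dataclass(frozen=True)
class Request:
    action: str           # what the user is trying to do, e.g. "summarise"
    data_class: str       # classification of the content being processed
    device_managed: bool  # device posture signal
    app_verified: bool    # app integrity signal


def decide(req: Request) -> str:
    """Return "allow", "allow_local_only", or "deny" for a single request."""
    posture_ok = req.device_managed and req.app_verified
    sensitive = req.data_class in SENSITIVE_CLASSES
    if sensitive and not posture_ok:
        # Sensitive content is refused on unmanaged or unverified devices.
        return "deny"
    if sensitive and req.action in OFF_DEVICE_ACTIONS:
        # Sensitive content may be processed locally but not moved off device.
        return "deny"
    if sensitive:
        return "allow_local_only"
    return "allow"
```

In an unmanaged BYOD setting, the posture signals are typically unknown; treating unknown as `False` makes the sketch fail closed for sensitive content.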
Common Variations and Edge Cases
Tighter control over on-device AI often increases friction, requiring organisations to balance privacy gains against usability and support overhead. That tradeoff is real: if the policy becomes too rigid, users route around approved tools and move sensitive work into unmanaged channels.
Best practice is evolving for scenarios where offline processing is paired with sync, shared embeddings, or enterprise connectors. A device may perform inference locally, yet still upload prompts for improvement, store vectors in a cloud index, or expose outputs through collaboration apps. In those cases, security teams should treat the system as hybrid rather than purely on-device. Another edge case is consumer-grade assistants embedded in productivity tools, where disclosure to the user is weak and consent is buried in product settings. That is a governance gap, not a user-training issue.
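The hybrid test can be stated as a simple rule: if any path carries data off the device, the system is not purely on-device. A minimal sketch, with hypothetical flag names that mirror the examples in the paragraph above:

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DataPaths:
    """Hypothetical flags for where data can go besides local inference."""
    uploads_prompts: bool = False        # prompts sent for product improvement
    cloud_vector_index: bool = False     # embeddings stored off device
    collaboration_export: bool = False   # outputs exposed via collaboration apps
    enterprise_connectors: bool = False  # connectors to enterprise systems


def off_device_paths(paths: DataPaths) -> list[str]:
    """List the enabled paths that move data beyond the device."""
    return sorted(name for name, enabled in asdict(paths).items() if enabled)


def processing_mode(paths: DataPaths) -> str:
    """Classify as "hybrid" if any off-device path exists, else "on_device"."""
    return "hybrid" if off_device_paths(paths) else "on_device"
```

Listing the enabled paths alongside the classification gives reviewers the specific disclosure points to examine, rather than a single label.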
Organisations with regulated data should also distinguish between minimising exposure and eliminating it. On-device processing can reduce some attack paths, but it does not remove the need for content classification, retention limits, or auditable approval flows. The operational question is whether the app’s surrounding identity and disclosure controls match the sensitivity of what users submit. When a local model is allowed to act on behalf of the user without clear tool boundaries, the risk reappears through sharing, export, and secondary processing.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST AI RMF set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| NIST CSF 2.0 | PR.AA-01 | On-device AI still needs authenticated access decisions for features and data. |
| OWASP Non-Human Identity Top 10 | NHI-05 | Local assistants create non-human access paths that can expose sensitive data. |
| NIST AI RMF | GOVERN | AI RMF governance applies to disclosure, transparency, and downstream harm from local AI. |
Inventory AI-enabled service identities, restrict their scope, and review every data-sharing path.
Reviewed and updated by the NHIMG editorial team on June 10, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org