A governance approach that builds privacy controls into systems and processes from the start rather than adding them later. In SaaS environments, it means access control, retention, and vendor oversight are designed into the operating model, not left to local judgment.
Expanded Definition
Data protection by design means privacy, retention, and access safeguards are treated as core system requirements, not post-launch patches. In NHI-heavy SaaS environments, that means service accounts, API keys, automations, and vendor access paths are constrained from the outset through governance aligned with NIST Cybersecurity Framework 2.0, rather than being left to application teams or local administrators.
Definitions vary across vendors when this principle is applied to AI workflows, but the governance intent is consistent: sensitive data should be minimised, access should be scoped, retention should be bounded, and oversight should be continuous. For NHIs, the practical focus is not only where data is stored, but which machine identities can read, move, or transform it. That is why NHI lifecycle controls and data handling controls must be designed together, as described in Ultimate Guide to NHIs — Key Research and Survey Results. The most common misapplication is treating data protection by design as a legal checkbox, which occurs when teams document retention rules but leave secrets, logs, and API permissions unmanaged in production.
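One way to picture lifecycle and data handling controls being designed together is a design-time check that rejects a machine identity before it is ever issued. The following is a minimal sketch, not a reference implementation: the `MachineIdentityPolicy` fields, the `design_time_violations` function, and both numeric limits are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed organisational limits for illustration; not taken from any standard.
MAX_RETENTION_DAYS = 90
MAX_CREDENTIAL_AGE_DAYS = 30
KNOWN_ACTIONS = {"read", "write", "transform"}


@dataclass(frozen=True)
class MachineIdentityPolicy:
    """Hypothetical declaration reviewed before an NHI is created."""
    name: str
    owner: str
    datasets: tuple[str, ...]
    actions: tuple[str, ...]
    retention_days: int
    credential_expires: date


def design_time_violations(policy: MachineIdentityPolicy, today: date) -> list[str]:
    """Return every reason the declared identity should not be issued."""
    problems = []
    if not policy.owner:
        problems.append("no accountable owner")
    if not policy.datasets or "*" in policy.datasets:
        problems.append("dataset scope is empty or wildcard")
    if not set(policy.actions) <= KNOWN_ACTIONS:
        problems.append("unknown action requested")
    if not 0 < policy.retention_days <= MAX_RETENTION_DAYS:
        problems.append("retention is unbounded or exceeds the limit")
    if policy.credential_expires > today + timedelta(days=MAX_CREDENTIAL_AGE_DAYS):
        problems.append("credential lifetime exceeds the rotation window")
    return problems


if __name__ == "__main__":
    candidate = MachineIdentityPolicy(
        name="legacy-sync",
        owner="",
        datasets=("*",),
        actions=("read",),
        retention_days=365,
        credential_expires=date(2026, 1, 1),
    )
    for problem in design_time_violations(candidate, today=date(2025, 1, 1)):
        print(f"{candidate.name}: {problem}")
```

Putting scope, retention, and credential lifetime in one declaration means a reviewer sees the access rules and the data handling rules together, rather than in separate documents.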
Examples and Use Cases
Implementing data protection by design rigorously often introduces more upfront engineering and governance work, requiring organisations to weigh faster delivery against lower exposure and simpler audits.
- A SaaS platform limits a service account to a single dataset, preventing broad reads across tenants and reducing the blast radius if that identity is compromised.
- An engineering team stores tokens in a managed secrets system instead of code or configuration files, reflecting the risk patterns documented in Ultimate Guide to NHIs — Key Research and Survey Results.
- A vendor integration is approved only after data-sharing terms, retention limits, and revocation steps are defined in advance, rather than negotiated after deployment.
- An AI workflow masks or excludes personal data before an agent can call downstream tools, reducing the chance that the agent can exfiltrate more than it needs.
- A breach review follows the pattern seen in the Schneider Electric credentials breach, where credential exposure turns a data governance issue into an access-control failure.
These use cases show that the principle applies as much to machine identity pathways as to user-facing privacy notices. It is especially relevant when API keys, pipeline agents, and third-party connectors can reach regulated data with little direct human oversight.
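The AI workflow example above can be sketched as a minimisation step that runs before any record reaches an agent. This is an illustrative sketch under stated assumptions: `ALLOWED_FIELDS`, `minimise_for_agent`, and the record layout are hypothetical, and the single email pattern stands in for whatever detection a real deployment would use.

```python
import re

# Simplified email pattern for illustration; it will not catch every address format.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical allowlist: only these fields may be passed to the agent.
ALLOWED_FIELDS = {"ticket_id", "summary"}


def minimise_for_agent(record: dict) -> dict:
    """Drop fields outside the allowlist, then mask emails in remaining text."""
    kept = {key: value for key, value in record.items() if key in ALLOWED_FIELDS}
    return {
        key: EMAIL.sub("[REDACTED_EMAIL]", value) if isinstance(value, str) else value
        for key, value in kept.items()
    }


if __name__ == "__main__":
    ticket = {
        "ticket_id": 7,
        "summary": "Contact jane@example.com about refund",
        "customer_name": "Jane Doe",
    }
    print(minimise_for_agent(ticket))
```

The allowlist is applied before the masking so that fields nobody has reviewed are excluded by default, instead of relying on pattern matching to catch everything sensitive.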
Why It Matters in NHI Security
Data protection by design matters in NHI security because sensitive data is often exposed through identities that were created for convenience, not governance. NHIMG research shows that 96% of organisations store secrets outside secrets managers in vulnerable locations, and 97% of NHIs carry excessive privileges, making data exposure a predictable outcome when controls are bolted on later. Those failures are not abstract privacy problems; they become operational incidents when a service account, token, or automation can read more data than it should.
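Secrets stored outside a secrets manager are often detectable with simple checks on configuration text. The sketch below is a rough illustration only: the two rules in `PATTERNS` and the `find_hardcoded_secrets` function are hypothetical, and real scanners rely on much larger rule sets and entropy analysis.

```python
import re

# Illustrative heuristics only; these two rules are assumptions, not a complete rule set.
PATTERNS = {
    # AWS access key IDs commonly start with "AKIA" followed by 16 uppercase characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # A credential-like name assigned a literal value of eight or more characters.
    "assigned_secret": re.compile(
        r"(?i)\b(api[_-]?key|token|secret|password)\b\s*[:=]\s*['\"]?([^\s'\"]{8,})"
    ),
}


def find_hardcoded_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line number, rule name) for each line that matches a rule."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, rule))
    return findings


if __name__ == "__main__":
    sample = "region = eu-west-1\napi_key = 'abcd1234efgh5678'\n"
    for line_no, rule in find_hardcoded_secrets(sample):
        print(f"line {line_no}: possible {rule}")
```

A check like this only finds exposure after the fact; the design-time control is to issue credentials from a managed secrets system so that nothing matching these rules is written to code or configuration in the first place.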
Used properly, this approach reduces secret sprawl, shortens retention windows, and forces vendor oversight into the architecture instead of the exception process. It also supports the NIST Cybersecurity Framework 2.0 by aligning data handling with access control and governance objectives, not just incident response. The same logic should inform how teams interpret the findings in Ultimate Guide to NHIs — Key Research and Survey Results, especially where identity sprawl and unmanaged secrets undermine data minimisation. Organisations typically encounter the full cost only after a credential leak or vendor misconfiguration exposes regulated data, at which point retrofitting these controls is no longer optional.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| NIST CSF 2.0 | PR.DS | Data protection by design maps to data security and controlled data handling throughout the lifecycle. |
| OWASP Non-Human Identity Top 10 | NHI-02 | Secret handling and identity governance are core to protecting data exposed through NHIs. |
| NIST Zero Trust | SP 800-207 | Zero Trust requires continuously verified, least-privilege access to data resources. |
Build retention, access restriction, and data minimisation into system design and operating procedures.