
What should organisations do before building a graph-based identity model?

Organisations should first inventory all authoritative identity and access sources, including legacy applications, cloud platforms, and operational systems. They should then define which relationships matter for governance, such as group membership, inherited entitlements, and role dependencies. A graph without trusted source coverage will simply reproduce existing blind spots in a more elegant form.

Why This Matters for Security Teams

A graph-based identity model can make relationships easier to query, but it does not create trust in the underlying data. If the source inventory is incomplete, the graph will faithfully carry missing systems, stale entitlements, and undocumented dependencies into a cleaner-looking version of the same exposure. NHI Management Group’s Ultimate Guide to NHIs notes that only 5.7% of organisations have full visibility into service accounts, which is why the first risk is usually visibility, not modelling.

This matters because identity graphs are often introduced as if they were the control, when in reality they are only as reliable as the authoritative sources behind them. Security teams need to know where identities originate, which systems issue or inherit access, and which relationships are actually governance-relevant. The NIST Cybersecurity Framework 2.0 reinforces that asset and identity understanding comes before effective control design, especially when multiple platforms contribute to access decisions. In practice, many security teams encounter graph failures only after access reviews, incident response, or deprovisioning expose gaps that the model had quietly inherited.

How It Works in Practice

Before building the graph, organisations should complete a structured source-of-truth exercise. That means inventorying every authoritative identity and access source, then classifying what each system actually owns: primary identity records, group membership, entitlements, role mappings, approval history, or downstream inheritance. The point is not to ingest everything indiscriminately. The point is to identify which relationships are trustworthy enough to support governance, and which are merely useful context.

A practical implementation usually starts with four steps:

  • List all identity sources across cloud, on-premises, SaaS, legacy applications, and operational tooling.
  • Mark each source as authoritative, derivative, or observational.
  • Define the relationship types the graph must preserve, such as account-to-user, role-to-entitlement, or group-to-application.
  • Document refresh frequency, ownership, and exception handling for each source.
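The four steps above can be captured as a small inventory record plus a gap check. The following is a minimal sketch, not a reference implementation: the class names, field names, source names, and the `inventory_gaps` helper are all hypothetical, and the three required relationship types are simply the examples given in the list.

```python
from dataclasses import dataclass
from enum import Enum


class SourceTrust(Enum):
    AUTHORITATIVE = "authoritative"
    DERIVATIVE = "derivative"
    OBSERVATIONAL = "observational"


@dataclass(frozen=True)
class IdentitySource:
    name: str
    trust: SourceTrust
    owns: frozenset      # relationship types this source can assert
    owner: str           # accountable team; empty string if unknown
    refresh_hours: int   # 0 means no documented refresh cycle


# Assumption: the relationship types the graph must preserve (step 3).
REQUIRED_RELATIONSHIPS = frozenset(
    {"account-to-user", "role-to-entitlement", "group-to-application"}
)


def inventory_gaps(sources):
    """Return (relationship types with no authoritative source,
    names of sources lacking an owner or a refresh cycle)."""
    covered = set()
    for source in sources:
        if source.trust is SourceTrust.AUTHORITATIVE:
            covered |= source.owns
    uncovered = sorted(REQUIRED_RELATIONSHIPS - covered)
    undocumented = sorted(
        s.name for s in sources if not s.owner or s.refresh_hours <= 0
    )
    return uncovered, undocumented


# Hypothetical inventory for illustration only.
SOURCES = [
    IdentitySource("hr-system", SourceTrust.AUTHORITATIVE,
                   frozenset({"account-to-user"}), "people-ops", 24),
    IdentitySource("idp", SourceTrust.AUTHORITATIVE,
                   frozenset({"group-to-application"}), "iam-team", 1),
    IdentitySource("cmdb-export", SourceTrust.OBSERVATIONAL,
                   frozenset({"role-to-entitlement"}), "", 0),
]

uncovered, undocumented = inventory_gaps(SOURCES)
print("No authoritative source for:", uncovered)
print("Missing owner or refresh cycle:", undocumented)
```

In this sketch an observational feed does not count toward coverage, so a relationship type that only appears in such a feed is reported as a gap rather than silently accepted.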

For non-human identities, this step is even more important because service accounts, API keys, workload identities, and automation tokens often live outside the IAM tools security teams expect. The 52 NHI Breaches Analysis shows that identity failures are frequently tied to poor visibility and weak lifecycle control, not just broken authentication. A graph can help reveal hidden dependencies, but only if the underlying feeds include the systems where those NHIs are actually created, rotated, and revoked.
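One way to test whether the feeds actually cover the NHI lifecycle is to compare, per NHI type, the lifecycle events the feeds report against the three events named above. This is an illustrative sketch under stated assumptions: the feed names, the coverage mapping, and the `lifecycle_gaps` helper are hypothetical and would need to be replaced with whatever the organisation's own inventory records.

```python
# NHI types and lifecycle events taken from the paragraph above.
NHI_TYPES = ("service_account", "api_key", "workload_identity", "automation_token")
LIFECYCLE_EVENTS = frozenset({"created", "rotated", "revoked"})


def lifecycle_gaps(feeds):
    """feeds maps a feed name to {nhi_type: set of lifecycle events it reports}.
    Returns {nhi_type: sorted list of events that no feed reports}."""
    seen = {nhi_type: set() for nhi_type in NHI_TYPES}
    for coverage in feeds.values():
        for nhi_type, events in coverage.items():
            if nhi_type in seen:
                seen[nhi_type] |= set(events)
    gaps = {}
    for nhi_type in NHI_TYPES:
        missing = LIFECYCLE_EVENTS - seen[nhi_type]
        if missing:
            gaps[nhi_type] = sorted(missing)
    return gaps


# Hypothetical feeds for illustration only.
FEEDS = {
    "cloud-iam": {
        "service_account": {"created", "revoked"},
        "workload_identity": {"created", "rotated", "revoked"},
    },
    "secrets-vault": {
        "api_key": {"created", "rotated", "revoked"},
        "service_account": {"rotated"},
    },
}

gaps = lifecycle_gaps(FEEDS)
print(gaps)
```

Here no feed reports automation tokens at all, so that type surfaces as a complete gap before any graph modelling begins.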

Current guidance suggests using the graph to express governance relationships, not to replace source systems. That means preserving provenance on every node and edge so reviewers can tell whether a relationship came from HR, an IdP, a cloud control plane, or a manually curated exception. It also means excluding speculative joins that cannot be defended during audit or incident response. These controls tend to break down when legacy applications expose no dependable APIs or when access is granted through ad hoc admin workflows, because the graph then reflects partial truth rather than authoritative state.
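Provenance on every edge, and the exclusion of joins that cannot be defended, can be enforced at the point where relationships enter the graph. The sketch below assumes a simple in-memory model; `Provenance`, `Edge`, `IdentityGraph`, the source labels, and the sample timestamp are hypothetical names and values chosen for illustration, not part of any particular graph product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    source: str        # e.g. "hr", "idp", "cloud-control-plane", "manual-exception"
    observed_at: str   # timestamp string as reported by the feed


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    relation: str
    provenance: Optional[Provenance]


class IdentityGraph:
    def __init__(self, allowed_sources):
        self.allowed_sources = set(allowed_sources)
        self.edges = []
        self.rejected = []   # kept for review instead of being silently dropped

    def add_edge(self, edge):
        """Accept an edge only if it names a recognised source."""
        if edge.provenance is None or edge.provenance.source not in self.allowed_sources:
            self.rejected.append(edge)
            return False
        self.edges.append(edge)
        return True

    def edges_from_source(self, source):
        """Let a reviewer list every relationship asserted by one source."""
        return [e for e in self.edges if e.provenance.source == source]


graph = IdentityGraph({"hr", "idp", "cloud-control-plane", "manual-exception"})

# A relationship backed by a named source is accepted.
accepted = graph.add_edge(
    Edge("alice", "acct-alice", "account-to-user",
         Provenance("hr", "2024-01-01T00:00:00Z"))
)

# A speculative join with no provenance is rejected and queued for review.
speculative = graph.add_edge(Edge("svc-backup", "alice", "owned-by", None))

print(accepted, speculative, len(graph.rejected))
```

Keeping rejected edges in a separate list is a design choice in this sketch: it lets reviewers see what was excluded and why, which matters when legacy systems can only supply partial data.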

Common Variations and Edge Cases

Tighter graph governance often increases integration overhead, requiring organisations to balance modelling precision against the cost of maintaining source trust. That tradeoff becomes more visible in hybrid environments, where one platform may own user lifecycle data while another owns entitlements, approvals, or workload credentials.

There is no universal standard for this yet, but best practice is evolving around a small set of defensible principles. Some organisations begin with human identities and core enterprise applications, then add NHIs and infrastructure identities once source quality is proven. Others model high-risk relationships first, such as privileged access, inherited admin rights, or third-party service accounts. Either approach can work if the team can explain why each data source belongs in the graph and what control decision it supports.

Edge cases include mergers, shadow IT, outsourced operations, and temporary integrations created during incident response. In those environments, the graph should record uncertainty rather than invent certainty. If a relationship cannot be validated, it should be flagged as untrusted or provisional, not treated as governed fact. That is the difference between a useful identity model and a polished blind spot. NHI Management Group’s Top 10 NHI Issues highlights why incomplete visibility and excessive privilege often persist together, and why source coverage must be settled before graph expansion.
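Recording uncertainty can be as simple as attaching an explicit trust state to each relationship and keeping unvalidated ones out of governance queries. This is a minimal sketch under assumed names: the `TrustState` values, the `resolve_state` rule, and the sample relationships are hypothetical, and real validation logic would depend on the sources involved.

```python
from dataclasses import dataclass
from enum import Enum


class TrustState(Enum):
    VALIDATED = "validated"
    PROVISIONAL = "provisional"
    UNTRUSTED = "untrusted"


@dataclass
class Relationship:
    src: str
    dst: str
    relation: str
    state: TrustState = TrustState.PROVISIONAL   # default to uncertainty


def resolve_state(confirmed, contradicted):
    """Assumed rule: contradiction wins, confirmation validates,
    anything else stays provisional."""
    if contradicted:
        return TrustState.UNTRUSTED
    if confirmed:
        return TrustState.VALIDATED
    return TrustState.PROVISIONAL


def governed_view(relationships):
    """Only validated relationships are treated as governed fact."""
    return [r for r in relationships if r.state is TrustState.VALIDATED]


def review_queue(relationships):
    """Everything else is surfaced for follow-up rather than hidden."""
    return [r for r in relationships if r.state is not TrustState.VALIDATED]


# Hypothetical relationships, e.g. inherited from a merger or an incident.
relationships = [
    Relationship("acct-alice", "finance-app", "group-to-application",
                 resolve_state(confirmed=True, contradicted=False)),
    Relationship("svc-legacy", "billing-db", "account-to-entitlement",
                 resolve_state(confirmed=False, contradicted=False)),
    Relationship("tmp-ir-token", "prod-cluster", "account-to-entitlement",
                 resolve_state(confirmed=False, contradicted=True)),
]

print([r.src for r in governed_view(relationships)])
print([(r.src, r.state.value) for r in review_queue(relationships)])
```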

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
NIST CSF 2.0 | ID.AM-1 | Identity source inventory is core to understanding assets and dependencies.
OWASP Non-Human Identity Top 10 | NHI-01 | Graph models fail if NHI sources are incomplete or untrusted.
NIST AI RMF | (none given) | Graph identity models need governance over data quality and traceability.

Inventory authoritative identity sources first, then map each feed to the control decisions it can support.