Three-tier Agent Classification

By NHI Mgmt Group | Updated June 9, 2026 | Domain: Governance, Ownership & Risk

A governance model that separates self-disclosing good agents, non-disclosing good agents, and malicious agents. It helps teams apply policy more precisely so legitimate automation is preserved while adversarial automation is constrained.

Expanded Definition

Three-tier Agent Classification is a governance lens that sorts automation into self-disclosing good agents, non-disclosing good agents, and malicious agents. It is used to decide what an agent is allowed to do, how much trust it earns, and which controls apply when its intent is not obvious.

For NHI and agentic AI programs, the value of this model is precision. A self-disclosing good agent can present identity, purpose, and operator context in a way that supports policy enforcement. A non-disclosing good agent may still be legitimate, but it needs stronger verification because its provenance is opaque. Malicious agents, by contrast, are adversarial by design and should be constrained with the same rigor used for hostile automation. This framing complements the control focus in the OWASP Top 10 for Agentic Applications 2026 and the risk treatment approach in the NIST AI Risk Management Framework, though definitions vary across vendors and no single standard governs this yet.

The most common misapplication is treating every agent that completes work successfully as a self-disclosing good agent, which occurs when teams equate utility with trust and skip provenance checks.
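The tiering logic described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the `AgentSignals` fields, the `hostile_indicators` list, and the `classify` function are all hypothetical names, and a real program would derive these signals from attestation, registry, and detection data. The sketch deliberately ignores `tasks_completed`, to show that successful work alone does not earn the self-disclosing tier.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Tier(Enum):
    SELF_DISCLOSING_GOOD = "self-disclosing good"
    NON_DISCLOSING_GOOD = "non-disclosing good"
    MALICIOUS = "malicious"


@dataclass
class AgentSignals:
    # All field names are hypothetical and chosen for illustration only.
    identity: Optional[str] = None   # who the agent claims to be
    purpose: Optional[str] = None    # what it says it is for
    operator: Optional[str] = None   # who runs it
    hostile_indicators: List[str] = field(default_factory=list)
    tasks_completed: int = 0         # utility metric, not a trust signal


def classify(signals: AgentSignals) -> Tier:
    # Observed hostile behavior overrides anything the agent discloses.
    if signals.hostile_indicators:
        return Tier.MALICIOUS
    # Self-disclosing means identity, purpose, and operator are all present.
    if signals.identity and signals.purpose and signals.operator:
        return Tier.SELF_DISCLOSING_GOOD
    # Opaque provenance: possibly legitimate, but needs stronger verification.
    # tasks_completed is intentionally not consulted here.
    return Tier.NON_DISCLOSING_GOOD


productive_but_opaque = AgentSignals(tasks_completed=500)
print(classify(productive_but_opaque).value)
```

In this sketch, the non-disclosing tier is a triage label that triggers further verification, not a verdict that the agent is benign.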

Examples and Use Cases

Implementing three-tier classification rigorously often introduces operational overhead, requiring organisations to weigh faster automation onboarding against stronger verification and policy enforcement.

  • A first-party customer support agent registers its operator, model lineage, and allowed tools, so it is handled as a self-disclosing good agent under least-privilege policy.
  • An internal data enrichment agent performs legitimate work but does not expose enough identity context, so it is treated as non-disclosing good and placed under tighter monitoring.
  • A suspicious autonomous process attempts credential harvesting or unapproved tool use, so it is treated as malicious; the behavior matches patterns seen in the AI LLM hijack breach and catalogued in the MITRE ATLAS adversarial AI threat matrix.
  • A vendor-provided workflow agent arrives without clear attestation or provenance, so security teams require additional validation before granting access to secrets or production APIs.
  • Program owners use the model during intake reviews to decide whether an agent belongs in a trusted registry, a restricted sandbox, or a deny-by-default posture, drawing lessons from the OWASP NHI Top 10.

The distinction between tiers matters most when a workflow is autonomous but not transparent, because legitimacy cannot be inferred from performance alone.
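The intake review in the use cases above can be sketched as a small decision function. The three postures come from the text, but the mapping from tier to posture, the `provenance_validated` flag, and the monitoring labels are assumptions made for illustration; an actual program would set these through its own policy.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    SELF_DISCLOSING_GOOD = "self-disclosing good"
    NON_DISCLOSING_GOOD = "non-disclosing good"
    MALICIOUS = "malicious"


class Posture(Enum):
    TRUSTED_REGISTRY = "trusted registry"
    RESTRICTED_SANDBOX = "restricted sandbox"
    DENY_BY_DEFAULT = "deny-by-default"


@dataclass(frozen=True)
class IntakeDecision:
    posture: Posture
    secrets_eligible: bool  # may be granted secrets or production API access
    monitoring: str         # hypothetical monitoring level label


def intake_decision(tier: Tier, provenance_validated: bool = False) -> IntakeDecision:
    # Assumed mapping: the source names the postures but not a fixed rule.
    if tier is Tier.MALICIOUS:
        return IntakeDecision(Posture.DENY_BY_DEFAULT, False, "block and alert")
    if tier is Tier.SELF_DISCLOSING_GOOD:
        # Eligible for access, still scoped by least-privilege policy.
        return IntakeDecision(Posture.TRUSTED_REGISTRY, True, "standard")
    # Non-disclosing good: sandboxed, with secrets withheld until
    # additional provenance validation has been completed.
    return IntakeDecision(Posture.RESTRICTED_SANDBOX, provenance_validated, "heightened")


vendor_agent = intake_decision(Tier.NON_DISCLOSING_GOOD)
print(vendor_agent.posture.value, vendor_agent.secrets_eligible)
```

Returning a frozen decision object rather than a bare boolean keeps the posture, access eligibility, and monitoring level together, which makes the outcome of an intake review easier to log and audit.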

Why It Matters in NHI Security

Three-tier classification helps security teams avoid two common failures: over-trusting opaque automation and over-restricting legitimate agents that need to operate at machine speed. Both errors create risk. If all agents are treated as benign, malicious automation can blend into normal activity. If all non-human actors are locked down equally, teams may respond by creating exceptions, shared credentials, or unmanaged service paths that weaken governance.

The scale of the problem is not theoretical. NHI Mgmt Group reports that only 5.7% of organisations have full visibility into their service accounts, which means most teams cannot reliably tell which agents are trustworthy, opaque, or hostile. That visibility gap is exactly where classification becomes operationally useful. It also aligns with broader guidance in the NIST AI Risk Management Framework and the CSA MAESTRO agentic AI threat modeling framework, both of which emphasize context-aware risk treatment.

Organisations typically recognise the need for this model only after an unvetted agent abuses access, at which point three-tier classification becomes operationally unavoidable.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A1 | Agent trust and tool-use boundaries depend on classifying good, opaque, and malicious agents.
NIST AI RMF | (not specified) | Risk treatment in AI systems depends on context, transparency, and trust signals for automation.
CSA MAESTRO | (not specified) | MAESTRO models agentic threats by trust boundaries, provenance, and abuse potential.

Classify agents by provenance and threat potential before granting tools, data, or execution rights.

NHIMG Editorial Note
Reviewed and updated by the NHIMG editorial team on June 9, 2026.