Agentic coding in design systems exposes new identity control gaps

By NHI Mgmt Group Editorial Team | Published 2026-05-19 | Domain: Agentic AI & NHIs | Source: 1Password

TL;DR: According to 1Password, agent workflows produced workable PRs only after the team added explicit skills, MCP-backed context, and human ticket qualification; cold-start agents guessed at conventions and created downstream rework. The real issue is that agent identity control depends on scoped context and short-lived access, not just better code generation.

At a glance

What this is: 1Password’s analysis of using an agent in a design system. The key finding is that explicit skills and context turn plausible output into usable PRs.

Why it matters: IAM teams now have to govern agent workflows, not just users and service accounts, which means context, scope, and session boundaries become identity controls.

Context

Agentic coding only works reliably when the workflow is bounded, the conventions are explicit, and the system can validate output quickly. In a design system, that means the model is not free to invent intent. It has to follow a narrow path of tokens, primitives, stories, tests, and review gates.

That is an identity governance problem as much as a software problem. When an agent can act inside a codebase, the question is no longer whether it can write code, but whether its access, context, and execution scope are constrained tightly enough to keep it inside the intended boundary.

The article is strongest when it shows that agent behaviour improves after the missing knowledge is made explicit. That is a typical pattern in early agent adoption: the first failure is usually not the model itself, but the assumption that tacit human knowledge can be recovered from repository context alone.

Key questions

Q: How should teams govern agentic coding in structured engineering workflows?

A: Start by constraining the workflow, not by trusting the model. Give the agent narrow, executable instructions, limit its context to authoritative sources, and require a human to qualify the task before execution begins. Then measure output quality by rewrite rate and convention adherence, because speed without correctness only shifts work downstream.

Q: Why do design systems expose identity control gaps for agents?

A: Design systems expose control gaps because they depend on tacit conventions that experienced humans usually carry in their heads. An agent can follow files and commands, but it cannot reliably infer unstated token tier rules, component semantics, or team-specific merge habits unless those rules are encoded and governed as machine-readable context.

Q: What breaks when agent credentials are left standing too long?

A: Standing agent credentials turn a bounded workflow into a persistent access path. That increases the chance of misuse, accidental overreach, and unreviewed reuse across sessions. For agent workflows, the safe model is short-lived access tied to a single task, with scope limited to the exact resources needed to complete it.

Q: Who should decide whether a ticket is ready for an agent?

A: A human should decide whether the ticket is specific enough, because readiness is a governance judgment, not a model output. The reviewer should confirm that scope, expected outcome, and conventions are clear before the agent starts. If the ticket still requires interpretation, it is not yet an agent-ready task.
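
The qualification gate described above can be sketched in a few lines. This is a minimal illustration, not 1Password's implementation: the `Ticket` fields, the `agent-ready` label name, and the reviewer set are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field

AGENT_READY_LABEL = "agent-ready"  # hypothetical label name, not from the source


@dataclass
class Ticket:
    key: str
    scope: str = ""
    expected_outcome: str = ""
    conventions: str = ""
    # Maps each label to the account that applied it.
    labels: dict[str, str] = field(default_factory=dict)


def qualification_gaps(ticket: Ticket, human_reviewers: set[str]) -> list[str]:
    """List every reason the ticket is not yet ready for an agent."""
    gaps = []
    # Scope, expected outcome, and conventions must all be written down.
    for name in ("scope", "expected_outcome", "conventions"):
        if not getattr(ticket, name).strip():
            gaps.append(f"missing {name}")
    # The readiness label must exist and must come from a known human.
    applied_by = ticket.labels.get(AGENT_READY_LABEL)
    if applied_by is None:
        gaps.append(f"no {AGENT_READY_LABEL} label")
    elif applied_by not in human_reviewers:
        gaps.append("label not applied by a human reviewer")
    return gaps


def ready_for_agent(ticket: Ticket, human_reviewers: set[str]) -> bool:
    """A ticket is agent-ready only when no gaps remain."""
    return not qualification_gaps(ticket, human_reviewers)
```

Returning the list of gaps, rather than a bare boolean, gives the reviewer something concrete to fix before relabelling the ticket.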

Technical breakdown

Why cold-start agents fail in structured design systems

A design system looks simple only because the rules are already agreed. The agent can navigate files and compile code, but it does not automatically know token tiering, component primitives, merge templates, or the local conventions that make a PR idiomatic. Without that operating context, the model fills gaps with plausible guesses. The result is not random failure, but near-miss output that passes tests while still violating system intent.

Practical implication: treat repository familiarity as a governed capability, not an assumption, and require explicit context before agent execution.
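
One way to catch near-miss output that passes tests is a convention check that runs alongside them. The sketch below is an assumption-laden example: the mapping of raw elements to primitives and the rule against raw hex colours are hypothetical conventions, and a real design system would likely use a proper linter rather than regular expressions.

```python
import re

# Matches raw hex colours such as #fff or #ff0000.
RAW_HEX = re.compile(r"#[0-9a-fA-F]{3,8}\b")

# Hypothetical mapping from raw HTML elements to design-system primitives.
RAW_ELEMENTS = {"button": "Button", "input": "TextField"}


def convention_violations(source: str) -> list[str]:
    """Flag code that compiles but bypasses tokens or primitives."""
    findings = []
    for match in RAW_HEX.finditer(source):
        findings.append(f"raw color {match.group(0)}; use a design token")
    for tag, primitive in RAW_ELEMENTS.items():
        # Case-sensitive on purpose: <Button> is the primitive, <button> is not.
        if re.search(rf"<{tag}\b", source):
            findings.append(f"raw <{tag}>; use the {primitive} primitive")
    return findings
```

For example, `convention_violations('<button style={{ color: "#ff0000" }}>Save</button>')` reports both the raw colour and the raw element, while a snippet built on the `Button` primitive returns an empty list.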

Skills, MCP context, and executable workflow guidance

The article describes a shift from passive documentation to executable skills. That matters because skills encode exact paths, naming patterns, commands, and ordering, while MCP gives agents a runtime way to query authoritative component and token information instead of inferring from stale files. In identity terms, this is not merely better prompting. It is a tighter control plane for how the agent obtains working context and what it is allowed to do with it.

Practical implication: separate reference material from runtime guidance and bind agent access to the smallest context layer that still makes correct execution possible.
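
To make the difference between documentation and an executable skill concrete, here is a minimal sketch of a skill as structured data. The class names, paths, and commands are placeholders invented for illustration; they are not 1Password's skill format, and the MCP lookup is deliberately left out.

```python
import posixpath
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SkillStep:
    """One ordered step: a description plus the exact command to run."""
    description: str
    command: str


@dataclass(frozen=True)
class Skill:
    """A narrow, versioned workflow the agent follows instead of guessing."""
    name: str
    version: str
    allowed_paths: tuple[str, ...]  # path prefixes the skill may touch
    steps: tuple[SkillStep, ...] = field(default_factory=tuple)

    def permits(self, path: str) -> bool:
        """Return True only if the normalised path sits under an allowed prefix."""
        normalised = posixpath.normpath(path)  # collapses "../" segments
        return any(normalised.startswith(prefix) for prefix in self.allowed_paths)

    def plan(self) -> list[str]:
        """Render the steps in their declared order."""
        return [f"{i}. {s.description}: {s.command}" for i, s in enumerate(self.steps, 1)]


# Hypothetical skill; paths and commands are placeholders.
scaffold_component = Skill(
    name="scaffold-component",
    version="1.0.0",
    allowed_paths=("src/components/", "src/tokens/"),
    steps=(
        SkillStep("Create the component directory", "mkdir -p src/components/Badge"),
        SkillStep("Run the test suite", "npm test"),
    ),
)
```

Because the skill carries its own version and path allowlist, it can be reviewed and changed in the same commit as the convention it encodes.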

Why short-lived agent credentials are a baseline control

The post is clear that agent credentials should expire, because persistent access is a liability when the workflow is already narrowly scoped. Short-lived credentials reduce the chance that a workflow token becomes a standing pathway into the codebase or component registry. This is the same structural lesson seen across machine identity governance: the more tightly scoped the task, the less defensible persistent access becomes.

Practical implication: issue agent credentials with bounded lifetime and workload scope, then tie them to the exact workflow they support.
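
The policy behind that implication can be sketched as follows. This is an illustrative model of the authorization check only, using the Python standard library; the scope strings, the 15-minute default lifetime, and the function names are assumptions, and a production system would rely on its identity provider's signed tokens rather than this in-memory record.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCredential:
    token: str
    task_id: str
    scopes: frozenset[str]
    expires_at: float  # Unix timestamp


def issue_credential(task_id, scopes, ttl_seconds=900, now=None):
    """Issue a credential bound to one task, with a bounded lifetime."""
    issued_at = time.time() if now is None else now
    return AgentCredential(
        token=secrets.token_urlsafe(32),
        task_id=task_id,
        scopes=frozenset(scopes),
        expires_at=issued_at + ttl_seconds,
    )


def is_authorized(cred, task_id, scope, now=None):
    """Allow the action only if the credential is unexpired, for this task, and in scope."""
    current = time.time() if now is None else now
    return (
        current < cred.expires_at
        and cred.task_id == task_id
        and scope in cred.scopes
    )
```

All three conditions must hold at once, so a credential reused after its window, on a different ticket, or against a different resource is refused even though the token itself is still well-formed.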

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
Salesloft OAuth token breach — hackers stole OAuth tokens to access Salesforce data via Salesloft.

NHI Mgmt Group analysis

Agentic coding only becomes governable when tacit design knowledge is made executable. The article shows that a general-purpose agent could read the ticket but still guessed token tiers, primitives, and PR conventions. That failure mode is not about code generation quality alone. It is the result of governance knowledge living outside the system, which means the real control boundary is the context layer, not the editor.

Short-lived access is the correct identity shape for agent workflows, not a convenience trade-off. The post’s recommendation to disconnect the MCP context after a fixed window reflects a broader control truth. Persistent agent access creates a standing credential problem even when the task is narrow. The implication is that agent identity should be treated as ephemeral workload identity, with scope and lifetime aligned to a single bounded workflow.

Design system tickets expose a named governance gap: tacit workflow dependence. That gap exists when an organisation assumes experienced humans will always supply the missing context that the repo does not encode. It fails when the executor is an agent because the system cannot infer unstated conventions, intended component behaviour, or quality thresholds. Practitioners should rethink how much of their operating model depends on human memory rather than machine-readable rules.

Human qualification remains the decisive control point in agentic coding pipelines. The article’s ticket-label trigger works because it keeps a person responsible for deciding whether the task is truly ready. That is not a temporary workaround. It is the control that prevents ambiguous work from being handed to an agent that will otherwise optimise for completion, not correctness. IAM and engineering leaders should treat qualification as a governance gate, not a process detail.

Design systems are an early proving ground for autonomous work, but only because the work is bounded. The article shows the pattern that will repeat elsewhere: agents perform best where conventions are strict, validation is automatic, and the output shape is predictable. That does not mean the model is autonomous in the broader sense. It means the task is well-contained enough for governed execution. Practitioners should use such environments to test control assumptions before expanding agent access.

From our research:
64% of valid secrets leaked in 2022 are still valid and exploitable today, according to The State of Secrets Sprawl 2026.
AI-related credential leaks surged 81.5% year-over-year in 2025, with the surrounding AI infrastructure leaking 5x faster than core LLM providers.
For a broader control model, see OWASP Agentic AI Top 10 for the agentic risk patterns that matter most.

What this signals

Design-system automation is a preview of how agent governance will spread across the enterprise. The same pattern will show up in code review, ticket triage, documentation generation, and platform operations. Teams that rely on tacit human judgment to make workflows safe will discover that the context boundary is the real control surface.

With 28.65 million new hardcoded secrets detected in public GitHub commits in 2025 alone, according to The State of Secrets Sprawl 2026, machine-assisted workflows are already increasing the volume of sensitive material that governance has to absorb. The practical signal is clear: access lifetime, context quality, and review discipline now matter as much as the task itself.

For practitioners

Encode repeated workflows as executable agent skills: Write narrow skills for each atomic contributor workflow, such as scaffolding a component, defining tokens, or opening a merge request. Keep the instructions versioned alongside the code so convention changes update the workflow at the same time.
Bind agent access to short-lived, scoped credentials: Issue credentials that expire with the task window and are limited to the exact repository, token registry, or component context the agent needs. Do not let one agent session become a standing route into the design system.
Require human ticket qualification before execution: Keep the approval step with a developer or designer who can decide whether the ticket is specific enough for an agent to act on. If a new contributor could not implement it from the description alone, the agent should not receive it.
Measure rewrite rate before velocity: Track how often agent-generated PRs need substantial correction versus minor review. Also watch whether the agent reaches for the correct primitives or falls back to raw elements, because that is the clearest signal that the context layer is working or failing.
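
The two measurements in the last item can be computed from simple PR records. The sketch below assumes a hypothetical `AgentPR` record in which a reviewer has already marked whether the PR needed substantial correction and how many raw elements it used; how those fields get populated is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPR:
    pr_id: str
    substantial_rewrite: bool  # reviewer judged that major correction was needed
    raw_element_uses: int      # raw elements used where a primitive exists


def rewrite_rate(prs: list[AgentPR]) -> float:
    """Share of agent PRs that needed substantial correction."""
    if not prs:
        return 0.0
    return sum(pr.substantial_rewrite for pr in prs) / len(prs)


def raw_fallback_rate(prs: list[AgentPR]) -> float:
    """Share of agent PRs that fell back to at least one raw element."""
    if not prs:
        return 0.0
    return sum(pr.raw_element_uses > 0 for pr in prs) / len(prs)
```

Tracking both numbers per skill version, rather than only in aggregate, makes it easier to see whether a change to the context layer actually moved either rate.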

Key takeaways

The core risk is not that agents cannot write code, but that they write plausible code without knowing the system’s hidden rules.
The evidence points to a governance gap, not a tooling gap, because quality improves only after context, skills, and qualification gates are made explicit.
The right control response is short-lived scoped access, executable workflow guidance, and human task qualification before an agent starts.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	(none cited)	Agent skills, tool context, and bounded execution map directly to agentic AI risk controls.
OWASP Non-Human Identity Top 10	NHI-03	Short-lived agent credentials and scoped context are core NHI lifecycle concerns.
NIST CSF 2.0	PR.AA-01	The post centers on access qualification, identity scope, and controlled execution.

Constrain agent tool use, session scope, and uncertainty handling before allowing workflow automation.

Key terms

Agentic Coding: Software development where an AI system performs part of the implementation work by selecting actions at runtime, using tools, and producing code artifacts. In governed environments, the key question is not whether it can write code, but whether its context, access, and review boundaries are tightly controlled.
Executable Skill: A machine-readable workflow instruction that tells an agent what to do, in what order, and with which commands or paths. Unlike ordinary documentation, an executable skill is designed to be run, audited, and updated alongside the system it describes.
Tacit Workflow Knowledge: The operational understanding people carry without writing it down, such as local component conventions, review habits, and token hierarchy rules. It is a common failure point for agents because the system may appear documented while the critical decision logic still lives in human memory.
Short-Lived Agent Credential: A time-bounded identity token or access grant used by an AI workflow for a specific task or session. For agent governance, short-lived access reduces standing privilege risk and aligns the identity lifespan with the work actually being performed.

This post draws on content published by 1Password: agentic coding in design systems and what the team learned.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-05-19.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org