Applied AI showcase reveals how internal AI systems are changing work

By NHI Mgmt Group Editorial Team | Published 2026-05-20 | Domain: Agentic AI & NHIs | Source: WorkOS

TL;DR: Internal agents, routines, and workflow systems now help ship code, enrich GTM data, and automate writing, while deterministic harnesses, scoped credentials, and human-verifiable outputs keep those systems usable, according to WorkOS. The lesson for identity teams is that AI capability only scales safely when authority, tooling, and proof are bound to explicit governance constraints.

At a glance

What this is: WorkOS’ Applied AI showcase is an internal case study on building company-wide AI workflows, with a key finding that useful automation depends on deterministic harnesses, scoped credentials, and verifiable outputs.

Why it matters: AI-native workflows still rely on identity controls, so IAM, NHI, and autonomous governance teams need to understand how access, approval, and traceability shift when agents do more work.

By the numbers:

39 apps shipped to production in a single day during Claude Day.

👉 Read WorkOS' recap of its Applied AI showcase and internal AI tools

Context

AI-native operating models change the identity problem before they change the code. Once internal teams let agents trigger workflows, open pull requests, write content, or enrich data, the question is no longer whether AI is useful. The question is which identities can act, on what authority, and with what proof.

That shift matters for both non-human identity governance and agentic AI oversight. A workflow that can call tools, move through steps, and publish results still depends on credentials, scoped permissions, and lifecycle controls, but the control points move from static access grants toward runtime verification and deterministic execution boundaries.

Key questions

Q: How should security teams govern AI routines that can change production work?

A: Treat each routine as a separate non-human identity with its own lifecycle, scope, and audit record. Do not rely on a shared bot account or generic automation role. Require explicit permissions for the smallest action set, and pair them with evidence gates so the routine must prove completion before it can advance or publish results.

Q: Why do AI workflows need more than standard access reviews?

A: Because the risk is often runtime scope drift, not just excess standing access. An AI system can trigger, branch, and complete work faster than periodic review cycles can observe. Governance needs pre-approved boundaries, typed state transitions, and proof requirements that verify what happened while the workflow was running, not only after the fact.

Q: What breaks when AI agents share one automation identity?

A: Accountability breaks first, followed by offboarding and auditability. Shared access makes it hard to know which routine performed a change, which credential to revoke, and whether a given approval or exception still applies. Separate identities make incident response and lifecycle management materially easier because each routine has a clear owner and scope.

Q: How can organisations keep AI-generated changes trustworthy?

A: Require machine-readable proof, not just a successful prompt or a model claim. Test output, smoke results, validation logs, and recorded execution evidence should be part of the workflow itself. That makes the control plane about verifiable outcomes, which is the only defensible way to let AI operate inside release or operations pipelines.

Technical breakdown

Deterministic harnesses turn agent output into governed work

A harness is the control structure around an AI system that constrains what the agent can do, when it can move, and what evidence it must produce before continuing. In Case, one of the showcased WorkOS systems, the state machine is the governing layer, not the model itself. Each agent step has typed inputs and outputs, so the system rejects progress when the required artifact is missing. That makes the workflow auditable even when the underlying model is probabilistic. The security lesson is that useful AI work often depends less on model sophistication than on whether the surrounding process can force truth-bearing checkpoints.

Practical implication: require artifact-backed transitions for any AI workflow that can modify code, content, or operational data.
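The artifact-backed transition described above can be illustrated with a minimal sketch. This is not how Case is implemented; the step names, the artifact names, and the `Harness` class are all assumptions made for illustration. The only point it demonstrates is that progression depends on a recorded artifact, not on the agent's claim of completion.

```python
from dataclasses import dataclass, field
from enum import Enum


class Step(Enum):
    PLAN = "plan"
    IMPLEMENT = "implement"
    VERIFY = "verify"
    DONE = "done"


# Hypothetical mapping: the artifact each step must produce before the
# workflow may leave it. The artifact names are illustrative only.
REQUIRED_ARTIFACT = {
    Step.PLAN: "plan_doc",
    Step.IMPLEMENT: "diff",
    Step.VERIFY: "test_report",
}

NEXT_STEP = {
    Step.PLAN: Step.IMPLEMENT,
    Step.IMPLEMENT: Step.VERIFY,
    Step.VERIFY: Step.DONE,
}


class MissingArtifactError(Exception):
    """Raised when a step tries to advance without its required artifact."""


@dataclass
class Harness:
    step: Step = Step.PLAN
    artifacts: dict[str, str] = field(default_factory=dict)
    history: list[tuple[str, str]] = field(default_factory=list)

    def submit(self, name: str, content: str) -> None:
        """Record an artifact produced by the agent."""
        self.artifacts[name] = content

    def advance(self) -> Step:
        """Move to the next step only if the required artifact is present."""
        if self.step is Step.DONE:
            raise RuntimeError("workflow already complete")
        required = REQUIRED_ARTIFACT[self.step]
        if not self.artifacts.get(required):
            raise MissingArtifactError(f"{self.step.value} requires {required!r}")
        # The history doubles as an audit trail of evidence-backed transitions.
        self.history.append((self.step.value, required))
        self.step = NEXT_STEP[self.step]
        return self.step
```

Calling `advance()` on a fresh `Harness` raises `MissingArtifactError`; after `submit("plan_doc", ...)` the same call moves the workflow to the implement step and records the transition.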

Scoped routine credentials separate trigger rights from broad access

Project Horizon uses per-routine tokens, a permalink, and locked egress to a trusted domain list. That is a classic non-human identity pattern, but adapted for AI-triggered execution. The token identifies the routine, the domain list limits where it can reach, and the trigger surface can be Slack, GitHub events, schedule, or API. This matters because AI systems often fail not from lack of intelligence but from over-broad authority. When the routine can act only through a constrained identity and a constrained network path, the blast radius is narrower and the audit trail is clearer.

Practical implication: bind every AI routine to a distinct identity, egress boundary, and lifecycle so the trigger channel is not the same thing as standing privilege.
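A rough sketch of the pattern, under stated assumptions: the source does not describe how Project Horizon stores tokens or enforces egress, so the `RoutineIdentity` record, the hashed-token storage, and the exact-host HTTPS check below are illustrative choices, not the WorkOS design.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class RoutineIdentity:
    routine_id: str
    token_sha256: str              # assumption: only a hash of the token is stored
    allowed_hosts: frozenset[str]  # egress allowlist for this routine


def issue_routine(routine_id: str, allowed_hosts: set[str]) -> tuple[RoutineIdentity, str]:
    """Create an identity and return it with the one-time plaintext token."""
    token = secrets.token_urlsafe(32)
    digest = hashlib.sha256(token.encode()).hexdigest()
    return RoutineIdentity(routine_id, digest, frozenset(allowed_hosts)), token


def authenticate(identity: RoutineIdentity, presented_token: str) -> bool:
    """Compare the presented token's hash to the stored hash in constant time."""
    digest = hashlib.sha256(presented_token.encode()).hexdigest()
    return hmac.compare_digest(digest, identity.token_sha256)


def egress_allowed(identity: RoutineIdentity, url: str) -> bool:
    """Allow only HTTPS requests to an exact host on the routine's allowlist."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in identity.allowed_hosts
```

Because each routine holds its own token and host list, revoking or narrowing one routine leaves every other routine untouched, which is the property a shared automation account cannot offer.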

Machine-readable proof is becoming part of the access model

The showcase repeatedly treats proof as a first-class control, not an afterthought. Case requires test output, smoke results, or recorded browser sessions before a change can advance. BlogBot uses fact extraction, evals, sensitivity checks, and a feedback loop that can file issues and verify fixes. This is identity-adjacent governance because the workflow does not simply ask whether the actor had access. It asks whether the actor can prove it completed the task correctly. In AI-heavy environments, proof artifacts start to function like conditional authorisation evidence.

Practical implication: design approval gates around verifiable outputs, not just around who or what initiated the task.
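One way to picture a proof-based gate is a function that approves only when a machine-readable report checks out. The report schema here (`commit`, `tests_run`, `tests_failed`, `smoke_passed`) is hypothetical and not taken from Case or BlogBot; real pipelines would substitute whatever format their test tooling emits.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GateDecision:
    approved: bool
    reasons: tuple[str, ...]


def _is_count(value: object) -> bool:
    """True for non-negative integers, excluding booleans."""
    return type(value) is int and value >= 0


def evaluate_proof(raw_report: str, expected_commit: str) -> GateDecision:
    """Approve only if the report parses, matches the commit, and shows passing checks."""
    try:
        report = json.loads(raw_report)
    except json.JSONDecodeError:
        return GateDecision(False, ("report is not valid JSON",))
    if not isinstance(report, dict):
        return GateDecision(False, ("report is not a JSON object",))

    reasons = []
    if report.get("commit") != expected_commit:
        reasons.append("report does not match the commit under review")
    tests_run = report.get("tests_run")
    if not _is_count(tests_run) or tests_run == 0:
        reasons.append("no tests were run")
    tests_failed = report.get("tests_failed")
    if not _is_count(tests_failed) or tests_failed != 0:
        reasons.append("tests failed or failure count missing")
    if report.get("smoke_passed") is not True:
        reasons.append("smoke check did not pass")
    return GateDecision(not reasons, tuple(reasons))
```

A free-text claim such as "model says it works" fails at the parsing step, so the gate never has to judge the model's confidence, only the evidence.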

NHI Mgmt Group analysis

AI-native operating models are turning identity controls into workflow controls. WorkOS is showing that AI is not just another workload, because it sits inside the operating loop that creates, changes, and ships company work. That means the effective control plane is no longer limited to login, secrets, or ticketing. The practitioner takeaway is that identity governance now has to describe what an AI routine is allowed to change, not only what it is allowed to reach.

Scoped routine identity is the right abstraction for internal AI, not shared bot access. The article’s token-per-routine model, domain restriction, and linked execution surface reflect a more mature NHI pattern than a single shared automation account. Shared access would blur accountability, weaken offboarding, and make audit trails noisy. The implication is that internal AI programmes should be governed like fleets of separate machine identities, each with its own lifecycle and blast radius.

Deterministic harnesses are the governance layer that makes AI production-safe. Case demonstrates that the model can be unreliable, but the workflow can still be reliable if state transitions are typed and evidence-gated. That shifts the governance conversation from trusting the model to constraining the process. Practitioners should treat state machines, validation rules, and proof requirements as core identity controls for autonomous execution paths.

Access review is no longer sufficient when the real risk is runtime scope drift. The old assumption was that a privileged actor’s useful work could be reviewed after the fact. In AI-heavy operating models, the actor may trigger, branch, and complete tasks faster than a review cycle can observe them. The implication is that governance has to move from periodic visibility to pre-approved boundaries and continuous evidence.

Identity blast radius is now a design variable in internal AI programmes. The most important question is not whether a routine can do the job, but how much damage it can do if the prompt, context, or downstream action is wrong. That is a broader identity security problem than classic bot management. Practitioners should measure routine blast radius the same way they measure privileged access risk.
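To make the idea of measuring blast radius concrete, here is a deliberately simple scoring sketch. The action weights, the `Grant` record, and the formula (weight multiplied by the number of reachable resources) are assumptions invented for illustration; they are not an established metric and would need calibrating against an organisation's own privileged access risk model.

```python
from dataclasses import dataclass

# Hypothetical weights: how damaging each action class is if misused.
ACTION_WEIGHT = {"read": 1, "write": 5, "delete": 8, "deploy": 10}


@dataclass(frozen=True)
class Grant:
    action: str          # one of the ACTION_WEIGHT keys
    resource_count: int  # how many resources the grant covers


def blast_radius(grants: list[Grant]) -> int:
    """Sum of action weight times resources covered, across all grants."""
    return sum(ACTION_WEIGHT[g.action] * g.resource_count for g in grants)


def rank_routines(routines: dict[str, list[Grant]]) -> list[tuple[str, int]]:
    """Return (routine, score) pairs, highest blast radius first."""
    scored = [(name, blast_radius(grants)) for name, grants in routines.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Even a crude ranking like this gives reviewers a reason to look first at the routines that combine wide reach with destructive actions.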

From our research:
DeepSeek accidentally embedded over 11,000 secrets in its training data and left a database exposed online, revealing more than one million sensitive records including chat histories, backend credentials, and API keys, according to LLMjacking: How Attackers Hijack AI Using Compromised NHIs.
When AWS credentials are exposed publicly, attackers attempt access within an average of 17 minutes, and as quickly as 9 minutes in some cases.
For a broader control lens, see NIST Cybersecurity Framework 2.0 and map AI workflow governance to identify, protect, detect, and respond.

What this signals

AI-native teams should expect identity governance to move closer to the runtime layer. When internal systems can spawn routines, write code, and close loops through Slack or API triggers, the meaningful control is not only authentication. It is whether the workflow can prove what it did, where it could reach, and who owns its lifecycle. That is why routine-specific boundaries and evidence-first controls should sit alongside conventional IAM oversight.

Routine sprawl will become the hidden identity debt in AI programmes. Organisations that let internal agents proliferate without a distinct owner, token, and offboarding path will create governance problems that look like shadow automation. The pattern is already familiar in NHI estates, and AI just speeds it up. For a governance baseline, align the programme to NIST Cybersecurity Framework 2.0 so access, detection, and response stay linked.
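Routine sprawl can be surfaced with a basic inventory check. The sketch below assumes an inventory of `RoutineRecord` entries exists; the record fields and the 90-day review threshold are hypothetical and stand in for whatever an organisation already tracks.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RoutineRecord:
    name: str
    owner: Optional[str]                # accountable person or team
    token_id: Optional[str]             # credential identifier, if known
    offboarding_runbook: Optional[str]  # where the retirement steps live
    last_reviewed: Optional[date]


def sprawl_findings(
    records: list[RoutineRecord],
    today: date,
    max_review_age_days: int = 90,  # assumed threshold, adjust to policy
) -> dict[str, list[str]]:
    """Map each routine with governance gaps to a list of its issues."""
    token_use = Counter(r.token_id for r in records if r.token_id)
    findings: dict[str, list[str]] = {}
    for r in records:
        issues = []
        if not r.owner:
            issues.append("no owner")
        if not r.token_id:
            issues.append("no distinct token")
        elif token_use[r.token_id] > 1:
            issues.append("token shared with another routine")
        if not r.offboarding_runbook:
            issues.append("no offboarding path")
        if r.last_reviewed is None or (today - r.last_reviewed).days > max_review_age_days:
            issues.append("review overdue")
        if issues:
            findings[r.name] = issues
    return findings
```

Routines that pass every check are omitted from the result, so an empty dictionary means the inventory has no detectable gaps under these assumed rules.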

For practitioners

Define separate identities for every AI routine. Give each routine its own token, lifecycle, and ownership record so Slack-triggered work, GitHub-triggered work, and API-triggered work are not collapsed into one shared automation identity.
Require proof before workflow advancement. Make test output, smoke evidence, or recorded verification a mandatory input to the next state in any AI-assisted release or content workflow.
Constrain egress and tool reach. Limit AI routines to trusted domains, approved data sources, and explicitly mapped tools so the model cannot expand its effective authority at runtime.
Separate trigger surfaces from execution authority. Treat Slack commands, GitHub events, schedules, and APIs as different trigger paths, but keep execution rights narrowly scoped to the smallest viable action set.
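The last recommendation, keeping trigger surfaces apart from execution authority, can be sketched as a policy check in which the trigger only decides whether a routine may start, while a separate action set decides what it may do. The `RoutinePolicy` structure and the action names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Trigger(Enum):
    SLACK = "slack"
    GITHUB = "github"
    SCHEDULE = "schedule"
    API = "api"


@dataclass(frozen=True)
class RoutinePolicy:
    routine_id: str
    accepted_triggers: frozenset[Trigger]  # which channels may start the routine
    allowed_actions: frozenset[str]        # what the routine may do once started


def authorize(policy: RoutinePolicy, trigger: Trigger, action: str) -> bool:
    """A trigger can start the routine, but only the policy grants actions."""
    if trigger not in policy.accepted_triggers:
        return False
    return action in policy.allowed_actions
```

Under this arrangement a Slack command and a GitHub event reach exactly the same action set, so adding a new trigger channel never widens what the routine is permitted to change.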

Key takeaways

The article shows that AI becomes operationally useful only when its actions are boxed in by deterministic workflow controls.
The strongest evidence is operational, not theoretical: WorkOS cites 39 production apps in a day and a three-minute fix loop for its internal BlogBot.
Practitioners should govern AI routines as distinct identities with scoped authority, proof requirements, and explicit lifecycle ownership.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10		AI routines with tool access need constrained execution and evidence gates.
OWASP Non-Human Identity Top 10	NHI-03	Routine-specific tokens and lifecycle ownership match NHI credential governance.
NIST CSF 2.0	PR.AA-05	Scoped access and runtime proof map to access control governance.

Limit routine privileges to the minimum necessary and tie them to explicit evidence requirements.

Key terms

AI Routine: A repeatable AI-driven workflow that performs a bounded business task, often through tools, APIs, or internal systems. In practice it behaves like a non-human identity when it can trigger actions, hold credentials, and move through a workflow with defined authority and evidence requirements.
Deterministic Harness: A controlling workflow layer that forces an AI system to follow typed steps, validated inputs, and required outputs. It reduces model improvisation by making progression dependent on evidence, which is especially important when the workflow can change code, content, or operational state.
Routine Identity: The distinct identity, permissions, and ownership assigned to a specific AI workflow or automation path. It separates one agentic process from another so credentials can be scoped, reviewed, revoked, and audited without treating all machine activity as one shared account.
Machine-Readable Proof: Verifiable output that demonstrates a task was completed correctly, such as test logs, validation artefacts, or recorded execution evidence. For AI-assisted workflows, proof becomes a control because it replaces trust in the model with evidence the system can check before moving on.

This post draws on content published by WorkOS: Inside the WorkOS Applied AI Showcase. Read the original.
