AI coding tools create net negative value for most enterprises

By NHI Mgmt Group Editorial Team
Published: 2025-10-28
Domain: General NHI
Source: WorkOS

TL;DR: A panel at Enterprise Ready Conference said about 80% of people using agentic coding tools see net negative value, highlighting a gap between AI productivity promises and enterprise implementation realities, according to WorkOS's ERC 2025 recap. The real issue is not model capability alone, but whether teams can define, measure, and govern productivity outcomes well enough to capture value.

At a glance

What this is: WorkOS's recap of an Enterprise Ready Conference panel arguing that most enterprises are using AI coding tools in ways that destroy value instead of creating it.

Why it matters: Productivity tooling now sits inside broader identity, access, and governance decisions, and the same adoption gap can distort controls for human users, NHI-enabled workflows, and autonomous systems.

By the numbers:

A panel at Enterprise Ready Conference said approximately 80% of people using agentic coding tools are getting net negative value from them, while perhaps 20% or less know how to use them effectively and extract significant value.


Context

AI coding tools are increasingly sold as productivity multipliers, but the enterprise reality is messier: when teams cannot define output, quality, or adoption clearly, the tools can increase activity without improving outcomes. In identity programmes, that kind of mismatch matters because access, automation, and approval logic often follow the same measurement habits as software delivery.

The article's central point is not that AI coding tools fail universally. It is that most organisations lack the operating model, metrics, and user guidance needed to make them useful at scale, which turns adoption into a governance problem as much as a technology problem.

For identity and security leaders, the lesson is broader than developer tooling. Any programme that introduces AI-assisted execution, delegated access, or autonomous decision support needs a clear way to separate apparent velocity from real control, or it will optimise the wrong thing.

Key questions

Q: How should security teams measure whether AI-assisted workflows are actually helping?

A: Track outcomes, not just output. Useful measures include cycle time, defect rates, review burden, exception volume, and whether the work reaches production without creating extra remediation. If AI increases throughput but also raises rework or support cost, it is not producing net value.
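One way to make the net value test concrete is a simple hours ledger: time the tool saves on one side, and the review, rework, and support time it adds on the other. The sketch below is illustrative only. The `WorkflowPeriod` fields and the sample figures are assumptions, not measurements from the panel or from WorkOS.

```python
from dataclasses import dataclass


@dataclass
class WorkflowPeriod:
    """Hypothetical per-period measurements for one team, all in hours."""
    hours_saved_drafting: float  # time the tool saved while producing work
    extra_review_hours: float    # added reviewer time on AI-assisted changes
    rework_hours: float          # time spent fixing AI-assisted work later
    support_hours: float         # downstream support or remediation time


def net_value_hours(p: WorkflowPeriod) -> float:
    """Hours gained minus the hours the same work cost elsewhere."""
    cost = p.extra_review_hours + p.rework_hours + p.support_hours
    return p.hours_saved_drafting - cost


# Made-up sample: throughput looks better, but the ledger is negative.
period = WorkflowPeriod(
    hours_saved_drafting=40.0,
    extra_review_hours=18.0,
    rework_hours=22.0,
    support_hours=6.0,
)
print(net_value_hours(period))  # -6.0
```

A negative result means the period consumed more time than it saved, which is the "not producing net value" case described above.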

Q: When does AI-assisted productivity become a governance risk?

A: It becomes a governance risk when teams scale access before they can define quality, accountability, and acceptable use. At that point, the organisation is rewarding activity without proving value, which can amplify hidden cost, weak review discipline, and inconsistent decision-making.

Q: What do organisations get wrong about AI coding tools?

A: They often treat prompting skill as the main issue when the real problem is product fit, workflow design, and control placement. If users need deep tribal knowledge just to get acceptable output, the programme has an adoption and governance problem, not only a training problem.

Q: How can enterprises keep humans accountable when AI speeds up execution?

A: Keep a clear human approval or review point wherever AI output can affect production, access, or customer experience. The goal is not to slow everything down, but to preserve a named owner for quality, exceptions, and escalation before the work is committed.
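As a rough sketch of what such a review point could look like in a commit or deployment pipeline, the check below blocks an AI-assisted change that touches a sensitive area unless a named human has approved it. The `Change` type, the `GATED_AREAS` labels, and the sample approver are all hypothetical and would need to map onto whatever change-management system is actually in use.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical impact areas that require a named human approver.
GATED_AREAS = {"production", "access", "customer_experience"}


@dataclass
class Change:
    summary: str
    ai_assisted: bool
    affects: set[str]                  # impact areas touched by the change
    approved_by: Optional[str] = None  # named human owner, if any


def may_commit(change: Change) -> bool:
    """Allow the commit unless an AI-assisted change touches a gated
    area without a named human approver."""
    needs_human = change.ai_assisted and bool(change.affects & GATED_AREAS)
    return (not needs_human) or bool(change.approved_by)


unreviewed = Change("update IAM role mapping", ai_assisted=True, affects={"access"})
reviewed = Change("update IAM role mapping", ai_assisted=True,
                  affects={"access"}, approved_by="j.doe")
print(may_commit(unreviewed))  # False
print(may_commit(reviewed))    # True
```

The gate only applies where output can affect gated areas, so low-risk work is not slowed down.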

Technical breakdown

Why AI coding tools create false productivity signals

Agentic coding tools can raise visible output, such as more pull requests, more commits, or faster issue closure, while quietly increasing rework, review load, and downstream defects. That creates a measurement trap: the organisation sees motion, not value. Productivity becomes distorted when the metrics track activity volume instead of business outcome, code quality, or maintainability. In practice, the risk is not only bad code. It is a governance model that rewards the wrong signal and then treats the result as success.

Practical implication: measure quality, rework, and delivery outcomes together, not PR volume alone.
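The measurement trap can be illustrated by comparing two reporting periods and flagging the case where PR volume rises while per-PR quality costs also rise. This is a minimal sketch under stated assumptions: the `DeliverySnapshot` fields, the choice of indicators, and the sample numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class DeliverySnapshot:
    """Hypothetical metrics for one team over one reporting period."""
    merged_prs: int
    reworked_prs: int     # merged PRs that later needed a fix or revert
    escaped_defects: int  # defects found after release
    review_hours: float   # total reviewer time spent


def per_pr(value: float, snapshot: DeliverySnapshot) -> float:
    """Normalise a total by PR count so periods of different size compare."""
    return value / snapshot.merged_prs if snapshot.merged_prs else 0.0


def false_signal(before: DeliverySnapshot, after: DeliverySnapshot) -> bool:
    """True when PR volume rose while any per-PR quality cost also rose."""
    more_activity = after.merged_prs > before.merged_prs
    worse_quality = any(
        per_pr(getattr(after, name), after) > per_pr(getattr(before, name), before)
        for name in ("reworked_prs", "escaped_defects", "review_hours")
    )
    return more_activity and worse_quality


before = DeliverySnapshot(merged_prs=100, reworked_prs=10,
                          escaped_defects=4, review_hours=120.0)
after = DeliverySnapshot(merged_prs=150, reworked_prs=30,
                         escaped_defects=9, review_hours=200.0)
print(false_signal(before, after))  # True
```

Normalising by PR count matters here: a raw rise in defects could simply reflect more work shipped, whereas a rise in defects per PR points to a quality problem.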

Why user guidance matters more than model capability

The panel's discussion of vocabulary, prompting, and onboarding shows a common enterprise failure mode: teams assume the tool will self-explain. In reality, AI-assisted systems often need structured usage guidance, because their output quality depends on how well users frame work, constrain scope, and interpret results. That is especially true when the tool sits inside an engineering workflow where a small wording change can change output shape, confidence, or task decomposition. Without explicit enablement, most users never join the minority that extracts value.

Practical implication: build role-specific enablement and usage patterns before scaling access across teams.

How productivity tooling intersects with governance and access control

When productivity tools become embedded in enterprise workflows, they start influencing who can act, how decisions are reviewed, and what gets automated. That makes them adjacent to identity governance even when they are not identity tools themselves. The enterprise question is no longer just whether the tool can generate code. It is whether the surrounding controls can contain errors, validate outputs, and preserve human accountability when AI accelerates execution. In other words, productivity tooling changes the pressure on IAM, not just the pace of development.

Practical implication: audit approval, review, and escalation controls before expanding AI-assisted execution.

NHI Mgmt Group analysis

Productivity is becoming an identity governance problem, not just a tooling problem. When enterprises cannot define what productive output looks like, they create room for AI-assisted systems to optimise the wrong thing. That failure mode affects human workflows today and will matter even more as delegated machine actions expand across development, operations, and access decisions. The practitioner conclusion is simple: measure what the identity-linked workflow is supposed to achieve, not just how much it produces.

The 80/20 split is really a control design warning. A small group can extract value from AI coding tools because they already know how to constrain, interpret, and verify the output. The majority cannot, which means the organisation is exposing broad access to a capability that only a narrow population can use safely and profitably. This is where identity governance, enablement, and workflow guardrails intersect. The practitioner conclusion is to treat adoption as a controlled entitlement, not a universal rollout.

Named concept: productivity trust debt. The article exposes the gap between promised acceleration and the operational cost of making AI-generated work trustworthy enough to use. That trust debt accumulates when teams accept speed metrics without paying for review, validation, and accountability. The practitioner conclusion is that enterprise AI value depends on reducing trust debt faster than the tool increases throughput.
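The accumulation described here can be modelled as a backlog: each week adds AI-generated changes, validation capacity removes some, and whatever remains unvalidated is the debt. The function below is a toy model, not a method from the source article. The weekly figures and the fixed capacity are assumptions chosen only to show the shape of the curve.

```python
def trust_debt_over_time(
    generated_per_week: list[int],
    validation_capacity_per_week: int,
) -> list[int]:
    """Return the unvalidated backlog (the 'debt') at the end of each week.

    Assumes, for illustration, that debt never goes below zero, so
    unused validation capacity is not banked for later weeks.
    """
    debt = 0
    history = []
    for generated in generated_per_week:
        debt = max(0, debt + generated - validation_capacity_per_week)
        history.append(debt)
    return history


# Throughput grows each week while validation capacity stays flat.
print(trust_debt_over_time([20, 30, 45, 60], validation_capacity_per_week=35))
# [0, 0, 10, 35]
```

Debt stays at zero while output is below review capacity, then compounds once throughput overtakes it, which is why validation capacity has to grow alongside adoption.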

AI-assisted execution does not eliminate human quality checks; it relocates them. The panel's point that humans still need to judge whether users can actually use the product translates directly to identity programmes. Any workflow that uses AI to draft, route, or recommend actions still needs a human or policy control to verify meaning, impact, and exception handling. The practitioner conclusion is to keep human accountability explicit wherever AI changes the execution path.

The market signal is a move from adoption hype to operational maturity. Enterprises will tolerate experimentation only while the learning curve is manageable. Once leaders see persistent net negative value, they will demand clearer definitions, stronger enablement, and better control evidence before expanding deployment. The practitioner conclusion is to align rollout plans with measurable operating outcomes, not with vendor promise curves.

From our research:
43% of security professionals are concerned about AI systems learning and reproducing sensitive information patterns from codebases, according to The State of Secrets in AppSec.
Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap.
That same report shows companies dedicating an average of 32.4% of their security budgets to secrets management and code security, reinforcing why productivity tooling and control design now need to be considered together.

What this signals

Productivity trust debt: when teams optimise for visible throughput before they can prove quality, they accumulate hidden cost that shows up later in review load, defects, and control fatigue. In identity-heavy programmes, that debt becomes especially dangerous because delegated actions can scale faster than the organisation can verify them.

The practical signal for readers is that AI-assisted workflows should not be expanded on the basis of enthusiasm or trial success alone. Programmes should be gated by measurable outcome improvements, clear owners, and a validation step that survives pressure to ship faster.

With 43% of security professionals already worried about AI systems reproducing sensitive information patterns from codebases, per The State of Secrets in AppSec, the productivity conversation is now inseparable from secrets hygiene and output verification.

For practitioners

Define productivity outcomes before expanding AI-assisted workflows: Tie tool adoption to specific outcomes such as cycle time, review quality, defect rates, or downstream rework. If the organisation cannot explain what better looks like, the rollout is premature.
Create enablement patterns for high-value users: Document prompt formats, task boundaries, review expectations, and escalation paths for the teams most likely to produce value. Treat training as part of the control surface, not a change-management afterthought.
Rebalance metrics away from raw output volume: Add measures that capture whether AI-assisted work is usable, maintainable, and trusted in production. Review these alongside delivery metrics so the programme does not reward speed that creates hidden debt.
Keep human validation in the workflow: Require a human quality check wherever AI changes execution paths, produces code that will ship, or influences operational decisions. The control should be explicit, repeatable, and owned by the delivery team.
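The recommendations above could be combined into a single expansion check that a programme runs before widening access for a team. The sketch below is one possible shape for that check. The `RolloutRequest` fields, the metric names in `LOWER_IS_BETTER`, and the sample values are hypothetical, and a real programme would choose its own outcome metrics and thresholds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RolloutRequest:
    team: str
    owner: Optional[str]        # named accountable person
    has_validation_step: bool   # human check before work ships
    baseline: dict[str, float]  # outcome metrics before the pilot
    pilot: dict[str, float]     # same metrics during the pilot


# Hypothetical outcome metrics where a lower value is better.
LOWER_IS_BETTER = ("cycle_time_days", "defect_rate", "rework_rate")


def blockers(req: RolloutRequest) -> list[str]:
    """List the reasons a rollout should not expand yet (empty = proceed)."""
    reasons = []
    if not req.owner:
        reasons.append("no named owner")
    if not req.has_validation_step:
        reasons.append("no human validation step")
    for metric in LOWER_IS_BETTER:
        if metric not in req.baseline or metric not in req.pilot:
            reasons.append(f"{metric} not measured")
        elif req.pilot[metric] > req.baseline[metric]:
            reasons.append(f"{metric} got worse")
    return reasons


request = RolloutRequest(
    team="payments",
    owner="platform-lead",
    has_validation_step=True,
    baseline={"cycle_time_days": 5.0, "defect_rate": 0.04, "rework_rate": 0.10},
    pilot={"cycle_time_days": 3.5, "defect_rate": 0.05, "rework_rate": 0.10},
)
print(blockers(request))  # ['defect_rate got worse']
```

In this made-up case the pilot shipped faster but with a higher defect rate, so the check holds the rollout even though the speed metric improved.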

Key takeaways

Most enterprises are not failing at AI adoption because the models are unusable, but because the surrounding operating model is weak.
Visible velocity can hide rework, trust debt, and quality loss, which is why output volume alone is a misleading success metric.
Identity, review, and accountability controls must be designed alongside AI-assisted workflows or the programme will scale the wrong behaviour.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST CSF 2.0	GV.OV-01	Governance and oversight are central to measuring AI-assisted workflow value.
OWASP Agentic AI Top 10		Agentic coding tools can change output, review, and accountability paths.
NIST AI RMF	GOVERN 1	The article is fundamentally about governance for AI-assisted decision support.

Assign ownership for AI-assisted workflows and validate that metrics reflect real outcomes.

Key terms

Agentic Coding Tool: A software tool that can propose or generate code with some degree of independent action during development workflows. In practice, the risk is not simply automation, but the way it shifts review burden, trust, and accountability across the delivery chain.
Productivity Trust Debt: The hidden cost created when a team accepts AI-generated output faster than it can validate quality, security, or maintainability. The debt shows up later as rework, review fatigue, incident exposure, or loss of confidence in the workflow.
Outcome-Based Productivity: A way of measuring productivity by business result rather than visible activity. In enterprise settings, this means tracking whether the work improves delivery, quality, and customer outcomes, instead of counting tasks, commits, or prompts.
AI-Assisted Workflow: A workflow in which an AI system helps create, route, or refine work while a human or policy still owns the final decision. The central governance question is where verification, escalation, and accountability sit when the output moves faster than review cycles.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity lifecycle are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or governance maturity, it is worth exploring.

This post draws on content published by WorkOS: The Productivity Paradox: When AI Tools Make Things Worse Before They Make Them Better. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-10-28.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org