Prompt-cost scoring is the practice of estimating how expensive a model request will be before generation begins. In security operations, it is used to decide whether a prompt should be allowed, capped, queued, or reviewed when the request is likely to consume disproportionate runtime resources.
Expanded Definition
Prompt-cost scoring evaluates a model request before execution to estimate compute, token, latency, and downstream tooling cost. In NHI and agentic AI security, the score is not just an economics signal; it becomes a control input that determines whether a prompt is admitted, throttled, queued, routed to a cheaper model, or escalated for review. This is adjacent to prompt-risk scoring, but the two are not identical: cost scoring focuses on resource consumption, while risk scoring focuses on harmful intent or policy violation. Guidance varies across vendors, and no single standard governs this yet, so implementation details differ by stack and workload class. For governance context, organisations often map the control to broader operational resilience practices described in the NIST Cybersecurity Framework 2.0, especially where service reliability and abuse resistance overlap.
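As a rough illustration of the score acting as a control input, the sketch below estimates a cost score from request size and maps it to one of the outcomes named above. The weights, thresholds, and the `PromptRequest`, `estimate_cost`, and `decide` names are all hypothetical; since no standard governs this yet, a real deployment would calibrate them against its own billing and latency data.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ADMIT = "admit"
    THROTTLE = "throttle"
    ROUTE_CHEAPER = "route_cheaper"
    QUEUE = "queue"
    REVIEW = "review"


@dataclass
class PromptRequest:
    prompt_chars: int        # length of the prompt plus attached context
    max_output_tokens: int   # requested generation ceiling
    tool_calls: int          # downstream tool invocations the request expects


# Assumed unit weights, for illustration only.
CHARS_PER_TOKEN = 4          # rough heuristic; real tokenizers vary
INPUT_WEIGHT = 1.0
OUTPUT_WEIGHT = 3.0          # assumes output tokens cost more than input tokens
TOOL_CALL_WEIGHT = 500.0     # assumed flat cost per downstream tool call


def estimate_cost(req: PromptRequest) -> float:
    """Return a unitless cost score computed before any generation runs."""
    input_tokens = req.prompt_chars / CHARS_PER_TOKEN
    return (
        input_tokens * INPUT_WEIGHT
        + req.max_output_tokens * OUTPUT_WEIGHT
        + req.tool_calls * TOOL_CALL_WEIGHT
    )


def decide(score: float) -> Decision:
    """Map a cost score to an admission outcome using assumed thresholds."""
    if score < 2_000:
        return Decision.ADMIT
    if score < 5_000:
        return Decision.THROTTLE
    if score < 10_000:
        return Decision.ROUTE_CHEAPER
    if score < 20_000:
        return Decision.QUEUE
    return Decision.REVIEW


small = PromptRequest(prompt_chars=4_000, max_output_tokens=200, tool_calls=0)
large = PromptRequest(prompt_chars=40_000, max_output_tokens=2_000, tool_calls=10)
print(decide(estimate_cost(small)).value)
print(decide(estimate_cost(large)).value)
```

The thresholds are deliberately ordered from least to most disruptive outcome, so tuning one boundary does not change the relative severity of the others.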
The most common misapplication is treating prompt-cost scoring as a content moderation tool, which occurs when teams assume a low-cost request is automatically safe and a high-cost request is automatically malicious.
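One way to avoid that misapplication is to keep the two signals separate in the admission logic. The sketch below assumes a hypothetical `risk_flagged` input produced by a separate prompt-risk check and an assumed `COST_REVIEW_THRESHOLD`; neither comes from a specific product.

```python
# Assumed threshold above which a request is considered operationally
# disproportionate; the value is illustrative only.
COST_REVIEW_THRESHOLD = 20_000


def admission(cost_score: float, risk_flagged: bool) -> str:
    """Combine cost and risk as independent signals.

    A cheap request can still be blocked on risk, and an expensive
    request is sent to review rather than treated as malicious.
    """
    if risk_flagged:
        return "block"
    if cost_score > COST_REVIEW_THRESHOLD:
        return "review"
    return "admit"


print(admission(cost_score=10, risk_flagged=True))
print(admission(cost_score=50_000, risk_flagged=False))
```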
Examples and Use Cases
Implementing prompt-cost scoring rigorously often introduces latency and tuning overhead, requiring organisations to weigh abuse prevention and budget control against user experience and model flexibility.
- A customer support agent submits a very long context window, and the score triggers queueing because the request would exceed the normal runtime budget.
- A code-generation workflow requests a large batch of tool calls, and the score routes it to a review step before any expensive external action begins.
- An internal assistant tries to fan out across multiple retrieval sources, and the score caps the request to prevent runaway token usage during peak load.
- A high-value admin prompt combines sensitive data access with heavy generation cost, and the score is paired with policy checks before approval.
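Two of the cases above, capping retrieval fan-out during peak load and sending large tool-call batches to review, can be sketched as follows. The `AgentPlan` structure, the limits, and the `apply_cost_controls` helper are assumptions made for illustration, not part of any particular agent framework.

```python
from dataclasses import dataclass


@dataclass
class AgentPlan:
    retrieval_sources: list[str]   # sources the agent wants to query
    tool_calls: list[str]          # external actions the agent wants to run


# Assumed limits; tune per workload class.
MAX_SOURCES_PEAK = 3
MAX_SOURCES_NORMAL = 8
TOOL_CALL_REVIEW_THRESHOLD = 5


def apply_cost_controls(plan: AgentPlan, peak_load: bool) -> tuple[AgentPlan, bool]:
    """Cap retrieval fan-out and flag large tool-call batches for review.

    Returns the (possibly capped) plan and whether a review step is
    required before any external action begins.
    """
    limit = MAX_SOURCES_PEAK if peak_load else MAX_SOURCES_NORMAL
    capped_sources = plan.retrieval_sources[:limit]
    needs_review = len(plan.tool_calls) > TOOL_CALL_REVIEW_THRESHOLD
    return AgentPlan(capped_sources, list(plan.tool_calls)), needs_review


plan = AgentPlan(
    retrieval_sources=["wiki", "tickets", "crm", "email", "drive"],
    tool_calls=[f"call_{i}" for i in range(6)],
)
capped, needs_review = apply_cost_controls(plan, peak_load=True)
print(capped.retrieval_sources, needs_review)
```

Truncating the source list is the simplest possible cap; a production system would more likely rank sources by expected relevance before cutting.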
In mature programs, teams compare prompt-cost scoring with capacity planning patterns from the Ultimate Guide to NHIs and with access-governance concepts in the NIST Cybersecurity Framework 2.0, because a costly prompt is often also an over-privileged one.
Why It Matters in NHI Security
Prompt-cost scoring matters because agentic systems can become expensive attack surfaces long before they become obvious security incidents. A malicious or misconfigured agent can create denial-of-wallet conditions, amplify compute waste, and hide abuse inside ordinary-looking traffic. This is especially relevant in environments where NHIs already carry broad privileges: the Ultimate Guide to NHIs reports that 97% of NHIs carry excessive privileges, which increases the risk of unauthorised access and broadens the attack surface. Cost scoring therefore supports both resilience and containment: it can slow abusive automation, preserve budget, and force human oversight when a request looks operationally disproportionate. It also complements AI governance guidance in the NIST Cybersecurity Framework 2.0 by turning resource anomalies into enforceable workflow decisions.
Organisations typically encounter prompt-cost scoring after unexpected spend spikes, at which point the practice becomes operationally unavoidable.
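A denial-of-wallet condition usually shows up as cumulative spend by one identity rather than as a single expensive request. The sketch below tracks estimated cost per NHI over a rolling window and reports when a new request would push that identity over its budget. The `SpendWindow` class, window length, and budget are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
from collections import defaultdict, deque


class SpendWindow:
    """Rolling per-identity spend tracker (illustrative only)."""

    def __init__(self, window_seconds: float, budget: float):
        self.window_seconds = window_seconds
        self.budget = budget
        # identity -> deque of (timestamp, cost) pairs, oldest first
        self._events = defaultdict(deque)

    def would_exceed(self, identity: str, cost: float, now: float) -> bool:
        """Return True if admitting this cost would exceed the budget."""
        events = self._events[identity]
        # Drop events that have aged out of the window.
        while events and now - events[0][0] >= self.window_seconds:
            events.popleft()
        spent = sum(c for _, c in events)
        return spent + cost > self.budget

    def record(self, identity: str, cost: float, now: float) -> None:
        """Record the cost of an admitted request."""
        self._events[identity].append((now, cost))


window = SpendWindow(window_seconds=60, budget=100)
if not window.would_exceed("svc-agent", 60, now=0):
    window.record("svc-agent", 60, now=0)
print(window.would_exceed("svc-agent", 60, now=10))
print(window.would_exceed("svc-agent", 60, now=61))
```

A caller that gets `True` back can apply whichever control fits the workload, such as queueing the request or forcing human review, rather than rejecting it outright.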
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST AI RMF set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | Not specified | Agentic controls address unsafe tool use and runaway model actions that cost scoring helps constrain. |
| NIST CSF 2.0 | PR.AA-05 (PR.AC-4 in CSF 1.1) | Least-privilege and access governance support limiting costly requests from over-scoped NHIs. |
| NIST AI RMF | Not specified | AI RMF treats resource and operational impacts as part of managing AI risks. |
Assess cost scoring as a risk-control that reduces operational and financial harm.
Related resources from NHI Mgmt Group
- What is the 'no prompt means no action' principle in Agentic AI security?
- What is the difference between prompt injection risk and identity abuse in agents?
- What is the difference between prompt-based control and runtime authorization for agents?
- What is the difference between prompt guardrails and identity controls for agents?
Reviewed and updated by the NHIMG editorial team on June 12, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org