Request-scoped query caching cuts duplicate database reads by 30%

By NHI Mgmt Group Editorial Team
Published 2025-06-13
Domain: Best Practices
Source: WorkOS

TL;DR: A request-scoped query-cache layer in a NestJS plus TypeORM backend reduced duplicate database reads by about 30% with no query rewrites, according to WorkOS, but only by combining context-local storage with aggressive invalidation on writes. The lesson for practitioners is that performance gains are real when cache scope, staleness, and instrumentation are all designed together, not bolted on.

At a glance

What this is: WorkOS describes a request-scoped query-cache layer for NestJS and TypeORM that cut repeated reads by about 30% without changing individual queries.

Why it matters: IAM and platform teams should care because the same design pattern can reduce repeated entitlement, membership, and policy lookups across NHI, autonomous, and human access paths.

By the numbers:

Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap.
Only 44% of organisations are currently using a dedicated secrets management system.

Context

Request-scoped query caching is a performance pattern, but it also exposes a broader identity and access lesson: repeated reads happen when application layers keep asking the same question inside one execution path. In practice, that means entitlement checks, membership lookups, and environment settings can be fetched many times unless the system has a safe way to reuse results within a bounded context.

For IAM and platform teams, the important detail is not the cache itself but the governance boundary around it. Once a request can traverse many helpers, the real task becomes deciding when state may be reused, when it must be discarded, and how to prove the optimisation never leaks data across requests or after writes.

Key questions

Q: How should teams reduce repeated database reads in a single request without risking stale identity data?

A: Use request-scoped caching, not process-wide caching, so reuse is limited to one execution context. Tie invalidation to every write that could change the data being read, especially for entitlement, membership, or policy lookups. That preserves correctness while removing redundant database work.

Q: Why do repeated entitlement and membership lookups become a performance problem in layered applications?

A: Because each helper or service can ask the same question again, even when the answer has not changed within that request. In decomposed systems, this multiplies database traffic and adds latency without improving decision quality. The fix is to reuse results only inside the same request boundary.

Q: How do you know whether query caching is actually reducing load?

A: Measure database executions directly, preferably with server-side telemetry such as pg_stat_statements, and compare the same user journey with caching enabled and disabled. Application-side hit counts can be misleading if they do not reflect real database calls. The database should show the reduction clearly.

Q: What is the difference between request-scoped caching and a shared application cache?

A: Request-scoped caching lives only for one HTTP request and cannot leak across users or sessions. A shared application cache persists longer, which can improve reuse but raises correctness and isolation risks when the underlying data changes. For identity-adjacent data, request scope is the safer default.

Technical breakdown

How request-scoped query caching works in a NestJS and TypeORM stack

A request-scoped cache keeps query results only for the lifetime of a single HTTP request. In this setup, the NestJS application holds a context-local container through continuation-local storage (on Node.js, typically built on AsyncLocalStorage), while TypeORM consults a cache provider before sending repeated SQL to PostgreSQL. The cache key is the query string, so identical reads can be served from memory during the same request path. The design is safe only when the cache is isolated per request and never shared across concurrent requests.

Practical implication: bind the cache to request context, not to the process, so one user's reads cannot affect another user's response.
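
To make the isolation property concrete, here is a minimal sketch using Node.js's built-in AsyncLocalStorage. It is not the WorkOS implementation and does not use TypeORM's cache provider interface. The names runInRequestScope and cachedQuery are hypothetical, and the choice to key on SQL text plus serialised parameters is an assumption made for the example.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// One Map per request. Values are promises so that concurrent identical
// reads inside the same request share a single database round trip.
type RequestCache = Map<string, Promise<unknown>>;

const requestCache = new AsyncLocalStorage<RequestCache>();

// Hypothetical entry point: wrap each request handler so it starts with
// a fresh, empty cache that is unreachable from any other request.
function runInRequestScope<T>(handler: () => T): T {
  return requestCache.run(new Map(), handler);
}

// Hypothetical helper: reuse the result if the same query already ran
// in this request; otherwise execute it and remember the promise.
function cachedQuery<T>(
  sql: string,
  params: unknown[],
  execute: () => Promise<T>,
): Promise<T> {
  const cache = requestCache.getStore();
  if (!cache) return execute(); // outside a request scope: never cache

  // Assumption: key on SQL text plus serialised parameters.
  const key = `${sql} -- ${JSON.stringify(params)}`;
  const hit = cache.get(key);
  if (hit) return hit as Promise<T>;

  const pending = execute();
  cache.set(key, pending);
  return pending;
}
```

Because the Map is created inside run(), it is dropped when the request's asynchronous work completes, and code running outside any request scope falls through to the database uncached.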

Why cache invalidation matters more than cache hits

The main risk in application query caching is stale reads. TypeORM's default cache behaviour is time-based, which is useful for some workloads but does not fit systems that must reflect writes immediately. The article's approach clears the entire request cache on save, update, or delete, which preserves correctness at the cost of some reuse. That trade-off is appropriate when freshness is more important than maximum hit rate, especially for identity, entitlement, or policy data.

Practical implication: invalidate aggressively on writes whenever cached data feeds access decisions or user-visible state.
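
The clear-everything-on-write rule can be sketched in a few lines. This is an illustration under stated assumptions, not the article's code: RequestQueryCache, roleOf, and setRole are hypothetical names, and an in-memory Map stands in for the database. In a TypeORM application the invalidateAll call would be wired to save, update, and delete paths.

```typescript
// Minimal per-request cache with whole-cache invalidation on any write.
class RequestQueryCache {
  private readonly entries = new Map<string, unknown>();

  // Return the cached value for key, loading and storing it on a miss.
  read<T>(key: string, load: () => T): T {
    if (this.entries.has(key)) return this.entries.get(key) as T;
    const value = load();
    this.entries.set(key, value);
    return value;
  }

  // Called after every save, update, or delete: drop everything,
  // trading some reuse for guaranteed freshness.
  invalidateAll(): void {
    this.entries.clear();
  }
}

// Hypothetical in-memory table standing in for the database.
const memberships = new Map<string, string>([["alice", "viewer"]]);
let reads = 0; // counts simulated database reads

function loadRole(user: string): string | undefined {
  reads++;
  return memberships.get(user);
}

function roleOf(cache: RequestQueryCache, user: string): string | undefined {
  return cache.read(`role:${user}`, () => loadRole(user));
}

function setRole(cache: RequestQueryCache, user: string, role: string): void {
  memberships.set(user, role);
  cache.invalidateAll(); // the write makes every cached read suspect
}
```

Clearing the whole cache rather than selected keys avoids having to reason about which reads a given write could affect, which is the trade-off described above.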

Why local query instrumentation is better than guesswork

The article uses PostgreSQL's pg_stat_statements extension to count real query executions during manual navigation. That avoids the common mistake of measuring cache hits inside the application while missing the larger database-side effect. By comparing runs with the cache disabled and enabled, the team could verify that repeated reads really dropped by about 30%. In performance work, external database telemetry is often the cleanest way to prove that an optimisation changed system behaviour.

Practical implication: validate caching with database-level query counts, not just application logs or intuition.
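
The comparison itself is simple arithmetic over two snapshots. The sketch below assumes the rows have already been fetched from pg_stat_statements (which exposes query and calls columns) and that calls has been converted to a JavaScript number. The helper names and the sample figures are hypothetical and are not WorkOS's measurements.

```typescript
// SQL one might run around each navigation pass.
const RESET_SQL = "SELECT pg_stat_statements_reset()";
const SNAPSHOT_SQL =
  "SELECT query, calls FROM pg_stat_statements ORDER BY calls DESC";

interface StatementRow {
  query: string;
  calls: number; // assumption: already converted from the driver's bigint/string
}

// Total executions recorded by the database for one pass.
function totalCalls(rows: StatementRow[]): number {
  return rows.reduce((sum, row) => sum + row.calls, 0);
}

// Percentage drop in executions between the cache-off and cache-on passes.
function percentReduction(cacheOff: StatementRow[], cacheOn: StatementRow[]): number {
  const off = totalCalls(cacheOff);
  if (off === 0) return 0; // nothing ran, so nothing to compare
  return ((off - totalCalls(cacheOn)) * 100) / off;
}

// Made-up snapshots of the same user journey, for illustration only.
const cacheOff: StatementRow[] = [
  { query: "SELECT ... FROM memberships WHERE user_id = $1", calls: 60 },
  { query: "SELECT ... FROM teams WHERE id = $1", calls: 40 },
];
const cacheOn: StatementRow[] = [
  { query: "SELECT ... FROM memberships WHERE user_id = $1", calls: 40 },
  { query: "SELECT ... FROM teams WHERE id = $1", calls: 30 },
];

const reduction = percentReduction(cacheOff, cacheOn);
```

Resetting the statistics before each pass and replaying an identical navigation path keeps the two snapshots comparable.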

NHI Mgmt Group analysis

Repeated-read elimination is an identity governance pattern, not just an application performance trick. Any IAM or entitlement workflow that asks the same question many times inside one request creates avoidable database load and hidden complexity. The deeper point is that governance systems often pay for the same lookup repeatedly because their service layers are decomposed but not context-aware. Practitioners should treat request-scoped reuse as a control boundary for stateful checks, not as a backend optimisation alone.

Cache scope must match decision scope, or the optimisation becomes unsafe. A cache that persists beyond the request would blur the boundary between one user's access path and another's, which is unacceptable in identity workflows. The article's request-scoped model is the right shape because it constrains reuse to the same execution context. That makes the pattern relevant wherever identity-adjacent reads are repeated during a single transaction, especially in entitlement-heavy systems.

Freshness is the governing premise, and stale identity data is more expensive than duplicate reads. The article shows that write-triggered cache clearing is the real safeguard, because a membership or entitlement lookup cannot be trusted if a write has already changed the underlying state. That applies directly to access decisions, role calculations, and environment-specific settings. Practitioners should prioritise correctness boundaries before chasing maximum cache lifetime.

Query caching exposes a broader concept: identity lookup locality. When a service repeatedly asks for the same user, team, or environment facts within one request, it is signalling that the programme needs tighter locality around access context. This is especially relevant in platforms where human, workload, and service-to-service checks all converge on the same underlying data. Teams should use that pattern to identify where governance logic is being recomputed unnecessarily.

The performance lesson scales into governance architecture only when instrumentation is external to the application. The article's use of pg_stat_statements matters because it proves the effect from the database's point of view, not just the application's. That same discipline should be applied to identity-adjacent systems where teams need to distinguish perceived efficiency from actual reduction in reads, retries, and contention. Practitioners should validate optimisations at the enforcement layer, not just in code.

From our research:
The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
44% of developers are reported to follow security best practices for secrets management, which means behaviour gaps often outlast platform intentions.
The Guide to the Secret Sprawl Challenge extends this discussion into rotation, exposure, and remediation patterns that teams need to operationalise.

What this signals

Identity lookup locality: Teams should start treating repeated reads inside a single request as a signal of governance design, not just inefficiency. If the same access facts are recomputed across helpers, the programme has likely spread decision logic across too many layers and needs a clearer context boundary.

The operational signal is simple: if database-side query counts fall sharply when request-scoped caching is enabled, the system was carrying avoidable duplication. That makes telemetry a useful proxy for where identity-adjacent logic is too fragmented, and where the same access context is being rebuilt more than once.

For teams managing human, workload, and service identities in the same platform, the next step is to align cache scope, access scope, and write scope. That is where performance tuning starts to overlap with governance design.

For practitioners

Map repeated identity reads within a single request: Trace where the same user, team, entitlement, or settings query appears across helper layers, and quantify the duplication before changing architecture. This shows whether request-scoped reuse can remove work without changing the business logic.
Scope caches to the execution context: Use context-local storage or an equivalent request-bound mechanism so cached data disappears when the request ends. Never share these objects across requests or background jobs that can outlive the original access path.
Invalidate on every mutating write: Clear the entire request cache after save, update, or delete operations if the cached values can influence access decisions or user-visible state. That keeps the optimisation safe when reads and writes occur in the same lifecycle.
Measure with database-side query counts: Enable database telemetry such as pg_stat_statements and compare identical navigation paths with caching on and off. Use the database's own counters to confirm the optimisation reduced duplicate reads, not just local application logs.
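
The first step, quantifying duplication before changing anything, can be approximated from a per-request query log. The sketch below is a hypothetical helper, not part of any library. It assumes the SQL statements for one request have been captured in order, and the trace shown is invented for illustration.

```typescript
// Count how many extra times each identical query ran within one request.
// Input: the SQL statements logged for a single request, in order.
function duplicateReads(queries: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const sql of queries) {
    counts.set(sql, (counts.get(sql) ?? 0) + 1);
  }

  const extras = new Map<string, number>();
  counts.forEach((n, sql) => {
    // n - 1 executions are candidates for reuse, provided no write
    // touched the underlying data between them.
    if (n > 1) extras.set(sql, n - 1);
  });
  return extras;
}

// Invented trace of one request passing through several helpers.
const trace = [
  "SELECT * FROM memberships WHERE user_id = 7",
  "SELECT * FROM teams WHERE id = 3",
  "SELECT * FROM memberships WHERE user_id = 7",
  "SELECT * FROM memberships WHERE user_id = 7",
];

const avoidable = duplicateReads(trace);
```

Here the membership lookup ran three times, so two executions are flagged as avoidable, while the single team lookup is not reported.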

Key takeaways

Repeated identity reads inside one request are a design smell, because they often indicate fragmented governance logic rather than unavoidable workload complexity.
The reported 30% reduction in duplicate reads shows that bounded caching can create measurable gains, but only when freshness rules are enforced on writes.
Teams should validate any caching pattern at the database layer and keep the cache scope identical to the request scope.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST CSF 2.0	PR.DS-1	Cached query results affect data integrity and freshness within request scope.
NIST Zero Trust (SP 800-207)	PR.AC-4	Repeated access checks must preserve least privilege and request-bound context.
OWASP Non-Human Identity Top 10	NHI-03	The pattern touches secret-like session state and access-bound data handling.

Keep cached identity-adjacent data fresh by clearing it on writes and validating consistency at the database layer.

Key terms

Request-Scoped Cache: A request-scoped cache stores computed results only for the duration of one application request. It reduces repeated work inside that path while preserving isolation between users and sessions. In identity-heavy services, it is useful only when invalidation on writes is immediate and the cached data cannot outlive the request.
Continuation-Local Storage: Continuation-local storage is a technique for carrying state through asynchronous call chains without passing it explicitly through every function. It lets a service keep request-specific data available across helpers, promises, and middleware. In modern application stacks, it is often used to attach bounded context to one execution path.
Cache Invalidation: Cache invalidation is the process of removing stored results when the underlying source of truth changes. It is the control that keeps a cache from serving stale data after a write. For access, entitlement, or policy-related reads, invalidation discipline matters more than raw cache lifetime.

Deepen your knowledge

Request-scoped caching, invalidation on writes, and safe reuse boundaries are all covered in our NHI Foundation Level course, the industry's only accredited NHI security programme. If your platform has repeated identity or entitlement lookups inside single requests, it is worth exploring.

This post draws on content published by WorkOS: Query caching using Nest.js and Typeorm. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-06-13.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org