Pomerium’s Databroker shift shows why Postgres is not always enough

By NHI Mgmt Group Editorial Team | Published 2025-11-07 | Domain: Best Practices | Source: Pomerium

TL;DR: Session storage and directory sync need simpler operational patterns than Postgres can always deliver at scale, according to Pomerium. The deeper lesson is that identity enforcement depends on durable, quickly recoverable state, not just a familiar database choice.

At a glance

What this is: Pomerium’s explanation of why its Databroker moved from Postgres to a local file-based backend with Raft-based clustering. The key finding is that identity state needs operational simplicity and fast recovery more than generic database features.

Why it matters: Access proxies, session stores, and identity context pipelines fail when state management becomes harder to operate than the policy engine itself. That affects NHI, autonomous, and human identity programmes alike.


Context

The subject here is Pomerium’s Databroker storage, but the article is really about how identity-aware access systems depend on durable state. When session data, directory sync, and external context become too expensive to operate in Postgres, the governance problem shifts from policy design to state reliability.

For IAM teams, the issue is not whether Postgres can store identity data. The issue is whether the chosen storage model supports fast revocation, predictable recovery, and multi-instance operation without turning access enforcement into an infrastructure support burden.

Key questions

Q: How should teams decide whether identity state belongs in Postgres or a simpler backend?

A: Teams should choose the backend that best matches the identity state they must preserve, the recovery time they can tolerate, and the operational skills they actually have. If the workload is mostly key-value session and context data, simpler local persistence can reduce failure modes without weakening access policy enforcement.

Q: Why do directory sync and session storage need to be separated in access control systems?

A: Directory data changes independently of login events, so waiting for the next sign-in delays revocation and can leave access valid longer than intended. Separating sync from session creation lets policy respond to lifecycle changes in minutes rather than hours or days.

Q: What breaks when access context lives only in browser cookies?

A: Central visibility disappears, revocation becomes harder to enforce, and large or changing claims can exceed cookie limits or go stale between logins. That creates a control gap because administrators cannot inspect, invalidate, or efficiently refresh the access state from one place.

Q: Who is accountable when clustered identity storage trades perfect consistency for simpler operations?

A: The platform owner remains accountable for defining the recovery promise, the acceptable inconsistency window, and the business impact of brief stale state. The technical choice is justified only when those trade-offs are explicit and accepted as part of the access-control design.

Technical breakdown

Databroker state and session persistence

Pomerium’s Databroker stores sessions and authorization context so the proxy and authorize services can make policy decisions with current identity data. A cookie-only model was rejected because claims can be too large, directory data can change after login, and cookie state cannot be centrally inspected or revoked. That makes the Databroker the control point for freshness, visibility, and revocation timing rather than a passive cache.

Practical implication: treat session storage as an access-control dependency and validate how quickly revocation and directory changes propagate through it.
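
To illustrate why a central store changes what administrators can do, here is a minimal sketch of server-side session state. It is not Pomerium's implementation: the `SessionStore` class, its method names, and the in-memory dictionary are all hypothetical, chosen only to show central revocation and directory updates taking effect without a new login.

```python
from dataclasses import dataclass


@dataclass
class Session:
    session_id: str
    user: str
    groups: frozenset
    revoked: bool = False


class SessionStore:
    """Hypothetical in-memory stand-in for a central session store."""

    def __init__(self):
        self._sessions = {}

    def create(self, session_id, user, groups):
        self._sessions[session_id] = Session(session_id, user, frozenset(groups))

    def revoke(self, session_id):
        # Central revocation: enforced on the next authorization check.
        if session_id in self._sessions:
            self._sessions[session_id].revoked = True

    def apply_directory_sync(self, user, groups):
        # Directory changes update every live session for the user,
        # without waiting for that user to sign in again.
        for session in self._sessions.values():
            if session.user == user:
                session.groups = frozenset(groups)

    def is_allowed(self, session_id, required_group):
        session = self._sessions.get(session_id)
        return (
            session is not None
            and not session.revoked
            and required_group in session.groups
        )


store = SessionStore()
store.create("s1", "alice", {"eng", "admins"})
before_sync = store.is_allowed("s1", "admins")

store.apply_directory_sync("alice", {"eng"})  # alice removed from admins
after_sync = store.is_allowed("s1", "admins")

store.revoke("s1")
after_revoke = store.is_allowed("s1", "eng")
```

With cookie-only state, neither `apply_directory_sync` nor `revoke` has anything to act on, because the claims live in the browser until the next sign-in.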

File-based storage versus Postgres operational load

The article argues that Postgres was functionally capable but operationally brittle for this use case. At scale, session writes, replication, and directory data updates can create load, latency, and support complexity that the product team increasingly had to troubleshoot for customers. The shift to Pebble, an embedded key-value store, reflects a narrower requirement set: simple key-value persistence, easy local operation, and fewer moving parts in environments where reliability targets are about reducing downtime, not guaranteeing perfect durability.

Practical implication: choose the simplest storage backend that still preserves identity freshness and recovery objectives across your deployment model.
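
As a rough illustration of how small the "simple key-value persistence" requirement can be, the sketch below implements an append-only log on a local file and rebuilds state by replaying it on startup. It is a teaching example only and does not reflect how Pebble works internally; the `FileKV` class, the JSON record layout, and the file name are assumptions made for this sketch.

```python
import json
import os
import tempfile


class FileKV:
    """Hypothetical append-only key-value log (not Pebble's design)."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        self._replay()
        self._fh = open(path, "ab")

    def _replay(self):
        if not os.path.exists(self.path):
            return
        good = 0  # byte offset of the end of the last intact record
        with open(self.path, "rb") as fh:
            for raw in fh:
                if not raw.endswith(b"\n"):
                    break  # torn final write: stop at the last good record
                try:
                    rec = json.loads(raw)
                except ValueError:
                    break
                if rec["op"] == "put":
                    self.data[rec["k"]] = rec["v"]
                else:
                    self.data.pop(rec["k"], None)
                good += len(raw)
        # Drop any partial record so later appends start on a clean line.
        os.truncate(self.path, good)

    def _append(self, rec):
        self._fh.write(json.dumps(rec).encode("utf-8") + b"\n")
        self._fh.flush()
        os.fsync(self._fh.fileno())

    def put(self, key, value):
        self._append({"op": "put", "k": key, "v": value})
        self.data[key] = value

    def delete(self, key):
        self._append({"op": "del", "k": key})
        self.data.pop(key, None)

    def close(self):
        self._fh.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sessions.log")
    kv = FileKV(path)
    kv.put("s1", {"user": "alice"})
    kv.put("s2", {"user": "bob"})
    kv.delete("s2")
    kv.close()

    # Simulate a crash in the middle of a write.
    with open(path, "ab") as fh:
        fh.write(b'{"op": "put", "k": "s3"')

    recovered = FileKV(path)
    recovered_data = dict(recovered.data)
    recovered.close()
```

The recovery behaviour is the point of the sketch: the store loses the half-written record but restores everything before it, which matches a "reduce downtime" target more than a "never lose a write" target.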

Raft clustering for failover and leader election

Pomerium uses Raft for leader election and cluster coordination rather than full state-machine replication. That choice keeps the clustered Databroker available without insisting on strong consistency for every write, because the product can tolerate occasional stale state or brief logout events and rebuild context from upstream sources. The architecture prioritizes service continuity and administrative simplicity over database-grade consistency guarantees that the workload does not truly need.

Practical implication: define which identity-state losses are acceptable before selecting a clustering model, then align failover design to that tolerance.
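
The election side of this design can be sketched with a simplified term-based majority vote. This is not a Raft implementation and not Pomerium's code: it omits heartbeats, timeouts, and the log up-to-date check that real Raft performs before granting a vote. The `Node` class and `elect` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    alive: bool = True
    term: int = 0
    voted_for: Optional[str] = None

    def request_vote(self, candidate, term):
        # A dead node never answers; a stale term is rejected.
        if not self.alive or term < self.term:
            return False
        if term > self.term:
            # A newer term resets this node's vote.
            self.term, self.voted_for = term, None
        if self.voted_for in (None, candidate):
            self.voted_for = candidate
            return True
        return False


def elect(candidate, nodes):
    """Return True if the candidate wins a strict majority of all nodes."""
    term = candidate.term + 1
    votes = sum(n.request_vote(candidate.name, term) for n in nodes)
    return votes > len(nodes) // 2


nodes = [Node("a"), Node("b"), Node("c")]

nodes[0].alive = False                # the old leader fails
won_with_quorum = elect(nodes[1], nodes)

nodes[2].alive = False                # a second node fails
won_without_quorum = elect(nodes[1], nodes)
```

In a three-node cluster the candidate wins with two votes and cannot win with one, so the tolerance question for identity state becomes concrete: what should the access layer do while no leader can be elected?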

NHI Mgmt Group analysis

Databroker reliability is an identity governance dependency, not an infrastructure preference. When access decisions depend on current session and directory state, storage design becomes part of the control plane. The article shows that operational complexity in the backend can directly degrade how quickly access changes take effect, which is why identity teams should evaluate storage as a governance boundary rather than a plumbing detail.

Cookie-based identity state collapses under modern directory scale. Claims grow too large, directory membership changes outside login events, and session visibility disappears once data lives only in the browser. That means revocation, monitoring, and lifecycle control stop being centrally governable, which is a structural weakness for any access proxy relying on current identity context.
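
To make the claim-size point concrete, the sketch below estimates the encoded size of a claims payload against a per-cookie budget. The 4096-byte figure is the commonly cited per-cookie limit and is treated here as an assumption; the group names and the 500-group count are invented for illustration, and real session cookies also carry signatures or encryption overhead that this estimate ignores.

```python
import base64
import json

COOKIE_LIMIT_BYTES = 4096  # assumed per-cookie budget for this sketch


def encoded_claims_size(claims):
    """Size in bytes of the claims as compact JSON, base64url-encoded."""
    raw = json.dumps(claims, separators=(",", ":")).encode("utf-8")
    return len(base64.urlsafe_b64encode(raw))


small = {"sub": "alice", "groups": ["eng"]}
large = {"sub": "alice", "groups": [f"group-{i:04d}" for i in range(500)]}

small_fits = encoded_claims_size(small) <= COOKIE_LIMIT_BYTES
large_fits = encoded_claims_size(large) <= COOKIE_LIMIT_BYTES
```

A user with a handful of groups fits comfortably, while a user with hundreds of directory groups does not, before any signing or encryption is applied.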

Postgres is not the default answer when the workload is identity-state distribution. The article’s core point is not that Postgres is weak, but that a broad-purpose relational database can become the wrong abstraction when the system needs fast lookups, local persistence, and resilient coordination in constrained environments. Identity-state durability: the real question is how long access context must persist before the system can safely recover or rehydrate it. Practitioner conclusion: match storage to the identity lifecycle, not the other way around.

Raft solves availability, but it does not erase governance trade-offs. Using Raft for leader election and follower catch-up keeps the cluster operating, yet it also accepts brief inconsistency and possible stale authorization context. That trade-off is reasonable only when teams understand the business impact of short-lived state drift, because access control can tolerate some recovery delay only if policy design and user workflows were built for it.

This architecture points to a broader shift in identity platforms toward recoverable state instead of perfect state. The article implicitly rejects the idea that every identity subsystem must behave like a fully durable transactional database. For practitioners, that means evaluating whether the control objective is exact persistence, rapid reconstitution, or simply acceptable continuity under failure. Practitioner conclusion: define the recovery promise first, then choose the backend.

From our research:
Only 5.7% of organisations have full visibility into their service accounts, according to the Ultimate Guide to NHIs.
71% of NHIs are not rotated within recommended time frames, increasing the risk of compromise over time.
Pomerium’s backend design shift belongs in the same operational conversation as the Ultimate Guide to NHIs, because visibility and lifecycle control depend on where identity state lives.

What this signals

State location is becoming an identity governance decision. When access systems store current identity context in a backend that is easy to operate, teams reduce the chance that revocation and sync processes fall behind the policy model. The operational signal is simple: if state is hard to observe, it is usually hard to govern.

The next phase of access architecture will reward systems that can reconstitute identity context quickly after failure, not systems that only look durable on paper. That matters for teams running human IAM, NHI workflows, and access proxies because the same recovery question now spans all three.

Identity blast radius: if session and directory state cannot be refreshed or replayed cleanly, a small backend failure can affect many users at once. Teams should review whether their current storage choices make revocation, failover, and rehydration predictable enough for the access model they claim to run.

For practitioners

Map identity-state dependencies before changing storage: Inventory which access decisions depend on session cookies, directory sync, and external context so you can see where state loss becomes an authorization event.
Test revocation latency under backend failure: Measure how quickly group removal, session invalidation, and directory updates propagate when the primary Databroker node is unavailable or recovering.
Set explicit recovery objectives for identity context: Document whether the system must preserve every session, restore most sessions, or simply rehydrate state fast enough to avoid material access disruption.
Validate failover behavior with realistic directory scale: Run cluster tests with large group memberships and frequent writes to confirm the leader election path and follower catch-up remain operationally acceptable.
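
The revocation-latency test in the list above can be reasoned about with a small model before running it against a real cluster. The sketch assumes directory syncs fire on a fixed schedule and that any sync scheduled during a backend outage is simply skipped. Both are simplifying assumptions, and the function name and parameters are hypothetical.

```python
def revocation_latency(removed_at, sync_interval, outage=None):
    """Seconds between a group removal and the first sync that applies it.

    Syncs are assumed to run at t = 0, sync_interval, 2 * sync_interval, ...
    A sync whose time falls inside the outage window (start, end) is skipped.
    """
    if sync_interval <= 0:
        raise ValueError("sync_interval must be positive")
    t = 0
    while True:
        in_outage = outage is not None and outage[0] <= t < outage[1]
        if t >= removed_at and not in_outage:
            return t - removed_at
        t += sync_interval


# Group removed at t=70s with a 60s sync interval.
baseline = revocation_latency(70, 60)

# Same removal, but the backend is unavailable from t=100s to t=400s.
degraded = revocation_latency(70, 60, outage=(100, 400))
```

Under these assumptions the removal takes 50 seconds to enforce on a healthy backend and 350 seconds when the outage swallows the intervening syncs. A real measurement would replace the model with timestamps taken from the directory change and the first denied request.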

Key takeaways

The article reframes Databroker storage as an identity control problem, not a generic database selection exercise.
The practical risk is not failure alone but stale, invisible, or hard-to-recover identity state that weakens access enforcement.
Teams should choose storage and failover patterns based on recovery objectives, revocation speed, and operational realism rather than feature breadth.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
NIST CSF 2.0	PR.AA-05 (PR.AC-4 in CSF 1.1)	Identity state freshness affects how consistently access permissions are enforced.
NIST Zero Trust (SP 800-207)		The article centers on continuous verification and policy decisions tied to live identity state.
OWASP Non-Human Identity Top 10	NHI-03	The piece concerns storage and lifecycle handling of non-human identity context data.

Ensure NHI session and context data can be revoked and rehydrated without operational drift.

Key terms

Databroker: A Databroker is the storage and coordination layer that holds session data, directory context, and other identity state used for access decisions. In this design, it is part of the authorization path, so persistence, freshness, and cluster behavior directly affect whether policy enforcement stays current.
Identity state: Identity state is the live data an access system depends on to decide who or what is allowed through, including sessions, claims, group membership, and external context. It is not just user data. When it becomes stale or unavailable, access control can drift away from policy intent.
Leader election: Leader election is the cluster process that selects which node coordinates write handling and state authority at a given moment. In identity systems, it matters because failover must preserve enough state continuity that access decisions do not become inconsistent or unavailable after node loss.
Revocation latency: Revocation latency is the time between an access change, such as group removal or session invalidation, and the point when the system actually enforces it. Shorter latency reduces exposure. In identity platforms, it is often a better control metric than raw persistence depth.


This post draws on content published by Pomerium: “Sometimes Postgres isn’t the Answer.”

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-11-07.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org