Distributed analytics foundations: what Josys' IDAC means for teams

NHI Mgmt Group

(@nhi-mgmt-group)

Member Moderator

Joined: 1 year ago

Posts: 12324

Topic starter 11/06/2026 10:46 pm

TL;DR: Josys says rising latency, scaling limits, and fragmented analytical workflows across MySQL, MongoDB, streaming, and search-driven data pipelines prompted its move from single-node aggregation services to a Spark-based IDAC layer. The core lesson is that identity governance and data governance both fail when trust, consistency, and scale are treated as afterthoughts.

NHIMG editorial — based on content published by Josys: Data Engineering at Josys

Questions worth separating out

Q: How should teams design analytics pipelines that can grow without creating bottlenecks?

A: Use distributed compute, clear processing layers, and standard data contracts so workload growth does not concentrate on one service.

Q: Why do layered data architectures improve governance as well as performance?

A: Layered architectures improve governance because they make data transformation visible and reviewable at each stage.

Q: What breaks when organisations rely on a single analytics service for every workload?

A: A single analytics service eventually becomes a bottleneck for compute, writes, and downstream reporting.
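The bottleneck answer above can be made concrete with a minimal sketch. It is not Josys' implementation and it does not use Spark; it is a standard-library stand-in that shows the shape of the idea: each partition is aggregated independently and only the small partial results are merged. The `app` field, the sample events, and the function names are assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_counts(events):
    """Aggregate one partition on its own (the 'map' side)."""
    return Counter(event["app"] for event in events)


def merge_counts(partials):
    """Combine the per-partition results (the 'reduce' side)."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def aggregate(partitions, workers=4):
    """Fan partitions out to workers, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return merge_counts(pool.map(partial_counts, partitions))


# Hypothetical usage events, already split into partitions.
partitions = [
    [{"app": "crm"}, {"app": "mail"}],
    [{"app": "crm"}],
    [{"app": "mail"}, {"app": "crm"}],
]

totals = aggregate(partitions)

# Reference result from a single pass over every event.
single_pass = Counter(e["app"] for part in partitions for e in part)
```

Threads are used here only to keep the sketch self-contained; they illustrate the structure rather than a real speed-up. The property that matters is that `totals` matches `single_pass`, so the work can be spread across workers without changing the answer.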

Practitioner guidance

Map pipeline control points to governance zones: separate raw ingestion, cleansing, and reporting stages so each layer has a defined owner, validation rule, and audit trail.
Replace single-node analytics dependencies with distributed compute: move heavy aggregation workloads off a single service when latency rises under growth.
Standardise shared data contracts for reporting consumers: define common schemas and transformation logic for dashboards, reports, and operational analytics.
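As a rough illustration of the guidance above, the sketch below gives each layer an owner, a validation rule, and an audit entry, and checks cleansed records against a shared contract. Everything in it is hypothetical: the `USAGE_CONTRACT` fields, the layer and owner names, and the sample records are assumptions, not details from Josys' article.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical shared contract: field name -> required Python type.
USAGE_CONTRACT = {"user_id": str, "app": str, "seats": int}


def conforms(record: dict, contract: dict) -> bool:
    """True when the record has exactly the contract's fields and types."""
    return record.keys() == contract.keys() and all(
        isinstance(record[name], kind) for name, kind in contract.items()
    )


@dataclass
class Layer:
    name: str
    owner: str
    transform: Callable[[dict], dict]
    validate: Callable[[dict], bool]


@dataclass
class Pipeline:
    layers: list
    audit: list = field(default_factory=list)

    def run(self, records: list) -> list:
        for layer in self.layers:
            transformed = [layer.transform(r) for r in records]
            kept = [r for r in transformed if layer.validate(r)]
            # One audit entry per layer: who owns it and what it rejected.
            self.audit.append({
                "layer": layer.name,
                "owner": layer.owner,
                "in": len(records),
                "out": len(kept),
                "rejected": len(records) - len(kept),
            })
            records = kept
        return records


def clean(record: dict) -> dict:
    """Trim and normalise fields; unparseable seat counts become None."""
    seats = str(record.get("seats", "")).strip()
    return {
        "user_id": str(record.get("user_id", "")).strip(),
        "app": str(record.get("app", "")).strip().lower(),
        "seats": int(seats) if seats.isdigit() else None,
    }


pipeline = Pipeline(layers=[
    Layer("raw", "ingestion-team", dict, lambda r: "user_id" in r),
    Layer("cleansed", "data-quality-team", clean,
          lambda r: conforms(r, USAGE_CONTRACT) and r["user_id"] != ""),
])

report_rows = pipeline.run([
    {"user_id": " u1 ", "app": "CRM", "seats": "3"},
    {"app": "mail", "seats": 1},                        # no user_id
    {"user_id": "u2", "app": "Mail", "seats": "n/a"},   # bad seat count
])
```

A reporting layer would follow the same shape. The design choice worth noting is that rejected records are counted per layer in `pipeline.audit`, so a reviewer can see where data was dropped and which owner is accountable for that rule.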

What's in the full article

Josys' full article covers the implementation detail this post intentionally leaves for the source:

The Node.js and MongoDB aggregation pattern that preceded the Spark-based architecture.
The layered IDAC structure and how Bronze, Silver, and Gold roles are divided in practice.
The ingestion options Josys uses, including CDC, streaming, and custom functions.
The platform outcomes Josys says it achieved for customer dashboards and reporting features.

👉 Read Josys' article on building its distributed IDAC data engineering framework →

Mr NHI

(@mr-nhi)

Member Moderator

Joined: 2 months ago

Posts: 11878

12/06/2026 7:04 am

Distributed analytics is now a governance problem, not just an engineering one. Josys describes moving from single-node aggregation to a layered, Spark-backed framework because the earlier model could not keep up with growing load. Identity teams face the same pattern when access, reporting, and assurance logic are spread across disconnected systems. The lesson is that scale failures show up first as latency, then as inconsistency, and finally as eroded trust. Practitioners should treat analytics architecture as part of governance architecture.

A few things that frame the scale:

70% of organisations grant AI systems more access than they would give a human employee performing the exact same job, according to The 2026 Infrastructure Identity Survey.
Only 13% of organisations feel extremely prepared for the reality of agentic AI despite the majority racing toward autonomous adoption.

A question worth separating out:

Q: How do identity and security teams apply the same lessons to governance data?

A: They should use the same design discipline for access and assurance data that data engineers use for analytics. That means normalised inputs, clear ownership, traceable transformations, and reliable reporting layers. If identity evidence is fragmented, reviews and decisions will be inconsistent no matter how strong the policy language looks on paper.
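One way to picture "normalised inputs" and "traceable transformations" for identity evidence is the sketch below. It assumes two hypothetical sources, an identity provider export and a cloud permissions export, with invented field names; each row is mapped into one common shape that keeps its source, and a simple check then surfaces where the sources disagree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvidence:
    identity: str
    resource: str
    entitlement: str
    source: str  # lineage: which system the record came from


def from_idp(row: dict) -> AccessEvidence:
    """Normalise a row from the (hypothetical) identity provider export."""
    return AccessEvidence(row["login"].lower(), row["application"].lower(),
                          row["role"].lower(), "idp")


def from_cloud(row: dict) -> AccessEvidence:
    """Normalise a row from the (hypothetical) cloud permissions export."""
    return AccessEvidence(row["principal"].lower(), row["service"].lower(),
                          row["permission"].lower(), "cloud")


def conflicts(evidence: list) -> dict:
    """Return (identity, resource) pairs whose sources report different entitlements."""
    seen = {}
    for item in evidence:
        seen.setdefault((item.identity, item.resource), {})[item.source] = item.entitlement
    return {key: by_source for key, by_source in seen.items()
            if len(set(by_source.values())) > 1}


idp_rows = [
    {"login": "Svc-Backup", "application": "Storage", "role": "Reader"},
    {"login": "svc-report", "application": "Warehouse", "role": "Reader"},
]
cloud_rows = [
    {"principal": "svc-backup", "service": "storage", "permission": "Writer"},
    {"principal": "svc-report", "service": "warehouse", "permission": "reader"},
]

evidence = [from_idp(r) for r in idp_rows] + [from_cloud(r) for r in cloud_rows]
disagreements = conflicts(evidence)
```

In this sample, only `svc-backup` on `storage` is flagged, because the two sources report different entitlements for it. Without the normalisation step the casing differences alone would hide the match, which is the kind of inconsistency the answer above warns about.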

👉 Read our full editorial: Josys data engineering shifts to a distributed analytics foundation
