By NHI Mgmt Group Editorial Team | Published 2026-05-09 | Domain: Governance & Risk | Source: WitnessAI

TL;DR: Explainable AI in finance is moving from post-hoc model explanations to runtime governance as credit, fraud, and AML decisions face stricter transparency demands, while generative and agentic AI fall outside traditional model risk frameworks, according to WitnessAI. The governance gap is no longer whether a model can explain itself, but whether institutions can prove controlled AI behaviour in production.


At a glance

What this is: An analysis of explainable AI in financial services. Traditional XAI methods help with credit, fraud, and AML decisions but are not enough for generative and agentic AI.

Why it matters: IAM, risk, and compliance teams need governance that can evidence, control, and audit AI-driven decisions across human, NHI, and autonomous workflows.



Context

Explainable AI in finance is the problem of making model behaviour understandable enough for validation, compliance, and customer-facing decisions. In practice, the question is not whether an AI system can produce an answer, but whether the institution can justify that answer to regulators, reviewers, and affected customers.

That gap becomes more visible when AI affects credit, fraud, and AML workflows. Traditional model risk programmes were built for systems that are stable enough to validate upfront, while generative and agentic systems can change behaviour through runtime interaction, tool use, and orchestration.

For financial institutions, the primary challenge is governance continuity. The same programme that can explain a credit scorecard may not be able to evidence control over a multi-step AI workflow that retrieves data, drafts content, and initiates actions without per-step human review.


Key questions

Q: How should financial institutions govern explainable AI in high-risk use cases?

A: They should match the explanation method to the decision and audience, then back it with audit trails and policy controls. Credit decisions need reasoned, defensible outputs. Fraud and AML workflows need evidence that investigators and validators can review. For generative and agentic systems, runtime observability becomes essential because post-hoc explanations alone do not prove control.

Q: Why do generative and agentic AI create problems for traditional model risk management?

A: Traditional model risk management assumes stable inputs, stable outputs, and a bounded decision path that can be validated before deployment. Generative and agentic systems can retrieve data, combine tools, and initiate actions during runtime, which means their behaviour may change in production. That makes explanation useful, but not sufficient for governance.

Q: What do financial services teams get wrong about SHAP and LIME?

A: They often treat them as universal explanation tools, when they are better understood as partial methods for specific model types. SHAP and LIME can help with structured decisioning, but they do not fully solve causal explanation, stability, or governance for large language models and agentic workflows. Teams should use them as evidence, not as complete control.

Q: How do AI transparency requirements change when systems can act autonomously?

A: Transparency requirements move from explaining a single model output to proving control over a sequence of actions. If an AI system can query data, draft communications, and trigger transactions, institutions need visibility into each step, not just the final answer. That makes runtime policy enforcement and auditability central to accountability.


Technical breakdown

Global explainability versus local explanation in credit decisions

Global explainability describes how a model behaves overall, while local explanation describes why one specific outcome occurred. In finance, both matter because validation teams need pattern-level assurance and customer-facing teams need reason codes that can be defended. Techniques such as SHAP and counterfactuals can provide those views, but they answer different questions. A global view can show feature dependence across a portfolio, while a local explanation can support adverse action notices or case review. The control problem is matching the explanation method to the decision audience, not treating one explanation layer as universal.

Practical implication: align each AI use case to the explanation type reviewers, investigators, and customers actually need.
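The distinction between the two views can be sketched for the simplest case, a linear scorecard. This is a minimal illustration, not a description of any institution's method: the feature names, weights, and portfolio values are hypothetical. For a linear model, each feature's contribution relative to the portfolio average is its weight times its deviation from the average, and the contributions add up exactly to the gap between the applicant's score and the average score. Non-linear models need dedicated attribution methods and do not decompose this cleanly.

```python
from statistics import mean

# Hypothetical scorecard weights and portfolio data, for illustration only.
WEIGHTS = {"utilisation": -2.0, "income": 1.5, "delinquencies": -3.0}

PORTFOLIO = [
    {"utilisation": 0.2, "income": 1.0, "delinquencies": 0.0},
    {"utilisation": 0.8, "income": 0.5, "delinquencies": 2.0},
    {"utilisation": 0.5, "income": 1.5, "delinquencies": 1.0},
]

def score(row):
    """Linear score: weighted sum of feature values."""
    return sum(WEIGHTS[f] * row[f] for f in WEIGHTS)

def baseline(portfolio):
    """Average value of each feature across the portfolio."""
    return {f: mean(row[f] for row in portfolio) for f in WEIGHTS}

def local_explanation(applicant, base):
    """Per-feature contribution to one applicant's score, relative to the average."""
    return {f: WEIGHTS[f] * (applicant[f] - base[f]) for f in WEIGHTS}

def global_importance(portfolio, base):
    """Mean absolute contribution of each feature across the whole portfolio."""
    per_case = [local_explanation(row, base) for row in portfolio]
    return {f: mean(abs(c[f]) for c in per_case) for f in WEIGHTS}

def reason_codes(applicant, base, top_n=2):
    """Features that pulled this applicant's score down the most."""
    contrib = local_explanation(applicant, base)
    negative = [(f, c) for f, c in contrib.items() if c < 0]
    return [f for f, _ in sorted(negative, key=lambda fc: fc[1])[:top_n]]

base = baseline(PORTFOLIO)
print(reason_codes(PORTFOLIO[1], base))    # ['delinquencies', 'income']
print(global_importance(PORTFOLIO, base))  # portfolio-level view for validators
```

The local output is the kind of material a reason code could draw on, while the global output is closer to what a validation team reviews. Neither substitutes for the other.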

Why post-hoc explainability struggles with generative and agentic AI

Generative models do not behave like scorecards with stable, discrete inputs and outputs. Their internal computation is distributed across many parameters, which makes it hard to isolate a single causal path for a decision. Once orchestration layers, retrieval, and tool use are added, the system can produce multi-step behaviour that is not captured by a single explanation artifact. That is why post-hoc methods can become incomplete or misleading in these environments. The technical issue is not simply opacity, but the mismatch between how the system acts and how traditional explanation tools are built.

Practical implication: treat explanation methods as partial evidence and pair them with runtime monitoring and policy enforcement.

Runtime governance for AI in financial services

Runtime governance shifts the control point from after-the-fact explanation to live observation and intervention. It combines audit trails, policy enforcement, data controls, and output inspection so institutions can govern AI behaviour as it happens. This is especially relevant when models are used across multiple business lines, when employees use shadow AI tools, or when agents can take multi-step actions without human review. In that setting, explainability remains useful, but it is only one layer in a broader control stack that must prove what the system saw, decided, and did.

Practical implication: build controls that log inputs, policy decisions, and outputs in production, not just model documentation at approval time.
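As a rough sketch of what logging inputs, policy decisions, and outputs in production might look like, the example below wraps each AI action in a policy check and records the result whether the action is allowed or blocked. Everything here is an assumption made for illustration: the `ALLOWED_ACTIONS` policy table, the action names, and the `governed_call` helper are hypothetical and do not correspond to any specific product or framework.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Hypothetical policy table: actions an AI workflow may take per use case.
ALLOWED_ACTIONS = {
    "credit": {"retrieve_data", "draft_content"},
    "fraud": {"retrieve_data"},
}

@dataclass
class AuditRecord:
    timestamp: str
    use_case: str
    action: str
    input_summary: str
    decision: str        # "allow" or "block"
    output_summary: str  # empty when the action was blocked

AUDIT_LOG = []

def governed_call(use_case, action, input_summary, handler):
    """Check policy before running handler, then log input, decision, and output."""
    allowed = action in ALLOWED_ACTIONS.get(use_case, set())
    output = handler(input_summary) if allowed else ""
    AUDIT_LOG.append(AuditRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        use_case=use_case,
        action=action,
        input_summary=input_summary,
        decision="allow" if allowed else "block",
        output_summary=output,
    ))
    return output if allowed else None

# Example: one permitted step and one step the policy blocks.
governed_call("credit", "draft_content", "applicant summary", lambda s: "draft for " + s)
governed_call("fraud", "initiate_transaction", "case notes", lambda s: "sent")
print(json.dumps([asdict(r) for r in AUDIT_LOG], indent=2))
```

The design choice worth noting is that the blocked step is logged as well as the allowed one, so a reviewer can see what the system attempted and not only what it completed.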




NHI Mgmt Group analysis

Explainability is no longer a sufficient governance primitive for financial AI. In credit, fraud, and AML, explanation methods still matter because they help reviewers understand model decisions. But once systems move into generative and agentic behaviour, the central governance question shifts from explanation to control, because the institution must evidence what the system did in production. Practitioners should treat explainability as necessary, but no longer as the whole governance model.

Credit underwriting creates the clearest explanation obligation, but it is not the only pressure point. Credit decisions have the strongest disclosure expectations, yet fraud and transaction monitoring also require auditable reasoning when AI affects consumers or investigator workflows. That means the same validation logic cannot be applied uniformly across all financial AI use cases. Practitioners should map explanation requirements to the decision type, not just the model class.

Runtime governance is the central concept this analysis sharpens. Traditional model risk management assumes the institution can validate a model before deployment and then monitor a bounded set of outputs. That assumption breaks when AI systems can retrieve data, draft outputs, and initiate downstream actions during runtime. The implication is that financial governance must move from model-only approval to production control of AI behaviour.

Generative and agentic AI expose a governance boundary that model explainability alone cannot cross. The article makes clear that existing frameworks were not built for systems whose behaviour depends on orchestration, context, and multi-step execution. That does not make explanation irrelevant, but it does make it incomplete as a standalone control. Practitioners should expect governance programmes to expand from model validation into runtime observability and policy enforcement.

Financial institutions should read the transparency debate as an operating model issue, not a documentation issue. The hardest part is not writing better model summaries, but proving that AI activity stayed within approved policy when it touched regulated decisions. That is where evidence, auditability, and intervention capability become core governance requirements. Practitioners should reframe AI oversight as continuous control assurance rather than periodic explanation review.

What this signals

Runtime governance is becoming the practical boundary for financial AI. As institutions move beyond explainability as a documentation exercise, they need evidence that AI behaviour stayed inside policy in production. That means live inventories, immutable logs, and control points that can block or route interactions before they reach regulated decisions.

Our research shows the wider identity problem is already large enough to matter. The average organisation believes more than 1 in 5 of its non-human identities are insufficiently secured, according to The 2024 ESG Report: Managing Non-Human Identities. In an AI programme, that is a warning sign that control assumptions are already strained before explainability even enters the picture.

Financial institutions should expect governance conversations to shift from model approval to control evidence. That makes explainability one layer inside a broader identity and AI operating model, especially where human users, service identities, and autonomous agents all touch the same decision path.


For practitioners

  • Map explanation method to decision audience: Use local explanations for consumer-facing decisions, global explanations for model validation, and separate treatment for investigator workflows. Do not force one technique to satisfy all stakeholders.
  • Separate scorecard governance from generative AI governance: Keep interpretable model controls for structured decisioning, but add runtime observability and policy enforcement for systems that retrieve data, draft content, or initiate actions.
  • Inventory shadow AI and agentic workflows: Maintain a live list of approved models, external AI tools, and autonomous workflows so compliance teams can see where regulated data or customer decisions are being exposed.
  • Log AI interactions for audit readiness: Capture prompts, responses, policy decisions, and downstream actions in a tamper-evident trail so model risk, legal, and audit teams can reconstruct events without guesswork.
  • Use policy boundaries for regulated use cases: Define which AI interactions may touch credit, fraud, AML, or customer data, and require extra review or blocking before those interactions reach production models or agents.
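
One common way to make an interaction log tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below is a minimal illustration under stated assumptions: the event fields are hypothetical, and the helper names (`append_event`, `verify_chain`) are invented for this example. A chain like this only reveals tampering if the most recent hash is also stored somewhere the log's writers cannot alter, so it makes edits detectable, not impossible.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def _digest(prev_hash, event):
    """SHA-256 over the previous hash plus a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def append_event(chain, event):
    """Append an event, linking it to the hash of the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "event": event,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, event),
    })

def verify_chain(chain):
    """Recompute every hash in order; any edited or reordered entry fails."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True

# Hypothetical events covering a prompt, a policy decision, and a downstream action.
trail = []
append_event(trail, {"type": "prompt", "text": "summarise case notes"})
append_event(trail, {"type": "policy_decision", "result": "allow"})
append_event(trail, {"type": "action", "name": "draft_content"})
print(verify_chain(trail))  # True

trail[1]["event"]["result"] = "block"  # simulate an after-the-fact edit
print(verify_chain(trail))  # False
```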

Key takeaways

  • Explainable AI in finance is now a governance problem, not just a model documentation problem.
  • Credit, fraud, and AML workflows all need defensible AI behaviour, but generative and agentic systems require runtime control as well as explanation.
  • Institutions that can log, inspect, and enforce policy on AI interactions will be better positioned for emerging transparency and examination demands.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

NIST CSF 2.0 and the NIST AI RMF frame the governance and control expectations most relevant to this guidance. NIST SP 800-63 applies only where customer identity and disclosures intersect with AI outcomes.

Framework | Control / Reference | Relevance
NIST CSF 2.0 | PR.DS-01 | Runtime AI governance depends on protecting data used in model decisions.
NIST AI RMF | GOVERN | AI governance must extend beyond explanation into accountable operating controls.
NIST SP 800-63 | - | Not directly applicable to model governance, but relevant where customer identity and disclosures intersect.

Use identity assurance controls only where AI outcomes depend on authenticated user context.


Key terms

  • Explainable AI: Explainable AI is the practice of making an AI system’s decisions understandable to the people who have to review, validate, or rely on them. In financial services, that means producing explanations that can support compliance, model validation, customer communications, and audit, not just technical curiosity.
  • Runtime Governance: Runtime governance is the control model that manages AI behaviour while it is operating, not only before deployment. It combines logging, policy enforcement, output inspection, and intervention so institutions can prove how AI used data, what it decided, and whether it stayed within approved boundaries.
  • Local Explainability: Local explainability describes why a model produced one specific result for one specific case. It is most useful when a customer, investigator, or reviewer needs a decision reason that is tied to the exact inputs in play, such as a credit denial or a fraud alert.
  • Agentic AI: Agentic AI is AI that can take multi-step actions, choose tools, and carry out tasks with limited human oversight. In governance terms, that changes the problem from explaining a single model output to controlling a sequence of decisions and actions that can affect regulated data or business outcomes.

Deepen your knowledge

Explainable AI in finance and runtime governance for autonomous systems are covered in our NHI Foundation Level course, the industry's only accredited NHI security programme. If your programme needs to govern AI behaviour across regulated workflows, that is a useful place to start.

This post draws on content published by WitnessAI: explainable AI in finance and runtime governance for generative and agentic AI.
