
How should security teams govern data exposure in Amazon Bedrock workflows?

Treat Bedrock as a governed data path, not just a model endpoint. Security teams should classify data before it enters the workflow, restrict the identities that can reach sensitive sources, and verify where outputs are stored or copied. The control objective is to keep regulated or confidential data from moving through AI pipelines without traceability.

Why This Matters for Security Teams

Amazon Bedrock can move sensitive material through retrieval, prompt assembly, tool use, logging, and downstream storage, so the real control point is the workflow, not the model call. Security teams need to decide which data is allowed to enter the path, which non-human identities can reach it, and where any derived output can persist. That is a governance problem as much as an access problem. NIST Cybersecurity Framework 2.0 and NIST AI risk guidance both point toward managing AI systems through mapped controls, traceability, and continuous oversight rather than one-time approval.

This matters because excessive access and secret sprawl are still common in machine identities. NHIMG's Ultimate Guide to NHIs (Key Research and Survey Results) found that 97% of NHIs carry excessive privileges, a direct warning sign for AI workflows that can query sources, generate outputs, and write to storage in a single automated path. In practice, many security teams discover data exposure only after a workflow has already copied regulated content into a prompt log, vector store, or shared output location, rather than anticipating it at design time.

That is why Bedrock governance should start with data classification, identity scoping, and retention rules before any application team wires in a model or agent.

How It Works in Practice

The practical pattern is to treat Bedrock as a controlled data path with explicit checkpoints. First, classify inputs by sensitivity and block regulated records from entering prompts unless there is a defined business need, an approved identity, and a traceable control owner. Second, restrict the AWS roles, service accounts, and application identities that can call retrieval sources, invoke tools, or write outputs. Third, separate prompt content from operational logs so that sensitive context is not duplicated into places that have weaker retention or access controls.
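
The first and third checkpoints can be sketched in a few lines of Python. This is a minimal illustration, not a production classifier: the pattern set, the GateDecision type, and the function names are all hypothetical, and a real deployment would rely on the organisation's own classification scheme and detection tooling.

```python
import re
from dataclasses import dataclass

# Hypothetical detection patterns, used here only for illustration.
REGULATED_PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


@dataclass(frozen=True)
class GateDecision:
    allowed: bool
    labels: tuple
    reason: str


def classify(text):
    """Return the sorted tuple of regulated labels detected in the text."""
    return tuple(sorted(
        name for name, pattern in REGULATED_PATTERNS.items() if pattern.search(text)
    ))


def gate_prompt_input(text, business_need, identity_approved, control_owner):
    """Block regulated content unless need, identity, and owner are all present."""
    labels = classify(text)
    if not labels:
        return GateDecision(True, labels, "no regulated data detected")
    if business_need and identity_approved and control_owner:
        return GateDecision(True, labels, f"approved; control owner: {control_owner}")
    return GateDecision(False, labels, "regulated data without need, identity, or owner")


def redact_for_log(text):
    """Replace detected values with label placeholders before writing to a log."""
    for name, pattern in REGULATED_PATTERNS.items():
        text = pattern.sub(f"[{name.upper()}]", text)
    return text
```

Calling gate_prompt_input before prompt assembly and redact_for_log before any operational logging keeps the two paths separate: the prompt can carry approved sensitive context while the log only ever receives placeholders.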

For model-enabled workflows that rely on external knowledge, the strongest baseline is least privilege plus short-lived access. That means just-in-time credentials where possible, scoped to one task or session, with secrets kept out of code and config files. The same discipline that applies to NHIs generally also applies here: both the Ultimate Guide to NHIs (Lifecycle Processes for Managing NHIs) and the Guide to the Secret Sprawl Challenge reinforce that credential visibility and rotation are necessary to keep machine access from becoming persistent exposure.
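
One way to approximate task-scoped, short-lived access on AWS is an STS AssumeRole call with a session policy, which narrows the role's permissions for that session only. The sketch below just builds the request parameters so it runs without AWS access. The function name, the role ARN, and the model ARN are hypothetical placeholders; the parameter names (RoleArn, RoleSessionName, Policy, DurationSeconds) are those of the STS AssumeRole API.

```python
import json

# STS does not issue AssumeRole sessions shorter than 15 minutes.
MIN_SESSION_SECONDS = 900


def build_scoped_session_request(role_arn, task_id, model_arn, duration_seconds=900):
    """Build keyword arguments for an STS AssumeRole call scoped to one task."""
    if duration_seconds < MIN_SESSION_SECONDS:
        raise ValueError("duration_seconds must be at least 900")
    # Session policy: effective permissions are the intersection of this
    # document and the role's own identity-based policies.
    session_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["bedrock:InvokeModel"],
            "Resource": [model_arn],
        }],
    }
    return {
        "RoleArn": role_arn,
        "RoleSessionName": f"bedrock-task-{task_id}",
        "Policy": json.dumps(session_policy),
        "DurationSeconds": duration_seconds,
    }


# Hypothetical ARNs, for illustration only.
request = build_scoped_session_request(
    role_arn="arn:aws:iam::123456789012:role/bedrock-task-role",
    task_id="summarise-0042",
    model_arn="arn:aws:bedrock:us-east-1::foundation-model/example-model-id",
)
# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("sts").assume_role(**request)
```

Because the credentials expire with the session and can only invoke the one named model, a leaked token has a bounded lifetime and reach, and nothing long-lived needs to sit in code or config files.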

  • Classify data before retrieval, prompt construction, and tool execution.
  • Use separate identities for read, transform, and write actions.
  • Limit which sources a workflow can reach and record every access decision.
  • Prevent model outputs from being copied into uncontrolled storage or collaboration tools.
  • Review logging, caching, and export paths for hidden replicas of sensitive content.
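
The identity separation and decision recording in the list above could look roughly like the following sketch. The identity names, resource names, and the AccessLog class are hypothetical; in practice the permission map would live in IAM policies and the records in a governed audit store rather than in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical map: one identity per action type, each with its own resources.
IDENTITY_PERMISSIONS = {
    "kb-reader": {"action": "read", "resources": {"kb/policies", "kb/product-docs"}},
    "prompt-transformer": {"action": "transform", "resources": {"model/summarizer"}},
    "output-writer": {"action": "write", "resources": {"bucket/governed-outputs"}},
}


@dataclass
class AccessLog:
    records: list = field(default_factory=list)

    def decide(self, identity, action, resource):
        """Allow only the identity's own action on its listed resources; log every decision."""
        grant = IDENTITY_PERMISSIONS.get(identity)
        allowed = (
            grant is not None
            and grant["action"] == action
            and resource in grant["resources"]
        )
        # Denials are recorded as well as approvals, so the trail is complete.
        self.records.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "identity": identity,
            "action": action,
            "resource": resource,
            "allowed": allowed,
        })
        return allowed
```

With this shape, a reader identity that attempts a write, or a writer identity that targets an unlisted destination, is refused and still leaves a record that reviewers can trace.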

For standards alignment, NIST Cybersecurity Framework 2.0 supports identifying, protecting, detecting, and governing these paths, while Anthropic's report on the first AI-orchestrated cyber espionage campaign shows how autonomous AI use can accelerate abuse when tool access and oversight are weak. These controls tend to break down when Bedrock is embedded in fast-moving application pipelines that auto-sync logs, cache prompts, and replicate outputs across multiple accounts without a single retention owner.

Common Variations and Edge Cases

Tighter data controls often increase friction for developers and analysts, requiring organisations to balance model usefulness against privacy, retention, and audit overhead. That tradeoff becomes more visible when teams want to use unstructured documents, because classification is less exact and there is no universal standard for perfect prompt-level redaction yet.

One common edge case is a retrieval workflow that appears safe because the model never sees the original database directly, but the vector index still contains enough context to reconstruct sensitive material. Another is cross-account Bedrock use, where the application account is governed but the source account, log destination, or export bucket has weaker controls. A third is human review of model outputs: if reviewers can copy responses into ticketing systems, chat tools, or analytics platforms, the exposure moves rather than disappears. NHIMG's write-ups of the McKinsey AI platform breach and the AI LLM hijack breach are useful reminders that AI data paths fail when access, logging, and downstream sharing are not governed together.

Best practice is evolving toward policy-based approvals at runtime, rather than static “approved workflow” labels. That is especially important where Bedrock is used in customer support, legal review, finance, or healthcare, because the same workflow may be safe for one record type and inappropriate for another.
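
A runtime approval of this kind can be sketched as a per-request lookup rather than a per-workflow label. The use cases, record types, and decisions below are hypothetical examples, and a real policy would normally be held in a policy engine or configuration store rather than in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    use_case: str
    record_type: str
    identity_approved: bool


# Hypothetical policy table keyed by (use case, record type).
POLICY = {
    ("customer_support", "order_history"): "allow",
    ("customer_support", "payment_card"): "deny",
    ("healthcare", "appointment_note"): "allow",
    ("healthcare", "clinical_record"): "review",
}


def evaluate(request):
    """Decide per request; unknown combinations are denied by default."""
    decision = POLICY.get((request.use_case, request.record_type), "deny")
    if decision == "allow" and not request.identity_approved:
        return "deny"
    return decision
```

Here the same customer support workflow is allowed for order history and denied for payment card data, which a static "approved workflow" label cannot express.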

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Non-Human Identity Top 10 | NHI-03 | Bedrock workflows depend on machine identities and secret handling, which this control governs.
NIST CSF 2.0 | PR.AA-05 (PR.AC-4 in CSF 1.1) | Data exposure is reduced by enforcing least-privilege access to sources and outputs.
NIST AI RMF | | AI RMF covers governance, traceability, and risk oversight for AI data flows.

Apply AI RMF governance to classify data, assign owners, and monitor Bedrock workflow risk.