TL;DR: According to WorkOS, AWS’s Nova Forge lowers the barrier to custom foundation model training by combining customer data with structured training checkpoints, so enterprises can build private Nova-based models without frontier-lab-scale resources. The real shift is that proprietary data can now shape model behaviour earlier in training, which raises governance, safety, and lock-in questions for identity and AI teams.
NHIMG editorial — based on content published by WorkOS: Amazon Nova Forge and the shift to custom foundation models
Questions worth separating out
Q: How should security teams govern custom foundation model training on proprietary data?
A: Security teams should treat custom foundation model training as a governed data and identity workflow, not a one-time ML project.
Q: What risks appear when enterprises train models on internal data instead of only fine-tuning them?
A: The main risk is that internal data stops being passive input and becomes part of the model’s learned behaviour.
Q: When does custom model consolidation become a governance concern?
A: Consolidation becomes a governance concern when multiple specialised models are replaced by one custom model without stronger validation and ownership.
Practitioner guidance
- Define training-data approval gates: Require explicit approval for any dataset that will influence model training, including proprietary documents, moderation logs, and domain corpora.
- Assign ownership for model lifecycle decisions: Create named accountability for who can start training, change reward functions, approve checkpoints, and validate output behaviour.
- Test for behavioural drift after customisation: Run evaluation suites that compare base-model performance against custom-trained performance on safety, refusal quality, and domain accuracy.
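The drift check in the last item can be sketched in a few lines of Python. This is a minimal illustration, not a method described by WorkOS or AWS: the `SuiteScores` type, the `find_regressions` helper, the 0.02 tolerance, and the example scores are all hypothetical, and it assumes each evaluation suite has already been reduced to a score between 0 and 1 where higher is better.

```python
from __future__ import annotations

from dataclasses import dataclass

# The three dimensions named in the guidance above.
METRICS = ("safety", "refusal_quality", "domain_accuracy")


@dataclass(frozen=True)
class SuiteScores:
    """Hypothetical per-model scores in [0, 1]; higher is better."""
    safety: float
    refusal_quality: float
    domain_accuracy: float


def find_regressions(
    base: SuiteScores, custom: SuiteScores, tolerance: float = 0.02
) -> dict[str, float]:
    """Return each metric where the custom model scores lower than the
    base model by more than `tolerance`, mapped to the score delta."""
    regressions: dict[str, float] = {}
    for metric in METRICS:
        delta = getattr(custom, metric) - getattr(base, metric)
        if delta < -tolerance:
            regressions[metric] = round(delta, 4)
    return regressions


# Illustrative numbers only, not measured results.
base = SuiteScores(safety=0.97, refusal_quality=0.91, domain_accuracy=0.62)
custom = SuiteScores(safety=0.90, refusal_quality=0.90, domain_accuracy=0.81)

regressions = find_regressions(base, custom)
if regressions:
    print("Block checkpoint approval, regressions found:", regressions)
```

In this made-up run the custom model gains domain accuracy but loses safety beyond the tolerance, which is the pattern a checkpoint owner would want surfaced before approval.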
What's in the full article
WorkOS's full article covers the operational detail this post intentionally leaves for the source:
- Checkpoint-by-checkpoint explanation of how pre-training, mid-training, and post-training differ in practice.
- The described SageMaker reinforcement fine-tuning pipeline and how customer-defined reward functions shape output quality.
- Reddit and Nimbus Therapeutics implementation examples that show where custom model training changed workflows.
- The article’s own discussion of safety evaluation, moderation policy configuration, and Bedrock deployment boundaries.
👉 Read WorkOS’s analysis of Amazon Nova Forge and custom foundation models →
Amazon Nova Forge and private model training: what changes now?
Explore further
Custom model training is now an identity and governance problem, not just an ML problem. When proprietary data shapes the model during training, the control boundary moves upstream from inference to lineage, access, and approval of the training corpus. That means model risk is inseparable from who can authorise data inclusion, who can initiate training, and who can validate resulting behaviour. Practitioners should treat training pipelines as governed identity workflows, not engineering experiments.
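One way to picture a training pipeline as a governed identity workflow is a gate that checks who requested the run and who approved the data before anything starts. The sketch below is an assumption-laden illustration, not a Nova Forge or SageMaker API: the role sets, the `TrainingRequest` type, and the `may_start_training` function are hypothetical names, and a real deployment would draw these roles from its identity provider.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical role assignments; in practice these come from an identity system.
DATA_APPROVERS = {"alice"}
TRAINING_INITIATORS = {"bob"}


@dataclass
class TrainingRequest:
    """Hypothetical record of a request to train on one dataset."""
    dataset_id: str
    requested_by: str
    data_approvals: set[str] = field(default_factory=set)


def may_start_training(req: TrainingRequest) -> tuple[bool, str]:
    """Allow a run only if the requester may initiate training and the
    dataset was approved by an authorised person other than the requester."""
    if req.requested_by not in TRAINING_INITIATORS:
        return False, "requester is not authorised to initiate training"
    # Separation of duties: approvals from the requester do not count.
    independent = (req.data_approvals & DATA_APPROVERS) - {req.requested_by}
    if not independent:
        return False, "dataset lacks approval from an independent data approver"
    return True, "approved"


print(may_start_training(TrainingRequest("corpus-1", "bob", {"alice"})))
print(may_start_training(TrainingRequest("corpus-1", "bob")))
```

The design choice worth noting is that the check runs before training rather than at inference, which mirrors the upstream control boundary described above.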
A few things that frame the scale:
- The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
- Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap.
A question worth separating out:
Q: What should organisations check before relying on a managed training platform for custom AI models?
A: Organisations should check deployment boundaries, export limits, logging coverage, approval controls, and the ability to validate model behaviour independently. If the platform keeps weights and deployment inside a single ecosystem, the trade-off is less portability in exchange for more operational consistency. That trade-off should be explicit before adoption scales.
👉 Read our full editorial: Amazon Nova Forge changes who can build custom foundation models