Why is model conversion risky when the source artifact may be tampered with?

Why This Matters for Security Teams

Model conversion is not a neutral packaging step. If the source artifact has been tampered with, the conversion tool may preserve embedded weights, graph structure, or malicious control flow while changing the file format. That creates a false sense of safety because the new runtime looks different, but the underlying behaviour can remain intact. Guidance from the NIST Cybersecurity Framework 2.0 is clear that integrity and provenance checks belong in the control process, not after deployment.

This matters because model supply chains are increasingly treated like software supply chains, yet teams often assume that export, quantisation, or runtime conversion strips away risk. It does not. The same concern shows up across NHI exposure patterns documented in Top 10 NHI Issues, where untrusted artifacts and weak verification allow attackers to move from one trusted system to another. In practice, many security teams encounter model tampering only after the converted artifact has already been integrated into a production pipeline, rather than through intentional provenance review.

How It Works in Practice

Conversion risk comes from trust transitivity. A pipeline that imports a source model, rewrites it into ONNX, TensorRT, or another execution format, and then signs or deploys the output is only as trustworthy as the original artifact. If the source contains poisoned weights, malicious layers, or embedded logic that triggers on specific inputs, conversion can faithfully preserve that behaviour. The new format may even make inspection harder because the transformation introduces another layer of abstraction.

Current guidance suggests treating model conversion as a security boundary, not a convenience feature. That means verifying the source artifact before conversion, recording provenance, and checking that the output matches expected structure and semantics. The Ultimate Guide to NHIs — Key Challenges and Risks is useful here because it frames hidden, persistent identity risk as a lifecycle problem, not a one-time event. Similar logic applies to model artifacts: once untrusted state enters the pipeline, later steps may faithfully carry it forward.

Verify the source artifact hash and signature before any export or optimisation step.

Store provenance metadata so the converted file can be traced back to the original training or acquisition source.

Scan both the source and the converted artifact for unexpected operators, branches, or embedded payloads.

Use isolated conversion environments with minimal network and secret access.

Require human approval for high-risk transformations, especially when the model will be reused across environments.
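As a rough illustration of the first three checks above, the sketch below pins a source artifact to an expected SHA-256 digest, flags operators outside a reviewed allowlist, and emits a provenance record linking the converted file to its source. It is a minimal sketch, not a complete control: the function names, the `origin` field, and the idea of passing operator names in as a plain list are all assumptions made for illustration. Signature verification and extracting operator names from a real model format are deliberately left out, because they depend on the signing scheme and model tooling in use.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file so large model artifacts are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_source(path: Path, expected_sha256: str) -> str:
    """Refuse to continue if the artifact does not match the pinned digest."""
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(f"digest mismatch for {path.name}: got {actual}")
    return actual


def unexpected_operators(op_types, allowlist):
    """Return operator names that are not on the reviewed allowlist.

    `op_types` is assumed to be a list of operator names already extracted
    from the model graph by whatever tooling the team uses.
    """
    return sorted(set(op_types) - set(allowlist))


def provenance_record(source: Path, converted: Path, origin: str) -> str:
    """Link the converted file back to the source digest and its origin."""
    return json.dumps(
        {
            "origin": origin,  # hypothetical field: vendor, repo, or training run
            "source_sha256": sha256_of(source),
            "converted_sha256": sha256_of(converted),
        },
        sort_keys=True,
    )
```

In a pipeline, `verify_source` would run before the converter is invoked, `unexpected_operators` would run against both the source and the converted graph, and the provenance record would be stored alongside the converted file so later stages can trace it back.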

For broader control expectations, the OWASP NHI Top 10 and NIST-style supply chain controls both point to the same operational lesson: trust must be proven at each hop, not assumed because the file extension changed. These controls tend to break down when teams convert third-party models at scale in shared build systems, because provenance is lost between acquisition, transformation, and deployment.

Common Variations and Edge Cases

Tighter model validation often increases delivery time and compute cost, requiring organisations to balance deployment speed against assurance. That tradeoff is especially visible when teams need to convert many models quickly for multiple runtimes or edge devices. Best practice is evolving, and there is no universal standard for how deep conversion-time inspection must go.

Some environments need extra caution. If a model is distilled from an untrusted source, conversion may preserve the distilled bias or backdoor even when the final file looks clean. If the runtime performs graph optimisations, those steps can hide provenance clues and make later forensics harder. If the model is distributed through a CI/CD pipeline, any compromised build secret or signing key can turn a tampered source into a trusted release. The practical response is to validate the artifact before conversion, after conversion, and again before deployment, with separate attestations for each step.
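One way to picture "separate attestations for each step" is a chain in which every stage records the digest of the artifact it saw and binds itself to the previous stage's attestation. The sketch below is illustrative only and rests on several assumptions: the stage names are hypothetical, each stage is assumed to hold its own key, and HMAC from the standard library stands in for the asymmetric signatures a production pipeline would more likely use.

```python
import hashlib
import hmac
import json

FIELDS = ("stage", "artifact_sha256", "previous_mac")


def _mac(body: dict, key: bytes) -> str:
    """MAC over a canonical JSON encoding of the attestation body."""
    payload = json.dumps(body, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def attest(stage: str, artifact: bytes, key: bytes, previous_mac: str = "") -> dict:
    """Create one attestation bound to the artifact digest and the prior step."""
    body = {
        "stage": stage,
        "artifact_sha256": hashlib.sha256(artifact).hexdigest(),
        "previous_mac": previous_mac,
    }
    return {**body, "mac": _mac(body, key)}


def verify_chain(chain: list, keys: dict, final_artifact: bytes) -> bool:
    """Check every MAC, every link, and that the last step saw the final bytes."""
    previous_mac = ""
    for att in chain:
        key = keys.get(att.get("stage"))
        if key is None:
            return False  # unknown stage: no key to verify against
        body = {name: att.get(name) for name in FIELDS}
        if not hmac.compare_digest(_mac(body, key), str(att.get("mac", ""))):
            return False  # attestation was altered or signed with another key
        if att.get("previous_mac") != previous_mac:
            return False  # a step was dropped, reordered, or replaced
        previous_mac = att["mac"]
    final_digest = hashlib.sha256(final_artifact).hexdigest()
    return bool(chain) and chain[-1]["artifact_sha256"] == final_digest
```

A pipeline would call `attest` once before conversion, once after conversion, and once before deployment, passing each result's `mac` into the next call. Because each stage uses its own key, compromising a single build secret is not enough to forge the whole chain, although it does not remove the need to protect each key.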

For teams building governance around this risk, Ultimate Guide to NHIs — Why NHI Security Matters Now reinforces the broader point: once identity-bearing assets are assumed trustworthy without inspection, compromise becomes durable. Model pipelines are no exception, especially when source artifacts come from external marketplaces, shared repositories, or delegated vendor workflows.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Non-Human Identity Top 10	NHI-01	Model tampering is a provenance and trust problem for identities and artifacts.
NIST CSF 2.0	PR.DS-6	Protecting integrity of data and artifacts directly fits conversion risk.
NIST AI RMF		AI RMF addresses trustworthy AI lifecycle controls and provenance risk.

Verify artifact origin and integrity before conversion and reject untrusted sources.
