
How should security teams manage certificates when manual renewal no longer scales?

Security teams should treat certificate management as a governed lifecycle process, not a ticket-driven admin task. That means inventorying every certificate, assigning ownership, automating renewals where possible, and linking exceptions to business services. If the organisation cannot see which certificates exist and who owns them, manual renewal will keep creating avoidable outages.

Why This Matters for Security Teams

Certificate renewal stops being a simple admin task the moment the estate grows faster than the spreadsheet. When certificates sit across APIs, internal services, containers, devices, and third-party integrations, missed ownership becomes the real risk, not just missed dates. NHIMG research shows that 57% of organisations lack a complete inventory of their machine identities, and manual tracking still dominates at 61% according to SailPoint’s The Critical Gaps in Machine Identity Management report. That gap turns routine expiry into an outage and, in some cases, a security incident. Best practice is now shifting toward lifecycle management, but there is no universal standard for every environment yet. Teams should anchor decisions in broader identity governance, not just certificate operations, and align with patterns described in the NHI Lifecycle Management Guide and the NIST Cybersecurity Framework 2.0. In practice, many security teams encounter certificate failure only after service interruption has already forced emergency renewal.

How It Works in Practice

Manual renewal scales poorly because certificates are not isolated assets. They are tied to workload identity, service availability, and trust chains that span internal and external systems. The operational model should therefore include four controls: inventory, ownership, renewal automation, and exception handling. Inventory means every certificate is discoverable, including short-lived service certificates and certificates embedded in CI/CD, cloud services, and appliances. Ownership means each certificate is mapped to a business service and an accountable team. Renewal automation means replacing ticket-based renewals with policy-driven processes where the system can request, issue, deploy, and revoke without human handoffs. Exception handling means the few certificates that cannot be automated are tracked as explicit risk acceptances with expiry dates and compensating controls.
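The four controls can be read as fields on a single inventory record. The following is a minimal sketch, not a reference schema: the class names, field names, and gap messages are all hypothetical and would need to be adapted to whatever inventory or CMDB the organisation already runs.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class RiskAcceptance:
    """Hypothetical record for a certificate that cannot be automated."""
    reason: str
    review_by: date  # the acceptance itself expires and must be re-approved
    compensating_controls: List[str] = field(default_factory=list)


@dataclass
class CertificateRecord:
    """Hypothetical inventory entry covering the four controls."""
    subject: str
    not_after: date
    business_service: Optional[str] = None      # ownership: mapped service
    owner_team: Optional[str] = None            # ownership: accountable team
    auto_renew: bool = False                    # renewal automation
    exception: Optional[RiskAcceptance] = None  # exception handling


def governance_gaps(record: CertificateRecord, today: date) -> List[str]:
    """Return the governance gaps for one record (empty list means none)."""
    gaps = []
    if not record.business_service:
        gaps.append("no business service mapped")
    if not record.owner_team:
        gaps.append("no accountable team")
    if not record.auto_renew:
        if record.exception is None:
            gaps.append("manual renewal without a recorded risk acceptance")
        elif record.exception.review_by < today:
            gaps.append("risk acceptance has lapsed")
        elif not record.exception.compensating_controls:
            gaps.append("risk acceptance lacks compensating controls")
    return gaps
```

Running governance_gaps across the whole inventory gives a simple report of certificates that exist but are not yet governed, which is usually a more useful starting point than a list of expiry dates alone.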

A practical workflow often looks like this:

  • Discover certificates continuously and reconcile them against service inventories.
  • Classify certificates by business criticality, lifespan, and renewal path.
  • Automate renewal for low-risk and high-volume certificates first.
  • Use short-lived credentials and dynamic issuance where supported, so expiry is part of the control design rather than an outage event.
  • Escalate only exceptions that require human approval, such as legacy appliances or regulated dependencies.
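The routing step in the workflow above can be sketched in a few lines. This is an illustration under stated assumptions: the Cert type, the automatable flag, and the 30-day renewal window are hypothetical, and discovery and classification are assumed to have already happened upstream.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Cert:
    """Hypothetical, already-classified certificate."""
    name: str
    not_after: date
    automatable: bool  # True if the renewal path needs no human handoff


def plan_renewals(
    certs: Iterable[Cert], today: date, window_days: int = 30
) -> Tuple[List[Cert], List[Cert]]:
    """Split certificates due within the window into two queues.

    Returns (auto_renew, escalate), each sorted by soonest expiry.
    Certificates outside the window are left alone.
    """
    window = timedelta(days=window_days)
    due = [c for c in certs if c.not_after - today <= window]
    by_expiry = sorted(due, key=lambda c: c.not_after)
    auto_renew = [c for c in by_expiry if c.automatable]
    escalate = [c for c in by_expiry if not c.automatable]
    return auto_renew, escalate
```

Only the second queue should reach a human. If it keeps growing, that is a signal the exception list is being used to avoid automation rather than to govern what genuinely cannot be automated.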

This model aligns with the OWASP Non-Human Identity Top 10 and the lifecycle framing in Ultimate Guide to NHIs — Lifecycle Processes for Managing NHIs. Organisations that reduce certificate sprawl also tend to improve auditability, which matters because visibility gaps are a recurring cause of machine identity failure. These controls tend to break down when certificates are embedded in legacy appliances or vendor-managed systems, because the team can neither automate issuance nor enforce consistent ownership.

Common Variations and Edge Cases

Tighter certificate control often increases operational overhead, so organisations must balance availability against administrative burden. Some environments can move almost entirely to automated rotation, while others need a hybrid model because of hardware security modules, third-party integrations, or vendor constraints. Current guidance suggests treating those cases as exceptions, not as evidence that automation is impossible. The real question is how much manual work remains and whether it is governed.

Legacy systems are the most common edge case. A mainframe, network appliance, or externally managed platform may only support manual import and export, which means the renewal process should be wrapped in change control, alerting, and pre-expiry escalation. Another common variation is service-to-service trust in microservices or Kubernetes, where certificates can be short-lived and issued more frequently. That approach reduces exposure, but only if deployment pipelines, trust anchors, and monitoring are equally mature. For organisations looking to reduce repeated certificate failure, the Top 10 NHI Issues and the Ultimate Guide to NHIs — Static vs Dynamic Secrets are useful reference points for separating static credential risk from dynamic lifecycle design. The central trade-off is simple: the more critical the service, the more rigour the renewal path needs, even if that means keeping a small number of human-managed exceptions. In older estates, that balance often fails because ownership is unclear and expiry alerts are routed to the wrong team.
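Pre-expiry escalation for manually renewed certificates can be expressed as a small threshold table. The sketch below is illustrative only: the 60, 30, and 7 day thresholds, the action names, and the fallback team name are assumptions, not values taken from any standard.

```python
from datetime import date
from typing import Optional

# Hypothetical thresholds in days before expiry, most urgent first.
ESCALATION_STEPS = [
    (7, "page on-call"),
    (30, "open change request"),
    (60, "notify owner"),
]


def escalation_for(not_after: date, today: date) -> Optional[str]:
    """Return the most urgent action that applies, or None if not yet due.

    An already expired certificate has negative days left and therefore
    falls into the most urgent step.
    """
    days_left = (not_after - today).days
    for threshold, action in ESCALATION_STEPS:
        if days_left <= threshold:
            return action
    return None


def alert_recipient(owner_team: Optional[str],
                    fallback_team: str = "security-operations") -> str:
    """Route to the accountable team, or to a named fallback if none is set.

    The fallback name is a placeholder; the point is that an alert for an
    unowned certificate still lands somewhere explicit.
    """
    return owner_team if owner_team else fallback_team
```

Keeping the thresholds in data rather than in scattered calendar reminders makes them reviewable under change control, and the explicit fallback addresses the case where ownership is unclear and alerts would otherwise go to the wrong team.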

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 sets the governance and control requirements practitioners need to meet.

Framework                        | Control / Reference | Relevance
OWASP Non-Human Identity Top 10  | NHI-03              | Covers certificate rotation and lifecycle gaps that cause outages.
NIST CSF 2.0                     | PR.AC-1             | Supports controlled access and ownership for certificate lifecycle governance.
NIST CSF 2.0                     | PR.IP-3             | Addresses formal maintenance and change processes for renewal operations.

Inventory certificates, automate rotation, and eliminate manual renewal paths wherever possible.