Azure Backup's Expiry Dates Are Lying

Nick Thompson

Summary

In Azure Backup, "Stop protection and retain backup data" leaves recovery points with expiry dates that never fire. You can't delete them individually. The workaround, a "retain as per policy" option, only appears once you enable vault immutability, a feature nobody associates with lifecycle management. If you manage backup vaults at scale, you almost certainly have orphaned recovery points accruing cost right now.

The expiry date is right there in the portal. It just doesn't do anything.

You decommission a workload. The VM is gone, the project is over, but the backup data might still matter — compliance, legal hold, maybe just caution. So you do the responsible thing: stop protection and retain the backup data. Azure keeps the recovery points. You'll clean them up later, or the expiry dates will handle it. That's what expiry dates are for.

Six months later, you're auditing vault costs across your tenants and notice something. Those recovery points are still there. Every single one. The expiry dates stamped on them have passed — some by months — but nothing was cleaned up. The data is still sitting in the vault, still accruing storage charges, and the automated lifecycle you assumed was running never started.

I've seen this across large multi-tenant environments. Thousands of orphaned recovery points, some going back years. I escalated through Microsoft support to get answers. What came back confirmed every suspicion and introduced a few new ones.

What Actually Happens When You Retain Data

Microsoft's own documentation is explicit about this, if you dig deep enough: when you select "Stop protection and retain backup data," the Azure Backup service will forever retain the recovery points that have been backed up. The word "forever" is theirs, not mine.

Here's what that means mechanically. When a backup policy is active and attached to a protected item, the policy's retention rules govern recovery point lifecycle. Expired points get merged or deleted on a regular cleanup cycle. The system works exactly as you'd expect.

The moment you stop protection, that policy detaches. The recovery points remain, but the garbage collection process that would honor those expiry dates stops running against them. The dates you see in the portal — the ones that say your recovery points expire on a specific date — are timestamps from when the policy was still active. They're artifacts. They display in the UI. They surface through the API. They just don't trigger any automated cleanup.
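A toy model makes the mechanics concrete. This is only a sketch of the behavior described above, not Microsoft's implementation: the class names, the `policy` field, and the `cleanup_pass` function are all hypothetical. The point it illustrates is that a stored expiry date does nothing unless a cleanup pass actually evaluates it.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class RecoveryPoint:
    created: date
    expires: date  # stamped while a policy was still attached


@dataclass
class ProtectedItem:
    name: str
    policy: Optional[str]  # None models "stop protection and retain backup data"
    points: List[RecoveryPoint] = field(default_factory=list)


def cleanup_pass(items: List[ProtectedItem], today: date) -> int:
    """Toy cleanup cycle: prune expired points only where a policy is attached."""
    removed = 0
    for item in items:
        if item.policy is None:
            # Detached item: expiry dates are still stored, but never evaluated.
            continue
        kept = [p for p in item.points if p.expires > today]
        removed += len(item.points) - len(kept)
        item.points = kept
    return removed


if __name__ == "__main__":
    old = RecoveryPoint(created=date(2023, 1, 1), expires=date(2023, 2, 1))
    active = ProtectedItem("vm-active", policy="daily-30d", points=[old])
    stopped = ProtectedItem("vm-retired", policy=None, points=[old])
    removed = cleanup_pass([active, stopped], today=date(2024, 1, 1))
    print(removed, len(active.points), len(stopped.points))  # prints: 1 0 1
```

Both items carry the same expired recovery point. Only the one with a policy attached loses it.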

Microsoft has confirmed this is intentional. Not a bug, not a gap in implementation — by design. Once you select "retain data," the service treats the data as intentionally preserved and removes it from lifecycle enforcement entirely. The expiry dates persist in the UI as historical metadata, not as active triggers.

Because there's no visual indicator in the portal distinguishing "this expiry date is active" from "this expiry date is cosmetic," every engineer looking at these recovery points sees what looks like a functioning lifecycle. The vault looks managed. It isn't.

Microsoft's Own AI Gets This Wrong

I asked Copilot what happens to recovery points after stopping protection with retained data. It told me the expiry dates would still govern cleanup — that the points would age out on schedule. This is wrong. It's confidently, specifically wrong, and it's exactly what most engineers assume because the UI reinforces that assumption.

The documentation does say "forever retain." But it's buried in a section most people skim past, and it contradicts the intuition that an expiry date means expiry. When the vendor's own AI assistant gets the behavior wrong, you know the design has a discoverability problem.

You Can't Surgically Remove the Problem

This is where it gets operationally painful. Say you've found the orphaned recovery points. You want to clean them up — delete the ones that have passed their intended retention, keep the ones that still matter. Reasonable approach.

You can't do it.

There is no supported method (Portal, PowerShell, CLI, or REST API) to delete individual recovery points while retaining the backup item. Microsoft's documentation confirms this, and support will tell you the same thing. Your options are to delete the entire protected item, which removes every recovery point for that workload, or to delete nothing.

For a workload with three years of daily recovery points where you need to keep the last 90 days but want to purge the rest? All or nothing. For a decommissioned server where you need one recovery point for legal hold but have 400 you don't need? All or nothing.

The workaround the community has landed on is clunky but functional: resume protection on the orphaned item, attach a new policy with the shortest possible retention, wait for the garbage collection cycle to prune the old points, then stop protection again. It works, but it requires manual intervention per protected item. At scale — thousands of orphaned items across a multi-tenant environment — that's not a workaround. That's a project.

The Platform Split Nobody Talks About

Here's a detail that surprised even the Microsoft support engineer working the case: the behavior differs between vault types.

Backup Vaults — which protect Azure Disks, Blobs, PostgreSQL, and Kubernetes workloads — support "retain as per policy" when you stop protection. Recovery points expire on schedule. The lifecycle continues. This is the behavior everyone assumes exists everywhere.

Recovery Services Vaults — which protect VMs, SQL Server in Azure VMs, SAP HANA, and Azure File Shares — do not. When you stop protection with retain data in a Recovery Services Vault, you get indefinite retention. No lifecycle. No automated cleanup. Forever.

The irony is that the workloads most likely to be decommissioned at scale — VMs from completed projects, SQL databases from retired applications — are exactly the ones protected by Recovery Services Vaults. The vault type with the problem is the vault type holding most of the orphaned data.

The Immutable Vault Workaround

While investigating this issue, I found a workaround buried in a feature designed for a completely different purpose — one that didn't surface in any of the obvious documentation.

By enabling vault immutability on the Recovery Services Vault — not locked, just enabled in soft mode — an additional option appears when you stop protection: "Retain as per policy." This is the same behavior that Backup Vaults have natively. Recovery points honor their expiry dates. The GC process continues to run. The lifecycle works.

Disable immutability, and the option disappears. The vault reverts to the standard "retain data forever" behavior.

This means the capability exists in the platform. It's just gated behind a feature flag that most administrators wouldn't think to enable for this purpose, because immutability is marketed as a ransomware protection feature — not a lifecycle management tool. I had to lab it to confirm the behavior, because nothing in the documentation connects these two features.

There's a catch. Enabling vault immutability only gives you the "retain as per policy" option for workloads you stop going forward. It does not retroactively apply lifecycle enforcement to recovery points that were already retained under the old behavior.
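The rules from the last two sections fit in a few lines. The sketch below only encodes what this article observed in a lab and through support. The enum and function names are hypothetical, and nothing here calls Azure.

```python
from enum import Enum


class VaultType(Enum):
    BACKUP_VAULT = "Backup vault"
    RECOVERY_SERVICES = "Recovery Services vault"


def retain_as_per_policy_available(vault: VaultType,
                                   immutability_enabled: bool = False) -> bool:
    """Is "retain as per policy" offered when you stop protection?

    Encodes the behavior described in this article, as an assumption:
    Backup vaults offer it natively; Recovery Services vaults offer it
    only while vault immutability is enabled (locking is not required).
    """
    if vault is VaultType.BACKUP_VAULT:
        return True
    return immutability_enabled


if __name__ == "__main__":
    for vault, immutable in [
        (VaultType.BACKUP_VAULT, False),
        (VaultType.RECOVERY_SERVICES, False),
        (VaultType.RECOVERY_SERVICES, True),
    ]:
        offered = retain_as_per_policy_available(vault, immutable)
        print(f"{vault.value}, immutability={immutable}: {offered}")
```

The function describes the option at the moment you stop protection. It says nothing about items that were already stopped under the old behavior, which stay unmanaged either way.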

What Microsoft Is Doing About It

When you escalate this far enough, the picture that emerges is this: Microsoft is aware. A remediation is planned — bringing the "retain as per policy" option to Recovery Services Vaults without requiring vault immutability, targeted for the first half of 2026. But based on what's been communicated through support channels, the fix will apply to new backups only. It won't be retroactive. Existing orphaned recovery points — the thousands already accumulated across environments — will still require manual cleanup.

There's no interim guidance for reducing the billing impact. There's no supported method for selective cleanup. And the position on billing is that the "forever retain" behavior is publicly documented, so it's working as described — even if nobody reads the description until they're already deep in the problem.

The Compliance Problem Is Worse Than the Cost

The storage costs are visible and annoying. The compliance exposure is invisible and serious.

When recovery points persist indefinitely without an active retention policy, you have data with no lifecycle governance. In regulated environments — healthcare, financial services, anything touching GDPR or data residency requirements — retention policies exist for reasons that go in both directions. You need to keep data long enough to meet regulatory minimums, and you need to stop keeping data once the retention window closes. Orphaned recovery points violate the second half of that equation.

You now have backup data that no policy governs, no automated process will ever clean up, and in many cases no one in the organization knows exists. The data has an owner in the billing sense — someone's subscription is paying for it — but it has no owner in the governance sense. Nobody is making deliberate decisions about its lifecycle.

For environments that went through cloud migrations three, five, eight years ago and decommissioned workloads along the way, the orphaned recovery points are a growing compliance surface that most organizations don't know they have. The vault looks clean in the portal. The expiry dates look like they're working. They're not.

What to Actually Do

If you manage Azure Backup vaults at any scale, audit them. Specifically, look for protected items in a "protection stopped — retain data" state in your Recovery Services Vaults. Check whether the expiry dates on their recovery points have passed. If they have, and the data is still there, you have orphaned recovery points.
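The audit check itself is simple once you have an inventory. Here is a minimal sketch, assuming you have already exported protected items and their recovery points into plain dictionaries. The field names (`name`, `protection_state`, `recovery_points`, `expiry`) and the `ProtectionStopped` label are assumptions about that export, not a documented schema, so map them to whatever your tooling returns.

```python
from datetime import datetime, timezone
from typing import Dict, List

# Assumed label for items in the "protection stopped, retain data" state.
STOPPED_STATE = "ProtectionStopped"


def find_orphaned_points(items: List[dict], now: datetime) -> Dict[str, List[dict]]:
    """Return {item name: [recovery points past expiry]} for stopped items only."""
    orphaned: Dict[str, List[dict]] = {}
    for item in items:
        if item.get("protection_state") != STOPPED_STATE:
            continue  # an attached policy still governs this item's lifecycle
        expired = [
            rp for rp in item.get("recovery_points", [])
            if rp.get("expiry") is not None and rp["expiry"] < now
        ]
        if expired:
            orphaned[item["name"]] = expired
    return orphaned


if __name__ == "__main__":
    utc = timezone.utc
    inventory = [  # illustrative data, not real output
        {"name": "vm-retired", "protection_state": "ProtectionStopped",
         "recovery_points": [
             {"id": "rp-001", "expiry": datetime(2024, 1, 1, tzinfo=utc)},
             {"id": "rp-002", "expiry": datetime(2030, 1, 1, tzinfo=utc)},
         ]},
        {"name": "vm-live", "protection_state": "Protected",
         "recovery_points": [
             {"id": "rp-003", "expiry": datetime(2024, 1, 1, tzinfo=utc)},
         ]},
    ]
    report = find_orphaned_points(inventory, now=datetime(2025, 6, 1, tzinfo=utc))
    for name, points in report.items():
        print(name, [rp["id"] for rp in points])
```

Expired points on a still-protected item are deliberately ignored: the next cleanup cycle should handle those. Only stopped items with past-due expiry dates are reported.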

For workloads you have not stopped yet, enable vault immutability in soft mode on your Recovery Services Vaults before stopping protection. This exposes the "retain as per policy" option and keeps lifecycle enforcement active. You don't need to lock immutability. Soft mode is sufficient, and you can disable it later if needed.

For existing orphaned workloads, you have two paths. If you can afford to lose all recovery points for an item, delete the protected item entirely. If you need to keep some and purge others, the resume-with-short-retention workaround is your current option — resume protection, attach a policy with minimal retention, let GC run, then stop again. At scale, plan this as a project with real scope, not a quick cleanup.
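To scope that project, it helps to sort orphaned items by which path they need. The sketch below is a hypothetical planning helper. The names and the "points to keep" input are assumptions, and it calls no Azure API. It only turns a per-item decision into two work queues.

```python
from enum import Enum
from typing import Dict, List


class Cleanup(Enum):
    DELETE_ITEM = "delete the protected item and every recovery point"
    RESUME_SHORT_RETENTION = (
        "resume protection with a short-retention policy, "
        "let cleanup run, then stop protection again"
    )


def choose_cleanup(points_to_keep: int) -> Cleanup:
    """Pick a path from how many recovery points must survive for an item."""
    if points_to_keep < 0:
        raise ValueError("points_to_keep cannot be negative")
    if points_to_keep == 0:
        return Cleanup.DELETE_ITEM
    return Cleanup.RESUME_SHORT_RETENTION


def plan_cleanup(keep_counts: Dict[str, int]) -> Dict[Cleanup, List[str]]:
    """Group item names by cleanup path so each queue can be sized separately."""
    plan: Dict[Cleanup, List[str]] = {path: [] for path in Cleanup}
    for name, keep in keep_counts.items():
        plan[choose_cleanup(keep)].append(name)
    return plan


if __name__ == "__main__":
    plan = plan_cleanup({"vm-retired-01": 0, "sql-legal-hold": 1, "vm-retired-02": 0})
    for path, names in plan.items():
        print(f"{len(names)} item(s): {path.value}")
        for name in names:
            print(f"  - {name}")
```

The second queue is the expensive one, because every item in it needs manual steps and a wait for the cleanup cycle.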

Either way, stop assuming the expiry dates mean what they say. In a Recovery Services Vault without immutability enabled, once you stop protection and retain the data, those dates are a cosmetic artifact that no longer triggers any automated cleanup. The data lives forever. The bill does too.
