Where Your Cloud Budget Actually Goes

Nick Thompson

Summary

Azure Advisor points you at the visible savings — right-size this VM, shut down that one. The real cost leaks are in the places Advisor never looks: provisioned disks nobody right-sized, storage tiers nobody reviewed, load balancers distributing traffic to one server, and backup chains nobody optimized.

Advisor says you're optimized. The audit disagrees.

I pull up Azure Advisor on a tenant I'm onboarding. Cost score looks healthy — 85 out of 100. A handful of recommendations: right-size two underutilized VMs, consider a Savings Plan, shut down a dev box that's been idle for 30 days. Maybe $400 a month in projected savings, calculated helpfully in big green numbers.

Then I start actually looking.

Orphaned managed disks from VMs deleted months ago — still billing by tier and size. Premium SSDs provisioned at 1TB for workloads using 60GB. Snapshots taken during a migration that nobody cleaned up. A storage account full of VHDs from the unmanaged disk era that hasn't been touched since 2021. An App Gateway — $170 a month minimum before WAF — load balancing traffic to exactly one backend server.

By the time I'm done, the real waste is five to ten times what Advisor suggested. Not because Advisor is broken. Because Advisor is designed to recommend what Microsoft can confidently quantify with utilization telemetry. Everything else — and it's most of the waste — is invisible to it.

Advisor Is Grading Its Own Homework

Credit where it's due: Advisor is useful for what it does. It catches underutilized compute. It nudges you toward commitment discounts. It flags genuinely idle resources. If you've never looked at it, start there.

But understand what you're looking at. Advisor calculates potential savings against pay-as-you-go retail rates — not your actual negotiated rates. If you're on an Enterprise Agreement or CSP pricing, the "$2,000 in potential savings" Advisor shows might translate to $250 in real savings at your contracted rate. Teams report these phantom savings up the chain. Leadership sees the green numbers and thinks cost optimization is handled.

I've seen Advisor recommend shutting down a VM that spikes to 95% utilization for four hours every month-end during financial close. The rest of the month it sits at 8% utilization, which triggers the "underutilized" flag. Follow that recommendation and you blow up month-end processing. Advisor can't know your business context. It knows CPU averages over a lookback window.
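To see why an average hides that kind of workload, here is a minimal sketch. It does not reproduce Advisor's actual logic, which I haven't seen; it only shows the arithmetic. The 30-day window, the single four-hour spike, and the 10% "looks idle" threshold are all assumptions made for illustration.

```python
from statistics import mean

HOURS_PER_DAY = 24


def build_month_end_profile(days=30, idle_pct=8.0, spike_pct=95.0, spike_hours=4):
    """Hypothetical hourly CPU profile: idle all month, one short spike at the end."""
    samples = [idle_pct] * (days * HOURS_PER_DAY)
    samples[-spike_hours:] = [spike_pct] * spike_hours
    return samples


def summarize_cpu(samples):
    """Return the average and the peak of a list of hourly CPU percentages."""
    return {"average": mean(samples), "peak": max(samples)}


profile = build_month_end_profile()
summary = summarize_cpu(profile)

# Assumed thresholds, not Advisor's: "idle" below 10% average, "critical" at 90%+ peak.
looks_idle_on_average = summary["average"] < 10.0
has_critical_peak = summary["peak"] >= 90.0

print(f"average={summary['average']:.2f}% peak={summary['peak']:.0f}%")
print("idle by average:", looks_idle_on_average, "| critical peak:", has_critical_peak)
```

Four hours at 95% barely move a 720-hour average, so any rule keyed on the mean will call this VM idle. Checking the peak (or a high percentile) alongside the average is what surfaces the month-end job.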

The deeper problem is what Advisor doesn't look at. It doesn't flag orphaned disks. It doesn't know your storage account has 400GB of page blobs from 2019. It doesn't notice that your App Gateway has one backend, or that your backup retention is generating costs that will persist indefinitely. The recommendations dashboard covers maybe 30% of the actual cost surface. The other 70% requires a human who knows where to look.

Compute Waste: It's Not the VMs

Everyone looks at VM costs first because VMs are the biggest line item. But the compute waste I find most often isn't running VMs — it's everything around them.

Disk provisioning gaps. You pay for the provisioned size and tier of a managed disk, not the space you actually use. A P30 Premium SSD is 1TB and costs roughly $135/month whether you've written 50GB to it or 950GB. Across an environment with hundreds of VMs, disk over-provisioning is almost universal — someone picked P30 "to be safe" during the initial build and nobody revisited it. The workload needed P10 at most.
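A rough way to sanity-check a disk is to ask which Premium SSD tier would actually hold the data. The sketch below uses the published Premium SSD tier sizes; the 50% headroom factor and the function name are my own assumptions. It looks only at capacity. IOPS and throughput also scale with tier, so a workload can legitimately need a bigger tier than its data size suggests.

```python
# Premium SSD tier sizes in GiB (P30 = 1 TiB, as in the example above).
PREMIUM_SSD_TIERS = [
    ("P1", 4), ("P2", 8), ("P3", 16), ("P4", 32), ("P6", 64),
    ("P10", 128), ("P15", 256), ("P20", 512), ("P30", 1024),
    ("P40", 2048), ("P50", 4096),
]


def smallest_fitting_tier(used_gib, headroom=0.5):
    """Return the smallest tier whose size covers used_gib plus headroom.

    headroom=0.5 means "leave 50% growth room" and is an assumed default.
    Returns None if nothing in the table is large enough.
    """
    required = used_gib * (1 + headroom)
    for name, size_gib in PREMIUM_SSD_TIERS:
        if size_gib >= required:
            return name
    return None


# The 60 GB workload sitting on a 1 TB disk from earlier in this article:
print(smallest_fitting_tier(60))
```

Keep in mind that managed disks can be grown but not shrunk in place, so moving down a tier generally means creating a smaller disk and migrating the data. That is part of why nobody revisits the original choice.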

Ultra Disks scoped wrong. Ultra Disk billing is based on provisioned capacity, IOPS, and throughput, not on what the workload actually consumes. An Ultra Disk provisioned for 10,000 IOPS serving a workload that peaks at 2,000 is billing for the other 8,000 every hour of every day. These get set during deployment and forgotten. I've found Ultra Disks still provisioned at migration-day specs long after the workload stabilized.

Orphaned App Gateways and Load Balancers. This one makes me laugh every time. Legacy workloads shift — backends get decommissioned, projects consolidate, migrations half-finish — and what's left behind is an Application Gateway or load balancer with a single backend node or no backends at all. Completely orphaned, still billing. Nobody notices because the architecture diagram still shows the load balancer, so it looks intentional. The gateway costs more per month than the VM it was fronting, and sometimes the VM isn't even there anymore.
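Counting backends is easy to script once you have an inventory export. The sketch below assumes you have loaded JSON shaped like the output of `az network application-gateway list`, where each gateway carries a `backendAddressPools` list and each pool a `backendAddresses` list. The sample records and function names are hypothetical. Pools can also be populated through NIC IP configurations, which this count ignores, so treat a hit as a prompt to go look rather than a verdict.

```python
def count_backends(gateway):
    """Count address entries across all backend pools of one gateway record."""
    pools = gateway.get("backendAddressPools") or []
    return sum(len(pool.get("backendAddresses") or []) for pool in pools)


def flag_thin_gateways(gateways, max_backends=1):
    """Return (name, backend_count) for gateways with max_backends or fewer."""
    flagged = []
    for gw in gateways:
        n = count_backends(gw)
        if n <= max_backends:
            flagged.append((gw["name"], n))
    return flagged


# Hypothetical inventory, for illustration only.
sample_gateways = [
    {"name": "agw-legacy-web",
     "backendAddressPools": [{"backendAddresses": [{"ipAddress": "10.0.1.4"}]}]},
    {"name": "agw-shop",
     "backendAddressPools": [{"backendAddresses": [
         {"ipAddress": "10.0.2.4"}, {"ipAddress": "10.0.2.5"}]}]},
    {"name": "agw-abandoned",
     "backendAddressPools": [{"backendAddresses": []}]},
]

for name, n in flag_thin_gateways(sample_gateways):
    print(f"{name}: {n} backend(s)")
```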

Storage Waste: The Silent Budget Killer

Compute gets the attention. Storage is where the money quietly disappears.

Hot tier for cold data. Azure Blob Storage defaults to the hot access tier. Data lands there and stays there unless someone actively moves it. I routinely find storage accounts where 80% of the data hasn't been accessed in over a year — sitting in hot tier, billed at hot-tier rates. The difference between hot and archive tier is roughly 10x in per-GB cost. Lifecycle management policies exist. Almost nobody configures them.
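For reference, a lifecycle policy is just a small JSON document attached to the storage account. The sketch below builds one in Python and serializes it. The rule name and the 30-day and 180-day thresholds are placeholders I picked, not recommendations; check the structure against the current lifecycle management documentation before applying anything.

```python
import json


def build_lifecycle_policy(cool_after_days=30, archive_after_days=180):
    """Build a lifecycle policy dict that tiers block blobs down by last-modified age.

    The thresholds and the rule name are illustrative assumptions.
    """
    return {
        "rules": [
            {
                "enabled": True,
                "name": "tier-down-cold-blobs",
                "type": "Lifecycle",
                "definition": {
                    "filters": {"blobTypes": ["blockBlob"]},
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {
                                "daysAfterModificationGreaterThan": cool_after_days
                            },
                            "tierToArchive": {
                                "daysAfterModificationGreaterThan": archive_after_days
                            },
                        }
                    },
                },
            }
        ]
    }


policy_json = json.dumps(build_lifecycle_policy(), indent=2)
print(policy_json)
```

Two caveats worth knowing before you copy this: tiering rules apply to block blobs, so they will not touch page-blob VHDs, and archive tier trades the low per-GB rate for rehydration delay and early-deletion charges.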

Old VHDs from restores. You restore a VM from backup to test something — maybe verify a recovery point, maybe pull a file. The restored VM gets deleted when you're done. The VHD it was built from stays in the storage account. It's a page blob, it's the full disk size, and unless someone goes looking for it, it sits there for years. I find these in almost every environment.

Legacy unmanaged disk VHDs. Before managed disks became the default, every VM disk was a page blob in a storage account. Most environments migrated to managed disks years ago. The page blobs didn't get cleaned up. They're sitting in storage accounts that nobody opens because "those are the old storage accounts." They're still billing.

Snapshots nobody remembers. Taken for a migration, a troubleshooting session, a "just in case" before a change window. The change succeeded. The snapshot persisted. Multiply by every engineer who's ever taken a precautionary snapshot across every subscription, and you've got a meaningful storage line item that's pure waste.

Backup differentials and archive tiering. Full backup chains where incremental or differential policies would cut storage cost substantially. Old backup data sitting in standard vault storage that should have been moved to archive tier months ago. The Azure Backup pricing model rewards lifecycle management, but the defaults don't push you there — and the cost difference between standard and archive vault storage is significant enough to matter at scale.

Phantom Resources: Paying for Things That Don't Exist

This category is the most frustrating because the resources aren't doing anything. They're artifacts of deletions that didn't clean up completely.

Orphaned managed disks. Delete a VM and the disk doesn't automatically follow unless you explicitly selected that option. The disk persists, unattached, billing at its provisioned tier and size. Across a multi-subscription environment, I've found dozens of these in a single audit. They're invisible in normal day-to-day operations because nobody is looking at unattached disk lists.
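If you would rather script the check than click through each subscription, a filter over exported disk records is enough. This sketch assumes JSON shaped like `az disk list` output, with `managedBy` empty and `diskState` set to "Unattached" on orphaned disks. The size field's spelling is an assumption on my part and may differ between CLI versions, so the helper accepts both forms. The sample records are hypothetical.

```python
def find_unattached_disks(disks):
    """Return disk records with no owning VM and an 'Unattached' state."""
    return [
        d for d in disks
        if not d.get("managedBy") and d.get("diskState") == "Unattached"
    ]


def disk_size_gib(disk):
    """Read the provisioned size; the key spelling is assumed, so try both forms."""
    return disk.get("diskSizeGb") or disk.get("diskSizeGB") or 0


def total_provisioned_gib(disks):
    """Sum the provisioned size of the given disk records."""
    return sum(disk_size_gib(d) for d in disks)


# Hypothetical export, for illustration only.
sample_disks = [
    {"name": "vm-app01-os", "managedBy": "/subscriptions/.../vm-app01",
     "diskState": "Attached", "diskSizeGb": 128},
    {"name": "vm-old02-data", "managedBy": None,
     "diskState": "Unattached", "diskSizeGb": 1024},
    {"name": "vm-old03-os", "managedBy": None,
     "diskState": "Unattached", "diskSizeGB": 256},
]

orphans = find_unattached_disks(sample_disks)
print([d["name"] for d in orphans], total_provisioned_gib(orphans), "GiB")
```

Run it per subscription and sort by size. Before deleting anything, confirm nobody detached the disk on purpose.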

Orphaned public IPs and NICs. Same pattern — VM gets deleted, the network interface card and public IP address stick around. Standard SKU public IPs bill whether attached or not. NICs are free, but they clutter the environment and mask the real resource count when you're trying to understand what's actually running.

Orphaned backup recovery points. I wrote about this in detail in a separate article, but the short version: stop protection with retained data, and the recovery points persist indefinitely with expiry dates that never fire. The cost accrues silently, and there's no native mechanism to clean it up at scale without re-enabling protection and then stopping it again with the delete-data option.

The Commitment Trap

Advisor loves to recommend Reserved Instances and Savings Plans. The savings are real — 30-60% off pay-as-you-go, depending on term and commitment level. What Advisor can't know is your roadmap.

I've seen organizations buy 3-year Reserved Instances on Advisor's recommendation, then decommission the workload 4 months later. The reservation can be exchanged for a different SKU or scope, but most teams don't know that — or don't get around to it before the workload landscape shifts again. The unused commitment sits there, burning money.

Savings Plans are more flexible — they apply across VM families and regions — but they still require usage forecasting that most teams don't have. If your committed hourly spend is $10 and your actual consumption drops to $6, you're paying $10 anyway.
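The arithmetic behind that is worth spelling out, because it is what a quarterly review should compute. This is a minimal sketch, assuming a flat hourly commitment, usage already expressed at plan rates, and a 730-hour month. The function name and return fields are my own.

```python
HOURS_PER_MONTH = 730  # assumed average month length in hours


def commitment_outcome(committed_per_hour, eligible_usage_per_hour):
    """Compare an hourly commitment with the eligible usage that draws it down.

    Usage above the commitment is treated as overage billed separately;
    usage below it leaves part of the commitment paid for but unused.
    """
    unused = max(0.0, committed_per_hour - eligible_usage_per_hour)
    overage = max(0.0, eligible_usage_per_hour - committed_per_hour)
    covered = min(committed_per_hour, eligible_usage_per_hour)
    return {
        "utilization": covered / committed_per_hour,
        "unused_per_hour": unused,
        "overage_per_hour": overage,
        "unused_per_month": unused * HOURS_PER_MONTH,
    }


# The $10 committed, $6 consumed example from above:
result = commitment_outcome(10.0, 6.0)
print(f"utilization {result['utilization']:.0%}, "
      f"unused ${result['unused_per_month']:,.0f}/month")
```

At $4 of unused commitment per hour, that is roughly $2,920 a month paid for nothing, every month until the term ends or usage recovers.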

The fix isn't "don't use commitments." It's: don't let Advisor auto-pilot your commitment strategy based on a 30-day usage snapshot. Review commitments quarterly against actual roadmap. Assign someone to own the commitment lifecycle. This is a FinOps function, and in most organizations nobody is doing it.

What to Go Check Right Now

If you manage Azure environments and you've been relying on Advisor for cost optimization, here's where to start:

Open the Disks blade and filter for unattached. Check the storage accounts you haven't opened in a year — look for page blobs, old VHDs, and snapshots with creation dates from before your last migration. Pull up your Application Gateways and Load Balancers and count the backend pool members. Review your blob lifecycle management policies — if they don't exist, your hot-tier storage is probably full of cold data. Check your Reserved Instance utilization in Cost Management — if it's below 90%, you're leaving money on the table and probably don't know why.

None of this is in the recommendations dashboard. All of it is in your bill.

Advisor will tell you you're doing fine. Go look for yourself.
