
$336K Was Hiding in Plain Sight

$336K. That's how much was hiding in our cloud infrastructure at a global enterprise. Nobody was looking.

I don't mean the money was hard to find. I mean nobody had tried. The invoices came in, finance paid them, engineering kept shipping. Costs went up every quarter. Everyone assumed that was normal.

It wasn't. Here's where the money was hiding.

  • $336K annual savings, verified via FinOps tags and invoices
  • 62.5% DataDog reduction: $12K/mo down to $4.5K/mo
  • 500K lines of legacy code removed, code nobody touched

The Breakdown

Tyler Wall achieved $336K in annualized cloud savings at a global enterprise: $278K from infrastructure consolidation, $58K from DataDog optimization (62.5% reduction), and $32K from RDS consolidation. He also removed 500,000 lines of legacy code that was still being built, tested, and deployed every release cycle.

Here's where the money went:

  • Cloud infrastructure consolidation: $278K
  • DataDog optimization: $12K/month down to $4.5K/month ($90K annualized)
  • RDS consolidation: $32K
  • 500,000 lines of dead code removed: not a dollar figure, but every line was being compiled, tested, and deployed on every release

The total adds up to more than $336K because some savings overlapped across categories. The $336K is the verified net number from FinOps tags and invoices.

The Audit

I started with a question nobody had asked: what are we actually paying for?

The company had grown fast. Teams spun up infrastructure for projects, projects ended, infrastructure stayed. Nobody owned the teardown. I pulled three months of invoices, cross-referenced them with deployment manifests, and built a spreadsheet of every resource, its owner, and its last meaningful traffic.

A third of it had no owner. Another chunk hadn't served real traffic in months.
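For the curious, the no-owner scan reduces to a few lines against the Resource Groups Tagging API. A simplified sketch in boto3, assuming a `team` tag convention (the tag key is illustrative; the real audit also joined in invoice line items and deployment manifests):

```python
import boto3

# The Resource Groups Tagging API sees most taggable AWS resources in a region.
tagging = boto3.client("resourcegroupstaggingapi")

unowned = []
paginator = tagging.get_paginator("get_resources")
for page in paginator.paginate(ResourcesPerPage=100):
    for resource in page["ResourceTagMappingList"]:
        tags = {t["Key"]: t["Value"] for t in resource.get("Tags", [])}
        if "team" not in tags:  # no owner tag -> goes on the audit list
            unowned.append(resource["ResourceARN"])

print(f"{len(unowned)} resources with no owner")
for arn in sorted(unowned):
    print(arn)
```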

The hard part wasn't the spreadsheet. It was the meeting where I told leadership we were burning a quarter million dollars a year on infrastructure nobody used. Engineers don't like hearing that their systems are waste. I framed it differently: these resources are costing us headcount. Every $150K we save is an engineer we can hire instead.

That got their attention.

Cloud Infrastructure: $278K

The biggest bucket. Three categories of waste:

Orphaned resources. Dev and staging environments that outlived their projects. Load balancers pointing at nothing. Elastic IPs attached to terminated instances. Each one small. Together, $94K.
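Unattached Elastic IPs are the easiest orphans to detect. Roughly this, in boto3 (the load balancer and dead-environment checks followed the same pattern):

```python
import boto3

ec2 = boto3.client("ec2")

# An Elastic IP with no AssociationId is allocated but attached to nothing,
# which means it bills every hour while doing no work.
orphaned = [
    addr["PublicIp"]
    for addr in ec2.describe_addresses()["Addresses"]
    if "AssociationId" not in addr
]
print(f"{len(orphaned)} unattached Elastic IPs: {orphaned}")
```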

Oversized nodes. Production services running on instances four times larger than their peak utilization required. One team had a cluster provisioned for Black Friday traffic year-round. I right-sized 14 services over six weeks, validated each in staging, then cut over during low-traffic windows. $112K.
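The right-sizing calls came from comparing provisioned capacity against observed peaks, not gut feel. A simplified version of the utilization pull, assuming CloudWatch's standard EC2 metrics via boto3 (the instance ID is a placeholder):

```python
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch")
end = datetime.now(timezone.utc)
start = end - timedelta(days=14)

# Hourly CPU maxima over two weeks; if the peak never clears ~25%,
# the instance is a candidate for the next size down.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    StartTime=start,
    EndTime=end,
    Period=3600,
    Statistics=["Maximum"],
)
peak = max((dp["Maximum"] for dp in stats["Datapoints"]), default=0.0)
print(f"14-day peak CPU: {peak:.1f}%")
```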

Redundant services. Two teams had built their own caching layers instead of using the shared Redis cluster. Three microservices existed to transform data between formats that a single service could handle. Consolidating these took the longest because it required cross-team coordination, but it saved $72K and reduced operational surface area.

DataDog: 62.5% Reduction

DataDog was $12K a month. For a company our size, that's high. I dug into the usage.

The problem wasn't DataDog. It was us. We were monitoring everything and looking at almost none of it. Custom metrics nobody had queried in six months. Duplicate monitors across teams watching the same endpoints. APM traces on internal health checks that fired every 10 seconds.

I built a usage report: every custom metric, every dashboard, every monitor — with a "last viewed" timestamp. The results were brutal. Over 40% of our custom metrics had zero dashboard references. We were paying to collect data that went straight to cold storage.
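The report boiled down to cross-referencing DataDog's active-metrics endpoint against dashboard definitions. A rough sketch against the public v1 API, assuming `requests` and keys in environment variables; monitors need the same pass, and the real script was more careful than a substring match:

```python
import os
import time
import requests

BASE = "https://api.datadoghq.com/api/v1"
HEADERS = {
    "DD-API-KEY": os.environ["DD_API_KEY"],
    "DD-APPLICATION-KEY": os.environ["DD_APP_KEY"],
}

# Every metric that actively reported in the last 30 days.
since = int(time.time()) - 30 * 86400
active = requests.get(f"{BASE}/metrics", headers=HEADERS,
                      params={"from": since}).json()["metrics"]

# Concatenate every dashboard definition, then check each metric name
# for at least one reference. Crude, but it surfaces the zero-reference set.
dashboards = requests.get(f"{BASE}/dashboard", headers=HEADERS).json()["dashboards"]
definitions = "".join(
    requests.get(f"{BASE}/dashboard/{d['id']}", headers=HEADERS).text
    for d in dashboards
)
unreferenced = [m for m in active if m not in definitions]
print(f"{len(unreferenced)} of {len(active)} metrics appear on no dashboard")
```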

The fix was straightforward but tedious:

  • Removed 847 unused custom metrics
  • Consolidated 23 duplicate monitors into 8
  • Disabled APM tracing on internal-only health endpoints
  • Reduced log retention from 15 days to 7 (nobody looked past day 3)

Monthly bill dropped from $12K to $4.5K. Then I called our DataDog account rep with the new utilization numbers and renegotiated the contract. The conversation is easier when you can show them you've already cut usage by more than half.

The hardest part of FinOps isn't finding the waste. It's convincing engineers that infrastructure costs are their problem too.

RDS and Dead Code

RDS consolidation saved $32K. We had five RDS instances. Two were read replicas that hadn't served a query in weeks — a feature flag change had routed all traffic to the primary months ago, but nobody updated the infrastructure. A third was a staging database running on a production-tier instance. I consolidated to two instances and right-sized staging. $32K.
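Proving the replicas were idle was a CloudWatch query, not guesswork. Roughly, in boto3 (the replica identifier is a placeholder):

```python
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch")
end = datetime.now(timezone.utc)
start = end - timedelta(days=30)

# A read replica that shows zero connections for a month is safe to question.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/RDS",
    MetricName="DatabaseConnections",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "orders-replica-1"}],
    StartTime=start,
    EndTime=end,
    Period=86400,
    Statistics=["Maximum"],
)
peak = max((dp["Maximum"] for dp in stats["Datapoints"]), default=0.0)
print(f"30-day peak connections: {peak:.0f}")
```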

500,000 lines of legacy code. This is the one that scares people. Half a million lines of code that nobody was using but everybody was afraid to delete. It was still in the build pipeline. Still being compiled. Still running through CI. Still getting deployed.

I used a combination of static analysis and runtime instrumentation to prove the code was dead. No function calls. No API hits. No test coverage that exercised real paths. I deleted it in stages across four PRs, each one sitting in staging for a week. Nothing broke.
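The runtime side of that proof can be as simple as a call tracker wrapped around suspect functions, with static analysis narrowing the suspect list first. A hypothetical Python sketch (our tooling differed in the details; `trace_usage` and the output path are illustrative):

```python
import atexit
import functools
import json
import threading

_seen: set[str] = set()
_lock = threading.Lock()

def trace_usage(fn):
    """Wrap a suspected-dead function; record its name if it ever runs."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        with _lock:
            _seen.add(f"{fn.__module__}.{fn.__qualname__}")
        return fn(*args, **kwargs)
    return wrapper

@atexit.register
def _dump_hits():
    # After weeks of production traffic, an empty file is the evidence
    # that lets you delete the module with confidence.
    with open("/tmp/suspected_dead_hits.json", "w") as f:
        json.dump(sorted(_seen), f, indent=2)
```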

The savings here aren't in dollars. They're in build times, CI minutes, cognitive load, and the number of engineers who no longer accidentally wander into a dead module trying to understand the codebase.

How Do You Build FinOps Discipline?

FinOps discipline requires three things: cost visibility at the team level, monthly review cadence, and tying infrastructure spending to engineering ownership. Tyler Wall built this practice by implementing resource tagging, automated cost reports per team, and a monthly review where each team lead explained their infrastructure spend. One-time audits find savings. Ongoing discipline keeps them.

A one-time audit finds $336K. Without discipline, the waste grows back in six months. I built three things to prevent that:

Resource tagging. Every infrastructure resource got a team tag and a service tag. No exceptions. Untagged resources showed up on a weekly report to engineering leadership. Within a month, tagging compliance hit 95%.

Monthly cost reviews. Once a month, each team lead looked at their infrastructure spend. Not the total company bill — their bill. The first month was uncomfortable. By the third month, teams were catching their own waste before I did.
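Once tagging compliance was in place, the per-team reports fell out of Cost Explorer. Roughly this, assuming boto3 and a `team` cost-allocation tag activated in the billing console (the dates are placeholders):

```python
import boto3

ce = boto3.client("ce")  # Cost Explorer

# Last month's spend, grouped by the team cost-allocation tag.
report = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-01-01", "End": "2024-02-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "team"}],
)
for group in report["ResultsByTime"][0]["Groups"]:
    team = group["Keys"][0].removeprefix("team$")
    cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
    print(f"{team or 'UNTAGGED'}: ${cost:,.2f}")
```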

Ownership model. Every resource had an owner. When someone left the company, their resources got reassigned. When a project ended, the teardown was part of the project closeout checklist. Not optional.

The $336K was a one-time win. The discipline is the real result. Costs stayed flat for the next three quarters even as traffic grew 40%.

The same pattern applies to AI-directed development: building the governance layer first, not after the costs surprise you. The FinOps Foundation has excellent frameworks for teams starting this work.

In This Series

Tyler Wall builds platforms that scale and cost what they should. See the platform engineering profile for the full picture, or explore the engineering manager profile for team leadership context.

Frequently Asked Questions

How do you find cloud cost savings without breaking production?

Start with observability data, not invoices. I tagged every resource by team and service, then compared actual utilization against provisioned capacity. The biggest savings came from resources nobody owned — orphaned RDS instances, oversized nodes, and monitoring agents scraping metrics no dashboard ever displayed. I validated each cut in staging before touching production.

How much can you realistically save on DataDog?

I cut DataDog spend by 62.5%, from $12K per month to $4.5K per month. The biggest wins were removing custom metrics nobody queried, consolidating duplicate monitors, and renegotiating the contract with utilization data in hand. Most teams over-instrument early and never clean up.

What is the hardest part of building a FinOps practice?

Getting engineers to care. Infrastructure costs feel like someone else's problem until you tie spending to their team's budget. Monthly cost reviews with team-level breakdowns changed the conversation from "ops will handle it" to "why is my service costing $8K a month?" Ownership is the only thing that sticks.