GCP Cost Optimisation Strategies: Practical Ways to Reduce Google Cloud Costs
GCP cost optimisation is the process of reducing what you spend on Google Cloud without degrading the services you run. It covers everything from deleting forgotten resources to restructuring how you buy compute capacity. Done well, it is a repeatable discipline. Done badly, it is a one-off cleanup that drifts back within months.
This page is for anyone whose GCP bill is higher than it should be, and anyone who wants to keep it reasonable as workloads grow. Whether you are reviewing your first invoice or running a quarterly cost review across multiple projects, the framework here applies.
The biggest levers are usually the same: right-size overprovisioned compute, commit to discounted pricing for predictable workloads, clean up idle resources, apply storage lifecycle policies, choose the right service for each workload’s traffic pattern, and build enough visibility that costs do not quietly drift. This page walks through those levers in priority order and links to the deeper guides where you can act on each one. For a full breakdown of how GCP charges work, start with understanding GCP pricing models.
Simple explanation
Cost optimisation in GCP means spending less money for the same (or better) outcomes. It is not about cutting services or degrading performance. It is about eliminating waste and buying smarter.
Think of it like heating a house. Before you negotiate a better energy tariff, you close the windows that are open (idle resources), insulate the walls (right-size what you are running), and stop heating rooms nobody uses (lifecycle policies on cold data). Only after you have reduced waste does it make sense to lock in a long-term energy contract (Committed Use Discounts).
The core sequence is:
- Reduce waste first. Delete what is not being used, downsize what is overprovisioned.
- Buy cheaper pricing for predictable usage. Committed discounts, Spot VMs for batch work.
- Improve visibility and governance. Budgets, labels, billing exports, and repeatable review processes.
How GCP cost optimisation works
A useful cost optimisation process follows a clear sequence. Skipping steps, especially jumping straight to buying commitments, is one of the most common and expensive mistakes.
Step 1: See where money is going. Export billing data to BigQuery and set up cost breakdown reports. Without visibility by project, service, and label, every decision is a guess.
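As a rough sketch of what that visibility can look like once the export is running, the helper below prints a query that totals the last 30 days of cost by service and project. The function name and the table name are placeholders, and the column names (`service.description`, `project.id`, `cost`, `usage_start_time`) assume the standard usage cost export schema, so check them against your own export before relying on the numbers.

```shell
# Print a BigQuery query that totals recent cost by service and project.
# $1 = fully qualified billing export table (placeholder, supply your own).
billing_by_service_sql() {
  cat <<SQL
SELECT
  service.description AS service_name,
  project.id AS project_id,
  ROUND(SUM(cost), 2) AS total_cost
FROM \`$1\`
WHERE usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY service_name, project_id
ORDER BY total_cost DESC
LIMIT 20
SQL
}

# Example: review the query text, then pass it to the bq CLI.
# bq query --use_legacy_sql=false "$(billing_by_service_sql 'my-project.billing.gcp_billing_export_v1_XXXXXX')"
```

Printing the query rather than running it keeps the sketch safe to try; the commented `bq query` line shows one way it might be executed.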
Step 2: Find waste and overprovisioning. Use Recommender to find overprovisioned VMs. Audit for unused resources: unattached disks, idle VMs, reserved IPs nobody is using. This step costs nothing and often delivers the fastest savings.
Step 3: Choose better pricing models. Once you know your actual baseline, apply Committed Use Discounts to predictable compute and Spot VMs to fault-tolerant batch work. These are the largest single discounts available.
Step 4: Redesign where the service choice is wrong. A VM running at 2% utilisation overnight might belong on Cloud Run. A bucket full of three-year-old logs should not be in Standard class. Match services and storage tiers to actual access patterns.
Step 5: Put guardrails in place. Set billing budgets and alerts, enforce labelling standards, and schedule regular reviews. Without guardrails, costs drift back up within a quarter.
The order matters. Teams that jump to Step 3 (buying commitments) before completing Step 2 (finding waste) often lock in discounts on oversized resources. That is like getting a loyalty discount on a flat you are only using half of.
For teams ready to build this into a formal practice, the FinOps principles guide covers how to make cost accountability a shared discipline rather than a one-person job.
Start with the biggest cost drivers
Before diving into individual tactics, understand where GCP bills usually concentrate. Most teams find that 80% of their spend comes from two or three areas. Optimise those first.
| Cost area | What usually goes wrong | Highest-impact fix | Deeper guide |
|---|---|---|---|
| Compute (VMs) | Overprovisioned machine types, no committed discounts on steady workloads | Right-size with Recommender, then buy CUDs for the baseline | Compute Engine cost optimisation |
| Storage | Everything sits in Standard class indefinitely; old disks never deleted | Lifecycle policies to Nearline/Coldline; delete unattached disks | Storage cost optimisation |
| BigQuery / analytics | Full table scans, no partitioning, on-demand pricing at scale | Partition and cluster tables; prune columns in queries | BigQuery cost optimisation |
| Network egress | Cross-region traffic, public internet egress where private connectivity exists | Keep traffic in-region; use Private Google Access; cache at the edge | Network egress costs explained |
| Idle / forgotten resources | Unattached disks, reserved IPs, stopped VMs still paying for disk | Regular audits with CLI scripts or Recommender | Cleaning up unused resources |
| Visibility / governance | No labels, no budgets, billing data never exported | Labels on all resources, budgets at 80% and 100%, BigQuery export | Cost breakdown reports |
Committed Use Discounts: the highest single lever
If you have predictable baseline compute that runs continuously (production VMs, database instances, long-running workers), Committed Use Discounts (CUDs) are the largest cost reduction available in GCP. You commit to a minimum amount of vCPU and memory for 1 or 3 years. GCP charges the discounted rate for that committed amount regardless of whether you use it.
Think of CUDs like a gym membership. If you go every day, the per-visit cost is far lower than paying each time at the door. But if you stop going after three months, you are still paying for the full year. The savings are real only when the commitment matches your actual, sustained usage.
Discount levels for Compute Engine CUDs:
- 1-year commitment: up to 37% off on-demand pricing
- 3-year commitment: up to 57% off on-demand pricing
Resource-based CUDs are purchased for a project and region and apply to a machine series rather than a single machine type. You commit to an amount of vCPUs and RAM, and any eligible VMs in that series consume the committed capacity first, then overflow to on-demand pricing.
When CUDs save money: steady production workloads you are confident will run for the full commitment period.
When not to use them: dev/staging environments, experimental workloads, anything that might be deprecated within the commitment window. A 3-year CUD on a workload you shut down after six months means paying for unused capacity for two and a half years.
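The break-even point behind that warning can be worked out directly. The sketch below assumes the committed capacity would otherwise have been bought at on-demand rates and uses the maximum discounts quoted above; the function name is illustrative only.

```shell
# Months a workload must keep running for a commitment to beat on-demand.
# $1 = term in months, $2 = discount as a fraction (0.57 means 57% off).
cud_break_even_months() {
  awk -v term="$1" -v discount="$2" \
    'BEGIN { printf "%.1f\n", term * (1 - discount) }'
}

cud_break_even_months 12 0.37   # 1-year commitment, prints 7.6
cud_break_even_months 36 0.57   # 3-year commitment, prints 15.5
```

Under these assumptions, a 3-year commitment only pays off if the workload runs for roughly 15.5 months or longer. Anything shut down sooner would have been cheaper on-demand.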
Never buy CUDs before right-sizing. If your VMs are overprovisioned, you lock in the waste at a discount instead of eliminating it. Right-size first, observe for two weeks, then commit.
Spend at least two weeks running Recommender and acting on its suggestions before purchasing CUDs. The deeper guide on Compute Engine cost optimisation walks through the full right-size-then-commit workflow.
Right-sizing: matching resources to actual utilisation
Overprovisioning is the most common source of compute waste. A team picks a VM size that “feels safe” and never revisits it. It is like wearing shoes two sizes too big because your feet might grow. A VM provisioned at n2-standard-8 (8 vCPUs, 32 GB RAM) running at 10% average CPU utilisation is paying for roughly 7.2 vCPUs it never uses.
GCP’s Recommender service analyses recent utilisation data (the previous 8 days by default) and suggests the right machine type for each VM. It is free and available in the console and via CLI:
# List VM right-sizing recommendations for a specific zone
gcloud recommender recommendations list \
--recommender=google.compute.instance.MachineTypeRecommender \
--location=us-central1-a \
--project=my-app-prod \
--format="table(content.overview.recommendedMachineType.name, primaryImpact.costProjection.cost.units)"
When it saves money: any environment where VMs have been running for more than 8 weeks without a sizing review.
Do not blindly apply every recommendation. VMs with periodic spikes (batch processing, end-of-month reports) may show low average utilisation but need headroom for bursts. Always check peak utilisation, not just averages, before downsizing.
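To illustrate why the average alone can mislead, here is a minimal sketch that summarises CPU samples into both an average and a peak. The function name and the sample values are hypothetical; in practice the samples would come from your monitoring export.

```shell
# Summarise CPU utilisation samples (one percentage per line on stdin).
# Prints the average and the peak so both can be checked before downsizing.
cpu_summary() {
  awk '
    BEGIN { peak = 0 }
    NF { sum += $1; n++; if ($1 > peak) peak = $1 }
    END { if (n) printf "avg=%.1f peak=%.1f\n", sum / n, peak }
  '
}

# Hypothetical samples: mostly idle, with one end-of-month burst.
printf '%s\n' 8 10 9 11 92 10 | cpu_summary   # prints avg=23.3 peak=92.0
```

An average of about 23% suggests a much smaller machine, while the 92% peak shows that the burst still needs the headroom.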
Common mistake: right-sizing once and forgetting. Workloads change over time. Schedule a quarterly review or set up Recommender notifications. The rightsizing virtual machines guide covers safe procedures and decision criteria in detail.
Spot VMs: 60-91% off for fault-tolerant workloads
Spot VMs are surplus Compute Engine capacity sold at a steep discount. The trade-off is simple: GCP can reclaim them with 30 seconds' notice when it needs the capacity for on-demand customers. Think of them like standby airline tickets. The price is dramatically lower, but you accept the risk that your seat could be given away at the last moment.
# Create a Spot VM for batch processing
gcloud compute instances create my-batch-job \
--machine-type=n2-standard-4 \
--provisioning-model=SPOT \
--instance-termination-action=STOP \
--zone=us-central1-a \
--project=my-app-prod
When Spot VMs save money: batch jobs that can checkpoint and restart, such as data pipelines, ML training, rendering, and large exports. An n2-standard-4 that costs around $0.19/hour on-demand can drop to $0.02-0.07/hour as a Spot VM.
Spot VMs are not just “cheaper VMs.” They are interruptible VMs that happen to be cheaper. Never use them for anything where interruption causes data loss, requires manual recovery, or serves live user traffic.
Common mistake: running Spot VMs without a checkpointing strategy. If the job restarts from scratch after every preemption, you can end up spending more in wasted compute than you save on the discount. See Spot VMs for cost savings for preemption handling patterns.
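A simple way to reason about that risk is the ratio between the on-demand and Spot prices. The sketch below uses the approximate n2-standard-4 prices quoted above; the function name is illustrative, and the model ignores disk and restart overheads.

```shell
# How many times the useful compute hours a Spot job can consume
# (including work redone after preemptions) before on-demand is cheaper.
# $1 = on-demand hourly price, $2 = Spot hourly price.
spot_break_even_ratio() {
  awk -v on_demand="$1" -v spot="$2" \
    'BEGIN { printf "%.2f\n", on_demand / spot }'
}

spot_break_even_ratio 0.19 0.07   # upper end of the Spot range, prints 2.71
spot_break_even_ratio 0.19 0.02   # lower end of the Spot range, prints 9.50
```

At the upper end of the Spot price range, a job that burns more than about 2.7 times its useful hours because of restarts costs more than running it on-demand.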
Cloud Storage lifecycle policies: move cold data automatically
Cloud Storage data accumulates in the Standard class by default and stays there indefinitely unless you set a lifecycle policy. For teams with significant volumes of backups, logs, or exports, this is a quiet but persistent cost.
Picture a warehouse. You would not keep last year’s archived boxes on the expensive front shelves right next to today’s active inventory. You move them to cheaper storage in the back. Cloud Storage lifecycle policies do the same thing automatically.
Storage class pricing (approximate):
- Standard: $0.020/GB/month
- Nearline: $0.010/GB/month (minimum 30-day storage)
- Coldline: $0.004/GB/month (minimum 90-day storage)
- Archive: $0.0012/GB/month (minimum 365-day storage)
Moving 1 TB from Standard to Coldline saves roughly $16/month on that single terabyte. Across multiple buckets and years of accumulated data, the savings compound quickly.
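The same arithmetic can be repeated for any class and volume. This is a minimal sketch using the approximate prices listed above and treating 1 TB as 1,000 GB; it leaves out retrieval fees and minimum storage durations, which matter for data that is read back often. The function name is illustrative.

```shell
# Monthly saving from moving data between storage classes.
# $1 = size in GB, $2 = current price per GB/month, $3 = target price per GB/month.
storage_monthly_saving() {
  awk -v gb="$1" -v cur="$2" -v target="$3" \
    'BEGIN { printf "%.2f\n", gb * (cur - target) }'
}

storage_monthly_saving 1000 0.020 0.010    # Standard -> Nearline, prints 10.00
storage_monthly_saving 1000 0.020 0.004    # Standard -> Coldline, prints 16.00
storage_monthly_saving 1000 0.020 0.0012   # Standard -> Archive, prints 18.80
```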
# Apply lifecycle rules: Standard → Nearline at 30 days, → Coldline at 90 days
# Write the rules to lifecycle.json, then apply them to the bucket
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {
      "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
      "condition": {"age": 30, "matchesStorageClass": ["STANDARD"]}
    },
    {
      "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
      "condition": {"age": 90, "matchesStorageClass": ["NEARLINE"]}
    }
  ]
}
EOF
gcloud storage buckets update gs://my-app-prod-backups \
  --lifecycle-file=lifecycle.json
Set lifecycle policies when you create the bucket, not months later after it has accumulated terabytes in Standard. There is no cost to having them in place and no manual action needed once configured.
The storage cost optimisation guide covers Cloud SQL and other storage services alongside Cloud Storage.
Idle resource cleanup: the fastest wins
Before optimising what you are running, remove what you are not. Keeping idle resources is like paying rent on an empty flat you forgot you still lease. They cost money every month while doing nothing useful.
Unattached persistent disks continue charging for storage after a VM is deleted:
# List persistent disks not attached to any instance
gcloud compute disks list \
--filter="NOT users:*" \
--project=my-app-prod
Unused static IP addresses cost approximately $7/month each. Small individually, but they accumulate:
# List reserved IPs not currently in use
gcloud compute addresses list \
--filter="status=RESERVED" \
--project=my-app-prod
Idle VMs at near-zero utilisation are still billed at the full machine rate. Use Cloud Monitoring to identify VMs averaging under 5% CPU for 30 days. These are candidates for deletion, downsizing, or migration to Cloud Run.
Cleanup is not a one-off project. Without a recurring process, idle resources accumulate again within weeks. Schedule a monthly audit or integrate cleanup checks into your deployment pipeline.
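For the unattached-disk audit, one cautious pattern is to generate the delete commands for review instead of running them straight away. The sketch below assumes tab-separated `name` and `zone` input, which is what `--format="value(name,zone.basename())"` should produce for zonal disks; the function name and the sample disk are hypothetical.

```shell
# Turn "name<TAB>zone" lines into delete commands without running them.
# $1 = project ID. Review the output (and snapshot anything you may need) first.
print_disk_delete_commands() {
  awk -F '\t' -v project="$1" 'NF >= 2 {
    printf "gcloud compute disks delete %s --zone=%s --project=%s --quiet\n", $1, $2, project
  }'
}

# Example input; in practice this could come from:
#   gcloud compute disks list --filter="NOT users:*" \
#     --format="value(name,zone.basename())" --project=my-app-prod
printf 'old-data-disk\tus-central1-a\n' | print_disk_delete_commands my-app-prod
```

Nothing is deleted until the printed commands are run, which leaves room for a second pair of eyes in a recurring audit.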
The cleaning up unused resources guide includes CLI scripts and a repeatable process.
Choosing Cloud Run over VMs for variable workloads
Cloud Run charges only for CPU and memory used during request processing. A service receiving 1,000 requests per day with 100ms average processing time pays for roughly 100 seconds of compute per day. A Compute Engine VM of equivalent capacity charges for 24 hours whether or not a single request arrives.
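The figure above is simple multiplication, sketched here so you can plug in your own traffic. It assumes request-based billing with requests handled one at a time, and it ignores per-request fees, billing granularity, and minimum instances; the function name is illustrative.

```shell
# Billable request-processing seconds per day versus an always-on VM.
# $1 = requests per day, $2 = average processing time in milliseconds.
billable_seconds_per_day() {
  awk -v requests="$1" -v ms="$2" 'BEGIN {
    seconds = requests * ms / 1000
    printf "%.0f seconds billed, %.2f%% of a 24-hour day\n", seconds, seconds / 86400 * 100
  }'
}

billable_seconds_per_day 1000 100   # prints: 100 seconds billed, 0.12% of a 24-hour day
```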
For workloads with variable traffic, especially those idle overnight or on weekends, Cloud Run is often dramatically cheaper than VMs. For high, steady traffic, the cost difference shrinks and a VM with CUDs may be comparable.
When Cloud Run saves money: APIs, webhooks, and services with unpredictable or bursty traffic patterns, and any service that is idle for significant parts of the day.
When VMs may be cheaper: sustained high-throughput workloads where utilisation stays above 60-70% consistently. At that level, a right-sized VM with a CUD can match or beat Cloud Run pricing.
Not sure which is cheaper for your workload? Model both options using your actual traffic data before deciding. The estimating cloud costs guide walks through the calculation.
For a direct cost comparison and decision framework, see Cloud Run vs Compute Engine. For a broader comparison including GKE, see choosing between Cloud Run, GKE, and Compute Engine. The Cloud Run cost optimisation guide covers CPU allocation modes, concurrency settings, and minimum instance tuning.
Budget alerts and cost visibility
You cannot optimise what you cannot see. Set budget alerts so cost spikes are caught before the end of the billing cycle, not after.
# Set a monthly budget with alerts at 80% and 100%
gcloud billing budgets create \
--billing-account=BILLING_ACCOUNT_ID \
--display-name="Production Monthly Budget" \
--budget-amount=500USD \
--threshold-rule=percent=0.8 \
--threshold-rule=percent=1.0
Three things to set up immediately:
- Budget alerts at 80% and 100% of expected monthly spend. This catches runaway costs early.
- Billing export to BigQuery. This lets you query costs by service, project, label, and time period.
- Consistent labels on all resources (env=prod, team=backend, service=api). Without labels, your billing report is a list of service charges with no context.
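Label standards are easier to enforce when the check is automated. As a minimal sketch, the function below reports which required keys are missing from a comma-separated label string; the function name and the required key list are assumptions for illustration, not a GCP feature.

```shell
# Report which required label keys are missing from a "k=v,k=v" label string.
# $1 = labels on the resource, $2 = space-separated required keys.
missing_labels() {
  for key in $2; do
    case ",$1," in
      *",$key="*) ;;               # key present
      *) printf '%s\n' "$key" ;;   # key missing
    esac
  done
}

missing_labels "env=prod,team=backend" "env team service"   # prints: service
```

A check like this could run in a deployment pipeline so that unlabelled resources are caught before they reach the bill.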
A missing budget alert is one of the most expensive oversights in GCP. A single misconfigured Dataflow job or an infinite retry loop in Cloud Run can generate thousands in unexpected charges before anyone notices. Set alerts on every billing account and project.
For a complete setup guide, see billing budgets and alerts. For teams building cost attribution across multiple projects and teams, the FinOps in Google Cloud page covers governance models that scale.
When to use this page
This page is designed as a starting point for several common scenarios:
- Your GCP bill is rising and you do not know where to start. Use the cost driver table above to identify your biggest areas of spend, then follow the linked guides for each one.
- You are reviewing a new architecture before production. Walk through the framework to check whether you have chosen the right compute model, storage tiers, and pricing commitments.
- You are running a monthly or quarterly cost review. Use the checklist below as a structured walkthrough.
- You are choosing between Cloud Run, GKE, and VMs partly on cost. The comparison sections and linked guides give you the numbers to model each option.
- You are building a FinOps practice. Start here for the technical levers, then move to the cost optimisation strategy and FinOps principles guides for the organisational side.
Common mistakes
Mistakes that cost real money
Committing too early. Buying CUDs before right-sizing means locking in waste at a discount. A 3-year commitment on an oversized machine type is expensive to unwind. Right-size first, observe for two weeks, then commit.
Ignoring labels. Without consistent labels, you cannot attribute a cost spike to a team, environment, or service. When the bill jumps 40%, you need to know which project caused it. Enforce labelling from day one.
Leaving storage in Standard class forever. Backups, logs, and exports that are rarely accessed should move to Nearline or Coldline automatically. Set lifecycle policies when you create buckets, not after they have accumulated terabytes.
Missing egress costs. Network egress is one of the most overlooked line items. Cross-region traffic, public internet egress, and data moving between services in different zones all incur charges. Keep traffic in-region where possible.
Treating dev and staging like production 24/7. Dev clusters, staging VMs, and test databases often run around the clock even though nobody uses them outside working hours. Schedule them to shut down overnight and on weekends, or move dev workloads to Cloud Run where you only pay during active use.
Relying on one-off cleanup instead of a repeatable process. A single cleanup sprint saves money for a month. Without a recurring review cadence, idle resources and overprovisioned VMs accumulate again. Build cost review into your monthly operations.
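The dev and staging point is easy to quantify. The sketch below works out the share of weekly VM hours avoided by running only during working hours, assuming a 168-hour week. Stopped VMs still pay for their disks, so the real saving is somewhat lower. The function name is illustrative.

```shell
# Share of weekly VM hours avoided by running only during working hours.
# $1 = hours per working day, $2 = working days per week.
schedule_saving_percent() {
  awk -v hours="$1" -v days="$2" \
    'BEGIN { printf "%.0f\n", (1 - hours * days / 168) * 100 }'
}

schedule_saving_percent 10 5   # 10 hours a day, Monday to Friday, prints 70
```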
Key comparisons
Committed Use Discounts vs Sustained Use Discounts
| | Sustained Use Discounts (SUDs) | Committed Use Discounts (CUDs) |
|---|---|---|
| How it works | Automatic. Applied when a VM runs for more than 25% of a month | You purchase a 1-year or 3-year commitment for vCPUs and memory |
| Maximum discount | Up to 30% off | Up to 57% off (3-year) |
| Risk | None. Purely automatic | You pay for the commitment whether you use it or not |
| Best for | All eligible VMs (no action needed) | Predictable production baselines you are confident will run for the full term |
| Action required | Nothing. GCP applies it | Purchase via console or CLI after sizing review |
SUDs and CUDs do not stack. A VM that is eligible for SUDs and is also covered by a CUD receives the CUD rate only. The CUD replaces the SUD where it applies.
Cloud Run vs Compute Engine for variable traffic cost
| | Cloud Run | Compute Engine |
|---|---|---|
| Billing model | Per-request: CPU and memory during request processing only | Per-hour: full machine rate whether busy or idle |
| Idle cost | Zero (when scaled to zero) | Full VM cost |
| Best traffic pattern | Bursty, variable, or with significant idle periods | Steady, high utilisation (60%+ consistently) |
| Discount options | Spend-based CUDs (smaller discounts than Compute Engine) | CUDs (up to 57%), SUDs (up to 30%), Spot (60-91%) |
| When it wins on cost | Low-to-moderate request volumes with idle gaps | High sustained throughput with committed pricing |
For a full side-by-side analysis, see Cloud Run vs Compute Engine.
One-off savings vs recurring savings
Think of it like weeding a garden. Pulling weeds once makes it look great for a week. But without mulch, borders, and a regular schedule, the weeds come back just as fast.
A one-off cleanup (deleting idle resources, right-sizing VMs) delivers immediate savings but does not prevent costs from creeping back. Recurring savings come from structural changes: lifecycle policies that run automatically, committed pricing that locks in lower rates, budget alerts that catch spikes, and a regular review process that keeps everything in check.
The most effective cost optimisation combines both approaches. Use one-off wins to get costs down quickly, then put governance in place to keep them down. If you only do the first part, you will be running the same cleanup exercise every quarter.
A practical cost review checklist
Run this monthly or quarterly. Each item should take minutes, not hours.
- Check billing reports. Compare this period’s spend to the last. Flag any service or project where costs increased more than 10%.
- Review Recommender suggestions. Run gcloud recommender recommendations list for machine type, idle VM, and unattached disk recommenders across active projects.
- Audit unattached disks. List disks with no attached VM. Snapshot if needed, then delete.
- Audit reserved IPs. List IPs in RESERVED status. Release any that are not planned for use.
- Check storage lifecycle policies. Verify that backup and log buckets have lifecycle rules. Add policies to any new buckets created since last review.
- Review BigQuery costs. Check for expensive queries using INFORMATION_SCHEMA.JOBS. Look for full table scans on large tables that should be partitioned.
- Check dev/staging schedules. Confirm non-production VMs and clusters are shutting down outside working hours.
- Review CUD utilisation. If you have existing CUDs, check whether committed capacity is being fully used. If not, investigate whether workloads shifted.
- Check egress patterns. Look for unexpected cross-region or internet egress in billing exports.
- Validate labels. Spot-check that new resources created this period have the required labels. Unlabelled resources are invisible in cost attribution.
- Update cost baseline. Record this period’s numbers so next review has a clear comparison point.
The first time you run this checklist it will take longer because you are establishing baselines. After that, most items become a quick comparison against last period’s numbers. Keep a shared spreadsheet or dashboard so the whole team can see trends over time.
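The first checklist item, flagging anything that rose more than 10%, lends itself to a small script. This is a sketch under stated assumptions: the input is one line per service with single-word names and two period totals, the figures are invented for illustration, and the function name is hypothetical.

```shell
# Flag services whose cost rose more than a threshold between two periods.
# stdin: "service previous current" per line; $1 = threshold percent (default 10).
flag_cost_increases() {
  awk -v threshold="${1:-10}" '
    NF == 3 && $2 > 0 {
      change = ($3 - $2) / $2 * 100
      if (change > threshold) printf "%s +%.1f%%\n", $1, change
    }
  '
}

# Hypothetical figures for illustration only.
printf '%s\n' "compute 1200 1260" "bigquery 300 420" "storage 150 152" \
  | flag_cost_increases 10   # prints: bigquery +40.0%
```

In a real review the three columns would come from your billing export, with the previous period's totals taken from the recorded baseline.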
Frequently asked questions
What are the fastest ways to reduce a GCP bill?
Delete idle resources first: unattached disks, unused IPs, and VMs at near-zero utilisation. These savings are immediate and risk-free. Next, apply Cloud Storage lifecycle policies to move cold data out of Standard class. Then right-size overprovisioned VMs using Recommender. These three steps routinely cut bills by 15-30% without commitments or architectural changes. See identifying expensive resources to find where to start.
What is the difference between Sustained Use Discounts and Committed Use Discounts?
Sustained Use Discounts are automatic. GCP applies up to 30% off when a VM runs for more than 25% of a month. No action required. Committed Use Discounts are purchased commitments for 1 or 3 years, giving up to 57% off. CUDs are significantly larger but carry risk: you pay for the committed amount whether you use it or not. Start with SUDs (free), then add CUDs only for workloads you are confident will run for the full commitment term. The GCP pricing models guide covers all discount types in detail.
How do I find the most expensive resources in GCP?
Export billing data to BigQuery and query by service, project, and label. The Billing Reports page in the console gives a visual breakdown by SKU and project. For compute specifically, Recommender flags overprovisioned VMs with estimated savings. For a systematic walkthrough, see how to find your most expensive GCP resources.
When should I use Spot VMs?
Use Spot VMs for fault-tolerant batch work that can checkpoint and restart: data pipelines, ML training runs, rendering jobs, and large data exports. GCP can reclaim Spot VMs with 30 seconds' notice, so they must never serve live user traffic or run workloads where interruption causes data loss. The discount (60-91% off on-demand) is substantial, but only valuable if your workload can handle preemption gracefully.
Is Cloud Run cheaper than Compute Engine?
It depends on your traffic pattern. Cloud Run charges only during request processing and can scale to zero, so it wins decisively for variable or bursty workloads with idle periods. Compute Engine with CUDs can be cheaper for sustained high-throughput services running at 60%+ utilisation. If you are unsure, estimate the costs for both options using your actual traffic data before deciding. The Cloud Run vs Compute Engine comparison breaks down the numbers.