Rightsizing Virtual Machines in GCP: How to Cut Compute Costs Safely

Rightsizing is the process of matching a virtual machine’s CPU and memory allocation to what the workload actually uses. Done well, it lowers your Compute Engine spend without creating instability, often by 20–50% per VM with no architectural changes.

Simple explanation

Most VMs are created with a generous size estimate and never revisited. A team picks an n2-standard-8 because they are not sure what the workload needs, the application runs fine, and nobody goes back to check whether an n2-standard-4 would have been enough.

Rightsizing means closing that gap. It can take three forms:

  • Downsizing: moving to a smaller machine type in the same family (n2-standard-8 to n2-standard-4).
  • Changing machine family: switching to a family with a better price-performance fit (N2 to E2 for variable workloads, or N2 to C3 for compute-heavy ones).
  • Custom machine types: specifying exact vCPU and memory when no predefined size matches the workload shape.

How GCP rightsizing recommendations work

Google Cloud generates rightsizing recommendations through the Recommender service, which is part of the Active Assist portfolio. The service continuously monitors CPU utilisation and, when the Ops Agent is installed, memory utilisation for every Compute Engine instance in your project. When it detects sustained over-provisioning, it suggests a specific target machine type and an estimated monthly saving.

Recommendations are based on recent historical data, typically the last eight days of observed utilisation. This means they reflect what the workload did recently, not what it might do during a seasonal peak, a monthly batch run, or a one-off traffic event.

Tip

Install the Ops Agent on your VMs to give the Recommender memory utilisation signals. Without it, recommendations are based on CPU only, which can lead to suggestions that undersize memory for memory-intensive workloads.

When recommendations may be unavailable or less useful

  • VMs that have been running for fewer than a few days may not have enough data for a recommendation.
  • VMs with highly variable or bursty workloads may receive recommendations that underestimate peak needs.
  • Custom metric workloads (such as GPU-bound or I/O-bound tasks) are not well-captured by CPU and memory signals alone.
  • VMs covered by sole-tenant nodes or with specific licensing constraints may not have applicable recommendations.

Warning

Recommendations can disappear or change after a VM restarts or after a traffic spike. If you see a recommendation you plan to act on, record the details before the next observation window rotates.

Treat every recommendation as a starting point. Cross-reference it with your own monitoring data before acting.

When to use rightsizing

Rightsizing delivers the most value in these situations:

  • Long-running VMs with low sustained utilisation. The classic case. If a VM has been running for months at 10–15% average CPU, it is almost certainly over-provisioned.
  • Conservatively provisioned workloads that were never revisited. Common after initial deployments where the team chose a large size “just in case.”
  • Post-optimisation applications. After developers improve caching, query performance, or concurrency, the same workload may need significantly less CPU and memory.
  • Fleets where small per-VM savings add up. Saving $30/month per VM across 200 VMs is $72,000/year.
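The fleet arithmetic in the last bullet can be checked with a short shell sketch. The function name annual_fleet_savings is hypothetical, and the inputs are whole-dollar figures you supply yourself, not values read from any GCP API.

```shell
# Hypothetical helper: annual saving for a fleet, given a whole-dollar
# monthly saving per VM and the number of VMs. Pure integer arithmetic.
annual_fleet_savings() (
  per_vm_monthly="$1"
  vm_count="$2"
  echo $(( $per_vm_monthly * $vm_count * 12 ))
)

# Example from the text: $30/month across 200 VMs
annual_fleet_savings 30 200   # prints 72000
```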

When not to rightsize yet

  • Before a known traffic peak. Do not downsize the week before a product launch, seasonal event, or end-of-quarter processing.
  • When memory is already tight. If the VM is swapping or the application is under memory pressure, downsizing memory will cause failures.
  • When monthly or batch spikes are easy to miss. A VM that runs at 5% CPU most of the month but hits 90% during a three-hour batch job on the 28th needs that headroom.
  • When the bigger problem is architecture, not VM size. If the workload would be better served by Cloud Run, GKE, or a serverless option, rightsizing the VM is optimising the wrong layer.

Note

Rightsizing and autoscaling are not mutually exclusive. You can rightsize the base VM to remove waste, then let an autoscaler add or remove instances as demand changes. In many environments the combination saves more than either approach alone.

What to check before resizing

Run through this checklist before resizing any VM. Skipping items here is how resize operations cause outages.

  • CPU average and peaks. Check both over at least two weeks. A low average with high peaks means the VM needs that headroom.
  • Memory usage and pressure. If the Ops Agent is installed, check memory utilisation alongside CPU. Without it, SSH in and check with free -m or top during peak hours.
  • Observation window. Make sure the window is long enough to capture weekly patterns. Four weeks is a good default.
  • Batch jobs, backups, and cron spikes. Check for scheduled processes that spike utilisation on specific days or times.
  • Disk throughput and disk count. Some machine types have different maximum persistent disk throughput or maximum number of attached disks. Verify the target type supports your disk configuration.
  • Network bandwidth. Per-VM egress bandwidth scales with vCPU count on most machine families. If the workload is network-heavy, confirm the smaller type still meets throughput needs.
  • Maintenance window and redundancy. Standalone VMs require a stop/start. Plan for the outage or ensure redundancy is in place.
  • Reservations, commitments, and discounts. If the VM is covered by a committed use discount, check that downsizing does not waste committed resources. Ideally, rightsize first, then commit.

None of these items have universal pass/fail thresholds. A VM at 30% average CPU might be fine to downsize if peaks stay below 60%, or it might be too tight if the workload is latency-sensitive and needs burst headroom. Use the data as input, not as a decision.
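For the manual memory check in the list above, the sketch below is one way to turn the raw numbers into a single percentage on a Linux VM. It assumes the kernel exposes MemTotal and MemAvailable in /proc/meminfo. The function name mem_used_pct is hypothetical, and the optional file argument exists only so the function can be tried against a saved copy.

```shell
# Hypothetical helper: memory utilisation as a whole-number percentage,
# computed from MemTotal and MemAvailable in a meminfo-format file.
# Defaults to /proc/meminfo (Linux only) when no file is given.
mem_used_pct() (
  meminfo="${1:-/proc/meminfo}"
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' "$meminfo"
)
```

Run it during peak hours rather than at an arbitrary moment, since a single reading says nothing about the rest of the observation window.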

Finding rightsizing candidates

You can find candidates in two ways: manually through Cloud Monitoring, or through the Recommender service. Use both because they catch different things. Start by identifying your most expensive resources so you know where downsizing will have the biggest impact.

Manual analysis with Cloud Monitoring

Query CPU utilisation over a 30-day window for a specific VM. Check both the average (to find steady-state waste) and the maximum (to confirm you have headroom to downsize).

# Daily mean CPU utilisation for a VM over the last 30 days
# (date -d is GNU date syntax; -u keeps the timestamps in UTC to match the Z suffix)
gcloud monitoring time-series list \
  --project=PROJECT_ID \
  --filter='metric.type="compute.googleapis.com/instance/cpu/utilization"
            AND resource.labels.instance_id="INSTANCE_ID"' \
  --aggregation-align-period=86400s \
  --aggregation-per-series-aligner=ALIGN_MEAN \
  --interval-start-time=$(date -u -d "30 days ago" +%Y-%m-%dT00:00:00Z) \
  --interval-end-time=$(date -u +%Y-%m-%dT00:00:00Z)

# Daily peak CPU utilisation over the same period
gcloud monitoring time-series list \
  --project=PROJECT_ID \
  --filter='metric.type="compute.googleapis.com/instance/cpu/utilization"
            AND resource.labels.instance_id="INSTANCE_ID"' \
  --aggregation-align-period=86400s \
  --aggregation-per-series-aligner=ALIGN_MAX \
  --interval-start-time=$(date -u -d "30 days ago" +%Y-%m-%dT00:00:00Z) \
  --interval-end-time=$(date -u +%Y-%m-%dT00:00:00Z)

As a rough heuristic: VMs with average CPU below 20% and peak CPU below 60% are strong candidates. VMs between 20–40% average with peaks below 80% are worth investigating. These are guidelines, not rules. Always consider the workload context.
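The heuristic above can be written down as a small classifier, shown here as a sketch rather than a rule set. The function name classify_candidate and the three labels are hypothetical. Inputs are whole-number percentages, so a utilisation value reported as a fraction (for example 0.12) needs converting to 12 first. The text does not say what to do with a low average and a peak between 60% and 80%; this sketch assumes such VMs fall into the "investigate" group.

```shell
# Hypothetical helper: classify a VM from its average and peak CPU
# utilisation, both given as whole-number percentages (0-100).
# Thresholds mirror the rough heuristic in the text; they are not GCP rules.
classify_candidate() (
  avg="$1"
  peak="$2"
  if [ "$avg" -lt 20 ] && [ "$peak" -lt 60 ]; then
    echo "strong"
  elif [ "$avg" -le 40 ] && [ "$peak" -lt 80 ]; then
    echo "investigate"
  else
    echo "leave"
  fi
)
```

For example, classify_candidate 12 45 prints strong, while classify_candidate 30 85 prints leave. The output is a prompt for a closer look at the workload, not a decision.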

Tip

CPU alone does not tell the full story. A VM at 10% CPU but 85% memory utilisation should not be downsized based on CPU data. Always check memory alongside CPU, especially for Java applications, in-memory caches, and database workloads.

Recommender-based analysis

The Recommender service scans your project automatically and flags VMs it considers over-provisioned. Each recommendation includes the suggested target machine type and an estimated monthly cost reduction.

# List rightsizing recommendations for a zone
gcloud recommender recommendations list \
  --project=PROJECT_ID \
  --location=us-central1-a \
  --recommender=google.compute.instance.MachineTypeRecommender \
  --format="table(name,description,
                   primaryImpact.costProjection.cost.units,
                   primaryImpact.costProjection.cost.nanos)"

# Get full details for a specific recommendation
gcloud recommender recommendations describe RECOMMENDATION_ID \
  --project=PROJECT_ID \
  --location=us-central1-a \
  --recommender=google.compute.instance.MachineTypeRecommender

# Mark a recommendation as succeeded after acting on it
# (ETAG is the etag value shown in the describe output)
gcloud recommender recommendations mark-succeeded RECOMMENDATION_ID \
  --project=PROJECT_ID \
  --location=us-central1-a \
  --recommender=google.compute.instance.MachineTypeRecommender \
  --etag=ETAG

Review recommendations regularly. Monthly is a good cadence. For larger fleets, sort by estimated savings and work through the highest-impact VMs first.
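The list command above prints the cost projection as two separate fields, units and nanos, where nanos are billionths of a unit and a negative amount represents a saving. To sort by estimated savings it helps to combine them into one number. The sketch below does that. The function name cost_to_decimal is hypothetical, and the two-decimal rounding is an assumption made for readability.

```shell
# Hypothetical helper: combine the units and nanos fields of a cost
# projection into one decimal amount. nanos are billionths of a unit,
# and a missing nanos argument is treated as 0.
cost_to_decimal() (
  units="$1"
  nanos="${2:-0}"
  awk -v u="$units" -v n="$nanos" 'BEGIN { printf "%.2f\n", u + n / 1000000000 }'
)
```

For example, cost_to_decimal -30 -500000000 prints -30.50, which corresponds to an estimated saving of 30.50 in the billing currency.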

Safe standalone VM resize

Resizing a standalone VM requires stopping it, changing the machine type, and starting it again. The VM keeps its persistent disks, data, internal IP address, any static external IP address, and configuration. An ephemeral external IP address is released when the VM stops and may be different after the restart. Only the vCPU and memory allocation changes. Expect one to three minutes of downtime.

Before you resize

  • Consider taking a snapshot of attached disks if the workload is critical and you want a rollback point.
  • For VMs behind a load balancer, drain traffic first, then run gcloud compute backend-services get-health to confirm the VM is out of rotation.
  • Schedule the resize during a maintenance window if the VM has no redundancy.
  • Note the current machine type so you can revert quickly if the new size causes problems.

# Step 1: Record the current machine type
gcloud compute instances describe my-vm \
  --zone=us-central1-a \
  --format="value(machineType)"

# Step 2: Stop the VM
gcloud compute instances stop my-vm --zone=us-central1-a

# Step 3: Change the machine type
gcloud compute instances set-machine-type my-vm \
  --zone=us-central1-a \
  --machine-type=n2-standard-2

# Step 4: Start the VM
gcloud compute instances start my-vm --zone=us-central1-a

# Step 5: Verify the application is healthy
gcloud compute ssh my-vm --zone=us-central1-a \
  --command="systemctl status my-app"

After the resize

  • Verify that the application started correctly and is serving traffic.
  • Monitor CPU and memory utilisation for at least 24–48 hours to confirm the new size handles real load.
  • Check application logs for out-of-memory errors, performance degradation, or timeout increases.
  • If the resize causes problems, stop the VM, change back to the previous machine type, and restart.

Danger

Changing machine families (not just sizes) during a resize can affect available CPU platforms, supported disk types, and maximum network bandwidth. A VM that worked on N2 may not behave identically on E2 or C3. Always verify compatibility before switching families, especially for production workloads.
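For the revert path in the resize steps above, note that the describe command in Step 1 prints the machine type as a full resource URL rather than a short name. The sketch below trims it to the short form so the recorded value can be passed straight back to --machine-type. The function name machine_type_name is hypothetical.

```shell
# Hypothetical helper: reduce a machine type resource URL to its short
# name by stripping everything up to the last "/".
# A value that is already a short name passes through unchanged.
machine_type_name() (
  url="$1"
  printf '%s\n' "${url##*/}"
)
```

A typical use is to capture the Step 1 output in a variable before stopping the VM, then pass machine_type_name "$recorded_value" to set-machine-type if you need to roll back.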

Safe MIG resize

For VMs in a Managed Instance Group, you do not resize individual instances. Instead, you create a new instance template with the target machine type and trigger a rolling update. The MIG replaces instances one at a time, so the service stays available throughout.

Rolling updates reduce risk because only a fraction of instances are being replaced at any moment. If the new machine type causes problems, you can stop the rollout and roll back to the previous template.

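# Create a new instance template with the smaller machine type.
# Templates are immutable, so the new one must carry every setting you want
# to keep. Confirm the cloning flag below with
# `gcloud compute instance-templates create --help` for your gcloud version.
gcloud compute instance-templates create my-template-v2 \
  --source-instance-template=my-template-v1 \
  --machine-type=n2-standard-2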

# Update the MIG to use the new template
gcloud compute instance-groups managed set-instance-template my-mig \
  --template=my-template-v2 \
  --zone=us-central1-a

# Start a rolling update (replace one VM at a time)
gcloud compute instance-groups managed rolling-action start-update my-mig \
  --version=template=my-template-v2 \
  --max-unavailable=1 \
  --zone=us-central1-a

# Watch the rollout progress
gcloud compute instance-groups managed list-instances my-mig \
  --zone=us-central1-a

Rollout safety

  • Make sure health checks are configured on the MIG so that unhealthy new instances are detected automatically.
  • Set --max-unavailable=1 to limit blast radius so only one instance transitions at a time.
  • Monitor error rates and latency during the rollout. If metrics degrade, pause or cancel the update.
  • Keep the old instance template available. If you need to roll back, update the MIG to point back to it and trigger another rolling action.

Analogy

A MIG rolling update is like replacing tyres on a bus one at a time while it keeps driving. At no point does the bus stop completely. If one new tyre does not fit, you stop and swap it back before continuing.

If you use autoscaling on the MIG, the autoscaler will continue to work with the new template. After the rollout completes, the autoscaler creates any new instances using the updated machine type.
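The rollback described under Rollout safety can be sketched as a small wrapper around the same rolling-action command used for the rollout, pointed at the previous template. The function name rollback_mig and the GCLOUD override variable are assumptions of this sketch. Setting GCLOUD=echo prints the command instead of running it, which is a cheap way to review it first.

```shell
# Hypothetical wrapper: roll a MIG back by starting a rolling update that
# targets the previous instance template, one instance at a time.
# Args: MIG name, previous template name, zone.
# GCLOUD is a sketch-only override; it defaults to the real gcloud CLI.
rollback_mig() (
  mig="$1"
  old_template="$2"
  zone="$3"
  "${GCLOUD:-gcloud}" compute instance-groups managed rolling-action start-update "$mig" \
    --version="template=${old_template}" \
    --max-unavailable=1 \
    --zone="$zone"
)
```

With the names used earlier in this section, the call would be rollback_mig my-mig my-template-v1 us-central1-a.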

Custom machine types

Predefined machine types come in fixed vCPU-to-memory ratios. If your workload needs a combination that does not match any standard size (say 4 vCPUs with 12 GB of RAM instead of the standard 16 GB), a custom machine type avoids paying for the unused 4 GB.

Custom machine types use family-prefixed syntax: FAMILY-custom-VCPU-MEMORYMB.
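Because the memory part of the name is in MB, it is easy to get wrong by hand. The sketch below builds the name from a family, a vCPU count, and memory in whole GB. The function name custom_machine_type is hypothetical, and the sketch does not check whether the family actually permits the requested vCPU count or memory ratio.

```shell
# Hypothetical helper: build a custom machine type name from a family,
# a vCPU count, and memory in whole GB (converted to MB, 1 GB = 1024 MB).
# No validation of per-family limits is attempted.
custom_machine_type() (
  family="$1"
  vcpus="$2"
  memory_gb="$3"
  echo "${family}-custom-${vcpus}-$(( $memory_gb * 1024 ))"
)
```

For the 4 vCPU, 12 GB example above, custom_machine_type n2 4 12 prints n2-custom-4-12288.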

# Create a VM with 4 vCPUs and 12 GB RAM on the N2 family
gcloud compute instances create my-vm \
  --zone=us-central1-a \
  --machine-type=n2-custom-4-12288

# Create a VM with 2 vCPUs and 6 GB RAM on the E2 family
gcloud compute instances create my-vm \
  --zone=us-central1-a \
  --machine-type=e2-custom-2-6144

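# Extended memory: more RAM per vCPU than the family's standard limit allows
# (N2 allows up to 8 GB per vCPU before extended memory is needed)
gcloud compute instances create my-vm \
  --zone=us-central1-a \
  --custom-vm-type=n2 \
  --custom-cpu=4 \
  --custom-memory=40GB \
  --custom-extensions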

When custom types make sense

  • The workload’s CPU-to-memory ratio does not fit any predefined machine type well, and the waste from the nearest standard type is significant.
  • You have profiled the workload carefully and know the exact resource envelope it needs.
  • Extended memory is required. Some applications need much more RAM per vCPU than standard types offer.

When to use a predefined type instead

  • If the nearest predefined type wastes less than 10–15% of resources, the simplicity of standard types usually outweighs the small saving from custom sizing.
  • Custom types cost slightly more per vCPU-hour and per GB-hour than equivalent standard types. The total bill is lower only if you eliminate enough waste to offset the per-unit premium.
  • Custom types still need validation against real workload behaviour. Do not guess the sizing. Profile first, then configure.
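The trade-off between eliminated waste and the per-unit premium comes down to a simple comparison, sketched below. The function name custom_is_cheaper is hypothetical, and every price and the premium percentage are placeholders you supply. None of them are real GCP rates, so take current figures from the pricing page before relying on the result.

```shell
# Hypothetical helper: compare the hourly cost of a predefined type with a
# smaller custom type that carries a per-unit premium.
# Args: std_vcpus std_gb custom_vcpus custom_gb vcpu_price gb_price premium_pct
# Prints "custom" if the custom type is cheaper, otherwise "standard".
custom_is_cheaper() (
  awk -v sv="$1" -v sg="$2" -v cv="$3" -v cg="$4" \
      -v vp="$5" -v gp="$6" -v pr="$7" 'BEGIN {
    standard = sv * vp + sg * gp
    custom   = (cv * vp + cg * gp) * (1 + pr / 100)
    if (custom < standard) print "custom"; else print "standard"
  }'
)
```

With placeholder prices of 0.03 per vCPU-hour and 0.004 per GB-hour and a 5% premium, trimming 16 GB to 12 GB on a 4 vCPU VM favours the custom type, while trimming only to 15 GB does not.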

Rightsizing vs autoscaling vs Spot VMs vs committed use discounts

Rightsizing is one of several strategies for reducing compute costs. Each solves a different problem. Understanding when to apply each one, and in what order, matters more than picking a favourite.

Strategy | What it solves | Best for | Key trade-off
Rightsizing | Steady-state waste from over-provisioned VMs | Long-running VMs with consistently low utilisation | Requires stopping standalone VMs; needs monitoring data
Autoscaling | Variable demand: too many VMs at low traffic, too few at peak | Web applications, APIs, and workloads with unpredictable demand | Only works with MIGs; requires health checks and stateless design
Spot VMs | High compute costs for fault-tolerant or batch workloads | Batch processing, CI/CD runners, data pipelines | VMs can be preempted at any time; not for stateful services
Committed use discounts | Paying on-demand prices for predictable, stable workloads | Production VMs that run 24/7 with known resource needs | One- or three-year commitment; wasted if workload shrinks

The recommended order: rightsize first, then evaluate autoscaling for variable workloads, then commit to discounts once you know the correct baseline size. Committing to a discount before rightsizing locks in waste. Read more about sequencing in building a cost optimisation strategy.

Common beginner mistakes

  1. Checking averages without checking peaks. A VM at 5% average CPU might spike to 90% during a nightly batch job. Downsize without checking the peak and the batch job starts failing. Always look at both average and maximum utilisation across the full observation window.

  2. Ignoring memory pressure. CPU and memory are independent signals. A VM at 10% CPU but 85% memory utilisation should not be downsized based on CPU alone. Install the Ops Agent so that memory data is available to both you and the Recommender.

  3. Acting on recommendations before known traffic events. Recommendations are based on recent historical data. If a product launch, seasonal surge, or large batch run is coming up, wait until after the event to act.

  4. Resizing without validating disk, IP, and compatibility implications. Switching machine families can change maximum disk throughput, supported disk counts, network bandwidth limits, and available CPU platforms. Check compatibility before committing to the change.

  5. Buying committed use discounts before fixing oversizing. A three-year commitment on an n2-standard-8 that should be an n2-standard-4 locks in three years of paying double. Always rightsize before committing. See GCP pricing models for how CUDs work.

  6. Assuming one “good” VM size lasts forever. Applications change. Traffic patterns shift. Performance optimisations reduce resource needs. Revisit VM sizes after major application changes, traffic pattern shifts, or at least quarterly.

  7. Not cleaning up what should be deleted instead of resized. Some VMs are not over-provisioned. They are unused entirely. Before rightsizing a fleet, first clean up resources that should not exist at all.

Frequently asked questions

What does rightsizing actually mean in GCP?

Rightsizing means adjusting a virtual machine so its vCPU count, memory, and machine family match the workload it actually runs, not the workload someone guessed it would run at provisioning time. In practice that can mean moving to a smaller predefined machine type, switching machine families (for example from N2 to E2), or creating a custom machine type when no standard size fits. The goal is to stop paying for capacity the workload never uses while keeping enough headroom for peak demand.

How long should I monitor before resizing a VM?

Two weeks is the minimum useful window. Four weeks is better because it captures weekly patterns, end-of-month batch runs, and any irregular cron jobs. If the workload is seasonal or event-driven, extend the window to cover at least one full cycle of that pattern. Shorter windows risk missing spikes that only happen on specific days.

Will resizing a VM cause downtime?

For a standalone VM, yes. You must stop the instance, change the machine type, and start it again. The outage is typically one to three minutes. Persistent disks and data are not affected. For VMs inside a Managed Instance Group, you can use a rolling update to replace instances one at a time, so the service as a whole stays available throughout the resize.

Should I trust Active Assist rightsizing recommendations automatically?

No. Treat them as informed suggestions, not instructions. Recommendations are based on recent historical CPU and memory data, which means they can miss infrequent spikes, seasonal peaks, or workloads that changed recently. Always cross-check recommendations against your own monitoring data, upcoming traffic events, and application-specific requirements before acting.

Rightsizing or autoscaling: which one solves my problem?

Rightsizing fixes steady-state waste: VMs that are always too large for what they do. Autoscaling fixes variable demand: workloads that need more capacity at some times and less at others. Many environments benefit from both. Rightsize the base instance first, then let autoscaling handle demand fluctuations. If your VMs sit at low utilisation around the clock, rightsizing is the first step. If utilisation swings dramatically, autoscaling is likely more impactful.

Last verified: 27 March 2026. Cloud services change frequently. Verify details against official documentation before making infrastructure decisions.