Cloud Run vs Compute Engine in GCP: Cost, Scaling, and When to Use Each

Cloud Run is a serverless container platform: you deploy a container and Google handles everything else. Compute Engine gives you a full virtual machine with persistent disks, SSH access, and complete OS control. For most stateless web services, Cloud Run is the faster and cheaper starting point. Switch to Compute Engine when you need persistent storage, a custom OS, or sustained high-throughput compute where committed use discounts beat per-request billing.

Simple explanation

Cloud Run is a serverless container platform. You give Google a container image. Google runs it, scales it, and shuts it down when nobody is using it. You never see a server and never patch an OS, and with the default scale-to-zero behaviour you do not pay for idle time.

Compute Engine is a virtual machine service. You pick a machine type, choose an operating system, attach disks, and manage the entire stack. The VM runs until you stop it, and you pay for every second it is on regardless of whether it handles traffic.

Analogy

Cloud Run is like renting a food truck that only appears when customers show up. You pay per meal served and the truck vanishes when the lunch rush ends. Compute Engine is like leasing a full restaurant. You control the kitchen, the decor, and the menu, but you pay rent every month even if nobody walks through the door.

How the decision works

The Cloud Run vs Compute Engine choice comes down to five trade-offs:

Control vs ops overhead. Compute Engine gives you root access to the machine. You choose the OS, install any software, and configure networking at the kernel level. Cloud Run removes that control, along with all the work that comes with it. If you do not need root access, Cloud Run saves significant operational time.

Scaling model. Cloud Run scales automatically from zero to thousands of instances based on incoming requests. You set minimum and maximum instance counts, and the platform handles the rest. Compute Engine needs a managed instance group with an autoscaling policy to get similar behaviour, and even then it cannot scale to zero.

Billing model. With request-based billing, Cloud Run charges for CPU and memory in 100ms increments while an instance is handling requests, plus a small per-request fee. Zero traffic means zero cost, provided no minimum instances are configured. Compute Engine charges per second the VM is running regardless of utilisation. At low traffic, Cloud Run wins easily. At sustained high utilisation, Compute Engine with committed use discounts can be 40-60% cheaper.

Workload shape. Stateless, request-driven workloads fit Cloud Run naturally. Stateful workloads like databases, caches, and message queues need persistent storage that Cloud Run cannot provide. Cloud Run’s filesystem is writable but ephemeral: data written to disk does not survive instance recycling.

Startup speed. Cloud Run cold starts typically take 1-3 seconds, depending on how quickly the container itself starts. Compute Engine VMs typically take 30-90 seconds to boot. For services that need to handle sudden traffic spikes, Cloud Run adds capacity far faster.

Rule of thumb

If your service is stateless and you cannot justify why you need a VM, start with Cloud Run. You can always migrate to Compute Engine later, but most teams that start on Cloud Run never need to.
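The rule of thumb can be written down as a small decision helper. This is only a sketch: the `Workload` fields and the `suggest_platform` function are hypothetical names, and the 0.6 default is simply an assumed midpoint of the 50-70% utilisation range discussed in this article, not an official threshold.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a service; the field names are illustrative."""
    stateless: bool
    needs_os_control: bool = False            # root access, custom OS, kernel modules
    needs_persistent_local_disk: bool = False
    sustained_cpu_utilisation: float = 0.0    # 0.0-1.0, averaged around the clock


def suggest_platform(w: Workload, crossover: float = 0.6) -> str:
    """Suggest a starting point; `crossover` is an assumed break-even utilisation."""
    # Anything stateful or needing machine-level control falls outside Cloud Run.
    if not w.stateless or w.needs_os_control or w.needs_persistent_local_disk:
        return "Compute Engine"
    # Stateless but busy around the clock: a discounted VM may cost less.
    if w.sustained_cpu_utilisation >= crossover:
        return "Compute Engine"
    return "Cloud Run"
```

For example, `suggest_platform(Workload(stateless=True))` returns "Cloud Run". For stateful workloads a managed service such as Cloud SQL may be a better answer than a VM, which this sketch does not model.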

Direct comparison table

Dimension	Cloud Run	Compute Engine
Deployment unit	Container image	VM image or startup script
Infrastructure management	None (fully managed)	Full OS, disk, and network management
Scaling	Automatic (0 to 1,000+ instances)	Manual, or via managed instance group autoscaler
Scale to zero	Yes	No
Pricing model	Per 100ms CPU + memory during request handling	Per second while VM is running
Startup behaviour	1-3 second cold start (mitigated with min instances)	30-90 seconds to boot
Local storage	Writable but ephemeral (lost on instance recycle)	Persistent disks with consistent IOPS
Persistence	Must use external services (Cloud SQL, Cloud Storage)	Local disks persist across reboots
Networking control	Managed HTTPS endpoint; private via VPC connector	Full control: custom routes, firewall rules, kernel params
SSH access	No	Yes
OS control	No (container runtime only)	Full: choose and configure any supported OS
GPU support	Yes (NVIDIA GPUs for inference workloads)	Yes (broader selection, custom drivers, persistent storage)
Long-running workloads	Up to 60 minutes per request (jobs up to 24 hours)	Unlimited
Stateful workloads	Not suitable (no persistent local state)	Fully supported with persistent disks
Ideal use cases	Stateless APIs, webhooks, microservices, scheduled jobs	Databases, legacy apps, GPU workloads, high-utilisation services

When Cloud Run is the better choice

Cloud Run wins when your workload is stateless, request-driven, and does not need OS-level control.

REST and GraphQL APIs with variable or unpredictable traffic. Scale to zero between requests and pay nothing when idle.
Webhook receivers that process events from external services or Pub/Sub triggers.
Microservices that need to scale independently without managing infrastructure for each one.
Scheduled jobs and batch processing using Cloud Scheduler with Cloud Run Jobs.
Development and staging environments where zero idle cost avoids wasting budget on environments nobody is using overnight.
Internal tools and dashboards with low, intermittent traffic.

Cloud Run deploys in under a minute and gives you an HTTPS URL immediately. There is no load balancer to configure, no firewall rules to write, and no TLS certificate to provision.

When Compute Engine is the better choice

Compute Engine wins when you need persistent state, full machine control, or sustained high-throughput compute.

Self-managed databases (PostgreSQL, MySQL, Redis, Elasticsearch) that need persistent disk with consistent IOPS. Cloud Run cannot provide this. Use Compute Engine or a managed service like Cloud SQL.
Legacy applications that cannot be containerised and need a specific OS environment.
GPU-heavy workloads that require custom driver installation, persistent model storage on local SSD, or long-running training jobs. While Cloud Run supports GPUs for inference, Compute Engine gives you broader GPU selection and full VM-level tuning.
High-utilisation services running above 50-70% sustained CPU where committed use discounts make VMs significantly cheaper.
Custom networking requiring specific routes, kernel-level parameters, or software-defined networking configurations.
Build agents and CI runners that need root access, arbitrary software installation, and full OS control.

Use managed instance groups with Compute Engine to get auto-healing and rolling updates. Even for a single VM, an instance group adds self-healing without extra cost.

Common real-world scenarios

APIs with uneven traffic

Use Cloud Run. A product API that handles 50 requests per second during business hours and near zero at night is a textbook Cloud Run workload. You pay only during active request handling, and scaling is automatic in both directions.
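To get a feel for how many instances a workload like this keeps busy, a rough estimate multiplies request rate by average latency to get the number of in-flight requests, then divides by the per-instance concurrency. The sketch below assumes a 200ms average latency and a concurrency setting of 80; both are illustrative values, and the real autoscaler weighs other signals as well, so treat the result as an order-of-magnitude guide.

```python
import math


def estimate_instances(requests_per_second: float,
                       avg_latency_s: float,
                       concurrency_per_instance: int) -> int:
    """Rough instance count: in-flight requests divided by per-instance concurrency."""
    if concurrency_per_instance < 1:
        raise ValueError("concurrency_per_instance must be at least 1")
    # Average number of requests being processed at any moment.
    in_flight = requests_per_second * avg_latency_s
    return math.ceil(in_flight / concurrency_per_instance)


# Business hours: 50 requests/second at an assumed 200ms each, concurrency 80.
daytime = estimate_instances(50, 0.2, 80)
# Night: no traffic, so no instances are needed.
night = estimate_instances(0, 0.2, 80)
```

Under these assumptions a single instance covers the daytime load, and the estimate drops to zero at night, which is the behaviour scale-to-zero billing rewards.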

Internal services

Use Cloud Run. Internal admin dashboards, Slack bots, and reporting tools typically have low, intermittent traffic. Cloud Run with Serverless VPC Access can reach private resources without exposing anything to the public internet.

Scheduled jobs and background processing

Use Cloud Run Jobs. Nightly data exports, report generation, and cleanup tasks run as Cloud Run Jobs triggered by Cloud Scheduler. The job starts, runs to completion, and costs nothing between runs.

Databases and stateful applications

Use Compute Engine or a managed service. Self-managed databases need persistent disks and consistent IOPS. Cloud Run’s filesystem is writable but ephemeral, so it is not a substitute for real storage. For managed options, consider Cloud SQL or another managed storage service.

Legacy applications

Use Compute Engine. Applications that depend on a specific OS, require kernel modules, or were never designed to run in a container belong on a VM. Containerising these applications may be a long-term goal, but forcing them into Cloud Run causes more problems than it solves.

ML inference and GPU workloads

It depends. Cloud Run now supports NVIDIA GPUs, which works well for stateless inference behind an API. For workloads that need custom driver tuning, persistent model storage on local SSD, training jobs, or broader GPU selection, Compute Engine gives you full control. For GPU workloads at scale, GKE is also worth evaluating.

Always-on services with high utilisation

Use Compute Engine. A service running at 70%+ CPU utilisation around the clock costs less on a Compute Engine VM with a committed use discount than on Cloud Run’s per-request billing. Use Spot VMs for fault-tolerant high-utilisation workloads to save even more.

CI runners and build agents

Use Compute Engine. Build agents need root access, arbitrary software installation, and often large local disk for caches and build artifacts. These requirements exceed what Cloud Run can provide.

How to think about it

Choosing between Cloud Run and Compute Engine is like choosing between a taxi and owning a car. The taxi (Cloud Run) is cheaper if you only need rides a few times a day, and you never worry about parking, insurance, or oil changes. But if you are driving eight hours a day, every day, owning the car (Compute Engine) costs less per kilometre and you can customise it exactly how you want.

Cost comparison

Read before comparing prices

These estimates use simplified assumptions to show the general pattern. Your actual costs depend on request duration, memory allocation, region, and discount commitments. Use the Cloud Run Cost Calculator to model your specific workload before making a decision.

Low traffic: 5 million requests/month

Assumptions: 200ms average duration, 256 MB memory, us-central1.

Cloud Run: ~$2-5/month (within or just above the free tier)
Compute Engine (e2-micro, always on): ~$7/month

Cloud Run wins. The free tier covers most of this workload, and even above it the per-request cost is low.
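A minimal sketch of how such an estimate can be put together is shown below. The free-tier allowances are the ones quoted in this article; the three unit prices are assumptions included only to make the arithmetic concrete, so check current pricing for your region. The model assumes request-based billing, ignores rounding to 100ms, and uses a hypothetical `avg_concurrency` parameter for how many requests share one instance at a time.

```python
# Free-tier allowances per month, as quoted in this article.
FREE_REQUESTS = 2_000_000
FREE_GIB_SECONDS = 360_000
FREE_VCPU_SECONDS = 180_000

# ASSUMED unit prices in USD; verify against current pricing before relying on them.
PRICE_PER_MILLION_REQUESTS = 0.40
PRICE_PER_VCPU_SECOND = 0.000024
PRICE_PER_GIB_SECOND = 0.0000025


def cloud_run_monthly_cost(requests: int,
                           avg_duration_s: float,
                           memory_gib: float,
                           vcpus: float = 1.0,
                           avg_concurrency: float = 1.0) -> float:
    """Estimate monthly cost under request-based billing (simplified model)."""
    # Requests that overlap on one instance share its billed time.
    instance_seconds = requests * avg_duration_s / avg_concurrency
    vcpu_seconds = instance_seconds * vcpus
    gib_seconds = instance_seconds * memory_gib

    request_cost = max(0, requests - FREE_REQUESTS) / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    cpu_cost = max(0.0, vcpu_seconds - FREE_VCPU_SECONDS) * PRICE_PER_VCPU_SECOND
    memory_cost = max(0.0, gib_seconds - FREE_GIB_SECONDS) * PRICE_PER_GIB_SECOND
    return request_cost + cpu_cost + memory_cost


# The low-traffic scenario: 5 million requests, 200ms each, 256 MB (0.25 GiB).
estimate = cloud_run_monthly_cost(5_000_000, 0.2, 0.25, avg_concurrency=10)
```

The result is sensitive to the assumed concurrency: the fewer requests each instance handles at once, the more instance time is billed for the same traffic.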

High traffic: 500 million requests/month

Assumptions: 200ms average duration, 256 MB memory, sustained utilisation, us-central1.

Cloud Run: ~$200-400/month
Compute Engine (e2-standard-4, 1-year CUD): ~$80-100/month

Compute Engine wins at sustained high utilisation. The committed use discount reduces the per-hour cost by 37%, and because you are paying for the machine anyway, high utilisation means you get more value per dollar.
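The "more value per dollar" point can be made concrete with a short calculation. The sketch below applies the 37% discount mentioned above to a hypothetical on-demand hourly price, then divides the fixed monthly bill by the vCPU-hours actually used. The function names, the 730-hour month, and any price you pass in are assumptions for illustration.

```python
HOURS_PER_MONTH = 730  # common approximation of an average month


def committed_monthly_cost(on_demand_hourly: float, discount: float = 0.37) -> float:
    """Monthly cost of an always-on VM after a committed use discount."""
    return on_demand_hourly * (1 - discount) * HOURS_PER_MONTH


def cost_per_utilised_vcpu_hour(monthly_cost: float, vcpus: int, utilisation: float) -> float:
    """Spread the fixed monthly cost over the vCPU-hours actually doing work."""
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in the range (0, 1]")
    return monthly_cost / (vcpus * HOURS_PER_MONTH * utilisation)
```

Because the monthly bill is fixed, doubling utilisation halves the cost of each useful vCPU-hour: with a hypothetical $73/month 4-vCPU machine, the figure falls from $0.05 at 50% utilisation to $0.025 at 100%.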

The crossover

The break-even point where Compute Engine becomes cheaper is typically around 50-70% sustained CPU utilisation on equivalent resources. Below that threshold, Cloud Run’s pay-per-use billing wins. Above it, committed use discounts on Compute Engine pull ahead. For a deeper look at reducing Cloud Run spend, see Cloud Run cost optimisation.
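One simple way to reason about the crossover is a linear model: the VM bill is fixed, while the Cloud Run bill grows in proportion to utilisation. Under that assumption, the break-even utilisation is the VM's monthly cost divided by what Cloud Run would cost at 100% utilisation of equivalent resources. The function name and the dollar figures in the example are hypothetical.

```python
def break_even_utilisation(vm_monthly_cost: float,
                           cloud_run_cost_at_full_utilisation: float) -> float:
    """Utilisation above which the fixed-price VM is cheaper (linear model).

    A result above 1.0 means the VM never becomes cheaper under this model.
    """
    if cloud_run_cost_at_full_utilisation <= 0:
        raise ValueError("cloud_run_cost_at_full_utilisation must be positive")
    return vm_monthly_cost / cloud_run_cost_at_full_utilisation


# Hypothetical figures: a $90/month discounted VM versus $150/month on
# Cloud Run if equivalent resources were busy around the clock.
threshold = break_even_utilisation(90.0, 150.0)
```

With these made-up figures the threshold lands at 60%, inside the 50-70% range quoted above. Real workloads are less linear, since free tiers, per-request fees, and concurrency all shift the line.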

How it works in practice

After deploying to Cloud Run

You push a container image to Artifact Registry and run a single deploy command. Within seconds, Cloud Run gives you an HTTPS URL. Traffic arrives, instances spin up automatically. Traffic drops, instances spin down. At zero traffic, you pay nothing. Scaling, load balancing, and TLS are handled for you. Your operational work is limited to building and pushing container images, setting environment variables, and monitoring logs and metrics.

After deploying to Compute Engine

You create a VM from an instance template or manually configure one. The VM boots in 30-90 seconds. You SSH in, install your application, configure a process manager, set up monitoring agents, and manage firewall rules. Scaling means creating a managed instance group with an autoscaling policy. Updates mean rolling out new instance templates. You manage OS patches, disk snapshots, and startup scripts. The trade-off: full control in exchange for ongoing operational work.

Where GKE fits

The middle ground

If Cloud Run feels too limited and Compute Engine feels too manual, GKE (Google Kubernetes Engine) sits in between. GKE gives you container orchestration with persistent volumes, custom scheduling, service mesh, and sidecar containers. Cloud Run offers these only in part (it supports sidecar containers, for example, but not Kubernetes persistent volumes or custom scheduling), and unlike Compute Engine, GKE does not require you to manage individual VMs. The trade-off is operational complexity: GKE clusters need node pool management, version upgrades, and Kubernetes expertise. For a full side-by-side of all three options, see Cloud Run vs GKE vs Compute Engine.

Common mistakes

Running a database on Cloud Run. Cloud Run’s filesystem is writable but ephemeral. Files written during one request may vanish when the instance is recycled. Databases require persistent storage. Use Compute Engine with persistent disks or a managed service like Cloud SQL.
Not setting max instances on Cloud Run. A traffic spike can scale a service to hundreds of instances in seconds. Without a deliberately chosen maximum, costs can spike unexpectedly. Always set a max-instances limit based on your traffic expectations and budget.
Leaving Compute Engine VMs running idle. An e2-standard-2 costs ~$50-60/month whether it handles traffic or not. Stop or schedule non-production VMs. Use Compute Engine cost optimisation techniques to avoid waste.
Assuming Cloud Run cannot handle your workload. Cloud Run supports WebSockets, HTTP/2, gRPC, GPUs, up to 32 GiB memory, 8 vCPUs, and 60-minute request timeouts. Check the current limits before defaulting to Compute Engine. Many workloads that seem to need a VM actually fit Cloud Run fine.
Defaulting to Compute Engine for every new service. VMs are familiar, but the operational overhead adds up. OS patching, monitoring setup, disk management, and scaling configuration consume ongoing time. For stateless services, Cloud Run eliminates all of this.
Ignoring the Cloud Run free tier. Cloud Run’s free tier covers 2 million requests, 360,000 GiB-seconds of memory, and 180,000 vCPU-seconds per month. Many small services run entirely for free.
Claiming Cloud Run has no GPU support. Cloud Run supports NVIDIA GPUs for inference workloads. Compute Engine is still better for training, custom drivers, and persistent model storage, but blanket “no GPU” claims are outdated.
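The max-instances advice above can be tied to a budget with a little arithmetic. This sketch bounds the hourly compute cost if every allowed instance were busy for a full hour, and inverts that to find the largest limit that fits a budget. The function names are hypothetical, unit prices are passed in rather than hard-coded because they vary by region and over time, and per-request fees are left out.

```python
def worst_case_hourly_cost(max_instances: int,
                           vcpus: float,
                           memory_gib: float,
                           price_per_vcpu_second: float,
                           price_per_gib_second: float) -> float:
    """Upper bound on hourly compute cost with every instance busy all hour."""
    per_instance_second = vcpus * price_per_vcpu_second + memory_gib * price_per_gib_second
    return max_instances * per_instance_second * 3600


def max_instances_for_budget(hourly_budget: float,
                             vcpus: float,
                             memory_gib: float,
                             price_per_vcpu_second: float,
                             price_per_gib_second: float) -> int:
    """Largest max-instances value whose worst case stays within the budget."""
    per_instance_hour = worst_case_hourly_cost(
        1, vcpus, memory_gib, price_per_vcpu_second, price_per_gib_second
    )
    return int(hourly_budget // per_instance_hour)
```

Working backwards from a budget in this way gives a defensible starting value for the limit, which can then be adjusted against real traffic.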

Best practices

Default to Cloud Run for new stateless services. The operational overhead of Compute Engine is rarely justified for request-driven workloads without persistence requirements.
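Set both min and max instances on Cloud Run. A minimum of 1 keeps a warm instance ready, which avoids most cold starts for user-facing services, although new instances added during scale-out still start cold and idle minimum instances are billed. A maximum caps runaway scaling and costs.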
Use managed instance groups for Compute Engine, even for single VMs. Auto-restart and health checks protect against failures without manual intervention.
Use Serverless VPC Access when Cloud Run needs to reach private resources. Connect to Cloud SQL, Redis, or internal VMs through a VPC connector rather than exposing services to the public internet.
Use IAM service accounts, not API keys. Both Cloud Run and Compute Engine should authenticate to other GCP services via attached service accounts.
Monitor both services from day one. Cloud Run metrics are built in. Compute Engine needs the Ops Agent installed for application-level telemetry.
Evaluate cost at your actual utilisation level. Use the Cloud Run Cost Calculator and GCP Pricing Calculator to compare at your real traffic volume, not hypothetical extremes.
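As an illustration of the min and max instance practice, the sketch below assembles a `gcloud run deploy` command as an argument list and validates the limits first. The helper name, the default limits, and the image path in the usage note are hypothetical; the function only builds the command and does not run it.

```python
import shlex


def build_deploy_command(service: str,
                         image: str,
                         region: str,
                         min_instances: int = 1,
                         max_instances: int = 20) -> list[str]:
    """Build a `gcloud run deploy` argument list with explicit scaling limits."""
    if min_instances < 0 or max_instances < 1 or max_instances < min_instances:
        raise ValueError("require 0 <= min_instances <= max_instances and max_instances >= 1")
    return [
        "gcloud", "run", "deploy", service,
        "--image", image,
        "--region", region,
        "--min-instances", str(min_instances),  # keeps warm capacity for user-facing traffic
        "--max-instances", str(max_instances),  # caps runaway scaling and cost
    ]


def as_shell(command: list[str]) -> str:
    """Render the argument list as a copy-pasteable shell line."""
    return shlex.join(command)
```

Calling `as_shell(build_deploy_command("api", "us-central1-docker.pkg.dev/example-project/repo/api:latest", "us-central1"))` yields a single command line you can review before running. Keeping the limits as explicit arguments makes them visible in code review rather than left at defaults.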

The containerisation question

If the only reason you are choosing Compute Engine is that your app is not yet containerised, reconsider. Writing a Dockerfile for most applications takes less than a day. The long-term savings in operational overhead from Cloud Run are significant. Containerise new services from the start rather than defaulting to VMs out of habit.

Frequently asked questions

Is Cloud Run cheaper than Compute Engine?

For low or intermittent traffic, Cloud Run is almost always cheaper because it scales to zero and you pay nothing when idle. For workloads running at sustained high CPU utilisation (above roughly 50-70%), a Compute Engine VM with a committed use discount can cost less. A small always-on VM (e2-micro) is about $7/month. Cloud Run handling 5 million lightweight requests per month can stay under $5. The crossover depends on request volume, duration, and memory.

Can Cloud Run replace Compute Engine for all workloads?

No. Cloud Run is a strong default for stateless HTTP services, but Compute Engine is still required when you need persistent local disk, a custom operating system, full SSH access, kernel-level configuration, or long-running stateful processes like self-managed databases. Cloud Run's filesystem is writable but ephemeral, so data does not survive instance recycling.

Does Cloud Run support GPUs?

Yes. Cloud Run supports NVIDIA GPUs for workloads like ML inference. However, Compute Engine offers broader GPU selection, custom driver installation, persistent-disk-backed model storage, and full VM-level tuning. For long-running or large-scale GPU workloads, Compute Engine or GKE is usually the better fit.

Can Cloud Run connect to Cloud SQL or private VPC resources?

Yes. Cloud Run has a built-in Cloud SQL connection that uses the Cloud SQL Auth Proxy, and it reaches other private VPC resources via Serverless VPC Access connectors. When the database is reached over its private IP through the VPC, no public IP on the database is required.

When should I consider GKE instead of Cloud Run or Compute Engine?

Consider GKE when you need container orchestration features that Cloud Run does not offer, like persistent volumes, custom scheduling, service mesh, or sidecar containers at scale. GKE sits between Cloud Run's simplicity and Compute Engine's full control. See our three-way comparison for details.

Last verified: 28 March 2026. Cloud services change frequently. Verify details against official documentation before making infrastructure decisions.
