Cloud Run vs GKE vs Compute Engine: Which Should You Use in GCP?

GCP gives you three main ways to run workloads: Cloud Run, GKE, and Compute Engine. The short answer: start with Cloud Run for new stateless HTTP services, use GKE when you need Kubernetes features specifically, and use Compute Engine when you need full OS control or cannot containerise your workload. Most teams choose wrong because they reach for what is familiar rather than what fits. This page gives you a clear decision framework so you can pick the right option the first time.

Simple explanation

If this is your first time choosing a GCP compute platform, here is the plain-English version:

  • Cloud Run — You package your app in a container and hand it to Google. Google handles servers, scaling, load balancing, and TLS certificates. You pay only when your app is actually handling requests. Best for web APIs, webhooks, and HTTP services that do not store data locally.

  • GKE (Google Kubernetes Engine) — You run a managed Kubernetes cluster. Kubernetes is an orchestration system that manages containers across multiple machines, with advanced features like multi-container pods, persistent storage, and traffic management. More control than Cloud Run, but significantly more to manage.

  • Compute Engine — You get a full virtual machine. You choose the operating system, install software, configure networking, and manage updates yourself. Maximum control, maximum responsibility. Best for legacy apps, Windows workloads, and anything that cannot run in a container.

Analogy

Think of it like food service. Cloud Run is like ordering delivery: you describe what you want, someone else handles all the cooking and logistics, and you only pay per order. GKE is like renting a commercial kitchen: you have full control over what gets cooked and how, but you staff and manage the kitchen yourself. Compute Engine is like buying the entire restaurant building: you own everything, you control everything, and you are responsible for the plumbing.

At a glance

| | Cloud Run | GKE | Compute Engine |
| --- | --- | --- | --- |
| Best for | Stateless HTTP and gRPC services | Container orchestration with Kubernetes features | Full OS control, legacy apps, non-containerisable workloads |
| Operational overhead | Very low | Medium to high | High |
| Scaling model | Automatic, per-request | HPA + node autoscaler | Manual or managed instance groups |
| Scales to zero | Yes | No (node pools persist) | No |
| Stateful workloads | No | Yes (PersistentVolumes) | Yes (persistent disks) |
| HTTP services | Yes, natively | Yes, via Ingress | Yes, manually configured |
| Background workers | Limited (Cloud Run Jobs) | Yes | Yes |
| Sidecars / multi-container | Limited (sidecar containers, no Kubernetes pods) | Yes (pods) | No |
| OS-level control | No | Node OS access in Standard mode | Full |
| Non-HTTP protocols | gRPC yes; raw TCP/UDP not accepted | Yes | Yes |
| Kubernetes required | No | Yes | No |
| Pricing model | Per request (CPU + memory during handling) | Per node (always on) + cluster fee | Per VM-second while running |
| Typical team fit | Small teams, no Kubernetes expertise needed | Teams with Kubernetes experience | Teams managing servers or doing lift-and-shift |
| Main downside | Cold starts; stateless containers only | Operational complexity and cost floor | Manual management of everything |

How this decision works

The decision is not about picking the most powerful option. It is about picking the simplest option that meets your actual requirements. Each step up in complexity adds genuine operational burden. Only take that step when a specific workload requirement demands it.

Tip

Before choosing a platform, ask one question first: is the workload stateless and HTTP-driven? If yes, start with Cloud Run. You can always migrate to GKE using the same container image later if you hit a genuine limitation.

Frame the full choice around these questions:

  • Can it be containerised? If your workload requires a custom kernel module, Windows Server, specific hardware drivers, or an OS configuration that cannot be captured in a container image, you need a VM. See Compute Engine Overview for what a VM gives you.

  • Is it stateless? A stateless service handles each request independently. It stores nothing locally between requests and any instance can serve any request. Cloud Run requires this. If your service writes to local disk or maintains in-memory session state across requests, it is not stateless. Read more in Stateless vs Stateful Services.

  • Is it HTTP or gRPC? Cloud Run is built around HTTP and gRPC. Cloud Run services do not accept raw TCP or UDP connections, so workloads that speak those or other custom protocols belong on GKE or Compute Engine.

  • Do you need Kubernetes features specifically? GKE is the right answer when you need features only Kubernetes provides. Pods let you run sidecars (second containers alongside your main app, commonly used for log collectors and service mesh proxies) and init containers that finish setup before the main app starts. Cloud Run also supports basic sidecar containers, so a single sidecar is not by itself a reason to move to GKE. PersistentVolumes are storage volumes attached to pods that survive restarts, used for self-hosted databases and caches. A service mesh like Istio adds mTLS and advanced traffic management between services. If you just need to run containers, Cloud Run does that without Kubernetes overhead. See What Is Kubernetes for background.

  • How much operational complexity can your team sustain? GKE requires expertise in cluster management, node pools, Kubernetes RBAC, version upgrades, and networking. Cloud Run requires none of that. A small team without dedicated platform engineering capacity should start on Cloud Run.
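
The stateless question above is easier to judge with a concrete contrast. The following is a minimal sketch, not a prescribed pattern: `ExternalStore` is a hypothetical stand-in for a shared database or cache, and both handler names are invented for illustration.

```python
from typing import Dict

# Stateful: the counter lives in this process's memory. Two instances would
# each keep their own count, and a restart would reset it to zero.
_local_visits: Dict[str, int] = {}


def handle_visit_stateful(user_id: str) -> int:
    _local_visits[user_id] = _local_visits.get(user_id, 0) + 1
    return _local_visits[user_id]


class ExternalStore:
    """Hypothetical stand-in for a shared store such as a database or cache."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def increment(self, key: str) -> int:
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]


def handle_visit_stateless(user_id: str, store: ExternalStore) -> int:
    # Stateless: everything that must survive the request goes through the
    # shared store, so any instance can serve any request.
    return store.increment(user_id)
```

If a handler in your service looks like the first function, the service is not stateless in the sense Cloud Run expects; moving the state behind something like the second function is usually the smaller change compared with adopting a different platform.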

When to use Cloud Run

Cloud Run is the right default for new services on GCP. You deploy a container image and Cloud Run handles load balancing, TLS termination, and scaling automatically. There is no cluster to configure, no OS to patch, and no idle cost when traffic drops to zero.
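
To make "deploy a container image" concrete, here is a minimal sketch of the kind of process that would run inside that image, using only the Python standard library. It assumes the usual Cloud Run container contract of listening on the port given in the `PORT` environment variable (8080 when unset); `HelloHandler`, `resolve_port`, and `make_server` are illustrative names, not part of any Google API.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional


class HelloHandler(BaseHTTPRequestHandler):
    """Answers every GET with a fixed body and keeps no state between requests."""

    def do_GET(self) -> None:
        body = b"hello from a stateless container\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def resolve_port(default: int = 8080) -> int:
    # The platform is assumed to inject PORT; fall back to 8080 locally.
    return int(os.environ.get("PORT", default))


def make_server(port: Optional[int] = None) -> HTTPServer:
    # Bind to all interfaces so traffic from outside the container can reach it.
    chosen = resolve_port() if port is None else port
    return HTTPServer(("0.0.0.0", chosen), HelloHandler)
```

Calling `make_server().serve_forever()` as the container's entrypoint starts the service. A production service would normally use a proper web framework and server; the point here is only how little the platform asks of the process.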

Note

The main trade-off is the cold start: when your service has been idle and a new request arrives, Cloud Run must boot a fresh container instance from scratch. This adds anywhere from a few hundred milliseconds to several seconds on that first request. For latency-sensitive services, set minimum instances to 1 to keep a warm instance ready. Learn more in Cloud Run Scaling Behaviour.

Ideal workloads

  • REST APIs and GraphQL endpoints with variable or unpredictable traffic
  • Webhooks and event handlers triggered by Pub/Sub, Eventarc, or third-party services
  • Public-facing web applications with stateless rendering
  • Internal microservices communicating over HTTP or gRPC
  • Scheduled jobs and batch tasks via Cloud Run Jobs
  • Short-lived processing: image resizing, PDF generation, data transformation

When not to choose Cloud Run

  • Your service writes to local disk and expects that data to persist — Cloud Run instances are ephemeral
  • You need Kubernetes pod features beyond a basic sidecar, such as init containers or automatic service mesh proxy injection: that requires GKE
  • Your workload is a long-running daemon with no HTTP interface
  • You need to serve non-HTTP protocols, such as raw TCP or UDP
  • Request processing exceeds 3600 seconds — Cloud Run has a hard maximum timeout

Concrete examples

  • User authentication API. Stateless, HTTP, low to moderate traffic, variable load. Cloud Run scales to zero overnight and wakes on demand, costing nothing at idle.

  • Image thumbnail service. Triggered by a Pub/Sub message when a file lands in Cloud Storage, resizes the image and writes the result back. Scales to zero between bursts.

  • Webhook receiver. Listens for payment provider events and writes records to Cloud SQL. No persistent local state, variable traffic, minimal deployment overhead.

  • Internal reporting API. Called by an admin dashboard a few hundred times per day. Scales to zero completely to avoid idle VM costs.
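
For the image thumbnail example, the request that reaches the service is a Pub/Sub push delivery: a JSON envelope whose `message.data` field is base64-encoded. The sketch below assumes the notification payload is the JSON form of the Cloud Storage object, which carries `bucket` and `name` fields. The `thumbnails/` prefix and the function names are hypothetical, and the actual resizing step is left out.

```python
import base64
import json
from typing import Tuple

THUMBNAIL_PREFIX = "thumbnails/"  # hypothetical naming convention


def parse_storage_event(push_body: bytes) -> Tuple[str, str]:
    """Extract (bucket, object name) from a Pub/Sub push request body."""
    envelope = json.loads(push_body)
    message = envelope["message"]
    payload = json.loads(base64.b64decode(message["data"]))
    return payload["bucket"], payload["name"]


def should_process(object_name: str) -> bool:
    # Skip objects the service wrote itself, otherwise writing a thumbnail
    # back to the same bucket would trigger another resize.
    return not object_name.startswith(THUMBNAIL_PREFIX)


def thumbnail_name(object_name: str) -> str:
    return THUMBNAIL_PREFIX + object_name
```

The handler keeps nothing between deliveries, which is what lets the service scale to zero between bursts.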

For security details on how Cloud Run handles identity and access, see Cloud Run Security Model. For a full overview of the service, see Cloud Run Overview. For a direct comparison with Compute Engine specifically, see Cloud Run vs Compute Engine.

When to use GKE

GKE is Google’s managed Kubernetes service. Google operates the Kubernetes control plane so you do not have to manage etcd, the API server, or control plane upgrades. But you still manage node pools, cluster version upgrades, networking configuration, RBAC policies, and pod security.

GKE is worth that overhead when you genuinely need what Kubernetes uniquely provides. If the main reason you are considering GKE is that your app runs in a container, that is not sufficient. Cloud Run also runs containers with far less operational burden.

Warning

GKE Standard clusters carry a cluster management fee of roughly $0.10 per cluster-hour (around $74/month), plus always-on node costs. A two-node cluster running a stateless API that Cloud Run handles equally well is paying a significant operational and financial tax for no benefit. Only choose GKE when you have a concrete need for a Kubernetes feature — not just containers.

Ideal workloads

  • Applications that need sidecar containers — a service mesh proxy (Envoy), a log collector (Fluentd), or an init container running setup before the main app starts
  • Stateful workloads requiring PersistentVolumes — self-hosted databases, message queues, and distributed caches where pod restarts must not lose data
  • Workloads using non-HTTP protocols: raw TCP, UDP, or custom binary protocols
  • ML training jobs with GPU node pools and custom node affinity rules
  • Teams already running Kubernetes manifests that would require significant rework to migrate away
  • Architectures using a service mesh (Istio, Cloud Service Mesh) for mTLS, traffic splitting, or detailed observability

When not to choose GKE

  • You are building new stateless HTTP services — Cloud Run is simpler and cheaper
  • Your team has no Kubernetes experience and no plan to develop it
  • You are a small team without a dedicated infrastructure engineer
  • Your main reason for choosing GKE is familiarity with Kubernetes from a previous role

Concrete examples

  • Service mesh microservices. Ten services communicating over mTLS with Istio routing policies. Requires sidecar injection per pod, which only works in Kubernetes.

  • Self-hosted Kafka or Redis. Needs PersistentVolumes so data survives pod restarts. Kubernetes StatefulSets handle this correctly; Cloud Run cannot.

  • ML training pipeline. Requires GPU node pools, custom tolerations, and node affinity rules. GKE gives direct control over node configuration.

  • Migrating a large existing Kubernetes deployment. The team has hundreds of Helm charts and years of Kubernetes manifests. GKE Autopilot reduces node management while preserving the existing deployment model.

If you are choosing between GKE Autopilot and GKE Standard, see GKE Autopilot vs Standard. For a direct head-to-head comparison of GKE and Cloud Run, see GKE vs Cloud Run.

When to use Compute Engine VMs

Compute Engine gives you a full virtual machine. You choose the OS, install packages, configure services, and manage everything yourself. It is the highest-overhead option, but also the most flexible. There is nothing you cannot run on a VM if you are prepared to manage it.

Lift-and-shift means moving an existing on-premises application to the cloud with minimal changes. If an application was designed for a physical server and was never containerised, a VM is often the fastest path to getting it into GCP — even if the long-term goal is to containerise it later.

Tip

Compute Engine is often the right starting point for legacy migrations, not the permanent destination. Get the workload running on a VM first, then evaluate whether containerising it makes sense once it is stable in GCP.

Ideal workloads

  • Legacy applications that depend on specific OS configurations, kernel modules, or native libraries
  • Lift-and-shift migrations from on-premises infrastructure
  • Windows Server workloads that cannot run in a Linux container
  • Long-running daemons with no HTTP interface that run indefinitely
  • Applications managing their own distributed clustering: Cassandra, Elasticsearch, Redis Cluster
  • Workloads requiring specialised hardware: GPU types unsupported by Cloud Run, local NVMe SSDs, or custom networking cards

When not to choose Compute Engine

  • Your workload is a new stateless HTTP API — a VM adds management overhead for no benefit
  • Traffic is variable or unpredictable — a VM runs 24/7 and you pay whether it is busy or idle
  • Your team lacks capacity to manage OS patching, security hardening, and VM lifecycle
  • You want to containerise and modernise — starting on a VM slows that path down

Concrete examples

  • Legacy .NET Framework application. Built for IIS on Windows Server, uses COM components that cannot be containerised. Runs on a Windows Compute Engine VM.

  • On-premises PostgreSQL cluster. Being migrated to GCP with minimal changes before eventual modernisation. Running on Compute Engine VMs first reduces migration risk.

  • Elasticsearch cluster. Manages its own distributed state, requires low-latency local storage, and handles node discovery internally. Runs on VMs with local SSDs.

  • GPU video transcoding worker. Long-running, GPU-intensive, non-HTTP job. Uses a Compute Engine VM with a GPU attached and runs continuously.

For VM types, sizing, and configuration options, see Compute Engine Overview. For cost reduction strategies on VMs, see Compute Engine Cost Optimisation.

Cloud Run vs GKE vs Compute Engine: decision flow

Work through these in order. Stop at the first step that gives a clear answer. Most new workloads end at step 2.

  1. Can the workload be containerised?
    If no — it needs Windows Server, a custom kernel module, specific hardware drivers, or an OS configuration that cannot go in a container — use a Compute Engine VM.

  2. Is it stateless and HTTP or gRPC?
    If yes — it handles each request independently, stores no local state, and communicates over HTTP or gRPC — use Cloud Run. This is the right answer for the majority of new services.

  3. Does it need PersistentVolumes, init containers, or a sidecar-injecting service mesh?
    If yes — it requires Kubernetes-native features that Cloud Run does not provide — use GKE.

  4. Does it need non-HTTP protocols or indefinite background processing?
    If yes — it uses raw TCP, UDP, or needs to run without serving HTTP — use GKE (if containerised) or Compute Engine (if not).

  5. Still not sure?
    Use Cloud Run. Migrate to GKE or Compute Engine only when you hit a specific requirement that Cloud Run cannot meet. You can move a Cloud Run service to GKE using the same container image with minimal code changes.
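
The five steps above can be sketched as a small function. This is an illustration of the ordering, not a tool to run against real workloads: the `Workload` fields are invented names for the questions in the list, and the sketch assumes that a workload needing Kubernetes-native features does not count as a clear "yes" at step 2, even when it is stateless and HTTP-driven.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    containerisable: bool
    stateless: bool
    http_or_grpc: bool
    needs_kubernetes_features: bool = False      # the features named in step 3
    needs_non_http_or_background: bool = False   # raw TCP/UDP or no HTTP interface


def choose_platform(w: Workload) -> str:
    # Step 1: anything that cannot go in a container needs a VM.
    if not w.containerisable:
        return "Compute Engine"
    # Step 2: stateless HTTP/gRPC with no Kubernetes-specific needs (assumption above).
    if w.stateless and w.http_or_grpc and not w.needs_kubernetes_features:
        return "Cloud Run"
    # Step 3: Kubernetes-native features.
    if w.needs_kubernetes_features:
        return "GKE"
    # Step 4: non-HTTP or indefinite background work; step 1 already
    # established that the workload is containerisable, so GKE.
    if w.needs_non_http_or_background:
        return "GKE"
    # Step 5: still not sure, so default to the simplest option.
    return "Cloud Run"
```

For example, `choose_platform(Workload(containerisable=True, stateless=True, http_or_grpc=True))` returns `"Cloud Run"`, while setting `containerisable=False` returns `"Compute Engine"` regardless of the other answers.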

Common beginner mistakes

  1. Choosing GKE just because the app is containerised. Running containers does not mean you need Kubernetes. Cloud Run runs containers with no cluster management, no RBAC configuration, and no node pool sizing. GKE is the right choice when you need Kubernetes features specifically — not just containers.

  2. Putting a simple API on a VM. A VM running one stateless API is on 24/7, requires OS patching, and costs $25–$35/month minimum before storage or networking. The same API on Cloud Run costs a few dollars per month at low traffic and zero at idle. If the workload is stateless and containerisable, a VM adds overhead without benefit.

  3. Choosing based on familiarity, not requirements. Teams comfortable with Kubernetes pick GKE for everything. Teams reluctant to containerise keep everything on VMs. Evaluate each workload against its actual requirements, not your team’s current comfort zone.

  4. Confusing “needs to run all the time” with “must be on a VM”. A service that handles requests continuously does not need a VM. Cloud Run with minimum instances set to 1 is always warm. GKE keeps containers running indefinitely. The difference is about OS control, protocol support, and stateful storage — not uptime.

  5. Treating the decision as permanent. A workload that starts simple enough for Cloud Run may grow to need GKE features. A VM-based workload may become containerisable. Revisit compute platform choices when requirements change rather than accepting early decisions as permanent constraints.

Real-world workload examples

| Workload | Best fit | Why |
| --- | --- | --- |
| Small REST API with variable traffic | Cloud Run | Stateless, HTTP, scales to zero overnight to eliminate idle cost |
| Internal admin tool (low request volume) | Cloud Run | Scales to zero, no idle cost, minimal operational overhead |
| Event-driven image processor | Cloud Run | Triggered by Pub/Sub, short-lived processing, scales with event volume |
| Legacy Windows application | Compute Engine | Requires Windows Server; cannot run in a Linux container |
| Self-hosted message broker (RabbitMQ) | GKE or Compute Engine | Needs persistent storage and long-running daemon behaviour; use GKE if containerised |
| ML model training job | Compute Engine (GPU VM) | Long-running, GPU-intensive, not HTTP-driven; GPU VMs give direct hardware access |
| Long-running background worker | GKE or Compute Engine | Cloud Run services are request-driven; indefinite background daemons need GKE deployments or a VM |
| Microservices with a service mesh | GKE | Automatic Istio sidecar injection works on Kubernetes pods, not on Cloud Run services |

Cost considerations

The three platforms have fundamentally different billing models, so cost comparisons depend heavily on traffic patterns.

  • Cloud Run bills per request: CPU and memory used only while a request is being handled. At zero traffic, cost is zero unless minimum instances is set. For variable or low-traffic services, Cloud Run is almost always cheapest. A low-volume API processing 100,000 requests per day at 100ms average duration typically costs a few dollars per month.

  • GKE bills per node (always on) plus a cluster management fee of roughly $0.10 per cluster-hour (around $74/month). The fee applies to both Standard and Autopilot clusters; on Autopilot you pay for pod resource requests rather than for nodes. A minimal two-node cluster costs more than most Cloud Run workloads at low to moderate traffic. At high sustained load with committed use discounts on nodes, GKE can become cost-competitive.

  • Compute Engine bills per VM-second while the instance is running, regardless of whether it is doing any work. A small e2-medium VM running 24/7 costs roughly $25–$35/month before storage and networking. At very high and very steady traffic, flat VM billing can beat per-request Cloud Run pricing — but the crossover depends on request rate, average duration, memory allocation, and whether you use committed use discounts.
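
The arithmetic behind these comparisons can be sketched in a few lines. Every price and free-tier figure in `CloudRunPrices` is an illustrative assumption rather than a quote, and the model assumes no two requests ever share an instance, so it gives an upper bound for request-based billing. The $30 VM figure is simply the midpoint of the range quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudRunPrices:
    """Illustrative request-based prices and monthly free tier (assumptions)."""
    per_vcpu_second: float = 0.000024
    per_gib_second: float = 0.0000025
    per_million_requests: float = 0.40
    free_vcpu_seconds: float = 180_000
    free_gib_seconds: float = 360_000
    free_requests: float = 2_000_000


VM_MONTHLY_ESTIMATE = 30.0  # midpoint of the $25-$35 range quoted in the text


def cloud_run_monthly_cost(
    requests_per_day: float,
    avg_seconds: float,
    vcpu: float = 1.0,
    memory_gib: float = 0.5,
    days: int = 30,
    prices: CloudRunPrices = CloudRunPrices(),
) -> float:
    requests = requests_per_day * days
    # Upper bound: each request is billed as if it had an instance to itself.
    busy_seconds = requests * avg_seconds
    cpu = max(0.0, busy_seconds * vcpu - prices.free_vcpu_seconds) * prices.per_vcpu_second
    mem = max(0.0, busy_seconds * memory_gib - prices.free_gib_seconds) * prices.per_gib_second
    req = max(0.0, requests - prices.free_requests) / 1_000_000 * prices.per_million_requests
    return cpu + mem + req


def gke_cluster_fee(hours: float = 730.0, per_hour: float = 0.10) -> float:
    # Cluster management fee only; node or pod costs come on top.
    return hours * per_hour


low = cloud_run_monthly_cost(100_000, 0.1)
high = cloud_run_monthly_cost(5_000_000, 0.1)
print(f"Cloud Run, 100k req/day: ${low:.2f}/month")
print(f"Cloud Run, 5M req/day:   ${high:.2f}/month")
print(f"VM estimate:             ${VM_MONTHLY_ESTIMATE:.2f}/month")
print(f"GKE cluster fee alone:   ${gke_cluster_fee():.2f}/month")
```

Under these assumed numbers the 100,000 requests per day example lands a little over three dollars a month, while the 5 million per day case costs well above the VM estimate, which is the crossover the text describes. Whether one small VM could actually serve that load is a separate question that this sketch does not answer.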

Warning

Do not choose a compute platform based on cost estimates alone. Verify with real traffic data and GCP’s pricing calculator. The billing models are different enough that a workload that looks cheaper on Cloud Run at low traffic may look different at sustained high volume — and vice versa.

For detailed cost reduction strategies, see Cloud Run Cost Optimisation and Compute Engine Cost Optimisation.

Quick recommendations

  • Start with Cloud Run when you are building a new stateless HTTP service, your team does not have Kubernetes expertise, your traffic is variable or unpredictable, or you want the fastest path to a production deployment.

  • Choose GKE when you need multi-container pods with sidecars, PersistentVolumes for stateful workloads, a service mesh, GPU node pools, or you are running existing Kubernetes deployments that would be costly to rearchitect.

  • Choose Compute Engine when your workload cannot be containerised, you are doing a lift-and-shift migration, you need Windows Server, you need long-running daemons with no HTTP interface, or you need hardware that neither Cloud Run nor GKE supports.

  • Revisit the decision when a Cloud Run service starts needing stateful storage or sidecars (consider GKE), a VM-based workload gets containerised (consider Cloud Run or GKE), or your traffic pattern changes significantly enough to affect cost.

Frequently asked questions

What is the default choice for a new GCP service?

Cloud Run. It requires the least operational overhead, scales to zero at no cost, and handles most stateless HTTP workloads well. Start with Cloud Run and move to GKE or Compute Engine only when you hit a specific limitation: stateful storage, multi-container sidecars, non-HTTP protocols, or custom OS requirements. Most developers who start with GKE because it feels familiar end up with unnecessary operational complexity.

Is GKE overkill for most apps?

For most new services, yes. GKE makes sense when you genuinely need Kubernetes primitives: multi-container pods with sidecars, PersistentVolumes for stateful workloads, a service mesh like Istio, or GPU node pools for ML training. If the only reason you are considering GKE is that your app runs in a container, that is not a strong enough reason. Cloud Run runs containers too, with a fraction of the operational overhead.

Can Cloud Run replace Compute Engine VMs?

For stateless HTTP services, Cloud Run replaces VMs entirely and with less overhead. But Cloud Run cannot replace VMs when: the workload cannot be containerised (custom kernel, specific drivers, legacy app), you need Windows Server, you need full OS access, you need indefinitely running background daemons with no HTTP interface, or you need hardware that Cloud Run does not support.

When is Compute Engine better than Cloud Run?

Compute Engine is better when you need full OS control, are doing a lift-and-shift migration from on-premises, need Windows Server, need long-running daemons with no HTTP entrypoint, or need specialised hardware such as specific GPU types or local NVMe SSDs. Also consider VMs for workloads with very high and very steady traffic where per-second VM billing becomes cheaper than per-request Cloud Run billing.

Can I start on Cloud Run and migrate to GKE later?

Yes, and this is the recommended approach. Cloud Run uses standard OCI containers. The same container image runs on GKE with a Kubernetes Deployment and Service. You would add Kubernetes manifests and configure an Ingress or load balancer, but your application code stays the same. Starting on Cloud Run reduces early operational burden and gives you a clear path to GKE if you later hit a limitation.
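
As a rough illustration of what "add Kubernetes manifests" involves, the sketch below builds a Deployment and a Service for an existing image as plain Python dictionaries and wraps them in a `List`, which `kubectl apply -f` accepts as JSON. The function name, the default replica count, and the image path placeholder are hypothetical. The `PORT` variable is set explicitly on the assumption that the container reads its port from the environment and nothing on GKE injects it.

```python
import json
from typing import Any, Dict


def k8s_manifests(name: str, image: str, port: int = 8080, replicas: int = 2) -> Dict[str, Any]:
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,  # the same image the Cloud Run service used
                        "ports": [{"containerPort": port}],
                        "env": [{"name": "PORT", "value": str(port)}],
                    }]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": labels,
            "ports": [{"port": 80, "targetPort": port}],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    # Placeholder image path; substitute your own registry location.
    print(json.dumps(k8s_manifests("hello-api", "REGISTRY/PROJECT/hello-api:TAG"), indent=2))
```

The Service here is cluster-internal by default; exposing it externally still needs the Ingress or load balancer mentioned above.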

Last verified: 22 March 2026. Cloud services change frequently. Verify details against official documentation before making infrastructure decisions.