GKE vs Cloud Run: Cost, Complexity, and When to Use Each in GCP
Cloud Run and GKE both run containers in GCP, but they solve different problems. Cloud Run is a fully managed serverless platform where you deploy a container and Google handles everything else. GKE gives you a full Kubernetes cluster with control over scheduling, networking, storage, and every aspect of how your containers run. This page helps you decide which one fits your workload, your team, and your budget.
Simple explanation
Think of Cloud Run as a valet parking service. You hand over your car (container) and someone else parks it, moves it, and brings it back when you need it. You never think about the parking garage.
GKE is like renting your own garage with reserved spaces. You decide which car goes where, how many spaces to keep open, and how to organize traffic in and out. You get full control, but you also maintain the garage.
If you just need your car parked, the valet is simpler and cheaper. If you are running a fleet of vehicles with specific routing and storage requirements, your own garage makes sense.
Stateless services do not store data locally between requests. A REST API that reads from a database and returns JSON is stateless. A Redis cache that holds data in memory across requests is stateful.
Cloud Run excels at stateless services. GKE handles both, plus workloads that need Kubernetes-specific features like persistent volumes, custom scheduling, and cluster-wide networking policies.
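To make the stateless versus stateful distinction concrete, here is a minimal Python sketch. The names (`FakeDatabase`, `stateless_handler`, `StatefulCounter`) are hypothetical and stand in for a real database client and a real service; nothing here is a Cloud Run or GKE API.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDatabase:
    """Hypothetical stand-in for an external store such as Cloud SQL."""
    rows: dict = field(default_factory=dict)

    def increment(self, key: str) -> int:
        self.rows[key] = self.rows.get(key, 0) + 1
        return self.rows[key]


def stateless_handler(db: FakeDatabase, user: str) -> dict:
    """Keeps nothing in process memory, so any instance can serve any request."""
    return {"user": user, "visits": db.increment(user)}


class StatefulCounter:
    """Holds counts in process memory; they are lost when the instance stops."""

    def __init__(self) -> None:
        self.visits: dict = {}

    def handle(self, user: str) -> dict:
        self.visits[user] = self.visits.get(user, 0) + 1
        return {"user": user, "visits": self.visits[user]}


if __name__ == "__main__":
    db = FakeDatabase()
    # Two calls, possibly served by different instances, agree via the shared store.
    print(stateless_handler(db, "ada"))
    print(stateless_handler(db, "ada"))
    # Two instances of the stateful service each keep their own count.
    a, b = StatefulCounter(), StatefulCounter()
    a.handle("ada")
    print(a.handle("ada"), b.handle("ada"))
```

The stateless handler can be scaled out or restarted freely because the count lives in the external store. The stateful counter gives different answers depending on which instance receives the request.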
Fast answer
- Default to Cloud Run for stateless HTTP services, APIs, event-driven processors, and background jobs. It scales to zero, deploys in seconds, and requires no infrastructure management.
- Choose GKE when you need Kubernetes primitives: StatefulSets for databases, DaemonSets for node-level agents, CRDs for platform tooling, service mesh for inter-service security, or GPU scheduling for ML workloads.
- Cost difference is real. Cloud Run bills per request and scales to zero. A GKE cluster runs continuously and costs money even when idle.
- Operational overhead is the biggest gap. Cloud Run needs no cluster management. GKE requires Kubernetes expertise, manifest maintenance, node upgrades, and monitoring at the cluster level.
- Portability is built in. Both run standard OCI containers. Moving from Cloud Run to GKE later means adding Kubernetes manifests, not rewriting your application.
- Start simple. Most teams that start on GKE when Cloud Run would suffice end up paying more and moving slower. Start on Cloud Run and migrate specific services to GKE only when you hit a concrete capability gap.
How Cloud Run works
Cloud Run is a fully managed container platform. You build a container image, push it to Artifact Registry, and deploy it with a single command. Cloud Run provisions infrastructure, configures a load balancer, issues a TLS certificate, and gives you an HTTPS URL. All of that happens automatically.
When requests arrive, Cloud Run starts container instances to handle them. Each instance can process multiple concurrent requests (80 by default, configurable up to 1,000). When traffic drops, instances are removed. When traffic reaches zero, all instances stop and billing stops. With the default request-based billing, you pay only for the CPU and memory allocated to your containers while they are processing requests.
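As a rough way to reason about how many instances a given load implies, the sketch below applies Little's law (requests in flight are approximately arrival rate times average latency) and divides by per-instance concurrency. This is a back-of-the-envelope estimate, not Cloud Run's actual autoscaling algorithm, and `estimate_instances` is a hypothetical helper.

```python
import math


def estimate_instances(requests_per_second: float,
                       avg_latency_seconds: float,
                       concurrency_per_instance: int) -> int:
    """Estimate the instances needed to serve a steady request rate.

    In-flight requests ~= arrival rate * average latency (Little's law).
    Dividing by per-instance concurrency gives a rough instance count.
    """
    if concurrency_per_instance < 1:
        raise ValueError("concurrency_per_instance must be at least 1")
    in_flight = requests_per_second * avg_latency_seconds
    return math.ceil(in_flight / concurrency_per_instance)


if __name__ == "__main__":
    # 400 requests/s at 250 ms each, 50 concurrent requests per instance.
    print(estimate_instances(400, 0.25, 50))
    # No traffic means no instances, which is the scale-to-zero case.
    print(estimate_instances(0, 0.25, 50))
```

Raising per-instance concurrency lowers the instance count for the same traffic, which is why the concurrency setting affects cost as well as latency.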
Cloud Run also supports Jobs: containers that run to completion without an HTTP endpoint. Jobs are triggered by Cloud Scheduler, Pub/Sub events, or Workflows, making them suitable for batch processing, data pipelines, and scheduled tasks.
Deploying a new version takes seconds. Traffic splitting is built in, so you can send 10% of traffic to a new revision while keeping 90% on the previous version, then promote gradually. Rollback is instant because previous revisions are retained.
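The idea behind percentage-based traffic splitting can be sketched in a few lines of Python. This is an illustration of weighted routing in general, not how Cloud Run implements it; `pick_revision` and the revision names are hypothetical.

```python
import hashlib


def pick_revision(request_id: str, split: dict) -> str:
    """Map a request to a revision according to percentage weights.

    `split` maps revision name -> integer percent and must sum to 100.
    Hashing the request ID gives a stable bucket in the range 0..99.
    """
    if sum(split.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    threshold = 0
    for revision, percent in split.items():
        threshold += percent
        if bucket < threshold:
            return revision
    raise AssertionError("unreachable when percentages sum to 100")


if __name__ == "__main__":
    split = {"api-v2": 10, "api-v1": 90}  # canary: 10% to the new revision
    counts = {"api-v2": 0, "api-v1": 0}
    for i in range(10_000):
        counts[pick_revision(f"req-{i}", split)] += 1
    print(counts)  # roughly a 10% / 90% split
```

In this model, promoting a canary means raising its percentage step by step, and a rollback is simply setting the previous revision back to 100.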
Networking uses VPC connectors or Direct VPC egress to reach private resources like Cloud SQL instances, Memorystore, or internal APIs. IAM controls who can invoke the service. Cloud Run integrates with Secret Manager for credentials, Cloud SQL for databases, and Pub/Sub for event-driven architectures.
Cloud Run containers are stateless. They cannot write to persistent local disks. Any writes to the filesystem are ephemeral and lost when the instance stops. There is no concept of a “node” you can access, no way to run node-level agents, and no Kubernetes API. Cloud Run does support sidecar containers alongside your main container, but they share the same ephemeral instance lifecycle. If your workload needs persistent local storage, node-level agents, or Kubernetes-specific orchestration, Cloud Run is not the right fit.
How GKE works
GKE (Google Kubernetes Engine) runs a managed Kubernetes cluster in GCP. Google manages the Kubernetes control plane (the API server, scheduler, and etcd). You manage the workloads that run on the cluster using kubectl commands and YAML manifests.
In GKE, containers run inside Pods, which are scheduled onto Nodes (Compute Engine VMs). You define how containers run using Kubernetes resources:
- Deployments manage stateless workloads with rolling updates and replica counts.
- StatefulSets manage stateful workloads (databases, caches) with stable network identities and persistent storage. Each pod in a StatefulSet gets its own persistent disk that survives restarts.
- DaemonSets run exactly one pod on every node. Used for log collectors, monitoring agents, and security scanners that must run cluster-wide.
- Services and Ingress expose workloads inside and outside the cluster with load balancing and routing rules.
- CRDs (Custom Resource Definitions) extend the Kubernetes API with new resource types. Operators use CRDs to manage databases, message queues, and other complex software as Kubernetes-native resources.
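The "stable network identities and persistent storage" that StatefulSets provide follow a predictable naming scheme: pods are named with an ordinal suffix, each pod gets a DNS name under its headless Service, and each PersistentVolumeClaim is named after its claim template and pod. The sketch below only computes those names. `statefulset_identities` and the example names are hypothetical, and `cluster.local` is assumed as the cluster domain.

```python
def statefulset_identities(name: str, service: str, namespace: str,
                           replicas: int, claim_template: str = "data",
                           cluster_domain: str = "cluster.local") -> list:
    """Return the pod name, DNS name, and PVC name for each replica."""
    identities = []
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"  # pods are numbered from 0
        identities.append({
            "pod": pod,
            "dns": f"{pod}.{service}.{namespace}.svc.{cluster_domain}",
            "pvc": f"{claim_template}-{pod}",
        })
    return identities


if __name__ == "__main__":
    for identity in statefulset_identities("pg", "pg-headless", "db", 3):
        print(identity)
```

Because these names do not change when a pod is rescheduled, a replacement pod reattaches to the same claim and is reachable at the same address.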
GKE Autopilot vs Standard
GKE Autopilot removes node management entirely. Google provisions, scales, and patches nodes automatically. You define pods and Autopilot handles the rest. Billing is per pod resource (CPU and memory) rather than per node. Autopilot is the recommended mode for most teams adopting GKE.
GKE Standard gives you full control over node pools, including machine types, GPU attachments, local SSDs, custom kernel parameters, and node-level DaemonSets. Use Standard when you need hardware-specific configurations or node-level access that Autopilot restricts.
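The difference between per-node and per-pod billing is easiest to see with numbers. The sketch below compares the two models for the same set of pods. All prices and node sizes are made-up placeholders, not GCP list prices, and the node count ignores bin-packing and system overhead.

```python
import math


def standard_hourly_cost(pods: list, node_vcpu: float,
                         node_mem_gib: float, node_price: float) -> float:
    """Per-node billing: you pay for whole nodes whether or not pods fill them."""
    total_cpu = sum(p["vcpu"] for p in pods)
    total_mem = sum(p["mem_gib"] for p in pods)
    needed = max(total_cpu / node_vcpu, total_mem / node_mem_gib)
    nodes = max(1, math.ceil(needed))  # assume at least one node stays up
    return nodes * node_price


def autopilot_hourly_cost(pods: list, vcpu_price: float,
                          gib_price: float) -> float:
    """Per-pod billing: you pay for the resources each pod requests."""
    return sum(p["vcpu"] * vcpu_price + p["mem_gib"] * gib_price for p in pods)


if __name__ == "__main__":
    pods = [{"vcpu": 0.5, "mem_gib": 1.0}] * 3
    # Hypothetical prices: one 4 vCPU / 16 GiB node at 0.20 per hour,
    # versus 0.05 per vCPU-hour and 0.005 per GiB-hour for pods.
    print(standard_hourly_cost(pods, 4, 16, 0.20))
    print(autopilot_hourly_cost(pods, 0.05, 0.005))
```

With a few small pods, per-node billing pays for mostly idle capacity. As pods fill the nodes more completely, the two models converge, and which one is cheaper depends on real prices and utilization.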
GKE’s real cost is operational. Your team must learn Kubernetes concepts, write and maintain YAML manifests, plan node pool upgrades, configure RBAC (role-based access control), tune resource requests and limits to avoid wasting capacity, and monitor cluster health separately from application health. For a small team running a handful of stateless services, this overhead is rarely justified.
Side-by-side comparison
| Dimension | Cloud Run | GKE |
|---|---|---|
| Ops overhead | None (fully managed) | High (Standard) or moderate (Autopilot): manifests, upgrades, RBAC, monitoring |
| Scaling model | Automatic, per-request, scales to zero | Horizontal Pod Autoscaler + Cluster Autoscaler; minimum 1 node (Standard) or 0 pods (Autopilot) |
| Stateful workloads | No (containers are ephemeral) | Yes (StatefulSets with PersistentVolumeClaims) |
| Networking control | VPC connector or Direct VPC egress; managed load balancer | Full VPC integration, network policies, custom Ingress controllers, service mesh |
| Deployment model | Container image + one gcloud command | Kubernetes YAML manifests + kubectl apply |
| Security responsibility | Google manages infrastructure; you manage IAM, container image, and application code | Shared: you manage RBAC, network policies, pod security, node patches (Standard), plus IAM and application code |
| Cost model | Pay per request (CPU + memory while processing) | Pay per node running time (Standard) or per pod resource (Autopilot) |
| Cold-start tradeoff | Cold starts on scale-from-zero (mitigated with min instances) | No request-time cold starts for running pods; new pods added during scale-up still take time to start |
| Portability | Standard OCI containers; portable to any container platform | Kubernetes-native; portable to any Kubernetes cluster (EKS, AKS, on-prem) |
| Team skill requirement | Docker basics + GCP IAM | Kubernetes concepts + YAML + kubectl + cluster operations |
| GPU support | Yes (NVIDIA L4 GPUs in supported regions) | Yes (wide GPU selection including A100, H100, T4, L4) |
| Setup time to first deploy | Minutes | 10 to 20 minutes for cluster creation, hours for production-grade config |
When Cloud Run is the right choice
Cloud Run is the right default for most container workloads that do not require Kubernetes-specific features. Choose it when:
- Your services are stateless. REST APIs, GraphQL endpoints, web applications, and webhook handlers that do not store data locally are ideal Cloud Run workloads.
- Traffic is variable or bursty. Cloud Run scales to zero when idle and handles traffic spikes automatically. You pay nothing during quiet periods.
- You want zero infrastructure management. No node upgrades, no cluster monitoring, no capacity planning. Google handles all of it.
- Your team is small or does not have Kubernetes expertise. Cloud Run’s learning curve is a fraction of GKE’s. A developer who knows Docker can deploy to Cloud Run in minutes.
- You need fast iteration. New versions deploy in seconds. Traffic splitting, canary deploys, and instant rollback are built in.
- You are running dev, staging, or preview environments. These environments are often idle. Scale-to-zero means zero cost when no one is using them.
- You are processing events. Cloud Run services and jobs integrate with Pub/Sub, Eventarc, and Cloud Scheduler for event-driven and batch workloads.
For a deeper look at Cloud Run’s capabilities and limits, see the Cloud Run overview.
When GKE is the right choice
GKE earns its complexity when you need capabilities that Cloud Run does not provide. Choose it when:
- You need stateful workloads as containers. Running Redis, PostgreSQL, Elasticsearch, or Kafka as containers with persistent disks requires StatefulSets and PersistentVolumeClaims. These are Kubernetes primitives that do not exist in Cloud Run.
- You need node-level agents. DaemonSets run a pod on every node for log collection (Fluentd/Fluent Bit), monitoring agents, or security scanners. Cloud Run has no concept of nodes.
- You use Kubernetes operators or CRDs. If your platform depends on operators to manage databases (e.g., CloudNativePG), message queues, or custom infrastructure, you need the Kubernetes API.
- You need a service mesh. A service mesh is a network layer that sits between your services, encrypting all traffic and giving you control over which services can talk to each other. It provides inter-service mTLS, fine-grained traffic routing, and distributed tracing at the network level. This requires a Kubernetes cluster.
- You need GPU workloads with specific hardware. While Cloud Run supports NVIDIA L4 GPUs, GKE gives you access to a wider range of GPU types (A100, H100, T4) with fine-grained scheduling using node affinity (rules that pin pods to specific node types) and taints/tolerations (rules that reserve nodes for specific workloads).
- You run many services and want a unified platform. At 10+ services, a Kubernetes cluster with consistent deployment patterns, shared Ingress, centralized RBAC, and unified monitoring can reduce total operational overhead compared to managing each service individually.
- You need fine-grained network policies. Kubernetes network policies control pod-to-pod communication at the network level. Cloud Run’s networking is simpler but less granular.
- You need workload identity with Kubernetes-native RBAC. GKE’s integration of Kubernetes service accounts with Google Cloud IAM gives you precise per-pod identity management.
When neither is the right choice
Not every workload belongs on a container platform. Consider these alternatives:
- Managed databases. If your only reason for GKE is running PostgreSQL or MySQL, use Cloud SQL instead. Managed databases handle replication, backups, patching, and failover without a Kubernetes cluster.
- Simple event handlers. A single function triggered by a Cloud Storage upload or Pub/Sub message is a better fit for Cloud Functions than either Cloud Run or GKE. Less configuration, no container to build.
- Legacy applications that cannot be containerized. Monolithic apps that depend on specific OS configurations, persistent local filesystems, or long-lived background processes may run better on Compute Engine VMs. Containerize them when ready, not before.
- Batch analytics and data pipelines. BigQuery, Dataflow, and Dataproc are purpose-built for large-scale data processing. Running Spark on GKE is possible but adds unnecessary operational burden when managed alternatives exist.
For a full comparison of all GCP compute options, see Choosing Between Cloud Run, GKE, and Compute Engine.
How to choose: practical decision framework
Work through these questions in order. Stop at the first “yes.”
1. Does your workload need persistent local storage? If yes: GKE (StatefulSets) or a managed database (Cloud SQL, Firestore, Memorystore).
2. Do you need Kubernetes-specific primitives? DaemonSets, CRDs, operators, network policies, service mesh, or node-level access? If yes: GKE.
3. Do you need GPU types beyond NVIDIA L4? A100, H100, or custom GPU scheduling? If yes: GKE.
4. Are you running 10+ services and want a unified Kubernetes-native platform? If yes: GKE (Autopilot is likely sufficient).
5. Is your workload stateless, event-driven, or request-driven? If yes: Cloud Run.
6. Is your traffic bursty or low-volume? If yes: Cloud Run (scale-to-zero saves money).
7. Is your team small or new to Kubernetes? If yes: Cloud Run (avoid the learning curve until you have a concrete reason for GKE).
If you answered “no” to questions 1 through 4 and “yes” to any of 5 through 7, Cloud Run is the right starting point. You can always move individual services to GKE later when a specific requirement demands it.
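The questions above can be written down as a small function that checks them in order and stops at the first match. This is only a sketch of the framework as stated here. The `Workload` fields and the fallback message for workloads that match none of the questions are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    needs_persistent_local_storage: bool = False
    needs_kubernetes_primitives: bool = False
    needs_gpu_beyond_l4: bool = False
    unified_platform_for_10_plus_services: bool = False
    stateless_or_event_driven: bool = False
    bursty_or_low_volume: bool = False
    team_small_or_new_to_kubernetes: bool = False


def recommend(w: Workload) -> str:
    """Walk the questions in order and return at the first 'yes'."""
    if w.needs_persistent_local_storage:
        return "GKE (StatefulSets) or a managed database"
    if w.needs_kubernetes_primitives:
        return "GKE"
    if w.needs_gpu_beyond_l4:
        return "GKE"
    if w.unified_platform_for_10_plus_services:
        return "GKE (Autopilot is likely sufficient)"
    if (w.stateless_or_event_driven or w.bursty_or_low_volume
            or w.team_small_or_new_to_kubernetes):
        return "Cloud Run"
    # Not covered by the framework; this fallback is an assumption.
    return "No clear match; revisit the requirements"


if __name__ == "__main__":
    print(recommend(Workload(stateless_or_event_driven=True)))
    print(recommend(Workload(needs_kubernetes_primitives=True,
                             bursty_or_low_volume=True)))
```

Order matters: a workload that is bursty but also needs DaemonSets lands on GKE, because the capability questions come before the cost and team questions.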
If you only have time for two questions: (1) Does this workload need to store data locally or use Kubernetes-specific features? (2) Does my team have Kubernetes experience and bandwidth to maintain a cluster? If the answer to both is “no,” use Cloud Run.
For a broader comparison that includes Compute Engine VMs in the decision, see Choosing Between Cloud Run, GKE, and VMs.
Real-world examples
REST API for a SaaS product
Best fit: Cloud Run. A stateless API that reads from and writes to Cloud SQL. Traffic varies throughout the day, and the development team pushes multiple deploys per day. Cloud Run handles scaling, TLS, and zero-downtime deploys automatically. The team spends zero time on infrastructure.
Background worker processing Pub/Sub messages
Best fit: Cloud Run. Messages arrive via Pub/Sub push subscription. Each message is processed independently. Cloud Run scales the number of instances based on message volume and scales to zero when the queue is empty. For continuous pull-based consumers that must run 24/7, a GKE Deployment is a better fit.
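For the push-subscription case, the worker receives each message as an HTTP POST whose JSON body wraps the payload in a `message` object with base64-encoded `data`. The sketch below decodes that envelope using only the standard library. `parse_push_envelope` is a hypothetical helper, and the HTTP framework that would call it is left out.

```python
import base64
import json


def parse_push_envelope(body: bytes) -> dict:
    """Decode a Pub/Sub push request body into its ID, attributes, and payload."""
    envelope = json.loads(body)
    message = envelope.get("message") if isinstance(envelope, dict) else None
    if not isinstance(message, dict):
        raise ValueError("not a Pub/Sub push envelope: missing 'message'")
    # The payload arrives base64-encoded in the 'data' field.
    payload = base64.b64decode(message.get("data", "")).decode("utf-8")
    return {
        "message_id": message.get("messageId"),
        "attributes": message.get("attributes", {}),
        "payload": payload,
    }


if __name__ == "__main__":
    body = json.dumps({
        "message": {
            "data": base64.b64encode(b"order-created").decode("ascii"),
            "messageId": "1",
            "attributes": {"source": "checkout"},
        },
        "subscription": "projects/example-project/subscriptions/example-sub",
    }).encode("utf-8")
    print(parse_push_envelope(body))
```

Because each message is handled independently and nothing is kept in memory between calls, any instance can process any message, which is what lets the service scale with message volume.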
Internal platform with 15 microservices
Best fit: GKE Autopilot. A platform team managing 15 services benefits from a unified Kubernetes deployment model with shared Ingress, centralized RBAC, consistent manifests, and a single monitoring stack. Autopilot removes node management while keeping the full Kubernetes API.
AI inference service with GPU requirements
Best fit: depends on the GPU. If NVIDIA L4 GPUs meet your model’s requirements, Cloud Run with GPU support is simpler to operate. If you need A100s, H100s, or need to schedule multiple GPU types across a shared cluster, GKE with GPU node pools gives you the hardware selection and scheduling control.
Self-managed PostgreSQL cluster
Best fit: GKE (but consider Cloud SQL first). Running PostgreSQL with replication and persistent storage requires StatefulSets and PersistentVolumeClaims. GKE supports this. However, unless you have specific reasons to self-manage (custom extensions, compliance requirements, version pinning), Cloud SQL is less work and more reliable.
Migration path: Cloud Run first, GKE later
Starting on Cloud Run does not lock you in. Both platforms run standard OCI containers, so the same container image works on either platform without changes to application code.
What changes when you move to GKE
- Add Kubernetes manifests. You write a Deployment, Service, and optionally an Ingress or Gateway resource. These are YAML files that describe how your container runs on the cluster.
- Update your CI/CD pipeline. Replace gcloud run deploy with kubectl apply or a GitOps tool like Argo CD. If you use Cloud Build, update the build step.
- Configure networking. Cloud Run’s automatic HTTPS endpoint is replaced by a Kubernetes Service and Ingress with a managed certificate.
- Set up monitoring. Add GKE-specific monitoring for cluster health, node utilization, and pod resource consumption alongside your existing application-level monitoring.
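To give a sense of what "adding Kubernetes manifests" involves, the sketch below builds a minimal Deployment and Service for an existing container image and prints them as JSON, which kubectl apply accepts as well as YAML. `build_manifests`, the image path, and the environment variable are hypothetical. A production manifest would also set resource requests, probes, and an Ingress or Gateway.

```python
import json


def build_manifests(name: str, image: str, port: int,
                    replicas: int, env: dict) -> list:
    """Build a minimal Deployment and Service for one container image."""
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": name,
                    "image": image,
                    "ports": [{"containerPort": port}],
                    "env": [{"name": k, "value": v}
                            for k, v in sorted(env.items())],
                }]},
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"selector": labels,
                 "ports": [{"port": 80, "targetPort": port}]},
    }
    return [deployment, service]


if __name__ == "__main__":
    items = build_manifests(
        name="api",
        image="us-docker.pkg.dev/example-project/example-repo/api:1.0",
        port=8080, replicas=2, env={"LOG_LEVEL": "info"})
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items},
                     indent=2))
```

The image reference is the same one you would pass to Cloud Run. Everything else in the output is the deployment wrapper that Cloud Run previously handled for you.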
What stays the same
- Your container image and Dockerfile
- Your application code and dependencies
- Your environment variable configuration (mapped to Kubernetes ConfigMaps or Secrets)
- Your Cloud SQL, Pub/Sub, and Secret Manager integrations (authentication may shift to Workload Identity)
Migrate a service from Cloud Run to GKE when you hit a specific, concrete limitation, not because you think you might need Kubernetes someday. Good triggers: needing StatefulSets, needing DaemonSets, needing CRDs for an operator, needing a service mesh, or consolidating 10+ services onto a shared platform. Bad triggers: “we might need it later” or “Kubernetes looks more professional.”
Common mistakes
- Adopting GKE for a simple stateless API. The wrong assumption is that GKE is “more production-grade” than Cloud Run. In reality, Cloud Run is fully production-ready and eliminates cluster management entirely. A stateless API on GKE costs more and moves slower with no operational benefit.
- Running databases on Cloud Run. The wrong assumption is that any container can run on Cloud Run. Cloud Run containers are ephemeral, so local disk writes are lost when the instance stops. Databases need persistent storage. Use Cloud SQL for managed databases or GKE StatefulSets for self-managed ones.
- Choosing GKE Standard when Autopilot would suffice. The wrong assumption is that Standard gives you “more control.” Autopilot supports the full Kubernetes API for Deployments, Services, Jobs, and most CRDs. The only reasons for Standard are custom node configurations, DaemonSets requiring node-level access, or specific machine types that Autopilot does not support.
- Keeping dev and staging environments on GKE. The wrong assumption is that dev should mirror production exactly. A GKE cluster running 24/7 for development wastes money on idle nodes. Deploy dev and staging services on Cloud Run where scale-to-zero means zero idle cost. The container image is the same; only the deployment target differs.
- Assuming Cloud Run cannot handle complex workloads. The wrong assumption is that Cloud Run is only for “simple” services. Cloud Run supports custom domains, VPC networking, traffic splitting, concurrent request handling, jobs, GPU workloads, and integrations with most GCP services. The actual limitation is statelessness, not complexity.
- Ignoring the operational cost of Kubernetes. The wrong assumption is that GKE’s only cost is the compute bill. The real cost includes learning Kubernetes, writing and maintaining manifests, planning node upgrades, configuring RBAC, tuning resource requests, and debugging cluster-level issues. For small teams, this overhead dwarfs the compute cost difference.
Frequently asked questions
Can I move a Cloud Run service to GKE without rewriting my application?
Yes. Cloud Run uses standard OCI container images. The same image runs on GKE inside a Kubernetes Deployment. You add a deployment.yaml manifest and configure a Service or Ingress for routing. The application code stays the same. Only the deployment wrapper changes. This portability is one reason starting on Cloud Run is low-risk.
Is Cloud Run production-ready for large-scale applications?
Yes. Cloud Run scales horizontally to serve high request volumes, within per-service instance limits and regional quotas. It supports custom domains, automatic TLS, traffic splitting for canary deploys, VPC networking, IAM-based auth, and integrations with Cloud SQL, Secret Manager, and Pub/Sub. The main constraint is that containers must be stateless. If your workload needs persistent local storage or Kubernetes-specific primitives like StatefulSets, GKE is the better fit.
Does GKE always cost more than Cloud Run?
For low-traffic or bursty workloads, yes. GKE charges a cluster management fee, and Standard clusters add always-on node costs, while Cloud Run scales to zero. For sustained high-traffic workloads, a right-sized GKE cluster with committed use discounts can be cheaper per request than Cloud Run. GKE Autopilot narrows the gap by billing only for the resources your pods request rather than for whole nodes.
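One way to find where the crossover sits for your own workload is to compute a break-even request volume: the monthly traffic at which usage-based billing equals a fixed monthly cluster cost. The sketch below does that arithmetic. Every price in it is a placeholder rather than a GCP list price, and it ignores free tiers, minimum instances, and discounts.

```python
def cloud_run_cost_per_request(avg_seconds: float, vcpu: float,
                               mem_gib: float, vcpu_second_price: float,
                               gib_second_price: float,
                               price_per_million_requests: float,
                               concurrency: int = 1) -> float:
    """Approximate cost of one request under usage-based billing.

    Assumes instances are fully utilized, so instance time per request
    is the average duration divided by per-instance concurrency.
    """
    billed_seconds = avg_seconds / concurrency
    compute = billed_seconds * (vcpu * vcpu_second_price
                                + mem_gib * gib_second_price)
    return compute + price_per_million_requests / 1_000_000


def break_even_requests(fixed_monthly_cost: float,
                        cost_per_request: float) -> float:
    """Monthly requests at which usage-based cost equals the fixed cost."""
    return fixed_monthly_cost / cost_per_request


if __name__ == "__main__":
    # Hypothetical prices and sizes, for illustration only.
    per_request = cloud_run_cost_per_request(
        avg_seconds=0.1, vcpu=1, mem_gib=0.5,
        vcpu_second_price=0.00002, gib_second_price=0.000002,
        price_per_million_requests=0.40)
    print(per_request)
    print(break_even_requests(100.0, per_request))
```

Below the break-even volume the usage-based model is cheaper, and above it the fixed-cost cluster wins on compute alone. Operational cost is not part of this calculation.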
Should I use GKE Standard or GKE Autopilot?
Start with Autopilot unless you need custom node configurations, specific machine types, or DaemonSets that require node-level access. Autopilot removes node management, patches nodes automatically, and bills per pod resource rather than per node. It keeps full Kubernetes API compatibility for Deployments, Services, Jobs, and most CRDs.
Can Cloud Run handle background jobs, not just HTTP requests?
Yes. Cloud Run Jobs run containers to completion without needing an HTTP endpoint. They are triggered by Cloud Scheduler, Pub/Sub via Eventarc, or Workflows. Cloud Run services can also process Pub/Sub messages as push subscriptions. For long-running always-on workers that do not respond to events, GKE Deployments are a better fit.