GKE Node Pools Explained: What They Are, When to Use Them, and How They Work

A GKE node pool is a group of virtual machines inside a GKE Standard cluster that all share the same configuration. Multiple node pools let you run GPU workloads, batch jobs on Spot VMs, and general web services in a single cluster, each on the hardware it actually needs. This page explains how node pools work, when to create more than one, and how taints, tolerations, and autoscaling fit together.

What is a GKE node pool?

In GKE, worker nodes are Compute Engine virtual machines running inside your GCP project. A node pool is a named group of those VMs that all share the same configuration: machine type, disk size, operating system image, Kubernetes node labels, and taints. When GKE adds a node to a pool, the new node is identical to every other node already in that pool.

Node pools are a GKE Standard concept. If you are using GKE Autopilot, you do not create or configure node pools. Google manages all the node infrastructure behind the scenes and you never interact with node pools directly.

When you create a Standard cluster, GKE automatically creates a default node pool for you. You can add more pools at any time without recreating the cluster.

Standard mode only

Node pools are a GKE Standard feature. In GKE Autopilot, Google manages the node infrastructure entirely and does not expose node pools. Every reference to “node pools” in the GKE documentation assumes Standard mode.

Why you can’t change machine type in place

You cannot change the machine type of an existing node within a pool. A pool is a template and all its nodes are instances of that template. To switch to a different machine type, create a new pool with the new type, migrate workloads across, then delete the old pool.

Simple explanation for beginners

If you are new to Kubernetes, the relationship between clusters, nodes, and pools can feel abstract. Here is a concrete way to think about it.

Analogy: the factory floor

Think of a GKE cluster as a factory. The control plane is the management office — it decides what gets made, where, and when. You never touch it on GKE; Google runs it for you.

The worker nodes are the machines on the factory floor where actual work happens. Your containerised applications run here.

A node pool is a section of the factory floor where every machine is the same model. One section has heavy-duty machines for demanding tasks (GPU nodes for ML). Another has cheaper standard machines for everyday tasks (general-purpose nodes). Another has machines that shut down when idle to save cost (Spot nodes).

All sections are part of the same factory, but each section has equipment matched to the work it handles.

How GKE node pools fit into a cluster

Understanding where node pools sit in the GKE hierarchy makes everything else easier:

Layer	What it is	Who manages it on GKE
Cluster	The entire Kubernetes environment	You create it; GKE manages the control plane
Control plane	API server, scheduler, etcd	Google (fully managed)
Node pool	A named group of identically configured VMs	You (in Standard mode)
Node	A single VM that runs pods	GKE provisions within your pool config
Pod	One or more containers sharing a network	You (via Deployments, Jobs, etc.)

A single cluster can have many node pools. Pools are independent in terms of machine type and scaling settings, but pods across all pools share the same Kubernetes namespaces and cluster network. A pod in the GPU pool can communicate with a pod in the standard pool without special configuration, following the same rules described in GKE networking.

Because all nodes in a pool are identical, the Kubernetes scheduler treats them as interchangeable. If you need nodes with different capabilities, you need different pools.

Why node pools exist

Node pools solve a real problem: different workloads have very different hardware needs, and putting them all on identical machines is wasteful, expensive, or technically impossible.

Cost control. Running expensive GPU nodes 24/7 when your ML training job only runs overnight wastes money. A dedicated GPU pool can autoscale to zero when not in use, so you only pay while it is active.

Workload isolation. Keeping production-critical services on their own pool means a resource-hungry batch job cannot starve them of CPU or memory.

Hardware compatibility. Some workloads cannot run on certain node types at all. Windows containers require Windows nodes. GPU workloads require nodes with attached accelerators. These cannot be mixed with standard Linux nodes in the same pool.

Independent autoscaling. Each pool has its own autoscaling settings. You can allow a batch-processing pool to scale between 0 and 50 nodes while keeping the production API pool stable at 3 to 10.

Spot VM savings. Spot VMs cost significantly less than on-demand VMs but can be reclaimed by Google with 30 seconds' notice. Placing fault-tolerant batch workloads on a Spot pool captures the savings without exposing latency-sensitive services to that availability risk.

Memory-optimised workloads. In-memory caches, analytics engines, and some databases need far more RAM than a general-purpose VM provides. A high-memory pool gives those workloads what they need without forcing you to provision expensive nodes everywhere.

Quick rule

If two workloads have meaningfully different hardware needs, they belong in different pools. If they have similar needs, one pool is simpler and cheaper to manage.

When to use multiple node pools

A single node pool is fine for simple applications where all workloads have similar resource needs. You need multiple pools when workloads diverge significantly.

GPU workloads. ML training and inference jobs need GPU nodes. GPUs are expensive, so you do not want non-GPU workloads accidentally landing on them. A separate GPU pool with taints ensures only GPU-requesting pods are scheduled on those nodes.

High-memory workloads. In-memory caches, analytics engines, or stateful databases may need 64 GB or more of RAM per pod. A high-memory pool handles this without over-provisioning everywhere.

Batch jobs on Spot nodes. Nightly ETL pipelines, image-processing jobs, and large-scale data tasks that can tolerate interruption are ideal for a Spot pool. Keep your always-on services on standard on-demand nodes in the same cluster.

Windows containers. Linux and Windows containers cannot share the same node. If your application has Windows-based components, they must live in a Windows Server node pool alongside your Linux pools.

Separating production from cheaper workloads. Running staging workloads in the same cluster as production is common for cost reasons. Separate pools with taints make it easy to direct staging to cheaper nodes while guaranteeing production pods are never displaced.

Independent scaling requirements. If your video transcoding service needs to burst from 0 to 40 nodes overnight, you do not want that affecting the predictable scaling of your web tier. Separate pools scale independently.

Decision checklist

Does this workload need hardware that my current pool cannot provide? Add a pool.
Do I need to protect production from noisy-neighbour batch jobs? Add a pool.
Do I need Windows containers? Add a Windows pool.
Do I want to capture Spot VM savings on fault-tolerant work? Add a Spot pool.
None of the above? One pool is probably enough.

GKE node pools vs GKE Autopilot

Node pools are a Standard mode concept. In Autopilot, you do not create, configure, or manage them. If someone refers to a “node pool” in the context of GKE, they are talking about a Standard cluster.

	GKE Standard	GKE Autopilot
Node pools	Yes, user-managed	No (Google manages nodes)
Machine type control	Full control per pool	Not user-configurable
Billing	Per provisioned node VM	Per pod resource request
Cluster autoscaler	You configure per pool	Automatic, no config needed
Taints on pools	Fully configurable	Not user-configurable
DaemonSets	Fully supported	Supported, with restrictions on privileged and host-level access
GPU nodes	Full control	Supported, Google provisions
Recommended when	You need hardware control	You want simplicity

Which should you use?

Start with Autopilot for standard web services, APIs, and microservices. Switch to Standard (and use node pools) when you need GPU nodes with custom configurations, host-level DaemonSets, fine-grained taint-based isolation, or Spot VM pools for batch workloads.

See GKE Autopilot vs Standard Mode for a full comparison.

Node pools vs labels vs taints vs node selectors

These four concepts are related but distinct. Beginners often confuse them.

Concept	Where it lives	What it does
Node pool	GKE infrastructure	Groups VMs with identical machine type and config
Node label	Kubernetes node metadata	Tags a node with a key-value attribute (informational only)
Taint	Kubernetes node	Actively repels pods that do not explicitly tolerate it
Toleration	Kubernetes pod spec	Permits a pod to be scheduled on a tainted node
nodeSelector / Node Affinity	Kubernetes pod spec	Steers a pod toward nodes with specific labels

A node pool is an infrastructure-level group. When you create a pool with --node-labels and --node-taints, GKE applies those labels and taints to every node in the pool automatically.

A node label on its own does not restrict scheduling. Any pod can still land on a labelled node unless a taint is also present.

A taint actively blocks pods without a matching toleration. The NoSchedule effect prevents new pods landing on that node. NoExecute also evicts existing pods that lack a matching toleration.

A toleration in a pod spec says “this pod is allowed on nodes with that taint.” It is permissive, not directive. The pod may go there but is not forced there.
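As a rough illustration of how tolerations relate to the two taint effects, the pod-spec fragment below tolerates a NoSchedule taint indefinitely and a NoExecute taint for a limited time. The gpu=true taint matches the examples on this page; the maintenance=true taint and the 300-second value are hypothetical, chosen only for illustration. The tolerationSeconds field is a standard Kubernetes field that applies to NoExecute tolerations.

```yaml
# Fragment of a pod spec (belongs under spec: in a Pod or pod template).
tolerations:
  # Permits scheduling onto nodes tainted gpu=true:NoSchedule.
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  # Hypothetical taint maintenance=true:NoExecute. The pod may keep running
  # for 300 seconds after the taint appears, then it is evicted.
  - key: "maintenance"
    operator: "Equal"
    value: "true"
    effect: "NoExecute"
    tolerationSeconds: 300
```

Neither entry pulls the pod toward a tainted node. They only remove the block.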

A nodeSelector or Node Affinity rule in a pod spec says “schedule me on nodes with this label.” Used alongside a toleration, this both permits the pod on the tainted node and steers it toward that node.

Common mistake: labels without taints

Adding a workload=gpu label to your GPU node pool but no taint means any pod in the cluster can land on those expensive GPU nodes. Always pair a specialised node pool with a NoSchedule taint to protect it from general workloads.

Practical example. A GPU node pool has a taint gpu=true:NoSchedule and label workload=gpu. Your ML training pod needs a toleration for the taint (so it is permitted on the GPU node) and a nodeSelector: workload: gpu (so it is actively placed there). Without both, the pod might end up on a standard node with no GPU.

See Kubernetes Pods Explained for more on how pod scheduling works.

How node pool autoscaling works

The GKE cluster autoscaler monitors pod scheduling and adjusts node counts in each pool automatically.

Scaling up happens when a pod cannot be scheduled because no node in the pool has enough CPU or memory. The autoscaler detects the unschedulable pod and adds a new node to the pool (up to --max-nodes). Once the node is ready, the pending pod is scheduled onto it.

Scaling down happens when a node has been underutilised for ten minutes and all its pods can safely move to other nodes. GKE cordons the node (marks it unschedulable), drains it (gracefully evicts pods, respecting PodDisruptionBudgets), then removes the node (down to --min-nodes).
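Because the drain step respects PodDisruptionBudgets, it can help to define one for any service that must stay available while nodes are removed. A minimal sketch, assuming a Deployment whose pods carry the hypothetical label app: web-api and run at least three replicas:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-api-pdb          # hypothetical name
spec:
  # Voluntary evictions (such as a scale-down drain) are refused
  # if they would leave fewer than 2 matching pods available.
  minAvailable: 2
  selector:
    matchLabels:
      app: web-api           # hypothetical label; must match the pod labels
```

A budget that can never be satisfied (for example minAvailable equal to the replica count) can prevent a node from being drained, so leave room for at least one eviction.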

Scale to zero is possible by setting --min-nodes=0. The pool runs no nodes at all when idle. When a pod needs that pool, the autoscaler provisions a new node.

Cold-start delay

Provisioning a node from zero takes one to three minutes. For batch jobs this is usually fine. For latency-sensitive workloads, keep at least one node warm with --min-nodes=1 to avoid that delay hitting your users.

Cluster autoscaler vs HPA. These two work at different layers and complement each other. The Horizontal Pod Autoscaler (HPA) scales the number of pod replicas based on CPU, memory, or custom metrics. When HPA adds more replicas than existing nodes can accommodate, the cluster autoscaler adds nodes. When load drops, HPA reduces replicas, nodes become underutilised, and the cluster autoscaler removes them. They are not alternatives to each other.
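For context, the pod-level half of that pairing could look like the sketch below. It assumes a Deployment named web-api (a hypothetical name) whose containers set CPU requests, and the replica bounds and 70% target are illustrative values rather than recommendations.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-api-hpa          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api            # hypothetical Deployment to scale
  minReplicas: 3
  maxReplicas: 30
  metrics:
    # Add replicas when average CPU use exceeds 70% of the pods' CPU requests.
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

When the extra replicas no longer fit on existing nodes they become pending, which is the signal the cluster autoscaler reacts to.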

Update autoscaling settings on an existing pool:

gcloud container node-pools update batch-pool \
  --cluster=my-cluster \
  --zone=us-central1-a \
  --enable-autoscaling \
  --min-nodes=0 \
  --max-nodes=20

How to create a node pool in GKE

You must have a GKE Standard cluster running before adding node pools.

Standard general-purpose pool:

gcloud container node-pools create standard-pool \
  --cluster=my-cluster \
  --zone=us-central1-a \
  --machine-type=e2-standard-4 \
  --num-nodes=3 \
  --min-nodes=2 \
  --max-nodes=10 \
  --enable-autoscaling \
  --disk-size=100 \
  --disk-type=pd-ssd

GPU pool for ML workloads:

gcloud container node-pools create gpu-pool \
  --cluster=my-cluster \
  --zone=us-central1-a \
  --machine-type=n1-standard-4 \
  --accelerator=type=nvidia-tesla-t4,count=1 \
  --num-nodes=1 \
  --min-nodes=0 \
  --max-nodes=4 \
  --enable-autoscaling \
  --node-labels=workload=gpu \
  --node-taints=gpu=true:NoSchedule

Spot node pool for batch jobs:

gcloud container node-pools create batch-spot-pool \
  --cluster=my-cluster \
  --zone=us-central1-a \
  --machine-type=e2-standard-8 \
  --spot \
  --num-nodes=0 \
  --min-nodes=0 \
  --max-nodes=50 \
  --enable-autoscaling \
  --node-labels=workload=batch \
  --node-taints=spot=true:NoSchedule

Spot VMs can be interrupted

Spot nodes can be reclaimed by Google with 30 seconds' notice. Never run databases, in-memory caches, or any workload that cannot survive an abrupt shutdown on a Spot pool. Reserve Spot pools for fault-tolerant, restartable batch work only.

Key flags explained:

--machine-type — The Compute Engine machine type for every node in the pool.
--num-nodes — Initial node count when the pool is created.
--min-nodes and --max-nodes — The autoscaling range. Set --min-nodes=0 to allow scale to zero.
--enable-autoscaling — Activates the cluster autoscaler for this pool.
--node-labels — Kubernetes labels applied to every node. Used by nodeSelector in pod specs.
--node-taints — Taints applied to every node. Pods must have a matching toleration to be scheduled here.
--spot — Provisions Spot VMs, which cost significantly less but can be reclaimed with 30 seconds notice.
--accelerator — Attaches GPU accelerators. Requires the NVIDIA GPU driver DaemonSet to be deployed separately.

List all pools in a cluster:

gcloud container node-pools list \
  --cluster=my-cluster \
  --zone=us-central1-a

Delete a pool:

gcloud container node-pools delete gpu-pool \
  --cluster=my-cluster \
  --zone=us-central1-a

Before you delete

Deleting a pool drains and removes all its nodes. Pods on those nodes are evicted and Kubernetes attempts to reschedule them on other pools. If no suitable capacity exists, those pods enter Pending state. Always confirm replacement capacity before deleting a pool that contains running workloads.

Directing pods to the right node pool

Adding taints and labels to a node pool is only half the job. You also need to configure your pod spec to match.

A pod that should run on the GPU pool needs both a toleration (permitted on the tainted node) and a nodeSelector (actively steered toward the right pool):

apiVersion: v1
kind: Pod
metadata:
  name: ml-training-job
spec:
  tolerations:
    - key: "gpu"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  nodeSelector:
    workload: gpu
  containers:
    - name: trainer
      image: my-registry/ml-trainer:latest
      resources:
        requests:
          cpu: "2"
          memory: "8Gi"
          nvidia.com/gpu: "1"
        limits:
          nvidia.com/gpu: "1"

The tolerations field permits the pod on the tainted GPU node. The nodeSelector actively steers it there. The nvidia.com/gpu resource request ensures the scheduler only considers nodes that have a GPU device available.
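An alternative sketch uses Node Affinity instead of nodeSelector and targets the pool by name. It relies on the cloud.google.com/gke-nodepool label that GKE sets on every node, and assumes the pool is called gpu-pool as in the creation example. The pod name is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ml-training-job-affinity   # hypothetical name
spec:
  tolerations:
    - key: "gpu"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      # Hard requirement: only nodes in the named pool are candidates.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cloud.google.com/gke-nodepool
                operator: In
                values:
                  - gpu-pool
  containers:
    - name: trainer
      image: my-registry/ml-trainer:latest
      resources:
        limits:
          nvidia.com/gpu: "1"
```

Selecting by pool name ties the manifest to one pool, whereas a custom label such as workload=gpu can be shared by several pools. Which is preferable depends on how often pools are replaced.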

Analogy: VIP pass vs reserved seating

A toleration is like a VIP pass — it gets you through the door of the GPU section. But it does not assign you a specific seat. A nodeSelector is like a reserved seat ticket — it places you at a specific spot. You need both: the pass to enter, and the ticket to sit in the right place.

See Deploying Containers with kubectl and Deployments in Kubernetes for how to structure workloads for production use. When managing secrets that workloads in specific pools might need, see Managing Secrets in Kubernetes.

Common real-world node pool patterns

Most production GKE Standard clusters end up with two to four pools. Here are the most common patterns:

Default general-purpose pool

Machine type: e2-standard-4 or e2-standard-8
Purpose: web servers, APIs, background workers, standard microservices
Autoscaling: yes, minimum of 2 nodes so the cluster is never empty
Watch out for: keep this pool lean and affordable; most workloads fit here

GPU pool

Machine type: n1-standard-4 with NVIDIA T4 or A100
Purpose: ML training, inference serving, video processing
Autoscaling: scale to zero when idle, max 4 to 8 nodes
Watch out for: always add a NoSchedule taint; without it, general pods land on expensive GPU nodes

Spot batch pool

Machine type: e2-standard-8 or n2-standard-8 with --spot
Purpose: data pipelines, nightly ETL, image processing, large-scale jobs
Autoscaling: scale to zero when idle, high maximum (20 to 100 nodes)
Watch out for: workloads must handle interruption cleanly and must never be stateful
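A batch workload aimed at such a pool might be sketched as the Job below. It assumes the spot=true:NoSchedule taint and workload=batch label from the Spot pool creation example. The Job name, image, and resource figures are hypothetical.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-etl                # hypothetical name
spec:
  backoffLimit: 6                  # retries allowed before the Job is marked failed
  template:
    spec:
      restartPolicy: OnFailure     # failed containers are restarted
      tolerations:
        # Permits the pod on the tainted Spot nodes.
        - key: "spot"
          operator: "Equal"
          value: "true"
          effect: "NoSchedule"
      nodeSelector:
        workload: batch            # steers the pod toward the Spot pool
      containers:
        - name: etl
          image: my-registry/etl-runner:1.0   # hypothetical image
          resources:
            requests:
              cpu: "2"
              memory: "4Gi"
```

The job itself still has to be safe to rerun from the start or from a checkpoint, since a reclaimed node ends the pod wherever it happens to be.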

High-memory pool

Machine type: n2-highmem-8 or n2-highmem-16
Purpose: in-memory databases, analytics engines, workloads needing large RAM per pod
Autoscaling: limited range to control costs
Watch out for: confirm persistent volumes are configured correctly before this pool scales down

Windows pool

Machine type: any machine type compatible with the Windows Server node image
Purpose: Windows containers, legacy .NET workloads requiring Windows runtime
Watch out for: Linux and Windows pods cannot share nodes; a Linux pool must exist alongside it for any Linux workloads
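A pod targeting a Windows pool might look like the sketch below. The kubernetes.io/os node label is a standard Kubernetes label. The toleration assumes Windows nodes carry a node.kubernetes.io/os=windows:NoSchedule taint, which is how GKE documentation describes them, so verify it against your own cluster. The pod name and image are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: legacy-dotnet-app          # hypothetical name
spec:
  nodeSelector:
    kubernetes.io/os: windows      # standard label; selects Windows nodes only
  tolerations:
    # Assumed taint on Windows nodes; harmless if the taint is absent.
    - key: "node.kubernetes.io/os"
      operator: "Equal"
      value: "windows"
      effect: "NoSchedule"
  containers:
    - name: app
      image: my-registry/legacy-dotnet-app:1.0   # hypothetical Windows container image
```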

Node pool upgrades

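When a new Kubernetes version becomes available, the control plane and the node pools are upgraded separately. GKE performs surge upgrades by default: it adds an extra (surge) node running the new version, cordons and drains a node on the old version so its pods are rescheduled, then deletes the old node, repeating until every node in the pool has been replaced. Pods on a drained node are evicted and restarted elsewhere, so run multiple replicas of any service that must stay available throughout.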

Trigger a manual node pool upgrade:

gcloud container clusters upgrade my-cluster \
  --node-pool=standard-pool \
  --zone=us-central1-a

Control plane and node pools upgrade separately

Upgrading the control plane does not automatically upgrade your node pools. Node pools can run a limited number of minor versions behind the control plane, within the Kubernetes version skew policy, but must eventually catch up. Always upgrade the control plane first, then upgrade each node pool in turn.

See Upgrading GKE Clusters Safely for the full upgrade playbook, including PodDisruptionBudgets and blue-green strategies.

Single pool vs multiple pools

Not every cluster needs multiple pools. Here is when to keep things simple and when the extra complexity is worth it:

Situation	Recommendation
Simple web app or API with uniform workloads	One pool is enough
Need GPU or specialised accelerated hardware	Add a dedicated GPU pool
Want Spot VMs for batch without risking production	Add a separate Spot pool
Need Windows containers	Add a Windows pool
Workloads need to scale independently	Add separate pools
Compliance requires workload isolation at the node level	Add dedicated pools with taints
Using GKE Autopilot	No pools — Google manages infrastructure

The extra complexity of multiple pools pays off when the hardware or isolation requirement is real. Do not add pools for organisational tidiness — labels and namespaces handle logical separation without additional infrastructure. For more on how GKE compares to other container options, see GKE vs Cloud Run.

Monitoring and logging for node pools

Node pool health feeds directly into cluster observability. Nodes under pressure (high CPU, memory pressure, disk pressure) show status conditions visible via kubectl describe node. GKE also exposes these signals in Cloud Monitoring.

See Monitoring GKE Clusters for how to set up alerts on node pool utilisation, and Logging in Kubernetes for surfacing application logs from pods across pools.

For security-specific concerns around node pool configuration — including workload identity, network policies, and node hardening — see Securing GKE Clusters and Private GKE Clusters.

Common beginner mistakes

Thinking node pools exist in Autopilot. Node pools are a GKE Standard concept. Attempting to create a node pool on an Autopilot cluster returns an error. If you need node pool control, you need a Standard cluster.
Using one large pool for everything. Putting all workloads — web services, ML training, batch processing — onto a single pool means either paying for GPUs everywhere or not having them where they are needed. Match hardware to workload with purpose-built pools.
Adding labels without taints on expensive nodes. A node label tells the scheduler a node has a characteristic, but any pod can still land there. Label GPU nodes without a NoSchedule taint and general pods will land on them, wasting expensive GPU resources.
Scaling to zero without considering cold-start delay. A pool at zero nodes takes one to three minutes to provision a new node when a pod arrives. For latency-sensitive workloads this is unacceptable. Use --min-nodes=1 to keep at least one node warm.
Running stateful or critical workloads on Spot nodes. Spot VMs can be interrupted with 30 seconds' notice. Databases, in-memory caches, or any workload that cannot tolerate abrupt termination must not run on Spot pools.
Assuming tolerations force scheduling. A toleration permits a pod on a tainted node — it does not guarantee the pod goes there. You need both a toleration and a nodeSelector or Node Affinity rule to ensure the pod lands on the right pool.
Deleting a node pool without ensuring replacement capacity. If pods running on the deleted pool cannot be rescheduled elsewhere, they remain in Pending state. Before deleting a pool, verify that other pools have sufficient capacity or create replacement capacity first.
Confusing node pool upgrades with cluster upgrades. Upgrading the control plane does not automatically upgrade node pools. They must be upgraded separately. See Upgrading GKE Clusters Safely for the correct sequence.

Frequently asked questions

What is the difference between a GKE cluster and a node pool?

A cluster is the entire Kubernetes environment, including a control plane (API server, scheduler, etcd) and one or more groups of worker nodes. A node pool is one of those groups — a set of VMs inside the cluster that all share the same machine type, disk size, OS, and Kubernetes labels. A cluster can have many node pools, but a node pool always belongs to exactly one cluster.

Can I use node pools in GKE Autopilot?

No. GKE Autopilot does not expose user-managed node pools. Google manages the underlying node infrastructure entirely, provisioning and scaling nodes automatically based on your pod resource requests. Node pools are a GKE Standard concept. If you need manual node pool control, you need a Standard cluster.

Do node pools improve security or just organisation?

Both. Beyond organisation, node pools let you isolate sensitive workloads onto separate nodes with dedicated service accounts, restrict access via taints, and apply tighter firewall rules at the node level. Using a dedicated node pool for a sensitive service prevents it from sharing infrastructure with less-trusted workloads.

When should I create a separate node pool?

Create a new pool when a workload needs hardware the current pool cannot provide (GPU, high memory), when you want Spot VMs for batch jobs without risking your production workloads, when you need Windows containers, or when you want to scale a specific class of workload independently without affecting the rest of the cluster.

What happens when I delete a node pool?

Before GKE deletes a node pool, it cordons and drains each node, marking the node unschedulable and evicting running pods gracefully. Kubernetes then attempts to reschedule those pods onto nodes in other pools. If no suitable node exists elsewhere, the pods enter Pending state until capacity becomes available. Always verify replacement capacity exists before deleting a pool that contains running workloads.

Can a node pool scale to zero?

Yes. If you enable autoscaling and set --min-nodes=0, GKE can scale the pool to zero when no pods need it. When a pod that can only run on that pool becomes pending, the cluster autoscaler provisions a new node automatically. Be aware that provisioning a node from zero takes one to three minutes, which causes a cold-start delay for the first pod in that batch.

Last verified: 23 March 2026. Cloud services change frequently. Verify details against official documentation before making infrastructure decisions.