Kubernetes Deployments Explained: Rolling Updates, Rollbacks, and Scaling
A Kubernetes Deployment is the standard way to run a containerised application reliably. It manages a group of identical Pods, keeps the desired number of replicas running, rolls out new versions without downtime, and makes rollbacks a single command. If you are deploying anything stateless on GKE, you are almost certainly using a Deployment.
What is a Kubernetes Deployment?
A Deployment is a Kubernetes object that declares how many copies of your application should be running and what those copies should look like. You write down your desired state (three replicas of this container image, with these resource limits and this health check) and the Deployment controller makes it happen and keeps it that way.
Deployments give you four capabilities that raw Pods do not have:
- Self-healing: if a Pod crashes or is evicted, the Deployment controller replaces it automatically.
- Horizontal scaling: change the replica count and Kubernetes brings the cluster into line.
- Rolling updates: push a new image version and Kubernetes replaces old Pods incrementally, keeping the application available throughout.
- Rollback: if an update causes problems, one command reverts to the previous known-good version.
Deployments are designed for stateless workloads, meaning applications where every Pod is interchangeable and holds no unique data or identity. Web servers, REST APIs, and background workers are all good fits.
On GKE Autopilot, you do not need to manage nodes at all. You define the Deployment and GKE handles the underlying infrastructure, scaling, and node provisioning automatically. Autopilot is a great starting point if you are new to Kubernetes.
Simple explanation
A Deployment is a set of instructions that tells Kubernetes: “I want three copies of this container running at all times. Here is what each copy should look like. Keep them healthy, and when I want to update them, do it gradually so nothing goes down.”
Kubernetes then acts as the enforcer of those instructions, constantly checking, correcting, and maintaining that state.
Think of a Deployment as a staffing contract with a temp agency. The contract says “always have three developers on this project”. If one leaves, the agency immediately sends a replacement. If you want to upgrade to senior developers, the agency swaps them in one at a time so the project never stalls. If the new developers turn out to be wrong for the role, you call the agency and get the old team back instantly.
Why use a Deployment instead of a raw Pod?
When you create a Pod directly, it is tied to a single node. If that node fails, the Pod disappears and nothing replaces it. There is no update mechanism, no scaling, and no rollback.
A Deployment solves all of these problems automatically. It is the correct primitive for every stateless workload on GKE, whether you are running on Autopilot or Standard mode.
In production, you should almost never create a naked Pod. If the node it runs on is drained, upgraded, or fails, the Pod is gone permanently. The only valid use case for a bare Pod is a short-lived debugging container you plan to delete immediately.
How a Deployment works: Deployment, ReplicaSet, Pod
Understanding the three-layer hierarchy makes everything else click.
| Object | What it does | Do you create it directly? |
|---|---|---|
| Pod | Runs one or more containers. Has an IP address, but that IP is ephemeral. | Rarely (use a Deployment instead) |
| ReplicaSet | Ensures a fixed number of identical Pod replicas are running. Uses a label selector to find the Pods it owns. | Almost never (the Deployment manages this) |
| Deployment | Manages one or more ReplicaSets. Handles rolling updates by creating a new ReplicaSet and scaling the old one down. Retains old ReplicaSets for rollback. | Yes (your day-to-day interface) |
How reconciliation works, step by step:
- You apply a Deployment manifest declaring 3 replicas of `my-app:v1`.
- The Deployment controller creates a ReplicaSet matching those 3 replicas.
- The ReplicaSet creates 3 Pods using the Pod template.
- Kubernetes schedules the Pods onto nodes and starts the containers.
- If a Pod crashes, the ReplicaSet detects the shortfall and creates a replacement.
- When you change the image to `v2`, the Deployment creates a new ReplicaSet for `v2` and gradually scales the `v1` ReplicaSet to zero, keeping it around for rollback.
Label selectors are how each layer finds the objects it owns. Every key-value pair in the Deployment’s `spec.selector.matchLabels` must also appear in the `metadata.labels` of the Pod template (the template may carry extra labels, such as a version label). Mismatching these is one of the most common sources of confusing errors for beginners: with `apps/v1`, the API server rejects the Deployment on apply because the selector does not match the template labels.
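As a minimal sketch of that relationship, the fragment below shows a selector and a Pod template that agree on the `app` label. The names mirror the `my-app` example used in this article, and the mistyped value mentioned in the comment is a hypothetical illustration.

```yaml
# Fragment of a Deployment spec (not a complete manifest).
spec:
  selector:
    matchLabels:
      app: my-app        # the label the Deployment uses to find its Pods
  template:
    metadata:
      labels:
        app: my-app      # same key and value as the selector
        version: v1      # extra labels on the template are allowed
        # If the app label above read "myapp" instead, apps/v1 validation
        # would reject the manifest: the selector would not match the template.
```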
Deployment example YAML
The recommended way to manage Deployments is with a YAML manifest stored in version control. Here is a production-aware example with annotations:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: production
  labels:
    app: my-app
    version: v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: my-app
        version: v1
    spec:
      containers:
        - name: my-app
          image: us-docker.pkg.dev/my-project/my-repo/my-app:v1.4.2
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
      terminationGracePeriodSeconds: 30
```

Key fields explained:
- `spec.replicas`: how many Pod copies you want running at once.
- `spec.selector.matchLabels`: how the Deployment finds its Pods. Every label listed here must also appear in `spec.template.metadata.labels`.
- `maxSurge: 1`: one extra Pod is allowed above the desired count during an update. With 3 replicas, up to 4 Pods can exist temporarily.
- `maxUnavailable: 0`: no Pod is removed until its replacement is ready. This is a prerequisite for a zero-downtime rollout.
- `resources.requests`: required for the scheduler to make good placement decisions and for Horizontal Pod Autoscaling to work.
- `readinessProbe`: Kubernetes only sends traffic to a Pod once this probe passes. Without it, traffic can reach a Pod before the application has finished starting.
- `livenessProbe`: Kubernetes restarts the container if this probe fails repeatedly, catching deadlocks and hangs.
- `terminationGracePeriodSeconds`: time allowed for in-flight requests to complete before a Pod is forcibly stopped.
- `image: ...my-app:v1.4.2`: a specific version tag that you treat as immutable. Never use `latest` in production.
Apply the manifest:
```bash
kubectl apply -f deployment.yaml
```

Store your container images in Artifact Registry rather than Docker Hub for better security, lower latency within GCP, and tighter IAM integration. The image path format shown above (us-docker.pkg.dev/…) is the standard Artifact Registry format.
How to create and inspect a Deployment
After applying, use these commands to confirm everything is healthy:
```bash
# Apply the manifest
kubectl apply -f deployment.yaml

# Check Deployment status
kubectl get deployments

# Detailed view including events and rollout history
kubectl describe deployment my-app

# Watch Pods as they come up (filtered by label selector)
kubectl get pods -l app=my-app --watch

# Monitor a rollout in progress
kubectl rollout status deployment/my-app
```

What each command tells you:
| Command | What to look for |
|---|---|
| `kubectl get deployments` | READY shows 3/3 when all replicas are healthy |
| `kubectl describe deployment my-app` | Events section reveals scheduling failures, image pull errors, and probe failures |
| `kubectl get pods -l app=my-app` | All Pods should show Running and 1/1 ready |
| `kubectl rollout status` | Confirms a rollout completed and exits non-zero if it failed or timed out |
kubectl rollout status blocks until the rollout finishes and returns a non-zero exit code on failure. This makes it ideal for gating CI/CD pipelines. Run it immediately after kubectl apply in your deploy script so a bad rollout fails the pipeline rather than silently drifting.
See deploying containers with kubectl for a deeper walkthrough of the full deployment workflow.
How rolling updates work
When you release a new version, you update the image tag in your manifest and re-apply.
Option 1: Declarative (recommended)
```bash
# Edit deployment.yaml to change the image tag, then:
kubectl apply -f deployment.yaml
```

Option 2: Imperative

```bash
kubectl set image deployment/my-app my-app=us-docker.pkg.dev/my-project/my-repo/my-app:v2.0.0
```

What happens with `maxSurge: 1` and `maxUnavailable: 0` on a 3-replica Deployment:
1. A new ReplicaSet is created for `v2`.
2. One new `v2` Pod starts. The cluster now has 4 Pods (3 old, 1 new).
3. Kubernetes waits for the `v2` Pod’s readiness probe to pass.
4. Once ready, one `v1` Pod is terminated. Back to 3 Pods (2 old, 1 new).
5. Steps 2 through 4 repeat until all 3 Pods run `v2`.
6. The `v1` ReplicaSet is scaled to zero but kept for rollback.
What maxSurge and maxUnavailable actually control:
| Setting | Effect |
|---|---|
| `maxSurge: 1, maxUnavailable: 0` | Zero-downtime. One extra Pod created before any old Pod is removed. Slightly slower. |
| `maxSurge: 0, maxUnavailable: 1` | Faster rollout. One old Pod removed before replacement is ready. Brief capacity drop. |
| `maxSurge: 2, maxUnavailable: 0` | Two new Pods start simultaneously. Faster on large Deployments. |
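Both fields also accept percentages of the desired replica count, and Kubernetes defaults each to 25% when no values are given. The fragment below is a sketch of that form. The comments work the rounding through for the 3-replica example used in this article (`maxSurge` rounds up, `maxUnavailable` rounds down).

```yaml
# Fragment of a Deployment spec (not a complete manifest).
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # 25% of 3 = 0.75, rounded up to 1 extra Pod
      maxUnavailable: 25%  # 25% of 3 = 0.75, rounded down to 0 Pods unavailable
```

With 3 replicas these percentages happen to behave like `maxSurge: 1` and `maxUnavailable: 0`. On a larger Deployment the same 25% lets some Pods be unavailable (2 out of 8, for example), so set `maxUnavailable: 0` explicitly if you need that guarantee.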
Zero-downtime during updates is not automatic. It requires all three of the following: maxUnavailable: 0 in your strategy, a readiness probe that correctly gates traffic, and a terminationGracePeriodSeconds long enough for your app to finish in-flight requests. Missing any one of these will cause dropped requests during rollouts.
How rollbacks work
If a new version introduces a problem, roll back immediately:
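```bash
kubectl rollout undo deployment/my-app
```

Kubernetes scales the previous ReplicaSet back up and scales the current one down to zero, using the same rolling update strategy as a normal rollout. Because the old ReplicaSet still exists with its Pod template intact, nothing needs to be rebuilt or re-applied. New Pods are created from the old template, and nodes that still have the old image cached can skip the image pull.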
Check rollout history before rolling back:
```bash
kubectl rollout history deployment/my-app
```

This shows revision numbers. To roll back to a specific revision:

```bash
kubectl rollout undo deployment/my-app --to-revision=2
```

How many revisions are retained:
By default, Kubernetes keeps the last 10 ReplicaSets. Control this with spec.revisionHistoryLimit:
```yaml
spec:
  revisionHistoryLimit: 5  # Keep last 5 revisions for rollback
```

Two things that will silently break your rollback ability: setting `revisionHistoryLimit: 0` (Kubernetes immediately deletes old ReplicaSets, making rollback impossible), and using the `latest` image tag (the previous ReplicaSet points to the same tag as the current one, so the “rollback” runs the exact same broken image). Always use immutable tags and keep revision history enabled.
Scaling a Deployment
To manually adjust the number of running replicas:
```bash
kubectl scale deployment my-app --replicas=5
```

To scale back down:

```bash
kubectl scale deployment my-app --replicas=2
```

Kubernetes terminates Pods gracefully, honouring `terminationGracePeriodSeconds` so in-flight requests can complete.
Prefer updating the replicas field in your YAML and running kubectl apply. Imperative scale commands leave a gap between what your manifest declares and what is actually running, which causes drift in GitOps workflows.
For automatic scaling based on CPU or memory metrics, see Horizontal Pod Autoscaling.
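As a rough sketch of what that looks like, the manifest below defines a HorizontalPodAutoscaler for the `my-app` Deployment from this article. The replica bounds and the 70% CPU target are illustrative assumptions, not recommendations, and the sketch assumes the cluster exposes resource metrics (GKE does this by default).

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
  namespace: production
spec:
  scaleTargetRef:            # the Deployment whose replica count is managed
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 3             # assumed lower bound
  maxReplicas: 10            # assumed upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the Pods' CPU request (assumed target)
```

Utilization is measured against `resources.requests`, which is why the Deployment must set them. When an autoscaler owns the replica count, it is common to leave `spec.replicas` out of the Deployment manifest so that `kubectl apply` does not reset it.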
Pausing and resuming rollouts
If you need to make several changes and apply them as a single rollout rather than triggering separate rollouts for each change, pause the Deployment first:
```bash
kubectl rollout pause deployment/my-app
```

Make your changes (update the image, adjust resource limits, change environment variables) then resume:

```bash
kubectl rollout resume deployment/my-app
```

All queued changes are applied together as one rolling update.
When to use a Deployment
Use a Deployment for any stateless workload where every replica is interchangeable:
- Web applications — serving HTTP or HTTPS traffic, where any Pod can handle any request.
- REST and gRPC APIs — horizontally scalable backends that do not store state locally.
- Internal microservices — services called by other services within the cluster.
- Background workers — job processors, queue consumers, and event handlers that do not need a stable network identity.
- Stateless batch processors — short-lived processing containers that do not require persistent storage.
When to use something else:
| Workload type | Better choice | Why |
|---|---|---|
| Database (Postgres, MySQL) | StatefulSet | Needs stable identity, stable hostname, persistent volume attached to the same Pod |
| Distributed data store (Cassandra, Kafka) | StatefulSet | Nodes need unique IDs and stable network addresses |
| Node-level daemon (log collector, monitoring agent) | DaemonSet | Must run exactly once on every node |
| One-off batch task | Job | Should run to completion and stop |
| Scheduled task (nightly report) | CronJob | Should run on a schedule and stop |
If your workload needs to write to disk and re-read that data on restart, it likely belongs in a StatefulSet rather than a Deployment.
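For contrast, here is a minimal StatefulSet sketch showing the two fields a Deployment does not have: `serviceName` and `volumeClaimTemplates`. Everything named here (`my-db`, the image path, the mount path, the 10Gi size) is a hypothetical placeholder, and the headless Service referenced by `serviceName` is assumed to be defined separately.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-db                # hypothetical name
spec:
  serviceName: my-db         # headless Service (assumed to exist) that gives each Pod a stable DNS name
  replicas: 3
  selector:
    matchLabels:
      app: my-db
  template:
    metadata:
      labels:
        app: my-db
    spec:
      containers:
        - name: my-db
          image: us-docker.pkg.dev/my-project/my-repo/my-db:v1.0.0   # hypothetical image
          volumeMounts:
            - name: data
              mountPath: /var/lib/data                               # hypothetical path
  volumeClaimTemplates:      # one PersistentVolumeClaim is created per Pod
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi    # assumed size
```

Pods from this sketch would be named `my-db-0`, `my-db-1`, and `my-db-2`, and each keeps its own claim across restarts.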
Deployment vs Pod vs StatefulSet
If every running copy of your app is identical and interchangeable, use a Deployment. If each copy needs its own name, its own disk, or its own stable address, use a StatefulSet. If something needs to run on every single node in the cluster, use a DaemonSet.
| | Pod | Deployment | StatefulSet |
|---|---|---|---|
| Purpose | Run a single container instance | Manage multiple identical stateless replicas | Manage stateful replicas with stable identity |
| Self-healing | No (if it dies, it is gone) | Yes (replaced automatically) | Yes (replaced with the same identity) |
| Scaling | Manual only | Easy, change replicas | Supported but ordered |
| Rolling updates | Not supported | Yes, with configurable strategy | Yes, but ordered (Pod 0 before Pod 1) |
| Stable network identity | Ephemeral IP | Ephemeral, use a Service | Stable hostname per Pod (e.g. pod-0) |
| Persistent storage | None by default | Shared volumes only | Dedicated PersistentVolumeClaim per Pod |
| Typical use | Debugging, one-offs | Web apps, APIs, workers | Databases, queues, stateful services |
For most workloads you will encounter on GKE, a Deployment is the right choice. StatefulSets are for when identity and storage matter at the individual Pod level.
Common beginner mistakes
- Creating naked Pods instead of a Deployment. Bare Pods are not rescheduled if the node they run on fails or is drained. Always use a Deployment for any workload you want to keep running. See Kubernetes Pods Explained to understand why.
- Mismatching labels and selectors. Every label in `spec.selector.matchLabels` must also appear in `spec.template.metadata.labels`. With `apps/v1`, a mismatch is rejected with a validation error on apply.
- Omitting a readiness probe. Without one, Kubernetes sends traffic to a Pod as soon as the container starts, which may be before your application has finished initialising. This causes user-facing errors during rolling updates that are difficult to diagnose.
- Omitting resource requests. Without `resources.requests`, the scheduler cannot make sensible placement decisions, the Horizontal Pod Autoscaler cannot function, and Pods are vulnerable to eviction under memory pressure.
- Using `latest` as the image tag. The `latest` tag makes rollbacks unreliable because the previous ReplicaSet points to the same tag as the current one. Use an immutable tag such as a Git commit SHA or a semantic version. See Artifact Registry for image management.
- Editing a ReplicaSet directly. The Deployment controller will immediately reconcile the ReplicaSet back to the desired state, overwriting your changes. Always make changes at the Deployment level.
- Assuming a Deployment exposes traffic automatically. A Deployment manages Pods but does not create network access to them. You need a Service to give the Deployment a stable IP or DNS name, and an Ingress to expose it outside the cluster.
- Not monitoring rollouts in CI/CD pipelines. Running `kubectl apply` and moving on does not tell you whether the rollout succeeded. Always add `kubectl rollout status` as a pipeline step. It blocks until the rollout completes and exits with a non-zero code on failure.
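The Service mentioned in the list above can be sketched as follows. It assumes the `my-app` Deployment from this article (Pods labelled `app: my-app`, listening on port 8080, in the `production` namespace). The Service port of 80 is an illustrative choice, and the Ingress is not shown.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: production
spec:
  type: ClusterIP          # reachable inside the cluster only
  selector:
    app: my-app            # routes to Pods carrying this label
  ports:
    - name: http
      port: 80             # port the Service listens on (assumed)
      targetPort: 8080     # containerPort of the Pods
```

The Service finds Pods by label, just as the Deployment does. It only forwards traffic to Pods that pass their readiness probe.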
Best practices
- Use immutable image tags. Tag images with a Git commit SHA or a semantic version. Never use `latest` in production. Store images in Artifact Registry.
- Define resource requests and limits. Required for autoscaling, scheduler efficiency, and protection against noisy-neighbour evictions.
- Add readiness and liveness probes. Readiness gates traffic. Liveness catches deadlocks. Both belong in every production Deployment.
- Manage manifests in version control. Declarative YAML in a Git repository gives you an audit trail, enables GitOps workflows, and makes rollbacks predictable. Use Helm for templating across environments.
- Prefer declarative updates. Use `kubectl apply -f` rather than `kubectl set image`. The former keeps your repository in sync with what is running.
- Monitor rollouts actively. Add `kubectl rollout status` to your CI/CD pipeline and alert on rollout failure. See monitoring GKE clusters for a full observability setup.
- Review logs during deploys. New Pods starting up is a good moment for application errors to surface. Have Kubernetes logging in place before you deploy to production.
- Use Services and Ingress together. A Deployment without a Service is unreachable. A Service without an Ingress is only reachable inside the cluster. Wire these three together as a unit.
- Secure your workload identity. On GKE, use Workload Identity to give Pods scoped access to GCP services rather than mounting service account keys.
- Keep `revisionHistoryLimit` between 3 and 10. Enough to roll back across a few releases without accumulating stale ReplicaSets.
Summary
- A Deployment manages a ReplicaSet, which manages Pods, ensuring the desired number of healthy replicas are always running.
- Deployments provide self-healing, horizontal scaling, rolling updates, and rollback, capabilities that raw Pods do not have.
- Define Deployments in YAML and apply with `kubectl apply -f` for a repeatable, version-controlled workflow.
- Rolling updates replace Pods incrementally. `maxSurge` and `maxUnavailable` control pace and availability. Zero-downtime requires `maxUnavailable: 0` plus a working readiness probe.
- Roll back instantly with `kubectl rollout undo deployment/my-app`. Kubernetes retains old ReplicaSets for exactly this purpose.
- Use a Deployment for stateless workloads. Use StatefulSets for databases and other workloads that need stable identity or persistent per-Pod storage.
- Always set resource requests, readiness probes, and immutable image tags. Omitting any of these causes hard-to-diagnose production problems.
Frequently asked questions
What does a Kubernetes Deployment do?
A Deployment declares the desired state for your application, for example: keep three replicas of this container image running. The Deployment controller continuously monitors the cluster and reconciles reality with that declaration by restarting crashed Pods, replacing evicted Pods, rolling out new image versions incrementally, and enabling one-command rollbacks if something goes wrong.
What is the difference between a Deployment and a Pod?
A Pod is a single running instance of your application, made up of one or more containers. If a bare Pod is deleted or its node fails, nothing replaces it. A Deployment manages a group of identical Pods (replicas) and keeps that group healthy automatically. It also handles rolling updates and rollbacks. In practice you should almost always use a Deployment rather than creating Pods directly.
When should I use a Deployment vs a StatefulSet?
Use a Deployment for stateless workloads such as web apps, REST APIs, and background workers, where any Pod replica is interchangeable. Use a StatefulSet when each Pod needs a stable identity, a stable hostname, or a persistent volume that stays attached to the same instance across restarts. Data systems such as PostgreSQL, Cassandra, and Kafka are typical StatefulSet candidates.
Does a Deployment guarantee zero downtime during updates?
Not automatically. Zero-downtime rollouts require three things: setting `maxUnavailable: 0` in the rolling update strategy, adding a correctly configured readiness probe so Kubernetes only routes traffic to Pods that are genuinely ready (tune `initialDelaySeconds` to your application’s startup time), and setting `terminationGracePeriodSeconds` long enough for old Pods to finish in-flight requests before they stop. With all three in place, a rollout should not interrupt users.
How do I roll back a failed Deployment?
Run `kubectl rollout undo deployment/my-app`. Kubernetes scales the previous ReplicaSet back up, which it kept for exactly this purpose. To roll back to a specific revision, run `kubectl rollout undo deployment/my-app --to-revision=2` after checking `kubectl rollout history deployment/my-app`. The rollback is fast because nothing needs to be rebuilt or re-applied: the old ReplicaSet and its Pod template already exist.