Kubernetes Roadmap: Learn Kubernetes for a Cloud Career

Kubernetes has become one of the most sought-after skills in cloud infrastructure roles, but the learning curve is steep and the ecosystem is vast. This roadmap breaks Kubernetes down into five stages, giving you a clear path from understanding what a pod is to running production-grade clusters — and it’s honest about which stage actually moves the needle on your job applications.

Why Kubernetes matters for your career

Kubernetes has won the container orchestration market. Whether a company runs on GKE, EKS, or AKS, the underlying technology is Kubernetes. That means demand for engineers who can work with it is high and consistent across employers. It also tends to show up in pay: engineers who can demonstrate real Kubernetes competency generally command higher compensation than those who cannot.

The important caveat is that “Kubernetes knowledge” is not binary. There is a wide spectrum between “I know what a pod is” and “I administer multi-cluster production environments.” This roadmap is explicit about where on that spectrum different employers are looking, so you can focus your time on what actually matters for your goals.

If you are working toward a cloud engineering role, you do not need to reach Stage 4 or 5 before applying. If you are aiming for a platform engineering or dedicated Kubernetes engineering role, you do. Read this roadmap accordingly.

Stage 1: Core concepts (weeks 1–2)

Before touching a cluster, understand the mental model. Kubernetes is a system for declaring the desired state of containerised workloads, and letting a control plane reconcile reality with that declaration. Everything else follows from that idea.
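The reconcile idea is easier to hold onto once you have seen it as a loop. The sketch below is a toy model, not how any real controller is written: the Cluster class, the reconcile function, and the pod naming scheme are all made up for illustration. It only shows the shape of the logic: compare desired state with observed state, then act on the difference.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for cluster state: just a list of running pod names."""
    pods: list = field(default_factory=list)


def reconcile(cluster: Cluster, app: str, desired_replicas: int) -> list:
    """Run one reconciliation pass and return the actions taken."""
    actions = []
    owned = [p for p in cluster.pods if p.startswith(f"{app}-")]

    # Too few pods: create the missing ones, reusing any free index.
    next_index = 0
    while len(owned) < desired_replicas:
        name = f"{app}-{next_index}"
        next_index += 1
        if name in cluster.pods:
            continue
        cluster.pods.append(name)
        owned.append(name)
        actions.append(f"create {name}")

    # Too many pods: delete the surplus.
    while len(owned) > desired_replicas:
        name = owned.pop()
        cluster.pods.remove(name)
        actions.append(f"delete {name}")

    return actions


cluster = Cluster()
print(reconcile(cluster, "web", 3))  # first pass creates three pods
cluster.pods.remove("web-1")         # simulate a pod disappearing
print(reconcile(cluster, "web", 3))  # next pass replaces the missing pod
print(reconcile(cluster, "web", 3))  # already converged, so no actions
```

Notice that the loop never asks why a pod went missing. It only looks at the gap between what was declared and what exists, which is the same reason a deleted pod in a real Deployment comes back without anyone re-running a command.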

The concepts to understand at Stage 1:

Pods — the smallest deployable unit. One or more containers sharing a network namespace and storage.
Deployments — a controller that manages a set of identical pods. Handles rolling updates and rollbacks.
Services — stable network endpoints that route traffic to pods. Understand ClusterIP, NodePort, and LoadBalancer types.
Namespaces — logical partitions within a cluster. Used to separate environments or teams.
ReplicaSets — ensure a specified number of pod replicas are running at all times (usually managed by a Deployment, not directly).
Nodes — the worker machines. Understand the role of the control plane vs worker nodes.
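One detail the list above does not spell out is how a Service knows which pods to route to: it uses a label selector, and any pod whose labels contain every key/value pair in the selector becomes a backend. A minimal sketch of that matching rule, with hypothetical pod names and labels:

```python
def matches(selector: dict, labels: dict) -> bool:
    """True when every key/value pair in the selector appears in the pod's labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical pods for illustration: name -> labels.
pods = {
    "web-0": {"app": "web", "tier": "frontend"},
    "web-1": {"app": "web", "tier": "frontend"},
    "db-0": {"app": "db", "tier": "backend"},
}

# A Service with this selector would route only to the two web pods.
service_selector = {"app": "web"}
backends = sorted(name for name, labels in pods.items() if matches(service_selector, labels))
print(backends)
```

This is a simplification: a real Service also leaves out pods that are not ready, which is where the readiness probes from Stage 2 come in.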

You do not need a full cluster to learn this. Use Minikube or kind (Kubernetes in Docker) locally. Both run a single-node cluster on your laptop by default. Install kubectl and practice the basic commands: kubectl get pods, kubectl describe pod, kubectl logs, kubectl exec.

At Stage 1, you should be able to explain what happens when you run kubectl apply -f deployment.yaml. You cannot get a job on Stage 1 knowledge alone, but it is the foundation for everything that follows.

Stage 2: Writing and applying YAML manifests (weeks 3–6)

Stage 2 is where Kubernetes starts becoming a tangible skill. You move from reading about objects to writing YAML manifests from scratch and managing a running workload through its lifecycle.

Key skills to develop at Stage 2:

Writing YAML manifests — write Deployment, Service, and ConfigMap manifests without copy-pasting. Understand the apiVersion, kind, metadata, and spec structure.
ConfigMaps and Secrets — externalise configuration from container images. Understand the difference between a ConfigMap (non-sensitive config) and a Secret (sensitive data, stored base64-encoded, which is an encoding and not encryption). Know how to expose both as environment variables and as files in a mounted volume.
Health checks — configure livenessProbe and readinessProbe on containers. Understand the difference: a failing liveness probe makes the kubelet restart the container, while a failing readiness probe removes the pod from Service endpoints until the probe passes again.
Resource requests and limits — set resources.requests and resources.limits on containers. Requests drive scheduling decisions, and autoscalers (the Horizontal Pod Autoscaler and the Cluster Autoscaler) rely on them to work correctly.
Rolling updates — understand RollingUpdate strategy parameters: maxSurge and maxUnavailable.
kubectl fluency — practice kubectl rollout status, kubectl rollout undo, kubectl scale, kubectl port-forward, and kubectl apply -f vs kubectl create -f.
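The maxSurge and maxUnavailable parameters are easier to reason about with numbers. The sketch below computes how many pods can exist, and how many must stay available, during a rolling update. It relies on two documented rules: both parameters default to 25%, and a percentage for maxSurge rounds up while a percentage for maxUnavailable rounds down. The function names are made up for this example.

```python
from typing import Tuple, Union


def resolve(value: Union[int, str], replicas: int, round_up: bool) -> int:
    """Turn an absolute number or a percentage string like '25%' into a pod count."""
    if isinstance(value, int):
        return value
    scaled = int(value.rstrip("%")) * replicas
    # Integer arithmetic avoids floating-point surprises when rounding.
    return -(-scaled // 100) if round_up else scaled // 100


def rollout_bounds(
    replicas: int,
    max_surge: Union[int, str] = "25%",
    max_unavailable: Union[int, str] = "25%",
) -> Tuple[int, int]:
    """Return (maximum total pods, minimum available pods) during a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)               # percentages round up
    unavailable = resolve(max_unavailable, replicas, round_up=False)  # percentages round down
    return replicas + surge, replicas - unavailable


print(rollout_bounds(10))                                 # defaults on a 10-replica Deployment
print(rollout_bounds(4, max_surge=1, max_unavailable=0))  # never dip below full capacity
```

With the defaults, a 10-replica Deployment may briefly run up to 13 pods and must keep at least 8 available. Setting maxUnavailable to 0 with a maxSurge of 1 is the conservative choice: slower, but capacity never drops during the rollout.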

Build a project at Stage 2: deploy a multi-component application (e.g., a web app with a database) using only Kubernetes YAML manifests. Use ConfigMaps for application config, Secrets for database credentials, and Services to wire components together. Document it on GitHub.
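When you create the Secret for your database credentials, it helps to see exactly what "base64-encoded" means. The sketch below builds a Secret manifest as a Python dictionary and prints it as JSON, which kubectl apply -f accepts alongside YAML. The secret name and the credentials are placeholders invented for this example.

```python
import base64
import json


def secret_manifest(name: str, values: dict) -> dict:
    """Build an Opaque Secret manifest with base64-encoded values under `data`."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


# Placeholder credentials for illustration only; never commit real ones.
manifest = secret_manifest("db-credentials", {"username": "app", "password": "s3cret"})
print(json.dumps(manifest, indent=2))

# Anyone who can read the manifest can reverse the encoding.
print(base64.b64decode(manifest["data"]["password"]).decode("utf-8"))
```

The last line is the point: decoding takes one call and no key. For a portfolio project, keep real credentials out of the public repository and mention in your README how you would handle them properly.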

Stage 2 is the minimum viable Kubernetes knowledge for most cloud engineering job applications. If a job description lists Kubernetes as a requirement and you can demonstrate Stage 2 knowledge with a real project, you are competitive for those roles.

Stage 3: Managed clusters and production-adjacent patterns (weeks 7–12)

Stage 3 moves from local clusters to real cloud-managed Kubernetes. This is where employers start separating candidates who have “used Kubernetes” from those who have “used Kubernetes in a real environment.”

The three major managed Kubernetes services are GKE (Google Kubernetes Engine), EKS (Amazon Elastic Kubernetes Service), and AKS (Azure Kubernetes Service). You do not need deep expertise in all three. Pick one and learn it well. GKE is generally considered the most mature and developer-friendly; EKS is most relevant if you are targeting AWS-heavy organisations.

Key topics at Stage 3:

RBAC — Role-Based Access Control. Understand Role, ClusterRole, RoleBinding, and ClusterRoleBinding. Know how to grant a service account permission to read secrets in a specific namespace without granting cluster-wide access.
Ingress — understand how an Ingress resource routes external HTTP/S traffic to Services. Know the difference between an Ingress resource and an Ingress controller (e.g., ingress-nginx, Traefik), which is the component that actually implements the routing. Configure path-based and host-based routing.
Persistent Volumes — understand PersistentVolume (PV), PersistentVolumeClaim (PVC), and StorageClass. Know when stateful workloads need persistent storage and how StatefulSet differs from a Deployment for stateful applications.
Horizontal Pod Autoscaler (HPA) — configure autoscaling based on CPU or custom metrics. Understand the relationship between HPA and resource requests.
Cluster networking basics — understand how pods communicate within a cluster. Know what a CNI plugin is (Calico, Flannel, Cilium) even if you do not configure them manually on managed services.
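The link between the HPA and resource requests becomes clear once you see the arithmetic. CPU utilisation is measured as a percentage of the pod's CPU request, and the autoscaler's documented core formula is desired = ceil(current replicas × current metric / target metric), with a default tolerance of 10% around the target before it acts. The sketch below is a simplified version of that calculation: it ignores min/max replica limits, stabilisation windows, and pods that are not ready.

```python
import math


def cpu_utilization(usage_millicores: float, request_millicores: float) -> float:
    """CPU utilisation as a percentage of the container's CPU request."""
    return 100 * usage_millicores / request_millicores


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    tolerance: float = 0.1,
) -> int:
    """Simplified HPA calculation: scale in proportion to how far off target we are."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target, so do nothing
    return math.ceil(current_replicas * ratio)


# Pods request 500m CPU and are currently using 450m on average.
utilization = cpu_utilization(450, 500)
print(utilization)
print(desired_replicas(4, utilization, target_utilization=60))
```

If a container has no CPU request, there is nothing to divide by, which is why a utilisation-based HPA cannot work without requests being set.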

At Stage 3, you should be comfortable provisioning a GKE or EKS cluster (using the cloud console or gcloud/eksctl CLI), deploying a real application, configuring Ingress, and setting up basic RBAC. This is genuinely useful for a cloud engineering role and is what most cloud engineer interviews test.

Stage 4: CKA certification and cluster administration (months 4–6)

The Certified Kubernetes Administrator (CKA) is the most credible Kubernetes certification in the industry. It is a hands-on exam: you configure real clusters under time pressure rather than answering multiple-choice questions. Employers who care about Kubernetes take CKA seriously.

CKA is relevant for platform engineering, Kubernetes engineering, and DevOps roles where Kubernetes is a core responsibility. It is less critical for general cloud engineering roles where Kubernetes is one of many tools. Be honest with yourself about which category your target role falls into before investing time in CKA preparation.

Topics that CKA covers (beyond what Stage 3 already covers):

Cluster installation and configuration — using kubeadm to bootstrap a cluster from scratch. Understanding control plane components: kube-apiserver, etcd, kube-scheduler, kube-controller-manager.
etcd backup and restore — a common CKA exam task. Know the etcdctl snapshot save and etcdctl snapshot restore commands (recent etcd releases move the restore step to etcdutl snapshot restore).
Network policies — use NetworkPolicy resources to restrict pod-to-pod traffic. Understand ingress and egress rules.
Node maintenance — cordon and drain nodes for maintenance using kubectl cordon and kubectl drain, then return them to service with kubectl uncordon.
Troubleshooting — diagnose failing pods, broken schedulers, and misconfigured RBAC. This is a large portion of the CKA exam.
Storage classes and dynamic provisioning — configure a StorageClass and verify dynamic PV provisioning works.

Prepare for CKA using killer.sh (the official simulator), the Kubernetes documentation (which you can use during the exam), and hands-on practice in a real cluster. Budget 4–6 weeks of study if you are already at Stage 3.

Stage 5: Production patterns and advanced tooling (ongoing)

Stage 5 is the territory of engineers whose job is Kubernetes — platform teams, infrastructure engineers, and SREs. The patterns here are not taught in a linear sequence; they are skills you accumulate as you encounter real problems in production environments.

Key areas at Stage 5:

Helm — the Kubernetes package manager. Understand chart structure (Chart.yaml, values.yaml, templates/), how to write a chart for your own application, and how to use helm upgrade --install in a CI/CD pipeline.
GitOps with ArgoCD or Flux — declarative continuous delivery for Kubernetes. ArgoCD watches a Git repository and reconciles the cluster state to match. This is the standard pattern in mature Kubernetes environments. Understanding ArgoCD Applications, sync policies, and health checks is valuable. See the DevOps engineer roadmap for broader GitOps context.
Operators — Kubernetes controllers that extend the API to manage complex stateful applications (databases, message queues). Understanding what an Operator does and how to use community Operators (e.g., the Postgres Operator) is useful. Writing your own Operator with the Operator SDK is an advanced specialisation.
Multi-cluster patterns — fleet management, cross-cluster service discovery, and disaster recovery across clusters. Tools like ArgoCD ApplicationSets and GKE Fleet cover this space.
Security hardening — Pod Security Standards, enforced through Pod Security Admission (which replaced the removed PodSecurityPolicy), OPA/Gatekeeper for policy enforcement, image vulnerability scanning, and seccomp/AppArmor profiles. This overlaps with the cloud security engineer roadmap.
Cluster autoscaling — the Cluster Autoscaler and Karpenter (AWS). Understand how node-level autoscaling interacts with pod-level HPA/KEDA.

At Stage 5, the Certified Kubernetes Security Specialist (CKS) certification is worth considering if you are specialising in platform security. There is also CKAD (Certified Kubernetes Application Developer) — this is relevant for application developers and easier than CKA, but less valuable for infrastructure-focused roles.

What employers actually test in Kubernetes interviews

Kubernetes interview questions fall into predictable categories. Understanding what interviewers are really assessing helps you prepare efficiently.

Conceptual questions — “What is the difference between a Deployment and a StatefulSet?” or “How does a Service route traffic to pods?” These test whether you understand the object model. You need Stage 1–2 knowledge to answer these well.

Troubleshooting scenarios — “A pod is stuck in CrashLoopBackOff. Walk me through how you would debug it.” This is the most common practical question. You need to know kubectl describe pod, kubectl logs, kubectl logs --previous, and how to interpret common failure messages. This is Stage 2–3 territory.
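It helps to rehearse the CrashLoopBackOff answer as a fixed sequence instead of a list of commands you half remember. The small helper below prints that sequence for a given pod. The function name, pod name, and namespace are made up for illustration; the kubectl commands and flags are the standard ones.

```python
import shlex


def crashloop_debug_commands(pod: str, namespace: str = "default") -> list:
    """Return kubectl commands to run, in order, for a pod stuck in CrashLoopBackOff."""
    target = f"{shlex.quote(pod)} -n {shlex.quote(namespace)}"
    return [
        f"kubectl describe pod {target}",     # events, last state, exit code, probe failures
        f"kubectl logs {target}",             # output of the current container attempt
        f"kubectl logs {target} --previous",  # output of the attempt that crashed
    ]


for command in crashloop_debug_commands("web-0", "staging"):
    print(command)
```

The order matters when you talk it through: describe tells you whether the container is exiting on its own or being killed, and the previous container's logs usually contain the error that the fresh restart has not reproduced yet.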

Architecture questions — “How would you configure RBAC to allow a developer team read-only access to pods in their namespace?” These test whether you have thought about real multi-team Kubernetes environments. Stage 3 knowledge covers this.
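One reasonable answer to that RBAC question is a namespaced Role that allows only read verbs on pods, plus a RoleBinding that attaches it to the team's group. The sketch below builds both objects as Python dictionaries and prints them as a JSON List that kubectl apply -f can consume. The names pod-reader, team-a, and team-a-devs are assumptions for the example, and how a group maps to real users depends on how your cluster handles authentication.

```python
import json

RBAC_API = "rbac.authorization.k8s.io"


def read_only_pod_access(namespace: str, group: str) -> list:
    """Build a Role and RoleBinding granting a group read-only access to pods."""
    role = {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "Role",
        "metadata": {"name": "pod-reader", "namespace": namespace},
        "rules": [
            # "" is the core API group, which is where pods live.
            {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]}
        ],
    }
    binding = {
        "apiVersion": f"{RBAC_API}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": "pod-reader-binding", "namespace": namespace},
        "subjects": [{"kind": "Group", "name": group, "apiGroup": RBAC_API}],
        "roleRef": {"kind": "Role", "name": "pod-reader", "apiGroup": RBAC_API},
    }
    return [role, binding]


objects = read_only_pod_access("team-a", "team-a-devs")
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": objects}, indent=2))
```

Using a Role and RoleBinding instead of their Cluster-scoped counterparts is what keeps the access confined to one namespace, and that is usually the distinction the interviewer is listening for.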

Hands-on tasks — some companies give take-home tasks or live coding sessions where you deploy a workload. Stage 2–3 knowledge is sufficient for most of these.

CKA-level questions (control plane components, etcd, kubeadm) typically only appear in interviews for dedicated Kubernetes/platform engineering roles. Do not over-prepare for Stage 4 knowledge if your target role does not require it.

Building a Kubernetes project portfolio

Hands-on projects are more convincing to employers than certifications alone. Here is a progression of projects that map to the learning stages:

Stage 2 project — Deploy a multi-tier application (web server, API, database) using YAML manifests on Minikube or kind. Configure health checks, resource limits, ConfigMaps, and Secrets. Push manifests to a public GitHub repository with a clear README.
Stage 3 project — Provision a GKE or EKS cluster (free tier or short-lived), deploy the same application, configure an Ingress with TLS, add RBAC for a hypothetical team, and document the process. Screenshot or record the running cluster.
Stage 5 project — Build a Helm chart for your application. Set up ArgoCD to deploy it from a Git repository. Demonstrate automated rollout on a commit to the main branch.

Each project builds on the previous one. By the time you have a Stage 3 project in your portfolio, you have tangible evidence of Kubernetes ability that an interviewer can evaluate. If you are early in your career, read the self-taught cloud engineer guide for advice on building a portfolio without formal experience.