What Is Kubernetes? A Beginner-Friendly Explanation

Kubernetes is an open-source system for running and managing containerised applications across a cluster of machines. It solves a hard practical problem: once you have more than a handful of containers, something needs to schedule them, restart them when they crash, scale them when traffic spikes, and route traffic to healthy instances, without you doing all of that by hand. On Google Cloud, you run Kubernetes through Google Kubernetes Engine (GKE), a fully managed service where Google operates the control plane so you can focus on your workloads.

Kubernetes in simple terms

Think of a container as a box that holds your application and everything it needs to run. A container tool like Docker lets you build and run one of those boxes on a single machine. That is useful for development, but it does not solve the production problem.

In production, you might need fifty copies of your web server running simultaneously across multiple machines. Some of those machines will fail. Traffic will spike unpredictably. You will need to deploy a new version of your app without taking the whole thing offline. You need something watching over everything, making sure the right number of containers are running at all times, routing traffic only to healthy ones, and replacing broken ones automatically.

That is what Kubernetes does. If a container tool runs individual containers, Kubernetes keeps hundreds of containers running reliably across a whole fleet of machines, automatically scheduling, healing, and scaling them as needed.

Analogy

Imagine Kubernetes as a shift manager at a large restaurant. Individual chefs (containers) prepare dishes, but the shift manager decides which chef works which station, spots problems before they become incidents, calls in extra staff when the lunch rush hits, and sends someone home when it is quiet. The chefs do not need to coordinate with each other, because the shift manager handles that. Your application code is the chef. Kubernetes is the shift manager.

Key idea

The concept that unlocks Kubernetes thinking is desired state. You tell Kubernetes what you want (three replicas of this container, always running). It works out how to make that happen, and keeps working to stay there. You describe the goal; Kubernetes figures out the steps.
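
The idea of desired state is easier to see in code. The sketch below is a toy model, not real Kubernetes code: the reconcile function and the pod names are made up for illustration. It compares the number of replicas you asked for with the pods that are actually running and returns the actions needed to close the gap.

```python
# Toy model of a desired-state reconciliation step (not real Kubernetes code).
# The function name and pod names are hypothetical.

def reconcile(desired_replicas, running_pods):
    """Return the actions needed to move the actual state to the desired state."""
    actions = []
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: start the missing ones.
        actions = [("start", f"pod-new-{i}") for i in range(diff)]
    elif diff < 0:
        # Too many pods: stop the surplus ones.
        actions = [("stop", name) for name in running_pods[:-diff]]
    return actions


# One pod has crashed, so only two of the three desired replicas are running.
print(reconcile(3, ["pod-a", "pod-b"]))
# The actual state already matches the desired state, so there is nothing to do.
print(reconcile(3, ["pod-a", "pod-b", "pod-c"]))
```

A real controller runs a step like this over and over, which is why the cluster drifts back to what you declared even after failures.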

What problem does Kubernetes solve?

Containers are a great way to package software consistently. A container that works on a developer’s laptop works the same way in a data centre. That consistency is genuinely valuable.

But consistency at packaging time does not solve the problems that appear at runtime. Once you move to production at any real scale, you run into a specific set of operational challenges:

  • Keeping containers running. Containers crash. Machines fail. Something needs to detect failures and start replacements without waking you up at 3am.
  • Scaling when traffic spikes. A single copy of your service will not survive a sudden traffic spike. Something needs to spin up more instances quickly and take them back down afterwards.
  • Rolling out updates safely. Replacing all containers at once causes downtime. You need a way to swap in new versions gradually, and roll back immediately if something goes wrong.
  • Routing traffic to healthy instances. As containers start and stop, their IP addresses change. Something needs to track which containers are healthy and route traffic only to those.
  • Running many services consistently. As the number of services grows, maintaining a separate set of custom scripts for each one stops being practical. You need a consistent, repeatable approach that works the same way for every service.

Note

None of these problems appear when you are running two or three containers in development. They appear at scale, in production, under real traffic. Kubernetes is designed for that environment specifically.

Before Kubernetes, teams built custom tooling for each of these problems. The tooling was fragile, hard to maintain, and did not transfer between organisations. Google had already solved this internally with a system called Borg. In June 2014, Google released Kubernetes, a new open-source system built on the lessons learned from Borg. Google later donated the project to the Cloud Native Computing Foundation (CNCF), which governs it today.

How Kubernetes works, step by step

Here is what actually happens when you deploy an application with Kubernetes:

  1. You package your app into a container image. Using Docker or another container tool, you build an image and push it to a registry such as Google Artifact Registry.

  2. You describe what you want in a YAML file. You write a Deployment manifest that says something like: “Run three copies of this container image, and make sure three are always running.” You apply it with kubectl apply -f deployment.yaml.

  3. The control plane schedules pods onto nodes. The Deployment controller creates the pods your Deployment asks for, and the Kubernetes scheduler works out which worker nodes have enough unreserved CPU and memory and assigns each pod to one of them. A pod is the smallest unit Kubernetes manages: it wraps one or more containers together.

  4. A Service routes incoming traffic. Because every replacement pod gets a new IP address, you create a Service in front of the pods. The Service gets a stable virtual IP and automatically load-balances traffic across all healthy pods. See Services in Kubernetes for the full breakdown.

  5. The Deployment controller maintains desired state. If a pod crashes, the Deployment controller starts a replacement. If you update the container image, it replaces pods gradually, a few at a time, so the service stays available throughout. See Deployments in Kubernetes for how rolling updates work in practice.

  6. Kubernetes self-heals and scales. If a worker node fails, the pods it was running are replaced on other healthy nodes, provided a controller such as a Deployment manages them. If you have configured autoscaling, Kubernetes scales the number of pods up or down based on CPU usage, memory usage, or custom metrics. See Horizontal Pod Autoscaling for how that is configured.
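
The autoscaling in step 6 comes down to simple arithmetic. The sketch below uses the ratio rule from the Horizontal Pod Autoscaler documentation: multiply the current replica count by the ratio of the observed metric to the target metric, then round up. The function name and the default limits of 1 and 10 replicas are assumptions for illustration, and the real autoscaler also applies a tolerance and stabilisation behaviour that this sketch leaves out.

```python
import math

# Sketch of the Horizontal Pod Autoscaler ratio rule.
# min_replicas and max_replicas defaults are illustrative, not Kubernetes defaults.

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count by observed/target, round up, and clamp to the limits."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# 3 pods averaging 90% CPU against a 60% target: 3 * 90 / 60 = 4.5, rounded up to 5.
print(desired_replicas(3, current_metric=90, target_metric=60))
# 5 pods averaging 20% CPU against a 60% target: 5 * 20 / 60 = 1.67, rounded up to 2.
print(desired_replicas(5, current_metric=20, target_metric=60))
```

Rounding up means the autoscaler errs on the side of having slightly too much capacity rather than too little.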

Tip

You can inspect what Kubernetes is doing at any time with kubectl get and kubectl describe. Kubernetes is transparent by design: controllers record their decisions in object status and events, so you rarely have to guess why something happened.

Core Kubernetes concepts

These seven concepts appear in almost every Kubernetes workflow. Understanding them before you start will save a lot of confusion.

Cluster

A Kubernetes cluster is the fundamental unit of deployment. It consists of a control plane and one or more worker nodes. Everything you run in Kubernetes lives inside a cluster. When you use GKE, you are creating and working with clusters.

Control plane

The control plane is the brain of the cluster. It makes global decisions (scheduling pods, replacing failures, rolling out updates) and runs the API server that every tool communicates with: kubectl, the Cloud Console, CI pipelines. In GKE, Google operates the control plane entirely. You never manage those machines.

Node

A node is a worker machine that runs your containerised workloads. Each node runs an agent called the kubelet, which receives instructions from the control plane and ensures the right containers are running. In GKE Standard mode, nodes are Compute Engine VMs in your project. In GKE Autopilot mode, Google manages the nodes for you. See Node Pools Explained for how to configure them in Standard mode.

Pod

A pod is the smallest deployable unit in Kubernetes. It wraps one or more containers that share a network namespace (they can reach each other over localhost) and can share storage volumes. In practice, most pods run a single container. Pods are ephemeral by design: they start, do their job, and may be replaced at any time. For a deep dive, see Kubernetes Pods Explained.

Watch out

Pods are ephemeral. Never store important data in a pod’s local filesystem expecting it to persist, and never hardcode a specific pod’s IP address in your configuration. Both disappear when the pod is replaced. Use persistent volumes for data and Services for networking.

Deployment

A Deployment describes a desired state: “I want three replicas of this container image running at all times.” The Deployment controller continuously reconciles the actual state of the cluster with that desired state. If a pod dies, it starts a replacement. If you change the image version, it replaces pods gradually using a rolling update strategy. Bare pods (pods created without a Deployment or another controller) are not recreated if they are deleted or their node fails. Always use Deployments for application workloads. See Deployments in Kubernetes.
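
As a rough illustration, here is what such a Deployment could look like. The sketch builds the manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The name web, the app label, the port, and the image path are all placeholders invented for this example, and the rolling update settings are one possible choice rather than a recommendation.

```python
import json

# Minimal Deployment manifest. Every name, label, port, and the image path
# below is a placeholder for illustration.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web", "labels": {"app": "web"}},
    "spec": {
        # Desired state: three replicas, always running.
        "replicas": 3,
        # The Deployment manages pods whose labels match this selector.
        "selector": {"matchLabels": {"app": "web"}},
        # Replace pods gradually: at most one extra pod, none unavailable.
        "strategy": {
            "type": "RollingUpdate",
            "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
        },
        # Template for the pods the Deployment creates.
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {
                "containers": [
                    {
                        "name": "web",
                        "image": "example.com/my-project/web:1.0.0",
                        "ports": [{"containerPort": 8080}],
                    }
                ]
            },
        },
    },
}

manifest = json.dumps(deployment, indent=2)
print(manifest)
```

Saving the output as deployment.json and running kubectl apply -f deployment.json would hand this desired state to the cluster. Note that the selector and the pod template labels must match, otherwise the Deployment is rejected.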

Service

A Service is a stable network endpoint that routes traffic to a set of pods. Because a pod that is replaced comes back with a new IP address, you never send traffic directly to a pod. A Service selects the right pods by label and load-balances across them, providing a consistent address regardless of how many pods are running or where they are scheduled. See Services in Kubernetes.
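
Label selection is the part that tends to confuse newcomers, so here is a toy model of it. The pod list, IP addresses, and the endpoints function are invented for illustration, and the round-robin choice at the end is a simplification: a real cluster may pick backends differently.

```python
from itertools import cycle

# Hypothetical pods with made-up names, IPs, and labels.
pods = [
    {"name": "web-1", "ip": "10.0.0.4", "labels": {"app": "web"}, "ready": True},
    {"name": "web-2", "ip": "10.0.1.7", "labels": {"app": "web"}, "ready": False},
    {"name": "web-3", "ip": "10.0.2.9", "labels": {"app": "web"}, "ready": True},
    {"name": "db-1", "ip": "10.0.2.3", "labels": {"app": "db"}, "ready": True},
]


def endpoints(selector, pods):
    """Return the IPs of ready pods whose labels include every selector pair."""
    return [
        p["ip"]
        for p in pods
        if p["ready"] and all(p["labels"].get(k) == v for k, v in selector.items())
    ]


# Only ready pods labelled app=web become backends; web-2 and db-1 are skipped.
backends = endpoints({"app": "web"}, pods)
print(backends)

# Spread four requests across the backends (round-robin is a simplification).
picker = cycle(backends)
print([next(picker) for _ in range(4)])
```

Clients only ever see the Service address, so pods can come and go behind it without any client configuration changing.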

Namespace

Namespaces partition cluster resources between teams, environments, or applications. Two teams can each have a deployment named api as long as they are in different namespaces. GKE clusters include a few built-in namespaces, such as kube-system for cluster infrastructure and default for workloads that do not specify one. For most real projects, you will create your own namespaces to keep things organised.
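
One way to picture this is to treat the namespace as part of an object's identity. The toy store below is not how Kubernetes is implemented, and the create function and the team-a and team-b namespaces are hypothetical. It only shows why two objects named api can coexist.

```python
# Toy object store: a name only has to be unique within its namespace and kind.
store = {}


def create(namespace, kind, name):
    """Add an object, rejecting duplicates within the same namespace and kind."""
    key = (namespace, kind, name)
    if key in store:
        raise ValueError(f"{kind} {name!r} already exists in namespace {namespace!r}")
    store[key] = {"namespace": namespace, "kind": kind, "name": name}
    return store[key]


create("team-a", "Deployment", "api")
create("team-b", "Deployment", "api")  # same name, different namespace: allowed

try:
    create("team-a", "Deployment", "api")  # duplicate within one namespace
except ValueError as err:
    print(err)
```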

Kubernetes vs Docker

This is one of the most common points of confusion for beginners. Docker and Kubernetes are complementary tools, not competing ones. They operate at completely different levels.

Aspect          | Docker (and similar tools)             | Kubernetes
Primary job     | Build and run containers               | Orchestrate containers across many machines
Scope           | One machine                            | A cluster of machines
Scheduling      | You run containers manually            | Kubernetes places containers automatically
Self-healing    | Only via restart policies on one host  | Yes: failed pods are restarted or replaced automatically
Scaling         | Manual                                 | Automatic when autoscaling is configured
Networking      | Single-host                            | Cluster-wide, with stable Service addresses
Load balancing  | Not built in                           | Built in via Services

One-sentence rule

Use Docker (or another container tool) to build your image and push it to a registry. Let Kubernetes run it. Once the image is in the registry, you barely need to think about Docker again. Kubernetes nodes pull and run images through their own container runtime (containerd on GKE), without needing Docker installed.

When Kubernetes makes sense

Kubernetes is the right choice when you have:

  • Multiple services that need to scale independently. Kubernetes handles each service as a separate Deployment with its own scaling configuration.
  • High availability requirements. Kubernetes automatically spreads workloads across nodes and zones, replacing failed pods without manual intervention.
  • Rolling deployments with zero downtime. The Deployment controller makes progressive rollouts straightforward: a rollout stops progressing if new pods fail their readiness checks, and you can revert to the previous version with kubectl rollout undo.
  • A platform team managing workloads for many application teams. Kubernetes’ namespace model and RBAC system make it practical to run many teams’ workloads on shared infrastructure safely.
  • Consistency across environments. The same Kubernetes manifests describe workloads in development, staging, and production, with only small per-environment differences such as replica counts or image tags.
  • Workloads that need fine-grained scheduling control. Kubernetes lets you express constraints like “this pod must run on a GPU node” or “do not co-locate these two pods on the same machine”.

When Kubernetes may be overkill

Kubernetes is powerful, but it comes with genuine operational complexity. It is worth being honest about when it is not the right tool.

Consider something simpler if:

  • You have a single service with modest traffic. A fully managed platform like Cloud Run will deploy your container, scale it to zero, and charge you only for what you use, with far less configuration than a Kubernetes cluster.
  • Your team does not yet have Kubernetes experience. The learning curve is real. Introducing Kubernetes prematurely slows you down and introduces operational risk.
  • You are building an internal tool with low traffic and no high availability requirement. The overhead of cluster management is hard to justify for a low-stakes workload.
  • You need to move quickly. Setting up and maintaining a Kubernetes cluster takes time. For an early-stage product, that time is usually better spent building features.

Important

Managed Kubernetes (GKE) significantly lowers the operational burden compared to running Kubernetes yourself, but it does not eliminate the need to understand Kubernetes concepts. You still write manifests, debug pod failures, manage namespaces and RBAC, and reason about cluster networking. Choose GKE because you want managed infrastructure, not because you want to skip learning Kubernetes.

The comparison between these options is covered in detail on the GKE vs Cloud Run page and the Kubernetes vs serverless page.

Kubernetes on Google Cloud: Google Kubernetes Engine

Running Kubernetes yourself is a significant operational undertaking. You need to provision and patch control plane VMs, configure etcd, manage TLS certificates, and set up load balancers before your first workload can even run. After that, you have to keep pace with Kubernetes release cycles.

Google Kubernetes Engine (GKE) removes that burden. Google operates and maintains the control plane: the API server, etcd, scheduler, and controller manager run on Google-owned infrastructure and are not visible as VMs in your project. You interact with your cluster using the same standard kubectl commands you would use with any Kubernetes cluster anywhere in the world.

What GKE manages for you:

  • The entire Kubernetes control plane
  • Control plane high availability across zones (in regional clusters)
  • Kubernetes version upgrades (when enrolled in a release channel)
  • Security patching of the control plane
  • etcd backups

What you still manage:

  • Your application workloads (pods, deployments, services)
  • IAM permissions for cluster access
  • Networking configuration (VPC, subnets, private cluster settings)
  • Persistent storage configuration
  • Worker nodes in Standard mode

To start using GKE, enable the Kubernetes Engine API in your GCP project:

gcloud services enable container.googleapis.com

GKE Autopilot is Google’s recommended default for most workloads. Google manages the worker nodes entirely (provisioning, scaling, patching, and securing them). You describe what you want to run and GKE provisions exactly the infrastructure needed. Billing is per pod based on CPU and memory requests, not per node.

GKE Standard gives you full control over the node infrastructure. You define node pools, choose machine types, and configure autoscaling parameters. Use Standard when you need specific hardware (GPUs, high-memory machines) or have compliance requirements that demand control over the underlying infrastructure.

Where to start

If you are new to Kubernetes on GCP, start with a GKE Autopilot cluster. It removes node management entirely and lets you focus on learning Kubernetes itself. The full tradeoffs between modes are covered on the GKE Autopilot vs Standard page. For a practical walkthrough, see Creating Your First GKE Cluster.

Inside a Kubernetes cluster

Understanding what is actually running inside a cluster helps you reason about what is happening when things go wrong.

Analogy

Think of a Kubernetes cluster like an airport. The control tower (control plane) does not carry any passengers itself. It tracks every flight, assigns runways, reroutes aircraft when weather hits, and keeps the whole operation coordinated. The aircraft (nodes) do the actual work of carrying passengers (your application containers) from A to B. When a plane has a mechanical problem, the control tower knows immediately and reroutes the passengers. Pilots do not need to manage the whole airport, just their aircraft.

Control plane components (managed by Google on GKE)

  • kube-apiserver: the front door of the cluster. Every interaction from kubectl, the Cloud Console, or a CI pipeline goes through the API server. It validates requests and persists cluster state.
  • etcd: a distributed key-value store holding the entire state of the cluster: every pod, service, config map, and secret. GKE manages etcd automatically.
  • kube-scheduler: watches for newly created pods that need a home, then picks the best node based on resource availability, affinity rules, and other constraints.
  • kube-controller-manager: runs a set of controllers that each watch cluster state and work to bring reality in line with desired state. The Deployment controller and ReplicaSet controller are both included here.
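
To make the scheduler's job more concrete, here is a toy version of the filter-then-score approach it follows. The node data, the schedule function, and the scoring rule (prefer the node with the most CPU left over) are all assumptions for illustration. The real scheduler weighs many more factors.

```python
# Hypothetical nodes with made-up capacities and labels.
nodes = [
    {"name": "node-a", "free_cpu": 0.5, "free_mem_gb": 1.0, "labels": {}},
    {"name": "node-b", "free_cpu": 6.0, "free_mem_gb": 16.0, "labels": {}},
    {"name": "node-c", "free_cpu": 4.0, "free_mem_gb": 8.0, "labels": {"gpu": "true"}},
]


def schedule(pod, nodes):
    """Filter out nodes that cannot run the pod, then pick the best-scoring one."""
    feasible = [
        n
        for n in nodes
        if n["free_cpu"] >= pod["cpu"]
        and n["free_mem_gb"] >= pod["mem_gb"]
        and all(n["labels"].get(k) == v for k, v in pod.get("node_selector", {}).items())
    ]
    if not feasible:
        return None  # no node fits, so the pod would stay unscheduled
    # Toy scoring rule: prefer the node with the most CPU left after placement.
    return max(feasible, key=lambda n: n["free_cpu"] - pod["cpu"])["name"]


# A small pod fits on node-b and node-c; node-b has more CPU to spare.
print(schedule({"cpu": 1.0, "mem_gb": 2.0}, nodes))
# The same pod restricted to GPU nodes can only land on node-c.
print(schedule({"cpu": 1.0, "mem_gb": 2.0, "node_selector": {"gpu": "true"}}, nodes))
# A pod that is too large for every node cannot be placed.
print(schedule({"cpu": 8.0, "mem_gb": 2.0}, nodes))
```

When no node passes the filter, the pod waits rather than being squeezed onto a machine that cannot run it, which is a common cause of pods that never start.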

Node components (running on each worker node)

  • kubelet: an agent on every node that receives pod specifications from the API server and ensures the right containers are running. If a container exits unexpectedly, the kubelet restarts it.
  • kube-proxy: maintains network rules that allow pods to communicate within the cluster and allow external traffic to reach Services. For the full picture, see GKE Networking Model.
  • Container runtime: the software that actually runs containers. GKE uses containerd on Container-Optimized OS nodes.

Common beginner mistakes

  1. Treating pods as permanent. Pods are ephemeral. They can be stopped, replaced, or rescheduled at any time. Never store data inside a pod’s local filesystem expecting it to persist. Never hardcode a specific pod’s IP address. Design workloads assuming pods will come and go.
  2. Creating bare pods instead of Deployments. A pod created directly (without a Deployment) has no self-healing. If the node it runs on fails, the pod is gone and nothing replaces it. Always use a Deployment for application workloads so Kubernetes maintains the desired number of replicas automatically.
  3. Confusing containers, pods, and nodes. A container is the runnable unit. A pod wraps one or more containers and is the smallest thing Kubernetes schedules. A node is the machine (VM) that runs pods. Getting these layers muddled makes it much harder to diagnose problems.
  4. Sending traffic directly to pod IP addresses. Pod IPs are ephemeral. When a pod is replaced, the new pod gets a different IP. Always route traffic through a Kubernetes Service, which provides a stable virtual IP and automatically load-balances across healthy pods.
  5. Assuming managed Kubernetes means no Kubernetes knowledge needed. GKE Autopilot removes node management, but you still write Kubernetes manifests, debug pod failures, manage namespaces and RBAC, and think about cluster networking. GKE simplifies operations; it does not abstract away Kubernetes itself.
  6. Ignoring security from the start. Default cluster settings are a starting point, not a finished security posture. Consider network policies, workload identity, and pod security standards early. The Securing GKE Clusters page covers the key controls.

Kubernetes vs the alternatives

Kubernetes vs Docker

Docker builds and runs individual containers on a single machine. Kubernetes orchestrates many containers across many machines. They are complementary: Docker-built images run on Kubernetes clusters. The confusion arises because both involve containers, but they operate at entirely different levels. The comparison table in the Kubernetes vs Docker section above covers this in detail.

Kubernetes vs serverless

Serverless platforms like Cloud Functions or Cloud Run manage the infrastructure entirely. You deploy code or a container and the platform handles everything else. Kubernetes gives you much more control: over scheduling, networking, resource limits, and cluster configuration. The tradeoff is complexity. Kubernetes is the better choice when you need that control; serverless is the better choice when you do not. See the Kubernetes vs serverless comparison for a detailed breakdown.

Kubernetes vs Cloud Run

Cloud Run is Google’s fully managed container platform. It runs your containers, scales them to zero, and charges only for what you use. Kubernetes (via GKE) gives you persistent workloads, complex scheduling, and full cluster control. Cloud Run is significantly simpler for stateless HTTP workloads. Kubernetes is the better choice for stateful workloads, batch jobs, internal services, or anything that does not fit the request-response model well. See the GKE vs Cloud Run comparison for guidance on choosing between them.

Frequently asked questions

What is Kubernetes in simple words?

Kubernetes is a system that runs and manages many containers across many machines at once. You tell it what you want running (for example, five copies of your web server) and it makes that happen, keeps it that way, and fixes things automatically when something goes wrong.

Is Kubernetes the same as Docker?

No. Docker builds and runs individual containers on a single machine. Kubernetes orchestrates containers across many machines, handling scheduling, self-healing, and scaling. They solve different problems and are often used together. Docker (or another container tool) builds the image, and Kubernetes runs it at scale.

Do I need Kubernetes for a small app?

Probably not. Kubernetes adds real operational complexity. For a single service with low traffic, something like Cloud Run will be simpler, cheaper, and faster to get started with. Kubernetes makes sense when you have multiple services, high availability requirements, or a team that needs to manage many workloads consistently.

What is the difference between Kubernetes and GKE?

Kubernetes is the open-source orchestration system. GKE (Google Kubernetes Engine) is Google Cloud's managed service that runs Kubernetes for you. With GKE, Google operates and maintains the control plane. You still use standard Kubernetes tools and write standard Kubernetes configuration. GKE removes the burden of managing the cluster infrastructure itself.

Is Kubernetes hard to learn?

The core concepts (pods, deployments, services, namespaces) are learnable in a few days. Getting comfortable with real-world operations takes longer. GKE Autopilot removes a lot of the infrastructure complexity, so it is a good starting point. The learning curve is real, but manageable if you take it step by step.

Last verified: 23 March 2026. Cloud services change frequently. Verify details against official documentation before making infrastructure decisions.