Kubernetes Services Explained: ClusterIP, NodePort, and LoadBalancer
A Kubernetes Service is a stable network endpoint that routes traffic to your Pods. Every time Kubernetes deploys or replaces a Pod, that Pod gets a new IP address. No other component can safely rely on that IP. A Service solves this by sitting in front of your Pods and giving clients one address that never changes, regardless of what happens to the Pods behind it.
Services are not optional. Without them, there is no reliable way to connect components inside your cluster or expose applications to the outside world. This guide explains how Services work, covers every service type in Kubernetes, and helps you choose the right one for your GKE workload.
By the end of this guide you will understand what a Kubernetes Service is, how ClusterIP, NodePort, LoadBalancer, and ExternalName work, when to use each, how Services relate to Ingress, and how to troubleshoot a Service that is not routing traffic.
Kubernetes Service in simple terms
A Service is a load balancer built into your cluster. You point it at a group of Pods using labels, and it gives that group one stable address. Clients connect to that address and Kubernetes figures out which Pod handles the request.
Think of a Service like a restaurant’s single phone number. The kitchen might have five chefs on Monday and three on Thursday, and they rotate constantly. Customers never need to know which chef is available. They always dial the same number and the Service routes the call to whoever is ready. If a chef leaves and is replaced, the phone number stays the same and callers notice nothing.
What matters practically: you write your application config once, pointing at my-backend (the Service name), and it keeps working through deployments, crashes, and scale events without any changes on your part.
Why Kubernetes Services exist
Pods in Kubernetes are designed to be temporary. Kubernetes replaces them constantly: during rolling updates from a Deployment, after crashes, when a node is drained for maintenance, and when the Horizontal Pod Autoscaler scales a workload down or the cluster autoscaler removes a node.
Every time a Pod is created, it gets a new IP address from the cluster’s Pod network range. A replacement Pod does not inherit its predecessor's IP, and the old address may later be assigned to an unrelated Pod. If your frontend holds a reference to 10.4.2.17 (a backend Pod IP), that reference becomes invalid the next time the backend is updated or the Pod is replaced.
A Service adds a stable layer between consumers and the Pods they need to reach:
- Stable virtual IP (ClusterIP): The Service gets an IP address that never changes for its lifetime, regardless of Pod churn.
- Stable DNS name: The cluster DNS add-on (kube-dns or Cloud DNS on GKE, CoreDNS on many other distributions) resolves the Service name automatically. Within the same namespace, my-backend resolves to the ClusterIP. From any namespace, my-backend.my-namespace.svc.cluster.local works.
- Automatic endpoint tracking: Kubernetes continuously watches for Pods matching the Service’s label selector. Ready Pods are added to the Endpoints list; terminating or unhealthy Pods are removed.
Configure your application once to connect to the Service name, and routing stays correct through every deployment, crash, and scale event.
How a Kubernetes Service works
Here is the sequence from YAML to working traffic:
- Labels on Pods: When you define a Deployment, you attach labels to the Pod template, for example app: api and tier: backend. Every Pod created by that Deployment carries those labels.
- Selector on the Service: Your Service spec includes a selector that specifies which labels to match. Any Pod carrying those labels in the same namespace is a candidate endpoint.
- Endpoints object: Kubernetes automatically maintains an Endpoints object (or EndpointSlice objects in newer clusters) for each Service that has a selector. It lists the IP addresses and ports of all Pods that match the selector and are currently passing their readiness probe. You never create or update this object yourself.
- kube-proxy routing: On each node, kube-proxy watches those endpoints and programs iptables (or IPVS) rules to forward traffic destined for the Service’s ClusterIP to one of the endpoint Pod IPs. This happens at the kernel level with no userspace proxy in the data path.
- DNS resolution: The cluster DNS add-on (kube-dns or Cloud DNS on GKE, CoreDNS on many other distributions) resolves Service names to ClusterIPs. Use the short name my-backend within the same namespace, or the fully-qualified name my-backend.my-namespace.svc.cluster.local from anywhere in the cluster. No hardcoded IPs needed.
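The label-to-selector wiring in the steps above can be sketched as a Deployment and a Service side by side. This is a minimal illustration, not a production manifest: the names api and api-svc, the image example.com/api:1.0, and the /healthz probe path are all hypothetical placeholders.

```yaml
# Hypothetical Deployment: every Pod it creates carries both labels.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
      tier: backend
  template:
    metadata:
      labels:
        app: api                       # must match the Service selector below
        tier: backend
    spec:
      containers:
        - name: api
          image: example.com/api:1.0   # placeholder image
          ports:
            - containerPort: 8080
          readinessProbe:              # only Ready Pods are listed as endpoints
            httpGet:
              path: /healthz           # assumed health endpoint
              port: 8080
---
# Service: selects the Pods above by label, in the same namespace.
apiVersion: v1
kind: Service
metadata:
  name: api-svc
spec:
  selector:
    app: api
    tier: backend
  ports:
    - port: 80
      targetPort: 8080
```

After applying both, kubectl get endpoints api-svc should list one address per Ready Pod.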
On GKE, the networking model adds VPC-native pod addressing on top of this, which integrates cleanly with Google Cloud load balancers for external Service types.
ClusterIP
ClusterIP is the default Service type. It assigns the Service a virtual IP address that is only reachable from within the cluster. Nothing outside the cluster — no users, no external systems — can connect to a ClusterIP Service directly.
When to use it: Any time one workload inside your cluster needs to talk to another. Frontend to backend API, application to cache, one microservice to another. ClusterIP is the right choice for the vast majority of Services.
When not to use it: When external access is needed. ClusterIP has no external reachability by design.

apiVersion: v1
kind: Service
metadata:
  name: backend-api
spec:
  type: ClusterIP
  selector:
    app: my-app
    tier: backend
  ports:
    - protocol: TCP
      port: 80          # Port clients connect to on the Service
      targetPort: 8080  # Port the container is actually listening on

Once created, this Service is reachable at:
- DNS (same namespace): backend-api
- DNS (any namespace): backend-api.default.svc.cluster.local
- IP: the assigned ClusterIP (e.g. 10.96.45.23)
Always use the DNS name in your application config, never the IP. DNS names survive the Service being deleted and recreated. The ClusterIP is likely to change if the Service is ever recreated, which will break anything that hardcoded it.
Omitting type in the spec defaults to ClusterIP. You do not need to specify it explicitly.
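One way to follow the DNS-name advice is to hand the Service address to the client workload through an environment variable. The manifest below is a sketch only: the frontend Deployment, its image, and the BACKEND_URL variable are hypothetical, and it assumes the backend-api Service lives in the default namespace.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend                            # hypothetical client workload
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: example.com/frontend:1.0   # placeholder image
          env:
            # Hypothetical variable read by the application.
            # Points at the Service DNS name and Service port (80), never the ClusterIP.
            - name: BACKEND_URL
              value: "http://backend-api.default.svc.cluster.local:80"
```

If the Service is deleted and recreated with a different ClusterIP, this configuration keeps working unchanged.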
NodePort
NodePort extends ClusterIP by opening a static port in the range 30000–32767 on every node in the cluster. External traffic can reach the Service by sending requests to any node’s IP at that port: <node-ip>:<node-port>.
When to use it: Local development, testing on clusters without a cloud provider, on-premises deployments, or accessing a service from within the same VPC without going through a public load balancer.
When not to use it: Production internet-facing traffic. Node IPs are unstable on GKE (nodes are replaced during upgrades and autoscaling events), the high port number is not user-friendly, and you have to manage firewall rules yourself.

apiVersion: v1
kind: Service
metadata:
  name: my-app-nodeport
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
      nodePort: 30080  # Optional: omit to let Kubernetes assign one automatically

NodePort opens the port on every node, including nodes not running your Pods. On GKE with node autoscaling, node IPs change whenever nodes are replaced. There is no built-in TLS, no health routing, and no stable entry point. For production external traffic on GKE, use a LoadBalancer Service or an Ingress resource instead.
A NodePort Service also has a ClusterIP, so it remains reachable inside the cluster the same way as a ClusterIP Service.
LoadBalancer
LoadBalancer extends NodePort by instructing GCP to provision an external load balancer automatically and point it at the NodePort. On GKE, this creates a Google Cloud external passthrough Network Load Balancer (Layer 4) with a stable public IP address.
When to use it: Exposing a single TCP or UDP service to the internet: a database proxy, a gRPC service, a game server, or anything that does not work well behind an HTTP(S) load balancer.
When not to use it: When you need to expose many HTTP/S services. Each LoadBalancer Service creates a separate cloud load balancer with its own IP and cost. For HTTP/S, use an Ingress resource backed by ClusterIP Services instead.

apiVersion: v1
kind: Service
metadata:
  name: my-app-lb
  annotations:
    # Optional: use a pre-reserved static IP
    networking.gke.io/load-balancer-ip-addresses: "my-static-ip"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  # Optional: restrict which source IPs can connect
  loadBalancerSourceRanges:
    - "203.0.113.0/24"

After applying this, run kubectl get svc my-app-lb and wait for the EXTERNAL-IP column to show a real IP. It shows <pending> for 30–90 seconds while GCP provisions the load balancer.
On GKE you can create a LoadBalancer accessible only from within your VPC by adding the annotation networking.gke.io/load-balancer-type: "Internal". This is useful for services that backend systems in the same VPC need to reach without going through a public IP.
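As a sketch, an internal LoadBalancer Service using that annotation could look like the following. The Service name my-app-internal-lb is hypothetical; the selector and ports simply mirror the external example above.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-internal-lb        # hypothetical name
  annotations:
    # Ask GKE for an internal (VPC-only) load balancer instead of a public one
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
```

Once provisioned, the EXTERNAL-IP column should show an address from your VPC subnet rather than a public IP.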
ExternalName
ExternalName maps a Service name inside the cluster to an external DNS name. Cluster DNS answers lookups for the Service name with a CNAME record pointing at the external hostname; no load balancer or proxy is provisioned. No traffic proxying happens: the client resolves the external hostname and connects to it directly.
When to use it: When your workloads need to reach an external dependency (a managed database, a third-party API, a Cloud SQL instance) and you want a stable cluster-internal name rather than hardcoding the external hostname in every application config file.
When not to use it: As a substitute for a real Service. ExternalName does no health checking, no load balancing, and no traffic proxying. It is purely a DNS alias.

apiVersion: v1
kind: Service
metadata:
  name: external-database
spec:
  type: ExternalName
  externalName: my-database.example.com

This type is especially useful during migrations. Configure your workloads to connect to the Service name from day one. Later, when you move the dependency in-cluster, swap the ExternalName Service for a ClusterIP Service pointing at the in-cluster replacement. No application config changes needed.
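To illustrate the migration step, the ExternalName Service could later be replaced by a ClusterIP Service with the same name. This is a sketch under assumptions: the app: database label and port 5432 are hypothetical and depend on what the in-cluster replacement actually runs.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: external-database   # same name as before, so client configs keep working
spec:
  type: ClusterIP
  selector:
    app: database           # assumed label on the in-cluster database Pods
  ports:
    - protocol: TCP
      port: 5432            # assumed port; should match what clients already dial
      targetPort: 5432
```

Because ExternalName is DNS-only and never remaps ports, the replacement Service should expose the same port number that clients were already connecting to.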
ClusterIP vs NodePort vs LoadBalancer vs ExternalName
| Type | Reachable from | Best for | Main downside |
|---|---|---|---|
| ClusterIP | Inside cluster only | Internal service-to-service communication | Not accessible externally |
| NodePort | Inside cluster + external via node IP | Local dev, testing, on-premises | Unstable node IPs, awkward ports, manual firewall rules |
| LoadBalancer | Inside cluster + external via cloud LB | Single TCP/UDP service exposed to internet | One load balancer per Service, cost compounds quickly |
| ExternalName | Inside cluster (DNS only) | Aliasing an external hostname inside the cluster | DNS alias only, no proxying, no health checks |
In production, most clusters use ClusterIP for everything internal, Ingress for HTTP/S external traffic, and LoadBalancer only for non-HTTP external services. Most teams end up with many ClusterIP Services and a small number of LoadBalancer Services for specific Layer 4 requirements. If you are deciding between Cloud Run and GKE, the GKE vs Cloud Run comparison covers the trade-offs.
Kubernetes Service vs Ingress
A Service and an Ingress solve related but different problems:
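- A Service gives a group of Pods a stable IP and DNS name. It handles Layer 4 routing (TCP/UDP), and it stays part of the picture when Ingress is used: the Ingress names Services as its backends, and each Service's selector and ports decide which Pods receive the traffic, even where the load balancer sends requests straight to Pod IPs (as GKE container-native load balancing does).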
- An Ingress is a Layer 7 routing layer. It examines HTTP request headers, hostnames, and paths to decide which Service to forward traffic to. On GKE, an Ingress provisions a Google Cloud HTTP(S) Load Balancer.
Ingress does not replace Services. It routes traffic to Services. The common production pattern on GKE for web apps:
- Define ClusterIP Services for each backend
- Define one Ingress resource with routing rules (host-based or path-based)
- GKE provisions one HTTP(S) Load Balancer that serves all of them
You have a React frontend and a Go API. Rather than creating two LoadBalancer Services (two public IPs, two billing entries), you create two ClusterIP Services and one Ingress. The Ingress routes api.example.com to the Go API Service and example.com to the frontend Service. One IP. One load balancer. One TLS certificate.
Ingress does not handle non-HTTP protocols. For TCP or UDP external exposure, use a LoadBalancer Service directly. For clusters in private GKE configurations, the networking model also affects how external load balancers and Ingress controllers reach your Pods.
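The React frontend and Go API example above could be expressed roughly as follows. Treat this as a sketch: the Ingress name, the Service names frontend and go-api, their port 80, and the example-com-tls Secret are all hypothetical, and the two ClusterIP Services are assumed to exist already.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress                   # hypothetical name
spec:
  tls:
    - hosts:
        - example.com
        - api.example.com
      secretName: example-com-tls     # assumed Secret with one certificate covering both hosts
  rules:
    - host: example.com
      http:
        paths:
          # GKE-style wildcard; many other controllers use path: / with pathType: Prefix
          - path: /*
            pathType: ImplementationSpecific
            backend:
              service:
                name: frontend        # assumed ClusterIP Service for the React app
                port:
                  number: 80
    - host: api.example.com
      http:
        paths:
          - path: /*
            pathType: ImplementationSpecific
            backend:
              service:
                name: go-api          # assumed ClusterIP Service for the Go API
                port:
                  number: 80
```

Both hostnames share a single Ingress, so adding a third HTTP service means adding one more rule rather than another load balancer.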
When to use each Service type
- ClusterIP — internal microservice communication, service discovery within the cluster, any workload that does not need external access
- NodePort — local development and testing, on-premises clusters without a cloud load balancer, VPC-internal access without a public IP
- LoadBalancer — exposing a single TCP or UDP service externally (database proxies, game servers, gRPC services, anything Ingress cannot handle)
- Ingress (backed by ClusterIP Services) — exposing one or many HTTP/S applications externally with host-based or path-based routing and TLS termination
- ExternalName — mapping a cluster-internal name to an external service hostname, useful during migrations or when referencing managed external dependencies
When choosing between running workloads as GKE Services or using a managed platform entirely, the Cloud Run vs GKE vs Compute Engine guide covers when each option makes sense.
Common mistakes with Kubernetes Services
- Selector does not match Pod labels. If the labels on your Pods do not exactly match the selector on your Service, the Endpoints list is empty and connections to the Service are refused or time out. Always verify with kubectl get endpoints <service-name>. If it shows <none>, the selector is not matching anything. Run kubectl get pods --show-labels to see what labels the Pods actually carry.
- Confusing port and targetPort. port is the port clients use to connect to the Service. targetPort is the port the container actually listens on. Getting these reversed makes connections fail (refused or timed out) without any error on the Service itself. Verify with kubectl describe svc <name> and compare to what your container exposes.
- Using NodePort for production internet traffic. Node IPs are unstable, the high port number is unusual for users, and there is no built-in TLS or health checking. Use LoadBalancer or Ingress for any production external traffic on GKE.
- Creating one LoadBalancer Service per HTTP microservice. Each LoadBalancer Service creates a separate cloud load balancer with its own IP and ongoing cost. For HTTP/S services, a single Ingress routes to many backend ClusterIP Services: one load balancer, one IP, one billing entry.
- Forgetting that readiness probes affect endpoints. A Service only routes to Pods that pass their readiness probe. If all Pods are running and their labels match but the Endpoints list is empty, a failing readiness probe is the most likely cause. Check with kubectl describe pod <name> and look at the readiness probe status and recent events.
- Expecting a Service to deploy your application. A Service does not create Pods. It routes traffic to existing Pods. If no Pods are running, the Service exists but nothing responds. Create a Deployment or another workload first, then point a Service at it.
- Relying on the ClusterIP staying the same after recreation. Deleting and recreating a Service usually assigns a different ClusterIP. If other services hardcode the IP rather than using the DNS name, they will break. Always use DNS names in application configuration.
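One way to reduce port and targetPort mix-ups is to name the container port and reference that name from the Service. The sketch below assumes a hypothetical Pod labelled app: my-app whose container listens on 8080; the Pod name, the image, and the port name http are assumptions too.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app-pod                    # hypothetical; normally created by a Deployment
  labels:
    app: my-app
spec:
  containers:
    - name: my-app
      image: example.com/my-app:1.0   # placeholder image
      ports:
        - name: http                  # name the port the container listens on
          containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
    - port: 80                        # what clients dial
      targetPort: http                # resolved by name to containerPort 8080
```

With a named targetPort, changing the container's listening port only requires editing the Pod template; the Service definition stays the same.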
How to troubleshoot a Kubernetes Service
Step 1: Check the Service exists and has an IP

kubectl get svc
kubectl get svc <service-name>

Check the TYPE, CLUSTER-IP, and EXTERNAL-IP columns. A LoadBalancer showing <pending> for more than 5 minutes may indicate a GCP quota issue or missing IAM permissions for the cluster’s service account.
Step 2: Check the Endpoints

kubectl get endpoints <service-name>

If this shows <none> or an empty addresses list, no ready Pods match the selector. This is the single most common cause of a Service not routing traffic.
Step 3: Verify Pod labels match the selector

kubectl get pods --show-labels
kubectl describe svc <service-name>

Compare the Selector field in the Service description to the labels on your Pods. A single typo (app: myapp vs app: my-app) breaks the match entirely and produces no error.
Step 4: Check Pod readiness

kubectl get pods
kubectl describe pod <pod-name>

Look at the Ready column and the readiness probe events. A Pod that is Running but 0/1 ready is excluded from the Endpoints list and the Service will not route to it.
Step 5: Check application logs

kubectl logs <pod-name>
kubectl logs <pod-name> --previous  # if the Pod has recently crashed

The Service may be routing correctly to the Pod, but the application is erroring internally. Logs will show this. You can also check GKE cluster logging if logs have been shipped to Cloud Logging.
Step 6: Test connectivity directly
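
# Forward a local port to the Service (kubectl picks one Pod behind it)
kubectl port-forward svc/<service-name> 8080:80
# Forward directly to a Pod, bypassing the Service entirely
kubectl port-forward pod/<pod-name> 8080:8080

If port-forward to the Pod works but the Service does not respond, the problem is in the selector, the Endpoints, or the port mapping, not the application. If neither works, the application itself is the issue. This split test saves a lot of debugging time. Keep in mind that port-forward to a Service tunnels through the API server to a single selected Pod, so it exercises the selector and the port-to-targetPort mapping but not the ClusterIP or kube-proxy rules.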
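Because port-forward tunnels through the API server, it can help to also test the Service from inside the cluster. A throwaway Pod along these lines is one option; the Pod name svc-debug is hypothetical, and backend-api on port 80 stands in for your own Service name and port.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: svc-debug                 # hypothetical name; delete the Pod when done
spec:
  restartPolicy: Never            # run the check once, then stop
  containers:
    - name: debug
      image: busybox:1.36
      command:
        - sh
        - -c
        # Resolve the Service by its DNS name and fetch over the ClusterIP, 5 s timeout
        - wget -q -O - -T 5 http://backend-api:80/
```

Read the result with kubectl logs svc-debug, then clean up with kubectl delete pod svc-debug. A DNS failure, a refused connection, and a timeout each point at a different layer.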
Step 7: Check for controller events

kubectl describe svc <service-name>

The Events section shows recent activity: load balancer provisioning errors, GCP quota failures, and other issues reported by the cloud controller manager.
Frequently asked questions
What is a Kubernetes Service?
A Kubernetes Service is a stable network endpoint that routes traffic to one or more Pods. Because Pod IPs change every time a Pod is replaced, a Service provides a fixed virtual IP and DNS name. Kubernetes automatically keeps the list of Pods behind the Service up to date as they start and stop.
What is the difference between ClusterIP and LoadBalancer?
ClusterIP gives the Service an IP reachable only inside the cluster. LoadBalancer extends this by provisioning a cloud load balancer with a public IP. Use ClusterIP for internal communication and LoadBalancer when you need to expose a non-HTTP service to the internet. For HTTP/S, an Ingress is usually the better choice.
Do I need Ingress or a LoadBalancer Service to expose my app?
It depends on the protocol. For TCP or UDP services, LoadBalancer is the straightforward choice. For HTTP/S apps, particularly multiple services, use an Ingress resource backed by ClusterIP Services. Ingress gives you host-based and path-based routing, TLS termination, and shared load balancer infrastructure, which is significantly cheaper than one LoadBalancer per service.
Why is my Kubernetes Service not routing traffic?
The most common causes are: the label selector does not match your Pod labels, or the Pods are failing their readiness probe. Run kubectl get endpoints <service-name> and if it shows <none>, no Pods are matching. Run kubectl get pods --show-labels to verify the labels, and kubectl describe pod to check readiness probe status.
Can one Service route to multiple Pods?
Yes. A Service routes to all Pods whose labels match its selector and that pass their readiness probe. New connections are spread across those Pods: kube-proxy in iptables mode picks a backend at random for each connection, while IPVS mode defaults to round-robin. This is exactly how Deployments with multiple replicas spread load: all replicas share the same Pod labels, so the Service sends traffic to whichever Pods are currently ready.
Summary
- Pods have ephemeral IP addresses. A Service provides a stable virtual IP and DNS name that persists regardless of Pod replacement.
- Label selectors connect a Service to its Pods. Kubernetes automatically maintains the Endpoints list as Pods come and go. Pods must pass their readiness probe to be included.
- ClusterIP (the default) is for internal cluster communication only, not reachable from outside the cluster.
- NodePort opens a port on every node (30000–32767). Useful for development and testing, not suitable for production internet traffic.
- LoadBalancer provisions a cloud load balancer with a stable public IP. On GKE this is an external passthrough Network Load Balancer (Layer 4). Each Service gets its own load balancer, so use sparingly.
- ExternalName maps a cluster-internal Service name to an external DNS name through a CNAME record. Useful during migrations or when referencing external dependencies.
- For HTTP/S web apps on GKE, the standard pattern is ClusterIP Services plus a single Ingress resource: one load balancer routes to many services.