Kubernetes Services Explained: ClusterIP, NodePort, and LoadBalancer

A Kubernetes Service is a stable network endpoint that routes traffic to your Pods. Every time Kubernetes deploys or replaces a Pod, that Pod gets a new IP address. No other component can safely rely on that IP. A Service solves this by sitting in front of your Pods and giving clients one address that never changes, regardless of what happens to the Pods behind it.

Services are not optional. Without them, there is no reliable way to connect components inside your cluster or expose applications to the outside world. This guide explains how Services work, covers every service type in Kubernetes, and helps you choose the right one for your GKE workload.

By the end of this guide you will understand what a Kubernetes Service is, how ClusterIP, NodePort, LoadBalancer, and ExternalName work, when to use each, how Services relate to Ingress, and how to troubleshoot a Service that is not routing traffic.

Kubernetes Service in simple terms

A Service is a load balancer built into your cluster. You point it at a group of Pods using labels, and it gives that group one stable address. Clients connect to that address and Kubernetes figures out which Pod handles the request.

Analogy

Think of a Service like a restaurant’s single phone number. The kitchen might have five chefs on Monday and three on Thursday, and they rotate constantly. Customers never need to know which chef is available. They always dial the same number and the Service routes the call to whoever is ready. If a chef leaves and is replaced, the phone number stays the same and callers notice nothing.

What matters practically: you write your application config once, pointing at my-backend (the Service name), and it keeps working through deployments, crashes, and scale events without any changes on your part.

Why Kubernetes Services exist

Pods in Kubernetes are designed to be temporary. Kubernetes replaces them constantly: during rolling updates from a Deployment, after crashes, when a node is drained for maintenance, and when autoscaling removes replicas or scales the cluster down.

Every time a Pod is created, it gets a new IP address from the cluster’s pod network range. A replacement Pod does not inherit its predecessor’s address, and the old address can later be assigned to an unrelated Pod. If your frontend holds a reference to 10.4.2.17 (a backend Pod IP), that reference becomes invalid the next time the backend is updated or the Pod is replaced.

A Service adds a stable layer between consumers and the Pods they need to reach:

  • Stable virtual IP (ClusterIP): The Service gets an IP address that never changes for its lifetime, regardless of Pod churn.
  • Stable DNS name: The cluster DNS service (CoreDNS on many distributions) resolves the Service name automatically. Within the same namespace, my-backend resolves to the ClusterIP. From any namespace, my-backend.my-namespace.svc.cluster.local works.
  • Automatic endpoint tracking: Kubernetes continuously watches for Pods matching the Service’s label selector. Ready Pods are added to the Endpoints list; terminating or unhealthy Pods are removed.

Configure your application once to connect to the Service name, and routing stays correct through every deployment, crash, and scale event.
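One way to see the DNS name in action is a throwaway Pod that looks the name up from inside the cluster. The following is a minimal sketch, assuming a Service called my-backend already exists in a namespace called my-namespace; the Pod name dns-check and the busybox image tag are illustrative choices, not requirements.

```yaml
# Hypothetical one-off Pod that resolves a Service name from inside the cluster.
apiVersion: v1
kind: Pod
metadata:
  name: dns-check            # illustrative name
  namespace: my-namespace    # assumed namespace of the Service
spec:
  restartPolicy: Never       # run the lookup once, then stop
  containers:
    - name: dns-check
      image: busybox:1.36
      # Look up the fully-qualified Service name via the cluster DNS service
      command: ["nslookup", "my-backend.my-namespace.svc.cluster.local"]
```

After the Pod completes, kubectl logs dns-check -n my-namespace should print the address the name resolved to, which you can compare with the CLUSTER-IP column of kubectl get svc.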

How a Kubernetes Service works

Here is the sequence from YAML to working traffic:

  1. Labels on Pods: When you define a Deployment, you attach labels to the Pod template, for example app: api and tier: backend. Every Pod created by that Deployment carries those labels.

  2. Selector on the Service: Your Service spec includes a selector that specifies which labels to match. Any Pod carrying those labels in the same namespace is a candidate endpoint.

  3. Endpoints object: Kubernetes automatically maintains an Endpoints object (or EndpointSlice in newer clusters) for each Service. It lists the IP addresses and ports of all Pods that match the selector and are currently passing their readiness probe. You never create or update this object yourself.

  4. kube-proxy routing: On each node, kube-proxy watches the Endpoints object and programs iptables (or IPVS) rules to forward traffic destined for the Service’s ClusterIP to one of the endpoint Pod IPs. This happens at the kernel level with no userspace proxy in the data path.

  5. DNS resolution: The cluster DNS service resolves Service names to ClusterIPs. Many Kubernetes distributions run CoreDNS for this; GKE provides kube-dns by default, with Cloud DNS as an option. Use the short name my-backend within the same namespace, or the fully-qualified name my-backend.my-namespace.svc.cluster.local from anywhere in the cluster. No hardcoded IPs needed.
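The label and selector steps above can be sketched as a Deployment and a Service side by side. This is a minimal example under stated assumptions: the name api, the replica count, and the image reference are placeholders; only the app: api and tier: backend labels come from the steps above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                    # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:               # must match the Pod template labels below
      app: api
      tier: backend
  template:
    metadata:
      labels:                  # step 1: labels carried by every Pod
        app: api
        tier: backend
    spec:
      containers:
        - name: api
          image: example.com/api:1.0   # placeholder image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:                    # step 2: selects Pods carrying both labels
    app: api
    tier: backend
  ports:
    - port: 80                 # port clients use on the Service
      targetPort: 8080         # port the container listens on
```

Because the Service selector and the Pod template labels are identical, every ready replica becomes an endpoint of the Service.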

On GKE, the networking model adds VPC-native pod addressing on top of this, which integrates cleanly with Google Cloud load balancers for external Service types.
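For orientation, this is roughly what an EndpointSlice maintained by Kubernetes looks like for a hypothetical Service named api with one ready Pod. It is an illustrative sketch of the object's shape, not something you would write or apply yourself; the generated name suffix and the Pod IP are made-up values.

```yaml
# Illustrative only: Kubernetes creates and updates this object automatically.
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: api-abc12                       # generated name; suffix is made up
  labels:
    kubernetes.io/service-name: api     # ties the slice to its Service
addressType: IPv4
ports:
  - protocol: TCP
    port: 8080                          # the Service's targetPort
endpoints:
  - addresses:
      - "10.4.2.17"                     # example Pod IP
    conditions:
      ready: true                       # false while the readiness probe fails
```

You can inspect the real objects with kubectl get endpointslices -l kubernetes.io/service-name=<service-name>.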

ClusterIP

ClusterIP is the default Service type. It assigns the Service a virtual IP address that is only reachable from within the cluster. Nothing outside the cluster — no users, no external systems — can connect to a ClusterIP Service directly.

When to use it: Any time one workload inside your cluster needs to talk to another. Frontend to backend API, application to cache, one microservice to another. ClusterIP is the right choice for the vast majority of Services.

When not to use it: When external access is needed. ClusterIP has no external reachability by design.

apiVersion: v1
kind: Service
metadata:
  name: backend-api
spec:
  type: ClusterIP
  selector:
    app: my-app
    tier: backend
  ports:
    - protocol: TCP
      port: 80        # Port clients connect to on the Service
      targetPort: 8080  # Port the container is actually listening on

Once created, this Service is reachable at:

  • DNS (same namespace): backend-api
  • DNS (any namespace): backend-api.default.svc.cluster.local
  • IP: the assigned ClusterIP (e.g. 10.96.45.23)

Tip

Always use the DNS name in your application config, never the IP. DNS names survive the Service being deleted and recreated. The ClusterIP is likely to change if the Service is ever recreated, which will break anything that hardcoded it.

Omitting type in the spec defaults to ClusterIP. You do not need to specify it explicitly.
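To illustrate the tip about DNS names, here is a sketch of a client workload that reaches backend-api through its DNS name rather than its IP. The Deployment name frontend, the image, and the BACKEND_URL variable are hypothetical; how your application reads its configuration may differ.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend                 # hypothetical client workload
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: example.com/frontend:1.0   # placeholder image
          env:
            # Hypothetical variable name; the value is the Service DNS name
            # plus the Service port, never the ClusterIP.
            - name: BACKEND_URL
              value: "http://backend-api.default.svc.cluster.local:80"
```

If the Service is deleted and recreated with a new ClusterIP, this configuration keeps working unchanged.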

NodePort

NodePort extends ClusterIP by opening a static port in the range 30000–32767 on every node in the cluster. External traffic can reach the Service by sending requests to any node’s IP at that port: <node-ip>:<node-port>.

When to use it: Local development, testing on clusters without a cloud provider, on-premises deployments, or accessing a service from within the same VPC without going through a public load balancer.

When not to use it: Production internet-facing traffic. Node IPs are unstable on GKE (nodes are replaced during upgrades and autoscaling events), the high port number is not user-friendly, and you have to manage firewall rules yourself.

apiVersion: v1
kind: Service
metadata:
  name: my-app-nodeport
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
      nodePort: 30080   # Optional: omit to let Kubernetes assign one automatically

Not for production

NodePort opens the port on every node, including nodes not running your Pods. On GKE with node autoscaling, node IPs change whenever nodes are replaced. There is no built-in TLS, no health routing, and no stable entry point. For production external traffic on GKE, use a LoadBalancer Service or an Ingress resource instead.

A NodePort Service also has a ClusterIP, so it remains reachable inside the cluster the same way as a ClusterIP Service.

LoadBalancer

LoadBalancer extends NodePort by instructing GCP to provision an external load balancer automatically and point it at the NodePort. On GKE, this creates a Google Cloud external passthrough Network Load Balancer (Layer 4) with a stable public IP address.

When to use it: Exposing a single TCP or UDP service to the internet: a database proxy, a gRPC service, a game server, or anything that does not work well behind an HTTP(S) load balancer.

When not to use it: When you need to expose many HTTP/S services. Each LoadBalancer Service creates a separate cloud load balancer with its own IP and cost. For HTTP/S, use an Ingress resource backed by ClusterIP Services instead.

apiVersion: v1
kind: Service
metadata:
  name: my-app-lb
  annotations:
    # Optional: use a pre-reserved static IP
    networking.gke.io/load-balancer-ip-addresses: "my-static-ip"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  # Optional: restrict which source IPs can connect
  loadBalancerSourceRanges:
    - "203.0.113.0/24"

After applying this, run kubectl get svc my-app-lb and wait for the EXTERNAL-IP column to show a real IP. It typically shows <pending> for 30–90 seconds while GCP provisions the load balancer.
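The EXTERNAL-IP column is read from the Service's status. As a rough illustration, the relevant part of kubectl get svc my-app-lb -o yaml looks like the excerpt below once provisioning has finished. This is a fragment for reading, not a manifest to apply, and the address is a placeholder from a documentation range.

```yaml
# Excerpt only; other fields omitted.
spec:
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
      - ip: 203.0.113.10   # placeholder address; yours will differ
```

While the column shows <pending>, the status.loadBalancer field has no ingress entries yet.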

Internal LoadBalancer

On GKE you can create a LoadBalancer accessible only from within your VPC by adding the annotation networking.gke.io/load-balancer-type: "Internal". This is useful for services that backend systems in the same VPC need to reach without going through a public IP.
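A minimal sketch of the internal variant follows, reusing the selector and ports from the LoadBalancer example above. The Service name my-app-internal-lb is an illustrative choice.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-internal-lb     # illustrative name
  annotations:
    # Ask GKE for an internal load balancer instead of a public one
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
```

With this annotation, the address reported for the Service should be an internal address from your VPC rather than a public IP.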

ExternalName

ExternalName maps a Service name inside the cluster to an external DNS name. Cluster DNS answers lookups for the Service name with a CNAME record pointing at the external hostname; no load balancer or proxy is provisioned. No traffic proxying happens: DNS resolution simply returns the external hostname.

When to use it: When your workloads need to reach an external dependency (a managed database, a third-party API, a Cloud SQL instance) and you want a stable cluster-internal name rather than hardcoding the external hostname in every application config file.

When not to use it: As a substitute for a real Service. ExternalName does no health checking, no load balancing, and no traffic proxying. It is purely a DNS alias.

apiVersion: v1
kind: Service
metadata:
  name: external-database
spec:
  type: ExternalName
  externalName: my-database.example.com

Migration pattern

This type is especially useful during migrations. Configure your workloads to connect to the Service name from day one. Later, when you move the dependency in-cluster, swap the ExternalName Service for a ClusterIP Service pointing at the in-cluster replacement. No application config changes needed.
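As a sketch of the second half of that migration, the Service below keeps the name external-database but selects in-cluster Pods instead of aliasing an external hostname. The app: database label and port 5432 are assumptions for illustration; use whatever labels and port your in-cluster replacement actually has.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: external-database      # same name the workloads already use
spec:
  type: ClusterIP
  selector:
    app: database              # hypothetical label on the in-cluster Pods
  ports:
    - protocol: TCP
      port: 5432               # assumed port; keep the one clients already use
      targetPort: 5432
```

Because an ExternalName Service does not remap ports, clients were already connecting on the dependency's real port, so keeping the same port value on the new Service avoids any client-side change.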

ClusterIP vs NodePort vs LoadBalancer vs ExternalName

| Type | Reachable from | Best for | Main downside |
|---|---|---|---|
| ClusterIP | Inside cluster only | Internal service-to-service communication | Not accessible externally |
| NodePort | Inside cluster + external via node IP | Local dev, testing, on-premises | Unstable node IPs, awkward ports, manual firewall rules |
| LoadBalancer | Inside cluster + external via cloud LB | Single TCP/UDP service exposed to internet | One load balancer per Service, cost compounds quickly |
| ExternalName | Inside cluster (DNS only) | Aliasing an external hostname inside the cluster | DNS alias only, no proxying, no health checks |

What real GKE clusters look like

In production, most clusters use ClusterIP for everything internal, Ingress for HTTP/S external traffic, and LoadBalancer only for non-HTTP external services. Most teams end up with many ClusterIP Services and a small number of LoadBalancer Services for specific Layer 4 requirements. If you are deciding between Cloud Run and GKE, the GKE vs Cloud Run comparison covers the trade-offs.

Kubernetes Service vs Ingress

A Service and an Ingress solve related but different problems:

  • A Service gives a group of Pods a stable IP and DNS name. It handles Layer 4 routing (TCP/UDP) and is always in the traffic path, even when Ingress is used.
  • An Ingress is a Layer 7 routing layer. It examines HTTP request headers, hostnames, and paths to decide which Service to forward traffic to. On GKE, an Ingress provisions a Google Cloud HTTP(S) Load Balancer.

Ingress does not replace Services. It routes traffic to Services. The common production pattern on GKE for web apps:

  1. Define ClusterIP Services for each backend
  2. Define one Ingress resource with routing rules (host-based or path-based)
  3. GKE provisions one HTTP(S) Load Balancer that serves all of them

Example scenario

You have a React frontend and a Go API. Rather than creating two LoadBalancer Services (two public IPs, two billing entries), you create two ClusterIP Services and one Ingress. The Ingress routes api.example.com to the Go API Service and example.com to the frontend Service. One IP. One load balancer. One TLS certificate.
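The scenario above could be expressed as a single Ingress along these lines. This is a sketch under assumptions: the ClusterIP Services are named frontend and api and both listen on port 80, the TLS Secret example-com-tls is hypothetical and must already hold a certificate covering both hostnames, and the Ingress class is left to the cluster default.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web                          # illustrative name
spec:
  tls:
    - hosts:
        - example.com
        - api.example.com
      secretName: example-com-tls    # hypothetical Secret with the certificate
  rules:
    - host: example.com              # frontend traffic
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend       # assumed ClusterIP Service name
                port:
                  number: 80
    - host: api.example.com          # API traffic
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api            # assumed ClusterIP Service name
                port:
                  number: 80
```

Both hostnames share one load balancer and one IP; the host field decides which Service receives each request.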

Ingress does not handle non-HTTP protocols. For TCP or UDP external exposure, use a LoadBalancer Service directly. For clusters in private GKE configurations, the networking model also affects how external load balancers and Ingress controllers reach your Pods.

When to use each Service type

  • ClusterIP — internal microservice communication, service discovery within the cluster, any workload that does not need external access
  • NodePort — local development and testing, on-premises clusters without a cloud load balancer, VPC-internal access without a public IP
  • LoadBalancer — exposing a single TCP or UDP service externally (database proxies, game servers, gRPC services, anything Ingress cannot handle)
  • Ingress (backed by ClusterIP Services) — exposing one or many HTTP/S applications externally with host-based or path-based routing and TLS termination
  • ExternalName — mapping a cluster-internal name to an external service hostname, useful during migrations or when referencing managed external dependencies

When choosing between running workloads as GKE Services or using a managed platform entirely, the Cloud Run vs GKE vs Compute Engine guide covers when each option makes sense.

Common mistakes with Kubernetes Services

  1. Selector does not match Pod labels. If the labels on your Pods do not exactly match the selector on your Service, the Endpoints list is empty and the Service returns connection refused. Always verify with kubectl get endpoints <service-name>. If it shows <none>, the selector is not matching anything. Run kubectl get pods --show-labels to see what labels the Pods actually carry.
  2. Confusing port and targetPort. port is the port clients use to connect to the Service. targetPort is the port the container actually listens on. Getting these reversed typically causes refused connections or timeouts, with no error reported on the Service itself. Verify with kubectl describe svc <name> and compare to what your container exposes.
  3. Using NodePort for production internet traffic. Node IPs are unstable, the high port number is unusual for users, and there is no built-in TLS or health checking. Use LoadBalancer or Ingress for any production external traffic on GKE.
  4. Creating one LoadBalancer Service per HTTP microservice. Each LoadBalancer Service creates a separate cloud load balancer with its own IP and ongoing cost. For HTTP/S services, a single Ingress routes to many backend ClusterIP Services: one load balancer, one IP, one billing entry.
  5. Forgetting that readiness probes affect endpoints. A Service only routes to Pods that pass their readiness probe. If all Pods are running but the Endpoints list is empty, a failing readiness probe is the most likely cause. Check with kubectl describe pod <name> and look at the readiness probe status and recent events.
  6. Expecting a Service to deploy your application. A Service does not create Pods. It routes traffic to existing Pods. If no Pods are running, the Service exists but nothing responds. Create a Deployment or another workload first, then point a Service at it.
  7. Relying on the ClusterIP staying the same after recreation. Deleting and recreating a Service usually assigns a different ClusterIP. If other services hardcode the IP rather than using the DNS name, they will break. Always use DNS names in application configuration.
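One way to reduce the risk of mistake 2 is to name the container port and reference that name from the Service, so the numeric value lives in one place. The sketch below uses a bare Pod for brevity; the names my-app and http and the image are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  containers:
    - name: my-app
      image: example.com/my-app:1.0   # placeholder image
      ports:
        - name: http                  # named container port
          containerPort: 8080         # port the process listens on
---
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
    - port: 80            # port clients connect to on the Service
      targetPort: http    # resolved to containerPort 8080 by name
```

If the container later moves to a different port, only containerPort changes and the Service keeps pointing at the right place.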

How to troubleshoot a Kubernetes Service

Step 1: Check the Service exists and has an IP

kubectl get svc
kubectl get svc <service-name>

Check the TYPE, CLUSTER-IP, and EXTERNAL-IP columns. A LoadBalancer showing <pending> for more than 5 minutes may indicate a GCP quota issue or missing IAM permissions for the cluster’s service account.

Step 2: Check the Endpoints

kubectl get endpoints <service-name>

If this shows <none> or an empty addresses list, no ready Pods are matching the selector: either the labels do not match or the matching Pods are failing their readiness probe. This is the single most common cause of a Service not routing traffic.

Step 3: Verify Pod labels match the selector

kubectl get pods --show-labels
kubectl describe svc <service-name>

Compare the Selector field in the Service description to the labels on your Pods. A single typo (app: myapp vs app: my-app) breaks the match entirely and produces no error.

Step 4: Check Pod readiness

kubectl get pods
kubectl describe pod <pod-name>

Look at the Ready column and the readiness probe events. A Pod that is Running but 0/1 ready is excluded from the Endpoints list and the Service will not route to it.
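For reference, a readiness probe is declared on the container in the Pod template. The sketch below assumes an HTTP application that answers on a /healthz path on port 8080; the path, timings, names, and image are illustrative and should be adapted to your application.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: example.com/my-app:1.0   # placeholder image
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz              # hypothetical health endpoint
              port: 8080
            initialDelaySeconds: 5        # wait before the first check
            periodSeconds: 10             # check every 10 seconds
            failureThreshold: 3           # mark not ready after 3 failures
```

If the probe path or port is wrong, the Pod stays Running but 0/1 ready, which is exactly the state that keeps it out of the Endpoints list.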

Step 5: Check application logs

kubectl logs <pod-name>
kubectl logs <pod-name> --previous  # if the Pod has recently crashed

The Service may be routing correctly to the Pod, but the application is erroring internally. Logs will show this. You can also check GKE cluster logging if logs have been shipped to Cloud Logging.

Step 6: Test connectivity directly

# Forward a local port to the Service
kubectl port-forward svc/<service-name> 8080:80

# Forward directly to a Pod, bypassing the Service entirely
kubectl port-forward pod/<pod-name> 8080:8080

Isolation trick

If port-forward to the Pod works but the Service does not respond, the problem is in the selector, the port mapping, or the Endpoints, not the application. If neither works, the application itself is the issue. This split test saves a lot of debugging time.

Step 7: Check for controller events

kubectl describe svc <service-name>

The Events section shows recent activity: load balancer provisioning errors, GCP quota failures, and other issues reported by the cloud controller manager.

Frequently asked questions

What is a Kubernetes Service?

A Kubernetes Service is a stable network endpoint that routes traffic to one or more Pods. Because Pod IPs change every time a Pod is replaced, a Service provides a fixed virtual IP and DNS name. Kubernetes automatically keeps the list of Pods behind the Service up to date as they start and stop.

What is the difference between ClusterIP and LoadBalancer?

ClusterIP gives the Service an IP reachable only inside the cluster. LoadBalancer extends this by provisioning a cloud load balancer with a public IP. Use ClusterIP for internal communication and LoadBalancer when you need to expose a non-HTTP service to the internet. For HTTP/S, an Ingress is usually the better choice.

Do I need Ingress or a LoadBalancer Service to expose my app?

It depends on the protocol. For TCP or UDP services, LoadBalancer is the straightforward choice. For HTTP/S apps, particularly multiple services, use an Ingress resource backed by ClusterIP Services. Ingress gives you host-based and path-based routing, TLS termination, and shared load balancer infrastructure, which is significantly cheaper than one LoadBalancer per service.

Why is my Kubernetes Service not routing traffic?

The most common causes are: the label selector does not match your Pod labels, or the Pods are failing their readiness probe. Run kubectl get endpoints <service-name> and if it shows <none>, no Pods are matching. Run kubectl get pods --show-labels to verify the labels, and kubectl describe pod to check readiness probe status.

Can one Service route to multiple Pods?

Yes. A Service routes to all Pods whose labels match its selector and that pass their readiness probe. kube-proxy spreads new connections across those Pods: in the default iptables mode it picks a ready Pod at random for each connection, while IPVS mode defaults to round-robin. This is exactly how Deployments with multiple replicas spread load: all replicas share the same Pod labels, so the Service sends traffic to whichever Pods are currently ready.

Last verified: 23 March 2026. Cloud services change frequently. Verify details against official documentation before making infrastructure decisions.