GCP Internal Load Balancers Explained | Private Traffic Routing

An internal load balancer gives your private services a single stable IP address that never leaves your VPC. When a web tier talks to an app tier, or one microservice calls another, that request goes to the ILB’s private IP. The load balancer handles routing, health checking, and failover. Nothing is exposed to the internet.

Simple explanation

A load balancer distributes incoming requests across a group of backend servers. An internal load balancer does this entirely within your private network.

The “internal” part means the frontend IP is a private address from your VPC network, such as 10.128.0.5. There is no public IP, so internet clients cannot reach it, and no setting on the load balancer makes it internet-accessible.
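
To make "private address" concrete, here is a minimal bash sketch that checks whether an IPv4 address falls in one of the three RFC 1918 blocks (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). The function name is_rfc1918 is made up for this illustration, and the check only covers RFC 1918; it says nothing about which ranges your own subnets actually use.

```shell
#!/usr/bin/env bash
# Succeed (exit 0) if the IPv4 address is in an RFC 1918 private range.
is_rfc1918() {
  local a b c d octet
  IFS=. read -r a b c d <<< "$1"
  # Reject anything that is not four decimal octets in the range 0-255.
  for octet in "$a" "$b" "$c" "$d"; do
    [[ $octet =~ ^[0-9]{1,3}$ ]] || return 1
    (( 10#$octet <= 255 )) || return 1
  done
  # Force base 10 so an octet such as "08" is not read as octal.
  a=$((10#$a)); b=$((10#$b))
  (( a == 10 )) && return 0
  (( a == 172 && b >= 16 && b <= 31 )) && return 0
  (( a == 192 && b == 168 )) && return 0
  return 1
}

for ip in 10.128.0.5 8.8.8.8; do
  if is_rfc1918 "$ip"; then
    echo "$ip: private (RFC 1918)"
  else
    echo "$ip: not RFC 1918"
  fi
done
```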

Here is the problem it solves. Say you have three app-tier VMs running the same service. Without a load balancer, a web-tier VM needs to know the IP of each app VM, track which ones are healthy, and choose one manually. If an app VM dies, the web VM keeps trying to reach it. An internal load balancer removes that complexity. The web tier connects to one stable private IP. The load balancer distributes the requests, monitors health, and skips any VM that has stopped responding.
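
For a sense of what the web tier would otherwise have to do itself, the sketch below hard-codes a backend list, probes each address, and returns the first one that answers. The addresses, the port, and the function names (http_probe, pick_backend) are all hypothetical, and a real client would also need retries, caching, and some way to learn about new backends.

```shell
#!/usr/bin/env bash
# Hypothetical app-tier addresses that every web VM would have to keep current.
BACKENDS=(10.128.0.11 10.128.0.12 10.128.0.13)
PORT=8080

# Succeed if the backend answers an HTTP request within 2 seconds.
http_probe() {
  curl --silent --fail --output /dev/null --max-time 2 "http://$1:$2/"
}

# Print the first backend that passes the probe; fail if none does.
# The probe command is a parameter so it can be replaced or stubbed.
pick_backend() {
  local probe=${1:-http_probe} ip
  for ip in "${BACKENDS[@]}"; do
    if "$probe" "$ip" "$PORT"; then
      echo "$ip"
      return 0
    fi
  done
  return 1
}
```

A caller would run something like target=$(pick_backend) before every request. With an internal load balancer in place, all of this collapses to connecting to one fixed private IP.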

Analogy

Think of an internal load balancer as an internal extension number at a large company. Employees dial one shared number to reach the support team. The switchboard routes each call to whoever is available. External callers never see this number; it only works from inside the building.

Internal load balancer vs external load balancer

	Internal LB	External LB
Frontend IP type	Private (RFC 1918)	Public
Reachable from internet?	No, by design	Yes
Who connects to it?	VMs, services, or on-prem systems inside the VPC or hybrid network	Any internet client
Typical use	App tier, internal API, private microservices	Web front end, public API
Regional or global?	Regional (standard types)	Both options available

In a typical layered architecture, both types work together. An external load balancer handles public internet traffic at the web tier. Behind it, an internal load balancer distributes requests across the app tier. The app tier is never exposed directly. For global internet-facing traffic, see Global Load Balancing.

How internal load balancers work

The request flow through an internal load balancer goes like this:

1. A VM inside your VPC sends a request to the internal load balancer’s private IP and port (for example, 10.128.0.5:8080).
2. GCP intercepts the traffic at the network layer and matches it against a forwarding rule, which defines the IP, port, and protocol for this load balancer.
3. The forwarding rule points to a backend service. The backend service holds the list of backend instance groups and a reference to the health check.
4. The load balancer checks which backend VMs are currently passing health checks.
5. The request is forwarded to one of the healthy backends. Unhealthy backends are skipped entirely.
6. If a backend VM goes down, the load balancer detects the health check failure and stops routing new requests to it. No manual intervention needed.
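
The same chain can be followed from the command line on an existing load balancer. The commands below are a sketch: the resource names (app-internal-lb, app-backend) and the region are placeholders, and the field names in --format should be checked against the output of a plain describe in your project.

```shell
# 1. Forwarding rule: the frontend IP, ports, and the backend service it targets.
gcloud compute forwarding-rules describe app-internal-lb \
  --region=us-central1 \
  --format="yaml(IPAddress,IPProtocol,ports,backendService)"

# 2. Backend service: the attached backends and the referenced health check.
gcloud compute backend-services describe app-backend \
  --region=us-central1 \
  --format="yaml(protocol,backends,healthChecks)"

# 3. Current health state of each backend VM as the load balancer sees it.
gcloud compute backend-services get-health app-backend \
  --region=us-central1
```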

Note

How the backend sees the request depends on the type. With a passthrough load balancer, packets are forwarded unchanged and the backend sees the original source IP of the calling VM. With an application load balancer, the connection terminates at a GCP-managed Envoy proxy, which opens a new connection to the backend. The backend sees the proxy IP, not the original caller, but can retrieve the real source from the X-Forwarded-For HTTP header. If your application rate-limits or logs by client IP, this distinction matters.
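
On the backend side, recovering the caller from X-Forwarded-For can look like the sketch below. It assumes the proxy appends the caller's address as the last entry in the header, which should be confirmed against the documentation for your load balancer type; earlier entries may have been supplied by the client and should not be trusted. The function name client_ip_from_xff is hypothetical.

```shell
#!/usr/bin/env bash
# Print the last address in an X-Forwarded-For header value.
# Assumption: the load balancer's proxy appends the caller's IP as the
# final entry; anything before it came from the client.
client_ip_from_xff() {
  local header=$1 last
  last=${header##*,}            # keep the text after the final comma
  last=${last//[[:space:]]/}    # drop any whitespace around the address
  [[ -n $last ]] && echo "$last"
}

client_ip_from_xff "203.0.113.7, 10.128.0.21"
```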

Types of internal load balancers in GCP

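GCP offers two main internal load balancer types, passthrough and application. (An internal proxy Network Load Balancer also exists for proxied TCP traffic; it is not covered here.) The configuration is not interchangeable, and switching type later means rebuilding from scratch.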

Tip

Pick the type before you start building. A passthrough and an application load balancer use different gcloud schemes, different backend protocols, and different health check setups. Getting this decision right early saves significant rework.

Internal passthrough network load balancer (layer 4)

Operates at layer 4 (TCP/UDP). Forwards packets directly to backends without inspecting their contents. The backend VM receives the original client IP as the source address. Use this for non-HTTP protocols, UDP traffic, database connections, or any case where the backend needs the real source IP.

Configuration uses --load-balancing-scheme=INTERNAL.

Internal application load balancer (layer 7)

Operates at layer 7 (HTTP/HTTPS). Terminates the connection at a managed Envoy proxy and forwards it to the backend. Supports URL path-based routing, host-based routing, TLS termination, and gRPC. Use this for HTTP and HTTPS internal traffic, microservice architectures where different paths route to different services, or when you want TLS handled centrally rather than on each backend VM.

Configuration uses --load-balancing-scheme=INTERNAL_MANAGED.

Passthrough vs application: side by side

	Internal passthrough (L4)	Internal application (L7)
Protocol	TCP, UDP	HTTP, HTTPS, HTTP/2, gRPC
Client IP at backend	Original client IP preserved	Proxy IP (original in X-Forwarded-For)
URL-based routing	No	Yes, by path, host, or header
TLS termination	No (passes TLS through to backend)	Yes, terminates at the load balancer
gRPC support	No	Yes
Load balancing scheme	INTERNAL	INTERNAL_MANAGED
Choose when	Database connections, DNS, custom TCP/UDP, need original client IP	Internal HTTP APIs, microservices, path routing, internal HTTPS
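
The table can be read as a small decision rule. The sketch below encodes it as a bash function that prints the matching --load-balancing-scheme value; suggest_scheme and the preserve-client-ip keyword are invented for this illustration, and real decisions usually involve more factors than these two inputs.

```shell
#!/usr/bin/env bash
# Print the load balancing scheme suggested by the comparison table.
# $1: protocol (tcp, udp, http, https, http2, grpc; case-insensitive)
# $2: optional "preserve-client-ip" if backends must see the caller's address
suggest_scheme() {
  local protocol need=${2:-}
  protocol=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case $protocol in
    tcp|udp)
      echo INTERNAL ;;
    http|https|http2|grpc)
      if [[ $need == preserve-client-ip ]]; then
        echo INTERNAL            # passthrough: keeps source IP, no L7 routing
      else
        echo INTERNAL_MANAGED    # application: L7 routing and TLS termination
      fi ;;
    *)
      echo "unknown protocol: $1" >&2
      return 1 ;;
  esac
}

suggest_scheme udp
suggest_scheme https
```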

When to use an internal load balancer

Multi-tier architectures. A web tier handles public traffic. Behind it, an internal load balancer fronts the application tier, keeping it off the public internet entirely. If an app VM fails, the load balancer routes around it automatically.
Internal HTTP APIs. Microservices communicating over HTTP inside a VPC benefit from the internal application load balancer. You get URL-based routing, TLS termination, and health checking without any external exposure.
Non-HTTP private services. A database proxy, message broker, or custom TCP service across multiple VMs needs the passthrough type for layer 4 traffic distribution with client IP preservation.
Hybrid connectivity. On-premises systems connected via Cloud VPN or Cloud Interconnect can reach an internal load balancer’s private IP as if it were on their local network. This is a clean way to expose GCP services to on-prem clients without creating public endpoints.
Shared VPC environments. In a Shared VPC setup, an internal load balancer in a service project can use a subnet from the host project, making it reachable across multiple service projects in the same organisation.

When not to use an internal load balancer

Warning

An internal load balancer’s frontend IP is a private RFC 1918 address. It is not reachable from the internet under any circumstances. Adding a firewall rule will not change this. The address is simply not routable on the public internet; only clients in your VPC, or in a network connected to it over Cloud VPN or Cloud Interconnect, can reach it. If external access is required, use an external load balancer as the public entry point.

Simple VM-to-VM communication. If you have one VM calling one other VM at a fixed private IP, a load balancer adds unnecessary complexity. ILBs are useful when you have two or more backends and want automatic health checking and traffic distribution.
Wrong type for the protocol. A passthrough load balancer cannot inspect URLs or route by path. An application load balancer does not preserve the original client IP. Using the wrong type and trying to work around its constraints causes more problems than choosing correctly from the start.
Serverless backends connecting to private resources. If your Cloud Run or Cloud Functions service needs to reach a private IP in your VPC, that requires a different pattern. See Serverless VPC Access. If it needs to reach the internet from a private subnet, see Cloud NAT.

Core components

Forwarding rule (the frontend)

Defines the private IP address, port, and protocol that clients connect to. The IP must come from a subnet in your VPC. You can let GCP assign an ephemeral internal IP, but for production workloads reserve a static one. A static IP survives backend changes and deployments without requiring updates across every service that connects to it.

Tip

Reserve a static internal IP for your forwarding rule before creating it. If you ever tear down and recreate the forwarding rule, the static IP remains the same. An ephemeral IP can change, which would require updating every service that uses it as a target.
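
A sketch of that reservation is shown below. The address name app-ilb-ip and the IP 10.128.0.10 are examples, the subnet and network names are placeholders, and the second command assumes a backend service called app-backend already exists.

```shell
# Reserve a static internal IP from the subnet.
# Omit --addresses to let GCP pick a free address from the subnet's range.
gcloud compute addresses create app-ilb-ip \
  --region=us-central1 \
  --subnet=app-subnet \
  --addresses=10.128.0.10

# Reference the reserved address by name when creating the forwarding rule.
gcloud compute forwarding-rules create app-internal-lb \
  --load-balancing-scheme=INTERNAL \
  --network=production-vpc \
  --subnet=app-subnet \
  --region=us-central1 \
  --address=app-ilb-ip \
  --backend-service=app-backend \
  --ports=8080
```

If the forwarding rule is later deleted and recreated with the same --address, clients keep using the same IP.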

Backend service

Groups the backend VMs or Network Endpoint Groups (NEGs) together. It references the health check, sets the balancing mode, and is where you configure connection draining. The protocol set here must match the load balancer type: TCP or UDP for passthrough, HTTP or HTTPS for application.

Instance group or NEG

The actual backends. These can be managed instance groups (MIGs), unmanaged instance groups, or NEGs for container-native or serverless backends. Managed instance groups are recommended for production. They automatically replace failed instances and support autoscaling.
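
As a rough illustration, a managed instance group for the app tier could be created along these lines. The template name, machine type, image, and autoscaling numbers are arbitrary examples rather than recommendations, and autohealing would need to be configured separately.

```shell
# Instance template describing each app-tier VM (no external IP).
gcloud compute instance-templates create app-template \
  --machine-type=e2-medium \
  --network=production-vpc \
  --subnet=app-subnet \
  --region=us-central1 \
  --no-address \
  --tags=app-server \
  --image-family=debian-12 \
  --image-project=debian-cloud

# Managed instance group with two VMs built from the template.
gcloud compute instance-groups managed create app-mig-uc1 \
  --template=app-template \
  --size=2 \
  --zone=us-central1-a

# Scale between 2 and 5 VMs, targeting 60% average CPU utilisation.
gcloud compute instance-groups managed set-autoscaling app-mig-uc1 \
  --zone=us-central1-a \
  --min-num-replicas=2 \
  --max-num-replicas=5 \
  --target-cpu-utilization=0.6
```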

Health check

Probes each backend at a regular interval. If a backend fails the configured number of consecutive checks, it is removed from rotation until it recovers. Without the health check working correctly, and without the firewall rule that allows probe traffic to reach your VMs, the load balancer cannot separate healthy from failed backends.
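
How quickly a failed backend leaves rotation follows from two settings: the check interval and the unhealthy threshold. The sketch below does that arithmetic for an interval of 10 seconds with thresholds of 3 (unhealthy) and 2 (healthy). It is an approximation only; the probe timeout and the moment of failure relative to the last probe shift the real figure, and the function names are made up for this example.

```shell
#!/usr/bin/env bash
# Approximate seconds before a failed backend is marked unhealthy:
# check interval multiplied by the unhealthy threshold.
time_to_unhealthy() {
  local interval_s=$1 unhealthy_threshold=$2
  echo $(( interval_s * unhealthy_threshold ))
}

# Approximate seconds before a recovered backend is marked healthy again:
# check interval multiplied by the healthy threshold.
time_to_healthy() {
  local interval_s=$1 healthy_threshold=$2
  echo $(( interval_s * healthy_threshold ))
}

echo "removed from rotation after roughly $(time_to_unhealthy 10 3)s"
echo "returned to rotation after roughly $(time_to_healthy 10 2)s"
```

With these example values a dead backend can keep receiving new connections for around 30 seconds, so shorter intervals or lower thresholds trade faster failover for more probe traffic and more sensitivity to brief hiccups.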

Architecture example: three-tier application

A common production pattern uses an external and an internal load balancer together in the same VPC:

Web tier. Public users connect to an external Application Load Balancer with a public IP. It routes HTTPS traffic to web-tier VMs tagged web-server. TLS terminates here.
App tier. Web-tier VMs call the app tier by connecting to the internal load balancer’s private IP (say 10.128.0.10:8080). The internal load balancer distributes those requests across app-tier VMs tagged app-server. These VMs have no public IP and are not reachable from the internet.
Database tier. App-tier VMs connect to Cloud SQL using its private IP address. No load balancer is needed here. Cloud SQL manages its own internal failover.

The web tier never knows which app-tier VM handled a request. If an app VM fails, the internal load balancer stops sending it new traffic once the VM has failed the configured number of consecutive health checks. The only entry point for external traffic is the external load balancer. The app tier is entirely isolated.

Firewall rules enforce this isolation: app-tier VMs only accept traffic on port 8080 from VMs with the web-server tag, plus health check probes from GCP’s IP ranges. Everything else is dropped.

Configuring an internal TCP load balancer

The following example creates a regional internal passthrough load balancer for an application running on port 8080 in us-central1.

# Step 1: Create a TCP health check on port 8080
gcloud compute health-checks create tcp hc-tcp-8080 \
  --port=8080 \
  --check-interval=10s \
  --healthy-threshold=2 \
  --unhealthy-threshold=3

# Step 2: Create an instance group and add the backend VMs
gcloud compute instance-groups unmanaged create app-group-uc1 \
  --zone=us-central1-a

gcloud compute instance-groups unmanaged add-instances app-group-uc1 \
  --instances=app-vm-1,app-vm-2 \
  --zone=us-central1-a

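# Named port: maps the name "app" to port 8080. Proxy-based (application) load
# balancers use it; a passthrough load balancer ignores it, so this step is optional here.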
gcloud compute instance-groups set-named-ports app-group-uc1 \
  --named-ports=app:8080 \
  --zone=us-central1-a

# Step 3: Create the backend service
gcloud compute backend-services create app-backend \
  --load-balancing-scheme=INTERNAL \
  --protocol=TCP \
  --health-checks=hc-tcp-8080 \
  --region=us-central1

# Step 4: Attach the instance group to the backend service
gcloud compute backend-services add-backend app-backend \
  --instance-group=app-group-uc1 \
  --instance-group-zone=us-central1-a \
  --region=us-central1

# Step 5: Create the forwarding rule (this assigns the internal frontend IP)
gcloud compute forwarding-rules create app-internal-lb \
  --load-balancing-scheme=INTERNAL \
  --network=production-vpc \
  --subnet=app-subnet \
  --region=us-central1 \
  --backend-service=app-backend \
  --ports=8080

# Check the assigned private IP address
gcloud compute forwarding-rules describe app-internal-lb \
  --region=us-central1 \
  --format="get(IPAddress)"

Once created, VMs in production-vpc that are in us-central1 and allowed by your firewall rules can connect to that private IP on port 8080. The load balancer distributes connections across app-vm-1 and app-vm-2.

Note

For an internal application load balancer (layer 7), use --load-balancing-scheme=INTERNAL_MANAGED and set the protocol to HTTP or HTTPS. You also need a URL map, a target HTTP(S) proxy, and a proxy-only subnet in the region for the managed proxies. The pattern follows the same model as the HTTP Load Balancer Setup guide, but with the internal scheme applied throughout.
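
A rough outline of the layer 7 variant is sketched below for plain HTTP. All new resource names (proxy-only-subnet, hc-http-8080, app-l7-backend, app-url-map, app-http-proxy, app-l7-ilb) and the 10.129.0.0/23 range are examples, the instance group is assumed to expose a named port called app, and the flags should be checked against the current gcloud reference before use. A firewall rule allowing the proxy-only subnet's range to reach the backends is also needed and is not shown.

```shell
# Proxy-only subnet: the managed Envoy proxies take their addresses from here.
gcloud compute networks subnets create proxy-only-subnet \
  --purpose=REGIONAL_MANAGED_PROXY \
  --role=ACTIVE \
  --network=production-vpc \
  --region=us-central1 \
  --range=10.129.0.0/23

# Regional HTTP health check on the service port.
gcloud compute health-checks create http hc-http-8080 \
  --region=us-central1 \
  --port=8080

# Backend service using the managed (proxy-based) scheme.
gcloud compute backend-services create app-l7-backend \
  --load-balancing-scheme=INTERNAL_MANAGED \
  --protocol=HTTP \
  --port-name=app \
  --health-checks=hc-http-8080 \
  --health-checks-region=us-central1 \
  --region=us-central1

gcloud compute backend-services add-backend app-l7-backend \
  --instance-group=app-group-uc1 \
  --instance-group-zone=us-central1-a \
  --balancing-mode=UTILIZATION \
  --region=us-central1

# URL map sending all requests to the backend service by default.
gcloud compute url-maps create app-url-map \
  --default-service=app-l7-backend \
  --region=us-central1

# Target proxy that ties the URL map to the frontend.
gcloud compute target-http-proxies create app-http-proxy \
  --url-map=app-url-map \
  --url-map-region=us-central1 \
  --region=us-central1

# Forwarding rule: the private frontend IP and port clients connect to.
gcloud compute forwarding-rules create app-l7-ilb \
  --load-balancing-scheme=INTERNAL_MANAGED \
  --network=production-vpc \
  --subnet=app-subnet \
  --region=us-central1 \
  --ports=80 \
  --target-http-proxy=app-http-proxy \
  --target-http-proxy-region=us-central1
```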

Firewall and health check requirements

An internal load balancer needs two distinct firewall rules. Most first-time setups only create one and then spend time wondering why the load balancer is not sending traffic.

Warning

The health check firewall rule is the single most common reason a newly created internal load balancer sends no traffic. GCP’s health check probes originate from 35.191.0.0/16 and 130.211.0.0/22. Without a rule allowing those ranges to reach your backends on the service port, every backend fails health checks and the load balancer sits completely idle — even when the backend VMs are running perfectly.

Rule 1: Allow client traffic to reach the backends

# Allow web-tier VMs to reach app-tier VMs on port 8080
gcloud compute firewall-rules create allow-web-to-app \
  --network=production-vpc \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:8080 \
  --source-tags=web-server \
  --target-tags=app-server

Rule 2: Allow GCP health check probes

# Allow GCP health check probes to reach app-tier VMs on port 8080
gcloud compute firewall-rules create allow-health-checks-app \
  --network=production-vpc \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:8080 \
  --source-ranges=35.191.0.0/16,130.211.0.0/22 \
  --target-tags=app-server
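
When reading backend access logs, it can help to tell probe traffic apart from real clients. The sketch below tests whether a source address falls inside the two health check ranges used in the rule above. The function names are hypothetical and the input is assumed to be a well-formed dotted-quad IPv4 address.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if IPv4 address $1 falls inside CIDR block $2 (e.g. 35.191.0.0/16).
in_cidr() {
  local ip net bits mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( (ip & mask) == (net & mask) ))
}

# Succeed if the address belongs to one of the health check probe ranges.
is_health_check_probe() {
  in_cidr "$1" 35.191.0.0/16 || in_cidr "$1" 130.211.0.0/22
}

for src in 35.191.12.34 10.128.0.5; do
  if is_health_check_probe "$src"; then
    echo "$src: health check probe"
  else
    echo "$src: regular client"
  fi
done
```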

For a deeper look at writing firewall rules correctly (including how priority and targets work), see Firewall Rules Explained. For network security more broadly, see Network Security Best Practices.

Common mistakes

Missing the health check firewall rule. The load balancer marks all backends unhealthy and forwards no traffic. The cause is almost always the missing rule allowing GCP’s health check ranges 35.191.0.0/16 and 130.211.0.0/22 to reach your backends on the service port. This rule is not created automatically when you create a load balancer.
Expecting the private IP to be internet-accessible. An internal load balancer frontend is a private RFC 1918 address. It cannot be reached from the internet. If external access is required, put an external load balancer in front. They are designed to work together in a layered architecture.
Choosing the wrong load balancer type. A passthrough load balancer cannot route by URL or path. An application load balancer does not preserve the original client IP at the backend. Picking the wrong type mid-project means rebuilding the configuration. Think through the protocol and routing requirements before you start.
Misunderstanding regional scope. A standard internal load balancer’s frontend IP is tied to one region. By default, a VM in a different region cannot reach that private IP, even inside the same VPC, unless global access is enabled on the forwarding rule. If you have workloads in multiple regions, deploy a separate ILB per region, enable global access, or use the cross-region internal application load balancer if your protocol supports it.
Not configuring connection draining. When you remove a backend during a rolling deployment, in-flight connections are dropped immediately by default. Enable connection draining on the backend service to give active requests time to complete before the VM is taken out of rotation.
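
Two of these mistakes have single-command fixes, sketched below with placeholder names. The 60-second drain is an arbitrary example to tune for your request durations, and global access is an option on internal forwarding rules that lets clients in other regions of the same VPC reach the frontend IP.

```shell
# Give in-flight connections up to 60 seconds to finish when a backend is removed.
gcloud compute backend-services update app-backend \
  --region=us-central1 \
  --connection-draining-timeout=60

# Let clients in other regions of the same VPC reach the frontend IP.
gcloud compute forwarding-rules update app-internal-lb \
  --region=us-central1 \
  --allow-global-access
```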

Troubleshooting tips

All backends showing as unhealthy. Confirm the health check firewall rule exists and targets the right backend VMs by tag or subnet. Then manually verify the backends respond on the health check port: SSH into another VM in the same VPC and run curl http://BACKEND_IP:PORT for an HTTP service, or nc -zv BACKEND_IP PORT for a plain TCP service.
Traffic not reaching the load balancer at all. Confirm a firewall rule allows traffic from the source VM’s tag or IP range to the backend tag on the correct port. Also check the forwarding rule is in the correct region and associated with the correct subnet.
Named port mismatch. For an application load balancer, the named port set on the instance group must match the port name the backend service expects. A mismatch sends traffic to the wrong port on the backend VM, where it fails silently or is rejected. Passthrough load balancers do not use named ports.
Cross-project subnet issues. If using a Shared VPC, confirm the subnet IAM permissions allow the service project to use the host project’s subnet. See Shared VPC for the required configuration.
Private DNS not resolving to the internal IP. If clients use a hostname to reach the ILB rather than the raw IP, make sure a private DNS zone is configured with a record pointing to the forwarding rule’s private IP.
Broader network diagnosis. For a systematic approach to diagnosing connectivity problems, see Troubleshooting Network Issues.
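
For the manual reachability check, a small helper run from a client VM can test several backends in one go. This is a sketch that relies on bash's built-in /dev/tcp and the coreutils timeout command; probe_tcp and check_backends are hypothetical names, and the addresses in the usage note are placeholders.

```shell
#!/usr/bin/env bash
# Succeed if a TCP connection to host:port opens within the timeout (seconds).
probe_tcp() {
  local host=$1 port=$2 timeout_s=${3:-3}
  timeout "$timeout_s" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Print one line per backend: "<ip>:<port> reachable" or "... unreachable".
# Usage: check_backends PORT IP [IP...]
check_backends() {
  local port=$1 ip
  shift
  for ip in "$@"; do
    if probe_tcp "$ip" "$port"; then
      echo "$ip:$port reachable"
    else
      echo "$ip:$port unreachable"
    fi
  done
}
```

For example, check_backends 8080 10.128.0.11 10.128.0.12 probes both addresses on port 8080. A backend that is reachable here but still reported unhealthy points towards the health check firewall rule rather than the service itself.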

Frequently asked questions

What is an internal load balancer in GCP?

An internal load balancer (ILB) distributes traffic to backend VMs or services using a private IP address that stays within your VPC. The traffic never leaves your private network, and the frontend IP is never reachable from the internet. ILBs are used for services that should only be accessible from inside your VPC, such as an API tier behind a web tier or a database proxy accessed by application servers.

What is the difference between an internal and external load balancer?

An internal load balancer has a private RFC 1918 frontend IP and is only reachable from within your VPC or a connected hybrid network. An external load balancer has a public IP and accepts traffic from the internet. Use an internal LB for backend tiers that should never be exposed to the public internet, and an external LB as the internet-facing entry point.

When should I use an internal application load balancer instead of passthrough?

Use an internal application load balancer (layer 7) when you need URL path-based routing, host-based routing, or TLS termination at the load balancer. Use a passthrough load balancer (layer 4) when you need to preserve the original client IP at the backend, are working with non-HTTP protocols such as database connections or UDP, or want minimal processing overhead.

Is an internal load balancer regional or global?

The standard internal load balancer is regional. Its frontend IP sits in a specific region and routes traffic to backends in that same region. GCP also offers a cross-region internal application load balancer for multi-region architectures, but most setups use a regional ILB per region.

What firewall rules does an internal load balancer require?

Two sets of rules are needed. First, allow traffic from your client VMs to reach the backend VMs on the service port. Second, allow GCP health check probes from the IP ranges 35.191.0.0/16 and 130.211.0.0/22 to reach your backends on the same port. Without the health check rule, backends appear unhealthy and receive no traffic.

Last verified: 24 March 2026. Cloud services change frequently. Verify details against official documentation before making infrastructure decisions.