GKE Security Best Practices: How to Secure Google Kubernetes Engine

Securing a GKE cluster means applying controls at every layer: the nodes it runs on, the workloads inside it, the identities accessing it, the network traffic between pods, the images being deployed, and the secrets in use. No single setting secures a cluster. This page explains how each layer works, what it protects, and how to apply it in practice.

Simple explanation

Analogy: think of a GKE cluster like a building with multiple entry points. Locking the front door (making the cluster private) is a good start. But if service account keys are lying on a desk inside, the network rules are wide open, and anyone can run any software from anywhere, the building is still vulnerable. Real security means controlling every entry point and every action that happens inside.

GKE security works in layers because each layer protects against a different type of failure:

  • Identity and IAM: controls who can call Google Cloud APIs from inside the cluster
  • Kubernetes RBAC: controls who can interact with the Kubernetes API, for example creating pods or reading secrets
  • Node security: protects the machines the cluster runs on from tampering at boot time
  • Network controls: restricts which pods can reach which other pods, and who can reach the control plane
  • Image security: ensures that only verified, scanned images make it into production
  • Secrets protection: prevents sensitive values from being stored or retrieved insecurely
  • Logging and monitoring: gives you the visibility to detect problems when other controls are bypassed or misconfigured

If you skip one layer, a failure in an adjacent area can cascade across the whole cluster. That is why production security covers all of these areas, not just one or two.

How GKE cluster security works

GKE is a managed Kubernetes service. Google manages the control plane, but you are responsible for configuring the security of your cluster, its workloads, and the identities that access it.

Here is how the layers connect:

Step 1: Access to the control plane. The Kubernetes API server is the entry point for all cluster operations. Restricting who can reach it (using authorised networks or a private endpoint) and who can authenticate to it (using Google Cloud IAM and Kubernetes RBAC) is the first line of defence.

Step 2: What pods are allowed to do. Pods run under a Kubernetes ServiceAccount. With Workload Identity, that ServiceAccount is bound to a Google Cloud IAM service account, so pods receive scoped, short-lived credentials automatically. Without Workload Identity, pods fall back to the node’s service account, carrying whatever permissions the node was given.

Step 3: Node integrity. Shielded GKE Nodes use hardware-level attestation to verify that a node booted without tampering. Container-Optimized OS reduces the attack surface by removing unnecessary software from the node image.

Step 4: Pod-to-pod and pod-to-external traffic. By default, all pods in a GKE cluster can communicate with each other. NetworkPolicy restricts this to only the connections that are explicitly required. Private clusters remove node IP addresses from the public internet entirely.

Step 5: What images can run. Artifact Registry scans images for known CVEs. Binary Authorization enforces a deploy-time policy requiring that an image has been cryptographically attested before GKE will allow it to run.

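Step 6: Secrets. Kubernetes Secret values are only base64-encoded, which is an encoding and not encryption. GKE encrypts etcd contents at rest at the storage layer by default, but anyone who can read a Secret through the Kubernetes API sees the plaintext. Application-layer encryption via Cloud KMS adds a key you control on top of the default encryption, and routing secrets through Secret Manager keeps sensitive values out of etcd entirely.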

Step 7: Logging and detection. Cloud Audit Logs record administrative API calls against the cluster by default, and read operations once Data Access logs are enabled. Logging in Kubernetes surfaces workload output. Security Command Center surfaces GKE-specific misconfigurations and threats automatically.

Each of these steps depends on the others. A perfectly configured RBAC policy does not help if a misconfigured node service account gives every pod broad IAM access. A private cluster does not help if an unscanned image runs a known exploit. The layers reinforce each other.

When to use this approach

Apply the full layered security model when:

  • Running production workloads: any cluster where data loss, service disruption, or credential theft would have a real impact
  • Handling regulated or sensitive data: PCI, HIPAA, SOC 2, and similar frameworks often mandate specific controls like encryption at rest and audit logging
  • Multi-team clusters: when different teams share a cluster, RBAC, namespace isolation, and NetworkPolicy prevent one team’s misconfigured workload from affecting others
  • Internet-facing services: workloads exposed to the internet have a larger attack surface and warrant tighter network controls and image verification
  • CI/CD pipelines deploying to GKE: secure CI/CD pipelines are part of the security perimeter; a compromised pipeline can bypass every runtime control

For short-lived learning clusters or sandboxes with no sensitive data and no connection to production resources, a lighter setup may be acceptable. Even then, it is worth enabling Workload Identity and using a scoped node service account. The habits carry directly into production work.

Core security layers for GKE

Node security: Shielded GKE Nodes and Container-Optimized OS

The security of a GKE cluster starts at the machine level.

Container-Optimized OS (COS) is a minimal Linux distribution maintained by Google for running containers. It ships with no package manager, no SSH server by default, and a read-only root filesystem. This reduces the attack surface considerably compared to a general-purpose Linux distribution. COS is the default node image for GKE and should be your first choice unless you have a specific compatibility requirement.

Shielded GKE Nodes add hardware-rooted security features on top of COS:

  • Secure Boot verifies that the node’s bootloader, kernel, and kernel modules have not been tampered with. If verification fails, the node refuses to boot.
  • vTPM (Virtual Trusted Platform Module) is a virtualised cryptographic chip that holds boot measurements, enabling attestation of the node’s boot state.
  • Integrity Monitoring compares the measurements from the most recent boot against a known-good baseline and alerts if a deviation is detected.

Enable Shielded Nodes at cluster creation:

gcloud container clusters create my-cluster \
  --region=europe-west2 \
  --shielded-secure-boot \
  --shielded-integrity-monitoring \
  --enable-shielded-nodes

Note

Shielded GKE Nodes are not the same as Confidential GKE Nodes. Confidential Nodes run on Confidential VM instances, which encrypt the VM’s memory using hardware features such as AMD SEV (Secure Encrypted Virtualization), protecting data in use even from Google’s infrastructure. Confidential Nodes have additional CPU requirements and are suited to the most sensitive workloads.
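
To confirm what an existing cluster actually has enabled, the settings can be read back with gcloud. The following is a minimal sketch rather than an official procedure: the function name check_shielded_nodes and the example cluster, region, and node pool names are placeholders, and the output field paths are taken from the GKE API's cluster and node pool resources.

```shell
#!/usr/bin/env bash
# Sketch: report Shielded GKE Node settings for a cluster and one node pool.
# The function name and its arguments are illustrative, not from this page.
check_shielded_nodes() {
  local cluster="$1" region="$2" pool="$3"

  # Cluster-level switch for Shielded GKE Nodes.
  gcloud container clusters describe "$cluster" \
    --region="$region" \
    --format="value(shieldedNodes.enabled)"

  # Per-node-pool Secure Boot and Integrity Monitoring settings.
  gcloud container node-pools describe "$pool" \
    --cluster="$cluster" \
    --region="$region" \
    --format="yaml(config.shieldedInstanceConfig)"
}
```

A call might look like check_shielded_nodes my-cluster europe-west2 default-pool. Secure Boot is set per node pool, so it is worth checking every pool rather than only the first.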

Workload identity and IAM

Pods often need to call Google Cloud APIs: reading from Cloud Storage, writing to Pub/Sub, accessing Secret Manager. The only production-grade way to grant this access in GKE is Workload Identity.

Workload Identity binds a Kubernetes ServiceAccount to a Google Cloud IAM service account. Pods receive short-lived, automatically rotated tokens. No key file ever touches disk or enters a container.

Mounting a service account key file as a Kubernetes Secret is the alternative many teams use initially. It is the wrong approach in production. Key files do not expire, are difficult to rotate across many pods, and represent a permanent credential that can be used from anywhere if exfiltrated. See Why service account keys are dangerous and Service account keys explained for a detailed breakdown of the risks.

Enable Workload Identity at cluster creation:

gcloud container clusters create my-cluster \
  --region=europe-west2 \
  --workload-pool=PROJECT_ID.svc.id.goog \
  --workload-metadata=GKE_METADATA

Tip

If you are prioritising where to start, Workload Identity is usually the highest-impact first change for teams that are currently using service account key files. It eliminates a whole class of credential leak risk in one step.
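
Enabling the workload pool on the cluster is only the first half. Each workload also needs its Kubernetes ServiceAccount linked to an IAM service account. The sketch below shows the two commands usually involved. The function name bind_workload_identity and all argument values are hypothetical, and it assumes both the Kubernetes ServiceAccount and the IAM service account already exist.

```shell
#!/usr/bin/env bash
# Sketch: link a Kubernetes ServiceAccount (KSA) to an IAM service account (GSA).
# Arguments: PROJECT_ID NAMESPACE KSA_NAME GSA_NAME (all placeholders).
bind_workload_identity() {
  local project="$1" namespace="$2" ksa="$3" gsa="$4"
  local gsa_email="${gsa}@${project}.iam.gserviceaccount.com"

  # Allow the KSA to impersonate the IAM service account.
  gcloud iam service-accounts add-iam-policy-binding "$gsa_email" \
    --role="roles/iam.workloadIdentityUser" \
    --member="serviceAccount:${project}.svc.id.goog[${namespace}/${ksa}]"

  # Record on the KSA which IAM service account it maps to.
  kubectl annotate serviceaccount "$ksa" \
    --namespace="$namespace" \
    "iam.gke.io/gcp-service-account=${gsa_email}"
}
```

Pods that set serviceAccountName to the annotated ServiceAccount then receive tokens for the IAM service account, with no key file involved.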

Node service account minimisation. GKE nodes run under a Google Cloud service account. The default is the Compute Engine default service account, which is granted the basic (formerly "primitive") Editor role unless automatic role grants have been disabled by organisation policy. This creates a very large blast radius if any pod bypasses Workload Identity and calls the metadata server directly. Create a dedicated, narrowly scoped node service account instead:

gcloud iam service-accounts create gke-node-sa \
  --display-name="GKE Node Service Account"

for role in roles/logging.logWriter roles/monitoring.metricWriter \
            roles/monitoring.viewer roles/artifactregistry.reader; do
  gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:gke-node-sa@PROJECT_ID.iam.gserviceaccount.com" \
    --role="$role"
done

gcloud container clusters create my-cluster \
  --service-account=gke-node-sa@PROJECT_ID.iam.gserviceaccount.com

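These four roles are a minimal baseline for a node that writes logs and metrics and pulls images from Artifact Registry. Anything beyond them is unnecessary privilege for most clusters. Google also publishes a predefined roles/container.defaultNodeServiceAccount role for this purpose, so check the current GKE hardening guidance before settling on a set. See IAM Roles Explained and Least Privilege in GCP for the broader principle.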
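
It is worth checking afterwards that the node service account holds only the roles you intended. One way to do that, sketched here with a hypothetical helper name list_node_sa_roles, is to flatten the project IAM policy and filter on the member.

```shell
#!/usr/bin/env bash
# Sketch: list every project-level role bound to a given service account.
# Arguments: PROJECT_ID SERVICE_ACCOUNT_EMAIL (placeholders).
list_node_sa_roles() {
  local project="$1" sa_email="$2"

  # Flatten bindings so each member/role pair becomes its own row,
  # then keep only the rows for this service account.
  gcloud projects get-iam-policy "$project" \
    --flatten="bindings[].members" \
    --filter="bindings.members:serviceAccount:${sa_email}" \
    --format="value(bindings.role)"
}
```

This only covers project-level bindings. Roles granted on folders, the organisation, or individual resources would need separate checks.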

Kubernetes RBAC

Kubernetes Role-Based Access Control (RBAC) governs who can perform which operations on which resources inside the cluster. In GKE, RBAC integrates with Google Cloud IAM: when a user authenticates with gcloud and runs kubectl, their Google identity is matched against RBAC policies.

Role vs ClusterRole. A Role grants permissions within a single namespace. A ClusterRole grants permissions across all namespaces or for cluster-scoped resources such as nodes and persistent volumes.

Creating a Role that allows reading Pods and Services in the production namespace:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["pods", "services"]
    verbs: ["get", "list", "watch"]

Binding this Role to a user:

kubectl create rolebinding pod-reader-binding \
  --role=pod-reader \
  --user=alice@example.com \
  --namespace=production

Cluster-wide roles follow the same pattern:

kubectl create clusterrole node-reader \
  --verb=get,list,watch \
  --resource=nodes

kubectl create clusterrolebinding node-reader-binding \
  --clusterrole=node-reader \
  --user=ops-team@example.com

Warning

Avoid granting cluster-admin to individuals or service accounts except in break-glass scenarios. The cluster-admin ClusterRole allows all operations on all resources in the cluster, including creating new ClusterRoleBindings. In practice this means anyone with cluster-admin can escalate their own privileges or those of any other identity. It is the master key to the whole cluster.

GKE also integrates with Google Groups for RBAC, allowing you to bind roles to a Google Workspace group rather than individual email addresses. This is much easier to manage as teams change.
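
As a rough sketch of how group-based binding is usually wired up: the cluster is pointed at a security group named gke-security-groups in your domain, and RoleBindings then reference member groups with --group. The helper name enable_group_rbac, the domain, and the group address are placeholders, and the sketch assumes the pod-reader Role from earlier on this page already exists in the production namespace.

```shell
#!/usr/bin/env bash
# Sketch: enable Google Groups for RBAC and bind a Role to a group.
# Arguments: CLUSTER REGION DOMAIN GROUP_EMAIL (all placeholders).
enable_group_rbac() {
  local cluster="$1" region="$2" domain="$3" group="$4"

  # Tell GKE which top-level security group to resolve memberships from.
  gcloud container clusters update "$cluster" \
    --region="$region" \
    --security-group="gke-security-groups@${domain}"

  # Bind the namespaced Role to a group instead of an individual user.
  kubectl create rolebinding pod-reader-group-binding \
    --role=pod-reader \
    --group="$group" \
    --namespace=production
}
```

The group passed to --group is expected to be nested inside gke-security-groups@DOMAIN, otherwise GKE cannot resolve its members.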

To audit what permissions an identity currently has:

kubectl auth can-i --list --as=alice@example.com -n production
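
For a narrower check than a full listing, individual verbs can be tested one at a time. The sketch below assumes the pod-reader Role and binding shown earlier are the only grants that apply to the user. The helper name verify_pod_reader is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: spot-check what a user can and cannot do in one namespace.
# Arguments: USER_EMAIL NAMESPACE (placeholders).
verify_pod_reader() {
  local user="$1" namespace="$2"

  # Should be allowed if pod-reader is bound: it grants get/list/watch on pods.
  kubectl auth can-i list pods --as="$user" -n "$namespace"

  # Should be denied if pod-reader is the only grant: no write verbs, no secrets.
  kubectl auth can-i delete pods --as="$user" -n "$namespace"
  kubectl auth can-i get secrets --as="$user" -n "$namespace"
}
```

kubectl auth can-i prints yes or no and exits non-zero for a denial, which makes it usable in scripts. Running it with --as requires that your own identity is allowed to impersonate users.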

Network controls

NetworkPolicy controls traffic between pods at layer 3/4. Without it, all pods in your cluster can reach all other pods by default. Enable NetworkPolicy enforcement at cluster creation with --enable-network-policy (clusters that use GKE Dataplane V2 have enforcement built in), then apply a default deny-all ingress policy per namespace and explicitly allow only the required traffic paths.

A default deny-all ingress policy for a namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress

See the GKE Networking Model page for detailed NetworkPolicy examples including egress controls and pod selector patterns.
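
Once a default deny is in place, each required path needs its own allow rule. As an illustration only, the sketch below emits a policy that lets pods labelled app: frontend reach pods labelled app: backend on TCP 8080. The labels, port, policy name, and the helper name print_allow_frontend_policy are all assumptions rather than values from this page.

```shell
#!/usr/bin/env bash
# Sketch: print a NetworkPolicy that allows one ingress path in "production".
# Labels and port are hypothetical; adjust them to match real workloads.
print_allow_frontend_policy() {
  cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
EOF
}
```

The output can be piped straight to the cluster with print_allow_frontend_policy | kubectl apply -f -. NetworkPolicies are additive, so this rule opens one path while the default deny continues to block everything else.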

Private clusters give nodes only RFC 1918 private IP addresses, removing them from direct internet exposure:

gcloud container clusters create my-cluster \
  --enable-private-nodes \
  --master-ipv4-cidr=172.16.0.0/28 \
  --enable-private-endpoint

Authorised Networks restrict which source IP ranges can reach the control plane’s Kubernetes API endpoint:

gcloud container clusters create my-cluster \
  --enable-master-authorized-networks \
  --master-authorized-networks=203.0.113.0/24,198.51.100.0/24

You can update authorised networks on an existing cluster without recreating it:

gcloud container clusters update my-cluster \
  --region=europe-west2 \
  --enable-master-authorized-networks \
  --master-authorized-networks=203.0.113.0/24

Image and supply chain security

A cluster is only as secure as the container images running in it. Two controls protect the image supply chain.

Artifact Registry vulnerability scanning scans images pushed to Artifact Registry for known CVEs once the Container Scanning API is enabled on the project. Results appear in the Cloud Console or via the Container Analysis API. Integrate this into your CI/CD pipeline to fail builds when critical or high vulnerabilities are detected. See Artifact Registry best practices for scanning configuration and image lifecycle management.

Binary Authorization is a deploy-time policy control. It requires that container images have been cryptographically attested, typically by your CI/CD pipeline after passing defined checks, before they can be deployed to GKE. Unsigned images are blocked at deploy time.

Enable Binary Authorization on a cluster:

gcloud container clusters create my-cluster \
  --binauthz-evaluation-mode=PROJECT_SINGLETON_POLICY_ENFORCE

A typical Binary Authorization workflow:

  1. CI pipeline builds and pushes image to Artifact Registry.
  2. Pipeline runs security scans. If scans pass, a Cloud KMS-based attestor signs the image digest.
  3. A CD pipeline deploys to GKE.
  4. GKE checks the Binary Authorization policy, verifies the attestation, and either allows or blocks the deployment.
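
The policy that GKE checks in step 4 is a small YAML document at project level. The sketch below generates one that requires an attestation from a single attestor and blocks anything else. The helper name write_binauthz_policy and the attestor name are placeholders, and a real policy would usually also carry exemptions for system images.

```shell
#!/usr/bin/env bash
# Sketch: emit a Binary Authorization policy requiring one attestor.
# Arguments: PROJECT_ID ATTESTOR_NAME (placeholders).
write_binauthz_policy() {
  local project="$1" attestor="$2"
  cat <<EOF
globalPolicyEvaluationMode: ENABLE
defaultAdmissionRule:
  evaluationMode: REQUIRE_ATTESTATION
  enforcementMode: ENFORCED_BLOCK_AND_AUDIT_LOG
  requireAttestationsBy:
    - projects/${project}/attestors/${attestor}
EOF
}
```

One way to apply it is write_binauthz_policy PROJECT_ID my-attestor > policy.yaml followed by gcloud container binauthz policy import policy.yaml. Exporting the current policy first with gcloud container binauthz policy export gives you something to roll back to.
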

Note

Binary Authorization operates on image digests (SHA-256 hashes), not tags. An attestation is bound to one specific digest, whereas a tag like latest can be moved to point at a different image at any time, so a tag reference cannot be matched reliably to an attestation. Always use image digests in production manifests: image: europe-west2-docker.pkg.dev/my-project/repo/my-app@sha256:abc123…
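
A simple guard against tag-based references slipping into production is a text check in CI. The sketch below is deliberately crude: it greps manifest files for image: lines that lack an @sha256: digest. The function name find_unpinned_images is hypothetical, and a real pipeline might prefer a YAML-aware tool.

```shell
#!/usr/bin/env bash
# Sketch: list "image:" lines in manifests that are not pinned to a digest.
# Arguments: one or more manifest file paths.
find_unpinned_images() {
  # First grep keeps lines whose key is "image:" (optionally after "- ").
  # Second grep drops the ones that already reference an @sha256: digest.
  grep -E '^[[:space:]-]*image:' "$@" | grep -v '@sha256:'
}
```

The function prints offending lines and returns a non-zero status when there are none, so a CI step can fail the build with something like: if find_unpinned_images k8s/*.yaml; then exit 1; fi.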

Secrets and encryption

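Kubernetes Secrets store sensitive values: database passwords, API keys, TLS certificates. At the Kubernetes layer these values are base64-encoded, not encrypted. GKE encrypts data at rest by default, including etcd, but that storage-level encryption uses Google-managed keys and is transparent to anything reading through the API server. Anyone with sufficient RBAC permissions to read Secrets can retrieve plaintext values.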

Warning

Base64 is encoding, not encryption. A Kubernetes Secret value can be decoded in one command: echo "dGVzdA==" | base64 -d. Anyone who can read a Secret object through the Kubernetes API has the plaintext value immediately, whether or not application-layer encryption is enabled, so restrict Secret read access with RBAC as well.
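
To make the point concrete, the round trip below encodes a made-up value the way a Secret manifest would carry it and then recovers it with no key at all. The value s3cr3t-password is purely illustrative.

```shell
#!/usr/bin/env bash
# Sketch: base64 is reversible by anyone, with no key involved.
plain='s3cr3t-password'                       # hypothetical secret value

# Encode, as the value would appear in a Secret's "data" field.
encoded="$(printf '%s' "$plain" | base64)"
echo "stored form:  $encoded"

# Decode again; nothing secret is needed to do this.
decoded="$(printf '%s' "$encoded" | base64 -d)"
echo "decoded form: $decoded"
```

Against a live cluster the same decoding step applies to real data, for example kubectl get secret NAME -o jsonpath='{.data.KEY}' | base64 -d, where NAME and KEY are placeholders.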

GKE supports application-layer secret encryption using Cloud KMS. When enabled, the Kubernetes API server encrypts Secret values with a Cloud KMS key before writing them to etcd, and decrypts them on retrieval.

Enable application-layer encryption at cluster creation:

gcloud container clusters create my-cluster \
  --database-encryption-key=projects/PROJECT_ID/locations/europe-west2/keyRings/my-keyring/cryptoKeys/my-key

For the highest assurance, use Secret Manager instead of Kubernetes Secrets entirely. With Secret Manager, secrets never enter etcd. Pods retrieve secrets at runtime via the Secret Manager API or the Secret Manager CSI driver, which mounts secrets as files in a volume. See Managing Secrets in Kubernetes for the full comparison between Kubernetes Secrets, KMS-encrypted Secrets, and Secret Manager.
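
A minimal sketch of the Secret Manager route, assuming Workload Identity is already set up for the workload: create the secret, grant read access to the IAM service account the pod runs as, and read the latest version. The helper names and all argument values are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: store a value in Secret Manager and grant one identity read access.
# Arguments: PROJECT_ID SECRET_NAME GSA_EMAIL (placeholders).
# The secret value is read from stdin so it never appears in shell history.
create_and_grant_secret() {
  local project="$1" secret="$2" gsa_email="$3"

  gcloud secrets create "$secret" \
    --project="$project" \
    --replication-policy="automatic" \
    --data-file=-

  # Grant access on this one secret only, not project-wide.
  gcloud secrets add-iam-policy-binding "$secret" \
    --project="$project" \
    --member="serviceAccount:${gsa_email}" \
    --role="roles/secretmanager.secretAccessor"
}

# Read the newest version of a secret. Arguments: PROJECT_ID SECRET_NAME.
read_secret() {
  gcloud secrets versions access latest --secret="$2" --project="$1"
}
```

Binding the accessor role on the individual secret, rather than on the project, keeps each workload limited to the values it actually needs.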

Logging, monitoring, and audit visibility

Security controls reduce the chance of a breach. Logging and monitoring give you the ability to detect one when it happens, and to prove what occurred after the fact.

Cloud Audit Logs record API calls against the cluster: who created or deleted what, which secrets were read, which RBAC bindings changed. Admin Activity logs are always on and cover write operations. Read operations such as Secret access appear only in Data Access logs, which are off by default. Enable Data Access audit logs for the Kubernetes API in addition to the default Admin Activity logs.

Logging in Kubernetes captures workload output including application logs, crash events, and container stderr, and routes it to Cloud Logging for structured querying and alerting.

Monitoring GKE clusters surfaces resource usage, node health, and deployment status in Cloud Monitoring dashboards.

Security Command Center surfaces GKE-specific findings automatically: misconfigured RBAC bindings, overprivileged workloads, open firewall rules that expose cluster nodes, and known vulnerabilities in running images.

Detecting suspicious activity with logs explains how to create log-based alerts for events like unexpected exec commands into pods, unauthorised API calls, or changes to RBAC bindings.
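
Before building alerts, it can help to run the underlying query by hand. The sketch below looks for recent exec calls into pods. The function name recent_pod_execs is hypothetical, and the methodName value is an assumption based on how GKE names Kubernetes audit log methods, so confirm it against entries in your own logs before relying on it.

```shell
#!/usr/bin/env bash
# Sketch: list recent "kubectl exec" style calls recorded in the audit logs.
# Argument: PROJECT_ID (placeholder).
recent_pod_execs() {
  local project="$1"

  # Filter on cluster audit entries whose method is a pod exec creation.
  # The methodName string is assumed; verify it in Logs Explorer.
  gcloud logging read \
    'resource.type="k8s_cluster" AND protoPayload.methodName="io.k8s.core.v1.pods.exec.create"' \
    --project="$project" \
    --freshness=1d \
    --limit=20 \
    --format="table(timestamp, protoPayload.authenticationInfo.principalEmail, protoPayload.resourceName)"
}
```

The same filter string can be reused as the basis for a log-based alert once it returns the entries you expect.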

Tip

If you are just starting with audit logging, begin by reviewing the Admin Activity logs (they are always on and need no setup) and then add Data Access logs for the Kubernetes API. Data Access logs can be high volume, so it is worth routing them to a log bucket with a defined retention policy before enabling them in production.

GKE security vs related controls

These controls are often confused because they all relate to GKE security. They are complementary, not interchangeable.

Private clusters vs full cluster security

A private cluster hides node IP addresses from the public internet and optionally hides the control plane endpoint. This is one network-layer control. It does not affect IAM permissions, RBAC, image verification, secret encryption, or what workloads can do at runtime. A cluster can be private and still be highly vulnerable if the other layers are misconfigured.

Binary Authorization vs full cluster security

Binary Authorization enforces which container images are allowed to run. It does not protect the control plane, the network, IAM, RBAC, or secrets. You can run Binary Authorization on an otherwise poorly configured cluster and it will only prevent unauthorised images from deploying. It will not protect you from a compromised node service account or open NetworkPolicy. It is a supply chain control, not a complete security posture.

Workload Identity vs full cluster security

Workload Identity controls how pods authenticate to Google Cloud APIs. It replaces a dangerous pattern (mounting service account key files) with a safe one (short-lived tokens bound to a Kubernetes ServiceAccount). It does not affect RBAC, network controls, image security, or node hardening. It is foundational and should always be enabled, but it is one layer among many.

Practical GKE security checklist

Use this as a starting point for any production cluster review. Work through the layers in order, starting with identity and nodes, then moving to network, image supply chain, and secrets.

  • Create a dedicated, minimally scoped node service account (not the Compute Engine default)
  • Enable Workload Identity (--workload-pool) on all clusters
  • Enable Shielded GKE Nodes: Secure Boot, vTPM, Integrity Monitoring
  • Use Container-Optimized OS as the node image
  • Review all RBAC bindings; remove cluster-admin grants that are not break-glass
  • Apply default deny-all ingress NetworkPolicies per namespace; explicitly allow required paths
  • Enable --enable-master-authorized-networks or use a private endpoint
  • Use private clusters for production workloads that should not be internet-reachable
  • Scan images in Artifact Registry; fail CI builds on critical CVEs
  • Enforce Binary Authorization in production; use image digests, not mutable tags
  • Enable application-layer secret encryption with Cloud KMS, or migrate to Secret Manager
  • Enable Data Access audit logs for the Kubernetes API
  • Set up Cloud Monitoring dashboards and alerts for the cluster
  • Review Security Command Center findings for GKE-specific recommendations
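
Several checklist items map directly onto flags already shown on this page. The sketch below gathers them into one create command for reference. Treat it as a starting point under assumptions: the function name create_hardened_cluster is hypothetical, the combination has not been validated against any particular GKE version or cluster mode, the node service account and KMS key are assumed to exist already, and the CIDR ranges are the documentation examples used above.

```shell
#!/usr/bin/env bash
# Sketch: one create command combining the hardening flags from this page.
# Arguments: CLUSTER_NAME PROJECT_ID REGION (placeholders).
# Flag groups, in order: node image and identity, Workload Identity,
# Shielded Nodes, NetworkPolicy, private nodes, authorised networks,
# Binary Authorization, application-layer secret encryption.
create_hardened_cluster() {
  local name="$1" project="$2" region="$3"

  gcloud container clusters create "$name" \
    --project="$project" \
    --region="$region" \
    --image-type=COS_CONTAINERD \
    --service-account="gke-node-sa@${project}.iam.gserviceaccount.com" \
    --workload-pool="${project}.svc.id.goog" \
    --enable-shielded-nodes \
    --shielded-secure-boot \
    --shielded-integrity-monitoring \
    --enable-network-policy \
    --enable-ip-alias \
    --enable-private-nodes \
    --master-ipv4-cidr=172.16.0.0/28 \
    --enable-master-authorized-networks \
    --master-authorized-networks=203.0.113.0/24 \
    --binauthz-evaluation-mode=PROJECT_SINGLETON_POLICY_ENFORCE \
    --database-encryption-key="projects/${project}/locations/${region}/keyRings/my-keyring/cryptoKeys/my-key"
}
```

The remaining checklist items (RBAC review, NetworkPolicies, audit log configuration, Security Command Center) are applied after creation and are not covered by this command.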

Common beginner mistakes

  1. Using the default Compute Engine service account on GKE nodes. By default this service account is granted the basic Editor role, which gives broad write access across the project. Any pod that bypasses Workload Identity and calls the metadata server will inherit these permissions. Always create a dedicated, narrowly scoped node service account with only the roles the nodes need.
  2. Granting cluster-admin as the path of least resistance. When something fails due to RBAC, the tempting fix is to grant cluster-admin. This removes the security value of RBAC entirely. Diagnose which specific permissions are missing using kubectl auth can-i --list and grant only those.
  3. Assuming Kubernetes Secrets are protected by their encoding. Secret values are base64-encoded, not encrypted, at the Kubernetes layer. GKE encrypts etcd at rest by default with Google-managed keys, but that does nothing to stop anyone who can read the Secret through the API. Without application-layer encryption via Cloud KMS, a Kubernetes Secret is only as secure as your etcd access controls and RBAC policies. Enable KMS encryption or migrate to Secret Manager for genuinely sensitive values.
  4. Thinking a private cluster means the cluster is secure. A private cluster removes nodes from the public internet. It does not encrypt secrets, enforce image policies, scope IAM permissions, or restrict pod-to-pod traffic. It is one useful layer, not a complete security posture.
  5. Using mutable image tags with Binary Authorization. Attestations are tied to a digest, not a tag. A tag like latest can be repointed at an unattested image, so a manifest that references a tag gives no guarantee about what actually runs and may be rejected by a policy that requires attestations. Always pin images to their SHA-256 digest in production manifests when Binary Authorization is in use.

Frequently asked questions

What is the best way to secure a GKE cluster?

There is no single setting that secures a GKE cluster. Security is layered. Start with the foundations: enable Shielded Nodes, use Workload Identity instead of service account keys, restrict RBAC to least privilege, apply default deny NetworkPolicies, scan images in Artifact Registry, and enable Cloud Audit Logs. Each layer closes a different attack surface, and omitting one can undermine the others.

Is a private GKE cluster enough for security?

No. A private cluster removes your nodes from direct internet exposure, which reduces the attack surface on the data plane. But it does not control what workloads can do once running, how IAM permissions are scoped, whether images are verified, or how secrets are protected. Private clusters are one useful layer, not a complete security model.

Should I use Workload Identity in GKE?

Yes, always. Workload Identity is the correct way to grant pods access to Google Cloud APIs. It provides short-lived, automatically rotated tokens tied to a Kubernetes ServiceAccount. The alternative is mounting a service account key file, which creates a long-lived credential that never expires and can be exfiltrated from the container. There is no production scenario where a key file is the better choice.

Are Kubernetes Secrets encrypted at rest on GKE?

At the storage layer, yes: GKE encrypts data at rest by default, including the etcd database that holds Secrets. At the Kubernetes layer, no: Secret values are only base64-encoded, which is not encryption, and anyone with sufficient RBAC permissions can retrieve plaintext values. To add a layer that you control, enable application-layer encryption with Cloud KMS using the --database-encryption-key flag at cluster creation. For the highest assurance, use Secret Manager instead of Kubernetes Secrets so sensitive values never enter etcd.

What does Binary Authorization protect against?

Binary Authorization protects against deploying unverified container images. It enforces a policy requiring that images have been cryptographically attested, typically by your CI/CD pipeline after passing security scans, before they can run in GKE. It does not protect against runtime behaviour, network traffic, or RBAC escalation. It is a supply chain control, not a runtime isolation tool.

Last verified: 23 March 2026. Cloud services change frequently. Verify details against official documentation before making infrastructure decisions.