GKE Monitoring Explained: Cloud Monitoring, kubectl top, and Managed Prometheus

When something goes wrong in a GKE cluster, the question is always the same: how do you know before your users do? GKE answers that by integrating with Cloud Monitoring out of the box. Every cluster automatically sends node, pod, and container metrics the moment it starts. No agents to install, no pipelines to configure. This page explains how that integration works, how to use kubectl top for quick checks, how to build dashboards and alerting policies, and how Managed Service for Prometheus extends visibility to your application’s own metrics.

What GKE monitoring gives you

Think of monitoring a GKE cluster as watching three layers at once: are the machines (nodes) healthy, are the workloads (pods) running, and are the applications behaving correctly. GKE handles the first two layers automatically. The third layer requires your application to expose metrics.

The tools GKE gives you out of the box:

Cloud Monitoring — the main observability platform. GKE automatically sends CPU, memory, network, and restart metrics here. You build dashboards and alerting policies on top of those metrics.
Cloud Logging — collects logs from every container running in your cluster. Logs complement metrics when you need to understand why something went wrong, not just that it went wrong. See Logging in Kubernetes on GKE for how this works.
kubectl top — shows current CPU and memory usage for nodes and pods directly from the metrics-server. Good for quick spot-checks, not a substitute for Cloud Monitoring.
Managed Service for Prometheus — a fully managed Prometheus-compatible backend. Use it when your applications expose custom metrics that Cloud Monitoring does not collect by default.

None of these require manual agent installation. The cluster handles the integration automatically.

How monitoring flows from cluster to alert

GKE emits metrics continuously. Every node runs a monitoring agent that collects CPU utilisation, memory usage, disk I/O, network throughput, and pod restart counts. These are sent to Cloud Monitoring in near real time.
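Cloud Monitoring stores the metrics as time-series data. GKE system metrics share the kubernetes.io metric prefix and are attached to monitored resources such as k8s_node, k8s_pod, and k8s_container, whose labels identify the cluster, namespace, pod, and container. Retention varies by metric type, so confirm the period that applies to kubernetes.io metrics in the Cloud Monitoring data retention documentation before planning long lookbacks.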
Dashboards visualise trends over time. You can use the pre-built GKE dashboards under Kubernetes Engine › Observability, or build custom dashboards that combine GKE metrics with metrics from other services.
Alerting policies notify you when thresholds are crossed. You define conditions like “node CPU above 80% for 5 minutes” and Cloud Monitoring fires a notification to email, Slack, PagerDuty, or Pub/Sub when the condition is met.
Managed Service for Prometheus adds application-level metrics. If your application exposes a /metrics endpoint, a PodMonitoring resource tells GMP to scrape it and send those metrics into Cloud Monitoring alongside the built-in GKE metrics.
Logs and traces fill in the gaps. When metrics show that something is wrong, Cloud Logging shows what the containers reported at the time, Cloud Trace shows request latency through your services, and Cloud Profiler shows which functions are consuming CPU or memory.

Tip

If you have a GKE cluster already running, open Kubernetes Engine › Observability in the Cloud Console right now. You will find pre-built dashboards for node health, workload performance, and pod status with no setup required.

Built-in monitoring in GKE

When you create a GKE cluster, Cloud Monitoring is enabled by default. A monitoring agent runs on each node and forwards metrics automatically, with no installation or configuration required.

Metrics cover three levels:

Node metrics — CPU utilisation, memory utilisation, disk I/O, and network throughput per node. See Node Pools Explained for how nodes are organised.
Pod and container metrics — CPU and memory requests versus actual usage for every running container. Kubernetes Pods Explained covers the pod model this builds on.
Cluster-level metrics — node count, node conditions, API server request latency, and etcd metrics.

GKE system metrics in Cloud Monitoring share the kubernetes.io metric prefix:

kubernetes.io/container/cpu/core_usage_time
kubernetes.io/container/memory/used_bytes
kubernetes.io/node/cpu/allocatable_utilization
kubernetes.io/node/memory/allocatable_utilization
kubernetes.io/pod/network/received_bytes_count
kubernetes.io/pod/network/sent_bytes_count

Explore these in Metrics Explorer by searching for the kubernetes.io namespace. For a broader overview of how GCP metrics work, see Metrics in GCP.
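
When you query these metrics outside Metrics Explorer, you usually select them with a filter string. The sketch below builds one in Python. The helper name gke_metric_filter is hypothetical, and the resource types (k8s_node, k8s_container) and label names are the ones Cloud Monitoring documents for GKE as far as we know, so check them against your own cluster's time series.

```python
def gke_metric_filter(metric_type, resource_type, **resource_labels):
    """Build a Cloud Monitoring filter string for one GKE metric.

    Hypothetical helper: it only formats text and calls no Google API.
    """
    parts = [
        f'metric.type = "{metric_type}"',
        f'resource.type = "{resource_type}"',
    ]
    # Sort the labels so the output is stable regardless of argument order.
    for key, value in sorted(resource_labels.items()):
        parts.append(f'resource.labels.{key} = "{value}"')
    return " AND ".join(parts)


if __name__ == "__main__":
    print(gke_metric_filter(
        "kubernetes.io/container/memory/used_bytes",
        "k8s_container",
        namespace_name="production",
    ))
```

The same string can be pasted into the filter field of a chart or an alerting condition.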

Note

GKE separates System Components monitoring from Workloads monitoring. Both are enabled by default but are independently configurable. If system monitoring was disabled at cluster creation, re-enable it with: gcloud container clusters update CLUSTER_NAME --monitoring=SYSTEM --region REGION (the separate --enable-managed-prometheus flag of the same command turns on managed collection for Managed Service for Prometheus).

Monitoring from the Cloud Console

The Cloud Console provides a live Kubernetes view under Kubernetes Engine › Workloads. This is often the fastest place to start when something looks wrong.

For each workload (Deployments, StatefulSets, DaemonSets, Jobs), you can see:

Pod status — Running, Pending, Failed, or CrashLoopBackOff for each pod.
Restart count — a rising restart count is one of the clearest signals that a container is crashing repeatedly.
Unavailable pods — any gap between desired and running pods means a deployment is degraded.
CPU and memory usage — displayed relative to the requests set on each container, so you can see at a glance which workloads are near their limits.
Node placement — which pods are running on which nodes.

Clicking into a specific workload shows historical CPU and memory graphs, the associated pod list, and a direct link into Cloud Logging for that workload’s logs. For most incidents, you can go from “something is wrong” to a working hypothesis without leaving this view.

Real-time checks with kubectl top

For instant command-line visibility, kubectl top queries the metrics-server pre-installed on GKE clusters and returns current CPU and memory usage.

# Current CPU and memory for all nodes
kubectl top nodes

# Current usage for all pods in a namespace
kubectl top pods -n my-namespace

# Usage across all namespaces
kubectl top pods -A

# Sort by memory usage
kubectl top pods -n my-namespace --sort-by=memory

# Show per-container usage within pods
kubectl top pods -n my-namespace --containers

The output shows actual current usage, not configured requests or limits. Comparing kubectl top output against the requests and limits in your manifests quickly reveals pods near their limits (throttling or OOM risk) and pods that are massively over-allocated (wasted cost).
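
As a rough illustration of that comparison, the sketch below parses kubectl top pods output and expresses each pod's usage as a fraction of its limits. The sample output, pod names, and limit values are invented for the example. In practice you would capture the real command output and read the limits from your manifests.

```python
# Invented sample of `kubectl top pods` output, for illustration only.
SAMPLE_TOP_OUTPUT = """\
NAME         CPU(cores)   MEMORY(bytes)
api-7d9f     450m         900Mi
worker-x2k   20m          64Mi
"""

# Hypothetical per-pod limits, as they might appear in your manifests.
LIMITS = {
    "api-7d9f": {"cpu": "500m", "memory": "1Gi"},
    "worker-x2k": {"cpu": "1", "memory": "1Gi"},
}

MEMORY_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def parse_cpu(value):
    """Convert a CPU quantity such as '250m' or '2' to cores."""
    if value.endswith("m"):
        return int(value[:-1]) / 1000
    return float(value)


def parse_memory(value):
    """Convert a memory quantity such as '512Mi' to bytes."""
    for suffix, factor in MEMORY_UNITS.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)


def usage_vs_limits(top_output, limits):
    """Return {pod: (cpu_fraction, memory_fraction)} of each limit in use."""
    result = {}
    for line in top_output.strip().splitlines()[1:]:  # skip the header row
        name, cpu, memory = line.split()
        limit = limits[name]
        result[name] = (
            parse_cpu(cpu) / parse_cpu(limit["cpu"]),
            parse_memory(memory) / parse_memory(limit["memory"]),
        )
    return result


if __name__ == "__main__":
    for pod, (cpu, mem) in usage_vs_limits(SAMPLE_TOP_OUTPUT, LIMITS).items():
        print(f"{pod}: cpu {cpu:.0%} of limit, memory {mem:.0%} of limit")
```

In this made-up data the first pod sits close to both limits (throttling and OOM risk), while the second uses a small fraction of what it was given.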

Tip

kubectl top is best for answering “which pod is eating all the CPU right now?” during active debugging. It has no history, no alerting, and cannot answer “was CPU high at 3am last Tuesday?” For that, you need Cloud Monitoring.

Dashboards in Cloud Monitoring

Dashboards give you a persistent, visual view of cluster health over time. Unlike the Workloads console view, dashboards let you compare metrics across time windows, correlate node pressure with pod restarts, and share a single URL with your team during an incident.

GKE provides pre-built dashboards under Kubernetes Engine › Observability that display cluster health, node status, and workload performance without any setup. For a custom dashboard, open Cloud Monitoring › Dashboards › Create Dashboard and add charts for:

Node CPU utilisation — kubernetes.io/node/cpu/allocatable_utilization grouped by node name
Node memory utilisation — kubernetes.io/node/memory/allocatable_utilization grouped by node name
Pod CPU usage — kubernetes.io/container/cpu/core_usage_time grouped by namespace or pod name
Pod memory usage — kubernetes.io/container/memory/used_bytes grouped by pod name
Container restart count — kubernetes.io/container/restart_count; rising values are an early warning signal
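
If you prefer to keep dashboards in version control, the same charts can be described as JSON. The sketch below generates such a definition in Python. The field names (gridLayout, xyChart, timeSeriesFilter, and the aggregation fields) reflect our reading of the Cloud Monitoring Dashboards API and should be verified against the current reference. The display name and helper names are hypothetical.

```python
import json


def chart(title, metric_type, resource_type, group_by,
          aligner="ALIGN_MEAN", reducer="REDUCE_MEAN"):
    """Build one line-chart widget for a kubernetes.io metric."""
    return {
        "title": title,
        "xyChart": {
            "dataSets": [{
                "timeSeriesQuery": {
                    "timeSeriesFilter": {
                        "filter": (
                            f'metric.type = "{metric_type}" AND '
                            f'resource.type = "{resource_type}"'
                        ),
                        "aggregation": {
                            "alignmentPeriod": "60s",
                            "perSeriesAligner": aligner,
                            "crossSeriesReducer": reducer,
                            "groupByFields": [f"resource.label.{group_by}"],
                        },
                    }
                }
            }]
        },
    }


def build_dashboard():
    """Assemble a small dashboard from three of the charts listed above."""
    return {
        "displayName": "GKE overview (sketch)",
        "gridLayout": {
            "columns": 2,
            "widgets": [
                chart("Node CPU utilisation",
                      "kubernetes.io/node/cpu/allocatable_utilization",
                      "k8s_node", "node_name"),
                chart("Node memory utilisation",
                      "kubernetes.io/node/memory/allocatable_utilization",
                      "k8s_node", "node_name"),
                # core_usage_time is a cumulative counter, so it is charted
                # as a rate and summed across the containers of each pod.
                chart("Pod CPU usage",
                      "kubernetes.io/container/cpu/core_usage_time",
                      "k8s_container", "pod_name",
                      aligner="ALIGN_RATE", reducer="REDUCE_SUM"),
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_dashboard(), indent=2))
```

Note the different aligner for core_usage_time: gauges such as the utilisation metrics can be averaged, while a cumulative counter only makes sense on a chart once it has been converted to a rate.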

For a deeper look at how to build and structure dashboards, see Dashboards in Cloud Monitoring.

Alerting for GKE clusters

Dashboards are passive: you have to be looking at them. Alerting policies are active. Configure them in Cloud Monitoring › Alerting › Create Policy so you get notified when something crosses a threshold, even at 3am.

Four policies to set up first:

Container restart rate — signals CrashLoopBackOff or repeated crashes:

Metric:    kubernetes.io/container/restart_count
Filter:    namespace_name = "production"
Condition: rate of change > 5 restarts in 10 minutes
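
To make the condition concrete: restart_count is a running total, so "more than 5 restarts in 10 minutes" means the total grew by more than 5 across the window. The sketch below shows that arithmetic on invented samples. It is not how Cloud Monitoring evaluates the policy internally, and it ignores counter resets that occur when a container is recreated.

```python
def restarts_in_window(samples, window_seconds=600):
    """Return how much a cumulative restart counter grew in the last window.

    samples: list of (timestamp_seconds, cumulative_restart_count),
    sorted by timestamp in ascending order.
    """
    if not samples:
        return 0
    latest_ts, latest_count = samples[-1]
    cutoff = latest_ts - window_seconds
    # Baseline is the oldest sample that still falls inside the window.
    baseline = next(count for ts, count in samples if ts >= cutoff)
    return latest_count - baseline


if __name__ == "__main__":
    # Invented samples taken every 5 minutes.
    samples = [(0, 2), (300, 3), (600, 5), (900, 9)]
    growth = restarts_in_window(samples)
    print(f"{growth} restarts in the last 10 minutes, alert: {growth > 5}")
```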

Node CPU pressure:

Metric:    kubernetes.io/node/cpu/allocatable_utilization
Condition: value > 0.80 (80%) for 5 minutes

Sustained high CPU on a node means workloads are competing for resources. New pods may fail to schedule and existing pods may be throttled.
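
The metric and condition above map fairly directly onto an alerting policy definition. The following sketch builds one as a Python dictionary and prints it as JSON. The field names (conditionThreshold, comparison, thresholdValue, duration, aggregations) follow the AlertPolicy resource as we understand it, and the display names are placeholders. Treat it as a starting point to check against the API reference, not a finished policy.

```python
import json


def node_cpu_alert_policy(threshold=0.80, duration_seconds=300):
    """Describe a 'node CPU above threshold for N seconds' alerting policy."""
    return {
        "displayName": f"GKE node CPU above {threshold:.0%}",
        "combiner": "OR",
        "conditions": [{
            "displayName": "Node CPU allocatable utilisation",
            "conditionThreshold": {
                "filter": (
                    'metric.type = '
                    '"kubernetes.io/node/cpu/allocatable_utilization" '
                    'AND resource.type = "k8s_node"'
                ),
                "comparison": "COMPARISON_GT",
                "thresholdValue": threshold,
                # The condition must hold for this long before it fires.
                "duration": f"{duration_seconds}s",
                "aggregations": [{
                    "alignmentPeriod": "60s",
                    "perSeriesAligner": "ALIGN_MEAN",
                }],
            },
        }],
    }


if __name__ == "__main__":
    print(json.dumps(node_cpu_alert_policy(), indent=2))
```

The node memory policy in the next section differs only in the metric type and a threshold of 0.85.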

Node memory pressure:

Metric:    kubernetes.io/node/memory/allocatable_utilization
Condition: value > 0.85 (85%) for 5 minutes

Warning

Nodes near memory capacity will start evicting pods automatically. Evictions happen fast and without warning. Memory pressure is often the first sign of a cascade of pod failures — do not ignore rising memory utilisation.

Deployment unavailable replicas — signals a failed rollout or crashing pods:

Metric:    kubernetes.io/deployment/desired_pods minus kubernetes.io/deployment/available_pods
Condition: value > 0 for 5 minutes

Any gap between desired and available replicas means your service is running at reduced capacity.
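
The "for 5 minutes" part matters as much as the gap itself, because a rolling update briefly opens a gap on purpose. The sketch below shows the idea on invented samples: the alert fires only if the gap has been open continuously for the whole duration. The function name and sample data are hypothetical.

```python
def unavailable_for(samples, duration_seconds=300):
    """Return True if desired > available continuously for the duration.

    samples: list of (timestamp_seconds, desired, available), ascending.
    """
    if not samples:
        return False
    breach_start = None
    for ts, desired, available in samples:
        if desired - available > 0:
            # Remember when the current uninterrupted gap began.
            if breach_start is None:
                breach_start = ts
        else:
            # Any recovery resets the clock.
            breach_start = None
    latest_ts = samples[-1][0]
    return breach_start is not None and latest_ts - breach_start >= duration_seconds


if __name__ == "__main__":
    stuck = [(0, 3, 3), (60, 3, 2), (180, 3, 2), (360, 3, 2)]
    rolling = [(0, 3, 3), (60, 3, 2), (240, 3, 3), (300, 3, 2), (360, 3, 2)]
    print(unavailable_for(stuck), unavailable_for(rolling))
```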

Alerts can notify via email, PagerDuty, Slack webhooks, or Pub/Sub topics. See Creating Alerts in Cloud Monitoring for full policy configuration, and Incident Response with Monitoring for how to structure your response process.

Warning

Setting alert thresholds too low generates alert fatigue, where your team starts ignoring notifications because they fire too often on normal variation. Start conservative — 80% CPU for 5 minutes rather than 60% for 1 minute — and tune thresholds after observing your workload’s normal behaviour for a week.
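
One simple way to tune against a baseline is to replay observed utilisation against candidate thresholds and see how often each would have been exceeded, and for how long at a stretch. The sketch below does this on a handful of invented one-minute samples. A real baseline would cover about a week of data exported from Cloud Monitoring.

```python
def fraction_above(samples, threshold):
    """Share of samples that exceed the threshold."""
    return sum(1 for value in samples if value > threshold) / len(samples)


def longest_run_above(samples, threshold):
    """Longest streak of consecutive samples above the threshold."""
    longest = current = 0
    for value in samples:
        current = current + 1 if value > threshold else 0
        longest = max(longest, current)
    return longest


# Invented CPU utilisation samples, one per minute.
BASELINE = [0.50, 0.55, 0.62, 0.70, 0.65, 0.58, 0.82, 0.60, 0.61, 0.59]

if __name__ == "__main__":
    for threshold in (0.60, 0.80):
        print(
            f"threshold {threshold:.0%}: "
            f"exceeded {fraction_above(BASELINE, threshold):.0%} of the time, "
            f"longest streak {longest_run_above(BASELINE, threshold)} min"
        )
```

With one-minute samples, a condition that requires 5 minutes fires only when the longest streak reaches 5. In this toy data neither threshold would page anyone at that duration, but a 60% threshold with a 1-minute duration would fire repeatedly.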

Managed Service for Prometheus

The built-in GKE metrics cover infrastructure: nodes, pods, containers. They do not know anything about your application’s internal state — how many items are in a queue, how fast requests are completing, or how many errors a specific endpoint is returning.

For application-level metrics, use Google Cloud Managed Service for Prometheus (GMP): a fully managed, Prometheus-compatible backend integrated with Cloud Monitoring. You get Prometheus scraping without running or managing a Prometheus server.
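
For context, "exposing metrics" means serving plain text in the Prometheus exposition format from an HTTP endpoint such as /metrics. The sketch below renders that text by hand so the format is visible. The metric names (myapp_queue_depth, myapp_http_requests_total) are made up, and a real service would normally use an official Prometheus client library instead of formatting strings itself.

```python
def render_metrics(queue_depth, requests_by_status):
    """Render two example metrics in the Prometheus text format.

    queue_depth: current number of queued items (a gauge).
    requests_by_status: {status_code: total_requests} (a counter per label).
    """
    lines = [
        "# HELP myapp_queue_depth Items currently waiting in the work queue.",
        "# TYPE myapp_queue_depth gauge",
        f"myapp_queue_depth {queue_depth}",
        "# HELP myapp_http_requests_total HTTP requests handled, by status.",
        "# TYPE myapp_http_requests_total counter",
    ]
    for status, count in sorted(requests_by_status.items()):
        lines.append(f'myapp_http_requests_total{{status="{status}"}} {count}')
    # The format is line-oriented and ends with a trailing newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metrics(7, {"200": 40, "500": 2}), end="")
```

Each scrape reads the current values, and the backend turns successive scrapes into time series.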

A PodMonitoring resource scrapes metrics from pods matching a label selector within a single namespace:

apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: my-app-metrics
  namespace: production
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: metrics
      interval: 30s

ClusterPodMonitoring works the same way but is cluster-scoped, scraping matching pods across all namespaces. Use PodMonitoring when you want namespace-level control; use ClusterPodMonitoring for platform-wide metrics collection.
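
If you manage many services, you may want to generate these resources rather than write each by hand. The sketch below builds either kind as a Python dictionary and prints JSON, which kubectl apply -f accepts in place of YAML. The function name and the rule "no namespace means cluster-scoped" are choices made for this example only.

```python
import json


def pod_monitoring(name, app_label, namespace=None,
                   port="metrics", interval="30s"):
    """Build a PodMonitoring manifest, or ClusterPodMonitoring if no namespace."""
    metadata = {"name": name}
    if namespace is not None:
        metadata["namespace"] = namespace
    return {
        "apiVersion": "monitoring.googleapis.com/v1",
        "kind": "PodMonitoring" if namespace is not None else "ClusterPodMonitoring",
        "metadata": metadata,
        "spec": {
            "selector": {"matchLabels": {"app": app_label}},
            "endpoints": [{"port": port, "interval": interval}],
        },
    }


if __name__ == "__main__":
    # Equivalent to the YAML manifest shown above.
    print(json.dumps(
        pod_monitoring("my-app-metrics", "my-app", namespace="production"),
        indent=2,
    ))
```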

Once configured, your custom metrics appear in Cloud Monitoring under the prometheus.googleapis.com namespace. Query them using PromQL in Metrics Explorer, add them to dashboards, or connect Grafana to GMP as a data source. Metrics collected by GMP can also drive custom-metric-based autoscaling with Horizontal Pod Autoscaling.

Tip

If you only care about node and pod health, the built-in GKE metrics are enough. You need Managed Prometheus only when your applications expose custom metrics — queue depth, request rates, error rates per endpoint — that GKE does not track automatically.

Cloud Monitoring vs kubectl top vs Managed Prometheus

	Cloud Monitoring	kubectl top	Managed Prometheus
Purpose	Long-term visibility, alerting, dashboards	Instant spot-check of current usage	Custom application metrics
Historical data	Yes (retention varies by metric type)	No	Yes, 24 months
Alerts	Yes	No	Via Cloud Monitoring
Setup required	None, enabled by default	None, metrics-server pre-installed	PodMonitoring resources
Metrics covered	Node, pod, container infrastructure	Current CPU and memory only	Custom app metrics you define
Best use case	Production monitoring, incident response	Active debugging right now	Queue depth, request rates, error rates

Use all three. kubectl top answers “what is consuming resources right now.” Cloud Monitoring answers “what happened over the last hour, day, or month and what should alert me.” Managed Prometheus answers “what is my application doing internally.”

When to set up GKE monitoring

The short answer is: before your first production deployment. Specifically, set up dashboards and alerting when:

You are running any workload in production. Alerts should be in place before go-live, not after the first incident.
You need historical metrics to understand trends, compare before and after a change, or run a post-mortem.
You are diagnosing performance issues: comparing actual usage against requests and limits, or identifying throttled containers.
You need to catch node pressure before pods get evicted and the cluster runs out of scheduling capacity.
Your applications expose custom metrics and you want to alert on them or use them to drive autoscaling.
You are doing cluster upgrades. Monitoring before and after confirms the upgrade did not degrade workload performance. See Upgrading GKE Clusters Safely.

Common mistakes

Treating kubectl top as full monitoring. kubectl top shows only the current instant. It has no history, no alerting, and no trend analysis. It cannot answer “was CPU high last Tuesday at 3am?” For production visibility, you need Cloud Monitoring.
Not configuring any alerting policies. Dashboards are passive: you have to be looking at them. Production clusters should always have alerts for pod restarts, node pressure, and unavailable replicas at minimum. An unmonitored cluster is one where users find problems first.
Setting alert thresholds too aggressively. If an alert fires multiple times a day on normal variation, teams start ignoring all alerts. Calibrate thresholds against observed baseline behaviour and require conditions to persist for at least 5 minutes before firing.
Confusing resource requests with actual usage. Cloud Monitoring reports actual usage. A pod can have a 1 CPU request but consume only 50m in practice, or be throttled because it is hitting its CPU limit. Always compare core_usage_time against both request_cores and limit_cores for the full picture.
Ignoring container restart counts. A rising restart count is one of the most reliable early warning signals in Kubernetes. A container restarting repeatedly is either crashing (check logs), hitting an OOM limit (increase the memory limit or fix the leak), or failing liveness checks (check the liveness and startup probe config; a failing readiness probe removes the pod from service but does not restart it). Set an alert on restart rate and treat it seriously.
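
On the requests-versus-usage point above: core_usage_time is a cumulative count of CPU seconds, so it has to be turned into a rate before it can be compared with request_cores or limit_cores. The sketch below shows the arithmetic on invented numbers. The function names and sample values are hypothetical.

```python
def cpu_cores_used(prev_sample, curr_sample):
    """Average cores used between two samples of a CPU-seconds counter.

    Each sample is (timestamp_seconds, cumulative_cpu_seconds).
    """
    (t0, c0), (t1, c1) = prev_sample, curr_sample
    return (c1 - c0) / (t1 - t0)


def cpu_report(prev_sample, curr_sample, request_cores, limit_cores):
    """Express usage as a fraction of both the request and the limit."""
    used = cpu_cores_used(prev_sample, curr_sample)
    return {
        "used_cores": used,
        "of_request": used / request_cores,
        "of_limit": used / limit_cores,
    }


if __name__ == "__main__":
    # 3 CPU-seconds consumed over 60 seconds is 0.05 cores, or 50m.
    report = cpu_report((0, 100.0), (60, 103.0), request_cores=1.0, limit_cores=2.0)
    print(report)
```

A container far below its request is over-allocated, while one whose usage sits at its limit is likely being throttled.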

Frequently asked questions

Is Cloud Monitoring enabled by default on GKE clusters?

Yes. GKE clusters have Cloud Monitoring and Cloud Logging enabled by default when created through the console or gcloud. The monitoring agent collects node and pod metrics automatically. You can verify or change this under the cluster's Features section in the Google Cloud Console, or with the --monitoring flag in gcloud.

What is the difference between kubectl top and Cloud Monitoring?

kubectl top gives you real-time resource usage (CPU and memory) directly from the metrics-server, useful for instant spot-checks. Cloud Monitoring stores historical metrics over time and supports dashboards, alerting policies, and long-term trend analysis. Use kubectl top for immediate debugging and Cloud Monitoring for ongoing visibility and alerting.

Do I need Prometheus to monitor a GKE cluster?

No. GKE's built-in Cloud Monitoring integration covers node, pod, and container metrics without any Prometheus setup. You only need Managed Service for Prometheus when you want to collect custom application metrics exposed in the Prometheus format: queue depths, request latency histograms, business counters, and so on.

What alerts should I set up first for a new GKE cluster?

Start with four: container restart rate above 5 in 10 minutes (CrashLoopBackOff signal), node CPU utilisation above 80% for 5 minutes, node memory utilisation above 85% for 5 minutes, and deployment unavailable replicas above 0 for 5 minutes. These cover the most common production failure modes without generating alert fatigue.

Can I monitor custom application metrics in GKE?

Yes. Use Google Cloud Managed Service for Prometheus (GMP) with a PodMonitoring custom resource to scrape metrics your application exposes in the Prometheus format. Those metrics then appear in Cloud Monitoring under the prometheus.googleapis.com namespace, where you can query them with PromQL and add them to dashboards.

Last verified: 23 March 2026. Cloud services change frequently. Verify details against official documentation before making infrastructure decisions.
