GKE Monitoring Explained: Cloud Monitoring, kubectl top, and Managed Prometheus
When something goes wrong in a GKE cluster, the question is always the same:
how do you know before your users do? GKE answers that by integrating with
Cloud Monitoring out of the box. Every cluster automatically sends node, pod,
and container metrics the moment it starts. No agents to install, no pipelines
to configure. This page explains how that integration works, how to use
kubectl top for quick checks, how to build dashboards and alerting
policies, and how Managed Service for Prometheus extends visibility to your
application’s own metrics.
What GKE monitoring gives you
Think of monitoring a GKE cluster as watching three layers at once: are the machines (nodes) healthy, are the workloads (pods) running, and are the applications behaving correctly. GKE handles the first two layers automatically. The third layer requires your application to expose metrics.
The tools GKE gives you out of the box:
Cloud Monitoring — the main observability platform. GKE automatically sends CPU, memory, network, and restart metrics here. You build dashboards and alerting policies on top of those metrics.
Cloud Logging — collects logs from every container running in your cluster. Logs complement metrics when you need to understand why something went wrong, not just that it went wrong. See Logging in Kubernetes on GKE for how this works.
kubectl top — shows current CPU and memory usage for nodes and pods directly from the metrics-server. Good for quick spot-checks, not a substitute for Cloud Monitoring.
Managed Service for Prometheus — a fully managed Prometheus-compatible backend. Use it when your applications expose custom metrics that Cloud Monitoring does not collect by default.
None of these require manual agent installation. The cluster handles the integration automatically.
A hospital ward monitor
Think of your cluster as a hospital ward. Cloud Monitoring is the patient
display showing heart rate (CPU), blood pressure (memory), and breathing
(restarts) over time. kubectl top is the nurse doing a quick
manual check right now. Managed Prometheus is the specialist equipment you
bring in when you need to measure something the standard monitors do not
track. You need all three in different situations.
How monitoring flows from cluster to alert
GKE emits metrics continuously. Every node runs a monitoring agent that collects CPU utilisation, memory usage, disk I/O, network throughput, and pod restart counts. These are sent to Cloud Monitoring in near real time.
Cloud Monitoring stores the metrics as time-series data. All GKE system metrics are stored under the kubernetes.io metric prefix, keyed by cluster, namespace, pod, and container. They are retained for 24 months by default.
Dashboards visualise trends over time. You can use the pre-built GKE dashboards under Kubernetes Engine › Observability, or build custom dashboards that combine GKE metrics with metrics from other services.
Alerting policies notify you when thresholds are crossed. You define conditions like “node CPU above 80% for 5 minutes” and Cloud Monitoring fires a notification to email, Slack, PagerDuty, or Pub/Sub when the condition is met.
Managed Service for Prometheus adds application-level metrics. If your application exposes a /metrics endpoint, a PodMonitoring resource tells GMP to scrape it and send those metrics into Cloud Monitoring alongside the built-in GKE metrics.
Logs and traces fill in the gaps. When metrics show something is wrong, Cloud Trace shows request latency through your services and Cloud Profiler shows which functions are consuming CPU or memory.
If you have a GKE cluster already running, open Kubernetes Engine › Observability in the Cloud Console right now. You will find pre-built dashboards for node health, workload performance, and pod status with no setup required.
Built-in monitoring in GKE
When you create a GKE cluster, Cloud Monitoring is enabled by default. A monitoring agent runs on each node and forwards metrics automatically, with no installation or configuration required.
Metrics cover three levels:
Node metrics — CPU utilisation, memory utilisation, disk I/O, and network throughput per node. See Node Pools Explained for how nodes are organised.
Pod and container metrics — CPU and memory requests versus actual usage for every running container. Kubernetes Pods Explained covers the pod model this builds on.
Cluster-level metrics — node count, node conditions, API server request latency, and etcd metrics.
All GKE system metrics in Cloud Monitoring share the kubernetes.io metric prefix:
kubernetes.io/container/cpu/core_usage_time
kubernetes.io/container/memory/used_bytes
kubernetes.io/node/cpu/allocatable_utilization
kubernetes.io/node/memory/allocatable_utilization
kubernetes.io/pod/network/received_bytes_count
kubernetes.io/pod/network/sent_bytes_count
Explore these in Metrics Explorer by searching for the kubernetes.io namespace. For a broader overview of how GCP metrics work, see Metrics in GCP.
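Outside Metrics Explorer, the same metrics are usually selected with a Cloud Monitoring filter string. The sketch below shows how such a filter might be assembled. The helper name `gke_metric_filter` is hypothetical, and the example assumes the `k8s_container` monitored resource type and its `namespace_name` label.

```python
def gke_metric_filter(metric_type: str, resource_type: str, **resource_labels: str) -> str:
    """Build a Cloud Monitoring filter string for one GKE metric (illustrative helper)."""
    clauses = [
        f'metric.type = "{metric_type}"',
        f'resource.type = "{resource_type}"',
    ]
    # Sort the labels so the same inputs always yield the same filter text.
    for label, value in sorted(resource_labels.items()):
        clauses.append(f'resource.labels.{label} = "{value}"')
    return " AND ".join(clauses)


# Container CPU usage, restricted to the "production" namespace.
cpu_filter = gke_metric_filter(
    "kubernetes.io/container/cpu/core_usage_time",
    "k8s_container",
    namespace_name="production",
)
print(cpu_filter)
```

The resulting string can be pasted into a filter field or passed to a Monitoring API client. Neither step is shown here.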
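GKE separates System Components monitoring from Workloads monitoring. Both are enabled by default but are independently configurable. If system metrics were disabled at cluster creation, re-enable them with the --monitoring flag. Managed collection for Managed Service for Prometheus has its own flag, --enable-managed-prometheus, shown in the second command:
gcloud container clusters update CLUSTER_NAME --monitoring=SYSTEM --region REGION
gcloud container clusters update CLUSTER_NAME --enable-managed-prometheus --region REGION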
Monitoring from the Cloud Console
The Cloud Console provides a live Kubernetes view under Kubernetes Engine › Workloads. This is often the fastest place to start when something looks wrong.
For each workload (Deployments, StatefulSets, DaemonSets, Jobs), you can see:
- Pod status — Running, Pending, Failed, or CrashLoopBackOff for each pod.
- Restart count — a rising restart count is one of the clearest signals that a container is crashing repeatedly.
- Unavailable pods — any gap between desired and running pods means a deployment is degraded.
- CPU and memory usage — displayed relative to the requests set on each container, so you can see at a glance which workloads are near their limits.
- Node placement — which pods are running on which nodes.
Clicking into a specific workload shows historical CPU and memory graphs, the associated pod list, and a direct link into Cloud Logging for that workload’s logs. For most incidents, you can go from “something is wrong” to a working hypothesis without leaving this view.
Real-time checks with kubectl top
For instant command-line visibility, kubectl top queries the
metrics-server pre-installed on GKE clusters and returns current CPU and
memory usage.
# Current CPU and memory for all nodes
kubectl top nodes
# Current usage for all pods in a namespace
kubectl top pods -n my-namespace
# Usage across all namespaces
kubectl top pods -A
# Sort by memory usage
kubectl top pods -n my-namespace --sort-by=memory
# Show per-container usage within pods
kubectl top pods -n my-namespace --containers
The output shows actual current usage, not configured requests or limits. Comparing kubectl top output against the requests and limits in your manifests quickly reveals pods near their limits (throttling or OOM risk) and pods that are massively over-allocated (wasted cost).
kubectl top is best for answering “which pod is eating all
the CPU right now?” during active debugging. It has no history, no
alerting, and cannot answer “was CPU high at 3am last Tuesday?” For that,
you need Cloud Monitoring.
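The comparison between kubectl top output and manifest requests and limits can be scripted. The following is a minimal sketch, not a production tool. The sample output, the pod names, the request and limit values, and the 90% and 10% cut-offs are all assumptions chosen for illustration.

```python
# Sample text in the column layout that `kubectl top pods` prints (values are made up).
SAMPLE_TOP_OUTPUT = """\
NAME        CPU(cores)   MEMORY(bytes)
api-0       480m         300Mi
worker-0    20m          64Mi
"""

# Hypothetical CPU requests and limits in millicores, as they would be set in the manifests.
CPU_SPEC_MILLICORES = {
    "api-0": {"request": 250, "limit": 500},
    "worker-0": {"request": 1000, "limit": 2000},
}


def parse_millicores(text: str) -> int:
    """Convert a Kubernetes CPU quantity such as '480m' or '2' to millicores."""
    if text.endswith("m"):
        return int(text[:-1])
    return int(float(text) * 1000)


def classify_pods(top_output: str, spec: dict) -> dict:
    """Label each pod by comparing current CPU usage with its request and limit."""
    results = {}
    for line in top_output.strip().splitlines()[1:]:  # skip the header row
        name, cpu, _memory = line.split()
        used = parse_millicores(cpu)
        if used * 10 >= spec[name]["limit"] * 9:      # at or above 90% of the limit
            results[name] = "near limit"
        elif used * 10 <= spec[name]["request"]:      # at or below 10% of the request
            results[name] = "over-allocated"
        else:
            results[name] = "ok"
    return results


print(classify_pods(SAMPLE_TOP_OUTPUT, CPU_SPEC_MILLICORES))
```

In real use, the text would come from running kubectl top pods and the spec from your manifests. Both are hard-coded here so the example runs without a cluster.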
Dashboards in Cloud Monitoring
Dashboards give you a persistent, visual view of cluster health over time. Unlike the Workloads console view, dashboards let you compare metrics across time windows, correlate node pressure with pod restarts, and share a single URL with your team during an incident.
GKE provides pre-built dashboards under Kubernetes Engine › Observability that display cluster health, node status, and workload performance without any setup. For a custom dashboard, open Cloud Monitoring › Dashboards › Create Dashboard and add charts for:
- Node CPU utilisation: kubernetes.io/node/cpu/allocatable_utilization grouped by node name
- Node memory utilisation: kubernetes.io/node/memory/allocatable_utilization grouped by node name
- Pod CPU usage: kubernetes.io/container/cpu/core_usage_time grouped by namespace or pod name
- Pod memory usage: kubernetes.io/container/memory/used_bytes grouped by pod name
- Container restart count: kubernetes.io/container/restart_count; rising values are an early warning signal
For a deeper look at how to build and structure dashboards, see Dashboards in Cloud Monitoring.
Alerting for GKE clusters
Dashboards are passive: you have to be looking at them. Alerting policies are active. Configure them in Cloud Monitoring › Alerting › Create Policy so you get notified when something crosses a threshold, even at 3am.
Four policies to set up first:
Container restart rate — signals CrashLoopBackOff or repeated crashes:
Metric: kubernetes.io/container/restart_count
Filter: namespace_name = "production"
Condition: rate of change > 5 restarts in 10 minutes
Node CPU pressure:
Metric: kubernetes.io/node/cpu/allocatable_utilization
Condition: value > 0.80 (80%) for 5 minutes
Sustained high CPU on a node means workloads are competing for resources. New pods may fail to schedule and existing pods may be throttled.
Node memory pressure:
Metric: kubernetes.io/node/memory/allocatable_utilization
Condition: value > 0.85 (85%) for 5 minutes
Nodes near memory capacity will start evicting pods automatically. Evictions happen fast and without warning. Memory pressure is often the first sign of a cascade of pod failures, so do not ignore rising memory utilisation.
Deployment unavailable replicas — signals a failed rollout or crashing pods:
Metric: kubernetes.io/deployment/desired_pods minus kubernetes.io/deployment/available_pods
Condition: value > 0 for 5 minutes
Any gap between desired and available replicas means your service is running at reduced capacity.
Alerts can notify via email, PagerDuty, Slack webhooks, or Pub/Sub topics. See Creating Alerts in Cloud Monitoring for full policy configuration, and Incident Response with Monitoring for how to structure your response process.
Setting alert thresholds too low generates alert fatigue, where your team starts ignoring notifications because they fire too often on normal variation. Start conservative — 80% CPU for 5 minutes rather than 60% for 1 minute — and tune thresholds after observing your workload’s normal behaviour for a week.
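To see why the duration window matters as much as the threshold, the sketch below models two kinds of condition used above: "value above a threshold for N minutes" and "counter increased by more than X within a window". It is a simplified model for reasoning about thresholds, not a description of how Cloud Monitoring evaluates policies internally. The function names and sample data are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, metric value)


def threshold_held(samples: List[Sample], threshold: float, duration_s: int) -> bool:
    """True if the data spans the whole trailing window and every sample in it exceeds the threshold."""
    if not samples:
        return False
    latest = samples[-1][0]
    window_start = latest - duration_s
    if samples[0][0] > window_start:
        return False  # not enough history to cover the window
    return all(value > threshold for t, value in samples if t >= window_start)


def counter_increase(samples: List[Sample], window_s: int) -> float:
    """Increase of a cumulative counter (such as a restart count) over the trailing window.

    Counter resets are ignored in this sketch.
    """
    latest_t, latest_v = samples[-1]
    in_window = [value for t, value in samples if t >= latest_t - window_s]
    return latest_v - in_window[0]


# One sample per minute; a brief CPU spike at the end of an otherwise quiet period.
spike = [(0, 0.50), (60, 0.50), (120, 0.50), (180, 0.50), (240, 0.70), (300, 0.95)]
sustained = [(t, 0.85) for t in range(0, 301, 60)]
restarts = [(0, 3.0), (300, 5.0), (600, 10.0)]

print(threshold_held(spike, 0.60, 60))       # aggressive policy reacts to the short spike
print(threshold_held(spike, 0.80, 300))      # conservative policy ignores it
print(threshold_held(sustained, 0.80, 300))  # sustained pressure still triggers
print(counter_increase(restarts, 600) > 5)   # more than 5 restarts in 10 minutes
```

With this data the 60%-for-1-minute policy fires on a spike that the 80%-for-5-minutes policy ignores, while genuinely sustained pressure triggers the conservative policy as well.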
Managed Service for Prometheus
The built-in GKE metrics cover infrastructure: nodes, pods, containers. They do not know anything about your application’s internal state — how many items are in a queue, how fast requests are completing, or how many errors a specific endpoint is returning.
For application-level metrics, use Google Cloud Managed Service for Prometheus (GMP): a fully managed, Prometheus-compatible backend integrated with Cloud Monitoring. You get Prometheus scraping without running or managing a Prometheus server.
A PodMonitoring resource scrapes metrics from pods matching a
label selector within a single namespace:
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: my-app-metrics
  namespace: production
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
  - port: metrics
    interval: 30s
ClusterPodMonitoring works the same way but is cluster-scoped, scraping matching pods across all namespaces. Use PodMonitoring when you want namespace-level control; use ClusterPodMonitoring for platform-wide metrics collection.
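For a PodMonitoring resource to collect anything, the application has to serve metrics in the Prometheus text exposition format on the scraped port. The sketch below shows one way that could look using only the Python standard library. The metric names, values, and port number are hypothetical, and most real applications would use a Prometheus client library instead of hand-rendering the text.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process metric values; a real application would update these as it works.
QUEUE_DEPTH = 12
REQUESTS_TOTAL = {"/checkout": 340, "/search": 1275}


def render_metrics() -> str:
    """Render the current values in Prometheus text exposition format."""
    lines = [
        "# HELP myapp_queue_depth Items currently waiting in the queue.",
        "# TYPE myapp_queue_depth gauge",
        f"myapp_queue_depth {QUEUE_DEPTH}",
        "# HELP myapp_requests_total Requests handled, by endpoint.",
        "# TYPE myapp_requests_total counter",
    ]
    for endpoint, count in sorted(REQUESTS_TOTAL.items()):
        lines.append(f'myapp_requests_total{{endpoint="{endpoint}"}} {count}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the rendered metrics on /metrics and returns 404 for anything else."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Start the metrics server; blocks until the process is stopped."""
    HTTPServer(("", port), MetricsHandler).serve_forever()


print(render_metrics())
```

Calling serve() starts the endpoint. For the PodMonitoring example to match it, the container port in the pod spec would need to be named metrics, since the resource refers to the port by that name.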
Once configured, your custom metrics appear in Cloud Monitoring under the
prometheus.googleapis.com namespace. Query them using PromQL in
Metrics Explorer, add them to dashboards, or connect Grafana to GMP as a data
source. GMP is also what powers custom-metric-based autoscaling with
Horizontal
Pod Autoscaling.
If you only care about node and pod health, the built-in GKE metrics are enough. You need Managed Prometheus only when your applications expose custom metrics — queue depth, request rates, error rates per endpoint — that GKE does not track automatically.
Cloud Monitoring vs kubectl top vs Managed Prometheus
| | Cloud Monitoring | kubectl top | Managed Prometheus |
|---|---|---|---|
| Purpose | Long-term visibility, alerting, dashboards | Instant spot-check of current usage | Custom application metrics |
| Historical data | Yes, 24 months | No | Yes, 24 months |
| Alerts | Yes | No | Via Cloud Monitoring |
| Setup required | None, enabled by default | None, metrics-server pre-installed | PodMonitoring resources |
| Metrics covered | Node, pod, container infrastructure | Current CPU and memory only | Custom app metrics you define |
| Best use case | Production monitoring, incident response | Active debugging right now | Queue depth, request rates, error rates |
Use all three. kubectl top answers “what is consuming resources
right now.” Cloud Monitoring answers “what happened over the last hour, day,
or month and what should alert me.” Managed Prometheus answers “what is my
application doing internally.”
When to set up GKE monitoring
The short answer is: before your first production deployment. Specifically, set up dashboards and alerting when:
- You are running any workload in production. Alerts should be in place before go-live, not after the first incident.
- You need historical metrics to understand trends, compare before and after a change, or run a post-mortem.
- You are diagnosing performance issues: comparing actual usage against requests and limits, or identifying throttled containers.
- You need to catch node pressure before pods get evicted and the cluster runs out of scheduling capacity.
- Your applications expose custom metrics and you want to alert on them or use them to drive autoscaling.
- You are doing cluster upgrades. Monitoring before and after confirms the upgrade did not degrade workload performance. See Upgrading GKE Clusters Safely.
Common mistakes
Treating kubectl top as full monitoring. kubectl top shows only the current instant. It has no history, no alerting, and no trend analysis. It cannot answer “was CPU high last Tuesday at 3am?” For production visibility, you need Cloud Monitoring.
Not configuring any alerting policies. Dashboards are passive: you have to be looking at them. Production clusters should always have alerts for pod restarts, node pressure, and unavailable replicas at minimum. An unmonitored cluster is one where users find problems first.
Setting alert thresholds too aggressively. If an alert fires multiple times a day on normal variation, teams start ignoring all alerts. Calibrate thresholds against observed baseline behaviour and require conditions to persist for at least 5 minutes before firing.
Confusing resource requests with actual usage. Cloud Monitoring reports actual usage. A pod can have a 1 CPU request but consume only 50m in practice, or be throttled because it is hitting its CPU limit. Always compare core_usage_time against both request_cores and limit_cores for the full picture.
Ignoring container restart counts. A rising restart count is one of the most reliable early warning signals in Kubernetes. A container restarting repeatedly is either crashing (check logs), hitting an OOM limit (increase the memory limit or fix the leak), or failing health checks (check readiness probe config). Set an alert on restart rate and treat it seriously.
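Comparing usage with requests and limits takes one extra step, because core_usage_time is a cumulative count of CPU seconds, not an instantaneous value. The sketch below turns two readings into average cores used and then into utilisation ratios. It assumes core_usage_time is reported in CPU-seconds, and the function names and sample readings are hypothetical.

```python
from typing import Dict, Tuple

Reading = Tuple[float, float]  # (timestamp in seconds, cumulative CPU seconds)


def average_cores(first: Reading, second: Reading) -> float:
    """Average number of cores used between two cumulative core_usage_time readings."""
    (t0, cpu0), (t1, cpu1) = first, second
    if t1 <= t0:
        raise ValueError("second reading must be later than the first")
    return (cpu1 - cpu0) / (t1 - t0)


def utilisation(cores_used: float, request_cores: float, limit_cores: float) -> Dict[str, float]:
    """Express usage as a fraction of the container's CPU request and CPU limit."""
    return {
        "of_request": cores_used / request_cores,
        "of_limit": cores_used / limit_cores,
    }


# 3 CPU-seconds consumed over 60 seconds of wall-clock time is 0.05 cores (50m).
used = average_cores((0.0, 1200.0), (60.0, 1203.0))
print(used)
print(utilisation(used, request_cores=1.0, limit_cores=2.0))
```

A container using 5% of a 1 CPU request is a candidate for a smaller request, whereas a value close to 1.0 for of_limit would point to likely throttling.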
Summary
- GKE has Cloud Monitoring enabled by default. Node, pod, and container metrics are collected automatically under the kubernetes.io namespace with no setup required.
- The GKE Workloads view in Cloud Console shows live pod status, restart counts, and resource usage, making it the fastest place to start during an incident.
- kubectl top nodes and kubectl top pods give real-time usage from the pre-installed metrics-server, useful for instant spot-checks but not for historical analysis.
- Cloud Monitoring dashboards and alerting policies provide proactive notification for pod restarts, CPU pressure, memory pressure, and unavailable replicas.
- Managed Service for Prometheus extends monitoring to custom application metrics using PodMonitoring resources, with no Prometheus server to manage.
- Cloud Trace captures distributed request traces and Cloud Profiler provides low-overhead CPU and memory profiling for running workloads.
- Use all three tools together: Cloud Monitoring for infrastructure trends and alerting, Managed Prometheus for application metrics, and kubectl top for real-time debugging.
Frequently asked questions
Is Cloud Monitoring enabled by default on GKE clusters?
Yes. GKE clusters have Cloud Monitoring and Cloud Logging enabled by default when created through the console or gcloud. The monitoring agent collects node and pod metrics automatically. You can verify or change this under the cluster's Features section in the Google Cloud Console, or with the --monitoring flag in gcloud.
What is the difference between kubectl top and Cloud Monitoring?
kubectl top gives you real-time resource usage (CPU and memory) directly from the metrics-server, useful for instant spot-checks. Cloud Monitoring stores historical metrics over time and supports dashboards, alerting policies, and long-term trend analysis. Use kubectl top for immediate debugging and Cloud Monitoring for ongoing visibility and alerting.
Do I need Prometheus to monitor a GKE cluster?
No. GKE's built-in Cloud Monitoring integration covers node, pod, and container metrics without any Prometheus setup. You only need Managed Service for Prometheus when you want to collect custom application metrics exposed in the Prometheus format: queue depths, request latency histograms, business counters, and so on.
What alerts should I set up first for a new GKE cluster?
Start with four: container restart rate above 5 in 10 minutes (CrashLoopBackOff signal), node CPU utilisation above 80% for 5 minutes, node memory utilisation above 85% for 5 minutes, and deployment unavailable replicas above 0 for 5 minutes. These cover the most common production failure modes without generating alert fatigue.
Can I monitor custom application metrics in GKE?
Yes. Use Google Cloud Managed Service for Prometheus (GMP) with a PodMonitoring custom resource to scrape metrics your application exposes in the Prometheus format. Those metrics then appear in Cloud Monitoring under the prometheus.googleapis.com namespace, where you can query them with PromQL and add them to dashboards.