GCP Metrics Explained: Gauge vs Delta vs Cumulative in Cloud Monitoring

Cloud Monitoring stores metric data as time series: sequences of measurements over time, each tagged with labels. Every metric has a kind (gauge, delta, or cumulative) that determines how to interpret values, build charts, and write alerts. Get it wrong and your alerts fire on incorrect conditions, or never fire at all.

What is a metric?

A metric is a named, numeric measurement that changes over time. Cloud Monitoring records these as time series: ordered sequences of data points, each with a timestamp and a value.

For example, CPU utilization on a VM is a metric. Every minute, Cloud Monitoring records “this VM’s CPU was at 43%.” Those data points accumulate into a time series you can chart, alert on, and query.

Every metric in Cloud Monitoring has three defining properties:

Metric type: the unique identifier, like compute.googleapis.com/instance/cpu/utilization
Metric kind: gauge, delta, or cumulative
Value type: BOOL, INT64, DOUBLE, STRING, or DISTRIBUTION

The metric kind is the one most beginners overlook. It also causes the most broken alerts.

Tip: Metrics Explorer

Open Metrics Explorer in the GCP Console (Monitoring → Metrics Explorer). Select any metric and the details panel shows its kind, value type, unit, and labels. This is the fastest way to check an unfamiliar metric before writing an alert or dashboard.

Why metric kind matters

Metric kind is not an implementation detail. It directly affects whether your monitoring works correctly.

Bad alerts. Alert on a raw cumulative value and the condition becomes true as soon as the counter passes the threshold, then stays true because the counter only grows. The alert keeps firing and pages you for something that was never actually wrong.
Misleading charts. Chart a delta metric without proper alignment and you get jagged spikes or flat lines depending on UI defaults, not what actually happened.
False confidence. A chart showing zero requests might mean no traffic, or it might mean the wrong aligner or time window is selected. Knowing the metric kind tells you which interpretation is correct.

Real consequence

A team sets an alert: fire when kubernetes.io/container/cpu/core_usage_time exceeds 100. That metric is cumulative. It starts at 0 and climbs for as long as the container runs. The alert fires within minutes of deployment and never clears. On-call gets paged at 2am for a container that is running perfectly. This is not hypothetical: it is a routine mistake with cumulative metrics.

Understanding metric kinds makes you faster when writing alerting policies and building dashboards. You know immediately which aggregation to reach for.

How metrics work in GCP

When Cloud Monitoring collects a measurement, it stores it as a data point on a time series. Each time series is uniquely identified by its metric type plus its label values.

Here is how the pieces fit together:

Metric descriptor: defines the metric’s type string, kind, value type, unit, and allowed labels. Every metric has exactly one descriptor. Inspect it in Metrics Explorer or via the metricDescriptors.get API.
Time series: the stream of data points for one specific combination of metric type and label values. CPU utilization for vm-instance-a in us-central1-a is one time series. The same metric for vm-instance-b is a different time series.
Labels: key-value pairs that identify which resource or dimension produced the data. Labels let you filter by service, region, response code class, and more.
Value type: what kind of number the metric records. INT64 for integers, DOUBLE for floating-point, DISTRIBUTION for latency histograms.
Metric kind: the semantic meaning of each data point. An instantaneous reading, a count over an interval, or a running total.
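
To make the "metric type plus label values" identity concrete, here is a minimal in-memory sketch. It is not how Cloud Monitoring is implemented; the SeriesKey class, the record helper, and the simplified label names (instance, zone) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeriesKey:
    """Identity of one time series: metric type plus its label values."""
    metric_type: str
    labels: tuple  # sorted (key, value) pairs, so the key is hashable


def series_key(metric_type, labels):
    return SeriesKey(metric_type, tuple(sorted(labels.items())))


CPU = "compute.googleapis.com/instance/cpu/utilization"
store = {}  # SeriesKey -> list of (timestamp_s, value) points


def record(metric_type, labels, timestamp_s, value):
    """Append a data point to the series identified by type + labels."""
    store.setdefault(series_key(metric_type, labels), []).append((timestamp_s, value))


# Two points for vm-instance-a land in one series; vm-instance-b gets its own.
record(CPU, {"instance": "vm-instance-a", "zone": "us-central1-a"}, 0, 0.43)
record(CPU, {"instance": "vm-instance-a", "zone": "us-central1-a"}, 60, 0.47)
record(CPU, {"instance": "vm-instance-b", "zone": "us-central1-a"}, 0, 0.12)

print(len(store))  # 2 distinct time series
```

Changing any label value produces a new key, which is why every distinct label combination is stored and billed as its own time series.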

Cloud Monitoring stores all this data in Monarch, Google’s internal time-series database used to monitor Google’s own infrastructure. You interact with Monarch only through the Cloud Monitoring API, which is why the query model is time-series-first rather than SQL-style.

Gauge, Delta, and Cumulative compared

Metric kind	What it means	Typical example	How to chart it	How to alert on it	Common mistake
Gauge	Current value at this instant	CPU utilization, memory usage, Pub/Sub backlog	Plot the raw value	Alert when value exceeds threshold	Assuming two readings imply a meaningful rate
Delta	Count of events in this interval	Request count, error count, messages published	Sum or rate per alignment period	Alert on rate or sum per window	Charting without specifying an alignment period
Cumulative	Running total since start	Total CPU core-seconds, total billable instance time	Apply Rate alignment first	Alert on rate of increase, not the raw value	Alerting on the raw value (it always increases)

Gauge

A gauge measures a value at a specific point in time. The value can go up or down between readings. Each data point is independent: it tells you the state right now, not how it changed.

Verified GCP examples:

compute.googleapis.com/instance/cpu/utilization: CPU as a fraction (0.0–1.0). Each point shows utilization at that moment.
run.googleapis.com/container/memory/utilization: Cloud Run container memory as a fraction of the memory limit.
pubsub.googleapis.com/subscription/num_undelivered_messages: current backlog size. This is a gauge because the backlog can grow or shrink at any time depending on whether your subscriber is keeping up.
cloudsql.googleapis.com/database/cpu/utilization: Cloud SQL CPU usage right now.

Charting and alerting: Plot the raw value. Alert when it exceeds a threshold. No alignment conversion needed.

Wrong way

Computing a “rate of change” by subtracting consecutive gauge readings. Gauges measure state, not flow. Two readings of 72% and 74% do not tell you how fast CPU is rising. For that, you need a metric specifically designed to measure CPU increase over time.
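
Gauge alerts usually combine a threshold with a duration so that a brief spike does not page anyone. The sketch below shows that logic on plain Python tuples; the function name and sample data are hypothetical, and a real alerting policy evaluates this server-side rather than in your code.

```python
def sustained_breach(points, threshold, duration_s):
    """Return True if the gauge stays above `threshold` continuously
    for at least `duration_s` seconds.

    points: list of (timestamp_s, value), sorted by timestamp.
    """
    breach_start = None
    for ts, value in points:
        if value > threshold:
            if breach_start is None:
                breach_start = ts  # first reading of this breach
            if ts - breach_start >= duration_s:
                return True
        else:
            breach_start = None  # any reading below threshold resets the clock
    return False


# One-minute CPU utilization readings as fractions (0.0 to 1.0).
cpu = [(0, 0.50), (60, 0.90), (120, 0.70), (180, 0.90),
       (240, 0.92), (300, 0.95), (360, 0.91)]

print(sustained_breach(cpu, threshold=0.85, duration_s=180))  # True
print(sustained_breach(cpu, threshold=0.85, duration_s=300))  # False
```

The single reading of 0.90 at 60 seconds does not count toward the breach because the next reading drops back below the threshold.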

Delta

A delta metric counts events that occurred during a measurement interval. Each data point covers a fixed window of time and resets at the start of the next window. It does not accumulate across intervals.

Verified GCP examples:

run.googleapis.com/request_count: requests handled during the interval. A value of 500 means 500 requests happened in that window.
run.googleapis.com/request_latencies: latency distribution for requests in the interval (DISTRIBUTION value type).
pubsub.googleapis.com/topic/send_message_operation_count: messages published to a topic during the interval.

Charting and alerting: In Metrics Explorer or an alert policy editor, set the per-series aligner to Sum for totals or Rate for per-second throughput. Always specify an explicit alignment period. Do not leave it on the UI default when writing alerts.

Wrong way

Charting raw delta values without an alignment period. The UI may show spikes that are artefacts of interval length, not real traffic increases. If the reporting interval changes, the chart looks like a traffic spike even when nothing changed.
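
The effect of an explicit alignment period can be sketched in a few lines. This is a simplified model, assuming each delta point falls entirely inside one alignment period; the real service also handles points that straddle period boundaries. The function name and sample counts are made up for illustration.

```python
from collections import defaultdict


def align_delta(points, period_s):
    """Bucket delta points into fixed alignment periods.

    points: list of (end_timestamp_s, count) where each count covers the
            interval ending at end_timestamp_s.
    Returns {period_start_s: (sum, rate_per_second)}.
    """
    sums = defaultdict(int)
    for end_ts, count in points:
        # A point ending exactly on a boundary belongs to the period that just closed.
        bucket = ((end_ts - 1) // period_s) * period_s
        sums[bucket] += count
    return {start: (total, total / period_s) for start, total in sorted(sums.items())}


# Request counts reported once per minute.
requests = [(60, 500), (120, 520), (180, 480), (240, 510), (300, 490), (360, 700)]

aligned = align_delta(requests, period_s=300)
print(aligned[0])  # sum and per-second rate for the first 5-minute period
```

The same six points give different numbers for a 60-second and a 300-second period, which is why an alert should pin the alignment period rather than inherit a UI default.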

Cumulative

A cumulative metric is a running total that only increases. It has a start time and grows over the lifetime of the process or resource.

Verified GCP examples:

kubernetes.io/container/cpu/core_usage_time: total CPU core-seconds used by a GKE container since it started. This is a cumulative DOUBLE.
run.googleapis.com/container/billable_instance_time: total billable instance time for a Cloud Run service, accumulated since service creation.

Charting and alerting: Never chart or alert on the raw cumulative value. It slopes upward forever and tells you nothing useful on its own. Apply Rate alignment (per-series aligner ALIGN_RATE) to convert it to a per-second rate of increase. In PromQL, the rate() function does this automatically for counter metrics. In the alert policy editor, Cloud Monitoring flags cumulative metrics and prompts you to set an alignment period for exactly this reason.

Most common alerting mistake

Setting an alert threshold on the raw cumulative value. The alert fires the moment the counter exceeds the threshold and never clears. This will page your on-call team immediately on every deployment. Apply ALIGN_RATE first, then compare the rate to a threshold.
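
As a rough illustration, the fix can be expressed as the condition portion of an alerting policy in the JSON shape used by the Cloud Monitoring API. The helper name, the threshold, and the five-minute window are assumptions; treat the field values as a starting point and check them against the alerting policy reference before use.

```python
import json


def rate_condition(metric_type, threshold_per_s, window_s=300):
    """Build a threshold condition that compares the per-second rate of a
    cumulative metric, not its raw value, against a threshold."""
    return {
        "displayName": f"Rate of {metric_type} above {threshold_per_s}/s",
        "conditionThreshold": {
            "filter": f'metric.type = "{metric_type}"',
            "aggregations": [
                {
                    "alignmentPeriod": f"{window_s}s",
                    # ALIGN_RATE converts the running total into a per-second rate.
                    "perSeriesAligner": "ALIGN_RATE",
                }
            ],
            "comparison": "COMPARISON_GT",
            "thresholdValue": threshold_per_s,
            "duration": f"{window_s}s",
        },
    }


condition = rate_condition("kubernetes.io/container/cpu/core_usage_time", 0.8)
print(json.dumps(condition, indent=2))
```

Because the threshold applies to the rate, the condition clears again when CPU consumption drops, which a threshold on the raw counter can never do.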

Analogy

A gauge is a speedometer: it shows what is happening right now. A cumulative metric is an odometer: it only goes up and shows the total distance since the counter started. To find your current speed from an odometer, you divide the distance change by the time elapsed. Rate alignment does exactly that calculation for you.
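
The odometer calculation is short enough to write out. This sketch uses invented sample values and a simplified treatment of counter resets (the interval is skipped); it illustrates the idea behind rate alignment and is not the service's exact algorithm.

```python
def align_rate(points):
    """Convert cumulative points into per-second rates.

    points: list of (timestamp_s, running_total), sorted by timestamp.
    Returns a list of (timestamp_s, rate_per_second).
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v1 < v0:
            continue  # counter reset (for example a restart): skip this interval
        rates.append((t1, (v1 - v0) / (t1 - t0)))
    return rates


# CPU core-seconds consumed since the container started, sampled every minute.
core_seconds = [(0, 0.0), (60, 30.0), (120, 60.0), (180, 150.0), (240, 180.0)]

# Threshold on the raw value: once it trips, it stays tripped.
raw_firing = [value > 100 for _, value in core_seconds]
print(raw_firing)  # [False, False, False, True, True]

# Threshold on the rate: trips only during the busy minute, then clears.
rate_firing = [(ts, rate > 1.0) for ts, rate in align_rate(core_seconds)]
print(rate_firing)  # [(60, False), (120, False), (180, True), (240, False)]
```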

How to choose the right metric kind

When reading an unfamiliar metric or designing a custom one, ask these three questions:

Quick reference

Can the value go up and down right now? Use Gauge. (CPU load, active connections, queue depth, memory usage)
Does it count events that happened during an interval? Use Delta. (Requests per minute, errors per hour, bytes transferred this window)
Is it a running total that only ever increases? Use Cumulative. (Total bytes sent since startup, total requests served since deployment)

If you are creating a custom metric, prefer cumulative for counters. Cumulative counters preserve full history even if your reporting interval changes, and OpenTelemetry maps its counter type to cumulative by default when exporting to Cloud Monitoring.

Always verify your assumption by checking the metric descriptor in Metrics Explorer before writing an alert. A metric named “count” might be cumulative, not delta.
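
The three questions can be written down as a small decision helper. The function names are hypothetical, and the aligner suggestions are only reasonable starting points based on the guidance in this article, not rules enforced by Cloud Monitoring.

```python
# Starting-point aligners per metric kind (aligner names from the Monitoring API).
SUGGESTED_ALIGNER = {
    "GAUGE": "ALIGN_MEAN",       # smooth the instantaneous readings per period
    "DELTA": "ALIGN_RATE",       # or ALIGN_SUM when you want totals per period
    "CUMULATIVE": "ALIGN_RATE",  # never use the raw running total
}


def choose_kind(*, goes_up_and_down=False, per_interval_count=False, running_total=False):
    """Map the answers to the three questions onto a metric kind."""
    if goes_up_and_down:
        return "GAUGE"
    if per_interval_count:
        return "DELTA"
    if running_total:
        return "CUMULATIVE"
    raise ValueError("none of the three questions was answered with yes")


kind = choose_kind(running_total=True)  # e.g. total bytes sent since startup
print(kind, SUGGESTED_ALIGNER[kind])    # CUMULATIVE ALIGN_RATE
```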

Metric labels explained

Labels are key-value pairs attached to each time series. They split one metric into multiple streams: one per service, one per region, one per response code class.

For example, run.googleapis.com/request_count includes these labels:

service_name: which Cloud Run service handled the request
revision_name: which deployment revision
response_code_class: 2xx, 4xx, or 5xx
location: GCP region

When writing an alert condition, filter on labels to scope it precisely: “alert when 5xx request count for this specific service in this region exceeds 10 per minute,” not for all services in the project. Without label filters, you get aggregated noise that obscures which service is actually broken.
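
A scoped condition like that comes down to a Monitoring filter string. The helper below is a sketch for assembling one; the service name checkout and region us-central1 are placeholders, and it assumes response_code_class is a metric label while service_name and location are resource labels, which you should confirm in the metric descriptor.

```python
def build_filter(metric_type, metric_labels=None, resource_labels=None):
    """Assemble a Cloud Monitoring filter string from label constraints."""
    clauses = [f'metric.type = "{metric_type}"']
    for key, value in sorted((metric_labels or {}).items()):
        clauses.append(f'metric.labels.{key} = "{value}"')
    for key, value in sorted((resource_labels or {}).items()):
        clauses.append(f'resource.labels.{key} = "{value}"')
    return " AND ".join(clauses)


# 5xx responses for one service in one region, not the whole project.
scoped = build_filter(
    "run.googleapis.com/request_count",
    metric_labels={"response_code_class": "5xx"},
    resource_labels={"service_name": "checkout", "location": "us-central1"},
)
print(scoped)
```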

High cardinality warning

Labels like user_id, session_id, or request_id create one time series per unique value. Millions of users means millions of time series: expensive to store and slow to query. Use labels only for bounded dimensions such as environment, region, service name, response code class, and feature flag name.
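
The arithmetic behind the warning is a simple product: in the worst case, the number of time series is the number of distinct values of each label multiplied together. The label counts below are invented for illustration.

```python
from math import prod


def worst_case_series(distinct_values_per_label):
    """Upper bound on time series for one metric: the product of the
    number of distinct values each label can take."""
    return prod(distinct_values_per_label.values())


bounded = {"environment": 3, "region": 5, "response_code_class": 3}
print(worst_case_series(bounded))  # 45

# Adding a single unbounded label multiplies everything.
with_user_id = {**bounded, "user_id": 2_000_000}
print(worst_case_series(with_user_id))  # 90000000
```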

Built-in metrics vs custom metrics

GCP services publish hundreds of metrics automatically. As soon as you use a service, its metrics appear in Cloud Monitoring with no instrumentation code required. Built-in metrics cover infrastructure: CPU, memory, disk I/O, network throughput, request counts, latency distributions.

What they do not cover is your application’s business logic.

Create a custom metric when:

You need to track business events: orders placed, payments processed, jobs enqueued
You need error rates for specific code paths not surfaced by infrastructure metrics
You need feature usage counters or experiment exposure rates
You need an application-level queue depth separate from the underlying infrastructure queue

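Custom metric type strings written through the Cloud Monitoring API start with custom.googleapis.com/. Metrics exported through OpenTelemetry commonly appear under workload.googleapis.com/ instead, depending on exporter configuration.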

The recommended instrumentation path is OpenTelemetry. It is the vendor-neutral standard for metrics, traces, and logs. Configure the Google Cloud Monitoring exporter and your metrics flow into Cloud Monitoring without coupling your code to a GCP-specific API. If you later need to export to a second backend, you add an exporter rather than rewriting instrumentation. See Cloud Monitoring Overview for how custom metrics fit into the broader observability picture.

For simple one-off use cases, you can also write directly to the Cloud Monitoring API using a client library. This is faster to prototype but ties your code to GCP.
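
For orientation, this is roughly what a direct write looks like as a request body for the timeSeries.create REST method. The sketch only builds the JSON and does not send it; the metric name custom.googleapis.com/orders/queue_depth, the project ID, and the label are hypothetical, and the global monitored resource is used purely to keep the example short.

```python
import json
from datetime import datetime, timezone


def gauge_point_body(metric_type, value, labels, project_id, now=None):
    """Build a timeSeries.create request body carrying one INT64 gauge point."""
    now = now or datetime.now(timezone.utc)
    return {
        "timeSeries": [
            {
                "metric": {"type": metric_type, "labels": labels},
                "resource": {"type": "global", "labels": {"project_id": project_id}},
                "metricKind": "GAUGE",
                "valueType": "INT64",
                "points": [
                    {
                        # A gauge point only needs an end time.
                        "interval": {"endTime": now.strftime("%Y-%m-%dT%H:%M:%SZ")},
                        # int64 values are encoded as strings in the JSON API.
                        "value": {"int64Value": str(value)},
                    }
                ],
            }
        ]
    }


body = gauge_point_body(
    "custom.googleapis.com/orders/queue_depth",  # hypothetical metric name
    value=42,
    labels={"environment": "prod"},
    project_id="my-project",                     # placeholder project ID
    now=datetime(2026, 3, 25, 12, 0, 0, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```

A client library wraps the same structure in typed objects, so the fields to decide on (metric kind, value type, labels) are the same whichever path you choose.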

When this knowledge matters

You need to understand metric kinds when:

Writing alerting policies. The correct threshold and aggregation depend entirely on the metric kind. A wrong assumption means a broken or permanently-firing alert.
Building dashboards. Charts need the right aligner. A cumulative metric without rate alignment shows a line that climbs forever.
Debugging incidents. During an outage, knowing whether a spike is from a gauge or a delta metric changes how you interpret the chart. See Incident Response with Monitoring for the full workflow.
Instrumenting applications. Choosing the wrong metric kind for a custom metric means every downstream alert and dashboard built on it will need rethinking.
Reading service dashboards. Cloud Run, GKE, Pub/Sub, and Cloud SQL each use a mix of gauge, delta, and cumulative metrics. Without knowing which is which, you will misread the charts. See Debugging Production Systems for how metric kind affects root-cause analysis.

Real examples in common GCP services

For each example: what is the metric kind, and what does that mean for how you should use it?

Cloud Run

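run.googleapis.com/request_count: Delta. Counts requests in each measurement interval. To alert on errors, filter by response_code_class=5xx and use Sum alignment for errors per window or Rate alignment for errors per second. For an error ratio, divide by the unfiltered request count. See Monitoring Cloud Run for a complete walkthrough.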
run.googleapis.com/request_latencies: Delta, DISTRIBUTION. Use percentile aggregation (p95, p99) to catch latency tail issues. Averaging a distribution discards the shape. Percentiles tell you what your slowest users actually experience.
run.googleapis.com/container/memory/utilization: Gauge. Alert when utilization stays above 0.85 for more than a few minutes to catch memory pressure before OOM kills start.
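
To see why averaging a latency distribution hides the tail, consider this sketch over a hand-made histogram. The bucket bounds and counts are invented, the percentile is reported as the upper bound of the bucket it falls in (a conservative estimate, where the real service interpolates within buckets), and the mean uses bucket midpoints and ignores the overflow bucket.

```python
def percentile_upper_bound(bounds, counts, pct):
    """Upper bound of the bucket that contains the requested percentile.

    bounds: ascending upper bounds of the finite buckets (ms).
    counts: one count per finite bucket plus a final overflow bucket.
    """
    target = sum(counts) * pct / 100
    running = 0
    for i, count in enumerate(counts):
        running += count
        if running >= target:
            return bounds[i] if i < len(bounds) else float("inf")


def approx_mean(bounds, counts):
    """Mean estimated from bucket midpoints (overflow bucket ignored)."""
    lowers = [0] + bounds
    weighted = sum(c * (lowers[i] + bounds[i]) / 2
                   for i, c in enumerate(counts[:len(bounds)]))
    return weighted / sum(counts)


bounds_ms = [100, 250, 500, 1000, 5000]
counts = [900, 50, 20, 20, 10, 0]  # 1000 requests, most of them fast

print(approx_mean(bounds_ms, counts))                 # 106.25
print(percentile_upper_bound(bounds_ms, counts, 50))  # 100
print(percentile_upper_bound(bounds_ms, counts, 99))  # 1000
```

A mean of about 106 ms looks healthy, yet one request in a hundred takes somewhere between 500 ms and a full second, which only the percentile reveals.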

GKE

kubernetes.io/container/cpu/core_usage_time: Cumulative. Apply Rate alignment to get CPU core usage per second. Chart it alongside kubernetes.io/container/cpu/request_cores (gauge) to see how close workloads are to their CPU requests. See Monitoring GKE for recommended dashboards and alert thresholds.
kubernetes.io/container/memory/used_bytes: Gauge. Current memory in bytes. Alert when it approaches the container memory limit to catch memory leaks early.

Pub/Sub

pubsub.googleapis.com/subscription/num_undelivered_messages: Gauge. Current backlog size — how many messages are waiting right now. It can go up or down depending on whether your subscriber is keeping up. Alert when it exceeds your acceptable lag threshold.
pubsub.googleapis.com/subscription/oldest_unacked_message_age: Gauge. Age in seconds of the oldest unacknowledged message. Often a more useful signal than raw backlog count: if the age keeps climbing, consumers are stalled or falling behind.

Cloud SQL

cloudsql.googleapis.com/database/cpu/utilization: Gauge. Alert when it stays above 0.8 for several minutes. A brief spike is normal during a query burst. Sustained high utilization points to an under-provisioned instance or a runaway query.
cloudsql.googleapis.com/database/disk/bytes_used: Gauge. Monitor and alert with enough headroom before hitting the disk limit. A full disk causes writes to fail immediately.

Metrics vs logs vs log-based metrics

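Google Cloud observability works with three distinct data types: metrics in Cloud Monitoring, logs in Cloud Logging, and log-based metrics that bridge the two. Knowing the difference prevents confusion about where to look for information.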

Think of a hospital ward. The vital signs monitor showing heart rate every second is a metric: numeric, continuous, instant to query. The nurse’s written notes (“patient reported chest pain at 14:32”) are logs: detailed, text-based, specific to one event. A weekly report counting how many chest pain events occurred is a log-based metric: a number derived by counting log entries.

Metrics: numeric time series. Fast to query, efficient to store, ideal for alerting on thresholds, rates, and distributions. Cannot tell you why something happened.
Logs: structured or unstructured records of individual events. Tell you exactly what happened, including stack traces, request payloads, and context. Expensive to query at scale. Managed in Logs Explorer. See Structured Logging to make your logs filterable and useful for deriving metrics.
Log-based metrics: metrics derived from log entries. A log-based metric counts how many log entries match a filter per minute, or extracts a value from a log field and tracks it as a distribution. This bridges the two systems: you get the alerting efficiency of metrics from events that only exist in logs. See Log-Based Metrics for how to create them.

When to use each

Use metrics for ongoing monitoring and threshold-based alerting. Use logs when you need to understand a specific event in detail. Use log-based metrics when an event only exists in your application logs but you need to alert on its frequency.
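
Conceptually, a counter-style log-based metric is a filter plus a count per time bucket. The sketch below imitates that on a list of dictionaries; the entry fields and messages are invented, and in practice Cloud Logging performs the counting for you once the metric is defined.

```python
from collections import Counter


def count_matching_per_minute(entries, predicate):
    """Count log entries that match `predicate`, bucketed by minute.

    entries: iterable of dicts with at least a 'timestamp_s' key.
    Returns {minute_start_s: count}.
    """
    counts = Counter()
    for entry in entries:
        if predicate(entry):
            counts[entry["timestamp_s"] // 60 * 60] += 1
    return dict(sorted(counts.items()))


logs = [
    {"timestamp_s": 5, "severity": "ERROR", "message": "payment declined"},
    {"timestamp_s": 30, "severity": "INFO", "message": "order placed"},
    {"timestamp_s": 61, "severity": "ERROR", "message": "payment declined"},
    {"timestamp_s": 90, "severity": "ERROR", "message": "gateway timeout"},
    {"timestamp_s": 130, "severity": "INFO", "message": "order placed"},
]

errors_per_minute = count_matching_per_minute(logs, lambda e: e["severity"] == "ERROR")
print(errors_per_minute)  # {0: 1, 60: 2}
```

The result is an ordinary per-interval count, so it behaves like a delta metric and can be charted and alerted on the same way.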

Common mistakes

Alerting on a raw cumulative value. A cumulative metric grows forever. An alert that fires when the value exceeds 1000 will trigger immediately and never clear. Apply Rate alignment (ALIGN_RATE) to get the per-second rate of increase, then compare that rate to a threshold.
Thinking Pub/Sub’s num_undelivered_messages is a delta. It is a gauge. It shows current backlog size, not the count of new messages delivered in an interval. Alert on the raw gauge value.
Using high-cardinality label values. Labels like user_id or trace_id create one time series per unique value. Millions of users means millions of time series: significant cost and slow queries. Use bounded dimensions only.
Averaging a distribution metric. Latency metrics like run.googleapis.com/request_latencies are DISTRIBUTION type. Averaging discards the shape. Use percentile aggregation (p50, p95, p99) instead, otherwise you will miss the tail latency affecting your slowest users.
Copying metric examples without checking the descriptor. Before writing an alert, look up the metric kind in Metrics Explorer. A metric named “count” or “total” might be cumulative, not delta. The name alone does not tell you.
Using logs when a metric would work. Running a Logs Explorer query every minute to count error occurrences is slower and more expensive than using a metric or log-based metric. Metric-based alerts are evaluated continuously and efficiently.

Frequently asked questions

What is the difference between gauge, delta, and cumulative metrics in GCP?

A gauge measures a value right now (like CPU at 72%). A delta counts events during a measurement interval (like 500 requests in the last minute). A cumulative metric is a running total that only increases (like total requests since the process started). The metric kind determines which aggregation to use for charts and alerts.

How do I know which metric kind a Google Cloud metric uses?

Open Metrics Explorer in the GCP Console, select the metric, and check the metric details panel. You can also call the Cloud Monitoring API metricDescriptors.get endpoint to read the descriptor programmatically. Always check the kind before writing an alert. Do not assume.
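
For the programmatic route, the sketch below builds the metricDescriptors.get URL and pulls the interesting fields out of a descriptor. It makes no network call: sample_response is a hand-written, trimmed illustration of the response shape, and my-project is a placeholder project ID. A real request also needs an OAuth access token.

```python
def descriptor_url(project_id, metric_type):
    """REST URL for metricDescriptors.get on one metric type."""
    name = f"projects/{project_id}/metricDescriptors/{metric_type}"
    return f"https://monitoring.googleapis.com/v3/{name}"


def summarize(descriptor):
    """Extract the fields that matter before writing an alert."""
    return {
        "type": descriptor["type"],
        "kind": descriptor["metricKind"],
        "value_type": descriptor["valueType"],
        "labels": [label["key"] for label in descriptor.get("labels", [])],
    }


METRIC = "compute.googleapis.com/instance/cpu/utilization"
print(descriptor_url("my-project", METRIC))

# Illustrative, trimmed response body (not captured from a live call).
sample_response = {
    "name": f"projects/my-project/metricDescriptors/{METRIC}",
    "type": METRIC,
    "metricKind": "GAUGE",
    "valueType": "DOUBLE",
    "labels": [{"key": "instance_name"}],
}
print(summarize(sample_response))
```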

Are request counts in Cloud Monitoring usually gauge or delta?

Request counts are usually delta metrics. run.googleapis.com/request_count, for example, reports the number of requests handled during each measurement interval. To get a per-second rate, apply Rate alignment in your chart or alert policy. Use Sum alignment when you want the total per alignment period instead.

When should I create a custom metric?

Create a custom metric when built-in GCP metrics do not cover what you need to observe. Common cases include business events (orders placed, payments processed), application-level queue depths, feature usage counters, and error rates for specific code paths.

Should I use Metrics Explorer, PromQL, or the Monitoring API?

For interactive exploration in the console, use Metrics Explorer. It is the fastest way to browse and visualize metrics without writing a query. For writing alerts or dashboards with a query language, PromQL is available in Cloud Monitoring and is a good choice if you know it. For programmatic access from scripts or applications, use the Cloud Monitoring API or a client library (Python, Go, Java, etc.).

Last verified: 25 March 2026. Cloud services change frequently. Verify details against official documentation before making infrastructure decisions.