Google Cloud Artifact Registry Best Practices for Secure CI/CD

A well-managed Artifact Registry prevents three problems that get expensive to fix later: sprawling untagged images that drive up storage costs, unknown vulnerabilities sitting in production unnoticed, and over-privileged pipelines that can push to repositories they should never touch. This page covers the practices that prevent all three, with specific guidance on naming, tagging, scanning, cleanup, and access control.

Simple explanation

Artifact Registry is where your container images live between being built and being deployed. Every time your CI pipeline builds a Docker image, it pushes that image to Artifact Registry. When Cloud Run starts a new instance or GKE schedules a pod, it pulls the image from there.

Best practices for Artifact Registry are the habits and setup decisions that make this storage layer easier to manage, safer to use, cheaper to run, and easier to audit. Most of them are choices you make once when creating a repository, not ongoing work.

A concrete example: if you tag every image with the git commit SHA, you can look at any running container, read its image tag, and immediately know which commit it came from. Without that, tracing what is running in production back to its source code becomes a guessing game. SHA tagging alone is the difference between a five-minute investigation and a two-hour one.

If you are new to the service itself rather than its configuration, start with the Artifact Registry Overview.

How Artifact Registry fits into a CI/CD pipeline

Understanding where the registry sits in the pipeline makes the best practices easier to reason about. Here is what a typical flow looks like:

  1. Developer pushes code. A commit to main (or a release branch) triggers the pipeline.
  2. Pipeline builds the image. Cloud Build runs your Dockerfile and produces an image. See Building Docker Images with Cloud Build for how to structure this well.
  3. Image is tagged. The image gets tagged with the git commit SHA: an immutable reference you will use for deployments and rollbacks.
  4. Image is pushed to Artifact Registry. The pipeline authenticates via a service account and pushes to the correct repository for this environment.
  5. Scanning happens. Container Analysis scans the image for known CVEs. The pipeline can query results and fail if critical vulnerabilities are found.
  6. Deployment uses the SHA tag. Cloud Build deploys the image by its exact SHA tag, not latest. The deployed image is exactly what was tested.
  7. Old images are cleaned up automatically. Cleanup policies delete images that are too old or beyond a count threshold, keeping storage costs predictable.
  8. Access is controlled by IAM. Only the build pipeline can push. Cloud Run services and GKE nodes can only pull from the repositories they need.

Each section in the rest of this page maps directly to one of these steps.
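
Steps 2 to 6 of the flow above can be sketched as a small shell script. This is a minimal sketch under stated assumptions, not a complete pipeline: the `DOCKER` and `GCLOUD` variables are additions that let the commands be swapped for `echo` in a dry run, the Cloud Run service name `api` is hypothetical, and the scanning step is left out.

```shell
# Sketch of steps 2-6: build, tag with the commit SHA, push, deploy by that tag.
# DOCKER and GCLOUD default to the real CLIs; override them (e.g. DOCKER=echo)
# for a dry run. Both variables are assumptions of this sketch.
DOCKER="${DOCKER:-docker}"
GCLOUD="${GCLOUD:-gcloud}"

# Compose the full image reference for a given commit tag.
image_for() {
  printf '%s\n' "europe-west2-docker.pkg.dev/my-app-dev/images/api:$1"
}

# Build the image from the current directory and push it under the SHA tag.
build_and_push() {
  image="$(image_for "$1")"
  "$DOCKER" build -t "$image" . &&
  "$DOCKER" push "$image"
}

# Deploy exactly the image that was built, never "latest".
deploy() {
  "$GCLOUD" run deploy api \
    --image="$(image_for "$1")" \
    --region=europe-west2 \
    --project=my-app-dev
}

# Example (Cloud Build provides $SHORT_SHA for triggered builds):
#   build_and_push "$SHORT_SHA" && deploy "$SHORT_SHA"
```

Passing the same tag to both functions is the point of the sketch: the reference that gets deployed is the reference that was pushed.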

Repository naming conventions

Choose a naming strategy before creating your first repository. Changing it later means updating image references across every pipeline, Dockerfile, and deployment config. The two main approaches are by environment and by service.

One repository per environment

Each environment has its own repository. Images are promoted between repositories as they move through the pipeline. An image in the prod repository has been through the full pipeline, and that fact is encoded in its location. This pattern maps naturally to separate GCP projects per environment:

europe-west2-docker.pkg.dev/my-app-dev/images/api:abc1234
europe-west2-docker.pkg.dev/my-app-staging/images/api:abc1234
europe-west2-docker.pkg.dev/my-app-prod/images/api:abc1234

One repository per service

Each service has its own repository. All environment tags live together. Simpler for small teams with few services and a single GCP project:

europe-west2-docker.pkg.dev/my-app-prod/api/api:abc1234
europe-west2-docker.pkg.dev/my-app-prod/worker/worker:abc1234

Tip

If you use separate GCP projects per environment, the repository naturally lives in the project that corresponds to the environment. Dev images in the dev project, prod images in the prod project. This is the cleanest setup for IAM and billing separation, and the repository location itself becomes a signal that an image is production-ready.
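
Creating the per-environment repositories shown above might look like the following. It is a sketch that assumes the three project IDs from the example paths already exist and that a repository named `images` suits every environment; the `GCLOUD` variable is an addition for dry runs.

```shell
# GCLOUD defaults to the real CLI; set GCLOUD=echo to print the commands only.
GCLOUD="${GCLOUD:-gcloud}"

# Create one Docker repository named "images" in each environment's project.
create_env_repos() {
  for env in dev staging prod; do
    "$GCLOUD" artifacts repositories create images \
      --repository-format=docker \
      --location=europe-west2 \
      --project="my-app-$env" \
      --description="Container images for the $env environment" || return 1
  done
}

# Example: create_env_repos
```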

Repository strategy: per environment vs per service

Both approaches work in production systems. Here is a direct comparison:

Approach | Best for | Main advantage | Main drawback
Per environment | Teams using separate GCP projects per environment | Image location signals pipeline stage; IAM aligns with project boundaries | More repositories to manage; cross-service scanning setup is less straightforward
Per service | Small teams, single-project setups, simple pipelines | Fewer repositories; easy to audit image history per service | Harder to enforce “this image has been through staging” without extra tooling

For most teams building on GCP, the per-environment approach scales better once you are running more than one or two services. It aligns naturally with environment separation in CI/CD, keeps IAM boundaries tight, and makes image promotion explicit: you are copying the image from the staging repository to the prod repository, not relying on a tag to convey that it was tested.

Per service is a reasonable choice for small projects or teams that do not yet have separate GCP projects per environment. The risk is that the convention stops scaling cleanly as the number of services grows.

Whichever you choose, consistency matters more than chasing the perfect pattern. Pick one approach, document it, and apply it across the organisation. The problems arise when different teams or services use different approaches in the same project.
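
With per-environment repositories, promotion is an explicit copy. One simple way to sketch it is pull, re-tag, push with the Docker CLI. The function name `promote` and the `DOCKER` override are assumptions of this example, and it presumes the machine running it can read the source repository and write to the target.

```shell
# DOCKER defaults to the real CLI; set DOCKER=echo to print the commands only.
DOCKER="${DOCKER:-docker}"
REGISTRY="europe-west2-docker.pkg.dev"

# Copy api:<tag> from one environment's repository to another's.
# Usage: promote <from-env> <to-env> <tag>
promote() {
  src="$REGISTRY/my-app-$1/images/api:$3"
  dst="$REGISTRY/my-app-$2/images/api:$3"
  "$DOCKER" pull "$src" &&
  "$DOCKER" tag "$src" "$dst" &&
  "$DOCKER" push "$dst"
}

# Example: promote staging prod abc1234
```

The tag is carried across unchanged, so the same commit SHA identifies the image in every environment.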

Image tagging strategy

Tags are how you identify which code is inside an image. A weak tagging strategy makes it impossible to answer basic operational questions: what is running in production right now, and where did it come from?

Always tag with the commit SHA

Use $SHORT_SHA in Cloud Build, which gives you the first 7 characters of the git commit hash. Treat this tag as immutable: it identifies one specific build of one specific commit and should never be reassigned. It provides an unambiguous audit trail from a running container back to source code. If you need to know what is in production, read the image tag, find the commit, and read the diff. That chain of traceability is what makes incidents easier to diagnose and rollbacks faster to execute.
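
Outside Cloud Build, for example in a local build, an equivalent tag can be derived by truncating the full commit hash. The helper name `short_sha` is hypothetical.

```shell
# Truncate a full commit hash to the 7-character form used as the image tag.
short_sha() {
  printf '%.7s\n' "$1"
}

# Deriving the tag from the current checkout:
#   tag="$(short_sha "$(git rev-parse HEAD)")"
# Going the other way, from a tag seen on a running container to its commit:
#   git show --stat abc1234
```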

How container images are structured and layered is covered on the Container Images in GCP page.

Add mutable tags for convenience

In addition to the SHA tag, add latest (for the most recent build of main) or a release version like v1.2.3. These are mutable, meaning they point to different images over time. They are useful for caching, local development, and finding the current version quickly. They are not appropriate for production deployments.

Immutable tags vs mutable tags

An immutable tag always resolves to the same image digest. A mutable tag can be reassigned to any image. The SHA tag is immutable by its nature: it only makes sense for one build of one commit. latest is mutable by design.

Artifact Registry also lets you enforce immutability at the repository level, preventing any tag from being reassigned once set. This is useful in stricter environments where overwriting a tag should never happen, but it can create friction in development workflows where overwriting is sometimes intentional.
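
Turning on repository-level immutability might look like this. It is a sketch that reuses the repository, location, and project from the other examples on this page; the `GCLOUD` override is an assumption added for dry runs.

```shell
# GCLOUD defaults to the real CLI; set GCLOUD=echo to print the command only.
GCLOUD="${GCLOUD:-gcloud}"

# Turn on tag immutability for one Docker repository.
enable_immutable_tags() {
  "$GCLOUD" artifacts repositories update "$1" \
    --location=europe-west2 \
    --project=my-app-prod \
    --immutable-tags
}

# Example: enable_immutable_tags api
```

A repository configured this way cannot host a moving latest tag, so this setting fits repositories that only ever receive SHA or release tags.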

Analogy

The SHA tag is a serial number permanently etched on the product. The latest tag is a “current model” sticker that moves to the newest version each time. You need both: the serial number for audits and rollbacks, the sticker for finding the latest image quickly.

Warning

Never deploy to production using only the latest tag. If a new image is pushed between when you trigger the deploy and when the runtime pulls the image, you deploy unreviewed code. Always deploy with the specific SHA tag. This is one of the most common causes of mysterious production behaviour in teams that are new to container-based delivery.
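
One way to make this rule hard to break is a guard at the start of the deploy step that rejects anything that does not look like a commit SHA tag. The function names below are hypothetical, and the check assumes 7-character lowercase hex tags as produced by $SHORT_SHA.

```shell
# Succeed only if the tag looks like a 7-character commit SHA.
is_sha_tag() {
  case "$1" in
    *[!0-9a-f]*) return 1 ;;   # contains a non-hex character
  esac
  [ "${#1}" -eq 7 ]
}

# Refuse to continue with a mutable or malformed tag.
require_sha_tag() {
  if ! is_sha_tag "$1"; then
    echo "refusing to deploy mutable or malformed tag: '$1'" >&2
    return 1
  fi
}

# Example: require_sha_tag "$SHORT_SHA" || exit 1
```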

Vulnerability scanning

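Artifact Registry integrates with Artifact Analysis (formerly Container Analysis) to scan images for known CVEs in OS packages and language dependencies. Automatic scanning is off until the Container Scanning API is enabled on the project. Once it is on, you can allow or disable scanning per repository:

# Enable automatic vulnerability scanning for the project
gcloud services enable containerscanning.googleapis.com \
  --project=my-app-prod

# Allow scanning on one repository
# (use --disable-vulnerability-scanning to opt a repository out)
gcloud artifacts repositories update api \
  --location=europe-west2 \
  --project=my-app-prod \
  --allow-vulnerability-scanning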

Scanning is asynchronous. Results appear a few minutes after each push. You can query them in your pipeline and fail the build on critical findings:

# Show vulnerability findings for the pushed image;
# a later pipeline step can fail the build on any CRITICAL entry
gcloud artifacts docker images describe \
  europe-west2-docker.pkg.dev/my-app-prod/api/api:$SHORT_SHA \
  --show-package-vulnerability \
  --format=json \
  --project=my-app-prod

Add a brief wait or polling step in your pipeline before querying results. Results typically appear within two to three minutes of a push.
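
An alternative to waiting for automatic results is to run an on-demand scan inside the pipeline and gate on its output. The sketch below assumes the On-Demand Scanning API is enabled on the project; the function names and the `GCLOUD` override are hypothetical.

```shell
# GCLOUD defaults to the real CLI; it is a variable so it can be stubbed.
GCLOUD="${GCLOUD:-gcloud}"

# Read one severity per line on stdin; succeed if any line is exactly CRITICAL.
has_critical() {
  grep -Fxq CRITICAL
}

# Run an on-demand scan of a pushed image and fail on CRITICAL findings.
# Usage: scan_gate <full-image-reference>
scan_gate() {
  scan="$("$GCLOUD" artifacts docker images scan "$1" \
    --remote --format='value(response.scan)')" || return 1
  if "$GCLOUD" artifacts docker images list-vulnerabilities "$scan" \
      --format='value(vulnerability.effectiveSeverity)' | has_critical; then
    echo "CRITICAL vulnerabilities found in $1" >&2
    return 1
  fi
}

# Example:
#   scan_gate "europe-west2-docker.pkg.dev/my-app-prod/api/api:$SHORT_SHA" || exit 1
```

Keeping the severity check in its own function makes the threshold easy to change, for example to also fail on HIGH.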

For a harder gate, use Binary Authorization to block deployment of images that have not passed scanning or received a valid attestation. This enforces the check at deploy time rather than build time, which catches vulnerabilities discovered after an image was already built and pushed.

Vulnerability scanning is one layer of the broader secure CI/CD pipeline, alongside attestations, signed images, and deployment policies.

Watch out

Enabling scanning is not the same as acting on findings. Many teams turn it on and then ignore the results entirely. Decide upfront what severity level will fail the build, how critical findings will be triaged, and who is responsible for updating base images when vulnerabilities are found. Without that process, the scanner produces false confidence rather than actual security.

Remote repositories

Artifact Registry supports remote repositories that proxy and cache images from upstream public registries. Instead of builds pulling directly from Docker Hub, which enforces aggressive rate limits on unauthenticated and free-tier requests, configure a remote repository to sit in between:

# Create a remote repository that proxies Docker Hub
gcloud artifacts repositories create dockerhub-proxy \
  --location=europe-west2 \
  --repository-format=docker \
  --mode=remote-repository \
  --remote-repo-config-desc="Docker Hub" \
  --remote-docker-repo=DOCKER-HUB \
  --project=my-app-prod

Update your Dockerfiles to pull base images through the proxy instead of directly from Docker Hub. The first pull fetches from upstream and caches the image in Artifact Registry. Subsequent pulls use the cache, which is faster and rate-limit-free.
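
Rewriting a base image reference to go through the proxy is a matter of prefixing it with the remote repository path. The helper below is a sketch: `via_proxy` and the image `exampleorg/base:1.0` are hypothetical, and how un-namespaced official images map onto the proxy path is worth checking against the Artifact Registry documentation.

```shell
# Path of the remote repository created above.
PROXY="europe-west2-docker.pkg.dev/my-app-prod/dockerhub-proxy"

# Prefix an upstream "namespace/image:tag" reference with the proxy path.
via_proxy() {
  printf '%s/%s\n' "$PROXY" "$1"
}

# Example: a Dockerfile line "FROM exampleorg/base:1.0" would become
#   FROM europe-west2-docker.pkg.dev/my-app-prod/dockerhub-proxy/exampleorg/base:1.0
```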

Analogy

Think of it like a library branch near your office rather than ordering every book directly from the publisher. The branch holds the books you need, they are available immediately, and there is no limit on how many you can borrow in a day. The branch stocks its shelves from the publisher the first time a book is requested, then serves it locally after that.

Note

Remote repository caches do not refresh automatically when upstream images change. If you need a fresher version of a cached base image, delete the cached copy from Artifact Registry and the next pull will fetch a new version from upstream. This is rarely necessary in practice, but worth knowing when you are chasing a recent security patch in an upstream image.

Cleanup policies

Without cleanup policies, every pushed image stays forever. A busy pipeline pushing ten images a day generates over 3,500 images per year per service. Most of those images are never used again. Storage costs accumulate quietly and the registry becomes a graveyard of outdated images.

Silent cost

Teams typically discover they have no cleanup policy when the storage bill arrives at month three or four. By then there are thousands of images to deal with, and deciding which ones are safe to delete requires investigation. Set cleanup policies at repository creation time and you never have this problem.

Artifact Registry cleanup policies automatically delete images based on age or count:

# Delete images older than 30 days, but keep at least the 10 most recent.
# --policy takes the path to a JSON file, so write the rules to disk first.
echo '[{"name":"delete-old-images","action":{"type":"Delete"},"condition":{"olderThan":"30d"}},{"name":"keep-minimum","action":{"type":"Keep"},"mostRecentVersions":{"keepCount":10}}]' > cleanup-policy.json
gcloud artifacts repositories set-cleanup-policies api \
  --project=my-app-prod \
  --location=europe-west2 \
  --policy=cleanup-policy.json \
  --no-dry-run

The combination of a delete rule and a keep-minimum rule is important. Without the keep rule, the cleanup policy could remove all images if your pipeline is quiet for a month. With both rules, the last 10 images are always preserved regardless of age.

For dev and staging repositories, you can be more aggressive: delete after 7 days and keep only 5 recent images. Production repositories warrant more conservative retention to support rollbacks without scrambling to find an old image.
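
The more aggressive dev policy could be sketched as follows. The file name, the function names, and the repository name `images` in project `my-app-dev` are assumptions. The example applies the policy with `--dry-run` so its effect can be reviewed before anything is deleted.

```shell
# GCLOUD defaults to the real CLI; set GCLOUD=echo to print the command only.
GCLOUD="${GCLOUD:-gcloud}"

# Write a stricter policy for dev or staging: delete after 7 days, keep 5.
# Usage: write_dev_policy <output-file>
write_dev_policy() {
  cat > "$1" <<'EOF'
[
  {
    "name": "delete-old-images",
    "action": {"type": "Delete"},
    "condition": {"olderThan": "7d"}
  },
  {
    "name": "keep-minimum",
    "action": {"type": "Keep"},
    "mostRecentVersions": {"keepCount": 5}
  }
]
EOF
}

# Apply the policy in dry-run mode so nothing is deleted yet.
preview_dev_policy() {
  write_dev_policy dev-cleanup-policy.json &&
  "$GCLOUD" artifacts repositories set-cleanup-policies images \
    --project=my-app-dev \
    --location=europe-west2 \
    --policy=dev-cleanup-policy.json \
    --dry-run
}

# Example: preview_dev_policy
```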

IAM roles for Artifact Registry

Artifact Registry supports IAM at the repository level, not just at the project level. This means you can grant a CI pipeline for the API service write access to the API repository without also giving it access to every other repository in the project. Use this. It is one of the most important security controls the registry offers.

The three roles you will use most often:

  • roles/artifactregistry.writer for CI pipelines that push images
  • roles/artifactregistry.reader for Cloud Run services, GKE nodes, and other services that pull images
  • roles/artifactregistry.repoAdmin for managing repository settings; not needed by pipelines or running services

# Grant Cloud Run service account permission to pull from this repository only
gcloud artifacts repositories add-iam-policy-binding api \
  --location=europe-west2 \
  --project=my-app-prod \
  --member="serviceAccount:api-service@my-app-prod.iam.gserviceaccount.com" \
  --role="roles/artifactregistry.reader"

Each service should run under its own service account with only the permissions it needs. The Cloud Run service running the API only needs to pull from the API repository. It does not need to write to it or access the worker repository. Following the principle of least privilege here limits the blast radius if a service account is compromised or misconfigured.

Good pattern

The CI pipeline service account gets writer on its own repository only. Cloud Run and GKE service accounts get reader on the repositories they pull from. Nobody gets project-level roles. A misconfigured pipeline with project-level writer access can overwrite images in any repository in the project.
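
The writer half of this pattern can be sketched with the same command used for the reader binding above. The function name, the `GCLOUD` override, and the pipeline service account `ci-api@my-app-prod.iam.gserviceaccount.com` are hypothetical.

```shell
# GCLOUD defaults to the real CLI; set GCLOUD=echo to print the command only.
GCLOUD="${GCLOUD:-gcloud}"

# Bind a role to a service account on a single repository.
# Usage: grant_repo_role <repository> <service-account-email> <role>
grant_repo_role() {
  "$GCLOUD" artifacts repositories add-iam-policy-binding "$1" \
    --location=europe-west2 \
    --project=my-app-prod \
    --member="serviceAccount:$2" \
    --role="$3"
}

# Example: the API pipeline may push to the api repository and nothing else.
#   grant_repo_role api ci-api@my-app-prod.iam.gserviceaccount.com \
#     roles/artifactregistry.writer
# Review the resulting bindings with:
#   gcloud artifacts repositories get-iam-policy api \
#     --location=europe-west2 --project=my-app-prod
```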

When these practices matter most

All of these practices are worth following from the start, but some situations make them especially important:

  • Cloud Run delivery pipelines. Cloud Run resolves the image tag to a digest when it creates a revision. If you deploy with latest and another push lands between the build you tested and the deploy command, the revision pins a different image than intended. See CI/CD Pipelines for Cloud Run for the full pipeline setup, including how to wire SHA-tagged deployments.
  • GKE-based deployments. Nodes pull images from Artifact Registry on behalf of the pods scheduled on them, using the node pool's service account. Repository-level IAM on those service accounts controls which node pools can pull which images, which matters in clusters running multiple services with different trust levels.
  • Multi-environment delivery. Teams running dev, staging, and production benefit most from the per-environment repository strategy. The repository location becomes a signal about pipeline stage, not just a storage location.
  • Teams with multiple services. Once you have more than two or three services, repository-level IAM prevents pipelines for one service from accidentally or maliciously affecting another service’s images.
  • Teams with compliance or audit requirements. SHA tagging, vulnerability scanning, and repository-level IAM together create an audit trail from any deployment back to source code and security checks.
  • Teams migrating from Container Registry. Container Registry (gcr.io) is deprecated and has been superseded by Artifact Registry, which has better IAM, supports more artifact formats, and is the current recommendation. New projects should start on Artifact Registry; migration for existing projects requires updating all image references across pipelines and deployment configs.
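
For the migration case, updating image references can be partly mechanical. The sketch below rewrites `gcr.io/<project>/<image>` references read from stdin. The target repository name `images` and the europe-west2 location are assumptions, regional hosts such as eu.gcr.io are deliberately left untouched, and the output should be reviewed before committing.

```shell
# Rewrite gcr.io/<project>/<image> references on stdin to the Artifact
# Registry form <location>-docker.pkg.dev/<project>/<repository>/<image>.
rewrite_gcr_refs() {
  sed -E 's#(^|[^a-z0-9.-])gcr\.io/([a-z0-9-]+)/#\1europe-west2-docker.pkg.dev/\2/images/#g'
}

# List files that still mention gcr.io before rewriting anything:
#   grep -rlE '(^|[^a-z0-9.-])gcr\.io/' .
# Preview the rewrite for one file:
#   rewrite_gcr_refs < cloudbuild.yaml
```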

Common mistakes

  1. Deploying to production with the latest tag. latest is mutable; it can point to a different image every time. If a new push happens between triggering a deployment and the runtime pulling the image, you deploy unreviewed code. Always deploy with the SHA tag. Reserve latest for local development and layer caching.

  2. Not setting up cleanup policies at repository creation time. Every build pushes a new image. Without a cleanup policy, all of them stay forever. Storage costs accumulate silently, and the registry fills with thousands of images you will never use. Set policies when you create the repository, not after the storage bill arrives.

  3. Granting project-level IAM roles instead of repository-level roles. A CI pipeline for the API service does not need write access to the worker repository, the frontend repository, or anything else. Grant at the repository level. Project-level grants create broad access that is difficult to audit and easy to misuse.

  4. No consistent repository naming pattern. Repositories with no consistent naming (some by team, some by environment, some by service, some by project) become impossible to navigate. The naming convention also affects how you set up IAM, cleanup policies, and image promotion flows. Choose one strategy, document it, and apply it across the board.

  5. Enabling vulnerability scanning but not acting on findings. Scanning gives you information. It does not fix anything. If you enable scanning but have no process for triaging critical CVEs, no pipeline step that blocks on findings, and no schedule for updating base images, the scanner becomes background noise. Define what happens when critical findings appear before you enable it.

  6. Mixing environments in one repository without a clear convention. Putting dev, staging, and prod images in the same repository using tags like api:dev-abc1234 and api:prod-abc1234 works technically but loses the IAM and promotion benefits of per-environment repositories. It also makes cleanup policies harder to write correctly, since you cannot target images by environment.

  7. Continuing to use Container Registry for new projects. Container Registry is deprecated. Starting new projects on gcr.io means missing better IAM, multi-format support, and all future feature development. Start on Artifact Registry.

  8. Not documenting the image promotion flow between environments. When an image moves from dev to staging to prod, that promotion should be a documented, repeatable step. If it is not, different engineers do it differently, and the pipeline loses its guarantee that what is in production has been through the full pipeline.

Frequently asked questions

What is the best tagging strategy for Artifact Registry?

Tag every image with the git commit SHA using $SHORT_SHA in Cloud Build. This gives you an immutable, traceable tag that maps directly to source code. Add a mutable tag like latest or a semver version for convenience and caching, but never use mutable tags alone in production deployments.

Should I use one repository per environment or per service?

Both patterns work. Per environment aligns cleanly with separate GCP projects per environment: dev images live in the dev project, prod images in the prod project, and the presence of an image in the prod repository signals it has been through the full pipeline. Per service is simpler for small teams with few services. The most important thing is to pick one and apply it consistently.

Does Artifact Registry scan images automatically?

Not by default. Automatic scanning starts once the Container Scanning API is enabled on the project, and individual repositories can be opted in or out with the --allow-vulnerability-scanning and --disable-vulnerability-scanning flags. Once enabled, Artifact Analysis scans images for known CVEs after each push. Results appear in the console within a few minutes. Scanning alone does not block deployments; you need to query results in your pipeline or use Binary Authorization to enforce policy at deploy time.

How do cleanup policies work in Artifact Registry?

Cleanup policies automatically delete images based on age or count. You define rules such as delete images older than 30 days or keep only the 10 most recent versions. Combine delete and keep rules so the policy never removes all images, even if your pipeline is quiet for a period. Set cleanup policies when you create the repository. Retrofitting them after thousands of images have accumulated is painful.

Can I use Artifact Registry with Cloud Run or GKE?

Yes. Artifact Registry is the recommended image store for both Cloud Run and GKE on GCP. Grant the Cloud Run or GKE service account roles/artifactregistry.reader at the repository level, not the project level. For GKE, image pulls are made by the node, so the account that needs the role is the node pool's service account; giving node pools separate service accounts lets you control which pools can pull from which repositories.

Last verified: 25 March 2026. Cloud services change frequently. Verify details against official documentation before making infrastructure decisions.