Infrastructure as Code in GCP
Infrastructure as Code means writing your GCP resources as text files, not clicking through the Console. Those files live in git, changes go through pull requests, and environments become reproducible on demand. Terraform is how most teams do this on GCP.
What is Infrastructure as Code?
When you create a Cloud Storage bucket in the Console, GCP creates it immediately, but the only record is an audit log entry; nothing captures the configuration you intended or why. A new team member cannot reproduce it. You cannot compare it to another environment. If someone changes the settings six months later, that change is easy to miss.
Infrastructure as Code solves this by treating your cloud resources the same way you treat application code. You write a configuration file that describes the bucket, the IAM bindings, the network, or whatever resource you need. That file goes into version control. Terraform reads it, figures out what needs to change in GCP, shows you a plan, and makes the change when you approve it.
The result is infrastructure that is auditable, reviewable, and reproducible. A junior engineer can read your Terraform code and understand what exists in production. A new staging environment takes minutes to create. Mistakes get caught in code review before they reach production.
Managing cloud infrastructure without IaC is like maintaining an application without version control. Day to day, things work fine. But when something breaks and you need to know exactly what changed last Tuesday, or who created that service account with owner permissions, there is nothing to look at. IaC gives your infrastructure the same audit trail that git gives your code.
Why Infrastructure as Code matters in GCP
GCP makes it very easy to create resources through the Console. That ease is also its risk. A few clicks and you have a new VPC, a service account with broad permissions, or a Cloud SQL instance with public IP enabled. Undoing a bad click, or understanding what someone else clicked three months ago, requires digging through audit logs.
IaC changes the model. Any resource you want to create, you define in code first. The code is the proposal. A teammate reviews it. Terraform shows you exactly what will be created before you run anything. The change lands in git with a commit message explaining why. If something breaks, you revert the commit.
This matters especially in GCP because so much of what you manage has security implications: IAM roles and bindings, service accounts, firewall rules, network configurations. When those change through the Console, there is no built-in review step. When they change through Terraform, a pull request is that review step.
How Infrastructure as Code works
Terraform operates on the concept of desired state. You write HCL (HashiCorp Configuration Language) files that describe what you want to exist: a bucket named my-app-prod-assets in europe-west1, a service account named app-deployer, a Cloud Run service with specific memory and CPU settings. That is your desired state.
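As a minimal sketch of that desired state, the configuration below declares the bucket and service account named above, plus one IAM grant between them. The project ID my-app-prod, the display name, and the chosen role are assumptions made for illustration, not values from a real project.

```hcl
# Desired state: a bucket, a service account, and one IAM grant.
# "my-app-prod" is a hypothetical project ID used for illustration.

resource "google_storage_bucket" "assets" {
  name     = "my-app-prod-assets"
  project  = "my-app-prod"
  location = "EUROPE-WEST1"

  # Enforce IAM-only access control on the bucket
  uniform_bucket_level_access = true
}

resource "google_service_account" "app_deployer" {
  project      = "my-app-prod"
  account_id   = "app-deployer"
  display_name = "Application deployer"
}

# Illustrative role choice: lets the deployer manage objects in this bucket only
resource "google_storage_bucket_iam_member" "deployer_assets" {
  bucket = google_storage_bucket.assets.name
  role   = "roles/storage.objectAdmin"
  member = "serviceAccount:${google_service_account.app_deployer.email}"
}
```

Nothing in these blocks says how to create the resources or in what order; Terraform works that out from the references between them.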
Terraform maintains a state file that records what it has already created. When you run terraform plan, Terraform compares three things: your code, the state file, and the actual resources in GCP. If there is a gap between what your code says should exist and what actually exists, Terraform tells you what it plans to do about it.
That plan is the output you review before making any change. A resource marked with + will be created. One marked with - will be destroyed. One marked with ~ will be updated in place. A resource marked with -/+ will be destroyed and recreated, which can cause downtime or data loss. Reading the plan carefully is one of the most important skills in the Terraform workflow.
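One hedge against an unnoticed -/+ is Terraform's prevent_destroy lifecycle setting, which makes any plan that would destroy the resource fail with an error. The sketch below assumes a Cloud SQL instance; the instance name, region, database version, and tier are hypothetical.

```hcl
resource "google_sql_database_instance" "main" {
  name             = "app-db" # hypothetical instance name
  database_version = "POSTGRES_15"
  region           = "europe-west1"

  settings {
    tier = "db-custom-1-3840" # hypothetical machine tier
  }

  # Provider-level guard: Terraform refuses to delete the instance while this is true
  deletion_protection = true

  lifecycle {
    # Terraform-level guard: any plan that would destroy this resource errors out
    prevent_destroy = true
  }
}
```

With this in place, a change that forces replacement stops with an error at plan time instead of reaching the apply prompt.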
Drift happens when someone makes a change outside Terraform: resizing a Cloud SQL instance in the Console, manually deleting a firewall rule, tweaking a service account binding. The next time you run terraform plan, Terraform sees that real infrastructure no longer matches your code and flags the difference. That is drift detection, and it is one of the most practically useful things Terraform gives you.
A gcloud script is imperative: “create this bucket, add this IAM binding, enable this API.” If the bucket already exists, the create command fails unless the script checks for it first. Terraform is declarative: “here is what should exist.” If the bucket is already tracked in state and matches your configuration, Terraform does nothing. If it is missing, Terraform creates it. A bucket that exists but was created outside Terraform has to be imported first; otherwise the create call fails because the name is taken. This distinction matters most when you re-run Terraform against infrastructure that already partially exists, which is the normal situation on any real project.
IaC tools for GCP
GCP works with several IaC tools. For most teams, Terraform is the right default. The others serve narrower purposes.
Terraform
Terraform is the industry standard for GCP infrastructure. The hashicorp/google provider covers nearly every GCP resource and is actively maintained. Terraform uses HCL to declare desired state, and the plan/apply workflow means changes are reviewable before anything touches production. It integrates naturally with Cloud Build and CI/CD pipelines, and has a large ecosystem of modules and community documentation.
Config Connector
Config Connector is a Kubernetes add-on that lets you manage GCP resources through Kubernetes custom resource definitions. If your team already runs GKE and wants to manage cloud infrastructure through the same GitOps toolchain as your application manifests, Config Connector is a reasonable fit. It is not a general-purpose IaC tool. Do not adopt it unless you are already operating Kubernetes and want to avoid maintaining a separate Terraform workflow.
Google Deployment Manager (legacy)
Deployment Manager is Google’s original IaC tool, using YAML or Python templates. Google no longer recommends it for new projects. Resource coverage is incomplete, the tool has not kept pace with GCP growth, and the community is a fraction of the size of Terraform’s. If you encounter Deployment Manager in existing infrastructure, treat it as technical debt and plan a migration to Terraform when the opportunity arises.
gcloud scripts
Shell scripts using gcloud commands are useful for one-time migrations, bootstrapping, and quick automation tasks. They are not IaC in the full sense because they are imperative: they tell GCP what to do rather than describing what should exist. They cannot detect drift and cannot show you a plan before making changes. For persistent infrastructure, use Terraform.
Quick comparison
| Tool | Approach | Best for | Not suited for |
|---|---|---|---|
| Terraform | Declarative, state-managed | Any GCP project, any team size | Nothing, it is general purpose |
| Config Connector | Kubernetes CRDs | GKE teams using GitOps | Non-Kubernetes environments |
| Deployment Manager | Legacy YAML/Python | Existing legacy setups only | Any new project |
| gcloud scripts | Imperative shell | One-off tasks, bootstrapping | Persistent managed infrastructure |
When to use Infrastructure as Code
IaC pays off quickly on anything you intend to keep, anything you need to replicate, and anything that touches security. Here are the situations where it clearly belongs:
- Setting up environments. Defining dev, staging, and production as Terraform configurations means those environments are consistent and created the same way every time. The dev vs staging vs production guide covers how to structure this.
- Managing IAM and service accounts. IAM policy changes are high-risk. Putting them through Terraform means they are reviewed and tracked before anything is applied. See managing IAM with Terraform for a practical walkthrough.
- Networking. VPCs, subnets, firewall rules, and peering configurations are exactly the kind of thing that should never be hand-crafted on a real project.
- Standardising Cloud Run, GKE, and Cloud SQL infrastructure. If you run the same type of workload across multiple projects or environments, Terraform modules let you define it once and reuse it consistently.
- Infrastructure changes through pull requests. A PR gives your team a review step, a history, and the ability to revert. This connects naturally with managing environments in CI/CD.
- Disaster recovery. If a project is deleted or corrupted, a full terraform apply recreates the infrastructure from known-good code. Without IaC, this means piecing it back together from memory.
If the resource is going to be there next week, it belongs in Terraform. Exploratory sandboxes and throwaway tasks are fine in the Console. Everything else gets a Terraform file and a pull request.
Getting started with Terraform on GCP
Terraform authenticates to GCP using Application Default Credentials (ADC). Run the login command once on your workstation and Terraform picks up your credentials automatically, with no credentials needed in the provider block.
# Step 1: Authenticate Terraform to GCP using ADC
gcloud auth application-default login
# Step 2: Initialise a new Terraform working directory
# Downloads the google provider and sets up local state
terraform init
# Step 3: Preview what will change before touching anything
terraform plan
# Step 4: Apply the changes after reviewing the plan
terraform apply
The most important step is reading the plan. Before you type yes at the apply prompt, scan for any resource marked -/+ (destroy and recreate) or any unexpected deletion. A few seconds here prevents outages.
Start with a single small resource: a Cloud Storage bucket or a service account. Get comfortable with the init, plan, and apply cycle before bringing complex infrastructure under management. The Terraform for Google Cloud guide covers provider setup and authentication in more detail.
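A minimal provider configuration for that first resource might look like the sketch below. The variable name project_id and the region are assumptions; the point is that no credentials appear anywhere in the file.

```hcl
# Hypothetical input variable; supply it with -var or a .tfvars file
variable "project_id" {
  type        = string
  description = "ID of the GCP project to manage"
}

provider "google" {
  project = var.project_id
  region  = "europe-west1"
  # No credentials argument: the provider falls back to
  # Application Default Credentials
}
```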
The default Terraform setup stores state locally in a terraform.tfstate file. If two people run Terraform against the same infrastructure, or if your pipeline runs it, local state causes conflicts and can silently corrupt your state file. Configure a remote GCS backend before this becomes a shared project. See Terraform state management for the setup.
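For orientation, a GCS backend block is short. This sketch assumes a bucket named my-app-tfstate that already exists; both the bucket name and the prefix are placeholders.

```hcl
terraform {
  backend "gcs" {
    bucket = "my-app-tfstate"  # hypothetical bucket; it must exist before terraform init
    prefix = "terraform/state" # path inside the bucket for this configuration's state
  }
}
```

After adding the block, re-run terraform init; Terraform detects the backend change and offers to copy the existing local state into the bucket.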
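For CI/CD pipelines, authentication works differently. Running gcloud auth application-default login requires an interactive sign-in, so it only suits a developer workstation. Cloud Build runs each build as a service account, and Terraform picks up that identity through ADC with no keys involved. For pipelines that run outside GCP, such as GitHub Actions or GitLab CI, use Workload Identity Federation to authenticate without service account keys.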
Bringing existing resources under management
Most projects do not start from a blank slate. If you have resources already created through the Console, you can bring them under Terraform management using terraform import. Write the resource block first; the command then adds the existing resource to your state file under that address without recreating it.
# Import an existing Cloud Storage bucket into Terraform state
terraform import google_storage_bucket.assets my-app-prod-assets
After importing, run terraform plan. If the HCL you wrote accurately describes the existing resource, the plan should show no changes. If it shows changes, your configuration does not match reality. Fix the HCL until the plan is clean before moving on to the next resource.
Import one resource at a time. Importing large amounts of existing infrastructure all at once makes it hard to diagnose mismatches, and a broken state file is painful to untangle.
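On Terraform 1.5 or later, the same import can also be written declaratively as an import block, so it shows up in terraform plan and can go through a pull request like any other change. A sketch for the bucket above; the location is an assumption and should match whatever the real bucket uses.

```hcl
# Declarative alternative to the terraform import command (Terraform 1.5+)
import {
  to = google_storage_bucket.assets
  id = "my-app-prod-assets"
}

# The configuration the imported bucket is expected to match.
# The location here is an assumption; copy the real bucket's settings.
resource "google_storage_bucket" "assets" {
  name     = "my-app-prod-assets"
  location = "EUROPE-WEST1"
}
```

Once the import has been applied and the plan is clean, the import block can be removed.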
Never manually edit resources that Terraform manages. If you change a resource in the Console after Terraform created it, the next terraform apply may overwrite your change or flag unexpected drift. All changes go through Terraform.
Common beginner mistakes
Making “quick” Console changes. Every manual change on a Terraform-managed resource bypasses review and causes drift that nobody sees until the next plan. Even urgent fixes should be followed up with a code change to bring configuration back in sync. Build this habit early, or drift accumulates faster than you expect.
Using local state on a team. Terraform stores state locally by default. If two people run Terraform against the same infrastructure, or if Terraform runs from a CI pipeline, local state causes conflicts and potential data loss. Set up a remote GCS backend from day one. The state management guide covers this.
Not reading terraform plan carefully. The plan is the most important output in the workflow. A -/+ next to a Cloud SQL instance means it will be destroyed and recreated, which can drop data. Read the plan before every apply. Look for unexpected deletions and recreations.
Putting secrets in Terraform variables. Sensitive values like database passwords or API keys should not go into .tfvars files checked into git. Reference secrets from Secret Manager as data sources, or pass them through environment variables at apply time.
Importing everything at once. Bringing dozens of existing resources under Terraform management in a single session is a recipe for half-imported state and broken plans. Import incrementally, verify a clean plan after each resource, and keep the state consistent.
No environment separation. Running the same Terraform code against dev and production without clear separation leads to mistakes. Use separate state files and separate variable files for each environment. The dev vs staging vs production guide explains how to structure this.
Starting a new project with Deployment Manager. Deployment Manager is a legacy tool. Any effort put into it is time that could go into Terraform. New projects start with Terraform.
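To illustrate the advice on secrets from the list above, here is a sketch of both options: a Secret Manager data source and a sensitive variable supplied through the environment. The secret ID, variable name, user name, and instance name are all hypothetical.

```hcl
# Read the current password from Secret Manager at plan/apply time.
# The secret ID "db-app-password" is hypothetical.
data "google_secret_manager_secret_version" "db_password" {
  secret = "db-app-password"
}

# Alternative: a sensitive variable supplied through the environment
# (TF_VAR_api_key), so the value never appears in a .tfvars file
variable "api_key" {
  type      = string
  sensitive = true
}

# Hypothetical Cloud SQL user whose password comes from the secret
resource "google_sql_user" "app" {
  name     = "app"
  instance = "app-db"
  password = data.google_secret_manager_secret_version.db_password.secret_data
}
```

Either way, the value is still written to the Terraform state file, so access to wherever the state is stored needs to be restricted as well.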
Best practices for teams
IaC only delivers its full value when the whole team follows it consistently. Here is what that looks like in practice.
All long-lived infrastructure is in code. No exceptions. If a resource is going to be there next week, it belongs in Terraform. Establish this as a team norm before resources accumulate outside version control.
Changes go through pull requests. A pull request is your review step. Another engineer reads the plan output, checks the configuration, and approves before anything reaches production. This catches the kinds of mistakes that are easy to make alone.
Use remote state. Store Terraform state in a GCS bucket with object versioning enabled. The GCS backend locks the state automatically during each operation, which prevents corruption when multiple people or pipelines run Terraform concurrently. See Terraform state management for the configuration.
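A sketch of a state bucket with object versioning turned on, assuming the hypothetical bucket name my-app-tfstate and a European region:

```hcl
# Bucket that holds Terraform state. Typically created once in a small
# bootstrap configuration, separate from the infrastructure it tracks.
resource "google_storage_bucket" "tfstate" {
  name     = "my-app-tfstate" # hypothetical; bucket names are globally unique
  location = "EUROPE-WEST1"

  uniform_bucket_level_access = true

  # Keep prior generations of the state object so a bad write can be rolled back
  versioning {
    enabled = true
  }
}
```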
Organise the repository clearly. A well-structured Terraform repo is easier to review and maintain as infrastructure grows. The Terraform project structure guide covers environment directories, modules, and file conventions that scale well.
Separate environments cleanly. Dev, staging, and production should use separate state files and separate variable files. Changes should flow through environments in sequence, not be applied directly to production.
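One way to picture separate variable files is a shared set of variable declarations with one .tfvars file per environment. The file paths, variable names, and values below are hypothetical.

```hcl
# variables.tf (shared by every environment)
variable "project_id" {
  type = string
}

variable "min_instances" {
  type    = number
  default = 0
}

# environments/dev.tfvars (hypothetical values)
#   project_id    = "my-app-dev"
#   min_instances = 0

# environments/prod.tfvars (hypothetical values)
#   project_id    = "my-app-prod"
#   min_instances = 2
```

Each environment is then planned with its own file, for example terraform plan -var-file=environments/dev.tfvars, against its own state.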
Limit direct production access. Use IAM to restrict who can make changes directly in production. Routing all changes through your pipeline keeps your Terraform state accurate and reduces the chance of unreviewed changes landing in production.
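Lock your Terraform and provider versions. Agree on a Terraform version and a provider version across the team. Pin the Terraform version with a required_version constraint (or a .terraform-version file if you use a version manager such as tfenv), pin the provider in a required_providers block, and commit the .terraform.lock.hcl file that terraform init generates. Version drift between team members causes subtle bugs that are hard to diagnose.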
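A sketch of what that pinning can look like. The version numbers are illustrative only; use whichever versions your team has agreed on.

```hcl
terraform {
  # Illustrative constraints, not a recommendation
  required_version = "~> 1.9"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 6.0"
    }
  }
}
```

The ~> operator allows only the rightmost version component to increase, so "~> 6.0" accepts any 6.x provider release but not 7.0.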
You can also enforce infrastructure guardrails across your organisation using Policy as Code, which lets you write rules that prevent misconfigured resources from being applied at all.
Summary
- IaC means defining infrastructure in version-controlled files, not clicking through the Console.
- Terraform with the hashicorp/google provider is the standard IaC tool for GCP.
- The core workflow: write HCL, run terraform plan, review the output carefully, run terraform apply.
- Drift detection shows you when real infrastructure has diverged from what your code describes.
- Google Deployment Manager is a legacy tool. Do not use it for new projects.
- Config Connector is a Kubernetes-native option for teams already operating GKE.
- Use remote GCS state, separate environments, and pull request review to get the full benefit from IaC on a team.
Frequently asked questions
What is Infrastructure as Code in GCP?
Infrastructure as Code means defining your GCP resources in text files that you check into version control. Instead of creating a Cloud Storage bucket by clicking through the Console, you write a Terraform file that describes the bucket and Terraform creates it. Every change is tracked, reviewable, and repeatable.
Is Terraform the best IaC tool for Google Cloud?
For most teams, yes. The hashicorp/google provider covers nearly every GCP resource, is actively maintained, and is recommended by Google for new projects. If you are starting out, start with Terraform.
What is the difference between Terraform and gcloud scripts?
gcloud scripts are imperative: they tell GCP what actions to take. Terraform is declarative: you describe the desired end state and Terraform figures out what needs to change. Terraform also tracks state and detects drift, so you always know when real infrastructure has diverged from your code. Scripts cannot do this.
Is Deployment Manager deprecated or still usable?
Deployment Manager is a legacy tool. Google does not recommend it for new projects. It has incomplete resource coverage and has not kept pace with GCP growth. Start with Terraform instead.
Can I manage existing GCP resources with Terraform?
Yes, using terraform import. You write HCL that matches the existing resource, point Terraform at the resource, and Terraform adds it to state without recreating it. After a successful import, running terraform plan should show no changes. Import resources one at a time to keep things manageable.