Cloud Security Basics Every Engineer Needs to Know
You do not need to be a security specialist to work safely in the cloud. But you do need a baseline that keeps you from introducing the kinds of vulnerabilities that cause real incidents. This page covers the non-negotiable security knowledge every cloud engineer should have from day one.
IAM and the principle of least privilege
Identity and Access Management (IAM) is the single most important security control in any cloud environment. Most cloud breaches do not involve exotic attacks — they involve an overly permissive IAM configuration that let an attacker do things they should not have been able to do.
The principle of least privilege means every identity — user, service account, application — gets only the permissions it actually needs to do its job. Nothing more. A service that reads from a storage bucket should have the storage read role, not the owner role. A deployment pipeline that creates VMs should have permission to create VMs in specific projects, not admin access to the entire organisation.
What this looks like in practice
Start with the narrowest possible role. If a service needs to read objects from a specific bucket, bind it to the bucket-level reader role on that bucket — not a project-level role that grants access to all buckets. If it only needs to read and does not need to list bucket contents, use an even more restrictive custom role.
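As a minimal sketch of the narrowest-role idea, the snippet below builds an AWS-style policy document that allows only object reads on a single bucket. The function name read_only_bucket_policy and the bucket name are hypothetical. The document shape follows the standard AWS IAM JSON policy format, and it grants s3:GetObject without s3:ListBucket to mirror the read-without-list case described above.

```python
import json


def read_only_bucket_policy(bucket_name: str) -> dict:
    """Return an IAM policy document that allows reading objects in one bucket only."""
    if not bucket_name or "*" in bucket_name:
        # Refuse wildcards so the policy cannot silently widen to other buckets.
        raise ValueError("bucket_name must be a single, concrete bucket")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],  # read objects; no list, write, or delete
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            }
        ],
    }


if __name__ == "__main__":
    # "example-reports-bucket" is a placeholder name for illustration.
    print(json.dumps(read_only_bucket_policy("example-reports-bucket"), indent=2))
```

Generating the policy from the bucket name, rather than writing it by hand, makes it harder for a wildcard resource to slip in during a copy and paste.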
Common mistakes beginners make with IAM:
- Attaching roles/editor (GCP) or AdministratorAccess (AWS) to service accounts because it is easy and “we can fix it later”
- Sharing service account credentials between multiple services instead of creating separate accounts
- Granting IAM permissions to personal user accounts instead of service accounts for automated processes
- Forgetting to remove access when a person leaves the team or a service is decommissioned
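Some of these mistakes can be caught mechanically. The sketch below assumes you have already exported your IAM bindings into a simple in-memory form. The Binding class, its automated flag, and the "user:" member prefix convention are illustrative assumptions, not any provider's API, and roles/owner is added to the broad-role set as a further assumption.

```python
from dataclasses import dataclass

# Broad roles named in the list above, plus roles/owner as an assumed addition.
BROAD_ROLES = {"roles/editor", "roles/owner", "AdministratorAccess"}


@dataclass(frozen=True)
class Binding:
    member: str      # e.g. "serviceAccount:ci@example" or "user:alice@example.com"
    role: str
    automated: bool  # True when the identity is used by a pipeline or service


def find_risky_bindings(bindings: list[Binding]) -> list[tuple[Binding, str]]:
    """Return (binding, reason) pairs for bindings that break least privilege."""
    findings = []
    for binding in bindings:
        if binding.role in BROAD_ROLES:
            findings.append((binding, "broad role"))
        if binding.automated and binding.member.startswith("user:"):
            findings.append((binding, "personal account used for automation"))
    return findings
```

A check like this will not catch stale access for people who have left. That still needs a periodic review against the current team roster.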
Service accounts vs user accounts
Automated processes — CI/CD pipelines, applications running on compute instances, scheduled jobs — should use service accounts (or managed identities on Azure), not personal user credentials. Service accounts can be scoped precisely, rotated, and audited independently of individual users. If a service account is compromised, you can revoke it without affecting any person’s access.
Secrets management: never hardcode credentials
Hardcoded credentials (API keys, database passwords, service account keys embedded in source code) are one of the most common causes of cloud security incidents. Secrets committed to version control are frequently discovered by automated scanners, and once a secret is in git history, deleting it from the current code does not make it safe: the old value remains retrievable from earlier commits and existing clones, so the credential must be rotated.
The rule is simple: secrets never live in code, environment variable files checked into version control, or container image layers. They belong in a secrets management service.
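To illustrate how automated scanners find committed secrets, here is a small sketch of a pattern-based scan. The two regular expressions are illustrative assumptions and far from exhaustive: one matches the common "AKIA" prefix format of AWS access key IDs, the other a quoted value assigned to a name containing "password", "passwd", or "secret". Dedicated scanning tools use much larger rule sets.

```python
import re

# Illustrative patterns only; real scanners ship many more rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "hardcoded_password": re.compile(
        r"(?i)(password|passwd|secret)\s*[:=]\s*['\"][^'\"]{4,}['\"]"
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for each suspicious line in the text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((line_no, name))
    return hits
```

Running a scan like this in a pre-commit hook or CI job catches a secret before it reaches shared history, which is far cheaper than rotating it afterwards.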
The right tools for secrets
- AWS Secrets Manager or AWS Systems Manager Parameter Store — store and retrieve credentials programmatically, with automatic rotation for supported services
- GCP Secret Manager — versioned secrets, IAM-controlled access, audit logging on every access
- Azure Key Vault — secrets, certificates, and encryption keys in a managed service
- HashiCorp Vault — vendor-neutral option, common in multi-cloud or on-premises environments
Applications fetch secrets at runtime from the secrets service rather than reading them from environment variables or config files. If a secret needs to change, you update it in the secrets service and the application picks up the new value on its next restart (or immediately, if designed to do so). No code changes, no redeployment required.
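A minimal sketch of runtime fetching with a short-lived cache follows. It assumes a client object exposing get_secret_value(SecretId=...) that returns a dict containing "SecretString", which is the shape of the boto3 Secrets Manager client. The SecretCache class, its TTL default, and the injectable clock are hypothetical choices made for illustration and testability.

```python
import time
from typing import Any, Callable


class SecretCache:
    """Fetch secrets at runtime and cache them briefly to limit calls to the secrets service."""

    def __init__(
        self,
        client: Any,
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._client = client  # assumed to expose get_secret_value(SecretId=...)
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: dict[str, tuple[float, str]] = {}

    def get(self, secret_id: str) -> str:
        now = self._clock()
        cached = self._cache.get(secret_id)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        # Cache miss or expired entry: ask the secrets service for the current value.
        response = self._client.get_secret_value(SecretId=secret_id)
        value = response["SecretString"]
        self._cache[secret_id] = (now, value)
        return value
```

With boto3 installed, the client could be created with boto3.client("secretsmanager"). Because entries expire, a rotated secret is picked up after at most one TTL without a restart, at the cost of one extra call to the service per expiry.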
What to do about existing hardcoded secrets
If you discover hardcoded credentials in existing code: rotate the credential immediately (before removing it from the code), then migrate to a secrets service, then clean up the history if feasible. The rotation comes first — the key is already exposed, so cleaning up the code without rotating does nothing to reduce risk.
Encryption at rest and in transit
Two types of encryption matter for cloud engineers, and they solve different problems.
Encryption at rest
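Data at rest is data stored on disk: databases, storage buckets, block volumes, snapshots. Major cloud providers encrypt most storage services at rest by default using provider-managed keys, though some services leave it as an opt-in setting (on AWS, for example, default encryption of new EBS volumes is a per-region account setting). For most workloads the default is sufficient, but confirm it is actually on for each service you use.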
When you need more control — regulated industries, sensitive data, compliance requirements — you use Customer-Managed Encryption Keys (CMEK). You create and manage the encryption key in a key management service (AWS KMS, GCP Cloud KMS, Azure Key Vault). You control who can use the key, you can audit every use, and you can revoke the key if needed. The trade-off is operational overhead: if you lose or accidentally delete the key, the data is unrecoverable.
Encryption in transit
Data in transit is data moving between services, between the client and a server, or between cloud regions. This should always travel over TLS. Practically, this means:
- Enforcing HTTPS-only on storage buckets and APIs (reject plain HTTP requests)
- Using managed certificate services (AWS ACM, GCP Certificate Manager) rather than managing TLS certificates manually
- Ensuring internal service-to-service traffic uses TLS, not just external traffic
- Using modern TLS versions — TLS 1.2 minimum, TLS 1.3 preferred; disable older versions
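On the client side, the TLS version floor can be enforced in code. The sketch below uses Python's standard ssl module. The helper name strict_client_context is hypothetical.

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The returned context can be passed to standard-library clients, for example urllib.request.urlopen(url, context=strict_client_context()). TLS 1.3 is negotiated automatically when both sides support it.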
Public access controls
One of the most common cloud breach patterns is a misconfigured storage bucket that is publicly accessible. An engineer creates a bucket to share some files, accidentally enables public access, forgets about it, and months later a security researcher (or attacker) finds it and its contents are exposed.
Block public access at the account or organisation level wherever possible. AWS provides an S3 Block Public Access setting that can be enforced at the account level, preventing any individual bucket from being made public regardless of its own settings. GCP and Azure have equivalent controls.
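S3 Block Public Access is made up of four flags, and the protection described above only holds when all four are on. The sketch below checks a configuration dict for gaps. It assumes the dict has the shape of the PublicAccessBlockConfiguration structure used by the AWS API, and the function name is hypothetical.

```python
# The four settings that together make up S3 Block Public Access.
REQUIRED_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def missing_public_access_blocks(config: dict) -> list[str]:
    """Return the Block Public Access flags that are absent or not set to True."""
    return [flag for flag in REQUIRED_FLAGS if config.get(flag) is not True]
```

An empty result means the configuration is fully locked down. Any flag name returned is a gap worth investigating.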
If you genuinely need public access to a bucket — for a static website, for public downloads — make this an explicit, documented decision. Use the minimum access required (read-only, specific paths), and audit the bucket regularly to confirm the contents are appropriate for public access.
The same principle applies to compute: no instance should have a public IP address unless it is specifically meant to be publicly accessible. Internal services should live in private subnets and be accessible only through load balancers or VPN.
Security groups and firewall rules
Security groups (AWS) and firewall rules (GCP) control which network traffic is allowed to reach your resources. Getting these wrong is a common source of both security vulnerabilities and application connectivity problems.
Key rules to follow:
- Deny by default: start with no inbound access allowed, then explicitly permit only what is needed
- Restrict source IP ranges: if only your office or VPN needs SSH access, allow only those IP ranges, not 0.0.0.0/0
- Avoid opening 0.0.0.0/0 on administrative ports: SSH (22) and RDP (3389) should never be open to the world
- Use security group references where possible: instead of allowing traffic from an IP range, allow traffic from another security group. This is more maintainable and more precise
- Audit firewall rules regularly: rules accumulate over time. Review and remove rules that are no longer needed
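Part of a regular rule audit can be automated. The sketch below flags inbound rules that expose SSH or RDP to every address. The InboundRule class is a hypothetical, provider-neutral representation, and you would need to map your provider's rule export onto it.

```python
import ipaddress
from dataclasses import dataclass

ADMIN_PORTS = {22, 3389}  # SSH and RDP


@dataclass(frozen=True)
class InboundRule:
    source_cidr: str
    from_port: int
    to_port: int


def is_open_to_world(rule: InboundRule) -> bool:
    """True when the source range covers every address (prefix length 0)."""
    return ipaddress.ip_network(rule.source_cidr, strict=False).prefixlen == 0


def exposed_admin_ports(rules: list[InboundRule]) -> list[tuple[InboundRule, int]]:
    """Return (rule, port) pairs where an administrative port is reachable from anywhere."""
    findings = []
    for rule in rules:
        if not is_open_to_world(rule):
            continue
        for port in sorted(ADMIN_PORTS):
            if rule.from_port <= port <= rule.to_port:
                findings.append((rule, port))
    return findings
```

Checking the prefix length rather than comparing strings means the IPv6 equivalent, ::/0, is caught as well, along with port ranges that happen to include 22 or 3389.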
A practical test: after setting up a new service, try connecting to it from the public internet using a port that should not be accessible. If you can connect, your firewall rules are too permissive.
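The practical test above can be scripted with the standard socket module. This is a sketch: the function name can_connect is hypothetical, and it must be run from a machine outside your private network to say anything about public reachability.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A True result for a port that should be closed means a rule is too permissive. A False result only shows that this one path is blocked, so it complements a review of the rules rather than replacing it.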
Audit logging: knowing what happened and when
Audit logs record the API calls made in your cloud environment: who did what, when, and from where. Without audit logs, you have no way to investigate a security incident, answer compliance questions, or understand how a misconfiguration was introduced.
Enable audit logging at the organisation or account level and treat it as a non-negotiable baseline:
- AWS CloudTrail: records management API activity across your AWS account; data events such as S3 object-level access must be enabled explicitly. Enable it in all regions, including global services
- GCP Cloud Audit Logs: Admin Activity logs are always on; Data Access logs must be explicitly enabled for services that handle sensitive data
- Azure Activity Log and Azure Monitor: captures management-plane operations across the subscription
Store audit logs in a separate account or project from the one being audited. If an attacker compromises a production account and has admin access, you do not want them to also be able to delete the evidence of what they did.
Set up alerts for high-risk events: IAM policy changes, root account or super-admin login, firewall rule changes, public access enabled on a storage bucket. These events should generate notifications, not be discovered in a weekly review.
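As a sketch of how such alerting might filter events, the snippet below picks high-risk records out of a list of CloudTrail-style event dicts. It assumes each record carries the eventName and userIdentity.type fields found in CloudTrail records. The set of event names is an illustrative starting point, not a complete list, and the function names are hypothetical.

```python
# Illustrative subset of event names; extend it to match your own risk model.
HIGH_RISK_EVENT_NAMES = {
    "PutBucketPolicy",                # bucket access policy changed
    "PutBucketAcl",                   # bucket ACL changed
    "AuthorizeSecurityGroupIngress",  # inbound firewall rule added
    "AttachUserPolicy",               # IAM policy attached to a user
    "AttachRolePolicy",               # IAM policy attached to a role
    "PutUserPolicy",                  # inline IAM policy set on a user
    "PutRolePolicy",                  # inline IAM policy set on a role
}


def is_high_risk(event: dict) -> bool:
    """True for IAM, firewall, or bucket-policy changes and for root console logins."""
    name = event.get("eventName")
    if name in HIGH_RISK_EVENT_NAMES:
        return True
    is_root = event.get("userIdentity", {}).get("type") == "Root"
    return is_root and name == "ConsoleLogin"


def high_risk_events(events: list[dict]) -> list[dict]:
    """Keep only the events that should trigger a notification."""
    return [event for event in events if is_high_risk(event)]
```

In a real setup this filtering usually lives in the provider's own alerting rules rather than in application code. Either way, the decision about which events matter is made once, in advance, so that it does not depend on someone reading logs.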
What a security review looks like in practice
Most teams do not have dedicated security engineers reviewing every change. In practice, cloud engineers are the first line of defence. A basic security review of a new piece of infrastructure involves checking:
- Does every IAM binding follow least privilege? Are there any wildcard permissions?
- Are secrets stored in a secrets service, not in code or environment variables?
- Are storage resources private by default? Is public access an explicit, justified exception?
- Are firewall rules specific, with no 0.0.0.0/0 inbound on anything other than HTTPS?
- Is encryption at rest using at least provider-managed keys?
- Is all traffic between services using TLS?
- Is audit logging enabled and being sent somewhere persistent?
Running this checklist on a new Terraform module before merging takes five minutes and catches the common mistakes before they reach production. This is what security-conscious cloud engineering looks like in practice — not a separate security audit, but continuous attention during normal engineering work.
Summary
- Least-privilege IAM is the single most important security control — grant only the permissions a service actually needs, and review them regularly
- Secrets belong in a managed secrets service, never in source code, environment files, or container images
- Block public access at the account level and treat any public resource as a deliberate, documented exception
- Firewall rules should default to deny, and administrative ports should never be open to the internet
- Enable audit logging from day one and store logs in a separate account so they cannot be tampered with