Cloud Engineer Cheatsheet: Key Concepts, Services, and Patterns

This page is a quick reference for working cloud engineers and people studying for cloud roles. It covers the core service categories, networking and IAM fundamentals, storage patterns, and architectural concepts you will use on the job every day.


Core Service Categories

| Category | What it does | AWS | GCP | Azure |
|---|---|---|---|---|
| Compute | Run application workloads | EC2 | Compute Engine | Virtual Machines |
| Storage | Persist data | S3 / EBS / EFS | Cloud Storage / Persistent Disk | Blob / Disk / Files |
| Networking | Connect resources | VPC / Route 53 / ELB | VPC / Cloud DNS / Cloud LB | VNet / Azure DNS / Load Balancer |
| Database | Managed data stores | RDS / DynamoDB | Cloud SQL / Spanner / Firestore | Azure SQL / Cosmos DB |
| Identity | Auth and access control | IAM | Cloud IAM | Microsoft Entra ID |
| Monitoring | Observe running systems | CloudWatch | Cloud Monitoring / Logging | Azure Monitor |
| Serverless | Run code without managing VMs | Lambda | Cloud Functions / Cloud Run | Azure Functions |

Networking Fundamentals

VPC / VNet — A logically isolated virtual network inside a cloud provider. You define the IP address range (using CIDR notation), subnets, and routing rules.

CIDR notation — A way of expressing IP ranges. 10.0.0.0/16 means the first 16 bits are fixed, giving you 65,536 addresses. /24 gives 256 addresses. Smaller suffix = larger range.
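The address arithmetic above can be checked with Python's standard `ipaddress` module. This is a minimal sketch; the /24 subnet layout and the host address are illustrative assumptions, not a recommended design.

```python
import ipaddress

# The range from the text: first 16 bits fixed, 16 bits free for hosts.
vpc = ipaddress.ip_network("10.0.0.0/16")
print(vpc.num_addresses)                      # 65536

# Carve the range into /24 subnets (hypothetical layout for illustration).
subnets = list(vpc.subnets(new_prefix=24))
print(len(subnets))                           # 256
print(subnets[0], subnets[0].num_addresses)   # 10.0.0.0/24 256

# Check which subnet a given (made-up) host address falls into.
host = ipaddress.ip_address("10.0.3.7")
print(host in subnets[3])                     # True
```

Note that the usable count is slightly lower in practice, because cloud providers reserve a few addresses in each subnet.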

Subnets

NAT Gateway — Allows resources in private subnets to make outbound connections (e.g., downloading updates) without being directly reachable from the internet.

Load Balancers

DNS concepts — DNS translates domain names to IP addresses. A records point to IPv4 addresses. CNAME records are aliases. TTL (Time To Live) controls how long records are cached. Cloud providers offer managed DNS: Route 53 (AWS), Cloud DNS (GCP), Azure DNS.
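To make the TTL behaviour concrete, here is a small sketch of a caching resolver. The `CachingResolver` class, the `upstream` callable, and the example names and addresses are all hypothetical; real resolvers are considerably more involved.

```python
import time

class CachingResolver:
    """Toy resolver: caches each answer until its TTL expires (illustrative only)."""

    def __init__(self, upstream, clock=time.monotonic):
        self._upstream = upstream      # callable: name -> (address, ttl_seconds)
        self._clock = clock            # injectable so the demo can fake time
        self._cache = {}               # name -> (address, expiry_time)
        self.upstream_queries = 0

    def resolve(self, name):
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]           # still fresh: serve the cached answer
        self.upstream_queries += 1
        address, ttl = self._upstream(name)
        self._cache[name] = (address, now + ttl)
        return address

# Demo with a fake clock and a made-up A record (192.0.2.0/24 is a documentation range).
fake_now = [0.0]
zone = {"app.example.com": ("192.0.2.10", 60)}
resolver = CachingResolver(lambda name: zone[name], clock=lambda: fake_now[0])

first = resolver.resolve("app.example.com")
zone["app.example.com"] = ("192.0.2.20", 60)   # the record changes upstream
fake_now[0] = 30.0
during_ttl = resolver.resolve("app.example.com")  # old answer, TTL not yet expired
fake_now[0] = 61.0
after_ttl = resolver.resolve("app.example.com")   # TTL expired, new answer fetched
```

This is why a long TTL reduces lookup traffic but delays how quickly a changed record is seen by clients.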


IAM Concepts

| Term | Definition |
|---|---|
| Principal | The identity making a request: a user, service account, or role |
| Policy | A document defining what actions are allowed or denied on which resources |
| Role | A set of permissions that can be attached to a principal |
| Least privilege | Grant only the permissions needed for the task and nothing more |
| Authentication | Proving who you are (identity) |
| Authorisation | Determining what you are allowed to do (permissions) |
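As an illustration of how a policy maps actions and resources to an allow or deny decision, here is a toy evaluator. The `Statement` type, the action names, and the resource paths are hypothetical. The rules used here (deny by default, an explicit deny overrides any allow) are an assumption loosely modelled on common provider behaviour, not any provider's actual engine.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Iterable, Tuple

@dataclass(frozen=True)
class Statement:
    effect: str                  # "Allow" or "Deny"
    actions: Tuple[str, ...]     # glob patterns, e.g. "storage:Get*"
    resources: Tuple[str, ...]   # glob patterns, e.g. "reports/*"

def is_allowed(statements: Iterable[Statement], action: str, resource: str) -> bool:
    """Default deny; any matching Deny wins over any matching Allow."""
    allowed = False
    for stmt in statements:
        matches = (any(fnmatchcase(action, p) for p in stmt.actions)
                   and any(fnmatchcase(resource, p) for p in stmt.resources))
        if not matches:
            continue
        if stmt.effect == "Deny":
            return False
        if stmt.effect == "Allow":
            allowed = True
    return allowed

# A least-privilege style policy: read-only access to one prefix, minus a private area.
policy = [
    Statement("Allow", ("storage:GetObject",), ("reports/*",)),
    Statement("Deny", ("storage:*",), ("reports/private/*",)),
]

print(is_allowed(policy, "storage:GetObject", "reports/2024.csv"))       # True
print(is_allowed(policy, "storage:GetObject", "reports/private/a.csv"))  # False
print(is_allowed(policy, "storage:DeleteObject", "reports/2024.csv"))    # False
```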

Service accounts vs IAM users — IAM users are for humans. Service accounts (AWS IAM roles for services, GCP service accounts, Azure Managed Identities) are for applications and automation. Avoid giving applications long-lived user credentials. Use service accounts or instance-attached roles instead.


Storage Patterns

| Type | What it is | Best for | AWS | GCP | Azure |
|---|---|---|---|---|---|
| Object | Flat namespace, accessed via API | Backups, static files, data lakes, media | S3 | Cloud Storage | Blob Storage |
| Block | Raw storage volumes attached to a VM | OS disks, databases, high-IOPS workloads | EBS | Persistent Disk | Azure Disk |
| File | Shared file system mounted by multiple VMs | Shared app configs, legacy NFS workloads | EFS | Filestore | Azure Files |

Object storage is the most common choice for new cloud-native workloads. Block storage is used when you need low-latency disk access (databases, OS volumes). File storage is used when multiple VMs need to share the same directory tree.


Compute Options: When to Use What

| Option | Best for |
|---|---|
| Virtual Machines | Full OS control, legacy apps, stateful workloads, GPU or specialised hardware needs |
| Containers | Portable microservices, repeatable environments, CI/CD pipelines, Kubernetes workloads |
| Serverless | Event-driven functions, infrequent workloads, API backends; no idle cost, no server management |

Key Architectural Patterns

Stateless vs stateful — A stateless service holds no session data in memory. Any instance can serve any request. This makes horizontal scaling and rolling deployments much simpler. Store session data in a cache (Redis) or database instead of in the application process.
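A rough sketch of the stateless pattern: two application instances keep no session data of their own and read and write a shared store instead. `SessionStore` is a hypothetical in-memory stand-in for an external cache such as Redis, and `AppInstance` is likewise invented for illustration.

```python
class SessionStore:
    """In-memory stand-in for an external store such as Redis (assumption)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return dict(self._data.get(session_id, {}))

    def put(self, session_id, session):
        self._data[session_id] = dict(session)


class AppInstance:
    """Holds no session state itself; everything lives in the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        cart = session.get("cart", []) + [item]   # build a new list, no shared mutation
        session["cart"] = cart
        self.store.put(session_id, session)
        return cart


store = SessionStore()
instance_a = AppInstance("a", store)
instance_b = AppInstance("b", store)

instance_a.add_to_cart("session-1", "book")
# A different instance serves the next request and still sees the full cart.
print(instance_b.add_to_cart("session-1", "pen"))   # ['book', 'pen']
```

Because either instance can serve either request, a load balancer is free to route traffic anywhere and instances can be replaced without losing sessions.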

Idempotency — An operation is idempotent if running it multiple times produces the same result as running it once. This is critical for retries and message queues, where the same request may be delivered more than once. An HTTP PUT that sets a value to 42 is idempotent; a POST that increments a counter is not.
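The PUT versus POST contrast can be sketched with a toy in-memory service. The `CounterService` class is hypothetical. The idempotency-key variant is a common technique for making a non-idempotent operation safe to retry; it is added here for illustration and is not something the text above prescribes.

```python
class CounterService:
    """Toy service contrasting idempotent and non-idempotent operations."""

    def __init__(self):
        self.value = 0
        self._results_by_key = {}   # idempotency key -> result already returned

    def put_value(self, new_value):
        # Idempotent: repeating the call leaves the same final state.
        self.value = new_value
        return self.value

    def post_increment(self):
        # Not idempotent: every retry changes the state again.
        self.value += 1
        return self.value

    def post_increment_once(self, idempotency_key):
        # Made retry-safe: a repeated key returns the stored result instead.
        if idempotency_key in self._results_by_key:
            return self._results_by_key[idempotency_key]
        self.value += 1
        self._results_by_key[idempotency_key] = self.value
        return self.value


svc = CounterService()
svc.put_value(42)
svc.put_value(42)                      # retry: still 42
print(svc.value)                       # 42

svc.post_increment()
svc.post_increment()                   # retry: counted twice
print(svc.value)                       # 44

svc.post_increment_once("req-123")
svc.post_increment_once("req-123")     # retry with the same key: counted once
print(svc.value)                       # 45
```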

Immutable infrastructure — Never modify running servers. Instead, build a new image, deploy it, and retire the old one. This eliminates configuration drift and makes rollbacks predictable.

Blue/green deployment — Run two identical environments (blue = current, green = new). Route traffic to green when ready. Rollback is instant: switch traffic back to blue. Requires double the infrastructure during the switch.

Rolling deployment — Replace instances gradually, a few at a time. Lower resource cost than blue/green but rollback is slower.
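The two deployment styles can be contrasted in a few lines. This is a sketch only: `BlueGreenRouter`, `rolling_batches`, and the instance IDs are hypothetical, and real deployments also involve health checks and connection draining that are omitted here.

```python
from typing import List, Sequence


class BlueGreenRouter:
    """Blue/green: all traffic points at one environment; switching is one step."""

    def __init__(self):
        self.live = "blue"

    def cut_over(self):
        self.live = "green"

    def roll_back(self):
        self.live = "blue"


def rolling_batches(instances: Sequence[str], batch_size: int) -> List[List[str]]:
    """Rolling: split instances into the batches that get replaced one after another."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(instances[i:i + batch_size])
            for i in range(0, len(instances), batch_size)]


router = BlueGreenRouter()
router.cut_over()
router.roll_back()                 # rollback is a single switch
print(router.live)                 # blue

fleet = ["i-1", "i-2", "i-3", "i-4", "i-5"]
print(rolling_batches(fleet, 2))   # [['i-1', 'i-2'], ['i-3', 'i-4'], ['i-5']]
```

Rolling back a rolling deployment means running the batches again with the old version, which is why it is slower than flipping the router.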


Cost Basics

Common cost drivers

Pricing models

| Model | Description |
|---|---|
| On-demand / pay-as-you-go | Full price, no commitment, maximum flexibility |
| Reserved / committed use | 1- or 3-year commitment, 30–70% discount |
| Spot / preemptible | Spare capacity at a steep discount (60–90%), can be interrupted at short notice |

Use reserved instances for predictable baseline workloads. Use spot/preemptible for batch jobs, CI builds, and fault-tolerant workloads.
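A quick way to compare the models is to apply each discount to the same on-demand rate. The hourly rate, the 730-hour month, and the specific discounts below are made-up illustrative numbers chosen from within the ranges in the table, not real prices.

```python
HOURS_PER_MONTH = 730          # common approximation: 8760 hours / 12 months

def monthly_cost(on_demand_hourly: float, discount: float = 0.0,
                 hours: float = HOURS_PER_MONTH) -> float:
    """Cost of one instance for `hours`, after a fractional discount (0.0 to 1.0)."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0.0 and 1.0")
    return on_demand_hourly * hours * (1.0 - discount)

rate = 0.10                    # hypothetical on-demand price in dollars per hour

print(f"on-demand: {monthly_cost(rate):.2f}")         # on-demand: 73.00
print(f"reserved:  {monthly_cost(rate, 0.40):.2f}")   # reserved:  43.80
print(f"spot:      {monthly_cost(rate, 0.70):.2f}")   # spot:      21.90
```

The reserved figure only holds if the instance actually runs for the whole commitment; an idle reserved instance still costs the committed amount.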


Reliability Concepts

Regions — Geographically separated data centre clusters. Deploying across regions protects against regional outages.

Availability Zones (AZs) — Isolated locations within a single region, each made up of one or more physically separate data centres and connected to the others by low-latency links. Deploying across AZs protects against single-facility failures.

Fault tolerance — The system continues operating correctly even when a component fails. Achieved through redundancy, health checks, and automatic failover.
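The effect of redundancy can be estimated with simple probability. The sketch below assumes that copies fail independently and that failover is perfect; both are simplifying assumptions, since real failures are often correlated. The function name and the 99% figure are illustrative.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant components, assuming independent failures.

    The system is down only when every copy is down at the same time.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0.0 and 1.0")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - single) ** copies

# One copy at a hypothetical 99% versus the same service in two or three AZs.
for n in (1, 2, 3):
    print(n, f"{combined_availability(0.99, n):.4%}")
```

Under these assumptions, two copies at 99% each give roughly 99.99%, which is why spreading across AZs is usually the first reliability step.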

RTO (Recovery Time Objective) — The maximum acceptable time for a system to be restored after a failure.

RPO (Recovery Point Objective) — The maximum acceptable amount of data loss measured in time. An RPO of 1 hour means you can tolerate losing up to 1 hour of data.
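RTO and RPO become useful once they are checked against an actual backup and recovery plan. A minimal sketch, assuming periodic backups where the worst-case data loss equals the interval between backups, and where recovery time is detection time plus restore time. The function names and durations are hypothetical.

```python
from datetime import timedelta

def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case, a failure just before the next backup loses one full interval."""
    return backup_interval <= rpo

def meets_rto(detection: timedelta, restore: timedelta, rto: timedelta) -> bool:
    """Total downtime is modelled as time to notice plus time to restore."""
    return detection + restore <= rto

rpo = timedelta(hours=1)
rto = timedelta(hours=4)

print(meets_rpo(timedelta(hours=6), rpo))       # False: could lose up to 6 hours
print(meets_rpo(timedelta(minutes=15), rpo))    # True
print(meets_rto(timedelta(minutes=10), timedelta(hours=2), rto))   # True
```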


Quick Decision Guide

| If you need… | Use… |
|---|---|
| Store large files or backups | Object storage (S3 / Cloud Storage / Blob) |
| Run a containerised app at scale | Kubernetes (EKS / GKE / AKS) |
| Run event-driven code without a server | Serverless functions (Lambda / Cloud Functions / Azure Functions) |
| Share a file system across multiple VMs | File storage (EFS / Filestore / Azure Files) |
| Route HTTPS traffic based on URL path | L7 load balancer (ALB / Cloud LB / App Gateway) |
| Reduce costs on long-running VMs | Reserved / committed use pricing |
| Run fault-tolerant batch jobs cheaply | Spot / preemptible instances |
| Restrict what an application can access | Service account or instance-attached IAM role |
| Protect against a single data centre outage | Multi-AZ deployment |
| Protect against a regional outage | Multi-region deployment |