EC2 Instance Types Explained: Families, Sizes, and How to Choose the Right One
An EC2 instance type determines the CPU, RAM, storage, and network available to your virtual machine. Picking the right one matters, not because you cannot change it later, but because the right starting point saves you time debugging performance problems and money on a bill that grows every hour.
Think of it like renting a vehicle. You first choose the category (compact, SUV, truck) based on the job, then you choose the size within that category. EC2 families work the same way: pick the right family for your workload type, then size up or down based on how much of those resources you need.
- T-family for low-cost bursty workloads (dev, low-traffic sites)
- M-family for steady general-purpose workloads (app servers, APIs)
- C-family for CPU-heavy workloads (encoding, batch processing)
- R-family for memory-heavy workloads (Redis, large databases, Spark)
- Storage-optimised and GPU families only when the workload clearly needs them
EC2 instance types in one simple explanation
When you launch an EC2 instance, you are renting a virtual machine. The instance type is the specification sheet for that machine: it controls how fast the CPU is, how much RAM you get, what storage is attached, and how much network bandwidth is available.
AWS groups instance types into families based on what they are optimised for, and within each family there are sizes that let you scale up or down while keeping the same resource ratio.
Simple rule of thumb:
- You are running a web server or API: start with T or M family.
- Your app spends most of its time in computation (encoding, compiling, math): try C family.
- Your app loads a large dataset into memory (Redis, in-memory DB, Spark): try R family.
- You need fast local disk I/O (NoSQL, OLTP): try I family.
- You are training or running ML models: try G or P family.
- You are unsure: start with t3.medium or m7i.large, monitor for a week, and resize.
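The rule of thumb above can be written down as a small lookup. This is only a sketch: the workload labels ("web", "compute", and so on) are hypothetical names chosen for illustration, not AWS terminology.

```python
def suggest_starting_family(workload: str) -> str:
    """Map a coarse workload label to the starting point from the rule of thumb."""
    # The keys are hypothetical labels invented for this example.
    rules = {
        "web": "T or M family",        # web server or API
        "compute": "C family",         # encoding, compiling, math
        "memory": "R family",          # Redis, in-memory DB, Spark
        "local-disk": "I family",      # NoSQL, OLTP needing fast local I/O
        "ml": "G or P family",         # training or running ML models
    }
    # Anything else falls back to the "unsure" default.
    return rules.get(workload, "t3.medium or m7i.large")


print(suggest_starting_family("memory"))
print(suggest_starting_family("no idea yet"))
```

A lookup like this is only a starting point; the week of monitoring afterwards is what tells you whether the guess was right.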
How to read an EC2 instance type name
Every EC2 instance type follows the same naming pattern:
[family][generation][processor/options].[size]
Three examples decoded:
- t4g.medium: T family (burstable), 4th generation, Graviton (ARM), medium size
- m7i.xlarge: M family (general purpose), 7th generation, Intel, xlarge size
- c6g.2xlarge: C family (compute optimised), 6th generation, Graviton, 2xlarge size
Processor/options suffixes
| Suffix | Meaning |
|---|---|
| g | AWS Graviton (ARM64 processor) |
| i | Intel processor |
| a | AMD processor (usually cheaper than Intel equivalent) |
| n | Higher network bandwidth |
| d | Local NVMe SSD instance storage included |
| b | Block storage optimised |
| z | High-frequency processor |
| e | Extra storage or memory capacity |
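As a rough illustration of the naming pattern, the sketch below splits a type name into its four parts and looks up each suffix letter in the table above. The function and class names are made up for this example, and the regular expression only covers the common `[family][generation][options].[size]` shape; unusual names (for example high-memory types with a hyphen in the family) are rejected rather than guessed at.

```python
import re
from typing import List, NamedTuple

# Suffix meanings copied from the table above.
SUFFIX_MEANINGS = {
    "g": "AWS Graviton (ARM64)",
    "i": "Intel",
    "a": "AMD",
    "n": "higher network bandwidth",
    "d": "local NVMe SSD instance storage",
    "b": "block storage optimised",
    "z": "high-frequency processor",
    "e": "extra storage or memory capacity",
}


class InstanceType(NamedTuple):
    family: str
    generation: int
    options: str
    size: str


# letters, then digits, then optional suffix letters, a dot, then the size
_PATTERN = re.compile(r"^([a-z]+)(\d+)([a-z]*)\.([a-z0-9-]+)$")


def parse_instance_type(name: str) -> InstanceType:
    """Split a name such as 't4g.medium' into family, generation, options, size."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised instance type name: {name!r}")
    family, generation, options, size = match.groups()
    return InstanceType(family, int(generation), options, size)


def describe_options(options: str) -> List[str]:
    """Translate each suffix letter using the table; unknown letters are flagged."""
    return [SUFFIX_MEANINGS.get(ch, f"unknown suffix {ch!r}") for ch in options]


parsed = parse_instance_type("m6id.large")
print(parsed, describe_options(parsed.options))
```

Suffix letters can stack, which is why `describe_options` walks the string one character at a time: `m6id` is both Intel and equipped with local NVMe.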
Sizes
Each step up in size roughly doubles both vCPUs and RAM. The table below uses m7i as an example. Actual numbers vary slightly by family and generation.
| Size | vCPUs (m7i) | RAM (m7i) |
|---|---|---|
| large | 2 | 8 GB |
| xlarge | 4 | 16 GB |
| 2xlarge | 8 | 32 GB |
| 4xlarge | 16 | 64 GB |
| 8xlarge | 32 | 128 GB |
| 12xlarge | 48 | 192 GB |
| 16xlarge | 64 | 256 GB |
| metal-24xl | 96 | 384 GB (bare metal, no hypervisor) |
Some families also offer bare metal sizes (.metal), which give you direct access to the physical host with no hypervisor. This is mainly used for workloads that need hardware virtualisation themselves, or that require specific licensing tied to physical cores.
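T-family has additional small sizes: nano, micro, small, and medium. Intel and AMD M-family types such as m7i start at large, while Graviton variants such as m7g also offer medium. When you see a tutorial using t2.micro, that is an older generation. Use t3.micro or t4g.micro instead. Both are cheaper and faster than t2. Free Tier eligibility depends on your account and Region, so check the current terms before relying on it.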
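The doubling pattern for the virtualised sizes can be checked with a few lines of arithmetic. This sketch assumes the m7i baseline from the table (large = 2 vCPUs, 8 GB) and treats every size as a multiple of "large"; the function names are illustrative only, and bare metal sizes are deliberately not handled.

```python
import re

# Baseline taken from the m7i table above: the "large" size.
BASE_VCPUS = 2
BASE_RAM_GB = 8


def size_multiplier(size: str) -> int:
    """Return how many 'large' units a size name represents."""
    if size == "large":
        return 1
    if size == "xlarge":
        return 2
    match = re.fullmatch(r"(\d+)xlarge", size)
    if match is None:
        raise ValueError(f"unsupported size: {size!r}")
    # An Nxlarge is N times an xlarge, and an xlarge is two larges.
    return 2 * int(match.group(1))


def m7i_resources(size: str) -> tuple:
    """Return (vCPUs, RAM in GB) for an m7i size, assuming a constant ratio."""
    units = size_multiplier(size)
    return BASE_VCPUS * units, BASE_RAM_GB * units


for size in ["large", "xlarge", "2xlarge", "12xlarge"]:
    print(size, m7i_resources(size))
```

Other families keep the same multipliers but start from a different baseline, which is why the CPU-to-RAM ratio is a property of the family rather than the size.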
Instance families at a glance
| Family | Optimised for | Best use cases | Common mistake | Safe starting point |
|---|---|---|---|---|
| T (t3, t4g) | Burstable CPU | Dev/test, low-traffic web apps, staging environments | Using T-family for sustained CPU load. It throttles when credits run out. | t4g.small or t3.medium |
| M (m7i, m7g, m6i) | Balanced CPU + RAM | App servers, mid-traffic APIs, background workers | Paying for M when workload is light enough for T | m7i.large or m7g.large |
| C (c7g, c6i, c6a) | Compute (CPU) | Transcoding, HPC, ML inference, busy web tiers | Using C-family when the bottleneck is actually memory, not CPU | c7g.large or c6i.large |
| R (r7i, r6i, r6g) | Memory | Redis/Memcached, large databases, Spark, real-time analytics | Using M-family for a memory-heavy workload that keeps OOMing | r7i.large or r6g.large |
| I (i4i, i3en) | Local NVMe SSD storage | NoSQL databases (Cassandra, MongoDB), OLTP at high IOPS | Treating instance store as durable. It disappears on stop. | i4i.large (only if you need local NVMe) |
| D (d3en) | Dense HDD storage | Hadoop, data lakes, high sequential read/write | Choosing D when you need IOPS, not throughput. Use I-family for IOPS. | Specialist. Evaluate only when workload is clear. |
| G (g5, g4dn) | GPU (NVIDIA) | ML inference, game streaming, video processing | Paying for GPU when a CPU-based instance can do the inference | g4dn.xlarge for inference; g5.xlarge for graphics |
| P (p4d, p5) | GPU (NVIDIA, high-end) | Large-scale ML training, deep learning | Starting with P-family before validating the model architecture | Specialist. Use SageMaker or spot P instances to prototype first. |
| Inf (inf2) | AWS Inferentia chip | High-throughput ML inference at lower cost than GPU | Ignoring Inf2 for production inference workloads where cost matters | inf2.xlarge once model is validated |
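T-family instances work like a prepaid phone: they accumulate CPU credits when idle, and spend them when the CPU spikes. If your workload runs at high CPU for a long time, the credits run out. In standard mode the instance is then throttled to its baseline, which is only a fraction of a full vCPU (20% per vCPU on a t3.medium). In unlimited mode, the default for T3 and T4g, it keeps bursting and you are billed for the surplus credits instead. Either way, a T-family instance under sustained load ends up slower or more expensive than its size suggests. If you see this happening, switch to M-family.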
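To get a feel for how quickly credits drain, here is a back-of-the-envelope sketch. It relies on the definition that one CPU credit equals one vCPU running at 100% for one minute. The t3.medium figures used in the example (2 vCPUs, 24 credits earned per hour, a maximum balance of 576) are assumptions for illustration; check the AWS credit table for your instance type before using them.

```python
def hours_until_credits_run_out(balance, earned_per_hour, vcpus, utilisation_pct):
    """Estimate hours of sustained load before the credit balance reaches zero.

    A vCPU at 100% spends 60 credits per hour (one per minute).
    Returns None when the load does not exceed the rate at which credits are earned.
    """
    spent_per_hour = utilisation_pct * vcpus * 60 / 100
    net_drain = spent_per_hour - earned_per_hour
    if net_drain <= 0:
        return None  # at or below baseline: the balance never shrinks
    return balance / net_drain


# Assumed t3.medium figures: 2 vCPUs, 24 credits/hour, full balance of 576.
print(hours_until_credits_run_out(576, 24, 2, 80))
print(hours_until_credits_run_out(576, 24, 2, 20))
```

With these assumed figures, 20% utilisation on both vCPUs spends exactly what the instance earns, which is what "baseline" means in practice.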
How to choose the right EC2 instance type
Use this table to map your workload to a starting family. After launch, watch the metrics listed and resize within a week if needed.
| Workload | Starting family | Metric to watch | Common mistake |
|---|---|---|---|
| Small website, low-traffic API, dev environment | T-family (t4g.small or t3.medium) | CPU credit balance. If it runs out under normal load, upgrade. | Leaving it on T when traffic grows steadily |
| Busy app server or mid-tier API (sustained load) | M-family (m7i.large or m7g.large) | CPU utilisation. Stay below 70% under peak load. | Using T-family and wondering why response times spike |
| Background workers, CI/CD runners | M or C family; spot instances for cost | Job throughput and queue depth | Using on-demand for interruptible batch work. Spot instances cost up to 90% less. |
| Redis, Memcached, or memory-heavy service | R-family (r7i.large or r6g.large) | Memory utilisation. Aim to keep under 75% used. | Hitting OOM errors on M-family when R would fix it immediately |
| NoSQL or storage-heavy OLTP database | I-family (i4i) if you need local NVMe; M or R + EBS otherwise | Disk IOPS and read/write latency | Using I-family without replicating data. Instance store is ephemeral. |
| GPU / ML inference | G-family (g4dn.xlarge) or Inf2 | GPU utilisation, inference latency | Using P-family (training hardware) for inference. G or Inf2 is cheaper. |
| ML model training | P-family (p4d, p5) or Trn1 | Training loss convergence, GPU utilisation | Training on a single GPU when distributed training is faster per dollar |
| Interruptible batch jobs | Any family using EC2 Spot Instances | Interruption rate and job retry logic | Not handling spot interruptions. Always checkpoint batch work. |
| Legacy workload that needs a persistent VM | M-family; match the on-prem vCPU/RAM ratio | CPU, memory, and disk I/O. Use CloudWatch agent for memory. | Lifting-and-shifting without right-sizing. On-prem machines are usually over-provisioned. |
Worked example 1: small Django or Node.js app
You are deploying a Django API with a PostgreSQL database. Estimated traffic: a few hundred requests per minute at peak. Start with a t4g.medium (2 vCPU, 4 GB RAM, Graviton), which costs roughly $0.034/hour on demand in us-east-1 (check current pricing for your Region). Install the CloudWatch agent and watch CPU credits and memory for a week. If credits stay healthy and memory stays below 3 GB, you are fine. If CPU credits deplete under normal load, move to m7g.medium for consistent performance.
Worked example 2: Redis cache
You are running Redis as a session cache. Your dataset is 10 GB. An M-family instance with 8 GB RAM will constantly page and cause latency spikes. Start with r7i.large (2 vCPU, 16 GB RAM). It gives Redis room to operate with a safe memory buffer. Watch memory utilisation. If you grow to a 12 GB dataset, move to r7i.xlarge (4 vCPU, 32 GB RAM). If you are running Redis in production long-term, also consider Amazon ElastiCache, which handles upgrades and failover for you.
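The sizing logic in this example comes down to one comparison: the dataset should stay under a chosen fraction of RAM. The sketch below uses the 75% ceiling suggested in the decision table and the r7i sizes mentioned above; the 2xlarge entry follows the usual doubling pattern, and the function name is hypothetical.

```python
# RAM in GB for a few r7i sizes, smallest first.
R7I_RAM_GB = [("r7i.large", 16), ("r7i.xlarge", 32), ("r7i.2xlarge", 64)]


def smallest_r7i_for(dataset_gb, max_used_fraction=0.75):
    """Return the smallest listed size that keeps the dataset under the ceiling."""
    for name, ram_gb in R7I_RAM_GB:
        if dataset_gb < ram_gb * max_used_fraction:
            return name
    return None  # larger than anything in this short list


print(smallest_r7i_for(10))  # 10 GB is under 75% of 16 GB
print(smallest_r7i_for(12))  # 12 GB is exactly 75% of 16 GB, so it no longer fits
```

This only counts the dataset itself. Redis needs extra memory for replication buffers and for forking during persistence, so treat the 75% ceiling as a floor on headroom rather than a precise target.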
Worked example 3: CI/CD build runners
Build jobs are CPU-intensive but run for 5 to 15 minutes and are safe to retry. Use c7g.xlarge (4 vCPU, 8 GB RAM) or c6i.xlarge spot instances. Enable Auto Scaling Groups to spin up runners when the queue grows and terminate them when it drains. At spot pricing this can cut your CI costs by 60 to 80% compared to always-on on-demand instances.
If you cannot decide: use t4g.medium for light and variable traffic, or m7i.large for steady workloads. Both are current-generation, sensible defaults. You can stop the instance, change the type, and start it again in about two minutes. There is no penalty for starting with the wrong size.
Graviton vs Intel vs AMD
AWS Graviton is the company’s own ARM64 processor. It appears as the g suffix in instance names: t4g, m7g, c7g, r6g. Here is how to think about the choice:
| Situation | Recommendation |
|---|---|
| New Linux service, open-source stack (Python, Node.js, Java, Go, Ruby) | Default to Graviton. It offers better price-performance for most workloads. |
| Docker containers | Use multi-arch images or build ARM64-specific. Most public images now have ARM64 variants. |
| Windows Server workloads | Graviton does not support Windows. Use Intel or AMD. |
| Proprietary software or pre-compiled binaries | Check vendor support for ARM64 before switching. If unsure, stay on Intel x86. |
| AMD (a-suffix, e.g. m6a, c6a) | Good middle ground: x86 compatible, usually 5 to 10% cheaper than Intel equivalent. Worth considering when Graviton is not an option. |
To check compatibility before committing, run your app in a Docker container on an ARM64 machine locally. Apple Silicon Macs can do this natively. Alternatively, spin up a t4g.small spot instance for a few cents and test there. If it works, Graviton is a strong choice, and many teams that test it end up switching.
EBS vs instance store
This distinction matters when choosing between instance families. Think of it like the difference between a hard drive and a whiteboard:
- EBS is the hard drive. Durable, persistent, and survives the instance being stopped or replaced. It is the right place for your actual data.
- Instance store is the whiteboard. Fast and convenient to write on, but wiped clean the moment the instance stops. Never your only copy of anything important.
| | EBS (Elastic Block Store) | Instance store (local NVMe) |
|---|---|---|
| Persistence | Persists across reboots and stops. On termination, the root volume is deleted by default, while additional attached volumes are kept by default. | Wiped completely on stop or termination, not on reboot |
| Performance | Good IOPS on gp3/io2; latency depends on network | Very high IOPS and low latency, physically attached |
| Cost | Billed separately per GB provisioned | Included in instance price |
| Availability | Attachable to every instance family | I-family, D-family, and some M and C types with the d-suffix (e.g., m6id) |
When instance store is useful: temp directories, shuffle space for sort-heavy queries, write-ahead log staging, or any data you can reconstruct. Its speed advantage is real for these cases.
When instance store is dangerous: any data you cannot afford to lose. A stopped instance, a host failure, or a spot interruption wipes it permanently. Always replicate data from instance store to EBS or S3 if it matters.
Choosing an I or D family instance because “it comes with fast storage” and then storing your only copy of the database there is a common and expensive mistake. Instance store is a performance tool, not a durability tool. Stop the instance and the data is gone. Permanently.
When to use this page
A few situations where this page is most relevant:
- You are launching your first EC2 instance and do not know what to pick. Use the decision table above, start with T or M family, and monitor.
- You are seeing performance problems: slow responses, OOM errors, CPU throttling. Check CloudWatch metrics and compare against the family characteristics.
- You are resizing after monitoring and need to understand whether to move up a size or switch families entirely. If CPU is the bottleneck, size up within the same family or move to C. If memory is the bottleneck, move to R.
- You are comparing EC2 against Lambda or containers and want to understand what EC2 is uniquely good at. See the section below.
- You are reviewing an existing setup for cost optimisation. Overprovisioned instances are common. Check Rightsizing EC2 Instances for a structured approach.
If you are genuinely unsure and need to pick something right now: start with t4g.medium if your workload is light and variable, or m7i.large if it is steady. Both are solid, current-generation defaults. You can always change it later. Stopping and restarting takes about two minutes.
Common mistakes
Over-provisioning from the start. Launching a 4xlarge “just in case” wastes significant money. Start small, monitor with CloudWatch, and scale up when you have real data. EC2 resizing is fast and reversible.
Using T-family for sustained CPU workloads. If your process runs at 80%+ CPU continuously, T-family will exhaust its credits and throttle hard. Switch to M or C for any sustained compute.
Ignoring Graviton without checking. Most teams default to Intel out of habit. Graviton (ARM64) is worth testing for any new Linux service. It is not experimental. It is what AWS runs much of its own infrastructure on.
Treating instance store as durable storage. Any local disk on I, D, or d-suffix instances is gone when the instance stops. Period. Always replicate important data elsewhere.
Sticking to older-generation defaults. Instances like m4, c4, or t2 generally deliver worse price-performance than their current equivalents. If you see a tutorial using m4.large, use m7i.large or m7g.large instead.
Choosing by vCPU count alone. A c7g.xlarge and an r7i.xlarge both have 4 vCPUs, but very different RAM. A process that needs 24 GB of RAM will OOM on the c7g.xlarge (8 GB) before it ever loads. Always check the CPU-to-RAM ratio for your family.
Not measuring memory separately. CloudWatch does not collect memory metrics by default. You need to install the CloudWatch agent. Many teams only see CPU and miss a memory bottleneck entirely.
Copying a size from a tutorial without workload context. Tutorials pick whatever is convenient for demonstration, often t2.micro or t3.small. Your production workload is different. Match the size to your actual traffic and dataset, not to a tutorial’s screenshots.
EC2 vs other AWS compute options
EC2 is not always the right tool. Here is a quick orientation:
| Compute option | When to use it instead of EC2 | When EC2 is still right |
|---|---|---|
| Lambda | Short-lived event-driven functions (under 15 minutes), unpredictable traffic, zero-server management | Long-running processes, persistent connections (WebSockets), workloads that need persistent local state |
| ECS / EKS (containers) | Containerised apps that need orchestration, zero-downtime deploys, horizontal scaling per service | Workloads that are hard to containerise, require specific OS-level configuration, or need bare-metal access |
| App Runner | Simple containerised web services where you want zero infrastructure management | Anything that needs more control than App Runner exposes |
EC2 gives you the most control and the widest range of configurations. That is its strength and its complexity. If your workload fits a managed service like Lambda or App Runner, those services are often simpler to operate. See Choosing Between EC2, Lambda, and Containers for a more detailed breakdown.
If you are scaling EC2 horizontally in response to traffic, you probably want Auto Scaling Groups with a Launch Template. These let you define your instance type and configuration once, and scale out and in automatically based on CloudWatch alarms or scheduled rules.
Summary
- Instance type names encode family (m), generation (7), processor type (i=Intel, a=AMD, g=Graviton), and size (large, xlarge, 2xlarge).
- T-family is burstable: good for variable workloads, not sustained CPU load. M-family is the steady general-purpose default.
- C-family for CPU-bound work. R-family for memory-bound work. Only use I, D, G, or P families when the workload clearly needs them.
- Graviton (ARM64) is a strong default for new Linux services. Check compatibility for existing software before switching.
- EBS is persistent network-attached storage. Instance store is fast local NVMe that disappears on stop. Do not use it as your only data store.
- Start with a reasonable size, install the CloudWatch agent to monitor both CPU and memory, and resize after a week of real data. Changing instance type takes about two minutes.
- EC2 is right when you need persistent VMs, fine-grained control, or specific hardware. Lambda and containers are often simpler for stateless services.
Frequently asked questions
What is the difference between an EC2 family and an EC2 size?
The family (T, M, C, R, etc.) determines what the instance is optimised for: burstable CPU, balanced compute, compute-heavy, or memory-heavy. The size (large, xlarge, 2xlarge) determines how much of that resource you get. Same family, bigger size means more vCPUs and RAM in roughly the same ratio.
Is t3 or t4g enough for a small web app?
Yes, for most low-to-medium traffic web apps. A t3.small or t3.medium handles a typical Node.js, Django, or WordPress site comfortably at modest traffic. t4g (Graviton) costs less than the equivalent t3 size and usually performs as well or better, provided your software runs on ARM, which most Linux/Docker workloads do. If CPU usage stays above 40% for extended periods, upgrade to m7i or m7g instead.
Should I choose Graviton (ARM) or Intel/AMD?
Graviton is a safe default for most new Linux workloads. Java, Python, Node.js, Go, Ruby, and Rust all run natively on ARM64. Docker images need to be multi-arch or ARM64-specific. If you are running a proprietary binary, a Windows Server instance, or software with known ARM compatibility issues, stick with x86 (Intel or AMD). The price-performance advantage of Graviton is real and worth checking on every new service.
What is the difference between EBS and instance store?
EBS volumes are persistent network-attached block storage. They survive reboots and stops, and they can be detached and moved to another instance. Instance store is local NVMe SSD physically attached to the host. It is much faster, but it is completely wiped when the instance stops or terminates. Use instance store for temp files, buffers, and caches only. Your real data belongs on EBS or S3.
Can I change the EC2 instance type later?
Yes. Stop the instance, change the instance type in the AWS console or CLI, then start it again. It takes about two minutes. EBS-backed instances retain their storage. Instance store data is lost on stop regardless of the type change. This makes EC2 easy to right-size: start small and adjust after you see real usage data.
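For anyone scripting this rather than clicking through the console, the stop, change, start sequence might look like the sketch below. It assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`) with credentials and Region already configured; the function name is made up for this example. Passing the client in keeps the function easy to try against a stub first.

```python
def resize_instance(ec2, instance_id: str, new_type: str) -> None:
    """Stop an EBS-backed instance, change its type, and start it again.

    `ec2` is assumed to be a boto3 EC2 client.
    """
    ec2.stop_instances(InstanceIds=[instance_id])
    # The instance type can only be changed once the instance is fully stopped.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
```

A call would look like `resize_instance(boto3.client("ec2"), "i-0123456789abcdef0", "m7i.large")`, where the instance ID is a placeholder. Moving between x86 and Graviton types this way only works if the AMI matches the new architecture, so keep such resizes within one architecture.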
How do I know if I picked the wrong instance size?
Watch two metrics in CloudWatch: CPU utilisation and memory (requires the CloudWatch agent). If CPU consistently runs above 80%, move up a size or switch to a compute-optimised family. If memory pressure is high but CPU is low, consider a memory-optimised family like R. If both are consistently low, move down a size. One week of production metrics is usually enough to make a confident decision.
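The decision rules in this answer can be sketched as a small function. The 80% CPU threshold comes from the text; the 75% memory threshold and the 30% "low" threshold are assumptions added for illustration, so tune them to your own workload.

```python
def resize_advice(cpu_pct: float, mem_pct: float,
                  high_cpu: float = 80.0,
                  high_mem: float = 75.0,   # assumed threshold
                  low: float = 30.0) -> str:  # assumed threshold
    """Turn a week of average CPU and memory utilisation into a suggestion."""
    if cpu_pct > high_cpu:
        return "move up a size or switch to a compute-optimised family"
    if mem_pct > high_mem:
        # CPU is not the bottleneck here, memory is.
        return "consider a memory-optimised family like R"
    if cpu_pct < low and mem_pct < low:
        return "move down a size"
    return "keep the current size"


print(resize_advice(cpu_pct=20, mem_pct=85))
print(resize_advice(cpu_pct=10, mem_pct=15))
```

The memory figure has to come from the CloudWatch agent, since the default EC2 metrics do not include it.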