Cloud Networking Interview Questions: The Fundamentals Still Matter

Networking is the most commonly skipped topic in cloud interview preparation. Engineers who have spent years working with managed services — RDS, Lambda, S3, Cloud Run — can often get by without deeply understanding the network layer beneath. Then an interviewer asks them to explain the difference between a security group and a NACL, and the answer falls apart.

This page covers the networking questions that appear in cloud engineering, DevOps, and SRE interviews — what is actually being asked, and how to structure a clear answer even when your networking knowledge has gaps.

Why Networking Gets Tested in Cloud Interviews

The assumption that networking only matters for network engineers is incorrect. In cloud environments, almost every problem eventually touches networking.

Cloud engineers who cannot reason about the network layer spend hours debugging problems that a solid mental model would resolve in minutes. Interviewers know this, which is why networking questions appear even in application-focused cloud roles.

What Different Roles Are Expected to Know

Cloud engineer — Expected to configure and troubleshoot VPCs, subnets, security groups, load balancers, DNS, and VPN connectivity. Should understand CIDR notation and be able to design a network for a new workload.

DevOps engineer: Expected to understand enough networking to configure CI/CD infrastructure, manage service connectivity, and debug deployment-related network issues. Deep knowledge of BGP or other routing protocols is not expected.

SRE — Expected to understand networking deeply enough to diagnose latency, packet loss, and connectivity failures. DNS, TCP behaviour, and load balancer health check mechanics are all in scope. May be expected to understand cloud interconnect and direct connectivity options.

Application/backend engineer moving to cloud — Expected to understand the basics: what a VPC is, what a security group does, how a service gets internet access. Not expected to design complex network topologies.

Core Networking Questions

“What is a VPC and why do you need one?” Testing: foundational cloud networking knowledge. A Virtual Private Cloud is a logically isolated network in a cloud environment. You define the IP address range, create subnets, configure routing, and control access. It is the container for most cloud resources. The interviewer wants to hear that it provides isolation — resources in one VPC cannot communicate with resources in another VPC by default.

“What is CIDR notation and how do you use it to plan a network?” Testing: practical IP addressing. CIDR (Classless Inter-Domain Routing) notation expresses an IP address range as a base address plus a prefix length. The prefix length is the number of fixed leading bits, so 10.0.0.0/16 leaves 16 host bits (2^16 = 65,536 addresses) and a /24 leaves 8 (256 addresses). When planning a VPC, you choose a CIDR block large enough to accommodate current and future subnets. Key insight: once a VPC CIDR is set, changing it is painful, so get it right at the start. Strong candidates also mention leaving headroom for additional subnets.
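The arithmetic in this answer can be checked with Python's standard `ipaddress` module. This is a minimal sketch: the 10.0.0.0/16 range is the example from the answer, and the figure of five reserved addresses per subnet is AWS-specific behaviour that the answer itself does not cover.

```python
import ipaddress

# Example VPC range from the answer above.
vpc = ipaddress.ip_network("10.0.0.0/16")
print(vpc.num_addresses)  # 65536

# Carve the VPC into /24 subnets of 256 addresses each.
subnets = list(vpc.subnets(new_prefix=24))
print(len(subnets))  # 256
print(subnets[0])    # 10.0.0.0/24
print(subnets[1])    # 10.0.1.0/24

# Assumption: AWS reserves 5 addresses in every subnet, so a /24
# offers 251 assignable addresses rather than 256.
usable_on_aws = subnets[0].num_addresses - 5
print(usable_on_aws)  # 251
```

A workload that uses only a handful of these /24 blocks leaves the rest of the /16 free, which is the headroom the answer refers to.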

“What is the difference between a public subnet and a private subnet?” Testing: basic network architecture. A public subnet has a route to an internet gateway, so resources in it that have a public IP address can reach the internet and be reached from it directly (given appropriate security group rules). A private subnet does not have a direct route to the internet. Resources in private subnets that need outbound internet access (for package updates, API calls, etc.) go through a NAT gateway. Databases and internal services belong in private subnets.

“What is a NAT gateway and when do you need one?” Testing: outbound connectivity understanding. NAT (Network Address Translation) gateways allow resources in private subnets to initiate outbound connections to the internet while remaining unreachable from the internet directly. Common use cases: EC2 instances in private subnets that need to download updates, Lambda functions in a VPC that need to call external APIs. NAT gateways are not free — they have an hourly cost plus data transfer charges, which matters for cost discussions.

“What is an internet gateway?” Testing: inbound/outbound connectivity. An internet gateway is attached to a VPC and enables communication between the VPC and the internet. Without one, nothing in the VPC can reach the internet. The route table of a public subnet includes a route to the internet gateway for 0.0.0.0/0 traffic.
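One way to make the public/private distinction concrete is to look only at where a subnet's default route points. The sketch below uses plain dictionaries as stand-in route tables. The subnet names and the `igw-example` and `nat-example` targets are hypothetical, and real route tables support more target types than this handles.

```python
# Simplified route tables: destination CIDR -> target.
# Target names are hypothetical and only mimic the igw-/nat- prefixes.
route_tables = {
    "public-a": {"10.0.0.0/16": "local", "0.0.0.0/0": "igw-example"},
    "private-a": {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-example"},
    "isolated-a": {"10.0.0.0/16": "local"},
}


def classify_subnet(routes: dict) -> str:
    """Classify a subnet by the target of its default (0.0.0.0/0) route."""
    default_target = routes.get("0.0.0.0/0")
    if default_target is None:
        return "isolated"  # no path to the internet at all
    if default_target.startswith("igw-"):
        return "public"    # default route goes straight to the internet gateway
    return "private"       # default route goes elsewhere, e.g. a NAT gateway


for name, routes in route_tables.items():
    print(name, classify_subnet(routes))
```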

“What is the difference between a security group and a network ACL?” Testing: a very common cloud interview question. Security groups are stateful firewalls that operate at the instance/resource level. If you allow inbound traffic on port 80, the return traffic is automatically allowed regardless of outbound rules. NACLs (Network Access Control Lists) operate at the subnet level and are stateless: you must explicitly allow both inbound and outbound traffic, including the ephemeral ports used by return traffic. Security groups support only allow rules, whereas NACLs support both allow and deny rules and evaluate them in rule-number order. For inbound traffic the NACL is evaluated first, at the subnet boundary, and the security group second, at the resource; for outbound traffic the order is reversed. Most engineers use security groups for access control and only use NACLs for additional subnet-level restrictions.
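The stateful versus stateless distinction is easier to explain with a toy model. The sketch below is not how any cloud provider implements filtering; the class names, addresses, and port numbers are assumptions chosen for illustration. It only shows that a stateful filter remembers an allowed inbound flow and lets the reply out, while a stateless filter checks each direction against its own rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    src_port: int
    dst: str
    dst_port: int


def reply_to(p: Packet) -> Packet:
    """Build the return packet for a flow by swapping source and destination."""
    return Packet(p.dst, p.dst_port, p.src, p.src_port)


class StatefulFilter:
    """Security-group-like: tracks allowed inbound flows."""

    def __init__(self, inbound_ports):
        self.inbound_ports = inbound_ports
        self.tracked = set()

    def allow_inbound(self, p: Packet) -> bool:
        if p.dst_port in self.inbound_ports:
            self.tracked.add(p)  # remember the flow
            return True
        return False

    def allow_outbound(self, p: Packet) -> bool:
        # A reply to a tracked flow passes even with no outbound rules.
        return reply_to(p) in self.tracked


class StatelessFilter:
    """NACL-like: each direction is checked against its own rules only."""

    def __init__(self, inbound_ports, outbound_ports):
        self.inbound_ports = inbound_ports
        self.outbound_ports = outbound_ports

    def allow_inbound(self, p: Packet) -> bool:
        return p.dst_port in self.inbound_ports

    def allow_outbound(self, p: Packet) -> bool:
        return p.dst_port in self.outbound_ports


request = Packet("203.0.113.10", 50000, "10.0.1.5", 80)
response = reply_to(request)

sg = StatefulFilter(inbound_ports=[80])
nacl_no_egress = StatelessFilter(inbound_ports=[80], outbound_ports=[])
nacl_ephemeral = StatelessFilter(inbound_ports=[80], outbound_ports=range(1024, 65536))

print(sg.allow_inbound(request), sg.allow_outbound(response))          # True True
print(nacl_no_egress.allow_inbound(request),
      nacl_no_egress.allow_outbound(response))                         # True False
print(nacl_ephemeral.allow_outbound(response))                         # True
```

The middle case is the classic NACL mistake: the request is admitted, but the response to the client's ephemeral port is dropped on the way out.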

“What is VPC peering and what are its limitations?” Testing: multi-VPC architecture knowledge. VPC peering connects two VPCs so that resources in each can communicate using private IP addresses. Limitations: peering is not transitive — if A peers with B and B peers with C, A cannot reach C through B. For large mesh networking between many VPCs, a Transit Gateway is more appropriate. Also, peered VPCs cannot have overlapping CIDR ranges.
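Both limitations can be sketched in a few lines. The overlap check uses the standard `ipaddress` module; the VPC names A, B, and C mirror the example in the answer, and the `can_peer` and `can_reach` helpers are hypothetical names, not part of any cloud SDK.

```python
import ipaddress


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Peering requires non-overlapping CIDR ranges."""
    a = ipaddress.ip_network(cidr_a)
    b = ipaddress.ip_network(cidr_b)
    return not a.overlaps(b)


# A peers with B, and B peers with C.
peerings = {frozenset({"A", "B"}), frozenset({"B", "C"})}


def can_reach(src: str, dst: str) -> bool:
    """Peering is not transitive: only a direct peering carries traffic."""
    return frozenset({src, dst}) in peerings


print(can_peer("10.0.0.0/16", "10.1.0.0/16"))    # True
print(can_peer("10.0.0.0/16", "10.0.128.0/17"))  # False
print(can_reach("A", "B"))                       # True
print(can_reach("A", "C"))                       # False
```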

“What is a Transit Gateway?” Testing: large-scale networking knowledge. A Transit Gateway is a central hub that connects multiple VPCs and on-premises networks through a single gateway. It solves the non-transitive peering problem. It also simplifies routing considerably when you have many VPCs. The trade-off is cost — Transit Gateways have an attachment fee per VPC plus data transfer charges.
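The routing simplification is mostly a counting argument: a full mesh of peerings grows quadratically with the number of VPCs, while a hub needs one attachment per VPC. A small sketch, with function names invented for illustration:

```python
def full_mesh_peerings(vpc_count: int) -> int:
    """Peering connections needed so every VPC can reach every other VPC."""
    return vpc_count * (vpc_count - 1) // 2


def hub_attachments(vpc_count: int) -> int:
    """Attachments needed when every VPC connects to one central hub."""
    return vpc_count


for n in (3, 10, 50):
    print(n, full_mesh_peerings(n), hub_attachments(n))
# 3 3 3
# 10 45 10
# 50 1225 50
```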

“What is the OSI model and which layers are most relevant to cloud networking?” Testing: fundamentals depth. The 7-layer OSI model: Physical, Data Link, Network, Transport, Session, Presentation, Application. Most cloud networking operates at layers 3 (Network — IP routing, VPCs, subnets) and 4 (Transport — TCP/UDP, ports, security groups). Load balancers operate at layer 4 (TCP/UDP load balancing) or layer 7 (HTTP/HTTPS routing). DNS resolution happens at the application layer. You do not need to recite all seven layers from memory, but understanding what happens at layers 3, 4, and 7 is genuinely useful.

DNS Questions

“How does DNS resolution work?” Testing: practical troubleshooting knowledge. A DNS query goes to a recursive resolver (often provided by the cloud or your organisation), which checks its cache and, on a miss, works down from the root and TLD name servers to the authoritative name servers for the domain. In cloud environments, this is complicated by split-horizon DNS: the same hostname might resolve to a private IP inside the VPC and a public IP externally. This is a common source of connection bugs.
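Split-horizon behaviour can be illustrated with a toy lookup that picks a record set based on where the client sits. Everything here is assumed for the example: the hostname, the addresses (documentation ranges), the VPC range, and the `resolve` helper. A real resolver selects views by configuration, not by a single CIDR check.

```python
import ipaddress

# Hypothetical VPC range and two views of the same hostname.
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")
PRIVATE_VIEW = {"api.example.com": "10.0.2.15"}
PUBLIC_VIEW = {"api.example.com": "203.0.113.20"}


def resolve(hostname: str, client_ip: str):
    """Return the answer a client would see, depending on its location."""
    inside_vpc = ipaddress.ip_address(client_ip) in VPC_CIDR
    view = PRIVATE_VIEW if inside_vpc else PUBLIC_VIEW
    return view.get(hostname)


print(resolve("api.example.com", "10.0.1.7"))      # 10.0.2.15
print(resolve("api.example.com", "198.51.100.9"))  # 203.0.113.20
```

The same name returning two different answers is why "it resolves fine from my laptop" says little about what an instance inside the VPC sees.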

“What is the difference between an A record, CNAME, and ALIAS record?” Testing: DNS record knowledge. An A record maps a hostname to an IP address. A CNAME maps a hostname to another hostname. An ALIAS record (AWS Route 53 specific) maps a hostname to an AWS resource like a load balancer or CloudFront distribution — it functions like a CNAME but can be used at the zone apex (root domain), which CNAMEs cannot.
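To illustrate the A versus CNAME distinction, here is a toy resolver over an in-memory record table. The hostnames and address are made up, and ALIAS records are left out of the sketch because they are resolved on the DNS provider's side.

```python
# Hypothetical record table: name -> (record type, value).
RECORDS = {
    "www.example.com": ("CNAME", "lb.example.net"),
    "lb.example.net": ("A", "203.0.113.20"),
}


def resolve_chain(name: str, max_hops: int = 8) -> str:
    """Follow CNAME records until an A record is found."""
    for _ in range(max_hops):
        record_type, value = RECORDS[name]
        if record_type == "A":
            return value   # an A record ends the chain with an IP address
        name = value       # a CNAME points at another hostname
    raise RuntimeError("CNAME chain too long")


print(resolve_chain("www.example.com"))  # 203.0.113.20
```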

“What is TTL and when would you change it before a migration?” Testing: operational thinking. TTL (Time To Live) is the cache duration for a DNS record. Before a migration where you plan to change an IP address or endpoint, lower the TTL to 60 seconds or less, and do so at least one full period of the old TTL in advance so that resolvers have already picked up the shorter value. The old cached record then expires quickly after you make the change, reducing the propagation window. Forgetting to lower TTL before a migration is a common mistake that leads to users hitting the old endpoint for hours.
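The effect of TTL on a cutover can be simulated with a minimal caching resolver. This is a sketch under stated assumptions: time is a plain integer in seconds, the hostname and addresses are invented, and `CachingResolver` and `ip_seen_after_cutover` are hypothetical names.

```python
class CachingResolver:
    """Caches each answer until its TTL expires."""

    def __init__(self, authoritative):
        self.authoritative = authoritative  # hostname -> (ip, ttl_seconds)
        self.cache = {}                     # hostname -> (ip, expires_at)

    def lookup(self, hostname, now):
        cached = self.cache.get(hostname)
        if cached and now < cached[1]:
            return cached[0]                # still fresh: serve the cached answer
        ip, ttl = self.authoritative[hostname]
        self.cache[hostname] = (ip, now + ttl)
        return ip


def ip_seen_after_cutover(ttl, check_at):
    """Cache the old record at t=0, change it right after, look up again later."""
    zone = {"app.example.com": ("198.51.100.1", ttl)}
    resolver = CachingResolver(zone)
    resolver.lookup("app.example.com", now=0)
    zone["app.example.com"] = ("198.51.100.2", ttl)  # the cutover
    return resolver.lookup("app.example.com", now=check_at)


print(ip_seen_after_cutover(ttl=3600, check_at=300))  # 198.51.100.1 (stale)
print(ip_seen_after_cutover(ttl=60, check_at=300))    # 198.51.100.2 (updated)
```

Five minutes after the change, the client behind the one-hour TTL is still sent to the old endpoint, while the client behind the 60-second TTL has already moved.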

Load Balancing Questions

“What is the difference between a layer 4 and layer 7 load balancer?” Testing: load balancer selection knowledge. A layer 4 (transport layer) load balancer routes traffic based on IP and TCP/UDP port. It is faster and has lower overhead because it does not inspect application-layer content. A layer 7 (application layer) load balancer can inspect HTTP headers, paths, and host names to make routing decisions. Use L7 when you need path-based routing (/api/* to one target group, /static/* to another), host-based routing, or SSL termination. On AWS: Network Load Balancer (L4), Application Load Balancer (L7). On GCP: TCP Proxy (L4), HTTP(S) Load Balancer (L7).
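The difference in what each layer can see is easy to sketch. Below, the L7 router matches on the request path using the /api/* and /static/* patterns from the answer, while the L4 router can only hash connection details. The target group names, target addresses, and the CRC32 hash are assumptions for illustration; real load balancers use their own flow-hashing schemes.

```python
import fnmatch
import zlib

# L7: routing rules keyed on the HTTP path (group names are hypothetical).
L7_RULES = [("/api/*", "api-targets"), ("/static/*", "static-targets")]
DEFAULT_GROUP = "web-targets"


def l7_route(path: str) -> str:
    """Pick a target group by inspecting the request path."""
    for pattern, group in L7_RULES:
        if fnmatch.fnmatchcase(path, pattern):
            return group
    return DEFAULT_GROUP


# L4: only addresses and ports are visible, so pick a target by hashing them.
L4_TARGETS = ["10.0.1.10", "10.0.2.10"]


def l4_route(src_ip: str, src_port: int, dst_port: int) -> str:
    """Pick a target from connection details alone, with no view of the path."""
    key = f"{src_ip}:{src_port}:{dst_port}".encode()
    return L4_TARGETS[zlib.crc32(key) % len(L4_TARGETS)]


print(l7_route("/api/users"))       # api-targets
print(l7_route("/static/app.css"))  # static-targets
print(l7_route("/index.html"))      # web-targets
print(l4_route("203.0.113.10", 50000, 443) in L4_TARGETS)  # True
```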

“What is a health check in the context of a load balancer?” Testing: production operations knowledge. Load balancers periodically probe a target (usually an HTTP endpoint) to verify it can serve traffic. If a target fails health checks, it is removed from the rotation until it passes again. Strong candidates mention: the difference between health check failure and target group being empty, what happens to in-flight connections during a scale-in event, and the role of health checks in rolling deployments.
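Health check behaviour is usually threshold-based: a target leaves rotation after some number of consecutive failures and returns after some number of consecutive passes. The sketch below models that state machine. The class name and the default thresholds (2 failures, 3 passes) are assumptions, not any provider's defaults.

```python
class TargetHealth:
    """Tracks whether one target should receive traffic."""

    def __init__(self, healthy_threshold: int = 3, unhealthy_threshold: int = 2):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.in_rotation = True
        self._streak = 0  # consecutive results that disagree with the current state

    def record(self, passed: bool) -> bool:
        """Record one probe result and return whether the target is in rotation."""
        if passed == self.in_rotation:
            self._streak = 0  # result agrees with current state: reset the streak
        else:
            self._streak += 1
            needed = self.healthy_threshold if passed else self.unhealthy_threshold
            if self._streak >= needed:
                self.in_rotation = passed
                self._streak = 0
        return self.in_rotation


target = TargetHealth()
results = [target.record(p) for p in (False, False, True, True, True)]
print(results)  # [True, False, False, False, True]
```

Requiring more passes than failures makes the target quick to leave rotation and slow to return, which matters during rolling deployments when a new instance is still warming up.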

VPN and Direct Connectivity Questions

“What is the difference between a VPN and Direct Connect (or Cloud Interconnect on GCP)?” Testing: hybrid connectivity knowledge. A VPN connects an on-premises network to a cloud VPC over the public internet, encrypted with IPsec. It is cheaper and faster to set up but subject to internet variability. Direct Connect (AWS) / Cloud Interconnect (GCP) / ExpressRoute (Azure) is a dedicated physical connection to the cloud provider, bypassing the public internet. It offers consistent latency, higher bandwidth, and is the choice for large data transfers or latency-sensitive workloads. It takes weeks to months to provision.

Scenario Question: Designing a 3-Tier Web Application Network

An interviewer might ask: “Design the network architecture for a 3-tier web application. What decisions would you make and why?”

A structured answer:

Tier 1 — Web/presentation layer. Load balancer in a public subnet accepting traffic on 443. Behind it, web servers in private subnets. Security group on the load balancer allows 443 from 0.0.0.0/0. Security group on web servers allows traffic only from the load balancer’s security group.

Tier 2: Application layer. Application servers in their own private subnets, spread across at least two availability zones for resilience. Security group allows traffic only from the web tier security group on the application port.

Tier 3 — Database layer. Database in private subnets, ideally across two AZs for high availability. Security group allows traffic only from the application tier security group on the database port (3306 for MySQL, 5432 for PostgreSQL, etc.).

Outbound internet access. NAT gateway in public subnet(s), with private subnet route tables pointing 0.0.0.0/0 at the NAT gateway. Consider one NAT gateway per AZ for availability (a single NAT gateway for the whole VPC is cheaper, but if its AZ fails, every private subnet loses outbound connectivity).

DNS. Application hostname resolves through Route 53 to the load balancer. Internal service discovery uses private DNS.

Trade-offs to mention: Multi-AZ NAT gateways add cost but improve availability. You might use a single AZ in development to save money. CIDR block sizing — leave room for additional subnets without renumbering.
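The design above can be written down as data to check that it hangs together. This is a sketch under stated assumptions: the 10.0.0.0/16 VPC range, the /24 subnet size, the AZ labels, and the web and application ports (8080 and 9000) are all invented for illustration. Only port 443 and the PostgreSQL port 5432 come from the answer.

```python
import ipaddress

# Subnet plan: one /24 per tier per AZ, carved from a hypothetical /16.
vpc = ipaddress.ip_network("10.0.0.0/16")
blocks = vpc.subnets(new_prefix=24)
tiers = ["public", "web", "app", "db"]
azs = ["az-a", "az-b"]
subnet_plan = {(tier, az): next(blocks) for tier in tiers for az in azs}

# Security group rules: destination group -> allowed (source, port) pairs.
# Each tier accepts traffic only from the tier directly in front of it.
sg_rules = {
    "lb": [("0.0.0.0/0", 443)],
    "web": [("lb", 8080)],
    "app": [("web", 9000)],
    "db": [("app", 5432)],
}


def allowed(source: str, dest: str, port: int) -> bool:
    """True if the destination's security group admits this source and port."""
    return (source, port) in sg_rules[dest]


print(subnet_plan[("public", "az-a")])  # 10.0.0.0/24
print(subnet_plan[("db", "az-b")])      # 10.0.7.0/24
print(allowed("app", "db", 5432))       # True
print(allowed("web", "db", 5432))       # False
```

Eight of the 256 available /24 blocks are used, so there is room to add subnets later without renumbering, and the web tier has no direct path to the database.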

What to Say When You Do Not Know a Networking Answer

Networking gaps are common — even experienced cloud engineers have areas they have not needed to touch. A few honest approaches that work in interviews:

Name what you do know. “I haven’t personally configured BGP routing, but I understand it’s used for dynamic route advertisement and I’d know where to start looking.”

Describe your troubleshooting approach. Even if you don’t know the answer, showing that you know how to investigate demonstrates maturity. “I’d check the route tables first, then the security group rules, then look at VPC flow logs to see where packets are dropping.”

Do not pretend. Interviewers know networking is a gap for many cloud engineers. An honest “I haven’t worked with that directly but here’s how I’d approach learning it” is better than a confident wrong answer.