VPC, Load Balancing & DNS

One-line summary: How requests flow through GCP networking: VPC routing, load balancer selection, and DNS resolution.

Prerequisites: Basic networking concepts (IP addresses, subnets, routing), Foundations.


Mental Model

Request Flow

flowchart LR
    Client[Client] --> DNS[DNS Resolution]
    DNS --> LB[Load Balancer]
    LB --> VPC[VPC Routing]
    VPC --> VM[VM/Instance]
    style DNS fill:#99ccff
    style LB fill:#ffcc99
    style VPC fill:#99ff99

Key insight: A typical external request passes through DNS → Load Balancer → VPC → Instance. Understanding each layer is critical for debugging and design.

VPC Architecture

VPC (Virtual Private Cloud): Logically isolated network in GCP.

Components:
  - Subnets: IP address ranges within the VPC
  - Routes: How traffic is routed
  - Firewall rules: What traffic is allowed
  - Peering: Connecting VPCs


Internals & Architecture

VPC Deep Dive

Subnets

Regional subnets: Span all zones in a region.

IP ranges: CIDR blocks (e.g., 10.0.0.0/16).

Private vs public:
  - Private: No external IP; outbound internet access goes through a NAT gateway (Cloud NAT)
  - Public: External IP, direct internet access
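
To make the CIDR sizing concrete, here is a minimal sketch that counts how many addresses a subnet range leaves for instances. The constant `RESERVED_PER_SUBNET` and the helper name `usable_addresses` are illustrative; the value of four reserved addresses per primary IPv4 range is stated here as an assumption and should be checked against the current GCP documentation.

```python
import ipaddress

# Assumption: four addresses in each primary IPv4 range are reserved
# (network, default gateway, second-to-last, broadcast).
RESERVED_PER_SUBNET = 4


def usable_addresses(cidr: str) -> int:
    """Return how many addresses in the CIDR block remain for instances."""
    network = ipaddress.ip_network(cidr, strict=True)
    return max(network.num_addresses - RESERVED_PER_SUBNET, 0)


for cidr in ("10.0.0.0/16", "10.0.1.0/24", "10.0.2.0/29"):
    print(cidr, usable_addresses(cidr))
```

A /24 that looks roomy on paper leaves 252 addresses under this assumption, which is worth keeping in mind for autoscaled workloads.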

Routes

Route table: Determines where traffic goes.

Default routes:
  - Default internet gateway route: Sends 0.0.0.0/0 traffic to the internet
  - Subnet routes: Created for each subnet range; route traffic within the VPC
  - Peering routes: Route to peered VPCs; exchanged once a peering is established

Custom routes:
  - Static routes: Manual routing rules
  - Dynamic routes: BGP routes (for VPN/Interconnect)
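
As a rough illustration of how a route table decides where traffic goes, the sketch below picks the most specific matching route and breaks ties by priority (lower number first). The `Route` class, the route names, and the next-hop labels are hypothetical; the selection order is a simplified model, not a complete description of GCP's routing rules.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    dest: str             # destination CIDR
    next_hop: str         # illustrative label only
    priority: int = 1000  # lower wins among equally specific routes


def select_route(routes, dst_ip):
    """Pick the longest-prefix match; break ties by lowest priority value."""
    ip = ipaddress.ip_address(dst_ip)
    candidates = [r for r in routes if ip in ipaddress.ip_network(r.dest)]
    if not candidates:
        return None  # no route: the packet would be dropped
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r.dest).prefixlen, r.priority),
    )


ROUTES = [
    Route("default-internet", "0.0.0.0/0", "default-internet-gateway"),
    Route("subnet-web", "10.0.1.0/24", "vpc-network", 0),
    Route("to-onprem", "192.168.0.0/16", "vpn-tunnel-1"),
]

for target in ("10.0.1.7", "192.168.4.4", "8.8.8.8"):
    print(target, "->", select_route(ROUTES, target).name)
```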

Firewall Rules

Firewall rules: Control what traffic is allowed.

Components:
  - Direction: Ingress (inbound) or egress (outbound)
  - Source/Destination: IP ranges, tags, service accounts
  - Protocol/Port: TCP, UDP, ICMP, specific ports
  - Action: Allow or deny
  - Priority: Lower number = higher priority

Default rules:
  - Default allow egress: All outbound traffic allowed
  - Default deny ingress: All inbound traffic denied (unless allowed)
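
The evaluation order can be sketched as a first-match search over rules sorted by priority. This is a simplified model: the `Rule` class and the rule names are hypothetical, matching is limited to IP range, protocol, and port, and two details are assumptions to verify against the GCP docs (the default rules sit at priority 65535, and deny wins over allow at equal priority).

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    direction: str     # "INGRESS" or "EGRESS"
    action: str        # "allow" or "deny"
    priority: int      # lower number = higher priority
    cidr: str          # source range for ingress, destination range for egress
    protocol: str      # "tcp", "udp", "icmp", or "all"
    ports: tuple = ()  # empty tuple means any port


# Assumption: the default rules are modeled at the lowest priority (65535).
IMPLIED = [
    Rule("implied-allow-egress", "EGRESS", "allow", 65535, "0.0.0.0/0", "all"),
    Rule("implied-deny-ingress", "INGRESS", "deny", 65535, "0.0.0.0/0", "all"),
]


def matches(rule, direction, peer_ip, protocol, port):
    return (
        rule.direction == direction
        and ipaddress.ip_address(peer_ip) in ipaddress.ip_network(rule.cidr)
        and rule.protocol in ("all", protocol)
        and (not rule.ports or port in rule.ports)
    )


def evaluate(rules, direction, peer_ip, protocol, port):
    """Return the first matching rule in priority order (deny first on ties)."""
    ordered = sorted(
        list(rules) + IMPLIED, key=lambda r: (r.priority, r.action != "deny")
    )
    for rule in ordered:
        if matches(rule, direction, peer_ip, protocol, port):
            return rule
    return None  # only reachable for an unknown direction value


RULES = [
    Rule("allow-https", "INGRESS", "allow", 1000, "0.0.0.0/0", "tcp", (443,)),
    Rule("deny-bad-range", "INGRESS", "deny", 900, "203.0.113.0/24", "all"),
]

print(evaluate(RULES, "INGRESS", "198.51.100.5", "tcp", 22).name)
```

Walking a failing request through a model like this is often quicker than reading the rule list by eye, because it names the exact rule that decided the outcome.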

VPC Peering

VPC peering: Connect VPCs for private communication.

Types:
  - Same-project peering: Between VPCs in the same project
  - Cross-project peering: Across projects
  - Cross-organization peering: Across organizations

Limitations:
  - No transitive peering (A→B→C doesn't work)
  - IP ranges must not overlap
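
Both limitations can be checked before a peering is requested. The following sketch uses hypothetical VPC names and helper functions: one reports overlapping CIDR ranges between two networks, the other models non-transitive reachability by accepting only direct peerings.

```python
import ipaddress


def overlapping_ranges(ranges_a, ranges_b):
    """Return every (a, b) pair of CIDR ranges that overlap."""
    return [
        (a, b)
        for a in ranges_a
        for b in ranges_b
        if ipaddress.ip_network(a).overlaps(ipaddress.ip_network(b))
    ]


def can_reach(peerings, src, dst):
    """Peering is not transitive: only a direct peering gives reachability."""
    return src == dst or frozenset((src, dst)) in peerings


# Hypothetical topology: A peers with B, and B peers with C.
PEERINGS = {frozenset(("vpc-a", "vpc-b")), frozenset(("vpc-b", "vpc-c"))}

print(overlapping_ranges(["10.0.0.0/16"], ["10.0.1.0/24", "10.1.0.0/16"]))
print(can_reach(PEERINGS, "vpc-a", "vpc-c"))
```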

Load Balancing

Load Balancer Types

1. Global Load Balancer (HTTP(S))
  - Use case: HTTP/HTTPS traffic
  - Features: SSL termination, content-based routing, CDN integration
  - Scope: Global (anywhere → any region)

2. Global Load Balancer (TCP/SSL)
  - Use case: TCP/SSL traffic
  - Features: SSL termination at the proxy (SSL proxy), TCP proxy
  - Scope: Global

3. Regional Load Balancer (Internal)
  - Use case: Internal traffic within region
  - Features: Private IP only, lower latency
  - Scope: Regional

4. Network Load Balancer
  - Use case: Non-HTTP traffic, high performance
  - Features: Pass-through, preserves client IP
  - Scope: Regional
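
The four types above can be read as a small decision table. The sketch below encodes one possible reading of it; the function name and the `needs_client_ip` flag are hypothetical, and real selection involves more factors (IP version, proxy vs pass-through requirements, network tier) than this captures.

```python
def choose_load_balancer(
    protocol: str, internal: bool, needs_client_ip: bool = False
) -> str:
    """Map coarse requirements onto the four load balancer types listed above."""
    protocol = protocol.lower()
    if internal:
        return "Regional Load Balancer (Internal)"
    if protocol in ("http", "https"):
        return "Global Load Balancer (HTTP(S))"
    if protocol in ("tcp", "ssl") and not needs_client_ip:
        return "Global Load Balancer (TCP/SSL)"
    # UDP and other non-HTTP traffic, or anything that must see the
    # original client IP, falls through to the pass-through option.
    return "Network Load Balancer"


print(choose_load_balancer("https", internal=False))
print(choose_load_balancer("udp", internal=False))
```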

Load Balancing Algorithms

HTTP(S) Load Balancer:
  - Least connections: Route to backend with fewest connections
  - Round robin: Distribute evenly
  - Geographic: Route based on client location

Network Load Balancer:
  - 5-tuple hash: Hash on (src IP, src port, dst IP, dst port, protocol)
  - Consistent: Same connection → same backend; session affinity can pin a client to one backend across connections
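
A 5-tuple hash can be sketched in a few lines. This is an illustration only: the hash function (SHA-256 with a modulo) and the backend names are assumptions, not the algorithm the load balancer actually uses, and plain modulo hashing reshuffles most connections when the backend set changes.

```python
import hashlib


def pick_backend(backends, src_ip, src_port, dst_ip, dst_port, protocol):
    """Map a connection's 5-tuple onto one backend deterministically."""
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


BACKENDS = ["vm-a", "vm-b", "vm-c"]  # hypothetical instance names

# The same 5-tuple always lands on the same backend.
first = pick_backend(BACKENDS, "198.51.100.7", 50000, "203.0.113.10", 443, "tcp")
again = pick_backend(BACKENDS, "198.51.100.7", 50000, "203.0.113.10", 443, "tcp")
print(first == again)
```

Because the source port is part of the key, a new connection from the same client may hash to a different backend, which is why stickiness per client needs session affinity rather than the 5-tuple alone.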

Health Checks

Health checks: Determine if backends are healthy.

Types:
  - HTTP: Check HTTP endpoint (e.g., /health)
  - HTTPS: Check HTTPS endpoint
  - TCP: Check TCP port
  - SSL: Check SSL handshake

Configuration:
  - Interval: How often to check (e.g., 10s)
  - Timeout: How long to wait (e.g., 5s)
  - Healthy threshold: Consecutive successes needed (e.g., 2)
  - Unhealthy threshold: Consecutive failures needed (e.g., 3)
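
The two thresholds describe a small state machine: a backend changes state only after enough consecutive probe results disagree with its current state. The sketch below models that with a hypothetical `HealthTracker` class; starting in the healthy state is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    healthy_threshold: int = 2    # consecutive successes to become healthy
    unhealthy_threshold: int = 3  # consecutive failures to become unhealthy
    healthy: bool = True          # assumption: the backend starts healthy
    _streak: int = 0              # consecutive results contradicting the state

    def record(self, probe_ok: bool) -> bool:
        """Feed one probe result and return the resulting health state."""
        if probe_ok == self.healthy:
            self._streak = 0  # result agrees with the state, reset the streak
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if probe_ok else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = probe_ok
            self._streak = 0
        return self.healthy


tracker = HealthTracker()
for result in (False, False, False, True, True):
    print(result, "->", tracker.record(result))
```

With the example values above (10s interval, unhealthy threshold 3), a failed backend keeps receiving traffic for roughly 30 seconds before it is marked unhealthy, so the thresholds trade detection speed against flapping.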

DNS Resolution

DNS Architecture

Cloud DNS: GCP's DNS service.

Components:
  - Zones: DNS namespaces (e.g., example.com)
  - Records: DNS records (A, AAAA, CNAME, etc.)
  - Policies: Routing policies (geolocation, weighted, etc.)

DNS Resolution Flow

  1. Client queries DNS: Resolves domain name to IP
  2. DNS resolver: Checks cache, queries authoritative DNS
  3. Authoritative DNS: Returns IP address
  4. Client connects: Uses IP to connect to load balancer

DNS Caching

TTL (Time To Live): How long DNS records are cached.

Tradeoffs:
  - Short TTL: Faster updates, more DNS queries
  - Long TTL: Fewer queries, slower updates

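Recommendation: Use a short TTL (e.g., 60s) for production records that may change during deployments or failover, and a longer TTL for records that rarely change, such as those for static resources.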
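
The TTL tradeoff is easier to see with a toy resolver cache. In this sketch, `TtlCache`, the injected `lookup` function, and the record values are all hypothetical stand-ins for a real resolver and an authoritative server; the clock is injectable so the expiry behavior can be exercised without waiting.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Cache answers until their TTL expires, then ask upstream again."""

    def __init__(
        self,
        lookup: Callable[[str], Tuple[str, float]],  # returns (address, ttl_seconds)
        clock: Callable[[], float] = time.monotonic,
    ):
        self._lookup = lookup
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # name -> (address, expiry)
        self.upstream_queries = 0

    def resolve(self, name: str) -> str:
        now = self._clock()
        cached = self._entries.get(name)
        if cached and cached[1] > now:
            return cached[0]  # still fresh: a changed record is not visible yet
        address, ttl = self._lookup(name)
        self.upstream_queries += 1
        self._entries[name] = (address, now + ttl)
        return address


# Stand-in for the authoritative answer; the address is a documentation IP.
records = {"api.example.com": ("203.0.113.10", 60.0)}
cache = TtlCache(lambda name: records[name])
print(cache.resolve("api.example.com"), cache.upstream_queries)
```

After a record changes, clients behind a cache like this can keep using the old address for up to one full TTL, which is the delay to plan for when updating records during a deployment.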


Failure Modes & Blast Radius

VPC Failures

Scenario 1: Misconfigured Firewall Rules

Scenario 2: Subnet Exhaustion

Scenario 3: Route Misconfiguration

Load Balancer Failures

Scenario 1: All Backends Unhealthy

Scenario 2: Load Balancer Overload

Scenario 3: SSL Certificate Expiry

DNS Failures

Scenario 1: DNS Resolution Failure

Scenario 2: DNS Cache Poisoning

Scenario 3: DNS Propagation Delay

Overload Scenarios

10× Normal Load

100× Normal Load


Observability Contract

Metrics to Track

VPC Metrics

Load Balancer Metrics

DNS Metrics

Logs

Log events:
  - Firewall rule matches (allow/deny)
  - Load balancer access logs
  - DNS query logs
  - Route changes

Traces

Trace:
  - End-to-end request latency
  - DNS resolution time
  - Load balancer processing time
  - VPC routing time

Alerts

Critical alerts:
  - All backends unhealthy
  - Load balancer error rate > threshold
  - DNS resolution failures
  - Firewall blocking legitimate traffic

Warning alerts:
  - Backend health degrading
  - Load balancer latency increasing
  - DNS query rate spiking


Change Safety

VPC Changes

Adding Subnets

Changing Firewall Rules

VPC Peering

Load Balancer Changes

Adding Backends

Changing Health Checks

SSL Certificate Updates

DNS Changes

Updating Records

Changing TTL


Security Boundaries

VPC Security

Load Balancer Security

DNS Security


Tradeoffs

VPC Tradeoffs

Regional vs Zonal:
  - Regional: Span multiple zones, more resilient
  - Zonal: Lower latency, simpler

Public vs Private:
  - Public: Direct internet access, simpler
  - Private: More secure, requires a NAT gateway (Cloud NAT) for outbound internet access

Load Balancer Tradeoffs

Global vs Regional:
  - Global: Lower latency (closer to users), more complex
  - Regional: Simpler, higher latency for distant users

HTTP(S) vs Network:
  - HTTP(S): More features (SSL termination, routing), higher latency
  - Network: Lower latency, fewer features

DNS Tradeoffs

Short vs Long TTL:
  - Short TTL: Faster updates, more queries
  - Long TTL: Fewer queries, slower updates


Operational Considerations

Capacity Planning

VPC:
  - Plan subnet sizes (don't run out of IPs)
  - Plan for VPC peering (IP ranges must not overlap)

Load Balancer:
  - Plan backend capacity
  - Plan for health check overhead

DNS:
  - Plan for DNS query rate
  - Plan for DNS record storage

Monitoring & Debugging

VPC:
  - Monitor firewall rule matches
  - Monitor route changes
  - Monitor network latency

Load Balancer:
  - Monitor backend health
  - Monitor request rate and latency
  - Monitor error rate

DNS:
  - Monitor DNS query rate
  - Monitor DNS resolution time
  - Monitor DNS errors

Incident Response

Common incidents:
  - Services unreachable (firewall, routing)
  - Load balancer failures
  - DNS resolution failures

Response:
  1. Check firewall rules
  2. Check routes
  3. Check load balancer health
  4. Check DNS records
  5. Verify connectivity
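
For the final connectivity step, a plain TCP probe is often enough to tell "blocked or unreachable" apart from "application error". The helper below is a minimal sketch with a hypothetical name; it only confirms that a TCP handshake completes, and says nothing about which layer (firewall, route, or backend) is responsible when it fails.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Running the probe from a VM inside the VPC and again from outside helps narrow down whether the problem sits in front of or behind the load balancer.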


What Staff Engineers Ask in Reviews

Design Questions

Scale Questions

Security Questions

Operational Questions


Further Reading

Comprehensive Guide: Further Reading: VPC, Load Balancing & DNS

Quick Links: - GCP VPC Documentation - GCP Load Balancing Documentation - GCP Cloud DNS Documentation - Google Cloud Architecture Center


Exercises

  1. Design VPC: Design a VPC for a multi-tier application (web, app, database). What subnets do you need? What firewall rules?

  2. Load balancer selection: You have an API that needs low latency and high throughput. Do you use HTTP(S) or Network Load Balancer? Why?

  3. DNS strategy: You're deploying a new service. What DNS TTL do you use? How do you handle DNS updates during deployments?

Answer Key: View Answers