You’ve got private subnets that need internet access. You slap a single NAT Gateway in one AZ, point all private route tables at it, and move on. Six months later, your AWS bill has a mysterious cross-AZ data transfer line item bleeding hundreds of dollars — and your “highly available” architecture has a single point of failure sitting in us-east-1a.
This article is for engineers who’ve deployed NAT Gateways but never done the math. We’ll break down exactly how a single regional NAT Gateway compares to a per-AZ setup in cost, availability, and operational complexity — with real numbers and production Terraform.
How Each Architecture Actually Works
Regional (Single) NAT Gateway: You deploy one NAT Gateway in a single Availability Zone. All private subnets across every AZ route their outbound internet traffic to this single gateway. Traffic originating from other AZs crosses AZ boundaries to reach it.
Per-AZ (Zonal) NAT Gateway: You deploy one NAT Gateway in each AZ that has private subnets. Each AZ’s private route table points to its own local NAT Gateway. Traffic stays within the same AZ.
The key difference isn’t the NAT Gateway itself — it’s what happens to the traffic path. AWS charges $0.01/GB for cross-AZ data transfer in each direction. When a private subnet in us-east-1b sends traffic through a NAT Gateway in us-east-1a, you pay this fee on top of the standard NAT Gateway processing charge. This cross-AZ charge applies to traffic going to the NAT Gateway and the response coming back — so it’s $0.02/GB round-trip for cross-AZ traffic.
Cost Breakdown: Real Dollar Example (1 TB/month, 3 AZs)
Let’s assume us-east-1 pricing, 1 TB total outbound data per month evenly distributed across 3 AZs (~333 GB per AZ), and standard on-demand rates.
| Cost Component | Single NAT GW (1 AZ) | Per-AZ NAT GW (3 AZs) |
|---|---|---|
| NAT GW hourly ($0.045/hr × 730 hrs) | $32.85 (1 gateway) | $98.55 (3 gateways) |
| NAT GW data processing ($0.045/GB × 1024 GB) | $46.08 | $46.08 |
| Cross-AZ transfer ($0.01/GB each direction) | ~$13.65 (⅔ of traffic crosses AZ, round-trip: ~683 GB × $0.02) | $0.00 |
| Total Monthly Cost | $92.58 | $144.63 |
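The arithmetic behind the table can be reproduced for other traffic volumes with a small provider-free Terraform sketch. The variable, local, and output names are hypothetical, and the rates are the us-east-1 figures assumed in this article, so check them against current pricing before relying on the result.

```hcl
# Hypothetical inputs; names are illustrative only.
variable "monthly_gb" {
  type    = number
  default = 1024 # 1 TB, as in the table
}

variable "az_count" {
  type    = number
  default = 3
}

locals {
  hourly_rate     = 0.045 # USD per NAT Gateway hour (assumed us-east-1 rate)
  processing_rate = 0.045 # USD per GB processed by the NAT Gateway
  cross_az_rate   = 0.02  # USD per GB, $0.01 in each direction
  hours_per_month = 730

  processing = var.monthly_gb * local.processing_rate

  # With traffic spread evenly, (az_count - 1) / az_count of it
  # originates outside the single gateway's AZ.
  cross_az_gb = var.monthly_gb * (var.az_count - 1) / var.az_count

  single_total = (
    local.hourly_rate * local.hours_per_month +
    local.processing +
    local.cross_az_gb * local.cross_az_rate
  )

  per_az_total = (
    var.az_count * local.hourly_rate * local.hours_per_month +
    local.processing
  )
}

output "single_nat_monthly_usd" {
  value = local.single_total
}

output "per_az_nat_monthly_usd" {
  value = local.per_az_total
}
```

Running `terraform apply` in an empty directory containing only this file should print totals close to the table's $92.58 and $144.63. Passing `-var monthly_gb=10240` shows how the comparison shifts at 10 TB.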
At 1 TB/month, the single NAT Gateway is cheaper by ~$52. But watch what happens as traffic scales.
The cross-AZ cost scales linearly with traffic. The per-AZ hourly cost is fixed at ~$65.70/month extra. Let’s find the break-even point.
Break-Even Calculation
The extra hourly cost of 2 additional NAT Gateways is:
# Extra fixed cost per month for 2 additional NAT Gateways
2 × $0.045/hr × 730 hrs = $65.70/month
The cross-AZ savings per GB (assuming ⅔ of traffic crosses AZs, round-trip):
# Cross-AZ cost per GB of total traffic (2/3 crosses AZ, $0.01 each direction)
(2/3) × $0.02 = $0.01333 per GB of total traffic
Break-even:
# Break-even GB = Extra fixed cost / Cross-AZ savings per GB
$65.70 / $0.01333 = ~4,928 GB ≈ 4.8 TB/month
At roughly 4.8 TB/month of total outbound traffic across 3 AZs, per-AZ NAT Gateways become cheaper. Above that, every additional TB saves you ~$13.65/month. At 10 TB/month, per-AZ saves ~$71/month. At 50 TB/month, it saves ~$617/month. The savings grow linearly with traffic, so they add up fast in data-heavy workloads.
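The same break-even logic generalizes to other AZ counts. The sketch below assumes evenly distributed traffic and the rates used above. The local and output names are hypothetical.

```hcl
locals {
  nat_hourly_usd      = 0.045 # assumed us-east-1 hourly rate
  hours               = 730
  cross_az_usd_per_gb = 0.02 # $0.01 in each direction

  # For n AZs: (n - 1) extra gateways, and (n - 1) / n of traffic crosses AZs
  # when a single gateway is used.
  break_even_gb = {
    for n in [2, 3, 4] :
    tostring(n) => ((n - 1) * local.nat_hourly_usd * local.hours) / (((n - 1) / n) * local.cross_az_usd_per_gb)
  }
}

output "break_even_gb_by_az_count" {
  value = local.break_even_gb
}
```

For 3 AZs this evaluates to about 4,928 GB, matching the figure above. The expression simplifies to n × $32.85 / $0.02, so under these assumptions the break-even volume grows in proportion to the number of AZs.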
The High Availability Angle: The Hidden Single Point of Failure
This is where the cost discussion becomes secondary. A NAT Gateway is redundant within a single AZ — AWS manages that. But it is not redundant across AZs. If the AZ hosting your single NAT Gateway experiences an outage, every private subnet in every AZ loses outbound internet connectivity simultaneously.
This is not hypothetical. AWS has had incidents that impaired a single AZ while the other AZs in the region stayed healthy. If your NAT Gateway was in the affected AZ, your “multi-AZ” architecture collapsed to zero outbound connectivity.
Per-AZ NAT Gateways provide true fault isolation. An AZ failure only affects the subnets in that AZ. The remaining AZs continue processing traffic normally. For any production workload, this alone justifies the per-AZ approach regardless of cost.
Advantages and disadvantages at a glance:
| Factor | Single NAT GW | Per-AZ NAT GW |
|---|---|---|
| Cost (low traffic) | ✅ Cheaper | ❌ Higher fixed cost |
| Cost (high traffic) | ❌ Cross-AZ adds up | ✅ No cross-AZ fees |
| High availability | ❌ AZ SPOF | ✅ Full AZ isolation |
| Operational complexity | ✅ Simpler | ❌ More route tables |
| Terraform complexity | ✅ Minimal | ⚠️ Slightly more |
| Blast radius | ❌ All AZs affected | ✅ Single AZ contained |
When to Choose Each
Choose a single NAT Gateway when:
- It’s a dev/staging environment where downtime is acceptable
- Monthly outbound traffic is well under 5 TB
- You’re optimizing for cost above all else
- The workload has no SLA requirements
Choose per-AZ NAT Gateways when:
- It’s a production environment — period
- You need genuine multi-AZ resilience
- Monthly outbound traffic exceeds 5 TB (it’s now cheaper anyway)
- You’re running in a regulated industry requiring HA
- Your architecture follows the AWS Well-Architected Framework (which explicitly recommends per-AZ)
For most teams, the answer is straightforward: per-AZ for production, single for everything else.
Terraform Snippets: Both Setups
Single NAT Gateway (Regional Pattern):
resource "aws_eip" "nat" {
  domain = "vpc"

  tags = {
    Name = "nat-eip"
  }
}

resource "aws_nat_gateway" "single" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public[0].id

  tags = {
    Name = "nat-gw-single"
  }
}

resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.single.id
  }

  tags = {
    Name = "private-rt"
  }
}

# All private subnets share one route table
resource "aws_route_table_association" "private" {
  count          = length(var.azs)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private.id
}
Per-AZ NAT Gateways (Zonal Pattern):
variable "azs" {
  default = ["us-east-1a", "us-east-1b", "us-east-1c"]
}

resource "aws_eip" "nat" {
  count  = length(var.azs)
  domain = "vpc"

  tags = {
    Name = "nat-eip-${var.azs[count.index]}"
  }
}

resource "aws_nat_gateway" "per_az" {
  count         = length(var.azs)
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id

  tags = {
    Name = "nat-gw-${var.azs[count.index]}"
  }
}

resource "aws_route_table" "private" {
  count  = length(var.azs)
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.per_az[count.index].id
  }

  tags = {
    Name = "private-rt-${var.azs[count.index]}"
  }
}

# Each private subnet gets its own AZ-local route table
resource "aws_route_table_association" "private" {
  count          = length(var.azs)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}
The per-AZ pattern adds exactly three more resources per additional AZ: one EIP, one NAT Gateway, and one route table. The Terraform complexity is marginal.
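If you want one codebase for both patterns (per-AZ in production, single elsewhere), one possible approach is a boolean toggle. This is a sketch, not a drop-in module: `single_nat_gateway` and `nat_count` are hypothetical names, and the block is meant to replace the EIP, NAT Gateway, and route table definitions in the snippets above rather than sit alongside them. It assumes the same `var.azs`, `aws_vpc.main`, and `aws_subnet.public` resources.

```hcl
variable "single_nat_gateway" {
  description = "Hypothetical toggle: true for dev/staging, false for production"
  type        = bool
  default     = false
}

locals {
  # One gateway in single mode, one per AZ otherwise
  nat_count = var.single_nat_gateway ? 1 : length(var.azs)
}

resource "aws_eip" "nat" {
  count  = local.nat_count
  domain = "vpc"
}

resource "aws_nat_gateway" "this" {
  count         = local.nat_count
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
}

resource "aws_route_table" "private" {
  count  = length(var.azs)
  vpc_id = aws_vpc.main.id

  route {
    # In single mode every route table points at gateway 0;
    # otherwise each AZ uses its own local gateway.
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.this[var.single_nat_gateway ? 0 : count.index].id
  }
}
```

Keeping one route table per AZ in both modes means the route table associations from the per-AZ snippet work unchanged, and moving an environment to per-AZ later only changes the route targets.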
Common Mistakes Engineers Actually Make
- Ignoring cross-AZ data transfer on the bill. It shows up under “EC2-Other” → “NatGateway” in Cost Explorer, not as a separate NAT line item. Many engineers never see it. Enable Cost Allocation Tags and filter by UsageType containing NatGateway and DataTransfer-Regional-Bytes.
- Deploying per-AZ NAT Gateways but sharing a single route table. This defeats the entire purpose. Each AZ’s private subnet must have its own route table pointing to its own NAT Gateway. Verify with:
    aws ec2 describe-route-tables \
      --filters "Name=association.subnet-id,Values=subnet-0abc123" \
      --query "RouteTables[].Routes[?NatGatewayId].NatGatewayId" \
      --output text
- Forgetting that a NAT Gateway has a bandwidth ceiling. Current AWS documentation lists 5 Gbps per gateway, scaling automatically up to 100 Gbps (older documentation listed 45 Gbps). If you’re processing massive data volumes through a single NAT Gateway, you may hit that ceiling. With three AZs, per-AZ deployment effectively triples your aggregate throughput capacity.
- Not considering VPC endpoints first. If your traffic is primarily to S3 or DynamoDB, Gateway VPC Endpoints are free and bypass the NAT Gateway entirely. For other AWS services, Interface VPC Endpoints ($0.01/GB processing + hourly) are often cheaper than NAT Gateway ($0.045/GB processing + hourly) for high-volume AWS API traffic.
- Using NAT Gateways in dev for resources that don’t need internet access. Audit what actually needs outbound connectivity. Often it’s only package updates and API calls that can be handled by VPC endpoints or scheduled in a maintenance window.
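For the VPC endpoint point above, a Gateway endpoint for S3 might look like the following. This is a minimal sketch: it assumes the `aws_vpc.main` resource and the count-based `aws_route_table.private` from the per-AZ snippet, and it hardcodes us-east-1 to match this article's examples. The resource label and tag name are illustrative.

```hcl
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.us-east-1.s3" # assumes us-east-1
  vpc_endpoint_type = "Gateway"

  # Adds an S3 prefix-list route to every private route table,
  # so S3 traffic no longer passes through the NAT Gateway.
  route_table_ids = aws_route_table.private[*].id

  tags = {
    Name = "s3-gateway-endpoint"
  }
}
```

With the single-gateway pattern, where the route table has no `count`, the last argument would be `[aws_route_table.private.id]` instead. A DynamoDB endpoint follows the same shape with `com.amazonaws.us-east-1.dynamodb` as the service name.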
Conclusion
The “cheaper” option depends entirely on your traffic volume and your tolerance for AZ-level failures. Below ~5 TB/month, a single NAT Gateway costs less. Above that, per-AZ wins on both cost and resilience. For production, per-AZ is the only defensible choice regardless of traffic volume.
- Single NAT Gateway saves ~$65/month in fixed costs but introduces cross-AZ fees and a critical single point of failure.
- Per-AZ NAT Gateways break even at ~4.8 TB/month across 3 AZs and provide true AZ fault isolation.
- Always check if VPC endpoints can eliminate NAT Gateway traffic entirely — Gateway endpoints for S3/DynamoDB are free.
- Audit your “EC2-Other” cost category in Cost Explorer — cross-AZ data transfer hides there.
- Production workloads should always use per-AZ NAT Gateways. Use single NAT Gateways only for non-production environments.