Introduction: Why Per-Client API Throttling Matters
You’ve built a REST API on Amazon API Gateway. It works. Traffic is growing. Then one client starts hammering your endpoint with 10,000 requests per second, degrading the experience for everyone else. Or maybe you’re monetizing your API and need to enforce different rate limits for free-tier vs. premium customers.
This is exactly the problem that Usage Plans and API Keys in Amazon API Gateway solve. They give you a built-in mechanism to throttle requests per client, set daily, weekly, or monthly quotas, and meter usage, all without writing a single line of application code.
This article is for intermediate AWS developers and DevOps engineers who already have a working API Gateway REST API and need to add per-client rate limiting. We’ll cover the full setup using the AWS CLI and the AWS Management Console, walk through real examples, discuss common mistakes, and address cost and security implications.
Important distinction: Usage Plans and API Keys are available only for REST APIs (the original API Gateway type, also called v1). HTTP APIs (v2) do not support usage plans or API keys. If you’re using HTTP APIs, you’ll need to implement per-client throttling differently (e.g., in a Lambda authorizer or in your backend). Note that AWS WAF attaches to REST API stages, not directly to HTTP APIs.
Prerequisites
- An AWS account with appropriate IAM permissions for API Gateway
- AWS CLI v2 installed and configured (aws configure)
- An existing API Gateway REST API deployed to at least one stage
- Basic familiarity with API Gateway concepts: resources, methods, stages, and deployments
If you don’t have a REST API yet, you can create a simple one quickly:
# Create a REST API
aws apigateway create-rest-api \
--name "MyThrottledAPI" \
--description "Demo API for usage plans" \
--endpoint-configuration types=REGIONAL
# Note the returned API ID — you'll need it throughout this guide
# Example output: "id": "abc123def4"
Understanding the Core Concepts: Usage Plans, API Keys, and Stages
Before diving into commands, let’s clarify how the three pieces fit together:
| Concept | What It Does | Analogy |
|---|---|---|
| API Key | A unique alphanumeric string that identifies a client. Passed via the x-api-key header. | A membership card |
| Usage Plan | Defines throttle limits (requests/second, burst) and quotas (requests/day, week, or month). Associated with one or more API stages. | A membership tier (Free, Pro, Enterprise) |
| Stage | A deployment snapshot of your API (e.g., dev, prod). Usage plans are attached to specific stages. | The venue where the membership is valid |
The relationship works like this: A Usage Plan is linked to one or more API Stages, and one or more API Keys are associated with a Usage Plan. When a client sends a request with their API key, API Gateway looks up which usage plan that key belongs to and enforces the corresponding throttle and quota settings.
Throttle vs. Quota — Know the Difference
- Throttle controls the rate — how many requests per second (steady-state) and the burst capacity. Think of it as a speed limit.
- Quota controls the total volume — how many requests a client can make in a given period (day, week, or month). Think of it as a mileage cap.
You can use both together. For example: “Client X can make up to 10 requests/second (throttle) and no more than 50,000 requests per month (quota).”
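To make the difference concrete, here is a minimal sketch of a textbook token bucket combined with a period quota. It is a toy model for intuition only: the class name `ClientLimiter` and its fields are hypothetical, and it is not a description of how API Gateway implements throttling internally.

```python
from dataclasses import dataclass


@dataclass
class ClientLimiter:
    """Toy model of one client's limits: a token bucket plus a period quota."""
    rate_limit: float   # tokens added per second (steady-state rate)
    burst_limit: int    # bucket capacity (largest burst served at once)
    quota_limit: int    # total requests allowed in the quota period
    tokens: float = 0.0
    used: int = 0
    last_time: float = 0.0

    def __post_init__(self) -> None:
        self.tokens = float(self.burst_limit)  # the bucket starts full

    def allow(self, now: float) -> str:
        """Return 'ok', 'throttled', or 'quota_exceeded' for a request at `now`."""
        # Refill tokens for the elapsed time, capped at the bucket size.
        elapsed = now - self.last_time
        self.tokens = min(self.burst_limit, self.tokens + elapsed * self.rate_limit)
        self.last_time = now
        if self.used >= self.quota_limit:
            return "quota_exceeded"
        if self.tokens < 1:
            return "throttled"
        self.tokens -= 1
        self.used += 1
        return "ok"


limiter = ClientLimiter(rate_limit=5, burst_limit=10, quota_limit=1000)
# 15 requests arriving at the same instant: the full bucket serves 10.
burst = [limiter.allow(now=0.0) for _ in range(15)]
print(burst.count("ok"), burst.count("throttled"))
# One second later, 5 tokens have been refilled.
later = [limiter.allow(now=1.0) for _ in range(6)]
print(later.count("ok"), later.count("throttled"))
```

The throttle decides whether a request is served right now, while the quota counter only ever grows until the period resets.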
Step-by-Step: Setting Up Usage Plans and API Keys via AWS CLI
Let’s walk through the complete setup. Assume you have a REST API with ID abc123def4 deployed to a stage called prod.
Step 1: Require API Keys on Your Methods
This is the step most people forget. You must explicitly mark each method as requiring an API key (the setting is per method, not per resource). Without this, API Gateway won’t check for or enforce API keys, even if you have usage plans configured.
# List the API's resources to find the ID of the resource that holds the method
aws apigateway get-resources \
--rest-api-id abc123def4
# Suppose the root resource "/" has ID "xyz789"
# and the /items resource, which has a GET method, has ID "res456"
# Update the GET method to require an API key
aws apigateway update-method \
--rest-api-id abc123def4 \
--resource-id res456 \
--http-method GET \
--patch-operations op=replace,path=/apiKeyRequired,value=true
After changing method settings, you must redeploy your API:
aws apigateway create-deployment \
--rest-api-id abc123def4 \
--stage-name prod \
--description "Enable API key requirement"
Step 2: Create API Keys
# Create an API key for Client A (Free tier)
aws apigateway create-api-key \
--name "ClientA-Free" \
--description "API key for Client A on free plan" \
--enabled
# Output includes the key "id" and "value"
# Example: "id": "key111aaa", "value": "aBcDeFgHiJkLmNoPqRsTuVwXyZ123456789"
# Create an API key for Client B (Premium tier)
aws apigateway create-api-key \
--name "ClientB-Premium" \
--description "API key for Client B on premium plan" \
--enabled
# Example: "id": "key222bbb", "value": "zYxWvUtSrQpOnMlKjIhGfEdCbA987654321"
The value field is the actual key string clients will send in the x-api-key header. The id is the internal AWS identifier you’ll use in subsequent API calls.
Step 3: Create Usage Plans
# Create a Free tier usage plan
aws apigateway create-usage-plan \
--name "FreePlan" \
--description "Free tier: 5 req/sec, 1000 req/day" \
--api-stages apiId=abc123def4,stage=prod \
--throttle burstLimit=10,rateLimit=5 \
--quota limit=1000,offset=0,period=DAY
# Output: "id": "plan111"
# Create a Premium tier usage plan
aws apigateway create-usage-plan \
--name "PremiumPlan" \
--description "Premium tier: 50 req/sec, 100000 req/month" \
--api-stages apiId=abc123def4,stage=prod \
--throttle burstLimit=100,rateLimit=50 \
--quota limit=100000,offset=0,period=MONTH
# Output: "id": "plan222"
Key parameters explained:
- rateLimit: The steady-state requests per second allowed (token refill rate).
- burstLimit: The maximum number of concurrent requests (token bucket size). Must be greater than or equal to rateLimit.
- quota limit: Total number of requests allowed in the given period.
- quota period: DAY, WEEK, or MONTH.
- quota offset: Per the current API reference, the number of requests subtracted from the limit in the initial time period (0 = no adjustment).
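If you create plans from Python rather than the CLI, the same parameters map onto the keyword arguments of boto3's `create_usage_plan`. The sketch below only builds and checks that argument dictionary; the helper name `usage_plan_request` is hypothetical, and the checks are local sanity checks rather than API Gateway's own validation rules.

```python
VALID_PERIODS = {"DAY", "WEEK", "MONTH"}


def usage_plan_request(name, api_id, stage, rate_limit, burst_limit,
                       quota_limit, period):
    """Build keyword arguments for a boto3 `create_usage_plan` call."""
    if period not in VALID_PERIODS:
        raise ValueError(f"period must be one of {sorted(VALID_PERIODS)}")
    if rate_limit <= 0 or burst_limit <= 0 or quota_limit <= 0:
        raise ValueError("limits must be positive")
    return {
        "name": name,
        "apiStages": [{"apiId": api_id, "stage": stage}],
        "throttle": {"rateLimit": float(rate_limit), "burstLimit": int(burst_limit)},
        "quota": {"limit": int(quota_limit), "offset": 0, "period": period},
    }


# Same values as the FreePlan CLI example above.
free = usage_plan_request("FreePlan", "abc123def4", "prod", 5, 10, 1000, "DAY")
print(free["throttle"], free["quota"])
```

With credentials configured, the result could be passed on as `boto3.client("apigateway").create_usage_plan(**free)`.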
Step 4: Associate API Keys with Usage Plans
# Add Client A's key to the Free plan
aws apigateway create-usage-plan-key \
--usage-plan-id plan111 \
--key-id key111aaa \
--key-type API_KEY
# Add Client B's key to the Premium plan
aws apigateway create-usage-plan-key \
--usage-plan-id plan222 \
--key-id key222bbb \
--key-type API_KEY
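Steps 2 and 4 are usually performed together when a new customer signs up. A minimal sketch of that flow in Python follows, assuming `client` is a boto3 API Gateway client (`boto3.client("apigateway")`). The function name `onboard_client`, the stub class, and the client name "ClientC-Free" are hypothetical; the stub exists only so the flow can be run without AWS credentials.

```python
def onboard_client(client, client_name, usage_plan_id):
    """Create an API key for `client_name` and attach it to a usage plan.

    Returns (key_id, key_value); the value is what the caller sends
    in the x-api-key header.
    """
    key = client.create_api_key(
        name=client_name,
        description=f"API key for {client_name}",
        enabled=True,
    )
    client.create_usage_plan_key(
        usagePlanId=usage_plan_id,
        keyId=key["id"],
        keyType="API_KEY",
    )
    return key["id"], key["value"]


class StubApiGateway:
    """Stand-in for the boto3 client that records calls instead of hitting AWS."""

    def __init__(self):
        self.attached = []

    def create_api_key(self, **kwargs):
        return {"id": "key-stub", "value": "stub-value", "name": kwargs["name"]}

    def create_usage_plan_key(self, **kwargs):
        self.attached.append((kwargs["usagePlanId"], kwargs["keyId"]))
        return {"id": kwargs["keyId"], "type": kwargs["keyType"]}


stub = StubApiGateway()
key_id, key_value = onboard_client(stub, "ClientC-Free", "plan111")
print(key_id, stub.attached)
```

Passing the client in as an argument keeps the function easy to test and lets you decide separately which region and credentials it uses.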
Step 5: Test It
# Request WITHOUT an API key — should return 403 Forbidden
curl -s -o /dev/null -w "%{http_code}" \
https://abc123def4.execute-api.us-east-1.amazonaws.com/prod/items
# Output: 403
# Request WITH Client A's free-tier key
curl -H "x-api-key: aBcDeFgHiJkLmNoPqRsTuVwXyZ123456789" \
https://abc123def4.execute-api.us-east-1.amazonaws.com/prod/items
# Output: your API's response body (HTTP status 200)
# Exceeding the rate limit will return 429 Too Many Requests
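On the client side, a 429 is best handled by retrying with exponential backoff rather than immediately. The following is a minimal sketch of that pattern, not an official SDK feature: `call_with_backoff` and its parameters are hypothetical, and `send` stands for whatever function performs the HTTP request and returns a `(status_code, body)` tuple.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call `send()` until it returns a status other than 429.

    Waits base_delay * 2**attempt plus random jitter between attempts and
    gives up after `max_attempts`, returning the last response.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    return status, body


# Demonstration with a fake sender: throttled twice, then served.
responses = iter([(429, "Too Many Requests"), (429, "Too Many Requests"), (200, "ok")])
waits = []  # collect the delays instead of actually sleeping
result = call_with_backoff(lambda: next(responses), sleep=waits.append)
print(result, len(waits))
```

Backoff helps with rate throttling only. A client that has exhausted its quota will keep being rejected until the quota period resets, so retries should be capped.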
Monitoring Usage and Retrieving Metrics
Once usage plans are active, you need to monitor them. API Gateway tracks usage per API key per usage plan.
Check Usage via CLI
# Get usage data for the Free plan for a specific date range
aws apigateway get-usage \
--usage-plan-id plan111 \
--key-id key111aaa \
--start-date "2024-01-01" \
--end-date "2024-01-31"
The response’s items field maps each API key ID to a list of daily entries, one per day in the requested range, each of the form [requests_used, requests_remaining].
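As a sketch of how that structure can be processed, the snippet below flattens the per-key daily pairs into dated rows and picks out the days on which a key ran out of quota. The sample data is invented, the helper name `daily_usage` is hypothetical, and the code assumes the first pair corresponds to the requested start date.

```python
from datetime import date, timedelta


def daily_usage(items, start):
    """Flatten a get-usage `items` mapping into (key_id, day, used, remaining) rows.

    `items` maps each API key ID to a list of [used, remaining] pairs.
    Assumption: the first pair belongs to `start`, the next to the day after.
    """
    rows = []
    for key_id, days in items.items():
        for offset, (used, remaining) in enumerate(days):
            rows.append((key_id, start + timedelta(days=offset), used, remaining))
    return rows


# Hypothetical response fragment for one key over three days.
sample_items = {"key111aaa": [[120, 880], [0, 1000], [1000, 0]]}
rows = daily_usage(sample_items, date(2024, 1, 1))
exhausted = [day.isoformat() for _, day, _, remaining in rows if remaining == 0]
print(exhausted)
```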
CloudWatch Metrics
API Gateway automatically publishes metrics to Amazon CloudWatch. The most relevant ones for throttling:
- Count: Total API requests in a period
- 4XXError: Includes 403 (forbidden — missing/invalid key) and 429 (throttled)
- Latency / IntegrationLatency: Monitor if throttled clients are retrying aggressively
# Get the count of 4XX errors for your API stage
aws cloudwatch get-metric-statistics \
--namespace "AWS/ApiGateway" \
--metric-name "4XXError" \
--dimensions Name=ApiName,Value=MyThrottledAPI Name=Stage,Value=prod \
--start-time 2024-01-15T00:00:00Z \
--end-time 2024-01-16T00:00:00Z \
--period 3600 \
--statistics Sum
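The datapoints returned by this call are not guaranteed to arrive in time order, so it helps to sort them before looking for spikes. Below is a minimal sketch over a hand-written sample in the shape of the `get-metric-statistics` response; the helper name `hours_over_threshold` and the threshold value are hypothetical. Keep in mind that 4XXError aggregates all 4xx responses, so a spike may reflect 403s as well as 429s.

```python
from datetime import datetime, timezone


def hours_over_threshold(response, threshold):
    """Return (timestamp, sum) pairs whose hourly 4XX count exceeds `threshold`."""
    points = sorted(response.get("Datapoints", []), key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Sum"]) for p in points if p["Sum"] > threshold]


# Invented sample data, deliberately out of order.
sample = {
    "Label": "4XXError",
    "Datapoints": [
        {"Timestamp": datetime(2024, 1, 15, 2, tzinfo=timezone.utc), "Sum": 340.0, "Unit": "Count"},
        {"Timestamp": datetime(2024, 1, 15, 1, tzinfo=timezone.utc), "Sum": 12.0, "Unit": "Count"},
        {"Timestamp": datetime(2024, 1, 15, 3, tzinfo=timezone.utc), "Sum": 95.0, "Unit": "Count"},
    ],
}
spikes = hours_over_threshold(sample, threshold=50)
print([(ts.hour, total) for ts, total in spikes])
```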
Programmatic Usage Tracking with Python (Boto3)
import boto3
from datetime import datetime, timedelta, timezone

client = boto3.client('apigateway')

# Report on the last 7 days (UTC); get_usage expects YYYY-MM-DD strings
today = datetime.now(timezone.utc).date()
start = today - timedelta(days=7)

# Get all usage plans (first page only; use a paginator for large accounts)
plans = client.get_usage_plans()
for plan in plans['items']:
    print(f"Plan: {plan['name']} (ID: {plan['id']})")
    print(f"  Throttle: {plan.get('throttle', 'N/A')}")
    print(f"  Quota: {plan.get('quota', 'N/A')}")

    # Get keys associated with this plan
    keys = client.get_usage_plan_keys(usagePlanId=plan['id'])
    for key in keys['items']:
        print(f"  Key: {key['name']} (ID: {key['id']})")

        # Get usage for this key
        usage = client.get_usage(
            usagePlanId=plan['id'],
            keyId=key['id'],
            startDate=start.isoformat(),
            endDate=today.isoformat()
        )

        # 'items' maps each key ID to a list of daily [used, remaining] pairs;
        # the first pair is assumed to correspond to the start date
        for key_id, daily in usage.get('items', {}).items():
            for offset, (used, remaining) in enumerate(daily):
                day = start + timedelta(days=offset)
                print(f"    {day}: Used={used}, Remaining={remaining}")
Per-Method Throttle Overrides
Sometimes you need different rate limits for different endpoints within the same usage plan. For example, a GET /items endpoint might tolerate higher traffic than a POST /orders endpoint.
You can specify method-level throttle overrides when creating or updating a usage plan:
# Update the Free plan with per-method throttle overrides
aws apigateway update-usage-plan \
--usage-plan-id plan111 \
--patch-operations \
op=replace,path="/apiStages/abc123def4:prod/throttle/items/GET/burstLimit",value="20" \
op=replace,path="/apiStages/abc123def4:prod/throttle/items/GET/rateLimit",value="10" \
op=replace,path="/apiStages/abc123def4:prod/throttle/orders/POST/burstLimit",value="5" \
op=replace,path="/apiStages/abc123def4:prod/throttle/orders/POST/rateLimit",value="2"
The path format for method-level throttling is: /apiStages/{apiId}:{stageName}/throttle/{resourcePath}/{httpMethod}/rateLimit. Note that the resource path uses the actual path (without the leading slash) and slashes in nested paths are replaced with ~1. For example, a method at /v1/items would be v1~1items/GET.
Common Mistakes and How to Avoid Them
Mistake 1: Forgetting to Set apiKeyRequired on Methods
This is the #1 issue. You create usage plans and API keys, but requests still go through without a key. The fix: set apiKeyRequired: true on every method that should be gated, then redeploy.
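One way to catch this before it reaches production is to audit the API's methods. The sketch below works on the `items` list that boto3's `get_resources` returns when called with `embed=["methods"]`; the helper name `methods_missing_api_key` and the sample data are hypothetical. Skipping OPTIONS is an assumption made here because CORS preflight requests typically cannot carry an x-api-key header.

```python
def methods_missing_api_key(resources, ignore=("OPTIONS",)):
    """List (path, method) pairs whose apiKeyRequired flag is not set.

    `resources` is the "items" list from
    get_resources(restApiId=..., embed=["methods"]).
    """
    missing = []
    for resource in resources:
        for method, settings in resource.get("resourceMethods", {}).items():
            if method in ignore:
                continue
            if not settings.get("apiKeyRequired", False):
                missing.append((resource["path"], method))
    return sorted(missing)


# Invented sample in the shape of the get_resources response.
sample_resources = [
    {"id": "xyz789", "path": "/"},
    {"id": "res456", "path": "/items", "resourceMethods": {
        "GET": {"apiKeyRequired": True},
        "POST": {"apiKeyRequired": False},
        "OPTIONS": {"apiKeyRequired": False},
    }},
]
print(methods_missing_api_key(sample_resources))
```

Any method in the result still needs apiKeyRequired set and a redeploy before the usage plan applies to it.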