How to Manage Terraform State with S3-Compatible Remote Backends: Locking, Encryption, and Team Collaboration

Intermediate

Why Terraform State Management Matters More Than You Think

If you’ve been using Terraform for any length of time, you’ve encountered the terraform.tfstate file. It’s the single source of truth that maps your configuration to real-world infrastructure. Lose it, corrupt it, or let two engineers write to it simultaneously, and you’re in for a very bad day.

When you’re working solo on a side project, the default local state file works fine. But the moment a second person touches the same infrastructure — or you start running Terraform in CI/CD pipelines — local state becomes a liability. Here’s what can go wrong:

  • State conflicts: Two engineers run terraform apply at the same time, and one overwrites the other’s changes.
  • Data loss: The state file lives on someone’s laptop. Laptop dies, state is gone. Now Terraform doesn’t know what it manages.
  • Security exposure: State files contain sensitive data — database passwords, API keys, private IPs — all in plain text.
  • No audit trail: With local state, you have no versioning, no history, and no way to recover from a bad apply.

Remote backends solve all of these problems. In this guide, we’ll set up an S3-compatible remote backend with DynamoDB-based state locking, server-side encryption, and a configuration that supports real team collaboration. Every command and configuration here is verified against Terraform 1.x (the current stable major version).

Understanding Terraform State: The Basics

Before we jump into remote backends, let’s make sure we’re clear on what state actually does. When you run terraform apply, Terraform needs to know:

  • What resources it currently manages
  • The attributes of those resources (IDs, ARNs, IP addresses)
  • The mapping between your .tf configuration and real cloud resources
  • Dependency relationships between resources

All of this lives in the state file. Without it, Terraform would have no idea what’s already deployed. It would try to create everything from scratch on every run — or worse, it simply wouldn’t be able to manage existing resources at all.
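You can inspect state yourself at any point. A quick sketch using Terraform's built-in state commands (the resource address is a placeholder; substitute one from your own configuration):

# List every resource Terraform is tracking in this state
terraform state list

# Show the recorded attributes of a single resource
terraform state show aws_s3_bucket.example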

Local vs. Remote State

Feature               | Local State              | Remote State (S3 Backend)
--------------------- | ------------------------ | --------------------------
Storage location      | Developer's machine      | S3 bucket (centralized)
Team collaboration    | Not practical            | Built-in support
State locking         | None                     | DynamoDB-based locking
Encryption at rest    | No (plain JSON on disk)  | Yes (SSE-S3, SSE-KMS)
Versioning / recovery | Manual backups only      | S3 versioning
CI/CD friendly        | No                       | Yes

Step 1: Create the S3 Bucket and DynamoDB Table

We need two AWS resources before configuring the backend: an S3 bucket for storing state files and a DynamoDB table for state locking. There’s a bit of a chicken-and-egg problem here — we need these resources before Terraform can use them as a backend. The common approach is to create them with a separate Terraform configuration (or via the AWS CLI), and manage them independently.
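If you'd rather bootstrap with Terraform itself, a minimal sketch might look like the following (resource names are placeholders, and this bootstrap configuration would itself use plain local state):

# bootstrap/main.tf — managed separately from your main configurations
resource "aws_s3_bucket" "tf_state" {
  bucket = "my-terraform-state-123456789012" # must be globally unique
}

resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_dynamodb_table" "tf_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}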

Let’s create them using the AWS CLI first, so you understand exactly what’s being provisioned:

Create the S3 Bucket

# Choose a globally unique bucket name
BUCKET_NAME="my-terraform-state-$(aws sts get-caller-identity --query Account --output text)"
AWS_REGION="us-east-1"

# Create the bucket
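# (Regions other than us-east-1 also require: --create-bucket-configuration LocationConstraint=$AWS_REGION)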
aws s3api create-bucket \
  --bucket "$BUCKET_NAME" \
  --region "$AWS_REGION"

# Enable versioning — this is critical for state recovery
aws s3api put-bucket-versioning \
  --bucket "$BUCKET_NAME" \
  --versioning-configuration Status=Enabled

# Enable server-side encryption by default
aws s3api put-bucket-encryption \
  --bucket "$BUCKET_NAME" \
  --server-side-encryption-configuration '{
    "Rules": [
      {
        "ApplyServerSideEncryptionByDefault": {
          "SSEAlgorithm": "aws:kms"
        },
        "BucketKeyEnabled": true
      }
    ]
  }'

# Block all public access — state files should never be public
aws s3api put-public-access-block \
  --bucket "$BUCKET_NAME" \
  --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

Why versioning? If a terraform apply corrupts your state, you can roll back to a previous version. This has saved countless teams from disaster. Don’t skip this step.
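Recovery itself is just a couple of s3api calls. A sketch, assuming the environments/dev/terraform.tfstate key we'll configure below (VERSION_ID is a placeholder you'd copy from the list output):

# List all stored versions of the state object
aws s3api list-object-versions \
  --bucket "$BUCKET_NAME" \
  --prefix "environments/dev/terraform.tfstate"

# Download a known-good earlier version for inspection or restore
aws s3api get-object \
  --bucket "$BUCKET_NAME" \
  --key "environments/dev/terraform.tfstate" \
  --version-id "VERSION_ID" \
  terraform.tfstate.recovered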

Create the DynamoDB Table for State Locking

aws dynamodb create-table \
  --table-name terraform-state-lock \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --region "$AWS_REGION"

The table needs exactly one attribute: LockID of type S (String). Terraform uses this to create and check locks. We use PAY_PER_REQUEST billing because lock operations are infrequent — you’ll pay fractions of a cent per month.

Verify both resources exist:

# Check the bucket
aws s3api head-bucket --bucket "$BUCKET_NAME" && echo "Bucket exists"

# Check the DynamoDB table
aws dynamodb describe-table --table-name terraform-state-lock --query "Table.TableStatus" --output text
# Expected output: ACTIVE

Step 2: Configure the S3 Backend in Terraform

Now let’s configure Terraform to use these resources. Create a new directory for your project and add a backend configuration:

mkdir terraform-remote-state-demo && cd terraform-remote-state-demo

Create a file called main.tf:

terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  backend "s3" {
    bucket         = "my-terraform-state-123456789012"
    key            = "environments/dev/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock"
    encrypt        = true
  }
}

provider "aws" {
  region = "us-east-1"
}

# A simple resource to test the setup
resource "aws_s3_bucket" "example" {
  bucket_prefix = "demo-remote-state-"

  tags = {
    Environment = "dev"
    ManagedBy   = "terraform"
  }
}

Let me explain each backend argument:

  • bucket — The S3 bucket name where state will be stored. Replace with your actual bucket name.
  • key — The path within the bucket for this specific state file. This is how you organize multiple projects/environments in a single bucket.
  • region — The AWS region where your bucket lives.
  • dynamodb_table — The DynamoDB table for state locking.
  • encrypt — Ensures state is encrypted at rest using the bucket’s default encryption.
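One related pattern worth knowing: backend blocks can't reference variables, so teams often use Terraform's partial backend configuration and keep per-environment values in a small file passed at init time. A sketch, assuming a file named dev.s3.tfbackend alongside a backend "s3" {} block left partially empty:

# dev.s3.tfbackend — per-environment backend settings
bucket         = "my-terraform-state-123456789012"
key            = "environments/dev/terraform.tfstate"
region         = "us-east-1"
dynamodb_table = "terraform-state-lock"
encrypt        = true

# Initialize with the environment-specific settings
terraform init -backend-config=dev.s3.tfbackend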

Initialize and Apply

# Initialize the backend
terraform init

You should see output like this:

Initializing the backend...

Successfully configured the backend "s3"! Terraform will automatically
use this backend unless the backend configuration changes.

Now run a plan and apply:

terraform plan
terraform apply

After the apply completes, verify your state is stored remotely:

aws s3 ls "s3://$BUCKET_NAME/environments/dev/"
# Expected output: You should see terraform.tfstate listed
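You can also pull the raw state to see exactly what's stored, and to see firsthand why access to it must be locked down (it can contain secrets in plain text):

# Print the remote state JSON to stdout
terraform state pull | head -n 20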

Step 3: Understanding State Locking in Action

State locking prevents concurrent operations from corrupting your state. Here’s exactly what happens when you run terraform apply with DynamoDB locking configured:

  1. Terraform writes a lock entry to the DynamoDB table with a unique LockID
  2. If another process tries to run at the same time, it sees the lock and fails with a clear error message
  3. When the operation completes (or fails), Terraform releases the lock

You can see this yourself. Open two terminal windows, navigate to the same Terraform directory, and run terraform apply in both simultaneously. The second one will output something like:

Error: Error acquiring the state lock

Error message: ConditionalCheckFailedException: The conditional request failed
Lock Info:
  ID:        a1b2c3d4-e5f6-7890-abcd-ef1234567890
  Path:      my-terraform-state-123456789012/environments/dev/terraform.tfstate
  Operation: OperationTypeApply
  Who:       engineer@workstation
  Version:   1.9.x
  Created:   2024-01-15 10:30:00.000000000 +0000 UTC

This is exactly what you want. The lock prevented a potentially destructive concurrent write.
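If you're curious, you can inspect the lock entry directly while an operation is running. A quick sketch (the table may also contain state checksum items, so don't be surprised by extra rows):

# Dump the lock table's contents while an apply is in progress
aws dynamodb scan \
  --table-name terraform-state-lock \
  --output json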

Force-Unlocking State (Use With Extreme Caution)

Sometimes a lock gets stuck — for example, if a CI/CD runner crashes mid-apply. You can manually release it:

# Only use this if you're SURE no other operation is running
terraform force-unlock LOCK_ID

Replace LOCK_ID with the ID shown in the error message. Never force-unlock if another operation might genuinely be in progress. You’ll end up with the exact state corruption that locking was designed to prevent.

Step 4: Encryption — What’s Actually Happening

With encrypt = true in the backend configuration and KMS default encryption on the bucket, your state is encrypted at rest. But let’s be precise about what this covers and what it doesn’t:

Protection Layer          | What It Does                                                 | Configuration
------------------------- | ------------------------------------------------------------ | ------------------
encrypt = true            | Sends the x-amz-server-side-encryption header with every PUT | Backend config
Bucket default encryption | Encrypts objects at rest using SSE-S3 or SSE-KMS             | S3 bucket setting
HTTPS (TLS)               | Encrypts state in transit between your machine and S3        | Enabled by default
IAM policies              | Control who can read/write the state file                    | IAM configuration
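That last layer deserves an example. Here's a least-privilege IAM policy sketch for a Terraform execution role, based on the permissions HashiCorp documents for the S3 backend (the ARNs are placeholders matching the names used in this guide):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListStateBucket",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-terraform-state-123456789012"
    },
    {
      "Sid": "ReadWriteStateObject",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::my-terraform-state-123456789012/environments/dev/terraform.tfstate"
    },
    {
      "Sid": "UseLockTable",
      "Effect": "Allow",
      "Action": [
        "dynamodb:DescribeTable",
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:DeleteItem"
      ],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/terraform-state-lock"
    }
  ]
}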

Using a Custom KMS Key

For stricter control, you can use a customer-managed KMS key. This lets you define exactly who can decrypt the state via KMS key policies:

terraform {
  backend "s3" {
    bucket         = "my-terraform-state-123456789012"
    key            = "environments/dev/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock"
    encrypt        = true
    kms_key_id     = "arn:aws:kms:us-east-1:123456789012:key/your-key-id-here"
  }
}

Important: The kms_key_id here should be the full ARN or key ID of a KMS key that the Terraform execution role has access to. If you lose access to this KMS key, you lose access to your state file.

Step 5: Organizing State for Team Collaboration

When multiple teams or environments share infrastructure, how you organize your state files matters enormously. Here are two common approaches:

Option A: Key-Based Separation (Single Bucket)

s3://my-terraform-state/
├── networking/
│   ├── dev/terraform.tfstate
│   ├── staging/terraform.tfstate
│   └── prod/terraform.tfstate
├── compute/
│   ├── dev/terraform.tfstate
│   ├── staging/terraform.tfstate
│   └── prod/terraform.tfstate
└── databases/
    ├── dev/terraform.tfstate
    ├── staging/terraform.tfstate
    └── prod/terraform.tfstate

Each component and environment gets its own state file, scoped by the key in the backend configuration. This keeps blast radius small — a bad apply to the networking stack doesn’t risk corrupting your database state.
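Separate state files can still share data. One stack can read another's outputs through the terraform_remote_state data source; a sketch, assuming the networking stack exports a vpc_id output:

data "terraform_remote_state" "networking" {
  backend = "s3"

  config = {
    bucket = "my-terraform-state"
    key    = "networking/dev/terraform.tfstate"
    region = "us-east-1"
  }
}

# Then reference it, e.g.:
#   vpc_id = data.terraform_remote_state.networking.outputs.vpc_id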

Option B: Using Workspaces (Be Careful)

Terraform workspaces let you maintain multiple state files for the same configuration. They’re useful for creating identical environments:

# Create and switch to a new workspace
terraform workspace new staging
terraform workspace new prod

# List workspaces
terraform workspace list

# Switch between them
terraform workspace select staging

When using workspaces with an S3 backend, Terraform stores each non-default workspace's state in the same bucket under an env:/ prefix (a workspace named staging ends up at env:/staging/<key>), and every workspace shares the same lock table. The caution: because all workspaces run from a single configuration, it's easy to apply to the wrong environment, so always check terraform workspace show before you apply.
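You can confirm where each workspace's state lives (assuming the bucket from earlier):

# Non-default workspace states live under the env:/ prefix
aws s3 ls "s3://$BUCKET_NAME/env:/" --recursive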
