Terraform State Management: Remote State and Locking

Introduction

Terraform state is the source of truth for what infrastructure Terraform manages. Mismanaging state causes drift, duplicate resources, or destroyed infrastructure. Remote state with locking is mandatory for team environments.

Why State Matters

Terraform state maps your configuration to real infrastructure resources. Without it, Terraform cannot determine what exists and what needs to change.

# State stores:
# - Resource IDs (e.g., i-0a1b2c3d4e5f for an EC2 instance)
# - Resource attributes
# - Dependency graph
# - Metadata (Terraform version, serial, lineage)

# View current state
terraform state list
terraform state show aws_instance.web_server

# Inspect raw state (pulled from the backend, so this also works with remote state)
terraform state pull | jq -r '.resources[].instances[].attributes.id'

Never commit terraform.tfstate to git. State stores resource attributes in plain text, including sensitive values such as passwords and private keys.
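
One way to enforce this is an ignore rule. The sketch below appends commonly used Terraform patterns to .gitignore if they are not already present. The pattern list is a widespread convention rather than anything the article prescribes, and may need adjusting for a given repository.

```shell
# Append Terraform ignore patterns to .gitignore, skipping any already present.
# The pattern list is a common convention, not an official requirement.
for pattern in '*.tfstate' '*.tfstate.*' '.terraform/'; do
  if ! grep -qxF -- "$pattern" .gitignore 2>/dev/null; then
    printf '%s\n' "$pattern" >> .gitignore
  fi
done
```

The loop is idempotent, so it can run from a repository setup script without duplicating entries.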

Remote State Configuration

# S3 backend with DynamoDB locking
terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "production/vpc/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    kms_key_id     = "arn:aws:kms:us-east-1:123456789:key/abc-def"

    # DynamoDB table for state locking
    # (Terraform 1.10+ can instead lock natively in S3 via use_lockfile)
    dynamodb_table = "terraform-state-lock"
  }
}

The bucket and lock table must exist before terraform init can use this backend:

# Create the S3 bucket and DynamoDB table (bootstrap)
aws s3api create-bucket --bucket my-terraform-state-bucket --region us-east-1
aws s3api put-bucket-versioning \
  --bucket my-terraform-state-bucket \
  --versioning-configuration Status=Enabled
aws s3api put-bucket-encryption \
  --bucket my-terraform-state-bucket \
  --server-side-encryption-configuration \
  '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"aws:kms"}}]}'

aws dynamodb create-table \
  --table-name terraform-state-lock \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST
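
Before pointing a backend at the new bucket, it can be worth confirming that versioning actually took effect. A minimal sketch, assuming the bucket name used above; `check_versioning` is a hypothetical helper name, not part of the AWS CLI.

```shell
# Hypothetical helper: succeed only if versioning is enabled on the bucket.
check_versioning() {
  bucket="$1"
  status=$(aws s3api get-bucket-versioning \
    --bucket "$bucket" \
    --query Status \
    --output text)
  if [ "$status" = "Enabled" ]; then
    echo "versioning enabled on $bucket"
  else
    echo "versioning NOT enabled on $bucket (status: $status)" >&2
    return 1
  fi
}

# Usage (requires AWS credentials):
#   check_versioning my-terraform-state-bucket
```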

State Locking

Any operation that can write state, such as terraform plan, apply, or destroy, first acquires a lock on the state. A concurrent run fails with a lock error instead of corrupting state.

# List current locks
aws dynamodb scan --table-name terraform-state-lock

# Force unlock if a lock is stuck (e.g., apply was killed).
# LOCK-ID is the ID printed in the lock error message.
terraform force-unlock LOCK-ID

# Lock info (printed with the lock error)
# {
#   "ID": "abc-123",
#   "Operation": "OperationTypeApply",
#   "Who": "alice@hostname",
#   "Created": "2025-03-03T10:00:00Z",
#   "Path": "production/vpc/terraform.tfstate"
# }
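
By default, a run that cannot acquire the lock fails on the first attempt. Terraform's -lock-timeout option makes it keep retrying for a given duration, which can suit CI pipelines where runs queue behind each other. A sketch under that assumption; `apply_with_lock_wait` and the 5m default are illustrative choices.

```shell
# Hypothetical wrapper: wait up to a given duration for the state lock
# instead of failing on the first attempt. Extra arguments pass through.
apply_with_lock_wait() {
  timeout="${1:-5m}"
  if [ "$#" -gt 0 ]; then shift; fi
  terraform apply -lock-timeout="$timeout" "$@"
}

# Usage:
#   apply_with_lock_wait 2m -auto-approve
```

Waiting is usually preferable to force-unlock, which should be reserved for locks whose owning process is known to be dead.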

State Organization Strategies

By Environment and Component

terraform/
├── environments/
│   ├── production/
│   │   ├── vpc/
│   │   │   ├── main.tf
│   │   │   └── backend.tf  # key: "production/vpc/terraform.tfstate"
│   │   ├── eks/
│   │   │   └── backend.tf  # key: "production/eks/terraform.tfstate"
│   │   └── rds/
│   └── staging/
└── modules/
    ├── vpc/
    └── eks/

Separate state per component limits the blast radius of a failed apply. A broken EKS apply does not lock the VPC state.
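
The key convention in the tree above can be generated rather than hand-written. The sketch below uses Terraform's partial backend configuration (-backend-config) and assumes the backend block in each directory leaves `key` unset; `state_key` and `init_component` are hypothetical helper names.

```shell
# Hypothetical helper: build the state key for an environment/component pair.
state_key() {
  printf '%s/%s/terraform.tfstate\n' "$1" "$2"
}

# Hypothetical helper: initialise a component directory with its derived key.
# Assumes the backend "s3" block in that directory does not set `key` itself.
init_component() {
  environment="$1"
  component="$2"
  terraform -chdir="environments/$environment/$component" init \
    -backend-config="key=$(state_key "$environment" "$component")"
}

# Usage:
#   init_component production vpc
```

Deriving the key from the directory path removes one way for two components to end up writing to the same state object.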

Remote State Data Sources

The terraform_remote_state data source reads the root-module outputs of another state, so one component can consume another's values without duplicating configuration.

# In eks/main.tf: read VPC outputs from vpc state
data "terraform_remote_state" "vpc" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state-bucket"
    key    = "production/vpc/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_eks_cluster" "main" {
  name = "production"
  # role_arn and other required arguments omitted for brevity
  vpc_config {
    subnet_ids = data.terraform_remote_state.vpc.outputs.private_subnet_ids
  }
}

The VPC component has to declare the value as an output for it to be readable:

# In vpc/outputs.tf
output "private_subnet_ids" {
  value = aws_subnet.private[*].id
}
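
When a terraform_remote_state reference fails, a quick check is whether the producing component actually exports the output. A sketch using terraform output; `show_output` is a hypothetical helper, and the directory path assumes the layout shown earlier.

```shell
# Hypothetical helper: print one output of a component as JSON.
# Fails if the component's state has no output with that name.
show_output() {
  component_dir="$1"
  output_name="$2"
  terraform -chdir="$component_dir" output -json "$output_name"
}

# Usage:
#   show_output environments/production/vpc private_subnet_ids
```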

State Manipulation Commands

# Move a resource to a new address (refactoring)
terraform state mv aws_instance.old_name aws_instance.new_name

# Remove a resource from state without destroying it
# (for resources you want Terraform to stop managing)
terraform state rm aws_instance.imported

# Import an existing resource into state
terraform import aws_instance.web_server i-0a1b2c3d4e5f6

# Pull remote state to local file (for debugging)
terraform state pull > current.tfstate

# Push local state back (use with extreme caution)
terraform state push current.tfstate
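
Because these commands rewrite state directly, a common precaution is to take a timestamped copy first. A sketch; `backup_state` and the file naming are assumptions. The backup contains the same sensitive values as the state itself, so it needs the same care.

```shell
# Hypothetical helper: save a timestamped copy of the current state
# before running state mv / rm / push. Prints the backup file name.
backup_state() {
  backup_file="state-backup-$(date +%Y%m%d-%H%M%S).tfstate"
  # Remove the file again if the pull fails, so no empty backup is left behind.
  if terraform state pull > "$backup_file"; then
    echo "$backup_file"
  else
    rm -f "$backup_file"
    return 1
  fi
}

# Usage:
#   backup_state && terraform state mv aws_instance.old_name aws_instance.new_name
```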

Workspaces vs Separate State Files

# Workspaces: multiple state files in the same backend
terraform workspace new staging
terraform workspace select production

# With the S3 backend, non-default workspace state is stored at:
#   env:/staging/<key>   (prefix configurable via workspace_key_prefix)
# Problem: workspaces share the same codebase and backend, so isolation between prod and staging is poor
# Prefer separate directories over workspaces for environment separation

Workspaces are appropriate for short-lived feature environments. Separate directories are better for long-lived environments like staging/production.
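
Where workspaces are used, a guard against applying to the wrong one can help, since the selected workspace is implicit on the operator's machine. A sketch built on terraform workspace show; `require_workspace` is a hypothetical helper.

```shell
# Hypothetical guard: fail unless the currently selected workspace
# matches the expected name.
require_workspace() {
  expected="$1"
  current=$(terraform workspace show)
  if [ "$current" != "$expected" ]; then
    echo "refusing to continue: workspace is '$current', expected '$expected'" >&2
    return 1
  fi
}

# Usage:
#   require_workspace staging && terraform apply
```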

Conclusion

Always use remote state with locking for team environments. Organize state by environment and component to minimize blast radius. Use terraform_remote_state data sources to share outputs between components. Never edit state files by hand; use terraform state mv, terraform state rm, or terraform import instead.
