Mirroring Public Container Images to ECR with Regsync
January 24, 2026
I’ve been burned by public registries before. There’s nothing quite like watching a production Kubernetes deployment fail because Docker Hub is rate limiting your cluster, or worse, experiencing an outage at the exact moment you need to scale up. After one too many incidents where my deployments were at the mercy of systems I couldn’t control, I decided to take a different approach.
The Problem
In enterprise environments, relying on public container registries for production workloads is a risk that’s easy to overlook until it bites you. The issues are predictable but often ignored:
Availability Dependencies: When you deploy a pod in Kubernetes and the image needs to be pulled from Docker Hub, you’re now dependent on Docker Hub being available. If it’s down, your pod doesn’t start. Your autoscaling doesn’t work. Your disaster recovery fails at the worst possible moment.
Rate Limits: Docker Hub introduced rate limits, and suddenly clusters that were pulling images frequently started hitting walls. Anonymous pulls are limited to 100 per six hours per IP address, and even authenticated free accounts only get 200. In a large cluster where many nodes share a NAT gateway's IP, that limit gets exhausted faster than you'd expect.
Airgapped Environments: Many enterprise and government environments operate in airgapped or restricted networks. These environments simply can’t reach public registries, making local mirrors a hard requirement rather than a nice-to-have.
Compliance and Security: Having a known, controlled set of images that have been scanned and approved is a security requirement for many organizations. Pulling directly from public registries means you’re trusting whatever is there at pull time.
The Solution: Regsync with ECR
The answer is to mirror public images to a private registry you control. AWS ECR is a natural choice if you’re already in the AWS ecosystem. While AWS offers pull-through cache as a managed solution, I found that regsync gives me more control over exactly what gets mirrored and when.
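To make the payoff concrete: once an image is mirrored, workloads stop referencing the public registry entirely. A before/after sketch of a pod spec fragment (the account ID and region below are placeholders):

```yaml
# Before: implicit Docker Hub pull, subject to rate limits and outages
# containers:
#   - name: nginx
#     image: nginx:1.25
#
# After: pull from the private mirror (account ID and region are placeholders)
containers:
  - name: nginx
    image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/mirror/docker-hub/nginx:1.25
```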
The approach I settled on has three components:
- A master list of images that need to be mirrored
- Terraform to pre-create ECR repositories with appropriate lifecycle policies
- A pipeline that runs regsync to sync source images to their ECR targets
Component 1: The Master Image List
I maintain a YAML file that defines every image we need mirrored. This becomes the single source of truth:
```yaml
# images.yaml
images:
  - source: docker.io/library/nginx
    tags: ["1.25", "1.24", "latest"]
  - source: docker.io/library/redis
    tags: ["7.2", "7.0", "6.2"]
  - source: quay.io/prometheus/prometheus
    tags: ["v2.48.0", "v2.47.0"]
  - source: gcr.io/distroless/static-debian12
    tags: ["nonroot", "latest"]
  - source: public.ecr.aws/docker/library/postgres
    tags: ["16", "15", "14"]
```
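Because everything downstream is generated from this file, a malformed entry propagates everywhere. A small sanity check keeps the list trustworthy as it grows; this is a sketch of my own (the function name and rules are illustrative, not from any tooling described here):

```python
# Sketch of a sanity check for images.yaml entries.
# The structure (source + tags) mirrors the master list above;
# the function name and validation rules are illustrative.

def validate_images(images: list[dict]) -> None:
    """Raise ValueError if any entry lacks a registry-qualified source or tags."""
    for entry in images:
        source = entry.get("source", "")
        tags = entry.get("tags", [])
        registry = source.split("/", 1)[0]
        # Require an explicit registry host (docker.io, quay.io, gcr.io, ...)
        # so implicit Docker Hub defaults never sneak into the list.
        if "/" not in source or "." not in registry:
            raise ValueError(f"source must be registry-qualified: {source!r}")
        # Require explicit tags; mirroring everything defeats the point of the list.
        if not tags:
            raise ValueError(f"no tags listed for {source}")

images = [
    {"source": "docker.io/library/nginx", "tags": ["1.25", "1.24", "latest"]},
    {"source": "quay.io/prometheus/prometheus", "tags": ["v2.48.0", "v2.47.0"]},
]
validate_images(images)  # passes silently
```

Running it in CI on every change to images.yaml catches bare image names like `nginx` before they reach Terraform or regsync.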
Component 2: Terraform for ECR Repositories
Before syncing, the ECR repositories need to exist. Terraform handles this, creating repositories from the master list with consistent lifecycle policies:
```hcl
variable "mirrored_images" {
  description = "List of images to mirror"
  type = list(object({
    source = string
    tags   = list(string)
  }))
}

locals {
  # Convert source paths to ECR repository names, e.g.
  #   docker.io/library/nginx -> mirror/docker-hub/nginx
  # The gcr.io and public.ecr.aws mappings cover the remaining
  # sources in the master list (the mirror path names are a convention).
  ecr_repos = {
    for img in var.mirrored_images :
    replace(replace(replace(replace(img.source,
      "docker.io/library/", "mirror/docker-hub/"),
      "quay.io/", "mirror/quay/"),
      "gcr.io/", "mirror/gcr/"),
      "public.ecr.aws/", "mirror/ecr-public/") => img
  }
}

resource "aws_ecr_repository" "mirror" {
  for_each = local.ecr_repos

  name = each.key

  # Prevent tag overwrites. Note this also means a moving tag like
  # "latest" cannot be re-pushed once it exists in the mirror.
  image_tag_mutability = "IMMUTABLE"

  image_scanning_configuration {
    scan_on_push = true
  }

  encryption_configuration {
    encryption_type = "KMS"
  }

  tags = {
    Purpose   = "image-mirror"
    ManagedBy = "terraform"
  }
}

resource "aws_ecr_lifecycle_policy" "mirror" {
  for_each   = aws_ecr_repository.mirror
  repository = each.value.name

  policy = jsonencode({
    rules = [
      {
        rulePriority = 1
        description  = "Keep last 10 images per tag prefix"
        selection = {
          tagStatus     = "tagged"
          tagPrefixList = ["v", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
          countType     = "imageCountMoreThan"
          countNumber   = 10
        }
        action = {
          type = "expire"
        }
      },
      {
        rulePriority = 2
        description  = "Remove untagged images after 7 days"
        selection = {
          tagStatus   = "untagged"
          countType   = "sinceImagePushed"
          countUnit   = "days"
          countNumber = 7
        }
        action = {
          type = "expire"
        }
      }
    ]
  })
}
```
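To wire the master list into `var.mirrored_images` without duplicating it in a tfvars file, the YAML can be decoded directly. A sketch (the relative file path is an assumption about repo layout):

```hcl
# Decode the master list so Terraform and regsync share one source of truth.
# The file path here is an assumption about where images.yaml lives.
locals {
  image_list = yamldecode(file("${path.module}/images.yaml")).images
}
```

`local.image_list` can then be passed wherever `var.mirrored_images` is expected, so the YAML file remains the single source of truth.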
Component 3: Regsync Pipeline
With repositories created, regsync handles the actual synchronization. I generate the regsync config from the same master list:
```yaml
# regsync.yaml
version: 1
creds:
  - registry: docker.io
    user: '{{env "DOCKER_USER"}}'
    pass: '{{env "DOCKER_TOKEN"}}'
  - registry: '{{env "AWS_ACCOUNT_ID"}}.dkr.ecr.{{env "AWS_REGION"}}.amazonaws.com'
    credHelper: docker-credential-ecr-login
defaults:
  ratelimit:
    min: 100    # pause when fewer than 100 pulls remain on the rate limit
  parallel: 4
sync:
  - source: docker.io/library/nginx
    target: '{{env "AWS_ACCOUNT_ID"}}.dkr.ecr.{{env "AWS_REGION"}}.amazonaws.com/mirror/docker-hub/nginx'
    type: repository
    tags:
      allow:
        - "1.25"
        - "1.24"
        - "latest"
  - source: quay.io/prometheus/prometheus
    target: '{{env "AWS_ACCOUNT_ID"}}.dkr.ecr.{{env "AWS_REGION"}}.amazonaws.com/mirror/quay/prometheus/prometheus'
    type: repository
    tags:
      allow:
        - "v2.48.0"
        - "v2.47.0"
```
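The generator itself isn't shown above, so here is a minimal sketch of the interesting part, the source-to-target mapping. The prefix table and function names are mine, chosen to match the naming convention in the Terraform locals:

```python
# Sketch of a regsync.yaml generator: map each source image to its ECR
# mirror path using the same prefix convention as the Terraform locals.
# The prefix table and function names are illustrative, not from the post.

ECR_REGISTRY = '{{env "AWS_ACCOUNT_ID"}}.dkr.ecr.{{env "AWS_REGION"}}.amazonaws.com'

PREFIX_MAP = {
    "docker.io/library/": "mirror/docker-hub/",
    "quay.io/": "mirror/quay/",
    "gcr.io/": "mirror/gcr/",                  # assumed prefix convention
    "public.ecr.aws/": "mirror/ecr-public/",   # assumed prefix convention
}

def ecr_repo_for(source: str) -> str:
    """Translate a public image path into its mirror repository name."""
    for prefix, target in PREFIX_MAP.items():
        if source.startswith(prefix):
            return target + source[len(prefix):]
    raise ValueError(f"no mirror prefix configured for {source!r}")

def sync_entry(source: str, tags: list[str]) -> dict:
    """Build one entry for the `sync:` list in regsync.yaml."""
    return {
        "source": source,
        "target": f"{ECR_REGISTRY}/{ecr_repo_for(source)}",
        "type": "repository",
        "tags": {"allow": list(tags)},
    }
```

Dumping these entries alongside the static `version`, `creds`, and `defaults` sections (e.g. with PyYAML's `yaml.safe_dump`) produces the config above; regenerating it on every pipeline run keeps it from drifting out of sync with images.yaml.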
The pipeline itself runs on a schedule:
```yaml
# .github/workflows/image-sync.yaml
name: Sync Container Images

on:
  schedule:
    - cron: '0 */6 * * *'  # Every 6 hours
  workflow_dispatch:

jobs:
  sync:
    runs-on: self-hosted
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/image-sync-role
          aws-region: us-east-1

      - name: Login to ECR
        uses: aws-actions/amazon-ecr-login@v2

      - name: Install regsync
        run: |
          curl -L https://github.com/regclient/regclient/releases/latest/download/regsync-linux-amd64 -o regsync
          chmod +x regsync

      - name: Run sync
        env:
          DOCKER_USER: ${{ secrets.DOCKER_USER }}
          DOCKER_TOKEN: ${{ secrets.DOCKER_TOKEN }}
          AWS_ACCOUNT_ID: ${{ secrets.AWS_ACCOUNT_ID }}
          AWS_REGION: us-east-1
        run: ./regsync once --config regsync.yaml
```

A note on the last step: regsync's one-shot mode is `regsync once`, which processes every sync entry and exits, which is what you want in a scheduled pipeline; `regsync check` does the same comparison without copying anything, handy for a dry run.
Why Regsync Over Pull-Through Cache
AWS ECR pull-through cache is convenient, but regsync offers advantages for my use case:
Explicit Control: I know exactly what images are in my registry. With pull-through cache, images get cached on first pull, which means you might not have what you need cached when an outage hits.
Pre-warming: Regsync runs on a schedule, ensuring images are always available before they’re needed. Pull-through cache is reactive.
Multi-source Support: Regsync can pull from Docker Hub, Quay, GCR, GitHub Container Registry, and more. Pull-through cache has a more limited set of supported upstream registries.
Tag Filtering: I can explicitly control which tags get mirrored rather than caching everything that gets pulled.
Key Learnings
- Public registry dependencies are a production risk - Outages and rate limits will eventually affect your deployments
- A master image list as a single source of truth - Makes both Terraform and regsync configuration consistent and auditable
- Immutable tags in ECR prevent unexpected changes - Combined with lifecycle policies, you get both stability and cost control
- Scheduled syncing beats reactive caching - Images are available before you need them, not when you first request them
- Regsync handles multi-registry sources well - One tool to mirror from Docker Hub, Quay, GCR, and other registries
- The pipeline approach integrates with existing CI/CD - Standard GitHub Actions workflow that fits into existing patterns
- Pre-creating ECR repos with Terraform ensures consistency - Lifecycle policies, scanning, and encryption are configured uniformly
After implementing this, I no longer worry about Docker Hub outages during critical deployments. The images my clusters need are already waiting in ECR, scanned, and ready to pull at line speed within AWS. It’s one less external dependency to worry about.