Scale-to-Zero
Lambda-based auto-scaler that scales Tendril ECS services to zero when idle.
Cloud-hosted Tendrils support automatic scale-to-zero. When no jobs have been queued for 5 consecutive checks (about 5 minutes), the Lambda scaler scales ECS services down to 0 desired tasks. When a job arrives, it scales back up to 1. This eliminates Fargate costs during idle periods.
How It Works
A Lambda function runs on a 1-minute schedule via EventBridge. It checks the Trellis database for queued jobs and adjusts ECS service desired counts accordingly.
Scale-Up Logic
Every 1 minute:
1. Query Supabase: SELECT count(*) FROM provision_jobs WHERE status = 'QUEUED'
2. For each registered Tendril ECS service:
   - If queued > 0 AND current desired = 0:
     → Scale UP to 1
     → Reset idle counter
Scale-Down Logic
3. For each registered Tendril ECS service:
   - If queued = 0 AND current desired > 0:
     → Increment idle counter
     → If idle counter >= 5 (5 consecutive checks = 5 minutes):
       → Scale DOWN to 0
   - If queued > 0:
     → Reset idle counter
The 5-check threshold (5 minutes) prevents flapping: a brief gap between jobs won't cause a scale-down followed by an immediate scale-up.
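The scale-up and scale-down rules above can be condensed into one pure decision function. This is a minimal sketch, not the scaler's actual source: the name `decide`, its signature, and the choice to pass the idle counter in and out are assumptions, and how the counter is persisted between Lambda invocations is not specified here.

```python
IDLE_THRESHOLD = 5  # consecutive idle checks before scaling down (from the rules above)


def decide(queued: int, desired: int, idle_checks: int) -> tuple[int, int]:
    """Return (new_desired, new_idle_checks) for one service on one check."""
    if queued > 0:
        # Work is waiting: reset the idle counter and make sure one task runs.
        return (1 if desired == 0 else desired), 0
    if desired > 0:
        idle_checks += 1
        if idle_checks >= IDLE_THRESHOLD:
            # Idle long enough: scale to zero.
            # Resetting the counter at this point is an assumption.
            return 0, 0
        return desired, idle_checks
    # Already at zero with nothing queued: nothing to do.
    return 0, idle_checks
```

Keeping the decision free of AWS and database calls makes the flapping guard easy to unit-test: four idle checks leave the service running, and the fifth scales it down.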
Infrastructure
Lambda Function
| Property | Value |
|---|---|
| Runtime | Python 3.12 |
| Architecture | ARM64 |
| Memory | 128 MB |
| Timeout | 30 seconds |
| Trigger | EventBridge rule (every 1 minute) |
IAM Permissions
The Lambda function needs:
- ecs:DescribeServices, to read the current desired count
- ecs:UpdateService, to change the desired count
- Network access to Supabase REST API (for job count query)
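The two ECS permissions correspond to two API calls. The sketch below assumes the scaler talks to ECS through a boto3 client (for example `boto3.client("ecs", region_name=...)`); the helper names are hypothetical, and the client is passed in as an argument so the functions can be exercised without AWS credentials.

```python
def get_desired_count(ecs, cluster: str, service: str) -> int:
    """Read the service's desired count (requires ecs:DescribeServices)."""
    resp = ecs.describe_services(cluster=cluster, services=[service])
    if not resp["services"]:
        raise LookupError(f"service {service!r} not found in cluster {cluster!r}")
    return resp["services"][0]["desiredCount"]


def set_desired_count(ecs, cluster: str, service: str, count: int) -> None:
    """Change the service's desired count (requires ecs:UpdateService)."""
    ecs.update_service(cluster=cluster, service=service, desiredCount=count)
```

Because each worker entry carries its own region, a real scaler would likely create one client per region rather than share a single client.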
Environment Variables
| Variable | Purpose |
|---|---|
| SUPABASE_URL | Supabase project URL |
| SUPABASE_SERVICE_ROLE_KEY | Service role key for admin API access |
| WORKERS | JSON array of ECS service configurations |
Workers Configuration
[
  {
    "region": "eu-west-1",
    "cluster": "trellis-prod-tendril-eu",
    "service": "trellis-prod-tendril-eu-service"
  },
  {
    "region": "us-east-1",
    "cluster": "trellis-prod-tendril-us",
    "service": "trellis-prod-tendril-us-service"
  }
]
Each entry maps to one ECS service running a Tendril. The scaler handles multiple regions independently.
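Since `WORKERS` arrives as a JSON string in an environment variable, the Lambda has to parse it before use. The following is a sketch of one way to do that with basic validation; `load_workers` and `REQUIRED_KEYS` are hypothetical names, not taken from the scaler's source.

```python
import json
import os
from typing import Optional

REQUIRED_KEYS = ("region", "cluster", "service")


def load_workers(raw: Optional[str] = None) -> list:
    """Parse the WORKERS JSON array and check each entry has the expected keys."""
    if raw is None:
        raw = os.environ["WORKERS"]
    workers = json.loads(raw)
    if not isinstance(workers, list):
        raise ValueError("WORKERS must be a JSON array")
    for i, worker in enumerate(workers):
        if not isinstance(worker, dict):
            raise ValueError(f"WORKERS[{i}] must be a JSON object")
        missing = [key for key in REQUIRED_KEYS if key not in worker]
        if missing:
            raise ValueError(f"WORKERS[{i}] is missing {missing}")
    return workers
```

Failing fast on a malformed entry surfaces a configuration mistake on the first invocation instead of as a confusing ECS error later.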
Job Count Query
The Lambda queries Supabase's REST API:
import urllib.request

# Ask PostgREST to report an exact row count with the response
url = f"{SUPABASE_URL}/rest/v1/provision_jobs?status=eq.QUEUED&select=id"
req = urllib.request.Request(url, headers={
    'apikey': SUPABASE_KEY,
    'Authorization': f'Bearer {SUPABASE_KEY}',
    'Prefer': 'count=exact',
})
It uses the Content-Range response header to get the count without fetching all rows.
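With `Prefer: count=exact`, the Content-Range value is typically shaped like `0-24/57`, or `*/0` when no rows match, with the total after the slash. The helper below is a sketch of how that total might be extracted; `count_from_content_range` is a hypothetical name, and the header shapes are an assumption based on PostgREST's documented behaviour rather than on the scaler's code.

```python
def count_from_content_range(header: str) -> int:
    """Extract the total from a Content-Range value such as '0-24/57' or '*/0'."""
    _, sep, total = header.rpartition("/")
    if not sep or not total.isdigit():
        # A '*' total means the server did not compute an exact count.
        raise ValueError(f"no exact count in Content-Range: {header!r}")
    return int(total)
```

With `urllib`, the header can be read as `resp.headers["Content-Range"]` from the response object returned by `urllib.request.urlopen(req)`.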
Cold Start Behavior
When a Tendril is scaled from 0 → 1:
- ECS task launch — ~30 seconds (Fargate cold start, image pull)
- Tendril boot — ~5 seconds (binary startup, API registration)
- First heartbeat — ~5 seconds after boot
- First job claim — next poll cycle (≤ 10 seconds)
Total cold start: ~50 seconds from job queue to job start.
During this time, the job remains in QUEUED status. The user sees "Waiting for Tendril..." in the UI.
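Summing the stages reproduces the ~50 second figure. The sketch also adds one step the list does not cover, stated here as an assumption: because the scaler only runs once a minute, a job queued just after a check may wait up to 60 seconds before the scale-up even begins.

```python
# Approximate stage durations in seconds, taken from the list above.
STAGES = {
    "ecs_task_launch": 30,
    "tendril_boot": 5,
    "first_heartbeat": 5,
    "first_job_claim": 10,  # upper bound: next poll cycle
}
SCALER_INTERVAL = 60  # the scaler runs once a minute

cold_start = sum(STAGES.values())
# Assumption: the job arrives right after a scaler check, so it waits a full interval.
worst_case = SCALER_INTERVAL + cold_start

print(f"cold start: ~{cold_start}s, worst case incl. scaler wait: ~{worst_case}s")
```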
Cost Savings
Without scale-to-zero, a single Tendril running 24/7 on Fargate costs:
- 1 vCPU + 4 GiB memory ≈ $40/month per Tendril
With scale-to-zero, you pay only for:
- Lambda invocations — 43,200/month (1/min) × 128MB × 1s ≈ $0.02/month
- Fargate — only when jobs are running
For workloads with sporadic provisioning (a few deploys per week), this reduces Tendril costs by ~99%.
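The ~99% figure can be sanity-checked with a little arithmetic. The sketch below reuses the two cost figures quoted above and scales the Fargate cost by the hours a task actually runs; the 730-hour month and the example of 6 active hours are illustrative assumptions, not measurements.

```python
HOURS_PER_MONTH = 730   # assumption: average month length in hours
ALWAYS_ON_USD = 40.0    # per-Tendril Fargate cost quoted above
SCALER_USD = 0.02       # Lambda scaler cost quoted above


def monthly_cost(active_hours: float) -> float:
    """Estimated monthly cost with scale-to-zero for a given number of running hours."""
    fargate = ALWAYS_ON_USD * active_hours / HOURS_PER_MONTH
    return fargate + SCALER_USD


def savings_pct(active_hours: float) -> float:
    """Percentage saved relative to running the Tendril 24/7."""
    return 100 * (1 - monthly_cost(active_hours) / ALWAYS_ON_USD)


# Hypothetical workload: about 6 running hours per month.
print(f"${monthly_cost(6):.2f}/month, {savings_pct(6):.1f}% saved")
```

Active hours should include the 5-minute idle window after each burst of jobs, since the task keeps running until the scale-down threshold is reached.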
Terraform Configuration
The scaler is defined in infra/platform/scaler/:
module "scaler" {
  source                    = "./scaler"
  name_prefix               = local.name_prefix
  supabase_url              = var.supabase_url
  supabase_service_role_key = var.supabase_service_role_key
  workers = [
    for name, w in module.tendril : {
      region  = var.tendrils[name].region
      cluster = w.cluster_name
      service = w.service_name
    }
  ]
}
The scaler automatically discovers all Tendril deployments from the module.tendril outputs.