Scale-to-Zero

Cloud-hosted Tendrils support automatic scale-to-zero. When no jobs have been queued for 5 consecutive minutes, a Lambda scaler sets each ECS service's desired task count to 0. When a job is queued, the scaler sets the count back to 1 on its next check. This eliminates Fargate costs during idle periods.

How It Works

A Lambda function runs on a 1-minute schedule via EventBridge. It checks the Trellis database for queued jobs and adjusts ECS service desired counts accordingly.

Scale-Up Logic

Every 1 minute:
  1. Query Supabase: SELECT count(*) FROM provision_jobs WHERE status = 'QUEUED'
  2. For each registered Tendril ECS service:
     - If queued > 0 AND current desired = 0:
       → Scale UP to 1
       → Reset idle counter

Scale-Down Logic

  3. For each registered Tendril ECS service:
     - If queued = 0 AND current desired > 0:
       → Increment idle counter
       → If idle counter >= 5 (5 consecutive checks = 5 minutes):
         → Scale DOWN to 0
     - If queued > 0:
       → Reset idle counter

The 5-check threshold (5 minutes) prevents flapping — a brief gap between jobs won't cause a scale-down followed by an immediate scale-up.
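
The scale-up and scale-down rules can be read as one decision function that is evaluated per service on every tick. The sketch below is only an illustration of that logic: the names `decide` and `IDLE_THRESHOLD` are hypothetical, and how the idle counter is stored between Lambda invocations is not covered here.

```python
IDLE_THRESHOLD = 5  # consecutive idle checks (5 minutes) before scaling down


def decide(queued: int, desired: int, idle: int) -> tuple[int, int]:
    """Return (new_desired, new_idle) for one service on one scaler tick."""
    if queued > 0:
        # Work is waiting: make sure one task runs and clear the idle streak.
        return (1 if desired == 0 else desired, 0)
    if desired > 0:
        # Nothing queued but the service is up: count this idle check.
        idle += 1
        if idle >= IDLE_THRESHOLD:
            return (0, idle)  # scale down to zero
        return (desired, idle)
    # Already at zero with nothing queued: nothing to change.
    return (0, idle)
```

For example, `decide(0, 1, 3)` keeps the service running and returns an idle count of 4, while a job arriving on the next tick (`decide(1, 1, 4)`) resets the count to 0, so a short gap between jobs never reaches the threshold.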

Infrastructure

Lambda Function

Property	Value
Runtime	Python 3.12
Architecture	ARM64
Memory	128 MB
Timeout	30 seconds
Trigger	EventBridge rule (every 1 minute)

IAM Permissions

The Lambda function needs two ECS actions in its IAM policy, plus outbound network access:

ecs:DescribeServices — read current desired count
ecs:UpdateService — change desired count
Network access to Supabase REST API (for job count query)
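
To show where each ECS permission is exercised, here is a minimal sketch that wraps the boto3 ECS calls `describe_services` and `update_service`. The helper names are hypothetical, and the client is passed in as an argument so the functions can be tried without AWS credentials.

```python
def get_desired_count(ecs, cluster: str, service: str) -> int:
    """Read the service's desired count (needs ecs:DescribeServices)."""
    resp = ecs.describe_services(cluster=cluster, services=[service])
    if not resp["services"]:
        raise LookupError(f"service {service!r} not found in cluster {cluster!r}")
    return resp["services"][0]["desiredCount"]


def set_desired_count(ecs, cluster: str, service: str, count: int) -> None:
    """Change the service's desired count (needs ecs:UpdateService)."""
    ecs.update_service(cluster=cluster, service=service, desiredCount=count)
```

In the Lambda, `ecs` would typically be created with `boto3.client("ecs", region_name=...)`, one client per region.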

Environment Variables

Variable	Purpose
`SUPABASE_URL`	Supabase project URL
`SUPABASE_SERVICE_ROLE_KEY`	Service role key for admin API access
`WORKERS`	JSON array of ECS service configurations

Workers Configuration

[
  {
    "region": "eu-west-1",
    "cluster": "trellis-prod-tendril-eu",
    "service": "trellis-prod-tendril-eu-service"
  },
  {
    "region": "us-east-1",
    "cluster": "trellis-prod-tendril-us",
    "service": "trellis-prod-tendril-us-service"
  }
]

Each entry maps to one ECS service running a Tendril. The scaler handles multiple regions independently.
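
One way the Lambda might turn the `WORKERS` value into typed records is sketched below. The `Worker` dataclass and `load_workers` function are illustrative names, not part of the scaler's documented interface.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    region: str
    cluster: str
    service: str


def load_workers(raw: str) -> list[Worker]:
    """Parse the WORKERS JSON array; raises on missing or unexpected keys."""
    entries = json.loads(raw)
    if not isinstance(entries, list):
        raise ValueError("WORKERS must be a JSON array")
    # Worker(**entry) raises TypeError if an entry lacks a field or has extras.
    return [Worker(**entry) for entry in entries]
```

Calling `load_workers(os.environ["WORKERS"])` at the start of each invocation would surface a malformed configuration immediately rather than partway through a scaling pass.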

Job Count Query

The Lambda queries Supabase's REST API:

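import os
import urllib.request

# Assumed to come from the environment variables listed above.
SUPABASE_URL = os.environ["SUPABASE_URL"]
SUPABASE_KEY = os.environ["SUPABASE_SERVICE_ROLE_KEY"]
# 'Prefer: count=exact' asks PostgREST to report the total matching rows.
url = f"{SUPABASE_URL}/rest/v1/provision_jobs?status=eq.QUEUED&select=id"
req = urllib.request.Request(url, headers={
    'apikey': SUPABASE_KEY,
    'Authorization': f'Bearer {SUPABASE_KEY}',
    'Prefer': 'count=exact',
})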

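With `Prefer: count=exact`, the API reports the total number of matching rows in the Content-Range response header (for example `0-4/5`), so the Lambda reads the count from the header instead of counting rows in the response body.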
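
Extracting the total from that header could look like the sketch below. The function name `parse_total` is hypothetical; the header format assumed here is the `start-end/total` form, with `*/0` when no rows match.

```python
def parse_total(content_range: str) -> int:
    """Extract the total from a Content-Range value such as '0-4/5' or '*/0'."""
    # The total is whatever follows the last '/'.
    _, _, total = content_range.rpartition("/")
    if not total.isdigit():
        # Covers a missing header value and an unknown total ('0-4/*').
        raise ValueError(f"no exact count in Content-Range: {content_range!r}")
    return int(total)
```

With a `urllib` response object `resp`, the queued-job count would then be `parse_total(resp.headers["Content-Range"])`.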

Cold Start Behavior

When a Tendril is scaled from 0 → 1:

ECS task launch — ~30 seconds (Fargate cold start, image pull)
Tendril boot — ~5 seconds (binary startup, API registration)
First heartbeat — ~5 seconds after boot
First job claim — next poll cycle (≤ 10 seconds)

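Total cold start: ~50 seconds from scale-up to job start. Because the scaler checks the queue once per minute, a job can wait up to a further 60 seconds before scale-up begins.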

During this time, the job remains in QUEUED status. The user sees "Waiting for Tendril..." in the UI.
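
The cold-start estimate is a simple sum of the stages listed above. The short calculation below reproduces it and also adds the wait for the next scaler run, on the assumption that a job queued just after a check waits a full 60-second interval.

```python
# Stage durations in seconds, taken from the list above.
STAGES_S = {
    "ecs_task_launch": 30,
    "tendril_boot": 5,
    "first_heartbeat": 5,
    "first_job_claim": 10,  # upper bound: next poll cycle
}
SCALER_INTERVAL_S = 60  # the scaler runs once per minute

after_scale_up = sum(STAGES_S.values())
worst_case = SCALER_INTERVAL_S + after_scale_up

print(after_scale_up, worst_case)  # 50 110
```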

Cost Savings

Without scale-to-zero, a single Tendril running 24/7 on Fargate costs:

1 vCPU + 4 GiB memory ≈ $40/month per Tendril

With scale-to-zero, you pay only for:

Lambda invocations: 43,200/month (1/min) × 128 MB × ~1 s ≈ 5,400 GB-seconds, under $0.10/month at on-demand pricing
Fargate — only when jobs are running

For workloads with sporadic provisioning (a few deploys per week), this reduces Tendril costs by ~99%.
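
The savings figure can be sanity-checked with a rough calculation. Everything below is an assumption for illustration: the always-on figure is the ~$40/month quoted above, the Lambda cost is a deliberately generous $0.10/month, and the workload is three deploys per week that each keep a Tendril running for about 15 minutes (job time plus the 5-minute idle window).

```python
HOURS_PER_MONTH = 730   # average month
WEEKS_PER_MONTH = 52 / 12


def scale_to_zero_cost(active_hours: float, always_on_usd: float = 40.0,
                       lambda_usd: float = 0.10) -> float:
    """Monthly cost when Fargate is billed only for active hours."""
    hourly = always_on_usd / HOURS_PER_MONTH
    return active_hours * hourly + lambda_usd


def savings_pct(active_hours: float, always_on_usd: float = 40.0) -> float:
    """Percentage saved compared with running the Tendril 24/7."""
    cost = scale_to_zero_cost(active_hours, always_on_usd)
    return 100 * (1 - cost / always_on_usd)


# 3 deploys/week x 0.25 h each is roughly 3.25 active hours per month.
active = 3 * 0.25 * WEEKS_PER_MONTH
print(f"{active:.2f} h active, {savings_pct(active):.1f}% saved")
```

Under these assumptions the result lands slightly above 99%, consistent with the estimate above; heavier workloads reduce the saving proportionally.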

Terraform Configuration

The scaler is defined in infra/platform/scaler/:

module "scaler" {
  source = "./scaler"

  name_prefix               = local.name_prefix
  supabase_url              = var.supabase_url
  supabase_service_role_key = var.supabase_service_role_key

  workers = [
    for name, w in module.tendril : {
      region  = var.tendrils[name].region
      cluster = w.cluster_name
      service = w.service_name
    }
  ]
}

The scaler automatically discovers all Tendril deployments from the module.tendril outputs.
