"Try again in 30 seconds" / heavy queue busy

Heavy templates (ai-photoshoots, ghost-mannequin, any template run with the batch flag, and multi-variant ai-model and image-creation) run one at a time per server process. If you hit the heavy queue, you'll see a 503 with a 30-second retry hint. The /api/generate rate limit is 60 requests per user per minute.

Last updated Apr 28, 2026

Heavy templates run one at a time per server process to keep latency predictable. If you submit a heavy job while another heavy job is in flight on the same process, the API returns 503 with a "Try again in 30 seconds" hint. Separately, /api/generate is rate-limited to 60 requests per user per minute.

What counts as a heavy job

White Studio (ai-photoshoots).
Ghost Mannequin.
Batch flag on any template (e.g. batch ghost mannequin, batch image-creation).
AI model with multiple variants.
Image Creation with multiple variants.
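
The list above can be read as a small classification rule. The following is a minimal sketch, not Apiway's actual code: the template IDs come from the summary at the top of this page, while the GenerateRequest shape and the isHeavyJob name are hypothetical.

```typescript
// Hypothetical request shape; only the template IDs come from this page.
interface GenerateRequest {
  template: string;
  batch?: boolean;   // assumed name for the batch flag
  variants?: number; // assumed to default to 1 when omitted
}

// Templates that are heavy regardless of options.
const ALWAYS_HEAVY = new Set(["ai-photoshoots", "ghost-mannequin"]);
// Templates that become heavy only with more than one variant.
const HEAVY_WHEN_MULTI_VARIANT = new Set(["ai-model", "image-creation"]);

function isHeavyJob(req: GenerateRequest): boolean {
  // The batch flag makes any template heavy.
  if (req.batch === true) return true;
  if (ALWAYS_HEAVY.has(req.template)) return true;
  const variants = req.variants ?? 1;
  return HEAVY_WHEN_MULTI_VARIANT.has(req.template) && variants > 1;
}
```

Under these assumptions, a single-variant image-creation request is not heavy, but the same request with four variants or with the batch flag is.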

What to do when you hit it

Wait 30 seconds and retry. The previous heavy job almost always finishes in that window.
Switch to a non-heavy template while waiting (e.g. Image Creation single-variant, Virtual Try-On).
Reduce the variant count on Image Creation or AI Model to 1 (for example, from 4) so the job is no longer classified as heavy.
Use the offload worker. Some heavy templates are routed to a separate Docker worker fleet (10 parallel containers on the Apiway server). The offload list is configured via the OFFLOAD_TEMPLATES env var. Offloaded jobs are not gated by the heavy queue.
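
The first option, waiting 30 seconds and retrying, is easy to automate on the client. This is a minimal sketch under stated assumptions: decideRetry, generateWithRetry, and the cap of three attempts are illustrative choices, not part of the Apiway API. Only the 503 status and the 30-second wait come from this page.

```typescript
// Wait suggested by the 503 "Try again in 30 seconds" hint.
const HEAVY_QUEUE_RETRY_MS = 30000;

type RetryDecision = { retry: false } | { retry: true; waitMs: number };

// Pure decision step: retry only on 503, and only while attempts remain.
// maxAttempts = 3 is an arbitrary illustrative cap.
function decideRetry(status: number, attempt: number, maxAttempts: number = 3): RetryDecision {
  if (status !== 503 || attempt >= maxAttempts) return { retry: false };
  return { retry: true, waitMs: HEAVY_QUEUE_RETRY_MS };
}

// Thin wrapper: `send` performs one request, `sleep` is injectable for tests.
async function generateWithRetry<T extends { status: number }>(
  send: () => Promise<T>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    const res = await send();
    const decision = decideRetry(res.status, attempt);
    if (!decision.retry) return res; // success, non-503 error, or out of attempts
    await sleep(decision.waitMs);
  }
}
```

A caller might pass something like () => fetch("/api/generate", { method: "POST", body }) as send; the request body format is not covered here. A 429 is deliberately not retried by this helper, since the rate limit needs a one-minute wait instead.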

Rate limit details

/api/generate: 60 requests / user / minute.
Instagram connect: 5 requests / user / minute (via checkInstagramConnectRateLimit).

Hitting the rate limit returns 429. Wait one minute and the bucket refills.
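
To make the "60 requests per user per minute" behaviour concrete, here is a minimal fixed-window counter. It is an illustration only: this page does not say how the server implements its limit, and FixedWindowLimiter is a hypothetical name. The clock is passed in so the behaviour can be checked without waiting.

```typescript
// Illustrative fixed-window limiter; not the server's actual implementation.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if it would map to a 429.
  allow(userId: string, now: number = Date.now()): boolean {
    const current = this.windows.get(userId);
    // No window yet, or the previous window has expired: start a fresh one.
    if (current === undefined || now - current.start >= this.windowMs) {
      this.windows.set(userId, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

// 60 requests per user per minute, as documented for /api/generate.
const generateLimiter = new FixedWindowLimiter(60, 60000);
```

With this model, the 61st call inside the same minute is rejected, and the first call after the window expires is allowed again.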

About the offload worker

Apiway runs heavy generation through a Docker microservice (10 worker containers in parallel) on a dedicated server (64 GB RAM). Communication with the worker goes through the generation_jobs Supabase table rather than HTTP, so scaling out workers does not require code changes on the Next.js side. See our infrastructure note on uploads and training.
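
The routing decision implied by OFFLOAD_TEMPLATES can be sketched as follows. Everything here is an assumption made for illustration: the page does not document the variable's format (a comma-separated list of template IDs is assumed), and parseOffloadList, routeJob, and the route names are hypothetical.

```typescript
// Assumption: OFFLOAD_TEMPLATES is a comma-separated list of template IDs.
function parseOffloadList(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((id) => id.trim())
      .filter((id) => id.length > 0),
  );
}

type Route = "offload-worker" | "heavy-queue" | "direct";

// Heavy jobs on the offload list skip the heavy queue; other heavy jobs
// are gated by it; non-heavy jobs run directly.
function routeJob(template: string, heavy: boolean, offload: Set<string>): Route {
  if (!heavy) return "direct";
  return offload.has(template) ? "offload-worker" : "heavy-queue";
}
```

For example, with OFFLOAD_TEMPLATES set to "ghost-mannequin", a ghost-mannequin job would go to the worker fleet while an ai-photoshoots job would still wait on the per-process heavy queue.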
