Skip to content
Agentic AI for serious engineers
Pattern

Cold Start Mitigation

Serverless agent runtimes have a 1-3 second cold start; design around it.

When to use

Production deployments where p99 latency matters. Strategies: keep-warm pings every N minutes, provisioned concurrency for predictable load, or fallback to a smaller / cached model on cold-start detection.

When not to use

When the workload is batched and latency-insensitive. Cold start adds cost in keep-warm strategies; only pay it when latency is the bottleneck.

Used in