↳ Project /02 · AWS · AI

Multi-Agent AI Orchestrator

A fully asynchronous multi-agent system that routes natural-language coding tasks to specialist agents, designed around API Gateway's timeout instead of against it.

Role: AI Infra
Cloud: AWS
Pattern: Async agents
Model: Anthropic
[Figure: architecture diagram. Nodes: Client (NL coding task), Orchestrator λ (202 + job ID), Coder λ (agentic loop), Anthropic (tool use), DynamoDB (24h TTL), Status λ (poll job). Caption: Async by design: return a job ID in under 2s, run the agentic loop in the background.]

/01 Problem

Agentic loops that make sequential model tool-use calls routinely run longer than API Gateway's hard 29-second integration timeout, which breaks any synchronous request/response design.

The system had to return fast, run the loop reliably in the background, and keep blast radius contained between agents.

/02 Approach

  • The orchestrator returns 202 with a job ID in under two seconds. The coder Lambda then runs the agentic loop independently and writes its results to DynamoDB with a 24-hour TTL.
  • ARN-scoped least-privilege IAM isolates blast radius: the orchestrator can invoke only the coder Lambda, status can only read DynamoDB, and the coder cannot invoke any Lambda at all.
  • The write_code, explain_code, and debug_code tools are deterministic Python functions returning structured scaffolds and AST metadata, grounding the loop in real code analysis instead of recursive LLM self-talk.
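
The request flow in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the project's code: the handler signatures, the item fields (job_id, status, result, expires_at), and the function name "coder" are assumptions, and FakeTable and FakeLambda are hypothetical stand-ins so the sketch runs without AWS. In a deployment, the table and lambda_client arguments would be a boto3 DynamoDB Table resource and a boto3 Lambda client, whose put_item, get_item, and invoke calls have the same keyword arguments used here.

```python
import json
import time
import uuid

TTL_SECONDS = 24 * 60 * 60  # the 24-hour TTL mentioned above


def orchestrator_handler(event, table, lambda_client, coder_function_name="coder"):
    """Record the job, start the coder asynchronously, and return 202 at once."""
    job_id = str(uuid.uuid4())
    table.put_item(Item={
        "job_id": job_id,
        "status": "PENDING",
        "task": event["task"],
        # DynamoDB TTL expects an epoch-seconds number (attribute name assumed)
        "expires_at": int(time.time()) + TTL_SECONDS,
    })
    # InvocationType="Event" queues the call and returns without waiting
    lambda_client.invoke(
        FunctionName=coder_function_name,
        InvocationType="Event",
        Payload=json.dumps({"job_id": job_id, "task": event["task"]}).encode(),
    )
    return {"statusCode": 202, "body": json.dumps({"job_id": job_id})}


def coder_handler(event, table, run_loop):
    """Run the (injected) agentic loop and persist the outcome for polling."""
    item = table.get_item(Key={"job_id": event["job_id"]})["Item"]
    try:
        item.update(status="DONE", result=run_loop(event["task"]))
    except Exception as exc:  # store the failure so the poller sees it
        item.update(status="FAILED", result=str(exc))
    table.put_item(Item=item)


def status_handler(event, table):
    """Read-only lookup used by the client to poll a job."""
    item = table.get_item(Key={"job_id": event["job_id"]}).get("Item")
    if item is None:
        return {"statusCode": 404, "body": json.dumps({"error": "unknown job"})}
    body = {"job_id": item["job_id"], "status": item["status"],
            "result": item.get("result")}
    return {"statusCode": 200, "body": json.dumps(body)}


class FakeTable:
    """Hypothetical in-memory stand-in for a DynamoDB table."""
    def __init__(self):
        self.items = {}

    def put_item(self, Item):
        self.items[Item["job_id"]] = dict(Item)

    def get_item(self, Key):
        item = self.items.get(Key["job_id"])
        return {"Item": dict(item)} if item else {}


class FakeLambda:
    """Hypothetical stand-in that records invocations instead of sending them."""
    def __init__(self):
        self.calls = []

    def invoke(self, **kwargs):
        self.calls.append(kwargs)
```

The client only ever waits on the orchestrator and status handlers, both of which do a single fast read or write, so the gateway timeout never overlaps with the long-running loop.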

/03 Architecture

Three separately sized Lambda packages, Secrets Manager for the Anthropic key, and CloudWatch log groups with 14-day retention, all provisioned in Terraform.
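
The ARN-scoped permissions described in the approach would end up as IAM policy documents along these lines. The project provisions them in Terraform; the sketch below builds the equivalent JSON documents as Python dicts purely for illustration. The ARNs are placeholders, the exact action lists are assumptions (the write-up states only that the orchestrator can invoke just the coder, status can only read DynamoDB, and the coder cannot invoke any Lambda), and the coder's access to the secret is assumed from its role as the caller of the Anthropic API.

```python
import json


def _policy(*statements):
    """Wrap statements in a standard IAM policy document."""
    return {"Version": "2012-10-17", "Statement": list(statements)}


def _allow(actions, resource):
    return {"Effect": "Allow", "Action": list(actions), "Resource": resource}


def orchestrator_policy(coder_arn):
    # Invoke permission scoped to a single function ARN, not lambda:* on "*".
    # Other statements the orchestrator may need are omitted here.
    return _policy(_allow(["lambda:InvokeFunction"], coder_arn))


def status_policy(table_arn):
    # Read-only access to the jobs table
    return _policy(_allow(["dynamodb:GetItem"], table_arn))


def coder_policy(table_arn, secret_arn):
    # Assumed: write results, read the API key. No lambda:* actions at all.
    return _policy(
        _allow(["dynamodb:PutItem", "dynamodb:UpdateItem"], table_arn),
        _allow(["secretsmanager:GetSecretValue"], secret_arn),
    )


def allowed_actions(policy):
    """Flatten every allowed action in a policy, for quick inspection."""
    return [a for s in policy["Statement"] if s["Effect"] == "Allow"
            for a in s["Action"]]


if __name__ == "__main__":
    # Placeholder ARNs for illustration only
    coder = "arn:aws:lambda:us-east-1:123456789012:function:coder"
    print(json.dumps(orchestrator_policy(coder), indent=2))
```

Scoping each statement to one resource ARN means a compromised or misbehaving function can reach only what its own role names, which is the blast-radius property the design is after.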

/04 Outcome

A resilient async pattern that turns a platform constraint (the 29-second timeout) into the architecture, with grounded tools and tightly scoped permissions per agent.
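
As an illustration of what a "grounded" tool can mean here, the sketch below shows a deterministic function in the spirit of explain_code, built on Python's standard ast module. The return shape (ok, functions, classes, imports) is an assumption; the write-up says only that the tools return structured scaffolds and AST metadata.

```python
import ast


def explain_code(source: str) -> dict:
    """Return structural facts about Python source without calling a model."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        # Report the parse failure as data so the loop can react to it
        return {"ok": False, "error": exc.msg, "line": exc.lineno}

    functions, classes, imports = [], [], []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions.append({
                "name": node.name,
                "args": [a.arg for a in node.args.args],
                "line": node.lineno,
            })
        elif isinstance(node, ast.ClassDef):
            classes.append({"name": node.name, "line": node.lineno})
        elif isinstance(node, ast.Import):
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            imports.append(node.module or "")

    return {"ok": True, "functions": functions, "classes": classes,
            "imports": sorted(imports)}
```

Because the same input always yields the same output, the model receives verifiable facts about the code at each step instead of its own earlier guesses.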

Lambda · API Gateway · DynamoDB · Anthropic SDK · Terraform · Python · IAM · Secrets Manager