AI Infrastructure

AI Agents: Building an Autonomous Task Orchestration System

AI agents · automation · infrastructure · LLM
Client Internal / AI Agents Platform
Duration 6 weeks
Agents handle 200+ concurrent task chains
80% reduction in manual task routing
Sub-2 second agent response time at scale
Fully observable via real-time dashboard

Background

This was one of our most technically ambitious projects: building a multi-agent orchestration platform that could take high-level goals and decompose them into coordinated, autonomous task chains executed by specialised AI agents.

The use case: a content and research business that needed to automate complex multi-step workflows. Tasks like “research competitor X, extract their top 10 content pieces, identify content gaps, draft a brief for each gap, and post to Notion” used to take a human 3 to 4 hours.

The Architecture Challenge

Single-agent AI systems are relatively straightforward. Multi-agent systems, where multiple specialised agents collaborate, hand off tasks, and share context, introduce a different class of problems:

Problems we needed to solve:

  1. Task decomposition: How do you reliably break a high-level goal into executable sub-tasks?
  2. Agent routing: How do you decide which specialised agent handles which sub-task?
  3. Context passing: How do agents downstream receive the right context from agents upstream?
  4. Failure recovery: What happens when one agent in a chain fails or produces bad output?
  5. Observability: How does the operator know what’s happening inside a running agent chain?

The Solution

We built a central Orchestrator that sits above all specialised agents. It receives the high-level goal, uses an LLM call to produce a structured execution plan (JSON), and then dispatches tasks to the appropriate agents in the right sequence.
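A minimal sketch of what that structured plan and dispatch sequencing might look like. The JSON schema (step ids, agent names, `depends_on`), and the `dispatch_order` helper are illustrative assumptions for this sketch, not the production code:

```python
import json

# Illustrative execution plan, as the orchestrator's LLM call might return it.
# The exact schema here (ids, agent names, depends_on) is an assumption.
PLAN_JSON = """
{
  "goal": "Research topic, identify 5 content gaps, draft briefs",
  "steps": [
    {"id": 1, "agent": "research",  "depends_on": []},
    {"id": 2, "agent": "analysis",  "depends_on": [1]},
    {"id": 3, "agent": "writer",    "depends_on": [2], "parallel": 5},
    {"id": 4, "agent": "formatter", "depends_on": [3]},
    {"id": 5, "agent": "publisher", "depends_on": [4]}
  ]
}
"""

def dispatch_order(plan: dict) -> list[str]:
    """Return agent names in execution order, verifying that every step's
    dependencies have already run (steps arrive topologically sorted here)."""
    done: set[int] = set()
    order: list[str] = []
    for step in plan["steps"]:
        if not all(dep in done for dep in step["depends_on"]):
            raise ValueError(f"step {step['id']} dispatched before its dependencies")
        order.append(step["agent"])
        done.add(step["id"])
    return order

plan = json.loads(PLAN_JSON)
print(dispatch_order(plan))  # research, analysis, writer, formatter, publisher
```

Keeping the plan as plain JSON means it can be validated, logged, and replayed independently of the LLM call that produced it.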

Agent Types Built

| Agent | Responsibility |
| --- | --- |
| Research Agent | Web search, content extraction, source validation |
| Analysis Agent | Pattern recognition, gap identification, summarisation |
| Writer Agent | Long-form content drafting from structured briefs |
| Formatter Agent | Markdown, HTML, Notion, or Slack output formatting |
| Publisher Agent | Push content to Notion, Airtable, Google Docs, or Slack |

Orchestration Flow

Goal: "Research [topic], identify 5 content gaps, draft briefs for each"

Orchestrator → produces execution plan:
  Step 1: Research Agent (searches, extracts, returns structured data)
  Step 2: Analysis Agent (receives Step 1 output, identifies gaps)
  Step 3 (parallel): Writer Agent × 5 (one per gap, runs concurrently)
  Step 4: Formatter Agent (assembles all drafts)
  Step 5: Publisher Agent (posts to Notion)

Step 3 runs in parallel: all five writer agents execute simultaneously, making the step roughly 4× faster than running the same drafts sequentially.
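In production this fan-out goes through Celery + Redis, but the idea can be sketched with the standard library alone. The `draft_brief` function here is a stand-in for a Writer Agent invocation, not the real agent:

```python
from concurrent.futures import ThreadPoolExecutor

def draft_brief(gap: str) -> str:
    # Stand-in for a Writer Agent LLM call; the real agent drafts a full brief.
    return f"Brief for: {gap}"

gaps = ["gap-1", "gap-2", "gap-3", "gap-4", "gap-5"]

# Fan out one writer per gap, as Step 3 does, and collect results in order.
with ThreadPoolExecutor(max_workers=len(gaps)) as pool:
    briefs = list(pool.map(draft_brief, gaps))

print(briefs)
```

Because each writer works from an independent gap, the drafts have no shared state and the fan-out needs no coordination beyond collecting the results.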

Failure Recovery

Each agent step has a retry policy with exponential backoff. If an agent fails after 3 retries, the orchestrator:

  1. Logs the failure with full context
  2. Attempts a fallback strategy (different model, simpler prompt)
  3. If still failing, marks the step as degraded and continues with partial output
  4. Alerts the operator via webhook

This means a single agent failure doesn’t bring down the entire chain.
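The recovery policy above can be sketched as a small wrapper. The `agent` and `fallback` callables stand in for real agent invocations, and the retry counts and delays are illustrative defaults rather than the production values:

```python
import time

def run_with_recovery(agent, task, fallback=None, retries=3, base_delay=1.0):
    """Run one agent step with exponential backoff, then a fallback strategy.

    Returns (result, status) where status is "ok", "fallback", or "degraded",
    so the orchestrator can continue the chain with partial output.
    """
    for attempt in range(retries):
        try:
            return agent(task), "ok"
        except Exception:
            time.sleep(base_delay * 2 ** attempt)  # back off: 1s, 2s, 4s, ...
    if fallback is not None:
        try:
            # e.g. a cheaper model or a simpler prompt
            return fallback(task), "fallback"
        except Exception:
            pass
    # All strategies exhausted: mark the step degraded, never raise,
    # so one failing agent cannot bring down the whole chain.
    return None, "degraded"
```

Usage might look like `run_with_recovery(research_agent, task, fallback=haiku_agent)`, with the orchestrator logging the status and firing the alert webhook on `"degraded"`.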

Observability Dashboard

One of the non-obvious requirements: the humans overseeing this system needed to see what was happening in real time. We built a live dashboard showing:

  • Active agent chains with step-by-step status
  • Token usage per agent per run (cost tracking)
  • Success/failure rates per agent type over time
  • Average chain completion time by goal type
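Aggregating those per-agent numbers from run history is straightforward. A sketch, assuming a simple record shape (agent name, success flag, token count) as the dashboard backend might read it from PostgreSQL; the field names are illustrative:

```python
from collections import defaultdict

# Hypothetical run records, as the dashboard backend might load them.
runs = [
    {"agent": "research", "success": True,  "tokens": 1200},
    {"agent": "research", "success": False, "tokens": 300},
    {"agent": "writer",   "success": True,  "tokens": 4500},
    {"agent": "writer",   "success": True,  "tokens": 4100},
]

def per_agent_stats(runs):
    """Roll up success rate and token usage (cost proxy) per agent type."""
    totals = defaultdict(lambda: {"runs": 0, "ok": 0, "tokens": 0})
    for r in runs:
        t = totals[r["agent"]]
        t["runs"] += 1
        t["ok"] += int(r["success"])
        t["tokens"] += r["tokens"]
    return {agent: {"success_rate": t["ok"] / t["runs"], "tokens": t["tokens"]}
            for agent, t in totals.items()}

print(per_agent_stats(runs))
```

In production the same rollup would be a GROUP BY over the run-history table rather than an in-memory pass.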

The Stack

  • Orchestration: Custom Python service + LangGraph for complex agent graphs
  • LLMs: Claude 3.5 Sonnet (orchestrator + writer) + Claude Haiku (research + formatting)
  • Memory: Redis for short-term context passing between agents
  • Queue: Celery + Redis for parallel agent dispatch
  • Database: PostgreSQL for run history and analytics
  • Dashboard: Next.js + Recharts
  • Webhooks: Custom event bus for real-time status updates

Results

| Metric | Manual Process | AI Agent System |
| --- | --- | --- |
| Time per full workflow | 3 to 4 hours | 8 to 12 minutes |
| Concurrent workflows | 1 (one human) | 200+ |
| Consistency | Variable | Highly consistent |
| Cost per workflow | $45 to $60 in labour | $0.80 to $1.20 in API costs |

The platform now runs hundreds of concurrent task chains. The team it replaced wasn’t eliminated; they were redirected to the work that actually requires human judgement: strategy, client relationships, and quality review.

Build time: 6 weeks from architecture design to production deployment.

Want results like this?

Book a free discovery call. We'll scope your project and give you a clear timeline.

Book a Call