Dynamic Workflows and Agent Orchestration

A single agent with a single context window works linearly: reads, thinks, writes, repeats. For most tasks, that's enough. But when you need to simultaneously audit security, performance, and test coverage on a PR with 2,000 lines of changes — or refactor five independent modules in a single run — linear execution becomes a bottleneck.

The key insight: large tasks can be decomposed into subtasks that are either independent (run in parallel) or have dependencies (pipeline). Orchestration in Claude Code is the toolset for managing these structures.
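
The decomposition idea can be sketched with the standard library alone. This is a minimal illustration, not part of Claude Code: the task names are hypothetical, and `graphlib.TopologicalSorter` is used only to group subtasks into "waves" where everything in a wave is independent and could run in parallel.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; tasks in the same wave do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical plan: each task maps to the set of tasks it must wait for.
plan = {
    "audit_security": set(),
    "audit_performance": set(),
    "audit_tests": set(),
    "write_summary": {"audit_security", "audit_performance", "audit_tests"},
}

waves = execution_waves(plan)
print(waves)
```

The three audits land in the first wave (parallel candidates), and the summary lands in the second wave because it depends on all of them (a pipeline step).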

Three Levels of Parallelism

Claude Code provides three fundamentally different mechanisms, and the choice between them determines both the architecture and the cost:

Subagents — isolated workers within a single session. Each receives a subtask, executes it in its own context window, and returns only the result. Communication flows exclusively through the lead agent.
Agent teams — full sessions with a shared task list and a mailbox. Teammate agents can communicate directly with one another, not only through the lead.
Agent SDK — programmatic pipeline assembly in Python or TypeScript, where the developer explicitly controls parallelism via query() calls.

flowchart TD
    subgraph SUB["Subagents (within a single session)"]
        direction TB
        M1[Lead agent] -->|delegates task| A1[Subagent: security]
        M1 -->|delegates task| A2[Subagent: perf]
        M1 -->|delegates task| A3[Subagent: tests]
        A1 -->|result only| M1
        A2 -->|result only| M1
        A3 -->|result only| M1
    end

    subgraph TEAM["Agent teams (experimental)"]
        direction TB
        L[Lead] -->|creates tasks| TL[(Task list)]
        T1[Teammate 1] <-->|messages| T2[Teammate 2]
        T1 <-->|messages| T3[Teammate 3]
        T1 & T2 & T3 -->|claim| TL
        T1 & T2 & T3 -->|messages| L
    end

    subgraph SDK["Agent SDK (programmatic)"]
        direction TB
        ORCH[Your code] -->|asyncio.gather| Q1["query() #1"]
        ORCH -->|asyncio.gather| Q2["query() #2"]
        ORCH -->|asyncio.gather| Q3["query() #3"]
        Q1 & Q2 & Q3 -->|results| ORCH
    end

Three orchestration models: subagents (context isolation), agent teams (direct communication), Agent SDK (programmatic parallelism)

Subagents: Delegation with Context Isolation

A subagent is not simply "another Claude call." It is a separate instance with its own context window, its own system prompt, and an independent set of tools. When a subagent finishes its work, only the final result is returned to the main conversation — not the search queries, not the files read, not the intermediate steps.

This matters: context isolation is the primary reason to use subagents. If a researcher agent reads 50 files and produces a 200-line summary, the main context receives 200 lines — not 50 files.
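
To make the size difference concrete, here is a toy simulation. It does not call Claude at all: `summarize_in_isolation` is a hypothetical stand-in for a research subagent, and the file names and sizes are invented for illustration.

```python
def summarize_in_isolation(files: dict[str, str], max_summary_lines: int = 200) -> str:
    """Toy stand-in for a research subagent (not a real Claude Code API)."""
    private_context: list[str] = []  # lives only inside the "subagent"
    for path, text in files.items():
        private_context.append(f"# {path}\n{text}")  # everything read stays here

    # The summary is the only thing handed back to the caller.
    summary = [f"{path}: {len(text.splitlines())} lines" for path, text in files.items()]
    return "\n".join(summary[:max_summary_lines])


# 50 hypothetical files of 400 lines each
files = {f"src/module_{i}.py": "x = 1\n" * 400 for i in range(50)}

lead_context: list[str] = []
lead_context.append(summarize_in_isolation(files))

lines_read_by_subagent = sum(len(text.splitlines()) for text in files.values())
lines_in_lead_context = sum(len(entry.splitlines()) for entry in lead_context)
print(lines_read_by_subagent, lines_in_lead_context)
```

The "subagent" reads 20,000 lines, while the lead's context grows by only the 50 summary lines it received.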

Defining a Subagent

Custom subagents are stored in .claude/agents/ (project level) or ~/.claude/agents/ (globally). Each is a Markdown file with frontmatter:

.claude/
└── agents/
    ├── security-reviewer.md   # security audit
    ├── test-writer.md         # test generation
    └── doc-updater.md         # documentation updates

---
name: security-reviewer
description: Performs a security audit of code. Use for reviewing auth, input validation, and token handling.
model: claude-sonnet-4-5
tools:
  - Read
  - Glob
  - Grep
---

You are an application security expert. Analyze:
- Authentication and session handling
- Input validation and sanitization
- Token and secret management
- SQL injection and XSS

Always specify severity (critical/high/medium/low) and propose a concrete fix.

Claude delegates a task to a subagent automatically when the request matches the subagent's description. You can also invoke one explicitly via Task (in interactive mode) or the Agent tool. The tools field is an allowlist: the subagent can use only the tools listed there. If the field is omitted, the subagent inherits the tools available to the main conversation.
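
If you maintain several subagents, generating the files from one template keeps the frontmatter consistent. The helper below is a hypothetical sketch (`render_agent_file` is not part of any Claude tooling); it only reproduces the field layout shown in the example above and assumes the description contains no characters that need YAML quoting.

```python
def render_agent_file(
    name: str, description: str, model: str, tools: list[str], prompt: str
) -> str:
    """Render a subagent definition: YAML frontmatter followed by the system prompt."""
    tool_lines = "\n".join(f"  - {tool}" for tool in tools)
    return (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"  # assumes no ':' or '#' that would need quoting
        f"model: {model}\n"
        "tools:\n"
        f"{tool_lines}\n"
        "---\n\n"
        f"{prompt.strip()}\n"
    )


content = render_agent_file(
    name="test-writer",
    description="Writes unit tests for changed code.",
    model="claude-sonnet-4-5",
    tools=["Read", "Glob", "Grep"],
    prompt="You write focused unit tests. Cover edge cases first.",
)
print(content)
```

Saving the returned string as .claude/agents/test-writer.md would place it next to the other project-level subagents.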

Cost Control Through Model Selection

One underappreciated use of subagents is routing work to cheaper models. A research subagent that only reads and summarizes content can usually run on claude-haiku-4-5, while a subagent writing complex code needs claude-sonnet-4-5 or claude-opus-4-5. The model field in frontmatter lets you set this once per subagent instead of choosing a model on every request.
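
A back-of-the-envelope comparison shows why routing matters for read-heavy work. The model names, prices, and token counts below are placeholders chosen for the arithmetic, not real rates; substitute current pricing before drawing conclusions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # cost per 1M input tokens (placeholder units)
    output_per_million: float  # cost per 1M output tokens (placeholder units)


# Placeholder prices, NOT real rates.
PRICES = {
    "small-model": Pricing(input_per_million=1.0, output_per_million=5.0),
    "large-model": Pricing(input_per_million=3.0, output_per_million=15.0),
}


def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one run: tokens times the per-million price, for input and output."""
    price = PRICES[model]
    total = input_tokens * price.input_per_million + output_tokens * price.output_per_million
    return total / 1_000_000


# Hypothetical research task: reads a lot, writes a short summary.
research = {"input_tokens": 400_000, "output_tokens": 5_000}
for model in PRICES:
    print(f"{model}: {run_cost(model, **research):.3f}")
```

Because a research subagent is dominated by input tokens, the input price is what drives the difference between the two runs.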

Agent Teams: When Agents Need to Communicate

Subagents are ideal when the result matters and the process does not. But sometimes you need a security agent to challenge the conclusions of a performance agent, or a testing agent to alert an architecture agent about a discovered issue before it finishes implementation.

For this, Claude Code has agent teams, an experimental feature (Claude Code v2.1.32+). It is disabled by default and enabled with an environment variable:

// .claude/settings.json
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}

Team Architecture

A team consists of a team lead and teammates. The lead is the current session that creates the team. Teammates are fully independent Claude Code sessions, each with their own context window.

Coordination happens through two mechanisms:

Shared task list — the lead creates tasks; teammates claim and execute them. Dependencies are resolved automatically.
Mailbox — any teammate can send a message directly to any other, without going through a mediator.

This is the fundamental difference from subagents: team members talk to one another. A typical scenario is parallel investigation with competing hypotheses:

Create a team of 4 agents to investigate a performance regression
in the payment processing module. One investigates N+1 queries, another
memory leaks, another lock contention, another network latency. Have them
actively challenge each other's conclusions.
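
The two coordination mechanisms described above can be pictured with a toy model. This is not how Claude Code implements teams; `TeamBoard`, `Task`, and the teammate names are hypothetical and exist only to show claiming, dependency blocking, and direct messaging.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    name: str
    depends_on: set[str] = field(default_factory=set)
    owner: Optional[str] = None
    done: bool = False


class TeamBoard:
    """Toy model of a shared task list plus per-teammate mailboxes."""

    def __init__(self, tasks: list[Task]) -> None:
        self.tasks = {task.name: task for task in tasks}
        self.mailboxes: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def claim(self, teammate: str) -> Optional[Task]:
        """Give the teammate the first unowned task whose dependencies are done."""
        for task in self.tasks.values():
            blocked = any(not self.tasks[dep].done for dep in task.depends_on)
            if task.owner is None and not blocked:
                task.owner = teammate
                return task
        return None

    def complete(self, task_name: str) -> None:
        self.tasks[task_name].done = True

    def send(self, sender: str, recipient: str, text: str) -> None:
        """Deliver a message straight to the recipient, bypassing the lead."""
        self.mailboxes[recipient].append((sender, text))


board = TeamBoard([
    Task("check_n_plus_one"),
    Task("check_lock_contention"),
    Task("write_report", depends_on={"check_n_plus_one", "check_lock_contention"}),
])

first = board.claim("teammate-a")    # check_n_plus_one
second = board.claim("teammate-b")   # check_lock_contention
blocked = board.claim("teammate-c")  # None: write_report is still blocked

board.send("teammate-b", "teammate-a", "Lock waits explain the latency; N+1 looks unlikely.")
board.complete("check_n_plus_one")
board.complete("check_lock_contention")
report = board.claim("teammate-c")   # write_report is now unblocked
```

The message from teammate-b lands directly in teammate-a's mailbox, which is the part a subagent setup cannot express: there, the same information would have to travel through the lead.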

Quality-Control Hooks

Special hooks are available for teams in settings.json:

{
  "hooks": {
    "TeammateIdle": [
      {
        "matcher": ".*",
        "hooks": [{ "type": "command", "command": "./scripts/check-tests-pass.sh" }]
      }
    ],
    "TaskCompleted": [
      {
        "matcher": ".*",
        "hooks": [{ "type": "command", "command": "./scripts/validate-output.sh" }]
      }
    ]
  }
}

TeammateIdle fires when a teammate is about to go idle. If the hook command exits with code 2, the teammate is sent back to work with the hook's feedback. TaskCompleted fires when a task is about to be marked complete; exit code 2 blocks the completion and returns feedback in the same way. This lets you automatically refuse to mark a task "completed" while tests are failing.
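
A hook command can be any executable, so the check could also be written in Python. The sketch below is an assumption-laden example: the script path and the pytest command are hypothetical, and it assumes (as with other Claude Code hooks) that the feedback accompanying exit code 2 is read from stderr.

```python
import subprocess
import sys

BLOCK = 2  # exit code that, per the text above, sends the teammate back to work


def hook_exit_code(test_command: list[str]) -> int:
    """Run the test command; return 0 to allow, or 2 with feedback on stderr to block."""
    completed = subprocess.run(test_command, capture_output=True, text=True)
    if completed.returncode == 0:
        return 0
    # Keep the feedback short: only the tail of the test output is forwarded.
    tail = "\n".join(completed.stdout.splitlines()[-20:])
    print(f"Tests are failing; fix them before finishing.\n{tail}", file=sys.stderr)
    return BLOCK


# In the actual hook script (hypothetical path: ./scripts/check_tests_pass.py), finish with:
#     sys.exit(hook_exit_code(["pytest", "-q"]))
```

Returning the code from a function instead of calling sys.exit directly keeps the check easy to test outside of Claude Code.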

Important: agent teams consume significantly more tokens than a single session. Each teammate has its own context window, so token usage grows with the number of teammates. For routine tasks, a single agent with subagents is usually cheaper. Teams are justified when the value of parallel investigation or competing hypotheses outweighs the cost.

Check yourself

An agent team is investigating a bug: three participants are testing competing hypotheses. Agent A discovers that its hypothesis has been disproved by Agent B's data. How does Agent A communicate this to Agent B within the agent team? How does this differ from subagents?

Agent SDK: Programmatic Pipelines

The Claude Code CLI is excellent for interactive work. But if orchestration is part of a CI/CD pipeline or a production system, you need programmatic control. That is what the Claude Agent SDK is for (@anthropic-ai/claude-agent-sdk on npm for TypeScript, claude-agent-sdk on PyPI for Python).

The central function is query(). It runs a full agent loop: the model plans, calls tools, observes results, and repeats — until completion.

import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions

async def main():
    async for message in query(
        prompt="Find all TODO comments and create a summary",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Glob", "Grep"]
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)

asyncio.run(main())

The Parallel Pattern

The SDK's strength lies in explicit parallelism control. Here is a classic fan-out: several independent tasks launch simultaneously, and results are collected together:

import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions

async def review_dimension(dimension: str, pr_diff: str) -> str:
    """One agent — one review dimension."""
    result_text = ""
    async for message in query(
        prompt=f"Review this diff from the perspective of {dimension}:\n\n{pr_diff}",
        options=ClaudeAgentOptions(allowed_tools=["Read", "Grep"]),
    ):
        if hasattr(message, "result"):
            result_text = message.result
    return result_text

async def parallel_review(pr_diff: str):
    dimensions = ["security", "performance", "test coverage"]
    
    # All three agents start simultaneously
    tasks = [review_dimension(dim, pr_diff) for dim in dimensions]
    results = await asyncio.gather(*tasks)
    
    for dim, result in zip(dimensions, results):
        print(f"=== {dim.upper()} ===\n{result}\n")

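# Read the diff with a context manager so the file handle is closed promptly
with open("pr.diff") as diff_file:
    diff_text = diff_file.read()

asyncio.run(parallel_review(diff_text))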
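
Two practical concerns with fan-out are how many agents run at once and what happens when one of them fails. The sketch below addresses both with plain asyncio; `fake_review` is a hypothetical stand-in for a query()-based coroutine such as the one above, and the simulated failure exists only to show the error path.

```python
import asyncio


async def fake_review(dimension: str) -> str:
    """Stand-in for an agent call; swap in a real query()-based coroutine."""
    await asyncio.sleep(0.01)
    if dimension == "performance":
        raise RuntimeError("simulated agent failure")
    return f"{dimension}: no findings"


async def bounded_fan_out(dimensions: list[str], limit: int = 2) -> dict[str, str]:
    semaphore = asyncio.Semaphore(limit)  # at most `limit` reviews in flight

    async def guarded(dimension: str) -> str:
        async with semaphore:
            return await fake_review(dimension)

    # return_exceptions=True keeps one failure from discarding the other results.
    outcomes = await asyncio.gather(
        *(guarded(dimension) for dimension in dimensions), return_exceptions=True
    )

    report = {}
    for dimension, outcome in zip(dimensions, outcomes):
        if isinstance(outcome, Exception):
            report[dimension] = f"FAILED: {outcome}"
        else:
            report[dimension] = outcome
    return report


report = asyncio.run(bounded_fan_out(["security", "performance", "test coverage"]))
for dimension, text in report.items():
    print(f"{dimension} -> {text}")
```

asyncio.gather returns results in the order the coroutines were passed, which is why zipping them back onto the dimension list is safe.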

Sessions and Continuation

The Agent SDK supports saving and resuming sessions via resume. This enables multi-step pipelines where each step builds on the context of the previous one:

import asyncio
from claude_agent_sdk import ClaudeAgentOptions, SystemMessage, query

async def staged_pipeline(filepath: str):
    session_id = None

    # Step 1: reconnaissance. The init system message carries the session ID.
    async for msg in query(
        prompt=f"Analyze the structure of {filepath} and identify potential issues",
        options=ClaudeAgentOptions(allowed_tools=["Read", "Grep", "Glob"]),
    ):
        # Python SDK messages are typed objects, so check the class, not a "type" field
        if isinstance(msg, SystemMessage) and msg.subtype == "init":
            session_id = msg.data["session_id"]

    if session_id is None:
        raise RuntimeError("No session ID was reported; cannot resume")

    # Step 2: implementation. The resumed session keeps the context from step 1.
    async for msg in query(
        prompt="Now fix the identified issues, starting with the critical ones",
        options=ClaudeAgentOptions(
            resume=session_id,
            allowed_tools=["Read", "Edit", "Bash"],
        ),
    ):
        if hasattr(msg, "result"):
            print(msg.result)

Subagents in the SDK

Just as in interactive mode, the SDK lets you define specialized subagents via AgentDefinition. The lead agent delegates tasks to them through the Agent tool — include it in allowed_tools:

from claude_agent_sdk import AgentDefinition, ClaudeAgentOptions

options = ClaudeAgentOptions(
    allowed_tools=["Read", "Glob", "Grep", "Agent"],
    agents={
        "security-reviewer": AgentDefinition(
            description="Security expert: JWT, validation, SQL.",
            prompt="Analyze code for vulnerabilities. Always specify severity.",
            tools=["Read", "Grep"],
        ),
        "perf-reviewer": AgentDefinition(
            description="Performance expert: N+1, caches, indexes.",
            prompt="Find performance issues. Assess impact.",
            tools=["Read", "Grep", "Bash"],
        ),
    }
)

Check yourself

In the Agent SDK you ran the first query() to analyze code and saved the session_id. You then launch a second query() with `resume=session_id`. What exactly is passed to the second agent — and what does it NOT receive?

Patterns for Real-World Tasks

Fan-out → synthesis. One large task is split into N independent subtasks (by module, by check type, by hypothesis). All run in parallel; a synthesizing agent collects the results. The classic example is parallel PR review across dimensions: security, performance, tests.

Pipeline with context. Sequential steps where each one builds on the result of the previous: reconnaissance → planning → implementation → verification. This is convenient to implement via session_id + resume in the Agent SDK, or simply through a multi-step conversation in interactive mode.
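
The shape of such a pipeline can be sketched without any agent calls. In this illustration every stage is a hypothetical plain function standing in for an agent step; the point is only that each stage reads what earlier stages produced and adds its own output to a shared context.

```python
from typing import Callable

Stage = Callable[[dict], dict]


def run_pipeline(stages: list[tuple[str, Stage]], initial: dict) -> dict:
    """Run stages in order; each sees the accumulated context and extends it."""
    context = dict(initial)
    history = []
    for name, stage in stages:
        context.update(stage(context))
        history.append(name)
    context["history"] = history
    return context


# Hypothetical stages standing in for agent steps.
def reconnaissance(ctx: dict) -> dict:
    return {"issues": ["unvalidated input", "missing index"]}


def planning(ctx: dict) -> dict:
    return {"plan": [f"fix {issue}" for issue in ctx["issues"]]}


def implementation(ctx: dict) -> dict:
    return {"applied": list(ctx["plan"])}


def verification(ctx: dict) -> dict:
    return {"verified": len(ctx["applied"]) == len(ctx["issues"])}


result = run_pipeline(
    [
        ("reconnaissance", reconnaissance),
        ("planning", planning),
        ("implementation", implementation),
        ("verification", verification),
    ],
    {"target": "payments/"},
)
print(result["history"], result["verified"])
```

Unlike fan-out, nothing here can run in parallel: planning cannot start until reconnaissance has produced the issue list.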

Competing hypotheses. Particularly effective for debugging: several agents simultaneously investigate different possible causes of a problem. Through agent teams, they can challenge each other's conclusions. The hypothesis that survives scrutiny from the others is most likely correct.

Domain specialization. Persistent subagents in .claude/agents/: security-reviewer, test-writer, doc-updater, migration-helper. Each knows its domain, has access only to the tools it needs, and can use a cheaper model when appropriate.

When Orchestration Is Not Needed

Multi-agent pipelines add complexity and cost. Before building a team of five agents, ask yourself: can this task be solved sequentially in a reasonable amount of time? If yes — a single agent with a well-crafted prompt is more reliable, cheaper, and easier to debug.

Orchestration is justified when:

The task naturally decomposes into independent subtasks (parallel gains are real)
No agent needs another agent's result in order to do its own work (otherwise it is simply a pipeline)
The cost of additional tokens is offset by time savings or quality gains (competing hypotheses, multi-aspect review)
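
The first two criteria can be turned into a rough estimate before building anything. The durations and overhead below are invented numbers for illustration, and the model is deliberately crude: independent subtasks cost as long as the slowest one, dependent subtasks cost their sum, and orchestration adds a fixed overhead for setup and synthesis.

```python
def estimate(durations: list[float], overhead: float, independent: bool) -> dict[str, float]:
    """Compare one sequential agent against an orchestrated run (same time units)."""
    sequential = sum(durations)
    # Independent subtasks overlap; dependent ones still run one after another.
    core = max(durations) if independent else sum(durations)
    orchestrated = core + overhead
    return {
        "sequential": sequential,
        "orchestrated": orchestrated,
        "speedup": sequential / orchestrated,
    }


# Hypothetical subtask durations in seconds, plus 30 s of orchestration overhead.
independent_case = estimate([120, 90, 150], overhead=30, independent=True)
dependent_case = estimate([120, 90, 150], overhead=30, independent=False)

print(independent_case)
print(dependent_case)
```

With independent subtasks the orchestrated run finishes in half the time; with dependent ones it is slower than a single agent, because the overhead is paid without any overlap. Token cost is not modeled here and has to be weighed separately.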