Dynamic Workflows in Claude Code: A Complete Guide to Multi-Agent Harnesses

Author: Research compilation (Anthropic blog + Claude Code documentation)
Source: "A harness for every task", Anthropic Blog, 2026 (Thariq Shihipar & Sid Bidasaria)
Target: Claude Code users who want to orchestrate multi-agent workflows for complex tasks
Date: 2026-06-03

Part I — What Are Dynamic Workflows?

Chapter 1 — The Problem Workflows Solve

The default Claude Code harness is built for coding tasks — and coding tasks are what most Claude Code work resembles. But certain task classes push beyond what a single context window handles well:

Long-running tasks that exceed practical context limits
Massively parallel tasks where independent work would benefit from isolated contexts
Highly structured tasks that need adversarial checks or rubric-based evaluation
Tasks requiring different intelligence levels for different subtasks

Before dynamic workflows, these required building custom static harnesses using the Claude Agent SDK or claude -p — generic infrastructure that needed to handle every edge case.

The three failure modes of single-context-window execution:

Failure Mode	Description	Example
Agentic laziness	Claude stops before finishing a complex multi-part task and declares it done	Addresses 20 of 50 items in a security review and reports completion
Self-preferential bias	Claude prefers its own results when asked to verify or judge them	Code reviewer that wrote the code grades it higher than warranted
Goal drift	Gradual loss of fidelity to original objective across many turns; lossy compaction causes edge-case requirements to disappear	"Don't touch the auth module" constraint gets lost after several summarization rounds

Dynamic workflows combat all three by orchestrating separate Claude instances, each with its own context window and focused on an isolated goal.

Chapter 2 — Dynamic vs Static Workflows

Property	Static Workflow	Dynamic Workflow
Created by	Human (pre-written JavaScript/SDK)	Claude itself, on the fly
Scope	Generic — must handle all edge cases	Purpose-built for the specific task
Flexibility	Fixed structure	Adapts to task requirements
Quality	Good for known, repeatable patterns	Best for novel or complex tasks
Setup	Requires engineering effort	Just ask Claude, or say "ultracode"
Reusability	Highly reusable	Can be saved and shared as templates
Requires	Claude Agent SDK or claude -p	Built into Claude Code

Static workflows shine when you have a known, repeatable process that should always run the same way. Dynamic workflows shine when the structure of the work depends on the work itself — when you need Claude to inspect the task and design a harness for it.

Chapter 3 — How Dynamic Workflows Execute

Dynamic workflows are JavaScript files. When Claude builds a workflow, it writes a .js file that combines special orchestration functions with standard JavaScript built-ins (JSON, Math, Array, etc.).

Core orchestration capabilities:

Spawn subagents: Launch Claude instances with specific prompts, models, and isolation levels
Choose models per agent: Route different subtasks to Haiku (cheap/fast), Sonnet (balanced), or Opus (most capable)
Worktree isolation: Run agents in git worktrees so their file changes don't conflict
Session resumption: If a workflow is interrupted (user action, terminal quit), resuming the session picks up where it left off

Triggering workflows:

Ask Claude directly: "Use a workflow to..." or "Set up a workflow that..."
Use the trigger word ultracode — guarantees Claude Code builds a workflow rather than attempting the task inline

The workflow runs as a deterministic JavaScript program. The non-deterministic intelligence lives inside the spawned agents; the orchestrating code is plain, predictable JS. This split is what makes workflows debuggable, resumable, and shareable.

Part II — The Six Core Patterns

Chapter 4 — Pattern 1: Classify-and-Act

What it is: A classifier agent first determines the type of task or input, then routes to specialized agents or behaviors based on that classification.

Variants:

Upfront classification: Classify first, then route (most common)
Post-hoc classification: Do the work, then classify the output to determine how to format or present results

When to use:

Heterogeneous inputs that need different handling (support tickets with different categories)
When the work to be done depends on properties of the input you don't know upfront
When you want consistent output format regardless of input variation

Example prompt:

"Here's a folder of 80 resumes, use a workflow to rank them for the backend role
and double-check the top ten. Interview me using the AskUserQuestion tool for a rubric."

The workflow would: (1) classify resumes by experience level, (2) route to role-specific evaluators, (3) collect results, (4) rank.

Anti-patterns:

Don't use classify-and-act when inputs are uniform — it just adds latency
Don't use it for branching that can be determined by a simple regex or string check — that's plain JS, not an agent decision
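The upfront-classification variant can be sketched as a classifier call followed by a lookup into a table of handlers. This is a minimal illustration, not Claude Code's API: classifyAgent and the handlers are hypothetical stubs, and the keyword match stands in for a model only so the sketch runs on its own. If a keyword match were enough in practice, plain JS would be the right tool, as the anti-patterns note.

```javascript
// Hypothetical stand-in for a classifier agent; not a Claude Code API.
// The keyword match only simulates a model so the sketch is runnable.
async function classifyAgent(ticket) {
  if (/refund|charge|invoice/i.test(ticket.text)) return "billing";
  if (/crash|error|bug/i.test(ticket.text)) return "bug";
  return "other";
}

// One specialized handler per category (also stubs).
const handlers = new Map([
  ["billing", async (t) => ({ id: t.id, route: "billing-agent" })],
  ["bug", async (t) => ({ id: t.id, route: "bug-agent" })],
  ["other", async (t) => ({ id: t.id, route: "human-review" })],
]);

// Upfront classification: classify first, then route on the label.
async function classifyAndAct(tickets) {
  const results = [];
  for (const ticket of tickets) {
    const label = await classifyAgent(ticket);
    // Unknown labels fall back to "other" instead of throwing.
    const handler = handlers.get(label) ?? handlers.get("other");
    results.push(await handler(ticket));
  }
  return results;
}
```

Keeping the routing table in plain data means a new category is one more entry, and the classifier's output can be validated against the table's keys.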

Chapter 5 — Pattern 2: Fan-Out-and-Synthesize

What it is: Split a task into N independent subtasks, run an agent on each in parallel, then collect and synthesize results in a barrier step.

Input
  │
  ├──> Agent 1 (subtask A) ──┐
  ├──> Agent 2 (subtask B) ──┤
  ├──> Agent 3 (subtask C) ──┤── Synthesizer Agent ──> Output
  └──> Agent N (subtask N) ──┘
        (barrier: wait for all)

Why clean context windows matter: When agents work in isolation, their results don't cross-contaminate. Agent 2's findings about module B don't bias Agent 3's review of module C.

The synthesize step is a barrier: It waits for ALL fan-out agents to complete before merging their structured outputs into one result.

When to use:

Large number of smaller independent steps (code review across 50 files)
Each subtask benefits from focused context (research across separate topics)
Parallel execution dramatically reduces wall-clock time

Example prompts:

"Use a workflow to dig through #incidents in Slack for the past six months
and find recurring root causes where nobody has filed a ticket."

"Go through my blog post draft and using a workflow verify every technical
claim against the codebase, I don't want to ship anything wrong."

Structured output discipline: Each fan-out agent should return a structured object (JSON) that the synthesizer can merge mechanically. Free-form text is hard to combine; schemas force clarity.
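The barrier and the mechanical merge described above can be sketched with standard promises. reviewFile is a hypothetical stub standing in for a spawned agent, and the report shape is an assumption made for illustration. The real orchestration functions available inside a workflow may look different.

```javascript
// Hypothetical stand-in for spawning a reviewer agent on one file.
// Every agent returns the same structured shape: { file, findings }.
async function reviewFile(file) {
  const findings = file.endsWith(".sql")
    ? [{ file, issue: "unparameterized query" }]
    : [];
  return { file, findings };
}

async function fanOutAndSynthesize(files) {
  // Fan out: start every review at once; each promise is one isolated run.
  const pending = files.map((file) => reviewFile(file));
  // Barrier: allSettled resolves only after every run has finished or failed.
  const settled = await Promise.allSettled(pending);

  // Mechanical merge: flatten findings and record failures separately.
  const report = { reviewed: 0, findings: [], failed: [] };
  settled.forEach((outcome, i) => {
    if (outcome.status === "fulfilled") {
      report.reviewed += 1;
      report.findings.push(...outcome.value.findings);
    } else {
      report.failed.push(files[i]);
    }
  });
  return report;
}
```

Using Promise.allSettled rather than Promise.all is a design choice in this sketch: one failed agent is recorded in the report instead of discarding the results of all the others. The merged report could then be handed to a synthesizer agent for the final write-up.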

Chapter 6 — Pattern 3: Adversarial Verification

What it is: For every agent that produces an output, spawn a second agent whose sole job is to adversarially challenge that output against a rubric.

Task ──> Worker Agent ──> Output ──> Verifier Agent ──> Verified Output
                                         (adversarial)     or Rejection

Why adversarial matters: A worker agent has self-preferential bias toward its own output. A verifier with no knowledge of the worker's reasoning process is structurally more skeptical.

Skeptic persona pattern: Give the verifier agent explicit "skeptic" instructions — "assume the worker made mistakes, look for them specifically, do not accept the output unless you can actively verify each claim."

When to use:

Security reviews (worker finds vulnerabilities; verifier challenges each finding)
Factual research (worker finds sources; verifier checks source quality)
Code migrations (worker makes change; verifier checks correctness)
Any task where false positives are costly and you need high confidence

Verifier prompt template:

"You are an adversarial reviewer. The worker agent produced the following output.
Assume the worker made at least one mistake. Your job is to find it.
Reject the output unless you can independently verify each claim against [source].
Do not trust the worker's reasoning — re-derive each conclusion yourself."
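A worker/verifier pair might be wired together roughly as follows. Both agents are hypothetical stubs, and the retry-with-feedback loop is an assumption added for illustration. The source describes only acceptance or rejection, so treat the retry as one possible extension.

```javascript
// Hypothetical stubs so the sketch runs without a model.
// The worker corrects itself once it receives feedback.
async function workerAgent(task, feedback) {
  return { claim: feedback ? task.truth : task.firstGuess };
}

// The verifier re-checks the claim against the source of truth.
// It receives the output only, never the worker's reasoning.
async function verifierAgent(output, sourceOfTruth) {
  return output.claim === sourceOfTruth
    ? { verdict: "accept" }
    : { verdict: "reject", reason: "claim does not match source" };
}

async function verifiedRun(task, maxAttempts = 3) {
  let feedback = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const output = await workerAgent(task, feedback);
    const review = await verifierAgent(output, task.truth);
    if (review.verdict === "accept") {
      return { status: "verified", output, attempts: attempt };
    }
    feedback = review.reason; // passed to the next worker attempt
  }
  return { status: "rejected", attempts: maxAttempts };
}
```

With a task such as { firstGuess: "Postgres 15", truth: "Postgres 14" }, the first attempt is rejected and the second is accepted. Capping the attempts keeps a stubborn disagreement from looping indefinitely.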

Chapter 7 — Pattern 4: Generate-and-Filter

What it is: Generate many candidate outputs, then filter by quality, verification, and deduplication to return only the highest-quality results.

Prompt ──> Generator (N options) ──> Filter Agent ──> Dedup ──> Top K Results

When to use:

Creative tasks with qualitative criteria (naming, design, taglines)
Idea generation where quantity comes first and quality second
When you want diversity of approaches before converging

Example prompt:

"I need a name for this CLI tool. Use a workflow to brainstorm a bunch of options
and run a tournament to pick the top 3."

Diversity tactics:

Spawn generators with explicit "differentiate from these" instructions to force variety
Use different temperatures or different model sizes for different generators
Have generators target different sub-niches (one for technical names, one for whimsical, etc.)
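The filter, dedup, and top-K stages of the pipeline are plain data processing once each candidate carries a score. The sketch below assumes a filter agent has already attached a numeric score to every candidate. The field names and the normalization rule are illustrative choices, not part of any documented API.

```javascript
// Normalize a candidate so near-identical names collapse to one key.
function dedupKey(name) {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// candidates: [{ name, score }], where score would come from a filter
// agent grading against a rubric. Here the scores are supplied directly.
function filterDedupTopK(candidates, { minScore, k }) {
  const best = new Map();
  for (const c of candidates) {
    if (c.score < minScore) continue; // quality filter
    const key = dedupKey(c.name);
    const seen = best.get(key);
    // Among duplicates, keep the highest-scoring spelling.
    if (!seen || c.score > seen.score) best.set(key, c);
  }
  return [...best.values()]
    .sort((a, b) => b.score - a.score) // highest score first
    .slice(0, k);                      // top K
}
```

For example, "Fast-CLI" and "fastcli" normalize to the same key, so only the better-scored of the two survives into the final list.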

Chapter 8 — Pattern 5: Tournament

What it is: Instead of dividing work, have N agents compete on the same task using different approaches. A judging agent compares pairs of outputs until a winner emerges.

Task ──> Agent A (approach 1) ──┐
Task ──> Agent B (approach 2) ──┤──> Judge ──> Pairwise ──> Winner
Task ──> Agent C (approach 3) ──┘             Comparisons

Why pairwise works better than absolute scoring: Comparative judgment ("which of A vs B is better?") is more reliable than asking a judge to score each output on a 1-10 scale. The judge only needs to make local comparisons, not global assessments.

When to use:

Taste-based decisions (design, naming, copy)
Solutions where the "best" approach is unclear until you see alternatives
Sorting large lists where qualitative judgment matters
Any case where you want to explore the space before converging

Bracket structure: For N candidates, the deterministic tournament loop holds the bracket in the workflow's own state. Each judge's context contains only the comparison it is currently making.

Tournament formats:

Format	Comparisons	Use For
Single elimination	N-1	Quick top-1 selection
Round-robin	N*(N-1)/2	Need full ranking
Swiss / bracket-with-byes	N log N	Larger lists, balanced
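The single-elimination row can be sketched as a loop that holds the bracket while a judge sees two candidates at a time. The judge here is a hypothetical stub (the longer string wins) so that the sketch runs. A real judge would be an agent call with a rubric.

```javascript
// Hypothetical judge stub: a real judge would be an agent that sees only
// the two candidates. Here the longer string wins so the sketch runs.
async function judge(a, b) {
  return a.length >= b.length ? a : b;
}

// Single elimination. The bracket lives in this plain JS loop; each
// judge call receives exactly two candidates.
async function singleElimination(candidates) {
  if (candidates.length === 0) throw new Error("need at least one candidate");
  let round = [...candidates];
  let comparisons = 0;
  while (round.length > 1) {
    const next = [];
    for (let i = 0; i + 1 < round.length; i += 2) {
      next.push(await judge(round[i], round[i + 1]));
      comparisons += 1;
    }
    // With an odd count, the last candidate gets a bye into the next round.
    if (round.length % 2 === 1) next.push(round[round.length - 1]);
    round = next;
  }
  return { winner: round[0], comparisons };
}
```

Every comparison eliminates exactly one candidate, which is why the count comes out to N-1 regardless of how byes fall.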

Chapter 9 — Pattern 6: Loop Until Done

What it is: Spawn agents repeatedly, checking a stop condition after each round, rather than a fixed number of passes.

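// Sketch only: spawnAgent and stopConditionMet are placeholders, not real APIs.
async function loopUntilDone(remainingWork) {
  const results = [];
  do {
    results.push(await spawnAgent(remainingWork)); // one round of work
  } while (!stopConditionMet(results));            // re-check after every round
  return results;
}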

Stop conditions:

No new findings (security scan found nothing new this pass)
No more errors in logs
All items processed (queue empty)
Quality threshold met (review agent approves)

When to use:

Tasks with unknown amounts of work (security scan until clean)
Iterative refinement until quality threshold
Continuous triage (pair with /loop for ongoing operation)

Example prompt:

"This test fails maybe 1 in 50 runs. Set up a workflow to reproduce it,
form theories and adversarially test them in worktrees /goal don't stop
until one theory works."

Always set a maximum iteration cap. Even with a clear stop condition, hardware bugs, model errors, or external service flakes can cause infinite loops. Cap at N rounds and surface a "stopped early" signal if the cap is hit.
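A capped loop with a "stopped early" signal might look like the sketch below. runRound is a hypothetical callback standing in for one round of spawned agents. The "no new findings" stop condition and the shape of the return value are assumptions chosen for illustration.

```javascript
// runRound is a hypothetical stand-in for "spawn agents on remaining work".
// It returns the findings from that round; an empty array means clean.
async function loopWithCap(runRound, maxRounds = 10) {
  const allFindings = [];
  for (let round = 1; round <= maxRounds; round++) {
    const found = await runRound(round);
    if (found.length === 0) {
      // Stop condition met: this pass found nothing new.
      return { findings: allFindings, rounds: round, stoppedEarly: false };
    }
    allFindings.push(...found);
  }
  // Cap hit before the stop condition: report it instead of looping on.
  return { findings: allFindings, rounds: maxRounds, stoppedEarly: true };
}
```

Callers can branch on stoppedEarly to decide whether to escalate to a human or rerun with a higher cap.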

Part III — Use Cases

Chapter 10 — Migrations and Refactors

Large-scale code migrations benefit enormously from workflows because they decompose naturally into independent parallel units (files, modules, callsites).

The Bun/Rust pattern (as used in the real Bun Zig→Rust rewrite):

Enumerate all units of work (callsites, failing tests, modules)
Spawn a subagent for each unit in its own worktree
Each agent makes its fix
Adversarial review agent checks each change
Merge approved changes

Key performance tip: Instruct agents to avoid resource-intensive commands (large grep, full rebuilds) so you can maximize parallelization without exhausting machine resources.

Example prompt:

"Use a workflow to rename our User model to Account everywhere."

Migration checklist:

Step	Owner
Enumerate callsites	Orchestrator (deterministic JS)
Generate per-callsite fix	Worker agent (Sonnet usually sufficient)
Verify each fix compiles + tests pass	Verifier agent
Resolve cross-callsite conflicts	Synthesizer agent (Opus)
Final integration check	Single Opus pass over merged result

Chapter 11 — Deep Research

Claude Code's built-in /deep-research skill is itself a dynamic workflow. It demonstrates the fan-out-and-synthesize pattern applied to research:

Fan out: Run N web searches in parallel
Fetch: Pull source content for each result
Verify: Adversarially verify claims from each source
Synthesize: Merge findings into a cited report with a barrier step

Beyond web search — research from internal sources:

Mine Slack channels for status patterns, incident trends
Explore how a feature works by fanning out across the codebase
Compile reports from JIRA/Linear tickets in parallel

Citation discipline: Force every claim in the synthesizer output to carry a citation back to the source that produced it. The synthesizer drops any unverifiable claim rather than smoothing it over.

Chapter 12 — Deep Verification

The inverse of research: you have a document and want to verify every factual claim in it.

Workflow pattern:

Agent 1: Scan document, extract all factual claims as structured list
Fan out: One agent per claim — verify each independently
Verifier agents: Check that each source is high-quality (not circular references, not outdated)
Synthesize: Report verified, unverified, and wrong claims

Example prompt:

"Go through my blog post draft and using a workflow verify every technical claim
against the codebase, I don't want to ship anything wrong."

Output format:

Claim	Status	Source	Notes
"Our API supports HTTP/2"	Verified	`src/server/http.ts:42`	confirmed
"Latency is under 100ms"	Unverified	—	no benchmark in repo
"We use Postgres 15"	Wrong	`docker-compose.yml` shows 14	needs correction

Chapter 13 — Sorting at Scale

Sorting 1000+ items by qualitative measurement (bug severity, resume quality, support ticket priority) degrades badly in a single context window — the list doesn't fit and quality collapses.

Workflow approaches:

Approach	Best For	How
Tournament	Small-medium lists (<200), taste decisions	Pairwise comparisons, bracket-style
Parallel bucket-rank then merge	Large lists (1000+)	Fan out into buckets, rank within each, merge ranks
Pairwise pipeline	Any size, highest accuracy	Each comparison is its own agent

Key insight: Each comparison is its own agent. The deterministic loop holds the bracket and the running order; no agent ever sees the full list, only the two items it is comparing.

Bucket-rank merge example:

1000 resumes
  ├─> Bucket A (250) → ranked locally → [A1..A250]
  ├─> Bucket B (250) → ranked locally → [B1..B250]
  ├─> Bucket C (250) → ranked locally → [C1..C250]
  └─> Bucket D (250) → ranked locally → [D1..D250]
        merge-rank agent → final ranking
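The bucket-and-merge flow in the diagram can be sketched as a split step plus a k-way merge over already-ranked buckets. The sketch uses numbers and a numeric comparison as a stand-in for a pairwise judge agent. The function names are illustrative, and the diagram's single merge-rank agent is replaced here by a deterministic merge that asks for one pairwise decision at a time.

```javascript
// Split a list into contiguous buckets of at most `size` items.
function toBuckets(items, size) {
  const buckets = [];
  for (let i = 0; i < items.length; i += size) {
    buckets.push(items.slice(i, i + size));
  }
  return buckets;
}

// Merge already-ranked buckets. better(a, b) returns true when a should
// rank above b; in a real workflow it would be a pairwise judge agent.
function mergeRanked(buckets, better) {
  const heads = buckets.map(() => 0); // next unmerged index per bucket
  const merged = [];
  for (;;) {
    let pick = -1;
    for (let b = 0; b < buckets.length; b++) {
      if (heads[b] >= buckets[b].length) continue; // bucket exhausted
      if (pick === -1 || better(buckets[b][heads[b]], buckets[pick][heads[pick]])) {
        pick = b;
      }
    }
    if (pick === -1) return merged; // every bucket exhausted
    merged.push(buckets[pick][heads[pick]++]);
  }
}

// Usage with numeric "quality scores" standing in for resumes.
const scores = [3, 9, 1, 7, 5, 8];
const better = (a, b) => a > b;
const ranked = toBuckets(scores, 2).map((bucket) =>
  [...bucket].sort((a, b) => b - a) // local ranking within each bucket
);
const finalRanking = mergeRanked(ranked, better);
console.log(finalRanking); // [ 9, 8, 7, 5, 3, 1 ]
```

The merge only ever compares the current heads of the buckets, so no single comparison needs more than two items in view.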

Chapter 14 — Memory and Rule Adherence

Problem: Even rules in CLAUDE.md get missed or misapplied, especially as context grows.

Solution — one verifier per rule:

Create a workflow with an explicit list of rules
Spawn one verifier agent per rule
Each verifier checks ONLY its assigned rule — focused, no distraction
A skeptic agent reviews the verifiers' findings to filter out false positives

Mining rules from sessions (reverse direction):

Read last N sessions (fan out)
Cluster corrections you keep making (parallel clustering agents)
For each candidate rule: adversarially verify — would this rule have prevented a real mistake?
Distill survivors back into CLAUDE.md

Example prompt:

"Using a workflow, go through my last 50 sessions and mine them for corrections
I keep making and turn the recurring ones into CLAUDE.md rules"

Chapter 15 — Root-Cause Investigation

Problem: Single-context debugging invites self-preferential bias. Claude forms a hypothesis early, then cherry-picks evidence to support it.

Solution — structurally separated evidence streams:

Fan out: Separate agents for different evidence sources (logs agent, code agent, data agent)
Each agent generates hypotheses independently from its evidence slice
Panel of verifier + refuter agents challenges each hypothesis
Synthesizer consolidates surviving hypotheses

This is not just for code:

Sales analytics: "Why did sales drop in March?" — separate agents for marketing data, product changes, competitive moves, economic factors
Data engineering: "Why did this pipeline fail?" — separate agents for each system component
Post-mortem exercises: Any root-cause analysis benefits from structurally independent investigation

Hypothesis lifecycle:

Stage	Agent	Output
Evidence gather	Logs / Code / Data agents (parallel)	Structured findings per source
Hypothesis form	Per-source theorist	Candidate root causes
Adversarial test	Refuter agent	Pass / refuted / inconclusive
Consolidate	Synthesizer	Ranked surviving hypotheses

Chapter 16 — Triaging at Scale

Every team has a queue (support tickets, bug reports, PR reviews) that can't be fully processed by humans.

Triage workflow pattern:

Classify each item by category, severity, and urgency
Deduplicate against already-tracked items
Route: attempt automated fix, escalate to human, or archive

Quarantine pattern (security-critical): Agents that READ untrusted public content are not allowed to take high-privilege actions. A separate privileged agent acts on the information only after the reading agent has summarized it. This keeps a prompt injection embedded in untrusted content from directly triggering destructive actions.
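One way to enforce the quarantine boundary in the orchestrating code is to let only an allowlisted, typed summary cross from the reader to the privileged side. This is a sketch under assumptions: the action names, the summary fields, and the stubbed reader are all hypothetical, and the allowlist is an added safeguard that the source does not spell out.

```javascript
const ALLOWED_ACTIONS = new Set(["label", "escalate", "archive"]);

// Hypothetical reader stub. In a real workflow this agent would read the
// untrusted text with no write or shell tools available to it. The stub
// simulates a compromised reader when the text contains an injection.
function readerAgent(untrustedText) {
  const injected = /ignore previous instructions/i.test(untrustedText);
  return {
    action: injected ? "delete_repo" : "label",
    label: "bug",
    note: untrustedText.slice(0, 80),
  };
}

// Trust boundary: only allowlisted, typed fields cross to the privileged side.
function sanitize(summary) {
  if (!ALLOWED_ACTIONS.has(summary.action)) {
    return { action: "escalate", label: null }; // unknown action goes to a human
  }
  const label = typeof summary.label === "string" ? summary.label.slice(0, 32) : null;
  return { action: summary.action, label }; // free text (note) is dropped
}

// Privileged side: acts on the sanitized summary, never on the raw text.
function privilegedAct(clean) {
  return `performed ${clean.action}` + (clean.label ? ` (${clean.label})` : "");
}
```

Because the privileged function never receives the raw text or the free-form note, an instruction smuggled into a ticket has no direct path to a destructive action in this sketch.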

Continuous triage: Pair with /loop to run triage at regular intervals — the workflow runs, processes the queue, then sleeps until the next cycle.

Example prompt:

"Use a workflow to dig through #incidents in Slack for the past six months
and find recurring root causes where nobody has filed a ticket."

Chapter 17 — Exploration and Taste

Taste-based decisions (naming, design, copy, architecture style) benefit from:

Many candidate options (generate-and-filter)
Structured evaluation against a rubric
Tournament-style selection to converge on the best

Pattern:

Generate N candidates
Give review agent a rubric for what "good" looks like
Review agent iterates until rubric criteria are met, or tournament selects the winner

Example prompts:

"Take my business plan and run a workflow where different agents tear it apart
from an investor's, a customer's, and a competitor's perspective."

"I need a name for this CLI tool. Use a workflow to brainstorm a bunch of options
and run a tournament to pick the top 3."

Chapter 18 — Evals

Build lightweight evals for your own skills, prompts, or code:

Spin off N agents in isolated worktrees — each runs the thing being evaluated
Comparison agents grade outputs pairwise against a rubric
Aggregate grades to produce a ranked result
Optionally: use the lowest-scoring outputs to refine the evaluated item and re-run

When to use: When you've built a skill or feature and want to measure quality before shipping, or when you want to continuously monitor quality as you make changes.

Eval pipeline:

Step	Purpose
Define rubric	Concrete criteria the output must satisfy
Generate test inputs	N representative scenarios
Run candidate	Each worktree runs the target skill/prompt
Pairwise grade	Comparison agent picks better of each pair
Rank	Aggregate pairwise wins into a ranking
Refine	Use the worst outputs to improve the target
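The "Rank" step reduces to counting wins once each pairwise grade is recorded as a winner and a loser. The record shape below is an assumption for illustration. Win counting is also only one simple aggregation choice, and the source does not prescribe a specific method.

```javascript
// results: one entry per pairwise comparison, as a comparison agent might
// report it. The { winner, loser } shape is an illustrative assumption.
function rankByWins(results) {
  const wins = new Map();
  for (const { winner, loser } of results) {
    wins.set(winner, (wins.get(winner) ?? 0) + 1);
    if (!wins.has(loser)) wins.set(loser, 0); // losers still appear in the ranking
  }
  return [...wins.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0])) // wins desc, then name
    .map(([candidate, count]) => ({ candidate, wins: count }));
}
```

The entries at the bottom of the returned list are the natural inputs to the "Refine" step.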

Chapter 19 — Model and Intelligence Routing

Not every subtask needs Opus. A well-designed workflow routes subtasks to the appropriate model:

Task Type	Recommended Model	Reason
Classification, routing decisions	Haiku	Fast, cheap, sufficient for binary decisions
Standard implementation, search	Sonnet	Good balance of speed and capability
Complex reasoning, synthesis, review	Opus	Maximum capability for judgment tasks

Classifier agent pattern:

Classifier agent researches the task (reads relevant files, counts lines, assesses complexity)
Based on findings, routes to Sonnet or Opus for the actual work
Result: expensive Opus tokens used only where they're needed

Example: "Explain how the auth module works" — the right model depends on how many files are in the auth module. Classifier reads the file tree, picks the model, then that model does the explanation.

Routing heuristics:

If the work fits in <10k tokens and has a clear right answer → Haiku
If the work needs to read multiple files and write code → Sonnet
If the work needs judgment, synthesis, or adversarial review → Opus
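These heuristics translate into a small routing function. The field names are hypothetical, and the precedence (judgment first, then size) is an assumption, since the list above does not say which rule wins when several apply. The return values are tier names, not model identifiers.

```javascript
// The field names (estimatedTokens, hasClearAnswer, needsJudgment) are
// illustrative assumptions; a classifier agent would fill them in after
// inspecting the task.
function pickModel(task) {
  // Judgment-heavy work goes to the most capable tier regardless of size.
  if (task.needsJudgment) return "opus";
  // Small work with a clear right answer can use the cheapest tier.
  if (task.estimatedTokens < 10_000 && task.hasClearAnswer) return "haiku";
  // Everything else, including multi-file reading and code writing.
  return "sonnet";
}
```

Keeping this decision in deterministic code makes the routing auditable: the classifier agent supplies the facts, and the rule that maps facts to a tier stays fixed.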

Part IV — Practical Usage

Chapter 20 — Triggering Workflows

Two ways to start a workflow:

Natural language: "Use a workflow to..." / "Set up a workflow that..." / "Build a workflow for..."
Ultracode trigger word: Typing ultracode in your prompt guarantees Claude Code builds a workflow rather than attempting the task inline

Quick workflows: Workflows aren't only for large tasks. Prompt for "quick workflow" to get a small adversarial check, assumption validation, or fast parallel search.

Detailed prompting gets better workflows: The more specific you are about what pattern you want (fan-out, tournament, adversarial verification), the better the generated workflow. Claude is intelligent enough to implement any of the six patterns if you name them.

Trigger phrase reference:

You type	What happens
"Use a workflow to..."	Claude proposes and builds a workflow
"Set up a workflow that..."	Same, often slightly more interactive
"Build a quick workflow..."	Lightweight workflow, small budget
"ultracode <task>"	Guaranteed workflow path, no inline attempt

Chapter 21 — Prompting Strategies

Name the pattern explicitly:

"Use a fan-out-and-synthesize workflow to..."
"Build an adversarial verification workflow that..."
"Run a tournament workflow to decide between..."

Specify models where it matters:

"Use Haiku agents for the classification step and Opus for the final synthesis"

Set stop conditions clearly:

"...don't stop until at least one hypothesis survives adversarial review"

Specify parallelism constraints:

"Don't use resource-intensive commands so we can parallelize maximally"

Combine with /goal for hard completion requirements:

"Use a workflow to fix all lint errors /goal zero lint errors remaining"

Combine with /loop for recurring workflows:

"Set up a triage workflow for new GitHub issues, then /loop to run it hourly"

Prompting checklist:

Did you name the pattern (fan-out, tournament, etc.)?
Did you specify the stop condition or completion criteria?
Did you set a token or time budget?
Did you route models appropriately (Haiku/Sonnet/Opus)?
Did you call out any invariants ("don't touch X")?

Chapter 22 — Token Budgets

Dynamic workflows use more tokens than single-context execution — sometimes significantly more. Each spawned agent uses its own context.

Setting explicit budgets:

"Use a workflow for this, budget 10k tokens"
"Build a quick workflow, keep it under 5k tokens total"

Cost optimization strategies:

Use cheap models (Haiku) for classification and routing steps
Set low token budgets for each individual agent where appropriate
Limit parallelism when running many agents simultaneously
Use "quick workflow" for small tasks rather than heavy orchestration
Ask yourself: does this task truly benefit from multiple agents, or is it a standard coding task that the default harness handles well?
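The "limit parallelism" strategy can be sketched as a bounded-concurrency map. The worker callback stands in for an agent spawn. Nothing here is a Claude Code API, and whether the workflow runtime already offers a built-in limit is not something this sketch assumes.

```javascript
// Run `worker` over every item with at most `limit` calls in flight.
// `worker` stands in for an agent spawn; this is not a Claude Code API.
async function mapWithLimit(items, limit, worker) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array(items.length);
  let next = 0;
  async function lane() {
    // Each lane claims the next unprocessed index until none remain.
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }
  // Start up to `limit` lanes; together they drain the whole list.
  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => lane());
  await Promise.all(lanes);
  return results; // same order as `items`
}
```

A call such as mapWithLimit(files, 4, reviewOneFile), with reviewOneFile as a placeholder name, would keep four reviews running at a time, which bounds both machine load and the rate of token spend.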

When workflows are NOT worth the token cost:

Standard coding tasks (write a function, fix a bug, add a test)
Tasks that fit comfortably in one context window
Tasks that don't benefit from parallelism or adversarial review
Quick one-off changes

Budget reference rules of thumb:

Workflow scale	Typical token spend	When
Quick check	2k–5k	Single adversarial pass, validation
Standard fan-out	20k–80k	5–20 agents on a focused task
Deep research	100k–500k	Many sources, deep verification
Large migration	500k+	Hundreds of worktrees, full review pass

Chapter 23 — Saving and Sharing Workflows

Saving a workflow:

Press s in the workflow menu while a workflow is running
Workflow is saved to ~/.claude/workflows/

Sharing via skills:

Put the JavaScript workflow files in your skill folder
Reference them in SKILL.md
Distribute the skill (GitHub, npm, Claude Code skills marketplace)

Using shared workflows as templates:

When receiving a workflow from someone else, prompt Claude to treat it as a template rather than a verbatim script:

"Use this workflow as a template but adapt it for my specific codebase structure"

This allows flexible reuse without hard-coded assumptions from the original context.

Versioning shared workflows:

Treat workflows like code — commit them, review them, tag releases
Document the input contract (what the workflow expects) and output contract (what it produces)
Include a sample input/output pair in the skill folder for grounding

Chapter 24 — Combining with /loop and /goal

/loop integration: Pair repeatable workflows (triage, monitoring, rule-checking) with /loop to run at regular intervals:

"Set up a triage workflow for the #bugs Slack channel, pair with /loop to check every 4 hours"

/goal integration: Set a hard stop condition that the workflow must achieve:

"Use a workflow to find and fix flaky tests /goal all tests passing reliably over 100 runs"

Combined pattern: /loop drives the cadence; /goal defines when the loop can stop.

Practical recipes:

Goal	Recipe
Hourly issue triage	workflow + `/loop 1h`
Fix every flaky test until green	workflow + `/goal all tests green`
Daily CLAUDE.md rule audit	workflow + `/loop 24h`
Migrate until zero callsites left	workflow + `/goal zero remaining callsites`

Part V — Reference

Chapter 25 — Pattern Selection Guide

You want to...	Use this pattern
Handle heterogeneous inputs differently	Classify-and-Act
Process N items independently in parallel	Fan-Out-and-Synthesize
Check work for errors or misses	Adversarial Verification
Generate many options then pick best	Generate-and-Filter or Tournament
Make a taste-based decision	Tournament
Run until something is done	Loop Until Done
Combine multiple patterns	Compose them — patterns are composable

Composition example: A migration is Fan-Out (per file) + Adversarial Verification (per change) + Loop Until Done (until zero remaining callsites) + Classify-and-Act (route trivial vs complex callsites to different models).

Chapter 26 — Use Case Quick Reference

Use Case	Primary Pattern	Example
Large migration/refactor	Fan-out + Adversarial	Rename User to Account everywhere
Deep research	Fan-out + Synthesize	Research Slack incidents, find patterns
Claim verification	Fan-out + Adversarial	Verify every claim in a blog post
Sorting at scale	Tournament or Bucket-rank	Rank 80 resumes for a role
CLAUDE.md rule mining	Fan-out + Adversarial	Mine last 50 sessions for rules
Root-cause analysis	Fan-out + Adversarial	Debug flaky test across N hypotheses
Continuous triage	Loop + Classify-and-Act	Process support queue hourly
Design/naming	Generate-and-Filter + Tournament	Pick top 3 names for CLI tool
Evals	Fan-out + Pairwise comparison	Grade skill outputs against rubric
Model routing	Classify-and-Act	Route auth questions to Opus, trivial to Haiku

Chapter 27 — Failure Mode Reference

Failure Mode	What Happens	Workflow Fix
Agentic laziness	Claude stops at 40/50 items, says "done"	Each agent has a fixed, bounded scope; completion is measurable
Self-preferential bias	Claude grades its own work too generously	Separate verifier agents have no access to the worker's reasoning process
Goal drift	"Don't touch auth" forgotten after compaction	Each agent's goal is explicit in its spawn prompt; loop context holds invariants

Diagnostic questions when a workflow goes wrong:

Did the agent's spawn prompt include the invariant that was violated?
Was the stop condition measurable, or fuzzy?
Did the verifier have independent access to the source of truth, or was it just looking at the worker's output?
Was the task split fine enough that each agent could finish within its budget?

Chapter 28 — Example Prompt Gallery

Debugging:

"This test fails maybe 1 in 50 runs. Set up a workflow to reproduce it, form theories
and adversarially test them in worktrees /goal don't stop until one theory works."

Knowledge mining:

"Using a workflow, go through my last 50 sessions and mine them for corrections I keep
making and turn the recurring ones into CLAUDE.md rules"

Incident analysis:

"Use a workflow to dig through #incidents in Slack for the past six months and find
recurring root causes where nobody has filed a ticket."

Adversarial review:

"Take my business plan and run a workflow where different agents tear it apart from
an investor's, a customer's, and a competitor's perspective."

Structured hiring:

"Here's a folder of 80 resumes, use a workflow to rank them for the backend role and
double-check the top ten. Interview me using the AskUserQuestion tool for a rubric."

Naming with tournament:

"I need a name for this CLI tool. Use a workflow to brainstorm a bunch of options
and run a tournament to pick the top 3."

Large refactor:

"Use a workflow to rename our User model to Account everywhere."

Technical fact-checking:

"Go through my blog post draft and using a workflow verify every technical claim
against the codebase, I don't want to ship anything wrong."

Closing Notes

Dynamic workflows shift the unit of work in Claude Code from "a single Claude turn" to "an orchestrated program of Claude turns." The orchestration is deterministic JavaScript; the intelligence inside each step is a focused Claude agent with its own context.

The most important skill is recognizing when a task wants a workflow rather than a single turn. The three failure modes — agentic laziness, self-preferential bias, and goal drift — are your signals. When you see them coming, reach for the right pattern from the six core patterns and compose as needed.

When in doubt: start with a quick workflow, set a clear stop condition, name the pattern explicitly, route models by intelligence required, and adversarially verify anything that matters.
