
Advisor Strategy — Smart Model Pairing for Cost-Efficiency


TL;DR: Instead of a smart orchestrator delegating to cheaper workers, flip it: a cheap model (Sonnet/Haiku) does the work and consults an expensive model (Opus) only when stuck. In Anthropic's reported benchmarks, Haiku with an Opus advisor roughly doubled Haiku's BrowseComp accuracy while costing about 85% less than Sonnet alone.

The Inversion

Traditional multi-agent pattern:

Expensive Orchestrator (Opus)
↓ delegates to
Cheap Workers (Sonnet/Haiku)

Advisor Strategy (inverted):

Cheap Executor (Sonnet/Haiku)
↓ consults when stuck
Expensive Advisor (Opus)

The executor handles tasks end to end and escalates to the advisor only for difficult decisions, so the expensive model's reasoning is spent only where it is needed.
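The control flow can be sketched in plain Python. This is an illustrative simulation under stated assumptions, not Anthropic's implementation: `cheap_executor`, `expensive_advisor`, the `StepResult` type, and the `stuck` flag are hypothetical stand-ins for real model calls, and the rule that a task containing "hard" needs advice exists only to make the example deterministic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepResult:
    output: str
    stuck: bool  # hypothetical signal: the executor could not decide on its own


def cheap_executor(task: str, hint: Optional[str] = None) -> StepResult:
    """Stand-in for a Sonnet/Haiku call; tasks containing 'hard' need a hint."""
    if "hard" in task and hint is None:
        return StepResult(output="", stuck=True)
    suffix = f" (using advice: {hint})" if hint else ""
    return StepResult(output=f"done: {task}{suffix}", stuck=False)


def expensive_advisor(task: str) -> str:
    """Stand-in for an Opus call that returns guidance, not the final answer."""
    return f"plan for '{task}'"


def run_workflow(tasks: list[str], max_uses: int = 3) -> tuple[list[str], int]:
    """Run every task on the executor, consulting the advisor only when stuck."""
    outputs: list[str] = []
    advisor_calls = 0
    for task in tasks:
        result = cheap_executor(task)
        if result.stuck and advisor_calls < max_uses:
            advisor_calls += 1
            result = cheap_executor(task, hint=expensive_advisor(task))
        outputs.append(f"unresolved: {task}" if result.stuck else result.output)
    return outputs, advisor_calls


outputs, calls = run_workflow(
    ["rename variable", "hard: fix race condition", "update docs"]
)
print(calls)  # 1: the advisor is consulted only for the single hard task
```

The executor stays in charge of every task, including the hard one. The advisor contributes a hint and never produces the final output, which is what distinguishes this from an orchestrator delegating work.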

Why This Works

| Traditional Pattern | Advisor Strategy |
| --- | --- |
| Expensive model runs constantly | Expensive model called rarely |
| Orchestrator sees everything | Advisor sees only hard problems |
| High base cost | Low base cost |
| Overhead on simple tasks | Simple tasks stay simple |

Key insight: Most subtasks in an agentic workflow don’t need the smartest model. Only the genuinely hard decisions do.

Benchmark Results

Sonnet + Opus Advisor

| Metric | Sonnet Alone | Sonnet + Opus Advisor |
| --- | --- | --- |
| SWE-bench Multilingual | baseline | +2.7 percentage points |
| Cost | 100% | 88.1% (11.9% savings) |

Better performance AND lower cost.

Haiku + Opus Advisor

| Metric | Haiku Alone | Haiku + Opus Advisor |
| --- | --- | --- |
| BrowseComp accuracy | 19.7% | 41.2% (2x improvement) |
| Cost vs Sonnet | ~15% | ~15% (85% savings vs Sonnet) |

Haiku becomes dramatically more capable while staying cheap.
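The headline figures follow directly from the numbers in the tables above. A quick check, using only values reported there (the variable names are ours):

```python
# Figures copied from the benchmark tables above.
haiku_alone = 19.7          # BrowseComp accuracy, %
haiku_with_advisor = 41.2   # BrowseComp accuracy, %
sonnet_pair_cost = 88.1     # Sonnet + Opus advisor cost, % of Sonnet alone

accuracy_ratio = haiku_with_advisor / haiku_alone
accuracy_gain_pp = haiku_with_advisor - haiku_alone
sonnet_pair_savings = 100.0 - sonnet_pair_cost

print(f"{accuracy_ratio:.2f}x")       # 2.09x
print(f"{accuracy_gain_pp:.1f} pp")   # 21.5 pp
print(f"{sonnet_pair_savings:.1f}%")  # 11.9%
```

Note that the two results use different baselines: the 2x figure compares Haiku with an advisor against Haiku alone, while the 85% savings compares its cost against Sonnet alone.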

Implementation

Anthropic provides this as a built-in API feature:

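import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-6",  # Executor
    max_tokens=4096,  # required by the Messages API
    tools=[
        {
            "type": "advisor_20260301",
            "name": "advisor",
            "model": "claude-opus-4-6",  # Advisor
            "max_uses": 3,  # Cap expensive calls
        },
    ],
    messages=[...],  # your conversation goes here
)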

Key Implementation Details

| Feature | Benefit |
| --- | --- |
| Single API request | No extra round-trips; routing happens server-side |
| Separate billing | Executor and advisor tokens billed at respective rates |
| max_uses cap | Control advisor costs; prevent runaway consulting |
| Transparent tracking | Advisor tokens appear separately in usage reports |
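Because executor and advisor tokens are billed at their own rates, the cost of a request is the sum of two independently priced token counts. The sketch below shows that arithmetic. The `TokenUsage` and `Rate` types are hypothetical and do not mirror the API's actual usage object, and every price is a placeholder rather than a published Anthropic rate.

```python
from dataclasses import dataclass


@dataclass
class TokenUsage:
    input_tokens: int
    output_tokens: int


@dataclass
class Rate:
    # Placeholder prices in USD per million tokens; substitute current list prices.
    input_per_mtok: float
    output_per_mtok: float


def usage_cost(usage: TokenUsage, rate: Rate) -> float:
    """Cost in USD of one model's tokens at that model's own rate."""
    total = (usage.input_tokens * rate.input_per_mtok
             + usage.output_tokens * rate.output_per_mtok)
    return total / 1_000_000


def request_cost(executor: TokenUsage, advisor: TokenUsage,
                 executor_rate: Rate, advisor_rate: Rate) -> float:
    """Total cost when executor and advisor tokens are billed separately."""
    return usage_cost(executor, executor_rate) + usage_cost(advisor, advisor_rate)


# Illustrative numbers only: a long executor run with a few short advisor calls.
cost = request_cost(
    executor=TokenUsage(input_tokens=200_000, output_tokens=20_000),
    advisor=TokenUsage(input_tokens=10_000, output_tokens=2_000),
    executor_rate=Rate(input_per_mtok=1.0, output_per_mtok=5.0),
    advisor_rate=Rate(input_per_mtok=10.0, output_per_mtok=50.0),
)
print(f"${cost:.2f}")  # $0.50
```

With these placeholder rates the advisor accounts for $0.20 of the $0.50 total despite handling a small fraction of the tokens, which is why capping advisor calls matters.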

When to Use This Pattern

✅ Good Fit

  • Cost-sensitive applications — Need quality but can’t afford Opus for everything
  • Variable difficulty tasks — Mix of easy and hard subtasks
  • Agentic workflows — Many steps where most are routine
  • Development/testing — Get Opus-level quality checks while iterating cheaply

❌ Poor Fit

  • Uniformly hard tasks — If every step needs Opus, just use Opus
  • Latency-critical — Advisor consultation adds latency
  • Simple one-shot tasks — No agentic loop, no opportunity to consult

Comparison with Other Patterns

| Pattern | Flow | Best For |
| --- | --- | --- |
| Single Agent | One model does everything | Simple tasks |
| Orchestrator + Workers | Smart model delegates to cheap workers | Parallelizable tasks |
| Dispatcher + Deep Worker | Coordinator routes to specialist | Depth-requiring tasks |
| Advisor Strategy | Cheap executor consults expensive advisor | Cost-sensitive agentic work |

The Advisor Strategy complements other multi-agent patterns — use it when cost efficiency is the priority.

Cost Optimization Tips

  1. Set max_uses conservatively — Start with 2-3 advisor calls per task
  2. Profile your tasks — Measure how often advisor is actually needed
  3. Tune executor prompts — Better prompts reduce need for escalation
  4. Monitor advisor hit rate — High rate may indicate executor is too weak for the task
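Tips 2 and 4 come down to simple bookkeeping over per-task logs. A minimal sketch, assuming you record how many advisor calls each task made: the `TaskLog` record and both metric functions are hypothetical names introduced here, not part of any Anthropic SDK.

```python
from dataclasses import dataclass


@dataclass
class TaskLog:
    task_id: str
    advisor_calls: int  # how many times the advisor was consulted
    max_uses: int       # the cap configured for that task


def advisor_hit_rate(logs: list[TaskLog]) -> float:
    """Fraction of tasks that consulted the advisor at least once."""
    if not logs:
        return 0.0
    return sum(1 for log in logs if log.advisor_calls > 0) / len(logs)


def cap_saturation(logs: list[TaskLog]) -> float:
    """Fraction of tasks that used every advisor call they were allowed."""
    if not logs:
        return 0.0
    saturated = sum(
        1 for log in logs
        if log.max_uses > 0 and log.advisor_calls >= log.max_uses
    )
    return saturated / len(logs)


logs = [
    TaskLog("t1", advisor_calls=0, max_uses=3),
    TaskLog("t2", advisor_calls=1, max_uses=3),
    TaskLog("t3", advisor_calls=3, max_uses=3),
    TaskLog("t4", advisor_calls=0, max_uses=3),
]
print(advisor_hit_rate(logs))  # 0.5
print(cap_saturation(logs))    # 0.25
```

A high hit rate suggests the executor is too weak for the workload. High cap saturation suggests `max_uses` is cutting off consultations the executor wanted, so either raise the cap or improve the executor prompt.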

Connection to Managed Agents

In Claude Managed Agents, this pattern can be combined with multi-agent coordination:

  • Use Sonnet as your main agent with Opus advisor
  • Delegate subtasks to Haiku workers (also with Opus advisor access)
  • Get cost efficiency at every level

Key Takeaways

  • Flip the pattern: cheap executor, expensive advisor
  • Sonnet + Opus: +2.7pp on SWE-bench Multilingual at 11.9% lower cost than Sonnet alone
  • Haiku + Opus: about 2x Haiku's BrowseComp accuracy at roughly 85% lower cost than Sonnet alone
  • Built into Claude API — no custom orchestration needed
  • Cap advisor usage with max_uses for cost control
