
Advisor Strategy — Pairing a Smarter Model as an Occasional Advisor With a Cheaper Executor


TL;DR: Anthropic’s advisor strategy (April 2026) inverts the conventional orchestrator pattern. Instead of a smart model decomposing work for smaller workers, a cheaper executor model drives the task end-to-end and escalates selectively to a smarter advisor model on hard decisions. Implemented as a server-side advisor_20260301 tool inside a single API request — no extra round-trips. Benchmark results: Sonnet paired with Opus as advisor improves SWE-bench Multilingual by 2.7 percentage points while cutting cost 11.9%. Haiku with an Opus advisor more than doubles its solo BrowseComp accuracy (19.7% → 41.2%) at 85% lower cost than running Sonnet alone. The architecture is the model-level fractal of the comparisons/strategy-vs-execution-ai pattern: judgment-heavy decisions bubble up to the smarter model, execution stays in the cheaper one.

What it is

The advisor strategy is a model-pairing pattern Anthropic published in April 2026. Two roles:

  • Executor — the model that actually drives the task. Sonnet or Haiku. Handles all the volume work: tool calls, drafting, execution, intermediate decisions.
  • Advisor — a more capable model (Opus) consulted only on hard decisions. The executor decides when to call it.

The architecture inverts the conventional sub-agent pattern. Conventionally, a large orchestrator decomposes work and dispatches it to smaller workers. The advisor strategy reverses this: a smaller model owns the workflow and pulls in expensive reasoning only when it can’t make a decision on its own.

How it works

The advisor_20260301 tool is a server-side feature — the model handoff happens inside a single /v1/messages API request. No extra round-trips for the caller, no separate orchestration layer to build.

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    tools=[
        {
            "type": "advisor_20260301",
            "name": "advisor",
            "model": "claude-opus-4-6",
            "max_uses": 3,  # cost ceiling on advisor calls
        },
    ],
    messages=[...],
)

Operational details:

  • Single API request. Context routing and model handoff are inside /v1/messages. Caller doesn’t manage the round-trip.
  • Separate token billing. Executor and advisor tokens are billed at their respective model rates.
  • Cost ceiling. max_uses parameter caps how many times the executor can call the advisor.
  • Transparent tracking. Advisor token usage shows up separately in the response.

Benchmark results

The published numbers (Anthropic, April 2026):

| Configuration         | Benchmark              | Score                 | Cost vs Sonnet alone                     |
| --------------------- | ---------------------- | --------------------- | ---------------------------------------- |
| Sonnet alone          | SWE-bench Multilingual | baseline              | 100%                                     |
| Sonnet + Opus advisor | SWE-bench Multilingual | +2.7pp                | 88.1% (11.9% savings)                    |
| Haiku alone           | BrowseComp             | 19.7%                 | ~15%                                     |
| Haiku + Opus advisor  | BrowseComp             | 41.2% (more than 2×)  | ~15% (85% cheaper than Sonnet alone)     |

Two patterns in the data:

  • Sonnet + Opus advisor — modest accuracy gain, real cost savings. The advisor pulls Sonnet up on the hardest cases that Sonnet alone would miss.
  • Haiku + Opus advisor — large accuracy gain, very cheap baseline. The cost differential between Haiku and Sonnet is large enough that paying for occasional Opus calls is still much cheaper than running Sonnet straight.

For tasks where Haiku-alone is too weak and Sonnet-alone is overkill, the Haiku+Opus pair sits in the gap and outperforms either single-model baseline on the benchmark.
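A back-of-the-envelope cost model makes the Haiku+Opus arithmetic concrete. Every rate and token count below is a hypothetical placeholder, not Anthropic's published pricing; the point is the shape of the trade-off, not the specific numbers.

```python
# Toy cost model for the executor + advisor pairing.
# All rates and token counts are HYPOTHETICAL placeholders --
# substitute real pricing and measured usage before relying on this.

def expected_cost(executor_rate, executor_tokens,
                  advisor_rate=0.0, advisor_tokens=0, escalation_rate=0.0):
    """Expected per-task cost: executor tokens on every task,
    advisor tokens only on the fraction of tasks that escalate."""
    return (executor_tokens * executor_rate
            + escalation_rate * advisor_tokens * advisor_rate)

PER_M = 1e-6  # rates quoted per million tokens
HAIKU, SONNET, OPUS = 1.0 * PER_M, 5.0 * PER_M, 25.0 * PER_M  # hypothetical

TASK_TOKENS = 200_000    # assumed executor tokens per task
ADVISOR_TOKENS = 20_000  # assumed advisor tokens per escalation
ESCALATION = 0.3         # assumed: advisor consulted on 30% of tasks

sonnet_alone = expected_cost(SONNET, TASK_TOKENS)
haiku_pair = expected_cost(HAIKU, TASK_TOKENS, OPUS, ADVISOR_TOKENS, ESCALATION)

print(f"Sonnet alone:         ${sonnet_alone:.3f}/task")
print(f"Haiku + Opus advisor: ${haiku_pair:.3f}/task")
# Under these assumptions the pair is well under half the cost of
# Sonnet alone, even though each advisor call uses the priciest model.
```

Because the advisor term scales with the escalation rate, the pairing only stays cheap while escalation is selective; as the rate approaches 100%, the pair costs more than either model alone.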

Why it matters

It’s the strategy-vs-execution pattern at the model-architecture level

The wiki’s comparisons/strategy-vs-execution-ai thesis describes a team-level pattern: senior strategists make judgment calls; AI-augmented execution staff handle volume. The advisor strategy is the model-architecture-level fractal of the same pattern: Opus makes judgment calls; Sonnet/Haiku handle volume.

The same logic applies at both layers:

  • Most decisions don’t require top-tier judgment. Routing them to top-tier capability is wasteful.
  • A small fraction of decisions are pivotal. Routing those to top-tier capability is high-leverage.
  • The architecture wins when escalation is selective and the executor is decent on its own.

This is a useful concept handle. The same pattern repeats at three layers: org chart (senior strategist + AI-augmented juniors), individual workflow (human + personal AI advisor), and model architecture (Opus advisor + Sonnet/Haiku executor). The recurring pattern is itself worth naming.

It changes the economics of agent design

Without the advisor strategy, the typical cost-vs-capability trade-off forced a binary: pay for the best model on every call, or accept worse outputs from a cheaper model. The advisor pattern dissolves the binary by letting the executor itself decide which calls warrant the expensive model.

For agentic work where most steps are easy and a few are hard, the cost reduction is dramatic. The Haiku+Opus result (85% cheaper than Sonnet alone, 2× the accuracy of Haiku alone) is the cleanest demonstration: you can compose a system that beats both single-model baselines on the relevant axis.

Implications for questions/managed-agents-break-even

The advisor strategy shifts the build-vs-buy calculus:

  • Building this manually (multiple API calls, your own routing logic, your own context handoff) is real engineering work. Server-side advisor tools make it a configuration change.
  • Cost ceiling (max_uses) gives a hard guardrail against runaway advisor spending — which is exactly the kind of guardrail that takes weeks to build robustly in DIY infrastructure.
  • Token-rate parity with direct API calls means there’s no managed-agents penalty for the advisor pattern itself; you’re paying the same Opus rates you would pay calling Opus directly.
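For contrast with the one-line configuration change, here is a minimal sketch of what the DIY version looks like client-side: the driving loop, the escalation decision, and a max_uses-style ceiling all become your code. The call_executor and call_advisor functions are injectable stubs standing in for real API calls; nothing below is the actual server-side implementation.

```python
# Minimal DIY sketch of the advisor pattern built client-side.
# call_executor / call_advisor are placeholders for real model calls;
# the routing logic is the part the server-side tool replaces.

def run_with_advisor(task, call_executor, call_advisor, max_uses=3):
    """Drive the task with the executor; escalate to the advisor when
    the executor reports it is stuck, at most max_uses times."""
    advisor_calls = 0
    context = [task]
    while True:
        answer, stuck = call_executor(context)
        if stuck and advisor_calls < max_uses:
            advisor_calls += 1                     # enforce the cost ceiling
            context.append(call_advisor(context))  # fold advice into context
            continue
        return answer, advisor_calls

# Stub models for illustration: this executor stays "stuck" until it
# has received at least one piece of advice.
def stub_executor(context):
    has_advice = any(m.startswith("advice:") for m in context)
    return ("done" if has_advice else "unsure", not has_advice)

def stub_advisor(context):
    return "advice: consider the edge case"

answer, used = run_with_advisor("hard task", stub_executor, stub_advisor)
# answer == "done", used == 1
```

Even this toy version has to answer the hard questions (what counts as "stuck", what context the advisor sees, what to do when the ceiling is hit), which is the engineering work the managed tool absorbs.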

Implications for questions/ai-as-personal-advisor

There’s a lexical overlap: this page is about an “advisor” at the model-architecture level, while the open question concerns an AI advisor for the user. The two are connected in a non-trivial way:

A well-designed personal AI advisor is itself an instance of this pattern — a fast cheap model handles routine triage and only escalates to slower expensive reasoning on hard decisions. The user-facing advisor and the model-architecture advisor are the same architectural insight applied at different layers. See questions/ai-as-personal-advisor for the user-facing case.

When it makes sense

Conditions that favor the advisor strategy:

  • Tasks have heterogeneous difficulty. Most steps are easy; a few are hard. If everything is uniformly hard, the executor will burn budget calling the advisor constantly. If everything is uniformly easy, you’re paying for an unused tool.
  • The executor is competent on its own. If Haiku-alone fails on 80% of the work, Haiku+Opus advisor can’t rescue it — the advisor calls are too rare to fix that many failures. The executor needs to be at least roughly capable.
  • Decisions are escalatable. The pattern depends on the executor recognizing when it’s stuck. If the executor is silently confident on hard cases (Dunning-Kruger pattern), it won’t call the advisor when it most needs it.
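The “executor must be competent” condition can be made quantitative with a toy bound: the pair’s accuracy is capped at the executor’s solo accuracy plus the fraction of its failures that are both escalated and actually fixed by the advisor. The numbers below are illustrative, not taken from the benchmarks.

```python
# Toy upper bound on paired accuracy. Illustrative numbers only.

def paired_accuracy_bound(executor_acc, escalated_failure_frac, advisor_fix_rate):
    """Executor solves its share outright; of the remaining failures,
    only those that are escalated AND resolved by the advisor recover."""
    failures = 1.0 - executor_acc
    return executor_acc + failures * escalated_failure_frac * advisor_fix_rate

# A roughly-capable executor leaves room for large gains:
print(paired_accuracy_bound(0.20, 0.50, 0.90))

# A very weak executor cannot be rescued by rare escalation:
print(paired_accuracy_bound(0.05, 0.10, 0.90))
```

The second case shows why rare escalation cannot rescue a weak executor: with most of the work failing and only a sliver of those failures escalated, even a perfect advisor barely moves the total.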

Honest limits

  • Benchmark performance ≠ production performance. The SWE-bench and BrowseComp numbers are indicative, not predictive of arbitrary tasks. Local benchmarking is necessary.
  • Executor self-awareness is the bottleneck. Advisor escalation only works if the executor accurately recognizes its own limits. The same glossary/recognition-primed-decision / Klein-Kahneman conditions apply: executor pattern-matching of “this is hard” is reliable only when the executor has been trained on that decision pattern. Models may underuse or overuse the advisor in ways that aren’t obvious from the prompt.
  • Cost ceiling is a coarse tool. max_uses caps the count, not the difficulty. A poorly-set ceiling either truncates needed calls or waves through unneeded ones.
  • Adds latency. Each advisor call is a separate model invocation behind the scenes. For real-time interactive UIs, the added latency is noticeable.
  • Pattern still new. April 2026 release; long-tail edge cases not yet documented in the field. Treat as a working pattern, not yet a proven default.

Key takeaways

  • The advisor strategy inverts the conventional orchestrator pattern: cheap executor drives the task, smart advisor consulted only on hard decisions.
  • Implemented as a server-side advisor_20260301 tool inside a single API request. No extra round-trips, separate token billing, max_uses cost ceiling.
  • Sonnet + Opus advisor: +2.7pp on SWE-bench Multilingual at 11.9% cost savings. Haiku + Opus: 19.7% → 41.2% on BrowseComp at 85% cheaper than Sonnet alone.
  • The architecture is the model-level fractal of comparisons/strategy-vs-execution-ai: judgment escalates upward; execution stays cheap.
  • Three conditions that favor it: heterogeneous task difficulty + competent executor + escalatable decisions.
  • Bottleneck is executor self-awareness — pattern only works when the executor accurately recognizes “this is hard, ask the advisor.”

Sources

  • Anthropic (April 2026). The Advisor Strategy. claude.com/blog/the-advisor-strategy. Source for the pattern definition, benchmark numbers, API surface.
  • Internal Primores observation (2026): the architecture pattern recurs at three layers (org chart, individual workflow, model architecture). Synthesis worth tracking as a working frame.