Claude Managed Agents — Anthropic's Agent Infrastructure

TL;DR: Claude Managed Agents is Anthropic’s ready-made infrastructure for running AI agents. Instead of building tool orchestration, sandboxes, and error handling yourself, you describe the agent (model, prompt, tools) and Anthropic runs it in their managed cloud. Available in open beta.

What Problem It Solves

Building agents on the Messages API requires months of engineering:

  • Tool orchestration logic
  • Context management
  • Secure sandboxes
  • Error handling and recovery
  • Secret storage
  • Long-running session management

Managed Agents provides all of this out of the box.

Two key benefits:

  1. Scaffolding stays current — Any custom scaffolding bakes in assumptions about what Claude can’t do. These assumptions become outdated with every model update. Managed Agents updates automatically.

  2. Long-running tasks — Anthropic expects future Claude versions to work for days, weeks, or months. This requires fault-tolerant, secure, scalable infrastructure that’s hard to build yourself.

Four Key Concepts

  • Agent: versioned configuration (model, system prompt, tools, MCP servers). Create once, reference by ID.
  • Environment: container template (sandbox type, network rules, pre-installed packages).
  • Session: a running instance of an agent inside an environment. Stores the conversation, filesystem, and status. Can run for hours.
  • Events: message exchange via Server-Sent Events (SSE). You send user messages; the agent streams back responses and tool calls.

How They Connect

Agent (configuration) → Environment (container template) → Session (running instance) → Events (message stream)

Quick Start (10 Minutes)

Step 1: Install

# CLI (macOS)
brew install anthropics/tap/ant
# Python SDK
pip install anthropic

Step 2: Create Agent

from anthropic import Anthropic
client = Anthropic()
agent = client.beta.agents.create(
    name="Coding Assistant",
    model="claude-sonnet-4-6",
    system="You are a helpful coding assistant.",
    tools=[{"type": "agent_toolset_20260401"}],
)

agent_toolset_20260401 includes all built-in tools: bash, file operations, web search.

Step 3: Create Environment

environment = client.beta.environments.create(
    name="dev-env",
    config={
        "type": "cloud",
        "networking": {"type": "unrestricted"},
        "packages": {"pip": ["pandas", "numpy"]},  # optional
    },
)

Step 4: Start Session and Send Task

session = client.beta.sessions.create(
    agent=agent.id,
    environment_id=environment.id,
)

with client.beta.sessions.events.stream(session.id) as stream:
    client.beta.sessions.events.send(
        session.id,
        events=[{
            "type": "user.message",
            "content": [{"type": "text", "text": "Create a Python script..."}],
        }],
    )
    for event in stream:
        match event.type:
            case "agent.message":
                for block in event.content:
                    print(block.text, end="")
            case "agent.tool_use":
                print(f"\n[Tool: {event.name}]")
            case "session.status_idle":
                print("\n\nDone.")
                break

What happens under the hood:

  1. Container deploys from environment template
  2. Claude decides which tools to use
  3. Tool calls execute inside container
  4. Results stream to you in real-time
  5. session.status_idle when task is done

Built-in Tools

  • bash: execute shell commands in the container
  • read: read files
  • write: write files
  • edit: replace strings in files
  • glob: find files by pattern
  • grep: search text by regex
  • web_fetch: download content from a URL
  • web_search: internet search

Tool Configuration

Disable specific tools:

{
  "type": "agent_toolset_20260401",
  "configs": [
    {"name": "web_fetch", "enabled": false},
    {"name": "web_search", "enabled": false}
  ]
}

Enable only specific tools:

{
  "type": "agent_toolset_20260401",
  "default_config": {"enabled": false},
  "configs": [
    {"name": "bash", "enabled": true},
    {"name": "read", "enabled": true}
  ]
}
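Both configuration styles compose the same way: per-tool `configs` entries override the `default_config`, which itself defaults to enabled. A minimal sketch of that resolution logic (the helper function is illustrative, not part of the SDK):

```python
# Illustrative helper: compute which built-in tools end up enabled for a
# given toolset config, as described above. Not part of the Anthropic SDK.
BUILTIN_TOOLS = ["bash", "read", "write", "edit", "glob", "grep", "web_fetch", "web_search"]

def effective_tools(toolset: dict) -> set:
    default_enabled = toolset.get("default_config", {}).get("enabled", True)
    overrides = {c["name"]: c["enabled"] for c in toolset.get("configs", [])}
    return {t for t in BUILTIN_TOOLS if overrides.get(t, default_enabled)}

allow_list = {
    "type": "agent_toolset_20260401",
    "default_config": {"enabled": False},
    "configs": [{"name": "bash", "enabled": True}, {"name": "read", "enabled": True}],
}
print(sorted(effective_tools(allow_list)))  # ['bash', 'read']
```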

Custom Tools

Define your own tools with structured input schemas:

agent = client.beta.agents.create(
    name="Weather Agent",
    model="claude-sonnet-4-6",
    tools=[
        {"type": "agent_toolset_20260401"},
        {
            "type": "custom",
            "name": "get_weather",
            "description": "Get current weather for a location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"}
                },
                "required": ["location"],
            },
        },
    ],
)

Best practices for custom tools:

  • Write detailed descriptions (3-4 sentences): what it does, when to use, limitations
  • Combine related operations with an action parameter
  • Use namespaces in names (db_query, storage_read)
  • Return only essential info — stable identifiers, not internal references
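The "combine related operations" advice can look like this hypothetical tool definition, where one tool exposes several operations through an `action` enum. The tool name, fields, and operations below are made up for illustration:

```python
# Hypothetical custom tool that combines related read-only DB operations
# behind a single "action" parameter, following the best practices above.
db_tool = {
    "type": "custom",
    "name": "db_query",  # namespaced name, as recommended
    "description": (
        "Run read-only operations against the project database. "
        "Use it to list tables, describe a table's schema, or run a SELECT. "
        "It cannot modify data; write statements are rejected."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "action": {
                "type": "string",
                "enum": ["list_tables", "describe_table", "select"],
                "description": "Which operation to perform",
            },
            "table": {"type": "string", "description": "Target table (for describe_table)"},
            "query": {"type": "string", "description": "SQL SELECT statement (for select)"},
        },
        "required": ["action"],
    },
}
```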

Permission System

Two modes for tool execution:

  • always_allow: tools execute automatically (trusted internal agents)
  • always_ask: the session pauses for approval (user-facing agents)

Modes can be combined per tool: for example, file reads run automatically while bash commands require approval.
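The per-tool combination amounts to a default mode plus overrides. A local sketch of how such a dispatcher could behave (the field names and shapes here are assumptions for illustration, not the real API):

```python
# Illustrative sketch of per-tool permission resolution: a default mode
# plus per-tool overrides. Not part of the SDK; shapes are assumed.
def resolve_mode(tool_name: str, default_mode: str, overrides: dict) -> str:
    return overrides.get(tool_name, default_mode)

def handle_tool_call(tool_name, execute, default_mode="always_ask", overrides=None):
    mode = resolve_mode(tool_name, default_mode, overrides or {})
    if mode == "always_allow":
        return {"status": "executed", "result": execute()}
    # always_ask: surface the call for human approval instead of running it
    return {"status": "pending_approval", "tool": tool_name}

overrides = {"read": "always_allow", "glob": "always_allow"}
print(handle_tool_call("read", lambda: "file contents", overrides=overrides))
print(handle_tool_call("bash", lambda: "ran command", overrides=overrides))
```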

This is more production-ready than most open-source frameworks (LangGraph, CrewAI, AutoGen) — none provide per-tool permissions out of the box.

Usage Patterns

Event-triggered

External service triggers agent. Bug detected → agent writes patch and opens PR. Example: Sentry integration

Scheduled

Agent runs on schedule. Daily digests: GitHub activity, team tasks, X (Twitter) summary.

Fire-and-forget

Human sets task via Slack → gets result: table, presentation, app. Example: Asana AI Teammates

Long-horizon

Tasks running for hours. Research projects, large-scale code migrations, deep analysis.

CLI for setup, SDK for runtime

Agent templates stored as YAML in git. CLI applies them in deploy pipeline. SDK manages sessions at runtime.

Outcomes (Research Preview)

Outcomes turn sessions from conversations into goal-oriented work:

client.beta.sessions.events.send(
    session_id=session.id,
    events=[{
        "type": "user.define_outcome",
        "description": "Build a DCF model for Costco in .xlsx",
        "rubric": {"type": "text", "content": RUBRIC},
        "max_iterations": 5,  # default 3, max 20
    }],
)

A separate grader evaluates whether criteria are met. Agent iterates until satisfied or max iterations reached.

Good rubric criteria:

  • ✅ “CSV contains price column with numeric values”
  • ❌ “Data looks good”
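The iterate-until-satisfied behavior can be pictured as a loop like this local sketch with stub callables; it is not the managed implementation, just the control flow:

```python
# Local sketch of the outcome loop: the agent produces work, a separate
# grader checks it against the rubric, and the agent retries until the
# grader passes or max_iterations is exhausted. Both callables are stubs.
def run_outcome(produce, grade, max_iterations: int = 3):
    for attempt in range(1, max_iterations + 1):
        artifact = produce(attempt)
        if grade(artifact):
            return {"status": "satisfied", "iterations": attempt, "artifact": artifact}
    return {"status": "max_iterations_reached", "iterations": max_iterations}

# Stub agent that only meets the rubric on its second attempt.
result = run_outcome(
    produce=lambda n: f"draft-{n}",
    grade=lambda a: a == "draft-2",
    max_iterations=5,
)
print(result)  # {'status': 'satisfied', 'iterations': 2, 'artifact': 'draft-2'}
```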

Multi-Agent (Research Preview)

One coordinator can delegate to other agents:

orchestrator = client.beta.agents.create(
    name="Engineering Lead",
    model="claude-sonnet-4-6",
    system="Delegate code review to reviewer, tests to test agent.",
    tools=[{"type": "agent_toolset_20260401"}],
    callable_agents=[
        {"type": "agent", "id": reviewer_agent.id, "version": reviewer_agent.version},
        {"type": "agent", "id": test_writer_agent.id, "version": test_writer_agent.version},
    ],
)

Use cases:

  • Code review (separate agent with read-only tools)
  • Test generation (writes tests, doesn’t touch production code)
  • Research (agent with web tools collects information)

Limitation: Only one level of delegation.

Architecture

Anthropic designed three independent components:

  • "Brain": Claude plus scaffolding (agent loop, tool selection)
  • "Hands": sandboxes and tools that execute actions
  • "Session": the event journal

Each component can fail or be replaced independently. Built-in optimizations: prompt caching, context compression, automatic recovery.

Pricing

  • Standard Claude API token rates
  • Plus $0.08 per hour of active session time

A 10-minute coding session costs a few cents for compute.
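A back-of-the-envelope cost estimate combines token usage with the hourly fee. The per-token rates below are placeholder assumptions for illustration; check current API pricing:

```python
# Rough session cost: token usage at standard API rates plus the $0.08/hour
# active-session fee. Token rates here are ASSUMED for illustration only.
INPUT_PER_MTOK = 3.00    # assumed input rate, $ per million tokens
OUTPUT_PER_MTOK = 15.00  # assumed output rate, $ per million tokens
SESSION_PER_HOUR = 0.08  # from the pricing section above

def session_cost(input_tokens, output_tokens, active_minutes):
    token_cost = (input_tokens / 1e6) * INPUT_PER_MTOK + (output_tokens / 1e6) * OUTPUT_PER_MTOK
    return token_cost + (active_minutes / 60) * SESSION_PER_HOUR

# A 10-minute session: the compute portion alone is just over a cent.
print(f"${session_cost(50_000, 10_000, 10):.4f}")
```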

Who’s Using It

  • Notion: agents for parallel task execution
  • Rakuten: corporate agents per department (each launched in under a week)
  • Asana: AI Teammates working alongside humans
  • Sentry: debugger finds a bug → agent writes a patch → opens a PR
  • Vibecode: default infrastructure

Limits

  • Resource creation (agents, sessions, environments): 60 requests/min
  • Reads (get, list, streaming): 600 requests/min
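To stay under these limits client-side, a small sliding-window limiter is enough. This sketch is illustrative, not part of the SDK:

```python
from collections import deque

# Sliding-window limiter: allow at most `limit` requests per `window` seconds.
# Illustrative client-side helper; the API enforces limits server-side anyway.
class RateLimiter:
    def __init__(self, limit: int, window: float = 60.0):
        self.limit, self.window = limit, window
        self.sent = deque()  # timestamps of requests in the current window

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            self.sent.append(now)
            return True
        return False

writes = RateLimiter(limit=60)                   # resource creation: 60/min
assert all(writes.allow(t) for t in range(60))   # first 60 in the window pass
assert not writes.allow(59.5)                    # the 61st in the same window is refused
assert writes.allow(61.0)                        # allowed again once the window slides
```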

Access

  • Status: Open beta
  • Header required: managed-agents-2026-04-01 (SDK sets automatically)
  • Outcomes/Multi-agent/Memory: Research preview (request access separately)

When to Use Managed Agents vs. Messages API

Choose Managed Agents when you need:
  • Long-running tasks (hours)
  • A code execution sandbox
  • A quick launch
  • Async background work

Choose the Messages API when you need:
  • Simple chat completions
  • Full control over orchestration
  • Custom scaffolding
  • Real-time, low-latency responses

Key Takeaways

  • Ready-made infrastructure — no Docker, orchestration code, or tool execution to build
  • Four concepts: Agent (config) → Environment (container) → Session (instance) → Events (stream)
  • Built-in tools for common operations + custom tool support
  • Per-tool permissions for production safety
  • Outcomes turn conversations into goal-oriented work
  • Multi-agent coordination for complex workflows
  • $0.08/hour + token costs

Sources

  • Anthropic Managed Agents documentation (April 2026)
  • Telegram @prompt_design