Context Engineering — Designing Information Flow for AI Agents
TL;DR: Context engineering structures how tools present information to AI agents. Unlike prompt engineering (which guides model behavior), context engineering designs tool responses so agents can reason across multiple calls — giving them “peripheral vision” to navigate complex information spaces.
What It Is
Context engineering is:
“Structuring tool responses and information flow to give agents the right data in the right format for effective reasoning.”
It recognizes that agentic systems are persistent — they make multiple sequential tool calls to build understanding. This is fundamentally different from humans making single queries.
Context Engineering vs. Prompt Engineering
| Prompt Engineering | Context Engineering |
|---|---|
| Guides model behavior | Shapes information flow |
| Optimizes instructions | Optimizes tool responses |
| Single interaction focus | Multi-turn agent focus |
| “How should the model act?” | “What should the model see?” |
Key insight: Tool responses ARE prompt engineering. The XML structure, metadata, and system instructions in tool outputs directly influence how agents think about subsequent searches.
The Four-Level Framework
Level 1: Minimal Chunks (Baseline RAG)
Raw text without metadata. Agent sees content but has no context about where it came from or what else exists.
Problem: Agent can’t strategically explore because it has no map.
Level 2: Source Metadata
Adds document source, page numbers, and clustering signals.
Enables: Strategic full-page loading. If multiple chunks come from one document, agent can load the whole thing.
Example: “3 chunks from document X” → agent uses load_pages() instead of searching again.
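A minimal sketch of this clustering heuristic, assuming a hypothetical chunk record with a `source` field and a threshold of 3; `pick_next_action` is illustrative, while `load_pages` is the tool named above:

```python
from collections import Counter

def pick_next_action(chunks, threshold=3):
    """Decide the agent's next move from source clustering.

    If `threshold` or more retrieved chunks share one source, loading
    that document whole beats issuing another search.
    """
    source, n = Counter(c["source"] for c in chunks).most_common(1)[0]
    if n >= threshold:
        return {"tool": "load_pages", "source": source}
    return {"tool": "search", "refine": True}

chunks = [
    {"source": "doc-x.pdf", "page": 2},
    {"source": "doc-x.pdf", "page": 5},
    {"source": "doc-x.pdf", "page": 9},
    {"source": "doc-y.pdf", "page": 1},
]
print(pick_next_action(chunks))  # → {'tool': 'load_pages', 'source': 'doc-x.pdf'}
```

In practice the agent itself makes this call from the metadata; the heuristic here just shows what the metadata makes possible.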
Level 3: Multi-Modal Optimization
Formats different content types appropriately:
- Tables → Markdown or HTML
- Images → Include OCR text
- Structured data → Preserve structure
Why: Agents reason differently about tabular vs. prose data.
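A sketch of such a per-type formatter, assuming a hypothetical chunk record with a `type` field and table/image payloads (all field names are assumptions):

```python
def format_chunk(chunk):
    """Render a chunk for the agent according to its content type."""
    kind = chunk.get("type", "text")
    if kind == "table":
        # Tables become Markdown so the agent can reason over rows and columns.
        header = "| " + " | ".join(chunk["columns"]) + " |"
        sep = "|" + " --- |" * len(chunk["columns"])
        rows = ["| " + " | ".join(str(v) for v in row) + " |" for row in chunk["rows"]]
        return "\n".join([header, sep, *rows])
    if kind == "image":
        # Images contribute a caption marker plus their OCR text.
        return f"[image: {chunk.get('caption', '')}]\n{chunk.get('ocr_text', '')}"
    # Prose passes through unchanged.
    return chunk["text"]
```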
Level 4: Faceted Landscape
Returns complete metadata distributions revealing patterns across all dimensions.
The breakthrough: Shows what the agent didn’t retrieve.
Faceted Search Explained
Facets expose metadata aggregations alongside results:
| Facet Type | What It Shows |
|---|---|
| Source clustering | Which documents contain multiple relevant chunks |
| Category distributions | Patterns across document types, teams, projects |
| Coverage mapping | How many results exist beyond top-k ranking |
Example: Search returns 3 “Done” tickets, but facets show 5 “Open” tickets exist. The agent now knows there’s hidden relevant information filtered out by similarity ranking.
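The aggregation behind this ticket example can be sketched as follows; `facet_by_status` and the record shape are hypothetical. The key move is counting over all matches, not just the top-k shown:

```python
from collections import Counter

def facet_by_status(all_matches, shown):
    """Count a metadata field over ALL matches, then report how many
    of each value were filtered out of the top-k actually shown."""
    totals = Counter(m["status"] for m in all_matches)
    visible = Counter(m["status"] for m in shown)
    hidden = {s: totals[s] - visible.get(s, 0)
              for s in totals if totals[s] > visible.get(s, 0)}
    return {"totals": dict(totals), "hidden": hidden}

matches = [{"status": "Done"}] * 3 + [{"status": "Open"}] * 5
shown = matches[:3]  # similarity ranking surfaced only the "Done" tickets
print(facet_by_status(matches, shown))
# → {'totals': {'Done': 3, 'Open': 5}, 'hidden': {'Open': 5}}
```

The `hidden` counts are exactly the "peripheral vision" signal: five Open tickets exist that ranking never showed.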
Tool Response Design Principles
1. Signal-to-Noise Prioritization
Return contextually relevant information, not comprehensive technical details. Agents don’t need everything — they need what matters for the next decision.
2. Peripheral Vision Architecture
Provide metadata about the broader information space beyond top-k results. Let agents see the shape of what they haven’t retrieved.
3. Response-as-Instruction
Structure outputs so metadata aggregations guide subsequent tool calls.
Example system instruction:
“High facet counts for sources with few returned chunks indicate valuable information filtered by similarity ranking — investigate with source filters.”
4. Strategic Filtering
Include parameters aligned with facet dimensions discovered in responses. If facets show categories, tools should accept category filters.
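As an illustration, a hypothetical JSON-Schema-style tool definition whose filter parameters mirror the facet dimensions from the earlier example (every name here is an assumption, not a prescribed API):

```python
# Hypothetical search-tool definition: the filters mirror the facet
# dimensions the response exposes, so every facet the agent sees is
# also a dimension it can act on.
search_tool = {
    "name": "search",
    "parameters": {
        "query": {"type": "string", "description": "Semantic search query"},
        "source": {"type": "string",
                   "description": "Restrict results to one document"},
        "category": {"type": "string",
                     "enum": ["Financial", "Marketing", "Operations"],
                     "description": "Facet dimension surfaced in responses"},
        "top_k": {"type": "integer", "default": 5},
    },
}
```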
Implementation Example
Before (Level 1):

```
Result: "The Q4 revenue was $2.3M..."
Result: "Marketing budget increased by..."
```

After (Level 4):

```xml
<search_results total="47" shown="5">
  <facets>
    <source name="Q4-Report.pdf" chunks="12" />
    <source name="Board-Deck.pptx" chunks="8" />
    <category name="Financial" count="23" />
    <category name="Marketing" count="15" />
    <category name="Operations" count="9" />
  </facets>
  <results>
    <result source="Q4-Report.pdf" page="4">
      The Q4 revenue was $2.3M...
    </result>
    ...
  </results>
  <guidance>
    Multiple chunks from Q4-Report.pdf suggest loading full document.
    High count in "Operations" category (9) not shown in top results.
  </guidance>
</search_results>
```

Now the agent knows:
- There’s a concentrated source worth loading fully
- There’s a category dimension to explore
- The search didn’t show everything relevant
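An envelope like the one above can be assembled with the standard library alone; `build_response` and its argument shapes are illustrative assumptions, not a prescribed API:

```python
import xml.etree.ElementTree as ET

def build_response(results, source_facets, category_facets, total, guidance=""):
    """Wrap raw results and facet counts in a Level 4 XML envelope."""
    root = ET.Element("search_results", total=str(total), shown=str(len(results)))
    facets = ET.SubElement(root, "facets")
    for name, chunks in source_facets.items():
        ET.SubElement(facets, "source", name=name, chunks=str(chunks))
    for name, count in category_facets.items():
        ET.SubElement(facets, "category", name=name, count=str(count))
    out = ET.SubElement(root, "results")
    for r in results:
        el = ET.SubElement(out, "result", source=r["source"], page=str(r["page"]))
        el.text = r["text"]
    if guidance:
        ET.SubElement(root, "guidance").text = guidance
    return ET.tostring(root, encoding="unicode")
```

Because the facets are computed from the full match set while `results` holds only the top-k, the envelope carries more information than the hits themselves.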
Success Metrics
Teams implementing context engineering report:
| Metric | Improvement |
|---|---|
| Clarification questions | 90% reduction |
| Expert escalations | 75% reduction |
| 504 errors | 95% reduction |
| Resolution times | 4x improvement |
Practical Implementation
Quick Wins (Level 2)
- Wrap search results in XML with source tracking
- Include page numbers and document paths
- Add clustering detection: “Multiple chunks from same source = use load_pages()”
- Add system instructions that teach agents facet-driven refinement
Token Efficiency
- Support response format parameters (“concise” vs “detailed”)
- Implement pagination with sensible defaults
- Truncate verbose responses based on agent reasoning context
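These three bullets can be combined in one response path; `paginate`, the 200-character concise limit, and the record shape are hypothetical choices for illustration:

```python
def paginate(results, offset=0, limit=5, fmt="concise", max_chars=200):
    """Serve one page of results; "concise" trims each body so a long
    hit list does not crowd out the agent's reasoning context."""
    page = results[offset : offset + limit]
    if fmt == "concise":
        page = [{**r, "text": r["text"][:max_chars]} for r in page]
    next_offset = offset + limit if offset + limit < len(results) else None
    return {"results": page, "total": len(results), "next_offset": next_offset}
```

Returning `total` and `next_offset` alongside the page keeps the landscape visible even when the payload is small: the agent always knows how much it has not yet seen.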
Architecture Note
You don’t need to rebuild infrastructure. Most improvements are XML structuring and metadata addition with minimal backend changes.
Key Insight
Agents with a sufficient token and tool-call budget demonstrate systematic persistence: they will refine searches iteratively based on facet patterns.
This reverses traditional RAG’s “stuff everything relevant upfront” approach. Instead, design tools that reveal the information landscape so agents can systematically explore dimensions.
Key Takeaways
- Context engineering shapes what agents see, not how they act
- Tool responses ARE prompt engineering
- Four levels: Chunks → Metadata → Multi-modal → Faceted landscape
- Facets give agents “peripheral vision” of unseen information
- 4x resolution time improvement with proper implementation
Related Concepts
- glossary/rag — The baseline that context engineering improves
- glossary/llm-evals — Measuring agent performance
- glossary/ai-agent — The systems that benefit from context engineering
- automation/multi-agent-patterns — Architectural patterns for agents
Sources
- Beyond Chunks: Why Context Engineering is the Future of RAG — Jason Liu (August 2025)