Context Engineering — Designing Information Flow for AI Agents

TL;DR: Context engineering structures how tools present information to AI agents. Unlike prompt engineering (which guides model behavior), context engineering designs tool responses so agents can reason across multiple calls — giving them “peripheral vision” to navigate complex information spaces.

What It Is

Context engineering is:

“Structuring tool responses and information flow to give agents the right data in the right format for effective reasoning.”

It recognizes that agentic systems are persistent — they make multiple sequential tool calls to build understanding. This is fundamentally different from humans making single queries.

Context Engineering vs. Prompt Engineering

| Prompt Engineering | Context Engineering |
| --- | --- |
| Guides model behavior | Shapes information flow |
| Optimizes instructions | Optimizes tool responses |
| Single-interaction focus | Multi-turn agent focus |
| "How should the model act?" | "What should the model see?" |

Key insight: Tool responses ARE prompt engineering. The XML structure, metadata, and system instructions in tool outputs directly influence how agents think about subsequent searches.

The Four-Level Framework

Level 1: Minimal Chunks (Baseline RAG)

Raw text without metadata. Agent sees content but has no context about where it came from or what else exists.

Problem: Agent can’t strategically explore because it has no map.

Level 2: Source Metadata

Adds document source, page numbers, and clustering signals.

Enables: Strategic full-page loading. If multiple chunks come from one document, agent can load the whole thing.

Example: “3 chunks from document X” → agent uses load_pages() instead of searching again.
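The clustering signal above can be sketched in a few lines. This is an illustrative helper, not a fixed API: `cluster_hint` and the chunk dictionary shape are assumptions, while `load_pages()` is the tool named above.

```python
from collections import Counter

def cluster_hint(chunks, threshold=3):
    """Count chunks per source document; flag sources that dominate
    the result set as candidates for loading in full."""
    counts = Counter(c["source"] for c in chunks)
    return [src for src, n in counts.items() if n >= threshold]

chunks = [
    {"source": "X.pdf", "text": "..."},
    {"source": "X.pdf", "text": "..."},
    {"source": "X.pdf", "text": "..."},
    {"source": "Y.pdf", "text": "..."},
]
hints = cluster_hint(chunks)  # "X.pdf" dominates: candidate for load_pages()
```

The hint can be surfaced to the agent directly in the tool response, e.g. as a `<guidance>` element.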

Level 3: Multi-Modal Optimization

Formats different content types appropriately:

  • Tables → Markdown or HTML
  • Images → Include OCR text
  • Structured data → Preserve structure

Why: Agents reason differently about tabular vs. prose data.
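A minimal sketch of type-aware formatting, assuming a hypothetical `format_chunk` helper and chunk fields (`type`, `columns`, `rows`, `ocr_text`) that your pipeline would define:

```python
def format_chunk(chunk):
    """Render a chunk according to its content type so the agent
    sees structure, not flattened prose."""
    kind = chunk.get("type", "text")
    if kind == "table":
        # Keep rows as Markdown so column alignment survives.
        header = "| " + " | ".join(chunk["columns"]) + " |"
        sep = "| " + " | ".join("---" for _ in chunk["columns"]) + " |"
        rows = ["| " + " | ".join(map(str, r)) + " |" for r in chunk["rows"]]
        return "\n".join([header, sep, *rows])
    if kind == "image":
        # Surface OCR text instead of dropping the image silently.
        return f"[image: {chunk.get('alt', '')}]\n{chunk.get('ocr_text', '')}"
    return chunk["text"]
```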

Level 4: Faceted Landscape

Returns complete metadata distributions revealing patterns across all dimensions.

The breakthrough: Shows what the agent didn’t retrieve.

Faceted Search Explained

Facets expose metadata aggregations alongside results:

| Facet Type | What It Shows |
| --- | --- |
| Source clustering | Which documents contain multiple relevant chunks |
| Category distributions | Patterns across document types, teams, projects |
| Coverage mapping | How many results exist beyond top-k ranking |

Example: Search returns 3 “Done” tickets, but facets show 5 “Open” tickets exist. The agent now knows there’s hidden relevant information filtered out by similarity ranking.
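The ticket example can be reproduced with a toy search that returns top-k hits plus facets over every match. This is a sketch under assumed names (`search_with_facets`, a `status` field, precomputed `ranked_ids`), not a specific library's API:

```python
from collections import Counter

def search_with_facets(corpus, ranked_ids, k=3):
    """Return the top-k hits plus status facets computed over ALL
    matches, so the agent can see what ranking filtered out."""
    hits = [corpus[i] for i in ranked_ids[:k]]
    facets = Counter(corpus[i]["status"] for i in ranked_ids)
    return {"results": hits, "facets": dict(facets)}

tickets = (
    [{"id": i, "status": "Done"} for i in range(3)]
    + [{"id": i, "status": "Open"} for i in range(3, 8)]
)
resp = search_with_facets(tickets, ranked_ids=list(range(8)), k=3)
# Top-3 results are all "Done"; facets expose the 5 "Open" tickets
# that similarity ranking hid.
```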

Tool Response Design Principles

1. Signal-to-Noise Prioritization

Return contextually relevant information, not comprehensive technical details. Agents don’t need everything — they need what matters for the next decision.

2. Peripheral Vision Architecture

Provide metadata about the broader information space beyond top-k results. Let agents see the shape of what they haven’t retrieved.

3. Response-as-Instruction

Structure outputs so metadata aggregations guide subsequent tool calls.

Example system instruction:

“High facet counts for sources with few returned chunks indicate valuable information filtered by similarity ranking — investigate with source filters.”

4. Strategic Filtering

Include parameters aligned with facet dimensions discovered in responses. If facets show categories, tools should accept category filters.
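One way to sketch this alignment: every optional filter parameter mirrors a facet dimension the response exposes. The `search` signature and corpus fields below are illustrative assumptions:

```python
def search(query, corpus, category=None, status=None):
    """Toy keyword search whose optional filters mirror the facet
    dimensions (category, status) that responses surface."""
    hits = [d for d in corpus if query.lower() in d["text"].lower()]
    if category is not None:
        hits = [d for d in hits if d["category"] == category]
    if status is not None:
        hits = [d for d in hits if d["status"] == status]
    return hits

corpus = [
    {"text": "Q4 revenue summary", "category": "Financial", "status": "Done"},
    {"text": "Q4 ops review", "category": "Operations", "status": "Open"},
]
ops_hits = search("q4", corpus, category="Operations")
```

When a facet reveals an unexplored dimension, the agent can immediately act on it with the matching filter.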

Implementation Example

Before (Level 1):

Result: "The Q4 revenue was $2.3M..."
Result: "Marketing budget increased by..."

After (Level 4):

<search_results total="47" shown="5">
  <facets>
    <source name="Q4-Report.pdf" chunks="12" />
    <source name="Board-Deck.pptx" chunks="8" />
    <category name="Financial" count="23" />
    <category name="Marketing" count="15" />
    <category name="Operations" count="9" />
  </facets>
  <results>
    <result source="Q4-Report.pdf" page="4">
      The Q4 revenue was $2.3M...
    </result>
    ...
  </results>
  <guidance>
    Multiple chunks from Q4-Report.pdf suggest loading full document.
    High count in "Operations" category (9) not shown in top results.
  </guidance>
</search_results>

Now the agent knows:

  • There’s a concentrated source worth loading fully
  • There’s a category dimension to explore
  • The search didn’t show everything relevant
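A response in this shape can be assembled with the standard library. The sketch below covers the `facets` and `results` elements; `faceted_response` is a hypothetical helper, and `<guidance>` generation is omitted for brevity:

```python
import xml.etree.ElementTree as ET
from collections import Counter

def faceted_response(chunks, shown=5):
    """Build a Level 4 XML tool response: facet aggregations over
    all chunks, plus the top-N chunk bodies."""
    root = ET.Element("search_results",
                      total=str(len(chunks)), shown=str(min(shown, len(chunks))))
    facets = ET.SubElement(root, "facets")
    for src, n in Counter(c["source"] for c in chunks).most_common():
        ET.SubElement(facets, "source", name=src, chunks=str(n))
    results = ET.SubElement(root, "results")
    for c in chunks[:shown]:
        r = ET.SubElement(results, "result",
                          source=c["source"], page=str(c["page"]))
        r.text = c["text"]
    return ET.tostring(root, encoding="unicode")

chunks = [{"source": "Q4-Report.pdf", "page": i, "text": "t"} for i in range(3)]
xml_str = faceted_response(chunks, shown=2)
```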

Success Metrics

Teams implementing context engineering report:

| Metric | Improvement |
| --- | --- |
| Clarification questions | 90% reduction |
| Expert escalations | 75% reduction |
| 504 errors | 95% reduction |
| Resolution times | 4x improvement |

Practical Implementation

Quick Wins (Level 2)

  • Wrap search results in XML with source tracking
  • Include page numbers and document paths
  • Add clustering detection: “Multiple chunks from same source = use load_pages()”
  • Add system instructions that teach agents facet-driven refinement

Token Efficiency

  • Support response format parameters (“concise” vs “detailed”)
  • Implement pagination with sensible defaults
  • Truncate verbose responses based on agent reasoning context
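The three token-efficiency tactics above can be combined in one rendering step. A minimal sketch, assuming a hypothetical `render` helper and character-based limits (a real implementation would count tokens):

```python
def render(texts, style="concise", page=0, page_size=5):
    """Paginate results and cap per-result length; 'detailed'
    relaxes the cap when the agent explicitly asks for more."""
    limit = 120 if style == "concise" else 2000
    window = texts[page * page_size:(page + 1) * page_size]
    # Mark truncation visibly so the agent knows more text exists.
    return [t if len(t) <= limit else t[:limit] + "…" for t in window]
```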

Architecture Note

You don’t need to rebuild infrastructure. Most improvements are XML structuring and metadata addition with minimal backend changes.

Key Insight

Agents with a sufficient tool-call budget demonstrate systematic persistence: they refine searches iteratively based on facet patterns.

This reverses traditional RAG’s “stuff everything relevant upfront” approach. Instead, design tools that reveal the information landscape so agents can systematically explore dimensions.

Key Takeaways

  • Context engineering shapes what agents see, not how they act
  • Tool responses ARE prompt engineering
  • Four levels: Chunks → Metadata → Multi-modal → Faceted landscape
  • Facets give agents “peripheral vision” of unseen information
  • 4x resolution time improvement with proper implementation
