Pages tagged "llm"
5 pages tagged with llm.
- Prompt Caching — The Production Cost-Optimization Layer for LLM Applications: Prompt caching reuses LLM input tokens across requests, cutting input-token costs by up to 90% (Anthropic cache reads cost 10% of the base input price; OpenAI cached inputs run 75-90% cheaper). Combined caching strategies can reach 70-80% total cost reduction in production. The 2026 production landscape: Anthropic cache_control markers with a 5-minute default TTL (1-hour extended option), OpenAI automatic prompt caching, and semantic caching via vector similarity (Redis, GPTCache). Prompt caching is distinct from KV caching (model-internal) and agentic memory (cross-session persistence).
- Fine-Tuning — What It Means: Plain-English explanation of LLM fine-tuning for business professionals.
- LLM Evals — Evaluation Systems for AI Products: Plain-English guide to building evaluation systems that make AI products actually work.
- LLM Nudges — How AI Guides User Decisions: Understanding the follow-up suggestions AI systems use to shape user behavior and customer journeys.
- Tokens — What They Mean: Plain-English explanation of LLM tokens for business professionals.