Context Management & Reliability

6 build exercises to practise the concepts in this domain.

5.1Build a Persistent Case Facts Context Manager

Intermediate
45 minutes
  • Implement the persistent case facts block pattern to protect transactional data from summarisation
  • Trim verbose tool results to relevant fields before they accumulate in context
  • Recognise and mitigate the progressive summarisation trap for numerical values, dates, and identifiers
  • Apply the lost-in-the-middle mitigation by placing key findings at the beginning of aggregated inputs
  • Understand that the Claude API is stateless and each request must include complete conversation history

5.2Build an Escalation Decision Engine

Intermediate
40 minutes
  • Implement the three valid escalation triggers: explicit human request, policy exceptions/gaps, and inability to progress
  • Identify and avoid the two unreliable triggers: sentiment-based escalation and self-reported confidence scores
  • Handle the frustration nuance: frustrated customer with resolvable issue vs explicit human request
  • Design ambiguous customer matching that requests additional identifiers rather than selecting heuristically
  • Add explicit escalation criteria with few-shot examples to system prompts as the proportionate first response

5.3Build a Structured Error Propagation System

Advanced
50 minutes
  • Design structured error context with the four required elements: failure type, attempted action, partial results, and alternative approaches
  • Distinguish access failures (timeout, connection error) from valid empty results (successful query, no matches)
  • Identify and avoid the two anti-patterns: silent suppression and workflow termination
  • Implement local retry logic for transient failures before propagating to the coordinator
  • Add coverage annotations to synthesis output for transparency about information gaps

5.4Build a Context-Resilient Codebase Explorer

Advanced
60 minutes
  • Recognise context degradation as an attention quality problem, not a token limit problem
  • Implement scratchpad files to persist key findings outside the conversation context
  • Use subagent delegation for context isolation, not just parallelisation
  • Design crash recovery via structured state manifests for session resilience
  • Apply summary injection between exploration phases to prevent cold start duplication

5.5Build a Confidence-Calibrated Review Router

Advanced
50 minutes
  • Recognise the aggregate metrics trap: 97% overall accuracy can hide 40% error rates on specific document types
  • Implement accuracy tracking broken down by document type AND field segment
  • Calibrate raw confidence scores using labelled validation sets to produce reliable routing thresholds
  • Design stratified random sampling that includes high-confidence extractions for ongoing verification
  • Prioritise limited reviewer capacity on the highest-uncertainty items with dynamic queue ordering

5.6Build a Provenance-Preserving Synthesis Pipeline

Advanced
60 minutes
  • Design structured claim-source mappings with claim, source URL, document name, excerpt, and publication date
  • Preserve attribution through multi-step synthesis pipelines without loss during summarisation
  • Handle conflicting sources by annotating both values rather than arbitrarily selecting one
  • Use temporal awareness (publication dates) to distinguish trends from contradictions
  • Apply content-appropriate rendering: tables for financial data, prose for news, lists for technical findings