Context Management & Reliability

6 build exercises to practise the concepts in this domain.

5.1 — Build a Persistent Case Facts Context Manager

Implement the persistent case facts block pattern to protect transactional data from summarisation
Trim verbose tool results to relevant fields before they accumulate in context
Recognise and mitigate the progressive summarisation trap for numerical values, dates, and identifiers
Apply the lost-in-the-middle mitigation by placing key findings at the beginning of aggregated inputs
Understand that the Claude API is stateless and each request must include complete conversation history

Implement the three valid escalation triggers: explicit human request, policy exceptions/gaps, and inability to progress
Identify and avoid the two unreliable triggers: sentiment-based escalation and self-reported confidence scores
Handle the frustration nuance: frustrated customer with resolvable issue vs explicit human request
Design ambiguous customer matching that requests additional identifiers rather than selecting heuristically
Add explicit escalation criteria with few-shot examples to system prompts as the proportionate first response

Design structured error context with the four required elements: failure type, attempted action, partial results, and alternative approaches
Distinguish access failures (timeout, connection error) from valid empty results (successful query, no matches)
Identify and avoid the two anti-patterns: silent suppression and workflow termination
Implement local retry logic for transient failures before propagating to the coordinator
Add coverage annotations to synthesis output for transparency about information gaps

Recognise context degradation as an attention quality problem, not a token limit problem
Implement scratchpad files to persist key findings outside the conversation context
Use subagent delegation for context isolation, not just parallelisation
Design crash recovery via structured state manifests for session resilience
Apply summary injection between exploration phases to prevent cold start duplication

Recognise the aggregate metrics trap: 97% overall accuracy can hide 40% error rates on specific document types
Implement accuracy tracking broken down by document type AND field segment
Calibrate raw confidence scores using labelled validation sets to produce reliable routing thresholds
Design stratified random sampling that includes high-confidence extractions for ongoing verification
Prioritise limited reviewer capacity on the highest-uncertainty items with dynamic queue ordering

Design structured claim-source mappings with claim, source URL, document name, excerpt, and publication date
Preserve attribution through multi-step synthesis pipelines without loss during summarisation
Handle conflicting sources by annotating both values rather than arbitrarily selecting one
Use temporal awareness (publication dates) to distinguish trends from contradictions
Apply content-appropriate rendering: tables for financial data, prose for news, lists for technical findings