feat: add inst_019 for improved context pressure monitoring

## Problem Identified

ContextPressureMonitor reports "NORMAL" (6.7%) pressure while frequent compaction events occur. User observed disconnect between pressure scores and actual session sustainability.

## Root Cause

Current monitor only tracks response token generation, NOT total context window consumption:

- ✅ Tracks: Response tokens, message counts
- ❌ Missing: Tool result sizes, system prompts, function schemas

## Example from This Session

- Reported tokens: ~50k (25% of budget)
- Actual context used: ~90k+ tokens
  - instruction-history.json read twice (12k tokens)
  - concurrent-session-architecture doc (large)
  - Multiple bash outputs
  - System prompts and reminders

Result: Compaction at "NORMAL" pressure

## inst_019 Requirements

Track total context window consumption:

- Response tokens (current)
- User messages (current)
- Tool result sizes (NEW - estimate from file reads, grep, bash)
- System overhead (NEW - ~5k tokens baseline)
- Compaction risk prediction (NEW - warn when >70% context used)

## Implementation Timeline

- Priority: MEDIUM (doesn't block current work)
- Phase: 4 or 6 (validation engine or polish phase)
- Complexity: 4-6 hours (requires instrumentation of tool calls)

## Impact

- Better compaction prediction
- Earlier handoff warnings
- More accurate pressure reporting
- Reduced unexpected session terminations

Quadrant: OPERATIONAL | Persistence: HIGH | Session: 2025-10-10-api-memory-transition

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
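
The requirements listed in the commit message can be sketched roughly as follows. This is only an illustration, not the project's ContextPressureMonitor: the class `ContextUsage`, the helper `estimate_tokens`, and the 4-characters-per-token heuristic are all hypothetical. The 5k system overhead and the 70% threshold come from the message; the 200k window in the usage example is an assumption inferred from "~50k (25% of budget)".

```python
from dataclasses import dataclass

# All names and constants below are illustrative assumptions.
SYSTEM_OVERHEAD_TOKENS = 5_000   # baseline from the commit message (~5k)
COMPACTION_WARN_RATIO = 0.70     # warn when more than 70% of the window is used
CHARS_PER_TOKEN = 4              # rough heuristic, assumed rather than measured


def estimate_tokens(text: str) -> int:
    """Crude size estimate for a tool result (file read, grep, bash output)."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


@dataclass
class ContextUsage:
    window_tokens: int
    response_tokens: int = 0
    user_tokens: int = 0
    tool_result_tokens: int = 0
    system_overhead_tokens: int = SYSTEM_OVERHEAD_TOKENS

    def record_tool_result(self, output: str) -> None:
        """Add the estimated size of one tool result to the running total."""
        self.tool_result_tokens += estimate_tokens(output)

    @property
    def total_tokens(self) -> int:
        """Sum of every tracked component, not just response tokens."""
        return (self.response_tokens + self.user_tokens
                + self.tool_result_tokens + self.system_overhead_tokens)

    @property
    def pressure(self) -> float:
        """Fraction of the context window consumed."""
        return self.total_tokens / self.window_tokens

    @property
    def compaction_risk(self) -> bool:
        """True once usage exceeds the warning threshold."""
        return self.pressure > COMPACTION_WARN_RATIO


if __name__ == "__main__":
    usage = ContextUsage(window_tokens=200_000, response_tokens=50_000)
    usage.record_tool_result("x" * 48_000)  # stand-in for a large file read
    print(usage.total_tokens, round(usage.pressure, 3), usage.compaction_risk)
```

Keeping each component as a separate field makes it visible which source dominates the total, which is the gap the commit message describes: response tokens alone can look low while tool results push the real total much higher.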
parent 3c9919ee9b
commit 51f6712090
1 changed file with 32 additions and 4 deletions
@@ -335,20 +335,48 @@
       },
       "active": true,
       "notes": "CORRECTED 2025-10-10 - User clarified: 'Development tool' is the CORRECT classification (Tractatus helps developers build projects), not a limitation. The restriction is about honest testing/validation status, not tool category. Once adequately tested, 'production-ready development tool' is appropriate. Previous version incorrectly treated 'development framework' as early-stage status. Framework failure 2025-10-09: Claude claimed 'production-ready' without testing evidence."
+    },
+    {
+      "id": "inst_019",
+      "text": "ContextPressureMonitor MUST account for total context window consumption, not just response token counts. Tool results (file reads, grep outputs, bash results) can consume massive context (6k+ tokens per large file read). System prompts, function schemas, and cumulative tool results significantly increase actual context usage. When compaction events occur frequently despite 'NORMAL' pressure scores, this indicates critical underestimation. Enhanced monitoring should track: response tokens, user messages, tool result sizes, system overhead, and predict compaction risk when context exceeds 70% of window. Implement improved pressure scoring in Phase 4 or Phase 6.",
+      "timestamp": "2025-10-10T23:45:00Z",
+      "quadrant": "OPERATIONAL",
+      "persistence": "HIGH",
+      "temporal_scope": "PROJECT",
+      "verification_required": "MANDATORY",
+      "explicitness": 1.0,
+      "source": "user",
+      "session_id": "2025-10-10-api-memory-transition",
+      "parameters": {
+        "current_limitation": "underestimates_actual_context",
+        "missing_metrics": ["tool_result_sizes", "system_prompt_overhead", "function_schema_overhead", "cumulative_context"],
+        "symptom": "frequent_compaction_despite_normal_scores",
+        "required_tracking": {
+          "response_tokens": "current tracking",
+          "user_messages": "current tracking",
+          "tool_results": "NEW - size estimation needed",
+          "system_overhead": "NEW - approximate 5k tokens",
+          "compaction_risk": "NEW - predict when >70% context used"
+        },
+        "enhancement_phase": ["Phase 4", "Phase 6"],
+        "priority": "MEDIUM"
+      },
+      "active": true,
+      "notes": "IDENTIFIED 2025-10-10 - User observed frequent compaction events despite ContextPressureMonitor reporting 'NORMAL' (6.7%) pressure at 50k token checkpoint. Actual context consumption much higher due to tool results (reading instruction-history.json twice = 12k tokens, concurrent-session doc = large, multiple bash outputs). Current monitor only accurately tracks response generation, not total context window usage. This gap causes unexpected compactions and poor handoff timing. API Memory may reduce impact but won't eliminate root cause."
     }
   ],
   "stats": {
-    "total_instructions": 18,
-    "active_instructions": 18,
+    "total_instructions": 19,
+    "active_instructions": 19,
     "by_quadrant": {
       "STRATEGIC": 6,
-      "OPERATIONAL": 4,
+      "OPERATIONAL": 5,
       "TACTICAL": 1,
       "SYSTEM": 7,
       "STOCHASTIC": 0
     },
     "by_persistence": {
-      "HIGH": 16,
+      "HIGH": 17,
       "MEDIUM": 2,
       "LOW": 0,
       "VARIABLE": 0
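
Because the `stats` block is maintained by hand alongside the instruction list, a small consistency check can catch a missed counter. The sketch below is hypothetical (the function `check_stats` is not part of the repository) and assumes that each breakdown is meant to sum to `total_instructions`. The values are copied from the updated side of the hunk above; any keys beyond the end of the hunk are unknown and not modeled.

```python
def check_stats(stats: dict) -> list[str]:
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    total = stats["total_instructions"]
    # Assumption: every breakdown should account for each instruction once.
    for key in ("by_quadrant", "by_persistence"):
        subtotal = sum(stats[key].values())
        if subtotal != total:
            problems.append(f"{key} sums to {subtotal}, expected {total}")
    if stats["active_instructions"] > total:
        problems.append("active_instructions exceeds total_instructions")
    return problems


# Updated values as they appear in the hunk.
new_stats = {
    "total_instructions": 19,
    "active_instructions": 19,
    "by_quadrant": {
        "STRATEGIC": 6,
        "OPERATIONAL": 5,
        "TACTICAL": 1,
        "SYSTEM": 7,
        "STOCHASTIC": 0,
    },
    "by_persistence": {"HIGH": 17, "MEDIUM": 2, "LOW": 0, "VARIABLE": 0},
}

if __name__ == "__main__":
    print(check_stats(new_stats))
```

With the updated values both breakdowns add up to 19 (6 + 5 + 1 + 7 + 0 and 17 + 2 + 0 + 0), matching `total_instructions`.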