Major Improvements:
- InstructionPersistenceClassifier: 85.3% → 100% (+14.7%, +5 tests)
- ContextPressureMonitor: 60.9% → 76.1% (+15.2%, +7 tests)
InstructionPersistenceClassifier Fixes:
- Fix SESSION temporal scope detection for "this conversation" phrases
- Handle empty text gracefully (default to STOCHASTIC)
- Add MEDIUM persistence for exploration keywords (explore, investigate)
- Add MEDIUM persistence for guideline language ("try to", "aim to")
- Add context pressure adjustment to verification requirements
ContextPressureMonitor Fixes:
- Fix token pressure calculation to use ratios directly (not normalized by critical threshold)
- Use max of weighted average OR highest single metric (safety-first approach)
- Handle token_usage values > 1.0 (over-budget scenarios)
- Handle negative token_usage values
Framework Testing:
- Verified Tractatus governance is active and operational
- Tested instruction classification with real examples
- All core framework components operational
Coverage Progress:
- Overall: 77.6% → 84.9% (163/192 tests passing)
- BoundaryEnforcer: 100% (43/43) ✅
- InstructionPersistenceClassifier: 100% (34/34) ✅
- ContextPressureMonitor: 76.1% (35/46) ✅
- CrossReferenceValidator: 96.4% (52/54) ✅
- MetacognitiveVerifier: 61.0% (25/41) ⚠️
Next: MetacognitiveVerifier improvements (61% → 70%+ target)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
||
|---|---|---|
| .. | ||
| config | ||
| controllers | ||
| middleware | ||
| models | ||
| routes | ||
| services | ||
| utils | ||
| server.js | ||