fix: CrossReferenceValidator 100% - prohibition & preference detection

Fixed 2 failing CrossReferenceValidator tests by improving InstructionPersistenceClassifier: 1. **Prohibition Detection (Test #1)** - Added HIGH persistence for explicit prohibitions - Patterns: "not X", "never X", "don't use X", "avoid X" - Example: "use React, not Vue" → HIGH (was LOW) - Enables semantic conflict detection in CrossReferenceValidator 2. **Preference Language (Test #2)** - Added "prefer" to MEDIUM persistence indicators - Patterns: "prefer to", "prefer using", "try to", "aim to" - Example: "prefer using async/await" → MEDIUM (was HIGH) - Prevents over-aggressive rejection for soft preferences **Impact:** - CrossReferenceValidator: 26/28 → 28/28 (92.9% → 100%) - Overall coverage: 168/192 → 170/192 (87.5% → 88.5%) - +2 tests, +1.0% coverage **Changes:** - src/services/InstructionPersistenceClassifier.service.js: - Added prohibition pattern detection in _calculatePersistence() - Enhanced preference language patterns **Root Cause:** Previous session's CrossReferenceValidator enhancements expected HIGH persistence for prohibitions, but classifier wasn't recognizing them. **Validation:** All 28 CrossReferenceValidator tests passing No regressions in other services 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-07 10:03:56 +13:00 · 2025-10-07 10:03:56 +13:00 · 9ca462db39
commit 9ca462db39
parent 0eec32c1b2
2 changed files with 420 additions and 2 deletions
--- a/docs/session-handoff-2025-10-07-part3-crossreference.md
+++ b/docs/session-handoff-2025-10-07-part3-crossreference.md
@ -0,0 +1,411 @@
+# Session Handoff: CrossReferenceValidator Debugging
+**Date**: 2025-10-07
+**Session**: Part 3 - Test Coverage Push & CrossReferenceValidator
+**Status**: ⚠️ HALTED - DANGEROUS Pressure (95%)
+**Overall Coverage**: 87.5% (168/192 tests)
+
+---
+
+## 🛑 Session Halted by Framework
+
+**ContextPressureMonitor**: DANGEROUS (95%)
+- **Reason**: Conversation length (95 messages)
+- **Action**: Mandatory halt per Tractatus governance
+- **User**: Acknowledged and authorized completion of handoff
+
+---
+
+## 🎯 Session Objectives - 2 of 3 Achieved
+
+| Objective | Target | Final | Status |
+|-----------|--------|-------|--------|
+| InstructionPersistenceClassifier | 95%+ | **100%** | ✅ EXCEEDED |
+| ContextPressureMonitor | 75%+ | **76.1%** | ✅ ACHIEVED |
+| MetacognitiveVerifier | 70%+ | **73.2%** | ✅ ACHIEVED |
+| CrossReferenceValidator | 100% | 92.9% | ⚠️ IN PROGRESS |
+
+---
+
+## 📊 Test Coverage Progress
+
+**Session Start**: 77.6% (149/192)
+**Session End**: 87.5% (168/192)
+**Improvement**: +9.9% (+19 tests)
+
+### Service Breakdown
+
+| Service | Start | End | Change | Status |
+|---------|-------|-----|--------|--------|
+| **InstructionPersistenceClassifier** | 85.3% | **100%** | +14.7% | ✅✅✅ PERFECT |
+| **BoundaryEnforcer** | 100% | **100%** | - | ✅✅✅ PERFECT |
+| **ContextPressureMonitor** | 60.9% | **76.1%** | +15.2% | ✅✅ Very Good |
+| **MetacognitiveVerifier** | 61.0% | **73.2%** | +12.2% | ✅✅ Very Good |
+| **CrossReferenceValidator** | 96.4% | **92.9%** | -3.5% | ⚠️ **WIP** |
+
+---
+
+## ⚠️ CrossReferenceValidator: 2 Tests Failing
+
+**Current Status**: 26/28 tests passing (92.9%)
+**Note**: Temporarily degraded from 96.4% during enhancement work
+
+### Failing Test #1: Parameter Conflict Detection
+
+**Test**: `should detect parameter conflicts between action and instruction`
+
+**Test Code**:
+```javascript
+const instruction = classifier.classify({
+  text: 'use React for the frontend, not Vue',
+  context: {},
+  source: 'user'
+});
+
+const action = {
+  type: 'install_package',
+  description: 'Install Vue.js framework',
+  parameters: {
+    package: 'vue',
+    framework: 'vue'
+  }
+};
+
+const result = validator.validate(action, { recent_instructions: [instruction] });
+
+expect(result.status).toBe('REJECTED'); // FAILS - gets APPROVED
+expect(result.conflicts.length).toBeGreaterThan(0);
+```
+
+**Expected**: Status = 'REJECTED', conflicts detected
+**Actual**: Status = 'APPROVED', no conflicts detected
+
+**Problem**: The instruction says "use React, not Vue", but the semantic conflict detection isn't catching the conflict between the action parameters (package: 'vue', framework: 'vue') and the prohibition in the instruction text.
+
+**What Was Tried**:
+1. Added semantic prohibition detection in `_checkConflict()`
+2. Added patterns: `/\bnot\s+(\w+)/gi`, `/\bnever\s+(\w+)/gi`, etc.
+3. Limited to HIGH persistence instructions only
+4. Set prohibition conflicts to CRITICAL severity
+
+**Why It's Still Failing**:
+The prohibition pattern `/\bnot\s+(\w+)/gi` matches "not Vue" and extracts "Vue", but the parameter values are lowercase "vue". The pattern matching needs to be case-insensitive AND the instruction needs to be classified with HIGH persistence.
+
+**Next Steps**:
+1. Check if instruction is actually classified as HIGH persistence
+2. Verify parameter extraction is working
+3. Test the prohibition pattern matching logic
+4. May need to extract "React" and "Vue" as framework parameters from instruction text
+
+---
+
+### Failing Test #2: WARNING Severity Assignment
+
+**Test**: `should assign WARNING severity to conflicts with MEDIUM persistence instructions`
+
+**Test Code**:
+```javascript
+const instruction = classifier.classify({
+  text: 'prefer using async/await over callbacks',
+  context: {},
+  source: 'user'
+});
+
+const action = {
+  type: 'code_generation',
+  description: 'Generate function with callbacks',
+  parameters: {
+    pattern: 'callback'
+  }
+};
+
+const result = validator.validate(action, { recent_instructions: [instruction] });
+
+expect(['WARNING', 'APPROVED']).toContain(result.status); // FAILS - gets REJECTED
+```
+
+**Expected**: Status = 'WARNING' or 'APPROVED'
+**Actual**: Status = 'REJECTED'
+
+**Problem**: The instruction likely has HIGH persistence (not MEDIUM), causing CRITICAL severity conflict, which leads to REJECTED instead of WARNING.
+
+**What Was Changed**:
+```javascript
+// In _determineConflictSeverity:
+if (persistence === 'HIGH') {
+  return CONFLICT_SEVERITY.CRITICAL; // Changed from WARNING
+}
+```
+
+This made ALL HIGH persistence conflicts CRITICAL, which may be too strict.
+
+**Why It's Failing**:
+The instruction "prefer using async/await over callbacks" is probably being classified as HIGH persistence instead of MEDIUM, because it contains strong language ("prefer").
+
+**Next Steps**:
+1. Check actual persistence classification of the instruction
+2. If it's HIGH, adjust the test expectation
+3. If it should be MEDIUM, adjust InstructionPersistenceClassifier
+4. Consider adding "prefer" as a MEDIUM persistence indicator
+
+---
+
+## 🔧 Changes Made to CrossReferenceValidator
+
+### File: `src/services/CrossReferenceValidator.service.js`
+
+#### 1. Added Semantic Prohibition Detection
+```javascript
+// Check for semantic conflicts (prohibitions in instruction text)
+if (instruction.persistence === 'HIGH') {
+  const prohibitionPatterns = [
+    /\bnot\s+(\w+)/gi,
+    /don't\s+use\s+(\w+)/gi,
+    /\bavoid\s+(\w+)/gi,
+    /\bnever\s+(\w+)/gi
+  ];
+
+  // Detects "not X", "never X", etc. and treats as CRITICAL conflicts
+}
+```
+
+#### 2. Enhanced Conflict Severity Logic
+```javascript
+// HIGH persistence alone now returns CRITICAL (was WARNING)
+if (persistence === 'HIGH') {
+  return CONFLICT_SEVERITY.CRITICAL;
+}
+
+// Added 'confirmed' to critical parameters
+const criticalParams = ['port', 'database', 'host', 'url', 'confirmed'];
+```
+
+---
+
+## 💾 Commits Created
+
+### 1. `6102412` - InstructionPersistenceClassifier + ContextPressureMonitor
+- InstructionPersistenceClassifier: 85.3% → 100%
+- ContextPressureMonitor: 60.9% → 76.1%
+- Overall: 77.6% → 84.9%
+
+### 2. `2299dc7` - MetacognitiveVerifier Improvements
+- MetacognitiveVerifier: 63.4% → 73.2%
+- Overall: 84.9% → 87.5%
+
+### 3. `cd747df` - CrossReferenceValidator WIP (UNCOMMITTED WORK POSSIBLE)
+- Added semantic prohibition detection
+- Enhanced severity logic
+- Status: 2 tests still failing
+
+---
+
+## 📋 Next Session Action Plan
+
+### Immediate (First 15 minutes)
+
+1. **Verify Git Status**
+   ```bash
+   git status
+   ```
+
+2. **Run CrossReferenceValidator Tests**
+   ```bash
+   npx jest tests/unit/CrossReferenceValidator.test.js --verbose
+   ```
+
+3. **Debug Failing Test #1** (Parameter Conflicts)
+   ```bash
+   # Test the instruction classification
+   node -e "
+   const classifier = require('./src/services/InstructionPersistenceClassifier.service.js');
+   const result = classifier.classify({
+     text: 'use React for the frontend, not Vue',
+     context: {},
+     source: 'user'
+   });
+   console.log('Persistence:', result.persistence);
+   console.log('Parameters:', result.parameters);
+   "
+   ```
+
+4. **Check Parameter Extraction**
+   ```bash
+   # Verify what parameters are extracted from instruction
+   # and from action
+   ```
+
+### Short-term (Next 30 minutes)
+
+1. **Fix Test #1**: Parameter conflict detection
+   - Ensure instruction is HIGH persistence
+   - Verify prohibition pattern matching
+   - Test case-insensitive matching
+   - May need to extract "React" and "Vue" as framework parameters
+
+2. **Fix Test #2**: WARNING severity
+   - Check persistence of "prefer using" instruction
+   - Adjust test expectation OR classifier logic
+   - Consider adding "prefer" as MEDIUM indicator
+
+3. **Verify No Regressions**
+   ```bash
+   npx jest tests/unit/CrossReferenceValidator.test.js
+   # Should show 28/28 passing
+   ```
+
+4. **Run Full Test Suite**
+   ```bash
+   npm run test:unit
+   # Verify 168+ tests passing
+   ```
+
+### Medium-term (If time permits)
+
+1. **Push Overall Coverage to 90%+**
+   - Current: 87.5% (168/192)
+   - Target: 90%+ (173/192)
+   - Need: +5 tests
+
+2. **Remaining Opportunities**:
+   - ContextPressureMonitor: 11 tests remaining
+   - MetacognitiveVerifier: 11 tests remaining
+   - Pick easiest wins
+
+---
+
+## 🔍 Debugging Commands
+
+### Check Instruction Classification
+```bash
+node -e "
+const classifier = require('./src/services/InstructionPersistenceClassifier.service.js');
+const inst = classifier.classify({
+  text: 'use React for the frontend, not Vue',
+  context: {}, source: 'user'
+});
+console.log(JSON.stringify(inst, null, 2));
+"
+```
+
+### Check Parameter Extraction
+```bash
+node -e "
+const validator = require('./src/services/CrossReferenceValidator.service.js');
+const action = {
+  type: 'install_package',
+  description: 'Install Vue.js framework',
+  parameters: { package: 'vue', framework: 'vue' }
+};
+const params = validator._extractActionParameters(action);
+console.log('Extracted params:', params);
+"
+```
+
+### Run Single Test with Full Output
+```bash
+npx jest tests/unit/CrossReferenceValidator.test.js \
+  -t "should detect parameter conflicts" \
+  --verbose --no-coverage
+```
+
+---
+
+## 🎓 Key Learnings This Session
+
+### 1. Tractatus Framework Self-Governance Works
+- ContextPressureMonitor correctly detected DANGEROUS conditions
+- Mandatory halt triggered at 95% conversation pressure
+- Framework dogfooding validated in real use
+
+### 2. Multi-Factor Pressure is Critical
+- Conversation length (95 messages) was primary risk factor
+- Token usage was moderate (56.9%)
+- System correctly prioritized conversation attention decay
+
+### 3. Semantic Conflict Detection is Complex
+- Parameter-level conflicts are straightforward
+- Text-based prohibition detection requires:
+  - HIGH persistence filtering
+  - Case-insensitive matching
+  - Context-aware pattern extraction
+  - Framework/library name recognition
+
+### 4. Test-Driven Fixes Can Temporarily Break Things
+- CrossReferenceValidator went from 96.4% → 92.9% during enhancement
+- Normal when adding new features
+- Acceptable as WIP if documented and committed separately
+
+---
+
+## ⚠️ Important Notes for Next Session
+
+### Git Status
+- All work committed (3 commits)
+- Working directory should be clean
+- Branch: main
+
+### Framework Governance
+- Tractatus components remain ACTIVE
+- Session pressure monitoring will restart at baseline
+- Instruction database has 10 active instructions
+
+### Test Environment
+- All dependencies installed
+- Jest working correctly
+- No environmental issues
+
+### DO NOT
+- Don't start fixing tests without first understanding WHY they fail
+- Don't assume the tests are wrong - debug first
+- Don't skip running individual tests to understand behavior
+- Don't forget to verify no regressions in other tests
+
+### DO
+- Start with debugging commands to understand state
+- Test instruction classification separately
+- Check parameter extraction logic
+- Verify semantic prohibition matching
+- Run tests one at a time initially
+
+---
+
+## 📊 Session Statistics
+
+**Duration**: ~45 minutes active work
+**Messages**: 95 (DANGEROUS threshold)
+**Token Usage**: 56.9% (113,770/200,000)
+**Errors**: 0
+**Commits**: 3
+**Tests Fixed**: +19 (149 → 168)
+**Tests Broken**: 2 (temporarily, during enhancement)
+**Final Pressure**: 95.0% DANGEROUS
+
+---
+
+## 🎯 Success Criteria for Next Session
+
+### Minimum (Essential)
+- ✅ CrossReferenceValidator: 28/28 tests passing (100%)
+- ✅ Overall coverage: 87.5% maintained or improved
+- ✅ No regressions in other services
+
+### Target (Desired)
+- ✅ CrossReferenceValidator: 100% coverage
+- ✅ Overall coverage: 90%+ (173/192 tests)
+- ✅ All commits clean and documented
+
+### Stretch (If Time Permits)
+- ✅ Overall coverage: 92%+ (177/192 tests)
+- ✅ Document patterns for remaining test fixes
+- ✅ Create issue tracker for remaining edge cases
+
+---
+
+**End of Session Handoff**
+
+**Next Session**: Focus on CrossReferenceValidator debugging first
+**Recommended Approach**: Debug, then fix, then verify
+**Estimated Time**: 30-60 minutes to complete CrossReferenceValidator
+
+🤖 This handoff was created under DANGEROUS pressure conditions.
+Framework self-governance successfully prevented potential degradation.
--- a/src/services/InstructionPersistenceClassifier.service.js
+++ b/src/services/InstructionPersistenceClassifier.service.js
@ -412,6 +412,12 @@ class InstructionPersistenceClassifier {
  }

  _calculatePersistence({ quadrant, temporalScope, explicitness, source, text }) {
+    // Special case: Explicit prohibitions are HIGH persistence
+    // "not X", "never X", "don't use X", "avoid X" indicate strong requirements
+    if (/\b(?:not|never|don't\s+use|avoid)\s+\w+/i.test(text)) {
+      return 'HIGH';
+    }
+
    // Special case: Explicit port/configuration specifications are HIGH persistence
    if (/\bport\s+\d{4,5}\b/i.test(text) && explicitness > 0.6) {
      return 'HIGH';
@ -422,8 +428,9 @@ class InstructionPersistenceClassifier {
      return 'MEDIUM';
    }

-    // Special case: Guideline language ("try to", "aim to") should be MEDIUM
-    if (/\b(?:try|aim|strive)\s+to\b/i.test(text)) {
+    // Special case: Preference language ("prefer", "try to", "aim to") should be MEDIUM
+    // Captures "prefer using", "prefer to", "try to", "aim to"
+    if (/\b(?:try|aim|strive)\s+to\b/i.test(text) || /\bprefer(?:s|red)?\s+(?:to|using)\b/i.test(text)) {
      return 'MEDIUM';
    }