From 6a80f344c14ffd74ade7e1c8f55b68ba72ccfa09 Mon Sep 17 00:00:00 2001
From: TheFlow
Date: Tue, 21 Oct 2025 15:51:26 +1300
Subject: [PATCH] docs(framework): create comprehensive improvement implementation plan
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

ASSESSMENT: Framework effectiveness rated 4/10 this session
- Hooks work (reactive enforcement) ✅
- But don't guide decisions (proactive assistance) ❌
- Metrics collected but not actionable ❌
- Rules exist but aren't consulted during work ❌

KEY FINDING: Framework missed 15+ inst_017 violations for weeks
- Only caught when user manually requested audit
- No proactive scanning or detection
- Framework was REACTIVE, not PROACTIVE

TOP 3 IMPROVEMENTS PLANNED:

1. Proactive Content Scanning (5-7 hours)
   - Auto-scan for inst_016/017/018 violations on session start
   - Pre-commit hook to prevent violations
   - Would have caught all 15 violations immediately

2. Context-Aware Rule Surfacing (8-9 hours)
   - Surface relevant rules based on activity
   - Editing markdown? Show inst_016/017/018
   - Debugging? Show inst_050/024
   - Makes 52 rules actionable when relevant

3. Active MetacognitiveVerifier (9-11 hours)
   - Detect patterns (repeated failures, same file edited 5x)
   - Suggest relevant solutions ("Try minimal reproduction")
   - Would have guided integration test debugging

IMPLEMENTATION:
- Total effort: 32-40 hours (1 month part-time)
- Expected effectiveness: 4/10 → 8/10
- ROI: HIGH - Prevents violations, guides work, reduces debugging time

See: docs/framework-improvements/IMPLEMENTATION_PLAN_2025-10-21.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude
---
 .../IMPLEMENTATION_PLAN_2025-10-21.md | 439 ++++++++++++++++++
 1 file changed, 439 insertions(+)
 create mode 100644 docs/framework-improvements/IMPLEMENTATION_PLAN_2025-10-21.md

diff --git a/docs/framework-improvements/IMPLEMENTATION_PLAN_2025-10-21.md b/docs/framework-improvements/IMPLEMENTATION_PLAN_2025-10-21.md
new file mode 100644
index 00000000..0921f761
--- /dev/null
+++ b/docs/framework-improvements/IMPLEMENTATION_PLAN_2025-10-21.md
@@ -0,0 +1,439 @@
+# Tractatus Framework Improvement Implementation Plan
+**Date**: 2025-10-21
+**Session**: 2025-10-07-001
+**Based On**: Session effectiveness assessment (4/10 rating)
+
+---
+
+## Executive Summary
+
+**Problem**: Framework is architecturally sound but behaviorally passive
+- Hooks work (reactive enforcement) ✅
+- But don't guide decisions (proactive assistance) ❌
+- Metrics collected but not actionable ❌
+- Rules exist but aren't consulted during work ❌
+
+**Impact**: Framework missed 15+ inst_017 violations that existed for weeks
+
+**Solution**: Implement 3 critical improvements to make framework ACTIVE, not passive
+
+---
+
+## Current vs Future State
+
+### Current State (4/10)
+```
+┌─────────────────────────────────────────────────────────────┐
+│                        USER WORKS                           │
+│                                                             │
+│  ┌──────────┐     ┌──────────┐     ┌──────────┐             │
+│  │  Read    │ --> │  Edit    │ --> │ Commit   │             │
+│  │  Files   │     │  Files   │     │ Changes  │             │
+│  └──────────┘     └──────────┘     └──────────┘             │
+│                                                             │
+│  Framework Activity:                                        │
+│  - Hooks validate (background, invisible)                   │
+│  - Metrics collected (not surfaced)                         │
+│  - Rules exist (not consulted)                              │
+│                                                             │
+│  Result: Violations slip through ❌                         │
+└─────────────────────────────────────────────────────────────┘
+```
+
+### Future State (8/10)
+```
+┌─────────────────────────────────────────────────────────────┐
+│  SESSION START                                              │
+│  🔍 Scanning for prohibited terms...                        │
+│  ⚠ Found 15 violations (inst_017)                           │
+│     Run: node scripts/scan-violations.js --fix              │
+│                                                             │
+│                        USER WORKS                           │
+│                                                             │
+│  ┌──────────┐  📋 Editing markdown?                         │
+│  │  Edit    │     Rules: inst_016, inst_017, inst_018       │
+│  │  README  │                                               │
+│  └──────────┘     ┌──────────┐                              │
+│                   │ Validate │                              │
+│                   └──────────┘                              │
+│                                                             │
+│  💡 MetacognitiveVerifier:                                  │
+│     Test failed 3 times - try minimal reproduction?         │
+│     (inst_050)                                              │
+│                                                             │
+│  Result: Violations prevented proactively ✅                │
+└─────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## 🔴 Improvement 1: Proactive Content Scanning
+
+### Problem
+- inst_017 violations (15+ instances of "guarantee") existed for weeks
+- No automated detection until user manually requested audit
+- Framework was REACTIVE, not PROACTIVE
+
+### Solution
+**File**: `scripts/framework-components/ProhibitedTermsScanner.js`
+
+Automated scanner that:
+1. Runs on session start
+2. Scans user-facing files for prohibited terms
+3. Reports violations immediately
+4. Provides auto-fix suggestions
+
+### Integration Points
+1. **Session Init**: Show violations at startup
+2. **Pre-Commit Hook**: Block commits with violations
+3. **CLI Tool**: Manual scanning and fixing
+
+### Example Output
+```bash
+▶ 7. Scanning for Prohibited Terms
+
+  ⚠ Found violations in user-facing content:
+    inst_017: 15 violations
+
+  Run: node scripts/scan-violations.js --details
+  Or:  node scripts/scan-violations.js --fix
+```
+
+### Effort & Impact
+- **Development**: 5-7 hours
+- **Impact**: Would have caught all 15 violations at session start
+- **ROI**: HIGH - Prevents values violations before they reach production
+
+---
+
+## 🔴 Improvement 2: Context-Aware Rule Surfacing
+
+### Problem
+- 52 active rules - too many to remember
+- Rules not surfaced during relevant activities
+- Framework was invisible during decision-making
+
+### Solution
+**File**: `scripts/framework-components/ContextAwareRules.js`
+
+Context detection system that:
+1. Detects activity type (editing markdown, debugging, deploying)
+2. Surfaces relevant rules for that context
+3. Reduces cognitive load (show 3-5 rules, not 52)
+
+### Context Mappings
+```
+editing_markdown    → inst_016, inst_017, inst_018 (content rules)
+editing_public_html → inst_017, inst_041, inst_042 (values + CSP)
+writing_tests       → inst_050, inst_051 (testing rules)
+debugging           → inst_050, inst_024 (minimal repro, document)
+deploying           → inst_038, inst_039 (pre-action, closedown)
+```
+
+### Example Output
+```bash
+📋 You're editing documentation. Remember:
+  • inst_017: NEVER use prohibited terms: 'guarantee', 'guaranteed'
+  • inst_016: Avoid fabricated statistics without sources
+  • inst_018: Accurate status claims (proof-of-concept, not production-ready)
+
+🔍 Hook: Validating file edit: docs/introduction.md
+```
+
+### Effort & Impact
+- **Development**: 8-9 hours
+- **Impact**: Makes 52 rules actionable when relevant
+- **ROI**: HIGH - Guides decisions during work
+
+---
+
+## 🟡 Improvement 3: Active MetacognitiveVerifier
+
+### Problem
+- Spent 2+ hours debugging integration tests without framework guidance
+- Made repeated attempts (trial and error)
+- No suggestions like "Try minimal reproduction"
+
+### Solution
+**Enhanced**: `scripts/framework-components/MetacognitiveVerifier.service.js`
+
+Pattern detection system that:
+1. Logs activities (test runs, file edits, commands)
+2. Detects patterns (repeated failures, same file edited 5+ times)
+3. Surfaces relevant suggestions automatically
+
+### Patterns Detected
+```
+repeated_test_failure → Suggest: Create minimal reproduction (inst_050)
+same_file_edited_5x   → Suggest: Make incremental changes (inst_025)
+high_token_usage      → Suggest: Run pressure check (inst_034)
+long_running_command  → Suggest: Use timeout or background execution
+```
+
+### Example Output
+```bash
+💡 MetacognitiveVerifier: Suggestions available
+
+> node scripts/show-suggestions.js
+
+💡 METACOGNITIVE SUGGESTIONS
+
+1. Repeated test failures detected
+   Related rules: inst_050
+
+   • Create minimal reproduction case
+   • Isolate the failing component
+   • Check test setup (beforeAll/afterAll)
+   • Verify dependencies are connected
+
+2. File edited 7 times: tests/integration/api.auth.test.js
+   Related rules: inst_025
+
+   • Are you making incremental changes?
+   • Test each change before the next
+   • Document what you're learning
+```
+
+### Effort & Impact
+- **Development**: 9-11 hours
+- **Impact**: Guides debugging, reduces trial-and-error time
+- **ROI**: MEDIUM-HIGH - Most helpful for complex problem-solving
+
+---
+
+## Implementation Roadmap
+
+### Phase 1: Proactive Scanning (Week 1)
+**Files to Create**:
+- `scripts/framework-components/ProhibitedTermsScanner.js`
+- `tests/unit/ProhibitedTermsScanner.test.js`
+- `.git/hooks/pre-commit` (optional)
+
+**Modifications**:
+- `scripts/session-init.js` - Add scanning step
+
+**Deliverable**: Session start shows violations immediately
+
+---
+
+### Phase 2: Context Awareness (Week 2)
+**Files to Create**:
+- `scripts/framework-components/ContextAwareRules.js`
+- `scripts/framework-components/context-prompt.js` (CLI tool)
+
+**Modifications**:
+- `scripts/hook-validators/validate-file-edit.js` - Surface rules
+
+**Deliverable**: Relevant rules shown during work
+
+---
+
+### Phase 3: Metacognitive Assistant (Week 3)
+**Files to Create**:
+- `scripts/hook-validators/log-activity.js` (post-tool hook)
+- `scripts/framework-components/show-suggestions.js` (CLI tool)
+
+**Modifications**:
+- `scripts/framework-components/MetacognitiveVerifier.service.js` - Enhance
+
+**Deliverable**: Framework provides suggestions during complex work
+
+---
+
+## Success Criteria
+
+### Effectiveness Target
+**Current**: 4/10
+**Target**: 8/10
+
+### Quantitative Metrics
+
+**Proactive Detection**:
+- ✅ 100% of inst_016/017/018 violations caught on session start
+- ✅ Pre-commit hook prevents violations (0% slip through)
+- ✅ Scan time <5 seconds
+
+**Context Awareness**:
+- ✅ Relevant rules surfaced >90% of the time
+- ✅ User surveys rate rules as helpful (>80%)
+- ✅ Rule overhead <2 seconds per tool use
+
+**Metacognitive Assistance**:
+- ✅ Suggestions appear after 3rd repeated failure
+- ✅ Pattern detection accuracy >80%
+- ✅ User reports reduced debugging time (30%+ improvement)
+
+---
+
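The pattern detection described under Improvement 3, together with the "suggestions appear after 3rd repeated failure" target above, might be sketched roughly as follows. This is a minimal illustration only: the class name `ActivityPatternDetector`, the `edit` activity type with its `file` field, the default thresholds, and the shape of the returned suggestions are all assumptions, not the actual `MetacognitiveVerifier.service.js` code.

```javascript
// Hypothetical sketch: the class name, thresholds, and suggestion shape are
// assumptions for illustration, not the actual MetacognitiveVerifier code.
class ActivityPatternDetector {
  constructor({ failureThreshold = 3, editThreshold = 5 } = {}) {
    this.failureThreshold = failureThreshold;
    this.editThreshold = editThreshold;
    this.consecutiveTestFailures = 0;
    this.editCounts = new Map(); // file path -> number of edits this session
  }

  // Record one activity. The `edit` type and `file` field are assumed here.
  logActivity({ type, command = '', exitCode, file }) {
    if (type === 'bash' && /\btest\b/.test(command)) {
      // A passing run resets the streak; a failing run extends it.
      this.consecutiveTestFailures =
        exitCode === 0 ? 0 : this.consecutiveTestFailures + 1;
    } else if (type === 'edit' && file) {
      this.editCounts.set(file, (this.editCounts.get(file) || 0) + 1);
    }
  }

  // Return the suggestions whose pattern thresholds have been reached.
  checkPatterns() {
    const suggestions = [];
    if (this.consecutiveTestFailures >= this.failureThreshold) {
      suggestions.push({
        pattern: 'repeated_test_failure',
        rule: 'inst_050',
        advice: 'Create minimal reproduction',
      });
    }
    for (const [file, count] of this.editCounts) {
      if (count >= this.editThreshold) {
        suggestions.push({
          pattern: 'same_file_edited_5x',
          rule: 'inst_025',
          advice: 'Make incremental changes',
          file,
        });
      }
    }
    return suggestions;
  }
}

// Usage: three failing test runs in a row trigger the inst_050 suggestion.
const detector = new ActivityPatternDetector();
for (let attempt = 0; attempt < 3; attempt += 1) {
  detector.logActivity({ type: 'bash', command: 'npm test', exitCode: 1, duration: 5000 });
}
console.log(detector.checkPatterns().map((s) => s.pattern)); // [ 'repeated_test_failure' ]
```

Keeping the thresholds as constructor options leaves room for the configurable threshold (3 failures vs 5) that the notification-fatigue mitigation calls for. Rate limiting is deliberately left out of this sketch.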
+## Resource Requirements
+
+### Development Time
+- **Phase 1**: 5-7 hours
+- **Phase 2**: 8-9 hours
+- **Phase 3**: 9-11 hours
+- **Total**: 22-27 hours (3-4 weeks part-time)
+
+### Testing Time
+- **Unit Tests**: 5-6 hours
+- **Integration Testing**: 3-4 hours
+- **User Testing**: 2-3 hours
+- **Total**: 10-13 hours
+
+### Grand Total: 32-40 hours (1 month part-time)
+
+---
+
+## Risks & Mitigation
+
+### Risk 1: Notification Fatigue
+**Risk**: Too many suggestions become annoying
+
+**Mitigation**:
+- Rate limit to 1 suggestion per 10 minutes
+- Allow `--quiet` mode
+- User can configure threshold (3 failures vs 5)
+
+### Risk 2: False Positives
+**Risk**: Scanner flags legitimate uses
+
+**Mitigation**:
+- Comprehensive exclude patterns (tests, case studies)
+- Easy whitelist mechanism
+- Context-aware scanning
+
+### Risk 3: Performance Impact
+**Risk**: Scanning slows session start
+
+**Mitigation**:
+- Scan only user-facing files (not node_modules, tests)
+- Run asynchronously, show when ready
+- Cache results, re-scan only changed files
+
+---
+
+## Expected Outcomes
+
+### Immediate Benefits (Phase 1)
+1. Zero inst_017 violations in future commits
+2. Violations caught before they reach production
+3. User confidence in framework enforcement
+
+### Medium-term Benefits (Phase 2)
+1. Reduced cognitive load (don't need to remember 52 rules)
+2. Rules become part of natural workflow
+3. Faster decision-making with relevant context
+
+### Long-term Benefits (Phase 3)
+1. Reduced debugging time (30%+ improvement)
+2. Better problem-solving patterns
+3. Framework actively guides learning
+
+---
+
+## Next Steps
+
+### Immediate
+1. Review this plan with user
+2. Get approval to proceed
+3. Set up development branch
+
+### Week 1
+1. Implement ProhibitedTermsScanner.js
+2. Write unit tests
+3. Integrate with session-init.js
+4. Test on current codebase
+
+### Week 2
+1. Implement ContextAwareRules.js
+2. Build context mappings
+3. Integrate with hooks
+4. User testing
+
+### Week 3
+1. Enhance MetacognitiveVerifier
+2. Implement pattern detection
+3. Build CLI tools
+4. Final integration testing
+
+---
+
+## Appendix: Technical Specifications
+
+### ProhibitedTermsScanner API
+```javascript
+const scanner = new ProhibitedTermsScanner();
+
+// Scan all files
+const violations = await scanner.scan();
+
+// Scan with options (separate name, since `violations` is already declared)
+const violationsWithOptions = await scanner.scan({
+  silent: false,
+  fixMode: false,
+  staged: false // true = Git staged files only
+});
+
+// Auto-fix (simple replacements)
+const result = await scanner.autoFix(violations);
+// => { fixed: 12, total: 15 }
+```
+
+### ContextAwareRules API
+```javascript
+const contextRules = new ContextAwareRules();
+
+// Detect context
+const contexts = contextRules.detectContext('public/index.html');
+// => ['editing_public_html']
+
+// Get relevant rules
+const rules = contextRules.getRelevantRules('editing_public_html');
+// => [{ id: 'inst_017', text: '...', quadrant: 'VALUES' }]
+
+// Format for display
+const message = contextRules.formatRulesForDisplay('editing_public_html');
+// => "📋 You're editing public HTML. Remember:..."
+```
+
+### MetacognitiveVerifier API
+```javascript
+const verifier = new MetacognitiveVerifier();
+
+// Log activity
+verifier.logActivity({
+  type: 'bash',
+  command: 'npm test',
+  exitCode: 1,
+  duration: 5000
+});
+
+// Check patterns
+verifier.checkPatterns(tokenCount);
+// => Surfaces suggestions if patterns detected
+
+// Clear suggestions
+verifier.clearSuggestions();
+```
+
+---
+
+## Conclusion
+
+The Tractatus Framework has **excellent architecture** but **weak behavioral integration**. These 3 improvements transform it from a passive validator to an active assistant.
+
+**Key Insight**: Framework needs to be PROACTIVE, not just REACTIVE.
+
+**Bottom Line**: With these improvements, framework effectiveness goes from 4/10 to 8/10.
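To make the ProhibitedTermsScanner specification a little more concrete, the core matching step could look something like the sketch below. It works on an in-memory string rather than the file system, and everything beyond the two inst_017 terms quoted in the plan is an assumption: the `PROHIBITED_TERMS` table, the `scanText` function name, and the violation record shape are hypothetical and not taken from the real scanner.

```javascript
// Hypothetical sketch: only the two inst_017 terms come from the plan;
// the table layout, function name, and result shape are assumptions.
const PROHIBITED_TERMS = {
  inst_017: ['guarantee', 'guaranteed'],
};

// Escape regex metacharacters so terms are matched literally.
function escapeRegExp(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Scan one document's text and return a record per whole-word match.
// Inflected forms not listed in the table (e.g. "guarantees") are not matched.
function scanText(text, file = '<inline>') {
  const violations = [];
  const lines = text.split('\n');
  for (const [rule, terms] of Object.entries(PROHIBITED_TERMS)) {
    const pattern = new RegExp(`\\b(${terms.map(escapeRegExp).join('|')})\\b`, 'gi');
    lines.forEach((line, index) => {
      for (const match of line.matchAll(pattern)) {
        violations.push({
          rule,
          file,
          line: index + 1, // 1-based line number for reporting
          column: match.index + 1,
          term: match[0].toLowerCase(),
        });
      }
    });
  }
  return violations;
}

// Usage: two violations are reported, one per line.
const sample = 'We guarantee uptime.\nResults are not Guaranteed.';
console.log(scanText(sample, 'README.md'));
```

A file-walking wrapper, the exclude patterns from Risk 2, and the `--fix` replacements would sit on top of a function like this. They are omitted because the plan does not specify them.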
+
+---
+
+**Status**: Ready for implementation
+**Approval Required**: User sign-off to proceed
+**Timeline**: 1 month part-time development
+**Expected ROI**: High - Prevents violations, guides work, reduces debugging time
+
+---
+
+**Created**: 2025-10-21
+**Author**: Claude Code (Tractatus Framework v3.4)
+**Session**: 2025-10-07-001