tractatus/docs/framework-improvements/IMPLEMENTATION_PLAN_2025-10-21.md
TheFlow 6a80f344c1 docs(framework): create comprehensive improvement implementation plan
ASSESSMENT: Framework effectiveness rated 4/10 this session
- Hooks work (reactive enforcement) 
- But don't guide decisions (proactive assistance) 
- Metrics collected but not actionable 
- Rules exist but aren't consulted during work 

KEY FINDING: Framework missed 15+ inst_017 violations for weeks
- Only caught when user manually requested audit
- No proactive scanning or detection
- Framework was REACTIVE, not PROACTIVE

TOP 3 IMPROVEMENTS PLANNED:

1. Proactive Content Scanning (5-7 hours)
   - Auto-scan for inst_016/017/018 violations on session start
   - Pre-commit hook to prevent violations
   - Would have caught all 15 violations immediately

2. Context-Aware Rule Surfacing (8-9 hours)
   - Surface relevant rules based on activity
   - Editing markdown? Show inst_016/017/018
   - Debugging? Show inst_050/024
   - Makes 52 rules actionable when relevant

3. Active MetacognitiveVerifier (9-11 hours)
   - Detect patterns (repeated failures, same file edited 5x)
   - Suggest relevant solutions ("Try minimal reproduction")
   - Would have guided integration test debugging

IMPLEMENTATION:
- Total effort: 32-40 hours (1 month part-time)
- Expected effectiveness: 4/10 → 8/10
- ROI: HIGH - Prevents violations, guides work, reduces debugging time

See: docs/framework-improvements/IMPLEMENTATION_PLAN_2025-10-21.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-21 15:51:26 +13:00


Tractatus Framework Improvement Implementation Plan

Date: 2025-10-21
Session: 2025-10-07-001
Based On: Session effectiveness assessment (4/10 rating)


Executive Summary

Problem: Framework is architecturally sound but behaviorally passive

  • Hooks work (reactive enforcement)
  • But don't guide decisions (proactive assistance)
  • Metrics collected but not actionable
  • Rules exist but aren't consulted during work

Impact: Framework missed 15+ inst_017 violations that existed for weeks

Solution: Implement 3 critical improvements to make the framework ACTIVE, not passive


Current vs Future State

Current State (4/10)

┌─────────────────────────────────────────────────────────────┐
│  USER WORKS                                                 │
│                                                             │
│  ┌──────────┐     ┌──────────┐     ┌──────────┐          │
│  │ Read     │ --> │ Edit     │ --> │ Commit   │          │
│  │ Files    │     │ Files    │     │ Changes  │          │
│  └──────────┘     └──────────┘     └──────────┘          │
│                                                             │
│  Framework Activity:                                        │
│  - Hooks validate (background, invisible)                   │
│  - Metrics collected (not surfaced)                         │
│  - Rules exist (not consulted)                              │
│                                                             │
│  Result: Violations slip through ❌                         │
└─────────────────────────────────────────────────────────────┘

Future State (8/10)

┌─────────────────────────────────────────────────────────────┐
│  SESSION START                                              │
│  🔍 Scanning for prohibited terms...                        │
│      ⚠ Found 15 violations (inst_017)                       │
│      Run: node scripts/scan-violations.js --fix             │
│                                                             │
│  USER WORKS                                                 │
│                                                             │
│  ┌──────────┐     📋 Editing markdown?                     │
│  │ Edit     │         Rules: inst_016, inst_017, inst_018  │
│  │ README   │                                               │
│  └──────────┘     ┌──────────┐                             │
│                   │ Validate │                             │
│                   └──────────┘                             │
│                                                             │
│  💡 MetacognitiveVerifier:                                  │
│      Test failed 3 times - try minimal reproduction?        │
│      (inst_050)                                             │
│                                                             │
│  Result: Violations prevented proactively ✅                │
└─────────────────────────────────────────────────────────────┘

🔴 Improvement 1: Proactive Content Scanning

Problem

  • inst_017 violations (15+ instances of "guarantee") existed for weeks
  • No automated detection until user manually requested audit
  • Framework was REACTIVE, not PROACTIVE

Solution

File: scripts/framework-components/ProhibitedTermsScanner.js

Automated scanner that:

  1. Runs on session start
  2. Scans user-facing files for prohibited terms
  3. Reports violations immediately
  4. Provides auto-fix suggestions
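As a sketch, the core of such a scanner could be a word-boundary regex pass over each line. The term list and violation shape below are illustrative assumptions, not the real inst_017 definition:

```javascript
// Minimal scanning core (sketch). PROHIBITED_TERMS is an assumed example
// list; the real terms would be loaded from the inst_017 rule definition.
const PROHIBITED_TERMS = ['guarantee', 'guaranteed'];

function scanText(text, filename) {
  const violations = [];
  text.split('\n').forEach((line, i) => {
    for (const term of PROHIBITED_TERMS) {
      // Word-boundary, case-insensitive match so partial words
      // (e.g. identifiers containing the term) are not flagged.
      const re = new RegExp(`\\b${term}\\b`, 'gi');
      let match;
      while ((match = re.exec(line)) !== null) {
        violations.push({
          file: filename,
          line: i + 1,
          column: match.index + 1,
          term,
          rule: 'inst_017',
        });
      }
    }
  });
  return violations;
}
```

A real scanner would walk only user-facing files and aggregate results per rule before printing a summary.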

Integration Points

  1. Session Init: Show violations at startup
  2. Pre-Commit Hook: Block commits with violations
  3. CLI Tool: Manual scanning and fixing

Example Output

▶ 7. Scanning for Prohibited Terms

  ⚠ Found violations in user-facing content:
    inst_017: 15 violations

  Run: node scripts/scan-violations.js --details
  Or:  node scripts/scan-violations.js --fix

Effort & Impact

  • Development: 5-7 hours
  • Impact: Would have caught all 15 violations at session start
  • ROI: HIGH - Prevents values violations before they reach production

🔴 Improvement 2: Context-Aware Rule Surfacing

Problem

  • 52 active rules - too many to remember
  • Rules not surfaced during relevant activities
  • Framework was invisible during decision-making

Solution

File: scripts/framework-components/ContextAwareRules.js

Context detection system that:

  1. Detects activity type (editing markdown, debugging, deploying)
  2. Surfaces relevant rules for that context
  3. Reduces cognitive load (show 3-5 rules, not 52)

Context Mappings

editing_markdown    → inst_016, inst_017, inst_018 (content rules)
editing_public_html → inst_017, inst_041, inst_042 (values + CSP)
writing_tests       → inst_050, inst_051 (testing rules)
debugging           → inst_050, inst_024 (minimal repro, document)
deploying           → inst_038, inst_039 (pre-action, closedown)
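A sketch of the mapping above as code. The file-pattern tests are illustrative assumptions; the real detector would also need to look at recent commands for contexts like debugging or deploying:

```javascript
// Sketch: map a file path to zero or more activity contexts.
// The patterns are assumed examples mirroring the table above.
const CONTEXT_PATTERNS = [
  { context: 'editing_public_html', matches: p => p.startsWith('public/') && p.endsWith('.html') },
  { context: 'writing_tests', matches: p => /\.test\.js$/.test(p) },
  { context: 'editing_markdown', matches: p => p.endsWith('.md') },
];

function detectContext(filePath) {
  return CONTEXT_PATTERNS
    .filter(({ matches }) => matches(filePath))
    .map(({ context }) => context);
}
```

Returning an array (rather than a single context) lets one edit surface several rule sets at once, e.g. a markdown file inside public/.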

Example Output

📋 You're editing documentation. Remember:
   • inst_017: NEVER use prohibited terms: 'guarantee', 'guaranteed'
   • inst_016: Avoid fabricated statistics without sources
   • inst_018: Accurate status claims (proof-of-concept, not production-ready)

🔍 Hook: Validating file edit: docs/introduction.md

Effort & Impact

  • Development: 8-9 hours
  • Impact: Makes 52 rules actionable when relevant
  • ROI: HIGH - Guides decisions during work

🟡 Improvement 3: Active MetacognitiveVerifier

Problem

  • Spent 2+ hours debugging integration tests without framework guidance
  • Made repeated attempts (trial and error)
  • No suggestions like "Try minimal reproduction"

Solution

Enhanced: scripts/framework-components/MetacognitiveVerifier.service.js

Pattern detection system that:

  1. Logs activities (test runs, file edits, commands)
  2. Detects patterns (repeated failures, same file edited 5+ times)
  3. Surfaces relevant suggestions automatically

Patterns Detected

repeated_test_failure    → Suggest: Create minimal reproduction (inst_050)
same_file_edited_5x      → Suggest: Make incremental changes (inst_025)
high_token_usage         → Suggest: Run pressure check (inst_034)
long_running_command     → Suggest: Use timeout or background execution
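The repeated_test_failure pattern above could be tracked with a per-command streak counter. This is a sketch; the threshold of 3 matches the success criteria later in this plan and is assumed to be configurable:

```javascript
// Sketch: count consecutive failures per command and surface a
// suggestion once the streak reaches the threshold.
class FailureTracker {
  constructor(threshold = 3) {
    this.threshold = threshold;
    this.streaks = new Map(); // command -> consecutive failure count
  }

  // Returns a suggestion object when a pattern fires, otherwise null.
  logActivity({ command, exitCode }) {
    const streak = exitCode !== 0 ? (this.streaks.get(command) || 0) + 1 : 0;
    this.streaks.set(command, streak);
    if (streak >= this.threshold) {
      return {
        pattern: 'repeated_test_failure',
        rule: 'inst_050',
        suggestion: 'Create a minimal reproduction case',
      };
    }
    return null;
  }
}
```

Note that a successful run resets the streak, so the suggestion only fires on genuinely repeated failures.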

Example Output

💡 MetacognitiveVerifier: Suggestions available

> node scripts/show-suggestions.js

💡 METACOGNITIVE SUGGESTIONS

1. Repeated test failures detected
   Related rules: inst_050

   • Create minimal reproduction case
   • Isolate the failing component
   • Check test setup (beforeAll/afterAll)
   • Verify dependencies are connected

2. File edited 7 times: tests/integration/api.auth.test.js
   Related rules: inst_025

   • Are you making incremental changes?
   • Test each change before the next
   • Document what you're learning

Effort & Impact

  • Development: 9-11 hours
  • Impact: Guides debugging, reduces trial-and-error time
  • ROI: MEDIUM-HIGH - Most helpful for complex problem-solving

Implementation Roadmap

Phase 1: Proactive Scanning (Week 1)

Files to Create:

  • scripts/framework-components/ProhibitedTermsScanner.js
  • tests/unit/ProhibitedTermsScanner.test.js
  • .git/hooks/pre-commit (optional)

Modifications:

  • scripts/session-init.js - Add scanning step

Deliverable: Session start shows violations immediately


Phase 2: Context Awareness (Week 2)

Files to Create:

  • scripts/framework-components/ContextAwareRules.js
  • scripts/framework-components/context-prompt.js (CLI tool)

Modifications:

  • scripts/hook-validators/validate-file-edit.js - Surface rules

Deliverable: Relevant rules shown during work


Phase 3: Metacognitive Assistant (Week 3)

Files to Create:

  • scripts/hook-validators/log-activity.js (post-tool hook)
  • scripts/framework-components/show-suggestions.js (CLI tool)

Modifications:

  • scripts/framework-components/MetacognitiveVerifier.service.js - Enhance

Deliverable: Framework provides suggestions during complex work


Success Criteria

Effectiveness Target

Current: 4/10
Target: 8/10

Quantitative Metrics

Proactive Detection:

  • 100% of inst_016/017/018 violations caught on session start
  • Pre-commit hook prevents violations (0% slip through)
  • Scan time <5 seconds

Context Awareness:

  • Relevant rules surfaced >90% of the time
  • User surveys rate rules as helpful (>80%)
  • Rule overhead <2 seconds per tool use

Metacognitive Assistance:

  • Suggestions appear after 3rd repeated failure
  • Pattern detection accuracy >80%
  • User reports reduced debugging time (30%+ improvement)

Resource Requirements

Development Time

  • Phase 1: 5-7 hours
  • Phase 2: 8-9 hours
  • Phase 3: 9-11 hours
  • Total: 22-27 hours (3-4 weeks part-time)

Testing Time

  • Unit Tests: 5-6 hours
  • Integration Testing: 3-4 hours
  • User Testing: 2-3 hours
  • Total: 10-13 hours

Grand Total: 32-40 hours (1 month part-time)


Risks & Mitigation

Risk 1: Notification Fatigue

Risk: Too many suggestions become annoying

Mitigation:

  • Rate limit to 1 suggestion per 10 minutes
  • Allow --quiet mode
  • User can configure threshold (3 failures vs 5)
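The rate limit could be as small as a timestamp check. A sketch, where the 10-minute window is the proposed default above and the clock is injectable for testing:

```javascript
// Sketch: allow at most one suggestion per time window.
class SuggestionLimiter {
  constructor(windowMs = 10 * 60 * 1000) {
    this.windowMs = windowMs;
    this.lastShownAt = -Infinity;
  }

  // `now` is injectable so the limiter can be tested without waiting.
  shouldShow(now = Date.now()) {
    if (now - this.lastShownAt < this.windowMs) return false;
    this.lastShownAt = now;
    return true;
  }
}
```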

Risk 2: False Positives

Risk: Scanner flags legitimate uses

Mitigation:

  • Comprehensive exclude patterns (tests, case studies)
  • Easy whitelist mechanism
  • Context-aware scanning

Risk 3: Performance Impact

Risk: Scanning slows session start

Mitigation:

  • Scan only user-facing files (not node_modules, tests)
  • Run asynchronously, show when ready
  • Cache results, re-scan only changed files

Expected Outcomes

Immediate Benefits (Phase 1)

  1. Zero inst_017 violations in future commits
  2. Violations caught before they reach production
  3. User confidence in framework enforcement

Medium-term Benefits (Phase 2)

  1. Reduced cognitive load (don't need to remember 52 rules)
  2. Rules become part of natural workflow
  3. Faster decision-making with relevant context

Long-term Benefits (Phase 3)

  1. Reduced debugging time (30%+ improvement)
  2. Better problem-solving patterns
  3. Framework actively guides learning

Next Steps

Immediate

  1. Review this plan with user
  2. Get approval to proceed
  3. Set up development branch

Week 1

  1. Implement ProhibitedTermsScanner.js
  2. Write unit tests
  3. Integrate with session-init.js
  4. Test on current codebase

Week 2

  1. Implement ContextAwareRules.js
  2. Build context mappings
  3. Integrate with hooks
  4. User testing

Week 3

  1. Enhance MetacognitiveVerifier
  2. Implement pattern detection
  3. Build CLI tools
  4. Final integration testing

Appendix: Technical Specifications

ProhibitedTermsScanner API

const scanner = new ProhibitedTermsScanner();

// Scan all files
let violations = await scanner.scan();

// Scan with options
violations = await scanner.scan({
  silent: false,
  fixMode: false,
  staged: false  // Git staged files only
});

// Auto-fix (simple replacements)
const result = await scanner.autoFix(violations);
// => { fixed: 12, total: 15 }

ContextAwareRules API

const contextRules = new ContextAwareRules();

// Detect context
const contexts = contextRules.detectContext('public/index.html');
// => ['editing_public_html']

// Get relevant rules
const rules = contextRules.getRelevantRules('editing_public_html');
// => [{ id: 'inst_017', text: '...', quadrant: 'VALUES' }]

// Format for display
const message = contextRules.formatRulesForDisplay('editing_public_html');
// => "📋 You're editing public HTML. Remember:..."

MetacognitiveVerifier API

const verifier = new MetacognitiveVerifier();

// Log activity
verifier.logActivity({
  type: 'bash',
  command: 'npm test',
  exitCode: 1,
  duration: 5000
});

// Check patterns
verifier.checkPatterns(tokenCount);
// => Surfaces suggestions if patterns detected

// Clear suggestions
verifier.clearSuggestions();

Conclusion

The Tractatus Framework has excellent architecture but weak behavioral integration. These 3 improvements transform it from a passive validator to an active assistant.

Key Insight: Framework needs to be PROACTIVE, not just REACTIVE.

Bottom Line: With these improvements, framework effectiveness goes from 4/10 to 8/10.


Status: Ready for implementation
Approval Required: User sign-off to proceed
Timeline: 1 month part-time development
Expected ROI: High - Prevents violations, guides work, reduces debugging time


Created: 2025-10-21
Author: Claude Code (Tractatus Framework v3.4)
Session: 2025-10-07-001