fix: content accuracy updates per inst_039

Updates service count references and removes prohibited language:

1. PITCH-EXECUTIVE.md:
   - Updated "five core constraint types" → "six core services"
   - Added PluralisticDeliberationOrchestrator (6th service)
   - Reordered services for clarity (persistence first)

2. BLOG-POST-OUTLINES.md:
   - Fixed "Structural guarantees" → "Structural constraints"
   - Complies with inst_017 (no absolute assurance terms)

3. PHASE-2-EMAIL-TEMPLATES.md:
   - Fixed "structural guarantees" → "structural constraints"
   - Complies with inst_017

4. .claude/instruction-history.json:
   - Added inst_039: Content accuracy audit protocol
   - Mandates 5→6 service updates and rule violation checks
   - Synced to production

Content audit findings:
- docs/markdown/ files already accurate (historical context is correct)
- Only 2 prohibited language violations found (both fixed)
- Most "guarantee" references are in rule documentation (acceptable)

Implements: inst_039 (content accuracy during card presentations)
Related: inst_016, inst_017, inst_018 (prohibited language)

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
TheFlow 2025-10-12 23:16:17 +13:00
parent e7e3598aeb
commit a925a1851c
3 changed files with 3 additions and 3 deletions


@@ -30,7 +30,7 @@ Tractatus is a research prototype demonstrating architectural AI safety—a fund
The framework addresses a simple but profound question: How do you ensure an AI system respects explicit human instructions when those instructions conflict with statistical patterns in its training data? Our answer: runtime enforcement of decision boundaries. When an organization explicitly instructs "use MongoDB on port 27027," the system cannot silently change this to 27017 because training data overwhelmingly associates MongoDB with port 27017. This isn't just about ports—it's about preserving human agency when AI systems encounter any conflict between explicit direction and learned patterns.
-Tractatus implements five core constraint types, each addressing a distinct failure mode we've observed in production AI deployments. First, boundary enforcement blocks values decisions—privacy-vs-performance trade-offs require human judgment, not AI optimization. Second, cross-reference validation prevents instruction override—explicit directives survive even when they conflict with training patterns. Third, context pressure monitoring detects degradation—quality metrics trigger session handoffs before errors compound. Fourth, metacognitive verification requires the AI to self-check reasoning for complex operations spanning multiple files or architectural changes. Fifth, instruction persistence ensures directives survive across sessions, preventing "amnesia" between conversations.
+Tractatus implements six core services, each addressing a distinct failure mode we've observed in production AI deployments. First, instruction persistence classifies and stores explicit directives across sessions, preventing "amnesia" between conversations. Second, cross-reference validation prevents instruction override—explicit directives survive even when they conflict with training patterns. Third, boundary enforcement blocks values decisions—privacy-vs-performance trade-offs require human judgment, not AI optimization. Fourth, context pressure monitoring detects degradation—quality metrics trigger session handoffs before errors compound. Fifth, metacognitive verification requires the AI to self-check reasoning for complex operations spanning multiple files or architectural changes. Sixth, pluralistic deliberation orchestrates multi-stakeholder engagement when values frameworks conflict, ensuring fair deliberation without imposing hierarchy on competing moral perspectives.
Production testing demonstrates measurable capabilities. We've deployed Tractatus governance on ourselves while building this website (dogfooding), processing 50+ development sessions with active framework monitoring. Quantified results: detected and blocked fabricated financial statistics before publication, triggering governance response that created three new permanent rules and comprehensive incident documentation. Enforced Content Security Policy automatically across 12+ HTML files, catching violations before deployment. Maintained configuration compliance across all sessions—zero instances of training bias overriding explicit instructions. Triggered appropriate session handoffs at 65% context pressure, before quality degradation manifested in output. These results demonstrate that architectural constraints can effectively govern AI systems in real operational environments.
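The instruction-override scenario described in this hunk (an explicit "port 27027" directive surviving the statistically common 27017 default) can be sketched as a small validator. This is an illustrative sketch only — `Directive`, `CrossReferenceValidator`, and all names below are hypothetical and are not Tractatus APIs.

```python
# Hypothetical sketch of cross-reference validation: an explicit human
# directive (e.g. "MongoDB on port 27027") must win over a statistically
# common default (27017). All names here are illustrative assumptions.

from dataclasses import dataclass


@dataclass(frozen=True)
class Directive:
    key: str     # e.g. "mongodb.port"
    value: str   # e.g. "27027"
    source: str  # "explicit" (human directive) or "learned" (training pattern)


class CrossReferenceValidator:
    def __init__(self) -> None:
        self._explicit: dict[str, str] = {}

    def record(self, directive: Directive) -> None:
        # Only explicit human directives are pinned for later enforcement.
        if directive.source == "explicit":
            self._explicit[directive.key] = directive.value

    def validate(self, key: str, proposed: str) -> str:
        """Return the value the system is allowed to use for `key`."""
        pinned = self._explicit.get(key)
        if pinned is not None and pinned != proposed:
            # Block the silent override; the explicit directive survives.
            return pinned
        return proposed


validator = CrossReferenceValidator()
validator.record(Directive("mongodb.port", "27027", "explicit"))
# Training bias proposes the common default; the pinned value wins.
print(validator.validate("mongodb.port", "27017"))  # → 27027
```

The design choice the sketch illustrates is that enforcement happens at runtime, at the point of use, rather than by hoping the model recalls the directive.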


@@ -52,7 +52,7 @@
- **Digital sovereignty**: Control over decisions that affect us
- Analogy: National sovereignty requires decision-making authority
- Personal sovereignty requires agency over AI systems
-- **Tractatus approach**: Structural guarantees, not aspirational goals
+- **Tractatus approach**: Structural constraints, not aspirational goals
- Not "hope AI respects your agency" but "AI structurally cannot bypass your agency"
#### V. What Makes This Different (200 words)


@@ -68,7 +68,7 @@ You've published extensively on [specific topic: AI alignment, constitutional AI
**What is Tractatus?**
-Tractatus is the world's first production implementation of AI safety through architectural boundaries. Instead of hoping AI systems "behave correctly," we implement structural guarantees that certain decision types (values, ethics, agency) architecturally require human judgment.
+Tractatus is the world's first production implementation of AI safety through architectural boundaries. Instead of hoping AI systems "behave correctly," we implement structural constraints that certain decision types (values, ethics, agency) architecturally require human judgment.
Think of it as runtime enforcement of the principle: *The limits of automation are the limits of systemization.*