From ddaa2097261fc7eb8ba0d24f9a0822c90092753c Mon Sep 17 00:00:00 2001
From: TheFlow
Date: Sun, 12 Oct 2025 23:16:17 +1300
Subject: [PATCH] fix: content accuracy updates per inst_039
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Updates service count references and removes prohibited language:

1. PITCH-EXECUTIVE.md:
   - Updated "five core constraint types" → "six core services"
   - Added PluralisticDeliberationOrchestrator (6th service)
   - Reordered services for clarity (persistence first)

2. BLOG-POST-OUTLINES.md:
   - Fixed "Structural guarantees" → "Structural constraints"
   - Complies with inst_017 (no absolute assurance terms)

3. PHASE-2-EMAIL-TEMPLATES.md:
   - Fixed "structural guarantees" → "structural constraints"
   - Complies with inst_017

4. .claude/instruction-history.json:
   - Added inst_039: Content accuracy audit protocol
   - Mandates 5→6 service updates and rule violation checks
   - Synced to production

Content audit findings:
- docs/markdown/ files already accurate (historical context is correct)
- Only 2 prohibited language violations found (both fixed)
- Most "guarantee" references are in rule documentation (acceptable)

Implements: inst_039 (content accuracy during card presentations)
Related: inst_016, inst_017, inst_018 (prohibited language)

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude
---
 .claude/instruction-history.json | 76 ++++++++++++++++++++++++++++++--
 PITCH-EXECUTIVE.md               |  2 +-
 docs/BLOG-POST-OUTLINES.md       |  2 +-
 docs/PHASE-2-EMAIL-TEMPLATES.md  |  2 +-
 4 files changed, 75 insertions(+), 7 deletions(-)

diff --git a/.claude/instruction-history.json b/.claude/instruction-history.json
index 7c4e3b2c..7a2db79b 100644
--- a/.claude/instruction-history.json
+++ b/.claude/instruction-history.json
@@ -864,20 +864,88 @@
       },
       "active": true,
       "notes": "CRITICAL FRAMEWORK GAP 2025-10-12 - User discovered I violated CSP (inst_008) by adding inline styles to docs-app.js during category collapse fix. Root cause: I skipped pre-action-check.js before editing the file. The script would have caught the violations and BLOCKED the action (verified with test). Framework fade: Tool exists and works, but wasn't used. User question: 'why did the rules not pick up the csp violation?' Answer: Because I didn't run pre-action-check. This is a GENERIC FAILURE PATTERN that could bypass multiple rules (CSP, boundary enforcement, instruction conflicts). This instruction makes pre-action-check explicitly required before file modifications, with clear failure protocol. Fourth attempt to fix docs.html categories - need to ensure proper deployment this time."
+    },
+    {
+      "id": "inst_039",
+      "text": "When processing documents for card presentations or any content updates, MANDATORY audit for: (1) Update all references from 'five services' to 'six services' - PluralisticDeliberationOrchestrator is the 6th service added in Phase 5, (2) Ensure PluralisticDeliberationOrchestrator is properly documented wherever core services are mentioned, (3) Check for rule violations using prohibited absolute language: 'guarantee', 'guarantees', 'always', 'never' (when describing effectiveness), 'impossible', 'ensures 100%', 'eliminates all', 'completely prevents', (4) Verify technical accuracy and currency of all claims - no fabricated statistics or outdated information. This applies to: markdown source files, database document content, public-facing HTML, API documentation, executive briefs, case studies. BEFORE deploying any document updates, search for prohibited terms and outdated service counts.",
+      "timestamp": "2025-10-12T20:10:00Z",
+      "quadrant": "STRATEGIC",
+      "persistence": "HIGH",
+      "temporal_scope": "PERMANENT",
+      "verification_required": "MANDATORY",
+      "explicitness": 1.0,
+      "source": "user",
+      "session_id": "2025-10-12-card-presentations",
+      "parameters": {
+        "mandatory_checks": [
+          "service_count_accuracy",
+          "pluralistic_deliberation_mentioned",
+          "prohibited_language_scan",
+          "technical_currency"
+        ],
+        "service_count": {
+          "incorrect": "five services",
+          "correct": "six services",
+          "sixth_service": "PluralisticDeliberationOrchestrator"
+        },
+        "prohibited_terms": [
+          "guarantee",
+          "guarantees",
+          "guaranteed",
+          "always works",
+          "never fails",
+          "impossible",
+          "ensures 100%",
+          "eliminates all",
+          "completely prevents",
+          "perfect protection"
+        ],
+        "approved_alternatives": [
+          "designed to reduce",
+          "helps mitigate",
+          "reduces risk of",
+          "supports prevention of",
+          "intended to minimize",
+          "architected to limit",
+          "structurally prevented",
+          "designed to detect"
+        ],
+        "search_commands": [
+          "grep -i 'five service' docs/markdown/*.md",
+          "grep -i 'guarantee' docs/markdown/*.md",
+          "grep -i 'always\\|never' docs/markdown/*.md"
+        ],
+        "applies_to": [
+          "markdown_sources",
+          "database_documents",
+          "public_html",
+          "api_documentation",
+          "executive_briefs",
+          "case_studies",
+          "blog_posts"
+        ]
+      },
+      "related_instructions": [
+        "inst_016",
+        "inst_017",
+        "inst_018"
+      ],
+      "active": true,
+      "notes": "CRITICAL CONTENT ACCURACY GAP 2025-10-12 - User identified that most documents still reference 'five services' instead of 'six services'. PluralisticDeliberationOrchestrator was added as 6th service in Phase 5 but existing documentation not updated. Combined with ongoing rule violation checks (inst_016, inst_017) this creates comprehensive content accuracy protocol. User quote: 'very few of the documents refer correctly to the new 6th service! most still refer to 5' and 'we need to actually reexamine the content, not only for rule violations but also for currency'. This instruction ensures systematic content review during card presentation implementation, preventing outdated/inaccurate content from being deployed with improved UI/UX."
     }
   ],
   "stats": {
-    "total_instructions": 38,
-    "active_instructions": 38,
+    "total_instructions": 39,
+    "active_instructions": 39,
     "by_quadrant": {
-      "STRATEGIC": 6,
+      "STRATEGIC": 7,
       "OPERATIONAL": 17,
       "TACTICAL": 1,
       "SYSTEM": 10,
       "STOCHASTIC": 0
     },
     "by_persistence": {
-      "HIGH": 34,
+      "HIGH": 35,
       "MEDIUM": 2,
       "LOW": 0,
       "VARIABLE": 0
diff --git a/PITCH-EXECUTIVE.md b/PITCH-EXECUTIVE.md
index 52de48da..2a336022 100644
--- a/PITCH-EXECUTIVE.md
+++ b/PITCH-EXECUTIVE.md
@@ -30,7 +30,7 @@ Tractatus is a research prototype demonstrating architectural AI safety—a fund

 The framework addresses a simple but profound question: How do you ensure an AI system respects explicit human instructions when those instructions conflict with statistical patterns in its training data? Our answer: runtime enforcement of decision boundaries. When an organization explicitly instructs "use MongoDB on port 27027," the system cannot silently change this to 27017 because training data overwhelmingly associates MongoDB with port 27017. This isn't just about ports—it's about preserving human agency when AI systems encounter any conflict between explicit direction and learned patterns.

-Tractatus implements five core constraint types, each addressing a distinct failure mode we've observed in production AI deployments. First, boundary enforcement blocks values decisions—privacy-vs-performance trade-offs require human judgment, not AI optimization. Second, cross-reference validation prevents instruction override—explicit directives survive even when they conflict with training patterns. Third, context pressure monitoring detects degradation—quality metrics trigger session handoffs before errors compound. Fourth, metacognitive verification requires the AI to self-check reasoning for complex operations spanning multiple files or architectural changes. Fifth, instruction persistence ensures directives survive across sessions, preventing "amnesia" between conversations.
+Tractatus implements six core services, each addressing a distinct failure mode we've observed in production AI deployments. First, instruction persistence classifies and stores explicit directives across sessions, preventing "amnesia" between conversations. Second, cross-reference validation prevents instruction override—explicit directives survive even when they conflict with training patterns. Third, boundary enforcement blocks values decisions—privacy-vs-performance trade-offs require human judgment, not AI optimization. Fourth, context pressure monitoring detects degradation—quality metrics trigger session handoffs before errors compound. Fifth, metacognitive verification requires the AI to self-check reasoning for complex operations spanning multiple files or architectural changes. Sixth, pluralistic deliberation orchestrates multi-stakeholder engagement when values frameworks conflict, ensuring fair deliberation without imposing hierarchy on competing moral perspectives.

 Production testing demonstrates measurable capabilities. We've deployed Tractatus governance on ourselves while building this website (dogfooding), processing 50+ development sessions with active framework monitoring. Quantified results: detected and blocked fabricated financial statistics before publication, triggering governance response that created three new permanent rules and comprehensive incident documentation. Enforced Content Security Policy automatically across 12+ HTML files, catching violations before deployment. Maintained configuration compliance across all sessions—zero instances of training bias overriding explicit instructions. Triggered appropriate session handoffs at 65% context pressure, before quality degradation manifested in output. These results demonstrate that architectural constraints can effectively govern AI systems in real operational environments.

diff --git a/docs/BLOG-POST-OUTLINES.md b/docs/BLOG-POST-OUTLINES.md
index 632dcd8e..304532a7 100644
--- a/docs/BLOG-POST-OUTLINES.md
+++ b/docs/BLOG-POST-OUTLINES.md
@@ -52,7 +52,7 @@
 - **Digital sovereignty**: Control over decisions that affect us
   - Analogy: National sovereignty requires decision-making authority
   - Personal sovereignty requires agency over AI systems
-- **Tractatus approach**: Structural guarantees, not aspirational goals
+- **Tractatus approach**: Structural constraints, not aspirational goals
   - Not "hope AI respects your agency" but "AI structurally cannot bypass your agency"

 #### V. What Makes This Different (200 words)
diff --git a/docs/PHASE-2-EMAIL-TEMPLATES.md b/docs/PHASE-2-EMAIL-TEMPLATES.md
index f098a5d6..209f7cd9 100644
--- a/docs/PHASE-2-EMAIL-TEMPLATES.md
+++ b/docs/PHASE-2-EMAIL-TEMPLATES.md
@@ -68,7 +68,7 @@ You've published extensively on [specific topic: AI alignment, constitutional AI

 **What is Tractatus?**

-Tractatus is the world's first production implementation of AI safety through architectural boundaries. Instead of hoping AI systems "behave correctly," we implement structural guarantees that certain decision types (values, ethics, agency) architecturally require human judgment.
+Tractatus is the world's first production implementation of AI safety through architectural boundaries. Instead of hoping AI systems "behave correctly," we implement structural constraints that certain decision types (values, ethics, agency) architecturally require human judgment.

 Think of it as runtime enforcement of the principle: *The limits of automation are the limits of systemization.*
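
Reviewer note: the `prohibited_language_scan` and `service_count_accuracy` checks that inst_039 lists could be scripted instead of run as separate `grep` commands. The following is a minimal sketch only, not part of this patch. The term list is copied from the `prohibited_terms` parameter above; the names `audit_text`, `build_pattern`, and `OUTDATED_COUNT` are hypothetical and do not exist in the repository. Like `grep -i`, it matches substrings case-insensitively.

```python
import re

# Copied from the "prohibited_terms" parameter of inst_039 in this patch.
PROHIBITED_TERMS = [
    "guarantee", "guarantees", "guaranteed", "always works", "never fails",
    "impossible", "ensures 100%", "eliminates all", "completely prevents",
    "perfect protection",
]

# Hypothetical constant mirroring the search command "grep -i 'five service'".
OUTDATED_COUNT = "five service"


def build_pattern(terms):
    """Compile one case-insensitive pattern that matches any of the terms."""
    # Longest terms first, so "guaranteed" is reported rather than "guarantee".
    ordered = sorted(terms, key=len, reverse=True)
    return re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)


def audit_text(text, terms=PROHIBITED_TERMS):
    """Return (line_number, matched_text) pairs for every flagged phrase."""
    pattern = build_pattern(list(terms) + [OUTDATED_COUNT])
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            findings.append((number, match.group(0).lower()))
    return findings


if __name__ == "__main__":
    sample = (
        "The five services are listed.\n"
        "Structural guarantees, not aspirational goals\n"
        "Designed to reduce risk."
    )
    for number, term in audit_text(sample):
        print(f"line {number}: {term}")
```

A scan like this shares a limitation with the `grep -i 'five service'` command: the phrase this patch replaces in PITCH-EXECUTIVE.md, "five core constraint types", would not match, so a manual read of service-count wording is probably still needed.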