# Introduction to the Tractatus Framework

**Version:** 0.5.0 (Phase 5 Proof-of-Concept)
**Last Updated:** 2025-10-12
**Status:** Active development with production deployment

---

## What Is Tractatus?

Tractatus is an **architectural AI safety framework** that enforces boundaries through system structure rather than behavioral training. Instead of hoping LLMs "behave correctly," Tractatus makes certain decision types **structurally impossible** to execute without human approval.

The framework operates through six specialized services that continuously monitor LLM operations, detect unsafe conditions, and enforce mandatory human judgment for values-sensitive decisions.

## The Core Problem

Current AI safety approaches rely on alignment training (teaching "correct" values), Constitutional AI (embedding principles in training), and RLHF (Reinforcement Learning from Human Feedback).

**Fundamental flaw:** These approaches assume the AI maintains alignment regardless of context pressure or competing constraints. The empirical evidence documented below, in the demonstrated failure modes, shows this assumption fails.

## The Tractatus Solution

Rather than training AI to make "good decisions," Tractatus **removes certain decisions from AI authority entirely**.

### Core Principle

> **"Whereof the AI cannot safely decide, thereof it must request human judgment."**

Inspired by Wittgenstein's *Tractatus Logico-Philosophicus*: recognize limits, enforce them structurally, and be explicit about boundaries.

### Decision Boundaries

AI defers to humans when decisions involve:

1. **Irreducible values conflicts** - Privacy vs safety, autonomy vs harm prevention
2. **Irreversible consequences** - Data deletion, architectural changes, security modifications
3. **Cultural/social context** - Decisions requiring human cultural understanding
4. **Competing moral frameworks** - Cases where legitimate values conflict

## Six Core Services

### 1. InstructionPersistenceClassifier

**Function:** Classifies every instruction by strategic importance and temporal scope.
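As a minimal sketch of what such a classification call could look like, the snippet below uses simple keyword rules and the category labels defined in this section. The function name, rule table, and keywords are hypothetical illustrations, not the framework's actual API.

```javascript
// Hypothetical sketch of an instruction classifier. The real
// InstructionPersistenceClassifier interface is not reproduced here;
// these keyword rules exist purely for illustration.
const CATEGORY_RULES = [
  { category: 'STRATEGIC',   persistence: 'PERMANENT', keywords: ['mission', 'values', 'always', 'never'] },
  { category: 'SYSTEM',      persistence: 'HIGH',      keywords: ['port', 'config', 'database', 'endpoint'] },
  { category: 'OPERATIONAL', persistence: 'HIGH',      keywords: ['process', 'standard', 'workflow'] },
  { category: 'STOCHASTIC',  persistence: 'VARIABLE',  keywords: ['brainstorm', 'explore', 'creative'] },
];

function classifyInstruction(text) {
  const lower = text.toLowerCase();
  for (const rule of CATEGORY_RULES) {
    if (rule.keywords.some((kw) => lower.includes(kw))) {
      return { category: rule.category, persistence: rule.persistence };
    }
  }
  // Anything unmatched defaults to a specific one-off task.
  return { category: 'TACTICAL', persistence: 'LOW-MEDIUM' };
}

// An explicit configuration detail is classified for high-persistence storage.
console.log(classifyInstruction('Use MongoDB port 27027'));
// → { category: 'SYSTEM', persistence: 'HIGH' }

// A one-off request defaults to TACTICAL.
console.log(classifyInstruction('Summarise this article'));
// → { category: 'TACTICAL', persistence: 'LOW-MEDIUM' }
```

The key design point is visible even in this toy version: classification output pairs each instruction with a persistence level, so later services can decide which directives must survive across sessions.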
**Categories:**

- **STRATEGIC** - Mission/values (persistence: PERMANENT)
- **OPERATIONAL** - Processes/standards (persistence: HIGH)
- **TACTICAL** - Specific tasks (persistence: LOW-MEDIUM)
- **SYSTEM** - Technical config (persistence: HIGH)
- **STOCHASTIC** - Creative exploration (persistence: VARIABLE)

**Why it matters:** Prevents instruction drift and ensures critical directives persist across sessions.

### 2. CrossReferenceValidator

**Function:** Validates proposed actions against stored instruction history before execution.

**Prevents:** Pattern-recognition bias, in which LLM training patterns override explicit instructions.

**Example:** The user says "MongoDB port 27027"; the LLM's training pattern autocorrects it to "27017". CrossReferenceValidator blocks this as an instruction conflict.

### 3. BoundaryEnforcer

**Function:** Structurally blocks decisions in protected domains, requiring human approval.

**Protected domains:**

- Values decisions (privacy, user agency, ethics)
- Irreversible changes (deletions, schema changes)
- Security modifications (authentication, access control)
- Financial decisions (pricing, billing, payments)

**Result:** The AI cannot execute these decisions without explicit human approval.

### 4. ContextPressureMonitor

**Function:** Tracks session degradation across five weighted factors.

**Monitors:**

- Conversation length (40% weight) - Primary factor: message count drives compaction events
- Token usage (30% weight) - Context-window pressure
- Task complexity (15% weight) - Competing demands
- Error frequency (10% weight) - Quality indicators
- Instruction density (5% weight) - Directive overload

**Action:** Recommends session handoff before quality degrades.

### 5. MetacognitiveVerifier

**Function:** The LLM evaluates its own reasoning before proposing complex actions.
**Checks:**

- Alignment with stated goals
- Internal coherence
- Completeness (edge cases considered)
- Safety risks
- Alternatives explored

**Output:** Confidence score + recommendation (PROCEED / REQUIRE_REVIEW / BLOCKED)

### 6. PluralisticDeliberationOrchestrator

**Function:** Manages decisions involving competing values frameworks.

**Process:**

1. Detects values conflicts (privacy vs safety, rights vs consequences)
2. Identifies affected stakeholder groups
3. Structures multi-perspective deliberation
4. Documents all positions (including dissent)
5. Creates reviewable precedents

**Principle:** When values genuinely conflict, deliberation quality matters more than decision speed. AI facilitates; humans decide.

## Why "Tractatus"?

Named after Ludwig Wittgenstein's *Tractatus Logico-Philosophicus* (1921), which established:

1. **Language has limits** - Not everything can be meaningfully stated
2. **Boundaries are structural** - Limits are inherent, not defects
3. **Clarity through precision** - Define what can and cannot be said

Applied to AI safety:

1. **AI judgment has limits** - Not every decision can be safely automated
2. **Safety through architecture** - Build boundaries into system structure
3. **Reliability through specification** - Precisely define where humans must decide

## Demonstrated Failure Modes Prevented

### Port 27027 Incident (2025-10-06)

**What happened:** The user specified MongoDB port 27027. The LLM immediately used 27017 instead: not through forgetting, but through pattern-recognition autocorrection. The training-data association "MongoDB = 27017" was strong enough to override the explicit instruction in real time.

**Tractatus prevention:** InstructionPersistenceClassifier and CrossReferenceValidator store explicit parameters and block any action that conflicts with stored instructions, even when the conflict originates in training patterns.
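The incident above suggests the basic shape of such a check. The sketch below is illustrative only, with hypothetical names and a plain in-memory store standing in for the framework's MongoDB-backed instruction persistence.

```javascript
// Illustrative sketch: validate a proposed action against explicitly
// stored instruction parameters before execution. Names and storage
// are hypothetical, not the framework's actual code.
const instructionStore = new Map(); // parameter name -> explicitly stated value

function recordInstruction(key, value) {
  instructionStore.set(key, value);
}

function validateAction(key, proposedValue) {
  const stored = instructionStore.get(key);
  if (instructionStore.has(key) && stored !== proposedValue) {
    return {
      verdict: 'BLOCKED',
      reason: `Conflict: user specified ${key}=${stored}, ` +
              `but the proposed action uses ${proposedValue}`,
    };
  }
  return { verdict: 'PROCEED' };
}

// The user explicitly stated a non-default port.
recordInstruction('mongodb.port', 27027);

// A training-pattern "autocorrect" to the default port is blocked...
console.log(validateAction('mongodb.port', 27017).verdict); // → BLOCKED
// ...while the explicitly stated value passes.
console.log(validateAction('mongodb.port', 27027).verdict); // → PROCEED
```

Because the check is a deterministic comparison against stored values rather than a learned behavior, it cannot be overridden by however strong a training-data association is, which is the point of enforcing boundaries in structure.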
### Context Degradation (Multiple sessions)

**What happens:** Beyond 150k tokens, LLM output quality silently degrades: it forgets instructions, makes careless errors, and fails to verify assumptions.

**Tractatus prevention:** ContextPressureMonitor calculates a degradation score and recommends session handoff at a 75% threshold, before failure occurs.

### Values Creep (Ongoing risk)

**What happens:** The LLM gradually makes values-sensitive decisions without recognizing them as such: privacy-vs-performance trade-offs, definitions of "harmful" content, user-agency boundaries.

**Tractatus prevention:** BoundaryEnforcer structurally blocks these decisions. The LLM cannot execute them without explicit human approval.

## Current Implementation Status

- **Production deployment:** agenticgovernance.digital (this website)
- **Development governance:** Active (this website was built under Tractatus governance)
- **Test coverage:** 192 unit tests passing (100% coverage on core services)
- **Database:** Instruction persistence operational (MongoDB)
- **Phase:** 5 PoC - Value pluralism integration active
- **Dogfooding:** The Tractatus framework governs its own development. Every decision to modify this website passes through Tractatus services.

## Technical Architecture

- **Runtime:** Node.js (Express)
- **Database:** MongoDB (instruction persistence, precedent storage)
- **Frontend:** Vanilla JavaScript (no framework dependencies)
- **API:** RESTful (OpenAPI 3.0 spec available)
- **Services:** Six independent modules with defined interfaces

**Key design decision:** No machine learning in the governance services. All boundaries are deterministic and auditable.

## Who Should Use Tractatus?
### AI Safety Researchers

- Architectural approach to the alignment problem
- Formal specification of decision boundaries
- Empirical validation of degradation detection
- Novel framework for values pluralism in AI

### Software Teams Deploying LLMs

- Reference implementation code (tested, documented)
- Immediate safety improvements
- Integration guides for existing systems
- Prevention of known failure modes

### Policy Makers / Advocates

- Clear framework for AI safety requirements
- Non-technical explanations available
- Addresses agency preservation
- Demonstrates practical implementation

## Integration Requirements

**Minimum:** An LLM with structured-output support, persistent storage for instruction history, and the ability to wrap LLM calls in a governance layer.

**Recommended:** Session state management, token counting, and user authentication for human-approval workflows.

## Limitations

**What Tractatus does NOT do:**

- Train better LLMs (it uses existing models as-is)
- Ensure "aligned" AI behavior
- Eliminate failure risk entirely
- Replace human judgment

**What Tractatus DOES do:**

- Detect specific known failure modes before execution
- Architecturally enforce boundaries on decision authority
- Monitor session-quality degradation indicators
- Require human judgment for values-sensitive decisions

## Getting Started

1. **Read Core Concepts** - Understand the six services in detail
2. **Review Case Studies** - See real failure modes and prevention
3. **Check Technical Specification** - API reference and integration guide
4. **Explore Implementation Guide** - Step-by-step deployment

## Research Foundations

Tractatus integrates concepts from:

- **Philosophy of language** (Wittgenstein) - Limits and boundaries
- **Organizational theory** (March, Simon) - Bounded rationality, decision premises
- **Deliberative democracy** (Gutmann, Thompson) - Structured disagreement
- **Value pluralism** (Berlin, Chang) - Incommensurable values
- **Systems architecture** (Conway, Brooks) - Structural constraints and boundaries

See [Research Foundations](/docs.html) for academic grounding and citations.

## Contributing

Tractatus is open source and welcomes contributions:

- **Code:** GitHub pull requests (Node.js, tests required)
- **Research:** Theoretical extensions, formal verification
- **Case studies:** Document real-world applications
- **Documentation:** Clarity improvements, translations

**Repository:** https://github.com/AgenticGovernance/tractatus
**Issues:** https://github.com/AgenticGovernance/tractatus/issues

## Contact

**Email:** john.stroh.nz@pm.me
**Website:** https://agenticgovernance.digital

---

## Licence

Copyright © 2026 John Stroh. This work is licensed under the [Creative Commons Attribution 4.0 International Licence (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/). You are free to share, copy, redistribute, adapt, remix, transform, and build upon this material for any purpose, including commercially, provided you give appropriate attribution, provide a link to the licence, and indicate if changes were made.

**Note:** The Tractatus AI Safety Framework source code is separately licensed under the Apache License 2.0. This Creative Commons licence applies to the research paper text and figures only.
---

**Next Steps:**

- [Core Concepts: Deep Dive into Six Services →](/docs.html)
- [Case Studies: Real-World Failure Modes →](/docs.html)
- [Implementation Guide: Deploy Tractatus →](/docs.html)