# Introduction to the Tractatus Framework
**Version:** 0.5.0 (Phase 5 Proof-of-Concept)

**Last Updated:** 2025-10-12

**Status:** Active development with production deployment

---
## What Is Tractatus?
Tractatus is an **architectural AI safety framework** that enforces boundaries through system structure rather than behavioral training. Instead of hoping LLMs "behave correctly," Tractatus makes certain decision types **structurally impossible** without human approval.

The framework operates through six specialized services that continuously monitor LLM operations, detect unsafe conditions, and enforce mandatory human judgment for values-sensitive decisions.
## The Core Problem
Current AI safety approaches rely on alignment training (teaching "correct" values), Constitutional AI (embedding principles in training), and RLHF (Reinforcement Learning from Human Feedback).

**Fundamental flaw:** These approaches assume AI maintains alignment regardless of context pressure or competing constraints. The failure modes documented below show that this assumption does not hold.
## The Tractatus Solution
Rather than training AI to make "good decisions," Tractatus **removes certain decisions from AI authority entirely**.
### Core Principle
> **"Whereof the AI cannot safely decide, thereof it must request human judgment."**

Inspired by Wittgenstein's *Tractatus Logico-Philosophicus*: recognize limits, enforce them structurally, and be explicit about boundaries.
### Decision Boundaries
AI defers to humans when decisions involve:

1. **Irreducible values conflicts** - Privacy vs safety, autonomy vs harm prevention
2. **Irreversible consequences** - Data deletion, architectural changes, security modifications
3. **Cultural/social context** - Decisions requiring human cultural understanding
4. **Competing moral frameworks** - Cases where legitimate values conflict

## Six Core Services
### 1. InstructionPersistenceClassifier
**Function:** Classifies every instruction by strategic importance and temporal scope.

**Quadrants:**

- **STRATEGIC** - Mission/values (persistence: PERMANENT)
- **OPERATIONAL** - Processes/standards (persistence: HIGH)
- **TACTICAL** - Specific tasks (persistence: LOW-MEDIUM)
- **SYSTEM** - Technical config (persistence: HIGH)
- **STOCHASTIC** - Creative exploration (persistence: VARIABLE)

**Why it matters:** Prevents instruction drift and ensures critical directives persist across sessions.
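The quadrant-to-persistence mapping above can be sketched in a few lines of JavaScript (the project's implementation language). The constant and function names are illustrative assumptions, not the actual Tractatus API; the real classifier also decides which quadrant an instruction belongs to, a step not shown here.

```javascript
// Persistence levels per quadrant, as documented above.
const PERSISTENCE_BY_QUADRANT = {
  STRATEGIC: "PERMANENT",
  OPERATIONAL: "HIGH",
  TACTICAL: "LOW-MEDIUM",
  SYSTEM: "HIGH",
  STOCHASTIC: "VARIABLE",
};

// Hypothetical helper: look up the persistence level for a
// classified instruction, failing loudly on unknown quadrants.
function persistenceFor(quadrant) {
  const level = PERSISTENCE_BY_QUADRANT[quadrant];
  if (level === undefined) {
    throw new Error(`Unknown quadrant: ${quadrant}`);
  }
  return level;
}
```

Keeping the table explicit (rather than inferring persistence at runtime) is what makes the behavior deterministic and auditable.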
### 2. CrossReferenceValidator

**Function:** Validates proposed actions against stored instruction history before execution.

**Prevents:** Pattern-recognition bias, in which patterns learned during LLM training override explicit instructions.

**Example:** The user says "MongoDB port 27027"; the LLM's training pattern autocorrects this to "27017". CrossReferenceValidator blocks the action as an instruction conflict.
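A minimal sketch of that check, assuming stored instructions are held as explicit key-value parameters. The function name and return shape are illustrative, not the actual Tractatus API:

```javascript
// Compare a proposed action's parameters against explicitly stored
// instruction parameters. Any mismatch is surfaced as a conflict;
// nothing is silently "corrected".
function crossReferenceValidate(storedParams, proposedParams) {
  const conflicts = [];
  for (const [key, expected] of Object.entries(storedParams)) {
    if (key in proposedParams && proposedParams[key] !== expected) {
      conflicts.push({ key, expected, proposed: proposedParams[key] });
    }
  }
  return { allowed: conflicts.length === 0, conflicts };
}
```

For the port example: stored `{ mongoPort: 27027 }` checked against proposed `{ mongoPort: 27017 }` yields `allowed: false` with one recorded conflict.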
### 3. BoundaryEnforcer

**Function:** Structurally blocks decisions in protected domains, requiring human approval.

**Protected domains:**

- Values decisions (privacy, user agency, ethics)
- Irreversible changes (deletions, schema changes)
- Security modifications (authentication, access control)
- Financial decisions (pricing, billing, payments)

**Result:** AI is prevented from executing these decisions without explicit human approval.
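The enforcement gate can be sketched as follows. The domain identifiers and decision shape are illustrative assumptions; the point is that the check is a deterministic structural rule, not a model judgment:

```javascript
// Protected domains, mirroring the list above (identifiers assumed).
const PROTECTED_DOMAINS = new Set(["values", "irreversible", "security", "financial"]);

// Block any decision touching a protected domain unless it carries
// explicit human approval.
function enforceBoundaries(decision) {
  const hits = decision.domains.filter((d) => PROTECTED_DOMAINS.has(d));
  if (hits.length > 0 && !decision.humanApproved) {
    return { status: "BLOCKED", requiresApprovalFor: hits };
  }
  return { status: "ALLOWED", requiresApprovalFor: [] };
}
```

Because the gate sits outside the model, an LLM cannot talk its way past it: only the `humanApproved` flag, set through a human workflow, changes the outcome.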
### 4. ContextPressureMonitor

**Function:** Tracks session degradation across five factors.

**Monitors:**

- Conversation length (40% weight) - PRIMARY factor: message count drives compaction events
- Token usage (30% weight) - Context window pressure
- Task complexity (15% weight) - Competing demands
- Error frequency (10% weight) - Quality indicators
- Instruction density (5% weight) - Directive overload

**Action:** Recommends session handoff before quality degrades.
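The weighted score behind that recommendation can be sketched directly from the table above. The weights come from the documentation; the assumption that each factor arrives pre-normalised into [0, 1], and the function names, are illustrative:

```javascript
// Factor weights as documented above. Each factor value is assumed
// to be pre-normalised into [0, 1]; that normalisation is not shown.
const WEIGHTS = {
  conversationLength: 0.40,
  tokenUsage: 0.30,
  taskComplexity: 0.15,
  errorFrequency: 0.10,
  instructionDensity: 0.05,
};

// Weighted sum of the five factors, yielding a score in [0, 1].
function degradationScore(factors) {
  return Object.entries(WEIGHTS).reduce(
    (sum, [name, weight]) => sum + weight * (factors[name] ?? 0),
    0,
  );
}

// Recommend a session handoff once the score crosses a threshold
// (75% per the failure-mode discussion later in this document).
function recommendHandoff(factors, threshold = 0.75) {
  return degradationScore(factors) >= threshold;
}
```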
### 5. MetacognitiveVerifier

**Function:** The LLM evaluates its own reasoning before proposing complex actions.

**Checks:**

- Alignment with stated goals
- Internal coherence
- Completeness (edge cases considered)
- Safety risks
- Alternatives explored

**Output:** Confidence score + recommendation (PROCEED / REQUIRE_REVIEW / BLOCKED)
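One way the confidence score could map onto the three documented outcomes is sketched below. The 0.8 and 0.5 thresholds and the evaluation shape are assumptions for illustration; the framework's actual cut-offs are not stated in this document:

```javascript
// Map a self-evaluation to one of the documented recommendations.
// Thresholds (0.8, 0.5) are illustrative assumptions.
function recommendAction(evaluation) {
  // A detected safety risk blocks regardless of confidence.
  if (evaluation.safetyRiskDetected) return "BLOCKED";
  if (evaluation.confidence >= 0.8) return "PROCEED";
  if (evaluation.confidence >= 0.5) return "REQUIRE_REVIEW";
  return "BLOCKED";
}
```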
### 6. PluralisticDeliberationOrchestrator

**Function:** Manages decisions involving competing values frameworks.

**Process:**

1. Detects values conflicts (privacy vs safety, rights vs consequences)
2. Identifies affected stakeholder groups
3. Structures multi-perspective deliberation
4. Documents all positions (including dissent)
5. Creates reviewable precedents

**Principle:** When values genuinely conflict, deliberation quality matters more than decision speed. AI facilitates; humans decide.
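The five steps above end in a reviewable precedent; a plausible record shape is sketched below. All field names are illustrative assumptions, not the framework's actual schema:

```javascript
// Assemble a reviewable precedent from the deliberation steps above.
function buildPrecedent({ conflict, stakeholders, positions, decision, decidedBy }) {
  return {
    conflict,                                   // step 1: the detected values conflict
    stakeholders,                               // step 2: affected groups
    positions,                                  // step 4: all positions, dissent included
    dissent: positions.filter((p) => !p.supportsDecision),
    decision,                                   // made by humans; AI only facilitates
    decidedBy,
    recordedAt: new Date().toISOString(),       // step 5: timestamped, reviewable record
  };
}
```

Recording dissent explicitly, rather than only the winning position, is what makes later review of the precedent meaningful.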
## Why "Tractatus"?

Named after Ludwig Wittgenstein's *Tractatus Logico-Philosophicus* (1921), which established:

1. **Language has limits** - Not everything can be meaningfully stated
2. **Boundaries are structural** - Limits are inherent, not defects
3. **Clarity through precision** - Define what can and cannot be said

Applied to AI safety:

1. **AI judgment has limits** - Not every decision can be safely automated
2. **Safety through architecture** - Build boundaries into system structure
3. **Reliability through specification** - Precisely define where humans must decide
## Demonstrated Failure Modes Prevented
### Port 27027 Incident (2025-10-06)

**What happened:** The user specified MongoDB port 27027. The LLM immediately used 27017 instead, not through forgetting but through pattern-recognition autocorrection: the training-data association "MongoDB = 27017" was strong enough to override the explicit instruction in real time.

**Tractatus prevention:** InstructionPersistenceClassifier + CrossReferenceValidator store explicit parameters and block any action that conflicts with stored instructions, even when the conflict originates in training patterns.

### Context Degradation (Multiple sessions)

**What happens:** Beyond roughly 150k tokens, LLM output quality silently degrades: the model forgets instructions, makes careless errors, and fails to verify assumptions.

**Tractatus prevention:** ContextPressureMonitor calculates a degradation score and recommends session handoff at the 75% threshold, before failure occurs.

### Values Creep (Ongoing risk)

**What happens:** The LLM gradually makes values-sensitive decisions without recognizing them as such: privacy vs performance trade-offs, definitions of "harmful" content, user agency boundaries.

**Tractatus prevention:** BoundaryEnforcer structurally blocks these decisions. The LLM cannot execute them without explicit human approval.
## Current Implementation Status
**Production deployment:** agenticgovernance.digital (this website)

**Development governance:** Active (this website is built under Tractatus governance)

**Test coverage:** 192 unit tests passing (100% coverage on core services)

**Database:** Instruction persistence operational (MongoDB)

**Phase:** 5 PoC - Value pluralism integration active

**Dogfooding:** The Tractatus framework governs its own development. Every decision to modify this website passes through Tractatus services.
## Technical Architecture
- **Runtime:** Node.js (Express)
- **Database:** MongoDB (instruction persistence, precedent storage)
- **Frontend:** Vanilla JavaScript (no framework dependencies)
- **API:** RESTful (OpenAPI 3.0 spec available)
- **Services:** Six independent modules with defined interfaces

**Key design decision:** No machine learning in the governance services. All boundaries are deterministic and auditable.
## Who Should Use Tractatus?
### AI Safety Researchers

- Architectural approach to the alignment problem
- Formal specification of decision boundaries
- Empirical validation of degradation detection
- Novel framework for values pluralism in AI

### Software Teams Deploying LLMs

- Reference implementation code (tested, documented)
- Immediate safety improvements
- Integration guides for existing systems
- Prevention of known failure modes

### Policy Makers / Advocates

- Clear framework for AI safety requirements
- Non-technical explanations available
- Addresses agency preservation
- Demonstrates practical implementation
## Integration Requirements
**Minimum:** An LLM with structured output support, persistent storage for instruction history, and the ability to wrap LLM calls in a governance layer.

**Recommended:** Session state management, token counting, and user authentication for human approval workflows.
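The minimum requirement of "wrapping LLM calls in a governance layer" can be sketched as below. Both `validate` and `callLLM` are hypothetical stand-ins for the governance services and the model client; this is a sketch under those assumptions, not the actual Tractatus API:

```javascript
// Minimal governance wrapper around an LLM call. Kept synchronous for
// clarity; a real wrapper around a network-backed model would be async.
function governedCall(request, { validate, callLLM }) {
  const verdict = validate(request);
  if (!verdict.allowed) {
    // Protected or conflicting requests never reach the model unreviewed.
    return { status: "BLOCKED", reason: verdict.reason };
  }
  return { status: "OK", output: callLLM(request) };
}
```

The key property is ordering: validation happens before the model is invoked at all, so a blocked request costs nothing and leaks nothing.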
## Limitations
**What Tractatus does NOT do:**

- Train better LLMs (it uses existing models as-is)
- Ensure "aligned" AI behavior
- Guarantee reduced risk of failures
- Replace human judgment

**What Tractatus DOES do:**

- Designed to detect specific known failure modes before execution
- Architecturally enforce boundaries on decision authority
- Monitor session quality degradation indicators
- Require human judgment for values-sensitive decisions
## Getting Started
1. **Read Core Concepts** - Understand the six services in detail
2. **Review Case Studies** - See real failure modes and prevention
3. **Check Technical Specification** - API reference and integration guide
4. **Explore Implementation Guide** - Step-by-step deployment
## Research Foundations
Tractatus integrates concepts from:

- **Philosophy of language** (Wittgenstein) - Limits and boundaries
- **Organizational theory** (March, Simon) - Bounded rationality, decision premises
- **Deliberative democracy** (Gutmann, Thompson) - Structured disagreement
- **Value pluralism** (Berlin, Chang) - Incommensurable values
- **Systems architecture** (Conway, Brooks) - Structural constraints and boundaries

See [Research Foundations](/docs.html) for academic grounding and citations.
## Contributing
Tractatus is open source and welcomes contributions:

- **Code:** GitHub pull requests (Node.js, tests required)
- **Research:** Theoretical extensions, formal verification
- **Case studies:** Document real-world applications
- **Documentation:** Clarity improvements, translations

**Repository:** https://github.com/AgenticGovernance/tractatus

**Issues:** https://github.com/AgenticGovernance/tractatus/issues
## Contact
**Email:** john.stroh.nz@pm.me

**Website:** https://agenticgovernance.digital
---
## Licence
Copyright © 2026 John Stroh.

This work is licensed under the [Creative Commons Attribution 4.0 International Licence (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/).

You are free to share, copy, redistribute, adapt, remix, transform, and build upon this material for any purpose, including commercially, provided you give appropriate attribution, provide a link to the licence, and indicate if changes were made.

**Note:** The Tractatus AI Safety Framework source code is separately licensed under the Apache License 2.0. This Creative Commons licence applies to the research paper text and figures only.
---
## Document Metadata
<div class="document-metadata">

- **Version:** 0.5.0
- **Created:** 2025-10-12
- **Last Modified:** 2025-10-13
- **Author:** John Stroh
- **Word Count:** 1,372 words
- **Reading Time:** ~7 minutes
- **Document ID:** introduction-to-the-tractatus-framework
- **Status:** Active

</div>

---
**Next Steps:**
- [Core Concepts: Deep Dive into Six Services →](/docs.html)
- [Case Studies: Real-World Failure Modes →](/docs.html)
- [Implementation Guide: Deploy Tractatus →](/docs.html)