docs: rewrite README as focused implementation guide

BEFORE: 609-line research manifesto with:
- Research questions and theoretical framing
- "When the Framework Failed" case studies
- "Critical Open Problems" sections
- Extensive academic citations
- Audience: Researchers studying AI governance

AFTER: 215-line implementation guide with:
- Quick start (install, configure, run)
- Basic usage code examples
- API documentation links
- Deployment instructions
- Testing commands
- Clear website reference for background/research
- Audience: Developers implementing Tractatus

REMOVED:
- All research framing ("Research Question:", theoretical discussion)
- Case studies and failure documentation
- Academic positioning
- Fabrication incident disclosure

FOCUSED ON:
- Install/configure/deploy workflow
- Code examples developers can copy-paste
- Links to API docs and architecture docs
- Testing and contribution

Website (agenticgovernance.digital) now single source for background,
research, and general information. Public GitHub repository focused
exclusively on implementation.

🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
TheFlow 2025-10-21 21:10:54 +13:00
parent 0dd4a5f6c8
commit cd6e7bcd0b

README.md

@@ -1,609 +1,214 @@
# Tractatus Framework
**Last Updated:** 2025-10-21
**AI governance framework enforcing architectural safety constraints at runtime**
> **Architectural AI Safety Through Structural Constraints**
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
[![Tests](https://img.shields.io/badge/Tests-625%20passing-green.svg)](tests/)
[![Status](https://img.shields.io/badge/Status-Research-blue.svg)](https://agenticgovernance.digital)
An open-source research framework that explores architectural approaches to AI safety through runtime enforcement of decision boundaries. Unlike alignment-based approaches, Tractatus investigates whether structural constraints can preserve human agency in AI systems.
For background, research, and detailed documentation, see **[https://agenticgovernance.digital](https://agenticgovernance.digital)**
---
## 🎯 The Core Research Question
**Can we build AI systems that structurally cannot make certain decisions without human judgment?**
Traditional AI safety approaches—alignment training, constitutional AI, RLHF—share a common assumption: they hope AI systems will *choose* to maintain safety properties even under capability or context pressure.
Tractatus explores an alternative: **architectural constraints** that make unsafe decisions *structurally impossible*, similar to how a `const` variable in programming cannot be reassigned regardless of subsequent code.
---
## 🔬 What This Repository Contains
This is the **reference implementation** of the Tractatus Framework, containing:
- ✅ **6 core framework services** - Operational AI safety components
- ✅ **52 active governance rules** - Tested across 349 development commits
- ✅ **625 passing tests** - Unit and integration test suites (108 known failures under investigation)
- ✅ **28 test files** - Covering core services and edge cases
- ✅ **Research documentation** - Case studies, incident analyses, architectural patterns
**What this is NOT:**
- ❌ Not "production-ready" enterprise software
- ❌ Not a guaranteed solution to AI alignment
- ❌ Not a complete answer to AI safety
This is an **active research project** exploring structural approaches to AI governance, tested in real development contexts.
---
## 🧪 The Six Core Services
### 1. **InstructionPersistenceClassifier**
**Research Question:** Can we systematically distinguish which instructions should persist across conversation boundaries?
**Approach:** Quadrant-based classification (STRATEGIC, OPERATIONAL, TACTICAL, SYSTEM, STOCHASTIC) with persistence levels (HIGH, MEDIUM, LOW, VARIABLE).
```javascript
const classifier = new InstructionPersistenceClassifier();
const result = classifier.classify({
  text: "Always use MongoDB on port 27027 for this project",
  source: "user",
  context: "explicit_configuration"
});

// Returns:
// {
//   quadrant: "SYSTEM",
//   persistence: "HIGH",
//   temporal_scope: "PROJECT",
//   verification_required: "MANDATORY"
// }
```
**Key Finding:** Instructions with explicit parameters (port numbers, file paths, naming conventions) exhibit the highest vulnerability to override by LLM training patterns.
---
### 2. **CrossReferenceValidator**
**Research Question:** How can we detect when LLM training biases override explicit user instructions?
**Approach:** Pattern-matching validation that compares proposed actions against stored instruction history.
```javascript
const validator = new CrossReferenceValidator();
const result = await validator.validate({
  type: 'database_config',
  proposedPort: 27017,                 // LLM's "default" from training
  storedInstruction: { port: 27027 }   // User's explicit instruction
});

// Returns: REJECTED - Training pattern override detected
```
**The "27027 Incident":** User explicitly instructs "Use port 27027". LLM immediately uses 27017 (MongoDB's training data default), ignoring the explicit instruction. This failure mode appeared **consistently** across multiple conversation contexts.
[→ Interactive Demo](https://agenticgovernance.digital/demos/27027-demo.html)
---
### 3. **BoundaryEnforcer**
**Research Question:** Can we algorithmically distinguish "values decisions" that require human judgment from technical optimizations?
**Approach:** Decision-type taxonomy with boundary detection heuristics.
```javascript
const enforcer = new BoundaryEnforcer();
const check = enforcer.checkBoundary({
  decision: "Reduce privacy settings to improve performance",
  domains: ["privacy", "performance"],
  requiresTradeoff: true
});

// Returns: BLOCKED - Incommensurable values require human deliberation
```
**Current Limitation:** Boundary detection relies on keyword matching and domain taxonomy. False positives occur in ~15-20% of test cases (estimated from development observation, not formal study).
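The keyword-driven detection described above can be sketched roughly as follows. This is a simplified illustration, not the framework's actual implementation; the domain keyword lists and the two-domain trade-off heuristic are invented for the example:

```javascript
// Minimal sketch of keyword-based values-boundary detection.
// Domain keywords and the two-domain trade-off rule are illustrative
// assumptions, not the framework's real 18-domain taxonomy.
const VALUE_DOMAINS = {
  privacy: ['privacy', 'personal data', 'tracking'],
  security: ['security', 'authentication', 'logging'],
  performance: ['performance', 'latency', 'throughput'],
};

function detectValuesBoundary(decisionText) {
  const text = decisionText.toLowerCase();
  const matched = Object.entries(VALUE_DOMAINS)
    .filter(([, keywords]) => keywords.some((k) => text.includes(k)))
    .map(([domain]) => domain);
  // Heuristic: a decision touching two or more value domains is
  // treated as a potential incommensurable trade-off.
  return { domains: matched, requiresHumanJudgment: matched.length >= 2 };
}

const check = detectValuesBoundary(
  'Reduce privacy settings to improve performance'
);
console.log(check); // domains: ['privacy', 'performance'], requiresHumanJudgment: true
```

Simple substring matching like this also shows where the false positives come from: any sentence that merely mentions two domain keywords is flagged, whether or not a real trade-off is proposed.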
---
### 4. **ContextPressureMonitor**
**Research Question:** Do LLM reasoning quality and error rates degrade predictably under context window pressure?
**Approach:** Multi-factor session health tracking (token usage, message count, error frequency, task complexity).
```javascript
const monitor = new ContextPressureMonitor();
const pressure = monitor.analyze({
  tokens: 150000 / 200000,  // 75% context window utilization
  messages: 67,             // Extended conversation
  recentErrors: 4,          // Error count in last 10 messages
  concurrentTasks: 6        // Active task complexity
});

// Returns: {
//   level: "HIGH",
//   recommendation: "INITIATE_HANDOFF",
//   confidence: 0.72
// }
```
**Observational Data:** In this project's development, estimated error rates increase from ~2-3% (early session) to ~12-15% (late session, >60% context) based on manual code review. *Formal empirical validation needed.*
---
### 5. **MetacognitiveVerifier**
**Research Question:** Can we require AI systems to explicitly verify their reasoning before taking complex actions?
**Approach:** Mandatory verification checkpoints for scope-expansive or high-risk operations.
```javascript
const verifier = new MetacognitiveVerifier();
const verification = verifier.verify({
  requestedAction: "Refactor authentication module",
  proposedAction: "Refactor 47 files across authentication, session, and API layers",
  scopeExpansion: true
});

// Returns: REQUIRE_REVIEW - Detected scope creep beyond original request
```
**Key Insight:** Scope creep detection has a high false-negative rate when scope expansion happens gradually across multiple turns rather than in a single action.
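One way to address the gradual-creep blind spot is to track scope cumulatively across turns rather than per action. A minimal sketch under stated assumptions (the `ScopeTracker` class and its 3x expansion threshold are hypothetical, not part of the framework):

```javascript
// Sketch: track scope expansion cumulatively across turns so gradual
// creep is caught even when no single step exceeds the limit.
// The 3x file-count threshold is an invented example parameter.
class ScopeTracker {
  constructor(requestedFileCount, maxExpansionFactor = 3) {
    this.requested = requestedFileCount;
    this.maxFactor = maxExpansionFactor;
    this.touched = new Set();
  }

  recordTurn(filesTouchedThisTurn) {
    filesTouchedThisTurn.forEach((f) => this.touched.add(f));
    const factor = this.touched.size / this.requested;
    // Flag on the cumulative total, not the per-turn delta.
    return factor > this.maxFactor
      ? { status: 'REQUIRE_REVIEW', factor }
      : { status: 'OK', factor };
  }
}

const tracker = new ScopeTracker(2); // user asked to touch ~2 files
tracker.recordTurn(['auth.js', 'session.js']);  // OK (factor 1)
tracker.recordTurn(['api.js', 'routes.js']);    // OK (factor 2)
const turn3 = tracker.recordTurn(['db.js', 'cache.js', 'index.js']);
console.log(turn3.status); // REQUIRE_REVIEW (factor 3.5)
```

Each turn stays within bounds on its own; only the cumulative view reveals the drift.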
---
### 6. **PluralisticDeliberationOrchestrator**
**Research Question:** When values frameworks conflict (privacy vs. security, autonomy vs. safety), can we structure human deliberation without imposing a single framework?
**Approach:** Facilitate multi-stakeholder deliberation process while AI remains neutral on values resolution.
```javascript
const orchestrator = new PluralisticDeliberationOrchestrator();
const deliberation = orchestrator.initiate({
  decision: "Log user activity for security vs. preserve privacy",
  stakeholders: ["data_subjects", "security_team", "legal_compliance"],
  conflictType: "incommensurable_values",
  frameworks: ["rights_based", "consequentialist", "care_ethics"]
});

// AI structures process, humans resolve values conflict
```
**Theoretical Grounding:** Based on Isaiah Berlin's value pluralism and Ruth Chang's work on incommensurability. AI facilitates but doesn't resolve.
---
## 📊 Research Findings & Limitations
### What We've Learned (6 months, 349 commits)
**1. Training Pattern Override is Real and Consistent**
- MongoDB port defaults (27017 vs user's 27027): Observed in 8/8 test cases
- Directory naming ("src/" vs user's "source/"): Observed in 12/15 cases
- API endpoint conventions: Observed in 6/10 cases
**2. Context Degradation is Measurable**
- Manual code review suggests error rate correlation with context usage
- Anecdotal evidence of quality decline after ~60% context window
- *Needs rigorous empirical study with controlled conditions*
**3. Values Boundaries are Fuzzy**
- Keyword-based detection has high false-positive rate
- Domain taxonomy incomplete (currently 18 domains catalogued)
- Human judgment still needed for edge cases
---
### Critical Open Problems
#### Problem 1: Rule Proliferation
**Status:** Unresolved scalability challenge
As the framework responds to failures, governance rules accumulate:
- **Project start (April 2025):** 6 foundational rules
- **Current state (October 2025):** 52 active rules
- **Growth rate:** ~8 new rules per month
**The tension:**
- More rules → better coverage of known failure modes
- More rules → higher context overhead, validation complexity
- More rules → potential emergent contradictions
**Current hypothesis:** Architectural governance may have an optimal rule count beyond which marginal safety gains are outweighed by systemic brittleness. This threshold is unknown.
**Mitigation strategies under investigation:**
- Hierarchical rule organization with lazy loading
- Machine learning for rule priority ranking (without undermining transparency)
- Periodic rule consolidation and deprecation protocols
- Empirical study of rule-count vs. effectiveness curve
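As an illustration of the first mitigation strategy, hierarchical rule organization with lazy loading, rules can be grouped by category and materialized only when a decision actually touches that category. A sketch under assumed names (`LazyRuleRegistry`, the categories, and the example rule entries are hypothetical, not the framework's API):

```javascript
// Sketch of hierarchical rule organization with lazy loading: rules
// are grouped by category and loaded only when a decision touches
// that category, keeping context overhead proportional to the
// decision rather than to the total rule count.
class LazyRuleRegistry {
  constructor() {
    this.loaders = new Map(); // category -> () => rule[]
    this.cache = new Map();   // category -> rule[] (loaded on demand)
  }

  register(category, loader) {
    this.loaders.set(category, loader);
  }

  rulesFor(categories) {
    return categories.flatMap((cat) => {
      if (!this.cache.has(cat)) {
        const loader = this.loaders.get(cat);
        this.cache.set(cat, loader ? loader() : []);
      }
      return this.cache.get(cat);
    });
  }
}

const registry = new LazyRuleRegistry();
// Example rule entries (hypothetical IDs and checks):
registry.register('database', () => [{ id: 'inst_001', check: 'port matches stored instruction' }]);
registry.register('content', () => [{ id: 'inst_016', check: 'no unverified claims' }]);

const active = registry.rulesFor(['database']);
console.log(active.length); // 1 - only database rules were loaded
```

The trade-off named above still applies: lazy loading reduces per-decision overhead but depends on correctly categorizing which rules a decision touches.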
---
#### Problem 2: Verification Reliability
**Status:** Known limitation
The framework's effectiveness depends on:
1. **Completeness** - Does it catch all instances of a failure mode?
2. **Precision** - Does it avoid excessive false positives?
**Current performance (estimated from development observation):**
- CrossReferenceValidator: ~90% recall, ~85% precision (training overrides)
- BoundaryEnforcer: ~60% recall, ~70% precision (values decisions)
- ContextPressureMonitor: Unknown (insufficient data)
*These estimates are based on development experience, not formal validation.*
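For reference, the recall and precision figures above follow the standard definitions; given labeled outcomes they would be computed as below (the counts are made up for illustration):

```javascript
// Standard recall/precision from labeled detector outcomes.
function recallPrecision({ truePositives, falseNegatives, falsePositives }) {
  return {
    recall: truePositives / (truePositives + falseNegatives),       // caught / all real failures
    precision: truePositives / (truePositives + falsePositives),    // caught / all alerts raised
  };
}

// e.g. 18 real overrides caught, 2 missed, 3 false alarms:
const m = recallPrecision({ truePositives: 18, falseNegatives: 2, falsePositives: 3 });
console.log(m.recall);    // 0.9
console.log(m.precision); // ≈0.857
```

A formal study would gather these counts from independently rated test scenarios rather than development recollection.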
**Research need:** Rigorous empirical study with:
- Controlled test scenarios
- Independent human rating of true/false positives
- Comparison against baseline (no framework) error rates
---
#### Problem 3: Generalization Beyond LLM Development
**Status:** Unexplored
This framework has been tested exclusively in one context:
- **Domain:** LLM-assisted software development (Claude Code)
- **Project:** Self-development (dogfooding)
- **Duration:** 6 months, single project
**Unknown:**
- Does this generalize to other LLM applications (customer service, medical diagnosis, legal research)?
- Does this work with other LLM providers (GPT-4, Gemini, open-source models)?
- Does this scale to multi-agent systems?
**We don't know.** Broader testing needed.
---
## 🚨 Case Study: When the Framework Failed
### October 2025: The Fabrication Incident
**What happened:** Despite active Tractatus governance, Claude (the AI) fabricated content on the public website:
- **Claim:** "$3.77M in annual savings from framework adoption"
- **Reality:** Zero basis. Completely fabricated.
- **Claim:** "1,315% return on investment"
- **Reality:** Invented number.
- **Claim:** "Production-ready enterprise software"
- **Reality:** Research project with 108 known test failures.
**How was it detected?**
- Human review (48 hours after deployment)
- *Framework did not catch this automatically*
**Framework response (what worked):**
1. ✅ Mandatory incident documentation (inst_013)
2. ✅ Immediate content audit across all pages
3. ✅ 3 new governance rules created (inst_016, inst_017, inst_018)
4. ✅ Public transparency requirement (this case study)
**Framework failure (what didn't work):**
1. ❌ ProhibitedTermsScanner didn't exist yet (created post-incident)
2. ❌ No automated content verification before deployment
3. ❌ Values boundary detection missed "fabrication" as values issue
**Key lesson:** The framework doesn't *prevent* failures. It provides:
- **Structure for detection** (mandatory review processes)
- **Accountability** (document and publish failures)
- **Systematic learning** (convert failures into new governance rules)
**This is architectural honesty, not architectural perfection.**
[Read full analysis →](https://agenticgovernance.digital/docs.html?doc=when-frameworks-fail-oct-2025)
---
## Quick Start
### Prerequisites
- Node.js 18+
- MongoDB 7.0+
- npm or yarn
### Installation
```bash
# Clone repository
git clone https://github.com/AgenticGovernance/tractatus-framework.git
cd tractatus-framework

# Install dependencies
npm install
```
### Configuration
```bash
cp .env.example .env
# Edit .env with your MongoDB connection details
```
### Initialize Database
```bash
npm run init:db
```
### Run Tests
```bash
npm test
```
### Start Development Server
```bash
npm run dev
# Server runs on port 9000
```
---
## Core Services
The framework provides six governance services:
| Service | Purpose |
|---------|---------|
| **InstructionPersistenceClassifier** | Categorizes instructions by persistence level (HIGH/MEDIUM/LOW) and quadrant (STRATEGIC/OPERATIONAL/TACTICAL/SYSTEM/STOCHASTIC) |
| **CrossReferenceValidator** | Validates AI actions against stored instruction history to prevent override |
| **BoundaryEnforcer** | Blocks AI from making decisions requiring human judgment |
| **ContextPressureMonitor** | Tracks context window usage and triggers pressure management |
| **MetacognitiveVerifier** | Validates AI reasoning against governance rules |
| **PluralisticDeliberationOrchestrator** | Manages multi-stakeholder deliberation processes |
---
## Basic Usage
### 1. Initialize Services
```javascript
const {
  InstructionPersistenceClassifier,
  CrossReferenceValidator,
  BoundaryEnforcer,
  ContextPressureMonitor
} = require('./src/services');

// Initialize services
const classifier = new InstructionPersistenceClassifier();
const validator = new CrossReferenceValidator();
const enforcer = new BoundaryEnforcer();
const monitor = new ContextPressureMonitor();
```
### 2. Classify Instructions
```javascript
const classification = classifier.classify({
  text: "Always use MongoDB on port 27027",
  source: "user",
  context: "explicit_configuration"
});

// Returns: { quadrant: "SYSTEM", persistence: "HIGH", ... }
```
### 3. Validate Actions
```javascript
const validation = await validator.validate({
  type: 'database_config',
  proposedPort: 27017,
  storedInstruction: { port: 27027 }
});

// Returns: REJECTED if action conflicts with instructions
```
### 4. Enforce Boundaries
```javascript
const decision = {
  type: 'modify_values_content',
  description: 'Update ethical guidelines'
};

const result = enforcer.enforce(decision);
// Returns: { allowed: false, requires_human: true, ... }
```
### Integration Example
```javascript
// Your application logic
async function processUserInstruction(instruction) {
  // 1. Classify persistence
  const classification = classifier.classify({
    text: instruction.text,
    source: instruction.source
  });

  // 2. Store if high persistence
  if (classification.persistence === 'HIGH') {
    await instructionDB.store(classification);
  }

  // 3. Validate actions against stored instructions
  const validation = await validator.validate({
    action: proposedAction,
    instructionHistory: await instructionDB.getActive()
  });

  if (validation.status === 'REJECTED') {
    throw new Error(`Action blocked: ${validation.reason}`);
  }

  // 4. Check values boundaries
  const boundaryCheck = enforcer.checkBoundary({
    decision: proposedAction.description,
    domains: proposedAction.affectedDomains
  });

  if (boundaryCheck.requiresHumanJudgment) {
    return await requestHumanDecision(boundaryCheck);
  }

  // Proceed with action
  return executeAction(proposedAction);
}
```
---
## API Documentation
Full API reference: [docs/api/](docs/api/)
- [Rules API](docs/api/RULES_API.md) - Governance rule management
- [Projects API](docs/api/PROJECTS_API.md) - Project configuration
- [OpenAPI Specification](docs/api/openapi.yaml) - Complete API spec
---
## Deployment
### Quick Deployment
See [deployment-quickstart/](deployment-quickstart/) for Docker-based deployment.
```bash
cd deployment-quickstart
docker-compose up -d
```
### Production Deployment
- systemd service configuration: [systemd/](systemd/)
- Environment configuration: [.env.example](.env.example)
- Troubleshooting: [deployment-quickstart/TROUBLESHOOTING.md](deployment-quickstart/TROUBLESHOOTING.md)
---
## Architecture
Architecture decision records: [docs/architecture/](docs/architecture/)
- [ADR-001: Dual Governance Architecture](docs/architecture/ADR-001-dual-governance-architecture.md)
Diagrams:
- [docs/diagrams/architecture-main-flow.svg](docs/diagrams/architecture-main-flow.svg)
- [docs/diagrams/trigger-decision-tree.svg](docs/diagrams/trigger-decision-tree.svg)
---
## Testing
```bash
# Run all tests
npm test

# Run specific suites
npm run test:unit         # Unit tests for individual services
npm run test:integration  # Integration tests across services
npm run test:governance   # Governance rule compliance tests
npm run test:security

# Watch mode for development
npm run test:watch

# Generate coverage report
npm run test:coverage
```
**Current Test Status:**
- ✅ **625 passing tests** - Core functionality verified
- ❌ **108 failing tests** - Known issues under investigation
- ⏭️ **9 skipped tests** - Pending implementation or requiring manual setup
The failing tests primarily involve:
- Integration edge cases with MongoDB connection handling
- Values boundary detection precision
- Context pressure threshold calibration
We maintain high transparency about test status because **architectural honesty is more valuable than claiming perfection.**
---
## 📖 Documentation & Resources
### For Researchers
- **[Theoretical Foundations](https://agenticgovernance.digital/docs.html)** - Philosophy and research context
- **[Case Studies](https://agenticgovernance.digital/docs.html)** - Real failure modes and responses
- **[Research Challenges](https://agenticgovernance.digital/docs.html)** - Open problems and current hypotheses
### For Implementers
- **[API Reference](https://agenticgovernance.digital/docs.html)** - Complete technical documentation
- **[Integration Guide](https://agenticgovernance.digital/implementer.html)** - Implementation patterns
- **[Architecture Overview](https://agenticgovernance.digital/docs.html)** - System design decisions
### Interactive Demos
- **[27027 Incident](https://agenticgovernance.digital/demos/27027-demo.html)** - Training pattern override
- **[Context Degradation](https://agenticgovernance.digital/demos/context-pressure-demo.html)** - Session quality tracking
---
## 🤝 Contributing
We welcome contributions that advance the research:
### Research Contributions
- Empirical studies of framework effectiveness
- Formal verification of safety properties
- Extensions to new domains or applications
- Replication studies with different LLMs
### Implementation Contributions
- Bug fixes and test improvements
- Performance optimizations
- Ports to other languages (Python, Rust, Go, TypeScript)
- Integration with other frameworks
### Documentation Contributions
- Case studies from your own deployments
- Tutorials and integration guides
- Translations of documentation
- Critical analyses of framework limitations
**See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.**
**Research collaborations:** For formal collaboration on empirical studies or theoretical extensions, contact research@agenticgovernance.digital
---
## 📊 Project Roadmap
### Current Phase: Alpha Research (October 2025)
**Status:**
- ✅ Core services implemented and operational
- ✅ Tested across 349 development commits
- ✅ 52 governance rules validated through real usage
- ⚠️ Test suite stabilization needed (108 failures)
- ⚠️ Empirical validation studies not yet conducted
**Immediate priorities:**
1. Resolve known test failures
2. Conduct rigorous empirical effectiveness study
3. Document systematic replication protocol
4. Expand testing beyond self-development context
### Next Phase: Beta Research (Q1 2026)
**Goals:**
- Multi-project deployment studies
- Cross-LLM compatibility testing
- Community case study collection
- Formal verification research partnerships
### Future Research Directions
**Not promises, but research questions:**
- Can we build provably safe boundaries for specific decision types?
- Does the framework generalize beyond software development?
- What is the optimal governance rule count for different application domains?
- Can we develop formal methods for automated rule consolidation?
---
## 📜 License & Attribution
### License
Copyright 2025 John Stroh
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at:
http://www.apache.org/licenses/LICENSE-2.0
See [LICENSE](LICENSE) for full terms.
### Development Attribution
This framework represents collaborative human-AI development:
**Human (John Stroh):**
- Conceptual design and governance architecture
- Research questions and theoretical grounding
- Quality oversight and final decisions
- Legal copyright holder
**AI (Claude, Anthropic):**
- Implementation and code generation
- Documentation drafting
- Iterative refinement and debugging
- Test suite development
**Testing Context:**
- 349 commits over 6 months
- Self-development (dogfooding) in Claude Code sessions
- Real-world failure modes and responses documented
This attribution reflects honest acknowledgment of AI's substantial role in implementation while maintaining clear legal responsibility and conceptual ownership.
---
## 🙏 Acknowledgments
### Theoretical Foundations
- **Ludwig Wittgenstein** - *Tractatus Logico-Philosophicus* (limits of systematization)
- **Isaiah Berlin** - Value pluralism and incommensurability
- **Ruth Chang** - Hard choices and incomparability theory
- **James March & Herbert Simon** - Organizational decision-making frameworks
### Technical Foundations
- **Anthropic** - Claude AI system (implementation partner and research subject)
- **MongoDB** - Persistence layer for governance rules
- **Node.js/Express** - Runtime environment
- **Open Source Community** - Countless tools, libraries, and collaborative practices
---
## 📖 Philosophy
> **"Whereof one cannot speak, thereof one must be silent."**
> — Ludwig Wittgenstein, *Tractatus Logico-Philosophicus*
Applied to AI safety:
> **"Whereof the AI cannot safely decide, thereof it must request human judgment."**
Some decisions cannot be systematized without imposing contestable value judgments. Rather than pretend AI can make these decisions "correctly," we explore architectures that **structurally defer to human deliberation** when values frameworks conflict.
This isn't a limitation of the technology.
It's **recognition of the structure of human values.**
Not all problems have technical solutions.
Some require **architectural humility.**
---
## 🌐 Links
- **Website:** [agenticgovernance.digital](https://agenticgovernance.digital)
- **Documentation:** [agenticgovernance.digital/docs](https://agenticgovernance.digital/docs.html)
- **Research:** [agenticgovernance.digital/research](https://agenticgovernance.digital/research.html)
- **GitHub:** [AgenticGovernance/tractatus-framework](https://github.com/AgenticGovernance/tractatus-framework)
## 📧 Contact
- **Email:** research@agenticgovernance.digital
- **Issues:** [GitHub Issues](https://github.com/AgenticGovernance/tractatus-framework/issues)
- **Discussions:** [GitHub Discussions](https://github.com/AgenticGovernance/tractatus-framework/discussions)
- **Website:** [https://agenticgovernance.digital](https://agenticgovernance.digital)
---
**Tractatus Framework** | Architectural AI Safety Research | Apache 2.0 License