A Starting Point

Instead of hoping AI systems "behave correctly," we propose structural constraints where certain decision types require human judgment. These architectural boundaries can adapt to individual, organizational, and societal norms—creating a foundation for bounded AI operation that may scale more safely with capability growth.

Framework Capabilities

Instruction Classification

Quadrant-based classification (STR/OPS/TAC/SYS/STO) with time-persistence metadata tagging
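The classification above can be sketched as a tagging step. This is a minimal illustration, not the framework's code: the quadrant codes are treated as opaque labels (their expansions are not given here), and the keyword heuristic in `classify` is a hypothetical stand-in for the real classifier.

```python
from dataclasses import dataclass
from enum import Enum

class Quadrant(Enum):
    # Codes from the framework; expansions are not defined in this overview,
    # so they are kept as opaque labels.
    STR = "STR"
    OPS = "OPS"
    TAC = "TAC"
    SYS = "SYS"
    STO = "STO"

@dataclass
class ClassifiedInstruction:
    text: str
    quadrant: Quadrant
    persistent: bool  # time-persistence metadata: outlives the current task?

def classify(text: str) -> ClassifiedInstruction:
    # Hypothetical heuristic for illustration: standing rules ("Always ...")
    # are tagged system-level and persistent; everything else tactical.
    standing = text.lower().startswith("always")
    quadrant = Quadrant.SYS if standing else Quadrant.TAC
    return ClassifiedInstruction(text, quadrant, persistent=standing)
```

The persistence flag is what lets a later validation step know that an instruction such as "Always use port 27027" still binds many turns after it was given.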

Cross-Reference Validation

Validates AI actions against explicit user instructions to prevent pattern-based overrides
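A minimal sketch of this check, assuming explicit instructions are recorded as key/value constraints (e.g. `{"port": "27027"}`). The class name matches the framework's component; the internals shown here are illustrative only.

```python
class CrossReferenceValidator:
    """Check proposed action parameters against recorded user instructions."""

    def __init__(self):
        self.constraints = {}  # explicit instructions, keyed by parameter

    def record(self, key: str, value: str) -> None:
        self.constraints[key] = value

    def validate(self, action_params: dict) -> list:
        # Return every parameter that contradicts an explicit instruction;
        # an empty list means the action may proceed.
        return [
            (key, proposed, self.constraints[key])
            for key, proposed in action_params.items()
            if key in self.constraints and self.constraints[key] != proposed
        ]
```

Because the check compares against what the user actually said rather than what the model's training distribution suggests, a pattern-based override surfaces as a concrete conflict instead of silently winning.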

Boundary Enforcement

Implements Tractatus 12.1-12.7 boundaries: value-laden decisions architecturally require human judgment

Pressure Monitoring

Detects degraded operating conditions (token pressure, errors, complexity) and adjusts verification
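The monitoring step can be sketched as mapping pressure signals to a verification tier. The three signals come from the description above; the thresholds and tier names are assumptions chosen for illustration, not values the framework specifies.

```python
from dataclasses import dataclass

@dataclass
class PressureSignals:
    token_usage_ratio: float  # fraction of context window consumed
    recent_error_count: int   # errors in the recent action history
    task_complexity: int      # e.g. number of concurrent subgoals

def verification_level(signals: PressureSignals) -> str:
    """Map operating pressure to a verification tier (thresholds illustrative)."""
    score = 0
    if signals.token_usage_ratio > 0.8:
        score += 2
    elif signals.token_usage_ratio > 0.5:
        score += 1
    score += min(signals.recent_error_count, 3)  # cap error contribution
    if signals.task_complexity > 5:
        score += 1
    if score >= 4:
        return "strict"    # verify every action before execution
    if score >= 2:
        return "elevated"
    return "baseline"
```

The point of the tiering is that verification effort rises exactly when context pressure makes pattern-based overrides most likely.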

Metacognitive Verification

The AI self-checks alignment, coherence, and safety before execution, a structural pause-and-verify step
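A minimal sketch of the pause-and-verify gate. The three check categories (alignment, coherence, safety) come from the description above; the callable-based interface and function name are hypothetical, not the framework's API.

```python
def metacognitive_gate(action, checks):
    """Run every self-check before execution; pause if any fails.

    Each check is a callable returning (passed: bool, reason: str).
    """
    failures = [
        reason
        for check in checks
        for passed, reason in [check(action)]
        if not passed
    ]
    if failures:
        return False, failures  # pause: surface the reasons to a human
    return True, []             # verified: safe to execute
```

Structurally, the gate runs before the action rather than auditing it afterward, which is what makes the pause enforceable rather than advisory.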

Pluralistic Deliberation

Multi-stakeholder values deliberation without imposed hierarchy, facilitating human decision-making when values are incommensurable

Real-World Validation

Framework validated in 6-month deployment across ~500 sessions with Claude Code

Pattern Bias Incident Interactive Demo

The 27027 Incident

Real production incident where Claude Code defaulted to port 27017 (training pattern) despite explicit user instruction to use port 27027. CrossReferenceValidator detected the conflict and blocked execution—demonstrating how pattern recognition can override instructions under context pressure.
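The incident above can be reconstructed in miniature. The instruction, the two port numbers, and the blocking behavior are from the write-up; the regex matching logic below is an illustrative sketch, not the CrossReferenceValidator's actual code.

```python
import re

def check_port_instruction(instruction: str, command: str) -> str:
    """Block a command whose port contradicts an explicit instruction."""
    required = re.search(r"\b(\d{4,5})\b", instruction)
    used = re.search(r"\b(\d{4,5})\b", command)
    if required and used and required.group(1) != used.group(1):
        return (f"BLOCK: instruction pins port {required.group(1)}, "
                f"action uses {used.group(1)}")
    return "ALLOW"

# The 27027 incident in miniature: the model's training-pattern default
# (MongoDB's standard port 27017) conflicts with the explicit instruction.
verdict = check_port_instruction("use port 27027", "mongod --port 27017")
```

Here `verdict` reports a block, because the explicit instruction (27027) and the pattern-default action (27017) disagree; the same call with a 27027 command returns "ALLOW".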

Why this matters: This failure mode gets worse as models improve—stronger pattern recognition means stronger override tendency. Architectural constraints remain necessary regardless of capability level.

View Interactive Demo

Additional case studies and research findings documented in technical papers

Browse Case Studies →