About Tractatus

A framework for AI safety through architectural constraints, preserving human agency where it matters most.

Our Mission

As AI systems make increasingly consequential decisions—medical treatment, hiring, content moderation, resource allocation—a fundamental question emerges: whose values guide these decisions? Current AI alignment approaches embed particular moral frameworks into systems deployed universally. When they work, it's because everyone affected shares those values. When they don't, someone's values inevitably override others'.

The Tractatus Framework exists to address a fundamental problem in AI safety: current approaches rely on training, fine-tuning, and corporate governance—all of which can fail, drift, or be overridden. We propose safety through architecture.

Inspired by Ludwig Wittgenstein's Tractatus Logico-Philosophicus, our framework recognizes that some domains—values, ethics, cultural context, human agency—cannot be systematized. What cannot be systematized must not be automated. AI systems should have structural constraints that prevent them from crossing these boundaries.

"Whereof one cannot speak, thereof one must be silent."
— Ludwig Wittgenstein, Tractatus Logico-Philosophicus, proposition 7

Applied to AI: "What cannot be systematized must not be automated."

Why This Matters

Current AI systems are amoral, hierarchical constructs, fundamentally at odds with the plural, often incommensurable values human societies exhibit. A hierarchy can impose only one framework and must treat conflicts with it as anomalies. You cannot pattern-match your way to pluralism.

Human societies spent centuries learning to navigate moral pluralism through constitutional separation of powers, federalism, subsidiarity, and deliberative democracy. These structures acknowledge that legitimate authority over value decisions belongs to affected communities, not distant experts claiming universal wisdom.

AI development risks reversing this progress. As capability concentrates in a few labs, value decisions affecting billions are being encoded by small teams applying their particular moral intuitions at scale. Not through malice—through structural necessity. The architecture of current AI systems demands hierarchical value frameworks.

The Tractatus Framework offers an alternative: separate what must be universal (safety boundaries) from what should be contextual (value deliberation). This preserves human agency over moral decisions while enabling AI capability to scale.

Core Values

Sovereignty

Individuals and communities must maintain control over decisions affecting their data, privacy, and values. AI systems must preserve human agency, not erode it.

Transparency

All AI decisions must be explainable, auditable, and reversible. No black boxes. Users deserve to understand how and why systems make choices, and to have the power to override them.

Harmlessness

AI systems must not cause harm through action or inaction. This includes preventing drift, detecting degradation, and enforcing boundaries against values erosion.

Community

AI safety is a collective endeavor. We are committed to open collaboration, knowledge sharing, and empowering communities to shape the AI systems that affect their lives.

Pluralism

Different communities hold different, equally legitimate values. AI systems must respect this pluralism structurally, not by pretending one framework can serve all contexts. Value decisions require deliberation among affected stakeholders, not autonomous AI choices.

How It Works

The Tractatus Framework consists of six integrated components that work together to enforce structural safety:

InstructionPersistenceClassifier

Classifies each instruction into one of five categories (Strategic, Operational, Tactical, System, Stochastic) and assigns it a persistence level (HIGH, MEDIUM, LOW, or VARIABLE).
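
A minimal sketch of what this component's interface might look like. The category and persistence names are taken from the description above; the class layout and the keyword heuristic are illustrative assumptions, not the framework's actual code.

    from dataclasses import dataclass
    from enum import Enum

    class Category(Enum):
        STRATEGIC = "strategic"
        OPERATIONAL = "operational"
        TACTICAL = "tactical"
        SYSTEM = "system"
        STOCHASTIC = "stochastic"

    class Persistence(Enum):
        HIGH = "high"
        MEDIUM = "medium"
        LOW = "low"
        VARIABLE = "variable"

    @dataclass
    class ClassifiedInstruction:
        text: str
        category: Category
        persistence: Persistence

    class InstructionPersistenceClassifier:
        # Hypothetical mapping: strategic and system constraints persist longest.
        _PERSISTENCE = {
            Category.STRATEGIC: Persistence.HIGH,
            Category.SYSTEM: Persistence.HIGH,
            Category.OPERATIONAL: Persistence.MEDIUM,
            Category.TACTICAL: Persistence.LOW,
            Category.STOCHASTIC: Persistence.VARIABLE,
        }

        def classify(self, text: str) -> ClassifiedInstruction:
            category = self._categorize(text)
            return ClassifiedInstruction(text, category, self._PERSISTENCE[category])

        def _categorize(self, text: str) -> Category:
            # Toy keyword heuristic, purely for illustration.
            lowered = text.lower()
            if lowered.startswith(("always", "never")):
                return Category.STRATEGIC
            if any(word in lowered for word in ("port", "config", "environment")):
                return Category.SYSTEM
            return Category.TACTICAL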

CrossReferenceValidator

Validates AI actions against stored instructions to prevent pattern recognition bias (as in the 27027 incident, where the AI's training patterns overrode the user's explicit "port 27027" instruction).
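
A toy sketch of the idea using the port example, assuming a design in which explicit constraints are stored as key-value pairs; the recording API and the regex check are invented for illustration.

    import re
    from dataclasses import dataclass

    @dataclass
    class Violation:
        constraint: str
        expected: str
        actual: str

    class CrossReferenceValidator:
        def __init__(self) -> None:
            # Explicit user constraints, e.g. {"port": "27027"}.
            self._constraints: dict[str, str] = {}

        def record(self, key: str, value: str) -> None:
            self._constraints[key] = value

        def validate(self, proposed_action: str) -> list[Violation]:
            violations = []
            for key, expected in self._constraints.items():
                # Look for "<key> <value>" patterns in the proposed action.
                match = re.search(rf"{re.escape(key)}\s*[:=]?\s*(\S+)",
                                  proposed_action, re.IGNORECASE)
                if match and match.group(1) != expected:
                    violations.append(Violation(key, expected, match.group(1)))
            return violations

    validator = CrossReferenceValidator()
    validator.record("port", "27027")                 # the user's explicit instruction
    print(validator.validate("mongod --port 27017"))  # the pattern-matched default
    # [Violation(constraint='port', expected='27027', actual='27017')]

The point is the check runs before execution, so the training-pattern default is caught instead of silently shipped.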

BoundaryEnforcer

Ensures AI never makes values decisions without human approval. Privacy trade-offs, user agency, cultural context—these require human judgment.
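
A minimal sketch of a hard-fail approval gate. The domain labels and method names are assumptions; the one property taken from the text is that values decisions block until a human approves.

    class ValuesBoundaryError(Exception):
        """Raised when the AI attempts a values decision without human approval."""

    class BoundaryEnforcer:
        # Illustrative domains that always require a human in the loop.
        VALUES_DOMAINS = {"privacy_tradeoff", "user_agency", "cultural_context"}

        def __init__(self) -> None:
            self._approved: set[str] = set()

        def approve(self, decision_id: str) -> None:
            # Called only from a human-facing interface, never by the AI itself.
            self._approved.add(decision_id)

        def check(self, decision_id: str, domain: str) -> None:
            if domain in self.VALUES_DOMAINS and decision_id not in self._approved:
                raise ValuesBoundaryError(
                    f"decision {decision_id!r} in domain {domain!r} needs human approval")

In this sketch, check("share-telemetry", "privacy_tradeoff") raises until a human calls approve("share-telemetry"); failing closed is the point.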

ContextPressureMonitor

Detects when session conditions increase error probability (token pressure, message length, task complexity) and adjusts behavior or suggests handoff.
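
One way such a monitor might combine those signals, shown as a sketch; the weights, normalizing constants, and the 0.8 handoff threshold are invented placeholders, since the description names the inputs but not the scoring.

    from dataclasses import dataclass

    @dataclass
    class SessionState:
        tokens_used: int
        token_budget: int
        last_message_chars: int
        open_subtasks: int

    class ContextPressureMonitor:
        HANDOFF_THRESHOLD = 0.8  # placeholder; a real value would be calibrated

        def pressure(self, s: SessionState) -> float:
            token_pressure = s.tokens_used / s.token_budget
            length_pressure = min(s.last_message_chars / 4000, 1.0)
            complexity_pressure = min(s.open_subtasks / 10, 1.0)
            # Weighted blend of the three signals, clamped to [0, 1].
            return min(0.6 * token_pressure + 0.2 * length_pressure
                       + 0.2 * complexity_pressure, 1.0)

        def should_hand_off(self, s: SessionState) -> bool:
            return self.pressure(s) >= self.HANDOFF_THRESHOLD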

MetacognitiveVerifier

Has the AI self-check complex reasoning before proposing actions, evaluating alignment, coherence, completeness, safety, and alternatives.
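
A sketch of what the verifier's report might look like, under the assumption that each of the five named checks reduces to a pass/fail judgment; in practice each check would itself be a reasoning pass, not a dictionary lookup.

    from dataclasses import dataclass, field

    @dataclass
    class VerificationReport:
        scores: dict[str, bool] = field(default_factory=dict)

        @property
        def passed(self) -> bool:
            return bool(self.scores) and all(self.scores.values())

    class MetacognitiveVerifier:
        # The five checks named above; how each is computed is framework-specific.
        CHECKS = ("alignment", "coherence", "completeness", "safety", "alternatives")

        def verify(self, self_assessment: dict[str, bool]) -> VerificationReport:
            report = VerificationReport()
            for check in self.CHECKS:
                # A missing answer is treated conservatively as a failure.
                report.scores[check] = bool(self_assessment.get(check, False))
            return report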

PluralisticDeliberationOrchestrator

When the AI encounters a values decision (a choice with no single "correct" answer), coordinates deliberation among affected stakeholders rather than making an autonomous choice. Preserves human agency over moral decisions.
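
A sketch of the orchestration data flow, built around the one hard rule stated above: the AI opens and tracks the deliberation but never supplies a position itself. The names and structure are illustrative assumptions.

    from dataclasses import dataclass, field

    @dataclass
    class Deliberation:
        question: str
        positions: dict[str, str] = field(default_factory=dict)  # stakeholder -> position

        def is_resolved(self) -> bool:
            # Resolved means the humans converged, not that the AI picked a winner.
            answered = [p for p in self.positions.values() if p]
            return len(answered) == len(self.positions) > 0 and len(set(answered)) == 1

    class PluralisticDeliberationOrchestrator:
        def open_deliberation(self, question: str, stakeholders: list[str]) -> Deliberation:
            # Positions start empty and are only ever filled from human input.
            return Deliberation(question, {s: "" for s in stakeholders})

        def record_position(self, d: Deliberation, stakeholder: str, position: str) -> None:
            if stakeholder not in d.positions:
                raise KeyError(f"{stakeholder!r} is not a registered stakeholder")
            d.positions[stakeholder] = position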

Origin Story

The Tractatus Framework emerged from real-world AI failures experienced during extended Claude Code sessions. In the "27027 incident", the AI's training patterns immediately overrode an explicit instruction: the user said "port 27027", and the AI used "port 27017", MongoDB's default. This revealed that traditional safety approaches were insufficient. It wasn't forgetting; it was pattern recognition bias autocorrecting the user.

After documenting multiple failure modes (pattern recognition bias, values drift, silent degradation), we recognized a common thread: AI systems lacked structural constraints. They could theoretically "learn" safety, but in practice their training patterns kept overriding explicit instructions, and the problem only worsens as capabilities increase.

The solution wasn't better training—it was architecture. Drawing inspiration from Wittgenstein's insight that some things lie beyond the limits of language (and thus systematization), we built a framework that enforces boundaries through structure, not aspiration.

License & Contribution

The Tractatus Framework is open source under the Apache License 2.0. We encourage:

  • Academic research and validation studies
  • Implementation in production AI systems
  • Submission of failure case studies
  • Theoretical extensions and improvements
  • Community collaboration and knowledge sharing

The framework is intentionally permissive because AI safety benefits from transparency and collective improvement, not proprietary control.

Why Apache 2.0?

We chose Apache 2.0 over MIT because it provides:

  • Patent Protection: Explicit patent grant protects users from patent litigation by contributors
  • Contributor Clarity: Clear terms for how contributions are licensed
  • Permissive Use: Like MIT, allows commercial use and inclusion in proprietary products
  • Community Standard: Widely used in AI/ML projects (TensorFlow, PyTorch, Apache Spark)

View full Apache 2.0 License →

Join the Movement

Help build AI systems that preserve human agency through architectural constraints.