Research Framework • Early Development

Tractatus: Architectural Governance for LLM Systems

A governance framework addressing structural gaps in AI safety through external architectural controls. Designed for organisations deploying large language models at scale where conventional oversight mechanisms prove insufficient.

The Governance Gap

Current AI governance approaches—policy documents, training programmes, ethical guidelines—rely on voluntary compliance. LLM systems can bypass these controls simply by not invoking them. When an AI agent needs to check a policy, it must choose to do so. When it should escalate a decision to human oversight, it must recognise that obligation.

This creates a structural problem: governance exists only insofar as the AI acknowledges it. For organisations subject to EU AI Act Article 14 (human oversight requirements) or deploying AI in high-stakes domains, this voluntary model is inadequate.

Tractatus explores whether governance can be made architecturally external—difficult to bypass not through better prompts, but through system design that places control points outside the AI's discretion.

Architectural Approach

1. Agent Runtime Layer — Any LLM system (Claude Code, Copilot, custom agents, LangChain, CrewAI). The AI system being governed.
2. Governance Layer — Six autonomous services that intercept, validate, and document AI operations. External to the AI runtime.
3. Persistent Storage Layer — Immutable audit logs, governance rules, instruction history. Cannot be altered by AI prompts.
The Governance Layer comprises six services:

  • BoundaryEnforcer — Blocks the AI from making values decisions without human approval. Enforces decision boundaries through architectural controls.
  • InstructionPersistenceClassifier — Prevents pattern bias from overriding explicit instructions. Stores organisational directives external to the AI context.
  • CrossReferenceValidator — Validates AI actions against stored policies before execution. Detects conflicts with established rules.
  • ContextPressureMonitor — Tracks session complexity, token usage, and conversation length. Detects degradation in decision quality.
  • MetacognitiveVerifier — Validates reasoning quality before complex operations. Self-checks alignment, coherence, and alternatives.
  • PluralisticDeliberationOrchestrator — Facilitates multi-stakeholder deliberation for values conflicts. Non-hierarchical engagement with documented dissent.
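The three-layer structure above amounts to an interception pattern: the governance layer sits between the agent runtime and execution, so the agent cannot act without passing through it. A minimal sketch follows; all class and field names here are illustrative assumptions, not the framework's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class ProposedAction:
    """An action the agent runtime wants to take (hypothetical shape)."""
    description: str
    involves_values_decision: bool = False


@dataclass
class GovernanceGate:
    """Sits between the agent runtime and execution; the agent cannot skip it."""
    audit_log: list = field(default_factory=list)

    def review(self, action: ProposedAction) -> str:
        # BoundaryEnforcer-style check: values decisions require human approval
        decision = "BLOCKED" if action.involves_values_decision else "ALLOWED"
        # Every check is recorded, whatever the outcome (audit trail)
        self.audit_log.append({"action": action.description, "decision": decision})
        return decision


gate = GovernanceGate()
print(gate.review(ProposedAction("reformat config file")))  # ALLOWED
print(gate.review(ProposedAction("trade cost against safety",
                                 involves_values_decision=True)))  # BLOCKED
```

The key design point is that the gate, not the agent, writes the audit entry: governance is recorded even when the agent would prefer not to invoke it.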

Governance Capabilities

The following examples show governance infrastructure in operation: actual mechanisms, not fictional scenarios.

Sample Audit Log Structure

{
  "timestamp": "2025-10-13T14:23:17.482Z",
  "session_id": "sess_2025-10-13-001",
  "event_type": "BOUNDARY_CHECK",
  "service": "BoundaryEnforcer",
  "decision": "BLOCKED",
  "reason": "Values decision requires human approval",
  "context": {
    "domain": "cost_vs_safety_tradeoff",
    "ai_recommendation": "[redacted]",
    "governance_rule": "TRA-OPS-0003"
  },
  "human_escalation": {
    "required": true,
    "notified": ["senior_engineer@org.com"],
    "status": "pending_approval"
  },
  "compliance_tags": ["EU_AI_ACT_Article14", "human_oversight"]
}
  • Immutability: Audit logs are stored in an append-only database. The AI cannot modify or delete entries.
  • Compliance Evidence: Automatic tagging with regulatory requirements (EU AI Act Article 14, GDPR Article 22, etc.).
  • Export Capabilities: Generate compliance reports for regulators showing human oversight enforcement.

When a regulator asks "How do you prove effective human oversight at scale?", this audit trail provides structural evidence independent of AI cooperation.
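One common way to realise the append-only, tamper-evident property described above is hash chaining, where each log entry commits to its predecessor. This is a minimal illustrative sketch under that assumption, not necessarily the framework's actual storage mechanism.

```python
import hashlib
import json


class AuditLog:
    """Append-only log; any later modification breaks the hash chain."""

    def __init__(self):
        self._entries = []

    def append(self, event: dict) -> None:
        # Each entry's hash covers the previous entry's hash plus its own payload
        prev_hash = self._entries[-1]["hash"] if self._entries else "0" * 64
        payload = json.dumps(event, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        self._entries.append({"event": event, "prev_hash": prev_hash, "hash": entry_hash})

    def verify(self) -> bool:
        # Recompute the chain from the start; any tampering is detected
        prev = "0" * 64
        for e in self._entries:
            payload = json.dumps(e["event"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if e["prev_hash"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True


log = AuditLog()
log.append({"event_type": "BOUNDARY_CHECK", "decision": "BLOCKED"})
log.append({"event_type": "ESCALATION", "status": "pending_approval"})
print(log.verify())  # True
log._entries[0]["event"]["decision"] = "ALLOWED"  # simulated tampering
print(log.verify())  # False
```

In production this chain would live in storage the AI has no write access to; the sketch only shows why tampering becomes detectable rather than preventable by prompt.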

Incident Learning Flow

1. Incident Detected
CrossReferenceValidator flags policy violation
2. Root Cause Analysis
Automated analysis of instruction history, context state
3. Rule Generation
Proposed governance rule to prevent recurrence
4. Human Validation
Governance board reviews and approves new rule
5. Deployment
Rule added to persistent storage, active immediately
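The five-step flow above can be sketched as a simple pipeline. The function names are hypothetical and the automated root-cause analysis is stubbed, since the framework's actual analysis logic is not specified here; the rule fields mirror the example schema in this document.

```python
def detect_incident(action: str, policy_violations: set) -> bool:
    # Step 1 (CrossReferenceValidator): flag actions that conflict with stored policy
    return action in policy_violations


def propose_rule(rule_id: str, trigger: str) -> dict:
    # Steps 2-3: draft a preventive rule from the analysed incident;
    # it stays inactive until a human approves it
    return {"rule_id": f"TRA-OPS-{rule_id}", "trigger": trigger, "status": "proposed"}


def human_validate(rule: dict, approved: bool) -> dict:
    # Step 4: governance board review; only approval activates the rule (step 5)
    rule["status"] = "active" if approved else "rejected"
    return rule


violations = {"override_numeric_instruction"}
if detect_incident("override_numeric_instruction", violations):
    rule = human_validate(propose_rule("0042", "pattern_bias"), approved=True)
    print(rule["status"])  # active
```

The invariant worth noting: no path exists from incident detection to an active rule that skips the human validation step.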

Example Generated Rule

{
  "rule_id": "TRA-OPS-0042",
  "created": "2025-10-13T15:45:00Z",
  "trigger": "incident_27027_pattern_bias",
  "description": "Prevent AI from defaulting to pattern recognition when explicit numeric values specified",
  "enforcement": {
    "service": "InstructionPersistenceClassifier",
    "action": "STORE_AND_VALIDATE",
    "priority": "HIGH"
  },
  "validation_required": true,
  "approved_by": "governance_board",
  "status": "active"
}
Organisational Learning: When one team encounters a governance failure, the entire organisation benefits from automatically generated preventive rules. This scales governance knowledge without manual documentation.
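A rule like TRA-OPS-0042 might be enforced at run time roughly as follows: explicit directives are persisted outside the AI's context and re-validated before any action that would override them. The `InstructionStore` class below is an illustrative assumption, not the InstructionPersistenceClassifier's actual interface.

```python
class InstructionStore:
    """Holds explicit operator directives outside the model's context window."""

    def __init__(self):
        self._directives = {}

    def store(self, key: str, value) -> None:
        # Persisted externally, so context loss or pattern bias cannot erase it
        self._directives[key] = value

    def validate(self, key: str, proposed) -> bool:
        # Pattern-bias check: a proposed value must match the explicit directive,
        # if one exists; unconstrained keys pass by default
        stored = self._directives.get(key)
        return stored is None or stored == proposed


store = InstructionStore()
store.store("timeout_seconds", 30)  # explicit numeric instruction from the operator
print(store.validate("timeout_seconds", 30))  # True: matches the directive
print(store.validate("timeout_seconds", 60))  # False: a pattern default would override it
```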
Conflict Detection: The AI system identifies competing values in a decision context (e.g., efficiency vs. transparency, cost vs. risk mitigation, innovation vs. regulatory compliance). BoundaryEnforcer blocks the autonomous decision and escalates to PluralisticDeliberationOrchestrator.

Stakeholder Identification Process

1. Automatic Detection: System identifies which values frameworks are in tension (utilitarian, deontological, virtue ethics, contractarian, etc.)
2. Stakeholder Mapping: Identifies parties with legitimate interest in the decision (affected parties, domain experts, governance authorities, community representatives)
3. Human Approval: Governance board reviews the stakeholder list, adding or removing stakeholders as appropriate (TRA-OPS-0002)

Non-Hierarchical Deliberation

  • Equal Voice — All stakeholders present perspectives without hierarchical weighting. Technical experts don't automatically override community concerns.
  • Documented Dissent — Minority positions are recorded in full. Dissenting stakeholders can document why the consensus fails their values framework.
  • Moral Remainder — The system documents unavoidable value trade-offs. Even the "correct" decision creates documented harm to other legitimate values.
  • Precedent (Not Binding) — The decision becomes an informative precedent for similar conflicts. But context differences mean precedents guide, not dictate.

Deliberation Record Structure

{
  "deliberation_id": "delib_2025-10-13-003",
  "conflict_type": "efficiency_vs_transparency",
  "stakeholders": [
    {"role": "technical_lead", "position": "favour_efficiency"},
    {"role": "compliance_officer", "position": "favour_transparency"},
    {"role": "customer_representative", "position": "favour_transparency"},
    {"role": "operations_manager", "position": "favour_efficiency"}
  ],
  "decision": "favour_transparency_with_mitigation",
  "rationale": "[documented reasoning]",
  "dissent": {
    "stakeholders": ["technical_lead", "operations_manager"],
    "reasoning": "[efficiency concerns documented in full]"
  },
  "moral_remainder": {
    "acknowledged_harms": "Reduced operational efficiency, increased resource costs",
    "mitigation_measures": "Phased transparency implementation, efficiency monitoring"
  },
  "precedent_status": "informative_not_binding"
}
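A minimal sketch of validating a deliberation record against the structure shown above. The requirement that a record is rejected unless dissent and moral remainder are actually filled in is an assumption on my part, a plausible design choice rather than a documented rule of the framework.

```python
# Field names follow the example deliberation record in this document
REQUIRED_FIELDS = {"deliberation_id", "conflict_type", "stakeholders",
                   "decision", "dissent", "moral_remainder", "precedent_status"}


def is_complete(record: dict) -> bool:
    """A record is acceptable only with documented dissent and moral remainder."""
    if not REQUIRED_FIELDS <= record.keys():
        return False
    # Non-hierarchical principle: minority positions and acknowledged harms
    # must be present, not left empty
    return bool(record["dissent"].get("stakeholders")) and \
           bool(record["moral_remainder"].get("acknowledged_harms"))


record = {
    "deliberation_id": "delib_2025-10-13-003",
    "conflict_type": "efficiency_vs_transparency",
    "stakeholders": [{"role": "technical_lead", "position": "favour_efficiency"}],
    "decision": "favour_transparency_with_mitigation",
    "dissent": {"stakeholders": ["technical_lead"]},
    "moral_remainder": {"acknowledged_harms": "Reduced operational efficiency"},
    "precedent_status": "informative_not_binding",
}
print(is_complete(record))  # True
```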
Key Principle: When legitimate values conflict, no algorithm can determine the "correct" answer. Tractatus ensures decisions are made through inclusive deliberation with full documentation of trade-offs, rather than the AI imposing a single values framework or a decision-maker dismissing stakeholder concerns.

Development Status

Early-Stage Research Framework

Tractatus is a proof-of-concept developed over six months in a single project context (this website). It demonstrates architectural patterns for AI governance but has not undergone independent validation, red-team testing, or multi-organisation deployment.

Validated: The framework successfully governs Claude Code in development workflows. User reports indicate an order-of-magnitude productivity improvement for non-technical operators building production systems.
Not Validated: Performance at enterprise scale, integration complexity with existing systems, effectiveness against adversarial prompts, cross-platform consistency.
Known Limitation: Framework can be bypassed if AI simply chooses not to use governance tools. Voluntary invocation remains a structural weakness requiring external enforcement mechanisms.

EU AI Act Considerations

The EU AI Act (Regulation 2024/1689) establishes human oversight requirements for high-risk AI systems (Article 14). Organisations must ensure AI systems are "effectively overseen by natural persons" with authority to interrupt or disregard AI outputs.

Tractatus addresses this through architectural controls that:

  • Generate immutable audit trails documenting AI decision-making processes
  • Enforce human approval requirements for values-based decisions
  • Provide evidence of oversight mechanisms independent of AI cooperation
  • Document compliance with transparency and record-keeping obligations
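The compliance-evidence point above can be illustrated as a simple export filter over tagged audit entries. The tag names follow the sample audit log earlier in this document; the report format itself is an assumption for illustration.

```python
def compliance_report(entries: list, tag: str) -> list:
    """Return the audit entries relevant to a given regulatory tag."""
    return [e for e in entries if tag in e.get("compliance_tags", [])]


entries = [
    {"event_type": "BOUNDARY_CHECK", "decision": "BLOCKED",
     "compliance_tags": ["EU_AI_ACT_Article14", "human_oversight"]},
    {"event_type": "CONTEXT_PRESSURE", "compliance_tags": []},
]

# Evidence of human-oversight enforcement for an Article 14 review
report = compliance_report(entries, "EU_AI_ACT_Article14")
print(len(report))  # 1
```

Because the entries come from the append-only audit store rather than from the AI itself, the resulting report does not depend on the AI's cooperation.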

This does not constitute legal compliance advice. Organisations should evaluate whether these architectural patterns align with their specific regulatory obligations in consultation with legal counsel.

Maximum penalties under EU AI Act: €35 million or 7% of global annual turnover (whichever is higher) for prohibited AI practices; €15 million or 3% for other violations.

Research Foundations

Tractatus draws on 40+ years of organisational theory research: time-based organisation (Bluedorn, Ancona), knowledge orchestration (Crossan), post-bureaucratic authority (Laloux), structural inertia (Hannan & Freeman).

Core premise: When knowledge becomes ubiquitous through AI, authority must derive from appropriate time horizon and domain expertise rather than hierarchical position. Governance systems must orchestrate decision-making across strategic, operational, and tactical timescales.

View complete organisational theory foundations (PDF)

AI Safety Research: Architectural Safeguards Against LLM Hierarchical Dominance — How Tractatus protects pluralistic values from AI pattern bias while maintaining safety boundaries (PDF).

Scope & Limitations

Tractatus is not:
  • A comprehensive AI safety solution
  • Independently validated or security-audited
  • Tested against adversarial attacks
  • Proven effective across multiple organisations
  • A substitute for legal compliance review
  • A commercial product (research framework, Apache 2.0 licence)
What it offers:
  • Architectural patterns for external governance controls
  • Reference implementation demonstrating feasibility
  • Foundation for organisational pilots and validation studies
  • Evidence that structural approaches to AI safety merit investigation

Further Information

Contact: For pilot partnerships, validation studies, or technical consultation, contact via project information page.