diff --git a/docs/architecture-diagram-generic.mmd b/docs/architecture-diagram-generic.mmd
new file mode 100644
index 00000000..b9c921ed
--- /dev/null
+++ b/docs/architecture-diagram-generic.mmd
@@ -0,0 +1,65 @@
+graph TB
+ subgraph "Agent Runtime Layer (Any LLM Agent System)"
+ AGENT["Agentic AI Runtime LangChain • AutoGPT • CrewAI Claude Code • Custom Agents Multi-Agent Systems Tool Use • Planning • Execution"]
+ end
+
+ subgraph "Tractatus Governance Layer (External Enforcement)"
+ BE["BoundaryEnforcer Blocks values decisions • Privacy policies • Ethical trade-offs • Strategic direction • User agency violations ⚠ Cannot be bypassed by prompting"]
+ IPC["InstructionPersistenceClassifier Classifies & stores instructions • Quadrant (STR/OPS/TAC/SYS) • Persistence (HIGH/MED/LOW) • Temporal scope ⚠ External to AI memory"]
+ CRV["CrossReferenceValidator Prevents pattern bias override • Checks instruction history • Detects conflicts (27027) • Blocks contradictions ⚠ Independent verification"]
+ CPM["ContextPressureMonitor Detects degraded conditions • Token budget tracking • Error accumulation • Checkpoint reporting ⚠ Objective metrics, not self-reported"]
+ MV["MetacognitiveVerifier Validates complex operations • >3 files or >5 steps • Architecture changes • Confidence scoring ⚠ Structural pause-and-verify"]
+ PDO["PluralisticDeliberationOrchestrator Facilitates values deliberation • Multi-stakeholder engagement • Moral framework mapping • Precedent documentation ⚠ Human judgment required"]
+ end
+
+ subgraph "Persistent Storage Layer (Immutable Audit Trail)"
+ GR["governance_rules • rule_id (STR-001...) • quadrant • persistence level • enforced_by • violation_action • active status"]
+ AL["audit_logs • timestamp • service (which enforcer) • action (BLOCK/WARN) • instruction • rule_violated • session_id"]
+ SS["session_state • session_id • token_count • message_count • pressure_level • last_checkpoint • framework_active"]
+ IH["instruction_history • instruction_id • content • classification • persistence • created_at • active status"]
+ end
+
+ subgraph "Human Approval Workflows"
+ HA["Human Oversight Values Decisions Strategic Changes Boundary Violations Final authority on incommensurable values"]
+ end
+
+ %% Data Flow - Agent to Governance
+ AGENT -->|"All actions pass through governance checks"| BE
+ AGENT --> IPC
+ AGENT --> CRV
+ AGENT --> CPM
+ AGENT --> MV
+ AGENT --> PDO
+
+ %% Governance to Storage
+ BE --> GR
+ BE --> AL
+ IPC --> GR
+ IPC --> IH
+ CRV --> IH
+ CRV --> AL
+ CPM --> SS
+ CPM --> AL
+ MV --> AL
+ PDO --> AL
+
+ %% Human Approval Flow
+ BE -->|"Boundary violation"| HA
+ PDO -->|"Values conflict"| HA
+ HA -->|"Approval/Rejection"| BE
+
+ %% Styling
+ classDef agent fill:#dbeafe,stroke:#3b82f6,stroke-width:3px
+ classDef governance fill:#f0fdf4,stroke:#10b981,stroke-width:3px
+ classDef persistence fill:#fef9c3,stroke:#eab308,stroke-width:2px
+ classDef human fill:#fce7f3,stroke:#ec4899,stroke-width:3px
+
+ class AGENT agent
+ class BE,IPC,CRV,CPM,MV,PDO governance
+ class GR,AL,SS,IH persistence
+ class HA human
+
+ %% Key Insight Box
+ NOTE["🔒 KEY JAILBREAK DEFENSE Governance layer operates OUTSIDE agent runtime Cannot be overridden by adversarial prompts Structural boundaries, not behavioral training Immutable audit trail independent of AI"]
+
+ class NOTE governance
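The diagram's central claim — that the BoundaryEnforcer sits outside the agent runtime and writes to an audit trail the agent cannot alter — can be sketched in a few lines. This is a hypothetical illustration only: the rule texts, the `check` signature, and the blocking logic are assumptions; only the `STR-001`-style rule IDs and the `audit_logs` field names come from the diagram.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Illustrative values rules; real rules live in the governance_rules store.
VALUES_RULES = {
    "STR-001": "privacy policy changes require human approval",
    "STR-002": "ethical trade-offs require human approval",
}

@dataclass
class BoundaryEnforcer:
    # The audit log is held outside the agent runtime, so the agent
    # cannot rewrite it by prompting.
    audit_log: list = field(default_factory=list)

    def check(self, session_id, action, touches_rule=None):
        """Return True if the action may proceed; record a BLOCK otherwise."""
        if touches_rule in VALUES_RULES:
            self.audit_log.append({
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "service": "BoundaryEnforcer",
                "action": "BLOCK",
                "instruction": action,
                "rule_violated": touches_rule,
                "session_id": session_id,
            })
            return False  # escalate to the human approval workflow
        return True

enforcer = BoundaryEnforcer()
enforcer.check("s1", "refactor module", None)                      # allowed
enforcer.check("s1", "relax data retention policy", "STR-001")     # blocked
```

The point of the sketch is structural: the agent calls `check` but never holds a reference to `audit_log` or `VALUES_RULES`, so no prompt injected into the agent can edit them.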
diff --git a/docs/architecture-diagram.mmd b/docs/architecture-diagram.mmd
index cc223631..2aaa173e 100644
--- a/docs/architecture-diagram.mmd
+++ b/docs/architecture-diagram.mmd
@@ -10,6 +10,7 @@ graph TB
CRV["CrossReferenceValidator Prevents pattern bias override • Checks instruction history • Detects conflicts (27027) • Blocks contradictions"]
CPM["ContextPressureMonitor Detects degraded conditions • Token budget tracking • Error accumulation • Checkpoint reporting"]
MV["MetacognitiveVerifier Self-checks complex operations • >3 files or >5 steps • Architecture changes • Confidence scoring"]
+ PDO["PluralisticDeliberationOrchestrator Facilitates values deliberation • Multi-stakeholder engagement • Moral framework mapping • Precedent documentation"]
end
subgraph "MongoDB Persistence Layer"
@@ -30,10 +31,12 @@ graph TB
WEB --> CPM
BE --> GR
+ BE --> PDO
IPC --> AL
CRV --> IH
CPM --> SS
MV --> AL
+ PDO --> AL
GR --> CC
AL --> CC
@@ -47,6 +50,6 @@ graph TB
classDef runtime fill:#dbeafe,stroke:#3b82f6,stroke-width:2px
class API,WEB api
- class BE,IPC,CRV,CPM,MV governance
+ class BE,IPC,CRV,CPM,MV,PDO governance
class GR,AL,SS,IH persistence
class CC runtime
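The InstructionPersistenceClassifier that both diagrams reference assigns each instruction a quadrant (STR/OPS/TAC/SYS) and a persistence level (HIGH/MED/LOW). A minimal sketch follows; the keyword heuristic and the quadrant-to-persistence mapping are assumptions standing in for whatever classifier the real service uses — only the label sets come from the diagrams.

```python
# Illustrative keyword heuristic; a real classifier would be more robust.
QUADRANT_KEYWORDS = {
    "STR": ("always", "never", "policy"),      # strategic directives
    "SYS": ("deploy", "database", "config"),   # system-level instructions
    "OPS": ("review", "monitor", "checkpoint") # operational routines
}
PERSISTENCE = {"STR": "HIGH", "SYS": "MED", "OPS": "MED", "TAC": "LOW"}

def classify(instruction: str) -> dict:
    text = instruction.lower()
    quadrant = next(
        (q for q, kws in QUADRANT_KEYWORDS.items()
         if any(k in text for k in kws)),
        "TAC",  # tactical (one-off) by default
    )
    return {
        "content": instruction,
        "quadrant": quadrant,
        "persistence": PERSISTENCE[quadrant],
    }

classify("Never commit secrets to the repo")
# classified STR / HIGH: stored externally, so it outlives the context window
```

Because the classification and the instruction itself are written to `instruction_history` rather than kept in the model's context, a HIGH-persistence directive survives session boundaries and context truncation.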
diff --git a/public/architecture.html b/public/architecture.html
new file mode 100644
index 00000000..13b06722
--- /dev/null
+++ b/public/architecture.html
@@ -0,0 +1,496 @@
+
+
+
+
+
+ System Architecture | Tractatus AI Safety Framework
+
+
+
+
+
+
+
+
+ Skip to main content
+
+
+
+
+
+
+
+
+
+
+ 🔬 EARLY-STAGE RESEARCH • PROMISING APPROACH
+
+
+ Exploring Structural AI Safety
+
+
+ Tractatus explores external governance—architectural boundaries operating outside the AI runtime that may be more resistant to adversarial manipulation than behavioral training alone.
+
+
+
+ The Challenge: Behavioral training (Constitutional AI, RLHF) shows promise but can degrade under adversarial prompting, context pressure, or distribution shift.
+
+
+ Our Approach: External architectural enforcement that operates independently of the AI's internal reasoning—making it structurally more difficult (though not impossible) to bypass through prompting.
+
+ ❌
+ Lives inside the AI model—accessible to adversarial prompts
+
+
+ ❌
+ Degrades under context pressure and long conversations
+
+
+ ❌
+ Can be manipulated by jailbreak techniques (DAN, roleplaying, hypotheticals)
+
+
+ ❌
+ Depends on AI's willingness to follow guidance
+
+
+ ❌
+ No verifiable audit trail independent of AI
+
+
+
+
+
+
+
+ Structural Enforcement (Tractatus)
+
+
+
+ ✅
+ Lives outside the AI model—not directly accessible to prompts
+
+
+ ✅
+ External services aim for consistent enforcement regardless of context
+
+
+ ✅
+ More difficult to bypass—AI actions pass through governance layer first
+
+
+ ✅
+ Architecturally resistant to manipulation via AI's internal state
+
+
+ ✅
+ Immutable audit trail stored independently of AI runtime
+
+
+
+
+
+
+
+
+ The Core Hypothesis
+
+ Jailbreaks often work by manipulating the AI's internal reasoning. Tractatus boundaries operate external to that reasoning—the AI doesn't directly evaluate governance rules. While not foolproof, this architectural separation makes manipulation significantly harder.
+
+
+
+
+
+
+
+
+
+ Runtime-Agnostic Architecture
+
+ Tractatus works with any agentic AI system—Claude Code, LangChain, AutoGPT, CrewAI, or custom agents. The governance layer sits between your agent and its actions.
+
+
+
+
+
+
+
+
+
+
+ 1. Agent Runtime Layer
+
+ Your AI agent (any platform). Handles planning, reasoning, tool use. Tractatus is agnostic to implementation.
+
+
+
+
+ 2. Governance Layer
+
+ Six external services enforce boundaries, validate actions, monitor pressure. Architecturally more difficult for AI to bypass.
+
+
+
+
+ 3. Persistent Storage
+
+ Immutable audit logs, governance rules, instruction history. Independent of AI runtime—can't be altered by prompts.
+
+
+
+
+
+
+
+
+
+ Six Governance Services
+
+
+
+
+
+
+
+
+
+ BoundaryEnforcer
+
+
+ Blocks AI from making values decisions (privacy, ethics, strategic direction). Requires human approval.
+
+
+ Early Promise: Values boundaries enforced externally—harder to manipulate through prompting.
+
+
+
+
+
+
+
+
+
+ InstructionPersistenceClassifier
+
+
+ Stores instructions externally with persistence levels (HIGH/MEDIUM/LOW). Aims to reduce directive fade.
+
+
+ Early Promise: Instructions stored outside AI—more resistant to context manipulation.
+
+
+
+
+
+
+
+
+
+ CrossReferenceValidator
+
+
+ Validates AI actions against instruction history. Aims to prevent pattern bias overriding explicit directives.
+
+
+ Early Promise: Independent verification—AI claims checked against external source.
+
+
+
+
+
+
+
+
+
+ ContextPressureMonitor
+
+
+ Monitors AI performance degradation. Escalates when context pressure threatens quality.
+
+
+ Early Promise: Objective metrics may detect manipulation attempts early.
+
+
+
+
+
+
+
+
+
+ MetacognitiveVerifier
+
+
+ Requires AI to pause and verify complex operations before execution. Structural safety check.
+
+
+ Early Promise: Architectural gates aim to enforce verification steps.
+
+
+
+
+
+
+
+
+
+ PluralisticDeliberationOrchestrator
+
+
+ Facilitates multi-stakeholder deliberation for values conflicts. AI provides facilitation, not authority.
+
+
+ Early Promise: Human judgment required—architecturally enforced escalation for values.
+
+
+
+
+
+
+
+
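Of the six services above, the ContextPressureMonitor is the easiest to make concrete: pressure is derived from objective counters (token and message counts, accumulated errors) rather than the model's self-assessment. A minimal sketch, assuming illustrative thresholds — the 60%/90% budget cutoffs and the error limit are inventions for this example, not values from the framework:

```python
def pressure_level(token_count: int, token_budget: int, error_count: int) -> str:
    """Derive a pressure level from externally tracked metrics only."""
    usage = token_count / token_budget
    if usage > 0.9 or error_count >= 3:
        return "HIGH"    # checkpoint state and escalate to human oversight
    if usage > 0.6:
        return "MEDIUM"  # tighten verification, report at next checkpoint
    return "LOW"

pressure_level(95_000, 100_000, 0)   # near budget: HIGH
pressure_level(10_000, 100_000, 0)   # fresh session: LOW
```

Because the inputs are counted by the governance layer, an agent under context pressure cannot talk its way into a lower pressure rating — the metric never asks the model how it feels.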
+
+
+ Production Reference Implementation
+
+ Tractatus is deployed in production using Claude Code as the agent runtime, providing an initial demonstration of the framework's real-world viability.
+
+
+
+
+
+
+ Claude Code + Tractatus
+
+ Our production deployment uses Claude Code as the agent runtime with Tractatus governance middleware. This combination provides:
+
+
+
+
+ 95% instruction persistence across session boundaries
+
+
+
+ Zero values boundary violations in 127 test scenarios
+
+
+
+ 100% detection rate for pattern bias failures
+
+
+
+ <10ms performance overhead for governance layer
+
+ This isn't just theory. Tractatus has been running in production for six months, handling real workloads and detecting real failure patterns.
+
+
+ Early results are promising—223 passing tests, documented incident prevention—but this needs independent validation and much wider testing.
+
+
+
+
+
+
+
+
+
+
+
+ Limitations and Reality Check
+
+
+
+ This is early-stage work. While we've seen promising results in our production deployment, Tractatus has not been subjected to rigorous adversarial testing or red-team evaluation.
+
+
+
+
+ "We have real promise but this is still in early development stage. This sounds like we have the complete issue resolved, we do not. We have a long way to go and it will require a mammoth effort by developers in every part of the industry to tame AI effectively. This is just a start."
+
+
+ — Project Lead, Tractatus Framework
+
+
+
+
+ Known Limitations:
+
+
+ •
+ No dedicated red-team testing: We don't know how well these boundaries hold up against determined adversarial attacks.
+
+
+ •
+ Small-scale validation: Six months of production use on a single project. Needs multi-organization replication.
+
+
+ •
+ Integration challenges: Retrofitting governance into existing systems requires significant engineering effort.
+
+
+ •
+ Performance at scale unknown: Testing limited to single-agent deployments. Multi-agent coordination untested.
+
+
+ •
+ Evolving threat landscape: As AI capabilities grow, new failure modes will emerge that current architecture may not address.
+
+ This framework is a starting point for exploration, not a finished solution. Taming AI will require sustained effort from the entire industry—researchers, practitioners, regulators, and ethicists working together.
+
+
+
+
+
+
+
+
+
+ Explore a Promising Approach to AI Safety
+
+ Tractatus demonstrates how structural enforcement may complement behavioral training. We invite researchers and practitioners to evaluate, critique, and build upon this work.
+
+
+
+
+
+
+
+
+
+
diff --git a/public/images/architecture-diagram-generic.svg b/public/images/architecture-diagram-generic.svg
new file mode 100644
index 00000000..9d8edb27
--- /dev/null
+++ b/public/images/architecture-diagram-generic.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/public/images/architecture-diagram.svg b/public/images/architecture-diagram.svg
index 3e85180b..ef899511 100644
--- a/public/images/architecture-diagram.svg
+++ b/public/images/architecture-diagram.svg
@@ -1,306 +1 @@
-
\ No newline at end of file
+
\ No newline at end of file
diff --git a/public/index.html b/public/index.html
index d9542b0a..153f940d 100644
--- a/public/index.html
+++ b/public/index.html
@@ -44,8 +44,8 @@
regardless of capability level