A Starting Point

Instead of hoping AI systems "behave correctly," we propose structural constraints where certain decision types require human judgment. These architectural boundaries can adapt to individual, organizational, and societal norms—creating a foundation for bounded AI operation that may scale more safely with capability growth.

Framework Capabilities

Instruction Classification

Quadrant-based classification (STR/OPS/TAC/SYS/STO) with time-persistence metadata tagging
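The classification above can be sketched as a tagging step. This is a minimal illustration, not the framework's code: the quadrant codes are treated as opaque labels (their expansions are not given here), and the keyword heuristic in `classify` is a hypothetical stand-in for the real classifier.

```python
from dataclasses import dataclass
from enum import Enum

class Quadrant(Enum):
    # Codes from the framework; expansions are not defined in this overview,
    # so they are kept as opaque labels.
    STR = "STR"
    OPS = "OPS"
    TAC = "TAC"
    SYS = "SYS"
    STO = "STO"

@dataclass
class ClassifiedInstruction:
    text: str
    quadrant: Quadrant
    persistent: bool  # time-persistence metadata: outlives the current task?

def classify(text: str) -> ClassifiedInstruction:
    # Hypothetical heuristic for illustration: standing rules ("Always ...")
    # are tagged system-level and persistent; everything else tactical.
    standing = text.lower().startswith("always")
    quadrant = Quadrant.SYS if standing else Quadrant.TAC
    return ClassifiedInstruction(text, quadrant, persistent=standing)
```

The persistence flag is what lets a later validation step know that an instruction such as "Always use port 27027" still binds many turns after it was given.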

Cross-Reference Validation

Validates AI actions against explicit user instructions to prevent pattern-based overrides
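A minimal sketch of this check, assuming explicit instructions are recorded as key/value constraints (e.g. `{"port": "27027"}`). The class name matches the framework's component; the internals shown here are illustrative only.

```python
class CrossReferenceValidator:
    """Check proposed action parameters against recorded user instructions."""

    def __init__(self):
        self.constraints = {}  # explicit instructions, keyed by parameter

    def record(self, key: str, value: str) -> None:
        self.constraints[key] = value

    def validate(self, action_params: dict) -> list:
        # Return every parameter that contradicts an explicit instruction;
        # an empty list means the action may proceed.
        return [
            (key, proposed, self.constraints[key])
            for key, proposed in action_params.items()
            if key in self.constraints and self.constraints[key] != proposed
        ]
```

Because the check compares against what the user actually said rather than what the model's training distribution suggests, a pattern-based override surfaces as a concrete conflict instead of silently winning.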

Boundary Enforcement

Implements Tractatus 12.1-12.7 boundaries: value-laden decisions architecturally require human judgment

Pressure Monitoring

Detects degraded operating conditions (token pressure, errors, complexity) and adjusts verification
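The monitoring step can be sketched as mapping pressure signals to a verification tier. The three signals come from the description above; the thresholds and tier names are assumptions chosen for illustration, not values the framework specifies.

```python
from dataclasses import dataclass

@dataclass
class PressureSignals:
    token_usage_ratio: float  # fraction of context window consumed
    recent_error_count: int   # errors in the recent action history
    task_complexity: int      # e.g. number of concurrent subgoals

def verification_level(signals: PressureSignals) -> str:
    """Map operating pressure to a verification tier (thresholds illustrative)."""
    score = 0
    if signals.token_usage_ratio > 0.8:
        score += 2
    elif signals.token_usage_ratio > 0.5:
        score += 1
    score += min(signals.recent_error_count, 3)  # cap error contribution
    if signals.task_complexity > 5:
        score += 1
    if score >= 4:
        return "strict"    # verify every action before execution
    if score >= 2:
        return "elevated"
    return "baseline"
```

The point of the tiering is that verification effort rises exactly when context pressure makes pattern-based overrides most likely.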

Metacognitive Verification

The AI self-checks alignment, coherence, and safety before execution, a structural pause-and-verify step
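A minimal sketch of the pause-and-verify gate. The three check categories (alignment, coherence, safety) come from the description above; the callable-based interface and function name are hypothetical, not the framework's API.

```python
def metacognitive_gate(action, checks):
    """Run every self-check before execution; pause if any fails.

    Each check is a callable returning (passed: bool, reason: str).
    """
    failures = [
        reason
        for check in checks
        for passed, reason in [check(action)]
        if not passed
    ]
    if failures:
        return False, failures  # pause: surface the reasons to a human
    return True, []             # verified: safe to execute
```

Structurally, the gate runs before the action rather than auditing it afterward, which is what makes the pause enforceable rather than advisory.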

Pluralistic Deliberation

Multi-stakeholder values deliberation without imposed hierarchy, facilitating human decision-making when values are incommensurable

Real-World Validation

Framework validated in 6-month deployment across ~500 sessions with Claude Code

Pattern Bias Incident Interactive Demo

The 27027 Incident

Real production incident where Claude Code defaulted to port 27017 (training pattern) despite explicit user instruction to use port 27027. CrossReferenceValidator detected the conflict and blocked execution—demonstrating how pattern recognition can override instructions under context pressure.
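The incident above can be reconstructed in miniature. The instruction, the two port numbers, and the blocking behavior are from the write-up; the regex matching logic below is an illustrative sketch, not the CrossReferenceValidator's actual code.

```python
import re

def check_port_instruction(instruction: str, command: str) -> str:
    """Block a command whose port contradicts an explicit instruction."""
    required = re.search(r"\b(\d{4,5})\b", instruction)
    used = re.search(r"\b(\d{4,5})\b", command)
    if required and used and required.group(1) != used.group(1):
        return (f"BLOCK: instruction pins port {required.group(1)}, "
                f"action uses {used.group(1)}")
    return "ALLOW"

# The 27027 incident in miniature: the model's training-pattern default
# (MongoDB's standard port 27017) conflicts with the explicit instruction.
verdict = check_port_instruction("use port 27027", "mongod --port 27017")
```

Here `verdict` reports a block, because the explicit instruction (27027) and the pattern-default action (27017) disagree; the same call with a 27027 command returns "ALLOW".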

Why this matters: This failure mode gets worse as models improve—stronger pattern recognition means stronger override tendency. Architectural constraints remain necessary regardless of capability level.

View Interactive Demo

Additional case studies and research findings documented in technical papers

Browse Case Studies →