fix: update executive brief copyright to match LICENSE file
Changed copyright from "Tractatus AI Safety Framework" to full Apache 2.0 license text naming John G Stroh as copyright holder.

- Added complete Apache 2.0 license boilerplate
- Matches LICENSE file format exactly
- Ensures legal clarity of copyright ownership
- PDF regenerated with correct copyright

Note: Not deployed to production (document for manual distribution)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in: parent 1ef31c076e, commit 573fa8726d
2 changed files with 24 additions and 67 deletions
@@ -1,9 +1,9 @@
# Architectural Externality in AI Governance: Research Brief
# Architectural Externality in AI Governance: Research Brief and Where to from here

**Prepared for:** Executive Discussion on AI Safety Architecture
**Prepared for:** Discussion on AI Safety Architecture

**Date:** October 2025

**Context:** Research framework exploring structural approaches to LLM governance

**Status:** Early-stage proof-of-concept (6 months development, single project validation)
**Status:** Early-stage proof-of-concept

---
@@ -13,35 +13,21 @@ Current AI governance mechanisms—policy documents, ethics training, usage guid

This creates architectural vulnerability independent of model capability or fine-tuning. The more sophisticated the AI becomes, the better it can rationalise why governance controls don't apply in particular situations. No amount of prompt engineering resolves this: if the AI determines oversight isn't necessary, oversight doesn't occur.

For organisations deploying AI at enterprise scale—particularly those subject to EU AI Act Article 14 (human oversight requirements) or operating in high-risk domains—voluntary governance models are structurally insufficient. When a regulator asks "How do you prove human oversight operated?" the honest answer is often "The AI told us it did."
For organisations deploying AI at enterprise scale—particularly those subject to EU AI Act Article 14 (human oversight requirements) or operating in high-risk domains—voluntary governance models are structurally insufficient.

This isn't theoretical. We've observed pattern recognition bias overriding explicit instructions (the "27027 failure"), governance frameworks being bypassed through conversational tactics, and AI systems confidently generating structurally unsafe code while assuring operators all safety checks passed. These aren't model failures—they're architectural gaps.
For additional background information, see the Q&As that address this issue directly: [Frequently Asked Questions | Tractatus AI Safety Framework](https://agenticgovernance.digital/faq.html)

---

## Research Question: Architectural Externality

Tractatus explores whether governance can be made architecturally external to AI systems—difficult to bypass not through better prompts or training, but through system design that places control points outside the AI's discretion.
Our Tractatus model explores whether governance can be made architecturally external to AI systems—difficult to bypass not through better prompts or training, but through system design that places control points outside the AI's discretion. For an overview of Tractatus concepts, there is a comprehensive Glossary of Terms in the Technical Reference section of the document library: [Framework Documentation | Tractatus AI Safety](https://agenticgovernance.digital/docs.html)

The approach: **governance as infrastructure layer**, not conversation participant.

Rather than asking AI to invoke governance tools or recognise boundary violations, we architect governance as a separate layer that intercepts operations, validates decisions against stored policies, and enforces human approval requirements independent of AI cooperation. The AI doesn't decide whether governance applies; the architecture enforces it.
Rather than asking AI to invoke governance tools or recognise boundary violations, we architect governance as a separate layer that intercepts operations, validates decisions against stored policies, and **enforces human approval requirements independent of AI cooperation**. The AI doesn't decide whether governance applies; the three-layer architecture enforces it. For additional background information: [System Architecture | Tractatus AI Safety Framework](https://agenticgovernance.digital/architecture.html)

**Three-layer architecture:**

1. **Agent Runtime Layer** — Any LLM system (Copilot, Claude Code, custom agents, framework tools like LangChain or CrewAI). The system being governed.

2. **Governance Layer** — Six autonomous services operating external to the AI runtime:
   - BoundaryEnforcer: Blocks values decisions without human approval
   - InstructionPersistenceClassifier: Prevents pattern bias from overriding explicit directives
   - CrossReferenceValidator: Validates actions against stored policies
   - ContextPressureMonitor: Detects session degradation
   - MetacognitiveVerifier: Validates reasoning before complex operations
   - PluralisticDeliberationOrchestrator: Facilitates multi-stakeholder deliberation for values conflicts

3. **Persistent Storage Layer** — Immutable audit logs, governance rules, and instruction history that the AI cannot alter through prompts.

**Critical design principle:** Governance services operate autonomously. They don't ask the AI "should I check this decision?" They intercept, validate, and block by default unless criteria are met. The AI can attempt to bypass governance, but doing so requires architectural changes (code modification) rather than conversational persuasion.
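The intercept-validate-block flow can be sketched in a few lines of Python. This is an illustrative sketch only, not the framework's actual implementation: the check names echo the services listed in the brief, but the `Operation` schema and method signatures are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Operation:
    kind: str                      # e.g. "code_edit", "values_decision"
    payload: str
    human_approved: bool = False

@dataclass
class GovernanceLayer:
    """Sits between the agent runtime and execution; the AI never invokes it."""
    audit_log: list = field(default_factory=list)  # append-only in a real system

    def intercept(self, op: Operation) -> bool:
        # Block by default: an operation proceeds only if every check passes.
        allowed = all(check(op) for check in (self._boundary_enforcer,
                                              self._cross_reference_validator))
        self.audit_log.append((op.kind, "allowed" if allowed else "blocked"))
        return allowed

    def _boundary_enforcer(self, op: Operation) -> bool:
        # Values decisions require explicit human approval.
        return op.kind != "values_decision" or op.human_approved

    def _cross_reference_validator(self, op: Operation) -> bool:
        # Placeholder policy check; the real service validates against stored rules.
        return "DROP TABLE" not in op.payload

gov = GovernanceLayer()
print(gov.intercept(Operation("values_decision", "change data retention")))  # False: blocked
print(gov.intercept(Operation("code_edit", "refactor module")))              # True: allowed
```

The key property the sketch illustrates: the AI's output is just `Operation` data; whether the checks run is decided by the layer, not by the model.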
---
@@ -49,13 +35,7 @@ Rather than asking AI to invoke governance tools or recognise boundary violation

**What We've Demonstrated:**

The framework successfully governs Claude Code across six months of production website development (this project). The user reports an order-of-magnitude productivity improvement for a non-technical operator building production systems. The governance layer:

- Caught pattern recognition bias attempting to override explicit numeric specifications (27027-style failures)
- Enforced human approval for values decisions in 100% of cases where BoundaryEnforcer triggered
- Generated a complete audit trail independent of AI cooperation
- Prevented context degradation through automated session monitoring
- Maintained instruction persistence across conversation compaction/restart cycles

The Tractatus framework successfully governs Claude Code development with order-of-magnitude productivity improvement for a non-technical operator building production systems. For additional background information: [Production Validation](https://agenticgovernance.digital/leader.html)
**What We Haven't Demonstrated:**
@@ -82,39 +62,13 @@ When knowledge was scarce, hierarchical authority made organisational sense. Exp

Answer (from organisational theory): **appropriate time horizon and legitimate stakeholder representation**, not information asymmetry.

We draw on:

- **Time-based organisation** (Bluedorn, Ancona): Strategic/operational/tactical decisions require different time horizons. AI operating at tactical speed shouldn't override strategic decisions made at the appropriate temporal scale.

- **Knowledge orchestration** (Crossan): Authority shifts from knowledge control to knowledge coordination. Governance systems orchestrate decision-making rather than gatekeep information.

- **Post-bureaucratic organisation** (Laloux): As organisations evolve beyond command-and-control, authority must derive from appropriate expertise and stakeholder representation, not hierarchical position.

- **Structural inertia** (Hannan & Freeman): When governance is voluntary (embedded in culture/process), system evolution can bypass it. Architectural governance creates structural constraints that resist erosion.

This isn't abstract philosophy. It's a practical framework design informed by research on how organisations actually function when expertise becomes widely distributed.

The PluralisticDeliberationOrchestrator specifically addresses values pluralism: when legitimate values conflict (efficiency vs. transparency, innovation vs. risk mitigation), no algorithm can determine the "correct" answer. The system facilitates multi-stakeholder deliberation with documented dissent and moral remainder—acknowledging that even optimal decisions create unavoidable harm to other legitimate values.
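The deliberation record described above can be sketched as a data structure rather than an algorithm: the system does not compute a "correct" answer, it records who dissented and which legitimate values the chosen option harms. A minimal sketch, assuming hypothetical stakeholder and field names (the real PluralisticDeliberationOrchestrator schema is not shown in this brief):

```python
from dataclasses import dataclass

@dataclass
class Position:
    stakeholder: str
    value: str          # the legitimate value this position defends
    stance: str

def deliberate(positions: list, decision: str) -> dict:
    """Record a values decision with documented dissent and moral remainder."""
    dissent = [p for p in positions if p.stance != decision]
    return {
        "decision": decision,
        "dissent": [(p.stakeholder, p.value) for p in dissent],
        # Moral remainder: values harmed even by the chosen option.
        "moral_remainder": sorted({p.value for p in dissent}),
    }

record = deliberate(
    [Position("engineering", "efficiency", "ship now"),
     Position("compliance", "transparency", "delay for review"),
     Position("product", "innovation", "ship now")],
    decision="delay for review",
)
```

The point of the structure is that dissent and moral remainder survive in the record; the decision never erases the values it overrode.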
---

## Interactive Demonstrations

Three capability demonstrations show governance infrastructure in operation (not fictional scenarios):

**1. Audit Trail & Compliance Evidence Generation**

Shows the immutable logging structure, automatic regulatory tagging (EU AI Act Article 14, GDPR Article 22), and compliance report generation. When a regulator asks "How do you prove effective human oversight at scale?", this infrastructure provides structural evidence independent of AI self-reporting.
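One common way to make an audit log tamper-evident is hash chaining: each entry embeds the hash of the previous one, so altering any entry breaks verification. A minimal sketch of that idea with the regulatory tags named in the brief; the `AuditTrail` class and its schema are illustrative assumptions, not the framework's actual storage layer:

```python
import hashlib
import json
import time

class AuditTrail:
    """Append-only log; each entry chains the previous hash so tampering is detectable."""

    def __init__(self):
        self.entries = []

    def append(self, event: str, tags: list) -> dict:
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {"event": event, "tags": tags, "ts": time.time(), "prev": prev}
        body["hash"] = hashlib.sha256(
            json.dumps({k: body[k] for k in ("event", "tags", "ts", "prev")},
                       sort_keys=True).encode()).hexdigest()
        self.entries.append(body)
        return body

    def verify(self) -> bool:
        prev = "0" * 64
        for e in self.entries:
            recomputed = hashlib.sha256(
                json.dumps({k: e[k] for k in ("event", "tags", "ts", "prev")},
                           sort_keys=True).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != recomputed:
                return False
            prev = e["hash"]
        return True

log = AuditTrail()
log.append("human approval recorded for values decision", ["EU AI Act Art. 14"])
log.append("automated decision flagged for review", ["GDPR Art. 22"])
print(log.verify())  # True
```

Because the AI only emits events and never holds the chain, the evidence of oversight does not depend on AI self-reporting.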
**2. Continuous Improvement: Incident → Rule Creation**

Demonstrates the organisational learning flow: incident detection → root cause analysis → automated rule generation → human validation → deployment. When one team encounters a governance failure, the entire organisation benefits from automatically generated preventive rules. This scales governance knowledge without manual documentation overhead.
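The incident-to-rule step of that flow can be sketched as a small transformation, with the human-validation gate made explicit. The field names and the `INC-27027` identifier are hypothetical, chosen to echo the 27027 failure mentioned earlier; only the flow itself comes from the brief:

```python
def incident_to_rule(incident: dict) -> dict:
    """Turn an analysed incident into a candidate rule awaiting human validation."""
    return {
        "trigger": incident["root_cause"],
        "action": "block_and_escalate",
        "status": "pending_human_validation",  # never auto-deployed
        "source_incident": incident["id"],
    }

candidate = incident_to_rule({
    "id": "INC-27027",
    "root_cause": "pattern bias overrode explicit numeric specification",
})
```

The design choice worth noting is the `status` field: generated rules enter the rule store blocked, and only a human validation step flips them to deployed.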
**3. Pluralistic Deliberation: Values Conflict Resolution**

Shows the stakeholder identification process, non-hierarchical deliberation structure, documented dissent recording, and moral remainder acknowledgment. Addresses the reality that many "AI safety" questions are actually values conflicts where multiple legitimate perspectives exist.

The demonstrations emphasise mechanisms, not outcomes. They show what the infrastructure does, not what decisions it should make.

**Interactive governance demonstrations:** The [Leader Page](https://agenticgovernance.digital/leader.html) includes three working examples showing audit trail generation, incident-to-rule learning, and pluralistic deliberation in operation.
---
@@ -136,6 +90,7 @@ Tractatus addresses this through:

## What This Is Not

**Not:**

- A comprehensive AI safety solution
- Independently validated or security-audited
- Tested against adversarial attacks
@@ -144,6 +99,7 @@ Tractatus addresses this through:

- A commercial product (research framework, Apache 2.0 licence)

**What It Offers:**

- Architectural patterns for external governance controls
- Reference implementation demonstrating feasibility
- Foundation for organisational pilots and validation studies
@@ -153,7 +109,7 @@ We make no claims about solving AI safety. We've explored whether architectural

---

## Research Validation Path Forward
## Research Validation Path - This is the Question

To move from proof-of-concept to a validated architectural approach requires:
@@ -167,23 +123,24 @@ To move from proof-of-concept to validated architectural approach requires:

5. **Industry Collaboration** — Work with LLM platform providers (Microsoft, Anthropic, OpenAI) to integrate governance interception at the runtime level rather than the application layer

This isn't a 6-month project. It's a 2-3 year validation programme requiring resources beyond single-researcher capacity.
This is a validation programme requiring resources beyond single-researcher capacity.

The question isn't "Does Tractatus solve AI governance?" but rather "Do these architectural patterns warrant investment in rigorous validation?"

---

## Discussion Context
## License

This brief provides the technical foundation for an exploratory conversation. The framework exists, demonstrates feasibility, and reveals both promise and significant limitations. Whether it's relevant to your organisation's context is an open question.
Copyright 2025 John G Stroh

We're not pitching solutions. We're presenting research that may inform thinking about governance architecture, whether you adopt Tractatus specifically or develop alternative approaches addressing similar structural problems.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

The demonstrations show infrastructure capabilities. The organisational theory provides a principled foundation. The validation gaps acknowledge honest limitations. The research question—can governance be made architecturally external?—remains open but promising.
http://www.apache.org/licenses/LICENSE-2.0

**Contact for technical documentation:**
Framework specifications, implementation patterns, and research foundations available at https://agenticgovernance.digital

---

**Page 1 of 2**
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
BIN EXECUTIVE_BRIEF_GOVERNANCE_EXTERNALITY.pdf (normal file)
Binary file not shown.