feat: add Guardian Agents section to village-ai.html with philosophy blog link
- New Guardian Agents section between What's Live Today and Limitations
- Four verification phases (response, claim-level, anomaly, adaptive learning)
- Philosophical foundations grid (Wittgenstein, Berlin, Ostrom, Te Ao Māori)
- Guardian Agents card added to What's Live Today grid
- Philosophy blog post link added to Further Reading
- All i18n keys added to en/village-ai.json

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent 871ea0df27
commit f43e31f63d
2 changed files with 105 additions and 2 deletions
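
Reviewer note: the copy added in this commit describes Phase 1 as embedding cosine similarity mapped to confidence tiers, and Phase 4 as an asymmetric evidence burden (85% to loosen, 60% to tighten). The sketch below is only an illustration of that description, not the project's implementation. The tier cut-offs (0.80 and 0.50), the function names, and the reading of "confidence" as a single 0-1 score are all assumptions; only the 85% and 60% figures and the tier labels come from the commit text.

```python
import math
from typing import Sequence

# Hypothetical tier cut-offs: the commit does not state the real values.
VERIFIED_MIN = 0.80
PARTIAL_MIN = 0.50

# Evidence burdens quoted in the phase4_desc copy.
LOOSEN_MIN_CONFIDENCE = 0.85
TIGHTEN_MIN_CONFIDENCE = 0.60


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as no measurable match.
        return 0.0
    return dot / (norm_a * norm_b)


def confidence_tier(score: float) -> str:
    """Map a similarity score to one of the three tiers named in phase1_desc."""
    if score >= VERIFIED_MIN:
        return "verified"
    if score >= PARTIAL_MIN:
        return "partially verified"
    return "unverified"


def may_apply_change(direction: str, confidence: float) -> bool:
    """Asymmetric evidence burden: loosening needs more evidence than tightening."""
    if direction == "loosen":
        return confidence >= LOOSEN_MIN_CONFIDENCE
    if direction == "tighten":
        return confidence >= TIGHTEN_MIN_CONFIDENCE
    raise ValueError("direction must be 'loosen' or 'tighten'")
```

With these assumed cut-offs, a proposed change backed by 0.70 confidence would be accepted if it tightens a threshold and rejected if it loosens one, which is the asymmetry the Phase 4 copy describes.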
@@ -231,7 +231,9 @@
 "story_title": "Story Assistance",
 "story_desc": "Writing prompts, structural advice, narrative enhancement. Cultural context decisions deferred to the storyteller, not resolved by the AI.",
 "memory_title": "AI Memory Transparency",
-"memory_desc": "Members view and control what the AI remembers. Independent consent for triage memory, OCR memory, and summarisation memory."
+"memory_desc": "Members view and control what the AI remembers. Independent consent for triage memory, OCR memory, and summarisation memory.",
+"guardian_title": "Guardian Agents",
+"guardian_desc": "Four-phase verification system using mathematical similarity rather than generative checking. Confidence badges, claim-level source analysis, and security transparency — all tenant-scoped."
 },
 "limitations": {
 "heading": "Limitations and Open Questions",
@@ -253,6 +255,33 @@
 "paper_title": "Architectural Alignment Paper",
 "paper_desc": "Academic paper on governance during training",
 "researcher_title": "For Researchers",
-"researcher_desc": "Open questions, collaboration opportunities, and data access"
+"researcher_desc": "Open questions, collaboration opportunities, and data access",
+"guardian_title": "Guardian Agents Philosophy",
+"guardian_desc": "How Wittgenstein, Berlin, Ostrom, and Te Ao Māori converge in a production governance architecture"
+},
+"guardian": {
+"heading": "Guardian Agents: Verification Without Common-Mode Failure",
+"intro": "The standard approach to AI safety verification — using additional AI models to check AI output — shares a structural flaw with the systems it checks. When both layers are probabilistic, both hallucinate, and both reward confident outputs, they share failure modes. This is <strong>common-mode failure</strong>: the checker confirms the error because it reasons the same way as the system it checks.",
+"approach": "Guardian Agents resolve this by operating in a fundamentally different epistemic domain from the generation layer. The verification mechanism is embedding cosine similarity — a mathematical measurement of how closely an AI response aligns with source material. This is measurement, not interpretation. The watcher is not another speaker. The watcher is a measuring instrument.",
+"phases_heading": "Four Verification Phases",
+"phase1_title": "Response Verification",
+"phase1_desc": "Every AI response measured against source material via embedding similarity. Score-derived confidence tiers (verified, partially verified, unverified) presented to the member — not binary safe/unsafe labels.",
+"phase2_title": "Claim-Level Analysis",
+"phase2_desc": "Individual claims mapped to sources or marked as unmatched. The system does not say \"this claim is wrong\" — it says \"we could not find this in your community's records.\" Absence of evidence is not evidence of absence.",
+"phase3_title": "Anomaly Detection",
+"phase3_desc": "Tenant-scoped baselines detect deviations in AI behaviour. What counts as anomalous in a parish archive differs from a neighbourhood coordination group — these are different values, not different calibrations.",
+"phase4_title": "Adaptive Learning",
+"phase4_desc": "Moderator decisions feed back into threshold tuning. Evidence burden is deliberately asymmetric: loosening safety thresholds requires 85% confidence, tightening requires 60%. A regression monitor watches every approved change.",
+"foundations_heading": "Philosophical Foundations",
+"foundations_intro": "The architectural choices in Guardian Agents are not engineering decisions that happen to align with philosophical positions — they are philosophical commitments that demanded specific engineering responses.",
+"wittgenstein_title": "Wittgenstein",
+"wittgenstein_desc": "Verification and generation must operate in different epistemic domains. The sayable (measurement) verifies what inevitably touches the unsayable (generation).",
+"berlin_title": "Berlin",
+"berlin_desc": "No objective function resolves values conflicts. Tenant-scoped governance prevents hidden value hierarchies. Asymmetric evidence burdens make trade-offs visible.",
+"ostrom_title": "Ostrom",
+"ostrom_desc": "Polycentric governance with genuinely independent verification centres. Moderators, regression monitors, and audit trails — no single authority is root.",
+"teaomaori_title": "Te Ao Māori",
+"teaomaori_desc": "Sovereign processing implements rangatiratanga — the community governs what happens to its own data. The platform exercises kaitiakitanga (guardianship), not ownership.",
+"read_more": "<strong>Full analysis:</strong> <a href=\"/blog-post.html?slug=guardian-agents-philosophy-of-ai-accountability\" class=\"text-teal-700 underline hover:text-teal-900\">Guardian Agents and the Philosophy of AI Accountability</a> traces the complete philosophical genealogy — from early twentieth-century Vienna to contemporary Aotearoa New Zealand — and examines why these traditions converge on the same architectural requirements."
 }
 }
@@ -611,6 +611,76 @@
 <h3 class="font-bold text-gray-900 mb-2" data-i18n="live_today.memory_title">AI Memory Transparency</h3>
 <p class="text-gray-700 text-sm" data-i18n="live_today.memory_desc">Members view and control what the AI remembers. Independent consent for triage memory, OCR memory, and summarisation memory.</p>
 </div>
+<div class="bg-white rounded-lg shadow-sm p-5 border border-gray-200">
+<h3 class="font-bold text-gray-900 mb-2" data-i18n="live_today.guardian_title">Guardian Agents</h3>
+<p class="text-gray-700 text-sm" data-i18n="live_today.guardian_desc">Four-phase verification system using mathematical similarity rather than generative checking. Confidence badges, claim-level source analysis, and security transparency — all tenant-scoped.</p>
+</div>
+</div>
+</section>
+
+<!-- Guardian Agents -->
+<section class="mb-10">
+<h2 class="text-3xl font-bold text-gray-900 mb-4" data-i18n="guardian.heading">Guardian Agents: Verification Without Common-Mode Failure</h2>
+<div class="prose prose-lg text-gray-700">
+<p class="mb-4" data-i18n-html="guardian.intro">
+The standard approach to AI safety verification — using additional AI models to check AI output — shares a structural flaw with the systems it checks. When both layers are probabilistic, both hallucinate, and both reward confident outputs, they share failure modes. This is <strong>common-mode failure</strong>: the checker confirms the error because it reasons the same way as the system it checks.
+</p>
+<p class="mb-4" data-i18n="guardian.approach">
+Guardian Agents resolve this by operating in a fundamentally different epistemic domain from the generation layer. The verification mechanism is embedding cosine similarity — a mathematical measurement of how closely an AI response aligns with source material. This is measurement, not interpretation. The watcher is not another speaker. The watcher is a measuring instrument.
+</p>
+</div>
+
+<h3 class="text-xl font-bold text-gray-900 mt-6 mb-3" data-i18n="guardian.phases_heading">Four Verification Phases</h3>
+<div class="grid grid-cols-1 md:grid-cols-2 gap-4 mb-6">
+<div class="bg-white rounded-lg shadow-sm p-5 border border-gray-200">
+<div class="text-sm font-mono text-teal-700 mb-1">Phase 1</div>
+<h4 class="font-bold text-gray-900 mb-2" data-i18n="guardian.phase1_title">Response Verification</h4>
+<p class="text-gray-700 text-sm" data-i18n="guardian.phase1_desc">Every AI response measured against source material via embedding similarity. Score-derived confidence tiers (verified, partially verified, unverified) presented to the member — not binary safe/unsafe labels.</p>
+</div>
+<div class="bg-white rounded-lg shadow-sm p-5 border border-gray-200">
+<div class="text-sm font-mono text-teal-700 mb-1">Phase 2</div>
+<h4 class="font-bold text-gray-900 mb-2" data-i18n="guardian.phase2_title">Claim-Level Analysis</h4>
+<p class="text-gray-700 text-sm" data-i18n="guardian.phase2_desc">Individual claims mapped to sources or marked as unmatched. The system does not say “this claim is wrong” — it says “we could not find this in your community’s records.” Absence of evidence is not evidence of absence.</p>
+</div>
+<div class="bg-white rounded-lg shadow-sm p-5 border border-gray-200">
+<div class="text-sm font-mono text-teal-700 mb-1">Phase 3</div>
+<h4 class="font-bold text-gray-900 mb-2" data-i18n="guardian.phase3_title">Anomaly Detection</h4>
+<p class="text-gray-700 text-sm" data-i18n="guardian.phase3_desc">Tenant-scoped baselines detect deviations in AI behaviour. What counts as anomalous in a parish archive differs from a neighbourhood coordination group — these are different values, not different calibrations.</p>
+</div>
+<div class="bg-white rounded-lg shadow-sm p-5 border border-gray-200">
+<div class="text-sm font-mono text-teal-700 mb-1">Phase 4</div>
+<h4 class="font-bold text-gray-900 mb-2" data-i18n="guardian.phase4_title">Adaptive Learning</h4>
+<p class="text-gray-700 text-sm" data-i18n="guardian.phase4_desc">Moderator decisions feed back into threshold tuning. Evidence burden is deliberately asymmetric: loosening safety thresholds requires 85% confidence, tightening requires 60%. A regression monitor watches every approved change.</p>
+</div>
+</div>
+
+<h3 class="text-xl font-bold text-gray-900 mt-6 mb-3" data-i18n="guardian.foundations_heading">Philosophical Foundations</h3>
+<p class="text-gray-700 mb-4" data-i18n="guardian.foundations_intro">
+The architectural choices in Guardian Agents are not engineering decisions that happen to align with philosophical positions — they are philosophical commitments that demanded specific engineering responses.
+</p>
+<div class="grid grid-cols-1 md:grid-cols-2 gap-4 mb-6">
+<div class="bg-gray-50 rounded-lg p-4 border border-gray-200">
+<h4 class="font-bold text-gray-900 text-sm mb-1" data-i18n="guardian.wittgenstein_title">Wittgenstein</h4>
+<p class="text-gray-600 text-xs" data-i18n="guardian.wittgenstein_desc">Verification and generation must operate in different epistemic domains. The sayable (measurement) verifies what inevitably touches the unsayable (generation).</p>
+</div>
+<div class="bg-gray-50 rounded-lg p-4 border border-gray-200">
+<h4 class="font-bold text-gray-900 text-sm mb-1" data-i18n="guardian.berlin_title">Berlin</h4>
+<p class="text-gray-600 text-xs" data-i18n="guardian.berlin_desc">No objective function resolves values conflicts. Tenant-scoped governance prevents hidden value hierarchies. Asymmetric evidence burdens make trade-offs visible.</p>
+</div>
+<div class="bg-gray-50 rounded-lg p-4 border border-gray-200">
+<h4 class="font-bold text-gray-900 text-sm mb-1" data-i18n="guardian.ostrom_title">Ostrom</h4>
+<p class="text-gray-600 text-xs" data-i18n="guardian.ostrom_desc">Polycentric governance with genuinely independent verification centres. Moderators, regression monitors, and audit trails — no single authority is root.</p>
+</div>
+<div class="bg-gray-50 rounded-lg p-4 border border-gray-200">
+<h4 class="font-bold text-gray-900 text-sm mb-1" data-i18n="guardian.teaomaori_title">Te Ao Māori</h4>
+<p class="text-gray-600 text-xs" data-i18n="guardian.teaomaori_desc">Sovereign processing implements rangatiratanga — the community governs what happens to its own data. The platform exercises kaitiakitanga (guardianship), not ownership.</p>
+</div>
+</div>
+
+<div class="bg-teal-50 border border-teal-200 rounded-lg p-5">
+<p class="text-teal-900 text-sm" data-i18n-html="guardian.read_more">
+<strong>Full analysis:</strong> <a href="/blog-post.html?slug=guardian-agents-philosophy-of-ai-accountability" class="text-teal-700 underline hover:text-teal-900">Guardian Agents and the Philosophy of AI Accountability</a> traces the complete philosophical genealogy — from early twentieth-century Vienna to contemporary Aotearoa New Zealand — and examines why these traditions converge on the same architectural requirements.
+</p>
 </div>
 </section>
 
@@ -675,6 +745,10 @@
 <h3 class="font-bold text-gray-900 mb-1" data-i18n="further_reading.researcher_title">For Researchers</h3>
 <p class="text-sm text-gray-600" data-i18n="further_reading.researcher_desc">Open questions, collaboration opportunities, and data access</p>
 </a>
+<a href="/blog-post.html?slug=guardian-agents-philosophy-of-ai-accountability" class="block bg-white rounded-lg shadow-sm p-5 border border-gray-200 hover:shadow-md hover:-translate-y-0.5 transition-all">
+<h3 class="font-bold text-gray-900 mb-1" data-i18n="further_reading.guardian_title">Guardian Agents Philosophy</h3>
+<p class="text-sm text-gray-600" data-i18n="further_reading.guardian_desc">How Wittgenstein, Berlin, Ostrom, and Te Ao Māori converge in a production governance architecture</p>
+</a>
 </div>
 </section>
 