fix: update village-ai.html — replace stale 3B/8B architecture with current

Replaced Two-Model Architecture (3B/8B) with Specialized Model Architecture
(five production 14B models by community type). Updated Training Tiers:
Tier 2 now describes product-type specialization, not per-tenant adapters.
Fixed infrastructure section: WireGuard inference is live, not planned;
model size corrected to 14B. Updated limitations and production timeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
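The routing behaviour this commit documents (one specialized model per tenant product type, with the community generalist as fallback) can be sketched as follows. This is an illustrative sketch only: the model names, product-type keys, and the `select_model` helper are hypothetical, not the actual Village codebase.

```python
# Hypothetical sketch of per-product-type model routing.
# All names below are illustrative assumptions, not Village's real identifiers.
PRODUCT_TYPE_MODELS = {
    "community": "village-community-14b",   # generalist; also the fallback
    "whanau": "village-whanau-14b",
    "episcopal": "village-episcopal-14b",
    "family": "village-family-14b",
    "business": "village-business-14b",
}

FALLBACK_MODEL = PRODUCT_TYPE_MODELS["community"]

def select_model(product_type: str) -> str:
    """Return the specialized model for a tenant's product type.

    Types without a dedicated model yet (conservation, diaspora, clubs,
    alumni) fall back to the community generalist, matching the
    "Additional Types" card in the diff below.
    """
    return PRODUCT_TYPE_MODELS.get(product_type, FALLBACK_MODEL)
```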
TheFlow 2026-04-09 17:29:41 +12:00
parent 36122fadfb
commit b7f2245ec4


@@ -120,38 +120,60 @@
 </div>
 </section>
-<!-- Two-Model Architecture -->
+<!-- Specialized Model Architecture -->
 <section class="mb-10">
-<h2 class="text-3xl font-bold text-gray-900 mb-4" data-i18n="two_model.heading">Two-Model Architecture</h2>
-<p class="text-gray-700 mb-4" data-i18n-html="two_model.intro">
-Village AI uses two models of different sizes, routed by task complexity. This is not a fallback mechanism &mdash; each model is optimised for its role.
+<h2 class="text-3xl font-bold text-gray-900 mb-4">Specialized Model Architecture</h2>
+<p class="text-gray-700 mb-4">
+Village AI uses multiple specialized models, each fine-tuned for a specific community type. The routing layer selects the appropriate model based on the tenant&rsquo;s product type. All models operate under the same governance stack.
 </p>
-<div class="grid grid-cols-1 md:grid-cols-2 gap-6">
-<div class="bg-white rounded-lg shadow-sm p-6 border-l-4 border-blue-500">
-<h3 class="text-lg font-bold text-gray-900 mb-2" data-i18n-html="two_model.fast_title">3B Model &mdash; Fast Assistant</h3>
-<span class="inline-block bg-green-100 text-green-800 text-xs font-semibold px-2 py-0.5 rounded mb-2" data-i18n="two_model.fast_badge">Operational</span>
-<p class="text-gray-700 text-sm mb-3" data-i18n="two_model.fast_desc">
-Handles help queries, tooltips, error explanations, short summaries, and translation. Target response time: under 5 seconds complete.
+<div class="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-6">
+<div class="bg-white rounded-lg shadow-sm p-6 border-l-4 border-teal-500">
+<h3 class="text-lg font-bold text-gray-900 mb-2">Community &amp; Governance</h3>
+<span class="inline-block bg-green-100 text-green-800 text-xs font-semibold px-2 py-0.5 rounded mb-2">Production</span>
+<p class="text-gray-700 text-sm">
+Generalist model serving neighbourhood communities, governance bodies, and committees. Also serves as fallback for community types without a dedicated model.
 </p>
-<p class="text-gray-500 text-xs" data-i18n="two_model.fast_routing">
-Routing triggers: simple queries, known FAQ patterns, single-step tasks.
+</div>
+<div class="bg-white rounded-lg shadow-sm p-6 border-l-4 border-emerald-500">
+<h3 class="text-lg font-bold text-gray-900 mb-2">Wh&#257;nau &amp; Indigenous</h3>
+<span class="inline-block bg-green-100 text-green-800 text-xs font-semibold px-2 py-0.5 rounded mb-2">Production</span>
+<p class="text-gray-700 text-sm">
+Trained on te reo M&#257;ori content, whakapapa structures, and tikanga documentation. Highest indigenous domain accuracy across all variants.
 </p>
 </div>
 <div class="bg-white rounded-lg shadow-sm p-6 border-l-4 border-purple-500">
-<h3 class="text-lg font-bold text-gray-900 mb-2" data-i18n-html="two_model.deep_title">8B Model &mdash; Deep Reasoning</h3>
-<span class="inline-block bg-amber-100 text-amber-800 text-xs font-semibold px-2 py-0.5 rounded mb-2" data-i18n="two_model.deep_badge">Planned</span>
-<p class="text-gray-700 text-sm mb-3" data-i18n="two_model.deep_desc">
-Handles life story generation, year-in-review narratives, complex summarisation, and sensitive correspondence. Target response time: under 90 seconds.
+<h3 class="text-lg font-bold text-gray-900 mb-2">Episcopal &amp; Parish</h3>
+<span class="inline-block bg-green-100 text-green-800 text-xs font-semibold px-2 py-0.5 rounded mb-2">Production</span>
+<p class="text-gray-700 text-sm">
+Trained on Anglican parish governance, Book of Common Prayer, vestry procedures, and liturgical calendar. Serves parish and diocesan communities.
 </p>
-<p class="text-gray-500 text-xs" data-i18n="two_model.deep_routing">
-Routing triggers: keywords like "everything about", multi-source retrieval, grief/trauma markers.
+</div>
+<div class="bg-white rounded-lg shadow-sm p-6 border-l-4 border-blue-500">
+<h3 class="text-lg font-bold text-gray-900 mb-2">Family &amp; Heritage</h3>
+<span class="inline-block bg-green-100 text-green-800 text-xs font-semibold px-2 py-0.5 rounded mb-2">Production</span>
+<p class="text-gray-700 text-sm">
+Trained on family storytelling, genealogy, heritage preservation, and inter-generational content. Highest overall FAQ accuracy.
+</p>
+</div>
+<div class="bg-white rounded-lg shadow-sm p-6 border-l-4 border-indigo-500">
+<h3 class="text-lg font-bold text-gray-900 mb-2">Business &amp; Professional</h3>
+<span class="inline-block bg-green-100 text-green-800 text-xs font-semibold px-2 py-0.5 rounded mb-2">Production</span>
+<p class="text-gray-700 text-sm">
+Trained on CRM, invoicing, time tracking, and professional services content. Serves business tenants and platform operations.
+</p>
+</div>
+<div class="bg-white rounded-lg shadow-sm p-6 border-l-4 border-gray-300">
+<h3 class="text-lg font-bold text-gray-900 mb-2">Additional Types</h3>
+<span class="inline-block bg-amber-100 text-amber-800 text-xs font-semibold px-2 py-0.5 rounded mb-2">Trigger-based</span>
+<p class="text-gray-700 text-sm">
+Conservation, diaspora, clubs, and alumni models are trained when the first tenant of that type is established. Until then, the community generalist model serves.
 </p>
 </div>
 </div>
-<p class="text-gray-600 text-sm mt-4" data-i18n-html="two_model.footer">
-Both models operate under the same governance stack. Routing governance is designed; ContextPressureMonitor override capability is planned.
+<p class="text-gray-600 text-sm mt-4">
+All models are fine-tuned from the same base using QLoRA. Training data is curated per community type and never mixed across domains. A deterministic FAQ layer handles known questions without model inference. Steering vectors adjust model behaviour at inference time without modifying weights.
 </p>
 </section>
@@ -178,14 +200,14 @@
 <div class="bg-white rounded-lg shadow-sm p-6 border-l-4 border-teal-500">
 <div class="flex items-baseline justify-between mb-2">
-<h3 class="text-lg font-bold text-gray-900" data-i18n="training_tiers.tier2_title">Tier 2: Tenant Adapters</h3>
-<span class="text-xs bg-teal-100 text-teal-800 px-2 py-1 rounded" data-i18n="training_tiers.tier2_badge">Per community</span>
+<h3 class="text-lg font-bold text-gray-900">Tier 2: Product-Type Specialization</h3>
+<span class="text-xs bg-teal-100 text-teal-800 px-2 py-1 rounded">Per community type</span>
 </div>
-<p class="text-gray-700 text-sm mb-2" data-i18n-html="training_tiers.tier2_desc">
-Each community trains a lightweight LoRA adapter on its own content &mdash; stories, documents, photos, and events that members have explicitly consented to include. This allows Village AI to answer questions like "What stories has Grandma shared?" without accessing any other community's data.
+<p class="text-gray-700 text-sm mb-2">
+Each community type (wh&#257;nau, episcopal, business, family, etc.) has a dedicated fine-tuned model trained on domain-specific content. The model learns the vocabulary, governance patterns, and cultural framing appropriate to that community type. Tenant data isolation is maintained &mdash; no tenant&rsquo;s content is used in another tenant&rsquo;s training data.
 </p>
-<p class="text-gray-500 text-xs" data-i18n-html="training_tiers.tier2_update">
-Adapters are small (50&ndash;100MB). Consent is per-content-item. Content marked "only me" is never included regardless of consent. Training method: QLoRA fine-tuning with governance-validated data.
+<p class="text-gray-500 text-xs">
+Specialization is triggered when the first tenant of a new type is established. Training method: QLoRA fine-tuning with governance-validated, curated corpora.
 </p>
 </div>
@@ -293,7 +315,7 @@
 <div class="bg-amber-50 rounded-lg p-5 border border-amber-200 mt-4">
 <p class="text-amber-900 text-sm" data-i18n-html="dual_layer.caveat">
-<strong>Honest caveat:</strong> Layer A (inherent governance via training) has been empirically validated across multiple training runs with consistent governance compliance. Layer B (active governance via Village codebase) has been operating in production for 5 months. The dual-layer thesis is demonstrating results, though evaluation remains self-reported. Independent audit is planned.
+<strong>Honest caveat:</strong> Layer A (inherent governance via training) has been empirically validated across multiple training runs with consistent governance compliance. Layer B (active governance via Village codebase) has been operating in production since October 2025. The dual-layer thesis is demonstrating results, though evaluation remains self-reported. Independent audit is planned.
 </p>
 </div>
@@ -503,17 +525,17 @@
 <div class="bg-white rounded-lg shadow-sm p-5 border border-gray-200">
 <h3 class="text-lg font-bold text-gray-900 mb-2" data-i18n="infrastructure.remote_title">Remote Inference</h3>
 <ul class="text-gray-700 text-sm space-y-2">
-<li data-i18n="infrastructure.remote_item1">Model weights deployed to production server (OVH France)</li>
-<li data-i18n="infrastructure.remote_item2">Inference via Ollama on production server</li>
-<li data-i18n="infrastructure.remote_item3">Home GPU inference via WireGuard VPN (planned)</li>
-<li data-i18n="infrastructure.remote_item4">CPU-based inference provides baseline availability</li>
+<li>Specialized model weights served from sovereign GPU infrastructure</li>
+<li>Inference via Ollama, routed by tenant product type</li>
+<li>GPU inference via encrypted WireGuard tunnel to both production servers</li>
+<li>Production servers in EU (France) and NZ (Catalyst Cloud)</li>
 </ul>
 </div>
 </div>
 <div class="bg-gray-50 rounded-lg p-5 border border-gray-200 mt-4">
 <p class="text-gray-700 text-sm" data-i18n-html="infrastructure.why_consumer">
-<strong>Why consumer hardware?</strong> The SLL thesis is that sovereign AI training should be accessible, not reserved for organisations with data centre budgets. A single consumer GPU can fine-tune a 7B model efficiently via QLoRA. The entire training infrastructure fits on a desk.
+<strong>Why consumer hardware?</strong> The SLL thesis is that sovereign AI training should be accessible, not reserved for organisations with data centre budgets. Consumer-grade GPUs can fine-tune 14B models efficiently via QLoRA. The entire inference infrastructure fits on a desk.
 </p>
 </div>
 </section>
@@ -691,7 +713,7 @@
 <ul class="space-y-3 text-amber-800">
 <li class="flex items-start">
 <span class="mr-2 font-bold">&bull;</span>
-<span data-i18n-html="limitations.item1"><strong>Early-stage training:</strong> Multiple QLoRA fine-tuning runs have been completed. A production model is deployed with governance compliance and bias metrics meeting targets. Evaluation is self-reported. Independent audit is planned.</span>
+<span><strong>Production training:</strong> Multiple specialized models are deployed across five community types, each with governance compliance and bias metrics meeting targets. Evaluation is self-reported. Independent audit is planned.</span>
 </li>
 <li class="flex items-start">
 <span class="mr-2 font-bold">&bull;</span>
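The new footer text in this diff mentions a deterministic FAQ layer that handles known questions without model inference. A minimal sketch of that pattern, under the assumption that it is a lookup consulted before routing to a model; `FAQ_ANSWERS` and `answer` are hypothetical names, not Village's actual code:

```python
# Illustrative sketch of a deterministic FAQ layer: known questions are
# answered from a curated table before any model inference happens.
# All names and entries here are hypothetical.
FAQ_ANSWERS = {
    "how do i reset my password?": "Use the 'Forgot password' link on the sign-in page.",
}

def answer(question: str) -> tuple[str, str]:
    """Return (source, text): source is 'faq' for a deterministic hit,
    or 'model' when the question must go to the specialized model."""
    key = question.strip().lower()
    if key in FAQ_ANSWERS:
        return ("faq", FAQ_ANSWERS[key])
    return ("model", "")  # caller forwards to the tenant's specialized model
```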