From b7f2245ec4a0cce6980a60f3337cdc08cfe0a306 Mon Sep 17 00:00:00 2001
From: TheFlow
Date: Thu, 9 Apr 2026 17:29:41 +1200
Subject: [PATCH] =?UTF-8?q?fix:=20update=20village-ai.html=20=E2=80=94=20r?=
 =?UTF-8?q?eplace=20stale=203B/8B=20architecture=20with=20current?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replaced Two-Model Architecture (3B/8B) with Specialized Model
Architecture (five production 14B models by community type). Updated
Training Tiers: Tier 2 now describes product-type specialization, not
per-tenant adapters. Fixed infrastructure section: WireGuard inference
is live, not planned; model size corrected to 14B. Updated limitations
and production timeline.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 public/village-ai.html | 88 ++++++++++++++++++++++++++----------------
 1 file changed, 55 insertions(+), 33 deletions(-)

diff --git a/public/village-ai.html b/public/village-ai.html
index 7e11cff2..b7a2b7a6 100644
--- a/public/village-ai.html
+++ b/public/village-ai.html
@@ -120,38 +120,60 @@
-
+
-

Two-Model Architecture

-

- Village AI uses two models of different sizes, routed by task complexity. This is not a fallback mechanism — each model is optimised for its role. +

Specialized Model Architecture

+

+ Village AI uses multiple specialized models, each fine-tuned for a specific community type. The routing layer selects the appropriate model based on the tenant’s product type. All models operate under the same governance stack.
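For illustration, the routing rule described above amounts to a lookup with a generalist fallback. This is a minimal sketch, not part of the patched page; the model identifiers and product-type keys are assumptions, not the production registry.

```python
# Hypothetical routing table: product type -> specialized 14B model.
# These identifiers are illustrative, not the production registry.
MODEL_BY_PRODUCT_TYPE = {
    "community": "village-community-14b",   # generalist
    "whanau": "village-whanau-14b",
    "episcopal": "village-episcopal-14b",
    "family": "village-family-14b",
    "business": "village-business-14b",
}

# Community types without a dedicated model fall back to the generalist.
FALLBACK_MODEL = MODEL_BY_PRODUCT_TYPE["community"]

def select_model(product_type: str) -> str:
    """Select the model for a tenant based on its product type."""
    return MODEL_BY_PRODUCT_TYPE.get(product_type, FALLBACK_MODEL)
```

The governance stack would wrap this selection; the sketch covers only the routing decision itself.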

-
-
-

3B Model — Fast Assistant

- Operational -

- Handles help queries, tooltips, error explanations, short summaries, and translation. Target response time: under 5 seconds complete. +

+
+

Community & Governance

+ Production +

+ Generalist model serving neighbourhood communities, governance bodies, and committees. Also serves as fallback for community types without a dedicated model.

-

- Routing triggers: simple queries, known FAQ patterns, single-step tasks. +

+
+

Whānau & Indigenous

+ Production +

+ Trained on te reo Māori content, whakapapa structures, and tikanga documentation. Highest indigenous domain accuracy across all variants.

-

8B Model — Deep Reasoning

- Planned -

- Handles life story generation, year-in-review narratives, complex summarisation, and sensitive correspondence. Target response time: under 90 seconds. +

Episcopal & Parish

+ Production +

+ Trained on Anglican parish governance, Book of Common Prayer, vestry procedures, and liturgical calendar. Serves parish and diocesan communities.

-

- Routing triggers: keywords like "everything about", multi-source retrieval, grief/trauma markers. +

+
+

Family & Heritage

+ Production +

+ Trained on family storytelling, genealogy, heritage preservation, and inter-generational content. Highest overall FAQ accuracy. +

+
+
+

Business & Professional

+ Production +

+ Trained on CRM, invoicing, time tracking, and professional services content. Serves business tenants and platform operations. +

+
+
+

Additional Types

+ Trigger-based +

+ Conservation, diaspora, clubs, and alumni models are trained when the first tenant of that type is established. Until then, those tenants are served by the Community & Governance generalist model.

-

- Both models operate under the same governance stack. Routing governance is designed; ContextPressureMonitor override capability is planned. +

+ All models are fine-tuned from the same base using QLoRA. Training data is curated per community type and never mixed across domains. A deterministic FAQ layer handles known questions without model inference. Steering vectors adjust model behaviour at inference time without modifying weights.
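The deterministic FAQ layer can be sketched as a lookup that answers known questions before any model is invoked. This is a hedged illustration; the entries, normalisation, and function names are assumptions.

```python
# Hypothetical FAQ store: normalised question -> canonical answer.
FAQ = {
    "how do i reset my password": "Use Settings, then Account, then Reset password.",
}

def normalise(question: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing question mark."""
    return " ".join(question.lower().split()).rstrip("?")

def answer(question: str) -> tuple[str, bool]:
    """Return (answer, used_model). Known questions never reach the model."""
    key = normalise(question)
    if key in FAQ:
        return FAQ[key], False          # deterministic path, no inference
    return run_inference(question), True

def run_inference(question: str) -> str:
    # Placeholder for the routed 14B model call.
    return "(model-generated answer)"
```

Because the deterministic path is checked first, known questions cost no GPU time and always return the curated answer verbatim.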

@@ -178,14 +200,14 @@
-

Tier 2: Tenant Adapters

- Per community +

Tier 2: Product-Type Specialization

+ Per community type
-

- Each community trains a lightweight LoRA adapter on its own content — stories, documents, photos, and events that members have explicitly consented to include. This allows Village AI to answer questions like "What stories has Grandma shared?" without accessing any other community's data. +

+ Each community type (whānau, episcopal, business, family, etc.) has a dedicated fine-tuned model trained on domain-specific content. The model learns the vocabulary, governance patterns, and cultural framing appropriate to that community type. Tenant data isolation is maintained — no tenant’s content is used in another tenant’s training data.

-

- Adapters are small (50–100MB). Consent is per-content-item. Content marked "only me" is never included regardless of consent. Training method: QLoRA fine-tuning with governance-validated data. +

+ Specialization is triggered when the first tenant of a new type is established. Training method: QLoRA fine-tuning with governance-validated, curated corpora.
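One plausible shape of this QLoRA setup, sketched with Hugging Face peft and bitsandbytes; the quantisation settings, target modules, and every hyperparameter here are assumptions rather than the production values.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit quantised base weights: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# A small low-rank adapter is trained per community type; the 14B base
# weights stay frozen, which is what keeps one consumer GPU sufficient.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

These two configs would be passed to the model loader and trainer respectively; the governance-validated corpus for the community type supplies the training data.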

@@ -293,7 +315,7 @@

- Honest caveat: Layer A (inherent governance via training) has been empirically validated across multiple training runs with consistent governance compliance. Layer B (active governance via Village codebase) has been operating in production for 5 months. The dual-layer thesis is demonstrating results, though evaluation remains self-reported. Independent audit is planned.
+ Honest caveat: Layer A (inherent governance via training) has been empirically validated across multiple training runs with consistent governance compliance. Layer B (active governance via Village codebase) has been operating in production since October 2025. The dual-layer thesis is demonstrating results, though evaluation remains self-reported. Independent audit is planned.

@@ -503,17 +525,17 @@

Remote Inference

-  • Model weights deployed to production server (OVH France)
-  • Inference via Ollama on production server
-  • Home GPU inference via WireGuard VPN (planned)
-  • CPU-based inference provides baseline availability
+  • Specialized model weights served from sovereign GPU infrastructure
+  • Inference via Ollama, routed by tenant product type
+  • GPU inference via encrypted WireGuard tunnel to both production servers
+  • Production servers in EU (France) and NZ (Catalyst Cloud)
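A routed inference request over this path could look like the following sketch. It follows the shape of Ollama's /api/generate endpoint; the host, model names, and routing table are assumptions.

```python
import json
from urllib import request

# Hypothetical product-type -> model routing table.
MODELS = {"whanau": "village-whanau-14b", "business": "village-business-14b"}

def build_request(product_type: str, prompt: str) -> dict:
    """Build an Ollama /api/generate payload for the tenant's product type."""
    return {
        "model": MODELS.get(product_type, "village-community-14b"),  # generalist fallback
        "prompt": prompt,
        "stream": False,
    }

def generate(host: str, product_type: str, prompt: str) -> str:
    """POST the payload to the Ollama server, e.g. over the WireGuard tunnel."""
    data = json.dumps(build_request(product_type, prompt)).encode()
    req = request.Request(
        f"{host}/api/generate", data=data,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The same payload works against either production server; only the host differs.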

- Why consumer hardware? The SLL thesis is that sovereign AI training should be accessible, not reserved for organisations with data centre budgets. A single consumer GPU can fine-tune a 7B model efficiently via QLoRA. The entire training infrastructure fits on a desk.
+ Why consumer hardware? The SLL thesis is that sovereign AI training should be accessible, not reserved for organisations with data centre budgets. Consumer-grade GPUs can fine-tune 14B models efficiently via QLoRA. The entire inference infrastructure fits on a desk.

@@ -691,7 +713,7 @@