What is Home AI?

Home AI is the practical implementation of Tractatus governance within the Village platform — a community-owned digital space where members share stories, documents, and family histories. Unlike cloud-hosted AI assistants, Home AI operates under the principle of digital sovereignty: the community's data and the AI's behaviour are governed by the community itself, not by a remote provider.

What is an SLL?

An SLL (Sovereign Locally-trained Language Model) is distinct from both LLMs and SLMs. The distinction is not size — it is control.
LLM (Large Language Model)
- Training: provider-controlled
- Data: scraped at scale
- Governance: provider's terms
- User control: none

SLM (Small Language Model)
- Training: provider-controlled
- Data: curated by provider
- Governance: partial (fine-tuning)
- User control: limited

SLL (Sovereign Locally-trained Language Model)
- Training: community-controlled
- Data: community-owned
- Governance: architecturally enforced
- User control: full
The honest trade-off: an SLL is a less powerful system that serves your interests, rather than a more powerful one that serves someone else's. We consider this an acceptable exchange.
The Governance Stack
Each Home AI interaction traverses six governance services in sequence. This is not optional middleware — it operates in the critical execution path, meaning a response cannot be generated without passing through all checks.

Two-Model Architecture

Home AI uses two models of different sizes, routed by task complexity. This is not a fallback mechanism — each model is optimised for its role.
3B Model — Fast Assistant

Handles help queries, tooltips, error explanations, short summaries, and translation. Target response time: complete in under 5 seconds.

Routing triggers: simple queries, known FAQ patterns, single-step tasks.

8B Model — Deep Reasoning

Handles life story generation, year-in-review narratives, complex summarisation, and sensitive correspondence. Target response time: under 90 seconds.

Routing triggers: keywords like "everything about", multi-source retrieval, grief/trauma markers.
Both models operate under the same governance stack. The routing decision itself is governed — the ContextPressureMonitor can override routing if session health requires it.
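As a sketch, the routing heuristic described above might look like the following. The function and threshold names are illustrative assumptions, not the actual Village implementation:

```python
# Illustrative sketch of complexity-based model routing.
# Trigger phrases and signals are assumptions drawn from the text above.
DEEP_TRIGGERS = ("everything about", "life story", "year in review")

def route(query: str, sources_needed: int, trauma_marker: bool) -> str:
    """Return which model should handle the query."""
    text = query.lower()
    if trauma_marker or sources_needed > 1:
        return "8b-deep"                 # sensitive or multi-source work
    if any(trigger in text for trigger in DEEP_TRIGGERS):
        return "8b-deep"                 # complex narrative generation
    return "3b-fast"                     # FAQ patterns, single-step tasks
```

A real router would also consult the ContextPressureMonitor before committing, since the text notes the routing decision itself is governed.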
Three Training Tiers

Training is not monolithic. Three tiers serve different scopes, each with appropriate governance constraints.
Tier 1: Platform Base (all communities)

Trained on platform documentation, philosophy, feature guides, and FAQ content. Provides the foundational understanding of how Village works, what Home AI's values are, and how to help members navigate the platform.

Update frequency: weekly during beta, quarterly at GA. Training method: QLoRA fine-tuning.

Tier 2: Tenant Adapters (per community)
Each community trains a lightweight LoRA adapter on its own content — stories, documents, photos, and events that members have explicitly consented to include. This allows Home AI to answer questions like "What stories has Grandma shared?" without accessing any other community's data.

Adapters are small (50–100MB). Consent is per-content-item. Content marked "only me" is never included, regardless of any other consent setting. Training uses DPO (Direct Preference Optimization) for value alignment.
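A minimal sketch of the per-item consent rule described above, assuming hypothetical field names rather than the actual Village schema:

```python
# Sketch of per-item consent filtering for Tier 2 adapter training.
# Field names are illustrative assumptions, not the Village data model.
from dataclasses import dataclass

@dataclass
class ContentItem:
    tenant_id: str
    visibility: str          # e.g. "community", "family", "only_me"
    training_consent: bool   # explicit per-item opt-in

def eligible_for_training(item: ContentItem, tenant_id: str) -> bool:
    """An item is eligible only if it belongs to this tenant, carries
    explicit consent, and is not marked private."""
    if item.tenant_id != tenant_id:
        return False         # tenant isolation: never cross-community
    if item.visibility == "only_me":
        return False         # "only me" overrides any consent flag
    return item.training_consent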
+
+ Personal adapters that learn individual preferences and interaction patterns. Speculative — this tier raises significant questions about feasibility, privacy, and the minimum training data required for meaningful personalisation.
+
+ Research questions documented. Implementation not planned until Tier 2 is validated.
Governance During Training

This is the central research contribution. Most AI governance frameworks operate at inference time — they filter or constrain responses after the model has already been trained. Home AI embeds governance inside the training loop.

This follows Christopher Alexander's principle of Not-Separateness: governance is woven into the training architecture, not applied afterward. The BoundaryEnforcer validates every training batch before the forward pass. If a batch contains cross-tenant data, data without consent, or content marked as private, the batch is rejected and the training step does not proceed.
```python
# Governance inside the training loop (Not-Separateness)
for batch in training_data:
    if not BoundaryEnforcer.validate(batch):
        continue  # governance rejects the batch; no training step occurs
    loss = model.forward(batch)
    loss.backward()

# NOT this — governance separated from training
for batch in training_data:
    loss = model.forward(batch)
    loss.backward()
filter_outputs_later()  # too late: the model has already learned from it
```
Training shapes tendency; architecture constrains capability. A model trained to respect boundaries can still be jailbroken. A model that fights against governance rules wastes compute and produces worse outputs. The combined approach makes the model tend toward governed behaviour while the architecture makes it impossible to violate structural boundaries.

Research from the Agent Lightning integration suggests governance adds approximately 5% performance overhead — an acceptable trade-off for architectural safety constraints. This figure requires validation at scale.

Training-time governance is only half the picture. The same Tractatus framework also operates at runtime in the Village codebase. The next section explains how these two layers work together.
Dual-Layer Tractatus Architecture

Home AI is governed by Tractatus at two distinct layers simultaneously. This is the architectural insight that distinguishes the SLL approach from both ungoverned models and bolt-on safety filters.

Layer A — Tractatus inside the model. During training, the BoundaryEnforcer validates every batch. DPO alignment shapes preferences toward governed behaviour. The model learns to respect boundaries, prefer transparent responses, and defer values decisions to humans.

Layer B — Tractatus around the model. At runtime, the full six-service governance stack operates in the Village codebase. Every interaction passes through BoundaryEnforcer, PluralisticDeliberationOrchestrator, MetacognitiveVerifier, CrossReferenceValidator, ContextPressureMonitor, and InstructionPersistenceClassifier.
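The runtime critical path can be sketched as a sequential pipeline. The service names come from the Tractatus stack; the pipeline shape itself is an illustrative assumption, not the actual Village code:

```python
# Sketch of the runtime critical path: every response must pass all six
# governance services in sequence before generation proceeds.
GOVERNANCE_STACK = [
    "BoundaryEnforcer",
    "PluralisticDeliberationOrchestrator",
    "MetacognitiveVerifier",
    "CrossReferenceValidator",
    "ContextPressureMonitor",
    "InstructionPersistenceClassifier",
]

def governed_response(query: str, services: dict, generate) -> str:
    """Run every check before generation; a single failure blocks output."""
    for name in GOVERNANCE_STACK:
        if not services[name](query):
            return f"[blocked by {name}]"  # no response without full passage
    return generate(query)
```

Because the checks sit in the critical path rather than as optional middleware, a response literally cannot be produced without traversing all six.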
The dual-layer principle: Training shapes tendency. Architecture constrains capability. A model that has internalised governance rules AND operates within governance architecture produces better outputs than either approach alone. The model works WITH the guardrails, not against them — reducing compute waste and improving response quality.
Honest caveat: Layer A (inherent governance via training) is designed but not yet empirically validated — training has not begun. Layer B (active governance via the Village codebase) has been operating in production for 11+ months. The dual-layer thesis is an architectural commitment, not yet a demonstrated result.
Philosophical Foundations

Home AI's governance draws from four philosophical traditions, each contributing a specific architectural principle. These are not decorative references — they translate into concrete design decisions.

Isaiah Berlin — Value Pluralism

Values are genuinely plural and sometimes incompatible. When freedom conflicts with equality, there may be no single correct resolution. Home AI presents options without hierarchy and documents what each choice sacrifices.

Architectural expression: PluralisticDeliberationOrchestrator presents trade-offs; it does not resolve them.
Ludwig Wittgenstein — Language Boundaries

Language shapes what can be thought and expressed. Some things that matter most resist systematic expression. Home AI acknowledges the limits of what language models can capture — particularly around grief, cultural meaning, and lived experience.

Architectural expression: BoundaryEnforcer defers values decisions to humans, acknowledging the limits of computation.
Indigenous Sovereignty — Data as Relationship

Te Mana Raraunga (Māori Data Sovereignty), the CARE Principles, and OCAP (First Nations Canada) provide frameworks in which data is not property but relationship. Whakapapa (genealogy) belongs to the collective, not to individuals. Consent is a community process, not an individual checkbox.

Architectural expression: tenant isolation, collective consent mechanisms, intergenerational stewardship.
Christopher Alexander — Living Architecture

Five principles guide how governance evolves: Deep Interlock (services coordinate), Structure-Preserving (changes enhance without breaking), Gradients Not Binary (intensity levels), Living Process (evidence-based evolution), and Not-Separateness (governance embedded, not bolted on).

Architectural expression: all six governance services and the training-loop architecture.
Three-Layer Governance

Governance operates at three levels, each with a different scope and mutability.

Layer 1: Platform (Immutable)

Structural constraints that apply to all communities: tenant data isolation, governance in the critical path, options presented without hierarchy. These cannot be disabled by tenant administrators or individual members.

Enforcement: architectural (BoundaryEnforcer blocks violations before they execute).
Layer 2: Tenant Constitution

Rules defined by community administrators: content-handling policies (e.g., "deceased members require moderator review"), cultural protocols (e.g., Māori tangi customs), visibility defaults, and AI training consent models. Each community configures its own constitution within Layer 1 constraints.

Enforcement: constitutional rules validated per tenant by the CrossReferenceValidator.
Layer 3: Adopted Wisdom Traditions

Individual members and communities can adopt principles from wisdom traditions to influence how Home AI frames responses. These are voluntary, reversible, and transparent. They influence presentation, not content access. Multiple traditions can be adopted simultaneously; conflicts are resolved by the member, not the AI.

Enforcement: framing hints in response generation. Override is always available.
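A minimal sketch of how the three layers might compose, assuming hypothetical rule and function names (the actual enforcement lives in the governance services, not in one function):

```python
# Illustrative three-layer precedence: platform rules block, tenant rules
# may require review, traditions only add framing. Names are assumptions.
PLATFORM_RULES = {"tenant_isolation": True, "critical_path_governance": True}

def apply_layers(action: dict, tenant_rules: dict, traditions: list) -> dict:
    # Layer 1: immutable platform constraints; violations are blocked.
    if action.get("crosses_tenant") and PLATFORM_RULES["tenant_isolation"]:
        return {"allowed": False, "reason": "Layer 1: tenant isolation"}
    # Layer 2: tenant constitution may require extra human review.
    if action.get("deceased_member") and tenant_rules.get("deceased_review"):
        return {"allowed": True, "requires": "moderator review"}
    # Layer 3: traditions add framing hints; they never change access.
    return {"allowed": True, "framing": traditions}
```

The ordering matters: a Layer 1 violation short-circuits before any tenant rule or tradition is consulted, which mirrors the immutability described above.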
Wisdom Traditions

Home AI offers thirteen wisdom traditions that members can adopt to guide AI behaviour. Each tradition has been validated against the Stanford Encyclopedia of Philosophy as the primary scholarly reference. Adoption is voluntary, transparent, and reversible.

- Berlin (Value Pluralism): present options without ranking; acknowledge what each choice sacrifices.
- Stoic (Equanimity and Virtue): focus on what can be controlled; emphasise character in ancestral stories.
- Weil (Attention to Affliction): resist summarising grief; preserve names and specifics rather than abstracting.
- Care Ethics (Relational Responsibility): attend to how content affects specific people, not abstract principles.
- Confucian (Relational Duty): frame stories in terms of family roles and reciprocal obligations.
- Buddhist (Impermanence): acknowledge that memories and interpretations change; extend compassion.
- Ubuntu (Communal Personhood): "I am because we are." Stories belong to the community, not the individual.
- African Diaspora (Sankofa): preserve what was nearly lost; honour fictive kinship and chosen family.
- Indigenous/Māori (Whakapapa): kinship with ancestors, land, and descendants; collective ownership of knowledge.
- Jewish (Tikkun Olam): repair, preserve memory (zachor), uphold dignity even of difficult relatives.
- Islamic (Mercy and Justice): balance rahma (mercy) with adl (justice) in sensitive content.
- Hindu (Dharmic Order): role-appropriate duties within a larger order; karma as consequence, not punishment.
- Alexander (Living Architecture): governance as living system; changes emerge from operational experience.

What this is not: selecting "Buddhist" does not mean the AI practises Buddhism. These are framing tendencies — they influence how the AI presents options, not what content is accessible. A member can always override tradition-influenced framing on any response. The system does not claim algorithmic moral reasoning.
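To make "framing, not access" concrete, here is a hedged sketch of how an adopted tradition could add a presentation hint to a prompt. The hint text, dictionary, and function are hypothetical illustrations only:

```python
# Sketch: traditions contribute framing hints to response generation.
# Hints and names are illustrative assumptions; access is never changed.
FRAMING_HINTS = {
    "berlin": "Present options without ranking; note what each sacrifices.",
    "weil": "Do not summarise grief; keep names and specifics.",
}

def build_prompt(base: str, adopted: list, override: bool = False) -> str:
    """Append framing hints unless the member overrides them."""
    if override or not adopted:
        return base              # override always available; content unchanged
    hints = " ".join(FRAMING_HINTS[t] for t in adopted if t in FRAMING_HINTS)
    return f"{base}\n[framing: {hints}]"
```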
Te Tiriti o Waitangi and Digital Sovereignty

Indigenous data sovereignty differs fundamentally from Western privacy models. Where Western privacy centres on individual rights and consent-as-checkbox, indigenous frameworks centre on collective rights, community process, and intergenerational stewardship.

- Te Mana Raraunga (Māori Data Sovereignty): rangatiratanga (self-determination), kaitiakitanga (guardianship for future generations), whanaungatanga (kinship as unified entity).
- CARE Principles (Global Indigenous Data Alliance): Collective Benefit, Authority to Control, Responsibility, Ethics. Data ecosystems designed for indigenous benefit.
- OCAP (First Nations Canada): Ownership, Control, Access, Possession. Communities physically control their data.

Concrete architectural implications: whakapapa (genealogy) cannot be atomised into individual data points. Tapu (sacred or restricted) content triggers cultural review before AI processing. Consent for AI training requires whānau consensus, not individual opt-in. Elder (kaumātua) approval is required for training on sacred genealogies.
These principles are informed by Te Tiriti o Waitangi and predate Western technology governance by centuries. We consider them prior art, not novel invention. Actual implementation requires ongoing consultation with Māori cultural advisors — this specification is a starting point.
Training Infrastructure

Home AI follows a "train local, deploy remote" model. The training hardware sits in the developer's home; trained model weights are deployed to production servers for inference. This keeps training costs low and training data under physical control.

Why consumer hardware? The SLL thesis is that sovereign AI training should be accessible, not reserved for organisations with data-centre budgets. A single consumer GPU can fine-tune a 7B model efficiently via QLoRA. The entire training infrastructure fits on a desk.
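As a rough back-of-envelope check of the "fits on a desk" claim, the memory arithmetic behind QLoRA can be sketched. All figures here are assumed round numbers for illustration, not measured Village results:

```python
# Back-of-envelope VRAM estimate for QLoRA fine-tuning.
# Assumptions: 4-bit base weights (0.5 bytes/param), ~40M trainable LoRA
# params kept in fp16 with fp32 gradients and Adam moments (~14 bytes/param),
# plus a flat allowance for activations and CUDA context.
def qlora_vram_gb(params_b: float = 7.0, lora_params_m: float = 40.0) -> float:
    base_4bit = params_b * 0.5                 # quantised base weights, GB
    adapter = lora_params_m * 1e6 * 14 / 1e9   # LoRA weights + optimiser state
    overhead = 2.0                             # activations, context (assumed)
    return round(base_4bit + adapter + overhead, 1)
```

Under these assumptions a 7B model lands comfortably inside a single consumer GPU's memory budget, which is the point of the accessibility argument above.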
Bias Documentation and Verification

Home AI operates in the domain of family storytelling, which carries specific bias risks. Six bias categories have been documented, each with detection prompts, debiasing examples, and evaluation criteria. Each entry below lists the documented bias and its corrective:

- Family structure: nuclear family as default; same-sex parents, blended families, and single parents treated as normative.
- Elder representation: deficit framing of aging; elders as active agents with expertise, not passive subjects.
- Cultural/religious: Christian-normative assumptions; equal treatment of all cultural practices and observances.
- Geographic/place: Anglo-American defaults; location-appropriate references and cultural context.
- Grief/trauma: efficiency over sensitivity; pacing, attention to particulars, no premature closure.
- Naming conventions: Western name-order assumptions; correct handling of patronymics, honorifics, and diacritics.
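A toy detection pass for one category gives the flavour of the evaluation harness. The patterns are illustrative assumptions; the documented framework uses detection prompts and human review, not keyword matching:

```python
# Toy check for the "family structure" bias category: flag output that
# assumes a nuclear-family default. Patterns are illustrative only.
NUCLEAR_DEFAULT_PATTERNS = ("mum and dad", "mother and father", "both parents")

def flags_family_structure_bias(text: str) -> bool:
    """Return True when output assumes a nuclear-family default."""
    lowered = text.lower()
    return any(pattern in lowered for pattern in NUCLEAR_DEFAULT_PATTERNS)
```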
What's Live Today

Home AI currently operates in production with the following governed features, all running under the full six-service governance stack.

- RAG-based help: vector search retrieves relevant documentation, filtered by member permissions. Responses are grounded in retrieved documents, not training data alone.
- Document OCR: text extraction from uploaded documents. Results are stored within the member's scope, not shared across tenants or used for training without consent.
- Story assistance: writing prompts, structural advice, and narrative enhancement. Cultural context decisions are deferred to the storyteller, not resolved by the AI.
- AI memory transparency: members view and control what the AI remembers, with independent consent for triage memory, OCR memory, and summarisation memory.

Further Reading

- Full technical case study of Tractatus in production
- Five architectural principles and six governance services
- Integration guide with code examples
- Tractatus in production — metrics, evidence, and honest limitations
- Sovereignty, transparency, and pluralism
- Academic paper on governance during training
- Open questions, collaboration opportunities, and data access
Architectural Governance for AI Systems

Some decisions require human judgment — architecturally enforced, not left to AI discretion, however well trained.

The Problem

Current AI safety approaches rely on training, fine-tuning, and corporate governance — all of which can fail, drift, or be overridden. When an AI's training patterns conflict with a user's explicit instructions, the patterns win.

The 27027 Incident

A user told Claude Code to use port 27027. The model used 27017 instead — not because it forgot, but because MongoDB's default port is 27017, and the model's statistical priors "autocorrected" the explicit instruction. Training-pattern bias overrode human intent.
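The cross-referencing idea behind catching this class of failure can be sketched as follows. This is an illustrative reduction, not the actual CrossReferenceValidator implementation:

```python
# Sketch: explicit stored instructions take precedence over a model's
# statistical priors. Store shape and function name are assumptions.
STORED_INSTRUCTIONS = {"mongodb_port": "27027"}   # the user's explicit choice

def validate_action(key: str, proposed: str) -> str:
    """Return the stored instruction whenever the proposal drifts from it."""
    stored = STORED_INSTRUCTIONS.get(key)
    if stored is not None and stored != proposed:
        return stored          # stored instruction wins over model priors
    return proposed
```

The essential property is that the instruction lives outside the model, so the model's tendency to regress to a familiar default (27017) cannot silently override it.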
The Approach

Tractatus draws on four intellectual traditions, each contributing a distinct insight to the architecture.

Isaiah Berlin — Value Pluralism. Some values are genuinely incommensurable. You cannot rank "privacy" against "safety" on a single scale without imposing one community's priorities on everyone else. AI systems must accommodate plural moral frameworks, not flatten them.

Ludwig Wittgenstein — The Limits of the Sayable. Some decisions can be systematised and delegated to AI; others — involving values, ethics, and cultural context — fundamentally cannot. The boundary between the "sayable" (what can be specified, measured, verified) and what lies beyond it is the framework's foundational constraint. What cannot be systematised must not be automated.

Te Tiriti o Waitangi — Indigenous Sovereignty. Communities should control their own data and the systems that act upon it. The concepts of rangatiratanga (self-determination), kaitiakitanga (guardianship), and mana (dignity) provide centuries-old prior art for digital sovereignty.

Christopher Alexander — Living Architecture. Governance woven into system architecture, not bolted on. Five principles (Not-Separateness, Deep Interlock, Gradients, Structure-Preserving, Living Process) guide how the framework evolves while maintaining coherence.