tractatus/docs/architecture/ADR-001-dual-governance-architecture.md

ADR-001: Dual Governance Architecture (File + Database)

Status: Accepted
Date: 2025-10-21
Author: Claude Code (Autonomous Development)
Decision: Implement dual-store governance with a file-based source of truth and a read-only database mirror for admin queries


Context

The Tractatus framework requires a governance instruction system that must satisfy multiple competing requirements:

  1. Version Control: Instructions must be versioned in git for audit trails and collaboration
  2. Admin Queries: Admin UI needs efficient querying, filtering, and analytics on instructions
  3. Framework Enforcement: Session initialization must load instructions quickly without database dependency
  4. Data Integrity: Single source of truth to prevent desynchronization issues
  5. Autonomous Development: Claude Code must update instructions automatically without manual DB intervention

Problem Statement

How do we store governance instructions to satisfy both:

  • Development workflow: Git-tracked, file-based, human-readable, merge-friendly
  • Production queries: Fast indexed queries, aggregations, relationships, admin UI

Decision

Implement a dual architecture with:

  1. File-based source of truth: .claude/instruction-history.json

    • Single canonical source
    • Git-tracked for version control
    • Human-readable JSON format
    • Updated by Claude Code and developers
  2. Database-based mirror: MongoDB governanceRules collection

    • Read-only for admin queries
    • Synchronized automatically from file
    • Used exclusively by admin UI and analytics
  3. Automatic synchronization:

    • Session initialization: Every Claude Code session start
    • Server startup: Every application restart
    • Manual trigger: Admin UI "Sync Now" button
    • Health monitoring: Dashboard widget shows sync status

Rationale

Why Not File-Only?

Rejected: Pure file-based approach

  • No efficient querying for admin UI
  • No aggregations or analytics
  • Slow for large datasets
  • No relationships with other collections

Why Not Database-Only?

Rejected: Pure database approach

  • No version control integration
  • Concurrent edits cannot be reviewed or reconciled through git merges
  • Manual database migrations required
  • Autonomous updates difficult
  • No human-readable audit trail

Why Dual Architecture?

Accepted: Best of both worlds

  • File: Version control, human readability, autonomous updates
  • Database: Query performance, admin UI, analytics
  • Sync: Automatic, monitored, self-healing

Implementation

Data Flow

.claude/instruction-history.json (SOURCE OF TRUTH)
          ↓
    [Sync Process]
          ↓
MongoDB governanceRules (READ-ONLY MIRROR)
          ↓
    [Admin Queries]
          ↓
     Admin UI Dashboard
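
The first hop of this flow (file to mirror) can be sketched as a pure planning step that turns the file's contents into upsert operations. This is a minimal sketch, not the code in scripts/sync-instructions-to-db.js: the top-level `instructions` array, the `id` and `text` fields, and the `lastSyncedAt` field are assumed names. The operation shape matches what the MongoDB driver's `bulkWrite` accepts.

```javascript
// Sketch only: the `instructions` array and the `id`/`text`/`lastSyncedAt`
// fields are assumed names, not confirmed details of the real file or schema.
function buildUpsertOps(fileContents, syncedAt = new Date()) {
  const instructions = Array.isArray(fileContents.instructions)
    ? fileContents.instructions
    : [];

  return instructions
    // Entries without a usable id cannot be matched in the mirror, so skip them.
    .filter((inst) => typeof inst.id === 'string' && inst.id.length > 0)
    .map((inst) => ({
      updateOne: {
        filter: { id: inst.id },
        // The file always wins: mirrored fields are overwritten on every sync.
        update: { $set: { ...inst, lastSyncedAt: syncedAt } },
        upsert: true,
      },
    }));
}

const exampleOps = buildUpsertOps({
  instructions: [
    { id: 'inst_001', text: 'example rule A' },
    { id: 'inst_002', text: 'example rule B' },
    { text: 'entry without an id is skipped' },
  ],
});
console.log(exampleOps.length); // 2
```

Keeping the planning step free of I/O means the same function can serve all three triggers, and the resulting array could be passed to `collection.bulkWrite(ops)` by whichever caller holds the connection.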

Sync Triggers

  1. Session Initialization (scripts/session-init.js)

    const { syncInstructions } = require('./sync-instructions-to-db.js');
    // Call from inside the script's async entry point (CommonJS has no top-level await)
    await syncInstructions();
    
  2. Server Startup (src/server.js)

    const { syncInstructions } = require('../scripts/sync-instructions-to-db.js');
    // Call from inside the async startup routine; silent mode suppresses console output
    await syncInstructions({ silent: true });
    
  3. Manual Trigger (Admin UI)

    POST /api/admin/sync/trigger
    

Orphan Handling

When the database contains rules that are not in the file (orphans):

  1. Export to .claude/backups/orphaned-rules-[timestamp].json
  2. Mark as inactive (soft delete)
  3. Add audit note with timestamp
  4. Never hard delete (data preservation)
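
The four steps above can be sketched as a function that plans the backup and the soft-delete updates without touching disk or database. The field names (`id`, `active`, `notes`), the function name `planOrphanHandling`, and the exact timestamp format in the backup file name are illustrative assumptions.

```javascript
// Sketch only: `id`, `active`, `notes`, and the timestamp format in the
// backup file name are assumptions made for illustration.
function planOrphanHandling(fileIds, dbRules, now = new Date()) {
  const known = new Set(fileIds);
  const orphans = dbRules.filter((rule) => !known.has(rule.id));
  const stamp = now.toISOString();

  return {
    // Step 1: export target, written before any rule is modified.
    backupPath: `.claude/backups/orphaned-rules-${stamp.replace(/[:.]/g, '-')}.json`,
    backupBody: JSON.stringify(orphans, null, 2),
    // Steps 2 and 3: soft delete plus an audit note.
    // Step 4: no delete operation is ever produced.
    updates: orphans.map((rule) => ({
      updateOne: {
        filter: { id: rule.id },
        update: {
          $set: { active: false },
          $push: { notes: `Orphaned (not in file) at ${stamp}` },
        },
      },
    })),
  };
}

const plan = planOrphanHandling(
  ['inst_001', 'inst_002'],
  [{ id: 'inst_001' }, { id: 'inst_002' }, { id: 'inst_999' }],
  new Date('2025-10-21T00:00:00.000Z'),
);
console.log(plan.backupPath); // .claude/backups/orphaned-rules-2025-10-21T00-00-00-000Z.json
console.log(plan.updates.length); // 1
```

Separating the plan from its execution lets the caller write the backup first and apply the updates only if that write succeeds.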

Health Monitoring

GET /api/admin/sync/health returns:

  • File count vs database count
  • Status: healthy | warning | critical
  • Missing rules (in file, not in DB)
  • Orphaned rules (in DB, not in file)
  • Recommendations for remediation

Dashboard widget shows:

  • Real-time sync status
  • Color-coded indicator (green/yellow/red)
  • Manual sync button
  • Auto-refresh every 60 seconds
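
The health response described above could be computed along these lines. This is a sketch under stated assumptions: the response field names are hypothetical, and the thresholds are guesses (any difference is a warning; more than 5 differing rules is critical, borrowing the figure from the pending Phase 4 alerting item). The real endpoint may classify status differently.

```javascript
// Sketch only: field names and thresholds are assumptions. `dbIds` is taken
// to be the ids of the active rules currently in the mirror.
function computeSyncHealth(fileIds, dbIds, criticalThreshold = 5) {
  const inFile = new Set(fileIds);
  const inDb = new Set(dbIds);
  const missing = [...inFile].filter((id) => !inDb.has(id)); // in file, not in DB
  const orphaned = [...inDb].filter((id) => !inFile.has(id)); // in DB, not in file
  const difference = missing.length + orphaned.length;

  let status = 'healthy';
  if (difference > criticalThreshold) status = 'critical';
  else if (difference > 0) status = 'warning';

  return {
    status,
    fileCount: inFile.size,
    databaseCount: inDb.size,
    missing,
    orphaned,
    recommendations:
      difference === 0 ? [] : ['Run a manual sync (POST /api/admin/sync/trigger)'],
  };
}

const health = computeSyncHealth(['a', 'b', 'c'], ['a', 'b', 'x']);
console.log(health.status); // warning
console.log(health.missing, health.orphaned); // [ 'c' ] [ 'x' ]
```

Note that equal counts alone do not imply a healthy state: in the example both sides hold three rules, yet one is missing and one is orphaned, which is why the comparison works on ids and not on totals.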

Consequences

Positive

Version Control: All instructions in git, full history, merge-friendly
Query Performance: Fast admin UI queries with MongoDB indexes
Autonomous Updates: Claude Code updates file, sync happens automatically
Data Integrity: File is single source of truth, database can be rebuilt
Self-Healing: Automatic sync on session start and server restart
Visibility: Dashboard widget shows sync health at a glance
Audit Trail: Orphaned rules exported before being marked inactive (never hard-deleted)

Negative

⚠️ Complexity: Two data sources instead of one
⚠️ Sync Required: Database can drift if sync fails
⚠️ Schema Mapping: File format differs from MongoDB schema (enum values)
⚠️ Delayed Propagation: File changes don't appear in admin UI until sync

Mitigations

  • Complexity: Sync process is fully automated and transparent
  • Drift Risk: Health endpoint and dashboard widget surface desync (the widget refreshes every 60 seconds); automated alerting is pending in Phase 4
  • Schema Mapping: Robust mapping function with defaults
  • Delayed Propagation: Sync runs on every session start and server restart
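
As an illustration of the schema-mapping mitigation, a mapping function with defaults might look like the sketch below. Every field name and enum value here is hypothetical; this document states only that the file format and the MongoDB schema differ in their enum values, not what those values are.

```javascript
// Sketch only: `priority`, `text`, `active`, and all enum values below are
// hypothetical placeholders, not the real file format or MongoDB schema.
const PRIORITY_MAP = new Map([
  ['HIGH', 'high'],
  ['MEDIUM', 'medium'],
  ['LOW', 'low'],
]);
const DEFAULTS = { priority: 'medium', active: true };

function mapInstructionToRule(inst) {
  return {
    id: inst.id,
    text: inst.text ?? '',
    // Unknown or missing enum values fall back to a default so that one
    // unexpected entry does not fail the whole sync.
    priority: PRIORITY_MAP.get(inst.priority) ?? DEFAULTS.priority,
    active: inst.active ?? DEFAULTS.active,
  };
}

console.log(mapInstructionToRule({ id: 'inst_001', text: 'example', priority: 'HIGH' }).priority); // high
console.log(mapInstructionToRule({ id: 'inst_002', priority: 'URGENT' }).priority); // medium
```

Falling back silently keeps the sync running, at the cost of hiding bad input; a real implementation might also log each fallback so that the health check can report it.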

Alternatives Considered

Alternative 1: File-Only with Direct Reads

Rejected: Admin UI reads .claude/instruction-history.json directly on every query

Pros:

  • No synchronization needed
  • Always up-to-date
  • Simpler architecture

Cons:

  • Slow for complex queries
  • No aggregations or analytics
  • No joins with other collections
  • File I/O on every admin request

Alternative 2: Database-Only with Git Export

Rejected: MongoDB as source of truth, export to git periodically

Pros:

  • Fast admin queries
  • No sync complexity

Cons:

  • Git exports are snapshots, not real-time
  • Conflicting edits cannot be reconciled through git merges, because exports are one-way snapshots
  • Autonomous updates require database connection
  • No human-readable source of truth

Alternative 3: Event Sourcing

Rejected: Event log as source of truth, materialize views to file and database

Pros:

  • Full audit trail of all changes
  • Time-travel debugging
  • Multiple materialized views

Cons:

  • Over-engineered for current needs
  • Complex to implement and maintain
  • Requires event store infrastructure
  • Migration from current system difficult

Migration Path

Phase 1: Initial Sync (Completed)

Created scripts/sync-instructions-to-db.js
Synced all 48 instructions to MongoDB
Verified data integrity (48 file = 48 DB)

Phase 2: Automatic Sync (Completed)

Added sync to scripts/session-init.js
Added sync to src/server.js startup
Created health check API (/api/admin/sync/health)
Created manual trigger API (/api/admin/sync/trigger)

Phase 3: Visibility (Completed)

Added dashboard sync health widget
Color-coded status indicator
Manual sync button
Auto-refresh every 60 seconds

Phase 4: Monitoring (Pending)

Add sync health to audit analytics
Alert on critical desync (>5 rules difference)
Metrics tracking (sync frequency, duration, errors)


Future Considerations

Potential Enhancements

  1. Two-Way Sync: Allow admin UI to edit rules, sync back to file

    • Risk: Git merge conflicts, version control complexity
    • Mitigation: Admin edits create git commits automatically
  2. Real-Time Sync: File watcher triggers sync on .claude/instruction-history.json changes

    • Risk: Rapid changes could trigger sync storms
    • Mitigation: Debounce sync triggers (e.g., 5-second cooldown)
  3. Conflict Resolution: Automatic merge strategies when file and DB diverge

    • Risk: Automatic merges could lose data
    • Mitigation: Manual review required for complex conflicts
  4. Multi-Project Support: Sync instructions from multiple projects

    • Risk: Cross-project instruction conflicts
    • Mitigation: Namespace instructions by project
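
The debounce mitigation proposed for real-time sync (item 2) can be sketched as follows. `createDebouncedSync` and its parameters are hypothetical names, and this is the trailing-edge variant, in which a burst of file changes produces a single sync once the file has been quiet for the cooldown period. The timer functions are injectable only so that the behaviour can be exercised without real delays.

```javascript
// Sketch only: `createDebouncedSync`, `runSync`, and `timers` are hypothetical
// names. The defaults wrap Node's global setTimeout/clearTimeout.
function createDebouncedSync(
  runSync,
  cooldownMs = 5000,
  timers = {
    set: (fn, ms) => setTimeout(fn, ms),
    clear: (handle) => clearTimeout(handle),
  },
) {
  let pending = null;

  return function requestSync() {
    // Each new change restarts the cooldown, so a burst yields one sync.
    if (pending !== null) timers.clear(pending);
    pending = timers.set(() => {
      pending = null;
      runSync();
    }, cooldownMs);
  };
}
```

A file watcher would call the returned `requestSync` on every change event to .claude/instruction-history.json. Because each call resets the timer, a file that changes continuously would never sync; a maximum-wait cap is one way to bound that delay.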

Open Questions

  • Should we implement two-way sync, or keep the file as the only writable source?
  • What's the acceptable sync latency for admin UI updates?
  • Do we need transaction support for multi-rule updates?
  • Should orphaned rules be hard-deleted after X days?

References

  • Implementation: scripts/sync-instructions-to-db.js
  • Health API: src/routes/sync-health.routes.js
  • Dashboard Widget: public/admin/dashboard.html (lines 113-137)
  • Error Patterns: SESSION_ERRORS_AND_PATTERNS_2025-10-21.md
  • Autonomous Rules: .claude/instruction-history.json (inst_050-057)

Approval

Approved: 2025-10-21
Reviewers: Autonomous decision (inst_050: Autonomous development framework)
Status: Production-ready, all tests passing