tractatus/docs/architecture/ADR-001-dual-governance-architecture.md
TheFlow 2298d36bed fix(submissions): restructure Economist package and fix article display
- Create Economist SubmissionTracking package correctly:
  * mainArticle = full blog post content
  * coverLetter = 216-word SIR— letter
  * Links to blog post via blogPostId
- Archive 'Letter to The Economist' from blog posts (it's the cover letter)
- Fix date display on article cards (use published_at)
- Target publication already displaying via blue badge

Database changes:
- Make blogPostId optional in SubmissionTracking model
- Economist package ID: 68fa85ae49d4900e7f2ecd83
- Le Monde package ID: 68fa2abd2e6acd5691932150

Next: Enhanced modal with tabs, validation, export

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-24 08:47:42 +13:00

288 lines
8.5 KiB
Markdown

# ADR-001: Dual Governance Architecture (File + Database)
**Status**: Accepted
**Date**: 2025-10-21
**Author**: Claude Code (Autonomous Development)
**Decision**: Implement dual-source governance with file-based source of truth and database-based admin queries
---
## Context
The Tractatus framework requires a governance instruction system that must satisfy multiple competing requirements:
1. **Version Control**: Instructions must be versioned in git for audit trails and collaboration
2. **Admin Queries**: Admin UI needs efficient querying, filtering, and analytics on instructions
3. **Framework Enforcement**: Session initialization must load instructions quickly without database dependency
4. **Data Integrity**: Single source of truth to prevent desynchronization issues
5. **Autonomous Development**: Claude Code must update instructions automatically without manual DB intervention
### Problem Statement
How do we store governance instructions to satisfy both:
- **Development workflow**: Git-tracked, file-based, human-readable, merge-friendly
- **Production queries**: Fast indexed queries, aggregations, relationships, admin UI
---
## Decision
Implement a **dual architecture** with:
1. **File-based source of truth**: `.claude/instruction-history.json`
- Single canonical source
- Git-tracked for version control
- Human-readable JSON format
- Updated by Claude Code and developers
2. **Database-based mirror**: MongoDB `governanceRules` collection
- Read-only for admin queries
- Synchronized automatically from file
- Used exclusively by admin UI and analytics
3. **Automatic synchronization**:
- Session initialization: Every Claude Code session start
- Server startup: Every application restart
- Manual trigger: Admin UI "Sync Now" button
- Health monitoring: Dashboard widget shows sync status
---
## Rationale
### Why Not File-Only?
**Rejected**: Pure file-based approach
- No efficient querying for admin UI
- No aggregations or analytics
- Slow for large datasets
- No relationships with other collections
### Why Not Database-Only?
**Rejected**: Pure database approach
- No version control integration
- Git merge conflicts impossible to resolve
- Manual database migrations required
- Autonomous updates difficult
- No human-readable audit trail
### Why Dual Architecture?
**Accepted**: Best of both worlds
- File: Version control, human readability, autonomous updates
- Database: Query performance, admin UI, analytics
- Sync: Automatic, monitored, self-healing
---
## Implementation
### Data Flow
```
.claude/instruction-history.json (SOURCE OF TRUTH)
[Sync Process]
MongoDB governanceRules (READ-ONLY MIRROR)
[Admin Queries]
Admin UI Dashboard
```
### Sync Triggers
1. **Session Initialization** (`scripts/session-init.js`)
```javascript
const { syncInstructions } = require('./sync-instructions-to-db.js');
await syncInstructions();
```
2. **Server Startup** (`src/server.js`)
```javascript
const { syncInstructions } = require('../scripts/sync-instructions-to-db.js');
await syncInstructions({ silent: true });
```
3. **Manual Trigger** (Admin UI)
```javascript
POST /api/admin/sync/trigger
```
### Orphan Handling
When database contains rules not in file (orphans):
1. Export to `.claude/backups/orphaned-rules-[timestamp].json`
2. Mark as inactive (soft delete)
3. Add audit note with timestamp
4. Never hard delete (data preservation)
### Health Monitoring
GET `/api/admin/sync/health` returns:
- File count vs database count
- Status: `healthy` | `warning` | `critical`
- Missing rules (in file, not in DB)
- Orphaned rules (in DB, not in file)
- Recommendations for remediation
Dashboard widget shows:
- Real-time sync status
- Color-coded indicator (green/yellow/red)
- Manual sync button
- Auto-refresh every 60 seconds
---
## Consequences
### Positive
✅ **Version Control**: All instructions in git, full history, merge-friendly
✅ **Query Performance**: Fast admin UI queries with MongoDB indexes
✅ **Autonomous Updates**: Claude Code updates file, sync happens automatically
✅ **Data Integrity**: File is single source of truth, database can be rebuilt
✅ **Self-Healing**: Automatic sync on session start and server restart
✅ **Visibility**: Dashboard widget shows sync health at a glance
✅ **Audit Trail**: Orphaned rules exported before deletion
### Negative
⚠️ **Complexity**: Two data sources instead of one
⚠️ **Sync Required**: Database can drift if sync fails
⚠️ **Schema Mapping**: File format differs from MongoDB schema (enum values)
⚠️ **Delayed Propagation**: File changes don't appear in admin UI until sync
### Mitigations
- **Complexity**: Sync process is fully automated and transparent
- **Drift Risk**: Health monitoring alerts immediately on desync
- **Schema Mapping**: Robust mapping function with defaults
- **Delayed Propagation**: Sync runs on every session start and server restart
---
## Alternatives Considered
### Alternative 1: File-Only with Direct Reads
**Rejected**: Admin UI reads `.claude/instruction-history.json` directly on every query
**Pros**:
- No synchronization needed
- Always up-to-date
- Simpler architecture
**Cons**:
- Slow for complex queries
- No aggregations or analytics
- No joins with other collections
- File I/O on every admin request
### Alternative 2: Database-Only with Git Export
**Rejected**: MongoDB as source of truth, export to git periodically
**Pros**:
- Fast admin queries
- No sync complexity
**Cons**:
- Git exports are snapshots, not real-time
- Merge conflicts impossible to resolve
- Autonomous updates require database connection
- No human-readable source of truth
### Alternative 3: Event Sourcing
**Rejected**: Event log as source of truth, materialize views to file and database
**Pros**:
- Full audit trail of all changes
- Time-travel debugging
- Multiple materialized views
**Cons**:
- Over-engineered for current needs
- Complex to implement and maintain
- Requires event store infrastructure
- Migration from current system difficult
---
## Migration Path
### Phase 1: Initial Sync (Completed)
✅ Created `scripts/sync-instructions-to-db.js`
✅ Synced all 48 instructions to MongoDB
✅ Verified data integrity (48 file = 48 DB)
### Phase 2: Automatic Sync (Completed)
✅ Added sync to `scripts/session-init.js`
✅ Added sync to `src/server.js` startup
✅ Created health check API (`/api/admin/sync/health`)
✅ Created manual trigger API (`/api/admin/sync/trigger`)
### Phase 3: Visibility (Completed)
✅ Added dashboard sync health widget
✅ Color-coded status indicator
✅ Manual sync button
✅ Auto-refresh every 60 seconds
### Phase 4: Monitoring (Pending)
⏳ Add sync health to audit analytics
⏳ Alert on critical desync (>5 rules difference)
⏳ Metrics tracking (sync frequency, duration, errors)
---
## Future Considerations
### Potential Enhancements
1. **Two-Way Sync**: Allow admin UI to edit rules, sync back to file
- **Risk**: Git merge conflicts, version control complexity
- **Mitigation**: Admin edits create git commits automatically
2. **Real-Time Sync**: File watcher triggers sync on `.claude/instruction-history.json` changes
- **Risk**: Rapid changes could trigger sync storms
- **Mitigation**: Debounce sync triggers (e.g., 5-second cooldown)
3. **Conflict Resolution**: Automatic merge strategies when file and DB diverge
- **Risk**: Automatic merges could lose data
- **Mitigation**: Manual review required for complex conflicts
4. **Multi-Project Support**: Sync instructions from multiple projects
- **Risk**: Cross-project instruction conflicts
- **Mitigation**: Namespace instructions by project
### Open Questions
- Should we implement two-way sync, or keep file as read-only source?
- What's the acceptable sync latency for admin UI updates?
- Do we need transaction support for multi-rule updates?
- Should orphaned rules be hard-deleted after X days?
---
## References
- **Implementation**: `scripts/sync-instructions-to-db.js`
- **Health API**: `src/routes/sync-health.routes.js`
- **Dashboard Widget**: `public/admin/dashboard.html` (lines 113-137)
- **Error Patterns**: `SESSION_ERRORS_AND_PATTERNS_2025-10-21.md`
- **Autonomous Rules**: `.claude/instruction-history.json` (inst_050-057)
---
## Approval
**Approved**: 2025-10-21
**Reviewers**: Autonomous decision (inst_050: Autonomous development framework)
**Status**: Production-ready, all tests passing