tractatus/docs/SCHEMA_V3_SPECIFICATION.md

# Tractatus Framework: Unified Schema v3.0 Specification
## Harmonized Governance Rule Schema

**Version**: 3.0.0
**Date**: November 2, 2025
**Status**: APPROVED
**Scope**: All projects using Tractatus Governance Framework

---

## 📋 EXECUTIVE SUMMARY

Schema v3.0 unifies the best features of:
- **v1.0** (tractatus): Temporal scope, verification requirements, explicitness scoring, source tracking
- **v2.0** (family-history): Comprehensive documentation, structured guidance, evidence tracking, relationship mapping

**Result**: A single governance rule specification that supports both human understanding and framework automation.

---

## 🎯 DESIGN PRINCIPLES

1. **Comprehensiveness**: Each rule contains all the information needed to understand and enforce it
2. **Machine-Readable**: Structured fields enable automated validation and enforcement
3. **Human-Friendly**: Clear documentation supports developer understanding
4. **Traceable**: Evidence, relationships, and history enable governance auditing
5. **Temporal-Aware**: Rules have a lifecycle (temporal scope, creation, validation history)
6. **Verification-Capable**: Framework can test rule compliance automatically

---

## 📊 SCHEMA v3.0 SPECIFICATION

### Complete Field Listing

```json
{
  "id": "string (required)",
  "title": "string (required)",
  "category": "enum (required)",
  "quadrant": "enum (required)",
  "persistence": "enum (required)",
  "temporal_scope": "enum (required)",
  "verification_required": "enum (required)",

  "description": "string (required)",
  "context": "string (required)",
  "rationale": "string (required)",
  "trigger": "string (required)",
  "action": "string (required)",
  "validation": "string (required)",
  "evidence": "string (required)",

  "explicitness": "number (0.0-1.0, required)",
  "source": "enum (required)",
  "session_id": "string (optional)",

  "parameters": "object (optional)",
  "relatedInstructions": "array<string> (optional)",
  "supersedes": "array<string> (optional)",
  "active": "boolean (required)",

  "metadata": {
    "created": "ISO date (required)",
    "lastValidated": "ISO date (required)",
    "validationCount": "integer (required)",
    "violationCount": "integer (required)",
    "deactivated": "boolean (required)",
    "deactivationReason": "string (optional)",
    "notes": "string (optional)"
  }
}
```

---

## 📖 FIELD DEFINITIONS

### Core Identification Fields

#### `id` (string, required)
**Purpose**: Unique identifier for the rule

**Format**:
- Project-specific: `inst_{project}_{category}_{number}`
- Framework-global: `inst_framework_{category}_{number}`
- Legacy support: `inst_{number}`

**Examples**:
- `inst_fh_sec_001` - Family-history security rule #1
- `inst_tr_boundary_001` - Tractatus boundary rule #1
- `inst_framework_meta_001` - Framework meta-rule #1

**Constraints**:
- Must be globally unique across all projects using framework
- Never reuse deactivated IDs
- Incremental numbering within category
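
The three ID formats above can be recognized mechanically. The following is a minimal sketch, not the framework's own parser: the function name `parseInstructionId` and the shape of the returned object are assumptions, and it assumes that project and category segments are lowercase alphanumerics without underscores.

```javascript
// Hypothetical helper: classify an instruction ID by the formats listed above.
// Assumes project and category segments contain no underscores.
function parseInstructionId(id) {
  // Legacy form: inst_{number}
  let match = /^inst_(\d+)$/.exec(id);
  if (match) {
    return { scope: 'legacy', number: Number(match[1]) };
  }

  // Project or framework form: inst_{project|framework}_{category}_{number}
  match = /^inst_([a-z0-9]+)_([a-z0-9]+)_(\d+)$/.exec(id);
  if (match) {
    const [, project, category, number] = match;
    return {
      scope: project === 'framework' ? 'framework' : 'project',
      project,
      category,
      number: Number(number),
    };
  }

  return null; // Not a recognized ID format
}

console.log(parseInstructionId('inst_fh_sec_001'));
console.log(parseInstructionId('inst_framework_meta_001'));
console.log(parseInstructionId('inst_42'));
```

Returning `null` for unrecognized IDs lets a caller decide whether to reject the rule or only warn about it.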

---

#### `title` (string, required)
**Purpose**: Human-readable concise rule name

**Format**: Title case, typically 4-10 words, actionable

**Examples**:
- "TenantStorage Requirement - No Raw localStorage"
- "Database Connection Scripts Require Approval"
- "Conventional Commit Format Required"

**Constraints**:
- Max 100 characters
- Should be understandable without reading full rule
- Avoid technical jargon when possible

---

### Classification Fields

#### `category` (enum, required)
**Purpose**: High-level categorization for filtering and grouping

**Allowed Values**:
- `SECURITY` - Security, authentication, authorization, credential protection
- `MULTI_TENANCY` - Multi-tenant isolation, tenant filtering, data segregation
- `DEPLOYMENT` - Deployment procedures, testing, rollback, production operations
- `SYSTEM` - Infrastructure, ports, databases, services, system configuration
- `FRAMEWORK_OPERATION` - Framework itself, hooks, governance services
- `VALUES_ALIGNMENT` - Project values, ethics, mission alignment
- `PRIVACY` - Privacy protection, GDPR compliance, data protection
- `GIT_VERSION_CONTROL` - Git operations, commits, branches, version control
- `OPERATIONAL_EXCELLENCE` - Process standards, quality assurance, documentation
- `ARCHITECTURE` - Architectural decisions, design patterns, system structure
- `PERFORMANCE` - Performance, optimization, resource management
- `ACCESSIBILITY` - Accessibility, internationalization, usability

**Usage**: Primary filter for rule queries

---

#### `quadrant` (enum, required)
**Purpose**: Strategic importance and decision level

**Allowed Values**:
- `STRATEGIC` - Long-term impact, architectural decisions, requires human judgment for exceptions
- `OPERATIONAL` - Day-to-day operations, process standards, can be automated
- `TACTICAL` - Specific implementation details, context-dependent
- `SYSTEM` - Infrastructure/environment configuration, rarely changes

**Decision Matrix**:
- Can this rule's violation cause strategic harm? → STRATEGIC
- Is this an everyday process/procedure? → OPERATIONAL
- Is this implementation-specific? → TACTICAL
- Is this infrastructure/environment config? → SYSTEM

---

#### `persistence` (enum, required)
**Purpose**: How long this rule should remain active

**Allowed Values**:
- `HIGH` - Permanent or semi-permanent, critical to project success
- `MEDIUM` - Important but may evolve, review quarterly
- `LOW` - Temporary, context-specific, may be deprecated soon

**Diachronic Commitment Levels**:
- HIGH: Architectural commitments, security requirements, values alignment
- MEDIUM: Best practices, process standards
- LOW: Temporary workarounds, experiment guidelines

---

#### `temporal_scope` (enum, required)
**Purpose**: Rule's lifetime/applicability duration

**Allowed Values**:
- `PROJECT` - Applies for entire project lifecycle
- `PHASE` - Applies during specific development phase (e.g., "during alpha", "until launch")
- `SESSION` - Applies only to current session or related sessions
- `TASK` - Applies only to specific task being worked on

**Usage Examples**:
- Database port configuration: `PROJECT`
- "Focus on testing during beta": `PHASE`
- "Use specific debugging approach": `SESSION`
- "Implement this feature with pattern X": `TASK`

---

#### `verification_required` (enum, required)
**Purpose**: Whether framework must actively verify compliance

**Allowed Values**:
- `MANDATORY` - Framework MUST check compliance, block violations
- `RECOMMENDED` - Framework SHOULD check when feasible, warn on violations
- `OPTIONAL` - Framework MAY check, informational only
- `MANUAL` - Human review required, automated verification not possible

**Automation Mapping**:
- MANDATORY → Hook blocks operation on violation
- RECOMMENDED → Hook warns but allows with logging
- OPTIONAL → Logged for audit but no blocking
- MANUAL → Requires human review before deployment
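
The automation mapping above can be expressed as a lookup table. This is a sketch under assumptions: the response labels (`block`, `warn`, `log`, `human_review`) and the function name `responseToViolation` are illustrative and are not defined by this specification.

```javascript
// Hypothetical lookup from verification_required to the hook's response
// when a violation is detected. The response labels are assumptions.
const VIOLATION_RESPONSE = new Map([
  ['MANDATORY', 'block'],       // Hook blocks operation on violation
  ['RECOMMENDED', 'warn'],      // Hook warns but allows with logging
  ['OPTIONAL', 'log'],          // Logged for audit but no blocking
  ['MANUAL', 'human_review'],   // Requires human review before deployment
]);

function responseToViolation(rule) {
  const response = VIOLATION_RESPONSE.get(rule.verification_required);
  if (response === undefined) {
    throw new Error(`Unknown verification_required: ${rule.verification_required}`);
  }
  return response;
}

console.log(responseToViolation({ verification_required: 'RECOMMENDED' })); // → warn
```

Throwing on an unknown value keeps a misspelled enum from silently downgrading enforcement.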

---

### Documentation Fields

#### `description` (string, required)
**Purpose**: What this rule governs (1-3 sentences)

**Format**: Clear, direct statement of rule scope and requirement

**Example**:
```
ALL database queries MUST filter by tenantId. No exceptions. Every find(),
findOne(), update(), delete() operation must include {tenantId: tenantId}
filter to prevent cross-tenant data leakage.
```

**Guidelines**:
- Start with a scope keyword (ALL, NEVER, ALWAYS, etc.)
- State requirement explicitly
- Include scope (what's affected)
- Max 500 characters

---

#### `context` (string, required)
**Purpose**: Why this rule exists, historical background

**Format**: 2-4 sentences explaining the problem this rule solves

**Example**:
```
Multi-tenant architecture isolates family data. Missing tenantId filters
expose one family's data to another family, violating privacy, GDPR, and
data sovereignty principles. Critical incident on 2025-08-21 prevented by
user intervention.
```

**Guidelines**:
- Explain the problem
- Reference incidents if applicable
- Connect to project values/requirements
- Max 1000 characters

---

#### `rationale` (string, required)
**Purpose**: Detailed reasoning - the "why" behind the rule

**Format**: 3-5 points explaining consequences and benefits

**Example**:
```
Tenant isolation is the foundation of data security in multi-tenant SaaS.
Single missing filter = complete data breach for affected families. Must be
enforced at query level, not application level. Prevents: 1) Cross-tenant
data access, 2) Privacy violations, 3) GDPR breaches, 4) Trust erosion,
5) Lawsuit exposure.
```

**Guidelines**:
- Explain consequences of violation
- Explain benefits of compliance
- Use numbered lists for clarity
- Max 2000 characters

---

#### `trigger` (string, required)
**Purpose**: When/where this rule applies

**Format**: Specific conditions that invoke this rule

**Example**:
```
Writing any MongoDB query: db.collection().find(), findOne(), updateOne(),
updateMany(), deleteOne(), deleteMany(), aggregate()
```

**Guidelines**:
- Be specific about conditions
- List file types, operations, or contexts
- Help developers know when to apply rule
- Max 1000 characters

---

#### `action` (string, required)
**Purpose**: Specific steps to comply with rule

**Format**: Actionable instructions, preferably numbered

**Example**:
```
Include tenantId in ALL queries:
1. Get tenantId from session: const tenantId = req.session.tenantId
2. Add to query filter: db.collection('contributions').find({
     tenantId: tenantId,
     ...otherFilters
   })
3. No query without tenantId filter
4. Use plugin for automatic filtering where available
```

**Guidelines**:
- Provide code examples when helpful
- Number steps for clarity
- Include specific APIs/functions to use
- Max 2000 characters

---

#### `validation` (string, required)
**Purpose**: How to verify compliance (manual or automated)

**Format**: Testing procedures, checks, or automated validation

**Example**:
```
Code review must verify:
1. grep for 'find({})' - ALL must have tenantId filter
2. grep for 'findOne({})' - ALL must have tenantId filter
3. grep for 'updateMany({})' - ALL must have tenantId filter
4. Manual review of complex queries
5. Test with multiple tenant accounts - verify data isolation
```

**Guidelines**:
- Provide grep/search patterns
- Include manual testing steps
- Specify automated checks where possible
- Max 1500 characters

---

#### `evidence` (string, required)
**Purpose**: Supporting documentation, references, sources

**Format**: Links, file references, incident reports, standards

**Example**:
```
CLAUDE.md Rule #4: 'NEVER violate multi-tenant isolation - all queries filter
by tenantId'; docs/security/INCIDENT_REPORT_2025-08-21.md - Cross-tenant data
leak prevented; OWASP Multi-Tenancy Security Cheat Sheet
```

**Guidelines**:
- Cite internal documentation
- Reference external standards (OWASP, GDPR, etc.)
- Link to incident reports if applicable
- Max 1000 characters

---

### Framework Automation Fields

#### `explicitness` (number, 0.0-1.0, required)
**Purpose**: How explicit/specific vs vague/general the rule is

**Scoring**:
- `1.0` - Completely explicit (e.g., "Use port 27027")
- `0.7-0.99` - Very specific with clear boundaries
- `0.4-0.69` - Moderately specific, some interpretation needed
- `0.1-0.39` - General principle, significant interpretation required
- `0.0` - Completely vague (not recommended)

**Usage**: Framework prioritizes high-explicitness rules for automation

**Examples**:
- "Database port MUST be 27027": explicitness = 1.0
- "Use appropriate security measures": explicitness = 0.2
- "Rate limit to prevent abuse": explicitness = 0.5

---

#### `source` (enum, required)
**Purpose**: Origin of this rule

**Allowed Values**:
- `user` - User explicitly stated this rule
- `framework` - Generated by framework based on patterns
- `incident` - Created in response to incident/bug
- `migration` - Migrated from another system/documentation
- `consolidation` - Consolidated from multiple rules

**Usage**: Helps track rule authority and trustworthiness

---

#### `session_id` (string, optional)
**Purpose**: Session in which rule was created/modified

**Format**: Session identifier (date-based or UUID)

**Example**: `2025-11-02-001` or `9bed871b-7ca3-4b68-aafd-8c7e83176800`

**Usage**: Audit trail, rule lifecycle tracking

---

### Relationship Fields

#### `parameters` (object, optional)
**Purpose**: Structured data for automated validation

**Format**: Key-value pairs of enforceable parameters

**Example**:
```json
{
  "port": "27027",
  "database": "family_history",
  "service": "mongodb",
  "rate_limit": {
    "public": "100/15min",
    "authenticated": "1000/15min",
    "admin": "50/15min"
  }
}
```

**Usage**: Framework services extract parameters for automated checks
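
A framework service could turn these parameter values into something it can enforce. The sketch below parses rate-limit strings shaped like `"100/15min"` from the example above; the function name `parseRateLimit` and the output field names are assumptions, and only the `min` unit is handled because the specification does not define others.

```javascript
// Hypothetical parser for rate-limit strings shaped like "100/15min".
// Only a "min" unit is handled; other units are not defined by this spec.
function parseRateLimit(value) {
  const match = /^(\d+)\/(\d+)min$/.exec(value);
  if (!match) {
    throw new Error(`Unrecognized rate limit: ${value}`);
  }
  return { maxRequests: Number(match[1]), windowMinutes: Number(match[2]) };
}

const parameters = {
  rate_limit: {
    public: '100/15min',
    authenticated: '1000/15min',
    admin: '50/15min',
  },
};

// Convert every role's limit into a structured form
const limits = Object.fromEntries(
  Object.entries(parameters.rate_limit).map(([role, value]) => [role, parseRateLimit(value)])
);

console.log(limits.public); // → { maxRequests: 100, windowMinutes: 15 }
```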

---

#### `relatedInstructions` (array<string>, optional)
**Purpose**: IDs of related/dependent rules

**Format**: Array of instruction IDs

**Example**: `["inst_fh_sec_001", "inst_fh_multi_001", "inst_fh_sec_012"]`

**Usage**: Framework can check related rules together, show dependencies

---

#### `supersedes` (array<string>, optional)
**Purpose**: IDs of rules replaced by this rule

**Format**: Array of instruction IDs of deprecated/replaced rules

**Example**: `["inst_fh_sec_001_v1", "inst_legacy_storage_003"]`

**Usage**: Track rule evolution, prevent conflicts with old rules

---

### Lifecycle Fields

#### `active` (boolean, required)
**Purpose**: Whether rule is currently enforced

**Values**:
- `true` - Rule is active and enforced
- `false` - Rule is deactivated (but preserved for history)

**Usage**: Framework only enforces active rules

---

#### `metadata` (object, required)
**Purpose**: Rule lifecycle tracking

**Fields** (`deactivationReason` and `notes` are optional and may be `null`):
```json
{
  "created": "2025-11-02",
  "lastValidated": "2025-11-02",
  "validationCount": 0,
  "violationCount": 0,
  "deactivated": false,
  "deactivationReason": null,
  "notes": null
}
```

**Field Definitions**:
- `created`: ISO 8601 date (e.g., `2025-11-02`) when rule was first created
- `lastValidated`: ISO 8601 date when rule compliance was last verified
- `validationCount`: Number of times compliance was checked
- `violationCount`: Number of violations detected
- `deactivated`: Boolean - if true, rule is inactive
- `deactivationReason`: String explaining why rule was deactivated
- `notes`: Free-form notes about rule evolution

---

## 📐 COMPLETE EXAMPLE

```json
{
  "id": "inst_fh_multi_001",
  "title": "TenantId Filtering Mandatory for All Queries",
  "category": "MULTI_TENANCY",
  "quadrant": "STRATEGIC",
  "persistence": "HIGH",
  "temporal_scope": "PROJECT",
  "verification_required": "MANDATORY",

  "description": "ALL database queries MUST filter by tenantId. No exceptions. Every find(), findOne(), update(), delete() operation must include {tenantId: tenantId} filter to prevent cross-tenant data leakage.",

  "context": "Multi-tenant architecture isolates family data. Missing tenantId filters expose one family's data to another family, violating privacy, GDPR, and data sovereignty principles.",

  "rationale": "Tenant isolation is the foundation of data security in multi-tenant SaaS. Single missing filter = complete data breach for affected families. Must be enforced at query level, not application level.",

  "trigger": "Writing any MongoDB query: db.collection().find(), findOne(), updateOne(), updateMany(), deleteOne(), deleteMany(), aggregate()",

  "action": "Include tenantId in ALL queries: `db.collection('contributions').find({tenantId: req.session.tenantId, ...otherFilters})` - No query without tenantId",

  "validation": "Code review must verify: grep for 'find({})', 'findOne({})', 'updateMany({})', 'deleteMany({})' - ALL must have tenantId filter",

  "evidence": "CLAUDE.md Rule #4: 'NEVER violate multi-tenant isolation - all queries filter by tenantId'",

  "explicitness": 0.95,
  "source": "user",
  "session_id": "2025-11-01-001",

  "parameters": {
    "queryMethods": ["find", "findOne", "updateOne", "updateMany", "deleteOne", "deleteMany", "aggregate"],
    "requiredFilter": "tenantId",
    "source": "req.session.tenantId"
  },

  "relatedInstructions": ["inst_fh_multi_002", "inst_fh_multi_005", "inst_fh_privacy_003"],
  "supersedes": [],

  "active": true,

  "metadata": {
    "created": "2025-11-01",
    "lastValidated": "2025-11-02",
    "validationCount": 15,
    "violationCount": 2,
    "deactivated": false,
    "deactivationReason": null,
    "notes": "Critical rule - never deactivate without replacement"
  }
}
```

---

## 🔄 MIGRATION GUIDE

### From v1.0 (tractatus) to v3.0

**Mapping**:
```
v1.0 Field          → v3.0 Field(s)
----------------------------------------
id                  → id
text                → description (primary), + context, rationale, action
timestamp           → metadata.created
quadrant            → quadrant
persistence         → persistence
temporal_scope      → temporal_scope
verification_required → verification_required
explicitness        → explicitness
source              → source
session_id          → session_id
parameters          → parameters
active              → active
notes               → metadata.notes
```

**New Fields** (required unless marked optional):
- `title` - Extract from the first sentence of `text`
- `category` - Infer from quadrant/context
- `trigger` - Extract from text or infer
- `validation` - Create from text or mark "Manual review"
- `evidence` - Add "Migrated from v1.0" + any references
- `relatedInstructions` (optional) - Leave empty initially
- `metadata.lastValidated` - Set to migration date
- `metadata.validationCount` - Set to 0
- `metadata.violationCount` - Set to 0
- `metadata.deactivated` - Set to false

**Migration Script**: `scripts/migrate-v1-to-v3.js`
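
The mapping and field list above can be sketched as a single function. This is an illustration only and does not reproduce the contents of `scripts/migrate-v1-to-v3.js`. The function name `migrateV1ToV3`, the sample rule, and the assumption that `timestamp` is an ISO 8601 string are all hypothetical. Fields that need human judgment are left as `null` placeholders for a reviewer to complete.

```javascript
// Hypothetical sketch of the v1.0 → v3.0 mapping described above.
// Assumes v1.timestamp is an ISO 8601 string such as "2025-10-06T09:30:00Z".
function migrateV1ToV3(v1, migrationDate) {
  // Provisional title: text up to the first sentence terminator, max 100 chars
  const firstSentence = v1.text.match(/^[^.!?]*/)[0].trim();

  return {
    id: v1.id,
    title: firstSentence.slice(0, 100),
    category: null, // Placeholder: infer from quadrant/context during review
    quadrant: v1.quadrant,
    persistence: v1.persistence,
    temporal_scope: v1.temporal_scope,
    verification_required: v1.verification_required,
    description: v1.text,
    context: null, // Placeholders: split out of v1 text during review
    rationale: null,
    trigger: null,
    action: null,
    validation: 'Manual review',
    evidence: 'Migrated from v1.0',
    explicitness: v1.explicitness,
    source: v1.source,
    session_id: v1.session_id ?? null,
    parameters: v1.parameters ?? {},
    relatedInstructions: [],
    active: v1.active,
    metadata: {
      created: String(v1.timestamp).slice(0, 10), // Keep the date portion only
      lastValidated: migrationDate,
      validationCount: 0,
      violationCount: 0,
      deactivated: false, // As specified in the list above
      deactivationReason: null,
      notes: v1.notes ?? null,
    },
  };
}

// Made-up v1.0 rule for demonstration
const v1Rule = {
  id: 'inst_001',
  text: 'Database port MUST be 27027. Never fall back to the default port.',
  timestamp: '2025-10-06T09:30:00Z',
  quadrant: 'SYSTEM',
  persistence: 'HIGH',
  temporal_scope: 'PROJECT',
  verification_required: 'MANDATORY',
  explicitness: 1.0,
  source: 'user',
  session_id: '2025-10-06-001',
  parameters: { port: '27027' },
  active: true,
  notes: null,
};

const migrated = migrateV1ToV3(v1Rule, '2025-11-02');
console.log(migrated.title);            // → Database port MUST be 27027
console.log(migrated.metadata.created); // → 2025-10-06
```

A migrated rule with `null` placeholders will not pass the length constraints in the validation rules until a reviewer fills them in, which makes incomplete migrations easy to spot. Inactive v1.0 rules would also need `metadata.deactivated` reconciled with `active` before they satisfy the relationship validation rules.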

---

### From v2.0 (family-history enhanced) to v3.0

**Mapping**:
```
v2.0 Field          → v3.0 Field(s)
----------------------------------------
id                  → id
title               → title
category            → category
quadrant            → quadrant
persistence         → persistence
description         → description
context             → context
rationale           → rationale
trigger             → trigger
action              → action
validation          → validation
evidence            → evidence
relatedInstructions → relatedInstructions
metadata            → metadata (same structure)
```

**New Fields** (required unless marked optional):
- `temporal_scope` - Infer from persistence: HIGH → PROJECT, MEDIUM → PHASE, LOW → TASK
- `verification_required` - Infer from category: SECURITY/MULTI_TENANCY → MANDATORY, others → RECOMMENDED
- `explicitness` - Calculate from specificity of action field (0.0-1.0)
- `source` - Set to "migration"
- `session_id` (optional) - Set to null or migration session
- `parameters` (optional) - Extract from action field (structured data)
- `supersedes` (optional) - Leave empty
- `active` - Set to !metadata.deactivated

**Migration Script**: `scripts/migrate-v2-to-v3.js`
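
The inference rules above translate directly into code. The sketch below is illustrative and is not the contents of `scripts/migrate-v2-to-v3.js`. The function name `migrateV2ToV3` is an assumption, `explicitness` is passed in by the caller because the specification gives no formula for calculating it, and `parameters` is left empty for manual extraction.

```javascript
// Hypothetical sketch of the v2.0 → v3.0 inference rules described above.
const SCOPE_BY_PERSISTENCE = { HIGH: 'PROJECT', MEDIUM: 'PHASE', LOW: 'TASK' };
const MANDATORY_CATEGORIES = new Set(['SECURITY', 'MULTI_TENANCY']);

function migrateV2ToV3(v2, explicitness, sessionId = null) {
  // The caller supplies explicitness; reject anything outside 0.0-1.0
  if (typeof explicitness !== 'number' || !(explicitness >= 0 && explicitness <= 1)) {
    throw new RangeError('explicitness must be a number between 0.0 and 1.0');
  }

  return {
    ...v2, // v2.0 fields carry over unchanged
    temporal_scope: SCOPE_BY_PERSISTENCE[v2.persistence],
    verification_required: MANDATORY_CATEGORIES.has(v2.category)
      ? 'MANDATORY'
      : 'RECOMMENDED',
    explicitness,
    source: 'migration',
    session_id: sessionId,
    parameters: {}, // Extraction from the action field is left to a reviewer
    supersedes: [],
    active: !v2.metadata.deactivated,
  };
}

// Minimal made-up v2.0 rule showing only the fields the inference reads
const v2Rule = {
  id: 'inst_fh_multi_001',
  category: 'MULTI_TENANCY',
  persistence: 'HIGH',
  metadata: { deactivated: false },
};

const v3Rule = migrateV2ToV3(v2Rule, 0.95, '2025-11-02-001');
console.log(v3Rule.temporal_scope, v3Rule.verification_required, v3Rule.active);
// → PROJECT MANDATORY true
```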

---

## 🤖 FRAMEWORK USAGE

### Hook Integration

Framework hooks use schema fields for automated enforcement:

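```javascript
// Example: Pre-action validation hook.
// loadInstruction() and logViolation() are assumed to be supplied by the
// framework; preActionHook is an illustrative name.
// Usage: preActionHook('inst_fh_sec_001', fileContent)
function preActionHook(instructionId, fileContent) {
  const instruction = loadInstruction(instructionId);

  // Only active rules with MANDATORY verification can block an operation
  if (!instruction.active || instruction.verification_required !== 'MANDATORY') {
    return { decision: 'allow' };
  }

  // Extract automated check parameters
  const patterns = instruction.parameters?.prohibitedPatterns || [];

  // Scan file content
  for (const pattern of patterns) {
    if (new RegExp(pattern).test(fileContent)) {
      // Violation found - log it using the rule's own fields
      logViolation({
        instructionId: instruction.id,
        title: instruction.title,
        severity: instruction.quadrant === 'STRATEGIC' ? 'CRITICAL' : 'HIGH',
        evidence: instruction.evidence,
        action: instruction.action
      });

      // Increment violation count (in memory only; persisting is up to the caller)
      instruction.metadata.violationCount++;

      // Block operation if MANDATORY
      return { decision: 'deny', reason: instruction.description };
    }
  }

  // No prohibited pattern matched ('allow' is an assumed counterpart to 'deny')
  return { decision: 'allow' };
}
```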

### Query Examples

**Find all MANDATORY security rules**:
```javascript
const mandatorySecurityRules = instructions.filter(i =>
  i.category === 'SECURITY' &&
  i.verification_required === 'MANDATORY' &&
  i.active === true
);
```

**Find rules needing validation review** (>90 days since last validation):
```javascript
const needsReview = instructions.filter(i => {
  const daysSinceValidation = (Date.now() - new Date(i.metadata.lastValidated).getTime()) / (1000 * 60 * 60 * 24);
  return daysSinceValidation > 90 && i.active === true;
});
```

**Find high-violation rules**:
```javascript
const problematicRules = instructions.filter(i =>
  i.metadata.violationCount > 10 &&
  i.active === true
).sort((a, b) => b.metadata.violationCount - a.metadata.violationCount);
```

---

## ✅ VALIDATION RULES

Schema v3.0 includes validation requirements:

### Required Field Validation
- All fields marked "required" MUST be present
- String fields MUST NOT be empty strings
- Enum fields MUST use allowed values only
- Numbers MUST be in specified ranges

### Field Constraints
- `id`: Must match pattern `inst_[a-z0-9_]+`
- `title`: 5-100 characters
- `description`: 50-500 characters
- `context`: 100-1000 characters
- `rationale`: 100-2000 characters
- `trigger`: 50-1000 characters
- `action`: 100-2000 characters
- `validation`: 50-1500 characters
- `evidence`: 50-1000 characters
- `explicitness`: 0.0-1.0
- `metadata.created`: Valid ISO date
- `metadata.lastValidated`: Valid ISO date
- `metadata.validationCount`: Non-negative integer
- `metadata.violationCount`: Non-negative integer
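
The constraints above can be checked with a small validator. The following is a sketch, not the framework's validator: the names `LENGTH_LIMITS`, `isIsoDate`, and `validateFieldConstraints` are assumptions, and it assumes dates are date-only ISO 8601 strings (`YYYY-MM-DD`) as in the examples in this specification.

```javascript
// Character-length bounds copied from the Field Constraints list above.
const LENGTH_LIMITS = {
  title: [5, 100],
  description: [50, 500],
  context: [100, 1000],
  rationale: [100, 2000],
  trigger: [50, 1000],
  action: [100, 2000],
  validation: [50, 1500],
  evidence: [50, 1000],
};

// Assumption: dates are date-only ISO 8601 strings (YYYY-MM-DD).
function isIsoDate(value) {
  if (typeof value !== 'string' || !/^\d{4}-\d{2}-\d{2}$/.test(value)) return false;
  const parsed = new Date(`${value}T00:00:00Z`);
  // Reject impossible dates such as 2025-02-30
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString().slice(0, 10) === value;
}

// Hypothetical validator: returns a list of human-readable constraint errors.
function validateFieldConstraints(rule) {
  const errors = [];

  if (typeof rule.id !== 'string' || !/^inst_[a-z0-9_]+$/.test(rule.id)) {
    errors.push('id: must match inst_[a-z0-9_]+');
  }

  for (const [field, [min, max]] of Object.entries(LENGTH_LIMITS)) {
    const value = rule[field];
    if (typeof value !== 'string' || value.length < min || value.length > max) {
      errors.push(`${field}: expected a string of ${min}-${max} characters`);
    }
  }

  const score = rule.explicitness;
  if (typeof score !== 'number' || !(score >= 0 && score <= 1)) {
    errors.push('explicitness: expected a number in 0.0-1.0');
  }

  const metadata = rule.metadata || {};
  for (const field of ['created', 'lastValidated']) {
    if (!isIsoDate(metadata[field])) {
      errors.push(`metadata.${field}: expected a valid ISO date`);
    }
  }
  for (const field of ['validationCount', 'violationCount']) {
    if (!Number.isInteger(metadata[field]) || metadata[field] < 0) {
      errors.push(`metadata.${field}: expected a non-negative integer`);
    }
  }

  return errors;
}

// An incomplete rule produces one error per failed constraint
console.log(validateFieldConstraints({ id: 'Rule 1', title: 'Hi' }));
```

Collecting all errors instead of stopping at the first one gives rule authors a complete list to fix in one pass.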

### Relationship Validation
- `relatedInstructions`: All IDs must exist
- `supersedes`: All IDs must exist and be deactivated
- If `active === false`, `metadata.deactivated` must be `true`
- If `metadata.deactivated === true`, `deactivationReason` should be provided
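
Unlike the field constraints, the relationship rules need the whole rule set. The sketch below is one possible cross-rule check, with the function name `validateRelationships` and the sample rule IDs as assumptions. The three "must" rules are reported as errors, and the "should be provided" rule is reported as a warning.

```javascript
// Hypothetical cross-rule check for the relationship rules listed above.
function validateRelationships(rules) {
  const byId = new Map(rules.map((rule) => [rule.id, rule]));
  const errors = [];
  const warnings = [];

  for (const rule of rules) {
    // relatedInstructions: all IDs must exist
    for (const relatedId of rule.relatedInstructions || []) {
      if (!byId.has(relatedId)) {
        errors.push(`${rule.id}: related rule ${relatedId} does not exist`);
      }
    }

    // supersedes: all IDs must exist and be deactivated
    for (const supersededId of rule.supersedes || []) {
      const superseded = byId.get(supersededId);
      if (!superseded) {
        errors.push(`${rule.id}: superseded rule ${supersededId} does not exist`);
      } else if (superseded.metadata.deactivated !== true) {
        errors.push(`${rule.id}: superseded rule ${supersededId} is not deactivated`);
      }
    }

    // active and metadata.deactivated must agree for inactive rules
    if (rule.active === false && rule.metadata.deactivated !== true) {
      errors.push(`${rule.id}: inactive rule must have metadata.deactivated = true`);
    }

    // deactivationReason is recommended, not mandatory
    if (rule.metadata.deactivated === true && !rule.metadata.deactivationReason) {
      warnings.push(`${rule.id}: deactivated rule has no deactivationReason`);
    }
  }

  return { errors, warnings };
}

// Made-up rule set: one dangling reference and one missing reason
const ruleSet = [
  {
    id: 'inst_demo_sec_001',
    active: false,
    metadata: { deactivated: true, deactivationReason: null },
  },
  {
    id: 'inst_demo_sec_002',
    active: true,
    relatedInstructions: ['inst_demo_sec_999'],
    supersedes: ['inst_demo_sec_001'],
    metadata: { deactivated: false },
  },
];

console.log(validateRelationships(ruleSet));
```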

---

## 📊 SCHEMA EVOLUTION

### Version History

| Version | Date | Changes | Projects |
|---------|------|---------|----------|
| **v1.0** | Oct 2025 | Original tractatus schema | tractatus |
| **v2.0** | Nov 2025 | Enhanced family-history schema | family-history |
| **v3.0** | Nov 2025 | Unified schema (this spec) | All future projects |

### Deprecation Policy

- v1.0: Supported until Dec 2025, migrate to v3.0
- v2.0: Supported until Mar 2026, migrate to v3.0
- v3.0: Current standard

### Future Considerations

Potential v4.0 additions (not committed):
- `impact_analysis`: Automated impact assessment
- `test_cases`: Automated test generation
- `remediation_steps`: Automated fix suggestions
- `compliance_mappings`: GDPR/SOC2/etc. mapping
- `ai_generated`: Flag for AI-generated rules

---

## 📄 LICENSE & COPYRIGHT

**License**: Apache License 2.0
**Copyright**: © 2025 John G Stroh. All rights reserved.

This specification is part of the Tractatus Governance Framework.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at:

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

---

**Specification Version**: 3.0.0
**Approved By**: John G Stroh
**Approval Date**: November 2, 2025
**Next Review**: February 2, 2026