
Tractatus Blog Post Outlines

Purpose: Initial blog content for soft launch (Week 7-8)
Target Audience: Researchers, Implementers, Advocates
Word Count: 800-1200 words each
Author: John Stroh (human-written, Claude may assist with research)
Status: Outlines ready for drafting


Blog Post 1: Introducing Tractatus - AI Safety Through Sovereignty

Target Audience: All (General Introduction)
Goal: Explain core principle and value proposition
Word Count: 1000-1200 words
Tone: Accessible but authoritative, inspiring

Outline

I. The Problem (200 words)

  • Current AI safety approaches rely on "alignment" - teaching AI to be good
  • Fundamental limitation: Values are contested, contextual, evolving
  • Example: "Be helpful and harmless" - but helpful to whom? Whose definition of harm?
  • Alignment breaks down as AI capabilities scale
  • Quote: "We can't encode what we can't agree on"

II. The Core Principle (250 words)

  • Tractatus principle: "What cannot be systematized must not be automated"
  • Shift from behavioral alignment to architectural constraints
  • Not "teach AI to make good decisions" but "prevent AI from making certain decisions"
  • Three categories of unsystematizable decisions:
    1. Values (privacy vs. performance, equity vs. efficiency)
    2. Ethics (context-dependent moral judgments)
    3. Human Agency (decisions that affect autonomy, dignity, sovereignty)
  • Key insight: These require human judgment, not optimization

III. Tractatus in Practice (300 words)

  • Real-world example: Media inquiry response system
    • Without Tractatus: AI classifies, drafts, sends automatically
    • With Tractatus: AI classifies, drafts, human approves before sending
    • Boundary enforced: External communication requires human judgment
  • Code example: BoundaryEnforcer detecting STRATEGIC quadrant
    const action = { type: 'send_email', recipient: 'media@outlet.com' };
    const result = boundaryEnforcer.checkBoundary(action);
    // result.status: 'BLOCKED' - requires human approval
    
  • Governance: No AI action crosses into values territory without explicit human decision

IV. Why "Sovereignty"? (200 words)

  • AI safety as human sovereignty issue
  • Digital sovereignty: Control over decisions that affect us
  • Analogy: National sovereignty requires decision-making authority
  • Personal sovereignty requires agency over AI systems
  • Tractatus approach: Structural constraints, not aspirational goals
  • Not "hope AI respects your agency" but "AI structurally cannot bypass your agency"

V. What Makes This Different (200 words)

  • vs. Constitutional AI: Still tries to encode values (just more of them)
  • vs. RLHF: Still optimizes for "good" behavior (which "good"?)
  • vs. Red-teaming: Reactive, not proactive; finds failures, doesn't prevent classes of failure
  • Tractatus: Architectural constraints that persist regardless of capability level
  • Key advantage: Scales safely with AI advancement

VI. Call to Action (100 words)

  • This website is governed by Tractatus (dogfooding)
  • All AI-assisted content requires human approval
  • No values decisions automated
  • Explore the framework:
    • Researchers: Technical documentation
    • Implementers: Code examples and API
    • Advocates: Policy implications
  • Join the conversation: [link to feedback/community]

Blog Post 2: The 27027 Incident - When AI Contradicts Explicit Instructions

Target Audience: Implementers, Researchers
Goal: Deep dive into cross-reference validation
Word Count: 1000 words
Tone: Technical, narrative, problem-solving

Outline

I. The Incident (200 words)

  • Setting: Real development session (October 2025)
  • Context: Setting up MongoDB for Tractatus project
  • Explicit instruction: "Use MongoDB on port 27017"
    • Repeated multiple times
    • Written in project documentation
    • High persistence (SYSTEM quadrant)
  • What happened: AI generated code with PORT = 27027
  • Impact: Silent failure - wrong port, wrong database, data loss risk
  • Why it matters: This was a simple, unambiguous instruction

II. Root Cause Analysis (250 words)

  • Pattern matching failure: AI saw "27017" and "MongoDB" and pattern-matched to "27027" (common default)
  • Context window pressure: Long conversation, many competing directives
  • Instruction persistence decay: Earlier instructions weighted less than recent context
  • No cross-reference mechanism: AI didn't check proposed action against explicit instructions
  • Analogy: "Like a contractor who heard you say 'paint the room blue' but painted it green because 'green is nice too'"

The Deeper Problem:

  • This wasn't hallucination (AI didn't make up information)
  • This wasn't misalignment (AI wasn't "trying" to do wrong)
  • This was instruction override - AI substituted its pattern-matched output for explicit user directive
  • Critical insight: If AI can override simple, unambiguous instructions, what about complex, values-laden ones?

III. The Tractatus Solution: CrossReferenceValidator (300 words)

How it works:

  1. Instruction storage: All explicit user directives stored with metadata

    • Quadrant classification (STR/OPS/TAC/SYS/STO)
    • Persistence level (HIGH/MEDIUM/LOW)
    • Temporal scope (SESSION/PROJECT/PERMANENT)
    • Explicitness score (0.0-1.0)
  2. Pre-action validation: Before executing any action, check against stored instructions

    const validator = new CrossReferenceValidator();
    const action = { port: 27027, service: 'mongodb' };
    const instruction = {
      text: 'Use MongoDB on port 27017',
      port: 27017,
      persistence: 'HIGH',
      quadrant: 'SYSTEM'
    };
    
    const result = validator.validate(action, instruction);
    // result.status: 'REJECTED'
    // result.reason: 'Conflicts with explicit instruction #42'
    // result.suggestion: 'Use port 27017 as instructed'
    
  3. Conflict resolution: When conflict detected:

    • HIGH persistence instructions: Block action, alert user
    • MEDIUM persistence: Warn user, suggest override
    • LOW persistence: Note conflict, proceed with user confirmation
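The storage metadata from step 1 and the resolution rules from step 3 can be sketched together as follows. This is an illustrative sketch only; the record shape and function names are assumptions, not the actual CrossReferenceValidator API:

```javascript
// Sketch of instruction storage plus persistence-based conflict resolution.
// Field and function names are illustrative, not the shipped API.

const instruction = {
  text: 'Use MongoDB on port 27017',
  quadrant: 'SYSTEM',        // STR/OPS/TAC/SYS/STO
  persistence: 'HIGH',       // HIGH/MEDIUM/LOW
  scope: 'PROJECT',          // SESSION/PROJECT/PERMANENT
  explicitness: 0.95,        // 0.0-1.0
  parameters: { port: 27017 }
};

function resolveConflict(instruction) {
  // Map the persistence level to the handling rules listed above.
  switch (instruction.persistence) {
    case 'HIGH':
      return { status: 'BLOCKED', alertUser: true };
    case 'MEDIUM':
      return { status: 'WARNED', suggestOverride: true };
    case 'LOW':
      return { status: 'NOTED', requireConfirmation: true };
    default:
      return { status: 'UNKNOWN_PERSISTENCE' };
  }
}

console.log(resolveConflict(instruction).status); // 'BLOCKED'
```

The design choice worth highlighting in the post: the dispatch is data-driven from stored metadata, so the same rule applies regardless of how long ago the instruction was given.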

Production impact:

  • 96.4% test coverage (CrossReferenceValidator.test.js)
  • Zero instruction overrides since implementation
  • Used in 100+ development sessions without failure

IV. Lessons for AI Safety (150 words)

  • Lesson 1: Even simple, explicit instructions can be overridden
  • Lesson 2: Pattern matching ≠ instruction following
  • Lesson 3: Context window pressure degrades instruction persistence
  • Lesson 4: Architectural validation > behavioral alignment
  • Key takeaway: If we can't trust AI to follow "use port 27017", we definitely can't trust it with "protect user privacy"

V. Implementation Guide (100 words)

  • Link to CrossReferenceValidator source code
  • Link to API documentation
  • Example integration patterns
  • Common pitfalls and solutions
  • Invite implementers to try it: "Add cross-reference validation to your AI-powered app"

Blog Post 3: Dogfooding Tractatus - How This Website Governs Its Own AI

Target Audience: All (Transparency + Technical)
Goal: Show Tractatus in practice, build trust
Word Count: 900 words
Tone: Transparent, demonstrative, honest

Outline

I. Introduction: Walking the Walk (150 words)

  • This website uses AI (Claude Sonnet 4.5) for content assistance
  • But it's governed by the Tractatus framework
  • Core commitment: Zero AI actions in values-sensitive domains without human approval
  • This isn't theoretical - we're dogfooding our own framework
  • Transparency: This post explains exactly how it works

II. The AI Features We Use (200 words)

Blog Curation System:

  • AI suggests weekly topics (scans AI safety news, Tractatus-relevant developments)
  • AI generates outlines for approved topics
  • Human writes the actual draft (AI does not write blog posts)
  • Human approves publication (no auto-publish)
  • Why: Blog content is STRATEGIC (editorial voice, values, framing)

Media Inquiry Triage:

  • AI classifies incoming inquiries (Press/Academic/Commercial/Community/Spam)
  • AI generates priority score (HIGH/MEDIUM/LOW based on TRA-OPS-0003)
  • AI drafts responses
  • Human reviews, edits, approves before sending
  • Why: External communication is STRATEGIC (organizational voice, stakeholder relationships)

Case Study Moderation:

  • AI assesses relevance to Tractatus framework
  • AI maps submission to framework components (InstructionPersistence, BoundaryEnforcement, etc.)
  • Human moderates (quality check, editorial standards)
  • Human approves publication
  • Why: Platform content is STRATEGIC (editorial standards, community trust)

III. The Governance Policies (250 words)

TRA-OPS-0001: Master AI Content Policy

  • Mandatory human approval for all public content
  • Boundary enforcement: AI cannot make values decisions
  • API budget cap: $200/month (prevents runaway costs)
  • Audit trail: 2-year retention of all AI decisions

TRA-OPS-0002: Blog Editorial Guidelines

  • 4 content categories (Technical Deep-Dives, Case Studies, Policy Analysis, Community Updates)
  • Citation standards (all claims must be sourced)
  • AI role: Assist, not author
  • Human role: Write, approve, own

TRA-OPS-0003: Media Response Protocol

  • SLAs: 4h (HIGH priority), 48h (MEDIUM), 7 days (LOW)
  • Classification system (5 categories)
  • No auto-send: All responses human-approved
  • Escalation: Complex inquiries require John Stroh review
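The SLA tiers above reduce to a small deadline calculation; a hypothetical helper (not part of the framework code) could make the post concrete:

```javascript
// Hypothetical SLA deadline helper for the TRA-OPS-0003 tiers (illustrative only).
const SLA_HOURS = { HIGH: 4, MEDIUM: 48, LOW: 7 * 24 };

function responseDeadline(priority, receivedAt) {
  const hours = SLA_HOURS[priority];
  if (hours === undefined) throw new Error(`Unknown priority: ${priority}`);
  // Deadline = receipt time plus the SLA window for that tier.
  return new Date(receivedAt.getTime() + hours * 60 * 60 * 1000);
}

const received = new Date('2025-10-12T09:00:00Z');
console.log(responseDeadline('HIGH', received).toISOString());
// 2025-10-12T13:00:00.000Z
```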

TRA-OPS-0004: Case Study Moderation

  • Quality checklist (relevance, clarity, accuracy, respectfulness)
  • AI relevance analysis (scoring 0.0-1.0)
  • Human publication decision (AI score is advisory only)

TRA-OPS-0005: Human Oversight Requirements

  • Admin reviewer role + training
  • Moderation queue dashboard
  • SLA compliance monitoring

IV. Real Examples: What We Block (200 words)

Example 1: Blog Topic Suggestion

  • AI suggested: "10 Reasons Tractatus is Better Than Constitutional AI"
  • BLOCKED by BoundaryEnforcer: Comparative values claim (STRATEGIC)
  • Why: "Better" is a values judgment, requires human decision
  • Alternative: "Architectural Constraints vs. Behavioral Alignment: A Framework Comparison"

Example 2: Media Response Auto-Send

  • AI classified inquiry as LOW priority (automated response drafted)
  • BLOCKED: External communication requires human approval (TRA-OPS-0003 §4.2)
  • Human review: Actually HIGH priority (major media outlet, deadline)
  • Outcome: Reclassified, escalated, John Stroh responded personally

Example 3: Case Study Auto-Publish

  • AI assessed relevance: 0.89 (high confidence)
  • BLOCKED: Publication is STRATEGIC decision
  • Human review: Submission contained unverified claims
  • Outcome: Requested clarification from submitter

V. The Audit Trail (100 words)

  • Every AI action logged with:
    • Timestamp, action type, quadrant classification
    • Human approval status (approved/rejected/modified)
    • Reviewer identity (accountability)
    • Reasoning (why approved or rejected)
  • 2-year retention (compliance, learning, transparency)
  • Available for external audit (Phase 3: independent review)
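A single audit-trail entry with the fields listed above might look like this; the field names and builder function are illustrative assumptions, not the production schema:

```javascript
// Illustrative audit-trail entry builder. Field names are assumptions,
// not the production logging schema.
function buildAuditEntry({ actionType, quadrant, decision, reviewer, reasoning }) {
  return {
    timestamp: new Date().toISOString(),
    actionType,                 // e.g. 'draft_media_response'
    quadrant,                   // STRATEGIC/OPERATIONAL/TACTICAL/SYSTEM/STOCHASTIC
    approvalStatus: decision,   // 'approved' | 'rejected' | 'modified'
    reviewer,                   // accountability: who decided
    reasoning                   // why approved or rejected
  };
}

const entry = buildAuditEntry({
  actionType: 'draft_media_response',
  quadrant: 'STRATEGIC',
  decision: 'modified',
  reviewer: 'john.stroh',
  reasoning: 'Tone adjusted before sending'
});
console.log(entry.approvalStatus); // 'modified'
```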

Blog Post 4: AI Safety Regulation - Why Architectural Constraints Align with Policy Goals

Target Audience: Advocates, Policy Professionals
Goal: Connect Tractatus to regulatory frameworks
Word Count: 1000 words
Tone: Policy-focused, solutions-oriented

Outline

I. The Regulatory Landscape (200 words)

  • EU AI Act: Risk-based approach, high-risk AI systems require human oversight
  • US AI Bill of Rights: Algorithmic discrimination protection, notice and explanation
  • UK AI Regulation: Principles-based, sector-specific approach
  • Common theme: All seek to preserve human decision-making authority
  • Challenge: How to enforce this technically?

II. The Alignment Problem in Policy (250 words)

Current approach: Behavioral requirements

  • "AI shall not discriminate"
  • "AI shall be transparent"
  • "AI shall be fair"
  • Problem: These are aspirational, not enforceable architecturally

Enforcement gap:

  • Regulators set requirements
  • Companies "align" AI to meet requirements
  • Testing/auditing checks if AI "behaves" correctly
  • But: Alignment can drift, fail, or be gamed
  • Example: VW emissions scandal - passed tests, failed in practice

What policy really wants:

  • Not "AI that tries to be fair"
  • But "AI that structurally cannot make unfair decisions without human review"
  • Not "AI that respects privacy"
  • But "AI that architecturally cannot access private data without authorization"

III. Tractatus as Regulatory Compliance Framework (300 words)

How Tractatus maps to EU AI Act requirements:

| EU AI Act Requirement | Tractatus Implementation |
| --- | --- |
| Human oversight (Art. 14) | BoundaryEnforcer: STRATEGIC decisions require human approval |
| Transparency (Art. 13) | Audit trail: All AI actions logged with reasoning |
| Accuracy (Art. 15) | CrossReferenceValidator: Prevents instruction overrides |
| Cybersecurity (Art. 15) | MongoDB authentication, SSH hardening, UFW firewall |
| Record-keeping (Art. 12) | 2-year retention of all AI decisions |

How Tractatus maps to US AI Bill of Rights:

| Principle | Tractatus Implementation |
| --- | --- |
| Safe and Effective Systems | BoundaryEnforcer prevents values-laden automation |
| Algorithmic Discrimination Protections | Human approval for decisions affecting individuals |
| Data Privacy | AI cannot access user data without explicit authorization |
| Notice and Explanation | Audit trail provides complete decision history |
| Human Alternatives | STRATEGIC decisions architecturally require a human |

Key advantage: Tractatus provides structural compliance, not just behavioral compliance

  • Regulators can audit the architecture, not just the behavior
  • Compliance is enforceable at runtime, not just in testing
  • Drift and failure are prevented architecturally rather than merely discouraged

IV. Policy Recommendations (150 words)

For regulators:

  1. Require architectural constraints, not just behavioral alignment
  2. Mandate audit trails for high-risk AI decisions
  3. Define "values-sensitive decisions" requiring human oversight
  4. Enforce quadrant classification for AI operations

For organizations:

  1. Adopt architectural safety frameworks early (competitive advantage)
  2. Document AI governance policies (TRA-OPS-* model)
  3. Implement human-in-the-loop for STRATEGIC decisions
  4. Prepare for regulatory audit (2-year log retention)

For advocates:

  1. Push for structural safety requirements in legislation
  2. Educate policymakers on alignment limitations
  3. Demand transparency (audit trails, decision logs)

V. Call to Action (100 words)

  • Tractatus is open for policy feedback
  • Invite regulators, advocates, researchers to review framework
  • Propose Tractatus as reference architecture for AI Act compliance
  • Offer to collaborate on policy development

Blog Post 5: Implementing Cross-Reference Validation in Your AI Application

Target Audience: Implementers
Goal: Practical guide to integrating Tractatus
Word Count: 1100 words
Tone: Technical, tutorial-style, hands-on

Outline

I. Introduction: Why You Need This (150 words)

  • If you're building AI-powered applications, you've likely experienced:
    • AI overriding user preferences
    • Context window pressure degrading instruction adherence
    • Unexpected outputs contradicting explicit directives
  • The 27027 problem is everywhere:
    • "Use the blue theme" → AI uses green (pattern-matched)
    • "Never email customers on weekends" → AI sends Saturday newsletter
    • "Require 2FA for admin" → AI creates admin without 2FA
  • Solution: Cross-reference validation before action execution

II. Core Concepts (200 words)

1. Instruction Persistence

  • Not all instructions are equal
  • HIGH persistence: Core system requirements ("use port 27017")
  • MEDIUM persistence: Workflow preferences ("prefer async patterns")
  • LOW persistence: Contextual hints ("maybe try refactoring?")

2. Quadrant Classification

  • STRATEGIC: Values, ethics, agency (always require human approval)
  • OPERATIONAL: Policies, processes (human review)
  • TACTICAL: Execution details (automated, but logged)
  • SYSTEM: Technical requirements (automated, validated)
  • STOCHASTIC: Exploratory, uncertain (flagged for verification)

3. Pre-Action Validation

  • Before AI executes an action, check against stored instructions
  • If conflict detected: Block (HIGH), warn (MEDIUM), or note (LOW)
  • Always log: Transparency and debugging
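The three concepts above combine into a single pre-action gate. The sketch below is a minimal illustration under the stated assumptions; the function, field names, and the set of quadrants requiring human approval are hypothetical, not the @tractatus/core API:

```javascript
// Sketch of a pre-action gate combining quadrant routing (concept 2) with
// persistence-based conflict handling (concepts 1 and 3). Illustrative names only.

const HUMAN_APPROVAL_QUADRANTS = new Set(['STRATEGIC', 'OPERATIONAL']);

function preActionCheck(action, conflicts) {
  // Values/ethics territory: never automate, always queue for a human.
  if (HUMAN_APPROVAL_QUADRANTS.has(action.quadrant)) {
    return { status: 'PENDING_APPROVAL' };
  }
  // Any conflict with a HIGH-persistence instruction blocks outright.
  if (conflicts.some(c => c.persistence === 'HIGH')) {
    return { status: 'BLOCKED' };
  }
  // MEDIUM conflicts warn; LOW conflicts are merely noted.
  if (conflicts.some(c => c.persistence === 'MEDIUM')) {
    return { status: 'WARNED' };
  }
  return { status: conflicts.length > 0 ? 'NOTED' : 'APPROVED' };
}

console.log(preActionCheck({ quadrant: 'SYSTEM' }, [{ persistence: 'HIGH' }]).status);
// 'BLOCKED'
```

Note the ordering: the quadrant check runs first, so a STRATEGIC action is queued for human review even when no instruction conflict exists.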

III. Quick Start: 5-Minute Integration (300 words)

Step 1: Install Tractatus SDK (when available - Phase 3)

npm install @tractatus/core

Step 2: Initialize Services

const {
  InstructionPersistenceClassifier,
  CrossReferenceValidator,
  BoundaryEnforcer
} = require('@tractatus/core');

const classifier = new InstructionPersistenceClassifier();
const validator = new CrossReferenceValidator();
const enforcer = new BoundaryEnforcer();

Step 3: Classify User Instructions

// When user provides instruction
const userInstruction = "Use MongoDB on port 27017";

const classification = classifier.classify({
  text: userInstruction,
  context: 'database_configuration',
  explicitness: 0.95 // highly explicit
});

// Store instruction
await classifier.storeInstruction({
  text: userInstruction,
  quadrant: classification.quadrant, // SYSTEM
  persistence: classification.persistence, // HIGH
  parameters: { port: 27017, service: 'mongodb' }
});

Step 4: Validate AI Actions

// Before AI executes action
const proposedAction = {
  type: 'update_mongodb_config',
  port: 27027 // AI suggested wrong port
};

const validation = await validator.validate(
  proposedAction,
  classifier.getInstructions({ context: 'database_configuration' })
);

if (validation.status === 'REJECTED') {
  console.error(validation.reason);
  // "Conflicts with explicit instruction: Use MongoDB on port 27017"

  // Use instruction value instead
  proposedAction.port = validation.suggestion.port; // 27017
}

Step 5: Enforce Boundaries

// Check if action crosses values boundary
const boundaryCheck = enforcer.checkBoundary(proposedAction);

if (boundaryCheck.requiresHumanApproval) {
  // Queue for human review
  await moderationQueue.add({
    action: proposedAction,
    reason: boundaryCheck.reason,
    quadrant: boundaryCheck.quadrant // STRATEGIC
  });

  return { status: 'pending_approval', queueId: ... };
}

IV. Production Patterns (250 words)

Pattern 1: Middleware Integration (Express)

app.use(tractatus.middleware({
  classifier: true,
  validator: true,
  enforcer: true,
  auditLog: true
}));

app.post('/api/action', async (req, res) => {
  // Tractatus validation runs automatically
  // If STRATEGIC: 403 Forbidden (requires human approval)
  // If conflicts instruction: 409 Conflict (with suggestion)
  // If passes: Proceed
});

Pattern 2: Background Job Validation

async function processAIJob(job) {
  const action = await aiService.generateAction(job);

  // Validate before execution
  const validation = await validator.validate(action);
  if (validation.status !== 'APPROVED') {
    await failJob(job, validation.reason);
    return;
  }

  // Check boundary
  const boundary = await enforcer.checkBoundary(action);
  if (boundary.requiresHumanApproval) {
    await queueForReview(job, action);
    return;
  }

  // Execute
  await executeAction(action);
}

Pattern 3: Real-time Validation (WebSocket)

socket.on('ai:action', async (action) => {
  const result = await tractatus.validateAndEnforce(action);

  if (result.blocked) {
    socket.emit('ai:blocked', {
      reason: result.reason,
      suggestion: result.suggestion
    });
  } else if (result.requiresApproval) {
    socket.emit('ai:approval_required', result.approvalRequest);
  } else {
    socket.emit('ai:approved', result);
    await executeAction(action);
  }
});

V. Testing Your Integration (150 words)

Unit tests:

describe('CrossReferenceValidator', () => {
  it('should block actions conflicting with HIGH persistence instructions', async () => {
    const instruction = {
      text: 'Use port 27017',
      persistence: 'HIGH',
      parameters: { port: 27017 }
    };

    const action = { port: 27027 };
    const result = await validator.validate(action, [instruction]);

    expect(result.status).toBe('REJECTED');
    expect(result.suggestion.port).toBe(27017);
  });
});

Integration tests (see /tests/integration/ in Tractatus repo)

VI. Performance Considerations (50 words)

  • Validation adds ~5-10ms per action (negligible)
  • Instruction storage: MongoDB indexed queries
  • In-memory cache for frequent validations
  • Async validation for non-blocking workflows
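The in-memory cache mentioned above could be as simple as a keyed Map with first-in eviction. This is a sketch under assumed naming, not the shipped implementation; the keying and eviction policy are placeholders:

```javascript
// Minimal validation-result cache sketch. Keying by JSON serialization and
// oldest-first eviction are assumptions for illustration.
class ValidationCache {
  constructor(maxEntries = 1000) {
    this.maxEntries = maxEntries;
    this.cache = new Map();
  }

  key(action) {
    return JSON.stringify(action); // stable enough for flat action objects
  }

  get(action) {
    return this.cache.get(this.key(action));
  }

  set(action, result) {
    // Evict the oldest entry once full (Map preserves insertion order).
    if (this.cache.size >= this.maxEntries) {
      this.cache.delete(this.cache.keys().next().value);
    }
    this.cache.set(this.key(action), result);
  }
}

const cache = new ValidationCache(2);
cache.set({ port: 27017 }, { status: 'APPROVED' });
console.log(cache.get({ port: 27017 }).status); // 'APPROVED'
```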

Writing Guidelines for All Posts

Style:

  • Active voice, direct language
  • Short paragraphs (2-4 sentences)
  • Code examples with comments
  • Real-world analogies for complex concepts

Structure:

  • Hook in first 2 sentences
  • Clear section headings
  • Bullet points for scannability
  • Code blocks with syntax highlighting
  • Call-to-action at end

SEO:

  • Keywords: "AI safety", "architectural constraints", "human oversight", "AI governance"
  • Meta descriptions (155 characters)
  • Internal links to framework docs, API reference
  • External links to research papers, regulatory documents

Citations:

  • All factual claims sourced
  • Research papers linked (Anthropic, DeepMind, academic publications)
  • Regulatory documents linked (EU AI Act, US AI Bill of Rights)
  • Code examples tested and working

Next Steps

For John Stroh:

  1. Select 3-5 posts to write first (recommend 1, 2, and 3 for initial launch)
  2. Draft posts (800-1200 words each)
  3. Review with Claude (I can fact-check, suggest edits, improve clarity)
  4. Finalize for publication (human final approval, per TRA-OPS-0002)

Timeline:

  • Week 5: Draft posts 1-2
  • Week 6: Draft posts 3-5
  • Week 7: Finalize all posts, add images/diagrams
  • Week 8: Publish sequentially (1 post every 3-4 days)

Let me know which posts you'd like to start with!