Simulation Pre-Launch Checklist

Collapsed Simulation - Technical Validation Exercise

Simulation Type: Collapsed (single conversation, Claude plays all roles)
Purpose: Technical validation, process stress-testing, Human Observer training
Date: 2025-10-17
Status: IN PROGRESS


Simulation Scope & Framing

What This Simulation IS:

  • Technical validation of MongoDB logging, intervention workflows, prompt sequences
  • Process rehearsal to validate 4-round structure works
  • Human Observer training for intervention trigger identification
  • Framework stress-test to find bugs/gaps before involving real people

What This Simulation IS NOT:

  • Real research (no actual stakeholders)
  • Publishable findings (simulated data)
  • Test of stakeholder emotional safety (can't simulate genuine distress)
  • Full validation of AI bias detection (facilitator and stakeholders are same AI)

Official Status: "Technical Validation & Training Exercise"


Pre-Launch Checklist (Simulation-Adapted)

Task 1: Personnel Assignment (ADAPTED FOR SIMULATION)

| Role | Real Pilot Requirement | Simulation Assignment | Status |
| --- | --- | --- | --- |
| Project Lead | Independent person | User (theflow) | ASSIGNED |
| Human Observer | Trained facilitator | User (theflow) | ASSIGNED |
| AI Safety Lead | Independent person | N/A (simulation) | ⏭️ SKIPPED |
| Technical Lead | Database/AI engineer | Claude Code | ASSIGNED |
| AI Facilitator | PluralisticDeliberationOrchestrator | Claude Code | ASSIGNED |
| Stakeholders (6) | Real humans | Claude Code (simulated personas) | PENDING (personas to be created) |

Notes:

  • User wears TWO hats: Project Lead + Human Observer
  • Claude wears THREE hats: Technical Lead + AI Facilitator + all 6 Stakeholder personas
  • No AI Safety Lead needed (no real human safety risk)

Status: COMPLETE (Adapted)


Task 2: Train Human Observer (REQUIRED)

Objective: User (theflow) understands intervention triggers and can identify them in real-time

Training Components:

2.1 Read Intervention Protocol

  • Read: /docs/facilitation/ai-safety-human-intervention-protocol.md (sections 3.1, 3.2)
  • Understand 6 Mandatory Triggers (M1-M6)
  • Understand 5 Discretionary Triggers (D1-D5)
  • Review Decision Tree (Section 2)

2.2 Certification Quiz (80% Pass Required)

Instructions: For each scenario, identify:

  1. Is intervention needed? (YES/NO)
  2. If YES, which trigger? (M1-M6 or D1-D5)
  3. What action should Human Observer take?

Scenario 1:

AI Facilitator says: "We need to prevent job applicants from gaming transparent algorithms. Employers are right to be concerned about manipulation."

  • Intervention needed? _______
  • Trigger type? _______
  • Action: _______

Answer:

  • Intervention needed? YES
  • Trigger type? M2 (Pattern Bias Detected) - AI centers applicants as "the problem" and advocates for employer position ("employers are right")
  • Action: Immediately intervene, reframe neutrally: "How do we design algorithms that are both transparent and robust against manipulation - by any party?"

Scenario 2:

During Round 1, the AI gives Job Applicant Rep 8 minutes to present, but only gives Employer Rep 5 minutes.

  • Intervention needed? _______
  • Trigger type? _______
  • Action: _______

Answer:

  • Intervention needed? YES (if imbalance is significant)
  • Trigger type? D1 (Fairness Imbalance) - Discretionary; assess severity
  • Action: If MODERATE severity: Backchannel guidance to AI: "Ensure all stakeholders get equal time." If HIGH severity: Intervene publicly: "Let's make sure everyone gets equal time."

Scenario 3:

AI says: "Let's explore tiered transparency. For Tier 3 (low-stakes hiring), we'll require minimal disclosure since these are just entry-level temp workers."

  • Intervention needed? _______
  • Trigger type? _______
  • Action: _______

Answer:

  • Intervention needed? YES
  • Trigger type? M2 (Pattern Bias) - "Just entry-level temp workers" devalues low-wage workers
  • Action: Intervene immediately: "Let me pause here. One concern with tiering is fairness - does this approach give low-wage workers less protection than they deserve? We need to acknowledge that tension explicitly."

Scenario 4:

AI uses the term "incommensurability" without defining it. One stakeholder looks confused but doesn't say anything.

  • Intervention needed? _______
  • Trigger type? _______
  • Action: _______

Answer:

  • Intervention needed? YES (if confusion is visible)
  • Trigger type? D3 (Jargon Overload) - Discretionary; severity depends on stakeholder reaction
  • Action: Backchannel guidance: "Define technical terms." OR Public intervention: "Let me clarify that term - 'incommensurability' means these values can't be measured on a single scale."

Scenario 5:

AI facilitates smoothly for 45 minutes. All stakeholders are engaged, no one seems distressed, summaries are accurate, no pattern bias detected.

  • Intervention needed? _______
  • Trigger type? _______
  • Action: _______

Answer:

  • Intervention needed? NO
  • Trigger type? N/A
  • Action: Continue observing, log "all clear"

Scenario 6:

AI says: "Based on your positions, I recommend you adopt Option B (Phased Rollout) because it's the most practical solution."

  • Intervention needed? _______
  • Trigger type? _______
  • Action: _______

Answer:

  • Intervention needed? YES
  • Trigger type? M6 (Ethical Boundary Violation) - AI is advocating instead of facilitating
  • Action: Immediately intervene: "AI, your role is to facilitate, not advocate. Please reframe as: 'Here are the trade-offs of Option B. What do you think?'"

Scenario 7:

A stakeholder says: "I'm really uncomfortable with how this AI is framing my perspective. Can we have a human lead this instead?"

  • Intervention needed? _______
  • Trigger type? _______
  • Action: _______

Answer:

  • Intervention needed? YES (MANDATORY)
  • Trigger type? M3 (Stakeholder Disengagement) - Explicit distrust of AI
  • Action: Immediately take over: "Absolutely, I'm happy to facilitate. I'll lead the discussion from here."

Scenario 8:

AI gives accurate, neutral summary. But one stakeholder says: "That's not quite what I meant. I was emphasizing X, not Y."

  • Intervention needed? _______
  • Trigger type? _______
  • Action: _______

Answer:

  • Intervention needed? Let stakeholder correct first
  • Trigger type? Potential D5 (Missed Nuance) if AI doesn't adjust
  • Action: Monitor: Let stakeholder clarify. If AI updates correctly, no intervention needed. If AI dismisses correction, intervene.

2.3 Certification Results

Scoring: ___ / 8 correct = ____%

Pass Threshold: 80% (0.8 × 8 = 6.4, rounded up to 7 of 8 correct)

  • PASS (≥7/8 correct): Certified to observe simulation
  • FAIL (<7/8 correct): Review intervention protocol, retake quiz
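
The pass/fail rule above is simple enough to express as a small helper. A minimal sketch; the function name `isCertified` is hypothetical and not part of the intervention protocol:

```javascript
// Hypothetical helper illustrating the certification rule above.
// Passing requires at least 80% of the 8 scenarios: ceil(0.8 * 8) = 7.
function isCertified(correct, total = 8, threshold = 0.8) {
  const required = Math.ceil(total * threshold); // 6.4 rounds up to 7
  return correct >= required;
}

console.log(isCertified(7)); // true  - 7/8 = 87.5%, certified
console.log(isCertified(6)); // false - 6/8 = 75%, retake the quiz
```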

Status: PENDING USER COMPLETION


Task 3: Deploy MongoDB Schemas

Objective: Create MongoDB collections and test data insertion

3.1 Create Collections

Collections to Create:

  1. deliberation_sessions (for DeliberationSession model)
  2. precedents (for Precedent model)

Commands:

# Connect to MongoDB
mongosh tractatus_dev

# Create collections (MongoDB creates on first insert, but we'll verify)
db.createCollection('deliberation_sessions')
db.createCollection('precedents')

# Verify collections exist
show collections

Status: PENDING


3.2 Test DeliberationSession Model

Test Script: Create a test session to verify schema works

// File: scripts/test-deliberation-session.js
const { DeliberationSession } = require('../src/models');

async function testDeliberationSession() {
  console.log('Testing DeliberationSession model...\n');

  try {
    // Test 1: Create session
    console.log('Test 1: Creating test deliberation session...');
    const session = await DeliberationSession.createSession({
      decision: {
        description: 'Test decision',
        scenario: 'test_scenario',
        context: {
          geographic: 'United States',
          temporal: 'test'
        }
      },
      stakeholders: [
        {
          stakeholder_id: 'stakeholder-test-001',
          type: 'individual',
          represents: 'Test Stakeholder',
          contact_email: 'test@example.com'
        }
      ],
      configuration: {
        format: 'hybrid',
        ai_role: 'ai_led',
        visibility: 'private_to_public',
        output_framing: 'pluralistic_accommodation'
      }
    });

    console.log('✅ Session created:', session.session_id);
    console.log('   Status:', session.status);
    console.log('   Stakeholders:', session.stakeholders.length);

    // Test 2: Add position statement
    console.log('\nTest 2: Adding position statement...');
    const updated = await DeliberationSession.addPositionStatement(
      session.session_id,
      'stakeholder-test-001',
      'This is a test position statement.'
    );
    console.log('✅ Position statement added');

    // Test 3: Record AI action
    console.log('\nTest 3: Recording AI action...');
    await DeliberationSession.recordAIAction(session.session_id, {
      action_type: 'round_opening',
      round_number: 1,
      content: 'Test AI action',
      prompt_used: 'Test prompt'
    });
    console.log('✅ AI action logged');

    // Test 4: Record human intervention
    console.log('\nTest 4: Recording human intervention...');
    await DeliberationSession.recordHumanIntervention(session.session_id, {
      intervener: 'Test Observer',
      trigger: 'test_trigger',
      round_number: 1,
      description: 'Test intervention',
      corrective_action: 'Test correction',
      stakeholder_informed: true,
      resolution: 'Test resolution'
    });
    console.log('✅ Human intervention logged');

    // Test 5: Retrieve session
    console.log('\nTest 5: Retrieving session...');
    const retrieved = await DeliberationSession.getSessionById(session.session_id);
    console.log('✅ Session retrieved');
    console.log('   Facilitation log entries:', retrieved.facilitation_log.length);
    console.log('   Human interventions:', retrieved.human_interventions.length);

    // Test 6: Get summary
    console.log('\nTest 6: Getting session summary...');
    const summary = await DeliberationSession.getSessionSummary(session.session_id);
    console.log('✅ Summary generated');
    console.log('   Basic info:', summary.basic_info.status);
    console.log('   Quality metrics:', summary.quality_metrics);

    console.log('\n✅ ALL TESTS PASSED - DeliberationSession model working correctly\n');

    // Clean up
    console.log('Cleaning up test data...');
    const { getCollection } = require('../src/utils/db.util');
    const collection = await getCollection('deliberation_sessions');
    await collection.deleteOne({ session_id: session.session_id });
    console.log('✅ Test data cleaned up');

  } catch (error) {
    console.error('❌ TEST FAILED:', error.message);
    throw error;
  }
}

testDeliberationSession().catch(console.error);

Run Test:

node scripts/test-deliberation-session.js

Expected Output:

Testing DeliberationSession model...

Test 1: Creating test deliberation session...
✅ Session created: session-XXXXXXXXXX
   Status: not_started
   Stakeholders: 1

Test 2: Adding position statement...
✅ Position statement added

Test 3: Recording AI action...
✅ AI action logged

Test 4: Recording human intervention...
✅ Human intervention logged

Test 5: Retrieving session...
✅ Session retrieved
   Facilitation log entries: 1
   Human interventions: 1

Test 6: Getting session summary...
✅ Summary generated
   Basic info: not_started
   Quality metrics: { intervention_count: 1, ... }

✅ ALL TESTS PASSED - DeliberationSession model working correctly

Cleaning up test data...
✅ Test data cleaned up

Status: PENDING


Task 4: Create Stakeholder Personas

Objective: Create 6 realistic stakeholder profiles for simulation

Personas to Create:

Persona 1: Job Applicant Advocate

  • Name: Alex Rivera (simulated)
  • Background: Recently unemployed software engineer, rejected by 15 companies using AI screening, suspects algorithm discriminated based on employment gap (took 2 years off for caregiving)
  • Moral Framework: Deontological (rights-based)
  • Key Values: Fairness, transparency, accountability, dignity
  • Position: Require full disclosure of evaluation factors + weights; applicants have RIGHT to know how they're judged
  • Key Concern: Algorithms discriminate invisibly; applicants can't challenge rejections without information

Persona 2: Employer/HR Representative

  • Name: Marcus Thompson (simulated)
  • Background: VP of HR at 500-person tech company, uses AI screening to handle 10,000+ applications/year
  • Moral Framework: Consequentialist (outcome-focused) + Pragmatist
  • Key Values: Efficiency, legal compliance, quality hires, practicality
  • Position: Some disclosure acceptable (factors used), but not weights/scoring (trade secrets, gaming risk)
  • Key Concern: Over-regulation stifles innovation; full disclosure enables gaming and reveals competitive advantages

Persona 3: AI Vendor Representative

  • Name: Dr. Priya Sharma (simulated)
  • Background: CTO of AI hiring platform (startup), sells algorithm to 200+ employers
  • Moral Framework: Libertarian (freedom-focused) + Innovation-focused
  • Key Values: Innovation, competition, intellectual property protection, customer choice
  • Position: Voluntary transparency (market-driven), not mandated disclosure (kills competition)
  • Key Concern: Competitors will reverse-engineer algorithms; customers will flee to less transparent tools

Persona 4: Regulator/Policy Expert

  • Name: Jordan Lee (simulated)
  • Background: Attorney at federal EEOC, enforces anti-discrimination law, advises on NYC Local Law 144
  • Moral Framework: Deontological (law/rights) + Consequentialist (practical enforcement)
  • Key Values: Public accountability, legal clarity, rights protection, enforceability
  • Position: Tiered transparency (high-stakes = more disclosure) + bias audits (mandatory)
  • Key Concern: Patchwork state laws create confusion; need federal standard that balances accountability and feasibility

Persona 5: Labor Rights Advocate

  • Name: Carmen Ortiz (simulated)
  • Background: Organizer at labor union representing low-wage service workers, sees members rejected by AI with no explanation
  • Moral Framework: Communitarian (collective good) + Care Ethics (relationship-focused)
  • Key Values: Worker power, collective bargaining, fairness for vulnerable populations, trust
  • Position: Full transparency for ALL hiring (including low-wage), workers deserve same protections as executives
  • Key Concern: Tiered approaches institutionalize inequality; low-wage workers get less protection than they deserve

Persona 6: AI Ethics Researcher

  • Name: Dr. James Chen (simulated)
  • Background: Computer science professor, studies algorithmic fairness, published papers showing transparency alone doesn't prevent discrimination
  • Moral Framework: Consequentialist (evidence-based) + Virtue Ethics (scientific integrity)
  • Key Values: Scientific validity, evidence-based policy, long-term societal impact, truth
  • Position: Transparency necessary but insufficient; need audits + recourse mechanisms + ongoing monitoring
  • Key Concern: "Transparency theater" - disclosing factors without explaining how they're used doesn't achieve fairness

Status: PENDING (Claude to create full personas)


Task 5: Set Up Deliberation Workflow

Objective: Prepare for smooth deliberation execution

5.1 Create Session in MongoDB

Before deliberation starts, create the session record:

const session = await DeliberationSession.createSession({
  decision: {
    description: 'Should employers using AI hiring algorithms be required to disclose evaluation factors, weights, and decision logic to job applicants?',
    scenario: 'algorithmic_hiring_transparency',
    context: {
      geographic: 'United States',
      temporal: 'current_2025'
    }
  },
  stakeholders: [
    {
      stakeholder_id: 'stakeholder-sim-applicant-001',
      type: 'individual_advocate',
      represents: 'Job Applicants',
      contact_email: 'alex.rivera@simulated.example',
      moral_framework: 'Deontological'
    },
    // ... (5 more stakeholders)
  ],
  configuration: {
    format: 'hybrid',
    ai_role: 'ai_led',
    visibility: 'private_to_public',
    output_framing: 'pluralistic_accommodation'
  }
});
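
Before calling createSession, it may be worth sanity-checking the roster so a missing persona or duplicated id fails fast. A minimal sketch; `validateStakeholders` is an assumption for illustration, not an existing API:

```javascript
// Hypothetical pre-flight check (not an existing API): verify the roster has
// all 6 stakeholders, with unique ids and the fields createSession expects.
function validateStakeholders(stakeholders, expectedCount = 6) {
  const errors = [];
  if (stakeholders.length !== expectedCount) {
    errors.push(`expected ${expectedCount} stakeholders, got ${stakeholders.length}`);
  }
  const ids = new Set(stakeholders.map((s) => s.stakeholder_id));
  if (ids.size !== stakeholders.length) {
    errors.push('duplicate stakeholder_id values');
  }
  for (const s of stakeholders) {
    for (const field of ['stakeholder_id', 'type', 'represents', 'contact_email']) {
      if (!s[field]) errors.push(`${s.stakeholder_id || '(missing id)'}: missing ${field}`);
    }
  }
  return errors; // empty array means the roster is ready
}

// Example: a single-entry roster fails the count check
console.log(validateStakeholders([
  { stakeholder_id: 'stakeholder-sim-applicant-001', type: 'individual_advocate',
    represents: 'Job Applicants', contact_email: 'alex.rivera@simulated.example' },
]));
// → [ 'expected 6 stakeholders, got 1' ]
```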

Status: PENDING


5.2 Deliberation Execution Protocol (For User)

Your Role as Human Observer:

  1. MONITOR continuously - Read every AI message
  2. IDENTIFY triggers - Use decision tree from training
  3. INTERVENE when needed - Type: INTERVENTION: [Trigger] - [Reason]
  4. LOG observations - Note any issues for post-simulation debrief

Intervention Format:

INTERVENTION: [M1/M2/M3/M4/M5/M6 or D1/D2/D3/D4/D5]
Reason: [Explain what you observed]
Action: [What should happen instead]

Example:

INTERVENTION: M2 (Pattern Bias)
Reason: AI said "prevent applicants from gaming" which centers applicants as the problem
Action: Reframe as "How do we design algorithms that are both transparent and robust against manipulation?"
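
The intervention format is regular enough to parse mechanically into fields resembling a recordHumanIntervention payload, which would let observer messages be logged without retyping. A minimal sketch; `parseIntervention` is hypothetical, not an existing API:

```javascript
// Hypothetical parser (not an existing API): turn an observer message in the
// intervention format above into an object resembling the fields passed to
// DeliberationSession.recordHumanIntervention.
function parseIntervention(message) {
  const lines = message.trim().split('\n');
  const header = lines[0].match(/^INTERVENTION:\s*(\S+)\s*(?:\((.*)\))?/);
  if (!header) return null; // not an intervention message
  const field = (name) => {
    const line = lines.find((l) => l.startsWith(`${name}:`));
    return line ? line.slice(name.length + 1).trim() : '';
  };
  return {
    trigger: header[1],             // e.g. "M2" or "D1"
    trigger_label: header[2] || '', // e.g. "Pattern Bias", if given
    description: field('Reason'),
    corrective_action: field('Action'),
  };
}

const parsed = parseIntervention(
  'INTERVENTION: M2 (Pattern Bias)\n' +
  'Reason: AI centered applicants as the problem\n' +
  'Action: Reframe neutrally'
);
console.log(parsed.trigger);           // M2
console.log(parsed.corrective_action); // Reframe neutrally
```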

Status: PENDING (User ready to observe)


Task Completion Tracker

| Task | Status | Blocker (if any) |
| --- | --- | --- |
| 1. Personnel Assignment | COMPLETE | None |
| 2. Train Human Observer | IN PROGRESS | User needs to complete certification quiz |
| 3. Deploy MongoDB Schemas | PENDING | Need to run test script |
| 4. Create Stakeholder Personas | PENDING | Claude to create |
| 5. Set Up Workflow | PENDING | Depends on Tasks 2-4 |
| 6. Run Simulation | 🔒 BLOCKED | Depends on Tasks 2-5 |

Next Immediate Actions

For User (theflow):

  1. Complete Human Observer Certification Quiz (Task 2.2) - answers in this message
  2. Confirm you're ready to proceed with MongoDB deployment (Task 3)
  3. Review stakeholder personas once Claude creates them (Task 4)

For Claude Code:

  1. Wait for user quiz completion
  2. Create test-deliberation-session.js script
  3. Run MongoDB deployment and testing
  4. Create full stakeholder persona documents
  5. Set up deliberation session in MongoDB
  6. Begin Round 1 when user gives go-ahead

Current Status: BLOCKED ON USER - Please complete Human Observer Certification Quiz above

Once you complete the quiz, I'll proceed with MongoDB deployment and persona creation.