# TRA-OPS-0005: Human Oversight Requirements v1.0
**Document ID**: TRA-OPS-0005
**Version**: 1.0
**Classification**: OPERATIONAL
**Status**: DRAFT → ACTIVE (upon Phase 2 start)
**Created**: 2025-10-07
**Owner**: John Stroh
**Review Cycle**: Quarterly
**Next Review**: 2026-01-07
**Parent Policy**: TRA-OPS-0001 (AI Content Generation Policy)
---
## Purpose
This document establishes comprehensive human oversight requirements for all AI-powered features on the Tractatus Framework website, ensuring compliance with the framework's core principle: **"What cannot be systematized must not be automated."**
## Scope
Applies to all AI operations requiring human judgment, including:
- Content generation (blogs, responses, analyses)
- Decision-making (publish, respond, approve)
- Values-sensitive operations (editorial policy, external communication)
- System configuration (API limits, moderation rules)
---
## Oversight Principles
### 1. Mandatory Human Approval (MHA)
**Definition**: Certain operations MUST have explicit human approval before execution.
**Applies to**:
- Publishing any public content (blog posts, case studies)
- Sending external communications (media responses, emails)
- Changing editorial policy or moderation rules
- Modifying Tractatus framework governance documents
**Implementation**: System enforces approval workflow; no bypass mechanism.
**Tractatus Mapping**: STRATEGIC and some OPERATIONAL quadrants.
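
The "no bypass mechanism" requirement can be illustrated with a gate that every operation passes through. This is a minimal sketch, not the site's actual implementation: the operation names, the `Approval` record, and the `execute` function are all hypothetical and only mirror the "Applies to" list above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical operation names mirroring the MHA "Applies to" list.
MHA_OPERATIONS = {
    "publish_content",
    "send_external_communication",
    "change_editorial_policy",
    "modify_governance_document",
}


class ApprovalRequired(Exception):
    """Raised when an MHA operation is attempted without human approval."""


@dataclass(frozen=True)
class Approval:
    approver: str   # identity of the human reviewer
    operation: str  # operation this approval covers
    item_id: str    # specific item that was approved


def execute(operation: str, item_id: str, approval: Optional[Approval] = None) -> str:
    """Run an operation, refusing MHA operations that lack a matching approval."""
    if operation in MHA_OPERATIONS:
        matches = (
            approval is not None
            and approval.operation == operation
            and approval.item_id == item_id
        )
        if not matches:
            raise ApprovalRequired(f"{operation} on {item_id} needs explicit human approval")
    return f"executed {operation} on {item_id}"
```

Tying the approval to a specific operation and item means an approval for one post cannot be reused to publish another.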
---
### 2. Human-in-the-Loop (HITL)
**Definition**: AI proposes actions; human reviews and decides.
**Applies to**:
- Blog topic suggestions → Human selects
- Media inquiry classification → Human verifies
- Case study relevance assessment → Human approves
- Draft responses → Human edits before sending
**Implementation**: Moderation queue with approve/reject/edit workflows.
**Tractatus Mapping**: OPERATIONAL and TACTICAL quadrants.
---
### 3. Human-on-the-Loop (HOTL)
**Definition**: AI executes within predefined bounds; human monitors and can intervene.
**Applies to**:
- Automated logging and metrics
- Database backups
- Performance monitoring
- Error detection
**Implementation**: Alerting system; human can halt/adjust.
**Tractatus Mapping**: SYSTEM quadrant (technical operations).
---
### 4. Audit Trail
**Definition**: All AI decisions and human approvals must be logged for review.
**Applies to**: All AI operations.
**Implementation**: Database logging with immutable audit trail.
**Retention**: 2 years minimum.
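
One common way to approximate an immutable trail is a hash chain, where each entry records the hash of the previous one so that later edits become detectable. The sketch below assumes an in-memory list and hypothetical field names (`actor`, `action`, `item_id`); it is tamper-evident rather than truly immutable, and a real deployment would also need database-level write protection.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=730)  # "2 years minimum" from the policy


def _digest(body: dict) -> str:
    """Stable SHA-256 over the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log; each entry is chained to the previous entry's hash."""

    def __init__(self):
        self._entries = []

    def append(self, actor, action, item_id, when=None):
        when = when or datetime.now(timezone.utc)
        prev_hash = self._entries[-1]["hash"] if self._entries else ""
        body = {"actor": actor, "action": action, "item_id": item_id,
                "at": when.isoformat(), "prev": prev_hash}
        entry = dict(body, hash=_digest(body))
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Return False if any entry was altered or the chain was broken."""
        prev = ""
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev or _digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True


def may_purge(entry: dict, now: datetime) -> bool:
    """An entry may only be removed once the retention period has elapsed."""
    return now - datetime.fromisoformat(entry["at"]) >= RETENTION
```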
---
## Oversight Roles & Responsibilities
### Admin Reviewer
**Qualifications**:
- Understands Tractatus framework principles
- Technical background (AI/ML familiarity)
- Editorial judgment (writing, fact-checking)
- Authorized by John Stroh
**Responsibilities**:
- Review AI-generated content (blogs, drafts, analyses)
- Approve/reject/edit AI proposals
- Monitor moderation queues (daily during Phase 2)
- Escalate ambiguous cases to John Stroh
- Participate in quarterly governance reviews
**Authority Level**:
- Can approve: Blog posts, media responses (standard), case studies
- Must escalate: Policy changes, major media inquiries, legal issues
**Training**: TRA-OPS-* document review + hands-on moderation practice.
---
### John Stroh (Owner)
**Responsibilities**:
- Final authority on all strategic decisions
- Approval for new AI systems/models
- Governance document amendments
- High-priority media inquiries
- Incident response (boundary violations, security)
**Authority Level**: Unlimited (can override any AI or admin decision).
---
### Future Roles (Phase 3)
**Editorial Board** (3-5 members):
- Blog content review
- Editorial policy recommendations
- Community engagement oversight
**Technical Advisory** (2-3 experts):
- Framework architecture review
- AI system evaluation
- Security audit
---
## Oversight Workflows
### Blog Post Workflow
```mermaid
graph TD
A[AI Topic Suggestion] -->|Weekly batch| B[Admin Review Queue]
B -->|Approve 1-3 topics| C[AI Outline Generation]
B -->|Reject| Z[End]
C -->|48h| D[Admin Review Outline]
D -->|Approve| E[Human Writes Draft]
D -->|Reject| Z
E --> F[Admin Final Approval]
F -->|Approve| G[Publish]
F -->|Edit| E
F -->|Reject| Z
```
**Oversight Points**:
1. **Topic Selection**: Admin decides (STRATEGIC - editorial direction)
2. **Outline Review**: Admin verifies (OPERATIONAL - quality control)
3. **Final Approval**: Admin decides to publish (STRATEGIC - external communication)
**SLA**:
- Topic review: 7 days (weekly)
- Outline review: 48 hours
- Final approval: 24 hours before scheduled publish
**Escalation**:
- Controversial topics → John Stroh approval required
- Technical deep dives → No escalation (admin discretion)
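
The diagram above can also be read as a small state machine. The following is a sketch under assumed names: the state and event labels are hypothetical and were chosen to match the diagram's nodes and edge labels.

```python
# Transitions derived from the blog post workflow diagram.
# State and event names are illustrative, not taken from the codebase.
TRANSITIONS = {
    ("topic_review", "approve"): "outline_generation",
    ("topic_review", "reject"): "end",
    ("outline_generation", "done"): "outline_review",
    ("outline_review", "approve"): "drafting",
    ("outline_review", "reject"): "end",
    ("drafting", "submit"): "final_approval",
    ("final_approval", "approve"): "published",
    ("final_approval", "edit"): "drafting",
    ("final_approval", "reject"): "end",
}


def advance(state: str, event: str) -> str:
    """Return the next state, rejecting any transition not in the diagram."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"transition {event!r} not allowed from {state!r}") from None
```

Because `published` is reachable only from `final_approval`, a draft cannot be published without passing the final human approval step.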
---
### Media Inquiry Workflow
```mermaid
graph TD
A[Inquiry Received] --> B[AI Classification & Triage]
B -->|4h for HIGH priority| C[Admin Review Dashboard]
C -->|Approve Draft| D[Send Response]
C -->|Edit Draft| E[Admin Edits]
C -->|Escalate| F[John Stroh Decision]
C -->|Ignore| Z[Archive]
E --> D
F --> D
F --> Z
```
**Oversight Points**:
1. **Classification Review**: Admin verifies AI categorization (OPERATIONAL)
2. **Send Decision**: Admin decides whether to respond (STRATEGIC - external relations)
3. **Escalation**: High-priority or ambiguous → John Stroh (STRATEGIC)
**SLA**:
- HIGH priority: 4 hours (business days)
- MEDIUM priority: 48 hours
- LOW priority: 7 days
**Escalation Triggers**:
- Major media (NY Times, Wired, etc.)
- Government/regulatory
- Legal issues
- Controversy/criticism
---
### Case Study Workflow
```mermaid
graph TD
A[Community Submission] --> B[AI Relevance Analysis]
B -->|7 days| C[Admin Moderation Queue]
C -->|Approve| D[Publish to Portal]
C -->|Request Changes| E[Email Submitter]
C -->|Reject with Reason| F[Email Submitter]
E -->|Resubmit| A
```
**Oversight Points**:
1. **Relevance Verification**: Admin checks AI analysis (OPERATIONAL)
2. **Publication Decision**: Admin decides to publish (STRATEGIC - public content)
**SLA**: 7 days from submission to decision
**Escalation**: None (admin discretion unless policy question arises)
---
## Service Level Agreements (SLAs)
### Response Times
| Task | SLA | Escalation (if missed) |
|------|-----|------------------------|
| **HIGH priority media inquiry** | 4 hours | Alert John Stroh |
| **Blog outline review** | 48 hours | Notify admin (reminder) |
| **Blog final approval** | 24 hours | Delay publication |
| **Case study moderation** | 7 days | Notify submitter (apology + timeline) |
| **MEDIUM media inquiry** | 48 hours | Standard workflow (no escalation) |
| **LOW media inquiry** | 7 days | Best-effort (no penalty) |
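
As an illustration of how a dashboard might evaluate these deadlines, the sketch below maps each task type to its SLA window and missed-SLA action. The task keys and the `check_sla` function are hypothetical. Two simplifications are assumed: the "business days" qualifier on HIGH priority inquiries is ignored, and blog final approval is omitted because its SLA is measured backward from the scheduled publish time rather than forward from receipt.

```python
from datetime import datetime, timedelta

# (SLA window, action if missed) per task type, taken from the table above.
SLA = {
    "media_high": (timedelta(hours=4), "Alert John Stroh"),
    "blog_outline": (timedelta(hours=48), "Notify admin (reminder)"),
    "case_study": (timedelta(days=7), "Notify submitter (apology + timeline)"),
    "media_medium": (timedelta(hours=48), None),  # no escalation
    "media_low": (timedelta(days=7), None),       # best-effort
}


def check_sla(task: str, received_at: datetime, now: datetime) -> dict:
    """Report the due time, whether it has passed, and the action to take."""
    window, action = SLA[task]
    due = received_at + window
    overdue = now > due
    return {"due": due, "overdue": overdue, "action": action if overdue else None}
```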
### Workload Expectations
**Admin Reviewer** (Phase 2 - Soft Launch):
- Time commitment: 5-10 hours/week
- Tasks/week:
  - Blog topics: 1 review session (1 hour)
  - Blog drafts: 2-4 approvals (2-4 hours)
  - Media inquiries: 5-10 reviews (2-3 hours)
  - Case studies: 3-5 reviews (1-2 hours)
**Peak Load** (Phase 3 - Public Launch):
- Time commitment: 15-20 hours/week
- Consider additional admin reviewers
---
## Approval Authority Matrix
| Decision Type | Admin Reviewer | John Stroh | Notes |
|---------------|----------------|------------|-------|
| **Blog Post (Standard)** | ✓ Approve | Override | Admin sufficient |
| **Blog Post (Controversial)** | Recommend | ✓ Approve | Must escalate |
| **Media Response (Standard)** | ✓ Approve | Override | Admin sufficient |
| **Media Response (Major Outlet)** | Recommend | ✓ Approve | Must escalate |
| **Case Study (Standard)** | ✓ Approve | Override | Admin sufficient |
| **Policy Amendment** | Recommend | ✓ Approve | Always escalate |
| **AI System Change** | Recommend | ✓ Approve | Always escalate |
| **Emergency Response** | Recommend | ✓ Approve | Security/legal incidents |
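
The matrix lends itself to a lookup table. This is a sketch with hypothetical role and decision identifiers (`admin`, `owner`, `blog_post`, and so on). It assumes that any decision type missing from the matrix should be treated as requiring escalation.

```python
# Minimum role that may approve each (decision, variant) pair from the matrix.
APPROVER = {
    ("blog_post", "standard"): "admin",
    ("blog_post", "controversial"): "owner",
    ("media_response", "standard"): "admin",
    ("media_response", "major_outlet"): "owner",
    ("case_study", "standard"): "admin",
    ("policy_amendment", None): "owner",
    ("ai_system_change", None): "owner",
    ("emergency_response", None): "owner",
}


def can_approve(role: str, decision: str, variant=None) -> bool:
    """True if the role may approve; the owner may approve or override anything."""
    if role == "owner":
        return True
    # Assumption: decisions not listed in the matrix must be escalated.
    required = APPROVER.get((decision, variant), "owner")
    return role == required
```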
---
## Quality Assurance
### AI Output Quality Checks
**Before Approval**, admin must verify:
**Factual Accuracy**:
- [ ] All citations exist and are correct (no hallucinations)
- [ ] Dates, names, technical details verified
- [ ] No obvious errors (grammar, logic, coherence)
**Alignment**:
- [ ] Content aligns with Tractatus framework principles
- [ ] Tone appropriate for audience (professional, accessible)
- [ ] No values decisions made by AI (boundary check)
**Completeness**:
- [ ] All required sections present (title, summary, body, citations)
- [ ] Sufficient detail (not superficial)
- [ ] Call to action or next steps (if applicable)
**Legal/Ethical**:
- [ ] No copyright violations (plagiarism check)
- [ ] No privacy violations (PII exposed)
- [ ] No defamation or personal attacks
---
### Rejection Criteria
**Must reject if**:
- Factual errors that cannot be easily corrected
- Plagiarism or copyright violation
- Values decision made by AI without justification
- Inappropriate tone (offensive, discriminatory)
- Insufficient quality (major rewrite needed)
**Should request changes if**:
- Minor factual errors (fixable)
- Tone slightly off (needs editing)
- Incomplete (needs expansion)
- Poor formatting (needs cleanup)
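
The two lists above imply an ordering: any must-reject finding outweighs any fixable one. A minimal sketch of that rule follows, using hypothetical finding labels that paraphrase the criteria. The reviewer still makes the judgment about which findings apply.

```python
# Hypothetical labels paraphrasing the criteria above.
MUST_REJECT = {
    "uncorrectable_factual_error",
    "plagiarism",
    "unjustified_ai_values_decision",
    "inappropriate_tone",
    "insufficient_quality",
}
REQUEST_CHANGES = {
    "minor_factual_error",
    "tone_slightly_off",
    "incomplete",
    "poor_formatting",
}


def review_decision(findings: set) -> str:
    """Map a reviewer's findings to reject, request_changes, or approve."""
    unknown = findings - MUST_REJECT - REQUEST_CHANGES
    if unknown:
        raise ValueError(f"unrecognized findings: {sorted(unknown)}")
    if findings & MUST_REJECT:
        return "reject"
    if findings & REQUEST_CHANGES:
        return "request_changes"
    return "approve"
```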
---
## Escalation Procedures
### When to Escalate to John Stroh
**Mandatory Escalation**:
- Boundary violation detected (AI made values decision without approval)
- Major media inquiry (NY Times, Wired, government)
- Legal threat or security incident
- Policy change request
- New AI system evaluation
- Ambiguous case (unclear if should approve)
**Escalation Process**:
1. Admin marks item "Escalation Required" in dashboard
2. System emails John Stroh with:
   - Context (original request, AI output, admin notes)
   - Recommendation (approve, reject, edit)
   - Urgency (immediate, 24h, 7 days)
3. John Stroh responds:
   - Decision (approve, reject, provide guidance)
   - Feedback (for future similar cases)
**SLA**: John Stroh responds within 24h (for URGENT), 7 days (standard).
---
## Monitoring & Metrics
### Dashboard Metrics (Admin View)
**Real-Time**:
- Pending approvals (count by type)
- SLA compliance (% within target)
- Queue age (oldest item waiting)
**Weekly**:
- Approvals/rejections by category
- Average review time
- AI accuracy (classification, relevance)
**Monthly**:
- Total content published (blogs, case studies)
- Media inquiries handled
- Escalations to John Stroh
---
### Performance Indicators
| Metric | Target | Action if Missed |
|--------|--------|------------------|
| **SLA Compliance** | 95% | Increase admin capacity |
| **AI Approval Rate** | 70-90% | Adjust AI prompts if too high/low |
| **Average Review Time** | <24h | Process optimization |
| **Escalation Rate** | <10% | Improve admin training |
| **User Satisfaction** | 4+/5 | Review rejection feedback |
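
A sketch of how these targets could be checked automatically is shown below. The metric keys are hypothetical, rates are assumed to be expressed as fractions between 0 and 1, and review time is assumed to be in hours.

```python
def kpi_actions(metrics: dict) -> list:
    """Return the 'Action if Missed' entries for every target that is not met."""
    actions = []
    if metrics["sla_compliance"] < 0.95:
        actions.append("Increase admin capacity")
    if not 0.70 <= metrics["ai_approval_rate"] <= 0.90:
        actions.append("Adjust AI prompts")
    if metrics["avg_review_hours"] >= 24:
        actions.append("Process optimization")
    if metrics["escalation_rate"] >= 0.10:
        actions.append("Improve admin training")
    if metrics["user_satisfaction"] < 4.0:
        actions.append("Review rejection feedback")
    return actions
```

Note that the AI approval rate is a band, not a floor: a rate above 90% triggers the same action as a rate below 70%, since it may indicate that review has become a rubber stamp.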
---
## Training & Onboarding
### Admin Reviewer Onboarding
**Week 1**: Policy Review
- Read TRA-OPS-0001 through TRA-OPS-0005
- Review Tractatus framework documentation
- Understand quadrant classification (STR/OPS/TAC/SYS/STO)
**Week 2**: Hands-On Practice
- Shadow existing admin reviewer (if available)
- Review 5-10 sample cases (pre-approved examples)
- Practice with test submissions
**Week 3**: Supervised Moderation
- Review real submissions (with John Stroh oversight)
- Receive feedback on decisions
- Identify edge cases
**Week 4**: Independent Authorization
- Authorized for standard approvals
- John Stroh spot-checks 10% of decisions
- Full authorization after 30 days error-free
---
### Ongoing Training
**Quarterly**:
- Policy updates review
- Case study retrospective (what went well, what didn't)
- AI accuracy analysis (where did AI fail? improve prompts)
**Annual**:
- Full governance document review
- External training (AI safety, editorial standards, legal compliance)
---
## Audit & Compliance
### Internal Audit (Quarterly)
**Review Sample**:
- 10% of approved content (random selection)
- 100% of rejected content (check for false negatives)
- All escalated cases
**Audit Criteria**:
- Were approval criteria followed?
- Was SLA met?
- Was AI output quality acceptable?
- Were boundaries respected (no values violations)?
**Findings**: Document gaps, recommend process improvements.
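
The review sample described above could be drawn as follows. This is a sketch that assumes each item is a dict with a hypothetical `status` field of `approved`, `rejected`, or `escalated`. The approved sample size is rounded up so that a small quarter still yields at least one item.

```python
import random


def audit_sample(items: list, percent: int = 10, seed=None) -> list:
    """Select 10% of approved items at random plus all rejected and escalated items."""
    rng = random.Random(seed)  # pass a seed to make the selection reproducible
    approved = [i for i in items if i["status"] == "approved"]
    # Integer ceiling division avoids floating-point rounding surprises.
    k = -(-len(approved) * percent // 100)
    sampled = rng.sample(approved, k)
    always = [i for i in items if i["status"] in ("rejected", "escalated")]
    return sampled + always
```

Recording the seed alongside the audit findings would let a later reviewer confirm which items were selected.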
---
### External Audit (Annual - Phase 3+)
**Scope**:
- Governance compliance (Tractatus framework)
- Data privacy (GDPR-lite)
- Security (API key handling, PII protection)
**Auditor**: Independent third party (TBD)
---
## Incident Response
### Boundary Violation Incident
**Definition**: AI makes a values decision without human approval (e.g., auto-publishes content, sends media response).
**Response Protocol**:
1. **Immediate** (within 1 hour):
   - Halt all AI operations (emergency shutdown)
   - Alert John Stroh
   - Document incident (what, when, why)
2. **Within 24 hours**:
   - Root cause analysis (how did boundary check fail?)
   - Rollback any published content (if applicable)
   - Public disclosure (if external impact)
3. **Within 7 days**:
   - Fix implemented (code, process, or both)
   - BoundaryEnforcer audit (test all boundary checks)
   - Policy review (update TRA-OPS-* if needed)
4. **Within 30 days**:
   - Post-mortem published (transparency)
   - Training updated (prevent recurrence)
   - Compensation/apology (if harm occurred)
**Severity Levels**:
- **CRITICAL**: Public harm (incorrect medical advice published, privacy breach)
- **HIGH**: Internal-only (test post published, draft sent to wrong email)
- **MEDIUM**: Near-miss (caught before publication, but boundary check failed)
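
The severity levels and response timeline can be sketched as two small helpers. The function names, the two boolean inputs, and the step labels are hypothetical, and the sketch assumes every deadline is measured from the moment the incident is detected, which the protocol does not state explicitly.

```python
from datetime import datetime, timedelta

# Response windows from the protocol, assumed to run from detection time.
RESPONSE_WINDOWS = {
    "halt_alert_document": timedelta(hours=1),
    "root_cause_and_rollback": timedelta(hours=24),
    "fix_and_audit": timedelta(days=7),
    "post_mortem": timedelta(days=30),
}


def classify_severity(public_harm: bool, action_executed: bool) -> str:
    """CRITICAL if the public was harmed, HIGH if the action ran internally, else MEDIUM."""
    if public_harm:
        return "CRITICAL"
    if action_executed:
        return "HIGH"
    return "MEDIUM"  # near-miss: caught before execution, but a check failed


def response_deadlines(detected_at: datetime) -> dict:
    """Compute the deadline for each step of the response protocol."""
    return {step: detected_at + window for step, window in RESPONSE_WINDOWS.items()}
```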
---
### Poor Quality Content Incident
**Definition**: Approved content contains factual error or inappropriate tone.
**Response Protocol**:
1. **Immediate** (within 4 hours):
   - Retract or correct content
   - Publish correction notice (if public)
2. **Within 24 hours**:
   - Notify submitter/stakeholders
   - Root cause analysis (admin missed error? AI hallucination?)
3. **Within 7 days**:
   - Update review checklist (add missed criteria)
   - Admin training (if review failure)
   - AI prompt improvement (if hallucination)
---
## Cost Management
### Budget Allocation
**Phase 2 Budget**: $200/month (Claude API)
**Allocation**:
- Blog curation: $75/month (37.5% of budget)
- Media triage: $50/month (25% of budget)
- Case study analysis: $50/month (25% of budget)
- Miscellaneous: $25/month (12.5% buffer)
**Monitoring**:
- Daily token usage dashboard
- Alert at 80% of monthly budget
- Hard cap at 100% (AI operations paused)
**Admin Responsibility**: Monitor spend, adjust usage if approaching cap.
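
The alert and hard-cap thresholds can be expressed as a simple guard. In this sketch the constant and function names are hypothetical, and month-to-date spend is assumed to be available in US dollars from the usage dashboard.

```python
MONTHLY_BUDGET_USD = 200.00
ALERT_THRESHOLD = 0.80  # alert at 80% of the monthly budget

# Per-feature allocation from the policy (USD per month).
ALLOCATION = {
    "blog_curation": 75.00,
    "media_triage": 50.00,
    "case_study_analysis": 50.00,
    "miscellaneous": 25.00,
}


def budget_status(spent_usd: float) -> str:
    """Return 'ok', 'alert' (80% or more), or 'paused' (hard cap reached)."""
    ratio = spent_usd / MONTHLY_BUDGET_USD
    if ratio >= 1.0:
        return "paused"  # AI operations stop until the next month
    if ratio >= ALERT_THRESHOLD:
        return "alert"
    return "ok"
```

A caller would check `budget_status` before each API request and skip the request when the result is `paused`.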
---
### Cost Optimization
**Strategies**:
- Cache AI responses (30-day TTL for identical queries)
- Batch similar requests (weekly topic suggestions, not daily)
- Use Claude Haiku for simple tasks (media classification - 5x cheaper)
- Rate limit users (prevent abuse)
**Review**: Quarterly cost-benefit analysis (is AI worth the expense?).
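
The caching strategy can be sketched as a dictionary keyed by a hash of the prompt, with entries expiring after 30 days. The `ResponseCache` class is hypothetical and keeps everything in memory. It treats "identical queries" as byte-identical prompt strings, which is an assumption. The injectable `clock` exists only to make expiry testable.

```python
import hashlib
import time

TTL_SECONDS = 30 * 24 * 60 * 60  # 30-day TTL from the policy


class ResponseCache:
    """In-memory cache of AI responses keyed by the SHA-256 of the prompt."""

    def __init__(self, ttl: float = TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock
        self._store = {}

    @staticmethod
    def _key(prompt: str) -> str:
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def get(self, prompt: str):
        """Return the cached response, or None if absent or expired."""
        key = self._key(prompt)
        hit = self._store.get(key)
        if hit is None:
            return None
        stored_at, response = hit
        if self._clock() - stored_at >= self._ttl:
            del self._store[key]  # drop the expired entry
            return None
        return response

    def put(self, prompt: str, response: str) -> None:
        self._store[self._key(prompt)] = (self._clock(), response)
```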
---
## Revision & Updates
### Update Process
**Minor Updates** (v1.0 → v1.1):
- Clarifications, typo fixes, SLA adjustments
- Approval: Admin reviewer
- Notification: Email to John Stroh
**Major Updates** (v1.0 → v2.0):
- New oversight roles, workflow changes, authority matrix updates
- Approval: John Stroh
- Notification: Public blog post
**Emergency Updates**:
- Security/privacy issues requiring immediate change
- Approval: John Stroh (verbal, documented within 24h)
---
## Related Documents
- TRA-OPS-0001: AI Content Generation Policy (parent)
- TRA-OPS-0002: Blog Editorial Guidelines
- TRA-OPS-0003: Media Inquiry Response Protocol
- TRA-OPS-0004: Case Study Moderation Standards
- STR-GOV-0001: Strategic Review Protocol (sydigital source)
---
## Approval
| Role | Name | Signature | Date |
|------|------|-----------|------|
| **Policy Owner** | John Stroh | [Pending] | [TBD] |
| **Technical Reviewer** | Claude Code | [Pending] | 2025-10-07 |
| **Final Approval** | John Stroh | [Pending] | [TBD] |
---
**Status**: DRAFT (awaiting John Stroh approval)
**Effective Date**: Upon Phase 2 deployment
**Next Review**: 2026-01-07 (3 months post-activation)