# Phase 2 Roadmap: Production Deployment & AI-Powered Features **Project**: Tractatus AI Safety Framework Website **Phase**: 2 of 3 **Status**: Planning **Created**: 2025-10-07 **Owner**: John Stroh **Duration**: 2-3 months (estimated) --- ## Table of Contents 1. [Overview](#overview) 2. [Phase 1 Completion Summary](#phase-1-completion-summary) 3. [Phase 2 Objectives](#phase-2-objectives) 4. [Timeline & Milestones](#timeline--milestones) 5. [Workstreams](#workstreams) 6. [Success Criteria](#success-criteria) 7. [Risk Assessment](#risk-assessment) 8. [Decision Points](#decision-points) 9. [Dependencies](#dependencies) 10. [Budget Requirements](#budget-requirements) --- ## Overview Phase 2 transitions the Tractatus Framework from a **local prototype** (Phase 1) to a **production-ready platform** with real users and AI-powered content features. This phase demonstrates the framework's capacity to govern its own AI operations through human-oversight workflows. ### Key Themes - **Production Deployment**: OVHCloud hosting, domain configuration, SSL/TLS - **AI Integration**: Claude API for blog curation, media triage, case studies - **Dogfooding**: Tractatus framework governs all AI content generation - **Security & Privacy**: Hardening, monitoring, privacy-respecting analytics - **Soft Launch**: Initial user testing before public announcement (Phase 3) --- ## Phase 1 Completion Summary **Completed**: 2025-10-07 **Status**: ✅ All objectives achieved ### Deliverables Completed - ✅ MongoDB instance (port 27017, database `tractatus_dev`) - ✅ Express application (port 9000, CSP-compliant) - ✅ Document migration pipeline (12+ documents) - ✅ Three audience paths (Researcher, Implementer, Advocate) - ✅ Interactive demonstrations (27027, classification, boundary) - ✅ Tractatus governance services (100% test coverage on core services) - InstructionPersistenceClassifier (85.3%) - CrossReferenceValidator (96.4%) - BoundaryEnforcer (100%) - ContextPressureMonitor (60.9%) - MetacognitiveVerifier (56.1%) - ✅ Admin dashboard with moderation workflows - ✅ API reference documentation - ✅ WCAG AA accessibility compliance - ✅ Mobile responsiveness optimization - ✅ 118 integration tests (all passing) ### Technical Achievements - CSP compliance: 100% (script-src 'self') - Test coverage: 85.3%+ on Tractatus services - Accessibility: WCAG AA compliant - Performance: <2s page load times (local) - Security: JWT authentication, role-based access control --- ## Phase 2 Objectives ### Primary Goals 1. **Deploy to production** on OVHCloud with domain `agenticgovernance.digital` 2. **Integrate Claude API** for AI-powered content features 3. **Implement human oversight workflows** via Tractatus framework 4. **Launch blog curation system** with moderation queue 5. **Enable media inquiry triage** with AI classification 6. **Create case study submission portal** for community contributions 7. **Soft launch** to initial user cohort (researchers, implementers) ### Non-Goals (Deferred to Phase 3) - ❌ Koha donation system - ❌ Multi-language translations - ❌ Public marketing campaign - ❌ Community forums/discussion boards - ❌ Mobile app development --- ## Timeline & Milestones ### Month 1: Infrastructure & Deployment (Weeks 1-4) **Week 1: Environment Setup** - [ ] Provision OVHCloud VPS (specs TBD) - [ ] Configure DNS for `agenticgovernance.digital` → production IP - [ ] SSL/TLS certificates (Let's Encrypt) - [ ] Firewall rules (UFW) and SSH hardening - [ ] Create production MongoDB instance - [ ] Set up systemd services (tractatus.service, mongodb-tractatus.service) **Week 2: Application Deployment** - [ ] Deploy Express application to production - [ ] Configure Nginx reverse proxy (port 80/443 → 9000) - [ ] Environment variables (.env.production) - [ ] Production logging (file rotation, syslog) - [ ] Database migration scripts (seed production data) - [ ] Backup automation (MongoDB dumps, code snapshots) **Week 3: Security Hardening** - [ ] Fail2ban configuration (SSH, HTTP) - [ ] Rate limiting (Nginx + application-level) - [ ] Security headers audit (OWASP compliance) - [ ] Vulnerability scanning (Trivy, npm audit) - [ ] ProtonBridge email integration - [ ] Admin notification system (email alerts) **Week 4: Monitoring & Testing** - [ ] Plausible Analytics deployment (self-hosted) - [ ] Error tracking (Sentry or self-hosted alternative) - [ ] Uptime monitoring (UptimeRobot or self-hosted) - [ ] Performance baseline (Lighthouse, WebPageTest) - [ ] Load testing (k6 or Artillery) - [ ] Disaster recovery drill (restore from backup) **Milestone 1**: Production environment live, accessible at `https://agenticgovernance.digital` ✅ --- ### Month 2: AI-Powered Features (Weeks 5-8) **Week 5: Claude API Integration** - [ ] Anthropic API key setup (production account) - [ ] ClaudeAPI.service refactoring for production - [ ] Rate limiting and cost monitoring - [ ] Error handling (API failures, timeout recovery) - [ ] Prompt templates for blog/media/cases - [ ] Token usage tracking and alerting **Week 6: Blog Curation System** - [ ] BlogCuration.service implementation - AI topic suggestion pipeline - Outline generation - Citation extraction - Draft formatting - [ ] Human moderation workflow (approve/reject/edit) - [ ] Blog post model (MongoDB schema) - [ ] Blog UI (list, single post, RSS feed) - [ ] OpenGraph/Twitter card metadata - [ ] Seed content: 5-10 human-written posts **Week 7: Media Inquiry Triage** - [ ] MediaTriage.service implementation - Incoming inquiry classification (press, academic, commercial) - Priority scoring (high/medium/low) - Auto-draft response generation (for human approval) - [ ] Media inquiry form (public-facing) - [ ] Admin triage dashboard - [ ] Email notification system - [ ] Response templates **Week 8: Case Study Portal** - [ ] CaseSubmission.service implementation - Community submission form - AI relevance analysis (Tractatus framework mapping) - Failure mode categorization - [ ] Case study moderation queue - [ ] Public case study viewer - [ ] Submission guidelines documentation - [ ] Initial case studies (3-5 curated examples) **Milestone 2**: All AI-powered features operational with human oversight ✅ --- ### Month 3: Polish, Testing & Soft Launch (Weeks 9-12) **Week 9: Governance Enforcement** - [ ] Review all AI prompts against TRA-OPS-* policies - [ ] Audit moderation workflows (Tractatus compliance) - [ ] Test boundary enforcement (values decisions require humans) - [ ] Cross-reference validator integration (AI content checks) - [ ] MetacognitiveVerifier for complex AI operations - [ ] Document AI decision audit trail **Week 10: Content & Documentation** - [ ] Final document migration review - [ ] Cross-reference link validation - [ ] PDF generation pipeline (downloads section) - [ ] Citation index completion - [ ] Privacy policy finalization - [ ] Terms of service drafting - [ ] About/Contact page updates **Week 11: Testing & Optimization** - [ ] End-to-end testing (user journeys) - [ ] Performance optimization (CDN evaluation) - [ ] Mobile testing (real devices) - [ ] Browser compatibility (Firefox, Safari, Chrome) - [ ] Accessibility re-audit (WCAG AA) - [ ] Security penetration testing - [ ] Load testing under realistic traffic **Week 12: Soft Launch** - [ ] Invite initial user cohort (20-50 users) - AI safety researchers - Academic institutions - Aligned organizations - [ ] Collect feedback via structured surveys - [ ] Monitor error rates and performance - [ ] Iterate on UX issues - [ ] Prepare for public launch (Phase 3) **Milestone 3**: Soft launch complete, feedback collected, ready for public launch ✅ --- ## Workstreams ### 1. Infrastructure & Deployment **Owner**: Infrastructure Lead (or John Stroh) **Duration**: Month 1 (Weeks 1-4) #### Tasks 1. **Hosting Provision** - Select OVHCloud VPS tier (see Budget Requirements) - Provision server (Ubuntu 22.04 LTS recommended) - Configure DNS (A records, AAAA for IPv6) - Set up SSH key authentication (disable password auth) 2. **Web Server Configuration** - Install Nginx - Configure reverse proxy (port 9000 → 80/443) - SSL/TLS via Let's Encrypt (Certbot) - HTTP/2 and compression (gzip/brotli) - Security headers (CSP, HSTS, X-Frame-Options) 3. **Database Setup** - Install MongoDB 7.x - Configure authentication - Set up replication (optional for HA) - Automated backups (daily, 7-day retention) - Restore testing 4. **Application Deployment** - Git-based deployment workflow - Environment variables management - Systemd service configuration - Log rotation and management - Process monitoring (PM2 or systemd watchdog) 5. **Security Hardening** - UFW firewall (allow 22, 80, 443, deny all others) - Fail2ban (SSH, HTTP) - Unattended security updates - Intrusion detection (optional: OSSEC, Wazuh) - Regular security audits **Deliverables**: - Production server accessible at `https://agenticgovernance.digital` - SSL/TLS A+ rating (SSL Labs) - Automated backup system operational - Monitoring dashboards configured --- ### 2. AI-Powered Features **Owner**: AI Integration Lead (or John Stroh with Claude Code) **Duration**: Month 2 (Weeks 5-8) #### Tasks ##### 2.1 Claude API Integration - **API Setup** - Anthropic production API key - Rate limiting configuration (requests/min, tokens/day) - Cost monitoring and alerting - Fallback handling (API downtime) - **Service Architecture** - `ClaudeAPI.service.js` - Core API wrapper - Prompt template management - Token usage tracking - Error handling and retry logic ##### 2.2 Blog Curation System - **AI Pipeline** - Topic suggestion (from AI safety news feeds) - Outline generation - Citation extraction and validation - Draft formatting (Markdown) - **Human Oversight** - Moderation queue integration - Approve/Reject/Edit workflows - Tractatus boundary enforcement (AI cannot publish without approval) - Audit trail (who approved, when, why) - **Publishing** - Blog post model (title, slug, content, author, published_at) - Blog list UI (pagination, filtering) - Single post viewer (comments optional) - RSS feed generation - Social media metadata (OpenGraph, Twitter cards) ##### 2.3 Media Inquiry Triage - **AI Classification** - Inquiry type (press, academic, commercial, spam) - Priority scoring (urgency, relevance, reach) - Auto-draft responses (for human review) - **Moderation Workflow** - Admin triage dashboard - Email notification to John Stroh - Response approval (before sending) - Contact management (CRM-lite) ##### 2.4 Case Study Portal - **Community Submissions** - Public submission form (structured data) - AI relevance analysis (Tractatus applicability) - Failure mode categorization (27027-type, boundary violation, etc.) - **Human Moderation** - Case review queue - Approve/Reject/Request Edits - Publication workflow - Attribution and licensing (CC BY-SA 4.0) **Deliverables**: - Blog system with 5-10 initial posts - Media inquiry form with AI triage - Case study portal with 3-5 examples - All AI decisions subject to human approval --- ### 3. Governance & Policy **Owner**: Governance Lead (or John Stroh) **Duration**: Throughout Phase 2 #### Tasks 1. **Create TRA-OPS-* Documents** - TRA-OPS-0001: AI Content Generation Policy - TRA-OPS-0002: Blog Editorial Guidelines - TRA-OPS-0003: Media Inquiry Response Protocol - TRA-OPS-0004: Case Study Moderation Standards - TRA-OPS-0005: Human Oversight Requirements 2. **Tractatus Framework Enforcement** - Ensure all AI actions classified (STR/OPS/TAC/SYS/STO) - Cross-reference validator integration - Boundary enforcement (no AI values decisions) - Audit trail for AI decisions 3. **Legal & Compliance** - Privacy Policy (GDPR-lite, no tracking cookies) - Terms of Service - Content licensing (CC BY-SA 4.0 for community contributions) - Cookie policy (if analytics use cookies) **Deliverables**: - 5+ TRA-OPS-* governance documents - Privacy Policy and Terms of Service - Tractatus framework audit demonstrating compliance --- ### 4. Content & Documentation **Owner**: Content Lead (or John Stroh) **Duration**: Month 3 (Weeks 9-12) #### Tasks 1. **Document Review** - Final review of all migrated documents - Cross-reference link validation - Formatting consistency - Citation completeness 2. **Blog Launch Content** - Write 5-10 seed blog posts (human-authored) - Topics: Framework introduction, 27027 incident, use cases, etc. - RSS feed implementation - Newsletter signup (optional) 3. **Legal Pages** - Privacy Policy - Terms of Service - About page (mission, values, Te Tiriti acknowledgment) - Contact page (ProtonMail integration) 4. **PDF Generation** - Automated PDF export for key documents - Download links in UI - Version tracking **Deliverables**: - All documents reviewed and polished - 5-10 initial blog posts published - Privacy Policy and Terms of Service live - PDF downloads available --- ### 5. Analytics & Monitoring **Owner**: Operations Lead (or John Stroh) **Duration**: Month 1 & ongoing #### Tasks 1. **Privacy-Respecting Analytics** - Deploy Plausible (self-hosted) or Matomo - No cookies, no tracking, GDPR-compliant - Metrics: page views, unique visitors, referrers - Geographic data (country-level only) 2. **Error Tracking** - Sentry (cloud) or self-hosted alternative (GlitchTip) - JavaScript error tracking - Server error logging - Alerting on critical errors 3. **Performance Monitoring** - Uptime monitoring (UptimeRobot or self-hosted) - Response time tracking - Database query performance - API usage metrics (Claude API tokens/day) 4. **Business Metrics** - Blog post views and engagement - Media inquiry volume - Case study submissions - Admin moderation activity **Deliverables**: - Analytics dashboard operational - Error tracking with alerting - Uptime monitoring (99.9% target) - Monthly metrics report template --- ## Success Criteria Phase 2 is considered **complete** when: ### Technical Success - [ ] Production site live at `https://agenticgovernance.digital` with SSL/TLS - [ ] All Phase 1 features operational in production - [ ] Blog system publishing AI-curated content (with human approval) - [ ] Media inquiry triage system processing requests - [ ] Case study portal accepting community submissions - [ ] Uptime: 99%+ over 30-day period - [ ] Performance: <3s page load (95th percentile) - [ ] Security: No critical vulnerabilities (OWASP Top 10) ### Governance Success - [ ] All AI content requires human approval (0 auto-published posts) - [ ] Tractatus framework audit shows 100% compliance - [ ] TRA-OPS-* policies documented and enforced - [ ] Boundary enforcer blocks values decisions by AI - [ ] Audit trail for all AI decisions (who, what, when, why) ### User Success - [ ] Soft launch cohort: 20-50 users - [ ] User satisfaction: 4+/5 average rating - [ ] Blog engagement: 50+ readers/post average - [ ] Media inquiries: 5+ per month - [ ] Case study submissions: 3+ per month - [ ] Accessibility: WCAG AA maintained ### Business Success - [ ] Monthly hosting costs <$100/month - [ ] Claude API costs <$200/month - [ ] Zero data breaches or security incidents - [ ] Privacy policy: zero complaints - [ ] Positive feedback from initial users --- ## Risk Assessment ### High-Risk Items | Risk | Probability | Impact | Mitigation | |------|-------------|--------|------------| | **Claude API costs exceed budget** | Medium | High | Implement strict rate limiting, token usage alerts, monthly spending cap | | **Security breach (data leak)** | Low | Critical | Security audit, penetration testing, bug bounty program (Phase 3) | | **AI generates inappropriate content** | Medium | High | Mandatory human approval, content filters, moderation queue | | **Server downtime during soft launch** | Medium | Medium | Uptime monitoring, automated backups, disaster recovery plan | | **GDPR/privacy compliance issues** | Low | High | Legal review, privacy-by-design, no third-party tracking | ### Medium-Risk Items | Risk | Probability | Impact | Mitigation | |------|-------------|--------|------------| | **OVHCloud service disruption** | Low | Medium | Multi-region backup plan, cloud provider diversification (Phase 3) | | **Email delivery issues (ProtonBridge)** | Medium | Low | Fallback SMTP provider, email queue system | | **Blog content quality concerns** | Medium | Medium | Editorial guidelines, human review, reader feedback loop | | **Performance degradation under load** | Medium | Medium | Load testing, CDN evaluation, database optimization | | **User confusion with UI/UX** | High | Low | User testing, clear documentation, onboarding flow | ### Low-Risk Items | Risk | Probability | Impact | Mitigation | |------|-------------|--------|------------| | **Domain registration issues** | Very Low | Low | Auto-renewal, registrar lock | | **SSL certificate expiry** | Very Low | Low | Certbot auto-renewal, monitoring alerts | | **Dependency vulnerabilities** | Medium | Very Low | Dependabot, regular npm audit | --- ## Decision Points ### Before Starting Phase 2 **Required Approvals from John Stroh:** 1. **Budget Approval** (see Budget Requirements section) - OVHCloud hosting: $30-80/month - Claude API: $50-200/month - Total: ~$100-300/month 2. **Timeline Confirmation** - Start date for Phase 2 - Acceptable completion timeframe (2-3 months) - Soft launch target date 3. **Content Strategy** - Blog editorial guidelines (TRA-OPS-0002) - Media response protocol (TRA-OPS-0003) - Case study moderation standards (TRA-OPS-0004) 4. **Privacy Policy** - Final wording for data collection - Analytics tool selection (Plausible vs. Matomo) - Email handling practices 5. **Soft Launch Strategy** - Target user cohort (researchers, implementers, advocates) - Invitation method (email, social media) - Feedback collection process ### During Phase 2 **Interim Decisions:** 1. **Week 2**: VPS tier selection (based on performance testing) 2. **Week 5**: Claude API usage limits (tokens/day, cost caps) 3. **Week 8**: Blog launch readiness (sufficient seed content?) 4. **Week 10**: Soft launch invite list (who to include?) 5. **Week 12**: Phase 3 go/no-go decision --- ## Dependencies ### External Dependencies 1. **OVHCloud** - VPS availability in preferred region - DNS propagation (<24 hours) - Support response time (for issues) 2. **Anthropic** - Claude API production access - API stability and uptime - Pricing stability (no unexpected increases) 3. **Let's Encrypt** - Certificate issuance - Auto-renewal functionality 4. **ProtonMail** - ProtonBridge availability - Email delivery reliability ### Internal Dependencies 1. **Phase 1 Completion** ✅ - All features tested and working - Clean codebase - Documentation complete 2. **Governance Documents** - TRA-OPS-* policies drafted (see Task 3) - Privacy Policy finalized - Terms of Service drafted 3. **Seed Content** - 5-10 initial blog posts (human-written) - 3-5 case studies (curated examples) - Documentation complete 4. **User Cohort** - 20-50 users identified for soft launch - Invitation list prepared - Feedback survey drafted --- ## Budget Requirements **See separate document: PHASE-2-COST-ESTIMATES.md** Summary: - **One-time**: $50-200 (SSL, setup) - **Monthly recurring**: $100-300 (hosting + API) - **Total Phase 2 cost**: ~$500-1,200 (3 months) --- ## Phase 2 → Phase 3 Transition ### Exit Criteria Phase 2 ends and Phase 3 begins when: - All success criteria met (see Success Criteria section) - Soft launch feedback incorporated - Zero critical bugs outstanding - Governance audit complete - John Stroh approves public launch ### Phase 3 Preview - Public launch and marketing campaign - Koha donation system (micropayments) - Multi-language translations - Community forums/discussion - Bug bounty program - Academic partnerships --- ## Appendices ### Appendix A: Technology Stack (Production) **Hosting**: OVHCloud VPS **OS**: Ubuntu 22.04 LTS **Web Server**: Nginx 1.24+ **Application**: Node.js 18+, Express 4.x **Database**: MongoDB 7.x **SSL/TLS**: Let's Encrypt (Certbot) **Email**: ProtonMail + ProtonBridge **Analytics**: Plausible (self-hosted) or Matomo **Error Tracking**: Sentry (cloud) or GlitchTip (self-hosted) **Monitoring**: UptimeRobot or self-hosted **AI Integration**: Anthropic Claude API (Sonnet 4.5) ### Appendix B: Key Performance Indicators (KPIs) **Technical KPIs**: - Uptime: 99.9%+ - Response time: <3s (95th percentile) - Error rate: <0.1% - Security vulnerabilities: 0 critical **User KPIs**: - Unique visitors: 100+/month (soft launch) - Blog readers: 50+/post average - Media inquiries: 5+/month - Case submissions: 3+/month **Business KPIs**: - Hosting costs: <$100/month - API costs: <$200/month - User satisfaction: 4+/5 - AI approval rate: 100% (all content human-approved) ### Appendix C: Rollback Plan If Phase 2 encounters critical issues: 1. **Immediate**: Revert to Phase 1 (local prototype) 2. **Within 24h**: Root cause analysis 3. **Within 72h**: Fix deployed or timeline extended 4. **Escalation**: Consult security experts if breach suspected --- **Document Version**: 1.0 **Last Updated**: 2025-10-07 **Next Review**: Start of Phase 2 (TBD) **Owner**: John Stroh **Contributors**: Claude Code (Anthropic Sonnet 4.5)