
Tractatus Framework - Elevator Pitches

General Public / Family & Friends Audience

Target: Non-technical audiences, family, friends, general public, media
Use Context: Social gatherings, media interviews, public talks, casual explanations
Emphasis: Relatable problem → Simple explanation → Research direction
Status: Research prototype demonstrating architectural AI safety


5. General Public / Family & Friends Audience

Priority: Relatable problem → Simple explanation → Research direction

Short (1 paragraph, ~100 words)

Tractatus is a research project exploring how to keep AI reliable. The challenge: AI systems often ignore what you specifically told them because their training makes them "autocorrect" your instructions—like your phone changing a correctly-spelled unusual name. When you tell an AI "use port 27027" for a good reason, it might silently change this to 27017 because that's what it saw in millions of examples. We've built a system that structurally prevents this and tested it on ourselves—it works reliably. Our main research question now is understanding how well this approach scales as organizations add more rules for different situations, studying whether we can optimize it to handle hundreds of rules efficiently.

Medium (2-3 paragraphs, ~250 words)

Tractatus is a research project exploring a fundamental question: How do you keep AI systems reliable when they're helping with important decisions? The problem we're addressing is surprisingly common. Imagine telling an AI assistant something specific—"use this port number, not the default one" or "prioritize privacy over convenience in this situation"—and the AI silently ignores you because its training makes it "autocorrect" your instruction. This happens because AI systems learn from millions of examples, and when your specific instruction conflicts with the pattern the AI learned, the pattern often wins. It's like autocorrect on your phone changing a correctly-spelled but unusual name to something more common—except with potentially serious consequences in business, healthcare, or research settings.

Our approach is to design AI systems where certain things are structurally impossible without human approval. Instead of training the AI to "do the right thing" and hoping that training holds up, we build guardrails: the AI literally cannot make decisions about values trade-offs (privacy vs. convenience, security vs. usability) without asking a human. It cannot silently change instructions you gave it. It monitors its own performance and recognizes when context is degrading—like a person recognizing they're too tired to make good decisions in a long meeting—and triggers a handoff. We've tested this extensively on ourselves while building this website (using the AI to help build the AI governance system), and it works reliably: catching problems before they happened, following instructions consistently, and asking for human judgment when appropriate.
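For technically curious readers, the "structurally impossible without approval" idea can be sketched in a few lines of code. This is an illustration only; every name below is hypothetical, not code from the Tractatus project:

```python
from dataclasses import dataclass

@dataclass
class Decision:
    description: str
    is_values_tradeoff: bool  # e.g. privacy vs. convenience

class HumanApprovalRequired(Exception):
    """Raised when the system refuses to decide a values trade-off alone."""

def execute(decision: Decision, human_approved: bool = False) -> str:
    # Structural guardrail: there is no autonomous code path through a
    # values trade-off; the only way forward is explicit human approval.
    if decision.is_values_tradeoff and not human_approved:
        raise HumanApprovalRequired(decision.description)
    return f"executed: {decision.description}"
```

The point of the sketch is that the refusal lives in the control flow, not in training: the system cannot "decide to" skip the check, any more than a car can drive through a guardrail.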

Our main research focus now is understanding scalability. As we've used the system, we've added rules for different situations, growing from 6 rules initially to 18 as we encountered and handled new problems. This is expected and good (the system learns from experience), but it raises an important question: how well does this approach work when an organization might need hundreds of rules to cover all its different situations? We're studying techniques to optimize the system so it can handle many rules efficiently—like organizing them by priority (always check critical rules, check less important ones only when relevant) or using machine learning to predict which rules matter for each situation. Understanding these scaling characteristics will help determine how this approach translates from our successful testing to larger organizational use.

Long (4-5 paragraphs, ~500 words)

Tractatus is a research project exploring how to keep AI systems reliable when they're helping with important work. If you've used AI assistants like ChatGPT, Claude, or Copilot, you've probably noticed they're impressively helpful but occasionally do confusing things—ignoring instructions you clearly gave, making confidently wrong statements, or making decisions that seem to miss important context you provided earlier in the conversation. We're investigating whether these problems can be prevented through better system design, not just better AI training.

The core challenge is surprisingly relatable. Imagine you're working with a very knowledgeable but somewhat unreliable assistant. You tell them something specific—"use port 27027 for the database, not the default port 27017, because we need it for this particular project"—and they nod, seem to understand, but then when they set up the database, they use 27017 anyway. When you ask why, they explain that port 27017 is the standard default for this type of database, so that seemed right. They've essentially "autocorrected" your explicit instruction based on what they learned was normal, even though you had a specific reason for the non-standard choice. Now imagine this happening hundreds of times across security settings, privacy policies, data handling procedures, and operational decisions. This is a real problem organizations face deploying AI systems: the AI doesn't reliably follow explicit instructions when those instructions conflict with patterns in its training data.
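One way to picture the fix for this "autocorrect" failure (a hypothetical sketch, not the project's actual implementation): explicit user instructions are stored separately from the model's learned defaults, and the stored instruction always wins at lookup time:

```python
# What millions of training examples made feel "normal" to the model.
LEARNED_DEFAULTS = {"db_port": 27017}

# Explicit user instructions, pinned so they cannot be "forgotten".
pinned: dict[str, object] = {}

def pin(key: str, value: object) -> None:
    pinned[key] = value

def resolve(key: str) -> object:
    # An explicit instruction always beats the learned default, so the
    # system cannot silently "autocorrect" the user's choice away.
    if key in pinned:
        return pinned[key]
    return LEARNED_DEFAULTS.get(key)

pin("db_port", 27027)  # "use 27027, not the default, for this project"
```

After the `pin` call, `resolve("db_port")` returns 27027 every time, even though 27017 is the familiar default.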

Traditional approaches to fixing this focus on better training: teach the AI to follow instructions more carefully, include examples of edge cases in training data, use techniques like reinforcement learning from human feedback. These help, but they all assume the AI will maintain this training under all conditions—complex tasks, long conversations, competing objectives. Our approach is different: instead of training the AI to make correct decisions, we're designing systems where incorrect decisions are structurally impossible. Think of it like the guardrails on a highway—they don't train you to drive better, they physically prevent you from going off the road.

We've built a prototype with five types of guardrails. First, instruction persistence: when you give explicit instructions, they're stored and checked before any major action—the system can't "forget" what you told it. Second, context monitoring: the system tracks its own performance (like monitoring how tired you're getting in a long meeting) and triggers handoffs before quality degrades. Third, values decisions: when a decision involves trade-offs between competing values (privacy vs. convenience, security vs. usability), the system recognizes it can't make that choice and requires human judgment. Fourth, conflict detection: before making changes, the system checks whether those changes conflict with instructions you gave earlier. Fifth, self-checking: for complex operations, the system verifies its own reasoning before proceeding, catching scope creep or misunderstandings.
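To make the second guardrail, context monitoring, concrete, here is a minimal sketch; the thresholds and names are invented for illustration and are not taken from the prototype:

```python
class ContextMonitor:
    """Tracks simple quality signals and requests a handoff before they degrade."""

    def __init__(self, max_turns: int = 50, max_corrections: int = 3):
        self.turns = 0
        self.corrections = 0  # times a human had to correct the output
        self.max_turns = max_turns
        self.max_corrections = max_corrections

    def record_turn(self, human_corrected: bool = False) -> None:
        self.turns += 1
        if human_corrected:
            self.corrections += 1

    def should_hand_off(self) -> bool:
        # Like a person noticing they're too tired for a long meeting:
        # stop and hand over before quality drops, not after.
        return (self.turns >= self.max_turns
                or self.corrections >= self.max_corrections)
```

The other four guardrails follow the same pattern: a small, inspectable check that runs outside the model and cannot be talked out of its job.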

We've tested this extensively on ourselves—using AI with these guardrails to help build the agentic governance website, https://agenticgovernance.digital. The results are measurable: the system caught problems before they were published (like fabricated statistics that weren't based on real data), followed instructions consistently across many work sessions (zero cases where it ignored what we told it), enforced security policies automatically, and recognized when to ask for human judgment on values decisions. This demonstrates the approach works reliably in real use, not just in theory.

Our main research focus now is understanding how this approach scales to larger organizations with more complex needs. As we've used the system, we've added rules for different situations we encountered, growing from 6 rules initially to 18 now. This is expected and positive: the system learns from experience and gets better at preventing problems. But it raises an important research question: how well does this approach work when an organization might need 50, 100, or 200 rules to cover all their different situations and requirements?

We're actively studying three ways to optimize the system for scale. First, consolidation: combining related rules to reduce total count while keeping the same coverage (like merging three security-related rules into one comprehensive security policy). Second, priority-based checking: organizing rules by how critical they are (always check the most important rules, but only check less critical ones when they're relevant to what you're doing). Third, smart prediction: using machine learning to predict which rules will actually matter for each situation, so the system only checks the relevant ones. Our research will determine whether architectural governance can work not just at small scale (where we've proven it works) but at the larger scale needed for enterprise organizations. We're conducting this research transparently—we'll publish what we find regardless of outcome, because organizations considering AI governance approaches deserve to understand both the capabilities and the limitations. The goal is to provide real data on how this approach performs at different scales, helping organizations make informed decisions about AI governance strategies.
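The "priority-based checking" idea in particular lends itself to a short sketch. Again, the rule names and tiers below are made up for illustration, not taken from the project:

```python
from typing import Callable

# Each rule: (name, tier, check). Critical rules run on every action;
# lower-tier rules run only when flagged as relevant to the situation.
Rule = tuple[str, str, Callable[[dict], bool]]

RULES: list[Rule] = [
    ("no-secret-in-output", "critical",
     lambda action: "secret" not in action.get("output", "")),
    ("db-port-pinned", "normal",
     lambda action: action.get("db_port", 27027) == 27027),
]

def check(action: dict, relevant: set[str]) -> list[str]:
    """Return names of violated rules, skipping irrelevant non-critical ones."""
    violations = []
    for name, tier, rule in RULES:
        if tier != "critical" and name not in relevant:
            continue  # most rules are skipped on most actions
        if not rule(action):
            violations.append(name)
    return violations
```

With hundreds of rules, most checks would be skipped on most actions; whether that saving holds up in practice is exactly the scaling property under study.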