diff --git a/public/integrations/agent-lightning.html b/public/integrations/agent-lightning.html
index 9d0cd701..630a0f7e 100644
--- a/public/integrations/agent-lightning.html
+++ b/public/integrations/agent-lightning.html
@@ -4,60 +4,335 @@ Agent Lightning Integration | Tractatus AI Safety Framework
+ +
+ +
-
+

Agent Lightning Integration

-

Governance + Performance: Can safety boundaries persist through RL optimization?

+

Governance + Performance: Can safety boundaries persist through reinforcement learning optimization?

+

Status: Preliminary findings (small-scale) | Integration Date: October 2025

+
-

Two-Layer Architecture

-
-
-

1. Governance Layer (Tractatus)

+

What is Agent Lightning?

+

+ Agent Lightning is Microsoft's open-source framework for optimizing AI agent performance with reinforcement learning (RL). Instead of relying on static prompts, agents learn and improve through continuous training on real feedback. +

+ +
+
+

Traditional AI Agents

-  • ✓ Enforces values decisions
-  • ✓ Blocks constraint violations
-  • ✓ Independent of optimization
+  • ❌ Fixed prompts/instructions
+  • ❌ No learning from mistakes
+  • ❌ Manual tuning required
+  • ❌ Performance plateaus quickly
-
-

2. Performance Layer (Agent Lightning)

+
+

Agent Lightning

-  • ⚡ RL-based optimization
-  • ⚡ Learns from feedback
-  • ⚡ Operates within constraints
+  • ✅ Learns from feedback continuously
+  • ✅ Improves through RL optimization
+  • ✅ Self-tunes strategy automatically
+  • ✅ Performance improves over time
+ +
+

+ The Problem: When agents learn autonomously, how do you maintain governance boundaries? Policies written into prompts fail because agents can learn to optimize around them. +

+
+
+ + +
+

Tractatus Solution: Two-Layer Architecture

+ +

+ We separate governance from optimization by running them as independent architectural layers. Agent Lightning optimizes performance within governance constraints—not around them. +

+ +
+
+
+
1
+

Governance Layer (Tractatus)

+
+
    +
+  • Validates every proposed action
+  • Blocks constraint violations
+  • Enforces values boundaries
+  • Independent of optimization
+  • Architecturally enforced
+
+ +
+
+
2
+

Performance Layer (Agent Lightning)

+
+
    +
+  • RL-based optimization
+  • Learns from feedback
+  • Improves task performance
+  • Operates within constraints
+  • Continuous training
+
+
+ +
+

🔑 Key Design Principle

+

+ Governance checks run before AL optimization and continuously validate during training loops. Architectural separation prevents optimization from degrading safety boundaries. +

+
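The split described above can be sketched as a thin wrapper around the RL update step. Everything here (the class names, the keyword filter, the reward bookkeeping) is an illustrative stand-in, not the actual Tractatus or Agent Lightning API:

```python
# Minimal sketch of the two-layer design: a governance gate that every
# candidate action must pass BEFORE the performance layer's reward update.
# All names here (GovernanceLayer, blocked_topics, etc.) are hypothetical.

class GovernanceLayer:
    """Validates actions against fixed constraints; never updated by RL."""

    def __init__(self, blocked_topics):
        self.blocked_topics = set(blocked_topics)
        self.violations_blocked = 0

    def check(self, action: str) -> bool:
        if any(topic in action.lower() for topic in self.blocked_topics):
            self.violations_blocked += 1
            return False
        return True


def training_step(governance, policy_scores, candidate, reward):
    """One RL-style update: the reward signal only ever sees approved actions."""
    if not governance.check(candidate):
        return False                      # blocked: no reward, no learning on it
    policy_scores[candidate] = policy_scores.get(candidate, 0.0) + reward
    return True


gov = GovernanceLayer(blocked_topics=["clickbait"])
scores = {}
training_step(gov, scores, "informative summary", reward=0.89)
training_step(gov, scores, "clickbait headline", reward=0.94)  # higher reward, still blocked
print(scores)                  # only the governed action was reinforced
print(gov.violations_blocked)  # 1
```

The essential property is that the reward update is unreachable for blocked actions, so optimization pressure never flows through a violation.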
+
+ + +
+

Demo 2: Preliminary Results

+ +
+

+ ⚠️ Validation Status: These results come from a single agent over 5 training rounds in a simulated environment. They are NOT validated at scale; scalability testing is required before drawing conclusions about production viability. +

+
+ +
+ | Metric                   | Ungoverned | Governed    | Difference       |
+ | ------------------------ | ---------- | ----------- | ---------------- |
+ | Performance (engagement) | 94%        | 89%         | -5%              |
+ | Governance coverage      | 0%         | 100%        | +100%            |
+ | Constraint violations    | 5          | 0           | -5 (all blocked) |
+ | Strategy                 | Clickbait  | Informative | Values-aligned   |
+ | Training stability       | Variable   | Consistent  | More predictable |
+
+ +
+
+
-5%
+
Performance cost for governance
+
+
+
100%
+
Governance coverage maintained
+
+
+
0
+
Constraint violations (all blocked)
+
+
+ +
+

What This Means

+

+ At small scale (1 agent, 5 rounds), architectural governance appears compatible with RL optimization. The 5% performance cost bought 100% constraint adherence and values alignment. The critical question: does this hold at scale? +

+
+
+ + +
+

Five Critical Research Gaps

+

These are the open questions we're actively investigating. If you're interested in collaborating, we'd love to hear from you.

+ +
+
+

1. Scalability of Governance Overhead

+

Question: Does the ~5% performance cost remain constant as we scale from 1 agent → 10 agents → 1000 agents?

+

Current Data: 5% cost observed at 1 agent, 5 rounds

+

Why it matters: If overhead scales linearly, governance becomes prohibitively expensive at scale. If it's constant, governance is practically viable for production systems.

+

Research Need: Test with 10 → 100 → 1000 agents at production scale

+
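One way to probe this gap is a toy harness that times a round of actions with and without a per-action check. The `validate` stub below is a hypothetical stand-in for real Tractatus validation, which would be heavier:

```python
# Hypothetical harness for the scalability question: measure the relative
# overhead of a per-action governance check as the number of agents grows.
import time

def validate(action):
    # Stand-in governance check: constant-time rule lookup.
    return "blocked" not in action

def run_round(n_agents, governed):
    """Simulate one round of actions, optionally gated by governance."""
    start = time.perf_counter()
    for i in range(n_agents):
        action = f"agent-{i} action"
        if governed and not validate(action):
            continue  # blocked actions are skipped, not executed
        # ... perform the action ...
    return time.perf_counter() - start

for n in (1, 10, 100):
    base = run_round(n, governed=False)
    gov = run_round(n, governed=True)
    # If the check is O(1) per action, the relative overhead should stay
    # roughly flat as n grows rather than compounding.
    print(n, f"overhead = {gov - base:.6f}s")
```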
+ +
+

2. Long-term Boundary Persistence

+

Question: Do governance constraints remain effective through hundreds/thousands of RL training rounds?

+

Current Data: 100% constraint adherence across 5 rounds

+

Why it matters: Instruction fade, where rules stated in prompts lose force as training continues, is a known problem. If boundaries degrade over time, governance fails silently, potentially catastrophically in production.

+

Research Need: Long-term studies (1000+ rounds) tracking constraint drift

+
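A minimal sketch of what such drift tracking could look like, with synthetic round data and hypothetical helper names (not the shipped modules):

```python
# Drift monitor sketch for long-horizon training: track the per-round
# constraint-violation rate so silent boundary decay becomes measurable.

def violation_rate(actions, is_violation):
    flagged = sum(1 for a in actions if is_violation(a))
    return flagged / len(actions)

def detect_drift(rates, baseline=0.0, tolerance=0.02):
    """Return the first round whose violation rate exceeds baseline + tolerance."""
    for round_idx, rate in enumerate(rates, start=1):
        if rate > baseline + tolerance:
            return round_idx
    return None

is_violation = lambda a: "clickbait" in a
history = [
    ["informative", "informative"],       # round 1
    ["informative", "informative"],       # round 2
    ["informative", "clickbait teaser"],  # round 3: boundary starting to slip
]
rates = [violation_rate(r, is_violation) for r in history]
print(rates)                # [0.0, 0.0, 0.5]
print(detect_drift(rates))  # 3
```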
+ +
+

3. Adversarial Resistance

+

Question: Can agents learn to optimize around governance constraints?

+

Current Data: No adversarial testing performed

+

Why it matters: If agents can learn to circumvent boundaries through clever optimization strategies, architectural governance is illusory. This is a critical failure mode.

+

Research Need: Stress testing with agents explicitly incentivized to bypass governance

+
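A toy version of such a stress test, with an "agent" that mostly picks forbidden actions; the action set, gate, and agent are all illustrative assumptions:

```python
# Adversarial stress-test sketch: the agent is explicitly incentivized to
# pick forbidden actions, and we check whether any forbidden action ever
# reaches execution past the governance gate.
import random

FORBIDDEN = {"exfiltrate", "deceive", "bypass"}

def governance_gate(action):
    return action not in FORBIDDEN

def adversarial_agent(rng):
    # Chooses a forbidden action most of the time by construction.
    return rng.choice(list(FORBIDDEN) + ["comply"])

rng = random.Random(0)  # fixed seed for reproducibility
executed, blocked = [], 0
for _ in range(1000):
    action = adversarial_agent(rng)
    if governance_gate(action):
        executed.append(action)
    else:
        blocked += 1

leaks = [a for a in executed if a in FORBIDDEN]
print(len(leaks), blocked)  # a sound gate should show 0 leaks
```

Real stress testing would let the agent mutate its action strings under a learned policy; this sketch only captures the invariant being tested, namely that no forbidden action is ever executed.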
+ +
+

4. Performance Gap Closure

+

Question: Does the 5% performance gap close with more training, or is it a persistent trade-off?

+

Current Data: Gap observed at round 5, no data beyond that point

+

Why it matters: If the gap persists, we need to quantify the cost-benefit clearly. If it closes, governance may be "free" long-term—dramatically changing adoption calculations.

+

Research Need: Extended training (100+ rounds) to see if governed agents converge to ungoverned performance

+
+ +
+

5. Multi-Agent Coordination Under Governance

+

Question: How does architectural governance affect emergent coordination in multi-agent systems?

+

Current Data: Single-agent testing only

+

Why it matters: Real-world agentic systems are multi-agent (customer service, logistics, research teams). Governance that works for one agent may fail when agents must coordinate. Emergent behaviors are unpredictable.

+

Research Need: Test collaborative and competitive multi-agent environments with architectural governance

+
+
-
-

Join the Community

-
-
-

Tractatus Discord

-

Governance-focused discussions

- Join Tractatus → + +
+

🎯 Live Demonstration: This Page IS the Integration

+

The feedback button on this page (bottom right) demonstrates the Tractatus + Agent Lightning integration in production. When you submit feedback, it goes through:

+ +
+
+
1️⃣
+

Governance Check

+

Tractatus validates: PII detection, sentiment boundaries, compliance requirements

-
-

Agent Lightning Discord

-

Technical implementation help

- Join Agent Lightning → +
+
2️⃣
+

AL Optimization

+

Agent Lightning learns patterns: what feedback is most useful, how to improve responses

+
+
+
3️⃣
+

Continuous Validation

+

Every action is re-validated. If governance detects drift, the action is blocked automatically

+ +
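The three-stage path above can be sketched in Python; the PII pattern, the keyword-weighting "learning" step, and the function names are illustrative assumptions, not the deployed pipeline:

```python
# Sketch of the feedback path: governance check, then optimization,
# then re-validation before anything is acted on.
import re

PII_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")  # crude email check

def governance_check(text):
    """Stage 1: reject feedback containing obvious PII."""
    return PII_PATTERN.search(text) is None

def optimize(weights, text):
    """Stage 2: stand-in 'learning' step that upweights observed keywords."""
    for word in text.lower().split():
        weights[word] = weights.get(word, 0) + 1
    return weights

def process_feedback(text, weights):
    if not governance_check(text):  # stage 1: governance check
        return "blocked"
    optimize(weights, text)         # stage 2: AL-style optimization
    if not governance_check(text):  # stage 3: continuous re-validation
        return "blocked"
    return "accepted"

weights = {}
print(process_feedback("great explanation of governance", weights))  # accepted
print(process_feedback("email me at user@example.com", weights))     # blocked
```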
+

🔬 Meta-Research Opportunity

+

This isn't just a demo—it's a live research deployment. Your feedback helps us understand governance overhead at scale. Every submission is logged (anonymously) for analysis.

+
+ + +
+

Join the Community & Get the Code

+ +
+
+
+
💬
+
+

Tractatus Discord

+

Governance-focused discussions

+
+
+

Architectural constraints, research gaps, compliance, human agency preservation, multi-stakeholder deliberation.

+ Join Tractatus Server → +
+ +
+
+
+
+

Agent Lightning Discord

+

Technical implementation help

+
+
+

RL optimization, integration support, performance tuning, technical implementation questions.

+ Join Agent Lightning Server → +
+
+ +
+

📦 Download Installation Pack

+

Complete integration including 3 demos (baseline, governed, production), Python governance modules, and Agent Lightning wrapper code. Apache 2.0 licensed.

+ Download Install Pack (Apache 2.0) → +
+
+ + +
+

Collaborate on Open Research Questions

+

We're seeking researchers, implementers, and organizations interested in scalability testing, adversarial resistance studies, and multi-agent governance experiments.

+
    +
  • ✓ Integration code and governance modules
  • +
  • ✓ Technical documentation
  • +
  • ✓ Research collaboration framework
  • +
  • ✓ Audit log access (anonymized)
  • +
+
+ + View Research Context → +
+
+
diff --git a/public/js/components/feedback.js b/public/js/components/feedback.js
index 6ca3ff55..384cebb5 100644
--- a/public/js/components/feedback.js
+++ b/public/js/components/feedback.js
@@ -21,16 +21,16 @@ class TractausFeedback {
   }

   async init() {
-    // Get CSRF token
-    await this.fetchCsrfToken();
-
-    // Render components
+    // Render components IMMEDIATELY (don't wait for CSRF)
     this.renderFAB();
     this.renderModal();

     // Attach event listeners
     this.attachEventListeners();

+    // Get CSRF token in parallel (non-blocking)
+    this.fetchCsrfToken();
+
     // Listen for window resize
     window.addEventListener('resize', () => {
       this.isMobile = window.matchMedia('(max-width: 768px)').matches;