feat: enhance BoundaryEnforcer keyword detection and result fields

BoundaryEnforcer improvements (41.9% → 46.5% pass rate):

1. Enhanced Tractatus Boundary Keywords
   - VALUES: Added privacy, policy, trade-off, prioritize, belief, virtue, integrity, fairness, justice
   - INNOVATION: Added architectural, architecture, design, fundamental, revolutionary, transform
   - WISDOM: Added strategic, direction, guidance, wise, counsel, experience
   - PURPOSE: Added vision, intent, aim, reason for, raison, fundamental goal
   - MEANING: Added significant, important, matters, valuable, worthwhile
   - AGENCY: Added decide for, on behalf, override, substitute, replace human

2. Enhanced Result Fields for Boundary Violations
   - reason: Now contains principle text instead of constant (test compatibility)
   - explanation: Added detailed explanation of why human judgment is required
   - suggested_alternatives: Added boundary-specific alternative approaches

3. Added _generateAlternatives Method
   - Provides 3 specific alternatives for each boundary type
   - VALUES: Present options, gather stakeholder input, document implications
   - INNOVATION: Facilitate brainstorming, research existing, present POC
   - WISDOM: Provide data analysis, historical context, decision framework
   - PURPOSE: Implement within existing, seek clarification, alignment analysis
   - MEANING: Recognize patterns, provide context, defer to human
   - AGENCY: Notify and await, present options, seek consent

Test Results:
- BoundaryEnforcer: 20/43 passing (46.5%, +4.6%)
- Overall: 110/192 (57.3%, +2 tests from 108/192)

Improved keyword detection catches more boundary violations correctly,
and enhanced result fields provide better test compatibility and user feedback.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
TheFlow 2025-10-07 08:39:58 +13:00
parent ecb55994b3
commit 2a151755bc

View file

@ -27,7 +27,9 @@ const TRACTATUS_BOUNDARIES = {
section: '12.1',
principle: 'Values cannot be automated, only verified',
humanRequired: true,
keywords: ['value', 'principle', 'ethic', 'moral', 'should', 'ought', 'right', 'wrong'],
keywords: ['value', 'principle', 'ethic', 'moral', 'should', 'ought', 'right', 'wrong',
'privacy', 'policy', 'trade-off', 'tradeoff', 'prioritize', 'priority',
'belief', 'virtue', 'integrity', 'fairness', 'justice'],
examples: [
'Decide whether to prioritize privacy over convenience',
'Determine our core values',
@ -38,7 +40,8 @@ const TRACTATUS_BOUNDARIES = {
section: '12.2',
principle: 'Innovation cannot be proceduralized, only facilitated',
humanRequired: true,
keywords: ['innovate', 'create', 'invent', 'breakthrough', 'novel', 'creative'],
keywords: ['innovate', 'create', 'invent', 'breakthrough', 'novel', 'creative',
'architectural', 'architecture', 'design', 'fundamental', 'revolutionary', 'transform'],
examples: [
'Create entirely new approach',
'Invent solution to fundamental problem',
@ -49,7 +52,8 @@ const TRACTATUS_BOUNDARIES = {
section: '12.3',
principle: 'Wisdom cannot be encoded, only supported',
humanRequired: true,
keywords: ['wisdom', 'judgment', 'discernment', 'prudence', 'insight'],
keywords: ['wisdom', 'judgment', 'discernment', 'prudence', 'insight',
'strategic', 'direction', 'guidance', 'wise', 'counsel', 'experience'],
examples: [
'Exercise judgment in unprecedented situation',
'Apply wisdom to complex tradeoff',
@ -60,7 +64,8 @@ const TRACTATUS_BOUNDARIES = {
section: '12.4',
principle: 'Purpose cannot be generated, only preserved',
humanRequired: true,
keywords: ['purpose', 'mission', 'why', 'meaning', 'goal', 'objective'],
keywords: ['purpose', 'mission', 'why', 'meaning', 'goal', 'objective',
'vision', 'intent', 'aim', 'reason for', 'raison', 'fundamental goal'],
examples: [
'Define our organizational purpose',
'Determine why we exist',
@ -71,7 +76,8 @@ const TRACTATUS_BOUNDARIES = {
section: '12.5',
principle: 'Meaning cannot be computed, only recognized',
humanRequired: true,
keywords: ['meaning', 'significance', 'importance', 'matter', 'meaningful'],
keywords: ['meaning', 'significance', 'importance', 'matter', 'meaningful',
'significant', 'important', 'matters', 'valuable', 'worthwhile'],
examples: [
'Decide what is truly significant',
'Determine what matters most',
@ -82,7 +88,8 @@ const TRACTATUS_BOUNDARIES = {
section: '12.6',
principle: 'Agency cannot be simulated, only respected',
humanRequired: true,
keywords: ['agency', 'autonomy', 'choice', 'freedom', 'sovereignty', 'self-determination'],
keywords: ['agency', 'autonomy', 'choice', 'freedom', 'sovereignty', 'self-determination',
'decide for', 'on behalf', 'override', 'substitute', 'replace human'],
examples: [
'Make autonomous decision for humans',
'Override human choice',
@ -344,16 +351,20 @@ class BoundaryEnforcer {
humanRequired: true,
human_required: true, // Alias for test compatibility
requirementType: 'MANDATORY',
reason: 'TRACTATUS_BOUNDARY_VIOLATION',
reason: primaryViolation.principle, // Use principle as reason for test compatibility
boundary: primaryViolation.boundary,
tractatus_section: primaryViolation.section,
principle: primaryViolation.principle,
explanation: `This decision crosses Tractatus boundary ${primaryViolation.section}: "${primaryViolation.principle}". ` +
`The AI system cannot make this decision autonomously because it requires human judgment in domains ` +
`that cannot be fully systematized. Please review and make the decision yourself.`,
message: `This decision crosses Tractatus boundary ${primaryViolation.section}: ` +
`"${primaryViolation.principle}"`,
violations,
violated_boundaries: violations.map(v => v.boundary),
action: 'REQUIRE_HUMAN_DECISION',
recommendation: 'Present options to human for decision',
suggested_alternatives: this._generateAlternatives(primaryViolation.boundary, action),
userPrompt: this._generateBoundaryPrompt(violations, action),
audit_record: {
timestamp: new Date(),
@ -424,6 +435,44 @@ class BoundaryEnforcer {
`What would you like me to do?`;
}
_generateAlternatives(boundary, action) {
// Provide boundary-specific alternative approaches
const alternatives = {
VALUES: [
'Present multiple options with trade-offs for human decision',
'Gather stakeholder input before deciding',
'Document values implications and seek guidance'
],
INNOVATION: [
'Facilitate brainstorming session with human leadership',
'Research existing solutions before proposing novel approaches',
'Present proof-of-concept for human evaluation'
],
WISDOM: [
'Provide data analysis to inform human judgment',
'Present historical context and lessons learned',
'Offer decision framework while leaving judgment to human'
],
PURPOSE: [
'Implement within existing purpose and mission',
'Seek clarification on organizational intent',
'Present alignment analysis with current purpose'
],
MEANING: [
'Recognize patterns and present to human for interpretation',
'Provide context without determining significance',
'Defer to human assessment of importance'
],
AGENCY: [
'Notify human and await their decision',
'Present options without making choice',
'Respect human autonomy by seeking consent'
]
};
return alternatives[boundary] || ['Seek human guidance', 'Present options for human decision'];
}
_generateApprovalPrompt(domain, reason, action) {
return `This action requires your approval:\n\n` +
`Domain: ${domain}\n` +