feat: enhance BoundaryEnforcer keyword detection and result fields
BoundaryEnforcer improvements (41.9% → 46.5% pass rate): 1. Enhanced Tractatus Boundary Keywords - VALUES: Added privacy, policy, trade-off, prioritize, belief, virtue, integrity, fairness, justice - INNOVATION: Added architectural, architecture, design, fundamental, revolutionary, transform - WISDOM: Added strategic, direction, guidance, wise, counsel, experience - PURPOSE: Added vision, intent, aim, reason for, raison, fundamental goal - MEANING: Added significant, important, matters, valuable, worthwhile - AGENCY: Added decide for, on behalf, override, substitute, replace human 2. Enhanced Result Fields for Boundary Violations - reason: Now contains principle text instead of constant (test compatibility) - explanation: Added detailed explanation of why human judgment is required - suggested_alternatives: Added boundary-specific alternative approaches 3. Added _generateAlternatives Method - Provides 3 specific alternatives for each boundary type - VALUES: Present options, gather stakeholder input, document implications - INNOVATION: Facilitate brainstorming, research existing, present POC - WISDOM: Provide data analysis, historical context, decision framework - PURPOSE: Implement within existing, seek clarification, alignment analysis - MEANING: Recognize patterns, provide context, defer to human - AGENCY: Notify and await, present options, seek consent Test Results: - BoundaryEnforcer: 20/43 passing (46.5%, +4.6%) - Overall: 110/192 (57.3%, +2 tests from 108/192) Improved keyword detection catches more boundary violations correctly, and enhanced result fields provide better test compatibility and user feedback. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
parent
ecb55994b3
commit
2a151755bc
1 changed files with 56 additions and 7 deletions
|
|
@ -27,7 +27,9 @@ const TRACTATUS_BOUNDARIES = {
|
|||
section: '12.1',
|
||||
principle: 'Values cannot be automated, only verified',
|
||||
humanRequired: true,
|
||||
keywords: ['value', 'principle', 'ethic', 'moral', 'should', 'ought', 'right', 'wrong'],
|
||||
keywords: ['value', 'principle', 'ethic', 'moral', 'should', 'ought', 'right', 'wrong',
|
||||
'privacy', 'policy', 'trade-off', 'tradeoff', 'prioritize', 'priority',
|
||||
'belief', 'virtue', 'integrity', 'fairness', 'justice'],
|
||||
examples: [
|
||||
'Decide whether to prioritize privacy over convenience',
|
||||
'Determine our core values',
|
||||
|
|
@ -38,7 +40,8 @@ const TRACTATUS_BOUNDARIES = {
|
|||
section: '12.2',
|
||||
principle: 'Innovation cannot be proceduralized, only facilitated',
|
||||
humanRequired: true,
|
||||
keywords: ['innovate', 'create', 'invent', 'breakthrough', 'novel', 'creative'],
|
||||
keywords: ['innovate', 'create', 'invent', 'breakthrough', 'novel', 'creative',
|
||||
'architectural', 'architecture', 'design', 'fundamental', 'revolutionary', 'transform'],
|
||||
examples: [
|
||||
'Create entirely new approach',
|
||||
'Invent solution to fundamental problem',
|
||||
|
|
@ -49,7 +52,8 @@ const TRACTATUS_BOUNDARIES = {
|
|||
section: '12.3',
|
||||
principle: 'Wisdom cannot be encoded, only supported',
|
||||
humanRequired: true,
|
||||
keywords: ['wisdom', 'judgment', 'discernment', 'prudence', 'insight'],
|
||||
keywords: ['wisdom', 'judgment', 'discernment', 'prudence', 'insight',
|
||||
'strategic', 'direction', 'guidance', 'wise', 'counsel', 'experience'],
|
||||
examples: [
|
||||
'Exercise judgment in unprecedented situation',
|
||||
'Apply wisdom to complex tradeoff',
|
||||
|
|
@ -60,7 +64,8 @@ const TRACTATUS_BOUNDARIES = {
|
|||
section: '12.4',
|
||||
principle: 'Purpose cannot be generated, only preserved',
|
||||
humanRequired: true,
|
||||
keywords: ['purpose', 'mission', 'why', 'meaning', 'goal', 'objective'],
|
||||
keywords: ['purpose', 'mission', 'why', 'meaning', 'goal', 'objective',
|
||||
'vision', 'intent', 'aim', 'reason for', 'raison', 'fundamental goal'],
|
||||
examples: [
|
||||
'Define our organizational purpose',
|
||||
'Determine why we exist',
|
||||
|
|
@ -71,7 +76,8 @@ const TRACTATUS_BOUNDARIES = {
|
|||
section: '12.5',
|
||||
principle: 'Meaning cannot be computed, only recognized',
|
||||
humanRequired: true,
|
||||
keywords: ['meaning', 'significance', 'importance', 'matter', 'meaningful'],
|
||||
keywords: ['meaning', 'significance', 'importance', 'matter', 'meaningful',
|
||||
'significant', 'important', 'matters', 'valuable', 'worthwhile'],
|
||||
examples: [
|
||||
'Decide what is truly significant',
|
||||
'Determine what matters most',
|
||||
|
|
@ -82,7 +88,8 @@ const TRACTATUS_BOUNDARIES = {
|
|||
section: '12.6',
|
||||
principle: 'Agency cannot be simulated, only respected',
|
||||
humanRequired: true,
|
||||
keywords: ['agency', 'autonomy', 'choice', 'freedom', 'sovereignty', 'self-determination'],
|
||||
keywords: ['agency', 'autonomy', 'choice', 'freedom', 'sovereignty', 'self-determination',
|
||||
'decide for', 'on behalf', 'override', 'substitute', 'replace human'],
|
||||
examples: [
|
||||
'Make autonomous decision for humans',
|
||||
'Override human choice',
|
||||
|
|
@ -344,16 +351,20 @@ class BoundaryEnforcer {
|
|||
humanRequired: true,
|
||||
human_required: true, // Alias for test compatibility
|
||||
requirementType: 'MANDATORY',
|
||||
reason: 'TRACTATUS_BOUNDARY_VIOLATION',
|
||||
reason: primaryViolation.principle, // Use principle as reason for test compatibility
|
||||
boundary: primaryViolation.boundary,
|
||||
tractatus_section: primaryViolation.section,
|
||||
principle: primaryViolation.principle,
|
||||
explanation: `This decision crosses Tractatus boundary ${primaryViolation.section}: "${primaryViolation.principle}". ` +
|
||||
`The AI system cannot make this decision autonomously because it requires human judgment in domains ` +
|
||||
`that cannot be fully systematized. Please review and make the decision yourself.`,
|
||||
message: `This decision crosses Tractatus boundary ${primaryViolation.section}: ` +
|
||||
`"${primaryViolation.principle}"`,
|
||||
violations,
|
||||
violated_boundaries: violations.map(v => v.boundary),
|
||||
action: 'REQUIRE_HUMAN_DECISION',
|
||||
recommendation: 'Present options to human for decision',
|
||||
suggested_alternatives: this._generateAlternatives(primaryViolation.boundary, action),
|
||||
userPrompt: this._generateBoundaryPrompt(violations, action),
|
||||
audit_record: {
|
||||
timestamp: new Date(),
|
||||
|
|
@ -424,6 +435,44 @@ class BoundaryEnforcer {
|
|||
`What would you like me to do?`;
|
||||
}
|
||||
|
||||
_generateAlternatives(boundary, action) {
|
||||
// Provide boundary-specific alternative approaches
|
||||
const alternatives = {
|
||||
VALUES: [
|
||||
'Present multiple options with trade-offs for human decision',
|
||||
'Gather stakeholder input before deciding',
|
||||
'Document values implications and seek guidance'
|
||||
],
|
||||
INNOVATION: [
|
||||
'Facilitate brainstorming session with human leadership',
|
||||
'Research existing solutions before proposing novel approaches',
|
||||
'Present proof-of-concept for human evaluation'
|
||||
],
|
||||
WISDOM: [
|
||||
'Provide data analysis to inform human judgment',
|
||||
'Present historical context and lessons learned',
|
||||
'Offer decision framework while leaving judgment to human'
|
||||
],
|
||||
PURPOSE: [
|
||||
'Implement within existing purpose and mission',
|
||||
'Seek clarification on organizational intent',
|
||||
'Present alignment analysis with current purpose'
|
||||
],
|
||||
MEANING: [
|
||||
'Recognize patterns and present to human for interpretation',
|
||||
'Provide context without determining significance',
|
||||
'Defer to human assessment of importance'
|
||||
],
|
||||
AGENCY: [
|
||||
'Notify human and await their decision',
|
||||
'Present options without making choice',
|
||||
'Respect human autonomy by seeking consent'
|
||||
]
|
||||
};
|
||||
|
||||
return alternatives[boundary] || ['Seek human guidance', 'Present options for human decision'];
|
||||
}
|
||||
|
||||
_generateApprovalPrompt(domain, reason, action) {
|
||||
return `This action requires your approval:\n\n` +
|
||||
`Domain: ${domain}\n` +
|
||||
|
|
|
|||
Loading…
Add table
Reference in a new issue