tractatus/docs/governance/PRIVACY-PRESERVING-ANALYTICS-PLAN.md
TheFlow 03fdb080bd docs: Close privacy-preserving analytics plan (Option A: No Analytics)
Update governance document to reflect the final decision: no analytics
on the website. Records the decision history from deferral through
Umami implementation and removal to final policy alignment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 13:49:28 +13:00


# Privacy-Preserving Analytics Implementation Plan
**Document Type:** Implementation Plan
**Created:** 2025-10-11
**Author:** Claude (Session 2025-10-07-001)
**Priority:** CRITICAL (Values alignment)
**Status:** CLOSED - Option A chosen (No Analytics)
**Decision History:**
- 2025-10-11: Deferred by Human PM (John Stroh)
- 2025-11-XX: Umami Analytics implemented and deployed
- 2026-01-20: Umami Analytics removed (commit 403a54d). Decision: No analytics.
- 2026-02-10: Privacy.html updated to remove all analytics references. Plan formally closed.
**Related Documents:** TRA-VAL-0001 (Core Values), privacy.html
**Primary Quadrant:** STRATEGIC (Values-sensitive decision)
---
## Executive Summary
**Problem Identified:** The Tractatus privacy policy claims "privacy-respecting analytics (no cross-site tracking)" but NO analytics implementation currently exists. This creates a gap between stated policy and actual implementation.
**Values Consideration:** Per TRA-VAL-0001, our core value is "Privacy-First Design: No tracking, no surveillance, minimal data collection." This is a **values-sensitive decision requiring human approval**.
**Recommended Solution:** Implement Plausible Analytics (cloud-hosted initially, self-hosted in Phase 2) as a privacy-preserving analytics solution that aligns with our core values.
---
## Current State Analysis
### What Was Discovered (October 11, 2025)
1. **No Analytics Implementation Found:**
   - Searched all HTML files for Google Analytics, Plausible, Matomo, tracking scripts
   - No third-party analytics scripts present
   - No analytics cookies being set
2. **Privacy Policy Claims Analytics Exist:**
   - Line 64: "Cookies: Session management, preferences (e.g., selected currency), **analytics**"
   - Line 160: "**Analytics Cookies:** Privacy-respecting analytics (no cross-site tracking)"
3. **Legitimate Data Storage Found:**
   - `localStorage.tractatus_currency` - User's currency preference
   - `localStorage.tractatus_search_history` - Docs search history
   - `localStorage.auth_token` - Authentication token
   - `localStorage.admin_token` - Admin panel authentication
   - All legitimate, privacy-respecting uses
4. **Admin Audit Analytics (Separate):**
   - `/admin/audit-analytics.html` exists but is for **internal governance auditing**
   - Tracks AI governance decisions (BoundaryEnforcer, etc.)
   - NOT user behavior tracking
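
The search described in item 1 could be reproduced with a small script. The following is a minimal sketch, not the audit that was actually run: the signature patterns in `TRACKER_PATTERNS` and the function names are assumptions chosen for illustration, and a real audit would likely need a longer list.

```python
import re
from pathlib import Path

# Hypothetical signature list: substrings that commonly appear in analytics
# script URLs. These patterns are illustrative, not taken from the audit.
TRACKER_PATTERNS = {
    "Google Analytics": re.compile(r"googletagmanager\.com|google-analytics\.com", re.I),
    "Plausible": re.compile(r"plausible\.io", re.I),
    "Matomo": re.compile(r"matomo(\.js|\.php)", re.I),
}


def find_trackers(html: str) -> list[str]:
    """Return the names of trackers whose signature appears in the HTML."""
    return [name for name, pattern in TRACKER_PATTERNS.items() if pattern.search(html)]


def scan_site(root: Path) -> dict[str, list[str]]:
    """Map each HTML file under root to the trackers found in it (hits only)."""
    hits = {}
    for path in sorted(root.rglob("*.html")):
        found = find_trackers(path.read_text(encoding="utf-8", errors="replace"))
        if found:
            hits[str(path)] = found
    return hits
```

Calling `scan_site(Path("public"))`, where `public` stands in for the real web root, returns an empty dictionary when no listed tracker is present, which matches the finding above.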
---
## Options Analysis
### Option A: Remove Analytics Claims from Privacy Policy
**Approach:** Update privacy.html to remove all mentions of analytics cookies and tracking.
**Pros:**
- Simple, immediate fix
- No new code to maintain
- Truly minimal data collection
- Zero privacy risk
**Cons:**
- Lose visibility into basic usage patterns (which pages are valuable?)
- Can't measure impact of improvements
- Can't understand referrer sources (how did users find us?)
- Harder to demonstrate framework adoption/impact
- Privacy policy already published with analytics claim
**Values Alignment:** ✅ Fully aligned with "Privacy-First Design"
---
### Option B: Implement Privacy-Preserving Analytics (RECOMMENDED)
**Approach:** Implement Plausible Analytics, a privacy-first analytics tool designed for GDPR/CCPA compliance.
#### Why Plausible?
**Privacy Features:**
- ✅ No cookies used (100% cookie-free)
- ✅ No personal data collected (no IP logging, no fingerprinting)
- ✅ No cross-site tracking
- ✅ All data anonymized by default
- ✅ GDPR/CCPA/PECR compliant without cookie banners
- ✅ Open source (transparency)
- ✅ Lightweight (<1KB script vs. Google Analytics 45KB+)
  - Does not slow down page load
**Data Collected (All Anonymized):**
- Page views
- Referrer sources (where visitors came from)
- Browser/device type (general categories only)
- Country (derived from IP, not stored)
- Visit duration (aggregate, not individual tracking)
**Data NOT Collected:**
- Individual IP addresses
- User identifiers
- Personal information
- Cross-site behavior
- Long-term tracking cookies
**Values Alignment:** Aligns with "Privacy-First Design: minimal data collection" while providing aggregate data to guide site improvements
---
## Recommended Implementation: Plausible Analytics
### Phase 1: Cloud-Hosted Plausible (Immediate)
**Timeline:** 1-2 hours implementation
**Approach:**
1. Sign up for Plausible Cloud ($9/month for up to 10k monthly pageviews)
2. Add single script tag to HTML pages: `<script defer data-domain="agenticgovernance.digital" src="https://plausible.io/js/script.js"></script>`
3. Configure dashboard access (admin-only)
4. Update privacy.html to explicitly mention Plausible
**Cost:** $9/month (~$108/year)
**Pros:**
- Zero infrastructure maintenance
- Immediate implementation
- Professionally managed, high uptime
- EU/US data residency options
- Built-in dashboard
**Cons:**
- Ongoing monthly cost
- Data hosted by third party (though anonymized)
- Less control over data sovereignty
---
### Phase 2: Self-Hosted Plausible (Future)
**Timeline:** Phase 2 infrastructure work (Q2 2026)
**Approach:**
1. Deploy Plausible CE (Community Edition) on VPS
2. PostgreSQL + ClickHouse database setup
3. Nginx reverse proxy configuration
4. Automated backups
5. Update script tag to point to self-hosted instance
**Cost:** ~$20/month VPS increase (additional resources for PostgreSQL + ClickHouse)
**Pros:**
- Complete data sovereignty
- One-time setup, no recurring licensing
- Full control over retention and access
- Aligns with "No Proprietary Lock-in" value
**Cons:**
- Infrastructure complexity
- Requires ongoing maintenance
- Database management overhead
- Higher initial time investment
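
For comparison, the two cost figures quoted in Phase 1 and Phase 2 can be annualized. This is a rough sketch using only the monthly figures stated in this plan; the self-hosted number is the approximate VPS increase and excludes maintenance time.

```python
# Monthly figures as quoted in this plan (USD).
CLOUD_MONTHLY_USD = 9         # Plausible Cloud, Phase 1
SELF_HOSTED_MONTHLY_USD = 20  # approximate VPS increase, Phase 2


def annual_cost(monthly_usd: int) -> int:
    """Convert a monthly cost to a yearly cost."""
    return monthly_usd * 12


cloud = annual_cost(CLOUD_MONTHLY_USD)
self_hosted = annual_cost(SELF_HOSTED_MONTHLY_USD)
# Prints: Cloud: $108/year, self-hosted: ~$240/year, difference: ~$132/year
print(f"Cloud: ${cloud}/year, self-hosted: ~${self_hosted}/year, "
      f"difference: ~${self_hosted - cloud}/year")
```

On these figures, self-hosting removes the licensing fee but costs more in infrastructure per year than the cloud plan.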
---
## Privacy Policy Updates Required
### Current (Line 160):
```
Analytics Cookies: Privacy-respecting analytics (no cross-site tracking)
```
### Updated (Specific):
```
Analytics: We use Plausible Analytics, a privacy-first, open-source analytics tool that:
- Does not use cookies
- Does not collect personal data
- Does not track you across websites
- Is fully GDPR/CCPA compliant
- Collects only anonymized, aggregate data (page views, referrers, country-level location)
- Learn more about Plausible's privacy-focused approach: https://plausible.io/privacy-focused-web-analytics
```
### Current (Line 64):
```
Cookies: Session management, preferences (e.g., selected currency), analytics
```
### Updated:
```
Cookies: Session management, user preferences (currency selection). Note: Our analytics tool (Plausible) does not use cookies.
```
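
The line references above can be re-checked whenever privacy.html changes. The following is a minimal sketch, assuming a plain case-insensitive text match is sufficient; the function name `analytics_mentions` is hypothetical.

```python
import re


def analytics_mentions(text: str) -> list[tuple[int, str]]:
    """Return (1-based line number, stripped line) for lines mentioning analytics."""
    pattern = re.compile(r"analytics", re.IGNORECASE)
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]
```

Passing the contents of privacy.html to `analytics_mentions` lists every line that still refers to analytics. Under Option A, where all analytics references are removed from the policy, the expected result is an empty list.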
---
## User Value Proposition
**Why Minimal Analytics Benefits Users:**
1. **Site Improvements:** Understanding which documentation pages are most helpful guides future content
2. **Bug Detection:** Unusual patterns (e.g., high bounce rate on a page) may indicate broken features
3. **Community Impact:** Demonstrating framework reach and adoption (anonymized, aggregate numbers)
4. **Resource Allocation:** Focus development effort on high-traffic, high-value features
5. **Transparency:** Public analytics dashboard option (Plausible supports this)
**Privacy Trade-off:** Minimal anonymized data collection in exchange for better user experience and site quality.
---
## Implementation Checklist
### Phase 1: Cloud-Hosted Plausible
- [ ] **HUMAN APPROVAL REQUIRED** - Values-sensitive decision (analytics implementation)
- [ ] Create Plausible Cloud account (store admin credentials securely)
- [ ] Add domain: agenticgovernance.digital
- [ ] Add script tag to all public HTML pages:
  - [ ] index.html
  - [ ] about.html, advocate.html, researcher.html, implementer.html, leader.html
  - [ ] docs.html, blog.html, blog-post.html
  - [ ] case-submission.html, media-inquiry.html
  - [ ] privacy.html
  - [ ] demos/*.html (4 files)
  - [ ] admin/*.html: do not add the tag (exempt from public analytics)
- [ ] Test script loading (check browser network tab)
- [ ] Verify data collection in Plausible dashboard (wait 24 hours for data)
- [ ] Update privacy.html with specific Plausible details
- [ ] Document admin access to Plausible dashboard
- [ ] (Optional) Make dashboard publicly viewable for transparency
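
The script-tag step in this checklist could have been automated rather than done by hand. The following is a minimal sketch of how that might have looked, assuming every public page has a literal `</head>` tag and that admin pages live under a directory named `admin`; the function names are hypothetical and this was not part of any deployed tooling.

```python
from pathlib import Path

# Script tag quoted from step 2 of the Phase 1 approach.
PLAUSIBLE_TAG = (
    '<script defer data-domain="agenticgovernance.digital" '
    'src="https://plausible.io/js/script.js"></script>'
)


def is_public_page(path: Path, root: Path) -> bool:
    """Admin pages are exempt from public analytics, per the checklist."""
    return "admin" not in path.relative_to(root).parts


def add_tag(html: str) -> str:
    """Insert the tag before </head>, leaving already-tagged pages unchanged."""
    if "plausible.io/js/script.js" in html or "</head>" not in html:
        return html
    return html.replace("</head>", f"  {PLAUSIBLE_TAG}\n</head>", 1)


def tag_site(root: Path) -> list[Path]:
    """Tag every public HTML page under root; return the files that changed."""
    changed = []
    for path in sorted(root.rglob("*.html")):
        if not is_public_page(path, root):
            continue
        original = path.read_text(encoding="utf-8")
        updated = add_tag(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

Because `add_tag` skips pages that already contain the script URL, running `tag_site` twice leaves the files unchanged on the second pass.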
### Phase 2: Documentation
- [ ] Create TRA-GOV-XXXX governance document for analytics policy
- [ ] Update CLAUDE.md with analytics approach
- [ ] Add section to integrated roadmap
- [ ] Document in PHASE-2-PREPARATION-ADVISORY.md
---
## Boundary Enforcement Check
**Question:** Is implementing privacy-preserving analytics a technical decision or a values decision?
**Analysis:**
- **Values Dimension:** Privacy vs. Utility trade-off (even if minimal)
- **Strategic Impact:** Affects "Privacy-First Design" core value
- **User Impact:** Changes what data we collect (even if anonymized)
- **Transparency Requirement:** Must be disclosed to users
**Classification:** **STRATEGIC** - Requires human approval per TRA-VAL-0001
**BoundaryEnforcer Assessment:**
```
Action: Implement analytics (even privacy-preserving)
Domain: Values (Privacy vs. Utility)
Boundary Crossed: Yes - involves data collection philosophy
Human Approval Required: MANDATORY
Alternative: Option A (remove analytics claims entirely)
```
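
The shape of this assessment can be illustrated as a small record plus a rule. This is a hypothetical sketch only: it is not the BoundaryEnforcer implementation, whose code and categories are not shown in this document, and the `VALUES_DOMAINS` set and `assess` function are assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical domain labels; the real BoundaryEnforcer's categories
# are not shown in this document.
VALUES_DOMAINS = {"values", "privacy", "strategic"}


@dataclass(frozen=True)
class Assessment:
    action: str
    domain: str
    boundary_crossed: bool
    human_approval_required: bool


def assess(action: str, domain: str) -> Assessment:
    """Flag any action in a values-sensitive domain for human approval."""
    crossed = domain.lower() in VALUES_DOMAINS
    return Assessment(action, domain, crossed, crossed)
```

Under this sketch, `assess("Implement analytics", "Values")` marks the boundary as crossed and requires human approval, mirroring the assessment above.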
---
## Recommendation
**Implement Plausible Analytics (Cloud-Hosted, Phase 1):**
1. Aligns with "Privacy-First Design" (no tracking, no surveillance, minimal data)
2. Provides value for site improvement and community impact demonstration
3. Fixes privacy policy gap (claim matches implementation)
4. Minimal cost ($9/month)
5. Quick implementation (1-2 hours)
6. Clear path to self-hosting in Phase 2 (full sovereignty)
7. Open source, transparent, GDPR/CCPA compliant
**Awaiting human approval to proceed.**
---
## Alternatives Considered
1. **Google Analytics** - Rejected: Violates privacy-first values, uses cookies, tracks users
2. **Matomo (cloud)** - Better than Google but more expensive, overkill for our needs
3. **Matomo (self-hosted)** - Good alternative but heavier than Plausible, more maintenance
4. **Simple Analytics** - Similar to Plausible but not open source
5. **Fathom Analytics** - Similar to Plausible but more expensive ($14/month vs $9/month)
6. **No analytics** - Valid choice but loses valuable insights
**Winner:** Plausible (best balance of privacy, utility, cost, maintenance, transparency)
---
## Questions for Human PM
1. **Approve Option B (Plausible)?** Or prefer Option A (no analytics)?
2. **Dashboard visibility?** Keep private or make publicly viewable for transparency?
3. **Budget approval?** $9/month for Plausible Cloud?
4. **Timeline?** Implement immediately or defer to Phase 2?
5. **Self-hosting timeline?** Phase 2 infrastructure work or later?
---
**Document Status:** CLOSED - Option A chosen (No Analytics). Originally deferred on 2025-10-11 for review in November 2025.
**Next Action:** None. Plan formally closed on 2026-02-10 after privacy.html was updated to remove all analytics references.
**Deferral Rationale (2025-10-11):** Privacy policy gap identified but not urgent. The site had no analytics at the time (clean state). The decision was deferred to allow time to consider the values trade-offs.
---
*This document was created by Claude (Session 2025-10-07-001) following the Tractatus governance framework. All values-sensitive decisions require human approval per TRA-VAL-0001.*