# External Uptime Monitoring Setup Guide

This guide explains how to set up external uptime monitoring for the Tractatus Umami Analytics instance.

## Monitored Endpoints

### Primary Monitoring Target

- **URL**: `https://analytics.agenticgovernance.digital/api/heartbeat`
- **Expected Response**: HTTP 200 OK
- **Purpose**: Umami application health check

### Secondary Monitoring Targets (Optional)

- **URL**: `https://agenticgovernance.digital/`
- **Expected Response**: HTTP 200 OK
- **Purpose**: Main website availability

## Recommended Service: UptimeRobot (Free Tier)

UptimeRobot provides free uptime monitoring with:

- 50 monitors
- 5-minute check intervals
- Email/SMS alerts
- Status page generation

### Setup Instructions

#### 1. Create Account

1. Visit https://uptimerobot.com
2. Sign up for a free account
3. Verify your email address

#### 2. Add Analytics Monitor

1. Click "Add New Monitor"
2. Configure:
   - **Monitor Type**: HTTP(s)
   - **Friendly Name**: `Tractatus Analytics (Umami)`
   - **URL**: `https://analytics.agenticgovernance.digital/api/heartbeat`
   - **Monitoring Interval**: 5 minutes
   - **Monitor Timeout**: 30 seconds
   - **HTTP Method**: GET
   - **Expected Status Code**: 200
3. Click "Create Monitor"

#### 3. Add Main Website Monitor (Optional)

1. Click "Add New Monitor"
2. Configure:
   - **Monitor Type**: HTTP(s)
   - **Friendly Name**: `Tractatus Website`
   - **URL**: `https://agenticgovernance.digital/`
   - **Monitoring Interval**: 5 minutes
   - **Monitor Timeout**: 30 seconds
3. Click "Create Monitor"

#### 4. Configure Alert Contacts

1. Go to "My Settings" → "Alert Contacts"
2. Add email address for alerts
3. (Optional) Add SMS number for critical alerts
4. Configure alert preferences:
   - **Alert When**: Down
   - **Alert After**: 2 consecutive failures (10 minutes)
   - **Re-Alert After**: 30 minutes

#### 5. Create Public Status Page (Optional)

1. Go to "Status Pages"
2. Click "Add Status Page"
3. Configure:
   - **Title**: Tractatus Services Status
   - **Custom Domain**: (optional) status.agenticgovernance.digital
   - **Monitors**: Select both monitors
4. Enable "Show Uptime Percentage"
5. Enable "Show Response Times"

## Alternative Services

### Pingdom

- **Free Tier**: 1 monitor
- **Check Interval**: 1 minute
- **URL**: https://www.pingdom.com

### Better Uptime

- **Free Tier**: 10 monitors
- **Check Interval**: 3 minutes
- **URL**: https://betteruptime.com

### StatusCake

- **Free Tier**: 10 monitors
- **Check Interval**: 5 minutes
- **URL**: https://www.statuscake.com

## Internal Monitoring (Already Configured)

The following internal monitoring is already set up:

### Docker Health Checks

- **Umami Container**: `curl -f http://localhost:3000/api/heartbeat`
  - Interval: 10 seconds
  - Timeout: 5 seconds
  - Retries: 5
- **PostgreSQL Container**: `pg_isready -U $POSTGRES_USER -d $POSTGRES_DB`
  - Interval: 5 seconds
  - Timeout: 5 seconds
  - Retries: 5

### Automated Backups

- **Schedule**: Daily at 2:00 AM
- **Retention**: 7 days
- **Location**: `~/umami-backups/`
- **Script**: `~/umami-deployment/backup-umami-db.sh`

### Disk Usage Monitoring

- **Schedule**: Daily at 3:00 AM
- **Warning Threshold**: 80% disk usage
- **Critical Threshold**: 90% disk usage
- **Location**: `~/umami-backups/disk-monitoring.log`
- **Script**: `~/umami-deployment/monitor-disk-usage.sh`

## Verification

To verify monitoring is working:

1. **Check Endpoint Manually**:
   ```bash
   curl -I https://analytics.agenticgovernance.digital/api/heartbeat
   # Should return: HTTP/2 200
   ```

2. **Test Alert Flow**:
   - Stop Umami container: `docker stop tractatus-umami`
   - Wait for alert (should arrive within 10 minutes)
   - Restart container: `docker start tractatus-umami`
   - Verify recovery alert

3. **Check Internal Monitoring**:
   ```bash
   # View Docker health status
   docker ps

   # Check backup logs
   tail -20 ~/umami-backups/backup.log

   # Check disk monitoring logs
   tail -20 ~/umami-backups/disk-monitoring.log
   ```

## Alert Response Procedures

### Analytics Down (5+ minutes)

1. Check Docker container status: `docker ps`
2. Check container logs: `docker logs tractatus-umami`
3. Check PostgreSQL status: `docker logs tractatus-umami-db`
4. If needed, restart: `cd ~/umami-deployment && docker compose restart`

### High Disk Usage (>80%)

1. Check backup retention: `ls -lh ~/umami-backups/`
2. Remove old backups manually if needed
3. Check PostgreSQL volume: `docker exec tractatus-umami-db du -sh /var/lib/postgresql/data`
4. Consider database cleanup or server upgrade

### Database Corruption

1. Stop Umami: `docker compose stop umami`
2. Restore from backup: `~/umami-deployment/restore-umami-db.sh ~/umami-backups/umami_backup_YYYYMMDD_HHMMSS.sql.gz`
3. Restart services: `docker compose up -d`

## Next Steps

- [ ] Sign up for UptimeRobot
- [ ] Add analytics.agenticgovernance.digital monitor
- [ ] Configure email alerts
- [ ] Test alert delivery
- [ ] (Optional) Create public status page
- [ ] Document response procedures in team wiki

## Maintenance

- Review monitoring logs monthly
- Test restore procedure quarterly
- Update alert contacts when team changes
- Review disk usage trends monthly

---

**Last Updated**: 2025-10-29

**Monitoring Status**: Internal monitoring active, external monitoring pending user setup
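
## Appendix: Scripted Heartbeat Check (Sketch)

While external monitoring is still pending, the manual endpoint check from the Verification section can be approximated with a small script. The sketch below is not part of the existing deployment: the script name `check-heartbeat.sh` and the function names `classify_status` and `fetch_status` are hypothetical. It sends a GET request with a 30-second timeout and treats only HTTP 200 as healthy, mirroring the monitor settings described in this guide.

```bash
#!/usr/bin/env bash
# check-heartbeat.sh (hypothetical name): minimal sketch of a heartbeat probe.
# Assumes curl is installed; this does not replace an external monitor.

# Map an HTTP status code to UP or DOWN. Only 200 counts as healthy,
# matching the "Expected Status Code: 200" monitor setting.
classify_status() {
  if [ "$1" = "200" ]; then
    echo "UP"
  else
    echo "DOWN"
  fi
}

# Print the HTTP status code returned for a URL (GET request).
# curl prints 000 when the request fails or exceeds the 30-second timeout.
fetch_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 30 "$1" || true
}

# Run a check only when a URL is passed as the first argument.
if [ "$#" -gt 0 ]; then
  code="$(fetch_status "$1")"
  result="$(classify_status "$code")"
  echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) $1 HTTP $code $result"
  # Exit non-zero when the endpoint is down so callers (e.g. cron) can react.
  [ "$result" = "UP" ]
fi
```

To use it, pass the endpoint as the only argument, for example `./check-heartbeat.sh https://analytics.agenticgovernance.digital/api/heartbeat`. Because the script exits non-zero when the endpoint is down, it can be chained with a notification command of your choice. Note that a check run from the same server cannot detect a full host or network outage, which is why an external service remains the recommended setup.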