Add comprehensive disk monitoring with real-time metrics:
- Backend API endpoints for disk/memory metrics (local + remote)
- Admin UI page with CSP-compliant DOM rendering
- Health status indicators with color-coded thresholds
- SSH-based remote metrics collection from OVH VPS
- Auto-refresh every 5 minutes

Backend:
- src/models/DiskMetrics.model.js: Metrics collection model
- src/controllers/diskMetrics.controller.js: 3 admin endpoints
- src/routes/diskMetrics.routes.js: Admin-authenticated routes
- src/routes/index.js: Register disk-metrics routes

Frontend:
- public/admin/disk-monitoring.html: Admin dashboard page
- public/js/admin-disk-monitoring.js: CSP-compliant UI rendering
- public/js/components/navbar-admin.js: Add disk monitoring link

Documentation:
- deployment-quickstart/UPTIME_MONITORING_SETUP.md

API endpoints:
- GET /api/admin/disk-metrics (all systems)
- GET /api/admin/disk-metrics/local (dev system)
- GET /api/admin/disk-metrics/remote (production VPS)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
External Uptime Monitoring Setup Guide
This guide explains how to set up external uptime monitoring for the Tractatus Umami Analytics instance.
Monitored Endpoints
Primary Monitoring Target
- URL: https://analytics.agenticgovernance.digital/api/heartbeat
- Expected Response: HTTP 200 OK
- Purpose: Umami application health check
Secondary Monitoring Targets (Optional)
- URL: https://agenticgovernance.digital/
- Expected Response: HTTP 200 OK
- Purpose: Main website availability
Recommended Service: UptimeRobot (Free Tier)
UptimeRobot provides free uptime monitoring with:
- 50 monitors
- 5-minute check intervals
- Email/SMS alerts
- Status page generation
Setup Instructions
1. Create Account
- Visit https://uptimerobot.com
- Sign up for a free account
- Verify your email address
2. Add Analytics Monitor
- Click "Add New Monitor"
- Configure:
  - Monitor Type: HTTP(s)
  - Friendly Name: Tractatus Analytics (Umami)
  - URL: https://analytics.agenticgovernance.digital/api/heartbeat
  - Monitoring Interval: 5 minutes
  - Monitor Timeout: 30 seconds
  - HTTP Method: GET
  - Expected Status Code: 200
- Click "Create Monitor"
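Before relying on the external monitor, the same check can be reproduced by hand. The following is a minimal sketch, assuming `curl` is installed; `check_endpoint` is a hypothetical helper name, not part of UptimeRobot or Umami. It issues a GET with the 30-second timeout configured above and compares the status code to 200.

```sh
#!/bin/sh
# Sketch: mimic the monitor's check (GET, 30 s timeout, expect HTTP 200).
# check_endpoint is a hypothetical helper, introduced for illustration only.

# check_endpoint URL [EXPECTED_CODE] -> prints UP/DOWN, exit 0 only on a match
check_endpoint() {
    url=$1
    expected=${2:-200}
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code,
    # --max-time 30: give up after 30 seconds, like the monitor timeout.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 30 "$url")
    if [ "$code" = "$expected" ]; then
        echo "UP $url ($code)"
        return 0
    fi
    echo "DOWN $url (got ${code:-no response}, expected $expected)"
    return 1
}
```

For example, `check_endpoint https://analytics.agenticgovernance.digital/api/heartbeat` should report UP while the Umami container is healthy.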
3. Add Main Website Monitor (Optional)
- Click "Add New Monitor"
- Configure:
  - Monitor Type: HTTP(s)
  - Friendly Name: Tractatus Website
  - URL: https://agenticgovernance.digital/
  - Monitoring Interval: 5 minutes
  - Monitor Timeout: 30 seconds
- Click "Create Monitor"
4. Configure Alert Contacts
- Go to "My Settings" → "Alert Contacts"
- Add email address for alerts
- (Optional) Add SMS number for critical alerts
- Configure alert preferences:
  - Alert When: Down
  - Alert After: 2 consecutive failures (10 minutes)
  - Re-Alert After: 30 minutes
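The "2 consecutive failures" rule can be illustrated with a small counter. This is only a sketch of the arithmetic, not UptimeRobot's actual implementation; the function names and the replayed sequence of check results are hypothetical. With checks every 5 minutes, the second consecutive failure lands at most about 10 minutes after an outage begins.

```sh
#!/bin/sh
# Sketch: alert only after N consecutive failed checks (N = 2 here).
# All names are hypothetical; this does not call any monitoring service.

# next_failure_count PREVIOUS_COUNT STATUS -> prints the updated count
next_failure_count() {
    prev=$1
    status=$2
    if [ "$status" = "up" ]; then
        echo 0                  # any successful check resets the streak
    else
        echo $((prev + 1))
    fi
}

# should_alert COUNT [THRESHOLD] -> exit 0 when the streak reaches the threshold
should_alert() {
    [ "$1" -ge "${2:-2}" ]
}

# Replay a made-up sequence of 5-minute checks.
count=0
for status in up down up down down; do
    count=$(next_failure_count "$count" "$status")
    if should_alert "$count"; then
        echo "check=$status failures=$count -> alert"
    else
        echo "check=$status failures=$count -> no alert"
    fi
done
```

A single failed check followed by a success does not alert, which is the point of the setting: it filters out one-off network blips.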
5. Create Public Status Page (Optional)
- Go to "Status Pages"
- Click "Add Status Page"
- Configure:
  - Title: Tractatus Services Status
  - Custom Domain: (optional) status.agenticgovernance.digital
  - Monitors: Select both monitors
- Enable "Show Uptime Percentage"
- Enable "Show Response Times"
Alternative Services
Pingdom
- Free Tier: 1 monitor
- Check Interval: 1 minute
- URL: https://www.pingdom.com
Better Uptime
- Free Tier: 10 monitors
- Check Interval: 3 minutes
- URL: https://betteruptime.com
StatusCake
- Free Tier: 10 monitors
- Check Interval: 5 minutes
- URL: https://www.statuscake.com
Internal Monitoring (Already Configured)
The following internal monitoring is already set up:
Docker Health Checks
- Umami Container: curl -f http://localhost:3000/api/heartbeat
  - Interval: 10 seconds
  - Timeout: 5 seconds
  - Retries: 5
- PostgreSQL Container: pg_isready -U $POSTGRES_USER -d $POSTGRES_DB
  - Interval: 5 seconds
  - Timeout: 5 seconds
  - Retries: 5
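The result of these health checks can be read back from Docker. The sketch below assumes the container names `tractatus-umami` and `tractatus-umami-db` used elsewhere in this guide; `container_health` and `report_health` are hypothetical helper names. It relies on `docker inspect` with a Go template, which reports `starting`, `healthy`, or `unhealthy` for containers that define a health check.

```sh
#!/bin/sh
# Sketch: read the health state that Docker records for each container.
# Helper names are hypothetical; container names are taken from this guide.

# container_health NAME -> prints starting, healthy, or unhealthy
container_health() {
    docker inspect --format '{{.State.Health.Status}}' "$1"
}

# report_health NAME... -> one line per container; exit 1 if any is not healthy
report_health() {
    rc=0
    for name in "$@"; do
        # A missing container or a docker error is reported as "unknown".
        status=$(container_health "$name" 2>/dev/null) || status="unknown"
        echo "$name: $status"
        [ "$status" = "healthy" ] || rc=1
    done
    return $rc
}
```

Running `report_health tractatus-umami tractatus-umami-db` gives a quick pass/fail summary that could also be called from a cron job.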
Automated Backups
- Schedule: Daily at 2:00 AM
- Retention: 7 days
- Location: ~/umami-backups/
- Script: ~/umami-deployment/backup-umami-db.sh
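The contents of backup-umami-db.sh are not shown in this guide, so the following is only a sketch of how a 7-day retention rule and a 2:00 AM schedule are commonly expressed. The `prune_backups` helper is hypothetical, and the file pattern is assumed from the backup name used in the restore procedure (umami_backup_YYYYMMDD_HHMMSS.sql.gz).

```sh
#!/bin/sh
# Sketch: delete backup files older than the retention window.
# prune_backups is a hypothetical helper; the real script may differ.

# prune_backups DIR [DAYS] -> prints and deletes expired backup files
prune_backups() {
    dir=$1
    days=${2:-7}
    # -mtime +N matches files whose age, counted in whole days, exceeds N.
    find "$dir" -type f -name 'umami_backup_*.sql.gz' -mtime +"$days" -print -delete
}

# Hypothetical crontab entry for a daily 2:00 AM run:
#   0 2 * * * $HOME/umami-deployment/backup-umami-db.sh >> $HOME/umami-backups/backup.log 2>&1
```

Calling `prune_backups ~/umami-backups` lists each file it removes, which makes the cleanup visible in the backup log.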
Disk Usage Monitoring
- Schedule: Daily at 3:00 AM
- Warning Threshold: 80% disk usage
- Critical Threshold: 90% disk usage
- Location: ~/umami-backups/disk-monitoring.log
- Script: ~/umami-deployment/monitor-disk-usage.sh
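The threshold logic can be sketched as follows. This is not the content of monitor-disk-usage.sh, which is not reproduced here; the function names are hypothetical, and treating the thresholds as inclusive (80 counts as a warning, 90 as critical) is an assumption.

```sh
#!/bin/sh
# Sketch: classify disk usage against the 80% / 90% thresholds.
# Function names are hypothetical; inclusive comparison is an assumption.

# classify_usage PERCENT -> prints OK, WARNING, or CRITICAL
classify_usage() {
    pct=$1
    if [ "$pct" -ge 90 ]; then
        echo CRITICAL
    elif [ "$pct" -ge 80 ]; then
        echo WARNING
    else
        echo OK
    fi
}

# disk_usage_percent [MOUNTPOINT] -> prints the used percentage as an integer
disk_usage_percent() {
    # df -P prints one POSIX-format line per filesystem; field 5 is the
    # percentage used, with a trailing % that is stripped here.
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# log_disk_status [MOUNTPOINT] -> prints a timestamped status line
log_disk_status() {
    mount=${1:-/}
    used=$(disk_usage_percent "$mount")
    echo "$(date '+%Y-%m-%d %H:%M:%S') $mount ${used}% $(classify_usage "$used")"
}
```

Appending the output of `log_disk_status /` to ~/umami-backups/disk-monitoring.log would produce one line per run that is easy to grep for WARNING or CRITICAL.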
Verification
To verify monitoring is working:
- Check Endpoint Manually:
    curl -I https://analytics.agenticgovernance.digital/api/heartbeat
    # Should return: HTTP/2 200
- Test Alert Flow:
  - Stop Umami container: docker stop tractatus-umami
  - Wait for alert (should arrive within 10 minutes)
  - Restart container: docker start tractatus-umami
  - Verify recovery alert
- Check Internal Monitoring:
    # View Docker health status
    docker ps
    # Check backup logs
    tail -20 ~/umami-backups/backup.log
    # Check disk monitoring logs
    tail -20 ~/umami-backups/disk-monitoring.log
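Reading the backup log shows what the script reported, but it can also help to confirm that a fresh backup file actually exists. A minimal sketch, assuming backups follow the umami_backup_*.sql.gz naming used in the restore procedure; `has_recent_backup` is a hypothetical helper.

```sh
#!/bin/sh
# Sketch: confirm that the daily backup produced a file in the last 24 hours.
# has_recent_backup is a hypothetical helper; the file pattern is assumed.

# has_recent_backup DIR -> exit 0 if a backup was modified within the last day
has_recent_backup() {
    # -mtime -1 matches files modified less than 24 hours ago;
    # grep -q . succeeds only if find printed at least one path.
    find "$1" -type f -name 'umami_backup_*.sql.gz' -mtime -1 | grep -q .
}
```

For example, `has_recent_backup ~/umami-backups || echo "no backup in the last 24 hours"` flags a silently failing backup job.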
Alert Response Procedures
Analytics Down (5+ minutes)
- Check Docker container status: docker ps
- Check container logs: docker logs tractatus-umami
- Check PostgreSQL status: docker logs tractatus-umami-db
- If needed, restart: cd ~/umami-deployment && docker compose restart
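The first three steps can be gathered into one helper so that the evidence is captured before any restart. This is a sketch under the assumption that both container names start with `tractatus-umami`, as in the commands above; `collect_diagnostics` is a hypothetical name.

```sh
#!/bin/sh
# Sketch: capture container status and recent logs before restarting anything.
# collect_diagnostics is a hypothetical helper; container names come from this guide.

collect_diagnostics() {
    echo "== container status =="
    # -a includes stopped containers; the name filter matches by substring.
    docker ps -a --filter "name=tractatus-umami" --format '{{.Names}}\t{{.Status}}'
    for name in tractatus-umami tractatus-umami-db; do
        echo "== last 50 log lines: $name =="
        # Container logs can arrive on stderr, so merge both streams.
        docker logs --tail 50 "$name" 2>&1
    done
}
```

Redirecting the output to a file, for example `collect_diagnostics > /tmp/umami-incident.txt`, preserves the state for later review; the file path here is only an example.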
High Disk Usage (>80%)
- Check backup retention: ls -lh ~/umami-backups/
- Remove old backups manually if needed
- Check PostgreSQL volume: docker exec tractatus-umami-db du -sh /var/lib/postgresql/data
- Consider database cleanup or server upgrade
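Before removing backups by hand, it can be safer to list the candidates first. The sketch below only prints file names and deletes nothing. It assumes the timestamped naming umami_backup_YYYYMMDD_HHMMSS.sql.gz, so that sorting by name is the same as sorting by date; `backups_beyond_newest` is a hypothetical helper.

```sh
#!/bin/sh
# Sketch: list backups that fall outside the newest KEEP files (no deletion).
# backups_beyond_newest is a hypothetical helper; the naming scheme is assumed.

# backups_beyond_newest DIR [KEEP] -> prints names of backups older than the KEEP newest
backups_beyond_newest() {
    dir=$1
    keep=${2:-7}
    # Timestamped names sort chronologically, so a reverse sort lists newest first;
    # tail -n +K prints from line K onward, skipping the newest KEEP entries.
    ls -1 "$dir" | grep '^umami_backup_.*\.sql\.gz$' | sort -r | tail -n +"$((keep + 1))"
}
```

Reviewing the output of `backups_beyond_newest ~/umami-backups 7` before deleting anything avoids removing the only good backup by mistake.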
Database Corruption
- Stop Umami: docker compose stop umami
- Restore from backup: ~/umami-deployment/restore-umami-db.sh ~/umami-backups/umami_backup_YYYYMMDD_HHMMSS.sql.gz
- Restart services: docker compose up -d
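Choosing which backup to restore, and checking that it is not itself damaged, can be sketched as below. Both helpers are hypothetical names. The selection assumes that the timestamp in the file name reflects creation order, and `gzip -t` only verifies the compressed stream, not the SQL inside it.

```sh
#!/bin/sh
# Sketch: pick the newest backup by file name and test its gzip integrity.
# Helper names are hypothetical; restore-umami-db.sh itself is not modified.

# latest_backup DIR -> prints the path of the newest backup by name timestamp
latest_backup() {
    ls -1 "$1"/umami_backup_*.sql.gz 2>/dev/null | sort | tail -n 1
}

# verify_backup FILE -> exit 0 if the gzip stream passes its integrity test
verify_backup() {
    gzip -t "$1"
}
```

A possible use is `backup=$(latest_backup ~/umami-backups) && verify_backup "$backup" && ~/umami-deployment/restore-umami-db.sh "$backup"`, which stops before the restore if the archive fails the integrity test.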
Next Steps
- Sign up for UptimeRobot
- Add analytics.agenticgovernance.digital monitor
- Configure email alerts
- Test alert delivery
- (Optional) Create public status page
- Document response procedures in team wiki
Maintenance
- Review monitoring logs monthly
- Test restore procedure quarterly
- Update alert contacts when team changes
- Review disk usage trends monthly
Last Updated: 2025-10-29
Monitoring Status: Internal monitoring active, external monitoring pending user setup