# Comprehensive Remediation Plan: agenticgovernance.digital

## Date: 2026-01-19

## Status: EXECUTED SUCCESSFULLY (2026-01-20)

> **UPDATE 2026-01-20:** This remediation plan was successfully executed. Complete VPS reinstallation performed, all systems restored, security hardening applied. See `docs/INCIDENT_RECOVERY_2026-01-19.md` for full details.

---

## Executive Summary

The agenticgovernance.digital VPS (vps-93a693da.vps.ovh.net) has been compromised **three times** by the same botnet infrastructure. Each prior "recovery" was incomplete, leaving persistence mechanisms that allowed reinfection.

**Recommendation: COMPLETE REINSTALL**

Based on security industry best practices and the pattern of recurring compromise, partial cleanup is no longer viable. A complete OS reinstall is the only way to guarantee all malware is removed.

---

## Attack History Analysis

### Timeline

| Date | Attack | Root Cause | Recovery Status |
|------|--------|------------|-----------------|
| 2025-12-09 | 83Kpps DNS flood (Exodus botnet via Docker/Umami) | Docker container compromise | **INCOMPLETE** - PM2, umami-deployment, cron jobs left |
| 2026-01-18 13:57 | 171Kpps UDP flood to 15.184.38.247:9007 | PM2 resurrected botnet processes | **INCOMPLETE** - SSH broken post-recovery |
| 2026-01-18 23:44 | 44Kpps UDP flood to 171.225.223.4:80 | Continued PM2 persistence | Server in rescue mode |
| 2026-01-19 (today) | OVH anti-hack triggered again | Unknown - likely same persistence | **CURRENT INCIDENT** |

### What Was Missed in Each Recovery

**December 2025 Recovery:**
- Docker removed ✓
- PM2 process manager NOT removed ✗
- `/home/ubuntu/umami-deployment/` NOT removed ✗
- Ubuntu crontab NOT cleared ✗
- PostgreSQL service NOT disabled ✗

**January 19 Recovery (earlier today):**
- PM2 removed ✓
- umami-deployment removed ✓
- PostgreSQL disabled ✓
- **But server is in rescue mode AGAIN** = something else was missed

### Malware Profile

**Name:** Exodus Botnet (Mirai variant)
**C2 Server:** 196.251.100.191 (South Africa)
**Capabilities:**
- Multi-architecture binaries (x86, x86_64, ARM, MIPS, etc.)
- UDP/DNS flood attacks
- Self-replicating via PM2 process manager
- Persistence through system services

**PM2 as Persistence Mechanism:**
- PM2's `resurrect` feature auto-restarts saved processes on boot
- Used by modern botnets like NodeCordRAT and Tsundere (2025)
- Survives manual process termination
- Creates systemd service (`pm2-ubuntu.service`)

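
The PM2 persistence points listed above can be checked from rescue mode with a short script. This is a minimal sketch, not the procedure used during the incident: the function name `find_pm2_artifacts` is hypothetical, and the candidate paths assume PM2's default layout for the `ubuntu` user (saved process list in `~/.pm2/dump.pm2`, unit file under `/etc/systemd/system`).

```bash
# List PM2 persistence artifacts under a root directory, such as a disk
# mounted in rescue mode. Paths are assumptions based on PM2 defaults.
find_pm2_artifacts() {
    local root="${1:-/}"
    local candidate
    for candidate in \
        "home/ubuntu/.pm2/dump.pm2" \
        "home/ubuntu/.pm2" \
        "etc/systemd/system/pm2-ubuntu.service" \
        "etc/systemd/system/multi-user.target.wants/pm2-ubuntu.service"
    do
        # -e follows symlinks; -L also catches dangling links left by `systemctl enable`
        if [ -e "${root%/}/$candidate" ] || [ -L "${root%/}/$candidate" ]; then
            printf '%s\n' "/$candidate"
        fi
    done
}
```

For example, `find_pm2_artifacts /mnt/vps` prints one path per artifact found and prints nothing when none of the candidates exist.
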

---

## Why Partial Cleanup Has Failed

### Problem 1: Unknown Persistence Mechanisms
Each cleanup identified SOME persistence mechanisms but missed others. There may be:
- Modified system binaries (rootkits)
- Kernel modules
- Hidden cron jobs in unexpected locations
- Modified init scripts
- SSH backdoors (could explain broken SSH)

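
One way to narrow the search is to list files in common persistence locations that changed after a known date, such as the day before the first attack. The sketch below is illustrative only: `recent_persistence_files` is a hypothetical helper, the location list is not exhaustive, and file timestamps can be forged by an attacker, so an empty result proves nothing.

```bash
# Print files in common persistence locations modified after a given date.
# ROOT is the mount point of the suspect disk; SINCE is any date `find` accepts.
recent_persistence_files() {
    local root="${1%/}" since="$2" rel
    set --   # reuse the positional parameters as the list of paths to scan
    for rel in etc/cron.d etc/cron.daily etc/cron.hourly var/spool/cron \
               etc/systemd/system etc/init.d etc/rc.local etc/ld.so.preload \
               root/.ssh home/ubuntu/.ssh; do
        if [ -e "$root/$rel" ]; then set -- "$@" "$root/$rel"; fi
    done
    # Nothing to scan: return quietly instead of letting find default to "."
    if [ "$#" -eq 0 ]; then return 0; fi
    find "$@" -type f -newermt "$since" | sort
}
```

A call such as `recent_persistence_files /mnt/vps 2025-12-08` would list candidates for manual review.
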
### Problem 2: No Baseline for Comparison
Without knowing exactly what should exist on a clean system, we cannot verify complete removal.

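
A baseline can be as simple as a saved list captured right after a clean install and compared against later snapshots. The following is a sketch under stated assumptions: `new_since_baseline` and the file locations in the comments are hypothetical, and the list of enabled systemd units is only one of several things worth recording.

```bash
# Print entries present in CURRENT but absent from BASELINE (one entry per line).
new_since_baseline() {
    local baseline="$1" current="$2"
    # -F fixed strings, -x whole-line match, -v invert: keep lines not in the baseline
    grep -Fxv -f "$baseline" "$current" | sort -u
}

# Capture on a fresh install (hypothetical paths):
#   systemctl list-unit-files --state=enabled --no-legend | awk '{print $1}' > /root/baseline-units.txt
# Compare later:
#   systemctl list-unit-files --state=enabled --no-legend | awk '{print $1}' > /tmp/current-units.txt
#   new_since_baseline /root/baseline-units.txt /tmp/current-units.txt
```

Any line printed is a unit that was enabled after the baseline was taken and should be explained before it is trusted.
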
### Problem 3: Forensic Limitations in Rescue Mode
Rescue mode provides limited visibility into:
- Runtime state of malware
- Memory-resident components
- Kernel-level modifications

### Expert Consensus
> "Reinstalling a computer after it has been compromised can be a painstaking process, but it is the best way to be certain that everything an attacker left behind has been found." - UC Berkeley Information Security Office

> "Rootkits are difficult to remove, and the only 100% sure fire way to remove a rootkit from a device that has been infected is to wipe the device and reinstall the operating system."

---

## Recommended Solution: Complete Reinstall

### Phase 1: Data Backup (From Rescue Mode)

**CRITICAL:** Before reinstalling, back up essential data:

```bash
# 1. Boot into rescue mode via OVH Manager
# 2. Mount main disk (create the mount point first; confirm the device name with lsblk)
mkdir -p /mnt/vps
mount /dev/sdb1 /mnt/vps

# 3. Create backup directory
mkdir -p /tmp/tractatus-backup

# 4. Backup application code (verify hashes later)
tar -czf /tmp/tractatus-backup/app.tar.gz /mnt/vps/var/www/tractatus/

# 5. Backup MongoDB data (raw data files from the mongod data directory)
tar -czf /tmp/tractatus-backup/mongodb.tar.gz /mnt/vps/var/lib/mongodb/

# 6. Backup SSL certificates
tar -czf /tmp/tractatus-backup/ssl.tar.gz /mnt/vps/etc/letsencrypt/

# 7. Backup nginx config (for reference, will recreate)
cp /mnt/vps/etc/nginx/sites-available/tractatus /tmp/tractatus-backup/

# 8. Download backups to local machine (run this from the local machine)
scp -r root@RESCUE_IP:/tmp/tractatus-backup/ ~/tractatus-recovery/
```

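
Step 4 defers hash verification to later. One way to make that possible is to record checksums in rescue mode and re-check them after the transfer. This is a minimal sketch: the function names and the `SHA256SUMS` file name are assumptions, and it only detects corruption or tampering in transit, not whether the archived content was already compromised.

```bash
# Record SHA-256 checksums for every archive in a backup directory.
write_backup_checksums() {
    ( cd "$1" && sha256sum -- *.tar.gz > SHA256SUMS )
}

# Verify the archives against the recorded checksums.
# Returns non-zero if any archive is missing or does not match.
verify_backup_checksums() {
    ( cd "$1" && sha256sum --quiet -c SHA256SUMS )
}
```

Run `write_backup_checksums /tmp/tractatus-backup` before the `scp` step, then `verify_backup_checksums` on the local copy of the directory afterwards.
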
**DO NOT BACKUP:**
- `/home/ubuntu/.pm2/` (malware)
- `/home/ubuntu/umami-deployment/` (malware)
- Any executables (may be compromised)
- `/var/lib/docker/` (attack vector)

### Phase 2: VPS Reinstallation

1. **Via OVH Manager:**
   - Navigate to VPS management
   - Select "Reinstall" option
   - Choose: Ubuntu 22.04 LTS (or latest LTS)
   - Wait for completion (~10 minutes)

2. **Retrieve New Root Password:**
   - Check email for new credentials
   - Or use OVH password reset function

### Phase 3: Fresh System Setup

**Initial SSH Access:**
```bash
ssh root@91.134.240.3
```

**Step 1: System Updates**
```bash
apt update && apt upgrade -y
```

**Step 2: Create Non-Root User**
```bash
adduser ubuntu
usermod -aG sudo ubuntu
```

**Step 3: SSH Hardening**
```bash
# Add authorized keys
mkdir -p /home/ubuntu/.ssh
cat > /home/ubuntu/.ssh/authorized_keys << 'EOF'
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCZ8BH+Bx4uO9DTatRZ... theflow@the-flow
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPdJcKMabIVQRqKqNIpzxHNgxMZ8NOD+9gVCk6dY5uV0 tractatus-deploy
EOF

chown -R ubuntu:ubuntu /home/ubuntu/.ssh
chmod 700 /home/ubuntu/.ssh
chmod 600 /home/ubuntu/.ssh/authorized_keys

# Harden SSH config
cat > /etc/ssh/sshd_config.d/hardening.conf << 'EOF'
PasswordAuthentication no
PermitRootLogin no
MaxAuthTries 3
LoginGraceTime 20
ClientAliveInterval 300
ClientAliveCountMax 2
EOF

# Validate the configuration first so a syntax error cannot lock you out.
# On Ubuntu the unit is named ssh; sshd is only an alias for it.
sshd -t && systemctl restart ssh
```

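
Writing `hardening.conf` does not by itself prove the settings are in effect: sshd keeps the first value it reads for each keyword, so a drop-in file that sorts earlier in `/etc/ssh/sshd_config.d/` (for example one written by cloud-init, if the image ships one) could take precedence. The sketch below checks the effective configuration as printed by `sshd -T`. The function name `check_effective_sshd` is hypothetical, and the expected values are taken from the block above.

```bash
# Read `sshd -T` output on stdin and report hardening directives whose
# effective value differs from the one written to hardening.conf.
check_effective_sshd() {
    awk '
        BEGIN {
            want["passwordauthentication"] = "no"
            want["permitrootlogin"]        = "no"
            want["maxauthtries"]           = "3"
            want["logingracetime"]         = "20"
            bad = 0
        }
        ($1 in want) {
            seen[$1] = 1
            if ($2 != want[$1]) {
                print "MISMATCH: " $1 " is " $2 ", expected " want[$1]
                bad = 1
            }
        }
        END {
            for (k in want) if (!(k in seen)) { print "MISSING: " k; bad = 1 }
            exit bad
        }
    '
}
```

Usage, as root: `sshd -T | check_effective_sshd`. It prints nothing and exits 0 when all four directives match.
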
**Step 4: Firewall Configuration**
```bash
apt install -y ufw
ufw default deny incoming
ufw default allow outgoing
ufw allow 22/tcp comment 'SSH'
ufw allow 80/tcp comment 'HTTP'
ufw allow 443/tcp comment 'HTTPS'
# Block Docker ports (never needed)
ufw deny 2375/tcp comment 'Block Docker API'
ufw deny 2376/tcp comment 'Block Docker TLS'
ufw enable
```

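
To confirm later that no extra ports have been opened, the rule list can be compared against the three ports allowed above. This is a sketch that assumes the plain (non-verbose) `ufw status` table layout with `To`, `Action`, and `From` columns; `unexpected_ufw_allows` is a hypothetical helper name.

```bash
# Read `ufw status` output on stdin and print every ALLOW rule whose target
# is not one of the expected ports. Exits 1 if any unexpected rule is found.
unexpected_ufw_allows() {
    awk '
        BEGIN { ok["22/tcp"] = ok["80/tcp"] = ok["443/tcp"] = 1; found = 0 }
        {
            for (i = 2; i <= NF; i++) {
                if ($i != "ALLOW") continue
                # Everything before the ALLOW column is the rule target
                target = $1
                for (j = 2; j < i; j++) target = target " " $j
                sub(/ \(v6\)$/, "", target)   # treat IPv6 twins like their IPv4 rule
                if (!(target in ok)) { print target; found = 1 }
                break
            }
        }
        END { exit found }
    '
}
```

Usage, as root: `ufw status | unexpected_ufw_allows`. No output means only SSH, HTTP, and HTTPS are allowed in.
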
**Step 5: Intrusion Prevention**
```bash
apt install -y fail2ban
cat > /etc/fail2ban/jail.local << 'EOF'
[sshd]
enabled = true
maxretry = 3
bantime = 24h
findtime = 1h
EOF

systemctl enable fail2ban
systemctl start fail2ban
```

**Step 6: Install Required Software**
```bash
# Node.js (via NodeSource)
curl -fsSL https://deb.nodesource.com/setup_20.x | bash -
apt install -y nodejs

# MongoDB
curl -fsSL https://pgp.mongodb.com/server-7.0.asc | gpg -o /usr/share/keyrings/mongodb-server-7.0.gpg --dearmor
echo "deb [ signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" > /etc/apt/sources.list.d/mongodb-org-7.0.list
apt update
apt install -y mongodb-org
systemctl enable mongod
systemctl start mongod

# nginx
apt install -y nginx
systemctl enable nginx

# certbot for SSL
apt install -y certbot python3-certbot-nginx
```

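
MongoDB should listen only on the loopback interface after installation. The sketch below reads the `bindIp` key from the YAML config and checks it. Both function names are hypothetical, the default path `/etc/mongod.conf` is the one used by the `mongodb-org` package, and the check does not cover other settings such as `bindIpAll`.

```bash
# Print the value of the bindIp key from a mongod YAML config file.
mongod_bind_ip() {
    sed -n 's/^[[:space:]]*bindIp:[[:space:]]*//p' "${1:-/etc/mongod.conf}" |
        sed 's/[[:space:]]*#.*$//'   # drop any trailing comment
}

# Succeed only if bindIp is limited to loopback addresses.
is_loopback_only() {
    case "$(mongod_bind_ip "$1")" in
        127.0.0.1|localhost|::1|127.0.0.1,::1) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_loopback_only /etc/mongod.conf || echo "WARNING: MongoDB is not loopback-only"`.
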
### Phase 4: Application Deployment

**Step 1: Prepare Application Directory**
```bash
mkdir -p /var/www/tractatus
chown ubuntu:ubuntu /var/www/tractatus
```

**Step 2: Deploy from CLEAN Local Source**
```bash
# From local machine - deploy ONLY from verified clean source
cd ~/projects/tractatus
./scripts/deploy.sh --full
```

**Step 3: Restore MongoDB Data**
```bash
# If data integrity is verified.
# Note: mongorestore reads BSON dumps created by mongodump. The Phase 1 archive
# is a raw copy of /var/lib/mongodb, so this command applies only if the path
# below holds a mongodump export.
mongorestore --db tractatus ~/tractatus-recovery/mongodb/tractatus/
```

**Step 4: SSL Certificate**
```bash
certbot --nginx -d agenticgovernance.digital
```

**Step 5: Create Systemd Service**
```bash
cat > /etc/systemd/system/tractatus.service << 'EOF'
[Unit]
Description=Tractatus Application
After=network.target mongod.service

[Service]
Type=simple
User=ubuntu
WorkingDirectory=/var/www/tractatus
ExecStart=/usr/bin/node src/server.js
Restart=on-failure
RestartSec=10
Environment=NODE_ENV=production

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable tractatus
systemctl start tractatus
```

### Phase 5: Monitoring Setup

**Step 1: Log Rotation for MongoDB**
```bash
cat > /etc/logrotate.d/mongodb << 'EOF'
/var/log/mongodb/*.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    sharedscripts
    postrotate
        /bin/kill -SIGUSR1 $(cat /var/lib/mongodb/mongod.lock 2>/dev/null) 2>/dev/null || true
    endscript
}
EOF
```

**Step 2: Install Rootkit Scanner**
```bash
apt install -y rkhunter chkrootkit lynis

# Run initial scan
rkhunter --update
rkhunter --check --skip-keypress
chkrootkit
lynis audit system
```

**Step 3: Monitoring Script**
```bash
cat > /usr/local/bin/security-check.sh << 'EOF'
#!/bin/bash
# Daily security check
LOG=/var/log/security-check.log
echo "=== Security Check $(date) ===" >> "$LOG"

# Check for unauthorized services
systemctl list-units --type=service --state=running --no-legend | grep -v "systemd\|ssh\|nginx\|mongod\|tractatus\|fail2ban\|ufw" >> "$LOG"

# Check for unusual network connections
# (ss replaces netstat, which Ubuntu 22.04 does not install by default)
ss -H -tlnp | grep -v "127.0.0.1\|mongod\|node\|nginx" >> "$LOG"

# Check for PM2 (should never exist)
if command -v pm2 &> /dev/null; then
    echo "WARNING: PM2 DETECTED - SHOULD NOT EXIST" >> "$LOG"
fi

# Check for Docker (should never exist)
if command -v docker &> /dev/null; then
    echo "WARNING: DOCKER DETECTED - SHOULD NOT EXIST" >> "$LOG"
fi
EOF
chmod +x /usr/local/bin/security-check.sh

# Add to cron
echo "0 6 * * * root /usr/local/bin/security-check.sh" > /etc/cron.d/security-check
```

---

## What Must NEVER Exist on This Server

| Component | Reason |
|-----------|--------|
| PM2 | Used for malware persistence |
| Docker | Attack vector (Umami compromise) |
| PostgreSQL | Only for Umami, not needed |
| Any analytics containers | Attack surface |
| Node packages outside app | Potential supply chain risk |

**Verification Script:**
```bash
#!/bin/bash
ALERT=0
if command -v pm2 &> /dev/null; then echo "ALERT: PM2 exists"; ALERT=1; fi
if command -v docker &> /dev/null; then echo "ALERT: Docker exists"; ALERT=1; fi
if [ -d "/home/ubuntu/.pm2" ]; then echo "ALERT: .pm2 directory exists"; ALERT=1; fi
if [ -d "/home/ubuntu/umami-deployment" ]; then echo "ALERT: umami-deployment exists"; ALERT=1; fi
if systemctl is-enabled postgresql &> /dev/null; then echo "ALERT: PostgreSQL enabled"; ALERT=1; fi
if [ $ALERT -eq 0 ]; then echo "Server is clean"; fi
```

---

## Post-Recovery Verification Checklist

- [ ] SSH access works with key authentication
- [ ] Password authentication is disabled
- [ ] fail2ban is running and banning IPs
- [ ] UFW is enabled with correct rules
- [ ] nginx is serving the site
- [ ] tractatus service is running
- [ ] MongoDB is running and bound to 127.0.0.1
- [ ] SSL certificate is valid
- [ ] No PM2 installed
- [ ] No Docker installed
- [ ] No PostgreSQL installed
- [ ] rkhunter scan is clean
- [ ] chkrootkit scan is clean
- [ ] Log rotation is configured
- [ ] Daily security check cron is active

---

## Credentials to Rotate

After reinstall, rotate all credentials:

1. **MongoDB admin password** (if using authentication)
2. **Application secrets** in `.env`
3. **Session secrets**
4. **Any API keys**

**Important:** Change passwords from a DIFFERENT machine, not the compromised server.

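
Replacement secrets should be generated fresh rather than derived from the old values. A minimal sketch, to be run on a trusted machine: `gen_secret` is a hypothetical helper, and `SESSION_SECRET` in the comment is a placeholder for whatever key names the application's `.env` actually defines.

```bash
# Generate a random secret as lowercase hex.
# NBYTES random bytes produce 2 * NBYTES hex characters (default: 32 bytes).
gen_secret() {
    local nbytes="${1:-32}"
    head -c "$nbytes" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
    echo
}

# Example (hypothetical variable name):
#   printf 'SESSION_SECRET=%s\n' "$(gen_secret)"
```
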

---

## Long-Term Prevention

1. **Never install Docker** - not needed for this application
2. **Never install PM2** - use systemd only
3. **Weekly security scans** - rkhunter, chkrootkit
4. **Monitor outbound traffic** - alert on unexpected destinations
5. **Keep system updated** - enable unattended-upgrades
6. **Review SSH logs weekly** - check for brute force patterns

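
The weekly SSH log review can be partly automated by counting failed attempts per source address. This is a sketch that assumes the standard OpenSSH messages `Failed password ...` and `Invalid user ...` in `/var/log/auth.log`; `ssh_failures_by_ip` is a hypothetical helper. It counts matching log lines, so a single attempt that produces both messages is counted twice.

```bash
# Count failed or invalid-user SSH log lines per source IP, busiest first.
ssh_failures_by_ip() {
    awk '
        /sshd/ && (/Failed password/ || /Invalid user/) {
            # The source address is the field that follows the word "from"
            for (i = 1; i < NF; i++) if ($i == "from") { count[$(i + 1)]++; break }
        }
        END { for (ip in count) print count[ip], ip }
    ' "${1:-/var/log/auth.log}" | sort -rn
}
```

Each output line is a count followed by an address. Addresses with high counts that fail2ban has not banned are worth a closer look.
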

---

## OVH Support Communication Template

```
Subject: Request to restore VPS to normal mode after reinstallation

Reference: [ref=1.2378332d]
Server: vps-93a693da.vps.ovh.net

We have identified the cause of the anti-hack triggers:
- Compromised Docker container running botnet malware
- PM2 process manager persisting malicious processes

We have completed a full OS reinstall and implemented:
- Hardened SSH configuration (key-only, no root)
- UFW firewall with minimal open ports
- fail2ban for intrusion prevention
- Removal of Docker and PM2

Please restore the VPS to normal boot mode.

Thank you.
```

---

## Timeline Estimate

| Phase | Duration |
|-------|----------|
| Backup data | 30 min |
| VPS reinstall | 10 min |
| System setup | 45 min |
| Application deployment | 30 min |
| Verification | 30 min |
| **Total** | ~2.5 hours |

---

## References

- [Mirai Malware Behavior & Mitigation (2025)](https://echoxec.com/mirai-malware-in-2025-variant-behavior-exploit-chains-and-mitigation-insights)
- [PM2 Malware Persistence (ThreatLabz)](https://www.zscaler.com/blogs/security-research/malicious-npm-packages-deliver-nodecordrat)
- [Linux Rootkit Detection Tools](https://www.tecmint.com/scan-linux-for-malware-and-rootkits/)
- [OVH VPS Rescue Mode Guide](https://help.ovhcloud.com/csm/en-vps-rescue?id=kb_article_view&sysparm_article=KB0047656)
- [UC Berkeley - Reinstalling Compromised Systems](https://security.berkeley.edu/education-awareness/reinstalling-your-compromised-computer)
- [Common Malware Persistence Mechanisms](https://www.infosecinstitute.com/resources/malware-analysis/common-malware-persistence-mechanisms/)

---

**Document Author:** Claude Code
**Date:** 2026-01-19
**Status:** Ready for implementation
**Next Action:** User decision on proceeding with complete reinstall