12.5. Lab: Automated Reporting
12.5.1. Lab 6: Automated Performance Dashboard
Build a comprehensive performance monitoring and reporting system that collects metrics, processes them in parallel, and generates a web-accessible dashboard.
Requirements:
Collect system metrics every minute: CPU, memory, disk, network, process count
Store in time-series format (CSV or database)
Process collected data in parallel (4 workers) to generate:
Hourly summaries
Daily statistics
Weekly trends
Top resource consumers
Generate HTML dashboard showing:
Current status indicators
24-hour graphs
Top processes
Alerts for anomalies
Schedule collection and report generation via cron
Make reports available via simple HTTP server
Dashboard Features:
Real-time metrics display
Historical trend charts
Alert indicators for high usage
Top 10 processes by CPU and memory
Disk usage breakdown by partition
Data Collection Script:
# Collect every minute for 7 days
* * * * * /usr/local/bin/collect_metrics.sh
# Generate reports daily at 1 AM
0 1 * * * /usr/local/bin/generate_reports.sh
# Clean old data (keep 30 days)
0 2 * * * find /var/log/metrics -type f -mtime +30 -delete
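The crontab above references two scripts. Below is a sketch of what `generate_reports.sh` could look like, assuming per-hour CSV files named `YYYY-MM-DD_HH.csv` with `timestamp,cpu,mem,disk` columns; the directory layout and field choices are illustrative assumptions, not a reference solution. It summarizes hourly files with four parallel workers and assembles a minimal static dashboard:

```bash
#!/usr/bin/env bash
# Sketch of generate_reports.sh: parallel hourly summaries + minimal HTML page.
# Paths and the CSV column layout (timestamp,cpu,mem,disk) are assumptions.
set -eu
METRICS_DIR="${METRICS_DIR:-/tmp/metrics}"
REPORT_DIR="${REPORT_DIR:-/tmp/metrics_reports}"
mkdir -p "$METRICS_DIR" "$REPORT_DIR"
export REPORT_DIR

# Seed sample data so the sketch runs standalone.
printf '%s\n' \
  "2024-01-15T10:30:00,45.2,62.1,55.3" \
  "2024-01-15T10:35:00,48.1,65.2,55.3" > "$METRICS_DIR/2024-01-15_10.csv"
printf '%s\n' \
  "2024-01-15T11:00:00,90.0,70.0,55.4" > "$METRICS_DIR/2024-01-15_11.csv"

summarize() {  # one hourly CSV -> one summary line (avg/max CPU)
  awk -F, -v f="$(basename "$1" .csv)" \
    '{s+=$2; if ($2>m) m=$2}
     END{printf "%s avg_cpu=%.1f max_cpu=%.1f\n", f, s/NR, m}' \
    "$1" > "$REPORT_DIR/$(basename "$1" .csv).summary"
}
export -f summarize

# Four workers, one file per invocation; per-file outputs avoid write races.
printf '%s\n' "$METRICS_DIR"/*.csv | xargs -n1 -P4 bash -c 'summarize "$0"'

# Assemble a minimal HTML dashboard from the summaries.
{
  echo "<html><body><h1>Performance Dashboard</h1><pre>"
  cat "$REPORT_DIR"/*.summary
  echo "</pre></body></html>"
} > "$REPORT_DIR/index.html"
```

The resulting directory can then be served with, for example, `python3 -m http.server --directory "$REPORT_DIR" 8000` to satisfy the HTTP requirement.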
Validation:
Verify metrics are collected consistently
Dashboard renders correctly in browser
Reports process data efficiently in parallel
Old data is cleaned up per policy
Dashboard responds quickly (<1 second)
Bonus:
Implement anomaly detection (alert when a metric significantly exceeds its historical average, e.g. by more than 25%)
Email summary reports weekly
Add predictive capacity planning warnings
Export data in JSON format for external tools
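The anomaly-detection bonus can start very small. The sketch below flags the newest CPU sample when it exceeds the mean of all samples by more than 25%; the threshold, file path, and two-column layout are assumptions chosen for illustration, and demo data is seeded so it runs as-is:

```bash
#!/usr/bin/env bash
# Minimal anomaly check: compare the latest CPU reading against the mean
# of all readings in the file. Threshold (25%) is an arbitrary starting point.
set -eu
CSV="${CSV:-/tmp/metrics_demo.csv}"
printf '%s\n' \
  "2024-01-15T10:30:00,40.0" \
  "2024-01-15T10:35:00,42.0" \
  "2024-01-15T10:40:00,41.0" \
  "2024-01-15T10:45:00,90.0" > "$CSV"

result=$(awk -F, '{v[NR]=$2; s+=$2}
  END{m=s/NR
      if (v[NR] > m*1.25) printf "ALERT last=%.1f mean=%.1f", v[NR], m
      else printf "OK last=%.1f mean=%.1f", v[NR], m}' "$CSV")
echo "$result"
```

A production version would compare against a rolling window (e.g. the same hour on previous days) rather than the all-time mean.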
12.5.2. Lab 5: Scheduled Database Maintenance
Create an automated database maintenance system that runs optimizations, backups, and consistency checks on a schedule.
Requirements:
Check database integrity (PRAGMA integrity_check for SQLite or CHECK TABLE for MySQL)
Optimize tables and indexes
Run daily incremental backups, weekly full backups
Generate maintenance reports with timing and status
Schedule using cron: daily maintenance at 2 AM, full backup Sunday midnight
Implement rollback capability if issues detected
Email report to DBA
Sample Database Operations:
# SQLite
sqlite3 database.db "PRAGMA integrity_check;"
sqlite3 database.db "VACUUM;"
# MySQL
mysqlcheck -u root -p -a -o database
mysqldump -u root -p database > backup.sql
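The individual commands above can be tied together into a timed, report-producing wrapper. This sketch covers the SQLite case only; the database path, backup directory, and report file are assumptions, and a demo database is created so the script runs end to end:

```bash
#!/usr/bin/env bash
# Sketch of a daily SQLite maintenance wrapper: integrity check, VACUUM,
# online backup, and a timed report line. Paths are illustrative assumptions.
set -eu
command -v sqlite3 >/dev/null 2>&1 || { echo "sqlite3 not installed, skipping"; exit 0; }

DB="${DB:-/tmp/maint_demo.db}"
BACKUP_DIR="${BACKUP_DIR:-/tmp/db_backups}"
REPORT="${REPORT:-/tmp/maintenance_report.log}"
mkdir -p "$BACKUP_DIR"

# Seed a demo database so the sketch runs standalone.
[ -s "$DB" ] || sqlite3 "$DB" \
  "CREATE TABLE t(id INTEGER PRIMARY KEY, v TEXT); INSERT INTO t(v) VALUES ('a'),('b');"

start=$(date +%s)
status=ok
# 1. Integrity first: never VACUUM or back up a database that fails the check.
sqlite3 "$DB" "PRAGMA integrity_check;" | grep -qx ok || status=corrupt
if [ "$status" = ok ]; then
  sqlite3 "$DB" "VACUUM;"                                       # 2. Reclaim space
  sqlite3 "$DB" ".backup $BACKUP_DIR/full-$(date +%Y%m%d).db"   # 3. Consistent online backup
fi
echo "$(date -Iseconds) db=$DB status=$status duration=$(( $(date +%s) - start ))s" >> "$REPORT"
[ "$status" = ok ]
```

The `.backup` dot-command is used instead of copying the file, since it produces a consistent snapshot even while the database is in use.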
Validation:
Verify maintenance completes in reasonable time
Backup file is valid and can be restored
Integrity check passes
Reports show timing information
Old backups are cleaned up per policy
Bonus:
Calculate and report space savings from OPTIMIZE TABLE (MySQL) or VACUUM (SQLite)
Monitor slow queries during maintenance
Implement point-in-time recovery capability
12.5.3. Lab 4: Daemon Health Checker
Create a monitoring daemon that periodically checks the health of other services and automatically restarts them if they fail.
Requirements:
Write a daemon script that runs continuously
Define a list of services to monitor (e.g., web server, database, API)
Check service health every 30 seconds
If service is down, automatically restart it
Log all events (start, stop, restart) with timestamp
Handle signals (SIGTERM, SIGHUP) gracefully
Implement PID file locking to prevent multiple instances
Create systemd service file for the daemon
Services to Monitor:
declare -A SERVICES=(
[http]="curl -s http://localhost:8080/health"
[db]="mysql -u root -e 'SELECT 1'"
[api]="curl -s http://localhost:3000/status"
)
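A skeleton for the daemon itself might look like the following. The probe and restart commands, PID file path, and log path are illustrative assumptions; the loop is started explicitly with `daemon-monitor.sh run` so the helper functions can also be sourced and tested on their own:

```bash
#!/usr/bin/env bash
# Health-checker sketch: PID-file locking, signal handling, and a
# check-and-restart helper. Start the monitoring loop with: daemon-monitor.sh run
PIDFILE="${PIDFILE:-/tmp/daemon-monitor.pid}"
LOG="${LOG:-/tmp/daemon-monitor.log}"
INTERVAL="${INTERVAL:-30}"

log() { echo "$(date -Iseconds) $*" >> "$LOG"; }

check_and_restart() {           # $1=name  $2=health probe  $3=restart command
  local name=$1 probe=$2 restart=$3
  if ! bash -c "$probe" >/dev/null 2>&1; then
    log "$name DOWN, restarting"
    bash -c "$restart" && log "$name restarted"
  fi
}

main() {
  # Single-instance guard: refuse to start if a live PID file exists.
  if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    echo "already running (pid $(cat "$PIDFILE"))" >&2
    exit 1
  fi
  echo $$ > "$PIDFILE"
  trap 'log "SIGTERM received, shutting down"; running=0' TERM
  trap 'log "SIGHUP received (reload is a no-op in this sketch)"' HUP
  trap 'rm -f "$PIDFILE"' EXIT

  log "monitor started (pid $$)"
  running=1
  while [ "$running" -eq 1 ]; do
    check_and_restart http "curl -s http://localhost:8080/health" "systemctl restart http"
    sleep "$INTERVAL" & wait $!   # background sleep so signals interrupt the wait promptly
  done
}
[ "${1:-}" = run ] && main
```

Backgrounding the `sleep` and `wait`ing on it is the detail that makes SIGTERM take effect within a second rather than after a full interval.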
Validation:
Kill a monitored service and verify it auto-restarts
Check PID file prevents duplicate instances
Verify logging shows all events
Test SIGTERM handling (graceful shutdown)
Monitor from systemd:
systemctl status daemon-monitor
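The systemd requirement could be met with a unit along these lines; the script path and service name are illustrative assumptions:

```ini
# /etc/systemd/system/daemon-monitor.service (illustrative)
[Unit]
Description=Health-check daemon for monitored services
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/daemon-monitor.sh
PIDFile=/run/daemon-monitor.pid
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

After installing the file, run `systemctl daemon-reload && systemctl enable --now daemon-monitor`, then check it with the `systemctl status` command shown above.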
Bonus:
Alert via email after N restart attempts
Exponential backoff for restart attempts
Collect and report service uptime statistics
12.5.4. Lab 3: Parallel Log Analysis Pipeline
Build a parallel processing pipeline that analyzes multiple log files simultaneously and generates a consolidated report.
Requirements:
Parse access logs and extract: IP, request method, status code, response time
Process multiple log files in parallel (4 workers)
Count unique IPs, calculate average response time
Identify slow requests (> 1000ms)
Generate HTML report with findings
Use xargs for parallelism
Sample Log Data:
192.168.1.100 - - [15/Jan/2024:10:30:45 +0000] "GET /api/users HTTP/1.1" 200 1523 "0.234"
192.168.1.101 - - [15/Jan/2024:10:30:46 +0000] "POST /api/data HTTP/1.1" 201 456 "1.234"
192.168.1.100 - - [15/Jan/2024:10:30:47 +0000] "GET /home HTTP/1.1" 200 2345 "0.123"
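One way to structure the pipeline is map/reduce style: per-file workers run under `xargs -P4` and each writes its own partial result (sidestepping output races), then a single merge step computes the totals. The paths below are assumptions, and two sample files are seeded so the sketch runs standalone against the log format shown above:

```bash
#!/usr/bin/env bash
# Parallel log pipeline sketch: xargs -P4 map phase, single-process reduce.
set -eu
LOGDIR="${LOGDIR:-/tmp/loglab}"
WORK="${WORK:-/tmp/loglab_work}"
mkdir -p "$LOGDIR" "$WORK"
export WORK

# Seed two sample files so the sketch runs standalone.
cat > "$LOGDIR/a.log" <<'EOF'
192.168.1.100 - - [15/Jan/2024:10:30:45 +0000] "GET /api/users HTTP/1.1" 200 1523 "0.234"
192.168.1.101 - - [15/Jan/2024:10:30:46 +0000] "POST /api/data HTTP/1.1" 201 456 "1.234"
EOF
cp "$LOGDIR/a.log" "$LOGDIR/b.log"

analyze_one() {   # one log file -> one TSV of ip, method, status, seconds
  awk '{gsub(/"/,"",$6); gsub(/"/,"",$NF); print $1 "\t" $6 "\t" $9 "\t" $NF}' \
    "$1" > "$WORK/$(basename "$1").tsv"
}
export -f analyze_one

# Map phase: 4 parallel workers, one file per invocation.
printf '%s\n' "$LOGDIR"/*.log | xargs -n1 -P4 bash -c 'analyze_one "$0"'

# Reduce phase: merge the partials into one summary.
cat "$WORK"/*.tsv | awk -F'\t' \
  '{if (!($1 in ips)) uniq++; ips[$1]++; sum+=$4; if ($4 > 1.0) slow++}
   END{printf "unique_ips=%d avg_rt=%.3f slow=%d total=%d\n",
       uniq, sum/NR, slow, NR}' | tee "$WORK/summary.txt"
```

Because every worker writes to a distinct file, no locking is needed; the only shared write is the sequential reduce step at the end.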
Validation:
Process multiple log files in parallel
Ensure all records are processed exactly once
Verify parallel processing is faster than sequential
Check for race conditions in output file
Bonus:
Find top 10 slowest requests
Identify suspicious IP addresses (excessive requests)
Generate time-series report of response times
12.5.5. Lab 1: System Monitoring Scheduler
Create a cron-based system monitoring script that captures CPU, memory, and disk usage at 5-minute intervals and stores the data in a CSV file for later analysis.
Requirements:
Create a monitoring script that captures: timestamp, CPU %, memory %, disk %
Schedule it to run every 5 minutes using cron
Store output in /tmp/sysmon.csv
Ensure the script handles cron environment limitations
Create a report script that analyzes the CSV and finds peak usage times
Bonus:
Alert if any metric exceeds 80%
Implement log rotation for the CSV file
Create a weekly summary report
Expected Output:
2024-01-15T10:30:00,45.2,62.1,55.3
2024-01-15T10:35:00,48.1,65.2,55.3
2024-01-15T10:40:00,52.3,68.5,55.4
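A collector producing rows in that format could be sketched as follows. It is Linux-specific (it reads `/proc` directly), and the CPU figure is the since-boot average rather than an interval sample, which is a deliberate simplification:

```bash
#!/usr/bin/env bash
# Sketch of a sysmon collector: appends timestamp,cpu%,mem%,disk% rows.
# CPU% is the since-boot average from /proc/stat (a simplification);
# memory% comes from /proc/meminfo and disk% from df for the root filesystem.
set -eu
OUT="${OUT:-/tmp/sysmon.csv}"
# cron runs with a minimal environment, so set an explicit PATH.
export PATH=/usr/bin:/bin:/usr/sbin:/sbin

ts=$(date +%Y-%m-%dT%H:%M:%S)
cpu=$(awk '/^cpu /{busy=$2+$3+$4; total=busy+$5; printf "%.1f", busy/total*100}' /proc/stat)
mem=$(awk '/MemTotal/{t=$2} /MemAvailable/{a=$2} END{printf "%.1f", (t-a)/t*100}' /proc/meminfo)
disk=$(df -P / | awk 'NR==2 {sub("%","",$5); printf "%.1f", $5}')
echo "$ts,$cpu,$mem,$disk" >> "$OUT"
```

Installed as, say, `/usr/local/bin/sysmon.sh` (a hypothetical path), the 5-minute schedule would be a crontab line such as `*/5 * * * * /usr/local/bin/sysmon.sh`.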
12.5.6. Lab 2: Automated Backup with Verification
Implement a production-grade backup system with multiple strategies (full/incremental) and automatic verification.
Requirements:
Create separate full and incremental backup functions
Use retention policy: keep 7 full backups, daily incrementals for 30 days
Verify each backup with checksum comparison
Generate a backup report with success/failure status
Handle backup failures gracefully with email alerts
Schedule via cron (full backup on Sunday, incrementals Monday through Saturday)
Test Data:
mkdir -p /tmp/backup_test/data
dd if=/dev/urandom of=/tmp/backup_test/data/file1.bin bs=1M count=10
dd if=/dev/urandom of=/tmp/backup_test/data/file2.bin bs=1M count=5
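The full-backup-with-verification path could be sketched like this: record source checksums, archive, then extract to a scratch directory and re-check every file. The destination directory and naming scheme are assumptions, and smaller test files are seeded if the data above is absent so the sketch runs standalone:

```bash
#!/usr/bin/env bash
# Full backup sketch with checksum verification. GNU tar/sha256sum assumed.
set -eu
SRC="${SRC:-/tmp/backup_test/data}"
DEST="${DEST:-/tmp/backup_test/backups}"
mkdir -p "$SRC" "$DEST"
# Seed test data if absent (smaller than the lab's files, for speed).
[ -f "$SRC/file1.bin" ] || dd if=/dev/urandom of="$SRC/file1.bin" bs=1M count=1 2>/dev/null
[ -f "$SRC/file2.bin" ] || dd if=/dev/urandom of="$SRC/file2.bin" bs=1M count=1 2>/dev/null

stamp=$(date +%Y%m%d_%H%M%S)
archive="$DEST/full_$stamp.tar.gz"

# 1. Record source checksums before archiving.
( cd "$SRC" && sha256sum ./* ) > "$DEST/full_$stamp.sha256"

# 2. Create the archive.
tar -czf "$archive" -C "$SRC" .

# 3. Verify: extract to a scratch dir and compare every checksum.
scratch=$(mktemp -d)
tar -xzf "$archive" -C "$scratch"
( cd "$scratch" && sha256sum --quiet -c "$DEST/full_$stamp.sha256" )
rm -rf "$scratch"
echo "backup $archive verified OK"
```

An incremental variant would add `--listed-incremental` snapshot files to the `tar` calls; the verify step stays the same.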
Validation:
Verify backup size matches original
Restore from backup and compare checksums
Verify old backups are cleaned up according to policy