Lab: Debugging

11.5. Lab: Debugging#

11.5.1. Lab Exercise 6: Comprehensive Testing Framework#

Create a test suite for validating error handling in Bash scripts.

11.5.1.1. Scenario: Test Your Own Scripts#

Write a test framework that validates:

Script handles missing arguments
Script fails gracefully on missing files
Script properly logs errors
Script cleans up on failure
Exit codes are correct
Scripts handle signals (Ctrl+C)

11.5.1.2. Basic Test Framework#

# test_framework.sh

run_test() {
  local test_name=$1
  shift
  local command="$@"
  
  echo -n "Testing: $test_name... "
  
  if output=$($command 2>&1); then
    echo "✓ PASS"
    return 0
  else
    echo "✗ FAIL"
    echo "Output: $output" >&2
    return 1
  fi
}

assert_exit_code() {
  local expected=$1
  shift
  local command="$@"
  
  $command 2>/dev/null || actual=$?
  
  if [[ $actual -eq $expected ]]; then
    echo "✓ Exit code correct: $actual"
  else
    echo "✗ Exit code wrong: expected $expected, got $actual"
    return 1
  fi
}

11.5.1.3. Requirements#

Create comprehensive tests for a script (e.g., your backup script) that verify:

Argument validation
- Missing required args exits with code 2
- Invalid args rejected with error message
File handling
- Script errors if source doesn’t exist
- Script errors if destination not writable
Error recovery
- Failed operations don’t partially succeed
- Cleanup occurs on interruption
Logging
- Errors logged to file
- Success logged with timestamp
Exit codes
- 0 on success
- 1 on user error
- 2 on invalid arguments
- 130 on Ctrl+C

11.5.1.4. Example Test Suite#

#!/bin/bash

# Tests for backup.sh script

test_count=0
pass_count=0

run_test "Missing source argument" bash backup.sh
run_test "Non-existent source" bash backup.sh /nonexistent /tmp
run_test "Valid backup creates file" bash backup.sh /tmp /tmp/backup
assert_exit_code 0 bash backup.sh /tmp /tmp/backup
assert_exit_code 2 bash backup.sh  # Missing args

echo "Tests: $pass_count/$test_count passed"

11.5.1.5. Testing#

./test_suite.sh
# Output:
# Testing: Missing source argument... ✓ PASS
# Testing: Non-existent source... ✓ PASS
# Tests: 2/2 passed

11.5.2. Lab Exercise 5: Recovery and Retry Logic#

Create robust error recovery for scripts that deal with transient failures.

11.5.2.1. Scenario: API Integration Script#

You need to write a script that calls an HTTP API that occasionally fails. Implement retry logic with exponential backoff.

11.5.2.2. Requirements#

Create api_integration.sh that:

Calls an endpoint (use curl or wget)
Handles transient errors:
- Timeout (retry)
- 503 Service Unavailable (retry)
- 429 Too Many Requests (retry with backoff)
- 401/403 (don’t retry, fail immediately)
- 200 (success)
Implements exponential backoff:
- First retry: 2 seconds
- Second retry: 4 seconds
- Third retry: 8 seconds
- Max retries: 3
Logs all attempts with timestamps
Reports final status: Success or failure reason

11.5.2.3. Example Output#

[2024-01-15 14:30:00] Attempting to fetch https://api.example.com/data
[2024-01-15 14:30:02] Error 503 (Service Unavailable). Retrying in 2s...
[2024-01-15 14:30:04] Attempting to fetch https://api.example.com/data
[2024-01-15 14:30:06] Error 503 (Service Unavailable). Retrying in 4s...
[2024-01-15 14:30:10] Attempting to fetch https://api.example.com/data
[2024-01-15 14:30:11] Success: Retrieved 2048 bytes

11.5.2.4. Hints#

Use curl -f to fail on HTTP errors
Check HTTP status code separately if needed
Implement retry counter and max retries
Use exponential backoff: sleep $((2 ** retry_count))
Save both stdout and stderr for error messages
Log to file with timestamps

11.5.2.5. Testing#

# Test against real API
./api_integration.sh "https://httpbin.org/delay/1"

# Test with failing endpoint
./api_integration.sh "https://httpbin.org/status/503"

11.5.3. Lab Exercise 4: Fix Input Validation Bugs#

A user-facing script doesn’t properly validate inputs, causing crashes or unexpected behavior.

11.5.3.1. Vulnerable Script#

#!/bin/bash

# User management script
create_user() {
  local username=$1
  local password=$2
  local role=$3
  
  # Validate username (should check for special chars)
  # Validate password (should check minimum length)
  # Validate role (should whitelist valid roles)
  
  # Create the user
  useradd -m -s /bin/bash "$username"
  echo "$username:$password" | chpasswd
  
  # Assign role (vulnerable to command injection!)
  usermod -G "$role" "$username"
  
  echo "User $username created with role $role"
}

# Interactive input
read -p "Username: " user
read -s -p "Password: " pass
read -p "Role: " role

create_user "$user" "$pass" "$role"

11.5.3.2. Vulnerabilities to Fix#

No username validation (special chars, length)
No password strength checking
Role not validated (could inject commands)
Error handling for useradd/usermod failures
Confirmation before making changes
No logging of admin actions

11.5.3.3. Requirements#

Implement:

Input validation with whitelisting
Password minimum requirements
Role validation against allowed list
Error checking for system commands
Confirmation prompt
Detailed logging of all changes
Secure handling of passwords (don’t echo)

11.5.3.4. Example Validation#

validate_username() {
  local user=$1
  
  # Check length
  [[ ${#user} -lt 3 ]] && {
    echo "Username too short (min 3)" >&2
    return 1
  }
  
  # Check for valid chars only
  [[ $user =~ ^[a-z_][a-z0-9_-]*$ ]] || {
    echo "Invalid username (alphanumeric + _ only)" >&2
    return 1
  }
  
  return 0
}

11.5.3.5. Testing#

# Test with invalid inputs
./create_user.sh "user name" "pass" "admin"  # Should reject space
./create_user.sh "a" "x" "admin"  # Should reject short
./create_user.sh "user" "weak" "invalid"  # Should reject role

11.5.4. Lab Exercise 3: Trace and Debug a Failing Script#

A complex script fails intermittently without clear error messages. Use debugging techniques to identify the problem.

11.5.4.1. Failing Script (database migration)#

#!/bin/bash

migrate_data() {
  local source_db=$1
  local dest_db=$2
  
  # Export from source
  export_file="/tmp/migration_$$.sql"
  mysqldump $source_db > $export_file
  
  # Verify export
  wc -l $export_file
  
  # Import to destination
  mysql $dest_db < $export_file
  
  # Verify counts match
  query="SELECT COUNT(*) FROM table1"
  src_count=$(mysql $source_db -se "$query")
  dst_count=$(mysql $dest_db -se "$query")
  
  if [[ $src_count != $dst_count ]]; then
    echo "Migration failed"
    exit 1
  fi
}

migrate_data "production" "staging"

11.5.4.2. Issues to Investigate#

Silent failures in mysqldump/mysql calls
Exit code not propagated
Missing database credentials (empty output)
No error context if query fails
Temp file not cleaned up
No logging of what completed

11.5.4.3. Requirements#

Use debugging tools to:

Add set -x to trace execution
Check each command’s exit code
Verify variable values at each step
Add strategic echo statements
Capture and display error output
Create a debug log file

11.5.4.4. Suggested Debugging Approach#

# Enable debugging
set -x

# Add debug output
echo "[DEBUG] source_db=$source_db, dest_db=$dest_db" >&2

# Check return codes
mysqldump "$source_db" > "$export_file" || {
  echo "[ERROR] Dump failed: $?" >&2
  exit 1
}

# Verify output
[[ -s "$export_file" ]] || {
  echo "[ERROR] Export file is empty" >&2
  exit 1
}

11.5.4.5. Testing#

# Run with debugging
bash -x migrate.sh 2>&1 | tee debug.log

# Analyze the log to find where it fails

11.5.5. Lab Exercise 2: Add Error Handling to Script#

Take a fragile script and make it production-ready with proper error handling.

11.5.5.1. Original Fragile Script#

#!/bin/bash

# Simple backup script
backup_dir="/backups"
source_dir="/var/www"
db_host="localhost"
db_name="production"

mkdir -p $backup_dir
tar -czf $backup_dir/web-backup.tar.gz $source_dir
mysqldump -h $db_host -u root $db_name > $backup_dir/db-backup.sql
echo "Backup complete"

11.5.5.2. Issues to Address#

No validation of prerequisites (dirs exist, tools available, permissions)
No error checking on critical operations
Unquoted variables
No cleanup on failure (partial backups)
No logging
No verification of backups
No handling if script interrupted

11.5.5.3. Requirements#

Enhance the script with:

Strict mode (set -euo pipefail)
Input validation (dirs exist, permissions)
Error handlers with logging
Cleanup function
Backup verification
Lock file to prevent concurrent runs
Informative error messages

11.5.5.4. Testing#

# Test with missing directory
./backup.sh /nonexistent

# Test successful backup
mkdir -p /tmp/www
./backup.sh /tmp/www

# Test interrupt handling (Ctrl+C)
# Should cleanup lock file and temp files

11.5.5.5. Hints#

Use trap for cleanup and error handling
Verify tar integrity after creation
Check if mysqldump succeeded before using file
Use lock file: [[ -f $lock ]] && exit 1
Log all actions to file with timestamps

11.5.6. Lab Exercise 1: Debugging a Broken Script#

A script has multiple bugs causing failures. Your task is to identify and fix them.

11.5.6.1. Broken Script#

#!/bin/bash

# Process files in directory
process_files() {
  local dir=$1
  
  for file in $dir/*.txt
  do
    echo "Processing $file"
    grep "ERROR" $file || break
    cat $file | grep "WARN" | sort | uniq
  done
  
  echo "Processing complete"
}

process_files /var/log

11.5.6.2. Issues to Find and Fix#

Missing quotes: $dir and $file unquoted (breaks with spaces)
Wrong error handling: || break stops at first non-match (should be ||)
Unnecessary cat: Useless use of cat in pipeline
No error messages: If directory doesn’t exist, error is silent
Missing function check: Process files returns wrong exit code

11.5.6.3. Requirements#

Add proper error handling with set -euo pipefail
Quote all variables
Add validation that directory exists
Fix the grep logic
Remove unnecessary command
Add helpful error messages

11.5.6.4. Testing#

# Create test environment
mkdir -p /tmp/test_logs
echo "ERROR in line 1" > /tmp/test_logs/test1.txt
echo "WARN in line 2" > /tmp/test_logs/test2.txt

# Run fixed script
./process_files.sh /tmp/test_logs

11.5.6.5. Hints#

Use set -euo pipefail at top
Always quote variables: "$var"
Check file/directory existence with [[ -d ]]
Test with both valid and invalid directories