11.5. Lab: Debugging

11.5.1. Lab Exercise 6: Comprehensive Testing Framework

Create a test suite for validating error handling in Bash scripts.

11.5.1.1. Scenario: Test Your Own Scripts

Write a test framework that validates:

  1. Script handles missing arguments

  2. Script fails gracefully on missing files

  3. Script properly logs errors

  4. Script cleans up on failure

  5. Exit codes are correct

  6. Scripts handle signals (Ctrl+C)

11.5.1.2. Basic Test Framework

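# test_framework.sh
# Source this file from a test suite. Counters start at 0 unless the suite set them.
: "${test_count:=0}" "${pass_count:=0}"

# run_test NAME COMMAND [ARGS...]: passes when the command exits with status 0
run_test() {
  local test_name=$1
  shift
  local output

  test_count=$((test_count + 1))
  echo -n "Testing: $test_name... "

  # Run "$@" directly so arguments containing spaces are not re-split
  if output=$("$@" 2>&1); then
    pass_count=$((pass_count + 1))
    echo "✓ PASS"
    return 0
  else
    echo "✗ FAIL"
    echo "Output: $output" >&2
    return 1
  fi
}

# assert_exit_code EXPECTED COMMAND [ARGS...]: passes when the exit code matches
assert_exit_code() {
  local expected=$1
  shift
  local actual=0  # stays 0 when the command succeeds

  test_count=$((test_count + 1))
  "$@" 2>/dev/null || actual=$?

  if [[ $actual -eq $expected ]]; then
    pass_count=$((pass_count + 1))
    echo "✓ Exit code correct: $actual"
  else
    echo "✗ Exit code wrong: expected $expected, got $actual"
    return 1
  fi
}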

11.5.1.3. Requirements

Create comprehensive tests for a script (e.g., your backup script) that verify:

  1. Argument validation

    • Missing required args exits with code 2

    • Invalid args rejected with error message

  2. File handling

    • Script errors if source doesn’t exist

    • Script errors if destination not writable

  3. Error recovery

    • Failed operations don’t partially succeed

    • Cleanup occurs on interruption

  4. Logging

    • Errors logged to file

    • Success logged with timestamp

  5. Exit codes

    • 0 on success

    • 1 on user error

    • 2 on invalid arguments

    • 130 on Ctrl+C
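
Some of these requirements cannot be checked through the exit code alone: "rejected with error message" needs a look at the output, and "cleanup occurs" needs a look at the file system. The following is a minimal sketch of two extra assertions in the style of the framework above. The helper names assert_output_contains and assert_path_absent are hypothetical and not part of test_framework.sh.

```bash
#!/bin/bash
# Hypothetical helpers; the names are illustrative only.

# assert_output_contains TEXT COMMAND [ARGS...]
# Passes when the combined stdout/stderr of the command contains TEXT (fixed string).
assert_output_contains() {
  local text=$1
  shift
  local output
  output=$("$@" 2>&1) || true   # a non-zero exit is fine; only the text is checked

  if grep -qF -- "$text" <<<"$output"; then
    echo "✓ Output contains: $text"
  else
    echo "✗ Output missing: $text"
    return 1
  fi
}

# assert_path_absent PATH
# Passes when PATH does not exist, e.g. a temp file that cleanup should have removed.
assert_path_absent() {
  local path=$1
  if [[ ! -e $path ]]; then
    echo "✓ Cleaned up: $path"
  else
    echo "✗ Still present: $path"
    return 1
  fi
}
```

A call might look like assert_output_contains "Usage:" bash backup.sh, assuming your script prints a usage line when arguments are missing.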

11.5.1.4. Example Test Suite

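#!/bin/bash

# Tests for backup.sh script
test_count=0
pass_count=0

source ./test_framework.sh  # provides run_test and assert_exit_code

# run_test passes on exit status 0, so use it for the success path only
run_test "Valid backup creates file" bash backup.sh /tmp /tmp/backup
assert_exit_code 0 bash backup.sh /tmp /tmp/backup

# Expected failures: assert the specific exit code instead
assert_exit_code 2 bash backup.sh                    # Missing args
assert_exit_code 1 bash backup.sh /nonexistent /tmp  # Non-existent source

echo "Tests: $pass_count/$test_count passed"

11.5.1.5. Testing

./test_suite.sh
# Output when every check passes (the final tally assumes the framework counts each test):
# Testing: Valid backup creates file... ✓ PASS
# ✓ Exit code correct: 0
# ✓ Exit code correct: 2
# ✓ Exit code correct: 1
# Tests: 4/4 passed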

11.5.2. Lab Exercise 5: Recovery and Retry Logic

Create robust error recovery for scripts that deal with transient failures.

11.5.2.1. Scenario: API Integration Script

You need to write a script that calls an HTTP API that occasionally fails. Implement retry logic with exponential backoff.

11.5.2.2. Requirements

Create api_integration.sh that:

  1. Calls an endpoint (use curl or wget)

  2. Handles transient errors:

    • Timeout (retry)

    • 503 Service Unavailable (retry)

    • 429 Too Many Requests (retry with backoff)

    • 401/403 (don’t retry, fail immediately)

    • 200 (success)

  3. Implements exponential backoff:

    • First retry: 2 seconds

    • Second retry: 4 seconds

    • Third retry: 8 seconds

    • Max retries: 3

  4. Logs all attempts with timestamps

  5. Reports final status: Success or failure reason
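
The retry rules above can be sketched as a small loop. This is one possible shape under stated assumptions, not a reference solution: the function names, MAX_RETRIES, and SLEEP_CMD are illustrative, and the status lookup is passed in as a command so the loop can be exercised without a network.

```bash
#!/bin/bash
# Sketch only: function names, MAX_RETRIES, and SLEEP_CMD are illustrative assumptions.

MAX_RETRIES=3
SLEEP_CMD=${SLEEP_CMD:-sleep}   # overridable so the logic can be tested without waiting

log() {
  printf '[%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
}

# Map an HTTP status code to an action: success, retry, or fatal.
classify_status() {
  case $1 in
    200)         echo success ;;
    000|429|503) echo retry ;;   # curl prints 000 when no response arrived (e.g. timeout)
    *)           echo fatal ;;   # 401, 403, and anything unexpected: do not retry
  esac
}

# fetch_status URL: print only the HTTP status code of a GET request.
fetch_status() {
  curl -s -o /dev/null -m 10 -w '%{http_code}' "$1" || true
}

# fetch_with_retry URL [STATUS_COMMAND]
# STATUS_COMMAND defaults to fetch_status; it must print a status code for the URL.
fetch_with_retry() {
  local url=$1 status_cmd=${2:-fetch_status}
  local retry_count=0 status action delay

  while true; do
    log "Attempting to fetch $url"
    status=$("$status_cmd" "$url")
    action=$(classify_status "$status")

    case $action in
      success) log "Success: HTTP $status"; return 0 ;;
      fatal)   log "Error $status. Not retrying."; return 1 ;;
    esac

    if (( retry_count >= MAX_RETRIES )); then
      log "Giving up after $MAX_RETRIES retries (last status: $status)"
      return 1
    fi

    retry_count=$((retry_count + 1))
    delay=$((2 ** retry_count))   # 2s, 4s, 8s
    log "Error $status. Retrying in ${delay}s..."
    "$SLEEP_CMD" "$delay"
  done
}
```

Separating classify_status from the loop keeps the retry policy in one place, so adding another retryable status is a one-line change.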

11.5.2.3. Example Output

[2024-01-15 14:30:00] Attempting to fetch https://api.example.com/data
[2024-01-15 14:30:02] Error 503 (Service Unavailable). Retrying in 2s...
[2024-01-15 14:30:04] Attempting to fetch https://api.example.com/data
[2024-01-15 14:30:06] Error 503 (Service Unavailable). Retrying in 4s...
[2024-01-15 14:30:10] Attempting to fetch https://api.example.com/data
[2024-01-15 14:30:11] Success: Retrieved 2048 bytes

11.5.2.4. Hints

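  • Use curl -f to make curl exit non-zero on HTTP errors (status 400 and above)

  • curl -f alone does not tell you which status was returned; capture it with -w '%{http_code}' so you can tell 503/429 (retry) from 401/403 (fail immediately)

  • Implement retry counter and max retries

  • Use exponential backoff: sleep $((2 ** retry_count)), with retry_count starting at 1 so the delays are 2, 4, and 8 seconds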

  • Save both stdout and stderr for error messages

  • Log to file with timestamps

11.5.2.5. Testing

# Test against real API
./api_integration.sh "https://httpbin.org/delay/1"

# Test with failing endpoint
./api_integration.sh "https://httpbin.org/status/503"

11.5.3. Lab Exercise 4: Fix Input Validation Bugs

A user-facing script doesn’t properly validate inputs, causing crashes or unexpected behavior.

11.5.3.1. Vulnerable Script

#!/bin/bash

# User management script
create_user() {
  local username=$1
  local password=$2
  local role=$3
  
  # Validate username (should check for special chars)
  # Validate password (should check minimum length)
  # Validate role (should whitelist valid roles)
  
  # Create the user
  useradd -m -s /bin/bash "$username"
  echo "$username:$password" | chpasswd
  
  # Assign role (unvalidated: any group name is accepted, and -G replaces existing groups!)
  usermod -G "$role" "$username"
  
  echo "User $username created with role $role"
}

# Interactive input
read -p "Username: " user
read -s -p "Password: " pass
read -p "Role: " role

create_user "$user" "$pass" "$role"

11.5.3.2. Vulnerabilities to Fix

  1. No username validation (special chars, length)

  2. No password strength checking

  3. Role not validated (any group name is accepted, including privileged groups)

  4. No error handling for useradd/usermod failures

  5. No confirmation before making changes

  6. No logging of admin actions

11.5.3.3. Requirements

Implement:

  • Input validation with whitelisting

  • Password minimum requirements

  • Role validation against allowed list

  • Error checking for system commands

  • Confirmation prompt

  • Detailed logging of all changes

  • Secure handling of passwords (don’t echo)
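
Two of these requirements, role validation against an allowed list and password minimum requirements, can be sketched as follows. The role names in ALLOWED_ROLES, the MIN_PASSWORD_LENGTH value, and the function names are assumptions chosen for illustration; substitute your own policy.

```bash
#!/bin/bash
# Sketch only: ALLOWED_ROLES, MIN_PASSWORD_LENGTH, and the function names are assumptions.

ALLOWED_ROLES=(admin developer viewer)   # hypothetical role list
MIN_PASSWORD_LENGTH=12                   # hypothetical policy

# Accept a role only if it exactly matches an entry in the allowed list.
validate_role() {
  local role=$1 allowed
  for allowed in "${ALLOWED_ROLES[@]}"; do
    [[ $role == "$allowed" ]] && return 0
  done
  echo "Invalid role '$role' (allowed: ${ALLOWED_ROLES[*]})" >&2
  return 1
}

# Check length and basic composition. The password itself is never printed.
validate_password() {
  local password=$1
  if (( ${#password} < MIN_PASSWORD_LENGTH )); then
    echo "Password too short (min $MIN_PASSWORD_LENGTH characters)" >&2
    return 1
  fi
  # Require at least one letter and one digit
  if [[ ! $password =~ [[:alpha:]] || ! $password =~ [[:digit:]] ]]; then
    echo "Password must contain letters and digits" >&2
    return 1
  fi
  return 0
}
```

Comparing against an explicit list is a whitelist: anything not named is rejected, so there is no need to enumerate dangerous characters.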

11.5.3.4. Example Validation

validate_username() {
  local user=$1
  
  # Check length
  [[ ${#user} -lt 3 ]] && {
    echo "Username too short (min 3)" >&2
    return 1
  }
  
  # Check for valid chars only: must start with a lowercase letter or _
  [[ $user =~ ^[a-z_][a-z0-9_-]*$ ]] || {
    echo "Invalid username (lowercase letters, digits, _ and - only)" >&2
    return 1
  }
  
  return 0
}

11.5.3.5. Testing

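# Test with invalid inputs, fed on stdin in prompt order (username, password, role)
printf '%s\n' "user name" "Str0ng-Passw0rd" "admin" | ./create_user.sh   # Should reject space
printf '%s\n' "a" "Str0ng-Passw0rd" "admin" | ./create_user.sh           # Should reject short username
printf '%s\n' "user" "weak" "admin" | ./create_user.sh                   # Should reject weak password
printf '%s\n' "user" "Str0ng-Passw0rd" "invalid" | ./create_user.sh      # Should reject role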

11.5.4. Lab Exercise 3: Trace and Debug a Failing Script

A complex script fails intermittently without clear error messages. Use debugging techniques to identify the problem.

11.5.4.1. Failing Script (database migration)

#!/bin/bash

migrate_data() {
  local source_db=$1
  local dest_db=$2
  
  # Export from source
  export_file="/tmp/migration_$$.sql"
  mysqldump $source_db > $export_file
  
  # Verify export
  wc -l $export_file
  
  # Import to destination
  mysql $dest_db < $export_file
  
  # Verify counts match
  query="SELECT COUNT(*) FROM table1"
  src_count=$(mysql $source_db -se "$query")
  dst_count=$(mysql $dest_db -se "$query")
  
  if [[ $src_count != $dst_count ]]; then
    echo "Migration failed"
    exit 1
  fi
}

migrate_data "production" "staging"

11.5.4.2. Issues to Investigate

  1. Silent failures in mysqldump/mysql calls

  2. Exit code not propagated

  3. Missing database credentials (empty output)

  4. No error context if query fails

  5. Temp file not cleaned up

  6. No logging of what completed
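
For the temp file issue, one possible pattern is to create the file with mktemp (instead of the predictable /tmp/migration_$$.sql) and remove it from an EXIT trap inside a subshell. The sketch below assumes a hypothetical helper named run_with_temp_file; it is not part of the failing script.

```bash
#!/bin/bash
# Sketch only: run_with_temp_file is a hypothetical helper name.

# run_with_temp_file COMMAND [ARGS...]
# Creates a temp file, runs COMMAND with the file path appended as the last
# argument, and removes the file whether COMMAND succeeds or fails.
run_with_temp_file() {
  (
    export_file=$(mktemp "${TMPDIR:-/tmp}/migration.XXXXXX") || exit 1
    trap 'rm -f "$export_file"' EXIT   # fires when this subshell exits
    "$@" "$export_file"
  )
}
```

The subshell's exit status is that of COMMAND, so the caller can still react to a failed export or import.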

11.5.4.3. Requirements

Use debugging tools to:

  1. Add set -x to trace execution

  2. Check each command’s exit code

  3. Verify variable values at each step

  4. Add strategic echo statements

  5. Capture and display error output

  6. Create a debug log file
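
Several of these steps (checking exit codes, capturing error output, writing a debug log) can be combined in one wrapper. The following is a minimal sketch; DEBUG_LOG, log_debug, and run_step are illustrative names, and the PS4 setting only changes how set -x traces are labelled.

```bash
#!/bin/bash
# Sketch only: DEBUG_LOG, log_debug, and run_step are illustrative names.

DEBUG_LOG=${DEBUG_LOG:-debug.log}

# Make `set -x` traces show file, line, and function for each command.
PS4='+ ${BASH_SOURCE##*/}:${LINENO}:${FUNCNAME[0]:-main}: '

log_debug() {
  printf '[%s] [DEBUG] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >>"$DEBUG_LOG"
}

# run_step DESCRIPTION COMMAND [ARGS...]
# Runs the command, records its exit code and stderr in the debug log,
# and returns that exit code so the caller can stop on failure.
run_step() {
  local description=$1
  shift
  local err_file status=0
  err_file=$(mktemp)

  log_debug "START: $description ($*)"
  "$@" 2>"$err_file" || status=$?

  if (( status != 0 )); then
    log_debug "FAILED: $description (exit $status): $(<"$err_file")"
    echo "[ERROR] $description failed with exit code $status" >&2
  else
    log_debug "OK: $description"
  fi
  rm -f "$err_file"
  return "$status"
}
```

Because run_step leaves the command's stdout untouched, a call such as run_step "Export" mysqldump "$source_db" > "$export_file" || exit 1 still writes the dump to the file.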

11.5.4.4. Suggested Debugging Approach

# Enable debugging
set -x

# Add debug output
echo "[DEBUG] source_db=$source_db, dest_db=$dest_db" >&2

# Check return codes
mysqldump "$source_db" > "$export_file" || {
  echo "[ERROR] Dump failed: $?" >&2
  exit 1
}

# Verify output
[[ -s "$export_file" ]] || {
  echo "[ERROR] Export file is empty" >&2
  exit 1
}

11.5.4.5. Testing

# Run with debugging
bash -x migrate.sh 2>&1 | tee debug.log

# Analyze the log to find where it fails

11.5.5. Lab Exercise 2: Add Error Handling to Script

Take a fragile script and make it production-ready with proper error handling.

11.5.5.1. Original Fragile Script

#!/bin/bash

# Simple backup script
backup_dir="/backups"
source_dir="/var/www"
db_host="localhost"
db_name="production"

mkdir -p $backup_dir
tar -czf $backup_dir/web-backup.tar.gz $source_dir
mysqldump -h $db_host -u root $db_name > $backup_dir/db-backup.sql
echo "Backup complete"

11.5.5.2. Issues to Address

  1. No validation of prerequisites (dirs exist, tools available, permissions)

  2. No error checking on critical operations

  3. Unquoted variables

  4. No cleanup on failure (partial backups)

  5. No logging

  6. No verification of backups

  7. No handling if the script is interrupted

11.5.5.3. Requirements

Enhance the script with:

  • Strict mode (set -euo pipefail)

  • Input validation (dirs exist, permissions)

  • Error handlers with logging

  • Cleanup function

  • Backup verification

  • Lock file to prevent concurrent runs

  • Informative error messages
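
A skeleton covering strict mode, the lock, cleanup, logging, and archive verification might look like the sketch below. All paths, variable names (LOCK_DIR, LOG_FILE, BACKUP_DIR), and function names are assumptions for illustration, and the database dump is left out to keep the example short.

```bash
#!/bin/bash
# Sketch only: paths, variable names, and function names are assumptions.
set -euo pipefail

LOCK_DIR=${LOCK_DIR:-/tmp/backup.lock}
LOG_FILE=${LOG_FILE:-/tmp/backup.log}

log() {
  printf '[%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >>"$LOG_FILE"
}

# mkdir is atomic: exactly one concurrent caller can create the directory.
acquire_lock() {
  if mkdir "$LOCK_DIR" 2>/dev/null; then
    log "Lock acquired: $LOCK_DIR"
  else
    echo "Another backup is already running (lock: $LOCK_DIR)" >&2
    return 1
  fi
}

release_lock() {
  rmdir "$LOCK_DIR" 2>/dev/null || true
}

# create_archive SOURCE_DIR ARCHIVE
# Writes to a temporary name and renames only after verification,
# so a failed run never leaves a partial archive under the final name.
create_archive() {
  local source_dir=$1 archive=$2
  local partial="$archive.partial"

  [[ -d $source_dir && -r $source_dir ]] || {
    echo "Source directory missing or unreadable: $source_dir" >&2
    return 1
  }

  tar -czf "$partial" -C "$source_dir" . || { rm -f "$partial"; return 1; }
  tar -tzf "$partial" >/dev/null         || { rm -f "$partial"; return 1; }
  mv "$partial" "$archive"
  log "Archive verified: $archive"
}

main() {
  local source_dir=${1:?Usage: backup.sh SOURCE_DIR}
  acquire_lock
  trap release_lock EXIT   # runs on normal exit and when set -e aborts the script
  trap 'exit 130' INT      # Ctrl+C: exit 130, which also triggers the EXIT trap
  create_archive "$source_dir" "${BACKUP_DIR:-/backups}/web-backup.tar.gz"
}
```

To use the skeleton as a script, add main "$@" as the last line. The traps are installed only after the lock is acquired, so a second instance that fails to get the lock does not remove the first instance's lock on exit.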

11.5.5.4. Testing

# Test with missing directory
./backup.sh /nonexistent

# Test successful backup
mkdir -p /tmp/www
./backup.sh /tmp/www

# Test interrupt handling (Ctrl+C)
# Should clean up the lock file and temp files

11.5.5.5. Hints

  • Use trap for cleanup and error handling

  • Verify tar integrity after creation

  • Check if mysqldump succeeded before using file

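  • Use a lock to prevent concurrent runs, and create it atomically, e.g. mkdir "$lock_dir" || exit 1; a plain [[ -f $lock ]] && exit 1 check leaves a race between testing for the file and creating it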

  • Log all actions to file with timestamps

11.5.6. Lab Exercise 1: Debugging a Broken Script

A script has multiple bugs causing failures. Your task is to identify and fix them.

11.5.6.1. Broken Script

#!/bin/bash

# Process files in directory
process_files() {
  local dir=$1
  
  for file in $dir/*.txt
  do
    echo "Processing $file"
    grep "ERROR" $file || break
    cat $file | grep "WARN" | sort | uniq
  done
  
  echo "Processing complete"
}

process_files /var/log

11.5.6.2. Issues to Find and Fix

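  1. Missing quotes: $dir and $file unquoted (breaks with spaces)

  2. Wrong error handling: || break stops the loop at the first file without a match (should be || continue)

  3. Unnecessary cat: grep can read the file directly (useless use of cat)

  4. No error messages: If the directory doesn’t exist, the script never says so; the loop runs once on the literal pattern and still prints "Processing complete"

  5. Missing status check: process_files always returns 0 (the status of the final echo), even when processing failed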

11.5.6.3. Requirements

  • Add proper error handling with set -euo pipefail

  • Quote all variables

  • Add validation that directory exists

  • Fix the grep logic so one file without a match does not end the loop

  • Remove the unnecessary cat

  • Add helpful error messages
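
One pitfall when combining these requirements: under set -e, a grep that finds nothing exits with status 1 and aborts the script. The sketch below shows one way to structure the loop so that strict mode, quoting, and directory validation work together. The function name scan_logs is illustrative, and this is only one of several acceptable fixes.

```bash
#!/bin/bash
# Sketch only: the function name scan_logs is illustrative.
set -euo pipefail

scan_logs() {
  local dir=${1:-}

  if [[ ! -d $dir ]]; then
    echo "Error: '$dir' is not a directory" >&2
    return 1
  fi

  local file found=0
  for file in "$dir"/*.txt; do
    [[ -e $file ]] || continue   # an unmatched glob stays literal; skip it
    found=1
    echo "Processing $file"
    # grep exits 1 when nothing matches; `|| true` keeps strict mode from aborting.
    # Note that it also hides real grep errors such as an unreadable file.
    grep "ERROR" "$file" || true
    { grep "WARN" "$file" || true; } | sort -u
  done

  (( found )) || echo "No .txt files found in $dir" >&2
  echo "Processing complete"
}
```

sort -u replaces the sort | uniq pair with the same result.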

11.5.6.4. Testing

# Create test environment
mkdir -p /tmp/test_logs
echo "ERROR in line 1" > /tmp/test_logs/test1.txt
echo "WARN in line 2" > /tmp/test_logs/test2.txt

# Run fixed script
./process_files.sh /tmp/test_logs

11.5.6.5. Hints

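  • Use set -euo pipefail at top; remember that grep exits 1 when nothing matches, which aborts the script under set -e unless you handle it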

  • Always quote variables: "$var"

  • Check file/directory existence with [[ -d ]]

  • Test with both valid and invalid directories