9.2. Advanced Redirection
9.2.1. Common Pitfalls
1. Using cat unnecessarily
# Inefficient: useless use of cat
cat file.txt | grep pattern
# Better: let grep read the file
grep pattern file.txt
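2. Forgetting that a pipeline reports only the last command's exit status
# The pipeline looks successful whenever tee succeeds, even if command fails
command | tee output.log
echo $? # Shows status of tee, not command!
# Better: check PIPESTATUS (a bash array) right after the pipeline,
# or enable "set -o pipefail" so any failing stage fails the pipeline
command | tee output.log
echo "${PIPESTATUS[0]}" # Status of first command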
3. Expecting xargs to preserve argument grouping
# Problem: xargs splits its input on whitespace
echo "arg1 arg2" | xargs command # Runs: command arg1 arg2
# Fix: use -I so that each input line becomes a single argument
echo "arg1 arg2" | xargs -I {} command {} # Runs: command "arg1 arg2"
4. Forgetting null delimiter with filenames
# Breaks on filenames with spaces
find . -name "*.txt" | xargs rm
# Better: use null delimiter
find . -name "*.txt" -print0 | xargs -0 rm
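For pitfall 2 above, one way to keep the first command's status is to wrap the pipeline in a small function. This is a minimal sketch, not the only approach: `run_logged` is a hypothetical helper name introduced here, and the code assumes bash, since `PIPESTATUS` is a bash feature.

```bash
#!/bin/bash
# run_logged (hypothetical helper): run a command, copy its output to a log
# file with tee, and return the command's own exit status instead of tee's.
run_logged() {
    local log_file=$1
    shift
    "$@" | tee "$log_file"
    # PIPESTATUS is expanded before return runs, so it still describes
    # the pipeline on the previous line.
    return "${PIPESTATUS[0]}"
}

# Alternative: with pipefail, the pipeline fails if any command in it fails.
# The subshell keeps the option from leaking into the rest of the script.
if ! ( set -o pipefail; false | tee /dev/null ); then
    echo "pipeline failed"
fi
```

A caller can then write `if ! run_logged build.log some_command; then ...; fi` (with `some_command` standing in for a real command) and react to the real failure.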
9.2.2. Real-World Example: Data Processing Pipeline
#!/bin/bash
# Process a web server log file
# (field numbers assume the Common Log Format: $1 = IP, $4 = timestamp, $9 = status)
process_logs() {
    local log_file=$1
    local output_dir=$2
    mkdir -p "$output_dir"
    # Read the log once: tee feeds three side analyses while the
    # main pipeline counts requests per day
    tee >(awk '{print $1}' | sort | uniq -c | sort -rn > "$output_dir/ip_count.txt") \
        >(grep -c "error" > "$output_dir/error_count.txt") \
        >(awk '{print $9}' | sort | uniq -c > "$output_dir/http_codes.txt") \
        < "$log_file" |
        awk '{print $4}' |
        cut -d: -f1 |
        sort | uniq -c > "$output_dir/request_time_count.txt"
    # Summary. The >(...) processes run asynchronously and may still be
    # writing their files at this point, so compute the totals from the log itself
    echo "Log processing complete:"
    echo "  Total requests: $(wc -l < "$log_file")"
    echo "  Unique IPs: $(awk '{print $1}' "$log_file" | sort -u | wc -l)"
    echo "  Errors: $(grep -c "error" "$log_file")"
}
process_logs access.log analysis/
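To see what the individual stages of such a pipeline produce, it can help to run them on a tiny log with known contents. The sketch below assumes Common Log Format lines; the three sample entries are invented for illustration (the addresses come from the 192.0.2.0/24 documentation range), and each analysis runs as a separate, ordinary pipeline rather than through tee.

```bash
#!/bin/bash
# Sample data invented for this sketch.
workdir=$(mktemp -d)
cat > "$workdir/access.log" <<'EOF'
192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
192.0.2.2 - - [10/Oct/2023:13:56:01 +0000] "GET /missing HTTP/1.1" 404 153
192.0.2.1 - - [11/Oct/2023:09:12:44 +0000] "POST /login HTTP/1.1" 200 512
EOF

# Busiest client: field 1 is the IP address.
top_ip=$(awk '{print $1}' "$workdir/access.log" |
    sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')

# Responses per status code: field 9 is the HTTP status.
status_counts=$(awk '{print $9}' "$workdir/access.log" |
    sort | uniq -c | awk '{print $2 "=" $1}')

# Requests per day: field 4 looks like "[10/Oct/2023:13:55:36".
day_counts=$(awk '{print $4}' "$workdir/access.log" |
    cut -d: -f1 | cut -c 2- | sort | uniq -c | awk '{print $2 "=" $1}')

echo "top ip: $top_ip"
echo "$status_counts"
echo "$day_counts"
```

Checking each stage on small input like this makes it easier to trust the combined version that reads the log only once.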
9.2.3. Tee: Splitting Output to Multiple Destinations
The tee command copies its stdin to stdout and to one or more files:
9.2.3.1. Basic tee Usage
# Display and save output
ls /tmp | tee file_list.txt
# Append instead of overwriting
du -sh * | tee -a disk_usage.log
# Write to multiple files
date | tee log1.txt log2.txt
# To save to a file without screen output, plain redirection is enough (no tee needed)
long_command > output.log
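The difference between overwriting and appending is easy to check in a scratch directory. The following is a small sketch; the file names and the `workdir` variable are made up for the demonstration.

```bash
#!/bin/bash
workdir=$(mktemp -d)

# Without -a, tee truncates the file on every run.
echo "first run"  | tee "$workdir/overwrite.log" > /dev/null
echo "second run" | tee "$workdir/overwrite.log" > /dev/null

# With -a, each run adds to the end of the file.
echo "first run"  | tee -a "$workdir/append.log" > /dev/null
echo "second run" | tee -a "$workdir/append.log" > /dev/null

# The data still flows to stdout, so it can be captured or piped onward.
shown=$(echo "both places" | tee "$workdir/copy.log")

echo "overwrite.log has $(wc -l < "$workdir/overwrite.log") line(s)"
echo "append.log has $(wc -l < "$workdir/append.log") line(s)"
```

Here stdout is sent to /dev/null only to keep the demonstration quiet; in normal use that stream is what appears on the screen.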
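9.2.3.2. Pipe to Multiple Commands
# Process output in parallel paths
# (each >(...) inherits tee's stdout, so redirect its output or it will flow into grep too)
tee >(sort > sorted.txt) >(wc -l > line_count.txt) < data.txt | grep pattern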
# Save raw output and processed versions
command | tee raw.log | grep "ERROR" > errors.log
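The "raw plus processed" pattern can be tried with a stand-in for the real command. In this sketch `generate_events` is a hypothetical function that prints four made-up log lines; everything else is the same pipeline shape as above.

```bash
#!/bin/bash
workdir=$(mktemp -d)

# generate_events: hypothetical stand-in for a real command's output.
generate_events() {
    printf '%s\n' "INFO start" "ERROR disk full" "INFO done" "ERROR timeout"
}

# tee keeps the complete stream; grep keeps only the errors.
generate_events | tee "$workdir/raw.log" | grep "ERROR" > "$workdir/errors.log"

raw_lines=$(wc -l < "$workdir/raw.log")
error_lines=$(wc -l < "$workdir/errors.log")
echo "raw: $raw_lines, errors: $error_lines"
```

Because every stage here is an ordinary pipeline member, the shell waits for all of them, so both files are complete when the next line runs. That is not guaranteed for commands started with >(...), which run asynchronously.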
9.2.4. Xargs: Passing Output as Arguments
The xargs command reads items from stdin and passes them as arguments to another command:
9.2.4.1. Basic xargs Usage
# Find files and pass them as arguments to a command
find . -name "*.log" | xargs rm
# Rename files, including names that contain spaces
find . -name "*.txt" -print0 | xargs -0 -I {} mv {} {}.bak
# Parallel processing (run 4 jobs at a time)
find . -name "*.jpg" -print0 | xargs -0 -P 4 -I {} convert {} {}.png
9.2.4.2. Handling Special Characters
# Use null delimiter for filenames with spaces
find . -name "*.txt" -print0 | xargs -0 wc -l
# Print each command line before executing it (-t)
find . -name "*.old" -print0 | xargs -0 -t rm
# Confirm before each execution
find . -name "*.tmp" -print0 | xargs -0 -p rm
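The effect of the null delimiter can be observed without deleting anything by counting how many arguments xargs sees. This sketch creates two throwaway files, one with a space in its name, and assumes the temporary directory path itself contains no whitespace.

```bash
#!/bin/bash
workdir=$(mktemp -d)
touch "$workdir/plain.txt" "$workdir/with space.txt"

# Whitespace splitting: "with space.txt" is seen as two separate arguments.
naive_count=$(find "$workdir" -name "*.txt" | xargs -n 1 echo | wc -l)

# Null-delimited: each filename arrives as exactly one argument.
safe_count=$(find "$workdir" -name "*.txt" -print0 | xargs -0 -n 1 echo | wc -l)

echo "arguments without -0: $naive_count"
echo "arguments with -0:    $safe_count"
```

With rm in place of echo, the first form would try to remove two paths that do not exist and leave the real file behind.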
9.2.4.3. Common xargs Options
# -I {} : Run the command once per input line, replacing {} with that line
printf '%s\n' file1 file2 file3 | xargs -I {} echo "Processing: {}"
# -n 2 : Pass 2 arguments per execution
echo -e "1\n2\n3\n4\n5" | xargs -n 2 echo
# -P 0 : Run as many jobs in parallel as possible (GNU xargs)
find . -name "*.txt" -print0 | xargs -0 -P 0 -I {} process_file {}
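Capturing the output makes the grouping behavior of `-n` and `-I` visible. A short sketch using only echo, so nothing on disk is touched; the input strings are arbitrary.

```bash
#!/bin/bash
# -n 2: at most two arguments per invocation, so five inputs need three runs.
pairs=$(printf '%s\n' 1 2 3 4 5 | xargs -n 2 echo)

# -I {}: one invocation per input line; spaces inside a line are preserved.
labelled=$(printf '%s\n' "file one" "file two" | xargs -I {} echo "Processing: {}")

echo "$pairs"
echo "$labelled"
```

The last invocation under `-n 2` simply receives the single leftover argument.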
9.2.5. Process Substitution
Process substitution allows you to use command output as if it were a file:
9.2.5.1. Using <() for Input
# Compare output of two commands
diff <(ls dir1) <(ls dir2)
# Combine multiple outputs into one command
sort -m <(sort file1.txt) <(sort file2.txt)
# Pass command output where a file is expected
# (-T makes tar read the list of names to archive from that "file")
tar -czf backup.tar.gz -T <(find . -name "*.txt" -o -name "*.md")
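Any command that expects file arguments can read from `<(...)`. As a sketch, `comm` (which compares two sorted files) can report which names two directories share; the directory layout below is invented for the demonstration, and bash is assumed because process substitution is not part of POSIX sh.

```bash
#!/bin/bash
workdir=$(mktemp -d)
mkdir "$workdir/dir1" "$workdir/dir2"
touch "$workdir/dir1/a.txt" "$workdir/dir1/b.txt"
touch "$workdir/dir2/b.txt" "$workdir/dir2/c.txt"

# comm -12 prints only the lines present in both sorted inputs.
common=$(comm -12 <(ls "$workdir/dir1" | sort) <(ls "$workdir/dir2" | sort))

# comm -23 prints only the lines unique to the first input.
only_first=$(comm -23 <(ls "$workdir/dir1" | sort) <(ls "$workdir/dir2" | sort))

echo "in both: $common"
echo "only in dir1: $only_first"
```

Without process substitution, the same comparison would need two temporary files and cleanup afterwards.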
9.2.5.2. Using >() for Output
# Write output of one command to multiple destinations
tee >(command1) >(command2) < input.txt
# Example: display output while saving a full log and a filtered error log
long_running_command | tee >(cat > output.log) >(grep "ERROR" > errors.log)
9.2.5.3. Practical Use Case
# Compare server configurations
diff <(ssh server1 "cat /etc/config.conf") <(ssh server2 "cat /etc/config.conf")
9.2.6. Pipes: Connecting Commands
The pipe operator | sends the stdout of one command to the stdin of another.
9.2.6.1. Basic Piping
# Count the number of files in /tmp
ls /tmp | wc -l
# Find lines containing "error"
cat logfile.txt | grep "error"
# Sort and count unique values
cat data.txt | sort | uniq -c | sort -rn
9.2.6.2. Chaining Multiple Pipes
# Extract usernames from /etc/passwd, sort, remove duplicates
cut -d: -f1 /etc/passwd | sort | uniq
# Find the 10 largest files, showing size and name
# (size must come first for sort -rn; parsing ls output breaks on names with spaces)
ls -lR / | grep "^-" | awk '{print $5, $NF}' | sort -rn | head -10
# Count occurrences of each word
cat document.txt | tr ' ' '\n' | grep -v '^$' | sort | uniq -c | sort -rn
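The word-count chain can be wrapped in a function so it is reusable. This is one possible variant rather than the exact pipeline above: `word_freq` is a hypothetical name, and the sketch additionally strips punctuation and folds case so that "The" and "the" are counted together.

```bash
#!/bin/bash
# word_freq (hypothetical helper): print "count word" lines for a file,
# most frequent word first.
word_freq() {
    tr -cs '[:alnum:]' '\n' < "$1" |   # every run of non-alphanumerics becomes one newline
        tr '[:upper:]' '[:lower:]' |   # fold case
        grep -v '^$' |                 # drop any empty line at the start
        sort | uniq -c | sort -rn
}

sample=$(mktemp)
printf 'the cat and the hat\nThe end.\n' > "$sample"
most_common=$(word_freq "$sample" | head -n 1 | awk '{print $2, $1}')
echo "most common word and count: $most_common"
```

Each stage does one small job, which is what makes long pipelines like this easy to adjust: removing the second `tr` restores case-sensitive counting.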
9.2.6.3. Avoiding Unnecessary Pipes
# Inefficient: three processes just to count matching lines
cat file.txt | grep pattern | wc -l
# Better: let grep do the counting
grep -c pattern file.txt
# Inefficient: unnecessary cat
cat file.txt | sort
# Better: redirect directly
sort < file.txt
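The shorter forms give the same results as the piped ones, which can be confirmed on a small file. The sketch below uses made-up fruit names as data; it also shows that `grep -c` counts matching lines, not individual occurrences.

```bash
#!/bin/bash
workdir=$(mktemp -d)
printf 'apple\nbanana\napple apple pie\n' > "$workdir/file.txt"

# Both forms count lines that contain the pattern.
piped=$(cat "$workdir/file.txt" | grep apple | wc -l)
direct=$(grep -c apple "$workdir/file.txt")

# Counting every occurrence needs one match per line (-o) and then wc.
occurrences=$(grep -o apple "$workdir/file.txt" | wc -l)

echo "piped: $piped, direct: $direct, occurrences: $occurrences"
```

So a pipe into `wc -l` is still the right tool when the thing being counted is not "matching lines".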