4.1. Everything Is A File
One of Unix’s most powerful principles: everything is a file. Regular files, directories, devices, and many network connections and process details are all accessed through the same file interface. More importantly for this chapter: most of the data you will handle is text, or can be converted to text.
This philosophy enables the Unix approach: small, focused tools that read text input and write text output, which can be chained together.
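As a small illustration of the file interface, the sketch below reads from a device (`/dev/null`) and from a pipe with the same redirection syntax used for regular files. The variable names are illustrative, and `tr -d ' '` is only there because some `wc` implementations pad their output with spaces.

```shell
# /dev/null is a device, yet it is opened and read like a regular file
bytes=$(wc -c < /dev/null | tr -d ' ')
echo "bytes read from /dev/null: $bytes"

# A pipe is also read like a file: printf's stdout becomes wc's stdin
lines=$(printf 'a\nb\nc\n' | wc -l | tr -d ' ')
echo "lines read from the pipe: $lines"
```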
4.1.1. Common Mistakes with Text Data
4.1.1.1. ❌ Mixing binary and text
# DON'T: Try to grep a binary file
$ grep "text" /bin/ls
Binary file /bin/ls matches
# DO: Use tools meant for binary data
$ strings /bin/ls | grep "text"
# Extract text strings first
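If `strings` is not installed (it usually ships with binutils), a rough alternative is to drop non-printable bytes with `tr` before searching. The sketch below builds a small hypothetical file containing NUL and control bytes; the file contents and variable names are made up for illustration.

```shell
sample=$(mktemp)
# Two lines: one with embedded binary bytes, one plain
printf 'hello\000\001world\nplain text line\n' > "$sample"

# Keep only printable characters and newlines, then search as usual
matches=$(LC_ALL=C tr -cd '[:print:]\n' < "$sample" | grep -c "text")
echo "matching lines: $matches"
rm -f "$sample"
```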
4.1.1.2. ❌ Assuming order in pipelines
$ sort file.txt | grep "pattern"
# This works, and the output is still sorted (grep preserves line order)
# But sort has to process every line in file.txt,
# including the lines grep is about to discard
Fix:
$ grep "pattern" file.txt | sort
# Same output, but sort only handles the matching lines
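One way to check how stage order affects a pipeline is to run both orderings on a small sample and compare. This sketch uses a hypothetical four-line file; in this case the output is identical either way, and the difference is only in how many lines `sort` has to handle.

```shell
sample=$(mktemp)
printf 'pear pattern\napple\nbanana pattern\ncherry\n' > "$sample"

sorted_then_filtered=$(sort "$sample" | grep "pattern")
filtered_then_sorted=$(grep "pattern" "$sample" | sort)

# Count how many lines sort receives in each ordering
sort_input_all=$(wc -l < "$sample" | tr -d ' ')
sort_input_filtered=$(grep -c "pattern" "$sample")

[ "$sorted_then_filtered" = "$filtered_then_sorted" ] && echo "same output"
echo "sort handles $sort_input_all lines vs $sort_input_filtered lines"
rm -f "$sample"
```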
4.1.1.3. ❌ Forgetting about stderr
$ grep "pattern" file.txt > output.txt
# Error messages still print to screen!
# If file.txt doesn't exist, error goes to stderr (not captured)
Fix:
$ grep "pattern" file.txt > output.txt 2> errors.txt
4.1.2. Common Pitfalls
4.1.3. Key Principles
| Principle | Meaning | Example |
|---|---|---|
| Do one thing | Each tool has a focused purpose | |
| Read text | Input is text/lines | All examples read line-based data |
| Write text | Output is text/lines | All examples output text |
| Chain them | Pipe output to input | |
| No assumptions | Don’t assume anything about data format | Tools work with any text |
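These principles also apply to tools you write yourself. A minimal sketch, using two hypothetical shell functions (`first_word` and `to_upper` are invented names): each reads lines on stdin, does one job, and writes lines to stdout, so they chain like any other command.

```shell
# Each function is a filter: stdin in, stdout out, one job only
first_word() { cut -d' ' -f1; }
to_upper()   { tr '[:lower:]' '[:upper:]'; }

result=$(printf 'alice smith\nbob jones\n' | first_word | to_upper)
printf '%s\n' "$result"
```

Because neither function knows about the other, either one can be reused in a different pipeline without changes.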
4.1.4. Text Processing Chain Example
Real-world scenario: Count lines containing “ERROR” in a log file:
# Get lines with ERROR, count them
$ grep "ERROR" /var/log/app.log | wc -l
1523
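# Count occurrences of each distinct error message
# (assumes the message is the third colon-separated field; adjust -f for your log format)
$ grep "ERROR" /var/log/app.log | cut -d: -f3 | sort | uniq -c
42 Connection timeout
203 Database error
189 Invalid request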
...
# Find errors logged between 10:00 and 19:59 on 2025-01-15, count by type
$ grep "ERROR" /var/log/app.log | grep "2025-01-15 1[0-9]:" | cut -d: -f3 | sort | uniq -c
5 Connection timeout
8 Database error
21 Invalid request
Each tool does one thing well. Combined, they’re powerful.
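The `cut -d: -f3` step depends on where the message sits in each line, so it has to be adapted to the log format at hand. The sketch below assumes a hypothetical space-separated format (date, time, level, message); the sample lines are invented for illustration and are not taken from `/var/log/app.log`.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
2025-01-15 09:58:01 INFO Service started
2025-01-15 10:02:13 ERROR Connection timeout
2025-01-15 10:05:40 ERROR Database error
2025-01-15 10:07:22 WARN Slow response
2025-01-15 11:15:09 ERROR Connection timeout
EOF

# Keep ERROR lines, take field 4 onward (the message), count each message,
# then order by count with the most frequent first
summary=$(grep ' ERROR ' "$log" | cut -d' ' -f4- | sort | uniq -c | sort -rn)
printf '%s\n' "$summary"

# Strip the leading count from the first line to get the message alone
most_common=$(printf '%s\n' "$summary" | head -n 1 | sed 's/^ *[0-9]* //')
echo "most common: $most_common"
rm -f "$log"
```

Adding `sort -rn` after `uniq -c` is a common way to rank the results by count instead of alphabetically.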
4.1.5. Combining Input and Output
You can send each stream to its own destination, or merge them into one:
# Redirect stdout to a file and send stderr to the same place
$ ./my_script.sh > output.txt 2>&1
# Both stdout and stderr go to output.txt
# Redirect stdout to file, save stderr separately
$ ./my_script.sh > output.txt 2> errors.txt
# stdout → output.txt
# stderr → errors.txt
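Redirection and piping can be combined in one command, and the order of redirections matters. A minimal sketch, using a hypothetical function `emit` as a stand-in for a real command that writes to both streams:

```shell
# Hypothetical stand-in: one line to stdout, one line to stderr
emit() { echo "data line"; echo "problem" >&2; }

# Pipe stdout onward while saving stderr to a file
err=$(mktemp)
count=$(emit 2> "$err" | wc -l | tr -d ' ')
saved=$(cat "$err")
echo "stdout lines piped to wc: $count"
echo "stderr saved to file: $saved"

# "> file 2>&1" sends both streams to the file
both=$(mktemp)
emit > "$both" 2>&1
both_lines=$(wc -l < "$both" | tr -d ' ')
echo "lines captured with '> file 2>&1': $both_lines"

# "2>&1 > file" does not: stderr is pointed at the old stdout first,
# so only stdout ends up in the file
only=$(mktemp)
leaked=$(emit 2>&1 > "$only")
only_content=$(cat "$only")
echo "not captured in file: $leaked"

rm -f "$err" "$both" "$only"
```

Redirections are processed left to right, which is why `2>&1` has to come after `> file` when the goal is to capture both streams.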
4.1.5.1. Why This Matters
Without pipes, you’d need:
# BAD: Create intermediate files
$ cat data.csv > temp1.txt
$ grep "error" temp1.txt > temp2.txt
$ wc -l temp2.txt
12345 temp2.txt
# GOOD: Use pipes
$ cat data.csv | grep "error" | wc -l
12345
# No intermediate files, faster, cleaner
4.1.6. Pipes: Connecting Streams
A pipe (|) connects stdout of one command to stdin of another:
$ cat data.csv | grep "error" | wc -l
  ↑              ↑              ↑
  produces       reads input    reads input
  output         from pipe      from pipe
The data flows: cat → grep → wc
Each command processes the data and passes it to the next.
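Because data passes straight from one stage to the next, intermediate results are normally never seen. One common way to inspect them is `tee`, which copies its stdin to a file and also passes it along. A small sketch with made-up input lines:

```shell
mid=$(mktemp)

# tee saves what grep produced, and still feeds it to wc
count=$(printf 'ok\nerror: disk\nok\nerror: net\n' | grep "error" | tee "$mid" | wc -l | tr -d ' ')

echo "final count: $count"
echo "intermediate data:"
cat "$mid"
```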
4.1.6.1. Viewing Streams in Action
# Normal output goes to stdout
$ echo "Hello"
Hello
# Error message goes to stderr
$ ls /nonexistent
ls: cannot access '/nonexistent': No such file or directory
# Separate stdout and stderr
$ ls /nonexistent > out.txt 2> err.txt
$ cat out.txt
# (empty - no stdout)
$ cat err.txt
ls: cannot access '/nonexistent': No such file or directory
4.1.6.2. Example: A Command’s Streams
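$ grep "error" logfile.txt > matches.txt 2> errors.log
               ↑             ↑              ↑
               input file    stdout         stderr
               (argument)    (to file)      (to file)
# stdin is unused here: grep opens logfile.txt itself because it was given as an argument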
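Whether a command is given a file name or reads stdin can change its output. A short sketch using `wc -l` on a hypothetical temporary file: with a file argument `wc` prints the name, while on stdin it has no name to print.

```shell
sample=$(mktemp)
printf 'one\ntwo\nthree\n' > "$sample"

# File argument: wc opens the file itself and reports its name
with_arg=$(wc -l "$sample")

# stdin: the shell opens the file, wc only sees an anonymous stream
from_stdin=$(wc -l < "$sample" | tr -d ' ')

echo "argument: $with_arg"
echo "stdin:    $from_stdin"
```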
4.1.7. Standard Streams
Every command has three standard streams:
| Stream | Name | Use | Descriptor |
|---|---|---|---|
| stdin | Standard Input | Data coming IN to the command | 0 |
| stdout | Standard Output | Normal output FROM the command | 1 |
| stderr | Standard Error | Error messages FROM the command | 2 |
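The descriptor numbers show up directly in redirection syntax. A minimal sketch with a hypothetical function `check` that reads descriptor 0, writes accepted lines to descriptor 1, and writes complaints to descriptor 2:

```shell
# Hypothetical filter: passes non-empty lines through, reports empty ones
check() {
  while IFS= read -r line; do          # read uses stdin (descriptor 0)
    if [ -n "$line" ]; then
      printf '%s\n' "$line" >&1        # stdout (descriptor 1); >&1 is the default
    else
      echo "skipping empty line" >&2   # stderr (descriptor 2)
    fi
  done
}

errors=$(mktemp)
kept=$(printf 'a\n\nb\n' | check 2> "$errors")
reported=$(cat "$errors")
printf 'kept:\n%s\n' "$kept"
echo "reported: $reported"
rm -f "$errors"
```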
4.1.8. Text Philosophy
In Unix, text is more than “lines of characters.” It often carries structured data:
name,age,city
Alice,28,Portland
Bob,35,Seattle
Charlie,42,San Francisco
Same data, different format:
name: Alice
age: 28
city: Portland
name: Bob
age: 35
city: Seattle
The tools don’t care about the format. They just read text, transform it, and output text.
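Converting between the two layouts above is itself a small text transformation. A sketch using `awk` (the variable names are illustrative): the header row supplies the keys, and every later row is printed as `key: value` lines.

```shell
csv='name,age,city
Alice,28,Portland
Bob,35,Seattle'

records=$(printf '%s\n' "$csv" | awk -F, '
  NR == 1 { for (i = 1; i <= NF; i++) key[i] = $i; next }    # remember header names
          { for (i = 1; i <= NF; i++) print key[i] ": " $i }  # one line per field
')
printf '%s\n' "$records"
```

This simple split on commas assumes no field contains a quoted comma; real CSV with quoting needs a proper parser.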