4.1. Everything Is A File#

One of Unix’s most powerful principles: everything is a file. This includes regular files, directories, devices, network connections, and processes. More importantly for this chapter: everything is text (or can be treated as text).

This philosophy enables the Unix approach: small, focused tools that read text input and write text output, which can be chained together.

4.1.1. Common Mistakes with Text Data#

4.1.1.1. ❌ Mixing binary and text#

# DON'T: Try to grep a binary file
$ grep "text" /bin/ls
Binary file /bin/ls matches

# DO: Use tools meant for binary data
$ strings /bin/ls | grep "text"
# Extract text strings first
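If you really must search the binary directly, GNU grep's -a (--text) option forces it to process a binary file as if it were text. A small sketch (output may contain unprintable bytes, so strings is usually the safer choice):

```shell
# -a / --text: process a binary file as if it were text
grep -a "text" /bin/ls
```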

4.1.1.2. ❌ Assuming order in pipelines#

$ sort file.txt | grep "pattern"
# This works -- grep preserves the sorted order -- but it's wasteful:
# the ENTIRE file is sorted, then most of it is thrown away

Fix:

$ grep "pattern" file.txt | sort
# Filter first, then sort only the matching lines

4.1.1.3. ❌ Forgetting about stderr#

$ grep "pattern" file.txt > output.txt
# Error messages still print to screen!
# If file.txt doesn't exist, error goes to stderr (not captured)

Fix:

$ grep "pattern" file.txt > output.txt 2> errors.txt
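When the errors are noise rather than data, the usual idiom is to discard them instead of saving them (a sketch; file.txt is a placeholder):

```shell
# Send stderr to /dev/null to discard error messages entirely
grep "pattern" file.txt 2> /dev/null
```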


4.1.3. Key Principles#

Principle        Meaning                                   Example
Do one thing     Each tool has a focused purpose           grep searches, wc counts
Read text        Input is text/lines                       All examples read line-based data
Write text       Output is text/lines                      All examples output text
Chain them       Pipe output to input                      cat | grep | wc
No assumptions   Don't assume anything about data format   Tools work with any text
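The principles above fit in one self-contained pipeline (sample data supplied inline by printf):

```shell
# Three focused tools, chained: printf emits lines, grep filters,
# wc counts. Each reads text and writes text.
printf 'apple\nbanana\napricot\n' | grep '^a' | wc -l
# counts the two lines starting with 'a'
```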

4.1.4. Text Processing Chain Example#

Real-world scenario: Count lines containing “ERROR” in a log file:

# Get lines with ERROR, count them
$ grep "ERROR" /var/log/app.log | wc -l
1523

# Get unique error messages
$ grep "ERROR" /var/log/app.log | cut -d: -f3 | sort | uniq -c
   42 Connection timeout
  189 Invalid request
  203 Database error
  ...

# Find errors from last hour, count by type
$ grep "ERROR" /var/log/app.log | grep "2025-01-15 1[0-9]:" | cut -d: -f3 | sort | uniq -c
    5 Connection timeout
   21 Invalid request
    8 Database error

Each tool does one thing well. Combined, they’re powerful.
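A common extension of this chain is ranking the counts: a numeric sort on the uniq -c output puts the most frequent error types first (a sketch; the log's colon-separated field layout is assumed as above):

```shell
# sort -rn: numeric, descending -- most frequent error types first
grep "ERROR" /var/log/app.log | cut -d: -f3 | sort | uniq -c | sort -rn | head -5
```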

4.1.5. Combining Input and Output#

You can redirect each stream to a different destination, or merge them:

# Redirect both stdout and stderr to the same file
$ ./my_script.sh > output.txt 2>&1
# Both stdout and stderr go to output.txt

# Redirect stdout to file, save stderr separately
$ ./my_script.sh > output.txt 2> errors.txt
# stdout → output.txt
# stderr → errors.txt
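Note that a pipe carries only stdout. To filter error messages with grep, merge stderr into stdout first (my_script.sh stands in for any command):

```shell
# 2>&1 merges stderr into stdout BEFORE the pipe, so grep sees both
./my_script.sh 2>&1 | grep -i "error"
```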

4.1.5.1. Why This Matters#

Without pipes, you’d need:

# BAD: Create intermediate files
$ cat data.csv > temp1.txt
$ grep "error" temp1.txt > temp2.txt
$ wc -l temp2.txt
12345

# GOOD: Use pipes
$ cat data.csv | grep "error" | wc -l
12345
# No intermediate files, faster, cleaner

4.1.6. Pipes: Connecting Streams#

A pipe (|) connects stdout of one command to stdin of another:

$ cat data.csv | grep "error" | wc -l
# cat   produces output
# grep  reads input from the pipe, writes matches to the next pipe
# wc    reads input from the pipe, prints the final count

The data flows: cat → grep → wc

Each command processes the data and passes it to the next.
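The stages of a pipeline also run concurrently, not one after another. A quick way to see this: yes prints forever on its own, yet the pipeline below finishes immediately.

```shell
# yes would loop forever, but it exits as soon as head
# closes the pipe after printing three lines
yes "hello" | head -3
```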

4.1.6.1. Viewing Streams in Action#

# Normal output goes to stdout
$ echo "Hello"
Hello

# Error message goes to stderr
$ ls /nonexistent
ls: cannot access '/nonexistent': No such file or directory

# Separate stdout and stderr
$ ls /nonexistent > out.txt 2> err.txt
$ cat out.txt
# (empty - no stdout)
$ cat err.txt
ls: cannot access '/nonexistent': No such file or directory

4.1.6.2. Example: A Command’s Streams#

$ grep "error" logfile.txt > matches.txt 2> errors.log
# input:  logfile.txt  (read as a file argument)
# stdout: matches.txt  (file)
# stderr: errors.log   (file)

4.1.7. Standard Streams#

Every command has three standard streams:

Stream   Name              Use                               Descriptor
stdin    Standard Input    Data coming IN to the command     0
stdout   Standard Output   Normal output FROM the command    1
stderr   Standard Error    Error messages FROM the command   2
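The descriptor numbers also explain why redirection order matters: redirections are processed left to right, and 2>&1 means "point descriptor 2 wherever descriptor 1 points right now" (my_script.sh is a placeholder):

```shell
# stderr duplicated to the terminal first, THEN stdout goes to the file:
./my_script.sh 2>&1 > out.txt     # errors stay on screen

# stdout to the file first, THEN stderr follows it:
./my_script.sh > out.txt 2>&1     # both end up in out.txt
```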

4.1.8. Text Philosophy#

In Unix, text is not merely a sequence of characters. By convention, it is structured data:

name,age,city
Alice,28,Portland
Bob,35,Seattle
Charlie,42,San Francisco

Same data, different format:

name: Alice
age: 28
city: Portland

name: Bob
age: 35
city: Seattle

The tools don’t care about the format. They just read text, transform it, and output text.
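As one sketch of that format-agnosticism, awk can turn the CSV layout above into the record layout in a single pass (field names are taken from the header row; the data is inlined here so the example is self-contained):

```shell
# First row: remember the column headers.
# Other rows: print each field as "header: value", blank line between records.
printf 'name,age,city\nAlice,28,Portland\nBob,35,Seattle\n' |
awk -F, 'NR==1 { for (i=1; i<=NF; i++) h[i]=$i; next }
         { for (i=1; i<=NF; i++) print h[i] ": " $i; print "" }'
```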