How to Safely Read and Write to the Same File in Linux Without Data Loss



We've all been there: trying to process a file and write back to it in one command:

uniq .bash_history > .bash_history

Only to find our file empty afterwards. This happens because the shell truncates the output file before the command begins reading it.

When you use > redirection in bash:

  1. The shell immediately opens the output file for writing (truncating it)
  2. Only then does it execute your command
  3. The command tries to read from an already empty file
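
You can watch the truncation happen on a throwaway file:

```shell
# Demonstrate the truncation: the redirection empties demo.txt
# before uniq ever gets to read it.
printf 'a\na\nb\n' > demo.txt      # three lines, one duplicate
uniq demo.txt > demo.txt           # shell truncates demo.txt first
wc -c < demo.txt                   # prints 0: everything is gone
rm demo.txt
```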

Using sponge from moreutils

The most elegant solution is using sponge from the moreutils package:

sudo apt-get install moreutils  # Debian/Ubuntu
uniq .bash_history | sponge .bash_history

sponge soaks up all its input before writing to the output file, solving our problem.

Temporary File Approach

When moreutils isn't available, a temporary file works:

uniq .bash_history > tmpfile && mv tmpfile .bash_history
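
A slightly hardened variant creates the temporary file with mktemp in the same directory as the target, so the final mv is an atomic rename on the same filesystem:

```shell
# Create the temp file next to the target so mv is an atomic rename
tmp=$(mktemp .bash_history.XXXXXX) || exit 1
uniq .bash_history > "$tmp" && mv "$tmp" .bash_history
```

The && ensures the original is only replaced if uniq succeeded, and mktemp avoids clobbering an unrelated file that happens to be named tmpfile.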

In-place Editing with sed

For simple text processing, sed -i handles this safely:

sed -i 's/old/new/g' file.txt
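
Under the hood, sed -i writes to a temporary file and renames it over the original, which is why it avoids the truncation problem. You can also ask it to keep the original as a backup by appending a suffix directly to -i (this spelling works on both GNU and BSD sed):

```shell
# Edit file.txt in place, keeping the original as file.txt.bak
sed -i.bak 's/old/new/g' file.txt
```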

Using tee as an Alternative

For special cases where you need to both see output and save it:

grep "pattern" file | tee file

But be warned: this is a race. tee truncates the file while grep may still be reading it, so it only appears to work when grep happens to read the whole file before the truncation. Expect data loss for anything but tiny files.
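
A safer sketch of the same idea writes tee's copy to a separate file and renames afterwards, so you still see the output on screen without racing against the reader:

```shell
# Watch the matches on screen while capturing them safely
grep "pattern" file | tee file.new
mv file.new file
```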

For large files where memory is a concern, you can process the input in fixed-size chunks:

workdir=$(mktemp -d) || exit 1
trap 'rm -rf "$workdir"' EXIT
split -l 1000 .bash_history "$workdir/chunk."
for f in "$workdir"/chunk.*; do
    uniq "$f" >> .bash_history.tmp
done
mv .bash_history.tmp .bash_history

Note that uniq only removes adjacent duplicates, so a duplicate run that straddles a chunk boundary will keep one extra copy.

General Safety Tips

  • Always make backups before modifying important files
  • Consider using version control for configuration files
  • Test commands on file copies first
  • For production scripts, implement proper error checking


Using Command Substitution

Process substitution (uniq <(cat .bash_history) > .bash_history) is often suggested here, but it is racy: the shell can truncate .bash_history before the cat subshell has opened it. Command substitution is the safe shell-only trick, because the $( ) expansion runs to completion before the redirection is performed:

printf '%s\n' "$(uniq .bash_history)" > .bash_history

Keep in mind that the entire result is held in shell memory and trailing newlines are stripped by the substitution (printf '%s\n' restores the final one).

How sponge works

sponge reads all input first before opening the output file, avoiding the truncation issue. It essentially:

  1. Reads all stdin into memory
  2. Closes the input stream
  3. Then opens and writes to the output file
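
That read-everything-first behavior can be sketched as a small shell function (a simplified, hypothetical stand-in; the real sponge buffers in memory, spills to disk for huge inputs, and preserves the target's permissions):

```shell
# mini_sponge: consume all of stdin into a temp file first, then
# atomically rename it over the target (hypothetical helper, not
# the real sponge implementation)
mini_sponge() {
    tmp=$(mktemp "$1.XXXXXX") || return 1
    cat > "$tmp" && mv "$tmp" "$1"
}

uniq .bash_history | mini_sponge .bash_history
```

Because the rename only happens after stdin is exhausted, the target file is untouched until uniq has finished reading it.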

When to use each approach

  • sponge: Best for most cases where you have moreutils installed
  • Temp file: Most portable solution that works everywhere
  • Command substitution: shell-only trick that avoids creating actual temp files, but holds the entire result in memory

For large files:

# Safe in-place even for files larger than memory: sort spills to
# temp files and only opens the -o output after reading all input
sort -u .bash_history -o .bash_history

Note that sort -u both sorts and de-duplicates, so the original line order is lost, which may be unwanted for a history file.
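
If you want to drop all duplicates while preserving the original order, the classic awk idiom combined with the temp-file pattern does it in one pass:

```shell
# Keep the first occurrence of every line, preserving order
awk '!seen[$0]++' .bash_history > .bash_history.tmp &&
    mv .bash_history.tmp .bash_history
```

Memory here scales with the number of distinct lines, since the seen array stores one entry per unique line.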

Pitfalls to Avoid

  • Never read and write the same file in one pipeline, e.g. cmd1 < file | cmd2 > file
  • Avoid complex pipelines where an intermediate command might fail, leaving the file half-written
  • For system files, always keep backups first