Bash Variable Scope Issue: Why Variables Lose Value After While Read Loop



I recently encountered a puzzling issue in bash scripting where a variable modified inside a while read loop mysteriously loses its value when the loop completes. Here's the exact scenario that baffled me and my colleagues:

echo "1">input.data
echo "2">>input.data
echo "3">>input.data
echo "4">>input.data
echo "5">>input.data

CNT=0

cat input.data | while read ;
do
  let CNT++;
  echo "Counting to $CNT"
done
echo "Count is $CNT"

Despite incrementing CNT inside the loop, the final output shows "Count is 0" instead of the expected "Count is 5". This happens because bash runs each part of a pipeline (|) in a subshell by default. The while read loop therefore increments its own copy of CNT, and that copy is discarded when the loop exits.
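One way to see the separate execution environment is to compare process IDs. The sketch below assumes bash 4 or later (for BASHPID) and default shell options; the scratch file and the variable names parent_pid, loop_pid and loop_pid_file are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: show that the loop body of a pipeline runs in a different process.
parent_pid=$BASHPID          # PID of the shell running this script
loop_pid_file=$(mktemp)      # hypothetical scratch file to carry the PID out of the loop

printf '1\n2\n3\n' | while read -r _; do
  echo "$BASHPID" > "$loop_pid_file"   # PID of the process executing the loop body
done

loop_pid=$(cat "$loop_pid_file")
rm -f "$loop_pid_file"

echo "parent shell PID: $parent_pid"
echo "loop body PID:    $loop_pid"
```

The two PIDs differ, which is why an assignment made inside the loop never reaches the shell that prints the final count.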

Here are several reliable approaches to maintain variable values:

# Solution 1: Process substitution
CNT=0
while read
do
  let CNT++
  echo "Counting to $CNT"
done < <(cat input.data)
echo "Count is $CNT"

# Solution 2: Using a file descriptor
CNT=0
exec 3< input.data
while read -u 3
do
  let CNT++
  echo "Counting to $CNT"
done
exec 3<&-
echo "Count is $CNT"

This behavior isn't just academic - it affects many practical scripting scenarios:

  • Processing log files while maintaining counters
  • Reading configuration files and collecting statistics
  • Implementing progress indicators during file operations

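When using pipes in bash, each command in the pipeline runs in its own subshell by default. This means:

  • Variable changes in subshells don't affect the parent shell
  • Process substitution (< <(...)) moves only the producer command into a subshell; the loop itself stays in the current shell
  • Command substitution ($(...)) also runs its command in a subshell, so assignments made inside it are lost

# Only $(seq 1 5) runs in a subshell; the loop body runs in the current shell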
CNT=0
for i in $(seq 1 5); do
  let CNT++
done
echo $CNT  # Outputs 5 because the for loop itself does not create a subshell
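The command substitution case can be checked the same way. This is a minimal sketch; bump is a hypothetical helper introduced only for this example.

```shell
#!/usr/bin/env bash
CNT=0

# bump is a hypothetical helper: it increments CNT and prints the new value.
bump() {
  CNT=$((CNT + 1))
  echo "$CNT"
}

inner=$(bump)        # runs in a subshell: the increment is lost afterwards
echo "inside substitution: $inner"
echo "after substitution:  $CNT"

bump > /dev/null     # runs in the current shell: the increment sticks
echo "after direct call:   $CNT"
```

The substitution prints 1 while the parent's CNT stays at 0. Only the direct call changes CNT in the current shell.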

To avoid similar issues:

  • Always be aware of subshell creation points
  • Prefer process substitution over pipes when variable persistence is needed
  • Consider using temporary files for complex data sharing between shells
  • Document scope-sensitive variables in script headers
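Another option, if the pipe has to stay, is bash's lastpipe shell option. The sketch below assumes bash 4.2 or later and a non-interactive script, since lastpipe only takes effect when job control is off.

```shell
#!/usr/bin/env bash
# Assumption: bash 4.2+ running as a script (job control off).
shopt -s lastpipe    # run the last command of a pipeline in the current shell

CNT=0
printf '1\n2\n3\n4\n5\n' | while read -r _; do
  CNT=$((CNT + 1))
done
echo "Count is $CNT"
```

With lastpipe active the loop is no longer a subshell creation point, so CNT keeps its value after the loop. The option is bash-specific, so this will not carry over to plain sh.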

Many bash programmers encounter this puzzling behavior where variables modified inside a while read loop don't retain their values. Here's the classic example that trips people up:

#!/bin/bash
echo "1">input.data
echo "2">>input.data
echo "3">>input.data
echo "4">>input.data
echo "5">>input.data

CNT=0

cat input.data | while read line
do
  let CNT++
  echo "Inside loop: $CNT"
done

echo "After loop: $CNT"  # Surprisingly outputs 0!

The key insight is that pipelines create subshells in bash. When you pipe data into a while read loop, the loop actually executes in a subshell (a child process). Variables modified in a subshell don't affect the parent shell's environment.

Here are several ways to work around this behavior:

1. Process Substitution (Recommended for bash)

CNT=0
while read line
do
  let CNT++
  echo "Counting: $CNT"
done < <(cat input.data)
echo "Final count: $CNT"  # Correctly outputs 5

2. Here Document Approach

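CNT=0
# Feed the file contents through a here-document; the loop stays in the current shell
while read line
do
  let CNT++
done <<EOF
$(cat input.data)
EOF
echo $CNT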

3. File Descriptor Redirect

CNT=0
exec 3< input.data
while read -u 3 line
do
  let CNT++
done
exec 3<&-
echo $CNT

4. Store Results in a File

CNT=0
cat input.data | {
while read line
do
  let CNT++
done
echo $CNT > cnt.tmp
}
CNT=$(cat cnt.tmp)
echo $CNT

Sometimes the best solution is to restructure your code:

# Count lines without needing a loop
CNT=$(wc -l < input.data)
echo $CNT

# Or using array
mapfile -t lines < input.data
CNT=${#lines[@]}
echo $CNT

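For large files, the approaches differ in cost. Redirecting from the file or from a file descriptor starts no extra process, process substitution with cat adds one, and the temporary-file approach adds an extra write and read on disk. If all you need is the line count, wc -l will generally be much faster than any read loop.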
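When the input is already a regular file, the loop can read from it directly without cat. This is a minimal sketch; it builds its own sample file with mktemp so it can run anywhere, and the variable name input is an assumption.

```shell
#!/usr/bin/env bash
input=$(mktemp)                       # stand-in for input.data
printf '1\n2\n3\n4\n5\n' > "$input"

CNT=0
# Plain redirection: no pipe and no extra process, so the loop runs in the current shell.
while IFS= read -r line; do
  CNT=$((CNT + 1))
done < "$input"

rm -f "$input"
echo "Count is $CNT"
```

IFS= and -r keep read from trimming whitespace or interpreting backslashes, which matters once the lines hold real data rather than single digits.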

Remember that bash isn't always the best tool for intensive data processing - sometimes switching to awk or perl might be more appropriate for complex tasks.
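As a rough illustration of handing the work to awk, the sketch below lets awk count and sum the lines while bash only captures the result. The sample file and the names input, CNT and SUM are assumptions made for this example.

```shell
#!/usr/bin/env bash
input=$(mktemp)                       # stand-in for input.data
printf '1\n2\n3\n4\n5\n' > "$input"

# awk does the per-line work; read picks up the two numbers in the current shell.
read -r CNT SUM < <(awk '{ sum += $1 } END { print NR, sum }' "$input")

rm -f "$input"
echo "lines=$CNT sum=$SUM"
```

Because read takes its input from process substitution rather than a pipe, CNT and SUM remain set in the calling shell.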