Extracting Process IDs in Linux: Clean Solutions Using ps, awk and sed



When working with process management in Linux, extracting just the PID numbers from ps output is a common task for automation scripts and monitoring tools. The raw output contains multiple columns we need to filter:

ps ax
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:01 /sbin/init
    2 ?        S<     0:00 [kthreadd]
    3 ?        S<     0:00 [migration/0]

The cleanest solution uses awk to print just the first column. The header line passes through as well, so the output starts with PID:

ps ax | awk '{print $1}'
PID
1
2
3

For more precise control, ask ps for the PID column alone with -o and let awk drop the header line:

ps -eo pid | awk 'NR>1'
1
2
3

The ps command itself supports output formatting:

ps -eo pid
PID
1
2
3

To remove the header:

ps -eo pid | tail -n +2
1
2
3

The same result is possible with sed, though it is less readable than awk. The header is skipped automatically because it does not start with digits:

ps ax | sed -n 's/^ *\([0-9]\+\).*/\1/p'
1
2
3

Combine with grep to filter specific processes:

ps -eo pid,comm | grep nginx | awk '{print $1}'
1234
5678

pgrep matches on the process name directly and returns every match, child (worker) processes included, joined with the delimiter given to -d:

pgrep -d ',' nginx
1234,5678,9012
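
A delimiter-joined list is convenient for passing to other commands, but a loop usually wants one PID per line. The following is a minimal sketch of that conversion; the function name split_pid_list is hypothetical, and the demonstration uses the sample list from above instead of a live pgrep call.

```shell
#!/bin/sh
# Turn a delimiter-joined PID list, as printed by `pgrep -d ','`,
# back into one PID per line so it can drive a `while read` loop.
# The body runs in a subshell, so its variables do not leak to the caller.
split_pid_list() (
    list=$1
    delim=${2:-,}
    # An empty list (nothing matched) prints nothing rather than a blank line.
    if [ -n "$list" ]; then
        printf '%s\n' "$list" | tr "$delim" '\n'
    fi
)

# Demonstration on the sample list; a script would pass "$(pgrep -d ',' nginx)".
split_pid_list '1234,5678,9012' | while read -r pid; do
    printf 'would inspect PID %s\n' "$pid"
done
```

pgrep exits with a non-zero status when nothing matches, so a script can test that status before splitting the list.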

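In scripts that look up PIDs repeatedly, pgrep is usually the quicker option because it runs as a single process instead of a three-stage pipeline. Timing from one sample run (absolute numbers vary by system and load):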

time pgrep -d ',' nginx >/dev/null
real 0m0.003s

Compared to the ps, grep and awk pipeline on the same machine:

time ps -eo pid,comm | grep nginx | awk '{print $1}' >/dev/null
real 0m0.008s

When working with system monitoring or process automation, we often need just the raw process IDs without other process information. While ps ax provides comprehensive process data, extracting only the PIDs takes some care: the output begins with a header line, the PID column is right-aligned with leading spaces, and each row carries several extra columns:

ps ax | head -5
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:01 /sbin/init
    2 ?        S<     0:00 [kthreadd]
    3 ?        S<     0:00 [migration/0]

Method 1: Using awk for Precise Column Extraction

The simplest and most reliable method uses awk to extract just the first column:

ps ax | awk '{print $1}'
PID
1
2
3

To remove the header:

ps ax | awk 'NR>1{print $1}'
1
2
3
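
Skipping the header by line number assumes there is exactly one header line. A slightly more defensive sketch keeps only lines whose first field is purely numeric, so it behaves the same whether or not a header is present. The function name extract_pids is hypothetical, and the demonstration runs on captured output so the result is reproducible.

```shell
#!/bin/sh
# Print the first field of every line whose first field is all digits.
# A header such as "PID TTY STAT TIME COMMAND" is dropped automatically.
extract_pids() {
    awk '$1 ~ /^[0-9]+$/ {print $1}'
}

# Typical use would be: ps ax | extract_pids
# Demonstration on captured ps output:
extract_pids <<'EOF'
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:01 /sbin/init
    2 ?        S<     0:00 [kthreadd]
    3 ?        S<     0:00 [migration/0]
EOF
# prints 1, 2 and 3 on separate lines
```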

Method 2: Using pgrep for Dedicated PID Lookup

pgrep is a dedicated tool for looking up processes. With -a it prints each PID together with its command line, one process per line; the pattern . matches every process:

pgrep -a .
1 /sbin/init
2 [kthreadd]
3 [migration/0]

For just PIDs without process names:

pgrep -d " " .
1 2 3

Method 3: Using procfs Directly

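Every running process has a numeric directory under /proc, so listing those names yields the PIDs without running ps at all, which keeps overhead low in scripts: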

ls /proc | grep -E '^[0-9]+$'
1
2
3
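
The same listing can be done with a shell glob alone, which avoids starting ls and grep. The following is a sketch under stated assumptions: list_pids is a hypothetical name, and the optional root argument exists only so the function can be tried against a directory other than /proc. Globs expand in lexical order, so the output is piped through sort -n wherever numeric order matters.

```shell
#!/bin/sh
# List numeric directory names under a procfs root (default: /proc).
# The body runs in a subshell, so its variables do not leak to the caller.
list_pids() (
    root=${1:-/proc}
    for entry in "$root"/[0-9]*; do
        name=${entry##*/}
        case $name in
            *[!0-9]*) continue ;;   # skip names containing a non-digit
        esac
        # Only directories count; plain files with numeric names are ignored.
        if [ -d "$entry" ]; then
            printf '%s\n' "$name"
        fi
    done
)

# Count the processes currently visible in /proc.
printf 'visible processes: %s\n' "$(list_pids | wc -l)"
```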

Filtering by Process Name

Combine with grep to find specific processes. Writing the pattern as [n]ginx keeps grep from matching its own command line in the ps output:

ps ax | grep '[n]ginx' | awk '{print $1}'
1234
5678

Excluding System Processes

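Filter out kernel threads, whose command names appear in square brackets. NR>1 skips the header line, which would otherwise pass the test and print PID:

ps ax | awk 'NR>1 && $5 !~ /^\[/ {print $1}'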
1
567
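
Bracket matching depends on how ps formats the command column. An alternative sketch filters on the parent PID instead, on the assumption that kernel threads are kthreadd (PID 2) and its children. That holds on typical Linux hosts but not inside a container with its own PID namespace. The function name drop_kernel_threads is hypothetical, and the input is assumed to be the two-column output of ps -eo pid=,ppid=.

```shell
#!/bin/sh
# Print PIDs that are neither kthreadd (assumed to be PID 2) nor its children.
# Expects "PID PPID" lines on standard input.
drop_kernel_threads() {
    awk '$1 != 2 && $2 != 2 {print $1}'
}

# Typical use would be: ps -eo pid=,ppid= | drop_kernel_threads
# Demonstration on captured output:
drop_kernel_threads <<'EOF'
    1     0
    2     0
    3     2
  567     1
EOF
# prints 1 and 567
```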

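For frequent polling in monitoring scripts, reading /proc directly avoids starting ps on every iteration:

# The first field of each /proc/<pid>/stat file is the PID.
# Errors from processes that exit mid-scan are discarded.
cut -d " " -f 1 /proc/[0-9]*/stat 2>/dev/null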