When working with SGE (particularly version 6.2u5), many users encounter confusing discrepancies between the actual memory usage shown in system tools like top and the values reported by SGE utilities (qstat, qacct). Here's what I've discovered through extensive testing and discussions with cluster administrators.
Let's break down the key memory metrics you'll encounter:
# From top:
VIRT: 45.6g # Virtual memory size (includes reserved but unused memory)
RES: 38g # Resident memory (actual physical RAM used)
SHR: 9600 # Shared memory (portion that could be shared with other processes)
# From qacct:
mem 2768.453 # Memory-time integral in GB-seconds
maxvmem 4.078G # Peak virtual memory usage during job execution
The maxvmem value in SGE is typically lower than what top shows because:
- SGE measures the process tree's memory usage differently than top (see the sketch after this list)
- SGE may not account for memory-mapped files or shared libraries properly
- The sampling frequency of SGE might miss short memory spikes
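To get a feel for what top is actually aggregating for your job, you can sum the resident set sizes of the process tree yourself. This is only a sketch, and it assumes the job script's shell is the direct parent of everything you care about:
#!/bin/bash
# Sum the RSS (in KB) of this shell and its direct children; top's per-process
# RES values are spread across these same processes.
ps -o rss= -p $$ --ppid $$ | awk '{ total += $1 } END { printf "tree RSS: %.2f MB\n", total/1024 }'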
Here are three reliable approaches I've used to get accurate memory measurements:
1. Periodic RSS Sampling with a Wrapper Script
Create a wrapper script that periodically samples the RSS of a given PID:
#!/bin/bash
# monitor_mem.sh - append the RSS (in KB) of the given PID to a log every 30 s
while true; do
    ps -p "$1" -o rss= >> memory_usage.log
    sleep 30
done

# Usage (inside your job script):
#   your_actual_command &      # run the real work in the background
#   ./monitor_mem.sh $! &      # sample the command's PID, not the shell's
#   wait %1                    # wait for the command to finish
#   kill %2                    # stop the sampler
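Once the job has finished, a quick awk pass over the log gives the peak and average of the sampled values (this assumes the log contains only the RSS numbers written by the script above):
# ps reports RSS in KB; convert to MB for readability
awk '{ if ($1 > max) max = $1; sum += $1; n++ }
     END { printf "samples=%d  avg=%.1f MB  peak=%.1f MB\n", n, sum/n/1024, max/1024 }' memory_usage.log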
2. Direct cgroup Memory Stats
For newer systems using cgroups:
grep 'total_rss' /sys/fs/cgroup/memory/sge/*/memory.stat
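If your site does place jobs into per-job cgroups, something like the loop below prints the resident and cache totals per cgroup in GB. The /sys/fs/cgroup/memory/sge/* layout is an assumption; check how your administrators have named the hierarchy:
#!/bin/bash
# Values in memory.stat are in bytes (cgroup v1)
for f in /sys/fs/cgroup/memory/sge/*/memory.stat; do
    echo "== $f"
    awk '/^total_rss |^total_cache / { printf "  %-12s %.2f GB\n", $1, $2/1024/1024/1024 }' "$f"
done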
3. Enhanced SGE Accounting
Configure SGE to use more accurate memory accounting (requires admin access):
# In sge_conf:
execd_params ENABLE_ADDGRP_KILL=1 MEMORY_ACCOUNTING=true
complex_values mem=virtual_free
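Before asking for a change, you can check what is already configured. qconf can display the current global configuration; only the edit step needs admin rights:
# Show the current execd_params in the global configuration
qconf -sconf | grep -i execd_params
# Admins can edit the global configuration interactively with:
#   qconf -mconf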
The mem field in qacct is a memory-time integral in GB-seconds (the qstat -j output further down shows the same quantity labelled "GBs"). To estimate the average memory footprint, divide it by the job's CPU time, which qacct reports in the cpu field:
average_mem_GB = mem / cpu_time
In the qstat -j snapshot below, for instance, mem=168.12988 GBs over cpu=00:01:37 (97 s) gives roughly 1.7 GB, which matches the vmem=1.665G sampled on the same line. Because mem is a time integral rather than an instantaneous value, it cannot be compared directly with maxvmem or with the RES/VIRT figures from top.
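A quick way to do that division from a finished job's accounting record (assuming the standard qacct field names cpu and mem):
qacct -j $JOB_ID | awk '$1 == "cpu" { c = $2 } $1 == "mem" { m = $2 }
                        END { printf "average vmem: %.2f GB\n", m / c }'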
Always request adequate memory in your job submission:
qsub -l h_vmem=50G -l mem_free=50G my_job.sh
This ensures proper scheduling and prevents memory-based job failures.
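The same requests can also live in the job script itself as SGE directives, so they travel with the script (the 50G figures are just the values from the qsub example above):
#!/bin/bash
#$ -l h_vmem=50G
#$ -l mem_free=50G
#$ -cwd
./your_actual_command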
When SGE reporting isn't sufficient, consider these alternatives:
- htop - Enhanced version of top with a tree view
- smem - Provides proportional set size (PSS) measurements
- /proc/$PID/status - Detailed memory stats for any process (see the example below)
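For example, the status file exposes both current and peak values directly (VmPeak is the peak virtual size, VmHWM the peak resident size):
grep -E 'VmPeak|VmHWM|VmRSS' /proc/$PID/status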
The confusion between SGE's reported memory values (qacct/qstat) and system-level measurements (top) stems from fundamental differences in what these tools measure:
# Sample qacct output structure
job_number 7270916
mem 2768.453 # Memory-time integral in GB-seconds
maxvmem 4.078G # Peak virtual memory usage
top shows real-time memory allocation (a non-interactive way to capture the same columns follows this list):
- VIRT: Total virtual memory (45.6GB - includes reserved but unused memory)
- RES: Resident memory actually in RAM (38GB)
- SHR: Shared memory portions (9.6MB)
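To capture those columns from inside a job script rather than interactively, a single batch iteration of top or a plain ps call works; $PID here stands for whichever process you want to inspect:
# One batch iteration of top restricted to a single process
top -b -n 1 -p $PID
# Or just the virtual and resident sizes (both in KB)
ps -p $PID -o vsz=,rss=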
Sun Grid Engine tracks memory differently:
# Sample qstat -j output
usage 1: cpu=00:01:37, mem=168.12988 GBs, io=38.64676, vmem=1.665G, maxvmem=4.078G
Key metrics (the commands after this list show how to pull them during and after the run):
- mem: Cumulative memory-time product (GB x CPU-seconds)
- maxvmem: Peak virtual memory observed by SGE
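Both values can be read at any time; qstat covers running jobs and qacct the finished ones:
# While the job is still running
qstat -j $JOB_ID | grep usage
# After the job has finished
qacct -j $JOB_ID | grep -E '^cpu|^mem|^maxvmem'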
For precise RAM tracking in SGE 6.2u5:
#!/bin/bash
# Start the real work in the background and remember its PID
# ($JOB_ID below is the SGE job id, which is not a Linux PID).
./your_actual_command &
PID=$!

# Method 1: read the current resident set size (in KB) from /proc
grep VmRSS /proc/$PID/status | awk '{print $2 " KB"}'

# Method 2: pull the accounting fields after the job has finished
qacct -j $JOB_ID | grep -E '^cpu|^mem|^maxvmem'

# Method 3: periodic sampling (ps reports RSS in KB)
while sleep 5; do
    ps -p $PID -o rss= | awk '{print $1/1024/1024 " GB"}'
done
- Enable detailed SGE accounting with -l m_mem_free=4G in your job submission
- Combine qacct with /proc monitoring for a complete picture
- For critical memory measurements, implement custom logging within your application
Consider these additional methods for more precise tracking:
# Python memory profiler snippet
import resource

def memory_usage():
    # ru_maxrss is reported in KB on Linux, so this returns the peak RSS in GB
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024 / 1024
Third-party tools worth exploring:
- GNU time with the -v flag (example below)
- Valgrind massif for detailed heap analysis
- Custom cgroups memory tracking
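Of those, GNU time is the quickest to try; note that the shell built-in time does not accept -v, so call the binary explicitly:
/usr/bin/time -v ./your_actual_command 2> time_report.txt
grep 'Maximum resident set size' time_report.txt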