When working with HugePages in Linux (particularly for Java applications), the numbers in /proc/meminfo
can sometimes appear counterintuitive. Let's examine a real-world case:
# grep HugePages /proc/meminfo
AnonHugePages: 274432 kB
HugePages_Total: 1008
HugePages_Free: 596
HugePages_Rsvd: 594
HugePages_Surp: 0
At first glance, one might expect HugePages_Free + HugePages_Rsvd = HugePages_Total, but our example shows 596 + 594 = 1190 ≠ 1008. This discrepancy reveals important nuances in how Linux accounts for HugePages.
Three critical states exist for HugePages:
- Free: Pages in the pool not yet backing any faulted mapping (596 in our case); this count still includes reserved pages
- Reserved: Pages committed to existing mappings but not yet faulted in (594); a subset of the free count
- In-use: Pages actually faulted in and backing mappings (1008 - 596 = 412)
The reservation system creates the apparent mismatch. Reserved pages are not removed from the free pool; they remain counted as free until the application first touches them. The accounting that actually holds is:
HugePages_Total = HugePages_Free + Actually_Used_Pages
1008 = 596 + 412
while the pages genuinely available for new requests are only:
HugePages_Free - HugePages_Rsvd = 596 - 594 = 2
Free and Rsvd overlap, which is why adding them together overshoots the total.
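If you want these derived numbers computed directly, a small awk one-liner over /proc/meminfo does the arithmetic (a sketch; the patterns are the standard meminfo counter names):
awk '/^HugePages_Total/ {t=$2} /^HugePages_Free/ {f=$2} /^HugePages_Rsvd/ {r=$2}
     END {printf "in-use: %d  free: %d  reserved: %d  truly available: %d\n", t-f, f, r, f-r}' /proc/meminfo
With the values above this prints in-use: 412, free: 596, reserved: 594, truly available: 2.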
For Java applications using HugePages, here's a useful monitoring script:
#!/bin/bash
# Print the HugePages counters and the derived in-use count every 5 seconds
while true; do
    grep HugePages /proc/meminfo
    awk '/^HugePages_Total/ {t=$2} /^HugePages_Free/ {f=$2} END {print "Active HugePages: " t - f}' /proc/meminfo
    sleep 5
done
When configuring HugePages for Java:
- Calculate your expected needs:
num_hugepages = (Java heap size) / (hugepage size)
- Add 10-20% headroom for non-heap allocations and other huge page consumers
- Set vm.nr_overcommit_hugepages if you expect dynamic growth
For a 16GB Java heap using 2MB HugePages:
# 16GB = 16384MB
# 16384 / 2 = 8192 pages needed
# Add 20% buffer: 9830 pages
echo 9830 > /proc/sys/vm/nr_hugepages
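To make the setting survive reboots, and to allow some dynamic growth through the overcommit pool mentioned above, something along these lines can be used (a sketch; the file name and the overcommit value of 1024 are arbitrary examples):
cat > /etc/sysctl.d/90-hugepages.conf <<'EOF'
vm.nr_hugepages = 9830
vm.nr_overcommit_hugepages = 1024
EOF
sysctl --system
Allocating the pool early (ideally at boot) also makes it more likely the kernel can still find enough contiguous memory.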
To get deeper insights:
# Show per-NUMA node HugePages
cat /sys/devices/system/node/node*/meminfo | grep Huge
# Show application HugePages mapping
grep -B 10 -A 10 Huge /proc/[pid]/smaps
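On reasonably recent kernels, smaps also exposes per-mapping hugetlb counters, so a process's total hugetlb usage can be summed roughly like this (a sketch; replace [pid] with the target process ID):
awk '/^(Shared|Private)_Hugetlb/ {sum += $2} END {print sum " kB of hugetlb mappings"}' /proc/[pid]/smaps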
Remember that HugePages reservations are a form of memory commitment - they protect against allocation failures but don't represent physical memory consumption until actually used.
When working with Java applications on Linux systems configured with HugePages, many developers notice an accounting discrepancy in /proc/meminfo
that doesn't immediately make sense. Here's a typical output that raises questions:
# grep HugePages /proc/meminfo
AnonHugePages: 274432 kB
HugePages_Total: 1008
HugePages_Free: 596
HugePages_Rsvd: 594
HugePages_Surp: 0
The confusion stems from expecting that HugePages_Free + HugePages_Rsvd should equal HugePages_Total. However, the kernel maintains these counters differently:
- HugePages_Free: Huge pages in the pool that are not yet backing any faulted mapping (reserved pages are still included in this count)
- HugePages_Rsvd: Pages committed to existing mappings but not yet actually assigned (faulted in)
- HugePages_Total: The size of the pool, configured via the vm.nr_hugepages sysctl (plus any surplus pages when overcommit is allowed)
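The pool behind HugePages_Total can be read or resized at runtime through that same sysctl (growing it may only partially succeed if memory is too fragmented):
sysctl vm.nr_hugepages            # read the current persistent pool size
sysctl -w vm.nr_hugepages=1008    # resize the pool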
When a Java process requests huge pages through SHM_HUGETLB or mmap, the sequence is:
1. Application requests huge pages (reservation occurs)
2. Kernel marks pages as HugePages_Rsvd
3. Actual memory access triggers physical allocation
4. HugePages_Rsvd decreases, HugePages_Free decreases
You can observe this behavior dynamically with a simple C program:
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

#define LENGTH (2UL * 1024 * 1024) // one huge page, assuming the default 2MB size

int main(void) {
    // MAP_HUGETLB requests huge page backing; the mmap only reserves a page
    // (HugePages_Rsvd goes up), nothing is faulted in yet
    char *addr = mmap(NULL, LENGTH, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    printf("Allocated huge page at %p\n", addr);
    // Writing to the page would fault it in and decrement HugePages_Free:
    // addr[0] = 'x';
    // Keep the mapping alive so the counters can be observed
    pause();
    return 0;
}
Watch the counters change in real-time:
watch -n 1 "grep HugePages /proc/meminfo"
When you run the program, you'll see:
- HugePages_Rsvd increases immediately after the mmap call
- HugePages_Free only decreases once the memory is actually written; the sample above never touches the page, so uncomment the write to observe the second step
This two-phase approach serves important purposes:
- Prevents overcommitment of huge pages
- Allows fail-fast detection of insufficient huge pages
- Maintains accurate accounting during allocation races
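The fail-fast property can also be exploited from a launcher script: because reservations are charged up front, you can check how many unreserved pages remain before starting the JVM. A minimal sketch (NEEDED_PAGES is a hypothetical value derived from your heap size):
#!/bin/bash
NEEDED_PAGES=8192
avail=$(awk '/^HugePages_Free/ {f=$2} /^HugePages_Rsvd/ {r=$2} END {print f - r}' /proc/meminfo)
if [ "$avail" -lt "$NEEDED_PAGES" ]; then
    echo "Only $avail unreserved huge pages available, need $NEEDED_PAGES" >&2
    exit 1
fi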
For Java applications using -XX:+UseLargePages, consider:
vm.nr_hugepages = (heap_size / hugepage_size) + buffer
Where buffer accounts for:
- Non-heap memory allocations
- Other processes using huge pages
- Reservation overhead
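With the pool sized that way, the JVM side only needs the large-pages flag and a fixed heap so the reservation is typically taken once at startup; for example (app.jar is a placeholder):
java -XX:+UseLargePages -Xms16g -Xmx16g -jar app.jar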
Additional tools for troubleshooting:
# View per-process huge page usage
grep -B 11 Huge /proc/[pid]/smaps
# Check huge page mount points
grep hugetlbfs /proc/mounts
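If no hugetlbfs mount shows up and one is needed for file-backed huge page mappings, it can be created manually (the mount point and pagesize option shown are a common convention, not a requirement; many distributions mount /dev/hugepages automatically):
mkdir -p /dev/hugepages
mount -t hugetlbfs -o pagesize=2M none /dev/hugepages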