When building mission-critical systems like FreeNAS servers, ECC RAM verification often becomes more complex than expected. My journey through multiple verification methods revealed several technical nuances worth documenting.
The standard dmidecode -t memory output requires careful analysis. Server-grade hardware often reports:
# dmidecode -t memory
Memory Device
Total Width: 128 bits
Data Width: 64 bits
Error Correction Type: Single-bit ECC
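The 128-bit Total Width here appears to reflect the dual-channel configuration (64 bits per channel), so on its own it says nothing about ECC status. SMBIOS defines Total Width as including any check bits, which is why an ECC DIMM more commonly reports a Total Width of 72 bits against a Data Width of 64 bits. In either case, the field to trust is "Error Correction Type".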
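The width comparison can be automated. The following is a minimal sketch, assuming `dmidecode -t 17` output on stdin; the function name `dimm_width_check` is hypothetical and not part of any package. It prints one line per populated DIMM and flags whether Total Width exceeds Data Width.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of dmidecode): reads `dmidecode -t 17`
# output on stdin and compares Total Width with Data Width for each DIMM.
dimm_width_check() {
    awk -F': *' '
        /Total Width:/ { total = $2 + 0 }      # "72 bits" -> 72, "Unknown" -> 0
        /Data Width:/  {
            data = $2 + 0
            if (data == 0) next                 # empty slot, nothing to compare
            verdict = (total > data) ? "check-bits=" (total - data) : "no-check-bits"
            printf "total=%d data=%d %s\n", total, data, verdict
        }
    '
}

# Usage (requires root for dmidecode):
#   sudo dmidecode -t 17 | dimm_width_check
```

A result of `check-bits=8` is the usual ECC pattern. Other differences, such as the 128/64 pair shown above, are better treated as a firmware reporting quirk than as confirmation.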
For modern Linux kernels (4.1+), check the EDAC subsystem:
# grep -E "EDAC|ECC" /var/log/kern.log
# dmesg | grep -i ECC
Positive confirmation appears as:
EDAC MC: Verifying 2 memory controllers
EDAC MC0: 2048MB registered memory with ECC
The Gigabyte X150M-Pro ECC requires Xeon processors despite chipset support. This motherboard-specific limitation explains why initial Celeron tests failed.
For systems where traditional methods fail, try these advanced techniques:
EDAC Sysfs Interface
# ls -l /sys/devices/system/edac/mc/
# cat /sys/devices/system/edac/mc/mc0/ce_count
Active ECC systems will show memory controller directories with error counters.
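To read all controllers at once, something like the sketch below may help. It assumes the standard EDAC sysfs layout (`mcN/ce_count` and `mcN/ue_count`); the function name `edac_summary` and its optional root argument are hypothetical, added so the function can be tried against a copied or mocked directory tree.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints one line per EDAC memory controller with its
# correctable (ce) and uncorrectable (ue) error counts.
# $1 optionally overrides the sysfs root (an assumption made for testing).
edac_summary() {
    local root="${1:-/sys/devices/system/edac/mc}"
    local mc found=0
    for mc in "$root"/mc[0-9]*; do
        [ -r "$mc/ce_count" ] || continue
        found=1
        printf '%s ce=%s ue=%s\n' "${mc##*/}" \
            "$(cat "$mc/ce_count")" \
            "$(cat "$mc/ue_count" 2>/dev/null || echo '?')"
    done
    if [ "$found" -eq 0 ]; then
        echo "no EDAC memory controllers under $root"
        return 1
    fi
}

# Usage:
#   edac_summary
```

An empty result is not proof that ECC is off. It can also mean that no EDAC driver for the memory controller is loaded.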
Memory Error Injection
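To test the error-reporting path end to end (requires root, a mounted debugfs, and the mce-inject module loaded):
# apt install mcelog
# echo 0x1000000000 > /sys/kernel/debug/mce-inject/addr
# echo 0x1 > /sys/kernel/debug/mce-inject/cpu/misc
# echo "$(date +%s)" > /sys/kernel/debug/mce-inject/trigger
# tail -f /var/log/mcelog
This injects a simulated correctable error. If it appears in the logs, the machine-check reporting chain works. Because the error is simulated rather than caused by a flipped bit in DRAM, it does not by itself prove that the DIMMs' check bits are in use. The file names under mce-inject vary between kernel versions, so check the README in that directory (present on recent kernels) before writing to any of them.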
Recommended budget ECC processors:
- Intel Xeon E3-1200 v5/v6 series
- AMD Ryzen Pro series (requires appropriate motherboard)
- Intel Pentium G series with ECC support (limited models)
Many motherboards require explicit ECC enablement in BIOS:
- Advanced Memory Settings → ECC Mode → Enabled
- Chipset Configuration → Memory ECC → Enabled
Some implementations automatically enable ECC with supported hardware.
Through comprehensive testing, we've established that:
- dmidecode's Error Correction Type field is authoritative
- Kernel EDAC subsystems provide reliable verification
- Motherboard-specific limitations can override chipset capabilities
When building a FreeNAS server with ECC memory, verification becomes crucial yet surprisingly complex. The standard toolchain often provides ambiguous results, especially with modern hardware combinations. Here's what I've learned through extensive testing with my Gigabyte X150M-Pro ECC motherboard and Xeon E3-1220v5 setup.
The most dependable method I've found is parsing multiple system information sources:
# Check memory controller information
sudo dmidecode -t memory | grep -A5 "Error Correction"
# Kernel log for the current boot (/var/log/dmesg is absent on many distributions)
sudo dmesg | grep -i ecc
# Check CPU ECC support
sudo cpuid | grep -i ecc
The key indicators in dmidecode output:
Handle 0x1000, DMI type 16, 15 bytes
Physical Memory Array
Location: System Board Or Motherboard
Use: System Memory
Error Correction Type: Multi-bit ECC # This is the critical line
Maximum Capacity: 32 GB
Number Of Devices: 4
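For scripting, the critical line can be extracted directly. This is a minimal sketch, assuming `dmidecode -t 16` output on stdin; the function name `ecc_type_from_dmidecode` is hypothetical. It prints the reported value and returns success only when that value names an ECC mode, such as "Single-bit ECC" or "Multi-bit ECC".

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `dmidecode -t 16` output on stdin, prints the
# first Error Correction Type value, and returns 0 only if it names ECC.
ecc_type_from_dmidecode() {
    local value
    value=$(sed -n 's/^[[:space:]]*Error Correction Type:[[:space:]]*//p' | head -n 1)
    printf '%s\n' "${value:-not reported}"
    case "$value" in
        *ECC*) return 0 ;;   # "Single-bit ECC", "Multi-bit ECC"
        *)     return 1 ;;   # "None", "Parity", "Unknown", or field missing
    esac
}

# Usage (requires root for dmidecode):
#   sudo dmidecode -t 16 | ecc_type_from_dmidecode && echo "firmware reports ECC"
```

This reports only what the firmware claims in its SMBIOS tables, so it is worth cross-checking against the kernel's EDAC view.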
For technical users willing to dive deeper, these methods provide more certainty:
Kernel Module Inspection
# Check loaded memory controller modules
lsmod | grep -i edac
# Detailed EDAC information
sudo modprobe edac_mc
sudo cat /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count
Hardware-Specific Tools
For Intel systems, the msr-tools package provides low-level access:
sudo apt install msr-tools
sudo modprobe msr
sudo rdmsr -a 0x179 # Read IA32_MCG_CAP (MSR 0x179) on all CPUs; 0x17 is IA32_PLATFORM_ID
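The raw hex value is easier to read once it is split into fields. The sketch below decodes a few IA32_MCG_CAP bits using the positions given in the Intel SDM: bits 7:0 hold the bank count, bit 8 is MCG_CTL_P, bit 10 is MCG_CMCI_P, and bit 24 is MCG_SER_P. The function name `decode_mcg_cap` is hypothetical. Note that this register describes the machine-check architecture in general, which exists on x86 CPUs with or without ECC memory, so it shows that errors can be reported rather than that ECC is active.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decodes selected IA32_MCG_CAP fields from a hex
# value as printed by rdmsr (an optional 0x prefix is accepted).
decode_mcg_cap() {
    local v=$(( 0x${1#0x} ))
    echo "banks=$(( v & 0xff ))"            # bits 7:0, number of MC banks
    echo "mcg_ctl_p=$(( (v >> 8) & 1 ))"    # bit 8, IA32_MCG_CTL present
    echo "cmci_p=$(( (v >> 10) & 1 ))"      # bit 10, corrected-error interrupt
    echo "ser_p=$(( (v >> 24) & 1 ))"       # bit 24, software error recovery
}

# Usage (reads the MSR on CPU 0):
#   decode_mcg_cap "$(sudo rdmsr -p 0 0x179)"
```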
Here's an enhanced version of the Puget Systems test that works with newer CPUs:
#define _DEFAULT_SOURCE // exposes MAP_ANONYMOUS under strict -std= modes
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <sys/mman.h>
#define TEST_SIZE (1024*1024*10) // 10MB test area
int main(void) {
    const uint64_t pattern = 0xAAAAAAAA55555555ULL;
    const size_t words = TEST_SIZE / sizeof(uint64_t);
    // Anonymous mapping instead of /dev/mem: writing through /dev/mem at
    // offset 0 can overwrite live physical memory, and kernels built with
    // CONFIG_STRICT_DEVMEM refuse the mapping anyway.
    void *mem = mmap(NULL, TEST_SIZE, PROT_READ|PROT_WRITE,
                     MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    volatile uint64_t *ptr = (volatile uint64_t *)mem;
    // Write and read back a pattern. This cannot force an ECC event; it only
    // exercises memory so that a marginal cell has a chance to show up.
    for (size_t i = 0; i < words; i++) {
        ptr[i] = pattern;
        if (ptr[i] != pattern) {
            printf("Memory error detected at %p\n", (void *)&ptr[i]);
        }
    }
    // Check kernel logs for ECC events
    system("dmesg | grep -i ECC");
    munmap(mem, TEST_SIZE);
    return 0;
}
Many consumer-grade motherboards with ECC support (like my Gigabyte) require:
- Latest BIOS version (check for ECC-related release notes)
- Proper DIMM slot configuration (often slots A2/B2 for optimal ECC functionality)
- Xeon processors despite chipset specifications
For production systems, implement continuous monitoring:
# /etc/cron.d entry (the sixth field is the user): log ECC status every 5 minutes
*/5 * * * * root /usr/bin/logger -t ECC_CHECK "$(date) $(dmidecode -t memory | grep -A5 'Error Correction') $(dmesg | grep -i ECC | tail -5)"
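Since dmidecode output does not change between runs, a quieter option is to watch the EDAC counters and log only when they move. The sketch below is one way to do that, under assumptions: the function name `ecc_check_once`, the state-file argument, and the optional sysfs root override are all hypothetical. The first run reports a baseline, and later runs print nothing unless a total has changed.

```shell
#!/usr/bin/env bash
# Hypothetical helper for a cron job: totals the EDAC error counters and
# prints a message only when the totals differ from the previous run.
# $1 = state file path (assumed), $2 = optional sysfs root override.
ecc_check_once() {
    local state="$1" root="${2:-/sys/devices/system/edac/mc}"
    local ce=0 ue=0 f prev now
    for f in "$root"/mc[0-9]*/ce_count; do
        if [ -r "$f" ]; then ce=$(( ce + $(cat "$f") )); fi
    done
    for f in "$root"/mc[0-9]*/ue_count; do
        if [ -r "$f" ]; then ue=$(( ue + $(cat "$f") )); fi
    done
    now="ce=$ce ue=$ue"
    prev=$(cat "$state" 2>/dev/null || true)   # empty on the first run
    printf '%s\n' "$now" > "$state"
    if [ "$now" != "$prev" ]; then
        echo "ECC counters changed: ${prev:-none} -> $now"
    fi
}

# Usage from a script called by cron:
#   msg=$(ecc_check_once /var/tmp/ecc_check.state)
#   [ -n "$msg" ] && logger -t ECC_CHECK "$msg"
```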