Is Non-ECC Memory Corrupting My Files?
I turned my old gaming PC into a TrueNAS box. Seemed smart at the time. i5-12600K, 32GB of DDR4-3200, tossed in four 8TB drives in RAIDZ2. Should just work, right?
Except it didn’t.
Fresh files I’d copy over would get flagged “permanently damaged.” Videos mostly. They’d still play, but zpool status -v kept showing corruption. I’d fix it, scrub, re-copy, and the corruption would come right back.
I went down the TrueNAS self-hosting rabbit hole. Turns out TrueNAS really wants ECC memory, and I was running regular consumer DDR4. The theory: bit flips in RAM silently corrupt data before it even hits the disk. ZFS catches it, but can’t fix it. Makes sense, but I didn’t want it to be true.
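To make that theory a bit more concrete, here’s a toy Python sketch of what a single flipped bit does to a checksum. This is not how ZFS works internally: SHA-256 is just a stand-in I picked, and flip_bit is a made-up helper. It only shows why a checksum can notice damage without being able to undo it.

```python
import hashlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted (hypothetical helper)."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def checksum(data: bytes) -> str:
    # SHA-256 is an assumption for illustration, not ZFS's actual checksum.
    return hashlib.sha256(data).hexdigest()


# Stand-in for a chunk of a video file sitting in RAM.
original = b"pretend this is a chunk of a video file" * 100
damaged = flip_bit(original, 12345)

# One flipped bit out of thousands is enough to change the checksum.
print("checksums match:", checksum(original) == checksum(damaged))
```

The mismatch says something changed, but not which bit, so the checksum alone can’t repair anything.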
My motherboard doesn’t support ECC anyway. New board + ECC RAM meant several hundred bucks. New CPU on top if I couldn’t keep the 12600K. What started as “use the old gaming rig” was getting expensive.
So I tried the cheap fixes first:
- Disabled scrub tasks (pointless if RAM’s trashing data)
- Turned off XMP
- Underclocked memory from 3200 to 2400
- Started watching Craigslist and FB Marketplace
Then I actually tested the RAM.
Update 1: Memory’s bad. Gonna see if any sticks are salvageable.
Update 2: Tested each RAM stick individually. First three sticks? Failed in under a minute, 30+ errors each. Fourth stick ran 40 minutes, three full passes, zero errors. So now I’m running 8GB. Better than nothing. Checked prices: 2x16GB Micron DDR4-2400 ECC is $185. Oof.
Update 3: Bought the new RAM. Tested immediately with MemTest86, both sticks passed clean. Installed, deleted the corrupt files, restored from backup. Pools scrubbed with zero errors. Healthy now.
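If you want a second opinion on a restore beyond the scrub, something like this rough Python sketch compares every file in the backup against the restored copy by hash. The function names are my own, and the whole thing assumes the backup and the pool are both mounted as plain directories.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large videos don't need to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(backup_dir: Path, restored_dir: Path) -> list[str]:
    """List relative paths that are missing or differ in the restored copy."""
    mismatches = []
    for backup_file in sorted(backup_dir.rglob("*")):
        if not backup_file.is_file():
            continue
        relative = backup_file.relative_to(backup_dir)
        restored_file = restored_dir / relative
        # A missing file counts as a mismatch; otherwise compare hashes.
        if (not restored_file.is_file()
                or file_sha256(backup_file) != file_sha256(restored_file)):
            mismatches.append(str(relative))
    return mismatches
```

Calling find_mismatches(Path("/mnt/backup/videos"), Path("/mnt/tank/videos")) (both paths are placeholders) should return an empty list if the restore is clean.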
The lesson? Seven-year-old RAM was the culprit. Not “non-ECC is inherently dangerous,” just my specific sticks were dying and I blamed the wrong thing first. New RAM fixed everything.
Still probably going to grab ECC eventually. But for now, working memory beats broken memory.