Hacker News

>And yes, my system is all set up for ECC

From reading this, I guess one has to do some special setup to let a system use ECC?

Been thinking about ECC myself. What would I need to do, apart from buying the DIMMs and putting them in? Some BIOS settings? Jumper settings?



I've been through this ordeal recently, but I'm probably missing something anyway.

You need a compatible CPU, motherboard, and chipset. For mainstream CPUs: AMD's non-Pro Ryzen APUs don't support ECC, while the rest of AMD's CPUs and chipsets support it unofficially. You'll have to check the motherboard vendor's support page to see whether a given board supports ECC as well. Then you need ECC memory modules, and you should stick close to the qualified vendor list (QVL) here, since systems tend to be pickier about ECC memory. For Intel, you're out of luck except for the W680 chipset, and motherboards for it seem to be scarce.

For high-end desktop (HEDT) and workstation CPUs: AMD's Threadripper lineup has official ECC support, but still check with the motherboard vendor first. For Intel, most Xeons should do it, but check before you buy. The same caveat about motherboards applies here, too: check that the board supports ECC first and stick to the QVL to be safe.


Pretty much all Ryzen 3000 and 5000 CPUs support ECC. In the 4000 G series, only the Pro models do.

Here's the link: https://www.asus.com/support/FAQ/1045186/


Thank you! Great info.


Asus motherboards mostly offer ECC support with AMD CPUs:

https://rog.asus.com/forum/showthread.php?112750-List-Asus-M...

I just built a desktop machine with such a board, a matching Ryzen and 128GB of Kingston ECC memory. Works like a charm. The only problem is the on-board Intel Ethernet chip, which ignores Wake-on-LAN (although it's supposed to handle it), so I had to add a PCIe Ethernet card to get WoL running. Asus and Intel have apparently been arguing over whose fault it is for two years now, sigh.


You need system software to make it work right. You want to configure your system to halt as soon as possible after uncorrectable errors. You also need to prominently log correctable errors. How you achieve this is going to vary by hardware platform and operating system.


That makes sense, thanks! Ubuntu in my case atm.


It's going to vary based on whether you have a BMC or not. If you do, it's better to disable all the OS EDAC stuff and let the BMC handle it.

Otherwise, a good starting point would be to boot Linux with `mce=0` so it panics ASAP upon uncorrectable errors.
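To double-check that the option actually made it onto the running kernel's command line, a small sketch like this can help. It only parses the text of /proc/cmdline; the function name is made up for illustration, and what a given `mce=` value means depends on your kernel version, so check the boot-options documentation for the kernel you run.

```python
def find_mce_option(cmdline):
    """Return the value of the last mce=... token on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        if token.startswith("mce="):
            # Keep the last occurrence if the option was given more than once.
            value = token[len("mce="):]
    return value


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running Linux kernel was booted with.
    with open("/proc/cmdline") as f:
        option = find_mce_option(f.read())
    if option is None:
        print("no mce= option on the kernel command line")
    else:
        print(f"kernel booted with mce={option}")
```

If it reports no option, the bootloader config (on Ubuntu, usually GRUB) probably wasn't regenerated after the edit.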

In Ubuntu, there are the rasdaemon, mcelog, and edac-utils packages.
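For a quick look at whether the kernel sees ECC at all and is counting errors, you can also read the EDAC counters directly. This is a minimal sketch, assuming the EDAC driver for your platform is loaded and exposes memory controllers under /sys/devices/system/edac/mc; the function name and the `root` parameter are just for illustration, and it is not a replacement for rasdaemon's logging.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: (correctable, uncorrectable)} read from EDAC sysfs."""
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers without readable counters
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        # Assumption: an empty result usually means no EDAC driver is loaded
        # or ECC is not active; it does not prove either on its own.
        print("no EDAC memory controllers found")
    for name, (ce, ue) in counts.items():
        print(f"{name}: correctable={ce} uncorrectable={ue}")
```

If no controllers show up even though ECC DIMMs are installed, that's a hint to recheck the BIOS settings and the board's ECC support before trusting the setup.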


Cool, will look into that.


As for the BIOS settings (Asus motherboards for AMD) look under:

AMD CBS -> DDR4 Common Options -> Common RAS -> ECC Configuration


rdpintqogeogsaa hit the nail on the head.

Linus uses a Threadripper machine, which supports both ECC and non-ECC memory.

Most mobos support non-ECC, and maybe ECC if it's AMD and the vendor wired it up. (ASUS and someone else I can't recall seem to do so; Gigabyte does not appear to.)


Thanks! ASUS pro MB here (on the sole non-Apple device around here lol) so maybe it will work. Also will keep all this in mind when upgrading, which could be in the cards.



