Supermicro X7SBA and ECC RAM — part 1

Somewhat recently I purchased an X7SBA motherboard to replace my PDSMi+ system at home — more SATA ports and faster bus speed for the win!

Recently in the co-lo, we had a machine go belly-up due to bad memory (random segfaults and all kinds of other insanity). This is the first time in years I’ve actually experienced memory going bad (normally it’s hard disks or mainboards), so I was both irritated and excited. As a result, I’ve been going somewhat crazy purchasing ECC memory for all of our servers.

But first, I decided to purchase a Crucial CT2KIT25672AA800 kit for my X7SBA system at home (as a test). Pairs or “kits” consist of two identically matched DIMMs, in this case 2GB each. I always buy memory in pairs/kits because I don’t like playing “mix-and-match” and risking mismatched SPD timings (read some of my past blog entries for proof that this happens). Crucial exclusively uses Micron-brand RAM, and Supermicro’s site confirms that Micron works fine with both the X7SBA and the PDSMi+.

I installed the memory, and it worked… except for one oddity which bothered me.

During POST (and only POST!), the memory size reported was 4864MB. If I entered the BIOS, the status screen reported 4096MB installed with 4094MB usable (correct). If I let FreeBSD boot, it reported 4086MB usable (also correct), and the machine functioned fine (even with heavy memory I/O). So where was this 4864MB value coming from?

Before I go any further, it’s worth noting that 4864 - 4096 == 768. I’m almost positive this is the extra amount of RAM used for ECC bits, and that the memory scan during POST doesn’t properly account for ECC memory.
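To check that hunch against the numbers, here is a minimal Python sketch. It assumes the DIMMs follow the usual ECC layout of 8 check bits for every 64 data bits (72-bit modules); the names `POST_MB`, `BIOS_MB`, and `ecc_overhead_mb` are made up for illustration. Under that assumption the check bits for 4096MB would account for 512MB, so a straight 72/64 ratio does not fully explain the 768MB gap, and the ECC theory remains a guess until Supermicro weighs in.

```python
# Values reported on the X7SBA, in MB (taken from the observations above).
POST_MB = 4864  # shown during POST
BIOS_MB = 4096  # shown on the BIOS status screen

# Assumption: standard ECC DIMMs carry 8 check bits per 64 data bits.
DATA_BITS = 64
CHECK_BITS = 8


def ecc_overhead_mb(data_mb: int) -> int:
    """Raw storage spent on check bits for data_mb of usable memory."""
    return data_mb * CHECK_BITS // DATA_BITS


extra_mb = POST_MB - BIOS_MB            # what POST over-reports
expected_mb = ecc_overhead_mb(BIOS_MB)  # check-bit storage under the assumption

print(f"extra reported during POST: {extra_mb} MB")
print(f"check-bit storage at 72/64: {expected_mb} MB")
print(f"unexplained remainder:      {extra_mb - expected_mb} MB")
```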

Most system administrators, when installing/upgrading/changing RAM, pay close attention to the values shown during POST. I quickly concluded this was a BIOS bug, even though it did not impact functionality. I checked the online FAQ and the user manual — nada. So I wrote the following to Supermicro Technical Support:

Support,

On an X7SBA motherboard (BIOS rev 1.1), with ECC memory installed, during POST, the system claims to see 4864MB of memory.

However, once in the BIOS, it claims to see 4096MB (of which 4094MB is usable; this appears correct). The underlying OS sees 4086MB usable (also correct).

The memory is Crucial brand (e.g. Micron), and is stated to be 100% compatible with the X7SBA:

http://www.crucial.com/store/mpartspecs.aspx?mtbpoid=299FFA5FA5CA7304

This problem does not happen when non-ECC RAM is used.

The CPU being used is an Intel Core2Duo E8400.

Can you explain why the BIOS would print 4864MB, but only during POST? Is this a BIOS bug? I could not find any mention of it in the User Manual, or the Online Support FAQ.

I got a response a few days later (delay was due to the holiday):

Jeremy,

The memory is Kit Of 2 feature support, we do not test and recommend. Try with standard memory module.

Thank you

I quickly fired off a reply, still in shock over the ineptitude of the response I received.

Support,

I find your answer very strange, because the individual DIMMs that are used in a “Kit of 2” have the exact same timings and attributes as their standalone DIMMs:

Single: http://www.crucial.com/store/mpartspecs.aspx?mtbpoid=EBB973D6A5CA7304
Kit of 2: http://www.crucial.com/store/mpartspecs.aspx?mtbpoid=299FFA5FA5CA7304

Additionally, why would the BIOS (not during POST) and the OS see the correct amount of memory? Also, 4864-4096 = 768, which seems very suspicious — possibly the memory scan does not take into consideration ECC addressing?

However, as you request, I will try purchasing two individual 2GB DIMMs (above) from Crucial, and not in a “kit”, to see if the problem goes away.

I will get back to you when I’ve received individual DIMMs and tested them.

I did exactly as I told Supermicro I would: I purchased two individual DIMMs of the same model (see above), rather than a paired kit. They should arrive sometime this week.

I expect the result to be exactly the same as with the “kit”. If so, I will request that this be labelled a BIOS bug and will work with Supermicro to get it fixed. If POST instead reports 4096MB, I will be incredibly surprised.