Another reason not to run DD-WRT

Embarrassing is an understatement.

Tomato users are not affected. No idea regarding HyperWRT or Thibor.

Sebastian Gottschall’s statement, “consider that this exploit was released without any report to us”, is a miserable attempt at taking responsibility for the mistake. I have personally reviewed the DD-WRT source many times while working with WRT* routers — and like Busybox, it’s all duct tape and Bondo. The same applies to HyperWRT, though most of the trashy code there comes from the base source which is the responsibility of Linksys and their third-party vendor.

With regard to DD-WRT, I really don’t care if the exploit was released without any prior report. Consider doing security audits of your own code, and stop allowing patches with hacked-up solutions. Instead, stop and think about the change in its entirety before committing.

Marvell’s faulty 88SE9123 (SATA 6G) controller

Marvell’s new 88SE9123 SATA 6G controller is apparently responsible for a number of motherboard manufacturers having to re-engineer and re-produce their upcoming Intel P55-based boards:

I’ve now added Marvell to my list of manufacturers to avoid for storage. The list is now Silicon Image (31xx series only), JMicron (a.k.a. “GIGABYTE2” on Gigabyte boards), and Marvell. Keep ’em coming, and keep laying off your QA staff! *sigh*

Regarding the Asus P55 series, so far I’ve seen *three* separate photos of an Asus P7P55D board, and all are completely different SATA-connector-wise:

  • Board #1 — Intel connectors are red AND light blue, Marvell are white
  • Board #2 — Intel connectors are light blue, Marvell are white
  • Board #3 — Intel connectors are light blue, and I have no idea what’s on the brown SATA connector (7th SATA port)

Regarding a Gigabyte board (not sure what model): they planned on using two Marvell 88SE9123 controllers (each controller offers 2x SATA 6G ports), so in the PC Perspective photo of a Gigabyte board, the Intel P55 PCH SATA ports are light blue and the Marvell SATA ports are white.

I don’t know what MSI’s board looks like, so I don’t know what to tell people to avoid there.

Stick with the Intel P55 PCH SATA ports if you have one of these boards, and also disable the Marvell controller in the BIOS if possible. Take no chances.

2009/10/30 edit: Isn’t it funny how review sites don’t bother mentioning the problem? “Oops”.

My experience with an Intel X25-M and Windows XP

EDIT (June 18, 2009)

The friendly guys over at DSLReports informed me that to achieve proper SSD speed, the filesystem actually has to be properly aligned on a 4KB boundary — which Windows XP does not do out-of-the-box. Instead, you’re required to run diskpart.exe on an existing Windows machine with your SSD hooked up to achieve this. The folks over at the OCZ forums have put together quite a series of documents explaining the details. What’s stated there does make sense — remember, flash is memory, and memory alignment (yes, as in RAM!) is actually something kernels and even libc (on *IX) do to ensure performance. So this doesn’t surprise me, and at the same time, probably explains what I’m seeing.

But as my reply says, I feel this is even more evidence that existing software (OSes, applications) is not ready for SSDs yet, and the chances of me forgetting to do this the next time I format the drive are very high. It’s simply not worth all this rigmarole.
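To make the alignment requirement concrete, here is a minimal sketch of the check involved. It assumes 512-byte logical sectors and uses illustrative starting sectors: 63 is the classic default start for the first partition created by Windows XP setup, while 64 and 2048 are shown only for comparison. The function names are hypothetical and not part of any tool mentioned above.

```python
SECTOR_SIZE = 512   # bytes per logical sector (assumption: typical for this era of drive)
ALIGNMENT = 4096    # the 4KB boundary discussed above


def partition_offset_bytes(start_sector, sector_size=SECTOR_SIZE):
    """Byte offset of a partition that begins at the given sector."""
    return start_sector * sector_size


def is_aligned(start_sector, sector_size=SECTOR_SIZE, alignment=ALIGNMENT):
    """True if the partition's starting byte offset is a multiple of the alignment."""
    return partition_offset_bytes(start_sector, sector_size) % alignment == 0


if __name__ == "__main__":
    # 63 = usual XP default; 64 and 2048 are example aligned starts
    for start in (63, 64, 2048):
        offset = partition_offset_bytes(start)
        state = "aligned" if is_aligned(start) else "NOT aligned"
        print("start sector {:>4}: offset {:>8} bytes, {}".format(start, offset, state))
```

A partition starting at sector 63 begins 32256 bytes into the disk, which is not a multiple of 4096, so filesystem blocks can straddle the boundaries the SSD cares about.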

Below is the original portion of my blog post.

Last week I purchased an Intel X25-M (80GB), as I figured I was ready to plop down over US$300 on something that gave me incredible performance. The drive arrived earlier this week, and included firmware 8820 (saved me the trouble of having to update it). My workstation setup (disk/controller-wise) then became very simple:

  • Asus P5Q SE (Intel ICH10-based)
  • Intel X25-M, 80GB (SATA II)
  • Western Digital WD5000AAKS, 500GB (SATA II)
  • Pioneer DVD-ROM (SATA)

The WD drive was going to become D:, dedicated entirely to things like mass storage (movies, downloads, games, etc.), while the X25-M was going to be used as the boot/OS drive and for applications. Simple.

I reinstalled Windows a couple days ago onto the X25-M. The first thing I’ll say is that the installation was definitely faster than that of a hard drive, which is no surprise. The fastest part was once the GUI-based portion of the installer was up — that whole thing took maybe 8 minutes at most. Cool.

Then came booting into XP. Well, this is probably the most impressive thing I’ve ever seen: the time the BIOS finishes POST until the point I’m logged in and have a usable desktop is 6 seconds. That also includes the time it takes for me to type my password. Amazing.

Benchmarks on the X25-M are equally impressive. Read speeds are around 220-250MB/sec, while write speeds are around 50-60MB/sec (varies based on block size).

One oddity I did run into was during nVidia driver installation, where the installer suddenly spit out a message stating “Nvsvc could not be started; rebooting your computer will fix this problem”. I’ve never once seen that message on any non-SSD-based system I’ve installed nVidia’s drivers on, and I haven’t seen it since rebooting, but it was still strange. (EDIT: It appears this is a newly-introduced problem with nVidia’s last two driver releases, at least for GeForce 9 and Quadro NVS 440 cards. This problem has nothing to do with an SSD being used. :-) )

And then came what I dreaded… strange “stalls” which I couldn’t explain. These were rare; most of the time things opened quickly (blazingly fast), but there were times when things like Firefox took literally 6-7 seconds to launch.

So I started reading. And more reading. And even more reading. The more I read, the more I found troublesome annoyances. Here’s a list of the ones I’ve found:

Web browsers and SSDs don’t mix very well due to use of disk caching for visited pages. Apparently this thrashes too hard on the SSD, and bottlenecks start to appear. Firefox has some about:config parameters you can adjust to help minimise the pain, but the pain still exists. Then I read about how Firefox on an SSD was performing horribly due to use of the “phishing filter” (suspected attack sites and suspected forgery sites), so the solution was to disable those entirely. Specifically, this means setting these two about:config parameters:

  • browser.safebrowsing.enabled = false
  • browser.safebrowsing.malware.enabled = false
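If you would rather not click through about:config, the same two settings can be written as `user_pref` lines. The sketch below only generates the text; placing it in a `user.js` file inside the Firefox profile directory is my assumption about how you would apply it, and the helper name is hypothetical.

```python
import json

# The two preferences listed above, both turned off.
PREFS = {
    "browser.safebrowsing.enabled": False,
    "browser.safebrowsing.malware.enabled": False,
}


def user_js_lines(prefs):
    """Render each preference as a user_pref(...) line.

    json.dumps gives a double-quoted name and a lowercase true/false,
    which is the form these preference lines use.
    """
    return [
        "user_pref({}, {});".format(json.dumps(name), json.dumps(value))
        for name, value in prefs.items()
    ]


if __name__ == "__main__":
    print("\n".join(user_js_lines(PREFS)))
```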

Next came general Windows filesystem performance. I disable NTFS atime (access times) by default regardless of whether I’m using an SSD or a classic hard disk, so that’s not the problem. Occasionally small programs, like CCleaner, would take 2-3 seconds to start. What was the system spending its time doing? I checked Perfmon and wasn’t able to figure it out. It’s like the SSD was “too fast” for the system…

More reading determined I had to do all kinds of bizarre tweaks to the system, including massive registry changes. Here’s a list of what was recommended:

  • Disable paging file entirely on C: (SSD), and move it entirely to D: (hard drive)
  • Disable the Windows Indexing Service (which didn’t apply to me because the service wasn’t started to begin with)
  • Disable the Windows Prefetcher — HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\PrefetchParameters\EnablePrefetcher=dword:00000000
  • Stop Windows from relocating files required for quicker/faster booting to the front of the disk (improving boot times) — HKLM\Software\Microsoft\Dfrg\BootOptimizeFunction\Enable="N"
  • Stop Windows from relocating commonly-used files to sectors closer to the front of the disk (improving launch times) — HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\OptimalLayout\EnableAutoLayout=dword:00000000
  • Disable NTFS atimes (access times) — HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\NtfsDisableLastAccessUpdate=dword:00000001 (note: I do this on all my systems, so this really isn’t a negative against SSDs)
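For reference, the four registry changes in the list above could be collected into a single .reg file. This is only a sketch that renders the text of such a file; it does not touch the registry. The key paths and values come from the list, except that the HKEY_LOCAL_MACHINE root for the OptimalLayout key is my assumption, since the recommendation did not name a hive. The `TWEAKS` table and `render_reg` helper are hypothetical names.

```python
# (key path, value name, value data in .reg syntax)
TWEAKS = [
    (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager"
     r"\Memory Management\PrefetchParameters",
     "EnablePrefetcher", "dword:00000000"),
    (r"HKEY_LOCAL_MACHINE\Software\Microsoft\Dfrg\BootOptimizeFunction",
     "Enable", '"N"'),
    # Assumption: this key lives under HKEY_LOCAL_MACHINE.
    (r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\OptimalLayout",
     "EnableAutoLayout", "dword:00000000"),
    (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem",
     "NtfsDisableLastAccessUpdate", "dword:00000001"),
]


def render_reg(tweaks):
    """Return the text of a .reg file containing the given tweaks."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key, name, data in tweaks:
        lines.append("[{}]".format(key))
        lines.append('"{}"={}'.format(name, data))
        lines.append("")  # blank line between keys
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(render_reg(TWEAKS))
```

Keeping the changes in one file at least makes them easy to review, and easy to reverse by writing a second file with the original values.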

I also ran into one incident where HD Tach (benchmarking program) crashed in such a way that the process was left lingering in memory, chewing up 100% of one of my cores (i.e. on a quad-core, 25% CPU).

I did all of these things, and didn’t really see any form of improvement. Mysterious unexpected delays when launching things would still happen. CPU usage was not a problem; this was purely something going on behind-the-scenes as a result of the SSD being either too fast, or write operations which were “too” synchronous for the OS.

At this point, I will be returning the SSD to my place of purchase. I really love the no-moving-parts aspect, and I’m impressed by read speeds, but this degree of system tuning required just to get “decent performance” out of something that already HAS screaming performance is… well, not worth it IMHO.

What I’m saying is: SSDs are great, but software (OSes, applications, etc.) is obviously not ready for them. Most software has been engineered under the assumption that there’s a hard disk involved, and hopefully over the next 5-10 years that assumption will change.

I will say one thing about SSDs, though: these would be *perfect* for laptops. Those little stalls and so on are pretty much normal on laptops to begin with (I’m basing this on my experience with a Lenovo T60p Widescreen), so an SSD would decrease laptop heat by a large amount, and increase overall responsiveness. But for the kind of desktop usage I have? Well, the X25-M won’t cut it. And I refuse to “destroy” my Windows system with magical registry tweaks which impact the entire system (and not just a single filesystem).

Shuttle SG45H7 — Firewire bug in BIOS SG45U10O

EDIT 2009/11/12: Shuttle has released BIOS SG45S10S which encapsulates the fix mentioned below, though you won’t find mention of the problem in their ChangeLog. Thanks to George King for getting Shuttle to release an updated BIOS publicly!

Below is a copy of an Email I sent Shuttle yesterday, indicating that I had found a bug in their latest BIOS for the Shuttle SG45H7. The bug is described below.

From: Jeremy Chadwick
To: support@tw.shuttle.com
Subject: SG45H7 -- bug in BIOS version SG45U10O

Support,

There is a bug introduced in BIOS version SG45U10O (date 03/11/2009)
for the Shuttle SG45H7.  The bug is the following:

If the 1394 controller is Disabled in the BIOS, upon the next reboot,
the BIOS crashes/locks up (BIOS startup screen is never seen, POST never
happens).  Clearing the CMOS is the only way to get the machine usable
again.

Please see about fixing this.

If you need any other information from me (motherboard version, etc.)
just ask.

Thanks.

I received two responses from Shuttle within 24 hours. The first confirms the bug (nice QA!):

From: support S
To: Jeremy Chadwick
Subject: RE: SG45H7 -- bug in BIOS version SG45U10O

Dear Jeremy
Thank you choosing Shuttle.
Regarding your concern about SG45SH7 problem.
We found out the issue like what you described.
We are going to modify and release it as soon as we can.
Please feel free to let me know if you have any questions or concerns

Shuttle Inc.
Technical Support

And the second providing a beta BIOS — which I am not willing to try given that there are obviously other changes within (note the subrevision letter has gone from O to Q. What was P? Sorry, I don’t play roulette with my BIOSes):

From: support S
To: Jeremy Chadwick
Subject: RE: SG45H7 -- bug in BIOS version SG45U10O

Dear Jeremy
Thank you choosing Shuttle.
Attached is the BETA BIOS that can solve your problem, please try it.
Please feel free to let me know if you have any questions or concerns

Shuttle Inc.
Technical Support

Regardless, it’s good to see a company taking bug reports seriously. Now I’m left wondering if I should tell them about a bug in their ACPI DSDT which FreeBSD whines about upon boot-up. Hmm…

Supermicro X7SBA and ECC RAM — finale

Supermicro took my report seriously and filed a case on the problem.

A few days ago, a new BIOS version (v1.2a) appeared on Supermicro’s site:

http://www.supermicro.com/products/motherboard/Xeon3000/3210/X7SBA.cfm

I upgraded my X7SBA’s system BIOS to 1.2a, and the memory size reported during POST is now correct (4096MB). Hooray! Another case closed. :-)

Supermicro X7SBA and ECC RAM — part 1

Somewhat recently I purchased an X7SBA motherboard to replace my PDSMi+ system at home — more SATA ports and faster bus speed for the win!

Recently in the co-lo, we had a machine go belly-up due to bad memory (random segfaults and all kinds of other insanity). This is the first time in years I’ve actually experienced memory going bad (normally it’s hard disks or mainboards), so I was both irritated and excited. As a result, I’ve been going somewhat crazy purchasing ECC memory for all of our servers.

But first, I decided to purchase a Crucial CT2KIT25672AA800 kit for my X7SBA system at home (as a test). Pairs or “kits” consist of two identically-matched DIMMs, in this case, both 2GB in size. I always buy memory in pairs/kits, because I don’t like playing “mix-and-match” and risking SPD timings being different (read some of my past blog entries for proof that such happens). Crucial exclusively uses Micron-brand RAM, and Supermicro’s site confirms that Micron works fine with the X7SBA, and the PDSMi+.

I installed the memory, and it worked… except for one oddity which bothered me.

During POST (and only POST!), the memory size reported was 4864MB. If I entered the BIOS, the status screen reported 4096MB installed with 4094MB usable (correct). If I let FreeBSD boot, it reported 4086MB usable (also correct), and the machine functioned fine (even with heavy memory I/O). So where was this 4864MB value coming from?

Before I go any further, it’s worth noting that 4864 - 4096 = 768. I’m almost positive this is the extra amount of RAM used for ECC bits, and the memory scan during POST doesn’t truly detect ECC memory.
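As a quick sanity check of that arithmetic, here is a small sketch. It assumes the DIMMs are standard 72-bit ECC modules (64 data bits plus 8 check bits per word), which is my assumption and not something the BIOS reported. Under that assumption the check bits would account for 512MB rather than 768MB, so the ECC explanation should be treated as a guess until the vendor confirms it.

```python
POST_REPORTED_MB = 4864   # value shown during POST
INSTALLED_MB = 4096       # two 2GB DIMMs

# Assumption: 72-bit ECC DIMMs, i.e. 8 check bits for every 64 data bits.
DATA_BITS = 64
CHECK_BITS = 8


def surplus_mb(reported, installed):
    """How much more memory POST claims than is actually installed."""
    return reported - installed


def ecc_check_storage_mb(installed, data_bits=DATA_BITS, check_bits=CHECK_BITS):
    """Extra storage the ECC check bits would occupy for the installed data capacity."""
    return installed * check_bits // data_bits


if __name__ == "__main__":
    print("POST surplus:      {} MB".format(surplus_mb(POST_REPORTED_MB, INSTALLED_MB)))
    print("ECC check storage: {} MB".format(ecc_check_storage_mb(INSTALLED_MB)))
```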

Most system administrators, when installing/upgrading/changing RAM, pay close attention to the values shown during POST. I quickly concluded this was a BIOS bug, even though it did not impact functionality. I checked the online FAQ and the user manual — nada. So I wrote the following to Supermicro Technical Support:

Support,

On an X7SBA motherboard (BIOS rev 1.1), with ECC memory installed, during POST, the system claims to see 4864MB of memory.

However, once in the BIOS, it claims to see 4096MB (of which 4094MB is usable; this appears correct). The underlying OS sees 4086MB usable (also correct).

The memory is Crucial brand (e.g. Micron), and is stated to be 100% compatible with the X7SBA:

http://www.crucial.com/store/mpartspecs.aspx?mtbpoid=299FFA5FA5CA7304

This problem does not happen when non-ECC RAM is used.

The CPU being used is an Intel Core2Duo E8400.

Can you explain why the BIOS would print 4864MB, but only during POST? Is this a BIOS bug? I could not find any mention of it in the User Manual, or the Online Support FAQ.

I got a response a few days later (delay was due to the holiday):

Jeremy,

The memory is Kit Of 2 feature support, we do not test and recommend. Try with standard memory module.

Thank you

I quickly fired off a reply, still in shock over the ineptitude of the response I received.

Support,

I find your answer very strange, because the individual DIMMs that are used in a “Kit of 2” have the exact same timings and attributes as their standalone DIMMs:

Single: http://www.crucial.com/store/mpartspecs.aspx?mtbpoid=EBB973D6A5CA7304
Kit of 2: http://www.crucial.com/store/mpartspecs.aspx?mtbpoid=299FFA5FA5CA7304

Additionally, why would the BIOS (not during POST) and the OS see the correct amount of memory? Also, 4864-4096 = 768, which seems very suspicious — possibly the memory scan does not take into consideration ECC addressing?

However, as you request, I will try purchasing two individual 2GB DIMMs (above) from Crucial, and not in a “kit”, to see if the problem goes away.

I will get back to you when I’ve received individual DIMMs and tested them.

I did exactly as I told Supermicro I would. I purchased two individual DIMMs of the same model number (see above), rather than in a paired kit. They should arrive sometime this week.

I expect the result to be the exact same as the “kit”, and if that’s the case, I will be requesting this be labelled as a BIOS bug and will work with Supermicro to fix the problem. In the case the BIOS POST reports 4096MB, I will be incredibly surprised.

Noctua NH-C12P incompatibilities

I don’t have the time to provide full details at this point, so here’s the simple version:

The Noctua NH-C12P is incompatible in numerous ways with an Antec P182 case, and equally incompatible with Asus P5Q SE and Asus P5Q SE/R motherboards (despite Noctua stating otherwise on their site).

I have videos documenting the problems, but need time to do a full write-up as well as edit the videos + upload them to WordPress.

Buyer beware: if you’re looking for a good heat sink for your new Asus P5Q SE board, or wanting to upgrade to a new sink while using an Antec P182, avoid the NH-C12P.

I have no idea if the NH-U12P will work, but based on the similarities to the NH-C12P, I would have to say no, it probably won’t work.

VIA Quits Motherboard Chipset Business

Best news I’ve read in months:

http://hardware.slashdot.org/article.pl?sid=08/08/11/1226221&from=rss

Now for Realtek to disappear…

Foxconn and slave labour

I normally don’t post (or even believe in) things like this, because I’ve always felt boycotting something or someone is often done for the wrong (or extremist) reasons.  But in this case, for me, I feel I’m justified:

http://www.msnbc.msn.com/id/13357555/

Safe to say I will never purchase a Foxconn motherboard.

I won’t tell readers to follow my footsteps, because I recommend everyone think for themselves.  But buying motherboards is something I do often; I purchase what suits my needs, tastes, and what meets my pre-requisites.  Foxconn products never have, so this is more or less icing on the cake.

I’m well aware there are worse examples of slave labour throughout Asia, but I have to be somewhat realistic and make my choices based upon what I feel is right for me.  For example, would I buy a Nike shoe?  Yes, assuming it felt more comfortable on my foot than other shoes I’d tried, and if (for the price) the shoe was well-made (I’m a guy — I hate buying shoes.  One pair should last me 5 years! ;) ).  But do I go into a shoe store or into Amazon’s footwear section and specifically purchase Nike shoes?  Absolutely not — like motherboards, I’ll purchase whatever meets my needs.  Would I buy a mink coat?  No, because mink feels fucking weird to me, and such things are horrendously overpriced.  Do I eat meat?  Hell yes, because it tastes good (I like tofu too, for what it’s worth).  You get the idea.

P.S. – I don’t own an iPod, and I likely never will — but not because of the above.

Supermicro PDSMi+ BIOS bugs — finale

A few weeks ago, I received a final statement from Supermicro regarding the bugs I had reported: both are indeed bugs.  Here’s the official word:

We’ve already had a fix for your Kingston flash drive, a Beta version BIOS is attached for floppy BIOS flash disk.  …

For the Sandisk flash drive, our BIOS engineers find the data code which might be embedded in the firmware is kind of old. We may fix it in the future, but we don’t have a date for now.

The Sandisk flash drive we bought two months ago has no problem. So we recommend to contact the Sandisk to replace a new one or you can buy a new one. …

I consider this reasonable/acceptable support, so thumbs up/kudos to Supermicro for taking my claims seriously.  I haven’t had a chance to test the beta BIOS I was given.  And no, I will not provide it here, simply because the vendor should take the responsibility of releasing a new BIOS themselves (plus I don’t want to upset Supermicro).

Here are the exact model numbers of both the Kingston and SanDisk drives, so that anyone who comes across this post of mine will (hopefully) be in better shape afterwards:

  • Kingston DataTraveler 100, 4GB — no external serial or model number
  • SanDisk Micro Cruzer w/ U3, 2GB — model SDCZ6-2048RB

In the case of the 2GB SanDisk Micro Cruzer, Supermicro recommends users contact SanDisk for a replacement.

I’ve purchased an 8GB SanDisk Micro Cruzer (model SDCZ6-8192RB), which has absolutely no problems booting on a PDSMi+ using BIOS v1.3.  So Supermicro’s claim is accurate: SanDisk has some products which contain buggy firmware/boot code.