Firmware Installation

Now that the hardware side of the drive is ready, it's time to put some intelligence (the firmware) inside. 

The firmware download is done by custom PC setups that consist of normal PC hardware (if you look closely, you can see ASUS' logo on a motherboard or two) running some sort of a Linux distro with OCZ's custom firmware download tool. If you zoom into the monitor you can see that in this case the system is applying firmware to 240GB ARC100 drives.

Once the firmware has been loaded, the drives move on to run-in testing. OCZ has developed a custom script that writes and reads all LBAs eight times to identify bad blocks. If a drive has more bad blocks than a preset threshold allows, it is pulled from the line and either fixed or destroyed. The scripts also test performance using common benchmarking tools (e.g. AS-SSD and ATTO) to ensure that all drives meet the spec.
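To make the pass/fail logic of run-in testing concrete, here is a minimal sketch in Python. It is not OCZ's script, which we did not get to see; the `SimulatedDrive` class, the `run_in_test` function, the failure model and the test pattern are all hypothetical. The only details taken from the tour are the eight full write/read passes and the bad block threshold.

```python
from dataclasses import dataclass, field

PASSES = 8  # eight full write/read passes, as described in the article


@dataclass
class SimulatedDrive:
    """Toy stand-in for an SSD (hypothetical): a list of one-byte blocks."""
    num_lbas: int
    faulty_lbas: set = field(default_factory=set)

    def __post_init__(self):
        self.blocks = [0] * self.num_lbas

    def write(self, lba, value):
        # Assumed failure model: a faulty block stores a corrupted value.
        self.blocks[lba] = value ^ 0xFF if lba in self.faulty_lbas else value

    def read(self, lba):
        return self.blocks[lba]


def run_in_test(drive, bad_block_threshold, passes=PASSES):
    """Write and read back every LBA `passes` times and collect mismatches.

    Returns (passed, bad_lbas). The drive passes if the number of bad
    blocks does not exceed the threshold.
    """
    bad = set()
    for p in range(passes):
        # Write a pattern that changes with every pass and LBA.
        for lba in range(drive.num_lbas):
            drive.write(lba, (lba + p) & 0xFF)
        # Read everything back and compare against the expected pattern.
        for lba in range(drive.num_lbas):
            if drive.read(lba) != (lba + p) & 0xFF:
                bad.add(lba)
    return len(bad) <= bad_block_threshold, sorted(bad)


if __name__ == "__main__":
    drive = SimulatedDrive(num_lbas=1024, faulty_lbas={3, 500})
    passed, bad = run_in_test(drive, bad_block_threshold=4)
    print("PASS" if passed else "FAIL", bad)
```

A real script would issue IO to an actual block device and would also log performance figures, but the decision at the end is the same: count the bad blocks and compare against the threshold.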

Currently OCZ has two different test setups. One half of the test systems are regular PCs that are very similar to the firmware download systems, whereas the other half are custom racks pictured above. OCZ is looking to move all testing to rack-based cabins since one cabin can simultaneously test 256 drives, which is far more efficient than having dozens of PC setups around that can only test a handful of drives each at a time. The test regime is the same in both cases, so it's purely a matter of space and labor efficiency.

At the moment SATA-based drives are tested through the host, which means that the IO commands are sent by the host, similar to how we test SSDs. For PCIe drives, however, OCZ is developing a Manufacturing Self Test (MST), which is essentially a custom firmware that is loaded into the drive and then reads and writes all LBAs to test for bad blocks. The benefit of MST is that it bypasses the host interface (i.e. all IO commands are generated by the controller/firmware), which makes the test cycle faster because the host overhead is removed.

Additionally, every month a sample of finished drives goes through a more rigorous set of tests called Ongoing Reliability Testing (ORT) to ensure that nothing has changed in production quality. ORT consists of a Thermal Cycle Test (TCT), where the drive is subjected to thermal shocks to validate the quality of manufacturing, and a Reliability Demonstration Test (RDT), where drives are tested at an elevated temperature (~70°C) to demonstrate that the mean time between failures (MTBF) meets the specification.

The run-in testing hasn't changed much since Toshiba took over, but Toshiba did help OCZ align with its quality standards. All the processes running today have been inspected by Toshiba and meet the strict standards set by the company. Note that the purpose of run-in testing isn't to screen for firmware bugs, but to ensure that the hardware is functional. Firmware development and validation are done before mass production begins, and since Toshiba took over, OCZ has modified its development process to increase the quality and reliability of its products.

OCZ's whole philosophy has actually changed since the previous CEO left the company. In the past OCZ always tried to be first to market at any cost and tried to cover every possible micro-niche, which resulted in too many product lines for the resources OCZ had. Nowadays OCZ is putting a lot of effort into product qualification and it no longer has a dozen products in development at the same time, meaning that there are now sufficient resources to properly validate every product before it enters mass production.

The run-in testing may seem light with only eight full LBA read/write spans, but honestly I don't think it's necessary to hammer a drive for days because any apparent hardware flaw should surface very quickly. Basically, the hardware either works or it doesn't, and once the drive leaves the factory it's more likely to fail due to a firmware anomaly than a physical hardware failure.

64 Comments

  • fatpugsley - Wednesday, May 20, 2015 - link

    A root GUI with presumably obsolete stock Fedora doesn't seem very secure to me.
  • Stahn Aileron - Thursday, May 21, 2015 - link

    Doesn't really matter if the only network they're connected to is an internal network without internet access. The only thing those workstations should be touching and working on are SSDs. There's minimal security needs there. The best and simplest security measure there would just be an air gap.
  • dreamslacker - Thursday, May 21, 2015 - link

    It's not as uncommon as you might think. Most of the machines are effectively on an 'intranet' so they really only need a UI and kernel base that has reasonably good OoB driver support for commodity hardware and the dependencies for the specific in-house software suite.

    When I worked a short project on similar lines in Seagate back in the past, their machines ran CentOS 4.4 (at the time) and they started off that line-up on RHEL before moving to Fedora Core and then to CentOS 4.4. They were planning a move back to Fedora after that though.
  • der - Wednesday, May 20, 2015 - link

    They're made to they can dig deep to our pocket monies.
  • Scott_T - Wednesday, May 20, 2015 - link

    Next time make us a movie in the format of 'How its Made'
  • MrSpadge - Wednesday, May 20, 2015 - link

    I'd rather read an article at my own pace.
  • jann5s - Wednesday, May 20, 2015 - link

    Cool stuff Kristian!
  • YoloPascual - Wednesday, May 20, 2015 - link

    Wow a field trip.
  • OzzieGT - Wednesday, May 20, 2015 - link

    That poor soul who has to apply labels all day...
  • junky77 - Wednesday, May 20, 2015 - link

    this fake smile.. seems like all the people interviewed on AT and such have this smile lol