Western Digital Reimagines HDD - Flash Integration with OptiNAND
by Ganesh T S on August 31, 2021 12:00 PM EST
The last few years have seen plenty of new innovations come up in the hard-disk drive market. For quite some time, the HDD technology roadmap was shared industry-wide - vendors introduced new technologies at different points in time, but they were all similar in nature. As a recent example, HGST (now, Western Digital) was the first to market with helium-filled HDDs, but both Seagate and Toshiba followed up with similar drives within a few years.
Prior to 2017, there was consensus that heat-assisted magnetic recording (HAMR) would help drive the increase in storage density for HDDs after traditional perpendicular magnetic recording (PMR) ran out of steam. Western Digital sprang a surprise in Q4 2017 by announcing the decision to go with microwave-assisted magnetic recording (MAMR) for future HDDs. Seagate, meanwhile, has been all-in on HAMR and has launched 20TB HDDs based on the technology for enterprise customers (those HAMR drives are yet to hit retail, though). Western Digital, for its part, promised MAMR for 16TB+ HDDs, but eventually backtracked in favor of energy-enhanced PMR (ePMR). Toshiba, on the other hand, introduced flux control-MAMR (FC-MAMR) in its MG09-series of enterprise 16TB and 18TB HDDs.
At the HDD Reimagine event today, Western Digital is introducing OptiNAND - a novel architecture that integrates an iNAND UFS embedded flash drive (EFD) on the drive's mainboard.
In conjunction, the company is also announcing that it has been sampling its first 20TB non-SMR drives based on OptiNAND-enabled ePMR to select customers, and that it would be adopting the OptiNAND platform moving forward for all 20TB+ HDDs. The company also sees a path to 50TB OptiNAND-enabled ePMR drives in the second half of the decade.
While the company did not quantify the amount of NAND in its OptiNAND drives, they are stressing the fact that it is not a hybrid drive (SSHD). Unlike SSHDs, the OptiNAND drives do not store any user data at all during normal operation. Instead, the NAND is being used to store metadata from HDD operation in order to improve capacity, performance, and reliability.
Western Digital's OptiNAND announcement also conveys the fact that their 20TB 9-platter HDDs will continue to use energy-enhanced PMR (ePMR). In addition to the use of a triple-stage actuator to enable more accurate positioning of the heads over the tracks, the OptiNAND aspect is being touted as the key to enabling 2.2TB capacity for each platter.
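As a rough check on these figures, the short Python sketch below multiplies the platter count by the per-platter capacity quoted above. The function name is purely illustrative, and the 2.2TB value is the approximate figure from the announcement rather than an exact specification.

```python
# Figures quoted in the article: a 9-platter drive at roughly 2.2 TB per platter.
PLATTERS = 9
TB_PER_PLATTER = 2.2


def drive_capacity_tb(platters, tb_per_platter):
    """Raw capacity estimate: platter count times per-platter capacity."""
    return platters * tb_per_platter


total_tb = drive_capacity_tb(PLATTERS, TB_PER_PLATTER)
print(f"{PLATTERS} platters x {TB_PER_PLATTER} TB = {total_tb:.1f} TB")
```

Nine platters at about 2.2TB each land just under the 20TB mark, which is consistent with the advertised capacity once rounding in the per-platter figure is taken into account.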
The increase in areal density is being achieved by cramming the tracks on the platter closer together (increased TPI), while also moving some of the metadata (both factory-generated and generated during user operation) from the platter to the NAND. In particular, Western Digital made a mention of the repeatable run out (RRO) data, which records the head jitter / position error as the spindle revolves. This data (running into multiple gigabytes) is generated in the factory during manufacturing. It is typically stored on the disk, taking up space that could otherwise have been used for user data. The OptiNAND architecture moves this to the NAND in the EFD.
One of the key challenges to packing tracks closer together is 'adjacent track interference' (ATI): data in a track can get corrupted by writes to adjacent tracks, so it needs to be refreshed periodically. Present-generation HDDs trigger these refreshes on a track-by-track basis, based on a record of write operations kept at the track level. One of the downsides to increasing areal density by increasing the TPI is the need for more frequent refreshes. From refreshing once in 10000 write operations in early HDDs, the narrow tracks now need to be refreshed as frequently as once every 6 writes. Beyond a certain point, it doesn't make sense to increase TPI any further because the increased frequency of ATI refreshes has an extreme impact on performance. The OptiNAND architecture allows the write operations to be recorded at the sector level. This means that the refresh operations are more spread out both temporally and spatially, allowing the tracks to be packed closer together without sacrificing performance. In turn, this increases the areal density.
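To illustrate why finer-grained bookkeeping helps, here is a toy Python model that compares track-level and sector-level write counting. It is only a sketch under stated assumptions: the function name, the 8-sectors-per-track geometry, and the workload are hypothetical, and nothing here reflects Western Digital's actual firmware. The only number taken from the article is the refresh threshold of 6 writes.

```python
from collections import defaultdict

SECTORS_PER_TRACK = 8   # toy geometry, an assumption for illustration
REFRESH_THRESHOLD = 6   # the article's "once every 6 writes" figure


def refresh_cost(writes, per_sector):
    """Return how many sectors get rewritten by ATI refreshes.

    writes: iterable of (track, sector) write locations.
    per_sector: if True, interference is counted per neighbouring sector;
                otherwise it is counted per neighbouring track.
    """
    counters = defaultdict(int)
    rewritten = 0
    for track, sector in writes:
        for neighbour in (track - 1, track + 1):
            if neighbour < 0:
                continue  # no track below track 0 in this toy model
            key = (neighbour, sector) if per_sector else neighbour
            counters[key] += 1
            if counters[key] >= REFRESH_THRESHOLD:
                counters[key] = 0
                # Sector-level tracking refreshes one sector;
                # track-level tracking has to refresh the whole track.
                rewritten += 1 if per_sector else SECTORS_PER_TRACK
    return rewritten


# Hypothetical workload: sectors 0..3 of track 10 are each written 6 times.
workload = [(10, s) for s in range(4)] * 6

track_level = refresh_cost(workload, per_sector=False)
sector_level = refresh_cost(workload, per_sector=True)
print(f"track-level tracking rewrites {track_level} sectors")
print(f"sector-level tracking rewrites {sector_level} sectors")
```

In this toy workload, track-level counting refreshes both neighbouring tracks in full four times over, while sector-level counting rewrites only the sectors that actually sat next to the written data. The absolute numbers mean little, but the ratio shows why recording writes at a finer granularity reduces refresh traffic.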
Consumers can operate HDDs with the write cache in the device enabled or disabled. Irrespective of the cache setting, the HDD has to buffer the incoming data. In the disabled case, the amount of data that can be buffered depends on the amount of data that can be safely flushed out to non-volatile storage in an emergency power-off (EPO) situation. The presence of significant NAND capacity in the HDD means that the drive can use the rotational energy present in the platters to flush more of the data in the DRAM into the NAND (present-day HDDs dump the DRAM data into serial flash, around a couple of MBs worth, in an EPO situation). The ability to buffer more data in this case means that the performance of the write-cache enabled and write-cache disabled cases approach each other in OptiNAND-enabled HDDs.
Western Digital also claims that the 'write cache enabled' case can benefit on the performance front. This is an indirect result of the reduced refresh rates (referencing the observations in the previous sub-section on how OptiNAND handles adjacent-track interference) that allows the HDD to spend more time in servicing user data requests. Again, there was no quantification of the improvement in IOPS for different access patterns over non-OptiNAND HDDs in Western Digital's event.
The aspects of OptiNAND used to enhance the performance of the drives in the write caching disabled state also contribute to enhancing their reliability under EPO conditions. By including faster non-volatile storage compared to serial flash, Western Digital claims that up to 50x more data can be flushed out compared to previous-generation HDDs.
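The relationship between the EPO flush budget and the usable write buffer can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not a description of the drive's firmware: the helper names are made up, and the 2MB baseline is an assumed reading of the "couple of MBs" of serial flash mentioned earlier. The 50x multiplier is Western Digital's "up to" claim.

```python
SERIAL_FLASH_FLUSH_MB = 2   # assumption: "a couple of MBs" taken as 2 MB
OPTINAND_MULTIPLIER = 50    # Western Digital's "up to 50x" claim


def max_flush_mb(baseline_mb, multiplier):
    """Upper bound on data that can be saved during an emergency power-off."""
    return baseline_mb * multiplier


def can_accept_write(buffered_mb, incoming_mb, flush_budget_mb):
    """With write cache disabled, only buffer what could survive an EPO."""
    return buffered_mb + incoming_mb <= flush_budget_mb


optinand_budget_mb = max_flush_mb(SERIAL_FLASH_FLUSH_MB, OPTINAND_MULTIPLIER)
print(f"EPO flush budget with NAND: up to {optinand_budget_mb} MB")

# A 1 MB write arriving with 1.5 MB already buffered:
print(can_accept_write(1.5, 1.0, SERIAL_FLASH_FLUSH_MB))  # serial flash budget
print(can_accept_write(1.5, 1.0, optinand_budget_mb))     # NAND budget
```

Under these assumptions the flush budget grows from about 2MB to as much as 100MB, so a drive running with its write cache disabled can keep far more data in flight before it has to wait for the platters.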
Western Digital claims that the vertical integration possible with the HDD technology from the WD / HGST side along with the flash technology from the SanDisk side is essential for the creation of a platform like OptiNAND.
There is bound to be a cost premium associated with the drives due to the NAND integration. New recording technologies (like HAMR and MAMR) require significant investment into the design of the recording heads as well as platters, and need to be revamped every few generations. On the other hand, technologies like OptiNAND are independent of the underlying recording technology.
Without exact quantification of the increase in areal density enabled by OptiNAND, it is not possible to provide comparative comments on the Capacity aspect of Western Digital's OptiNAND trifecta - except that the company is now able to introduce 20TB hard drives (around 2.2TB/platter) to the market with the same ePMR technology used in its 18TB drives.
The Performance aspect should be easier to evaluate when OptiNAND drives hit retail. While the benefits for the 'write caching disabled' case (where the NAND can act as a safe cache in an EPO situation) are easy to verify (essentially acting the same as the 'write caching enabled' case), the pure 'write caching enabled' case should be much more interesting to analyze against competing drives of the same capacity.
Western Digital indicated that all of their 20TB+ HDDs moving forward will be OptiNAND-enabled. This will be across all market verticals - cloud deployment, enterprise drives (Gold), storage for surveillance recording (Purple line), and NAS (Red line). It must be noted that the company has a 20TB SMR drive already in the market that is not OptiNAND-enabled. The new HDD architecture with its flexible SoC and high-performance NAND integration can also be used to enable customer-specific enhancements in the future. The ability to use the NAND to dynamically remap sectors can increase areal density and improve performance much more in SMR drives. Based on this, we can expect OptiNAND-enabled SMR drives to gain a more significant capacity advantage over CMR drives than what is currently seen in the market.
The HDD industry is not yet in dire need of CPR, but Western Digital's usage of OptiNAND to address the Capacity, Performance, and Reliability trifecta is yet another unique aspect in the innovation-rich hard-disk drive market. Western Digital has both HDD and complete flash technology (from NAND fabrication to controller) in-house, while the other HDD vendors do not have that advantage. As such, it might take the other vendors some time to catch up on the advantages of using NAND for HDD metadata.
Source: Western Digital
flyingpants265 - Wednesday, September 1, 2021
So buy two? 600TB is really small for a distributed database anyway.
Kamen Rider Blade - Tuesday, August 31, 2021
Don't forget to implement multi-Actuator on a single Actuator Arm and Dual Actuators in the same 3.5" body.
I see enough empty volume that with some re working, you could fit in 2x Actuator Arm Stacks with multi-actuators per arm leading to huge Linear R/W performance increases.
ganeshts - Tuesday, August 31, 2021
Dual actuators are already in production in some of Seagate's drives. I think all vendors will eventually adopt the scheme in order to scale sequential bandwidth / IOPS with capacity. For now, WD believes it is not cost-effective to implement dual actuators outside their R&D labs.
WD does have the triple-stage actuator in their single arm that enables finer positioning (which in turn means tracks can be bunched even closer together) in order to increase areal density.
Kamen Rider Blade - Tuesday, August 31, 2021
I've seen smaller HDDs with smaller base motors and housings.
Given the volume in a standard 3.5" HDD, you can easily fit at least 2x sets of arms.
If you shrink the motors for the arms and the base controller stack, you can get 4x stacks of arms.
But that requires them to re-engineer the stack for volumetric efficiency.
Then stack on Multi-Actuator per Stack.
Eventually, you might be able to get it down to where each Actuator arm is fully independent of the others and have massive parallelism.
But that requires even more engineering.
flyingpants265 - Wednesday, September 1, 2021
Maybe I'm missing something, but we should be going back to 5.25" as it gives a large (50%?) surface area boost per platter, and the Google hard drive study suggested thicker drives with many more platters per drive. With this we're already looking at 40TB drives easily. Lowering rotational speed is no big deal.
The_Assimilator - Friday, September 3, 2021
Good luck finding a 5.25" bay on modern desktop chassis.
PeachNCream - Sunday, September 5, 2021
A quick Amazon search yields a fair number of cases that feature 5.25" bays. I must be very lucky. I'd better go buy lottery tickets or something given my luck is so good.
ballsystemlord - Tuesday, August 31, 2021
So, will the iNAND solve the SMR delayed write problems?
I don't claim this article said anything about it. I'm curious, because that's the major problem with SMR and if the heads had to move less because you have the metadata in iNAND that would at least help things.
rygaroo - Tuesday, August 31, 2021
The article mentions potential SMR capacity improvements, but doesn't mention performance. My guess is that it would provide a negligible boost. Once your drive fills up and your write cache fills up, every random write still requires reading several tracks + writing back several tracks. I'm not sure how many consecutive tracks are shingled in each shingled region... but let's say 5 for an example, so 10 revolutions per write at 7200rpm is ~83ms per write or 12 IOPS. Double the tracks per shingled region to get more capacity benefit from the shingling and you halve the random write IOPS. A larger write cache might help you get in a few more performance writes before you hit the read-modify-write wall, but you'll still hit it eventually. SMR is pretty much not an option if you have a random write heavy workload. If any of my assumptions are flawed, please let me know.
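The back-of-the-envelope estimate in the comment above can be reproduced with a short Python sketch. It uses the commenter's own simplifying assumptions (one revolution to read each track in a shingled region and one more to write it back, with seek time and caching ignored); the function name and the 5-track region size are illustrative, not drive specifications.

```python
def smr_random_write_cost(tracks_per_region, rpm=7200):
    """Estimate worst-case read-modify-write cost for one random write.

    Assumes one revolution to read each track in the shingled region and
    one more to write it back; seeks and caching are ignored.
    Returns (latency in milliseconds, random write IOPS).
    """
    revolutions = 2 * tracks_per_region
    seconds_per_revolution = 60.0 / rpm
    latency_s = revolutions * seconds_per_revolution
    return latency_s * 1000.0, 1.0 / latency_s


for tracks in (5, 10):
    latency_ms, iops = smr_random_write_cost(tracks)
    print(f"{tracks} tracks/region: ~{latency_ms:.0f} ms per write, ~{iops:.0f} IOPS")
```

With 5 tracks per region this gives roughly 83ms and 12 IOPS, matching the figures in the comment, and doubling the region size to 10 tracks halves the IOPS to about 6.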
abufrejoval - Wednesday, September 1, 2021
The only flaw would be to put a random write heavy workload on HDDs these days: that's become "you're using it wrong".
The whole SMR debate really mostly centers around the RAID/ZFS rebuilds where you couldn't really avoid it and the upper layers (controller and FS) didn't know about SMR.
If they were SMR aware, rebuilds might perhaps be done at shingle granularity and here big flash (perhaps better outside the drive) would help.
Hardly any storage is designed to support constant maximum write pressure, because there isn't much you can save with optimizations. NAND could mostly just push the first level "SLC cache" for SMR much further so the SMR write amplification becomes less noticeable in "average scenarios".
If they use NAND to implement the full SMR write buffers, they better have lots of really good NAND. And I'd probably want a fail-back mode, where exhausted NAND just gets bypassed and won't result in a complete drive failure.