Saturday, August 22, 2020

Farewell: Hard Disk Drives

It has been so many years already - my first HDD was a CONNER CFS-420A with a capacity of 420 MB. That would have been around 1994. I can still remember that exciting moment - what an upgrade from 1.44MB floppy disks!

More than 25 years later, I have now decided to phase out all HDDs in my system. I had even bought a few 12TB Seagate Exos drives for my next project, but now they will be sold before being put to real use.

My experience with HDDs may differ quite a bit from most people's: I have always cared about reliability and durability rather than focusing mainly on capacity. If you have ever looked at the specifications of decent HDDs, you may have noticed one thing: the Unrecoverable Read Error (URE) rate has stayed at 1 error per 10^14 bits read (consumer parts) and 1 per 10^15 bits read (server parts) for many years without improvement.

Yes, right, who really cares about these numbers? They are just indicators, and newer technology should always be more reliable, right? Unfortunately, my personal belief is that if a number is published that way, it MUST have a meaning. Consider other common storage technologies, such as SSDs (10^17 for enterprise parts) and tapes (10^17 - 10^19): what I can say is that HDDs have frighteningly LOW reliability nowadays.

Imagine a 1 TB HDD and a 10 TB HDD, both with a URE rating of 10^15. During a RAID rebuild, the 10 TB one has roughly 10 times the chance of encountering a URE. A 10^15 rating means you *might* encounter a Non-Recoverable Read Error for every 125 TB of data read (10^15 bits = 125 TB), which is not a lot for today's large capacity drives. If I use 10x 12TB Exos HDDs to build a RAID 6 array, a read error during a rebuild becomes quite likely - which is exactly what put me off my original plan. I just don't want to put my data at risk.
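To put rough numbers on this, here is a minimal Python sketch. It assumes a deliberately simple model that is not part of any drive specification: every bit read fails independently with probability 1 in 10^15, and a rebuild reads the 9 surviving 12 TB drives in full. The function name and the 9-drive read volume are my own illustration; real drives and RAID controllers behave in more complicated ways, so treat the result as an order-of-magnitude estimate only.

```python
import math

def p_at_least_one_ure(bytes_read: float, bits_per_error: float) -> float:
    """Chance of seeing at least one URE while reading `bytes_read` bytes.

    Assumption (simplified model): each bit fails independently with
    probability 1 / bits_per_error, e.g. bits_per_error = 1e15.
    """
    bits = bytes_read * 8
    # 1 - (1 - 1/N)^bits, written with log1p/expm1 for numerical stability
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))

TB = 1e12  # decimal terabyte, as used on drive labels

# Hypothetical rebuild: 10x 12 TB array, one drive lost, 9 drives read in full
rebuild_read = 9 * 12 * TB

for rating in (1e14, 1e15, 1e17):
    p = p_at_least_one_ure(rebuild_read, rating)
    print(f"rating 10^{int(math.log10(rating))}: P(at least one URE) = {p:.4f}")
```

Under these assumptions, reading 108 TB at a 10^15 rating gives better than even odds of hitting at least one URE, while the same read at 10^17 stays below 1%. Note that this only estimates the chance of a read error occurring, not the chance of losing data: RAID 6 with a single failed member still has one parity left to correct it.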

That's also the reason why my 16-HDD RAID 60 array served me well in the past: each member was only 1TB in size. With a 10^15 URE rating, the array was still relatively safe, and it has proved itself. However, with 12TB drives, a 10^15 URE rating is DANGEROUS and makes no sense for traditional RAID technology.

Also, think about the rebuild time - with my 16-HDD RAID 60, even with a 1TB drive size, a rebuild took around 4 hours with nobody using the array. How about a 12 TB HDD? Well, if you are lucky, you can have the array rebuilt in 2 days (provided nobody is using it). Otherwise, it could be "weeks" - and the longer it takes, the more likely another unrecoverable read error will happen. (And another one!)

You may say RAID is now out of date. But unfortunately it is the opposite - HDDs are now out of date. I now have an array of 2TB SSDs in RAID 6, and I feel far more comfortable with their 10^17 URE rating. The largest SSD I would accept in the array is 15.36TB - anything bigger and I would want a 10^18 URE rating.

HDDs do have one good characteristic: offline durability. This matters most for offline data storage. Think about the data retention of an SSD that has been completely powered off: after the Intel 320 array incident that happened to me in the past (data gone after being powered off for more than a month), I think you can expect server-class SSDs to hold data offline for 1 year at most without problems, but don't expect more than that. HDDs do far better here, since the data does not depend on charge stored in NAND cells, which can leak, but on magnetic recording on platters, which lasts.

HOWEVER, how often are you going to take your HDDs offline? Possibly never - they are in RAID, and they serve as nearline storage. This defeats the purpose of that great characteristic. For nearline storage, I can easily use SSDs because they are ONLINE, with a continuous power supply, so they have the same data retention reliability as HDDs. They can run cooler and they have far better access performance than HDDs. So that's why my 12TB Exos drives are out of the picture for my project - I will be getting some 15.36 TB SSDs instead.

For archiving, an "old school" technology has come to mind - TAPES! They are offline storage, portable and power efficient, and they have a really good URE rating. They use magnetic recording as well, so data retention is surprisingly good. If I need archiving, a tape library is better suited to the job.

With tapes and SSDs, I have to say "farewell" to HDDs - unless their URE rating improves to 10^16 or even 10^17, I won't consider them, simply because I want to sleep better at night without worrying about RAID rebuilds...

FAREWELL, MY LOVELY HDDS!

Saturday, August 1, 2020

Windows Network Direct: Your better bet is with Windows Server 2019

I have always struggled to get RDMA working inside a Windows virtual machine. I had tried Mellanox ConnectX-3, Mellanox ConnectX-5 and Chelsio T62100-LP-CR network adapters with Windows Server 2012 and 2016, and even with Discrete Device Assignment in Windows Server 2016 I could not get RDMA working flawlessly in any virtual machine.

Recently, I retried RDMA in a Windows Server 2019 VM on a Windows Server 2019 host, with a Chelsio T62100-LP-CR - and finally got RDMA (iWARP) working correctly (even without a switch - you can connect port 1 to port 2 to form a 100GbE link). It enabled SMB Direct between the VM and the host (and between VMs as well), and performance was acceptable (though it needs tuning).

If you are after any RDMA application inside a Windows VM, or simply want to use SMB Direct in a VM, Windows Server 2019 or later is your better bet.

Do note that you need the following:
  • SR-IOV support from BIOS - this sometimes means enabling the ASPM option in BIOS.
  • A network card that supports RDMA - I like iWARP because it is simpler (virtually no configuration needed). If you prefer RoCE then you may need DCB configured properly, or even a 40GbE/100GbE switch.
  • Windows Server 2019 or higher - both host and VM. You may use Windows 10 (latest) - I didn't try that out but theoretically it should work.
  • Workstation / server grade hardware - I have often seen people complain about not being able to enable SR-IOV on consumer grade hardware because of missing pieces such as IOMMU or ACS. Just because your CPU supports all these features doesn't mean your motherboard / BIOS supports them too.

Incompatibilities and Compatibilities

NOTE: This article will be updated in the future when more compatibilities / incompatibilities are discovered.  Incompatibilities   12-Feb-...