Data Storage for VDI – Part 4 – The impact of RAID on performance
As I said at the end of my previous blog post:

"The read and write caches in traditional modular arrays are too small to make any significant difference to the read and write efficiency of the underlying RAID configuration in VDI deployments."
The good thing is that this makes calculating the Overall I/O Efficiency Factor (IEF) for traditional RAID configurations pretty straightforward. The overall IEF depends on the kind of RAID and the mix of reads and writes, using the following formula:
Overall IEF = (Read% * read IEF) + (Write% * write IEF).
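For anyone who prefers code to formulas, here is that calculation as a couple of lines of Python (just a sketch of the arithmetic above, with percentages expressed the same way as in the formula):

```python
def overall_ief(read_pct, write_pct, read_ief, write_ief):
    """Blend the read and write I/O Efficiency Factors by workload mix.

    All arguments and the result are percentages (e.g. 30 for 30%).
    """
    return (read_pct / 100.0) * read_ief + (write_pct / 100.0) * write_ief
```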
To start with RAID-5: a single front-end write I/O requires 4 back-end I/Os, giving a write IEF of 25%. If you had 28 * 15K spindles in a RAID-5 configuration, this means you can only sustain 235 * 28 * 25% = 1645 write IOPS at 20ms.
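Or, as a back-of-the-envelope check (the 235 IOPS per 15K spindle at 20ms is the figure used earlier in this series):

```python
# RAID-5: every front-end write turns into 4 back-end I/Os, so write IEF = 25%
spindles = 28
iops_per_spindle = 235        # 15K RPM disk at ~20 ms response time
write_ief = 0.25              # 1 front-end write -> 4 back-end I/Os

print(spindles * iops_per_spindle * write_ief)   # 1645.0 sustainable write IOPS
```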
Using Ruben's 30:70 read:write VDI steady-state workload, the overall IEF for RAID-5 would be:
(30 * 100%) + (70 * 25%) = 47.5%.
For a 50:50 workload, the Overall IEF would be
(50 * 100%) + (50 * 25%) = 62.5%
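Plugging the two mixes into the same arithmetic (a quick sketch, using the 100% read IEF and 25% write IEF for RAID-5):

```python
read_ief, write_ief = 100.0, 25.0     # RAID-5: reads unaffected, writes cost 4x

print(0.30 * read_ief + 0.70 * write_ief)   # 47.5 -> 30:70 read:write mix
print(0.50 * read_ief + 0.50 * write_ief)   # 62.5 -> 50:50 read:write mix
```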
For RAID-10 you sacrifice half of your capacity, but instead of 4 back-end IOPS for every 1 front-end write there are only 2, for a write IEF of 50%. The write-coalescing cache tricks also benefit RAID-10 but, again, not enough to have any significant effect.
So how about RAID-6? With RAID-6, every front-end write I/O requires 6 IOPS at the back end, for an uncached write IEF of about 17% and a cached write IEF of about 27%. Reads for non-NetApp RAID-6 implementations based on Reed-Solomon algorithms are, yet again, unaffected.
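The uncached write IEFs fall straight out of the back-end I/O counts; here is a small comparison sketch (ignoring cache effects, as discussed above):

```python
# Back-end I/Os generated by a single front-end random write
backend_write_cost = {"RAID-10": 2, "RAID-5": 4, "RAID-6": 6}

for scheme, cost in backend_write_cost.items():
    print(f"{scheme}: uncached write IEF = {1 / cost:.0%}")
# RAID-10: uncached write IEF = 50%
# RAID-5: uncached write IEF = 25%
# RAID-6: uncached write IEF = 17%
```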
So, what about RAID-DP? Well, much as I hate to say it, even though it is a form of RAID-6, by itself it has the worst performance of all the RAID schemes (and yes, I do still work for NetApp).
Why? Because RAID-DP, like RAID-4, uses dedicated parity disks. Given that, by default, one disk in every 8 is dedicated to parity and can't be used for data reads, both RAID-4 and RAID-DP immediately take a roughly 13% hit on reads. In addition, just like RAID-6, every front-end random write IOP can require up to 6 IOPS at the back end. This would mean that NetApp has the same write performance as RAID-6 and 13% worse read performance.
This gives the following overall IEF for the 30:70 read:write use case: (30 * 87%) + (70 * 17%) ≈ 38% (!!)
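Spelled out in the same style (a sketch that takes the 1-in-8 parity read penalty and the 6-I/O write cost at face value, before any WAFL magic is applied):

```python
read_ief = 7 / 8       # ~87%: one disk in every eight holds only parity
write_ief = 1 / 6      # ~17%: up to 6 back-end I/Os per front-end random write

# 30:70 read:write steady-state VDI workload
print(f"{0.30 * read_ief + 0.70 * write_ief:.0%}")   # ~38%
```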
This is exactly the kind of reasoning our competitors use when explaining our technology to others.
So why would NetApp be insane enough to make RAID-DP the default configuration? How have we succeeded so well in the marketplace? Shouldn't there be a tidal wave of unhappy NetApp customers demanding their money back?
Well, there are a few reasons we use RAID-DP as the default configuration for all NetApp arrays. The first is that dedicated parity drives make RAID reconstructs fast, with minimal performance impact. It also makes it trivially easy to add disks to RAID groups non-disruptively. "This might be great for availability, but what about performance?" I hear you ask. Well, I've been told that you can mathematically prove that the RAID-DP algorithms are the most efficient possible way of doing dual-parity RAID; frankly the math is beyond me, but the CPU consumption by the RAID layer really is minimal. The real magic, however, happens because RAID-DP is always combined with WAFL.
This isn't a good place to explain everything I know about WAFL, and others have already done it better than I probably can (cf. Kostadis' blog), but I'll outline the salient benefits from a performance point of view in the next post: Data Storage for VDI – Part 5 – RAID-DP + WAFL – The ultimate write accelerator.