Storage Class Memory Disruption
About two years ago I wrote a blog on Megacaches. What I didn’t write about at the time was something I knew had been brewing in our Advanced Technology Group’s (ATG) lab around storage class memory. These guys look at developments five to ten years in advance, and when I had the chance to study their findings it became clear to me why NetApp never saw Flash or SSDs as just another storage tier. While I can’t be specific about the contents of those reports, I think it’s safe to say that even under a conservative set of predictions, a very large percentage of not only Flash memory but also post-Flash technologies such as Phase Change Memory and MRAM is destined to reside in, or very close to, the server layer. This is already beginning to happen, and it is going to cause a wave of disruption that will fundamentally change the way we think about, design, and manage mass data storage.
Even though there have been a number of early implementations of this technology, most of them, in order to get to market quickly, are little better than fast direct-attached storage (DAS). While cost-effective performance is probably one of the most important considerations in a data storage architecture, there is a whole range of other data management and governance issues that need to be addressed, including backup and recovery, security, and, increasingly, data mobility. There is a very good reason why the vast majority of companies implement virtualised infrastructure on networked storage: DAS, even very fast DAS, introduces more management problems than it solves.
Mercury the Prototype
In February 2011 some members of NetApp’s ATG presented a paper entitled “Mercury: host-side flash caching for the datacentre”, in which they highlighted the pros and cons of integrating flash memory into the server:
- 10s-100s GB of flash being integrated into servers
- New price/perf tier between disk and DRAM
- Flash is 10-100x faster than disk, ¼ price of DRAM
- High IO-per-second (IOPS) storage close to CPU
…but using integrated flash for primary storage breaks the shared-pool datacenter model
- Binds software services to specific servers
- Puts flash primary storage out of reach of storage management tools
To address this challenge, the ATG members presented, in what must be one of the storage industry’s best stealth launches, not only an architecture but a working prototype and its performance results. I suspect this presentation spooked a number of other vendors, because it only took a few months before several of them announced that they had a similar project under way or were planning to buy someone to build something similar.
Flash Accel the Product
Since then I’ve seen this prototype built into a fully functional product, which we’ve called Flash Accel. We’ve taken our time with this, to give customers something that:
- Extends server memory by turning any flash into read cache for Data ONTAP
- Software only, compatible with any server PCI-e flash card or SSD
- Ensures intelligent data coherency with Data ONTAP
- Persistent and durable across server reboots and crashes
- Fully supported
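The interplay between the features above, read caching on server flash that stays coherent with the backing array and survives reboots, can be sketched in a few lines. This is a minimal, hypothetical illustration of a coherent host-side read cache in general; all class and method names are mine, and it does not represent Flash Accel’s actual implementation.

```python
class HostSideReadCache:
    """Sketch of a host-side read cache in front of a networked array.

    Reads are served from local flash when possible; writes go through to
    the backing store and the local copy is refreshed, so the cache never
    serves stale data (the "data coherency" property described above).
    """

    def __init__(self, backing_store):
        self.backing = backing_store   # stand-in for a networked array volume
        self.cache = {}                # stand-in for local PCI-e flash / SSD

    def read(self, block_id):
        if block_id in self.cache:             # hit: served at flash latency
            return self.cache[block_id]
        data = self.backing.read(block_id)     # miss: fetch from the array
        self.cache[block_id] = data            # populate local flash
        return data

    def write(self, block_id, data):
        self.backing.write(block_id, data)     # write-through to the array
        self.cache[block_id] = data            # keep the local copy coherent


class DictStore:
    """Trivial in-memory stand-in for the backing array."""

    def __init__(self):
        self.blocks = {}

    def read(self, block_id):
        return self.blocks.get(block_id)

    def write(self, block_id, data):
        self.blocks[block_id] = data
```

In a real product the cache lives on flash rather than in a dictionary, which is what makes it persistent across server reboots, and coherency has to be maintained even when other hosts write to the same array, which is the genuinely hard part.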
One of the important things about this is that it is software only, which is why we’ve formed alliances with some of the best companies working in the server class memory space, including Fusion-io, LSI and Virident, to provide the initial set of qualified hardware for our own Flash Accel software.
It’s also clear that since we presented the prototype, our chosen hardware partners and other companies have developed software caching products that address concerns similar to those Flash Accel addresses. Each of these solutions, typically tied to one kind of hardware, has particular strengths, and some offer functionality that isn’t available today in Flash Accel. Because this space is moving so fast, the only way to keep ahead of the innovation curve is to remain as open to new ideas and technology as we can, so we’ve also partnered with SanDisk, Fusion-io, LSI, and STEC to make sure their caching software is compatible with the advanced array-based data management capabilities built into ONTAP.
These vendors want to work with us because we are easily the largest, most leverageable platform in the industry. Data ONTAP is the largest installed operating system base in the storage industry, and, by no coincidence, it also has the richest set of built-in data protection and capacity optimisation capabilities, including snapshots, cloning, deduplication and compression, and SnapMirror, the world’s most widely implemented data replication technology.
This allows innovative companies to combine the incredible performance advances being made with storage class memory at the server level with the world’s most efficient and agile data infrastructure platform. It is this open approach to innovation, allowing customers to choose best of breed not just at the storage array with NetApp FAS and V-Series but also from the incredibly fast-moving server class memory vendors, that I think makes NetApp’s approach to the upcoming storage class memory disruption so compelling.
Flash Accel the next level of Virtual Storage Tiering
This new capability will soon become one of the most important components of NetApp’s virtual storage tiering (VST) architecture, a set of technologies that combines solid state memory at multiple places in the storage hierarchy with capacity-optimised disk architectures, and that:
- Efficiently uses various forms of storage class memory in the optimal locations
- Is simple to install and self-managing over time
- Allows dynamic changes to quality of service policies
- Provides real-time response to changing workload patterns
- Minimises HDD I/Os to allow for denser (and hence cheaper) HDD architectures
- Delivers very high, consistent performance with very low latency
- Is resilient to failures
- Scales non-disruptively
- Stays at the forefront of innovation
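The self-managing, workload-responsive behaviour in the list above boils down to moving hot data onto flash automatically as access patterns change. The following is a deliberately simplified, hypothetical sketch of frequency-based promotion between a flash tier and an HDD tier; the names and thresholds are illustrative only and do not describe NetApp’s actual VST algorithms.

```python
from collections import Counter


class TieringSketch:
    """Toy two-tier store: all blocks live on HDD, and blocks that are read
    often enough are promoted into a small flash tier, evicting the coldest
    resident block when the flash tier is full."""

    def __init__(self, flash_capacity, promote_after=3):
        self.flash = {}                       # hot blocks held in flash
        self.hdd = {}                         # every block lives on HDD
        self.hits = Counter()                 # per-block read counts
        self.flash_capacity = flash_capacity
        self.promote_after = promote_after    # reads needed before promotion

    def write(self, block_id, data):
        self.hdd[block_id] = data
        if block_id in self.flash:            # keep flash copy coherent
            self.flash[block_id] = data

    def read(self, block_id):
        self.hits[block_id] += 1
        if block_id in self.flash:            # hot path: flash latency
            return self.flash[block_id]
        data = self.hdd[block_id]             # cold path: HDD latency
        if self.hits[block_id] >= self.promote_after:
            if len(self.flash) >= self.flash_capacity:
                # Evict the least-read resident block to make room.
                coldest = min(self.flash, key=lambda b: self.hits[b])
                del self.flash[coldest]
            self.flash[block_id] = data       # promote the hot block
        return data
```

Real implementations work at block granularity on live I/O with far more sophisticated heat tracking, but the core idea, the system observing the workload and placing data accordingly rather than an administrator assigning LUNs to tiers, is the same.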
While the virtual storage tiering architecture is compelling today, there is still more goodness to come. In upcoming blog posts I’ll delve more deeply into Flash Accel and our partners’ offerings, how these combine with Flash Cache and Flash Pool, and, if I’m allowed 🙂, I’ll give some hints about what’s coming in the future.
Part of a Long Term Plan
Even if I can’t tell you too much about what’s coming, to give you an idea of where we’ve come from: in November 2010 I put up a diagram in a blog post on archiving. That diagram was by then already two years old, and the original dates from the time of PAM-I, the DRAM-based precursor to Flash Cache, in 2008.
That proposal document from mid-2008 actually used the words “Virtual Storage Tiers” and “Intelligent Storage Caching”, which pre-dated pretty much every other vendor’s announcements of their plans for fully automated storage tiering by about a year. I say this not to disparage our competition, but to point out that taking advantage of storage class memory is something we’ve been committed to for quite some time, and that is why I’m so excited to see these announcements today.
If I were to update that diagram, I’d probably just have one large pool of 3TB drives and maybe a small quantity of 600GB 10K drives. Even so, the same basic rule applies: a small number of capacity-optimised physical disk pools with QoS-based software “tiers” sitting on top of them. This has been core to the way we think about tiering, enabled first by PAM-I and then substantially expanded with the release of Flash Cache in 2009. That approach has been incredibly successful, with over 1.2 exabytes of physical disk now accelerated by 15 petabytes of Flash Cache. In June this year we extended that approach with Flash Pool, which adds a number of optimisations for random-overwrite-intensive workloads and extends our VST story to SATA/NL-SAS for more workloads, all the way down to the 2220, the smallest array in our FAS family.
One Giant Step ….
Now that Flash Accel is out of the box, the scalability and intelligence of NetApp’s Virtual Storage Tiering approach has taken another major step forward. This, along with the infinite and immortal characteristics of ONTAP cluster-mode at the back end, will form the basis of an agile data infrastructure that will change the way we think about, design, and deploy mass storage and data management for the next ten years.