Software Defined .. Why Now ? Why Bother ? …
Monday is my admin clean-up and research day, which makes it the best day for quadrant-II thinking, and most of what I’ve been thinking about recently is software defined storage, or, if you’re an OpenStack advocate, what you’d call software-led storage.
After spending more than a few weeks thinking and researching, I’ve come to the conclusion that I’m not a big fan of either term, especially as it pertains to storage. Given the likelihood of an increasingly fuzzy set of layers between hardware and software, I think that “software-led” is probably a more useful way of talking about the future of storage infrastructure, but even so I’m still not convinced it’s the most useful description either. Nonetheless, for the moment a lot of people are talking about software-defined networks, datacenters and storage, so I’ll start to outline my breakdown of storage within that paradigm.
Software defined anything has its roots in software defined networking and OpenFlow, so the rest of this post goes through how I see Software Defined Networking, and then I’ll use that as a framework in future posts for talking about software defined storage.
So how do you define “Software Defined”? I think if you’re going to use the term without it being just another way of saying virtualised, then you need to be talking about infrastructure built on the principle of a clean separation of hardware optimised functions from software control structures, or, in the parlance of Software Defined Networking, separating the data plane from the control plane. That means to create something that is truly Software Defined XXXX and not just a marketing-sexy-me-too-rebrand you have to
- Identify and then formally define a set of common functions or primitives performed by existing infrastructure that are optimally run in purpose built devices (e.g. hardware filled with interfaces and ASICs) – This becomes the “Data Plane”
- Create a protocol that manages those functions
- Create a standards compliant controller that runs on general purpose hardware (e.g. an Intel server, virtualised or otherwise) that takes higher order service requests from applications and translates those into the primitives codified in step 1, over the protocol devised in step 2. – This becomes the “Control Plane”
The prime example of this is in the software defined networking world, where it could be something that looks like this …
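As a rough illustration of the three steps above, the following Python sketch models a data plane, a management protocol and a control plane in miniature. It is a toy under stated assumptions, not OpenFlow itself: the names FlowEntry, Switch, Controller, install_flow and connect are all hypothetical, and the "protocol" is reduced to a single method call.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


# Step 1: a formally defined data-plane primitive (hypothetical shape).
# A flow entry matches a destination address and names an action.
@dataclass(frozen=True)
class FlowEntry:
    match_dst: str
    action: str        # "forward" or "drop"
    out_port: int = -1


@dataclass
class Switch:
    """Stand-in for a purpose built device: it only does table lookups."""
    name: str
    flow_table: Dict[str, FlowEntry] = field(default_factory=dict)

    # Step 2: the management "protocol", reduced here to one message type.
    def install_flow(self, entry: FlowEntry) -> None:
        self.flow_table[entry.match_dst] = entry

    def handle_packet(self, dst: str) -> str:
        entry = self.flow_table.get(dst)
        if entry is None:
            return "send-to-controller"   # table miss: ask the control plane
        if entry.action == "drop":
            return "drop"
        return f"forward:{entry.out_port}"


class Controller:
    """Step 3: runs on general purpose hardware and translates
    higher order requests into the primitives defined in step 1."""

    def __init__(self, switches: List[Switch]) -> None:
        self.switches = {s.name: s for s in switches}

    def connect(self, dst: str, path: List[Tuple[str, int]]) -> None:
        # Service request: make dst reachable along (switch, port) hops.
        for switch_name, port in path:
            self.switches[switch_name].install_flow(
                FlowEntry(dst, "forward", port)
            )

    def block(self, dst: str) -> None:
        # Service request: drop traffic for dst everywhere.
        for switch in self.switches.values():
            switch.install_flow(FlowEntry(dst, "drop"))


edge, core = Switch("edge"), Switch("core")
controller = Controller([edge, core])
controller.connect("10.0.0.7", [("edge", 2), ("core", 5)])
print(edge.handle_packet("10.0.0.7"))   # forward:2
print(core.handle_packet("10.0.0.9"))   # send-to-controller
```

The point of the sketch is the division of labour: the switch never decides anything beyond a lookup, and everything that resembles policy lives in the controller.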
So why did this happen in networking, and not storage or compute ? Why now ? And why bother ?
As to why it happened in networking, there are half a bazillion blogs out there on the subject, of which I’ve only read a small fraction, but from my perspective, I reckon it happened for the following reasons:
- by its very nature, networking is an area where customers have demanded that vendors inter-operate with other vendors’ equipment in as seamless a fashion as possible
- there has been one absolutely dominant player in the market at pretty much all times along with a very well supported standards body.
- Networking subsequently evolved to the point where there is one dominant layer-2 implementation (Ethernet) with one dominant layer-3 implementation (IP), and a fairly small number of upper level protocols above that (TCP/UDP/HTTP etc).
- This has driven the similarity of network equipment functionality from disparate vendors, which gave the developers of OpenFlow the opportunity to identify the commonality of flow-tables in hardware on which the elegant separation of control and data planes in SDN is built.
Like many “new and revolutionary ideas”, it’s probably worth noting that this revolutionary “new” architecture has been evolving since at least 2001, when the IETF started the “Forwarding and Control Element Separation” (ForCES) working group, and arguably before that, back to 1996 with things like the General Switch Management Protocol (GSMP).
But even if you can do this clean separation, why bother ? The development of OpenFlow wasn’t driven by market requirements; it was developed to let researchers run interesting experiments on existing large scale university campus networks. While that’s a very cool thing to do as a researcher, running “experiments” on a large scale enterprise infrastructure isn’t something I’ve ever had much success with. About as adventurous as I get is asking for a vlan that spans two datacentres, and for the most part whenever I’ve suggested stuff like that in the past, I get one of those “Put the network diagram down … and STEP AWAY” looks from the network guys. I can only imagine what would happen if I said “Hey I’ve got this really cool idea for encapsulating fibre channel over token ring and running it on your existing Ethernet infrastructure”. Which begs the question, why on earth would anyone in Enterprise-IT want to implement something this radical ?
The answer for the most part is .. they don’t. Sure there is a promise that opening up the infrastructure will lead to more competition and that will reduce prices, but the last time I looked, the networking industry was already pretty competitive. Even if you were to pull a datacentre class switch apart into cheap basic hardware and smart software running on an Intel-box, the value that vendors like Cisco bring in terms of scalability, quality assurance, interoperability testing, support, professional services etc, will mean that in all likelihood, customers will be willing to pay a premium for their solutions, and Cisco and others like them may become even more profitable as a result. As a parallel case, there are plenty of free database offerings out there, and yet Oracle is doing just fine. You might expect that if “Software Defined” was something everyone now uses as a prime buying criterion you might see Larry Ellison extolling the virtues of a “Software Defined Database”. OK, maybe not given his rather sceptical comments about cloud in the early days, but they’re going in exactly the opposite direction, increasingly embedding more of their software into vertically integrated hardware solutions precisely because there is continued ongoing demand from enterprise customers for tightly integrated hardware/software solutions.
So if SDN isn’t likely to significantly reduce costs, and there isn’t an organic pent up demand within the enterprise, then where is the payoff for the large risks that come with developing and deploying any new technology ?
The answer to that question lies in the standardisation and maturity of today’s network protocols that led to the commonality expressed in flow-tables. The core protocols of TCP/IP were developed almost forty years ago and were built not only on a set of solid principles that have stood the test of time, but also on what were, in 1973, some very reasonable assumptions. Unfortunately some of those assumptions no longer hold true, e.g. there was an assumption that a machine with an IP address won’t magically teleport from one physical location to another, yet this is exactly what happens when you try to migrate a virtual machine from one datacenter to the next. It is exactly those kinds of assumptions that are now causing problems for the largest consumers of network equipment: the large-scale cloud and telecom service providers.
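The “teleporting machine” problem can be shown in a few lines of Python using the standard library ipaddress module. This is only a sketch: the prefixes, site names and the route function are made up for illustration, and the per-host entry at the end is just one conceivable way a more flexible control plane might patch the gap, not something this post prescribes.

```python
import ipaddress

# Hypothetical routing table: each prefix is tied to a physical site.
ROUTES = {
    ipaddress.ip_network("10.1.0.0/16"): "datacenter-a",
    ipaddress.ip_network("10.2.0.0/16"): "datacenter-b",
}


def route(dst: str) -> str:
    """Return the site chosen by longest-prefix match, or "no-route"."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in ROUTES if addr in net]
    if not matches:
        return "no-route"
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]


vm_ip = "10.1.0.5"
vm_location = "datacenter-a"
print(route(vm_ip) == vm_location)   # True: address and location agree

# The VM is migrated but keeps its IP address.
vm_location = "datacenter-b"
print(route(vm_ip) == vm_location)   # False: traffic still goes to site A

# Illustrative fix: a more specific per-host entry overrides the prefix.
ROUTES[ipaddress.ip_network("10.1.0.5/32")] = "datacenter-b"
print(route(vm_ip) == vm_location)   # True again
```

The address doubles as both identity and location, so moving the machine without changing the address breaks delivery until something overrides the prefix-based decision.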
That is why Software Defined Networking is suddenly interesting. For many businesses, IT infrastructure isn’t a competitive differentiator (it could be, and it should be, but right now it isn’t), but there are some very large customers, with some very large IT budgets, for whom IT infrastructure is a core enabler of their business, and who are willing to take on the risk of a new approach in the hope of disruptive innovation. These people aren’t just dreamers with fists full of VC dollars, but some of the networking industry’s largest and most influential customers. Other agile enterprise customers who understand how to leverage IT infrastructure for competitive advantage will also benefit from the investments of these larger organisations, but for most enterprise customers, what passes today for software defined networking will be restricted to virtual switches inside their virtual server infrastructure, and that, while useful, doesn’t exactly fit the definition I used at the beginning of this post.
Which brings me to storage .. For a number of vendors, a Virtual Storage Array (VSA) = Software Defined storage, and while that’s reasonably valid, I also think it’s a bit of a half measure. I’m not saying that because I don’t like VSAs. I do, I think they’re great, but I don’t think that they’re the best example of what a software defined storage infrastructure can do. They might be part of it, but they’re not a necessary part, and in some cases, I’d argue that they’re not even a desirable part of a cleanly separated software defined infrastructure. And that is what I’m going to cover in my next post.