by Woody Hutsell, http://www.appICU.com
The phrase “software defined storage” burst into the storage marketing lexicon in seemingly less time than the data access latency of a good SSD. Unless you were born yesterday, you saw it happen. Solid state storage vendors piled on the bandwagon, most of them leaping by the most convenient route. But IBM has taken a more reasoned, and seasoned, approach, resulting in a software defined storage solution that captures the benefits originally imagined in the phrase, without resorting to some quick-time-to-market strategies.
One of the more fascinating stories to me in the last two years has been the rapid adoption of the phrase “software defined storage.” Here, for your viewing pleasure, is a Google Trends view:
The mainstream use of the term software defined storage started in August 2012 with the launch of the Fusion ION Data Accelerator. Within a few months every major and minor storage vendor was labeling their solution as software defined storage, including companies with solutions as different as Nexenta, NetApp, and IBM.
While researching for this blog, I came across a nice blog post by a fellow IBMer that casts additional light on the idea of software defined storage. I love that IDC created a software defined storage Taxonomy in April 2013. Can you believe it? From creation as a phrase, to requiring a taxonomy in less than eight months. If you are reading this, you can count yourself as having been along for the ride as this phrase started to infiltrate storage marketing.
As I explore the meaning of software defined storage, I will use a really basic definition that I think allows everyone to jump on the bandwagon:
Software-defined storage involves running storage services (such as replication, snapshots, tiering, virtualization and data reduction) on a server platform.
No wonder everyone can claim to be in the software defined storage business. Count IBM and its SAN Volume Controller (SVC) with over 10 years in the industry as a pioneer in this category. Certainly NetApp, Nexenta, and others belong as well. For years the storage industry has been migrating the delivery of storage services from custom-architected hardware to commodity server hardware. In doing so, vendors gain lower cost hardware, a faster time to market, and the advantage of using industry standard and open source software components. This isn’t to say the solutions aren’t differentiated; they are on the basis of their feature sets, but they are not significantly differentiated based on the hardware of their solution.
The introduction of all-Flash appliances into the product mix provided a real test of the capability of software defined storage. I remember IBM talking about project Quicksilver in 2008. Quicksilver used IBM SVC. The results were impressive and showed that software defined solutions could scale to IOPS levels required by the enterprise. Since that time nearly every Flash product brought to market could be labelled software defined storage: Intel server platform, Linux OS, software storage stack like SCST/LIO, HBAs/NICs, third party SSDs, and software for storage services. Storage has become integration and tuning rather than engineering. This approach to system design leaves a lot to be desired. Are the operating systems, storage stacks, RAID, enclosures, or HBAs all really designed for Flash? No, actually. The integration happens only in the minds of the marketers, unless you count the SAS link that connects the server to the storage enclosure or subsystem.
Instead, IBM has taken a novel approach to the Flash market, recognizing that producing extreme performance requires custom hardware, while also acknowledging that offering rich storage services is best accomplished with software defined storage. This recognition led IBM to offer a brand new solution called the FlashSystem V840 Enterprise Performance Solution. The software side of the equation is driven by IBM’s extensive experience building actual, integrated software defined storage solutions. The hardware side of the equation, rather than being a potpourri of third party stuff, is a custom-engineered Flash storage system (the IBM FlashSystem 840). On the software side, the software defined storage control modules have been purposely developed with data paths that substantially reduce the latency impact of most storage services. In fact, the FlashSystem V840 achieves latency for data accesses from Flash as low as 200 microseconds.
For a minute, let’s contrast the FlashSystem V840 with the attributes of nearly every competing Flash appliance offering:
Typical storage enclosure
- Third Party MLC/eMLC SSDs
- No SSD-level data protection
- Inexpensive processors as Flash controllers
- SAS-connected
- Limited density and scalability due to form factor
- Off the shelf HBAs as interface controllers
- Software RAID and over-provisioning provided by the control enclosures
FlashSystem 840
- IBM designed FlashSystem Flash Modules
- IBM patented Variable Stripe RAID™ protects performance/availability even with cell, layer, or chip failures
- IBM engineered PowerPC processors combined with FPGAs as Flash controllers
- High speed proprietary interconnect
- High density and highly scalable
- IBM engineered interface controllers, optimized for low latency and high IOPS
- IBM engineered hardware RAID controllers, optimized for low latency and high IOPS with FPGAs as RAID controllers
All of this discussion about proprietary hardware may have users worried about vendor lock-in and creating silos of data. However, the FlashSystem V840, with its storage virtualization feature, enables data center managers to break vendor lock-in by virtualizing heterogeneous third party arrays behind the FlashSystem V840 and taking advantage of its feature-rich set of storage services.
The choice of third party SSDs combined with software defined RAID architectures pushes storage processing work from the storage enclosure to the control enclosures. The problem is that these storage processing tasks are processor intensive (taking up threads and cores from what are already limited processors). The net result is that the control enclosures, without running any desirable storage services, are already burdened because they are performing functions that are best off-loaded to the storage enclosure. Combine this with the proven inefficiency of software RAID and the result is the terrible performance metrics we see from IBM’s Flash appliance competitors. Look closely at write IOPS performance and you will clearly see the deleterious effect of software RAID on performance. Try adding storage services to these control enclosures and you understand why the other Flash appliances on the market are not feature rich. Except by adding additional processors, they cannot add more features without cratering their already terrible performance.
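To make the processor cost of software RAID a little more concrete, here is a minimal sketch of the parity work a control enclosure would have to do on its own CPU for every stripe it writes. It assumes a simple RAID 5 style XOR parity scheme, which is my illustration and not a description of any particular vendor's implementation. The names `xor_blocks`, `write_stripe`, and `rebuild_block` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column; each column is XORed down
    # to a single byte. This touches every byte of every block on the CPU.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data_blocks):
    """Return the data blocks plus one parity block (assumed RAID 5 style layout)."""
    parity = xor_blocks(data_blocks)
    return list(data_blocks) + [parity]


def rebuild_block(stripe, lost_index):
    """Recover one lost block by XORing all surviving blocks in the stripe."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    stripe = write_stripe([b"\x01\x02", b"\x04\x08", b"\x10\x20"])
    print(stripe[-1].hex())                # parity block for this stripe
    print(rebuild_block(stripe, 1).hex())  # recovers the second data block
```

Every write has to pass through something like `xor_blocks` before it reaches the media, and every rebuild has to read and XOR all of the surviving blocks. When that loop runs on the same processors that are supposed to deliver storage services, the two compete for the same threads and cores.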
In the case of the IBM FlashSystem V840, the storage enclosure functions as a high performance processing offload engine, freeing the control enclosures to do what they do best – implement storage services. The resulting solution delivers industry leading latency, IOPS, and bandwidth with a much more scalable solution.
Software defined storage may have its place, but only if done well. Abandoning effective hardware/software integration just for the chance to save on engineering seems like a terrible choice for all-Flash appliances. IBM has taken a different tack, purposely engineering and integrating a software defined storage solution that offers all the benefits, without resorting to the short-cuts that most storage vendors have used to get there.
To learn more about IBM and Software Defined Storage, make sure to attend Edge2014.