Pondering NVMe-oF

December 7, 2017

by:  Woody Hutsell, appICU.com

I published a blog post on IBM developerWorks about a recent technology preview of NVMe-oF with Power9 and FlashSystem.  I hope you take a minute to read it.

I have many mixed feelings on this topic that I thought I would share.

  1. I think NVMe-oF will make a positive impact on application performance.
  2. I have lived through the early days of other protocols and this one is no different:  immature standards, proprietary solutions, and slow customer adoption.
  3. I believe vendors will offer a bunch of NVMe-oF enabled solutions.  Most of them won’t make any sense.  NVMe-oF is about shaving latency.  If the solution being paired with NVMe-oF is loaded with latency from poorly implemented architectures and slow storage services, adding NVMe-oF will hardly make a difference.
  4. The wide range of NVMe-oF options is an impediment to its success:  InfiniBand, RoCE, iWARP, FC-NVMe, and more on the way.  The fact that different vendors are throwing their weight into different protocols is also not helping.
  5. The focus on lower latency for the customer is positive and I am delighted to see the storage industry refocused on latency even if these are the same people I heard mutter that latency under 500 microseconds doesn’t matter.
  6. Don’t be one of those people who says NVMe when you mean NVMe-oF.  I have seen industry experts get lost in the terminology.



Death of a Trade-off

October 24, 2017

The death of a trade-off, by Woody Hutsell, appICU.com

Everyone hates trade-offs but we almost always have to make them.  One of the most famous trade-offs in the IT world is that you can have it fast, cheap or good: pick two. One version of this trade-off in the flash industry is that you can have it fast or low cost: pick one.  This is because our primary way to lower cost for all flash arrays has been to implement data reduction.  These data reduction tools lower the effective cost but they add latency, thus slowing down the all flash arrays.  A quick look at the latency specifications for devices that reduce data and those that don’t will confirm this notion, even where the marketing seeks to obfuscate this reality.

With its latest refresh of the FlashSystem 900, IBM allows the customer to get it fast and get it inexpensively.

There are two key technology advancements in the FlashSystem 900.  First, it has IBM enhanced 3D TLC NAND flash.  As with prior generations of FlashSystem, IBM has acquired Micron chips directly from the fab and enhanced them with our advanced flash management.  The economic benefits of moving to 3D TLC are well documented and apply to the new FlashSystem 900.  With the new chips, we achieve up to a 3x increase in maximum capacity.

The second key technology advance is line speed hardware compression.  IBM is the second major vendor to implement hardware compression but the first to deliver it for 3D TLC.  IBM compresses data in our field programmable gate arrays (FPGAs) within every flash module.  If you work with our sales and business partner teams we will put in writing a 2:1 compression guarantee (and yes, your data must be compressible).  We have used a variety of terms to describe the performance of this new compression solution such as zero performance impact and worry-free compression.  But I want to take it one step further.  In most cases, our hardware compression will deliver better performance than the prior generation FlashSystem 900.
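To make the economics concrete, here is a minimal sketch of how a compression ratio changes the effective cost per terabyte.  The system cost and raw capacity are hypothetical numbers picked only for illustration (they are not IBM pricing); the 2:1 ratio is the guarantee mentioned above, and the function name is my own.

```python
def effective_cost_per_tb(system_cost, raw_tb, compression_ratio):
    """Cost per usable TB once data reduction is applied."""
    if raw_tb <= 0 or compression_ratio <= 0:
        raise ValueError("raw_tb and compression_ratio must be positive")
    effective_tb = raw_tb * compression_ratio
    return system_cost / effective_tb


# Hypothetical figures for illustration only (not real pricing).
cost = 100_000.0   # assumed system cost
raw = 50.0         # assumed raw capacity in TB

no_compression = effective_cost_per_tb(cost, raw, 1.0)
two_to_one = effective_cost_per_tb(cost, raw, 2.0)

print(f"no compression: {no_compression:.0f} per TB")
print(f"2:1 compression: {two_to_one:.0f} per TB")
```

With a 2:1 ratio the same hardware holds twice the data, so the effective cost per terabyte is halved.  The trade-off has always been whether that saving costs you latency.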

Implementing compression has always been a trade-off.  You implement compression to improve economics but trade off performance.  Now, this trade-off is history thanks to the new FlashSystem 900.

Re-prioritizing Analytics

June 26, 2017

Woody Hutsell, http://www.appICU.com

See this link to read my first ever blog post on an IBM.com website.



Start waiting on 3DXP arrays

June 1, 2017

Start Waiting on 3DXP arrays, by Woody Hutsell, AppICU

Let’s get one thing out of the way.  Most storage systems will eventually offer 3DXP.  Why?  Because adding 3DXP SSDs to a storage array will be easy.

A second thing, I think the early usage for 3DXP will flow largely to server vendors (and their suppliers).  This is a major point and central to my thoughts on storage and 3DXP.  In the server, 3DXP reduces cost and increases density versus RAM.

3DXP in external storage will lag expectations until there are major advances in density and price.

I have worked in the part of the market that 3DXP external storage solutions will target for the last 17 years.  For most of those 17 years, I think we could comfortably call this space Tier 0.  These are customers whose end-customer satisfaction, missions or revenue are directly tied to the performance of their storage arrays.  When I say performance, I really mean latency.  They are so latency sensitive that they will not tolerate storage services getting in the way of application performance.  For customers in the financial, telecom, defense, government, retail, e-commerce and logistics businesses, I could probably predict their interest in this solution with a high degree of accuracy.

These customers are willing to pay for low latency.  Customers in this category bought all RAM solid state storage.  They were early adopters of all flash arrays. They still buy based on latency curves (who delivers predictable low latency at the IOPS level they require).

These are not the customers buying Tier 1 arrays with a full suite of storage services.  They will not tolerate data reduction or storage services if it impacts latency.  These are not the customers buying primarily on cost/capacity though they still have budgets and need a solution that fits that budget.

I love this Tier 0 market, because these customers are solving world class problems and must stay on the bleeding edge of technology to grow their business.  These customers will buy 3DXP arrays that deliver on the low latency potential of 3DXP.  The phrasing of this sentence is no accident: if the array offers 3DXP but only delivers modest latency improvements, it will be largely ignored.

The first enterprise market to hit it big with flash was inside the server, particularly PCI flash (think Fusion-io).  The second enterprise market to hit it big with flash, a few years later, was the Tier 0 external storage market (think Texas Memory Systems (subsequently part of IBM) and Violin Memory).  These splashes were nothing compared to the tsunami of business when all flash arrays entered the Tier 1 market with compelling economics driven by adoption of flash in consumer devices and supported by inline data reduction technologies to further reduce the cost per capacity.  These were majority buyers who were confident that the technology wrinkles were ironed out and who by and large wanted better performance than they could get from their disk-based solutions but were very focused on storage services, cost and cost/capacity.  They are not Tier 0 buyers, though they won’t go back to disk having tasted the sweet nectar of low latency storage.

Tier 1 customers are unlikely to buy into all 3DXP storage arrays until the cost approaches the cost of flash because for these customers the difference between 120 microseconds of latency and 20 microseconds of latency is not as motivating as the difference between 5-20 milliseconds of latency and ½ a millisecond of latency.  And can you really get 20 microsecond latency on a Tier 1 device loaded with storage services?
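The arithmetic behind that comparison can be sketched in a few lines.  I am assuming the 5-20 millisecond figure describes disk-based arrays, the ½ millisecond figure describes flash arrays, and the 120 and 20 microsecond figures describe flash versus 3DXP; the function name is mine.

```python
def improvement(before_us, after_us):
    """Return (microseconds saved, speed-up factor) for a latency change."""
    return before_us - after_us, before_us / after_us


# Latency figures from the paragraph above, all in microseconds.
disk_to_flash_low = improvement(5_000, 500)    # 5 ms  -> 0.5 ms
disk_to_flash_high = improvement(20_000, 500)  # 20 ms -> 0.5 ms
flash_to_3dxp = improvement(120, 20)           # 120 us -> 20 us

for name, (saved, factor) in [
    ("disk -> flash (5 ms case)", disk_to_flash_low),
    ("disk -> flash (20 ms case)", disk_to_flash_high),
    ("flash -> 3DXP", flash_to_3dxp),
]:
    print(f"{name}: saves {saved} us per I/O, {factor:.0f}x faster")
```

The speed-up factor for 3DXP (6x) does not look that far from the low end of the disk-to-flash move (10x), but the time saved per I/O is 100 microseconds versus 4,500 to 19,500 microseconds.  For a Tier 1 buyer, that absolute difference is what the application and the user actually feel.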

What does this mean for the industry?  The market for 3DXP in external storage arrays will appear vibrant due to product introductions but the revenue that can be directly attributed to 3DXP in external storage will be low until the cost and density make meaningful improvements.  Storage architects are already designing ways to use 3DXP as a RAM replacement/supplement in the storage array.  There is some interesting potential here given the memory requirements for flash metadata and caching and the use of 3DXP as a tier of storage.  These steps are reminiscent of the way flash was gradually introduced into Tier 1 before it became Tier 1, for example in RAID cache backups.  As with the all flash arrays, the all 3DXP arrays custom built for the best latency curve at the right price will start out in the Tier 0 space waiting for the cost and density improvements that bring it to the big time.  This time around, that transition could take much longer than it did with flash based arrays.  Flash arrays benefited massively from the density and cost reductions needed in the consumer space.  3DXP does not appear to have the same tailwinds yet.

Stop waiting on NVMe all flash arrays

December 6, 2016

by Woody Hutsell, AppICU

NVMe has taken the flash array market by storm if you consider the number of storage vendors getting in line to deploy NVMe SSDs inside their all flash arrays.  NVMe inside the server (which is the basis for most all flash arrays) is an improvement over SAS or SATA due to the lighter protocol and is an improvement over PCI flash because it is hot swappable and in a drive form factor.

However, just as with the adoption of SAS SSDs inside all flash arrays, these early all flash arrays that include NVMe SSDs will be a shadow of what is possible with the technology.  Why? The first flash arrays using NVMe SSDs have the same fundamental software heavy architectures that are already wasting the speed of the internal SAS SSDs.  The move to NVMe SSDs in these bloated solutions will result in some latency/IOPS improvements but ignore the problem that the storage platform is the bottleneck.  Why is it that most all flash arrays, even those with few or no storage services, are in the 500 microsecond range for latency?  One of the main reasons is the data path is littered with obstacles to low latency.  It is the server architecture, the bulky operating system, the software RAID and clumsy storage services that are behind the terrible latency, not the flash media or even the SCSI protocol.
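A rough latency budget shows why swapping the drive protocol alone does so little.  Every per-component number below is hypothetical; I chose them only so that they add up to the 500 microsecond range mentioned above.  They are not measurements of any product.

```python
# HYPOTHETICAL latency budget in microseconds, for illustration only.
# The components mirror the obstacles named above; the values are assumed.
budget_us = {
    "server architecture and operating system": 150,
    "software RAID": 100,
    "storage services": 130,
    "flash media": 100,
    "drive protocol": 20,   # assumed SAS/SCSI overhead
}


def total(budget):
    """Sum the per-component latencies of one I/O."""
    return sum(budget.values())


# Same array, but with an assumed lighter NVMe protocol cost.
with_nvme = {**budget_us, "drive protocol": 10}

saved = total(budget_us) - total(with_nvme)
percent = 100 * saved / total(budget_us)
print(f"before: {total(budget_us)} us, after: {total(with_nvme)} us "
      f"({percent:.0f}% better)")
```

Under these assumed numbers, halving the protocol overhead improves end-to-end latency by about 2%.  The exact values will differ from array to array, but the shape of the argument holds whenever the protocol is a small slice of the budget.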

If you find yourself waiting on a low latency NVMe driven all flash array, you can stop waiting (just as your application can stop waiting), because a solution is here and available now.  The IBM FlashSystem 900, which has no software in the data path, is shipping with the low latency characteristics your applications demand.  What’s more, it doesn’t require proprietary host drivers like some competing solutions (EMC DSSD and E8).  It uses industry standard Fibre Channel and InfiniBand to attach to your existing storage network.  You might protest that the FlashSystem 900 does not use NVMe inside the storage array, and you would be right.  There is absolutely no NVMe inside the FlashSystem 900.  There is no storage protocol inside the FlashSystem 900.  Once the data hits the interface controller it ceases to be SCSI or PCI or NVMe.  The only thing better than an improved protocol like NVMe is no protocol.  The FlashSystem 900, like many prior generations of FlashSystem solutions, treats the flash inside the system like memory.  The result is unmatched latency characteristics.

So what do you do with the FlashSystem 900 and its low latency?  Make your applications faster.  For many database driven applications, storage services are already provided at the application or relational database layer.  The FlashSystem 900 is the perfect accelerator for these environments.  For customers who have embraced software defined storage, the FlashSystem 900 is a software defined storage accelerator; just ask the customers who have accelerated IBM SAN Volume Controller and Spectrum Virtualize with FlashSystem.  For customers who need the full storage services feature set in an integrated storage solution, the FlashSystem V9000 and FlashSystem A9000/A9000R include the FlashSystem 900 as the storage enclosure.

NVMe is full of promise for servers and for storage vendors willing to start fresh or further optimize their solutions to actually benefit from the technology.  There are noteworthy examples of new solutions on the market designed for NVMe with encouraging performance gains.  Oddly, the most noteworthy of these solutions are hard to deploy due to custom interface technologies and proprietary drivers (I think of these devices as standards based inside and proprietary outside).  The FlashSystem 900 delivers all of the benefits of NVMe today but without requiring you to change your storage network.  I think of it as proprietary inside but standards based outside.   I think the choice between these options is easy.  The fastest path to improved application performance is with the FlashSystem 900.

Cloud Grid Architecture

June 30, 2016

by Woody Hutsell, AppICU

Prevent cloud failures with grid architecture

Public and private cloud architectures fail with alarming frequency. David Linthicum, with Cloud Technology Partners, wrote in an article – Bracing for the Failure of Your Private Cloud Architecture – for TechTarget’s SearchCloudComputing that a major problem with private cloud deployments results from reusing the same hardware they used for their traditional IT. Specifically, he comments that “hardware requirements for most private cloud operating systems are demanding” and later that “If the hardware doesn’t have enough horsepower, the system will begin thrashing, which causes poor performance and likely a system crash.”

Andrew Froehlich, writing 9 Spectacular Cloud Computing Fails for InformationWeek, extends this thought to the public cloud when he says that one of the three key reasons cloud service providers fail is due to “beginner mistakes on the part of service providers…when the provider starts out or grows at a faster rate than can be properly managed by its data center staff.”

Serving up applications in the cloud is different from traditional IT. Cloud deployments thrive when ease of application deployment is matched by ease of management combined with consistent performance under all workloads. Successful cloud deployments support many demanding applications and customers. With the increasing diversity of hosted applications come some infrastructure headaches. We often custom tailor our traditional IT environments to meet the needs of a specific application or class of applications.  We know it has certain peaks for online transaction processing or batch processes. We know when we can perform maintenance. With the cloud, success means we have many applications with overlapping (or not) peak performance periods. With the cloud, we may be more likely to see constant use resulting in fewer opportunities to perform maintenance and restructure our storage to balance for intense workloads.

Successful cloud deployments can challenge and break traditional storage from a performance point of view. Traditional storage scales poorly. Whether the traditional storage array uses HDD or hybrid architectures, it will experience the same problem: as the number of I/Os to the system increases, the system performance will degrade rapidly. With an all-HDD system the latency will begin high and rapidly climb; with a hybrid configuration (SSD + HDD), the system latency will start lower, stay low longer but then rapidly climb.  When latency climbs, applications and users suffer.

Successful cloud deployments can also challenge and break traditional storage from a management point of view. Traditional storage arrays are difficult to configure and deploy. It is not unheard of for initial deployments of scalable traditional storage to take days or sometimes weeks for the system to be tuned so that applications are properly mapped to the right RAID groups. Do you need a RAID group with SSDs; do you need a tiered deployment with SSDs, SAS, and SATA? How many drives are needed in each RAID group?  Should you implement RAID 0, 1, 5 or 6?  Once sized, configured, and deployed, further tweaking of these systems can be administrator intensive. When workloads change, as is the expectation in a cloud deployment, how quickly can you create new volumes and what happens when the performance needed for an application exceeds what the system is capable of delivering? The hard answer is that traditional storage was not designed for the cloud.

Fortunately, IBM has a solution – the IBM FlashSystem A9000, a modular configuration that is also available as the IBM FlashSystem A9000R, a multi-unit rack model. The new IBM FlashSystem family members tackle the performance and management issues caused by successful cloud deployments. Where the cloud needs consistent low latency even as I/O increases, FlashSystem A9000 applies low latency all-flash storage. Where the cloud needs simplified management, the systems apply grid storage architecture.

It all starts with the configuration. FlashSystem A9000 customers do not have to configure RAID groups; the system automatically implements a Variable Stripe RAID within each MicroLatency flash module and a RAID-5 stripe across all of the modules in an enclosure. An administrator configuring the system creates volumes and assigns those volumes to hosts for application use. Every volume’s data is distributed evenly across the grid controllers (this is where the storage services software runs) and the flash enclosures (this is where the data is stored). This grid distribution prevents hot spots and never requires tuning in order to maintain performance. No tuning means substantially less on-going system management. When the rack-based FlashSystem A9000R is expanded it automatically redistributes the workloads across the new grid controllers and flash enclosures.
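The idea of even distribution and automatic redistribution can be illustrated with a toy model.  This is not how the FlashSystem A9000 places data (the actual algorithm is not described here); it is a minimal round-robin sketch, and the partition count, node names, and function names are all my own assumptions.

```python
from collections import Counter


def place_partitions(num_partitions, nodes):
    """Assign partition i to a node round-robin; returns {partition: node}."""
    if not nodes:
        raise ValueError("need at least one node")
    return {p: nodes[p % len(nodes)] for p in range(num_partitions)}


def load(placement):
    """Count how many partitions each node holds."""
    return Counter(placement.values())


# Hypothetical grid: a volume cut into 12 partitions over 2 enclosures.
nodes = ["enclosure-1", "enclosure-2"]
before = place_partitions(12, nodes)

# Expand the grid by one enclosure and recompute the placement.
after = place_partitions(12, nodes + ["enclosure-3"])
moved = sum(1 for p in before if before[p] != after[p])

print(dict(load(before)))   # partitions per enclosure before expansion
print(dict(load(after)))    # partitions per enclosure after expansion
print(f"{moved} of 12 partitions moved")
```

No enclosure ends up holding more than its share, which is the property that prevents hot spots.  Plain round-robin moves more partitions than strictly necessary when the grid grows; a production system would presumably use a placement scheme that minimizes data movement, but the administrator sees the same thing either way: no tuning.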

When an I/O comes into these new FlashSystem arrays, it is written to three separate grid controllers simultaneously. These I/Os are cached in controller RAM and the write is considered committed from the application’s point of view. In this way, the application is not slowed down by data reduction. Next, the three controllers distribute the pattern reduction, inline data deduplication, and data compression tasks across all the grid controllers, thus providing the best possible data reduction performance before writing the data to the flash enclosure(s). Data can be written across any of the flash enclosures in the system, preserving the grid architecture and distribution of workload. When data is written to flash inside the flash enclosure, it is distributed evenly across the flash in a way that ensures consistent low latency performance. All of this is aided by IBM FlashCore™ technology which provides a hardware only data path inside the flash enclosure during the time data is written persistently to flash. The flash storage is housed in IBM MicroLatency® modules whose massively parallel array of flash chips provides high storage density, extremely fast I/O, and consistent low latency.
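The ordering described above (acknowledge from RAM first, reduce data afterwards) can be sketched as a toy model.  This is an illustration only, not IBM's implementation: the class and method names are hypothetical, SHA-256 fingerprints stand in for deduplication, zlib stands in for compression, pattern reduction is omitted, and the reduction work runs in one place rather than being spread across controllers.

```python
import hashlib
import zlib


class GridSketch:
    """Toy write path: replicate to RAM caches, acknowledge, reduce later."""

    def __init__(self, num_controllers=4, replicas=3):
        if replicas > num_controllers:
            raise ValueError("replicas cannot exceed controllers")
        self.caches = [dict() for _ in range(num_controllers)]
        self.replicas = replicas
        self.flash = {}   # fingerprint -> compressed block
        self.index = {}   # block address -> fingerprint

    def write(self, address, data):
        """Cache the block on `replicas` distinct controllers, then ack."""
        n = len(self.caches)
        for i in range(self.replicas):
            self.caches[(address + i) % n][address] = data
        return "committed"   # the application is released here

    def destage(self):
        """Deduplicate and compress cached blocks, then persist them."""
        pending = {}
        for cache in self.caches:
            pending.update(cache)
            cache.clear()
        for address, data in pending.items():
            fingerprint = hashlib.sha256(data).hexdigest()
            if fingerprint not in self.flash:      # store unique data once
                self.flash[fingerprint] = zlib.compress(data)
            self.index[address] = fingerprint

    def read(self, address):
        """Serve from cache if present, else decompress from flash."""
        for cache in self.caches:
            if address in cache:
                return cache[address]
        return zlib.decompress(self.flash[self.index[address]])


grid = GridSketch()
ack = grid.write(0, b"A" * 4096)
grid.write(1, b"A" * 4096)          # duplicate of block 0
grid.write(2, b"B" * 4096)
copies = sum(1 for cache in grid.caches if 0 in cache)
grid.destage()
print(ack, copies, len(grid.flash))
```

The point of the sketch is the ordering: `write` returns before any hashing or compression happens, so the application never waits on data reduction, and the duplicate block costs no extra flash once `destage` runs.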

Together these technologies are a real blessing for the cloud service provider (CSP). When new customers arrive, CSPs know they can easily allocate new storage to new customers and not worry about special tuning to ensure the best performance possible. When existing customers’ performance demands skyrocket, CSPs know that their FlashSystem A9000-based systems offer enough performance to match the growing requirements of their customers without negatively impacting other customers. And when launching or expanding their businesses, CSPs know that FlashSystem A9000 can eliminate one of the leading causes of cloud offering failures, the inability of storage architectures to scale.

For more information, read the article Grid Storage Technology and Benefits by Ray Lucchesi of Silverton Consulting.

The new storage UI from IBM: simply sophisticated

May 26, 2016

What has twenty patents, eight tentacles, and is cooler than a six-pack on a scorching day? Hint: it “lives” in the recently announced IBM FlashSystem A9000. Give up? It’s the IBM…

Source: The new storage UI from IBM: simply sophisticated