By Woody Hutsell, appICU
I have a point of view about third party caching (particularly as it applies to external systems, as opposed to caching at the server with PCI-E) that differs from that of many in the industry. Some will read this as bashing a particular product, but that is not my intent. As far as I know, I am not competing with a third party caching solution at any customer site. My goal here is to start a discussion on third party caching; I will lead with my opinions and hope that others weigh in. I am open to changing my mind on this topic, as I have numerous friends in the industry who stand behind this category.
First, some background. Many years ago, in 2003 to be exact, I helped bring to market a product that provided third party caching with RAM SSD. I believed in the product and was able to get many others to believe in it. What I was not able to do was get many people to buy it. As I look at solutions on the market today, I can see that companies trying to sell third party caching solutions are encountering the same obstacles and are fixing or working around them. Here are some problems I have experienced with third party caching solutions:
1. Writes. The really delicious problem to solve several years ago with a RAM caching appliance was write performance. Many storage systems had relatively small write caching capabilities that caused major pain for write intensive applications. A large RAM SSD (at the time, I think we were using 128GB of RAM) as a write cache was a major problem solver for these environments. Several things have happened to make selling write caching as a solution more difficult:
• RAID systems increasingly offered reasonable cache levels, narrowing the field of customers that need write caching. When we offered this RAM write cache, we thought Xiotech customers were the perfect target because Xiotech did not believe in write caching at the time. In fact, the combined solution worked out pretty well, but it was only useful until Xiotech realized that offering their own write cache could solve most customer problems.
• Third party write caching introduces a point of failure into the solution. If you cache writes, you have to be at least as reliable as the system you are caching; otherwise the customer takes a net loss in reliability.
• Write caching is nearly impossible if the backend storage array has replication or snapshot capabilities. An array with snapshots has to be cache aware when it takes a snapshot, or it risks snapshotting without the full data set (acknowledged writes still sitting in the external cache). I have seen companies try to get around this, but most of the solutions look messy to me.
• Putting a third party device from a small company in front of a big expensive product from a big company is a good way for a customer to lose support. We realized early on that the only way for this product to really succeed was to get storage OEMs to certify it and approve it for their environments (we did not do very well at this).
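The reliability point above can be put in numbers. A minimal sketch: the availabilities of components in series multiply, so an inline cache that is less reliable than the array drags the whole data path down. The figures below (a five-nines array, a three-nines cache) are illustrative assumptions, not vendor specs:

```python
def chain_availability(*availabilities):
    # Components in the data path are in series: the chain is only up
    # when every component is up, so availabilities multiply.
    result = 1.0
    for a in availabilities:
        result *= a
    return result

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

# Illustrative figures only (assumptions, not vendor specs):
array_only = 0.99999                                  # "five nines" array
with_cache = chain_availability(0.99999, 0.999)       # plus a "three nines" cache

downtime_array = (1 - array_only) * MINUTES_PER_YEAR  # ~5.3 minutes/year
downtime_chain = (1 - with_cache) * MINUTES_PER_YEAR  # ~531 minutes/year
```

Under these assumed numbers, inserting the cache turns roughly five minutes of expected annual downtime into roughly nine hours, which is the "net loss in reliability" in concrete terms.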
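The snapshot hazard described above can be sketched in a few lines. This toy model (class names and block values are hypothetical) shows why an array snapshot taken while a third party cache still holds dirty writes is missing data, and why a cache-aware snapshot must destage first:

```python
class Array:
    def __init__(self):
        self.blocks = {}

    def snapshot(self):
        # Point-in-time copy of what the array itself has seen.
        return dict(self.blocks)

class ThirdPartyWriteCache:
    def __init__(self, array):
        self.array = array
        self.dirty = {}  # writes acknowledged to the host, not yet on the array

    def write(self, block, data):
        self.dirty[block] = data  # the host considers this committed

    def flush(self):
        # What a cache-aware snapshot must trigger first: destage dirty data.
        self.array.blocks.update(self.dirty)
        self.dirty.clear()

array = Array()
cache = ThirdPartyWriteCache(array)
cache.write(0, "committed-to-host")

bad_snap = array.snapshot()   # taken behind the cache's back: block 0 missing
cache.flush()
good_snap = array.snapshot()  # consistent: the full data set is present
```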
2. Reads. Given the challenges with write caching, it seems to me that most companies today are focused on read caching. Read caching solutions have a long history. Gear 6 was one of the first to take the space seriously and had some limited success in environments such as oil & gas HPC and rendering. Some of the companies that have come after Gear 6 seem to be retracing its steps, with markedly different types of hardware and cost. Here are some issues I see with read caching:
• A third party read-only cache adds a write bottleneck, since writes to the cache have to be subsequently written to the storage; in effect, latency injection. I assume there are architectures that get around this today.
• A third party read-only cache really only makes sense if your controller 1) is poorly cached, 2) does not have fast backend storage, 3) is processor limited, or 4) has inherently poor latency. This may be the real long term problem for this market. Whether you are talking about SAN or NAS solutions, all storage vendors today offer Flash SSD as disk storage. In SAN environments, many vendors can dynamically tier between disk levels (thus implementing their own internal kind of caching). NetApp has Flash PAM cards. Both BlueArc and NetApp can implement read caching. The only hope is that the customer has legacy equipment or scoped their solution so poorly that they need a third party caching product.
• Third party caching creates a support problem. Imagine you are NetApp and a customer calls in and says, “I am having problems with my NetApp storage, can you fix it?” Support says, “Describe the environment.” The customer says, “blah…blah…third party cache…NetApp.” NetApp says, “That is not a supported environment.” I always saw this as a major limiting factor for third party caching solutions. How do you get the blessing of the array/NAS vendor so that your customer maintains support after placing your box between the servers and the storage?
• Third party read caching solutions cannot be allowed to become a single point of failure for the architecture.
So, there it is. I am looking forward to insightful comments and feedback from the industry. As you can see, many of my opinions are based on scars from prior efforts in this segment and are not meant to be a reflection on existing products and approaches.