by Woody Hutsell at www.appICU.com
In late 2006, Robin Harris at www.StorageMojo.com wrote “RAM-based SSDs are Toast – Yippie ki-yay”. As a leader of the largest RAM-based solid state storage vendor at the time, I can assure you that his message was not lost on me. In fact, we posted a response to Robin in “A Big SSD Vendor Begs to Differ,” to which Robin famously responded, “If I were TMS, I’d ask a couple of my better engineers to work part time on creative flash-based SSD architectures.” I cannot honestly remember the timing, but it is fair to say that the comment at minimum reinforced our internal project to develop a system that relied on SLC NAND Flash for most of its storage capacity. Within a few years, TMS had transitioned from a RAM-based SSD company to a company whose growth was driven primarily by Flash-based SSDs. Nearly five years after the predicted death of the RAM-based SSD, I thought it would be interesting to evaluate the role of RAM SSDs in the application acceleration market.
First off, it is important to note that RAM-based SSDs are not toast. In fact, a number of companies continue to promote RAM-based SSDs, including my employer, ViON, who is still marketing, selling and supporting RAM-based SSDs. What may be more surprising is that the intervening years have actually seen a few new companies join the RAM-based SSD market. What all of these companies have identified is that there are still use cases for the high performance density available with RAM-based SSDs. In particular, RAM-based SSDs continue to be ideal for database transaction logs, temporary segments or small to medium databases where the ability to scale transactions without sacrificing latency is critical. Customers in the e-commerce, financial and telecom markets will still use RAM SSDs. When a customer says to me that they need to be able to say they have done “everything possible” to make a database fast, I still point them to RAM SSDs if the economics are reasonable. I think the RAM SSD business has promise for these specific use cases, and I will watch with curiosity the companies that try to expand the use cases to much higher capacities.
The second thing to note is that without RAM, Flash SSDs would not be all that appealing. You will probably all recall the reaction to initial Flash SSDs that had write performance slower than hard disk drives. How did the vendors solve this problem? Well for one thing they over-provisioned Flash so that writes don’t wait so much on erases. In enterprise solutions, however, the real solution is RAM. Because the NAND Flash media just needs a little bit of help, a small amount of RAM caching goes a long way toward decreasing write latencies and dramatically improving peak and sustainable write IOPS. This increases the cost and complexity of the Flash SSD but makes it infinitely more attractive to the application acceleration market.
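The write-back role that RAM plays inside an enterprise Flash SSD can be sketched in a few lines. This is a toy illustration of the idea, not any vendor's firmware: the `WriteBackCache` class, the `flash_write` callback, and the capacity figure are all invented for the example.

```python
import collections

class WriteBackCache:
    """Toy sketch of a RAM write buffer in front of slower flash.

    `flash_write` stands in for the slow program/erase path; all
    names and sizes here are illustrative assumptions.
    """

    def __init__(self, flash_write, capacity=4):
        self.flash_write = flash_write           # slow backing store
        self.buffer = collections.OrderedDict()  # fast RAM staging area
        self.capacity = capacity

    def write(self, lba, data):
        # Acknowledge the write as soon as it lands in RAM; the host
        # never waits on a flash program or erase cycle.
        self.buffer[lba] = data
        if len(self.buffer) > self.capacity:
            self.flush_one()

    def flush_one(self):
        # Destage the oldest staged block to flash in the background.
        lba, data = self.buffer.popitem(last=False)
        self.flash_write(lba, data)

flash = {}
cache = WriteBackCache(lambda lba, data: flash.__setitem__(lba, data), capacity=2)
for i in range(5):
    cache.write(i, b"x")
# Three blocks have been destaged to flash; two are still staged in RAM.
```

The point of the pattern is that the host sees RAM latency on every acknowledged write, while the controller destages to flash at its own pace.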
Third, the companies with the most compelling Flash SSD performance characteristics have come out of the RAM SSD market. These companies had developed low latency, high bandwidth controllers and backplanes that were tuned for RAM. Contrast this with the difficulties the integrated storage manufacturers have had since their controllers and backplanes were tuned for hard disk drives.
Casual industry observers might ask a couple of other questions about this market:
- With the rapid decrease in RAM prices, is RAM likely to replace Flash as the storage media of choice for enterprise SSD? No.
- Are the large integrated storage companies likely to add a non-volatile RAM SSD tier in front of their new Flash SSD tier? I tend to doubt it, but would not rule it out completely.
- Aren’t customers that start with Flash going to look to RAM SSD to go even faster? I think some of these customers will want more speed but for most users Flash will be “good-enough”.
- Aren’t customers that start with RAM likely to move to Flash SSD on technology refreshes? Probably not. RAM SSD is addictive. Once you start with RAM SSD, it is hard to contemplate going slower.
To put this all in perspective, Flash SSDs did not kill the RAM SSD market. In some ways, Flash SSD and the big companies who have embraced it have added legitimacy to the RAM SSD market that it lacked for decades. I think RAM SSDs will continue to be an important niche in the overall application acceleration market and anticipate innovative companies introducing new use cases and products over the next five years.
To give credit where credit is due while Flash SSDs did not kill the RAM SSD market, it has come to dominate the enterprise storage landscape like no other technology since the advent of disk storage. Robin Harris may not have accurately predicted the end of RAM SSD but he was at the forefront of analysts and bloggers, including Zsolt at www.StorageSearch.com, predicting Flash SSD’s widespread success.
Good post as always Woody. I just wrote up some information on Flash SSD write handling that provides more detail on sustained write performance with flash:
I assume you are talking about the latency advantage of DRAM versus Flash. In that case I wonder whether network latency to the DRAM-based SSD doesn’t neutralize the latency advantage of DRAM.
Let’s take 4Gbit FC as an example of a communication channel and the smallest 512-byte block. It will take a few microseconds just to send these 512 bytes over the wire (compared to nanoseconds to extract them from a DRAM chip). So communication latency clearly dominates the overall latency here, effectively putting both SSDs in the same “microseconds” ballpark latency-wise.
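The commenter's arithmetic is easy to check. Assuming roughly 400 MB/s of usable payload throughput for 4Gbit FC (4.25 Gbaud line rate with 8b/10b encoding; a common rule-of-thumb figure, not an exact one):

```python
# Back-of-envelope serialization time for a 512-byte block on 4 Gbit FC.
payload_rate = 400e6            # bytes/second of usable throughput (assumed)
block = 512                     # bytes
serialization_us = block / payload_rate * 1e6
print(f"{serialization_us:.2f} us")  # 1.28 us on the wire alone
```

Add framing, HBA, and switch latency on top of that 1.28 µs and you land in the "few microseconds" range the comment describes, versus tens of nanoseconds for a raw DRAM access.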
Now regarding a local DRAM-based SSD: well… why not just add more RAM to the server in that case? As simple as a few more DIMM sticks? Is volatility the issue, just like in that DBMS redo log example?
Thanks for the comments. A couple of thoughts. The latency advantage of DRAM versus Flash is not really lost due to the interconnect. Your typical RAM SSD has write and read latencies of 15 microseconds (once the request has left the server). Your typical Flash SSD, depending on its level of RAM caching, will have writes in the neighborhood of 80 microseconds and reads typically around 220 microseconds or more (system architecture can affect these numbers).

Many of the customers using RAM SSDs for writes were previously hitting RAM cache on their disk array. The problem is that the controller on the typical disk array adds too much latency even to a cache hit. If you can improve a write from 1 millisecond down to 15 microseconds at the storage level, you can do far more single-threaded writes, such as those you might experience with database logs. The other benefit of a RAM SSD is that even with ridiculous write loads, you do not suffer major performance drops over time and can expect nearly linear response even as IOPS scale. Flash systems do not tend to respond as nicely, though still much better than disk-based systems. Some reviews of the SPC-1 results published by Texas Memory Systems will expose these facts pretty clearly.

I think you answered your last question: you cannot use server RAM because your solution has to be non-volatile if you are using it for database writes. Most people who adopt RAM-based SSDs have already tried everything else, including maximizing server RAM, storage array cache, etc.
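Those latency figures translate directly into a ceiling on serialized log writes, since each write must complete before the next one is issued. A quick sketch using the numbers above (the function name is mine, the latencies are from the reply):

```python
# Upper bound on single-threaded, serialized writes: each write must
# complete before the next is issued, so throughput is 1 / latency.
def max_serial_iops(latency_us):
    return 1e6 / latency_us

disk_array_cache_hit = max_serial_iops(1000)  # ~1 ms round trip -> 1,000 writes/s
ram_ssd = max_serial_iops(15)                 # ~15 us -> ~66,667 writes/s
```

That roughly 66x gap is why shaving storage latency matters so much for a single-threaded workload like a database redo log, where no amount of added parallelism helps.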
Woody, thanks for the reply. I agree with all of the above, but I still respectfully disagree with you that DRAM-based SSDs have any brighter future than remaining a niche technology.
Regarding the 220 µsec latency of Flash drives: well… there is no particular reason for a flash-based SSD to exhibit latency as high as 220 microseconds. That’s an artifact of a suboptimal implementation that must go away with proper competition, and I’m convinced that it has already improved with products currently on the market such as the FusionIO/NextIO combo. Two-digit microsecond latencies should be the norm for Flash SSDs.
NAND sense time is a few tens of microseconds and is the only physics-induced latency that cannot be improved by better SSD design; it will therefore always be the principal disadvantage of flash versus DRAM. Besides this, however, everything else can easily be done in 1-3 microseconds all together with proper design.
Everything else is three things, actually:
1. FTL (trivial to do in under one microsecond on a GHz processor)
2. DDR mode for getting data off-chip quickly (the same tech as DRAM now)
3. Advanced ECC (trivial to do in under one microsecond on a GHz processor)
The rest of the route is 100% the same for DRAM and Flash designs.
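Summing the comment's own figures shows why two-digit microsecond flash latency is plausible. All numbers below are the comment's assumptions rather than measurements; I have used 40 µs as a representative NAND sense time within the "few tens of microseconds" it cites.

```python
# Rough flash read-latency budget implied by the comment: NAND sense
# time dominates, while FTL lookup and ECC decode each fit in ~1 us
# on a GHz-class controller. Illustrative figures, not measurements.
budget_us = {
    "nand_sense":   40.0,  # physics-bound; tens of microseconds
    "ftl_lookup":    1.0,
    "ecc_decode":    1.0,
    "ddr_transfer":  1.0,
}
total_us = sum(budget_us.values())  # ~43 us: firmly two-digit microseconds
```

Under these assumptions the 220 µs figure quoted earlier would indeed be mostly implementation overhead, not physics.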
So I’m still convinced that the latency of a remotely accessed DRAM-based SSD is better by a factor of no more than 3-5x, with prices higher by a factor of 10-20x compared to flash. More importantly, as you said, flash latency is already good enough.
That said, I think DRAM SSDs will maintain their niche when one or more of the following hold true:
1. Price insensitivity (either low data volumes or just storage representing a small fraction of larger budget)
2. Re-write intensive workloads.
3. The lowest possible latency plus non-volatility on small blocks (starting from 512 bytes).
I think the great thing is that we are actually in agreement. A careful read of the blog post will suggest that RAM SSD lives and has a key role to play, but it pretty consistently points out that the Flash SSD market has eclipsed the RAM SSD market. I agree it is a niche, but having worked with RAM SSDs for over a decade now, I have seen that RAM SSDs can be used to solve some of the toughest application performance problems faced in the enterprise. There are many ways and places to accelerate reads; accelerating writes in a way that maintains persistence, reliability and availability of data is much harder.
That makes sense.
I’ve re-read the post and now I see that I was terribly mistaken. You never suggested RAM SSDs becoming mainstream, just an “important niche”, which I think is a pretty accurate definition. Not to mention the four-question section. So my apologies here.
Maybe I was misled by the title and your “blessing” for newly funded DRAM-SSD startups?
On a separate note: could you comment on the “RAM clouds” stuff and whether you consider it a kind of DRAM-based SSD or something very different? I mean the technique of using a cluster of commodity servers to hold the whole dataset in their collective RAM, solving volatility by synchronous replication to another server within the cluster, with the usual heavyweight UPS and diesel generator as a kind of “NVM battery”. Maybe as a follow-up post?
Great article! I wonder whether you would like to write something about the failure modes of flash SSDs? I am surprised at the degree of penetration you report in the enterprise market, as I thought these things were for gamerz and mobile use only 😉 but I am learning otherwise, I guess!
Your previous article on consistency groups hints at complexities added to the backup/recovery process by the introduction of an additional intermediate level of storage between RAM and disk, but I wonder if the devices themselves add any intrinsic complexities due to the way that they use redundant writes and white-outs (or at least laggy deletes) rather than real deletes… does any of this restrict the arena in which flash SSDs are trusted with server-room data?
Has there been enough experience of failure modes to know whether spontaneous corruption might occur, or other nightmares, and are additional layers of checksumming, for example, used to guard against such risks? Or is it a non-issue, e.g. if the devices are being introduced into RAID arrays with checksumming already a standard component… Thanks in advance for any comment you can make on this issue. 🙂
I believe the word niche is also misused. I believe DRAM is, and will remain, a targeted-use technology. For example:
1. There are important uses for DRAM in high-rewrite areas where flash would break down over time.
2. DRAM is also important in time-critical areas where updating data in real time matters more than the data itself, such as air traffic control, where old data is less important than current status.
3. DRAM is also important in enterprise backup areas where holding of temporary information is important.
4. Real-time video or picture imaging/transformation into MPEG format, as in security or hospital cameras, where the stream is being stored.
What I am trying to say is that any area where “current” temporary data is being updated “frequently” and can be considered “time critical” or “performance critical” is an ideal place to use DRAM.
Purely adding more RAM to a server does not necessarily resolve these issues; RAM in a server may not be used effectively by the application(s)/OS that is running.
Finally… ‘In late 2006, Robin Harris at http://www.StorageMojo.com wrote “RAM-based SSDs are Toast – Yippie ki-yay”.’
I want to declare the spinning hard drive dead… Yippie ki-yay.