StorONE Blog

Hybrid Storage – A Balance

By James Keating, StorONE Solution Architect

Hybrid storage is a concept that has been around since the beginning of storage. The idea is simple: keep the data you use regularly on readily accessible media, and keep data you need less often in some form of long-term storage. This could be as simple as keeping some things on your phone and others on good old paper in a file cabinet. Essentially, it is the art of placing frequently needed data on performance-oriented media and less frequently used data on more cost-effective long-term storage.

We have long had hierarchical storage systems that kept some data on spinning disk and some on tape. Automated data-mover software retrieved data onto spinning disk when needed, while older or less used data sat on tape. This was often accomplished using a cache. A cache has some inherent issues and requires specific architecture to make it work. Because data in the cache has not yet been moved to a more stable location, the cache must be protected against failures. Thus, the storage controllers need additional components, such as battery backing and cache mirroring, all of which add cost. Today, when people talk about hybrid storage, they most often mean flash as the upper tier and spinning media as the lower tier. They may or may not be assuming a cache layer.

The issue with current hybrid technologies is that, in most cases, they are not fully optimized. The first optimization issue is caching itself. Many hybrid systems need a cache-enabled storage system to allow for hybrid data movement (from cache to flash to hard drives), and they use the cache as the primary mechanism for determining whether data is hot or cold. The cache itself can defeat one of the main purposes of a hybrid system, which is cost savings. If one needs to invest in more CPU, memory, and battery backing inside the storage controller, the controller costs will eat into the savings from the hard drives.

The second issue is performance consistency. A traditional hybrid system can run into performance challenges when the upper tier begins to fill. This can be caused by an undersized upper tier or by bursts of workload that overrun it. When this happens, the array goes into bypass mode: new writes go directly to the lower tier, which significantly impacts performance, while the upper tier is working hard to evacuate older data to the lower tier. It becomes an I/O competition, with new writes competing against upper-tier evacuations for I/O on the lower tier, and performance suffers. This I/O spiral of doom is why many have predicted the ultimate death of the hard drive, arguing that lower-tier media is not worth the potential performance risk.
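To make the bypass scenario concrete, here is a minimal sketch of a two-tier array in Python. It is a toy model, not a description of any vendor's implementation: the class name, the capacity and bandwidth figures, and the rule that bypass writes are served before evacuations are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class HybridArray:
    """Toy two-tier array; all quantities are hypothetical units per tick."""
    upper_capacity: int      # space available on the fast tier
    lower_bandwidth: int     # I/O the slow tier can absorb in one tick
    upper_used: int = 0

    def tick(self, incoming: int) -> dict:
        """Process one interval of incoming writes and report what happened."""
        free = self.upper_capacity - self.upper_used
        to_upper = min(incoming, free)
        bypass = incoming - to_upper           # writes forced straight to the lower tier
        self.upper_used += to_upper

        # Assumption: bypass writes and evacuations share the lower tier's
        # bandwidth, and bypass writes are served first.
        bypass_served = min(bypass, self.lower_bandwidth)
        evacuated = min(self.upper_used, self.lower_bandwidth - bypass_served)
        self.upper_used -= evacuated
        return {
            "bypass": bypass,
            "evacuated": evacuated,
            "stalled": bypass - bypass_served,  # writes the lower tier could not absorb
        }


def simulate(ticks: int, incoming: int) -> list:
    """Run a sustained burst against a small hypothetical array."""
    array = HybridArray(upper_capacity=100, lower_bandwidth=20)
    return [array.tick(incoming) for _ in range(ticks)]


for step, result in enumerate(simulate(ticks=5, incoming=50), start=1):
    print(step, result)
```

In this model the upper tier absorbs the first two ticks, bypass begins on the third, and by the fourth tick evacuation has dropped to zero because bypass writes consume all of the lower tier's bandwidth. That is the competition described above: the tier that most needs to drain is the one that can no longer do so.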

The Future of Hybrid Storage

However, that argument does not hold up under a true cost investigation. Hard drives are both decreasing in cost and increasing in density, which keeps driving the cost per TB of this media lower. It reminds me of the folks who told me, when I was a young IT worker starting out, that tape was dead. If tape is dead, it is haunting almost every data center I have been in throughout my career. I will go out on a limb and predict we will not see the true death of the hard drive for many years; the economics will keep it in play. So, with the cost staying low and the densities increasing, I would argue the challenge is finding effective ways to utilize that low-cost space.

Holistic Hybrid Storage with SAFE Principles

Enter the idea of Holistic Hybrid Storage: using those dense, cost-effective drives while preventing the I/O death spiral through software and architecture design. To that end, I architect hybrid storage using a principle I call SAFE.

Security – I look for storage solutions with built-in security features. Hybrid is not only a concept for storage but also for security. In security, hybrid means layers of protection. Each layer is different or works from a different perspective, allowing for better overall security. Examples of this are self-encrypting drives, immutable snapshots, and 3-2-1 data copy methodologies, all combined. So, any hybrid storage solution needs to be able to play its part in this better-together security approach.

Accessible – This is tied directly to the I/O spiral I mentioned above. I want data to live on the tier that satisfies its performance needs, and I want the system to move cold data to the lower tiers while still being able to prevent that dreaded bypass mode situation. In the case of StorONE, this is accomplished using machine learning to proactively migrate data to the lower tier. This is a much less I/O-intensive method, as it uses the system’s own less busy times to stage data to the lower tier and then allows for simple procedures to free up space on the upper tier when required.

Foundational – This is where most technology falls down. If the system is too complicated, requires a lot of manual intervention, or doesn’t fit into the business processes, it will have issues. The best example I have seen of this surfaces in the backup space. The backup team backs up items based on the business requirements they are given. This means they take great care with production data, track and watch the backups carefully, and ensure things are getting backed up. The problem is that the process of defining the business requirements and the process by which various teams place data onto storage systems don’t always align. A team may put a critical part of a production system on storage they have access to, but it may not be a storage location that gets backed up rigorously. Or a team may add a new application that doesn’t use the local storage or the cloud storage the backup team administers, and as such it is orphaned from backups.

Effective – All technology must be effective to justify staying on the floor, so to speak. I think this part of the design is the most self-explanatory. We need a solution that allows for predictable performance and keeps the cost per TB low. This is an area that StorONE has spent years working to perfect. First, with a truly cacheless architecture, StorONE storage controllers do not need expensive caching infrastructure, and as such, the investment in the controllers does not take away from the savings one can achieve by using hard drives. Second, StorONE has vRAID technology that allows for fast rebuilds of failed hard drives. One reason many systems avoid large spinning drives is that rebuilding after a drive failure can take days to weeks. StorONE can do this in a couple of hours.
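As a rough back-of-the-envelope illustration of why rebuild time grows with drive size, the sketch below divides drive capacity by a sustained rebuild rate. The function name and the capacity and throughput figures are hypothetical inputs chosen for illustration; they are not StorONE measurements, and the calculation says nothing about how vRAID works internally.

```python
def rebuild_hours(drive_tb: float, rebuild_mb_per_s: float) -> float:
    """Hours needed to rewrite a full drive's worth of data at a sustained rate."""
    total_bytes = drive_tb * 1e12                      # decimal terabytes
    seconds = total_bytes / (rebuild_mb_per_s * 1e6)   # decimal megabytes per second
    return seconds / 3600


# Hypothetical inputs: one 20 TB drive at two different effective rebuild rates.
print(round(rebuild_hours(20, 50), 1))     # slow, contended rebuild
print(round(rebuild_hours(20, 2000), 1))   # much higher effective rebuild rate
```

With these assumed inputs, the first case works out to roughly 111 hours (more than four days) and the second to under three hours. Whatever the real numbers are in a given environment, the point is that the effective rebuild rate, not just the drive size, determines whether large hard drives are practical.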

StorONE also uses machine learning to simplify the data movement from the upper tier to the lower tier to avoid the I/O death spiral I mentioned earlier. We have proactive data cascading, which, without a long technical explanation, means we can have built-in features to allow for the predictable movement of data between tiers.
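The general idea of staging cold data during quiet periods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it swaps the machine learning described above for a simple last-access heuristic, and the `Extent` type, the thresholds, and both function names are hypothetical rather than part of any StorONE interface.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    name: str
    last_access: int          # tick of the most recent read or write
    staged: bool = False      # True once a copy already exists on the lower tier


def stage_cold_extents(extents, now, load, idle_threshold=0.3, cold_after=10, budget=2):
    """During quiet periods, copy up to `budget` of the coldest extents to the lower tier."""
    if load >= idle_threshold:
        return []                                   # busy: leave the lower tier alone
    cold = [e for e in extents if not e.staged and now - e.last_access >= cold_after]
    cold.sort(key=lambda e: e.last_access)          # coldest first
    chosen = cold[:budget]
    for extent in chosen:
        extent.staged = True                        # stands in for the background copy
    return [extent.name for extent in chosen]


def free_upper_space(extents, needed):
    """Release up to `needed` staged extents from the upper tier; no copy is required."""
    staged = sorted((e for e in extents if e.staged), key=lambda e: e.last_access)
    released = staged[:needed]
    for extent in released:
        extents.remove(extent)                      # lower-tier copy already exists
    return [extent.name for extent in released]


upper_tier = [Extent("a", 0), Extent("b", 5), Extent("c", 95), Extent("d", 2)]
print(stage_cold_extents(upper_tier, now=100, load=0.9))   # busy, nothing staged
print(stage_cold_extents(upper_tier, now=100, load=0.1))   # idle, coldest extents staged
print(free_upper_space(upper_tier, needed=1))              # space freed without new lower-tier writes
```

The design point is that the expensive step, writing to the lower tier, happens when the system is quiet. When the upper tier later needs room, freeing space is a cheap release of data that already has a lower-tier copy, so new writes do not have to compete with evacuations.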

StorONE: A Cacheless Hybrid Storage Solution

StorONE is an example of a hybrid storage solution that adheres to the SAFE principles. It utilizes a cacheless architecture, eliminating the need for expensive caching hardware and reducing storage controller costs. Additionally, StorONE’s vRAID technology enables rapid rebuilding of failed hard drives, addressing a concern that often deters users from deploying large hard drives in storage systems. StorONE also uses machine learning to optimize data movement between tiers, preventing performance issues caused by data I/O bottlenecks.

In conclusion, hard drives remain a viable storage option due to their cost-effectiveness and increasing density. By using a SAFE design methodology, hybrid storage solutions can leverage the economic advantages of hard drives while mitigating their performance limitations. StorONE is a specific example of a hybrid storage solution incorporating these principles.

To learn more about how a StorONE system may be able to help in your environment, please contact us: info@storone.com
