By maximizing primary storage data protection, organizations can better protect themselves from ransomware, accidental data loss, and site-wide disasters. Most primary storage systems have all the essential features to protect an organization’s data. The problem is that their vendors, to get to market quickly, tie themselves to a legacy storage IO stack that can’t fully exploit those critical data protection features. That shortfall forces customers to rely on separate backup and replication solutions, which add cost and complexity to meeting the organization’s recovery point and recovery time objectives.
Legacy Primary Storage Limitations That Put Your Data at Risk:
- Write caches that falsely confirm writes
- Limited drive redundancy
- Slow RAID rebuild times
- Limited snapshot quantity and retention
- Expensive replication
It seems like the combination of RAID, snapshots, and replication should meet most organizations’ data protection needs. Unfortunately, because legacy storage systems rely on the traditional storage IO stack, they can’t meet those requirements. To make matters worse, these vendors try to circumvent the storage IO stack with a write cache, which puts data even more at risk. These shortcomings are why backup and replication is a multibillion-dollar market.
StorONE took the exact opposite approach when bringing its Enterprise Storage Platform to market. We built a new storage engine from the ground up, collapsing the legacy storage stack and refactoring core storage algorithms. As a result, our rewritten features encourage you to reset your primary storage data protection expectations.
Steps to Reset Your Primary Storage Data Protection Expectations
Step 1 – Eliminate the Write Cache
The first step to maximizing primary storage data protection is to eliminate the write-cache. The widespread use of the legacy storage stack forces every storage system on the market today, except for StorONE, to use a write-cache to circumvent the latency impact of the storage IO stack. Write caching uses RAM to provide much faster write acknowledgments to the application.
The problem is that RAM is volatile, so any vendor servicing the enterprise data center must take steps to prevent data loss. Typically, they use capacitor-backed RAM and mirror that RAM to another storage controller. These steps add to the cost of the storage system and reduce the performance benefit of caching. Most problematic, the write cache still puts data at further risk.
StorONE, because it rewrote the storage IO stack, does not use a write cache yet still delivers more performance, per drive, than any other storage system on the market today. We call the feature DirectWrite. With DirectWrite, all writes are to persistent media. So secure is DirectWrite that you can remove power from all nodes in a StorONE system simultaneously and not lose data.
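To make the difference in acknowledgment behavior concrete, here is a minimal toy model in Python. It is a sketch under stated assumptions, not StorONE's DirectWrite implementation or any vendor's cache design: the class names, the dictionary-based "media," and the `power_loss` method are all hypothetical. It also models an unprotected cache on purpose, because that is the exposure the capacitors and controller mirroring described above exist to cover.

```python
class WriteBackCacheArray:
    """Toy model: acknowledges a write once it is in volatile RAM."""

    def __init__(self):
        self.ram_cache = {}  # volatile: contents vanish on power loss
        self.media = {}      # persistent: contents survive power loss

    def write(self, block, data):
        self.ram_cache[block] = data
        return "ack"         # acknowledged before the data reaches media

    def flush(self):
        # Destage cached blocks to persistent media.
        self.media.update(self.ram_cache)
        self.ram_cache.clear()

    def power_loss(self):
        self.ram_cache.clear()

    def read(self, block):
        return self.ram_cache.get(block, self.media.get(block))


class DirectToMediaArray:
    """Toy model: acknowledges only after the data is on persistent media."""

    def __init__(self):
        self.media = {}

    def write(self, block, data):
        self.media[block] = data
        return "ack"

    def power_loss(self):
        pass  # nothing volatile to lose

    def read(self, block):
        return self.media.get(block)


def survives_power_loss(array):
    """Write one block, cut power right after the ack, then read it back."""
    assert array.write(7, b"payroll") == "ack"
    array.power_loss()
    return array.read(7) == b"payroll"


cached_ok = survives_power_loss(WriteBackCacheArray())
direct_ok = survives_power_loss(DirectToMediaArray())
print(cached_ok, direct_ok)  # prints: False True
```

In this model the application received an "ack" in both cases, but only the direct-to-media array can still return the block after the power cut.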
Step 2 – Highly Available Storage
The next step to maximizing primary storage data protection is delivering a highly available (HA) storage solution, a table-stakes requirement for enterprise storage systems. The problem facing legacy storage vendors is how to deliver HA with the added complications of a write cache. The required mirroring of cache data means that vendors must make sure the mirror occurs consistently, and they must make sure they don’t suffer from a split-brain on recovery.
StorONE, partly because it doesn’t require a write cache and partly because it did the hard work of rewriting the storage stack, delivers a much simpler and more reliable HA solution. Failover is far more consistent, and you can use dissimilar hardware within the same cluster.
Step 3 – Better Protection from Media Failure
Because of the sheer number of drives a storage system might have, the most vulnerable element in the storage infrastructure is the individual media. Maximizing primary storage data protection requires rethinking how a storage system protects against media failure. Most vendors offer tried-and-true RAID to meet the requirement for data access during a media failure. RAID, however, is full of challenges. When a drive fails, RAID significantly impacts the performance of production applications.
The biggest problem with RAID, however, is the time it takes to recover from a media failure. Hard disk drives of 14TB to 18TB can take days to rebuild and return the array to a protected state. Moving to flash storage to avoid slow RAID rebuilds is only a temporary fix. As flash drive density increases, we are already seeing rebuild times approach double-digit hours.
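A rough back-of-the-envelope calculation shows why high-density drives take so long to rebuild. The Python sketch below assumes a rebuild must rewrite the whole drive at a sustained rate; the function name and the 50, 100, and 200 MB/s rates are illustrative assumptions, not measurements from any particular array. Effective rates in production are often lower because the rebuild competes with application IO.

```python
def rebuild_hours(capacity_tb, rebuild_mb_per_s):
    """Hours needed to rewrite a whole drive at a sustained rebuild rate."""
    capacity_mb = capacity_tb * 1_000_000  # decimal terabytes to megabytes
    return capacity_mb / rebuild_mb_per_s / 3600


# Assumed sustained rebuild rates, in MB/s, for illustration only.
for capacity_tb in (14, 18):
    for rate in (50, 100, 200):
        hours = rebuild_hours(capacity_tb, rate)
        print(f"{capacity_tb} TB at {rate} MB/s: {hours:.1f} h ({hours / 24:.1f} days)")
```

Under these assumptions, an 18TB drive rebuilt at 100 MB/s takes 50 hours, a little over two days, during which the array runs with reduced protection.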
StorONE’s vRAID is a complete rewrite of erasure coding. With it, we can rebuild high-density hard disk drives in less than two hours, flash drives in less than five minutes. During the rebuild process, there is almost no impact on performance. vRAID enables you to leverage high-density flash and HDD without compromise. You can also mix and match drive sizes within the same volume group.
Step 4 – Better Protection from Ransomware
IT often uses snapshots for the rapid recovery of recently deleted user data. They also use snapshots to feed the backup process. Many customers were counting on snapshots for maximizing primary storage data protection. Unfortunately, in both cases, most organizations only retain a handful of snapshots because of legacy software limitations, making them ineffective for ransomware protection and the recovery of data older than a few days.
Legacy snapshots are often hard to recover in any meaningful way. For example, IT cannot easily search snapshots by date/time or search by filename.
StorONE’s efficient storage foundation enables us to take and retain millions of snapshots without impacting performance. Our snapshot technology, S1:Snap, is space-efficient and only consumes additional capacity as the production volume changes. With S1:Snap running on the Enterprise Storage Platform, you can retain those snapshots indefinitely and search them quickly, either by date and time or by filename.
Combine these capabilities with S1:Snap’s ability to take a snapshot every minute and retain it indefinitely, and you have the ultimate protection against a ransomware attack.
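To put a one-minute snapshot interval in perspective, the following Python sketch counts how many snapshots accumulate over time and shows how a date/time search can locate the last snapshot taken before an infection. It is an illustration only, not S1:Snap's interface: the function names, the in-memory list of timestamps, and the dates are hypothetical.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

SNAPSHOTS_PER_DAY = 24 * 60  # one snapshot per minute


def snapshots_retained(days):
    """Number of snapshots accumulated after `days` of one-minute snapshots."""
    return days * SNAPSHOTS_PER_DAY


def last_clean_snapshot(snapshot_times, infection_time):
    """Newest snapshot taken strictly before infection_time, or None.

    snapshot_times must be sorted in ascending order.
    """
    i = bisect_left(snapshot_times, infection_time)
    return snapshot_times[i - 1] if i > 0 else None


# Hypothetical example: one week of per-minute snapshots.
start = datetime(2024, 1, 1, 0, 0)
snapshot_times = [start + timedelta(minutes=m) for m in range(snapshots_retained(7))]

infection = datetime(2024, 1, 5, 14, 30, 20)
clean = last_clean_snapshot(snapshot_times, infection)
print(clean, infection - clean)
print(snapshots_retained(365))
```

One snapshot per minute adds up to 525,600 snapshots per year, so indefinite retention reaches the millions in about two years. In the example, the last clean snapshot is 20 seconds older than the infection, which is the most data the recovery would lose.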
Step 5 – Better Protection from Disaster
Primary storage disaster recovery has a cost problem. It stems from the storage vendor’s requirement that the target system is nearly identical to the source system. This requirement means if you invest in an all-flash array in the primary data center, you must invest in an all-flash array at your DR site that you may never use. If you decide to replicate to the cloud for DR, you need a similar class of VMs so that the vendor’s inefficient software can run at some level of performance.
There is also a technology problem, as we discuss in our blog, “Three steps to maximum disaster recovery (DR) success.”
StorONE’s S1:Replicate provides continuous replication to the DR site. At a workload level, you can choose to replicate synchronously to an on-campus system, asynchronously to a remote site, asynchronously to the cloud, or all of the above. The target does not need to match the source. For example, you can replicate from an all-flash array to a hybrid storage system or to HDD-based resources in the cloud.
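One way to picture a per-workload replication policy is as a small data structure with one entry per target. The Python sketch below is a hypothetical schema, not S1:Replicate's actual configuration format: every class, field, and value name is an assumption made for illustration. The validation rule, that synchronous replication is limited to on-campus targets, simply mirrors the three options listed above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    SYNC = "synchronous"
    ASYNC = "asynchronous"


@dataclass(frozen=True)
class Target:
    name: str
    location: str  # "campus", "remote", or "cloud"
    mode: Mode
    media: str     # e.g. "all-flash", "hybrid", "hdd"


@dataclass
class WorkloadPolicy:
    workload: str
    source_media: str
    targets: list = field(default_factory=list)

    def validate(self):
        """Return a list of problems; an empty list means the policy is consistent."""
        errors = []
        for t in self.targets:
            if t.location not in ("campus", "remote", "cloud"):
                errors.append(f"{t.name}: unknown location {t.location!r}")
            elif t.mode is Mode.SYNC and t.location != "campus":
                errors.append(f"{t.name}: synchronous replication assumes an on-campus target")
        return errors


# "All of the above": one workload, three targets, none matching the source media.
policy = WorkloadPolicy(
    workload="oracle-prod",
    source_media="all-flash",
    targets=[
        Target("campus-b", "campus", Mode.SYNC, "hybrid"),
        Target("dr-site", "remote", Mode.ASYNC, "hybrid"),
        Target("cloud-dr", "cloud", Mode.ASYNC, "hdd"),
    ],
)
print(policy.validate())
```

Because the target media is recorded per target and never compared with the source media, the sketch reflects the point that the DR copy can live on less expensive hardware than production.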
Conclusion
The StorONE Enterprise Storage Platform provides the absolute minimum total cost of ownership and the absolute maximum level of data protection. It is so flexible and cost-effective that some customers start using us for archive or backup storage and gradually start placing production Oracle and VMware workloads on the platform. It is designed to require no storage refreshes and no data migrations for more than ten years.
Learn More
Register for a personal, PowerPoint-free, 1-on-1 whiteboard walkthrough of the technology with me. I will show you how our platform can maximize your data protection at minimum TCO during the live, interactive session.