StorONE Blog

Write-Caches Are Unnecessary

Write cache has been an essential component of storage systems for many years, designed to improve performance by buffering writes and minimizing disk access. However, the rise of new storage media has made write cache increasingly unnecessary, with some systems even moving away from it entirely.

StorONE is one vendor that has taken this route, developing a cacheless architecture that unlocks the performance potential of high-performance media.

Why Write-Cache Was Used in Storage

Historically, write cache was needed because hard drives were the primary storage media used in storage systems. Hard drives are slow compared with media such as solid-state drives (SSDs), so buffering writes in cache shortened the time it took to complete a write operation. But with the advent of high-performance media like SSDs and NVMe drives, buffering writes in cache is no longer necessary.

The storage I/O stack is a software architecture that defines how data flows between a computer system and storage devices. The stack is composed of several layers, each with its own function and set of APIs. The layers include the application layer, file system layer, block layer, device driver layer, and storage device layer.

Each layer of the storage I/O stack adds a certain amount of latency to the data access process. For example, the application layer generates I/O requests that pass through the file system layer, where additional operations such as file locking and permission checks can add overhead. The block layer, which handles I/O requests at the block level, introduces additional overhead for operations such as queuing, scheduling, and caching. The device driver layer manages communication between the operating system and the storage device, adding further latency to the process. Finally, the storage device layer includes the hardware components that store and retrieve data, with its own latency-inducing operations such as seek time, rotational latency, and data transfer time, although it is the most heavily optimized of all the layers.

These layers of the storage I/O stack are necessary for providing a robust and flexible storage system, but they also add latency to the data access process. The amount of latency introduced by each layer varies with the specific hardware and software configuration. Even small amounts of added latency can have a significant impact on application performance, particularly for latency-sensitive workloads such as database applications and virtualized environments. That is why write cache was placed between the servers and the disks.
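The layered latency argument above can be sketched as a simple budget. Every figure below is an assumed placeholder chosen only to show the shape of the calculation, not a measurement of any real system, and the function names are hypothetical. The model also ignores any overhead the cache itself adds.

```python
# Illustrative latency budget for a single write, in microseconds.
# All figures are assumed placeholders, not measurements.
SOFTWARE_LAYERS_US = {
    "application": 5,
    "file system": 20,
    "block layer": 15,
    "device driver": 10,
}

DEVICE_US = {
    "hdd": 8000,  # assumed: dominated by seek and rotational latency
    "nvme": 80,   # assumed: no mechanical delay
}


def write_latency_us(device: str, cached: bool = False) -> int:
    """Return the latency the application sees for one write.

    With a write cache the write is acknowledged from RAM, so the
    device time is hidden from the application.
    """
    software = sum(SOFTWARE_LAYERS_US.values())
    return software if cached else software + DEVICE_US[device]


def device_share(device: str) -> float:
    """Fraction of an uncached write spent in the storage device."""
    return DEVICE_US[device] / write_latency_us(device)


for device in DEVICE_US:
    print(
        f"{device}: uncached {write_latency_us(device)} us, "
        f"cached {write_latency_us(device, cached=True)} us, "
        f"device share {device_share(device):.0%}"
    )
```

Under these assumed numbers, a cache hides roughly 8 ms per write on a hard drive but only tens of microseconds on an NVMe drive, which is the intuition behind the sections that follow.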

Why Write-Cache is Becoming a Thing of the Past

Write cache hides the poor performance of persistent media, but it creates problems for storage system designers and puts data at risk of loss. The first problem is that the write cache acknowledges a successful write operation before the data is written to persistent storage media. The second problem is that in the event of a power failure or server crash, any data still held in the RAM used for the write cache is lost with no way to recover it.

This forces vendors to add complicated redundancies and processes to avoid data loss. 
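The two problems described above can be illustrated with a toy model. This is a minimal sketch, not StorONE's or any vendor's implementation: the class names `WriteBackCache` and `DirectPath` and the helper `acknowledged_but_lost` are hypothetical, and plain dictionaries stand in for RAM and persistent media.

```python
class WriteBackCache:
    """Toy model: acknowledges writes from RAM and flushes later."""

    def __init__(self):
        self.ram = {}    # volatile: lost on power failure
        self.media = {}  # persistent: survives power failure

    def write(self, block, data):
        self.ram[block] = data
        return "ack"  # acknowledged before the data reaches media

    def flush(self):
        self.media.update(self.ram)
        self.ram.clear()

    def power_failure(self):
        self.ram.clear()  # anything not yet flushed is gone


class DirectPath:
    """Toy model: acknowledges only after the data is on media."""

    def __init__(self):
        self.media = {}

    def write(self, block, data):
        self.media[block] = data
        return "ack"

    def power_failure(self):
        pass  # nothing volatile to lose


def acknowledged_but_lost(store, writes):
    """Return blocks that were acknowledged but did not survive a power failure."""
    acked = [block for block, data in writes if store.write(block, data) == "ack"]
    store.power_failure()
    return [block for block in acked if block not in store.media]


writes = [(1, b"a"), (2, b"b")]
print("write-back cache lost:", acknowledged_but_lost(WriteBackCache(), writes))
print("direct path lost:", acknowledged_but_lost(DirectPath(), writes))
```

In the cached model, both blocks are acknowledged and then lost because no flush happened before the failure. Real systems guard against this with the kind of redundancy mentioned above, which the toy model deliberately leaves out.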

Enter the Cacheless Architecture

To address this challenge and improve efficiency and performance, StorONE has developed a cacheless architecture that bypasses the write cache to reduce latency and increase throughput.

By eliminating the need for a write cache, StorONE’s DirectWrite reduces latency in the block layer, which can have a significant impact on overall performance. In addition, the architecture includes non-shared media pools that eliminate resource contention up to the limits of the physical resources in a given system, allowing competing workloads to run on the same system. Together, these two features enable a significant increase in performance, making DirectWrite well suited to demanding workloads.

The DirectWrite Performance Advantage

Customers see a more efficient use of storage resources, maximizing performance while reducing costs. By eliminating the need for write cache, organizations can save money on hardware, maintenance, and licensing costs.

Write cache was once an essential component of storage systems but is increasingly unnecessary with the rise of high-performance media. StorONE’s DirectWrite cacheless architecture is designed to deliver better performance and cost savings. By bypassing the write cache entirely, you reduce latency and increase throughput in many situations, and you make more efficient use of your storage resources. With the increasing availability and affordability of high-performance media, a cacheless architecture is the way forward for organizations seeking to optimize their storage infrastructure.

Let Us Prove It to You

In the video below, you’ll see a live performance demonstration of 24 HSG SSD drives achieving over 1.7 million IOPS with less than 0.3 ms of latency.

Below are the results from two workloads that were configured to run on the same storage platform. The figures were recorded while both workloads were running simultaneously.

Controller type and disk configuration of the system: a SuperMicro controller configured as a dual-node high-availability system with 8 NVMe SSDs and 10 HDDs.

Workload 1: 

Protocol: The test was run against 8 VSCs over NVMe-oF (TCP) for small-block random I/O.

Configuration of the Virtual Storage Container (VSC): The VSCs for this workload were pinned to an All-Flash VSC. 

Host: A single Ubuntu Linux host to drive the small block random I/O.

Test tool: FIO was the load generator tool used.

Workload 2:

Protocol: Block (iSCSI), for large sequential reads and writes.

Configuration of the Virtual Storage Container (VSC): The second workload used 4 VSCs that targeted an HDD-only configuration. Two initiators were used in the test.

Host: A single Windows host to drive the large sequential I/O.

Test tool:  FIO was the load generator tool used. 

Below are the system performance metrics for IOPS, latency, I/O size, and throughput:

System Read IOPS: 657,868

System Write IOPS: 34,158

System Read latency: 0.0 ms

System Write latency: 0.0 ms

System Read IO size: 3 KB

System Write IO size: 2 KB

System Read throughput: 4.6 GB/s

System Write throughput: 139.9 MB/s
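As a rough cross-check of the figures above, dividing throughput by IOPS gives the average number of bytes moved per operation. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes), and the function name is hypothetical. The derived values need not match the reported IO size figures, which may be sampled or rounded differently by the monitoring view.

```python
GB = 10**9  # assumption: decimal gigabytes
MB = 10**6  # assumption: decimal megabytes


def avg_io_size_bytes(throughput_bytes_per_s: float, iops: float) -> float:
    """Average bytes per I/O implied by a throughput and an IOPS figure."""
    return throughput_bytes_per_s / iops


# Figures taken from the system metrics listed above.
read_size = avg_io_size_bytes(4.6 * GB, 657_868)
write_size = avg_io_size_bytes(139.9 * MB, 34_158)

print(f"read:  about {read_size:.0f} bytes per I/O")
print(f"write: about {write_size:.0f} bytes per I/O")
```

With these inputs the implied average comes out near 7 KB per read and near 4 KB per write.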
