To Cache or not to Cache, that is the question.
Well, should you cache for your Ceph® cluster? The answer is: it depends.
You can use high-end enterprise NVMe™ drives, such as the Micron® 9200 MAX, and not have to worry about getting the most performance from your Ceph cluster. But what if you would like to gain more performance in a system that is made up mostly of SATA drives? If this is the case, there are benefits to adding a couple of faster drives to your Ceph OSD servers for storing your BlueStore database and write-ahead log.
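As a sketch of what this looks like in practice, BlueStore's database and write-ahead log can be redirected to a faster device when an OSD is created with `ceph-volume`. The device paths below are placeholders, not the lab configuration; adjust them for your hardware.

```shell
# Hypothetical layout: /dev/sda is a SATA data drive;
# /dev/nvme0n1p1 and /dev/nvme0n1p2 are partitions on an NVMe cache drive.
# Place the BlueStore database and write-ahead log on the NVMe partitions:
ceph-volume lvm prepare \
    --bluestore \
    --data /dev/sda \
    --block.db /dev/nvme0n1p1 \
    --block.wal /dev/nvme0n1p2
```

If `--block.wal` is omitted, the WAL is kept alongside the database; if neither option is given, both live on the data device.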
Micron developed and tested the popular Accelerated Ceph Storage Solution, which leverages servers with Red Hat Ceph Storage running on Red Hat Enterprise Linux. I will go through a few workload scenarios and show you where caching can help, based on actual results from our solution testing lab.
Testing was done using a Ceph cluster of four OSD nodes, each with the following configuration:
| Component | Configuration |
| --- | --- |
| CPU | Single-socket AMD EPYC 7551P |
| Memory | 256GB DDR4 @ 2666MHz (8x 32GB) |
| Data drives (SATA) | Micron 5210 ION 3.84TB (x12) |
| Cache drives (NVMe) | Micron 9200 MAX 1.6TB (x2) |
| Operating system | Red Hat® Enterprise Linux 7.6 |
| Storage software | Red Hat Ceph Storage 3.2 |
| OSD layout | OSDs per SATA drive |
| Test dataset | 50 RBDs @ 150GB each, 2x replication |
Table 1: Ceph OSD Server Configuration
4KiB Random Block Testing
For 4KiB random writes using FIO (the Flexible I/O tester), you can see that utilizing caching drives greatly increases your performance while keeping your tail latency low, even at high load. With 40 instances of FIO, performance is 71% higher (190K vs. 111K IOPS) and tail latency is 82% lower (119ms vs. 665ms).
Figure 1: 4KiB Random Write Performance and Tail Latency
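A minimal FIO invocation along these lines exercises the same 4KiB random-write pattern against an RBD image. The pool name, image name, queue depth, and runtime here are illustrative, not the exact lab settings.

```shell
# 4KiB random writes against an RBD image via FIO's rbd engine.
# Pool/image names and runtime are placeholders, not the lab configuration.
fio --name=4k-randwrite \
    --ioengine=rbd --pool=rbd --rbdname=test-image \
    --rw=randwrite --bs=4k --iodepth=32 \
    --direct=1 --runtime=300 --time_based
```

Scaling the number of concurrent FIO instances (and the images they target) is what drives the cluster to the load levels shown in the charts.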
There is some performance gain during 4KiB random read testing, but it is much less pronounced. This is expected: during a read test, the write-ahead log is not used and the BlueStore database changes little, if at all.
Figure 2: 4KiB Random Read Performance and Tail Latency
A mixed workload (70% read/30% write) also shows the benefits of having caching devices in your system. Performance gains range from 30% at a queue depth of 64 to 162% at a queue depth of 6.
Figure 3: 4KiB Random 70% Read/30% Write Performance and Tail Latency
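The mixed workload can be sketched with the same FIO approach, switching to a random read/write mix. Again, the pool and image names are placeholders.

```shell
# 70% read / 30% write 4KiB random mix via FIO's rbd engine.
# Names and runtime are illustrative, not the exact lab settings.
fio --name=4k-mixed \
    --ioengine=rbd --pool=rbd --rbdname=test-image \
    --rw=randrw --rwmixread=70 --bs=4k --iodepth=6 \
    --direct=1 --runtime=300 --time_based
```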
4MiB Object Testing
When running the rados bench command with 4MiB objects, there is some performance gain with caching devices, but it's not as dramatic as with the small block workloads. Since the write-ahead log is small and the objects are large, adding caching devices has much less impact on performance. With 10 instances of rados bench, throughput is 9% higher with caching than without (4.94 GiB/s vs. 4.53 GiB/s), while average latency is 9% lower (126ms vs. 138ms).
Figure 4: 4MiB Object Write Performance
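A rados bench write run with 4MiB objects looks roughly like this (the pool name and duration are placeholders). Keeping the written objects around with `--no-cleanup` allows a follow-up read pass over the same data.

```shell
# Write 4MiB objects for 60 seconds; keep them for the read test.
rados bench -p testpool 60 write -b 4M --no-cleanup
# Sequential read pass over the objects written above.
rados bench -p testpool 60 seq
# Remove the benchmark objects afterwards.
rados cleanup -p testpool
```

Running multiple instances of rados bench in parallel, as in the lab testing, pushes the cluster to its aggregate throughput limit.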
With reads, we again see that there is negligible performance gain across the board.
Figure 5: 4MiB Object Read Performance
As you can see, if your workload is almost all reads, you won't gain much, if anything, from adding caching devices to your Ceph cluster for BlueStore database and write-ahead log storage. Writes, however, are a completely different story. While large-object workloads see some gain, caching devices really shine with small block writes and mixed workloads. For the small investment of adding a couple of Micron 9200 NVMe performance drives to your system, you can get the most out of your Ceph cluster.
What sorts of results are you getting with your open source storage? Learn more at Micron Accelerated Ceph Storage.