Happy 2016 to all! I wish you all the best for the year. After enjoying some time off with my family over the holidays, I spent part of the ‘short week’ between Christmas and New Year’s Day in the trusty home lab. In my last post, I described my findings running Jetstress using the Micron P420m PCI-E SSD as a VMFS datastore. This time, I ran a similar set of experiments using VSAN 6.1 instead of VMFS. In particular, I compared Jetstress in a hybrid VSAN configuration to an all-flash configuration. Here’s a handy link for more information about Jetstress and how to download it.
My lab setup is outlined in part 1 of my 2015 blog series and described further in part 2. My VSAN setup was similar, except that instead of a single server I used three identical servers to form the VSAN cluster, adding a pair of Seagate ST3750640NS SATA HDDs and a pair of Micron M600 SATA SSDs to each server. The first results I’ll show here are from running Jetstress on an all-flash VSAN configuration; after that, I’ll show the same workload running on a hybrid VSAN configuration on the same cluster.
The purpose of my exercise here is not to knock the competition, but to establish to what degree an all-flash VSAN configuration is superior to a hybrid VSAN for enterprise workloads such as Exchange, as simulated by Jetstress. Even though an all-flash VSAN can be 5% more expensive than a hybrid VSAN on a $/GB basis – as Micron showed at VMworld 2015 (follow this post) – the workload advantage gained far outweighs the slight additional cost. After all, it’s really price/performance that counts in enterprise applications, not raw price or raw performance alone.
My all-flash VSAN configuration was one P420m and two M600 SSDs per server, while my hybrid VSAN was one P420m and two ST3750640NS HDDs per server. The P420m was the cache tier while the other drives formed the capacity tier. I used all the VSAN defaults, such as stripe width (1) and number of failures to tolerate (1).
Note, as with my other tests, the adapter queue depth for the P420m was 255, and the P420m was fully preconditioned (24 hours’ worth) to ensure steady-state behavior. Also, as in previous tests, I issued the following command to set the device queue depth to 255:
esxcli storage core device set -m 255 -O 255 -d <P420m device name>
As before, I ran a very straightforward test: four databases with four logs (the four databases resided on one drive letter and the four logs on another). The entire virtual Windows Server 2008 R2 system resided on the VSAN datastore. This means every I/O the OS generated, whether on behalf of itself or on behalf of Jetstress, was handled by VSAN. The test consumed ~700GB of the 1TB usable (2TB raw, since I ran FTT=1 and stripe width 1) on the VSAN datastore and ran for 2 hours. I ran several session counts to determine a performance profile and a maximum session count. For you sharp-eyed readers, this test consumed nearly six times the capacity of my prior tests using a single P420m. I wanted to see the effect of running a larger database.
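The capacity figures above follow from simple arithmetic. Here is a minimal sketch, assuming usable capacity is just raw capacity divided by the number of replicas (failures to tolerate + 1) and ignoring any VSAN metadata or witness overhead. The function name is mine for illustration, not anything from VSAN itself.

```python
def usable_capacity_gb(raw_gb: float, failures_to_tolerate: int) -> float:
    """Approximate usable capacity under mirroring.

    Assumption: each object is stored (failures_to_tolerate + 1) times,
    and metadata/witness overhead is ignored.
    """
    return raw_gb / (failures_to_tolerate + 1)


raw_gb = 2000.0   # ~2TB raw, from the text
used_gb = 700.0   # ~700GB consumed by the test, from the text

usable_gb = usable_capacity_gb(raw_gb, failures_to_tolerate=1)
utilization = used_gb / usable_gb

print(f"usable: {usable_gb:.0f} GB, utilization: {utilization:.0%}")
# prints: usable: 1000 GB, utilization: 70%
```

In other words, the test filled roughly 70% of the usable datastore.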
The action profile I used was the same as before - the Jetstress default - which is 40% insert, 20% delete, 5% replace, 35% read for the transactions with 70% lazy commit (implying 30% non-lazy commit), background database maintenance running, and one copy per database.
Here’s a table showing my findings as Jetstress executed on the all-flash VSAN to find the optimal number of sessions at a target transactional IOPS:
[Table: Jetstress results on the all-flash VSAN; surviving column header: Average Latency (ms)]
I decided to stop at 50 sessions, since the Jetstress limit for average database read latency is 20 ms. At that point – 50 sessions – Jetstress recorded, per database (remember, there are four databases in play), 446 database reads/sec @ 33KB average size; 317 database writes/sec @ 34KB; 0.265 log reads/sec @ 4KB; and 87 log writes/sec @ 7.4KB, for a combined transactional IOPS workload of 3,044. One of the interesting things about all-flash VSAN is that it uses the cache tier (the P420m) only as a write cache; it satisfies reads directly from the capacity tier. This is important to remember when comparing results to the hybrid VSAN below.
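As a sanity check on the combined figure, here is a small sketch that rebuilds it from the per-database numbers. It assumes that transactional IOPS counts database reads and writes only, with log I/O excluded, which is my understanding of how Jetstress reports it. The function name is hypothetical.

```python
def transactional_iops(db_reads_per_sec, db_writes_per_sec, databases):
    """Sum database reads and writes across all databases.

    Assumption: log reads/writes are not counted as transactional I/O.
    """
    return (db_reads_per_sec + db_writes_per_sec) * databases


# Per-database figures at 50 sessions, from the text.
computed = transactional_iops(446, 317, databases=4)
reported = 3044

print(computed)  # prints: 3052
print(f"difference from reported: {abs(computed - reported) / reported:.1%}")
```

The small gap between the computed and reported values is consistent with the per-database figures having been rounded.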
That trial completed, I used vCenter to tear down the all-flash VSAN, and constructed a new hybrid VSAN. Note, once VSAN is configured for a particular scheme (either all-flash or hybrid) one cannot just remove disks and add disks of the other type – in order to change from one type to the other, you must tear down the VSAN and disable it before the hypervisor will allow you to construct a VSAN of a different type. Word to the wise!
Once my hybrid VSAN was constructed – using exactly the same parameters as the all-flash VSAN – I ran the exact same workload, with the VM running in its entirety on the hybrid VSAN datastore, including the ~700GB of capacity used by Jetstress for the four databases and logs.
Here’s the table showing my findings as Jetstress executed on the hybrid VSAN to find the optimal number of sessions at a target IOPS. Compare the findings to the previous table.
[Table: Jetstress results on the hybrid VSAN; surviving column header: Average Latency (ms)]
Yes, indeed, quite a difference. Jetstress could only run 3 sessions under the 20 ms average database read latency limit. At 4 sessions, Jetstress failed the test. For the successful 3-session run, it recorded, per database, 37 database reads/sec @ 35KB average size; 25 database writes/sec @ 36KB; 0.04 log reads/sec @ 4KB; and 19 log writes/sec @ 5.7KB.
The findings, when compared, show the enormous difference between running complete workloads (OS and application together in a VM) on an all-flash VSAN and on a hybrid VSAN at constant latency. To the individual using their mailbox, response time would be similar either way, but the all-flash VSAN can sustain more than 13 times the workload – 40 sessions versus 3! Much more efficient, much more “bang for the buck.” This gives you an idea of the power of an all-flash VSAN for running efficient, virtualized enterprise workloads.
To summarize, at a given constant transactional latency – in this case, 13 ms – the all-flash VSAN executed Jetstress with 40 sessions @ 2,930 IOPS while the hybrid VSAN only sustained 3 sessions @ 251 IOPS. This is the effect of performing random 32KB I/Os on HDD, which is where many I/Os on the hybrid VSAN ended up, for two reasons:
- Because only 70% of the cache device is used as a read cache, chances are roughly 3 in 10 that a random 32KB read from the database – which is larger than the read cache area, 700+ GB versus 490 GB – will miss in cache. Since all such missed reads must be satisfied from HDD, the time (latency) taken by the 3 HDD reads far exceeds that of the 7 cache SSD reads, and therefore dominates the latency profile. Note, this is a variant of Amdahl’s law, which concerns parallel versus serial execution and shows that the time needed for a workload is dominated by serial execution time unless the workload is extremely parallel. Here, the HDD represents the serial part of the execution stream.
- Since the remaining 30% of the cache device (210GB) is used for writes, the random 32KB writes coming from Jetstress into the 700GB+ database will quickly fill the write cache area because of the relatively long time it takes to ‘drain’ the write cache down to the capacity HDD layer. So the caching device is, after the workload starts and becomes steady-state, always busy both trying to satisfy reads and also flush writes down to HDD. As a result, the workload quickly becomes dominated by the (lack of) response time from the HDD, especially since they are performing both reads and writes, the classic “I/O blender” effect. The HDD became saturated at roughly 80 IOPS.
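The two reasons above can be put into a back-of-the-envelope model. This sketch assumes uniformly random reads across a ~700GB working set, a 700GB cache device split 70/30 as described, and two illustrative device latencies (0.2 ms for a cache SSD hit, 15 ms for an HDD miss). Those latencies are placeholders I picked to show the shape of the effect, not measurements from this test, and the model ignores queueing and write-buffer destaging entirely.

```python
def cache_split_gb(device_gb, read_percent=70):
    """Split a hybrid cache device into (read cache, write buffer) sizes in GB."""
    read_gb = device_gb * read_percent / 100
    return read_gb, device_gb - read_gb


def miss_ratio(working_set_gb, read_cache_gb):
    """Chance a uniformly random read misses a cache smaller than the working set."""
    if working_set_gb <= read_cache_gb:
        return 0.0
    return 1.0 - read_cache_gb / working_set_gb


def average_read_latency_ms(miss, hit_ms, miss_ms):
    """Weighted average service time across cache hits and HDD misses."""
    return (1.0 - miss) * hit_ms + miss * miss_ms


# Sizes from the text: 700GB cache device, ~700GB database working set.
read_gb, write_gb = cache_split_gb(700)
miss = miss_ratio(700, read_gb)

# ASSUMED latencies, for illustration only (not measured in this test).
SSD_HIT_MS = 0.2
HDD_MISS_MS = 15.0

avg_ms = average_read_latency_ms(miss, SSD_HIT_MS, HDD_MISS_MS)
hdd_share = miss * HDD_MISS_MS / avg_ms  # fraction of read time spent on HDD

print(f"read cache {read_gb:.0f} GB, write buffer {write_gb:.0f} GB")
print(f"miss ratio {miss:.0%}, average read {avg_ms:.2f} ms, HDD share {hdd_share:.0%}")
```

Even with only about 3 in 10 reads missing, nearly all of the average read time in this toy model comes from the HDD, which is the Amdahl-style point made above.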
Finally, as I mentioned above, using an all-flash VSAN, all reads are satisfied from the capacity SSD – and as you can see above, that capacity SSD provided 12 times the IOPS at the same latency.
For 5% more dollars.
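To put a number on the price/performance argument, here is a rough sketch using the IOPS figures quoted above. It assumes the 5% $/GB premium carries over directly to total cost for the same capacity, which is a simplification, and the function name is just for illustration.

```python
def relative_price_performance(iops_a, iops_b, cost_ratio_a_to_b):
    """IOPS per dollar of configuration A relative to configuration B."""
    return (iops_a / iops_b) / cost_ratio_a_to_b


ALL_FLASH_IOPS = 2930   # from the text, 40 sessions at 13 ms
HYBRID_IOPS = 251       # from the text, 3 sessions at 13 ms
COST_RATIO = 1.05       # ASSUMED: all-flash costs 5% more for the same capacity

iops_ratio = ALL_FLASH_IOPS / HYBRID_IOPS
ppr = relative_price_performance(ALL_FLASH_IOPS, HYBRID_IOPS, COST_RATIO)

print(f"IOPS ratio: {iops_ratio:.1f}x, IOPS per dollar: {ppr:.1f}x")
# prints: IOPS ratio: 11.7x, IOPS per dollar: 11.1x
```

Under those assumptions, the all-flash configuration delivers roughly 11 times the IOPS per dollar at the same latency.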
As they say, choose wisely!
Let us know what you think. Leave a comment below or send us a tweet @MicronStorage or me directly @peglarr.