
Millions of IOPS from a networked file system

Ryan Meredith | June 2020

Millions of IOPS from a networked file system using Weka™

How do we share the performance of modern NVMe™ drives across a data center? There are many competing answers to that question and many companies with exciting solutions.

The right solution depends on the needs of the application in use. The simplest way to share data across servers and applications is to load it on a remote file system. File-based solutions such as NFS and Windows SMB are ubiquitous because of their ease of use, but they aren’t known to be particularly performant. What if you need remote file-system storage and ultra-high performance? Many high-value, file-system-based workloads, like artificial intelligence training and inference, can benefit from high-performance file storage. That’s where Weka™ comes in.

My team at the Micron Solutions Engineering Lab recently completed a proof of concept using Weka to share a pool of our mainstream Micron 7300 PRO NVMe SSDs. The results, millions of IOPS from a networked file system, are exciting to consider.

Test Configuration

Weka has built a high-performance, parallel, shared file system called WekaFS. According to Weka, a deployment typically runs with a minimum of eight nodes in production environments, but it can run with as few as six nodes if the user does not need virtual spare capacity for a node rebuild.

Our testing uses six nodes in a 4 + 2 (data + parity) erasure-coding configuration for data protection. Weka supports N + 2 and N + 4 erasure-coding configurations, with either two or four nodes' worth of parity providing data protection. As the number of data-focused nodes increases, two things happen: write performance increases and the probability of data exposure decreases. (See Weka's data protection white paper for details.)
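As a rough illustration of the capacity trade-off (our own back-of-the-envelope arithmetic, not a Weka sizing tool), the sketch below computes the usable-to-raw ratio for a few N + 2 layouts, ignoring file-system metadata, virtual spare capacity, and any other overhead:

```python
# Back-of-the-envelope usable capacity for an N+2 erasure-coded layout.
# Assumption (ours): usable fraction = data / (data + parity); metadata and
# virtual spare capacity are ignored.

def usable_fraction(data_stripes: int, parity_stripes: int = 2) -> float:
    return data_stripes / (data_stripes + parity_stripes)

raw_tb = 36 * 7.68  # 6 nodes x 6 drives x 7.68TB in our test bed

for data in (4, 6, 8, 16):
    frac = usable_fraction(data)
    print(f"{data}+2: {frac:.0%} usable -> ~{raw_tb * frac:.0f} of {raw_tb:.0f} TB raw")
```

For the 4 + 2 layout used here, roughly two-thirds of the raw capacity is usable before file-system overhead.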

Here is the configuration we use in our testing (Figure 1):

  • 6x Dell™ R740xd 2U with 2x Intel Xeon Gold 6142 processors (16 cores, 2.60GHz)
  • 1x 100 GbE Mellanox™ ConnectX™-5 NIC per server
  • 6x 7300 PRO 7.68TB SSDs per server (36 drives total)
  • 9x FIO load generators, each with a 100 GbE NIC
  • Cumulus™ Linux™ 100 GbE switch (jumbo frames enabled)
  • WekaFS version 3.6.1
  • CentOS™ 7.6.1810 (kernel 3.10.0-957.el7.x86_64)
Figure 1: Test infrastructure overview (100 GbE network switch connecting Weka storage servers and FIO workload generators)

We use the Micron 7300 PRO 7.68TB SSD for this test due to its high capacity and compelling performance (Table 1).

Installation and configuration of Weka is straightforward, enabling us to provision this system quickly. For those who need help during installation, Weka has a great support team.

Table 1: Micron 7300 PRO 7.68TB Performance Characteristics
  • 4KB random read: 520K IOPS
  • 4KB random write: 85K IOPS
  • 4KB random 70/30 read/write: 190K IOPS
  • 128KB sequential read: 3.0 GB/s
  • 128KB sequential write: 1.8 GB/s

Test Methodology

To evaluate the performance of the Weka solution built on Micron 7300 PRO SSDs, we follow a traditional “four corners” testing strategy: small-block workloads at 100% read, 100% write, and a 70% read/30% write mix to measure I/O operations per second, plus large-block 100% read and 100% write workloads to measure throughput.
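For reference, that test plan can be summarized as the following workload matrix (a simple sketch in Python; the block sizes and mixes come from the plan above, and the rw/bs values map onto standard fio options):

```python
# The "four corners" workload matrix used in this proof of concept,
# plus the 70/30 mixed workload. Queue depth is swept separately for
# each workload until performance stops scaling.

WORKLOADS = [
    {"name": "4k_random_read",  "rw": "randread",  "bs": "4k",   "metric": "IOPS"},
    {"name": "4k_random_write", "rw": "randwrite", "bs": "4k",   "metric": "IOPS"},
    {"name": "4k_mixed_70_30",  "rw": "randrw",    "bs": "4k",   "metric": "IOPS", "rwmixread": 70},
    {"name": "128k_seq_read",   "rw": "read",      "bs": "128k", "metric": "GB/s"},
    {"name": "128k_seq_write",  "rw": "write",     "bs": "128k", "metric": "GB/s"},
]

QUEUE_DEPTHS = [1, 4, 8, 16, 32]  # queue depths referenced in the results below
```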

We conduct all tests using nine client nodes (each node running eight fio execution jobs), with each client targeting separate file folders on the shared file system. For each workload, we increase queue depth (QD) until we determine maximum performance levels. We configure Weka to use 19 CPU cores, with six of those cores dedicated to managing I/O to the six data drives in each Weka data node.
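The sketch below illustrates how one of the nine clients might launch its share of the load with fio; the mount point, working-set size, and runtime are hypothetical placeholders, while numjobs=8 and the per-client directory reflect the setup described above:

```python
import subprocess

# Hypothetical single-client fio invocation. "/mnt/weka/client01" stands in
# for this client's private folder on the shared WekaFS mount; size and
# runtime are illustrative placeholders, not our actual test values.
def run_fio(rw: str, bs: str, iodepth: int, directory: str = "/mnt/weka/client01"):
    cmd = [
        "fio",
        "--name=weka_poc",
        f"--rw={rw}",            # randread, randwrite, randrw, read, or write
        f"--bs={bs}",            # 4k or 128k
        f"--iodepth={iodepth}",  # queue depth, swept per workload
        "--numjobs=8",           # eight fio jobs per client, as in our tests
        "--ioengine=libaio",
        "--direct=1",
        f"--directory={directory}",
        "--size=64G",            # placeholder working-set size per job
        "--runtime=300",
        "--time_based",
        "--group_reporting",
        "--output-format=json",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True)

# Example: 4KB random read at QD16 from this client.
# run_fio("randread", "4k", 16)
```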

Performance Results

Our first three tests use a 4KB block size, with performance measured in input/output operations per second (IOPS). We also report average latency in microseconds (µs).

Figure 2: Small-block, 100% random read performance results

Using 100% 4KB random reads, we see a consistent increase in performance as queue depth increases. Performance peaks at QD32 at over 4.6 million IOPS, while average latency rises to 487 µs, a 63% increase over QD16 (Figure 2).
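As a quick sanity check (our own arithmetic, not part of Weka's or fio's reporting), Little's Law ties these numbers together: outstanding I/Os ≈ IOPS × average latency. With 9 clients × 8 jobs × QD32 outstanding I/Os and the 487 µs average latency above, the predicted IOPS land close to what we measured:

```python
# Little's Law sanity check: IOPS ~= outstanding I/Os / average latency.
clients, jobs_per_client, queue_depth = 9, 8, 32
avg_latency_s = 487e-6  # 487 microseconds at QD32 (Figure 2)

outstanding_ios = clients * jobs_per_client * queue_depth  # 2,304
predicted_iops = outstanding_ios / avg_latency_s

print(f"Predicted ~{predicted_iops / 1e6:.1f}M IOPS vs. ~4.6M IOPS measured")
```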

Figure 3: Small-block, 100% random write performance results

Using 100% 4KB random writes, performance rises quickly from QD1 to QD4, reaching around 626,000 IOPS. At higher queue depths, IOPS level off, peaking at 696,000 IOPS while average latency climbs from 830 µs at QD8 to 1.6 ms at QD16 (Figure 3).

In our experience, achieving submillisecond latencies for 4KB random writes to a remote file system at this level of performance is impressive. It is important to note that write performance is heavily influenced by the number of deployed data nodes. Using more nodes increases overall write performance.

Figure 4: Small-block, 70% read/30% write performance results

Finally, for IOPS performance, we test a 4KB 70% read/30% write workload. I/O performance peaks at over 1.6 million IOPS at QD16, with an average latency of 467 µs for reads and 3.6 ms for writes (Figure 4).

Our next series of tests focuses on large-block (128KB) sequential workloads. Large-block I/O testing simulates use cases such as video streaming, database decision support systems, and big data analytics. This type of test measures data throughput in gigabytes per second (GB/s).

Figure 5: Large-block, 100% sequential read performance results

First, we test 128KB sequential 100% reads across a range of queue depths. Our maximum performance is reached at QD16, hitting 62 GB/s at 2.3 ms average latency (Figure 5).
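For context, throughput and IOPS are two views of the same result at a fixed block size: throughput = IOPS × block size. A minimal conversion sketch (using decimal gigabytes and treating 128KB as 128 × 1,024 bytes):

```python
# Convert between IOPS and throughput for 128KB I/Os.
BLOCK_BYTES = 128 * 1024  # 128KB transfer size

def gb_per_s(iops: float) -> float:
    return iops * BLOCK_BYTES / 1e9  # decimal GB/s

def iops(gbps: float) -> float:
    return gbps * 1e9 / BLOCK_BYTES

# 62 GB/s of 128KB sequential reads corresponds to roughly 473,000 IOPS.
print(f"{iops(62):,.0f} IOPS")
```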

Figure 6: Large-block, 100% sequential write performance results

Our testing shows that 100% 128KB sequential write workloads also reach maximum throughput at QD16. As with the 128KB sequential read workload, peak write performance at QD16 comes with higher latency relative to QD8, in this case approximately 86% higher (Figure 6).

Conclusion

Our testing shows that it’s possible to achieve impressive performance using Weka. Micron NVMe SSDs, such as the Micron 7300, can reach high levels of performance in an easy-to-manage file-system solution. Generating millions of IOPS and GB/s of throughput from a software-defined solution that also provides data protection, Weka pushes the boundaries of high-performance file storage.

Micron’s 7300 SSD with NVMe offers the performance you expect from NVMe with the cost profile and power consumption typically seen in SATA solutions. These factors make the 7300 a go-to drive for broad deployment scenarios such as the file-based storage infrastructure offered by Weka.

More Info

To learn more about Weka’s distributed file system, download the WekaFS datasheet.

To learn more about Micron NVMe SSDs like the Micron 7300, visit the data center SSD page on micron.com.

Also, stay up to date on future discussions about using SSDs in data center solutions, like those offered by Weka, by following us on Twitter and connecting with us on LinkedIn.

Ryan Meredith

Director, Storage Solutions Architecture

Ryan Meredith is director of Data Center Workload Engineering for Micron's Storage Business Unit, testing new technologies to help build Micron's thought leadership and awareness in fields like AI and NVMe-oF/TCP, along with all-flash software-defined storage technologies.