AMD has released its EPYC™ 7FX2 line of CPUs, which combine high clock frequencies and large caches with 8, 16, or 24 cores. Our team in the Micron Austin performance lab tested the performance of these new CPUs using Microsoft SQL Server 2019 with Micron’s new mainstream data center NVMe drive, the 7300.
Enterprise applications like SQL Server can stress the beefiest systems, so why use a CPU with fewer cores when the EPYC family offers higher-core options? Efficiency is the key — either performance per core or performance per watt can determine the best total cost of ownership (TCO) of a hardware platform.
For this technical blog, we compared two of the three new CPUs, the AMD EPYC 7F32 and 7F52. We tested the 7F32 and 7F52 because of their high L3 cache per core. We expected the 24-core 7F72 to perform similarly but did not have time to test it before launch. Table 1 includes the specs of the three CPUs.
Table 1: AMD EPYC new processor offerings
|Model||Core Count||TDP Target||L3 Cache||L3 Cache/Core||Base Freq||Max Boost|
We also wanted to see how Micron’s mainstream NVMe SSD, the 7300, could enable great performance in a test that would punish all the components in a system (Table 2).
Table 2: Micron 7300 performance characteristics
|Model||Capacity||4KB Random Read IOPs||4KB Random Write IOPs||4KB Random 70/30 IOPs||128k Sequential Read||128k Sequential Write|
By performing these initial tests, we wanted to provide some insights into whether the 8-core or 16-core option worked best. The decision comes down to what’s important to the user and the workload that the user is running.
How We Tested
We used a Dell PowerEdge 7515 server for our testing. We tuned the BIOS by setting NUMA nodes per socket (NPS) to 4, as recommended by AMD. We used the current production BIOS with AGESA (AMD Generic Encapsulated Software Architecture) version 18.104.22.168. All other BIOS settings were left at their defaults. Table 3 summarizes the server configurations for both the database server (the system under test) and the load generation server illustrated in Figure 1.
Table 3: Test configuration
|Component||Database Server (System Under Test)||Load Generation Server|
|Processor(s)||1x AMD EPYC 7F32/52||2x Intel Xeon Platinum 8168|
|Storage||4x Micron 3.84TB 7300 PRO NVMe SSD (LVM RAID 10)|
|Network||25Gbps LOM||25Gbps LOM|
|Operating System||CentOS 8.1||CentOS 7.7|
|Application||Microsoft SQL Server Linux 2019||Py-TPCC|
Workload and Dataset
Our test workload was a custom, internally developed benchmarking application called Py-TPCC, written in Python. The implementation was very similar to HammerDB and provided comparable, though not identical, performance. It was based on the Transaction Processing Performance Council’s (TPC) online transaction processing (OLTP) benchmark specification, TPC-C, with a few modifications to better load the entire system and ensure that the entire dataset was accessed during the testing period. To measure performance, we recorded the number of TPC-C transactions (stored procedures) completed per minute, referred to simply as TPM.
We created a 1TB dataset to ensure the target database didn’t fit in memory. Consequently, the 2-to-1 dataset-to-memory ratio for this configuration resulted in a write-intensive workload to disk.
Below is a simplified outline of how tests were executed and measured:
- Restored dataset, replacing any existing database
- Applied load
- Ramped up to get to steady state: 20 minutes
- Began test measurement period
- Continued applied load: 30 minutes
- Stopped test
We repeated this process on both test systems, steadily increasing the load applied, until a predefined stop condition was met. In this testing, we stopped increasing load once the resulting TPM reached a performance plateau.
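The load-stepping procedure above can be sketched in a few lines of Python. This is an illustrative outline only, not the Py-TPCC code: the function name `sweep_until_plateau`, the `run_level` callback, the client counts and the 2% gain threshold are all assumptions, since the exact stop condition is not given here. The two duration constants come from the outline above.

```python
from typing import Callable, List, Tuple

RAMP_UP_MINUTES = 20   # from the outline: time allowed to reach steady state
MEASURE_MINUTES = 30   # from the outline: measurement period under load


def sweep_until_plateau(
    run_level: Callable[[int], float],
    start_clients: int = 8,    # assumed starting load
    step: int = 8,             # assumed load increment
    min_gain: float = 0.02,    # assumed: under 2% gain counts as a plateau
    max_clients: int = 512,    # assumed safety cap
) -> List[Tuple[int, float]]:
    """Raise the load level until measured TPM stops improving.

    run_level(clients) is a hypothetical callback that is expected to
    restore the dataset, apply load for RAMP_UP_MINUTES, measure for
    MEASURE_MINUTES, and return the TPM seen in the measurement period.
    """
    results: List[Tuple[int, float]] = []
    best_tpm = 0.0
    clients = start_clients
    while clients <= max_clients:
        tpm = run_level(clients)
        results.append((clients, tpm))
        # Stop once this level fails to beat the best result by min_gain.
        if best_tpm > 0 and tpm < best_tpm * (1 + min_gain):
            break
        best_tpm = max(best_tpm, tpm)
        clients += step
    return results
```

Passing the test run in as a callback keeps the stop logic separate from the database and load-generator details, so the same loop can be exercised with a stand-in function before it is pointed at real hardware.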
Predictably, the 16-core 7F52 CPU supported higher TPC-C TPM than the 8-core 7F32 CPU (Figure 2). Looking at transaction response times, both CPUs delivered low average response times (Figure 3) and low 99.9th-percentile response times (Figure 4), with the 16-core CPU showing lower latency than the 8-core CPU.
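For readers who want to compute the same two latency figures from their own runs, here is a minimal sketch. It assumes per-transaction response times are available as a list of milliseconds and uses the nearest-rank percentile method; the method and the names `percentile` and `summarize` are our assumptions for illustration, not a description of how Py-TPCC reports latency.

```python
import math
from statistics import mean
from typing import Dict, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)   # 1-based rank
    return ordered[max(rank, 1) - 1]


def summarize(latencies_ms: Sequence[float]) -> Dict[str, float]:
    """Return the average and 99.9th-percentile response time in milliseconds."""
    return {
        "avg_ms": mean(latencies_ms),
        "p99_9_ms": percentile(latencies_ms, 99.9),
    }
```

The 99.9th percentile is worth tracking alongside the average because a small number of slow transactions can hide behind a healthy mean.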
These results and quick response times would be impossible if not for the Micron 7300 PRO NVMe SSD. Microsoft SQL Server has a challenging I/O profile that mixes 64KB reads and writes with smaller 4KB and 8KB I/O. In testing, the logical volume manager (LVM) volume consisting of 4x 3.84TB NVMe SSDs was able to keep the CPUs busy while introducing minimal latency (Figure 5).
TPM Performance per Core
Looking at application performance efficiency, we saw that the 8-core (7F32) clocked a higher TPM per core than the 16-core (7F52) (Figure 6).
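The per-core metric is simply total TPM divided by core count. The short sketch below shows how a CPU can lead on total TPM while trailing on TPM per core. The TPM values are made-up placeholders chosen for easy arithmetic, not our measured results, and `tpm_per_core` is a hypothetical helper name.

```python
def tpm_per_core(total_tpm: float, cores: int) -> float:
    """Normalize throughput by physical core count."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return total_tpm / cores


# Placeholder throughput values for illustration only (NOT measured data).
example_runs = {
    "8-core": {"cores": 8, "tpm": 100_000.0},
    "16-core": {"cores": 16, "tpm": 160_000.0},
}

per_core = {
    name: tpm_per_core(run["tpm"], run["cores"])
    for name, run in example_runs.items()
}
# With these placeholders the 16-core part has the higher total TPM,
# while the 8-core part has the higher TPM per core.
```

This matters most where software is licensed per core, because the per-core figure then maps directly onto cost per unit of throughput.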
Power Utilization and Efficiency
The 16-core (7F52) had a higher overall system power draw, which made sense given the CPU’s higher thermal design power (TDP) and the fact that it was processing more transactions than the 8-core (Figure 7).
Measuring the power consumed per Py-TPCC transaction, we saw that the 16-core was more power-efficient per operation than the 8-core (Figure 8).
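One way to express this efficiency metric is energy per transaction: average system power over the measurement period, converted to joules per minute, divided by TPM. The sketch below assumes that definition. The exact formula behind Figure 8 is not spelled out here, so treat `joules_per_transaction` and the example inputs as hypothetical.

```python
def joules_per_transaction(avg_system_watts: float, tpm: float) -> float:
    """Energy per transaction, assuming steady power over the measured period.

    One watt sustained for one minute is 60 joules, so energy per minute is
    avg_system_watts * 60, and dividing by transactions per minute gives
    joules per transaction.
    """
    if tpm <= 0:
        raise ValueError("tpm must be positive")
    return avg_system_watts * 60.0 / tpm


# Placeholder inputs for illustration only (NOT measured data):
# a system averaging 300 W while completing 120,000 transactions per minute.
example = joules_per_transaction(300.0, 120_000.0)
```

Under this definition a system that draws more power can still be more efficient, provided its throughput rises faster than its power draw.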
Microsoft SQL Server 2019 can demand very high system resources. Both the AMD 8-core (7F32) and 16-core (7F52) CPUs can fulfill enterprise demands for SQL Server performance due to their high clock speeds and large L3 cache per core. In a high-transaction environment such as an OLTP solution, where maximizing transactions completed per minute and overall power efficiency are the most important success criteria, the 16-core (7F52) CPU is a great fit. If maximum performance per core is the primary success criterion, then the 8-core (7F32) CPU may be a better fit.
With either configuration, having fast, cost-effective storage is key. Micron’s 7300 PRO data center NVMe is the perfect fit for enterprise use cases like this, where a balance of cost, efficiency and performance will guide architecture decisions.
Would You Like to Know More?
Through Micron Accelerated Solutions, we also offer a wide variety of workload-optimized, ready-to-build solutions for storage-centric enterprise workloads.