Carpet Wear & Pizza Boxes: SSD Performance & Endurance
Micron co-chairs the JEDEC task group that is working to better define SSD standards. We work with others in the industry to define the future standards that will apply to SSD products, with the goal of establishing a shared language for building, testing, and measuring solid state storage products. Because we're so heavily involved in driving these new standards, I always enjoy hearing new claims of SSD performance and endurance. Sometimes I'm impressed … often it's just good to see others catch up…
In order to help folks better understand some of the claims out there, I thought I’d provide a quick overview on managing NAND on an SSD as well as some background on wear-leveling concepts.
First, understand that unlike DRAM and SRAM, which can be read and written one word at a time, hard drives and SSDs are "block addressed" devices. The logical block on a hard drive or SSD is 512 bytes in size. On a hard drive, a given logical block is always at the same physical location on the disk. On an SSD the relationship between physical and logical blocks is less direct because NAND has limited endurance. If an operating system or program continuously writes to a limited number of logical blocks, and those blocks stayed in fixed physical locations, those locations would likely wear out before the rest of the NAND. For that reason, well-designed SSDs continually move logical blocks about the drive to minimize "hot spots" and spread NAND wear.
Firmware running on the SSD controller takes care of these tasks in a process commonly referred to as wear-leveling. Today all of the latest-generation SSDs implement complex wear-leveling schemes in an effort to make efficient use of the available NAND cycles. Think of the NAND memory on an SSD like the carpet in a house. In a house with older carpet you will notice that the carpet is worn out in high-traffic areas, while in other areas, such as under furniture and along the edges of the rooms, it is still new and unworn. If the carpet in this same house were walked on evenly in all areas, it would last much longer. The same concept applies to an SSD: wear-leveling spreads the high-traffic writes evenly across the entire NAND array, giving the SSD a much longer life. Under the umbrella of wear-leveling there are a couple of other concepts you will see in the industry: garbage collection and write amplification.
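To make the carpet picture concrete, here is a minimal Python sketch. It is a toy model under loud assumptions: the function names are hypothetical, every physical location is treated as individually rewritable (real NAND erases in whole blocks), and real firmware uses far more elaborate schemes. It only contrasts a fixed logical-to-physical mapping with one that always writes to the least-worn free location.

```python
# Toy comparison of fixed mapping vs. a naive wear-leveling policy.
# All names are hypothetical; this is not any vendor's algorithm.

def fixed_mapping_wear(num_locations, writes):
    """Hard-drive style: logical block N always lands on physical location N."""
    wear = [0] * num_locations
    for logical in writes:
        wear[logical] += 1
    return wear

def wear_leveled(num_locations, writes):
    """Each write goes to the least-worn free location; a table remembers
    where the latest copy of each logical block lives."""
    wear = [0] * num_locations
    table = {}  # logical block -> physical location of its latest copy
    for logical in writes:
        # Locations holding the current copy of some OTHER logical block
        # cannot be overwritten.
        busy = {loc for lb, loc in table.items() if lb != logical}
        free = [loc for loc in range(num_locations) if loc not in busy]
        target = min(free, key=lambda loc: wear[loc])
        wear[target] += 1
        table[logical] = target
    return wear

# A "hot spot" workload: only logical blocks 0 and 1 are ever written.
writes = [0, 1] * 500
fixed = fixed_mapping_wear(8, writes)
leveled = wear_leveled(8, writes)
print("fixed mapping:", fixed)
print("wear-leveled: ", leveled)
```

With the fixed mapping, two locations absorb all 1,000 writes while the other six stay untouched, like the worn path across the carpet. The leveled run spreads the same writes across all eight locations.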
NAND has some constraints that are not ideal for a storage medium in a device like an SSD. The storage area on a NAND device is broken into units called pages and blocks. A page is typically 4KB in size and a block is a group of pages (64 to 128 pages to a block for today's NAND devices). A page must be erased before data can be written to it, and the smallest unit that can be erased is a block. Once the block is erased, the pages can be written one at a time until the block is filled. Updating data in place would therefore mean reading the block's contents, erasing the whole block, and writing everything back with the change applied, a process referred to as "read-modify-write". Doing this on every single write received from the host is undesirable because it is slow, resulting in poor SSD performance.
To avoid read-modify-write, modern SSDs keep a pool of blocks pre-erased and ready for new data. When data is written to the same logical area repeatedly, it is always written to a new physical area in the NAND. Along with the written data, a table that tells the controller where to locate the latest data is updated, and the old locations are marked invalid. At some point the drive runs out of pre-erased blocks and must reclaim the areas marked invalid by the firmware. To do so, it copies any still-valid pages out of a block and then erases that block. This process of reclaiming blocks is called garbage collection, and SSDs must do it frequently or they will quickly run out of space.
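The write path described above can be sketched in a few lines of Python. This is a toy model, not real firmware: the class name ToyFTL, its tiny geometry (4 blocks of 4 pages), the greedy "most invalid pages" victim choice, and the requirement for spare physical pages are all assumptions made for illustration.

```python
class ToyFTL:
    """Toy model of out-of-place writes plus garbage collection.
    Hypothetical sketch; names and policies are illustrative only."""

    def __init__(self, num_blocks=4, pages_per_block=4, logical_pages=8):
        # Assumption: some physical space is held back so garbage
        # collection always finds at least one invalid page to reclaim.
        if logical_pages >= (num_blocks - 1) * pages_per_block:
            raise ValueError("need spare physical pages for garbage collection")
        self.pages_per_block = pages_per_block
        self.logical_pages = logical_pages
        # A block is the list of pages programmed so far. Each entry is a
        # logical page number, or None once that copy is marked invalid.
        self.blocks = [[] for _ in range(num_blocks)]
        self.active = 0                           # block being filled
        self.erased = list(range(1, num_blocks))  # pool of pre-erased blocks
        self.table = {}                           # logical page -> (block, slot)
        self.host_writes = self.nand_writes = self.erases = 0

    def write(self, logical):
        if not 0 <= logical < self.logical_pages:
            raise ValueError("logical page out of range")
        if len(self.blocks[self.active]) == self.pages_per_block:
            self.active = self.erased.pop(0)      # take a pre-erased block
            if not self.erased:                   # pool ran dry: reclaim one
                self._garbage_collect()
        self.host_writes += 1
        self._program(logical)

    def _program(self, logical):
        old = self.table.get(logical)
        if old is not None:
            self.blocks[old[0]][old[1]] = None    # mark the old copy invalid
        block = self.blocks[self.active]
        block.append(logical)                     # always a new physical page
        self.table[logical] = (self.active, len(block) - 1)
        self.nand_writes += 1

    def _garbage_collect(self):
        candidates = [b for b in range(len(self.blocks)) if b != self.active]
        victim = max(candidates, key=lambda b: self.blocks[b].count(None))
        # Copy still-valid pages out of the victim, then erase the whole block.
        for logical in [lp for lp in self.blocks[victim] if lp is not None]:
            self._program(logical)
        self.blocks[victim] = []
        self.erased.append(victim)
        self.erases += 1

ftl = ToyFTL()
for lp in range(8):        # fill the drive once
    ftl.write(lp)
for i in range(20):        # then rewrite only two "hot" pages
    ftl.write(i % 2)
print(ftl.host_writes, ftl.nand_writes, ftl.erases)
```

In this run the NAND page count ends up higher than the host write count, because garbage collection has to copy still-valid pages before it can erase a block. That copying is the step of combining the leftover slices onto one tray.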
To put this into an everyday example, imagine that you and your friends order two pizzas for dinner. The two pizzas arrive and soon everyone is busy moving slices of pizza onto their own plates. The only problem is there’s no room on the table for the requisite pitchers of beer. So you make the command decision to combine the remaining slices of pizza onto one pizza tray—creating new empty space on the table. Your friends pour their beer and applaud your sheer brilliance…as a garbage collector!
The amount of information written to the NAND is always greater than that written by the user because wear-leveling and the garbage collection process generate some extra NAND writing. Write amplification is a measure of this overhead: the amount of data written to the NAND divided by the amount of data written by the user to the SSD. The objective of the firmware on the SSD is to be as efficient as possible to limit the extra NAND writes.
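As a quick illustration of that ratio, here is a small Python helper. The function name and the byte counts in the example are made up for illustration; they are not measurements from any drive.

```python
def write_amplification(nand_bytes_written, host_bytes_written):
    """Ratio of data written to the NAND to data written by the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return nand_bytes_written / host_bytes_written

# Hypothetical example: the host wrote 100 GB and the NAND absorbed 130 GB.
GB = 10**9
example = write_amplification(130 * GB, 100 * GB)
print(example)
```

A value of exactly one would mean the firmware generated no extra NAND writes at all.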
That said, there is a lot of misinformation about write amplification. All drives have a worst case, which occurs when the drive is nearly full. In this case, the write of a single 512-byte logical block will result in at least one NAND page being written. With a page size of 4KB, the write amplification must, out of necessity, be at least 4096 bytes / 512 bytes, or 8. However, most SSD vendors report something much closer to one, which would be the case for an empty drive and larger data transfer sizes.
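The arithmetic above can be checked with a short Python sketch. It assumes the worst case described in the text, where each host write is committed to whole pages on its own, and it counts only the overhead from page granularity; garbage collection and wear-leveling traffic would come on top. The function name is hypothetical.

```python
import math

def min_write_amplification(host_write_bytes, page_bytes=4096):
    """Lower bound on write amplification when each host write must be
    committed to whole NAND pages on its own (page granularity only)."""
    pages = math.ceil(host_write_bytes / page_bytes)
    return pages * page_bytes / host_write_bytes

for size in (512, 4096, 128 * 1024):
    print(size, min_write_amplification(size))
```

A single 512-byte write gives 8, while page-sized and larger transfers give 1, which is why figures quoted for large transfers on an empty drive look so much better.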
So…next time you see new claims about SSD performance and endurance, keep these concepts in mind. They’re a good place to start judging whether they’re catching up or if you should be impressed.