What is the background of the birth of NVMe?
What is end-to-end NVMe?
Why is NVMe so fast?
What are the NVMe-oF transport types?
Which operating systems support NVMe?
When it comes to NVMe, "fast" is the first impression it gives people. A few years ago, all-flash arrays supporting NVMe made up only a small part of the storage market; this year, it is hard to find a new mid-range or higher storage product that does not support NVMe.
But what does NVMe offer besides speed? We often hear the terms end-to-end NVMe, NVMe-oF, and FC-NVMe. What do they mean?
Historically, most SSDs attached to computers over buses such as SATA, SAS, or Fibre Channel. As SSDs reached the mass market, SATA became the most common way to connect them to personal computers. But SATA was designed for mechanical hard drives (HDDs) and struggles to keep up with ever-faster SSDs: unlike HDDs, many SSDs are constrained by SATA's maximum throughput, so their data-rate growth stalled.
Before NVMe, high-end SSDs could use the PCI Express bus, but only through non-standard, vendor-specific interfaces. With a standardized SSD interface, the operating system needs just one driver to work with every compliant SSD, and each SSD manufacturer no longer has to spend resources designing drivers for proprietary interfaces.
This is because the storage systems of the time were all designed around mechanical hard drives. For example, many early all-flash or hybrid-flash arrays used traditional SATA SSDs. That storage was built on the AHCI (Advanced Host Controller Interface) command protocol. AHCI was born for mechanical drives, and a SATA III bus running AHCI allows data transfer speeds of at most 600 MB/s.
Therefore, the NVMe specification was born. When NVMe (Non-Volatile Memory Express) first reached the market, many people thought it was just a new, faster kind of SSD. In fact, NVMe is an entirely new, performance-driven storage protocol that allows us to fully exploit the speed of SSDs and storage-class memory (SCM).
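The 600 MB/s ceiling follows directly from SATA III's 6 Gbit/s line rate and its 8b/10b encoding. A quick back-of-envelope check (a sketch of the arithmetic, not a vendor measurement):

```python
# SATA III line rate is 6 Gbit/s, but 8b/10b encoding means only
# 8 of every 10 bits on the wire carry payload data.
line_rate_bits = 6_000_000_000   # 6 Gbit/s
payload_fraction = 8 / 10        # 8b/10b encoding overhead

# Divide by 8 to convert payload bits/s to bytes/s.
effective_bytes = line_rate_bits * payload_fraction / 8

print(f"{effective_bytes / 1e6:.0f} MB/s")  # → 600 MB/s
```

This is why SATA SSDs plateau around 550 MB/s in practice once protocol overhead is included.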
NVMe replaces the original AHCI specification and redefines command processing at the software level; the SCSI/ATA command sets no longer apply. An NVMe SSD attaches directly to the computer or server over the high-speed PCIe bus, reducing CPU overhead, simplifying the I/O path, lowering latency, and improving IOPS and throughput.
Queue depth (QD) is another advantage of NVMe over AHCI. AHCI provides only a single queue with a depth of 32, and SAS a single queue with a depth of 254. The NVMe protocol was designed with this problem in mind: it supports up to 65,535 I/O queues, each up to 65,536 commands deep. Beyond reducing latency, this is critical to a server's ability to handle concurrent requests.
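The gap in outstanding commands is easy to quantify: the upper bound is simply queues times depth. A minimal sketch, using the single-queue depths mentioned above and the maximum limits from the NVMe specification:

```python
def max_outstanding(queues: int, depth: int) -> int:
    """Upper bound on commands in flight: number of queues x queue depth."""
    return queues * depth

# AHCI: one queue, 32 commands deep.
ahci = max_outstanding(queues=1, depth=32)

# NVMe: up to 65,535 I/O queues, each up to 65,536 commands deep.
nvme = max_outstanding(queues=65_535, depth=65_536)

print(ahci)  # → 32
print(nvme)  # → 4294901760
```

In other words, NVMe's theoretical command-level parallelism is more than a hundred million times that of AHCI, which is why per-core queues pay off on many-core servers.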
Having covered NVMe itself, let's talk about end-to-end NVMe. When we describe PowerMax or PowerStore, we often use the term "end-to-end NVMe" support. In practice, this means the performance of the SSDs can be unlocked even further.
This is because most all-flash arrays at the time supported NVMe SSDs only at the storage back end. Compared with arrays using SATA or SAS SSDs, that did bring a performance improvement, but it does not mean those NVMe SSDs had reached their performance limit. In theory, an NVMe SSD all-flash array can deliver as much as ten times the performance of one built on SAS or SATA SSDs.
This large performance gap stems from the fact that the all-flash array controller architectures of the time were also designed around mechanical hard drives; once NVMe SSDs were adopted, the controller itself became the obstacle. For this reason, array controllers and storage network protocols had to continue evolving.
NVMe over Fabrics (NVMe-oF) emerged to bring NVMe to the front end as the channel connecting storage arrays and hosts, replacing the FC and iSCSI of the past. As a result, the host can communicate with the NVMe SSDs directly using the native NVMe protocol, greatly reducing latency.
The following section explains why NVMe is so fast (note: the speeds discussed here assume SSD devices, not mechanical hard drives). Thanks to the physical characteristics of flash, an SSD's data access is already very fast; the performance bottleneck lies in the interface and protocol between the computer and the device.
Let's take a simple analogy. Imagine a warehouse that continuously produces goods, and robotic arms that carry those goods elsewhere (as shown in the figure). A SATA SSD is like a single-armed robot: the warehouse produces quickly, but the robot can carry only one item at a time, and it moves slowly.
What about an NVMe SSD? It is like a robot with hundreds of arms, so it is obviously much faster than the SATA SSD.
The NVMe protocol works on the same principle: it establishes many paths between the computer and the storage device, so data naturally moves faster. In NVMe, these multiple channels are actually multiple queues, as shown in the figure. With SATA, the computer and the storage device share a single queue; even with multiple CPUs, all requests must pass through that one narrow path. NVMe supports up to 64K queues, so each CPU or core can have its own queue, dramatically raising concurrency and, in turn, performance.
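The per-CPU-queue idea can be sketched with ordinary Python threads: each "CPU" submits into its own queue instead of contending for one shared path. This is only an illustration of the queuing model, not how a real NVMe driver is implemented:

```python
import queue
import threading

NUM_WORKERS = 4        # stand-ins for CPU cores
IOS_PER_WORKER = 1000  # I/O requests each core submits

# One submission queue per "CPU", mirroring the NVMe model.
queues = [queue.Queue() for _ in range(NUM_WORKERS)]
completed = []
lock = threading.Lock()

def device_side(q: queue.Queue) -> None:
    """Drain one submission queue, recording completions."""
    while True:
        item = q.get()
        if item is None:  # sentinel: this queue is finished
            return
        with lock:
            completed.append(item)

def host_side(cpu: int) -> None:
    """Each CPU submits only to its own queue - no cross-CPU contention."""
    for i in range(IOS_PER_WORKER):
        queues[cpu].put((cpu, i))
    queues[cpu].put(None)

device_threads = [threading.Thread(target=device_side, args=(q,)) for q in queues]
host_threads = [threading.Thread(target=host_side, args=(c,)) for c in range(NUM_WORKERS)]
for t in device_threads + host_threads:
    t.start()
for t in device_threads + host_threads:
    t.join()

print(len(completed))  # → 4000
```

The key design point the sketch shows: because no two submitters ever touch the same queue, there is no lock contention on the submission path, which is exactly what per-core NVMe queues buy on real hardware.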
NVMe-oF offers three transport types: NVMe-oF over Fibre Channel, NVMe-oF over TCP, and NVMe-oF over RDMA.
Running NVMe over Fibre Channel (FC) is usually called FC-NVMe or NVMe over FC, and sometimes NVMe/FC. Fibre Channel is a robust protocol for transferring data between storage arrays and servers, and most SAN storage systems use it. In FC-NVMe, NVMe commands are encapsulated in FC frames; it builds on standard FC rules and matches the standard FC protocol while supporting access to shared NVMe flash.
This transport is one of the newest additions to NVMe-oF. NVMe over TCP (Transmission Control Protocol) carries NVMe-oF traffic over an IP (Ethernet) network, using Ethernet as the physical transport and TCP segments to carry the data.
Compared with RDMA and Fibre Channel, TCP offers a potentially cheaper and more flexible option. In addition, compared with RoCE, which also runs over Ethernet, NVMe/TCP behaves more like FC-NVMe because both use message semantics for I/O.
This specification uses remote direct memory access (RDMA) to transfer data and memory across the network between computers and storage devices. RDMA exchanges information directly between the main memories of two networked computers without involving either machine's processor, cache, or operating system. Because RDMA bypasses the operating system, it is usually the fastest, lowest-overhead mechanism for transferring data across a network.
NVMe-oF over RDMA moves data directly between the memories of the host and the storage system across the fabric. Typical RDMA implementations include the Virtual Interface Architecture, RDMA over Converged Ethernet (RoCE), InfiniBand, Omni-Path, and iWARP. RoCE, InfiniBand, and iWARP are currently the most widely used.
Using NVMe-oF with RDMA, Fibre Channel, or TCP forms a complete end-to-end NVMe storage solution. These solutions deliver very high performance while preserving the extremely low latency of NVMe.
Today, NVMe is becoming more and more popular thanks to its low latency and high multitasking throughput. Although NVMe also appears in personal computers to improve video editing, gaming, and other workloads, the real benefits show up in the enterprise through NVMe-oF, especially in scenarios where every second counts: real-time customer interaction, artificial intelligence (AI), machine learning (ML), big data, and advanced analytics development and operations. The faster data can be accessed and processed, the more value it brings to the business.
On March 30, 2017, an NVMe driver was released and made available for use.
On February 24, 2015, an NVMe driver was added to the kernel and boot loader, allowing Chrome OS to boot from NVMe devices.
DragonFly BSD has shipped a built-in NVMe driver since version 4.6.
The driver, developed by Intel, was committed to FreeBSD's head and stable/9 branches. The nvd(4) and nvme(4) drivers have been built in by default since FreeBSD 10.2.
Haiku has an NVMe driver on its development schedule, but it is not yet complete.
illumos received driver support on October 15, 2014.
NVMe support arrived with iOS 9; the first devices with an NVMe interface were the iPhone 6S and 6S Plus, which were also the first mobile devices to use NVMe. The physical layer is M-PHY, the same as that used by UFS, carrying PCIe. Apple's subsequent iPad Pro and iPhone SE also adopted NVMe.
The Linux NVMe driver was originally contributed by Intel as a kernel driver module and was merged into the mainline Linux kernel on March 19, 2012. Since kernel version 3.3, support has been built in with no additional modules required.
Starting with Linux kernel 3.13, released on January 19, 2014, the blk-multiqueue (blk-mq) layer, developed with Fusion-io, was added as a "scalable block layer" for NVMe SSDs. It unlocks the performance offered by SSDs and NVM Express by allowing much higher I/O submission rates: with this new design of the kernel block layer, internal queues are split into two levels (per-CPU software queues and hardware submission queues), removing bottlenecks and allowing far greater I/O parallelism.
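blk-mq's two-level design can be pictured as per-CPU software queues that flush into a smaller set of hardware dispatch queues. A toy model of that structure, not the kernel's actual data structures (the CPU-to-hardware-queue mapping here is a deliberately simple modulo):

```python
from collections import deque

NUM_CPUS = 8
NUM_HW_QUEUES = 2  # a hypothetical device exposing 2 submission queues

# Level 1: one software queue per CPU (no cross-CPU locking on submit).
sw_queues = [deque() for _ in range(NUM_CPUS)]
# Level 2: hardware dispatch queues that CPUs are mapped onto.
hw_queues = [deque() for _ in range(NUM_HW_QUEUES)]

def submit(cpu: int, request: str) -> None:
    """A CPU enqueues a request into its own software queue."""
    sw_queues[cpu].append(request)

def dispatch() -> None:
    """Flush every software queue into its mapped hardware queue."""
    for cpu, sq in enumerate(sw_queues):
        hwq = hw_queues[cpu % NUM_HW_QUEUES]  # simple static mapping
        while sq:
            hwq.append(sq.popleft())

for cpu in range(NUM_CPUS):
    submit(cpu, f"read-cpu{cpu}")
dispatch()

print([len(q) for q in hw_queues])  # → [4, 4]
```

On an NVMe device with one hardware queue per core, the mapping becomes 1:1 and the dispatch step involves no sharing at all, which is where the design shines.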
NetBSD gained initial NVMe support in a 2016 development version.
OpenBSD's NVMe driver effort began in June 2014, led by the senior developers who had produced its USB and AHCI drivers; official support arrived in OpenBSD 6.0.
OS X 10.10.3 (Yosemite) began supporting NVMe. Apple's 2016 Retina MacBook and MacBook Pro models include a PCIe SSD with NVMe as the primary drive.
Solaris has supported NVMe since Oracle Solaris 11.2.
Intel released an NVMe driver for VMware; it is included in vSphere 6.0 and later versions and supports a variety of NVMe devices. As of vSphere 6.0 Update 1, the storage subsystem emulated by VMware's vSAN software also supports NVMe devices.
Microsoft has supported NVMe devices natively since Windows 8.1 and Windows Server 2012 R2. For Windows 7 and Windows Server 2008 R2, Microsoft made NVMe driver support available through an update.
In addition to the drivers provided by Microsoft, the OpenFabrics Alliance maintains an open-source set of NVMe drivers for Windows 7, 8, 8.1, and 10, and for Windows Server 2008 R2, 2012, and 2012 R2.