Understanding UFS vs NVMe Software Stack Overhead in Linux Systems
Storage performance has become a critical factor in modern computing, especially in embedded systems, smartphones, and laptops. Two major storage technologies—Universal Flash Storage (UFS) and Non-Volatile Memory Express (NVMe)—are at the center of this evolution. While both offer high performance and low latency, their software stack overheads differ significantly, especially on Linux-based systems.
This blog dives deep into the UFS and NVMe storage software stacks on Linux, explores system-level overheads, and highlights findings from a detailed performance comparison presentation by Micron.
1. Overview of UFS and NVMe Software Stacks in Linux
UFS (Universal Flash Storage)
UFS is a JEDEC standard primarily used in smartphones and embedded systems. It offers:
- High-speed serial interface (MIPI M-PHY + UniPro)
- Full-duplex communication
- Native command queuing built into the protocol
- Low power consumption
UFS in the Linux Kernel
- First-class support since Linux kernel 3.18
- Modern devices use kernel 5.x and above with enhanced UFS features
Key Linux components for UFS:
- SCSI UFS host controller driver (`ufshcd.c`)
- SCSI mid-layer
- Block layer (`blk-mq`)
- I/O scheduler (None, MQ-Deadline, BFQ, etc.; see the sysfs sketch below)
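To check which scheduler is active for a given device, blk-mq exposes the selection through sysfs, with the current choice shown in brackets. Below is a minimal C sketch; the device name `sda` is an assumption (on most platforms, each UFS LUN surfaces as a SCSI disk, `/dev/sdX`).

```c
/* Minimal sketch: print the active I/O scheduler for a block device.
 * "sda" is an assumed device name for a UFS LUN exposed as a SCSI disk. */
#include <stdio.h>

int main(void)
{
    /* blk-mq lists the available schedulers and brackets the active one,
     * e.g. "[mq-deadline] bfq none" */
    FILE *f = fopen("/sys/block/sda/queue/scheduler", "r");
    if (!f) {
        perror("fopen");
        return 1;
    }
    char line[256];
    if (fgets(line, sizeof(line), f))
        printf("I/O scheduler for sda: %s", line);
    fclose(f);
    return 0;
}
```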
UFS Stack Flow in Linux
```
User I/O Request
      ↓
Filesystem (ext4, f2fs, etc.)
      ↓
Block Layer (blk-mq)
      ↓
SCSI Mid-layer
      ↓
UFS Host Controller Driver (ufshcd)
      ↓
UniPro/M-PHY Interface
      ↓
UFS Device
```
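Because UFS sits behind the SCSI mid-layer, a UFS LUN can be exercised with generic SCSI commands from user space. The sketch below sends a TEST UNIT READY through the `SG_IO` ioctl; treating `/dev/sda` as a UFS-backed SCSI disk is an assumption for illustration.

```c
/* Sketch: send a SCSI TEST UNIT READY to a UFS LUN through the SCSI
 * mid-layer. /dev/sda is assumed to be a UFS-backed SCSI disk. */
#include <fcntl.h>
#include <scsi/sg.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/sda", O_RDONLY | O_NONBLOCK);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    unsigned char cdb[6] = { 0x00 };      /* TEST UNIT READY opcode */
    unsigned char sense[32] = { 0 };

    struct sg_io_hdr hdr;
    memset(&hdr, 0, sizeof(hdr));
    hdr.interface_id    = 'S';
    hdr.dxfer_direction = SG_DXFER_NONE;  /* no data phase */
    hdr.cmd_len         = sizeof(cdb);
    hdr.cmdp            = cdb;
    hdr.mx_sb_len       = sizeof(sense);
    hdr.sbp             = sense;
    hdr.timeout         = 5000;           /* ms */

    if (ioctl(fd, SG_IO, &hdr) < 0) {
        perror("SG_IO");
        close(fd);
        return 1;
    }
    printf("status=0x%x host_status=0x%x driver_status=0x%x\n",
           hdr.status, hdr.host_status, hdr.driver_status);
    close(fd);
    return 0;
}
```

Every such request crosses the SCSI mid-layer before `ufshcd` translates it into a UFS transfer request, which is exactly the extra hop the NVMe path avoids.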
NVMe (Non-Volatile Memory Express)
NVMe is designed specifically for SSDs connected over PCIe. It is widely used in:
- Laptops and desktops
- Data centers
- Removable SSDs (e.g., USB4 NVMe enclosures)
NVMe in the Linux Kernel
- Introduced in Linux 3.3; matured through the 4.x series and later
Key Linux components for NVMe:
- NVMe driver (`nvme-core`, `nvme.c`)
- Block layer (`blk-mq`)
- Direct I/O submission to hardware
NVMe Stack Flow in Linux
```
User I/O Request
      ↓
Filesystem
      ↓
Block Layer (blk-mq)
      ↓
NVMe Driver (nvme.c)
      ↓
PCIe
      ↓
NVMe Device
```
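The absence of a translation layer shows up at the API level, too: user space can pass NVMe admin commands straight through to the driver. Below is a minimal sketch issuing Identify Controller via the `NVME_IOCTL_ADMIN_CMD` ioctl; using `/dev/nvme0` as the first controller's character device is an assumption.

```c
/* Sketch: send an NVMe Identify Controller admin command directly to
 * the driver, with no SCSI translation in the path.
 * /dev/nvme0 is assumed to be the first NVMe controller. */
#include <fcntl.h>
#include <linux/nvme_ioctl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/nvme0", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    unsigned char *data = calloc(1, 4096);   /* Identify data buffer */
    struct nvme_admin_cmd cmd;
    memset(&cmd, 0, sizeof(cmd));
    cmd.opcode   = 0x06;                     /* Identify */
    cmd.addr     = (unsigned long long)(uintptr_t)data;
    cmd.data_len = 4096;
    cmd.cdw10    = 1;                        /* CNS=1: Identify Controller */

    if (ioctl(fd, NVME_IOCTL_ADMIN_CMD, &cmd) < 0) {
        perror("NVME_IOCTL_ADMIN_CMD");
        return 1;
    }
    /* Model number: 40 space-padded bytes at offset 24 of the Identify data. */
    printf("Model: %.40s\n", data + 24);
    free(data);
    close(fd);
    return 0;
}
```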
Key NVMe Advantages
- No SCSI overhead (unlike UFS)
- Simpler protocol stack
- Lower software latency
2. Micron Presentation: UFS vs NVMe Storage Stack Overhead Comparison
An insightful presentation titled “UFS/NVMe Storage Stack System Level Performance in Embedded Systems” evaluated the software overhead of the UFS and NVMe stacks in embedded Linux systems.
📄 Download the PDF Slides
Key Points from the Presentation:
- Test Environment:
  - Embedded system with both UFS and NVMe storage
  - Linux kernel 5.4/5.10 (depending on platform)
  - Identical workloads across both devices
- Metrics Analyzed (a user-space measurement sketch follows this list):
  - IOPS (Input/Output Operations Per Second)
  - Latency per layer (block layer, driver layer, etc.)
  - CPU usage and context-switch rate
  - Interrupt count and scheduling cost
- Findings:
  - NVMe shows 30-50% lower software stack latency than UFS in random read workloads.
  - The UFS software stack introduces extra overhead due to the SCSI mid-layer and controller complexity.
  - The Linux SCSI stack is a legacy burden for UFS that is absent from the NVMe path.
  - In high-concurrency scenarios, NVMe scales better thanks to its native multi-queue support and lightweight protocol.
- Power Efficiency:
  - UFS remains more power-efficient for mobile usage, justifying its dominance in smartphones.
  - NVMe consumes more power but delivers higher throughput per watt in high-performance systems.
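Latency and IOPS of the kind analyzed above can be approximated from user space. The following sketch times 4 KiB `O_DIRECT` random reads at queue depth 1, so each read traverses the full software stack and device; the device node is passed as an argument, and running as root is assumed.

```c
/* Sketch: estimate per-I/O latency and IOPS with 4 KiB O_DIRECT random
 * reads. Pass the device node (e.g. /dev/nvme0n1 or /dev/sda) as argv[1].
 * O_DIRECT bypasses the page cache, so each read crosses the block
 * layer, driver, and device. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define BLK   4096
#define COUNT 1000

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <device>\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    void *buf;
    if (posix_memalign(&buf, BLK, BLK)) return 1;  /* O_DIRECT needs alignment */

    off_t dev_size = lseek(fd, 0, SEEK_END);
    if (dev_size < BLK) { fprintf(stderr, "device too small\n"); return 1; }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < COUNT; i++) {
        off_t off = (rand() % (dev_size / BLK)) * (off_t)BLK;
        if (pread(fd, buf, BLK, off) != BLK) { perror("pread"); return 1; }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double sec = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("avg latency: %.1f us, IOPS: %.0f\n",
           sec / COUNT * 1e6, COUNT / sec);
    free(buf);
    close(fd);
    return 0;
}
```

Note that this captures software stack and device latency together; separating per-layer costs, as the presentation does, requires kernel tracing tools such as blktrace or ftrace.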
Supporting Resources:
- Blog: Linux Storage System Bottleneck for eMMC/UFS
- Legacy Slides: Linux Storage System Bottleneck Exploration (PDF)
3. Industry Usage Scenarios
| Technology | Typical Usage | Stack Features | Power Profile |
|---|---|---|---|
| UFS | Smartphones, tablets, automotive infotainment | SCSI-based, slower software stack | Optimized for low power |
| NVMe | Laptops, desktops, embedded edge devices | Native NVMe, low-latency path | Higher performance, higher power |
4. Conclusion
The choice between UFS and NVMe in Linux systems isn’t just about hardware performance—it heavily depends on software stack efficiency. NVMe offers a cleaner, faster path, while UFS lags due to legacy SCSI components. However, UFS remains relevant in power-sensitive applications like mobile devices.
As Linux continues to evolve (especially with features like io_uring, blk-mq optimizations, and lighter-weight SCSI layering), we may see these gaps narrow. For now, NVMe is the clear winner in low-overhead, high-performance environments, while UFS remains the choice for mobile-centric designs.
Author Note: This blog is based on public presentations and research by Bean Huo and is intended for educational and reference use by developers and system integrators.