Use Case 4: On-Demand Node-Local Parallel Filesystem

New node-local parallel filesystems from BeeGFS (BeeOND) and Lustre are being deployed to provide ephemeral scratch space that keeps inter-node IO traffic localized.  In many cases, the best way to implement the back-end block devices for these node-local parallel filesystems is with RAM disks.  On-demand parallel filesystems are built from Management, Metadata Target, and Object Storage components (see the figure below).  The allocated RAM disk storage must be large enough to accommodate growth of the Metadata and Object Storage stripes.  Communication between the parallel filesystem components can be carried over Ethernet, but RDMA is more commonly used to reduce communication latency and improve bandwidth.  In current HPC and Cloud architectures, these RAM disk block devices reduce the RAM available to running processes.
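The sizing requirement above can be illustrated with a minimal sketch.  The function name, its parameters, and the example percentages are hypothetical and are not taken from BeeOND or Lustre; the sketch only assumes that metadata can be estimated as a percentage of object data and that a further percentage is reserved for growth.

```python
GIB = 1024 ** 3


def ramdisk_bytes_per_node(object_bytes, metadata_pct, headroom_pct, nodes):
    """Return the RAM disk size each node contributes, in bytes.

    object_bytes -- expected peak object (stripe) data for the whole job
    metadata_pct -- metadata size as a percentage of object data (assumed)
    headroom_pct -- extra percentage reserved for growth (assumed)
    nodes        -- number of nodes hosting storage targets
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # Integer arithmetic keeps the result exact for large byte counts.
    scaled = object_bytes * (100 + metadata_pct) * (100 + headroom_pct)
    divisor = 100 * 100 * nodes
    # Ceiling division so the allocation is never rounded down.
    return -(-scaled // divisor)


# Hypothetical job: 64 GiB of object data, 5% metadata, 25% headroom, 8 nodes.
per_node = ramdisk_bytes_per_node(64 * GIB, 5, 25, 8)
print(f"{per_node / GIB:.1f} GiB per node")
```

With these example values each node would set aside 10.5 GiB of RAM disk, which is memory no longer available to the application processes on that node.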

In a composable parallel computing system, a better option is to deploy the requested RAM disk storage from available NVMe memory blocks, using trained Machine Learning models to find the memory expected to provide the highest IO transaction bandwidth and lowest latency.  As shown in the figure, CPU cores are matched to NVMe memories through CXL 3.0 peer-to-peer network switches.  The diagram also includes block storage devices, reachable through the CXL switches, as another option.  IO block transactions are very good candidates for dynamically attached memory blocks, because the OS can place processes and threads into an IO wait queue until the transactions complete and an interrupt is generated.
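The selection step can be sketched as follows.  This is a sketch under stated assumptions, not the system's actual placement algorithm: the Candidate type, its field names, and the example values are hypothetical, the predicted figures stand in for whatever a trained model would supply, and ranking by bandwidth first with latency as the tie-breaker is one simple policy among many.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A memory block reachable through a switch (hypothetical type)."""
    name: str
    free_bytes: int
    predicted_bandwidth_gbs: float  # assumed to come from a trained model
    predicted_latency_us: float     # assumed to come from a trained model


def pick_candidate(candidates: Sequence[Candidate],
                   required_bytes: int) -> Optional[Candidate]:
    """Pick the block with enough free space and the best predicted IO."""
    fits = [c for c in candidates if c.free_bytes >= required_bytes]
    if not fits:
        return None  # nothing large enough; caller must fall back
    # Highest predicted bandwidth wins; lower predicted latency breaks ties.
    return max(
        fits,
        key=lambda c: (c.predicted_bandwidth_gbs, -c.predicted_latency_us),
    )


GIB = 1024 ** 3
pool = [
    Candidate("block-a", 8 * GIB, 12.0, 4.0),   # fast but too small
    Candidate("block-b", 32 * GIB, 10.0, 6.0),
    Candidate("block-c", 32 * GIB, 10.0, 5.0),  # same bandwidth, lower latency
]
choice = pick_candidate(pool, required_bytes=16 * GIB)
print(choice.name if choice else "no candidate fits")
```

Here block-a is excluded because it cannot hold the request, and block-c is chosen over block-b on predicted latency.  A real deployment would likely weigh bandwidth against latency rather than rank them strictly.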