
Robust, High-Performance Parallel File System Storage for HPC & AI
OnDemandPFS is an appliance-based, preconfigured parallel file system storage solution built on BeeGFS, a parallel cluster file system, and aimed at small to mid-sized HPC/AI installations.
OnDemandPFS delivers high throughput, particularly for AI workloads, and scales horizontally: adding nodes increases both performance and capacity. Customers can choose between OPEX and CAPEX business models.
Key Features
- All-flash storage for demanding HPC/AI workloads.
- Redundant configuration consisting of 4x BeeGFS nodes with 8x uplinks to IB switches.
- System monitoring stack included; VM-based monitoring is supported.
- Central syslog server integration, if required.
- Based on xiRAID, the fastest RAID engine, with the industry's fastest drive rebuild time.
- OnDemandPFS is available in 3 configurations: 100TB, 200TB, and 600TB.
System Architecture
The unique userspace architecture keeps metadata access latency (e.g., for directory lookups) to a minimum and distributes the metadata across multiple servers, so that each metadata server stores a part of the global file system namespace.
By increasing the number of servers and disks in the system, the performance and capacity of the file system can be scaled to the level required, seamlessly from small clusters up to enterprise-class systems with thousands of nodes.
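The idea of splitting the global namespace across metadata servers can be illustrated with a minimal sketch. It is not the BeeGFS placement algorithm, which this page does not describe: the hash-based assignment, the function names, and the directory paths below are all assumptions made for illustration. The sketch only shows that each server ends up owning a part of the namespace, and that the part per server shrinks as servers are added.

```python
import hashlib
from collections import Counter


def owner_for_directory(path: str, num_metadata_servers: int) -> int:
    """Return the index of the metadata server that owns a directory.

    Assumption: a stable hash of the path stands in for the real
    placement policy, which is not specified in the source text.
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_metadata_servers


def namespace_shares(directories, num_metadata_servers):
    """Count how many directories each metadata server owns."""
    counts = Counter(
        owner_for_directory(d, num_metadata_servers) for d in directories
    )
    return [counts.get(i, 0) for i in range(num_metadata_servers)]


# Hypothetical namespace: 20 teams with 50 run directories each.
directories = [f"/projects/team{t}/run{r}" for t in range(20) for r in range(50)]

for servers in (2, 4, 8):
    shares = namespace_shares(directories, servers)
    print(f"{servers} metadata servers -> directories per server: {shares}")
```

With more metadata servers, each one holds a smaller slice of the namespace, which is the intuition behind scaling metadata performance by adding servers.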

Solution Whitepaper - On Demand PFS
In the realm of high-performance computing (HPC) and artificial intelligence (AI), designing a storage system that strikes the right balance for efficient and optimal performance poses significant challenges. These environments, often consisting of hundreds of compute/worker nodes, handle vast amounts of data to solve intricate and complex problems. They need to support both large data sets for big data analytics and numerous small files for machine learning (ML) and AI applications without experiencing performance issues. Organizations are seeking affordable, high-performance, and highly available solutions that are also easy to manage while adapting to the growing demands of diverse I/O-intensive workloads.
The challenge is that HPC/AI storage infrastructure requirements can change quickly, requiring companies to scale their resources up and out. Composable Disaggregated Infrastructure (CDI) is a modern architectural approach to data centre infrastructure that disaggregates compute, storage, and network resources into shared pools which can be composed for on-demand allocation. Applied to HPC/AI storage, CDI is the key to solving this optimization problem: it enables valuable resources to be deployed through software at just the right levels and within a minimal amount of time, even inside an HPC or AI cluster.
Xinnor xiRAID is a lightweight software RAID offering that complements CDI solutions whilst delivering performance very close to raw device capabilities and exceeding that of traditional hardware RAID solutions.
BeeGFS is an easily deployable, hardware-independent POSIX parallel file system developed with a strong focus on performance and designed for ease of use, simple installation, and easy management.
The purpose of this document is to showcase a combined solution: a BeeGFS cluster deployment with Xinnor xiRAID volumes on commodity hardware.