Shaheen II
The KSL team manages Shaheen II, a Cray XC40 introduced in March 2015 that delivers over 7.2 Pflop/s of theoretical peak performance. With 5.536 Pflop/s of sustained LINPACK performance, Shaheen II was ranked the seventh-fastest supercomputer in the world on the TOP500 list of July 2015.
The system has 6,174 dual-socket compute nodes based on 16-core Intel Haswell processors running at 2.3 GHz. Each node has 128 GB of DDR4 memory running at 2300 MHz. Overall, the system has 197,568 processor cores and 790 TB of aggregate memory. Fig. 1 summarises the specifications of the Shaheen II system.
| COMPUTE | Node | Processor type | 2 CPU sockets per node, 16 cores per CPU, 2.3 GHz |
| | | 6,174 nodes | 197,568 cores |
| | | 128 GB of DDR4 memory per node | Over 790 TB total memory |
| | Power | Up to 3.1 MW | Water cooled |
| | Weight/Size | More than 100 metric tonnes | 36 Cray XC40 compute cabinets, plus disk, blowers, management, etc. |
| | Speed | 7.2 Pflop/s theoretical peak performance | 5.53 Pflop/s sustained LINPACK |
| | Network | Cray Aries interconnect with Dragonfly topology | 57% of the maximum global bandwidth between the 18 groups of 2 cabinets |
| STORE | Storage | Sonexion 2000 Lustre | 5,988 4 TB disks |
| | | ClusterStor E1000 Lustre | 3,392 16 TB disks |
| | Burst Buffer | Cray DataWarp | 536 solid-state drives (SSD) as a fast data cache, 1.5 PB capacity |
| | Backup | HPE Data Management Framework (DMF) | 52,000 tapes providing 100 PB of capacity, using a Spectralogic tape library with 20 tape drives |
Figure 1. Specifications of the Cray XC40 Shaheen II.
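The headline compute figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not an official KSL calculation: the node, socket, core, clock, and memory values come from the text, while the value of 16 double-precision floating-point operations per cycle per core is an assumption based on Haswell's two 256-bit fused multiply-add units. The function names are hypothetical, and decimal units (1 TB = 1000 GB) are assumed.

```python
# Cross-check of the Shaheen II compute figures quoted in the text.
NODES = 6174              # from the text
SOCKETS_PER_NODE = 2      # from the text
CORES_PER_SOCKET = 16     # from the text
CLOCK_HZ = 2.3e9          # from the text (2.3 GHz)
MEM_PER_NODE_GB = 128     # from the text
LINPACK_PFLOPS = 5.536    # from the text

# Assumption (not stated in the text): each Haswell core retires
# 16 double-precision flops per cycle (2 FMA units x 4 doubles x 2 ops).
FLOPS_PER_CYCLE = 16


def total_cores() -> int:
    """Number of processor cores across all compute nodes."""
    return NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET


def total_memory_tb() -> float:
    """Aggregate memory in TB, using decimal units (1 TB = 1000 GB)."""
    return NODES * MEM_PER_NODE_GB / 1000


def peak_pflops() -> float:
    """Theoretical peak in Pflop/s under the flops-per-cycle assumption."""
    return total_cores() * CLOCK_HZ * FLOPS_PER_CYCLE / 1e15


def linpack_efficiency() -> float:
    """Sustained LINPACK performance as a fraction of the computed peak."""
    return LINPACK_PFLOPS / peak_pflops()


if __name__ == "__main__":
    print(f"cores:      {total_cores()}")
    print(f"memory:     {total_memory_tb():.0f} TB")
    print(f"peak:       {peak_pflops():.2f} Pflop/s")
    print(f"efficiency: {linpack_efficiency():.1%}")
```

Under these assumptions the core count and aggregate memory reproduce the quoted 197,568 cores and 790 TB, and the computed peak is consistent with the quoted "over 7.2 Pflop/s".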
The compute nodes are housed in 36 water-cooled XC40 cabinets and connected via the Aries High Speed Network (HSN). The HSN is configured with 8 optical network connections between every pair of cabinets, which provides 57% of the maximum global bandwidth between the 18 groups of two cabinets. This configuration leaves room for a future upgrade: additional cabinets can be accommodated with more optical links while keeping the same level of connectivity, i.e. 8 optical network connections between every pair of cabinets.
KAUST’s system includes a richly layered data storage architecture. The main data storage solution is a Lustre parallel file system based on Cray Sonexion 2000, with a usable storage capacity of 17.2 PB and around 500 GB/s of I/O throughput. The Cray Sonexion 2000 installation is configured with 72 high-performance Scalable Storage Units (SSU) and 144 Object Storage Servers (OSS) with 4 TB drives, connected to the XC40 via 72 LNET router service nodes evenly distributed across the 36 cabinets.
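The Sonexion 2000 figures imply a few per-component ratios that are easy to derive. The sketch below is illustrative only: the inputs are taken from the text and from Fig. 1 (5988 disks of 4 TB), the function name `storage_ratios` is hypothetical, and decimal units (1 PB = 1000 TB) are assumed. The even split of throughput across OSS nodes is also an assumption made for the estimate, not a measured value.

```python
# Derived ratios for the Sonexion 2000 Lustre file system described above.
SSU_COUNT = 72            # Scalable Storage Units, from the text
OSS_COUNT = 144           # OSS nodes, from the text
LNET_ROUTERS = 72         # LNET router service nodes, from the text
CABINETS = 36             # XC40 compute cabinets, from the text
DISK_COUNT = 5988         # from Fig. 1
DISK_TB = 4               # from Fig. 1
USABLE_PB = 17.2          # from the text
THROUGHPUT_GBS = 500      # approximate aggregate I/O throughput, from the text


def storage_ratios() -> dict:
    """Return simple per-component ratios implied by the quoted figures."""
    raw_pb = DISK_COUNT * DISK_TB / 1000  # decimal units assumed
    return {
        "raw_capacity_pb": raw_pb,
        "usable_fraction": USABLE_PB / raw_pb,
        "oss_per_ssu": OSS_COUNT / SSU_COUNT,
        "routers_per_cabinet": LNET_ROUTERS / CABINETS,
        # Assumes throughput is spread evenly across all OSS nodes.
        "gbs_per_oss": THROUGHPUT_GBS / OSS_COUNT,
    }


if __name__ == "__main__":
    for name, value in storage_ratios().items():
        print(f"{name}: {value:.3f}")
```

Read this way, each SSU hosts two OSS nodes, each cabinet hosts two LNET routers, and the usable capacity is roughly 72% of the raw disk capacity, with the remainder presumably going to redundancy and file system overhead.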
Backup and archiving were initially provided by a Cray Tiered Adaptive Storage (TAS) system. This has now been replaced with HPE Data Management Framework (DMF), which includes a tape library with a total capacity of 100 PB.