Using Burst Buffer for Complex Workflows

Typical PDE-based simulation workflows entail the following steps: geometric modelling, meshing, setting up boundary and initial conditions, running the solver, possibly converting the format of the output files, and post-processing, i.e. visualization and/or data analysis. Most of these workflows start with a very small data set, and the final data needed to make a scientific discovery is also very small (it could be just a picture or a movie). In between, however, these workflows involve large amounts of intermediate data that are not needed for the final scientific discovery.

Shaheen's DataWarp (burst buffer) is a good candidate for running such workflows. The main motivations are to reduce the data footprint and to minimize metadata traffic on the Lustre file system.

A test case is available in /scratch/tmp/openfoam_bb_testcase.tar. The archive contains a ReadMe file that explains how to edit the jobscript and run the test case.

This is a simple 3D laminar-flow cavity test case with 100x100x100 cells, designed to demonstrate the value of DataWarp in optimizing the flow of data for complex workflows. The intermediate data stays only temporarily on the burst buffer, and only the final processed data is migrated to the Lustre file system. This workflow reduces the data footprint and also minimizes metadata traffic on the Lustre file system.
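
The data flow described above can be illustrated with a minimal Python sketch. It is not the jobscript shipped with the test case: the function name run_workflow, the file names and the toy "solver" are hypothetical stand-ins. The sketch assumes that the burst buffer allocation is exposed to the job through the DW_JOB_STRIPED environment variable (check the ReadMe for the path actually used on Shaheen) and falls back to a temporary directory when that variable is not set.

```python
import os
import shutil
import tempfile
from pathlib import Path


def run_workflow(scratch_dir: Path, persistent_dir: Path, n_steps: int = 5) -> Path:
    """Keep intermediate files in scratch_dir; copy only the final result out."""
    work = scratch_dir / "intermediate"
    work.mkdir(parents=True, exist_ok=True)

    # "Solver" stage (toy stand-in): one intermediate file per time step.
    for step in range(n_steps):
        values = [float(step * i) for i in range(1000)]
        (work / f"step_{step:04d}.dat").write_text("\n".join(map(str, values)))

    # "Post-processing" stage: reduce all intermediate files to one small summary.
    maxima = []
    for path in sorted(work.glob("step_*.dat")):
        maxima.append(max(float(line) for line in path.read_text().splitlines()))
    summary = work / "summary.txt"
    summary.write_text("\n".join(f"{m:.1f}" for m in maxima) + "\n")

    # Migrate only the final product to the persistent (e.g. Lustre) directory.
    persistent_dir.mkdir(parents=True, exist_ok=True)
    final = persistent_dir / summary.name
    shutil.copy2(summary, final)
    return final


if __name__ == "__main__":
    # Assumption: DW_JOB_STRIPED points at the burst buffer allocation of the job.
    scratch = Path(os.environ.get("DW_JOB_STRIPED") or tempfile.mkdtemp(prefix="bb_"))
    # Hypothetical destination; in a real job this would be a directory on Lustre.
    persistent = Path(tempfile.mkdtemp(prefix="results_"))
    print(run_workflow(scratch, persistent))
```

In this sketch the many per-step files never touch the persistent directory. Only the single summary file is copied there, which is why both the data footprint and the number of metadata operations on the persistent file system stay small.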

The ideas from this CFD-based test can be applied to simulations in other domain sciences that involve workflows and temporary intermediate data.

In the next training session (or next month's Tip of the Week), we will demonstrate how to integrate data visualization into this workflow. This type of workflow design is ideally suited for generating simulation-based data for deep learning training.