Distributed Copy
dcp (distributed copy) is an MPI-based copy tool developed by Lawrence Livermore National Laboratory (LLNL) as part of its mpifileutils suite. It is installed on Shaheen. The following example jobscript launches a data-moving job with dcp:
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --time=01:00:00
#SBATCH --hint=nomultithread

module load mpifileutils
time srun -n ${SLURM_NTASKS} dcp --verbose --progress 60 --preserve /path/to/source/directory /path/to/destination/directory
The above script launches dcp in parallel with 4 MPI processes, as set by --ntasks=4 and passed to srun through ${SLURM_NTASKS}.
--progress 60 means that the progress of the operation will be reported every 60 seconds.
--preserve means that the ACL permissions, group ownership, timestamps and extended attributes of the source files will be preserved on their copies in the destination directory.
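Before submitting a long copy job, it can help to see how the command line expands. The following is a minimal sketch of a dry-run helper, not part of mpifileutils: the function name build_dcp_cmd and the variable PROGRESS_SECS are hypothetical names chosen for illustration, and the fallback of 4 tasks simply mirrors the jobscript above. It only prints the command and does not copy anything. Paths containing spaces are not handled.

```shell
#!/bin/bash
# Hypothetical helper (not part of mpifileutils): assembles the srun/dcp
# command line used in the jobscript so it can be reviewed before running.
build_dcp_cmd() {
    local src="$1"
    local dst="$2"
    # SLURM_NTASKS is set by Slurm inside a job; fall back to 4 elsewhere.
    local ntasks="${SLURM_NTASKS:-4}"
    # PROGRESS_SECS is an assumed variable name for the --progress interval.
    local interval="${PROGRESS_SECS:-60}"
    printf 'srun -n %s dcp --verbose --progress %s --preserve %s %s\n' \
        "$ntasks" "$interval" "$src" "$dst"
}

# Print the command for review; nothing is copied here.
build_dcp_cmd /path/to/source/directory /path/to/destination/directory
```

Once the printed line looks right, the same command can be placed in the jobscript after "module load mpifileutils".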
This tip is reprinted from this website where you can also find additional details on the topic.