Allocating Ranks Unequally Over Nodes
SLURM can control how the ranks of a job are distributed across the allocated nodes, including unequal distributions. A common case is placing rank 0 alone on the first node so that it has extra memory when gathering data from many worker ranks on the other nodes. The relevant option is --distribution, which controls the distribution of tasks to the nodes on which resources have been allocated, and the distribution of those resources to tasks for binding (task affinity). The available methods include block, cyclic, arbitrary and plane (further details are available in man srun). The arbitrary method allocates processes in order, as listed in the file designated by the environment variable SLURM_HOSTFILE. To build that file, first write the names of the allocated nodes to nodelist.txt:
srun -l /bin/hostname | sort -n | awk '{print $2}' > nodelist.txt
or, to list each allocated node once, use
scontrol show hostname $SLURM_NODELIST > nodelist.txt
Then, export the environment variable.
export SLURM_HOSTFILE=nodelist.txt
Edit the file so that it contains one line per task (64 lines for the command below), each naming the node on which the corresponding rank should run, and then run the executable.
srun --hint=nomultithread -N 3 --ntasks=64 --distribution=arbitrary ./my_exe
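The manual editing step can also be scripted. The following is a minimal sketch, not a SLURM feature: make_hostfile is a hypothetical helper that prints one host name per task, giving the first node to rank 0 alone and spreading the remaining ranks as evenly as possible over the other nodes. It assumes the job has at least two nodes and that every node has enough cores for the tasks assigned to it.

```shell
# Hypothetical helper (not part of SLURM): print one host name per task.
# Usage: make_hostfile NTASKS FIRST_NODE WORKER_NODE [WORKER_NODE ...]
# The function body runs in a subshell, so its variables stay local.
make_hostfile() (
    if [ "$#" -lt 3 ] || [ "$1" -lt 2 ]; then
        echo "make_hostfile: need at least 2 tasks and 2 nodes" >&2
        exit 1
    fi
    ntasks=$1; shift
    first=$1; shift
    nnodes=$#                            # number of worker nodes
    workers=$(( ntasks - 1 ))            # every rank except rank 0
    base=$(( workers / nnodes ))         # tasks each worker node receives
    extra=$(( workers % nnodes ))        # the first 'extra' nodes get one more

    echo "$first"                        # rank 0 alone on the first node
    i=0
    for node in "$@"; do
        count=$base
        if [ "$i" -lt "$extra" ]; then
            count=$(( base + 1 ))
        fi
        j=0
        while [ "$j" -lt "$count" ]; do
            echo "$node"
            j=$(( j + 1 ))
        done
        i=$(( i + 1 ))
    done
)

# Inside a job allocation, build the file from the allocated nodes.
# Outside SLURM this guard is false and nothing is written.
if [ -n "${SLURM_NODELIST:-}" ] && command -v scontrol >/dev/null 2>&1; then
    # Unquoted on purpose: each host name becomes a separate argument.
    make_hostfile 64 $(scontrol show hostnames "$SLURM_NODELIST") > nodelist.txt
    export SLURM_HOSTFILE=nodelist.txt
fi
```

With 64 tasks on 3 nodes, this writes the first node once, the second node 32 times and the third node 31 times. Before launching the real executable, running the same srun command with -l and /bin/hostname in place of ./my_exe, piped through sort -n, is a cheap way to check where each rank lands.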