Allocating Ranks Unequally Over Nodes

SLURM can control how the ranks of a parallel job are distributed across the allocated nodes. The distribution does not have to be equal: for example, rank 0 can be placed alone on the first node so that it has extra memory when gathering data from many worker ranks on the other nodes. The relevant option is --distribution, which controls the distribution of tasks to the nodes on which resources have been allocated, and the distribution of those resources to tasks for binding (task affinity). The available methods include block, cyclic, arbitrary and plane (further details are available in man srun). The arbitrary method allocates processes in order as listed in the file designated by the environment variable SLURM_HOSTFILE. To create a starting point for that file from inside a job allocation, write out the host of each task, one line per task:

srun -l /bin/hostname | sort -n | awk '{print $2}' > nodelist.txt

or, to get one line per allocated node instead, use

scontrol show hostname $SLURM_NODELIST > nodelist.txt 

Then export the environment variable so that it points to this file.

export SLURM_HOSTFILE=nodelist.txt 

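Edit the file so that it has as many lines as there are tasks, each line naming the node on which the corresponding rank should run (the first line is for rank 0, the second for rank 1, and so on), and then run the executable.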

srun --hint=nomultithread -N 3 --ntasks=64 --distribution=arbitrary ./my_exe
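
The manual edit of the host file can also be scripted. The following is only a sketch under stated assumptions: make_hostfile is a hypothetical helper (not a SLURM command), and node001, node002 and node003 are placeholder node names. It prints one host name per task, placing rank 0 alone on the first node and splitting the remaining ranks in blocks over the other nodes.

```shell
#!/bin/sh
# Hypothetical helper, not part of SLURM.
# Usage: make_hostfile NTASKS FIRST_NODE OTHER_NODE...
# Prints one host name per task: FIRST_NODE once (for rank 0), then the
# remaining NTASKS-1 ranks in contiguous blocks over the other nodes.
make_hostfile() {
    ntasks=$1
    first=$2
    shift 2
    nothers=$#                      # number of nodes after the first
    remaining=$((ntasks - 1))       # ranks left once rank 0 is placed
    echo "$first"
    i=0
    for node in "$@"; do
        # Even share per node; earlier nodes absorb any leftover ranks.
        count=$((remaining / nothers))
        if [ "$i" -lt $((remaining % nothers)) ]; then
            count=$((count + 1))
        fi
        j=0
        while [ "$j" -lt "$count" ]; do
            echo "$node"
            j=$((j + 1))
        done
        i=$((i + 1))
    done
}

# Example with placeholder node names: summarise tasks per node.
make_hostfile 64 node001 node002 node003 | sort | uniq -c
```

For 64 tasks on three nodes this gives 1 task on the first node, 32 on the second and 31 on the third. Inside a job, the output could be redirected to nodelist.txt, with the node names taken from scontrol show hostname $SLURM_NODELIST instead of the placeholders.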