How to estimate your memory usage per node

The Unix time command measures how long a given command takes to execute. For example:

@cdl1:~> /usr/bin/time ls
   real    0m0.017s
   user    0m0.004s
   sys    0m0.004s

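When called with the --verbose option, time also reports additional useful information: memory consumption, filesystem input and output operations, and messages exchanged through sockets. This option belongs to the GNU time program, so call it by its full path, /usr/bin/time, rather than relying on the shell's built-in time. In the output, the "Maximum resident set size" line gives the peak physical memory used by the command, which is the figure to use when estimating memory usage. For example: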

@cdl1:~> /usr/bin/time --verbose ls -r

   <output of 'ls' command>
    Command being timed: "ls -r"
    User time (seconds): 0.01
    System time (seconds): 0.01
    Percent of CPU this job got: 61%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.01
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 4912
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 375
    Voluntary context switches: 1
    Involuntary context switches: 1150
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

You may want to apply this command to a job running in parallel on Shaheen. Because time writes its report to the standard error stream, you just need to tell srun to create one error file per task, so that the measurements of each task can be told apart. For example:

    srun --error=job.%t.err -n 4 /usr/bin/time --verbose my_program

will produce 4 error files (job.0.err, job.1.err, job.2.err, and job.3.err), each containing the output of the time command for one of the 4 parallel tasks.
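
To turn these per-task reports into a rough per-node estimate, the peak values can be extracted and added up. The following is a minimal sketch, not an official Shaheen tool: the helper names max_rss_kb and total_rss_kb are made up for this illustration, and it assumes that the error files are named job.<task>.err as in the srun example, that all tasks ran on the same node, and that the report was produced by /usr/bin/time --verbose.

```shell
#!/bin/sh
# Sketch: sum the peak memory reported by /usr/bin/time --verbose per task.
# Assumption: error files were written by srun --error=job.%t.err.

# max_rss_kb FILE (hypothetical helper): print the value of the last
# "Maximum resident set size (kbytes)" line found in FILE, if any.
max_rss_kb() {
    awk -F': ' '/Maximum resident set size/ { v = $2 + 0 }
                END { if (v != "") print v }' "$1"
}

# total_rss_kb FILE... (hypothetical helper): print the sum of the peaks
# over all given files; files without a report count as 0.
total_rss_kb() {
    total=0
    for f in "$@"; do
        rss=$(max_rss_kb "$f")
        total=$((total + ${rss:-0}))
    done
    echo "$total"
}

# Report each task's peak and the sum, if any job.*.err files exist
# in the current directory.
set -- job.*.err
if [ -e "$1" ]; then
    for f in "$@"; do
        echo "$f: $(max_rss_kb "$f") kB"
    done
    echo "sum of per-task peaks: $(total_rss_kb "$@") kB"
fi
```

Because the tasks do not necessarily reach their peaks at the same moment, the sum is an upper bound on the memory used on the node rather than an exact measurement. If the job spans several nodes, only the files of the tasks placed on a given node should be added together.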