Slurm workflow


This guide walks through compiling and running a Fortran program on Unity with Intel's compiler.

Unity can also load and run MATLAB, Python, R, Mathematica, etc. To view the full list of available modules while you're logged in to Unity, type: module avail

Follow along in an example file: SlurmTime.sh

Job workflow

Preparing Fortran files to run on Unity

  • If you reference a path, make sure it uses / instead of \

    • / still works with Windows, according to the wisdom of the internet

    • E.g. if your input files are saved in \input\files, change it to /input/files

  • Compilation script / makefile

    • Linux uses -qopenmp instead of /Qopenmp

    • Linux uses -mkl instead of /Qmkl

    • Linux builds to .o files instead of .obj, so you may need to tweak the compile script

    • Compilation example included in sample file, "SlurmTime" above


Upload your working files

  • In your home folder ( ~ or /home/name.#/ ), use WinSCP to create a new folder

    • On the right side screen, click "New", then choose "Directory"

      • Use the default options, but change the name to something related to your project

      • Life is simpler if you don't use spaces when choosing a folder name

  • Double click the new folder

  • Upload all code, parameter values, inputs, etc.

  • See Uploading files to Unity for process details


Slurm

  • About: Slurm is the workload manager (job scheduler) that assigns your resource requests to actual computer hardware

  • Slurm scripts

    • To start a job, you will need to write a Slurm script that tells the system what you need and what you want it to do

    • See attached example file and modify as needed in Notepad++

      • Be sure to change job description, your email, and the compiled files

      • See the block below for details on how to read/change the sample script located at the top of this page

    • Caution: Slurm scripts need Linux (LF) line breaks, not Windows (CRLF) line breaks
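One way to check for and remove Windows line breaks after uploading is sketched below. The script name myscript.sh is a placeholder, and the first lines only simulate a file saved with Windows line breaks so the example is self-contained. The sed -i command with \r assumes GNU sed.

```shell
demo_dir=$(mktemp -d)
script="$demo_dir/myscript.sh"    # placeholder script name

# Simulate a script saved on Windows: every line ends in CR+LF
printf '#!/usr/bin/env bash\r\necho hello\r\n' > "$script"

# Hold a literal carriage return in a variable for use in grep patterns
cr=$(printf '\r')

# Count the lines that end in a carriage return (0 means the file is clean)
crlf_before=$(grep -c "${cr}\$" "$script" || true)
echo "Lines with Windows line breaks before: $crlf_before"

# Delete the carriage return at the end of every line, editing in place
sed -i 's/\r$//' "$script"

crlf_after=$(grep -c "${cr}\$" "$script" || true)
echo "Lines with Windows line breaks after: $crlf_after"
```

Notepad++ can also convert line breaks before you upload, through its Edit > EOL Conversion menu.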


Walking through the linked Slurm script

  • Don’t change the first line (#!/usr/bin/env bash)

  • Lines starting with #SBATCH are instructions for the scheduler

  • Other lines starting with # are comments

  • There are 4 blocks of SBATCH commands in the example file

    • Name/user stuff to track your job

    • CPU resource requests

      • 1 node, 1 task per node, then pick the number of CPUs you want (for OpenMP implementations)

      • If you coded in MPI, you may want more than 1 node or task per node, but I don’t know much about that

    • Time request

    • Memory request

  • After ALL of the #SBATCH lines, you can put in normal Linux Bash commands

  • There are 3 blocks in the example file:

    • Export OMP_NUM_THREADS: Sets the number of OpenMP threads equal to the number of CPUs you requested

    • Scratch directory

      • I tell the system to create a directory for the unique job ID

        • This way, you don’t have to be as careful about overwriting files if you are running similar jobs

      • I compile everything each time I run so that the program is always compatible with the node it was assigned to

      • Then copy your files over to the working directory

      • Example: If all of my files are in ~/LendingStandards and I submit the job from my home folder, the commands in the script would be

        • cp -R ${SLURM_SUBMIT_DIR}/LendingStandards ${SCRATCH_DIRECTORY}

        • cd ${SCRATCH_DIRECTORY}/LendingStandards

    • Compile the Fortran file

      • module load intel loads the ifort command and other Intel tools

      • The remaining ifort … lines compile my Fortran source files

      • Be sure to use -qopenmp and -mkl options for any source files that need those tools

    • Run the program

      • ./program_name will execute the program

    • When it is done, exit 0

      • You will get an email saying your batch exited with code 0

      • If you get an email saying your batch exited with a different code, it failed for some reason
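To make the walkthrough concrete, here is a sketch of what a script with these blocks might look like. It is not the SlurmTime.sh file itself: the job name, email address, resource amounts, scratch location, and source file names (helpers.f90, main.f90, program_name) are all placeholders, so treat them as assumptions and replace them with your own. The commands write the sketch to a file and run bash -n on it, which checks the syntax without executing anything.

```shell
work_dir=$(mktemp -d)               # throwaway folder for this demo
script="$work_dir/sketch_job.sh"    # hypothetical script name

cat > "$script" <<'EOF'
#!/usr/bin/env bash

# Block 1: name/user settings (placeholders)
#SBATCH --job-name=lending_test
#SBATCH --mail-user=your_email@example.com
#SBATCH --mail-type=BEGIN,END

# Block 2: CPU request (1 node, 1 task, 4 CPUs for OpenMP)
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4

# Block 3: time request (hours:minutes:seconds)
#SBATCH --time=01:00:00

# Block 4: memory request
#SBATCH --mem=8G

# Match the OpenMP thread count to the CPUs requested above
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

# Scratch directory named after the job ID (the /scratch path is an assumption)
SCRATCH_DIRECTORY=/scratch/${USER}/${SLURM_JOB_ID}
mkdir -p ${SCRATCH_DIRECTORY}
cp -R ${SLURM_SUBMIT_DIR}/LendingStandards ${SCRATCH_DIRECTORY}
cd ${SCRATCH_DIRECTORY}/LendingStandards

# Compile on the assigned node (source file names are placeholders)
module load intel
ifort -qopenmp -mkl -c helpers.f90
ifort -qopenmp -mkl -o program_name main.f90 helpers.o

# Run the program, then report success
./program_name
exit 0
EOF

# Syntax check only: nothing inside the script is executed
bash -n "$script" && echo "Syntax OK: $script"
```

The #SBATCH lines stay together at the top because the scheduler stops reading them at the first ordinary command.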

Submitting a job

  1. When you have a Slurm file ready, upload the .sh file to Unity (see Uploading files to Unity for details)

    • To help you manage your files, save each script under a different name

    • (Optional) I save Slurm scripts in my home folder ( ~ or /home/name.#/ )

      • Scripts run from the folder they are started in. I prefer to start from the home folder each time, but you could put things in project-specific subfolders, if you prefer

    • Caution: Slurm scripts need Linux (LF) line breaks, not Windows (CRLF) line breaks

  2. Log into Unity (the prompt should show something like [name.#@unity....]$ )

  3. Type: sbatch myscript.sh

    • Be sure to replace myscript.sh with the name of the Slurm file you uploaded in step 1

    • You will be shown a job ID. Make note of this in case you need to cancel the job

  4. You will get an email when your job starts and when it ends

    • You can see information about your job by logging into Unity and typing: squeue -u name.#

  5. If you need to cancel the job, log into Unity and type: scancel ######

    • ###### is the number of the job you want to cancel

  6. While the job runs, anything the program would normally print to the console is written to a file named with the job number, in your home directory if that is where you submitted the job

    • Example: slurm-3023230.out

    • You can view this file with WinSCP by navigating to it and double clicking

    • Or you can use PuTTY to view the file

      • To read just the last 100 lines, connect to Unity with PuTTY and type: tail -n 100 slurm-3023230.out

      • To read the whole thing instead, type: cat slurm-3023230.out

      • Note: the .out file will be in the same folder as where you started the Slurm .sh file
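The bookkeeping in steps 3 to 6 can be sketched in a few lines of Bash. When a job is accepted, sbatch prints the words "Submitted batch job" followed by the job number. The sketch hard-codes that message with the example job number used above so it runs anywhere; on Unity you would capture the real message instead, as noted in the first comment. myscript.sh and name.# are the same placeholders used in the steps.

```shell
# On Unity this line would be:  submit_msg=$(sbatch myscript.sh)
# Here the message is hard-coded so the sketch runs without Slurm.
submit_msg="Submitted batch job 3023230"

# The job ID is the last word of the message
job_id=${submit_msg##* }

# Default name of the output file Slurm writes in the submit folder
out_file="slurm-${job_id}.out"

echo "Job ID:        $job_id"
echo "Output file:   $out_file"
echo "Check status:  squeue -j $job_id"
echo "Cancel:        scancel $job_id"
echo "Last 100 lines: tail -n 100 $out_file"
```

Saving the job ID in a variable avoids retyping the number for the status, cancel, and output commands.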