Parameter sweep – Python
A parameter sweep is a useful way to submit multiple jobs without having to create an sbatch script for each individual file, or edit the script between submissions. This is useful when you need to process many jobs in the same manner, such as when running files through an LLM or processing many data sets with a common script.
This page will show you how to get a parameter sweep working from scratch on a simple ‘hello world’ program.
First, we need to connect to the HPC and start a session. Instructions can be found here.
Next we’ll need to create a folder for our testing. The name isn’t important, but this guide uses ‘test’; if you choose a different name, you’ll need to adjust some of the commands below to match your setup.
mkdir test
followed by
cd test
You are now inside the new folder.
We now need to set up the folder containing the files the parameter sweep will run over. This folder must be called ‘data’ for the script below to work; if you prefer another name, edit the script to use that folder name instead. We’ll use nano to create a bash file, then run it to do the setup.
For more experienced users you can write and run the bash script below in terminal.
nano setup.sh
then enter the following into the file:
mkdir data
for i in 11 22 33 44 55; do
echo "$i" > data/results_$i.txt
done
Use Ctrl+S and Ctrl+X to save and close the file in nano, then run it:
bash setup.sh
This will create 5 .txt files representing the data we want to pass to the Python script.
I chose 2-digit numbers to differentiate them from other counters that simply count up by 1, making it easy to confirm the script is reading the files correctly.
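As a quick sanity check, you can list the folder and inspect one of the files. The sketch below recreates the same data files as setup.sh and then checks them (it is safe to run again; it simply overwrites the same values):

```shell
# Recreate the data files exactly as setup.sh does
mkdir -p data
for i in 11 22 33 44 55; do
    echo "$i" > data/results_$i.txt
done

ls data                    # lists results_11.txt ... results_55.txt
cat data/results_33.txt    # prints 33
```

If `ls data` shows the five results files with the expected numbers inside, the setup step worked.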
Next we need to create the Python script that will process the data. For this example I’ve created a simple Hello World script that prints the variable it receives:
nano hello.py
import sys
var = sys.argv[1]
print(f"Hello, World {var}")
This reads the first command-line argument and assigns it to an easier variable name to work with, then prints “Hello, World” followed by the value read from the text file.
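Before wiring it into the sweep, you can test hello.py by hand with any value. The sketch below writes the script and runs it once (it uses python3 here; on the cluster the command may simply be python after loading the right module):

```shell
# Write hello.py as shown above, then run it once with a test value
cat > hello.py << 'PYEOF'
import sys
var = sys.argv[1]
print(f"Hello, World {var}")
PYEOF

python3 hello.py 11    # prints: Hello, World 11
```

If this prints the greeting with your test value, the script is ready to use in the sweep.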
While this is a very basic use of the pattern, the same approach applies to importing interview transcripts into a Python script for running through an LLM, or to sorting raw data files in large batches.
Next we need to create the bash script that loops through the data files in a parameter sweep, generating and submitting an sbatch job for each, so the hello world script runs 5 times with different variables.
You can use
nano submit.sh
to create this file, just as we did above.
submit.sh
#!/bin/bash
mkdir -p processed                 # directory for job output and error files
i=1                                # sets the counter for the variable in the job name to 1
for file in data/*.txt; do
    jobname="Multi-Job_$i"
    filename=$(basename "$file")   # file name without the path
    var=$(cat "$file")             # read variable from text file
    cat << EOF | sbatch
#!/bin/bash
#SBATCH -J $jobname
#SBATCH -c 1
#SBATCH --mem=1g
#SBATCH -q workq
#SBATCH -o processed/${jobname}.txt
#SBATCH -e processed/${jobname}.err
cd \$SLURM_SUBMIT_DIR
# Run the Python script with the variable
python hello.py "$var"
EOF
    ((i++))                        # increases the counter in the job name by 1
done
You can add a module load line before the python command at the end to load a specific module; this is how you will normally use these scripts for more complicated work.
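For example, the end of the generated job script might look like the fragment below. The module name ‘python’ is a placeholder; run module avail on your cluster to find the exact module name and version to load:

```shell
cd \$SLURM_SUBMIT_DIR
module load python     # placeholder module name; check 'module avail' on your cluster
python hello.py "$var"
```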
To run the submit.sh file you use the bash command:
bash submit.sh
Running this will loop over the five data files, create the output directory “processed”, and submit 5 separate jobs.
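If you want to see what each generated job script will look like before submitting anything, a dry run that prints the scripts instead of piping them to sbatch can help. This sketch uses a trimmed-down version of submit.sh with only two data files; on the cluster, pipe the heredoc into sbatch as in submit.sh to actually submit:

```shell
# Dry run: print each generated job script instead of submitting it
mkdir -p data
for i in 11 22; do echo "$i" > data/results_$i.txt; done

i=1
for file in data/*.txt; do
    jobname="Multi-Job_$i"
    var=$(cat "$file")
    cat << EOF
#!/bin/bash
#SBATCH -J $jobname
python hello.py "$var"
EOF
    ((i++))
done
```

Once the submitted jobs are running you can monitor them with squeue and, when they finish, read each result from the matching file in the “processed” directory.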
This same method scales from running 5 interviews through an LLM script up to running 5000 data sets through a common script for processing results.
If you think your work would benefit from a parameter sweep script but aren’t sure how to adapt the one above for your project, get in contact with the eResearch team at eresearch@cqu.edu.au; we are always happy to help.