
eResearch

SAMPLE SLURM GPU SUBMISSION SCRIPT

This guide explains how to submit a simple program for execution on the GPU node.

To submit a job to the HPC system, it is recommended to write a script file similar to the one below; a script can easily be edited and re-submitted.

When choosing resources, keep the QoS limits on the HPC cluster in mind. The limits for the H100s are shown in the table below. If you want to run two jobs on the H100s at the same time, halve the QoS limits for each job so that both can be scheduled concurrently.

Partition    gpucomputeq
CPU cores    80
GPUs         2
Memory       480 GB
Wall time    24 hours (1 day)
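For instance, a per-job resource request that halves each limit so two jobs can run side by side might look like the following sketch (values taken from the table above; adjust to your workload):

```shell
#!/bin/bash
##### Per-job request when running two jobs at once on the H100s #####
#SBATCH -p gpucomputeq
#SBATCH -c 40            ## half of the 80 CPU-core limit
#SBATCH --gres=gpu:1     ## half of the 2-GPU limit
#SBATCH --mem=240G       ## half of the 480 GB memory limit
```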

Note: all “[…]” placeholders are variables that you must define.

Example Script (example.slurm)

#!/bin/bash
##### Select resources #####
#SBATCH -J [name of job]
#SBATCH -c [number of CPU cores required, most likely 1]
#SBATCH --mem=[amount of memory required]G
#SBATCH -p [partition name]  ## gpucomputeq
#SBATCH --gres=gpu:[number of GPUs needed]
#SBATCH -t [how long the job should run, in minutes]    ## Remove this line if the length of time required is unknown
#
##### Output File #####
#SBATCH -o [output_file].out    ## If omitted, defaults to slurm-[job number].out
#
##### Error File #####
#SBATCH -e [error_file].err     ## If omitted, defaults to slurm-[job number].err
#
##### Mail Options #####
#SBATCH --mail-type=ALL   ## BEGIN, END, FAIL, REQUEUE, STAGE_OUT, ALL, TIME_LIMIT_[50|80|90]
#SBATCH --mail-user=[your email address]
#
##### Change to current working directory #####
cd $SLURM_SUBMIT_DIR

##### Execute Program #####
./[program executable]

Real Example

#!/bin/bash
###### Select resources ######
#SBATCH -J Job1
#SBATCH -c 10
#SBATCH --mem=40G
#SBATCH -p gpucomputeq
#SBATCH --gres=gpu:1
###### Output File ######
#SBATCH -o job1.out
###### Error File ######
#SBATCH -e Job1.err
###### Mail Options ######
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user=l.decosta@cqu.edu.au

###### Change to current working directory ######
cd $SLURM_SUBMIT_DIR

###### Execute Program ######
module load Python/3.12.3-GCCcore-13.3.0
python ./myprogram.py
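If you want to confirm that a job actually received a GPU, a short check can be added at the start of the “Execute Program” section. Slurm sets `CUDA_VISIBLE_DEVICES` for jobs that request `--gres=gpu`; a sketch:

```shell
##### Optional GPU sanity check #####
# Slurm sets CUDA_VISIBLE_DEVICES to the index(es) of the allocated GPU(s).
echo "Allocated GPUs: ${CUDA_VISIBLE_DEVICES:-none}"
nvidia-smi    ## prints the visible GPU(s) and their current utilisation
```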

Executing the script on the HPC System

To submit a job, log in to one of the “login nodes” and run the following command in a terminal:

sbatch [slurm_script_file].slurm

To check whether your job is running, queued, or completed, use one of the following commands:

squeue

squeue -u [username]
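A few other standard Slurm commands are also handy once a job has been submitted (the job number is printed by `sbatch`; `sacct` output depends on the cluster’s accounting configuration):

```shell
# Cancel a job you no longer need
scancel [job number]

# Show detailed information about a queued or running job
scontrol show job [job number]

# Summarise a completed job's resource usage (requires job accounting)
sacct -j [job number]
```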

Support

eresearch@cqu.edu.au

tasac@cqu.edu.au OR 1300 666 620

Hacky Hour (3pm – 4pm every Tuesday)

High Performance Computing Teams site