eResearch

HPC JOB SCHEDULER

For Marie Curie Cluster

By default, when running anything on the CQUniversity HPC systems, all programs must be executed on the “compute nodes”, unless you are performing a simple task or a quick test.

CQUniversity’s HPC facilities are large, shared resources.  Unlike personal computers, these systems are used by multiple users at the same time.  Because demand on the HPC varies over time, an “HPC Scheduler” is used to manage access.  The scheduler checks whether the requested resources are available: if they are, it executes the job on one of the available compute nodes; if not, the request is “queued” until resources become available.

Executing large jobs on any of the “Login” nodes slows the system down and degrades performance for other users.

CQUniversity’s HPC facilities use “PBS Pro” as the scheduler for resource management.  Information on PBS commands can be found in the “PBS Commands” user guide.

To make using the scheduler easier, a number of PBS sample scripts have been created (see the “HPC Sample Code Scripts” page for details).  Additionally, some simple helper commands have been created to show current HPC usage and to assist with deleting HPC jobs.
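As a rough illustration of what such a submission script contains, the sketch below shows a minimal PBS Pro job script. The job name, queue name, and resource amounts here are illustrative assumptions, not site defaults; consult the “Sample PBS Submission Script” page for values appropriate to your work.

```shell
#!/bin/bash
# Minimal PBS Pro submission script (illustrative values, not site defaults).
#PBS -N example-job                # job name
#PBS -l select=1:ncpus=4:mem=8gb   # one chunk: 4 CPUs, 8 GB memory
#PBS -l walltime=01:00:00          # maximum run time (1 hour)
#PBS -q workq                      # queue name, as seen in the myjobs output

# PBS sets PBS_O_WORKDIR to the directory the job was submitted from;
# fall back to "." so the script can also be run directly for testing.
cd "${PBS_O_WORKDIR:-.}"

echo "Job started on $(hostname)"
# ... run your program here ...
```

A script like this would be submitted with `qsub example.pbs`, and its position in the queue checked with `qstat`.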

Command: qusers
Usage:   Provides an overall summary of HPC usage.

Example output:

    Thu Sep 12 12:32:30 EST 2013

    There are 3 users with jobs on: hawking
    Username  #jobs  #run  #cpus  #Memory  #queued  #other  Real Name
    =========================================================================
    moserg      150   150    150  3000 gb        0       0  Gerhard Moser
    vanderj2      1     1     16    20 gb        0       0  Jeremy VanDerWal
    wuq1          1     1      1     1 gb        0       0  Qing Wu
    =========================================================================
    Totals      152   152    167  3021 gb        0       0
    =========================================================================

Command: myjobs
Usage:   Provides information on your current HPC jobs, and compares the resources requested from the scheduler against the compute resources actually in use for all “R”unning jobs.

Example output:

    bellj@newton:~> myjobs

    Jobs running for bellj
    --------------------------------
    pbsserver:
                                                       Req'd  Req'd   Elap
    Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
    --------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
    407256.pbsserve bellj    workq    Test-run1   65575   4  32    --  01:00 R 00:55
        n005[0]/0*8+n005[0]/1*8+n008[0]/0*8+n008[0]/1*8
    407257.pbsserve bellj    workq    Test-run2   44175   4  32    --  01:00 R 00:55
        n009[0]/0*8+n009[0]/1*8+n022[0]/1*8+n023[0]/0*8
    407260.pbsserve bellj    workq    Test-run5   86742   4  32    --  01:00 R 00:55
        n027[0]/1*8+n028[0]/0*8+n028[0]/1*8+gn002[1]/0*8
    407828.pbsserve bellj    workq    STDIN          --    1   4   10gb   --  Q   --

    ========================================================================
    Job ID             Job         #CPUs       CPU (%)       Memory (gb)   Memory (gb)
                       Name        requested   utilisation   requested     in use
    ========================================================================
    407256.pbsserver   Test-run1   32          99            0             1
    407257.pbsserver   Test-run2   32          98            0             1
    407260.pbsserver   Test-run5   32          98            0             0

Command: deletemyjobs
Usage:   Deletes all of your submitted jobs (both “R”unning and “Q”ueued).
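Under the hood, a deletemyjobs-style helper amounts to selecting all of your job IDs and passing each one to qdel. The sketch below mocks qselect and qdel as shell functions (with job IDs taken from the example output above) so the logic can be demonstrated without a cluster; on Marie Curie the real PBS binaries would be used instead.

```shell
# Mock qselect/qdel as shell functions so the pipeline runs without PBS;
# on the cluster the real binaries would be invoked instead.
qselect() { printf '407256.pbsserver\n407257.pbsserver\n'; }
qdel()    { echo "deleted $1"; }

# Core of a deletemyjobs-style helper: delete every job the user owns,
# whether running or queued.
for job in $(qselect); do
    qdel "$job"
done
```

On a real PBS Pro system the equivalent one-liner would be along the lines of `qselect -u "$USER" | xargs qdel`, which deletes both running and queued jobs for the current user.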

Support

eresearch@cqu.edu.au

tasac@cqu.edu.au or 1300 666 620

Hacky Hour (3pm – 4pm every Tuesday)

High Performance Computing Teams site