Running Python on Ada – CQU eResearch

Running Python on Ada Lovelace Cluster

This guide describes several ways to use Python on the CQUniversity HPC system.

This guide assumes you have successfully mapped the HPC drive and established a graphical connection via Open OnDemand. Python is one of the most popular programming languages. It is used in a variety of research domains and is especially popular in the machine learning and artificial intelligence community.

There are several ways to use Python on CQUniversity’s High Performance Computing system, including using a graphical integrated development environment (IDE), running Python code interactively from the command line, and submitting Python jobs to the scheduler. This page covers the basics of selecting a Python module and points to resources on options such as IDEs and creating Conda environments. A variety of Python modules are installed on the HPC, including Anaconda, Miniconda, and many Python builds, predominantly Python 3.

Loading a Python Module

Once on the HPC, you can open a terminal window and use the command “module avail” followed by a search term to see the preinstalled modules that match it. I’ll use “Python” in the example, but you could also use “conda”, “deep-learning”, or “llm”. This is primarily a search function for finding the exact spelling and version number of a module so that you can load the correct one. The output below has been shortened to selected entries to make this page easier to read. It is not an exhaustive list.

[decostal@ada ~]$ module avail Python


---------------- /apps/modulefiles/easybuild/rhel9/epyc4/all ----------------
Biopython/1.79-foss-2022a
...
GitPython/3.1.40-GCCcore-12.3.0
...
Python-bundle-PyPI/2023.06-GCCcore-12.3.0
...
Python/3.10.4-GCCcore-11.3.0-bare
...
Python/3.12.3-GCCcore-13.3.0-deep-learning-cpu
...
meson-python/0.16.0-GCCcore-13.3.0 (D)
...
pytest/4.6.11-GCCcore-12.3.0-Python-2.7.18
...
python-slugify/8.0.4-GCCcore-13.2.0
...
Use "module keyword key1 key2 ..." to search for all possible modules
matching any of the "keys".

Some commands on Unix systems are case sensitive. If you are having trouble finding a module you know is installed, you may need to check your spelling and capitalisation.

Once you have found the module you want to load, use the command “module load” followed by the module name. For this example I’ll use “Python/3.12.3-GCCcore-13.3.0-deep-learning-cpu”:

module load Python/3.12.3-GCCcore-13.3.0-deep-learning-cpu

These commands support tab completion: once you have typed enough of a command or module name, press Tab to fill in the rest. You can confirm the module is loaded with the command “module list” and show all installed libraries with the command “pip list”.
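As a complementary check from inside Python, the sketch below reports which interpreter is actually running and lists installed libraries in a form similar to “pip list”. It uses only the standard library; the function names describe_interpreter and installed_packages are illustrative and are not part of any HPC tooling.

```python
import sys
from importlib import metadata


def describe_interpreter():
    """Return the version and path of the Python interpreter that is running."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return version, sys.executable


def installed_packages():
    """Return a sorted list of (name, version) pairs, similar to `pip list`."""
    packages = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:  # skip entries with missing or broken metadata
            continue
        packages.add((name, dist.version))
    return sorted(packages, key=lambda pair: pair[0].lower())


if __name__ == "__main__":
    version, path = describe_interpreter()
    print(f"Python {version} at {path}")
    # Show only the first few entries to keep the output short.
    for name, pkg_version in installed_packages()[:10]:
        print(f"{name}=={pkg_version}")
```

If the reported path does not point into the module you loaded, a different Python is probably first on your PATH.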

IDEs on the HPC

Now that you’re on the system and have Python loaded, you may want to work within an IDE such as Spyder or Jupyter Notebook. Information on accessing an IDE can be found here.

What module should I load?

If you’re just starting out, you may not know which module you need, or how to find it without combing through every module for the software you want to use. The eResearch team provides a script that searches the pip list of each module to find which modules contain a particular library. Running it shows which HPC Python modules have that library preinstalled, giving you a starting point for using Python or for building a custom environment.

check_python_module.sh (module you want)

For example:

check_python_module.sh tensorflow

Once you’ve got the module you need you can load up an IDE or Python script and start using the HPC.
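After loading a candidate module, you can also confirm from within Python that a library is importable. The sketch below is not the eResearch script; it is a minimal standard-library check of the currently loaded environment only, and the function name library_status is illustrative. It expects a top-level import name such as “tensorflow”.

```python
from importlib import metadata
from importlib.util import find_spec


def library_status(import_name, dist_name=None):
    """Report whether a library can be imported, and its version if known.

    import_name is the top-level name used in an `import` statement.
    dist_name is the name used with pip, which sometimes differs
    (it defaults to import_name).
    """
    if find_spec(import_name) is None:
        return f"{import_name}: not available in this environment"
    try:
        version = metadata.version(dist_name or import_name)
    except metadata.PackageNotFoundError:
        # Importable, but no matching pip metadata (e.g. standard library).
        version = "unknown version"
    return f"{import_name}: available ({version})"


if __name__ == "__main__":
    for name in ("json", "numpy", "tensorflow"):
        print(library_status(name))
```

The check does not import the library itself, so it stays quick even for large packages.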

What if the module is missing some libraries I need?

As you continue using Python, you have very likely installed custom libraries on your own system that you’d like to use on the HPC. Those libraries may be available in another module, but what if you need several that are split between separate modules? There are several options, and we recommend getting in touch with eresearch@cqu.edu.au to discuss which would be best, as we may already have a solution.

Option 1, Install missing libraries locally.

If you’re familiar with setting up a Python environment, you will have used pip install before. If you try using pip install on the HPC you will get an error, because this is a shared environment and your account does not have permission to install to the system. To get around this restriction, you can install libraries to your local folder, where you do have permission.

To start, load a module that has most of the libraries you need, to save you the work of installing them yourself. For example:

module load Python/3.12.3-GCCcore-13.3.0-deep-learning-cpu

Now you can add missing libraries by using the command “pip install --user” followed by the name of the library to install.

This will install these libraries to your local user folder.
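To see where that user folder is for the Python you have loaded, the standard library’s site module can report it. This is a minimal sketch; the function names are illustrative. On Linux the location is usually under ~/.local, but it depends on the Python version and on settings such as PYTHONUSERBASE.

```python
import site
import sys


def user_install_location():
    """Return the directory where `pip install --user` places libraries."""
    return site.getusersitepackages()


def user_site_is_searched():
    """Return True if the user directory is on this interpreter's import path.

    Python only adds the directory at start-up if it already exists, so this
    can be False before the first `--user` install.
    """
    return user_install_location() in sys.path


if __name__ == "__main__":
    print("User install location:", user_install_location())
    print("Searched by this interpreter:", user_site_is_searched())
```

Because the path includes the Python version, libraries installed this way under one Python module may not be visible after loading a module with a different Python version.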

Option 2, Install missing libraries to a custom Conda environment.

Using a similar process to the one above, you can create a custom environment, which may be useful if you need to work on different projects. You can load and unload these environments easily.

More information on creating and installing libraries can be found here.
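When switching between environments, it can help to confirm from inside Python which one is active. The sketch below assumes Conda’s usual behaviour of setting the CONDA_DEFAULT_ENV environment variable on activation; the function name active_environment is illustrative.

```python
import os
import sys


def active_environment():
    """Return (conda_env_name, prefix) for the running interpreter.

    conda_env_name is None when no Conda environment is activated.
    prefix is the installation directory of the Python in use.
    """
    return os.environ.get("CONDA_DEFAULT_ENV"), sys.prefix


if __name__ == "__main__":
    name, prefix = active_environment()
    print("Conda environment:", name if name else "(none active)")
    print("Python prefix:", prefix)
```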

Submitting jobs to the scheduler

Once you have your module and environment set up and working, you can write scripts that let the HPC load and run your programs, allowing you to use more resources and run jobs simultaneously. Examples can be found here.
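Whatever the job script looks like, the Python program it runs should work without any interactive input. The sketch below shows one way to structure such a program: settings come from command-line arguments and results are printed or written to a file. The workload, the argument names --count and --output, and the file layout are all placeholders for illustration, not part of the HPC examples.

```python
import argparse
import json
from pathlib import Path


def parse_args(argv=None):
    """Read settings from the command line instead of prompting the user."""
    parser = argparse.ArgumentParser(description="Example batch-friendly script")
    parser.add_argument("--count", type=int, default=10,
                        help="how many values to include in the sum")
    parser.add_argument("--output", type=Path, default=None,
                        help="optional file to write the result to")
    return parser.parse_args(argv)


def compute(count):
    """Placeholder workload: sum of squares of 0..count-1."""
    return sum(i * i for i in range(count))


def main(argv=None):
    args = parse_args(argv)
    result = {"count": args.count, "sum_of_squares": compute(args.count)}
    text = json.dumps(result, indent=2)
    if args.output is not None:
        args.output.write_text(text)  # keep results after the job ends
    else:
        print(text)
    return result


if __name__ == "__main__":
    main()
```

Running the same script with different arguments makes it straightforward to submit several jobs that differ only in their inputs.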