HPC

To take advantage of the HPC resources available within ANL, we have developed a job scheduling system based on EQ/SQL. It coordinates the running of arbitrary Python jobs based on polaris-studio across the heterogeneous hardware that is available. At a high level this looks like:

  1. Worker loops run on the available machine(s) and communicate back to a central database

  2. Jobs are added to the database and monitored from a Jupyter notebook

  3. Each worker instance picks up and runs jobs whenever it has capacity, until no jobs remain.

It is the responsibility of the individual jobs to get the data that they need and to copy back any results that they generate.
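The pattern described above (a central database acting as a queue, with workers that claim and run jobs until none remain) can be illustrated with a minimal sketch. This is not the EQ/SQL or polaris-studio API: the table layout and the names `create_queue`, `submit_job`, `claim_next_job` and `worker_loop` are hypothetical, and an in-memory SQLite database stands in for the central database purely so the example is self-contained.

```python
import sqlite3


def create_queue(conn):
    """Create a hypothetical jobs table (not the real EQ/SQL schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'queued',"
        " worker TEXT)"
    )
    conn.commit()


def submit_job(conn, payload):
    """Add a job to the queue and return its id (step 2 in the list above)."""
    cur = conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    conn.commit()
    return cur.lastrowid


def claim_next_job(conn, worker_id):
    """Claim the oldest queued job, or return None when the queue is empty."""
    while True:
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE status = 'queued' "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        job_id, payload = row
        # The status check in the WHERE clause guards against another
        # worker having claimed the same row in the meantime.
        cur = conn.execute(
            "UPDATE jobs SET status = 'running', worker = ? "
            "WHERE id = ? AND status = 'queued'",
            (worker_id, job_id),
        )
        conn.commit()
        if cur.rowcount == 1:
            return job_id, payload


def worker_loop(conn, worker_id, run_job):
    """Run jobs one at a time until no queued jobs remain (steps 1 and 3)."""
    processed = []
    while True:
        claimed = claim_next_job(conn, worker_id)
        if claimed is None:
            break
        job_id, payload = claimed
        try:
            # The job itself fetches its inputs and copies back its results.
            run_job(payload)
            status = "finished"
        except Exception:
            status = "failed"
        conn.execute("UPDATE jobs SET status = ? WHERE id = ?", (status, job_id))
        conn.commit()
        processed.append(job_id)
    return processed


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_queue(conn)
    for name in ("model-a", "model-b"):
        submit_job(conn, name)
    worker_loop(conn, "worker-1", lambda payload: print("running", payload))
    print(conn.execute("SELECT id, status FROM jobs").fetchall())
```

In this sketch the job callable is responsible for its own data movement, mirroring the division of responsibility described above; the queue only tracks which job is in which state.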

An overview is given below:

EQSQL Architecture Overview