Worker Nodes on LCRC#

Worker nodes can be launched on LCRC HPCs like Improv, Bebop, or Crossover. The following steps ensure a worker node can be set up for use with a study.

One-Time Environment Setup For A New Study#

The example below uses Crossover; other LCRC HPCs are accessed in the same way.

  • Step 1: SSH into crossover.lcrc.anl.gov from a terminal. Options include:

    • Using the PuTTY terminal shortcut in WinSCP (which reuses the Crossover connection already configured in WinSCP): log in to WinSCP, then click the PuTTY terminal button.

    • Directly in PuTTY or another terminal, for example with a plain ssh command:
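
      The username placeholder below is illustrative; replace it with your own LCRC account name.

      ssh <your_lcrc_username>@crossover.lcrc.anl.gov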

  • Step 2: Load the relevant modules for the HPC so that Python and GCC are available. For a list of modules on LCRC, go to Bebop and Crossover. Note: This step is needed every time the terminal is reopened or restarted.
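
    A minimal sketch of what this can look like is below; the module names are placeholders, so check module avail for the actual modules on the cluster you are using:

    module avail                 # list the modules available on this cluster
    module load gcc              # placeholder names - load the GCC and
    module load anaconda3        # Python/Anaconda modules appropriate for this HPC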

  • Step 3: Create a Conda environment for use with the worker. Workers look for the HPC name in the environment name and select the first match, so be sure to maintain only one such environment per HPC.

    conda create -n polarislib_xover_eqsql
    conda activate polarislib_xover_eqsql
    

    Note: If this is your first conda environment in this shell, you may need to run conda init bash and then restart the terminal before conda activate will work.
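
    If conda has not been initialized for bash yet, a typical one-time sequence looks like this (standard conda commands, nothing LCRC-specific):

    conda init bash                          # write conda's shell hooks into ~/.bashrc
    source ~/.bashrc                         # or close and reopen the terminal
    conda activate polarislib_xover_eqsql    # activation should now succeed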

    Each new session then consists of loading the HPC-specific modules (from a saved list) and activating this conda environment.
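
    If the cluster's module system is Lmod, the loaded modules can be saved as a named collection and restored as a group in later sessions; the collection name below is illustrative:

    module save polaris_worker       # save the currently loaded modules under a name
    module restore polaris_worker    # reload that saved list in a new session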

  • Step 4: Set up your conda environment to use a supported version of Python.

    conda install python=3.10
    

    Note: Supported versions of Python can be found at Installation
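
    To confirm that the environment now resolves to the expected interpreter:

    python --version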

  • Step 5: Install polaris-studio into your Python environment.

    • Navigate to the POLARIS project, which is a shared space for everyone at TSM:

      cd /lcrc/project/POLARIS
      

      Note: If you do not have access to the POLARIS project, please contact Griffin White.

    • Clone the polaris-studio repository into a STUDY folder:

      cd /lcrc/project/POLARIS/STUDY
      git clone https://git-out.gss.anl.gov/polaris/code/polarislib.git
      
    • Install polaris-studio from inside the cloned repository.

      cd polarislib
      python -m pip install -e .[dev]
      

      Note: The hpc configuration is sufficient for workers, but if the study requires additional dependencies they need to be installed into the conda environment. The example above uses the dev option, which is comprehensive. Building models from CSVs requires the builder option. For a complete set of options, visit Installation Options
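
      For example, a study that builds models from CSVs would need the builder extra mentioned above; this sketch installs it from the same cloned polarislib folder (the quotes protect the brackets from the shell):

      python -m pip install -e ".[builder]"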

  • Step 6: Set up Globus authentication so that magic_copy can copy files between the VMS file servers and the LCRC filesystem.

    cd /lcrc/project/POLARIS/STUDY/polarislib
    python bin/authenticate_globus.py
    

    Note: Follow the prompts, which include logging in to Globus and pasting the resulting token back into the terminal to complete the authentication.

  • Step 7: Log in to www.globus.org with your Argonne credentials.

    • Check that you are part of the POLARIS Group. If not, contact Jamie, Griffin, Josh, or Murthy to be added to this group.

    • A collection/endpoint for the study needs to be set up for moving files between specific folders on the VMS file server and the LCRC filesystem.

Launching a worker#

From the polaris-studio folder that was cloned, navigate to the bin/hpc directory to launch a worker.

cd /lcrc/project/POLARIS/STUDY/polarislib/bin/hpc

Schedule a job using the pre-prepared shell script for LCRC. The following example launches a worker using 32 of the 128 threads on a Crossover node, with a runtime of 7 days.

sbatch --time=7-00:00:00 --partition=TPS --account=TPS --ntasks-per-node=32 worker_loop_lcrc.sh 

Note: Keep in mind that there is a built-in idle timeout, so if no jobs are currently running, these worker jobs will cancel automatically after 30 minutes.

For details about scheduling a job on LCRC, please visit: Schedule job
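
To check on or cancel a submitted worker job, the standard Slurm commands apply; the job ID below is a placeholder taken from the sbatch output.

squeue -u $USER      # list your queued and running jobs on this cluster
scancel <jobid>      # cancel the worker job if it is no longer needed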

Helpful commands#

The project allocations available to your account for running POLARIS studies can be checked with:

lcrc-sbank -q balance