Job Manager for local execution of ATK scripts

Version: 2017.0

Execute ATK simulations via the Job Manager

In this section you will learn how to use the Job Manager for local execution of ATK scripts. Specifically, you will learn about queuing, running and managing ATK jobs.

Create a new empty project and download the example script silicon.py, which runs an ATK-DFT calculation with a very dense k-point grid (31x31x31).

Drop it on the job_manager_icon Job Manager and select a local machine for the job execution.

snap_run_local

The job is now in the task state “Pending” with the default “Threaded parallel (Single process)” settings. Click the Job Settings jm_preferences_enabled_icon icon to edit the job settings.

snap2

The Job Settings widget has three basic panels:

  • Job type;
  • Job properties;
    • Threading
    • MPI
  • Use separate temporary directory (set this to run the job in a non-default working directory. Note that results will not be appended to existing HDF5 or NetCDF (nc) files; a new file will be created instead.)

snap_job_settings_default

Set these settings according to your needs, and click OK.

Back in the Job Manager, click the Run jm_play_enabled_icon icon to start the job. The task state changes from “Pending” to “Running”.

snap4

The job finishes after about 1 minute on a 2.5 GHz CPU, and the task state changes to “Finished”. You can inspect the job log file by clicking the LOG icon: jm_log_icon

snap5

snap6

The job output appears on the VNL LabFloor after job execution.

snap7

Back in the Job Manager, the Property–Value list shows all details of the settings used for job execution, including

  • path to the ATK executable;
  • name of the Python script and the log file;
  • threading and other parallelization options.

snap10

You can use the Resubmit jm_resubmit_enabled_icon icon to resubmit a script. Note that any changes that have been made to the script will be picked up by the new job.

snap11

Note

Remember that the default job type is “Threaded parallel (Single process)”. You can change this to “Serial” or “Multiprocess parallel” before starting the job.

Use the Trash jm_garbage_bin_enabled_icon icon to remove jobs from the job queue.

snap12

Serial execution

In the Job Settings window, select the Serial job type, as shown in the figure below, to run on a single process with no threading. Note that threading is turned off (the number of threads is 1) and MPI parallelization is not available.

snap_job_settings_serial

If you check the system load during local execution in serial, you should see that the serial job launches only a single computing task on a single CPU core.

snap8

Only one core is used at a time, but the operating system scheduler may move the task between cores from time to time.

snap9

Threading

In the Job Settings window, select the Threaded parallel (Single process) job type, as shown in the figure below, to run on a single process with threading.

snap_job_settings_default

Threading is one way to parallelize a computational job. ATK uses Intel MKL for OpenMP threading. Note that we generally recommend MPI parallelization over threading for DFT calculations. However, threading is often more efficient for ATK-ForceField calculations.
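When a script is launched outside the Job Manager, the number of threads is commonly controlled through environment variables. The sketch below prepares such an environment in Python. OMP_NUM_THREADS and MKL_NUM_THREADS are standard OpenMP and Intel MKL variables, but whether your ATK build honours them is an assumption and is not stated in this tutorial. The helper name threaded_env is hypothetical.

```python
import os


def threaded_env(n_threads, base_env=None):
    """Return a copy of the environment with thread-count variables set."""
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    # Standard OpenMP and Intel MKL variables; that the ATK build honours
    # them is an assumption, not something stated in this tutorial.
    env["OMP_NUM_THREADS"] = str(n_threads)
    env["MKL_NUM_THREADS"] = str(n_threads)
    return env


if __name__ == "__main__":
    env = threaded_env(4)
    print(env["OMP_NUM_THREADS"], env["MKL_NUM_THREADS"])
```

The returned dictionary can be passed as the env argument of subprocess.call when starting atkpython from your own launcher script.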

Download the script cnt.py, which uses ATK-ForceField to calculate the dynamical matrix of a multiwall carbon nanotube.

Execute it using the Job Manager, and choose the job type “Threaded parallel (Single process)”. The calculation should finish quickly.

snap13

If you check the system load during execution of the calculation, you should see that only a single atkpython process is started, even though several cores appear to be busy. This is because the workload of the one process is split into a number of threads that may be distributed over several cores.

snap15

snap14
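If you prefer to check the thread count from a script rather than a system monitor, a minimal sketch for Linux is shown here. It reads the "Threads:" field of /proc/<pid>/status, which is a Linux kernel interface and not part of ATK. The function names are hypothetical, and the process ID of the running atkpython process must be looked up separately.

```python
def parse_thread_count(status_text):
    """Extract the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def thread_count(pid):
    """Return the number of threads of a running process (Linux only)."""
    with open("/proc/{}/status".format(pid)) as handle:
        return parse_thread_count(handle.read())
```

A single atkpython process reporting more than one thread is consistent with the behaviour described above: one process, several threads.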

To test the performance of ATK-ForceField simulations with threading, also run the cnt.py calculation in serial. You will see that the wall-clock time used for evaluating ATK-ForceField forces may decrease significantly when threading is switched on. In the example shown below, the time spent on force calculations is roughly halved.

snap16
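To quantify such a comparison, you can compute the speedup and the parallel efficiency from the two wall-clock times reported in the log files. The sketch below uses the usual definitions (speedup = serial time / parallel time, efficiency = speedup / number of workers). The function names and the timings in the example are hypothetical and are not taken from the run shown above.

```python
def speedup(serial_seconds, parallel_seconds):
    """Ratio of serial to parallel wall-clock time."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("timings must be positive")
    return serial_seconds / parallel_seconds


def parallel_efficiency(serial_seconds, parallel_seconds, n_workers):
    """Speedup divided by the number of threads or processes used."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return speedup(serial_seconds, parallel_seconds) / n_workers


if __name__ == "__main__":
    # Hypothetical timings: 10 s in serial, 5 s with 4 threads.
    print(speedup(10.0, 5.0), parallel_efficiency(10.0, 5.0, 4))
```

A halved force-calculation time corresponds to a speedup of 2; the efficiency then depends on how many threads were used to reach it.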

MPI parallelization

Important

If you are running ATK 2016 or earlier, you need to have MPI installed on your local machine. If it is not installed, see the guide MPI setup for running ATK 2016 in parallel.

Both the Linux and Windows versions are compiled against the Intel MPI library. Since ATK 2017, Intel’s mpiexec.hydra is shipped with both the Windows and Linux versions; this is the recommended way to run ATK in parallel.

In Job Settings, choose Multiprocess parallel and set the number of MPI processes, e.g. 4.

snap_job_settings_mpi

../../_images/jm_18.png

Fig. 22 The Property–Value list shows the name of the MPI executable and that 4 processors are used for MPI.

Running from the command line

If you wish to run ATK in parallel from the command line, you can use the mpiexec.hydra binary shipped with ATK, which is located in the libexec folder of your installation folder.

In this case you can run parallel jobs with:

$ QW_INSTALLATION_PATH/libexec/mpiexec.hydra -n 4 atkpython atk_script.py

Hint

Prepend QW_INSTALLATION_PATH/libexec/ to your PATH:

$ export PATH=QW_INSTALLATION_PATH/libexec:$PATH

This way you will automatically pick up the mpiexec.hydra binary shipped with ATK:

$ mpiexec.hydra -n 4 atkpython atk_script.py
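If you launch many scripts this way, the command line above can be assembled from Python. The following is a minimal sketch, not an official ATK interface: the helper names build_mpi_command and run_mpi_job are hypothetical, and QW_INSTALLATION_PATH remains a placeholder that you must replace with your actual installation folder.

```python
import os
import subprocess


def build_mpi_command(install_path, n_procs, script, interpreter="atkpython"):
    """Return the argument list for an MPI launch of an ATK script."""
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    # mpiexec.hydra is shipped in the libexec folder of the installation.
    launcher = os.path.join(install_path, "libexec", "mpiexec.hydra")
    return [launcher, "-n", str(n_procs), interpreter, script]


def run_mpi_job(install_path, n_procs, script):
    """Launch the job and return its exit code (needs a working ATK install)."""
    return subprocess.call(build_mpi_command(install_path, n_procs, script))


if __name__ == "__main__":
    # Placeholder path: replace with your actual installation folder.
    print(" ".join(build_mpi_command("QW_INSTALLATION_PATH", 4, "atk_script.py")))
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.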

Machine Manager

It may sometimes be convenient to have a predefined local machine that is set up with MPI parallelization as default mode. You can easily add such a machine yourself.

In the Job Manager main window, click jm_cluster_icon to open the Machine Manager, and click New ‣ Local.

Then edit the default job settings of the new machine in the window that pops up:

  • Name the machine, e.g. “Local (2017.0) - 4 MPI”.
  • Select Multiprocess Parallel as job type.
  • Make sure threading is turned off (Number of threads = 1).
  • Choose the default number of processors, e.g. 4.
  • Click OK to add the new machine to the Machine Manager.

snap20

../../_images/jm_20__2.png

Fig. 23 You can add as many custom machines to the Machine Manager as you like.