Generation of benchmarks

Before we can run and analyze benchmarks, we first need to generate them with MDBenchmark. All options for benchmark generation are accessible via mdbenchmark generate. The options presented in the following text can be chained together, in no particular order, in a single call to mdbenchmark generate. Before actually writing any files, you will be prompted to confirm the action. You can skip this confirmation with the --yes option.
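
For example, a single call can combine several of the options described below and skip the confirmation prompt; the file name, module version and node count used here are placeholders:

mdbenchmark generate --name protein --module gromacs/2018.3 --max-nodes 10 --yes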

Specifying the input file

MDBenchmark requires one file to generate GROMACS benchmarks and three files for NAMD. The base name of the input file is provided via the -n or --name option to mdbenchmark generate. The following table lists all files required by the given MD engine.

MD engine   Required files
GROMACS     .tpr
NAMD        .namd, .psf, .pdb

If your input file is called protein.tpr, then the base name of the file is protein and you need to call:

mdbenchmark generate --name protein
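
The same applies to NAMD, where all three input files must share one base name. Assuming they are called protein.namd, protein.psf and protein.pdb, the call is identical:

mdbenchmark generate --name protein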

Choosing an MD engine for the benchmark(s)

MDBenchmark assumes that your HPC cluster uses the modules package to manage the loading of MD engines. When given the name of a supported MD engine, it will try to find the specified version:

mdbenchmark generate --module gromacs/2018.3

It is also possible to specify two or more modules at the same time. MDBenchmark will generate the correct number of benchmark systems for the respective MD engines, sharing all other given options:

mdbenchmark generate --module gromacs/2018.3 --module gromacs/2018.2

It is also possible to mix and match MD engines in a single mdbenchmark generate call, as long as the base name of the input files is the same (see above):

mdbenchmark generate --module gromacs/2018.3 --module namd/2.12

Skipping module name validation

If MDBenchmark cannot determine the naming scheme of your MD engine modules, it will warn you but continue generating the benchmarks. Conversely, if it can determine the naming scheme but is unable to find the specified version, benchmark generation fails. If you are sure that the name is correct and MDBenchmark is wrong, you can force the generation of benchmark systems with the --skip-validation option:

mdbenchmark generate --skip-validation

Defining the number of nodes to run on

Benchmarks are especially helpful if you want to figure out how many nodes you should run your MD job on. You can provide MDBenchmark with a range of node counts to benchmark. The two options defining the range are --min-nodes and --max-nodes for the lower and upper limit of the range, respectively. If you do not specify either of these options, MDBenchmark uses the default values of --min-nodes=1 and --max-nodes=5. This generates a total of 5 benchmarks, running on 1, 2, 3, 4 and 5 nodes.
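
For example, to benchmark on 4 to 8 nodes instead (the node counts here are only an illustration):

mdbenchmark generate --min-nodes 4 --max-nodes 8

Following the behavior described above, this would generate 5 benchmarks, running on 4, 5, 6, 7 and 8 nodes.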

Listing available hosts

MDBenchmark comes with two pre-defined templates for the MPCDF clusters draco and hydra. You can easily create your own job templates, as described here. You can list all available job templates via:

mdbenchmark generate --list-hosts

Defining the job template to run from

MDBenchmark will try to look up the hostname of your current machine and search for a job template with the same name. If it cannot find the correct file, or you want to use one you have written yourself, e.g., one named my_job_template, simply use the --host option:

mdbenchmark generate --host my_job_template

Running on CPUs or GPUs

Depending on your setup, you might want to run your simulations only on GPUs or only on CPUs. You can do so with the --cpu/--no-cpu and --gpu/--no-gpu flags, or their short forms -c/-nc and -g/-ng, respectively. If neither option is given, benchmarks are generated for CPUs only. The default template for the MPCDF cluster draco showcases the ability to run benchmarks on GPUs:

mdbenchmark generate --gpu

This generates benchmarks for both GPU and CPU partitions. If you only want to run on GPUs, this is easily achieved with:

mdbenchmark generate --gpu --no-cpu

Limiting the run time of benchmarks

You want your benchmarks to run long enough for the MD engine to finish optimizing its performance, but short enough not to waste computing time. We currently default to 15 minutes per benchmark, but common system sizes (less than 1 million atoms) can usually be benchmarked in 5-10 minutes on modern HPC systems. To change the run time per benchmark, simply use the --time option:

mdbenchmark generate --time 5

This would run each benchmark for five minutes.

Changing the job name

If you want your benchmark jobs to have specific names, use the --job-name option:

mdbenchmark generate --job-name my_benchmark