Generation of benchmarks¶
Before we can run and analyze benchmarks, we first need to generate them with
MDBenchmark. All options for benchmark generation are accessible via
mdbenchmark generate. The options presented in the following text can be
chained together, in no particular order, in a single call to mdbenchmark
generate. Before actually writing any files, you will be prompted to confirm the
action. You can skip this confirmation with the --yes option.
Specifying the input file¶
MDBenchmark requires one file to generate GROMACS benchmarks and three files for
NAMD. The base name of the input file is provided via the -n or --name option to
mdbenchmark generate. The following table lists all files required by each MD
engine.
MD engine | Required files |
---|---|
GROMACS | .tpr |
NAMD | .namd , .psf , .pdb |
If your input file is called protein.tpr, then the base name of the file is
protein and you need to call:
mdbenchmark generate --name protein
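If you are scripting this step, the base name can be derived from the file name rather than typed by hand. The following is a minimal sketch, assuming a GROMACS input file named protein.tpr as in the example above; the variable names are illustrative, and the command is only printed, not executed.

```shell
# Derive the base name expected by --name from an input file name.
input_file="protein.tpr"          # example file name from the text above
base_name="${input_file%.tpr}"    # strip the .tpr suffix

# Print the resulting call for inspection instead of running it.
generate_cmd="mdbenchmark generate --name ${base_name}"
echo "$generate_cmd"
```

For NAMD, the same base name would have to match all three required files (.namd, .psf and .pdb).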
Choosing an MD engine for the benchmark(s)¶
MDBenchmark assumes that your HPC uses the modules package to manage loading of MD engines. When given the name of a supported MD engine, it will try to find the specified version:
mdbenchmark generate --module gromacs/2018.3
It is also possible to specify two or more modules at the same time. MDBenchmark will generate the correct number of benchmark systems for the respective MD engines, sharing all other given options:
mdbenchmark generate --module gromacs/2018.3 --module gromacs/2018.2
It is also possible to mix and match MD engines in a single mdbenchmark generate
call, provided the base name of the input files is the same (see above):
mdbenchmark generate --module gromacs/2018.3 --module namd/2.12
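When the list of modules grows, the repeated --module options can be assembled in a loop. This is a small sketch in POSIX shell, assuming the module names from the examples above and the base name protein; the variable names are illustrative, and the call is printed rather than executed.

```shell
# Build one --module option per MD engine version to benchmark.
# The module names are the ones used in the examples above; adjust to your cluster.
modules="gromacs/2018.3 gromacs/2018.2"

module_args=""
for module in $modules; do
    module_args="${module_args} --module ${module}"
done

# Print the call for inspection rather than executing it.
generate_cmd="mdbenchmark generate --name protein${module_args}"
echo "$generate_cmd"
```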
Skipping module name validation¶
If MDBenchmark cannot determine the naming of your MD engine modules, it will
warn you, but continue generating the benchmarks. Conversely, if it can
determine the naming, but is unable to find the specified version, benchmark
generation fails. If you are sure that the name is correct and MDBenchmark is
wrong, you can force the generation of benchmark systems with the
--skip-validation option:
mdbenchmark generate --skip-validation
Defining the number of nodes to run on¶
Benchmarks are especially helpful if you want to figure out how many nodes you
should run your MD job on. You can provide MDBenchmark with a range of nodes to
run benchmarks on. The two options defining the range are --min-nodes and
--max-nodes for the lower and upper limit of the range, respectively. If you do
not specify either of these two options, MDBenchmark will use the default values
of --min-nodes=1 and --max-nodes=5. This would generate a total of 5 benchmarks,
one each on 1, 2, 3, 4 and 5 nodes.
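To see which node counts a given range covers, the range can be enumerated before generating anything. The sketch below assumes the range is inclusive on both ends, as the default example (1 to 5 nodes giving 5 benchmarks) implies; the values 2 and 6 and the base name protein are arbitrary examples, and the final command is only printed.

```shell
# Enumerate the node counts covered by an inclusive --min-nodes/--max-nodes range.
min_nodes=2
max_nodes=6

node_counts=""
nodes=$min_nodes
while [ "$nodes" -le "$max_nodes" ]; do
    # Append the current node count, separated by a space after the first entry.
    node_counts="${node_counts}${node_counts:+ }${nodes}"
    nodes=$((nodes + 1))
done

n_benchmarks=$((max_nodes - min_nodes + 1))
echo "Node counts: ${node_counts}"
echo "Number of benchmarks: ${n_benchmarks}"

# Print the corresponding call for inspection rather than executing it.
echo "mdbenchmark generate --name protein --min-nodes ${min_nodes} --max-nodes ${max_nodes}"
```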
Listing available hosts¶
MDBenchmark comes with two pre-defined templates for the MPCDF clusters draco and hydra. You can easily create your own job templates, as described here. You can list all available job templates via:
mdbenchmark generate --list-hosts
Defining the job template to run from¶
MDBenchmark will try to look up the hostname of your current machine and search
for a job template with the same name. If it cannot find the correct file or you
want to use one you have written yourself, e.g., named my_job_template, simply
use the --host option:
mdbenchmark generate --host my_job_template
Running on CPUs or GPUs¶
Depending on your setup, you might want to run your simulations only on GPUs or
only on CPUs. You can do so with the --cpu/--no-cpu and --gpu/--no-gpu flags, or
their short forms -c/-nc and -g/-ng, respectively. If neither option is given,
benchmarks will be generated for CPUs only. The default template for the MPCDF
cluster draco showcases the ability to run benchmarks on GPUs:
mdbenchmark generate --gpu
This generates benchmarks for both GPU and CPU partitions. If you only want to run on GPUs, use:
mdbenchmark generate --gpu --no-cpu
Limiting the run time of benchmarks¶
You want your benchmarks to run long enough for the MD engine to finish
optimizing its performance, but short enough not to waste too much computing
time. We currently default to 15 minutes per benchmark, but think that common
system sizes (less than 1 million atoms) can be benchmarked in 5-10 minutes on
modern HPC systems. To change the run time per benchmark, simply use the --time
option:
mdbenchmark generate --time 5
This would run each benchmark for five minutes.
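To judge how much computing time a set of benchmarks may consume, a rough upper bound can be computed from the node range and the run time. This is only a back-of-the-envelope sketch, assuming one benchmark per node count, that every job uses its full time limit, and the default values mentioned in this section (1 to 5 nodes, 15 minutes); queueing and start-up overhead are ignored.

```shell
# Rough upper bound on node-minutes consumed by one set of benchmarks.
min_nodes=1
max_nodes=5
minutes_per_benchmark=15

total_node_minutes=0
nodes=$min_nodes
while [ "$nodes" -le "$max_nodes" ]; do
    # A benchmark on N nodes occupies N nodes for the full run time.
    total_node_minutes=$((total_node_minutes + nodes * minutes_per_benchmark))
    nodes=$((nodes + 1))
done

echo "Upper bound: ${total_node_minutes} node-minutes"
```

Lowering the run time to 5 minutes in this estimate reduces the bound by a factor of three.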
Changing the job name¶
If you want your benchmark jobs to have specific names, use the --job-name option:
mdbenchmark generate --job-name my_benchmark
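Since all of these options can be chained in a single call, a complete invocation might look like the following. This is an illustrative sketch that only uses options described on this page; the concrete values (base name, module, node range, run time, job name) are examples, and the command is printed for inspection rather than executed.

```shell
# Assemble a single mdbenchmark generate call from several options.
name="protein"
module="gromacs/2018.3"
min_nodes=1
max_nodes=4
run_time=5                 # minutes per benchmark
job_name="my_benchmark"

generate_cmd="mdbenchmark generate --name ${name} --module ${module}"
generate_cmd="${generate_cmd} --min-nodes ${min_nodes} --max-nodes ${max_nodes}"
generate_cmd="${generate_cmd} --time ${run_time} --job-name ${job_name} --yes"

# Print the call; copy it to the command line once it looks right.
echo "$generate_cmd"
```

The trailing --yes skips the confirmation prompt, so it is best left out until the other options have been checked.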