Zacros Manual
Dear Colleague,
I would like to thank you for downloading Zacros and I hope you will find it useful in your research.
Zacros is an advanced kinetic Monte Carlo (KMC) software application for the simulation of molecular
phenomena, such as adsorption and catalytic reactions, on surfaces. The package employs the Graph-
Theoretical KMC methodology, coupled with cluster expansion Hamiltonians and Brønsted-Evans-
Polanyi relations for the adlayer energetics. This integrated implementation can naturally capture:
• steric exclusion effects for species that bind to more than one catalytic site,
• complex reaction steps involving adsorbates in specific binding configurations and neighboring
patterns,
• spatial correlations and ordering arising from adsorbate lateral interactions that may involve many-
body contributions,
• coverage effects, namely the dependence of the activation energy of an elementary event on the
presence of spectators in the neighborhood of this event.
In addition to these, the code features an easy-to-learn keyword-based language for defining a
simulation, and can be run in “debugging” mode, thereby generating detailed output that can be used to
efficiently troubleshoot a KMC simulation. Moreover, to tackle simulations on very large lattices, Zacros
implements a domain decomposition scheme along with the Time-Warp algorithm for boundary conflict
resolution. Informative articles, tutorials and examples that can help you get started with KMC
simulation can be found on the Zacros website: http://zacros.org.
Zacros is distributed free of charge to the academic community in the hope that it will benefit
researchers worldwide. If you decide to use this software for a scientific article, I kindly ask you to
include the following citations in your work:
Nielsen, J., M. d’Avezac, J. Hetherington and M. Stamatakis (2013). “Parallel Kinetic Monte
Carlo Simulation Framework Incorporating Accurate Models of Adsorbate Lateral
Interactions.” The Journal of Chemical Physics, 139(22): 224706.
Ravipati, S., Nielsen, J., d’Avezac, M., Hetherington, J. and M. Stamatakis (2020). “A Caching
Scheme to Accelerate Kinetic Monte Carlo Simulations of Catalytic Reactions”. The Journal
of Physical Chemistry A, 124(35): 7140-7154. [please cite if you are using the caching
scheme]
Ravipati, S., Savva, G. D., Christidi, I.-A., Guichard, R., Nielsen, J., Réocreux, R., and
Stamatakis, M. (2022). “Coupling the Time-Warp algorithm with the Graph-Theoretical
Kinetic Monte Carlo framework for distributed simulations of heterogeneous catalysts”.
Computer Physics Communications, 270: 108148 [please cite if you are using the MPI Time-
Warp algorithm implementation]
I would be glad to receive feedback about Zacros, and if you would like to contribute to the
development thereof, please do not hesitate to get in touch.
Kind regards,
Michail Stamatakis
Associate Professor in Chemical Engineering
University College London
Torrington Place
London, WC1E 7JE
United Kingdom
Phone: +44 (0)20 3108 1128
Fax: +44 (0)20 7383 2348
e-mail: [email protected]
url: https://www.stamatakislab.org/
Table of Contents
Introduction .................................................................................................................................................. 8
Compiling Zacros ........................................................................................................................................... 9
Supported Compilers ................................................................................................................................ 9
Serial, Threaded or Distributed? ............................................................................................................... 9
Using the Makefiles Provided for GNU Fortran ...................................................................................... 10
Compilation on Unix/Linux ................................................................................................................. 10
Compilation on macOS........................................................................................................................ 11
Compilation on Windows Using MSYS2 .............................................................................................. 11
Using the CMake Build System ............................................................................................................... 13
Compilation on Unix/Linux/macOS..................................................................................................... 13
Compilation on Windows Using MSYS2 .............................................................................................. 14
Compilation on Windows Using Visual Studio .................................................................................... 15
Running Zacros ............................................................................................................................................ 15
Input/Output Files................................................................................................................................... 16
Units and Constants ................................................................................................................................ 17
Setting up a KMC Simulation in Zacros ....................................................................................................... 18
Simulation Input File ............................................................................................................................... 18
General Simulation Parameters .......................................................................................................... 18
Reporting Schemes ............................................................................................................................. 21
Stopping and Resuming ...................................................................................................................... 25
Treating Fast Quasi-Equilibrated Processes ........................................................................................ 25
Simulating Very Large Lattices ............................................................................................................ 29
Overview of the Approach .............................................................................................................. 29
Performing MPI Time-Warp Runs ................................................................................................... 33
Keywords......................................................................................................................................... 33
Validation of MPI Time-Warp Runs................................................................................................. 35
Memory Management ........................................................................................................................ 37
Accelerators ........................................................................................................................................ 38
Troubleshooting .................................................................................................................................. 39
Lattice Input File...................................................................................................................................... 41
Introduction
Zacros is an advanced kinetic Monte Carlo (KMC) package for the simulation of molecular phenomena,
such as adsorption and catalytic reactions, on structures that can be represented by static lattices. The
package employs the Graph-Theoretical KMC methodology1 coupled with cluster expansion
Hamiltonians for the adlayer energetics,2 allowing it to tackle:
• binding configurations on more than one site, and the steric exclusion effects resulting therefrom,
• complex surface reactions in which several species and spectators can be involved in specific
neighboring patterns,
• adsorbate lateral interactions involving long-range and many-body terms, and the spatial correlation
and ordering effects resulting therefrom,
• coverage effects, namely the dependence of the activation energy of an elementary event on the
presence of spectators in the neighborhood of this event.
Various optimizations (e.g. Ullmann’s algorithm for subgraph isomorphism, caching of energetic
interaction terms) as well as OpenMP parallelization have been implemented for the efficient simulation
of systems with energetic models involving long-range interactions. Moreover, as of version 2.0 an
approximate method that rescales rate constants of fast quasi-equilibrated events is available (see
section Treating Fast Quasi-Equilibrated Processes on page 25). Furthermore, as of version 3.0, Zacros
implements MPI parallelization, using a domain decomposition scheme and the Time-Warp algorithm
for boundary conflict resolution (see section Simulating Very Large Lattices on page 29).
This user guide provides information about the syntax of input/output files and the options available in
Zacros, with some high-level overview of the methods on a “need-to-know” basis. For more in-depth
information on KMC simulation and the underlying methods implemented in the package, the user is
referred to the following publications:
Darby, M. T., Piccinin, S. and M. Stamatakis (2016). “Chapter 4: First principles-based kinetic
Monte Carlo simulation in catalysis” in Kasai, H. and M. C. S. E. Escaño (Eds.), Physics of
Surface, Interface and Cluster Catalysis, Bristol, UK: IOP Publishing.
Nielsen, J., M. d’Avezac, J. Hetherington and M. Stamatakis (2013). “Parallel Kinetic Monte
Carlo Simulation Framework Incorporating Accurate Models of Adsorbate Lateral
Interactions.” The Journal of Chemical Physics, 139(22): 224706.
Ravipati, S., Nielsen, J., d’Avezac, M., Hetherington, J. and M. Stamatakis (2020). “A Caching
Scheme to Accelerate Kinetic Monte Carlo Simulations of Catalytic Reactions”. The Journal
of Physical Chemistry A, 124(35): 7140-7154. [please cite if you are using the caching
scheme]
Ravipati, S., Savva, G. D., Christidi, I.-A., Guichard, R., Nielsen, J., Réocreux, R., and
Stamatakis, M. (2022). “Coupling the Time-Warp algorithm with the Graph-Theoretical
Kinetic Monte Carlo framework for distributed simulations of heterogeneous catalysts”.
Computer Physics Communications, 270: 108148 [please cite if you are using the MPI Time-
Warp algorithm implementation]
Compiling Zacros
Supported Compilers
We build and test Zacros using the Intel and GNU Fortran compilers on Linux and Windows, as well as
the GNU Fortran compiler on macOS. Minimum recommended versions are: 7.3.0 for GNU, and 18.0.3 for
Intel. For the MPI version of Zacros we test both the MPICH and the OpenMPI frameworks, with the
following minimum recommended versions: 3.1.1 for OpenMPI and 3.2.1 for MPICH.
Serial, Threaded or Distributed?
Note that the parallel versions (with OpenMP and/or MPI) can have quite significant computational
overheads, pertinent to e.g. thread creation or sharing information among threads for the OpenMP
version; or pertinent to messaging, global communications, rollbacks and re-simulations for the MPI
version. These overheads may, in certain cases, outweigh any gains from the parallelization. On the
other hand, the serial version implements optimizations and does not incur any such overheads; thus,
you may, for instance, find out that the same simulation takes less time in a serial run compared to an
OpenMP run with a single thread. It is therefore advisable to run short benchmarks to find out what
works best for the systems you want to simulate.
Using the Makefiles Provided for GNU Fortran
Compilation on Unix/Linux
This section refers to Unix and Linux operating systems. If your system does not come with the GFortran
compiler you will have to install it. For instance, in Ubuntu 20.04, you can do so by running the following
command on a terminal: sudo apt install gfortran.
Compiling should be done using a terminal. For the serial version (no OpenMP or MPI parallelism), it
simply comes down to the following:
cd path/to/source/of/Zacros
mkdir build # If it does not exist yet
cd build
cp ../makefiles/makefile-gnu-serial-unix makefile
make
The first line above moves to the directory where the source code can be found. Then we create a build
directory and move to that directory. All the build files will be located here, rather than “polluting” the
source. We copy one of the makefiles provided to the build directory (renaming it to makefile at the
same time) and compile the code. At the end of this process, there should be an executable called
zacros.x in the build directory. The executable thus obtained is of interest when running on a
computer with a single core, or when simulating systems with short-range or no lateral interactions
(please refer to section Energetics Input File, page 47).
To compile the OpenMP version of the code (efficient for systems with long-range lateral interactions
among adsorbates or a large number of interaction patterns), simply replace line 4 of the above
commands with:
cp ../makefiles/makefile-gnu-parallel-unix makefile
Finally, if you would like to compile the MPI version of Zacros (for distributed simulations on very large
lattices), you need to have OpenMPI or MPICH installed. As an example, in Ubuntu 20.04, MPICH can be
installed via the terminal with the following command: sudo apt install mpich. The
compilation procedure is similar to what was described earlier; just replace line 4 of the above
commands with the following:
cp ../makefiles/makefile-gnu-distributed-unix makefile
After compiling any (or all) of the aforementioned versions, you can run a set of tests to verify that your
executable works fine. As a prerequisite, your system must have Python installed. Then you can do the
testing with the following command:
make test
Many of these tests are fast, but some may take a few minutes to complete (depending also on your
hardware). At the end of each test, an informative message will be printed stating whether the test has
been successful or not. Note that the test command of the makefiles runs only a subset of the available
tests. For more comprehensive testing, please use the CMake build system (section Using the CMake
Build System, page 13). If any tests fail, refer to section Known Issues/Limitations (page 79) for possible
remedies, and if you cannot resolve the issue, feel free to contact us for help.
Compilation on macOS
For macOS one can install GFortran from http://gcc.gnu.org/wiki/GFortranBinaries. If your system does
not recognize the make command, you can install the XCode package making sure that the Command
Line Tools are included in the installation. The rest of the compilation instructions are the same as in
section Compilation on Unix/Linux.
Compilation on Windows Using MSYS2
This section assumes a working MSYS2 installation (available from https://www.msys2.org/). Then try
to run commands verifying that you have all the tools necessary for the compilation of Zacros, for
instance:
make --version
gfortran --version
If your system does not recognize one of these commands,
this means that you need to install the corresponding package that provides support for that command.
MSYS2 provides a package manager called pacman, which can be used to search for packages and
install them. Installing a package, e.g. make, can be done with the following command:
pacman -S make
The base MSYS2 installation may not include the GNU Fortran compiler, which, however, should be easy
to install using the following commands:
pacman -S mingw-w64-x86_64-gcc
pacman -S mingw-w64-x86_64-gcc-fortran
pacman -S mingw-w64-x86_64-gcc-libgfortran
pacman -S mingw-w64-x86_64-openmp
At any point, you can check which packages are installed with the following command:
pacman -Qqe
Moreover, you can check for the existence of a package in the repository, e.g. anything related to
fortran, via a search such as:
pacman -Ss fortran
If all necessary packages are installed, the serial version can be compiled with the following commands:
cd path/to/source/of/Zacros
mkdir build # If it does not exist yet
cd build
cp ../makefiles/makefile-gnu-serial-msys-windows makefile
make
The first line above moves to the Zacros directory created earlier, where the source code can be found.
Then we create a build directory and move to that directory. All the build files will be located here,
rather than “polluting” the source. We copy one of the makefiles provided to the build directory
(renaming it to makefile at the same time) and compile the code. At the end of this process, there
should be an executable called zacros.exe in the build directory. The OpenMP parallel version of the
code is obtained by replacing line 4 of the above commands with:
cp ../makefiles/makefile-gnu-parallel-msys-windows makefile
At the time of writing these instructions, there was no MPI package available in the repositories of
MSYS2 and thus, we do not provide makefiles for the distributed version of Zacros in Windows systems.
It might be possible to compile and link Zacros to the MSMPI libraries; however, this does not appear to
be straightforward…
Note that running Zacros will have to be done from the MSYS2 terminal (not from the Windows
command prompt), so that the gcc/gfortran libraries of MSYS2 are available to the Zacros executable.
After compiling, you can run a set of tests to verify that your executable works fine. As a prerequisite,
your system must have Python installed. Then you can do the testing with the following command:
make test
Keep in mind that some tests may take a few minutes to complete. At the end of each test, an
informative message is printed, stating whether the test has been successful or not. If any tests fail,
refer to section Known Issues/Limitations (page 79) for possible remedies, and if you cannot resolve the
issue, feel free to contact us for help. Note that the make test command of the makefiles runs only a
subset of the available tests. For more comprehensive testing, please use the CMake build system as
described in the next section.
Using the CMake Build System
Compilation on Unix/Linux/macOS
This section covers all Unix-like operating systems, including Linux and macOS. It has been tested with
GNU Fortran and Intel Fortran. Compiling the serial version should be done using a terminal using the
following commands:
cd path/to/source/of/Zacros
mkdir build # If it does not exist yet
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -Ddoopenmp=off -Ddompi=off
make # or “make -j” for faster (parallel) compilation
The first line above moves to the directory where the source code can be found. Then we create a build
directory. All the build files will be located here, rather than “polluting” the source. We move to the
build directory. The compilation is first configured for the current platform. Finally, the code is compiled.
At the end of this process, there should be a serial executable called zacros.x in the build directory.
This is of interest when running on a computer with a single core, or when simulating systems with
short-range or no lateral interactions (please refer to section Energetics Input File, page 47).
One can enable threading or build the distributed (MPI) version by setting the -Ddoopenmp and/or
-Ddompi arguments appropriately. For instance, to enable threading (OpenMP) only, line 4 becomes:
cmake .. -DCMAKE_BUILD_TYPE=Release -Ddoopenmp=on -Ddompi=off
It is also possible to enable both OpenMP and MPI; however, we have not performed extensive tests
with such a configuration. If you would like to use it, do so at your own risk!
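For reference, such a combined (and, as noted, not extensively tested) configuration would be
requested by turning both flags on:
cmake .. -DCMAKE_BUILD_TYPE=Release -Ddoopenmp=on -Ddompi=on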
If you would like to compile a version for debugging, you can change the build type to Debug and even
provide custom options to the compiler by using the -DCMAKE_Fortran_FLAGS parameter. This can
be done by editing line 4 of the above commands for the serial version, for instance:
cmake .. -DCMAKE_BUILD_TYPE=Debug -Ddoopenmp=off -Ddompi=off -DCMAKE_Fortran_FLAGS="-finit-integer=-987654321 -finit-real=nan -fcheck=bounds"
The above uses GNU Fortran flags that initialise integer variables with the value -987654321 and real
variables with NaN, and checks whether array bounds are exceeded at some point in the simulation
(note that the above command should be written in just one line in the terminal).
If your system has Python installed, then the code can be tested to validate the compilation, by running:
make test
Please note that some of the tests may take several minutes to complete… You can selectively run
specific tests, thereby reducing the testing time, in two different ways. The first is to run tests with a
given label out of the following: fast, medium, slow, e.g.:
ctest -L fast
The second way is to run tests whose name contains a given string, e.g.:
ctest -R <string>
If you see tests failing, refer to section Known Issues/Limitations (page 79) for possible remedies (there
might be something wrong with your system configuration, or your compiler may be incompatible with
some functionalities of the code). If you cannot resolve the issue, feel free to contact us for help.
Compilation on Windows Using MSYS2
To use the CMake build system under MSYS2, the cmake and python packages need to be installed:
pacman -S mingw-w64-x86_64-cmake
pacman -S python
Then, starting from the MSYS2 command-line, the serial version can be compiled as follows:
cd path/to/source/of/Zacros
mkdir build # If it does not exist yet
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -G "MSYS Makefiles" -Ddoopenmp=off -Ddompi=off
make # or “make -j” for faster (parallel) compilation
This operation is quite similar to the one performed in the case of Unix-based systems (see section
Compilation on Unix/Linux, above). However, line 4 has changed a bit: the -G "MSYS Makefiles"
option tells CMake that we want to compile for MSYS.
Tests can be run with the command make test, as discussed in the previous subsection. Please be
warned that some of the tests are fairly long. Threading can be enabled by adding the option
-Ddoopenmp=on to line 4 above. At the time of writing these instructions, there was no MPI package
available in the repositories of MSYS2; thus, setting -Ddompi=on will result in compilation errors.
Compilation on Windows Using Visual Studio
Provided that CMake is installed, a Visual Studio solution can be generated with commands along the
following lines, run from a command prompt:
cd path/to/source/
mkdir build
cd build
cmake ..
The project should contain several possible configurations. For speed, one should build the Release
configuration.
It is also possible to build and test directly from the command-line, thereby bypassing GUI systems, for
instance:
cmake --build . --config Release
ctest -C Release
Running Zacros
The simplest way to run Zacros is simply to launch the executable from the command-line. For Unix-like
systems:
path/to/executable/zacros.x
or simply,
zacros.x
if zacros.x is in the user's path. For Windows systems zacros.x is replaced with zacros.exe in
the above commands. Zacros expects all the appropriate input files to be in the current directory. Please
refer to section Setting up a KMC Simulation in Zacros for a description of these files.
When invoking the thread-capable executable (the default; see section Compiling Zacros, page 9, for
how to disable threads) Zacros will run with as many threads as there are cores. The number of threads
can be manually defined in Windows and UNIX by setting the appropriate environment variables. In
UNIX one needs to use the command export:
export OMP_NUM_THREADS=4
In the Windows command prompt, the set command is used instead:
set OMP_DYNAMIC=FALSE
set OMP_NUM_THREADS=4
Running distributed simulations has to be done by invoking the appropriate MPI “manager” program,
mpiexec, mpirun, or similar, and using of course a Zacros executable that has been compiled with
MPI instructions/directives. The number of MPI processes is given to mpiexec by using the option -np
as follows (in the example below we use 4 MPI processes):
mpiexec -np 4 path/to/executable/zacros.x
Note that the number of MPI processes depends on the size of the lattice simulated, i.e. it cannot be
chosen arbitrarily. For more details, refer to section Performing MPI Time-Warp Runs on page 33.
It is possible to give command-line arguments to Zacros in MPI runs (see section Command-Line
Arguments, on page 62). An example would be specifying (or overriding) the wall-time to 86400 seconds
(24 hours); the exact syntax of such arguments is described in that section.
Input/Output Files
The input to Zacros consists of 5 keyword-based files, out of which one (the initial state input file) is
optional:
simulation_input.dat lattice_input.dat
energetics_input.dat mechanism_input.dat
state_input.dat
The output of a simulation consists of the following files:
general_output.txt history_output.txt
lattice_output.txt procstat_output.txt
specnum_output.txt
Two additional files are produced if the keywords energetics_lists and process_lists appear in
the simulation_input.dat file (see section Simulation Input File, page 23); these files are (respectively for
each keyword):
energlist_output.txt proclist_output.txt
If run in debugging mode (see section Troubleshooting, page 39), Zacros also generates the following
files which are useful for troubleshooting a simulation:
process_debug.txt globalenerg_debug.txt
newton_debug.txt
All the aforementioned files are read from (or written to) the same directory, unless otherwise specified
via the command line (see section Command-Line Arguments, page 62).
Finally, restart.inf is an input/output file used to resume a simulation from the point it stopped.
Even though it is a plain text file, it is not intended to be human-readable. For more details on how the
resume feature works, please refer to keywords wall_time and no_restart in section Simulation
Input File, page 25. Note that if the file restart.inf exists in the current working directory, Zacros
will disregard the aforementioned input files, attempt to read restart.inf and resume the
simulation.
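For instance, to deliberately start afresh after an interrupted run (rather than resuming it), one would
delete the restart file before re-invoking the executable; on a Unix-like system:
rm restart.inf
zacros.x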
For MPI runs, each MPI process generates its own output file for the lattice subdomain for which it is
responsible. Thus, MPI process 0 generates files with the names mentioned above, while MPI process i,
with i ≥ 1, generates files with the following names:
general_output_i.txt history_output_i.txt
lattice_output_i.txt procstat_output_i.txt
specnum_output_i.txt
for i = 1, 2, 3, … It is therefore important to keep in mind that MPI runs generate a large number of files.
Units and Constants
Zacros uses the following values for the physical constants involved in the calculations:
Pi constant: π = 3.141592653589793
Gas constant: Rgas = 8.314472 J/K/mol
Avogadro's number: NA = 6.02214179⋅10^23 mol^-1
Boltzmann's constant: kB = Rgas/NA
Energy conversion factor: EnergyConv = 6.24150974⋅10^18 (the number of eV per J)
One can use a different system of units for the time and pressure when providing pre-exponentials (see
section Mechanism Input File, page 52), keeping in mind that the reported values will also have different
units. However, using different units for energy and temperature will require changing the value of
parameter enrgconv in file constants_module.f90 and recompiling the program (see section
Compiling Zacros, page 9, for information on how to do this).
* This feature exists for future development and presently does not affect the input/output.
If more than one argument of the same kind follows a keyword, the arguments appear numbered, for
instance, temperature ramp real1 real2.
All input files support free format; thus, blank lines and comments are permitted anywhere in the text
as long as the syntax is valid. The commenting character is #, for instance:
temperature 500.0 # the simulation temperature (an illustrative example)
The keywords are not case sensitive, and strings should be written free from quotation marks (unless for
instance one wants to use quotation marks as part of a name). Spaces are not allowed in a string; one
can use underscores _ instead. In general, the order of the keywords does not matter, but there are
cases where a keyword must precede another one; for instance, one has to first define the number of
gas species and subsequently their names, not the reverse. Moreover, keywords may not be
repeated within the same scope. The parser will report an error in such cases.
random_seed int1 int2 … The integer seed(s) of the random number generator. Out of
the random number generators available (see previous
keyword), only mt19937_ii supports initialization with more
than one value. All the others use only int1 and discard the
remaining integers.
temperature expr The temperature (K) under which the system is simulated.
Expression expr can be a single real number, specifying a
constant temperature, or ramp real1 real2, specifying a
temperature ramp (see the numbered-argument example
above).
pressure real The pressure (bar) under which the system is simulated.
gas_specs_names str1 str2 … The names of the gas species. There should be as many strings
following the keyword as the number of gas species specified
with keyword n_gas_species.
gas_energies real1 real2 … The total energies (eV) of the gas species. There should be as
many reals following this keyword as the number of gas
species specified with keyword n_gas_species. The
ordering of these values should be consistent with the order
used in keyword gas_specs_names.
gas_molec_weights real1 real2 … The molecular weights (amu) of the gas species. There
should be as many reals following the keyword as the number
of gas species specified with keyword n_gas_species.
Note: at present these values are not used in the code. This
feature is there for future development.
gas_molar_fracs real1 real2 … The molar fractions (dim/less) of the gas species in the
gas phase. There should be as many reals following this
keyword as the number of gas species specified with keyword
n_gas_species. The ordering of these values should be
consistent with the order used in keyword
gas_specs_names.
surf_specs_names str1 str2 … The names of the surface species. There should be as
many strings following the keyword as the number of surface
species specified with keyword n_surf_species. Note that
the name “*” is reserved for the empty site pseudo-species.
surf_specs_dent int1 int2 … The number of dentates of the surface species, specifying the
number of sites each species binds to. Thus, for a mono-
dentate species (for instance O adatoms on fcc sites) this
integer is 1, for a bidentate species (for instance O2 on a top-
fcc configuration) this integer is 2, etc. There should be as
many integers following this keyword as the number of surface
species specified with keyword n_surf_species. The
ordering of these values should be consistent with the order
used in keyword surf_specs_names.
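Putting several of these keywords together, a hypothetical fragment of simulation_input.dat for a CO
oxidation model could read as follows (all names and values are illustrative only):
n_gas_species 3
gas_specs_names CO O2 CO2
gas_energies 0.00 0.00 -2.95
gas_molec_weights 28.010 31.999 44.010
gas_molar_fracs 0.40 0.60 0.00
n_surf_species 2
surf_specs_names CO* O*
surf_specs_dent 1 1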
Reporting Schemes
By “reporting”, we refer to writing information about the simulation into output files that can be further
post-processed. Zacros implements several different reporting modes that enable the user to get a full
picture of the dynamics of the system simulated. One can thus report information about the state of the
lattice, the number of gas phase molecules produced/consumed, the statistics of occurrence of
elementary events etc. Reporting can be done at specific time intervals, every time a lattice process (e.g.
desorption) takes place, or even when a specific process takes place, for instance a disproportionation
reaction between species A and B. The relevant keywords are shown below:
snapshots expr Determines how often snapshots of the lattice state will be
written to output file history_output.txt (for the latter
see section History Output File, page 71). Possible options for
expression expr are discussed below.
species_numbers expr Determines how often information about the number of gas
and surface species (as well as the energy of the current lattice
configuration) will be written to specnum_output.txt (for
the latter see section Species Numbers Output File, page 73).
Possible options for expression expr are discussed below.
on event [int] specifies that an entry to the corresponding output file will be
written at every int KMC steps. The integer following on
event is optional and assumes the value of 1 if omitted. In the
latter case, the initial (KMC step 0) and all subsequent
configurations will be written.
on elemevent int1 specifies that an entry to the corresponding output file will be
written at every occurrence of elementary event int1. The
latter number points to an event defined in the mechanism
input file (see section Mechanism Input File, page 52).
on time real specifies that a snapshot will be written at linearly spaced time
points, at every ∆t = real time units (s): 0, ∆t, 2·∆t, 3·∆t, …
on realtime real specifies that a snapshot will be written at linearly spaced time
points, at every ∆tR = real seconds of real (clock) time: 0,
∆tR, 2·∆tR, 3·∆tR, … This scheme is intended for benchmarking
or diagnostic purposes only, since it will produce output files
that vary among different computers. It is useful, for instance, if
you want to choose an appropriate ∆t for on time reporting.
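As an illustration, a hypothetical reporting setup combining the above options could read:
snapshots on time 1.0e-2
species_numbers on event 100
which would save a lattice snapshot every 0.01 s of simulated time and report species numbers (and
lattice energy) every 100 KMC steps.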
In addition to the above, Zacros has the functionality to save information about all energetic interaction
patterns (Energetics Input File, page 47), as well as the lattice processes (see section Mechanism Input
File, page 52), that have been detected for a configuration arising during the course of the
simulation. These keywords are:
energetics_lists expr selexpr specifies that a list of energetic interaction patterns that
were detected on the lattice will be written to output file
energlist_output.txt. Possible options for expression
expr were discussed above. If one needs to output only
selected patterns (for instance only the 1st nearest neighbor
repulsions between species A-A) rather than all patterns, the
selection-expression selexpr can be used to achieve this.
Thus, selexpr follows the syntax:
process_lists expr selexpr specifies that a list of lattice processes (i.e. elementary events
that were detected on the lattice) will be written to output file
proclist_output.txt. Possible options for expression
expr were discussed above. If one needs to output only
selected elementary events (for instance only the bimolecular
reaction between species A-B) rather than all events, the
selection-expression selexpr can be used to achieve this.
Thus, selexpr follows the syntax:
Finally, the following keywords enable the output of detailed information in general_output.txt
about the setup or the course of the simulation.
Stopping and Resuming
max_time expr The maximum allowed simulated time interval (time ranges
from 0.0 to the maximum time in a simulation). This keyword
defines a stopping criterion. Expression expr can be one of
the following:
wall_time int The maximum allowed real-world time in seconds that the
simulation can be left running. The code has an internal
“stopwatch” and will exit normally once the wall-time has been
exceeded. Upon exit, the state of the program will be saved in
file restart.inf, so that the simulation can resume at a
later time. This feature is particularly useful when running in
computational clusters where a scheduler may enforce limits
on the time a simulation can be run.
no_restart This keyword gives the option to override the default behavior
of the program and not produce any restart.inf file upon
exit. If this has been specified, the program will not be able to
resume the simulation at a later time. This can be useful if one
wants to perform short/experimental runs, change the input
and rerun from scratch repetitively. For production runs, it is
recommended to avoid using this keyword.
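For example, assuming a plain real value (in seconds of simulated time) is given for max_time, a run
limited to 1,000 s of simulated time and 24 hours of wall-clock time would contain the (hypothetical)
lines:
max_time 1.0e3
wall_time 86400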
Treating Fast Quasi-Equilibrated Processes
In many catalytic systems, some elementary events are much faster than others and quasi-equilibrated;
such events consume most of the computational effort while contributing little to the long-time
dynamics of the system. Scaling
down these kinetic constants incurs a small and quantifiable error in the simulation (see for instance
Stamatakis and Vlachos).6 Zacros employs dynamic detection of time-scale separation and dynamic
scaling of the kinetic constants to accelerate the simulation. The procedure currently implemented in
Zacros is along the lines of previously published algorithms;7-9 yet, it was developed independently and
thus it may behave differently than these algorithms.
Caution: all the algorithms that scale down kinetic constants are approximate: they will always
introduce error in the simulation. The desired case is of course when this error is small compared to the
KMC sampling error, and therefore imperceptible. It is recommended that you do your own testing, by
progressively reducing the downscaling of the kinetic constants, until the results do not change. At that
point one can reasonably assume that they have converged to the accurate solution.
The algorithm implemented in Zacros works as follows (parameters of the algorithm appear in blue to
make the connection with the keywords later):
1. Define a “stiffness coefficient” (stiffness_coeff_i) for each event i as the scaling factor of the kinetic
constant of the forward and (if applicable) reverse event. The stiffness_coeff ranges from 0+ to 1. To
scale down the rate constant of an event, its pre-exponential is multiplied by stiffness_coeff.
2. In the beginning of the simulation, set stiffness_coeff = 1 for all events.
3. For every N_events∙[number of elementary steps in mechanism] KMC events that have occurred:
3.1. Calculate the partial-equilibrium ratios as Ri ← Nfwd,i / (Nfwd,i + Nrev,i) for each of the events, i,
where Nfwd,i and Nrev,i are the numbers of occurrences of the forward and reverse steps,
respectively. Non-reversible steps have Ri = 1 by definition. For quasi-equilibrated reversible
events Ri evaluates to ½ (a worked example is given after this list).
3.2. If any scaled-down step appears as non-equilibrated we may have scaled it down too much. Try
to detect such steps and remedy the situation by scaling up:
3.2.1. Check if |Ri − ½| > quasiequi_tol and stiffness_coeff_i < stiffn_scaling_threshold for any
event i, where quasiequi_tol is the tolerance for detecting the quasi-equilibrated steps
and stiffn_scaling_threshold is a threshold value.
3.2.2. If both conditions are true for an elementary step i, try to increase stiffness_coeff_i by
multiplying it by a constant factor, const_factor:
stiffness_coeff_i ← stiffness_coeff_i ⋅ const_factor
If stiffness_coeff_i > stiffn_scaling_threshold, set stiffness_coeff_i ← 1.
3.3. Find the fastest non-equilibrated step, as the one with the largest Nfwd,i and a partial-
equilibrium ratio satisfying |Ri − ½| > quasiequi_tol. Set ifastnoneq = i of that step, if it exists.
3.4. Find the fastest equilibrated step, as the one with the largest Nfwd,i and a partial-equilibrium
ratio satisfying |Ri − ½| ≤ quasiequi_tol. Set ifasteq = i of that step, if it exists.
3.5. Find the slowest equilibrated step, as the one with the smallest Nfwd,i and a partial-equilibrium
ratio satisfying |Ri − ½| ≤ quasiequi_tol. Set isloweq = i of that step, if it exists.
3.6. Based on the event occurrence statistical information computed above, decide what action to
take:
3.6.1. Case 0: There is no fastest non-quasi-equilibrated step; all events that have been
occurring appear to be fast and quasi-equilibrated.
3.6.1.1. If the timescale separation between steps ifasteq and isloweq is significant, namely
Nfwd,ifasteq / Nfwd,isloweq > max_allowed_fast_quasiequi_separ, bring all quasi-equilibrated
timescales to the slowest one, by setting stiffness_coeff_i ← stiffness_coeff_i ⋅ Nfwd,isloweq / Nfwd,i.
If stiffness_coeff_i > stiffn_scaling_threshold, set stiffness_coeff_i ← 1.
3.6.1.2. Otherwise, all fast equilibrated steps occur on the same timescale. In this case try to
decrease stiffness_coeff_i by dividing it by a constant factor, const_factor:
stiffness_coeff_i ← stiffness_coeff_i / const_factor
This aims at exploring slower dynamics in the system.
3.6.2. Case 1: There is a fastest non-quasi-equilibrated step. The aim is to make the ratio of the
timescales of all quasi-equilibrated steps over the timescale of the fastest non-
equilibrated step range between timescale_sep_min and timescale_sep_max. The
parameter timescale_sep_geomean = (timescale_sep_min ⋅ timescale_sep_max)^½ is also
used below.
3.6.2.1. Check if |Ri − ½| ≤ quasiequi_tol and
timescale_sep_max ⋅ Nfwd,ifastnoneq < min(Nfwd,i, Nrev,i) for any event i. Events for which
these conditions evaluate to true are too fast and quasi-equilibrated. For these
events evaluate a new stiffness coefficient as:
stiffness_coeff_i ← stiffness_coeff_i ⋅ timescale_sep_geomean ⋅ Nfwd,ifastnoneq / min(Nfwd,i, Nrev,i)
If stiffness_coeff_i > stiffn_scaling_threshold, set stiffness_coeff_i ← 1.
3.6.2.2. Check if |Ri − ½| ≤ quasiequi_tol, stiffness_coeff_i < 1, and
timescale_sep_min ⋅ Nfwd,ifastnoneq > min(Nfwd,i, Nrev,i) for any event i. Events for which
these conditions evaluate to true are too slow and quasi-equilibrated. For these
events evaluate a new stiffness coefficient as:
stiffness_coeff_i ← stiffness_coeff_i ⋅ timescale_sep_geomean ⋅ Nfwd,ifastnoneq / min(Nfwd,i, Nrev,i)
If stiffness_coeff_i > stiffn_scaling_threshold, set stiffness_coeff_i ← 1.
4. Scale all affected pre-exponentials and random times of the occurrence of the events in the queue,
and continue with the simulation.
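As a worked example of step 3.1, suppose a reversible step has occurred Nfwd,i = 4,900 times forward
and Nrev,i = 5,100 times in reverse since the last check. Then Ri = 4900/(4900 + 5100) = 0.49, and with a
hypothetical tolerance quasiequi_tol = 0.05, |Ri − ½| = 0.01 ≤ quasiequi_tol, so the step is flagged as
quasi-equilibrated and becomes a candidate for downscaling in step 3.6.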
The keywords that set the parameters of the above algorithm, thereby controlling the behavior of the
timescale separation treatment module, are discussed below.
stiffness_scale_all Specifies that the rate constant of any elementary step out of
those defined in mechanism_input.dat can be scaled to
treat time-scale separation. If this keyword is absent, one has
to explicitly define the elementary steps whose rate constants
are scalable with the keyword stiffness_scalable in
mechanism_input.dat (see section Elementary Step
Representation, page 59).
check_every int Specifies the number of KMC events after which the stiffness
scaling module is invoked. It sets the parameter N_events in
step 3 of the above algorithm (default value = 1000). If the
number of new KMC events executed exceeds the product of
N_events with the count of elementary events in the
mechanism, then the scaling module is triggered.
stiffn_coeff_threshold real Specifies the threshold above which any stiffness coefficient
will be automatically mapped to 1. It sets the parameter
stiffn_scaling_threshold appearing in steps 3.2 and 3.6 of the
above algorithm.
scaling_factor real Specifies the scaling factor used in the uniform upscaling or
downscaling of kinetic rate constants. It sets the parameter
const_factor in steps 3.2.2, 3.6.2.1 of the above algorithm
(default value = 5).
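Putting these together, a hypothetical configuration of the stiffness-scaling module could read:
stiffness_scale_all
check_every 2000
scaling_factor 5.0
With check_every 2000 and a mechanism of, say, 5 elementary steps, the scaling module would then be
invoked every 10,000 KMC events.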
Figure 1: A lattice of 36 sites decomposed into 4 domains, each assigned to an MPI process. For
production runs, the domains are much larger, containing thousands of sites each.
The KMC simulation within each of these domains happens asynchronously, unless an event at a shared
boundary occurs (e.g. MPI process
0 should not worry about internal events of process 1 or other MPI processes). The asynchronous nature
of the simulation risks violating causality. If, for example, MPI process 0 sends a particle via diffusion into
domain 1 at time t0, but the current KMC time in domain 1 is t1 > t0, then the history of process 1 (after
t0) is incorrect (Figure 2a). Causality has been violated, since at time t0 a particle diffused into domain 1,
but that domain has no record of this particle in its history.
To resolve this boundary conflict and restore causality, MPI process 1 will have to “roll back” to time t0,
discard the incorrect history and re-simulate its evolution (Figure 2b). However, during this discarded
history (t0 to t1) MPI process 1 may have performed diffusions that sent particles to other domains, e.g.
domain 2 in our example. Thus, it has to somehow signal to those MPI processes to undo and re-
simulate their history as well (Figure 2c). Further complications arise from the fact that these MPI
processes might have performed actions that affected other domains. It turns out that every boundary
conflict can potentially lead to a cascade of roll-backs and re-simulations that engage MPI processes
beyond the ones directly involved in that conflict.
The cascade just noted leads to a complex conflict resolution problem, a solution to which was proposed
by Jefferson in the mid-80s.10 His idea of virtual time and the “Time-Warp” algorithm provide an elegant
and powerful solution to this communication problem. The basic principle of this algorithm is that if
event A causes event B, then event A must be scheduled in real time before event B (note that our KMC
simulations have the simplifying property that an event, i.e. adsorption, desorption, reaction etc., is
“instantaneous”; thus there is no concept of an event’s duration). A high-level description of the
operations that need to be performed for the algorithm to work is as follows (more details and outlines
of the pertinent algorithms can be found in Ref. 11):
• At every KMC step, the algorithm checks the imminent internal process and the imminent
messaged process (from the message-queue). The one with the smallest waiting time is
executed.
• Execution of internal events happens “privately” and asynchronously among MPI processes.
• Every so often (e.g. every 100 KMC steps), an MPI process saves a snapshot of its KMC state in
the state-queue data-structure, to be used in case of a roll-back. The state-queue has a fixed size
and therefore a given number of “slots”. If all slots are used up at some point during the run, the
queue is “sparsified”: every other snapshot is deleted and the interval by which snapshots are
saved is doubled (a numerical illustration is given after this list).
• If an MPI process A executes an event that affects sites at the halo (or domain) of another MPI
process B, then process A has to send a message that encodes a “do” action to process B. The
message includes a timestamp, which is equal to the time that the event of MPI process A was
executed. Every message is stored in the message-queue data-structure for future reference.
Figure 2: Demonstration of the cascade of rollbacks caused by an event at the boundary between MPI
processes 0 and 1. At time t0 MPI process 0 sends a particle to MPI process 1 (panel a). Since the latter
has already advanced past that time, it has to roll back to t0, discard any history simulated from t0 to t1
and re-simulate (panel b). However, that discarded history contains other events communicated e.g. to
process 2, which now has to roll back as well (panel c).
• When an MPI process receives a message, it compares the message’s timestamp against its
current KMC time.
o If the message has a timestamp in the past, then a roll-back has to be performed. The MPI
process reinstates the KMC state at the time just before the message’s timestamp and
schedules the message’s action as the first one to be executed. Then it goes through the
message-queue, finds all messages that were sent to other MPI processes during the
discarded history and sends corresponding anti-messages (so that others can undo the
corresponding actions).
• These operations are supplemented by a Global Control Mechanism that keeps track of the so-
called global virtual time (GVT), which is essentially the time up to which the local histories
simulated are self-consistent.
o The computation of the GVT happens via a global reduction operation at specified
intervals of real time, e.g. every 30 seconds.
o After every GVT computation, any messages or snapshots that are no longer needed are
deleted to save space. Note that the algorithm must always retain at least one snapshot
strictly before the GVT (to be precise, this is true if GVT > 0, whereas if GVT = 0, the first-
ever snapshot taken at the beginning of the simulation must be retained). This snapshot
is to be used in case of a “worst-case scenario rollback”, i.e. a rollback that takes us back
to time exactly equal to the GVT.
o GVT computation is also useful in the proper termination of the run, which happens
once all MPI processes have reached the final KMC time specified by the user and all
messages have been received.
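As a numerical illustration of the sparsification rule mentioned above (with hypothetical numbers):
suppose snapshots are saved every 1,000 KMC steps and the state-queue has 8 slots. Once all 8 slots are
occupied, every other snapshot (e.g. those at steps 1,000, 3,000, 5,000 and 7,000) is deleted, freeing 4
slots, and from then on snapshots are saved every 2,000 steps, until the queue fills up again and the
procedure repeats.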
Regarding the efficiency of these runs, it is claimed in Ref. 10 that, for sufficiently small halos with
respect to the non-overlapping domain, a cascade of rollbacks that would bring the whole simulation to
the beginning is unlikely, and the global time will progress about as fast as the slowest process. Keep in
mind however, that the overheads incurred by the procedures just discussed (messaging, rollbacks,
snapshot saving, global communications) make the MPI Time-Warp implementation practical for
sufficiently large lattices. Moreover, such runs require substantial amounts of available memory, so that
a sufficient amount of KMC state snapshots can be retained in the state-queue. Our benchmarks on
simple and more complicated systems indicate that efficiency factors are heavily system-dependent, but
as a rule of thumb, we have seen that lattices with about a million sites or more exhibit linear scaling for
at least 400 MPI processes.
Performing MPI Time-Warp Runs
In the current version of Zacros, the following requirements apply to MPI Time-Warp runs:
• The lattice must be either a default or a unit-cell-defined lattice; explicitly defined lattices are
not supported (see section Lattice Input File, page 41 onwards).
• If a state input file is present, it can only contain individual seeding instructions
(seed_on_sites keywords), not multiple seeding instructions (seed_multiple blocks)
(see section Initial State Input File on page 61).
• Reporting of simulation observables must be on time (or off) (see section Reporting
Schemes, page 21), so that the output is synchronized (in simulated time) among the MPI
processes.
• Caching (see section Memory Management, page 37) is not supported due to the large memory
requirements thereof which make it impractical for distributed runs.
When Zacros is invoked by mpiexec (or mpirun or similar), via a command like:
mpiexec -np 4 path/to/executable/zacros.x
it detects how many MPI processes are available and tries to partition the lattice among them
accordingly; this is also why the number of MPI processes cannot be chosen arbitrarily.
Keywords
We now proceed to discuss the relevant keywords that can be used in simulation_input.dat, to
control the behavior of Time-Warp runs.
state_queue_snapshots int real Specifies the initial frequency of snapshot saving and
the maximum allowed size of the state-queue. Thus, int is the
initial number of KMC steps every which a snapshot is saved
and real is the maximum size in gigabytes (GB) of the state-
queue. In the absence of a user-defined specification,
snapshots are saved every 1000 steps, up to 0.5 GB of memory
allocation. If the queue reaches maximum capacity and a new
snapshots needs to be saved, the queue is then sparsified by
deleting every other snapshot, and the interval for snapshot
saving is doubled.
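For example, the following (hypothetical) setting would initially save a snapshot every 500 KMC steps
and cap the state-queue at 2 GB:
state_queue_snapshots 500 2.0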
Validation of MPI Time-Warp Runs
To facilitate the validation of distributed simulations, Zacros implements a parallel emulation scheme
based on the sequential KMC algorithm, with such an emulation run yielding the same results as the
truly distributed run (obtained with the MPI code). † The pertinent keyword is discussed below.
Note that in the current version of Zacros, the following limitations apply to parallel emulation runs:
• In simulations in which events with equal timestamps arise, the trajectory obtained with parallel
emulation may differ from that of the truly distributed runs. This happens because our MPI
Time-Warp implementation contains special rules to determine the priority of equal-timestamp
events (thereby avoiding inconsistencies or deadlocks). We have not implemented these rules in
the parallel emulation algorithm, in order to keep it as simple as possible and also because equal-
timestamp events are normally quite rare.
• Parallel emulation runs do not currently support the specification of an initial state. If a
state_input.dat file is found in the working directory the simulation stops with an error.
Finally, one should keep in mind that, since the parallel emulation scheme is based on the sequential
KMC algorithm, it will be much less efficient than the Time-Warp implementation for simulations on
very large lattices. It is therefore advisable that the validation of MPI runs be done on relatively small to
medium sized lattices.
† In the event that you see discrepancies between the results of parallel emulation runs and those of MPI runs,
please first double-check that the two simulation setups are indeed comparable and that your simulations
are not subject to the limitations discussed later in this section. If you still cannot explain the discrepancies,
please notify us, as this would mean that there may be a bug in the simulator.
Memory Management
During a KMC simulation, Zacros keeps in the memory of the computer a lot of information, notably:
1. all the elementary events that are possible given the current lattice configuration, up to a
maximum number Nmax,events of such events,
2. references to the events in which each and every adsorbate on the lattice participates, up to a
maximum of Nmax,events^adsorb. events per adsorbate,
3. all the energetic clusters that “make up” the total lattice energy of the current configuration, up
to a maximum number Nmax,clusters of such clusters,
4. references to the clusters in which each and every adsorbate on the lattice participates, up to a
maximum of Nmax,clusters^adsorb. clusters per adsorbate.
The aforementioned maximum numbers, Nmax,events, Nmax,events^adsorb., Nmax,clusters and
Nmax,clusters^adsorb., have a direct effect on the memory footprint of a Zacros run: the larger their
values, the larger the memory
needed for that run. Before Zacros 3.0, these maximum numbers were calculated using heuristic
expressions and remained fixed during the entire run. However, for some simulated systems these were
overestimating the memory needed (thereby allocating much more memory than what was actually
needed), while for other systems, the memory allocated turned out not to be enough (which would lead
to abnormal terminations of the run). In the latter cases, the user had to change certain parameters in
the code (used to calculate these maximum values) and recompile Zacros. Starting with version 3.0,
Zacros provides an interface for the user to easily manipulate these parameters and optimize memory
utilization. This is very useful for MPI runs with the Time-Warp algorithm (see section Simulating Very
Large Lattices on page 29), because such runs can be quite memory intensive.
Hence, the following equations are used to calculate the maximum numbers noted earlier:
Nmax,events = µevents ⋅ Nsites (1)
Nmax,events^adsorb. = µevents^adsorb. (2)
Nmax,clusters = λclusters ⋅ Nsites (3)
Nmax,clusters^adsorb. = λclusters^adsorb. (4)
The parameters µevents, µevents^adsorb., λclusters and λclusters^adsorb. are of type integer in Zacros 3.0 and can be
manipulated directly by the user using the keyword override_array_bounds (discussed in more
detail below). Keep in mind that the aforementioned maximum values still remain fixed throughout the
run, and thus, one should be careful to allow enough leeway to accommodate for the fluctuating
numbers of events or clusters detected during the simulation. The memory utilization report (see
section Memory Usage Statistics on page 69) produced at the end of the run is very useful in deciding
the values of the parameters entering the equations above. Note also that the parameters are in
principle size-invariant; therefore, once “good” values are determined for a lattice of a certain size, the
same values can in principle be used for a larger or a smaller lattice. ‡
The default values of the parameters are: µevents = 50, µevents^adsorb. = 200, λclusters = 50 and
λclusters^adsorb. = 60. These values are merely estimates and cannot be efficient for all systems. They
could underestimate the memory needed for complicated systems or overestimate it for simple
systems. As an example, in a simple system for which the cluster expansion contains only single-body
patterns, one only needs λclusters = 1 and λclusters^adsorb. = 1, since every single adsorbate can be
involved in at most one pattern and we can have at most Nsites adsorbates on the lattice (therefore
Nsites energetic clusters). Thus, the default values overestimate the memory and can be overridden by
the keyword discussed in the following.
override_array_bounds expr Allows the user to override parameters that control the
memory footprint of the simulation. Expression expr can be a
quadruplet of integers, int1 int2 int3 int4, which
specify the values of µevents, µevents^adsorb., λclusters and
λclusters^adsorb.,
respectively. Any of int1,…,int4 (or all four of them) can
be replaced by the ampersand character (&), denoting that we
wish to rely on the default value for the corresponding
parameter. In any case, exactly four arguments (integers or the
& character) must appear after this keyword.
Caution: if the values of parameters µevents, µevents^adsorb., λclusters and λclusters^adsorb. underestimate the memory
needed for the run, the latter will terminate with an error (also printing an advice note to override these
values). When simulating a complicated system, it is advisable to either rely on the default values, or if
they fail, specify quite high/permissive values for these parameters. You can optimize the memory
utilization later, after perusing the memory utilization report printed in general_output.txt at the
end of the run (see section Memory Usage Statistics on page 69).
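For instance, for the simple single-body cluster expansion discussed above, one could keep the default
event-related bounds (via the & character) and set both cluster-related parameters to 1:
override_array_bounds & & 1 1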
Accelerators
Specialized exact algorithms that exploit certain properties of the simulated system to accelerate the
simulation are described in the following.
‡ We say “in principle” because one should always keep in mind that a KMC simulation is stochastic, and the
magnitude of the random fluctuations scales with the system size, which may lead to issues. As an example, say
that for a simulation on a lattice with 10,000 sites we have found that, at stationary conditions, we have a
number of events fluctuating between 17,000 to 19,000. We thus set µevents = 2, so that we can accommodate
up to 20,000 events in the memory. If we now use this setting on a lattice with 100 sites, we would be able to
accommodate up to 200 events in the memory. However, due to the smaller system size, the fluctuations
relative to the average now may be larger, e.g. the number of events may fluctuate between 150 and 210. The
maximum value of 210 events on the lattice can no longer be accommodated by the setting µevents = 2. This is
why it is important to allow for some leeway, as noted in the discussion.
Moreover, Zacros implements accelerators for simple systems, which are enabled automatically
(without the user having to provide a keyword). The first accelerator applies to systems in which the
cluster expansion contains only single-site (single-body) terms. For such systems, the subgraph
isomorphism procedures for the detection of energetic interaction patterns are skipped altogether. If
Zacros detects that a system is amenable to simulation with this accelerator, the message “This
cluster expansion involves only single body patterns.” will appear in the
energetics setup section of file general_output.txt.
The second accelerator is for systems in which the reaction mechanism involves monodentate species
participating in events spanning up to two sites and not involving any geometric criteria. In such cases, a
custom procedure is invoked that checks the occupancy of nearest neighbor sites and detects the
possible events, thereby skipping the subgraph isomorphism procedures for event detection (which are
of course general but more computationally intensive). When this accelerator is enabled, the message
“This mechanism contains up to two-site events involving only monodentate
species.” will appear in the mechanism setup section of general_output.txt.
Troubleshooting
In addition to the aforementioned keywords there are some “debugging” keywords that may prove
particularly useful for troubleshooting a KMC simulation. These keywords trigger internal checks or
enable the output of detailed information in human-readable format, allowing the user to see “what is
going on” during the simulation. Note that these debugging procedures will significantly slow down
execution and/or result in large output files, so they should only be used in short runs (not for results
production).
§ Please notify us immediately if you encounter any of these errors, as this would mean that there may be a
bug in the simulator itself.
Finally, the keyword finish is used to terminate parsing in the simulation input file: any content after this keyword is ignored.
The permitted keywords for each of the aforementioned options are discussed in the following.
Default Lattices
Currently there are three possible default lattices, all of which are periodic with coordination numbers
equal to 3, 4 and 6 (Figure 3). In these lattices all sites are equivalent (single site type). The name of this
site type is by default StTp1. Inside a default lattice block (lattice default_choice …
end_lattice) the following keywords are allowed:
rectangular_periodic real int1 int2 As above for a lattice with coordination number 4.
For this lattice, the unit cell is the primitive cell.
Figure 3: Default lattices in Zacros. Left: triangular (coordination number, CN = 3). Middle: rectangular
(CN = 4). Right: hexagonal (CN = 6). The blue lines connect the 1st nearest neighbors of the lattice. The
black lines denote the simulation box and the thick red lines the unit cell.
hexagonal_periodic real int1 int2 As above for a lattice with coordination number 6.
For this lattice, the unit cell is not the primitive unit cell, and it
contains 2 sites.
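As an illustration, the following block defines a periodic hexagonal lattice made of 20×20 copies of the two-site unit cell, assuming (hypothetically) a lattice constant of 1.0 Å:

lattice default_choice
  hexagonal_periodic 1.0 20 20
end_lattice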
Note that, for MPI runs with default lattices, the number of MPI processes that can be used is subject to
the restrictions discussed in Performing MPI Time-Warp Runs on page 33.
n_cell_sites int The total number of sites (of any site type) in the unit cell.
n_site_types int The number of different site types. For instance, if one needs
to model a lattice for the Pt(111) surface taking into account
top, bridge, fcc and hcp sites, then int should be 4 (see also
Figure 4).
site_type_names str1 str2 … The names of the different site types. There should be as many
strings following this keyword as the number of site types
specified by n_site_types. In the example just mentioned
for the Pt(111) surface lattice, the expression used can be:
site_type_names top brg fcc hcp (see Figure 4).
site_types 1 1 2 2 2 2 2 2 3 3 4 4
site_types top top brg brg brg brg brg brg fcc fcc hcp hcp
Figure 4: Lattice representing the (111) surface of an FCC metal, for instance Pt(111). Numbered are only
the sites belonging to one unit cell, which is denoted by thick black lines. The table on the right shows the
4 different sites types, along with the index and name of each one. Given on the bottom are the two
equivalent expressions that define the types of all sites within the unit cell.
site_types expr The site types for each of the different sites of the unit cell.
Expression expr can consist of as many strings or integers as
the number of sites in the unit cell, specified by
n_cell_sites. Thus, there are two options for expr: either
integers denoting the site types (numbered in the order in
which they appear after site_type_names), or the names of
the site types themselves (see the two equivalent expressions
in Figure 4).
neighboring_structure Specifies the neighboring structure of the lattice, namely the
int1-int2 keywrd links between sites of the unit cell and sites in the same or
… adjacent unit cells. Each line inside this block contains an
end_neighboring_structure expression int1-int2 followed by a keyword keywrd,
declaring that site int1 of a unit cell is a neighbor of site int2 of the cell indicated by
keywrd, which can be self, north, northeast, east or southeast (see Figure 5). There can be as
many such lines as needed to fully define the neighboring structure of the lattice.
[Figure 5 schematic: the central unit cell (sites 1 and 2) and its north (N), northeast (NE),
east (E) and southeast (SE) neighboring cells; the neighboring structure shown on the right of
the figure is reproduced in the complete example after the caption.]
Figure 5: Lattice representing the (111) surface of an FCC metal, with the fcc and hcp sites taken into
account. Numbered are only the sites belonging to the central unit cell, as well as the north, northeast,
east, and southeast neighboring unit cells (N, NE, E, SE, respectively). Within each cell, site 1 is the fcc,
whereas the hcp is site 2. The links of sites of the central unit cell with sites in neighboring cells are
depicted. The neighboring structure that gives rise to this lattice is shown on the right.
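Putting the keywords of this section together, the lattice of Figure 5 could be defined with a block like the following. This is a sketch rather than verbatim manual input: the cell vectors and fractional site coordinates are illustrative assumptions (based on a nearest-neighbor spacing of 2.77 Å, roughly that of Pt), whereas the neighboring structure is the one shown in Figure 5.

lattice periodic_cell

  cell_vectors        # each row is one unit cell vector (Å)
    2.7700  0.0000
    1.3850  2.3988
  repeat_cell 10 10   # number of copies of the unit cell in each direction

  n_cell_sites 2
  n_site_types 2
  site_type_names fcc hcp
  site_types fcc hcp
  site_coordinates    # fractional coordinates of sites 1 (fcc) and 2 (hcp)
    0.33333 0.33333
    0.66667 0.66667

  neighboring_structure
    1-2 self
    1-1 north
    1-1 east
    1-1 southeast
    2-1 north
    2-2 north
    2-1 east
    2-2 east
    2-2 southeast
  end_neighboring_structure

end_lattice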
Note that, for MPI runs with unit-cell-defined lattices, the number of MPI processes that can be used is
subject to the restrictions discussed in Performing MPI Time-Warp Runs on page 33.
n_site_types int The number of different site types (as previously; see section
Unit-Cell-Defined Periodic Lattices, page 42).
site_type_names str1 str2 … The names of the different site types. There should be as many
strings following this keyword as the number of site types
specified by n_site_types (as previously; see section Unit-
Cell-Defined Periodic Lattices, page 42).
where:
int2 gives the site type of the site with index int1. In place
of this integer, one can also use a string denoting a site type
name as specified by keyword site_type_names as done
in Figure 6 (see also the description of site_types in
section Unit-Cell-Defined Periodic Lattices, page 42).
lattice_structure # Au6 nanocluster structure
1 0.0000e+0 0.0000e+0 cn2 2 2 6
2 1.4425e+0 0.0000e+0 br42 2 1 3
3 2.8850e+0 0.0000e+0 cn4 4 2 4 7 8
4 4.3275e+0 0.0000e+0 br42 2 3 5
5 5.7700e+0 0.0000e+0 cn2 2 4 9
6 7.2125e-1 1.2492e+0 br42 2 1 10
7 2.1637e+0 1.2492e+0 br44 2 3 10
8 3.6062e+0 1.2492e+0 br44 2 3 12
9 5.0487e+0 1.2492e+0 br42 2 5 12
10 1.4425e+0 2.4985e+0 cn4 4 6 7 11 13
11 2.8850e+0 2.4985e+0 br44 2 10 12
12 4.3275e+0 2.4985e+0 cn4 4 8 9 11 14
13 2.1637e+0 3.7477e+0 br42 2 10 15
14 3.6062e+0 3.7477e+0 br42 2 12 15
15 2.8850e+0 4.9970e+0 cn2 2 13 14
end_lattice_structure
Figure 6: Lattice representing the surface of a Au6 nanocluster, along with the corresponding
lattice_structure definition. For this example, n_sites would be set to 15 and max_coord to
4 (the maximum of column 5).
int4 int5 … give the 1st nearest neighbors of the site with
index int1. There should be as many neighbors listed as the
value of int3. If there are more integers listed in this line, the
extra entries will be ignored.
sites (br42 and br44) to be neighbors with each other? The answer to that lies in the way species bind to
the lattice sites and the types of reactions that can occur between these species. As rules of thumb:
• If multidentate species are present in the chemistry under investigation, the sites onto which each
dentate binds have to be neighbors with each other. For instance, a carbonate species (CO3) can bind
on Au6 in a top-bridge-top configuration involving sites cn2-br42-cn4 (see lattice of Figure 6).13 Thus,
these sites have been defined as neighboring in this lattice structure.
• For reactions that occur between adsorbed particles, there have to be one or more links between the
sites occupied by the different reactant molecules. Thus, on Au6 (Figure 6), representing a reaction
between CO bound to cn2 and O2 bound to cn4 in the absence of any adparticle on br42,
necessitates a neighboring relation between cn2-br42-cn4. This reaction would give CO2(gas) and a
left-over O adatom at br42.13
E(\sigma) = \sum_{k=1}^{N_C} \frac{NCI_k(\sigma)}{GM_k} \cdot ECI_k \qquad (5)
where NC is the number of clusters in the cluster expansion Hamiltonian, NCIk the number of instances
of cluster k on the lattice (which is a function of σ ), and GMk the graph multiplicity of cluster k (this is
the number of symmetric equivalents; the pertinent term in equation (5) corrects for the overcounting).
Specifying the cluster expansion Hamiltonian in Zacros (i.e. defining the energetic interactions of the
model) is done in energetics_input.dat using the syntax that is discussed in the following.
Cluster Representation
Each cluster is represented as a graph pattern. The latter consists of a collection of connected sites onto
which surface species can bind. In order to “translate” a pattern into input that Zacros can process, it is
instructive to make drawings such as the ones in Figure 7. The pattern on the top left (CO-O interaction)
can be used to model the repulsive interaction between an adsorbed CO molecule on a top site and an O
adatom on an fcc site, for example on Pt(111). This pattern involves two monodentate species bound to
neighboring sites of different types. The pattern on the bottom left (Bidentate O2) represents the
bidentate binding configuration of O2. Finally, the pattern on the bottom right (O-O Interaction 3rd NN)
can model the interaction between O adatoms on Pt(111), for instance. It involves two monodentate
adsorbates and three sites, the second of which can be empty or occupied by another species.
Moreover, it involves a geometric criterion, as the angle between the links 1-2 and 2-3 has to be 180°
for sites 1 and 3 to be 3rd nearest-neighbors (for 2nd nearest-neighbors this angle is 120°; see also Figure
5). To represent patterns such as these, Zacros provides a number of keywords discussed below.
sites int1 Specifies the number of sites in the graph pattern representing
the cluster.
[Figure 7 schematic: three graph patterns, “CO-O Interaction” (two sites: CO* and O*),
“Bidentate O2” (two sites occupied by dentates i and ii of O2**), and “O-O Interaction 3rd NN”
(three sites: O*, a site with undefined state, and O*, with an angle of 180° between links 1-2
and 2-3).]
Figure 7: Schematics of various graph patterns representing energetic contribution terms (clusters) in a
cluster expansion Hamiltonian. The white numbers represent the indexes of each site of the pattern. The
lowercase roman numbers show the dentates of the bidentate oxygen.
neighboring int1-int2 … Specifies the neighboring between sites, if more than one site
appears in the graph pattern. It is followed by expressions
structured as int1-int2 in the same line of input as the
keyword. Each such expression denotes that the sites with
indexes int1 and int2 are nearest neighbors. The values of
int1, int2, … range from 1 up to the number of sites
specified by the keyword sites. There can be as many such
expressions as needed to fully define the neighboring structure
of the pattern.
& & & For this specification all three columns contain the
ampersand character. This denotes an unspecified
state for that site.
site_types str1 str2 … The types of each and every site in the pattern. There should
be as many strings following this keyword as the number of
sites in the pattern specified by sites. This keyword is
optional. If omitted, the pattern will be detected based on
criteria pertinent to site occupancy and neighboring only.
graph_multiplicity int The multiplicity of the pattern, namely the number of times
that the exact same pattern will be counted for a given lattice
configuration. This keyword is followed by an integer and can
be thought of as a symmetry number for the pattern (see also
the description of keyword cluster_eng below). It is an
optional keyword. Omitting the keyword is equivalent to
specifying a value equal to 1 for int.
cluster_eng real The energy contribution of the pattern, given as a real number
following the keyword. If graph_multiplicity is greater
than 1 for this pattern, the energy contribution is divided by
the (integer) multiplicity.
Examples
As guiding examples, we finally give the Zacros input defining the clusters presented in Figure 7.
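The listing below is a sketch of what the corresponding entries in energetics_input.dat could look like. The block structure follows the keywords discussed above; the cluster names and all energy values are illustrative placeholders, the lattice_state block is assumed to list, for each site of the pattern, the entity number, species name and dentate number (analogously to the initial keyword of the mechanism input), and the syntax of the angles keyword is also an assumption here:

energetics

cluster CO_O_interaction      # CO* on a top site repelled by O* on a neighboring fcc site
  sites 2
  neighboring 1-2
  lattice_state
    1 CO* 1
    2 O*  1
  site_types top fcc
  cluster_eng 0.15            # placeholder repulsion energy (eV)
end_cluster

cluster bidentate_O2          # O2** bound with dentates 1 and 2 on two neighboring sites
  sites 2
  neighboring 1-2
  lattice_state
    1 O2** 1
    1 O2** 2
  cluster_eng -1.00           # placeholder binding energy (eV)
end_cluster

cluster O_O_3NN               # two O* adatoms at 3rd nearest-neighbor fcc sites
  sites 3
  neighboring 1-2 2-3
  lattice_state
    1 O* 1
    & & &                     # middle site: unspecified state
    2 O* 1
  site_types fcc fcc fcc
  angles 1-2-3:180            # geometric criterion of Figure 7
  graph_multiplicity 2        # pattern detected twice (entities 1 and 2 interchangeable)
  cluster_eng 0.05            # placeholder interaction energy (eV)
end_cluster

end_energetics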
Permitted keywords within the two blocks just mentioned will be presented shortly. Before doing so,
however, let us briefly discuss how elementary steps of a reaction mechanism are represented in Zacros.
The rates of elementary reactions are calculated from Arrhenius relationships. For the forward step of a
reversible process:
k_\text{fwd} = A_\text{fwd} \cdot \exp\!\left( -\frac{E^\ddagger_\text{fwd}(\sigma)}{k_B T} \right) \qquad (6)
where Afwd is the pre-exponential (also referred to as pre-factor) and E‡fwd(σ) the activation energy of the
forward step for the current lattice configuration σ. Similarly, for the reverse step:
[Figure 8 schematic: top panel “CO-O Oxidation” (two sites: entities 1 and 2, CO* + O* giving
two empty sites + CO2(gas)); bottom panel “CO-O2 Oxidation” (five sites: tridentate O2*** with
dentates i-iii, CO* and empty sites, reacting forward/reverse to give CO2(gas)).]
Figure 8: Schematics of various graph patterns representing elementary steps of a reaction mechanism.
The white numbers represent the indexes of each site of the pattern. The lowercase roman numbers
show the dentates of the tridentate oxygen.
k_\text{rev} = A_\text{rev} \cdot \exp\!\left( -\frac{E^\ddagger_\text{rev}(\sigma)}{k_B T} \right) \qquad (7)
Microscopic reversibility dictates that the difference between forward and reverse activation energy is
equal to the reaction energy ∆Erxn(σ) (see also Figure 9):

E^\ddagger_\text{fwd}(\sigma) - E^\ddagger_\text{rev}(\sigma) = \Delta E_\text{rxn}(\sigma) \qquad (8)
In the expression above, ∆Erxn(σ) can be calculated from the energetics model (cluster expansion
Hamiltonian; see equation (5) in section Energetics Input File, page 47) as the difference between final
versus initial state energies, Efinal(σ) and Einitial(σ), respectively:

\Delta E_\text{rxn}(\sigma) = E_\text{final}(\sigma) - E_\text{initial}(\sigma) \qquad (9)
[Figure 9 schematic: energy versus reaction coordinate, showing the initial state, transition
state and final state, with the quantities E‡fwd,0, E‡fwd(σ), E‡rev,0, E‡rev(σ), ∆Erxn,0 and
∆Erxn(σ) marked.]
Figure 9: Energy profile of an elementary step. The quantities involved in the calculation of the forward
and reverse activation energies are noted.
E^\ddagger_\text{fwd}(\sigma) = \max\!\left( 0,\; \Delta E_\text{rxn}(\sigma),\; E^\ddagger_\text{fwd,0} + \omega \left( \Delta E_\text{rxn}(\sigma) - \Delta E_\text{rxn,0} \right) \right) \qquad (10)
where the max operator filters negative values, as well as values less than ∆Erxn(σ), if the latter is
positive (important: this operator is not applied for irreversible steps; please read also the cautionary
note in the step keyword on page 53). Moreover, E‡fwd,0 and ∆Erxn,0 are the activation and reaction
energies at the zero coverage limit (only the reactants existing on the surface), and ω is the so-called
proximity factor, ranging from 0.0 for an initial-state-like transition state to 1.0 for a final-state-like
transition state. The reverse activation energy expression that is in line with equations (8) and (10) is:
E^\ddagger_\text{rev}(\sigma) = \max\!\left( -\Delta E_\text{rxn}(\sigma),\; 0,\; E^\ddagger_\text{rev,0} - (1-\omega) \left( \Delta E_\text{rxn}(\sigma) - \Delta E_\text{rxn,0} \right) \right) \qquad (11)
where:

E^\ddagger_\text{rev,0} = E^\ddagger_\text{fwd,0} - \Delta E_\text{rxn,0} \qquad (12)
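As a worked illustration with hypothetical numbers: take E‡fwd,0 = 0.8 eV, ∆Erxn,0 = −0.4 eV and ω = 0.5, so that equation (12) gives E‡rev,0 = 1.2 eV. If lateral interactions shift the reaction energy of a particular event to ∆Erxn(σ) = −0.2 eV, equation (10) yields E‡fwd(σ) = max(0, −0.2, 0.8 + 0.5·0.2) = 0.9 eV, and equation (11) yields E‡rev(σ) = max(0.2, 0, 1.2 − 0.5·0.2) = 1.1 eV. The difference of the two activation energies is indeed −0.2 eV = ∆Erxn(σ), as required by equation (8).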
gas_reacs_prods str1 int1 … Provides information about the gas species participating in the
mechanism. The name of the first gas species is given in str1
and its stoichiometric coefficient in the elementary step in
int1 (negative for gas species consumed, positive for gas
species produced); further name/coefficient pairs may follow
for steps involving more than one gas species.
sites int1 Specifies the number of sites in the graph pattern representing
the elementary step being defined.
neighboring int1-int2 … Specifies the neighboring between sites, if more than one site
appears in the graph pattern representing the elementary step.
It is followed by expressions structured as int1-int2 in the
same line of input as the keyword. Each such expression
denotes that the sites with indexes int1 and int2 are
nearest neighbors. The values of int1, int2, … range from 1
up to the number of sites specified by the keyword sites.
There can be as many such expressions as needed to fully
define the neighboring structure of the pattern. For patterns
involving only one site, this keyword is omitted.
initial Specifies the initial state of each site in the graph pattern. It is
int1 str int2 followed by as many lines as the number of sites specified by
the keyword sites. Each one of these (non-blank) lines
contains an expression specifying the state of a site: the first
line corresponds to the site indexed 1 in the pattern, the
second line to site 2 etc. Note that there is no closing keyword
for initial; the program exits this input mode once the
appropriate number of such lines has been parsed. In each of
these lines, the first argument int1 is the number of the
molecular entity bound to that site. Thus, if a bidentate species
is bound to sites 1 and 3, both of these sites will have the same
integers in the first column. The second argument str gives
the name of the surface species bound to the site. Permitted
surface species names are those defined previously with
keyword surf_specs_names (see section Simulation Input
File, page 20). Finally, the third and last argument gives the
dentate number with which the species is bound. For sites
occupied by monodentate species, this number will always be
equal to 1.
final Specifies the final state of each site in the graph pattern. This
int1 str int2 keyword is subject to the exact same rules as the previously
introduced keyword initial.
site_types str1 str2 … The types of each and every site in the pattern. There should
be as many strings following this keyword as the number of
sites in the pattern specified by sites. This keyword is
optional. If omitted, the pattern will be detected based on
criteria pertinent to site occupancy and neighboring only.
A_\text{fwd}(T) = \exp\!\left( -\alpha_1 \log(T) + \frac{\alpha_2}{T} + \alpha_3 + \alpha_4 T + \alpha_5 T^2 + \alpha_6 T^3 + \alpha_7 T^4 \right) \qquad (13)
pe_ratio expr This keyword gives the ratio of forward over reverse pre-
exponentials and is valid only inside a reversible elementary
step specification block. The possible options for expression
expr are exactly as those of pre_expon, and the two
specifications (pre_expon and pe_ratio) should match:
either both specified as a single real number, or both as a
sequence of 7 real numbers. In the latter case, a similar
expression as that of equation (13) is employed by Zacros:
\frac{A_\text{fwd}(T)}{A_\text{rev}(T)} = \exp\!\left( -\beta_1 \log(T) + \frac{\beta_2}{T} + \beta_3 + \beta_4 T + \beta_5 T^2 + \beta_6 T^3 + \beta_7 T^4 \right) \qquad (14)
activ_eng real The activation energy at the zero coverage limit. For a
reversible step, real gives the forward activation energy at
the zero coverage limit E‡fwd,0 . The forward activation energy
for the given configuration, which enters the Arrhenius
equation (6) is computed through the BEP relationship (10).
The latter makes use of the reaction energy given by the
energetics model (cluster expansion Hamiltonian; see section
Energetics Input File). The reverse activation energy entering
the Arrhenius equation (7) is computed through equation (11),
such that detailed balance is automatically satisfied.
prox_factor real The proximity factor used in the BEP relationship to calculate
the forward (and also reverse, if applicable) activation energy
(see equations 10, 11). If this keyword is omitted, a default
value of 0.5 is used for that elementary step.
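To show how the keywords of this section combine, here is a sketch of a reversible two-site step in mechanism_input.dat; the step name, species, site types and all numerical values are illustrative placeholders:

mechanism

reversible_step CO_oxidation        # CO* + O* -> 2 empty sites + CO2(gas)
  gas_reacs_prods CO2 1             # one CO2 molecule produced per event
  sites 2
  neighboring 1-2
  initial
    1 CO* 1
    2 O*  1
  final
    1 * 1
    2 * 1
  site_types top fcc
  pre_expon  5.0e+12                # placeholder pre-factor
  pe_ratio   1.0e+00                # placeholder ratio of fwd over rev pre-factors
  activ_eng  0.90                   # E‡fwd,0 in eV (placeholder)
  prox_factor 0.30                  # placeholder BEP proximity factor
end_reversible_step

end_mechanism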
seed_on_sites str int1 int2 … One or more such “individual seeding” instructions can
appear in place of expr in an initial state specification block
(see above). Each such instruction seeds one particle of the
species with name str on sites specified by the integers
int1, int2,… Permitted surface species names are those
defined previously with keyword surf_specs_names and
the number of integers following str must not exceed the
number of dentates of that species, defined by
surf_specs_dent (see section Simulation Input File, page
21). Finally, the int1, int2,… can range between 1 up to the
number of sites that exist on the lattice.
seed_multiple str1 int1 One or more such “multiple seeding” blocks can appear in
site_types str2 str3 … place of expr in an initial state specification block (see above).
neighboring int2-int3 … Each such instruction seeds multiple particles of the species
end_seed_multiple with name str1. The number of particles is defined by the
integer int1. The site types in which these particles will be
seeded is given by str2, str3, … There should be as many
such strings as the number of dentates of that species, defined
by surf_specs_dent (see section Simulation Input File,
page 21). If the species is monodentate, the neighboring
keyword is omitted; otherwise a neighboring structure is
specified using this keyword. This is done by using as many
expressions of the form int2-int3 as needed, in order to
define the links between the sites occupied by that species.
Note that if the neighboring structure thus defined cannot be
found on the lattice, execution of the seeding instruction will
fail. Note that for MPI runs, “multiple seeding” instructions are
not currently supported.
Examples
As illustrative examples, consider the following cases:
Suppose we need to seed 2 carbonate (CO3) particles randomly on the Au6 lattice (Figure 6). CO3 binds in
a top-bridge-top configuration at sites cn2-br42-cn4. Thus, we can use the following instructions in the
file state_input.dat:
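A sketch of such a block follows; the species name CO3*** is a placeholder for however the tridentate carbonate has been declared in the simulation input, the dentate order (cn2, br42, cn4) follows the binding configuration just described, and the initial_state/end_initial_state delimiters of the state specification block are assumed:

initial_state
  seed_multiple CO3*** 2
    site_types cn2 br42 cn4
    neighboring 1-2 2-3
  end_seed_multiple
end_initial_state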
Alternatively suppose we would like to seed two CO3 molecules at specific sites on the Au6 structure. We
could then use the following syntax:
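For instance (again with the hypothetical species name CO3***), the following places one carbonate on sites 1-2-3 and another on sites 5-9-12, both valid cn2-br42-cn4 triplets according to the neighboring lists of Figure 6:

initial_state
  seed_on_sites CO3*** 1 2 3
  seed_on_sites CO3*** 5 9 12
end_initial_state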
Command-Line Arguments
Zacros can also parse command-line arguments enabling it to override the default paths to all the input
files (see Input/Output Files, page 16), and some of the options read from a restart.inf file. In
particular, one can run Zacros as follows:
/path/to/executable/zacros.x --keyword=argument
For example:
C:\KMC\Zacros.exe --lattice=".\lattice_cases\lattice_25.dat"
Compiler Information
If the compiler supports this feature, this section gives information about the compiler version and
options used to generate the Zacros executable.
Threading/Multiprocessing Information
This section gives information about parallelization. If the program has been compiled as a serial
application, the message “Serial run (no parallelism).” will appear. If the compiler
recognized the OpenMP directives, the message will read “Shared-memory multiprocessing
with int OpenMP threads.” where int is the number of threads used during execution (please
refer to section Running Zacros, page 15, for more information about setting the number of threads).
For an MPI run, the following messages appear: “Distributed run with int1 MPI
process[es], without OpenMP threads.” or “Distributed run with int1 MPI
process, each with int2 OpenMP thread[s].”, “The rank of this MPI
process is int3.” and “The name of this processor is “str”.”. In these
messages, int1 is the total number of MPI processes used in the run, int2 is the number of threads in
each MPI process (if Zacros has been compiled with OpenMP and MPI directives), and int3 is the rank
of the MPI process that wrote the current general_output.txt file. Finally, str is the name of the
processor (computational node for high-performance computing clusters), e.g. node-k98j-004. The
latter information can be useful for troubleshooting, in case, for instance, a node is slower than the
others and results in an overall slow-down of the simulation. The name of the processor for MPI runs is
also written every time the simulation is restarted, since different processors may be chosen (or
different ranks may be assigned) in the new simulation chunk.
Simulation Setup
This section repeats the information parsed while processing file simulation_input.dat. If
everything is valid, this section ends with the message “Finished reading simulation
input.” otherwise an error is output to this file and execution is terminated.
Lattice Setup
In this section, information about the lattice structure is presented, namely the type of lattice
specification (default, periodic, explicit; see section Lattice Input File, page 41), the area of the
simulation box for periodic lattices, the site types and the number of sites per type, as well as the
maximum coordination number in the lattice. If everything is valid, this section ends with the message
“Finished reading lattice input.” otherwise an error is output to this file and execution is
terminated.
Energetics Setup
This section reports the number of clusters for the cluster expansion Hamiltonian parsed from the
energetics_input.dat file, and the maximum number of sites involved in a cluster. A summary of
the clusters defined is also given. For certain “simple” systems, the following message appears in this
section, right after the summary just noted: “This cluster expansion involves only
single body patterns.”. For such systems, the corresponding accelerator is automatically
enabled (see section Memory Management on page 37). If everything is valid, this section ends with the
message “Finished reading energetics input.” otherwise an error is output to this file and
execution is terminated.
Mechanism Setup
This section reports the number of elementary steps parsed from the mechanism_input.dat file,
and the maximum number of sites involved in a step. A summary of the elementary steps contained in
the mechanism is also given. For certain “simple” systems, the following message appears in this
section, right after the summary just noted: “This mechanism contains up to two-site
events involving only monodentate species.”. For such systems, the corresponding
accelerator is automatically enabled (see section Memory Management on page 37). If everything is
valid, this section ends with the message “Finished reading mechanism input.” otherwise
an error is output to this file and execution is terminated.
Simulation Preparation
This section opens with the message “Preparing simulation” and contains information about
the preparatory steps of the simulation, pertinent e.g. to the construction of the lattice, the pre-
allocation of data-structures handling certain elements of the simulation, and the initialization of these
data-structures while setting up the lattice state, adsorbate energetics, and elementary events. MPI runs
provide extra information about the lattice partitioning to subdomains, the size of halos etc. as well as
the MPI-specific data-structures (state queue and message queue; see section Simulating Very Large
Lattices on page 29 for more details on these).
Simulation Output
This section opens with the message “Commencing simulation” and closes with “Simulation
stopped”. If event reporting is turned on (see keyword event_report in section Simulation Input
File, page 24) the occurrence of each lattice process is reported using the following format:
KMC step int1
Elementary step str
occurred at time t = real1
involving site(s): int2 int3 …
Its propensity at T0 was k(T0) = real2
Its propensity at T(t) was k(T(t)) = real3
Its activation energy was Eact = real4
Its lattice delta energy of reaction was DElat = real5
where int1 is the KMC step number, str is the name of the elementary step that just occurred,
real1 is the time of occurrence thereof, and int2, int3,… are the indexes of the lattice sites on
which the event took place. The values of real2 and real3 are the propensities of the event at the
initial and current temperature of the simulation (these should be the same if the temperature is
constant). The activation energy and lattice reaction energy (neglecting contributions from gas species)
are given by real4 and real5.
Output line (i) can be generated in step 3.2 of the algorithm in page 26, whereas the output of line (ii)
can be generated in steps 3.6.1.1 and 3.6.2, and that of line (iii) in step 3.6.1.2. Note that such output is
produced only when the stiffness coefficients change, and real gives the time when this happened
during the simulation. Following any of these lines is a description of what was detected and what
actions were taken; for instance, it may be reported that all elementary processes were found to be fast
and quasi-equilibrated, or that there were non-equilibrated steps, of which the fastest one is reported.
The new stiffness coefficients are subsequently reported.
===============================================================================
Performing collective communications for global virtual time (GVT) computation
===============================================================================
Entered GVT-comp block at: real1
After MPI-reduction: real2
-----------------------------------------------------------------------------
Different times involved in the Time-Warp algorithm
-----------------------------------------------------------------------------
Global virtual time : real3
Global virtual time advancement : real4
Local time : real5
Minimum timestamp among sent messages : real6
Minimum timestamp among received messages : real7
-----------------------------------------------------------------------------
Time-Warp performance statistics in this GVT interval
-----------------------------------------------------------------------------
Number of snapshots taken : int1
Current (and max) size of snapshot queue : int2 of int3
Snapshot step adaptive interval : int4
Number of restore operations performed : int5
KMC time that was rerun due to rollbacks : real8
Ratio of the above over the GVT advancement : real9
KMC time spent in rollback propagation : real10
Ratio of the above over KMC time in rollbacks : real11
===============================================================================
The first two reals are clock times measured from the start of the main KMC loop: real1 is the time in
which the current MPI process entered the code section dealing with global communications, while
real2 is the time in which the reduction operations actually took place (in order to compute the GVT
or decide on simulation termination). These times should be close to each other (differences of tenths of a
second are typically expected) and approximately multiples of the time specified by the
time_interval_gvt_computation keyword. In some instances, one may see larger differences,
on the order of a few seconds; the likely reason is that at least one MPI process was “busy” doing something
else when all other MPI processes reached the global communications section. If this happens relatively
rarely, it is not a problem, but persistent behavior of this kind might indicate underlying issues. Do not hesitate
to notify us if you see such behavior.
The GVT and the GVT advancement are reported as real3 and real4. The latter difference is the
value of the GVT in the present reporting block minus that of the previous block. In “well-behaved” runs,
the GVT advancement should be non-zero most of the time. If this is zero in several consecutive blocks,
a warning is issued, and this might indicate a problem or just that the value specified in
time_interval_gvt_computation is too small (in other words, the run spends quite a lot of
time in global communications and not in advancing the simulation time). Moreover, the local time is
reported as real5.
The next two reals, real6 and real7 denote the minimum timestamps for sent and received
messages. These are provided mostly for information and if no messages are stored in the queue the
values printed are huge(1.0_8) = 1.797693134862316E+308.
The next section in the block pertains to Time-Warp performance statistics. Integer int1 denotes the
number of snapshots taken within the interval from the last global communication to the current one.
Moreover, int2 and int3 show the current number of KMC state snapshots stored in the state queue,
while int4 shows the current value of the snapshot interval, i.e. the number of KMC steps between
successive snapshots stored in the state queue (this interval is adaptive, as explained in the discussion
of the Time-Warp algorithm in section Overview of the Approach, page 29). If int2 is appreciably
smaller than int3, this means that memory is not utilized effectively; a large portion of the memory
committed for the state queue via the second argument of keyword state_queue_snapshots
remains unutilized. In certain cases, this may be the desired behavior. In general, however, it might be
worth trying with smaller values for the snapshot interval (first argument of keyword
state_queue_snapshots and initial value of int4). Indeed, the efficiency of the Time-Warp
algorithm has been observed to depend strongly on the frequency of snapshot saving, with more
frequent saving resulting in better performance, though only up to a point, beyond which excessively
frequent saving hinders the progress of the run.
Next, int5 is the number of restore operations performed from the previous up to the current global
communication. It is important to check that the pertinent values reported among different MPI
processes are within the same range, e.g. all somewhere between 900 to 1200 restore operations. Large
differences are strong indications that the computational load is not well balanced. This could happen
for a strongly heterogeneous simulated system, or due to software or hardware issues. For instance, the
processors used may not have the same specifications (e.g. may have different clock speeds) or there
may be background tasks that slow down a certain MPI process. In such cases, the simulation will still
proceed but with a speed that depends on the slowest MPI processes.
The next four numbers provide information about the overheads pertinent to the rollbacks and re-
simulations, which are key (and computational-time-consuming) operations of the Time-Warp
algorithm. Thus, real8 reports the cumulative KMC time that had to be rerun due to rollbacks, and
real9 the ratio thereof over the GVT advancement (i.e. real9 = real8/real4). Depending on
the system, this value can range from 4 or 5 up to 80 or even higher for “difficult” systems.
Clearly, the higher the value for a given run, the less efficient the Time-Warp algorithm for that
simulation, since Zacros repeats segments of the simulated trajectory several times, until all boundary
conflicts are resolved.
Out of the re-simulated segments, some time is spent in “rollback propagation”; for instance, if a
boundary conflict arises at time 0.126 (measured in the time units of the KMC simulation) and the most
recent KMC state saved in the state queue has a timestamp of 0.110, then 0.016 time units must be re-
simulated in rollback propagation mode. The total time re-simulated in rollback propagation (from the
previous global communication till the current one), is reported in real10. Moreover, real11 is the
ratio between the KMC time spent in rollback propagation and the total KMC time that had to be re-
simulated due to rollbacks (i.e. real11 = real10 /real8). This ratio must be less than 1 and should
be kept sufficiently low by choosing a high enough frequency of snapshot saving.
Simulation End
At the end of the simulation, right after the message “Simulation stopped”, the KMC time, total
number of elementary events simulated, and the event frequency are reported:
Performance Facts
Metrics about the performance of the program are also reported right after message “Performance
facts”:
In the above, real1 gives the CPU time spent whereas real2 is the real time, which we could, for
instance, measure using a stopwatch. The wall_time constraint is imposed on real time (see section
Simulation Input File). If the code was compiled as a serial application, real1 and real2 should be
approximately the same. Moreover, real3 gives the time required to set up the simulation (including
the time for parsing the input and setting up the data-structures), whereas real4 reports the time
spent in the actual KMC loop (involving event execution, update and reporting). The two values real3
and real4 should sum up to real2.
The value of real5 gives the real time needed on average to execute a single KMC step. This time is
reported in seconds and is used to compute how many events could be executed if the simulation were
left running for one hour, as also reported in the value of int.
Moreover, real6 gives the real time needed on average to propagate the system for 1 unit of KMC
time. real7 shows how far in KMC time the system will go in one hour of real time.
Note that the performance metrics are not aggregated if the simulation is run in multiple chunks by use
of the restart feature. Thus, every time the simulation is restarted, Zacros resets the counters used to
evaluate these performance metrics.
Note that the total number of times run (int1) is not equal to the number of KMC events simulated.
This happens because in the course of the KMC simulation there are always processes that are detected
but subsequently removed if any of the participating adsorbates “decides to do something else”. The
number of times failed int2 should be zero. A non-zero value indicates that on one or several occasions
(as many as int2) the Newton-Raphson loop went through the maximum number of iterations (150 by
default) without converging, which may be cause for concern. The maximum errors are also reported:
real2 is the maximum norm of the difference between subsequent approximations of the solution,
whereas real3 is the maximum norm of the right-hand side. Both tolerances are 10−9 by default. Refer
to section Simulation Input File, page 18, on how to override these default tolerances and the maximum
number of iterations.
The meaning of the integers int1, int2 and int3 is self-explanatory. These numbers can sometimes
help do some quick sanity checks, e.g. for a simulated system for which the cluster expansion contains
only single body terms (no lateral interactions), one would expect no updates (int3 should be zero). Be
cautious if you are using these numbers to try to deduce where bottlenecks lie in your simulation. In
particular, keep in mind that the detection of reaction or energetic interaction patterns may be much
more time-consuming than these queueing operations.
The integers int2, int4, int6 and int8 report, respectively, the values of the maximum allowed
numbers of events in the entire lattice (Nmax^events), events per adsorbate (Nmax^events,adsorb.), energetic
clusters in the entire lattice (Nmax^clusters) and energetic clusters per adsorbate (Nmax^clusters,adsorb.), as
calculated from equations 1-4 (see section Memory Management on page 37). On the other hand,
int1, int3, int5 and int7 report on the actual memory utilization. Thus, int1 is the maximum
number of events in the entire lattice that had to be retained during the course of the simulation,
int3 is the maximum number of events per adsorbate encountered, etc. The real numbers
real1,…,real4 denote the per-cent utilization of the memory committed for each data-structure;
thus, real1 = 100*int1/int2, real2 = 100*int3/int4, etc. Based on these numbers,
one can easily adjust the parameters µevents, µevents^adsorb., λclusters and λclusters^adsorb., so as to improve memory
utilization. For instance, if µevents = 50 (value of int2) and the utilization reported by real1 is e.g.
1.15%, one could reduce µevents to 1 by using the expression “override_array_bounds 1 & & &”
in simulation_input.dat, thereby increasing the memory utilization to about 58%.
Finally, if the simulation has terminated successfully, the message “> Normal termination <” is
written at the end of the file general_output.txt. If the simulation is restarted, a
short message appears providing information about how many times the simulation has been restarted
previously, along with the last reported time and number of KMC events. The message “> Normal
termination <” is also written at the end of every restart session.
0 real1 real2 0 0 …
0 real3 real4 0 0 …
namely integer-type zeroes everywhere, except the 2nd and 3rd element of each row. These non-zero
elements give the two vectors defining the entire simulation box in row format, namely α = (real1,
real2) and β = (real3, real4). If the lattice has been defined using the keyword explicit these
real numbers have values of zero. For distributed runs, these values correspond to the vectors of the
entire domain, not just the subdomain of the corresponding MPI process.
The third and following lines give all the information pertinent to each site of the lattice, following the
format:
int1 real1 real2 int2 int3 int4 int5 …
where:
int1 (1st column) is the index of the site (ranging from 1 to the total number of sites),
real1 and real2 (2nd and 3rd columns) are the x and y Cartesian coordinates of site int1,
int2 (4th column) gives the site type of the site with index int1,
int3 (5th column) gives the coordination number of the site with index int1,
int4 int5 … (6th and following columns) give the 1st nearest neighbors of the site with index int1.
Zacros always reports as many integers here as the maximum coordination number, writing zeroes after
the last nearest neighbor.
For distributed runs, each MPI process reports information about the sites of the internal part of its
subdomain (not the halo). Thus, in the example lattice of Figure 1, MPI process 0 will write information
about sites 1, 2 and 3 on lines 3, 4, and 5 of lattice_output.txt, followed by information about
sites 7, 8, 9 on lines 6, 7 and 8, and so on. Still though, the neighboring lists will include site numbers of
neighbors outside the internal part of the subdomain. Thus, MPI process 0 will list the following sites as
neighbors of site 1: 2, 7, 6, 31 (even though sites 6 and 31 fall outside the subdomain interior).
The first few lines of the file constitute a header, with general information about the simulation:
These are mostly self-explanatory. Note, however that for explicitly defined lattices (see keyword
explicit in section Explicitly Defined Custom Lattices, page 41), the Simulation box information does
not appear.
The rest of the file consists of sections beginning with the word “configuration” followed by information
structured as discussed below.
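Schematically, each such section has the following layout (a sketch assembled from the field descriptions below):

configuration int1 int2 real1 real2 real3
int3 int4 int5 int6
… (one such line for each lattice site)
int7 int8 …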
In the structure just shown, int1 is a counter showing how many configurations have been written so
far in file history_output.txt. Integer int2 gives the number of KMC events that have happened
up to that point. The next three reals real1, real2 , real3, give the time, temperature and the
energy of the current lattice configuration. The subsequent lines contain integers that encode the state
of the lattice; there are as many such lines as the number of lattice sites. The information is presented
as follows:
int3 (1st column) is the index of the site,
int4 (2nd column) gives the entity/adsorbate number (each adsorbate on the lattice has a unique
number/identifier, this is it),
int5 (3rd column) denotes the species number (zeroes are reported for empty sites),
int6 (4th column) gives the dentate number with which entity int4 occupies site int3.
Finally, the last line (int7, int8, …) gives the number of molecules produced (or consumed if the
corresponding value is negative) for each gas species in the chemistry. The order in which these
numbers are reported is the same as the order with which gas species were defined (see
gas_specs_names and pertinent keywords in section Simulation Input File, page 20) and also are
mentioned in the header of history_output.txt. Thus, int7 refers to species str1, int8 to
species str2 etc.
For distributed runs, each MPI process reports the states of sites in the interior as well as in the halo of
the subdomain it handles. Moreover, each line encoding the state of a site contains 5 integers: the first 4
(int3 – int6) are as discussed earlier, while the 5th integer takes the value of 1 if the site is in the halo
of the subdomain, otherwise the value of 0 (interior site).
File, page 22). As for the other output files, in distributed runs each MPI process generates its own file
and MPI processes with rank ≥ 1 append _{rank} to their filename. Each of these files contains
information about the events that were scheduled and executed by that MPI process, i.e. ignoring
messaged events. The contents of the procstat_output.txt (family of) file(s) are structured as
follows.
The first line of the file constitutes a header following the format:
The word “Overall” appears always first and is followed by strings that correspond to the names of all
elementary events defined in file mechanism_input.dat.
The rest of the file consists of sections beginning with the word “configuration” followed by information
structured as discussed below.
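Each section follows this schematic layout (again a sketch inferred from the field descriptions below):

configuration int1 int2 real1
real2 real3 real4 …
int3 int4 int5 …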
In the above, int1 is a counter showing how many configurations have been written so far in file
procstat_output.txt. Integer int2 gives the number of KMC events that have happened up to
that point and real1 the current time. The next two lines provide statistical information about each
elementary step of the mechanism in the same order as mentioned in the header.
Thus, real2, real3, real4, … give the average waiting times τk (also referred to as inter-arrival
times) for each reaction event:
\bar{\tau}_k = \frac{1}{N_k^\text{occur}} \sum_{i=1}^{N_k^\text{occur}} \tau_{k,i} \qquad (15)
where the averaging is done every time elementary event k occurs. Thus, Nk^occur is the number of times
event k was executed so far in the KMC simulation, and τk,i the waiting time for that event to occur (the
waiting time for event k is by definition the time that has passed since the occurrence of the most recent
event of any type). The value of real2 refers to an “overall” average in which all events are considered.
Moreover, int3, int4, int5, … give the numbers of times each event was executed during the KMC
simulation. The value of int3 refers to the total number of events and should be the same as the value of
int2 in the header of file procstat_output.txt. Moreover, the values of int4, int5, … should
sum up to that of int3.
contains information about the species in the interior of the subdomain handled by the MPI process, as
well as about the events that were scheduled and executed by that MPI process and the gas phase
molecules produced or consumed due to these events (i.e. ignoring messaged events). The frequency at
which this information is reported is defined by keyword species_numbers in file
simulation_input.dat (see section Simulation Input File, page 22). The contents of this file are pretty self-
explanatory and are summarized in the first line (header) of the file. The overall structure is as follows:
Entry refers to an integer counting how many lines have been written to this output file. The column
marked as “Nevents” shows the total number of KMC events that happened up to that point, followed
by a column that shows the (simulated) time passed. The next column gives the temperature which
should be constant unless a temperature ramp has been specified (see keyword temperature in
section Simulation Input File, page 19). The column marked as “Energy” gives the energy of the
current lattice configuration. The following columns marked as str1, str2, … report the number of
molecules of each species currently adsorbed on the lattice (the strings are the names of the surface
species). Note that total numbers are reported; thus, if a species can bind to two different sites, this
output does not provide any information as to how many particles are bound to sites of type 1 versus 2.
Finally, the columns marked as str3, str4, … report the number of molecules of each gas species. For
species that appear as products in the net reaction under consideration, one should expect to see
positive numbers in this column. Negative numbers would be reported for reactant species.
The rest of the file consists of sections with the following structure:
where integer int1 is a counter showing how many configurations have been written so far in file
energlist_output.txt and integer int2 gives the number of KMC events that have happened
up to that point. real1 is the current time, real2 the current temperature and real3 the current
total energy of the lattice. Next, int3 shows the number of energetic clusters reported whereas int4
is the total number of clusters detected (all of which contribute to the total energy of the lattice). These
two values should be the same, unless selective reporting of clusters has been specified by using the
keyword select_cluster_type (see section Simulation Input File, page 23). The next lines contain
the list of clusters: integer int5 is the index of a cluster (for instance, if int5 = 2, the cluster’s name is
str2 as reported in the header section, etc). The following integers in the same line, give the lattice
sites on which this cluster was detected. For single-site patterns (e.g. on-site formation energy of a
mono-dentate species) only one integer int6 would appear. For multi-site patterns there are as many
integers as the number of sites in cluster int5.
For distributed runs, each MPI process reports the instances of energetic patterns (clusters) that are in
the interior of its subdomain. This condition is satisfied if the site covered by the first dentate of the first
molecule of the cluster is not in the halo of the subdomain. This “accounting scheme” ensures that each
cluster instance will be reported only once within the output files of all the MPI processes.
where integer int1 is a counter showing how many configurations have been written so far in file
energlist_output.txt and integer int2 gives the number of KMC events that have happened
up to that point. real1 is the current time, real2 the current temperature and real3 the current
total energy of the lattice. Next, int3 shows the number of lattice processes reported whereas int4 is
the total number of (imminent) lattice processes stored in the event queue. These two values should be
the same, unless selective reporting of processes has been specified by using the keyword
select_event_type (see section Simulation Input File, page 23). The next lines contain the list of
processes, where int5 is an index pointing to an elementary event (for instance, if int5 = 3, the
elementary event’s name is str3 as reported in the header section, etc), real1 gives the propensity
of the event at the initial temperature (beginning of simulation), real2 gives the propensity at the
current temperature, real3 is the activation energy, and real4 is the lattice reaction energy of the
event. The following integers in the same line (int6, int7 etc), give the lattice sites on which this
process was detected. For single-site patterns (e.g. adsorption/desorption of a mono-dentate species)
only one integer int6 would appear. For multi-site patterns there are as many integers as the number
of sites in elementary event int5.
For distributed runs, each MPI process reports the instances of elementary events (lattice processes)
that are in the interior of its subdomain. This condition is satisfied if the first site of the pattern is not in
the halo of the subdomain. Notice that this “accounting scheme” uses a slightly different convention from
the one used in the energetics lists, but still ensures that each lattice process will be reported only once
within the output files of all the MPI processes.
This constant will be zero, unless an “empty cluster” has been specified. The latter is a cluster involving
a single site with an unspecified state (using keyword &; see section Energetics Input File, page 49).
The expressions above are written when a pattern representing an energetic contribution has been
detected in the current lattice configuration. During the course of the simulation, Zacros keeps a list of
all such patterns, so that it can quickly compute changes in the lattice energy when
adsorption/desorption, diffusion and reaction events take place. Thus, int1 is the index in this list of
patterns, str is the name of the pattern just detected (one of the cluster names defined in
energetics_input.dat; see section Energetics Input File, page 48); int3 int4 … give the
location of this pattern on the lattice; int5 and real just repeat the graph multiplicity and energy
contribution of the pattern.
The message above indicates that an energetic contribution was removed from the list, because the
corresponding pattern ceased to exist.
Regarding this message, note that Zacros stores the list of patterns in a data-structure in which each
pattern is indexed by an integer ranging from 1 to the total number of patterns NTot. To avoid generating
gaps in this data-structure upon removal of a pattern Nremv, the last pattern with index NTot takes the
index Nremv, so that the new set of indexes ranges from 1 to NTot – 1. The message above indicates that
such a re-indexing took place, with NTot = int1 and Nremv = int2.
The expressions above are output when a pattern representing an elementary process has been
detected in the current lattice configuration. During the course of the simulation, Zacros keeps a list of
all such patterns in a heap data-structure, in order to be able to find in constant time the next event to
take place. Thus, int1 is a unique identifier in this heap, str is the name of the pattern just detected
(one of the elementary event names defined in mechanism_input.dat; see section Mechanism
Input File); int3 int4 … give the location of this pattern on the lattice; real1 is the activation
energy at the zero coverage limit (E‡fwd,0 or E‡rev,0 in equations 10, 11); real2 is the actual activation
energy for the current configuration (E‡fwd(σ) or E‡rev(σ) in equations 10, 11); real3 is the reaction
energy at the zero coverage limit (∆Erxn,0 in equations 10, 11); real4 is the actual reaction energy
for the current configuration (∆Erxn(σ) in equations 10, 11). The value of real5 gives the propensity
(equations 6, 7) at the initial temperature of the simulation (which would be the same throughout the
simulation if no temperature ramp has been defined). Finally, the random time for the occurrence of
that event is reported in the last line: real6 is the absolute time of occurrence and real7 is the time
increment (relative to the current time in which the process was detected).
The message above indicates that an elementary process was removed from the list, because the
corresponding pattern ceased to exist.
This message indicates that a process has been re-indexed to avoid generating gaps in the heap data-
structure upon removal of a pattern. Thus, if pattern Nremv is being removed, the last pattern with index
NTot takes the index Nremv, so that the new set of indexes ranges from 1 to NTot – 1. The message above
indicates that such a re-indexing took place, with NTot = int1 and Nremv = int2.
Notes on Troubleshooting
Zacros is able to identify syntax errors in the input files. If such an error is detected, the program will
report an error with a detailed description of what the problem was and in which line of which file it was
encountered.
In some cases though, the syntax may be perfectly valid but the specification might not be the one
intended. The following notes provide some hopefully useful considerations and guidelines for
troubleshooting.
1. Numbering/Naming consistency: make sure your numbering and naming is consistent throughout
your input. For instance, the order in which the names of surface species appear after keyword
surf_specs_names must match their dentate numbers after keyword surf_specs_dent.
Similarly for the gas species definition.
2. Pattern consistency: make sure that the binding configurations of different species are used in a
consistent way in the following: the seeding instructions of state_input.dat (see section Initial
State Input File, page 61), the energetic clusters of energetics_input.dat (see section
Energetics Input File, page 47) and the elementary events of mechanism_input.dat (see section
Mechanism Input File, page 52). For instance, if the binding configuration of a bidentate species has
been defined with dentate 1 occupying a site of type “top” and with dentate 2 occupying a site of
type “fcc”, this convention should be followed throughout. If an individual seeding instruction
(keyword seed_on_sites, section Initial State Input File, page 61) places this species on the
wrong sites, Zacros will execute the instruction, but since no cluster contribution pattern will be
detected, the lattice energy will remain the same after addition of this species.
3. Units and conventions: it is necessary that the values entered for parameters such as rate constants,
energies or angles in patterns, adhere to Zacros’s unit system and sign conventions. For instance,
energies are given in eV (unless you have recompiled Zacros for a different unit system, see section
Units and Constants on page 17). Values for angles are signed, with positive values denoting the
counter-clockwise direction (see keyword angles on pages 50 and 58).
4. Quick checks: it is worth checking the summary of energetics and mechanism specifications in file
general_output.txt (see section General Output File, page 63). If the patterns that appear
there are not the ones intended, there may be a problem with the input. In some cases, the program
will issue warnings that may not have catastrophic consequences, but potentially need to be
addressed.
5. Advanced checks: one can frequently discover problems in the simulation setup by making use of the
debugging keywords:
debug_report_processes
debug_report_global_energetics
debug_newtons_method
debug_check_processes
debug_check_lattice
debug_check_caching
in simulation_input.dat (see section Simulation Input File, page 18) along with the output
information of the debugging files:
globalenerg_debug.txt
process_debug.txt
For instance, to make sure that the energetics model was defined properly, one could work out an
example problem, specify a configuration in file state_input.dat and see if the clusters are
being detected properly.
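As a minimal illustration of note 1, consider the following (hypothetical) species declarations in simulation_input.dat; the ordering after surf_specs_dent must follow the ordering after surf_specs_names, so that here CO* is declared monodentate and O2** bidentate:

surf_specs_names   CO*   O2**
surf_specs_dent    1     2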
Known Issues/Limitations
1. If you try to compile Zacros with CMake and the build system fails to find a compiler, check the FC
environmental variable by running the following command:
echo $FC
If this returns an empty string, then you may try setting the compiler by:
export FC=ifort
where ifort, can be substituted with another compiler (available in your system).
2. For input files, the maximum record length that can be parsed is 2^13 = 8192 characters. A maximum
of 3000 words can be parsed. The maximum allowed length for the names of species, site-types,
clusters and mechanism-steps is 64 characters. These limits can be changed by redefining the
appropriate constants in file constants_module.f90 and recompiling Zacros.
3. Sites with unspecified states are not supported for elementary events. In most cases, sites that
participate in the elementary event are occupied by reactants, products or transiently by the
transition state. Thus, “extra” sites must usually be defined as empty rather than unspecified.
4. The calculation of energetics for lattices smaller than the maximum interaction length fails to
provide accurate values. It is recommended that the size of the lattice be chosen as at least twice
the length of the longest-range interaction pattern.
5. In the Zacros code, some assumptions have been made about, for instance, the maximum number
of processes that need updates or adsorbates that need to be removed after a KMC event. While
these have empirically been shown to work well for the systems we have tested, it may happen
that they are inadequate for some other system. In such situations, Zacros will terminate
abnormally and you will most probably see errors like the following:
forrtl: severe (408): fort: (10): Subscript #1 of the array
PROCSUPDATE has value 3 which is greater than the upper bound of 2
If you encounter such problems you will have to increase the size of the “problematic” array and
recompile. Please do not hesitate to contact us in case you need help. We are planning to
implement memory amortization to avoid such issues altogether in the future. In the current Zacros
version, a memory management scheme exists that enables the user to specify convenient limits to
the sizes of the main data-structures of the KMC simulation (see section Memory Management on
page 37).
6. Certain limitations exist for the MPI Time-Warp implementation and the corresponding parallel
emulation scheme. Please consult section Performing MPI Time-Warp Runs on page 33 and section
Validation of MPI Time-Warp Runs on page 35, for details on these limitations.
7. There is no explicit limitation on the number of surface and gas-phase species, the size of the lattice,
and the number of cluster and elementary event patterns that can be defined, as the pertinent
data-structures consist of allocatable objects. However, different compilers and operating systems
may impose their limitations. Please refer to the documentation thereof.
8. On the Cray compiler, end-of-file-related problems have been reported. In particular, there has to
be a newline character after the last record, otherwise Zacros is unable to correctly parse this
record.
9. On the NAG compiler, some functions, e.g. int8, are not supported. We have taken care to use
standard versions of functions and we have successfully tested Zacros on NAG 6.0. If you are using
an older version and encounter problems, please do not hesitate to contact us for help.
10. When running the tests with CMake (see section Using the CMake Build System, page 13), if you are
working with slow hardware, you may see one or more tests failing with timeout, e.g.:
Start 52: PAREMUL_SAMPLE_RUN_ADS_DES_RXN_DIFF_LATINTER_HEXA_4PROCS
24/55 Test #52: PAREMUL_SAMPLE_RUN_ADS_DES_RXN_DIFF_LATINTER_HEXA_4PROCS ...***Timeout 90.02 sec
To address this, you can increase the time that CMake allows this set of tests to run for. These
times are defined in file path/to/source/of/Zacros/tests/CMakeLists.txt, by the
following lines:
set(fasttimeout "90")
set(mediumtimeout "1000")
set(slowtimeout "15000")
By looking at the time in which the test failed (90.02 seconds in the above example), you know that
this was a fast test (you can also see this information explicitly in CMakeLists.txt, in the lines
that start with add_regression). You can then increase the corresponding timeout value, e.g.
change the first of the three lines above to:
set(fasttimeout "120")
If the new timeout value is not enough, you can try progressively larger values; of course,
excessively large values might indicate a problem with your hardware or the compiled
executable (in which case, feel free to contact us for help).
11. There is a class of tests run by CMake, which we refer to as run-resume tests. Successful outcomes
of these tests validate that Zacros can successfully stop and restart a simulation from the last
written checkpoint (refer to keywords wall_time and no_restart in section Simulation Input
File, page 25). Each of these tests tries to run a simulation in several “chunks” and checks that
(i) indeed the simulation was run in more than one chunk, (ii) the simulation was successful (no
crashes) and (iii) the results obtained were the same as those of the one-off simulation, which are
provided as the “reference data”. In some cases, your hardware may be too fast, and condition (i)
may be violated. In this case, your simulation finishes before the wall-time given in the test, so
Zacros does not have the opportunity to actually restart the simulation, and the test fails. To
address this, you can simply change the wall-time specified for that test. To this end, first locate the
directory where the data of the test are provided, by looking into the name of the test and searching
for it in CMakeLists.txt. In CMakeLists.txt, each test is referred to in a line that contains the
number of the test, and this number also identifies the directory holding the data of this test.
For example, for a failing test with number 134, the data can be found in directory:
path/to/source/of/Zacros/tests/data/134
You now have to decrease the value of the integer following the wall_time keyword, for
instance for this test, you could change it from 20 to 13 seconds. If the new wall-time is not small
enough, you can try progressively smaller values.
Note: for MPI runs, you may also have to decrease the value of the integer following keyword
time_interval_gvt_computation, since the simulation will not terminate before the
first global communication has taken place. Suppose, for instance, that the test specifies:
time_interval_gvt_computation 6
and you were to decrease the wall-time to, say, 4 seconds; the first simulation chunk would still be
6 seconds long. If your hardware were fast enough to complete the simulation in these 6
seconds, you would have to lower the value of time_interval_gvt_computation to, say, 1
second, and decrease the value of wall_time to, say, 3 seconds, in an attempt to create a
situation in which Zacros cannot finish the simulation in the first run and has to restart.
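For instance, combining the example values discussed above, the relevant lines of the test's simulation_input.dat would become:

wall_time 3
time_interval_gvt_computation 1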
Zacros Utilities
Mersenne-Twister Jump-Ahead Utility
This utility is a standalone program for computing and outputting the minimal polynomial coefficients of
the Mersenne-Twister 19937 recurrence for a given “jump factor”, as well as propagating a given state
by that “jump factor”. For the users’ convenience, this utility is executed by Zacros automatically when
the jump-ahead method is selected for the generation of multiple random streams (see keyword
random_streams on page 34). Thus, users are not required to run this utility themselves, and the
information of this section is provided mainly for reference to developers.
Background
At a high level, the Mersenne-Twister random number generator is based on a linear recurrence that
generates a sequence of binary vectors: x_(n+1) = A⋅x_n, where x_k for k = 0, 1, … are the binary vectors
(i.e. their elements are either 0 or 1), and A is a binary matrix. The coefficients of A are chosen in such a
way that the sequence of x_k is “sufficiently random”, making it then possible to map these pseudo-random
binary vectors into pseudo-random numbers. To create a set of disjoint pseudo-random streams by the
jump-ahead method, one needs to be able to efficiently compute the matrix A^js, where js is the jump
step (chosen to be sufficiently large, so that each random stream can deliver several pseudo-random
numbers before any overlap with the next stream is observed). Then, the random streams can be
initialized with the state vectors x_0, x_js, x_(2⋅js), … The efficient computation of A^js is based on the
properties of the minimal polynomial of A, and is facilitated computationally by the fast execution of
binary operations (XOR, shift, bit mask). More detailed information about the jump-ahead method can be
found in Ref. 16.
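To make the above concrete, the following is a toy sketch in Python of jumping ahead in an F2-linear recurrence. It uses a small random binary matrix rather than the actual MT19937 recurrence, and it computes A^js by direct square-and-multiply instead of via the minimal polynomial that the utility employs; it only illustrates that x_js can be reached in O(log js) matrix products rather than js generator steps.

import random

N = 8  # toy state size in bits (MT19937 uses 19937 bits)

def mat_vec(A, x):
    # y_i = parity(row_i AND x): binary matrix-vector product over GF(2)
    return sum(((bin(row & x).count("1") & 1) << i) for i, row in enumerate(A))

def mat_mul(A, B):
    # Binary matrix-matrix product over GF(2); extract the columns of B as bitmasks
    Bcols = [0] * N
    for i, row in enumerate(B):
        for j in range(N):
            if (row >> j) & 1:
                Bcols[j] |= 1 << i
    return [sum(((bin(row & Bcols[j]).count("1") & 1) << j) for j in range(N)) for row in A]

def mat_pow(A, e):
    # Square-and-multiply exponentiation: O(log e) matrix products
    R = [1 << i for i in range(N)]  # identity matrix
    while e:
        if e & 1:
            R = mat_mul(R, A)
        A = mat_mul(A, A)
        e >>= 1
    return R

random.seed(1)
A = [random.getrandbits(N) for _ in range(N)]  # some binary recurrence matrix
x0 = random.getrandbits(N) or 1                # initial state (nonzero)

js = 1000
x_jump = mat_vec(mat_pow(A, js), x0)  # jump ahead: x_js = (A^js)·x0

x = x0
for _ in range(js):                   # check against stepping one state at a time
    x = mat_vec(A, x)
assert x == x_jump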
Usage
The Mersenne-Twister jump-ahead utility can be invoked from the command line as follows (the
executable name depends on how you built the code; it is denoted here as mt_jumpahead for
illustration):

mt_jumpahead PEid jp [MT parameters]
where:
• PEid is the Processor-ID (e.g. the MPI process rank) for which the minimal polynomial is being
computed, and
• jp is the jump factor. Note that the minimal polynomial depends on the jump step, the latter
taken as js = PEid⋅2^jp.
• [MT parameters] is an ordered set of optional arguments setting the following parameters
of the Mersenne-Twister pseudo-random number generator (all 4-byte integers, default values
listed):
o word size (w = 32)
o degree of recurrence (n = 624)
o middle term (m = 397)
o separation point of one word (r = 31)
o integer "encoding" the binary matrix of the recurrence (a = -1727483681)
o integer "encoding" the lower mask
(lo = 2147483647, bit pattern: 01111111111111111111111111111111)
o integer "encoding" the upper mask
(up = -2147483648, bit pattern: 10000000000000000000000000000000)
As an example, the following command (using the illustrative executable name from above) will
compute the minimal polynomial for a jump factor of 64 (i.e. a jump step of PEid⋅2^64) for
MPI-process 3, for the Mersenne-Twister generator with the default parameters:

mt_jumpahead 3 64
The utility outputs the minimal polynomial coefficients in the form of 4-byte integers (“words”) in file
polycoeffs[_PEid].txt, with the part in brackets included if PEid > 0. The file contains one
integer per line; the first line gives the degree of the minimal polynomial, followed by the integers
“encoding” its coefficients. Moreover, if the working directory contains the file mtstate.txt (of 4-byte
integers), the program computes and outputs the “distant” state in file mtstatedist[_PEid].txt
(again, the part in brackets is included if PEid > 0).
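As an illustration of the format, a hypothetical Python reader for polycoeffs.txt (the PEid = 0 case) could look as follows; note that the bit ordering assumed when unpacking each word into coefficients is a guess for illustration, not a documented convention.

with open("polycoeffs.txt") as f:
    values = [int(line) for line in f if line.strip()]

degree, words = values[0], values[1:]  # first line: degree of the minimal polynomial

coeff_bits = []
for w in words:
    w &= 0xFFFFFFFF  # re-interpret the signed 4-byte integer as 32 raw bits
    coeff_bits.extend((w >> i) & 1 for i in range(32))  # assumed least-significant-bit first

coeffs = coeff_bits[: degree + 1]  # binary coefficients of the minimal polynomial
print("degree:", degree, "- first coefficients:", coeffs[:8])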
Cluster Expansion Fitting (CE-Fit) Utility
The CE-Fit utility streamlines the fitting of the effective cluster interactions (ECIs) of a cluster expansion
Hamiltonian to formation energies obtained from first-principles calculations. As a simple example,
consider a species A with a single-body energy contribution (with coefficient ECI_A) and a first-nearest-
neighbour two-body contribution (with coefficient ECI_2A(1NN)). To fit these two ECIs, one can perform
first-principles calculations for (at least) two configurations: one with a single A adsorbate on the
surface, and one with two A adsorbates at first-nearest-neighbour (1NN) positions. Note that for these
calculations, the simulation cell should be large enough to avoid interactions with periodic images of the
adsorbate(s).
After collecting these data, one computes the formation energy of each configuration, FE_(A+Surf) and
FE_(2A(1NN)+Surf), for instance as the energy of the slab with the adsorbate(s) minus the energy of the
clean slab and the gas-phase reference energies. Then, the effective cluster interactions of the single-
and two-body terms can be calculated by solving the following (very simple) system of linear equations:

ECI_A = FE_(A+Surf)
2⋅ECI_A + ECI_2A(1NN) = FE_(2A(1NN)+Surf)    (19)
Of course, the dataset of two configurations is the minimum needed to solve for the two ECIs (ECI_A for
the single-body term, and ECI_2A(1NN) for the two-body term). In practice, one would like to have more
configurations in order to perform a better-informed fitting of the cluster expansion; the linear system of
equations then becomes overdetermined and is solved in the least-squares sense. Note that more
elaborate cluster expansions than that of our example would involve many patterns and require a
higher number of configurations to be fitted. Clearly, such fitting exercises, involving many patterns and
configurations, can be complicated, and this is what the CE-Fit utility aims to streamline.
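In terms of linear algebra, the fit amounts to a least-squares solution of an overdetermined system; the following minimal Python sketch (with made-up pattern counts and energies, not CE-Fit itself) shows the structure of the problem for the two-ECI example above.

import numpy as np

# Rows: configurations; columns: interaction patterns (single-body, 1NN pair).
# Entries are the (multiplicity-normalised) counts of each pattern.
A = np.array([
    [1.0, 0.0],   # one isolated A adsorbate
    [2.0, 1.0],   # two A adsorbates forming one 1NN pair
    [2.0, 0.0],   # two isolated A adsorbates (an extra configuration)
])
b = np.array([-1.20, -2.15, -2.40])  # formation energies in eV (made up)

# Least-squares solution of A.x = b; x contains the fitted ECIs
ecis, residuals, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print("ECI_A       =", ecis[0])   # -1.20 eV with these data
print("ECI_2A(1NN) =", ecis[1])   #  0.25 eV with these data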
Note: if you compile Zacros with MPI (see section Compiling Zacros, page 9), the CE-Fit utility will also be
compiled with MPI directives. However, since no MPI parallelism has been implemented in this utility,
you will be able to run it with only 1 MPI process. Trying to run it with many MPI processes will result in
an error.
The following mandatory input files and directories must be provided for a cluster expansion fit:
• calculation_input.dat, which provides general input about the fitting exercise, e.g.
number of configurations, surface species information etc. This is the “equivalent” of the
simulation_input.dat file of a Zacros simulation, and its keywords will be explained in
the next subsection.
• energetics_input.dat, which has the format described in section Energetics Input File
(page 47), with the exception that the cluster_eng keyword is now invalid, since these
energies are supposed to be fitted by CE-Fit.
• A series of directories named as Conf1, Conf2, Conf3, …, each of which must contain the
following files:
o lattice_input.dat, which follows exactly the format described in section Lattice
Input File (page 41),
o state_input.dat, which adheres to the rules of section Initial State Input File (page
61 onwards), with the exception that only seeding on specific sites is now valid (using
keyword seed_on_sites), in order to define configurations with known structure
and energy. This file can be omitted if one needs to define the “empty lattice”
configuration.
o a plain text file named energy (without any extension), which contains a single real
number: the formation energy of the configuration captured by the lattice and state
input files. This energy value typically comes from first-principles calculations (after
some post-processing).
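For example, the working directory of a fit with two configurations would be laid out as follows (file and directory names as specified above):

calculation_input.dat
energetics_input.dat
Conf1/
    lattice_input.dat
    state_input.dat
    energy
Conf2/
    lattice_input.dat
    state_input.dat      (omitted if Conf2 is the "empty lattice" configuration)
    energy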
During the fitting, CE-Fit parses the general calculation settings from calculation_input.dat and
the patterns of the cluster expansion whose ECIs need to be fitted. Then it parses the configuration-
versus-energy data from the Conf# directories and solves the (typically overdetermined) system of
linear equations.
Keywords
The keywords of the files used in CE-Fit (see list above), except calculation_input.dat, are
explained in the referenced sections. For calculation_input.dat, the following keywords are
valid:
n_config int The number of configurations used in the fitting, i.e. the
number of Conf# directories to be parsed.
The following keywords are also used in the calculation_input.dat file, and their meaning is as
discussed in section General Simulation Parameters (page 18).
n_surf_species int This keyword is used in the same way as discussed in section
General Simulation Parameters on page 20.
finish Marks the end of input. Any subsequent information will not
be parsed.
As in any Zacros input file, comments can be added to calculation_input.dat, prepended by the
# character.
Output files
A cluster fitting calculation generates the following files:
• general_output.txt: the structure of this file is quite similar to the same-named file of a
Zacros calculation and provides general information about the fitting exercise. Thus, it
summarises the calculation setup, lists the patterns found in the cluster expansion to be fitted,
summarises the lattice and initial state setup for each configuration given, and outputs the
values of the ECIs if the fitting is successful or an error message if not.
• cefit_output.txt: this file lists the names of the interaction patterns in the first line (this
line contains as many strings as the patterns of energetics_input.dat), followed by rows
containing one real and one integer value: the former is the ECI of the corresponding pattern
while the latter is the graph multiplicity (which is as given in energetics_input.dat or
equal to 1 if the pertinent keyword is omitted therefrom).
• energies_parity.txt: this file contains only numeric data in 3 columns, and as many rows
as the number of configurations used in the fitting. The first column lists the energies provided
for each configuration as input to the fitting procedure (real numbers), while the second column
gives the energies calculated from the fitted cluster expansion (real numbers). The third and
final column gives the number of sites of the lattice provided for each configuration. These data
are useful for creating a parity plot of the model cluster expansion Hamiltonian energies with
respect to the input energies (a minimal plotting sketch is given after this list).
• AmatBvec.txt: this file gives the design matrix and the right-hand side vector of the linear
problem being solved: A⋅x = b. The last column of the file is b, which is simply the vector of the
energies given for each configuration. The remaining contents of the file provide matrix A, which
has as many rows as the number of configurations used in the fitting (i.e. the integer following
keyword n_config), and as many columns as the number of interaction patterns parsed in
energetics_input.dat. Element a_(i,j) of this matrix is equal to the number of instances of
interaction pattern j in given configuration i, divided by the graph multiplicity of pattern j. The
values of a_(i,j) are expected to be integers; however, they are reported as reals (with 2 decimals)
in this file for debugging purposes. More specifically, recall that the graph multiplicity of pattern
j is the number of times that the exact same pattern will be counted for a given lattice
configuration (i.e. the number of symmetric equivalents of the pattern). It follows that the
number of instances of pattern j (as counted by the pattern-matching subroutines of Zacros for a
given configuration) should be an integer multiple of the graph multiplicity. However, if an
incorrect graph multiplicity of pattern j is given by the user in energetics_input.dat, it
may happen that the number of instances is not an integer multiple of the graph multiplicity, and
thus a_(i,j) is no longer an integer. In such cases, the user needs to review the pattern that suffers
from this issue and amend the graph multiplicity value or the pattern definition altogether.
• Conf#_lattice_output.txt and Conf#_lattice_state.txt, where # ranges from
1 to the number of configurations used in the fitting: the former files provide an output
of the lattice structure as defined by lattice_input.dat for each configuration. Their
format has already been discussed in section Lattice Output File, page 70. The latter files provide
a snapshot of the lattice state and their format follows the structure of a configuration section in
history_output.txt, i.e. the contents after the line starting with configuration (but
without this “header” line) and before (and without including) the last line that reports the
number of gas phase molecules produced/consumed (see section History Output File, page 71).
• In addition to these files, if debug_report_global_energetics is parsed in
calculation_input.dat, files Conf#_globalenerg_debug.txt are written, which
follow the structure discussed in section Energetics Debug Output File, page 76.
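As an example of post-processing, a minimal Python sketch (assuming numpy and matplotlib are available; it is not part of Zacros) that turns energies_parity.txt into a parity plot could read:

import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt("energies_parity.txt")
e_in, e_fit = data[:, 0], data[:, 1]  # third column (lattice sizes) not used here

lims = [min(e_in.min(), e_fit.min()), max(e_in.max(), e_fit.max())]
plt.plot(lims, lims, "k--", label="parity line")
plt.scatter(e_in, e_fit)
plt.xlabel("input formation energy (eV)")
plt.ylabel("fitted cluster expansion energy (eV)")
plt.legend()
plt.savefig("energies_parity.png", dpi=200)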
References
1. Stamatakis, M. and D.G. Vlachos, A Graph-Theoretical Kinetic Monte Carlo Framework for on-Lattice
Chemical Kinetics. Journal of Chemical Physics, 2011. 134(21): 214115.
2. Nielsen, J., M. d’Avezac, J. Hetherington, and M. Stamatakis, Parallel Kinetic Monte Carlo Simulation
Framework Incorporating Accurate Models of Adsorbate Lateral Interactions. Journal of Chemical
Physics, 2013. 139(22): 224706.
3. Park, S.K. and K.W. Miller, Random number generators: good ones are hard to find. Communications
of the ACM, 1988. 31(10): 1192-1201.
4. Matsumoto, M. and T. Nishimura, Mersenne twister: a 623-dimensionally equidistributed uniform
pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation, 1998.
8(1): 3-30.
5. Savva, G.D. and M. Stamatakis, Comparison of Queueing Data-Structures for Kinetic Monte Carlo
Simulations of Heterogeneous Catalysts. Journal of Physical Chemistry A, 2020. 124(38): 7843-7856.
6. Stamatakis, M. and D.G. Vlachos, Equivalence of on-lattice stochastic chemical kinetics with the well-
mixed chemical master equation in the limit of fast diffusion. Computers & Chemical Engineering,
2011. 35(12): 2602-2610.
7. Danielson, T., J.E. Sutton, C. Hin, and A. Savara, SQERTSS: Dynamic rank based throttling of transition
probabilities in kinetic Monte Carlo simulations. Computer Physics Communications, 2017. 219: 149-
163.
8. Dybeck, E.C., C.P. Plaisance, and M. Neurock, Generalized Temporal Acceleration Scheme for Kinetic
Monte Carlo Simulations of Surface Catalytic Processes by Scaling the Rates of Fast Reactions. Journal
of Chemical Theory and Computation, 2017. 13(4): 1525-1538.
9. Chatterjee, A. and A.F. Voter, Accurate acceleration of kinetic Monte Carlo simulations through the
modification of rate constants. Journal of Chemical Physics, 2010. 132(19): 194101.
10. Jefferson, D.R., Virtual Time. ACM Transactions on Programming Languages and Systems, 1985. 7(3):
404-425.
11. Ravipati, S., G.D. Savva, I.-A. Christidi, R. Guichard, J. Nielsen, R. Réocreux, and M. Stamatakis,
Coupling the Time-Warp algorithm with the Graph-Theoretical Kinetic Monte Carlo framework for
distributed simulations of heterogeneous catalysts. Computer Physics Communications, 2022. 270:
108148.
12. Ravipati, S., M. d’Avezac, J. Nielsen, J. Hetherington, and M. Stamatakis, A Caching Scheme To
Accelerate Kinetic Monte Carlo Simulations of Catalytic Reactions. Journal of Physical Chemistry A,
2020. 124(35): 7140-7154.
13. Stamatakis, M., M. Christiansen, D.G. Vlachos, and G. Mpourmpakis, Multiscale Modeling Reveals
Poisoning Mechanisms of MgO-Supported Au Clusters in CO Oxidation. Nano Letters, 2012. 12(7):
3621-3626.
14. Sanchez, J.M., F. Ducastelle, and D. Gratias, Generalized Cluster Description of Multicomponent
Systems. Physica A: Statistical and Theoretical Physics, 1984. 128(1-2): 334-350.
15. Wu, C., D.J. Schmidt, C. Wolverton, and W.F. Schneider, Accurate coverage-dependence incorporated
into first-principles kinetic models: Catalytic NO oxidation on Pt(111). Journal of Catalysis, 2012. 286:
88-94.
16. Haramoto, H., M. Matsumoto, T. Nishimura, F. Panneton, and P. L'Ecuyer, Efficient jump ahead for F2-
linear random number generators. INFORMS Journal on Computing, 2008. 20(3): 385-390.