Zacros Manual

Zacros 3.03 is an advanced kinetic Monte Carlo (KMC) software for simulating molecular phenomena like adsorption and catalytic reactions on surfaces, utilizing a Graph-Theoretical KMC methodology. The software includes features for handling complex reaction steps, steric exclusion effects, and spatial correlations, and it supports parallelization for large-scale simulations. The user guide provides comprehensive information on compiling, running simulations, input/output files, and troubleshooting, along with citations for relevant literature to support users in their research.


Version 3.03

User Guide • 27 July 2024


Zacros 3.03 User Guide Page 3 of 89

Dear Colleague,

I would like to thank you for downloading Zacros and I hope you will find it useful in your research.
Zacros is an advanced kinetic Monte Carlo (KMC) software application for the simulation of molecular
phenomena, such as adsorption and catalytic reactions, on surfaces. The package employs the
Graph-Theoretical KMC methodology, coupled with cluster expansion Hamiltonians and
Brønsted-Evans-Polanyi relations for the adlayer energetics. This integrated implementation can naturally capture:

• steric exclusion effects for species that bind in more than one catalytic site,
• complex reaction steps involving adsorbates in specific binding configurations and neighboring
patterns,
• spatial correlations and ordering arising from adsorbate lateral interactions that may involve
many-body contributions,
• coverage effects, namely the dependence of the activation energy of an elementary event on the
presence of spectators in the neighborhood of this event.

In addition to these, the code features an easy-to-learn keyword-based language for defining a
simulation, and can be run in “debugging” mode, which generates detailed output that can be used to
troubleshoot a KMC simulation efficiently. Moreover, to tackle simulations on very large lattices, Zacros
implements a domain decomposition scheme along with the Time-Warp algorithm for boundary conflict
resolution. Informative articles, tutorials and examples that can help you get started with KMC
simulation can be found on the Zacros website: http://zacros.org.

Zacros is distributed free of charge to the academic community in the hope that it will benefit
researchers worldwide. If you decide to use this software for a scientific article, I kindly ask you to
include the following citations in your work:

Stamatakis, M. and D. G. Vlachos (2011). “A Graph-Theoretical Kinetic Monte Carlo
Framework for on-Lattice Chemical Kinetics.” The Journal of Chemical Physics, 134(21):
214115.

Nielsen, J., M. d’Avezac, J. Hetherington and M. Stamatakis (2013). “Parallel Kinetic Monte
Carlo Simulation Framework Incorporating Accurate Models of Adsorbate Lateral
Interactions.” The Journal of Chemical Physics, 139(22): 224706.

Ravipati, S., Nielsen, J., d’Avezac, M., Hetherington, J. and M. Stamatakis (2020). “A Caching
Scheme to Accelerate Kinetic Monte Carlo Simulations of Catalytic Reactions”. The Journal
of Physical Chemistry A, 124(35): 7140-7154. [please cite if you are using the caching
scheme]

Savva, G. D. and M. Stamatakis (2020). “Comparison of Queueing Data-Structures for
Kinetic Monte Carlo Simulations of Heterogeneous Catalysts”. The Journal of Physical
Chemistry A, 124(38): 7843-7856. [please cite if you are using the skip-list or the pairing
heap queueing system]

© Michail Stamatakis July 27, 2024


Ravipati, S., Savva, G. D., Christidi, I.-A., Guichard, R., Nielsen, J., Réocreux, R., and
Stamatakis, M. (2022). “Coupling the Time-Warp algorithm with the Graph-Theoretical
Kinetic Monte Carlo framework for distributed simulations of heterogeneous catalysts”.
Computer Physics Communications, 270: 108148. [please cite if you are using the MPI
Time-Warp algorithm implementation]

I would be glad to receive feedback about Zacros, and if you would like to contribute to the
development thereof, please do not hesitate to get in touch.

Kind regards,
Michail Stamatakis
Associate Professor in Chemical Engineering
University College London
Torrington Place
London, WC1E 7JE
United Kingdom
Phone: +44 (0)20 3108 1128
Fax: +44 (0)20 7383 2348
e-mail: [email protected]
url: https://www.stamatakislab.org/

Table of Contents
Introduction .................................................................................................................................................. 8
Compiling Zacros ........................................................................................................................................... 9
Supported Compilers ................................................................................................................................ 9
Serial, Threaded or Distributed? ............................................................................................................... 9
Using the Makefiles Provided for GNU Fortran ...................................................................................... 10
Compilation on Unix/Linux ................................................................................................................. 10
Compilation on macOS........................................................................................................................ 11
Compilation on Windows Using MSYS2 .............................................................................................. 11
Using the CMake Build System ............................................................................................................... 13
Compilation on Unix/Linux/macOS..................................................................................................... 13
Compilation on Windows Using MSYS2 .............................................................................................. 14
Compilation on Windows Using Visual Studio .................................................................................... 15
Running Zacros ............................................................................................................................................ 15
Input/Output Files................................................................................................................................... 16
Units and Constants ................................................................................................................................ 17
Setting up a KMC Simulation in Zacros ....................................................................................................... 18
Simulation Input File ............................................................................................................................... 18
General Simulation Parameters .......................................................................................................... 18
Reporting Schemes ............................................................................................................................. 21
Stopping and Resuming ...................................................................................................................... 25
Treating Fast Quasi-Equilibrated Processes ........................................................................................ 25
Simulating Very Large Lattices ............................................................................................................ 29
Overview of the Approach .............................................................................................................. 29
Performing MPI Time-Warp Runs ................................................................................................... 33
Keywords......................................................................................................................................... 33
Validation of MPI Time-Warp Runs................................................................................................. 35
Memory Management ........................................................................................................................ 37
Accelerators ........................................................................................................................................ 38
Troubleshooting .................................................................................................................................. 39
Lattice Input File...................................................................................................................................... 41

Default Lattices ................................................................................................................................... 41
Unit-Cell-Defined Periodic Lattices ..................................................................................................... 42
Explicitly Defined Custom Lattices ...................................................................................................... 44
How to Determine Lattice Connectivity.............................................................................................. 46
Energetics Input File................................................................................................................................ 47
Cluster Representation ....................................................................................................................... 48
Examples ............................................................................................................................................. 51
Mechanism Input File.............................................................................................................................. 52
Elementary Step Representation ........................................................................................................ 53
Examples ............................................................................................................................................. 60
Initial State Input File .............................................................................................................................. 61
Examples ............................................................................................................................................. 62
Command-Line Arguments ..................................................................................................................... 62
Interpreting the Simulation Output of Zacros ............................................................................................ 63
General Output File................................................................................................................................. 63
Compiler Information.......................................................................................................................... 63
Threading/Multiprocessing Information ............................................................................................ 63
Simulation Setup ................................................................................................................................. 64
Lattice Setup ....................................................................................................................................... 64
Energetics Setup ................................................................................................................................. 64
Mechanism Setup ............................................................................................................................... 64
Initial State Setup ................................................................................................................................ 64
Simulation Preparation ....................................................................................................................... 65
Simulation Output ............................................................................................................................... 65
Stiffness Scaling Information .......................................................................................................... 65
Performance Monitoring of MPI Runs ............................................................................................ 66
Simulation End .................................................................................................................................... 68
Performance Facts .......................................................................................................................... 68
Newton's Method Statistics ............................................................................................................ 69
Execution Queue Statistics.............................................................................................................. 69
Memory Usage Statistics................................................................................................................. 69
Lattice Output File................................................................................................................................... 70

History Output File .................................................................................................................................. 71
Process Statistics Output File .................................................................................................................. 72
Species Numbers Output File.................................................................................................................. 73
Energetics Lists Output File ..................................................................................................................... 74
Process Lists Output File ......................................................................................................................... 75
Energetics Debug Output File ................................................................................................................. 76
Process Debug Output File...................................................................................................................... 77
Notes on Troubleshooting .......................................................................................................................... 78
Known Issues/Limitations ........................................................................................................................... 79
Zacros Utilities............................................................................................................................................. 83
Mersenne-Twister Jump-Ahead Utility ................................................................................................... 83
Background ......................................................................................................................................... 83
Usage................................................................................................................................................... 83
CE-Fit: Cluster Expansion Fitting Utility .................................................................................................. 84
Background ......................................................................................................................................... 84
Performing a Cluster Expansion Fit ..................................................................................................... 85
Keywords............................................................................................................................................. 86
Output files ......................................................................................................................................... 87
Zacros-post: Post-processing and Visualisation Graphical User Interface.............................................. 88
References .................................................................................................................................................. 89

Introduction
Zacros is an advanced kinetic Monte Carlo (KMC) package for the simulation of molecular phenomena,
such as adsorption and catalytic reactions, on structures that can be represented by static lattices. The
package employs the Graph-Theoretical KMC methodology [1] coupled with cluster expansion
Hamiltonians for the adlayer energetics [2], allowing it to tackle:

• binding configurations on more than one site, and the steric exclusion effect resulting therefrom,
• complex surface reactions in which several species and spectators can be involved in specific
neighboring patterns,
• adsorbate lateral interactions involving long-range and many-body terms, and the spatial correlation
and ordering effects resulting therefrom,
• coverage effects, namely the dependence of the activation energy of an elementary event on the
presence of spectators in the neighborhood of this event.

Various optimizations (e.g. Ullmann’s algorithm for subgraph isomorphism, caching of energetic
interaction terms) as well as OpenMP parallelization have been implemented for the efficient simulation
of systems with energetic models involving long-range interactions. Moreover, as of version 2.0 an
approximate method that rescales rate constants of fast quasi-equilibrated events is available (see
section Treating Fast Quasi-Equilibrated Processes on page 25). Furthermore, as of version 3.0, Zacros
implements MPI parallelization, using a domain decomposition scheme and the Time-Warp algorithm
for boundary conflict resolution (see section Simulating Very Large Lattices on page 29).

This user guide provides information about the syntax of input/output files and the options available in
Zacros, with some high-level overview of the methods on a “need-to-know” basis. For more in-depth
information on KMC simulation and the underlying methods implemented in the package, the user is
referred to the following publications:

Review papers and book chapters


Stamatakis, M. and D. G. Vlachos (2012). "Unraveling the Complexity of Catalytic Reactions
via Kinetic Monte Carlo Simulation: Current Status and Frontiers." ACS Catalysis 2(12):
2648-2663.

Darby, M. T., Piccinin, S. and M. Stamatakis (2016). “Chapter 4: First principles-based kinetic
Monte Carlo simulation in catalysis” in Kasai, H. and M. C. S. E. Escaño (Eds.), Physics of
Surface, Interface and Cluster Catalysis, Bristol, UK: IOP Publishing.

Papanikolaou, K. G. and M. Stamatakis (2020). “Chapter 7 - Toward the accurate modeling
of the kinetics of surface reactions using the kinetic Monte Carlo method” in
Grammatikopoulos, P. (Ed.), Computational Modelling of Nanomaterials, Amsterdam,
Netherlands: Elsevier.

Method development (technical) papers


Stamatakis, M. and D. G. Vlachos (2011). “A Graph-Theoretical Kinetic Monte Carlo
Framework for on-Lattice Chemical Kinetics.” The Journal of Chemical Physics, 134(21):
214115.

Nielsen, J., M. d’Avezac, J. Hetherington and M. Stamatakis (2013). “Parallel Kinetic Monte
Carlo Simulation Framework Incorporating Accurate Models of Adsorbate Lateral
Interactions.” The Journal of Chemical Physics, 139(22): 224706.

Ravipati, S., Nielsen, J., d’Avezac, M., Hetherington, J. and M. Stamatakis (2020). “A Caching
Scheme to Accelerate Kinetic Monte Carlo Simulations of Catalytic Reactions”. The Journal
of Physical Chemistry A, 124(35): 7140-7154. [please cite if you are using the caching
scheme]

Savva, G. D. and M. Stamatakis (2020). “Comparison of Queueing Data-Structures for
Kinetic Monte Carlo Simulations of Heterogeneous Catalysts”. The Journal of Physical
Chemistry A, 124(38): 7843-7856. [please cite if you are using the skip-list or the pairing
heap queueing system]

Ravipati, S., Savva, G. D., Christidi, I.-A., Guichard, R., Nielsen, J., Réocreux, R., and
Stamatakis, M. (2022). “Coupling the Time-Warp algorithm with the Graph-Theoretical
Kinetic Monte Carlo framework for distributed simulations of heterogeneous catalysts”.
Computer Physics Communications, 270: 108148. [please cite if you are using the MPI
Time-Warp algorithm implementation]

Compiling Zacros
Supported Compilers
We build and test Zacros using the Intel and GNU Fortran compilers on Linux and Windows, as well as
the GNU Fortran compiler on macOS. Minimum recommended versions are 7.3.0 for GNU and 18.0.3 for
Intel. For the MPI version of Zacros we test both the MPICH and the OpenMPI frameworks, with the
following minimum recommended versions: 3.1.1 for OpenMPI and 3.2.1 for MPICH.
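
As a rough illustration of how the minimum versions above might be checked in a script, the following sketch compares dotted version strings. The function name version_ge and the sample versions 9.4.0 and 3.0.4 are hypothetical; only the minima 7.3.0 and 3.1.1 are taken from this guide.

```shell
# Succeed (exit status 0) when dotted version $1 is at least version $2.
# version_ge is a hypothetical helper; it compares up to three numeric components
# and treats a missing component as 0.
version_ge() {
    awk -v have="$1" -v need="$2" 'BEGIN {
        split(have, h, "."); split(need, n, ".")
        for (i = 1; i <= 3; i++) {
            if (h[i] + 0 > n[i] + 0) exit 0
            if (h[i] + 0 < n[i] + 0) exit 1
        }
        exit 0
    }'
}

# Illustrative inputs checked against the minima quoted above.
version_ge 9.4.0 7.3.0 && echo "GNU Fortran 9.4.0 meets the 7.3.0 minimum"
version_ge 3.0.4 3.1.1 || echo "OpenMPI 3.0.4 is older than the 3.1.1 minimum"
```

In practice the first argument would be the version reported by your own compiler or MPI installation, for instance as read from the output of gfortran --version.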

Serial, Threaded or Distributed?


The first question one would be inclined to ask is whether to compile and use the serial version of Zacros
or enable shared-memory (OpenMP) parallelism (threading) and/or message-passing (MPI) for
distributed simulations. The short answer is: compile all variants and do your own performance testing
for the systems of interest! As a guideline, you would expect the OpenMP version to perform better
when simulating systems with long-range lateral interactions or many patterns (figures) in the cluster
expansion, but there are no definite numbers that quantify how long-range or how many patterns. The
MPI version is efficient only when simulating sufficiently large lattices, e.g. with 10^6 sites or more, but
again, for different systems one may observe different efficiencies. In general, one should keep in mind
that the parallel versions (with OpenMP and/or MPI) can have quite significant computational
overheads, pertinent to e.g. thread creation or sharing information among threads for the OpenMP
version, or pertinent to messaging, global communications, rollbacks and re-simulations for the MPI
version. These overheads may, in certain cases, outweigh any gains from the parallelization. On the
other hand, the serial version implements optimizations and does not incur any such overheads; thus,
you may, for instance, find out that the same simulation takes less time in a serial run compared to an
OpenMP run with a single thread. It is therefore advisable to run short benchmarks to find out what
works best for the systems you want to simulate.
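
A short benchmark of the kind suggested above could be scripted along the following lines. This is only a minimal sketch: time_run is a hypothetical helper, the directory names in the commented-out lines are assumptions, and the sleep command merely stands in for a real Zacros run so that the sketch works on any machine.

```shell
# Run a command and print its wall-clock time in whole seconds.
# time_run is a hypothetical helper; its first argument is a free-form label.
time_run() {
    label=$1
    shift
    start=$(date +%s)
    status=0
    "$@" > /dev/null 2>&1 || status=$?   # keep going even if the command fails
    end=$(date +%s)
    echo "$label: $((end - start)) s (exit status $status)"
}

# Stand-in workload so that the sketch runs anywhere:
time_run demo sleep 1

# With real builds one might compare, for example (assumed directory layout):
# (cd run-serial && time_run serial ../build-serial/zacros.x)
# (cd run-openmp && time_run openmp ../build-openmp/zacros.x)
```

Whole-second resolution is coarse, so the benchmark runs should be long enough for the differences between the variants to exceed a few seconds.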

Using the Makefiles Provided for GNU Fortran


The following instructions are for compiling Zacros using the makefiles provided for GFortran, the GNU
Fortran compiler, which can be downloaded for free and works on Unix and Windows systems. Instead of
the makefiles, one can use the CMake build system, which is cross-platform and has the advantage
that it allows one to run a broader set of tests that validate the compilation more comprehensively. For
information on how to do this, please refer to section Using the CMake Build System on page 13.

Compilation on Unix/Linux
This section refers to Unix and Linux operating systems. If your system does not come with the GFortran
compiler you will have to install it. For instance, in Ubuntu 20.04, you can do so by running the following
command on a terminal: sudo apt install gfortran.

Compiling should be done using a terminal. For the serial version (no OpenMP or MPI parallelism), it
simply comes down to the following:

cd path/to/source/of/Zacros
mkdir build # If it does not exist yet
cd build
cp ../makefiles/makefile-gnu-serial-unix makefile
make

The first line above moves to the directory where the source code can be found. Then we create a build
directory and move to that directory. All the build files will be located here, rather than “polluting” the
source. We copy one of the makefiles provided to the build directory (renaming it to makefile at the
same time) and compile the code. At the end of this process, there should be an executable called
zacros.x in the build directory. The executable thus obtained is of interest when running on a
computer with a single core, or when simulating systems with short-range or no lateral interactions
(please refer to section Energetics Input File, page 47).

To compile the OpenMP version of the code (efficient for systems with long-range lateral interactions
among adsorbates or a large number of interaction patterns), simply replace line 4 of the above
commands with:

cp ../makefiles/makefile-gnu-parallel-unix makefile

Finally, if you would like to compile the MPI version of Zacros (for distributed simulations on very large
lattices), you need to have OpenMPI or MPICH installed. As an example, in Ubuntu 20.04, MPICH can be
installed via the terminal with the following command: sudo apt install mpich. The
compilation procedure is similar to what was described earlier; just replace line 4 of the above
commands with the following:

cp ../makefiles/makefile-gnu-distributed-unix makefile
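
Since the three variants differ only in the makefile that is copied in line 4, the choice can be wrapped in a small helper, sketched below. The function name zacros_makefile and the keywords serial, openmp and mpi are hypothetical; the makefile names are the ones quoted in this section.

```shell
# Map a build variant to the corresponding makefile name quoted in this section.
# zacros_makefile and its variant keywords are hypothetical.
zacros_makefile() {
    case "$1" in
        serial) echo "makefile-gnu-serial-unix" ;;
        openmp) echo "makefile-gnu-parallel-unix" ;;
        mpi)    echo "makefile-gnu-distributed-unix" ;;
        *)      echo "unknown variant: $1" >&2; return 1 ;;
    esac
}

# Show the makefile that would be used for the OpenMP build.
zacros_makefile openmp

# Inside the build directory one could then run (commented out here):
# cp "../makefiles/$(zacros_makefile openmp)" makefile && make
```

Returning a non-zero status for an unknown keyword lets the cp command fail early instead of copying a non-existent file.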

After compiling any (or all) of the aforementioned versions, you can run a set of tests to verify that your
executable works fine. As a prerequisite, your system must have Python installed. Then you can do the
testing with the following command:

make test

Many of these tests are fast, but some may take a few minutes to complete (depending also on your
hardware). At the end of each test, an informative message will be printed stating whether the test has
been successful or not. Note that the test command of the makefiles runs only a subset of the available
tests. For more comprehensive testing, please use the CMake build system (section Using the CMake
Build System, page 13). If any tests fail, refer to section Known Issues/Limitations (page 79) for possible
remedies, and if you cannot resolve the issue, feel free to contact us for help.

Compilation on macOS
For macOS one can install GFortran from http://gcc.gnu.org/wiki/GFortranBinaries. If your system does
not recognize the make command, you can install the Xcode package, making sure that the Command
Line Tools are included in the installation. The rest of the compilation instructions are the same as in
section Compilation on Unix/Linux.

Compilation on Windows Using MSYS2


To compile Zacros on Windows you can install MSYS2, which is a build environment based on open
source software that allows you to build native Windows programs. MSYS2 can be obtained from
https://www.msys2.org/ (refer to the Installation section on that page). After installing it, locate your
user-home directory, e.g. “C:\msys64\home\Michail”, copy the zip file of the Zacros release therein and
unzip it. You should then get a directory like “C:\msys64\home\Michail\Zacros”. Subsequently, from the
Windows Start menu run “MSYS2 MinGW 64-bit”, which will open the MSYS2 terminal at your home
directory. If you run ls you should see the Zacros directory there.

Then try to run the following commands to verify that you have all the tools necessary for the
compilation of Zacros:

make --version       # Check the version of make…
gfortran --version   # … and that of GNU Fortran
which libomp.dll     # Query if the OpenMP library is available

If at any step you get an error like:

bash: make: command not found

this means that you need to install the corresponding package that provides support for that command.
MSYS2 provides a package manager called pacman, which can be used to search for packages and
install them. Searching the repositories for a package, e.g. make, can be done with the following command:

pacman -Ss make

and installing it can be done via:

pacman -S make
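
The check for missing tools can also be done in one pass, as in the following sketch. The function name missing_tools is hypothetical; command -v is a standard shell builtin that succeeds only if the named command is found.

```shell
# Print the name of every given command that cannot be found on the PATH.
# missing_tools is a hypothetical helper.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" > /dev/null 2>&1 || echo "$tool"
    done
}

# List which of the two build tools, if any, still need to be installed.
missing_tools make gfortran
```

Note that the name of the package providing a command may differ from the name of the command itself, so it is worth searching the repositories before installing.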

The base MSYS2 installation may not include the GNU Fortran compiler, which, however, should be easy
to install using the following commands:

pacman -S mingw-w64-x86_64-gcc
pacman -S mingw-w64-x86_64-gcc-fortran
pacman -S mingw-w64-x86_64-gcc-libgfortran
pacman -S mingw-w64-x86_64-openmp

At any point, you can check which packages are installed with the following command:

pacman -Qqe

Moreover, you can check whether a package exists in the repository, e.g. fortran, via:

pacman -Ss fortran

If all necessary packages are installed, the serial version can be compiled with the following commands:

cd path/to/source/of/Zacros
mkdir build # If it does not exist yet
cd build
cp ../makefiles/makefile-gnu-serial-msys-windows makefile
make

The first line above moves to the Zacros directory we created earlier, where the source code can be found.
Then we create a build directory and move to that directory. All the build files will be located here,
rather than “polluting” the source. We copy one of the makefiles provided to the build directory
(renaming it to makefile at the same time) and compile the code. At the end of this process, there
should be an executable called zacros.exe in the build directory. The OpenMP parallel version of the
code is obtained by replacing line 4 of the above commands with:

cp ../makefiles/makefile-gnu-parallel-msys-windows makefile

At the time of writing these instructions, there was no MPI package available in the repositories of
MSYS2 and thus, we do not provide makefiles for the distributed version of Zacros in Windows systems.
It might be possible to compile and link Zacros to the MSMPI libraries; however, this does not appear to
be straightforward…

Note that running Zacros will have to be done from the MSYS2 terminal (not from the Windows
command prompt), so that the gcc/gfortran libraries of MSYS2 are available to the Zacros executable.

After compiling, you can run a set of tests to verify that your executable works fine. As a prerequisite,
your system must have Python installed. Then you can do the testing with the following command:

make test

Keep in mind that some tests may take a few minutes to complete. At the end of each test, an
informative message is printed, stating whether the test has been successful or not. If any tests fail,
refer to section Known Issues/Limitations (page 79) for possible remedies, and if you cannot resolve the
issue, feel free to contact us for help. Note that the make test command of the makefiles runs only a
subset of the available tests. For more comprehensive testing, please use the CMake build system as
described in the next section.

Using the CMake Build System


Zacros can also be compiled via a build system called CMake (http://www.cmake.org/), which has the
distinct advantage of being cross-compatible with Linux, Mac, and Windows. For the CMake installation
instructions please refer to the documentation thereof (http://www.cmake.org/documentation/).

Compilation on Unix/Linux/macOS
This section covers all Unix-like operating systems, including Linux and macOS. It has been tested with
GNU Fortran and Intel Fortran. The serial version can be compiled from a terminal with the
following commands:

cd path/to/source/of/Zacros
mkdir build # If it does not exist yet
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -Ddoopenmp=off -Ddompi=off
make # or “make -j” for faster (parallel) compilation

The first line above moves to the directory where the source code can be found. Then we create a build
directory. All the build files will be located here, rather than “polluting” the source. We move to the
build directory. The compilation is first configured for the current platform. Finally, the code is compiled.

At the end of this process, there should be a serial executable called zacros.x in the build directory.
This is of interest when running on a computer with a single core, or when simulating systems with
short-range or no lateral interactions (please refer to section Energetics Input File, page 47).

One can enable threading or build the distributed (MPI) version by setting the -Ddoopenmp and/or
-Ddompi arguments appropriately. For instance, to enable threading (OpenMP) only:

cmake .. -DCMAKE_BUILD_TYPE=Release -Ddoopenmp=on -Ddompi=off

To compile the MPI version but without threading:

cmake .. -DCMAKE_BUILD_TYPE=Release -Ddoopenmp=off -Ddompi=on


It is also possible to enable both OpenMP and MPI; however, we have not performed extensive tests
with such a configuration. If you would like to use it, do so at your own risk!
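
The configure commands above differ only in the values of the two switches. As a minimal sketch (the helper name cmake_configure_args is hypothetical and not part of Zacros or CMake), the argument list can be assembled programmatically, which may be convenient when scripting several builds:

```python
def cmake_configure_args(build_type="Release", openmp=False, mpi=False):
    """Assemble the cmake configure command from the text as an argument list."""
    def onoff(flag):
        return "on" if flag else "off"

    return [
        "cmake", "..",
        f"-DCMAKE_BUILD_TYPE={build_type}",
        f"-Ddoopenmp={onoff(openmp)}",
        f"-Ddompi={onoff(mpi)}",
    ]


if __name__ == "__main__":
    # Serial, OpenMP-only and MPI-only configurations, as in the text
    for use_openmp, use_mpi in [(False, False), (True, False), (False, True)]:
        print(" ".join(cmake_configure_args(openmp=use_openmp, mpi=use_mpi)))
```

The resulting list can be passed to subprocess.run from within the build directory.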

If you would like to compile a version for debugging, you can change the build type to Debug and even
provide custom options to the compiler by using the -DCMAKE_Fortran_FLAGS parameter. This can
be done by editing line 4 of the above commands for the serial version as follows:

cmake .. -DCMAKE_BUILD_TYPE=Debug -DCMAKE_Fortran_FLAGS='-finit-integer=-987654321 -finit-real=nan -fcheck=bounds' -Ddoopenmp=off -Ddompi=off

The above uses GNU Fortran flags that initialise integer variables with the value -987654321 and real
variables with NaN, and checks whether array bounds are exceeded at some point in the simulation
(note that the above command should be written in just one line in the terminal).

If your system has Python installed, then the code can be tested to validate the compilation, by running:

cd /path/to/build/directory/ # If not already there…
make test

Please note that some of the tests may take several minutes to complete… You can selectively run
specific tests, thereby reducing the testing time, in two different ways. The first is to run tests with a
given label out of the following: fast, medium, slow, e.g.:

ctest -L fast # Runs only fast tests
ctest -L "(fast|medium)" # Runs fast and medium tests

The second way is to run tests whose name contains a given string, e.g.:

ctest -R SAMPLE_RUN # Runs tests whose name contains “SAMPLE_RUN”

If you see tests failing, refer to section Known Issues/Limitations (page 79) for possible remedies (there
might be something wrong with your system configuration, or your compiler may be incompatible with
some functionalities of the code). If you cannot resolve the issue, feel free to contact us for help.

Compilation on Windows Using MSYS2


MSYS2 (https://www.msys2.org/) is the recommended system for compiling Zacros in Windows. It can
be obtained for free and provides a collection of GNU utilities that make it possible to code just like on
any Unix platform, without relying on a proprietary compiler. For some background information and
guidelines on installing make, gfortran and the OpenMP library on MSYS2, refer to subsection
Compilation on Windows Using MSYS2 of section Using the Makefiles Provided for GNU Fortran on page
11. In addition to these tools you will need to have cmake installed, as well as python if you would like to
test the build. These can be installed with the commands:

pacman -S mingw-w64-x86_64-cmake
pacman -S python

Then, starting from the MSYS2 command-line, the serial version can be compiled as follows:


cd path/to/source/of/Zacros
mkdir build # If it does not exist yet
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -G "MSYS Makefiles" -Ddoopenmp=off -Ddompi=off
make # or “make -j” for faster (parallel) compilation

This operation is quite similar to the one performed in the case of Unix-based systems (see section
Compilation on Unix/Linux/macOS, above). However, line 4 has changed a bit: the -G "MSYS Makefiles"
option tells CMake that we want to compile for MSYS.

Tests can be run with the command make test, as discussed in the previous subsection. Please be
warned that some of the tests are fairly long. Threading can be enabled by setting the option
-Ddoopenmp=on in line 4 above. At the time of writing these instructions, there was no MPI package
available in the repositories of MSYS2 and thus, setting -Ddompi=on will result in compilation errors.

Compilation on Windows Using Visual Studio


If one has access to Intel Fortran, it is possible to create the project files using the
command-line/PowerShell commands:

cd path/to/source/
mkdir build
cd build
cmake ..

The project should contain several possible configurations. For speed, one should build the Release
configuration.

It is also possible to build and test directly from the command-line, thereby bypassing GUI systems:

cmake --build . --config Release
ctest . -C Release

Running Zacros
The simplest way to run Zacros is to launch the executable from the command-line. For Unix-like
systems:

path/to/executable/zacros.x

or simply,

zacros.x

if zacros.x is in the user's path. For Windows systems zacros.x is replaced with zacros.exe in
the above commands. Zacros expects all the appropriate input files to be in the current directory. Please
refer to section Setting up a KMC Simulation in Zacros for a description of these files.


When invoking the thread-capable executable (the default; see section Compiling Zacros, page 9, for
how to disable threads), Zacros will run with as many threads as there are cores. The number of threads
can be set manually in MS-DOS and UNIX through the appropriate environment variables. In
UNIX one needs to use the command export:

export OMP_NUM_THREADS=4

for 4 threads, whereas in MS-DOS this is done with command set:

set OMP_DYNAMIC=FALSE
set OMP_NUM_THREADS=4
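
When Zacros is launched from a script rather than from an interactive shell, the same variables can be passed through the environment of the child process. The following is a minimal sketch; the helper name zacros_thread_env is hypothetical, and the executable path in the comment is the placeholder used in this section:

```python
import os


def zacros_thread_env(n_threads, base_env=None):
    """Return a copy of the environment with the OpenMP variables from the text set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_DYNAMIC"] = "FALSE"
    env["OMP_NUM_THREADS"] = str(n_threads)
    return env


# Example launch with 4 threads (placeholder path, not executed here):
# import subprocess
# subprocess.run(["path/to/executable/zacros.x"], env=zacros_thread_env(4))
```

Working on a copy leaves the environment of the calling script untouched, so runs with different thread counts can be started from the same script.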

Running distributed simulations has to be done by invoking the appropriate MPI “manager” program,
mpiexec, mpirun, or similar, and using of course a Zacros executable that has been compiled with
MPI instructions/directives. The number of MPI processes is given to mpiexec by using the option -np
as follows (in the example below we use 4 MPI processes):

mpiexec -np 4 /path/to/mpi-zacros/zacros.x

Note that the number of MPI processes depends on the size of the lattice simulated, i.e. it cannot be
chosen arbitrarily. For more details, refer to section Performing MPI Time-Warp Runs on page 33.

It is possible to give command-line arguments to Zacros in MPI runs (see section Command-Line
Arguments, on page 62). An example in which we specify (or override) the wall-time to 86400 seconds
(24 hours) is given below:

mpiexec -np 4 /path/to/mpi-zacros/zacros.x --wall_time=86400
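
For scripted MPI runs, the command line above can be built as an argument list. This is only a sketch: the function name mpi_zacros_command is hypothetical, mpiexec is assumed to be the MPI launcher on your system, and only the -np and --wall_time options shown in this section are used.

```python
def mpi_zacros_command(n_procs, executable, wall_time=None):
    """Build the mpiexec command from the text as an argument list."""
    if n_procs < 1:
        raise ValueError("the number of MPI processes must be at least 1")
    cmd = ["mpiexec", "-np", str(n_procs), executable]
    if wall_time is not None:
        # Optional override of the wall-time, in seconds
        cmd.append(f"--wall_time={wall_time}")
    return cmd


if __name__ == "__main__":
    print(" ".join(mpi_zacros_command(4, "/path/to/mpi-zacros/zacros.x", 86400)))
```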

Input/Output Files
The input to Zacros consists of 5 keyword-based files, out of which one is optional:

simulation_input.dat (required)     lattice_input.dat (required)
energetics_input.dat (required)     mechanism_input.dat (required)
state_input.dat (optional)

Moreover, the output of Zacros consists of the following files:

general_output.txt history_output.txt
lattice_output.txt procstat_output.txt
specnum_output.txt
Two additional files are produced if keywords energetics_lists and process_lists appear in
simulation_input.dat file (see section Simulation Input File, page 23); these files are (respectively for
each keyword):
energlist_output.txt proclist_output.txt
If run in debugging mode (see section Troubleshooting, page 39), Zacros also generates the following
files which are useful for troubleshooting a simulation:


process_debug.txt globalenerg_debug.txt
newton_debug.txt

All the aforementioned files are read from (or written to) the same directory, unless otherwise specified
via the command line (see section Command-Line Arguments, page 62).

Finally, restart.inf is an input/output file used to resume a simulation from the point it stopped.
Even though it is a plain text file, it is not intended to be human-readable. For more details on how the
resume feature works, please refer to keywords wall_time and no_restart in section Simulation
Input File, page 25. Note that if the file restart.inf exists in the current working directory, Zacros
will disregard the aforementioned input files, attempt to read restart.inf and resume the
simulation.
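
Before submitting a run, it can be useful to check the working directory against the file list above. The sketch below is not part of Zacros; the function names are hypothetical, and the file names are those listed in this section.

```python
from pathlib import Path

REQUIRED_INPUTS = (
    "simulation_input.dat",
    "lattice_input.dat",
    "energetics_input.dat",
    "mechanism_input.dat",
)
OPTIONAL_INPUTS = ("state_input.dat",)


def missing_inputs(run_dir="."):
    """Return the required input files that are absent from run_dir."""
    run_dir = Path(run_dir)
    return [name for name in REQUIRED_INPUTS if not (run_dir / name).is_file()]


def will_resume(run_dir="."):
    """True if restart.inf exists, in which case the input files are disregarded."""
    return (Path(run_dir) / "restart.inf").is_file()


if __name__ == "__main__":
    if will_resume():
        print("restart.inf found: the simulation would resume")
    else:
        print("missing required inputs:", missing_inputs() or "none")
```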

For MPI runs, each MPI process generates its own output file for the lattice subdomain for which it is
responsible. Thus, MPI process 0 generates files with the names mentioned above, while MPI process i,
with i ≥ 1, generates files with the following names:

general_output_i.txt history_output_i.txt
lattice_output_i.txt procstat_output_i.txt
specnum_output_i.txt

for i = 1, 2, 3, … It is therefore important to keep in mind that MPI runs generate a large number of files.
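
The naming rule for MPI runs can be sketched as follows, which also gives a quick estimate of how many files a run will produce. The function names are hypothetical; only the five standard output files listed above are counted, not the optional list or debug files.

```python
OUTPUT_STEMS = (
    "general_output",
    "history_output",
    "lattice_output",
    "procstat_output",
    "specnum_output",
)


def mpi_output_files(rank):
    """Output file names written by MPI process `rank` (rank 0 has no suffix)."""
    suffix = "" if rank == 0 else f"_{rank}"
    return [f"{stem}{suffix}.txt" for stem in OUTPUT_STEMS]


def count_output_files(n_procs):
    """Total number of standard output files for a run with n_procs MPI processes."""
    return sum(len(mpi_output_files(rank)) for rank in range(n_procs))


if __name__ == "__main__":
    print(mpi_output_files(2))
    print(count_output_files(64))
```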

Units and Constants


Zacros assumes that the input parameters are given in the following units:

Energy: electronvolt (eV)     Time: second (s)                             Length: Ångström (Å)
Pressure: bar (bar)           Molecular mass: atomic mass units (amu)*     Temperature: Kelvin (K)

Moreover, the values of the constants used are as follows:

Pi constant: $\pi = 3.141592653589793$
Gas constant: $R_{gas} = 8.314472$ J/K/mol
Avogadro’s number: $N_A = 6.02214179 \cdot 10^{23}$ mol$^{-1}$
Boltzmann’s constant: $k_B = R_{gas}/N_A$

For the conversion of J to eV the following constant is used:

EnergyConv = $6.24150974 \cdot 10^{18}$ eV/J
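
As a worked check of how these constants combine, the sketch below evaluates Boltzmann’s constant in J/K and in eV/K from the values listed above. The Python variable names are illustrative and do not correspond to identifiers in the Zacros source.

```python
R_GAS = 8.314472              # gas constant, J/K/mol
N_AVOGADRO = 6.02214179e23    # Avogadro's number, 1/mol
ENERGY_CONV = 6.24150974e18   # conversion factor, eV per J

K_B_JOULE = R_GAS / N_AVOGADRO       # Boltzmann's constant in J/K
K_B_EV = K_B_JOULE * ENERGY_CONV     # Boltzmann's constant in eV/K

if __name__ == "__main__":
    print(f"kB = {K_B_JOULE:.7e} J/K = {K_B_EV:.7e} eV/K")
    # Thermal energy at an illustrative temperature of 500 K
    print(f"kB*T at 500 K = {K_B_EV * 500.0:.5f} eV")
```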

One can use a different system of units for the time and pressure when providing pre-exponentials (see
section Mechanism Input File, page 52), keeping in mind that the reported values will also have different
units. However, using different units for energy and temperature will require changing the value of
parameter enrgconv in file constants_module.f90 and recompiling the program (see section
Compiling Zacros, page 9, for information on how to do this).

* This feature exists for future development and presently does not affect the input/output.


Setting up a KMC Simulation in Zacros


In the following we present the keywords and syntax used in each of the input files. Keywords are
denoted with blue colored Courier New font, for instance random_seed. Numeric or string
arguments to the keywords are denoted as follows:

int        an integer number
real       a real number
str        a string
keywrd     a keyword
expr       an expression which may consist of combinations of the above

If more than one argument of the same kind follows a keyword, the arguments appear numbered, for instance,
temperature ramp real1 real2.

All input files support free-format; thus, blank lines and comments are permitted anywhere in the text
as long as the syntax is valid. The commenting character is #, for instance:

snapshots on time 1.50 # sampling every 1.5 time units

The keywords are not case sensitive, and strings should be written free from quotation marks (unless for
instance one wants to use quotation marks as part of a name). Spaces are not allowed in a string; one
can use underscores _ instead. In general, the order of the keywords does not matter, but there are
cases where a keyword must precede another one, for instance one has to first define the number of
gas species and subsequently the names thereof, not the reverse. Moreover, keywords may not be
repeated within the same scope. The parser will report an error in such cases.
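
The free-format rules above can be illustrated with a small tokenizer. This is a simplified sketch and not the parser used by Zacros: it assumes one keyword per line and a single scope, and it treats every # as the start of a comment. The function names are hypothetical.

```python
def tokenize_line(line):
    """Strip a trailing # comment and split the rest into (keyword, arguments).

    Returns None for blank or comment-only lines. The keyword is lowercased
    because keywords are not case sensitive; arguments are kept verbatim.
    """
    content = line.split("#", 1)[0].strip()
    if not content:
        return None
    tokens = content.split()
    return tokens[0].lower(), tokens[1:]


def parse_keywords(text):
    """Collect keyword -> arguments, rejecting keywords repeated in this scope."""
    seen = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        parsed = tokenize_line(line)
        if parsed is None:
            continue
        keyword, args = parsed
        if keyword in seen:
            raise ValueError(f"line {lineno}: keyword '{keyword}' is repeated")
        seen[keyword] = args
    return seen


if __name__ == "__main__":
    sample = "snapshots on time 1.50 # sampling every 1.5 time units\n"
    print(parse_keywords(sample))
```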

Simulation Input File


The file simulation_input.dat contains information about the species involved in the chemistry,
the conditions under which the chemistry is to be simulated, as well as parameters that specify the
behavior of the program, namely when to take samples, what are the stopping criteria, etc. Common
keywords and options are explained below.

General Simulation Parameters


random_number_generator str The random number generator approach to be used in the
simulation, where str can be one of the following:

    park_miller_min "minimal" random number generator of
    Park and Miller (Ref. 3).

    low_quality linear congruential generator with a
    short period of $2^{13} - 2 = 8190$. Caution: this is
    intended to be used only for testing/debugging.

    very_low_quality similar to the low_quality
    generator, but with an even shorter period equal
    to $2^{7} - 2 = 126$. Caution: this is intended to be
    used only for testing/debugging.

    mt19937 Mersenne Twister 19937 by Matsumoto and
    Nishimura (Ref. 4). This is also the default generator used
    if keyword random_number_generator is
    omitted. The period of this random number
    generator is $2^{19937} - 1 \approx 4.3 \cdot 10^{6001}$.

    mt19937_ii Mersenne Twister 19937 with an improved
    initialization scheme.

random_seed int1 int2 … The integer seed(s) of the random number generator. Out of
the random number generators available (see previous
keyword), only mt19937_ii supports initialization with more
than one value. All the others use only int1 and discard the
remaining integers.
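
The periods quoted for the generators above can be verified with integer arithmetic. The sketch below uses logarithms to express the Mersenne Twister period in scientific notation, which avoids printing a number with thousands of digits; the helper name is illustrative.

```python
import math


def scientific_form(n):
    """Return (mantissa, exponent) such that n is about mantissa * 10**exponent."""
    log10_n = math.log10(n)  # math.log10 accepts arbitrarily large integers
    exponent = math.floor(log10_n)
    mantissa = 10 ** (log10_n - exponent)
    return mantissa, exponent


LOW_QUALITY_PERIOD = 2**13 - 2
VERY_LOW_QUALITY_PERIOD = 2**7 - 2
MT19937_PERIOD = 2**19937 - 1

if __name__ == "__main__":
    print(LOW_QUALITY_PERIOD, VERY_LOW_QUALITY_PERIOD)
    mantissa, exponent = scientific_form(MT19937_PERIOD)
    print(f"mt19937 period is about {mantissa:.1f}e{exponent}")
```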

tol_dx_newton real When simulating systems in which the temperature is not
constant, Zacros uses the Newton-Raphson method to solve a
non-linear equation for the time of occurrence of each lattice
process. The value of real gives the tolerance for the norm of the
difference between successive approximations to the solution. If omitted,
this tolerance is taken equal to the default value of $10^{-9}$.

tol_rhs_newton real See also keyword tol_dx_newton above. This keyword
gives the tolerance for the right-hand side for the Newton-
Raphson method. If omitted, this tolerance is taken equal to
the default value of $10^{-9}$.

max_newton_iter int See also keyword tol_dx_newton above. This keyword
gives the maximum number of iterations for the Newton-
Raphson method. If omitted, this number is taken equal to the
default value of 150.

n_gauss_pts int See also keyword tol_dx_newton above. The non-linear
equation for the time of occurrence for each lattice process
contains an integral of propensity over time. This integral is
computed using the Gauss-Legendre quadrature for which this
keyword gives the number of points to be used.
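
To illustrate how the four keywords above interact, here is a minimal sketch of a Newton-Raphson solve for an occurrence time with a time-dependent propensity. It is not the Zacros implementation. It assumes the commonly used form of the equation, namely that the integral of the propensity from the current time to the occurrence time equals -ln(u) for a uniform random number u, and it uses a fixed 3-point Gauss-Legendre rule on sub-intervals instead of a configurable number of points. All function names are hypothetical.

```python
import math

# 3-point Gauss-Legendre nodes and weights on [-1, 1]
GL_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
GL_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def integrate(f, a, b, n_panels=20):
    """Composite 3-point Gauss-Legendre quadrature of f over [a, b]."""
    h = (b - a) / n_panels
    total = 0.0
    for k in range(n_panels):
        mid = a + (k + 0.5) * h
        total += 0.5 * h * sum(w * f(mid + 0.5 * h * x)
                               for x, w in zip(GL_NODES, GL_WEIGHTS))
    return total


def occurrence_time(propensity, t0, u, tol_dx=1e-9, tol_rhs=1e-9, max_iter=150):
    """Solve integral_{t0}^{t} propensity(s) ds = -ln(u) for t by Newton-Raphson."""
    target = -math.log(u)
    t = t0 + target / propensity(t0)  # initial guess: constant-propensity solution
    for _ in range(max_iter):
        residual = integrate(propensity, t0, t) - target  # right-hand side
        step = residual / propensity(t)  # the derivative of the integral is the propensity
        t -= step
        if abs(step) < tol_dx and abs(residual) < tol_rhs:
            return t
    raise RuntimeError("Newton-Raphson did not converge")


if __name__ == "__main__":
    # Propensity increasing linearly in time (illustrative only)
    print(occurrence_time(lambda s: 1.0 + s, 0.0, 0.5))
```

The two tolerances play the roles described for tol_dx_newton (size of the Newton update) and tol_rhs_newton (size of the residual), and the iteration cap mirrors max_newton_iter.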

temperature expr The temperature (K) under which the system is simulated.
Expression expr can be one of the following:

    real specifies a constant simulation temperature

    ramp real1 real2 specifies a temperature ramp
    where real1 is the initial temperature (K) and
    real2 is the rate of change (K/s). If real2 is
    positive, temperature programmed desorption
    can be simulated. Negative values of real2 can
    be used for simulated annealing calculations.
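
As a small worked example of the ramp option, the temperature at simulated time t is real1 + real2·t. The sketch below evaluates this and the time needed to reach a given temperature; the function names and numerical values are illustrative only.

```python
def ramp_temperature(t, initial_temperature, rate):
    """Temperature (K) at time t (s) for a linear ramp starting at initial_temperature."""
    return initial_temperature + rate * t


def time_to_reach(target_temperature, initial_temperature, rate):
    """Time (s) at which the ramp reaches target_temperature."""
    if rate == 0.0:
        raise ValueError("a zero rate describes a constant temperature, not a ramp")
    t = (target_temperature - initial_temperature) / rate
    if t < 0.0:
        raise ValueError("the ramp moves away from the target temperature")
    return t


if __name__ == "__main__":
    # Heating from 300 K at 2 K/s, as would be given by: temperature ramp 300.0 2.0
    print(ramp_temperature(100.0, 300.0, 2.0))
    print(time_to_reach(500.0, 300.0, 2.0))
```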

pressure real The pressure (bar) under which the system is simulated.

n_gas_species int The number of gas species in the chemistry.

gas_specs_names str1 str2 … The names of the gas species. There should be as many strings
following the keyword as the number of gas species specified
with keyword n_gas_species.

gas_energies real1 real2 … The total energies (eV) of the gas species. There should be as
many reals following this keyword as the number of gas
species specified with keyword n_gas_species. The
ordering of these values should be consistent with the order
used in keyword gas_specs_names.

gas_molec_weights real1 real2 … The molecular weights (amu) of the gas species. There
should be as many reals following the keyword as the number
of gas species specified with keyword n_gas_species.
Note: at present these values are not used in the code. This
feature is there for future development.

gas_molar_fracs real1 real2 … The molar fractions (dimensionless) of the gas species in the
gas phase. There should be as many reals following this
keyword as the number of gas species specified with keyword
n_gas_species. The ordering of these values should be
consistent with the order used in keyword
gas_specs_names.

n_surf_species int The number of surface species in the chemistry (without
counting the empty sites as a species, even though it is
considered as a pseudo-species internally in the code).

surf_specs_names str1 str2 … The names of the surface species. There should be as
many strings following the keyword as the number of surface
species specified with keyword n_surf_species. Note that
the name “*” is reserved for the empty site pseudo-species.


surf_specs_dent int1 int2 … The number of dentates of the surface species, specifying the
number of sites each species binds to. Thus, for a mono-
dentate species (for instance O adatoms on fcc sites) this
integer is 1, for a bidentate species (for instance O2 on a top-
fcc configuration) this integer is 2, etc. There should be as
many integers following this keyword as the number of surface
species specified with keyword n_surf_species. The
ordering of these values should be consistent with the order
used in keyword surf_specs_names.
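
Several of the keywords above must be followed by exactly as many values as there are gas (or surface) species. A minimal consistency check could look as follows; the helper is hypothetical, and the species names and values in the example are made up for illustration.

```python
def inconsistent_keywords(n_species, **per_species_values):
    """Return the keywords whose number of values differs from n_species."""
    return [keyword for keyword, values in per_species_values.items()
            if len(values) != n_species]


if __name__ == "__main__":
    # Hypothetical two-species gas phase with one energy value missing
    problems = inconsistent_keywords(
        2,
        gas_specs_names=["CO", "O2"],
        gas_energies=[0.0],
        gas_molar_fracs=[0.5, 0.5],
    )
    print(problems)
```

The same check applies to surf_specs_names and surf_specs_dent with n_surf_species.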

kmc_propagation_method expr1 The method used to propagate the KMC state.
Expression expr1 should be as follows:

    first_reaction expr2 specifying that the first reaction
    method will be used, with a queuing system given
    by expr2. The latter can be any of the following:
        unsorted_list
        binary_heap
        binary_heap_with_swaps
        skip_list
        skip_list_1way
        pairing_heap
    The “binary-heap with swaps” as well as the “1-
    way skip-list” queuing systems are made available
    mainly for experimental (or validation) purposes,
    while the “binary heap” and “pairing heap” are
    the best for production runs. More information
    about these approaches can be found in Ref. 5.
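
To make the role of the queuing system concrete, the sketch below stores tentative occurrence times in a binary heap and executes the earliest one, which is the essence of the first reaction method. It uses Python’s heapq module and constant rate constants; it is a toy illustration under these assumptions, not the data structure implemented in Zacros, and the process names and rates are made up.

```python
import heapq
import math
import random


def build_queue(rate_constants, now, rng):
    """Draw a tentative occurrence time for every process and heapify them."""
    queue = []
    for process, rate in rate_constants.items():
        # Exponentially distributed waiting time for a constant rate
        wait = -math.log(1.0 - rng.random()) / rate
        heapq.heappush(queue, (now + wait, process))
    return queue


def next_event(queue):
    """Remove and return the (time, process) pair with the smallest time."""
    return heapq.heappop(queue)


if __name__ == "__main__":
    rng = random.Random(2024)
    rates = {"adsorption": 5.0, "desorption": 1.0, "diffusion": 50.0}
    queue = build_queue(rates, 0.0, rng)
    while queue:
        print(next_event(queue))
```

With a heap, finding the next event costs O(1) and each update O(log n), whereas an unsorted list requires an O(n) scan per step.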

Reporting Schemes
By “reporting”, we refer to writing information about the simulation into output files that can be further
post-processed. Zacros implements several different reporting modes that enable the user to get a full
picture of the dynamics of the system simulated. One can thus report information about the state of the
lattice, the number of gas phase molecules produced/consumed, the statistics of occurrence of
elementary events etc. Reporting can be done at specific time intervals, every time a lattice process (e.g.
desorption) takes place, or even when a specific process takes place, for instance a disproportionation
reaction between species A and B. The relevant keywords are shown below:

snapshots expr Determines how often snapshots of the lattice state will be
written to output file history_output.txt (for the latter
see section History Output File, page 71). Possible options for
expression expr are discussed below.


process_statistics expr Determines how often statistical information about the
occurrence of elementary events will be written to output file
procstat_output.txt (for the latter see section Process
Statistics Output File, page 72). Possible options for expression
expr are discussed below.

species_numbers expr Determines how often information about the number of gas
and surface species (as well as the energy of the current lattice
configuration) will be written to specnum_output.txt (for
the latter see section Species Numbers Output File, page 73).
Possible options for expression expr are discussed below.

For the above three keywords (snapshots, process_statistics and species_numbers),
the possible options for expression expr are:

off switches off reporting. No output will be written.

on event [int] specifies that an entry to the corresponding output file will be
written every int KMC steps. The integer following on
event is optional and assumes the value of 1 if omitted. In the
latter case, the initial (KMC step 0) and all subsequent
configurations will be written.

on elemevent int1 specifies that an entry to the corresponding output file will be
written at every occurrence of elementary event int1. The
latter number points to an event defined in the mechanism
input file (see section Mechanism Input File, page 52).

on time real specifies that a snapshot will be written at linearly spaced time
points, at every ∆t = real time units (s): 0, ∆t, 2·∆t, 3·∆t, …

on logtime real1 real2 specifies that a snapshot will be written at logarithmically
spaced time points, starting at time $t_0$ = real1 and
progressing by multiplying by $a$ = real2: $t_0$, $a \cdot t_0$, $a^2 \cdot t_0$, $a^3 \cdot t_0$, …
This sampling scheme is particularly useful if one needs to
investigate a vast range of timescales, as it overcomes the
problem of generating huge output files.

on realtime real specifies that a snapshot will be written at linearly spaced time
points, at every ∆tR = real seconds of real (clock) time (s): 0,
∆tR, 2·∆tR, 3·∆tR, … This scheme is intended for benchmarking
or diagnostic purposes only, since it will produce output files
that vary among different computers. It is useful, for instance, if
you want to choose an appropriate ∆t for on time reporting,
but you don’t know how often Zacros would write an entry to
the output files. It may therefore happen that you end up with
either huge or empty output files. Sampling on realtime
allows you to write a desired number of configurations (e.g. 10
configurations in an hour for real = 360 seconds), and then
choose the appropriate ∆t for production runs.
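
The difference between the on time and on logtime options is easiest to see by listing the sampling times they produce. The following sketch generates both sequences up to a final time; the function names are hypothetical and the values are illustrative.

```python
def linear_times(dt, t_max):
    """Sampling times 0, dt, 2*dt, ... not exceeding t_max (as in 'on time')."""
    if dt <= 0.0:
        raise ValueError("dt must be positive")
    times = []
    k = 0
    while k * dt <= t_max:
        times.append(k * dt)
        k += 1
    return times


def log_times(t0, a, t_max):
    """Sampling times t0, a*t0, a^2*t0, ... not exceeding t_max (as in 'on logtime')."""
    if t0 <= 0.0 or a <= 1.0:
        raise ValueError("t0 must be positive and a must exceed 1")
    times = []
    t = t0
    while t <= t_max:
        times.append(t)
        t *= a
    return times


if __name__ == "__main__":
    # Covering times up to 1000 s: linear versus logarithmic spacing
    print(len(linear_times(1.0, 1000.0)), "entries with on time 1.0")
    print(len(log_times(1.0, 10.0, 1000.0)), "entries with on logtime 1.0 10.0")
```

For the on realtime example in the text, one hour of clock time sampled every 360 seconds gives 3600/360 = 10 entries.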

In addition to the above, Zacros can save information about all energetic interaction
patterns (see section Energetics Input File, page 47) as well as the lattice processes (see section
Mechanism Input File, page 52) that have been detected for a configuration that arose during the
course of the simulation. The relevant keywords are:

energetics_lists expr selexpr specifies that a list of energetic interaction patterns that
were detected on the lattice will be written to output file
energlist_output.txt. Possible options for expression
expr were discussed above. If one needs to output only
selected patterns (for instance only the 1st nearest neighbor
repulsions between species A-A) rather than all patterns, the
selection-expression selexpr can be used to achieve this.
Thus, selexpr follows the syntax:

    select_cluster_type int1 int2 … specifies
    that only instances of energetic patterns (clusters)
    of types int1 int2 … will be written in the
    output file. These integers point to clusters
    defined in the energetics input file (see section
    Energetics Input File, page 47).

process_lists expr selexpr specifies that a list of lattice processes (i.e. elementary events
that were detected on the lattice) will be written to output file
proclist_output.txt. Possible options for expression
expr were discussed above. If one needs to output only
selected elementary events (for instance only the bimolecular
reaction between species A-B) rather than all events, the
selection-expression selexpr can be used to achieve this.
Thus, selexpr follows the syntax:

    select_event_type int1 int2 … specifies
    that only instances of elementary events of types
    int1 int2 … will be written in the output file.
    These integers point to elementary events defined in the
    mechanism input file (see section Mechanism
    Input File, page 52).


Finally, the following keywords enable the output of detailed information in general_output.txt
about the setup or the course of the simulation.

on_sites_seeding_report expr Controls the level of detail in the reporting of on-sites
seeding instructions used to specify an initial state (see section
Initial State Input File, page 61). In previous versions of Zacros
(2.0 and lower), all the on-sites seeding instructions given in
state_input.dat were repeated for cross-checking in
general_output.txt. However, this reporting behavior
could result in very large output files, especially if the number
of such seeding instructions was large. Starting from Zacros 3.0
the default behavior (if this keyword is omitted) has been to
suppress such detailed output; however, the user can override
this default and get the full output if desired (as in previous
Zacros versions). Thus, expression expr can be one of the
following:

    off switches off detailed reporting of on-sites seeding
    instructions (this is the default). Also prints a note
    in general_output.txt informing the user
    on how to enable detailed reporting of on-sites
    seeding instructions.

    on switches on detailed reporting of on-sites seeding
    instructions. Caution: this may result in
    excessively large general_output.txt files if
    many on-sites seeding instructions appear in
    state_input.dat. This behavior will be
    aggravated in MPI runs, because each general
    output file (from each MPI process) will report all
    the seeding instructions given. Use with caution!

event_report expr Controls event reporting behavior. Expression expr can be
one of the following:

    off switches off event reporting (this is the default).
    No information will be written.

    on switches on event reporting. At every KMC step,
    information about the last elementary process
    executed is written to general_output.txt
    (see section General Output File, page 63).


Stopping and Resuming


The following keywords control Zacros’s behavior in terminating and resuming a simulation.

max_steps expr The maximum number of KMC steps to be simulated. This
keyword defines a stopping criterion. Expression expr can be
one of the following:

    infinity sets the maximum number of steps to the
    maximum value of a 4-byte integer number
    ($2147483647 \approx 2.15 \cdot 10^{9}$).

int specifies a number of maximum steps.

max_time expr The maximum allowed simulated time interval (time ranges
from 0.0 to the maximum time in a simulation). This keyword
defines a stopping criterion. Expression expr can be one of
the following:

    infinity sets the maximum time to the maximum value of
    an 8-byte real number (about $1.8 \cdot 10^{308}$).

real specifies the maximum time.

wall_time int The maximum allowed real-world time in seconds that the
simulation can be left running. The code has an internal
“stopwatch” and will exit normally once the wall-time has been
exceeded. Upon exit, the state of the program will be saved in
file restart.inf, so that the simulation can resume at a
later time. This feature is particularly useful when running in
computational clusters where a scheduler may enforce limits
on the time a simulation can be run.

no_restart This keyword gives the option to override the default behavior
of the program and not produce any restart.inf file upon
exit. If this has been specified, the program will not be able to
resume the simulation at a later time. This can be useful if one
wants to perform short/experimental runs, change the input
and rerun from scratch repetitively. For production runs, it is
recommended to avoid using this keyword.
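
The three stopping criteria can be pictured as a check performed as the simulation advances. The sketch below is a hypothetical illustration, not the Zacros code; in particular, the order in which the criteria are tested and the exact comparison operators are assumptions. The two "infinity" defaults are the values quoted above.

```python
import sys

MAX_STEPS_INFINITY = 2**31 - 1          # largest 4-byte signed integer
MAX_TIME_INFINITY = sys.float_info.max  # largest 8-byte real, about 1.8e308


def stop_reason(step, sim_time, elapsed_wall,
                max_steps=MAX_STEPS_INFINITY,
                max_time=MAX_TIME_INFINITY,
                wall_time=None):
    """Return the name of the first stopping criterion that is met, or None."""
    if step >= max_steps:
        return "max_steps"
    if sim_time >= max_time:
        return "max_time"
    if wall_time is not None and elapsed_wall > wall_time:
        # In Zacros this is the case in which restart.inf allows resuming later
        return "wall_time"
    return None


if __name__ == "__main__":
    print(MAX_STEPS_INFINITY, MAX_TIME_INFINITY)
    print(stop_reason(step=1000, sim_time=2.5, elapsed_wall=90000.0, wall_time=86400))
```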

Treating Fast Quasi-Equilibrated Processes


It is frequently the case that in a KMC simulation some processes have rate constants that are much
higher than those of all the other processes. This results in these fast processes quickly reaching
quasi-equilibrium. Yet, they will continue to be simulated extremely frequently, to the point that a lot of
computational time is wasted. One can apply singular perturbation arguments to show that scaling
down these kinetic constants incurs a small and quantifiable error in the simulation (see for instance
Stamatakis and Vlachos, Ref. 6). Zacros employs dynamic detection of time-scale separation and dynamic
scaling of the kinetic constants to accelerate the simulation. The procedure currently implemented in
Zacros is along the lines of previously published algorithms (Refs. 7-9); yet, it was developed independently and
thus it may behave differently than these algorithms.

Caution: all the algorithms that scale-down kinetic constants are approximate: they will always
introduce error in the simulation. The desired case is of course when this error is small compared to the
KMC sampling error, and therefore imperceptible. It is recommended that you do your own testing, by
progressively reducing the downscaling of the kinetic constants, until the results do not change. At that
point one can reasonably assume that they have converged to the accurate solution.

The algorithm implemented in Zacros works as follows (parameters of the algorithm appear in blue to
make the connection with the keywords later):

1. Define a “stiffness coefficient” ($\mathrm{stiffness\_coeff}_i$) for each event i as the scaling factor of the kinetic
constant of the forward and (if applicable) reverse event. The stiffness_coeff ranges from $0^+$ to 1. To
scale down the rate constant of an event, its pre-exponential is multiplied by stiffness_coeff.
2. At the beginning of the simulation, set stiffness_coeff = 1 for all events.
3. For every N_events∙[number of elementary steps in mechanism] KMC events that have occurred:
  3.1. Calculate the partial-equilibrium ratios as
       $$R_i \leftarrow \frac{N_{\mathrm{fwd},i}}{N_{\mathrm{fwd},i} + N_{\mathrm{rev},i}}$$
       for each of the events, i, where $N_{\mathrm{fwd},i}$ and $N_{\mathrm{rev},i}$ are the numbers of occurrences of the
       forward and reverse steps, respectively. Non-reversible steps have $R_i = 1$ by definition. For
       quasi-equilibrated reversible events $R_i$ evaluates to ½.
  3.2. If any scaled-down step appears as non-equilibrated we may have scaled it down too much. Try
       to detect such steps and remedy the situation by scaling up:
    3.2.1. Check if $\left| R_i - \tfrac{1}{2} \right| > \mathrm{quasiequi\_tol}$ and $\mathrm{stiffness\_coeff}_i < \mathrm{stiffn\_scaling\_threshold}$ for any
           event i, where quasiequi_tol is the tolerance for detecting the quasi-equilibrated steps
           and stiffn_scaling_threshold is a threshold value.
    3.2.2. If both conditions are true for an elementary step i, try to increase $\mathrm{stiffness\_coeff}_i$ by
           multiplying it by a constant factor, const_factor:
           $$\mathrm{stiffness\_coeff}_i \leftarrow \mathrm{stiffness\_coeff}_i \cdot \mathrm{const\_factor}$$
           If $\mathrm{stiffness\_coeff}_i > \mathrm{stiffn\_scaling\_threshold}$ set $\mathrm{stiffness\_coeff}_i \leftarrow 1$.
  3.3. Find the fastest non-equilibrated step, as the one with the largest $N_{\mathrm{fwd},i}$ and a
       partial-equilibrium ratio $\left| R_i - \tfrac{1}{2} \right| > \mathrm{quasiequi\_tol}$. Set $i_{\mathrm{fastnoneq}} = i$ of that step if it exists.
  3.4. Find the fastest equilibrated step, as the one with the largest $N_{\mathrm{fwd},i}$ and a partial-equilibrium
       ratio $\left| R_i - \tfrac{1}{2} \right| \le \mathrm{quasiequi\_tol}$. Set $i_{\mathrm{fasteq}} = i$ of that step if it exists.
  3.5. Find the slowest equilibrated step, as the one with the smallest $N_{\mathrm{fwd},i}$ and a partial-equilibrium
       ratio $\left| R_i - \tfrac{1}{2} \right| \le \mathrm{quasiequi\_tol}$. Set $i_{\mathrm{sloweq}} = i$ of that step if it exists.


3.6. Based on the event occurrence statistical information computed above, decide what action to
take:
3.6.1. Case 0: There is no fastest non-quasi-equilibrated step, but all events that have been
occurring appear to be fast and quasi-equilibrated.
3.6.1.1. If the timescale separation between steps $i_{\mathrm{fasteq}}$ and $i_{\mathrm{sloweq}}$ is significant, namely,
$$\frac{N_{\mathrm{fwd},i_{\mathrm{fasteq}}}}{N_{\mathrm{fwd},i_{\mathrm{sloweq}}}} > \mathrm{max\_allowed\_fast\_quasiequi\_separ}$$
bring all quasi-equilibrated timescales to the slowest one, by setting
$$\mathrm{stiffness\_coeff}_i \leftarrow \mathrm{stiffness\_coeff}_i \cdot \frac{N_{\mathrm{fwd},i_{\mathrm{sloweq}}}}{N_{\mathrm{fwd},i}}$$
If $\mathrm{stiffness\_coeff}_i > \mathrm{stiffn\_scaling\_threshold}$ set $\mathrm{stiffness\_coeff}_i \leftarrow 1$.
3.6.1.2. Otherwise, all fast equilibrated steps occur on the same timescale. In this case try to
decrease stiffness_coeff_i by dividing it by a constant factor, const_factor:
$$\mathrm{stiffness\_coeff}_i \leftarrow \frac{\mathrm{stiffness\_coeff}_i}{\mathrm{const\_factor}}$$
This aims at exploring slower dynamics in the system.
3.6.2. Case 1: There is a fastest non-quasi-equilibrated step. The aim is to make the ratio of the
timescales of all quasi-equilibrated steps over the timescale of the fastest non-equilibrated
step range between timescale_sep_min and timescale_sep_max. The parameter
$\mathrm{timescale\_sep\_geomean} = (\mathrm{timescale\_sep\_min} \cdot \mathrm{timescale\_sep\_max})^{1/2}$ is also used below.
3.6.2.1. Check if $|R_i - \tfrac{1}{2}| \le \mathrm{quasiequi\_tol}$ and
$\mathrm{timescale\_sep\_max} \cdot N_{\mathrm{fwd},i_{\mathrm{fastnoneq}}} < \min(N_{\mathrm{fwd},i}, N_{\mathrm{rev},i})$ for any event i. Events for which
these conditions evaluate to true are too fast and quasi-equilibrated. For these
events evaluate a new stiffness coefficient as:
$$\mathrm{stiffness\_coeff}_i \leftarrow \mathrm{stiffness\_coeff}_i \cdot \mathrm{timescale\_sep\_geomean} \cdot \frac{N_{\mathrm{fwd},i_{\mathrm{fastnoneq}}}}{\min(N_{\mathrm{fwd},i}, N_{\mathrm{rev},i})}$$
If $\mathrm{stiffness\_coeff}_i > \mathrm{stiffn\_scaling\_threshold}$ set $\mathrm{stiffness\_coeff}_i \leftarrow 1$.
3.6.2.2. Check if $|R_i - \tfrac{1}{2}| \le \mathrm{quasiequi\_tol}$, $\mathrm{stiffness\_coeff}_i < 1$, and
$\mathrm{timescale\_sep\_min} \cdot N_{\mathrm{fwd},i_{\mathrm{fastnoneq}}} > \min(N_{\mathrm{fwd},i}, N_{\mathrm{rev},i})$ for any event i. Events for which
these conditions evaluate to true are too slow and quasi-equilibrated. For these
events evaluate a new stiffness coefficient as:
$$\mathrm{stiffness\_coeff}_i \leftarrow \mathrm{stiffness\_coeff}_i \cdot \mathrm{timescale\_sep\_geomean} \cdot \frac{N_{\mathrm{fwd},i_{\mathrm{fastnoneq}}}}{\min(N_{\mathrm{fwd},i}, N_{\mathrm{rev},i})}$$
If $\mathrm{stiffness\_coeff}_i > \mathrm{stiffn\_scaling\_threshold}$ set $\mathrm{stiffness\_coeff}_i \leftarrow 1$.
4. Scale all affected pre-exponentials and random times of the occurrence of the events in the queue,
and continue with the simulation.
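
To make steps 3.1 and 3.6.2 more concrete, the following is a minimal Python sketch of the arithmetic for a single elementary step. It is not the Zacros implementation: the function names and argument names are hypothetical, the absolute value in the quasi-equilibrium test is assumed, and the threshold is passed explicitly rather than taken from any default.

```python
import math

def partial_equilibrium_ratio(n_fwd, n_rev, reversible=True):
    """Step 3.1: R_i = N_fwd / (N_fwd + N_rev); non-reversible steps have R_i = 1.
    Assumes the step has occurred at least once in the sampling window."""
    if not reversible:
        return 1.0
    return n_fwd / (n_fwd + n_rev)

def is_quasi_equilibrated(r, tol=0.05):
    """A step counts as quasi-equilibrated when |R_i - 1/2| <= quasiequi_tol."""
    return abs(r - 0.5) <= tol

def rescale_case1(coeff, n_fwd, n_rev, n_fwd_fastnoneq, *,
                  sep_min, sep_max, threshold):
    """Steps 3.6.2.1 and 3.6.2.2 for one quasi-equilibrated step.
    Returns the (possibly unchanged) stiffness coefficient."""
    geomean = math.sqrt(sep_min * sep_max)
    slowest_direction = min(n_fwd, n_rev)
    too_fast = sep_max * n_fwd_fastnoneq < slowest_direction
    too_slow = coeff < 1.0 and sep_min * n_fwd_fastnoneq > slowest_direction
    if not (too_fast or too_slow):
        return coeff
    new_coeff = coeff * geomean * n_fwd_fastnoneq / slowest_direction
    # Coefficients above the threshold are mapped back to 1.
    return 1.0 if new_coeff > threshold else new_coeff

# A fast reversible step (about 1e5 occurrences per direction) next to a
# non-equilibrated step that fired only 10 times in the same window.
r = partial_equilibrium_ratio(100_000, 99_000)
new_coeff = rescale_case1(1.0, 100_000, 99_000, 10,
                          sep_min=49.0, sep_max=100.0, threshold=1.0)
```

In this example the fast step is detected as quasi-equilibrated and its coefficient is reduced so that its slower direction would occur roughly 70 times (the geometric mean of 49 and 100) more often than the fastest non-equilibrated step.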

The keywords that set the parameters of the above algorithm, thereby controlling the behavior of the
timescale separation treatment module, are discussed below.

enable_stiffness_scaling By default, no stiffness scaling is performed during a
simulation. If this keyword is encountered in the file
simulation_input.dat, stiffness scaling is enabled.

stiffness_scale_all Specifies that the rate constant of any elementary step out of
those defined in mechanism_input.dat can be scaled to
treat time-scale separation. If this keyword is absent, one has
to explicitly mark the elementary steps whose rate constants
can be scaled, using the keyword stiffness_scalable in
mechanism_input.dat (see section Elementary Step
Representation, page 59).

check_every int Specifies the number of KMC events after which the stiffness
scaling module is invoked. It sets the parameter N_events in
step 3 of the above algorithm (default value = 1000). If the
number of new KMC events executed exceeds the product of
N_events with the count of elementary events in the
mechanism, then the scaling module is triggered.

min_separation real Specifies the minimum desired separation between the
timescales of the fastest non-equilibrated step and the quasi-
equilibrated steps. It sets the parameter timescale_sep_min in
step 3.6.2 of the above algorithm (default value = 49).

max_separation real Specifies the maximum allowed separation between the
timescales of the fastest non-equilibrated step and the quasi-
equilibrated steps. It sets the parameter timescale_sep_max in
step 3.6.2 of the above algorithm (default value = 100).

max_qequil_separation real Specifies the maximum allowed separation between the
timescales of the quasi-equilibrated steps in the case that all
steps are quasi-equilibrated. It sets the parameter
max_allowed_fast_quasiequi_separ in step 3.6.1.1 of the
above algorithm (default value = 5).

tol_part_equil_ratio real Specifies the tolerance for detecting quasi-equilibrated steps. It
sets the parameter quasiequi_tol in steps 3.2.1, 3.3, 3.4, 3.5,
3.6.2.1 and 3.6.2.2 of the above algorithm (default value = 0.05).

stiffn_coeff_threshold real Specifies the threshold above which any stiffness coefficient
will be automatically mapped to 1. It sets the parameter
stiffn_scaling_threshold in steps 3.2.2, 3.6.1.1, 3.6.2.1, 3.6.2.2
of the above algorithm. This threshold is also used to detect
whether an elementary step has been scaled down too much
in step 3.2.1 of the above algorithm (default value = 0.02, no
thresholding will occur if the value is set to 1).

scaling_factor real Specifies the scaling factor used in the uniform upscaling or
downscaling of kinetic rate constants. It sets the parameter
const_factor in steps 3.2.2 and 3.6.1.2 of the above algorithm
(default value = 5).
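
As an illustration of how these keywords fit together, the sketch below assembles a keyword block for simulation_input.dat from the default values quoted above. The helper stiffness_block and the dictionary name are hypothetical and not part of Zacros; the sketch assumes one keyword per line followed by its value, and it omits stiffn_coeff_threshold so that Zacros would fall back to its own default.

```python
import math

# Default values as quoted in the keyword descriptions above.
STIFFNESS_DEFAULTS = {
    "check_every": 1000,
    "min_separation": 49.0,
    "max_separation": 100.0,
    "max_qequil_separation": 5.0,
    "tol_part_equil_ratio": 0.05,
    "scaling_factor": 5.0,
}

def stiffness_block(scale_all=True, **overrides):
    """Return the text of a stiffness-scaling keyword block (hypothetical helper)."""
    unknown = set(overrides) - set(STIFFNESS_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown keywords: {sorted(unknown)}")
    params = {**STIFFNESS_DEFAULTS, **overrides}
    if params["min_separation"] >= params["max_separation"]:
        raise ValueError("min_separation must be smaller than max_separation")
    lines = ["enable_stiffness_scaling"]
    if scale_all:
        lines.append("stiffness_scale_all")
    lines += [f"{name} {value}" for name, value in params.items()]
    return "\n".join(lines)

block = stiffness_block(max_separation=200.0)

# Target separation used in step 3.6.2 for the default settings.
default_geomean = math.sqrt(STIFFNESS_DEFAULTS["min_separation"]
                            * STIFFNESS_DEFAULTS["max_separation"])
```

With the default min_separation and max_separation, the geometric mean used as the target separation in step 3.6.2 is 70.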

Simulating Very Large Lattices


Capturing the dynamics of heterogeneous catalysts via KMC simulation can incur a high computational
cost, especially when simulating large domains. Such domains may be necessary to obtain high accuracy,
or may be dictated by the physics of the system. For instance, simulations of reactive
systems that exhibit pattern formation can only be done in domains much larger than the characteristic
lengths of the patterns, which can often be on the order of several hundreds or even thousands of sites.
To carry out such simulations efficiently in a distributed and scalable manner, Zacros incorporates an
MPI implementation based on domain decomposition and the Time-Warp algorithm,10 which together
constitute an exact numerical scheme for efficient, large-scale KMC runs. Before describing the relevant
keywords that control the behavior of Zacros for such runs, let us provide a brief, high-level overview
of the approach; more technical details can be found in Ref. 11.

Overview of the Approach


The Time-Warp algorithm resolves a major challenge that arises due to the (mostly) asynchronous
nature of distributed KMC simulations of large domains, comprising several “connected” subdomains. To
demonstrate this challenge let us consider an example simulation in which a lattice is split into 4
domains, assigned to MPI processes 0, 1, 2, and 3, as in Figure 1. As already noted, the simulation of
these domains happens asynchronously, unless an event at a shared boundary occurs (e.g. MPI process
0 should not worry about internal events of process 1 or other MPI processes). The asynchronous nature
of the simulation risks violating causality. If, for example, MPI process 0 sends a particle via diffusion into
domain 1 at time t0, but the current KMC time in domain 1 is t1 > t0, then the history of process 1 (after
t0) is incorrect (Figure 2a). Causality has been violated, since at time t0 a particle diffused into domain 1,
but that domain has no record of this particle in its history.

Figure 1: A lattice of 36 sites decomposed into 4 domains (Domains 0 to 3), each assigned to the MPI
process with the same index; in the figure, Domains 1 and 3 lie above Domains 0 and 2, respectively. For
production runs, the domains are much larger, containing thousands of sites each.

To resolve this boundary conflict and restore causality, MPI process 1 will have to “roll back” to time t0,
discard the incorrect history and re-simulate its evolution (Figure 2b). However, during this discarded
history (t0 to t1) MPI process 1 may have performed diffusions that sent particles to other domains, e.g.
domain 2 in our example. Thus, it has to somehow signal to those MPI processes to undo and re-
simulate their history as well (Figure 2c). Further complications arise from the fact that these MPI
processes might have performed actions that affected other domains. It turns out that every boundary
conflict can potentially lead to a cascade of roll-backs and re-simulations that engage MPI processes
beyond the ones directly involved in that conflict.

The cascade just noted leads to a complex conflict resolution problem, a solution to which was proposed
by Jefferson in the mid-80s.10 His idea of virtual time and the “Time-Warp” algorithm provide an elegant
and powerful solution to this communication problem. The basic principle of this algorithm is that if
event A causes event B, then event A must be scheduled in real time before event B (note that our KMC
simulations have the simplifying property that an event, i.e. adsorption, desorption, reaction etc., is
“instantaneous”; thus there is no concept of an event’s duration). A high-level description of the
operations that need to be performed for the algorithm to work is as follows (more details and outlines
of the pertinent algorithms can be found in Ref. 11):

• At every KMC step, the algorithm checks the imminent internal process and the imminent
messaged process (from the message-queue). The one with the smallest waiting time is
executed.

• Execution of internal events happens “privately” and asynchronously among MPI processes.

• Every so often (e.g. every 100 KMC steps), an MPI process saves a snapshot of its KMC state in
the state-queue data-structure, to be used in case of a roll-back. The state-queue has a fixed size
and therefore a given number of “slots”. If all slots are used up at some point during the run, the
queue is “sparsified”: every other snapshot is deleted and the interval by which snapshots are
saved is doubled.

• If an MPI process A executes an event that affects sites at the halo (or domain) of another MPI
process B, then process A has to send a message that encodes a “do” action to process B. The
message includes a timestamp, which is equal to the time that the event of MPI process A was
executed. Every message is stored in the message-queue data-structure for future reference.

Figure 2: Demonstration of the cascade of rollbacks caused by an event at the boundary between MPI
processes 0 and 1 (timelines of the MPI processes from t = 0 to tfin, with messages and anti-messages
marked). At time t0 MPI process 0 sends a particle to MPI process 1 (panel a). Since the latter
has already advanced past that time, it has to roll back to t0, discard any history simulated from t0 to t1
and re-simulate (panel b, “Rollback of MPI Process 1”). However, that discarded history contains other
events communicated e.g. to process 2, which now has to roll back as well (panel c, “Rollback of MPI
Process 2”).

• If an MPI process receives a message:

o If the message has a timestamp in the past, then a roll-back has to be performed. The MPI
process reinstates the KMC state at the time just before the message’s timestamp and
schedules the message’s action as the first one to be executed. Then it goes through the
message-queue, finds all messages that were sent to other MPI processes during the
discarded history and sends corresponding anti-messages (so that others can undo the
corresponding actions).

o If the message has a future timestamp, it is just added to the message-queue.

• If an MPI process receives an anti-message:

o If the anti-message has a timestamp in the past, a rollback is performed in a similar way
as when a message is received. The only difference is that when the anti-message is
inserted in the message-queue, it annihilates the corresponding message.

o If the anti-message has a future timestamp, it just annihilates the corresponding
message from the message-queue.

• These operations are supplemented by a Global Control Mechanism that keeps track of the so-
called global virtual time (GVT), which is essentially the time up to which the local histories
simulated are self-consistent.

o The computation of the GVT happens via a global reduction operation at specified
intervals of real time, e.g. every 30 seconds.

o After every GVT computation, any messages or snapshots that are no longer needed are
deleted to save space. Note that the algorithm must always retain at least one snapshot
strictly before the GVT (to be precise, this is true if GVT > 0, whereas if GVT = 0, the first-
ever snapshot taken at the beginning of the simulation must be retained). This snapshot
is to be used in case of a “worst-case scenario rollback”, i.e. a rollback that takes us back
to time exactly equal to the GVT.

o GVT computation is also useful in the proper termination of the run, which happens
once all MPI processes have reached the final KMC time specified by the user and all
messages have been received.
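
The message handling rules listed above can be sketched with a toy model in Python. This is only an illustration under simplifying assumptions, not the Zacros implementation: the class and field names are hypothetical, the KMC state is reduced to a label, an anti-message is assumed to arrive after the message it cancels, and anti-messages are generated only for messages sent after the timestamp that triggered the roll-back. The gvt function uses the standard Time-Warp definition (minimum over all local clocks and all messages still in transit).

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Message:
    timestamp: float
    sender: int
    dest: int
    payload: str
    sign: int = 1            # +1: message ("do"), -1: anti-message ("undo")

    def anti(self):
        """The message that annihilates this one (same content, opposite sign)."""
        return Message(self.timestamp, self.sender, self.dest, self.payload, -self.sign)

@dataclass
class Process:
    rank: int
    now: float = 0.0
    snapshots: list = field(default_factory=lambda: [(0.0, "initial state")])
    inbox: list = field(default_factory=list)    # message-queue: received messages
    sent: list = field(default_factory=list)     # message-queue: messages sent to others

    def receive(self, msg):
        """Process an incoming (anti-)message; return the anti-messages to send out."""
        if msg.sign > 0:
            self.inbox.append(msg)
        else:
            self.inbox.remove(msg.anti())         # annihilate the matching message
        if msg.timestamp < self.now:              # timestamp in the past: roll back
            return self.roll_back(msg.timestamp)
        return []

    def roll_back(self, t):
        # Reinstate the latest snapshot taken strictly before t
        # (falling back to the very first snapshot if there is none).
        usable = [s for s in self.snapshots if s[0] < t] or self.snapshots[:1]
        self.snapshots = usable
        self.now = usable[-1][0]
        # Messages sent during the discarded history must be undone by their receivers.
        undone = [m for m in self.sent if m.timestamp > t]
        self.sent = [m for m in self.sent if m.timestamp <= t]
        return [m.anti() for m in undone]

def gvt(local_times, in_transit_timestamps):
    """Global virtual time: no process can be rolled back to before this time."""
    return min(list(local_times) + list(in_transit_timestamps))

# Process 1 is at t = 5.0 and has sent two messages to process 2.
p1 = Process(rank=1, now=5.0,
             snapshots=[(0.0, "s0"), (2.0, "s2"), (4.0, "s4")],
             sent=[Message(2.5, 1, 2, "hop"), Message(4.5, 1, 2, "hop")])
# A message from process 0 with timestamp 3.0 arrives "in the past".
anti_messages = p1.receive(Message(3.0, 0, 1, "hop"))
```

In this toy run, process 1 reinstates the snapshot taken at t = 2.0 and emits one anti-message for the message it had sent at t = 4.5, which in turn may force process 2 to roll back.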

Regarding the efficiency of these runs, it is claimed in Ref. 10 that, for sufficiently small halos with
respect to the non-overlapping domain, a cascade of rollbacks that would bring the whole simulation to
the beginning is unlikely, and the global time will progress about as fast as the slowest process. Keep in
mind, however, that the overheads incurred by the procedures just discussed (messaging, rollbacks,
snapshot saving, global communications) make the MPI Time-Warp implementation practical only for
sufficiently large lattices. Moreover, such runs require substantial amounts of available memory, so that
a sufficient number of KMC state snapshots can be retained in the state-queue. Our benchmarks on
simple and more complicated systems indicate that efficiency factors are heavily system-dependent, but
as a rule of thumb, we have seen that lattices with about a million sites or more exhibit linear scaling
up to at least 400 MPI processes.

Performing MPI Time-Warp Runs


To perform a Time-Warp run, Zacros should be compiled with the MPI options enabled and the objects
linked against the MPI libraries (see section Compiling Zacros starting on page 9). Note that in the
current version of Zacros, the following limitations apply to MPI runs:
• Only the binary heap is supported as the queueing system for lattice events.

• The lattice must be either a default or a unit-cell-defined lattice; explicitly defined lattices are
not supported (see section Lattice Input File, page 41 onwards).

• If a state input file is present, it can only contain individual seeding instructions
(seed_on_sites keywords), not multiple seeding instructions (seed_multiple blocks)
(see section Initial State Input File on page 61).
• Reporting of simulation observables must be on time (or off) (see section Reporting
Schemes, page 21), so that the output is synchronized (in simulated time) among the MPI
processes.
• Caching (see section Memory Management, page 37) is not supported due to the large memory
requirements thereof which make it impractical for distributed runs.
When Zacros is invoked by mpiexec (or mpirun or similar), via a command like:

mpiexec -np 400 /path/to/zacros.x

it detects how many MPI processes are available and tries to partition the lattice in such a way that:

• all subdomains are N × N tilings of the unit cell, and

• all MPI processes are used.


For instance, if the lattice is 2000 × 3000 unit cells, Zacros can partition this to 600 MPI processes, each
responsible for a subdomain of 100 × 100 unit cells (not including external halo sites). Another possible
partitioning is to 24 MPI processes, each responsible for a 500 × 500 subdomain. If the above two
conditions cannot be satisfied, the execution stops with an error. For instance, in our example, if, say, the
user has indicated that 100 MPI processes are available, the run would only be able to use up to 96 MPI
processes (each working on a subdomain of 250 × 250 unit cells). To avoid having idle MPI processes
(which might be due to negligence on the user side, but wastes computational resources), Zacros does
not proceed with the simulation, unless all MPI processes can be used.
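
Before submitting a job, it can be handy to list the MPI process counts that satisfy the two partitioning conditions for a given lattice. The following Python sketch does so under the assumption that every subdomain is a square block of N × N unit cells, with N dividing both lattice dimensions; the function name is hypothetical and the sketch is not part of Zacros.

```python
from math import gcd

def valid_partitions(nx, ny):
    """Map each admissible MPI process count to the subdomain side N (in unit cells)
    for a lattice of nx × ny unit cells split into N × N subdomains."""
    g = gcd(nx, ny)
    return {(nx // n) * (ny // n): n for n in range(1, g + 1) if g % n == 0}

parts = valid_partitions(2000, 3000)
available = 100
# Largest admissible process count that does not exceed what is available.
usable = max(count for count in parts if count <= available)
```

For the 2000 × 3000 example, 600, 96 and 24 are all admissible counts, whereas 100 is not; the largest admissible count not exceeding 100 is 96, in line with the discussion above.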

Keywords
We now proceed to discuss the relevant keywords that can be used in simulation_input.dat, to
control the behavior of Time-Warp runs.

© Michail Stamatakis July 27, 2024


Zacros 3.03 User Guide Page 34 of 89

random_streams expr The type of parallelization used in the random number
generator. Expression expr can be one of the following:

multi_seed Each MPI process will instantiate the same
random number generator but with a seed value
that is dependent on random_seed (see page
19) and its rank. Caution: this is known to be risky,
as it may lead to correlations among the random
streams of the different MPI processes. This is
implemented in Zacros for development and
experimental purposes.

jump_ahead A single random stream will be used for the
distributed simulation, but decomposed into sub-streams
as follows: each MPI process will first
instantiate the same random number generator
with the same seed value, which is given by
random_seed (see page 19). Then, each MPI
process will skip over a number of random
deviates. How many deviates are skipped depends
on the random number generator chosen; for
generators with small periods, the algorithm tries
to break the entire random sequence into as many
“chunks” as the MPI processes available. For
generators with very large periods, e.g. the
Mersenne Twister whose period is $2^{19937} - 1$, a fixed
number of $2^{256}$ deviates is skipped.

In the absence of a user-defined specification, jump_ahead is
used as the default option, and it is anyway advisable that this
approach be used to ensure high-quality simulations.
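
The idea behind jump_ahead can be illustrated with a toy generator. The Python sketch below uses a 64-bit linear congruential generator (not one of the generators used by Zacros) because its jump-ahead can be written in a few lines: composing the affine update with itself by repeated squaring skips k deviates in O(log k) operations. All names and constants are chosen for illustration only.

```python
M = 2**64                      # modulus of the toy generator
A = 6364136223846793005        # multiplier
C = 1442695040888963407        # increment

def step(x):
    """Advance the generator state by one deviate."""
    return (A * x + C) % M

def jump_ahead(x, k):
    """Advance the state by k deviates without generating them one by one."""
    acc_a, acc_c = 1, 0        # accumulated map, starts as the identity
    a, c = A, C                # map for 2**j steps, starts at one step
    while k > 0:
        if k & 1:
            acc_a, acc_c = (a * acc_a) % M, (a * acc_c + c) % M
        a, c = (a * a) % M, ((a + 1) * c) % M
        k >>= 1
    return (acc_a * x + acc_c) % M

def substream_starts(seed, n_processes, chunk):
    """Starting state of each process: the same seed, advanced by rank * chunk deviates."""
    return [jump_ahead(seed, rank * chunk) for rank in range(n_processes)]

# Break the whole period of the toy generator into 4 equal chunks.
starts = substream_starts(seed=12345, n_processes=4, chunk=M // 4)
```

Each process then draws deviates from its own starting state, so the sub-streams are non-overlapping segments of one and the same random sequence, provided that no process consumes more deviates than the chunk length.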

state_queue_container expr The type of state-queue data-structure used to store KMC
state snapshots. Expression expr can be one of the following:

linked_list A linked-list data-structure will be used as
the state-queue container. When saving a
snapshot new memory is allocated for the new
node of the list (the KMC state object), and when
deleting a snapshot, the corresponding memory
gets deallocated. In the case of frequent snapshot
saving, this allocation-deallocation may incur
some overhead.

vector A data structure based on a one-dimensional array
of KMC state objects (along with an indexing
array) will be used as the state-queue container.
The array gets pre-allocated at the start of the
run, so when saving a snapshot, a copy operation
is performed. Deleting a snapshot incurs negligible
overhead, since only the corresponding index is
deleted, rather than the entire object. Hence, the
vector data-structure can be more efficient.

In the absence of a user-defined specification, vector is used
as the default option.

state_queue_snapshots int real Specifies the initial frequency of snapshot saving and
the maximum allowed size of the state-queue. Thus, int is the
initial number of KMC steps between successive snapshots
and real is the maximum size in gigabytes (GB) of the state-
queue. In the absence of a user-defined specification,
snapshots are saved every 1000 steps, up to 0.5 GB of memory
allocation. If the queue reaches maximum capacity and a new
snapshot needs to be saved, the queue is then sparsified by
deleting every other snapshot, and the interval for snapshot
saving is doubled.
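
The sparsification rule can be sketched as follows in Python. This is a simplified illustration, not the Zacros data-structure: the capacity is expressed as a number of slots rather than in GB, the class name is hypothetical, and it is assumed that sparsification keeps the oldest snapshot and every second one after it, and that the new snapshot is saved right after the queue has been sparsified.

```python
class StateQueue:
    """Fixed-capacity queue of KMC state snapshots with sparsification."""

    def __init__(self, capacity, interval):
        self.capacity = capacity     # number of slots (stand-in for the GB limit)
        self.interval = interval     # KMC steps between successive snapshots
        self.snapshots = []          # list of (kmc_step, state) pairs, oldest first

    def maybe_save(self, step, state):
        """Save a snapshot if the current step falls on the saving interval."""
        if step % self.interval != 0:
            return
        if len(self.snapshots) == self.capacity:
            self.sparsify()
        self.snapshots.append((step, state))

    def sparsify(self):
        """Delete every other snapshot and double the saving interval."""
        self.snapshots = self.snapshots[::2]
        self.interval *= 2

queue = StateQueue(capacity=4, interval=100)
for kmc_step in range(0, 801):
    queue.maybe_save(kmc_step, state=f"state at step {kmc_step}")
saved_steps = [s for s, _ in queue.snapshots]
```

In this run the queue fills up at steps 400 and 800, so it is sparsified twice: the interval grows from 100 to 400 steps and the retained snapshots end up evenly spaced at steps 0, 400 and 800.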

time_interval_gvt_computation int Specifies the interval, in seconds of clock time, at
which a global communication happens in order to compute
the GVT, clear old messages and snapshots and decide on run
termination. In the absence of a user-defined specification, 30
seconds is used as the default option.

Validation of MPI Time-Warp Runs


The serial code of Zacros implements a parallel emulation scheme that enables the validation of the
distributed (MPI Time-Warp) runs. This scheme runs exactly the same way as the “original” serial KMC
implementation of Zacros, with the only difference being the random numbers generated in the
simulation. More specifically, in parallel emulation runs, one makes use of as many random number
streams as the number of (emulated) MPI processes. When generating random times for new events, or
updating the random times of existing events, the pertinent subroutines check which MPI process
the new (or updated) event would belong to, and consequently draw the next random number from the
correct stream. The MPI Time-Warp implementation is validated when the parallel
emulation run yields the same results as the truly distributed run (obtained with the MPI code). † The
pertinent keyword is discussed below.

parallel_emulation int1 int2 Emulates, via a sequential algorithm, a distributed
Time-Warp run with a configuration of int1 × int2 MPI
processes. This keyword can only be used in non-MPI runs
(refer to section Compiling Zacros on page 9 for instructions on
how to compile a serial or OpenMP-threaded executable).
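
The stream selection performed in parallel emulation can be pictured with the following Python sketch. It is an illustration only: the names are hypothetical, Python's random module stands in for the Zacros generators, the streams are seeded per rank for brevity, and the rank numbering (running fastest along y) is an assumption chosen to mirror Figure 1.

```python
import math
import random

def owner_rank(ix, iy, nx, ny, px, py):
    """Rank of the emulated MPI process owning unit cell (ix, iy) on an nx × ny
    lattice split into px × py subdomains (assumed numbering: fastest along y)."""
    return (ix * px // nx) * py + (iy * py // ny)

class EmulatedStreams:
    """One random stream per emulated MPI process, all held by a serial run."""

    def __init__(self, seed, px, py):
        # Assumption: one independently seeded stream per rank, for illustration.
        self.streams = [random.Random(seed + rank) for rank in range(px * py)]

    def waiting_time(self, rank, rate):
        """Exponentially distributed waiting time drawn from the stream of `rank`."""
        u = self.streams[rank].random()          # u lies in [0, 1)
        return -math.log(1.0 - u) / rate

emulation = EmulatedStreams(seed=42, px=2, py=2)
rank = owner_rank(ix=5, iy=0, nx=6, ny=6, px=2, py=2)
dt = emulation.waiting_time(rank, rate=1.0e3)
```

Because an event always draws from the stream of the subdomain it belongs to, the sequence of deviates consumed by each subdomain does not depend on the order in which the serial code visits the subdomains, which is what makes the comparison with a truly distributed run possible.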

Note that in the current version of Zacros, the following limitations apply to parallel emulation runs:
• In simulations in which events with equal timestamps arise, the trajectory obtained with parallel
emulation may differ from that of the truly distributed runs. This happens because our MPI
Time-Warp implementation contains special rules to determine the priority of equal-timestamp
events (thereby avoiding inconsistencies or deadlocks). We have not implemented these rules in
the parallel emulation algorithm, in order to keep as simple as possible and also because equal
timestamp events are normally quite rare.
• Parallel emulation runs do not currently support the specification of an initial state. If a
state_input.dat file is found in the working directory the simulation stops with an error.

• The average waiting times reported in file procstat_output.txt of a parallel emulation
run are never identical to those calculated by “merging” the results of files
procstat_output_*.txt generated by the MPI processes performing a Time-Warp run.
However, these reported times should be close to each other. To obtain identical results would
necessitate extremely frequent global communications among all MPI processes, which is not
practical.
It is important to note that parallel emulation runs do not impose some restrictions of the MPI Time-
Warp runs (which are detailed in section Performing MPI Time-Warp Runs on page 33). Thus, parallel
emulation runs are slightly more permissive, e.g. they allow queueing systems other than the binary
heap and they support partitioning the lattice into subdomains that are not square tilings of the unit
cell. Future development planned for the MPI code will enable these features and this is why they are
enabled in the parallel emulation scheme. However, please keep in mind that, in order to validate your
MPI runs, the same settings (e.g. lattice size, MPI configuration, queueing approach) must be used for
the two runs that are being compared.

Finally, one should keep in mind that, since the parallel emulation scheme is based on the sequential
KMC algorithm, it will be much less efficient than the Time-Warp implementation for simulations on
very large lattices. It is therefore advisable that the validation of MPI runs be done on relatively small to
medium sized lattices.


† In the event that you see discrepancies between the results of parallel emulation runs and those of MPI runs,
please first double-check that the two simulation setups are indeed comparable and that your simulations
are not subject to the limitations discussed later in this section. If you still cannot explain the discrepancies,
please notify us, as this would mean that there may be a bug in the simulator.


Memory Management
During a KMC simulation, Zacros keeps a lot of information in the memory of the computer, notably:

1. all the elementary events that are possible given the current lattice configuration, up to a
maximum number of $N_{\mathrm{max\,events}}$ such events,
2. references to the events in which each and every adsorbate on the lattice participates, up to a
maximum of $N_{\mathrm{max\,events}}^{\mathrm{adsorb.}}$ events per adsorbate,
3. all the energetic clusters that “make up” the total lattice energy of the current
configuration, up to a maximum number of $N_{\mathrm{max\,clusters}}$ such clusters,
4. references to the clusters in which each and every adsorbate on the lattice participates, up to a
maximum of $N_{\mathrm{max\,clusters}}^{\mathrm{adsorb.}}$ clusters per adsorbate.

The aforementioned maximum numbers, $N_{\mathrm{max\,events}}$, $N_{\mathrm{max\,events}}^{\mathrm{adsorb.}}$, $N_{\mathrm{max\,clusters}}$ and $N_{\mathrm{max\,clusters}}^{\mathrm{adsorb.}}$, have a
direct effect on the memory footprint of a Zacros run: the larger their values, the larger the memory
needed for that run. Before Zacros 3.0, these maximum numbers were calculated using heuristic
expressions and remained fixed during the entire run. However, for some simulated systems these were
overestimating the memory needed (thereby allocating much more memory than what was actually
needed), while for other systems, the memory allocated turned out not to be enough (which would lead
to abnormal terminations of the run). In the latter cases, the user had to change certain parameters in
the code (used to calculate these maximum values) and recompile Zacros. Starting with version 3.0,
Zacros provides an interface for the user to easily manipulate these parameters and optimize memory
utilization. This is very useful for MPI runs with the Time-Warp algorithm (see section Simulating Very
Large Lattices on page 29), because such runs can be quite memory intensive.

Hence, the following equations are used to calculate the maximum numbers noted earlier:

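$$N_{\mathrm{max\,events}} = \mu_{\mathrm{events}} \cdot N_{\mathrm{sites}} \qquad (1)$$

$$N_{\mathrm{max\,events}}^{\mathrm{adsorb.}} = \mu_{\mathrm{events}}^{\mathrm{adsorb.}} \qquad (2)$$

$$N_{\mathrm{max\,clusters}} = \lambda_{\mathrm{clusters}} \cdot N_{\mathrm{sites}} \qquad (3)$$

$$N_{\mathrm{max\,clusters}}^{\mathrm{adsorb.}} = \lambda_{\mathrm{clusters}}^{\mathrm{adsorb.}} \qquad (4)$$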

The parameters $\mu_{\mathrm{events}}$, $\mu_{\mathrm{events}}^{\mathrm{adsorb.}}$, $\lambda_{\mathrm{clusters}}$ and $\lambda_{\mathrm{clusters}}^{\mathrm{adsorb.}}$ are of type integer in Zacros 3.0 and can be
manipulated directly by the user using the keyword override_array_bounds (discussed in more
detail below). Keep in mind that the aforementioned maximum values still remain fixed throughout the
run, and thus, one should be careful to allow enough leeway to accommodate the fluctuating
numbers of events or clusters detected during the simulation. The memory utilization report (see
section Memory Usage Statistics on page 69) produced at the end of the run is very useful in deciding
the values of the parameters entering the equations above. Note also that the parameters are in
principle size-invariant; therefore, once “good” values are determined for a lattice of a certain size, the
same values can in principle be used for a larger or a smaller lattice. ‡

The default values of the parameters are: $\mu_{\mathrm{events}}$ = 50, $\mu_{\mathrm{events}}^{\mathrm{adsorb.}}$ = 200, $\lambda_{\mathrm{clusters}}$ = 50 and $\lambda_{\mathrm{clusters}}^{\mathrm{adsorb.}}$ = 60. These
values are merely estimates and cannot be efficient for all systems. They could underestimate the
memory needed for complicated systems or overestimate it for simple systems. As an example, in a
simple system for which the cluster expansion contains only single-body patterns, one only needs
$\lambda_{\mathrm{clusters}}$ = 1 and $\lambda_{\mathrm{clusters}}^{\mathrm{adsorb.}}$ = 1, since every single adsorbate can be involved in at most one pattern and we
can have at most $N_{\mathrm{sites}}$ adsorbates on the lattice (therefore $N_{\mathrm{sites}}$ energetic clusters). Thus, in this case the
default values overestimate the memory and can be overridden by the keyword discussed in the following.

override_array_bounds expr Allows the user to override parameters that control the
memory footprint of the simulation. Expression expr can be a
quadruplet of integers, int1 int2 int3 int4 which
specify the values of $\mu_{\mathrm{events}}$, $\mu_{\mathrm{events}}^{\mathrm{adsorb.}}$, $\lambda_{\mathrm{clusters}}$ and $\lambda_{\mathrm{clusters}}^{\mathrm{adsorb.}}$,
respectively. Any of int1,…,int4 (or all four of them) can
be replaced by the ampersand character (&), denoting that we
wish to rely on the default value for the corresponding
parameter. In any case, exactly four arguments (integers or the
& character) must appear after this keyword.

Caution: if the values of parameters $\mu_{\mathrm{events}}$, $\mu_{\mathrm{events}}^{\mathrm{adsorb.}}$, $\lambda_{\mathrm{clusters}}$ and $\lambda_{\mathrm{clusters}}^{\mathrm{adsorb.}}$ underestimate the memory
needed for the run, the latter will terminate with an error (also printing an advice note to override these
values). When simulating a complicated system, it is advisable to either rely on the default values, or if
they fail, specify quite high/permissive values for these parameters. You can optimize the memory
utilization later, after perusing the memory utilization report printed in general_output.txt at the
end of the run (see section Memory Usage Statistics on page 69).
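
The effect of these parameters can be worked out with a few lines of Python. The sketch below evaluates equations (1) to (4) and assembles the corresponding override_array_bounds line; the function and argument names are hypothetical, and the default values are the ones quoted above.

```python
def array_bounds(n_sites, mu_events=50, mu_events_adsorb=200,
                 lambda_clusters=50, lambda_clusters_adsorb=60):
    """Maximum array sizes according to equations (1) to (4)."""
    return {
        "max_events": mu_events * n_sites,                    # eq. (1)
        "max_events_per_adsorbate": mu_events_adsorb,         # eq. (2)
        "max_clusters": lambda_clusters * n_sites,            # eq. (3)
        "max_clusters_per_adsorbate": lambda_clusters_adsorb  # eq. (4)
    }

def override_line(mu_events=None, mu_events_adsorb=None,
                  lambda_clusters=None, lambda_clusters_adsorb=None):
    """Keyword line for simulation_input.dat; None means "keep the default" (&)."""
    args = (mu_events, mu_events_adsorb, lambda_clusters, lambda_clusters_adsorb)
    return "override_array_bounds " + " ".join(
        "&" if a is None else str(int(a)) for a in args)

# Cluster expansion with single-body patterns only: one cluster per site suffices.
line = override_line(lambda_clusters=1, lambda_clusters_adsorb=1)

# The same multiplier gives very different absolute headroom on different lattices.
large = array_bounds(10_000, mu_events=2)["max_events"]
small = array_bounds(100, mu_events=2)["max_events"]
```

Here line evaluates to "override_array_bounds & & 1 1", while a multiplier of 2 for the events allows 20,000 events on a 10,000-site lattice but only 200 events on a 100-site lattice, which is why some leeway is advisable on small lattices.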

Accelerators
Specialized exact algorithms that exploit certain properties of the simulated system to accelerate the
simulation are described in the following.


‡ We say “in principle” because one should always keep in mind that a KMC simulation is stochastic, and the
relative magnitude of the random fluctuations depends on the system size, which may lead to issues. As an
example, say that for a simulation on a lattice with 10,000 sites we have found that, at stationary conditions,
the number of events fluctuates between 17,000 and 19,000. We thus set $\mu_{\mathrm{events}}$ = 2, so that we can accommodate
up to 20,000 events in the memory. If we now use this setting on a lattice with 100 sites, we would be able to
accommodate up to 200 events in the memory. However, due to the smaller system size, the fluctuations
relative to the average may now be larger, e.g. the number of events may fluctuate between 150 and 210. The
maximum value of 210 events on the lattice can no longer be accommodated by the setting $\mu_{\mathrm{events}}$ = 2. This is
why it is important to allow for some leeway, as noted in the discussion.


cache_product_clusters Enables a product caching algorithm by which the counts of the
energetic interaction clusters of each possible reaction with
spectators are cached. This is an exact acceleration scheme
and has been observed to significantly speed up simulations,
particularly for systems with long-range interactions.12
OpenMP parallelization is supported by this scheme’s
implementation as of version 3.0. To find out whether your
simulations would benefit from this scheme, it is
recommended that you do your own performance testing
before using it for production.

Moreover, Zacros implements accelerators for simple systems, which are enabled automatically
(without the user having to provide a keyword). The first accelerator applies to systems in which the
cluster expansion contains only single-site (single-body) terms. For such systems, the subgraph
isomorphism procedures for the detection of energetic interaction patterns are skipped altogether. If
Zacros detects that a system is amenable to simulation with this accelerator, the following message will
appear in the energetics setup section of file general_output.txt:

This cluster expansion involves only one-site (single-body) patterns.

The second accelerator is for systems in which the reaction mechanism involves monodentate species
participating in events spanning up to two sites and not involving any geometric criteria. In such cases, a
custom procedure is invoked that checks the occupancy of nearest neighbor sites and detects the
possible events, thereby skipping the subgraph isomorphism procedures for event detection (which are
of course general but more computationally intensive). The following message will appear in the
mechanism setup section of general_output.txt when this accelerator is enabled:

This mechanism contains up to two-site events involving only
monodentate species (all of which have no geometric specifications).

Troubleshooting
In addition to the aforementioned keywords there are some “debugging” keywords that may prove
particularly useful for troubleshooting a KMC simulation. These keywords trigger internal checks or
enable the output of detailed information in human-readable format, allowing the user to see “what is
going on” during the simulation. Note that these debugging procedures will significantly slow down
execution and/or result in large output files, so they should only be used in short runs (not for results
production).

debug_report_global_energetics Triggers the output of information pertinent to the
data-structures storing the energetic pattern contributions to
the total lattice energy. The information is written to file
globalenerg_debug.txt and includes: (i) the detection
of new energetic clusters upon initialization of the simulation or
whenever new species appear in the lattice after a KMC step,
(ii) the deletion of energetic clusters whenever species
disappear from the lattice after a KMC step, (iii) the re-indexing of
an energetic cluster when another cluster in the “middle” of
the stack is being deleted, so that no “holes” exist in the
data-structures, (iv) the total lattice energy at every KMC step (for
more details see section Energetics Lists Output File, page 74).

debug_report_processes Triggers the output of information pertinent to the queue of
KMC lattice processes in debugging-output file
process_debug.txt. Information written therein includes:
(i) the detection of new elementary processes upon
initialization of the simulation or whenever new species appear
in the lattice after a KMC step, (ii) the deletion of elementary
processes whenever species disappear from the lattice after a
KMC step, (iii) the re-indexing of an elementary process
when another process in the “middle” of the queue is being
deleted, so that no “holes” exist in the queuing
data-structures, (iv) the update in the rates of elementary
processes, as a result of energetic interactions emerging from
newly appearing species in the lattice (for more details see
section Process Debug Output File, page 77).

debug_newtons_method Triggers the output of information pertinent to the
Newton-Raphson method, when simulating systems in which the
temperature is not constant. The subsequent approximations
to the solution along with convergence information are written
to file newton_debug.txt.

debug_check_processes Triggers an internal check verifying the self-consistency of the
data-structures pertinent to the KMC processes being queued
and executed. If a problem is found, the program produces an
error and terminates.§

debug_check_lattice Triggers an internal check verifying the self-consistency of the
data-structures representing the lattice state. If a problem is
found, the program produces an error and terminates.§

debug_check_caching Triggers an internal check verifying the self-consistency of the
data-structures of the product caching scheme. If a problem is
found, the program produces an error and terminates.§

§ Please notify us immediately if you encounter any of these errors, as they may indicate a
bug in the simulator itself.


Finally, the following keyword is used to terminate parsing in the simulation input file:

finish This keyword marks the end of input. Any subsequent
information will not be parsed.

Lattice Input File


The file lattice_input.dat defines the lattice structure on which species can bind, diffuse and
react. There are three ways of specifying a lattice structure, as discussed in the following.

lattice expr
…
end_lattice
Defines a lattice specification block. Expression expr can be
one of the following:

default_choice allows the user to specify one of the
available default lattices (explained in section
Default Lattices below).

periodic_cell allows the user to construct a lattice by
giving information about the unit cell (explained in
section Unit-Cell-Defined Periodic Lattices below).

explicit allows the user to import a custom (possibly
non-periodic) lattice structure generated manually or
by a different program (explained in section
Explicitly Defined Custom Lattices below).

The permitted keywords for each of the aforementioned options are discussed in the following.

Default Lattices
Currently there are three possible default lattices, all of which are periodic with coordination numbers
equal to 3, 4 and 6 (Figure 3). In these lattices all sites are equivalent (single site type). The name of this
site type is by default StTp1. Inside a default lattice block (lattice default_choice …
end_lattice) the following keywords are allowed:

triangular_periodic real int1 int2 Specifies a lattice with coordination number 3.
The real number defines the lattice constant whereas the two
integers give the number of copies of the unit cell in the
horizontal and vertical directions, respectively for int1 and
int2. Note that the unit cell for this default lattice is not the
primitive unit cell, and it contains 4 sites.

rectangular_periodic real int1 int2 As above for a lattice with coordination number 4.
For this lattice, the unit cell is the primitive cell.


Figure 3: Default lattices in Zacros. Left: triangular (coordination number, CN = 3). Middle: rectangular
(CN = 4). Right: hexagonal (CN = 6). The blue lines connect the 1st nearest neighbors of the lattice. The
black lines denote the simulation box and the thick red lines the unit cell.

hexagonal_periodic real int1 int2 As above for a lattice with coordination number 6.
For this lattice, the unit cell is not the primitive unit cell, and it
contains 2 sites.
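As a quick size check for a default lattice, the total number of sites is the number of sites per unit cell multiplied by int1 × int2. The sketch below assumes 4 and 2 sites per cell for the triangular and hexagonal lattices (as stated above) and 1 site for the rectangular lattice (an inference from its unit cell being primitive with a single site type). The function and dictionary names are illustrative and not part of Zacros.

```python
# Sites per unit cell of each default lattice. The values 4 and 2 are stated in
# the keyword descriptions; the value 1 for the rectangular lattice is inferred
# from its unit cell being the primitive cell (assumption).
SITES_PER_CELL = {
    "triangular_periodic": 4,
    "rectangular_periodic": 1,
    "hexagonal_periodic": 2,
}


def total_sites(keyword: str, int1: int, int2: int) -> int:
    """Total number of lattice sites for int1 x int2 copies of the unit cell."""
    return SITES_PER_CELL[keyword] * int1 * int2


print(total_sites("hexagonal_periodic", 10, 10))  # 200
```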

Note that, for MPI runs with default lattices, the number of MPI processes that can be used is subject to
the restrictions discussed in Performing MPI Time-Warp Runs on page 33.

Unit-Cell-Defined Periodic Lattices


Zacros allows the user to define custom periodic lattices by providing information about the unit cell
geometry, the sites contained therein, and the neighboring relations between sites in the same cell as
well as neighboring cells. Inside a periodic lattice block (lattice periodic_cell …
end_lattice) the following keywords are allowed:
cell_vectors
real1 real2
real3 real4
This keyword is followed by two pairs of reals on two
subsequent lines, as shown above, which define the unit
cell vectors. The two vectors are thus α = (real1, real2)
and β = (real3, real4).

repeat_cell int1 int2 The number of repetitions of the unit cell in the directions of
unit cell vectors α and β, respectively for int1 and int2.

n_cell_sites int The total number of sites (of any site type) in the unit cell.

n_site_types int The number of different site types. For instance, if one needs
to model a lattice for the Pt(111) surface taking into account
top, bridge, fcc and hcp sites, then int should be 4 (see also
Figure 4).

site_type_names str1 str2 … The names of the different site types. There should be as many
strings following this keyword as the number of site types
specified by n_site_types. In the example just mentioned
for the Pt(111) surface lattice, the expression used can be:
site_type_names top brg fcc hcp (see Figure 4).

Index Name
1 top
2 brg
3 fcc
4 hcp

site_types 1 1 2 2 2 2 2 2 3 3 4 4
site_types top top brg brg brg brg brg brg fcc fcc hcp hcp

Figure 4: Lattice representing the (111) surface of an FCC metal, for instance Pt(111). Numbered are only
the sites belonging to one unit cell, which is denoted by thick black lines. The table lists the
4 different site types, along with the index and name of each one. The two site_types lines are
equivalent expressions that define the types of all 12 sites within the unit cell.

site_types expr The site types for each of the different sites of the unit cell.
Expression expr consists of as many strings or integers as
the number of sites in the unit cell, specified by n_cell_sites.
There are two options for expr (see Figure 4):

int1 int2 … expresses the site types in terms of their
indexes.

str1 str2 … expresses the site types in terms of their
names as specified by site_type_names.

site_coordinates
real1 real2
real3 real4
…
This keyword is followed by lines containing pairs of real
numbers specifying the “fractional coordinates” of each site in
the unit cell, as shown above. There should be as many
lines as the number of sites in the unit cell, specified by
n_cell_sites. The “fractional coordinates” are with
respect to the unit cell vectors α and β defined by
cell_vectors. Thus, the Cartesian coordinates of site 1 will
be: real1·α + real2·β, of site 2: real3·α + real4·β, etc.
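The conversion from fractional to Cartesian coordinates described above can be sketched in a few lines of Python. The cell vectors and site coordinates used here are hypothetical values chosen for illustration only; the function name is likewise not part of Zacros.

```python
def fractional_to_cartesian(frac, alpha, beta):
    """Return f1*alpha + f2*beta for fractional coordinates (f1, f2)."""
    f1, f2 = frac
    return (f1 * alpha[0] + f2 * beta[0],
            f1 * alpha[1] + f2 * beta[1])


# Hypothetical, non-orthogonal unit cell vectors (as given after cell_vectors).
alpha = (2.0, 0.0)
beta = (1.0, 2.0)

# Hypothetical fractional coordinates (as given after site_coordinates).
fractional_sites = [(0.0, 0.0), (0.5, 0.5)]

for number, frac in enumerate(fractional_sites, start=1):
    print(number, fractional_to_cartesian(frac, alpha, beta))
# 1 (0.0, 0.0)
# 2 (1.5, 1.0)
```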


neighboring_structure
1-2 self
1-1 north
1-1 east
1-1 southeast
2-1 north
2-2 north
2-1 east
2-2 east
2-2 southeast
end_neighboring_structure

Figure 5: Lattice representing the (111) surface of an FCC metal, with the fcc and hcp sites taken into
account. Numbered are only the sites belonging to the central unit cell, as well as the north, northeast,
east, and southeast neighboring unit cells (N, NE, E, SE, respectively). Within each cell, site 1 is the fcc,
whereas the hcp is site 2. The links of sites of the central unit cell with sites in neighboring cells are
depicted. The neighboring structure that gives rise to this lattice is listed above the caption.

neighboring_structure
int1-int2 keywrd
…
end_neighboring_structure
Defines a neighboring structure block containing an arbitrary
number of expressions formed by two integers separated by a
dash and a keyword (see Figure 5 for an example). The latter
can be one of the following: self, north, northeast,
east, southeast. Each of these expressions specifies a
“link” between two sites, making them 1st nearest neighbors.
For example, if sites 1 and 2 in the same unit cell need to be
specified as neighbors, the expression to be used is: 1-2
self. Note that the order in this case does not matter; thus
the expression just noted is equivalent to 2-1 self.
Moreover, if site 1 neighbors with its own image in the unit
cell above, the expression would be 1-1 north. Note that if
different sites are defined as neighbors across unit cells, the
order matters, namely 1-2 northeast is not equivalent to
2-1 northeast.
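To see how such a block translates into site connectivity, the following Python sketch expands the neighboring structure of Figure 5 over a small periodic lattice and counts the neighbors of every site. It rests on two assumptions that are not spelled out above: that the keywords correspond to the cell offsets (0,0), (0,1), (1,1), (1,0) and (1,-1) in units of the cell vectors α and β, and that in "int1-int2 keywrd" the site int1 lies in the central cell and int2 in the neighboring one. All function names are illustrative.

```python
from collections import defaultdict

# Assumed cell offsets (in units of alpha, beta) for each direction keyword.
OFFSETS = {
    "self": (0, 0),
    "north": (0, 1),
    "northeast": (1, 1),
    "east": (1, 0),
    "southeast": (1, -1),
}

# Neighboring structure of Figure 5: (int1, int2, keyword).
FIGURE5 = [
    (1, 2, "self"), (1, 1, "north"), (1, 1, "east"), (1, 1, "southeast"),
    (2, 1, "north"), (2, 2, "north"), (2, 1, "east"), (2, 2, "east"),
    (2, 2, "southeast"),
]


def build_links(structure, n1, n2):
    """Set of undirected links on an n1 x n2 periodic lattice.

    A site is identified by the tuple (cell_i, cell_j, site_index).
    """
    links = set()
    for i in range(n1):
        for j in range(n2):
            for s1, s2, keyword in structure:
                di, dj = OFFSETS[keyword]
                here = (i, j, s1)
                there = ((i + di) % n1, (j + dj) % n2, s2)
                links.add(frozenset((here, there)))
    return links


def coordination_numbers(links):
    """Number of 1st nearest neighbors of every site."""
    counts = defaultdict(int)
    for link in links:
        for site in link:
            counts[site] += 1
    return dict(counts)


links = build_links(FIGURE5, 3, 3)
print(len(links), set(coordination_numbers(links).values()))  # 81 {9}
```

Under these assumptions every fcc or hcp site ends up with 9 neighbors: 6 of its own type and 3 of the other type.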

Note that, for MPI runs with unit-cell-defined lattices, the number of MPI processes that can be used is
subject to the restrictions discussed in Performing MPI Time-Warp Runs on page 33.

Explicitly Defined Custom Lattices


Zacros can also accept custom lattice structures which may be created manually (yet, note that custom
lattices are not supported in the MPI version of Zacros; for a full list of limitations see Performing MPI
Time-Warp Runs on page 33). Inside an explicit lattice block (lattice explicit … end_lattice)
the following keywords are allowed (see also Figure 6 for an example of a custom lattice):


cell_vectors
real1 real2
real3 real4
This keyword is followed by two pairs of reals on two
subsequent lines, as shown above, which define the unit
cell vectors. The two vectors are thus α = (real1, real2)
and β = (real3, real4). Unlike the unit-cell-defined periodic
lattice, the cell_vectors keyword here is optional, used
only if the lattice defined is intended as periodic. Moreover,
the site coordinates inside the lattice_structure block
(see below) are always given as Cartesian coordinates for an
explicit lattice (irrespective of periodicity).

n_sites int The total number of sites in the entire lattice.

max_coord int The maximum coordination number, namely the maximum
number of 1st nearest neighbors, for any of the lattice sites.

n_site_types int The number of different site types (as previously; see section
Unit-Cell-Defined Periodic Lattices, page 42).

site_type_names str1 str2 … The names of the different site types. There should be as many
strings following this keyword as the number of site types
specified by n_site_types (as previously; see section Unit-
Cell-Defined Periodic Lattices, page 42).

lattice_structure
expr1
expr2
…
end_lattice_structure
This keyword is followed by expressions expr1, expr2, …
containing information about each and every site of the lattice.
Thus, there should be as many expressions as the number of
sites in the entire lattice, specified by n_sites. Each of these
expressions has the following form (see also Figure 6 for an
illustrative example):

int1 real1 real2 int2 int3 int4 int5 …

where:

int1 is the index of the site (ranging from 1 to the number of
sites specified by n_sites).

real1 and real2 are the x and y Cartesian coordinates of
the site with index int1.

int2 gives the site type of the site with index int1. In place
of this integer, one can also use a string denoting a site type
name as specified by keyword site_type_names, as done
in Figure 6 (see also the description of site_types in
section Unit-Cell-Defined Periodic Lattices, page 42).

Index Name
1 cn2
2 cn4
3 br42
4 br44

lattice_structure # Au6 nanocluster structure
1 0.0000e+0 0.0000e+0 cn2 2 2 6
2 1.4425e+0 0.0000e+0 br42 2 1 3
3 2.8850e+0 0.0000e+0 cn4 4 2 4 7 8
4 4.3275e+0 0.0000e+0 br42 2 3 5
5 5.7700e+0 0.0000e+0 cn2 2 4 9
6 7.2125e-1 1.2492e+0 br42 2 1 10
7 2.1637e+0 1.2492e+0 br44 2 3 10
8 3.6062e+0 1.2492e+0 br44 2 3 12
9 5.0487e+0 1.2492e+0 br42 2 5 12
10 1.4425e+0 2.4985e+0 cn4 4 6 7 11 13
11 2.8850e+0 2.4985e+0 br44 2 10 12
12 4.3275e+0 2.4985e+0 cn4 4 8 9 11 14
13 2.1637e+0 3.7477e+0 br42 2 10 15
14 3.6062e+0 3.7477e+0 br42 2 12 15
15 2.8850e+0 4.9970e+0 cn2 2 13 14
end_lattice_structure

Figure 6: Lattice representing the surface of a Au6 nanocluster, along with the corresponding
lattice_structure definition. For this example, n_sites would be set to 15 and max_coord to
4 (the maximum of column 5).

int3 gives the coordination number of the site with index
int1.

int4 int5 … give the 1st nearest neighbors of the site with
index int1. There should be as many neighbors listed as the
value of int3. If there are more integers listed in this line, the
extra entries will be ignored.
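Before running a simulation on a hand-written explicit lattice, it can be useful to cross-check the lattice_structure lines against n_sites and max_coord. The Python sketch below parses the site lines of Figure 6 following the field layout described above. The check that every link is listed from both sites is our own sanity test; whether Zacros requires or enforces it is not stated in this section. All function names are illustrative.

```python
# Site lines copied from the lattice_structure block of Figure 6.
AU6_LINES = """\
1 0.0000e+0 0.0000e+0 cn2 2 2 6
2 1.4425e+0 0.0000e+0 br42 2 1 3
3 2.8850e+0 0.0000e+0 cn4 4 2 4 7 8
4 4.3275e+0 0.0000e+0 br42 2 3 5
5 5.7700e+0 0.0000e+0 cn2 2 4 9
6 7.2125e-1 1.2492e+0 br42 2 1 10
7 2.1637e+0 1.2492e+0 br44 2 3 10
8 3.6062e+0 1.2492e+0 br44 2 3 12
9 5.0487e+0 1.2492e+0 br42 2 5 12
10 1.4425e+0 2.4985e+0 cn4 4 6 7 11 13
11 2.8850e+0 2.4985e+0 br44 2 10 12
12 4.3275e+0 2.4985e+0 cn4 4 8 9 11 14
13 2.1637e+0 3.7477e+0 br42 2 10 15
14 3.6062e+0 3.7477e+0 br42 2 12 15
15 2.8850e+0 4.9970e+0 cn2 2 13 14
"""


def parse_lattice_structure(text):
    """Parse lines of the form: int1 real1 real2 int2 int3 int4 int5 ..."""
    sites = {}
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop trailing comments
        if not fields:
            continue
        index = int(fields[0])
        x, y = float(fields[1]), float(fields[2])
        site_type = fields[3]
        coord = int(fields[4])
        neighbors = [int(f) for f in fields[5:5 + coord]]  # extras ignored
        if len(neighbors) != coord:
            raise ValueError(f"site {index}: expected {coord} neighbors")
        sites[index] = {"xy": (x, y), "type": site_type, "neighbors": neighbors}
    return sites


def one_way_links(sites):
    """Pairs (a, b) where a lists b as a neighbor but b does not list a."""
    return [(a, b) for a, site in sites.items() for b in site["neighbors"]
            if a not in sites[b]["neighbors"]]


sites = parse_lattice_structure(AU6_LINES)
n_sites = len(sites)
max_coord = max(len(site["neighbors"]) for site in sites.values())
print(n_sites, max_coord, one_way_links(sites))  # 15 4 []
```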

How to Determine Lattice Connectivity


At this point, one might wonder: what is the main consideration when defining the neighboring
structure for a lattice? For instance, in the lattice of Figure 6, why did we not explicitly define the bridge
sites (br42 and br44) to be neighbors with each other? The answer lies in the way species bind to
the lattice sites and the types of reactions that can occur between these species. As rules of thumb:

• If multidentate species are present in the chemistry under investigation, the sites onto which each
dentate binds have to be neighbors with each other. For instance, a carbonate species (CO3) can bind
on Au6 in a top-bridge-top configuration involving sites cn2-br42-cn4 (see lattice of Figure 6).13 Thus,
these sites have been defined as neighboring in this lattice structure.

• For reactions that occur between adsorbed particles, there have to be one or more links between the
sites occupied by the different reactant molecules. Thus, on Au6 (Figure 6), representing a reaction
between CO bound to cn2 and O2 bound to cn4 in the absence of any adparticle on br42,
necessitates a neighboring relation between cn2-br42-cn4. This reaction would give CO2(gas) and a
left-over O adatom at br42.13

Energetics Input File


The file energetics_input.dat defines a cluster expansion Hamiltonian14 to be used for
calculating the energy of a given lattice configuration. According to this approach, the total energy of
the system is made up of single-body contributions, as well as two-body or many-body contributions due
to energetic interactions. For instance, for a single adsorbate that binds to a single site type on a lattice,
the single-body contribution can be the formation energy of a single/isolated adsorbing molecule, a
two-body term may capture the 1st nearest-neighbor (1NN) pairwise interaction, another two-body term
could capture the 2NN pairwise interaction etc. For a given configuration σ of adsorbates on the lattice,
the instances of each type of energetic contribution (referred to as interaction patterns or clusters) are
counted and the total energy of the lattice, E, is computed as:

$$E(\sigma) = \sum_{k=1}^{N_C} \frac{NCI_k(\sigma)}{GM_k} \cdot ECI_k \tag{5}$$

where $N_C$ is the number of clusters in the cluster expansion Hamiltonian, $NCI_k$ the number of instances
of cluster k on the lattice (which is a function of σ), $ECI_k$ the energy contribution of cluster k, and
$GM_k$ the graph multiplicity of cluster k (this is the number of symmetric equivalents, and the
pertinent term in equation (5) corrects for the overcounting).
Specifying the cluster expansion Hamiltonian in Zacros (i.e. defining the energetic interactions of the
model) is done in energetics_input.dat using the syntax that is discussed in the following.
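The sum in equation (5) can be sketched in Python as follows. The cluster names, energies and multiplicities are borrowed from the examples given later in this section, whereas the instance counts are hypothetical; in an actual simulation Zacros obtains them by pattern detection on the lattice. The function name is illustrative.

```python
def lattice_energy(clusters, counts):
    """Total energy per equation (5): sum over k of NCI_k / GM_k * ECI_k.

    clusters: {name: (ECI_k, GM_k)}
    counts:   {name: NCI_k}, the number of detected instances of each cluster
    """
    return sum(counts.get(name, 0) / gm * eci
               for name, (eci, gm) in clusters.items())


# Energy contribution and graph multiplicity of two clusters.
clusters = {
    "CO-O_Interaction": (0.140, 1),
    "O-O_Interaction_3rdNN": (-0.016, 2),
}

# Hypothetical instance counts. With a multiplicity of 2, every physical O-O
# pair is detected twice, so 4 instances correspond to 2 pairs.
counts = {"CO-O_Interaction": 3, "O-O_Interaction_3rdNN": 4}

print(f"{lattice_energy(clusters, counts):.3f}")  # 0.388
```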

energetics
…
expr
…
end_energetics
Defines an energetics specification block. Anything after the
keyword end_energetics is ignored. The energetics
specification block contains expressions consisting of one or
several blocks structured as cluster … end_cluster
(explained below). Each of the latter defines a cluster (also
referred to as figure or pattern) in the cluster expansion
Hamiltonian.


cluster str
…
expr
…
end_cluster
Defines a cluster in the Hamiltonian. String str is a descriptive
name of the cluster. There is no limitation to how many such
“cluster definition” blocks can be contained in an energetics
specification. Permitted keywords within this block will be
presented shortly. Before doing so, however, let us briefly
discuss how clusters are represented in Zacros.

Cluster Representation
Each cluster is represented as a graph pattern. The latter consists of a collection of connected sites onto
which surface species can bind. In order to “translate” a pattern into input that Zacros can process, it is
instructive to make drawings such as the ones in Figure 7. The pattern on the top left (CO-O interaction)
can be used to model the repulsive interaction between an adsorbed CO molecule on a top site and an O
adatom on an fcc site, for example on Pt(111). This pattern involves two monodentate species bound to
neighboring sites of different types. The pattern on the bottom left (Bidentate O2) represents the
bidentate binding configuration of O2. Finally, the pattern on the bottom right (O-O Interaction 3rd NN)
can model the interaction between O adatoms on Pt(111), for instance. It involves two monodentate
adsorbates and three sites, the second of which can be empty or occupied by another species.
Moreover, it involves a geometric criterion, as the angle between the links 1-2 and 2-3 has to be 180°
for sites 1 and 3 to be 3rd nearest-neighbors (for 2nd nearest-neighbors this angle is 120°; see also Figure
5). To represent patterns such as these, Zacros provides a number of keywords discussed below.

sites int1 Specifies the number of sites in the graph pattern representing
the cluster.

Species
Name Index Dentates
CO* 1 1
O* 2 1
O2** 3 2

Site Types
Index Name
1 top
2 fcc

Patterns shown: CO-O Interaction (top left), Bidentate O2 (bottom left), O-O Interaction 3rd NN (bottom
right; angle 180°, state of the middle site undefined).

Figure 7: Schematics of various graph patterns representing energetic contribution terms (clusters) in a
cluster expansion Hamiltonian. The white numbers represent the indexes of each site of the pattern. The
lowercase roman numerals show the dentates of the bidentate oxygen.


neighboring int1-int2 … Specifies the neighboring between sites, if more than one site
appears in the graph pattern. It is followed by expressions
structured as int1-int2 on the same line of input as the
keyword. Each such expression denotes that the sites with
indexes int1 and int2 are nearest neighbors. The values of
int1, int2, … range from 1 up to the number of sites
specified by the keyword sites. There can be as many such
expressions as needed to fully define the neighboring structure
of the pattern.

lattice_state
expr
…
Specifies the state of each site in the graph pattern
representing the cluster. It is followed by as many lines as the
number of sites specified by the keyword sites. Each one of
these (non-blank) lines contains an expression specifying the
state of a site: the first line corresponds to the site indexed 1 in
the pattern, the second line to site 2, etc. Note that there is no
closing keyword for lattice_state; the program exits this
input mode once the appropriate number of such lines has
been parsed. Each expression expr conforms to one of the
two following standards:

int1 str int2 The first argument int1 is the number
of the molecular entity bound to that site. Thus, if
a bidentate species is bound to sites 1 and 3, both
of these sites will have the same integer in the
first column. The second argument str gives the
name of the species bound to the site. Permitted
surface species names are those defined
previously with keyword surf_specs_names
(see section Simulation Input File, page 20). Finally,
the third and last argument gives the dentate
number with which the species is bound. For sites
occupied by monodentate species, this number
will always be equal to 1.

& & & For this specification all three columns contain the
ampersand character. This denotes an unspecified
state for that site.

site_types str1 str2 … The types of each and every site in the pattern. There should
be as many strings following this keyword as the number of
sites in the pattern specified by sites. This keyword is
optional. If omitted, the pattern will be detected based on
criteria pertinent to site occupancy, neighboring and geometry
(if applicable) only.

graph_multiplicity int The multiplicity of the pattern, namely the number of times
that the exact same pattern will be counted for a given lattice
configuration. This keyword is followed by an integer and can
be thought of as a symmetry number for the pattern (see also
the description of keyword cluster_eng below). It is an
optional keyword. Omitting the keyword is equivalent to
specifying a value equal to 1 for int.

cluster_eng real The energy contribution of the pattern, given as a real number
following the keyword. If graph_multiplicity is greater
than 1 for this pattern, the energy contribution is divided by
the (integer) multiplicity.

angles int1-int2-int3:real1 … Specifies a geometric criterion
based on the angle between two links
connecting pairs of three sites. There
can be as many expressions following
the keyword angles as needed,
provided they appear on the same line. The integers int1,
int2 and int3 denote three sites s1, s2 and s3, out of which s1
neighbors with s2, and s2 neighbors with s3. Then, the value of
real1 specifies the angle in degrees between vectors s2→s1
and s2→s3. Zacros accepts positive and negative values for the
angle specification, according to the following convention:
positive means counter-clockwise (the direction of the arrow
in the accompanying schematic of sites 1, 2 and 3), negative
means clockwise. Note that
by default, mirror images of patterns are detected when Zacros
calculates the total energy of a given lattice configuration.
Thus, one does not need to explicitly define such mirror images
unless the default behavior is overridden as discussed below.
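A signed angle of this kind can be computed from site coordinates with atan2, as in the Python sketch below. It assumes that the angle is measured from vector s2→s1 towards vector s2→s3, with counter-clockwise rotation counted as positive; the text above fixes the sign convention but not explicitly which vector is the reference, so treat that choice as an assumption. The function name and coordinates are illustrative.

```python
import math


def signed_angle(s1, s2, s3):
    """Angle in degrees from vector s2->s1 to vector s2->s3.

    Counter-clockwise is positive, clockwise is negative (assumed reference:
    the vector s2->s1).
    """
    v1 = (s1[0] - s2[0], s1[1] - s2[1])
    v3 = (s3[0] - s2[0], s3[1] - s2[1])
    cross = v1[0] * v3[1] - v1[1] * v3[0]
    dot = v1[0] * v3[0] + v1[1] * v3[1]
    return math.degrees(math.atan2(cross, dot))


s1, s2 = (1.0, 0.0), (0.0, 0.0)
s3 = (math.cos(math.radians(60.0)), math.sin(math.radians(60.0)))
s3_mirror = (s3[0], -s3[1])  # mirror image with respect to the x-axis

print(round(signed_angle(s1, s2, s3), 6))         # 60.0
print(round(signed_angle(s1, s2, s3_mirror), 6))  # -60.0
print(round(signed_angle(s1, s2, (-1.0, 0.0)), 6))  # 180.0
```

The second call illustrates why a mirror image of a pattern corresponds to the opposite angle value.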

no_mirror_images Overrides the default behavior of the program thereby
preventing mirror image pattern detection. By default Zacros
detects mirror images by looking for patterns which have angle
values opposite from the ones specified in angles (for the
same site indexes). For instance, if a pattern is specified with
angles 4-5-1:60, Zacros will also search for patterns
having angles 4-5-1:-60 and all other properties the
same as the original pattern. The presence of the keyword
no_mirror_images restricts the search to the original
pattern only.

absl_orientation int1-int2:real Specifies a geometric criterion based on the angle
between the x-axis and a link between a pair of sites. The
integers int1 and int2 denote the neighboring sites s1 and s2. The
value of real specifies the angle in degrees between vector
s2→s1 and the unit vector (1,0) in Cartesian coordinates. This
keyword can be combined with keywords angles and
no_mirror_images for a precise definition of the
figures/patterns entering the cluster expansion Hamiltonian.

variant str
…
expr
…
end_variant
Variant blocks help reduce repetition in the energetics_input.dat
file. Right after the
cluster keyword one can define the number of sites
(sites), neighboring structure (neighboring) and state of
each site (lattice_state). Then, within that cluster …
end_cluster block, one can define several variants that will
all share the same lattice structure and occupancies, but may
vary in their geometry or site types. The name of the variant
pattern consists of string str appended to the name of the
parent pattern (str following the keyword cluster). In this
respect, the following keywords are permitted within a
variant block: site_types, graph_multiplicity,
angles, no_mirror_images, absl_orientation,
cluster_eng. If any of these keywords has been listed
within a cluster block before a variant block has been
opened, the keyword variant is no longer permitted within
that cluster block.

Examples
As guiding examples, we finally give the Zacros input defining the clusters presented in Figure 7.

cluster CO-O_Interaction        # Opening a cluster block
  sites 2                       # There are two sites in the pattern…
  neighboring 1-2               # that are neighbors
  lattice_state
    1 CO* 1                     # 1st site occupied by CO* (monodentate)
    2 O* 1                      # 2nd site occupied by O* (monodentate)
  site_types top fcc            # Specifying site types in the pattern…
  cluster_eng 0.140             # and the energy contribution thereof
end_cluster                     # Closing the cluster block

cluster Bidentate_O2            # Opening a cluster block
  sites 2                       # There are two sites in the pattern…
  neighboring 1-2               # that are neighbors
  lattice_state
    1 O2** 1                    # 1st site occupied by 1st dentate of O2**
    1 O2** 2                    # 2nd site occupied by 2nd dentate of O2**
  site_types top top            # Specifying site types in the pattern…
  cluster_eng -0.203            # and the energy contribution thereof
end_cluster                     # Closing the cluster block

cluster O-O_Interaction         # Opening a cluster block
  sites 3                       # There are three sites in the pattern
  neighboring 1-2 2-3           # Site 2 neighbors with sites 1 and 3…
  lattice_state                 # (neighboring of 1-3 not yet precluded)
    1 O* 1                      # 1st site is occupied by O*
    & & &                       # 2nd site’s state is unspecified
    2 O* 1                      # 3rd site is occupied by O*
  variant 3rdNN                 # Defining variant O-O_Interaction_3rdNN
    site_types fcc fcc fcc      # Specifying site types in the pattern
    graph_multiplicity 2        # Multiplicity = 2 due to symmetry
    angles 1-2-3:180.0          # Geometric criterion for 3rd NN only:…
                                # at this point the neighboring of 1-3
                                # is ruled out!
    cluster_eng -0.016          # Defining cluster’s energy contribution
  end_variant                   # Closing the variant block
end_cluster                     # Closing the cluster block

Mechanism Input File


The file mechanism_input.dat defines a reaction mechanism, consisting of one or more reversible
and/or irreversible elementary steps. These steps can represent adsorption of molecules on surface
sites, desorption therefrom, diffusion from one site to a neighboring site, or reactions between
adsorbed particles and possibly gas species (Eley-Rideal reactions). The syntax used in this file is
discussed in the following.

mechanism
…
expr
…
end_mechanism
Defines a mechanism specification block. Anything after the
keyword end_mechanism is ignored. The mechanism
specification block contains expressions consisting of one or
several blocks structured as either step … end_step, or
reversible_step … end_reversible_step (more
details follow). These blocks define irreversible or reversible
elementary steps, respectively.


step str
…
expr
…
end_step
Defines an irreversible elementary step in the mechanism.
String str is a descriptive name of the step. There is no
limitation to how many such “step definition” blocks can be
contained in a mechanism specification. Caution: irreversible
steps by definition violate microscopic reversibility! Moreover,
for such steps, Zacros calculates the activation energy Ea in a
slightly different way than for reversible steps. It uses a
Brønsted-Evans-Polanyi relation (see equations (8)-(12)), but does not
perform any correction if Ea < 0 or if Ea < ∆Erxn. Thus, for an
irreversible step, the max operator of equation (10) is not
applied, and
$E^{\ddagger}_{fwd}(\sigma) = E^{\ddagger}_{fwd,0} + \omega \cdot (\Delta E_{rxn}(\sigma) - \Delta E_{rxn,0})$.
Irreversible steps are intended for simulating “prototype” systems, e.g.
toy models with kinetic constants that have fixed values.
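The difference between the two treatments can be illustrated with a short Python sketch. It evaluates the Brønsted-Evans-Polanyi expression with and without the max operator of equation (10). The numerical values are hypothetical and the function name is illustrative; this is not Zacros source code.

```python
def activation_energy(e_fwd0, de_rxn, de_rxn0, omega, reversible=True):
    """Forward activation energy from the BEP relation.

    e_fwd0:  activation energy at the zero coverage limit
    de_rxn:  reaction energy for the current configuration
    de_rxn0: reaction energy at the zero coverage limit
    omega:   proximity factor
    """
    e_bep = e_fwd0 + omega * (de_rxn - de_rxn0)
    if reversible:
        # Equation (10): filter negative values and values below de_rxn.
        return max(0.0, de_rxn, e_bep)
    # Irreversible step: the max operator is not applied.
    return e_bep


# Hypothetical values for which the uncorrected BEP estimate is negative.
args = dict(e_fwd0=0.5, de_rxn=-1.8, de_rxn0=-0.2, omega=0.5)

print(f"{activation_energy(**args, reversible=True):.2f}")   # 0.00
print(f"{activation_energy(**args, reversible=False):.2f}")  # -0.30
```

A negative activation energy, as in the second call, is retained for an irreversible step, which is one reason such steps are best reserved for prototype models.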

reversible_step str
…
expr
…
end_reversible_step
In a similar manner, this block defines a reversible elementary
step. Thus, both the forward and reverse steps will be taken
into account in the KMC simulation. String str is a descriptive
name of this reversible step. The forward and backward steps
are named by appending the strings “_fwd” and “_rev” to
str. There is no limitation to how many such “step definition”
blocks can be contained in a mechanism specification.

Permitted keywords within the two blocks just mentioned will be presented shortly. Before doing so,
however, let us briefly discuss how elementary steps of a reaction mechanism are represented in Zacros.

Elementary Step Representation


As in the case of figures in a cluster expansion Hamiltonian, each elementary step is represented as a
graph pattern, with specific initial and final states. In order to “translate” an elementary step into input
that Zacros can process, it is instructive to make drawings such as the ones in Figure 8. Note that in
these patterns the number of sites and the neighboring structure remain static (no reconstructions). For
each pattern, the initial and final state is depicted along with two tables listing the site types and species
participating in the reaction steps. Note that the first step (CO-O Oxidation) is irreversible, whereas the
second (CO-O2 Oxidation) is reversible. Moreover, both steps release a CO2 molecule into the gas phase.

The rates of elementary reactions are calculated from Arrhenius relationships. For the forward step of a
reversible process:

$$k_{fwd} = A_{fwd} \cdot \exp\left(-\frac{E^{\ddagger}_{fwd}(\sigma)}{k_B \cdot T}\right) \tag{6}$$

where $A_{fwd}$ is the pre-exponential (also referred to as pre-factor), $E^{\ddagger}_{fwd}(\sigma)$ the activation energy for the
given configuration of neighboring adsorbates, $k_B$ Boltzmann’s constant, and $T$ the temperature.


Site Types
Index Name
1 top
2 fcc
3 brg

Species
Name Index Dentates
* 0 1 (empty site regarded as a pseudo-species)
CO* 1 1
O* 2 1
O2*** 3 3

Steps shown: CO-O Oxidation (two sites; releases CO2(gas)) and CO-O2 Oxidation (five sites; forward and
reverse directions; releases CO2(gas)).

Figure 8: Schematics of various graph patterns representing elementary steps of a reaction mechanism.
The white numbers represent the indexes of each site of the pattern. The lowercase roman numerals
show the dentates of the tridentate oxygen.

For the reverse step:

$$k_{\mathrm{rev}} = A_{\mathrm{rev}} \cdot \exp\left(-\frac{E^{\ddagger}_{\mathrm{rev}}(\sigma)}{k_B \cdot T}\right) \tag{7}$$

Microscopic reversibility dictates that the difference between the forward and reverse activation energies is
equal to the reaction energy ∆Erxn(σ) (see also Figure 9):

$$\Delta E_{\mathrm{rxn}}(\sigma) = E^{\ddagger}_{\mathrm{fwd}}(\sigma) - E^{\ddagger}_{\mathrm{rev}}(\sigma) \tag{8}$$

In the expression above, ∆Erxn ( σ ) can be calculated from the energetics model (cluster expansion
Hamiltonian; see equation (5) in section Energetics Input File, page 47) as the difference between final
versus initial state energies, Efinal ( σ ) and Einitial ( σ ) , respectively:

$$\Delta E_{\mathrm{rxn}}(\sigma) = E_{\mathrm{final}}(\sigma) - E_{\mathrm{initial}}(\sigma) \tag{9}$$

[Figure 9 schematic: energy versus reaction coordinate, showing the initial state, the transition state and the final state, with the quantities E‡fwd,0, E‡fwd(σ), E‡rev,0, E‡rev(σ), ∆Erxn,0 and ∆Erxn(σ) marked.]
Figure 9: Energy profile of an elementary step. The quantities involved in the calculation of the forward
and reverse activation energies are noted.

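Moreover, the forward activation energy can be parameterized in terms of a Brønsted-Evans-Polanyi
(BEP) relationship [2,15]:

$$E^{\ddagger}_{\mathrm{fwd}}(\sigma) = \max\left(0,\ \Delta E_{\mathrm{rxn}}(\sigma),\ E^{\ddagger}_{\mathrm{fwd},0} + \omega \cdot \left(\Delta E_{\mathrm{rxn}}(\sigma) - \Delta E_{\mathrm{rxn},0}\right)\right) \tag{10}$$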

where the max operator filters negative values, as well as values less than ∆Erxn(σ), if the latter is
positive (important: this operator is not applied for irreversible steps; please read also the cautionary
note in the step keyword on page 53). Moreover, E‡fwd,0 and ∆Erxn,0 are the activation and reaction
energies at the zero coverage limit (only the reactants existing on the surface), and ω is the so-called
proximity factor ranging from 0.0 for an initial-state-like transition state, to 1.0 for a final-state-like
transition state. The reverse activation energy expression that is in line with equations (8) and (10) is:

$$E^{\ddagger}_{\mathrm{rev}}(\sigma) = \max\left(-\Delta E_{\mathrm{rxn}}(\sigma),\ 0,\ E^{\ddagger}_{\mathrm{rev},0} - (1 - \omega) \cdot \left(\Delta E_{\mathrm{rxn}}(\sigma) - \Delta E_{\mathrm{rxn},0}\right)\right) \tag{11}$$

where:

$$E^{\ddagger}_{\mathrm{rev},0} = E^{\ddagger}_{\mathrm{fwd},0} - \Delta E_{\mathrm{rxn},0} \tag{12}$$
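As an illustration only, the following Python sketch evaluates equations (10) to (12) and the Arrhenius expressions (6) and (7) for a reversible step. It is not Zacros code: the function names, the numerical values and the choice of eV and K as units (with Boltzmann's constant in eV/K) are assumptions made for the example.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K (assumes energies in eV)


def bep_activation_energies(e_fwd0, de_rxn0, de_rxn, omega=0.5):
    """Forward and reverse activation energies of a reversible step.

    e_fwd0, de_rxn0: zero-coverage activation and reaction energies.
    de_rxn: reaction energy for the current configuration.
    omega: proximity factor, between 0.0 and 1.0.
    """
    e_rev0 = e_fwd0 - de_rxn0                                   # equation (12)
    shift = de_rxn - de_rxn0
    e_fwd = max(0.0, de_rxn, e_fwd0 + omega * shift)            # equation (10)
    e_rev = max(-de_rxn, 0.0, e_rev0 - (1.0 - omega) * shift)   # equation (11)
    return e_fwd, e_rev


def arrhenius(pre_exp, e_act, temperature):
    """Rate constant from equations (6) or (7)."""
    return pre_exp * math.exp(-e_act / (K_B * temperature))


# Hypothetical numbers: neighbors make the reaction 0.3 eV less exothermic.
e_fwd, e_rev = bep_activation_energies(e_fwd0=0.3, de_rxn0=-0.5, de_rxn=-0.2)
k_fwd = arrhenius(1.0e13, e_fwd, 500.0)
print(e_fwd, e_rev, e_fwd - e_rev)  # the difference reproduces de_rxn, equation (8)
```

Because both max operators clip at the same points, the difference between the two activation energies always equals the reaction energy, which is what equation (8) requires.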

To represent elementary steps constituting a mechanism, Zacros provides a number of keywords
discussed below. Unless otherwise stated, these keywords are valid for defining both irreversible and
reversible steps and thus can be used inside step … end_step or reversible_step …
end_reversible_step blocks.

gas_reacs_prods str1 int1 … Provides information about the gas species participating in the
mechanism. The name of the first gas species is given in str1
whereas the stoichiometry is given by integer int1 (negative
for reactants and positive for products). Permitted gas species
names are those defined previously with keyword
gas_specs_names (see section Simulation Input File, page
20). In principle, an arbitrary number of such str int pairs
can appear, although in physically meaningful situations one
would generally be limited to at most one reactant and one
product.

sites int1 Specifies the number of sites in the graph pattern representing
the elementary step being defined.

neighboring int1-int2 … Specifies the neighboring between sites, if more than one site
appears in the graph pattern representing the elementary step.
It is followed by expressions structured as int1-int2 in the
same line of input as the keyword. Each such expression
denotes that the sites with indexes int1 and int2 are
nearest neighbors. The values of int1, int2, … range from 1
up to the number of sites specified by the keyword sites.
There can be as many such expressions as needed to fully
define the neighboring structure of the pattern. For patterns
involving only one site, this keyword is omitted.

initial
  int1 str int2
  ⋮
Specifies the initial state of each site in the graph pattern. It is
followed by as many lines as the number of sites specified by
the keyword sites. Each one of these (non-blank) lines
contains an expression specifying the state of a site: the first
line corresponds to the site indexed 1 in the pattern, the
second line to site 2 etc. Note that there is no closing keyword
for initial; the program exits this input mode once the
appropriate number of such lines has been parsed. In each of
these lines, the first argument int1 is the number of the
molecular entity bound to that site. Thus, if a bidentate species
is bound to sites 1 and 3, both of these sites will have the same
integer in the first column. The second argument str gives
the name of the surface species bound to the site. Permitted
surface species names are those defined previously with
keyword surf_specs_names (see section Simulation Input
File, page 20). Finally, the third and last argument int2 gives the
dentate number with which the species is bound. For sites
occupied by monodentate species, this number will always be
equal to 1.


final
  int1 str int2
  ⋮
Specifies the final state of each site in the graph pattern. This
keyword is subject to the exact same rules as the previously
introduced keyword initial.

site_types str1 str2 … The types of each and every site in the pattern. There should
be as many strings following this keyword as the number of
sites in the pattern specified by sites. This keyword is
optional. If omitted, the pattern will be detected based on
criteria pertinent to site occupancy and neighboring only.

pre_expon expr Specifies the pre-exponential in the Arrhenius formula giving
the rate constant of that elementary step. Possible options for
expression expr are:

real if a single real number is given, the value of the
pre-exponential is assumed to be constant (i.e.
independent of temperature).

real1 real2 … real7 seven real numbers following
keyword pre_expon are interpreted as defining
a temperature-dependent pre-exponential. The
value of the latter is calculated from the following
expression:

$$A_{\mathrm{fwd}}(T) = \exp\left[-\left(\alpha_1 \cdot \log(T) + \frac{\alpha_2}{T} + \alpha_3 + \alpha_4 \cdot T + \alpha_5 \cdot T^2 + \alpha_6 \cdot T^3 + \alpha_7 \cdot T^4\right)\right] \tag{13}$$

where the values of α1, α2, …, α7 are given by the
reals real1, real2, …, real7, respectively, and
log is the natural logarithm. This option is useful in
simulating temperature programmed desorption
or reaction spectra. Note that equation (13) is
applied over the range between the initial and
final temperatures in the simulation, namely
[Tinitial, Tinitial + Ramp⋅tsimulation] (refer to keywords
temperature and max_time in section
Simulation Input File). For temperatures outside
this range, the pre-exponential value at the nearest
endpoint of the interval is used. For instance, if
Ramp > 0, for T < Tinitial the pre-exponential will be
taken equal to Afwd(Tinitial). Similarly, for T > Tinitial +
Ramp⋅tsimulation the pre-exponential will be taken
equal to Afwd(Tinitial + Ramp⋅tsimulation).
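The following Python sketch shows how equation (13) and the endpoint rule described above could be evaluated. It is an illustration under stated assumptions, not the Zacros implementation: the function name and arguments are hypothetical, and treating a negative ramp by sorting the two endpoints is an assumption (the text above only spells out the case Ramp > 0).

```python
import math


def pre_exponential(temperature, alphas, t_initial, ramp, t_simulation):
    """Temperature-dependent pre-exponential following equation (13).

    alphas: the seven reals given to pre_expon (alpha_1 ... alpha_7).
    Outside [t_initial, t_initial + ramp * t_simulation] the value at the
    nearest endpoint is used.
    """
    t_lo, t_hi = sorted((t_initial, t_initial + ramp * t_simulation))
    t = min(max(temperature, t_lo), t_hi)          # clamp to the ramp range
    a1, a2, a3, a4, a5, a6, a7 = alphas
    exponent = (a1 * math.log(t) + a2 / t + a3
                + a4 * t + a5 * t**2 + a6 * t**3 + a7 * t**4)
    return math.exp(-exponent)


# With only alpha_1 = -1 the expression reduces to A(T) = T, which makes the
# clamping easy to see for a ramp from 300 K to 400 K.
linear = (-1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
inside = pre_exponential(350.0, linear, t_initial=300.0, ramp=1.0, t_simulation=100.0)
below = pre_exponential(250.0, linear, t_initial=300.0, ramp=1.0, t_simulation=100.0)
above = pre_exponential(500.0, linear, t_initial=300.0, ramp=1.0, t_simulation=100.0)
print(inside, below, above)
```

A constant pre-exponential A corresponds to setting only α3 = −log(A) in this form.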


pe_ratio expr This keyword gives the ratio of forward over reverse
pre-exponentials and is valid only inside a reversible elementary
step specification block. The possible options for expression
expr are exactly as those of pre_expon, and the two
specifications (pre_expon and pe_ratio) should match:
either both specified as a single real number, or both as a
sequence of 7 real numbers. In the latter case, a similar
expression as that of equation (13) is employed by Zacros:

$$\frac{A_{\mathrm{fwd}}(T)}{A_{\mathrm{rev}}(T)} = \exp\left[-\left(\beta_1 \cdot \log(T) + \frac{\beta_2}{T} + \beta_3 + \beta_4 \cdot T + \beta_5 \cdot T^2 + \beta_6 \cdot T^3 + \beta_7 \cdot T^4\right)\right] \tag{14}$$

activ_eng real The activation energy at the zero coverage limit. For a
reversible step, real gives the forward activation energy at
the zero coverage limit E‡fwd,0 . The forward activation energy
for the given configuration, which enters the Arrhenius
equation (6) is computed through the BEP relationship (10).
The latter makes use of the reaction energy given by the
energetics model (cluster expansion Hamiltonian; see section
Energetics Input File). The reverse activation energy entering
the Arrhenius equation (7) is computed through equation (11),
such that detailed balance is automatically satisfied.

prox_factor real The proximity factor used in the BEP relationship to calculate
the forward (and also reverse, if applicable) activation energy
(see equations 10, 11). If this keyword is omitted, a default
value of 0.5 is used for that elementary step.

angles int1-int2-int3:real1 … Specifies a geometric criterion
based on the angle between the two links
that connect three sites. There
can be as many expressions following
the keyword angles as needed,
provided they appear on the same line. The integers int1,
int2 and int3 denote three sites s1, s2 and s3, out of which s1
neighbors with s2, and s2 neighbors with s3. Then, the value of
real1 specifies the angle in degrees between vectors s2→s1
and s2→s3. Zacros accepts positive and negative values for the
angle specification, according to the following convention:
positive means counter-clockwise (in the direction of the arrow
in the schematic), negative means clockwise. Note that
by default, mirror images of patterns are detected when Zacros
scans for the possible elementary processes for a given lattice
configuration. Thus, one does not need to explicitly define such
mirror images unless the default behavior is overridden as
discussed below.
[Schematic: three sites labeled 1, 2 and 3, with site 2 at the vertex and an arrow marking the counter-clockwise direction.]
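When preparing an angles specification it can help to compute the signed angle from site coordinates. The Python sketch below does this with atan2. It assumes that the angle is measured from vector s2→s1 to vector s2→s3, with counter-clockwise rotation counted as positive; the direction of measurement is an assumption of this example and should be checked against the schematic. The function name and coordinates are hypothetical.

```python
import math


def signed_angle_deg(s1, s2, s3):
    """Signed angle (degrees) from vector s2->s1 to vector s2->s3.

    Counter-clockwise rotation is positive, clockwise is negative.
    Each site is an (x, y) pair of Cartesian coordinates.
    """
    ax, ay = s1[0] - s2[0], s1[1] - s2[1]
    bx, by = s3[0] - s2[0], s3[1] - s2[1]
    cross = ax * by - ay * bx   # sign gives the sense of rotation
    dot = ax * bx + ay * by
    return math.degrees(math.atan2(cross, dot))


# A pattern and its mirror image differ only in the sign of the angle.
angle = signed_angle_deg((1.0, 0.0), (0.0, 0.0), (0.0, 1.0))
mirror = signed_angle_deg((1.0, 0.0), (0.0, 0.0), (0.0, -1.0))
print(angle, mirror)
```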

no_mirror_images Overrides the default behavior of the program thereby
preventing mirror image pattern detection. By default Zacros
detects mirror images by looking for patterns which have angle
values opposite from the ones specified in angles (for the
same site indexes). For instance, if a pattern is specified with
angles 4-5-1:60, Zacros will also search for patterns
having angles 4-5-1:-60 and all other properties the
same as the original pattern. The presence of the keyword
no_mirror_images restricts the search to the original
pattern only.

absl_orientation int1-int2:real Specifies a geometric criterion based on the angle
between the x-axis and a link between a pair of sites. The
integers int1 and int2 denote two neighboring sites s1 and s2. The
value of real specifies the angle in degrees between vector
s2→s1 and the unit vector (1,0) in Cartesian coordinates. This
keyword can be combined with keywords angles and
no_mirror_images for a precise definition of the
figures/patterns representing the elementary steps of the
reaction mechanism.

stiffness_scalable Specifies that the rate constant of the currently defined
elementary step can be scaled to treat time-scale separation,
also referred to as stiffness (see section Treating Fast
Quasi-Equilibrated Processes, page 25).

variant str
  ⋮
  expr
  ⋮
end_variant
To reduce repetitions in the mechanism_input.dat file,
variant blocks can be of particular use. After the
keywords gas_reacs_prods, sites, neighboring,
initial and final, inside an elementary step definition
block, one can define one or more variants that all share
the same gas reactants/products, lattice structure and
initial/final occupancies, but may vary in their geometry or site
types. The name of the variant pattern consists of string str
appended to the name of the parent pattern (the str following
the keyword step or reversible_step). The
following keywords are permitted within a variant
block: site_types, pre_expon, pe_ratio (for reversible
steps only), activ_eng, angles, no_mirror_images,
absl_orientation. If any of these keywords has been
listed within a step/reversible_step block before a
variant block has been opened, the keyword variant is
no longer permitted within that block.
Examples
As guiding examples, we finally give the Zacros input defining the elementary steps of Figure 8.

step CO-O_Oxidation                 # Opening an irreversible step block
  gas_reacs_prods CO2 1             # One CO2 product molecule
  sites 2                           # There are two sites in the pattern…
  neighboring 1-2                   # that are neighbors
  initial                           # Initial state:
    1 CO* 1                         # 1st site occupied by CO* (monodentate)
    2 O*  1                         # 2nd site occupied by O* (monodentate)
  final                             # Final state:
    1 *   1                         # unoccupied 1st site (* is monodentate)
    2 *   1                         # unoccupied 2nd site
  site_types top fcc                # Specifying site types in the pattern…
  pre_expon 1.000e+013              # along with the pre-exponential…
  activ_eng 0.200                   # activation energy…
  prox_factor 0.500                 # and proximity factor
end_step                            # Closing the step block

reversible_step CO-O2_Oxidation     # Opening a reversible step block
  gas_reacs_prods CO2 1             # One CO2 product molecule
  sites 5                           # There are five sites in the pattern…
  neighboring 1-2 2-3 3-4 4-5       # that neighbor as specified
  initial                           # Initial state:
    1 O2*** 3                       # 1st site occupied by 3rd dentate of O2
    1 O2*** 2                       # 2nd site occupied by 2nd dentate of O2
    1 O2*** 1                       # 3rd site occupied by 1st dentate of O2
    2 *     1                       # 4th site unoccupied
    3 CO*   1                       # 5th site occupied by CO* (monodentate)
  final                             # Final state:
    1 *   1                         # 1st site unoccupied
    2 O*  1                         # 2nd site occupied by O* (monodentate)
    3 *   1                         # 3rd site unoccupied
    4 *   1                         # 4th site unoccupied
    5 *   1                         # 5th site unoccupied
  site_types top brg top brg top    # Specifying site types…
  pre_expon 1.000e+013              # along with the pre-exponential…
  pe_ratio 1.800e+006               # fwd/rev pre-exponential ratio…
  activ_eng 0.300                   # activation energy…
  prox_factor 0.500                 # and proximity factor
end_reversible_step                 # Closing the reversible step block
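Before running a simulation it can be handy to check a step block against the rules stated for sites, neighboring, initial, final and site_types. The Python sketch below is a hypothetical helper written for this guide, not part of Zacros, and it covers only those rules: site indexes in neighboring must lie between 1 and the number of sites, initial and final must each be followed by exactly that many "int str int" lines, and site_types must list one name per site. It assumes that # starts a comment, as in the listings above.

```python
def check_step_block(text):
    """Return a list of problems found in one step / reversible_step block."""
    rows = [line.split("#", 1)[0].split() for line in text.splitlines()]
    rows = [row for row in rows if row]            # drop blank and comment-only lines
    problems, n_sites, i = [], 0, 0
    while i < len(rows):
        key, args = rows[i][0], rows[i][1:]
        if key == "sites":
            n_sites = int(args[0])
        elif key == "neighboring":
            for pair in args:
                a, b = (int(x) for x in pair.split("-"))
                if not (1 <= a <= n_sites and 1 <= b <= n_sites):
                    problems.append(f"neighboring {pair}: index outside 1..{n_sites}")
        elif key in ("initial", "final"):
            state = rows[i + 1 : i + 1 + n_sites]
            well_formed = len(state) == n_sites and all(
                len(r) == 3 and r[0].isdigit() and r[2].isdigit() for r in state
            )
            if not well_formed:
                problems.append(f"{key}: expected {n_sites} lines of 'int str int'")
            i += n_sites                           # skip the state lines just read
        elif key == "site_types" and len(args) != n_sites:
            problems.append(f"site_types: expected {n_sites} names, got {len(args)}")
        i += 1
    return problems


EXAMPLE = """
step CO-O_Oxidation
  gas_reacs_prods CO2 1
  sites 2
  neighboring 1-2
  initial
    1 CO* 1
    2 O*  1
  final
    1 *   1
    2 *   1
  site_types top fcc
end_step
"""
print(check_step_block(EXAMPLE))
```

The helper does not check species names, dentate numbering or kinetic parameters; Zacros itself reports such errors in the Mechanism Setup section of general_output.txt.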


Initial State Input File


By default, a KMC simulation in Zacros is initialized with an empty lattice. However, there are cases in
which one would need to explicitly specify an initial state, for instance, in simulations of temperature
programmed desorption (TPD) or reaction (TPR), or in order to initialize the simulation from a
representative equilibrium configuration. The user may thus override the default behavior by supplying
a file named state_input.dat. This is an optional input file and, as noted before, Zacros will start
from an empty lattice in the absence thereof. The syntax used in this file is discussed below.

initial_state
  ⋮
  expr
  ⋮
end_initial_state
Defines an initial state specification block. Anything after the
keyword end_initial_state is ignored. The initial state
specification block contains one or more “particle seeding”
instructions expr (explained below), which allow Zacros to
populate the lattice with the desired number of particles.

seed_on_sites str int1 int2 … One or more such “individual seeding” instructions can
appear in place of expr in an initial state specification block
(see above). Each such instruction seeds one particle of the
species with name str on sites specified by the integers
int1, int2,… Permitted surface species names are those
defined previously with keyword surf_specs_names and
the number of integers following str must not exceed the
number of dentates of that species, defined by
surf_specs_dent (see section Simulation Input File, page
21). Finally, int1, int2,… can range from 1 up to the
number of sites that exist on the lattice.

seed_multiple str1 int1
  site_types str2 str3 …
  neighboring int2-int3 …
end_seed_multiple
One or more such “multiple seeding” blocks can appear in
place of expr in an initial state specification block (see above).
Each such instruction seeds multiple particles of the species
with name str1. The number of particles is defined by the
integer int1. The site types in which these particles will be
seeded are given by str2, str3, … There should be as many
such strings as the number of dentates of that species, defined
by surf_specs_dent (see section Simulation Input File,
page 21). If the species is monodentate, the neighboring
keyword is omitted; otherwise a neighboring structure is
specified using this keyword. This is done by using as many
expressions of the form int2-int3 as needed, in order to
define the links between the sites occupied by that species.
Note that if the neighboring structure thus defined cannot be
found on the lattice, execution of the seeding instruction will
fail. Note that for MPI runs, “multiple seeding” instructions are
not currently supported.

Examples
As illustrative examples, consider the following cases:

Suppose we need to seed 2 carbonate (CO3) particles randomly on the Au6 lattice (Figure 6). CO3 binds in
a top-bridge-top configuration at sites cn2-brg42-cn4. Thus, we can use the following instructions in the
file state_input.dat:

initial_state                       # Opening initial state block
  seed_multiple CO3*** 2            # Two CO3*** molecules to be seeded…
    site_types cn2 brg42 cn4        # on the specified site types
    neighboring 1-2 2-3             # Defining dentates’ neighboring
  end_seed_multiple                 # Closing seed_multiple block
end_initial_state                   # Closing initial_state block

Alternatively suppose we would like to seed two CO3 molecules at specific sites on the Au6 structure. We
could then use the following syntax:

initial_state                       # Opening initial state block
  seed_on_sites CO3*** 1 6 10       # First CO3*** seeding instruction
  seed_on_sites CO3*** 15 14 12     # Second CO3*** seeding instruction
end_initial_state                   # Closing initial_state block
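When many particles have to be placed at specific sites, the seed_on_sites lines can be generated by a script. The Python sketch below builds the text of an initial_state block and enforces the limits stated above (no more site indexes than dentates, and indexes between 1 and the number of lattice sites). The function name, the lattice size of 20 sites and the extra check that no site is seeded twice are assumptions of this example.

```python
def initial_state_block(species, n_dentates, placements, n_lattice_sites):
    """Build an initial_state block made of seed_on_sites instructions.

    placements: one tuple of lattice site indexes per particle to be seeded.
    """
    used = set()
    lines = ["initial_state"]
    for sites in placements:
        if len(sites) > n_dentates:
            raise ValueError(f"{species} has only {n_dentates} dentate(s): {sites}")
        for site in sites:
            if not 1 <= site <= n_lattice_sites:
                raise ValueError(f"site {site} is outside 1..{n_lattice_sites}")
            if site in used:                       # assumption: one adsorbate per site
                raise ValueError(f"site {site} is seeded more than once")
            used.add(site)
        lines.append(f"  seed_on_sites {species} " + " ".join(str(s) for s in sites))
    lines.append("end_initial_state")
    return "\n".join(lines) + "\n"


# Reproduces the second example above; the lattice size is a made-up value.
text = initial_state_block("CO3***", 3, [(1, 6, 10), (15, 14, 12)], n_lattice_sites=20)
print(text)
```

The returned string can then be written to state_input.dat or to a file passed with the --state command-line option.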

Command-Line Arguments
Zacros can also parse command-line arguments enabling it to override the default paths to all the input
files (see Input/Output Files, page 16), and some of the options read from a restart.inf file. In
particular, one can run Zacros as follows:

/path/to/executable/zacros.x --keyword=argument

If keyword is one of the following: simulation, lattice, mechanism, energetics, or
state, then argument has to be a string specifying the path to a file that will be used instead of the
default files: simulation_input.dat, lattice_input.dat, mechanism_input.dat,
energetics_input.dat, and state_input.dat, respectively. This can reduce duplication of
input files when running benchmarks, for instance, or when performing parameter sweeps. Make sure
there is no space between the keyword and argument, for example:

C:\KMC\Zacros.exe --lattice=".\lattice_cases\lattice_25.dat"


The following keywords: max_time, max_steps, or wall_time, are followed by a numerical
argument (real, int, int, respectively) which overrides the corresponding parameters defining a
stopping criterion (see section Stopping and Resuming, page 25). Note that the aforementioned
command line options do not accept the keyword infinity (unlike when these are used in the
simulation_input.dat file). These options are useful if we want to extend a simulation past the
originally defined number of KMC steps or time. Provided that we have saved the last state of our
simulation in restart.inf, we can resume and at the same time override the previous stopping
criteria, thereby making it possible to set a new final time, wall time or maximum number of steps.
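For parameter sweeps, the command lines can be assembled by a short script. The Python sketch below only builds argument lists using the keywords documented above; it does not launch Zacros. The function name, the executable path and the lattice file names are hypothetical.

```python
FILE_KEYWORDS = ("simulation", "lattice", "mechanism", "energetics", "state")
STOP_KEYWORDS = ("max_time", "max_steps", "wall_time")


def zacros_command(executable, **overrides):
    """Return the argument list for one Zacros run with the given overrides."""
    args = [executable]
    for keyword, value in overrides.items():
        if keyword not in FILE_KEYWORDS + STOP_KEYWORDS:
            raise ValueError(f"unsupported command-line keyword: {keyword}")
        if keyword in STOP_KEYWORDS and str(value).lower() == "infinity":
            raise ValueError(f"--{keyword} does not accept 'infinity'")
        args.append(f"--{keyword}={value}")        # no space around '='
    return args


# One command per lattice file (hypothetical paths).
commands = [
    zacros_command("./zacros.x", lattice=f"lattice_cases/lattice_{n}.dat")
    for n in (25, 50)
]
resume = zacros_command("./zacros.x", max_steps=2000000)
print(commands, resume)
```

Each list can be passed as is to subprocess.run, which avoids quoting problems with paths that contain spaces.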

Interpreting the Simulation Output of Zacros


During a KMC simulation, Zacros produces one or more output files, depending on the type of output
requested and whether the run is distributed (using MPI) or not. The structures of these files are
discussed below.

General Output File


The file general_output.txt, for serial or threaded runs (with OpenMP), contains general
information about the KMC simulation. For distributed runs (with MPI), each MPI process generates one
such file; therefore, MPI process 0 generates and writes into general_output.txt, MPI process 1
general_output_1.txt, MPI process 2 general_output_2.txt etc. The file contents are
broken down into the following sections, which are mostly self-explanatory:

Compiler Information
If the compiler supports this feature, this section gives information about the compiler version and
options used to generate the Zacros executable.

Threading/Multiprocessing Information
This section gives information about parallelization. If the program has been compiled as a serial
application, the message “Serial run (no parallelism).” will appear. If the compiler
recognized the OpenMP directives, the message will read “Shared-memory multiprocessing
with int OpenMP threads.” where int is the number of threads used during execution (please
refer to section Running Zacros, page 15, for more information about setting the number of threads).

For an MPI run, the following messages appear: “Distributed run with int1 MPI
process[es], without OpenMP threads.” or “Distributed run with int1 MPI
process, each with int2 OpenMP thread[s].”, “The rank of this MPI
process is int3.” and “The name of this processor is “str”.”. In these
messages, int1 is the total number of MPI processes used in the run, int2 is the number of threads in
each MPI process (if Zacros has been compiled with OpenMP and MPI directives), and int3 is the rank
of the MPI process that wrote the current general_output.txt file. Finally, str is the name of the
processor (computational node for high-performance computing clusters), e.g. node-k98j-004. The
latter information can be useful for troubleshooting, in case, for instance, a node is slower than the
others and results in an overall slow-down of the simulation. The name of the processor for MPI runs is
also written every time the simulation is restarted, since different processors may be chosen (or
different ranks may be assigned) in the new simulation chunk.

Simulation Setup
This section repeats the information parsed while processing file simulation_input.dat. If
everything is valid, this section ends with the message “Finished reading simulation
input.” otherwise an error is output to this file and execution is terminated.

Lattice Setup
In this section, information about the lattice structure is presented, namely the type of lattice
specification (default, periodic, explicit; see section Lattice Input File, page 41), the area of the
simulation box for periodic lattices, the site types and the number of sites per type, as well as the
maximum coordination number in the lattice. If everything is valid, this section ends with the message
“Finished reading lattice input.” otherwise an error is output to this file and execution is
terminated.

Energetics Setup
This section reports the number of clusters for the cluster expansion Hamiltonian parsed from the
energetics_input.dat file, and the maximum number of sites involved in a cluster. A summary of
the clusters defined is also given. For certain “simple” systems, the following message appears in this
section, right after the summary just noted: “This cluster expansion involves only
single body patterns.”. For such systems, the corresponding accelerator is automatically
enabled (see section Memory Management on page 37). If everything is valid, this section ends with the
message “Finished reading energetics input.” otherwise an error is output to this file and
execution is terminated.

Mechanism Setup
This section reports the number of elementary steps parsed from the mechanism_input.dat file,
and the maximum number of sites involved in a step. A summary of the elementary steps contained in
the mechanism is also given. For certain “simple” systems, the following message appears in this
section, right after the summary just noted: “This mechanism contains up to two-site
events involving only monodentate species.”. For such systems, the corresponding
accelerator is automatically enabled (see section Memory Management on page 37). If everything is
valid, this section ends with the message “Finished reading mechanism input.” otherwise
an error is output to this file and execution is terminated.

Initial State Setup


This section appears only if an initial state has been defined using file state_input.dat and
summarizes all seeding instructions parsed therefrom. If everything is valid, this section ends with the
message “Finished reading initial state input.” otherwise an error is output to this file
and execution is terminated.


Simulation Preparation
This section opens with the message “Preparing simulation” and contains information about
the preparatory steps of the simulation, pertinent e.g. to the construction of the lattice, the pre-
allocation of data-structures handling certain elements of the simulation, and the initialization of these
data-structures while setting up the lattice state, adsorbate energetics, and elementary events. MPI runs
provide extra information about the lattice partitioning to subdomains, the size of halos etc. as well as
the MPI-specific data-structures (state queue and message queue; see section Simulating Very Large
Lattices on page 29 for more details on these).

Simulation Output
This section opens with the message “Commencing simulation” and closes with “Simulation
stopped”. If event reporting is turned on (see keyword event_report in section Simulation Input
File, page 24) the occurrence of each lattice process is reported using the following format:
KMC step int1
Elementary step str
occurred at time t = real1
involving site(s): int2 int3 …
Its propensity at T0 was k(T0) = real2
Its propensity at T(t) was k(T(t)) = real3
Its activation energy was Eact = real4
Its lattice delta energy of reaction was DElat = real5
where int1 is the KMC step number, str is the name of the elementary step that just occurred,
real1 is the time of occurrence thereof, and int2, int3,… are the indexes of the lattice sites on
which the event took place. The values of real2 and real3 are the propensities of the event at the
initial and current temperature of the simulation (these should be the same if the temperature is
constant). The activation energy and lattice reaction energy (neglecting contributions from gas species)
are given by real4 and real5.
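Event reports can be post-processed with a script, for example to collect activation energies per elementary step. The Python sketch below parses records that follow the format listed above. The exact spacing and number formatting of the real output are assumptions here, so the regular expression tolerates arbitrary whitespace; the sample text is made up for the example.

```python
import re

EVENT_RE = re.compile(
    r"KMC step\s+(?P<step>\d+)\s+"
    r"Elementary step\s+(?P<name>\S+)\s+"
    r"occurred at time t =\s*(?P<time>\S+)\s+"
    r"involving site\(s\):\s*(?P<sites>[\d\s]+?)\s+"
    r"Its propensity at T0 was k\(T0\) =\s*(?P<k0>\S+)\s+"
    r"Its propensity at T\(t\) was k\(T\(t\)\) =\s*(?P<kt>\S+)\s+"
    r"Its activation energy was Eact =\s*(?P<eact>\S+)\s+"
    r"Its lattice delta energy of reaction was DElat =\s*(?P<delat>\S+)"
)


def parse_events(text):
    """Return one dictionary per event report found in the text."""
    events = []
    for m in EVENT_RE.finditer(text):
        events.append({
            "step": int(m["step"]),
            "name": m["name"],
            "time": float(m["time"]),
            "sites": [int(s) for s in m["sites"].split()],
            "k_T0": float(m["k0"]),
            "k_Tt": float(m["kt"]),
            "Eact": float(m["eact"]),
            "DElat": float(m["delat"]),
        })
    return events


SAMPLE = """
KMC step 42
Elementary step CO-O_Oxidation
occurred at time t = 1.25E-03
involving site(s): 12 13
Its propensity at T0 was k(T0) = 4.0E+02
Its propensity at T(t) was k(T(t)) = 4.0E+02
Its activation energy was Eact = 0.45
Its lattice delta energy of reaction was DElat = -0.20
"""
events = parse_events(SAMPLE)
print(events[0]["name"], events[0]["sites"], events[0]["Eact"])
```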

Stiffness Scaling Information


If stiffness scaling has been enabled for treating time-scale separation (see section Treating Fast Quasi-
Equilibrated Processes, page 25), Zacros reports any scaling actions in general_output.txt. The
messages start with either of the following lines:

(i) Stiffness scaling possibly too aggressive at time t = real:
(ii) Stiffness detected at time t = real:
(iii) Stiffness possible at time t = real:

Output line (i) can be generated in step 3.2 of the algorithm on page 26, whereas the output of line (ii)
can be generated in steps 3.6.1.1 and 3.6.2, and that of line (iii) in step 3.6.1.2. Note that such output is
produced only when the stiffness coefficients change, and real gives the time when this happened
during the simulation. Following any of these lines is a description of what was detected and what
actions were taken, for instance it may be reported that all elementary processes were found to be fast
and quasi-equilibrated, or that there were non-equilibrated steps of which the fastest one is reported.
The new stiffness coefficients are subsequently reported.

Performance Monitoring of MPI Runs


MPI runs implementing the Time-Warp algorithm (see section Simulating Very Large Lattices on page 29)
print out the following block with useful information that can be used to assess the performance of the
approach or for troubleshooting.

===============================================================================
Performing collective communications for global virtual time (GVT) computation
===============================================================================
Entered GVT-comp block at: real1
After MPI-reduction: real2
-----------------------------------------------------------------------------
Different times involved in the Time-Warp algorithm
-----------------------------------------------------------------------------
Global virtual time : real3
Global virtual time advancement : real4
Local time : real5
Minimum timestamp among sent messages : real6
Minimum timestamp among received messages : real7
-----------------------------------------------------------------------------
Time-Warp performance statistics in this GVT interval
-----------------------------------------------------------------------------
Number of snapshots taken : int1
Current (and max) size of snapshot queue : int2 of int3
Snapshot step adaptive interval : int4
Number of restore operations performed : int5
KMC time that was rerun due to rollbacks : real8
Ratio of the above over the GVT advancement : real9
KMC time spent in rollback propagation : real10
Ratio of the above over KMC time in rollbacks : real11
===============================================================================

The first two reals are clock times measured from the start of the main KMC loop: real1 is the time at
which the current MPI process entered the code section dealing with global communications, while
real2 is the time at which the reduction operations actually took place (in order to compute the GVT
or decide on simulation termination). These times should be close to each other (differences of tenths of a
second are typically expected) and approximately multiples of the time specified by the
time_interval_gvt_computation keyword. In some instances, one may see larger differences,
on the order of a few seconds; this may happen because at least one MPI process was “busy” doing something
else when all other MPI processes reached the global communications section. If this happens relatively
rarely, it is not a problem, but persistent behavior of this kind might indicate underlying issues. Do not hesitate
to notify us if you see such behavior.

The GVT and the GVT advancement are reported as real3 and real4. The latter is the
value of the GVT in the present reporting block minus that of the previous block. In “well-behaved” runs,
the GVT advancement should be non-zero most of the time. If it is zero in several consecutive blocks,
a warning is issued; this might indicate a problem, or just that the value specified in
time_interval_gvt_computation is too small (in other words, the run spends quite a lot of
time in global communications and not in advancing the simulation time). Moreover, the local time is
reported as real5.

The next two reals, real6 and real7 denote the minimum timestamps for sent and received
messages. These are provided mostly for information and if no messages are stored in the queue the
values printed are huge(1.0_8) = 1.797693134862316E+308.

The next section in the block pertains to Time-Warp performance statistics. Integer int1 denotes the
number of snapshots taken within the interval from the last global communication to the current one.
Moreover, int2 and int3 show the current and maximum number of KMC state snapshots stored in the state queue,
while int4 shows the current value of the snapshot interval, i.e. the number of KMC steps between
successive KMC state snapshots stored in the state queue (this interval is adaptive as explained in the discussion
of the Time-Warp algorithm, in section Overview of the Approach, page 29). If int2 is appreciably
smaller than int3, this means that memory is not utilized effectively; a large portion of the memory
committed for the state queue via the second argument of keyword state_queue_snapshots
remains unutilized. In certain cases, this may be the desired behavior. In general, however, it might be
worth trying with smaller values for the snapshot interval (first argument of keyword
state_queue_snapshots and initial value of int4). Indeed, the efficiency of the Time-Warp
algorithm has been observed to depend strongly on the frequency of snapshot saving, with more
frequent saving resulting in better performance, of course up to a point, in which saving excessively
frequently hinders the progress of the run.

Next, int5 is the number of restore operations performed from the previous up to the current global
communication. It is important to check that the pertinent values reported among different MPI
processes are within the same range, e.g. all somewhere between 900 and 1200 restore operations. Large
differences are strong indications that the computational load is not well balanced. This could happen
for a strongly heterogeneous simulated system, or due to software or hardware issues. For instance, the
processors used may not have the same specifications (e.g. may have different clock speeds) or there
may be background tasks that slow down a certain MPI process. In such cases, the simulation will still
proceed, but at a speed determined by the slowest MPI process.

The next four numbers provide information about the overheads pertinent to the rollbacks and re-
simulations, which are key (and computational-time-consuming) operations of the Time-Warp
algorithm. Thus, real8 reports the cumulative KMC time that had to be rerun due to rollbacks, and
real9 the ratio thereof over the GVT advancement (i.e. real9 = real8/real4). Depending on
the system, this value can range from 4 or 5 to 80 or even higher for “difficult” systems.
Clearly, the higher the value for a given run, the less efficient the Time-Warp algorithm is for that
simulation, since Zacros repeats segments of the simulated trajectory several times, until all boundary
conflicts are resolved.

Out of the re-simulated segments, some time is spent in “rollback propagation”; for instance, if a
boundary conflict arises at time 0.126 (measured in the time units of the KMC simulation) and the most
recent KMC state saved in the state queue has a timestamp of 0.110, then 0.016 time units must be
re-simulated in rollback propagation mode. The total time re-simulated in rollback propagation (from
the previous global communication until the current one) is reported in real10. Moreover, real11 is
the ratio between the KMC time spent in rollback propagation and the total KMC time that had to be
re-simulated due to rollbacks (i.e. real11 = real10/real8). This ratio must be less than 1 and should
be kept sufficiently low by choosing a high enough frequency of snapshot saving.
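The two overhead ratios described above can be recomputed from the reported values as a quick sanity check. The following is a minimal sketch; the function and argument names are illustrative assumptions and are not part of Zacros.

```python
def rollback_overheads(gvt_advancement, rerun_time, rollback_propagation_time):
    """Return (real9, real11) computed from real4, real8 and real10 of one block.

    All argument names are hypothetical labels for the values printed by Zacros.
    """
    # real9: re-simulated KMC time per unit of GVT advancement.
    if gvt_advancement > 0:
        real9 = rerun_time / gvt_advancement
    else:
        real9 = float("inf")  # no GVT progress in this block
    # real11: fraction of the re-simulated time spent in rollback propagation.
    real11 = rollback_propagation_time / rerun_time if rerun_time > 0 else 0.0
    return real9, real11


# Example with made-up numbers: real4 = 0.5, real8 = 4.0, real10 = 1.0
real9, real11 = rollback_overheads(0.5, 4.0, 1.0)
```

With these made-up inputs, 4.0 units of KMC time were re-simulated to advance the GVT by 0.5 units, and a quarter of that re-simulated time was rollback propagation.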

Simulation End
At the end of the simulation, right after the message “Simulation stopped”, the KMC time, the total
number of elementary events simulated, and the event frequency are reported:

Current KMC time: real1
Events occurred: int
Event frequency: real2

Performance Facts
Metrics about the performance of the program are also reported right after the message “Performance
facts”:

Elapsed CPU time: real1 seconds
Elapsed clock time: real2 seconds
Setup clock time: real3 seconds
Simulation clock time: real4 seconds

Clock time per KMC event: real5 seconds
Clock time per KMC time: real6 seconds/KMCTimeUnits

Events per clock hour: int
KMC Dt per clock hour: real7 KMCTimeUnits

In the above, real1 gives the CPU time spent whereas real2 is the real time, which we could, for
instance, measure using a stopwatch. The wall_time constraint is imposed on real time (see section
Simulation Input File). If the code was compiled as a serial application, real1 and real2 should be
approximately the same. Moreover, real3 gives the time required to set up the simulation (including
the time for parsing the input and setting up the data-structures), whereas real4 reports the time
spent in the actual KMC loop (involving event execution, update and reporting). The two values real3
and real4 should sum up to real2.

The value of real5 gives the real time needed on average to execute a single KMC step. This time is
reported in seconds and is used to compute how many events could be executed if the simulation were
left running for one hour, as reported in the value of int.

Moreover, real6 gives the real time needed on average to propagate the system for 1 unit of KMC
time. real7 shows how far in KMC time the system will go in one hour of real time.

Note that the performance metrics are not aggregated if the simulation is run in multiple chunks by use
of the restart feature. Thus, every time the simulation is restarted, Zacros resets the counters used to
evaluate these performance metrics.
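The hourly figures follow directly from the per-event and per-KMC-time values, since one hour is 3600 seconds. The sketch below illustrates this relation; the function name is hypothetical, and rounding to the nearest integer is an assumption (the guide does not state how Zacros converts the event count to an integer).

```python
def hourly_rates(clock_time_per_event, clock_time_per_kmc_time):
    """Return (events per clock hour, KMC time per clock hour).

    clock_time_per_event corresponds to real5 (seconds) and
    clock_time_per_kmc_time to real6 (seconds per KMC time unit).
    """
    seconds_per_hour = 3600.0
    # Assumption: rounded to the nearest integer for reporting.
    events_per_hour = round(seconds_per_hour / clock_time_per_event)
    kmc_dt_per_hour = seconds_per_hour / clock_time_per_kmc_time
    return events_per_hour, kmc_dt_per_hour


# Example with made-up numbers: 0.5 s per event, 7200 s per KMC time unit
events_per_hour, kmc_dt_per_hour = hourly_rates(0.5, 7200.0)
```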


Newton's Method Statistics


If a temperature ramp has been specified, Zacros solves a non-linear equation to find the time of
occurrence of each elementary event.1 Statistics about the performance of the Newton-Raphson
method are accrued during the simulation and will be aggregated if the simulation is restarted. The
results are reported in this section and look like the following:

Total number of times run: int1
Number of times failed: int2
Avg number of iterations: real1
Maximum Dx error: real2
Maximum RHS error: real3

Note that the total number of times run (int1) is not equal to the number of KMC events simulated.
This happens because in the course of the KMC simulation there are always processes that are detected
but subsequently removed if any of the participating adsorbates “decides to do something else”. The
number of times failed, int2, should be zero. A non-zero value indicates that on one or more occasions
(as many as int2) the Newton-Raphson loop went through the maximum number of iterations (150 by
default) without converging, which may be cause for concern. The maximum errors are also reported:
real2 is the maximum norm of the difference between subsequent approximations of the solution,
whereas real3 is the maximum norm of the right-hand side. Both tolerances are 10⁻⁹ by default. Refer
to section Simulation Input File, page 18, on how to override these default tolerances and the maximum
number of iterations.
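To illustrate how the two reported errors relate to the convergence test, here is a minimal sketch of a scalar Newton-Raphson loop with the default limits quoted above (150 iterations, tolerances of 10⁻⁹). The toy equation, the function names and the exact form of the stopping test are assumptions for illustration; they are not taken from the Zacros source.

```python
import math


def newton_raphson(f, dfdx, x0, tol_dx=1e-9, tol_rhs=1e-9, max_iter=150):
    """Solve f(x) = 0 and return (x, iterations, converged)."""
    x = x0
    for iteration in range(1, max_iter + 1):
        dx = -f(x) / dfdx(x)   # Newton step
        x += dx
        # "Dx error": size of the last step; "RHS error": residual of the equation.
        if abs(dx) < tol_dx and abs(f(x)) < tol_rhs:
            return x, iteration, True
    # Reaching this point corresponds to a "failed" run in the statistics.
    return x, max_iter, False


# Toy equation (not the one Zacros solves): x - exp(-x) = 0
root, n_iter, converged = newton_raphson(
    lambda x: x - math.exp(-x),
    lambda x: 1.0 + math.exp(-x),
    x0=0.0,
)
```

A run that exhausts max_iter without satisfying both tests would be counted under “Number of times failed”.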

Execution Queue Statistics


During a run, Zacros monitors the key operations of the data structure responsible for the queueing of
KMC events. These operations include: (i) the insertion of a newly detected event in the queue, (ii) the
removal (deletion) of an event from the queue, and (iii) the update of the time of occurrence of an event
already in the queue. The relevant statistics are reported in a section of general_output.txt,
whose content is structured as follows:

Number of insertions: int1
Number of removals: int2
Number of updates: int3

The meaning of the integers int1, int2 and int3 is self-explanatory. These numbers can sometimes
help do some quick sanity checks, e.g. for a simulated system for which the cluster expansion contains
only single body terms (no lateral interactions), one would expect no updates (int3 should be zero). Be
cautious if you are using these numbers to try to deduce where bottlenecks lie in your simulation. In
particular, keep in mind that the detection of reaction or energetic interaction patterns may be much
more time-consuming than these queueing operations.

Memory Usage Statistics


The memory footprint of a KMC simulation is monitored during the run, and the relevant statistics are
output in a section titled “Memory usage statistics”. These statistics are useful for optimizing
memory allocations using the keyword override_array_bounds (see section Memory
Management on page 37). The information printed in general_output.txt adheres to the
following structure:

Used capacity of process queue: int1
...out of max-allocated: int2 ( real1 % utilization )

Used capacity of process-participation list: int3
...out of max-allocated: int4 ( real2 % utilization )

Used capacity of cluster list: int5
...out of max-allocated: int6 ( real3 % utilization )

Used capacity of cluster-participation list: int7
...out of max-allocated: int8 ( real4 % utilization )

The integers int2, int4, int6 and int8 report, respectively, the values of the maximum allowed
numbers of events in the entire lattice ($N_{\mathrm{max\,events}}$), events per adsorbate
($N_{\mathrm{max\,events}}^{\mathrm{adsorb.}}$), energetic clusters in the entire lattice
($N_{\mathrm{max\,clusters}}$) and energetic clusters per adsorbate
($N_{\mathrm{max\,clusters}}^{\mathrm{adsorb.}}$), as calculated from equations 1-4 (see section
Memory Management on page 37). On the other hand, int1, int3, int5 and int7 report on the actual
memory utilization. Thus, int1 is the maximum number of events in the entire lattice that had to be
retained during the course of the simulation, int3 is the maximum number of events per adsorbate
encountered, etc. The real numbers real1, …, real4 denote the percent utilization of the memory
committed for each data structure; thus, real1 = 100*int1/int2, real2 = 100*int3/int4, etc.
Based on these numbers, one can easily adjust the parameters $\mu_{\mathrm{events}}$,
$\mu_{\mathrm{events}}^{\mathrm{adsorb.}}$, $\lambda_{\mathrm{clusters}}$ and
$\lambda_{\mathrm{clusters}}^{\mathrm{adsorb.}}$, so as to improve memory utilization. For instance,
if $\mu_{\mathrm{events}}$ = 50 (value of int2) and the utilization reported by real1 is e.g. 1.15%,
one could reduce $\mu_{\mathrm{events}}$ from 50 to 1 using the expression
“override_array_bounds 1 & & &” in simulation_input.dat, thereby increasing the memory
utilization to about 58%.
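The arithmetic behind the 58% figure can be sketched as follows, under the assumption (implied by the example, but not stated explicitly here) that the allocated capacity scales linearly with the multiplier while the used capacity stays the same. The function name is hypothetical.

```python
def utilization_after_override(utilization_percent, old_multiplier, new_multiplier):
    """Estimate the percent utilization after changing a capacity multiplier.

    Assumes the allocated capacity is proportional to the multiplier and that
    the used capacity does not change between runs.
    """
    return utilization_percent * old_multiplier / new_multiplier


# Example from the text: 1.15% utilization with a multiplier of 50, reduced to 1
estimate = utilization_after_override(1.15, 50, 1)   # about 57.5, i.e. roughly 58%
```

An estimate above 100% would mean that the proposed multiplier is too small for the run.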

Finally, if the simulation has terminated successfully, the message “> Normal termination <” is
written at the end of the file general_output.txt. If the simulation is being restarted, a
short message appears providing information about how many times the simulation has been restarted
previously, the last reported time and the number of KMC events. The message “> Normal
termination <” is also written at the end of every restart session.

Lattice Output File


After parsing the lattice setup information, Zacros writes the file lattice_output.txt, or files
lattice_output.txt, lattice_output_1.txt etc. for MPI process 0, 1, … in distributed
runs. These files summarize the lattice structure. The first two lines of this file follow the format:

0 real1 real2 0 0 …
0 real3 real4 0 0 …

namely integer-type zeroes everywhere, except the 2nd and 3rd element of each row. These non-zero
elements give the two vectors defining the entire simulation box in row format, namely α = (real1,
real2) and β = (real3, real4). If the lattice has been defined using the keyword explicit, these
real numbers have values of zero. For distributed runs, these values correspond to the vectors of the
entire domain, not just the subdomain of the corresponding MPI process.

The third and following lines give all the information pertinent to each site of the lattice, following the
format:

int1 real1 real2 int2 int3 int4 int5 …

where:

int1 (1st column) is the index of the site (ranging from 1 to the total number of sites),

real1 and real2 (2nd and 3rd columns) are the x and y Cartesian coordinates of site int1,

int2 (4th column) gives the site type of the site with index int1,

int3 (5th column) gives the coordination number of the site with index int1,

int4 int5 … (6th and following columns) give the 1st nearest neighbors of the site with index int1.
Zacros always reports as many integers here as the maximum coordination number, writing zeroes after
the last nearest neighbor.

For distributed runs, each MPI process reports information about the sites of the internal part of its
subdomain (not the halo). Thus, in the example lattice of Figure 1, MPI process 0 will write information
about sites 1, 2 and 3 on lines 3, 4, and 5 of lattice_output.txt, followed by information about
sites 7, 8, 9 on lines 6, 7 and 8, and so on. However, the neighbor lists will include site numbers of
neighbors outside the internal part of the subdomain. Thus, MPI process 0 will list the following sites as
neighbors of site 1: 2, 7, 6, 31 (even though sites 6 and 31 fall outside the subdomain interior).
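The column layout described above lends itself to a short reader. The following is a minimal sketch that parses text in this layout into Python structures; it assumes whitespace-separated columns and numbers that Python's float() accepts, and the function name and returned dictionary keys are illustrative choices.

```python
def parse_lattice_output(text):
    """Parse lattice_output.txt-style text into (alpha, beta, sites)."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    # First two lines: simulation box vectors in the 2nd and 3rd columns.
    alpha = (float(lines[0][1]), float(lines[0][2]))
    beta = (float(lines[1][1]), float(lines[1][2]))
    sites = {}
    for fields in lines[2:]:
        index = int(fields[0])
        coordination = int(fields[4])
        # Only the first `coordination` neighbor entries are real; the rest are zero padding.
        neighbors = [int(n) for n in fields[5:5 + coordination]]
        sites[index] = {
            "xy": (float(fields[1]), float(fields[2])),
            "site_type": int(fields[3]),
            "neighbors": neighbors,
        }
    return alpha, beta, sites
```

For distributed runs, the neighbor lists returned by such a reader may refer to sites that are not keys of the dictionary, because they lie outside the interior of the subdomain.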

History Output File


During the course of a simulation, Zacros takes snapshots of the lattice state and writes them in file
history_output.txt, along with other pertinent information. In distributed runs, the files
generated are history_output.txt (by MPI process 0), history_output_1.txt (by MPI
process 1), etc. The frequency at which snapshots are taken is defined by keyword snapshots in
file simulation_input.dat (see section Simulation Input File, page 18). The structure of the
history output file(s) is discussed in the following.

The first few lines of the file constitute a header, with general information about the simulation:

Gas_Species: str1 str2 …
Surface_Species: str3 str4 …
Simulation_Box:
real1 real2
real3 real4
Site_Types: str5 str6 …


These are mostly self-explanatory. Note, however, that for explicitly defined lattices (see keyword
explicit in section Explicitly Defined Custom Lattices, page 41), the Simulation_Box information does
not appear.

The rest of the file consists of sections beginning with the word “configuration” followed by information
structured as discussed below.

configuration int1 int2 real1 real2 real3
int3 int4 int5 int6
⋮
int7 int8 …

In the structure just shown, int1 is a counter showing how many configurations have been written so
far in file history_output.txt. Integer int2 gives the number of KMC events that have happened
up to that point. The next three reals real1, real2 , real3, give the time, temperature and the
energy of the current lattice configuration. The subsequent lines contain integers that encode the state
of the lattice; there are as many such lines as the number of lattice sites. The information is presented
as follows:

int3 (1st column) is the site number on the lattice,

int4 (2nd column) gives the entity/adsorbate number (the unique identifier of each adsorbate on the
lattice),

int5 (3rd column) denotes the species number (zeroes are reported for empty sites),

int6 (4th column) gives the dentate number with which entity int4 occupies site int3.

Finally, the last line (int7, int8, …) gives the number of molecules produced (or consumed if the
corresponding value is negative) for each gas species in the chemistry. The order in which these
numbers are reported is the same as the order in which the gas species were defined (see
gas_specs_names and pertinent keywords in section Simulation Input File, page 20) and listed
in the header of history_output.txt. Thus, int7 refers to species str1, int8 to
species str2 etc.

For distributed runs, each MPI process reports the states of sites in the interior as well as in the halo of
the subdomain it handles. Moreover, each line encoding the state of a site contains 5 integers: the first 4
(int3 – int6) are as discussed earlier, while the 5th integer takes the value of 1 if the site is in the halo
of the subdomain, otherwise the value of 0 (interior site).
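As an illustration of the structure just described, the sketch below splits history_output.txt-style text into configurations. It assumes that every configuration block ends with exactly one line of gas species counts and that all other lines of the block describe sites; the function name and dictionary keys are hypothetical, and the header is simply skipped.

```python
def parse_history(text):
    """Return a list of configurations read from history_output.txt-style text."""
    configurations = []
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "configuration":
            current = {
                "index": int(fields[1]),
                "n_events": int(fields[2]),
                "time": float(fields[3]),
                "temperature": float(fields[4]),
                "energy": float(fields[5]),
                "rows": [],
            }
            configurations.append(current)
        elif current is not None:
            current["rows"].append([int(value) for value in fields])
        # Lines before the first "configuration" belong to the header and are skipped.
    for config in configurations:
        rows = config.pop("rows")
        config["gas_counts"] = rows[-1]   # last line: molecules produced/consumed
        config["sites"] = rows[:-1]       # one row per lattice site
    return configurations
```

For distributed runs, each site row would carry a fifth integer (the halo flag), which this sketch keeps as the last element of the row.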

Process Statistics Output File


During the course of a simulation, Zacros collects statistical information about the occurrence of
elementary events, which is reported in file procstat_output.txt. This statistical information is
always updated after every KMC event, whereas the frequency at which it is reported is defined by
keyword process_statistics in file simulation_input.dat (see section Simulation Input
File, page 22). As for the other output files, in distributed runs each MPI process generates its own file
and MPI processes with rank ≥ 1 append _{rank} to their filename. Each of these files contains
information about the events that were scheduled and executed by that MPI process, i.e. ignoring
messaged events. The contents of the procstat_output.txt (family of) file(s) are structured as
follows.

The first line of the file constitutes a header following the format:

Overall str1 str2 …

The word “Overall” appears always first and is followed by strings that correspond to the names of all
elementary events defined in file mechanism_input.dat.

The rest of the file consists of sections beginning with the word “configuration” followed by information
structured as discussed below.

configuration int1 int2 real1
real2 real3 real4 …
int3 int4 int5 …

In the above, int1 is a counter showing how many configurations have been written so far in file
procstat_output.txt. Integer int2 gives the number of KMC events that have happened up to
that point and real1 the current time. The next two lines provide statistical information about each
elementary step of the mechanism in the same order as mentioned in the header.

Thus, real2, real3, real4, … give the average waiting times τk (also referred to as inter-arrival
times) for each reaction event:

$$\tau_k = \frac{1}{N_k^{\mathrm{occur}}} \sum_{i=1}^{N_k^{\mathrm{occur}}} \tau_{k,i} \qquad (15)$$

where the averaging is done every time elementary event k occurs. Thus, $N_k^{\mathrm{occur}}$ is the
number of times event k was executed so far in the KMC simulation, and τk,i is the waiting time for that
event to occur (the waiting time for event k is by definition the time that passed since the occurrence of
the most recent event of any type). The value of real2 refers to an “overall” average in which all events
are considered.

Moreover, int3, int4, int5, … give the numbers of times each event was executed during the KMC
simulation. The value of int3 refers to the total number of events and should be the same as the value of
int2 in the preceding “configuration” line of file procstat_output.txt. Moreover, the values of int4, int5, … should
sum up to that of int3.
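The two consistency conditions mentioned above are easy to check automatically. The following is a minimal sketch that reads procstat_output.txt-style text and tests them; it assumes that each block consists of exactly three non-empty lines after the header, and the function names and dictionary keys are hypothetical.

```python
def parse_procstat(text):
    """Return (names, entries) from procstat_output.txt-style text."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    names = lines[0]   # "Overall" followed by the elementary event names
    entries = []
    # After the header, blocks come in groups of three lines.
    for i in range(1, len(lines) - 2, 3):
        head, waits, counts = lines[i], lines[i + 1], lines[i + 2]
        entries.append({
            "n_events": int(head[2]),
            "time": float(head[3]),
            "avg_wait": [float(v) for v in waits],
            "counts": [int(v) for v in counts],
        })
    return names, entries


def counts_are_consistent(entry):
    """Check that the overall count matches both n_events and the per-event sum."""
    overall, per_event = entry["counts"][0], entry["counts"][1:]
    return overall == entry["n_events"] and overall == sum(per_event)
```

The lists in each entry are in the same order as the names in the header, so names[0] is “Overall”.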

Species Numbers Output File


Zacros also reports the number of surface and gas species along with other pertinent information in file
specnum_output.txt. As for the other output files, in distributed runs each MPI process generates
its own file and MPI processes with rank ≥ 1 append _{rank} to their filename. Each of these files
contains information about the species in the interior of the subdomain handled by the MPI process, as
well as about the events that were scheduled and executed by that MPI process and the gas phase
molecules produced or consumed due to these events (i.e. ignoring messaged events). The frequency at
which this information is reported is defined by keyword species_numbers in file
simulation_input.dat (see section Simulation Input File, page 22). The contents of this file are
largely self-explanatory and are summarized in the first line (header) of the file. The overall structure
is as follows:

Entry Nevents Time Temperature Energy str1 str2 … str3 str4 …
int1 int2 real1 real2 real3 int3 int4 … int5 int6 …
⋮

Entry refers to an integer counting how many lines have been written to this output file. The column
marked as “Nevents” shows the total number of KMC events that happened up to that point, followed
by a column that shows the (simulated) time passed. The next column gives the temperature which
should be constant unless a temperature ramp has been specified (see keyword temperature in
section Simulation Input File, page 19). The column marked as “Energy” gives the energy of the
current lattice configuration. The following columns marked as str1, str2, … report the number of
molecules of each species currently adsorbed on the lattice (the strings are the names of the surface
species). Note that total numbers are reported; thus, if a species can bind to two different sites, this
output does not provide any information as to how many particles are bound to sites of type 1 versus 2.
Finally, the columns marked as str3, str4, … report the number of molecules of each gas species. For
species that appear as products in the net reaction under consideration, one should expect to see
positive numbers in this column. Negative numbers would be reported for reactant species.
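Because the header names every column, the file can be read into one dictionary per row. The sketch below does this under the assumptions that columns are whitespace-separated, that species names contain no spaces, and that surface and gas species names are distinct; the function name is hypothetical.

```python
def parse_specnum(text):
    """Return a list of rows (dicts keyed by column name) from specnum_output.txt-style text."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header = lines[0]
    rows = []
    for fields in lines[1:]:
        row = {}
        for column, (name, value) in enumerate(zip(header, fields)):
            # Columns 2-4 (Time, Temperature, Energy) are reals; all others are integers.
            row[name] = float(value) if 2 <= column <= 4 else int(value)
        rows.append(row)
    return rows
```

For example, the net number of gas molecules of a species produced between two reporting points is the difference of the corresponding column values in the two rows.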

Energetics Lists Output File


To enable the monitoring of the energetic interaction patterns contributing to the total energy of the
system during the course of the simulation, one can use keyword energetics_lists (see section
Simulation Input File, page 23). In this case, file energlist_output.txt is generated; additionally,
files energlist_output_{rank}.txt are generated by MPI processes with rank ≥ 1, in the case
of distributed runs. This type of file starts with a short header section reporting the clusters’ names,
energies and graph multiplicities defined in energetics_input.dat (see section Energetics Input
File, page 47), as well as the total number of sites on the lattice:

Clusters: str1 str2 str3 …
Cluster_Energies: real1 real2 real3 …
Cluster_Graph_Multiplicities: int1 int2 int3 …
Number_Of_Sites: int4

The rest of the file consists of sections with the following structure:

energetics_list_entry int1 int2 real1 real2 real3
int3 int4
int5 int6 int7 …
⋮


where integer int1 is a counter showing how many configurations have been written so far in file
energlist_output.txt and integer int2 gives the number of KMC events that have happened
up to that point. real1 is the current time, real2 the current temperature and real3 the current
total energy of the lattice. Next, int3 shows the number of energetic clusters reported whereas int4
is the total number of clusters detected (all of which contribute to the total energy of the lattice). These
two values should be the same, unless selective reporting of clusters has been specified by using the
keyword select_cluster_type (see section Simulation Input File, page 23). The next lines contain
the list of clusters: integer int5 is the index of a cluster (for instance, if int5 = 2, the cluster’s name is
str2 as reported in the header section, etc.). The following integers on the same line give the lattice
sites on which this cluster was detected. For single-site patterns (e.g. on-site formation energy of a
mono-dentate species) only one integer int6 would appear. For multi-site patterns there are as many
integers as the number of sites in cluster int5.

For distributed runs, each MPI process reports the instances of energetic patterns (clusters) that are in
the interior of its subdomain. This condition is satisfied if the site covered by the first dentate of the first
molecule of the cluster is not in the halo of the subdomain. This “accounting scheme” ensures that each
cluster instance will be reported only once within the output files of all the MPI processes.
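A common use of this file is to count how many instances of each cluster are present at each reporting point. The following is a minimal sketch of such a count; it assumes that cluster names contain no spaces and that the number of cluster lines in each entry equals int3. The function name is hypothetical.

```python
from collections import Counter


def count_cluster_instances(text):
    """Return one Counter (cluster name -> number of instances) per entry."""
    names, results = [], []
    lines = iter(line for line in text.splitlines() if line.strip())
    for line in lines:
        fields = line.split()
        if fields[0] == "Clusters:":
            names = fields[1:]
        elif fields[0] == "energetics_list_entry":
            # Next line: number of clusters reported (int3) and detected (int4).
            n_reported = int(next(lines).split()[0])
            counts = Counter()
            for _ in range(n_reported):
                cluster_index = int(next(lines).split()[0])   # 1-based index (int5)
                counts[names[cluster_index - 1]] += 1
            results.append(counts)
    return results
```

In distributed runs, summing the counters obtained from the files of all MPI processes gives the totals for the entire lattice, since each cluster instance is reported only once.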

Process Lists Output File


Similarly, to enable the monitoring of lattice processes existing in the KMC event queue during the
course of the simulation, one can use keyword process_lists (see section Simulation Input File,
page 23). In this case, file proclist_output.txt is generated; for distributed runs, files
proclist_output_{rank}.txt are additionally generated by MPI processes with rank ≥ 1. This
type of file starts with a header reporting the elementary events’ names str1, str2, str3 …
as defined in mechanism_input.dat (see section Mechanism Input File, page 52), as well as the
total number of sites on the lattice int1:

Elementary_Events: str1 str2 str3 …
Number_Of_Sites: int1

This header is followed by sections with the following structure:

process_list_entry int1 int2 real1 real2 real3
int3 int4
int5 real1 real2 real3 real4 int6 int7 …
⋮

where integer int1 is a counter showing how many configurations have been written so far in file
proclist_output.txt and integer int2 gives the number of KMC events that have happened
up to that point. real1 is the current time, real2 the current temperature and real3 the current
total energy of the lattice. Next, int3 shows the number of lattice processes reported whereas int4 is
the total number of (imminent) lattice processes stored in the event queue. These two values should be
the same, unless selective reporting of processes has been specified by using the keyword
select_event_type (see section Simulation Input File, page 23). The next lines contain the list of
processes. On each of these lines, int5 is an index pointing to an elementary event (for instance, if
int5 = 3, the elementary event’s name is str3 as reported in the header section, etc.), real1 gives the
propensity of the event at the initial temperature (beginning of simulation), real2 gives the propensity
at the current temperature, real3 is the activation energy, and real4 is the lattice reaction energy of the
event. The following integers on the same line (int6, int7, etc.) give the lattice sites on which this
process was detected. For single-site patterns (e.g. adsorption/desorption of a mono-dentate species)
only one integer, int6, would appear. For multi-site patterns there are as many integers as the number
of sites in elementary event int5.

For distributed runs, each MPI process reports the instances of elementary events (lattice processes)
that are in the interior of its subdomain. This condition is satisfied if the first site of the pattern is not in
the halo of the subdomain. Notice that this “accounting scheme” uses a slightly different convention from
the one used in the energetics lists, but still ensures that each lattice process will be reported only once
within the output files of all the MPI processes.

Energetics Debug Output File


The output file globalenerg_debug.txt is generated if simulation_input.dat contains the
keyword debug_report_global_energetics (see section Simulation Input File, page 39), and
provides a full account of the bookkeeping related to energetics in the course of a KMC simulation (for
distributed runs, additional files are generated by MPI processes with rank ≥ 1, with filenames appended
by _{rank}). In particular, the file contains sections starting with the words “Initialization”,
and “KMC step int” and ending with the expression “Current total lattice energy is
real”. In these sections one or more of the following expressions can be contained:

Total empty-cluster energy constant = real

This constant will be zero, unless an “empty cluster” has been specified. The latter is a cluster involving
a single site with an unspecified state (using keyword &; see section Energetics Input File, page 49).

Global-cluster int1 identified:
Cluster number: int2
Cluster description: str
Mapping of lattice to pattern sites: int3 int4 …
Cluster graph-multiplicity: int5
Its energy contribution is real

The expressions above are written when a pattern representing an energetic contribution has been
detected in the current lattice configuration. During the course of the simulation, Zacros keeps a list of
all such patterns, so that it can quickly compute changes in the lattice energy when
adsorption/desorption, diffusion and reaction events take place. Thus, int1 is the index in this list of
patterns, str is the name of the pattern just detected (one of the cluster names defined in
energetics_input.dat; see section Energetics Input File, page 48); int3 int4 … give the
location of this pattern on the lattice; int5 and real just repeat the graph multiplicity and energy
contribution values that were defined in energetics_input.dat using the keywords
graph_multiplicity and cluster_eng, respectively.

Cluster int was removed.

The message above indicates that an energetic contribution was removed from the list, because the
corresponding pattern ceased to exist.

Cluster int1 was relabeled to int2.

Regarding this message, note that Zacros stores the list of patterns in a data-structure in which each
pattern is indexed by an integer ranging from 1 to the total number of patterns NTot. To avoid generating
gaps in this data-structure upon removal of a pattern Nremv, the last pattern with index NTot takes the
index Nremv, so that the new set of indexes ranges from 1 to NTot – 1. The message above indicates that
such a re-indexing took place, with NTot = int1 and Nremv = int2.
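The re-indexing scheme described above (moving the last pattern into the gap left by the removed one) can be sketched as follows. This is an illustration of the general technique on a plain Python list, not the Zacros implementation; the function name is hypothetical.

```python
def remove_pattern(patterns, n_remv):
    """Remove the pattern with 1-based index n_remv, keeping the indexes gap-free.

    Returns (n_tot, n_remv) if the last pattern was relabeled, or None if the
    removed pattern was itself the last one (no relabeling needed).
    """
    n_tot = len(patterns)
    last = patterns.pop()           # the list now holds n_tot - 1 entries
    if n_remv == n_tot:
        return None
    patterns[n_remv - 1] = last     # pattern n_tot takes the index n_remv
    return (n_tot, n_remv)


patterns = ["a", "b", "c", "d"]
relabel = remove_pattern(patterns, 2)   # "d" (index 4) is relabeled to index 2
```

The removal costs a constant number of operations regardless of the list length, at the price of not preserving the original order of the patterns.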

Process Debug Output File


The output file process_debug.txt is generated if simulation_input.dat contains the
keyword debug_report_processes (see section Simulation Input File, page 40), and provides a
full account of the bookkeeping related to elementary event occurrence in the course of a KMC
simulation (for distributed runs, additional files are generated by MPI processes with rank ≥ 1, with
filenames appended by _{rank}). In particular, the file contains sections starting with the words
“Initialization”, and “KMC step int”. In these sections the following expressions can be
contained:

Process int1 identified:
Elementary step number: int2
Elementary step description: str
Mapping of lattice to pattern sites: int3 int4 …
Its activation energy at the zero-coverage limit is real1
Its activation energy for the given configuration is real2
Its energy of reaction at the zero-coverage limit is real3
Its energy of reaction for the given configuration is real4
Its propensity at T0 is real5
It will occur at t = real6 after Dt = real7

The expressions above are output when a pattern representing an elementary process has been
detected in the current lattice configuration. During the course of the simulation, Zacros keeps a list of
all such patterns in a heap data-structure, in order to be able to find in constant time the next event to
take place. Thus, int1 is a unique identifier in this heap, str is the name of the pattern just detected
(one of the elementary event names defined in mechanism_input.dat; see section Mechanism
Input File); int3 int4 … give the location of this pattern on the lattice; real1 is the activation
energy at the zero-coverage limit ($E^{\ddagger}_{\mathrm{fwd},0}$ or $E^{\ddagger}_{\mathrm{rev},0}$ in
equations 10, 11); real2 is the actual activation energy for the current configuration
($E^{\ddagger}_{\mathrm{fwd}}(\sigma)$ or $E^{\ddagger}_{\mathrm{rev}}(\sigma)$ in equations 10, 11);
real3 is the energy of reaction at the zero-coverage limit ($\Delta E_{\mathrm{rxn},0}$ in equations 10,
11); real4 is the actual energy of reaction for the current configuration
($\Delta E_{\mathrm{rxn}}(\sigma)$ in equations 10, 11). The value of real5 gives the propensity
(equations 6, 7) at the initial temperature of the simulation (which would be the same throughout the
simulation if no temperature ramp has been defined). Finally, the random time for the occurrence of
that event is reported in the last line: real6 is the absolute time of occurrence and real7 is the time
increment (relative to the current time, at which the process was detected).
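The role of the heap can be illustrated with Python's standard heapq module, which keeps the entry with the smallest key at position 0. This is only a sketch of the general idea: the (time, process id) tuples and function names are assumptions, and heapq does not provide the in-place updates and removals of arbitrary entries that an event queue such as the one described here needs.

```python
import heapq


def build_event_queue(events):
    """Build a min-heap of (time_of_occurrence, process_id) tuples."""
    queue = list(events)
    heapq.heapify(queue)
    return queue


def peek_next_event(queue):
    """Return the earliest event without removing it; this is a constant-time lookup."""
    return queue[0]


def pop_next_event(queue):
    """Remove and return the earliest event; restoring the heap costs O(log n)."""
    return heapq.heappop(queue)


# Made-up absolute times of occurrence for three detected processes
queue = build_event_queue([(0.42, 3), (0.17, 1), (0.88, 2)])
next_time, next_process = peek_next_event(queue)
```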

Process int was removed.

The message above indicates that an elementary process was removed from the list, because the
corresponding pattern ceased to exist.

Process int1 was relabeled to int2.

This message indicates that a process has been re-indexed to avoid generating gaps in the heap data-
structure upon removal of a pattern. Thus, if pattern Nremv is being removed, the last pattern with index
NTot takes the index Nremv, so that the new set of indexes ranges from 1 to NTot – 1. The message above
indicates that such a re-indexing took place, with NTot = int1 and Nremv = int2.

Notes on Troubleshooting
Zacros is able to identify syntax errors in the input files. If such an error is detected, the program will
report an error with a detailed description of what the problem was and in which line of which file it was
encountered.

In some cases though, the syntax may be perfectly valid but the specification might not be the one
intended. The following notes provide some hopefully useful considerations and guidelines for
troubleshooting.

1. Numbering/Naming consistency: make sure your numbering and naming are consistent throughout
your input. For instance, the order in which the names of surface species appear after keyword
surf_specs_names must match the order of their dentate numbers after keyword surf_specs_dent.
The same applies to the gas species definitions.

2. Pattern consistency: make sure that the binding configurations of different species are used in
a consistent way in the following: the seeding instructions of state_input.dat (see section Initial
State Input File, page 61), the energetic clusters of energetics_input.dat (see section
Energetics Input File, page 47) and the elementary events of mechanism_input.dat (see section
Mechanism Input File, page 52). For instance, if the binding configuration of a bidentate species has
been defined with dentate 1 occupying a site of type “top” and with dentate 2 occupying a site of
type “fcc”, this convention should be followed throughout. If an individual seeding instruction
(keyword seed_on_sites, section Initial State Input File, page 61) places this species on the
wrong sites, Zacros will execute the instruction, but since no cluster contribution pattern will be
detected, the lattice energy will remain the same after addition of this species.

3. Units and conventions: the values entered for parameters such as rate constants, energies or
angles in patterns must adhere to Zacros’s unit system and sign conventions. For instance,
energies are given in eV (unless you have recompiled Zacros for a different unit system, see section
Units and Constants on page 17). Values for angles are signed, with positive values denoting the
counter-clockwise direction (see keyword angles on pages 50 and 58).

4. Quick checks: it is worth checking the summary of energetics and mechanism specifications in file
general_output.txt (see section General Output File, page 63). If the patterns that appear
there are not the ones intended, there may be a problem with the input. In some cases, the program
will issue warnings that may not have catastrophic consequences, but potentially need to be
addressed.

5. Advanced checks: one can frequently discover problems in the simulation setup by making use of the
debugging keywords:

debug_report_processes
debug_report_global_energetics
debug_newtons_method
debug_check_processes
debug_check_lattice
debug_check_caching

in simulation_input.dat (see section Simulation Input File, page 18) along with the output
information of the debugging files:

globalenerg_debug.txt
process_debug.txt

For instance, to make sure that the energetics model was defined properly, one could work out an
example problem, specify a configuration in file state_input.dat and see if the clusters are
being detected properly.

Known Issues/Limitations
1. If you try to compile Zacros with CMake and the build system fails to find a compiler, check the FC
environment variable by running the following command:

echo $FC

If this returns an empty string, then you may try setting the compiler by:

export FC=ifort

where ifort can be substituted with another compiler available on your system.

Another way to instruct CMake to use a specific compiler is the -DCMAKE_Fortran_COMPILER
option. As an example, in order to employ gfortran-6 as the compiler of choice, e.g. in case of
multiple available versions on your system, you can use:

cmake .. -DCMAKE_BUILD_TYPE=Release -Ddoopenmp=off -Ddompi=off -DCMAKE_Fortran_COMPILER=gfortran-6

2. For input files, the maximum record length that can be parsed is $2^{13}$ = 8192 characters. A maximum
of 3000 words can be parsed. The maximum allowed length for the names of species, site-types,
clusters and mechanism-steps is 64 characters. These limits can be changed by redefining the
appropriate constants in file constants_module.f90 and recompiling Zacros.
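
As a rough pre-flight check, the limits quoted above can be tested before a run. The sketch below is not part of Zacros: the helper names check_record and check_names are hypothetical, and it assumes that the 3000-word limit applies to each record and that the record length is counted over the whole line, neither of which is stated explicitly in the text.

```python
MAX_RECORD_LENGTH = 2 ** 13   # 8192 characters per record (from the text)
MAX_WORDS = 3000              # assumed here to apply per record
MAX_NAME_LENGTH = 64          # species, site-type, cluster and step names


def check_record(record):
    """Return a list of messages describing limit violations in one record."""
    problems = []
    text = record.rstrip("\n")
    if len(text) > MAX_RECORD_LENGTH:
        problems.append(
            f"record has {len(text)} characters (limit {MAX_RECORD_LENGTH})"
        )
    n_words = len(text.split())
    if n_words > MAX_WORDS:
        problems.append(f"record has {n_words} words (limit {MAX_WORDS})")
    return problems


def check_names(names):
    """Return the names that exceed the maximum allowed name length."""
    return [name for name in names if len(name) > MAX_NAME_LENGTH]


print(check_record("surf_specs_names CO* O*"))   # a short, valid record
print(check_record("x " * 5000))                 # too long and too many words
print(check_names(["CO*", "a" * 65]))            # second name is too long
```

If the constants in constants_module.f90 have been changed and Zacros recompiled, the three values at the top of the sketch would need to be changed accordingly.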

3. The energies in files energetics_input.dat and mechanism_input.dat are
assumed to be in eV. If you need to use a different unit you can redefine parameter enrgconv in
constants_module.f90 and recompile the program (the values are provided so you only need
to uncomment the appropriate line). See also section Input/Output Files.

4. Sites with unspecified states are not supported for elementary events. In most cases, sites that
participate in the elementary event are occupied by reactants, products or transiently by the
transition state. Thus, “extra” sites must usually be defined as empty rather than unspecified.

5. The calculation of energetics for lattices smaller than the maximum interaction length fails to
provide accurate values. It is recommended that the size of the lattice be chosen as at least twice
the length of the longest-range interaction pattern.

6. In the Zacros code, some assumptions have been made about, for instance, the maximum number
of processes that need updates or adsorbates that need to be removed after a KMC event. While
these have empirically been shown to work well for the systems we have tested, it may happen
that they are inadequate for some other system. In such situations, Zacros will terminate
abnormally and you will most probably see errors like the following:

forrtl: severe (408): fort: (10): Subscript #1 of the array
PROCSUPDATE has value 3 which is greater than the upper bound of 2

If you encounter such problems you will have to increase the size of the “problematic” array and
recompile. Please do not hesitate to contact us in case you need help. We are planning to
implement memory amortization to avoid such issues altogether in the future. In the current Zacros
version, a memory management scheme exists that enables the user to specify convenient limits to
the sizes of the main data-structures of the KMC simulation (see section Memory Management on
page 37).

7. Certain limitations exist for the MPI Time-Warp implementation and the corresponding parallel
emulation scheme. Please consult section Performing MPI Time-Warp Runs on page 33 and section
Validation of MPI Time-Warp Runs on page 35, for details on these limitations.

8. There is no explicit limitation in the number of surface and gas-phase species, the size of the lattice,
and the number of cluster and elementary event patterns that can be defined, as the pertinent
data-structures consist of allocatable objects. However, different compilers and operating systems
may impose their own limitations. Please refer to their documentation.

9. End-of-file related problems have been reported with the Cray compiler. In particular, there
has to be a newline character after the last record, otherwise Zacros is unable to correctly parse
this record.
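
One way to guard against the missing final newline is to check the input files before running. The helper below is an illustrative sketch and not a Zacros tool: the name ensure_trailing_newline is hypothetical and the demonstration file name is made up.

```python
import os
import tempfile


def ensure_trailing_newline(path):
    """Append a newline to a non-empty file whose last byte is not a newline.

    Returns True if the file was modified, False otherwise.
    """
    with open(path, "rb+") as handle:
        handle.seek(0, os.SEEK_END)
        if handle.tell() == 0:
            return False              # empty file: nothing to terminate
        handle.seek(-1, os.SEEK_END)
        if handle.read(1) == b"\n":
            return False              # already ends with a newline
        handle.seek(0, os.SEEK_END)
        handle.write(b"\n")
        return True


# Demonstration on a throw-away file whose last record lacks a newline.
with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "demo_input.dat")
    with open(demo_path, "wb") as handle:
        handle.write(b"first_record 1\nfinish")
    first_call = ensure_trailing_newline(demo_path)
    second_call = ensure_trailing_newline(demo_path)
    with open(demo_path, "rb") as handle:
        final_bytes = handle.read()

print(first_call, second_call, final_bytes.endswith(b"\n"))
```

Calling the helper a second time leaves the file untouched, so it can safely be applied to all input files of a run.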

10. On the NAG compiler, some functions, e.g. int8, are not supported. We have taken care to use
standard versions of functions and we have successfully tested Zacros on NAG 6.0. If you are using
an older version and encounter problems, please do not hesitate to contact us for help.

11. When running the tests with CMake (see section Using the CMake Build System, page 13), if you are
working with slow hardware, you may see one or more tests failing with timeout, e.g.:
Start 52: PAREMUL_SAMPLE_RUN_ADS_DES_RXN_DIFF_LATINTER_HEXA_4PROCS
24/55 Test #52: PAREMUL_SAMPLE_RUN_ADS_DES_RXN_DIFF_LATINTER_HEXA_4PROCS ...***Timeout 90.02 sec

To address this, you can increase the time that CMake allows this set of tests to run for. These
times are defined in file path/to/source/of/Zacros/tests/CMakeLists.txt, by the
following lines:
set(fasttimeout "90")
set(mediumtimeout "1000")
set(slowtimeout "15000")

By looking at the time in which the test failed (90.02 seconds in the above example), you know that
this was a fast test (you can also see this information explicitly in CMakeLists.txt, in the lines
that start with add_regression). You can then increase the corresponding timeout value, e.g.
change the first of the three lines above to:
set(fasttimeout "120")

If the new timeout value is not enough, you can try progressively larger values, but of course,
excessively large values might mean that there is a problem with your hardware or the compiled
executable (in which case, feel free to contact us for help).

12. There is a class of tests run by CMake, which we refer to as run-resume tests. Successful outcomes
of these tests validate that Zacros can successfully stop and restart a simulation from the last
written checkpoint (refer to keywords wall_time and no_restart in section Simulation Input
File, page 25). Each of these tests tries to run a simulation in several “chunks” and checks that
(i) indeed the simulation was run in more than one chunk, (ii) the simulation was successful (no
crashes) and (iii) the results obtained were the same as those of the one-off simulation, which are
provided as the “reference data”. In some cases, your hardware may be too fast, and condition (i)
may be violated. In this case, your simulation finishes before the wall-time given in the test, so
Zacros does not have the opportunity to actually restart the simulation, and the test fails. To
address this, you can simply change the wall-time specified for that test. To this end, first locate the
directory where the data of the test are provided, by looking up the name of the test
in CMakeLists.txt. As an example, below is the report of a failing test:

Start 30: MPI_SAMPLE_RUN_RESUME_ADS_DES_RXN_LATINTER_9PROCS
1/1 Test #30: MPI_SAMPLE_RUN_RESUME_ADS_DES_RXN_LATINTER_9PROCS ...***Failed 24.53 sec

In CMakeLists.txt, this test is referred to in the following line. The important part is the number of
the test (134 here), which also names the directory holding the data of this test:

add_regression( MPI_SAMPLE_RUN_RESUME_ADS_DES_RXN_LATINTER_9PROCS 134 medium ${mediumtimeout} --nMPIprocs=9 --nOMPthreads=1 --KMCTime=200.0)

Therefore, you know that the data of this test can be found in directory:

path/to/source/of/Zacros/tests/data/134

You now have to decrease the value of the integer following the wall_time keyword, for
instance for this test, you could change it from 20 to 13 seconds. If the new wall-time is not small
enough, you can try progressively smaller values.

Note: for MPI runs, you may also have to decrease the value of the integer following keyword
time_interval_gvt_computation, since the simulation will not terminate before the
first global communication has taken place. Thus, in the above example:

time_interval_gvt_computation 6

and if you were to decrease the wall-time to say 4 seconds, the first simulation chunk would still be
6 seconds long. If your hardware was very fast and was able to complete the simulation in these 6
seconds, you would have to lower the value of time_interval_gvt_computation to say 1
second, and decrease the value of wall_time to say 3 seconds, in an attempt to create a
situation in which Zacros cannot finish the simulation in the first run and has to restart.

13. Test 240: MPI_SAMPLE_BRUSSELATOR_9PROCS is known to fail on certain configurations of
hardware, compiler or operating system. This test is quite special (and necessary), since it checks
how Zacros treats equal time-stamp events in the Time-Warp algorithm (for more details see the
supplementary information of Ref. 11). To create situations in which several equal time-stamp
events arise in a simulation, we have used a “very low quality” random number generator (see keywords
random_number_generator and very_low_quality on page 18). The problem is that,
since the occurrence times of several events are supposed to be identical, the precision with which
the KMC time advancements are calculated and summed can affect the simulated history. At the
time of writing we have been able to obtain two different solutions using different hardware and
software: the one given in:

path/to/source/of/Zacros/tests/data/240

and another one that is given in the following directory:

path/to/source/of/Zacros/tests/data/240_alt

Thus, if test 240: MPI_SAMPLE_BRUSSELATOR_9PROCS fails with your configuration, simply
delete directory 240 (or rename it to 240_fail, for instance), then rename 240_alt to 240
and run the test again. If the test still fails, please contact us for help.

Zacros Utilities
Mersenne-Twister Jump-Ahead Utility
This utility is a standalone program for computing and outputting the minimal polynomial coefficients of
the Mersenne-Twister 19937 recurrence for a given “jump factor”, as well as propagating a given state
by that “jump factor”. For the users’ convenience, this utility is executed by Zacros automatically when
the jump-ahead method is selected for the generation of multiple random streams (see keyword
random_streams on page 34). Thus, users are not required to run this utility themselves, and the
information of this section is provided mainly for reference to developers.

Background
At a high level, the Mersenne-Twister random number generator is based on a linear recurrence that
generates a sequence of binary vectors: $x_{n+1} = A \cdot x_n$, where $x_k$ for k = 0, 1, … are the binary vectors (i.e.
their elements are either 0 or 1), and A is a binary matrix. The coefficients of A are chosen in such a way
that the sequence of $x_k$ is “sufficiently random”, making it then possible to map these pseudo-random
binary vectors into pseudo-random numbers. To create a set of disjoint pseudo-random streams by the
jump-ahead method, one needs to be able to efficiently compute the matrix $A^{js}$, where js is the jump
step (chosen to be sufficiently large, so that each random stream can deliver several pseudo-random
numbers before any overlap with the next stream is observed). Then, the random streams can be
initialized with state vectors $x_0$, $x_{js}$, $x_{2 \cdot js}$, … The efficient computation of $A^{js}$ is based on the properties of
the minimal polynomial of A, and is facilitated computationally by the fast execution of binary
operations (XOR, shift, bit mask). More detailed information about the jump-ahead method can be
found in Ref. 16.
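
To make the idea of jumping ahead concrete, the sketch below applies it to a toy 4-bit linear recurrence over binary vectors. It is an illustration under stated assumptions and not the algorithm of the utility: the matrix power $A^{js}$ is obtained by plain repeated squaring rather than through the minimal polynomial (Ref. 16), and the matrix TOY_A and all function names are made up for this example.

```python
def parity(value):
    """Return the XOR of all bits of a non-negative integer (0 or 1)."""
    return bin(value).count("1") & 1


def matvec(matrix, vector):
    """Multiply a binary matrix (list of row bitmasks) by a binary vector."""
    result = 0
    for i, row in enumerate(matrix):
        result |= parity(row & vector) << i
    return result


def matmul(left, right):
    """Product of two binary matrices over GF(2), rows stored as bitmasks."""
    product = []
    for row in left:
        acc, j = 0, 0
        while row:
            if row & 1:
                acc ^= right[j]   # add (XOR) row j of the right matrix
            row >>= 1
            j += 1
        product.append(acc)
    return product


def matpow(matrix, exponent):
    """Compute matrix**exponent over GF(2) by repeated squaring."""
    result = [1 << i for i in range(len(matrix))]   # identity matrix
    base = list(matrix)
    while exponent:
        if exponent & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        exponent >>= 1
    return result


# Toy recurrence: new state = (x0 ^ x3, x0, x1, x2); bit i holds component i.
TOY_A = [0b1001, 0b0001, 0b0010, 0b0100]
x0 = 0b0001
js = 1000

stepped = x0
for _ in range(js):               # brute force: apply the recurrence js times
    stepped = matvec(TOY_A, stepped)

jumped = matvec(matpow(TOY_A, js), x0)   # jump ahead in O(log js) products
print(jumped == stepped)
```

For the real generator the state has 19937 bits, so forming $A^{js}$ explicitly as done here would be impractical, which is why the minimal polynomial route is used instead.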

Usage
The Mersenne-Twister jump-ahead utility can be invoked from the command line as follows:

mt_jump_ahead PEid jp [MT parameters]

where:

• PEid is the Processor-ID (e.g. the MPI process rank) for which the minimal polynomial is being
computed, and
• jp is the jump-factor. Note that the minimal polynomial depends on the jump step, the latter
taken as $js = PEid \cdot 2^{jp}$.
• [MT parameters] is an ordered set of optional arguments setting the following parameters
of the Mersenne-Twister pseudo-random number generator (all 4-byte integers, default values
listed):
o word size (w = 32)
o degree of recurrence (n = 624)
o middle term (m = 397)
o separation point of one word (r = 31)
o integer "encoding" the binary matrix of the recurrence (a = -1727483681)
o integer "encoding" the lower mask
(lo = 2147483647, bit pattern: 01111111111111111111111111111111)
o integer "encoding" the upper mask
(up = -2147483648, bit pattern: 10000000000000000000000000000000)
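
The negative defaults listed above are simply the standard MT19937 constants reinterpreted as signed 4-byte integers. The short Python sketch below shows the conversion; the helper name to_signed32 is hypothetical, while the hexadecimal constants are the well-known MT19937 values.

```python
def to_signed32(value):
    """Reinterpret an unsigned 32-bit integer as a signed two's-complement one."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


MATRIX_A = 0x9908B0DF     # integer encoding the binary matrix of the recurrence
LOWER_MASK = 0x7FFFFFFF   # least significant r = 31 bits
UPPER_MASK = 0x80000000   # most significant w - r = 1 bit

print(to_signed32(MATRIX_A))     # -1727483681
print(to_signed32(LOWER_MASK))   # 2147483647
print(to_signed32(UPPER_MASK))   # -2147483648
```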

As an example, the following command will compute the minimal polynomial for a jump factor of 64 (i.e. a
jump step of $3 \cdot 2^{64}$) for MPI-process 3, for the Mersenne-Twister generator with the default parameters:

mt_jump_ahead 3 64 32 624 397 31 -1727483681 2147483647 -2147483648

The utility outputs the minimal polynomial coefficients in the form of 4-byte integers (“words”) in file
polycoeffs[_PEid].txt, with the part in brackets included if PEid > 0. The file contains one
integer per line; the first line gives the degree of the minimal polynomial, followed by the integers
“encoding” its coefficients. Moreover, if the working directory contains file mtstate.txt (of 4-byte
integers) the program computes and outputs the “distant” state in file mtstatedist[_PEid].txt
(again, the part in brackets is included if PEid > 0).
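
For developers who want to inspect the output, the file layout described above (one integer per line, the degree first) can be read as sketched below. The function names are hypothetical, the sample values are made up rather than real utility output, and the sketch assumes that PEid appears in the file name as a plain decimal number without padding.

```python
import os
import tempfile


def polycoeffs_filename(pe_id):
    """File name for a given PEid; the suffix is present only if PEid > 0."""
    return "polycoeffs.txt" if pe_id == 0 else f"polycoeffs_{pe_id}.txt"


def read_polycoeffs(path):
    """Return (degree, words) from a file holding one integer per line."""
    with open(path) as handle:
        values = [int(line) for line in handle if line.strip()]
    if not values:
        raise ValueError(f"{path} contains no integers")
    return values[0], values[1:]


# Demonstration with made-up numbers (not actual output of the utility).
with tempfile.TemporaryDirectory() as workdir:
    sample_path = os.path.join(workdir, polycoeffs_filename(3))
    with open(sample_path, "w") as handle:
        handle.write("5\n19\n-3\n")
    degree, words = read_polycoeffs(sample_path)

print(degree, words)
```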

CE-Fit: Cluster Expansion Fitting Utility


Background
When introducing the cluster expansion Hamiltonian approach, as implemented in Zacros, as well as the
pertinent keywords of the energetics_input.dat file, we assumed that the energy contribution
of each interaction pattern was known (this is the parameter specified by the cluster_eng keyword,
page 50). These energy contributions are referred to as effective cluster interactions (ECIs) and are
calculated by solving linear systems of equations using data that typically come from first-principles
calculations. For simple systems, such a calculation is straightforward; for instance, suppose we have a
system with one adsorbate (e.g. CO) that binds to a single site type and we would like to fit the single
body energy and the 1st nearest-neighbor (1NN) lateral interaction term (only two patterns). We can
achieve this by doing the following calculations using e.g. density functional theory, to obtain the
necessary energy values:
1. a calculation with the adsorbate in the gas phase, yielding energy $E^{\mathrm{DFT}}_{\mathrm{A(gas)}}$,
2. a calculation of the pristine surface, yielding energy $E^{\mathrm{DFT}}_{\mathrm{Surf}}$,
3. a calculation with a single adsorbate molecule bound to the surface, yielding energy $E^{\mathrm{DFT}}_{\mathrm{A+Surf}}$,
4. and finally, a calculation with a 1NN pair of adsorbates on the surface, yielding energy
$E^{\mathrm{DFT}}_{\mathrm{2A(1NN)+Surf}}$

Note that for the last couple of calculations, the simulation cell should be large enough to avoid
interactions with periodic images of the adsorbate.

After collecting this data, one does the following simple calculations of the formation energy of each
configuration:

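$FE_{\mathrm{A+Surf}} = E^{\mathrm{DFT}}_{\mathrm{A+Surf}} - E^{\mathrm{DFT}}_{\mathrm{Surf}} - E^{\mathrm{DFT}}_{\mathrm{A(gas)}}$    (16)

$FE_{\mathrm{2A(1NN)+Surf}} = E^{\mathrm{DFT}}_{\mathrm{2A(1NN)+Surf}} - E^{\mathrm{DFT}}_{\mathrm{Surf}} - 2 \cdot E^{\mathrm{DFT}}_{\mathrm{A(gas)}}$    (17)

Then, the effective cluster interactions of the single- and two-body terms can be calculated by solving
the following (very simple) system of linear equations:

$ECI_{\mathrm{A}} = FE_{\mathrm{A+Surf}}$    (18)

$2 \cdot ECI_{\mathrm{A}} + ECI_{\mathrm{2A(1NN)}} = FE_{\mathrm{2A(1NN)+Surf}}$    (19)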
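
As a worked illustration of equations 16 to 19, the following Python sketch carries out the arithmetic for one set of energies. All four DFT energies are hypothetical values chosen for the example and do not come from any actual calculation.

```python
# Hypothetical DFT total energies in eV (illustrative values only).
e_gas = -14.80          # adsorbate A in the gas phase
e_surf = -250.00        # pristine surface
e_a_surf = -266.30      # one adsorbate on the surface
e_2a_surf = -282.45     # 1NN pair of adsorbates on the surface

# Formation energies, equations (16) and (17).
fe_a = e_a_surf - e_surf - e_gas
fe_2a = e_2a_surf - e_surf - 2.0 * e_gas

# Effective cluster interactions, equations (18) and (19).
eci_a = fe_a
eci_2a = fe_2a - 2.0 * eci_a

print(f"ECI_A       = {eci_a:.3f} eV")
print(f"ECI_2A(1NN) = {eci_2a:.3f} eV")
```

With these made-up inputs the single-body term comes out at about -1.5 eV and the pair term at about 0.15 eV, i.e. a weakly repulsive 1NN interaction.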

Of course, the dataset of two configurations is the minimum needed to solve for the two ECIs ($ECI_{\mathrm{A}}$ for
the single-body term, and $ECI_{\mathrm{2A(1NN)}}$ for the two-body term). In practice, one would like to have more
configurations in order to perform a better-informed fitting of the cluster expansion; the linear system of
equations is then overdetermined and is solved in the least-squares sense. Note that more elaborate cluster
expansions than that of our example would involve many patterns and a higher number of configurations to be
fitted. Clearly, such fitting exercises, involving many patterns and configurations, can be complicated, and
this is what the CE-Fit utility aims at streamlining.
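
The overdetermined case can be sketched in a few lines of Python. The example below is not the CE-Fit implementation, whose solver is not described here: it solves the normal equations with a small Gaussian elimination, which is an illustrative choice. The four configurations and their formation energies are synthetic and were constructed to be exactly consistent with a single-body ECI of -1.5 eV and a 1NN pair ECI of 0.15 eV.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: patterns are not independent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def least_squares(design, energies):
    """Least-squares ECIs from the normal equations (A^T A) x = A^T b."""
    n_cols = len(design[0])
    ata = [[sum(row[i] * row[j] for row in design) for j in range(n_cols)]
           for i in range(n_cols)]
    atb = [sum(row[i] * e for row, e in zip(design, energies))
           for i in range(n_cols)]
    return solve_linear_system(ata, atb)


# Each row: (number of adsorbates, number of 1NN pairs) in one configuration.
design = [[1, 0], [2, 1], [2, 0], [3, 2]]
formation_energies = [-1.50, -2.85, -3.00, -4.20]   # synthetic data, in eV

ecis = least_squares(design, formation_energies)
print([round(value, 6) for value in ecis])
```

With real first-principles data the energies would not be exactly consistent, and the fitted ECIs would then be the values that minimise the sum of squared residuals.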

Performing a Cluster Expansion Fit


The CE-Fit utility is compiled automatically when using any of the makefiles or the CMake build system
(see section Compiling Zacros, page 9). The executable can be found in the build directory (same as
the location of the Zacros executable). Invoking the CE-Fit utility is done in a similar way as performing a
Zacros simulation: in a terminal (or command-prompt in Windows), use cd to go to the directory where
the input files are and then invoke the CE-Fit executable from there.

Note: if you compile Zacros with MPI (see section Compiling Zacros, page 9), the CE-Fit utility will also be
compiled with MPI directives. However, since no MPI parallelism has been implemented in this utility,
you will be able to run it with only 1 MPI process. Trying to run it with many MPI processes will result in
an error.

The following mandatory input files and directories must be provided for a cluster expansion fit:

• calculation_input.dat, which provides general input about the fitting exercise, e.g.
number of configurations, surface species information etc. This is the “equivalent” of the
simulation_input.dat file of a Zacros simulation, and its keywords will be explained in
the next subsection.
• energetics_input.dat, which has the format described in section Energetics Input File
(page 47), with the exception that the cluster_eng keyword is now invalid, since these
energies are supposed to be fitted by CE-Fit.
• A series of directories named as Conf1, Conf2, Conf3, …, each of which must contain the
following files:
o lattice_input.dat, which follows exactly the format described in section Lattice
Input File (page 41),
o state_input.dat, which adheres to the rules of section Initial State Input File (page
61 onwards), with the exception that only seeding on specific sites is now valid (using
keyword seed_on_sites), in order to define configurations with known structure
and energy. This file can be omitted if one needs to define the “empty lattice”
configuration.
o a plain text file named energy (without any extension), which contains a single real
number, which provides the formation energy of the configuration captured by the
lattice and state input files. This energy value typically comes from first-principles
calculations (after some post-processing).
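
Before launching a fit it can be useful to verify that the input layout listed above is complete. The checker below is an illustrative sketch and not part of CE-Fit: the function name check_fit_inputs is hypothetical, and it assumes that the energy value is written in a form that Python can parse (Fortran-style exponents such as 1.5d0 would be rejected by this sketch).

```python
import os
import tempfile


def check_fit_inputs(root, n_config):
    """Return a list of problems found in a CE-Fit style input directory."""
    problems = []
    for name in ("calculation_input.dat", "energetics_input.dat"):
        if not os.path.isfile(os.path.join(root, name)):
            problems.append(f"missing file: {name}")
    for index in range(1, n_config + 1):
        conf = f"Conf{index}"
        conf_dir = os.path.join(root, conf)
        if not os.path.isdir(conf_dir):
            problems.append(f"missing directory: {conf}")
            continue
        if not os.path.isfile(os.path.join(conf_dir, "lattice_input.dat")):
            problems.append(f"missing file: {conf}/lattice_input.dat")
        # state_input.dat is optional: its absence denotes the empty lattice.
        try:
            with open(os.path.join(conf_dir, "energy")) as handle:
                fields = handle.read().split()
            if len(fields) != 1:
                raise ValueError("expected exactly one number")
            float(fields[0])
        except (OSError, ValueError) as error:
            problems.append(f"bad energy file in {conf}: {error}")
    return problems


# Demonstration: two configurations, the second one lacking its energy file.
with tempfile.TemporaryDirectory() as root:
    for name in ("calculation_input.dat", "energetics_input.dat"):
        open(os.path.join(root, name), "w").close()
    for index, energy in ((1, "0.0"), (2, None)):
        conf_dir = os.path.join(root, f"Conf{index}")
        os.mkdir(conf_dir)
        open(os.path.join(conf_dir, "lattice_input.dat"), "w").close()
        if energy is not None:
            with open(os.path.join(conf_dir, "energy"), "w") as handle:
                handle.write(energy + "\n")
    report = check_fit_inputs(root, 2)

for line in report:
    print(line)
```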

During the fitting, CE-Fit parses the general calculation settings from calculation_input.dat and
the patterns of the cluster expansion whose ECIs need to be fitted from energetics_input.dat. Then it
parses the configuration versus energy data from the Conf# directories and solves the (typically
overdetermined) system of linear equations.

Keywords

The keywords of the files used in CE-Fit (see list above), except calculation_input.dat, are
explained in the referenced sections. For calculation_input.dat, the following keywords are
valid:

n_config int
    The number of configurations to use in fitting the cluster expansion. This parameter allows users to
    progressively perform fittings with larger sets of configurations and check the variation of the ECIs
    thus obtained. Caution: CE-Fit will disregard the configurations of directories ConfX for X > int.
    Therefore, when adding more configurations (Conf# directories) one would need to adjust the value of
    int, so that the new configurations are taken into account in the fitting.

The following keywords are also used in the calculation_input.dat file and their meaning is as
discussed in section General Simulation Parameters (page 18).

n_surf_species int
    This keyword is used in the same way as discussed in section General Simulation Parameters on page 20.

surf_specs_names str1 str2 …
    As discussed in section General Simulation Parameters on page 20.

surf_specs_dent int1 int2 …
    As discussed in section General Simulation Parameters on page 21.

n_site_types int
    As discussed in section Unit-Cell-Defined Periodic Lattices on page 42.

site_type_names str1 str2 …
    As discussed in section Unit-Cell-Defined Periodic Lattices on page 42.

debug_report_global_energetics
    This keyword is optional and works as discussed in section Troubleshooting of Simulation Input File
    on page 39.

debug_check_lattice
    This keyword is optional and works as discussed in section Troubleshooting of Simulation Input File
    on page 40.

finish
    Marks the end of input. Any subsequent information will not be parsed.

As in any Zacros input file, comments can be added to calculation_input.dat, prepended by the
# character.

Output files
A cluster fitting calculation generates the following files:

• general_output.txt: the structure of this file is quite similar to the same-named file of a
Zacros calculation and provides general information about the fitting exercise. Thus, it
summarises the calculation setup, lists the patterns found in the cluster expansion to be fitted,
summarises the lattice and initial state setup for each configuration given, and outputs the
values of the ECIs if the fitting is successful or an error message if not.
• cefit_output.txt: this file lists the names of the interaction patterns in the first line (this
line contains as many strings as the patterns of energetics_input.dat), followed by rows
containing one real and one integer value: the former is the ECI of the corresponding pattern
while the latter is the graph multiplicity (which is as given in energetics_input.dat or
equal to 1 if the pertinent keyword is omitted therefrom).
• energies_parity.txt: this file contains only numeric data in 3 columns, and as many rows
as the number of configurations used in the fitting. The first column lists the energies provided
for each configuration as input to the fitting procedure (real numbers), while the second column
gives the energies calculated from the fitted cluster expansion (real numbers). The third and
final column gives the number of sites of the lattice provided for each configuration. This data is
useful for creating a parity plot of the model cluster expansion Hamiltonian energies with
respect to the input energies.
• AmatBvec.txt: this file gives the design matrix and the right-hand side vector of the linear
problem being solved: A⋅x = b. The last column of the file is b, which is simply the vector of the
energies given for each configuration. The remaining contents of the file provide matrix A, which
has as many rows as the number of configurations used in the fitting (i.e. the integer following
keyword n_config), and as many columns as the number of interaction patterns parsed in
energetics_input.dat. Element $a_{i,j}$ of this matrix is equal to the number of instances of
interaction pattern j in given configuration i divided by the graph multiplicity of pattern j. The
values of $a_{i,j}$ are expected to be integers; however, they are reported as reals (with 2 decimals)
in this file for debugging purposes. More specifically, recall that the graph multiplicity of pattern
j is the number of times that the exact same pattern will be counted for a given lattice
configuration (i.e. the number of symmetric equivalents of the pattern). It follows that the
number of instances of pattern j (as counted by the pattern matching subroutines of Zacros for a
given configuration) should be an integer multiple of the graph multiplicity. However, if an
incorrect graph multiplicity of pattern j is given by the user in energetics_input.dat, it
may happen that the number of instances is not an integer multiple of graph multiplicity, and
thus $a_{i,j}$ is no longer an integer. In such cases, the user needs to review the pattern that suffers
from this issue and amend the graph multiplicity value or the pattern definition altogether.
• Conf#_lattice_output.txt and Conf#_lattice_state.txt, where # ranges from
1 to the number of configurations used in the fitting: the former files provide an output
of the lattice structure as defined by lattice_input.dat for each configuration. Their
format has already been discussed in section Lattice Output File, page 70. The latter files provide
a snapshot of the lattice state and their format follows the structure of a configuration section in
history_output.txt, i.e. the contents after the line starting with configuration (but
without this “header” line) and before (and without including) the last line that reports the
number of gas phase molecules produced/consumed (see section History Output File, page 71).
• In addition to these files, if debug_report_global_energetics is parsed in
calculation_input.dat, files Conf#_globenerg_debug.txt are written, which
follow the structure discussed in section Energetics Debug Output File, page 76.
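
Two of the output files above lend themselves to quick automated checks: the parity data can be summarised by a root-mean-square error, and the design matrix can be scanned for non-integer entries that hint at a wrong graph multiplicity. The sketch below is illustrative only. The function names are hypothetical, the sample contents are made up, and it assumes whitespace-separated columns without header lines in both files.

```python
import math


def read_parity(text):
    """Parse energies_parity.txt content into (input, fitted, n_sites) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            rows.append((float(fields[0]), float(fields[1]),
                         int(float(fields[2]))))
    return rows


def rmse(rows, per_site=False):
    """Root-mean-square deviation between input and fitted energies."""
    if not rows:
        raise ValueError("no data rows")
    total = 0.0
    for e_input, e_fitted, n_sites in rows:
        scale = n_sites if per_site else 1
        total += ((e_fitted - e_input) / scale) ** 2
    return math.sqrt(total / len(rows))


def non_integer_entries(text, tol=0.005):
    """Return 1-based (row, column) positions of non-integer entries of A.

    The last column of each row is the vector b and is skipped.
    """
    flagged = []
    lines = [line for line in text.splitlines() if line.strip()]
    for i, line in enumerate(lines, start=1):
        values = [float(field) for field in line.split()]
        for j, value in enumerate(values[:-1], start=1):
            if abs(value - round(value)) > tol:
                flagged.append((i, j))
    return flagged


# Made-up file contents for demonstration purposes.
parity_text = "-1.50 -1.48 16\n-2.85 -2.90 16\n-3.00 -2.96 16\n"
amat_text = "1.00 0.00 -1.50\n2.00 0.50 -2.85\n"

parity_rows = read_parity(parity_text)
print(f"RMSE = {rmse(parity_rows):.4f} eV")
print(non_integer_entries(amat_text))
```

In the made-up design matrix, the entry in row 2, column 2 is flagged, which is the kind of value that would call for a review of the graph multiplicity of the second pattern.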

Zacros-post: Post-processing and Visualisation Graphical User Interface


Zacros-post is a Graphical User Interface (GUI) based utility for postprocessing the results of Zacros
simulations, written in Python (https://www.python.org/) and compiled by Nuitka (https://nuitka.net/).
It is distributed separately from Zacros in the form of Linux or Windows executables. If you are
interested in this utility, you can learn what it can do (and see some screenshots) in the following
tutorial on the Zacros website:
https://zacros.org/tutorials/17-zacros-post-processing/
You can also obtain a trial license key via the following link:
https://xip.uclb.com/product/zacros-post
The trial period is 30 days, after which you have the option of purchasing a full license (pricing and terms
at the above link).

References
1. Stamatakis, M. and D.G. Vlachos, A Graph-Theoretical Kinetic Monte Carlo Framework for on-Lattice
Chemical Kinetics. Journal of Chemical Physics, 2011. 134(21): 214115.
2. Nielsen, J., M. d’Avezac, J. Hetherington, and M. Stamatakis, Parallel Kinetic Monte Carlo Simulation
Framework Incorporating Accurate Models of Adsorbate Lateral Interactions. Journal of Chemical
Physics, 2013. 139(22): 224706.
3. Park, S.K. and K.W. Miller, Random number generators: good ones are hard to find. Communications
of the ACM, 1988. 31(10): 1192-1201.
4. Matsumoto, M. and T. Nishimura, Mersenne twister: a 623-dimensionally equidistributed uniform
pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation, 1998.
8(1): 3-30.
5. Savva, G.D. and M. Stamatakis, Comparison of Queueing Data-Structures for Kinetic Monte Carlo
Simulations of Heterogeneous Catalysts. Journal of Physical Chemistry A, 2020. 124(38): 7843-7856.
6. Stamatakis, M. and D.G. Vlachos, Equivalence of on-lattice stochastic chemical kinetics with the
well-mixed chemical master equation in the limit of fast diffusion. Computers & Chemical Engineering,
2011. 35(12): 2602-2610.
7. Danielson, T., J.E. Sutton, C. Hin, and A. Savara, SQERTSS: Dynamic rank based throttling of transition
probabilities in kinetic Monte Carlo simulations. Computer Physics Communications, 2017. 219: 149-163.
8. Dybeck, E.C., C.P. Plaisance, and M. Neurock, Generalized Temporal Acceleration Scheme for Kinetic
Monte Carlo Simulations of Surface Catalytic Processes by Scaling the Rates of Fast Reactions. Journal
of Chemical Theory and Computation, 2017. 13(4): 1525-1538.
9. Chatterjee, A. and A.F. Voter, Accurate acceleration of kinetic Monte Carlo simulations through the
modification of rate constants. Journal of Chemical Physics, 2010. 132(19): 194101.
10. Jefferson, D.R., Virtual Time. ACM Transactions on Programming Languages and Systems, 1985. 7(3):
404-425.
11. Ravipati, S., G.D. Savva, I.-A. Christidi, R. Guichard, J. Nielsen, R. Réocreux, and M. Stamatakis,
Coupling the Time-Warp algorithm with the Graph-Theoretical Kinetic Monte Carlo framework for
distributed simulations of heterogeneous catalysts. Computer Physics Communications, 2022. 270:
108148.
12. Ravipati, S., M. d’Avezac, J. Nielsen, J. Hetherington, and M. Stamatakis, A Caching Scheme To
Accelerate Kinetic Monte Carlo Simulations of Catalytic Reactions. Journal of Physical Chemistry A,
2020. 124(35): 7140-7154.
13. Stamatakis, M., M. Christiansen, D.G. Vlachos, and G. Mpourmpakis, Multiscale Modeling Reveals
Poisoning Mechanisms of MgO-Supported Au Clusters in CO Oxidation. Nano Letters, 2012. 12(7):
3621-3626.
14. Sanchez, J.M., F. Ducastelle, and D. Gratias, Generalized Cluster Description of Multicomponent
Systems. Physica A: Statistical and Theoretical Physics, 1984. 128(1-2): 334-350.
15. Wu, C., D.J. Schmidt, C. Wolverton, and W.F. Schneider, Accurate coverage-dependence incorporated
into first-principles kinetic models: Catalytic NO oxidation on Pt(111). Journal of Catalysis, 2012. 286:
88-94.
16. Haramoto, H., M. Matsumoto, T. Nishimura, F. Panneton, and P. L'Ecuyer, Efficient jump ahead for
F2-linear random number generators. Informs Journal on Computing, 2008. 20(3): 385-390.
