K-ST: A Formal Executable Semantics of the Structured Text Language for PLCs
Research Collection School Of Computing and Information Systems
School of Computing and Information Systems
9-2023
Jingyi WANG
Christopher M. POSKITT
Singapore Management University, [email protected]
Xiangxiang CHEN
Jun SUN
Singapore Management University, [email protected]
Part of the Programming Languages and Compilers Commons, Software Engineering Commons, and
the Theory and Algorithms Commons
Citation
WANG, Kun; WANG, Jingyi; POSKITT, Christopher M.; CHEN, Xiangxiang; SUN, Jun; and CHENG, Peng.
K-ST: A formal executable semantics of the structured text language for PLCs. (2023). IEEE Transactions on
Software Engineering. 49, (10), 4796-4813.
Available at: https://ink.library.smu.edu.sg/sis_research/8199
Author
Kun WANG, Jingyi WANG, Christopher M. POSKITT, Xiangxiang CHEN, Jun SUN, and Peng CHENG
Abstract—Programmable Logic Controllers (PLCs) are responsible for automating process control in many industrial systems (e.g. in
manufacturing and public infrastructure), and thus it is critical to ensure that they operate correctly and safely. The majority of PLCs are
programmed in languages such as Structured Text (ST). However, a lack of formal semantics makes it difficult to ascertain the
correctness of their translators and compilers, which vary from vendor to vendor. In this work, we develop K-ST, a formal executable
semantics for ST in the K framework. Defined with respect to the IEC 61131-3 standard and PLC vendor manuals, K-ST is a high-level
reference semantics that can be used to evaluate the correctness and consistency of different ST implementations. We validate K-ST
by executing 567 ST programs extracted from GitHub and comparing the results against existing commercial compilers (i.e.,
CODESYS, CX-Programmer, and GX Works2). We then apply K-ST to validate the implementation of the open source OpenPLC
platform, comparing the executions of several test programs to uncover five bugs and nine functional defects in the compiler.
Index Terms—Formal executable semantics, PLC programming, Structured text, K framework, OpenPLC.
1 INTRODUCTION
In addition, the analyses they perform are often limited (since the existing tools are not designed for PLCs) and do not
offer feedback at the level of source code. Canet et al. [22] propose formal semantics for a significant fragment of the IL
language, and a direct coding of this semantics into a model checking tool. Huuck [28] develops a formal operational
semantics and abstract semantics for IL, which allows approximating program simulation for a set of inputs in one
simulation run. Blech et al. [10], [11], [30] attempted to define the formal semantics of the IL and SFC languages in Coq
and NuSMV and, based on that, verify the safety properties in the code. However, IL is a low-level assembly-like language
that has been deprecated from the IEC 61131-3 standard. Furthermore, these studies mainly concentrate on analyzing the
functional aspects of the programs and may overlook potential vulnerabilities and security risks introduced during the
compilation process.

While extensive research has been conducted on testing more ‘traditional’ compilers (e.g. vulnerability detection for
GCC and Clang [33], [34], [35]), compilers for PLC languages such as ST have received much less attention. The
challenges associated with testing the implementation of a compiler arise from the inherent difficulties of ensuring its
correctness. One particular challenge stems from the absence of a precise specification of the expected behavior of a
compiler. For most popular programming languages, there exist multiple purportedly equivalent implementations of
compilers. Compiler testing can take advantage of this by utilizing these implementations as oracles for conducting
differential testing [36]. However, in the case of the domain-specific ST language, there is no specific implementation
standard, and different vendors often develop their own compilers based on their specific requirements. Another
challenge is the semantic complexity of the input and output languages that compilers handle. The fact that different
vendors develop their own implementations further exacerbates this issue. Compiler testing methods based on formal
semantics [37] have shown advantages in addressing these challenges. With a formal semantics of the ST language, the
expected behavior of ST compilers can be precisely and unambiguously defined, which can greatly aid in testing and
verifying their correctness.

To the best of our knowledge, a practical and complete semantics for the ST language does not exist, which makes it
difficult to ascertain the correctness of ST translators and compilers (e.g. by comparing executions). There are a number
of reasons why such a reference semantics is yet to emerge. First, there is insufficient documentation defining or
describing the complete features of the ST language [9]. For instance, the official documentation introduces language
features by only a few examples, based on which it is difficult for readers to fully understand the behavior of the
language. Second, the ST compilers provided by different vendors (e.g. Allen-Bradley, Siemens) can implement the
language differently, and their closed source solutions make it difficult to fully assess how they behave systematically
(other than through manual observation). For example, CODESYS, CX-Programmer, and GX Works2 all produce negative
numbers in the results of negative modulo operations, even though this behavior is undefined according to the official
documentation. Furthermore, GX Works2 supports only 10 basic data types, whereas CODESYS supports 17 types. Thus, a
formal semantics needs to be ‘concrete’ enough to be useful, but ‘high-level’ enough to be general/extendable to the
different nuances of vendors’ compilers. A preliminary attempt at defining a high-level semantics for ST was made by
Huang et al. [38]. However, it falls short of a full reference semantics as it misses several important features of the
language, e.g. certain data types, and key statements.

In this work, we develop K-ST, a formal executable reference semantics for ST in the K framework [39]. Our high-level
semantics is both executable and machine readable, and can be used by the K framework to generate interpreters,
compilers, state-space explorers, model checkers, and deductive program verifiers. Our principal goals for the design of
K-ST are as follows:

1) Validated reference semantics. K-ST is designed to cover all the main features of ST, and is validated against
   hundreds of different real-world ST programs extracted from GitHub.
2) General and extendable. The semantics is high-level (rather than tied to a particular compiler), with the goal of
   supporting different ST implementations as well as extensions for vendor-specific functions.
3) Analyses of ST compilers. Most importantly, K-ST can be used to check the correctness and consistency of different ST
   implementations, and thus ensure that a compiler is not introducing an unintended behavior or compile-time threat
   [40], [41] into a critical industrial system.

Given the absence of complete feature descriptions for the ST language in official documentation, we not only refer to
the definitions and code samples in the official documents, but also extensively consult the guidance manuals provided by
multiple vendors to better define the semantics of the ST language. For example, there is no specific documentation on
how integer overflow is handled in the official documents. Through investigating multiple instruction manuals, we found
that existing ST compilers generally use truncation to handle integer overflow without any warning. In defining the
semantics, we find that the rewriting rules of the K framework provide a good mechanism for capturing the unique
features of ST. For example, we can rewrite REPEAT to WHILE to achieve the execution effect of REPEAT.

We validate K-ST by extracting 567 real-world ST code samples from GitHub and comparing their executions in our
semantics against their executions resulting from various commercial compilers (i.e., CODESYS, CX-Programmer, and
GX Works2). We find that K-ST is sufficiently complete to support 509 of these programs (consisting of 26,137 lines of
code) and executes those programs correctly (i.e., producing the same outputs as the corresponding existing compiler),
with the remaining programs only unsupported due to the use of certain vendor-specific or hardware-related functions
that we did not yet formalize. Furthermore, to evaluate the utility of K-ST for testing ST compilers, we compared the
executions of the 567 programs (and several mutants) under K-ST and OpenPLC [42], a popular open source PLC program
compiler. Through this semantics-based testing, we are able to uncover five bugs and nine functional defects in
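The two vendor behaviors mentioned in the introduction (overflow handled by truncation, and negative results for negative modulo operations) can be illustrated with a minimal Python sketch. It is not taken from K-ST: the helper names `wrap_int` and `st_mod` are hypothetical, the 16-bit width is an assumption standing in for INT, "truncation" is assumed to mean keeping the low bits in two's complement, and the modulo is assumed to follow truncated division (sign of the dividend), which is one convention that yields negative results.

```python
def wrap_int(value: int, bits: int = 16) -> int:
    """Keep only the low `bits` bits and reinterpret them as two's complement.

    Assumption: this is what 'truncation on overflow' means for a signed type.
    """
    mask = (1 << bits) - 1
    low = value & mask
    sign_bit = 1 << (bits - 1)
    return low - (1 << bits) if low & sign_bit else low


def st_mod(a: int, b: int) -> int:
    """Remainder under truncated division, so the sign follows the dividend.

    Assumption: vendor compilers use this convention for negative operands.
    """
    if b == 0:
        raise ZeroDivisionError("MOD by zero")
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    return a - b * quotient


if __name__ == "__main__":
    print(wrap_int(32767 + 1))  # overflow of a 16-bit signed value wraps silently
    print(st_mod(-7, 3))        # negative result, unlike Python's own -7 % 3
```

A reference semantics has to fix one such convention explicitly, since a host language may default to a different one (Python's `%` is floored, for example).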
2 BACKGROUND
In this section, we briefly introduce the background of the
Structured Text (ST) language and the K framework.
Syntax (productions grouped under their Description label)

[Identifier]
  Id ::= [a-zA-Z_][a-zA-Z0-9_]*
  Ids ::= Id*
  IdVal ::= Id := Expression

[Enum and Struct declaration]
  EnumStructDeclaration ::= TYPE EnumDeclarationExp* END_TYPE
                          | TYPE StructDeclarationExp* END_TYPE
  EnumBlock ::= Ids | IdVal*
  EnumDeclarationExp ::= Id : (EnumBlock) ; | Id : (EnumBlock) := Id ;
  StructDeclarationExp ::= Id : STRUCT VarDeclarationExp* END_STRUCT

[Function declaration]
  Function ::= FUNCTION Id : Type VarDeclaration* Statements END_FUNCTION

[Function block declaration]
  FunctionBlock ::= FUNCTION_BLOCK Id VarDeclaration* Statements END_FUNCTION_BLOCK

[Program declaration]
  Program ::= PROGRAM Id VarDeclaration* Statements END_PROGRAM

[Variable types]
  Type ::= INT | DINT | SINT | LINT | UINT | UDINT | USINT | ULINT | BYTE | WORD | DWORD | REAL
         | LREAL | STRING | STRING [Expression] | WSTRING | WSTRING [Expression] | TIME | DATE
         | TIME_OF_DAY | DATE_AND_TIME | Id | ARRAY [Expression] OF Type

[Variable declaration]
  VarType ::= VAR_GLOBAL | VAR | VAR_INPUT | VAR_OUTPUT | VAR_IN_OUT | VAR_TEMP
  VarDeclarationExp ::= Ids : Type ; | Ids : Type := Expression ;
  VarDeclaration ::= VarType VarDeclaration END_VAR

[Expressions]
  Operation ::= + | − | ∗ | / | ∗∗ | MOD | < | > | = | <= | >= | <> | AND | &
              | AND_THEN | XOR | OR | OR_ELSE | ..
  Expression ::= Int | Float | String | Bool | Bit | AllTime | Id | Expression Operation Expression
               | Expression (Expressions) | Expression.Expression | Expression [Expressions] | (Expression)
  Expressions ::= Expression*

[Assignment statement]
  Assignment ::= Expression := Expression ;

[Branch statements]
  ElseIfBlock ::= ELSE Statements | ELSE IF Expression THEN Statements ElseIfBlock*
  If ::= IF Expression THEN Statements ElseIfBlock* END_IF ;
  CaseBlock ::= Expression : Statements | Expression .. Expression : Statements
  Case ::= CASE Expression OF CaseBlock* END_CASE ;
         | CASE Expression OF CaseBlock* ELSE Statements END_CASE ;

[Loop statements]
  While ::= WHILE Expression DO Statements END_WHILE ;
  For ::= FOR Expression TO Expression DO Statements END_FOR ;
        | FOR Expression TO Expression BY Expression DO Statements END_FOR ;
  Repeat ::= REPEAT Statements UNTIL Expression END_REPEAT ;

[Return statement]
  Return ::= RETURN ;

[Exit statement]
  Exit ::= EXIT ;

[Statements]
  Statement ::= Expression ; | Assignment | If | Case | While | For | Repeat | Return | Exit
  Statements ::= Statement*
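To make the Repeat production concrete, the following Python sketch shows one way a REPEAT loop can be rewritten into a WHILE loop, in the spirit of the rewriting mentioned in the introduction. It is an illustration only and not the K rule used in K-ST: the classes `Assign`, `While` and `Repeat` and the functions `desugar` and `run` are hypothetical, and expressions are modeled as Python callables over an environment rather than parsed ST.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Env = Dict[str, int]


@dataclass
class Assign:
    name: str
    expr: Callable[[Env], int]


@dataclass
class While:
    cond: Callable[[Env], bool]
    body: List[object]


@dataclass
class Repeat:
    body: List[object]
    until: Callable[[Env], bool]


def desugar(stmts: List[object]) -> List[object]:
    """Rewrite REPEAT S UNTIL c into S; WHILE NOT c DO S."""
    out: List[object] = []
    for stmt in stmts:
        if isinstance(stmt, Repeat):
            body = desugar(stmt.body)
            out.extend(body)  # the body always runs once
            out.append(While(lambda env, u=stmt.until: not u(env), body))
        elif isinstance(stmt, While):
            out.append(While(stmt.cond, desugar(stmt.body)))
        else:
            out.append(stmt)
    return out


def run(stmts: List[object], env: Env) -> Env:
    """Tiny interpreter that only understands Assign and While."""
    for stmt in stmts:
        if isinstance(stmt, Assign):
            env[stmt.name] = stmt.expr(env)
        elif isinstance(stmt, While):
            while stmt.cond(env):
                run(stmt.body, env)
        else:
            raise TypeError(f"unsupported statement: {stmt!r}")
    return env


# The UNTIL condition already holds on entry, yet the body still runs once.
program = [
    Assign("i", lambda env: 10),
    Repeat([Assign("i", lambda env: env["i"] + 1)], lambda env: env["i"] >= 3),
]
print(run(desugar(program), {}))
```

This naive unrolling duplicates the loop body, so an EXIT inside the first copy would no longer sit inside a loop; a full semantics would need to handle that case separately.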
the k cell, denoting that no more units need to be executed. In the preprocessing phase (the first pass of K), the k cell
only contains the token execute. Afterwards, K will start executing from the MAIN program.

Stack operations. The cell control contains seven subcells (fstack, env, temp, count, gvid, print and break), which
record the operating environment of the currently running code segment. Specifically, the function stack fstack is a
list used to store the environment before executing other POUs, including variables in the current environment and the
subsequent program. Next, the cell env is used to store the mapping relationship between variables and indexes in the
current environment during program execution. Furthermore, cells temp and count are used in ENUM and STRUCT, where temp
is for temporary mapping and count is used as a counting pointer. The cell gvid records all identifiers of global
variables to assist in the generation of global variables. The cell print records variables which need to be output.
Finally, break stores the program after the loop in order to support the implementation of the EXIT statement in FOR,
WHILE and REPEAT loops.

Execution environment. The allenv cell is used to cache the execution environment before function calls (for strict
type checking of parameter passing in function calls¹). The cell genv records the result of the pre-processing
(including POUs and custom types) and will be copied to env when env is refreshed. The last cell related to the
environment is called gvenv and is used to index global variables.

¹ This is optional but recommended for ST compilers.

Memory operation. The store cell is used to simulate memory to record the mapping relationships of indexes and variable
values. After that, the cells input and output are used to realize external inputs and external outputs respectively.
The last cell, nextLoc, ensures that the index of a variable can always be incremented without duplication. The design
consideration behind this is that for complex languages, it is more effective to explicitly manage arbitrarily large
memory than use garbage collection [56].

3.3 Semantics of the Core Features
We implement the executable semantics covering most core features of ST and leave the vendor-specific functionalities
as potential extensions. For example, some compilers would use additional keywords to distinguish the declaration part
and the execution part of the program. In the following, we provide an overview of four core semantic features of ST,
including 1) data types, 2) main control statements, 3) declarations and calls of POUs and 4) memory operations. Before
diving into the details, we present the notations as follows.

3.3.1 Extended Data Types
The K framework supports diverse data types including identifiers (Id), integers (Int), bools (Bool), floats (Float) and
strings (String), which cover most of the requirements. However, there are still some unsupported data types needing
additional implementation in K-ST, which we call extended data types. These extended data types can be categorized into
two kinds: 1) elementary types (TIME, BYTE, WORD, DWORD, TIME_OF_DAY, DATE and DATE_AND_TIME) and 2) compound types
(ENUM and STRUCT). We implement these extended data types by the composition of built-in types and methods in K as
follows.

We take TIME_OF_DAY as an example to introduce elementary types. There are two notations for TIME_OF_DAY literals in
ST, e.g., TIME_OF_DAY#23:45:56.30 and TOD#23:45:56.30. Fig. 6 shows our implementation of the TIME_OF_DAY type together
with its relevant operations. Lines 1 and 2 respectively define the syntax of TIME_OF_DAY and how to parse it
(Get_TIME_OF_DAY). Line 3 is used to convert Get_TIME_OF_DAY to TIME_OF_DAY, which is achieved by two steps, Gtd2Td
and Standardization, where Gtd2Td realizes the conversion of the format and Standardization realizes content
conversion, e.g., replacing 60 minutes with 1 hour. Lines 4–11 define some arithmetic and relational operations of
TIME_OF_DAY.

For compound types, we take STRUCT as an example and show its semantics in Fig. 7, including both STRUCT declaration
and instantiation. Declarations are shown in rule Struct Declaration, where we allocate memory for each defined data
structure. The instantiation of STRUCT consists of four main steps in rule Struct Instantiation: 1) CreatStruct
allocates memory for I1, 2) StructInits generates each variable in turn according to Vds in STRUCT, 3) Set assigns
values to the corresponding variables according to Idvs, and finally, 4) Update stores the mapping relationship of
variables related to I1 into the memory of I1 to facilitate subsequent use.
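The two-step treatment of TIME_OF_DAY literals described above (a format conversion followed by a content standardization such as replacing 60 minutes with 1 hour) can be sketched in Python as follows. This is not the K definition from Fig. 6: the functions `parse_tod` and `standardize` are hypothetical stand-ins for the roles of Gtd2Td and Standardization, the accepted literal shape is an assumption based on the two examples in the text, and values of 24 hours or more are left untouched because the text does not say how they are treated.

```python
import re
from typing import Tuple

# Assumed literal shape: TIME_OF_DAY#h:m:s[.fraction] or TOD#h:m:s[.fraction]
_TOD_LITERAL = re.compile(
    r"^(?:TIME_OF_DAY|TOD)#(\d+):(\d+):(\d+(?:\.\d+)?)$", re.IGNORECASE
)


def parse_tod(literal: str) -> Tuple[int, int, float]:
    """Format conversion: split a literal into (hours, minutes, seconds)."""
    match = _TOD_LITERAL.match(literal.strip())
    if match is None:
        raise ValueError(f"not a TIME_OF_DAY literal: {literal!r}")
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


def standardize(hours: int, minutes: int, seconds: float) -> Tuple[int, int, float]:
    """Content conversion: carry excess seconds into minutes and minutes into hours."""
    total_ms = (hours * 3600 + minutes * 60) * 1000 + round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    return hours, minutes, rest / 1000


if __name__ == "__main__":
    print(standardize(*parse_tod("TOD#23:45:56.30")))
    print(standardize(22, 60, 0.0))  # 60 minutes are carried into the hour field
```

Working in integer milliseconds keeps the carries exact; only the final seconds field is converted back to a float.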
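The interplay between the env, store and nextLoc cells and the four STRUCT instantiation steps can be pictured with a small Python model. It is a sketch under stated assumptions rather than the rules of Fig. 7: the class `Memory` and the function `instantiate_struct` are hypothetical, field declarations are simplified to (name, default) pairs, and the comments only indicate which step of the description each line is meant to mirror.

```python
from typing import Dict, List, Tuple


class Memory:
    """Models the store cell (location -> value) and the nextLoc counter."""

    def __init__(self) -> None:
        self.store: Dict[int, object] = {}
        self.next_loc = 0

    def alloc(self, value: object = None) -> int:
        """Hand out a fresh location; locations are never reused."""
        loc = self.next_loc
        self.store[loc] = value
        self.next_loc += 1
        return loc


def instantiate_struct(
    mem: Memory,
    env: Dict[str, int],
    name: str,
    fields: List[Tuple[str, object]],
    inits: Dict[str, object],
) -> int:
    # Step 1 (role of CreatStruct): allocate a location for the instance itself.
    instance_loc = mem.alloc()
    # Step 2 (role of StructInits): create each declared field in turn.
    field_env = {field: mem.alloc(default) for field, default in fields}
    # Step 3 (role of Set): apply the explicit initial values.
    for field, value in inits.items():
        if field not in field_env:
            raise KeyError(f"unknown field: {field}")
        mem.store[field_env[field]] = value
    # Step 4 (role of Update): keep the field mapping in the instance's memory.
    mem.store[instance_loc] = field_env
    env[name] = instance_loc
    return instance_loc


mem, env = Memory(), {}
instantiate_struct(mem, env, "p", [("x", 0), ("y", 0)], {"y": 5})
field_locations = mem.store[env["p"]]
print(mem.store[field_locations["x"]], mem.store[field_locations["y"]])
```

Because every allocation bumps the counter, two variables can never share a location, which is the property the text attributes to nextLoc.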
and Statements. For Declaration types, we list the number of tests for CONSTANT, VAR_GLOBAL, VAR, VAR_INPUT,
VAR_OUTPUT, VAR_IN_OUT, VAR_TEMP and VAR_EXTERNAL. For Data types, we list the number of tests for elementary types
signed integer (INT, DINT, SINT, LINT), unsigned integer (UINT, UDINT, USINT, ULINT), float (REAL, LREAL), Boolean
(BOOL), byte (BYTE, WORD, DWORD), string (STRING, WSTRING), and time (TIME, DATE, TIME_OF_DAY, DATE_AND_TIME);
compound types enum (ENUM) and struct (STRUCT); and finally, the array type ARRAY. For Statements, we list the number
of tests for main control statements: IF, CASE, FOR, WHILE, REPEAT, EXIT and RETURN.

As indicated in Fig. 14, compared with FUNCTION, the FUNCTION_BLOCK is more favored by ST programmers (PROGRAM is
necessary for ST program operation). For Declaration types, the most used is VAR (with a ratio of 470/509), followed by
VAR_INPUT (386/509), VAR_OUTPUT (360/509) and VAR_IN_OUT (313/509). Among all the Data types, BOOL is the most used,
followed by unsigned integer and ARRAY, constituting 322/509 and 311/509 respectively. In addition, we must remark that
we do not count the type of array members. Finally, IF is the most common statement in all the tests considered. This
is also in line with the main working scenarios of PLCs.

We remark that we do not consider the vendor-based functions in this experiment as these functions vary not only from
vendor to vendor, but even from product to product. In particular, Mitsubishi PLCs provide completely different data
types, including Bit, Word[Signed/Unsigned], Double Word[Signed/Unsigned], Bit STRING[16-bit/32-bit], FLOAT,
STRING[32] and Time. Siemens PLCs support keyword BEGIN to represent the end of variable declaration and the beginning
of operation instructions. In addition, there are also obvious differences between different products of the same
vendor. For example, the S7-1500 and the S7-1200 from Siemens support different type conversion methods⁵, where the
former only provides explicit conversions of types, and the latter provides both explicit and implicit conversions.

⁵ https://support.industry.siemens.com/dl/dl-media/272/109742272/att_918238/v6/93516999691/zh-CHS/index.html#ae443583b99950f7cca0d7237fe81ad4

5.2.2 Semantics Correctness (RQ2)
On the other hand, in order to evaluate the correctness of K-ST, we compared the execution results of K-ST against
those of vendor compilers CODESYS, CX-Programmer and GX Works2. We consider the proposed semantics correct if the
execution behaviors of K-ST are consistent with the ones of the CODESYS, CX-Programmer and GX Works2 compilers. The
consistency criteria described in Section 4 are utilized to evaluate the consistency of behavior between K-ST and the
compilers provided by vendors. Specifically, if K-ST and these compilers demonstrate identical execution and variable
states for the same program, their behavior is deemed consistent. We list the coverage of the K-ST semantics in TABLE 6
from the perspective of each feature specified by the official ST documentation, where FC, C and N mean “Fully Covered
and Consistent with Compilers”, “Covered and Consistent with Compilers” and “Not Covered”, respectively.

From TABLE 6, we can see clearly that for POUs, we fully cover the declaration and call. In variable declarations,
AT is related to input and output. We remark, however, that the storage mode of variables in K is very different from
that in real PLCs, so we just support simple computer-side input and output. In addition, RETAIN and PERSISTENT are
related to the actual situation in the PLC, so they are not implemented. For instance, AT is used to bind the actual
point of the PLC; RETAIN and PERSISTENT support the preservation of variable values after a power failure or power
loss. Among all data types, ARRAY is the only one that is covered but not fully covered. Limited by the realization of
arrays, it is temporarily impossible to create arrays of enum and struct, or to assign values to multi-dimensional
arrays as a whole. In statements, the symbol ⇒ is already used by K itself and can be replaced by :=. For built-in
functions, we show a list of those we support, including 30 numerical functions, 9 logical functions, 9 string
functions and 160 translate functions.

In the process of comparing with CODESYS, CX-Programmer and GX Works2, the following points need to be explained.
Firstly, due to the closed nature of these compilers, they cannot simply be called, so we have to manually enter the
code into the compiler in the specified way, compile and run it, and compare the results, which is laborious and
tedious work. This also hinders us from testing these commercial compilers on a large scale. Secondly, different
vendors have obvious differences in the implementation of compilers, so the source code needs to be adapted to a
certain extent. For example, only 10 basic data types (Bit, Word[Signed/Unsigned], Double Word[Signed/Unsigned],
Bit STRING[16-bit/32-bit], FLOAT, STRING[32] and Time) are provided in the GX Works2 compiler, so we need to adapt the
variable types of the source program.
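The consistency criterion used in this comparison (identical variable states for the same program) amounts to a differential check between a reference run and a compiler run. The Python sketch below shows the idea on final variable states only; it is not the tooling used in the paper, the function names `compare_runs` and `is_consistent` are hypothetical, and the sample values are invented purely for illustration.

```python
from typing import Dict, Mapping, Tuple

MISSING = "<missing>"  # marker for a variable that one side did not report


def compare_runs(
    reference: Mapping[str, object], candidate: Mapping[str, object]
) -> Dict[str, Tuple[object, object]]:
    """Return every variable whose final value differs between the two runs."""
    differences: Dict[str, Tuple[object, object]] = {}
    for name in sorted(set(reference) | set(candidate)):
        ref_value = reference.get(name, MISSING)
        cand_value = candidate.get(name, MISSING)
        if ref_value != cand_value:
            differences[name] = (ref_value, cand_value)
    return differences


def is_consistent(reference: Mapping[str, object], candidate: Mapping[str, object]) -> bool:
    """Two runs are deemed consistent when no variable differs."""
    return not compare_runs(reference, candidate)


# Hypothetical final states of one program under the semantics and two compilers.
semantics_run = {"x": -1, "done": True}
compiler_a_run = {"x": -1, "done": True}
compiler_b_run = {"x": 2, "done": True}

print(is_consistent(semantics_run, compiler_a_run))
print(compare_runs(semantics_run, compiler_b_run))
```

Reporting the differing variables, rather than a bare verdict, is what makes a mismatch traceable back to a specific language feature.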
TABLE 7: The results of K-ST and OpenPLC

Data Set                                          GitHub Set    Mutated Set
Number of samples                                 567           31059 (2271)
Number of programs run completely   K-ST          509           15850
                                    OpenPLC       490           11581
Inconsistent                        K_p O_f       30            5664
                                    K_f O_p       11            1395
                                    Diff. Result  0             735

functional deficiencies and bugs we found in OpenPLC. We show some relevant case studies in APPENDIX B. Considering
that Beremiz can be regarded as an updated version of OpenPLC, we have retested in Beremiz the inconsistencies we
found. The latest Beremiz fixes some problems, including negative MOD operation results and “VAR” parsing exceptions,
but other bugs and shortcomings still exist. We have submitted these problems to the OpenPLC and Beremiz developers
and are waiting for their confirmation⁶.
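The figures in TABLE 7 can be cross-checked with a few lines of arithmetic. The Python sketch below assumes, as the row labels suggest but the surrounding text does not spell out, that K_p O_f counts programs that run completely under K-ST but not under OpenPLC, and K_f O_p the reverse; the dictionary keys and the function `both_complete` are hypothetical names introduced for this check.

```python
# Values copied from TABLE 7; key names are illustrative only.
table7 = {
    "GitHub": {"samples": 567, "kst_ok": 509, "openplc_ok": 490, "kp_of": 30, "kf_op": 11},
    "Mutated": {"samples": 31059, "kst_ok": 15850, "openplc_ok": 11581, "kp_of": 5664, "kf_op": 1395},
}


def both_complete(row: dict) -> int:
    """Programs that run completely under both tools, derived in two ways.

    Assumption: kp_of = completes under K-ST only, kf_op = completes under OpenPLC only.
    """
    from_kst = row["kst_ok"] - row["kp_of"]
    from_openplc = row["openplc_ok"] - row["kf_op"]
    if from_kst != from_openplc:
        raise ValueError("table rows are not consistent under the assumed reading")
    return from_kst


for name, row in table7.items():
    share = round(100 * row["kst_ok"] / row["samples"], 1)
    print(name, both_complete(row), f"{share}% run completely under K-ST")
```

Under this reading both derivations agree for each data set, which supports the interpretation of the two row labels.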
provide a method based on automata theory to verify LTL properties on BM. Our work differs from the aforementioned
works because they attempt to transform PLC programs into intermediate languages or other programming languages which
are suitable for verifying or detecting potential issues, and lack analysis in the conversion process. In addition,
these methods do not offer feedback at the level of source code.

Huang et al. [38] is the closest work to ours. They first defined the executable semantics of the ST language in K and
used it to check some security properties. Our work differs because we cover a more complete ST language, and we can
use it to discover errors in ST compilers.

7 CONCLUSION
In this paper, we introduced an executable operational semantics of ST formalized in the K framework. We presented the
semantics of the core features of ST, namely data types, memory operations, its main control statements, and function
calls. Our experimental results show that the proposed ST semantics has already covered the main core language features
and correctly implements 26,137 lines of public ST code on GitHub. Furthermore, the application of the proposed
semantics in testing and analyzing PLC compilers is discussed. By comparing and analyzing the execution results of
OpenPLC and K-ST, we found five bugs and nine functional deficiencies in OpenPLC. In the future, we hope to further
extend K-ST to support the programming environments provided by different vendors. For example, vendors may customize
keywords (Bit STRING of GX Works2), add additional structures (LABEL of Siemens), or even widely extend ST (ExST of
CODESYS).

ACKNOWLEDGMENTS
We thank the reviewers for their constructive feedback. This research is supported by National Key R&D Program of China
under grant 2020YFB2010900, NSFC under grants 61833015 and 62293511, Provincial Key R&D Program of Zhejiang under grants
2020C01038 and 2021C01032, and the Starry Night Science Fund of Zhejiang University Shanghai Institute for Advanced
Study, Grant No. SN-ZJU-SIAS-001.

REFERENCES
[1] R. Langner, “Stuxnet: Dissecting a cyberwarfare weapon,” IEEE Security & Privacy, vol. 9, no. 3, pp. 49–51, 2011.
[2] G. Liang, S. R. Weller, J. Zhao, F. Luo, and Z. Y. Dong, “The 2015 Ukraine blackout: Implications for false data injection attacks,” IEEE Transactions on Power Systems, vol. 32, no. 4, pp. 3317–3318, 2016.
[3] K. Zetter, “The Ukrainian power grid was hacked again,” Motherboard, 2017.
[4] N. Perlroth and C. Krauss, “A cyberattack in Saudi Arabia had a deadly goal,” Experts fear another try, 2018.
[5] D. Tychalas and M. Maniatakos, “Open platform systems under scrutiny: A cybersecurity analysis of the device tree,” in 2018 25th IEEE International Conference on Electronics, Circuits and Systems (ICECS). IEEE, 2018, pp. 477–480.
[6] A. Nochvay, “Security research: CODESYS runtime, a PLC control framework,” Kaspersky ICS CERT, 2019.
[7] D. Tychalas, H. Benkraouda, and M. Maniatakos, “ICSFuzz: Manipulating I/Os and repurposing binary code to enable instrumented fuzzing in ICS control applications,” in 30th USENIX Security Symposium (USENIX Security 21), 2021.
[8] “Programmable controllers - Part 3: Programming languages,” International Electrotechnical Commission, Standard, 2013.
[9] T. M. Antonsen, PLC Controls with Structured Text (ST), V3: IEC 61131-3 and best practice ST programming. BoD–Books on Demand, 2020.
[10] J. O. Blech and S. O. Biha, “On formal reasoning on the semantics of PLC using Coq,” arXiv preprint arXiv:1301.3047, 2013.
[11] J. O. Blech and S. Ould Biha, “Verification of PLC properties based on formal semantics in Coq,” in International Conference on Software Engineering and Formal Methods. Springer, 2011, pp. 58–73.
[12] T. Ovatman, A. Aral, D. Polat, and A. O. Ünver, “An overview of model checking practices on verification of PLC software,” Software & Systems Modeling, vol. 15, no. 4, pp. 937–960, 2016.
[13] H. Janicke, A. Nicholson, S. Webber, and A. Cau, “Runtime-monitoring for industrial control systems,” Electronics, vol. 4, no. 4, pp. 995–1017, 2015.
[14] L. Garcia, S. Zonouz, D. Wei, and L. P. De Aguiar, “Detecting PLC control corruption via on-device runtime verification,” in 2016 Resilience Week (RWS). IEEE, 2016, pp. 67–72.
[15] D. Darvas, B. F. Adiego, A. Vörös, T. Bartha, E. B. Vinuela, and V. M. G. Suárez, “Formal verification of complex properties on PLC programs,” in International Conference on Formal Techniques for Distributed Objects, Components, and Systems. Springer, 2014, pp. 284–299.
[16] D. Darvas, I. Majzik, and E. B. Viñuela, “Formal verification of safety PLC based control software,” in International Conference on Integrated Formal Methods. Springer, 2016, pp. 508–522.
[17] L. Garcia, F. Brasser, M. H. Cintuglu, A.-R. Sadeghi, O. A. Mohammed, and S. A. Zonouz, “Hey, my malware knows physics! Attacking PLCs with physical model aware rootkit,” in NDSS, 2017.
[18] R. Spenneberg, M. Brüggemann, and H. Schwartke, “PLC-Blaster: A worm living solely in the PLC,” Black Hat Asia, vol. 16, pp. 1–16, 2016.
[19] A. Keliris and M. Maniatakos, “ICSREF: A framework for automated reverse engineering of industrial control systems binaries,” arXiv preprint arXiv:1812.03478, 2018.
[20] S. Guo, M. Wu, and C. Wang, “Symbolic execution of programmable logic controller code,” in Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, 2017, pp. 326–336.
[21] S. E. McLaughlin, S. A. Zonouz, D. J. Pohly, and P. D. McDaniel, “A trusted safety verifier for process controller code,” in NDSS, vol. 14, 2014.
[22] G. Canet, S. Couffin, J. Lesage, A. Petit, and P. Schnoebelen, “Towards the automatic verification of PLC programs written in instruction list,” in Proceedings of the IEEE International Conference on Systems, Man & Cybernetics: “Cybernetics Evolving to Systems, Humans, Organizations, and their Complex Interactions”. IEEE, 2000, pp. 2449–2454.
[23] J. Xiong, X. Bu, Y. Huang, J. Shi, and W. He, “Safety verification of IEC 61131-3 Structured Text programs,” IEEE Transactions on Industrial Informatics, vol. 17, no. 4, pp. 2632–2640, 2020.
[24] M. Zhang, C.-Y. Chen, B.-C. Kao, Y. Qamsane, Y. Shao, Y. Lin, E. Shi, S. Mohan, K. Barton, J. Moyne et al., “Towards automated safety vetting of PLC code in real-world plants,” in 2019 IEEE Symposium on Security and Privacy (SP). IEEE, 2019, pp. 522–538.
[25] N. Bauer, S. Engell, R. Huuck, S. Lohmann, B. Lukoschus, M. Remelhe, and O. Stursberg, “Verification of PLC programs given as sequential function charts,” in Integration of software specification techniques for applications in Engineering. Springer, 2004, pp. 517–540.
[26] A. Mader and H. Wupper, “Timed automaton models for simple programmable logic controllers,” in Proceedings of 11th Euromicro Conference on Real-Time Systems. Euromicro RTS’99. IEEE, 1999, pp. 106–113.
[27] T. Mertke and G. Frey, “Formal verification of PLC programs generated from signal interpreted Petri nets,” in 2001 IEEE International Conference on Systems, Man and Cybernetics. e-Systems and e-Man for Cybernetics in Cyberspace (Cat. No. 01CH37236), vol. 4. IEEE, 2001, pp. 2700–2705.
[28] R. Huuck, “Semantics and analysis of instruction list programs,” Electronic Notes in Theoretical Computer Science, vol. 115, pp. 3–18, 2005.
[29] J. Sadolewski, “Conversion of ST control programs to ANSI C for verification purposes,” e-Informatica Software Engineering Journal, vol. 5, no. 1, 2011.
[30] B. F. Adiego, D. Darvas, E. B. Viñuela, J.-C. Tournier, V. M. G. Suárez, and J. O. Blech, “Modelling and formal verification of timing aspects in large PLC programs,” IFAC Proceedings Volumes, vol. 47, no. 3, pp. 3333–3339, 2014.
[31] O. Maler and S. Yovine, “Hardware timing verification using KRONOS,” in Proceedings of the Seventh Israeli Conference on Computer
L. Zhang, “A survey of compiler testing,” ACM Computing Surveys (CSUR), vol. 53, no. 1, pp. 1–36, 2020.
[36] W. M. McKeeman, “Differential testing for software,” Digital Technical Journal, vol. 10, no. 1, pp. 100–107, 1998.
[37] R. Schumi and J. Sun, “SpecTest: Specification-based compiler testing,” Fundamental Approaches to Software Engineering, vol. 12649, p. 269, 2021.
[38] Y. Huang, X. Bu, G. Zhu, X. Ye, X. Zhu, and J. Shi, “KST: Executable formal semantics of IEC 61131-3 structured text for verification,” IEEE Access, vol. 7, pp. 14 593–14 602, 2019.
[39] G. Rosu, “K: A semantic framework for programming languages and formal analysis tools,” Dependable Software Systems Engineering, vol. 50, p. 186, 2017.
[40] M. J. Hohnka, J. A. Miller, K. M. Dacumos, T. J. Fritton, J. D. Erdley, and L. N. Long, “Evaluation of compiler-induced vulnerabilities,” Journal of Aerospace Information Systems, vol. 16, no. 10, pp. 409–426, 2019.
[41] M. Marcozzi, Q. Tang, A. F. Donaldson, and C. Cadar, “Compiler fuzzing: How much does it matter?” Proceedings of the ACM on Programming Languages, vol. 3, no. OOPSLA, pp. 1–29, 2019.
[42] T. R. Alves, M. Buratto, F. M. De Souza, and T. V. Rodrigues, “OpenPLC: An open source alternative to automation,” in IEEE Global Humanitarian Technology Conference (GHTC 2014). IEEE, 2014, pp. 585–589.
[43] D. Darvas, I. Majzik, and E. Blanco Viñuela, “Generic representation of PLC programming languages for formal verification,” in 23rd PhD Mini-Symposium. Budapest University of Technology and Economics, 2016, pp. 6–9.
[44] N. Roos, “Programming PLCs using structured text,” in International Multiconference on Computer Science and Information Technology. Citeseer, 2008, pp. 20–22.
[45] F. Markovic, “Automated test generation for structured text language using uppaal model checker,” 2015.
[46] M. Tiegelkamp and K.-H. John, IEC 61131-3: Programming industrial automation systems. Springer, 2010.
[47] N. Martí-Oliet and J. Meseguer, “Rewriting logic: roadmap and bibliography,” Theoretical Computer Science, vol. 285, no. 2, pp. 121–154, 2002.
[48] A. Stefănescu, D. Park, S. Yuwen, Y. Li, and G. Roşu, “Semantics-based program verifiers for all languages,” ACM SIGPLAN Notices, vol. 51, no. 10, pp. 74–91, 2016.
[49] C. Ellison and G. Rosu, “An executable formal semantics of C with applications,” ACM SIGPLAN Notices, vol. 47, no. 1, pp. 533–544, 2012.
[50] D. Bogdanas and G. Roşu, “K-Java: A complete semantics of Java,” in Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2015, pp. 445–456.
[51] D. Park, A. Stefănescu, and G. Roşu, “KJS: A complete formal semantics of JavaScript,” in Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2015, pp. 346–356.
[52] F. Wang, F. Song, M. Zhang, X. Zhu, and J. Zhang, “KRust: A formal executable semantics of Rust,” in 2018 International Symposium on Theoretical Aspects of Software Engineering (TASE). IEEE, 2018, pp. 44–51.
[53] J. Jiao, S. Kan, S.-W. Lin, D. Sanan, Y. Liu, and J. Sun, “Semantic understanding of smart contracts: Executable operational semantics of Solidity,” in 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 2020, pp. 1695–1712.
[54] T. Nipkow and G. Klein, “Imp: A simple imperative language,” in Concrete Semantics. Springer, 2014, pp. 75–94.
[55] D. D. McCracken and E. D. Reilly, “Backus-Naur Form (BNF),” in Encyclopedia of Computer Science, 2003, pp. 129–131.
Systems and Software Engineering. IEEE, 1996, pp. 23–29. [56] G. Roşu and T. F. Şerbănuţă, “K overview and simple case study,”
[32] M. Heiner and T. Menzel, “Petri net semantics for the PLC user Electronic Notes in Theoretical Computer Science, vol. 304, pp. 3–56,
programming language Instruction List,” Techn. Report BTU Cot- 2014.
tbus, I-20/1997, Cottbus December, 1997. [57] E. V. Kuzmin, A. Shipov, and D. A. Ryabukhin, “Construction and
[33] V. Le, M. Afshari, and Z. Su, “Compiler validation via equivalence verification of PLC programs by LTL specification,” in 2013 Tools
modulo inputs,” ACM Sigplan Notices, vol. 49, no. 6, pp. 216–226, & Methods of Program Analysis. IEEE, 2013, pp. 15–22.
2014. [58] D. Darvas, E. Blanco Vinuela, and I. Majzik, “A formal specifica-
[34] X. Yang, Y. Chen, E. Eide, and J. Regehr, “Finding and understand- tion method for PLC-based applications,” 2015.
ing bugs in C compilers,” in Proceedings of the 32nd ACM SIGPLAN [59] B. F. Adiego, D. Darvas, E. B. Viñuela, J.-C. Tournier, S. Bliudze,
conference on Programming language design and implementation, 2011, J. O. Blech, and V. M. G. Suárez, “Applying model checking to
pp. 283–294. industrial-sized PLC programs,” IEEE Transactions on Industrial
[35] J. Chen, J. Patra, M. Pradel, Y. Xiong, H. Zhang, D. Hao, and Informatics, vol. 11, no. 6, pp. 1400–1410, 2015.
[60] M. Hailesellasie and S. R. Hasan, “Intrusion detection in PLC-based industrial control systems using formal verification approach in conjunction with graphs,” Journal of Hardware and Systems Security, vol. 2, no. 1, pp. 1–14, 2018.
[61] D. Bohlender and S. Kowalewski, “Compositional verification of PLC software using horn clauses and mode abstraction,” IFAC-PapersOnLine, vol. 51, no. 7, pp. 428–433, 2018.
[62] B. C. Rawlings, J. M. Wassick, and B. E. Ydstie, “Application of formal verification and falsification to large-scale chemical plant automation systems,” Computers & Chemical Engineering, vol. 114, pp. 211–220, 2018.

Jun Sun is currently a tenured professor at the School of Information Systems, Singapore Management University. He received bachelor’s and Ph.D. degrees in computing science from the National University of Singapore (NUS) in 2002 and 2006, respectively. From 2010 to 2019, he was an assistant/associate professor at the Singapore University of Technology and Design. He was a visiting scholar at MIT from 2011 to 2012. His research focuses on software engineering, formal methods, program analysis, and cyber-security. He is the co-founder of the PAT model checker.