Syntax Wikipedia
The syntax of a language defines its surface form.[1] Text-based computer languages are based on
sequences of characters, while visual programming languages are based on the spatial layout and
connections between symbols (which may be textual or graphical). Documents that are syntactically
invalid are said to have a syntax error. When designing the syntax of a language, a designer might
start by writing down examples of both legal and illegal strings, before trying to figure out the
general rules from these examples.[2]
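The example-first approach described above can be sketched in a few lines of Python. The sample strings, the identifier-like language they describe, and the candidate rule are all assumptions made for illustration; none of them come from the cited source.

```python
import re

# Hypothetical examples a designer might write down first.
LEGAL = ["x", "_tmp", "a1", "total_sum"]
ILLEGAL = ["", "1a", "a-b", "two words"]

# A candidate general rule inferred from the examples: a letter or
# underscore, followed by any number of letters, digits or underscores.
CANDIDATE_RULE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_legal(text):
    """Return True if the whole string is accepted by the candidate rule."""
    return CANDIDATE_RULE.fullmatch(text) is not None

# The rule is only plausible if it agrees with every example.
mismatches = [s for s in LEGAL if not is_legal(s)]
mismatches += [s for s in ILLEGAL if is_legal(s)]
print("rule agrees with all examples" if not mismatches else mismatches)
```

If the rule disagrees with an example, either the rule or the example is revised, and the check is run again.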
Syntax therefore refers to the form of the code, and is contrasted with semantics – the meaning. In
processing computer languages, semantic processing generally comes after syntactic processing;
however, in some cases, semantic processing is necessary for complete syntactic analysis, and these
are done together or concurrently. In a compiler, the syntactic analysis comprises the frontend,
while the semantic analysis comprises the backend (and middle end, if this phase is distinguished).
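As a rough illustration of the contrast between form and meaning, the sketch below uses Python's standard ast module to separate text that is not well-formed from text that is well-formed but has no valid meaning. The function name classify and the sample inputs are assumptions chosen for the example. In Python, this kind of semantic error surfaces only at run time, which is one of several possible designs.

```python
import ast

def classify(source):
    """Classify a Python expression by where, if anywhere, it fails."""
    try:
        # Syntactic processing: does the text have a valid form?
        tree = ast.parse(source, mode="eval")
    except SyntaxError:
        return "syntax error"
    try:
        # Semantic processing: does the well-formed text mean anything?
        eval(compile(tree, "<example>", "eval"), {})
    except Exception as exc:
        return "well-formed, but fails with " + type(exc).__name__
    return "well-formed and meaningful"

for text in ["1 +", "1 + 'a'", "missing_name + 1", "1 + 2"]:
    print(repr(text), "->", classify(text))
```

Only the first input is rejected on form alone. The second and third parse successfully and fail later, for a type and a name reason respectively.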
Levels of syntax
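Computer language syntax is generally distinguished into three levels:
Words – the lexical level, determining how characters form tokens;
Phrases – the grammar level, narrowly speaking, determining how tokens form phrases;
Context – determining what objects or variable names refer to, whether types are valid, etc.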
Distinguishing in this way yields modularity, allowing each level to be described and processed
separately and often independently. First, a lexer turns the linear sequence of characters into a
linear sequence of tokens; this is known as "lexical analysis" or "lexing". Second, the parser turns the
linear sequence of tokens into a hierarchical syntax tree; this is known as "parsing" narrowly
speaking. Third, the contextual analysis resolves names and checks types. This modularity is
sometimes possible, but in many real-world languages an earlier step depends on a later step – for
example, the lexer hack in C exists because tokenization depends on context. Even in these cases,
syntactical analysis is often seen as approximating this ideal model.
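The three steps can be sketched for a toy language of additions over numbers and names. This is a minimal illustration under stated assumptions, not a description of any real compiler: the grammar (expr := term ('+' term)*, term := NUM | NAME | '(' expr ')'), the tuple-based tree, and the function names lex, parse and check are all invented for the example.

```python
import re

TOKEN_RE = re.compile(
    r"(?P<NUM>\d+)|(?P<NAME>[A-Za-z_]\w*)|(?P<OP>[+()])|(?P<WS>\s+)"
)

def lex(text):
    """Lexical level: characters -> linear sequence of (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r}")
        if m.lastgroup != "WS":          # whitespace separates tokens only
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

def parse(tokens):
    """Grammar level: linear tokens -> hierarchical tree of nested tuples."""
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        kind, value = tokens[pos]
        pos += 1
        if kind == "NUM":
            return ("num", int(value))
        if kind == "NAME":
            return ("name", value)
        if value == "(":
            node = expr()
            if pos >= len(tokens) or tokens[pos][1] != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    def expr():
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos][1] == "+":
            pos += 1
            node = ("add", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos][1]!r}")
    return tree

def check(tree, defined):
    """Context level: every name in the tree must refer to something."""
    if tree[0] == "name" and tree[1] not in defined:
        raise NameError(f"undefined name {tree[1]!r}")
    if tree[0] == "add":
        check(tree[1], defined)
        check(tree[2], defined)

tokens = lex("x + (2 + y)")
tree = parse(tokens)
check(tree, defined={"x", "y"})   # passes; drop "y" to get a NameError
print(tree)
```

Each function consumes only the output of the previous one, which is the modularity described above. A language needing something like the lexer hack would break this separation, because lex would have to consult information that only check collects.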
The parsing stage itself can be divided into two parts: the parse tree, or "concrete syntax tree",
which is determined by the grammar, but is generally far too detailed for practical use, and the
abstract syntax tree (AST), which simplifies this into a usable form. The AST and contextual analysis
steps can be considered a form of semantic analysis, as they are adding meaning and interpretation
to the syntax, or alternatively as informal, manual implementations of syntactical rules that would
be difficult or awkward to describe or implement formally.
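The difference between the two trees can be made concrete with a small sketch. The concrete syntax tree below is written out by hand for the input "(1 + 2)" under an assumed grammar (expr -> term | expr "+" term; term -> NUM | "(" expr ")"). Both the grammar and the helper to_ast are hypothetical, chosen only to show how much detail the simplification discards.

```python
# Concrete syntax tree: one node per grammar rule applied, including
# punctuation and single-child chains such as expr -> term -> NUM.
CST = ("expr",
       ("term",
        ("(",),
        ("expr",
         ("expr", ("term", ("NUM", "1"))),
         ("+",),
         ("term", ("NUM", "2"))),
        (")",)))

def to_ast(node):
    """Simplify a concrete syntax tree into an abstract syntax tree."""
    label, *children = node
    if label == "NUM":
        return int(children[0])
    # Parentheses only guide parsing; the tree shape already records grouping.
    kids = [c for c in children if c[0] not in ("(", ")")]
    if len(kids) == 1:
        return to_ast(kids[0])           # collapse single-child chains
    left, op, right = kids
    return (op[0], to_ast(left), to_ast(right))

print(to_ast(CST))   # ('+', 1, 2)
```

The concrete tree has ten nodes, while the abstract tree keeps only the operator and its two operands, which is the form later stages usually work with.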