Pron TLA
Mostly theory; complements a tutorial. Will clarify concepts that are hard to grasp
when learning TLA+.
TLA+ is small and simple but does contain ideas that may take time to understand.
These ideas are not necessary to write good specifications, and these posts don't
teach you how to write good specifications, so the posts are neither necessary nor
sufficient to put TLA+ to good use, BUT they will help you *understand* TLA+,
specifically
1. why TLA+
2. how its design and formal theory compare to other formal methods
3. how powerful a tool mathematics is for reasoning about the systems we design and
build
Learning Resources
1. The TLA+ Hyperbook
2. Specifying Systems
3. Online TLA+ Video Course
4. learntla.com (heavy emphasis on PlusCal)
5. Dr. TLA+ Series of Lectures
6. github repo of examples: https://github.com/tlaplus/Examples
7. Papers by Lamport http://lamport.azurewebsites.net/tla/papers.html
8. Papers by Stephan Merz
TLA+ in Practice
1. Use of Formal Methods at Amazon Web Services
http://lamport.azurewebsites.net/tla/formal-methods-amazon.pdf
2. Why Amazon Chose TLA+
http://link.springer.com/chapter/10.1007/978-3-662-43652-3_3
(I have Chris Newcombe's email with this paper, I think)
3. Slides http://tla2012.loria.fr/contributed/newcombe-slides.pdf
e.g.: a specification says the output of a function should be the input, but sorted
(the what), without specifying which algorithm to use to sort (the how).
Interesting: the 'what' of one layer can be (part of) the 'how' of a higher layer.
Thus calculating a median needs to sort first (still without specifying how the
sort is done). This notion that the 'how' of one layer is related to the 'what' of
another is given a precise meaning by the abstraction/refinement relation, which is
at the heart of TLA+'s theory.
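A minimal sketch of what such a 'what, not how' sorting specification might look
like in TLA+ (my own illustration; the names SortsTo, IsSorted, Count are mine, not
from the posts):

    EXTENDS Naturals, Sequences, FiniteSets

    \* Values occurring in a sequence, and how often each value occurs
    Range(s)    == {s[i] : i \in 1..Len(s)}
    Count(s, x) == Cardinality({i \in 1..Len(s) : s[i] = x})

    \* The "what": out is in non-decreasing order and is a rearrangement of inp.
    \* Nothing here says which algorithm produces out (the "how").
    IsSorted(s) == \A i \in 1..(Len(s) - 1) : s[i] <= s[i + 1]
    SortsTo(inp, out) ==
        /\ IsSorted(out)
        /\ \A v \in Range(inp) \cup Range(out) : Count(inp, v) = Count(out, v)

A mergesort or quicksort spec would then *refine* SortsTo, and a median spec could
reuse SortsTo as part of its own 'how'.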
But they cannot prove *global* properties of programs - e.g. that the program will
eventually respond to every user request, or that the database will not lose data.
But these abilities come at a high cost: these languages are extremely complex,
with steep learning curves, and are used by specialists/academics and rarely in
industry.
TLA+ focuses on
1. industrial use and user friendliness (like Z)
2. a universal formalization of mathematics (like Isabelle and Coq)
TLA+ Ideology/Aesthetics
Lamport observes
- programming languages need to be complex (for good reasons)
- specification languages, built for reasoning about specifications, can be
simpler
- there is much to be gained by separating programming and reasoning
The main difference is between reasoning about the program with (/in) a programming
language and reasoning with math.
Lamport Motivations/Quotes
- I want to verify the algorithms I write
- We are not motivated by abstract elegance, but by the practical problem of
reasoning about real algorithms
LL: "The best way to get better programmers is to teach programmers how to think
better" . Use the same 'language of thinking' as every science- mathematics"
Lamport says his colleagues, working with various temporal logics, "spent days
trying to specify a simple FIFO queue, arguing about the sufficiency of the
specified properties. I realized that despite the aesthetic appeal, writing a
specification as a conjunction of temporal properties does not work in practice."
"TLA gave me, for the first time, a formalism in which I was able to write
completely formal proofs without first adding a layer of formal semantics."
So, in essence, you can work at any level of abstraction you are interested in
(examples for all these? these claims are fantastic and no doubt true, but need
*examples*).
TLA+ is not a programming language. It has no notion of stack, heap, I/O, memory,
or subroutine. (PlusCal does have processes, stacks and subroutines.)
But any kind of hardware or software system and *almost* any kind of algorithm can
be written in it succinctly and elegantly.
By reasoning about systems and algorithms rather than programs, we gain power and
simplicity at the cost of being unable to mechanically translate the specification
into an executable.
TLA can *specify* probabilistic algorithms but does not have the ability to reason
about them. (TODO: Any work on extending TLA? Look for papers which cite Lamport's
papers on Google Scholar)
It *is* easy to specify worst case time or space complexity (how!!! example??)
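One plausible answer (my sketch, not from the posts): add an auxiliary variable that
counts steps and state the worst-case bound as an invariant; the toy algorithm,
the variable ops, and the bound below are made up for illustration.

    EXTENDS Naturals
    VARIABLES x, ops

    \* A toy algorithm that increments x up to 10, plus an auxiliary step
    \* counter ops that measures "time".
    Init == x = 0 /\ ops = 0
    Next == x < 10 /\ x' = x + 1 /\ ops' = ops + 1

    Spec == Init /\ [][Next]_<<x, ops>>

    \* Worst-case time complexity, stated as an invariant the model checker can check:
    TimeBound == ops <= 10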
TLA+ comes with a model checker (TLC) and a proof assistant (TLAPS).
But the model checker is more useful, since it gives you a counterexample that
invalidates the sloppy systems you design, whereas the proof system can only prove
your system correct if it *is* correct (unlikely for initial system designs).
(So one implication is that you use the model checker on your system design until
you get the 'all clear', and then try to prove its correctness?)
TLA+ has the following parts
1. (the core ==) TLA, the Temporal Logic of Actions - analogous to ODEs. ODEs
describe continuous physical systems, TLA describes discrete dynamical systems.
    - TLA supports assertional reasoning and has a precise notion of refinement,
which is a precise definition of the abstraction/implementation relationship (e.g.
a parallel mergesort *implements*/refines a mergesort)
    - TLA incorporates some Linear Temporal Logic but tries to minimize temporal
reasoning.
2. 'state space' handling. The '+' part of TLA+ uses a formal set theory based
on ZFC to allow TLA+ variables to take on many kinds of values (numbers, finite
and infinite sequences, sets, records and functions) (SA? so any abstraction that
can be built on sets? that is all of mathematics!)
3. a 'proof system', TLAPS, which is actually a front end for the TLA+ proof
language and uses automated solvers and the proof assistant Isabelle as
backends.
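For concreteness, a minimal example (mine, not from the posts) of the canonical
shape of a TLA specification - an initial predicate plus a boxed next-state action:

    ---- MODULE Counter ----
    EXTENDS Naturals

    VARIABLE n

    Init == n = 0
    Next == n' = n + 1

    \* The canonical TLA shape: the initial condition, plus "every step is a
    \* Next step or a stuttering step". Fairness conjuncts would be added here.
    Spec == Init /\ [][Next]_n

    \* A '+'-level (data logic) property, checkable by TLC as an invariant:
    TypeOK == n \in Nat
    ====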
Lamport Quote
- people look for a magic (math) bullet = you have a problem, you find some
'right math', and the math solves the problem for you.
- that is not how it works. First you understand something, and then you find
the math to express that understanding. The math does not provide the
understanding.
- many people addressing the same problem I do (== specification and
verification) are looking for new math abstractions. I gave up that approach. I
discovered that for proving the correctness of a concurrent algorithm there is one
basic approach - proving (the maintenance of) an invariant. And you can package
that (invariant maintenance) in many ways, but those packagings don't make the
proof (of correctness of a concurrent algorithm) any simpler.
- what I've done is to use a method that goes as directly as possible from the
problems into the math
- and since I describe the algorithm in terms of math, I don't need any
semantics to translate from my description into math.
Being able to think precisely, i.e. mathematically, is a prerequisite to using any
kind of formalism, *but that ability improves with actual work*. (KEY)
TLA+ is great for learning to think mathematically (in the context of reasoning
about programs) because it is much simpler than other formalisms invented to reason
about programs.
It uses the simplest possible math and the simplest possible logic, so you can
concentrate on the program.
We begin to study the details of the language, specifically those bits that describe
the static state (?) of a computation, namely the program's data and the data
operations (?) that form the primitive building blocks of the computation. We call
these bits the 'data logic'.
TLA+ uses (the logic) TLA to describe computations as discrete dynamical systems.
The 'state space' of TLA formulas is a logical structure that contains the values
that TLA variables can take.
The above => we use mathematics to describe our software; for now, just the data
structures and basic operations.
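A few illustrations (my own) of the kinds of values and operations this 'data logic'
gives you:

    EXTENDS Naturals, Sequences

    Evens     == {n \in 0..10 : n % 2 = 0}        \* a set built by filtering
    Squares   == {n * n : n \in 1..4}             \* a set built by mapping
    Pt        == [name |-> "x", val |-> 3]        \* a record
    Double(s) == [i \in 1..Len(s) |-> 2 * s[i]]   \* a function over a sequence's indices
    Appended  == <<1, 2>> \o <<3>>                \* sequence concatenation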
The static component of TLA+ is a formal set theory Lamport calls ZFM,
Zermelo-Fraenkel for mathematics, which he explains in the short paper "Types are
not harmless" (read the paper, didn't understand anything).
The logic has a 'structure' (== 'domain of discourse' == what the logic is talking
about), which assigns a meaning to the signature.
The *meaning* of a 0-ary symbol like 2 is called a constant (?? a 0-ary symbol is
called a constant?).
The meanings of symbols of higher arity are either relations (e.g. <) or functions
(e.g. +).
The meaning of a variable is also defined by the structure, but here the order of
the logic matters.
The collection of values defined by the structure is called the 'universe' of the
logic.
In First Order Logic, a variable can refer to any value in the universe of the
logic.
e.g: the structure of our logic (aka 'the domain of discourse') can be the set of
integers with multiplication, addition and negation
The collection of *all* models of a formula A (iow, *all* possible assignments to
the formula's variables that make it true) is the 'formal semantics' of the formula
and is written [[A]].
We can colloquially refer to the semantics of a formula, i.e. *all* the models of
the formula, as "the model" of the formula.
Then, for a formula like x < 2 over the naturals, we say that this model (all the
models, the semantics) *specifies* the set of numbers less than 2.
(iow, the model of a formula - all interpretations (in the L4CS sense) that make it
true - is the same thing as its 'specification')
When a formula A is satisfied by every model, we say it is valid and we write |= A,
with no structure on the left hand side. The formula TRUE is satisfied by all
models; FALSE has no models at all.
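A small worked instance (my own), written TLA+-style: the models of x < 2 depend on
the structure the formula is interpreted in.

    EXTENDS Integers

    \* The models of "x < 2" are the assignments to x that satisfy it.
    NatModels == {x \in Nat : x < 2}   \* over the naturals: {0, 1}
    IntModels == {x \in Int : x < 2}   \* over the integers: infinitely many models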
When we work with a logic we work with a logical theory, a set of formulas (called
axioms) taken to be equivalent to the formula TRUE
A model of a theory (vs a single formula) is a model that satisfies all the axioms
of the theory. In other words, a theory specifies a model
Often a logic is not characterized with a model, just with a theory that then
specifies all appropriate structures (note: structure == domain of discourse).
e.g: We adopt the Peano axioms as the axioms of our logical theory.
Then natural numbers are a model of Peano arithmetic
e.g: the formula X > 2 has a different model if interpreted over the real numbers
vs naturals
(This seems to come down to what set the assignments to X are drawn from. Here the
interpretations/assignments to X can be drawn from the integers or from the real
numbers. In both cases the 'model', i.e. the assignments that satisfy the formula,
will be different - different sets.)
e.g.: Consider the formula E x . (x > 0 AND (A y . y > 0 => y >= x)), written with
explicit scopes for the quantifiers to avoid ambiguity. It is true when interpreted
over the integers (take x = 1), but false when interpreted over the reals (where for
any value of x > 0, you can always find a y between 0 and x, thus falsifying the
statement).
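The same example rendered in TLA+ notation (my rendering):

    EXTENDS Integers

    \* Over the integers this is TRUE, with x = 1 as the witness:
    OverInt == \E x \in Int : x > 0 /\ \A y \in Int : y > 0 => y >= x
    \* Over the reals the same formula is FALSE: for any x > 0, y = x / 2
    \* satisfies 0 < y < x, falsifying y >= x.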
A logic also has a 'calculus' which is a *syntactic* system for deriving formulas
from others. E.g "natural deduction" (Is 'calculus' a name for a system of
inference rules?).
A First Order Logic has connectives AND OR NOT etc, and quantifiers A, E
The variables in FOL can represent simple(?) values of the universe. If the
universe is natural numbers, then a variable, free or bound, refers to a number.
A Model Checker is a program that takes a formula (or a set of formulas) and a
candidate model M (an assignment to the variables of the formulas) and decides
whether M is a model of that formula or not, i.e. whether the assignment makes the
formula(s) evaluate to true.
Some unclear stuff about how Exists and Forall are defined in terms of CHOOSE. Ok
whatever.
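For what it's worth, my gloss (not from the posts): CHOOSE is Hilbert's epsilon
operator, and the quantifiers can be expressed through it.

    EXTENDS Naturals

    \* CHOOSE x : P(x) denotes an arbitrary but fixed value satisfying P, if one
    \* exists (Hilbert's epsilon).  Then (\E x : P(x)) <=> P(CHOOSE x : P(x)),
    \* and \A x : P(x) is ~\E x : ~P(x).
    Root == CHOOSE x \in 1..9 : x * x = 9    \* = 3, the only satisfying value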
Define
Definitions with parameters are called 'operators', which are *not* functions.
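A quick illustration (my own) of the operator vs. function distinction:

    EXTENDS Naturals

    DoubleOp(x) == 2 * x                  \* an operator: a parameterized definition,
                                          \* not itself a value of the logic
    DoubleFn    == [x \in Nat |-> 2 * x]  \* a function: a set-theoretic value with a
                                          \* domain (DOMAIN DoubleFn = Nat)

    Six1 == DoubleOp(3)                   \* operator application uses parentheses
    Six2 == DoubleFn[3]                   \* function application uses square brackets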
The 'data logic' of TLA+ is based on ZFC set theory, so the values are sets.
But the logic is first order (though since the values are sets, we are close to the
expressivity of second-order logic).