
Pron Tla

TLA+ is a formal specification and verification language that uses mathematical logic to specify and reason about algorithms and systems. It allows engineers to specify systems at various levels of abstraction and verify their properties. TLA+ specifications describe systems as logical formulas that can be mathematically manipulated. This provides powerful abstraction capabilities without needing special constructs for different system types. TLA+ focuses on practical usability for industry while also providing a universal mathematical foundation for formal reasoning.

Uploaded by

JustA Dummy
Copyright
© All Rights Reserved

Part 1

TLA is a formal (?)


- specification language
- verification language

that helps engineers

- specify (and)
- verify (and so)
- design (and)
- reason about

complex, real-life algorithms and systems

These posts are mostly theory.
They complement a tutorial.
They will clarify concepts that are hard to grasp when learning TLA+.

TLA+ is small and simple but does contain ideas that may take time to understand.
But
- these ideas are not necessary to write good specifications
- these posts don't teach you how to write good specifications
so these posts are neither necessary nor sufficient to put TLA+ to good use, BUT
they will help you *understand* TLA+, specifically
1. why TLA+
2. how its design and formal theory compare to other formal methods
3. how powerful a tool mathematics is to reason about the systems we design and
build

Learning Resources
1. The TLA+ Hyperbook
2. Specifying Systems
3. Online TLA+ Video Course
4. learntla.com (heavy emphasis on PlusCal)
5. Dr. TLA+ Series of Lectures
6. github repo of examples: https://github.com/tlaplus/Examples
7. Papers by Lamport http://lamport.azurewebsites.net/tla/papers.html
8. Papers by Stephan Merz

TLA+ in Practice
1. Use Of Formal Methods at Amazon Web Services
http://lamport.azurewebsites.net/tla/formal-methods-amazon.pdf
2. Why Amazon Chose TLA+
http://link.springer.com/chapter/10.1007/978-3-662-43652-3_3
(i have Chris Newcombe's email with this paper I think)
3. Slides http://tla2012.loria.fr/contributed/newcombe-slides.pdf

What TLA+ *is*


a formal (specification) language with precise syntactic and semantic rules,
but unlike a programming language it focuses on *what* a program does (vs *how* it
does it, as with programming languages)

e.g: a specification says the output of a function should be the input, but sorted
(the what), without specifying which algorithm to use to sort (the how).
Interesting: the what of one layer can be (part of) the how of a higher layer. Thus
calculating a median needs to sort first (still, without specifying how the sort is
done). This notion that the 'how' of one layer is related to the 'what' of another
is given a precise meaning by something known as an abstraction/refinement relation,
which is at the heart of TLA+'s theory.
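
The what/how split above can be sketched in Python (not TLA+; every name here is a hypothetical illustration, not something from the posts). `meets_sort_spec` states only what a correct output is, `insertion_sort` is one possible how, and `median` uses sorting as part of its own how without caring which sort it gets.

```python
from collections import Counter


def meets_sort_spec(inp, out):
    """The 'what': out is an ordered rearrangement of inp."""
    same_elements = Counter(inp) == Counter(out)
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    return same_elements and ordered


def insertion_sort(xs):
    """One possible 'how'; the spec does not care which algorithm is used."""
    result = []
    for x in xs:
        i = len(result)
        while i > 0 and result[i - 1] > x:
            i -= 1
        result.insert(i, x)
    return result


def median(xs, sort=insertion_sort):
    """A higher layer: sorting is part of the 'how' of computing a median."""
    ordered = sort(xs)
    assert meets_sort_spec(xs, ordered)
    return ordered[(len(ordered) - 1) // 2]  # lower median, chosen for simplicity
```

Swapping in a different sort (e.g. `median(xs, sort=sorted)`) changes the how of the lower layer but not the what that `median` relies on.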

There are different kinds of specification languages


1. 'contract' languages - specification languages embedded in mainstream
programming languages as providers of contracts.
e.g: Spec#, Eiffel, clojure.spec.
They describe the behavior of individual program units (functions or classes) and
can verify (the behaviour specified) using tools like
- randomized test generation
- concolic testing
- model checking
- automatic proofs with SMT solvers
- manual proofs with proof assistants etc

but they cannot prove *global* properties of programs - e.g that the
program will eventually respond to every user request, or that the database will
not lose data

2. languages that serve double duty as specification *and* programming
languages.
e.g: Agda, Idris, Isabelle/HOL, Coq
These have a very clear, unique advantage - they allow end to end verification -
the ability to specify and verify the program from the highest level of global
properties, down to machine instructions emitted by the compiler, ensuring that the
executable conforms to the specification.

But these abilities come at a high cost - these languages are extremely complex,
with steep learning curves. Used by specialists/academics and rarely in industry.

3. standalone languages (so not embedded in specific programming languages like 1)
that don't serve as complete programming languages (unlike 2), though they may
enable some code generation.
e.g: Z, VDM, TLA+

TLA+ focuses on
1. industrial use and user friendliness (like Z)
2. being a universal mathematical formalism (like Isabelle and Coq)

Ideology and Aesthetics


There is no perfect formal system. Each is constructed around 'design decisions'
relating to aesthetics/ideology.

TLA+ Ideology/Aesthetics

Lamport worked on formal methods - techniques based on reasoning about precise
properties of systems. Lamport's work on concurrent/distributed algorithms needed
such methods because these algorithms need proofs of correctness (else they manifest
subtle errors).
Lamport focuses on how engineers think of and use formal methods, and his work
is driven by practicality. TLA+ is designed for engineers in industry and algorithm
designers in industry/academia.
TLA+ is not intended as a programming language or as a tool for exploring different
mathematical/logical ideas (?).

Practicality has 3 requirements


- simplicity (and consequent ease of learning and use)
- scalability (application to hardware and software of considerable complexity)
- universality (allow use of the same formalism for different (ideally all?)
algorithms/systems a user encounters. EDIT: I think TLA+ can't deal with
probabilistic algorithms yet, so there is at least one type of algorithm it can't
model)

some bashing of functional programming approaches skipped.

Lamport observes
- programming languages need to be complex (for good reasons)
- specification languages built for reasoning about specifications can be
simpler.
- much to be gained by separating programming and reasoning

the main difference is in reasoning about the program with (/in) a programming
language, or reasoning with math.

Analogy: an electrical engineer building a complex circuit can either
- use equations that describe each component (and so work at the math level, given
that the math is powerful enough to describe the components accurately and can
compose them) - the TLA way
- create a language of components (resistors, capacitors etc) and work at that
level, analogous to reasoning with programming languages
(SA: there really needs to be a concrete example here. Otherwise non-electrical
engineers can only dimly sense the meaning)

Lamport Motivations/Quotes
- I want to verify the algorithms I write
- We are not motivated by abstract elegance, but by the practical problem of
reasoning about real algorithms

'reason precisely' == reason using mathematics, since mathematics is the language
of precision.

LL: "The best way to get better programmers is to teach programmers how to think
better". Use the same 'language of thinking' as every science - mathematics.

TLA+ can represent batch, sequential, concurrent, parallel, distributed,
realtime / discrete or continuous components

Composition of components in TLA+ is not function composition, but just logical
conjunction (i.e. the intersection of all constraints imposed by each component of
the system)
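
A minimal Python sketch of composition as conjunction, under assumptions of my own: a made-up state `(x, y)` and two hypothetical components, each written as a predicate over (state, next state) pairs. The composed system is just the `and` of the two predicates, and the set of steps it allows is the intersection of the steps each component allows.

```python
from itertools import product

STATES = list(product(range(3), range(2)))  # hypothetical state: (x, y)


def ticker(s, t):
    """Component A: constrains only x (x increments modulo 3)."""
    return t[0] == (s[0] + 1) % 3


def toggler(s, t):
    """Component B: constrains only y (y flips between 0 and 1)."""
    return t[1] == 1 - s[1]


def system(s, t):
    """Composition is plain logical conjunction of the component formulas."""
    return ticker(s, t) and toggler(s, t)


def steps(action):
    """All (state, next_state) pairs that an action allows."""
    return {(s, t) for s in STATES for t in STATES if action(s, t)}
```

Here `steps(system) == steps(ticker) & steps(toggler)`: each component leaves the other's variable unconstrained, and the conjunction narrows the allowed steps to those both accept.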

Lamport says his colleagues working with various temporal logics "spent days trying to
specify a simple FIFO Queue arguing about the sufficiency of the specified
properties. I realized that despite the aesthetic appeal, writing a specification
as a conjunction of temporal properties does not work in practice."

So he invented the Temporal Logic of Actions - a logical formalism that tries to
minimise as much as possible the use of temporal reasoning (from which it borrows
some operators) and uses instead a concept called actions.

"TLA gave me, for the first time a formalism in which I was able to write
completely formal proofs without first adding a layer of formal semantics"

TLA+ = a formal language for specification built completely around TLA.

The Feel of TLA

A TLA+ specification describes a system at some chosen level of detail. It can be
- (no more than) a list of global properties
- a high level description of the algorithm
- a code level description of the algorithm
- (even) a description of the CPU's digital circuits as they compute the
algorithm

so in essence any level of abstraction you are interested in (examples for all
these? these claims are fantastic and no doubt true but need *examples*)

Lamport Quote Paraphrase


The best language for writing specifications is mathematics.
Mathematics has the most powerful abstraction mechanism ever invented - the
definition.

With programming languages, one needs different programming constructs for
different classes of system
e.g: message passing primitives for communication systems
clock primitives for realtime systems
Riemann integrals for hybrid systems

With mathematics we can define what we need, with no special purpose
constructs. (In this paper, "Verification And Specification Of Concurrent System"
in Fig 3, Lamport specifies the Riemann integral (!))

TLA+ is not a programming language. It has no notion of stack, heap, I/O, memory,
subroutine. (PlusCal does have processes, stacks and subroutines).
But any kind of hardware or software system and *almost* any kind of algorithm can
be written in it succinctly and elegantly.

By reasoning about systems and algorithms rather than programs, we gain power and
simplicity at the cost of being unable to mechanically translate the specification
into an executable.

A system or an algorithm at any level of abstraction is represented in TLA+ as a
single logical formula.
This logical formula can be mathematically manipulated.

TLA+ gives a precise definition of abstraction.

X => Y == X implements Y == Y abstracts X

Code level specification languages, built on contracts or types (even dependent
types), make a clear distinction between algorithms and algorithm properties. e.g:
Quicksort (algorithm - the body of a subroutine) and algorithm properties (the
subroutine returns a sorted list)
(From the TLA point of view) the algorithm Quicksort and its property (Quicksort
sorts) are just two specifications at different levels of detail.
From this perspective X => Y == Specification X has the property Y (so
Quicksort.tla has the property sorts.tla?)
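
A rough Python sketch of "X => Y means X implements Y", assuming a tiny finite universe of (input, output) pairs in place of TLA+ behaviours. The names `detailed`, `abstract` and `implies` are hypothetical. X pins the output down exactly, Y only says the output is ordered, and implication amounts to "everything X allows, Y allows too".

```python
from itertools import product

VALUES = (0, 1, 2)
LISTS = [tuple(p) for n in range(3) for p in product(VALUES, repeat=n)]
UNIVERSE = [(inp, out) for inp in LISTS for out in LISTS]  # all (input, output) pairs


def detailed(pair):
    """X: a more detailed spec, the output is exactly the sorted input."""
    inp, out = pair
    return out == tuple(sorted(inp))


def abstract(pair):
    """Y: a less detailed spec, the output is in non-decreasing order."""
    _, out = pair
    return all(a <= b for a, b in zip(out, out[1:]))


def implies(x, y):
    """X => Y holds on the universe iff every pair satisfying X satisfies Y."""
    return all(y(p) for p in UNIVERSE if x(p))
```

`implies(detailed, abstract)` holds while `implies(abstract, detailed)` does not (an empty output is ordered but is not the sorted form of a non-empty input), so the relation is one-directional. This only checks a finite universe; the TLA+ statement is about all behaviours.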

TLA can *specify* probabilistic algorithms but does not have the ability to reason
about them. (TODO: Any work on extending TLA? Look for papers which cite Lamport's
papers on Google Scholar)

It *is* easy to specify worst case time or space complexity (how!!! example??)
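
One way this might be done (my assumption, the notes don't spell it out): make the step count an ordinary state variable and state the bound as an invariant over it. A Python sketch with a hypothetical linear search, checked exhaustively on small inputs only:

```python
from itertools import product


def linear_search_states(xs, target):
    """Yield successive states (i, steps, found) of a linear search."""
    i, steps, found = 0, 0, False
    yield i, steps, found
    while i < len(xs) and not found:
        found = xs[i] == target
        i, steps = i + 1, steps + 1
        yield i, steps, found


def within_bound(xs, target):
    """Invariant: the step counter never exceeds the input length."""
    return all(steps <= len(xs) for _, steps, _ in linear_search_states(xs, target))


def check_all(max_len=4, values=(0, 1)):
    """Check the bound on every input up to max_len (a finite instance only)."""
    return all(
        within_bound(xs, t)
        for n in range(max_len + 1)
        for xs in product(values, repeat=n)
        for t in values
    )
```

The worst-case claim "at most len(xs) steps" becomes a plain invariant on the `steps` variable, which is the kind of property a model checker can test on finite instances.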

TLA comes with a model checker (MC) and a proof assistant (PA).
But the model checker is more useful, since it gives you a counterexample that
invalidates the sloppy systems you design, whereas the proof system can only prove
your system correct if it *is* correct (unlikely for initial system designs)

(so one implication is that you use the model checker on your system design till
you get the 'all clear' and then try to prove its correctness?)
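
The counterexample-finding behaviour described above can be sketched as a tiny explicit-state checker in Python. This is not TLC or any real tool. The jug puzzle (a 3-gallon and a 5-gallon jug, with the claimed invariant "the big jug never holds 4") is a toy example I picked, and all function names are hypothetical.

```python
from collections import deque


def next_states(state):
    """All successors of (small, big) for 3- and 5-gallon jugs."""
    small, big = state
    to_big = min(small, 5 - big)      # amount poured small -> big
    to_small = min(big, 3 - small)    # amount poured big -> small
    return {
        (3, big), (small, 5),          # fill either jug
        (0, big), (small, 0),          # empty either jug
        (small - to_big, big + to_big),
        (small + to_small, big - to_small),
    }


def check_invariant(init, next_states, invariant):
    """Breadth-first search; returns None or a shortest trace to a violation."""
    parent = {init: None}
    queue = deque([init])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for succ in next_states(state):
            if succ not in parent:
                parent[succ] = state
                queue.append(succ)
    return None
```

Calling `check_invariant((0, 0), next_states, lambda s: s[1] != 4)` returns a trace from `(0, 0)` to a state where the big jug holds 4, i.e. a counterexample to the false invariant. Checking `lambda s: s[1] <= 5` returns `None`, the 'all clear' for this finite state space.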

TLA+ has the following parts
1. (the core ==) TLA - the Temporal Logic of Actions - analogous to ODEs. ODEs
describe continuous physical systems, TLA describes discrete dynamical systems.
- TLA supports assertional reasoning and has a precise notion of refinement,
which is a precise definition of the abstraction/implementation relationship (e.g:
a parallel mergesort *implements*/refines a mergesort)
- TLA incorporates some Linear Temporal Logic but tries to minimize temporal
reasoning.

2. 'state space' handling. The '+' part of TLA+ uses a formal set theory based
on ZFC, to allow TLA+ variables to take on many kinds of values (numbers, finite
and infinite sequences, sets, records and functions) (SA? so any abstraction that
can be built on sets? that is all of mathematics!)

3. a module system. This allows
a. information hiding
b. elaborate abstraction/implementation relations
c. equivalence relations

4. an IDE (the 'Toolbox')

5. a Model Checker, which verifies properties of algorithms written in a useful
subset of TLA+ on (?) restricted finite instances of your system (?) at the push of
a button.

6. a 'proof system', TLAPS, which is actually a front end for the TLA+ "proof
language" and uses automated solvers and the proof assistant Isabelle as
backends.

Some misconceptions about TLA

1. TLA is useful for verifying only concurrent/distributed systems.
- TLA+ has no special concurrency constructs (e.g message passing) at all.

2. TLA incorrectly treats state as global
- counter example - a vector x_1 through x_i denotes the one dimensional
position of i particles, and thus the 'state' of a system containing i particles,
but the actual particles are not aware of the state of the other particles (though
all are collected together *in the notion of a vector*). In other words TLA allows
us to *denote* the global notion of a state.
- we need to distinguish what is physically realizable from what is denotable.
E.g mathematics can define speeds faster than light, though such a speed is not
realizable in a physical system. Similarly we may want to abstractly specify
distributed transactions *as if* all transactions were simultaneous in time and
instantaneous in duration. We can specify the physically realizable algorithm
separately, and use a refinement relationship between the two specifications.
Restricting *denotation* to what can be physically realized is crippling.

3. TLA is not 'higher order' and cannot specify some algorithms.
(the explanation seems a bit complex as of now, so skipped till later).

Lamport Quote
- people look for a magic (math) bullet = you have a problem, you find some
'right math', and the math solves the problem for you.
- that is not how it works. First you understand something, and then you find
the math to express that understanding. The math does not provide the
understanding.
- many people addressing the same problem I do (== specification and
verification) are looking for new math abstraction. I gave up that approach. I
discovered that for proving the correctness of a concurrent algorithm there is one
basic approach - proving (the maintenance of) an invariant. And you can package
that (invariant maintenance) in many ways but these approaches don't make the proof
(of correctness of a concurrent algorithm) any simpler.
- what I've done is to use a method that goes as directly as possible from the
problems into the math
- and since I describe the algorithm in terms of math, I don't need any
semantics to translate from my description into math.

Being able to think precisely, i.e mathematically, is a prerequisite to using any
kind of formalism, *but that ability improves with actual work*. (KEY).

TLA+ is great for learning to think mathematically (in the context of reasoning
about programs) because it is much simpler than other formalisms invented to reason
about programs.
It uses the simplest possible math and the simplest possible logic, so you can
concentrate on the program.

Part 2. The + in TLA+

We begin to study details of the language, specifically those bits that describe
the static state (?) of a computation, namely the program's data and data operations
(?) that form the primitive building blocks of the computation. We call these bits
the 'data logic'.

TLA+ uses (the logic) TLA to describe computations as discrete dynamical systems.
The 'state space' of TLA formulas is a logical structure that contains the values
that TLA variables can take.

In TLA+ sets form the logical structure.
The elements in that structure are described with Zermelo-Fraenkel set theory and
first order logic - the standard formalization of mathematics.

The above => we use mathematics to describe our software, for now, just the
data structures and basic operations.

'formal' == the system has precise syntax and semantics

The static component (??) of TLA+ is most of the language. From a theoretical
standpoint it is the least interesting/important part of the system. The dynamic
component, TLA, is the interesting part. But the static component is the most
controversial.

The static component of TLA+ is a formal set theory Lamport calls ZFM,
Zermelo-Fraenkel for mathematics, which he explains in the short paper "Types are
not harmless" (read the paper, didn't understand anything)

TLA+ uses formal mathematics to specify software systems.

What is (a) Logic?

TLA+ uses logic to specify both algorithms and data structures.

(a) Logic is a formal system.

Its syntax is built of
- connectives - and, or, not, implies, iff
- variables - x, y, z etc, names we use to refer to objects
- signatures - symbols with specific arity. e.g: 5 (0-ary), unary - (1-ary),
+ (2-ary) etc
- quantifiers - A (for all) and E (exists), which bind variables
- formulas - boolean valued expressions
variables that appear unbound in a formula are 'free variables'
a formula with no free variables is a sentence (or closed formula)

The logic has a 'structure' == 'domain of discourse' == what the logic is talking
about, which assigns a meaning to the (?? a??) signature

Assigning a specific structure to the logic is called an 'interpretation'

The *meaning* of 0-ary symbols like 2 is called a constant (?? a 0-ary symbol is
called a constant?)

The meaning of symbols of higher arity are either relations (e.g <) or a function
(e.g +)

The meaning of a variable is also defined by the structure, but here the order of
the logic matters.

The collection of values defined by the structure is called the 'universe' of the
logic.

In First Order Logic, a variable can refer to any value in the universe of the
logic.

A 'model' is the relationship between syntax and semantics.

A model of a formula is a structure that satisfies it, namely an assignment of
values to the variables of the formula that makes the formula true (truth is a
semantic property). The usual symbol for satisfies is |=.

On the right of |= is a formula. On the left is a structure (?? assignment to
variables) that makes that formula true

e.g: the structure of our logic (aka 'the domain of discourse') can be the set of
integers with multiplication, addition and negation

Let the formula on the right of the expression be x < 2.
Then *a* model (an assignment to the variables of the formula that makes it true)
could be x = -5.

The collection of *all* models of a formula A (iow *all* possible assignments to the
formula's variables that make it true) is the 'formal semantics' of the formula and
is written [[A]].

We can colloquially refer to the semantics of a formula, i.e *all* the models of
the formula, as "the model" of the formula.

Then we say that this model (all the models, the semantics) *specifies* the set of
numbers less than 2.

(iow, the model of a formula - all interpretations (in the L4CS sense) that make it
true - is the same thing as its 'specification')
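
The x < 2 example can be tried out in Python on a finite slice of the integers. The slice is an assumption made only so the set can be enumerated, and `satisfies` and `semantics` are hypothetical helper names standing in for |= and [[A]].

```python
UNIVERSE = range(-5, 6)  # finite stand-in for the integers (an assumption)


def satisfies(x, formula):
    """x |= formula: the assignment x makes the formula true."""
    return formula(x)


def semantics(formula):
    """[[formula]]: the set of all assignments in UNIVERSE that satisfy it."""
    return {x for x in UNIVERSE if satisfies(x, formula)}


def less_than_two(x):
    """The formula x < 2 as a boolean-valued function of its free variable."""
    return x < 2
```

`satisfies(-5, less_than_two)` is the single model from the text, and `semantics(less_than_two)` is the whole set the formula specifies (within the slice).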

A formula that is true under all interpretations is said to be 'valid',
and we write |= A with no structure on the left hand side. The formula TRUE is
satisfied by all models. FALSE has no models at all.

The logical operators interact with the models thus

- The model for A AND B is the intersection of the model of A and the model of B
- The model for A OR B is the union of the models of A and B
- The model for NOT A is the complement of the model of A
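
These three identities can be checked mechanically on a small finite universe. A Python sketch, with two arbitrary example formulas `A` and `B` and a hypothetical helper `models`:

```python
UNIVERSE = set(range(-5, 6))  # finite universe, chosen only for the sketch


def models(formula):
    """The set of values in UNIVERSE that make the formula true."""
    return {x for x in UNIVERSE if formula(x)}


def A(x):
    return x < 2


def B(x):
    return x % 2 == 0


and_models = models(lambda x: A(x) and B(x))
or_models = models(lambda x: A(x) or B(x))
not_models = models(lambda x: not A(x))
```

`and_models` equals `models(A) & models(B)`, `or_models` equals `models(A) | models(B)`, and `not_models` equals `UNIVERSE - models(A)`.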

When we work with a logic we work with a logical theory, a set of formulas (called
axioms) taken to be equivalent to the formula TRUE.
A model of a theory (vs a single formula) is a model that satisfies all the axioms
of the theory. In other words, a theory specifies a model.

Often a logic is not characterized with a model, just with a theory that then
specifies all appropriate structures (note: structure == domain of discourse)
e.g: We adopt the Peano axioms as the axioms of our logical theory.
Then the natural numbers are a model of Peano arithmetic

(the conceptual step being taken here is

- earlier we understood what 'the model' of a *set* of formulas is - the
collection of all interpretations that make *all* the formulas true
- we also defined axioms as a distinguished set of formulas that are
semantically equivalent to TRUE
- from L4CS, a theory is a bunch of independent axioms.
- now we say that 'the model' of a theory characterizes a logic)

a logical theory can have multiple interpretations over different structures.

e.g: the formula X > 2 has a different model if interpreted over the real numbers
vs naturals

( this seems to come down to what set the assignments to X is drawn from. Here the
interpretations/assignments to X can be drawn from the integers, or the real
numbers. In both cases, the 'model' i.e the assignments that satisfy the formula
will be different. Here different sets.)

e.g: Consider the formula E x | (x > 0 AND (A y | (y > 0 => y >= x))). It is true
when interpreted over the integers (quantifier scopes are written out with
parentheses here to avoid ambiguity; essentially, when x = 1, the statement is true
for integers), but false when interpreted over reals (where for any value of x > 0, you
can always find a y between 0 and x, thus falsifying the statement)
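
A Python sketch of the same contrast. Assumptions: a finite slice stands in for the integers, and exact rationals (`fractions.Fraction`) stand in for the reals, which is enough because the counter-witness x/2 already exists among the rationals. The helper names are hypothetical.

```python
from fractions import Fraction


def is_least_positive(x, domain):
    """x > 0 and every positive y in the domain satisfies y >= x."""
    return x > 0 and all(y >= x for y in domain if y > 0)


INTEGERS = range(-10, 11)  # finite slice, enough to exhibit the witness x = 1


def smaller_positive(x):
    """For x > 0, return a value strictly between 0 and x (namely x / 2)."""
    return Fraction(x) / 2
```

Over the integer slice, `is_least_positive(1, INTEGERS)` holds, so the existential is witnessed by x = 1. Over the rationals no candidate survives: for any x > 0, `smaller_positive(x)` is a positive y with y < x.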

A logic also has a 'calculus' which is a *syntactic* system for deriving formulas
from others. E.g "natural deduction" (Is 'calculus' a name for a system of
inference rules?).

If a formula C can be derived in a finite number of applications of inference rules
from formulas A and B, we say that A and B "entail" C.

If a formula is entailed by just the axioms of a theory with no other assumptions,
we say it is a tautology, and write |- A

If |- A and A is not an axiom, we call it a theorem.

In most logics, if A,B |- C, then |- (A AND B) => C

Two important axioms of logic.

1. Principle of Explosion - from a contradiction anything follows: for any
formulas A and B, (A AND (NOT A)) |- B. (Closely related is the law of
non-contradiction, |- NOT (A AND (NOT A)).)
2. Excluded Middle - for any formula A, |- A OR (NOT A)

A logic with both these axioms is a classical or standard logic.
A logic with (1) but not (2) (as part of its axioms) is an intuitionistic logic
(if we know NOT A to be false, we cannot conclude that A is true).
A logic with (2) but not (1) is a paraconsistent logic (if we know A to be true, we
cannot conclude it is not *also* false)

All logic used in TLA+ is classical.

First Order Logic And Other Orders

A First Order Logic has connectives AND OR NOT etc, and quantifiers A, E

The variables in FOL can represent simple(?) values of the universe. If the
universe is natural numbers, then a variable, free or bound, refers to a number.

In Second Order Logic, a variable can refer to a value or a *set of values*, so a
set of natural numbers for example.

In Third Order Logic, a variable can refer to a value, a *set of values*, or a
set of sets of values.

Reading the following material, the key points seem to be

- keeping the logic first order has advantages in terms of simplicity, model
checking, and general theorems (?) (though mathematically, since variables need to
be able to refer to *sets*, it would seem the logic should ideally be 2nd order)
- one solution is to 'provide access to the metalanguage' (??!) and this is
what TLA uses.

A Model Checker is a program that takes a formula (a set of formulas?) and a
structure M (a set of assignments to the variables of the formulas?) and decides
whether M is a model of that formula or not, i.e. whether M makes it true

The CHOOSE operator

CHOOSE x : P(x) denotes *some* value from the domain of discourse of the logic
that satisfies the predicate P.
That said, CHOOSE is not computational: it does not 'select' a value, it only
specifies that x is an arbitrary value from the domain (that satisfies the
predicate). It *describes* a value, not how to find it.
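
A loose Python stand-in, to make the "describes, not finds" point concrete. Assumptions: the domain is finite and its elements are comparable, and `choose` is a hypothetical helper, not TLA+ itself. Picking `min` is just one arbitrary but fixed rule; callers should rely only on the result satisfying the predicate.

```python
def choose(domain, predicate):
    """Stand-in for CHOOSE x : predicate(x), restricted to a finite domain."""
    candidates = [x for x in domain if predicate(x)]
    if not candidates:
        # This sketch simply refuses when nothing satisfies the predicate.
        raise ValueError("no value satisfies the predicate")
    return min(candidates)  # arbitrary but fixed; any deterministic rule would do


def maximum(values):
    """Describe the maximum as 'some x that is >= every element'."""
    return choose(values, lambda x: all(x >= y for y in values))
```

`maximum({3, 1, 2})` is 3 because only one value fits the description. The definition says what the maximum is without giving a search procedure for it.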

Some unclear stuff about how Exists and Forall are defined in terms of CHOOSE. Ok
whatever.

Define

Name == e where e is a TLA+ expression (== is the ASCII form of the 'is defined
as' symbol, typeset as ≜)

e.g: Double(x) == 2 * x ; possible s-exp equivalent (define (Double x) (* 2 x))

Definitions with parameters are called 'operators', which are *not* functions
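
A rough Python analogy for the operator/function distinction. It is only an analogy, since a Python `def` is itself a value, and the names are hypothetical. The operator-like `Double` is a named definition with a parameter and no domain; the function-like `double_fn` is a value (a mapping) with an explicit domain.

```python
def Double(x):
    """Operator-like: a parameterised definition, with no domain of its own."""
    return 2 * x


# Function-like: a value in its own right, with an explicit domain.
double_fn = {x: 2 * x for x in range(5)}

# The domain of the function-like value can be asked for; Double has none.
domain = set(double_fn)
```

`Double(7)` is fine even though 7 is outside `domain`, whereas `double_fn[7]` is simply not defined.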

The values of the logic are all Sets.

The 'data logic' of TLA+ is based on ZFC set theory. So the values are sets.
But the logic is first order. ( but since the values are sets, we are close to the
expressivity of second order logic)
