Abstract Data Types and The Software Crisis - by Eric Elliott - JavaScript Scene - Medium
Eric Elliott
May 5 · 10 min read
Note: This is part of the “Composing Software” series (now a book!) on learning
functional programming and compositional software techniques in JavaScript ES6+ from
the ground up. Stay tuned. There’s a lot more of this to come!
Not to be confused with:
Algebraic Data Types (sometimes abbreviated ADT or AlgDT). Algebraic Data Types refer
to complex types in programming languages (e.g., Rust, Haskell, F#) that display some
properties of specific algebraic structures, e.g., sum types and product types.
Algebraic Structures. Algebraic structures are studied in abstract algebra and, like
ADTs, are commonly specified in terms of algebraic descriptions of axioms, but they
apply far outside the world of computers and code. An algebraic structure can exist
that is impossible to model completely in software. By contrast, Abstract Data Types
serve as a specification and guide to formally verify working software.
An Abstract Data Type (ADT) is an abstract concept defined by axioms that represent
some data and operations on that data. ADTs are not defined in terms of concrete
instances and do not specify the concrete data types, structures, or algorithms used in
implementations. Instead, ADTs define data types only in terms of their operations,
and the axioms to which those operations must adhere.
Common examples of ADTs include:
Stack
Queue
Set
Map
Stream
ADTs can represent any set of operations on any kind of data. In other words, the
exhaustive list of all possible ADTs is infinite for the same reason that the exhaustive
list of all possible English sentences is infinite. ADTs are the abstract concept of a set of
operations over unspecified data, not a specific set of concrete data types. A common
misconception is that the specific examples of ADTs taught in many university courses
and data structure textbooks are what ADTs are. Many such texts label the data
structures “ADTs” and then skip the ADT and describe the data structures in concrete
terms instead, without ever exposing the student to an actual abstract representation
of the data type. Oops!
ADTs can express many useful algebraic structures, including semigroups, monoids,
functors, monads, etc. The Fantasy Land Specification is a useful catalog of algebraic
structures described by ADTs to encourage interoperable implementations in
JavaScript. Library builders can verify their implementations using the supplied
axioms.
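To make that concrete, here is a minimal sketch of checking the semigroup associativity axiom against sample values. The Sum type is a hypothetical example invented for illustration, and the helper names concat and isAssociative are assumptions too. Only the 'fantasy-land/concat' method name comes from the Fantasy Land convention for semigroups. Checking a few samples is a sanity test, not a proof.

```javascript
// Hypothetical Sum type, used only for illustration.
const Sum = value => ({
  value,
  // Fantasy Land names the semigroup operation 'fantasy-land/concat'.
  'fantasy-land/concat': other => Sum(value + other.value)
});

// Small helper so the law below reads like the algebra.
const concat = (a, b) => a['fantasy-land/concat'](b);

// Associativity: concat(concat(a, b), c) is equivalent to concat(a, concat(b, c)).
const isAssociative = (a, b, c) =>
  concat(concat(a, b), c).value === concat(a, concat(b, c)).value;

console.log(isAssociative(Sum(1), Sum(2), Sum(3))); // true
```

Any type that passes the same check for all of its values can be treated as a semigroup, regardless of how it stores its data.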
Why ADTs?
Abstract Data Types are useful because they provide a way for us to formally define
reusable modules in a way that is mathematically sound, precise, and unambiguous.
This allows us to share a common language to refer to an extensive vocabulary of
useful software building blocks: Ideas that are useful to learn and carry with us as we
move between domains, frameworks, and even programming languages.
History of ADTs
In the 1960s and early 1970s, many programmers and computer science researchers
were interested in the software crisis. As Edsger Dijkstra put it in his Turing Award
lecture:
“The major cause of the software crisis is that the machines have become several orders of
magnitude more powerful! To put it quite bluntly: as long as there were no machines,
programming was no problem at all; when we had a few weak computers, programming
became a mild problem, and now we have gigantic computers, programming has become
an equally gigantic problem.”
The problem he refers to is that software is very complicated. A printed listing of the
software for NASA’s Apollo lunar module and guidance system stands about as tall as a
filing cabinet. That’s a lot of code. Imagine trying to read and understand every line of that.
Many software engineers noted that the hardware they built things on top of mostly
worked. But software, more often than not, was complex, tangled, and brittle.
Software was commonly:
Over budget
Late
Buggy
Missing requirements
Difficult to maintain
If only you could think about software in modular pieces, you wouldn’t need to
understand the whole system to understand how to make part of the system work.
That principle of software design is known as locality. To get locality, you need modules
that you can understand in isolation from the rest of the system. You should be able to
describe a module unambiguously without over-specifying its implementation. That’s
the problem that ADTs solve.
From the 1960s almost to the present day, advancing the state of software
modularity has been a core concern. With those problems in mind, people including
Barbara Liskov (the same Liskov referenced in the Liskov Substitution Principle from
the SOLID OO design principles), Alan Kay, Bertrand Meyer, and other legends of
computer science worked on describing and specifying various tools to enable modular
software, including ADTs, object-oriented programming, and design by contract,
respectively.
ADTs emerged from the work of Liskov and her students on the CLU programming
language between 1974 and 1975. They contributed significantly to the state of the art
of software module specification — the language we use to describe the interfaces that
allow software modules to interact. Formally provable interface compliance brings us
significantly closer to software modularity and interoperability.
Liskov was awarded the 2008 Turing Award for her work on data abstraction, fault
tolerance, and distributed computing. ADTs played a significant role in that
accomplishment, and today, virtually every university computer science course
includes ADTs in the curriculum.
The software crisis was never entirely solved, and many of the problems described
above should be familiar to any professional developer, but learning how to use tools
like objects, modules, and ADTs certainly helps.
A good ADT specification should include:
Human-readable description. ADTs can be rather terse if they are not accompanied by
some human-readable description. The natural language description, combined with the
algebraic definitions, can act as checks on each other to clear up any mistakes in the
specification or ambiguity in the reader’s understanding of it.
Definitions. Clearly define any terms used in the specification to avoid any
ambiguity.
Abstract signatures. Describe the expected inputs and outputs without linking
them to concrete types or data structures.
Stack Example
A stack is a Last In, First Out (LIFO) collection of items. Stacks are commonly used in
parsing, sorting, and data collation algorithms.
Definitions
a : Any type
b : Any type
stack(a) : a stack of a
Abstract Signatures
Construction
The stack operation takes any number of items and returns a stack of those items.
Typically, the abstract signature for a constructor is defined in terms of itself. Please
don’t confuse this with a recursive function.
stack(...items) => stack(...items)
Stack Operations
push(item, stack) => stack
pop(stack) => [item, stack]
Axioms
The stack axioms deal primarily with stack and item identity, the sequence of the stack
items, and the behavior of pop when the stack is empty.
Identity
Pushing and popping have no side-effects. If you push to a stack and immediately pop
from the same stack, the stack should be in the state it was before you pushed.
Given: push a to the stack and immediately pop from the stack
Should: return a pair of a and stack()
pop(push(a, stack())) = [a, stack()]
Sequence
Popping from the stack should respect the sequence: Last In, First Out (LIFO).
Given: push a to the stack, then push b to the stack, then pop from the stack
Should: return a pair of b and stack(a)
pop(push(b, push(a, stack()))) = [b, stack(a)]
Empty
Popping from an empty stack results in an undefined item value. In concrete terms, this
could be defined with a Maybe(item), Nothing, or Either. In JavaScript, it’s customary
to use undefined. Popping from an empty stack should not change the stack.
Given: pop from an empty stack
Should: return a pair of undefined and stack()
pop(stack()) = [undefined, stack()]
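As a rough illustration of the Maybe alternative, the sketch below wraps the popped item in an array, where an empty array stands for Nothing and a one-element array stands for a present item. The popMaybe name and the array-backed stack are assumptions made for this example, not part of the specification.

```javascript
// Array-backed stack, assumed here only for illustration.
const stack = (...items) => [...items];
const push = (item, s) => [...s, item];

// Hypothetical variant of pop: the item is wrapped in an array acting as a Maybe.
// [] stands for Nothing, [item] stands for a present item.
const popMaybe = s =>
  s.length === 0
    ? [[], s]
    : [[s[s.length - 1]], s.slice(0, -1)];

const [nothing, stillEmpty] = popMaybe(stack());
console.log(nothing.length);    // 0: no item was available
console.log(stillEmpty.length); // 0: the empty stack is unchanged

const [just] = popMaybe(push(undefined, stack()));
console.log(just.length);       // 1: an item was present, even though it is undefined
```

The trade-off is that callers must unwrap the result, but an empty stack can no longer be mistaken for a stack whose top item happens to be undefined.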
Concrete Implementations
An abstract data type could have many concrete implementations, in different
languages, libraries, frameworks, etc. Here is one implementation of the above stack
ADT, using an encapsulated object, and pure functions over that object:
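A minimal sketch of such an implementation might look like the following. The items are held in a closure, and the toString method is an assumption added so that two stacks can be compared by value. It is not part of the ADT.

```javascript
// The items live in the closure; nothing outside can reach or mutate them.
const stack = (...items) => ({
  // Return a new stack with the item on top.
  push: item => stack(...items, item),
  // Return a pair of [top item, remaining stack]; an empty stack yields undefined.
  pop: () => [items[items.length - 1], stack(...items.slice(0, -1))],
  // Assumed helper for comparing stacks by value.
  toString: () => `stack(${items.join(',')})`
});

// Pure functions over the encapsulated object, matching the abstract signatures.
const push = (item, s) => s.push(item);
const pop = s => s.pop();

const [item, rest] = pop(push('b', push('a', stack())));
console.log(item);            // b
console.log(rest.toString()); // stack(a)
```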
And another that implements the stack operations in terms of pure functions over
JavaScript’s existing Array type:
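One way to write that version is sketched below, with the top of the stack at the end of the array. The sample values a and b are assumptions, and the assert used in the proofs that follow is not defined here. It is assumed to be a test helper that deep-compares actual and expected, in the style of the RITEway testing library.

```javascript
// Constructor: copy the items into a new array.
const stack = (...items) => [...items];

// push returns a new array with the item on top (the end); the input is untouched.
const push = (item, s) => [...s, item];

// pop returns a pair of [top item, remaining stack] without mutating the input.
// For an empty stack, the item is undefined and the remaining stack is a new empty array.
const pop = s => [s[s.length - 1], s.slice(0, -1)];

// Sample values for the proofs (assumed; any values would do).
const a = 'a';
const b = 'b';
```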
// Proofs
assert({
  given: 'push `a` to the stack and immediately pop from the stack',
  should: 'return a pair of `a` and `stack()`',
  actual: pop(push(a, stack())),
  expected: [a, stack()]
});
assert({
  given: 'push `a` to the stack, then push `b` to the stack, then pop from the stack',
  should: 'return a pair of `b` and `stack(a)`',
  actual: pop(push(b, push(a, stack()))),
  expected: [b, stack(a)]
});
assert({
  given: 'pop from an empty stack',
  should: 'return a pair of undefined, stack()',
  actual: pop(stack()),
  expected: [undefined, stack()]
});
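Because the axioms are stated only in terms of stack, push, and pop, the same proofs can be run against any implementation. The sketch below is one possible way to do that without a test library. The deepEqual helper, the checkStackAxioms function, and the linked-list implementation are all hypothetical names introduced for this example.

```javascript
// Minimal structural equality helper (an assumption, standing in for a test library).
const deepEqual = (x, y) => {
  if (Object.is(x, y)) return true;
  if (typeof x !== 'object' || typeof y !== 'object' || x === null || y === null) {
    return false;
  }
  if (Array.isArray(x) !== Array.isArray(y)) return false;
  const keys = Object.keys(x);
  return keys.length === Object.keys(y).length &&
    keys.every(k => k in y && deepEqual(x[k], y[k]));
};

// Check the three stack axioms against any implementation of stack, push, and pop.
const checkStackAxioms = ({ stack, push, pop }, a = 'a', b = 'b') => ({
  identity: deepEqual(pop(push(a, stack())), [a, stack()]),
  sequence: deepEqual(pop(push(b, push(a, stack()))), [b, stack(a)]),
  empty: deepEqual(pop(stack()), [undefined, stack()])
});

// Hypothetical linked-list implementation: null is the empty stack.
const linkedStack = {
  stack: (...items) => items.reduce((s, item) => ({ head: item, tail: s }), null),
  push: (item, s) => ({ head: item, tail: s }),
  pop: s => (s === null ? [undefined, null] : [s.head, s.tail])
};

console.log(checkStackAxioms(linkedStack));
// { identity: true, sequence: true, empty: true }
```

An implementation that fails any of the three checks does not satisfy the stack ADT, no matter what it is called.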
Conclusion
An Abstract Data Type (ADT) is an abstract concept defined by axioms which
represent some data and operations on that data.
Abstract Data Types are focused on what, not how (they’re framed declaratively,
and do not specify algorithms or data structures).
ADTs provide a way for us to formally define reusable modules in a way that is
mathematically sound, precise, and unambiguous.
ADTs emerged from the work of Liskov and students on the CLU programming
language in the 1970s.
Bonus tip: If you’re not sure whether or not you should encapsulate a function, ask
yourself if you would include it in an ADT for your component. Remember, ADTs should be
minimal, so if it’s non-essential, lacks cohesion with the other operations, or its
specification is likely to change, encapsulate it.
Glossary
Axioms are mathematically sound statements which must hold true.
Eric Elliott is the author of the books, “Composing Software” and “Programming
JavaScript Applications”. As co-founder of EricElliottJS.com and DevAnywhere.io, he
teaches developers essential software development skills. He builds and advises
development teams for crypto projects, and has contributed to software experiences for
Adobe Systems, Zumba Fitness, The Wall Street Journal, ESPN, BBC, and top
recording artists including Usher, Frank Ocean, Metallica, and many more.
He enjoys a remote lifestyle with the most beautiful woman in the world.
Thanks to JS_Cheerleader.