Clojure Guides: Data Structures
Intro
Vectors
Intro
Constructing Vectors
Accessing Elements
(Non-)emptiness
Linear-time Operations
Searching
Maps
Intro
Building Maps
Accessing Entries
Lists
Intro
Accessing Elements
Sets
Intro
Constructing sets
Sets as predicates
Relations
Sequences
Intro
Exercise: search for clojure.core functions with a coll argument
Exercise: Convoluted Encoding
Contributors
This work is licensed under a Creative Commons Attribution 3.0 Unported License
(https://creativecommons.org/licenses/by/3.0/) (including images & stylesheets). The source is available
on Github (https://github.com/clojure-doc/clojure-doc.github.io).
Intro
This cookbook covers some common tasks with core Clojure data structures. It assumes that the reader is
familiar with foundational read and write operations (get, conj, assoc, update, etc). For more coverage of
getting started with Clojure data structures, some recommended resources are:
Constructing Vectors
mapv and filterv are eager versions of map and filter that return vectors.
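As a small illustration of the difference, the sketch below compares the eager vector-returning functions with their lazy counterparts. The names doubled and evens are made up for this example.

```clojure
;; mapv and filterv build the vector directly, with no lazy seq in between.
(def doubled (mapv #(* 2 %) [1 2 3]))     ;; => [2 4 6]
(def evens   (filterv even? (range 10)))  ;; => [0 2 4 6 8]

;; The lazy counterparts return seqs, not vectors.
(vector? doubled)                  ;; => true
(vector? (map #(* 2 %) [1 2 3]))   ;; => false
```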
Accessing Elements
Vectors implement the stack protocol, meaning peek and pop return the last element and the vector
without the last element, respectively.
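A minimal sketch of the stack functions on a vector follows. The name stack is only for illustration.

```clojure
(def stack [1 2 3])

(peek stack)    ;; => 3        (the last element)
(pop stack)     ;; => [1 2]    (everything but the last element)
(conj stack 4)  ;; => [1 2 3 4] (conj "pushes" onto the same end)

;; peek on an empty vector returns nil; pop on an empty vector throws.
(peek [])       ;; => nil
```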
(Non-)emptiness
Sometimes it's desirable to treat an empty vector as a falsey value, especially in conditionals. The
not-empty function will return its argument if it's not empty, otherwise it will return nil.
user=> (defn notate-range
         "takes a vector of [start end] (both optional) and returns a range notation"
         [coords]
         (if-let [[start end] (not-empty coords)]
           (str start ".." end)
           "empty range"))
#'user/notate-range
user=> (notate-range [1 5])
"1..5"
user=> (notate-range [7])
"7.."
user=> (notate-range [])
"empty range"
Here, not-empty is used to make the conditional fail for an empty vector: the empty vector itself is
truthy, but not-empty turns it into nil.
Linear-time Operations
Operations that need to search through elements one-by-one or operate on a large number of elements
are less efficient, but still valuable in some contexts.
The replace function has a special case for vectors that returns a vector. It takes a map where the keys
are the values to replace and the values are the replacements.
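For instance, a sketch with made-up keyword data:

```clojure
;; Elements that appear as keys in the map are swapped for the map's values;
;; everything else is left alone.
(replace {:a 1, :b 2} [:a :b :c :a])            ;; => [1 2 :c 1]

;; Given a vector, the result is still a vector.
(vector? (replace {:a 1, :b 2} [:a :b :c :a]))  ;; => true

;; Given any other collection, the result is a lazy sequence.
(replace {:a 1} '(:a :b))                       ;; => (1 :b)
```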
"Deleting" an item from anywhere but the last position is not a 'fast' operation, because conceptually all the
indexes after the deleted item need to be decremented by one. We can use into to combine two sub-
vectors, but the trade-off is that this operation adds the items from the second sub-vector one at a time, so
this function is O(n) in the size of the second sub-vector.
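One way this could look is sketched below. remove-at is a hypothetical helper name, not a clojure.core function, and it assumes i is a valid index into v.

```clojure
(defn remove-at
  "Hypothetical helper: returns v without the element at index i.
  subvec is constant-time; into then conjs the tail on one item at a time."
  [v i]
  (into (subvec v 0 i) (subvec v (inc i))))

(remove-at [:a :b :c :d] 1)  ;; => [:a :c :d]
(remove-at [:a :b :c :d] 3)  ;; => [:a :b :c]
```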
Searching
Checking for the existence of a value (as opposed to an index) requires examining the elements one-by-
one. A point of confusion that sometimes arises is the contains? function, which searches an
associative collection for the existence of a key/index, and is very seldom desirable for vectors.
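The following sketch shows why contains? tends to surprise on vectors: it answers "is there an entry at this index?", not "is this value present?".

```clojure
(def v [:a :b :c])

(contains? v 0)   ;; => true   (index 0 exists)
(contains? v 3)   ;; => false  (there is no index 3)
(contains? v :a)  ;; => false  (:a is a value, not an index)
```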
A commonly-used way to check for an occurrence of a certain value in a collection is the some function
with a set as a predicate. some stops at the first truthy return, so there's no 'wasted' computation.
Note that searching for false or nil in this way won't work. In those cases use the nil? or false?
functions.
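A short sketch of the idiom and its caveat, using made-up data:

```clojure
;; A set used as a function returns the element if present, otherwise nil.
(some #{:c} [:a :b :c])  ;; => :c
(some #{:z} [:a :b :c])  ;; => nil

;; The set "finds" nil, but returns nil, which some treats as "no match".
(some #{nil} [1 nil 2])  ;; => nil

;; Use the dedicated predicates instead.
(some nil? [1 nil 2])    ;; => true
(some false? [1 false])  ;; => true
```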
Java interop can be used to get the first or last index of a specific value.
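Clojure vectors implement java.util.List, so the interop calls presumably meant here are indexOf and lastIndexOf. A sketch (the name letters is illustrative):

```clojure
(def letters [:a :b :c :b])

(.indexOf letters :b)      ;; => 1   (first occurrence)
(.lastIndexOf letters :b)  ;; => 3   (last occurrence)
(.indexOf letters :z)      ;; => -1  (not found)
```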
keep-indexed can be used to find all the indexes of a value (or indexes that match some predicate).
Note that this is actually a function that works on (and returns) a sequence.
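A possible sketch follows. indexes-of is a hypothetical helper name; keep-indexed keeps every non-nil result of the function it is given.

```clojure
(defn indexes-of
  "Hypothetical helper: all indexes at which x occurs in coll."
  [x coll]
  (keep-indexed (fn [i item] (when (= item x) i)) coll))

(indexes-of :b [:a :b :c :b])  ;; => (1 3)

;; The same shape works with any predicate, e.g. indexes of even numbers.
(keep-indexed (fn [i n] (when (even? n) i)) [1 2 4 5 6])  ;; => (1 2 4)
```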
Building Maps
zipmap associates corresponding entries from two seqs into a map. This is used here to assign people to
teams.
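A sketch of what such an assignment might look like is below. The people and the two-team rotation are assumptions for illustration; zipmap stops when the shorter seq runs out, so pairing with an infinite cycle is safe.

```clojure
(def people ["Mike" "Tina" "Alice" "Fred"])

;; Alternate team numbers 1 and 2 across the list of people.
(def assignments (zipmap people (cycle [1 2])))

assignments  ;; => {"Mike" 1, "Tina" 2, "Alice" 1, "Fred" 2}
```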
group-by could be used to build the team rosters, but the result still has the team numbers in the roster.
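To make that concrete, here is a sketch of group-by applied to an assumed person-to-team map. Each group is a vector of whole map entries, so the team number is repeated inside every roster item.

```clojure
(def team-of {"Mike" 1, "Tina" 2, "Alice" 1, "Fred" 2})

;; Group the map's entries by their value (the team number).
(def rosters (group-by val team-of))

;; Entry order within each group follows the map's iteration order.
rosters  ;; => {1 [["Mike" 1] ["Alice" 1]], 2 [["Tina" 2] ["Fred" 2]]}
```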
reduce-kv allows more control over the values, meaning it's possible to go directly to "clean" output (and
maybe even different data structures).
user=> (reduce-kv
         (fn [acc k v] (update acc v (fnil conj #{}) k))
         {} {"Mike" 1, "Tina" 2, "Alice" 1, "Fred" 2})
{1 #{"Alice" "Mike"}, 2 #{"Tina" "Fred"}}
frequencies takes a collection and returns a map from each distinct element of that collection to the
number of times it appears. This is used here for a rudimentary word counter.
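A word counter along those lines might be sketched as follows. word-counts is a hypothetical name, and splitting on the regex \w+ after lower-casing is an assumption about what counts as a word.

```clojure
(require '[clojure.string :as str])

(defn word-counts
  "Hypothetical helper: lower-cases text, splits it into runs of word
  characters, and counts how often each word occurs."
  [text]
  (frequencies (re-seq #"\w+" (str/lower-case text))))

(word-counts "The quick fox and the lazy dog and the cat")
;; => {"the" 3, "quick" 1, "fox" 1, "and" 2, "lazy" 1, "dog" 1, "cat" 1}
```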
clojure.set has a map-invert function that will swap the keys and values. Because map keys are unique,
if the input map contains duplicate values, each such value ends up mapped to only one of its original keys.
user=> (require '[clojure.set :as set])
user=> (let [squares {1 1, 2 4, 3 9, 4 16}
             sqrts (set/map-invert squares)]
         (sqrts 9))
3
Accessing Entries
While select-keys can be used to create a submap, it is sometimes desirable to pull some keys into a
sequence. In this example, juxt is used to make a vector of the :x and :y values of maps for passing
to another function.
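A sketch of the idea is below. The point map and the describe function are made up for illustration; describe stands in for whatever function consumes the [x y] vector.

```clojure
(def point {:x 3 :y 4 :label "corner"})

;; select-keys gives a submap ...
(select-keys point [:x :y])  ;; => {:x 3, :y 4}

;; ... while juxt gives just the values, in the order requested.
((juxt :x :y) point)         ;; => [3 4]

(defn describe
  "Hypothetical consumer of an [x y] vector."
  [[x y]]
  (str "(" x ", " y ")"))

(describe ((juxt :x :y) point))  ;; => "(3, 4)"
```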
Lists
Intro
Lists are primarily used for code in Clojure. While there aren't a ton of functions aimed specifically at lists,
lists do implement the sequence interface, so all sequence functions work on lists without any conversion.
Accessing Elements
Like vectors, lists implement the stack protocol, but unlike vectors, the 'top' of the stack is the front of the
list. So, 'first' on a list returns the same element as peek (the top of the stack), but on a vector 'first' returns
the bottom of the stack (this can be useful for queueing in LIFO vs FIFO order, for example).
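The difference can be seen side by side in this sketch:

```clojure
(def l '(1 2 3))
(def v [1 2 3])

;; On a list, the top of the stack is the front.
(peek l)    ;; => 1
(first l)   ;; => 1
(pop l)     ;; => (2 3)
(conj l 0)  ;; => (0 1 2 3)

;; On a vector, the top of the stack is the end.
(peek v)    ;; => 3
(first v)   ;; => 1
(pop v)     ;; => [1 2]
(conj v 4)  ;; => [1 2 3 4]
```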
Sets
Intro
In addition to the distinctness of sets, they're fast to check membership, so if values are being collected for
the purposes of checking whether they've been seen or not, a set is often a good choice.
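As one possible illustration of the "seen" pattern, the sketch below accumulates values in a set while walking a collection. first-duplicate is a hypothetical helper; note that it cannot distinguish "no duplicate" from a duplicated nil or false.

```clojure
(defn first-duplicate
  "Hypothetical helper: returns the first value in coll that has already
  been seen earlier in coll, or nil if every value is distinct."
  [coll]
  (loop [seen #{}
         xs   (seq coll)]
    (when-let [[x & more] xs]
      (if (contains? seen x)
        x
        (recur (conj seen x) more)))))

(first-duplicate [3 1 4 1 5 9 2 6 5])  ;; => 1
(first-duplicate [1 2 3])              ;; => nil
```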
Constructing sets
While keys on a map returns a sequence, the Java interop call to keySet can be used to get a set of
keys on the map. This set is technically an anonymous instance of Java's AbstractSet.
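A sketch of the two approaches follows. The type hint is only there to avoid reflection, and the (set (keys m)) form is included as an alternative for when a Clojure persistent set is wanted.

```clojure
(def m {:a 1 :b 2})

(keys m)  ;; => (:a :b), a sequence

;; The interop call returns a java.util.Set view of the keys.
(def ks (.keySet ^java.util.Map m))

(instance? java.util.Set ks)  ;; => true
(contains? ks :a)             ;; => true
(set? ks)                     ;; => false (not a Clojure persistent set)

;; For a persistent set, build one from the key sequence.
(set (keys m))                ;; => #{:a :b}
```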
Sets as predicates
Because sets can be used as functions, they can be used as predicates with various higher-order
functions.
; which word(s) can be typed using only the top row of a qwerty keyboard?
user=> (let [top-row (set "qwertyuiop")
             candidates ["poet" "computer" "typewriter" "desk"]]
         (filter #(every? top-row %) candidates))
("poet" "typewriter")
Relations
In addition to the more primitive set functions, the clojure.set namespace contains the fundamental
relational algebra (the underpinnings of SQL) operations. Sets of maps can be treated as relations,
providing the ability to do joins, projections, etc on in-memory data structures.
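As a rough sketch, with made-up members and teams relations, a natural join, a projection, and a selection look like this:

```clojure
(require '[clojure.set :as set])

;; Two relations (sets of maps) that share the :team key.
(def members #{{:name "Mike" :team 1} {:name "Tina" :team 2}})
(def teams   #{{:team 1 :color "red"} {:team 2 :color "blue"}})

;; Natural join on the shared key(s).
(def joined (set/join members teams))
;; => #{{:name "Mike", :team 1, :color "red"}
;;      {:name "Tina", :team 2, :color "blue"}}

;; Projection keeps only the named keys.
(set/project joined [:name :color])
;; => #{{:name "Mike", :color "red"} {:name "Tina", :color "blue"}}

;; Selection keeps only the rows matching a predicate.
(set/select #(= 1 (:team %)) members)
;; => #{{:name "Mike", :team 1}}
```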
; since the arglist is nested, helper function to check for a `coll` arg
user=> (defn has-coll-arg?
         "Arglists are stored as a list of arg vectors, e.g. ([x] [x y]).
         Returns the first `coll` in an arg vector, else nil."
         [arglist]
         (some                 ; some over the arg-vecs
          (fn [arg-vec]
            (some #{'coll} arg-vec)) ; some over the members of each arg-vec
          arglist))
user=> (->> (ns-publics (the-ns 'clojure.core))
            vals
            (map meta)
            (filter (comp has-coll-arg? :arglists))
            (map :name)
            sort)
(->Eduction
 assoc!
 associative?
 ...)
This example, in addition to illustrating some of the runtime introspection capabilities of Clojure, illustrates
a few sequence functions. some is used on both lists and vectors (and uses the 'set as predicate' idiom
from above to search through the vectors). Depending on exactly which version of Clojure is in use, this
code returns a sequence of around 80 functions.
450 - 4 * 5 + 0 = 20 = t
114 - 1 * 1 + 4 = 5 = e
291 - 2 * 9 + 1 = 19 = s
355 - 3 * 5 + 5 = 20 = t
943 - these digits are strictly descending, so this group and everything after it is thrown out, and the
decoded text is 'test'
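The decoding arithmetic in the worked example can be sketched as below. The rules are inferred only from that example (first digit times second digit plus third digit gives a 1-based letter index, and a strictly descending group ends the message); the function names and the input format (a vector of 3-digit strings) are assumptions for illustration.

```clojure
(defn digits
  "Digits of a string such as \"450\" as a seq of numbers, e.g. (4 5 0)."
  [s]
  (map #(- (int %) (int \0)) s))

(defn descending?
  "True when the digits are strictly descending, e.g. (9 4 3)."
  [ds]
  (apply > ds))

(defn decode-group
  "Maps digits (a b c) to the letter at 1-based index a*b + c."
  [[a b c]]
  (char (+ (dec (int \a)) (* a b) c)))

(defn decode
  "Decodes groups until the first strictly descending group."
  [groups]
  (->> groups
       (map digits)
       (take-while (complement descending?))
       (map decode-group)
       (apply str)))

(decode ["450" "114" "291" "355" "943" "123"])  ;; => "test"
```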
An exercise for the reader is to use for to generate the 3-digit sequences whose computed value n
satisfies (<= 1 n 26) and whose digits aren't in strictly descending order, as part of the encoding procedure.
Contributors
@bobisgeek (https://github.com/bobisageek) - original author