Dictionaries and Sets
Any running Python program has many dictionaries active at the same time, even if the
user’s program code doesn’t explicitly use a dictionary.
— A. M. Kuchling
Chapter 18, “Python’s Dictionary Implementation”
The dict type is not only widely used in our programs but also a fundamental part of
the Python implementation. Module namespaces, class and instance attributes, and
function keyword arguments are some of the fundamental constructs where dictionaries
are deployed. The built-in functions live in __builtins__.__dict__.
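A minimal sketch of where these dictionaries surface at runtime. The class `Point` and the function `show_kwargs` are made up for illustration, and the `builtins` module is imported explicitly on the assumption that `__builtins__` may be either a module or a dict depending on how the code is run.

```python
import builtins
import math


class Point:
    """Hypothetical class, used only to inspect attribute storage."""
    scale = 1  # class attribute

    def __init__(self, x, y):
        self.x = x  # instance attributes
        self.y = y


def show_kwargs(**kwargs):
    """Return the type of the object that collects keyword arguments."""
    return type(kwargs)


p = Point(3, 4)

module_ns = vars(math)          # module namespace: a plain dict
instance_ns = vars(p)           # instance attributes: a plain dict
class_ns = vars(Point)          # class attributes: read-only mappingproxy over the class dict
kwargs_type = show_kwargs(a=1)  # keyword arguments are collected in a dict

print(type(module_ns).__name__)
print(instance_ns)
print('scale' in class_ns)
print(kwargs_type.__name__)
print(builtins.__dict__['len'] is len)  # built-in functions live in a dict too
```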
Because of their crucial role, Python dicts are highly optimized. Hash tables are the
engines behind Python’s high-performance dicts.
We also cover sets in this chapter because they are implemented with hash tables as well.
Knowing how a hash table works is key to making the most of dictionaries and sets.
Here is a brief outline of this chapter:
Generic Mapping Types
The collections.abc module provides the Mapping and MutableMapping ABCs to
formalize the interfaces of dict and similar types (in Python 2.6 to 3.2, these classes are
imported from the collections module, and not from collections.abc). See
Figure 3-1.
Figure 3-1. UML class diagram for the MutableMapping and its superclasses from
collections.abc (inheritance arrows point from subclasses to superclasses; names in italic
are abstract classes and abstract methods)
Using isinstance with an ABC such as Mapping is better than checking whether a function
argument is of dict type, because then alternative mapping types can be used.
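The difference between the two checks can be sketched as follows. The sample objects are arbitrary choices for illustration; `types.MappingProxyType` is included because it is a standard library mapping that is not a dict subclass.

```python
from collections import OrderedDict, defaultdict
from collections.abc import Mapping
from types import MappingProxyType

samples = [
    {'a': 1},
    OrderedDict(a=1),
    defaultdict(list, a=[1]),
    MappingProxyType({'a': 1}),  # read-only view, not a dict subclass
]

# The ABC check accepts every mapping in the list.
accepted_by_abc = [isinstance(s, Mapping) for s in samples]

# The concrete-type check rejects the mappingproxy.
accepted_by_dict = [isinstance(s, dict) for s in samples]

print(accepted_by_abc)
print(accepted_by_dict)
```

A function that tests for Mapping therefore keeps working when a caller passes a mapping that does not inherit from dict.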
All mapping types in the standard library use the basic dict in their implementation,
so they share the limitation that the keys must be hashable (the values need not be
hashable, only the keys).
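A short sketch of that limitation, using a made-up `inventory` dict: an unhashable object such as a list is accepted as a value but rejected as a key.

```python
inventory = {}

# A list is fine as a value.
inventory['fruits'] = ['apple', 'pear']

# A list as a key is rejected because lists are not hashable.
try:
    inventory[['apple', 'pear']] = 'fruits'
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

# A tuple of hashable items works as a key.
inventory[('apple', 'pear')] = 'fruits'

print(error_message)
print(len(inventory))
```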
What Is Hashable?
Here is part of the definition of hashable from the Python Glossary:
An object is hashable if it has a hash value which never changes during its lifetime (it
needs a __hash__() method), and can be compared to other objects (it needs an
__eq__() method). Hashable objects which compare equal must have the same hash
value. […]
The atomic immutable types (str, bytes, numeric types) are all hashable. A frozenset
is always hashable, because its elements must be hashable by definition. A tuple is
hashable only if all its items are hashable. See tuples tt, tl, and tf:
>>> tt = (1, 2, (30, 40))
>>> hash(tt)
8027212646858338501
>>> tl = (1, 2, [30, 40])
>>> hash(tl)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
>>> tf = (1, 2, frozenset([30, 40]))
>>> hash(tf)
-4118419923444501110
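User-defined types are hashable by default because their hash value is derived from their
id() and they all compare not equal. If an object implements a custom __eq__ that takes
into account its internal state, it is hashable only if it also implements a __hash__ that
always returns the same value, which in practice requires that the attributes used by
__eq__ and __hash__ never change.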
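The following sketch illustrates these rules with three hypothetical classes: `Plain`, `EqOnly`, and `EqAndHash`. It assumes Python 3, where defining __eq__ without __hash__ sets __hash__ to None.

```python
class Plain:
    """No __eq__ or __hash__: inherits identity-based versions from object."""


class EqOnly:
    """Defines __eq__ only, so Python 3 sets __hash__ to None."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, EqOnly) and self.value == other.value


class EqAndHash:
    """Defines both methods, based on a value treated as read-only."""

    def __init__(self, value):
        self._value = value

    def __eq__(self, other):
        return isinstance(other, EqAndHash) and self._value == other._value

    def __hash__(self):
        return hash(self._value)


a, b = Plain(), Plain()
plain_equal = (a == b)            # distinct instances compare not equal
plain_in_set = len({a, b})        # both are usable as set elements

try:
    hash(EqOnly(1))
    eq_only_hashable = True
except TypeError:
    eq_only_hashable = False      # unhashable: __hash__ is None

# Equal objects have equal hashes, so the set keeps only one of them.
deduplicated = len({EqAndHash(1), EqAndHash(1)})
```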
Given these ground rules, you can build dictionaries in several ways. The Built-in
Types page in the Library Reference has this example to show the various means of
building a dict:
>>> a = dict(one=1, two=2, three=3)
>>> b = {'one': 1, 'two': 2, 'three': 3}
>>> c = dict(zip(['one', 'two', 'three'], [1, 2, 3]))
>>> d = dict([('two', 2), ('one', 1), ('three', 3)])
>>> e = dict({'three': 3, 'one': 1, 'two': 2})
>>> a == b == c == d == e
True
In addition to the literal syntax and the flexible dict constructor, we can use dict
comprehensions to build dictionaries. See the next section.
dict Comprehensions
Since Python 2.7, the syntax of listcomps and genexps has been extended to dict
comprehensions (and set comprehensions as well, which we’ll soon visit). A dictcomp builds
a dict instance by producing key:value pairs from any iterable. Example 3-1 shows the use
of dict comprehensions to build two dictionaries from the same list of tuples.
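As a minimal sketch of the idea, two dictionaries can be built from one list of tuples like this. The list `CALLING_CODES` and its contents are made-up sample data, not necessarily the data used in Example 3-1.

```python
# Made-up sample data: (calling code, country) pairs.
CALLING_CODES = [
    (86, 'China'),
    (91, 'India'),
    (1, 'United States'),
    (55, 'Brazil'),
    (81, 'Japan'),
]

# Swap each pair: the country becomes the key, the code the value.
country_code = {country: code for code, country in CALLING_CODES}

# Swap again, upper-casing the names and keeping only codes below 66.
code_upper = {code: country.upper()
              for country, code in country_code.items()
              if code < 66}

print(country_code)
print(code_upper)
```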
If you’re used to listcomps, dictcomps are a natural next step. If you aren’t, the spread of
the listcomp syntax means it’s now more profitable than ever to become fluent in it.
We now move to a panoramic view of the API for mappings.
Table 3-1. Methods of the mapping types dict, collections.defaultdict, and
collections.OrderedDict (common object methods omitted for brevity); optional arguments
are enclosed in […]
Method                      dict  defaultdict  OrderedDict  Description
d.clear()                   ●     ●            ●            Remove all items
d.__contains__(k)           ●     ●            ●            k in d
d.copy()                    ●     ●            ●            Shallow copy
d.__copy__()                      ●                         Support for copy.copy
d.default_factory                 ●                         Callable invoked by __missing__ to set missing values [a]
d.__delitem__(k)            ●     ●            ●            del d[k]: remove item with key k
d.fromkeys(it, [initial])   ●     ●            ●            New mapping from keys in iterable, with optional initial value (defaults to None)
d.get(k, [default])         ●     ●            ●            Get item with key k, return default or None if missing
d.__getitem__(k)            ●     ●            ●            d[k]: get item with key k
d.items()                   ●     ●            ●            Get view over items, that is, (key, value) pairs
d.__iter__()                ●     ●            ●            Get iterator over keys
d.keys()                    ●     ●            ●            Get view over keys
d.__len__()                 ●     ●            ●            len(d): number of items
d.__missing__(k)                  ●                         Called when __getitem__ cannot find the key
d.move_to_end(k, [last])                       ●            Move k to first or last position (last is True by default)
d.pop(k, [default])         ●     ●            ●            Remove and return value at k, or default or None if missing
d.popitem()                 ●     ●            ●            Remove and return an arbitrary (key, value) item [b]
d.__reversed__()                               ●            Get iterator for keys from last to first inserted
d.setdefault(k, [default])  ●     ●            ●            If k in d, return d[k]; else set d[k] = default and return it
d.__setitem__(k, v)         ●     ●            ●            d[k] = v: put v at k
d.update(m, [**kargs])      ●     ●            ●            Update d with items from mapping or iterable of (key, value) pairs
d.values()                  ●     ●            ●            Get view over values
a default_factory is not a method, but a callable instance attribute set by the end user when defaultdict is instantiated.
b OrderedDict.popitem() removes the last item inserted by default (LIFO); an optional last argument, if set to False, pops the first item inserted (FIFO).
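A brief sketch exercising a few of the methods in Table 3-1. All variable names and sample values are arbitrary, chosen only for illustration.

```python
from collections import OrderedDict, defaultdict

# setdefault: fetch the value for a key, inserting a default first if needed.
index = {}
index.setdefault('a', []).append(1)
index.setdefault('a', []).append(2)

# get and pop accept a default that is returned when the key is missing.
missing_get = index.get('z', 0)
missing_pop = index.pop('z', None)

# update accepts a mapping, or an iterable of pairs, plus keyword arguments.
d = {'one': 1}
d.update({'two': 2})
d.update([('three', 3)], four=4)

# defaultdict: default_factory is called via __missing__ on d[k] lookups only.
dd = defaultdict(list)
dd['x'].append(10)      # dd['x'] is created as [] before the append
absent = dd.get('y')    # get() does not trigger default_factory

# OrderedDict: move_to_end and popitem operate on insertion order.
od = OrderedDict.fromkeys('abc')
od.move_to_end('a')                     # order is now b, c, a
od.move_to_end('c', last=False)         # order is now c, b, a
last_key, _ = od.popitem()              # removes the last item, 'a'
first_key, _ = od.popitem(last=False)   # removes the first item, 'c'
```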