Data Collection Zelle - Open Michigan

Collections allow storing multiple values in a single variable. There are two main collection types in Python - lists and dictionaries. Lists store elements in order, while dictionaries store elements as key-value pairs where each key is unique. Python provides built-in functions and methods to manipulate collections, such as appending/removing elements, looping through elements, and more. Collections are an example of object-oriented programming in Python since they combine data (elements) with capabilities (methods).

Data Collections

Zelle - Chapter 11
Charles Severance - www.dr-chuck.com

Textbook: Python Programming: An Introduction to Computer Science, John Zelle


What is not a “Collection”
• Most of our variables have one value in them - when we put a new
value in the variable, the old value is overwritten

$ python
Python 2.5.2 (r252:60911, Feb 22 2008, 07:57:53)
[GCC 4.0.1 (Apple Computer, Inc. build 5363)] on darwin
>>> x = 2
>>> x = 4
>>> print x
4
What is a Collection?

• A collection is nice because we can put more than one value in it
and carry them all around in one convenient package.

• We have a bunch of values in a single “variable”

• We do this by having more than one place “in” the variable.

• We have ways of finding the different places in the variable

(Luggage) CC: BY-SA: xajondee (Flickr) http://creativecommons.org/licenses/by-sa/2.0/deed.en


A Story of Two Collections..
• List

• A linear collection of values that stay in order

• Dictionary

• A “bag” of values, each with its own label


(Pringle's Can) CC:BY-NC Roadsidepictures (flickr) http://creativecommons.org/licenses/by-nc/2.0/deed.en
(Pringles) CC:BY-NC Cartel82 (flickr) http://creativecommons.org/licenses/by-nc/2.0/deed.en
(Chips) CC:BY-NC-SA Bunchofpants (flickr) http://creativecommons.org/licenses/by-nc-sa/2.0/deed.en
(Bag) CC:BY-NC-SA Monkeyc.net (flickr) http://creativecommons.org/licenses/by-nc-sa/2.0/deed.en
The Python List Object

(Pringle's Can) CC:BY-NC Roadsidepictures (flickr) http://creativecommons.org/licenses/by-nc/2.0/deed.en
(Pringles) CC:BY-NC Cartel82 (flickr) http://creativecommons.org/licenses/by-nc/2.0/deed.en
>>> grades = list()          # The grades variable will have a list of values.
>>> grades.append(100)       # Append some values to the list.
>>> grades.append(97)
>>> grades.append(100)
>>> print sum(grades)        # Add up the values in the list using the sum() function.
297
>>> print grades             # What is in the list?
[100, 97, 100]
>>> print sum(grades)/3.0    # Figure the average...
99.0
>>>
>>> print grades             # What is in grades?
[100, 97, 100]
>>> newgr = list(grades)     # Make a copy of the entire grades list.
>>> print newgr
[100, 97, 100]
>>> newgr[1] = 85            # Change the second new grade (starts at [0])
>>> print newgr
[100, 85, 100]
>>> print grades             # The original grades are unchanged.
[100, 97, 100]
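
The difference between copying a list with list() and simply giving the same list a second name is easy to miss. The following is a minimal sketch, written in Python 3 syntax (print is a function there, unlike in the Python 2 sessions on these slides); the name alias is hypothetical and not part of the original slides.

```python
grades = [100, 97, 100]

alias = grades          # a second name for the SAME list object
newgr = list(grades)    # a new list that holds the same values

newgr[1] = 85           # changes only the copy
alias[0] = 0            # changes the one list that both names refer to

print(grades)           # the change made through alias is visible here
print(newgr)            # the copy kept its own values
```

Plain assignment does not copy anything, which is why the slide uses list(grades) when an independent copy is wanted.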
Looking in Lists...
• We use square brackets to look up which element in the list we
are interested in.
• grades[2] translates to “grades sub 2”
• Kind of like x with a subscript 2 in math

>>> print grades
[100, 97, 100]
>>> print grades[0]
100
>>> print grades[1]
97
>>> print grades[2]
100
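
Because positions start at zero, the last valid position in a list is one less than its length. A small sketch of that rule, in Python 3 syntax; the variable names first, last and out_of_range are chosen only for illustration.

```python
grades = [100, 97, 100]

first = grades[0]                 # positions start at 0
last = grades[len(grades) - 1]    # the last valid position is len - 1

try:
    grades[3]                     # there is no element at position 3
    out_of_range = False
except IndexError:
    out_of_range = True           # Python reports the bad position as an error

print(first, last, out_of_range)
```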
Why lists start at zero?
• Initially it does not make sense that
the first element of a list is stored at
the zeroth position

• grades[0]

• Math Convention - Number line

• Computer performance - don’t have
to subtract 1 in the computer all
the time

Elevators in Europe!
(elevator) CC:BY marstheinfomage (flickr)
http://creativecommons.org/licenses/by-nc/2.0/deed.en
Fun With Lists

• Python has many features that allow us to do things to an entire list in
a single statement

• Lists are powerful objects

>>> lst = [ 21, 14, 4, 3, 12, 18]
>>> print lst
[21, 14, 4, 3, 12, 18]
>>> print 18 in lst
True
>>> print 24 in lst
False
>>> lst.append(50)
>>> print lst
[21, 14, 4, 3, 12, 18, 50]
>>> lst.remove(4)
>>> print lst
[21, 14, 3, 12, 18, 50]
>>> print lst.index(18)
4
>>> lst.reverse()
>>> print lst
[50, 18, 12, 3, 14, 21]
>>> lst.sort()
>>> print lst
[3, 12, 14, 18, 21, 50]
>>> del lst[2]
>>> print lst
[3, 12, 18, 21, 50]
z-343
More functions for lists
>>> a = [ 1, 2, 3 ]
>>> print max(a)
3
>>> print min(a)
1
>>> print len(a)
3
>>> print sum(a)
6
>>>

http://docs.python.org/lib/built-in-funcs.html
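
These built-in functions combine naturally. As a sketch, the average computed earlier with a hard-coded 3.0 can instead use len(), so it works for a list of any size. The function name average is hypothetical, and the code is written in Python 3 syntax.

```python
def average(values):
    """Return the mean of a non-empty list of numbers."""
    return sum(values) / float(len(values))

a = [1, 2, 3]
grades = [100, 97, 100]

print(average(a))
print(average(grades))
```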
Looping through Lists
>>> print lst
[3, 12, 14, 18, 21, 33]
>>> for xval in lst:
...     print xval
...
3
12
14
18
21
33
>>>
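
A loop like this is also how a running total can be built by hand. The sketch below, in Python 3 syntax, adds up the same values one at a time; the variable total is introduced only for illustration, and the built-in sum() gives the same answer.

```python
lst = [3, 12, 14, 18, 21, 33]

total = 0
for xval in lst:
    total = total + xval    # add each element to the running total

print(total)
```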
z-343
List
Operations

z-343
Quick Peek: Object Oriented

<nerd-alert>
What “is” a List Anyways?
• A list is a special kind of variable
• Regular variables - integer
• Contain some data
• Smart variables - string, list
• Contain some data and capabilities

>>> i = 2
>>> i = i + 1
>>> x = [1, 2, 3]
>>> print x
[1, 2, 3]
>>> x.reverse()
>>> print x
[3, 2, 1]

When we combine data + capabilities - we call this an “object”


One way to find out Capabilities

Buy a book and read it and carry it around with you.


Let's Ask Python...
• The dir() command lists capabilities
• Ignore the ones with underscores - these are used by Python itself
• The rest are real operations that the object can perform
• It is like type() - it tells us something *about* a variable

>>> x = list()
>>> type(x)
<type 'list'>
>>> dir(x)
['__add__', '__class__', '__contains__',
'__delattr__', '__delitem__',
'__delslice__', '__doc__',
'__eq__', ... '__setitem__', '__setslice__',
'__str__', 'append', 'count', 'extend',
'index', 'insert',
'pop', 'remove', 'reverse', 'sort']
>>>
Try dir() with a String
>>> y = "Hello there"
>>> dir(y)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__',
'__eq__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__',
'__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__',
'__lt__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__str__',
'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs',
'find', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle',
'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind',
'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith',
'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
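
The advice to ignore the names with underscores can be applied automatically. The helper below is a hypothetical sketch (the name capabilities is not from the slides), written in Python 3 syntax. It keeps only the names from dir() that do not start with an underscore. The exact list printed depends on the Python version.

```python
def capabilities(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]

list_methods = capabilities(list())
string_methods = capabilities("Hello there")

print(list_methods)
print(string_methods)
```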
What does x = list() mean?
• These are called “constructors” - they make an empty list, str, or
dictionary
• We can make a “fully formed empty” object and then add data to it using
capabilities (aka methods)

>>> a = list()
>>> print a
[]
>>> print type(a)
<type 'list'>
>>> b = dict()
>>> print b
{}
>>> print type(b)
<type 'dict'>
>>> a.append("fred")
>>> print a
['fred']
>>> c = str()
>>> d = int()
>>> print d
0
Object Oriented Summary

• Variables (Objects) contain data and capabilities

• The dir() function asks Python to list capabilities

• We call object capabilities “methods”

• We can construct fresh, empty objects using constructors like list()

• Everything in Python (even a constant) is an object


(Chips) CC:BY-NC-SA Bunchofpants (flickr)
http://creativecommons.org/licenses/by-nc-sa/2.0/deed.en
(Bag) CC:BY-NC-SA Monkeyc.net (flickr)
http://creativecommons.org/licenses/by-nc-sa/2.0/deed.en

Python
Dictionaries
(Figure: items labeled tissue, calculator, perfume, money, candy)

http://en.wikipedia.org/wiki/Associative_array
Dictionaries
• Dictionaries are Python’s most powerful data collection

• Dictionaries allow us to do fast database-like operations in Python

• Dictionaries have different names in different languages

• Associative Arrays - Perl / PHP

• Properties or Map or HashMap - Java

• Property Bag - C# / .Net


http://en.wikipedia.org/wiki/Associative_array
(Bag) CC:BY-NC-SA Monkeyc.net (flickr)
http://creativecommons.org/licenses/by-nc-sa/2.0/deed.en
Dictionaries
• Lists label their entries based on the position in the list
• Dictionaries are like bags - no order
• So we mark the things we put in the dictionary with a “tag”

>>> purse = dict()
>>> purse['money'] = 12
>>> purse['candy'] = 3
>>> purse['tissues'] = 75
>>> print purse
{'money': 12, 'tissues': 75, 'candy': 3}
>>> print purse['candy']
3
>>> purse['candy'] = purse['candy'] + 2
>>> print purse
{'money': 12, 'tissues': 75, 'candy': 5}
>>> purse = dict()
>>> purse['money'] = 12
>>> purse['candy'] = 3
>>> purse['tissues'] = 75
>>> print purse
{'money': 12, 'tissues': 75, 'candy': 3}
>>> print purse['candy']
3
>>> purse['candy'] = purse['candy'] + 2
>>> print purse
{'money': 12, 'tissues': 75, 'candy': 5}

(Diagram: the purse holds money 12, candy 3 and tissues 75; after the update candy is 5)
(Purse) CC:BY Monkeyc.net Stimpson/monstershaq2000's photostream (flickr)
http://creativecommons.org/licenses/by/2.0/deed.en
Lookup in Lists and Dictionaries
• Dictionaries are like Lists except that they use keys instead of
numbers to look up values

>>> lst = list()
>>> lst.append(21)
>>> lst.append(183)
>>> print lst
[21, 183]
>>> lst[0] = 23
>>> print lst
[23, 183]

>>> ddd = dict()
>>> ddd["age"] = 21
>>> ddd["course"] = 182
>>> print ddd
{'course': 182, 'age': 21}
>>> ddd["age"] = 23
>>> print ddd
{'course': 182, 'age': 23}
>>> lst = list()
>>> lst.append(21)
>>> lst.append(183)
>>> print lst
[21, 183]
>>> lst[0] = 23
>>> print lst
[23, 183]

List (lst)
Key   Value
[0]   21, then 23
[1]   183

>>> ddd = dict()
>>> ddd["age"] = 21
>>> ddd["course"] = 182
>>> print ddd
{'course': 182, 'age': 21}
>>> ddd["age"] = 23
>>> print ddd
{'course': 182, 'age': 23}

Dictionary (ddd)
Key        Value
[course]   182
[age]      21, then 23
Dictionary Operations

z-369
Dictionary Literals (Constants)
• Dictionary literals use curly braces and have a list of key : value pairs

• You can make an empty dictionary using empty curly braces

>>> jjj = { 'chuck' : 1 , 'fred' : 42, 'jan': 100}


>>> print jjj
{'jan': 100, 'chuck': 1, 'fred': 42}
>>> ooo = { }
>>> print ooo
{}
>>>
Dictionary Patterns
• One common use of dictionaries is
counting how often we “see” something

>>> ccc = dict()


>>> ccc["csev"] = 1
>>> ccc["cwen"] = 1
>>> print ccc
{'csev': 1, 'cwen': 1}
>>> ccc["cwen"] = ccc["cwen"] + 1
>>> print ccc
{'csev': 1, 'cwen': 2}
Dictionary Patterns
• It is an error to reference a key which is not in the dictionary

• We can use the in operator to see if a key is in the dictionary

>>> ccc = dict()


>>> print ccc["csev"]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'csev'
>>> print "csev" in ccc
False
ccc = dict()
if "csev" in ccc:
    print "Yes"
else:
    print "No"     # the dictionary is still empty, so this prints No

ccc["csev"] = 20

if "csev" in ccc:
    print "Yes"    # the key now exists, so this prints Yes
else:
    print "No"
Dictionary Counting
• Since it is an error to reference a key which is not in the dictionary
• We can use the dictionary get() operation and supply a default value if
the key does not exist to avoid the error and get our count started.

dict.get(key, defaultvalue)

>>> ccc = dict()
>>> print ccc.get("csev", 0)
0
>>> ccc["csev"] = ccc.get("csev",0) + 1
>>> print ccc
{'csev': 1}
>>> print ccc.get("csev", 0)
1
>>> ccc["csev"] = ccc.get("csev",0) + 1
>>> print ccc
{'csev': 2}
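
Put inside a loop, the get() pattern counts how often each item appears in a list. The following is a minimal sketch in Python 3 syntax; the list of names is made-up sample data, and the variable names are chosen only for illustration.

```python
names = ["csev", "cwen", "csev", "zqian", "cwen", "csev"]  # made-up sample data

counts = dict()
for name in names:
    # get() returns 0 the first time a name is seen, so no KeyError occurs
    counts[name] = counts.get(name, 0) + 1

print(counts)
```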
What get() effectively does...
• The get() method basically does an implicit if checking to see if the
key exists in the dictionary and if the key is not there - return the
default value
• The main purpose of get() is to save typing this four line pattern over
and over

d = dict()
x = d.get("fred", 0)

d = dict()
if "fred" in d:
    x = d["fred"]
else:
    x = 0
Retrieving lists of Keys and Values
• You can get a list of keys, values or items (both) from a dictionary

>>> jjj = { 'chuck' : 1 , 'fred' : 42, 'jan': 100}


>>> print jjj.keys()
['jan', 'chuck', 'fred']
>>> print jjj.values()
[100, 1, 42]
>>> print jjj.items()
[('jan', 100), ('chuck', 1), ('fred', 42)]
>>>
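
The session above is from Python 2, where keys(), values() and items() return lists. In Python 3 they return view objects instead, so a sketch of the same idea wraps each call in list(). The results are sorted here only so that the printed order does not depend on the Python version.

```python
jjj = {'chuck': 1, 'fred': 42, 'jan': 100}

keys = list(jjj.keys())       # a real list of the keys
values = list(jjj.values())   # a real list of the values
items = list(jjj.items())     # a list of (key, value) tuples

print(sorted(keys))
print(sorted(values))
print(sorted(items))
```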
Looping Through Dictionaries
• We loop through the key-value pairs in a dictionary using *two*
iteration variables
• Each iteration, the first variable is the key and the second variable is
the corresponding value

>>> jjj = { 'chuck' : 1 , 'fred' : 42, 'jan': 100}
>>> for aaa,bbb in jjj.items() :
...     print aaa, bbb
...
jan 100
chuck 1
fred 42
>>>

aaa       bbb
[jan]     100
[chuck]   1
[fred]    42
Dictionary Maximum Loop
$ cat dictmax.py
jjj = { 'chuck' : 1 , 'fred' : 42, 'jan': 100}
print jjj

maxcount = None
for person, count in jjj.items() :
    if maxcount is None or count > maxcount :
        maxcount = count
        maxperson = person

print maxperson, maxcount

$ python dictmax.py
{'jan': 100, 'chuck': 1, 'fred': 42}
jan 100

None is a special value in Python. It is like the “absence” of a value. Like
“nothing” or “empty”.
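
The same loop can be wrapped in a function so it can be reused. This is a sketch in Python 3 syntax; the function name largest_entry is hypothetical. Unlike the script, it also sets maxperson to None before the loop, so an empty dictionary gives a result instead of an undefined variable.

```python
def largest_entry(counts):
    """Return (key, value) for the largest value, or (None, None) if empty."""
    maxperson = None
    maxcount = None
    for person, count in counts.items():
        # The first pair always wins; later pairs win only if they are larger
        if maxcount is None or count > maxcount:
            maxcount = count
            maxperson = person
    return maxperson, maxcount

jjj = {'chuck': 1, 'fred': 42, 'jan': 100}
print(largest_entry(jjj))
print(largest_entry({}))
```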
Dictionaries are not Ordered

• Dictionaries use a Computer Science technique called “hashing” to
make them very fast and efficient

• However hashing makes it so that dictionaries are not sorted and they
are not sortable (this describes the Python 2.5 used in these slides;
since Python 3.7 a dictionary keeps its keys in insertion order, but it
still has no sort() method)

• Lists and sequences maintain their order and a list can be sorted - but
not a dictionary

http://en.wikipedia.org/wiki/Hash_function
Dictionaries are not Ordered
Dictionaries have no order and cannot be sorted. Lists have order and can
be sorted.

>>> dct = { "a" : 123, "b" : 400, "c" : 50 }
>>> print dct
{'a': 123, 'c': 50, 'b': 400}

>>> lst = list()
>>> lst.append("one")
>>> lst.append("and")
>>> lst.append("two")
>>> print lst
['one', 'and', 'two']
>>> lst.sort()
>>> print lst
['and', 'one', 'two']
>>>

http://en.wikipedia.org/wiki/Hash_function
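
Although a dictionary itself cannot be sorted, its items can be copied into a list, and that list can be sorted. The following is a sketch using the built-in sorted() function, in Python 3 syntax; the helper second and the variable names are chosen only for illustration.

```python
counts = {"a": 123, "b": 400, "c": 50}

# sorted() builds a new list; the dictionary itself is left unchanged
by_key = sorted(counts.items())

def second(pair):
    """Return the value part of a (key, value) pair."""
    return pair[1]

# Largest value first
by_value = sorted(counts.items(), key=second, reverse=True)

print(by_key)
print(by_value)
```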
Summary: Two Collections
• List

• A linear collection of values that stay in order

• Dictionary

• A “bag” of values, each with its own label / tag


(Pringle's Can) CC:BY-NC Roadsidepictures (flickr) http://creativecommons.org/licenses/by-nc/2.0/deed.en
(Bag) CC:BY-NC-SA Monkeyc.net (flickr) http://creativecommons.org/licenses/by-nc-sa/2.0/deed.en
What do we use these for?

• Lists - Like a Spreadsheet - with columns of stuff to be summed,
sorted - Also when pulling strings apart - like string.split()

• Dictionaries - For keeping track of (keyword,value) pairs in memory
with very fast lookup. It is like a small in-memory database. Also used
to communicate with databases and web content.
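
As a sketch of the list use case, a line of text can be pulled apart with split() and the resulting list converted, sorted and summed like a spreadsheet column. The sample line is made up, and the code is written in Python 3 syntax.

```python
line = "100 97 100 85"               # made-up sample line of grades

pieces = line.split()                # a list of strings: ['100', '97', ...]
grades = [int(p) for p in pieces]    # convert each string to an integer
grades.sort()                        # lists can be sorted in place

print(pieces)
print(grades, sum(grades))
```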
