Real Python Part 1
Fletcher Heisler
Contents
Introduction
License
Conventions
Errata
2 Getting Started
Download Python
Open IDLE
Screw things up
Store a variable
Do futuristic arithmetic
Run in circles
Compare values
17 Web applications
18 Final Thoughts
Windows 7
Mac OS X
Linux
20 Acknowledgements
List of Figures
15.1 line_1_to_5
15.2 line_0_to_4
15.3 styled_line
15.4 two_lines_runoff.png
15.5 two_lines_shifted
15.6 two_lines_legend
15.7 basic_bars
15.8 bar_chart_spread
15.9 double_bars_fat
15.10 double_bars_spread
15.11 double_bars_labeled
15.12 histogram
15.13 histogram_labeled
15.14 histogram_with_mu
16.1 message_box_mac
16.2 message_box_ubuntu
16.3 open_dialog
16.4 framed
16.5 multicolored_frames
16.6 tk_positioning
16.7 tk_frame_position
16.8 tk_columns
16.9 tk_multirows
16.10 button
16.11 entries
16.12 temperature_app
16.13 choose_a_file
16.14 file_saver
Chapter 1
Introduction
Whether you're new to programming or a professional code monkey looking to dive into a new language, this book will teach you all of the practical Python that you need to get started on projects on
your own.
Real Python emphasizes real-world programming techniques, which are illustrated through interesting, useful examples. No matter what your ultimate goals may be, if you work with a computer at all,
you will soon be finding endless ways to improve your life by automating tasks and solving problems
through Python programs that you create.
Python is open-source freeware, meaning you can download it for free and use it for any purpose. It
also has a great support community that has built a number of additional free tools. Need to work
with PDF documents in Python? There's a free package for that. Want to collect data from webpages?
No need to start from scratch!
Python was built to be easier to use than other programming languages. It's usually much easier to
read Python code and MUCH faster to write code in Python than in other languages.
For instance, here's some simple code written in C, another commonly used programming language:
#include <stdio.h>

int main ()
reddit, Spotify, turntable.fm, Yahoo! Groups, and the list goes on. And if it's powerful enough for
both NASA and the NSA, it's good enough for us.
License
This e-book is copyrighted and licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. This means that you are welcome to share this book and use it for
any non-commercial purposes so long as the entire book remains intact and unaltered. That being
said, if you have received this copy for free and have found it helpful, I would very much appreciate
it if you purchased a copy of your own.
The example Python scripts associated with this book should be considered open content. This means
that anyone is welcome to use any portion of the code for any purpose.
Conventions
NOTE: Since this is the Alpha release, we do not have all the conventions in place. We
are working on it. Patience.
Formatting
1. Code blocks will be used to present example code.
$ python hello-world.py
(dollar signs are not part of the command)
3. Italic text will be used to denote a file name:
hello-world.py.
1. Notes:
NOTE: This is a note filled in with bacon ipsum text. Bacon ipsum dolor sit amet
t-bone flank sirloin, shankle salami swine drumstick capicola doner porchetta bresaola
short loin. Rump ham hock bresaola chuck flank. Prosciutto beef ribs kielbasa pork belly
chicken tri-tip pork t-bone hamburger bresaola meatball. Prosciutto pork belly tri-tip
pancetta spare ribs salami, porchetta strip steak rump beef filet mignon turducken tail
pork chop. Shankle turducken spare ribs jerky ribeye.
2. Warnings:
WARNING: This is a warning also filled in with bacon ipsum. Bacon ipsum dolor
sit amet t-bone flank sirloin, shankle salami swine drumstick capicola doner porchetta
bresaola short loin. Rump ham hock bresaola chuck flank. Prosciutto beef ribs kielbasa
pork belly chicken tri-tip pork t-bone hamburger bresaola meatball. Prosciutto pork
belly tri-tip pancetta spare ribs salami, porchetta strip steak rump beef filet mignon turducken tail pork chop. Shankle turducken spare ribs jerky ribeye.
3. See Also:
SEE ALSO: This is a see also box with more tasty ipsum. Bacon ipsum dolor sit amet
t-bone flank sirloin, shankle salami swine drumstick capicola doner porchetta bresaola
short loin. Rump ham hock bresaola chuck flank. Prosciutto beef ribs kielbasa pork belly
chicken tri-tip pork t-bone hamburger bresaola meatball. Prosciutto pork belly tri-tip
pancetta spare ribs salami, porchetta strip steak rump beef filet mignon turducken tail
pork chop. Shankle turducken spare ribs jerky ribeye.
Errata
I welcome ideas, suggestions, feedback, and the occasional rant. Did you find a topic confusing? Or
did you find an error in the text or code? Did I omit a topic you would love to know more about?
Whatever the reason, good or bad, please send in your feedback.
You can find my contact information on the Real Python website. Or submit an issue on the Real
Python official support repository. Thank you!
NOTE: The code found in this course has been tested in Mac OS X v. 10.8, Windows
XP, Windows 7, and Linux Mint 14.
Chapter 2
Getting Started
Download Python
Before we can do anything, you have to download Python. Even if you already have Python on your
computer, make sure that you have the correct version: 2.7 is the version used in this book and by
most of the rest of the world.
NOTE: There's a newer version, Python 3.4x, but it can't run code that was created with
previous versions of Python (including some useful and important packages that haven't
been updated). Since most of the code you'll see elsewhere will be from Python 2.7x, you
should learn that first. The two versions are still very similar, and it will take you very
little time to get used to the minor changes in Python 3 after you've mastered Python
2. If you really want to get a head start on learning Python 3, however, this book will
point out all the relevant differences; just know that you will still need to use Python 2
for some things until Python 3 support is offered.
Mac users: You already have a version of Python installed by default, but it's not quite the same
as the standard installation. You should still download Python 2.7.6 as directed below. Otherwise,
you might run into problems later when trying to install some additional functionality in Python or
running code that involves graphics windows.
Linux users: You might already have Python 2.7.6 installed by default. Open your Terminal application and type python --version to find out. If you have 2.7.1 or 2.7.2, you should go ahead and update
to the latest version.
If you need to, go to http://www.python.org/download/ to download Python 2.7.6 for your operating
system and install the program.
For further assistance, please refer to Appendix A for a basic tutorial on installing
Python.
Open IDLE
We'll be using IDLE (Integrated DeveLopment Environment) to write our Python code. IDLE is a
simple editing program that comes automatically installed with Python on Windows and Mac, and it
will make our lives much easier while we're coding. You could write Python scripts in any program
from a basic text editor to a very complex development environment (and many professional coders
use more advanced setups), but IDLE is simple to use and will easily provide all the functionality we
need. I personally use a more advanced text editor called Sublime Text for most projects.
Windows: Go to your start menu and click on IDLE (Python GUI) from the Python 2.7 program
folder to open IDLE. You can also type "IDLE" into the search bar.
OS X: Go to your Applications folder and click on IDLE from the Python 2.7 folder to start running
IDLE. Alternatively, you can type "IDLE" (without quotes) into your Terminal window to launch
IDLE.
Linux: I recommend that you install IDLE to follow along with this course. You could use Vim or
Emacs, but they will not have the same built-in debugging features. To install IDLE with admin
privileges:
On Ubuntu/Debian, type: sudo apt-get install idle
On Fedora/Red Hat/RHEL/CentOS, type: sudo yum install python-tools
On SUSE, you can search for IDLE via install software through YaST.
Opening IDLE, you will see a brief description of Python, followed by a prompt:
>>>
We're ready to program!
>>> 1+1
2
>>>
Let's try out some actual code. The standard program to display "Hello, world" on the screen is just
that simple in Python. Tell the interactive window to print the phrase by using the print command
like so:
>>> print "Hello, world"
Hello, world
>>>
NOTE: If you want to get to previous lines you've typed into the interactive window
without typing them out again or copying and pasting, you can use the pair of shortcut
keys ALT+P (or on a Mac, CTRL+P). Each time you hit ALT+P, IDLE will fill in the
previous line of code for you. You can then type ALT+N (OS X: CTRL+N) to cycle back
to the next most recent line of code.
Normally we will want to run more than one line of code at a time and save our work so that we can
return to it later. To do this, we need to create a new script.
From the menu bar, choose File -> New Window to generate a blank script. You should rearrange
this window and your interactive results window so that you can see them both at the same time.
Type the same line of code as before, but put it in your new script:
print "Hello, world"
WARNING: If you just copy and paste from the interactive window into the script,
make sure you never include the >>> part of the line. That's just the window asking
for your input; it isn't part of the actual code.
In order to run this script, we need to save it first. Choose File -> Save As . . ., name the file
"hello_world.py" (without the quotation marks) and save it somewhere you'll be able to find it later.
The .py extension lets IDLE know that it's a Python script.
NOTE: Notice that print and "Hello, world" appear in different colors to let you know
that print is a command and "Hello, world" is a string of characters. If you save the
script as something other than a .py file (or if you don't include the .py extension),
this coloring will disappear and everything will turn black, letting you know that the file
is no longer recognized as a Python script.
Now that the script has been saved, all we have to do in order to run the program is to select Run
-> Run Module from the script window (or hit F5), and we'll see the result appear in the interactive
results window just like it did before:
>>>
Hello, world
>>>
NOTE: The biggest difference between Python 2 and Python 3 is the print command. In
Python 3, you put parentheses after the print statement instead of a space; for instance,
instead of just print "Hello, world", the previous code in Python 3 would look like:
print("Hello, world")
If you want to get used to using this version in Python 2, you can add the following line
to the top of every script you write:
from __future__ import print_function
(That's __future__ with two underscores on each side.) As long as your Python 2 script starts
with that line, you can then use print() instead of print, just like in Python 3.
To open and edit a program later on, just open up IDLE again and select File -> Open, then browse
to and select the script to open it in a new script window.
Double-clicking on a .py script will run that script, usually closing the window once the script is
done running (before you can even see what happened). If you instead want to edit the script in
IDLE, you can usually right-click (OS X: control-click) on the file and choose to Edit with IDLE to
open the script.
Linux users: Read this overview first (especially section 2.2.2) if you want to be able to run Python
scripts outside of the editor.
NOTE: You might see something like the following line in the interactive window
when you run or re-run a script:
>>> ====================== RESTART ======================
This is just IDLE's way of letting you know that everything after this line is the result
of the new script that you are just about to run. Otherwise, if you ran one script after
another (or one script again after itself), it might not be clear what output belongs to
which run of which script.
Screw things up
Everybody makes mistakes - especially while programming. In case you haven't made any mistakes
yet, let's get a head start on that and mess something up on purpose to see what happens.
Using IDLE, there are two main types of errors you'll experience. The most common is a syntax error,
which usually means that you've typed something incorrectly.
Let's try changing the contents of the script to:
>>>
Traceback (most recent call last):
>>>
So what happened? Python is telling us a few things:
An error occurred - specifically, Python calls it a NameError
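The two kinds of error can be told apart with a short experiment. The following is a minimal sketch rather than the book's own example: the broken line of code and the undefined name Hello are made up for illustration, and try/except is used only so that the script keeps running after each error. Single-argument print() calls are used so that the sketch behaves the same in Python 2.7 and Python 3.

```python
# Part 1: a syntax error. compile() parses the text without running it,
# so the mistake is reported before any code executes.
broken_source = 'print("Hello, world)'   # the closing quotation mark is missing
try:
    compile(broken_source, "<example>", "exec")
    syntax_error_found = False
except SyntaxError:
    syntax_error_found = True

# Part 2: a run-time error. This line is valid syntax, but Hello (without
# quotes) is an undefined name, so Python raises a NameError when it runs.
try:
    print(Hello)
    name_error_found = False
except NameError:
    name_error_found = True

print(syntax_error_found)
print(name_error_found)
```

Both flags end up True: the first mistake is caught while the code is being read, the second only once the faulty line is executed.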
Review exercises:
1. Write a script that IDLE won't let you run because it has a syntax error
2. Write a script that will only crash your program once it is already running because it has a
run-time error
Store a variable
Let's try writing a different version of the previous script. Here we'll use a variable to store our text
before printing the text to the screen:
phrase = "Hello, world"
print phrase
Notice the difference in where the quotation marks go from our previous script. We are creating a
variable named phrase and assigning it the value of the string of text "Hello, world". We then print
the phrase to the screen. Try saving this script and running these two lines; you should see the same
output as before:
>>>
Hello, world
>>>
Notice that in our script we didn't say:
print "phrase"
Using quotes here would just print the word "phrase" instead of printing the contents of the variable
named phrase.
NOTE: Phrases that appear in quotation marks are called strings. We call them strings
because they're just that - strings of characters. A string is one of the most basic building blocks of any programming language, and we'll use strings a lot over the next few
chapters.
Phrase = "Hello, world"
print phrase
Can you spot the difference? In this example, the first line defines the variable Phrase with a capital
P at the beginning, but the second line prints out the variable phrase.
WARNING: Since Python is case-sensitive, the variables Phrase and phrase are entirely different things. Likewise, commands start with lowercase letters; we can tell
Python to print, but it wouldn't know how to Print. Keep this important distinction
in mind!
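To see the quoting and case-sensitivity rules side by side, here is a small sketch. It assumes nothing beyond the phrase variable described above; the try/except wrapper is an addition so the script can report the error instead of stopping. The print() calls take a single argument, so they work in both Python 2.7 and Python 3.

```python
phrase = "Hello, world"

print(phrase)      # no quotes: displays the contents of the variable
print("phrase")    # quotes: displays the literal word phrase

# Phrase (capital P) was never defined, so Python treats it as an unknown name.
try:
    print(Phrase)
    capitalized_name_exists = True
except NameError:
    capitalized_name_exists = False

print(capitalized_name_exists)
```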
When you run into trouble with the sample code, be sure to double-check that every character in your
code (often including spaces) exactly matches the examples. Computers don't have any common
sense to interpret what you meant to say, so being almost correct still won't get a computer to do the
right thing!
Review exercises:
1. Using the interactive window, display some text on the screen by using print
2. Using the interactive window, display a string of text by saving the string to a variable, then
printing the contents of that variable
3. Do each of the first two exercises again by first saving your code in a script and then running
the script
Chapter 3
Interlude: Leave yourself helpful notes
As you start to write more complicated scripts, you'll start to find yourself going back to parts of your
code after you've written them and thinking, "what the heck was that supposed to do?"
To avoid these moments, you can leave yourself notes in your code; they don't affect the way the script
runs at all, but they help to document what's supposed to be happening. These notes are referred to
as comments, and in Python you start a comment with a pound (#) sign. Our first script could have
looked like this:
print "#1"
If you have a lot to say, you can also create comments that span over multiple lines by using a
series of three single quotes (''') or three double quotes (""") without any spaces between
them. Once you do that, everything after the ''' or """ becomes a comment until you close the
comment with a matching ''' or """. For instance, if you were feeling excessively verbose, our first
script could have looked like this:
print phrase
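As an illustration of both comment styles, here is a hedged sketch of what a commented version of the script might look like. The wording of the comments is invented for this example; only the phrase variable and the "#1" string come from the surrounding text.

```python
# This is a single-line comment; Python ignores everything after the # sign.
phrase = "Hello, world"   # a comment can also follow code on the same line

"""
This block is surrounded by three double quotes, so it can
span several lines. It has no effect on how the script runs.
"""

print(phrase)

# Inside quotation marks, # is just another character in the string,
# so the next line displays #1 rather than starting a comment.
print("#1")
```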
Chapter 4
Fundamentals: Strings and Methods
Learn to speak in Python
As we've already seen, you write strings in Python by surrounding them with quotes. You can use
single quotes or double quotes, as long as you're consistent for any one string. All of the following
lines create string variables (called string literals because we've literally written out exactly how they
look):
longString = '''This is a
string that spans across multiple lines'''
One last thing about strings: if you want to write out a really long string, but you don't want it to
appear on multiple lines, you can use a backslash like this when writing it out:
Here's a string that I want to write across multiple lines since it is long.
>>>
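The different ways of writing string literals can be compared in one short script. This is a sketch with made-up variable names and example text (apart from the two strings quoted above); it is meant to show the quoting rules, not to reproduce the book's original listing.

```python
single_quoted = 'Hello, world'
double_quoted = "Hello, world"

# Use one kind of quote outside so the other kind can appear inside.
has_double_quotes = 'She said, "Python is fun."'
has_apostrophe = "We're ready to program!"

# Triple quotes allow a string to contain line breaks.
long_string = '''This is a
string that spans across multiple lines'''

# A backslash at the end of a line continues the same string on the next
# line of the script without adding a line break to the string itself.
one_line = "Here's a string that I want to write \
across multiple lines since it is long."

print(long_string)
print(one_line)
```

Note that the continued line starts at the left margin; any leading spaces there would become part of the string.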
Review exercises:
1. print a string that uses double quotation marks inside the string
2. print a string that uses an apostrophe (single quote) inside the string
3. print a string that spans across multiple lines
4. print a one-line string that you have written out on multiple lines
>>> print "abra" + "ca" + "dabra"
abracadabra
>>>
In programming, when we add strings together like this, we say that we concatenate them.
NOTE: You'll see a lot of bold terms throughout the first few chapters of this book. Don't
worry about memorizing all of them if they're unfamiliar! You don't need any fancy jargon to program well, but it's good to be aware of the correct terminology. Programmers
tend to throw around technical terms a lot; not only does it allow for more precise communication, but it helps make simple concepts sound more impressive.
When we want to combine many strings at once, we can also use commas to separate them. This will
automatically add spaces between the strings, like so:
Of course, the commas have to go outside of the quotation marks, since otherwise the commas would
become part of the actual strings themselves.
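Here is a brief sketch contrasting the two ways of combining strings. One caveat is an assumption of this example rather than something stated in the text: it is written with the print() function, so it behaves as described under Python 3, or under Python 2.7 when the script begins with from __future__ import print_function. In plain Python 2.7 the equivalent comma form is the print statement without parentheses.

```python
# Concatenation with + joins the strings exactly as they are, with no spaces.
spell = "abra" + "ca" + "dabra"
print(spell)

# Separating values with commas (outside the quotation marks) displays them
# with a single space between each one.
print("abra", "ca", "dabra")
```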
Since a string is just a sequence of characters, we should be able to access each character individually
as well. We can do this by using square brackets after the string, like this:
Index / Subscript #:
We can get a particular section out of the string as well, by using square brackets and specifying the
range of characters that we want. We do this by putting a colon between the two subscript numbers,
like so:
1
2
3
4
If we use the colon in the brackets but omit one of the numbers in a range, Python will assume that
we meant to go all the way to the end of the string in that direction:
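Indexing and slicing are easiest to follow on a concrete string. The sketch below uses a made-up variable, word, holding "python"; the values in the comments follow from Python's rules that index numbers start at 0 and that a slice stops just before its second number.

```python
word = "python"

first_letter = word[0]     # "p": index numbers start counting at 0
fourth_letter = word[3]    # "h"

middle = word[1:4]         # "yth": characters 1, 2 and 3 (4 is excluded)
from_start = word[:2]      # "py": first number omitted, so start at the beginning
to_end = word[2:]          # "thon": second number omitted, so go to the end

print(first_letter)
print(middle)
print(from_start + to_end)
```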
myString = "goal"
myString[0] = "f"
Instead, we would have to create an entirely new string (although we can still give myString that new value):
myString = "goal"
myString = "f" + myString[1:]
In the first example, we were trying to change part of myString and keep the rest of
it unchanged, which doesn't work. In the second example, we created a new string by
adding two strings together, one of which was a part of myString; then we took that new
string and completely reassigned myString to this new value.
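Both attempts can be run in a single script. This is a minimal sketch: the try/except wrapper is an addition so that the failed assignment is reported instead of stopping the script, and the variable is spelled my_string here only to keep it distinct from the listing above.

```python
my_string = "goal"

# Attempt 1: changing one character in place is not allowed for strings.
try:
    my_string[0] = "f"
    assignment_worked = True
except TypeError:
    assignment_worked = False

# Attempt 2: build a brand-new string and reassign the variable to it.
my_string = "f" + my_string[1:]

print(assignment_worked)
print(my_string)
```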
Review exercises:
1. Create a string and print its length using the len() function
2. Create two strings, concatenate them (add them next to each other) and print the combination
of the two strings
3. Create two string variables, then print one of them after the other (with a space added in between) using a comma in your print statement
4. print the string "zing" by using subscripting and index numbers on the string "bazinga" to
specify the correct range of characters
car.park()
Methods are followed by parentheses, because sometimes methods use input. For instance, if we
wanted to drive the car object a distance of 50, we would place that input of 50 in the parentheses of
the drive method:
car.drive(50)
There are certain methods that belong to string objects as well. For instance, there is a string method
called upper() that creates an upper-case version of the string. (Likewise, there is a corresponding
method lower() that creates a lower-case version of a string.) Let's give it a try in the interactive
window:
>>>
We created a string loudVoice, then we called its upper() method to return the upper-case version
of the string, which we print to the screen.
NOTE: Methods are just functions that belong to objects. We already saw an example of
a general-purpose function, the len() function, which can be used to tell us the length of
many different types of objects, including strings. This is why we use the length function
differently, by only saying: len(loudVoice)
Meanwhile, we use dot notation to call methods that belong to an object, like when we
call the upper() method that belongs to the string loudVoice: loudVoice.upper()
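The difference between calling a method and calling a general-purpose function can be sketched as follows. The text stored in loud_voice is a made-up example value, not necessarily the string used in the original listing.

```python
loud_voice = "Can you hear me yet?"

# Methods belong to the object, so they are called with dot notation.
shouted = loud_voice.upper()
whispered = loud_voice.lower()

# len() is a general-purpose function, so the object goes inside the parentheses.
length = len(loud_voice)

print(shouted)
print(whispered)
print(length)

# upper() and lower() return new strings; the original is left unchanged.
print(loud_voice)
```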
Let's make things more interactive by introducing one more general function. We're going to get
some input from the user of our program by using the function raw_input(). The input that we pass
to this function is the text that we want it to display as a prompt; what the function actually does is
to receive additional input from the user. Try running the following script:
NOTE: Python 3 note: The raw_input() function has been renamed to just input() in
Python 3. There is an input() function in Python 2 as well, but it actually works differently (and its equivalent doesn't exist in Python 3).
When you run this, instead of the program ending and taking you back to the >>> prompt, you'll just
see:
>>>
>>>
>>>
Now we'll combine the function raw_input() with the string method upper() in a script to modify
the user's input:
response = response.upper()
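The combination can be sketched as below. Several things here are assumptions made for illustration: the prompt text is invented, the logic is wrapped in a hypothetical helper named ask_and_shout() (functions are only introduced later in the book), and the first few lines pick raw_input() on Python 2 or input() on Python 3 so the same sketch runs on either version.

```python
# Use raw_input() on Python 2 and input() on Python 3.
try:
    read_line = raw_input
except NameError:
    read_line = input

def ask_and_shout(read=read_line):
    """Prompt the user, then return the answer in upper case."""
    response = read("What should I shout? ")   # raw_input() returns a string
    response = response.upper()                # reassign to the upper-case version
    return response

# To try it interactively, add this line to the end of the script:
# print(ask_and_shout())
```

Passing the reading function in as an argument is a design choice made only so the helper can be tried without a keyboard, for example ask_and_shout(lambda prompt: "hello").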
In IDLE, if you want to see all the methods that can apply to a particular kind of object, you can type that
object out followed by a period and then hit CTRL+SPACE. For instance, first define a string object
in the interactive window:
>>> myString.
When you hit CTRL+SPACE, you'll see a list of method options that you can scroll through with the
arrow keys. Strings have lots of methods!
A related shortcut in IDLE is the ability to fill in text automatically without having to type in long
names by hitting TAB. For instance, if you only type in myString.u and then hit the TAB key, IDLE
will automatically fill in myString.upper because there is only one method belonging to myString
that begins with a u. In fact, this even works with variable names; try typing in just the first few
letters of myString and, assuming you don't have any other names already defined that share those
first letters, IDLE will automatically complete the name myString for you when you hit the TAB
key.
Review exercises:
1. Write a script that takes input from the user and displays that input back
2. Use CTRL+SPACE to view all the methods of a string object, then write a script that returns
the lower-case version of a string
Chapter 5
Fundamentals: Working with Strings
Mix and match different objects
We've seen that string objects can hold any characters, including numbers. However, don't confuse
string "numbers" with actual numbers! For instance, try this bit of code out in the interactive window:
int() stands for integer and converts objects into whole numbers, while float() stands for
floating-point number and converts objects into numbers that have decimal points. For instance,
we could change the string myNumber into an integer or a float like so:
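The two conversions can be sketched like this. The value "10" and the variable names are assumptions chosen for illustration rather than the book's original listing.

```python
my_number = "10"                 # a string that happens to contain digits

as_integer = int(my_number)      # the whole number 10
as_float = float(my_number)      # the floating-point number 10.0

# Once converted, the values behave like numbers in arithmetic.
doubled = as_integer * 2
halved = as_float / 4

print(as_integer)
print(as_float)
print(doubled)
print(halved)
```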
Notice how the second version added a decimal point, because the floating-point number has more
precision (more decimal places). For this reason, we couldn't change a string that looks like a floating-point number into an integer because we would have to lose everything after the decimal:
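A short sketch of that failure, using a made-up string "1.5". The try/except wrapper is an addition so the error can be observed without stopping the script, and the two-step conversion at the end is offered as one possible workaround rather than as the book's recommendation.

```python
decimal_text = "1.5"

# int() refuses a string that contains a decimal point.
try:
    int(decimal_text)
    direct_conversion_worked = True
except ValueError:
    direct_conversion_worked = False

# Converting to a float first and then to an integer does work,
# but everything after the decimal point is thrown away.
truncated = int(float(decimal_text))

print(direct_conversion_worked)
print(truncated)
```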
Review exercises:
1. Create a string object that stores an integer as its value, then convert that string into an actual
integer object using int(); test that your new object is really a number by multiplying it by
another number and displaying the result
2. Repeat the previous exercise, but use a floating-point number and float()
3. Create a string object and an integer object, then display them side-by-side with a single print
statement by using the str() function
Although it's less frequently used, we can also use index numbers inside the curly braces to do the
same thing:
print "{0} has {1} heads and {2} arms".format(name, numHeads, numArms)
Here we've inserted name into the {0} place-holder because it is the 0th input listed, and so on.
Since we numbered our place-holders, we don't even have to provide the inputs in the same order.
For instance, this line would also do the exact same thing:
print "{2} has {1} heads and {0} arms".format(numArms, numHeads, name)
This style of formatting can be helpful if you want to repeat an input multiple times within a string,
e.g.:
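To check that the numbered place-holders behave as described, here is a small sketch. The values assigned to name, num_heads and num_arms are made up for illustration, and the repeated-input line is one possible example of reusing an index, not the book's original.

```python
name = "Zaphod"
num_heads = 2
num_arms = 3

# Each index number refers to the position of an input passed to format().
in_order = "{0} has {1} heads and {2} arms".format(name, num_heads, num_arms)
reordered = "{2} has {1} heads and {0} arms".format(num_arms, num_heads, name)

# The same index can appear more than once to repeat an input.
repeated = "{0} has {1} heads; yes, {0} really has {1} heads".format(name, num_heads)

print(in_order)
print(reordered)
print(repeated)
```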
Review exercises:
1. Create a float object (a decimal number) named weight that holds the value 0.2, and create
a string object named animal that holds the value "newt", then use these objects to print the
following line without using the format() string method: 0.2 kg is the weight of the newt.
2. Display the same line using format() and empty {} place-holders
3. Display the same line using {} place-holders that use the index numbers of the inputs provided
to the format() method
4. Display the same line by creating new string and float objects inside of the format() method
A similar string method is replace(), which will replace all occurrences of one substring with a
different string. For instance, let's replace every instance of "the truth" with the string "lies" in the
following:
>>> myStory = "I'm telling you the truth; he spoke nothing but the truth!"
>>> print myStory.replace("the truth", "lies")
I'm telling you lies; he spoke nothing but lies!
>>>
Keep in mind that calling replace() did not actually change myStory; in order to affect this string,
we would still have to reassign it to a new value, as in:
>>> myStory = myStory.replace("the truth", "lies")
>>>
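The behaviour of replace(), together with the find() method used in the exercises below, can be sketched in a script. The variable names are adapted for this example; the story text is the one from the listing above.

```python
my_story = "I'm telling you the truth; he spoke nothing but the truth!"

# replace() returns a new string and leaves the original untouched.
changed = my_story.replace("the truth", "lies")
original_unchanged = ("the truth" in my_story)

# find() returns the index where the substring first appears,
# or -1 if the substring is not in the string at all.
first_position = my_story.find("the truth")
not_found = my_story.find("falsehood")

# Reassigning the variable is what actually updates it.
my_story = changed

print(my_story)
print(first_position)
print(not_found)
```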
Review exercises:
1. In one line, display the result of trying to find() the substring "a" in the string "AAA"; the result
should be -1
2. Create a string object that contains the value "version 2.0"; find() the first occurrence of the
number 2.0 inside of this string by first creating a float object that stores the value 2.0 as a
floating-point number, then converting that object to a string using the str() function
3. Write and test a script that accepts user input using raw_input(), then displays the result of
trying to find() a particular letter in that input
The letter: a becomes: 4
The letter: b becomes: 8
The letter: e becomes: 3
The letter: l becomes: 1
The letter: o becomes: 0
The letter: s becomes: 5
The letter: t becomes: 7
Your program should then display the resulting output. A sample run of the program, with the user
input in bold, is shown below:
>>>
Chapter 6
Fundamentals: Functions and Loops
Do futuristic arithmetic
We already did some basic math using IDLE's interactive window. For instance, we saw that we could
evaluate simple expressions just by typing them in at the prompt, which would display the answer:
>>> 6 * (1+6)
42
>>>
However, just putting that line into a script:
6 * (1+6)
would be useless since we haven't actually told the program to do anything. If we want to display the
result from a program, we have to rely on the print command again.
Go ahead and open a new script, save it as arithmetic.py and try displaying the results of some basic
calculations:
print "1 + 1 =", 1 + 1
print "2 * (2 + 3) =", 2 * (2+3)
print "1.2 / 0.3 =", 1.2 / 0.3
print "5 / 2 =", 5 / 2
Here we've used a single print statement on each line to combine two pieces of information by
separating the values with a comma. The results of the numerical expressions on the right will automatically be calculated when we display them.
All of the spaces we included above were entirely optional, but they help to make things easier to
read.
When you save and run this script, it displays the results of your print commands as follows:
>>>
1 + 1 = 2
2 * (2 + 3) = 10
1.2 / 0.3 = 4.0
5 / 2 = 2
>>>
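The last result deserves a second look: in Python 2, dividing one integer by another discards the remainder, which is why 5 / 2 displays 2. The sketch below shows two hedged ways to get 2.5 instead. The variable names are made up for illustration, and print() is called with a single formatted string so the snippet also runs under Python 3:

```python
# // discards the remainder in every Python version, which makes the
# integer-division behavior explicit
whole = 5 // 2

# Converting either operand to a float gives the fractional result
exact = 5 / 2.0

print("5 // 2 = {}".format(whole))
print("5 / 2.0 = {}".format(exact))

# Adding the following line as the very first statement of a Python 2 script
# makes plain / keep the fractional part everywhere in that script:
# from __future__ import division
```

The commented-out import is the "division from the future" that the review exercises below ask you to try.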
Review exercises:
1. In a script, print the result of dividing one integer by a second, larger integer; this is integer
division
2. Import division from the future into the same script, then save and rerun the script; compare
these results to the previous version
>>>
Enter a base: 1.2
Enter an exponent: 3
1.2 to the power of 3 = 1.728
>>>
Keep the following in mind:
1. In Python, x raised to the power of y is calculated by using the expression x ** y
2. Before you can do anything with the user's input, you will have to store the results of both calls
to raw_input() in new objects
3. The raw_input() function returns a string object, so you will need to convert the user's input
into numbers in order to do arithmetic on them
4. You should use the string format() method to print the result
5. You can assume that the user will enter actual numbers as input
def square(number):
    sqr_num = number ** 2
    return sqr_num
The def is short for define and lets Python know that we are about to define a new function. In this
case, we called the function square and gave it one input variable (the part in parentheses) named
number. A function's input is called an argument of the function, and a function can take more than
one argument.
The first line within our function multiplies number by itself and stores the result in a new variable
named sqr_num. Then the last line of our function returns the value of sqr_num, which is the output
of our function.
If you just type these three lines into a script, save it and run it, nothing will happen. The function
doesn't do anything by itself.
However, now we can use the function later on from the main section of the script. For instance, try
running this script:
def square(number):
    sqr_num = number ** 2
    return sqr_num

input_num = 5
output_num = square(input_num)

print output_num
By saying output_num = square(input_num), we are calling up the function square and providing
this function with the input variable input_num, which in this case has a value of 5. Our function then
calculates 5 ** 2 (5 squared) and returns the value of the variable sqr_num, which gets stored in our new variable
output_num.
NOTE: Notice the colon and the indentation after we defined our function. These aren't
optional! This is how Python knows that we are still inside of the function. As soon as
Python sees a line that isn't indented, that's the end of the function. Every line inside
the function must be indented.
You can define many functions in one script, and functions can even refer to each other. However,
it's important that a function has been defined before you try to use it. For instance, try running this
code instead:
input_num = 5

output_num = square(input_num)

print output_num

def square(number):
    sqr_num = number * number
    return sqr_num
Here we've just reordered the two parts of our script so that the main section comes before the function. The problem here is that Python runs through our code from the top to the bottom - so when we
call the square function on the second line, Python has no idea what we mean yet because we don't
actually define the square function until later on in the script, and it hasn't gotten there yet. Instead
we see an error:
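The error Python raises in this situation is a NameError. The sketch below reproduces it without stopping the script by wrapping the premature call in a try/except block (a technique covered later in this book). The wrapper and the message text are assumptions added for illustration:

```python
input_num = 5

try:
    # square() has not been defined yet at this point in the script
    output_num = square(input_num)
except NameError as error:
    output_num = None
    print("Python raised a NameError: {}".format(error))


def square(number):
    sqr_num = number * number
    return sqr_num


# Now that the definition has run, the same call succeeds
output_num = square(input_num)
print(output_num)
```

Moving the function definition above the main section, as in the earlier version of the script, avoids the error entirely.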
print returnDifference(3, 5)
This line will call our new returnDifference() function, then display the result of -2 that the function
returns.
NOTE: Once a function returns a value with the return command, the function is done
running; if any code appears inside the function after the return statement, it will never
be run because the function has already returned its final result.
One last helpful thing about functions is that Python allows you to add special comments called docstrings. A docstring serves as documentation, helping to explain what a function does and how to
use it. They're completely optional, but can be helpful if there's any chance that you'll either share
your code with someone else or if you ever come back to your code later, once you've forgotten what
it's supposed to do - which is why you should leave comments in the first place! A docstring looks just
like a multi-line comment with three quotation marks, but it has to come at the very beginning of a
function, right after the first definition line:
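A sketch of such a function, assuming the returnDifference(n1, n2) signature and the docstring wording that the help() output in this section displays. The body (n1 - n2) is inferred from the result of -2 for returnDifference(3, 5):

```python
def returnDifference(n1, n2):
    """Return the difference between two numbers.

    Subtracts n2 from n1."""
    return n1 - n2


# The docstring is stored on the function itself; help() reads it from here
print(returnDifference.__doc__)
print(returnDifference(3, 5))
```

Because the string is the first thing inside the function, Python attaches it to the function as its __doc__ attribute rather than treating it as an ordinary expression.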
>>> help(returnDifference)
Help on function returnDifference in module __main__:
returnDifference(n1, n2)
Return the difference between two numbers.
Subtracts n2 from n1.
>>>
Of course, you can also call help() on the many other Python functions we'll see to get a quick reference on how they are used.
Review exercises:
1. Write a cube() function that takes a number and multiplies that number by itself twice over, returning the new value; test the function by displaying the result of calling your cube() function
on a few different numbers
2. Write a function multiply() that takes two numbers as inputs and multiplies them together,
returning the result; test your function by saving the result of multiply(2, 5) in a new integer
object and printing that integer's value
Run in circles
One major benefit of computers is that we can make them do the same exact thing over and over
again, and they rarely complain or get tired. The easiest way to program your computer to repeat
itself is with a loop.
There are two kinds of loops in Python: for loops and while loops. The basic idea behind any kind of
loop is to run a section of code repeatedly as long as a specific statement (called the test condition)
is true. For instance, try running this script that uses a while loop:
n = 1
while (n < 5):
    print "n =", n
    n = n + 1
print "Loop finished"
Here we create a variable n and assign it a value of 1. Then we start the while loop, which is organized
in a similar way to how we defined a function. The statement that we are testing comes in parentheses
after the while command; in this case we want to know if the statement n < 5 is true or false. Since 1
< 5, the statement is true and we enter into the loop after the colon.
NOTE: Notice the indentation on the lines after the colon. Just like when we defined
a function, this spacing is important! The colon and indenting let Python know that the
next lines are inside the while loop. As soon as Python sees a line that isn't indented,
that line and the rest of the code after it will be considered outside of the loop.
Once we've entered the loop, we print the value of the variable n, then we add 1 to its value. Now n
is 2, and we go back to test our while statement. This is still true, since 2 < 5, so we run through the
next two lines of code again. And we keep on with this pattern while the statement n < 5 is true.
As soon as this statement becomes false, we're completely done with the loop; we jump straight to
the end and continue on with the rest of the script, in this case printing out "Loop finished".
Go ahead and test out different variations of this code - and try to guess what your output will be
before you run each script. Just be careful: it's easy to create what's called an infinite loop; if you test
a statement that's always going to be true, you will never break out of the loop, and your code will
just keep running forever.
NOTE: It's important to be consistent with indentation, too. Notice how you can hit
tab and backspace to change indentation, and IDLE automatically inserts four space
characters. That's because you can't mix tabs and spaces as indentation. Although IDLE
won't let you make this mistake, if you were to open your script in a different text editor
and replace some of the space indentation with tabs, Python would get confused - even
though the spacing looks the same to you! (You could use tab characters everywhere as
indentation - or even use a different number of space characters - as long as you always
use the same type of indentation for a particular section of indented text.)
The second type of loop, a for loop, is slightly different in Python. We typically use for loops in Python
in order to loop over every individual item in a set of similar things - these things could be numbers,
variables, lines from an input file, etc.
For instance, the following code does the exact same thing as our previous script by using a for loop
to repeatedly run code for a range of numbers:
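A minimal sketch of such a loop, assuming range(1, 5) as the set of numbers so that n takes the same values (1 through 4) as in the while loop. The print() calls use a single formatted string so the snippet runs under both Python 2 and Python 3:

```python
# range(1, 5) produces the numbers 1, 2, 3 and 4; the end value 5 is excluded
for n in range(1, 5):
    print("n = {}".format(n))
print("Loop finished")
```

Notice that there is no line that adds 1 to n; the for loop hands us the next number from the range on each pass.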
As long as we indent the code correctly, we can even put loops inside loops! Try this out:
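One possible nested loop, offered as a sketch. The variable names, the letters and the pairs list are made up for illustration and are not taken from the original example:

```python
pairs = []
for n in range(1, 4):          # outer loop: n is 1, then 2, then 3
    for j in ["a", "b", "c"]:  # inner loop runs in full for every value of n
        pairs.append("{}{}".format(n, j))
        print("n = {} and j = {}".format(n, j))
```

The inner loop is indented one level further than the outer loop, and its body one level further again; that extra indentation is how Python tells the two loops apart.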
case (or if your code seems frozen because of anything else that's taking longer than
expected), you can usually break out of the code by typing CTRL+C in the interactive
window. This should immediately stop the rest of your script from running and take
you back to a prompt in the interactive window.
If that doesn't seem to have any effect (because you somehow managed to freeze the IDLE window),
you can usually type CTRL+Q to quit out of IDLE entirely, much like End Task in Windows or Force Quit on a Mac.
Review exercises:
1. Write a for loop that prints out the integers 2 through 10, each on a new line, by using the
range() function
2. Use a while loop that prints out the integers 2 through 10
(Hint: you'll need to create a new integer first; there's no good reason to use a while loop
instead of a for loop in this case, but it's good practice)
3. Write a function doubles() that takes one number as its input and doubles that number three
times using a loop, displaying each result on a separate line; test your function by calling doubles(2) to display 4, 8, and 16
invest(2000, .025, 5)
Running this test code should produce the following output exactly:
Chapter 7
Interlude: Debug your code
You've probably already discovered how easy it is to make mistakes that IDLE can't automatically
catch for you. As your code becomes longer and more complicated, it can become a lot more difficult
to track down the sources of these errors.
When we learned about syntax and run-time errors, I actually left out a third, most difficult type
of error that you've probably already experienced: the logic error. Logic errors occur when you've
written a program that, as far as your computer is concerned, is a completely valid program that it
has no trouble running - but the program doesn't do what you intended it to do because you made a
mistake somewhere.
Programmers use debuggers to help get these bugs out of their programs (we're clever with names
like that), and there's already a simple debugger built into IDLE that you should learn to use - before
you need to use it.
NOTE: Although debugging is the least glamorous and most boring part of programming, learning to make good use of a debugger can save you a lot of time in the end. We
all make mistakes; it's a good idea to learn how to find and fix them.
From the file menu of the interactive window of IDLE (not in a script window), click on Debug ->
Debugger to open the Debug Control window.
Notice how the interactive window now shows [DEBUG ON] at the prompt to let you know that the
debugger is open. We have a few main options available to us, all of which will be explained shortly:
Go, Step, Over, Out, and Quit.
Keep both the debugger window and the interactive window open, but let's also start a new script so
that we can see how the debugger works:
for i in range(1,4):
    j = i*2
If you run this script while you have the debugger open, you'll notice that it doesn't get very far.
Actually, it pauses before running anything at all, and the Stack window at the top of the debugger
says:
def addUnderscores(word):
    new_word = "_"
    return new_word
What we meant for the function addUnderscores() to do was to add underscores around every character in the word passed to it, so that we could give it the input "hello" and it would return the output:
_h_e_l_l_o_
Instead, all we see right now is:
It might already be obvious to you what our error was, but let's use the debugger to work through
the problem. We know that the problem is occurring somewhere inside the function - specifically,
within the for loop, since we said that new_word should start with a "_" but it clearly doesn't. So let's
put a breakpoint at the start of the for loop so that we can trace out exactly what's happening inside.
To set a breakpoint on a line, right-click (Mac: control-click) on that line and select Set Breakpoint,
which should highlight the line to let you know that the breakpoint is active.
Now we can run the script with the debugger open. It will still pause on the very first line it sees
(which is defining the function), but we can select Go to run through the code normally. This will
save the function in Python's memory, save the variable phrase as "hello", call up our function from
the print statement and then pause at our breakpoint, right before entering the for loop.
At this point, we see that we have two local variables defined (they're local because they belong to
the function). As expected, we have new_word, which is a string with just the underscore character,
and we have word, the variable we passed to the function, with "hello" as its contents.
Click Step once and you'll see that we've entered the for loop. The counter we're using, i, has been
given its first value of 0.
Click Step one more time, and it might become clearer what's happening in our code. The variable
new_word has taken on the value "h_". It got rid of our first underscore character already! If you
click Step a few more times, you'll see that new_word gets set to "e_", then "l_", etc. We've been
overwriting the contents of new_word instead of adding to it, and so we correct the line to:
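Because each pass through the loop was replacing new_word, the fix is to build on its existing value instead. The following is a hedged reconstruction rather than the original listing: it assumes the loop indexes into word with the counter i, as the debugger walkthrough describes, and the buggy line shown in the comment is inferred from the values "h_", "e_" and "l_" seen while stepping:

```python
def addUnderscores(word):
    new_word = "_"
    for i in range(0, len(word)):
        # Inferred buggy version (overwrites): new_word = word[i] + "_"
        new_word = new_word + word[i] + "_"  # fixed: add to what we have so far
    return new_word


phrase = "hello"
print(addUnderscores(phrase))
```

Stepping through this version in the debugger should show new_word growing by one character and one underscore on every pass.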
It's most likely because you closed out of the debugger while your script was still running.
Always be sure to hit Go or Quit when you're finished with a debugging session instead of just closing the debugger, or you might have trouble reopening it. The IDLE
debugger isn't the most carefully crafted piece of software - sometimes you'll just have to
exit out of IDLE and reopen your script to make this error go away.
Debugging can be tricky and time-consuming, but sometimes it's the only reliable way to fix errors
that you've overlooked. However, before turning to the debugger, in some cases you can just use
print statements to your advantage to figure out your mistakes much more easily.
The easiest way to do this is to print out the contents of variables at specific points throughout your
script. If your code breaks at some point, you can also print out the contents of variables using the
interactive window to compare how they appear to how you thought they should look at that point.
For instance, in the previous example we could have added the statement print new_word inside
of our for loop. Then when we run the script, we'd be able to see how new_word actually grows
(or in this case, how it doesn't grow properly). Just be careful when using print statements this
way, especially inside of loops - if you don't plan out the process well, it's easy to end up displaying
thousands of lines that aren't informative and only end up slowing down or freezing your program.
Or you could always try rubber ducking.
Chapter 8
Fundamentals: Conditional logic
Compare values
Computers understand our world in binary, breaking every problem down into 0s and 1s. In order
to make comparisons between values, we have to learn how to communicate in this 1 or 0 language
using boolean logic. Boolean refers to anything that can only take on one of two values: true or
false.
To help make this more intuitive, Python has two special keywords that are conveniently named True
and False. The capitalization is important - these keywords are not ordinary variables or strings,
but are essentially synonyms for 1 and 0, respectively. Try doing some arithmetic in the interactive
window using True and False inside of your expressions, and you'll see that they behave just like 1
and 0:
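A few expressions of that kind, written as a short script. The particular expressions and variable names are illustrative choices, not the original listing:

```python
# True and False take part in arithmetic as the integers 1 and 0
total = True + True
product = False * 5
offset = True - False

print("True + True = {}".format(total))
print("False * 5 = {}".format(product))
print("True - False = {}".format(offset))
```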
a == b    a equal to b
The last two symbols might require some explanation. The symbol != is a sort of shortcut notation
for saying not equal to. Try it out on a few expressions, like these:
>>> 1 != 2
True
>>> 1 != 1
False
>>>
In the first case, since 1 does not equal 2, we see the result True. Then, in the second example, 1 does
equal 1, so our test expression 1 != 1 returns False.
When we want to test if a equals b, we can't just use the expression a = b because in programming,
that would mean we want to assign the value of b to a. We use the symbol == as a test expression
when we want to see if a is equal to b. For instance, take a look at the two versions of equals here:
>>> a = 1
>>> b = 2
>>> a == b
False
>>> a = b
>>> a == b
True
>>>
First we assigned a and b two different values. When we check whether they are equal by using the
a == b expression, we get the result False. Then, we reassign a the value of b by saying a = b. Now,
since a and b are both equal to 2, we can test this relationship again by saying a == b (which is more
of a question that we're asking), which returns the result of True. Likewise, we could ask:
>>> a != b
False
>>>
It's True that 2 == 2. So the opposite expression, a != b, is False. In other words, it's not True that 2
does not equal 2.
We can compare strings in the same way that we compare numbers. Saying that one word is "less
than" another doesn't really mean anything, but we can test whether two strings are the same by
using the == or != comparators:
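A short sketch of string comparisons. The words compared and the variable names are made up for illustration:

```python
same = "dog" == "dog"          # identical strings
different = "dog" == "cat"     # different words
case_matters = "dog" != "Dog"  # capitalization counts as a difference

print(same)
print(different)
print(case_matters)
```

The last comparison is True because "dog" and "Dog" are not the same string, even though they spell the same word.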
Keep in mind that two strings have to have exactly the same value for them to be equal. For instance, if
one string has an extra space character at the end or if the two strings have different capitalization,
comparing whether or not they are equal will return False.
Review exercises:
1. Figure out what the result will be (True or False) when evaluating the following expressions,
then type them into the interactive window to check your answers:
1 <= 1
1 != 1
1 != 2
"good" != "bad"
"good" != "Good"
123 == "123"
>>> 1 < 2 and 3 < 4
True
>>> 2 < 1 and 4 < 3
False
>>> 1 < 2 and 4 < 3
False
>>> 2 < 1 and 3 < 4
False
>>>
1. 1 < 2 and 3 < 4: both statements are True, so the combination is also True
2. 2 < 1 and 4 < 3: both statements are False, so their combination is also False
3. 1 < 2 and 4 < 3: the first statement (1 < 2) is True, while the second statement (4 < 3) is False;
since both statements have to be True, combining them with the and keyword gives us False
4. 2 < 1 and 3 < 4: the first statement (2 < 1) is False, while the second statement (3 < 4) is True;
again, since both statements have to be True, combining them with the and keyword gives us
False
It might seem counter-intuitive that True and False is False, but think back to how we use this term
in English; the following phrase, taken in its entirety, is of course false: "cats have tails and the moon
is made of cheese."
The or keyword means that at least one value must be true. When we say the word "or" in everyday
conversation, sometimes we mean an "exclusive or" - this means that only the first option or the
second option can be true. We use an "exclusive or" when we say a phrase such as, "I can stay or I
can go." I can't do both - only one of these options can be true.
However, in programming we use an "inclusive or" since we also want to include the possibility that
both values are true. For instance, if we said the phrase, "we can have ice cream or we can have
cake," we also want the possibility of ice cream and cake, so we use an "inclusive or" to mean "either
ice cream, or cake, or both ice cream and cake."
Again, we can try this concept out in Python by using some numerical expressions:
>>> 1 < 2 or 3 < 4
True
>>> 2 < 1 or 4 < 3
False
>>> 1 < 2 or 4 < 3
True
>>> 2 < 1 or 3 < 4
True
>>>
If any part of our expression is True, even if both parts are True, the result will also be True. We can
summarize these results as follows:
Combination using or    Result
True or True            True
True or False           True
False or True           True
False or False          False
Finally, as you would expect, the not keyword simply reverses the truth of a single statement:
Effect of using not     Result
not True                False
not False               True
We can now combine these keywords with the boolean comparators that we learned in the previous section to compare much more complicated expressions. For instance, here's a somewhat more
involved example:
When working through complicated expressions, the best strategy is to start with the most complicated part (or parts) of the expression and build outward from there. For instance, try evaluating
this example:
("A" != "A") or not (2 >= 3)
We can break this expression down into two sections, then combine them:
1. We'll start with the expression on the left side of the or keyword
2. We know that "A" == "A" is True, therefore
3. "A" != "A" is False, and the left side of our expression is False
4. Now on the right side of the or, 2 >= 3 is False, therefore not (2 >= 3) is True
5. So our entire expression simplifies to False or True
6. False or True evaluates to be True
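The same breakdown can be checked in Python by storing each side in its own variable. This is a sketch that assumes A stands for the string "A"; the names left, right and result are made up for illustration:

```python
left = "A" != "A"       # False, because "A" == "A" is True
right = not (2 >= 3)    # 2 >= 3 is False, so not (2 >= 3) is True
result = left or right  # False or True

print(left)
print(right)
print(result)

# The one-line form of the expression gives the same answer
assert result == (("A" != "A") or not (2 >= 3))
```

Working from the pieces outward like this is a handy way to check your reasoning on any expression that is hard to read at a glance.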
Note that we didn't have to use parentheses to express either of these examples. However, it's usually
best to include parentheses as long as they help to make a statement more easily readable.
NOTE: You should feel very comfortable using various combinations of these keywords
and boolean comparisons. Play around in the interactive window, creating more complicated expressions and trying to figure out what the answer will be before checking
yourself, until you are confident that you can decipher boolean logic.
Review exercises:
1. Figure out what the result will be (True or False) when evaluating the following expressions,
then type them into the interactive window to check your answers:
(1 <= 1) and (1 != 1)
not (1 != 2)
if 2 + 2 == 4:
    print "2 and 2 is 4"
    print "Arithmetic works."
Just like when we created for and while loops, when we use an if statement, we have a test condition
and an indented block of text that will run or not based on the results of that test condition. Here,
our test condition (which comes after the if keyword) is 2 + 2 == 4. Since this expression is True, our
if statement evaluates to be True and all of the code inside of our if statement is run, displaying the
two lines. We happened to use two print statements, but we could really put any code inside the if
statement.
If our test condition had been False (for instance, 2 + 2 != 4), nothing would have been displayed
because our program would have jumped past all of the code inside this if block after finding that
the if statement was False.
There are two other related keywords that we can use in combination with the if keyword: else and
elif. We can add an else statement after an if statement like so:
if 2 + 2 == 4:
    print "2 and 2 is 4"
    print "Arithmetic works."
else:
    print "2 and 2 is not 4"
    print "Big Brother wins."
The else statement doesn't have a test condition of its own because it is a catch-all block of code;
our else is just a synonym for "otherwise." So if the test condition of the if statement had been False,
the two lines inside of the else block would have run instead. However, since our test condition (2 +
2 == 4) is True, the section of code inside the else block is not run.
The elif keyword is short for "else if" and can be used to add more possible options after an if statement. For instance, we could combine an if, an elif, and an else in a single script like this:
num = 15
if 1 < 2:
    print "1 is less than 2"
elif 3 < 4:
    print "3 is less than 4"
else:
    print "Who moved my cheese?"
The first test condition (1 < 2) is True, so we print the first line inside the if block. Since we already
saw one True statement, however, the program doesn't even bother to check whether the elif or the
else blocks should be run; these are only alternative options. So, even though it's True that 3 is less
than 4, we only ever run the first print statement.
Just like with for and while loops, we can nest if statements inside of each other to create more
complicated paths:
want_cake = "yes"
have_cake = "no"

if want_cake == "yes":
Review exercises:
1. Write a script that prompts the user to enter a word using the raw_input() function, stores
that input in a string object, and then displays whether the length of that string is less than 5
characters, greater than 5 characters, or equal to 5 characters by using a set of if, elif and else
statements.
>>>
Enter a positive integer: 12
1 is a divisor of 12
2 is a divisor of 12
3 is a divisor of 12
4 is a divisor of 12
6 is a divisor of 12
12 is a divisor of 12
>>>
You should use the % operator to check divisibility. This is called the modulus operator and is
represented by a percent symbol in Python. It returns the remainder of any division. For instance, 3
goes into 16 a total of 5 times with a remainder of 1, therefore 16 % 3 returns 1. Meanwhile, since 15
is divisible by 3, 15 % 3 returns 0.
Also keep in mind that raw_input() returns a string, so you will need to convert this value to an
integer before using it in any calculations.
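The two remainders mentioned above, plus a single divisibility check, can be sketched as follows. The values 12 and 4 and the variable names are hypothetical stand-ins, with the string "12" playing the role of what raw_input() would return:

```python
print("16 % 3 = {}".format(16 % 3))  # 3 goes into 16 five times, remainder 1
print("15 % 3 = {}".format(15 % 3))  # 15 is divisible by 3, remainder 0

number = int("12")  # convert the string to an integer before doing arithmetic
candidate = 4

# A number divides another evenly exactly when the remainder is 0
is_divisor = (number % candidate == 0)
print(is_divisor)
```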
There are two main keywords that help to control the flow of programs in Python: break and continue.
These keywords can be used in combination with for and while loops and with if statements to allow
us more control over where we are in a loop.
Let's start with break, which does just that: it allows you to break out of a loop. If we wanted to break
out of a for loop at a particular point, our script might look like this:
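A sketch of such a script, assuming a loop over range(0, 4). That range is a guess, chosen because it matches the results quoted in this section: i ends at 3 when the loop runs to completion and at 2 when the break is reached:

```python
for i in range(0, 4):
    if i == 2:
        break  # leave the for loop immediately
print("Finished with i = {}".format(i))
```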
Without the break, the loop would run through every value of i and display:
>>>
Finished with i = 3
>>>
Instead, each time we run through the loop, we are checking whether i == 2. When this is True, we
break out of the loop; this means that we quit the for loop entirely as soon as we reach the break
keyword. Therefore, this code will only display:
>>>
Finished with i = 2
>>>
Notice that we still displayed the last line since this wasn't part of the loop. Likewise, if we were in a
loop inside another loop, break would only break out of the inner loop; the outer loop would continue
to run (potentially landing us back inside the inner loop again).
Much like break, the continue keyword jumps to the end of a loop; however, instead of exiting a loop
entirely, continue says to go back to the top of the loop and continue with the next item of the loop.
For instance, if we had used continue instead of break in our last example:
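A sketch of that variation, again assuming a loop over range(0, 4). The visited list is an addition for illustration only; it records which values of i made it past the continue:

```python
visited = []
for i in range(0, 4):
    if i == 2:
        continue  # jump back to the top of the loop for the next value of i
    visited.append(i)  # skipped only on the pass where i == 2
print("Finished with i = {}".format(i))
```

After the loop, visited holds 0, 1 and 3: the pass for i == 2 was cut short, but the loop itself carried on to the end.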
>>>
Finished with i = 3
>>>
NOTE: It's always a good idea to give short but descriptive names to your variables
that make it easy to tell what they are supposed to represent. The letters i, j and k are
exceptions because they are so common in programming; these letters are almost always
used when we need a throwaway number solely for the purpose of keeping count while
working through a loop.
Loops can have their own else statements in Python as well, although this structure isn't used very
frequently. Tacking an else onto the end of a loop will make sure that your else block always runs
after the loop unless the loop was exited by using the break keyword. For instance, let's use a for loop
to look through every character in a string, searching for the upper-case letter X:
Here, our program attempted to find "X" in the string phrase, and would break out of the for loop if
it had. Since we never broke out of the loop, however, we reached the else statement and displayed
that "X" wasn't found. If you try running the same program on the phrase "it marks X the spot" or
some other string that includes an "X", however, there will be no output at all because the block of
code in the else will not be run.
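The for/else pattern just described can be sketched as follows. The sample phrase, the message text and the found flag are assumptions added for illustration:

```python
phrase = "it marks the spot"  # assumed sample text with no upper-case X in it

found = False
for letter in phrase:
    if letter == "X":
        found = True
        break
else:
    # Runs only because the loop finished without ever reaching break
    print("There was no 'X' in the phrase")
```

Note that the else lines up with the for keyword, not with the if inside the loop; that alignment is what makes it belong to the loop.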
Likewise, an else placed after a while loop will always be run unless the while loop has been exited
using a break statement. For instance, try out the following script:
tries = 0
Review exercises:
1. Use a break statement to write a script that prompts the user for input repeatedly, only ending
when the user types "q" or "Q" to quit the program; a common way of creating an infinite loop
is to write while True:
2. Combine a for loop over a range() of numbers with the continue keyword to print every number
from 1 through 50 except for multiples of 3; you will need to use the % operator.
try:
    number = int(raw_input("Enter an integer: "))
except ValueError:
    print "That was not an integer."
The first thing that happens in a try/except pair is that everything inside of the try block is run
normally. If no error occurs, the code skips over the except block and continues running normally.
Since we said except ValueError, however, if the program encounters a ValueError (when the user
enters something that isn't an integer), we jump down to the except block and run everything there.
This avoids Python's automatic error display and doesn't break the script since we have caught the
ValueError.
If a different kind of exception had occurred, then the program still would have broken; we only
handled one type of exception (a ValueError) with our except block.
A single except block can handle multiple types of exceptions by separating the exception names
with commas and putting the list of names in parentheses:
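A hedged sketch of a divide() function along these lines. The parameter names, the message text and the choice to return None on failure are assumptions; only the function name and the two exception types come from the description in this section:

```python
def divide(num1, num2):
    try:
        result = num1 / num2
    except (TypeError, ZeroDivisionError):
        # One except block handles both exception types listed in the tuple
        print("Encountered a problem")
        return None
    return result


print(divide(10.0, 4))    # ordinary division
print(divide(10.0, 0))    # ZeroDivisionError is caught
print(divide(10.0, "a"))  # TypeError is caught
```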
This isn't used very frequently since we usually want our code to react to each type of
exception differently. In this case, we created a divide() function that tries to divide two numbers.
If one or both of the numbers aren't actually numbers, a TypeError exception will occur because we
can't use something that isn't a number for division. And if we provide 0 as the second number, a
ZeroDivisionError will occur since we can't divide by zero. Both of these exceptions will be caught
by our except block, which will just let the user know that we encountered a problem and continue
on with anything else that might be left in our script outside of the try/except pair.
More except error handling blocks can be added after the first except to catch different types of
exceptions, like so:
try:
    number = int(raw_input("Enter a non-zero integer: "))
    print "10 / {} = {}".format(number, 10.0/number)
except ValueError:
    print "You did not enter an integer."
except ZeroDivisionError:
    print "You cannot enter 0."
Here, we might have encountered two different types of errors. First, the user might not have input
an integer; when we try to convert the string input using int(), we raise an exception and jump to
the except ValueError: block, displaying the problem. Likewise, we could have tried to divide 10
by the user-supplied number and, if the user gave us an input of 0, we would have ended up with a
ZeroDivisionError exception; instead, we jump to the second except block and display this problem
to the user.
A list of Python's built-in exceptions can be found here. It's usually easiest to figure out the name
of an exception by purposefully causing the error to occur yourself, although you should then read
the documentation on that particular type of exception to make sure that your code will actually
handle all of the errors that you expect and (just as importantly) that your program will still break if
it encounters a different, unexpected type of exception.
We can also use an except block by itself without naming specific exceptions to catch:
try:
    # do lots of hazardous things that might break
except:
    print "The user must have screwed something up."
However, this is dangerous to do and is usually not a good idea at all. It's easy to hide a poorly written
section of code behind a try/except and think that everything was working fine, only to discover
later that you were silencing all sorts of unexpected problems that should never have occurred.
Review exercises:
1. Write a script that repeatedly asks the user to input an integer, displaying a message to try
again by catching the ValueError that is raised if the user did not enter an integer; once the
user enters an integer, the program should display the number back to the user and end without
crashing
print randint(0, 1)
If you try this within the interactive window, you should see something like this:
while randint(0, 1) == 0:
Now all we have to do is keep track of our counts of heads and tails. Since we only want the average,
we can sum them all up over all our trials, so our full script ends up looking like the following:
heads = 0
tails = 0
Review exercises:
1. Write a script that uses the randint() function to simulate the toss of a die, returning a random
number between 1 and 6
2. Write a script that simulates 10,000 throws of dice and displays the average number resulting
from these tosses
If you just want to check whether or not your final answer is correct without looking at
the sample code, click here.
Chapter 9
Fundamentals: Lists and Dictionaries
Make and update lists
Lists are extremely useful, both in life and in Python. So many things naturally lend themselves to being put into lists that a list is often a very intuitive way to store and order data. In Python, a
list is a type of object (just like a string or an integer), except a list object is able to hold other objects
inside of it. We create a list by simply listing all the items we want in the list, separated by commas, and
enclosing everything inside square brackets. These are all examples of simple list objects:
>>> animals = []
>>> animals.append("lion")
>>> animals.append("tiger")
>>> animals.append("frumious Bandersnatch")
>>> print animals
['lion', 'tiger', 'frumious Bandersnatch']
>>>
Likewise, we can remove objects from the list using the remove() method of the list:
>>> animals.remove("lion")
>>> animals.remove("tiger")
>>> print animals
['frumious Bandersnatch']
>>>
We can also use the list's index() method to get the index number of a particular item in a list in order
to determine its position. For instance:
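A minimal sketch of index(), assuming a small list of animal names like the one built earlier:

```python
animals = ["lion", "tiger", "frumious Bandersnatch"]

position = animals.index("tiger")    # positions are counted from 0
print(position)

# index() raises a ValueError when the item is not in the list
try:
    animals.index("jubjub")
except ValueError:
    print("jubjub is not in the list")
```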
Copying one list into another list, however, is somewhat unintuitive. You can't just reassign one list
object to another list object, because you'll get this (possibly surprising) result:
large_cats = animals[:]
The [:] is the same technique that we used to retrieve a subset of the list over some range, but since
we didn't specify an index number on either side of the colon, we grabbed everything from the first
item through the last item.
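The difference between reassigning a list and copying it with [:] can be sketched as follows. The name same_list is hypothetical; large_cats and animals follow the names used above.

```python
animals = ["lion", "tiger"]

same_list = animals        # a second name for the SAME list object
large_cats = animals[:]    # a new list holding the same items

animals.append("jubjub")

print(same_list)           # also shows the appended item
print(large_cats)          # unaffected by the append
```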
Keep in mind that because lists are mutable, there is no need to reassign a list to itself when we use
one of its methods. In other words, we only need to say animals.append("jubjub") to add "jubjub"
to the animals list. If we had said animals = animals.append("jubjub"), we would have saved the
result returned by the append() method (which is None) into animals, wiping out our list entirely.
This is true of all the methods that belong to mutable objects. For instance, lists also have a sort()
method that sorts all of the items in ascending order (usually alphabetical or numerical, depending
on the objects in the list); all we have to say if we want to sort the animals list is animals.sort(), which
alphabetizes the list:
>>> animals.sort()
>>> print animals
['Tigger', 'frumious Bandersnatch', 'jubjub', 'lion', 'tiger']
>>>
(Uppercase letters sort before lowercase letters, which is why 'Tigger' comes first.)
If we had instead assigned the value returned by the sort() method to our list, we would have lost the
list entirely:
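A short sketch of the pitfall, using hypothetical list contents:

```python
animals = ["tiger", "lion", "jubjub"]

result = animals.sort()    # sorts in place; the value returned is None
print(result)
print(animals)

# The built-in sorted() function returns a new list instead
ordered = sorted(["tiger", "lion", "jubjub"])
print(ordered)
```

Writing animals = animals.sort() would therefore leave animals holding None rather than a list.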
>>> two_by_two[1][0]
3
>>>
Since saying two_by_two[1] by itself would return the list [3, 4], we then had to specify an additional
index in order to get a single number out of this sub-list.
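As a sketch of the double indexing, assume two_by_two was defined as a list holding two inner lists. The exact contents are an assumption chosen to be consistent with the values mentioned above.

```python
two_by_two = [[1, 2], [3, 4]]    # assumed contents

row = two_by_two[1]              # the inner list [3, 4]
value = two_by_two[1][0]         # the first item of that inner list

print(row)
print(value)
```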
A list is just a sequence of objects, so nested lists don't have to be symmetrical:
>>> list = ["I heard you like lists", ["so I put a list", "in your list"]]
>>> print list
['I heard you like lists', ['so I put a list', 'in your list']]
>>>
Finally, if we want to create a list from a single string, we can use the string split() method as an
easy way of splitting one string up into individual list items by providing the character (or characters)
occurring between these items. For instance, if we had a single string of grocery items, each separated
by commas and spaces, we could turn them into a list of items like so:
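A minimal sketch of split(), with a hypothetical string of grocery items:

```python
groceries = "eggs, spam, pop rocks, cheese"    # hypothetical items

grocery_list = groceries.split(", ")    # split wherever ", " occurs
print(grocery_list)
print(len(grocery_list))
```

The separator itself is discarded, so none of the resulting items contain a comma or a leading space.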
Review exercises:
1. Create a list named desserts that holds the two string values "ice cream" and "cookies"
2. Sort the desserts in alphabetical order, then display the contents of the list
A furry horse
2. Choose random words from the appropriate list using the random.choice() function, storing
each choice in a new string. Select three nouns, three verbs, three adjectives, one adverb, and
two prepositions. Make sure that none of the words are repeated. (Hint: Use a while loop to
repeat the selection process until you get a new word.)
3. Plug the words you selected into the structure above to create a poem string by using the format() string method
4. Bonus: Make sure that the "A" in the title and the first line is adjusted to become an "An"
automatically if the first adjective begins with a vowel.
>>> myTuple[2]
'change'
>>> myTuple.index("me")
3
>>>
You probably won't create your own tuples very frequently, although we'll see some instances later
on when they become necessary. One place where we tend to see tuples is when a function returns
multiple values; in this case, we wouldn't want to accidentally change anything about those values or
their ordering, so the function provides them to us as a permanent list.
Parentheses are actually optional when we are creating a tuple; we can also just list out a set of objects
to assign to a new object, and Python will assume by default that we mean to create a tuple:
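A brief sketch of creating a tuple without parentheses. The two numbers match the coordinates values shown in this section; the name single is hypothetical.

```python
coordinates = 4.21, 9.29             # no parentheses needed
print(type(coordinates) is tuple)
print(coordinates[0])

single = 4.21,                       # a trailing comma makes a one-item tuple
print(len(single))
```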
>>> x, y = coordinates
>>> print x
4.21
>>> print y
9.29
>>>
We assigned both new variables to the tuple coordinates, separating the names with commas, and
Python automatically knew how to hand out the items in the tuple. In fact, we can always make
multiple assignments in a single line by separating the names with commas, whether or not we use
a tuple:
>>>
>>>
a
>>>
b
>>>
c
>>>
This works because Python is basically doing tuple packing and tuple unpacking on its own in the
background. However, we don't use this very frequently because it usually only makes code harder
to read and more difficult to update when changes are needed.
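The packing and unpacking described above can be sketched like this. All of the variable names are hypothetical.

```python
str1, str2, str3 = "a", "b", "c"    # packs a tuple, then unpacks it
print(str1)
print(str3)

# The same mechanism swaps two names without a temporary variable
left, right = "first", "second"
left, right = right, left
print(left)
```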
Review exercises:
1. Create a tuple named cardinal_nums that holds the strings "first", "second" and "third" in
order
2. Display the string at position 2 in cardinal_nums by using an index number
3. Copy the tuple values into three new strings named pos1, pos2 and pos3 in a single line of code
by using tuple unpacking, then print those string values
>>> phonebook["Jenny"]
'867-5309'
>>>
We can add entries to a dictionary by specifying the new key in square brackets and assigning it a
value:
which key we mean when we're trying to find the key's associated value. If a key is given a new value,
Python just overwrites the old value. For instance, perhaps Jenny got a new number:
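Adding a new key and overwriting an existing one use exactly the same syntax, as this small sketch suggests. The phone numbers are the ones that appear elsewhere in this section.

```python
phonebook = {"Jenny": "867-5309"}

phonebook["Obama"] = "202-456-1414"    # new key: an entry is added
phonebook["Jenny"] = "555-0199"        # existing key: the value is overwritten

print(len(phonebook))
print(phonebook["Jenny"])
```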
>>> del(phonebook["Destiny"])
>>> print phonebook
{'Mike Jones': '281-330-8004', 'Obama': '202-456-1414', 'Jenny': '555-0199'}
>>>
Often we will want to loop over all of the keys in a dictionary. We can get all of the keys out of a
dictionary in the form of a list by using the keys() method:
>>> for contactName in phonebook:
        print contactName, phonebook[contactName]
error to attempt to get a value for a key that doesn't exist in a dictionary:
>>> phonebook["Santa"]
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
phonebook["Santa"]
KeyError: 'Santa'
>>>
If we do want to access a dictionary's keys in their sorted order, we can use Python's sorted() function
to loop over the keys alphabetically:
>>> for contactName in sorted(phonebook):
        print contactName, phonebook[contactName]
Jenny 555-0199
Mike Jones 281-330-8004
Obama 202-456-1414
>>>
Keep in mind that sorted() doesn't re-sort the order of the actual dictionary; Python has to keep the
apparently haphazard ordering of keys in the dictionary in order to be able to access the dictionary
keys quickly using its own complicated hash function.
Dictionaries are very flexible and can hold a wide variety of information beyond the strings that we've
experimented with here. Although it's usually the case, dictionaries don't even have to hold keys or
values that are all the same type of object. Dictionary values can be anything, while keys must be
immutable objects. For instance, we could have added a key to our phonebook that was an integer
object. However, we couldn't have added a list as a key, since that list could be modified while it's in
the dictionary, breaking the overall structure of the dictionary.
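The rule about immutable keys can be checked with a quick sketch. The dictionary name lookup and its contents are hypothetical.

```python
lookup = {}
lookup[42] = "an integer key"                 # fine: integers are immutable
lookup[("Jenny", "cell")] = "a tuple key"     # fine: tuples are immutable

try:
    lookup[["Jenny", "cell"]] = "a list key"  # lists are mutable
except TypeError as error:
    print("Rejected: " + str(error))

print(len(lookup))
```

Python refuses the list with a TypeError, so only the first two entries end up in the dictionary.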
Dictionary values can even be other dictionaries, which is more common than it probably sounds. For
instance, we could imagine a more complicated phonebook in which every key is a unique contact
name that is associated with a dictionary of its own; these individual contact dictionaries could then
include keys describing the phone number (home, work, custom supplied type, etc.) that are each
associated with a phone number value. This is much like creating a list of lists:
>>>
When we wanted to retrieve a specific value inside the dictionary-value associated with the key for
Jenny, we had to say contacts["Jenny"]["cell"]. This is because saying contacts["Jenny"] now returns
the dictionary of numbers for Jenny, from which we have to specify a key for a specific type of phone
number.
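A sketch of such a nested phonebook follows. Which number is stored under which type is an assumption made for illustration.

```python
contacts = {
    "Jenny": {"cell": "555-0199", "home": "867-5309"},
    "Mike Jones": {"home": "281-330-8004"},
}

jenny_numbers = contacts["Jenny"]     # the inner dictionary for Jenny
print(jenny_numbers)
print(contacts["Jenny"]["cell"])      # one value from the inner dictionary
```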
Finally, there are two alternative ways to create dictionaries that can be useful in certain specific
contexts. You shouldn't worry about learning the details of how to use them right now, but just be
aware that they are a possibility.
When you want to use keys that are strings that only include letters and numbers (i.e., strings that
could stand for variable names), you can use dict() to create a dictionary like so:
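A minimal sketch of the keyword form of dict(), using hypothetical keys and values:

```python
simple_dictionary = dict(string1="value1", string2=2, string3=3.0)
print(simple_dictionary["string2"])

# The equivalent literal form, with the keys quoted explicitly
same_dictionary = {"string1": "value1", "string2": 2, "string3": 3.0}
print(simple_dictionary == same_dictionary)
```

Note that the keys are written without quotes inside dict(), but they still end up as ordinary strings.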
Review exercises:
1. Create an empty dictionary named birthdays
2. Enter the following data into the dictionary:
3. Write if statements that test to check if "Yoda" and "Darth Vader" exist as keys in the dictionary,
then enter each of them with the birthday value "unknown" if their name does not exist as a key
4. Display all the key-value pairs in the dictionary, one per line with a space between the name
and the birthday, by looping over the dictionary's keys
5. Delete "Darth Vader" from the dictionary
6. Bonus: Make the same dictionary by using dict() and passing in the initial values when you
first create the dictionary
Chapter 10
File Input and Output
Read and write simple files
So far, we've allowed the user to type input into our program and displayed output on the screen. But
what if we want to work with a lot of data? It's time to learn how to use Python to work with files.
To read or write raw text files (i.e., the sort of file that you could use in a basic text editor because
it contains no formatting or extra information), we use the general-purpose open() function. When
we open() a file, the first thing we have to determine is if we want to read from it or write to it. Let's
start by creating a new text file and writing some data into it:
myOutputFile.close()
Make sure you know where you are saving this script before you run it; since we didn't specify a
directory path for the file, right now hello.txt will be created in the same folder as the script.
NOTE: You should always use the close() method to close any file that you have opened with open()
once you're completely done with the file. Python will eventually close any open files
when you exit the program, but not closing files yourself can still cause surprising problems. This is because Python often buffers file output, meaning that it might save a
bunch of commands you've written (without running them right away), then run them
all in a big batch later on to make the process run faster. This could result in something
like the following unwanted situation: you write output to a file, then open that file up
in a text editor to view the output, but since you didn't close() the file in Python (and
IDLE is still running), the file is completely blank even though Python is planning to
write output to the file before it is closed.
After running this script, you should see a new file named hello.txt appear in the same folder as your
script; open the output file up yourself to check that it contains the line we wrote.
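The whole write-then-check cycle can also be sketched in code. This version is an illustration only: it writes into a scratch folder created with the tempfile module (an assumption, so that no real files are touched), and the text written is a placeholder.

```python
import os
import tempfile

folder = tempfile.mkdtemp()                  # scratch folder for the demo
path = os.path.join(folder, "hello.txt")

my_output_file = open(path, "w")             # "w" creates or overwrites
my_output_file.write("This is my first file.")   # placeholder text
my_output_file.close()                       # flushes any buffered output

my_input_file = open(path, "r")
contents = my_input_file.read()
my_input_file.close()
print(contents)
```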
The writelines() method can also take a list of lines to be written all at once. The lines will be written
one after the other without a new line, so we have to specify the special newline character \n if we
actually want the lines to appear on separate lines. Let's modify the script to write a couple lines from
a list:
myOutputFile.writelines(linesToWrite)
myOutputFile.close()
Without deleting the previous hello.txt file, try running this version of the script, then check the
contents of the file. This is an important lesson that's easy to forget:
WARNING: As soon as you open() a file in "w" (write) mode, if the file already exists
then the file's current contents are completely deleted. It's a common mistake to accidentally overwrite or delete the contents of an important file this way.
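The overwriting behavior is easy to confirm with a sketch. The scratch folder from tempfile is an assumption, and the list items are inferred from the file contents shown later in this section.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.txt")

my_output_file = open(path, "w")
my_output_file.write("old contents")
my_output_file.close()

lines_to_write = ["This is my file.",
                  "\nThere are many like it,",
                  "\nbut this one is mine."]
my_output_file = open(path, "w")     # the existing contents are discarded here
my_output_file.writelines(lines_to_write)
my_output_file.close()

my_input_file = open(path, "r")
lines = my_input_file.readlines()
my_input_file.close()
print(lines)
```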
If we want to add information to a file instead of overwriting its contents, we can use the "a" mode
to append to the end of the file. The rest of the process is identical to writing in "w" mode; the only
difference is that we start writing at the end of the file. Again, if we want the new output to appear on
a new line, we have to specify the \n character to move to a new line. Let's append one additional
line onto our current file hello.txt:
myOutputFile.writelines(nextLine)
myOutputFile.close()
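A self-contained sketch of appending, again using a scratch folder from tempfile (an assumption) and an assumed value for the appended line:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.txt")

my_output_file = open(path, "w")
my_output_file.write("but this one is mine.")
my_output_file.close()

next_line = ["\nNON SEQUITUR"]        # the leading \n starts a new line
my_output_file = open(path, "a")      # "a" keeps the existing contents
my_output_file.writelines(next_line)
my_output_file.close()

my_input_file = open(path, "r")
lines = my_input_file.readlines()
my_input_file.close()
print(lines)
```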
Now that we have a file written, let's read the data from it. You can probably guess how this goes by
now:
print myInputFile.readlines()
myInputFile.close()
This time, we used the "r" mode to read input from the file. We then used the readlines() method to
return every line of the file, which are displayed like so:
>>>
['This is my file.\n', 'There are many like it,\n', 'but this one is mine.\n',
'NON SEQUITUR']
>>>
The output is returned in the form of a list, and all of the line breaks were visible to us as printed
newline characters. One common way of working with a file in its entirety is to use a for loop:
print line,
myInputFile.close()
NOTE: Notice how we ended our print statement with a comma; this is because any
print statement will usually add a new line to the end of the line of output. The extra
comma stops the print statement from adding this automatic \n so that the next print
statement will continue to display on the same line. Since our file already has new line
characters, adding extra new lines would have made the file output display incorrectly,
with a blank line appearing in between each actual line of the file.
In Python 3, you can remove the automatic \n (or change it to something different) by specifying the
end parameter of the print() function like so:
print(line, end="")
In this case, we supplied empty quotes to get rid of the line break, but we could have put (for instance)
an extra line break after every line by passing end="\n\n" to the print() function.
We can also read lines from the file one at a time using the readline() method. Python will keep track
of where we are in the file for as long as we have it open, returning the next available line in the file
each time readline() is called:
line = myInputFile.readline()
print line,
line = myInputFile.readline()
myInputFile.close()
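The readline() pattern can be sketched as a complete loop. This is one possible arrangement, not necessarily the original script: it assumes a small file created in a scratch folder, and it strips the trailing newline before printing so that the code behaves the same under Python 2 and Python 3.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.txt")
my_output_file = open(path, "w")
my_output_file.write("This is my file.\nThere are many like it,\nbut this one is mine.")
my_output_file.close()

my_input_file = open(path, "r")
line = my_input_file.readline()
count = 0
while line != "":                   # readline() returns "" at the end of the file
    count += 1
    print(line.rstrip("\n"))
    line = my_input_file.readline()
my_input_file.close()
```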
There is an additional shortcut that can be helpful in organizing code when working with files: using
Python's with keyword. Using with to read our file, we could say:
print line,
Compare this code carefully to the two previous examples. When we say with X as Y we are defining
the variable Y to be the result of running X. This begins a block of code where we can use our new
variable as usual (in this case, myInputFile). The added benefit of using the with keyword is that
we no longer have to worry about closing the file; once our with block is finished, this clean-up
work will be managed for us automatically.
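The automatic clean-up can be verified through the file object's closed attribute. A minimal sketch, assuming a small file in a scratch folder:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "hello.txt")
with open(path, "w") as my_output_file:
    my_output_file.write("This is my file.\nThere are many like it,")

with open(path, "r") as my_input_file:
    lines = my_input_file.readlines()
    print(my_input_file.closed)     # still open inside the block

print(my_input_file.closed)         # closed once the block has finished
```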
In fact, we can name multiple variables in a with statement if we want to open multiple files at once.
For instance, if we wanted to read hello.txt in and write its contents out into a new file hi.txt line-by-line, we could simply say:
myOutput.write(line)
This will take care of all the clean-up work for us, closing both files once we exit the block of code
inside the with statement. (Of course, practically speaking there's an easier way to accomplish this
particular task; the shutil module includes many helpful functions including copy(), which can be
used to copy an entire file into a new location.)
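Both approaches can be sketched side by side. The file names hello.txt and hi.txt come from the text above; the scratch folder and the third file name are assumptions.

```python
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()
source = os.path.join(folder, "hello.txt")
line_copy = os.path.join(folder, "hi.txt")
whole_copy = os.path.join(folder, "hi_again.txt")    # hypothetical name

with open(source, "w") as my_output:
    my_output.write("This is my file.\nThere are many like it,\n")

# Copy line by line with two files open in one with statement
with open(source, "r") as my_input, open(line_copy, "w") as my_output:
    for line in my_input.readlines():
        my_output.write(line)

# Copy the entire file in a single call
shutil.copy(source, whole_copy)

with open(line_copy, "r") as first, open(whole_copy, "r") as second:
    same = first.read() == second.read()
print(same)
```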
The rest of the material in this section is conceptually more complicated and usually isn't necessary
for most basic file reading/writing tasks. Feel free to skim this remaining material for now and come
back to it if you ever find that you need to read or write to a file in a way that involves specific parts of
lines rather than taking entire lines from a file one by one.
If we want to visit a specific part of the file, we can use the seek() method to jump a particular number
of characters into the file. For instance:
myInputFile.close()
Run this script, then follow along with the output as you read the description, since it's not the most
intuitive method to use. When we provide a single number to seek(), it will go to the character in the
file with that index number, regardless of where we currently are in the file. Thus, seek(0) always gets
us back to the beginning of the file, and seek(8) will always place us at the character at index position
8, regardless of what we have done previously. When we provide a second argument of 1 to seek(),
as in the last example, we are moving forward (or backward, if we use a negative number) relative to
where we currently are in the file. In this case, after we displayed line 0 starting at its 9th character,
we were currently at the beginning of line 1. Calling seek(10, 1) then moved us 10 characters ahead
in line 1. Clearly, this sort of seeking behavior is only useful in very specific cases.
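Absolute and relative seeking can be sketched on a tiny file whose contents make the positions easy to follow. Two assumptions apply: the file is opened in binary mode ("rb"), because Python 3 only allows relative seeks on binary files, and read(n) is used to fetch exactly n characters.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "positions.txt")
my_output_file = open(path, "wb")
my_output_file.write(b"0123456789abcdefghij")
my_output_file.close()

my_input_file = open(path, "rb")
my_input_file.seek(8)               # absolute: jump to index 8
first = my_input_file.read(2)       # reads the characters at indexes 8 and 9

my_input_file.seek(0)               # absolute: back to the beginning
start = my_input_file.read(1)       # reads index 0, leaving us at index 1

my_input_file.seek(10, 1)           # relative: 10 ahead of index 1
relative = my_input_file.read(1)    # reads the character at index 11
my_input_file.close()

print(first)
print(start)
print(relative)
```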
Although it's less commonly used, it is possible to open a file for both reading and writing. You may
have already guessed why this is usually not a good idea, though; it's typically very difficult to keep
track of where you are in a particular file using seek() in order to decide which pieces you want to
read or write. We can specify the mode "r+" to allow for both reading and writing, or "a+" to both
read and append to an existing file. Since writing or appending will change the characters in the file,
however, you will need to perform a new seek() whenever switching modes from writing to reading.
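A small sketch of "r+" mode, assuming a one-line file in a scratch folder:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "greeting.txt")
my_file = open(path, "w")
my_file.write("Hello world")
my_file.close()

my_file = open(path, "r+")    # read and write; the file must already exist
my_file.write("J")            # overwrites the first character in place
my_file.seek(0)               # reposition before switching to reading
contents = my_file.read()
my_file.close()
print(contents)
```

Unlike "w" mode, "r+" does not empty the file when it is opened, so the rest of the original line survives.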
Review exercises:
1. Read in the raw text file poem.txt from the chapter 7 practice files and display each line by
looping over them individually, then close the file; we'll discuss using file paths in the next
section, but for now you can save your script in the same folder as the text file
2. Repeat the previous exercise using the with keyword so that the file is closed automatically
after you're done looping through the lines
3. Write a text file output.txt that contains the same lines as poem.txt by opening both files at
the same time (in different modes) and copying the original file over line-by-line; do this using
a loop and closing both files, then repeat this exercise using the with keyword
4. Re-open output.txt and append an additional line of your choice to the end of the file on a
new line
import os
of typing it out each time you want to access a sample file. We will then join this path to the rest of
each file location using the os.path.join() function, as we'll see below.
For instance, if you wanted to display the full contents of the example text file named example.txt
in the chapter 7 practice files folder, the sample code would look like the following:
import os
print line,
Again, the string on the second line that represents the path to the Practice files folder might need to
be changed if you saved the course files in a different location.
Notice how we used os.path.join() as a way of adding the full file path onto the main directory by
passing the two parts of the path as arguments to this function. This just combines the two strings
together, making sure that the right number of slashes is included in between the two parts. Instead
of using os.path.join(), we could have simply added (concatenated) the string path to the rest of the
file path by using a plus sign and adding an extra forward slash between the two strings like this:
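A sketch comparing the two approaches follows. The base folder mirrors the course folder layout mentioned in this chapter, but treat the exact path as an assumption.

```python
import os

path = "C:/Real Python/Course materials"             # assumed base folder
file_part = "Chapter 7/Practice files/example.txt"

joined = os.path.join(path, file_part)               # separator added for us
concatenated = path + "/" + file_part                # separator added by hand

print(joined)
print(concatenated)

# os.path.join() does not double up a slash that is already there
print(os.path.join("images/", "example.txt"))
```

The separator that os.path.join() inserts depends on the operating system, so the first two lines of output may differ by one slash on Windows.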
Let's start modifying files using a basic practical example: we want to rename every .GIF file in a
particular folder to be a .JPG file of the same name. In order to get a list of the files in the folder, we
can use the os.listdir() function, which returns a list of file names found in the provided directory.
We can use the string method endswith() to check the file extension of each file name. Finally, we'll
use os.rename() to rename each file:
import os
fileNamesList = os.listdir(myPath)
os.rename(fullFileName, newFileName)
Since endswith() is case-sensitive, we had to convert fileName to lowercase using the lower()
method; since this method returns a string as well, we just stacked one method on top of another
in the same line. We used subscripting to replace the file extension in the line newFileName =
fullFileName[0:len(fullFileName)-4] by trimming the last four characters (the .gif) off of the full
file name, then we added the new .jpg extension instead. Our os.rename() function took two
arguments, the first being the full original file name and the second being the new file name.
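The steps described above can be sketched end to end. This is a reconstruction under stated assumptions rather than the original script: it works in a scratch folder created with tempfile, and the three file names are hypothetical.

```python
import os
import tempfile

my_path = tempfile.mkdtemp()                     # scratch folder for the demo
for name in ["one.gif", "TWO.GIF", "notes.txt"]:
    open(os.path.join(my_path, name), "w").close()

for file_name in os.listdir(my_path):
    if file_name.lower().endswith(".gif"):       # case-insensitive check
        full_file_name = os.path.join(my_path, file_name)
        # trim the last four characters, then add the new extension
        new_file_name = full_file_name[0:len(full_file_name) - 4] + ".jpg"
        os.rename(full_file_name, new_file_name)

print(sorted(os.listdir(my_path)))
```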
A more efficient way of performing this same task would be to import and use the glob module, which
serves the purpose of helping to match patterns in file names. The glob.glob() function takes a string
that uses wildcard characters, then returns a list of all possible matches. In this case, if we provide
the file name pattern *.gif then we will be able to find any file names that match the .gif extension
at the end:
import glob
import os
os.rename(fullFileName, newFileName)
By providing a string with the full file path and a *, we were able to return a glob list of all possible
GIF images in that particular directory. We can also use glob() to search through subfolders - for
instance, if we wanted to search for all of the PNG files that are in folders inside of our images folder,
we could search using the string pattern:
import glob
import os
print fileName
Adding the string "*/*.png" to the path means that we are searching for any files ending in .png
that are inside of folders that can have any name (the first *). Since we used the forward slash to
separate the last folder, however, we will only be searching in subfolders of the images directory.
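Both patterns can be tried out on a throwaway folder. In this sketch the folder layout and file names are hypothetical, and the folder itself comes from tempfile.

```python
import glob
import os
import tempfile

my_path = tempfile.mkdtemp()
os.mkdir(os.path.join(my_path, "additional files"))
for relative_name in ["top.gif", "top.png",
                      os.path.join("additional files", "inner.png")]:
    open(os.path.join(my_path, relative_name), "w").close()

gifs = glob.glob(os.path.join(my_path, "*.gif"))            # this folder only
nested_pngs = glob.glob(os.path.join(my_path, "*/*.png"))   # one level down

print([os.path.basename(name) for name in gifs])
print([os.path.basename(name) for name in nested_pngs])
```

The PNG file sitting directly in the top folder is not matched by the second pattern, since the first * has to match a folder name.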
NOTE: When you print file paths generated by Python, you may notice that some of
them include backslashes. In fact, Python is automatically adding two backslashes everywhere, but only the second backslash of each pair is displayed. This is because the first
backslash is still acting as an escape character, then the second backslash tells Python
that we do in fact want just a backslash character. We could also have specified our
own paths this way, for instance: myPath = "C:\\Real Python\\Course materials\\Chapter
7\\Practice files\\images". Although this will display the string correctly (try it out), if we ever use the string
in a way that causes Python to reinterpret each of the backslashes in the file path as a
double backslash, we'll end up with an invalid path full of doubled double backslashes.
This is why it's a good idea to stick with using either forward slashes or raw strings for
path names.
Another special pattern-matching character that can be included in a glob pattern is a ? to stand
for any one single character; for instance, searching for anything matching ??.gif would only return
GIF files that have a name that is two characters long. We can also include ranges to search over by
putting them in square brackets; the pattern [0-9] will match any single digit from 0 through
9, and the pattern [a-z] will match any single lowercase letter. For instance, if we wanted to search for any
GIF files that have the name image followed specifically by two digits, we could pass the pattern
image[0-9][0-9].gif to glob.
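A quick sketch of the single-character and range patterns, using hypothetical file names in a scratch folder:

```python
import glob
import os
import tempfile

my_path = tempfile.mkdtemp()
for name in ["ab.gif", "abc.gif", "image07.gif", "image7.gif", "imageAB.gif"]:
    open(os.path.join(my_path, name), "w").close()

two_characters = glob.glob(os.path.join(my_path, "??.gif"))
numbered = glob.glob(os.path.join(my_path, "image[0-9][0-9].gif"))

print([os.path.basename(name) for name in two_characters])
print([os.path.basename(name) for name in numbered])
```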
Keep in mind that listdir() returns a list of all files and folders in a given folder. Therefore, if we
had wanted to affect every file in the image folder, then we would want to be careful not to affect
folder names as well. We can check this easily by using either os.path.isfile() or os.path.isdir(), both
of which return True or False. For instance, if we wanted to add the string " folder" to the end of
each folder name inside the images folder but not affect any of the files in images, we could do the
following:
import os
filesAndFolders = os.listdir(myPath)
if os.path.isdir(fullPath):
import os
list of files within that folder. We used tuple unpacking in the outer for loop in order to get every
possible combination of (currentFolder, subfolders, fileNames) to loop over, where currentFolder
might actually represent the images folder or a subfolder of images. In this case, we don't care about
any of the results from subfolders since looping through each of the fileNames and joining them to
currentFolder will give us the full path to every file.
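The loop described above can be sketched with os.walk(). The folder layout here is hypothetical and lives in a scratch folder from tempfile.

```python
import os
import tempfile

my_path = tempfile.mkdtemp()
os.mkdir(os.path.join(my_path, "additional files"))
open(os.path.join(my_path, "top.png"), "w").close()
open(os.path.join(my_path, "additional files", "inner.png"), "w").close()

found = []
for current_folder, subfolders, file_names in os.walk(my_path):
    for file_name in file_names:
        # joining the folder and the name gives the full path to each file
        found.append(os.path.join(current_folder, file_name))

print(sorted(os.path.basename(name) for name in found))
```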
Although we've covered the most common cases, there are many additional functions belonging to
both the os module and the os.path module that can be used in various ways for accessing and modifying files and folders. In the assignment below, we'll practice a couple more of these functions:
deleting files and folders by passing them to the os.remove() function and getting the size of a file
in bytes by passing it to the os.path.getsize() function.
Review exercises:
1. Display the full paths of all of the files and folders in the images folder by using os.listdir()
2. Display the full paths of any PNG files in the images folder by using glob.glob()
3. Rename any PNG files in the images folder and its subfolders to be JPG files by using os.walk();
in case you mess things up beyond repair, there is a copy of the images folder in the backup
folder
4. Make sure that your last script worked by using os.path.exists() to check that the files
"png file - not a gif.jpg" and "/additional files/one last image.jpg" now exist (by providing
os.path.exists() with the full path to each of these files)
Python has a built-in csv module that makes it nearly as easy to read and write CSV files as any
other sort of text file. Let's start with a basic example and read in our wonka.csv file, then display its
contents:
import csv
import os
myFileReader = csv.reader(myFile)
print row
We opened a file just as we've done before, but this time we chose "rb" mode, which stands for "read
binary". The additional binary part is important here, because CSV files (despite their appearance)
aren't saved in the same way as raw text files. (Specifically, on Windows we will end up saving extra
newline characters after every line if we only specify "r" mode.) In Python 3, you should still use "r"
mode instead of "rb" for reading CSV files.
We then created a CSV file reader using csv.reader() and passed it the file. Notice that we had to pass
the actual opened file object to csv.reader(), not just the file name. From there, we can easily loop
over the rows of data in this CSV reader object, which are each displayed as a list of strings:
>>>
['First name', 'Last name', 'Reward']
['Charlie', 'Bucket', 'golden ticket, chocolate factory']
['Veruca', 'Salt', 'squirrel revolution']
['Violet', 'Beauregarde', 'fruit chew']
>>>
Much like with readline() versus readlines(), there is also a next() method that gets only the next row
of data from a CSV reader object. This method is usually used as a simple method of skipping over
a row of header data; for instance, if we wanted to read in and store all the information except the
first line of our CSV file, we could add the line next(myFileReader) after opening the CSV file to skip
over the first line, then loop through the remaining rows as usual.
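Reading rows and skipping the header with next() can be sketched as follows. Assumptions: the code targets Python 3, where files are opened in text mode with newline="" as the csv documentation recommends, and it builds a small stand-in for wonka.csv in a scratch folder.

```python
import csv
import os
import tempfile

# Build a small stand-in for wonka.csv
path = os.path.join(tempfile.mkdtemp(), "wonka.csv")
with open(path, "w", newline="") as my_file:
    my_file.write("First name,Last name,Reward\n")
    my_file.write('Charlie,Bucket,"golden ticket, chocolate factory"\n')
    my_file.write("Veruca,Salt,squirrel revolution\n")

with open(path, "r", newline="") as my_file:
    my_file_reader = csv.reader(my_file)
    header = next(my_file_reader)            # consume the header row
    rows = [row for row in my_file_reader]   # every remaining row

print(header)
print(rows)
```

The quoted field keeps its comma, which is why the first data row still comes back as three strings rather than four.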
If we know what fields to expect from the CSV ahead of time, we can even unpack them from each
row into new variables in a single step:
import csv
import os
myFileReader = csv.reader(myFile)
next(myFileReader)
After skipping the first header row with the next() function, we assigned the three values in each row
to the three separate strings firstName, lastName and reward, which we then used inside of the for
loop, generating this output:
>>>
['First name', 'Last name', 'Reward']
Charlie Bucket got: golden ticket, chocolate factory
Veruca Salt got: squirrel revolution
Violet Beauregarde got: fruit chew
>>>
The first line of this output was generated by the call to the next() function rather than a print statement of ours.
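The unpacking step can be sketched without a file at all, since csv.reader() accepts any iterable that yields one line of text at a time. The list of lines below stands in for the opened file, and the variable names are hypothetical.

```python
import csv

lines = [
    "First name,Last name,Reward",
    'Charlie,Bucket,"golden ticket, chocolate factory"',
    "Veruca,Salt,squirrel revolution",
]

my_file_reader = csv.reader(lines)
header = next(my_file_reader)               # skip the header row

messages = []
for first_name, last_name, reward in my_file_reader:
    messages.append("{} {} got: {}".format(first_name, last_name, reward))

for message in messages:
    print(message)
```

Unpacking like this only works when every row has exactly three fields; a row of any other length would raise a ValueError.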
The commas in CSV files are called delimiters because they are the character used to separate different pieces of the data. Sometimes a CSV file will use a different character as a delimiter, especially
if there are a lot of commas already contained in the data. For instance, let's read in the file tabbed
wonka.csv, which uses tabs instead of commas to separate entries and looks like this:
import csv
import os
next(myFileReader)
print row
Here we used the special character \t to mean the tab character and assigned it to the argument
delimiter when we created myFileReader.
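A short sketch of the delimiter argument, again feeding csv.reader() a list of lines in place of an opened file (the lines are hypothetical stand-ins for the tabbed file):

```python
import csv

lines = [
    "First name\tLast name\tReward",
    "Charlie\tBucket\tgolden ticket, chocolate factory",
]

my_file_reader = csv.reader(lines, delimiter="\t")
next(my_file_reader)                  # skip the header row
rows = list(my_file_reader)
print(rows)
```

Because tabs separate the fields here, the comma inside the reward needs no special quoting.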
Writing CSV files is accomplished using the csv.writer() method in much the same way. Just as rows
of data read from CSV files appeared as lists of strings, we first need to structure the rows we want
import csv
import os
myFileWriter = csv.writer(myFile)
myFileWriter.writerow(["Movie", "Rating"])
import csv
import os
myFileWriter = csv.writer(myFile)
myFileWriter.writerows(myRatings)
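Both writing methods can be sketched together and then checked by reading the file back. Assumptions: Python 3 with newline="" passed to open(), a scratch folder from tempfile, a hypothetical file name, and made-up movie ratings.

```python
import csv
import os
import tempfile

my_ratings = [
    ["Rebel Without a Cause", 3],
    ["Monty Python's Life of Brian", 5],
    ["Santa Claus Conquers the Martians", 0],
]

path = os.path.join(tempfile.mkdtemp(), "movies.csv")
with open(path, "w", newline="") as my_file:
    my_file_writer = csv.writer(my_file)
    my_file_writer.writerow(["Movie", "Rating"])   # one row at a time
    my_file_writer.writerows(my_ratings)           # a whole list of rows

with open(path, "r", newline="") as my_file:
    rows = list(csv.reader(my_file))
print(rows)
```

Note that the ratings come back as strings; a CSV file stores only text, so numbers have to be converted again after reading.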
If we wanted to export data created by a Python script to (for instance) an Excel workbook file, although it's possible to do this directly, it's usually sufficient and much easier to create a CSV file that
we can then open later in Excel and, if needed, convert to the desired format. There are a number of
special modules that have been designed for interacting with Microsoft Excel documents (although
they all have their own limitations), including xlrd and xlwt for reading and writing basic Excel files,
openpyxl for manipulating Excel 2007 files, and XlsxWriter for creating .xlsx files from scratch.
Review exercises:
1. Write a script that reads in the data from the CSV file pastimes.csv located in the chapter 7
practice files folder, skipping over the header row
2. Display each row of data (except for the header row) as a list of strings
3. Add code to your script to determine whether or not the second entry in each row (the Favorite
Pastime) converted to lower-case includes the word fighting using the string methods find()
and lower()
4. Use the list append() method to add a third column of data to each row that takes the value
Combat if the word fighting is found and takes the value Other if it is not
5. Write out a new CSV file categorized pastimes.csv to the Output folder with the updated
data that includes a new header row with the fields Name, Favorite Pastime, and Type of
Pastime
Empiro 23
L33tH4x 42
LLCoolDave 27
MaxxT 25
Misha46 25
O_O 22
johnsmith 30
red 12
tom123 26
Chapter 11
Interlude: Install Packages
The remaining half of this course relies on functionality found in various toolkits that are not packaged with Python by default. There are over 39,000 of these extra packages registered with Python
and available for download. Although all of the add-on features that we'll cover are widely used and
freely available, you will first need to download and install each of these packages in order to be able
to import new functionality into your own code.
Python packages and toolkits are usually just another way of referring to a module or set of modules that can be imported into other Python code, although sometimes these modules rely on other
code outside of Python, which is why it can sometimes be tricky to get everything installed correctly.
NOTE: The best way to install most Python packages is through pip, although pip itself
can be tricky to install on some configurations. If you're using Python 3.4+, you already
have pip installed by default and can use it right away to install new packages with no
extra hassle!
Some Python packages (especially for Windows) offer automated installers that you can download
and run without any extra hassle. On Linux, installing a Python package is usually only a matter of
searching for the correct package name (e.g., in the Debian package directory), then running the
command:
can now edit the system directories listed; each of these folders is separated from the previous folder
by a semicolon. DO NOT delete any of the current paths. Go to the very end of this string and (assuming
you installed Python into the default directory) add the two Python paths as follows, starting with a
semicolon to separate the new paths from the previous entries in your PATH variable:
;C:\Python27\;C:\Python27\Scripts
OS X: You should be fine as long as you've separately downloaded the correct version of Python 2.7.3
and are not relying on the built-in OS X version of Python. Type python into your Terminal to make
sure it is recognized and loads as version 2.7.3 (see below).
Linux: You should be fine. Type python into your Terminal to make sure it is recognized and loads
as version 2.7.3 (see below).
Double-check that Python is recognized by your system by opening a new command prompt (Windows) or Terminal (OS X and Linux) window and simply typing the name python as a command.
Once you hit enter, Python should load just the same way as if you were in the IDLE interactive
window, giving you a >>> prompt as usual. You can exit out of Python in this window by typing
exit().
Now that your operating system knows how to access Python, you can get easy_install installed and
set up:
Windows: Download and run the script ez_setup.py; you may need to copy the code into a new script
manually.
OS X: If you have XCode installed, you may already have a version of easy_install as well. If not,
download the EGG file available here into your user home directory, then install it by typing the
following command into your Terminal:
sh setuptools-0.6c11-py2.7.egg
Debian/Linux: Use the following command to install easy_install:
Most packages will come with a setup.py script that will help you to install them into Python. If this
is the case, you can follow these steps to install the package:
Windows:
1. Download the .zip file for the package and unzip it into a folder in your user directory (i.e.,
C:\Users\yourname)
2. Double-check that there is a script named setup.py in the package folder
3. Open up a command prompt (which should display C:\Users\yourname>) and type the command cd followed by the package folder name; for instance, if you wanted to install a package
in the folder named beautifulsoup4-4.1.0 then you would enter: cd beautifulsoup4-4.1.0
4. To install the package, enter the command: python setup.py install
Non-Windows:
1. Download the .tar.gz file for the package and decompress it into a folder in your user's directory
(e.g., /home/yourname)
2. Double-check that there is a script named setup.py in the package folder
3. In Terminal, type the command cd followed by the package folder name; for instance, if you
wanted to install a package in the folder named beautifulsoup4-4.1.0 then you would enter:
cd beautifulsoup4-4.1.0
4. To install the package, enter the command: sudo python setup.py install
If all else fails, you can always download (and unzip) the entire set of files for a given package and
copy its folder into the same directory as the script where it is used. Since the package files will be in
the same location as the script, Python will be able to find and import them automatically.
Of course, there are a number of problems with this approach. Mainly, the package may take up a lot
of disk space if it's large, and you'll have to copy the entire library over into a new directory every time
you write a script in a new folder. If it's a small package (and the license allows you to copy the entire
set of source files), however, one benefit is that you can then send this entire folder, complete with
your script and the needed library files, to someone else, who would then be able to run your script
without having to install the package as well. Despite this minor possible convenience, this approach
should usually only be used as a last-ditch effort if all other proper attempts to install a package have
failed.
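The reason the last-ditch approach works is that Python searches the folder containing the running script (the first entry of sys.path) when it handles an import. The sketch below imitates this with a temporary folder; the module name tinypackage_demo and its greet() function are invented purely for the demonstration.

```python
import os
import sys
import tempfile

# Stand-in for "the folder that holds your script"
scriptFolder = tempfile.mkdtemp()

# A tiny hypothetical module copied next to the script
with open(os.path.join(scriptFolder, "tinypackage_demo.py"), "w") as moduleFile:
    moduleFile.write("def greet():\n    return 'hello from a local copy'\n")

# When a script runs, its own folder is placed at the front of sys.path;
# inserting the folder by hand imitates that behavior here
sys.path.insert(0, scriptFolder)

import tinypackage_demo

message = tinypackage_demo.greet()
print(message)
```

No installation step was needed: the module was found simply because its folder is on the search path.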
Chapter 12
Interact with PDF files
Read and write PDFs
PDF files have become a sort of necessary evil these days. Despite their frequent use, PDFs are some
of the most difficult files to work with in terms of making modifications, combining files, and especially for extracting text information.
Fortunately, there are a few options in Python for working specifically with PDF files. None of these
options are perfect solutions, but often you can use Python to completely automate or at least ease
some of the pain of performing certain tasks using PDFs.
The most frequently used package for working with PDF files in Python is named pyPdf and can be
found here. You will need to download and install this package before continuing with the chapter.
NOTE: Python 3 note: I highly recommend that you stick with Python 2.7 for this chapter; there is an unofficial version of pyPdf renamed PyPDF2 available here that supports
Python 3, but it's still in development. At the time of this writing, it does not have the
same full functionality as pyPdf and lacks documentation.
import os
>>>
We can also retrieve individual pages from the PDF document using the getPage() method and specifying the index number of the page (as always, starting at 0). However, since PDF pages include
much more than simple text, displaying the text data on a PDF page is more involved. Fortunately,
pyPdf has made the process of parsing out text somewhat easier, and we can use the extractText()
method on each page:
The Project Gutenberg EBook of Pride and Prejudice, by Jane Austen This eBook
is for the use of anyone anywhere at no cost and with almost no
restrictions whatsoever. You may copy it, give it away or re-use it under the
terms of the Project Gutenberg License included with this eBook or online at
www.gutenberg.org
Title: Pride and Prejudice Author: Jane Austen Release
Date: August 26, 2008 [EBook #1342] [Last updated: August 11, 2011] Language:
English Character set encoding: ASCII *** START OF THIS PROJECT GUTENBERG
EBOOK PRIDE AND PREJUDICE ***
Produced by Anonymous Volunteers, and David
Widger
PRIDE AND PREJUDICE
By Jane Austen
Contents
>>>
Formatting standards in PDFs are inconsistent at best, and it's usually necessary to take a look at the
PDF files you want to use on a case-by-case basis. In this instance, notice how we don't actually see
newline characters in the output; instead, it appears that new lines are being represented as multiple
spaces in the text extracted by pyPdf. We can use this knowledge to write out a roughly formatted
version of the book to a plain text file (for instance, if we only had the PDF available and wanted to
make it readable on a less capable mobile device):
import os
outputFile.write(title + "\n")
text = inputFile.getPage(pageNum).extractText()
text = text.replace("  ", "\n")
outputFile.write(text)
outputFile.close()
Since we're writing out basic text, we chose the plain "w" mode and created a file book.txt in the
Output folder. Meanwhile, we still use "rb" mode to read data from the PDF file since, before we
can extract the plain text from each page, we are in fact reading much more complicated data. We
loop over every page number in the PDF file, extracting the text from that page. Since we know that
new lines will show up as additional spaces, we can approximate better formatting by replacing every
instance of double spaces ("  ") with a newline character.
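The replacement step on its own can be tried without any PDF at all. The string below is a made-up stand-in for text returned by extractText(), with two spaces wherever the PDF had a line break; this is a rough sketch of the technique, not the exact output of pyPdf.

```python
# Hypothetical extracted text: two spaces mark each original line break
text = "PRIDE AND PREJUDICE  By Jane Austen  Contents"

# Replace every pair of spaces with a newline character
formatted = text.replace("  ", "\n")
print(formatted)
```

Keep in mind that this is only an approximation: a run of three spaces, for example, would become a newline followed by one leftover space.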
You may find that a PDF document includes unusual characters that cannot be written into a plain
text file; for instance, a trademark symbol. These characters are not in the ASCII character set,
meaning that they can't be represented using any of the 128 standard ASCII characters. Because
of this, your code will not be able to write them out to a raw text file in "w" mode. Usually the way to get
around this is by using the encode() method, like so: text = text.encode("utf-8")
If we have a string text that has unusual characters in it, this line will allow us to change how each
character is represented (using UTF-8 encoding) so that we can now store these symbols in a raw
text file. These unusual characters might not appear the same way as in the original file, since the
text file has a much more limited set of characters available, but if you do not change the encoding
then you will not be able to output the text at all.
(If you really want to get a handle on text encoding and what's really happening, take a look at this
talk. Note that in Python 3, strings are unicode by default.)
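Here is a small sketch of what encode() does to a string containing a trademark symbol. The sample text and the file name encoded_sample.txt are assumptions, and the example is written for Python 3, where the encoded result is a bytes object and therefore has to be written in "wb" mode.

```python
import os
import tempfile

# Hypothetical text containing a non-ASCII character (the trademark sign)
text = u"Acme\u2122"

# UTF-8 represents the trademark sign as three bytes
encoded = text.encode("utf-8")
print(len(text), len(encoded))

path = os.path.join(tempfile.mkdtemp(), "encoded_sample.txt")
with open(path, "wb") as outputFile:
    outputFile.write(encoded)

# Reading the bytes back and decoding them recovers the original string
with open(path, "rb") as inputFile:
    roundTrip = inputFile.read().decode("utf-8")
```

The string has five characters but the encoded form has seven bytes, which is why the encoding has to be stated explicitly when saving and loading.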
Instead of extracting text, we might want to modify the PDF file itself, saving out a new version of
the PDF. We'll see more examples of why and how this might occur in the next section, but for now
we'll create the simplest modified file by saving out only a section of the original file. Here we copy over
the first three pages of the PDF (not including the cover page) into a new PDF file:
import os
outputPDF = PdfFileWriter()
outputPDF.addPage(inputFile.getPage(pageNum))
outputPDF.write(outputFile)
outputFile.close()
We imported both PdfFileReader and PdfFileWriter from pyPdf so that we can write out a PDF
file of our own. PdfFileWriter doesn't take any arguments, which might be surprising; we can start
adding PDF pages to our outputPDF before we've specified what file it will become. However, in
order to save the output to an actual PDF file, at the end of our code we create an outputFile as
usual and then call outputPDF.write(outputFile) in order to write the PDF contents into this file.
Review exercises:
1. Write a script that opens the file named The Whistling Gypsy.pdf from the Chapter 8 practice
files, then displays the title, author, and total number of pages in the file
2. Extract the full contents of The Whistling Gypsy.pdf into a .TXT file; you will need to encode
the text as UTF-8 before you can output it
3. Save a new version of The Whistling Gypsy.pdf that does not include the cover page into the
Output folder
import os
outputPDF = PdfFileWriter()
page = inputFile.getPage(pageNum)
if pageNum % 2 == 0:
    page.rotateClockwise(90)
outputPDF.addPage(page)
outputPDF.write(outputFile)
outputFile.close()
Another useful feature of pyPdf is the ability to crop pages, which in turn will allow us to split up PDF
pages into multiple parts or save out partial sections of pages. For instance, open up the file half
and half.pdf from the chapter 8 practice files folder to see an example of where this might be useful.
This time, we have a PDF that's presented in two frames per page, which again is not an ideal layout
in many situations. In order to split these pages up, we will have to refer to the MediaBox belonging
to each PDF page, which is a rectangle representing the boundaries of the page. Let's take a look at
the MediaBox of a PDF page in the interactive window to get an idea of what it looks like:
>>>
A mediaBox is a type of object called a RectangleObject. Consequently, we can get the coordinates of
the rectangle's corners:
(0, 0)
(792, 0)
(792, 612)
792.0
612.0
>>>
These locations are returned to us as tuples that include the x and y coordinate pairs. Notice how we
didn't include parentheses anywhere because mediaBox and its corners are attributes,
not methods of the PDF page.
We will have to do a little math in order to crop each of our PDF pages. Basically, we need to set the
corners of each half-page so that we crop out the side of the page that we don't want. To do this, we
divide the width of the landscape page into two halves; we set the upper-right corner of the left-side page
to be half of the total width, and we set the left-hand corners of the right-side page to start halfway across
the width of the page.
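The arithmetic can be worked through with plain tuples before touching any PDF. This is a sketch under the assumption that the landscape page runs from (0, 0) to (792, 612), as in the corner values listed earlier; the names leftHalf and rightHalf are made up for the illustration and are not part of pyPdf.

```python
# Corner coordinates as (x, y) tuples for a landscape page
lowerLeft = (0, 0)
upperRight = (792, 612)

# The x coordinate halfway across the page
width = upperRight[0] - lowerLeft[0]
halfX = lowerLeft[0] + width / 2

# Left half: keep the lower-left corner, pull the upper-right corner in to the middle
leftHalf = {"lowerLeft": lowerLeft, "upperRight": (halfX, upperRight[1])}

# Right half: push the lower-left corner out to the middle, keep the upper-right corner
rightHalf = {"lowerLeft": (halfX, lowerLeft[1]), "upperRight": upperRight}

print(leftHalf)
print(rightHalf)
```

Each half ends up 396 units wide and the full 612 units tall, so the two halves together cover the original page exactly once.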
Since we have to crop the half-pages in order to write them out to our new PDF file, we will also
have to create a copy of each page. This is because the PDF pages are mutable objects; if we change
something about a page, we also change the same things about any variable that references that
object. This is exactly the same problem that we ran into when having to copy an entire list into a
new list before making changes. In this case, we import the built-in copy module, which creates and
returns a copy of an object by using the copy.copy() function. (In fact, this function works just as well
for making copies of entire lists instead of the shorthand list2 = list1[:] notation.)
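To see why the copy is needed, the sketch below uses a made-up Page class as a stand-in for a PDF page object (it is not the pyPdf class). Changing the object through one name also changes what a second name for the same object sees, while a copy made with copy.copy() is unaffected.

```python
import copy


class Page(object):
    """Hypothetical stand-in for a PDF page with one changeable corner."""

    def __init__(self, upperRight):
        self.upperRight = upperRight


pageLeft = Page((792, 612))
alias = pageLeft                  # a second name for the SAME object
pageRight = copy.copy(pageLeft)   # a separate (shallow) copy

# "Crop" the left page only
pageLeft.upperRight = (396, 612)

print(alias.upperRight)      # follows the change, since it is the same object
print(pageRight.upperRight)  # still holds the original corner
```

Note that copy.copy() makes a shallow copy: the new object is separate, but any mutable objects stored inside it are still shared. When that matters, copy.deepcopy() from the same module copies the nested objects as well.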
This is tricky code, so take a while to work through it and play with different variations of the copying
and cropping to make sure you understand the underlying math:
import os
import copy
outputPDF = PdfFileWriter()
pageLeft = inputFile.getPage(pageNum)
pageRight = copy.copy(pageLeft)
outputPDF.addPage(pageLeft)
outputPDF.addPage(pageRight)
outputPDF.write(outputFile)
outputFile.close()
NOTE: PDF files are a bit unusual in how they save page orientation. Depending on
how the PDF was originally created, it might be the case that your axes are switched; for instance, a standard portrait document that's been converted into a landscape PDF
might have the x-axis represented vertically while the y-axis is horizontal. Likewise, the
corners would all be rotated by 90 degrees; the upperLeft corner would appear on the
upper right or the lower left, depending on the file's rotation. Especially if you're working
with a landscape PDF file, it's best to do some initial testing to make sure that you are
using the correct corners and axes.
Beyond manipulating an already existing PDF, we can also add our own information by merging one
PDF page with another. For instance, perhaps we want to automatically add a header or a watermark
to every page in a file. I've saved an image with a transparent background into a one-page PDF file
for this purpose, which we can use as a watermark, combining this image with every page in a PDF
file by using the mergePage() method:
import os
outputPDF = PdfFileWriter()
page = inputFile.getPage(pageNum)
outputPDF.addPage(page)
outputPDF.write(outputFile)
outputFile.close()
Notice that we also secured the file by adding basic encryption, supplying the password
"good2Bking" through the PdfFileWriter's encrypt() method. If you know the password used to protect a PDF file, there is also a matching decrypt() method to decrypt an input file that is password
protected; this can be incredibly useful as an automation tool if you have many identically encrypted
PDFs and don't want to have to type out a password each time you open one of the files.
Although pyPdf is one of the best and most frequently relied-upon packages for interacting with PDFs
in Python, it does have some weaknesses. For instance, there is no way to generate your own PDF
files from scratch; instead, you must start with at least a template document. For PDF generation in
particular, I suggest researching the ReportLab toolkit, which is also free and open-source. Another
popular choice for manipulation of existing PDF files in Python is PDFMiner, which offers slightly
different functionality from pyPdf.
There is also a PyPDF2 in the works that aims to handle more difficult PDF files and make some PDF
manipulation tasks even easier than in pyPdf. However, as of this writing (March 2014) there is not
yet any documentation available on how to use it.
Review exercises:
1. Write a script that opens the file named Walrus.pdf from the Chapter 8 practice files; you
will need to decrypt the file using the password IamtheWalrus
2. Rotate every page in this input file counter-clockwise by 90 degrees
3. Split each page in half vertically, such that every column appears on its own separate page, and
output the results as a new PDF file in the Output folder
Chapter 13
SQL database connections
Communicate with databases using SQLite
If you're interested in this chapter, I'm assuming that you have at least a basic knowledge of SQL and
the concept of querying a database. If not, you might want to take a moment to read through this
article introducing databases and browse through these lessons introducing basic SQL code.
There are many different variations of SQL, and some are suited to certain purposes better than
others. The simplest, most lightweight version of SQL is SQLite, which runs directly on your machine
and comes bundled with Python automatically.
SQLite is usually used within applications for small internal storage tasks, but it can also be useful
for testing SQL code before setting an application up to use a larger database.
In order to communicate with SQLite, we need to import the module and connect to a database:
import sqlite3
connection = sqlite3.connect("test_database.db")
Here we've created a new database named test_database.db, but connecting to an existing database
works exactly the same way. Now we need a way to communicate across the connection:
c = connection.cursor()
This line creates a Cursor object, which will let us execute commands on the SQL database and return
the results. We'll be using the cursor a lot, so we can just call it c for short. Now we can easily execute
regular SQL statements on the database through the cursor like so:
connection.commit()
Here we've inserted a new row, with a FirstName of Ron, a LastName of Obvious, and an Age equal
to 42. In the second line, we had to commit the change we made to the table to say that we really
meant to change the table's contents; otherwise our change wouldn't actually be saved.
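As a minimal sketch of the create, insert, and commit sequence described here: the column types in the CREATE TABLE statement are an assumption, and the special :memory: database name is used so that running the example leaves no file behind.

```python
import sqlite3

# ":memory:" keeps this sketch from writing a database file to disk
connection = sqlite3.connect(":memory:")
c = connection.cursor()

# Assumed table layout: two text columns and an integer column
c.execute("CREATE TABLE People(FirstName TEXT, LastName TEXT, Age INT)")

# Single quotes mark strings inside the SQL statement itself
c.execute("INSERT INTO People VALUES('Ron', 'Obvious', 42)")
connection.commit()  # make the change permanent

c.execute("SELECT * FROM People")
savedRows = c.fetchall()
print(savedRows)

connection.close()
```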
NOTE: We used double quotation marks in the string above, with single quotes denoting
strings inside of the SQL statement. Although Python doesn't differentiate between using single and double quotes, some versions of SQL (including SQLite) only allow strings
to be enclosed in single quotation marks, so it's important not to switch these around.
At this point, you could close and restart IDLE completely, and if you then reconnect to
test_database.db, your People table will still exist there, storing Ron Obvious and his Age;
this is why SQLite can be useful for internal storage for those times when it makes sense to structure
your data as a database of tables rather than writing output to individual files. The most common
example of this is to store information about users of an application.
NOTE: If you just want to create a one-time-use database while you're testing code or
playing around with table structures, you can use the special name :memory: to create
the database in temporary RAM like so:
connection = sqlite3.connect(':memory:')
If we want to delete the People table, it's as easy as executing a DROP TABLE statement:
connection.close()
When working with a database connection, it's also a good idea to use the with keyword to simplify
your code (and your life), similar to how we used with to open files:
Besides making your code more compact, this will benefit you in a few important ways. Firstly, you
no longer need to commit() changes you make; they're automatically saved. Using with also helps
with handling potential errors and freeing up resources that are no longer needed, much like how we
can open (and automatically close) files using the with keyword. Keep in mind, however, that you
will still need to commit() a change if you want to see the result of that change immediately (before
closing the connection); we'll see an example of this later in the section.
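A short sketch of the with pattern follows, again using an in-memory database and an assumed People table layout. One detail worth knowing: with sqlite3, leaving the with block commits the changes (or rolls them back if an error occurred), but it does not close the connection, so the example closes it explicitly at the end.

```python
import sqlite3

connection = sqlite3.connect(":memory:")

with connection:
    c = connection.cursor()
    c.execute("CREATE TABLE People(FirstName TEXT, LastName TEXT, Age INT)")
    c.execute("INSERT INTO People VALUES('Ron', 'Obvious', 42)")
# Leaving the with block committed the INSERT; no commit() call was needed

c.execute("SELECT COUNT(*) FROM People")
rowCount = c.fetchone()[0]
print(rowCount)

connection.close()  # the with block does not do this for us
```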
If you want to run more than one line of SQL code at a time, there are a couple of options. One
simple option is to use the executescript() method and give it a string that represents a full script;
although lines of SQL code will be separated by semicolons, it's common to pass a multi-line string
for readability. Our full code might look like so:
import sqlite3
c = connection.cursor()
c.executescript("""
""")
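One way a complete executescript() call might look is sketched below. The three SQL statements and the table layout are assumptions chosen for illustration, not a reconstruction of the original listing.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
c = connection.cursor()

# Several SQL statements in one multi-line string, separated by semicolons
c.executescript("""
    DROP TABLE IF EXISTS People;
    CREATE TABLE People(FirstName TEXT, LastName TEXT, Age INT);
    INSERT INTO People VALUES('Ron', 'Obvious', 42);
""")

c.execute("SELECT FirstName, Age FROM People")
scriptRows = c.fetchall()
print(scriptRows)

connection.close()
```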
We can also execute many similar statements by using the executemany() method and supplying a
tuple of tuples, where each inner tuple supplies the information for a single command. For instance,
if we have a lot of people's information to insert into our People table, we could save this information
in the following tuple of tuples:
peopleValues = (
parameterized statement. The difference between parameterized and non-parameterized code is very similar
to how we can write out strings by concatenating many parts together versus using the string format()
method to insert specific pieces into a string after creating it.
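A sketch of executemany() with a parameterized statement is given below. The names and ages in peopleValues are placeholder data invented for the example, and the question marks are the placeholders that sqlite3 fills in from each inner tuple.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
c = connection.cursor()
c.execute("CREATE TABLE People(FirstName TEXT, LastName TEXT, Age INT)")

# Hypothetical data: one inner tuple per row to insert
peopleValues = (
    ("Ron", "Obvious", 42),
    ("Luigi", "Vercotti", 43),
    ("Arthur", "Belling", 28),
)

# Each ? is replaced by the matching item of an inner tuple
c.executemany("INSERT INTO People VALUES(?, ?, ?)", peopleValues)
connection.commit()

c.execute("SELECT COUNT(*) FROM People")
peopleCount = c.fetchone()[0]
print(peopleCount)

connection.close()
```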
For security reasons, especially when you need to interact with a SQL table based on user-supplied
input, you should always use parameterized SQL statements. This is because the user could potentially supply a value that looks like SQL code and causes your SQL statement to behave in unexpected
ways. This is called a SQL injection attack and, even if you aren't dealing with a malicious user, it
can happen completely by accident.
For instance, suppose we want to insert a person into our People table based on user-supplied information. We might initially try something like the following (assuming we already have our People
table set up):
import sqlite3
c = connection.cursor()
"',"+str(age) +")"
c.execute(line)
Notice how we had to change age into an integer to make sure that it was a valid age, but then we
had to change it back into a string in order to concatenate it with the rest of the line; this is because
we created the line by adding a bunch of strings together, including using single quotation marks to
denote strings within our string. If you're still not clear how this works, try inserting a person into
the table and then print line to see how the full line of SQL code looks.
But what if the user's name included an apostrophe? Try adding Flannery O'Connor to the table,
and you'll see that she breaks the code; this is because the apostrophe gets mixed up with the single
quotes in the line, making it appear that the SQL code ends earlier than expected.
In this case, our code only causes an error (which is bad) instead of corrupting the entire table (which
would be very bad), but there are many other hard-to-predict cases that can break SQL tables when
not parameterizing your statements. To avoid this, we should have used place-holders in our SQL
code and inserted the person data as a tuple:
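The difference can be sketched side by side. In this example the name and age are hard-coded stand-ins for user input, and the table layout is assumed; the concatenated statement fails on the apostrophe, while the parameterized one stores the name unchanged.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
c = connection.cursor()
c.execute("CREATE TABLE People(FirstName TEXT, LastName TEXT, Age INT)")

# Stand-ins for values that would normally be typed in by a user
firstName = "Flannery"
lastName = "O'Connor"
age = 39

# Building the SQL by concatenation: the apostrophe ends the string too early
line = ("INSERT INTO People VALUES('" + firstName + "', '" + lastName +
        "', " + str(age) + ")")
try:
    c.execute(line)
    concatenationFailed = False
except sqlite3.OperationalError:
    concatenationFailed = True

# Parameterized version: placeholders in the SQL, data supplied as a tuple
c.execute("INSERT INTO People VALUES(?, ?, ?)", (firstName, lastName, age))
connection.commit()

c.execute("SELECT LastName FROM People")
storedName = c.fetchone()[0]
print(concatenationFailed, storedName)

connection.close()
```

With placeholders there is also no need to convert age back into a string, since sqlite3 handles each value according to its type.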
import sqlite3
c = connection.cursor()
import sqlite3
peopleValues = (
c = connection.cursor()
# select all first and last names from people over age 30
print row
We executed a SELECT statement that returned the first and last names of all people over the age of
30, then called fetchall() on our cursor to retrieve the results of this query, which are stored as a list
of tuples. Looping over the rows in this list to view the individual tuples, we see:
>>>
(u'Ron', u'Obvious')
(u'Luigi', u'Vercotti')
>>>
The u before each string stands for unicode (as opposed to ASCII) and basically means that the
string might contain complicated characters that can't be represented by the usual basic set of characters we normally see in English text.
NOTE: Python 3 note: You won't see the u before the strings in Python 3, because
strings in Python 3 are always stored as unicode by default. In order to get this same
functionality in Python 2, you can include the following line at the beginning of your
script: from __future__ import unicode_literals
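The SELECT and fetchall() steps described above can be sketched as follows. The ages are assumptions (chosen so that two of the three people are over 30), and because the sketch is written for Python 3 the printed tuples appear without the u prefix.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
c = connection.cursor()
c.execute("CREATE TABLE People(FirstName TEXT, LastName TEXT, Age INT)")

# Hypothetical rows; only the first two people are over 30
peopleValues = (
    ("Ron", "Obvious", 42),
    ("Luigi", "Vercotti", 43),
    ("Arthur", "Belling", 28),
)
c.executemany("INSERT INTO People VALUES(?, ?, ?)", peopleValues)

# select all first and last names from people over age 30
c.execute("SELECT FirstName, LastName FROM People WHERE Age > 30")
overThirty = c.fetchall()  # a list of tuples, one tuple per matching row
for row in overThirty:
    print(row)

connection.close()
```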
If we wanted to loop over our result rows one at a time instead of fetching them all at once, we would
usually use a loop such as the following:
while True:
    row = c.fetchone()
    if row is None:
        break
    print row
This checks each time whether our fetchone() returned another row from the cursor, displaying the
row if so and breaking out of the loop once we run out of results.
The None keyword is the way that Python represents the absence of any value for an object. When we
wanted to compare a string to a missing value, we used empty quotes to check that the string object
had no information inside: stringName == ""
When we want to compare other objects to missing values to see if those objects hold any information,
we compare them to None, like so: objectName is None
This comparison will return True if objectName exists but is empty and False if objectName holds
any value.
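A self-contained sketch of the fetchone() loop, together with the two kinds of "missing value" comparison, might look like this. The table contents are placeholder data, and the variable names fetched and emptyString are made up for the example.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
c = connection.cursor()
c.execute("CREATE TABLE People(FirstName TEXT, LastName TEXT, Age INT)")
c.executemany("INSERT INTO People VALUES(?, ?, ?)",
              (("Ron", "Obvious", 42), ("Arthur", "Belling", 28)))

c.execute("SELECT FirstName, LastName FROM People WHERE Age > 30")
fetched = []
while True:
    row = c.fetchone()   # returns None once the results run out
    if row is None:
        break
    fetched.append(row)
    print(row)

# An empty string is still a string object, so it is not None
emptyString = ""
print(emptyString == "", emptyString is None)

connection.close()
```

The loop body runs once here, because only one row matches; the second call to fetchone() returns None and ends the loop.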
Review exercises:
1. Create a database table in RAM named Roster that includes the fields Name, Species and
IQ
2. Populate your new table with the following values:
Jean-Baptiste Zorg, Human, 122
Korben Dallas, Meat Popsicle, 100
Ak'not, Mangalore, -5
Chapter 14
Interacting with the web
Scrape and parse text from websites
Given the hundreds of millions of websites out there, chances are that you might at some point be
interested in gathering data from a webpage - or perhaps from thousands of webpages. In this chapter
we will explore various options for interacting with and gathering data from the Internet through
Python.
Collecting data from websites using an automated process is known as web scraping. Some websites
explicitly forbid users from scraping their data with automated tools like the ones we will create.
Websites do this for either of two possible reasons:
1. The site has a good reason to protect its data; for instance, Google Maps will not allow you to
request too many results too quickly
2. Making many repeated requests to a website's server may use up bandwidth, slowing down
the website for other users and potentially overloading the server such that the website stops
responding entirely
You should always check a website's acceptable use policy before scraping its data to see if accessing
the website by using automated tools is a violation of its terms of use. Legally, web scraping against
the wishes of a website is very much a gray area, but I just want to make it clear that the following
techniques may be illegal when used on websites that prohibit web scraping.
The primary language of information on the Internet is HTML (HyperText Markup Language), which
is how most webpages are displayed in browsers. For instance, if you browse to a particular website
and choose to view page source in your browser, you will most likely be presented with HTML code
underlying that webpage; this is the information that your browser receives and translates into the
page you actually see.
If you are not familiar with the basics of HTML tags, you should take a little while to read through
the first dozen chapters of a brief overview of HTML. None of the HTML used in this chapter will be
very complicated, but having a solid understanding of HTML elements is an important prerequisite
for developing good techniques for web scraping.
Let's start by grabbing all of the HTML code from a single webpage. We'll take a very simple page
that's been set up just for practice:
myAddress = "http://RealPython.com/practice/aphrodite.html"
htmlPage = urlopen(myAddress)
htmlText = htmlPage.read()
print htmlText
This displays the following result for us, which represents the full HTML of the page just as a web
browser would see it:
>>>
<html>
<head>
<title>Profile: Aphrodite</title>
</head>
<body bgcolor="yellow">
<center>
<br><br>
<h2>Name: Aphrodite</h2>
<br><br>
<br><br>
<br><br>
</center>
</body>
</html>
>>>
Calling urlopen() will cause the following error if Python cannot connect to the Internet:
URLError: <urlopen error [Errno 11001] getaddrinfo failed>
If you provide an invalid web address that can't be found, you will see the following error, which is
equivalent to the 404 page that a browser would load: HTTPError: HTTP Error 404: Not Found
Now we can scrape specific information from the webpage using text parsing, i.e., looking through
the full string of text and grabbing only the pieces that are relevant to us. For instance, if we wanted to
get the title of the webpage (in this case, Profile: Aphrodite), we could use the string find() method
to search through the text of the HTML for the <title> tags and parse out the actual title using index
numbers:
myAddress = "http://RealPython.com/practice/aphrodite.html"
htmlPage = urlopen(myAddress)
htmlText = htmlPage.read()
startTag = "<title>"
endTag = "</title>"
startIndex = htmlText.find(startTag) + len(startTag)
endIndex = htmlText.find(endTag)
print htmlText[startIndex:endIndex]
Running this script correctly displays the HTML code limited to only the text in the title:
>>>
Profile: Aphrodite
>>>
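The parsing step can be tried offline on a string. The HTML below is a shortened, hand-written stand-in for the practice page (an assumption, so that the sketch needs no Internet connection); the find() and slicing logic is the part being illustrated.

```python
# Shortened stand-in for the downloaded page
htmlText = "<html>\n<head>\n<title>Profile: Aphrodite</title>\n</head>\n</html>"

startTag = "<title>"
endTag = "</title>"

# find() gives the index where the tag begins, so skip past the tag itself
startIndex = htmlText.find(startTag) + len(startTag)
endIndex = htmlText.find(endTag)

title = htmlText[startIndex:endIndex]
print(title)
```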
Of course, this worked for a very simple example, but HTML in the real world can be much more
complicated and far less predictable. For a small taste of the expectations versus reality of text
parsing, visit poseidon.html and view the HTML source code. It looks like the same layout as before, but let's try running the same script as before on myAddress = "http://RealPython.com/practice/poseidon.html":
>>>
<head>
>>>
We didn't manage to find the beginning of the <title> tag correctly this time because there was a
space before the closing >, like so: <title >. Instead, our find() method returned -1 (because the
exact string <title> wasn't found anywhere) and then added the length of the tag string, making us
think that the beginning of the title was six characters into the HTML code.
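The failure is easy to reproduce on a string. The HTML below is a hand-written approximation of the problem (the real page may differ), and the final check for -1 is one simple, assumed way to guard against it.

```python
# Stand-in HTML with a stray space inside the opening title tag
htmlText = "<html>\n<head>\n<title >Profile: Poseidon</title>\n</head>\n</html>"

startTag = "<title>"
endTag = "</title>"

found = htmlText.find(startTag)      # -1, since "<title>" never appears exactly
startIndex = found + len(startTag)   # -1 + 7 = 6, a meaningless position
endIndex = htmlText.find(endTag)

wrongTitle = htmlText[startIndex:endIndex]
print(repr(wrongTitle))

# Checking for -1 before slicing avoids silently returning the wrong text
if found == -1:
    safeTitle = None
else:
    safeTitle = htmlText[startIndex:endIndex]
print(safeTitle)
```

The slice starts six characters in, right after the opening html tag, so it picks up the head tag and everything before the closing title tag instead of just the title.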
Because these sorts of problems can occur in countless unpredictable ways, a more reliable alternative
than using find() is to use regular expressions. Regular expressions (often shortened to "regex")
are strings that can be used to determine whether or not text matches a particular pattern.
Regular expressions are not particular to Python; they are a general programming concept that can
be used with a wide variety of programming languages. Regular expressions use a language all of
their own that is notoriously difficult to learn but incredibly useful once mastered. Although a full
overview of regular expressions is outside the scope of this book, I'll cover just a few examples
to get started.
Python allows us to use regular expressions through the re module. Just as Python uses the backslash
character as an "escape character" for representing special characters that can't simply be typed into
strings, regular expressions use a number of different "special" characters (called meta-characters)
that are interpreted as ways to signify different types of patterns. For instance, the asterisk character,
"*", stands for "zero or more" of whatever came just before the asterisk. Let's see this in an example,
where we use the re.findall() function to find any text within a string that matches a given regular
expression. The first argument we pass to re.findall() is the regular expression that we want to match,
and the second argument is the string to test:
>>> import re
>>> re.findall("ab*c", "ac")
['ac']
>>> re.findall("ab*c", "abcd")
['abc']
>>> re.findall("ab*c", "acc")
['ac']
>>> re.findall("ab*c", "abcac")
['abc', 'ac']
>>>
Our regular expression, "ab*c", matches any part of the string that begins with an "a", ends with
a "c", and has zero or more of "b" in between the two. This function returns a list of all matches.
Note that this is case-sensitive; if we wanted to match this pattern regardless of upper-case or
lower-case differences, we could pass a third argument with the value re.IGNORECASE, which is a specific
variable stored in the re module:
>>> re.findall("ab*c", "ABC", re.IGNORECASE)
['ABC']
>>>
We can use a period to stand for any single character in a regular expression. For instance, we could
find all the substrings that contain the letters "a" and "c" separated by a single character as follows:
>>> re.findall("a.c", "abc")
['abc']
>>> re.findall("a.c", "abbc")
[]
>>> re.findall("a.c", "ac")
[]
>>> re.findall("a.c", "acc")
['acc']
>>>
Therefore, putting the term ".*" inside of a regular expression stands for any character being repeated
any number of times. For instance, if we wanted to find every substring of a particular string that
starts with the letter "a" and ends with the letter "c", regardless of what occurs in between these two
letters, we could say:
>>> re.findall("a.*c", "abc")
['abc']
>>> re.findall("a.*c", "abbc")
['abbc']
>>> re.findall("a.*c", "ac")
['ac']
>>> re.findall("a.*c", "acc")
['acc']
>>>
Usually we will want to use the re.search() function to search for a particular pattern inside a string.
This function is somewhat more complicated because it returns an object called a MatchObject that
stores different "groups" of data; this is because there might be matches inside of other matches, and
the MatchObject keeps track of each of them. The details of MatchObject are irrelevant here,
but for our purposes, calling the group() method on a MatchObject will return the first and most
inclusive result, which in most instances is all we care to find. For instance:
>>> matchResults = re.search("ab*c", "ABC", re.IGNORECASE)
>>> print matchResults.group()
ABC
>>>
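One detail worth knowing is that re.search() returns None when the pattern is not found, so calling group() on the result directly would raise an error. A small sketch of a guard follows; the helper name first_match is hypothetical and the code uses Python 3's print() function.

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None when nothing matches."""
    match_results = re.search(pattern, text, re.IGNORECASE)
    if match_results is None:
        return None
    return match_results.group()

print(first_match("ab*c", "ABC"))   # the pattern is found
print(first_match("ab*c", "xyz"))   # the pattern is not found
```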
There is one more re function that will come in handy when parsing out text. The sub() function,
which is short for "substitute", allows us to replace text in a string that matches a regular expression
with new text (much like the string replace() method). The arguments passed to re.sub() are the
regular expression, followed by the replacement text, then followed by the string. For instance:
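>>> myString = "Everything is <replaced> if it's in <tags>."
>>> myString = re.sub("<.*>", "ELEPHANTS", myString)
>>> print myString
Everything is ELEPHANTS.
>>>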
Perhaps that wasn't quite what we expected to happen... We found and replaced everything in between
the first < and last >, which ended up being most of the string. This is because Python's
regular expressions are "greedy", meaning that they try to find the longest possible match when
characters like "*" are used. Instead, we should have used the non-greedy matching pattern "*?", which
works the same way as "*" except that it tries to match the shortest possible string of text:
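>>> myString = "Everything is <replaced> if it's in <tags>."
>>> myString = re.sub("<.*?>", "ELEPHANTS", myString)
>>> print myString
Everything is ELEPHANTS if it's in ELEPHANTS.
>>>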
Armed with all this knowledge, let's now try to parse out the title from dionysus.html, which includes
this rather carelessly written line of HTML:
<TITLE >Profile: Dionysus</title / >
Our find() method would have a difficult time dealing with the inconsistencies here, but with the
clever use of regular expressions, we will be able to handle this code easily:
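import re
from urllib2 import urlopen
myAddress = "http://RealPython.com/practice/dionysus.html"
htmlPage = urlopen(myAddress)
htmlText = htmlPage.read()
# find the whole title element, tolerating extra characters inside the tags
matchResults = re.search("<title .*?>.*</title .*?>", htmlText, re.IGNORECASE)
title = matchResults.group()
# remove the HTML tags themselves, leaving only the title text
title = re.sub("<.*?>", "", title)
print title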
Let's take the first regular expression we used and break it down into three parts:
1. <title .*?> - First we check for the opening tag, where there must be a space after the word
"title" and the tag must be closed, but any characters can appear in the rest of the tag; we use
the non-greedy .*? because we want the first closing > to match the tag's end
2. .* - Any characters can appear in between the <title> tags
3. </title .*?> - This expression is the same as the first part, except that we also require the
forward slash before "title" because this is a closing HTML tag
Likewise, we then use the non-greedy .*? placed inside of an HTML tag to match any HTML tags
and remove them from the parsed-out title.
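To try the three-part pattern without a network connection, the sketch below applies it to a single hard-coded line. The sample string is an assumption modelled on the messy title described above, not a copy of the real page, and the code uses Python 3's print() function.

```python
import re

# Assumed sample: upper-case opening tag, stray space, errant slash at the end
html_text = "<html><head><TITLE >Profile: Dionysus</title  / ></head></html>"

# Part 1, part 2 and part 3 of the pattern, joined into one expression
match_results = re.search("<title .*?>.*</title .*?>", html_text, re.IGNORECASE)
if match_results is None:
    title = None
else:
    # Strip every tag from the matched text, leaving only the title itself
    title = re.sub("<.*?>", "", match_results.group())
print(title)
```

The re.IGNORECASE flag is what lets the lower-case pattern match the upper-case TITLE in the sample.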
Regular expressions are an incredibly powerful tool when used correctly. We've only scratched the
surface of their potential here, although you're encouraged to take some time to study the very
thorough Python Regular Expression HOWTO document.
NOTE: web scraping in practice can be very tedious work. Beyond the fact that no two
websites are organized the same way, usually webpages are messy and inconsistent in
their formatting. This leads to a lot of time spent handling unexpected exceptions to
every rule, which is less than ideal when you want to automate a task.
Review exercises:
1. Write a script that grabs the full HTML from the page dionysus.html
2. Use the string find() method to display the text following Name: and Favorite Color: (not
including any leading spaces or trailing HTML tags that might appear on the same line)
3. Repeat the previous exercise using regular expressions; the end of each pattern should be a
< (i.e., the start of an HTML tag) or a newline character, and you should remove any extra
spaces or newline characters from the resulting text using the string strip() method
from bs4 import BeautifulSoup
from urllib2 import urlopen
myAddress = "http://RealPython.com/practice/dionysus.html"
htmlPage = urlopen(myAddress)
htmlText = htmlPage.read()
mySoup = BeautifulSoup(htmlText)
From here, we can parse data out of mySoup in various useful ways depending on what information
we want. For instance, BeautifulSoup includes a get_text() method for extracting just the text from
a document, removing any HTML tags automatically:
>>> print mySoup.get_text()
Profile: Dionysus

Name: Dionysus

>>>
There are a lot of extra blank lines left, but these can always be taken out using the string replace()
method. If we only want to get specific text from an HTML document, using BeautifulSoup to extract
the text first and then using find() is sometimes easier than working with regular expressions.
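As a rough illustration of cleaning up the extracted text, the sketch below parses a small hard-coded page and drops the blank lines. The sample HTML is made up, the explicit "html.parser" argument is an assumption about which parser to use, and blank lines are removed by splitting the text into lines rather than with replace(). It uses Python 3's print() function.

```python
from bs4 import BeautifulSoup

# Made-up sample page, loosely modelled on the practice profile
html_text = """<html><head><title>Profile: Dionysus</title></head>
<body>
<h2>Name: Dionysus</h2>
<br><br>
Favorite Color: Wine
</body></html>"""

my_soup = BeautifulSoup(html_text, "html.parser")
page_text = my_soup.get_text()

# Keep only lines that still contain something after trimming whitespace
non_blank_lines = [line.strip() for line in page_text.splitlines() if line.strip()]
print("\n".join(non_blank_lines))
```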
However, sometimes the HTML tags are actually the elements that point out the data we want to
retrieve. For instance, perhaps we want to retrieve links for all the images on the page, which will
appear in <img> HTML tags. In this case, we can use the find_all() method to return a list of all
instances of that particular tag:
>>> print mySoup.find_all("img")
[<img src="dionysus.jpg"/>, <img src="grapes.png"><br><br>
<br>
</br></br></br></br></br></br></img>]
>>>
This wasn't exactly what we expected to see, but it happens quite often in the real world; the first
element of the list, <img src="dionysus.jpg"/>, is a "self-closing" HTML image tag that doesn't
require a closing </img> tag. Unfortunately, whoever wrote the sloppy HTML for this page never
added a closing forward slash to the second HTML image tag, <img src="grapes.png">, and didn't
include a </img> tag either. So BeautifulSoup ended up grabbing a fair amount of HTML after the
image tag as well before inserting a </img> on its own to correct the HTML.
Fortunately, this still doesn't have much bearing on how we can parse information out of the image
tags with Beautiful Soup. This is because these HTML tags are stored as Tag objects, and we can
easily extract certain information out of each Tag. In our example, assume for simplicity that we
know to expect two images in our list so that we can pull two Tag objects out of the list:
>>> image1, image2 = mySoup.find_all("img")
>>>
We now have two Tag objects, image1 and image2. These Tag objects each have a name, which is just
the type of HTML tag to which they correspond:
>>> print image1.name
img
>>>
These Tag objects also have various attributes, which can be accessed in the same way as a dictionary.
The HTML tag <img src="dionysus.jpg"/> has a single attribute "src" that takes on the
value "dionysus.jpg" (much like a key: value pair in a dictionary). Likewise, an HTML tag such
as <a href="http://RealPython.com" target="_blank"> would have two attributes, an "href" attribute
that is assigned the value "http://RealPython.com" and a "target" attribute that has the value
"_blank".
We can therefore pull the image source (the link that we wanted to parse) out of each image tag using
standard dictionary notation to get the value that the src attribute of the image has been assigned:
>>> print image1["src"]
dionysus.jpg
>>> print image2["src"]
grapes.png
>>>
Even though the second image tag had a lot of extra HTML code associated with it, we could still
pull out the value of the image src without any trouble because of the way Beautiful Soup organizes
HTML tags into Tag objects.
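The same steps can be tried offline. This is only a sketch: the HTML string is a made-up, well-formed stand-in for the practice page (so it does not reproduce the stray closing tags shown above), the "html.parser" argument is an assumption, and the code uses Python 3's print() function.

```python
from bs4 import BeautifulSoup

# Made-up sample with two image tags
html_text = ('<html><body><img src="dionysus.jpg"/>'
             '<p>Name: Dionysus</p><img src="grapes.png"/></body></html>')
my_soup = BeautifulSoup(html_text, "html.parser")

# Each item returned by find_all() is a Tag; index it like a dictionary
image_tags = my_soup.find_all("img")
image_sources = [image["src"] for image in image_tags]
tag_names = [image.name for image in image_tags]

print(image_sources)
print(tag_names)
```

Looping over the list avoids having to know in advance how many images the page contains.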
In fact, if we only want to grab a particular tag, we can identify it by the corresponding name of the
Tag object in our soup:
>>> print mySoup.title
<title>Profile: Dionysus</title>
>>>
Notice how the HTML <title> tags have automatically been cleaned up by Beautiful Soup. Furthermore, if we want to extract only the string of text out of the <title> tags (without including the tags
themselves), we can use the string attribute stored by the title:
>>> print mySoup.title.string
Profile: Dionysus
>>>
We can even search for specific kinds of tags whose attributes match certain values. For instance, if
we wanted to find all of the <img> tags that had a src attribute equal to the value dionysus.jpg, we
could provide the following additional argument to the find_all() method:
>>> mySoup.find_all("img", src="dionysus.jpg")
[<img src="dionysus.jpg"/>]
>>>
In this case, the example is somewhat arbitrary since we only returned a list that contained a single
image tag, but we will use this technique in a later section in order to help us find a specific HTML
tag buried in a vast sea of other HTML tags.
Although Beautiful Soup is still used frequently today, the code is no longer being maintained and
updated by its creator. A similar toolkit, lxml, is somewhat trickier to get started using, but offers all
of the same functionality as Beautiful Soup and more. Once you are comfortable with the basics of
Beautiful Soup, you should move on to learning how to use lxml for more complicated HTML parsing
tasks.
NOTE: HTML parsers like Beautiful Soup can (and often do) save a lot of time and
effort when it comes to locating specific data in webpages. However, sometimes HTML
is so poorly written and disorganized that even a sophisticated parser like Beautiful Soup
doesn't really know how to interpret the HTML tags properly. In this case, you're often
left to your own devices (namely, find() and regex) to try to piece out the information
you need.
Review exercises:
1. Write a script that grabs the full HTML from the page profiles.html
2. Parse out a list of all the links on the page using Beautiful Soup by looking for HTML tags with
the name a and retrieving the value taken on by the href attribute of each tag
3. Get the HTML from each of the pages in the list by adding the full path to the file name, and
display the text (without HTML tags) on each page using Beautiful Soup's get_text() method
import mechanize
myBrowser = mechanize.Browser()
myBrowser.open("http://RealPython.com/practice/aphrodite.html")
We now have various information that the website returned to us stored in our mechanize browser as
a response which we can return by calling the response() method. This response also has various
methods that help us piece out information returned from the website:
>>> print myBrowser.response().geturl()
http://www.RealPython.com/practice/aphrodite.html
>>> print myBrowser.response().get_data()
<html>
<head>
<title>Profile: Aphrodite</title>
</head>
<body bgcolor="yellow">
<center>
<br><br>
<h2>Name: Aphrodite</h2>
<br><br>
<br><br>
<br><br>
</center>
</body>
</html>
>>>
Here we used the geturl() method to return the URL (i.e., the full address) of the webpage and
the get_data() method to return the actual HTML code in the same way that we used read() with
urllib2. We could then use Beautiful Soup or regular expressions to parse out the information we
want from the string returned by the get_data() method.
But what if we have to submit information to the website? For instance, what if the information we
want is behind a login page such as login.php (http://www.realpython.com/practice/login.php)? If
we are trying to do things automatically, then we will need a way to automate the login process as
well.
First, let's take a look at the HTML response provided by login.php:
import mechanize
myBrowser = mechanize.Browser()
myBrowser.open("http://RealPython.com/practice/login.php")
myResponse = myBrowser.response()
print myResponse.get_data()
This returns the following form (which you should take a look at in a regular browser as well to see
how it appears):
>>>
<html>
<head>
<title>Log In</title>
</head>
<body bgcolor="yellow">
<center>
<br><br>
<br><br>
</form>
</center>
</body>
</html>
>>>
The code we see is HTML, but the page itself is written in another language called PHP. In this case,
the PHP is creating the HTML that we see based on the information we provide. For instance, try
logging into the page with an incorrect username and password, and you will see that the same page
now includes a line of text to let you know: Wrong username or password! However, if you provide
the correct login information (username of zeus and password of ThunderDude), you will be
redirected to the profiles.html page.
For our purposes, the important section of HTML code is the login form, i.e., everything inside the
<form> tags. We can see that there is a submission <form> named login that includes two <input>
tags, one named user and the other named pwd. The third <input> is the actual Submit button.
Now that we know the underlying structure of the form, we can return to mechanize to automate the
login process.
import mechanize
myBrowser = mechanize.Browser()
myBrowser.open("http://RealPython.com/practice/login.php")
# select the login form and fill in its two fields
myBrowser.select_form("login")
myBrowser["user"] = "zeus"
myBrowser["pwd"] = "ThunderDude"
# submit the form
myResponse = myBrowser.submit()
trying to log in with many different usernames and passwords until they find a working
combination. Besides this being highly illegal, almost all websites these days (including
my practice form) will lock you out and report your IP address if they see you making
too many failed requests, so don't try it!
We were able to retrieve the webpage form by name because mechanize includes its own HTML
parser. We can use this parser through various browser methods as well to easily obtain other types
of HTML elements. The links() method will return all the links appearing on the browser's current
page as Link objects, which we can then loop over to obtain their addresses. For instance, if our
browser is still on the profiles.html page, we could say:
>>> for link in myBrowser.links():
        print "Address:", link.absolute_url
        print "Text:", link.text

Address: http://RealPython.com/practice/aphrodite.html
Text: Aphrodite
Address: http://RealPython.com/practice/poseidon.html
Text: Poseidon
Address: http://RealPython.com/practice/dionysus.html
Text: Dionysus
>>>
Each Link object has a number of attributes, including an absolute_url attribute that represents the
address of the webpage (i.e., the href value) and a text attribute that represents the actual text that
appears as a link on the webpage.
The mechanize browser provides many other methods to offer us the full functionality of a standard
web browser. For instance, the browser's back() method simply takes us back one page, similar to
hitting the "Back" button in an ordinary browser. We can also "click" on links in a page to follow
them using the browser's follow_link() method.
Let's follow each of the links on the profiles.html page using the browser's follow_link() and back()
methods, displaying the title of each webpage we visit by using the browser's title() method:
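import mechanize
myBrowser = mechanize.Browser()
myBrowser.open("http://RealPython.com/practice/profiles.html")
# visit each linked page, show its title, then return to the list of links
for nextLink in myBrowser.links():
    myBrowser.follow_link(nextLink)
    print "Page title:", myBrowser.title()
    myBrowser.back()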
Unfortunately, mechanize's HTML parser couldn't find the closing title tag for Dionysus because
of the errant forward slash at the end, and we ended up gathering the rest of the page. When one
parsing method fails us (in this case, the mechanize browser's title() method) because of poorly
written HTML, the best course of action is to turn to another parsing method - in this case, creating a
BeautifulSoup object and extracting the webpage's title from it. For instance:
import mechanize
from bs4 import BeautifulSoup
myBrowser = mechanize.Browser()
htmlPage = myBrowser.open("http://www.RealPython.com/practice/dionysus.html")
htmlText = htmlPage.get_data()
mySoup = BeautifulSoup(htmlText)
print mySoup.title.string
Notice that we didn't have to revisit using urllib2 since mechanize already provides the functionality
of opening webpages and retrieving their HTML for us.
Review exercises:
1. Use mechanize to provide the correct username "zeus" and password "ThunderDude" to the
login page submission form located at: http://RealPython.com/practice/login.php
2. Using Beautiful Soup, display the title of the current page to determine that you have been
redirected to profiles.html
3. Use mechanize to return to login.php by going back to the previous page
4. Provide an incorrect username and password to the login form, then search the HTML of the
returned webpage for the text Wrong username or password! to determine that the login
process failed
import mechanize
from bs4 import BeautifulSoup
myBrowser = mechanize.Browser()
htmlPage = myBrowser.open("http://finance.yahoo.com/q?s=yhoo")
htmlText = htmlPage.get_data()
mySoup = BeautifulSoup(htmlText)
# take the BeautifulSoup string out of the first (and only) <span> tag
myPrice = myTags[0].string
ally a more complex BeautifulSoup string that has other information associated with it, and Beautiful
Soup would try to display this other information as well if we had only said print myPrice.
Now, in order to repeatedly get the newest stock quote available, we'll need to create a loop that either
uses the reload() method or loads the page in the browser each time. But first, we should check the
Yahoo! Finance terms of use to make sure that this isn't in violation of their acceptable use policy.
The terms state that we should not "use the Yahoo! Finance Modules in a manner that exceeds
reasonable request volume [or] constitutes excessive or abusive usage", which seems reasonable enough.
Of course, "reasonable" and "excessive" are entirely subjective terms, but the general rules of Internet
etiquette suggest that you don't ask for more data than you need. Sometimes, the amount of data
you need for a particular use might still be considered excessive, but following this rule is a good
place to start.
In our case, an infinite loop that grabs stock quotes as quickly as possible is definitely more than we
need, especially since it appears that Yahoo! only updates its stock quotes once per minute. Since
we'll only be using this script to make a few webpage requests as a test, let's wait one minute in
between each request. We can pause the functioning of a script by passing a number of seconds to
the sleep() function of Python's time module, like so:
from time import sleep
sleep(5)
from time import sleep
import mechanize
from bs4 import BeautifulSoup
myBrowser = mechanize.Browser()
htmlPage = myBrowser.open("http://finance.yahoo.com/q?s=yhoo")
htmlText = htmlPage.get_data()
mySoup = BeautifulSoup(htmlText)
myPrice = myTags[0].string
sleep(60)
If we had originally loaded the page once outside of the loop so that myBrowser already pointed to
that URL, we also could have then called myBrowser.reload() instead of re-loading the page each
time in order to refresh the page. This can mainly be useful when you want a piece of code to refresh
a current page without having to worry about what the original URL of the page was.
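The general shape of a polite polling loop can be sketched without touching any website. Everything here is hypothetical: poll() and fake_quote() are illustrative names, the fixed price is invented, and the delay is shortened to a fraction of a second so the sketch finishes quickly (a real script would wait the full 60 seconds). The code uses Python 3's print() function.

```python
from time import sleep

def poll(fetch, number_of_requests, seconds_between_requests):
    """Call fetch() repeatedly, pausing between calls but not after the last one."""
    results = []
    for request_number in range(number_of_requests):
        results.append(fetch())
        if request_number < number_of_requests - 1:
            sleep(seconds_between_requests)
    return results

def fake_quote():
    """Hypothetical stand-in for loading a page and parsing out a price."""
    return "34.56"

quotes = poll(fake_quote, number_of_requests=3, seconds_between_requests=0.01)
print(quotes)
```

Skipping the pause after the final request is a small design choice that keeps the script from sitting idle for a minute once it already has everything it asked for.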
Review exercises:
1. Repeat the example in this section to scrape YHOO stock quotes, but additionally include the
current time of the quote as obtained from the Yahoo! Finance webpage; this time can be taken
from part of a string inside another span tag that appears shortly after the actual stock price
in the webpage's HTML
Chapter 15
Scientific computing and graphing
Use NumPy for matrix manipulation
If you are a scientist, an engineer, or the sort of person who couldn't survive a week without using
MATLAB, chances are high that you will want to make use of the NumPy and SciPy packages to
increase your Python coding abilities. Even if you don't fall into one of those categories, these tools
can still be quite useful. This section will introduce you to a Python package that lets you store and
manipulate matrices of data, and the next section will introduce an additional package that makes it
possible to visualize data through endless varieties of graphs and charts.
The main package for scientific computing in Python is NumPy; there are a number of additional
specialized packages, but most of these are based on the functionality offered by NumPy. To install
NumPy, download the latest version here for 32-bit Windows or here for 32-bit Mac, then run the
automated installer. If you have a 64-bit system (including OS X 10.6 or later), you'll need to install
an unofficial 64-bit release of SciPy (which includes NumPy), available here or here. Debian/Ubuntu
users can get NumPy by typing: sudo apt-get install python-numpy
Among many other possibilities, NumPy primarily offers an easy way to manipulate data stored in
many dimensions. For instance, we usually think of a two-dimensional list as a matrix or a table
that could be created by forming a list of lists in Python:
>>> matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> matrix[0][1]
2
>>>
Things get much messier, however, if we want to do anything more involved with this
two-dimensional list. For instance, what if we wanted to multiply every entry in our matrix by 2?
That would require looping over every entry in every list inside the main list.
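To make the cost concrete, here is a sketch of that double loop using plain lists. The variable names and the sample values are chosen for illustration only, and the code uses Python 3's print() function.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Build a new list of lists in which every entry has been doubled
doubled = []
for row in matrix:
    new_row = []
    for entry in row:
        new_row.append(entry * 2)
    doubled.append(new_row)

print(doubled)
```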
Let's create this same list using NumPy:
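A minimal sketch of what this might look like follows; the sample values are an assumption, and the code uses Python 3's print() function. It builds the array, doubles every entry with a single multiplication, and reads one entry with both styles of indexing.

```python
from numpy import array

matrix = array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(matrix)          # displayed as rows and columns
print(matrix * 2)      # every entry doubled, with no explicit loop
print(matrix[0][1])    # the usual list-style indexing
print(matrix[0, 1])    # one index per axis, separated by a comma
```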
Another benefit, as we can already see, is that NumPy automatically knows to display our
two-dimensional ("two axis") array in two dimensions so that we can easily read its contents. We also
have two options for accessing an entry, either by the usual indexing or by specifying the index
number for each axis, separated by commas.
Remember to include the main set of square brackets when creating a NumPy array; even though the
array() call has parentheses, including square brackets is necessary when you want to type out the array
entries directly. For instance, this is correct: matrix = array([[1,2],[3,4]])
Meanwhile, this would be INCORRECT because it is missing outer brackets: matrix = array([1,2],[3,4])
We have to type out the array this way because we could also have given a different input to create the
array() - for instance, we could have supplied a list that is already enclosed in its own set of square
brackets:
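The sketch below shows both points under stated assumptions: the name rows is made up, and the exact error message produced by the incorrect call may differ between NumPy versions. It uses Python 3's print() function.

```python
from numpy import array

# A list that already carries its own outer square brackets
rows = [[1, 2], [3, 4]]
matrix = array(rows)
print(matrix)

# Without the outer brackets, the second list is taken as a different
# argument to array() rather than as a second row, so the call fails
try:
    array([1, 2], [3, 4])
except (TypeError, ValueError) as error:
    print("Could not create the array:", error)
```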
NOTE: Even if you have no interest in using matrices for scientific computing, you still
might find it helpful at some point to store information in a NumPy array because of its
many helpful properties. For instance, perhaps you are designing a game and need an
easy way to store, view and manipulate a grid of values with rows and columns. Rather
than creating a list of lists or some other complicated structure, using a NumPy array is
a simple way to store your two-dimensional data.
Two matrices can also be stacked vertically using vstack() or horizontally using hstack() if their
axis sizes match:
[[1 2 3]
 [4 5 6]
 [7 8 9]
 [5 4 3]
 [7 6 5]
 [9 8 7]]
[[1 2 3 5 4 3]
 [4 5 6 7 6 5]
 [7 8 9 9 8 7]]
>>>
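For reference, output like the above can be produced with a sketch along these lines. The variable names first_matrix and second_matrix are assumptions, and the code uses Python 3's print() function.

```python
from numpy import array, vstack, hstack

first_matrix = array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
second_matrix = array([[5, 4, 3], [7, 6, 5], [9, 8, 7]])

# vstack() needs the same number of columns; hstack() the same number of rows
stacked_vertically = vstack([first_matrix, second_matrix])
stacked_horizontally = hstack([first_matrix, second_matrix])

print(stacked_vertically)
print(stacked_horizontally)
```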
For basic linear algebra purposes, a few of the most commonly used NumPy array properties are also
shown here briefly, as they are all fairly self-explanatory:
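The selection below is a sketch of properties and methods that are commonly reached for, not necessarily the same list as the original listing; the sample array is an assumption, and the code uses Python 3's print() function.

```python
from numpy import array

matrix = array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(matrix.shape)        # size along each axis
print(matrix.diagonal())   # entries on the main diagonal
print(matrix.flatten())    # all entries as a one-dimensional array
print(matrix.transpose())  # rows and columns swapped
print(matrix.min(), matrix.max(), matrix.mean(), matrix.sum())
```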
NOTE: If you're using Python 3.4, you already have some basic statistics functionality
built in by default. However, this won't help out much with more advanced matrix manipulation!
We can also reshape arrays with the reshape() function to shift entries around:
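A small sketch of reshaping, with made-up sample values and Python 3's print() function: the six entries stay in the same order, and only the way they are divided into rows changes.

```python
from numpy import array

matrix = array([[1, 2, 3], [4, 5, 6]])   # 2 rows, 3 columns
reshaped = matrix.reshape(3, 2)          # 3 rows, 2 columns

print(reshaped)
```

The new dimensions have to account for exactly the same number of entries; asking for a shape that does not fit raises a ValueError.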
[[[ 1  2]
  [ 3  4]]

 [[ 5  6]
  [ 7  8]]

 [[ 9 10]
  [11 12]]]
>>>
An easier and safer way to create this particular array would be to reshape() an arange():
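As a sketch of that approach (the name array_3d is an assumption, and the code uses Python 3's print() function), arange(1, 13) produces the numbers 1 through 12 and reshape() arranges them into three 2 x 2 blocks:

```python
from numpy import arange

# arange() stops before its second argument, so this yields 1 through 12
array_3d = arange(1, 13).reshape(3, 2, 2)

print(array_3d)
```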
Review exercises:
1. Create a 3 x 3 NumPy array named firstMatrix that includes the numbers 3 through 11 by using
arange() and reshape()
2. Display the minimum, maximum and mean of all entries in firstMatrix
3. Square every entry in firstMatrix using the ** operator, and save the results in an array named
secondMatrix
4. Use vstack() to stack firstMatrix on top of secondMatrix and save the results in an array
named thirdMatrix
5. Use dot() to calculate the dot product of thirdMatrix by firstMatrix
6. Reshape thirdMatrix into an array of dimensions 3 x 3 x 2
When plotting, we don't even have to specify the horizontal axis points; if we don't include any, matplotlib will assume that we want to graph our y values against a sequential x axis increasing by one:
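The default x values can be inspected directly. In this sketch the y values are made up, and selecting the "Agg" backend is an assumption that lets the code run without opening a window (on a desktop you would normally call plt.show() instead). It uses Python 3's print() function.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so no display is needed
from matplotlib import pyplot as plt

y_values = [1, 4, 9, 16, 25]
lines = plt.plot(y_values)   # no x values supplied

# Ask the plotted line which x values matplotlib filled in for us
x_values = [float(x) for x in lines[0].get_xdata()]
print(x_values)
```

Note that the automatic x axis starts counting at 0, just like list indices.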
There is an optional "formatting" argument that can be inserted into plot() after specifying the points
to be plotted. This argument specifies the color and style of lines or points to draw. Unfortunately,
the standard is borrowed from MATLAB and (compared to most Python) the formatting is not very
intuitive to read or remember. The default value is "solid blue line", which would be represented by
the format string "b-". If we wanted to plot green circular dots connected by solid lines instead, we
would use the format string "g-o" like so:
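A sketch of such a call follows, with made-up data points and the "Agg" backend assumed so that it runs without a display. The three characters of the format string end up as the line's color, line style and marker. The code uses Python 3's print() function.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so no display is needed
from matplotlib import pyplot as plt

x_values = [1, 2, 3, 4, 5]
y_values = [1, 4, 9, 16, 25]
lines = plt.plot(x_values, y_values, "g-o")

first_line = lines[0]
print(first_line.get_color(), first_line.get_linestyle(), first_line.get_marker())
```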
plt.show()
adding graph annotations is a very detailed topic that can quickly become case-specific. For one
short example, let's point out the expected average value on our histogram, complete with an arrow:
We can even include mathematical expressions in our text by using a writing style called TeX markup
language. This will be familiar to you if you've ever used LaTeX, although a brief introduction for use
in matplotlib can be found here. As a simple example, let's make our annotation a little more scientific
by adding the symbol μ with a "hat" over-line to show the predicted mean value:
Once we have a chart created, chances are that we'll want to be able to save it somewhere. It turns
out that this process is even easier than writing other kinds of files, because matplotlib allows us to
save PNG images, SVG images, PDF documents and PostScript files by simply specifying the type of
file to use and then calling the savefig() function. For instance, instead of displaying our histogram
on the screen, let's save it out as both a PNG image and a PDF file to our chapter 11 output folder:
from matplotlib import pyplot as plt
from numpy import random
plt.hist(random.randn(10000), 20)
plt.savefig(path+"histogram.png")
plt.savefig(path+"histogram.pdf")
NOTE: When using pyplot, if you want to both save a figure and display it on the screen,
make sure that you save it first before displaying it! Because show() pauses your code
and because closing the display window destroys the graph, trying to save the figure after
calling show() will only result in an empty file.
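The ordering can be sketched as follows. The temporary output folder and the "Agg" backend are assumptions made so that the code runs anywhere without a display; on a desktop you would call plt.show() where the final comment indicates. The code uses Python 3's print() function.

```python
import os
import tempfile
import matplotlib
matplotlib.use("Agg")  # draw off-screen so no display is needed
from matplotlib import pyplot as plt
from numpy import random

output_folder = tempfile.mkdtemp()   # hypothetical output folder
png_path = os.path.join(output_folder, "histogram.png")
pdf_path = os.path.join(output_folder, "histogram.pdf")

plt.hist(random.randn(10000), 20)

# Save first, while the figure still exists...
plt.savefig(png_path)
plt.savefig(pdf_path)
# ...and only then display it, e.g. with plt.show() on a desktop

print(os.path.getsize(png_path) > 0, os.path.getsize(pdf_path) > 0)
```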
Finally, when you're initially tweaking the layout and formatting of a particular graph, you might
want to change parts of the graph without re-running an entire script to re-display the graph.
Unfortunately, on some systems matplotlib doesn't work well with IDLE when it comes to creating an
interactive process, but there are a couple of options available if you do want this functionality. One
simple work-around is simply to save out a script specifically to create the graph, then continually
modify and rerun this script. Another possibility is to install the IPython package, which creates
an "interactive" version of Python that will allow you to work with a graphics window that's already
open. A simpler but less user-friendly solution is to run Python from the Windows command line or
Mac/Linux Terminal (see the Interlude: Install Packages section for instructions on how to do this).
In either case, you will then be able to turn on matplotlib's "interactive mode" for a given plot using
the ion() function, like so:
>>> from matplotlib import pyplot as plt
>>> plt.ion()
>>>
You can then create a plot as usual, without having to call the show() function to display it, and then
make changes to the plot while it is still being displayed by typing commands into the interactive
window.
Although we've covered the most commonly used basics for creating plots, the functionality offered
in matplotlib is incredibly extensive. If you have an idea in mind for a particular type of data
visualization, no matter how complex, the best way to get started is usually to browse the matplotlib gallery
for something that looks similar and then make the necessary modifications to the example code.
Review exercises:
1. Recreate all the graphs shown in this section by writing your own scripts without referring to
the provided code
2. It is a well-documented fact that the number of pirates in the world is correlated with a rise in
global temperatures. Write a script pirates.py that visually examines this relationship:
Read in the file pirates.csv from the Chapter 11 practice files folder.
Create a line graph of the average world temperature in degrees Celsius as a function of the
number of pirates in the world, i.e., graph Pirates along the x-axis and Temperature along the
y-axis.
Add a graph title and label your graph's axes.
Save the resulting graph out as a PNG image file.
Bonus: Label each point on the graph with the appropriate Year; you should do this programmatically by looping through the actual data points rather than specifying the individual
position of each annotation.
Chapter 16
Graphical User Interface
Add GUI elements with EasyGUI
We've made a few pretty pictures with matplotlib and manipulated some files, but otherwise we have
limited ourselves to programs that are generally invisible and occasionally spit out text. While this
might be good enough for most purposes, there are some programs that could really benefit from
letting the user point and click - say, a script you wrote to rename a folder's worth of files that
your technologically impaired friend now wants to use. For this, we need to design a graphical user
interface (referred to as a GUI and pronounced "gooey" - really).
When we talk about GUIs, we usually mean a full GUI application where everything about the program happens inside a window full of visual elements (as opposed to a text-based program). Designing good GUI applications can be incredibly difficult because there are so many moving parts
to manage; all the pieces of an application constantly have to be listening to each other and to the
user, and you have to keep updating all the visual elements so that the user only sees the most recent
version of everything.
Instead of diving right into the complicated world of making GUI applications, let's first add some
individual GUI elements to our code. That way, we can still improve the experience for the person
using our program without having to spend endless hours designing and coding it.
We'll start out in GUI programming with a module named EasyGUI. First, you'll need to download
this package:
Windows: If you have easy_install set up, then you can install the package by typing easy_install
easygui at a command prompt. Otherwise, download the compressed .zip file, unzip it, and install
the package by running Python on setup.py from the command prompt as described in the installation section.
OS X: If you have easy_install set up, then you can install the package by typing sudo easy_install
easygui into the Terminal. Otherwise, download the compressed .tar.gz file, decompress it, then
install the package by running Python on setup.py from Terminal as described in the installation
section.
Debian/Linux: sudo apt-get install python-easygui
EasyGUI is different from other GUI modules because it doesn't rely on events. Most GUI applications are event-driven, meaning that the flow of the program depends on actions taken by the user;
this is usually what makes GUI programming so complicated, because any object that might change
in response to the user has to listen for different events to occur. By contrast, EasyGUI is structured linearly like any function; at some point in our code, we display some visual element on the
screen, use it to take input from the user, then return that user's input to our code and proceed as
usual.
Let's start by importing the functionality from EasyGUI into the interactive window and displaying
a simple message box:
Notice that when we ran the previous code, clicking the button returned the value that was in the
button text back to the interactive window. In this case, returning the text of the button clicked by
the user wasn't that informative, but we can also provide more than one choice by using a buttonbox()
and providing a tuple of button values:
'Auuugh!'
>>>
Now we were able to tell that our user chose "Auuugh!" as the favorite color, and we could set that
return value equal to a variable to be used later in our code.
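As a minimal sketch of storing that return value in a variable: the helper name ask_favorite_color, its ask parameter, the question text, and the first two button labels are all assumptions made for illustration, not the book's own listing. The dialog function is passed in so that the logic can be tried without opening a window; by default it falls back to EasyGUI's buttonbox(), which takes a message, a title, and the choices.

```python
def ask_favorite_color(ask=None):
    """Show a button box and return the text of the button the user clicked."""
    if ask is None:
        # Assumes the third-party EasyGUI package is installed.
        from easygui import buttonbox
        ask = buttonbox
    choices = ("Blue", "Yellow", "Auuugh!")  # illustrative button labels
    # The dialog returns the label of whichever button was clicked.
    return ask("What is your favorite color?", "Choose wisely...", choices)
```

Calling favoriteColor = ask_favorite_color() with no argument opens the real dialog and keeps the clicked button's text for later use.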
There are a number of other ways to receive input from the user through EasyGUI to suit your needs.
For starters, try out the following lines a few times each and see what values they return based on the
different choices you select:
There is also a diropenbox() function for letting the user choose a folder rather than an individual
file; in this case, the optional third argument can tell the dialog box which directory to have open by
default.
Finally, there is a filesavebox() that works similarly to fileopenbox() except that it allows the user to
select a file to be saved rather than opened. This dialog box also confirms that the user wants to
overwrite the file if the chosen name is the same as a file that already exists. Again, no actual saving
of files is happening - that's still up to us to program once we receive the file name from EasyGUI.
In practice, one of the most difficult problems when it comes to letting the user select files is what to
do if the user cancels out of the window when your program needs a file to have been selected. One simple solution
is to display the dialog in a loop until the user finally does select a file, but that's not very nice - after
all, maybe your user had a change of heart and really doesn't want to run whatever code comes next.
Instead, you should plan to handle rejection gracefully. Depending on what exactly you're asking
of the user, most of the time you should use exit() to end the program without a fuss when the user
cancels. (If you're running the script in IDLE, exit() will also close the current interactive window.
It's very thorough.)
Let's get some practice with how to handle file dialogs by writing a simple, usable program. We will
guide the user (with GUI elements) through the process of opening a PDF file, rotating its pages in
some way, and then saving the rotated file as a new PDF:
    exit()

if outputFileName == None:

outputPDF = PdfFileWriter()

    page = inputFile.getPage(pageNum)
    page = page.rotateClockwise(int(degrees))
    outputPDF.addPage(page)

outputPDF.write(outputFile)
outputFile.close()
Besides the use of exit(), you should already be familiar with all the code in this script. The tricky
part is the logic - i.e., how to put all the different pieces together to create a seamless experience for
the user. For longer programs, it can be helpful to draw out diagrams by hand using boxes and arrows
to represent how we want the user to experience the different parts of our program; it's a good idea
to do this even before you start writing any code at all. Let's represent how this script works in an
outline form in terms of what we display to the user:
1. Let the user select an input file
2. If the user canceled the open file dialog (None was returned), exit the program
3. Let the user select a rotation amount
[No alternative choice here; the user must click a button]
4. Let the user select an output file
5. If the user tries to save a file with the same name as the input file:
Alert the user of the problem with a message box
Return to step 4
6. If the user canceled the save file dialog (None was returned), exit the program
The final steps are the hardest to plan out. After step 4, since we already know (from step 2) that the
input file isn't None, we can check whether the output file and input file match before checking for
the canceled dialog. Then, based on the return value from the dialog, we can check for whether or
not the user canceled the dialog box after the fact.
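The outline above can be sketched as a single function. This is only an illustration of the control flow, not the script itself: the function name choose_files and its four parameters are hypothetical stand-ins for the dialog calls (for instance, small wrappers around EasyGUI's fileopenbox(), buttonbox(), filesavebox() and msgbox()), and the function returns None where the script would call exit().

```python
def choose_files(open_dialog, rotation_dialog, save_dialog, alert):
    """Return (input_name, degrees, output_name), or None if the user cancels.

    Each argument is a callable standing in for one dialog box, so the
    logic can be exercised without any GUI on screen.
    """
    input_name = open_dialog()               # step 1
    if input_name is None:                   # step 2: open dialog canceled
        return None

    degrees = rotation_dialog()              # step 3: a button must be clicked

    output_name = save_dialog()              # step 4
    while output_name == input_name:         # step 5: same file as the input
        alert("Cannot overwrite the original file!")
        output_name = save_dialog()          # return to step 4
    if output_name is None:                  # step 6: save dialog canceled
        return None
    return input_name, degrees, output_name
```

Because the input name is known not to be None by the time the loop runs, a canceled save dialog never matches it, so the comparison can safely come before the cancel check.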
Review exercises:
1. Recreate the three different GUI elements pictured in this section by writing your own scripts
without referring to the provided code
2. Save each of the values returned from these GUI elements into new variables, then print each
of them
3. Test out indexbox(), choicebox(), multchoicebox(), enterbox(), passwordbox() and textbox()
to see what GUI elements they produce; you can use the help() function to read more
about each function in the interactive window - for instance, type import easygui then
help(easygui.indexbox)
GUI applications exist as windows, which are just the application boxes you're used to using everywhere, each one with its own title, minimize button, close button, and usually the ability to be resized.
Within a window we can have one or more frames that contain the actual content; the frames help to
separate the window's content into different sections. Frames, and all the different objects inside of
them (menus, text labels, buttons, etc.) are all called widgets.
Let's start with a window that only contains a single widget. In this case, we'll use a Label to show a
single string of text:
processing of running the GUI application. Although it can't do anything interesting yet, it does
function; the application can be resized, minimized, and closed.
If you run this script, a window should appear that will look something like one of the following,
depending on your operating system.
Windows:
Mac:
Ubuntu:
Again, for the sake of the majority, I'll stick to displaying only the resulting Windows application
version for the remainder of this section.
Usually, the layout of even a very basic complete GUI application will be much more complicated
than our last example. Although it may look intimidating when starting out, keep in mind that all
of the initial work is helpful to make it much easier once we want to add additional features to the
application. A more typical setup for a (still blank) GUI application might look something more like
this:
class App(Frame):
    def __init__(self, master):
        Frame.__init__(self, master)

window = Tk()

myApplication = App(window)
window = Tk()
window.title("Here's a window")

myFrame = Frame()
myFrame.pack()

labelText.pack()
window.mainloop()
The result is very similar to our first window, since the frame itself isn't a visible object:
When we created the label, we had to assign the label to the frame by passing the name of our frame
(myFrame) as the first argument of this Label widget. This is important to do because we're otherwise
packing the label into the window, and anything we do to modify the frame's formatting won't be
applied to widgets that don't specifically name the frame to which they belong. The frame is called
the parent widget of the label since the label is placed inside of it. This becomes especially important
window = Tk()
myFrame = Frame()
myFrame.pack()

labelText1.pack(fill=X)

labelRight.pack()
window.mainloop()
window = Tk()

myFrame = Frame()
myFrame.pack()

button1.place(x=100, y=150)

window.mainloop()
First we set a specific window size, which is 200 pixels wide and 200 pixels tall. We created button1
using Button() and placed it at the location (100, 150), which is halfway across the window and three-quarters of
the way down the window. We then placed button2 at the location (0, 0), which is just the upper left
corner of the window, and gave it a width of 100 pixels (half the width of the window) and a height
of 50 pixels (one-quarter the height of the window).
Other than specifying absolute placement and size (meaning that we give exact amounts of pixels to
use), we can instead provide relative placement and size of a widget. For instance, since frames are
window = Tk()

frameRight.place(relx=0.7, relwidth=0.3)

leftLabel.pack()

rightLabel.pack()
window.mainloop()
Although it offers more detailed control than pack() does, using place() to arrange widgets usually
isn't an ideal strategy, since it can be difficult to update the specific placements of everything if one
widget gets added or deleted. Beyond this, different screen resolutions can make a window appear
somewhat differently, making your careful placements less effective if you want to share the program
to be run on different computers.
The last and usually the easiest way to make clean, simple GUI layouts without too much hassle is
by using grid() to place widgets in a two-dimensional grid. To do this, we can imagine a grid with
numbered rows and columns, where we can then specify which cell (or cells) of our grid we want to
be taken up by each particular widget.
window = Tk()
myFrame = Frame()

labelTopLeft.grid(row=1, column=1)

labelBottomLeft.grid(row=2, column=1)

buttonBottomRight.grid(row=3, column=2)
We can assign values to the arguments padx and pady for any given widget, which will include extra
space around the widget horizontally and/or vertically. We can also assign values to the argument
sticky for each widget, which takes cardinal directions (like a compass) such as E or N+W; this will
tell the widget which side (or sides) it should stick to if there is extra room in that cell of the grid.
Let's look at these arguments in an example:
window = Tk()
myFrame = Frame()
The buttons we've created so far aren't very useful, since they don't yet do anything when we click
on them. However, we can specify a command argument to a button that will run a function of
our choice. For instance, let's create a simple function incrementButton() that takes the number
displayed on our button and increments it by one:
window = Tk()
def incrementButton():
    newNumber = 1 + myButton.cget("text")
    myButton.config(text=newNumber)

myButton.pack()
An entry works a little differently from labels and buttons in that the text in the entry box is blank at
first. If you want to add default text to display, it must be inserted after creating the entry using the
insert() method, which requires as its first argument a position in the text for inserting the new string.
For instance, after creating a new Entry with myEntry = Entry(), we would then say myEntry.insert(0,
"default text") to add the string "default text" to the entry box. In order to return text currently in
the entry box, we use the get() method instead of the usual cget() method. Both these concepts are
easier to see in an example:
window = Tk()
entry1 = Entry()
entry1.pack()

entry2 = Entry()
entry2.pack()

entry2.insert(0, myText)
def recalc():

        resultFar.config(text=farTemp)

        resultFar.config(text="invalid")

window = Tk()
window.title("Temperature converter")
frame = Frame()
frame.grid(padx=5, pady=5) # pad top and left of frame 5 pixels before grid

entryCel.grid(row=1, column=2)
entryCel.insert(0, 0)

resultFar = Label(frame)
resultFar.grid(row=2, column=2)
window = Tk()
frame = Frame()
frame.pack()

    fileName = tkFileDialog.askopenfilename(filetypes=typeList)

    labelFile.config(text=fileName)

labelFile = Label(frame)
labelFile.pack()

buttonOpen.pack()
separate modules!
We passed typeList into the argument filetypes of the tkFileDialog.askopenfilename() function in
order to provide a list of the different types of files that we want the user to be able to choose; these
are provided as a list of tuples, where the first item in each tuple is a description of the file type and
the second item in each tuple is the actual file extension.
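To make the shape of that list concrete, here is a small sketch. The two entries in typeList and the helper matches_type_list() are assumptions for illustration only; the helper uses the standard fnmatch module to show how a pattern such as *.py relates to a file name, which is roughly what the dialog does when it filters the files it displays.

```python
from fnmatch import fnmatch

# (description, pattern) pairs in the shape passed to the filetypes argument
typeList = [("Python scripts", "*.py"), ("Text files", "*.txt")]


def matches_type_list(file_name, type_list):
    """Return True if file_name matches any pattern in type_list."""
    return any(fnmatch(file_name, pattern) for _, pattern in type_list)
```

With this list, a name like hello.py is accepted while photo.png is not.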
Just like with EasyGUI, the tkFileDialog.askopenfilename() function returns the full name of the file,
which we can then set equal to our string fileName and pass into our label. If the user hit the Cancel
button instead, the function returns an empty value (typically an empty string rather than None) and our label becomes blank.
Likewise, there is a tkFileDialog.asksaveasfilename() function that takes the same arguments as tkFileDialog.askopenfilename() and works analogously to the filesavebox() of EasyGUI. We can also
pass a default extension type to asksaveasfilename() in order to append a file extension onto the
name of the provided file if the user didn't happen to provide one (although some operating systems
might ignore this argument):
window = Tk()
frame = Frame()
frame.pack()

    fileName = tkFileDialog.asksaveasfilename(filetypes=typeList,
                                              defaultextension=".py")

    labelFile.config(text=myText)

labelFile = Label(frame)
labelFile.pack()

buttonOpen.pack()
Review exercises:
1. Recreate the various windows pictured and described in this section, complete with all the
same GUI widgets and any necessary interaction, by writing your own scripts without referring
to the provided code
2. Using grid() to organize your widgets horizontally, create a button that, when clicked, takes on
the value entered into an entry box to its right
Chapter 17
Web applications
Create a simple web application
You know how to write useful Python scripts, and now you want to show them off to the world, but
how? Most non-programmers won't have any use for your .py script files. Programs like PyInstaller
and cx_Freeze help turn Python scripts into executable programs that can be run by themselves on
different platforms without the need to use Python to interpret the code. More and more, however,
we're seeing a trend away from desktop-based applications and toward web applications that can
be accessed and run through Internet browsers.
Historically, websites on the Internet were full of plain webpages that offered the exact same information to every user; you would request a page, and the information from that page would be displayed.
These webpages were static because their content never changed; a web server would simply respond to a user's request for a webpage by sending along that page, regardless of who the user was
or what other actions the user took.
Today, most websites are actually web applications, which offer dynamic webpages that can change
their content in any number of ways. For instance, a webmail application allows the user to interact
with it, displaying all sorts of different information, often while staying in a single webpage.
The idea behind creating a Python-driven web application is that you can use Python code to determine what content to show a user and what actions to take. The code is actually run by the web server
that hosts your website, so your user doesn't need to install anything to use your application; if the
user has a browser and an Internet connection, then everything else will be run online.
The task of getting Python code to run on a website is a complicated one, but there are a number of
different web frameworks available for Python that automatically take care of a lot of the details.
The first thing that you will need is a web hosting plan that allows and supports the ability to run
Python code. Since these usually cost money (and since not everyone even has a website), we'll stick
with a free alternative that is one of the simplest to set up: Google App Engine, which uses a web
framework called webapp2.
There are a number of other alternatives (both free and paid) that are more customizable, and you can
use webapp2 on its own later without relying on Google App Engine, but getting started with Google
App Engine will be the quickest and easiest way to begin learning about web application development
in Python.
NOTE: If you have any plan to go deeper into web development, don't bother with
Google App Engine; jump right to the more advanced web development courses. If you
only want to get a web application up with minimal time and effort, Google App Engine
is a solid option, but it won't teach you much of the web development skills that you'll
need when moving into more advanced topics.
First, go here to download and install the appropriate Python SDK for Google App Engine. SDK
stands for Software Development Kit. This particular SDK includes two main resources: a web
server application, which will allow you to run your web applications on your own computer without
actually putting them online, and the Google App Engine Launcher, which will help to put your web
applications online.
NOTE: Python 3 note: Unfortunately, Google App Engine only works with Python 2.7
and has no immediate plans to support Python 3 code.
Before we dive into writing a web application, let's get a very broad, generalized overview of what's
about to happen. There are a lot of different pieces involved, and they all have to communicate with
each other to function correctly:
1. First, your user makes a request for a particular webpage on your website (i.e., by typing a
URL into a browser).
2. This request gets received by the web server that hosts your website.
3. The web server uses App Engine to look at the configuration file for your application. App
Engine matches the user's request to a particular portion of your Python script.
4. This Python code is called up by App Engine. When your code runs, it writes out a response
webpage.
5. App Engine delivers this response back to your user through the web server.
6. The user can then view the web server's response (i.e., by displaying the resulting webpage in
a browser).
The application we're going to create will rely on a couple different files, so the first thing we need to
do is make a project folder to hold all of these files. Create a new folder named first_app anywhere
you like (just remember where it is). First we will write a very simple Python script that can respond
with the content of your webpage:
print "Content-Type: text/plain"
print ""
print "Congratulations, it's a web app!"
Save this code in a script named hello.py inside your first_app folder.
So what's with the first two print statements? Web servers communicate with users (usually
browsers) through HTTP by receiving HTTP requests and sending HTTP responses. The HTTP
response that our application sends can have both header lines and a body.
We added a header line to our HTTP response in the first line. Header lines contain optional information to let a browser know how to interpret the body of the response. In this case, setting our header's
Content-Type equal to the value text/plain is the way that our HTTP response lets a browser know
to expect the body to contain plain text as opposed to HTML code, an image, or some other type of
file. Leaving a blank line after this header line is how we told the browser, "the header lines are over
now; here comes the actual body to display."
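The header/blank-line/body layout can be sketched as plain string handling. The function build_response() below is a hypothetical helper written in Python 3 for illustration, not part of App Engine; it is also simplified, since a full HTTP response begins with a status line and separates lines with carriage return plus newline rather than a bare newline.

```python
def build_response(headers, body):
    """Join header lines, one blank separator line, and the body."""
    header_lines = ["{}: {}".format(name, value) for name, value in headers]
    # The empty string in the middle becomes the blank line that tells
    # the browser where the headers end and the body begins.
    return "\n".join(header_lines + ["", body])


response = build_response([("Content-Type", "text/plain")],
                          "Congratulations, it's a web app!")
print(response)
```

Splitting the result on newlines gives the header first, an empty string second, and the body last, which is the same layout that the three print statements produce.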
The body of the response is what we will actually see when we load the page in a browser. In this
case, it's just a simple string of text: "Congratulations, it's a web app!"
Before we can run our web application, we need to provide App Engine with a configuration file. This
is the file that the web server will use to get information about what Python code we want it to run.
Open up any text editor (or another script window in IDLE) and copy the following text into a new
file:
application: hello
version: 1
runtime: python27
api_version: 1
threadsafe: false
handlers:
- url: /.*
  script: hello.py
Now name this file app.yaml and save it in the same first_app folder as the Python script. Make
sure you get the spacing correct; the last line includes two leading spaces so that script lines up
under url. Just like Python, YAML files rely on precise indentation.
The YAML configuration file gives App Engine all the necessary information it needs to run the web
application. First, application: hello provides a unique identifying name for the application so that
we will be able to launch it later; we are giving the name hello to our web app.
The line version: 1 lets App Engine know that this is version 1 of our application. (If we update this
later to version: 2, App Engine will keep a copy of version 1 in memory so that we can go back to
running this previous version if necessary.)
The lines runtime: python27 and api_version: 1 let App Engine know that we want to use Python
2.7 to run our application. (There is currently only one version of the Python API offered by Google.)
The line threadsafe: false means that our web application isn't designed to be able to receive multiple requests at once; if App Engine has multiple requests then it will send them to our application
one at a time instead of all at once.
Finally we define our handlers to handle different webpage requests from our users (i.e., if a user
requested the main page at / or another page at a different address on our site). These requested
paths can each be assigned to a different piece of Python code. In this case, we only have one script,
hello.py, so we want to direct user requests for any page on the website to the same script. In these
last two lines of the configuration file, we say that any URL matching the regular expression /.*
(which is any URL on our site) should be directed to the hello.py script.
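As a rough model of how a list of handlers could direct requests (an illustration of the idea only, not App Engine's actual implementation), the sketch below checks each pattern in order and picks the first one that matches the whole path. The /about entry, the about.py script name, and the function pick_script() are hypothetical; it is written for Python 3, where re.fullmatch() is available.

```python
import re

# (pattern, script) pairs, checked from top to bottom
handlers = [
    (r"/about", "about.py"),   # hypothetical extra page
    (r"/.*", "hello.py"),      # catch-all, as in our configuration file
]


def pick_script(path):
    """Return the script for the first pattern that matches the whole path."""
    for pattern, script in handlers:
        if re.fullmatch(pattern, path):
            return script
    return None
```

Because /.* means "a slash followed by anything at all", every path on the site that reaches the last entry is sent to hello.py.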
Okay, now we can finally take a look at our application! It's not online yet, but we can view it by
running the application through our own local web server (that can't be accessed by other users)
using Google App Engine. This will help us simulate what things will look like to a user once our
application is online.
Open the Google App Engine Launcher program, then choose File -> Add Existing Application
You can then browse to and select your first_app folder that contains the web application. Add the
application using port 8080, then select the application in the main window and click the green Run
button.
Linux users: you will need to navigate in your Terminal to the directory just before the first_app
folder (i.e., its parent directory), then type the following command to launch the web application
(which will run on port 8080 by default): google_appengine/dev_appserver.py first_app/
The console window that appears will track lots of extra information, but your web application is up
and running once you see a blinking cursor.
The port number can be thought of as selecting a particular channel to use, similar to broadcasting
a television or radio channel. We chose to run the web application on port 8080, meaning that the
user can essentially "tune in" to this port number and receive communication from our web server.
(We could host a completely different web application using a different port number and the two
would not interfere with each other.)
Once the web application is running (this might take a little while), we can click Browse to view the
web application in the default web browser. This will open up the page at the URL localhost:8080
(which we can also type into a browser manually to load the web application). The web address
localhost is just a way of saying "the web address of my own computer" (since the application isn't
actually online yet). The :8080 specifies that we should listen for communication on port number
8080. If everything has been set up correctly, your browser should load a page that displays the plain
text: Congratulations, it's a web app!
NOTE: If you make any changes to your script, as long as Google App Engine Launcher
is still running your application, all you need to do in order to view the newest version
of the web application is to save the script and reload the webpage. App Engine will
automatically listen for any changes that might have been made.
Well, that's a start. As far as an application goes, however, the Python script involved was fairly
useless. In order to make something with a bit more potential, we need to create a special object in
our Python code called a WSGIApplication. WSGI stands for Web Server Gateway Interface and is
a way to allow Python to communicate with the web server in a better way than simply printing a
single chunk of information back as a response. Our new Python script, which still just displays the
same line of text, is considerably more complicated:
import webapp2

class MainPage(webapp2.RequestHandler):
    def get(self):
        self.response.headers["Content-Type"] = "text/plain"
self.response.headers["Content-Type"] = "text/plain"
self.response.write("Congratulations, it's a web app!")
Again, we have to write both a header line and a body. The header line gets packed into headers like
in a dictionary, setting the value of the Content-Type equal to text/plain. This time, we create the
body of the response using the write() method. WSGI takes care of separating the header lines from
the body, so this time there's no need to write() a blank line first.
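For comparison, here is what a bare WSGI application looks like without any framework. This is a sketch of the generic WSGI interface (a callable that receives the request environment and a start_response function), written for Python 3 where the body is a byte string; the name simple_app is an assumption, and this is not the webapp2 code used in this chapter. A framework like webapp2 builds a callable of this kind on our behalf.

```python
def simple_app(environ, start_response):
    """A minimal WSGI application that always sends the same plain text."""
    body = b"Congratulations, it's a web app!"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)   # status and headers are handed over first
    return [body]                       # the body follows as an iterable of bytes
```

Notice that the headers and the body travel separately, which is why no blank line has to be written by hand.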
We now have to update our YAML configuration file as well:
application: hello
version: 1
runtime: python27
api_version: 1
threadsafe: false
handlers:
- url: /.*
  script: hello.myApp
The only difference from the previous configuration file is the very last line, where we point the
script to use hello.myApp instead of hello.py; instead of the entire Python script, we now want the
web server to run the WSGIApplication object inside of our script named myApp, which we access
using the simple dot notation of hello.myApp. (There's no new file named hello.myApp - we're
just pointing our configuration file to the myApp object inside the hello.py script.)
If all goes well, you should now be able to save these two files, reload the webpage, and see the exact
same thing as before: Congratulations, it's a web app!
NOTE: If you make a mistake in your Python script, your web application might load
a horrendous error page that makes it look as if you broke the Internet for good. Don't
panic! Just look at the last line, which will point you to the specific error (usually a line
of your code displayed in red) that caused the entire chain of failure.
If you do end up with an error that your browser refuses to give you further advice about (instead
showing a 500 Internal Server Error, even when you've set your WSGIApplication to have debug=True), you
can try running the Python script itself in IDLE. Although it won't run (because Python won't be able
to find the webapp2 module), IDLE can still point out if your code has any syntax errors.
Review exercises:
1. Play around with your web application; make small changes to the script and see how App
Engine responds
2. Use your WSGIApplication to create a basic HTML page by changing the Content-Type header
value to text/html and then using write() multiple times to respond with lines of HTML
code; for instance, the first write() statement could be self.response.write("<html>") to begin
the HTML webpage
import webapp2

class MainPage(webapp2.RequestHandler):
    def get(self):
        self.response.headers["Content-Type"] = "text/html"
        self.response.write("""
          <html>

          <body>

          </form>
          </body>

          </html>""")

class Greeting(webapp2.RequestHandler):
    def post(self):
        username = self.request.get("myName")
        welcomeString = """<html><body>
          Hi there, {}!
          </body></html>""".format(username)
        self.response.headers["Content-Type"] = "text/html"
        self.response.write(welcomeString)
security hole. Sure, maybe you don't expect to have malicious users who are actively trying to break
your application, but never underestimate the potential for users to do unexpected things that cause
your application to break in unexpected ways.
For instance, maybe someone decides that an appropriate username to enter into our application is
<b>. Our /welcome webpage ends up displaying:
Hi there, !
Since we're inserting this text into HTML code, the <b> was interpreted as an HTML tag to begin
making text bold - so instead of greeting our user, we only change our exclamation point to be displayed bold. (You can imagine how this might present a security problem; any user can now insert HTML, including scripts, into the page that our application sends back to the browser.)
To avoid this, we can use Python's built-in cgi.escape() function, which converts the special HTML
characters <, >, and & into equivalent representations that can be displayed correctly. You will first
need to import cgi into your Python script to use this functionality. Then, when you get() the value
of myName from the user's request, you can convert any special HTML characters by instead saying:
username = cgi.escape(self.request.get("myName"))
With these changes, try re-running your web application and signing in with a username of <b> again.
You should now see it display the username back to you correctly:
Hi there, <b>!
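To see what the escaping actually produces, here is a short sketch that can be run outside of App Engine. It is written for Python 3, where html.escape() fills the role that cgi.escape() plays in this chapter's Python 2.7 code (cgi.escape() is no longer available in recent Python 3 releases); the function name greeting() is an assumption for illustration.

```python
from html import escape  # Python 3 counterpart of Python 2's cgi.escape


def greeting(username):
    """Build the welcome text with HTML special characters converted."""
    return "Hi there, {}!".format(escape(username))


print(greeting("<b>"))  # prints: Hi there, &lt;b&gt;!
```

The browser displays &lt; and &gt; as the literal characters < and >, so the user sees the name exactly as typed instead of having it treated as a tag.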
Okay, so we can now create one webpage that helps the user post data to another webpage interactively. What about using the get() method to make a single page interactive? Let's look at an example
with just a little bit more Python code behind it and revisit the temperature converter script that
we wrote way back in chapter 4 and again as a simple GUI application.
The webpage we want to display will have a simple text field where the user can input a temperature
in degrees Celsius. We will also include a "Convert" button to convert the user's supplied Celsius
temperature into degrees Fahrenheit. This converted result will be displayed on the next line and
will be updated whenever the user clicks the "Convert" button. The HTML for this page, with placeholders for the actual temperature values, will look like so:
<html>
<head><title>Temperature Converter</title></head>
<body>
  <form action="/" method="get">
    Celsius temperature: <input type="text" name="celTemp" value={}>
    <input type="submit" value="Convert"><br>
    Fahrenheit temperature: {}
  </form>
</body>
</html>
This time, our form uses the get method with a form action that points back to the main page
itself. In other words, when the user submits this form by clicking on the Convert button, instead
of sending a post request to a new webpage, the user will send a get request for the same page,
providing the page with some input data.
Just as we did before, we will want to put the temperature conversion into a function of its own. The
full code will look as follows:
import webapp2

def convertTemp(celTemp):
    """Convert a Celsius temperature to a Fahrenheit temperature."""
    if celTemp == "":
        return ""
    try:
        farTemp = float(celTemp) * 9 / 5 + 32
        return str(farTemp)
    except ValueError:  # the user entered a non-numeric temperature
        return "invalid input"

class MainPage(webapp2.RequestHandler):
    def get(self):
        celTemp = self.request.get("celTemp")
        farTemp = convertTemp(celTemp)
        self.response.headers["Content-Type"] = "text/html"
        self.response.write("""
          <html>
          <head><title>Temperature Converter</title></head>
          <body>
            <form action="/" method="get">
              Celsius temperature: <input type="text"
                                    name="celTemp" value={}>
              <input type="submit" value="Convert"><br>
              Fahrenheit temperature: {}
            </form>
          </body>
          </html>""".format(celTemp, farTemp))

myApp = webapp2.WSGIApplication([("/", MainPage)])

application: temperature-converter
version: 1
runtime: python27
api_version: 1
threadsafe: false

handlers:
- url: /.*
  script: temperature.myApp
We also updated the name of the application just to provide a descriptive name for what the application actually does. Even if you left App Engine running, this name will update automatically in
the Launcher. Notice how we used a dash but didn't use any capitalization in the name; application
names for Google App Engine can only include lower-case letters, digits and hyphens.
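If you want to check a candidate name before putting it into app.yaml, the character rule above can be sketched as a short regular expression. This is only an illustration: the function name looks_like_valid_app_name is hypothetical, and the check covers just the rule stated here (lower-case letters, digits and hyphens), not any other limits that App Engine may impose.

```python
import re

# Only the character rule stated above: lower-case letters, digits and hyphens
VALID_NAME = re.compile(r"[a-z0-9-]+\Z")

def looks_like_valid_app_name(name):
    """Return True if name uses only lower-case letters, digits and hyphens."""
    return VALID_NAME.match(name) is not None

for candidate in ["temperature-converter", "Temperature-Converter", "temp_converter"]:
    print("{}: {}".format(candidate, looks_like_valid_app_name(candidate)))
```

Only the first candidate passes; the second contains capital letters and the third contains an underscore.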
You should now be able to use your new web application, supplying temperatures and seeing the
converted result appear on the same webpage. Since we use a get request, we can also now see
the user-supplied data appear in the URL. In fact, you can even circumvent the form and provide
your own value for celTemp by supplying an appropriate address. For instance, try typing the URL
localhost:8080/?celTemp=30 directly into your browser and you will see the resulting temperature
conversion.
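To make the link between the URL and the converted value concrete, here is a small stand-alone sketch that pulls celTemp out of an address and converts it, without running App Engine at all. It is roughly the lookup that self.request.get("celTemp") performs for us, written with the standard library's URL-parsing functions. The names convert_temp and fahrenheit_for_url are hypothetical, and the conversion uses the usual formula F = C * 9 / 5 + 32.

```python
try:
    from urllib.parse import urlparse, parse_qs  # Python 3
except ImportError:
    from urlparse import urlparse, parse_qs      # Python 2.7

def convert_temp(cel_temp):
    """Convert a Celsius string to a Fahrenheit string."""
    if cel_temp == "":
        return ""
    try:
        return str(float(cel_temp) * 9 / 5 + 32)
    except ValueError:  # non-numeric input
        return "invalid input"

def fahrenheit_for_url(url):
    """Read the celTemp parameter from a URL and convert it."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each name to a list of values; use the first, or "" if absent
    cel_temp = query.get("celTemp", [""])[0]
    return convert_temp(cel_temp)

print(fahrenheit_for_url("http://localhost:8080/?celTemp=30"))   # 86.0
print(fahrenheit_for_url("http://localhost:8080/?celTemp=warm")) # invalid input
```

A URL with no celTemp parameter gives an empty string, which matches what the page shows before the user has entered anything.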
Review exercises:
1. Modify the "log in" web application example so that it only has a single main webpage that can
receive get requests from the user; instead of a "Sign In" button under the text field, make a
"Greet me!" button that, when clicked, reloads the page to greet the user by name (if a name
has been supplied) and display the greeting form again.
Chapter 18
Final Thoughts
Congratulations! You've made it to the beginning. You already know enough to do a lot of amazing
things with Python, but now the real fun starts: it's time to explore on your own!
The best way to learn is by solving real problems of your own. Sure, your code might not be very
pretty or efficient when you're just starting out, but it will be useful. If you don't think you have any
problems of the variety that Python could solve, pick a popular module that interests you and create
your own project around it.
Part of what makes Python so great is the community. Log in at the RealPython.com members forum
and help out other new Python students; the only way to know you've really mastered a concept is
when you can explain it to someone else.
If you're interested in web development, consider diving into the more advanced Real Python
courses at http://RealPython.com.
If you've made it this far using Python 2.7, you should consider getting a handle on Python 3.4 as
well. The changes are all fairly minimal, but eventually Python 3 will actually catch on and be used
with more frequency, so it's good to be prepared. For transitioning Python 2 code that you've already
written into Python 3, you should consider learning to use the built-in 2to3 program.
When you feel ready, consider helping out with an open-source project on GitHub. If puzzles are
more your style, try working through some of the mathematical challenges on Project Euler or the
series of riddles at Python Challenge. You can also sign up for Udacity's free CS101 course to learn
how to build a basic search engine using Python, although you know most of the Python concepts
covered there already!
If you get stuck somewhere along the way, I guarantee that someone else has encountered (and potentially solved) the exact same problem before; search around for answers, particularly at Stack
Overflow, or find a community of Pythonistas willing to help you out.
If all else fails, import this and take a moment to meditate on that which is Python.
Chapter 19
Appendix A: Installing Python
Windows 7
Start by downloading Python 2.7.6 from the official Python website. The Windows version is distributed as an MSI package. Once downloaded, double-click the file to install. By default this will
install Python to C:\Python27.
You also need to add Python to your PATH environment variable, so that when you want to run a
Python script you do not have to type the full path each and every time, which is quite tedious.
Since you downloaded Python version 2.7.6, you need to add the following directories to your
PATH:
C:\Python27\
C:\Python27\Scripts\
C:\PYTHON27\DLLs\
C:\PYTHON27\LIB\
One way to do this is to run the following command in PowerShell:
[Environment]::SetEnvironmentVariable("Path",
"$env:Path;C:\Python27\;C:\Python27\Scripts\;C:\PYTHON27\DLLs\;C:\PYTHON27\LIB\;",
"User")
That's it. To make sure Python was installed correctly, open your command prompt and
type python to load the Shell:
Video
Watch the video here for assistance.
Mac OS X
All Mac OS X versions since 10.4 come with Python pre-installed. You can view the version by opening the terminal and typing python to enter the shell. The output will look like this:
Linux
If you are using Ubuntu, Linux Mint, or another Debian-based system, enter the following command
in your terminal:
$
$
$
$
Once installed, fire up the terminal and type python to get to the shell:
Chapter 20
Acknowledgements
This book would not have been possible without the help and support of so many friends and colleagues.
For providing valuable advice and candid feedback, I would like to thank Brian, Peter, Anna, Doug,
and especially Sofia, who by now has probably read this material more times than I have. Thanks as
well to Josh for taking the time to share his valuable experience and insights.
A special thanks to the Python Software Foundation for allowing me to graffiti their logo.
Finally, my deepest thanks to all of my Kickstarter backers who took a chance on this project. I
never expected to gather such a large group of helpful, encouraging people; I truly believe that my
Kickstarter project webpage might be one of the nicest gatherings of people that the Internet has ever
experienced.
I hope that all of you will continue to be active in the community, asking questions and sharing tips
in the member forum. Your feedback has already shaped this course and will continue to help me
make improvements in future editions, so I look forward to hearing from all of you.
This book would never have existed without your generous support:
Benjamin Bangsberg|JT|Romer Magsino|Daniel J Hall|John Mattaliano|Jordan DJ Rebirth
Jacobs|Al Grimsley|Ralf Huelsmann, Germany|Amanda Pingel Ramsay|Edser|Andrew Steve
Abrams|Diego Somarribas B.|John McGrath|Zaw Mai Tangbau|Florian Petrikovics|Victor Pera
(Zadar, Croatia)|[email protected]|Daniel R. Lucas|Matthew C, Duda|Kenneth|Helena|Jason
Kaplan|Barry Jones|Steven Kolln|Marek Rewers|Andrey Zhukov|Dave Schlicher|Sue Anne
Teo|Chris Forrence|Toby Gallo|Jakob Campbell|Christian DisOrd3r Johansson|Steve Walsh|Joost
Romanus|Jozsef Tschosie Kovacs|Back Kuo|James Anselm|Christian Gerbrandt|Mike Stoops|Michael
A Lundsveen|David R. Bissonnette, Jr.|Geoff Mason|Joao da Silveira|Jason Ian Smith|Anders
Kring|Ruddi Oliver Bodholdt Dal|edgley|Richard Japenga|Jake|Ken Harney|Brandon Hall|B.
Chao|Chinmay Bajikar|Clint LeClair|Davin Reid- Montanaro|Isaac Yung|Espen Torseth|Thomas
Hogan|Nick Poenn|Eric Vogel|Jack Salisbury|James Rank|Jamie Pierson|Christine Paluch|Peter
Laws|Ken Hurst|Patrick Papent Tennant|Anshu Prabhat|Kevin Wilkinson|Joshua Hunsberger|Nicholas Johnson|Max Woerner Chase|Justin Hanssen| [email protected]|James
Edward Johnson|Griffin Jones|Bob Byroad|Hagen Dias|Jerin Mathew|Jasper Blacketer|Jonathan
Burkhart|Nicholas A. DeLateur|Ben Hamilton|Cole Mercer|Dougie Nix|Shaun Walker|Olof Bengtson|Marek Belski|Chris Cannon|Bob Putnam|Jeff McMorris|Timothy Phillips|Rodolfo F. Guzman|Joe Burgwin|Andreas Borgstrom|Philip Ratzsch|Kostas Sarafidis|R. Arteaga|fullzero|Petros
Ring|Harold Garron|Thomas Gears Butler|Neil Broadley|JPao|Aviel Mak|Kjeld Jensen|I P
Freely|Arturo Goicochea Hoefken|Leo Sutedja|Cameron Kellett|Werner Beytel|Muhammad
Khan|Jason Trogdon|Dao Tran|Thomas Juberg|Andy Nguyen|Petr Tomicek|Erik Rex|Stephen
Meriwether|Benjamin Klasmer|Derick Schmidt|Kyle Thomas|R.Nana|Arpan Bhowmik|Jacob
A. Thias|Elliot|Isaiah St. Pierre|Josh Milo Drchuncks|Dr. Sam N Jaxsam|Matthew M. McKee|Kyle Simmons|Jason Nell|Darcy Townsend|Jesse Houghton|Evan D.|Marcel Arnold|Thomas
Siemens|C Hodgson|Adrien D.|Bjrn Spongsveen|Jemma|Ed Matthews|Nik Mohd Imran
- Malaysia|Jason Weber|JTR Haldr|Matthew Ringman|Yoshio Goto|Evan Gary Hecht|Eric
L.|Hannes Hauer|Robert Mais|Highland Dryside Rusnovs|Michael Torres|Mike Alchus|Jonathan
L. Martin|Oliver Graf|David Anspaugh|Joe Griesmer|Garrett Dunham|Srujan Kotikela|Laurel
Richards|Lovelesh|Sarah Guermond|Brian Canada, PhD, Assistant Professor of Computational Science, University of South Carolina Beaufort|Shao|Antti Kankaanpaa|Carl F. Corneil|Laird_Dave|Nyholm
Family|Brandon Graham|M. A. Merrell|Kyle Frazier|PT_SD|Travis Corwith|Elliot Jobe|A R
Collins|rjan Smme|Jay B Robertson|Jim Matt|Christopher Holabird|Ronny Varghese|Claudia
Foley|Andrew Simeon|D G|Jay V. Schindler|Douglas Butler AKA SherpaDoug|Jon Perez|Pieter
Maes|Gabriel McCann|John Holbrook|Melissa Cloud|Inukai|Henning Fjellheim|Jason Tsang|Juliovn|Reagan
Short|Carlos Clavero Aymerich|Vaughan Allan|James Dougherty|Miles Johnson|Shirwin|Thomas
Taimre|Michael Urhausen Cody C. Foster|Christoph G.W. Gertzen|Mag|Matt Monach|Tabor|Ashwin.Manthena|Lan
Tully|NoVuS ReNoVaTiO|Joshua Broussard|Laurence Livermore|Rob Flavell|Fabian und Holger
Winklbauer|Adriano Lo conte|Decio Valeri|Stephen Ewell|Erik Zolan|Dharm Kapadia|Esteban Barrios|Mehul|Thomas Fauser|Nathan Pinck|Grant P|Gary|Jonathan Moisan|David Warshow|Erica
Zahka|Frederik Laursen|Piotr Napiorkowski|Chris Webster|James Kobulyar|Cobalt Halley|Dewey
Kang|Fall|Susan Engle|David Garber|Rebecka Wallebom|Pai Siwamutita|Joel Gerstein|Brant
Bagnall|Mr. Backer 7|Cole Smith|Gary Androphy|Keith L Hudson|Anthony Jackman|Regis
LUTTER|Charles Jurica|Jose Gleiser|Mike Henderson|Khalid M AlNuaim|Dan CidonaBoy
Murphy|BrianF84|Gunnar Wehrhahn|Marc Herzog|Leon Duckham|Justin S.|DC|Kit Santa
Ana|Tom Pritchard|Hamilton Chapman|Taylor Daniels|Andrew Wood|Tomer Frid|Peter B.
Goode|John Ford|Otto Ho|LCT|WinDragon68|Faber M.|Douglas Davies|Jacob Mylet|Niels Gamsgaard Frederiksen|Mark Greene|Rob Davarnia|Alex|Zabrina W.|William Ochetski Hellas|Jose I.
Rey|Dustin T. Petersen|A Nemeth|Praveen Tipirneni|Derek Etheridge|J.W. Tam|Andrei George
Selembo|Leo Evans|Sandu Bogi Nasse|Christopher J Ruth|Erin Thomas|Matt Pham|KMFS|Todd
Fritz|Brandon R. Sargent|boo|Lord Sharvfish Thargon the Destructor|Kylie Shadow Stafford|Edd
Almond|Stanley|Brandon Dawalt|Sebastian Baum|F. Iqbal|Mungo Hossenfeffer|Zubair Waqar|Matt
Russell|Sam Lau|Jean-Pierre Buntinx|James Giannikos|Chris Kimball|Happy|Nathan Craike|[email protected]|Asa
Ansari|J. Rugged|Stephanie Johnston|Shunji|Mohammad Ammar Khan|John-Eric Dufour|Brad
Keen|Ricardo Torres de Acha|Denis Mittakarin|Jeffrey de Lange|Stewart C. Annis|Nicholas
Goguen|Vipul Shekhawat|Daniel Hutchison|@lobstahcrushah|Bjoern Goretzki|Hans de Wolf|Ray
Barron|Garrett Jordan|Benjamin Lebsanft|Alessandro A. Minali|carlopezzi|Patrick Yang|Kieran
Fung|Niloc|David Duncan|Tom Naughton|Barry G.-A. Wilson|Dave Clarke|Shawn Xu|Kevin D.
Dienst|Durk Diggler|Marcus Friesl|Krisztina J.|V. Darre|Duane Sibilly, II|Marije Pierson|Anco
van Voskuilen|Joey van den Berg|Gil Nahmani|Stephen Yip|Richard Heying|Patrick Thornton|Ali
AlOhali|Eric LeBlanc|Clifton Neff|Steve Bofferbrauer Weidig|Jacob Hanshaw|daedahl|Lee
man (@mlitman)|Peter Ung|Fargo Jillson|Patrick Yevsukov|Dee Bender|Batbean|Aris MIchalopoulos|Threemoons|Christopher Yap|Tim Williams|O. Johnson|David N Buggs|Myles Kelvin|M.
Leiper|Brogan Zumwalt|Roy Nielsen|Jaen Kaiser|Joe C.|Emily Chen|Bryan Powell|SayRoar.com
(Film & TV)|Alan Hatakeyama|Chris Pacheco|Alex Galick|p a d z e r o|Juri Palokas|Gregg
Lamb|Lani Tamanaha Broadbent|Ami noMiko|Aaron Harvey|Angel Sicairos|Shiloh N.|Katherine
Allaway|AlamoCityPhoto.Com|John Laudun|Greg Reyes|Jagmas|Dan Allen (QLD, Australia)|Dustin
Niehoff|Ag|Scott M.|Esben Mlgaard|Ian McShane|Timothy Gibson|Que|Janice Whitson|Babur
Habib|Brent Cappello|Meep Meep|Justin G.|Stuart Woods|Ryan Cook|Mike R.|John Surdakowski|Ehren Cheung|powerful hallucinogens|Robert Drake|Steve Rose|Trenshin|Meeghan
Appleman|Hanning Butch Cordes!|Jose de Jesus Medina Rios|Guadalupe Ramirez|Andrew
Willerding|NathAn Talbot (MB)|Jorge Esteban|bchan84|[email protected]|Krish|Vaughn Valencia|Jeromy hogg|Jorge Hidalgo G.|zombieturtle|Mors|Rick D|Rob Schutt|Wee Lin|incenseroute.com|IceTiger|Jam
E. Baird Hulusi Onur Kuzucu|TheHodge|Yannick Scheiwiller|Robin arvidsson|Oliver Diestel|Daniel Knell|Elise Lepple|Frank Elias|Shaun Budding|Shane Williams|Chin Kun Ning|Eike
Steinig|Hogg Koh|AaronMFJ|John P. Doran|M Blaskie|Eric Harry Brisson|Chris Krispy89
Guerin|Duck Dodgers|Jonathan Coon|Sally Jo Cunningham|Joe Desiderio|Anon|Mike Sison|Shane
Higgins|Russell Wee|Gabriel B.|Thomas Sebastian Jensen|Amy Pearse|James Finley|Mikhail
Ushanov|James AE Wilson|Michael Dussert|Felix Varjord Soderstrom|Eric Metelka|Stephen Harland|muuusiiik|Shandra Iannucci|Joe Racey|Cook66|Nicholas R. Aufdemorte|Justin Marsten|Barrett|Zachary
Smith|mctouch|Donald D. Parker|Rob Silva|Phillip Vanderpool|David Herrera|Otto|Roland
Santos|Peter the Magnificent|Brandon B.|Brett Russell|Joe Carter|Andrew Paulus|Peter Harris|Brian Kaisner|Stefan Gobel|Melissa Beher|Jesse B - PotatoHandle|pwendelboe|Matthias
Gindele|Andy Smith-Thomas|Elizabeth S.|Erez Simon|Andrew Cook|Wouter van der Aa|Iain Cadman|Kyle Bishop|Andrew Rosen|Alessandro juju Faraoni|GeekDoc|Arran Stirton|AMCD|Eddie
Nguyen|Steve Stout|Richard Bolla|John Hendrix III|Pallav Laskar|Scrivener|Bobbinfickle|Vijay
Shankar V|Zach Peskin|Mark Neeley|oswinsim|Joe Briguglio|Stacy E. Braxton|Alan L Chan|Markus
Fromherz|Jim Otto|Neil Ewington|Sarah New|Harish Gowda|Eva Vanamee|Peter Knepley|RVPARKREVIEWS.COM
Wheel|Eric Gutierrez|Jeff Wilden|Dave Coleman|Brian S Beckham|Bill Long|Jeremy Kirsch|Tim Erwich|Ryan Valle|John Palmer|Rick Skubic|Vincent Parker|David Von Derau|Jonathan Wang|Chris
Stolte|Thomas Boettge|Jochen Schmiedbauer|Dirk Wiggins|David Recor|Joshua Hepburn|Pelayo
Rey|Jabben Stick|Amit paz|Rob B.|Art?rs Nar?ickis|Merrick Royball|Jerome amos|Soba|Varian
Hebert|Geoff A.|Dave Strange|Roy Arvenas|Ryan S.|Suresh Kumar|Stefan Nygren|John H|Justin A.
Aronson|Dave C|Keegan Willoughby|Martin-Patrick Molloy|David Hay|Jeff Beck|Sean Callaghan|Greg
Tuballes|Mark Filley|Somashekar Reddy|Jorge Palacio|Glen Andre Than|Garrett N.|Garry
Bowlin|Sathish K|Lucas A. Defelipe|Michael Roberts|Norman Sue|Tommaso Antonini|Herbert|Frank
Ahrens|Uberevan|Andy Schofield|Amir Prizant|Bennett A. Blot|Rob Udovich|Holli Broadfoot|Ray Lee Ramirez|Jeffrey Danziger|Kevin Rhodes|Brendon Van Heyzen|Jeff Simon|Jamie
E. Browne|Vote Obama 2012 !|Wel|n33tfr33k|J. Phillip Stewart|dham|Ove Kristensen|Phillip
Lin|Steve Paulo|Jerry Fu|Chris Wray|Daniel Schwan|Sean Cantellay|Azmi Abu Noor|Lucas
Van Deun|The Mutilator|Isaac Amaya|chandradeep|B. Graves|Benji|Leonard Chan|James
Smeltzer|George Ioakimedes|Andrew Keh|Bobby T. Varghese|Sir Nathan Reed Chiurco|Christian
Nissler|Ethan Johns|David Strakovsky|Leslie Harbaugh|AdamGross|Darren Koh|Matt Palmer|Michael
Anthony Sena|Blade Olson|Larry Sequino|jeremy levitan|Rahul Shah|Mike Schulze|Smallbee|Mark
Johanson|MS|Pat Ferate|Dennis Chan|Matthew D Johnson|Jefferson Tan|Eric G|Jordan Tucker|Steffen
Reinhart|Benjamin Rea|Brendan Burke|Oppa Gangam Style|P B Dietsche|Daniel Lauer|Jon