Head First Python
THIRD EDITION
With Early Release ebooks, you get books in their earliest form—the
author’s raw and unedited content as they write—so you can take
advantage of these technologies long before the official release of these
titles.
Paul Barry
Head First Python
by Paul Barry
Copyright © 2023 Paul Barry. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North,
Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales
promotional use. Online editions are also available for most titles
(http://oreilly.com). For more information, contact our
corporate/institutional sales department: 800-998-9938 or
[email protected].
Installing on Windows
The wonderful Python folk at Microsoft work hard to ensure the most-
recent release of Python is always available to you via the Windows Store
application. Open the Store, search for “Python”, select the most-recent
version, then click the Get button. Watch patiently while the progress
indicator moves from zero to 100% then, once the install completes, move
to the next page – you’re ready.
Installing on macOS
The latest Macs ship with older, out-of-date releases of Python. Don’t use
these. Instead, head over to Python’s home on the web,
https://www.python.org/, then click on the “Downloads” option. The latest
release of Python 3 should begin to download, as the Python site is smart
enough to spot you’re connecting from a Mac. Once the download
completes, run the installer that’s waiting for you in your Downloads folder.
Click the Next button until there are no more Next buttons to click then,
when the install is complete, move to the next page – you’re ready.
NOTE
There’s no need to remove the older pre-installed releases of Python which come
with your Mac. This install will supersede them.
Installing on Linux
The Head First Coders are a rag-tag team of techies whose job is to keep
the Head First Authors on the straight and narrow (no mean feat). The
coders love Linux and the Ubuntu distribution, so that’s discussed here.
It should come as no surprise that the latest Ubuntu comes with Python 3
installed and up-to-date. If this is the case, cool, you’re all set. If you are
using a Linux distribution other than Ubuntu, use your system’s package
manager to install Python 3.10 (or later) into your Linux system. Once done,
move to the next page – you’re ready.
Let’s complete your install with two things: a required back-end
dependency, as well as a modern, Python-aware text editor.
NOTE
Don’t worry, you’ll learn all about what this is used for soon!
NOTE
There are alternatives to VS Code, but – in our view – VS Code is hard to beat when it
comes to this book’s material. And, no, we are *not* part of some global conspiracy to
promote Microsoft products!!
Pick the download which matches your environment, then wait for the
download to complete. Follow the instructions from the site to install VS
Code, then flip the page to learn how to complete your VS Code setup.
NOTE
On the Mac, start with the “Code” menu.
Until you become familiar with VS Code, you may wish to configure your
editor to match the settings preferred by the Head First Coders. Here are
the settings used in this book:
Add 2 required extensions to VS Code
Whereas you are not obliged to copy the recommended editor setup, you
absolutely have to install two VS Code extensions, namely Python and
Jupyter.
When you are done adjusting your preferred editor settings, close the
Settings tab by clicking the X. Then, to search for, select, and install
extensions, click on the Extensions icon to the left of the main VS Code
screen:
VS Code’s Python support is state-of-the-art
Installing the Python and Jupyter extensions actually results in a few
additional VS Code extension installations, as shown here:
These additional extensions enhance VS Code’s support for Python and
Jupyter over and above what’s included in the standard extensions.
Although you don’t need to know what these extra extensions do (for now),
know this: They help turn VS Code into a supercharged Python editor.
GEEK NOTE
NOTE
Yes, this is a Geek Note about Geek Notes (and we’ll not be having any recursion
jokes, thank you).
Why Python?: Similar But Different
Your cursor is blinking in that empty code cell. Go ahead and type in
the first three lines of code from the card deck example from the start of
this chapter. Here’s the first three lines of code again:
Your notebook is waiting for more code. Let’s get in a little bit of
practice using VS Code by adding the following code to your notebook:
You were asked to add more code to your notebook. Here’s what your
notebook should look like now:
You’ve yet to invoke draw.
Cell #3 defines your function but doesn’t invoke it. This explains why cell
#3 (as well as the other two) show no output: The definition of variables
produces no output (cell #1) nor does the importation of a library (cell #2).
TEST DRIVE
NOTE
Do this now!
THERE ARE NO DUMB QUESTIONS
Q: The output from the draw function looks a little strange. What’s
the deal with those parens?
A: Technically, the draw function is returning a tuple, which is an
immutable data structure built into Python. Don’t worry what all this
means for now, as you’ll be learning lots about how Python works with
data structures later in this book. And, yes, the output from the draw
function could look more human-friendly, but at this stage in this book
we’re not interested in making your output look nice. Rather we’re
concentrating on showing you running Python code.
Q: “.ipynb” as a file extension? Kinda awkward, isn’t it?
A: It stands for “Interactive PYthon NoteBook”, which is the format
used by Jupyter to store your notebooks. Despite the weird filename
extension, notebooks are cool in that they are text files based on a
standard JSON format. You can treat an ipynb file like any other text
file. In fact, if you know someone who can run a Jupyter notebook, you
can share your ipynb file with them (perhaps including it as an email
attachment?). Upon receipt, they can load your notebook into their
Jupyter and work with their copy of your notebook as needed.
Q: What’s the significance of those code cell numbers?
A: They are there mainly as a convenience, in that they are a visual clue
as to what order your code cells executed in. They have nothing to do
with Python. As such, and going forward, we’ll only show the cell
numbers when it makes sense to do so, as our goal is to get you to
concentrate on the Python we’re showing you, not the ins-and-outs of
VS Code and Jupyter. For now, if you understand why you have to
press Shift+Enter, then you’re good to go.
Q: I just opened my Cards.ipynb file in VS Code and it’s saved
everything, including all the output! Shouldn’t it just save my code?
A: No. Jupyter’s format saves all the information from your notebook,
including any generated output. This is why notebooks are saved as
JSON. You can control what gets saved, so if you don’t want the output
saved, you don’t have to save it. Having said that, code saved without
output is still saved as JSON, so things will look weird if you open an
ipynb file in an editor which doesn’t understand Jupyter, as you’ll see
the raw notebook JSON (which can look a tad intimidating).
Q: I have a Data Scientist friend and when I showed them my VS
Code setup eyes were raised, and I was asked why I’m not running
Jupyter inside my browser, like everyone else? What’s the story?
A: Yes, our Data Scientist friends also love to run their notebooks in the
browser-based Jupyter Notebook and Jupyter Lab environments (and
we like both of those tools, too, BTW). However, we feel running
notebooks within VS Code is a “better fit” to the way programmers’
brains are wired. As you work through this book and gain more
experience with VS Code, you’ll see that the editor also has lots more
to offer over what we’ve shown you so far.
Q: Can I use my VS Code produced notebooks with other Jupyter
tools, for instance, inside a web browser?
A: Yes. If the tool you want to use understands the Jupyter notebook
format, it can use anything produced by VS Code, as VS Code
notebooks are 100% compatible with Jupyter’s JSON standard.
Q: Can I use VS Code to manage my GIT stuff?
A: Yes, but getting into how you do that is beyond the scope of Head
First Python, so you won’t see us using GIT here. Of course, that’s not
to say we don’t use GIT to manage the code in our projects: We do! Oh,
BTW, if you’re looking for an excellent GIT primer, check out Head
First GIT.
Q: Why the funny spelling of Jupyter? Isn’t the planet spelled with
an “i”?
A: Yes, the planet is spelled with an “i”, but the tool is not named after
the planet. Jupyter is named after the three programming languages it
initially supported, namely Julia, Python, and R. That’s why the “py”
letters are included in the name: That’s a reference to Python’s preferred
filename extension for code files, which is .py.
Q: [Coughs] Emmm… Is R a real programming language?
A: Oh, come on, now, let’s not go there. (This is a Python book, after
all.)
NOTE
To create a new notebook in VS Code, select File, then New File... from the menu.
Choose the third option to create a new, untitled notebook. Perform a File, Save to
change the untitled name to “WhyPython.ipynb”.
Yes. Every… single…word.
We’re only joking.
The goal for this chapter is to arm you with enough know-how to
confidently answer the question: Why Python? To do that, you’ll be
introduced to Python language features on the pages which follow, albeit
from a high level.
But, don’t worry: You’ll be returning to all of these features in detail later in
this book. For now, concentrate on understanding the gist of what you’re
seeing.
With your new notebook ready in VS Code, get ready to dig in!
The PSL represents a large body of tested code which you don’t
have to write, just use. As the PSL has existed for decades now, the
modules it contains have been tested to destruction by legions of
Python programmers all over the globe. Consequently, you can use
PSL modules with confidence.
Let’s see two modules from your recent Who Does What? exercise in
action. In your WhyPython notebook, type the code below into code
cells, remembering to press Shift+Enter to execute the cells one-at-a-
time. First up is a bit of randomness. Let’s face it, everyone loves
random numbers, and the PSL’s random module makes generating
them super easy:
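The listing which belongs here isn’t reproduced, so treat what follows as a minimal sketch rather than the book’s exact code. It assumes you want a random whole number and a random pick from a list; the suits list is a hypothetical stand-in:

```python
import random

# A hypothetical list to pick from (an assumption, not the original data).
suits = ["Clubs", "Spades", "Hearts", "Diamonds"]

# randint returns a whole number between the two values, inclusive.
roll = random.randint(1, 6)

# choice returns one randomly selected item from a sequence.
pick = random.choice(suits)

print(roll, pick)
```

Run the cell a few times and you should see the values change from run to run.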
What’s in a name?
Lots, when it comes to Python, which is not named after a type of
snake! Instead, Python is named in honor of Monty Python’s Flying
Circus, a classic British comedy TV series and movie franchise
featuring John Cleese, Michael Palin, Graham Chapman, Terry
Gilliam, Terry Jones, and Eric Idle. Who knew? For more on the origin
of Python’s name, see:
https://docs.python.org/3/faq/general.html#why-is-it-called-python.
You met a list earlier in the card deck code. Recall the suit variable:
TEST DRIVE
Lists are really useful for lots of reasons, but mostly due to the fact they
can mutate: As your code runs, lists can shrink and grow as needed. If
what you are trying to model with your data does not require mutability,
you may wish to consider using a tuple which – keeping things simple,
for now – can be thought of as a list which cannot mutate.
On the surface, using a tuple instead of a list doesn’t look all that
different.
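Here’s a small sketch of the difference, using made-up values (the variable names are assumptions for illustration only). The list happily grows, whereas the tuple has no way to change:

```python
# A list uses square brackets and can grow or shrink.
suits_list = ["Clubs", "Spades", "Hearts"]
suits_list.append("Diamonds")

# A tuple uses parentheses and cannot be changed once created.
suits_tuple = ("Clubs", "Spades", "Hearts", "Diamonds")

try:
    suits_tuple.append("Jokers")   # tuples have no append method
except AttributeError as err:
    print("Tuples can't mutate:", err)

print(suits_list)
print(suits_tuple)
```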
TEST DRIVE
Lists and tuples are all over Python, but everybody’s favorite built-in
data structure is the dictionary (dict), which is a mapping data
structure associating keys with values. Here’s a simple example which
associates a handful of student names with alphabetic grades.
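As a sketch only, here’s roughly what such a dictionary could look like. The names and grades are invented for illustration and are not the book’s data:

```python
# Hypothetical student names (keys) and grades (values).
grades = {"Ada": "A", "Grace": "B", "Linus": "C"}

# Look up a value using its key.
print(grades["Grace"])

# Add a new key/value pairing, then update an existing one.
grades["Guido"] = "A"
grades["Linus"] = "B"

print(grades)
```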
TEST DRIVE
The last of The Big 4 is the set, which is just like the sets you learned
about in Math class. When a bunch of objects are assigned to a set,
duplicates are removed, and it’s this characteristic which most Python
programmers exploit. Of course, sets can do so much more.
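To see the duplicate-removal behavior for yourself, here’s a minimal sketch. The votes list is a made-up example:

```python
# A hypothetical list containing some repeated values.
votes = ["red", "blue", "red", "green", "blue", "red"]

# Converting the list to a set throws away the duplicates.
unique_votes = set(votes)

print(len(votes), len(unique_votes))
print(unique_votes)
```

Note that a set makes no promise about the order of its items, so your printed set may be ordered differently each time you run the cell.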
Python has powerful built-in operators
Like other programming languages, Python comes with a large collection of
operators. There’s the usual suspects, such as ==, >, !=, <, and so on. But,
Python has a few extras which can be incredibly useful, especially when
combined with the built-in data types/structures.
One such operator is in, which performs membership testing. Let’s see in at
work against some of the variables from earlier in this chapter.
The in operator knows all about the built-in data types/structures. The print
BIF, once again, can be put to good use to help you see the in operator
doing its stuff. As you run these five calls to print in your notebook, note
the absence of any loop code:
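The original five calls aren’t reproduced here, so this is a sketch which assumes four small example variables (one for each of the big 4) rather than the variables from your notebook:

```python
suits = ["Clubs", "Spades", "Hearts", "Diamonds"]   # a list
point = (3, 4)                                      # a tuple
grades = {"Ada": "A", "Grace": "B"}                 # a dictionary
vowels = {"a", "e", "i", "o", "u"}                  # a set

print("Hearts" in suits)   # True
print(5 in point)          # False
print("Ada" in grades)     # True: dictionaries test their keys
print("A" in grades)       # False: "A" is a value, not a key
print("e" in vowels)       # True
```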
THERE ARE NO DUMB QUESTIONS
Q: The suit variable started out as a list, then you assigned a tuple
to it, which surely changed its type, right? Why didn’t Python
complain?
A: On the surface it looks like suit’s type changed, but it didn’t. The
type of the object suit refers to changed. Think of variables in Python
as object references. As a variable in Python can refer to any object of
any type, it follows that the type of the object a variable refers to can
change as your program runs. This is what is meant by “dynamic
typing”, in that the type your variable refers to is bound at run-time, not
compile-time. Some programmers view such an arrangement as evil at
work. Python programmers do not share this view. To keep things
straight in your head, remember that Python variables are object
references which refer to values which themselves have type. There’s a
double look-up here: First, Python looks up the variable’s name then,
second, Python accesses the object the name refers to. (This might
sound absurd, but it works surprisingly well).
Q: Surely there’s more to the big 4 built-in data structures than
what’s presented in those four Test Drives?
A: Of course there is! In fact, knowing when to effectively use the
correct built-in data structure is what often separates the better Python
programmers from the pack. You’ll see lots of uses of every one of the
big 4 in this book and, by the time you’re done, you’ll be wielding all
four of them like a real pro.
Q: What if none of the big 4 built-in data structures fit my needs?
Can I create my own?
A: Yes, you can. But, let’s not get ahead of ourselves here. When the
time comes for you to craft a custom data structure as a 100% perfect fit
for your application needs, we’ll walk you through the process. All
we’ll say now is it’s a class act!
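If the object-reference idea from the first answer above still feels abstract, this small sketch may help. It assumes suit starts out referring to a list, then gets reassigned to a tuple (the values are made up):

```python
suit = ["Clubs", "Spades"]    # suit refers to a list object
first_type = type(suit)
print(first_type)

suit = ("Clubs", "Spades")    # now suit refers to a tuple object
second_type = type(suit)
print(second_type)
```

The name suit never changed, only the object it refers to did.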
Pretty much, yes.
If you think the PSL is cool, just wait until you learn about PyPI. Flip the
page for a quick intro.
NOTE
Who can forget the “combo mambo”?
The PSL has “The Big 4” data structures: lists, tuples, dictionaries,
and sets.
Python has powerful built-in operators (like “in”).
NOTE
Just think of all the code included in 4 through 8 that you don’t have to write,
maintain, nor test!
NOTE
Don’t underestimate the importance of this last one.
The Opening Crossword
Shift+Enter Execute the current code cell, then move the focus
to the next cell (creating a new empty cell when at
the bottom of the notebook).
Ctrl+Enter Execute the current code cell, but don’t move the
focus.
Esc then A Insert a new empty cell above the current cell.
Move the focus to the new cell.
Esc then B Insert a new empty cell below the current cell.
Move the focus to the new cell.
Esc then V Paste a previously copied (or cut) cell below the
currently focused cell.
Esc then X Cut the currently focused cell from the notebook.
NOTE
Just as well, as we asked you to take your scissors to what’s on the flip-side!
Chapter 1. Diving in: Hit the
Ground Running!
Cubicle Conversation
Ava: OK, folks, let’s offer some suggestions on how best to process this
data file.
Juan: I guess there are two parts to this, right?
Matt: How so?
Juan: Well, firstly, I think there’s some useful data embedded in the
filename, so that needs to be processed. And, secondly, there’s the timing
data in the file itself, which needs to be extracted, converted, and processed,
too.
Ava: What do you mean by “converted”?
Matt: That was my question, too.
Juan: A value like “1:27.95” represents, I’d imagine, one minute, 27
seconds, and 95 one-hundredths of a second. That needs to be taken into
consideration when working with these values, especially when calculating
averages. So, some sort of value conversion is needed here. Remember, too,
that the data in the file is textual.
Matt: I’ll add “conversion” to the to-do list.
Ava: And I guess the filename needs to be somehow broken apart to get at
the swimmer’s details?
Juan: Yes. The “Darius-13-100m-Fly” part can be broken apart on the “-”
character, giving us the swimmer’s name (Darius), their age group (under
13), the distance (100m), and the swimming stroke (Fly).
Matt: That’s assuming we can read the filename?
Juan: Isn’t that a given?
Ava: Not really, so we’ll still have to code for it, although I’m pretty sure
the PSL can help here.
Matt: This is getting a little complex…
Juan: Not if we take things bit-by-bit.
Ava: We just need a plan of action.
Matt: If we’re going to do all this work in Python, we’ll also have a bit
more learning to do.
Juan: I can recommend a great book…
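To make Juan’s point about conversion concrete, here’s one possible sketch. It assumes every value has the minutes:seconds.hundredths shape of “1:27.95”, which may not hold for every time in the file, and the variable names are ours:

```python
time_text = "1:27.95"   # the textual value from the conversation

# Break the text into minutes and the remaining "seconds.hundredths".
minutes, rest = time_text.split(":")
seconds, hundredths = rest.split(".")

# Convert each part to a whole number, then total everything in hundredths.
total_hundredths = (int(minutes) * 60 * 100) + (int(seconds) * 100) + int(hundredths)

print(total_hundredths)   # 8795
```

Working in whole hundredths of a second keeps the arithmetic simple when it’s time to calculate averages.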
SHARPEN YOUR PENCIL
From the conversation on the last page, it looks like there are two main
tasks identified at this stage: (1) extract data from the filename, and (2)
process the swim times data in the file.
Grab your pencil and, for each of the identified tasks, write down what
you think are the required sub-tasks for both (in the spaces provided).
Our lists of sub-tasks can be found over the page.
Extract data from the file’s name
___________________________________________
___________________________________________
___________________________________________
___________________________________________
___________________________________________
Process the data in the file
___________________________________________
___________________________________________
___________________________________________
___________________________________________
___________________________________________
___________________________________________
___________________________________________
___________________________________________
SHARPEN YOUR PENCIL SOLUTION
From the recent conversation, it looks like there are two main tasks
identified at this stage: (1) extract data from the filename, and (2)
process the swim times data in the file.
You were to grab your pencil and, for each of the identified tasks, write
down what you thought the required sub-tasks are for both (in the
spaces provided). Here’s what we came up with. How did you do?
Extract data from the file’s name
GEEK NOTE
It’s important to note that Python never requires you to perform the
double-lookup, as it’s all handled automatically for you whenever you
use a variable.
And, although Python provides an id BIF which, given a variable name,
returns the variable’s object reference (i.e., its memory address), you
should never use nor rely on the value returned. Remembering to not
use the id BIF is really, really important. Let Python handle memory for
you, as you’ve more than enough to worry about trying to write the
code you need to build your application.
The print dir invocation produced a big list, but you only need to
worry about half of it.
You can safely ignore all of the methods which start and end with the
double underscore character, such as __add__ and __ne__. These are
this object’s “magic methods” and they do serve a purpose, but it’s far
too early in your Python journey to worry about what they do and how
you can use them. Instead, concentrate on the rest of the methods on
this list.
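If you’d rather let Python hide the dunders for you, here’s a sketch of one way to do it. It assumes fn holds the filename discussed earlier, and the names and regular variables are ours:

```python
fn = "Darius-13-100m-Fly.txt"

# dir returns a list of the names (as strings) attached to the object.
names = dir(fn)

# Keep only the names which are not dunders.
regular = []
for name in names:
    if not name.startswith("__"):
        regular.append(name)

print(regular)
```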
THERE ARE NO DUMB QUESTIONS
Q: If those double-underscore methods are not important, why are
they on the list returned by dir?
A: It’s not that they aren’t important, it’s more a case that you don’t
need to concern yourself with what they do at this stage. Trust us, when
you need to understand what the double-underscore methods do, we’ll
tell you. Pinky-promise.
Q: Is there a way I can learn more about what a particular method
does?
A: Yes, and we’ll show you how in a page or two. What’s cool is that
using Jupyter makes this an especially easy thing to do.
Q: It’s all a bit of a mouthful, all this double-underscore stuff, isn’t
it?
A: Yes, it is. Most Python programmers shorten “double underscore
add double underscore” to simply “dunder add”. So, if you hear
someone refer to a method as “dunder exit”, what they are actually
referring to is __exit__. All of these (as a group) are called “the
dunders”. Further, any method which starts with a single-underscore is
known as a “wonder” (and – yes – it is a perfectly acceptable reaction to
groan at all of this).
SHARPEN YOUR PENCIL
Let’s try out two of the methods provided with strings. Take each of the
lines of code shown below and enter them into a new, empty code cell.
Execute each then – using a pencil – make a note (in the space
provided) of what you think each method does.
fn.upper() ___________________
fn.lower() ___________________
No problem. Great question, by the way.
This is Python’s dot operator, which allows you to invoke a method on an
object. This means fn.upper() calls the upper method on the string
referenced by the fn variable.
This is a little different to the BIFs which are invoked like functions. For
instance, len(fn) returns the size of the object referred to by the fn
variable.
It’s an error to invoke fn.len() (as there’s no such method), just as it’s an
error to try upper(fn) (as there’s no such BIF).
Think of things this way: The methods are object-specific, whereas the
BIFs provide generic functionality which can be applied to objects of any
type.
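Here’s a short sketch of the method-versus-BIF distinction, assuming fn holds the filename string used throughout this chapter. The two failing calls are wrapped in try/except so the cell keeps running:

```python
fn = "Darius-13-100m-Fly.txt"

print(fn.upper())   # a method, invoked with the dot operator
print(len(fn))      # a BIF, invoked like a function

try:
    fn.len()        # strings have no len method
except AttributeError as err:
    print("AttributeError:", err)

try:
    upper(fn)       # there is no upper BIF
except NameError as err:
    print("NameError:", err)
```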
SHARPEN YOUR PENCIL SOLUTION
You were asked to try out two of the methods provided. You were to
take each of the lines of code shown below and enter them into a new,
empty code cell. You were then to execute each, then – using a pencil –
make a note (in the space provided) of what you thought each method
did. Here’s what we think happens here:
Yes, that’s right.
The values returned by the upper and lower methods are both new string
objects, which have a value-part and a methods-part, as well as unique
object references (or IDs, if you prefer).
This is all by design: Python is supposed to work this way.
Returning to the diagram we used earlier to introduce object references,
here’s what it looks like as a result of the code from the Sharpen executing:
The three string objects are in the table of objects, with each assigned their
own object reference. Sadly, this state of affairs doesn’t last very long. As
the two most-recent string objects aren’t assigned to a variable name, there
are no actual references to them. The next time Python’s memory
management technology runs, the unreferenced objects are garbage
collected, and effectively disappear. Cue the sad music…
If you flip back one page and re-read the split method’s documentation,
you’ll learn the default behavior is to split on whitespace (e.g., space,
tab, newline, carriage-return, formfeed, or vertical-tab). This is not what
you want here, as you want to break the fn string apart on the “-”
character.
Let’s try again, this time specifying “-” as the delimiter. Doing so is
easy, as all you do is pass the dash character as an argument to the split
method call:
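As a sketch (assuming fn holds the full filename, extension included), compare the default behavior with the dash-delimited version:

```python
fn = "Darius-13-100m-Fly.txt"

default_split = fn.split()     # no whitespace in fn, so nothing is broken apart
dash_split = fn.split("-")     # break the string apart on the dash instead

print(default_split)   # ['Darius-13-100m-Fly.txt']
print(dash_split)      # ['Darius', '13', '100m', 'Fly.txt']
```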
You did read the “split” method’s documentation, didn’t you? The
answer’s right there…
But doing so would be premature. Take a closer look at the list produced by
your call to the split method:
Emmm, maybe…
Let’s spend a moment or two with split to ensure you understand how it
works its magic.
EXERCISE
Of course, this line of code failed, which is a bummer because the idea was
sound, in that you want to split your string twice in an attempt to break the
strings “Fly” and “txt” apart. But, look at the error message you’re
getting:
Yes, that’s exactly what’s happening.
The first split works fine, breaking the string object using “-”, producing a
list. This list is then passed onto the next method in the chain which is also
split. The trouble is lists do not have a split method, so trying to invoke
split on a list makes no sense, resulting in Python throwing its hands up in
the air with an AttributeError.
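The failing line isn’t reproduced here, so this sketch assumes a chain along the lines of “split on the dash, then split on the dot”. The try/except is only there so you can see the message without your cell crashing:

```python
fn = "Darius-13-100m-Fly.txt"

try:
    fn.split("-").split(".")   # the second split is invoked on a list
except AttributeError as err:
    message = str(err)         # keep the message for printing
    print("AttributeError:", message)
```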
But… now you know this, how do you fix it?
You’re trying to get rid of that “.txt” bit at the end of the original
string. Here’s the list of string methods from earlier. Do any of these
method names jump out at you?
Follow along in VS Code while you test the rstrip method against
some test values similar to those you’ll encounter.
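Here’s a sketch using two test values. The first is this chapter’s filename; the second is a hypothetical filename chosen to expose a catch: rstrip treats its argument as a set of characters to remove, not as a suffix:

```python
fly = "Darius-13-100m-Fly.txt".rstrip(".txt")
breast = "Darius-13-100m-Breast.txt".rstrip(".txt")

print(fly)      # Darius-13-100m-Fly
print(breast)   # Darius-13-100m-Breas (the final "t" of "Breast" is stripped, too)
```

That missing “t” is the reason rstrip is not the right tool for removing a file extension.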
As with the rstrip method, ask the help BIF for details on what
removesuffix does:
Let’s take this method for a spin in your notebook.
TEST DRIVE
As when you tested rstrip, let’s throw some test data at the
removesuffix method, too:
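A sketch with three test values follows (the second and third are hypothetical). Note that removesuffix is available in Python 3.9 and later:

```python
fly = "Darius-13-100m-Fly.txt".removesuffix(".txt")
breast = "Darius-13-100m-Breast.txt".removesuffix(".txt")
no_suffix = "Darius-13-100m-Fly".removesuffix(".txt")

print(fly)         # Darius-13-100m-Fly
print(breast)      # Darius-13-100m-Breast
print(no_suffix)   # Darius-13-100m-Fly (nothing to remove, so unchanged)
```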
Now that you’ve identified the method you need, you can create the
method chain needed to extract the data you need from the fn string
object:
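One plausible form of that chain, offered as a sketch rather than the definitive listing, removes the extension first and then splits on the dash:

```python
fn = "Darius-13-100m-Fly.txt"

result = fn.removesuffix(".txt").split("-")

print(result)   # ['Darius', '13', '100m', 'Fly']
```

The order matters: removesuffix returns a string (which has a split method), whereas split returns a list (which has no removesuffix method).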
THERE ARE NO DUMB QUESTIONS
Q: Can method chains be of any length?
A: Yes. Although you do need to think about code readability. The
examples seen thus far have chained two methods together, which is not
hard to get your head around. But imagine if a programmer decides to
chain a dozen methods together? Python lets the programmer do this,
but you’ll likely pull your hair out trying to decipher what such a large
chain does… or maybe you’ll want to hunt down the programmer so
you can pull their hair (not that we’re condoning such behavior… it’s
just that we understand the urge).
Q: When I run the code in this chapter the original string in my fn
variable never changes. What if I want to apply the changes to the
original string. Can I do this?
A: The short answer is no. The longer answer is also no, in that you
cannot change a string object in Python once it’s defined. Strings in
Python are immutable (they cannot mutate or change once they exist).
This behavior is by design, and sometimes strikes programmers from
other languages as strange. Python treats strings as a basic data type,
like numbers. Numbers are also immutable. Once 42 exists in your code
it cannot be mutated nor changed. It’s always 42. Same thing with
strings. If a string has the value “Marvin”, it’ll remain that value until
one of two things happens: (1) your code terminates or (2) the Universe
ends.
Q: Is the fact that strings are immutable not a huge disadvantage?
A: Not really. Knowing that strings can never change frees you from
having to worry about a whole host of nasty side-effects.
Q: Let me get this straight: When I assign the string “Galaxy” to a
variable called place, I can never change place’s value later in my
code due to strings being immutable?!?
A: No, that’s not what this means, as it’s not the same thing. The string
“Galaxy” once defined can never be changed. However, when the
string is assigned to your place variable, the string’s object reference is
assigned to place, not the actual string “Galaxy”. It’s perfectly legal,
later in your code, to assign a different object reference to place (after
all, that’s what variables are in Python: somewhere to store an object
reference). But, the string object which contains “Galaxy” can never
change. The string object “Galaxy” and the variable name place are
two different things.
Q: So, what happens to a string (“Galaxy” for example) which is
assigned to a variable in some code then, later, the variable is
assigned some other value? Does “Galaxy” just hang around?
A: Doesn’t every galaxy just hang around? [Apologies to all the
Astronomers reading this]. Joking aside, once any previously created
object in Python gets to the stage where it is no longer referred to by
any variable, it is garbage collected by Python’s memory management
system. You don’t need to do anything to make this happen, as Python
takes care of all the details.
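The answers above about immutability and reassignment can be tried out with a sketch like this one. It uses the “Galaxy” string and the place variable from the questions; shouted is a name we made up:

```python
place = "Galaxy"

try:
    place[0] = "g"          # attempt to change the string in place
except TypeError as err:
    print("TypeError:", err)

shouted = place.upper()     # a brand-new string object
print(place, shouted)       # "Galaxy" itself is unchanged

place = "Universe"          # legal: place now refers to a different object
print(place)
```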
Your last line of code produced a list with the four values you need, but
how do you assign each of these four values to individual variables?
Indeed they do.
When working with lists, it is possible to use the familiar square bracket
notation. And, as in most other programming languages, Python starts
counting from zero, so [0] refers to the first element in the list, [1] the
second, [2] the third, and so on.
Let’s put this new-found list knowledge to immediate use.
TEST DRIVE
NOTE
Don’t forget to follow along!
The line of code from the last page produces a list of four data values:
We were all set to begin celebrating getting to this point, but it looks like
someone has a question…
The parts variable feels kinda integral.
That said, we get where Nina’s coming from, in that parts is created to
temporarily hold the list of data items, which is then combined with the
square bracket notation to extract the individual data items. Once that’s
done, the parts list is no longer needed.
But, can you do without parts?
NOTE
So… if the “parts” variable is not needed, does this mean it’s spare? (Sorry).
The list is then assigned to the parts variable name, allowing you to use
the square bracket notation to access the data you need:
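A sketch of this approach follows. The variable names swimmer, age, distance, and stroke are assumptions based on the data each value represents:

```python
fn = "Darius-13-100m-Fly.txt"

parts = fn.removesuffix(".txt").split("-")

swimmer = parts[0]    # the first element lives at index zero
age = parts[1]
distance = parts[2]
stroke = parts[3]

print(swimmer, age, distance, stroke)
```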
As the split method produces a list, you could do what’s shown below to
achieve the same thing as what’s shown above, removing the parts
variable from the code:
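Here’s a sketch of what doing without parts might look like, again assuming the variable names swimmer, age, distance, and stroke:

```python
fn = "Darius-13-100m-Fly.txt"

# Index straight into the list returned by each method chain.
swimmer = fn.removesuffix(".txt").split("-")[0]
age = fn.removesuffix(".txt").split("-")[1]
distance = fn.removesuffix(".txt").split("-")[2]
stroke = fn.removesuffix(".txt").split("-")[3]

print(swimmer, age, distance, stroke)
```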
Although the parts variable is no more, can you think of a reason why
this version of your code may not be optimal?
Grab your pencil and see if you can fill in the blanks below. Based on
what you now know about multiple assignment (unpacking), provide
the individual lines of code which assign the correct unpacked values to
the individual variable names. Provide the printed output, too.
SHARPEN YOUR PENCIL SOLUTION
You were to grab your pencil and see if you could fill in the blanks.
Based on what you knew about multiple assignment (unpacking), you
were to provide the individual lines of code which assign the correct
unpacked values to the individual variable names. You were to provide
the printed output, too. Here’s our code, below. Is your code the same?
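One way the unpacking could be written is sketched here. The variable names are assumptions, and your own answer may differ in the details:

```python
fn = "Darius-13-100m-Fly.txt"

# Four names on the left, a four-item list on the right.
swimmer, age, distance, stroke = fn.removesuffix(".txt").split("-")

print(swimmer)    # Darius
print(age)        # 13
print(distance)   # 100m
print(stroke)     # Fly
```

The number of names on the left must match the number of items in the list, otherwise Python raises a ValueError.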
Task #1 is done!
Recall the list of sub-tasks once more:
Great. Thanks!
Of course, there’s still a bit of work to do. Let’s remind everyone what Task
#2 is (over the page).
All of the answers to the clues are found in this chapter’s pages, and the
solution is on the next page. Have fun!
Across
2. Can be used to get rid of a filename’s extension.
4. Removes a set of characters from the end of a string.
5. Short for “double underscore”.
7. Another name for multiple assignment.
9. Ball and ________.
10. The plural name given to an object’s built-in functions.
14. Comes in handy when breaking apart strings.
Down
1. Every object in Python has 10 across, in addition to this.
2. It’s not an object identifier, it’s an object __________.
3. The swim coach mucks about with one of these.
6. More than one.
8. Never rely on the value returned by this BIF.
11. Everything is one of these.
12. Our favorite brackets.
13. This is sort of like an array in other programming languages.
NOTE
Don’t worry, those 62 data files are small so it only takes a few seconds for the ZIP to
download.
Once your download completes, unzip the file then copy the resulting
swimdata folder into your Learning folder. This ensures the code which
follows can find the data as it’ll be in a known place.
Each file in the swimdata folder contains the recorded times for one
swimmer’s attempts at a specific underage distance/stroke pairing. Recall
the data file from the start of the previous chapter which shows Darius’s
under-13 times for the 100m fly:
Yes, it does.
There’s a BIF called open which can work with files, opening them for
reading, writing, appending, or any combination of the above.
The open BIF is powerful on its own, but it shines when combined with
Python’s… em, eh… with statement.
NOTE
As always, follow along.
Let’s see the with statement together with the open BIF at work, so you
can get in on the action.
For the code which follows to work, the assumption is you’ve already
downloaded the Coach’s data, unzipping the file into a folder called
swimdata within your Learning folder. Do that now if you forgot (and
skip back two pages for the URL to use).
To get going, create another new notebook in VS Code, and give it the
name Average.ipynb, saving this latest notebook in your Learning
folder.
To identify the file you plan to work with you need two things: the file’s
name and the location it’s to be found in. Here’s how Python
programmers would define constants for these values:
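A minimal sketch of such constants follows. Both values are assumptions for illustration: the filename is an example, and the folder is assumed to sit beside your notebook.

```python
# Assumed values: adjust both to match the files on your computer.
FN = "Darius-13-100m-Fly.txt"   # the file's name
FOLDER = "swimdata/"            # the location the file is found in

# Joining the two gives the path handed to open later on.
print(FOLDER + FN)
```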
Although referred to as “constants”, Python doesn’t actually support the
notion of constant values, so it is a convention within the Python
programming community to use UPPERCASE variable names to signal
to other programmers that the values are constant (and should not be
changed). And, yes, eagle-eyed readers will have spotted that we –
rather blatantly – disregarded this convention in the previous chapter
when we named our filename variable “fn”. This is, of course, a
shocking use of a lowercase variable name for a constant value! Just to
be clear, we won’t tell if you won’t tell… and we promise to conform to
this convention from here on in.
With your constants defined, here’s the Python code which opens the
file, reads all its data into a list called (exploiting unprecedented
imagination here) data, then automatically closes the file:
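The sketch below shows the with statement and the open BIF working together. So that it runs anywhere, it first writes a small stand-in file to a temporary folder. The sample times and the filename are assumptions, and in your notebook FOLDER would point at your swimdata folder instead.

```python
import os
import tempfile

# Stand-in for the Coach's data: a throwaway folder holding one sample file.
FOLDER = tempfile.mkdtemp() + os.sep
FN = "Darius-13-100m-Fly.txt"
with open(FOLDER + FN, "w") as sample:
    sample.write("1:27.95,1:21.07,1:30.96\n")

# Open the file, read all of its lines into a list called data,
# and let the with statement close the file when the block ends.
with open(FOLDER + FN) as df:
    data = df.readlines()

print(data)       # a list holding one string per line in the file
print(df.closed)  # True: the file was closed for you
```

The indented line is the with statement's code block. Once it finishes, the file is closed without any explicit call to close.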
NOTE
If you are coming to Python from one of those programming languages which uses
curly-braces to delimit blocks of code, using indentation in this way may unnerve
you. Don’t let it, as it’s really not that big a deal.
It’s not that we don’t want to talk about indentation.
It’s just we feel there’s much more to Python than its use of indentation (or,
more correctly, whitespace) to delimit code blocks. Yes, it’s an important
aspect of the language, but it’s something most Python newbies get used to
quickly. When we need to, we’ll call it out, otherwise we’ll just get on with
things. And with that said, let’s get back to the take-aways.
The with statement closes the file after its code block runs.
This is a cool feature, as we’d forgotten to do this. It’s nice to know the
with statement has your back, tidying up after your code block executes.
Two variables are created by the code: df and data.
The df variable refers to a file object created by the successful execution of
the open BIF. The data variable refers to the list of lines read from the df
file object by the readlines method. Both variables continue to exist after
the code block ends, although the df variable now refers to a closed file
object.
The as keyword, together with with, does the same thing (and looks nicer,
too).
Let’s take a closer look at what df is, as well as learn a bit about what it can
do:
The data value in the first slot in the data list is a string representing the
swimmer’s times:
You can safely ignore anything else in the file, as the data you need is in the
above string. It’s time for a couple tick marks to indicate your progress with
Task #2:
The third sub-task should not be hard for anyone who has spent any amount
of time working with Python’s string technology. As luck would have it,
you’ve just worked through the string material in the previous chapter, so
you’re all set to have a go. But before you get to that sub-task, we need to
talk a little about one specific part of that with statement: the colon.
In the previous chapter you took a string then applied the split and
removesuffix methods to it to produce the data values you needed from
the file’s name.
A similar strategy can be applied to your next sub-task, although you
are unlikely to need to use removesuffix. The string you’re working
with has a newline character (\n) at the end you don’t need. Find a
string method to use in place of removesuffix to enable you to remove
the newline character from the string. Combine the call to the new
method in a chain which includes split to break the string apart by “,”
producing a new list, which you can assign to a new variable called
times.
Experiment in your VS Code-hosted notebook until you’ve written the
code you need, then write the code which creates the times variable in
the space provided below (and our code is on the next page):
___________________________________________
___________________________________________
Yes, to both questions.
Yes, we did indeed introduce strings in the previous chapter and, yes, we’re
concentrating on lists in this one.
Recall the split method produces a list from a string, which is precisely why
you need to use it now. If your times variable, above, isn’t a list, you’re
likely doing something wrong.
When you’re ready, flip the page to see the code we came up with.
In the previous chapter you took a string then applied the split and
removesuffix methods to it to produce the data values you needed from
the file’s name.
A similar strategy can be applied to your next sub-task, although you
are unlikely to need to use removesuffix. The string you’re working
with has a newline character (\n) at the end you don’t need. Knowing
this, you were asked to find a string method to use in place of
removesuffix to enable you to remove the newline character from the
string. You were to combine the call to the new method in a chain
which was to include split to break the string apart by “,” producing a
new list to be assigned to a new variable called times.
It was suggested you experiment in your VS Code-hosted notebook
until you’d written the code you needed, then you were to write the code
to create the times variable in the space provided below. Here’s what
we came up with:
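One way to write that chain is sketched here. The data list is a stand-in with made-up sample times, so the snippet runs without the file.

```python
# Stand-in for the list produced by readlines (sample times are illustrative).
data = ["1:27.95,1:21.07,1:30.96\n"]

# strip removes the trailing newline, then split breaks what remains
# apart on "," to produce a new list of strings.
times = data[0].strip().split(",")

print(times)  # ['1:27.95', '1:21.07', '1:30.96']
```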
That was almost too easy
With your prior experience of working with strings from the previous
chapter, we’re hoping that most recent Sharpen wasn’t too taxing.
It is important to call strip before split, producing a new list from the data
value in the data’s first slot (data[0]). In fact, your latest chain code is
very similar to the code from the previous chapter:
With the result of your latest chain assigned to the times variable, you’ve
completed sub-task (c). It’s time for another tick mark.
Assuming you can extract the three numbers you need from the string,
can you think of a calculation which converts the string into a numeric
value?
NOTE
There’s more than one way to do this, so don’t worry if what you think up isn’t the
same method as ours (which is detailed over the page).
The strategy described on the previous page can be turned into Python code
without too much difficulty.
In the code which follows, some of the annotations from the previous page
are converted to single-line comments which start with the # character and
continue to the end of the line (and are, obviously, ignored by the Python
interpreter).
VS Code, like most other editors, displays comments in a different color to
your code.
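Here is a sketch of one such conversion, assuming a swim time in mins:secs.hundredths form. The variable names are ours, and your own approach may differ.

```python
first = "1:27.95"  # a sample swim time string

# Break the string apart: minutes first, then seconds and hundredths.
mins, rest = first.split(":")
secs, hundredths = rest.split(".")

# Convert each part to an integer, then total everything up
# in hundredths of a second.
converted = (int(mins) * 60 * 100) + (int(secs) * 100) + int(hundredths)
print(converted)  # 8795
```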
If you type this code into a new code cell in your notebook, then press
Shift+Enter, the value 8795 appears on screen. Sweet.
EXERCISE
Now that you’ve seen the for loop in action, take a moment to
experiment in your notebook to combine the Ready Bake Code from a
few pages back with a for loop in order to convert all of the swim times
to hundredths of seconds, displaying the swim times and their converted
values on screen as you go. When you are done, write the code you
used into the space below. Our code is coming up in two pages’ time.
___________________________________________
___________________________________________
___________________________________________
___________________________________________
___________________________________________
___________________________________________
Python does indeed support while.
But, the while loop in Python is used much less than an equivalent for.
Before getting to our solution code for the above exercise, let’s take a
moment to compare for loops against while loops.
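As a rough illustration of the difference, the sketch below walks the same list twice, first with for and then with while. The sample times are assumed values.

```python
times = ["1:27.95", "1:21.07", "1:30.96"]  # sample data

# The for loop hands you each item in turn, with no bookkeeping.
for t in times:
    print(t)

# The equivalent while loop needs a counter which you create,
# test, and update yourself.
i = 0
while i < len(times):
    print(times[i])
    i += 1
```

Both loops display the same three strings, but the while version carries three extra pieces of housekeeping, any one of which is easy to get wrong.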
Now that you’ve seen the for loop in action, you were to take a moment
to experiment in your notebook to combine the Ready Bake Code from
a few pages back with a for loop in order to convert all of the swim
times to hundredths of seconds, displaying the swim times and their
converted values on screen. You were to write the code you used into
the space below. Here’s the code we came up with:
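A sketch of one possible solution follows, using assumed sample times in place of the list read from the file. Your variable names and output layout may well differ.

```python
times = ["1:27.95", "1:21.07", "1:30.96"]  # sample data

for t in times:
    # Break each swim time apart, as before.
    mins, rest = t.split(":")
    secs, hundredths = rest.split(".")
    # Convert to hundredths of a second.
    converted = (int(mins) * 60 * 100) + (int(secs) * 100) + int(hundredths)
    # Display the original string alongside its converted value.
    print(t, "->", converted)
```

This sketch assumes every time includes a minutes part. A time without a ":" would need extra handling.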
TEST DRIVE
Taking the Exercise Solution code for a spin produces the expected
output:
Recall that the type BIF is used to determine what type a variable refers to.
A quick call to type confirms you’re working with a list, and a call to the
len BIF confirms your new list is empty:
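For example, assuming the new list is called converts (the name this chapter goes on to use):

```python
converts = []          # a new, empty list

print(type(converts))  # <class 'list'>
print(len(converts))   # 0
```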
Can you remember what you need to do to display your new list’s built-
in methods?
Ah ha! That final line of output (“Append object to the end of the list.”) is
all you need to know, even though it’s tempting to take some time to
experiment with those other methods, some of which sound cool. But, let’s
not do that. Let’s stick to the task of building a new list of converted swim
time values as you go.
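If you need a reminder, the sketch below shows one way to list a list's methods and read the documentation for append. The list name converts is an assumption.

```python
converts = []

# dir lists the names of everything the list object provides,
# including its built-in methods.
print(dir(converts))

# help displays the documentation for one specific method.
help(converts.append)
```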
No, you do not need to worry.
In the previous chapter, we made a big deal about lists in Python being like
arrays in other programming languages. This let us introduce the use of the
square bracket notation with lists, which is a common technique when
working with arrays and lists.
However, unlike with arrays, where you typically have to say how big your
array is likely to get (e.g., 1000 slots) and what type of data it’s going to
contain (e.g., integers), there’s no need to declare either of these with your
Python lists.
Python lists are dynamic, which means they grow as needed (so there’s no
need to pre-declare the number of slots beforehand). And Python lists don’t
contain data values, they contain object references, so you can put any data
of any type in a Python list. You can even mix’n’match types.
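A small sketch makes the point. The list name and the values are made up for illustration.

```python
mixed = []               # no size and no element type declared

mixed.append(8795)       # an integer
mixed.append("1:27.95")  # a string
mixed.append(87.95)      # a float
mixed.append([1, 2])     # even another list

print(len(mixed))        # 4
print(mixed)
```

The list grew from zero to four slots without any declaration, and each slot refers to an object of a different type.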
SHARPEN YOUR PENCIL
Grab your pencil, as you’ve work to do. Here’s the most recent code
which displays the swim time strings together with equivalent
conversion to hundredths of seconds:
Adjust the above code to do two things: (1) Create a new empty list
called converts right before the loop starts, and (2) Replace the line
which starts with a call to the print BIF with a line of code which adds
the converted value onto the end of the converts list. Write your code
in the space below (and, when you’re ready, check your code against
ours on the next page):
SHARPEN YOUR PENCIL SOLUTION
You were to grab your pencil, as you’d work to do. You’d been shown
the most recent code which displays the swim time strings together with
equivalent conversion to hundredths of seconds:
Your job was to adjust the above code to do two things: (1) Create a
new empty list called converts right before the loop starts, and (2)
Replace the line which starts with a call to the print BIF with a line of
code which adds the converted value onto the end of the converts list.
Here’s the code we came up with:
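A sketch of the adjusted loop is shown here, again using assumed sample times in place of the data read from the file.

```python
times = ["1:27.95", "1:21.07", "1:30.96"]  # sample data

converts = []  # (1) a new empty list, created right before the loop
for t in times:
    mins, rest = t.split(":")
    secs, hundredths = rest.split(".")
    # (2) add the converted value onto the end of the list
    # rather than displaying it.
    converts.append((int(mins) * 60 * 100) + (int(secs) * 100) + int(hundredths))

print(converts)  # [8795, 8107, 9096]
```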
TEST DRIVE
Let’s take your latest code for a spin. Recall the previous version of
your loop produced this output:
Your new loop code is similar, but does not produce any output.
Instead, the converts list is populated with the conversion values.
Below, the new loop code executes in a code cell (producing no output)
then, in two subsequent code cells, the contents of the times list as well
as the (new) converts list are shown:
It’s time to calculate the average
You don’t need to be a programmer to know how to calculate an average
when given a list of numbers. The code is not difficult, but this fact alone
does not justify your decision to actually write it. When you happen upon a
coding need which feels like someone else may have already coded it, ask
yourself this question: I wonder if there’s anything in the Python Standard
Library which might help?
There is no shame in reusing existing code, even for something you
consider simple. With that in mind, here’s how to calculate the average from
the converts list with some help from the PSL:
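A minimal sketch, assuming converts already holds the converted values (the numbers below are sample data):

```python
import statistics

converts = [8795, 8107, 9096]  # sample converted swim times

# mean returns the arithmetic mean of the numbers it is given.
average = statistics.mean(converts)
print(average)
```

Be aware that mean raises a StatisticsError when handed an empty list, so it is worth making sure converts has been populated first.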
Don’t forget the PSL – it’s full of cool code.
Calculating the average by hand is easy enough but, as shown above, you
haven’t had to write a loop, maintain a count, keep a running total, or perform
the average calculation. All you do is pass the name of the list of numbers into
the mean function which returns the arithmetic mean (i.e., the average) of
your data. Cool. That’ll do.
Yes, as mins:secs.hundredths.
In effect, you need to reverse the process from earlier which converted the
original swim time string into its numeric equivalent.
It can’t be that hard, can it?
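One possible way to reverse the conversion is sketched below. The average value is made up, and the f-string used to build the final string is our choice for brevity, so it may not match the technique this chapter goes on to use.

```python
average = 8666.33       # sample average, in hundredths of a second

total = round(average)  # work in whole hundredths

# Floor division (//) and remainder (%) peel off each part in turn.
mins = total // 6000
secs = (total % 6000) // 100
hundredths = total % 100

# Pad seconds and hundredths to two digits so 7 displays as 07.
formatted = f"{mins}:{secs:02d}.{hundredths:02d}"
print(formatted)  # 1:26.66
```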
All of the answers to the clues are found in this chapter’s pages, and the
solution is on the next page. Go for it!
Across
1. Python programmer’s favorite looping construct.
5. When a variable name is in UPPERCASE, it’s meant to be treated as one
of these.
9. This method creates a list from your file’s data.
12. Part of a famous combo, when paired with split.
13. The less-used looping construct.
15. Another name for whitespace when used with code blocks.
17. Your new BFF.
Down
2. Another powerful combo when used with 7 down.
3. // performs _______ division.
4. A numeric conversion BIF.
6. A module loved by Maths-heads.
7. The recommended statement to use when opening files.
8. This method grows lists.
10. A small keyword which you’ll learn more about in a later chapter.
11. A string-creating BIF.
14. A BIF to control decimal places.
16. This [] signifies an ______ list.