Doing Math With Python (En)
USE PROGRAMMING TO EXPLORE ALGEBRA, STATISTICS, CALCULUS, AND MORE!
AMIT SAHA
EXPLORE MATH WITH CODE
Doing Math with Python shows you how to use Python to delve into high
school–level math topics like statistics, geometry, probability, and calculus.
You’ll start with simple projects, like a factoring program and a
quadratic-equation solver, and then create more complex projects once you’ve
gotten the hang of things.
Along the way, you’ll discover new ways to explore math and gain valuable
programming skills that you’ll use throughout your study of math and computer
science. Learn how to:
• Describe your data with statistics, and visualize it with line graphs, bar
charts, and scatter plots
• Explore set theory and probability with programs for coin flips, dicing, and
other games of chance
• Solve algebra problems using Python’s symbolic math functions
• Draw geometric shapes and explore fractals like the Barnsley fern, the
Sierpiński triangle, and the Mandelbrot set
• Write programs to find derivatives and integrate functions
Creative coding challenges and applied examples help you see how you can put
your new math and coding skills into practice. You’ll write an inequality
solver, plot gravity’s effect on how far a bullet will travel, shuffle a deck
of cards, estimate the area of a circle by throwing 100,000 “darts” at a board,
explore the relationship between the Fibonacci sequence and the golden ratio,
and more.
Whether you’re interested in math but have yet to dip into programming or
you’re a teacher looking to bring programming into the classroom, you’ll find
that Python makes programming easy and practical. Let Python handle the grunt
work while you focus on the math.
ABOUT THE AUTHOR
Amit Saha is a software engineer who has worked for Red Hat and Sun
Microsystems. He created and maintains Fedora Scientific, a Linux distribution
for scientific and educational users. He is also the author of Write Your
First Program (Prentice Hall Learning).
COVERS PYTHON 3
THE FINEST IN GEEK ENTERTAINMENT™
www.nostarch.com
by Amit Saha
San Francisco
Doing Math with Python. Copyright © 2015 by Amit Saha.
All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means,
electronic or mechanical, including photocopying, recording, or by any information storage or retrieval
system, without the prior written permission of the copyright owner and the publisher.
Printed in USA
First printing
19 18 17 16 15 1 2 3 4 5 6 7 8 9
ISBN-10: 1-59327-640-0
ISBN-13: 978-1-59327-640-9
For information on distribution, translations, or bulk sales, please contact No Starch Press, Inc. directly:
No Starch Press and the No Starch Press logo are registered trademarks of No Starch Press, Inc. Other
product and company names mentioned herein may be the trademarks of their respective owners. Rather
than use a trademark symbol with every occurrence of a trademarked name, we are using the names only
in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the
trademark.
The information in this book is distributed on an “As Is” basis, without warranty. While every precaution
has been taken in the preparation of this work, neither the author nor No Starch Press, Inc. shall have any
liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or
indirectly by the information contained in it.
To Protyusha, for never giving up on me
Brief Contents
Acknowledgments
Introduction
Afterword
Index
Contents in Detail
Acknowledgments
Introduction
    Who Should Read This Book
    What’s in This Book?
    Scripts, Solutions, and Hints
1
Working with Numbers
    Basic Mathematical Operations
    Labels: Attaching Names to Numbers
    Different Kinds of Numbers
        Working with Fractions
        Complex Numbers
    Getting User Input
        Handling Exceptions and Invalid Input
        Fractions and Complex Numbers as Input
    Writing Programs That Do the Math for You
        Calculating the Factors of an Integer
        Generating Multiplication Tables
        Converting Units of Measurement
        Finding the Roots of a Quadratic Equation
    What You Learned
    Programming Challenges
        #1: Even-Odd Vending Machine
        #2: Enhanced Multiplication Table Generator
        #3: Enhanced Unit Converter
        #4: Fraction Calculator
        #5: Give Exit Power to the User
2
Visualizing Data with Graphs
    Understanding the Cartesian Coordinate Plane
    Working with Lists and Tuples
        Iterating over a List or Tuple
    Creating Graphs with Matplotlib
        Marking Points on Your Graph
        Graphing the Average Annual Temperature in New York City
        Comparing the Monthly Temperature Trends of New York City
        Customizing Graphs
        Saving the Plots
    Plotting with Formulas
        Newton’s Law of Universal Gravitation
        Projectile Motion
    What You Learned
    Programming Challenges
        #1: How Does the Temperature Vary During the Day?
        #2: Exploring a Quadratic Function Visually
        #3: Enhanced Projectile Trajectory Comparison Program
        #4: Visualizing Your Expenses
        #5: Exploring the Relationship Between the Fibonacci Sequence and the Golden Ratio
3
Describing Data with Statistics
    Finding the Mean
    Finding the Median
    Finding the Mode and Creating a Frequency Table
        Finding the Most Common Elements
        Finding the Mode
        Creating a Frequency Table
    Measuring the Dispersion
        Finding the Range of a Set of Numbers
        Finding the Variance and Standard Deviation
    Calculating the Correlation Between Two Data Sets
        Calculating the Correlation Coefficient
        High School Grades and Performance on College Admission Tests
    Scatter Plots
    Reading Data from Files
        Reading Data from a Text File
        Reading Data from a CSV File
    What You Learned
    Programming Challenges
        #1: Better Correlation Coefficient–Finding Program
        #2: Statistics Calculator
        #3: Experiment with Other CSV Data
        #4: Finding the Percentile
        #5: Creating a Grouped Frequency Table
4
Algebra and Symbolic Math with SymPy
    Defining Symbols and Symbolic Operations
    Working with Expressions
        Factorizing and Expanding Expressions
        Pretty Printing
        Substituting in Values
        Converting Strings to Mathematical Expressions
    Solving Equations
        Solving Quadratic Equations
        Solving for One Variable in Terms of Others
        Solving a System of Linear Equations
    Plotting Using SymPy
        Plotting Expressions Input by the User
        Plotting Multiple Functions
    What You Learned
    Programming Challenges
        #1: Factor Finder
        #2: Graphical Equation Solver
        #3: Summing a Series
        #4: Solving Single-Variable Inequalities
5
Playing with Sets and Probability
    What’s a Set?
        Set Construction
        Subsets, Supersets, and Power Sets
        Set Operations
    Probability
        Probability of Event A or Event B
        Probability of Event A and Event B
        Generating Random Numbers
        Nonuniform Random Numbers
    What You Learned
    Programming Challenges
        #1: Using Venn Diagrams to Visualize Relationships Between Sets
        #2: Law of Large Numbers
        #3: How Many Tosses Before You Run Out of Money?
        #4: Shuffling a Deck of Cards
        #5: Estimating the Area of a Circle
6
Drawing Geometric Shapes and Fractals
    Drawing Geometric Shapes with Matplotlib’s Patches
        Drawing a Circle
        Creating Animated Figures
        Animating a Projectile’s Trajectory
    Drawing Fractals
        Transformations of Points in a Plane
        Drawing the Barnsley Fern
    What You Learned
    Programming Challenges
        #1: Packing Circles into a Square
        #2: Drawing the Sierpiński Triangle
        #3: Exploring Hénon’s Function
        #4: Drawing the Mandelbrot Set
7
Solving Calculus Problems
    What Is a Function?
        Domain and Range of a Function
        An Overview of Common Mathematical Functions
    Assumptions in SymPy
    Finding the Limit of Functions
        Continuous Compound Interest
        Instantaneous Rate of Change
    Finding the Derivative of Functions
        A Derivative Calculator
        Calculating Partial Derivatives
    Higher-Order Derivatives and Finding the Maxima and Minima
    Finding the Global Maximum Using Gradient Ascent
        A Generic Program for Gradient Ascent
        A Word of Warning About the Initial Value
        The Role of the Step Size and Epsilon
    Finding the Integrals of Functions
    Probability Density Functions
    What You Learned
    Programming Challenges
        #1: Verify the Continuity of a Function at a Point
        #2: Implement the Gradient Descent
        #3: Area Between Two Curves
        #4: Finding the Length of a Curve
Afterword
    Things to Explore Next
        Project Euler
        Python Documentation
        Books
    Getting Help
    Conclusion
A
Software Installation
    Microsoft Windows
        Updating SymPy
        Installing matplotlib-venn
        Starting the Python Shell
    Linux
        Updating SymPy
        Installing matplotlib-venn
        Starting the Python Shell
    Mac OS X
        Updating SymPy
        Installing matplotlib-venn
        Starting the Python Shell
B
Overview of Python Topics
    if __name__ == '__main__'
    List Comprehensions
    Dictionary Data Structure
    Multiple Return Values
    Exception Handling
        Specifying Multiple Exception Types
        The else Block
    Reading Files in Python
        Reading All the Lines at Once
        Specifying the Filename as Input
        Handling Errors When Reading Files
    Reusing Code
Index
Acknowledgments
I would like to thank everyone at No Starch Press for making this book
possible. From the first emails discussing the book idea with Bill Pollock
and Tyler Ortman, through the rest of the process, everyone there has
been an absolute pleasure to work with. Seph Kramer was amazing with his
technical insights and suggestions and Riley Hoffman was meticulous in
checking and re-checking that everything was correct. It is only fair to say
that without these two fine people, this book wouldn’t have been close to
what it is. Thanks to Jeremy Kun and Otis Chodosh for their insights and
making sure all the math made sense. I would also like to thank the copy-
editor, Julianne Jigour, for her thoroughness.
SymPy forms a core part of many chapters in this book and I would
like to thank everyone on the SymPy mailing list for answering my queries
patiently and reviewing my patches with promptness. I would also like to
thank the matplotlib community for answering and clearing up my doubts.
I would like to thank David Ash for lending me his MacBook, which
helped me when writing the software installation instructions.
I also must thank every writer and thinker who inspired me to write,
from humble web pages to my favorite books.
Introduction
• Chapter 4, Algebra and Symbolic Math with SymPy, introduces sym-
bolic math using the SymPy library. It begins with the basics of repre-
senting and manipulating algebraic expressions before introducing
more complicated matters, such as solving equations.
• Chapter 5, Playing with Sets and Probability, discusses the representa-
tion of mathematical sets and moves on to basic discrete probability.
You’ll also learn to simulate uniform and nonuniform random events.
• Chapter 6, Drawing Geometric Shapes and Fractals, discusses using
matplotlib to draw geometric shapes and fractals and create animated
figures.
• Chapter 7, Solving Calculus Problems, discusses some of the math-
ematical functions available in the Python standard library and SymPy
and then introduces you to solving calculus problems.
• Appendix A, Software Installation, covers installation of Python 3,
matplotlib, and SymPy on Microsoft Windows, Linux, and Mac OS X.
• Appendix B, Overview of Python Topics, discusses several Python
topics that may be helpful for beginners.
1
Working with Numbers
>>> 1 + 2
3
>>> 1 + 3.5
4.5
>>> -1 + 2.5
1.5
>>> 100 - 45
55
>>> -1.1 + 5
3.9
>>> 3 * 2
6
>>> 3.5 * 1.5
5.25
>>> 3 / 2
1.5
>>> 4 / 2
2.0
As you can see, when you ask Python to perform a division operation,
it returns the fractional part of the number as well. If you want the result in
the form of an integer, with any decimal values removed, you should use the
floor division (//) operator:
>>> 3 // 2
1
The floor division operator divides the first number by the second
number and then rounds down the result to the next lowest integer. This
becomes interesting when one of the numbers is negative. For example:
>>> -3 // 2
-2
The final result is the integer lower than the result of the division oper-
ation (-3/2 = -1.5, so the final result is -2).
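To make the rounding direction concrete, the following sketch compares floor division with simply dropping the fractional part. The helper name compare_rounding is made up for this illustration; int() and math.floor() are part of standard Python.

```python
import math

def compare_rounding(a, b):
    """Return (floor division result, truncated quotient) for a divided by b."""
    floored = a // b          # rounds toward negative infinity
    truncated = int(a / b)    # drops the fractional part (rounds toward zero)
    return floored, truncated

print(compare_rounding(3, 2))     # the two roundings agree for a positive result
print(compare_rounding(-3, 2))    # they differ when the result is negative
print(math.floor(-3 / 2) == -3 // 2)
```

The two values in each pair match only when the true quotient is positive or already a whole number.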
On the other hand, if you want just the remainder, you should use the
modulo (%) operator:
>>> 9 % 2
1
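Floor division and modulo fit together: the quotient times the divisor, plus the remainder, gives back the original number. A small sketch (the function name split_number is hypothetical) checks this using the built-in divmod() function:

```python
def split_number(a, b):
    """Return the floor quotient and the remainder of a divided by b."""
    quotient, remainder = divmod(a, b)    # same as (a // b, a % b)
    # The two results always recombine into the original number.
    assert quotient * b + remainder == a
    return quotient, remainder

print(split_number(9, 2))
print(split_number(-9, 2))    # the remainder takes the sign of the divisor
```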
You can calculate the power of numbers using the exponential (**)
operator. The examples below illustrate this:
>>> 2 ** 2
4
>>> 2 ** 10
1024
>>> 1 ** 10
1
We can also use the exponential symbol to calculate powers less than 1.
For example, the square root of a number n can be expressed as $n^{1/2}$ and the
cube root as $n^{1/3}$:
>>> 8 ** (1/3)
2.0
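Because a fraction such as 1/3 cannot be stored exactly as a floating point number, roots computed this way can be off by a tiny amount. The sketch below (nth_root is a name chosen for this example) therefore compares results with math.isclose() rather than ==:

```python
import math

def nth_root(number, n):
    """Approximate the nth root of a non-negative number using **."""
    return number ** (1 / n)

print(nth_root(16, 2))                       # square root of 16
print(math.isclose(nth_root(64, 3), 4.0))    # cube root of 64, compared safely
```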
>>> 5 + 5 * 5
30
>>> (5 + 5) * 5
50
>>> a = 3
>>> a + 1
4
>>> a = 5
>>> a + 1
6
>>> type(3)
<class 'int'>
>>> type(3.5)
<class 'float'>
>>> type(3.0)
<class 'float'>
Here, you can see that Python classifies the number 3 as an integer
(type 'int') but classifies 3.0 as a floating point number (type 'float'). We
all know that 3 and 3.0 are mathematically equivalent, but in many situa-
tions, Python will treat these two numbers differently because they are two
different types.
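One place the difference shows up is anywhere Python insists on an integer, such as a position in a list. This short sketch (the list and its name are only an illustration) shows two equal values behaving differently because of their types:

```python
seasons = ['winter', 'spring', 'summer', 'fall']

print(3 == 3.0)                # equal in value
print(type(3) == type(3.0))    # but not the same type

print(seasons[3])              # an int works as a list index
try:
    print(seasons[3.0])        # a float does not, even though 3.0 == 3
except TypeError as error:
    print('TypeError:', error)
```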
Some of the programs we write in this chapter will work properly only
with an integer as an input. As we just saw, Python won’t recognize a num-
ber like 1.0 or 4.0 as an integer, so if we want to accept numbers like that
as valid input in these programs, we’ll have to convert them from floating
point numbers to integers. Luckily, there’s a function built in to Python that
does just that:
>>> int(3.8)
3
>>> int(3.0)
3
The function int() takes the input floating point number, gets rid of
anything that comes after the decimal point, and returns the resulting inte-
ger. The float() function works similarly to perform the reverse conversion:
>>> float(3)
3.0
float() takes the integer that was input and adds a decimal point to
turn it into a floating point number.
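Note that int() truncates rather than rounds, which matters for negative numbers. A quick comparison with round() and math.floor(), both part of standard Python:

```python
import math

for value in (3.8, -3.8):
    print(value,
          int(value),           # drops the fractional part: 3 and -3
          math.floor(value),    # rounds down: 3 and -4
          round(value))         # rounds to the nearest integer: 4 and -4
```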
Now you know the basics of working with fractions in Python. Let’s
move on to a different kind of number.
Complex Numbers
The numbers we’ve seen so far are the so-called real numbers. Python also
supports complex numbers with the imaginary part identified by the letter j
or J (as opposed to the letter i used in mathematical notation). For example,
the complex number 2 + 3i would be written in Python as 2 + 3j:
>>> a = 2 + 3j
>>> type(a)
<class 'complex'>
As you can see, when we use the type() function on a complex number,
Python tells us that this is an object of type complex.
You can also define complex numbers using the complex() function:
>>> a = complex(2, 3)
>>> a
(2+3j)
Here we pass the real and imaginary parts of the complex number as
two arguments to the complex() function, and it returns a complex number.
You can add and subtract complex numbers in the same way as real
numbers:
>>> b = 3 + 3j
>>> a + b
(5+6j)
>>> a - b
(-1+0j)
Multiplication and division of complex numbers are also carried out
similarly:
>>> a * b
(-3+15j)
>>> a / b
(0.8333333333333334+0.16666666666666666j)
The modulus (%) and the floor division (//) operations are not valid for
complex numbers.
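If you try either operation anyway, Python raises a TypeError. The sketch below wraps each attempt in a try...except block; the helper name is_supported is invented for this example:

```python
def is_supported(operation):
    """Return True if calling operation() succeeds, False if it raises TypeError."""
    try:
        operation()
    except TypeError:
        return False
    return True

a = 2 + 3j
b = 3 + 3j
print(is_supported(lambda: a * b))     # multiplication works
print(is_supported(lambda: a % b))     # modulo does not
print(is_supported(lambda: a // b))    # neither does floor division
```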
The real and imaginary parts of a complex number can be retrieved
using its real and imag attributes, as follows:
>>> z = 2 + 3j
>>> z.real
2.0
>>> z.imag
3.0
The conjugate of a complex number has the same real part but an imagi-
nary part with an equal magnitude and an opposite sign. It can be obtained
using the conjugate() method:
>>> z.conjugate()
(2-3j)
Both the real and imaginary parts are floating point numbers. Using the
real and imaginary parts, you can then calculate the magnitude of a complex
number with the following formula, where x and y are the real and imaginary
parts of the number, respectively: $\sqrt{x^2 + y^2}$. In Python, this would
look like the following:
>>> abs(z)
3.605551275463989
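As a cross-check, the formula can be applied by hand and compared with abs(). This sketch relies only on the standard math module; the function name magnitude is chosen for illustration:

```python
import math

def magnitude(z):
    """Magnitude of a complex number from its real and imaginary parts."""
    return math.sqrt(z.real ** 2 + z.imag ** 2)

z = 2 + 3j
print(magnitude(z))
print(math.isclose(magnitude(z), abs(z)))
# Multiplying a number by its conjugate gives the squared magnitude.
print(z * z.conjugate())
```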
The standard library’s cmath module (cmath for complex math) provides
access to a number of other specialized functions to work with complex
numbers.
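For instance, cmath can take the square root of a negative number and convert a complex number to polar form. A brief sketch using three functions from the standard cmath module:

```python
import cmath

z = 2 + 3j
print(cmath.sqrt(-1))       # square root of a negative number
print(cmath.phase(z))       # angle of z, in radians
r, phi = cmath.polar(z)     # (magnitude, angle) pair
print(r, phi)
```

The first value returned by cmath.polar() is the same magnitude that abs() reports.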
➊ >>> a = input()
➋ 1
At ➊, we call the input() function, which waits for you to type something,
as shown at ➋, and press enter. The input provided is stored in a:
>>> a
➌ '1'
>>> a = '1'
>>> int(a) + 1
2
>>> float(a) + 1
2.0
These are the same int() and float() functions we saw earlier, but this
time, instead of converting the input from one kind of number to another,
they take a string as input ('1') and return a number (1 or 1.0), to which
we then add 1 to get 2 or 2.0. It’s important to note, however, that the
int() function cannot convert a string containing a floating point number
into an integer. If you pass a string that holds a floating point number
(like '2.5' or even '2.0') to the int() function, you’ll get an error message:
>>> int('2.0')
Traceback (most recent call last):
File "<pyshell#26>", line 1, in <module>
int('2.0')
ValueError: invalid literal for int() with base 10: '2.0'
>>> a = float(input())
3/4
Traceback (most recent call last):
File "<pyshell#25>", line 1, in <module>
a = float(input())
ValueError: could not convert string to float: '3/4'
>>> try:
        a = float(input('Enter a number: '))
    except ValueError:
        print('You entered an invalid number')
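In a real program, you will usually want to ask again rather than give up after one bad entry. The following is one possible sketch, not the book's own program: read_float is a hypothetical helper, and its reader and max_tries parameters exist only so the function can be tried without a keyboard.

```python
def read_float(prompt, reader=input, max_tries=3):
    """Ask for a number until the text converts to float, up to max_tries times."""
    for _ in range(max_tries):
        text = reader(prompt)
        try:
            return float(text)
        except ValueError:
            print('You entered an invalid number')
    return None    # no valid number after max_tries attempts

# Simulate a user who types '3/4' first and then '0.75'.
answers = iter(['3/4', '0.75'])
print(read_float('Enter a number: ', reader=lambda prompt: next(answers)))
```

Called as read_float('Enter a number: ') with no reader argument, the same function reads from the keyboard through input().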
The user will now see a message prompting them to enter an integer as input:
Input an integer: 1
In many programs in this book, we’ll ask the user to enter a number
as input, so we’ll have to make sure we take care of conversion before we
attempt to perform any operations on these numbers. You can combine the
input and conversion in a single statement, as follows:
>>> a = int(input())
1
>>> a + 1
2
This works great if the user inputs an integer. But as we saw earlier, if
the input is a floating point number (even one that’s equivalent to an inte-
ger, like 1.0), this will produce an error:
>>> a = int(input())
1.0
Traceback (most recent call last):
File "<pyshell#42>", line 1, in <module>
a = int(input())
ValueError: invalid literal for int() with base 10: '1.0'
In order to avoid this error, we could set up a ValueError catch like the
one we saw earlier for fractions. That way the program would catch float-
ing point numbers, which won’t work in a program meant for integers.
However, it would also flag numbers like 1.0 and 2.0, which Python sees as
floating point numbers but that are equivalent to integers and would work
just fine if they were entered as the right Python type.
To get around all this, we will use the is_integer() method to filter out
any numbers with a significant digit after the decimal point. (In the
versions of Python this book was written for, this method is defined only
for float type numbers; integers gained their own is_integer() method in
Python 3.12.)
Here’s an example:
>>> 1.1.is_integer()
False
On the other hand, when the method is called with 1.0 as the floating point
number, the result is True:
>>> 1.0.is_integer()
True
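One way to put is_integer() to work is a small helper that accepts input such as '3' or '3.0' but rejects '3.5'. This is only a sketch: the function name parse_integer() is a hypothetical one chosen for illustration and is not defined elsewhere in this book.

```python
def parse_integer(text):
    # Hypothetical helper, for illustration only.
    # float() raises ValueError for text that is not a number, such as 'abc'.
    number = float(text)
    # is_integer() is True only when there is no fractional part.
    if not number.is_integer():
        raise ValueError('not an integer: {0}'.format(text))
    return int(number)

print(parse_integer('1.0'))  # 1
print(parse_integer('7'))    # 7
```

Calling parse_integer('1.1') raises a ValueError, which the calling program can catch with the same try...except pattern shown earlier.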
The ZeroDivisionError exception message tells you (as you already know)
that a fraction with a denominator of 0 is invalid. If you’re planning on
having users enter fractions as input in one of your programs, it’s a good
idea to always catch such exceptions. Here is how you can do something like
that:
>>> try:
        a = Fraction(input('Enter a fraction: '))
    except ZeroDivisionError:
        print('Invalid fraction')
If you enter the string as '2 + 3j' (with spaces), it will result in a
ValueError error message:
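Here is a minimal sketch of this behavior, assuming the string is handed to Python’s built-in complex() function. The helper name parse_complex() is hypothetical and used only for illustration.

```python
def parse_complex(text):
    # Hypothetical helper: return the complex number described by text,
    # or None if complex() rejects the string.
    try:
        return complex(text)
    except ValueError:
        return None

without_spaces = parse_complex('2+3j')   # accepted
with_spaces = parse_complex('2 + 3j')    # rejected: spaces around the +

print(without_spaces)  # (2+3j)
print(with_spaces)     # None
```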
For any positive integer n, how do we find all its positive factors? For
each of the integers between 1 and n, we check the remainder after dividing
n by this integer. If it leaves a remainder of 0, it’s a factor. We’ll use the
range() function to write a program that will go through each of those
numbers between 1 and n.
Before we write the full program, let’s take a look at how range() works.
A typical use of the range() function looks like this:
Here, we set up a for loop and gave the range function two arguments.
The range() function starts from the integer stated as the first argument
(the start value) and continues up to the integer just before the one stated
by the second argument (the stop value). In this case, we told Python to
print out the numbers in that range, beginning with 1 and stopping at 4.
Note that this means Python doesn’t print 4, so the last number it prints is
the number before the stop value (3). It’s also important to note that the
range() function accepts only integers as its arguments.
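The start and stop behavior described above can be checked with a short sketch. The label numbers is an assumption made for this example; it simply collects the values so that we can look at them all at once.

```python
# Collect the numbers produced by range(1, 4) in a list.
numbers = []
for i in range(1, 4):   # start value 1, stop value 4
    numbers.append(i)

print(numbers)  # [1, 2, 3]: the stop value, 4, is not included
```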
You can also use the range() function without specifying the start value,
in which case it’s assumed to be 0. For example:
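As a quick illustration of the default start value, the sketch below builds a list from range(5). Wrapping range() in list() is just a convenient way to see every number at once.

```python
# With a single argument, range() starts from 0 and stops before 5.
numbers = list(range(5))
print(numbers)  # [0, 1, 2, 3, 4]

# Stating the start value explicitly gives the same numbers.
same_numbers = list(range(0, 5))
```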
Okay, now that we see how the range() function works, we’re ready to
look at a factor-calculating program. Because this is a fairly long program,
I’ll write it in the IDLE editor instead of at the interactive IDLE prompt.
You can start the editor by selecting File > New Window in IDLE. Notice that
we start out by commenting our code with text enclosed in three single
quotes ('''):
'''
Find the factors of an integer
'''
def factors(b):
    for i in range(1, b+1):
        if b % i == 0:
            print(i)

if __name__ == '__main__':
    b = int(input('Enter a number: '))
    factors(b)
The factors() function defines a for loop that iterates once for every
integer between 1 and the input integer using the range() function.
Here, we want to iterate up to the integer entered by the user, b, so the
stop value is stated as b+1. For each of these integers, i, the program checks
whether it divides the input number with no remainder and prints it if so.
When you run this program (by selecting Run > Run Module), it asks
you to input a number. If your number is a positive integer, its factors are
printed. For example:
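To illustrate the same remainder test without typing anything at a prompt, the sketch below collects the factors in a list instead of printing them one by one. The function name factor_list() is hypothetical and is not part of the program described here.

```python
def factor_list(n):
    # Hypothetical variant: return the positive factors of the
    # positive integer n as a list.
    found = []
    for i in range(1, n + 1):   # stop value n + 1 so that n itself is checked
        if n % i == 0:          # a remainder of 0 means i divides n
            found.append(i)
    return found

print(factor_list(25))  # [1, 5, 25]
```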
Generating Multiplication Tables
Consider three numbers, a, b, and n, where n is an integer, such that
a × n = b.
We can say here that b is the nth multiple of a. For example, 4 is the 2nd
multiple of 2, and 1024 is the 512th multiple of 2.
A multiplication table for a number lists all of that number’s multiples.
For example, the multiplication table of 2 looks like this (first three
multiples shown here):
2×1=2
2×2=4
2×3=6
Our next program generates the multiplication table up to 10
for any number input by the user. In this program, we’ll use the format()
method with the print() function to help make the program’s output look
nicer and more readable. In case you haven’t seen it before, I’ll now briefly
explain how it works.
The format() method lets you plug in labels and set it up so that they get
printed out in a nice, readable string with extra formatting around it. For
example, if I had the names of all the fruits I bought at the grocery store
with separate labels created for each and wanted to print them out to make
a coherent sentence, I could use the format() method as follows:
First, we created three labels (item1, item2, and item3), each referring to
a different string (apples, bananas, and grapes). Then, in the print() function,
we typed a string with three placeholders in curly brackets: {0}, {1}, and {2}.
We followed this with .format(), which holds the three labels we created.
This tells Python to fill those three placeholders with the values stored in
those labels in the order listed, so Python prints the text with {0} replaced
by the first label, {1} replaced by the second label, and so on.
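The description above can be sketched as follows. The three labels and their values come from the text; the exact wording of the sentence is an assumption made for this example.

```python
item1 = 'apples'
item2 = 'bananas'
item3 = 'grapes'

# {0}, {1}, and {2} are filled with item1, item2, and item3, in that order.
sentence = 'At the grocery store, I bought some {0} and {1} and {2}'.format(item1, item2, item3)
print(sentence)
```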
It’s not necessary to have labels pointing to the values we want to print.
We can also just type values into .format(), as in the following example:
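A minimal sketch of passing values straight to format(); the string and the numbers are chosen only for illustration.

```python
# The values 1 and 3.578 are typed directly into format(); no labels are needed.
line = 'Number 1: {0} Number 2: {1}'.format(1, 3.578)
print(line)  # Number 1: 1 Number 2: 3.578
```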
'''
Multiplication table printer
'''
def multi_table(a):
    for i in range(1, 11):
        print('{0} x {1} = {2}'.format(a, i, a*i))

if __name__ == '__main__':
    a = input('Enter a number: ')
    multi_table(float(a))
Enter a number: 5
5.0 x 1 = 5.0
5.0 x 2 = 10.0
5.0 x 3 = 15.0
5.0 x 4 = 20.0
5.0 x 5 = 25.0
5.0 x 6 = 30.0
5.0 x 7 = 35.0
5.0 x 8 = 40.0
5.0 x 9 = 45.0
5.0 x 10 = 50.0
See how nice and orderly that table looks? That’s because we used the
.format() method to print the output according to a readable, uniform
template.
You can use the format() method to further control how numbers are
printed. For example, if you want numbers with only two decimal places,
you can specify that with the format() method. Here is an example:
>>> '{0}'.format(1.25456)
'1.25456'
>>> '{0:.2f}'.format(1.25456)
'1.25'
The first format statement above simply prints the number exactly as we
entered it. In the second statement, we modify the placeholder to {0:.2f},
meaning that we want only two digits after the decimal point, with the f
indicating a floating point number. As you can see, there are only two
digits after the decimal point in the second output. Note that the number is
rounded if there are more digits after the decimal point than you specified.
For example:
>>> '{0:.2f}'.format(1.25556)
'1.26'
>>> '{0:.2f}'.format(1)
'1.00'
Two zeros are added because we specified that we should print exactly
two numbers after the decimal point.
>>> F = 98.6
>>> (F - 32) * (5 / 9)
37.0
>>> C = 37
>>> C * (9 / 5) + 32
98.60000000000001
We create a label, C, with the value 37 (the normal human body temperature
in Celsius). Then, we convert it into Fahrenheit using the formula,
and the result is 98.6 degrees (the extra trailing digits in the output are a
floating point rounding artifact).
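The two formulas can also be wrapped in functions so they don’t have to be retyped. This is a sketch only; the function names fahrenheit_to_celsius() and celsius_to_fahrenheit() are hypothetical.

```python
def fahrenheit_to_celsius(f):
    # Same formula as above: C = (F - 32) * (5 / 9)
    return (f - 32) * (5 / 9)

def celsius_to_fahrenheit(c):
    # Same formula as above: F = C * (9 / 5) + 32
    return c * (9 / 5) + 32

# Format to one decimal place to hide floating point rounding noise.
print('{0:.1f}'.format(celsius_to_fahrenheit(37)))    # 98.6
print('{0:.1f}'.format(fahrenheit_to_celsius(98.6)))  # 37.0
```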
It’s a chore to have to write these conversion formulas over and over
again. Let’s write a unit conversion program that will do the conversions
for us. This program will present a menu to allow users to select the
conversion they want to perform, ask for relevant input, and then print the
calculated result. The program is shown below:
'''
Unit converter: Miles and Kilometers
'''
def print_menu():
    print('1. Kilometers to Miles')
    print('2. Miles to Kilometers')

def km_miles():
    km = float(input('Enter distance in kilometers: '))
    miles = km / 1.609
    print('Distance in miles: {0:f}'.format(miles))

def miles_km():
    miles = float(input('Enter distance in miles: '))
    km = miles * 1.609
    print('Distance in kilometers: {0:f}'.format(km))

if __name__ == '__main__':
    print_menu()  # (1)
    choice = input('Which conversion would you like to do?: ')  # (2)
    if choice == '1':
        km_miles()
    if choice == '2':
        miles_km()
This is a slightly longer program than the others, but not to worry.
It’s actually simple. Let’s start from (1). The print_menu() function is
called, which prints a menu with two unit conversion choices. At (2), the
user is asked to select one of the two conversions. If the choice is entered
as 1 (kilometers to miles), the function km_miles() is called. If the choice is
entered as 2 (miles to kilometers), the function miles_km() is called. In both
of these functions, the user is first asked to enter a distance in the unit
chosen for conversion (kilometers for km_miles() and miles for miles_km()).
The program then performs the conversion using the corresponding
formula and displays the result.
Here is a sample run of the program:
1. Kilometers to Miles
2. Miles to Kilometers
Which conversion would you like to do?: 2
Enter distance in miles: 100
Distance in kilometers: 160.900000
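The conversion arithmetic can be separated from the input and output, which makes it easy to check on its own. In this sketch, the names KM_PER_MILE, km_to_miles(), and miles_to_km() are hypothetical; the factor 1.609 is the approximate value used in the program above.

```python
KM_PER_MILE = 1.609  # approximate conversion factor from the program above

def km_to_miles(km):
    return km / KM_PER_MILE

def miles_to_km(miles):
    return miles * KM_PER_MILE

print('{0:f}'.format(miles_to_km(100)))  # 160.900000
```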
>>> x = 10 - 500 + 79
>>> x
-411
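$$x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \quad \text{and} \quad x_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}.$$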
A quadratic equation has two roots, that is, two values of x for which the
two sides of the quadratic equation are equal (although sometimes these two
values may turn out to be the same). This is indicated here by the $x_1$ and
$x_2$ in the quadratic formula.
Comparing the equation $x^2 + 2x + 1 = 0$ to the generic quadratic
equation, we see that a = 1, b = 2, and c = 1. We can substitute these values
directly into the quadratic formula to calculate the value of $x_1$ and $x_2$.
In Python, we first store the values of a, b, and c as the labels a, b, and c
with the appropriate values:
>>> a = 1
>>> b = 2
>>> c = 1
Then, considering that both the formulas have the term $b^2 - 4ac$, we’ll
define a new label, D, such that $D = \sqrt{b^2 - 4ac}$:
In this case, the values of both the roots are the same, and if you
substitute that value into the equation $x^2 + 2x + 1$, the equation will
evaluate to 0.
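The steps just described can be sketched as follows, assuming the usual quadratic formula with D standing for the square root of b*b - 4*a*c. The labels match the ones used in the text.

```python
a = 1
b = 2
c = 1

# D is the square root of the term b^2 - 4ac shared by both formulas.
D = (b**2 - 4*a*c)**0.5

x_1 = (-b + D) / (2*a)
x_2 = (-b - D) / (2*a)
print(x_1, x_2)  # -1.0 -1.0

# Substituting the root back into x^2 + 2x + 1 gives 0.
check = x_1**2 + 2*x_1 + 1
```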
Our next program combines all these steps in a function roots(),
which takes the values of a, b, and c as parameters, calculates the roots,
and prints them:
'''
Quadratic equation root calculator
'''
def roots(a, b, c):
    D = (b*b - 4*a*c)**0.5
    x_1 = (-b + D)/(2*a)
    x_2 = (-b - D)/(2*a)
    print('x1: {0}'.format(x_1))
    print('x2: {0}'.format(x_2))

if __name__ == '__main__':
    a = input('Enter a: ')
    b = input('Enter b: ')
    c = input('Enter c: ')
    roots(float(a), float(b), float(c))
At first, we use the labels a, b, and c to reference the values of the three
constants of a quadratic equation. Then, we call the roots() function with
these three values as arguments (after converting them to floating point
numbers). This function plugs a, b, and c into the quadratic formula, finds
the roots for that equation, and prints them.
When you execute the program, it will ask the user to input values of a, b,
and c corresponding to a quadratic equation they want to find the roots for.
Enter a: 1
Enter b: 2
Enter c: 1
Try solving a few more quadratic equations with different values for the
constants, and the program will find the roots correctly.
You most likely know that quadratic equations can have complex numbers
as roots, too. For example, the roots of the equation $x^2 + x + 1 = 0$ are
both complex numbers. The above program can find those for you as well.
Let’s give it a shot by executing the program again (the constants are a = 1,
b = 1, and c = 1):
Enter a: 1
Enter b: 1
Enter c: 1
x1: (-0.49999999999999994+0.8660254037844386j)
x2: (-0.5-0.8660254037844386j)
The roots printed above are complex numbers (indicated by j), and the
program has no problem calculating or displaying them.
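As an alternative sketch, the square root can be taken with cmath.sqrt() from Python’s standard library instead of raising to the power 0.5 as the program above does. The function name quadratic_roots() is hypothetical; unlike roots(), it returns the two values rather than printing them, and it always returns complex numbers, even when the roots are real.

```python
import cmath

def quadratic_roots(a, b, c):
    # Hypothetical variant of roots() that uses cmath.sqrt(), which
    # accepts negative numbers and returns a complex result.
    D = cmath.sqrt(b*b - 4*a*c)
    return (-b + D) / (2*a), (-b - D) / (2*a)

x_1, x_2 = quadratic_roots(1, 1, 1)
print(x_1)
print(x_2)
```

Substituting either root back into x**2 + x + 1 gives a result that is 0 up to floating point rounding.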
Programming Challenges
Here are a few challenges that will give you a chance to practice the concepts
from this chapter. Each problem can be solved in multiple ways, but you can
find sample solutions at http://www.nostarch.com/doingmathwithpython/.
If the input is 2, the program should print even and then print 2, 4, 6,
8, 10, 12, 14, 16, 18, 20. Similarly, if the input is 1, the program should
print odd and then print 1, 3, 5, 7, 9, 11, 13, 15, 17, 19.
Your program should use the is_integer() method to display an error
message if the input is a number with significant digits beyond the decimal
point.
'''
Fraction operations
'''
from fractions import Fraction

def add(a, b):
    print('Result of Addition: {0}'.format(a+b))

if __name__ == '__main__':
    a = Fraction(input('Enter first fraction: '))
    b = Fraction(input('Enter second fraction: '))
    op = input('Operation to perform - Add, Subtract, Divide, Multiply: ')
    if op == 'Add':
        add(a, b)
In the case of division, you should let the user know whether the first
fraction is divided by the second fraction or vice versa.
'''
Run until exit layout
'''
def fun():
    print('I am in an endless loop')

if __name__ == '__main__':
    while True:
        fun()
        answer = input('Do you want to exit? (y) for yes ')
        if answer == 'y':
            break
execution—that is, it prints the string again and continues doing so until
the user wishes to exit. Here is a sample run of the program:
I am in an endless loop
Do you want to exit? (y) for yes n
I am in an endless loop
Do you want to exit? (y) for yes n
I am in an endless loop
Do you want to exit? (y) for yes n
I am in an endless loop
Do you want to exit? (y) for yes y
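'''
Multiplication table printer with
exit power to the user
'''
def multi_table(a):
    for i in range(1, 11):
        print('{0} x {1} = {2}'.format(a, i, a*i))

if __name__ == '__main__':
    while True:
        a = input('Enter a number: ')
        multi_table(float(a))
        answer = input('Do you want to exit? (y) for yes ')
        if answer == 'y':
            break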
If you compare this program to the one we wrote earlier, you’ll see
that the only change is the addition of the while loop, which includes the
prompt asking the user to input a number and the call to the multi_table()
function.
When you run the program, the program will ask for a number and
print its multiplication table, as before. However, it will also subsequently
ask whether the user wants to exit the program. If the user doesn’t want to
exit, the program will be ready to print the table for another number. Here
is a sample run:
Enter a number: 2
2.000000 x 1.000000 = 2.000000
2.000000 x 2.000000 = 4.000000
2.000000 x 3.000000 = 6.000000
2.000000 x 4.000000 = 8.000000
Try rewriting some of the other programs in this chapter so that they
continue executing until asked by the user to exit.
2
Visualizing Data with Graphs
[Figure: the Cartesian coordinate plane, with the x- and y-axes marked from −3 to 3, the origin, and a point A at (x, y)]
As with the number line, we can have infinitely many points on the
plane. We describe a point with a pair of numbers instead of one number.
For example, we describe the point A in the figure with two numbers, x
and y, usually written as (x, y) and referred to as the coordinates of the point.
As shown in Figure 2-2, x is the distance of the point from the origin along
the x-axis, and y is the distance along the y-axis. The point where the two
axes intersect is called the origin and has the coordinates (0, 0).
The Cartesian coordinate plane allows us to visualize the relationship
between two sets of numbers. Here, I use the term set loosely to mean a
collection of numbers. (We’ll learn about mathematical sets and how to work
with them in Python in Chapter 5.) No matter what the two sets of numbers
represent (temperature, baseball scores, or class test scores), all you need
are the numbers themselves. Then, you can plot them, either on graph
paper or on your computer with a program written in Python. For the rest
of this book, I’ll use the term plot as a verb to describe the act of plotting
two sets of numbers and the term graph to describe the result: a line,
curve, or simply a set of points on the Cartesian plane.
Now you can refer to the individual numbers—1, 2, and 3—using the
label and the position of the number in the list, which is called the index.
So simplelist[0] refers to the first number, simplelist[1] refers to the second
number, and simplelist[2] refers to the third number:
>>> simplelist[0]
1
>>> simplelist[1]
2
>>> simplelist[2]
3
Notice that the first item of the list is at index 0, the second item is at
index 1, and so on—that is, the positions in the list start counting from 0,
not 1.
One advantage of creating a list is that you don’t have to create a separate
label for each value; you just create a label for the list and use the index
position to refer to each item. Also, you can add to the list whenever you
need to store new values, so a list is the best choice for storing data if you
don’t know beforehand how many numbers or strings you may need to store.
An empty list is just that—a list with no items or elements—and it can be
created like this:
>>> emptylist = []
Empty lists are mainly useful when you don’t know any of the items that
will be in your list beforehand but plan to fill in values during the
execution of a program. In that case, you can create an empty list and then
use the append() method to add items later:
>>> emptylist
[]
>>> emptylist.append(1)
>>> emptylist
[1]
>>> emptylist.append(2)
>>> emptylist
[1, 2]
>>> simpletuple[0]
1
>>> simpletuple[1]
2
>>> simpletuple[2]
3
You can also use negative indices with both lists and tuples. For example,
simplelist[-1] and simpletuple[-1] would refer to the last element of the list
or the tuple, simplelist[-2] and simpletuple[-2] would refer to the second-to-
last element, and so on.
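A short sketch of negative indices, assuming simplelist and simpletuple both hold the numbers 1, 2, and 3 as in the examples above. The labels on the left-hand side are hypothetical.

```python
simplelist = [1, 2, 3]
simpletuple = (1, 2, 3)

last_in_list = simplelist[-1]               # same item as simplelist[2]
second_to_last_in_tuple = simpletuple[-2]   # same item as simpletuple[1]

print(last_in_list)             # 3
print(second_to_last_in_tuple)  # 2
```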
Tuples, like lists, can have strings as values, and you can create an empty
tuple with no elements as emptytuple=(). However, there’s no append() method
to add a new value to an existing tuple, so you can’t add values to an empty
tuple. Once you create a tuple, the contents of the tuple can’t be changed.
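The sketch below illustrates what “can’t be changed” means in practice. The labels can_append, changed, and longer are hypothetical names chosen for this example.

```python
simpletuple = (1, 2, 3)

# Tuples have no append() method.
can_append = hasattr(simpletuple, 'append')

# Assigning to an index raises a TypeError, so the tuple stays as it was.
try:
    simpletuple[0] = 10
    changed = True
except TypeError:
    changed = False

# You can still build a new, longer tuple from an existing one.
longer = simpletuple + (4,)

print(can_append, changed)  # False False
print(longer)               # (1, 2, 3, 4)
```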
>>> l = [1, 2, 3]
>>> for item in l:
        print(item)
1
2
3
>>> l = [1, 2, 3]
>>> for index, item in enumerate(l):
        print(index, item)
0 1
1 2
2 3
In the first line, we import the plot() and show() functions from the pylab
module, which is part of the matplotlib package. Next, we call the plot()
function in the second line. The first argument to the plot() function is the
list of numbers we want to plot on the x-axis, and the second argument is the
corresponding list of numbers we want to plot on the y-axis. The plot()
function returns an object (or more precisely, a list containing an object).
This object contains the information about the graph that we asked Python to
create. At this stage, you can add more information, such as a title, to the
graph, or you can just display the graph as it is. For now we’ll just display
the graph.
The plot() function only creates the graph. To actually display it, we
have to call the show() function:
>>> show()
Figure 2-3: A graph showing a line passing through the points (1, 2), (2, 4), and (3, 6)
Notice that instead of starting from the origin (0, 0), the x-axis starts
from the number 1 and the y-axis starts from the number 2. These are
the lowest numbers from each of the two lists. Also, you can see
increments marked on each of the axes (such as 2.5, 3.0, 3.5, etc., on the
y-axis). In “Customizing Graphs,” we’ll learn how to control those
aspects of the graph, along with how to add axes labels and a graph title.
You’ll notice in the interactive shell that you can’t enter any further
statements until you close the matplotlib window. Close the graph window
so that you can continue programming.
Figure 2-4: A graph showing a line passing through the points (1, 2), (2, 4), and (3, 6)
with the points marked by a dot
The marker at (2, 4) is easily visible, while the others are hidden in
the very corners of the graph. You can choose from several marker options,
including 'o', '*', 'x', and '+'. Using marker= includes a line connecting
the points (this is the default). You can also make a graph that marks
only the points that you specified, without any line connecting them, by
omitting marker=:
Here, 'o' indicates that each point should be marked with a dot, but
there should be no line connecting the points. Call the function show() to
display the graph, which should look like the one shown in Figure 2-5.
Figure 2-5: A graph showing the points (1, 2), (2, 4), and (3, 6)
As you can see, only the points are now shown on the graph, with no
line connecting them. As in the previous graph, the first and the last points
are barely visible, but we’ll soon see how to change that.
>>> nyc_temp = [53.9, 56.3, 56.4, 53.4, 54.5, 55.8, 56.8, 55.0, 55.3, 54.0, 56.7, 56.4, 57.3]
>>> plot(nyc_temp, marker='o')
[<matplotlib.lines.Line2D object at 0x7f2549d52f90>]
Figure 2-6: A graph showing the average annual temperature of New York City during
the years 2000–2012
You can also see that numbers on the y-axis are floating point numbers
(because that’s what we asked to be plotted) and those on the x-axis are
integers. Matplotlib can handle either.
Plotting the temperature without showing the corresponding years is a
quick and easy way to visualize the variations between the years. If you were
planning to present this graph to someone, however, you’d want to make
it clearer by showing which year each temperature corresponds to. We can
easily do this by creating another list with the years in it and then calling
the plot() function:
>>> nyc_temp = [53.9, 56.3, 56.4, 53.4, 54.5, 55.8, 56.8, 55.0, 55.3, 54.0, 56.7, 56.4, 57.3]
>>> years = range(2000, 2013)
>>> plot(years, nyc_temp, marker='o')
[<matplotlib.lines.Line2D object at 0x7f2549a616d0>]
>>> show()
Figure 2-7: A graph showing the average annual temperature of New York City,
displaying the years on the x-axis
>>> nyc_temp_2000 = [31.3, 37.3, 47.2, 51.0, 63.5, 71.3, 72.3, 72.7, 66.0, 57.0, 45.3, 31.1]
>>> nyc_temp_2006 = [40.9, 35.7, 43.1, 55.7, 63.1, 71.0, 77.9, 75.8, 66.6, 56.2, 51.9, 43.6]
>>> nyc_temp_2012 = [37.3, 40.9, 50.9, 54.8, 65.1, 71.0, 78.8, 76.7, 68.8, 58.0, 43.9, 41.5]
The first list corresponds to the year 2000, and the next two lists
correspond to the years 2006 and 2012, respectively. We could plot the three
sets of data on three different graphs, but that wouldn’t make it very easy to
see how each year compares to the others. Try doing it!
The clearest way to compare all of these temperatures is to plot all
three data sets on a single graph, like this:
Figure 2-8: A graph showing the average monthly temperature of New York City during
the years 2000, 2006, and 2012
Now we have three plots all on one graph. Python automatically chooses
a different color for each line to indicate that the lines have been plotted
from different data sets.
Instead of calling the plot function with all three pairs at once, we
could also call the plot function three separate times, once for each pair:
Then, import the legend() function from the pylab module and call it as
follows:
We call the legend() function with a list of the labels we want to use to
identify each plot on the graph. These labels are entered in this order to
match the order of the pairs of lists that were entered in the plot()
function. That is, 2000 will be the label for the plot of the first pair we
entered in the plot() function; 2006, for the second pair; and 2012, for the
third. You can also specify a second argument to the function that will
specify the position of the legend. By default, it’s always positioned at the
top right of the graph. However, you can specify a particular position, such
as 'lower center', 'center left', and 'upper left'. Or you can set the
position to 'best', and the legend will be positioned so as not to interfere
with the graph.
Finally, we call show() to display the graph:
>>> show()
As you can see in the graph (see Figure 2-9), there’s now a legend box
in the top-right corner. It tells us which line represents the average monthly
temperature for the year 2000, which line represents the year 2006, and
which line represents the year 2012.
Looking at the graph, you can conclude two interesting facts: the
highest temperature for all three years was in and around July (corresponding
to 7 on the x-axis), and it has been increasing from 2000 with a more dramatic
rise between 2000 and 2006. Having all three lines plotted together in one
graph makes it a lot easier to see these kinds of relationships. It’s certainly
clearer than just looking at a few long lists of numbers or even looking at
three lines plotted on three separate graphs.
Figure 2-9: A graph showing the average monthly temperature of New York City,
with a legend to show the year each color corresponds to
Customizing Graphs
We already learned about one way to customize a graph: by adding a
legend. Now, we’ll learn about other ways to customize a graph and to make it
clearer by adding labels to the x- and y-axes, adding a title to the graph, and
controlling the range and steps of the axes.
>>> from pylab import plot, show, title, xlabel, ylabel, legend
>>> plot(months, nyc_temp_2000, months, nyc_temp_2006, months, nyc_temp_2012)
[<matplotlib.lines.Line2D object at 0x7f2549a9e210>, <matplotlib.lines.Line2D
object at 0x7f2549a4be90>, <matplotlib.lines.Line2D object at 0x7f2549a82090>]
>>> title('Average monthly temperature in NYC')
<matplotlib.text.Text object at 0x7f25499f7150>
>>> xlabel('Month')
<matplotlib.text.Text object at 0x7f2549d79210>
>>> ylabel('Temperature')
<matplotlib.text.Text object at 0x7f2549b8b2d0>
Figure 2-10: Axes labels and a title have been added to the graph.
With the three new pieces of information added, the graph is easier to
understand.
>>> nyc_temp = [53.9, 56.3, 56.4, 53.4, 54.5, 55.8, 56.8, 55.0, 55.3, 54.0, 56.7, 56.4, 57.3]
>>> plot(nyc_temp, marker='o')
[<matplotlib.lines.Line2D object at 0x7f3ae5b767d0>]
>>> axis(ymin=0)
(0.0, 12.0, 0, 57.5)
Calling the axis() function with the new starting value for the y-axis
(specified by ymin=0) changes the range, and the returned tuple confirms
it. If you display the graph by calling the show() function, the y-axis starts
at 0, and the differences between the values of the consecutive years look
less drastic (see Figure 2-11).
Figure 2-11: A graph showing the average annual temperature of New York City
during the years 2000–2012. The y-axis has been customized to start from 0.
'''
Simple plot using pyplot
'''
import matplotlib.pyplot  # (1)

def create_graph():  # (2)
    x_numbers = [1, 2, 3]
    y_numbers = [2, 4, 6]
    matplotlib.pyplot.plot(x_numbers, y_numbers)
    matplotlib.pyplot.show()

if __name__ == '__main__':
    create_graph()
First, we import the pyplot module using the statement import
matplotlib.pyplot (1). This means that we’re importing the entire pyplot
module from the matplotlib package. To refer to any function or class
definition defined in this module, you’ll have to use the syntax
matplotlib.pyplot.item, where item is the function or class you want to use.
This is different from importing a single function or class at a time,
which is what we’ve been doing so far. For example, in the first chapter we
imported the Fraction class as from fractions import Fraction. Importing an
entire module is useful when you’re going to use a number of functions from
that module. Instead of importing them individually, you can just import the
whole module at once and refer to different functions when you need them.
In the create_graph() function at (2), we create the two lists of numbers
that we want to plot on the graph and then pass the two lists to the plot()
function, the same way we did before with pylab. This time, however, we call
the function as matplotlib.pyplot.plot(), which means that we’re calling the
plot() function defined in the pyplot module of the matplotlib package.
Then, we call the show() function to display the graph. The only difference
between the way you plot the numbers here compared to what we did
earlier is the mechanism of calling the functions.
To save us some typing, we can import the pyplot module by entering
import matplotlib.pyplot as plt. Then, we can refer to pyplot with the label
plt in our programs, instead of having to always type matplotlib.pyplot:
'''
Simple plot using pyplot
'''
import matplotlib.pyplot as plt

def create_graph():
    x_numbers = [1, 2, 3]
    y_numbers = [2, 4, 6]
    plt.plot(x_numbers, y_numbers)
    plt.show()

if __name__ == '__main__':
    create_graph()
Now, we can call the functions by prefixing them with the shortened plt
instead of matplotlib.pyplot.
Going ahead, for the rest of this chapter and this book, we’ll use pylab
in the interactive shell and pyplot otherwise.
This program will save the graph to an image file, mygraph.png, in your
current directory. On Microsoft Windows, this is usually C:\Python33 (where
you installed Python). On Linux, the current directory is usually your home
directory (/home/<username>), where <username> is the user you’re logged in
as. On a Mac, IDLE saves files to ~/Documents by default. If you wanted to save
it in a different directory, specify the complete pathname. For example, to
save the image under C:\ on Windows as mygraph.png, you’d call the savefig()
function as follows:
>>> savefig(r'C:\mygraph.png')
where r is the distance between the two bodies and G is the gravitational
constant. We want to see what happens to the force as the distance between
the two bodies increases.
Let’s take the masses of two bodies: the mass of the first body ($m_1$) is
0.5 kg, and the mass of the second body ($m_2$) is 1.5 kg. The value of the
gravitational constant is $6.674 \times 10^{-11}\ \mathrm{N\,m^2\,kg^{-2}}$.
Now we’re ready to calculate the gravitational force between these two bodies
at 19 different distances: 100 m, 150 m, 200 m, 250 m, 300 m, and so on up
through 1000 m. The following program performs these calculations and also
draws the graph:
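'''
The relationship between gravitational force and
distance between two bodies
'''
import matplotlib.pyplot as plt

# Draw the graph
def draw_graph(x, y):
    plt.plot(x, y, marker='o')
    plt.xlabel('Distance in meters')
    plt.ylabel('Gravitational force in newtons')
    plt.title('Gravitational force and distance')
    plt.show()

def generate_F_r():
    # Generate values for r
    r = range(100, 1001, 50)
    # Empty list to store the calculated values of F
    F = []
    # Constant, G
    G = 6.674*(10**-11)
    # Two masses
    m1 = 0.5
    m2 = 1.5
    # Calculate the force at each distance and add it to the list, F
    for dist in r:
        force = G*(m1*m2)/(dist**2)
        F.append(force)
    # Draw the graph of force against distance
    draw_graph(r, F)

if __name__=='__main__':
    generate_F_r()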
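The calculation itself can be tried without drawing anything. The sketch below assumes Newton’s law of universal gravitation, F = G × m1 × m2 / r², and uses the masses and distances given above. The function name gravitational_force() and the labels distances and forces are hypothetical.

```python
G = 6.674e-11  # gravitational constant in N m^2 kg^-2

def gravitational_force(m1, m2, r):
    # Hypothetical helper: force in newtons between masses m1 and m2 (kg)
    # that are r meters apart.
    return G * (m1 * m2) / (r ** 2)

distances = range(100, 1001, 50)   # 100 m, 150 m, ..., 1000 m
forces = []
for r in distances:
    forces.append(gravitational_force(0.5, 1.5, r))

print(len(forces))  # 19
```

Each value in forces is smaller than the one before it, which is the relationship the graph is meant to show: the force falls off as the distance grows.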
Projectile Motion
Now, let’s graph something you’ll be familiar with from everyday life. If
you throw a ball across a field, it follows a trajectory like the one shown in
Figure 2-13.
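[Figure: the curved path of the ball over the ground from point A to point B, with its highest point marked; the initial velocity u at angle θ has components ux and uy, and the velocity v has components vx and vy]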
Figure 2-13: Motion of a ball that’s thrown at point A—at an angle ( θ)
with a velocity (u)—and that hits the ground at point B
In the figure, the ball is thrown from point A and lands at point B. This
type of motion is referred to as projectile motion. Our aim here is to use the
equations of projectile motion to graph the trajectory of a body, showing
the position of the ball starting from the point it’s thrown until it hits the
ground again.
When you throw the ball, it has an initial velocity and the direction of
that velocity creates a certain angle with the ground. Let’s call the initial
velocity u and the angle that it makes with the ground θ (theta), as shown
in Figure 2-13. The ball has two velocity components: one along the x
direction, calculated by $u_x = u\cos\theta$, and the other along the y
direction, where $u_y = u\sin\theta$.
As the ball moves, its velocity changes, and we will represent that
changed velocity using v: the horizontal component is $v_x$ and the vertical
component is $v_y$. For simplicity, assume the horizontal component ($v_x$)
doesn’t change during the motion of the body, whereas the vertical
component ($v_y$) decreases because of the force of gravity according to the
equation $v_y = u_y - gt$. In this equation, g is the gravitational
acceleration and t is the time at which the velocity is measured. Because
$u_y = u\sin\theta$, we can substitute to get
$$v_y = u\sin\theta - gt.$$
Let’s take a ball that’s thrown with an initial velocity (u) of 5 m/s at an
angle (θ) of 45 degrees. To calculate the total time of flight, we substitute
u = 5, θ = 45, and g = 9.8 into the equation we saw above:
$$t_{\text{flight}} = \frac{2u\sin\theta}{g} = \frac{2(5)\sin 45^\circ}{9.8}$$
In this case, the time of flight for the ball turns out to be 0.72154 seconds
(rounded to five decimal places). The ball will be in air for this period of
time, so to draw the trajectory, we’ll calculate its x- and y-coordinates at
regular intervals during this time period. How often should we calculate
the coordinates? Ideally, as frequently as possible. In this chapter, we’ll
calculate the coordinates every 0.001 seconds.
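The time of flight quoted above can be checked with a few lines of Python. This sketch assumes the formula t_flight = 2 × u × sin θ / g; note that math.sin() expects the angle in radians, so the 45 degrees is converted first.

```python
import math

u = 5                      # initial velocity in m/s
theta = math.radians(45)   # angle of projection, converted to radians
g = 9.8                    # gravitational acceleration in m/s^2

t_flight = 2 * u * math.sin(theta) / g
print('{0:.5f}'.format(t_flight))  # 0.72154
```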
'''
Generate equally spaced floating point
numbers between two given values
'''
def frange(start, final, increment):
    numbers = []
    while start < final:  # (1)
        numbers.append(start)  # (2)
        start = start + increment
    return numbers
We’ve defined a function frange() (“floating point” range) that receives
three parameters: start and final refer to the starting and the final points
of the range of numbers, and increment refers to the difference between
two consecutive numbers. We initialize a while loop at (1), which continues
execution as long as the number referred to by start is less than the value
for final. We store the number pointed to by start in the list numbers (2) and
then add the value we entered as an increment during every iteration of the
loop. Finally, we return the list numbers.
We’ll use this function to generate equally spaced time instants in the
trajectory-drawing program described next.
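As a quick illustration of what frange() produces, here is a small sketch. The increment 0.25 is an arbitrary choice made here because it can be represented exactly in binary floating point, so the output is easy to predict.

```python
def frange(start, final, increment):
    """Return numbers from start up to (but not including) final."""
    numbers = []
    while start < final:
        numbers.append(start)
        start = start + increment
    return numbers

# 0.25 is exactly representable as a float, so repeated
# addition introduces no rounding error in this example.
print(frange(0, 1, 0.25))   # [0, 0.25, 0.5, 0.75]
```

With an increment such as 0.001, which has no exact binary representation, small rounding errors accumulate, so the generated values may differ very slightly from exact multiples of the increment.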
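'''
Draw the trajectory of a body in projectile motion
'''
from matplotlib import pyplot as plt
import math

def draw_graph(x, y):
    plt.plot(x, y)
    plt.xlabel('x-coordinate')
    plt.ylabel('y-coordinate')
    plt.title('Projectile motion of a ball')

def frange(start, final, interval):
    numbers = []
    while start < final:
        numbers.append(start)
        start = start + interval
    return numbers

def draw_trajectory(u, theta):
    theta = math.radians(theta)
    g = 9.8
    # Time of flight
    t_flight = 2*u*math.sin(theta)/g
    # Find time intervals
    intervals = frange(0, t_flight, 0.001)
    # Lists of x- and y-coordinates at each time instant
    x = []
    y = []
    for t in intervals:
        x.append(u*math.cos(theta)*t)
        y.append(u*math.sin(theta)*t - 0.5*g*t*t)
    draw_graph(x, y)

if __name__ == '__main__':
    try:
        u = float(input('Enter the initial velocity (m/s): '))
        theta = float(input('Enter the angle of projection (degrees): '))
    except ValueError:
        print('You entered an invalid input')
    else:
        draw_trajectory(u, theta)
        plt.show()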
Figure 2-14: The trajectory of a ball when thrown with a velocity of 25 m/s at an angle
of 60 degrees
if __name__ == '__main__':
Figure 2-15: The trajectory of a ball thrown at a 60-degree angle, with a velocity of
20, 40, and 60 m/s
Programming Challenges
Here are a few challenges that build on what you’ve learned in this
chapter. You can find sample solutions at http://www.nostarch.com/
doingmathwithpython/.
'''
Quadratic function calculator
'''
# Assume values of x
x_values = [-1, 1, 2, 3, 4, 5]  # ➊
for x in x_values:  # ➋
    # Calculate the value of the quadratic function
    y = x**2 + 2*x + 1
    print('x={0} y={1}'.format(x, y))
At ➊, we create a list with six different values for x. The for loop
starting at ➋ calculates the value of the function x**2 + 2*x + 1 for each
of these values and uses the label y to refer to the result. Next, we print
the value of x and the corresponding value of y. When you run the program,
you should see the following output:
x=-1 y=0
x=1 y=4
x=2 y=9
x=3 y=16
x=4 y=25
x=5 y=36
Notice that the first line of the output shows a root of the quadratic
equation because x = -1 is a value that makes the function equal to 0.
Your programming challenge is to enhance this program to create
a graph of the function. Try using at least 10 values for x instead of the 6
above. Calculate the corresponding y values using the function and then
create a graph using these two sets of values.
Once you’ve created the graph, spend some time analyzing how the
value of y varies with respect to x. Is the variation linear or nonlinear?
Enter category: Transportation
Expenditure: 35
Enter category: Entertainment
Expenditure: 30
Enter category: Phone/Internet
Expenditure: 30
Figure 2-16 shows the bar chart that will be created to compare the
expenditures. If you save the bar chart for every week, at the end of the
month, you’ll be able to see how the expenditures varied between the
weeks for different categories.
Figure 2-16: A bar chart showing the expenditures per category during the week
Figure 2-17: A bar chart showing the number of steps walked during a week
'''
Example of drawing a horizontal bar chart
'''
import matplotlib.pyplot as plt

def create_bar_chart(data, labels):
    # Number of bars
    num_bars = len(data)
    # This list is the point on the y-axis where each
    # bar is centered. Here it will be [1, 2, 3...]
    positions = range(1, num_bars+1)  # ➊
    plt.barh(positions, data, align='center')  # ➋
    # Set the label of each bar
    plt.yticks(positions, labels)
    plt.xlabel('Steps')
    plt.ylabel('Day')
    plt.title('Number of steps walked')
    # Turn on the grid, which may assist in visual estimation
    plt.grid()
    plt.show()

if __name__ == '__main__':
    # Number of steps I walked during the past week
    steps = [6534, 7000, 8900, 10786, 3467, 11045, 5095]
    # Corresponding days
    labels = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']
    create_bar_chart(steps, labels)
The create_bar_chart() function accepts two parameters: data, which is a
list of numbers we want to represent using the bars, and labels, which is
the corresponding list of labels. The center of each bar has to be
specified, and I’ve arbitrarily chosen the centers as 1, 2, 3, 4, and so on
using the help of the range() function at ➊.
We then call the barh() function, passing positions and data as the first
two arguments and then the keyword argument, align='center', at ➋. The
keyword argument specifies that the bars are centered at the positions on
the y-axis specified by the list. We then set the label for each bar using
the yticks() function, and we set the axis labels and the title using the
xlabel(), ylabel(), and title() functions. We also call the grid() function
to turn on the grid, which may be useful for a visual estimation of the
number of steps. Finally, we call the show() function.
def fibo(n):
    if n == 1:
        return [1]
    if n == 2:
        return [1, 1]
    # n > 2
    a = 1
    b = 1
    # First two members of the series
    series = [a, b]
    # Generate the remaining n - 2 members so the list has n numbers
    for i in range(n - 2):
        c = a + b
        series.append(c)
        a = b
        b = c
    return series
Figure 2-18: The ratio between the consecutive Fibonacci numbers approaches the
golden ratio.
3
Describing Data with Statistics
Note In statistics, some statistical measures are calculated slightly differently depending
on whether you have data for an entire population or just a sample. To keep things
simple, we’ll stick with the calculation methods for a population in this chapter.
Finding the Mean
The mean is a common and intuitive way to summarize a set of numbers.
It’s what we might simply call the “average” in everyday use, although as
we’ll see, there are other kinds of averages as well. Let’s take a sample set
of numbers and calculate the mean.
Say there’s a school charity that’s been taking donations over a period
of time spanning the last 12 days (we’ll refer to this as period A). In that
time, the following 12 numbers represent the total dollar amount of dona-
tions received for each day: 100, 60, 70, 900, 100, 200, 500, 500, 503, 600,
1000, and 1200. We can calculate the mean by summing these totals and
then dividing the sum by the number of days. In this case, the sum of the
numbers is 5733. If we divide this number by 12 (the number of days), we
get 477.75, which is the mean donation per day. This number gives us a gen-
eral idea of how much money was donated on any given day.
In a moment, we’ll write a program that calculates and prints the mean
for a collection of numbers. As we just saw, to calculate the mean, we’ll need
to take the sum of the list of numbers and divide it by the number of items
in the list. Let’s look at two Python functions that make both of these opera-
tions very easy: sum() and len().
When you use the sum() function on a list of numbers, it adds up all the
numbers in the list and returns the result:
>>> len(shortlist)
3
When we use the len() function on the list, it returns 3 because there
are three items in shortlist. Now we’re ready to write a program that will
calculate the mean of the list of donations.
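Before the full program, here is a small sketch of the two built-in functions working together. The contents of shortlist are an assumption made for illustration; any list of three numbers would behave the same way.

```python
# A short hypothetical list, chosen only because it has three items
shortlist = [1, 2, 3]

total = sum(shortlist)   # adds up every item: 1 + 2 + 3
count = len(shortlist)   # number of items in the list

print(total)           # 6
print(count)           # 3
print(total / count)   # 2.0
```

Dividing the sum by the length gives the mean, which is exactly what the program that follows does for the list of donations.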
'''
Calculating the mean
'''
def calculate_mean(numbers):
    s = sum(numbers)
    N = len(numbers)
    # Calculate the mean
    mean = s/N
    return mean

if __name__ == '__main__':
    donations = [100, 60, 70, 900, 100, 200, 500, 500, 503, 600, 1000, 1200]
    mean = calculate_mean(donations)
    N = len(donations)
    print('Mean donation over the last {0} days is {1}'.format(N, mean))
The calculate_mean() function will calculate the sum and length of any
list, so we can reuse it to calculate the mean for other sets of numbers, too.
We calculated that the mean donation per day was 477.75. It’s worth
noting that the donations during the first few days were much lower than
the mean donation we calculated and that the donations during the last
couple of days were much higher. The mean gives us one way to summarize
the data, but it doesn’t give us a full picture. There are other statistical
measurements, however, that can tell us more about the data when com-
pared with the mean.
Now we can write our next program, which finds the median of a list of
numbers:
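'''
Calculating the median
'''
def calculate_median(numbers):
    N = len(numbers)
    numbers.sort()
    # Find the median
    if N % 2 == 0:
        # Even count: average the two middle numbers
        m = N // 2
        median = (numbers[m-1] + numbers[m])/2
    else:
        # Odd count: take the middle number
        median = numbers[N // 2]
    return median

if __name__ == '__main__':
    donations = [100, 60, 70, 900, 100, 200, 500, 500, 503, 600, 1000, 1200]
    median = calculate_median(donations)
    N = len(donations)
    print('Median donation over the last {0} days is {1}'.format(N, median))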
>>> 6/2
3.0
As you can see, the mean (477.75) and the median (500) are pretty
close in this particular list, but the median is a little higher.
Here, we start off with a list of five numbers and import Counter from
the collections module. Then, we create a Counter object, using c to refer
to the object. We then call the most_common() method, which returns a list
ordered by the most common elements.
Each member of the list is a tuple. The first element of the first tuple
is the number that occurs most frequently, and the second element is the
number of times it occurs. The second, third, and fourth tuples contain the
other numbers along with the count of the number of times they appear.
This result tells us that 4 occurs the most (twice), while the others appear
only once. Note that numbers that occur an equal number of times are
returned by the most_common() method in an arbitrary order.
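The steps described above can be sketched as follows. The list contents are an assumption: any list of five numbers in which 4 appears twice and three other numbers appear once fits the description.

```python
from collections import Counter

# Hypothetical list of five numbers: 4 appears twice, the rest once
simplelist = [4, 1, 2, 3, 4]

c = Counter(simplelist)
ranked = c.most_common()   # list of (number, count) tuples

print(ranked[0])     # (4, 2): the most frequent number and its count
print(len(ranked))   # 4: one tuple per distinct number
```

The order of the tied entries after the first tuple is not something a program should rely on; only the first tuple is guaranteed to hold the highest count.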
When you call the most_common() method, you can also provide an argu-
ment telling it the number of most common elements you want it to return.
For example, if we just wanted to find the most common element, we would
call it with the argument 1:
>>> c.most_common(1)
[(4, 2)]
If you call the method again with 2 as an argument, you’ll see this:
>>> c.most_common(2)
[(4, 2), (1, 1)]
Now the result returned by the most_common() method is a list with two
tuples. The first is the most common element, followed by the second most
common. Of course, in this case, several elements are tied for second most
common, so the fact that the method returns 1 here (and not 2 or 3) is
arbitrary, as noted earlier.
The most_common() method returns both the numbers and the number
of times they occur. What if we want only the numbers and we don’t care
about the number of times they occur? Here’s how we can retrieve that
information:
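One way to pull out just the number is to index into the result twice. This is a minimal sketch using a hypothetical list in which 4 occurs twice:

```python
from collections import Counter

c = Counter([4, 1, 2, 3, 4])   # hypothetical list; 4 occurs twice

mode = c.most_common(1)   # a one-element list: [(4, 2)]
print(mode[0])            # (4, 2) -> the (number, count) tuple
print(mode[0][0])         # 4      -> just the number
```

The first index selects the tuple from the list, and the second index selects the number from the tuple.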
'''
Calculating the mode
'''
from collections import Counter

def calculate_mode(numbers):
    c = Counter(numbers)  # ➊
    mode = c.most_common(1)  # ➋
    return mode[0][0]  # ➌

if __name__=='__main__':
    scores = [7, 8, 9, 2, 10, 9, 9, 9, 9, 4, 5, 6, 1, 5, 6, 7, 8, 6, 1, 10]
    mode = calculate_mode(scores)
    print('The mode of the list of numbers is: {0}'.format(mode))
The calculate_mode() function finds and returns the mode of the numbers
passed to it as a parameter. To calculate the mode, we first import
the class Counter from the collections module and use it to create a Counter
object at ➊. Then, at ➋, we use the most_common() method, which, as we saw
earlier, gives us a list that contains a tuple with the most common number
and the number of times it occurs. We assign that list the label mode.
Finally, we use mode[0][0] ➌ to access the number we want: the most
frequent number from the list, which is the mode.
What if you have a set of data where two or more numbers occur the
same maximum number of times? For example, in the list of numbers 5, 5,
5, 4, 4, 4, 9, 1, and 3, both 4 and 5 are present three times. In such cases,
the list of numbers is said to have multiple modes, and our program should
find and print all the modes. The modified program follows:
'''
Calculating the mode when the list of numbers may
have multiple modes
'''
from collections import Counter

def calculate_mode(numbers):
    c = Counter(numbers)
    numbers_freq = c.most_common()
    # The first tuple holds the highest count
    max_count = numbers_freq[0][1]
    modes = []
    for num in numbers_freq:
        # Keep every number that occurs max_count times
        if num[1] == max_count:
            modes.append(num[0])
    return modes

if __name__ == '__main__':
    scores = [5, 5, 5, 4, 4, 4, 9, 1, 3]
    modes = calculate_mode(scores)
    print('The mode(s) of the list of numbers are:')
    for mode in modes:
        print(mode)
When you execute the preceding program, it prints the heading line followed
by the two modes, 4 and 5, each on its own line (the order of the two may
vary).
What if you wanted to find the number of times every number occurs
instead of just the mode? A frequency table, as the name indicates, is a table
that shows how many times each number occurs within a collection of
numbers.
Score Frequency
1 2
2 1
4 1
5 2
6 3
7 2
8 2
9 5
10 2
Note that the sum of the individual frequencies in the second column
adds up to the total number of scores (in this case, 20).
We’ll use the most_common() method once again to print the frequency
table for a given set of numbers. Recall that when we don’t supply an argu-
ment to the most_common() method, it returns a list of tuples with all the num-
bers and the number of times they appear. We can simply print each number
and its frequency from this list to display a frequency table.
Here’s the program:
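'''
Frequency table for a list of numbers
'''
from collections import Counter
def frequency_table(numbers):
    table = Counter(numbers)
    print('Number\tFrequency')
    for number in table.most_common():
        print('{0}\t{1}'.format(number[0], number[1]))
if __name__=='__main__':
    scores = [7, 8, 9, 2, 10, 9, 9, 9, 9, 4, 5, 6, 1, 5, 6, 7, 8, 6, 1, 10]
    frequency_table(scores)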
Number Frequency
9 5
6 3
1 2
5 2
7 2
8 2
10 2
2 1
4 1
Here, you can see that the numbers are listed in decreasing order of
frequency because the most_common() function returns the numbers in this
order. If, instead, you want your program to print the frequency table
sorted by value from lowest to highest, as shown in Table 3-1, you’ll have
to re-sort the list of tuples.
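Re-sorting works because Python compares tuples element by element, starting with the first. A small sketch with hypothetical (number, frequency) pairs:

```python
# Hypothetical (number, frequency) pairs, in decreasing order of frequency
numbers_freq = [(9, 5), (6, 3), (1, 2), (2, 1)]

# Tuples compare element by element, so sort() orders by the number first
numbers_freq.sort()
print(numbers_freq)   # [(1, 2), (2, 1), (6, 3), (9, 5)]
```

Because every number appears only once in such a list, the frequency in the second position never has to break a tie.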
The sort() method is all we need to modify our earlier frequency table
program:
'''
Frequency table for a list of numbers
Enhanced to display the table sorted by the numbers
'''
from collections import Counter

def frequency_table(numbers):
    table = Counter(numbers)
    numbers_freq = table.most_common()
    # Sort the (number, frequency) tuples by number
    numbers_freq.sort()
    print('Number\tFrequency')
    for number in numbers_freq:
        print('{0}\t{1}'.format(number[0], number[1]))

if __name__ == '__main__':
    scores = [7, 8, 9, 2, 10, 9, 9, 9, 9, 4, 5, 6, 1, 5, 6, 7, 8, 6, 1, 10]
    frequency_table(scores)
Number Frequency
1 2
2 1
4 1
5 2
6 3
7 2
8 2
9 5
10 2
In this section, we’ve covered mean, median, and mode, which are
three common measures for describing a list of numbers. Each of these can
be useful, but they can also hide other aspects of the data when considered
in isolation. Next, we’ll look at other, more advanced statistical measures
that can help us draw more conclusions about a collection of numbers.
'''
Find the range
'''
def find_range(numbers):
    lowest = min(numbers)
    highest = max(numbers)
    # Find the range
    r = highest-lowest
    return lowest, highest, r

if __name__ == '__main__':
    donations = [100, 60, 70, 900, 100, 200, 500, 500, 503, 600, 1000, 1200]
    lowest, highest, r = find_range(donations)
    print('Lowest: {0} Highest: {1} Range: {2}'.format(lowest, highest, r))
This tells us that the days’ total donations were fairly spread out, with a
range of 1140, because we had daily totals as small as 60 and as large as 1200.
A high variance means that values are far from the mean; a low variance
means that the values are clustered close to the mean. We calculate the
variance using the formula
$\mathrm{variance} = \frac{\sum (x_i - x_{mean})^2}{n}$
In the formula, xi stands for individual numbers (in this case, daily
total donations), x mean stands for the mean of these numbers (the mean
daily donation), and n is the number of values in the list (the number of
days on which donations were received). For each value in the list, we take
the difference between that number and the mean and square it. Then, we
add all those squared differences together and, finally, divide the whole
sum by n to find the variance.
If we want to calculate the standard deviation as well, all we have to do
is take the square root of the variance. Values that are within one standard
deviation of the mean can be thought of as fairly typical, whereas values
that are three or more standard deviations away from the mean can be
considered much more atypical—we call such values outliers.
Why do we have these two measures of dispersion—variance and stan-
dard deviation? In short, the two measures are useful in different situations.
Going back to the formula we used to calculate the variance, you can see
that the variance is expressed in square units because it’s the average of
the squared difference from the mean. For some mathematical formulas,
it’s nicer to work with those square units instead of taking the square root
to find the standard deviation. On the other hand, the standard deviation
is expressed in the same units as the population data. For example, if you
calculate the variance for our list of donations (as we will in a moment),
the result is expressed in dollars squared, which doesn’t make a lot of sense.
Meanwhile, the standard deviation is simply expressed in dollars, the same
unit as each of the donations.
The following program finds the variance and standard deviation for a
list of numbers:
'''
Find the variance and standard deviation of a list of numbers
'''
def calculate_mean(numbers):
    s = sum(numbers)
    N = len(numbers)
    # Calculate the mean
    mean = s/N
    return mean

def find_differences(numbers):
    # Find the mean
    mean = calculate_mean(numbers)
    # Find the differences from the mean
    diff = []
    for num in numbers:
        diff.append(num-mean)
    return diff

def calculate_variance(numbers):
    # Find the list of differences
    diff = find_differences(numbers)
    # Find the squared differences
    squared_diff = []
    for d in diff:
        squared_diff.append(d**2)
    # Find the variance
    sum_squared_diff = sum(squared_diff)
    variance = sum_squared_diff/len(numbers)
    return variance

if __name__ == '__main__':
    donations = [100, 60, 70, 900, 100, 200, 500, 500, 503, 600, 1000, 1200]
    variance = calculate_variance(donations)
    print('The variance of the list of numbers is {0}'.format(variance))
    std = variance**0.5
    print('The standard deviation of the list of numbers is {0}'.format(std))
The variance and the standard deviation are both very large, meaning
that the individual daily total donations vary greatly from the mean. Now,
let’s compare the variance and the standard deviation for a different set of
donations whose mean (about 385.33) is of a similar size: 382, 389, 377,
397, 396, 368, 369, 392,
398, 367, 393, and 396. In this case, the variance and the standard deviation
turn out to be 135.38888888888889 and 11.63567311713804, respectively.
Lower values for variance and standard deviation tell us that the individual
numbers are closer to the mean. Figure 3-1 illustrates this point visually.
Figure 3-1: Variation of the donations around the average donation
The mean donations for both lists of donations are similar, so the two
lines overlap, appearing as a single line in the figure. However, the dona-
tions from the first list vary widely from the mean, whereas the donations
from the second list are very close to the mean, which confirms what we
inferred from the lower variance value.
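The figures quoted for the second list can be cross-checked with Python's standard statistics module. This is only a sketch for verification, not the chapter's own program; pvariance() and pstdev() are used because they apply the population formulas (dividing by n) that this chapter follows.

```python
import statistics

# The second list of donations from the text
donations = [382, 389, 377, 397, 396, 368, 369, 392, 398, 367, 393, 396]

# pvariance() and pstdev() use the population formulas (divide by n)
variance = statistics.pvariance(donations)
std = statistics.pstdev(donations)

print(round(variance, 2))   # 135.39
print(round(std, 2))        # 11.64
```

The sample versions, statistics.variance() and statistics.stdev(), divide by n - 1 instead and would give slightly larger values.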
In statistics, you’ll often come across the statement “correlation doesn’t imply
causation.” This is a reminder that even if two sets of observations are strongly
correlated with each other, that doesn’t mean one variable causes the other.
When two variables are strongly correlated, sometimes there’s a third factor
that influences both variables and explains the correlation. A classic example
is the correlation between ice cream sales and crime rates—if you track both
of these variables in a typical city, you’re likely to find a correlation, but this
doesn’t mean that ice cream sales cause crime (or vice versa). Ice cream sales
and crime are correlated because they both go up as the weather gets hotter
during the summer. Of course, this doesn’t mean that hot weather directly
causes crime to go up either; there are more complicated causes behind that
correlation as well.
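The correlation coefficient is calculated using the formula
$\mathrm{correlation} = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left(n\sum x^2 - \left(\sum x\right)^2\right)\left(n\sum y^2 - \left(\sum y\right)^2\right)}}$
In the above formula, n is the total number of values present in each set
of numbers (the sets have to be of equal length). The two sets of numbers
are denoted by x and y (it doesn’t matter which one you denote as which).
The other terms are described as follows:
$\sum xy$  Sum of the products of the individual elements of the two sets
of numbers, x and y
$\sum x$  Sum of the numbers in set x
$\sum y$  Sum of the numbers in set y
$\left(\sum x\right)^2$  Square of the sum of the numbers in set x
$\left(\sum y\right)^2$  Square of the sum of the numbers in set y
$\sum x^2$  Sum of the squares of the numbers in set x
$\sum y^2$  Sum of the squares of the numbers in set y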
Once you’ve calculated these terms, you can combine them according
to the preceding formula to find the correlation coefficient. For small lists,
it’s possible to do this by hand without too much effort, but it certainly gets
complicated as the size of each set of numbers increases.
In a moment, we’ll write a program that calculates the correlation
coefficient for us. In this program, we’ll use the zip() function, which
will help us calculate the sum of products from the two sets of numbers.
Here’s an example of how the zip() function works:
1 4
2 5
3 6
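Output like the three lines above is what a loop over zip() prints when it is given the lists [1, 2, 3] and [4, 5, 6]. The following sketch shows this; the list names are hypothetical and chosen here for illustration.

```python
# Hypothetical list names; the values produce the pairs 1 4, 2 5, 3 6
simple_list1 = [1, 2, 3]
simple_list2 = [4, 5, 6]

# zip() pairs up the items at the same position in each list
for x, y in zip(simple_list1, simple_list2):
    print(x, y)

# The same pairing gives the sum of products used for correlation
sum_prod = sum(x * y for x, y in zip(simple_list1, simple_list2))
print(sum_prod)   # 1*4 + 2*5 + 3*6 = 32
```

If the two lists have different lengths, zip() stops at the end of the shorter one, which is one reason the two sets of numbers need to be of equal length.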
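def find_corr_x_y(x,y):
    n = len(x)
    # Find the sum of the products
    prod = []
    for xi,yi in zip(x,y):
        prod.append(xi*yi)
    sum_prod_x_y = sum(prod)
    sum_x = sum(x)
    sum_y = sum(y)
    squared_sum_x = sum_x**2
    squared_sum_y = sum_y**2
    x_square = []
    for xi in x:
        x_square.append(xi**2)
    # Find the sum
    x_square_sum = sum(x_square)
    y_square=[]
    for yi in y:
        y_square.append(yi**2)
    # Find the sum
    y_square_sum = sum(y_square)
    # Use the formula to calculate the correlation
    numerator = n*sum_prod_x_y - sum_x*sum_y
    denominator_term1 = n*x_square_sum - squared_sum_x
    denominator_term2 = n*y_square_sum - squared_sum_y
    denominator = (denominator_term1*denominator_term2)**0.5
    correlation = numerator/denominator
    return correlation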
Table 3-2: High School Grades and College Admission Test Performance
To analyze this data, let’s look at a scatter plot. Figure 3-2 shows the scat-
ter plot of the preceding data set, with the x-axis representing high school
grades and the y-axis representing the corresponding college admission test
performance.
Figure 3-2: Scatter plot of high school grades and college admission test scores
Table 3-3: High School Math Grades and College Admission Test Performance
Now, the scatter plot (Figure 3-3) shows the data points lying almost per-
fectly along a straight line. This is an indication of a high correlation between
the high school math scores and performance on the college admission test.
The correlation coefficient, in this case, turns out to be approximately 1.
With the help of the scatter plot and correlation coefficient, we can conclude
that there is indeed a strong relationship in this data set between grades in
high school math and performance on college admission tests.
Figure 3-3: Scatter plot of high school math grades and college admission test scores
Scatter Plots
In the previous section, we saw an example of how a scatter plot can give
us a first indication of the existence of any correlation between two sets of
numbers. In this section, we’ll see the importance of analyzing scatter plots
by looking at a set of four data sets. For these data sets, conventional statisti-
cal measures all turn out to be the same, but the scatter plots of each data
set reveal important differences.
First, let’s go over how to create a scatter plot in Python:
>>> x = [1, 2, 3, 4]
>>> y = [2, 4, 6, 8]
>>> import matplotlib.pyplot as plt
>>> plt.scatter(x, y)  # ➊
<matplotlib.collections.PathCollection object at 0x7f351825d550>
>>> plt.show()
The scatter() function is used to create a scatter plot between two lists
of numbers, x and y ➊. The only difference between this plot and the plots
we created in Chapter 2 is that here we use the scatter() function instead of
the plot() function. Once again, we have to call show() to display the plot.
Table 3-4: Anscombe’s Quartet—Four Different Data Sets with Almost Identical
Statistical Measures
A B C D
X1 Y1 X2 Y2 X3 Y3 X4 Y4
10.0 8.04 10.0 9.14 10.0 7.46 8.0 6.58
8.0 6.95 8.0 8.14 8.0 6.77 8.0 5.76
13.0 7.58 13.0 8.74 13.0 12.74 8.0 7.71
9.0 8.81 9.0 8.77 9.0 7.11 8.0 8.84
11.0 8.33 11.0 9.26 11.0 7.81 8.0 8.47
14.0 9.96 14.0 8.10 14.0 8.84 8.0 7.04
6.0 7.24 6.0 6.13 6.0 6.08 8.0 5.25
4.0 4.26 4.0 3.10 4.0 5.39 19.0 12.50
12.0 10.84 12.0 9.13 12.0 8.15 8.0 5.56
7.0 4.82 7.0 7.26 7.0 6.42 8.0 7.91
5.0 5.68 5.0 4.74 5.0 5.73 8.0 6.89
We’ll refer to the pairs (X1, Y1), (X2, Y2), (X3, Y3), and (X4, Y4) as
data sets A, B, C, and D, respectively. Table 3-5 presents the statistical mea-
sures of the data sets rounded off to two decimal digits.
Data set   Mean (X)   Std. dev. (X)   Mean (Y)   Std. dev. (Y)   Correlation
A 9.00 3.32 7.50 2.03 0.82
B 9.00 3.32 7.50 2.03 0.82
C 9.00 3.32 7.50 2.03 0.82
D 9.00 3.32 7.50 2.03 0.82
The scatter plots for each data set are shown in Figure 3-4.
1. F.J. Anscombe, “Graphs in Statistical Analysis,” American Statistician 27, no. 1 (1973): 17–21.
Figure 3-4: Scatter plots of Anscombe’s quartet
100
60
70
900
100
200
500
500
503
600
1000
1200
The following program will read this file and print the sum of the num-
bers stored in the file:
if __name__ == '__main__':
    sum_data('mydata.txt')
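The definition of sum_data() does not appear in the listing above. A minimal sketch of what such a function could look like follows, assuming the file holds one number per line; the return value is an addition made here so that the result can be reused.

```python
def sum_data(filename):
    """Print the sum of the numbers stored one per line in filename."""
    s = 0
    with open(filename) as f:
        for line in f:
            # Each line holds a single number; float() ignores the newline
            s = s + float(line)
    print('Sum of the numbers: {0}'.format(s))
    return s   # returned as well (an assumption) so the sum can be reused
```

The with statement closes the file automatically once the loop finishes, even if a line cannot be converted to a number.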
All our programs in this chapter have assumed that the input data is
available in lists. To use our earlier programs on the data from a file, we
need to first create a list from that data. Once we have a list, we can use
the functions we wrote earlier to calculate the corresponding statistic. The
following program calculates the mean of the numbers stored in the file
mydata.txt:
'''
Calculating the mean of numbers stored in a file
'''
def read_data(filename):
    numbers = []
    with open(filename) as f:
        for line in f:
            numbers.append(float(line))
    return numbers

def calculate_mean(numbers):
    s = sum(numbers)
    N = len(numbers)
    mean = s/N
    return mean

if __name__ == '__main__':
    data = read_data('mydata.txt')
    mean = calculate_mean(data)
    print('Mean: {0}'.format(mean))
Mean: 477.75
Of course, you’ll see a different value for the mean if the numbers in
your file are different from those in this example.
See Appendix B for hints on how you can ask the user to input the file-
name and then modify your program accordingly. This will allow your pro-
gram’s user to specify any data file.
Number,Squared
10,100
9,81
22,484
The first line is referred to as the header. In this case, it tells us that the
entries in the first column of this file are numbers and those in the second
column are the corresponding squares. The next three lines, or rows, con-
tain a number and its square separated by a comma. It’s possible to read the
data from this file using an approach similar to what I showed for the .txt
file. However, Python’s standard library has a dedicated module (csv) for
reading (and writing) CSV files, which makes things a little easier.
Save the numbers and their squares into a file, numbers.csv, in the
same directory as your programs. The following program shows how to
read this file and then create a scatter plot displaying the numbers against
their squares:
import csv
import matplotlib.pyplot as plt

def scatter_plot(x, y):
    # Plot the numbers against their squares
    plt.scatter(x, y)
    plt.xlabel('Number')
    plt.ylabel('Square')
    plt.show()

def read_csv(filename):
    numbers = []
    squared = []
    with open(filename) as f:
        reader = csv.reader(f)  # ➊
        next(reader)
        for row in reader:  # ➋
            numbers.append(int(row[0]))
            squared.append(int(row[1]))
    return numbers, squared

if __name__ == '__main__':
    numbers, squared = read_csv('numbers.csv')
    scatter_plot(numbers, squared)
The read_csv() function reads the CSV file using the reader() function
defined in the csv module (which is imported at the beginning of the
program). This function is called with the file object f passed to it as an
argument ➊. It returns a reader object, which acts like a pointer to the
first line of the CSV file. We know that the first line of the file is the
header, which we want to skip, so we move the pointer to the next line using
the next() function. We then read every line of the file with each line
referred to by the label row ➋, with row[0] referring to the first column of
the data and row[1] referring to the second. For this specific file, we know
that both these numbers are integers, so we use the int() function to
convert these from strings to integers and to store them in two lists. The
lists are then returned, one containing the numbers and the other containing
the squares.
We then call the scatter_plot() function with these two lists to create
the scatter plot. The find_corr_x_y() function we wrote earlier can also easily
be used to find the correlation coefficient between the two sets of numbers.
Now let’s try dealing with a more complex CSV file. Open
https://www.google.com/trends/correlate/ in your browser, enter any search
query you like (for example, summer), and click the Search correlations
button. You’ll
see that a number of results are returned under the heading “Correlated
with summer,” and the first result is the one with the highest correlation
(the number on the immediate left of each result). Click the Scatter plot
option above the graph to see a scatter plot with the x-axis labeled summer
and the y-axis labeled with the top result. Ignore the exact numbers plotted
on both axes as we’re interested only in the correlation and the scatter plot.
A little above the scatterplot, click Export data as CSV and a file down-
load will start. Save this file in the same directory as your programs.
This CSV file is slightly different from the one we saw earlier. At the
beginning of the file, you’ll see a number of blank lines and lines with a '#'
symbol until finally you’ll see the header and the data. These lines aren’t
useful to us—go ahead and delete them by hand using whatever software
you opened the file with so that the first line of the file is the header. Also
delete any blank lines at the end of the file. Now save the file. This step—
where we cleaned up the file to make it easier to process with Python—is
usually called preprocessing the data.
The header has several columns. The first contains the date of the
data in each row (each row has data corresponding to the week that started
on the date in this column). The second column is the search query you
entered, the third column shows the search query with the highest correla-
tion with your search query, and the other columns include a number of
other search queries arranged in decreasing order of correlation with your
entered search query. The numbers in these columns are the z-scores of the
corresponding search queries. The z-score indicates how far the number of
times a term was searched for during a specific week lies from the overall
mean number of searches per week for that term, measured in standard
deviations. A positive z-score indicates that the number of searches for
that week was higher than the mean, and a negative z-score indicates it was
lower.
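To make the idea of a z-score concrete, here is a small sketch that computes z-scores for a made-up series of weekly counts. The data is hypothetical, and the use of the population standard deviation is an assumption; the exact method used to produce the downloaded file is not stated in the text.

```python
import statistics

# Hypothetical weekly search counts for one term
weekly_searches = [120, 150, 130, 170, 180]

mean = statistics.mean(weekly_searches)     # 150
std = statistics.pstdev(weekly_searches)    # population standard deviation

# z-score: distance from the mean, in units of standard deviation
z_scores = [(count - mean) / std for count in weekly_searches]

for count, z in zip(weekly_searches, z_scores):
    print('{0}\t{1:+.2f}'.format(count, z))
```

Weeks with counts below 150 get negative z-scores, the week with exactly 150 searches gets 0, and weeks above 150 get positive z-scores.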
def read_csv(filename):
    with open(filename) as f:
        reader = csv.reader(f)
        next(reader)
        summer = []
        highest_correlated = []
        for row in reader:  # ➊
            summer.append(float(row[1]))
            highest_correlated.append(float(row[2]))
    return summer, highest_correlated
This is pretty much like the earlier version of the read_csv() function; the
main change here is how we append the values to each list starting at ➊:
we’re now reading the second and the third members of each row, and we’re
storing them as floating point numbers.
The following program uses this function to calculate the correlation
between the values for the search query you provided and the values for the
query with the highest correlation with it. It also creates a scatter plot of
these values:
if __name__ == '__main__':
    summer, highest_correlated = read_csv('correlate-summer.csv')  # ➊
    corr = find_corr_x_y(summer, highest_correlated)
    print('Highest correlation: {0}'.format(corr))
    scatter_plot(summer, highest_correlated)
Assuming that the CSV file was saved as correlate-summer.csv, we call the
read_csv() function to read the data in the second and third columns ➊.
Then, we call the find_corr_x_y() function we wrote earlier with the two lists
summer and highest_correlated. It returns the correlation coefficient, which
we then print. Now, we call the scatter_plot() function we wrote earlier
with these two lists again. Before you can run this program, you’ll need to
include the definitions of the read_csv(), find_corr_x_y(), and scatter_plot()
functions.
On running, you’ll see that it prints the correlation coefficient and also
creates a scatter plot. Both of these should be very similar to the data shown
on the Google correlate website.
What You Learned
In this chapter, you learned to calculate statistical measures to describe
a set of numbers and the relationships between sets of numbers. You also
used graphs to aid your understanding of these measures. You learned a
number of new programming tools and concepts while writing programs
to calculate these measures.
Programming Challenges
Next, apply what you’ve learned to complete the following programming
challenges.
Using this approach, write a program that will take a set of numbers in
a file and display the number that corresponds to a specific percentile sup-
plied as an input to the program.
Grade Frequency
1–6 6
6–11 14
The table classifies the grades into two classes: 1–6 (which includes
1 but not 6) and 6–11 (which includes 6 but not 11). It displays against
them the number of grades that belong to each category. Determining
the number of classes and the range of numbers in each class are two key
steps involved in creating this table. In this example, I’ve demonstrated two
classes with the range of numbers in each class equally divided between
the two.
Here’s one simple approach to creating classes, which assumes the
number of classes can be arbitrarily chosen:
>>> x = 1
>>> x + x + 1
3
First, we import the Symbol class from the sympy library. Then, we create
an object of this class, passing 'x' as a parameter. Note that this 'x' is
written as a string within quotes. We can now define expressions and equations
in terms of this symbol. For example, here’s the earlier expression:
>>> a = Symbol('x')
>>> a + a + 1
2*x + 1
Finding the Symbol Represented by a Symbol Object
For any Symbol object, its name attribute is a string that is the actual symbol it
represents:
>>> x = Symbol('x')
>>> x.name
'x'
>>> a = Symbol('x')
>>> a.name
'x'
You can use .name on a label to retrieve the symbol that it is storing.
Just to be clear, the symbol you create has to be specified as a string. For
example, you can’t create the symbol x using x = Symbol(x)—you must define
it as x = Symbol('x').
To define multiple symbols, you can either create separate Symbol objects
or use the symbols() function to define them more concisely. Let’s say you
wanted to use three symbols—x, y, and z—in your program. You could
define them individually, as we did earlier:
>>> x = Symbol('x')
>>> y = Symbol('y')
>>> z = Symbol('z')
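The more concise alternative mentioned above uses the symbols() function. Here is a minimal sketch, assuming the three names are passed as a single comma-separated string:

```python
from sympy import Symbol, symbols

# symbols() returns one Symbol object per name in the string
x, y, z = symbols('x,y,z')

# Each label refers to a Symbol with the matching name
print(x.name, y.name, z.name)

# Symbols with the same name compare equal, however they were created
print(x == Symbol('x'))
```

Either style gives you the same objects, so the choice is mostly a matter of how many symbols you need.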
>>> p = x*(x + x)
>>> p
2*x**2
>>> p = (x + 2)*(x + 3)
>>> p
(x + 2)*(x + 3)
You may have expected SymPy to multiply everything out and output
x**2 + 5*x + 6. Instead, the expression was printed exactly how we entered
it. SymPy automatically simplifies only the most basic of expressions and
leaves it to the programmer to explicitly require simplification in cases such
as the preceding one. If you want to multiply out the expression to get the
expanded version, you’ll have to use the expand() function, which we’ll see
in a moment.
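As a quick preview, here is a small sketch of that explicit step, reusing the expression from the session above:

```python
from sympy import Symbol, expand

x = Symbol('x')
p = (x + 2)*(x + 3)

# SymPy keeps the product as entered until we ask for the expansion
expanded = expand(p)
print(expanded)  # x**2 + 5*x + 6
```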
Next, we import the factor() function and use it to convert the
expanded version (on the left side of the identity) to the factored version
(on the right side):
>>> expand(factors)
x**3 + 3*x**2*y + 3*x*y**2 + y**3
The factor() function is able to factorize the expression, and then the
expand() function expands the factorized expression to return to the
original expression.
If you try to factorize an expression for which there’s no possible
factorization, the original expression is returned by the factor() function.
For example, see the following:
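The behavior described above can be sketched as follows. The second expression, x + y + x*y, is one chosen here for illustration because it has no factorization:

```python
from sympy import Symbol, factor, expand

x = Symbol('x')
y = Symbol('y')

# An expression that does have a factorization
expr = x**3 + 3*x**2*y + 3*x*y**2 + y**3
factors = factor(expr)
print(factors)                    # (x + y)**3
print(expand(factors) == expr)    # expanding takes us back

# An expression with no factorization: factor() hands it back unchanged
irreducible = x + y + x*y
print(factor(irreducible) == irreducible)
```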
Pretty Printing
If you want the expressions we’ve been working with to look a bit nicer when
you print them, you can use the pprint() function. This function will print
the expression in a way that more closely resembles how we’d normally
write it on paper. For example, here’s an expression:
>>> expr
x**2 + 2*x*y + y**2
Now, let’s use the pprint() function to print the preceding expression:
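Here is a minimal sketch of that call. The pretty() function used at the end is an addition for illustration: it returns the same layout as a string, which is handy when you want to inspect it rather than print it.

```python
from sympy import Symbol, pprint, pretty

x = Symbol('x')
y = Symbol('y')
expr = x*x + 2*x*y + y*y

# pprint() writes a multi-line, paper-like layout to the terminal
pprint(expr)

# pretty() returns the layout as a string instead of printing it;
# use_unicode=False asks for plain ASCII characters
layout = pretty(expr, use_unicode=False)
print(layout)
```

The exponents are drawn on the line above the terms, which is why the layout spans more than one line.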
The init_printing() function is first imported and called with the keyword
argument order='rev-lex'. This indicates that we want SymPy to print
the expressions so that they’re in reverse lexicographical order. In this case, the
keyword argument tells SymPy to print the lower-power terms first.
Note Although we used the init_printing() function here to set the printed order of the
expressions, this function can be used in many other ways to configure how an expression
is printed. For more options and to learn more about printing in SymPy, see the
documentation at http://docs.sympy.org/latest/tutorial/printing.html.
Printing a Series
Consider the following series:
$$x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^n}{n}$$
Let’s write a program that will ask a user to input a number, n, and
print this series for that number. In the series, x is a symbol and n is an
integer input by the program’s user. The nth term in this series is given by
$x^n / n$.
'''
Print the series:
x + x**2 + x**3 + ... + x**n
    ____   ____         ____
     2      3            n
'''
from sympy import Symbol, pprint

def print_series(n):
    x = Symbol('x')
    series = x
    for i in range(2, n+1):
        series = series + (x**i)/i
    pprint(series)

if __name__ == '__main__':
    n = input('Enter the number of terms you want in the series: ')
    print_series(int(n))
i = 2, series = x + x**2 / 2
i = 3, series = x + x**2/2 + x**3/3
--snip--
Try this out with a different number of terms every time. Next, we’ll see
how to calculate the sum of this series for a certain value of x.
Substituting in Values
Let’s see how we can use SymPy to plug values into an algebraic expression.
This will let us calculate the value of the expression for certain values of the
variables. Consider the mathematical expression $x^2 + 2xy + y^2$, which can be
defined as follows:
>>> x = Symbol('x')
>>> y = Symbol('y')
>>> x*x + x*y + x*y + y*y
x**2 + 2*x*y + y**2
>>> res
9
You can also express one symbol in terms of another and substitute
accordingly, using the subs() method. For example, if you knew that
x = 1 − y, here’s how you could evaluate the preceding expression:
>>> expr.subs({x:1-y})
y**2 + 2*y*(-y + 1) + (-y + 1)**2
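The two kinds of substitution discussed in this section can be sketched together as follows. The label expr and the values x = 1 and y = 2 follow the text; the final simplify() call is an extra step added here to show what the unsimplified result reduces to.

```python
from sympy import Symbol, simplify

x = Symbol('x')
y = Symbol('y')
expr = x*x + x*y + x*y + y*y

# Plug in numbers: 1 + 2*1*2 + 4
res = expr.subs({x: 1, y: 2})
print(res)  # 9

# Express x in terms of y; subs() leaves the result unsimplified
substituted = expr.subs({x: 1 - y})
print(substituted)

# Asking for simplification explicitly collapses it to a constant
print(simplify(substituted))  # 1
```

Note that subs() returns a new expression each time; expr itself is left unchanged.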
Python Dictionaries
A dictionary is another type of data structure in Python (lists and tuples are
other examples of data structures, which you’ve seen earlier). Dictionaries
contain key-value pairs inside curly braces, where each key is matched up with
a value, separated by a colon. In the preceding code listing, we entered the
dictionary {x:1, y:2} as an argument to the subs() method. This dictionary has
two key-value pairs—x:1 and y:2, where x and y are the keys and 1 and 2 are
the corresponding values. You can retrieve a value from a dictionary by
entering its associated key in brackets, much as we would retrieve an element from
a list using its index. For example, here we create a simple dictionary and then
retrieve the value corresponding to key1:
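A minimal sketch of such a lookup follows. The dictionary name and its contents are hypothetical, chosen only for illustration:

```python
# Hypothetical dictionary with two key-value pairs
sampledict = {'key1': 5, 'key2': 20}

# Retrieve a value by putting its key in brackets
value = sampledict['key1']
print(value)  # 5
```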
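'''
Print the series:
x + x**2 + x**3 + ... + x**n
    ____   ____         ____
     2      3            n
'''
from sympy import Symbol, pprint

def print_series(n, x_value):
    x = Symbol('x')
    series = x
    for i in range(2, n+1):
        series = series + (x**i)/i
    pprint(series)
    # Evaluate the series at x_value
    series_value = series.subs({x: x_value})
    print('Value of the series at {0}: {1}'.format(x_value, series_value))

if __name__ == '__main__':
    n = input('Enter the number of terms you want in the series: ')
    x_value = input('Enter the value of x at which you want to evaluate the series: ')
    print_series(int(n), float(x_value))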
If you execute the program now, it will ask you for the two inputs and
print out the series and the series value:
In this sample run, we ask for five terms in the series, with x set to 1.2,
and the program prints and evaluates the series.
>>> 2*expr
2*x**3 + 2*x**2 + 10*x
What happens when the user supplies an invalid expression? Let’s see:
Invalid input
The two changes in the preceding program are that we import the
SympifyError exception class from the sympy.core.sympify module and call
the sympify() function in a try...except block. Now if there’s a SympifyError
exception, an error message is printed.
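The pattern just described can be sketched as a small helper. The function name parse_expression() and the two input strings are assumptions made for illustration; only sympify() and SympifyError come from SymPy.

```python
from sympy import sympify
from sympy.core.sympify import SympifyError

def parse_expression(text):
    """Return a SymPy expression, or None if text cannot be converted."""
    try:
        return sympify(text)
    except SympifyError:
        print('Invalid input')
        return None

# A well-formed expression is converted
valid = parse_expression('x**2 + 3*x + x**3 + 2*x')

# '2x' is missing the * operator, so the conversion fails
invalid = parse_expression('x**2 + 3*x + x**3 + 2x')
```

Returning None on failure is one possible design; a program could just as well ask the user to try again.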
Expression Multiplier
Let’s apply the sympify() function to write a program that calculates the
product of two expressions:
'''
Product of two expressions
'''
from sympy import expand, sympify
from sympy.core.sympify import SympifyError

def product(expr1, expr2):
    prod = expand(expr1*expr2)
    print(prod)

if __name__=='__main__':
    expr1 = input('Enter the first expression: ')   # ➊
    expr2 = input('Enter the second expression: ')  # ➋
    try:
        expr1 = sympify(expr1)
        expr2 = sympify(expr2)
    except SympifyError:
        print('Invalid input')
    else:
        product(expr1, expr2)  # ➌
At ➊ and ➋, we read the two expressions from the user and then convert
them with sympify() in a try...except block. If the conversion succeeds
(indicated by the else block), we call the product() function at ➌. In this
function, we calculate the product of the two expressions and print it. Note
how we use the expand() function so that the product is printed as a sum of
its constituent terms.
Here’s a sample execution of the program:
The last line displays the product of the two expressions. The input can
also have more than one symbol in any of the expressions:
Solving Equations
SymPy’s solve() function can be used to find solutions to equations. When
you input an expression with a symbol representing a variable, such as x,
solve() calculates the value of that symbol. This function always makes its
calculation by assuming the expression you enter is equal to zero—that is,
it prints the value that, when substituted for the symbol, makes the entire
expression equal zero. Let’s start with the simple equation x − 5 = 7. If we
want to use solve() to find the value of x, we first have to make one side of
the equation equal zero (x − 5 − 7 = 0). Then, we’re ready to use solve(), as
follows:
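A minimal sketch of that call, using the rearranged equation from the paragraph above:

```python
from sympy import Symbol, solve

x = Symbol('x')

# x - 5 = 7, rearranged so that one side is zero
expr = x - 5 - 7

solutions = solve(expr)
print(solutions)  # [12]
```

The result is a list because an equation can have more than one solution.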
>>> x=Symbol('x')
>>> expr = x**2 + x + 1
>>> solve(expr, dict=True)
[{x: -1/2 - sqrt(3)*I/2}, {x: -1/2 + sqrt(3)*I/2}]
Both roots are complex, as expected, with the imaginary component
indicated by the I symbol.
>>> x = Symbol('x')
>>> a = Symbol('a')
>>> b = Symbol('b')
>>> c = Symbol('c')
Next, we write the expression corresponding to the equation and use
the solve() function on it:
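As a sketch of this step, assuming the equation in question is the generic quadratic ax² + bx + c = 0 (which is what the symbols a, b, c, and x suggest):

```python
from sympy import Symbol, solve, simplify

x = Symbol('x')
a = Symbol('a')
b = Symbol('b')
c = Symbol('c')

# Generic quadratic; the second argument tells solve() which symbol to solve for
expr = a*x*x + b*x + c
roots = solve(expr, x, dict=True)
print(roots)

# Substituting each root back should reduce the expression to zero
checks = [simplify(expr.subs(root)) for root in roots]
print(checks)
```

Because the expression contains more than one symbol, we have to say explicitly that x is the unknown.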
Now that we have the expression for t (referred to by the label t_expr),
we can use the subs() method to replace the values of s, u, and a to find the
two possible values of t.
>>> x = Symbol('x')
>>> y = Symbol('y')
>>> expr1 = 2*x + 3*y - 6
>>> expr2 = 3*x + 2*y - 12
The two equations are defined by the expressions expr1 and expr2,
respectively. Note how we’ve rearranged the expressions so they both
equal zero (we moved the right side of the given equations to the left
side). To find the solution, we call the solve() function with the two
expressions forming a tuple:
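A sketch of the call, followed by a check that the solution really satisfies both equations. The dict=True argument is the same one used earlier in the chapter:

```python
from sympy import Symbol, Rational, solve

x = Symbol('x')
y = Symbol('y')
expr1 = 2*x + 3*y - 6
expr2 = 3*x + 2*y - 12

# Pass both expressions together as a tuple
solution = solve((expr1, expr2), dict=True)
print(solution)  # [{x: 24/5, y: -6/5}]

# Verify by substituting the values back into each expression
soln = solution[0]
print(expr1.subs(soln), expr2.subs(soln))  # 0 0
```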
the gravitational force for each distance value and supply the lists of
distances and forces to matplotlib. With SymPy, on the other hand, you can
just tell SymPy the equation of the line you want to plot, and the graph will
be created for you. Let’s plot a line whose equation is given by y = 2x + 3:
All we had to do was import plot from sympy.plotting and Symbol from sympy,
create a symbol, x, and call the plot() function with the expression 2*x+3.
SymPy takes care of everything else and plots the graph of the function, as
shown in Figure 4-1.
Here, a tuple consisting of the symbol, the lower bound, and the upper
bound of the range— (x, -5, 5)—is specified as the second argument to the
plot() function. Now, the graph displays only the values of y corresponding
to the values of x between −5 and 5 (see Figure 4-2).
Figure 4-2: Plot of the line y = 2x + 3 with the values of x restricted to the range −5 to 5
You can use other keyword arguments in the plot() function, such as
title to enter a title or xlabel and ylabel to label the x-axis and the y-axis,
respectively. The following plot() function specifies the preceding three
keyword arguments (see the corresponding graph in Figure 4-3):
Figure 4-3: Plot of the line y = 2x + 3 with the range of x and other attributes specified
The plot shown in Figure 4-3 now has a title and labels on the x-axis
and the y-axis. You can specify a number of other keyword arguments to
the plot() function to customize the behavior of the function as well as the
graph itself. The show keyword argument allows us to specify whether we
want the graph to be displayed. Passing show=False will cause the graph to
not be displayed when you call the plot() function:
>>> p = plot(2*x + 3, (x, -5, 5), title='A Line', xlabel='x', ylabel='2x+3', show=False)
You will see that no graph is shown. The label p refers to the plot that is
created, so you can now call p.show() to display the graph. You can also save
the graph as an image file using the save() method, as follows:
>>> p.save('line.png')
This will save the plot to a file line.png in the current directory.
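'''
Plot the graph of an input expression
'''
from sympy import Symbol, sympify, solve
from sympy.plotting import plot
from sympy.core.sympify import SympifyError

def plot_expression(expr):
    y = Symbol('y')
    solutions = solve(expr, y)
    expr_y = solutions[0]
    plot(expr_y)

if __name__=='__main__':
    expr = input('Enter your expression in terms of x and y: ')
    try:
        expr = sympify(expr)
    except SympifyError:
        print('Invalid input')
    else:
        plot_expression(expr)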
At ➊, we call the plot() function with the equations for the two lines
but pass two additional keyword arguments—legend and show. By setting
the legend argument to True, we add a legend to the graph, as we saw in
Chapter 2. Note, however, that the text that appears in the legend will
match the expressions you plotted—you can’t specify any other text. We
also set show=False because we want to set the color of the lines before we
draw the graph. The statement at ➋, p[0], refers to the first line, 2x + 3, and
we set its attribute line_color to 'b', meaning that we want this line to be
blue. Similarly, we set the color of the second plot to red using the string
'r' ➌. Finally, we call the show() method to display the graph (see Figure 4-5).
Figure 4-5: Plot of the two lines with each line drawn in a different color
In addition to red and blue, you can plot the lines in green, cyan,
magenta, yellow, black, and white (using the first letter of the color in
each case).
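The two-line plot described above can be sketched as follows. The second line, 3x + 1, is an assumption made for illustration (the discussion names only 2x + 3), and p.show() is left commented out so that the sketch also runs where no display is available:

```python
from sympy import Symbol
from sympy.plotting import plot

x = Symbol('x')

# Build the plot without drawing it, so the colors can be set first
p = plot(2*x + 3, 3*x + 1, legend=True, show=False)

# p[0] is the first line, p[1] the second
p[0].line_color = 'b'
p[1].line_color = 'r'

# Uncomment to display the graph
# p.show()
```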
Programming Challenges
Here are a few programming challenges that should help you further apply
what you’ve learned. You can find sample solutions at
http://www.nostarch.com/doingmathwithpython/.
Now, expr1 and expr2 will store the two expressions input by the user.
You should convert both of these into SymPy objects using the sympify()
function in a try...except block.
All you need to do from here is plot these two expressions instead
of one.
Once you’ve completed this, enhance your program to print the
solution—the pair of x and y values that satisfies both equations. This
will also be the spot where the two lines on the graph intersect. (Hint:
Refer to how we used the solve() function earlier to find the solution of a
system of two linear equations.)
We call the summation() function at ➊, with the first argument being the
nth term of the series and the second argument being a tuple that states the
range of n. We want the sum of the first five terms here, so the second
argument is (n, 1, 5).
Once you have the sum, you can use the subs() method to substitute a
value for x to find the numerical value of the sum:
>>> s.subs({x:1.2})
3.51206400000000
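Putting the pieces together, here is a sketch of the whole calculation. It assumes the series is the x**n/n series printed earlier in the chapter, which agrees with the numerical value shown above:

```python
from sympy import Symbol, summation, pprint

x = Symbol('x')
n = Symbol('n')

# Sum of the first five terms; the tuple gives the range of n
s = summation(x**n/n, (n, 1, 5))
pprint(s)

# Substitute a value for x to get a number
value = s.subs({x: 1.2})
print(value)
```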
#4: Solving Single-Variable Inequalities
You’ve seen how to solve an equation using SymPy’s solve() function. But
SymPy is also capable of solving single-variable inequalities, such as x + 5 > 3
and sin x − 0.6 > 0. That is, SymPy can solve relations besides equality, like
>, <, and so on. For this challenge, create a function, isolve(), that will take
any inequality, solve it, and then return the solution.
First, let’s learn about the SymPy functions that will help you implement
this. The inequality-solving functions are available as three separate
functions for polynomial, rational, and all other inequalities. We’ll need to
pick the right function to solve various inequalities, or we’ll get an error.
A polynomial is an algebraic expression consisting of a variable and
coefficients and involving only the operations of addition, subtraction,
and multiplication and only positive powers of the variable. An example
of a polynomial inequality is $x^2 + 4 < 0$.
To solve a polynomial inequality, use the solve_poly_inequality()
function:
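Here is a sketch of how such a call might look. The inequality −x² + 4 < 0 is chosen here for illustration because it has a nonempty solution; solve_poly_inequality() expects a Poly object and the relational operator as a string, so we extract both from the inequality first:

```python
from sympy import Poly, Symbol, Union
from sympy.solvers.inequalities import solve_poly_inequality

x = Symbol('x')
ineq_obj = -x**2 + 4 < 0

# The left side of the inequality, turned into a polynomial object
lhs = ineq_obj.lhs
p = Poly(lhs, x)

# The relational operator as a string, here '<'
rel = ineq_obj.rel_op

# The solution is returned as a list of intervals
solution = solve_poly_inequality(p, rel)
print(solution)

# Combining the intervals makes membership checks convenient
region = Union(*solution)
print(-3 in region, 0 in region)
```

The inequality holds for x less than −2 or greater than 2, so −3 lies in the solution and 0 does not.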
Hints: Handy Functions
Now remember—your challenge is (1) to create a function, isolve(), that
will take any inequality and (2) to choose one of the appropriate functions
discussed in this section to solve it and return the solution. The following
hints may be useful to implement this function.
The is_polynomial() method can be used to check whether an expression
is a polynomial or not:
>>> from sympy import Symbol, sin
>>> x = Symbol('x')
>>> expr = x**2 - 4
>>> expr.is_polynomial()
True
>>> expr = 2*sin(x) + 3
>>> expr.is_polynomial()
False
When you run your program, it should ask the user to input an inequality
expression and print back the solution.
What’s a Set?
A set is a collection of distinct objects, often called elements or members. Two
characteristics of a set make it different from just any collection of objects.
A set is “well defined,” meaning the question “Is a particular object in this
collection?” always has a clear yes or no answer, usually based on a rule or
some given criteria. The second characteristic is that no two members of
a set are the same. A set can contain anything—numbers, people, things,
words, and so on.
Let’s walk through some basic properties of sets as we learn how to work
with sets in Python using SymPy.
Set Construction
In mathematical notation, you represent a set by writing the set members
enclosed in curly brackets. For example, {2, 4, 6} represents a set with 2, 4,
and 6 as its members. To create a set in Python, we can use the FiniteSet
class from the sympy package, as follows:
Here, we first import the FiniteSet class from SymPy and then create an
object of this class by passing in the set members as arguments. We assign
the label s to the set we just created.
We can store different types of numbers—including integers, floating
point numbers, and fractions—in the same set:
The cardinality of a set is the number of members in the set, which you
can find by using the len() function:
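The steps described so far can be sketched in one short session. The particular members of the second set are assumptions chosen to mix an integer, a floating point number, and a fraction:

```python
from fractions import Fraction
from sympy import FiniteSet

# A set with 2, 4, and 6 as its members
s = FiniteSet(2, 4, 6)
print(s)

# Different types of numbers can live in the same set
mixed = FiniteSet(1, 1.5, Fraction(1, 5))

# The cardinality is the number of members
print(len(mixed))  # 3

# The in operator checks for membership
print(4 in s)      # True
print(5 in s)      # False
```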
>>> 4 in s
False
Creating an Empty Set
If you want to make an empty set, which is a set that doesn’t have any elements
or members, create a FiniteSet object without passing any arguments. The
result is an EmptySet object:
>>> s = FiniteSet()
>>> s
EmptySet()
Here, even though we passed in a list that had two instances of the
number 2, the number 2 appears only once in the set created from that list.
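A minimal sketch of creating a set from a list follows. The list contents are hypothetical; the * in front of the list name unpacks it into separate arguments:

```python
from sympy import FiniteSet

# The number 2 appears twice in this list
members = [1, 2, 3, 2]

# Unpack the list so that each item becomes a set member
s = FiniteSet(*members)
print(s)
print(len(s))  # 3, because the repeated 2 is stored only once
```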
In Python lists and tuples, each element is stored in a particular order,
but the same is not always true for sets. For example, we can print out each
member of a set by iterating through it as follows:
When you run this code, the elements could be printed in any possible
order. This is because of how sets are stored by Python—it keeps track of
what members are in the set, but it doesn’t keep track of any particular
order for those members.
Let’s see another example. Two sets are equal when they have the same
elements. In Python, you can use the equality operator, ==, to check whether
two sets are equal:
Although the members of these two sets appear in different orders, the
sets are still equal.
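Both ideas, iterating over a set and comparing two sets, can be sketched as follows. The members are assumptions chosen for illustration:

```python
from sympy import FiniteSet

s = FiniteSet(3, 4, 5)
t = FiniteSet(5, 4, 3)

# Visit each member; don't rely on any particular order
for member in s:
    print(member)

# Equality depends only on the members, not on the order given
same = (s == t)
print(same)  # True
```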
>>> s = FiniteSet(1)
>>> t = FiniteSet(1,2)
>>> s.is_subset(t)
True
>>> t.is_subset(s)
False
Note that an empty set is a subset of every set. Also, any set is a subset of
itself, as you can see in the following:
>>> s.is_subset(s)
True
>>> t.is_subset(t)
True
>>> s.is_superset(t)
False
>>> t.is_superset(s)
True
The power set of a set, s, is the set of all possible subsets of s. Any set, s,
has precisely $2^{|s|}$ subsets, where |s| is the cardinality of the set. For example,
the set {1, 2, 3} has a cardinality of 3, so it has $2^3$, or 8, subsets: {} (the empty
set), {1}, {2}, {3}, {1, 2}, {2, 3}, {1, 3}, and {1, 2, 3}.
The set of all these subsets forms the power set, and we can find the
power set using the powerset() method:
>>> s = FiniteSet(1, 2, 3)
>>> ps = s.powerset()
>>> ps
{{1}, {1, 2}, {1, 3}, {1, 2, 3}, {2}, {2, 3}, {3}, EmptySet()}
As the power set is a set itself, you can find its cardinality using the len()
function:
>>> len(ps)
8
>>> t = FiniteSet(1, 2, 3, 4)
>>> s.is_proper_subset(t)
True
>>> t.is_proper_superset(s)
True
In the first chapter, we learned that there are different kinds of numbers—
integers, floating point numbers, fractions, and complex numbers. All these
numbers form different sets of numbers, which have special names.
All positive and negative whole numbers, together with zero, form the set of
integers. All positive integers form the set of natural numbers (sometimes 0 is
included in this set of numbers even though it’s not positive, but sometimes not).
This means the set of natural numbers is a proper subset of the set of integers.
The set of rational numbers includes any number that can be expressed
as a fraction, which includes all integers, plus any number with a decimal
ending that terminates or repeats (including numbers like 1/4 or 0.25, and 1/3 or
0.33333 . . . ). By contrast, nonrepeating, nonterminating decimal numbers are
known as irrational numbers. The square root of 2 and π are both examples of
irrational numbers because they go on forever without repeating.
If you put together all the rational and irrational numbers, you get the set
of real numbers. But even larger than that is the set of complex numbers, which
includes all real numbers and all numbers with an imaginary component.
All of these sets of numbers are infinite sets because they have infinitely
many members. In contrast, the sets we’ve discussed in this chapter have a finite
number of members, which is why the SymPy class we’re using is called FiniteSet.
Set Operations
Set operations such as union, intersection, and the Cartesian product allow
you to combine sets in certain methodical ways. These set operations are
extremely useful in real-world problem-solving situations when we have to
consider multiple sets together. Later in this chapter, we’ll see how to use
these operations to apply a formula to multiple sets of data and calculate
the probabilities of random events.
We find the union of s and t by applying the union method to s and
passing in t as an argument. The result is a third set with all the distinct
members of the two sets. In other words, each member of this third set is
a member of one or both of the first two sets.
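A minimal sketch of the union operation just described. The members of s and t are assumptions chosen for illustration:

```python
from sympy import FiniteSet

s = FiniteSet(1, 2, 3)
t = FiniteSet(2, 4, 6)

# Every distinct member of s and t ends up in the result
combined = s.union(t)
print(combined)
print(len(combined))  # 5, because 2 is counted only once
```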
The intersection of two sets creates a new set from the elements common
to both sets. For example, the intersection of the sets {1, 2} and {2, 3} will
result in a new set with the only common element, {2}. Mathematically, this
operation is written as {1, 2} ∩ {2, 3}.
In SymPy, use the intersect() method to find the intersection:
>>> s = FiniteSet(1, 2)
>>> t = FiniteSet(2, 3)
>>> s.intersect(t)
{2}
Whereas the union operation finds members that are in one set or
another, the intersection operation finds elements that are present in
both. Both of these operations can also be applied to more than two sets.
For example, here’s how you’d find the union of three sets:
>>> s.intersect(t).intersect(u)
EmptySet()
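Both three-set operations can be sketched by chaining the method calls. The three sets below are assumptions chosen for illustration; they have no member common to all three, which is consistent with the empty intersection shown above:

```python
from sympy import FiniteSet, S

s = FiniteSet(1, 2, 3)
t = FiniteSet(2, 4, 6)
u = FiniteSet(3, 5, 7)

# Union of three sets: chain union() calls
all_members = s.union(t).union(u)
print(all_members)

# Intersection of three sets: chain intersect() calls
common = s.intersect(t).intersect(u)
print(common)
```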
Cartesian Product
The Cartesian product of two sets creates a set that consists of all possible
pairs made by taking an element from each set. For example, the Cartesian
product of the sets {1, 2} and {3, 4} is {(1, 3), (1, 4), (2, 3), (2, 4)}. In SymPy,
you can find the Cartesian product of two sets by simply using the
multiplication operator:
Here, for example, we raised the set s to the power of 3. Because we’re
taking the Cartesian product of three sets, this gives us a set of all possible
triplets that contain a member of each set:
Finding the Cartesian product of sets is useful for finding all possible
combinations of the set members, which we’ll explore next.
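The operations in this section can be sketched as follows, using the sets {1, 2} and {3, 4} from the example above. Converting each member to int is done here only so that the printed pairs are plain Python numbers:

```python
from sympy import FiniteSet

s = FiniteSet(1, 2)
t = FiniteSet(3, 4)

# Cartesian product of two sets with the multiplication operator
p = s*t
pairs = sorted(tuple(int(v) for v in elem) for elem in p)
print(pairs)  # [(1, 3), (1, 4), (2, 3), (2, 4)]

# Raising a set to a power gives the product of the set with itself
cube = s**3
triplets = list(cube)
print(len(triplets))  # 8
```

The number of members in a Cartesian product is the product of the cardinalities of the individual sets, which is why s**3 has 2 × 2 × 2 members.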
Applying a Formula to Multiple Sets of Variables
Consider a simple pendulum of length L. The time period, T, of this
pendulum—that is, the amount of time it takes for the pendulum to
complete one full swing—is given by the formula
$$T = 2\pi\sqrt{\frac{L}{g}}$$
where g is the acceleration due to gravity.
if __name__ == '__main__':
    L = FiniteSet(15, 18, 21, 22.5, 25)
    for l in L:
        t = time_period(l/100)
        print('Length: {0} cm Time Period: {1:.3f} s'.format(float(l), float(t)))
--snip--
    T = 2*pi*(length/g)**0.5
    return T
if __name__ == '__main__':
--snip--
18.0 9.78 0.852
18.0 9.8 0.852
18.0 9.83 0.850
21.0 9.78 0.921
21.0 9.8 0.920
21.0 9.83 0.918
22.5 9.78 0.953
22.5 9.8 0.952
22.5 9.83 0.951
25.0 9.78 1.005
25.0 9.8 1.004
25.0 9.83 1.002
This experiment presents a simple scenario where you need all possible
combinations of the elements of multiple sets (or a group of numbers). In
this type of situation, the Cartesian product is exactly what you need.
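As a sketch of how the Cartesian product drives such an experiment, the following program computes the time period for every combination of length and gravity. The two sets are read off the table above; the function signature, the column headings, and the use of the math module are assumptions made for illustration:

```python
from math import pi, sqrt
from sympy import FiniteSet

def time_period(length, g):
    """Time period (s) of a simple pendulum; length in meters."""
    return 2*pi*sqrt(length/g)

lengths_cm = FiniteSet(15, 18, 21, 22.5, 25)
g_values = FiniteSet(9.8, 9.78, 9.83)

print('{0:^15}{1:^15}{2:^15}'.format('Length(cm)', 'Gravity(m/s^2)', 'Time Period(s)'))

# Each member of the product is a (length, gravity) pair
for length, g in lengths_cm*g_values:
    t = time_period(float(length)/100, float(g))
    print('{0:^15}{1:^15}{2:^15.3f}'.format(float(length), float(g), t))
```

With five lengths and three values of gravity, the loop runs 15 times, once per combination.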
Probability
Sets allow us to reason about the basic concepts of probability. We’ll begin
with a few definitions:
Experiment: The experiment is simply the test we want to perform. We
perform the test because we’re interested in the probability of each
possible outcome. Rolling a die, flipping a coin, and pulling a card from a
deck of cards are all examples of experiments. A single run of an
experiment is referred to as a trial.
Sample space: All possible outcomes of an experiment form a set
known as the sample space, which we’ll usually call S in our formulas.
For example, when a six-sided die is rolled once, the sample space is
{1, 2, 3, 4, 5, 6}.
Event: An event is a set of outcomes that we want to calculate the
probability of and that form a subset of the sample space. For example, we
might want to know the probability of a particular outcome, like rolling
a 3, or the probability of a set of multiple outcomes, such as rolling an
even number (either 2, 4, or 6). We’ll use the letter E in our formulas to
stand for an event.
If there’s a uniform distribution—that is, if each outcome in the sample
space is equally likely to occur—then the probability of an event, P(E),
occurring is calculated using the following formula (I’ll talk about
nonuniform distributions a bit later in this chapter):
$$P(E) = \frac{n(E)}{n(S)}$$
Here, n(E) and n(S) are the cardinality of the sets E, the event, and S,
the sample space, respectively. The value of P(E) ranges from 0 to 1, with
higher values indicating a higher chance of the event happening.
S = {1, 2, 3, 4, 5, 6}
E = {3}
n(S) = 6
n(E) = 1
P(E) = n(E)/n(S) = 1/6
This confirms what was obvious all along: the probability of a particular
die roll is 1/6. You could easily do this calculation in your head, but we can
use this formula to write the following function in Python that calculates
the probability of any event, event, in any sample space, space:
In this function, the two arguments space and event—the sample space
and event—need not be sets created using FiniteSet. They can also be lists
or, for that matter, any other Python object that supports the len() function.
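A minimal sketch of such a function, written directly from the formula P(E) = n(E)/n(S). The two sample calls are chosen here for illustration:

```python
from sympy import FiniteSet

def probability(space, event):
    """Probability of event, assuming each outcome in space is equally likely."""
    return len(event)/len(space)

# Rolling a 3 with a six-sided die
space = FiniteSet(1, 2, 3, 4, 5, 6)
event = FiniteSet(3)
p_three = probability(space, event)
print(p_three)

# Plain lists work too, because only len() is used
p_even = probability([1, 2, 3, 4, 5, 6], [2, 4, 6])
print(p_even)  # 0.5
```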
Using this function, let’s write a program to find the probability of a
prime number appearing when a 20-sided die is rolled:
from sympy import FiniteSet

def probability(space, event):
    return len(event)/len(space)

def check_prime(number):  # ➊
    if number != 1:
        for factor in range(2, number):
            if number % factor == 0:
                return False
    else:
        return False
    return True

if __name__ == '__main__':
    space = FiniteSet(*range(1, 21))  # ➋
    primes = []
    for num in space:
        if check_prime(num):  # ➌
            primes.append(num)
    event = FiniteSet(*primes)  # ➍
    p = probability(space, event)
    print('Probability of rolling a prime: {0:.5f}'.format(p))
We first create a set representing the sample space, space, using the
range() function at ➋. To create the event set, we need to find the prime
numbers from the sample space, so we define a function, check_prime(), at ➊.
This function takes an integer and checks to see whether it’s divisible (with
no remainder) by any number between 2 and itself. If so, it returns False.
Because a prime number is only divisible by 1 and itself, this function
returns True if an integer is prime and False otherwise.
We call this function for each of the numbers in the sample space at ➌
and add the prime numbers to a list, primes. Then, we create our event
set, event, from this list at ➍. Finally, we call the probability() function we
created earlier. We get the following output when we run the program:
if __name__ == '__main__':
    space = range(1, 21)
    primes = []
    for num in space:
        if check_prime(num):
            primes.append(num)
    p = probability(space, primes)
which generates a floating point number between 0 and 1. Let’s see a quick
example of how the randint() function works:
>>> import random
>>> random.randint(1, 6)
6
'''
Roll a die until the total score is 20
'''
import random

target_score = 20

def roll():
    return random.randint(1, 6)

if __name__ == '__main__':
    score = 0
    num_rolls = 0
    while score < target_score:
        die_roll = roll()
        num_rolls += 1
        print('Rolled: {0}'.format(die_roll))
        score += die_roll
    print('Score of {0} reached in {1} rolls'.format(score, num_rolls))
Rolled: 6
Rolled: 2
Rolled: 5
Rolled: 1
Rolled: 3
Rolled: 4
Score of 21 reached in 6 rolls
If you run the program several times, you’ll notice that the number of
rolls it takes to reach 20 varies.
from sympy import FiniteSet

def find_prob(target_score, max_rolls):
    die_sides = FiniteSet(1, 2, 3, 4, 5, 6)
    # Sample space
    s = die_sides**max_rolls
    # Find the event set
    if max_rolls > 1:
        success_rolls = []
        for elem in s:
            if sum(elem) >= target_score:
                success_rolls.append(elem)
    else:
        if target_score > 6:
            success_rolls = []
        else:
            success_rolls = []
            for roll in die_sides:
                if roll >= target_score:
                    success_rolls.append(roll)
    e = FiniteSet(*success_rolls)
    # Calculate the probability of reaching target score
    return len(e)/len(s)

if __name__ == '__main__':
    target_score = int(input('Enter the target score: '))
    max_rolls = int(input('Enter the maximum number of rolls allowed: '))
    p = find_prob(target_score, max_rolls)
    print('Probability: {0:.5f}'.format(p))
When you run this program, it asks for the target score and the maximum
number of allowed rolls as input, and then it prints out the probability
of achieving that.
Here are two sample executions:
Figure 5-1: A number line with a length of 1 divided into two equal intervals
corresponding to the probability of heads or tails on a coin toss
We’ll refer to this line as the probability number line, with each division
representing an equally possible outcome—for example, heads or tails
upon a fair coin toss. Now, in Figure 5-2, consider a different version of
this number line.
Figure 5-2: A number line with a length of 1 divided into two unequal intervals
corresponding to the probability of heads or tails on a biased coin toss
Here, the division corresponding to heads is 2/3 of the total length and
the division corresponding to tails is 1/3. This represents the situation of
a coin that’s likely to turn up heads in 2/3 of tosses and tails only in 1/3 of
tosses. The following Python function will simulate such a coin toss, consid-
ering this unequal probability of heads or tails appearing:
import random

def toss():
    # 0 -> Heads, 1 -> Tails
    if random.random() < 2/3:
        return 0
    else:
        return 1
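One way to convince yourself that this function behaves as intended is to call it many times and count the heads. The sketch below repeats the function so that it is self-contained; the seed value and the number of tosses are arbitrary choices made for illustration:

```python
import random

def toss():
    # 0 -> Heads, 1 -> Tails
    if random.random() < 2/3:
        return 0
    else:
        return 1

# A fixed seed makes the run repeatable
random.seed(0)

num_tosses = 10000
heads = sum(1 for _ in range(num_tosses) if toss() == 0)
fraction_heads = heads/num_tosses
print('Fraction of heads: {0:.3f}'.format(fraction_heads))
```

The fraction should come out close to 2/3, and it tends to get closer as the number of tosses grows.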
Figure 5-3: A number line with a length of 1 divided into four intervals of different
lengths corresponding to the probability of dispensing bills of different denominations
Here, the probability of a $5 bill or $10 bill being dispensed is 1/6, and
the probability of a $20 bill or $50 bill being dispensed is 1/3.
We create a list to store the running sum of the probabilities, and then
we generate a random number between 0 and 1. We start from the left end
of the list that stores the sum and return the first index of this list for which
the corresponding sum is greater than or equal to the random number
generated. The get_index() function implements this idea:
'''
Simulate a fictional ATM that dispenses dollar bills
of various denominations with varying probability
'''
import random

def get_index(probability):
    c_probability = 0
    sum_probability = []
    for p in probability:
        c_probability += p
        sum_probability.append(c_probability)
    r = random.random()
    for index, sp in enumerate(sum_probability):
        if r <= sp:
            return index
    return len(probability)-1

def dispense():
    dollar_bills = [5, 10, 20, 50]
    probability = [1/6, 1/6, 1/3, 1/3]
    bill_index = get_index(probability)
    return dollar_bills[bill_index]
Programming Challenges
Next, you have a few programming challenges to solve that’ll give you the
opportunity to apply what you’ve learned in this chapter.
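'''
Draw a Venn diagram for two sets
'''
from matplotlib_venn import venn2
import matplotlib.pyplot as plt
from sympy import FiniteSet

def draw_venn(sets):
    venn2(subsets=sets)
    plt.show()

if __name__ == '__main__':
    # Odd numbers and prime numbers below 20
    s1 = FiniteSet(1, 3, 5, 7, 9, 11, 13, 15, 17, 19)
    s2 = FiniteSet(2, 3, 5, 7, 11, 13, 17, 19)
    draw_venn([s1, s2])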
Once we import all the required modules and functions (the venn2()
function, matplotlib.pyplot, and the FiniteSet class), all we have to do is
create the two sets and then call the venn2() function, using the subsets
keyword argument to specify the two sets (passed here as a list).
Figure 5-4 shows the Venn diagram created by the preceding program.
The sets A and B share seven common elements, so 7 is written in the
common area. Each of the sets also has unique elements, so the number of
unique elements—3 and 1, respectively—is written in the individual areas.
The labels below the two sets are shown as A and B. You can specify your
own labels using the set_labels keyword argument:
Figure 5-4: Venn diagram showing the relationship between two sets, A and B
StudentID,Football,Others
1,1,0
2,1,1
3,0,1
--snip--
Create 20 such rows for the 20 students in your class. The first column
is the student ID (the survey isn’t anonymous), the second column has a
1 if the student has marked “football” as the sport they love to play, and
the third column has a 1 if the student plays any other sport or none at all.
Write a program to create a Venn diagram to depict the summarized results
of the survey, as shown in Figure 5-5.
Figure 5-5: A Venn diagram showing the number of students who love to play football
and the number who love to play other sports
Depending on the data in the sports.csv file you created, the numbers in
each set will vary. The following function reads a CSV file and returns two
lists corresponding to the IDs of those students who play football and other
sports:
import csv

def read_csv(filename):
    football = []
    others = []
    with open(filename) as f:
        reader = csv.reader(f)
        # Skip the header row
        next(reader)
        for row in reader:
            if row[1] == '1':
                football.append(row[0])
            if row[2] == '1':
                others.append(row[0])
    return football, others
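The two lists map directly onto the three regions of a two-set Venn diagram. The sketch below shows one way to compute the region sizes. It reads the three sample rows shown earlier from an in-memory string instead of a file, and the helper names read_rows and region_sizes are assumptions made for illustration.

```python
import csv
import io

SAMPLE = """StudentID,Football,Others
1,1,0
2,1,1
3,0,1
"""

def read_rows(f):
    # Same logic as read_csv(), but reads from an already-open file object
    football, others = [], []
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for row in reader:
        if row[1] == '1':
            football.append(row[0])
        if row[2] == '1':
            others.append(row[0])
    return football, others

def region_sizes(football, others):
    # Sizes of the three regions: football only, both, others only
    a, b = set(football), set(others)
    return len(a - b), len(a & b), len(b - a)

if __name__ == '__main__':
    football, others = read_rows(io.StringIO(SAMPLE))
    print(region_sizes(football, others))
```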
[3, 9, 21, 50, 32, 4, 20, 52, 7, 13, 41, 25, 49, 36, 23, 45, 1, 22, 40, 19, 2,
35, 28, 30, 39, 44, 29, 38, 48, 16, 15, 18, 46, 31, 14, 33, 10, 6, 24, 5, 43,
47, 11, 34, 37, 27, 8, 17, 51, 12, 42, 26]
The random module in Python’s standard library has a function,
shuffle(), for this exact operation:
Create a list, x, consisting of the numbers [1, 2, 3, 4]. Then, call the
shuffle() function, passing this list as an argument. You’ll see that the
numbers in x have been shuffled. Note that the list is shuffled “in place.”
That is, the original order is lost.
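A minimal sketch of the steps just described follows. The second half, which shuffles a copy so that the original order survives, is an added illustration; the names deck and shuffled are assumptions.

```python
import random

x = [1, 2, 3, 4]
random.shuffle(x)   # shuffles x in place and returns None
print(x)            # the same four numbers, in a random order

# If the original order is still needed, shuffle a copy instead
deck = list(range(1, 53))
shuffled = deck[:]
random.shuffle(shuffled)
print(shuffled)
```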
But what if you wanted to use this program in a card game? There,
it’s not enough to simply output the shuffled list of integers. You’ll also
need a way to map back the integers to the specific suit and rank of each
card. One way you might do this is to create a Python class to represent a
single card:
class Card:
    def __init__(self, suit, rank):
        self.suit = suit
        self.rank = rank
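One possible way to put this class to work is sketched below: build a list of 52 Card objects, shuffle it, and print each card. The function names initialize_deck and shuffle_and_print and the spelling of the suit and rank names are assumptions made for this sketch.

```python
import random

class Card:
    def __init__(self, suit, rank):
        self.suit = suit
        self.rank = rank

def initialize_deck():
    # Assumed suit and rank names
    suits = ['clubs', 'diamonds', 'hearts', 'spades']
    ranks = ['ace', '2', '3', '4', '5', '6', '7', '8', '9', '10',
             'jack', 'queen', 'king']
    return [Card(suit, rank) for suit in suits for rank in ranks]

def shuffle_and_print(cards):
    # Shuffle the Card objects in place, then print one card per line
    random.shuffle(cards)
    for card in cards:
        print('{0} of {1}'.format(card.rank, card.suit))

if __name__ == '__main__':
    shuffle_and_print(initialize_deck())
```

Shuffling the Card objects themselves avoids the extra step of mapping integers back to suits and ranks.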
10 of spades
6 of clubs
jack of spades
9 of spades
then the value of f × A, where A is the area of the square, would roughly be
equal to the area of the circle (see Figure 5-6). The darts are represented
by the small circular dots in the figure. We shall refer to the value of f × A
as the estimated area. The actual area is, of course, πr².
As part of this challenge, write a program that will find the estimated
area of a circle, given any radius, using this approach. The program should
print the estimated area of the circle for three different values of the number
of darts: 10^3, 10^5, and 10^6. That’s a lot of darts! You’ll see that increasing
the number of darts brings the estimated area close to the actual area.
Here’s a sample output of the completed solution:
Radius: 2
Area: 12.566370614359172, Estimated (1000 darts): 12.576
Area: 12.566370614359172, Estimated (100000 darts): 12.58176
Area: 12.566370614359172, Estimated (1000000 darts): 12.560128
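One way to approach this challenge is sketched below, assuming the darts land uniformly at random on a square of side 2r centered on the circle. The function name estimate_area is an assumption, and the estimates will differ from run to run.

```python
import math
import random

def estimate_area(radius, darts):
    # Throw darts uniformly at the square of side 2r centered on the origin
    inside = 0
    for _ in range(darts):
        x = random.uniform(-radius, radius)
        y = random.uniform(-radius, radius)
        if x*x + y*y <= radius*radius:
            inside += 1
    f = inside / darts            # fraction of darts landing in the circle
    square_area = (2*radius)**2   # A, the area of the square
    return f * square_area

if __name__ == '__main__':
    r = 2
    print('Radius: {0}'.format(r))
    for darts in (10**3, 10**5, 10**6):
        print('Area: {0}, Estimated ({1} darts): {2}'.format(
            math.pi*r*r, darts, estimate_area(r, darts)))
```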
Estimating the Value of Pi
Consider Figure 5-6 once again. The area of the square is 4r², and the
area of the inscribed circle is πr². If we divide the area of the circle by
the area of the square, we get π/4. The fraction f that we calculated earlier,
[Figure: a matplotlib plot with its Figure, Axes, and Line components labeled]
1. To learn more, see Chapter 11, “matplotlib,” by John Hunter and Michael Droettboom in
The Architecture of Open Source Applications, Volume II: Structure, Scale, and a Few More Fearless
Hacks (2008; edited by Amy Brown and Greg Wilson; http://www.aosabook.org/).
The following program re-creates this plot, but we’ll also explicitly cre-
ate the Figure object and add axes to it, instead of just calling the plot()
function and relying on it to create those:
Here, we create the Figure object using the figure() function at u, and
then we create the axes using the axes() function at v. The axes() function
also adds the axes to the Figure object. The last two lines are the same as in
the earlier program. This time, when we call the plot() function, it sees that
a Figure object with an Axes object already exists and directly proceeds to
plot the data supplied to it.
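The sequence described above can be sketched as follows. The sample data ([1, 2, 3] on both axes) and the function name create_plot are assumptions made for illustration.

```python
import matplotlib.pyplot as plt

def create_plot():
    x = [1, 2, 3]   # assumed sample data
    y = [1, 2, 3]
    fig = plt.figure()   # create the Figure object explicitly
    ax = plt.axes()      # create the Axes and add it to the current Figure
    plt.plot(x, y)       # plot() reuses the existing Figure and Axes
    return fig, ax

if __name__ == '__main__':
    create_plot()
    plt.show()
```

Because plot() finds an existing Figure and Axes, the line ends up on the Axes created by hand.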
Besides manually creating Figure and Axes objects, you can use two
different functions in the pyplot module to get a reference to the current
Figure and Axes objects. When you call the gcf() function, it returns a refer-
ence to the current Figure, and when you call the gca() function, it returns
a reference to the current Axes. An interesting feature of these functions
is that each will create the respective object if it doesn’t already exist. How
these functions work will become clearer as we make use of them later in
this chapter.
Drawing a Circle
To draw a circle, you can add the Circle patch to the current Axes object, as
demonstrated by the following example:
'''
Example of using matplotlib's Circle patch
'''
import matplotlib.pyplot as plt

def create_circle():
    circle = plt.Circle((0, 0), radius = 0.5)  # (u)
    return circle

def show_shape(patch):
    ax = plt.gca()  # (v)
    ax.add_patch(patch)
    plt.axis('scaled')
    plt.show()

if __name__ == '__main__':
    c = create_circle()  # (w)
    show_shape(c)
The circle doesn’t quite look like a circle here, as you can see. This is due
to the automatic aspect ratio, which determines the ratio of the length of the
x- and y-axes. If you insert the statement ax.set_aspect('equal') after the
ax = plt.gca() statement at v, you will see that the circle does indeed look
like a circle. The set_aspect() function is used to set the aspect ratio of the
graph; using the equal argument, we ask matplotlib to set the ratio of the
length of the x- and y-axes to 1:1.
Both the edge color and the face color (fill color) of the patch can be
changed using the ec and fc keyword arguments. For example, passing
fc='g' and ec='r' will create a circle with a green face color and red edge
color.
Matplotlib supports a number of other patches, such as Ellipse, Polygon,
and Rectangle.
'''
A growing circle
'''
from matplotlib import pyplot as plt
from matplotlib import animation

def create_circle():
    circle = plt.Circle((0, 0), 0.05)
    return circle

def create_animation():
    fig = plt.gcf()  # (u)
    ax = plt.axes(xlim=(-10, 10), ylim=(-10, 10))
    ax.set_aspect('equal')
    circle = create_circle()
    ax.add_patch(circle)  # (v)
    # update_radius is the per-frame callback that changes the circle's radius
    anim = animation.FuncAnimation(  # (w)
        fig, update_radius, fargs = (circle,), frames=30, interval=50)
    plt.title('Simple Circle Animation')
    plt.show()

if __name__ == '__main__':
    create_animation()
Figure 6-3: Simple circle animation
You probably noted in the animated circle program that we assigned the cre-
ated FuncAnimation object to the label anim even though we don’t use it again
elsewhere. This is because of an issue with matplotlib’s current behavior—it
doesn’t store any reference to the FuncAnimation object, making it subject to
garbage collection by Python. This means the animation will not be created.
Creating a label referring to the object prevents this from happening.
For more on this issue, you may want to follow the discussions at https://
github.com/matplotlib/matplotlib/issues/1656/.
'''
Animate the trajectory of an object in projectile motion
'''
from matplotlib import pyplot as plt
from matplotlib import animation
import math

g = 9.8

# Time instants at which the position is computed (function name assumed)
def get_intervals(u, theta):
    t_flight = 2*u*math.sin(theta)/g
    intervals = []
    start = 0
    interval = 0.005
    while start < t_flight:
        intervals.append(start)
        start = start + interval
    return intervals

def update_position(i, circle, intervals, u, theta):
    t = intervals[i]
    x = u*math.cos(theta)*t
    y = u*math.sin(theta)*t - 0.5*g*t*t
    circle.center = x, y
    return circle,

def create_animation(u, theta):
    intervals = get_intervals(u, theta)
    xmin = 0
    xmax = u*math.cos(theta)*intervals[-1]
    ymin = 0
    t_max = u*math.sin(theta)/g
    ymax = u*math.sin(theta)*t_max - 0.5*g*t_max**2  # (u)
    fig = plt.gcf()
    ax = plt.axes(xlim=(xmin, xmax), ylim=(ymin, ymax))  # (v)
    # The ball: a circle of radius 1.0 placed at (xmin, ymin)
    circle = plt.Circle((xmin, ymin), 1.0)
    ax.add_patch(circle)
    anim = animation.FuncAnimation(fig, update_position,  # (w)
                                   fargs=(circle, intervals, u, theta),
                                   frames=len(intervals), interval=1,
                                   repeat=False)
    plt.title('Projectile Motion')
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.show()

if __name__ == '__main__':
    try:
        u = float(input('Enter the initial velocity (m/s): '))
        theta = float(input('Enter the angle of projection (degrees): '))
    except ValueError:
        print('You entered an invalid input')
    else:
        theta = math.radians(theta)
        create_animation(u, theta)
Once we have the values, we create the axes at v, passing the appropri-
ate axis limits. In the next two statements, we create a representation of the
ball and add it to the figure’s Axes object by creating a circle of radius 1.0 at
(xmin, ymin)—the minimum coordinates of the x- and y-axes, respectively.
We then create the FuncAnimation object w, supplying it with the current
figure object and the following arguments:
Drawing Fractals
Fractals are complex geometric patterns or shapes arising out of surpris-
ingly simple mathematical formulas. Compared to geometric shapes, such
as circles and rectangles, a fractal seems irregular and without any obvious
pattern or description, but if you look closely, you see that patterns emerge
and the entire shape is composed of numerous copies of itself. Because
fractals involve the repetitive application of the same geometric transformation
of points in a plane, computer programs are well-suited to create them.
In this chapter, we’ll learn how to draw the Barnsley fern, the Sierpiński
triangle, and the Mandelbrot set (the latter two in the challenges)—
popular examples of fractals studied in the field. Fractals abound in
nature, too—popular examples include coastlines, trees, and snowflakes.
transformation, a new point, Q, which is one unit above and one unit to
the right of P, is created. If you then consider Q as the starting point, you’ll
get another point, R, that’s one unit above and one unit to the right of Q.
Consider the starting point, P, to be (1, 1). Figure 6-5 shows what the points
would look like.
Figure 6-5: The points Q and R have been obtained by applying a transformation to
the point P for two iterations.
. . . and so on.
The transformation rule is picked at random, with each rule having
an equal probability of being selected. No matter which one is picked, the
points will advance toward the right because we increase the x-coordinate
in both cases. As the points go to the right, they move either up or down,
thus creating a zigzag path. The following program charts out the path of a
point when subjected to one of these transformations for a specified num-
ber of iterations:
'''
Example of selecting a transformation from two equally probable
transformations
'''
import matplotlib.pyplot as plt
import random

def transformation_1(p):
    x = p[0]
    y = p[1]
    return x + 1, y - 1

def transformation_2(p):
    x = p[0]
    y = p[1]
    return x + 1, y + 1

def transform(p):
    # List of transformation functions
    transformations = [transformation_1, transformation_2]  # (u)
    # Pick a random transformation function and call it
    t = random.choice(transformations)  # (v)
    x, y = t(p)  # (w)
    return x, y

def build_trajectory(p, n):
    # Collect the coordinates of the starting point and of each point
    # produced by n successive random transformations
    x = [p[0]]
    y = [p[1]]
    for i in range(n):
        p = transform(p)
        x.append(p[0])
        y.append(p[1])
    return x, y

if __name__ == '__main__':
    # Initial point
    p = (1, 1)
    n = int(input('Enter the number of iterations: '))
    x, y = build_trajectory(p, n)  # (x)
    # Plot
    plt.plot(x, y)  # (y)
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.show()
The random.choice() function we saw in our first fractal program can be used
to select a random element from a list. Each element has an equal chance of
being returned. Here’s an example:
The function also works with tuples and strings. In the latter case, it returns
a random character from the string.
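A short sketch of random.choice() on a list, a tuple, and a string follows. The tally at the end is an added illustration of the equal-chance claim; the sample values are arbitrary.

```python
import random

l = [1, 2, 3]
print(random.choice(l))            # one of 1, 2, or 3
print(random.choice((4, 5, 6)))    # works with tuples
print(random.choice('python'))     # a random character from the string

# Each element is equally likely, so over many draws the counts even out
counts = {n: 0 for n in l}
for _ in range(30000):
    counts[random.choice(l)] += 1
print(counts)
```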
Figure 6-6: The zigzag path traced by the point (1, 1) when subjected to one or the
other of the two transformations randomly for 100 iterations
Figure 6-7: The zigzag path traced by the point (1, 1) when subjected to one or the
other of the two transformations randomly for 10,000 iterations.
Transformation 4 (0.01 probability):
x_{n+1} = 0
y_{n+1} = 0.16 y_n
'''
Draw a Barnsley Fern
'''
import random
import matplotlib.pyplot as plt

def transformation_1(p):
    x = p[0]
    y = p[1]
    x1 = 0.85*x + 0.04*y
    y1 = -0.04*x + 0.85*y + 1.6
    return x1, y1

def transformation_2(p):
    x = p[0]
    y = p[1]
    x1 = 0.2*x - 0.26*y
    y1 = 0.23*x + 0.22*y + 1.6
    return x1, y1

def transformation_3(p):
    x = p[0]
    y = p[1]
    x1 = -0.15*x + 0.28*y
    y1 = 0.26*x + 0.24*y + 0.44
    return x1, y1

def transformation_4(p):
    x = p[0]
    y = p[1]
    x1 = 0
    y1 = 0.16*y
    return x1, y1

def transform(p):
    # List of transformation functions
    transformations = [transformation_1, transformation_2,
                       transformation_3, transformation_4]
    probability = [0.85, 0.07, 0.07, 0.01]  # (u)
    # Pick a random transformation function and call it
    # (get_index() is the function defined earlier)
    tindex = get_index(probability)
    t = transformations[tindex]  # (v)
    x, y = t(p)
    return x, y

def draw_fern(n):
    # We start with (0, 0)
    x = [0]
    y = [0]
    x1, y1 = 0, 0
    for i in range(n):
        x1, y1 = transform((x1, y1))
        x.append(x1)
        y.append(y1)
    return x, y

if __name__ == '__main__':
    n = int(input('Enter the number of points in the Fern: '))
    x, y = draw_fern(n)
    # Plot the points
    plt.plot(x, y, 'o')
    plt.title('Fern with {0} points'.format(n))
    plt.show()
When you run this program, it asks for the number of points in the fern
to be specified and then creates the fern. Figures 6-9 and 6-10 show ferns
with 1,000 and 10,000 points, respectively.
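For comparison, here is a compact variant that generates the same kind of point list without plotting. It swaps get_index() for the standard library's random.choices() with weights, which is a substitution made for this sketch and not the approach used in the program above. The coefficient table is copied from the four transformation functions; the names TRANSFORMATIONS, PROBABILITY, and fern_points are assumptions.

```python
import random

# Coefficients (a, b, c, d, e, f) for x1 = a*x + b*y + e, y1 = c*x + d*y + f
TRANSFORMATIONS = [
    (0.85, 0.04, -0.04, 0.85, 0.0, 1.6),
    (0.2, -0.26, 0.23, 0.22, 0.0, 1.6),
    (-0.15, 0.28, 0.26, 0.24, 0.0, 0.44),
    (0.0, 0.0, 0.0, 0.16, 0.0, 0.0),
]
PROBABILITY = [0.85, 0.07, 0.07, 0.01]

def fern_points(n):
    # Start at (0, 0) and apply n randomly chosen transformations
    x, y = 0.0, 0.0
    points = [(x, y)]
    for _ in range(n):
        a, b, c, d, e, f = random.choices(TRANSFORMATIONS,
                                          weights=PROBABILITY)[0]
        x, y = a*x + b*y + e, c*x + d*y + f
        points.append((x, y))
    return points

if __name__ == '__main__':
    pts = fern_points(1000)
    print(len(pts), pts[-1])
```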
Figure 6-9: A fern with 1,000 points
Programming Challenges
Here are a few programming challenges that should help you further apply
what you’ve learned. You can find sample solutions at
http://www.nostarch.com/doingmathwithpython/.
'''
Draw a square
'''
from matplotlib import pyplot as plt

def draw_square():
    ax = plt.axes(xlim = (0, 6), ylim = (0, 6))
    square = plt.Polygon([(1, 1), (5, 1), (5, 5), (1, 5)], closed = True)
    ax.add_patch(square)
    plt.show()

if __name__ == '__main__':
    draw_square()
The Polygon object is created by passing the list of the vertices’ coordi-
nates as the first argument. Because we’re drawing a square, we pass the
coordinates of the four vertices: (1, 1), (5, 1), (5, 5), and (1, 5). Passing
closed=True tells matplotlib that we want to draw a closed polygon, where
the starting and the ending vertices are the same.
In this challenge, you’ll attempt a very simplified version of the “circles
packed into a square” problem. How many circles of radius 0.5 will fit in the
square produced by this code? Draw and find out! Figure 6-11 shows how
the final image will look.
The trick here is to start from the lower-left corner of the square—
that is, (1, 1)—and then continue adding circles until the entire square is
filled. The following snippet shows how you can create the circles and add
them to the figure:
y = 1.5
while y < 5:
    x = 1.5
    while x < 5:
        c = draw_circle(x, y)
        ax.add_patch(c)
        x += 1.0
    y += 1.0
A point worth noting here is that this is neither the optimal nor the only
way to pack circles into a square, and finding different ways
of solving this problem is popular among mathematics enthusiasts.
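To count the circles without drawing them, the same nested loop can collect the centers instead of adding patches. This is a sketch under the same grid layout as the snippet above; the function name circle_centers and its parameters are assumptions.

```python
def circle_centers(low=1, high=5, radius=0.5):
    # Centers of circles laid out on a grid inside the square
    # whose corners are (low, low) and (high, high)
    centers = []
    y = low + radius
    while y < high:
        x = low + radius
        while x < high:
            centers.append((x, y))
            x += 2*radius
        y += 2*radius
    return centers

if __name__ == '__main__':
    centers = circle_centers()
    print(len(centers))   # 16 for the square from (1, 1) to (5, 5)
```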
The interesting thing here is that the same process that we used to
draw a fern will also draw the Sierpiński triangle; only the transformation
rules and their probability will change. Here’s how you can draw the
Sierpiński triangle: start with the point (0, 0) and apply one of the following
transformations:
Transformation 1:
x_{n+1} = 0.5 x_n
y_{n+1} = 0.5 y_n
Transformation 2:
x_{n+1} = 0.5 x_n + 0.5
Transformation 3:
x_{n+1} = 0.5 x_n + 1
y_{n+1} = 0.5 y_n
Figure 6-14: Mandelbrot set in the plane between (−2.5, −1.0) and (1.0, 1.0)
points with a shade of gray—that is, some of these points will be black,
some will be white, and others will be colored with a shade in between,
randomly chosen. Figure 6-15 illustrates the scenario.
Figure 6-15: Part of the x-y plane with x and y both ranging from 0 to 5. We’ve
considered 36 points in the region equidistant from each other and colored each
with a shade of gray.
To create this figure, we have to make a list of six lists. Each of these six
lists will in turn consist of six integers ranging from 0 to 10. Each number
will correspond to the color for each point, 0 standing for black and 10
standing for white. We’ll then pass this list to the imshow() function along
with other necessary arguments.
>>> l1 = [1, 2, 3]
>>> l2 = [4, 5, 6]
>>> l = [l1, l2]
Here, we created a list, l, consisting of two lists, l1 and l2. The first ele-
ment of the list, l[0], is thus the same as the l1 list and the second element
of the list, l[1], is the same as the l2 list:
>>> l[0]
[1, 2, 3]
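A list of lists like this is all that is needed to hold the gray levels for a grid of points. The sketch below builds such a grid and fills it with random integers from 0 to 10. The body of initialize_image() shown here is an assumption about how such a helper might be written; random_gray_image is a hypothetical name.

```python
import random

def initialize_image(x_p, y_p):
    # Build y_p rows, each a separate list of x_p zeros
    return [[0 for _ in range(x_p)] for _ in range(y_p)]

def random_gray_image(x_p, y_p):
    image = initialize_image(x_p, y_p)
    for i in range(y_p):
        for j in range(x_p):
            image[i][j] = random.randint(0, 10)   # 0 = black, 10 = white
    return image

if __name__ == '__main__':
    for row in random_gray_image(6, 6):
        print(row)
```

Each row must be its own list object; otherwise assigning to one row would change every row.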
import random
import matplotlib.pyplot as plt
import matplotlib.cm as cm

def color_points():
    x_p = 6
    y_p = 6
    image = initialize_image(x_p, y_p)
    for i in range(y_p):
        for j in range(x_p):
            image[i][j] = random.randint(0, 10)  # (v)
    plt.imshow(image, origin='lower', extent=(0, 5, 0, 5),  # (w)
               cmap=cm.Greys_r, interpolation='nearest')
    plt.colorbar()
    plt.show()

if __name__ == '__main__':
    color_points()
Then, call the imshow() function at w, passing image as the first argu-
ment. The keyword argument origin='lower' specifies that the number
in image[0][0] corresponds to the color of the point (0, 0). The keyword
argument extent=(0, 5, 0, 5) sets the lower-left and upper-right corners
of the image to (0, 0) and (5, 5), respectively. The keyword argument
cmap=cm.Greys_r specifies that we’re going to create a grayscale image.
The last keyword argument, interpolation='nearest', specifies that
matplotlib should color a point for which the color wasn’t specified with
the same color as the one nearest to it. What does this mean? Note that we
consider and specify the color for only 36 points in the region between (0, 0) and
(5, 5). Because there is an infinite number of points in this region, we tell
matplotlib to set the color of an unspecified point to that of its nearest point.
This is the reason you see color “boxes” around each point in the figure.
Call the colorbar() function to display a color bar in the figure showing
which integer corresponds to which color. Finally, call show() to display the
image. Note that due to the use of the random.randint() function, your image
will be colored differently than the one in Figure 6-15.
If you increase the number of points along each axis by setting x_p and
y_p to, let’s say, 20 in color_points(), you’ll see a figure similar to the one
shown in Figure 6-16. Note that the color boxes grow smaller in size. If you
increase the number of points even more, you’ll see the size of the boxes
shrink further, giving the illusion that each point has a different color.
Figure 6-16: Part of the x-y plane with x and y both ranging from 0 to 5. We’ve con-
sidered 400 points in the region equidistant from each other and colored each with a
shade of gray.
Once you have the complete image list, call the imshow() function with
the extent keyword argument changed to indicate the region bounded by
(−2.5, −1.0) and (1.0, 1.0).
This algorithm is usually referred to as the escape-time algorithm. When
the maximum number of iterations is reached before a point’s magnitude
exceeds 2, that point belongs to the Mandelbrot set and is colored white.
The points that exceed the magnitude within fewer iterations are said to
“escape”; they don’t belong to the Mandelbrot set and are colored black.
You can experiment by decreasing and increasing the number of points
along each axis. Decreasing the number of points will lead to a grainy
image, while increasing them will result in a more detailed image.
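The escape test at the heart of this algorithm can be sketched in a few lines using Python's built-in complex numbers. The sketch assumes the usual iteration z → z² + c starting from z = 0; the function names and the default of 1000 iterations are assumptions.

```python
def escape_iterations(c, max_iterations=1000):
    # Iterate z -> z*z + c from z = 0 and report how many steps it takes
    # for |z| to exceed 2; return max_iterations if it never does
    z = complex(0, 0)
    for i in range(max_iterations):
        z = z*z + c
        if abs(z) > 2:
            return i
    return max_iterations

def in_mandelbrot(c, max_iterations=1000):
    # A point is treated as a member if it has not escaped
    return escape_iterations(c, max_iterations) == max_iterations

if __name__ == '__main__':
    print(in_mandelbrot(complex(0, 0)))    # True
    print(in_mandelbrot(complex(1, 0)))    # False
```

Running this test on every grid point and storing the result in the image list would give the values to pass to imshow().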
7
Solving Calculus Problems
Figure 7-1: A function describes a mapping between an
input set and an output set. Here, an element of the output
set is the square of an element from the input set.
Using the familiar function notation, we’d write this function as f(x) = x²,
where x is the independent variable quantity. So f(2) = 4, f(100) = 10000,
and so on. We refer to x as the independent variable quantity because we’re
free to assume a value for it as long as that value is within its domain (see
the next section).
Functions can also be defined in terms of multiple variables. For
example, f(x, y) = x² + y² defines a function of two variables, x and y.
Note The domain and range of a function can certainly be different. For example, for the
function x², the domain is all real numbers, but the range is only
the nonnegative numbers.
functions sine and cosine. Other trigonometric functions—tan() and the
inverse equivalents of these functions, asin(), acos(), and atan()—are also
defined.
The math module also includes functions that find the logarithm of a
number—the natural logarithm function log(), the base-2 logarithm log2(),
and the base-10 logarithm log10()—as well as the function exp(), which
finds the value of e^x, where e is Euler’s number (approximately 2.71828).
One drawback of all these functions is that they’re not suitable for
working with symbolic expressions. If we want to manipulate a mathemati-
cal expression involving symbols, we have to start using the equivalent func-
tions defined by SymPy.
Let’s see a quick example:
Here, we find the sine of the angle π/2 using the sin() function
defined by the standard library’s math module. Then, we can do the same
using SymPy.
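The difference between the two sin() functions can be sketched as follows. The symbol name theta is an arbitrary choice made for this sketch.

```python
import math
import sympy

# Numeric: the standard library works on floats
print(math.sin(math.pi/2))          # 1.0

# SymPy's sin() accepts the same numeric input
print(sympy.sin(math.pi/2))

# Symbolic: SymPy can keep the expression unevaluated ...
theta = sympy.Symbol('theta')
expr = sympy.sin(theta) + sympy.sin(theta)
print(expr)                          # 2*sin(theta)

# ... and substitute a value later
print(expr.subs({theta: sympy.pi/2}))   # 2
```

Passing the symbol theta to math.sin() would fail, because the standard library function needs a number.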
Assumptions in SymPy
In all our programs, we’ve created a Symbol object in SymPy, defining the
variable like so: x = Symbol('x'). Assume that as a result of an operation you
asked SymPy to perform, SymPy needs to check whether the expression
x + 5 is greater than 0. Let’s see what would happen:
So if we create a Symbol object specifying positive=True, we tell SymPy
to assume only positive values. Now it knows for sure that x + 5 is definitely
greater than 0:
Do Something
Note that if we’d instead specified negative=True, we could get the same
error as in the first case. Just as we can declare a symbol as positive or
negative, it’s also possible to specify it as real, integer, complex, imaginary,
and so on. These declarations are referred to as assumptions in SymPy.
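The effect of these assumptions can be sketched with a small helper that tries to use the comparison as an ordinary condition. The helper name describe is hypothetical; it returns None when SymPy raises a TypeError because it cannot decide.

```python
from sympy import Symbol

def describe(expr):
    # Try to use the comparison as a plain True/False condition
    try:
        return bool(expr > 0)
    except TypeError:
        return None   # SymPy cannot decide without more information

x = Symbol('x')
print(describe(x + 5))        # None: x could be any value

x = Symbol('x', positive=True)
print(describe(x + 5))        # True

x = Symbol('x', negative=True)
print(describe(x + 5))        # None: x + 5 could have either sign
```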
Figure 7-2: A graph showing the function 1/x as the value of x increases
By default, the limit is found from a positive direction, unless the value
at which the limit is to be calculated is positive or negative infinity. In the
case of positive infinity, the direction is negative, and vice versa. You can
change the default direction as follows:
Here, we calculate
$\lim_{x \to 0^{-}} \frac{1}{x}$,
and as we approach 0 for x from the negative side, the value of the limit
approaches negative infinity. On the other hand, if we approach 0 from
the positive side, the value approaches positive infinity:
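Both one-sided limits, along with the limit at infinity shown in Figure 7-2, can be sketched with the Limit class and its dir keyword argument:

```python
from sympy import Limit, Symbol, S

x = Symbol('x')

# Approach 0 from the negative side, then from the positive side
print(Limit(1/x, x, 0, dir='-').doit())   # -oo
print(Limit(1/x, x, 0, dir='+').doit())   # oo

# As x grows without bound, 1/x approaches 0
print(Limit(1/x, x, S.Infinity).doit())   # 0
```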
automatically:
You have very likely used l’Hôpital’s rule to find such limits, but as we
see here, the Limit class takes care of this for us.
For any principal amount p, any rate r, and any number of years t, the
compound interest is calculated using the formula
We will now evaluate the above limit. First, let’s create the various
expression objects:
>>> t1 = Symbol('t1')
>>> delta_t = Symbol('delta_t')
Now, let’s evaluate the limit:
The limit turns out to be 10*t1 + 2, and it’s the rate of change of S(t)
at time t1, or the instantaneous rate of change. This change is more com-
monly referred to as the instantaneous speed of the car at the time instant t1.
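The whole calculation can be sketched end to end as follows, using the distance function S(t) = 5t² + 2t + 8 from this section. The labels St1 and St1_delta are assumed names for the intermediate expressions.

```python
from sympy import Limit, Symbol

t = Symbol('t')
St = 5*t**2 + 2*t + 8

t1 = Symbol('t1')
delta_t = Symbol('delta_t')

# S(t1) and S(t1 + delta_t)
St1 = St.subs({t: t1})
St1_delta = St.subs({t: t1 + delta_t})

# Rate of change as the time interval shrinks to zero
rate = Limit((St1_delta - St1)/delta_t, delta_t, 0).doit()
print(rate)   # 10*t1 + 2
```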
The limit we calculated here is referred to as the derivative of a func-
tion, and we can calculate it directly using SymPy’s Derivative class.
>>> t = Symbol('t')
>>> St = 5*t**2 + 2*t + 8
>>> Derivative(St, t)  # (v)
Derivative(5*t**2 + 2*t + 8, t)
>>> d = Derivative(St, t)
>>> d.doit()
10*t + 2
>>> d.doit().subs({t:t1})
10*t1 + 2
>>> d.doit().subs({t:1})
12
You may consider this function the product of two independent func-
tions, which means that, by hand, we’d need to make use of the product rule
of differentiation to find the derivative. But we don’t need to worry about
that here because we can just create an object of the Derivative class to
do that for us.
Try out some other complicated expressions, such as expressions involv-
ing trigonometric functions.
A Derivative Calculator
Now let’s write a derivative calculator program, which will take a function
as input and then print the result of differentiating it with respect to the
variable specified:
'''
Derivative calculator
'''
if __name__=='__main__':
enters an invalid input. If the input expression is a valid expression, we call
the derivative function at w, passing the converted expression and the vari-
able with respect to which the function is to be differentiated as arguments.
In the derivative() function, we first create a Symbol object that cor-
responds to the variable with respect to which the function is to be dif-
ferentiated. We use the label var to refer to this variable. Next, we create
a Derivative object that passes both the function to differentiate and the
symbol object var. We immediately call the doit() method to evaluate the
derivative, and we then use the pprint() function to print the result so that
it appears close to its mathematical counterpart. A sample execution of the
program follows:
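The structure just described can be sketched as follows. To keep the sketch non-interactive, it takes the expression as a string argument instead of prompting for it; the wrapper name derivative_from_text and the sample inputs are assumptions.

```python
from sympy import Symbol, Derivative, sympify, pprint
from sympy.core.sympify import SympifyError

def derivative(f, var):
    # Differentiate f with respect to the variable named by var
    var = Symbol(var)
    d = Derivative(f, var).doit()
    pprint(d)
    return d

def derivative_from_text(text, var):
    # Convert the text to a SymPy expression, rejecting invalid input
    try:
        f = sympify(text)
    except SympifyError:
        print('Invalid input')
        return None
    return derivative(f, var)

if __name__ == '__main__':
    derivative_from_text('2*x**2 + 3*x + 1', 'x')
    derivative_from_text('2*x**', 'x')
```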
Note A key assumption I’ve made in this chapter is that all the functions we’re calculating
the derivative of are differentiable in their respective domains.
[Figure: graph of the function on the interval [−5, 5], with the points A, B, C, and D labeled as maxima and minima]
From the graph, we can see that the function attains its minimum
value on the interval −2 ≤ x ≤ 0 at the point B. Similarly, it attains its maxi-
mum value on the interval 0 ≤ x ≤ 2 at the point C. On the other hand, the
function attains its maximum and minimum values on the entire domain
of x that we’ve considered here at the points A and D, respectively. Thus,
when we consider the function on the whole interval [−5, 5], the points B
and C are referred to as a local minimum and a local maximum, respectively,
while the points A and D are the global maximum and the global minimum,
respectively.
The term extremum (plural extrema) refers to the points where the func-
tion attains a local or global maximum or minimum. If x is an extremum of
the function f(x), then the first-order derivative of f at x, denoted f ′(x), must
vanish. This property shows that a good way to find possible extrema is to
try to solve the equation f ′(x) = 0. Such solutions are called critical points of
the function. Let’s try this out:
Now that we have calculated the first-order derivative, f ′(x), we’ll solve
f ′(x) = 0 to find the critical points:
>>> A = critical_points[2]
>>> B = critical_points[0]
>>> C = critical_points[1]
>>> D = critical_points[3]
Because all the critical points for this function lie within the considered
interval, they are all relevant for our search for the global maximum and
minimum of f(x). We may now apply the so-called second derivative test to
narrow down which critical points could be global maxima or minima.
First, we calculate the second-order derivative for the function f(x).
Note that to do so, we enter 2 as the third argument:
Now, we find the value of the second derivative by substituting the value
of each of the critical points one by one in place of x. If the resulting value
is less than 0, the point is a local maximum; if the value is greater than 0,
it’s a local minimum. If the resulting value is 0, then the test is inconclusive
and we cannot deduce anything about whether the critical point x is a local
minimum, maximum, or neither.
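The procedure can be sketched end to end as follows. The function used here, x⁵ − 30x³ + 50x, is an assumption: it was chosen because it is consistent with the values printed in this section (for example, f(−5) = 375 and f(5) = −375).

```python
from sympy import Symbol, Derivative, solve

x = Symbol('x')
# Assumed example function, consistent with the values shown in this section
f = x**5 - 30*x**3 + 50*x

d1 = Derivative(f, x).doit()      # first-order derivative
d2 = Derivative(f, x, 2).doit()   # second-order derivative
critical_points = solve(d1)       # solutions of f'(x) = 0

for cp in critical_points:
    curvature = d2.subs({x: cp}).evalf()
    if curvature < 0:
        kind = 'local maximum'
    elif curvature > 0:
        kind = 'local minimum'
    else:
        kind = 'inconclusive'
    print(cp.evalf(), kind)
```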
>>> d2.subs({x:B}).evalf()
127.661060789073
>>> d2.subs({x:C}).evalf()
-127.661060789073
>>> d2.subs({x:A}).evalf()
-703.493179468151
>>> d2.subs({x:D}).evalf()
703.493179468151
>>> x_min = -5
>>> x_max = 5
>>> f.subs({x:A}).evalf()
705.959460380365
>>> f.subs({x:C}).evalf()
25.0846626340294
>>> f.subs({x:x_min}).evalf()
375.000000000000
>>> f.subs({x:x_max}).evalf()
-375.000000000000
>>> f.subs({x:B}).evalf()
-25.0846626340294
>>> f.subs({x:D}).evalf()
-705.959460380365
>>> f.subs({x:x_min}).evalf()
375.000000000000
>>> f.subs({x:x_max}).evalf()
-375.000000000000
The point where f(x) has the smallest value must be the global mini-
mum for the function; this turns out to be point D.
This method for finding the extrema of a function—by considering
the function’s value at all of the critical points (after potentially discarding
some via the second derivative test) and boundary values—will always work
as long as the function is twice differentiable. That is, both the first and sec-
ond derivative must exist everywhere in the domain.
For a function such as e^x, there might not be any critical points in the
domain, but in this case the method works fine: it simply tells us that the
extrema occur at the domain boundary.
to calculate the time of flight for a body in projectile motion that’s thrown
with a velocity u at an angle θ. The range of a projectile, R, is the total hori-
zontal distance traveled by the projectile and is given by the product of
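$u_x \times t_{\text{flight}}$. Here, $u_x$ is the horizontal component of the initial velocity and
is equal to $u\cos\theta$. Substituting the formulas for $u_x$ and $t_{\text{flight}}$, we get the
expression
$R = u\cos\theta \times \frac{2u\sin\theta}{g} = \frac{u^2 \sin 2\theta}{g}.$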
$\theta_{\text{new}} = \theta_{\text{old}} + \lambda \frac{dR}{d\theta}$,
The value of epsilon (ε) determines when we decide to stop the iteration
of the algorithm. It is discussed in “The Role of the Step Size and Epsilon”
on page 197.
Figure 7-5: The gradient ascent method takes us iteratively toward the maximum point
of the function.
'''
Use gradient ascent to find the angle at which the projectile
has maximum range for a fixed velocity, 25 m/s
'''
import math
from sympy import Derivative, Symbol, sin
    return x_new
if __name__ == '__main__':
    g = 9.8
    # Assume initial velocity
    u = 25
    # Expression for range
    theta = Symbol('theta')
    R = u**2*sin(2*theta)/g  # (|)
We set the epsilon value to 1e-6 and the step size to 1e-4 at u and v,
respectively. The epsilon value must always be a very small positive value
close to 0, and the step size should be chosen such that the variable is incre-
mented in small amounts at every iteration of the algorithm. The choice of
the value of epsilon and step size is discussed in a bit more detail in “The
Role of the Step Size and Epsilon” on page 197.
We set x_old to x0 at w and calculate x_new for the first time at x. We use
the subs() method to substitute the value of x_old in place of the variable
and then use evalf() to calculate the numerical value. If the absolute dif-
ference abs(x_old – x_new) is greater than epsilon, the while loop at y keeps
executing, and we keep updating the value of x_old and x_new as per steps
1 and 2 of the gradient ascent algorithm. Once we’re out of the loop (that
is, once abs(x_old – x_new) <= epsilon), we return x_new, the variable value
corresponding to the maximum function value.
We begin to define the find_max_theta() function at z. In this function,
we calculate the first-order derivative of R; create a label, theta0, and set it to
1e-3; and call the grad_ascent() function with these two values as arguments,
as well as a third argument, the symbol object theta. Once we get the value
of θ corresponding to the maximum function value (theta_max), we return
it at {.
Finally, we create the expression representing the horizontal range
at |, having set the initial velocity, u = 25, and the theta Symbol object cor-
responding to the angle θ. Then we call the find_max_theta() function with R
and theta.
When you run this program, you should see the following output:
Theta: 44.99999978475661
Maximum Range: 63.7755102040816
The value of θ is printed in degrees and turns out to be close to
45 degrees, as expected. If you change the initial velocity to other values,
you’ll see that the angle of projection at which the maximum range is
reached is always close to 45 degrees.
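Putting the description together, the program might be sketched as follows. The argument order of grad_ascent() and the label R1theta are assumptions, and the exact digits printed depend on the epsilon and step size chosen; the angle should land close to 45 degrees.

```python
import math
from sympy import Derivative, Symbol, sin

def grad_ascent(x0, f1x, x):
    epsilon = 1e-6
    step_size = 1e-4
    x_old = x0
    x_new = x_old + step_size*f1x.subs({x: x_old}).evalf()
    # Keep stepping uphill until two successive values are within epsilon
    while abs(x_old - x_new) > epsilon:
        x_old = x_new
        x_new = x_old + step_size*f1x.subs({x: x_old}).evalf()
    return x_new

def find_max_theta(R, theta):
    # First-order derivative of the range with respect to theta
    R1theta = Derivative(R, theta).doit()
    theta0 = 1e-3
    return grad_ascent(theta0, R1theta, theta)

if __name__ == '__main__':
    g = 9.8
    u = 25   # assumed initial velocity
    theta = Symbol('theta')
    R = u**2*sin(2*theta)/g
    theta_max = find_max_theta(R, theta)
    print('Theta: {0}'.format(math.degrees(theta_max)))
    print('Maximum Range: {0}'.format(R.subs({theta: theta_max})))
```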
'''
Use gradient ascent to find the maximum value of a
single-variable function
'''
return x_new
if __name__ == '__main__':
The function grad_ascent() remains the same here. Now, however, the
program asks the user to input the function, the variable in the function,
and the initial value of the variable, where gradient ascent will begin. Once
we’re sure that SymPy can recognize the user’s input, we create a Symbol
object corresponding to the variable at u, find the first derivative with
respect to it at v, and call the grad_ascent() function with these three
arguments. The maximum value is returned at w.
The gradient ascent algorithm stops when it finds the closest peak, which
is not always the global maximum. In this example, when you start from the
initial value of −2, it stops at the peak that also corresponds to the global
maximum (approximately 706) in the considered domain. To verify this
further, let’s try a different initial value:
In this case, the closest peak at which the gradient ascent algorithm
stops is not the true global maximum of the function. Figure 7-6 depicts
the result of the gradient ascent algorithm for both of these scenarios.
Figure 7-6: Results of the gradient ascent algorithm with different initial values.
Gradient ascent always takes us to the closest peak.
Thus, when using this method, the initial value must be chosen care-
fully. Some variations of the algorithm try to address this limitation.
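To see how the starting point decides which peak is found, here is a minimal sketch. The function x**5 - 30*x**3 + 50*x is an assumption chosen for illustration (its highest peak in the neighborhood considered is roughly 706), and the helper climb(), the step size, and the tolerance are hypothetical choices rather than part of the original program.

```python
from sympy import Derivative, Symbol, sympify

def grad_ascent(x0, f1x, x, step_size=1e-4, epsilon=1e-6):
    # step_size and epsilon are assumed values
    x_old = x0
    x_new = x_old + step_size * f1x.subs({x: x_old}).evalf()
    while abs(x_old - x_new) > epsilon:
        x_old = x_new
        x_new = x_old + step_size * f1x.subs({x: x_old}).evalf()
    return x_new

def climb(expression, variable, start):
    # Hypothetical helper: turn the strings into SymPy objects,
    # differentiate, and run gradient ascent from the starting value
    f = sympify(expression)
    var = Symbol(variable)
    d = Derivative(f, var).doit()
    peak = grad_ascent(start, d, var)
    return peak, f.subs({var: peak})

expr = 'x**5 - 30*x**3 + 50*x'   # assumed example function
peak_a, value_a = climb(expr, 'x', -2)
peak_b, value_b = climb(expr, 'x', 0.5)
print('Start -2:  x = {0}, f(x) = {1}'.format(peak_a, value_a))
print('Start 0.5: x = {0}, f(x) = {1}'.format(peak_b, value_b))
```

Both runs stop at a point where the derivative is nearly zero, but only the first reaches the higher peak; the second settles on a much lower one nearby.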
$\theta_{new} = \theta_{old} + \lambda \frac{dR}{d\theta}$,
where λ is the step size. The step size determines the distance of the next
step. It should be small to avoid going over a peak. That is, if the current
'''
Use gradient ascent to find the maximum value of a
single-variable function. This also checks for the existence
of a solution for the equation f'(x)=0.
'''
return x_new
if __name__ == '__main__':
The reverse algorithm of the gradient ascent algorithm is the gradient descent
algorithm, which is a method to find the minimum value of a function. It is
similar to the gradient ascent algorithm, but instead of “climbing up” along
the function, we “climb down.” Challenge #2 on page 205 discusses the dif-
ference between these two algorithms and gives you an opportunity to imple-
ment the reverse one.
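As a rough illustration of the idea, the sketch below flips the sign of the update so that each step moves downhill. Everything in it is an assumption made for the example: the function (x - 2)**2 + 1, its hand-written derivative, the step size, and the tolerance. It uses plain Python functions instead of SymPy expressions to keep it short.

```python
def grad_descent(x0, f1x, step_size=0.1, epsilon=1e-8):
    # Same loop as gradient ascent, but subtracting the scaled derivative
    x_old = x0
    x_new = x_old - step_size * f1x(x_old)
    while abs(x_old - x_new) > epsilon:
        x_old = x_new
        x_new = x_old - step_size * f1x(x_old)
    return x_new

def f(x):
    # Assumed example function with a single minimum at x = 2
    return (x - 2)**2 + 1

def f1x(x):
    # Derivative of f, written out by hand
    return 2*(x - 2)

x_min = grad_descent(0.0, f1x)
print('x: {0}, minimum value: {1}'.format(x_min, f(x_min)))
```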
which is really F(b) − F(a), where F(b) and F(a) are the values of the
antiderivative of the function at x = b and at x = a, respectively. We can find both
kinds of integral by creating an Integral object.
Here’s how we can find the integral $\int kx\,dx$, where k is a constant term:
We import the Integral and Symbol classes and create two Symbol objects
corresponding to k and x. Then, we create an Integral object with the func-
tion kx, specifying the variable to integrate with respect to x. Similar to
Limit and Derivative classes, we can now evaluate the integral using the
doit() method:
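A minimal sketch of these steps follows. The definite integral at the end, with the limits 0 and 2, is an added illustration and an assumption rather than part of the description above.

```python
from sympy import Integral, Symbol

x = Symbol('x')
k = Symbol('k')

# Unevaluated integral of k*x with respect to x
integral = Integral(k*x, x)
# doit() evaluates it to the antiderivative
indefinite = integral.doit()

# For a definite integral, pass a tuple of (variable, lower, upper)
definite = Integral(k*x, (x, 0, 2)).doit()

print(indefinite)
print(definite)
```

The first result should be the antiderivative kx²/2, and the second the value of F(2) − F(0) for that antiderivative.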
Figure 7-7: The definite integral of a function between two points is the area
enclosed by the graph of the function bounded by the x-axis. (The figure plots
f(x) = x and marks the points A, B, C, D, and E.)
The value of the integral turns out to be the same as the area of the
region ABDE. This isn’t a coincidence; you’ll find this is true for any func-
tion of x for which the integral can be determined.
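You can check this numerically with a small sketch. The limits 2 and 4 are hypothetical values chosen for the example; with them, the region under f(x) = x is a trapezoid whose area is easy to work out by hand.

```python
from sympy import Integral, Rational, Symbol

x = Symbol('x')
lower, upper = 2, 4   # hypothetical limits for the example

# Definite integral of f(x) = x between the two limits
integral_value = Integral(x, (x, lower, upper)).doit()

# Area of the trapezoid under the line: the average of the two
# parallel sides (f(lower) and f(upper)) times the width
geometric_area = Rational(lower + upper, 2) * (upper - lower)

print(integral_value, geometric_area)
```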
Understanding that the definite integral is the area enclosed by the
function between specified points on the x-axis is key for understanding
probability calculations in random events that involve continuous random
variables.
$P(11 < x < 12) = \frac{n(E)}{n(S)}$,
where E is the set of all grades possible between 11 and 12 and S is the set
of all possible grades—that is, all real numbers between 1 and 20. By our
definition of the preceding problem, n(E) is infinite because it’s impossible
to count all possible real numbers between 11 and 12; the same is true for
n(S). Thus, we need a different approach to calculate the probability.
A probability density function, P(x), expresses the probability of the value
of a random variable being close to x, an arbitrary value.1 It can also tell
us the probability of x falling within an interval. That is, if we knew the
probability density function representing the probability of grades in our
fictional class, calculating P(11 < x < 12) would give us the probability that
we’re looking for. But how do we calculate this? It turns out that this prob-
ability is the area enclosed by the graph of the probability density function
and the x-axis between the points x = 11 and x = 12. Assuming an arbitrary
probability density function, Figure 7-8 demonstrates this.
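(Figure 7-8 shades the area under the probability density function between
x = 11 and x = 12, labeled P(11 < x < 12).)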
1. For more information, see “The idea of a probability density function” by Duane Q.
Nykamp from Math Insight (http://mathinsight.org/probability_density_function_idea).
We already know that this area is equal to the value of the integral,
$\int_{11}^{12} P(x)\,dx$;
thus, we have an easy way to find the probability of the grade lying between
11 and 12. With the math out of the way, we can now find out what the
probability is. The probability density function we assumed earlier is the
function
,
where x is the grade obtained. This function has been chosen so that the
probability of the grade being close to 10 (either greater or less than) is
high but then decreases sharply.
Now, let’s calculate the integral
We create the Integral object for the function, with p representing the
probability density function that specifies that we want to calculate the
definite integral between 11 and 12 on the x-axis. We evaluate the function
using doit() and find the numerical value using evalf(). Thus, the probabil-
ity that a grade lies between 11 and 12 is close to 0.14.
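The calculation can be sketched as follows. The density used here is an assumption: a bell-shaped curve centered at 10 with a standard deviation of 1, chosen because it peaks at 10, falls off sharply, and gives a result close to 0.14. The check that the total area equals 1 is an addition for illustration.

```python
from sympy import Integral, Symbol, exp, oo, pi, sqrt

x = Symbol('x')

# Assumed probability density function: bell curve centered at 10
p = exp(-(x - 10)**2 / 2) / sqrt(2*pi)

# Probability of a grade between 11 and 12: area under p in that interval
probability = Integral(p, (x, 11, 12)).doit().evalf()

# A valid density encloses a total area of 1 over all real numbers
total = Integral(p, (x, -oo, oo)).doit().evalf()

print(probability)
print(total)
```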
Because this integral has the same lower and upper limits, its value is 0.
This is rather unintuitive and paradoxical, so let’s try to understand it.
Consider the range of grades we addressed earlier—0 to 20. The grade
a student can obtain can be any number in this interval, which means there
is an infinite number of numbers. If each number were to have an equal
probability of being selected, what would that probability be? According
to the formula for discrete probability, this should be 1/∞, which means a
very small number. In fact, this number is so small that for all practical pur-
poses, it’s considered 0. Hence, the probability of the grade being 11.5 is 0.
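The sketch below illustrates the same point with the density assumed for illustration (a bell curve centered at 10 with unit standard deviation, which is a hypothetical choice). An integral whose limits coincide evaluates to 0, and the probability of landing in an interval that starts at 11.5 shrinks toward 0 as the interval narrows.

```python
from sympy import Integral, Rational, Symbol, exp, pi, sqrt

x = Symbol('x')
p = exp(-(x - 10)**2 / 2) / sqrt(2*pi)   # assumed density function

grade = Rational(23, 2)   # 11.5 as an exact number

# Same lower and upper limit: the enclosed area is zero
point_probability = Integral(p, (x, grade, grade)).doit()

# Narrower and narrower intervals starting at 11.5
widths = [1, Rational(1, 10), Rational(1, 100), Rational(1, 1000)]
shrinking = [float(Integral(p, (x, grade, grade + w)).doit().evalf())
             for w in widths]

print(point_probability)
print(shrinking)
```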
What You Learned
In this chapter, you learned how to find the limits, derivatives, and integrals
of functions. You learned about the gradient ascent method for finding the
maximum value of a function and saw how you can apply integration
principles to calculate the probability of continuous random variables. Next,
you have a few tasks to attempt.
Programming Challenges
The following challenges build on what you’ve learned in this chapter. You
can find sample solutions at http://www.nostarch.com/doingmathwithpython/.
Your challenge here is to write a program that will (1) accept a single-
variable function and a value of that variable as inputs and (2) check
whether the input function is continuous at the point where the variable
assumes the value input.
Here is a sample working of the completed solution:
$x_{new} = x_{old} - \lambda \frac{df}{dx}$,
The integral $\int_a^b f(x)\,dx$ expresses the area enclosed by the function f(x), with the x-axis between
x = a and x = b. The area between two curves is thus expressed as the
integral
$\int_a^b (f(x) - g(x))\,dx$,
where a and b are the points of intersection of the two curves with a < b.
The function f(x) is referred to as the upper function and g(x) as the lower
function. Figure 7-9 illustrates this, assuming f(x) = x and g(x) = x², with
a = 0 and b = 1.
Your challenge here is to write a program that will allow the user to
input any two single-variable functions of x and print the enclosed area
between the two. The program should make it clear that the first function
entered should be the upper function, and it should also ask for the values
of x between which to find the area.
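For the example functions f(x) = x and g(x) = x², the integral can be worked out by hand, which gives a value to check your program against:

```latex
\int_0^1 \left(x - x^2\right)\,dx
  = \left[\frac{x^2}{2} - \frac{x^3}{3}\right]_0^1
  = \frac{1}{2} - \frac{1}{3}
  = \frac{1}{6}
```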
Figure 7-9: The functions f(x) = x and g(x) = x² enclose an area between
x = 0 and x = 1.0.
Afterword
Python Documentation
You may also wish to start exploring Python’s documentation of various
features.
Books
If you’re interested in exploring more math and programming topics, check
out the following books:
• Invent Your Own Computer Games with Python and Making Games with
Python and Pygame by Al Sweigart (both freely available at
https://inventwithpython.com/) don’t specifically address solving math
problems but apply math for the purpose of writing computer games using
Python.
• Think Stats: Probability and Statistics for Programmers by Allen B. Downey
is a freely available book (http://greenteapress.com/thinkstats/). As the title
suggests, it delves deeply into statistics and probability topics beyond
the ones discussed in this book.
• Teach Your Kids to Code by Bryson Payne (No Starch Press, 2015) is
meant for beginners and covers various Python topics. You’ll learn
turtle graphics, various interesting ways of using the random Python
module, and how to create games and animations using Pygame.
• Computational Physics with Python by Mark Newman (2013) focuses on
a number of advanced math topics geared toward solving problems in
physics. However, there are a number of chapters that are relevant to
anyone interested in learning more about writing programs for solving
numerical and mathematical problems.
Getting Help
If you are stuck on a specific issue discussed in this book, please contact me
via email at [email protected]. If you want to learn more about
any of the functions or classes we have used in our programs, the first place
to look would be the official documentation of the relevant projects:
If you are stuck with a problem and want help, you can also email the
project-specific mailing lists. You can find links to these on the book’s
website.
Conclusion
And finally, we’ve reached the end of the book. I hope you’ve learned a lot
as you followed along. Go out there and solve some more problems using
Python!
A
Software Installation
Microsoft Windows
Download the Anaconda GUI installer for Python 3 from
http://continuum.io/downloads. Double-click the installer and then follow these steps:
2. You can choose to install the distribution either for your username only
or for all users using this computer.
3. Choose the folder where you want Anaconda to install the programs.
The defaults should work fine.
4. Make sure to check the two boxes in the Advanced Options dialog so
that you can invoke the Python shell and other programs, such as conda,
pip, and idle, from anywhere on the command prompt. In addition,
any other Python programs looking for a Python 3.4 installation will
be pointed to the one installed by Anaconda:
5. Click Install to start the installation. When the installation has fin-
ished, click Next and then click Finish to complete the installation.
You should be able to find Python in your Start Menu.
6. Open a Windows command prompt and carry out the following steps.
Updating SymPy
The installation may come with SymPy already installed, but we want to
make sure that we have at least 0.7.6, so we’ll install it using this command:
Installing matplotlib-venn
To install matplotlib-venn, use this command:
Linux
The Linux installer is distributed as a shell script. Download the
Anaconda Python installer from http://continuum.io/downloads.
Then start the installer by executing the following:
$ bash Anaconda3-2.1.0-Linux-x86_64.sh
[/home/testuser/anaconda3] >>>
PREFIX=/home/testuser/anaconda3
installing: python-3.4.1-4 ...
installing: conda-3.7.0-py34_0
..
When asked to confirm the install location, enter yes so that the
Python 3.4 interpreter installed by Anaconda is always invoked when you
invoke the Python program from your terminal:
For this change to become active, you have to open a new terminal.
Updating SymPy
First, make sure that SymPy 0.7.6 is installed:
Installing matplotlib-venn
Use the following command to install matplotlib-venn:
Mac OS X
Download the graphical installer from http://continuum.io/downloads. Then
double-click the .pkg file and follow the instructions:
2. Click Agree to accept the “Anaconda END USER LICENSE
AGREEMENT”:
3. In the following dialog, choose the “Install for me only” option. The
error message you see is a bug in the installer software. Just click it, and
it will disappear. Click Continue to proceed.
4. Select Install:
5. Once the installation is finished, open the Terminal app and follow the
next steps to update SymPy and install matplotlib-venn.
Updating SymPy
First, make sure that SymPy 0.7.6 is installed:
Installing matplotlib-venn
Use the following command to install matplotlib-venn:
B
Overview of Python Topics
if __name__ == '__main__'
Throughout the book, we’ve used the following block of code, where func()
is a function we’ve defined in the program:
if __name__ == '__main__':
    # Do something
    func()
This block of code ensures that the statements within the block are
executed only when the program is run on its own.
When a program runs, the special variable __name__ is set to __main__
automatically, so the if condition evaluates to True and the function func()
is called. However, __name__ is set differently when you import the program
into another program (see “Reusing Code” on page 235).
Here’s a quick demonstration. Consider the following program, which
we’ll call factorial.py:
def fact(n):
    p = 1
    for i in range(1, n+1):
        p = p*i
    return p

print(__name__)  # ➊
if __name__ == '__main__':
    n = int(input('Enter an integer to find the factorial of: '))
    f = fact(n)
    print('Factorial of {0}: {1}'.format(n, f))
__main__
Enter an integer to find the factorial of: 5
Factorial of 5: 120
Note that both the programs must be in the same directory. When you
run this program, you’ll get the following output:
factorial
Factorial of 5: 120
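The effect of importing can be reproduced in a single runnable sketch. To stay self-contained, it writes a stand-in for factorial.py to a temporary directory and imports it from there; the temporary directory, the seen_name label, and the use of importlib are assumptions made for the demonstration, not how the two programs are organized in practice.

```python
import importlib
import os
import sys
import tempfile

# Stand-in for factorial.py; seen_name records what __name__ was
# while the file was being executed
FACTORIAL_SOURCE = """
def fact(n):
    p = 1
    for i in range(1, n+1):
        p = p*i
    return p

seen_name = __name__

if __name__ == '__main__':
    print('Run as a standalone program')
"""

module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'factorial.py'), 'w') as f:
    f.write(FACTORIAL_SOURCE)

# Make the temporary directory importable, then import the module
sys.path.insert(0, module_dir)
factorial = importlib.import_module('factorial')

print(factorial.seen_name)
print('Factorial of 5: {0}'.format(factorial.fact(5)))
```

Because the file is imported rather than run directly, the block guarded by the if statement is skipped and __name__ holds the module's name.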
To summarize, it’s good practice to use if __name__ == '__main__' in your
programs so that the statements you want executed only when your program is
run as a standalone are not executed when your program is imported
into another program.
List Comprehensions
Let’s say we have a list of integers and we want to create a new list contain-
ing the squares of the elements of the original list. Here’s one way that we
could do this that’s already familiar to you:
>>> x = [1, 2, 3, 4]
>>> x_square = []
>>> for n in x:  # ➊
        x_square.append(n**2)  # ➋
>>> x_square
[1, 4, 9, 16]
Using list comprehension, you can rewrite the block of code as follows:
The code is more compact now because you don’t have to create an empty
list, write a for loop, and append to the list. A list comprehension lets you
do all of this in a single statement.
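A minimal sketch of the list comprehension version, using the same labels as the loop shown earlier:

```python
x = [1, 2, 3, 4]

# Build the list of squares in a single statement
x_square = [n**2 for n in x]
print(x_square)
```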
You can also add conditionals to a list comprehension in order to selec-
tively choose which list items are evaluated in the expression. Consider, once
again, the first example:
>>> x = [1, 2, 3, 4]
>>> x_square = [n**2 for n in x if n%2 == 0]
>>> x_square
[4, 16]
Once we create a dictionary, we can add a new key-value pair to it, simi-
lar to how we can append elements to a list. Here’s an example:
>>> d
{'key1': 5, 'x': 1, 'key2': 20}
This code snippet checks whether the key 'x' already exists in the dic-
tionary, d. If it does, it prints the value corresponding to it; otherwise, it
adds the key to the dictionary with 1 as the corresponding value. In the
Python 3.4 release used in this book, as with sets, the order of the
key-value pairs in a dictionary isn’t guaranteed and can differ from the
order of insertion. From Python 3.7 onward, dictionaries preserve insertion order.
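A snippet matching this description might look like the following sketch. The starting contents of d are an assumption based on the output shown above.

```python
# Assumed starting dictionary, before the key 'x' is added
d = {'key1': 5, 'key2': 20}

if 'x' in d:
    # The key exists, so print the value stored under it
    print(d['x'])
else:
    # The key is missing, so add it with 1 as its value
    d['x'] = 1

print(d)
```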
Besides specifying the key as an index to the dictionary, we can also use
the get() method to retrieve the value corresponding to the key:
>>> d.get('x')
1
>>> d.get('y', 0)
0
>>> d['y'] = 1
>>> d.get('y', 0)
1
The keys() and values() methods each return a list-like data structure of
all the keys and values, respectively, in a dictionary:
>>> d.keys()
dict_keys(['key1', 'x', 'key2', 'y'])
>>> d.values()
dict_values([5, 1, 20, 1])
To iterate over the key and value pairs in a dictionary, use the items()
method:
>>> d.items()
dict_items([('key1', 5), ('x', 1), ('key2', 20), ('y', 1)])
This method returns a view of tuples, and each tuple is a key-value pair.
We can use the following code snippet to print them nicely:
Views are more memory efficient than lists, and they don’t let you add
or remove items.
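One way to print the pairs is to unpack each tuple in a for loop, as in this sketch. The lines label and the final check are additions for illustration only.

```python
d = {'key1': 5, 'x': 1, 'key2': 20, 'y': 1}

# Each item is a (key, value) tuple, so it can be unpacked directly
lines = []
for k, v in d.items():
    lines.append('{0}: {1}'.format(k, v))
print('\n'.join(lines))

# A view has no append() method, unlike a list
can_append = hasattr(d.items(), 'append')
print(can_append)
```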
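import math

def components(u, theta):
    x = u*math.cos(theta)
    y = u*math.sin(theta)
    return x, y

if __name__ == '__main__':
    u = 5  # initial velocity; an illustrative value
    theta = math.radians(45)
    # Unpack the returned tuple into separate labels
    x, y = components(u, theta)
    # Alternatively, keep the tuple and access its members by index
    c = components(u, theta)
    x = c[0]
    y = c[1]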
This has advantages because we don’t have to know all the different
values being returned. For one, you don’t have to write x,y,z = myfunc1()
when the function returns three values or a,x,y,z = myfunc1() when the
function returns four values, and so on.
In either of the preceding cases, the code calling the components() func-
tion must know which of the return values correspond to which component
of the velocity, as there’s no way to know that from the values themselves.
A user-friendly approach is to return a dictionary object instead, as we
saw in the case of SymPy’s solve() function when used with the dict=True
keyword argument. Here’s how we can rewrite the preceding components
function to return a dictionary:
import math

def components(theta):
    x = math.cos(theta)
    y = math.sin(theta)
    return {'x': x, 'y': y}
Here, we return a dictionary with the keys 'x' and 'y' referring to the x
and y components and their corresponding numerical values. With this new
function definition, we don’t need to worry about the order of the returned
values. We just use the key 'x' to retrieve the x component and the key 'y'
to retrieve the y component:
if __name__ == '__main__':
    theta = math.radians(45)
    c = components(theta)
    y = c['y']
    x = c['x']
    print(x, y)
'''
Find the range using a dictionary to return values
'''
def find_range(numbers):
    lowest = min(numbers)
    highest = max(numbers)
    # Find the range
    r = highest-lowest
    return {'lowest':lowest, 'highest':highest, 'range':r}

if __name__ == '__main__':
    donations = [100, 60, 70, 900, 100, 200, 500, 500, 503, 600, 1000, 1200]
    result = find_range(donations)
    print('Lowest: {0} Highest: {1} Range: {2}'.  # ➊
          format(result['lowest'], result['highest'], result['range']))
The find_range() function now returns a dictionary with the keys lowest,
highest, and range and with the lowest number, highest number, and the range
as their corresponding values. At ➊, we simply use the corresponding key to
retrieve the corresponding value.
Exception Handling
In Chapter 1, we learned that trying to convert a string such as '1.1' to an
integer using the int() function results in a ValueError exception. But with
a try...except block, we can print a user-friendly error message:
>>> try:
        int('1.1')
    except ValueError:
        print('Failed to convert 1.1 to an integer')
When any statement in the try block raises an exception, the type of
exception raised is matched with the one specified by the except statement.
If there’s a match, the program resumes in the except block. If the excep-
tion doesn’t match, the program execution halts and displays the exception.
Here’s an example:
>>> try:
        print(1/0)
    except ValueError:
        print('Division unsuccessful')
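The sketch below shows the mismatch in a form you can run as a program. The function name divide_with_wrong_handler() and the outer try...except block are hypothetical additions; the outer block is there only to record which exception escapes, since in the shell the program would simply halt and display it.

```python
def divide_with_wrong_handler():
    try:
        return 1/0
    except ValueError:
        # Never reached: dividing by zero doesn't raise ValueError
        return 'Division unsuccessful'

try:
    divide_with_wrong_handler()
except ZeroDivisionError as e:
    # The exception passed through the inner except statement unhandled
    escaped = type(e).__name__

print(escaped)
```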
def reciprocal(n):
    try:
        print(1/n)
    except (ZeroDivisionError, TypeError):
        print('You entered an invalid number')
We defined the function reciprocal(), which prints the reciprocal of
the user’s input. We know that if the function is called with 0, it’ll cause
a ZeroDivisionError exception. If you pass a string, however, it’ll cause a
TypeError exception. The function considers both these cases as invalid
input and specifies both ZeroDivisionError and TypeError in the except state-
ment as a tuple.
Let’s try calling the function with a valid input—that is, a nonzero
number:
>>> reciprocal(5)
0.2
>>> reciprocal(0)
You entered an invalid number
>>> reciprocal('1')
You entered an invalid number
def reciprocal(n):
    try:
        print(1/n)
    except TypeError:
        print('You must specify a number')
    except ZeroDivisionError:
        print('Division by 0 is invalid')
>>> reciprocal(0)
Division by 0 is invalid
>>> reciprocal('1')
You must specify a number
if __name__ == '__main__':
    try:
        u = float(input('Enter the initial velocity (m/s): '))
        theta = float(input('Enter the angle of projection (degrees): '))
    except ValueError:
        print('You entered an invalid input')
    else:  # ➊
        draw_trajectory(u, theta)
        plt.show()
100
60
70
900
100
200
500
500
503
600
1000
1200
We want to write a function that reads the file and returns a list of those
numbers:
def read_data(path):
    numbers = []
    f = open(path)  # ➊
    for line in f:  # ➋
        numbers.append(float(line))
    f.close()
    return numbers
def read_data(path):
    numbers = []
    with open(path) as f:  # ➊
        for line in f:
            numbers.append(float(line))
    return numbers  # ➋
def read_data(path):
    with open(path) as f:
        lines = f.readlines()  # ➊
    numbers = [float(n) for n in lines]
    return numbers
We read all the lines of the file into a list using the readlines() method
at ➊. Then, we convert each of the items in the list into a floating point
number using the float() function and list comprehension. Finally, we
return the list numbers.
if __name__=='__main__':
    data_file = input('Enter the path of the file: ')
    data = read_data(data_file)
    print(data)
Once you’ve added this code to the end of the read_data() function and
run it, it’ll ask you to input the path to the file. Then, it’ll print the numbers
it reads from the file:
Because I entered a file path that doesn’t exist, the FileNotFoundError
exception is raised when we try to open the file. We can make the program
display a user-friendly error message by modifying our read_data() function
as follows:
def read_data(path):
    numbers = []
    try:
        with open(path) as f:
            for line in f:
                numbers.append(float(line))
    except FileNotFoundError:
        print('File not found')
    return numbers
Now, when you specify a nonexistent file path, you’ll get an error mes-
sage instead:
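To try both situations without preparing a file by hand, here is a self-contained sketch. The temporary directory, the filenames mydata.txt and no_such_file.txt, and the four sample numbers are hypothetical and exist only for the demonstration.

```python
import os
import tempfile

def read_data(path):
    numbers = []
    try:
        with open(path) as f:
            for line in f:
                numbers.append(float(line))
    except FileNotFoundError:
        print('File not found')
    return numbers

# Write a small sample file to a temporary directory
tmp_dir = tempfile.mkdtemp()
data_path = os.path.join(tmp_dir, 'mydata.txt')
with open(data_path, 'w') as f:
    f.write('100\n60\n70\n900\n')

# An existing file is read into a list of floating point numbers
good = read_data(data_path)
# A nonexistent file prints the message and yields an empty list
missing = read_data(os.path.join(tmp_dir, 'no_such_file.txt'))

print(good)
print(missing)
```

Note that with this design the caller can't distinguish a missing file from an empty one by the return value alone; the printed message is the only signal.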
The second source of errors can be that the data in the file isn’t what
your program expects to read. For example, consider a file that has the
following:
10
20
3o
1/5
5.6
The third line in this file isn’t convertible to a floating point number
because it has the letter o in it instead of the number 0, and the fourth line
consists of 1/5, a fraction in string form, which float() can’t handle.
If you supply this data file to the earlier program, it’ll produce the fol-
lowing error:
The third line in the file is 3o, not the number 30, so when we attempt
to convert it into a floating point number, the result is ValueError. There are
two approaches you can take when such data is present in a file. The first
def read_data(path):
    numbers = []
    try:
        with open(path) as f:
            for line in f:
                try:  # ➊
                    n = float(line)  # ➋
                except ValueError:
                    print('Bad data: {0}'.format(line))
                    break  # ➌
                numbers.append(n)  # ➍
    except FileNotFoundError:
        print('File not found')
    return numbers
[10.0, 20.0]
def read_data(path):
    numbers = []
    try:
        with open(path) as f:
            for line in f:
                try:
                    n = float(line)
                except ValueError:
                    print('Bad data: {0}'.format(line))
                    continue  # ➊
                numbers.append(n)
    except FileNotFoundError:
        print('File not found')
    return numbers
The only change here is that instead of breaking out of the for loop, we
just continue with the next iteration using the continue statement at ➊. The
output from the program is now as follows:
Bad data: 3o
The specific application where you’re reading the file will determine
which of the above approaches you want to take to handle bad data.
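The two approaches can be compared side by side with a sketch that works on a list of strings instead of a file. The function parse_numbers() and its stop_on_bad_data argument are hypothetical names introduced for this comparison; the sample lines are the ones from the data file shown earlier.

```python
def parse_numbers(lines, stop_on_bad_data):
    numbers = []
    for line in lines:
        try:
            n = float(line)
        except ValueError:
            print('Bad data: {0}'.format(line))
            if stop_on_bad_data:
                # First approach: give up at the first bad line
                break
            # Second approach: skip the bad line and keep reading
            continue
        numbers.append(n)
    return numbers

sample = ['10', '20', '3o', '1/5', '5.6']
stopped = parse_numbers(sample, stop_on_bad_data=True)
skipped = parse_numbers(sample, stop_on_bad_data=False)
print(stopped)
print(skipped)
```

Stopping early keeps only the numbers read before the first problem, while skipping keeps every line that converts cleanly.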
Reusing Code
Throughout this book, we’ve used classes and functions that were either
part of the Python standard library or available after installing third-party
packages, such as matplotlib and SymPy. Now we’ll look at a quick example
of how we can import our own programs into other programs.
Consider the function find_corr_x_y() that we wrote in “Calculating the
Correlation Between Two Data Sets” on page 75. We’ll create a separate
file, correlation.py, which has only the function definition:
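'''
Function to calculate the linear correlation coefficient
'''
def find_corr_x_y(x,y):
    # Size of each set
    n = len(x)

    # Find the sum of the products
    prod = []
    for xi,yi in zip(x,y):
        prod.append(xi*yi)
    sum_prod_x_y = sum(prod)
    sum_x = sum(x)
    sum_y = sum(y)
    squared_sum_x = sum_x**2
    squared_sum_y = sum_y**2

    x_square=[]
    for xi in x:
        x_square.append(xi**2)
    x_square_sum = sum(x_square)

    y_square=[]
    for yi in y:
        y_square.append(yi**2)
    y_square_sum = sum(y_square)

    # Use the formula for the correlation coefficient
    numerator = n*sum_prod_x_y - sum_x*sum_y
    denominator_term1 = n*x_square_sum - squared_sum_x
    denominator_term2 = n*y_square_sum - squared_sum_y
    denominator = (denominator_term1*denominator_term2)**0.5
    correlation = numerator/denominator
    return correlation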
This program finds the correlation between the high school math
grades and college admission scores of students we considered in Table 3-3
on page 80. We import the find_corr_x_y() function from the correlation
module, create the lists representing the two sets of grades, and call the
find_corr_x_y() function with the two lists as arguments. When you run the
program, it’ll print the correlation coefficient. Note that the two files must
be in the same directory—this is strictly to keep things simple.
Index
pretty printing, 97–100 integrals of, finding, 200–201
strings, converting to, 103–105 limit of, finding, 181–185
substituting in values, 100–103 probability density, 201–204
extrema, of a function, 188–191 range of, 178
F G
factor() function, 96–97, 115 geometric shapes, drawing,
factors of an integer, 150–158
calculating, 12–14 geometric transformations, 158
fargs keyword argument, 154, 158 global maxima and minima,
Fibonacci sequence, 59–60 188–199
file handling golden ratio, 59–60
close() method, 231 gradient ascent method, 191, 195
filename as input, 232 gradient descent method, 199,
handling errors, 232–235 205–206
open() function, 231 graphs, creating with matplotlib,
reading files, 230–231 32–46
readlines() method, 232 customizing with titles and
file object, 84 labels, 41–44
formatting output, 15 marking points, 33–35
format(), 15 saving as images, 45–46
number of digits, 16 temperature data example,
print() function, 1 35–44
fractals, 158–168
Barnsley fern, 163–168 H
Hénon’s function, 171–172
Mandelbrot set, 172–176 higher-order derivatives of
Sierpiński triangle, 170–171 functions, finding,
transformations of points, 188–191
158–163 Hénon’s function, 171–172
fractions Hunter, John, “matplotlib,” 150
calculator, 23–24
working with, 5–6 I
fractions module, 5 IDLE, 1, 13–14
frames argument, 154, 158 new program, 13
frequency tables, creating, 69–71 program execution, 14
FuncAnimation class, 154–158 running a program, 14
functions (calculus), 178 shell, 1
common, 178–180 importing, modules, 5
continuity at a point, imshow() function, 172
verifying, 205 indefinite integral, 200
derivatives of, finding, 185–187 index, of a list, 29, 31
higher-order, 188–191 inequalities, solving, 117–119
domain of, 178 infinite loop, 24
extrema of, 188–191 Infinity, 183, 204
in operator, 122 M
input() function, 8
Mac OS X, software installation on,
installation, of software
217–220
on Linux, 216–217
Mandelbrot set, 172–176
on Mac OS X, 217–220
mathematical operations, 1–3
on Windows, 214–215
exponential operator, 3
Integral class, 200
floor division operator, 2
integrals of functions, finding, 200
modulo (%) operator, 3, 12
intersection, of sets, 127
math module, 178
interval argument, 154
matplotlib, 32
animation module, 154
K axes
keys, in a dictionary, 224, 227 auto scaling, 152
customizing, 42
Axes object, 151
L axis() function, 43
labels, 4 barh() function, 57
Lady ferns, 164 Circle patch, 151
law of large numbers, 144 colorbar() function, 175
legend() function, 40 displaying images, 172
len() function, 62 documentation, 211
limit, finding, 181 Figure object, 150, 154
Limit class, 182 FuncAnimation class, 154–158
Linux, software installation on, gca() function, 152
216–217 gcf() function, 154
lists, 29–31 imshow() function, 172
appending to a list, 30 labels, 41
choosing a random legend, adding a, 40
element, 161 legend() function, 40
creating a set, 123 marker, 34
empty lists, 30 multiple data sets, 38, 53
index, 29 patches, 150
iterating over the elements, 31 plot() function, 32, 36
len() function, 62 Polygon patch, 168
list comprehensions, 223–224 pylab module, 32
lists of lists, 173–175 pyplot module, 44
max() function, 72 savefig() function, 45
min() function, 72 saving, 45–46
sort() method, 64 scatter() function, 81
sum() function, 62 scatter plots, 79, 81–83
tuples as members, 66 set_aspect() method, 153
zip() function, 77 show() function, 32
local maxima and minima, title, 41
188–191 title() function, 41
log() function, 179 xlabel() function, 41
ylabel() function, 41
maxima and minima, of functions, P
188–191
Packages (Python), 32
max() function, 72
partial derivative of functions,
mean, finding, 62–63
finding, 187
median, finding, 63–65
Pearson correlation coefficient, 75
min() function, 72
PEMDAS (order of operations), 3
mode, finding, 65–69
pi (π), estimating value of, 147
modules, 5
plot() function, 32, 109
modulo (%) operator, 3
plotting
multiplication tables, generating,
expressions, 108–115
15–17, 23
input by the user, 111–113
multiplying expressions, 104–105
multiple, 113–115
with formulas, 46–54
N
projectile motion, 48–54
__name__, 221–223 using SymPy. See SymPy
negative index, of a list, 31 polynomial expressions, 117
NegativeInfinity, 204 polynomial() method, 119
Newton’s law of universal pretty printing, 97–100
gravitation, 46–48 probability, 131–140, 201–204
number line, 28 continuous random
numbers variable, 201
abs() function, 7 density functions, 201–204
common number sets, 126 distribution, uniform, 131
complex numbers. See complex expectation, 143
numbers law of large numbers, 144
conversion between types, 5 nonuniform probability, 164
float() function, 5 random numbers. See also
floating point, 4–5 random numbers
Fraction class, 5, 6 generating, 134–137
fractions module, 5 nonuniform, 137–140
integers, 4–5 random variable, 143
int() function, 5 Project Euler, 210
is_integer() method, 10 projectile motion, 48, 191
random. See random numbers animation, 156
rational, irrational, and real, 126 trajectory drawing, 51, 56
type() function, 4 pylab module, 32
types of, 4–7 pyplot module, 44–45
Nykamp, Duane Q., “The idea of Python
a probability density documentation, 210, 211
function,” 202 IDLE, 1, 13–14
installation
O Linux, 216–217
Mac OS X, 217–220
open() function, 231
Windows, 214–215
order of operations (PEMDAS), 3
overview, 221–236
Q sets, 121–131
cardinality, 122
quadratic equations
checking for a number in, 122
finding the roots of, 20–22
common, 126
solving, 106
correlation between, 75–81
quadratic functions, exploring
creating, 122–124
visually, 55–56
empty, 123
from lists or tuples, 123
R EmptySet object, 123
random module, 134 FiniteSet class, 122
choice() function, 160 FiniteSet object, 122
randint() function, 134, 175 intersect() method, 127
random() function, 134 is_subset() method, 124
uniform() function, 146 is_superset() method, 124
random numbers iterating through the
ATM example, 138–140 members, 123
coin tosses, 137–138, 144 operations, 126–131
deck of cards, shuffling, 144–145 Cartesian product, 127–128
die rolls. See die rolls formulas, using sets of
generating, 134–137 variables in, 129
nonuniform, 137–140 gravity example, 130–131
range union and intersection, 126
of a function, 178 powerset() method, 125
of a set, 71–72 repetition and order, 123–124
range() function, 13, 37, 50 subsets, supersets, and power
start, stop, and step values, 13 sets, 124–125
rate of change, finding, 184 union() method, 126–127
reading data from files, 83–88 Venn diagrams, 140–143
CSV files, 86–88 show() function, 32, 111
text files, 84–85 shuffling, deck of cards, 144–145
return values, multiple, 226–228 Sierpiń ski triangle, 170–171
reusing code, 235–236 simultaneous equations, 108
Robertson, Ian, “Calculating sin() function, 52, 178, 179
Percentiles,” 90 software installation
on Linux, 216–217
on Mac OS X, 217–220
S
on Windows, 214–215
sample spaces (probability), 131 solving algebraic equations, 105
save() function, 111 standard deviation, finding, 72–75
saving plots, as image files, statistical measures
45–46, 111 correlation coefficient,
scatter plots, 79, 81–83 75–81, 87
series calculating, 76–78
calculating value of, 102–103 high school grades example,
Fibonacci, 59–60 78–81
printing, 99–100 dispersion, 71–75
summing, 116 frequency tables, 69–71
set_aspect() method, 153 grouped, 90–91
mean, 62–63
median, 63–65
mode, 65–71
Pearson correlation coefficient, 75
percentile, 89–90
range, 71–72
standard deviation, 72–75
variance, 72–75
step size, 192, 197–199
string, 8
format() method, 15
int() and float(), See under numbers, 8
strings to mathematical expressions, 103
sum() function, 62
summing a series, 116
symbolic math, 93
SymPy
as_numer_denom() method, 118
assumptions, 180
Derivative class, 185
documentation, 98, 211
doit() method, 182, 185
expand() function, 96
expression, factorizing an, 96
factor() function, 96
init_printing() function, 98
installation. See installation, of software
Integral class, 200
is_polynomial() method, 119
is_rational_function() method, 119
Limit class, 182
plot() function, 109
plotting expressions with, 108–115
input by the user, 111–113
multiple, 113–115
Poly class, 117
pprint() function, 97–100
pretty printing, 97–100
save() function, 111
show() function, 111
simplify() function, 101
solve() function, 105, 106, 180
solve_poly_inequality() function, 117
solve_univariate_inequality() function, 118
solving inequalities, 117
S class, 182
subs() method, 100, 108, 184
summation() function, 116
symbol, defining a, 94
Symbol class, 94
symbols() function, 95
SympifyError class, 104
sympify() function, 103, 119, 186
T
tan() function, 179
title() function, 41–42
trajectory (projectile motion)
comparing, 53–54, 56
drawing, 51–53
transformation of a point, 158
tuples, 29–31
empty, 31
iterating through the members, 123
U
union, of sets, 118, 126–127
units of measurement, converting, 17–20, 23
universal gravitation, Newton’s law, 46–48
user input
complex() function, 12
fractional numbers, 11
getting, 8–12
handling invalid input, 9–11
input() function, 8
V
ValueError, 9, 12
variables, 4, 178
nonlinear relationship, 47
variance, finding, 72–75
Venn diagrams, 140–143
W
while loop, 24
exiting early using break, 24
Windows, software installation on, 214–215
Z
ZeroDivisionError, 11, 228–229
zip() function, 77
RESOURCES
Visit https://www.nostarch.com/doingmathwithpython/ for resources, errata, and more
information.
phone: 800.420.7240 or 415.863.9900
email: sales@nostarch.com
web: www.nostarch.com