#LyX 1.3 created this file. For more info see http://www.lyx.org/
\lyxformat 221
\textclass amsbook
\language english
\inputencoding auto
\fontscheme default
\graphics default
\paperfontsize default
\papersize Default
\paperpackage a4
\use_geometry 0
\use_amsmath 0
\use_natbib 0
\use_numerical_citations 0
\paperorientation portrait
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\defskip medskip
\quotes_language english
\quotes_times 2
\papercolumns 1
\papersides 1
\paperpagestyle default

\layout Chapter

A whirlwind tour of python and the standard library
\layout Standard

This is a quick-and-dirty introduction to the python language for the impatient
 scientist.
 There are many top notch, comprehensive introductions and tutorials for
 python.
 For absolute beginners, there is the 
\shape italic 
Python Beginner's Guide
\shape default 
.
\begin_inset Foot
collapsed true

\layout Standard

http://www.python.org/moin/BeginnersGuide
\end_inset 

 The official 
\shape italic 
Python Tutorial
\shape default 
 can be read online
\begin_inset Foot
collapsed true

\layout Standard

http://docs.python.org/tut/tut.html
\end_inset 

 or downloaded
\begin_inset Foot
collapsed true

\layout Standard

http://docs.python.org/download.html
\end_inset 

 in a variety of formats.
 There are over 100 python tutorials collected online.
\begin_inset Foot
collapsed true

\layout Standard

http://www.awaretek.com/tutorials.html
\end_inset 


\layout Standard

There are also many excellent books.
 Targeting newbies is Mark Pilgrim's 
\shape italic 
Dive into Python
\shape default 
 which is available in print and for free online
\begin_inset Foot
collapsed true

\layout Standard

http://diveintopython.org/toc/index.html
\end_inset 

, though for absolute newbies even this may be too hard 
\begin_inset LatexCommand \cite{Dive}

\end_inset 

.
 For experienced programmers, David Beazley's 
\shape italic 
Python Essential Reference
\shape default 
 is an excellent introduction to python, but is a bit dated since it only
 covers python2.1 
\begin_inset LatexCommand \cite{Beasley}

\end_inset 

.
 Likewise Alex Martelli's 
\shape italic 
Python in a Nutshell
\shape default 
 is highly regarded and a bit more current -- a 2nd edition is in the works
\begin_inset LatexCommand \cite{Nutshell}

\end_inset 

.
 And 
\shape italic 
The Python Cookbook
\shape default 
 is an extremely useful collection of python idioms, tips and tricks 
\begin_inset LatexCommand \cite{Cookbook}

\end_inset 

.
\layout Standard

But the typical scientist I encounter wants to solve a specific problem,
 eg, to make a certain kind of graph, to numerically integrate an equation,
 or to fit some data to a parametric model, and doesn't have the time or
 interest to read several books or tutorials to get what they want.
 This guide is for them: a short overview of the language to help them get
 to what they want as quickly as possible.
\layout Section

Hello Python
\layout Standard

Python is a dynamically typed, object oriented, interpreted language.
 Interpreted means that your program interacts with the python interpreter,
 similar to Matlab, Perl, Tcl and Java, and unlike FORTRAN, C, or C++ which
 are compiled.
 So let's fire up the python interpreter and get started.
 I'm not going to cover installing python -- it's standard on most linux
 boxes and for windows there is a friendly GUI installer.
 To run the python interpreter, on windows, you can click 
\family typewriter 
Start->All Programs->Python 2.4->Python (command line)
\family default 
 or better yet, install 
\family typewriter 
ipython
\family default 
, a python shell on steroids, and use that.
 On linux / unix systems, you just need to type 
\family typewriter 
python
\family default 
 or 
\family typewriter 
ipython
\family default 
 at the command line.
 The 
\family typewriter 
>>>
\family default 
 is the default python shell prompt, so don't type it in the examples below
\layout LyX-Code

>>> print 'hello world'
\layout LyX-Code

hello world
\layout LyX-Code

\layout Standard

As this example shows, 
\shape italic 
hello world
\shape default 
 in python is pretty easy -- one common phrase you hear in the python community
 is that 
\begin_inset Quotes eld
\end_inset 

it fits your brain
\begin_inset Quotes erd
\end_inset 

:
 the basic idea is that coding in python feels natural.
 Compare python's version with 
\shape italic 
hello world
\shape default 
 in C++
\layout LyX-Code

// C++
\layout LyX-Code

#include <iostream>
\layout LyX-Code

int main ()
\layout LyX-Code

{   
\layout LyX-Code

  std::cout << "Hello World" << std::endl;
\layout LyX-Code

  return 0;
\layout LyX-Code

}
\layout Section

Python is a calculator
\layout Standard

Aside from my daughter's solar powered cash-register calculator, Python
 is the only calculator I use.
 From the python shell, you can type arbitrary arithmetic expressions.
\layout LyX-Code

>>> 2+2
\layout LyX-Code

4
\layout LyX-Code

>>> 2**10
\layout LyX-Code

1024
\layout LyX-Code

>>> 10/5
\layout LyX-Code

2
\layout LyX-Code

>>> 2+(24.3 + .9)/.24
\layout LyX-Code

107.0
\layout LyX-Code

>>> 2/3
\layout LyX-Code

0
\layout Standard

The last line is a standard newbie gotcha -- if both the left and right
 operands are integers, python returns an integer.
 To do floating point division, make sure at least one of the numbers is
 a float
\layout LyX-Code

>>> 2.0/3
\layout LyX-Code

0.66666666666666663
\layout Standard

The distinction between integer and floating point division is a common
 source of frustration among newbies and is slated for destruction in the
 mythical Python 3000.
\begin_inset Foot
collapsed true

\layout Standard

Python 3000 is a future python release that will clean up several things
 that Guido considers to be warts.
\end_inset 

 Since the removal of the distinction is slated, you can invoke the time
 machine with the 
\family typewriter 
from __future__
\family default 
 directive; these directives allow python programmers today to use features
 that will become standard in future releases but are not included by default
 because they would break existing code.
 From future directives should be among the first lines you type in your
 python code if you are going to use them, otherwise they may not work.
 The future division operator will assume floating point division by default,
\begin_inset Foot
collapsed false

\layout Standard

The astute reader will note that 2/3 was represented as 0.66666666666666663
 and not 0.66666666666666666 as might be expected.
 This is, of course, because computers are binary calculators, and there
 is no exact binary representation of 2/3, just as there is no exact binary
 representation of 0.1
\layout LyX-Code

>>> 0.1
\layout LyX-Code

0.10000000000000001
\layout Standard

Some languages try and hide this from you, but python is explicit.
\end_inset 

and provides another operator // to do classic integer division.
\layout LyX-Code

>>> from __future__ import division
\layout LyX-Code

>>> 2/3
\layout LyX-Code

0.66666666666666663
\layout LyX-Code

>>> 2//3
\layout LyX-Code

0
\layout Standard

python has four basic numeric types: int, long, float and complex, but unlike
 C++, BASIC, FORTRAN or Java, you don't have to declare these types.
 python can infer them
\layout LyX-Code

>>> type(1)
\layout LyX-Code

<type 'int'>
\layout LyX-Code

>>> type(1.0)
\layout LyX-Code

<type 'float'>
\layout LyX-Code

>>> type(2**200)
\layout LyX-Code

<type 'long'>
\layout LyX-Code

\layout Standard


\begin_inset Formula $2^{200}$
\end_inset 

is a huge number!
\layout LyX-Code

>>> 2**200
\layout LyX-Code

1606938044258990275541962092341162602522202993782792835301376L
\layout Standard

but python will blithely compute it and much larger numbers for you as long
 as you have CPU and memory to handle them.
 The integer type, if it overflows, will automatically convert to a python
 
\family typewriter 
long
\family default 
 (as indicated by the appended 
\family typewriter 
L
\family default 
 in the output above) and has no built-in upper bound on size, unlike C/C++
 longs.
\layout Standard

Python has built in support for complex numbers.
 Eg, we can verify 
\begin_inset Formula $i^{2}=-1$
\end_inset 

 
\layout LyX-Code

>>> x = complex(0,1)
\layout LyX-Code

>>> x*x
\layout LyX-Code

(-1+0j)
\layout Standard

To access the real and imaginary parts of a complex number, use the 
\family typewriter 
real
\family default 
 and 
\family typewriter 
imag
\family default 
 attributes
\layout LyX-Code

>>> x.real
\layout LyX-Code

0.0
\layout LyX-Code

>>> x.imag
\layout LyX-Code

1.0
\layout Standard

If you come from other languages like Matlab, the above may be new to you.
 In matlab, you might do something like this (>> is the standard matlab
 shell prompt)
\layout LyX-Code

>> x = 0+j
\layout LyX-Code

x =
\layout LyX-Code

   0.0000 + 1.0000i
\layout LyX-Code

\layout LyX-Code

>> real(x)
\layout LyX-Code

ans =
\layout LyX-Code

     0
\layout LyX-Code

\layout LyX-Code

>> imag(x)
\layout LyX-Code

ans =
\layout LyX-Code

     1
\layout LyX-Code

\layout LyX-Code

\layout Standard

That is, in Matlab, you use a 
\shape italic 
function
\shape default 
 to access the real and imaginary parts of the data, but in python these
 are attributes of the complex object itself.
 This is a core feature of python and other object oriented languages: an
 object carries its data and methods around with it.
 One might say: 
\begin_inset Quotes eld
\end_inset 

a complex number knows its real and imaginary parts
\begin_inset Quotes erd
\end_inset 

 or 
\begin_inset Quotes eld
\end_inset 

a complex number knows how to take its conjugate
\begin_inset Quotes erd
\end_inset 

, you don't need external functions for these operations
\layout LyX-Code

>>> x.conjugate
\layout LyX-Code

<built-in method conjugate of complex object at 0xb6a62368>
\layout LyX-Code

>>> x.conjugate()
\layout LyX-Code

-1j
\layout Standard

On the first line, I just followed along from the example above with 
\family typewriter 
real
\family default 
 and 
\family typewriter 
imag
\family default 
 and typed 
\family typewriter 
x.conjugate
\family default 
 and python printed the representation 
\family typewriter 
<built-in method conjugate of complex object at 0xb6a62368>.
 
\family default 
This means that 
\family typewriter 
conjugate
\family default 
 is a 
\shape italic 
method
\shape default 
, a.k.a. a function, and in python we need to use parentheses to call a function.
 If the method has arguments, like the 
\family typewriter 
x
\family default 
 in 
\family typewriter 
sin(x)
\family default 
, you place them inside the parentheses, and if it has no arguments, like
 
\family typewriter 
conjugate
\family default 
, you simply provide the open and closing parentheses.
 
\family typewriter 
real
\family default 
, 
\family typewriter 
imag
\family default 
 and 
\family typewriter 
conjugate
\family default 
 are attributes of the complex object, and 
\family typewriter 
conjugate
\family default 
 is a 
\shape italic 
callable
\shape default 
 attribute, known as a 
\shape italic 
method
\shape default 
.
\layout Standard

OK, now you are an object oriented programmer.
 There are several key ideas in object oriented programming, and this is
 one of them: an object carries around with it data (simple attributes)
 and methods (callable attributes) that provide additional information about
 the object and perform services.
 It's one stop shopping -- no need to go to external functions and libraries
 to deal with it -- the object knows how to deal with itself.
\layout Section

Accessing the standard library
\layout Standard

Arithmetic is fine, but before long you may find yourself tiring of it and
 wanting to compute logarithms and exponents, sines and cosines
\layout LyX-Code

>>> log(10)
\layout LyX-Code

Traceback (most recent call last):
\layout LyX-Code

  File "<stdin>", line 1, in ?
\layout LyX-Code

NameError: name 'log' is not defined
\layout Standard

These functions are not built into python, but don't despair, they are built
 into the python standard library.
 To access a function from the standard library, or an external library
 for that matter, you must import it.
\layout LyX-Code

>>> import math
\layout LyX-Code

>>> math.log(10)
\layout LyX-Code

2.3025850929940459
\layout LyX-Code

>>> math.sin(math.pi)
\layout LyX-Code

1.2246063538223773e-16
\layout Standard

Note that the default 
\family typewriter 
log
\family default 
 function is the natural (base e) logarithm (use 
\family typewriter 
math.log10
\family default 
 for base 10 logs) and that floating point math is inherently imprecise,
 since analytically
\begin_inset Formula $\sin(\pi)=0$
\end_inset 

.
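\layout Standard

As a quick check of which base each function uses, here is a minimal sketch.
 The variable names are just for illustration.
 It relies on the optional second argument to 
\family typewriter 
math.log
\family default 
, which selects the base, and it calls print with a single parenthesized
 argument so that it also runs on newer python interpreters.

```python
import math

x = 100.0

natural = math.log(x)     # natural logarithm (base e)
base10 = math.log10(x)    # base 10 logarithm
base2 = math.log(x, 2)    # optional second argument selects the base

# exp undoes the natural log, up to floating point rounding
roundtrip = math.exp(natural)

print(natural)
print(base10)
print(base2)
print(roundtrip)
```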
\layout Standard

It's kind of a pain to keep typing 
\family typewriter 
math.log
\family default 
 and 
\family typewriter 
math.sin
\family default 
 and 
\family typewriter 
math.pi
\family default 
, and python is accommodating.
 There are additional forms of 
\family typewriter 
import
\family default 
 that will let you save more or less typing depending on your desires
\layout LyX-Code


\color blue
# Abbreviate the module name: m is an alias
\layout LyX-Code

>>> import math as m
\layout LyX-Code

>>> m.cos(2*m.pi)
\layout LyX-Code

1.0
\layout LyX-Code

\layout LyX-Code


\color blue
# Import just the names you need
\layout LyX-Code

>>> from math import exp, log
\layout LyX-Code

>>> log(exp(1))
\layout LyX-Code

1.0
\layout LyX-Code

\layout LyX-Code


\color blue
# Import everything - use with caution!
\layout LyX-Code

>>> from math import *
\layout LyX-Code

>>> sin(2*pi*10)
\layout LyX-Code

-2.4492127076447545e-15
\layout Standard

To help you learn more about what you can find in the math library, python
 has nice introspection capabilities -- introspection is a way of asking
 an object about itself.
 For example, to find out what is available in the math library, we can
 get a directory of everything available with the 
\family typewriter 
dir
\family default 
 command
\begin_inset Foot
collapsed false

\layout Standard

In addition to the introspection and help provided in the python interpreter,
 the official documentation of the python standard library is very good
 and up-to-date http://docs.python.org/lib/lib.html .
\end_inset 


\layout LyX-Code

>>> dir(math)
\layout LyX-Code

['__doc__', '__file__', '__name__', 'acos', 'asin', 'atan', 'atan2', 'ceil',
 'cos', 'cosh', 'degrees', 'e', 'exp', 'fabs', 'floor', 'fmod', 'frexp',
 'hypot', 'ldexp', 'log', 'log10', 'modf', 'pi', 'pow', 'radians', 'sin',
 'sinh', 'sqrt', 'tan', 'tanh']
\layout Standard

This gives us just a listing of the names that are in the math module --
 they are fairly self descriptive, but if you want more, you can call 
\family typewriter 
help
\family default 
 on any of these functions for more information
\layout LyX-Code

>>> help(math.sin) 
\layout LyX-Code

Help on built-in function sin:
\layout LyX-Code

sin(...)
\layout LyX-Code

sin(x)
\layout LyX-Code

Return the sine of x (measured in radians).
\layout Standard

and for the whole math library
\layout LyX-Code

>>> help(math) 
\layout LyX-Code

Help on module math:
\layout LyX-Code

 
\layout LyX-Code

NAME
\layout LyX-Code

    math
\layout LyX-Code

 
\layout LyX-Code

FILE
\layout LyX-Code

    /usr/local/lib/python2.3/lib-dynload/math.so
\layout LyX-Code

 
\layout LyX-Code

DESCRIPTION
\layout LyX-Code

    This module is always available.
  It provides access to the
\layout LyX-Code

    mathematical functions defined by the C standard.
\layout LyX-Code

 
\layout LyX-Code

FUNCTIONS
\layout LyX-Code

    acos(...)
\layout LyX-Code

        acos(x)
\layout LyX-Code

         
\layout LyX-Code

        Return the arc cosine (measured in radians) of x.
\layout LyX-Code

     
\layout LyX-Code

    asin(...)
\layout LyX-Code

        asin(x)
\layout LyX-Code

         
\layout LyX-Code

        Return the arc sine (measured in radians) of x.
\layout LyX-Code

     
\layout Standard

And much more which is snipped.
 Likewise, we can get information on the complex object in the same way
\layout LyX-Code

>>> x = complex(0,1)
\layout LyX-Code

>>> dir(x)
\layout LyX-Code

['__abs__', '__add__', '__class__', '__coerce__', '__delattr__', '__div__',
 '__divmod__', '__doc__', '__eq__', '__float__', '__floordiv__', '__ge__',
 '__getattribute__', '__getnewargs__', '__gt__', '__hash__', '__init__',
 '__int__', '__le__', '__long__', '__lt__', '__mod__', '__mul__', '__ne__',
 '__neg__', '__new__', '__nonzero__', '__pos__', '__pow__', '__radd__',
 '__rdiv__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloor
div__', '__rmod__', '__rmul__', '__rpow__', '__rsub__', '__rtruediv__',
 '__setattr__', '__str__', '__sub__', '__truediv__', 'conjugate', 'imag',
 'real']
\layout LyX-Code

\layout Standard

Notice that we called 
\family typewriter 
dir
\family default 
 or 
\family typewriter 
help
\family default 
 on the 
\family typewriter 
math
\family default 
 
\shape italic 
module
\shape default 
, the 
\family typewriter 
math.sin
\family default 
 
\shape italic 
function
\shape default 
, and the 
\family typewriter 
complex
\family default 
 
\shape italic 
number
\shape default 
 
\family typewriter 
x
\family default 
.
 That's because modules, functions and numbers are all 
\shape italic 
objects
\shape default 
, and we use the same object introspection and help capabilities on them.
 We can find out what type of object they are by calling 
\family typewriter 
type
\family default 
 on them, which is another function in python's introspection arsenal
\layout LyX-Code

>>> type(math)
\layout LyX-Code

<type 'module'>
\layout LyX-Code

>>> type(math.sin)
\layout LyX-Code

<type 'builtin_function_or_method'>
\layout LyX-Code

>>> type(x)
\layout LyX-Code

<type 'complex'>
\layout LyX-Code

\layout Standard

Now, you may be wondering: what were all those god-awful looking double
 underscore methods, like 
\family typewriter 
__abs__ 
\family default 
and 
\family typewriter 
__mul__
\family default 
 in the 
\family typewriter 
dir
\family default 
 listing of the complex object above? These are methods that define what
 it means to be a numeric type in python, and the complex object implements
 these methods so that complex numbers act the way they should, eg 
\family typewriter 
__mul__
\family default 
 implements the rules of complex multiplication.
 The nice thing about this is that python specifies an application programming
 interface (API) that is the definition of what it means to be a number
 in python.
 And this means you can define your own numeric types, as long as you implement
 the required special double underscore methods for your custom type.
 Double underscore methods are very important in python; although the typical
 newbie never sees them or thinks about them, they are there under the hood
 providing all the python magic, and more importantly, showing the way to
 let you make magic.
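\layout Standard

To make this concrete, here is a minimal sketch of a custom type that implements
 a few of these double underscore methods.
 The class name Vec2 and its attributes are hypothetical, invented only for
 this illustration, and the class syntax itself is a preview of material
 not covered yet.
 A real numeric type would implement many more of the methods listed by
 dir above.

```python
class Vec2(object):
    """A hypothetical 2D vector type, used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # python calls this for: self + other
        return Vec2(self.x + other.x, self.y + other.y)

    def __mul__(self, scalar):
        # python calls this for: self * scalar
        return Vec2(self.x * scalar, self.y * scalar)

    def __abs__(self):
        # python calls this for: abs(self)
        return (self.x ** 2 + self.y ** 2) ** 0.5

    def __repr__(self):
        # python calls this to display the object at the prompt
        return 'Vec2(%r, %r)' % (self.x, self.y)


v = Vec2(1, 1) + Vec2(2, 3)
w = Vec2(1, 2) * 3
print(v)
print(w)
print(abs(v))
```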
\layout Section

Strings
\layout Standard

We've encountered a number of types of objects above: int, float, long,
 complex, method/function and module.
 We'll continue our tour with an introduction to strings, which are critical
 components of almost every program.
 You can create strings in a number of different ways, with single quotes,
 double quotes, or triple quotes -- this diversity of methods makes it easy
 if you need to embed quote characters in the string itself
\layout LyX-Code


\color blue
# single, double and triple quoted strings
\layout LyX-Code

>>> s = 'Hi Mom!'
\layout LyX-Code

>>> s = "Hi Mom!"
\layout LyX-Code

>>> s = """Porky said, "That's all folks!" """
\layout Standard

You can add strings together to concatenate them
\layout LyX-Code


\color blue
# concatenating strings
\layout LyX-Code

>>> first = 'John'
\layout LyX-Code

>>> last = 'Hunter'
\layout LyX-Code

>>> first+last
\layout LyX-Code

'JohnHunter'
\layout Standard

or call string methods to process them: upcase them or downcase them, or
 replace one character with another
\layout LyX-Code


\color blue
# string methods
\layout LyX-Code

>>> last.lower()
\layout LyX-Code

'hunter'
\layout LyX-Code

>>> last.upper()
\layout LyX-Code

'HUNTER'
\layout LyX-Code

>>> last.replace('h', 'p')
\layout LyX-Code

'Hunter'
\layout LyX-Code

>>> last.replace('H', 'P')
\layout LyX-Code

'Punter' 
\layout Standard

Note that in all of these examples, the string 
\family typewriter 
last
\family default 
 is unchanged.
 All of these methods operate on the string and return a new string, leaving
 the original unchanged.
 In fact, python strings cannot be changed by any python code at all: they
 are 
\shape italic 
immutable
\shape default 
 (unchangeable).
 The concept of mutable and immutable objects in python is an important
 one, and it will come up again, because only immutable objects can be used
 as keys in python dictionaries and elements of python sets.
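\layout Standard

A small sketch of what immutability means in practice.
 The try/except construct used here to catch the error has not been introduced
 yet, so treat it as a preview; the variable names are only for illustration.

```python
last = 'Hunter'

# string methods return new strings; the original is untouched
shouted = last.upper()

# trying to change a character in place raises a TypeError
try:
    last[0] = 'P'
    changed = True
except TypeError:
    changed = False

# to get a modified string, bind a name to the new string instead
punter = last.replace('H', 'P')

print(last)
print(shouted)
print(changed)
print(punter)
```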
\layout Standard

You can access individual characters, or slices of the string (substrings),
 using indexing.
 A string is a sequence of characters, and strings implement the sequence
 protocol in python -- we'll see more examples of python sequences later
 -- and all sequences have the same syntax for accessing their elements.
 Python uses 0 based indexing which means the first element is at index
 0; you can use negative indices to access the last elements in the sequence
\layout LyX-Code


\color blue
# string indexing
\layout LyX-Code

>>> last = 'Hunter'
\layout LyX-Code

>>> last[0]
\layout LyX-Code

'H'
\layout LyX-Code

>>> last[1]
\layout LyX-Code

'u'
\layout LyX-Code

>>> last[-1] 
\layout LyX-Code

'r' 
\layout Standard

To access substrings, or generically in terms of the sequence protocol,
 slices, you use a colon to indicate a range
\layout LyX-Code


\color blue
# string slicing
\layout LyX-Code

>>> last[0:2]
\layout LyX-Code

'Hu'
\layout LyX-Code

>>> last[2:4]
\layout LyX-Code

'nt'
\layout Standard

As this example shows, python uses 
\begin_inset Quotes eld
\end_inset 

one-past-the-end
\begin_inset Quotes erd
\end_inset 

 indexing when defining a range; eg, in the range 
\family typewriter 
indmin:indmax
\family default 
, the element of 
\family typewriter 
indmax
\family default 
 is not included.
 You can use negative indices when slicing too; eg, to get everything before
 the last character
\layout LyX-Code

>>> last[0:-1]
\layout LyX-Code

'Hunte'
\layout Standard

You can also leave out either the min or max indicator; if they are left
 out, 0 is assumed to be the 
\family typewriter 
indmin
\family default 
 and one past the end of the sequence is assumed to be 
\family typewriter 
indmax
\layout LyX-Code

>>> last[:3]
\layout LyX-Code

'Hun'
\layout LyX-Code

>>> last[3:]
\layout LyX-Code

'ter'
\layout Standard

There is a third number that can be placed in a slice, a step, with syntax
 indmin:indmax:step; eg, a step of 2 will skip every second letter
\layout LyX-Code

>>> last[1:6:2]
\layout LyX-Code

'utr'
\layout Standard

Although this may be more than you want to know about slicing strings, the
 time spent here is worthwhile.
 As mentioned above, all python sequences obey these rules.
 In addition to strings, lists and tuples, which are built-in python sequence
 data types and are discussed in the next section, the numeric arrays widely
 used in scientific computing also implement the sequence protocol, and
 thus have the same slicing rules.
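\layout Standard

As a quick illustration that the rules carry over, the sketch below applies
 the same index and slice expressions to a string, a list and a tuple holding
 the same letters.
 The variable names are made up for the example.

```python
word = 'Hunter'
items = ['H', 'u', 'n', 't', 'e', 'r']   # a list
fixed = ('H', 'u', 'n', 't', 'e', 'r')   # a tuple

# the same expressions work on all three sequence types;
# a slice always comes back as the type it was taken from
for seq in (word, items, fixed):
    print(seq[0:2])
    print(seq[-1])
    print(seq[1:6:2])
```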
\layout Exercise

What would you expect last[:] to return?
\layout Standard

One thing that comes up all the time is the need to create strings out of
 other strings and numbers, eg to create filenames from a combination of
 a base directory, some base filename, and some numbers.
 Scientists like to create lots of data files like
\layout LyX-Code

data/myexp01.dat
\layout LyX-Code

data/myexp02.dat
\layout LyX-Code

data/myexp03.dat
\layout LyX-Code

data/myexp04.dat
\layout Standard

and then write code to loop over these files and analyze them.
 We're going to show how to do that, starting with the newbie way and
 progressively building up to the way of the python zen master.
 All of the methods below 
\shape italic 
work
\shape default 
, but the zen master way will be more efficient, more scalable (eg to larger
 numbers of files) and cross-platform.
\begin_inset Foot
collapsed false

\layout Standard


\begin_inset Quotes eld
\end_inset 

But it works
\begin_inset Quotes erd
\end_inset 

 is a common defense of bad code; my rejoinder to this is 
\begin_inset Quotes eld
\end_inset 

A computer scientist is someone who fixes things that aren't broken
\begin_inset Quotes erd
\end_inset 

.
 
\end_inset 

 Here's the newbie way: we also introduce the for-loop here in the spirit
 of diving into python -- note that python uses whitespace indentation to
 delimit the for-loop code block
\layout LyX-Code


\color blue
# The newbie way
\layout LyX-Code

for i in (1,2,3,4):
\layout LyX-Code

    fname = 'data/myexp0' + str(i) + '.dat'
\layout LyX-Code

    print fname
\layout Standard

Now as promised, this will print out the 4 file names above, but it has
 three flaws: it doesn't scale to 10 or more files, it is inefficient, and
 it is not cross platform.
 It doesn't scale because it hard-codes the '
\family typewriter 
0
\family default 
' after 
\family typewriter 
myexp
\family default 
, it is inefficient because to add several strings requires the creation
 of temporary strings, and it is not cross-platform because it hard-codes
 the directory separator '/'.
\layout LyX-Code


\color blue
# On the path to enlightenment
\layout LyX-Code

for i in (1,2,3,4):
\layout LyX-Code

    fname = 'data/myexp%02d.dat'%i
\layout LyX-Code

    print fname
\layout Standard

This example uses string interpolation, the funny % thing.
 If you are familiar with C programming, this will be no surprise to you
 (on linux/unix systems do 
\family typewriter 
man sprintf 
\family default 
at the unix shell).
 The percent character is a string formatting character: 
\family typewriter 
%02d
\family default 
 means to take an integer (the 
\family typewriter 
d
\family default 
 part) and print it with two digits, padding zero on the left (the
\family typewriter 
 %02
\family default 
 part).
 There is more to be said about string interpolation, but let's finish the
 job at hand.
 This example is better than the newbie way because it scales up to files
 numbered 0-99, and it is more efficient because it avoids the creation
 of temporary strings.
 For the platform independent part, we go to the python standard library
 
\family typewriter 
os.path
\family default 
, which provides a host of functions for platform-independent manipulations
 of filenames, extensions and paths.
 Here we use 
\family typewriter 
os.path.join
\family default 
 to combine the directory with the filename in a platform independent way.
 On windows, it will use the windows path separator '
\backslash 
' and on unix it will use '/'.
\layout LyX-Code


\color blue
# the zen master approach
\layout LyX-Code

import os
\layout LyX-Code

for i in (1,2,3,4):
\layout LyX-Code

    fname = os.path.join('data', 'myexp%02d.dat'%i)
\layout LyX-Code

    print fname
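\layout Standard

The same module can also take a path apart again.
 The sketch below, which is only an illustration, builds the four file names
 with a loop over range and then uses 
\family typewriter 
os.path.split
\family default 
 and 
\family typewriter 
os.path.splitext
\family default 
 to recover the directory, base name and extension without assuming a particular
 separator.

```python
import os

# build the file names for experiments 1 through 4
names = []
for i in range(1, 5):
    names.append(os.path.join('data', 'myexp%02d.dat' % i))

# take the first path apart again, without assuming a separator
dirname, filename = os.path.split(names[0])
root, ext = os.path.splitext(filename)

print(names)
print(dirname)
print(root)
print(ext)
```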
\layout LyX-Code

\layout Standard

OK, I promised to torture you a bit more with string interpolation -- don't
 worry, I remembered.
 The ability to properly format your data when printing it is crucial in
 scientific endeavors: how many significant digits do you want, do you want
 to use integer, floating point representation or exponential notation?
 These three choices are provided with 
\family typewriter 
%d
\family default 
, 
\family typewriter 
%f
\family default 
 and 
\family typewriter 
%e
\family default 
, with lots of variations on the theme to indicate precision and more
\layout LyX-Code

>>> 'warm for %d minutes at %1.1f C' % (30, 37.5)
\layout LyX-Code

'warm for 30 minutes at 37.5 C'
\layout LyX-Code

\layout LyX-Code

>>> 'The mass of the sun is %1.4e kg'% (1.98892*10**30)
\layout LyX-Code

'The mass of the sun is 1.9889e+30 kg'
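\layout Standard

A few of the variations, as a minimal sketch.
 The values and variable names are arbitrary; the comment on each line gives
 the string that the format produces.

```python
value = 3.14159265

fixed = '%.2f' % value         # '3.14'      two digits after the point
padded = '%8.3f' % value       # '   3.142'  minimum width 8, right aligned
sci = '%.3e' % 12345.678       # '1.235e+04' exponential notation
zeros = '%05d' % 42            # '00042'     integer zero padded to width 5
mixed = '%s = %d' % ('n', 7)   # 'n = 7'     %s inserts a string

print(fixed)
print(padded)
print(sci)
print(zeros)
print(mixed)
```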
\layout LyX-Code

\layout Standard

There are two string methods, 
\family typewriter 
split
\family default 
 and 
\family typewriter 
join
\family default 
, that arise frequently in numeric processing, specifically in the context
 of processing data files that have comma, tab, or space separated numbers
 in them.
 
\family typewriter 
split
\family default 
 takes a single string, and splits it on the indicated character to a sequence
 of strings.
 This is useful to take a single line of space or comma separated values
 and split them into individual numbers
\layout LyX-Code


\color blue
# s is a single string and we split it into a list of strings
\layout LyX-Code


\color blue
# for further processing
\layout LyX-Code

>>> s = '1.0 2.0 3.0 4.0 5.0'
\layout LyX-Code

>>> s.split(' ')
\layout LyX-Code

['1.0', '2.0', '3.0', '4.0', '5.0']
\layout Standard

The return value, with square brackets, indicates that python has returned
 a list of strings.
 These individual strings need further processing to convert them into actual
 floats, but that is the first step.
 The conversion to floats will be discussed in the next section, when we
 learn about list comprehensions.
 The converse method is join, which is often used to create string output
 to an ASCII file from a list of numbers.
 In this case you want to join a list of numbers into a single line for
 printing to a file.
 The example below will be clearer after the next section, in which lists
 are discussed
\layout LyX-Code


\color blue
# vals is a list of floats and we convert it to a single
\layout LyX-Code


\color blue
# space separated string
\layout LyX-Code

>>> vals = [1.0, 2.0, 3.0, 4.0, 5.0]
\layout LyX-Code

>>> ' '.join([str(val) for val in vals])
\layout LyX-Code

'1.0 2.0 3.0 4.0 5.0'
\layout Standard

There are two new things in the example above.
 One, we called the join method directly on a string itself, and not on
 a variable name.
 Eg, in the previous examples, we always used the name of the object when
 accessing attributes, eg 
\family typewriter 
x.real
\family default 
 or 
\family typewriter 
s.upper()
\family default 
.
 In this example, we call the 
\family typewriter 
join
\family default 
 method on the string which is a single space.
 The second new feature is that we use a list comprehension 
\family typewriter 
[str(val) for val in vals] 
\family default 
as the argument to 
\family typewriter 
join
\family default 
.
 
\family typewriter 
join
\family default 
 requires a sequence of strings, and the list comprehension converts a list
 of floats to a list of strings.
 This can be confusing at first, so don't despair if it is.
 But it is worth bringing up early because list comprehensions are a very
 useful feature of python.
 To help elucidate, compare 
\family typewriter 
vals
\family default 
, which is a list of floats, with the conversion of 
\family typewriter 
vals
\family default 
 to a list of strings using list comprehensions in the next line
\layout LyX-Code


\color blue
# converting a list of floats to a list of strings
\layout LyX-Code

>>> vals
\layout LyX-Code

[1.0, 2.0, 3.0, 4.0, 5.0]
\layout LyX-Code

>>> [str(val) for val in vals] 
\layout LyX-Code

['1.0', '2.0', '3.0', '4.0', '5.0']
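\layout Standard

Putting split and join together, here is a sketch of the full round trip:
 read a line of numbers, convert the pieces to floats, do some arithmetic,
 and write the result back out as a single comma separated line.
 The variable names are made up for the example, and split is called here
 with no argument, which splits on any run of whitespace.

```python
line = '1.0 2.0 3.0 4.0 5.0'

# split into strings, then convert each string to a float
vals = [float(word) for word in line.split()]

total = sum(vals)

# scale the values and join them into one comma separated string
scaled = [2 * val for val in vals]
out = ','.join(['%.1f' % val for val in scaled])

print(vals)
print(total)
print(out)
```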
\layout Section

The basic python data structures
\layout Standard

Strings, covered in the last section, are sequences of characters.
 python has two additional built-in sequence types which can hold arbitrary
 elements: tuples and lists.
 tuples are created using parentheses, and lists are created using square
 brackets
\layout LyX-Code


\color blue
# a tuple and a list of elements of the same type
\layout LyX-Code


\color blue
# (homogeneous)
\layout LyX-Code

>>> t = (1,2,3,4)  # tuple
\layout LyX-Code

>>> l = [1,2,3,4]  # list
\layout Standard

Both tuples and lists can also be used to hold elements of different types
\layout LyX-Code


\color blue
# a tuple and list of int, string, float
\layout LyX-Code

>>> t = (1,'john', 3.0)
\layout LyX-Code

>>> l = [1,'john', 3.0]
\layout Standard

Tuples and lists have the same indexing and slicing rules as each other,
 and as the strings discussed above, because all three implement the python
 sequence protocol, with the only difference being that tuple slices return
 tuples (indicated by the parentheses below) and list slices return lists
 (indicated by the square brackets)
\layout LyX-Code

# indexing and slicing tuples and lists
\layout LyX-Code

>>> t[0]
\layout LyX-Code

1
\layout LyX-Code

>>> l[0]
\layout LyX-Code

1
\layout LyX-Code

>>> t[:-1]
\layout LyX-Code

(1, 'john')
\layout LyX-Code

>>> l[:-1]
\layout LyX-Code

[1, 'john']
\layout Standard

So why the difference between tuples and lists? A number of explanations
 have been offered on the mailing lists, but the only one that makes a differenc
e to me is that tuples are immutable, like strings, and hence can be used
 as keys to python dictionaries and included as elements of sets, and lists
 are mutable, and cannot.
 So a tuple, once created, can never be changed, but a list can.
 For example, if we try to reassign the first element of the tuple above,
 we get an error
\layout LyX-Code

>>> t[0] = 'why not?'
\layout LyX-Code

Traceback (most recent call last):
\layout LyX-Code

 File "<stdin>", line 1, in ?
\layout LyX-Code

TypeError: object doesn't support item assignment
\layout Standard

But the same operation is perfectly acceptable for lists
\layout LyX-Code

>>> l[0] = 'why not?'
\layout LyX-Code

>>> l
\layout LyX-Code

['why not?', 'john', 3.0]
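\layout Standard

The point about dictionary keys and set elements can be checked directly.
 In this sketch the dictionary name location and its latitude/longitude key
 are made up for illustration.
 Note that a tuple is only usable as a key if every element it contains
 is itself immutable.

```python
# a tuple is hashable, so it can be a dictionary key or a set element
location = {}
location[(40.7, -74.0)] = 'new york'   # hypothetical (lat, lon) key
seen = set([(1, 2), (3, 4)])

# a list is mutable and therefore unhashable; using one as a key fails
try:
    location[[40.7, -74.0]] = 'new york'
    list_key_ok = True
except TypeError:
    list_key_ok = False
```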
\layout Standard

Lists also have a lot of methods; tuples have none, save the special double
 underscore methods that are required for python objects and sequences
\layout LyX-Code


\color blue
# tuples contain only 
\begin_inset Quotes eld
\end_inset 

hidden
\begin_inset Quotes erd
\end_inset 

 double underscore methods
\layout LyX-Code

>>> dir(t)
\layout LyX-Code

['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__',
 '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__',
 '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__',
 '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
 '__rmul__', '__setattr__', '__str__']
\layout LyX-Code

\layout LyX-Code


\color blue
# but lists contain other methods, eg append, extend and
\layout LyX-Code


\color blue
# reverse
\layout LyX-Code

>>> dir(l)
\layout LyX-Code

['__add__', '__class__', '__contains__', '__delattr__', '__delitem__',
 '__delslice__', '__doc__', '__eq__', '__ge__', '__getattribute__',
 '__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__',
 '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__',
 '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__',
 '__setattr__', '__setitem__', '__setslice__', '__str__', 'append', 'count',
 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
\layout Standard

Many of these list methods change, or mutate, the list, eg append adds an
 element to the list, 
\family typewriter 
extend
\family default 
 extends the list with a sequence of elements, 
\family typewriter 
sort
\family default 
 sorts the list in place, 
\family typewriter 
reverse
\family default 
 reverses it in place, 
\family typewriter 
pop
\family default 
 takes an element off the list and returns it.
\layout Standard

We've seen a couple of examples of creating a list above -- let's look at
 some more using list methods
\layout LyX-Code

>>> x = []                   
\color blue
# create the empty list
\layout LyX-Code

>>> x.append(1)              
\color blue
# add the integer one to it
\layout LyX-Code

>>> x.extend(['hi', 'mom'])  
\color blue
# append two strings to it
\layout LyX-Code

>>> x
\layout LyX-Code

[1, 'hi', 'mom']
\layout LyX-Code

>>> x.reverse()              
\color blue
# reverse the list, in place
\layout LyX-Code

>>> x
\layout LyX-Code

['mom', 'hi', 1]
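\layout Standard

The sort and pop methods mentioned above mutate the list in the same way.
 A short sketch, using a made-up list of integers, with after_sort kept only
 so the intermediate state can be inspected:

```python
x = [3, 1, 2]
x.sort()              # sorts in place and returns None
after_sort = list(x)  # copy of the current state: [1, 2, 3]
last = x.pop()        # removes and returns the last element
first = x.pop(0)      # an index may be given to pop from elsewhere
```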
\layout Standard

We mentioned list comprehensions in the last section when discussing string
 methods.
  List comprehensions are a way of creating a list using a for loop in a
 single line of python.
 Let's create a list of the cubes of the integers from 1 to 9, first with
 a for loop and then with a list comprehension.
 The list comprehension code will not only be shorter and more elegant,
 it is usually faster as well (the dots are the indentation block indicator
 from the python shell and should not be typed)
\layout LyX-Code


\color blue
# a list of perfect cubes using a for-loop
\layout LyX-Code

>>> cubes = []
\layout LyX-Code

>>> for i in range(1,10):
\layout LyX-Code

...
     cubes.append(i**3)
\layout LyX-Code

...
 
\layout LyX-Code

>>> cubes
\layout LyX-Code

[1, 8, 27, 64, 125, 216, 343, 512, 729]
\layout LyX-Code

\layout LyX-Code


\color blue
# functionally equivalent code using list comprehensions
\layout LyX-Code

>>> cubes = [i**3 for i in range(1,10)]
\layout LyX-Code

>>> cubes
\layout LyX-Code

[1, 8, 27, 64, 125, 216, 343, 512, 729]
\layout Standard

The list comprehension code is faster because it does less work per element.
 In the for-loop version, the interpreter has to look up the 
\family typewriter 
append
\family default 
 method on the list and call it for each element of the loop, in addition
 to computing the cube of 
\family typewriter 
i
\family default 
.
 In the list comprehension, the cube is still computed by the interpreter
 for each element, but the result is added to the new list without a method
 lookup and call.
 The difference in speed is often measurable, though its size depends on
 the python version and on how much work is done per element, and the list
 comprehension example is shorter and more elegant to boot.
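\layout Standard

One way to check the speed claim on your own machine is the standard timeit
 module.
 The sketch below is only an illustration: the function names cubes_loop
 and cubes_comp, the list size and the repeat count are arbitrary choices,
 and the timings will vary with the machine and the python version.

```python
import timeit

def cubes_loop(n):
    'build the cubes of 1..n-1 with an explicit for-loop'
    cubes = []
    for i in range(1, n):
        cubes.append(i**3)
    return cubes

def cubes_comp(n):
    'build the same list with a list comprehension'
    return [i**3 for i in range(1, n)]

# time 200 calls of each version; only the relative sizes are of interest
t_loop = timeit.timeit(lambda: cubes_loop(1000), number=200)
t_comp = timeit.timeit(lambda: cubes_comp(1000), number=200)
print('for-loop: %.4f s   comprehension: %.4f s' % (t_loop, t_comp))
```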
\layout Standard

The remaining essential built-in data structure in python is the dictionary,
 which is an associative array that maps arbitrary immutable objects to
 arbitrary objects.
 int, long, float, string and tuple are all immutable and can be used as
 keys to a dictionary; list and dict are mutable and cannot.
 A dictionary takes one kind of object as the key, and this key points to
 another object which is the value.
 In a contrived but easy to comprehend example, one might map names to
 ages
\layout LyX-Code

>>> ages = {}            
\color blue
# create an empty dict
\layout LyX-Code

>>> ages['john'] = 36
\layout LyX-Code

>>> ages['fernando'] = 33
\layout LyX-Code

>>> ages                 
\color blue
# view the whole dict
\layout LyX-Code

{'john': 36, 'fernando': 33}
\layout LyX-Code

>>> ages['john']
\layout LyX-Code

36
\layout LyX-Code

>>> ages['john'] = 37    
\color blue
# reassign john's age
\layout LyX-Code

>>> ages['john']
\layout LyX-Code

37
\layout Standard

Dictionary lookup is very fast; Tim Peters once joked that any python program
 which uses a dictionary is automatically 10 times faster than any C program,
 which is of course false, but makes two worthy points in jest: dictionary
 lookup is fast, and dictionaries can be used for important optimizations,
 eg, creating a cache of frequently used values.
 As a simple example, suppose you needed to compute the product of two
 non-negative integers below 100 in an inner loop -- you could use a dictionary
 to cache the product of every such pair, storing each pair once with the
 smaller number first.
\layout LyX-Code

\layout LyX-Code

>>> prod = dict([ ( (i,j), i*j ) for i in range(100) 
\layout LyX-Code

                                 for j in range(i,100)] )
\layout LyX-Code

>>> prod[(8,12)]
\layout LyX-Code

96
\layout Standard

The last example is syntactically a bit challenging, but bears careful study.
  We are initializing a dictionary with a list comprehension.
  The list comprehension is made up of length 2 tuples 
\family typewriter 
( (i,j), i*j )
\family default 
.
  When a dictionary is initialized with a sequence of length 2 tuples, it
 assumes the first element of the tuple
\family typewriter 
 (i,j)
\family default 
 is the key and the second element i*j is the value.
  Thus we have a lookup table from pairs of numbers 
\family typewriter 
i,j
\family default 
 to their product.
  Creating dictionaries from list comprehensions as in this example is something
 that hard-core python programmers do almost every day, and you should too.
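\layout Standard

Because the inner loop starts at i, the table only holds pairs whose first
 number is not larger than the second, so a lookup such as prod[(12,8)] would
 raise a KeyError.
 One possible way around this, sketched here with a hypothetical helper
 called cached_product, is to order the pair before looking it up.

```python
# same cache as above: only pairs with i <= j are stored
prod = dict([((i, j), i*j) for i in range(100)
                           for j in range(i, 100)])

def cached_product(a, b):
    'look up a*b in prod, ordering the pair so the smaller comes first'
    if a > b:
        a, b = b, a
    return prod[(a, b)]
```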
\layout Exercise

Create a dictionary mapping integers from 0-1000 to their cube using list
 comprehensions.
\layout Section

The Zen of Python
\layout Exercise


\family typewriter 
>>> import this
\layout Section

Functions and classes
\layout Standard

You can define functions just about anywhere in python code.
 The typical function definition takes zero or more arguments, zero or more
 keyword arguments, and is followed by a documentation string and the function
 definition, optionally returning a value.
 Here is a function to compute the hypotenuse of a right triangle
\layout LyX-Code

def hypot(base, height):
\layout LyX-Code

   'compute the hypotenuse of a right triangle'
\layout LyX-Code

   import math
\layout LyX-Code

   return math.sqrt(base**2 + height**2)
\layout Standard

As in the case of the for-loop, leading white space is significant and is
 used to delimit the start and end of the function.
 In the example below, x = 1 is not in the function, because it is not indented
\layout LyX-Code

def growone(l):
\layout LyX-Code

   'append 1 to a list l'
\layout LyX-Code

   l.append(1)
\layout LyX-Code

x = 1
\layout Standard

Note that this function does not return anything, because the append method
 modifies the list that was passed in.
 python is pretty flexible with functions: you can define functions within
 function definitions (just be mindful of your indentation), you can attach
 attributes to functions (like other objects), you can pass functions as
 arguments to other functions.
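\layout Standard

The three kinds of flexibility just listed can be seen together in a small
 sketch.
 The names make_scaler, apply_to_all and the units attribute are invented
 for this illustration and are not part of any library.

```python
def make_scaler(factor):
    'return a new function that multiplies its argument by factor'
    def scale(x):            # a function defined inside a function
        return factor * x
    return scale

def apply_to_all(func, seq):
    'call func on each element of seq and return the results as a list'
    return [func(val) for val in seq]

double = make_scaler(2)
double.units = 'volts'       # attach an arbitrary attribute to a function
doubled = apply_to_all(double, [1, 2, 3])   # pass a function as an argument
```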
\layout Standard

A function keyword argument defines a default value for a function that
 can be overridden.
 Here is an example which provides a normalize keyword argument.
 The default argument is 
\family typewriter 
normalize=None
\family default 
; the value None is a standard python idiom which usually means either do
 the default thing or do nothing.
 If 
\family typewriter 
normalize
\family default 
 is not 
\family typewriter 
None
\family default 
, we assume it is a function that can be called to normalize our data
\layout LyX-Code

def psd(x, normalize=None):
\layout LyX-Code

    'compute the power spectral density of x'
\layout LyX-Code

    if normalize is not None: x = normalize(x)
\layout LyX-Code

   
\color blue
 # compute the power spectra of x and return it
\layout Standard

This function could be called with or without a 
\family typewriter 
normalize
\family default 
 keyword argument, since if the argument is not passed, the default of
 
\family typewriter 
None
\family default 
 is assumed
\layout LyX-Code

\layout LyX-Code


\color blue
# no normalize argument, so do the default thing
\layout LyX-Code

>>> psd(x)   
\layout LyX-Code

\layout LyX-Code


\color blue
# define a custom normalize function and pass it to psd
\layout LyX-Code

>>> def unitstd(x): return x/std(x)
\layout LyX-Code

>>> psd(x, normalize=unitstd)
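\layout Standard

For readers who want something they can run, here is a self-contained sketch
 of the same keyword argument pattern using only the standard library.
 It is a stand-in, not the psd routine outlined above: naive_psd computes
 a simple periodogram with a direct, slow discrete Fourier transform, the
 scaling by the number of samples is one convention among several, and this
 unitstd works on plain lists using the population standard deviation.

```python
import cmath
import math

def unitstd(x):
    'scale the list x to unit (population) standard deviation'
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    return [v / std for v in x]

def naive_psd(x, normalize=None):
    'periodogram of x from a direct O(N**2) DFT; illustrative only'
    if normalize is not None:
        x = normalize(x)
    N = len(x)
    power = []
    for k in range(N // 2 + 1):
        # k-th DFT coefficient of x
        Xk = sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                 for n in range(N))
        power.append(abs(Xk) ** 2 / N)
    return power

x = [2.0, -2.0, 2.0, -2.0]
raw = naive_psd(x)                        # default: no normalization
scaled = naive_psd(x, normalize=unitstd)  # normalize before computing
```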
\the_end