Introduction To Tools - Python: 1 Assignment
1 Assignment
1. Revise your Python skills, especially scientific python (numpy, scipy & pylab)
in C that will perform an nth order interpolation at the point x. Use n = 8 and generate the interpolated
values.
3. Run the program and evaluate the function on 100 uniformly spaced values between 0 and 2π. Write
out the answers to a text file output.txt with each line containing
x y
    import os
    os.system("...")
to run the C executable. It should then use loadtxt to read in output.txt and plot both the interpolated
values as well as the exact function sin(x) in a plot. Another plot should plot the error between the
interpolated and exact function vs x.
5. Repeat the above with f(x) = sin(x) + n, where n is normally distributed noise with variance σ².
Use randn(N)*sigma to generate N such random numbers. Compare the quality of the interpolation
as the amount of noise increases. Can you explain what you see?
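The effect of noise can be previewed in pure Python before wiring up the C program. The sketch below is a stand-in, not the C routine the assignment asks for: it uses Neville's algorithm on the n + 1 table points nearest to x, and the table size of 30 samples, the function names and the seed are assumptions made for illustration. Here random.gauss plays the role of randn(N)*sigma.

```python
import bisect
import math
import random


def interpolate(xs, ys, x, n=8):
    """nth-order polynomial interpolation at x (Neville's algorithm)."""
    # Pick n+1 consecutive table points roughly centred on x,
    # clamped so the window stays inside the table.
    j = bisect.bisect_left(xs, x)
    start = min(max(j - (n + 1) // 2, 0), len(xs) - (n + 1))
    px = xs[start:start + n + 1]
    p = list(ys[start:start + n + 1])
    for level in range(1, n + 1):
        for i in range(n + 1 - level):
            p[i] = ((x - px[i + level]) * p[i]
                    - (x - px[i]) * p[i + 1]) / (px[i] - px[i + level])
    return p[0]


def max_error(sigma, table_size=30, n=8, seed=0):
    """Largest |interpolated - sin(x)| over 100 points in [0, 2*pi]."""
    rng = random.Random(seed)
    # Assumed table: table_size uniformly spaced samples of sin(x) + noise.
    xs = [2 * math.pi * k / (table_size - 1) for k in range(table_size)]
    ys = [math.sin(x) + rng.gauss(0.0, sigma) for x in xs]
    pts = [2 * math.pi * k / 99 for k in range(100)]
    return max(abs(interpolate(xs, ys, x, n) - math.sin(x)) for x in pts)


if __name__ == "__main__":
    for sigma in (0.0, 0.01, 0.1):
        print("sigma=%g  max error=%g" % (sigma, max_error(sigma)))
```

Because the interpolating polynomial passes exactly through every noisy sample, the noise is reproduced rather than averaged out, which is one thing to look for when comparing the plots.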
3 Cython
The other way of embedding C in python is to use cython. This is not exactly C embedded in python. Rather,
it is a python-to-C translator. The translated code can be compiled and turned into a module for python.
To start off, go to a terminal and try to run cython. You should get a help page. If not, you need to install
cython.
Once you have cython, enter the following program:
    import numpy as np
    cimport numpy as np
    # cimport adds C calls for numpy so that the interface bypasses
    # python when possible

    cdef extern from "<math.h>":
        cdef double cos(double x)
    # This block tells cython that the cos function should be from
    # math.h library and not from python.

    DTYPE = np.double
    ctypedef np.double_t DTYPE_t
    # These lines define a new type that corresponds to numpy arrays, so that
    # array elements are treated as C doubles rather than python objects.

    # fouriercy is defined as function for both C and Python (cpdef does this)
    cpdef fouriercy(int N, np.ndarray[DTYPE_t, ndim=1] c, np.ndarray[DTYPE_t, ndim=1] x):
        # define locals (the loop indices are typed so the loops run as C loops)
        cdef int i, k
        cdef int m = x.shape[0]
        cdef double xx, zz
        # Next line very important.
        # Return array type and dim declared to avoid python object overhead
        cdef np.ndarray[DTYPE_t, ndim=1] z = np.zeros(m, dtype=DTYPE)
        for i in range(m):
            xx = x[i]
            zz = 0.0
            for k in range(N + 1):
                zz += c[k] * cos(k * xx)
            z[i] = zz
        return z
This is stored to a file, say fouriercy.pyx. The ’x’ indicates it is a cython file. We can compile it and ask for
an analysis of its translation quality:
cython -a fouriercy.pyx
firefox fouriercy.html
This will pull up an HTML file that shows how well cython was able to translate the file: lines that still
call into the python interpreter are highlighted in yellow. But the proof of the pudding is in the eating.
To run the code, enter the following lines at the end of your python file:
import pyximport
pyximport.install()
This lets python detect any cython file on import and automatically compile it into a module. Now
import our file:
import fouriercy
We can run fouriercy.fouriercy(N,c,x) and test its speed as we did the weave version.
    import time

    t1 = time.time()
    for i in range(M):
        zz = fouriercy.fouriercy(N, c, x)
    t2 = time.time()
    f3 = (t2 - t1) * 1000.0 / M  # average time per call in msec
    print("Time for fourier in cython=%f msec" % f3)
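For a baseline to compare the cython timing against, the same double loop can be written in plain Python. This is only a sketch: the names fourier_py and time_per_call_msec, and the values chosen for N, M and the array sizes, are assumptions made for illustration and are not part of the original program.

```python
import math
import time


def fourier_py(N, c, x):
    """Pure-Python version of fouriercy: z[i] = sum_k c[k]*cos(k*x[i])."""
    z = []
    for xx in x:
        zz = 0.0
        for k in range(N + 1):
            zz += c[k] * math.cos(k * xx)
        z.append(zz)
    return z


def time_per_call_msec(func, M, *args):
    """Average wall-clock time of func(*args) over M calls, in msec."""
    t1 = time.perf_counter()
    for _ in range(M):
        func(*args)
    t2 = time.perf_counter()
    return (t2 - t1) * 1000.0 / M


if __name__ == "__main__":
    # Illustrative sizes only; use the same N, c and x as the cython test.
    N = 10
    c = [1.0 / (1 + k * k) for k in range(N + 1)]
    x = [2 * math.pi * i / 999 for i in range(1000)]
    f0 = time_per_call_msec(fourier_py, 20, N, c, x)
    print("Time for fourier in pure python=%f msec" % f0)
```

Running both versions on the same N, c and x keeps the comparison fair, since the cost grows with both the number of coefficients and the number of points.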
4 Debugging Python
Python has a built in debugger. You can take advantage of it in several ways.
• You can edit the python source file in the IDLE editor. The editor starts the python window as well.
In that window, click on Debug->Debugger. The debugger is very slow if you turn on local or global
display, because it has to evaluate and display all the variables at each step. So turn on the
source display and turn off the variables. Then go to the IDLE edit window and press F5, which runs
the module. You are now in the debugger. Of course, this does not allow you to debug the C code, only
the python code.
• You can debug the python file in ipython. To do this, start ipython as
ipython -pylab
where abc.py is the source file and arguments to the source file are entered after the source file name.
To debug the source file, use
b 18,i==j
Suppose we want to stop in a loop at line 21 whenever i = j − 1 and print out the value of a variable,
p1.
b 21,i==j-1
commands
silent
p p1
end
This only stops at line 21 when the condition is satisfied. It does not print the usual information about
the break point number etc, but prints the value of p1.
• My preferred way to debug is to use emacs. Emacs has a unified debugging interface for C, C++,
Fortran, perl and python. To start the python debugger, you need to set up things in emacs.
– First load a python file in emacs and make sure that your emacs version understands python
syntax (the window will say “Python” mode or “Python Abbrev Fill” mode or something like
that). If it does not, you have to get the corresponding lisp file to teach emacs the editing
specialities of python. Usually python support is built in.
– Emacs has a command called “pdb” which starts the python debugger on a source file.
That is it. You can now do source debugging of python code as you wish. I have not yet figured out
how to debug C under python as I do for C under Scilab. But I am sure it can be done.
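The conditional breakpoint b 21,i==j-1 described above can also be approximated from inside the source, which helps when line numbers keep shifting during editing. The following is a minimal sketch with a hypothetical function find_pairs, a hypothetical debug flag and a made-up variable p1; pdb.set_trace() is the standard-library call that drops into the debugger at that point.

```python
import pdb


def find_pairs(n, debug=False):
    """Collect the (i, j, p1) triples with i == j - 1 in an n-by-n loop."""
    pairs = []
    for i in range(n):
        for j in range(n):
            p1 = i * n + j  # stand-in for the variable being inspected
            if i == j - 1:
                if debug:
                    # Similar to a breakpoint with condition i==j-1:
                    # execution stops here and "p p1" prints the value.
                    pdb.set_trace()
                pairs.append((i, j, p1))
    return pairs


if __name__ == "__main__":
    print(find_pairs(3))
```

Calling find_pairs(3, debug=True) stops only on the iterations where the condition holds; with the default debug=False the function runs straight through.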
5 Profiling
While debugging finds the errors in your code, the more challenging problem is to identify either subtle
bugs or inefficient portions of your code. Debugging cannot easily find such parts of your code. What you
need is to profile your code.
Write the python code in this assignment to a file, week0.py. We can execute the python code as
python week0.py
This will run the code and print messages on the console. It tells you that the C code ran faster than the
python code.
Now we run the code under the profiler.
This runs the script week0.py under the profiler and writes the output to profile.log. Note that the first
few lines of the output will actually be the output of the script. Following that is the output of the profiler
itself. This is too detailed. Let us look at the first few lines of it:
The first four lines are program output. Following that, the profiler starts and gives statistics in a table. The
important number is the “cumtime”, which is the cumulative time spent in the function, including the time
spent in everything it calls.
Let us parse this profile and look for those functions that correspond to our code week0.py and are
ordered by cumulative time.
This extracts those lines containing the word week0 in the profile output. It then sorts them in numerically
ascending order (-n) and the sorting is done on field 4 (-k4,4). The result of this command is:
We can see that the least time was spent in routine fourc and the most time was in fourier. Of course, we
knew this already, but this is an invaluable tool for deciding which part of the code to optimize.
• Run the code under the profiler and reproduce the above on your machine.
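A similar report can be produced from inside Python with the standard cProfile and pstats modules. The sketch below is an illustration under assumptions: fourier and work are hypothetical stand-ins for the routines in week0.py, and profile_report is a made-up helper name. Note that pstats sorts by cumulative time in descending order, so the most expensive function comes first rather than last.

```python
import cProfile
import io
import math
import pstats


def fourier(N, c, x):
    """Hypothetical stand-in for the slow pure-Python routine."""
    return [sum(c[k] * math.cos(k * xx) for k in range(N + 1)) for xx in x]


def work():
    """Hypothetical stand-in for the body of week0.py."""
    N = 10
    c = [1.0] * (N + 1)
    x = [0.01 * i for i in range(500)]
    return fourier(N, c, x)


def profile_report(func, limit=10):
    """Run func under cProfile; return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # "cumulative" corresponds to the cumtime column of the table.
    stats.sort_stats("cumulative").print_stats(limit)
    return buf.getvalue()


if __name__ == "__main__":
    print(profile_report(work))
```

Keeping the report as a string makes it easy to filter for the lines that belong to your own source file, in the same spirit as searching the log for week0.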