Lab 1 - Introduction To Python
Date: 11-09-2024
Tools/Software Requirement
Python 3.11 or 3.10, Anaconda and Jupyter notebook (preferable)
Python Basics
The programming assignments in this course will be in Python, an interpreted, object-oriented
language that shares some features with both Java and Scheme. This tutorial will walk through
the primary syntactic constructions in Python, using short examples.
You may find the Troubleshooting section helpful if you run into problems. It contains a list of
the frequent problems previous students have encountered when following this tutorial.
Python can be run in one of two modes. It can either be used interactively, via an interpreter, or it
can be called from the command line to execute a script. We will first use the Python interpreter
interactively.
Operators
The Python interpreter can be used to evaluate expressions, for example simple arithmetic
expressions. If you enter such expressions at the prompt (>>>) they will be evaluated, and the
result will be returned on the next line.
>>> 1 + 1
2
>>> 2 * 3
6
Boolean operators also exist in Python to manipulate the primitive True and False values.
>>> 1==0
False
>>> not (1==0)
True
>>> (2==2) and (2==3)
False
>>> (2==2) or (2==3)
True
Like Java, Python has a built-in string type. The + operator is overloaded to do string
concatenation on string values.
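As a quick illustration of concatenation (the strings here are arbitrary examples, not taken from the original handout):

```python
# Concatenate strings with the overloaded + operator.
first = 'machine'
second = "learning"          # single and double quotes are interchangeable
combined = first + ' ' + second
print(combined)
```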
There are many built-in methods which allow you to manipulate strings.
>>> 'machine'.upper()
'MACHINE'
>>> 'HELP'.lower()
'help'
>>> len('Help')
4
Notice that we can use either single quotes ' ' or double quotes " " to surround a string. This
allows for easy nesting of strings.
In Python, you do not have to declare variables before you assign to them.
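A minimal sketch of assignment without declaration; the variable names and values are illustrative assumptions:

```python
# A name comes into existence at its first assignment; no type is declared.
s = 'hello world'
num = 8.0
num += 2.5                   # rebinding works the same way
quote = "it's easy"          # double quotes let a single quote nest inside
print(s, num, quote)
```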
Learn about the methods Python provides for strings. To see what methods Python provides for a
datatype, use the dir and help commands:
>>> s = 'abc'
>>> dir(s)
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__ge__',
'__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__gt__', '__hash__',
'__init__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__',
'__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__str__',
'capitalize', 'center', 'count', 'decode', 'encode', 'endswith',
'expandtabs', 'find', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower',
'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip',
'replace', 'rfind', 'rindex', 'rjust', 'rsplit', 'rstrip', 'split', 'splitlines',
'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
>>> help(s.find)
find(...)
    S.find(sub [,start [,end]]) -> int
    Return the lowest index in S where substring sub is found, such that sub is
    contained within S[start:end]. Optional arguments start and end are
    interpreted as in slice notation.
    Return -1 on failure.
>>> s.find('b')
1
Try out some of the string functions listed in dir (ignore those with underscores '_' around the
method name).
Python comes equipped with some useful built-in data structures, broadly like Java's
collections package.
Lists
Python also allows negative indexing from the back of the list. For instance, fruits[-1] will
access the last element 'banana':
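The definition of fruits is not shown in this handout. As a sketch, assume a four-element list whose last element is 'banana', as the text states; the other elements are hypothetical:

```python
# Hypothetical list; only the last element, 'banana', is given in the text.
fruits = ['apple', 'orange', 'pear', 'banana']
first = fruits[0]            # indexing starts at 0
last = fruits[-1]            # -1 counts from the back
second_last = fruits[-2]
print(first, last, second_last)
```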
We can also index multiple adjacent elements using the slice operator. For instance, fruits[1:3],
returns a list containing the elements at position 1 and 2. In general fruits[start:stop] will get the
elements in start, start+1, ..., stop-1. We can also do fruits[start:] which returns all elements starting
from the start index. Also fruits[:end] will return all elements before the element at position end:
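The slicing rules above can be sketched as follows, again assuming a hypothetical four-element fruits list:

```python
fruits = ['apple', 'orange', 'pear', 'banana']   # assumed contents
middle = fruits[1:3]         # elements at positions 1 and 2
tail = fruits[2:]            # position 2 through the end
head = fruits[:3]            # everything before position 3
print(middle, tail, head)
```

A slice always returns a new list, so fruits itself is left unchanged.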
The items stored in lists can be any Python data type. So, for instance we can have lists of lists:
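A small sketch of a list of lists; the name lstOfLsts and its contents are illustrative assumptions:

```python
lstOfLsts = [['a', 'b', 'c'], [1, 2, 3], ['one', 'two', 'three']]
inner = lstOfLsts[1]             # the second inner list
item = lstOfLsts[1][2]           # index the inner list in turn
popped = lstOfLsts[0].pop()      # removes and returns the last item of the first list
print(inner, item, popped, lstOfLsts[0])
```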
Exercise: Lists
Play with some of the list functions. You can find the methods you can call on an object via the
dir and get information about them via the help command:
>>> dir(list)
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__delslice__',
'__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', '__getslice__', '__gt__',
'__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__',
'__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__',
'__rmul__', '__setattr__', '__setitem__', '__setslice__', '__str__', 'append', 'count',
'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
>>> help(list.reverse)
Help on built-in function reverse:
reverse(...)
L.reverse() -- reverse *IN PLACE*
Note: Ignore functions with underscores "_" around the names; these are private helper methods.
Press 'q' to back out of a help screen.
Tuples
A data structure similar to the list is the tuple, which is like a list except that it is immutable once
it is created (i.e. you cannot change its content once created). Note that tuples are surrounded
with parentheses while lists have square brackets.
The attempt to modify an immutable structure raised an exception. Exceptions indicate errors:
index out of bounds errors, type errors, and so on will all report exceptions in this way.
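The tuple example this paragraph refers to is not shown in this handout. A minimal sketch, with an assumed tuple named pair, shows both unpacking and the exception raised on modification:

```python
pair = (3, 5)                # hypothetical tuple
x, y = pair                  # unpack into two names
try:
    pair[1] = 6              # tuples do not support item assignment
except TypeError as err:
    message = str(err)
    print('TypeError:', message)
```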
Sets
A set is another data structure that serves as an unordered list with no duplicate items. Below, we
show how to create a set, add things to the set, test if an item is in the set, and perform common
set operations (difference, intersection, union):
>>> shapes = ['circle', 'square', 'triangle', 'circle']
>>> setOfShapes = set(shapes)
>>> setOfShapes
{'circle', 'square', 'triangle'}
>>> setOfShapes.add('polygon')
>>> setOfShapes
{'circle', 'square', 'triangle', 'polygon'}
>>> 'circle' in setOfShapes
True
>>> 'rhombus' in setOfShapes
False
>>> favoriteShapes = ['circle', 'triangle', 'hexagon']
>>> setOfFavoriteShapes = set(favoriteShapes)
>>> setOfShapes - setOfFavoriteShapes
{'square', 'polygon'}
>>> setOfShapes & setOfFavoriteShapes
{'circle', 'triangle'}
>>> setOfShapes | setOfFavoriteShapes
{'circle', 'square', 'triangle', 'polygon', 'hexagon'}
Note that the objects in the set are unordered; you cannot assume that their traversal or
print order will be the same across machines!
Dictionaries
The last built-in data structure is the dictionary which stores a map from one type of object (the
key) to another (the value). The key must be an immutable type (string, number, or tuple). The
value can be any Python data type.
Note: In the example below, the keys are printed in the order in which they were first inserted.
A dictionary is a hash table (like HashMaps in Java), but since Python 3.7 it also preserves
insertion order. In older versions of Python the order depended on how exactly the hashing
algorithm mapped keys to buckets, and usually seemed arbitrary. Even so, your code should look
values up by key rather than rely on the position of a key in the dictionary.
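The definition of studentIds is not shown in this handout. The following sketch assumes starting values chosen to match the contents printed in the transcript:

```python
# Hypothetical starting values; only the final contents appear in the text.
studentIds = {'knuth': 42.0, 'turing': 56.0, 'nash': 92.0}
turing_id = studentIds['turing']             # look up a value by key
studentIds['nash'] = 'ninety-two'            # overwrite an existing value
studentIds['knuth'] = [42.0, 'forty-two']    # values may be of any type
has_godel = 'godel' in studentIds            # membership test on keys
print(studentIds)
```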
>>> list(studentIds.keys())
['knuth', 'turing', 'nash']
>>> list(studentIds.values())
[[42.0, 'forty-two'], 56.0, 'ninety-two']
>>> list(studentIds.items())
[('knuth', [42.0, 'forty-two']), ('turing', 56.0), ('nash', 'ninety-two')]
>>> len(studentIds)
3
Exercise: Dictionaries
Use dir and help to learn about the functions you can call on dictionaries.
Writing Scripts
Now that you've got a handle on using Python interactively, let's write a simple Python script
that demonstrates Python's for loop. Open the file called foreach.py and update it with the
following code:
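The contents of foreach.py are not shown in this handout. A minimal sketch of such a script could look like this; the fruit names and prices are illustrative assumptions:

```python
# foreach.py: sketch with assumed fruit names and prices
fruits = ['apples', 'oranges', 'pears', 'bananas']
for fruit in fruits:
    print(fruit + ' for sale')

fruitPrices = {'apples': 2.00, 'oranges': 1.50, 'pears': 1.75}
messages = []
for fruit, price in fruitPrices.items():     # loop over (key, value) pairs
    if price < 2.00:
        messages.append('%s cost %.2f a pound' % (fruit, price))
    else:
        messages.append(fruit + ' are too expensive!')

for message in messages:
    print(message)
```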
Run the above code; the output will look like this:
Note that in Python 3.7 and later, looping over a dictionary visits its keys in insertion order, so
the print statements listing the costs appear in the order in which the entries were added; older
versions of Python gave no such guarantee. To
learn more about control structures (e.g., if and else) in Python, check out the official Python
tutorial section on this topic.
If you like functional programming, you might also like map and filter:
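A short sketch of the two built-ins, using an assumed list of numbers. In Python 3 both return lazy iterators, so the results are wrapped in list() to display them:

```python
nums = [1, 2, 3, 4, 5, 6]
squares = list(map(lambda x: x * x, nums))       # apply a function to every item
bigNums = list(filter(lambda x: x > 3, nums))    # keep only items passing a test
print(squares)
print(bigNums)
```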
List comprehensions offer another compact way to build lists:
nums = [1, 2, 3, 4, 5, 6]
plusOneNums = [x + 1 for x in nums]
oddNums = [x for x in nums if x % 2 == 1]
print(oddNums)
oddNumsPlusOne = [x + 1 for x in nums if x % 2 == 1]
print(oddNumsPlusOne)
Write a list comprehension which, from a list, generates a lowercased version of each string that
has length greater than five. You can find the solution in listcomp2.py.
Beware of Indentation!
Unlike many other languages, Python uses the indentation in the source code for interpretation.
So for instance, for the following script:
if 0 == 1:
    print('We are in a world of arithmetic pain')
print('Thank you for playing')
will output Thank you for playing. But if we had written the script as
if 0 == 1:
    print('We are in a world of arithmetic pain')
    print('Thank you for playing')
there would be no output. The moral of the story: be careful how you indent! It's best to use four
spaces for indentation -- that's what the course code uses.
Tabs vs Spaces
Because Python uses indentation for code evaluation, it needs to keep track of the level of
indentation across code blocks. This means that if your Python file switches from using tabs as
indentation to spaces as indentation, the Python interpreter will not be able to resolve the
ambiguity of the indentation level and will raise an exception. Even though the code can be lined up
visually in your text editor, Python "sees" a change in indentation and will most likely raise an
exception (or, rarely, produce unexpected behavior).
This most commonly happens when opening up a Python file that uses an indentation scheme
different from the one your text editor uses (e.g., your text editor uses spaces and the file
uses tabs). When you write new lines in a code block, there will be a mix of tabs and spaces,
even though the whitespace is aligned. For a longer discussion on tabs vs spaces, see this
discussion on StackOverflow.
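To see the failure without creating a file, the sketch below compiles a small source string in which one line of a block is indented with a tab and the next with eight spaces. The string and the variable names are illustrative assumptions:

```python
# Line 2 is indented with a tab, line 3 with eight spaces.
source = "if True:\n\tx = 1\n        y = 2\n"
try:
    compile(source, '<example>', 'exec')
    outcome = 'compiled'
except TabError as err:      # TabError is a subclass of IndentationError
    outcome = 'TabError: ' + str(err)
print(outcome)
```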
Writing Functions
# Main Function
if __name__ == '__main__':
    buyFruit('apples', 2.4)
    buyFruit('coconuts', 2)
Rather than having a main function as in Java, the if __name__ == '__main__': check is used to
delimit expressions which are executed when the file is called as a script from the command
line. The code after the main check is thus the same sort of code you would put in a main
function in Java.
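The definition of buyFruit is not shown in this handout. The sketch below assumes a module-level fruitPrices dictionary with illustrative prices, and returns the message rather than printing it so that the function is easy to check:

```python
# Illustrative prices; the original table is not shown in this handout.
fruitPrices = {'apples': 2.00, 'oranges': 1.50, 'pears': 1.75}

def buyFruit(fruit, numPounds):
    """Return a message quoting the cost of numPounds pounds of fruit."""
    if fruit not in fruitPrices:
        return "Sorry we don't have %s" % fruit
    cost = fruitPrices[fruit] * numPounds
    return "That'll be %.2f please" % cost

# Runs only when the file is executed as a script, not when it is imported.
if __name__ == '__main__':
    print(buyFruit('apples', 2.4))
    print(buyFruit('coconuts', 2))
```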
Run the above code, the output will look like this:
Object Basics
Although this isn't a class in object-oriented programming, you'll have to use some objects in the
programming projects, and so it's worth covering the basics of objects in Python. An object
encapsulates data and provides functions for interacting with that data.
Defining Classes
"""
values e.g.
"""
self.fruitPrices = fruitPrices
self.name = name
"""
None otherwise.
"""
if fruit not in self.fruitPrices:
return None
return self.fruitPrices[fruit]
"""
Returns total cost of the orderList. If any of the fruits are not
"""
totalCost = 0.0
costPerPound = self.getCostPerPound(fruit)
return totalCost
def getName(self):
return self.name
The FruitShop class has some data, the name of the shop and the prices per pound of some fruit,
and it provides functions, or methods, on this data. What advantage is there to wrapping this data
in a class?
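The class listing above is incomplete in this handout. The following is a sketch of what a complete FruitShop could look like, built from the surviving fragments. The method name getPriceOfOrder, the (fruit, numPounds) order format, the docstring wording, and the example shop are assumptions:

```python
class FruitShop:
    def __init__(self, name, fruitPrices):
        """
        name: name of the fruit shop
        fruitPrices: dictionary mapping fruit names to prices per pound
        """
        self.fruitPrices = fruitPrices
        self.name = name

    def getCostPerPound(self, fruit):
        """Return the cost of fruit if the shop stocks it, None otherwise."""
        if fruit not in self.fruitPrices:
            return None
        return self.fruitPrices[fruit]

    def getPriceOfOrder(self, orderList):    # method name is an assumption
        """
        orderList: list of (fruit, numPounds) tuples (assumed format)
        Return the total cost; fruits the shop does not stock are skipped.
        """
        totalCost = 0.0
        for fruit, numPounds in orderList:
            costPerPound = self.getCostPerPound(fruit)
            if costPerPound is not None:
                totalCost += numPounds * costPerPound
        return totalCost

    def getName(self):
        return self.name


exampleShop = FruitShop('Example Shop', {'apples': 1.00, 'oranges': 1.50})
total = exampleShop.getPriceOfOrder([('apples', 2.0), ('kiwis', 1.0)])
print(exampleShop.getName(), total)
```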
Using Objects
So how do we make an object and use it? Make sure you have the FruitShop implementation in
shop.py. We then import the code from this file (making it accessible to other scripts) using import
shop, since shop.py is the name of the file. Then, we can create FruitShop objects as follows:
applePrice = berkeleyShop.getCostPerPound('apples')
print(applePrice)
print('Apples cost $%.2f at %s.' % (applePrice, shopName))
otherPrice = otherFruitShop.getCostPerPound('apples')
print(otherPrice)
print('Apples cost $%.2f at %s.' % (otherPrice, otherName))
print("My, that's expensive!")
So what just happened? The import shop statement told Python to load all of the functions and
classes in shop.py. The line berkeleyShop = shop.FruitShop(shopName, fruitPrices) constructs an instance
of the FruitShop class by calling the __init__ function in that class. Note that we only passed two
arguments in, while __init__ seems to take three arguments: (self, name, fruitPrices). The reason for this is
that all methods in a class have self as the first argument. The self variable's value is automatically
set to the object itself; when calling a method, you only supply the remaining arguments. The self
variable contains all the data (name and fruitPrices) for the current specific instance (similar to this
in Java). The print statements use the substitution operator (described in the Python docs if you're
curious).
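To make the role of self concrete without needing shop.py on disk, the sketch below uses a small stand-in class. The class name Shop, the shop name, and the prices are all hypothetical:

```python
class Shop:                                  # hypothetical stand-in for shop.FruitShop
    def __init__(self, name, fruitPrices):
        self.name = name
        self.fruitPrices = fruitPrices

    def getCostPerPound(self, fruit):
        return self.fruitPrices.get(fruit)   # None if the fruit is not stocked


shopName = 'the Example Bowl'                # illustrative values
fruitPrices = {'apples': 1.00, 'oranges': 1.50}
exampleShop = Shop(shopName, fruitPrices)    # two arguments; self is supplied for us

# These two calls are equivalent; the first is the usual form.
viaInstance = exampleShop.getCostPerPound('apples')
viaClass = Shop.getCostPerPound(exampleShop, 'apples')

line = 'Apples cost $%.2f at %s.' % (viaInstance, shopName)
print(line)
```

The second call shows what Python does behind the scenes: the object before the dot is passed as the first argument, self.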
The following example illustrates how to use static and instance variables in Python.
class Person:
    population = 0
    def get_population(self):
        return Person.population
    def get_age(self):
        return self.age
12
>>> p2.get_age()
63
In the code above, age is an instance variable and population is a static variable. population is shared
by all instances of the Person class whereas each instance has its own age variable.
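The listing above omits the constructor and the interactive session. A complete sketch, assuming a constructor that stores the age and increments the shared counter, and reusing the ages 12 and 63 from the surviving output, might look like this:

```python
class Person:
    population = 0                   # class (static) variable shared by all instances

    def __init__(self, myAge):       # constructor is an assumption
        self.age = myAge             # instance variable, one per object
        Person.population += 1

    def get_population(self):
        return Person.population

    def get_age(self):
        return self.age


p1 = Person(12)
p2 = Person(63)
print(p1.get_population(), p2.get_population())
print(p1.get_age(), p2.get_age())
```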
This tutorial has briefly touched on some major aspects of Python that will be relevant to the
course. Here are some more useful tidbits:
Use range to generate a sequence of integers, useful for generating traditional indexed for
loops:
for index in range(3):
    print(lst[index])
After importing a file, if you edit a source file, the changes will not be immediately propagated in
the interpreter. For this, use the reload function from the importlib module:
>>> import importlib
>>> importlib.reload(shop)
Troubleshooting
These are some problems (and their solutions) that new Python learners commonly encounter.
Problem:
ImportError: No module named py
Solution:
When using import, do not include the ".py" from the filename.
For example, you should say: import shop
NOT: import shop.py
Problem:
NameError: name 'MY VARIABLE' is not defined
Even after importing you may see this.
Solution:
To access a member of a module, you have to type MODULE NAME.MEMBER NAME,
where MODULE NAME is the name of the .py file, and MEMBER NAME is the name of the
variable (or function) you are trying to access.
Problem:
TypeError: 'dict' object is not callable
Solution:
Dictionary lookups are done using square brackets: [ and ]. NOT parentheses: ( and ).
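A minimal reproduction of this error, using an illustrative dictionary named prices:

```python
prices = {'apples': 2.00}        # illustrative dictionary
price = prices['apples']         # correct: square brackets
try:
    prices('apples')             # wrong: calls the dictionary like a function
except TypeError as err:
    error = str(err)
print(price, '|', error)
```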
Problem:
ValueError: too many values to unpack
Solution:
Make sure the number of variables you are assigning in a for loop matches the number
of elements in each item of the list. Similarly for working with tuples.
For example, if pair is a tuple of two elements (e.g. pair = ('apple', 2.0)) then the following
code would cause the "too many values to unpack" error, because there are more values than variables:
(a,) = pair
Supplying too many variables fails in the same way, with the message "not enough values to unpack":
(a, b, c) = pair
pairList = [('apples', 2.00), ('oranges', 1.50), ('pears', 1.75)]
for fruit, price, color in pairList:
    print('%s fruit costs %f and is the color %s' % (fruit, price, color))
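The sketch below triggers both unpacking errors on a two-element tuple and captures their messages as reported by Python 3. The variable names too_few and too_many are illustrative:

```python
pair = ('apple', 2.0)

try:
    a, b, c = pair               # three names, two values
except ValueError as err:
    too_few = str(err)

try:
    (a,) = pair                  # one name, two values
except ValueError as err:
    too_many = str(err)

fruit, price = pair              # counts match, so this works
print(too_few)
print(too_many)
```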
Problem:
AttributeError: 'list' object has no attribute 'length' (or something similar)
Solution:
Finding length of lists is done using len(NAME OF LIST).
Problem:
Changes to a file are not taking effect.
Solution:
1. Make sure you are saving all your files after any changes.
2. If you are editing a file in a window different from the one you are using to execute
python, make sure you call importlib.reload(YOUR_MODULE) (after import importlib) to guarantee
your changes are being reflected. reload works similarly to import.
More References
Lab Tasks
Task #1
Write a Python program which takes two strings as input and calculates the
Levenshtein/Edit distance between the two strings.
Explanation:-
Levenshtein/Edit distance gives us a measure of similarity between two strings/sequences.
Formally, it is the minimum number of single-character edits required to transform
one string into another.
Single character edits include:-
Insertion
Deletion
Substitution
Mathematically:-
Mathematically Levenshtein/Edit distance between two strings ‘a’ and ‘b’ is defined as:-
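The formula itself did not survive in this copy of the handout. The standard recursive definition is usually written as below, where i and j are prefix lengths of a and b, and the indicator term is 1 when the characters a_i and b_j differ and 0 otherwise. The notation is an assumption and may differ from the original figure:

```latex
\operatorname{lev}_{a,b}(i, j) =
\begin{cases}
  \max(i, j) & \text{if } \min(i, j) = 0, \\[4pt]
  \min
  \begin{cases}
    \operatorname{lev}_{a,b}(i-1, j) + 1 \\
    \operatorname{lev}_{a,b}(i, j-1) + 1 \\
    \operatorname{lev}_{a,b}(i-1, j-1) + 1_{(a_i \neq b_j)}
  \end{cases} & \text{otherwise.}
\end{cases}
```

When transforming a into b, the three inner terms correspond to a deletion, an insertion, and a substitution (or a match, when the indicator is 0). The distance between the full strings is the value at i = |a|, j = |b|.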
For further understanding of the formula you may read this blog, as it explains it in great depth,
or you may get back to me whenever you get stuck.
https://medium.com/@ethannam/understanding-the-levenshtein-distance-equation-for-beginners-c4285a5604f0
But it does not explain how to count the edit operations while calculating the overall Levenshtein
distance.
Task #2
Now modify the program written above so that it takes two text files containing
single-line, lowercase English sentences, named reference.txt and hypothesis.txt, and
outputs a file result.txt containing the Levenshtein distance between these two files, as below. The
distance should be computed at the word level, not the character level.
**********reference.txt***************
this is some text and we would like to see if it has been identified correctly by speech recognition system
***************************************
**********hypothesis.txt*************
this is a text and we would like to check what has been identified by the speech recognition
***************************************
*********result.txt*******************
Levenshtein distance is 7
Insertions 1
Deletions 3
Substitutions 3
***************************************
Hint:-
Task #3
Now modify the above program so that it ignores 10 common words in such a way:-
*********result2.txt*******************
Levenshtein distance is 5
Insertions 0
Deletions 3
Substitutions 2
***************************************
Submission Guidelines:-
Deliverables and Deadline: All three tasks should be submitted in the lab. Please submit your
executable python files (without errors).