Topic_2_Python_review
1 Python
1.1 List Comprehensions
1.2 Enumerate
1.3 Zip
1.4 Dictionaries
1.5 String Methods
2 Lambda & Map
3 Numpy Library
3.1 Creating Numpy Arrays
3.2 Numpy Array Attributes
3.3 Indexing
3.4 Slicing
3.5 Sorting
3.6 Copy of Array
3.7 Reshaping
3.8 Concatenate/Split
3.9 Vectorized Operations
3.10 Ufuncs
3.11 Broadcasting
3.12 Masking
4 References
4.1 Online:
Python
A review on Python for ISE 291.
Python reference:
1. The documentation of any Python object, including its methods and attributes, can be read with the built-in help() function.
2. One can also refer to the online documentation, for example: https://docs.python.org/3/
List Comprehensions
A list comprehension is an elegant way to create lists in Python. To see how useful list comprehensions are, consider the following example:
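As a minimal sketch of the idea, a plain loop and an equivalent list comprehension can be compared side by side. The names `numbers`, `squares_loop`, and `squares` are illustrative, and `numbers` is assumed to hold the integers 1,...,50, as the outputs in this section suggest.

```python
# Assumed setup: the integers 1,...,50.
numbers = list(range(1, 51))

# Loop version: build the list of squares one element at a time.
squares_loop = []
for x in numbers:
    squares_loop.append(x**2)

# List comprehension: the same result in a single expression.
squares = [x**2 for x in numbers]

print(squares == squares_loop)
print(squares[:5])
```

The comprehension reads as "x squared, for each x in numbers", which is why it is usually preferred for simple transformations.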
In [3]: # 2. Create another list of numbers containing the squares of the numbers from 1,...,5.
l=[1,4,9,16,25]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576, 625, 676, 729, 784, 841, 900, 961, 1024, 1089, 1156, 1225, 1296, 1369, 1444, 1521, 1600, 1681, 1764, 1849, 1936, 2025, 2116, 2209, 2304, 2401, 2500]
In [8]: # 4. Create a list, say eSquare, which has only the squares of the even numbers from 1,...,50.
eSquare = [x**2 for x in numbers if x % 2 == 0]
print(eSquare)
# Create a new list, say PV, which has '+' for even numbers and '-' for odd numbers for numbers from 1,...,50.
PV = ['+' if x % 2 == 0 else '-' for x in numbers]
print(PV)
[4, 16, 36, 64, 100, 144, 196, 256, 324, 400, 484, 576, 676, 784, 900, 1024, 1156, 1296, 1444, 1600, 1764, 1936, 2116, 2304, 2500]
['-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '
+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+', '-', '+']
25
True
In [9]: # 8. Create two new variables, containing the max and min numbers inside eSquare.
maxValue, minValue = max(eSquare), min(eSquare)
print(maxValue, minValue)
2500 4
4 2500
[4, 16, 36, 64, 100, 144, 256, 324, 400, 484, 576, 676, 784, 900, 1024, 1156, 1296, 1444, 1600, 1764, 1936, 2116, 2304, 2500]
Enumerate
enumerate is useful when you want to iterate over a list while also keeping track of the index of each element. To see how it works, consider the following:
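A minimal sketch of enumerate, using an illustrative list `colors` that is not part of the original notebook. It shows the (index, item) pairs, the optional `start` argument, and the conversion to a dictionary.

```python
# Illustrative data (hypothetical, not from the original cells).
colors = ['red', 'green', 'blue']

# enumerate yields (index, item) pairs; start=1 makes the count begin at 1.
for i, c in enumerate(colors, start=1):
    print(f'{i}.{c}')

# The same pairs can feed a list comprehension or a dictionary.
numbered = [f'{i}.{c}' for i, c in enumerate(colors, start=1)]
indexed = dict(enumerate(colors))   # keys are the default 0-based indices
print(numbered)
print(indexed)
```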
enumerate(l1)
list(enumerate(l1))
dict(enumerate(l1))
tuple(enumerate(l1))
# or
fruits2 = sorted(fruits) #this will create a copy of the list and sort the copy, that is why we need to store it in a variable
print("sorted: ", fruits2)
sorted: ['Apple', 'Banana', 'Grapes', 'Kiwi', 'Orange', 'Pear', 'Plum', 'Strawberry', 'Watermelon']
sorted: ['Apple', 'Banana', 'Grapes', 'Kiwi', 'Orange', 'Pear', 'Plum', 'Strawberry', 'Watermelon']
In [9]: # 3. To the sorted list, for each element, prefix the value by its position.
# i=1
# fruits2=[]
# for x in fruits:
#     fruits2.append(str(i)+'.'+x)
#     i+=1
# print(fruits2)
# fruits2=[]
# for x in fruits:
#     fruits2.append(str(fruits.index(x)+1)+'.'+x)
# print(fruits2)
# fruits2=[]
# for i in range(len(fruits)):
#     fruits2.append(str(i+1)+'.'+fruits[i])
# fruits2=[]
# for i,x in enumerate(fruits):
#     fruits2.append(str(i+1)+'.'+x)
# print(fruits2)
Zip
zip is useful when you have to pair up the elements of two or more lists. It creates a zip object, from which a list of tuples can be obtained. Such lists are very handy for looping over two or more lists simultaneously.
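A minimal sketch of zip, assuming two small illustrative lists `items` and `costs` (hypothetical names and data). It also shows that zip stops at the shortest input.

```python
# Illustrative data (hypothetical, not from the original cells).
items = ['Apple', 'Kiwi', 'Plum']
costs = [47, 30, 28]

# zip pairs the i-th element of each input into a tuple.
pairs = list(zip(items, costs))
print(pairs)

# Looping over two lists at once, unpacking each tuple.
for item, cost in zip(items, costs):
    print(item, cost)

# zip stops as soon as the shortest input is exhausted.
short = list(zip([1, 2, 3], 'ab'))
print(short)
```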
In [14]: # zip creates tuples whose first element comes from the first list and whose second element comes from the second list
# 1. Consider the following list:
# [['Apple', 'Banana', 'Orange'], ['Watermelon', 'Plum', 'Grapes', 'Kiwi'], ['Strawberry', 'Pear', 'Mango']]
# 2. Create a new list containing all the 'fruits'.
fruits = [['Apple', 'Banana', 'Orange'], ['Watermelon', 'Plum', 'Grapes', 'Kiwi'], ['Strawberry', 'Pear', 'Mango']]
#outer loop comes first and then the inner loop
allFruits = [y for innerList in fruits for y in innerList]
print(allFruits)
# print(allFruits)
['Apple', 'Banana', 'Orange', 'Watermelon', 'Plum', 'Grapes', 'Kiwi', 'Strawberry', 'Pear', 'Mango']
In [16]: # 3. Create a new list, say zipZap, containing tuple of fruits name followed by length of the name.
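lenFruits = [len(f) for f in allFruits] # lengths of the names (definition missing from the extracted cell; reconstructed from the output)
zipZap = list(zip(allFruits,lenFruits))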
print(zipZap)
[('Apple', 5), ('Banana', 6), ('Orange', 6), ('Watermelon', 10), ('Plum', 4), ('Grapes', 6), ('Kiwi', 4), ('Strawberry', 10), ('Pear', 4), ('Mango', 5)]
for k in zipZap:
    print(k[0],k[1])
Apple 5
Banana 6
Orange 6
Watermelon 10
Plum 4
Grapes 6
Kiwi 4
Strawberry 10
Pear 4
Mango 5
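In [18]: # 5. Iterate over zipZap and print the index and all the corresponding elements.
# (the original cell body is missing; this loop is one possible way to produce the output)
for i, (name, length) in enumerate(zipZap):
    print(i, name, length)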
0 Apple 5
1 Banana 6
2 Orange 6
3 Watermelon 10
4 Plum 4
5 Grapes 6
6 Kiwi 4
7 Strawberry 10
8 Pear 4
9 Mango 5
Dictionaries
Dictionaries are used to store data values in key:value pairs. For example, consider the following lists:
List of fruits:
['Apple', 'Banana', 'Orange', 'Watermelon', 'Plum', 'Grapes', 'Kiwi', 'Strawberry', 'Pear', 'Mango']
List of corresponding prices per KG:
[47, 27, 35, 13, 28, 10, 30, 56, 15, 25]
Do the following:
1. Create a dictionary, say book1, which has fruits as keys, and prices as values.
2. Loop over all the key & value pairs of book1.
3. Create a dictionary, say book2, which has fruits as values, and unique numbers as keys. (For unique numbers, use enumerate.)
4. Display book2 and ask the user to pick a number. Then display the price of the fruit using book1.
In [3]: # 1. Create a dictionary, say book1, which has fruits as keys, and prices as values.
book1 = dict(zip(fruits,prices))
print(book1)
{'Apple': 47, 'Banana': 27, 'Orange': 35, 'Watermelon': 13, 'Plum': 28, 'Grapes': 10, 'Kiwi': 30, 'Strawberry': 56, 'Pear': 15, 'Mango': 25}
In [11]: # 2. Loop over all the key & value pairs of book1.
for k,v in book1.items():
    print(f'The price of {k} is {v} SAR.')
In [5]: # 3. Create a dictionary, say book2, which has fruits as values, and unique numbers as keys.
book2 = dict(zip(list(range(len(fruits))),fruits))
print(book2)
{0: 'Apple', 1: 'Banana', 2: 'Orange', 3: 'Watermelon', 4: 'Plum', 5: 'Grapes', 6: 'Kiwi', 7: 'Strawberry', 8: 'Pear', 9: 'Mango'}
In [10]: # 4. Display book2 and ask the user to pick a number. Then display the price of the fruit using book1.
for k,v in book2.items():
    print(f'{k}. {v}')
itemSelected = input('Enter the number corresponding to a fruit to know its price.')
itemSelected = int(itemSelected)
fruitSelected = book2[itemSelected]
priceSelected = book1[fruitSelected]
print('\n'*4)
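# width is not defined in the extracted cell; it is assumed here to be the length of the price line, which matches the printed box
width = len(f'*The price per KG of {fruitSelected} is {priceSelected}.*')
print('*'*width)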
print(f'{f"*You have selected {fruitSelected}.":<{width-1}}*') #f strings can be nested
print(f'*The price per KG of {fruitSelected} is {priceSelected}.*')
print('*'*width)
####
# f'{string: <{width}}' is used for formatting (align, width) while printing strings
#to align the string to the right use >, and to the left use <
#example: f'{s:>5}' implies right align string with space of 5 chars
0. Apple
1. Banana
2. Orange
3. Watermelon
4. Plum
5. Grapes
6. Kiwi
7. Strawberry
8. Pear
9. Mango
Enter the number corresponding to a fruit to know its price.3
***************************************
*You have selected Watermelon. *
*The price per KG of Watermelon is 13.*
***************************************
Duplicate keys are not allowed. That is, a given key can appear in a dictionary only once.
Dictionary keys can be of any hashable type, which in practice means immutable types such as strings, numbers, and tuples of immutable items.
Assigning a new value to an existing key overrides the previous value.
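The three properties above can be checked with a short sketch. The dictionary `book` and its entries are illustrative and not taken from the original cells.

```python
# Illustrative dictionary (hypothetical data).
book = {'Apple': 47, 'Banana': 27}

# Assigning to an existing key overwrites the value; no duplicate key appears.
book['Apple'] = 50

# A tuple of strings is hashable, so it is a valid key.
book[('Kiwi', 'green')] = 30

# A list is mutable and therefore unhashable: using it as a key raises TypeError.
try:
    book[['Plum']] = 28
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(book)
print(list_key_allowed)
```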
String Methods
Out of the box, Python strings have built-in methods that are very handy in text processing. For example, consider the following text:
we are reviewing python programming topics in ise 291. it is the most popular language for data scientists. also, it is a good general purpose programming language.
Do the following:
In [2]: text = """we are reviewing python programming topics in ise 291.
it is the most popular language for data scientists.
also, it is a good general purpose programming language."""
['we are reviewing python programming topics in ise 291', '\nit is the most popular language for data scientists', '\nalso, it is a good general purpose programming language']
In [26]: sentences1=[s.strip().capitalize().replace('ise','ISE')+'.'
for s in sentences]
print(sentences1)
## other variations
# sentences2=[]
# for s in sentences:
#     sentences2.append(s.strip().capitalize().replace('ise','ISE')+'.')
# print(sentences2)
## or
# sentences3=[]
# for s in sentences:
#     temp=s.strip()
#     temp=temp.capitalize()
#     temp=temp.replace('ise','ISE')
#     temp=temp+'.'
#     sentences3.append(temp)
# print(sentences3)
['We are reviewing python programming topics in ISE 291.', 'It is the most popular language for data scientists.', 'Also, it is a good general purpose programming language.']
In [27]: newText="\n".join(sentences1)
print(newText)
Lambda & Map
The lambda and map constructs extend the ability of Python to perform complex operations in a compact and simple style.
1. A Python lambda expression creates a small anonymous function, which can take any number of arguments but can have only one expression.
2. Python's map() function applies a function to each element of an iterable, such as a list or tuple.
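A minimal sketch of both constructs follows. The function name `add_ten` and the sample lists are illustrative and are not a reconstruction of the original cells.

```python
# A lambda with one argument and a single expression.
add_ten = lambda x: x + 10
print(add_ten(7))

# map applies the function to every element; list() materializes the result.
numbers = [1, 2, 3, 4, 5]
squares = list(map(lambda x: x**2, numbers))
print(squares)

# With several iterables, map passes one element from each to the function.
list1 = [1, 2, 3, 4, 5]
list2 = [5, 5, 3, 5, 5]
sums = list(map(lambda a, b: a + b, list1, list2))
print(sums)
```

For simple cases like these, a list comprehension such as `[x**2 for x in numbers]` gives the same result and is often considered easier to read.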
17
7 is an odd number.
numbers = [1, 2, 3, 4, 5]
print(squares)
print(squares)
list1 = [1, 2, 3, 4, 5]
list2 = [5, 5, 3, 5, 5]
[0, None, 4, None, 16, None, 36, None, 64, None, 100]
[0, 4, 16, 36, 64, 100]
Numpy Library
A must for scientific work
In [18]: #1. Create an integer array of 10 elements filled with all zeros.
np.zeros(10, dtype="int")
## another way
# list1 = [0 for x in range(10)]
# print(list1)
# print(type(list1))
# print('-'*8)
# array1 = np.array(list1)
# print(array1)
# print(type(array1))
# or the default
# np.ones((3,5))
## or the following
# print(np.full((3,5), np.round(np.pi,3))) # numpy functions are Universal Functions (ufuncs) more on this in Sec 3.10.
In [26]: #4. Create a 3x3 array of uniformly distributed random values between 0 and 1.
np.random.seed(0) # seed for reproducibility
print(np.random.random((3,3)))
# print(np.random.rand(3,3))
In [25]: #5. Create a 3x3 array of random integers in the interval [0, 10).
np.random.randint(0,10,(3,3))
In [30]: # 7. Create an array of 11 elements uniformly dividing the interval [0,1], i.e, 0. , 0.1,..., 0.9, 1.
np.linspace(0,1,11)
Out[30]: array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
1. Create a random one-dimensional array (x1), and check all the above attributes.
2. Create a random two-dimensional array (x2), and check all the above attributes.
In [6]: # 1. Create a random one-dimensional array (x1), and check all the above attributes
np.random.seed(0) # seed for reproducibility
x1 = np.random.randint(10,size=6) # same as np.random.randint(0, 10, size=6); one-dimensional array
print(x1)
print(f"x1 ndim: {x1.ndim}")
print(f"x1 shape: {x1.shape}")
print(f"x1 size: {x1.size}") # 6 elements in total
[5 0 3 3 7 9]
x1 ndim: 1
x1 shape: (6,)
x1 size: 6
In [13]: # 2. Create a random two-dimensional array (x2), and check all the above attributes.
np.random.seed(0) # seed for reproducibility
x2 = np.random.randint(10,size=(3,4)) # Two-dimensional array
print(x2)
print(f"x2 ndim: {x2.ndim}")
print(f"x2 shape: {x2.shape}")
print(f"x2 size: {x2.size}") # 12 elements in total
[[5 0 3 3]
[7 9 3 5]
[2 4 7 6]]
x2 ndim: 2
x2 shape: (3, 4)
x2 size: 12
Indexing
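Numpy array indexing for 1d arrays is the same as for Python lists. However, indexing an n-dimensional Numpy array is slightly different from indexing nested Python lists. For example, a 2d array is indexed with a single pair of brackets as x2[index1, index2] (a nested list would need x2[index1][index2]),
where index1 and index2 can be indices or slices. See the following examples: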
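The difference between nested-list indexing and array indexing can be sketched as follows. The names `nested`, `arr`, and `col` are illustrative; the values match the 3x4 array printed in this section.

```python
import numpy as np

nested = [[5, 0, 3, 3], [7, 9, 3, 5], [2, 4, 7, 6]]
arr = np.array(nested)

a = nested[2][1]   # nested list: two chained index operations
b = arr[2, 1]      # Numpy array: one bracket, comma-separated indices
col = arr[:, 0]    # a slice in the first position selects a whole column

print(a, b)
print(col)
```

Selecting a column with `arr[:, 0]` has no direct counterpart for nested lists, where a loop or comprehension would be needed.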
print('-'*10)
print(x1[0],x1[4],x1[-1],x1[-2])
[5 0 3 3 7 9]
indices:[0, 1, 2, 3, 4, 5]
array :[5, 0, 3, 3, 7, 9]
----------
5 7 9 7
print(x2)
print('-'*10)
print(x2[2,1],x2[2,0],x2[2,-4],x2[-2,-3])
[[5 0 3 3]
[7 9 3 5]
[2 4 7 6]]
----------
4 2 2 9
[3 0 3 3 7 9]
[[12 0 3 3]
[ 7 9 3 5]
[ 2 4 7 6]]
Slicing
Slicing refers to selecting/extracting a sub-array from a given array. As an example for 1d arrays, do the following tasks:
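A few common 1d slices, written as a sketch with the general form x[start:stop:step]. The array values are the ones printed in this section; the variable names are illustrative.

```python
import numpy as np

x = np.array([5, 0, 3, 3, 7, 9, 3, 5, 2, 4])

first_three = x[:3]       # elements at indices 0, 1, 2
every_other = x[::2]      # every second element, starting at index 0
reversed_x = x[::-1]      # all elements in reverse order
near_end = x[-4:-1]       # from the fourth-from-last up to, but excluding, the last

print(first_three, every_other, reversed_x, near_end)
```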
indices:[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
array :[5, 0, 3, 3, 7, 9, 3, 5, 2, 4]
In [56]: # 7. Display elements starting from fourth element from the end and ending at last but one element.
x1[-4:-1]
[[5 0 3 3]
[7 9 3 5]
[2 4 7 6]]
[5 7 2]
[5 0 3 3]
In [64]: # 7. Display third row elements in order of column 3, then column 1 then column 2
x2[2,[2,0,1]]
In [65]: # 8. Display second and following row elements in order of column 3, then column 1 then column 2
x2[1:, [2, 0, 1]]
Sorting
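The Numpy array method .sort() sorts an array in-place, while the function np.sort() returns a sorted copy. For 2d arrays the sorting is done along a chosen axis, and the sorting algorithm (kind) and, for structured arrays, the field order can be specified too. As an example, do the following:
1. Create a 1d random array of size 6, and sort it. Print the corresponding original indices in the sorted array.
2. Create a random 2d numpy array of size 3x4. Sort row wise (along axis 0, so that each column ends up in ascending order).
3. Create a random 2d numpy array of size 3x4. Sort column wise (along axis 1, so that each row ends up in ascending order).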
In [3]: # 1. Create a 1d random array of size 6, and sort it. Print the corresponding original indices in the sorted array.
np.random.seed(0) # seed for reproducibility
x1 = np.random.randint(10,size=6) # One-dimensional array
print(f'original indices:{list(range(6))}')
print(f'given array :{x1.tolist()}')
print(f'sorted-array :{np.sort(x1).tolist()}') # Ascending
print(f'sorted indices :{np.argsort(x1).tolist()}')
print(f'sorted-array :{np.sort(x1)[::-1].tolist()}') #Descending
original indices:[0, 1, 2, 3, 4, 5]
given array :[5, 0, 3, 3, 7, 9]
sorted-array :[0, 3, 3, 5, 7, 9]
sorted indices :[1, 2, 3, 0, 4, 5]
sorted-array :[9, 7, 5, 3, 3, 0]
In [4]: # 2. Create a random 2d numpy array of size 3x4. Sort row wise.
np.random.seed(0) # seed for reproducibility
x2 = np.random.randint(10, size=(3,4)) # Two-dimensional array
print(x2)
print('-'*10)
x2.sort(axis=0) # in-place sort along axis 0 (down the rows): each column becomes sorted
print(x2)
[[5 0 3 3]
[7 9 3 5]
[2 4 7 6]]
----------
[[2 0 3 3]
[5 4 3 5]
[7 9 7 6]]
In [5]: #3. Create a random 2d numpy array of size 3x4. Sort column wise.
np.random.seed(0) # seed for reproducibility
x2 = np.random.randint(10, size=(3,4)) # Two-dimensional array
print(x2)
print('-'*10)
x2.sort(axis=1) # in-place sort along axis 1 (across the columns): each row becomes sorted
print(x2)
[[5 0 3 3]
[7 9 3 5]
[2 4 7 6]]
----------
[[0 3 3 5]
[3 5 7 9]
[2 4 6 7]]
Copy of Array
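Numpy arrays are mutable. The assignment "=" does not copy an array; it only binds a second name to the same array, so a change made through either name is visible through both. Use ".copy()" to obtain an independent copy of the entire array. For example, see the code in the following cells: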
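The difference between assignment and .copy() can be sketched as below. The names `alias` and `independent` are illustrative; the array values follow the outputs shown in this section.

```python
import numpy as np

x2 = np.array([[5, 0, 3, 3],
               [7, 9, 3, 5]])

# Plain assignment: alias and x2 are two names for the same array.
alias = x2
alias[0, 1:3] = [4, 4]
print(x2)             # the change made through alias is visible in x2

# .copy() allocates new memory, so later changes do not propagate back.
independent = x2.copy()
independent[0, 0] = 99
print(x2[0, 0], independent[0, 0])
```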
x2_copy[0,1:3]=[4,4]
print(f'x2_copy modified is:\n{x2_copy}')
print('-'*10)
print(f'x2 is:\n{x2}')
## x2_copy = x2[:,:]
x2 is:
[[5 0 3 3]
[7 9 3 5]]
----------
x2_copy is:
[[5 0 3 3]
[7 9 3 5]]
----------
x2_copy modified is:
[[5 4 4 3]
[7 9 3 5]]
----------
x2 is:
[[5 4 4 3]
[7 9 3 5]]
x2_copy[0,1:3]=[4,4]
print(f'x2_copy modified is:\n{x2_copy}')
print('-'*10)
print(f'x2 is:\n{x2}')
x2 is:
[[5 0 3 3]
[7 9 3 5]]
----------
x2_copy is:
[[5 0 3 3]
[7 9 3 5]]
----------
x2_copy modified is:
[[5 4 4 3]
[7 9 3 5]]
----------
x2 is:
[[5 0 3 3]
[7 9 3 5]]
Reshaping
The Numpy .reshape() method gives a new shape to an existing array without changing its data; the new shape must contain the same total number of elements. As an example, do the following task:
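A minimal reshape sketch, assuming a simple array built with np.arange rather than the random array of the original cells.

```python
import numpy as np

x = np.arange(10)        # 0, 1, ..., 9

a = x.reshape(5, 2)      # 5 rows, 2 columns
b = x.reshape(2, -1)     # -1 lets Numpy infer the missing dimension (here 5)

print(a)
print(b)
print(x.shape)           # the original array keeps its own shape
```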
[5 0 3 3 7 9 3 5 2 4]
[[5 0]
[3 3]
[7 9]
[3 5]
[2 4]]
[[5 0 3 3 7]
[9 3 5 2 4]]
Concatenate/Split
Concatenate means joining, and Numpy's np.concatenate() function is used to join two or more arrays along a specified axis; the arrays must have the same shape except along that axis. Numpy's np.vstack() (np.hstack()) can be used for row-wise (column-wise) joining. Numpy's np.split(), np.vsplit(), and np.hsplit() functions break an array into sub-arrays, the opposite of concatenation.
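The joining and splitting functions named above can be sketched as follows. The arrays reuse values that appear in this section; the result names (`joined`, `stacked`, `side`, `parts`) and the break points [3, 5] are illustrative assumptions.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([3, 2, 1])
z = np.array([9, 9, 9])

joined = np.concatenate([x, y, z])   # one long 1d array
stacked = np.vstack([x, y, z])       # each input becomes a row

grid = np.array([[9, 8, 7],
                 [6, 5, 4]])
extra = np.array([[99],
                  [99]])
side = np.hstack([grid, extra])      # extra is appended as a new column

# Split at indices 3 and 5: pieces [0:3], [3:5], and [5:].
parts = np.split(np.array([1, 2, 3, 99, 99, 3, 2, 1]), [3, 5])

print(joined)
print(stacked)
print(side)
print(parts)
```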
In [74]: x = np.array([1,2,3])
y = np.array([3,2,1])
z = np.array([9,9,9])
[1 2 3 3 2 1 9 9 9]
In [75]: x = np.array([1,2,3])
y = np.array([3,2,1])
z = np.array([9,9,9])
[[1 2 3]
[3 2 1]
[9 9 9]]
In [22]: y = np.array([[9,8,7],
[6,5,4]])
z = np.array([[99],
[99]])
In [12]: # An example to split array into three sub-arrays using the given break points
x = [1,2,3,99,99,3,2,1]
print(x)
In [28]: # An example to split 2d-array into three sub-arrays using the given break points
x = np.array([[1,2,3,99],[99,3,2,1]])
print(x)
print('-'*16)
[[ 1 2 3 99]
[99 3 2 1]]
----------------
[[ 1 2 3 99]] [[99 3 2 1]]
----------------
[[ 1 2]
[99 3]] [[ 3 99]
[ 2 1]]
Vectorized Operations
Vectorized operations are perhaps the most important reason for the wide usage of the Numpy library. Simply put, vectorized operations are operations that are applied to whole arrays without writing explicit Python loops.
1. Create a random 1d array, and add, subtract, multiply and divide all the elements by 5.
2. Take the square of all elements.
3. Find the remainder w.r.t. 2 for all elements.
4. Transform the array as: $-(\frac{x}{2}+1)^2$, where $x$ is the random 1d array.
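The tasks above can be sketched without any explicit loop. The array is typed in directly (using the values printed in this section) instead of being generated randomly, and the result names are illustrative.

```python
import numpy as np

x = np.array([5, 0, 3, 3, 7, 9, 3, 5, 2, 4])

# Loop-based reference for comparison.
loop_result = [v + 5 for v in x.tolist()]

# Vectorized versions: the operator is applied to every element at once.
plus_five = x + 5
squared = x ** 2
remainder = x % 2
transformed = -(0.5 * x + 1) ** 2   # ** binds tighter than the unary minus

print(plus_five)
print(squared)
print(remainder)
print(transformed)
```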
In [80]: # 1. Create a random 1d array, and add, subtract, multiply and divide all the elements by 5.
x = [5 0 3 3 7 9 3 5 2 4]
x + 5 = [10 5 8 8 12 14 8 10 7 9]
x - 5 = [ 0 -5 -2 -2 2 4 -2 0 -3 -1]
x * 5 = [25 0 15 15 35 45 15 25 10 20]
x / 5 = [1. 0. 0.6 0.6 1.4 1.8 0.6 1. 0.4 0.8]
x // 5 = [1 0 0 0 1 1 0 1 0 0]
x ** 2 = [25 0 9 9 49 81 9 25 4 16]
x % 2 = [1 0 1 1 1 1 1 1 0 0]
In [82]: # 4. Transform the array as: $-(\frac{x}{2}+1)^2$, where $x$ is the random 1d array.
newArray=-(0.5*x+1) ** 2
Given array : [5 0 3 3 7 9 3 5 2 4]
Transformed array: [-12.25 -1. -6.25 -6.25 -20.25 -30.25 -6.25 -12.25 -4. -9. ]
Ufuncs
A universal function (or ufunc for short) is a vectorized wrapper for a function that takes a fixed number of specific inputs and produces a fixed number of specific outputs.
In [83]: x = np.array([-2,-1,0,1,2])
print('-'*10)
print(np.abs(x)) # convert all elements to +ve numbers
print(np.sqrt(np.abs(x))) # get square-roots of all abs(elements)
print(np.min(x)) # get the min value
print(np.max(x)) # get the max value
print(np.sum(x)) # get the sum of all values
[-2 -1 0 1 2]
----------
[2 1 0 1 2]
[1.41421356 1. 0. 1. 1.41421356]
-2
2
0
Broadcasting
Vectorized operations can also involve multiple arrays, in which case they follow the broadcasting rules. The key broadcasting rule is: in order to broadcast, the sizes of the trailing axes of both arrays in an operation must either be equal, or one of them must be one. Roughly, this means:
1. If two arrays with the same number of dimensions have the same shape, the operation is applied to corresponding elements.
2. If the arrays do not have the same number of dimensions, prepend 1s to the shape of the lower-dimensional array until both arrays have the same number of dimensions.
3. In any dimension where one array has size 1 and the other array has size greater than 1, the first array behaves as if it were copied along that dimension.
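The rules above can be sketched with three small cases. The names `M`, `a`, `col`, `grid`, and `compatible` are illustrative and not taken from the original cells.

```python
import numpy as np

M = np.ones((2, 3), dtype=int)   # shape (2, 3)
a = np.arange(3)                 # shape (3,)

# Rule 2: a is treated as shape (1, 3); rule 3: it is then stretched to (2, 3).
total = M + a

# Shapes (2, 1) and (3,) broadcast to a common shape (2, 3).
col = np.array([[10], [20]])
grid = col + a

# Shapes (2, 3) and (2,) are incompatible: trailing sizes 3 and 2 differ.
try:
    M + np.array([1, 2])
    compatible = True
except ValueError:
    compatible = False

print(total)
print(grid)
print(compatible)
```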
## list1 + list2, where list1 and list2 are Python 1d lists results in concatenation!
## list1 * list2, where list1 and list2 are Python 1d lists results in error!
array 1: [0 1 2]
array 2: [5 5 3]
--------------------
array sum: [5 6 5]
array dot: [0 5 6]
Matrix 1:
[[1 1 1]
[1 1 1]]
Array 1:
[0 1 2]
----------
The sum:
[[1 2 3]
[1 2 3]]
The dot:
[[0 1 2]
[0 1 2]]
Masking
A mask is an array of booleans that indicates, for each element of the associated array, whether that element satisfies a given condition. Masks are obtained by applying conditions to an array; Numpy array comparisons are vectorized too.
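A minimal masking sketch, showing the boolean array itself, selection, counting, and assignment through a mask. The names `selected`, `count`, and `clipped` are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6])

mask = x < 3             # elementwise comparison gives a boolean array
selected = x[mask]       # keeps only the elements where mask is True
count = mask.sum()       # True counts as 1, so this is the number of matches

# A mask can also be used on the left-hand side to modify matching elements.
clipped = x.copy()
clipped[clipped > 4] = 0

print(mask)
print(selected, count)
print(clipped)
```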
In [5]: x = np.array([1,2,3,4,5,6])
In [6]: mask=x<3 # boolean mask that shows which elements of x are strictly less than 3
x[mask]
In [88]: mask=(x > 2) & (x < 5) # shows which elements of x are strictly between 2 and 5
x[mask]
In [8]: mask=(x < 2) | (x >= 5) # shows which elements of x are less than 2 or at least 5
x[mask]
References
Online:
1. https://docs.python.org/3/
2. https://numpy.org/