Programming in Python: 1.1 Getting Started With Python
Programming in Python
Python is a simple, high-level language with a clean syntax. It offers strong support for integration
with other languages and tools, comes with extensive standard libraries, and can be learned within
a few days. Many Python programmers report substantial productivity gains and feel the language
encourages the development of higher quality, more maintainable code. To know more, visit the
Python website.
CHAPTER 1. PROGRAMMING IN PYTHON 2
You can type this at the >>> prompt and press the Enter key to view the result. After trying
this method you should open a new file from the File menu of IDLE and type the one line of code
inside it. Save it as 'hello.py' and execute it from the Run menu. Pressing the F5 key also does
the job. You need to save the code after making changes.
The filenames of Python programs should not match the names of Python keywords, built-in types
or standard modules. For example, naming a file 'string.py' or 'int.py' may result in hard to
track error messages.
The output of the program is shown below. Note that the type of the variable x changes during
the execution of the program, depending on the value assigned to it.
10 <class 'int'>
10.4 <class 'float'>
(3+4j) <class 'complex'>
I am a String <class 'str'>
The program treats the variables like humans treat labelled envelopes. We can pick an envelope,
write some name on it and keep something inside it for future use. In a similar manner the program
creates a variable, gives it a name and keeps some value inside it, to be used in subsequent steps.
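The listing that produced the output above is not reproduced here. A minimal sketch of such a program is given below, assuming the same four values; the names values and seen_types are only illustrative, and the exact text printed by type() depends on the Python version.

```python
# The same name x is bound to values of four different types in turn.
values = [10, 10.4, 3 + 4j, 'I am a String']
seen_types = []
for value in values:
    x = value                            # x now refers to a new object
    seen_types.append(type(x).__name__)  # remember the type name
    print(x, type(x))
```

The type belongs to the value kept inside the "envelope", not to the name written on it.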
So far we have used four data types of Python: int, float, complex and str. To become familiar with
them, you may write simple programs performing arithmetic and logical operations using them.
Example: oper.py
x = 2
y = 4
print(x + y * 2)
s = 'Hello '
print(s + s)
print(3 * s)
print(x == y)
print(y == 2 * x)
print(5/2 , 5//2)
Note that a String can be added to another String and it can be multiplied by an integer. Try
to understand the logic behind that, and also try adding a String to an Integer to see what
error message you get. We have used the comparison operator == for comparing two variables.
The last line demonstrates the nature of the division operator in Python 3. By default it does floating
point division. Integer division can be forced by the operator //.
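The behaviour described above can be checked with a short sketch. The variable names are illustrative, and the try/except block is used only to capture the error raised when a String is added to an Integer.

```python
s = 'Hello '
doubled = s + s          # concatenation of two Strings
tripled = 3 * s          # repetition of a String

try:
    s + 5                # str + int is not defined
    error_name = None
except TypeError as err:
    error_name = type(err).__name__
    print('Error:', err)

true_div = 5 / 2         # floating point division in Python 3
floor_div = 5 // 2       # integer (floor) division
print(doubled, tripled, true_div, floor_div)
```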
1.4.1 Slicing
Part of a String can be extracted using the slicing operation. It can be considered as a modified
form of indexing a single character. Indexing using s[a : b] extracts elements s[a] to s[b - 1]. We
can skip one of the indices. If the index on the left side of the colon is skipped, slicing starts from
the first element and if the index on the right side is skipped, slicing ends with the last element.
Example: slice.py
a = 'hello world'
print(a[3:5])
print(a[6:])
print(a[:5])
The reader can guess the nature of the slicing operation from the output of this code, shown below.
lo
world
hello
Please note that specifying a right side index greater than the length of the string is equivalent to
skipping it. Modify slice.py to print the result of a[6 : 20] to demonstrate it.
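A short sketch of this behaviour is given below. It also shows, for contrast, that indexing a single character beyond the end of the string is an error, unlike slicing. The names beyond, to_end and index_ok are illustrative.

```python
a = 'hello world'
beyond = a[6:20]     # right index larger than len(a)
to_end = a[6:]       # right index skipped

try:
    a[20]            # plain indexing is not forgiving
    index_ok = True
except IndexError:
    index_ok = False

print(beyond, to_end, len(a), index_ok)
```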
1.5.1 Tuples
Tuples are similar to lists, but they are immutable.
a = (2.3, 3.5, 234) # make a tuple
a[1] = 10 # will give error
The List data type is very flexible; an element of a list can be another list. We will be using lists
extensively in the coming chapters. Tuple is another data type similar to List, except that it is
immutable. A List is defined inside square brackets; a tuple is defined in a similar manner but inside
parentheses, like (3, 3.5, 234).
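The difference between the two types can be seen in a small sketch. The values are the ones used above; the names b, t, changed and nested are illustrative.

```python
b = [2.3, 3.5, 234]      # list: mutable
b[1] = 10                # allowed

t = (2.3, 3.5, 234)      # tuple: immutable
try:
    t[1] = 10            # raises TypeError
    changed = True
except TypeError:
    changed = False

nested = [1, [2, 3], (4, 5)]   # a list can hold lists and tuples
print(b, t, changed, nested[1][0])
```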
A Dictionary does not allow duplicate keys. If you define it with duplicate keys, the last one will
prevail.
a = {'a':12, 'b':23, 34:3000, 'a':2000}
print (a)
will print
{'a': 2000, 'b': 23, 34: 3000}
It can be seen that the key 'a' gets the value 2000.
It is also possible to read more than one variable using a single input() statement. String type
data read using input() may be converted into integer or float type if it contains only valid
characters. In order to show the effect of conversion explicitly, we multiply the variables by 2
before printing. Multiplying a String by 2 prints it twice. If the String contains invalid characters
then eval() will give an error.
Example: kin2.py
x,y = eval(input('Enter x and y separated by comma '))
print ('The sum is ', x + y)
s = input('Enter a decimal number ')
a = eval(s)
print (s * 2) # prints the string twice
print (a * 2) # prints the converted numeric value times 2
We have learned about the basic data types of Python and how to get input data from the keyboard.
This is enough to try some simple problems and algorithms to solve them.
Example: area.py
pi = 3.1416
r = float(input('Enter Radius ')) # input() returns a String, so convert it
a = pi * r ** 2 # A = πr²
print ('Area = ', a)
The above example calculates the area of a circle. Line three calculates r² using the exponentiation
operator **, and multiplies it by π using the multiplication operator *. r² is evaluated first because
** has higher precedence than *; otherwise the result would be (πr)².
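The effect of precedence can be verified with fixed numbers. This is only a sketch; the radius value 2.0 and the name wrong are chosen for illustration.

```python
pi = 3.1416
r = 2.0
area = pi * r ** 2          # evaluated as pi * (r ** 2)
wrong = (pi * r) ** 2       # what we would get if * were applied first
print(area, wrong)
```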
There are mainly two things to remember about Python syntax: indentation and colon. Python
uses indentation to delimit blocks of code. Both space characters and tab characters are
currently accepted as forms of indentation in Python. Mixing spaces and tabs can create
bugs that are hard to track, since the text editor does not show the difference. There should
not be any extra white spaces in the beginning of any line.
The line before any indented block must end with a colon (:) character.
1.10 Iteration: while and for loops
If programs could only execute from the first line to the last in that order, as shown in the previous
examples, it would be impossible to write any useful program. For example, suppose we need to print the
multiplication table of eight. Using our present knowledge, it would look like the following:
Example: badtable.py
print (1 * 8)
print (2 * 8)
print (3 * 8)
print (4 * 8)
print (5 * 8)
Well, we are stopping here and looking for a better way to do this job.
The solution is to use the while loop of Python. The logical expression following while
is evaluated, and if it is True, the body of the while loop (the indented lines below the while
statement) is executed. The process is repeated until the condition becomes False. We should
have some statement inside the body of the loop that will make this condition False after a few
iterations. Otherwise the program will run in an infinite loop and you will have to press Control-C
to terminate it.
The program table.py defines a variable x and assigns it an initial value of 1. Inside the while
loop x * 8 is printed and the value of x is incremented. This process is repeated until the
value of x becomes greater than 10.
Example: table.py
x = 1
while x <= 10:
    print (x * 8)
    x = x + 1
As per the Python syntax, the while statement ends with a colon and the code inside the while
loop is indented. Indentation can be done using a tab or a few spaces. In this example, we have
demonstrated a simple algorithm.
Example: forloop.py
a = 'Hello'
for ch in a: # ch is the loop variable
    print (ch)
b = ['haha', 3.4, 2345, 3+5j]
for item in b:
    print (item)
a = [2, 5, 3, 4, 12]
size = len(a)
for k in range(size):
    a[k] = 0
print (a)
The output is
[0, 0, 0, 0, 0]
Example: big.py
x = eval(input('Enter a number '))
if x > 10:
    print ('Bigger Number')
elif x < 10:
    print ('Smaller Number')
else:
    print ('Same Number')
The statement x > 10 and x < 15 can be expressed in a short form, like 10 < x < 15.
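The equivalence of the two forms can be checked with a small sketch. The function name in_range is hypothetical and is used only to evaluate both forms for the same value.

```python
def in_range(x):
    long_form = x > 10 and x < 15    # two comparisons joined by and
    short_form = 10 < x < 15         # chained comparison
    return long_form, short_form

for x in (5, 12, 15):
    print(x, in_range(x))
```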
The next example uses the while and if keywords in the same program. Note the level of indentation
when the if statement comes inside the while loop. Remember that the if statement must be
aligned with the corresponding elif and else.
Example: big2.py
x = 1
while x < 11:
    if x < 5:
        print ('Small ', x)
    else:
        print ('Big ', x)
    x = x + 1
print ('Done')
Do not split in the middle of words except for Strings. A long String can be split as shown below.
longname = 'I am so long and will \
not fit in a single line'
print (longname)
1.15 Exercises
We have now covered the minimum essentials of Python: defining variables, performing arithmetic
and logical operations on them, and the control flow statements. These are sufficient for handling
most of the programming tasks. It would be better to get a grip of it before proceeding further,
by writing some code.
1. Modify the expression print(5+3*2) to get a result of 16
2. What will be the output of print (type(4.5))
3. Print all even numbers up to 30, suffixed by a * if the number is a multiple of 6. (hint: use
% operator)
4. Write Python code to remove the last two characters of 'I am a long string' by slicing, without
counting the characters. (hint: use negative indexing)
5. s = '012345' . (a) Slice it to remove the last two elements (b) remove the first two elements.
6. a = [1,2,3,4,5]. Use Slicing and multiplication to generate [2,3,4,2,3,4] from it.
7. Compare the results of 5/2, 5//2 and 2.0/3.
8. Print the following pattern using a while loop
+
++
+++
++++
9. Write a program to read inputs like 8A, 10C etc. and print the integer and alphabet parts
separately.
10. Write code to print a number in the binary format (for example 5 will be printed as 101)
11. Write code to print all perfect cubes upto 2000.
12. Write a Python program to print the multiplication table of 5.
13. Write a program to find the volume of a box with sides 3, 4 and 5 inches in cm³ (1 inch =
2.54 cm)
14. Write a program to find the percentage of volume occupied by a sphere of diameter r fitted
in a cube of side r. Read r from the keyboard.
15. Write a Python program to calculate the area of a circle.
16. Write a program to divide an integer by another without using the / operator. (hint: use -
operator)
17. Count the number of times the character 'a' appears in a String read from the keyboard.
Keep on prompting for the string until there is no 'a' in the input.
18. Create an integer division machine that will ask the user for two numbers then divide and
give the result. The program should give the result in two parts: the whole number result
and the remainder. Example: If a user enters 11 / 4, the computer should give the result 2
and remainder 3.
19. Modify the previous program to avoid division by zero error.
20. Create an adding machine that will keep on asking the user for numbers, add them together
and show the total after each step. Terminate when user enters a zero.
21. Modify the adding machine to check for errors like user entering invalid characters.
22. Create a script that will convert Celsius to Fahrenheit. The program should ask the users to
enter the temperature in Celsius and should print out the temperature in Fahrenheit, using
f = (9/5) c + 32.
23. Write a program to convert Fahrenheit to Celsius.
24. Create a script that uses a variable and will write 20 times "I will not talk in class." Make
each sentence on a separate line.
25. Define 2 + 5j and 2 - 5j as complex numbers, and find their product. Verify the result by
defining the real and imaginary parts separately and using the multiplication formula.
26. Write the multiplication table of 12 using while loop.
27. Write the multiplication table of a number, from the user, using for loop.
28. Print the powers of 2 up to 1024 using a for loop. (only two lines of code)
29. Define the list a = [123, 12.4, 'haha', 3.4]
a) print all members using a for loop
b) print the float type members (use type() function)
c) insert a member after 12.4
d) append more members
30. Make a list containing 10 members using a for loop.
31. Generate multiplication table of 5 with two lines of Python code. (hint: range function)
32. Write a program to find the sum of five numbers read from the keyboard.
33. Write a program to read numbers from the keyboard until their sum exceeds 200. Modify
the program to ignore numbers greater than 99.
34. Write a Python function to calculate the GCD of two numbers
35. Write a Python program to find annual compound interest. Get P, N and R from the user
1.16 Functions
Large programs need to be divided into small logical units. A function is generally an isolated
unit of code that has a name and performs a well defined job. A function groups several program
statements into a unit and gives it a name. This unit can be invoked from other parts of a program.
Python allows you to define functions using the def keyword. A function may have one or more
variables as parameters, which receive their values from the calling program.
In the example shown below, function parameters (a and b) get the values 3 and 4 respectively
from the caller. One can specify more than one variable in the return statement, separated by
commas. The function will return a tuple containing those variables. Some functions may not
have any arguments, but while calling them we need to use empty parentheses, otherwise the
function will not be invoked. If there is no return statement, None is returned to the caller.
Example func.py
def sum(a,b): # a trivial function
    return a + b
print (sum(3, 4)) # a and b receive the values 3 and 4
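The remarks about returning several values, about None and about the parentheses can be illustrated with a sketch. The function names min_max and greet are hypothetical and do not appear in the original examples.

```python
def min_max(a, b):
    # two values after return are packed into a tuple
    if a < b:
        return a, b
    return b, a

def greet():
    # no return statement, so None is returned to the caller
    print('hello')

result = min_max(4, 3)
low, high = result          # a tuple can be unpacked into variables
nothing = greet()           # parentheses are needed to call the function
not_called = greet          # without them we only get the function object
print(result, nothing, callable(not_called))
```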
The function factorial() in factor.py calls itself recursively. The value of the argument is decremented
before each call. Try to understand the working of this by inserting print statements inside the function.
Example factor.py
def factorial(n): # a recursive function
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)
print (factorial(10))
Example fibonacci.py
def fib(n): # print Fibonacci series up to n
    a, b = 0, 1
    while b < n:
        print (b)
        a, b = b, a+b
fib(30) # fib() prints the series itself and returns None
counter = 10
change(5)
print (counter)
The program will print 10 and not 5. The two variables, both named counter, are not related
to each other. In some cases, it may be desirable to allow the function to change some external
variable. This can be achieved by using the global keyword, as shown in global.py.
Example global.py
def change(x):
global counter # use the global variable
counter = x
counter = 10
change(5)
print (counter)
The program will now print 5. Functions with global variables should be used carefully, to avoid
inadvertent side effects.
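The two cases can be placed side by side. This is a sketch; the function names change_local and change_global are hypothetical, and the first one is assumed to behave like the version of change() that leaves the outer counter at 10.

```python
counter = 10

def change_local(x):
    counter = x          # creates a local variable; the global one is untouched
    return counter

def change_global(x):
    global counter       # refer to the variable defined outside the function
    counter = x

change_local(5)
after_local = counter    # the outer counter is unchanged
change_global(5)
after_global = counter   # the outer counter has been modified
print(after_local, after_global)
```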
Python is an object oriented language and all variables are objects belonging to various classes.
The method upper() (a function belonging to a class is called a method) is invoked using the dot
operator. All we need to know at this stage is that there are several methods that can be used for
manipulating objects and they can be invoked like: variable_name.method_name().
Modules are loaded by using the import keyword. Several ways of using import are explained
below, using the math module (containing mathematical functions) as an example.4
Example mathsin2.py
import math as m # Give another name for math
print (m.sin(0.5)) # Refer by the new name
We can also import functions so that they behave like local functions (like the ones within our source file),
as shown below. The character * is a wild card for importing all the functions.
Example mathlocal.py
from math import sin # sin is imported as local
print (sin(0.5))
Example mathlocal2.py
from math import * # import everything from math
print (sin(0.5))
In the third and fourth cases, we need not type the module name every time. But there could be
trouble if two imported modules contain a function with the same name. In the program conflict.py,
the sin() from numpy is capable of handling a list argument. After importing math on line 4, the
sin function from the math module replaces the one from numpy. The error occurs because the sin()
from math can accept only a numeric type argument.
Example conflict.py
from numpy import *
x = [0.1, 0.2, 0.3]
print (sin(x)) # numpy's sin can handle lists
from math import * # sin of math becomes effective
print (sin(x)) # will give ERROR
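The same kind of name clash can be reproduced without NumPy. The sketch below uses the standard modules math and cmath, which both define sqrt(). Explicit names are imported instead of * to keep the example short, and the module-qualified form at the end avoids the clash altogether.

```python
from math import sqrt          # real-valued square root
first = sqrt(4)                # a float

from cmath import sqrt         # the name sqrt now refers to cmath's version
second = sqrt(4)               # a complex number

import math
import cmath
safe_real = math.sqrt(4)       # the module name removes the ambiguity
safe_complex = cmath.sqrt(-4)  # square root of a negative number
print(first, second, safe_real, safe_complex)
```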
4 While giving names to your Python programs, make sure that you are not directly or indirectly importing any
Python module having the same name. For example, if you create a program named math.py and keep it in your working
directory, the import math statement from any other program started from that directory will try to import your
file named math.py and give an error. If you ever do that by mistake, delete all the files with .pyc extension from your
directory.
1.18.2 Packages
Packages are used for organizing multiple modules. The module name A.B designates a module
named B in a package named A. The concept is demonstrated using the packages Numpy5 and
Scipy.
Example submodule.py
import numpy as np
print (np.random.normal())
import scipy.special
print (scipy.special.j0(.1))
In this example random is a module inside the package NumPy. Similarly, special is a module inside
the package Scipy. We use both of them in the package.module.function() format. But there is
a difference. In the case of NumPy, the random module is loaded by default, whereas importing scipy
does not import the module special by default. This behavior is defined while writing the
package and it is up to the author of the package.
The above program creates a new file named 'test.dat' (any existing file with the same name will be
deleted) and writes a String to it. The following program opens this file for reading the data.
Example rfile.py
f = open('test.dat' , 'r')
print (f.read())
f.close()
Note that the data written/read are character strings. The read() function can also be used to read a
fixed number of characters, as shown below.
Example rfile2.py
f = open('test.dat' , 'r')
print (f.read(7)) # get first seven characters
print (f.read()) # get the remaining ones
f.close()
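Writing and reading can be combined in one sketch, using the with statement so that the file is closed automatically. The file is created in a temporary directory here (an assumption, made to avoid overwriting anything), and the String written is only illustrative.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'test.dat')

with open(path, 'w') as f:       # 'w' deletes the contents of an existing file
    f.write('This is a test file')

with open(path, 'r') as f:
    head = f.read(7)             # first seven characters
    rest = f.read()              # everything after them
print(repr(head), repr(rest))
```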
5 NumPy will be discussed later in chapter 2.
Now we will examine how to read text data from a file and convert it into numeric type. First
we will create a file with a column of numbers.
Example wfile2.py
f = open('data.dat' , 'w')
for k in range(1,4):
    s = '%3d\n' %(k)
    f.write(s)
f.close()
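The column of numbers can be read back and converted line by line. The listing below is a sketch, not the original program: it writes the same three numbers to a temporary location (an assumption, to keep the example self-contained), adds one blank line, skips blank lines while reading, and converts each remaining line with float().

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.dat')
with open(path, 'w') as f:
    for k in range(1, 4):
        f.write('%3d\n' % k)
    f.write('\n')                 # a blank line, to test the skipping

numbers = []
with open(path, 'r') as f:
    for line in f:
        s = line.strip()          # remove spaces and the newline character
        if len(s) < 1:
            continue              # skip blank lines
        numbers.append(float(s))  # convert the String to a number
print(numbers, sum(numbers))
```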
Data can be printed in various formats. The conversion types are summarized in the following
table. There are several flags that can be used to modify the formatting, like justification, filling
etc.
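A few common conversion types and flags of the % operator are sketched below. The values and variable names are arbitrary and are chosen only for illustration.

```python
a = 2.0 / 3
n = 42

as_float = '%5.3f' % a        # width 5, three decimal places
as_exp = '%e' % a             # exponential notation
as_int = '%5d' % n            # right justified in 5 columns
left = '%-5d|' % n            # '-' flag: left justified
zero = '%05d' % n             # '0' flag: filled with zeros
as_hex = '%x' % 255           # hexadecimal
print(as_float, as_exp, as_int, left, zero, as_hex)
```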
The following example shows some of the features available with formatted printing.
Example: format2.py
6 This will give an error if there is a blank line in the data file. This can be corrected by changing the comparison
statement to if len(s) < 1: , so that the processing stops at a blank line. Modify the code to skip a blank line
instead of exiting (hint: use continue ).
If any exception occurs while running the code inside the try block, the code inside the except
block is executed. The following program implements error checking on input using exceptions.
Example: except2.py
def get_number():
    while 1:
        try:
            a = input('Enter a number ')
            x = float(a) # raises ValueError for invalid input
            return x
        except ValueError:
            print ('Enter a valid number')
print (get_number())
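The same try and except logic can be tried without typing at the keyboard. In the sketch below the function name to_number is hypothetical; it converts a String and returns None when the conversion fails.

```python
def to_number(text):
    # return a float, or None if text is not a valid number
    try:
        return float(text)
    except ValueError:
        return None

for text in ['3.5', '10', 'abc', '1e3']:
    print(repr(text), '->', to_number(text))
```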
1.22 Exercises
1. Generate multiplication table of eight and write it to a file.
2. Write a Python program to open a file and write 'hello world' to it.
3. Write a Python program to open a text file and read all lines from it.
4. Write a program to generate the multiplication table of a number from the user. The output
should be formatted as shown below
1 x 5 = 5
2 x 5 = 10
5. Write Python code to generate the sequence of numbers
25 20 15 10 5
using the range() function. Delete 15 from the result and sort it. Print it using a for loop.
6. Define a string s = 'mary had a little lamb'.
a) print it in reverse order
b) split it using space character as separator
7. Join the elements of the list ['I', 'am', 'in', 'pieces'] using + character. Do the same using a
for loop also.
8. Create a program that will check a sentence to see if it is a palindrome. A palindrome is a
sentence that reads the same backwards and forwards ('malayalam').
9. Read a String from the keyboard. Multiply it by an integer to make its length more than
50. How do you find out the smallest number that does the job?
10. Write a program using for loop to reverse a string.
11. Find the syntax error in the following code and correct it.
x=1
while x <= 10:
print x * 5
12. Write a function that returns the sum of numbers in a list.
13. Write a function that returns the area of a circle; the default radius should be taken as 1.
14. Split 'hello world' and join back to get 'hello+world'.
15. Read a 5 digit integer from keyboard and print the sum of the digits.
16. Modify the list [1,2,3,4] to make it [1,2,3,10]
Chapter 2
Example numpy2.py
import numpy as np
a = [ [1, 2, 3] , [4, 5, 6] ] # make a list of lists
x = np.array(a) # and convert to an array
print (x)
[[1 2 3]
[4 5 6]]
Other than array(), there are several other functions that can be used for creating different
types of arrays. Some of them are described below.
1 https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.array-creation.html
CHAPTER 2. ARRAYS AND MATRICES 26
2.1.3 zeros(shape)
Returns a new array of given shape and type, filled with zeros. For example np.zeros([2,3]) generates
a 2 x 3 array filled with zeros
[[0.,0.,0.]
[0.,0.,0.]]
2.1.4 ones(shape)
Similar to np.zeros() except that the elements are initialized to 1.
2.1.5 random.random(shape)
Similar to the functions above, but the matrix is filled with random numbers ranging from 0 to 1,
of float type. For example, np.random.random([3,3]) will generate a 3x3 matrix like:
array([[ 0.3759652 , 0.58443562, 0.41632997],
[ 0.88497654, 0.79518478, 0.60402514],
[ 0.65468458, 0.05818105, 0.55621826]])
2.1.7 Copying
Numpy arrays can be copied using the copy method, as shown below.
Example array_copy.py
import numpy as np
a = np.zeros(3)
print (a)
b = a # b refers to the same array as a
c = a.copy() # c is an independent copy
c[0] = 10
print (a, c)
b[0] = 10
print (a, c)
The output of the program is shown below. The statement b = a does not make a copy of a.
Modifying b affects a, but c is a separate entity.
[ 0. 0. 0.]
[ 0. 0. 0.] [ 10. 0. 0.]
[ 10. 0. 0.] [ 10. 0. 0.]
import numpy as np
a = np.array([[2,3], [4,5]])
print(a)
np.savetxt('mydata.txt', a, fmt = '%2.3f', delimiter = ' ')
print('Data saved to text file')
b = np.loadtxt("mydata.txt", delimiter = ' ')
print('Data loaded from text file')
print(b)
and extract elements from it in different ways. Python indexing starts from zero.
x[1] extracts the second row to give [3,4,5].
x[2,1] extracts 7, the element at third row and second column.
x[1:3, 1] extracts the elements [4,7] from second column.
x[:, 1] gives the full second column [1, 4, 7, 10].
x[:2,:2] gives the array
[[0, 1],
[3, 4]]
You should remember that a statement like y = x[1] does not create a new array. It is only a
reference to the second row of x. The statement y[0] = 100 changes x[1, 0] to 100.
The program slice.py
import numpy as np
x = np.reshape(range(12),(4,3))
print(x[1])
print(x[2,1])
print(x[1:3,1])
print(x[:,1])
print(x[:2,:2])
will generate the output
[3 4 5]
7
[4 7]
[ 1 4 7 10]
[[0 1]
[3 4]]
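The remark that y = x[1] is only a reference can be verified with a short sketch, assuming NumPy is installed. Here np.arange(12).reshape(4, 3) builds the same 4x3 array, and np.shares_memory() is used only to confirm which arrays share data; the name z is illustrative.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)
y = x[1]              # a view of the second row, not a new array
y[0] = 100            # also changes x[1, 0]

z = x[1].copy()       # an independent copy of the second row
z[1] = -1             # x is not affected
print(x[1], z, np.shares_memory(x, y))
```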
2.2 Exercises
1. Write code to make a one dimensional array with elements 5,10,15,20 and 25. Make another
matrix by slicing the first three elements from it.
2. Create a 3 × 2 array and print the sum of its elements using for loops.
3. Create a 2 × 3 array and fill it with random numbers.
4. Use linspace to make an array from 0 to 10, with step-size of 0.1
5. Use arange to make a 100 element array ranging from 0 to 10
6. Make an array a = [2,3,4,5] and copy it to b. change one element of b and print both.
7. Make a 3x3 array and multiply it by 5.
8. Create two 3x3 arrays and add them.
9. Create a 4x3 matrix using range() and reshape().
10. Extract the second row from a 4x3 matrix
11. Extract the second column from a 3x3 matrix
12. Create a 2D array [ [1,2] , [3,4] ]. Save it to a file, read it back and print.
Chapter 3
Data visualization
A graph or chart is used to present numerical data in visual form. A graph is one of the easiest
ways to compare numbers. They should be used to make facts clearer and more understandable.
Results of mathematical computations are often presented in graphical format. In this chapter, we
will explore the Python modules used for generating two and three dimensional graphs of various
types.
3.1.1 2D plots
Let us start with some simple plots to become familiar with matplotlib.
Example plot1.py
import matplotlib.pyplot as plt
data = [1,2,5]
plt.plot(data)
plt.show()
1 https://www.tutorialspoint.com/matplotlib/matplotlib_quick_guide.htm
https://matplotlib.org/
https://matplotlib.org/tutorials/introductory/pyplot.html#sphx-glr-tutorials-introductory-pyplot-py
CHAPTER 3. DATA VISUALIZATION 31
In the above example, the x-axis of the three points is taken from 0 to 2. We can specify both the
axes as shown below.
Example plot2.py
import matplotlib.pyplot as plt
x = [1,2,5]
y = [4,5,6]
plt.plot(x,y)
plt.show()
By default, the color is blue and the line style is continuous. This can be changed by an optional
argument after the coordinate data, which is the format string that indicates the color and line
type of the plot. The default format string is `b-` (blue, continuous line). Let us rewrite the above
example to plot using red circles. We will also set the ranges for x and y axes and label them, as
shown in plot3.py.
Example plot3.py
import matplotlib.pyplot as plt
x = [1,2,5]
y = [4,5,6]
plt.plot(x,y,'ro')
plt.xlabel('x-axis')
plt.ylabel('y-axis')
plt.axis([0,6,1,7])
plt.show()
Figure 3.1 shows two different plots in the same window, using different markers and colors.
It also shows how to use legends.
Example plot4.py
import numpy as np
import matplotlib.pyplot as plt
t = np.arange(0.0, 5.0, 0.2)
plt.plot(t, t**2, 'x', label='t^2') # t squared
plt.plot(t, t**3, 'ro', label='t^3') # t cubed
plt.legend(framealpha=0.5)
plt.show()
We have just learned how to draw a simple plot using the pylab interface of matplotlib. Another
way to call numpy and matplotlib is shown below.
import numpy as np
import matplotlib.pyplot as plt
t = np.arange(0.0, 5.0, 0.2)
plt.plot(t, t**2, color='red', marker='x')
plt.show()
Introduction to Pandas
We have seen that NumPy handles homogeneous numerical array data. Pandas is designed for
working with tabular or heterogeneous data, like contents of a spreadsheet. Pandas shares many
coding idioms from NumPy. To get started with pandas, you will need to get familiar with its two
main data structures; Series and DataFrame.
4.1 Series
We have seen that the elements of a one dimensional array of N elements are indexed using integers
ranging from 0 to N-1. A Pandas Series data type can be considered as an extension to this, where
the indices can also be specified. If not specified, it takes the default, integers ranging from 0 to
N-1.
We can create a Pandas Series using:
import pandas as pd
ser = pd.Series([4, 12, 55, 100])
print (ser)
The first column contains the indices and the second column the values. It looks very similar to a
column in a spreadsheet. We can also specify the indices, as shown below.
import pandas as pd
ser = pd.Series([4, 12, 55, 100], index=['a', 'b', 'c', 'd'])
print (ser)
CHAPTER 4. INTRODUCTION TO PANDAS 34
a 4
b 12
c 55
d 100
dtype: int64
The output is
b 12
a 230
34 3000
dtype: int64
You may note that the duplicate entry of index 'a' has been removed, and the latest value (230)
has prevailed. This is the behavior of a Dictionary. A Pandas Index, however, can contain duplicate labels,
as demonstrated below.
import pandas as pd
ps = pd.Series([4, 7, -5, 3], index=['d', 'a', 'a', 'c'])
print(ps)
It can be seen that the index 'a' is appearing twice. Pandas supports non-unique index values.
If an operation that does not support duplicate index values is attempted, an exception will be
raised at that time.
Index of a Series can be altered in-place by assignment. The exact number of indices should
be given.
ser = pd.Series([10, 20])
ser.index = ['a', 'b']
print (ser)
prints the following. The default indices 0 and 1 are replaced by 'a' and 'b' respectively.
a 10
b 20
dtype: int64
Reindexing
It is possible to re-index a Series. This may alter the number of indices also. The relation between
an index and the corresponding value will be preserved during this operation.
import pandas as pd
ser = pd.Series([10, 20], index =['a', 'b'])
ser = ser.reindex(['b', 1, 2])
print (ser)
The result is
b 20.0
1 NaN
2 NaN
dtype: float64
It can be seen that the value corresponding to the old index 'b' is preserved. The new indices 1 and
2 are assigned with a NaN (Not a Number) value. The function isnull() tests for the NA values.
A key dierence between Series and ndarray is that operations between Series automatically align
the data based on index.
print (pd.isnull(ser))
prints
b False
1 True
2 True
dtype: bool
Series automatically aligns by index label in arithmetic operations, as demonstrated below.
s1 = pd.Series([10, 20, 30,40], index =[0,1,2,3])
s2 = pd.Series([12, 13, 14], index =[1,2,3])
s = s1 + s2
print (s)
The output is given below.
0 NaN
1 32.0
2 43.0
3 54.0
dtype: float64
Index 0 and the corresponding value are available in s1, but they are missing in s2, so the addition
of 10 and a non-existent value results in a NaN value. Values corresponding to the other three
indices are added. Data always align with the index.
Using NumPy functions or NumPy-like operations, such as filtering with a boolean array, scalar
multiplication, or applying math functions, will also preserve the index-value link. By default,
operations between differently indexed objects yield the union of the indexes in order to
avoid loss of information. You have the option of dropping labels with missing data via the dropna
function.
print ( s.dropna() )
1 32.0
2 43.0
3 54.0
dtype: float64
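Dropping is not the only option. As a sketch (assuming pandas is installed), the add() method of a Series accepts a fill_value that is used in place of a missing partner, so that no NaN is produced. The names plain and filled are illustrative.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30, 40], index=[0, 1, 2, 3])
s2 = pd.Series([12, 13, 14], index=[1, 2, 3])

plain = s1 + s2                      # index 0 has no partner in s2
filled = s1.add(s2, fill_value=0)    # treat the missing value as 0
print(plain.isnull().sum(), filled.tolist())
```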
Name attribute
Both the Series object itself and its index have a name attribute.
s = pd.Series([34, 56], index = ['John', 'Ravi'], name='ages')
s.index.name = 'names'
print (s.name)
print (s.index.name)
prints
ages
names
4.2 DataFrame
A DataFrame represents a tabular, spreadsheet-like data structure containing an ordered collection
of columns, each of which can be a different data type. It has both a row and a column index;
it can be thought of as a dictionary of Series (sharing
the same index). DataFrames can be defined in several different ways:
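The original listing is not reproduced at this point. As a sketch of one of these ways (assuming pandas is installed), a DataFrame can be built from a dictionary of Series. The column names and values below are chosen only for illustration, and the first Series deliberately lacks the index 'd'.

```python
import pandas as pd

d = {'one': pd.Series([1.0, 2.0, 3.0], index=['a', 'b', 'c']),
     'two': pd.Series([1.0, 2.0, 3.0, 4.0], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)     # the dictionary keys become the column names
print(df)
```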
It can be seen that each column is made from a Series, where the keys become the column names.
A union of all the Series indices becomes the DataFrame index column. If any Series is missing an
index, the corresponding value is assigned a NaN.
will print
one two
a 1.0 4.0
b 2.0 3.0
c 3.0 2.0
d 4.0 1.0
The number of elements in each list/array in the dictionary should match the number of
elements in the index list. The value from each dictionary item forms a column with its key as
the label. The values can be arrays or lists.
There are several other methods to define a DataFrame that are not discussed here. We will
focus more on creating DataFrames by loading table data from files.
Index Properties
Pandas Index can contain duplicate labels.
will print
A A C D
1 0 1 2 3
2 4 5 6 7
2 8 9 10 11
3 12 13 14 15
4 16 17 18 19
Column selection
The examples given below assume the DataFrame created in the previous section.
print (df['one'])
will print
a 1.0
b 2.0
c 3.0
d 4.0
Name: one, dtype: float64
We can also print a selection of the column elements, by applying some condition like,
print (df['one'] > 2)
will print
a False
b False
c True
d True
Name: one, dtype: bool
df.query('one > 2')
The query() and assign() functions can be combined to create a new DataFrame as shown below.
ndf = df.query('one > 2').assign(ratio = df['one']/ df['two'])
print(ndf)
Deleting a column
The del keyword will delete columns as with a dict. You can also pop a column out into a new Series.
del df['two']
s1 = df.pop('one')
print (s1)
will print
a 1.0
b 2.0
c 3.0
d 4.0
Name: one, dtype: float64
The last two lines print the same thing, the third row of the DataFrame. Use loc when the index
label is used. Use iloc when the integer row number is used. The extracted row is a
Series and is printed in the column format.
one 3.0
two 2.0
Name: b, dtype: float64
one 3.0
two 2.0
Name: b, dtype: float64
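The difference between loc and iloc can be sketched as follows, assuming pandas is installed. The DataFrame is rebuilt here with the columns 'one' and 'two' and the index labels 'a' to 'd' used earlier; these values are an assumption made to keep the example self-contained.

```python
import pandas as pd

df = pd.DataFrame({'one': [1.0, 2.0, 3.0, 4.0],
                   'two': [4.0, 3.0, 2.0, 1.0]},
                  index=['a', 'b', 'c', 'd'])
by_label = df.loc['c']       # select a row by its index label
by_position = df.iloc[2]     # select a row by its integer position
print(by_label)
print(by_label.equals(by_position))
```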
Transposing a DataFrame
This operation interchanges the rows and columns.
A B C D
0 0.0 2.0 4.0 NaN
1 7.0 9.0 11.0 NaN
2 14.0 16.0 18.0 NaN
3 NaN NaN NaN NaN
4 NaN NaN NaN NaN
The addition of elements takes place only where elements from both the frames are present.
Wherever the operation is not possible, the result is filled with a NaN value.