Pandas - 1
Question 1
Name the Pandas object that can store one dimensional array like object and can have numeric or labelled
indexes.
Answer
The Pandas object that can store one-dimensional array like objects with numeric or labeled indexes is
called a "Series Object".
Question 3
To get the size of the datatype of the items in Series object, you can display ............... attribute.
1. index
2. size
3. itemsize
4. ndim
Answer
itemsize
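Reason — The itemsize attribute gives the number of bytes allocated to each data item in a
Series object. The syntax is <Series object>.itemsize. This attribute was removed from Series in
pandas 1.0; in newer versions the same value is available as <Series object>.dtype.itemsize.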
Question 6
To get the number of elements in a Series object, ............... attribute may be used.
1. index
2. size
3. itemsize
4. ndim
Answer
size
Reason — The size attribute is used to know the number of elements in the Series object. The syntax is
<Series object>.size.
Question 7
To get the number of bytes of the Series data, ............... attribute is displayed.
1. hasnans
2. nbytes
3. ndim
4. dtype
Answer
nbytes
Reason — The nbytes attribute is used to know total number of bytes taken by Series object data. The
syntax is <Series object>.nbytes.
Question 8
To check if the Series object contains NaN values, ............... attribute is displayed.
1. hasnans
2. nbytes
3. ndim
4. dtype
Answer
hasnans
Reason — The hasnans attribute is used to check if a Series object contains some NaN value or not. The
syntax is <Series object>.hasnans.
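The attributes covered in Questions 3 to 8 can be tried out together. The following is a minimal sketch on a hypothetical Series (the data values and variable names are assumptions). It reads the per-item size through dtype.itemsize, since Series.itemsize is not available in recent pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; one value is missing on purpose.
s = pd.Series([10.0, 20.0, np.nan, 40.0])

n_items = s.size               # number of elements, NaN included
n_bytes = s.nbytes             # total bytes taken by the data
item_bytes = s.dtype.itemsize  # bytes per element (float64 uses 8)
has_missing = s.hasnans        # True because one value is NaN

print(n_items, n_bytes, item_bytes, has_missing)
```

Here nbytes equals size multiplied by the per-item size, which is a quick way to cross-check the three values.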
Question 9
To display first three elements of a Series object S, you may write ............... .
1. S[:3]
2. S[3]
3. S[3rd]
4. all of these
Answer
S[:3]
Reason — The syntax to extract slices from Series object is <Series Object>[start:end:step].
Therefore, according to this syntax, the correct slice notation to display the first three elements of a Series
object S is S[:3].
Question 11
To display last five rows of a Series object S, you may write ............... .
1. head()
2. head(5)
3. tail()
4. tail(5)
Answer
tail(), tail(5)
Reason — The syntax to display the last n rows of a Series object is <Series Object>.tail([n]).
Therefore, according to this syntax, tail(5) will display last five rows of a Series object S. If n value is
not specified, then tail() will return the last 5 rows of a Series object.
Question 12
To display last five rows of a series object 'S', you may write :
1. S.Head()
2. S.Tail(5)
3. S.Head(5)
4. S.tail()
Answer
S.tail()
Reason — The syntax to display the last n rows of a Series object is <Series Object>.tail([n]).
Therefore, according to this syntax, S.tail() will display last five rows of a Series object S.
Question 16
In Python Pandas, while performing mathematical operations on series, index matching is implemented
and all missing values are filled in with ............... by default.
1. Null
2. Blank
3. NaN
4. Zero
Answer
NaN
Reason — When performing mathematical operations on pandas Series objects, index matching is
implemented (this is called data alignment in Pandas objects), and missing values are filled with NaN
(Not a Number) by default.
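Data alignment can be seen with two small Series whose indexes only partly overlap. This is an illustrative sketch; the labels and values are made up for the example.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20], index=["y", "z"])

# Values are added label by label; 'x' has no partner in b, so it becomes NaN.
total = a + b
print(total)
```

The result keeps the label 'x' but stores NaN for it, and the result dtype becomes float64 so that NaN can be represented.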
Question 18
Given a Pandas series called Sequences, the command which will display the first 4 rows is ............... .
1. print(Sequences.head(4))
2. print(Sequences.Head(4))
3. print(Sequences.heads(4)
4. print(Sequences.Heads(4))
Answer
print(Sequences.head(4))
Reason — The syntax to display the first n rows from a Series object is <Series object>.head([n]).
Therefore, according to this syntax, the command to display the first 4 rows of Sequences is
print(Sequences.head(4)).
Question 19
When we create a DataFrame from a list of Dictionaries the columns labels are formed by the :
1. Union of the keys of the dictionaries
2. Intersection of the keys of the dictionaries
3. Union of the values of the dictionaries
4. Intersection of the values of the dictionaries
Answer
Union of the keys of the dictionaries
Reason — When we create a DataFrame from a list of dictionaries, the column labels are formed by the
union of the keys of the dictionaries.
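A short sketch of this behaviour, using two hypothetical dictionaries with different keys:

```python
import pandas as pd

# The two dictionaries share only the key 'b'.
rows = [{"a": 1, "b": 2}, {"b": 3, "c": 4}]
df = pd.DataFrame(rows)

# Column labels are the union of all keys; missing cells are filled with NaN.
print(list(df.columns))
print(df)
```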
Question 22
If a DataFrame is created using a 2D dictionary, then the indexes/row labels are formed from ............... .
1. dictionary's values
2. inner dictionary's keys
3. outer dictionary's keys
4. none of these
Answer
inner dictionary's keys
Reason — When a DataFrame is created using a 2D dictionary, then the indexes/row labels are formed
from keys of inner dictionaries.
Question 23
If a dataframe is created using a 2D dictionary, then the column labels are formed from ............... .
1. dictionary's values
2. inner dictionary's keys
3. outer dictionary's keys
4. none of these
Answer
outer dictionary's keys
Reason — When a DataFrame is created using a 2D dictionary, then the column labels are formed from
keys of outer dictionaries.
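Questions 22 and 23 can be checked with a small 2D dictionary. The marks data below is hypothetical; only the way the labels are formed matters.

```python
import pandas as pd

# Outer keys are subjects, inner keys are student names (made-up data).
marks = {
    "Maths": {"Asha": 90, "Ravi": 75},
    "Science": {"Asha": 85, "Ravi": 80},
}
df = pd.DataFrame(marks)

print(list(df.columns))  # formed from the outer dictionary's keys
print(list(df.index))    # formed from the inner dictionaries' keys
```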
Question 24
Which of the following can be used to specify the data while creating a DataFrame ?
1. Series
2. List of Dictionaries
3. Structured ndarray
4. All of these
Answer
All of these
Reason — We can create a DataFrame object by passing data in many different ways, such as two-
dimensional dictionaries (i.e., dictionaries having lists or dictionaries or ndarrays or series objects etc),
two-dimensional ndarrays, series type object and another DataFrame object.
Question 25
To get a number representing number of axes in a dataframe, ............... attribute may be used.
1. size
2. shape
3. values
4. ndim
Answer
ndim
Reason — The ndim attribute will return an integer representing the number of axes/array dimensions.
Question 30
To display the 3rd, 4th and 5th columns from the 6th to 9th rows of a dataframe DF, you can
write ............... .
1. DF.loc[6:9, 3:5]
2. DF.loc[6:10, 3:6]
3. DF.iloc[6:10, 3:6]
4. DF.iloc[6:9, 3:5]
Answer
DF.iloc[6:10, 3:6]
Reason — To display a subset of a dataframe using row and column numeric index/position, iloc is used
with syntax <DF object>.iloc[<start row index>:<end row index>, <start col index>:<end
col index>], where the end positions are excluded. Taking the row and column numbers in the question
as integer positions, DF.iloc[6:10, 3:6] selects rows 6 to 9 and columns 3 to 5, so it is the correct
slice notation among the given options.
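The effect of the slice can be verified on any dataframe that is large enough. The sketch below assumes a hypothetical 12 x 8 dataframe with default integer labels.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with 12 rows and 8 columns, labelled 0 onwards.
DF = pd.DataFrame(np.arange(96).reshape(12, 8))

# End positions are excluded: rows 6, 7, 8, 9 and columns 3, 4, 5.
subset = DF.iloc[6:10, 3:6]
print(subset.shape)
print(subset)
```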
Question 34
To change the 5th column's value at 3rd row as 35 in dataframe DF, you can write ............... .
1. DF[4, 6] = 35
2. DF[3, 5] = 35
3. DF.iat[4, 6] = 35
4. DF.iat[3, 5] = 35
Answer
DF.iat[3, 5] = 35
Reason — The syntax to modify a value using row and column position is <DataFrame>.iat[<row
position>, <column position>] = <new value>. Taking the numbers in the question as integer positions,
DF.iat[3, 5] = 35 changes the value at row position 3, column position 5 to 35 in dataframe DF.
Question 35
Which among the following options can be used to create a DataFrame in Pandas ?
1. A scalar value
2. An ndarray
3. A python dict
4. All of these
Answer
All of these
Reason — We can create a DataFrame object in Pandas by passing data in many different ways, such as
an ndarray, a Python dictionary, or a scalar value (a scalar is repeated in every cell, provided the index
and columns are also specified).
Question 36
Identify the correct option to select first four rows and second to fourth columns from a DataFrame 'Data':
1. display(Data.iloc[1 : 4, 2 : 4])
2. display(Data.iloc[1 : 5, 2 ; 5])
3. print(Data.iloc[0 : 4, 1 : 4])
4. print(Data.iloc[1 : 4, 2 : 4])
Answer
print(Data.iloc[0 : 4, 1 : 4])
Reason — To display subset from dataframe using row and column numeric index/position, iloc is used
with syntax <DF object>.iloc[<start row index>:<end row index>, <start col index>:<end
col index>]. Therefore, according to this syntax, print(Data.iloc[0 : 4, 1 : 4]) is correct
statement to display first four rows and second to fourth columns from a DataFrame Data.
Question 38
Sudhanshu has written the following code to create a DataFrame with boolean index :
import numpy as np
import pandas as pd
df = pd.DataFrame(data = [[5, 6, 7]], index = [true, false, true])
print(df)
While executing the code, she is getting an error, help her to rectify the code :
1. df = pd.DataFrame([True, False, True], data = [5, 6, 7])
2. df = pd.DataFrame(data = [5, 6, 7], index = [True, False, True])
3. df = pd.DataFrame([true, false, true], data = [5, 6, 7])
4. df = pd.DataFrame(index = [true, false, true], data = [[5, 6, 7]])
Answer
df = pd.DataFrame(data = [5, 6, 7], index = [True, False, True])
Reason — The index values 'true' and 'false' should have the first letter capitalized to match Python's
boolean values. Also, the 'data' parameter should contain the list of values to be included in the
DataFrame. Hence, df = pd.DataFrame(data = [5, 6, 7], index = [True, False, True]) is
correct.
Fill in the Blanks
Question 1
The len() function on Series object returns total elements in it including NaNs.
Question 7
The count() function on Series object returns only the count of non-NaN values in it.
Question 8
To access individual value, you can use DF.at using row/column index labels.
Question 14
To access individual value, you can use DF.iat using row/column integer position.
Question 15
The rename() function requires inplace argument to make changes in the original dataframe.
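The difference between len() and count(), and between at and iat, can be tried out as follows. This is a small sketch on made-up data; the names s and DF are assumptions.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])
total = len(s)           # counts every element, including the NaN
non_missing = s.count()  # counts only the non-NaN values

DF = pd.DataFrame({"Qty": [5, 8]}, index=["a", "b"])
by_label = DF.at["b", "Qty"]  # row label and column label
by_position = DF.iat[1, 0]    # row position and column position

print(total, non_missing, by_label, by_position)
```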
True/False Questions
Question 1
A Series object can store only homogeneous (same type of) elements.
Answer
True
Reason — A Series object in Pandas can store only homogeneous elements, meaning all elements must
be of the same data type.
Question 12
The del statement can remove the rows as well as columns in a dataframe.
Answer
False
Reason — The del statement is used to delete columns in a DataFrame, while the drop() function is
used to delete rows from a DataFrame.
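A minimal sketch of the two operations on a hypothetical dataframe:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

del df["B"]           # removes column 'B' from df itself
trimmed = df.drop(1)  # returns a new dataframe without the row labelled 1

print(df)
print(trimmed)
```

Note that drop() returns a new object by default, so df still has all three rows after the call.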
Question 14
Assertion (A). To use the Pandas library in a Python program, one must import it.
Reasoning (R). The only alias name that can be used with the Pandas library is pd.
1. Both A and R are true and R is the correct explanation of A.
2. Both A and R are true but R is not the correct explanation of A.
3. A is true but R is false.
4. A is false but R is true.
Answer
A is true but R is false.
Explanation
In order to work with Pandas in Python, we need to import the Pandas library into our Python
environment using the statement import pandas as pd. While pd is a common alias used with the
Pandas library, it's not the only alias that can be used. We can import Pandas using other alias names as
well.
Question 2
Assertion. A dataframe is a 2D data structure which is value mutable and size mutable.
Reason. Every change in a dataframe internally creates a new dataframe object.
1. Both A and R are true and R is the correct explanation of A.
2. Both A and R are true but R is not the correct explanation of A.
3. A is true but R is false.
4. A is false but R is true.
Answer
A is true but R is false.
Explanation
A DataFrame is a two-dimensional data structure that is both value-mutable and size-mutable. This means
that we can modify the values within a DataFrame, change its size once it's created, and add or drop
elements in an existing DataFrame object without creating a new DataFrame internally.
Question 4
Assertion. Arithmetic operations on two series objects take place on matching indexes.
Reason. Non-matching indexes are removed from the result of arithmetic operation on series objects.
1. Both A and R are true and R is the correct explanation of A.
2. Both A and R are true but R is not the correct explanation of A.
3. A is true but R is false.
4. A is false but R is true.
Answer
A is true but R is false.
Explanation
Arithmetic operations on two Series objects take place on matching indexes. When performing operations
on objects with non-matching indexes, Pandas aligns the indexes and adds values for matching indexes,
resulting in NaN (Not a Number) for non-matching indexes in both objects.
Question 7
Assertion. Arithmetic operations on two series objects take place on matching indexes.
Reason. For non-matching indexes of series objects in an arithmetic operation, NaN is returned.
1. Both A and R are true and R is the correct explanation of A.
2. Both A and R are true but R is not the correct explanation of A.
3. A is true but R is false.
4. A is false but R is true.
Answer
Both A and R are true and R is the correct explanation of A.
Explanation
Arithmetic operations on two Series objects take place on matching indexes. When performing operations
on objects with non-matching indexes, Pandas aligns the indexes and adds values for matching indexes,
resulting in NaN (Not a Number) for non-matching indexes in both objects.
Question 8
Assertion. While changing the values of a column in a dataframe, if the column does not exist, an error
occurs.
Reason. If values are provided for a non-existing column in a dataframe, a new column is added with
those values.
1. Both A and R are true and R is the correct explanation of A.
2. Both A and R are true but R is not the correct explanation of A.
3. A is true but R is false.
4. A is false but R is true.
Answer
A is false but R is true.
Explanation
Changing the values of a column that does not exist in a dataframe does not cause an error. Instead, a
new column with those values is added to the dataframe.
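This can be confirmed with a short sketch (the dataframe and its values are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})

df["B"] = [10, 20, 30]  # 'B' does not exist yet, so a new column is added
df["A"] = 0             # 'A' exists, so its values are replaced

print(df)
```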
Question 9
Assertion. .loc() is a label based data selecting method to select a specific row(s) or column(s) which we
want to select.
Reason. .iloc() can not be used with default indices if customized indices are provided.
1. Both A and R are true and R is the correct explanation of A.
2. Both A and R are true but R is not the correct explanation of A.
3. A is true but R is false.
4. A is false but R is true.
Answer
A is true but R is false.
Explanation
The .loc() is a label-based method in Pandas used for selecting specific rows or columns based on their
labels (indices). The .iloc() method, in contrast, works with 0-based integer positions, so it can still be
used even if customized indices are provided. It is primarily used for integer-location based indexing.
Question 10
Assertion. DataFrame has both a row and column index.
Reason. A DataFrame is a two-dimensional labelled data structure like a table of MySQL.
1. Both A and R are true and R is the correct explanation of A.
2. Both A and R are true but R is not the correct explanation of A.
3. A is true but R is false.
4. A is false but R is true.
Answer
Both A and R are true and R is the correct explanation of A.
Explanation
A DataFrame in Pandas has both a row index and a column index. It is a two-dimensional labeled data
structure, similar to a table in MySQL, in which each value is identified by the combination of its row
and column indices.
Type A: Very Short Answer Questions
Question 1
How is a Series object different from and similar to ndarrays ? Support your answer with examples.
Answer
A Series object in Pandas is both similar to and different from ndarrays (NumPy arrays).
Similarities:
Both Series and ndarrays store homogeneous data, meaning all elements must be of the same data type
(e.g., integers, floats, strings).
Differences:
1. Indexing
Series object: It supports explicit indexing, i.e., we can programmatically choose, provide and change indexes in terms of numbers or labels.
ndarray: It does not support explicit indexing, only supports implicit indexing whereby the indexes are implicitly given 0 onwards.
2. Index types
Series object: It supports indexes of numeric as well as string types.
ndarray: It supports indexes of only numeric types.
3. Vectorized operations
Series object: It can perform vectorized operations on two series objects, even if their shapes are different, by using NaN for non-matching indexes/labels.
ndarray: It can perform vectorized operations on two ndarrays only if their shapes match.
4. Memory
Series object: It takes more memory compared to a numpy array.
ndarray: It takes less memory compared to a Series object.
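The following example illustrates the indexing and vectorization differences. It is a sketch on made-up values; the names arr, ser and other are assumptions.

```python
import numpy as np
import pandas as pd

arr = np.array([10, 20, 30])                          # positions 0, 1, 2 only
ser = pd.Series([10, 20, 30], index=["a", "b", "c"])  # explicit labels

first_from_array = arr[0]
first_from_series = ser["a"]

# Series of different lengths can be added: unmatched labels give NaN.
other = pd.Series([1, 2], index=["a", "b"])
aligned_sum = ser + other

# ndarrays with mismatched shapes (3,) and (2,) cannot be added.
try:
    arr + np.array([1, 2])
    shapes_ok = True
except ValueError:
    shapes_ok = False

print(aligned_sum)
print(shapes_ok)
```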
Question 4
Write single line Pandas statement for the following. (Assuming necessary modules have been
imported) :
Declare a Pandas series named Packets having dataset as :
[125, 92, 104, 92, 85, 116, 87, 90]
Answer
Packets = pandas.Series([125, 92, 104, 92, 85, 116, 87, 90], name = 'Packets')
Question 5
A 90
B 60
C 108
D 150
dtype: int64
Question 7
Consider two objects x and y. x is a list whereas y is a Series. Both have values 20, 40, 90, 110.
What will be the output of the following two statements considering that the above objects have been
created already ?
(a) print (x*2)
(b) print (y*2)
Justify your answer.
Answer
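(a)
Output
[20, 40, 90, 110, 20, 40, 90, 110]
In the first statement, x represents a list. When a list is multiplied by 2, the list is repeated, so its
elements appear twice.
(b)
Output
0 40
1 80
2 180
3 220
dtype: int64
In the second statement, y represents a Series. When a Series is multiplied by a value, each element of the
Series is multiplied by 2, as Series supports vectorized operations.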
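Both statements can be run side by side to see the difference. The sketch below builds x and y from the values given in the question.

```python
import pandas as pd

x = [20, 40, 90, 110]  # a plain Python list
y = pd.Series(x)       # a Series with the same values

repeated = x * 2  # list replication: the elements appear twice
doubled = y * 2   # vectorized arithmetic: every element is doubled

print(repeated)
print(doubled)
```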
Question 8
A B D C
0 15 17 19 NaN
1 16 18 20 NaN
2 20 21 22 NaN
(b) df['C'] = [2, 5] — This statement will result in error because the length of the list [2, 5] does not
match the number of rows in the DataFrame df.
(c) df['C'] = [12, 15, 27] — This statement will add a new column 'C' to the dataframe and assign
the values from the list [12, 15, 27] to the new column. This time, all rows in the new column will be
assigned a value.
The updated dataframe will look like this:
Output
A B D C
0 15 17 19 12
1 16 18 20 15
2 20 21 22 27
Question 9
Write code statements to list the following, from a dataframe namely sales:
(a) List only columns 'Item' and 'Revenue'.
(b) List rows from 3 to 7.
(c) List the value of cell in 5th row, 'Item' column.
Answer
(a)
>>> sales[['Item', 'Revenue']]
(b)
>>> sales.iloc[2:7]
(c)
>>> sales.Item[4]
Question 10
Hitesh wants to display the last four rows of the dataframe df and has written the following code :
df.tail()
But last 5 rows are being displayed. Identify the error and rewrite the correct code so that last 4 rows get
displayed.
Answer
The error in Hitesh's code is that the tail() function in pandas by default returns the last 5 rows of the
dataframe. To display the last 4 rows, Hitesh needs to specify the number of rows he wants to display.
Here's the correct code:
df.tail(4)
Question 11
How would you add a new column namely 'val' to a dataframe df that has 10 rows in it and has columns
as 'Item', 'Qty', 'Price' ? You can choose to put any values of your choice.
Answer
The syntax to add a new column to a DataFrame is <DF object>[<column>] = <new value>.
Therefore, according to this syntax, the statement to add a column named 'val' to a dataframe df with 10
rows is :
df['val'] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Question 12
Write statement(s) to change the value at 5th row, 6th column in a DataFrame df.
Answer
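The statement to change the value at 5th row, 6th column in a DataFrame df is:
df.iat[4, 5] = <new value>
Since iat uses integer positions that start from 0, the 5th row is at position 4 and the 6th column is at
position 5.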
Question 16
Write statement(s) to change the values to 750 at 4th row to 9th row, 7th column in a DataFrame df.
Answer
The statement to change the value to 750 at 4th row to 9th row, 7th column in a DataFrame df is:
df.iloc[3:9, 6] = 750.
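The assignment can be checked on a hypothetical dataframe of zeros with 10 rows and 8 columns (the shape is an assumption chosen so that the 9th row and 7th column exist).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((10, 8), dtype=int))

# Positions 3 to 8 are the 4th to 9th rows; position 6 is the 7th column.
df.iloc[3:9, 6] = 750

print(df[6].tolist())
```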
Question 17
What is the difference between iloc and loc with respect to a DataFrame ?
Answer
1. Type of indexing
iloc method: iloc is used for integer-based indexing.
loc method: loc is used for label-based indexing.
2. What is accessed
iloc method: It allows access to rows and columns using integer indices, where the first row or column has an index of 0.
loc method: It allows access to rows and columns using their labels (index or column names).
3. Slices
iloc method: With iloc, the end index/position in slices is excluded when given as start:end.
loc method: With loc, both the start label and end label are included when given as start:end.
4. Syntax
iloc method: The syntax is df.iloc[row_index, column_index].
loc method: The syntax is df.loc[row_label, column_label].
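The difference in slice behaviour is easy to see on a small dataframe. The data below is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"Qty": [5, 8, 3, 9]}, index=["a", "b", "c", "d"])

by_label = df.loc["a":"c"]  # end label 'c' is included
by_position = df.iloc[0:2]  # end position 2 is excluded

print(by_label)
print(by_position)
```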
Question 18
Which function would you use to rename the index/column names in a dataframe ?
Answer
The rename() function in pandas is used to rename index or column names in a DataFrame.
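A brief sketch of rename() on a hypothetical dataframe, showing that the original is changed only when inplace = True is given:

```python
import pandas as pd

df = pd.DataFrame({"Item": ["Pen", "Ink"], "Qty": [4, 2]})

# Without inplace, rename() returns a new dataframe and leaves df as it is.
renamed = df.rename(columns={"Qty": "Quantity"}, index={0: "first"})

# With inplace=True, df itself is modified.
df.rename(columns={"Item": "Product"}, inplace=True)

print(renamed)
print(df)
```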
Type B: Short Answer Questions/Conceptual Questions
Question 1
0 43.0271
1 61.7328
2 -26.5421
3 -83.6113
dtype: float64
(b) S > 0
Output
0 True
1 True
2 False
3 False
dtype: bool
(c) S1 = pd.Series(S)
Output
0 0.430271
1 0.617328
2 -0.265421
3 -0.836113
dtype: float64
(d) S2 = pd.Series(S1) + 3
Output
0 3.430271
1 3.617328
2 2.734579
3 2.163887
dtype: float64
The values of Series object S1 created above are as follows:
0 0.430271
1 0.617328
2 -0.265421
3 -0.836113
dtype: float64
The values of Series object S2 created above are as follows:
0 3.430271
1 3.617328
2 2.734579
3 2.163887
dtype: float64
Question 2
Consider the same Series object, S, given in the previous question. What output will be produced by
following code fragment ?
S.index = ['AMZN', 'AAPL', 'MSFT', 'GOOG']
print(S)
print(S['AMZN'])
S['AMZN'] = 1.5
print(S['AMZN'])
print(S)
Answer
Output
AMZN 0.430271
AAPL 0.617328
MSFT -0.265421
GOOG -0.836113
dtype: float64
0.430271
1.5
AMZN 1.500000
AAPL 0.617328
MSFT -0.265421
GOOG -0.836113
dtype: float64
Explanation
The provided code fragment first changes the index labels of the Series S to ['AMZN', 'AAPL', 'MSFT',
'GOOG'], prints the modified Series S, and then proceeds to print and modify the value corresponding to
the 'AMZN' index. Specifically, it prints the value at the 'AMZN' index before and after assigning a new
value of 1.5 to that index. Finally, it prints the Series S again, showing the updated value at the 'AMZN'
index.
Question 3
pencils 37
notebooks 46
scales 83
erasers 42
dtype: int64
pencils 54
notebooks 59
scales 114
erasers 74
dtype: int64
Explanation
The code creates two Pandas Series, S and S2. It then prints the result of adding these two Series element-
wise based on their corresponding indices. After updating S by adding S and S2, it prints the result of
adding updated S and S2 again.
Question 4
What will be the output produced by following code, considering the Series object S given above ?
(a) print(S[1:1])
(b) print(S[0:1])
(c) print(S[0:2])
(d)
S[0:2] = 12
print(S)
(e)
print(S.index)
print(S.values)
Answer
(a)
Output
Series([], dtype: int64)
Explanation
The slice S[1:1] starts at index 1 and ends at index 1, but because the end index is exclusive, it does not
include any elements, resulting in an empty Series.
(b)
Output
pencils 20
dtype: int64
Explanation
The slice S[0:1] starts at index 0 and ends at index 1, but because the end index is exclusive, it includes
only one element i.e., the element at index 0.
(c)
Output
pencils 20
notebooks 33
dtype: int64
Explanation
The slice S[0:2] starts at index 0 and stops before index 2, hence, it includes two elements i.e., the
elements at index 0 and 1.
(d)
Output
pencils 12
notebooks 12
scales 52
erasers 10
dtype: int64
Explanation
The slice S[0:2] = 12 assigns the value 12 to indices 0 and 1 in Series S, directly modifying those
elements. The updated Series is then printed.
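(e)
Output
Index(['pencils', 'notebooks', 'scales', 'erasers'], dtype='object')
[12 12 52 10]
Explanation
The code print(S.index) displays the indices of Series S, while print(S.values) displays the values
of the Series. The values reflect the change made in part (d).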
Question 5
Write a Python program to create a series object, country using a list that stores the capital of each
country.
Note. Assume four countries to be used as index of the series object are India, UK, Denmark and
Thailand having their capitals as New Delhi, London, Copenhagen and Bangkok respectively.
Solution
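One possible program is sketched below, based on the note above. The helper list name capitals is an assumption; the series name country and the index labels come from the question.

```python
import pandas as pd

# Capitals in the same order as the countries used for the index.
capitals = ["New Delhi", "London", "Copenhagen", "Bangkok"]
country = pd.Series(capitals, index=["India", "UK", "Denmark", "Thailand"])

print(country)
```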
If Ser is a Series type object having 30 values, then how are statements (a), (b) and (c), (d) similar and
different ?
(a) print(Ser.head())
(b) print(Ser.head(8))
(c) print(Ser.tail())
(d) print(Ser.tail(11))
Answer
The statements (a), (b), (c) and (d) are all used to view the values from a pandas Series object Ser.
However, they differ in the number of values they display.
(a) print(Ser.head()): This statement will display the first 5 values from the Series Ser.
(b) print(Ser.head(8)): This statement will display the first 8 values from the Series Ser.
(c) print(Ser.tail()): This statement will display the last 5 values from the Series Ser.
(d) print(Ser.tail(11)): This statement will display the last 11 values from the Series Ser.
Question 11
What advantages does dataframe offer over series data structure ? If you have similar data stored in
multiple series and a single dataframe, which one would you prefer and why ?
Answer
The advantages of using a DataFrame over a Series are as follows:
1. A DataFrame can have multiple columns, whereas a Series can only have one.
2. A DataFrame can store data of different types in different columns, whereas a Series can only store data of a single
type.
3. A DataFrame allows us to perform operations on entire columns and across several columns at once, whereas a
Series holds only a single column of values.
4. A DataFrame allows us to index data using both row and column labels, whereas a Series only allows us to index
data using a single label.
If there is similar data stored in multiple Series and a single DataFrame, I would prefer to use the
DataFrame. This is because a DataFrame allows us to store and manipulate data in a more organized and
structured way, and it allows us to perform operations on entire columns. Additionally, a DataFrame
allows us to index data using both row and column labels, which makes it easier to access and manipulate
data.
Question 12
one two
a 1.0 1.0
b 2.0 2.0
c 3.0 3.0
d NaN 4.0
one two
d NaN 4.0
b 2.0 2.0
a 1.0 1.0
two three
d 4.0 NaN
a 1.0 NaN
Explanation
The given code creates three pandas DataFrames df, df1, and df2 using the same dictionary d with
different index and column labels. The first DataFrame df is created using the dictionary d with index
labels taken from the index of the Series objects in the dictionary. The resulting DataFrame has two
columns 'one' and 'two' with index labels 'a', 'b', 'c', and 'd'. The values in the DataFrame are filled in
accordance to the index and column labels. The second DataFrame df1 is created with the same
dictionary d but with a custom index ['d', 'b', 'a']. The third DataFrame df2 is created with a custom index
['d', 'a'] and a custom column label ['two', 'three']. Since the dictionary d does not have a column label
three, all its values are NaN (Not a Number), indicating missing data.
Question 15(a)
From the DataFrames created in previous question, write code to display only row 'a' from DataFrames
df, df1, and df2.
Solution
import pandas as pd
d = {'one' : pd.Series([1., 2., 3.], index = ['a', 'b', 'c']),
     'two' : pd.Series([1., 2., 3., 4.], index = ['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
df1 = pd.DataFrame(d, index = ['d', 'b', 'a'])
df2 = pd.DataFrame(d, index = ['d', 'a'], columns = ['two', 'three'])
print(df.loc['a',:])
print(df1.loc['a',:])
print(df2.loc['a',:])
Output
one 1.0
two 1.0
Name: a, dtype: float64
one 1.0
two 1.0
Name: a, dtype: float64
two 1.0
three NaN
Name: a, dtype: object
Question 15(b)
From the DataFrames created in previous question, write code to display only rows 0 and 1 from
DataFrames df, df1, and df2.
Solution
import pandas as pd
d = {'one' : pd.Series([1., 2., 3.], index = ['a', 'b', 'c']),
     'two' : pd.Series([1., 2., 3., 4.], index = ['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
df1 = pd.DataFrame(d, index = ['d', 'b', 'a'])
df2 = pd.DataFrame(d, index = ['d', 'a'], columns = ['two', 'three'])
print(df.iloc[0:2])
print(df1.iloc[0:2])
print(df2.iloc[0:2])
Output
one two
a 1.0 1.0
b 2.0 2.0
one two
d NaN 4.0
b 2.0 2.0
two three
d 4.0 NaN
a 1.0 NaN
Question 15(c)
From the DataFrames created in previous question, write code to display only rows 'a' and 'b' for columns
1 and 2 from DataFrames df, df1 and df2.
Solution
import pandas as pd
d = {'one' : pd.Series([1., 2., 3.], index = ['a', 'b', 'c']),
     'two' : pd.Series([1., 2., 3., 4.], index = ['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
df1 = pd.DataFrame(d, index = ['d', 'b', 'a'])
df2 = pd.DataFrame(d, index = ['d', 'a'], columns = ['two', 'three'])
print(df.loc['a' : 'b', :])
print(df1.loc['b' : 'a', :])
print(df2.loc['d' : 'a', :])
Output
one two
a 1.0 1.0
b 2.0 2.0
one two
b 2.0 2.0
a 1.0 1.0
two three
d 4.0 NaN
a 1.0 NaN
Question 15(d)
From the DataFrames created in previous question, write code to add an empty column 'x' to all
DataFrames.
Solution
import pandas as pd
d = {'one' : pd.Series([1., 2., 3.], index = ['a', 'b', 'c']),
     'two' : pd.Series([1., 2., 3., 4.], index = ['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
df1 = pd.DataFrame(d, index = ['d', 'b', 'a'])
df2 = pd.DataFrame(d, index = ['d', 'a'], columns = ['two', 'three'])
df['x'] = None
df1['x'] = None
df2['x'] = None
print(df)
print(df1)
print(df2)
Output
one two x
a 1.0 1.0 None
b 2.0 2.0 None
c 3.0 3.0 None
d NaN 4.0 None
one two x
d NaN 4.0 None
b 2.0 2.0 None
a 1.0 1.0 None
two three x
d 4.0 NaN None
a 1.0 NaN None
Question 16
What will be the output of the following program ?
import pandas as pd
dic = {'Name' : ['Sapna', 'Anmol', 'Rishul', 'Sameep'],
       'Agg' : [56, 67, 75, 76],
       'Age' : [16, 18, 16, 19]}
df = pd.DataFrame(dic, columns = ['Name', 'Age'])
print(df)
(a)
Name Agg Age
101 Sapna 56 16
102 Anmol 67 18
103 Rishul 75 16
104 Sameep 76 19
(b)
Name Agg Age
0 Sapna 56 16
1 Anmol 67 18
2 Rishul 75 16
3 Sameep 76 19
(c)
Name
0 Sapna
1 Anmol
2 Rishul
3 Sameep
(d)
Name Age
0 Sapna 16
1 Anmol 18
2 Rishul 16
3 Sameep 19
Answer
(d)
Output
Name Age
0 Sapna 16
1 Anmol 18
2 Rishul 16
3 Sameep 19
Explanation
The code creates a DataFrame df with columns 'Name' and 'Age' using a dictionary. It contains data about
individual's names and ages. The DataFrame is then printed, displaying the specified columns.
Question 17
Predict the output of following code (it uses below given dictionary my_di).
my_di = {"name" : ["Jiya", "Tim", "Rohan"],
"age" : np.array([10, 15, 20]),
"weight" : (75, 123, 239),
"height" : [4.5, 5, 6.1],
"siblings" : 1,
"gender" : "M"}
df = pd.DataFrame(my_di)
print(df)
Answer
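Output
name age weight height siblings gender
0 Jiya 10 75 4.5 1 M
1 Tim 15 123 5.0 1 M
2 Rohan 20 239 6.1 1 M
Explanation
The given code creates a dictionary my_di. Then, a DataFrame df is created using the pd.DataFrame()
constructor and passing the my_di dictionary. The scalar values given for "siblings" and "gender" are
repeated for every row. The print() function is used to display the DataFrame.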
Question 18
Consider the same dictionary my_di in the previous question (shown below), what will be the output
produced by following code ?
my_di = {"name" : ["Jiya", "Tim", "Rohan"],
"age" : np.array([10, 15, 20]),
"weight" : (75, 123, 239),
"height" : [4.5, 5, 6.1],
"siblings" : 1,
"gender" : "M"}
df2 = pd.DataFrame(my_di, index = my_di["name"])
print(df2)
Answer
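Output
name age weight height siblings gender
Jiya Jiya 10 75 4.5 1 M
Tim Tim 15 123 5.0 1 M
Rohan Rohan 20 239 6.1 1 M
Explanation
The given code creates a dictionary my_di. Then, a DataFrame df2 is created using the pd.DataFrame()
constructor and passing the my_di dictionary and the my_di["name"] list as the index, so the names
appear both as row labels and in the "name" column. The print() function is used to display the DataFrame.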
Question 19
Assume that required libraries (pandas and numpy) are imported and dataframe df2 has been created as per
questions 17 and 18 above. Predict the output of following code fragment :
print(df2["weight"])
print(df2.weight['Tim'])
Answer
Output
Jiya 75
Tim 123
Rohan 239
Name: weight, dtype: int64
123
Explanation
The given code creates a dictionary my_di. Then, a DataFrame df2 is created using the pd.DataFrame()
constructor and passing the my_di dictionary and the my_di["name"] list as the index. The print()
function is used to display the 'weight' column of the DataFrame df2 and the value of the 'weight' column
for the row with index 'Tim'.
Question 20
Assume that required libraries (pandas and numpy) are imported and dataframe df2 has been created as per
questions 17 and 18 above. Predict the output of following code fragment :
df2["IQ"] = [130, 105, 115]
df2["Married"] = False
print(df2)
Answer
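Output
name age weight height siblings gender IQ Married
Jiya Jiya 10 75 4.5 1 M 130 False
Tim Tim 15 123 5.0 1 M 105 False
Rohan Rohan 20 239 6.1 1 M 115 False
Explanation
The code adds two new columns to DataFrame df2: "IQ" with values [130, 105, 115] and "Married" with
the boolean value False for all rows. It then prints the DataFrame.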
Question 21
Assume that required libraries (pandas and numpy) are imported and dataframe df2 has been created as per
questions 17 and 18 above. Predict the output produced by following code fragment :
df2["College"] = pd.Series(["IIT"], index=["Rohan"])
print(df2)
Answer
Output
Explanation
The DataFrame df2 is indexed by names, and the code adds a new column "College" from a Series whose
only index label is "Rohan". Because of index matching, "IIT" is stored only for the row "Rohan" and the
other rows get NaN in the new column.
Question 22
Assume that required libraries (pandas and numpy) are imported and dataframe df2 has been created as per
questions 17 and 18 above. Predict the output produced by following code fragment :
print(df2.loc["Jiya"])
print(df2.loc["Jiya", "IQ"])
print(df2.loc["Jiya":"Tim", "IQ":"College"])
print(df2.iloc[0])
print(df2.iloc[0, 5])
print(df2.iloc[0:2, 5:8])
Answer
Output
name Jiya
age 10
weight 75
height 4.5
siblings 1
gender M
IQ 130
College NaN
Name: Jiya, dtype: object
130
IQ College
Jiya 130 NaN
Tim 105 NaN
name Jiya
age 10
weight 75
height 4.5
siblings 1
gender M
IQ 130
College NaN
Name: Jiya, dtype: object
M
gender IQ College
Jiya M 130 NaN
Tim M 105 NaN
Explanation
1. print(df2.loc["Jiya"]) — This line prints all columns of the row with the index "Jiya".
2. print(df2.loc["Jiya", "IQ"]) — This line prints the value of the "IQ" column for the row with the index
"Jiya".
3. print(df2.loc["Jiya":"Tim", "IQ":"College"]) — This line prints a subset of rows and columns
using labels, from "Jiya" to "Tim" for rows and from "IQ" to "College" for columns.
4. print(df2.iloc[0]) — This line prints all columns of the first row using integer-based indexing (position 0).
5. print(df2.iloc[0, 5]) — This line prints the value of the 6th column for the first row using integer-based
indexing.
6. print(df2.iloc[0:2, 5:8]) — This line prints a subset of rows and columns using integer-based indexing,
selecting rows from position 0 to 1 and columns from position 5 to 7.
Question 23
Original DataFrame
col1 col2 col3
0 1 6 9
1 4 7 0
2 3 8 1
New DataFrame :
col1 col2 col3
0 1 6 9
Explanation
The code creates a DataFrame named df with three columns ('col1', 'col2', 'col3') and three rows of data,
and prints it. A new DataFrame named dfn is then created by dropping the rows with index labels 1 and 2
from the original DataFrame using df.drop(df.index[[1, 2]]). The resulting DataFrame, dfn, contains
only the first row of df; the second and third rows are removed. The original DataFrame df is not
modified, because drop() returns a new DataFrame.
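The program for this question does not appear above. A minimal sketch that matches the explanation and the printed output, assuming the column values and the printed labels shown in the output:

```python
import pandas as pd

# Values are read off the "Original DataFrame" output above.
df = pd.DataFrame({'col1': [1, 4, 3],
                   'col2': [6, 7, 8],
                   'col3': [9, 0, 1]})
print("Original DataFrame")
print(df)

# df.index[[1, 2]] picks the labels at positions 1 and 2;
# drop() returns a new DataFrame without those rows.
dfn = df.drop(df.index[[1, 2]])
print("New DataFrame :")
print(dfn)
```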
Question 24
Before
age name
1 20 Ruhi
2 23 Ali
3 22 Sam
After
age name Edu
1 20 Ruhi BA
2 23 Ali BE
3 22 Sam MBA
Explanation
The code utilizes the pandas library in Python to create a DataFrame named df1 using a dictionary data.
The df1 DataFrame is printed, showing the initial data. Then, a new column 'Edu' is added to the
DataFrame using df1['Edu'] = ['BA', 'BE' , 'MBA']. The updated DataFrame is printed.
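The program for this question does not appear above. A minimal sketch consistent with the explanation, assuming the values and the index labels 1 to 3 shown in the "Before" output:

```python
import pandas as pd

# Values and index labels are read off the "Before" output above.
data = {'age': [20, 23, 22], 'name': ['Ruhi', 'Ali', 'Sam']}
df1 = pd.DataFrame(data, index=[1, 2, 3])
print("Before")
print(df1)

# Assigning a list to a new column label adds the column;
# the list length must equal the number of rows.
df1['Edu'] = ['BA', 'BE', 'MBA']
print("After")
print(df1)
```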
Question 25
Write a program in Python Pandas to create the following DataFrame batsman from a Dictionary :
B_NO  Name            Score1  Score2
1     Sunil Pillai    90      80
2     Gaurav Sharma   65      45
3     Piyush Goel     70      90
4     Karthik Thakur  80      76
Perform the following operations on the DataFrame :
(i) Add both the scores of a batsman and assign to column "Total".
(ii) Display the highest score in both Score1 and Score2 of the DataFrame.
(iii) Display the DataFrame.
Answer
import pandas as pd
data = {'B_NO': [1, 2, 3, 4],
        'Name': ['Sunil Pillai', 'Gaurav Sharma', 'Piyush Goel', 'Karthik Thakur'],
        'Score1': [90, 65, 70, 80],
        'Score2': [80, 45, 90, 76]}
batsman = pd.DataFrame(data)
batsman['Total'] = batsman['Score1'] + batsman['Score2']
highest_score1 = batsman['Score1'].max()
highest_score2 = batsman['Score2'].max()
print("Highest score in Score1: ", highest_score1)
print("Highest score in Score2: ", highest_score2)
print(batsman)
Output
Consider the following dataframe, and answer the questions given below:
import pandas as pd
df = pd.DataFrame({"Quarter1": [2000, 4000, 5000, 4400, 10000],
                   "Quarter2": [5800, 2500, 5400, 3000, 2900],
                   "Quarter3": [20000, 16000, 7000, 3600, 8200],
                   "Quarter4": [1400, 3700, 1700, 2000, 6000]})
(i) Write the code to find mean value from above dataframe df over the index and column axis.
(ii) Use sum() function to find the sum of all the values over the index axis.
Answer
(i)
import pandas as pd
df = pd.DataFrame({"Quarter1": [2000, 4000, 5000, 4400, 10000],
                   "Quarter2": [5800, 2500, 5400, 3000, 2900],
                   "Quarter3": [20000, 16000, 7000, 3600, 8200],
                   "Quarter4": [1400, 3700, 1700, 2000, 6000]})
mean_over_columns = df.sum(axis=1) / df.count(axis=1)
print("Mean over columns: \n", mean_over_columns)
mean_over_rows = df.sum(axis=0) / df.count(axis=0)
print("Mean over rows: \n", mean_over_rows)
Output
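The answer to part (ii) does not appear above. A minimal sketch for it, assuming "over the index axis" means axis=0 (one sum per column). The same snippet also shows the built-in mean() as a shorter equivalent of the sum()/count() division used in part (i):

```python
import pandas as pd

df = pd.DataFrame({
    "Quarter1": [2000, 4000, 5000, 4400, 10000],
    "Quarter2": [5800, 2500, 5400, 3000, 2900],
    "Quarter3": [20000, 16000, 7000, 3600, 8200],
    "Quarter4": [1400, 3700, 1700, 2000, 6000],
})

# (ii) Sum over the index axis: one total per column.
sum_over_index = df.sum(axis=0)
print(sum_over_index)

# (i) mean() gives the same values as sum() / count().
mean_over_index = df.mean(axis=0)    # one mean per column
mean_over_columns = df.mean(axis=1)  # one mean per row
print(mean_over_index)
print(mean_over_columns)
```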
Write the use of the rename(mapper = <dict-like>, axis = 1) method for a Pandas Dataframe. Can the
mapper and columns parameter be used together in a rename() method ?
Answer
The rename() method of a pandas DataFrame is used to alter the labels of columns or rows. In
rename(mapper = <dict-like>, axis = 1), the mapper parameter is a dict-like object (or a function) that
maps old labels to new labels, and axis specifies which labels the mapper applies to: column labels
(axis=1) or row labels (axis=0). Labels that do not appear in the mapper are left unchanged.
No, the mapper parameter and the columns parameter cannot be used together in the rename() method.
rename() accepts either the (mapper, axis) form or the (index, columns) keyword form, and passing both
mapper and columns raises a TypeError. The columns parameter itself takes a dict-like object or a
function that maps old column names to new column names, not a list of new names, so
rename(columns = <dict-like>) is equivalent to rename(mapper = <dict-like>, axis = 1).
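The two calling forms of rename() can be compared with a short sketch. The DataFrame and its column names here are hypothetical and serve only as an illustration:

```python
import pandas as pd

# Hypothetical two-column frame, used only for illustration.
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# These two calls are equivalent ways to rename column 'A'.
r1 = df.rename(mapper={'A': 'X'}, axis=1)
r2 = df.rename(columns={'A': 'X'})
print(list(r1.columns), list(r2.columns))

# Passing mapper together with columns is rejected with a TypeError.
try:
    df.rename(mapper={'A': 'X'}, columns={'B': 'Y'})
    combined_ok = True
except TypeError as err:
    combined_ok = False
    print("TypeError:", err)
```

Neither call changes df itself; rename() returns a new DataFrame unless inplace = True is passed.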
Question 29
Find the error in the following code considering the same dataframe topDf given in the previous question.
(i) topDf.rename(index=['a', 'b', 'c', 'd'])
(ii) topDf.rename(columns = {})
Answer
(i) The line topDf.rename(index=['a', 'b', 'c', 'd']) attempts to rename the index of the
DataFrame topDf, but it doesn't assign the modified DataFrame back to topDf or use the inplace =
True parameter to modify topDf directly. Additionally, using a list of new index labels without
specifying the current index labels will result in an error.
The corrected code is:
topDf.rename(index={'Sec A': 'a', 'Sec B': 'b', 'Sec C': 'c', 'Sec D': 'd'}, inplace = True)
(ii) The line topDf.rename(columns={}) attempts to rename columns in the DataFrame topDf, but it
provides an empty dictionary {} for renaming, which will not perform any renaming. We need to provide
a mapping dictionary with old column names as keys and new column names as values. To modify topDf
directly, it should use the inplace = True parameter.
The corrected code is:
topDf.rename(columns={'RollNo': 'NewRollNo', 'Name': 'NewName', 'Marks': 'NewMarks'},
             inplace = True)
Type C: Long Answer Questions
Question 1
Write Python code to create a Series object Temp1 that stores temperatures of seven days in it. Take any
random seven temperatures.
Solution
import pandas as pd
temperatures = [28.0, 30.4, 26.5, 29.4, 27.0, 31.2, 25.8]
Temp1 = pd.Series(temperatures)
print(Temp1)
Output
0 28.0
1 30.4
2 26.5
3 29.4
4 27.0
5 31.2
6 25.8
dtype: float64
Question 2
Write Python code to create a Series object Temp2 storing temperatures of seven days of week. Its
indexes should be 'Sunday', 'Monday',... 'Saturday'.
Solution
import pandas as pd
temperatures = [28.9, 30.1, 26.2, 29.3, 27.5, 31.9, 25.5]
days_of_week = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday',
                'Friday', 'Saturday']
Temp2 = pd.Series(temperatures, index = days_of_week)
print(Temp2)
Output
Sunday 28.9
Monday 30.1
Tuesday 26.2
Wednesday 29.3
Thursday 27.5
Friday 31.9
Saturday 25.5
dtype: float64
Question 3
A series object (say T1) stores the average temperature recorded on each day of a month. Write code to
display the temperatures recorded on :
(i) first 7 days
(ii) last 7 days.
Solution
import pandas as pd
T1 = pd.Series([25.6, 26.3, 27.9, 28.2, 29.1, 30.9, 31.2, 32.4, 33.2, 34.4,
                33.3, 32.5, 31.4, 30.7, 29.6, 28.9, 27.0, 26.2, 25.32, 24.34,
                23.4, 22.3, 21.6, 20.9, 19.8, 18.1, 17.2, 16.34, 15.5, 14.6])
first_7_days = T1.head(7)
print("Temperatures recorded on the first 7 days:")
print(first_7_days)
last_7_days = T1.tail(7)
print("\nTemperatures recorded on the last 7 days:")
print(last_7_days)
Output
Question 4
Series objects Temp1, Temp2, Temp3, Temp4 store the temperatures of days of week1, week2, week3,
week4 respectively.
Write a script to
(a) print the average temperature per week.
(b) print average temperature of entire month.
Solution
import pandas as pd
Temp1 = pd.Series([28.0, 30.2, 26.1, 29.6, 27.7, 31.8, 25.9])
Temp2 = pd.Series([25.5, 24.5, 23.6, 22.7, 21.8, 20.3, 19.2])
Temp3 = pd.Series([32.4, 33.3, 34.1, 33.2, 32.4, 31.6, 30.9])
Temp4 = pd.Series([27.3, 28.1, 29.8, 30.6, 31.7, 32.8, 33.0])
Week_1 = sum(Temp1)
Week_2 = sum(Temp2)
Week_3 = sum(Temp3)
Week_4 = sum(Temp4)
print("Week 1 : Average Temperature is", Week_1 / 7, "degree Celsius")
print("Week 2 : Average Temperature is", Week_2 / 7, "degree Celsius")
print("Week 3 : Average Temperature is", Week_3 / 7, "degree Celsius")
print("Week 4 : Average Temperature is", Week_4 / 7, "degree Celsius")
total = Week_1 + Week_2 + Week_3 + Week_4
print("\nAverage temperature of entire month:", total / 28, "degree Celsius")
Output
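The same averages can also be computed with the Series mean() method instead of dividing a sum by a hard-coded 7 or 28. A sketch using the temperatures from the solution above; pooling the four weeks with pd.concat is a choice made here, not part of the original solution:

```python
import pandas as pd

Temp1 = pd.Series([28.0, 30.2, 26.1, 29.6, 27.7, 31.8, 25.9])
Temp2 = pd.Series([25.5, 24.5, 23.6, 22.7, 21.8, 20.3, 19.2])
Temp3 = pd.Series([32.4, 33.3, 34.1, 33.2, 32.4, 31.6, 30.9])
Temp4 = pd.Series([27.3, 28.1, 29.8, 30.6, 31.7, 32.8, 33.0])

weeks = [Temp1, Temp2, Temp3, Temp4]

# (a) mean() divides by the actual number of values in each Series.
for number, week in enumerate(weeks, start=1):
    print("Week", number, ": Average Temperature is", week.mean(), "degree Celsius")

# (b) Pool all 28 readings into one Series and average them.
month = pd.concat(weeks, ignore_index=True)
print("Average temperature of entire month:", month.mean(), "degree Celsius")
```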
Question 5
Ekam, a Data Analyst with a multinational brand, has designed the DataFrame df that contains the four
quarters' sales data of different stores as shown below :
   Store   Qtr1  Qtr2  Qtr3  Qtr4
0  Store1  300   240   450   230
1  Store2  350   340   403   210
2  Store3  250   180   145   160
Answer the following questions :
(i) Predict the output of the following Python statement :
(a) print(df.size)
(b) print(df[1:3])
(ii) Delete the last row from the DataFrame.
(iii) Write Python statement to add a new column Total_Sales which is the addition of all the 4 quarter
sales.
Answer
(i)
(a) print(df.size)
Output
15
Explanation
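The size attribute of a DataFrame returns the total number of elements in the DataFrame df, i.e., the
number of rows multiplied by the number of columns. Here df has 3 rows and 5 columns, so 3 × 5 = 15.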
(b) print(df[1:3])
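Output
Store Qtr1 Qtr2 Qtr3 Qtr4
1 Store2 350 340 403 210
2 Store3 250 180 145 160
Explanation
This statement uses slicing to extract the rows at positions 1 and 2 from the DataFrame df. The end
position 3 is not included in the slice.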
(ii)
df = df.drop(2)
Output
Store Qtr1 Qtr2 Qtr3 Qtr4
0 Store1 300 240 450 230
1 Store2 350 340 403 210
(iii)
df['Total_Sales'] = df['Qtr1'] + df['Qtr2'] + df['Qtr3'] + df['Qtr4']
Output
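The answers to this question can be checked with a short script. A sketch, assuming df is built from the table in the question; df.drop(df.index[-1]) is shown as an alternative to df.drop(2) that does not hard-code the label of the last row:

```python
import pandas as pd

df = pd.DataFrame({
    'Store': ['Store1', 'Store2', 'Store3'],
    'Qtr1': [300, 350, 250],
    'Qtr2': [240, 340, 180],
    'Qtr3': [450, 403, 145],
    'Qtr4': [230, 210, 160],
})

print(df.size)   # number of rows * number of columns
print(df[1:3])   # rows at positions 1 and 2

# (ii) Drop the last row, whatever its label is.
trimmed = df.drop(df.index[-1])

# (iii) Row-wise total of the four quarter columns.
trimmed['Total_Sales'] = trimmed[['Qtr1', 'Qtr2', 'Qtr3', 'Qtr4']].sum(axis=1)
print(trimmed)
```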
Question 6(i)
Consider the following DataFrame df and answer any four questions from (i)-(v):
rollno  name              UT1  UT2  UT3  UT4
1       Prerna Singh      24   24   20   22
2       Manish Arora      18   17   19   22
3       Tanish Goel       20   22   18   24
4       Falguni Jain      22   20   24   20
5       Kanika Bhatnagar  15   20   18   22
6       Ramandeep Kaur    20   15   22   24
Write down the command that will give the following output :
rollno 6
name Tanish Goel
UT1 24
UT2 24
UT3 24
UT4 24
dtype: object
(a) print(df.max)
(b) print(df.max())
(c) print(df.max(axis = 1))
(d) print(df.max, axis = 1)
Answer
print(df.max())
Explanation
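The df.max() method returns the maximum value of each column of a DataFrame (for the name column, the
string that sorts last alphabetically). Option (a) prints the method object itself without calling it, option
(c) computes the maximum across each row instead of down each column, and option (d) passes axis to
print(), which does not accept that argument.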
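The result can be reproduced with a short sketch, assuming df is built from the table in the question. The row-wise call is restricted to the four UT columns here, which is a choice made for this illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'rollno': [1, 2, 3, 4, 5, 6],
    'name': ['Prerna Singh', 'Manish Arora', 'Tanish Goel',
             'Falguni Jain', 'Kanika Bhatnagar', 'Ramandeep Kaur'],
    'UT1': [24, 18, 20, 22, 15, 20],
    'UT2': [24, 17, 22, 20, 20, 15],
    'UT3': [20, 19, 18, 24, 18, 22],
    'UT4': [22, 22, 24, 20, 22, 24],
})

# Column-wise maximum: one value per column.
col_max = df.max()
print(col_max)

# Row-wise maximum over the test columns: one value per student.
row_max = df[['UT1', 'UT2', 'UT3', 'UT4']].max(axis=1)
print(row_max)
```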
Question 6(ii)
Consider the following DataFrame df and answer any four questions from (i)-(v):
rollno  name              UT1  UT2  UT3  UT4
1       Prerna Singh      24   24   20   22
2       Manish Arora      18   17   19   22
3       Tanish Goel       20   22   18   24
4       Falguni Jain      22   20   24   20
5       Kanika Bhatnagar  15   20   18   22
6       Ramandeep Kaur    20   15   22   24
The teacher needs to know the marks scored by the student with roll number 4. Help her identify the
correct set of statement/s from the given options:
(a) df1 = df[df['rollno'] == 4]
print(df1)
(b) df1 = df[rollno == 4]
print(df1)
(c) df1 = df.[df.rollno = 4]
print(df1)
(d) df1 = df[df.rollno == 4]
print(df1)
Answer
df1 = df[df.rollno == 4]
print(df1)
Explanation
The statement df1 = df[df.rollno == 4] filters the DataFrame df to include only the rows where the
roll number is equal to 4. This is boolean indexing: the comparison df.rollno == 4 produces a boolean
mask with one True or False value per row, and df[mask] keeps only the rows where the mask is True.
The resulting DataFrame df1 contains only the row for roll number 4 from the original DataFrame df.
Option (a), df1 = df[df['rollno'] == 4], builds the same mask with bracket notation and is equally
valid. Option (b) refers to an undefined name rollno, and option (c) is a syntax error.
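Boolean indexing can be seen step by step in the sketch below. It uses a cut-down version of the table (first three columns only) and shows that the bracket form of option (a) and the attribute form of option (d) select the same row:

```python
import pandas as pd

# Cut-down version of the question's table (first three columns only).
df = pd.DataFrame({
    'rollno': [1, 2, 3, 4, 5, 6],
    'name': ['Prerna Singh', 'Manish Arora', 'Tanish Goel',
             'Falguni Jain', 'Kanika Bhatnagar', 'Ramandeep Kaur'],
    'UT1': [24, 18, 20, 22, 15, 20],
})

mask = df['rollno'] == 4              # boolean Series, True only for roll number 4
print(mask.tolist())

by_brackets = df[df['rollno'] == 4]   # option (a)
by_attribute = df[df.rollno == 4]     # option (d)
print(by_attribute)
```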
Question 6(iii)
Consider the following DataFrame df and answer any four questions from (i)-(v):
rollno  name              UT1  UT2  UT3  UT4
1       Prerna Singh      24   24   20   22
2       Manish Arora      18   17   19   22
3       Tanish Goel       20   22   18   24
4       Falguni Jain      22   20   24   20
5       Kanika Bhatnagar  15   20   18   22
6       Ramandeep Kaur    20   15   22   24
Which of the following statement/s will give the exact number of values in each column of the
dataframe ?
(I) print(df.count())
(II) print(df.count(0))
(III) print(df.count)
(IV) print((df.count(axis = 'index')))
Choose the correct option :
(a) both (I) and (II)
(b) only (II)
(c) (I), (II) and (III)
(d) (I), (II) and (IV)
Answer
(I), (II) and (IV)
Explanation
In pandas, the statement df.count() and df.count(0) calculate the number of non-null values in each
column of the DataFrame df. The statement df.count(axis='index') specifies the axis parameter as
'index', which is equivalent to specifying axis=0. This means it will count non-null values in each column
of the DataFrame df.
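The equivalence of the three calls, and the fact that count() skips missing values, can be checked with a small sketch. The marks below are hypothetical and include one NaN on purpose:

```python
import numpy as np
import pandas as pd

# Hypothetical marks with one missing value, to show that count() skips NaN.
df = pd.DataFrame({'UT1': [24, 18, np.nan], 'UT2': [24, 17, 19]})

c1 = df.count()              # default axis is 0
c2 = df.count(0)             # axis given positionally
c3 = df.count(axis='index')  # axis given by name
print(c1)
```

Statement (III), print(df.count), only prints the method object because the method is never called.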
Question 6(iv)
Consider the following DataFrame df and answer any four questions from (i)-(v):
rollno  name              UT1  UT2  UT3  UT4
1       Prerna Singh      24   24   20   22
2       Manish Arora      18   17   19   22
3       Tanish Goel       20   22   18   24
4       Falguni Jain      22   20   24   20
5       Kanika Bhatnagar  15   20   18   22
6       Ramandeep Kaur    20   15   22   24
Which of the following commands will display the column labels of the DataFrame ?
(a) print(df.columns())
(b) print(df.column())
(c) print(df.column)
(d) print(df.columns)
Answer
print(df.columns)
Explanation
The statement df.columns is used to access the column labels (names) of a DataFrame in pandas.
Question 6(v)
Consider the following DataFrame df and answer any four questions from (i)-(v):
rollno  name              UT1  UT2  UT3  UT4
1       Prerna Singh      24   24   20   22
2       Manish Arora      18   17   19   22
3       Tanish Goel       20   22   18   24
4       Falguni Jain      22   20   24   20
5       Kanika Bhatnagar  15   20   18   22
6       Ramandeep Kaur    20   15   22   24
Ms. Sharma, the class teacher wants to add a new column, the scores of Grade with the values, 'A', 'B',
'A', 'A', 'B', 'A' , to the DataFrame.
Help her choose the command to do so :
(a) df.column = ['A', 'B', 'A', 'A', 'B', 'A']
(b) df['Grade'] = ['A', 'B', 'A', 'A', 'B', 'A']
(c) df.loc['Grade'] = ['A', 'B', 'A', 'A', 'B', 'A']
(d) Both (b) and (c) are correct
Answer
df['Grade'] = ['A', 'B', 'A', 'A', 'B', 'A']
Explanation
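The statement df['Grade'] = ['A', 'B', 'A', 'A', 'B', 'A'] creates a new column named 'Grade' in the
DataFrame df, because square brackets [] with a column label access or create a column. Option (c) uses
df.loc['Grade'], which addresses a row label rather than a column, and option (a) only sets an attribute
named column on df without adding anything to the table.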
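A minimal sketch of the column assignment, using a cut-down version of the table (rollno and name only):

```python
import pandas as pd

# Cut-down version of the question's table (first two columns only).
df = pd.DataFrame({
    'rollno': [1, 2, 3, 4, 5, 6],
    'name': ['Prerna Singh', 'Manish Arora', 'Tanish Goel',
             'Falguni Jain', 'Kanika Bhatnagar', 'Ramandeep Kaur'],
})

# The list must have one value per row (6 here).
df['Grade'] = ['A', 'B', 'A', 'A', 'B', 'A']
print(df)
```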
Question 7
Write a program that stores the sales of 5 fast moving items of a store for each month in 12 Series objects,
i.e., S1 Series object stores sales of these 5 items in 1st month, S2 stores sales of these 5 items in 2nd
month, and so on.
The program should display the summary sales report like this :
Total Yearly Sales, item-wise (should display sum of items' sales over the months)
Maximum sales of item made : <name of item that was sold the maximum in whole year>
Maximum sales for individual items
Maximum sales of item 1 made : <month in which that item sold the maximum>
Maximum sales of item 2 made : <month in which that item sold the maximum>
Maximum sales of item 3 made : <month in which that item sold the maximum>
Maximum sales of item 4 made : <month in which that item sold the maximum>
Maximum sales of item 5 made : <month in which that item sold the maximum>
Solution
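The solution for this question does not appear here. The following is only a sketch of one possible approach, under stated assumptions: the sales figures are made up with a seeded random generator, the item names "Item 1" to "Item 5" are hypothetical, and the twelve Series objects are kept in a list rather than in separate variables S1 to S12.

```python
import numpy as np
import pandas as pd

items = ['Item 1', 'Item 2', 'Item 3', 'Item 4', 'Item 5']
rng = np.random.default_rng(0)  # made-up sales figures, fixed seed

# Twelve Series objects, one per month, each holding sales of the 5 items.
monthly = [pd.Series(rng.integers(100, 1000, size=5), index=items)
           for _ in range(12)]

# Put the Series side by side: rows are items, columns are months 1..12.
sales = pd.concat(monthly, axis=1, keys=list(range(1, 13)))

yearly = sales.sum(axis=1)
print("Total Yearly Sales, item-wise")
print(yearly)
print("Maximum sales of item made :", yearly.idxmax())

print("Maximum sales for individual items")
best_month = sales.idxmax(axis=1)  # month label of each item's maximum
for item in items:
    print("Maximum sales of", item.lower(), "made : month", best_month[item])
```

Combining the Series into one DataFrame keeps the report short: sum(axis=1) gives the item-wise yearly totals and idxmax(axis=1) gives the best month for each item.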
Question 8
Three Series objects store the marks of 10 students in three terms. Roll numbers of students form the
index of these Series objects. The Three Series objects have the same indexes.
Calculate the total weighted marks obtained by students as per following formula :
Final marks = 25% Term 1 + 25% Term 2 + 50% Term 3
Store the Final marks of students in another Series object.
Solution
import pandas as pd
term1 = pd.Series([80, 70, 90, 85, 75, 95, 80, 70, 85, 90],
                  index=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
term2 = pd.Series([85, 90, 75, 80, 95, 85, 90, 75, 80, 85],
                  index=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
term3 = pd.Series([90, 85, 95, 90, 80, 85, 95, 90, 85, 90],
                  index=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
final_marks = (term1 * 0.25) + (term2 * 0.25) + (term3 * 0.50)
print(final_marks)
Output
1 86.25
2 82.50
3 88.75
4 86.25
5 82.50
6 87.50
7 90.00
8 81.25
9 83.75
10 88.75
dtype: float64
Question 9
a 1
b 2
c 3
d 4
dtype: int64
<class 'pandas.core.series.Series'>
Index: 4 entries, a to d
Series name: None
Non-Null Count Dtype
-------------- -----
4 non-null int64
dtypes: int64(1)
memory usage: 64.0+ bytes
Question 10
Write a program to create three different Series objects from the three columns of a DataFrame df.
Solution
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
s1 = df['A']
s2 = df['B']
s3 = df['C']
print(s1)
print(s2)
print(s3)
Output
0 1
1 2
2 3
Name: A, dtype: int64
0 4
1 5
2 6
Name: B, dtype: int64
0 7
1 8
2 9
Name: C, dtype: int64
Question 11
Write a program to create three different Series objects from the three rows of a DataFrame df.
Solution
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
s1 = df.iloc[0]
s2 = df.iloc[1]
s3 = df.iloc[2]
print(s1)
print(s2)
print(s3)
Output
A 1
B 4
C 7
Name: 0, dtype: int64
A 2
B 5
C 8
Name: 1, dtype: int64
A 3
B 6
C 9
Name: 2, dtype: int64
Question 12
Write a program to create a Series object from an ndarray that stores characters from 'a' to 'g'.
Solution
import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd', 'e', 'f', 'g'])
S = pd.Series(data)
print(S)
Output
0 a
1 b
2 c
3 d
4 e
5 f
6 g
dtype: object
Question 13
Write a program to create a Series object that stores the table of number 5.
Solution
0 5
1 10
2 15
3 20
4 25
5 30
6 35
7 40
8 45
9 50
dtype: int32
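The program for this question does not appear above; only its output does. A minimal sketch consistent with that output, assuming the same np.arange approach that the solution to Question 14 uses for the table of 5:

```python
import numpy as np
import pandas as pd

arr = np.arange(1, 11)   # 1, 2, ..., 10
S = pd.Series(arr * 5)   # 5, 10, ..., 50
print(S)
```

The dtype line may read int32 or int64 depending on the platform and the NumPy version.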
Question 14
Write a program to create a Dataframe that stores two columns, which store the Series objects of the
previous two questions (12 and 13).
Solution
import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd', 'e', 'f', 'g'])
S1 = pd.Series(data)
arr = np.arange(1, 11)
S2 = pd.Series(arr * 5)
df = pd.DataFrame({'Characters': S1, 'Table of 5': S2})
print(df)
Output
Characters Table of 5
0 a 5
1 b 10
2 c 15
3 d 20
4 e 25
5 f 30
6 g 35
7 NaN 40
8 NaN 45
9 NaN 50
Question 15
Write a program to create a Dataframe storing salesmen details (name, zone, sales) of five salesmen.
Solution
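The solution for this question does not appear here. A minimal sketch, assuming a dictionary of lists as the source; the names, zones and sales figures are made up for illustration:

```python
import pandas as pd

# Made-up details of five salesmen, for illustration only.
data = {
    'name': ['Asha', 'Bimal', 'Chitra', 'Dev', 'Esha'],
    'zone': ['North', 'South', 'East', 'West', 'North'],
    'sales': [25000, 31000, 18000, 27000, 22000],
}
salesmen = pd.DataFrame(data)
print(salesmen)
```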
Question 16
Four dictionaries store the details of four employees-of-the-month as (empno, name). Write a program to
create a dataframe from these.
Solution
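The solution for this question does not appear here. A minimal sketch, assuming the four dictionaries are collected into a list before being passed to pd.DataFrame(); the employee numbers and names are made up for illustration:

```python
import pandas as pd

# Made-up employee records, for illustration only.
emp1 = {'empno': 101, 'name': 'Anil'}
emp2 = {'empno': 102, 'name': 'Bina'}
emp3 = {'empno': 103, 'name': 'Chetan'}
emp4 = {'empno': 104, 'name': 'Divya'}

# Each dictionary becomes one row; the keys become the column labels.
employees = pd.DataFrame([emp1, emp2, emp3, emp4])
print(employees)
```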
Question 17
A list stores three dictionaries each storing details, (old price, new price, change). Write a program to
create a dataframe from it.
Solution
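The solution for this question does not appear here. A minimal sketch, assuming the keys 'old price', 'new price' and 'change'; the prices are made up for illustration, with change taken as new price minus old price:

```python
import pandas as pd

# Made-up price records, for illustration only.
prices = [
    {'old price': 100, 'new price': 120, 'change': 20},
    {'old price': 250, 'new price': 240, 'change': -10},
    {'old price': 80, 'new price': 95, 'change': 15},
]

# A list of dictionaries maps directly to rows of a DataFrame.
df = pd.DataFrame(prices)
print(df)
```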