12 Ip Dataframes Notes

This document discusses Pandas DataFrames. It provides 15 very short questions and answers about DataFrames. Key topics covered include that DataFrames are two-dimensional data structures, how to access rows and columns via indexing, and common functions like head(), tail(), and count().

Uploaded by

abesaale10

CHAPTER – DATA HANDLING USING PANDAS-I

VSA – Very Short Answer Question (for 1 Mark)


Q.1 _________ is a one dimensional labelled array capable of holding any data type.
Ans. Series

Q.2 If data is an ndarray, _______ must be of same length as data.


Ans. Index

Q.3 Pandas was developed by _________ in 2008.


Ans. Wes McKinney.

Q.4 ________ is used for 2D plots of array in Python.


Ans. matplotlib.

Q.5 Pandas provides _______ data structures for processing the data.
Ans. Two

Q.6 ________ function is used to add series and other, element-wise.
Ans. add()

Q.7 head() function is used to get the ______ n rows.


Ans. First

Q.8 if data is ______, an index must be provided.


Ans. Scalar value

Q.9 Given a Pandas series called Sample, the command which will display the last 3 rows
is_________.
Ans. print(Sample.tail(3))

Q.10 Given a Pandas series called Sequences, the command which will display the
first 4 rows is ____________.
Ans. print(Sequences.head(4))
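
The head() and tail() answers above can be tried on any Series. The sketch below is only an illustration: the Series sample and its ten values are assumptions, not data from these notes.

```python
import pandas as pd

# Hypothetical ten-element Series, used only for illustration
sample = pd.Series(range(1, 11))

first_four = sample.head(4)    # first 4 rows
last_three = sample.tail(3)    # last 3 rows
default_tail = sample.tail()   # with no argument, n defaults to 5

print(first_four)
print(last_three)
```

Both functions return a new Series and leave the original unchanged.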

Q.11 ____________ method in Pandas does not raise errors for multiple entries of a row,
column combinations.
Ans. pivot_table( )

Q.12 Given the following Series T1 and T2:


T1 T2
A 10 A 80
B 40 B 20
C 34 C 74
D 60 D 90
Write the command to find the sum of series T1 and T2
Ans. print(T1+T2)
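
A small runnable sketch of this answer, assuming T1 and T2 are built with the index labels A to D shown in the question:

```python
import pandas as pd

# Series T1 and T2 from the question, sharing the index labels A-D
T1 = pd.Series([10, 40, 34, 60], index=["A", "B", "C", "D"])
T2 = pd.Series([80, 20, 74, 90], index=["A", "B", "C", "D"])

# The + operator adds values whose index labels match
total = T1 + T2
print(total)
```

The addition is matched on index labels rather than on position, so the same result is obtained even if T2 lists its labels in a different order.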

Q.13 Given the following Series S1 and S2:
S1 S2
A 10 A 5
B 20 B 4
C 30 C 6
D 40 D 8
Write the command to find the multiplication of series S1 and S2
Ans. print(S1*S2)

Q.14 Give the output of the following program:


import numpy as np
arr=np.array([21,22,23,24,25,26,27,28,29,30])
print(arr[4:9])
a. [25 26 27 28 29] b. [25 29] c. [25 26 29] d. None of these
Ans. a. [25 26 27 28 29]

Q.15 __________ is a two dimensional structure storing heterogeneous mutable data.


Ans. DataFrame

Q.16 Mention the different types of data structure in Pandas.


Ans. The two data structures which are supported by Pandas library are Series and
DataFrames.

Q.17 Which command is used to import matplotlib?


Ans. import matplotlib.pyplot as plt

Q.18 How to create empty series


Ans. Series_Object = pandas.Series()

Q.19 Define add() function in Series ()


Ans. The add() function is used to add a series and other, element-wise.
Syntax : Series.add(other, level=None, fill_value=None, axis=0)
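
The role of fill_value is easiest to see when the two Series do not share all labels. The following is a minimal sketch; the Series s1 and s2 are made-up illustrations.

```python
import pandas as pd

# Hypothetical Series: s2 has no entry for label "c"
s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20], index=["a", "b"])

plain = s1 + s2                      # label "c" has no partner, so the result is NaN
filled = s1.add(s2, fill_value=0)    # the missing value is treated as 0

print(plain)
print(filled)
```

Without fill_value, add() behaves like the + operator.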

Q.20 What do you mean by clear code API?


Ans. The clear API of the Pandas allows you to focus on the core part of the code.

SA – Short Answer Question (for 2 Marks)
Q.1 List two key features of Pandas.
Ans. The two features of Pandas are :
(i) It can process a variety of data sets in different formats : time series, tabular
heterogeneous arrays and matrix data.
(ii) It facilitates loading and importing data from varied sources such as CSV and
DB/SQL.

Q.2 What are the benefits of Pandas ?


Ans. Benefits of Pandas are :
(i) Data representation : It can easily represent data in a form naturally suited for data
analysis via its DataFrame and Series data structures in a concise manner.
(ii) Data subsetting and filtering : It provides for easy subsetting and filtering of
data, procedures that are a staple of doing analysis.
Q.3 What is series ? Explain with an example.
Ans. A Pandas Series is a one-dimensional labelled array capable of holding data of any type
(integer, string, float, Python objects etc.). The axis labels are collectively called the index.
Example :
import pandas as pd
data=pd.Series([1,2,3,4,5])
print(data)

Q.4 Consider the following Series : Subject


INDEX MARK
ENGLISH 75
HINDI 78
MATHS 82
SCIENCE 86
Write a program in Python Pandas to create a Series.
Ans.
import pandas as pd
subject=pd.Series([75,78,82,86],index=['ENGLISH','HINDI','MATHS','SCIENCE'])

Q.5 Consider the following Series object, "company", and its profit in crores

TCS 350
Reliance 200
L&T 800
Wipro 150

(i) Write the command which will display the name of the company having
profit>250.
(ii) Write the command to name the series as Profit.
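
The notes give no answer for this question. One possible answer is sketched below, assuming the Series is called company and is built from the table above; other equivalent commands exist.

```python
import pandas as pd

# Series "company" built from the table in the question
company = pd.Series([350, 200, 800, 150],
                    index=["TCS", "Reliance", "L&T", "Wipro"])

# (i) names of the companies having profit > 250
high = company[company > 250].index
print(list(high))

# (ii) name the Series as Profit
company.name = "Profit"
print(company)
```

The Boolean condition company > 250 selects the matching rows, and .index returns their labels, which here are the company names.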

SA – Short Answer Question (for 3 Marks)
Q. 1 Consider two objects a and b.
a is a list whereas b is a Series. Both have values 10,20,25,50.

What will be the output of the following two statements considering that the above objects
have been created already
a. print(a*2) b. print(b*2)
Justify your answer.
Ans.
a. will give the output as:
[10, 20, 25, 50, 10, 20, 25, 50]
b. will give the output as:
0     20
1     40
2     50
3    100
dtype: int64
Justification: In the first statement a represents a list so when a list is multiplied by a number, it
is replicated that many number of times.
The second b represents a series. When a series is multiplied by a value, then each element of the
series is multiplied by that number.
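
The justification above can be checked with a short sketch that builds both objects from the values in the question:

```python
import pandas as pd

a = [10, 20, 25, 50]               # a Python list
b = pd.Series([10, 20, 25, 50])    # a Pandas Series with the same values

list_result = a * 2      # replication: the list is repeated twice
series_result = b * 2    # vectorised: every element is multiplied by 2

print(list_result)
print(series_result)
```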

Q.2 Explain the data structure in Pandas


Ans. A data structure is a way of storing and organising data so that it can be accessed,
modified and operated on efficiently.
Pandas provides two data structures for processing the data, which are described below :
(i) Series : It is a one-dimensional object similar to an array, list or column in a
table. It will assign a labelled index to each item in the series. By default, each
item will receive an index label from 0 to N, where N is the length of the series
minus one.
(ii) DataFrame : It is a tabular data structure comprised of rows and columns. A
DataFrame is a standard way to store data and has two different indexes
i.e., row index and column index.

Q.3 What is slicing ?


Ans. Slicing is a powerful approach to retrieve subsets of data from a Pandas object. A slice
is written as start : end : step, where start is the first item, end marks where the slice stops
and step is the increment between items. When slicing by position the end item is excluded,
whereas when slicing by label with .loc the end label is included.
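
A minimal sketch of the start : end : step syntax on a Series. The Series s and its labels are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical Series with string labels
s = pd.Series([10, 20, 30, 40, 50], index=["a", "b", "c", "d", "e"])

by_position = s.iloc[1:4]     # positions 1, 2 and 3; the end position is excluded
every_second = s.iloc[::2]    # step of 2 over the whole Series
by_label = s.loc["b":"d"]     # labels b, c and d; the end label is included

print(by_position)
print(every_second)
print(by_label)
```

Note that positional and label-based slices treat the end point differently, which is a common source of off-by-one mistakes.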

Q.4 Define the following terms


(i) .loc[ ] (ii) .iloc [ ]
Ans. .loc[ ] : This attribute is used to access a group of rows and columns by label(s) or a
Boolean array in the given Series object.
Syntax : Series.loc[label]
.iloc[ ] : This attribute enables purely integer-location based indexing for selection by
position over the given Series object.
Syntax : Series.iloc[position]
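
The difference between the two attributes can be sketched with the subject marks used earlier in these notes; the variable names are illustrative.

```python
import pandas as pd

subject = pd.Series([75, 78, 82, 86],
                    index=["ENGLISH", "HINDI", "MATHS", "SCIENCE"])

by_label = subject.loc["MATHS"]                  # label-based access
by_position = subject.iloc[2]                    # position-based access, counting from 0
several = subject.loc[["ENGLISH", "SCIENCE"]]    # a list of labels
above_80 = subject.loc[subject > 80]             # a Boolean array

print(by_label, by_position)
print(several)
print(above_80)
```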

CHAPTER – DATAFRAME
VSA – Very Short Answer Question (for 1 Mark)
Q.1 DataFrame is ___________ dimensional data structure
Ans. two

Q.2 In DataFrame, __________ is used for the row label.


Ans. Index

Q.3 ________________ is a general term for taking each item of something, one after another.
Ans. Iteration.

Q.4 _____________ function returns the last n rows from the object based on position.
Ans. tail( )

Q.5 ____________ can also be known as subset selection.


Ans. Indexing

Q.6 Boolean indexing helps us to select the data from the DataFrame using_______.
Ans. boolean vector.

Q.7 CSV stands for _____.


Ans. Comma Separated Values.

Q.8 ________ function is used to import a CSV file to DataFrame format.


Ans. read_csv( )

Q.9 Hitesh wants to display the last four rows of the dataframe df and has written the following
code :
df.tail( )
but last 5 rows are being displayed. Identify the errors and rewrite the correct code so that last 4
rows get displayed.
Ans. df.tail(4)

Q.10 Consider the following Python code and write its output.
import pandas as pd
values=["India", "Canada"]
code=["IND", "CAN"]
df=pd.DataFrame(values,index=code,columns=['Country'])
print(df)
Ans.
     Country
IND    India
CAN   Canada

Q.11 The teacher needs to know the marks scored by the student with roll number 4. Help her to
identify the correct set of statement/s from the given options :
a. df1=df[df['rollno']==4]
print(df1)

b. df1=df[rollno==4]
print(df1)
c. df1=df[df.rollno=4]
print(df1)
d. df1=df[df.rollno==4]
print(df1)
Ans.
a. df1=df[df['rollno']==4]
print(df1)
d. df1=df[df.rollno==4]
print(df1)
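
Both accepted forms use Boolean indexing. The sketch below shows that they select the same row; the DataFrame and its marks are made-up values for illustration.

```python
import pandas as pd

# Hypothetical class data; only the column name rollno comes from the question
df = pd.DataFrame({"rollno": [1, 2, 3, 4, 5],
                   "marks": [67, 81, 74, 90, 58]})

df1 = df[df["rollno"] == 4]   # option a: column selected with brackets
df2 = df[df.rollno == 4]      # option d: column selected as an attribute

print(df1)
```

Option b fails because rollno is not a Python variable, and option c fails because = is assignment, not comparison.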

Q.12
In Pandas the function used to delete a column in a DataFrame is
a. remove
b. del
c. drop
d. cancel
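Ans. (c) drop, as in df.drop(columns=['col']). The statement del df['col'] (option b) also
deletes a column in place, but del is a Python statement rather than a Pandas function.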
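
Both del and drop can remove a column, but they behave differently. This is a minimal sketch on a made-up DataFrame:

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"name": ["Asha", "Ravi"],
                   "marks": [82, 76],
                   "grade": ["A", "B"]})

# drop() is a DataFrame method; by default it returns a new DataFrame
dropped = df.drop(columns=["grade"])

# del is a Python statement; it removes the column from df itself
del df["marks"]

print(list(dropped.columns))
print(list(df.columns))
```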

Q.13 ____________ function applies the passed function on each individual data element of the
dataframe.
a. apply() b. applymap() c. pivot() d. pivot_table()
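Ans. b. applymap(). apply() passes a whole row or column to the function, whereas
applymap() works on each individual element. Pandas 2.1 and later provide the same behaviour
as DataFrame.map() and deprecate applymap().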
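
The contrast between column-wise and element-wise application can be sketched as follows. The DataFrame is a made-up example, and the getattr fallback is one way, under the assumption that either DataFrame.map or applymap is available, to run on both older and newer Pandas versions.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

# apply(): the function receives each column as a whole Series
column_max = df.apply(lambda col: col.max())

# Element-wise: use DataFrame.map where it exists (Pandas 2.1 and later),
# otherwise fall back to applymap on older versions
elementwise = getattr(df, "map", None) or df.applymap
squared = elementwise(lambda v: v * v)

print(column_max)
print(squared)
```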

Q.14 Which of the following statement/s will give the exact number of values in
each column of the dataframe?
i. print(df.count())
ii. print(df.count(0))
iii. print(df.count)
iv. print(df.count(axis='index'))
Choose the correct option:
a. both (i) and (ii)
b. only (ii)
c. (i), (ii) and (iii)
d. (i), (ii) and (iv)
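Ans. d. (i), (ii) and (iv). count() counts the non-null values in each column by default
(axis=0), and axis='index' is another name for axis=0. Statement (iii) only prints the method
object because the parentheses are missing.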
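
A short sketch of count() on a made-up DataFrame with some missing values, showing that the default call and axis='index' agree:

```python
import pandas as pd

# Hypothetical data with missing entries in columns a and b
df = pd.DataFrame({"a": [1, None, 3],
                   "b": ["x", "y", None],
                   "c": [1.5, 2.5, 3.5]})

per_column = df.count()                       # non-null values in each column
per_column_named = df.count(axis="index")     # same as axis=0
is_method = callable(df.count)                # df.count without () is just the method

print(per_column)
```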
Q.15 Which of the following command will display the column labels of the DataFrame?
a. print(df.columns()) b. print(df.column()) c. print(df.column) d. print(df.columns)
Ans. d. print(df.columns). columns is an attribute, not a method, so df.columns() raises a
TypeError.
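
Whether columns can be called like a function is easy to test. The sketch below uses a made-up two-column DataFrame:

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"rollno": [1, 2], "marks": [67, 81]})

labels = list(df.columns)   # columns is an attribute holding an Index object

try:
    df.columns()            # an Index object is not callable
    call_failed = False
except TypeError:
    call_failed = True

print(labels, call_failed)
```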

Q.16 State True / False:


A dataframe cannot be created using another dataframe.
Ans. False

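Q.17 Which method is used to access a vertical subset of a dataframe?
(i) iterrows()
(ii) iteritems()
(iii) itertuples()
Ans. (ii) iteritems(). It iterates over the columns as (label, Series) pairs. In Pandas 2.0
and later iteritems() has been removed; use items() instead.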

Q.18 State whether True or False


a. A series object is size mutable.
b. A Dataframe object is value mutable
Ans. a. False
b. True

Q.19 Define the iterrows()


Ans. iterrows() returns the iterator yielding each index value along with a series containing the
data in each row.
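
Row-wise and column-wise iteration can be compared in a short sketch. The DataFrame is a made-up example, and items() is used for the columns because it is available in current Pandas versions.

```python
import pandas as pd

# Hypothetical marks table, for illustration only
df = pd.DataFrame({"maths": [82, 91], "science": [86, 79]}, index=["r1", "r2"])

rows = []
for label, row in df.iterrows():      # row is a Series holding one row
    rows.append((label, int(row["maths"]), int(row["science"])))

columns = []
for name, column in df.items():       # column is a Series holding one column
    columns.append((name, column.tolist()))

print(rows)
print(columns)
```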

Q.20 Which function is used to export DataFrame to a CSV file ?


Ans. To export a Pandas DataFrame to a CSV file, use the to_csv() function (the name is
written in lowercase).
Syntax : DataFrame.to_csv(parameters)
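
A minimal round trip between a DataFrame and CSV text is sketched below. To keep the example self-contained it writes to a string instead of a file; the data and variable names are assumptions.

```python
import io
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"country": ["India", "Canada"], "code": ["IND", "CAN"]})

# With no path given, to_csv() returns the CSV text as a string
csv_text = df.to_csv(index=False)

# read_csv() accepts a file path or any file-like object
restored = pd.read_csv(io.StringIO(csv_text))

print(csv_text)
print(restored)
```

Passing a file name such as a path string to to_csv() and read_csv() works the same way but writes to and reads from disk.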

SA – Short Answer Question (for 2 Marks)


Q.1 What are the operations on a Pandas DataFrame?
Ans. We can perform the following advanced operations on a DataFrame :
(i) Assignment
(ii) Selection
(iii) Pivoting
(iv) Sorting
(v) Aggregation

Q.2 Give the output of the code


>>>import pandas as pd
>>>a= pd.DataFrame([1,1,1,None],index=['a', 'b', 'c', 'd'], columns=['One'])
>>>print(a)
Ans. One
a 1.0
b 1.0
c 1.0
d NaN
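
The output shows 1.0 rather than 1 because the missing value turns the whole column into floating point. A runnable sketch, assuming the keyword is spelled columns:

```python
import math
import pandas as pd

a = pd.DataFrame([1, 1, 1, None], index=["a", "b", "c", "d"], columns=["One"])
print(a)

missing = a.loc["d", "One"]        # None is stored as NaN
n_missing = a["One"].isna().sum()  # number of missing entries in the column
print(a["One"].dtype)
```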

Q.3 Explain DataFrame. Can it be considered as 1D Array or 2D Array


Ans. A DataFrame is a two-dimensional array with heterogeneous data, usually represented in
tabular format. It can be considered a 2D array.
Q.4 Write the output of the following code
import pandas as pd
data=['a', 'b', 'c', 'd', 'e']
df = pd.DataFrame(data)
print(df)
Ans. Output :
   0
0  a
1  b
2  c
3  d
4  e
