CSE1703 - Fundamental of Data Science
Enrolment No:
Instructions:
All Questions are compulsory.
All questions are Objective Type.
Excel sheet to solve questions will be provided to you just before the exam.
Exam is of 40 marks and you have 1.5 hours to complete it.
CO1 [1 × 10 = 10]
a) Data Analysis
b) Data Science
c) Descriptive Analytics
d) None of the mentioned
C. Which of the following are not true about Pivot Tables? Select all that apply. CO1
a) Pivot tables can be filtered by multiple columns
b) Pivot tables automatically calculate grand totals of rows and columns
c) Editing a pivot table will impact the original data source
d) Dates in a pivot table can be grouped by years, quarters, months, days, hours,
minutes and seconds.
CO2
F. Suppose we have 200 input features and 1 target variable, and you have to select the 20 most
important features based on the relationship between the input features and the target variable.
Do you think this is an example of dimensionality reduction? CO1
a. Yes
b. No
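The selection step described in question F can be sketched as follows. This is a minimal illustration on synthetic data, assuming absolute Pearson correlation with the target as the relevance score; the question does not specify a scoring method, and the names `X`, `y`, `top_k`, and `selected` are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic stand-in data: 500 rows, 200 input features (row count is an assumption).
X = pd.DataFrame(rng.normal(size=(500, 200)),
                 columns=[f"f{i}" for i in range(200)])

# Hypothetical target that depends only on the first three features.
y = 3 * X["f0"] - 2 * X["f1"] + X["f2"] + rng.normal(scale=0.1, size=500)

# Score every feature by |correlation with the target| and keep the 20 best.
top_k = 20
scores = X.corrwith(y).abs()
selected = scores.nlargest(top_k).index.tolist()
X_reduced = X[selected]

print(X_reduced.shape)  # (500, 20)
```

The reduced frame keeps 20 of the original columns unchanged, so no new combined features are created.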
G. Which of the following techniques would perform better for reducing the dimensions of a
feature set? CO1
a. Removing columns with dissimilar data trends
b. Removing columns which have high variance in data
c. Removing columns which have too many missing values
d. None of these
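As an illustration of what option (c) describes, the sketch below drops columns whose share of missing values exceeds a cut-off. The 50% threshold, the toy data, and the names `max_missing` and `kept` are assumptions made for this example only.

```python
import numpy as np
import pandas as pd

# Toy data: column "b" is 75% missing, column "c" is 25% missing.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [np.nan, np.nan, np.nan, 1.0],
    "c": [5.0, np.nan, 7.0, 8.0],
})

max_missing = 0.5                      # assumed cut-off
missing_ratio = df.isna().mean()       # fraction of missing values per column
kept = df.loc[:, missing_ratio <= max_missing]

print(list(kept.columns))  # ['a', 'c']
```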
H. PCA can be used for projecting and visualizing data in lower dimensions. CO3
a. True
b. False
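A projection of the kind mentioned in question H can be sketched with NumPy alone. This is a minimal example on synthetic data, computing PCA through the singular value decomposition of the centred data; the array sizes and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))          # synthetic 5-dimensional data

# Centre each column, then decompose; the rows of Vt are the principal directions.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Project onto the first two principal components.
X_2d = X_centered @ Vt[:2].T

print(X_2d.shape)  # (100, 2)
```

The two columns of `X_2d` can then be passed to a scatter plot for visual inspection.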
b. The dataframe contains ‘?’ entries. After removing all rows that contain a ‘?’, the dimension of the
resultant dataframe is: [3]
I. (170, 26)
II. (159, 26)
III. (201, 26)
IV. (205, 26)
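One possible way to carry out this step is sketched below on a small made-up dataframe. It assumes that ‘?’ marks a missing value and that removal means dropping every row that contains one; the column names and values are hypothetical and are not taken from the exam dataset.

```python
import numpy as np
import pandas as pd

# Made-up data in which '?' stands for a missing value.
df = pd.DataFrame({
    "make":  ["audi", "bmw", "?", "volvo"],
    "price": ["13950", "?", "16500", "12940"],
    "doors": ["four", "two", "two", "four"],
})

# Turn '?' into NaN, then drop every row that has at least one NaN.
cleaned = df.replace("?", np.nan).dropna()

print(df.shape, cleaned.shape)  # (4, 3) (2, 3)
```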
import csv
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.impute import SimpleImputer
# Reading an Excel file using Python
import xlrd
# Read the Excel data
df = pd.read_excel("…Insert path…/Summer-Olympic-medals-1976-to-2008student.xlsx")
df.head()
c. Find the total number of medals won in the city Atlanta in the sport ‘Wrestling’. [3]
i. 48
ii. 70
iii. 60
iv. 54
d. How many medals in total were won by women in the city Sydney? [2]
i. 600
ii. 777
iii. 899
iv. 889
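Counts like those asked for in parts c and d can be obtained with boolean filters. The sketch below uses a tiny made-up table; the column names `City`, `Sport`, `Gender`, and `Medal`, and the assumption of one row per medal, are illustrative and should be checked against the provided sheet.

```python
import pandas as pd

# Made-up rows, one per medal (not the exam data).
medals = pd.DataFrame({
    "City":   ["Atlanta", "Atlanta", "Atlanta", "Sydney", "Sydney", "Sydney"],
    "Sport":  ["Wrestling", "Wrestling", "Aquatics", "Wrestling", "Aquatics", "Athletics"],
    "Gender": ["Men", "Men", "Women", "Men", "Women", "Women"],
    "Medal":  ["Gold", "Silver", "Bronze", "Gold", "Gold", "Silver"],
})

# Combine conditions with & and wrap each one in parentheses.
atlanta_wrestling = medals[(medals["City"] == "Atlanta") & (medals["Sport"] == "Wrestling")]
sydney_women = medals[(medals["City"] == "Sydney") & (medals["Gender"] == "Women")]

print(len(atlanta_wrestling), len(sydney_women))  # 2 2
```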
4 Use the Excel sheet outlier_example2.xlsx provided to you for the following problem. CO2
Problem Statement: Use the following code and extend it to answer the following
questions.
import csv
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from sklearn.impute import SimpleImputer
import seaborn as sns
5 Suppose you want to predict the car price based on the parameter highway-mpg.
After building a model, you obtain the following results:
CO3 [2]
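A model of this kind can be sketched as a least-squares line of the form price = intercept + slope × highway-mpg. The numbers below are made up for illustration and are not the results referred to in the question; `np.polyfit` is used here as one possible fitting routine.

```python
import numpy as np

# Made-up values for illustration only.
highway_mpg = np.array([20.0, 25.0, 30.0, 35.0, 40.0])
price = np.array([30000.0, 25000.0, 20000.0, 15000.0, 10000.0])

# A degree-1 fit returns the coefficients as [slope, intercept].
slope, intercept = np.polyfit(highway_mpg, price, 1)

# Predict the price for a hypothetical car with 28 highway-mpg.
predicted = intercept + slope * 28.0

print(f"price = {intercept:.0f} + ({slope:.0f}) * highway_mpg")
```

A negative slope in such a fit would mean that the predicted price falls as highway-mpg rises.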
6 Consider the following code and the respective solutions. What is the final estimated linear
model that you get? CO3 [3]