Machine Learning Using Python Question Paper 1

Exam Paper 1

Machine Learning using Python [Time: 4 hrs]


[Total Marks: 50]

Part I: Supervised Learning [Total Marks - 30]

Given is the ‘Portugal Bank Marketing’ dataset:

Bank client data:


1) age (numeric)

2) job: type of job (categorical: "admin.","blue-collar","entrepreneur","housemaid","management","retired","self-employed","services","student","technician","unemployed","unknown")

3) marital: marital status (categorical: "divorced","married","single","unknown"; note: "divorced" means divorced or widowed)

4) education: education of individual (categorical: "basic.4y","basic.6y","basic.9y","high.school","illiterate","professional.course","university.degree","unknown")

5) default: has credit in default? (categorical: "no","yes","unknown")

6) housing: has housing loan? (categorical: "no","yes","unknown")

7) loan: has personal loan? (categorical: "no","yes","unknown")

Related with the last contact of the current campaign:


8) contact: contact communication type (categorical: "cellular","telephone")

9) month: last contact month of year (categorical: "jan", "feb", "mar", …, "nov", "dec")

10) dayofweek: last contact day of the week (categorical: "mon","tue","wed","thu","fri")

11) duration: last contact duration, in seconds (numeric). Important note: this attribute highly
affects the output target (e.g., if duration=0 then y="no"). Yet, the duration is not known before a call
is performed. Also, after the end of the call y is obviously known. Thus, this input should only be
included for benchmark purposes and should be discarded if the intention is to have a realistic
predictive model.
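
The note on duration describes target leakage: the value is only known once the call is over. A minimal sketch of keeping two feature sets, one for benchmarking and one for a realistic model, assuming the data sits in a pandas DataFrame with the column names listed above. The toy rows are invented for illustration only.

```python
import pandas as pd

# Toy rows for illustration only; the real values come from the exam dataset.
df = pd.DataFrame({
    "age": [30, 45, 52, 38],
    "duration": [0, 210, 95, 640],   # known only after the call ends
    "campaign": [1, 2, 1, 3],
    "y": ["no", "no", "no", "yes"],
})

target = df["y"]

# Benchmark feature set: keeps duration, so scores will look optimistic.
X_benchmark = df.drop(columns=["y"])

# Realistic feature set: drops duration because it is unknown before the call.
X_realistic = df.drop(columns=["y", "duration"])

print(list(X_benchmark.columns))
print(list(X_realistic.columns))
```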

Other attributes:
12) campaign: number of contacts performed during this campaign and for this client (numeric,
includes last contact)

13) pdays: number of days that passed by after the client was last contacted from a previous
campaign (numeric; 999 means client was not previously contacted)

14) previous: number of contacts performed before this campaign and for this client (numeric)

15) poutcome: outcome of the previous marketing campaign (categorical: "failure","nonexistent","success")
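
The value 999 in pdays is a sentinel for "not previously contacted" rather than a real day count, so treating it as a number can distort means and scaling. One possible way to handle it is sketched below. The column names previously_contacted and pdays_clean are hypothetical, and the toy values are invented for illustration.

```python
import pandas as pd

# Toy values for illustration only.
df = pd.DataFrame({"pdays": [999, 3, 999, 12]})

# Flag whether the client was contacted in a previous campaign.
df["previously_contacted"] = (df["pdays"] != 999).astype(int)

# Keep real day counts and turn the 999 sentinel into a missing value (NaN).
df["pdays_clean"] = df["pdays"].where(df["pdays"] != 999)

print(df)
```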

Social and economic context attributes


16) emp.var.rate: employment variation rate - quarterly indicator (numeric)

17) cons.price.idx: consumer price index - monthly indicator (numeric)

18) cons.conf.idx: consumer confidence index - monthly indicator (numeric)

19) euribor3m: euribor 3 month rate - daily indicator (numeric)

20) nr.employed: number of employees - quarterly indicator (numeric)

Output variable (desired target):

21) y: has the client subscribed to a term deposit? (binary: "yes","no")

Perform the following tasks (marks in brackets):

Q1. What does the primary analysis of several categorical features reveal? [5]
Q2. Perform the following Exploratory Data Analysis tasks: [10]
a. Missing Value Analysis
b. Label Encoding wherever required
c. Selecting important features based on Random Forest
d. Handling unbalanced data using SMOTE
e. Standardize the data using any one of the scalers provided by sklearn
Q3. Build the following Supervised Learning models: [10]
a. Logistic Regression
b. AdaBoost
c. Naïve Bayes
d. KNN
e. SVM
Q4. Tabulate the performance metrics of all the above models and tell which model performs better in predicting if the client will subscribe to term deposit or not [5]
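
As an illustration of the workflow behind Q2e, Q3 and Q4, the sketch below standardizes features, fits the five listed model families with scikit-learn, and tabulates their metrics. It is not a model answer: it assumes a synthetic, imbalanced stand-in for the encoded bank data, uses default hyperparameters, and omits label encoding, feature selection and SMOTE. SMOTE is provided by the separate imbalanced-learn package and, if used, would normally be applied to the training split only.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for the encoded bank data (assumption).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.85, 0.15], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(),
}

# Collect one row of test-set metrics per model.
rows = []
for name, model in models.items():
    model.fit(X_train_s, y_train)
    pred = model.predict(X_test_s)
    rows.append({
        "model": name,
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
    })

results = pd.DataFrame(rows).set_index("model")
print(results.round(3))
```

Because the target is imbalanced, recall and F1 on the "yes" class are usually more informative than accuracy when comparing the rows of the table.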

Part II: Time Series [Total Marks - 20]

The given data ‘MonthWiseMarketArrivals_Clean.csv’ is about the Indian onion market. Attribute information:

1. Market Name - Market Place Name
2. Month - Month (January-December)
3. Year - 1996-2016
4. Quantity - Quantity of Onion (in Kgs)
5. priceMin - Minimum Selling Price
6. priceMax - Maximum Selling Price
7. Pricemod - Modal Price
8. State - State of market
9. City - City of market
10. Date - Date of arrival

Perform the following tasks (marks in brackets):

Q1. Get the modal price of onion for each month for the Mumbai market (Hint: set monthly date as index and drop redundant columns) [2]
Q2. Build a time series model and check the performance of the model using RMSE [8]
Q3. Plot ACF and PACF plots [5]
Q4. Apply exponential smoothing using the Holt-Winters technique and forecast the onion price for the Mumbai market [5]
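
The preparation and evaluation steps in these tasks can be sketched with pandas and NumPy alone. The example below is a rough illustration, not a model answer: it assumes column names market, date and priceMod (the real file may differ), uses synthetic prices, and swaps in a seasonal-naive baseline in place of Holt-Winters smoothing. The ACF is computed by hand rather than plotted. The statsmodels package provides ExponentialSmoothing, plot_acf and plot_pacf for the actual tasks.

```python
import numpy as np
import pandas as pd

# Synthetic monthly modal prices with a yearly cycle (illustration only).
rng = np.random.default_rng(0)
dates = pd.date_range("2012-01-01", periods=48, freq="MS")
price = 1000 + 200 * np.sin(2 * np.pi * np.arange(48) / 12) + rng.normal(0, 30, 48)
raw = pd.DataFrame({"market": "MUMBAI", "date": dates, "priceMod": price})

# Q1-style preparation: keep one market, index by monthly date, keep modal price.
series = raw.loc[raw["market"] == "MUMBAI"].set_index("date")["priceMod"].sort_index()

# Hold out the last 12 months for evaluation.
train, test = series.iloc[:-12], series.iloc[-12:]

# Seasonal-naive baseline: forecast each month with the value 12 months earlier.
forecast = pd.Series(train.iloc[-12:].to_numpy(), index=test.index)

# Root mean squared error between held-out values and the forecast.
rmse = float(np.sqrt(np.mean((test - forecast) ** 2)))


def acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.sum(x ** 2)
    return np.array(
        [np.sum(x[k:] * x[: len(x) - k]) / denom for k in range(max_lag + 1)]
    )


acf_values = acf(train, 12)
print(f"RMSE: {rmse:.2f}")
print(np.round(acf_values, 2))
```

A baseline like this gives a reference RMSE that a fitted Holt-Winters or ARIMA-type model should beat before its forecast is trusted.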
