20CS1101_Introduction to Data Science
1 a Define data science and discuss the benefits and uses of data science. [L1][CO1] [6M]
2 Explain in detail the various data types used in data science and big data. [L2][CO1] [12M]
b How will you create research goals in a project charter? [L1][CO1] [6M]
5 How will you retrieve the required data for a data science project? [L1][CO5] [12M]
6 Discuss in detail the data cleaning operations in data science. [L2][CO1] [12M]
b What are the ways to analyze the data and build a well-performing model? [L2][CO1] [6M]
10 a How will you handle missing data in data science? [L2][CO1] [6M]
b Examine how k-nearest neighbor techniques look at the k nearest points to make a prediction. [L4][CO1] [6M]
Course Code: 20CS1101 R20
UNIT –II
STATISTICAL METHODS FOR EVALUATION & ASSOCIATION RULES
UNIT –III
REGRESSION & CLASSIFICATION
1 a Which two basic measures do entropy-based methods use to select the most informative attribute? [L1][CO3] [6M]
b Define confusion matrix [L1][CO3] [6M]
2 Explain the analytical technique Linear Regression with its model description. [L2][CO3] [12M]
3 Discuss the following with respect to linear regression [L2][CO3] [12M]
a) Categorical Variables
b) Confidence Intervals on the Parameters
c) Confidence Interval on the Expected Outcome
d) Prediction Interval on a Particular Outcome
4 a Justify the usage of linear regression and logistic regression. [L6][CO3] [4M]
b Illustrate Logistic Regression Model. [L3][CO3] [8M]
5 a Describe Decision Trees in detail with example. [L2][CO3] [6M]
b Differentiate between the alternative hypothesis and the null hypothesis. [L2][CO4] [6M]
6 Interpret the decision tree algorithms. [L4][CO4] [12M]
7 a State Bayes’ Theorem [L1][CO4] [4M]
b Discuss Naïve Bayes classification method considering an example [L2][CO4] [8M]
8 How does one pick the most suitable method for a given classification problem? [L2][CO4] [12M]
9 a Compare the C4.5 and CART algorithm of decision tree. [L4][CO4] [4M]
b Discriminate the ways in which the evaluation of a decision tree is done. [L5][CO4] [4M]
c Give the two approaches that help avoid overfitting in decision tree learning. [L2][CO4] [4M]
10 Discuss the following terms: [L4][CO4] [12M]
a) Accuracy
b) TPR
c) FPR
d) FNR
e) Precision
UNIT –IV
CLUSTERING & TIME SERIES ANALYSIS
4 a Indicate when the time series y_t, for t = 1, 2, 3, …, is said to be a stationary time series. [L2][CO6] [6M]
b Express the stationary time series conditions in detail. [L6][CO6] [6M]
5 Discuss in detail each part of the ARIMA model. [L2][CO5] [12M]
6 a List and explain time series components [L1][CO6] [6M]
b Discriminate the steps involved in Box-Jenkins Methodology [L5][CO6] [6M]
7 a What is meant by k-means? [L1][CO5] [4M]
b Describe k-means algorithm to find k clusters [L2][CO5] [8M]
8 Correlate ARMA and ARIMA Models [L4][CO6] [12M]
9 Express the following [L2][CO6] [12M]
a) Autocorrelation Function
b) Autoregressive Models
10 List and describe additional time series methods. [L2][CO6] [12M]
Prepared by:
Mr. G. Prasad Babu
Associate Professor