20CS1101 - Introduction To Data Science

The document is a question bank for the course 'Introduction to Data Science' at Siddharth Institute of Engineering & Technology, covering various units including data science fundamentals, statistical methods, regression and classification, clustering, and text analysis. Each unit contains descriptive questions aimed at evaluating students' understanding of key concepts and techniques in data science. The questions range from definitions and explanations to detailed discussions and analyses of specific methods and algorithms.


Course Code: 20CS1101 R20

SIDDHARTH INSTITUTE OF ENGINEERING & TECHNOLOGY:: PUTTUR

(AUTONOMOUS)
Siddharth Nagar, Narayanavanam Road – 517583

QUESTION BANK (DESCRIPTIVE)

Subject with Code: (20CS1101) INTRODUCTION TO DATA SCIENCE

Course & Branch: B.Tech – CSE(CAD) Regulation: R20
UNIT –I
INTRODUCTION TO DATA SCIENCE

1 a Define Data Science and discuss the benefits and uses of data science. [L1][CO1] [6M]

b Discuss the Various Processing Steps in Data Science [L2][CO1] [6M]

2 Explain in detail the various data types used in Data Science and Big Data [L2][CO1] [12M]

3 a Analyze the term: Distributed file systems [L4][CO1] [6M]

b How will you create research goals in a project charter? [L1][CO1] [6M]

4 Classify the term big data ecosystem [L4][CO1] [12M]

5 How will you retrieve the required data in data science? [L1][CO5] [12M]
6 Discuss in detail the data cleaning operation in data science [L2][CO1] [12M]

7 a What are the various steps involved in the integrating phase? [L1][CO1] [6M]

b What is meant by exploratory data analysis [L1][CO1] [6M]

8 Examine the term: Transforming data in Data science [L3][CO1] [12M]

9 a Show the various components of model building. [L2][CO1] [6M]

b What are the ways to analyze the data and build a well-performing model? [L2][CO1] [6M]

10 a How will you handle missing data in data science? [L2][CO1] [6M]
b Examine how k-nearest neighbor techniques look at the k nearest points to make a prediction [L4][CO1] [6M]

UNIT –II
STATISTICAL METHODS FOR EVALUATION & ASSOCIATION RULES

1 a Define Hypothesis Testing [L1][CO2] [6M]

b How will you mathematically define Confidence [L1][CO2] [6M]
2 a Differentiate Null Hypotheses and Alternative Hypotheses [L4][CO2] [6M]
b Examine the application property of Wilcoxon rank-sum test [L3][CO2] [6M]
3 Discriminate about Difference of Means [L5][CO2] [12M]
4 Explain the differences between BI and Data Science. [L2][CO2] [12M]
5 Explain the following [L2][CO2] [12M]
a) Student’s t-test
b) Welch’s t-test
6 a What are the three characteristics of Big Data, and what are the main considerations in processing Big Data? [L1][CO2] [6M]
b How is the evaluation of candidate rules done? [L2][CO2] [6M]
7 a What is a type I error? What is a type II error? Is one always more serious than the other? Why? [L1][CO2] [6M]
b Give the difference between Validation and Testing [L4][CO2] [6M]
8 a State Apriori Algorithm [L1][CO2] [4M]
b Explain Apriori Algorithm with example [L2][CO2] [8M]
9 a List and discuss the four measures of significance of Association rules [L1][CO2] [6M]
b Give the Applications of Association Rules [L1][CO2] [6M]
10 Illustrate any five approaches to improve Apriori’s efficiency when the dataset is large. [L3][CO2] [12M]

UNIT –III
REGRESSION & CLASSIFICATION

1 a Which two basic measures do entropy methods use to select the most informative attribute? [L1][CO3] [6M]
b Define confusion matrix [L1][CO3] [6M]
2 Explain the analytical technique Linear Regression with its model description. [L2][CO3] [12M]
3 Discuss the following with respect to linear regression [L2][CO3] [12M]
a) Categorical Variables
b) Confidence Intervals on the Parameters
c) Confidence Interval on the Expected Outcome
d) Prediction Interval on a Particular Outcome
4 a Justify the usage of linear regression and logistic regression. [L6][CO3] [4M]
b Illustrate Logistic Regression Model. [L3][CO3] [8M]
5 a Describe Decision Trees in detail with example. [L2][CO3] [6M]
b Differentiate between the alternative hypothesis and the null hypothesis [L2][CO4] [6M]
6 Interpret the decision tree algorithms [L4][CO4] [12M]
7 a State Bayes’ Theorem [L1][CO4] [4M]
b Discuss Naïve Bayes classification method considering an example [L2][CO4] [8M]
8 How does one pick the most suitable method for a given classification problem? [L2][CO4] [12M]
9 a Compare the C4.5 and CART algorithm of decision tree. [L4][CO4] [4M]
b Discriminate the ways in which the evaluation of a decision tree is done [L5][CO4] [4M]
c Give the two approaches that help avoid overfitting in decision tree learning. [L2][CO4] [4M]
10 Discuss the following terms: [L4][CO4] [12M]
a) Accuracy
b) TPR
c) FPR
d) FNR
e) Precision
UNIT –IV
CLUSTERING & TIME SERIES ANALYSIS

1 a What is clustering? [L1][CO5] [6M]

b State the advantage of using PAM. [L1][CO5] [6M]
2 Illustrate the method to find k clusters from a collection of M objects with n attributes. [L3][CO5] [12M]
3 a Explain any one case study for time series analysis [L2][CO5] [6M]

b What is forecasting in association with time series? Explain. [L1][CO6] [6M]

4 a Indicate when the time series y_t for t = 1, 2, 3, … is said to be a stationary time series. [L2][CO6] [6M]
b Express the stationary time series conditions in detail. [L6][CO6] [6M]
5 Discuss in detail each part of the ARIMA model [L2][CO5] [12M]
6 a List and explain time series components [L1][CO6] [6M]
b Discriminate the steps involved in Box-Jenkins Methodology [L5][CO6] [6M]
7 a What is meant by k-means [L1][CO5] [4M]
b Describe k-means algorithm to find k clusters [L2][CO5] [8M]
8 Correlate ARMA and ARIMA Models [L4][CO6] [12M]
9 Express the following [L2][CO6] [12M]
a) Autocorrelation Function
b) Autoregressive Models
10 List and describe Additional time series methods [L2][CO6] [12M]

UNIT –V
TEXT ANALYSIS

1 a Define Porter’s stemming algorithm. [L1][CO6] [6M]

b What is Topic modeling? [L1][CO6] [6M]
2 Explain the three important steps of the text analysis [L2][CO6] [12M]
3 a Sketch the flow diagram of Text analysis process [L5][CO6] [6M]
b Illustrate in detail the steps involved in the process of Text Analysis done by organizations [L3][CO6] [6M]
4 a Define TFIDF. [L1][CO6] [4M]
b Describe the usage of TFIDF to compute the usefulness of each word in the texts. [L2][CO6] [8M]
5 Explain how the data science team will categorize the reviews by topics [L2][CO6] [12M]
6 Illustrate the main challenges of text analysis [L3][CO6] [12M]
7 a Define Topic model. Describe LDA. [L2][CO6] [6M]
b Justify the process of topic modeling simplification. [L6][CO6] [6M]
8 Explain the following [L3][CO6] [12M]
a) Tokenization
b) Case folding
9 a Explain how categorizing documents by topics is done. [L2][CO6] [6M]
b Interpret the procedure used in data science to gain insights into customer opinions [L3][CO6] [6M]
10 a What is meant by sentiment analysis [L1][CO6] [4M]
b Discriminate the methods used for sentiment analysis [L5][CO6] [8M]

Prepared by:
Mr. G. Prasad Babu
Associate Professor

INTRODUCTION TO DATA SCIENCE (R20)
