Question Bank Semester: IV Sem Subject: Data Science Sub Code: 17MCA441

This document contains a question bank for the Data Science subject (Semester IV), divided into five modules. Module 1 covers data mining fundamentals, including classification and clustering. Module 2 covers data types, attributes, data quality, preprocessing, dimensionality reduction, and similarity and dissimilarity measures. Module 3 covers association rule mining and frequent item set generation. Module 4 covers classification with decision trees, rule-based classifiers, nearest neighbor algorithms, and Bayesian classifiers. Module 5 covers clustering algorithms such as K-means, hierarchical clustering, and DBSCAN, along with anomaly detection. Each module contains short-answer and long-answer questions on its topics.

Uploaded by

Achutha JC

DAYANANDA SAGAR COLLEGE OF ENGINEERING

DEPARTMENT OF MCA

Question Bank

Semester: IV Sem Subject: Data Science Sub Code: 17MCA441

Module 1

Sl.No. Questions Marks

1 What is data mining? Explain the process of knowledge discovery in databases. 8 marks
2 Explain the motivating challenges in data mining. 8 marks
3 Describe data mining as a confluence of many disciplines. 8 marks
4 What are data mining tasks? Explain. 8 marks
5 Explain market basket analysis in data mining. 8 marks
6 Describe the application areas of data mining. 8 marks
7 What is classification? Explain. 10 marks
8 List some applications of classification. 10 marks
9 What is clustering? List some applications of clustering. 10 marks
10 10marks

Module 2

Sl.No. Questions Marks

1 What is data? What are the types of data? 8 marks
2 Define an attribute. Explain the types of attributes. 8 marks
3 Mention the general characteristics of data sets. 5 marks
4 Discuss specific aspects of data quality. 8 marks
5 Mention the data quality issues in mining data. 8 marks
6 What is data preprocessing? Explain some methods for preprocessing data. 8 marks
7 What is dimensionality reduction? Explain the curse of dimensionality. 10 marks
8 Explain the approaches to feature subset selection. 10 marks
9 Explain discretization and binarization in data mining. 10 marks
10 What are the measures of similarity and dissimilarity? 10 marks
11 Explain the methods to measure the similarity between data objects. 8 marks
12 What are the methods to measure the dissimilarity between data objects? 8 marks
13 What is correlation? Explain. 8 marks
14 What are the issues in proximity calculation? 8 marks
15 Describe the general characteristics of selecting the right proximity measure. 8 marks
16 Distinguish between noise and outliers. 8 marks
17 Which approach, Jaccard or Hamming distance, is more similar to the simple matching coefficient? 8 marks
18 Which approach is more similar to the cosine measure? 8 marks
19 For the following vectors x and y, calculate the indicated similarity or distance measures. 8 marks
x = (1, 1, 1, 1), y = (2, 2, 2, 2): cosine, correlation, Euclidean
x = (0, 1, 0, 1), y = (1, 0, 1, 0): cosine, correlation, Euclidean, Jaccard
20 Explain why computing the proximity between two attributes is often simpler than computing the similarity between two objects. 8 marks
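The measures named in question 19 can be checked numerically. The following is a minimal Python sketch, not part of the original question bank: the function names are illustrative, and it assumes the usual textbook definitions (Pearson correlation, and Jaccard over binary attributes with 0-0 matches ignored). Correlation is returned as None when either vector has zero variance, because the measure is then undefined (0/0).

```python
import math

def cosine(x, y):
    # Dot product divided by the product of the vector lengths.
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def correlation(x, y):
    # Pearson correlation; None when a vector has zero variance (undefined).
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return None
    return cov / (dev_x * dev_y)

def euclidean(x, y):
    # Square root of the summed squared differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def jaccard(x, y):
    # Binary vectors only: 1-1 matches over positions that are not 0-0.
    f11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    not_00 = sum(1 for a, b in zip(x, y) if a == 1 or b == 1)
    return f11 / not_00 if not_00 else None

x1, y1 = (1, 1, 1, 1), (2, 2, 2, 2)
x2, y2 = (0, 1, 0, 1), (1, 0, 1, 0)

print(cosine(x1, y1), correlation(x1, y1), euclidean(x1, y1))
# cosine = 1.0, correlation = None (undefined), Euclidean = 2.0
print(cosine(x2, y2), correlation(x2, y2), euclidean(x2, y2), jaccard(x2, y2))
# cosine = 0.0, correlation = -1.0, Euclidean = 2.0, Jaccard = 0.0
```

The first pair shows that cosine ignores magnitude (y is just 2x), while Euclidean distance does not.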

Module 3

Sl.No. Questions Marks

1 Why use support and confidence? 5 marks
2 What is association rule mining? Explain. 8 marks
3 How are frequent item sets generated? 8 marks
4 Write the steps to generate frequent item sets using the Apriori algorithm. 10 marks
5 Explain candidate generation and pruning. 10 marks
6 Describe support counting using a hash tree. 10 marks
7 How is computational complexity measured in data mining? 8 marks
8 How are rules generated in the Apriori algorithm? 8 marks
9 Describe maximal frequent item sets. 5 marks
10 Explain closed frequent item sets. 5 marks
11 Describe the alternative methods for generating frequent item sets. 10 marks
12 Explain the construction of an FP-tree with an example. 10 marks
13 How are frequent item sets generated in the FP-growth algorithm? 10 marks
14 How are association patterns evaluated? 8 marks
15 What are the objective measures of interestingness? 5 marks
16 Write the limitations of the interest factor, correlation analysis, and IS measures. 8 marks
17 Explain how to discover association rules using a hash tree. 10 marks
18 Mention the factors affecting the complexity. 5 marks
19 Differentiate between maximal and closed frequent item sets. 5 marks
20 How are interestingness measures computed? 5 marks
21 What is the effect of support-based pruning? 5 marks
22 What are statistical independence and statistical-based measures? 5 marks
23 Distinguish between interestingness and unexpectedness. 5 marks
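Several questions in this module rest on the definitions of support and confidence. As a rough illustration only, the Python sketch below computes both for one rule. The transactions are made-up sample data and the function names are hypothetical; neither comes from the question bank.

```python
def support(itemset, transactions):
    # Fraction of transactions that contain every item in the item set.
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    # support(antecedent and consequent together) / support(antecedent).
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Illustrative market-basket data (an assumption, not from the source).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]

# Rule {Milk, Diapers} -> {Beer}
s = support({"Milk", "Diapers", "Beer"}, transactions)       # 2 of 5 transactions
c = confidence({"Milk", "Diapers"}, {"Beer"}, transactions)  # 2 of the 3 with Milk and Diapers
print(f"support={s:.2f} confidence={c:.2f}")
```

Support measures how often the rule applies across the whole data set, while confidence measures how often the consequent appears among transactions that already contain the antecedent.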
Module 4

Sl.No. Questions Marks

1 Describe the general approach to solving a classification problem. 8 marks
2 How does a decision tree work? 8 marks
3 How is a decision tree built? 8 marks
4 Describe how a decision tree grows recursively using Hunt’s Algorithm. 10 marks
5 Describe the design issues of decision tree induction. 8 marks
6 Explain the methods for expressing attribute test conditions. 8 marks
7 What are the measures for selecting the best split? 8 marks
8 Explain how to split the binary, nominal and continuous attributes. 8 marks
9 Write an algorithm for decision tree induction. 8 marks
10 Explain the rule-based classifier. 8 marks
11 How does a rule-based classifier work? 8 marks
12 What are the rule-ordering schemes? 8 marks
13 How is a rule-based classifier built? 8 marks
14 Explain the RIPPER algorithm used for rule induction. 8 marks
15 Describe indirect methods for rule extraction. 8 marks
16 Mention the characteristics of rule-based classifiers. 8 marks
17 Describe the nearest neighbor algorithm. 8 marks
18 Write the k-nearest neighbor classification algorithm. 8 marks
19 What are the characteristics of nearest neighbor classifiers? 8 marks
20 Describe the Bayesian classifiers for classification. 8 marks
21 Explain the naïve Bayes classifier. 8 marks
22 How does a naïve Bayes classifier work? 8 marks
23 How are conditional probabilities estimated for categorical attributes? 8 marks
24 How are conditional probabilities estimated for continuous attributes? 8 marks
25 Describe the m-estimate of conditional probability. 8 marks
26 What are the characteristics of naïve Bayes classifiers? 8 marks
27 How is the Bayes error rate measured? 8 marks
28 What are Bayesian belief networks? Explain. 8 marks

Module 5

1 What is cluster analysis? List the application areas of cluster analysis to practical problems. 10 marks
2 Explain the different types of clustering. 8 marks
3 Describe the basic K-means algorithm with an example. 8 marks
4 Mention the ways of choosing the initial centroids. 8 marks
5 Determine the time and space complexity of the K-means algorithm. 5 marks
6 Explain the bisecting K-means algorithm. 8 marks
7 Mention the strengths and weaknesses of the K-means algorithm. 5 marks
8 Describe the agglomerative hierarchical clustering algorithm. 8 marks
9 Write a basic agglomerative hierarchical clustering algorithm. 8 marks
10 Explain the different ways of defining the proximity between clusters. 8 marks
11 Determine the time and space complexity of the hierarchical clustering algorithm. 8 marks
12 Illustrate Ward’s method for finding the proximity between two clusters. 8 marks
13 Describe the DBSCAN clustering algorithm with an example. 8 marks
14 How are clusters evaluated? Explain. 8 marks
15 What is anomaly detection? Illustrate applications for which anomalies are of interest. 8 marks
16 Mention the causes of anomalies. 8 marks
17 Explain the different approaches to anomaly detection. 8 marks
18 Explain the different issues that need to be addressed when dealing with anomaly detection. 8 marks
19 Describe the statistical approaches to outlier detection. 8 marks
20 Explain proximity-based outlier detection. 8 marks
21 Explain the clustering-based techniques for outlier detection. 8 marks
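Question 3 of this module asks for the basic K-means algorithm with an example. One possible sketch in plain Python follows; it is an illustration under stated assumptions rather than a model answer. The initial centroids are supplied by the caller, Euclidean distance is used, and an empty cluster keeps its previous centroid. The sample points and the name `kmeans` are hypothetical. It requires Python 3.8 or later for `math.dist`.

```python
import math

def kmeans(points, centroids, max_iter=100):
    # Basic K-means: alternate assignment and centroid update until stable.
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                mean = tuple(sum(v) / len(members) for v in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # assumption: keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, clusters

# Made-up 2-D sample with two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centroids, final_clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
print(final_centroids)
print(final_clusters)
```

With these starting centroids the assignments do not change after the first pass, so the loop stops on the second iteration. Different initial centroids can lead to different results, which is the subject of question 4.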
