Midterm Exam (CSE 321 (Data Mining and Machine Learning), A (Day), Fall 2021)

The document outlines the midterm exam details for the course CSE 321 on Data Mining and Machine Learning at Daffodil International University, scheduled for November 16, 2021. It includes a case study related to COVID-19 infection factors and poses questions requiring data mining techniques and classification models. Additionally, it presents a series of short answer questions on various data mining concepts.

Uploaded by

Istiak Utsab

Daffodil International University

Department of Computer Science and Engineering


Faculty of Science & Information Technology
Midterm Examination, Fall 2021 @ DIU Blended Learning Center
Course Code: CSE 321 (Day), Course Title: Data Mining and Machine Learning
Level: 3 Term: 2 Section: ALL
Instructor: ALL Modality: Open Book Exam
Date: Tuesday 16 November, 2021 Time: 01:30 pm - 04:00 pm
Two and a half hours (2.5 hrs) to support online open/case-study-based assessment Marks: 25
Directions:
• Students need to go through the CASE STUDY shown in this exam paper.
• Analyze and answer each section based on your own thinking and work.
• Do not share your answers; sharing will be treated as plagiarism by the Blended Learning Center.

Answer all of the following questions. Figures in the right-hand margin indicate full marks.

1. The COVID-19 pandemic has been causing huge loss of life and suffering around the
world since the beginning of this year. Although its intensity has diminished to some
extent, the outbreak is still ongoing. To help people live a healthy and safe life during this
pandemic period, the World Health Organization (WHO) and some other organizations and
researchers have identified the factors that cause COVID-19 infection. A data set has been
collected based on some prominent factors, which is as follows:

Sl #  Name    Age  Gender  Community Consciousness  Hygiene Practice  Infected by COVID-19
1     Rubel   18   Male    Low                      Low               No
2     Sohel   15   Male    Low                      Low               No
3     Parvej  56   Male    Medium                   Medium            Yes
4     Jamal   50   Male    Medium                   Medium            Yes
5     Kamal   53   Male    Low                      Low               Yes
6     Miron   42   Female  High                     High              No
7     Kiron   63   Female  High                     Medium            Yes
8     Shila   57   Female  Medium                   High              No
9     Nahar   60   Female  High                     High              No
10    Robin   65   Male    Medium                   High              Yes
11    Tahera  39   Female  Low                      Low               Yes
12    Mila    29   Female  Low                      Medium            No

Now, answer the following questions based on the above data set:

a) Write your plan to apply a suitable data mining technique for the given problem and justify your answer. 5
b) Prepare a Bayesian classification model to classify the following test record X, where X = (Age, Gender, Community Consciousness, Hygiene Practice) = (51, Male, Medium, Low). 10
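As an illustration of what part b) involves, the following is a minimal naive Bayes sketch over the table above, not a model answer. It rests on assumptions the exam does not state: Age is modeled with a per-class Gaussian (using the sample variance), the three categorical attributes use raw relative frequencies without Laplace smoothing, and the names `RECORDS`, `gaussian_pdf`, and `class_scores` are hypothetical helpers introduced here. Discretizing Age into bins would be an equally reasonable choice.

```python
import math
from collections import defaultdict

# Training records from the table: (age, gender, community, hygiene, infected)
RECORDS = [
    (18, "Male", "Low", "Low", "No"),
    (15, "Male", "Low", "Low", "No"),
    (56, "Male", "Medium", "Medium", "Yes"),
    (50, "Male", "Medium", "Medium", "Yes"),
    (53, "Male", "Low", "Low", "Yes"),
    (42, "Female", "High", "High", "No"),
    (63, "Female", "High", "Medium", "Yes"),
    (57, "Female", "Medium", "High", "No"),
    (60, "Female", "High", "High", "No"),
    (65, "Male", "Medium", "High", "Yes"),
    (39, "Female", "Low", "Low", "Yes"),
    (29, "Female", "Low", "Medium", "No"),
]


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def class_scores(x, records=RECORDS):
    """Return unnormalised scores P(class) * product of P(attribute | class)."""
    by_class = defaultdict(list)
    for row in records:
        by_class[row[-1]].append(row)

    age, *categorical = x
    scores = {}
    for label, rows in by_class.items():
        n = len(rows)
        score = n / len(records)  # class prior

        # Assumption: Age follows a Gaussian within each class.
        ages = [r[0] for r in rows]
        mean = sum(ages) / n
        var = sum((a - mean) ** 2 for a in ages) / (n - 1)
        score *= gaussian_pdf(age, mean, var)

        # Categorical attributes: relative frequency within the class
        # (no smoothing, so an unseen value would zero the score).
        for i, value in enumerate(categorical, start=1):
            score *= sum(1 for r in rows if r[i] == value) / n

        scores[label] = score
    return scores


X = (51, "Male", "Medium", "Low")
scores = class_scores(X)
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

Under these assumptions the score for "Yes" is larger than the score for "No", so the sketch labels X as infected. A written answer would normally show the individual conditional probabilities as well.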

2. Write the answer to each of the following questions in a single sentence.

a) Can regression be directly applied to the data set in Question 1? Justify your answer. 1

b) Why is scaling necessary in k-NN? What can be the minimum value of k? 1

c) How does Hamming distance become Manhattan distance? 1

d) What is the maximum value of GINI index for an n-class problem? 1

e) If cos θ is used for similarity, what can sin θ be used for? 1

f) What causes the need for dimensionality reduction? 1

g) Can sin x be a good choice of function for attribute transformation? Why? 1

h) According to the definition, what can the possible values of the pth percentile be? 1

i) If some researchers want to find the major causes of smoking in the context of
Bangladesh based on data, which type of algorithm, among the different data mining tasks, is
suitable here? 1

j) Given: 1

– A doctor knows that cavity causes toothache 50% of the time

– The prior probability of any patient having toothache is 1/10,000

– If a patient has toothache, the probability that he/she has cavity is 50%.

Calculate the prior probability of any patient having cavity.
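The quantities given in j) are linked by Bayes' theorem, P(cavity | toothache) = P(toothache | cavity) · P(cavity) / P(toothache), which can be rearranged for the unknown prior. The sketch below shows that rearrangement with exact fractions. The function name `prior_of_cause` and its parameter names are hypothetical, introduced here for illustration only.

```python
from fractions import Fraction


def prior_of_cause(p_effect_given_cause, p_effect, p_cause_given_effect):
    """Solve Bayes' theorem for the prior of the cause.

    P(C | E) = P(E | C) * P(C) / P(E)   =>   P(C) = P(C | E) * P(E) / P(E | C)
    """
    return p_cause_given_effect * p_effect / p_effect_given_cause


p_toothache_given_cavity = Fraction(1, 2)   # cavity causes toothache 50% of the time
p_toothache = Fraction(1, 10000)            # prior probability of toothache
p_cavity_given_toothache = Fraction(1, 2)   # probability of cavity given toothache

p_cavity = prior_of_cause(
    p_toothache_given_cavity, p_toothache, p_cavity_given_toothache
)
print(p_cavity)
```

Because the two conditional probabilities are equal, they cancel, and the prior of cavity equals the prior of toothache.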
