Assignment 2

Uploaded by

crce.9980.ce

FR. CONCEICAO RODRIGUES COLLEGE OF ENGINEERING


Department of Computer Engineering
Practice Problems 2
(2024-2025)
Class/Sem./Branch: TE/V/COMP    Course code: CSC504
Subject: Data Warehousing and Mining (DWM)    Date: 12/08/2024

Course outcomes: On successful completion of the course, the learner will be able to:


CSC504.2: Understand data mining principles and perform data preprocessing and visualization.

Improvements: Q11, Q12, and Q13 are added for the new course.


Exercise 6

Suppose that the minimum and maximum values for the attribute income are
$12,000 and $98,000, respectively. We would like to map income to the range
[0.0, 1.0]. Use min-max normalization to transform the income value $73,600.
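
The mapping asked for here can be checked with a short script. This is a minimal sketch, assuming the usual min-max formula v' = (v - min) / (max - min) * (new_max - new_min) + new_min; the function name min_max_normalize is illustrative and not part of the assignment.

```python
def min_max_normalize(value, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map value from [old_min, old_max] to [new_min, new_max]."""
    if old_max == old_min:
        raise ValueError("old_min and old_max must differ")
    # Fraction of the way from old_min to old_max
    scale = (value - old_min) / (old_max - old_min)
    return scale * (new_max - new_min) + new_min


# Values taken from the exercise: min = 12,000, max = 98,000, v = 73,600
income_normalized = min_max_normalize(73_600, 12_000, 98_000)
print(round(income_normalized, 3))  # 0.716
```

By hand, this is (73,600 - 12,000) / (98,000 - 12,000) = 61,600 / 86,000, which is about 0.716.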

Exercise 7

Explain the attribute types, with examples and permitted operations, in table
format for a college system. (Consider students and employees.)

Exercise 8

Draw the box plot and the histogram for the data in Exercise 5.

Exercise 9

Give the cosine similarity for the following lines using the binary data
formula and the nominal data formula, and comment on the answer.

L1= I like the data mining than DBMS

L2= Raj loves data mining than DBMS
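
The binary-data part of this exercise can be sketched as follows. The sketch assumes each line is turned into a binary term vector (1 if a word occurs, 0 otherwise), with case-insensitive matching and whitespace tokenization; these preprocessing choices and the name binary_cosine are assumptions, not stated in the exercise. It does not cover the nominal data formula.

```python
import math


def binary_cosine(text_a, text_b):
    """Cosine similarity of two texts represented as binary term vectors."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    # For 0/1 vectors the dot product is the number of shared words,
    # and each vector's length is the square root of its word count.
    shared = len(words_a & words_b)
    return shared / math.sqrt(len(words_a) * len(words_b))


line_1 = "I like the data mining than DBMS"
line_2 = "Raj loves data mining than DBMS"
similarity = binary_cosine(line_1, line_2)
print(round(similarity, 4))  # 0.6172
```

Under these assumptions the two lines share four words (data, mining, than, DBMS) out of seven and six distinct words, giving 4 / sqrt(7 * 6).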


Exercise 10

Find the dissimilarity among the objects for the following data.

Name   Gender  Fever  Cough  Test1  Test2  Test3  Test4
Jack   M       Y      N      P      N      N      N
Marry  F       Y      N      P      N      P      N
Jim    M       Y      Y      N      N      N      N
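
One common way to approach this table is the asymmetric binary dissimilarity d = (r + s) / (q + r + s). The sketch below rests on assumptions the exercise does not state: Gender is treated as a symmetric attribute and left out, Y and P are coded as 1, N as 0, and all remaining attributes are treated as asymmetric binary. The helper names are illustrative.

```python
from itertools import combinations

# Columns: Fever, Cough, Test1, Test2, Test3, Test4 (Gender omitted by assumption)
patients = {
    "Jack":  ["Y", "N", "P", "N", "N", "N"],
    "Marry": ["Y", "N", "P", "N", "P", "N"],
    "Jim":   ["Y", "Y", "N", "N", "N", "N"],
}


def encode(values):
    """Code Y (yes) and P (positive) as 1, everything else as 0."""
    return [1 if v in ("Y", "P") else 0 for v in values]


def asymmetric_binary_dissimilarity(a, b):
    """d = (r + s) / (q + r + s); 0-0 matches are ignored."""
    q = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)  # both 1
    r = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)  # 1 in a only
    s = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)  # 1 in b only
    if q + r + s == 0:
        return 0.0
    return (r + s) / (q + r + s)


dissimilarity = {
    (n1, n2): asymmetric_binary_dissimilarity(encode(patients[n1]),
                                              encode(patients[n2]))
    for n1, n2 in combinations(patients, 2)
}
for pair, d in dissimilarity.items():
    print(pair, round(d, 2))
# ('Jack', 'Marry') 0.33
# ('Jack', 'Jim') 0.67
# ('Marry', 'Jim') 0.75
```

If Gender is included, or the attributes are treated as symmetric, the simple matching formula (r + s) / (q + r + s + t) applies instead and the numbers change.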

Exercise 11

Suppose that the values for a given set of data are grouped into intervals. The
intervals and corresponding frequencies are as follows.

age      frequency
1–5      200
6–15     450
16–20    300
21–50    1500
51–80    700
81–110   44

Compute an approximate median value for the data.
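
The approximation can be sketched with the grouped-data interpolation formula median = L1 + ((N/2 - cumulative frequency below) / frequency of the median interval) * width. The sketch assumes the lower boundary of the median interval is its stated lower limit (21) and the width is high - low + 1 (30); some texts use 20 or 20.5 as the boundary, which shifts the result by up to one year. The function name grouped_median is illustrative.

```python
# (low, high, frequency) for each age interval in the exercise
intervals = [
    (1, 5, 200),
    (6, 15, 450),
    (16, 20, 300),
    (21, 50, 1500),
    (51, 80, 700),
    (81, 110, 44),
]


def grouped_median(groups):
    """Approximate the median of grouped data by linear interpolation."""
    total = sum(freq for _, _, freq in groups)
    if total == 0:
        raise ValueError("groups must contain at least one observation")
    half = total / 2
    cumulative = 0  # frequency of all intervals below the current one
    for low, high, freq in groups:
        if cumulative + freq >= half:
            width = high - low + 1  # assumption: integer ages, inclusive limits
            return low + (half - cumulative) / freq * width
        cumulative += freq
    raise ValueError("median interval not found")


approx_median = grouped_median(intervals)
print(round(approx_median, 2))  # 33.94
```

Here N = 3194, so N/2 = 1597; the cumulative frequency below the interval 21–50 is 950, giving 21 + (647 / 1500) * 30.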

Exercise 12: Suppose that a data warehouse for Big University consists of the
four dimensions student, course, semester, and instructor, and two measures,
count and avg_grade. At the lowest conceptual level (e.g., for a given student,
course, semester, and instructor combination), the avg_grade measure stores
the actual course grade of the student. At higher conceptual levels, avg_grade
stores the average grade for the given combination.

Give the metadata for each dimension.

Exercise 13: Use the three methods below to normalize the following group of
data:

200, 300, 400, 600, 1000

(a) min-max normalization by setting min = 0 and max = 1

(b) z-score normalization

(c) decimal scaling
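
The three normalizations can be compared side by side with a small script. This is a sketch under stated assumptions: the z-score uses the population standard deviation (dividing by n, not n - 1), and decimal scaling divides by 10^j for the smallest integer j that makes every scaled absolute value less than 1. The function names are illustrative.

```python
import math

data = [200, 300, 400, 600, 1000]


def min_max(values, new_min=0.0, new_max=1.0):
    """Map values linearly so that min -> new_min and max -> new_max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min
            for v in values]


def z_score(values):
    """Standardize using the mean and the population standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def decimal_scaling(values):
    """Divide by 10**j, with j the smallest integer giving max |v'| < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


print(min_max(data))                            # [0.0, 0.125, 0.25, 0.5, 1.0]
print([round(z, 4) for z in z_score(data)])     # [-1.0607, -0.7071, -0.3536, 0.3536, 1.7678]
print([round(v, 2) for v in decimal_scaling(data)])  # [0.02, 0.03, 0.04, 0.06, 0.1]
```

For this data the mean is 500 and the population standard deviation is about 282.84; with the sample standard deviation (n - 1) the z-scores would be smaller in magnitude. Decimal scaling uses j = 4 because 1000 / 10^3 = 1 is not strictly less than 1.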
