Tutorial Week 05 Data Exploration

The document discusses calculating similarity measures on frequency data and ranking data points based on similarity to a query point using different distance measures. It also discusses normalizing data and comparing different visualization techniques.

Uploaded by

SHABIT MAHMUD

Bangabandhu Sheikh Mujibur Rahman Digital University, Bangladesh

Kaliakair, Gazipur

Week 05 Tutorial: Data Exploration

1 Given data:

p=1000000000
q=0000001001

The frequency table is:

        q=1   q=0
p=1      0     1
p=0      2     7

Calculate the Simple matching coefficient and the Jaccard coefficient.
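As a way to check a hand calculation, the two coefficients can be computed directly from the bit strings. The sketch below uses the usual definitions, SMC = (f11 + f00) / (f11 + f10 + f01 + f00) and Jaccard = f11 / (f11 + f10 + f01); the variable names are illustrative and not part of the tutorial.

```python
# Binary vectors from question 1, kept as strings so positions line up.
p = "1000000000"
q = "0000001001"

# Count the four cases position by position (first index refers to p).
f11 = sum(a == "1" and b == "1" for a, b in zip(p, q))
f10 = sum(a == "1" and b == "0" for a, b in zip(p, q))
f01 = sum(a == "0" and b == "1" for a, b in zip(p, q))
f00 = sum(a == "0" and b == "0" for a, b in zip(p, q))

# Simple matching coefficient: all matches over all positions.
smc = (f11 + f00) / (f11 + f10 + f01 + f00)

# Jaccard coefficient: 1-1 matches over positions that are not 0-0.
# Returning 0.0 for an empty denominator is an assumption of this sketch.
non_zero_positions = f11 + f10 + f01
jaccard = f11 / non_zero_positions if non_zero_positions else 0.0

print("counts (f11, f10, f01, f00):", (f11, f10, f01, f00))
print("SMC:", smc)
print("Jaccard:", jaccard)
```

The counts reproduce the frequency table, giving SMC = 7/10 = 0.7 and Jaccard = 0/3 = 0. The large gap between the two comes from the seven 0-0 positions, which SMC counts as agreement and Jaccard ignores.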

2 It is important to define or select similarity measures in data analysis. However, there is no commonly
accepted subjective similarity measure. Results can vary depending on the similarity measures used.
Nonetheless, seemingly different similarity measures may be equivalent after some transformation.
Suppose we have the following two-dimensional data set:

      A1    A2
x1    1.5   1.7
x2    2.0   1.9
x3    1.6   1.8
x4    1.2   1.5
x5    1.5   1.0

(a) Consider the data as two-dimensional data points. Given a new data point x = (1.4, 1.6) as a
query, rank the database points by their similarity to the query using Euclidean distance,
Manhattan distance, supremum distance, and cosine similarity.
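One possible way to check the rankings for part (a) is sketched below using only the Python standard library. The function names and the `rank` helper are illustrative assumptions, not part of the tutorial. Distances are ranked in ascending order (smaller means more similar) and cosine similarity in descending order.

```python
import math

# Data set and query from question 2.
points = {
    "x1": (1.5, 1.7),
    "x2": (2.0, 1.9),
    "x3": (1.6, 1.8),
    "x4": (1.2, 1.5),
    "x5": (1.5, 1.0),
}
query = (1.4, 1.6)

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def supremum(a, b):
    # Also called Chebyshev or L-infinity distance.
    return max(abs(x - y) for x, y in zip(a, b))

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(measure, descending=False):
    # Hypothetical helper: order point names by their score against the query.
    return sorted(points, key=lambda name: measure(points[name], query),
                  reverse=descending)

rankings = {
    "Euclidean": rank(euclidean),
    "Manhattan": rank(manhattan),
    "supremum": rank(supremum),
    "cosine": rank(cosine_similarity, descending=True),
}

for measure_name, order in rankings.items():
    print(f"{measure_name:10s}", " > ".join(order))
```

Under the supremum distance, x3 and x4 are both at distance 0.2 and x2 and x5 are both at 0.6 in exact arithmetic, so their relative order in the printed ranking is a tie and may depend on floating-point rounding.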

(b) Normalize the data set to make the norm of each data point equal to 1. Use Euclidean distance on
the transformed data to rank the data points.
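A minimal sketch of part (b) follows, assuming that "norm" means the Euclidean (L2) norm and that the query x = (1.4, 1.6) from part (a) is normalized in the same way as the data points. The helper names are illustrative.

```python
import math

points = {
    "x1": (1.5, 1.7),
    "x2": (2.0, 1.9),
    "x3": (1.6, 1.8),
    "x4": (1.2, 1.5),
    "x5": (1.5, 1.0),
}
query = (1.4, 1.6)

def normalize(v):
    # Scale v to unit Euclidean length (assumes v is not the zero vector).
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

unit_points = {name: normalize(v) for name, v in points.items()}
unit_query = normalize(query)

# Distance of each normalized point to the normalized query.
distances = {name: euclidean(v, unit_query) for name, v in unit_points.items()}
ranking = sorted(distances, key=distances.get)

for name in ranking:
    print(name, unit_points[name], round(distances[name], 4))
```

For unit vectors, the squared Euclidean distance equals 2(1 - cos θ), so this ranking coincides with the cosine-similarity ranking on the original data. This is an instance of the statement that seemingly different measures can be equivalent after a transformation.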

Prepared by: Nurjahan Nipa, Lecturer, Department of Internet of Things & Robotics Engineering (IRE), BDU Page 1|2

5 Compare the following visualization techniques:
i) Pixel-oriented visualization techniques
ii) Geometric projection visualization techniques
iii) Icon-based visualization techniques
iv) Hierarchical visualization techniques
v) Visualizing complex data and relations
