Assignment 1: Chapter 1 - (1.7:2)

Data mining techniques like clustering, classification, association rule mining, and anomaly detection can help improve an internet search engine company. [1] Clustering can group related search results to display items containing the searched keyword as well as related keywords. [2] Classification can provide lists of research papers associated with search keywords by analyzing labeled training data. [3] Association rule mining can append additional related information to search results based on frequent co-occurrences of attributes in data. [4] Anomaly detection can avoid displaying irrelevant search results that do not conform to general patterns for the searched keyword.

Uploaded by

Sanaullah Nazik

Q. Suppose that you are employed as a data mining consultant for
an Internet search engine company. Describe how data mining
can help the company by giving specific examples of how
techniques such as clustering, classification, association rule
mining and anomaly detection can be applied.

Answer: Data mining is the process of discovering interesting knowledge from large
amounts of data stored in databases, data warehouses or other information
repositories. It offers several functionalities, and each of them can be applied to
improve the company’s search engine.

1. Clustering – Clustering is the process of grouping a set of physical or abstract objects
into classes of similar objects. The objects are grouped based on the principle of
maximizing intraclass similarity and minimizing interclass similarity. In the context of a
search engine, clustering can help to display results that contain the keyword typed into
the “search” box as well as related results that do not.

For example: On entering ‘paintbrush’ in the search box, the search engine should display
not only the results with the keyword ‘paintbrush’ but also the ones with related keywords
such as ‘canvas’, ‘paint’ or ‘easel’.
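
The ‘paintbrush’ example can be sketched in a few lines of Python. This is only a minimal
illustration, not how a production search engine works: the page names, keyword sets and
the 0.4 threshold are all hypothetical, and simple single-link clustering on Jaccard
similarity stands in for whatever clustering method a real system would use.

```python
from itertools import combinations

# Hypothetical toy corpus: each page is represented by its set of keywords.
PAGES = {
    "art-supplies": {"paintbrush", "paint", "canvas"},
    "studio-setup": {"easel", "canvas", "paint"},
    "oil-painting": {"paint", "canvas", "easel", "paintbrush"},
    "tv-reviews": {"tv", "screen", "hdmi"},
    "home-cinema": {"tv", "speaker", "hdmi"},
}

def jaccard(a, b):
    """Similarity of two keyword sets: |intersection| / |union|."""
    return len(a & b) / len(a | b)

def cluster(pages, threshold=0.4):
    """Single-link clustering: pages whose similarity reaches the (assumed)
    threshold are merged into one cluster, using a small union-find."""
    parent = {name: name for name in pages}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(pages, 2):
        if jaccard(pages[a], pages[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in pages:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())

def related_results(query, pages, clusters):
    """Return all pages from every cluster that has a page containing the query."""
    hits = set()
    for group in clusters:
        if any(query in pages[name] for name in group):
            hits |= group
    return sorted(hits)

clusters = cluster(PAGES)
print(related_results("paintbrush", PAGES, clusters))
```

Here ‘studio-setup’ never mentions ‘paintbrush’, yet it is returned because it falls into the
same cluster as the pages that do.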

2. Classification – Classification is the process of finding a set of functions (models) that
describe and distinguish data classes or concepts, and using these functions to predict the
class of objects whose class label is unknown. Classification analyzes class-labeled data
objects, whereas clustering analyzes data objects without consulting a known class label.
In a search engine, classification is more of an internal implementation detail than a
feature the user sees directly.

For example: The search engine could provide a list of research papers associated with a
keyword. This is done by training classification rules, a decision tree or any other
classification algorithm on a set of data for which the associated research papers are
known, and then applying the learned function to the keyword.
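
One possible reading of this example is a classifier that learns from labeled documents
which ones are research papers, so that only those are listed for a keyword. The sketch
below assumes that reading. It uses a small naive Bayes classifier with Laplace smoothing
in place of the classification rules or decision tree mentioned above, and the training
texts and the labels ‘research_paper’ and ‘other’ are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical class-labeled training data: (document text, class label).
TRAINING = [
    ("we propose a novel algorithm and evaluate it experimentally", "research_paper"),
    ("abstract introduction related work experiments conclusion references", "research_paper"),
    ("this paper presents results on a benchmark dataset", "research_paper"),
    ("buy cheap laptops online free shipping today", "other"),
    ("top ten holiday destinations and travel deals", "other"),
    ("latest celebrity news and gossip today", "other"),
]

def train(examples):
    """Count words per class; these counts are the learned 'function'."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in examples:
        words = text.split()
        word_counts[label].update(words)
        class_counts[label] += 1
        vocab.update(words)
    return word_counts, class_counts, vocab

def predict(text, model):
    """Return the class with the highest log-probability for an unlabeled text."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Laplace smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAINING)
print(predict("a novel clustering algorithm evaluated on a dataset", model))
```

Unlike the clustering case, the class labels are known in advance; the model only learns
how to assign them to new documents.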

3. Association rule mining – Association rule mining is the discovery of association rules
showing attribute-value conditions that frequently occur together in a given set of data. A
search engine could append additional information to its results based on the keywords
entered by the user.

For example: A user searching the web to buy a large screen TV might also be
interested in a new home theatre system. Returning results for both the TV and the home
theatre system could keep the search engine one step ahead of the user.
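
The TV and home theatre example can be made concrete with the two standard rule measures,
support and confidence. The sketch below is a simplified illustration restricted to rules
between pairs of items; the search sessions and both thresholds are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical search sessions: the topics one user searched for in one visit.
SESSIONS = [
    {"large screen tv", "home theatre system"},
    {"large screen tv", "home theatre system", "hdmi cable"},
    {"large screen tv", "tv stand"},
    {"home theatre system", "hdmi cable"},
    {"large screen tv", "home theatre system"},
]

def mine_rules(sessions, min_support=0.5, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for every pair rule lhs -> rhs
    that meets both (assumed) thresholds."""
    n = len(sessions)
    item_counts = Counter(item for s in sessions for item in s)
    pair_counts = Counter(
        pair for s in sessions for pair in combinations(sorted(s), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of sessions containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules

rules = mine_rules(SESSIONS)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}  support={support:.2f} confidence={confidence:.2f}")
```

With this data, three of the four sessions that mention a large screen TV also mention a
home theatre system, so the rule passes both thresholds and the search engine could append
home theatre results to a TV search.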

4. Anomaly detection – Anomalies are the data objects that do not conform to the
general behavior of the data. The analysis of anomalies is known as anomaly detection.
In cases such as fraud detection, an anomaly is more important than the rest of the
data. A search engine can use anomaly detection to avoid displaying results that are not
relevant to the searched keyword.

For example: A user might search for ‘heart attack’. Anomaly detection would prevent
‘attack on China’, which is irrelevant to the searched topic and is an outlier in this
context, from being displayed.
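
A rough way to picture the ‘heart attack’ example is to treat a result as an outlier when it
has little in common with the other candidate results. The sketch below uses a simple
similarity-based heuristic chosen for illustration; the result titles, keyword sets and the
0.25 cut-off are hypothetical.

```python
# Hypothetical candidate results for the query 'heart attack', as keyword sets.
RESULTS = {
    "heart attack symptoms": {"heart", "attack", "symptoms", "chest", "pain"},
    "heart attack treatment": {"heart", "attack", "treatment", "cardiac", "chest"},
    "cardiac arrest vs heart attack": {"heart", "attack", "cardiac", "arrest", "pain"},
    "attack on China": {"attack", "china", "military", "history"},
}

def jaccard(a, b):
    """Similarity of two keyword sets: |intersection| / |union|."""
    return len(a & b) / len(a | b)

def mean_similarity(name, results):
    """Average similarity of one result to all the other results."""
    others = [kw for other, kw in results.items() if other != name]
    return sum(jaccard(results[name], kw) for kw in others) / len(others)

def filter_outliers(results, min_similarity=0.25):
    """Keep only results that resemble the rest; drop the outliers."""
    return [
        name for name in results
        if mean_similarity(name, results) >= min_similarity
    ]

print(filter_outliers(RESULTS))
```

The three medical pages resemble one another and are kept, while ‘attack on China’ shares
only the word ‘attack’ with them and is dropped.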
