Questions in Data Mining

The document describes data mining and the steps involved in the knowledge discovery process. It then discusses a proposed architecture for a university course database data mining system, including components like a database, data mining engine, and graphical user interface. Finally, it compares approaches for integrating a data mining system with a database/data warehouse system from no coupling to loose, semitight, and tight coupling, stating that tight coupling is most desirable as it provides an efficient, integrated information processing environment.

Uploaded by

امير علي


Sura rahim

1. What is data mining? Describe the steps involved in data
mining when viewed as a process of knowledge discovery.
Data mining is the process of discovering interesting knowledge
from large amounts of data stored in databases, data
warehouses, or other information repositories.
The steps involved in data mining when viewed as a
process of knowledge discovery are as follows:
1. Data cleaning (to remove noise and inconsistent data)
2. Data integration (where multiple data sources may be combined)
3. Data selection (where data relevant to the analysis task are
retrieved from the database)
4. Data transformation (where data are transformed or consolidated
into forms appropriate for mining by performing summary or
aggregation operations, for instance)
5. Data mining (an essential process where intelligent methods are
applied in order to extract data patterns)
6. Pattern evaluation (to identify the truly interesting patterns
representing knowledge based on some interestingness
measures)
7. Knowledge presentation (where visualization and knowledge
representation techniques are used to present the mined
knowledge to the user)
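
The seven steps above can be sketched as a chain of small functions. This is only an illustrative toy, not a real mining system: the records, the function names, and the GPA threshold are all invented for the example, and the "mining" step is just a mean per group.

```python
from statistics import mean

# Hypothetical raw data from two sources (all values are made up).
registrar = [
    {"id": 1, "status": "undergraduate", "gpa": 3.2},
    {"id": 2, "status": "graduate", "gpa": None},   # noisy record: missing GPA
    {"id": 3, "status": "graduate", "gpa": 3.8},
    {"id": 4, "status": "undergraduate", "gpa": 2.6},
]
enrollment = {1: 4, 2: 2, 3: 3, 4: 5}  # student id -> number of courses taken

def clean(records):
    """Step 1: remove records with missing values."""
    return [r for r in records if r["gpa"] is not None]

def integrate(records, courses):
    """Step 2: combine the two sources on the student id."""
    return [{**r, "courses": courses[r["id"]]} for r in records]

def select(records):
    """Step 3: keep only the attributes relevant to the analysis task."""
    return [(r["status"], r["gpa"]) for r in records]

def transform(pairs):
    """Step 4: consolidate the data by grouping GPAs per status."""
    groups = {}
    for status, gpa in pairs:
        groups.setdefault(status, []).append(gpa)
    return groups

def mine(groups):
    """Step 5: extract a very simple pattern, the mean GPA per group."""
    return {status: round(mean(gpas), 2) for status, gpas in groups.items()}

def evaluate(patterns, threshold=3.0):
    """Step 6: keep patterns that pass an (assumed) interestingness threshold."""
    return {s: g for s, g in patterns.items() if g >= threshold}

def present(patterns):
    """Step 7: render the surviving patterns for the user."""
    return [f"{s}: mean GPA {g}" for s, g in sorted(patterns.items())]

report = present(evaluate(mine(transform(select(integrate(clean(registrar), enrollment))))))
print(report)
```

Each function maps to one step, so a real system could swap in a stronger cleaning rule or mining algorithm without touching the rest of the chain.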

2. Suppose your task as a software engineer at Big-University is
to design a data mining system to examine their university
course database, which contains the following information: the
name, address, and status (e.g., undergraduate or graduate) of
each student, the courses taken, and their cumulative grade
point average (GPA). Describe the architecture you would
choose. What is the purpose of each component of this
architecture?
- A data mining architecture that can be used for this
application would consist of the following major
components:
1. A database, data warehouse, or other information
repository, which consists of the set of databases, data
warehouses, spreadsheets, or other kinds of information
repositories containing the student and course
information.
2. A database or data warehouse server, which fetches the
relevant data based on the users' data mining requests.
3. A knowledge base that contains the domain knowledge
used to guide the search or to evaluate the interestingness
of resulting patterns. For example, the knowledge base
may contain concept hierarchies and metadata (e.g.,
describing data from multiple heterogeneous sources).
4. A data mining engine, which consists of a set of functional
modules for tasks such as classification, association,
cluster analysis, and evolution and deviation analysis.
5. A pattern evaluation module that works in tandem with the
data mining modules by employing interestingness
measures to help focus the search towards interesting
patterns.
6. A graphical user interface that provides the user with an
interactive approach to the data mining system.
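
The six components above can be pictured as cooperating objects. The sketch below is a hypothetical illustration only: the class names, the GPA cut-offs in the knowledge base, and the single "average GPA" mining module are assumptions made for the example, and a plain function stands in for the graphical user interface.

```python
from dataclasses import dataclass, field

@dataclass
class Repository:
    """Component 1: holds the student and course information."""
    students: list

@dataclass
class Server:
    """Component 2: fetches the data relevant to a mining request."""
    repo: Repository

    def fetch(self, status):
        return [s for s in self.repo.students if s["status"] == status]

@dataclass
class KnowledgeBase:
    """Component 3: domain knowledge (here, assumed GPA cut-offs)."""
    gpa_levels: dict = field(default_factory=lambda: {"high": 3.5, "low": 2.0})

class MiningEngine:
    """Component 4: a single toy functional module."""
    def average_gpa(self, students):
        return sum(s["gpa"] for s in students) / len(students)

class PatternEvaluator:
    """Component 5: applies an interestingness measure from the knowledge base."""
    def __init__(self, kb):
        self.kb = kb

    def is_interesting(self, avg):
        return avg >= self.kb.gpa_levels["high"] or avg <= self.kb.gpa_levels["low"]

def user_interface(server, engine, evaluator, status):
    """Component 6 stand-in: turns a user request into a readable answer."""
    avg = engine.average_gpa(server.fetch(status))
    flag = "interesting" if evaluator.is_interesting(avg) else "unremarkable"
    return f"{status}: average GPA {avg:.2f} ({flag})"

# Made-up sample data.
repo = Repository([
    {"name": "A", "status": "graduate", "gpa": 4.0},
    {"name": "B", "status": "graduate", "gpa": 3.5},
    {"name": "C", "status": "undergraduate", "gpa": 2.8},
])
server = Server(repo)
evaluator = PatternEvaluator(KnowledgeBase())
print(user_interface(server, MiningEngine(), evaluator, "graduate"))
```

Keeping the evaluator separate from the engine mirrors the description above: the engine finds patterns, and the evaluator decides which ones are worth showing.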

3. Describe the differences between the following approaches for
the integration of a data mining system with a database or
data warehouse system: no coupling, loose coupling, semitight
coupling, and tight coupling. State which approach you think is
the most popular, and why.
A good system architecture helps the data mining system
make the best use of the software environment, accomplish
data mining tasks in an efficient and timely manner, and
exchange information with other information systems.
A critical question in the design of a data mining (DM)
system is how to integrate or couple the DM system with a
database (DB) system and/or a data warehouse (DW) system.
If a DM system works as a stand-alone system, there are no
DB or DW systems with which it has to communicate. This
simple scheme is called no coupling. When a DM system
works in an environment that requires it to communicate
with other information system components, such as DB and
DW systems, the possible integration schemes are as
follows:

1-No coupling: No coupling means that a DM system will not
utilize any function of a DB or DW system. It may fetch data
from a particular source (such as a file system), process data
using some data mining algorithms, and then store the
mining results in another file.
Such a system suffers from several drawbacks. First, a DB
system provides flexibility and efficiency at storing,
accessing, and processing data. Without using a DB/DW
system, a DM system may spend a substantial amount of time
finding, collecting, and transforming data. In DB and/or DW
systems, data tend to be well organized and indexed, so
finding the relevant data is much easier.
Second, there are many tested, scalable algorithms
implemented in DB and DW systems. Without any coupling of
such systems, a DM system will need to use other tools to
extract data, making it difficult to integrate such a system
into an information processing environment. Thus, no
coupling represents a poor design.
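
A minimal sketch of the no-coupling pattern, using only flat files: the program reads a CSV file, does its computation in memory, and writes the result to another file. The file names, column names, and the mean-GPA "mining" step are assumptions chosen for illustration.

```python
import csv
import os
import tempfile

def mine_from_file(in_path, out_path):
    """No coupling: read a flat file, mine in memory, write results to another file."""
    groups = {}
    with open(in_path, newline="") as f:
        for row in csv.DictReader(f):
            groups.setdefault(row["status"], []).append(float(row["gpa"]))
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["status", "mean_gpa"])
        for status in sorted(groups):
            gpas = groups[status]
            writer.writerow([status, f"{sum(gpas) / len(gpas):.2f}"])

# Build a small made-up input file in a temporary directory.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "students.csv")
dst = os.path.join(workdir, "result.csv")
with open(src, "w", newline="") as f:
    f.write("status,gpa\ngraduate,3.8\nundergraduate,3.0\nundergraduate,2.0\n")

mine_from_file(src, dst)
with open(dst, newline="") as f:
    result = f.read().splitlines()
print(result)
```

Note that the program itself must scan and group every row; nothing like an index or a query optimizer helps it, which is the drawback described above.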

2-Loose coupling: Loose coupling means that a DM system
will use some facilities of a DB or DW system. Loose
coupling is better than no coupling because it can fetch any
portion of data stored in databases or data warehouses by
using query processing, indexing, and other system facilities.
However, it is difficult for loose coupling to achieve high
scalability and good performance with large data sets.
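
A minimal sketch of loose coupling, using Python's built-in sqlite3 module as a stand-in for the DB system: the database selects the relevant portion of the data through a query and an index, the mining itself still runs in application memory, and the result is stored back in a table. The table names, column names, and data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, status TEXT, gpa REAL)")
conn.execute("CREATE INDEX idx_student_status ON student (status)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("A", "graduate", 4.0), ("B", "graduate", 3.5), ("C", "undergraduate", 2.8)],
)

# DB facility: query processing and indexing fetch only the relevant rows.
rows = conn.execute(
    "SELECT gpa FROM student WHERE status = ?", ("graduate",)
).fetchall()

# The mining step itself runs outside the database, in application memory.
gpas = [gpa for (gpa,) in rows]
mean_gpa = sum(gpas) / len(gpas)

# The result is written back to a designated table.
conn.execute("CREATE TABLE mining_result (status TEXT, mean_gpa REAL)")
conn.execute("INSERT INTO mining_result VALUES (?, ?)", ("graduate", mean_gpa))
stored = conn.execute("SELECT status, mean_gpa FROM mining_result").fetchall()
print(stored)
```

Because every selected row is copied into a Python list before mining, memory use grows with the data, which hints at why scalability is hard under this scheme.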

3-Semitight coupling: Semitight coupling is a compromise
between loose and tight coupling. Semitight coupling means
that besides linking a DM system to a DB/DW system,
efficient implementations of a few essential data mining
primitives are provided in the DB/DW system. Moreover,
some frequently used intermediate
mining results can be precomputed and stored in the DB/DW
system. Because these intermediate mining results are
either precomputed or can be computed efficiently, this
design will enhance the performance of a DM system.
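
A minimal sketch of the precomputation idea, again with sqlite3 as a stand-in DB system: the database's own aggregation computes a summary table once, and later mining requests read the stored counts and sums instead of rescanning the raw rows. The table name gpa_summary, the helper functions, and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (status TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("graduate", 4.0), ("graduate", 3.5),
     ("undergraduate", 3.0), ("undergraduate", 2.0)],
)

# Precompute an intermediate result inside the DB with its aggregation primitives.
conn.execute(
    """CREATE TABLE gpa_summary AS
       SELECT status, COUNT(*) AS n, SUM(gpa) AS total
       FROM student GROUP BY status"""
)

def mean_gpa(status):
    """Answer a per-group request from the stored summary only."""
    n, total = conn.execute(
        "SELECT n, total FROM gpa_summary WHERE status = ?", (status,)
    ).fetchone()
    return total / n

def overall_mean():
    """Combine the stored counts and sums without touching the raw table."""
    n, total = conn.execute("SELECT SUM(n), SUM(total) FROM gpa_summary").fetchone()
    return total / n

print(mean_gpa("graduate"), mean_gpa("undergraduate"), overall_mean())
```

Storing counts and sums rather than finished means is a deliberate choice here: the same summary can answer both per-group and overall questions.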

4-Tight coupling: Tight coupling means that a DM system is
smoothly integrated into the DB/DW system. DM, DB, and
DW systems will evolve and integrate together as one
information system with multiple functionalities. This will
provide a uniform information processing environment. This
approach is highly desirable because it facilitates efficient
implementations of data mining functions, high system
performance, and an integrated information processing
environment.
