Data Integration in Data Mining

Data integration combines data from multiple sources to provide a unified view. This is commonly done through building an enterprise data warehouse, which allows analyzing data based on what is in the warehouse. There are two main approaches for data integration - tight coupling, which combines data into a single location through extraction, transformation, and loading, and loose coupling, which keeps data in its original sources but provides an interface to query across sources.

Data integration is a data preprocessing technique that combines data from multiple sources and
provides users with a unified view of that data.

Data Integration

These sources may include multiple databases, data cubes, or flat files. One of the best-known
implementations of data integration is building an enterprise's data warehouse.

A data warehouse enables a business to perform analyses based on the data stored in the
warehouse.

There are two major approaches to data integration:


1 Tight Coupling

In tight coupling, data is combined from different sources into a single physical location through
the process of ETL (Extraction, Transformation, and Loading).
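
As a minimal sketch of the ETL idea, the Python below pulls records from two sources, normalizes them to one schema, and loads them into a single store. Everything in it is an assumption made for illustration: the CSV contents, the `clients` and `customer` tables, the column names, and the use of in-memory SQLite as a stand-in for both the source database and the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a flat file with customer records.
CSV_SOURCE = "id,name,country\n1,Alice,us\n2,Bob,de\n"


def make_source_db():
    """Hypothetical source 2: an operational database with its own schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE clients (client_id INTEGER, full_name TEXT, nation TEXT)"
    )
    conn.executemany(
        "INSERT INTO clients VALUES (?, ?, ?)",
        [(3, "Chen", "CN"), (4, "Dana", "fr")],
    )
    conn.commit()
    return conn


def extract(source_db):
    """Extraction: read raw rows from every source."""
    reader = csv.DictReader(io.StringIO(CSV_SOURCE))
    csv_rows = [(int(r["id"]), r["name"], r["country"]) for r in reader]
    db_rows = source_db.execute(
        "SELECT client_id, full_name, nation FROM clients"
    ).fetchall()
    return csv_rows + db_rows


def transform(rows):
    """Transformation: bring all rows to one consistent representation."""
    return [(cid, name.strip(), country.upper()) for cid, name, country in rows]


def load(rows):
    """Loading: write the unified rows into a single physical location."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    warehouse.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    warehouse.commit()
    return warehouse


warehouse = load(transform(extract(make_source_db())))
result = warehouse.execute(
    "SELECT id, name, country FROM customer ORDER BY id"
).fetchall()
print(result)
```

After loading, analyses run only against the `customer` table; the original sources are no longer consulted at query time.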

2 Loose Coupling

In loose coupling, data remains only in the actual source databases. In this approach, an interface
is provided that takes a query from the user, transforms it into a form each source database can
understand, and then sends the query directly to the source databases to obtain the result.
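
The query-translation step can be sketched in Python as follows. This is an illustrative assumption rather than a standard design: the two in-memory SQLite databases, their table and column names, the `SOURCES` mapping, and the `query_orders` function are all hypothetical. The point is only that the data stays in the sources and the interface rewrites one logical query per source.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create a hypothetical source database (in-memory SQLite)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Two hypothetical sources that store the same kind of data differently.
source_a = make_source(
    "CREATE TABLE orders (order_id INTEGER, amount REAL)",
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 100.0), (2, 250.0)],
)
source_b = make_source(
    "CREATE TABLE purchases (pid INTEGER, total REAL)",
    "INSERT INTO purchases VALUES (?, ?)",
    [(10, 80.0), (11, 300.0)],
)

# Mapping from the unified view (id, amount) to each source's local schema.
SOURCES = [
    (source_a, {"table": "orders", "id": "order_id", "amount": "amount"}),
    (source_b, {"table": "purchases", "id": "pid", "amount": "total"}),
]


def query_orders(min_amount):
    """Answer one logical query by translating it for every source."""
    results = []
    for conn, m in SOURCES:
        # Identifiers come from the trusted mapping; the value is parameterized.
        sql = (
            f"SELECT {m['id']}, {m['amount']} FROM {m['table']} "
            f"WHERE {m['amount']} >= ?"
        )
        results.extend(conn.execute(sql, (min_amount,)).fetchall())
    return sorted(results)


big_orders = query_orders(200)
print(big_orders)
```

No data is copied ahead of time here: each call to `query_orders` reaches the source databases directly, which is what distinguishes this from the ETL approach.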

2.

One performance challenge in data mining is the efficiency and scalability of data mining
algorithms. These algorithms must be efficient and scalable in order to extract information
from large amounts of data in databases within predictable and acceptable running times.
Another challenge is the parallel, distributed, and incremental processing of data mining algorithms.
The need for parallel and distributed data mining algorithms has been brought about by the huge size of
many databases, the wide distribution of data, and the computational complexity of some data mining
methods. Because some data mining processes are costly, incremental data mining algorithms
incorporate database updates without the need to mine the entire data again from scratch.
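
To make the incremental idea concrete, here is a small Python sketch that keeps running co-occurrence counts for item pairs and folds in each new batch of transactions without rescanning earlier ones. The class name `IncrementalPairCounter`, the sample transactions, and the choice of pair counting as the mining task are assumptions for illustration; this is not a specific published algorithm.

```python
from collections import Counter
from itertools import combinations


class IncrementalPairCounter:
    """Keeps item-pair counts that can be updated batch by batch."""

    def __init__(self):
        self.pair_counts = Counter()
        self.n_transactions = 0

    def update(self, transactions):
        """Fold a new batch into the existing counts; old data is not re-read."""
        for transaction in transactions:
            items = sorted(set(transaction))
            self.pair_counts.update(combinations(items, 2))
            self.n_transactions += 1

    def frequent_pairs(self, min_support):
        """Return pairs whose support is at least min_support (a fraction)."""
        threshold = min_support * self.n_transactions
        return {
            pair: count
            for pair, count in self.pair_counts.items()
            if count >= threshold
        }


miner = IncrementalPairCounter()

# Initial database contents.
miner.update([["bread", "milk"], ["bread", "butter", "milk"]])
first = miner.frequent_pairs(0.5)

# A later database update: only the new transactions are processed.
miner.update([["bread", "butter"], ["milk"]])
second = miner.frequent_pairs(0.5)
print(second)
```

The cost of each update depends on the size of the new batch rather than on the size of the whole database, which is the property the paragraph above describes.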
