Data Mining Architecture
The major components of a data mining system are the data source, data warehouse
server, data mining engine, pattern evaluation module, graphical user interface,
and knowledge base.
Data Source:
The actual sources of data are databases, data warehouses, the World Wide Web (WWW),
text files, and other documents. Organizations typically store data in databases or data
warehouses. A data warehouse may comprise one or more databases, text files,
spreadsheets, or other repositories of data. Sometimes even plain text files or
spreadsheets hold relevant information. Another primary source of data is the World
Wide Web or the internet.
Different processes:
Before the data is passed to the database or data warehouse server, it must be
cleaned, integrated, and selected. Because the information comes from various sources
and in different formats, it can't be used directly for data mining: the data may be
incomplete or inaccurate. So the data first needs to be cleaned and unified. More
information than needed is usually collected from the various data sources, so only
the data of interest is selected and passed to the server. Several methods may be
applied to the data as part of selection, integration, and cleaning.
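The three preparation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sources (`crm_rows`, `web_rows`), their field names, and the helper functions `clean`, `integrate`, and `select` are all hypothetical and not part of any particular data mining system.

```python
# Hypothetical records from two sources that use different formats.
crm_rows = [
    {"id": "1", "name": " Alice ", "age": "34", "city": "Pune"},
    {"id": "2", "name": "Bob", "age": "", "city": "Delhi"},   # missing age
    {"id": "2", "name": "Bob", "age": "", "city": "Delhi"},   # duplicate row
]
web_rows = [
    {"customer_id": 1, "clicks": 12},
    {"customer_id": 3, "clicks": 5},   # no matching CRM record
]


def clean(rows):
    """Cleaning: trim whitespace, convert types, drop incomplete or duplicate rows."""
    seen, cleaned = set(), []
    for row in rows:
        if not row["age"].strip():
            continue  # incomplete record
        key = int(row["id"])
        if key in seen:
            continue  # duplicate record
        seen.add(key)
        cleaned.append({
            "id": key,
            "name": row["name"].strip(),
            "age": int(row["age"]),
            "city": row["city"].strip(),
        })
    return cleaned


def integrate(customers, clicks):
    """Integration: join the two sources on the customer id."""
    clicks_by_id = {r["customer_id"]: r["clicks"] for r in clicks}
    return [
        dict(c, clicks=clicks_by_id[c["id"]])
        for c in customers
        if c["id"] in clicks_by_id
    ]


def select(rows, fields):
    """Selection: keep only the fields of interest."""
    return [{f: r[f] for f in fields} for r in rows]


prepared = select(integrate(clean(crm_rows), web_rows), ["id", "age", "clicks"])
print(prepared)  # [{'id': 1, 'age': 34, 'clicks': 12}]
```

Only the one complete record that appears in both sources survives, reduced to the selected fields. A real system would likely impute or repair some records rather than dropping them.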
Pattern Evaluation Module:
The pattern evaluation module commonly employs interestingness measures that cooperate
with the data mining modules to focus the search on interesting patterns. It may use an
interestingness threshold to filter out discovered patterns. Alternatively, the pattern
evaluation module may be integrated with the mining module, depending on the
implementation of the data mining techniques used. For efficient data mining, it is
highly recommended to push the evaluation of pattern interestingness as deep as possible
into the mining process, so that the search is confined to only the interesting patterns.
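The difference between filtering patterns after mining and pushing the threshold check into the mining step can be sketched as follows. The sketch makes several assumptions not stated in the text: it uses support (the fraction of transactions containing an itemset) as the measure, it mines itemsets of at most two items, and the transaction data and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_then_filter(transactions, min_support):
    """Generate every 1- and 2-item candidate, then apply the threshold."""
    items = sorted(set().union(*transactions))
    candidates = [frozenset(c) for k in (1, 2) for c in combinations(items, k)]
    return {c for c in candidates if support(c, transactions) >= min_support}


def mine_with_pushed_threshold(transactions, min_support):
    """Apply the threshold during mining: pairs are built only from items
    that already pass it, so fewer candidates are ever evaluated."""
    items = sorted(set().union(*transactions))
    frequent_items = [i for i in items if support({i}, transactions) >= min_support]
    result = {frozenset([i]) for i in frequent_items}
    for pair in combinations(frequent_items, 2):
        if support(set(pair), transactions) >= min_support:
            result.add(frozenset(pair))
    return result


late = mine_then_filter(transactions, 0.6)
early = mine_with_pushed_threshold(transactions, 0.6)
print(late == early)  # True
```

Both functions return the same patterns, but the second never evaluates pairs containing "butter", because "butter" alone falls below the threshold. This pruning is safe for support because a superset can never be more frequent than its subsets. Other measures may not have this property, in which case the check cannot be pushed in this simple way.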