Major Issues in DM
a language should be integrated with a database or data warehouse
query language.
The data may reflect noise, exceptional cases, or incomplete data
objects. When mining data regularities, these objects may confuse the
process, causing the knowledge model constructed to overfit the data.
As a result, the accuracy of the discovered patterns can be poor.
These include efficiency, scalability, and parallelization of data
mining algorithms.
Efficiency and scalability of data mining algorithms: To effectively
extract information from a huge amount of data in databases, data
mining algorithms must be efficient and scalable. In other words, the
running time of a data mining algorithm must be predictable and
acceptable in large databases. From a database perspective on
knowledge discovery, efficiency and scalability are key issues in the
implementation of data mining systems. Many of the issues discussed
above under mining methodology and user interaction must also
consider efficiency and scalability.
Parallel, distributed, and incremental mining algorithms: The huge
size of many databases and the wide distribution of data motivate the
development of parallel and distributed data mining algorithms. Such
algorithms divide the data into partitions, which are processed in
parallel. The results from the partitions are then merged.
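The partition, process, and merge pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from the text: the mining step is reduced to counting item occurrences in transactions, and the function names (partition, count_items, parallel_item_counts) and the sample data are hypothetical. Threads are used to keep the sketch simple; for CPU-bound work a process pool or a distributed framework would be the more likely choice.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_parts):
    """Split records into at most n_parts contiguous partitions."""
    size = max(1, -(-len(records) // n_parts))  # ceiling division, at least 1
    return [records[i:i + size] for i in range(0, len(records), size)]


def count_items(transactions):
    """Local mining step (hypothetical): count items within one partition."""
    counts = Counter()
    for transaction in transactions:
        counts.update(set(transaction))  # count each item once per transaction
    return counts


def parallel_item_counts(transactions, n_parts=4):
    """Process each partition concurrently, then merge the partial results."""
    parts = partition(transactions, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partial_counts = list(pool.map(count_items, parts))
    merged = Counter()
    for counts in partial_counts:
        merged.update(counts)  # merging here is simply adding the counts
    return merged


# Made-up sample data for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["milk", "butter", "bread"],
    ["milk"],
    ["bread", "milk", "eggs"],
]
totals = parallel_item_counts(transactions, n_parts=2)
```

Counting merges cleanly because the partial results are additive. Many mining tasks need a more careful merge step, since a pattern that is infrequent in every partition can still matter once the partitions are combined.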
The development of efficient and effective data mining systems for
such data is important.
However, other databases may contain complex data objects, hypertext
and multimedia data, spatial data, temporal data, or transaction data.
It is unrealistic to expect one system to mine all kinds of data,
given the diversity of data types and different goals of data mining.
Specific data mining systems should be constructed for mining specific
kinds of data.
Therefore, one may expect to have different data mining systems for