Understanding Organizations and Its Data
Data mining is the process of finding anomalies, patterns and correlations within
large data sets to predict outcomes.
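As a small illustration of the definition above, the sketch below looks for a
correlation and an anomaly in two tiny data sets. The data, the function names
(pearson, anomalies), and the z-score threshold of 2.0 are all assumptions made
for this example; real data mining tools use far more robust methods.

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def anomalies(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if sigma and abs(v - mu) / sigma > threshold]

# Hypothetical data, invented for this sketch.
ad_spend = [10, 20, 30, 40, 50, 60]        # weekly advertising spend
units_sold = [12, 24, 33, 41, 52, 61]      # weekly units sold
daily_orders = [100, 102, 98, 101, 99, 100, 103, 97, 250]

r = pearson(ad_spend, units_sold)          # close to 1: strong correlation
outliers = anomalies(daily_orders)         # the day with 250 orders stands out
```

A correlation near 1 suggests that spend and sales move together, which is the
kind of pattern that can then be used to predict outcomes. It does not show
that one causes the other.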
3. Describe the limitations of data mining. How can we address those limitations?
Along with the benefits it offers, data mining has limitations. Some of the
limitations commonly encountered include: (1) violation of user privacy,
(2) additional irrelevant information, (3) misuse of information, and
(4) questionable accuracy of the data.
The best way to address these limitations is to recognize them up front and
adapt your practices to account for each one, while still taking advantage of
the benefits that data mining offers.
Pros: Economical
Cons: (none listed)
5. Describe the ethical issues we face in data mining. How can they be addressed?
One of the more common ethical issues with data mining is consent: if an
individual is not aware that information is being collected or how it will be
used, that individual has no opportunity to consent to its collection and use.
This can be addressed by disclosing beforehand, for example in a disclaimer,
what data will be collected and how it will be used.
6. Explain what is meant by out-of-synch data. How can this situation be remedied?
Out-of-synch data refers to any supply of data that is inconsistent with, and
therefore breaks the integrity of, an existing database. The best way to remedy
this is to prioritize data integrity: sort through the data, identify the
records that disagree, and update them so that they stay consistent across
systems.
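The remedy can be sketched in a few lines. This example assumes that one
system (here called crm) is the authoritative copy and that the other (billing)
should be brought in line with it. The record layout and the function names
find_out_of_sync and resync are hypothetical, and the sketch does not handle
records that exist only in the second system.

```python
def find_out_of_sync(primary, replica):
    """Return the keys whose records are missing or different in the replica."""
    return sorted(k for k in primary if replica.get(k) != primary[k])

def resync(primary, replica):
    """Copy the primary's version of each mismatched record into the replica."""
    stale = find_out_of_sync(primary, replica)
    for key in stale:
        replica[key] = dict(primary[key])  # copy so the two stores stay separate
    return stale

# Hypothetical customer records keyed by customer ID.
crm = {
    1: {"email": "ann@example.com"},
    2: {"email": "bob@example.com"},
    3: {"email": "cy@example.com"},
}
billing = {
    1: {"email": "ann@example.com"},
    2: {"email": "bob@old.example.com"},   # stale value
}                                          # customer 3 is missing entirely

fixed = resync(crm, billing)               # keys that had to be repaired
```

Deciding which copy is authoritative is the hard part in practice. Once that
is settled, the comparison and update steps are mechanical.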
7. What is normalization? What are some reasons why it is a good thing in OLTP
systems, but not so good in OLAP systems?
Normalization is the process of restructuring the data in a database, namely
splitting the source data into separate, related tables so that each fact is
stored in only one place. The main purpose of normalization is to minimize or
even eliminate duplicated data.
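To make the idea of removing duplicated data concrete, the sketch below splits
a flat list of order rows into two related tables. The column names and the
normalize function are assumptions made for this example, not part of any
particular database product.

```python
def normalize(orders):
    """Split flat order rows into a customers table and an orders table."""
    customers = {}   # customer_id -> customer attributes, stored once
    order_rows = []  # each order keeps only a reference to its customer
    for row in orders:
        customers[row["customer_id"]] = {"name": row["name"], "city": row["city"]}
        order_rows.append({
            "order_id": row["order_id"],
            "customer_id": row["customer_id"],
            "total": row["total"],
        })
    return customers, order_rows

# Hypothetical flat data: Ann's name and city are repeated on every order.
flat = [
    {"order_id": 1, "customer_id": 7, "name": "Ann", "city": "Austin", "total": 20.0},
    {"order_id": 2, "customer_id": 7, "name": "Ann", "city": "Austin", "total": 35.5},
    {"order_id": 3, "customer_id": 9, "name": "Bob", "city": "Boston", "total": 12.0},
]

customers, order_rows = normalize(flat)
```

After the split, Ann's name and city appear once in customers instead of once
per order, so a change of address touches a single record.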
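In an OLTP system, the data in a transactional database is typically stored in
a normalized form, which keeps inserts and updates small and consistent because
each fact is written in only one place. A highly normalized database works
poorly for OLAP, however, because analytical queries must join many tables, and
those joins can quickly lead to performance problems.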
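The trade-off can be seen in a small sketch that uses Python's built-in sqlite3
module. The table names (customers, orders, order_facts) and the data are
hypothetical. With three rows the performance difference is invisible, so the
example only shows the shape of the queries involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(customer_id),
                         total REAL);
    INSERT INTO customers VALUES (7, 'Austin'), (9, 'Boston');
    INSERT INTO orders VALUES (1, 7, 20.0), (2, 7, 35.5), (3, 9, 12.0);
""")

# OLTP-style write: the city lives in one row, so one small update suffices.
conn.execute("UPDATE customers SET city = 'Dallas' WHERE customer_id = 7")

# OLAP-style read on the normalized schema: totals per city require a join.
joined = conn.execute("""
    SELECT c.city, SUM(o.total) FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    GROUP BY c.city ORDER BY c.city
""").fetchall()

# Denormalized reporting table: the city is copied onto every order row,
# so the same question is answered by scanning a single table.
conn.execute("""
    CREATE TABLE order_facts AS
    SELECT o.order_id, c.city, o.total FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
""")
flat = conn.execute(
    "SELECT city, SUM(total) FROM order_facts GROUP BY city ORDER BY city"
).fetchall()
```

Both queries return the same totals. The denormalized table pays for its
simpler reads with duplicated city values, which is why it suits reporting
workloads better than transactional ones.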