Unit 6 - Information Privacy and Data Mining
Information privacy may be enforced in numerous ways, including encryption, authentication, and
data masking, each of which attempts to ensure that information is available only to those with
authorized access. These protective measures are geared toward preventing unauthorized data mining
and unauthorized use of personal information, both of which are illegal in many parts of the world.
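As an illustration of data masking, the following is a minimal Python sketch, not a prescribed method. The function names `mask_email` and `pseudonymize`, the sample record, and the salt value are all hypothetical. Note that a salted hash only pseudonymizes an identifier; it should not be treated as full anonymization.

```python
import hashlib


def mask_email(email: str) -> str:
    """Keep the first character of the local part and the domain; hide the rest."""
    local, _, domain = email.partition("@")
    if not local or not domain:
        # Not a recognizable address: hide everything.
        return "*" * len(email)
    return local[0] + "*" * (len(local) - 1) + "@" + domain


def pseudonymize(value: str, salt: str) -> str:
    """Replace an identifier with a truncated, salted SHA-256 digest (illustrative scheme)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


# Hypothetical customer record used only for demonstration.
record = {"name": "Sita Sharma", "email": "sita@example.com", "amount": 1200}

masked = {
    "customer_id": pseudonymize(record["name"], salt="demo-salt"),
    "email": mask_email(record["email"]),
    "amount": record["amount"],  # non-identifying field is kept for analysis
}
print(masked)
```

The masked record can still be used for mining purchase amounts, while the name and most of the e-mail address are no longer directly readable.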
Arjun Lamichhane 1
Data Mining and Data Warehousing Unit 6: Information Privacy and Data Mining
Transparent: An agency must provide you with details regarding the personal information
they are storing, why they are storing it and what rights you have to access it.
Accessible: An agency must allow you to access your personal information without
excessive delay or expense.
4. Use
Accurate: An agency must ensure that your personal information is relevant, accurate, up to
date and complete before using it.
Limited: An agency can only use your personal information for the purpose for which it was
collected.
5. Disclosure
Restricted: An agency can only disclose your information in limited circumstances, such as
when you have consented or when you were told at the time of collection that it would be
disclosed. An agency cannot disclose your sensitive personal information (for example,
information about political opinions, religious or philosophical beliefs, medical conditions
or trade union membership) without your consent.
Retail data mining can help identify customer buying behaviors, discover customer shopping
patterns and trends, improve the quality of customer service, achieve better customer retention and
satisfaction, enhance goods consumption ratios, design more effective goods transportation and
distribution policies, and reduce the cost of business.
A few examples of data mining in the retail industry are outlined as follows:
increase sales. Similarly, information, such as “hot items this week” or attractive deals, can be
displayed together with the associative information to promote sales.
Fraud analysis and the identification of unusual patterns
Fraudulent activity costs the retail industry millions of dollars per year. It is important to (1)
identify potentially fraudulent users and their atypical usage patterns; (2) detect attempts to gain
fraudulent entry or unauthorized access to individual and organizational accounts; and (3) discover
unusual patterns that may need special attention. Many of these patterns can be discovered by
multidimensional analysis, cluster analysis, and outlier analysis.
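Outlier analysis of this kind can be sketched in a few lines of Python. The example below is only one possible approach, assuming a single numeric feature (transaction amount) and using the modified z-score based on the median absolute deviation (MAD). The function name `flag_outliers`, the sample amounts, and the threshold of 3.5 (a common rule of thumb) are assumptions for illustration.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All values are (nearly) identical, so nothing can be ranked as unusual.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


# Hypothetical transaction amounts for one account; the last one is atypical.
amounts = [42, 38, 51, 45, 40, 47, 39, 980]
suspicious = flag_outliers(amounts)
print(suspicious)
```

The median-based score is used here instead of the mean and standard deviation because a single extreme transaction can inflate the standard deviation enough to hide itself in a small sample. Flagged transactions are candidates for review, not proof of fraud.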
Today, scientific data can be amassed at much higher speeds and lower costs. This has resulted in
the accumulation of huge volumes of high-dimensional data, stream data, and heterogeneous data,
containing rich spatial and temporal information. Consequently, scientific applications are shifting
from the “hypothesize-and-test” paradigm toward a “collect and store data, mine for new
hypotheses, confirm with data or experimentation” process. This shift brings about new challenges
for data mining.
1. Classification: Discovery of a predictive learning function that classifies a data item into one of
several predefined classes.
2. Regression: Discovery of a predictive learning function that maps a data item to a real value
prediction variable.
3. Clustering: A common descriptive task in which one seeks to identify a finite set of categories
or clusters to describe the data.
4. Optimization: Improving the use of limited resources such as time, space, or materials.
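The first three tasks above can be illustrated with a minimal Python sketch on one-dimensional toy data. It is a sketch under stated assumptions, not a reference implementation: the function names (`classify_1nn`, `fit_line`, `kmeans_1d`), the sample data, and the choice of techniques (nearest neighbor, least-squares line, k-means) are all illustrative. Optimization is not covered here.

```python
from statistics import mean


def classify_1nn(train, x):
    """Classification: give x the class label of its nearest labeled example."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def fit_line(xs, ys):
    """Regression: least-squares slope a and intercept b for y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx


def kmeans_1d(values, centroids, iterations=10):
    """Clustering: group values around k centroids (k-means in one dimension).

    Assumes iterations >= 1 and one initial centroid per desired cluster.
    """
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in values:
            idx = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            groups[idx].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids, groups


# Classification: predefined classes "low" and "high" (hypothetical spending data).
train = [(20, "low"), (25, "low"), (80, "high"), (90, "high")]
label = classify_1nn(train, 70)

# Regression: map an input to a real-valued prediction.
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Clustering: no labels are given; the groups are discovered from the data.
centroids, groups = kmeans_1d([1, 2, 3, 10, 11, 12], [0, 5])

print(label, a, b, centroids, groups)
```

The contrast to note is that classification and regression learn from examples with known answers (a class label or a real value), while clustering receives no answers and describes the data by the groups it finds.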
References
[1] J. Han and M. Kamber, Data Mining: Concepts and Techniques, San Francisco: Elsevier Inc., 2006.