Main Focus: Sanitization Problem Is NP-hard. Towards The Solution

The document discusses various approaches for hiding sensitive association rules discovered in transactional databases. It describes three main classes of techniques: 1) Perturbation approaches which modify database values to decrease support for sensitive rules, 2) Use of unknowns which obscure rules by introducing unknown values into transactions, and 3) Recent sophisticated approaches providing new perspectives along with computationally expensive exact solutions. The document provides details on early heuristic algorithms and improvements made to incorporate both itemset support and rule confidence.

Uploaded by

Srinivas

Association Rule Hiding Methods

rule hiding refers to transforming the database D into a database D' of the same degree (same number of items) as D in such a way that only the rules in R-Rh can be mined from D' at either the pre-specified or even higher thresholds. We should note here that in the association rule hiding problem we consider publishing a modified database instead of the secure rules, because we claim that a modified database will certainly have higher utility to the data holder than the set of secure rules. This claim relies on the fact that, if the data itself is published, either a different data mining approach may be applied to the published data or a different support and confidence threshold may easily be selected by the data miner.

It has been proved (Atallah, Bertino, Elmagarmid, Ibrahim, & Verykios, 1999) that the association rule hiding problem, which is also referred to as the database sanitization problem, is NP-hard. Towards the solution of this problem, a number of heuristic and exact techniques have been introduced. In the following section we present a thorough analysis of some of the most interesting techniques which have been proposed for the solution of the association rule hiding problem.

MAIN FOCUS

In the following discussion we present three classes of state-of-the-art techniques which have been proposed for the solution of the association rule hiding problem. The first class contains the perturbation approaches, which rely on heuristics for modifying the database values so that the sensitive knowledge is hidden. The use of unknowns for the hiding of rules comprises the second class of techniques to be investigated in this expository study. The third class contains recent sophisticated approaches that provide a new perspective on the association rule hiding problem, as well as a special class of computationally expensive solutions, the exact solutions.

Perturbation Approaches

Atallah, Bertino, Elmagarmid, Ibrahim & Verykios (1999) were the first to propose a rigorous solution to the association rule hiding problem. Their approach was based on the idea of preventing disclosure of sensitive rules by decreasing the support of the itemsets generating the sensitive association rules. This reduced hiding approach is also known as frequent itemset hiding. The heuristic employed in their approach traverses the itemset lattice in the space of items from bottom to top in order to identify those items that need to turn from 1 to 0 so that the support of an itemset that corresponds to a sensitive rule becomes lower than the minimum support threshold. The algorithm sorts the sensitive itemsets based on their supports and then proceeds by hiding all of the sensitive itemsets one by one. A major improvement over this first heuristic algorithm appeared in the work of Dasseni, Verykios, Elmagarmid & Bertino (2001). The authors extended the existing association rule hiding technique from using only the support of the generating frequent itemsets to using both the support of the generating frequent itemsets and the confidence of the association rules. In that respect, they proposed three new algorithms that exhibited interesting behavior with respect to the characteristics of the hiding process. Verykios, Elmagarmid, Bertino, Saygin & Dasseni (2004), along the same lines as the first work, presented five different algorithms based on various hiding strategies, and they performed an extensive evaluation of these algorithms with respect to different metrics such as the execution time, the number of changes in the original data, the number of non-sensitive rules which were hidden (hiding side effects or false rules) and the number of "ghost" rules which were produced after the hiding. Oliveira & Zaiane (2002) extended existing work by focusing on algorithms that solely remove information, so that they create a smaller impact on the database by not generating false or ghost rules. In their work they considered two classes of approaches: the pattern restriction based approaches, which remove patterns completely from sensitive transactions, and the item restriction based approaches, which selectively remove items from sensitive transactions. They also proposed various performance measures for quantifying the fraction of mining patterns which are preserved after sanitization.

Use of Unknowns

A completely different approach to the hiding of sensitive association rules was taken by employing the use of unknowns in the hiding process (Saygin, Verykios & Elmagarmid, 2002; Saygin, Verykios & Clifton, 2001). The goal of the algorithms that incorporate unknowns in the hiding process was to obscure a given
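
The support-reduction idea described under Perturbation Approaches (turning items from 1 to 0 until a sensitive itemset falls below the minimum support threshold) can be illustrated with a minimal Python sketch. It is not the algorithm of Atallah et al. (1999): it does not traverse the itemset lattice, and the function names, the victim-item rule, the descending sort order and the use of an absolute support count are all assumptions made only for illustration.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple


def support(db: Iterable[Set[str]], itemset: Iterable[str]) -> int:
    """Absolute support: number of transactions that contain every item."""
    items = set(itemset)
    return sum(1 for transaction in db if items <= transaction)


def hide_itemset(db: List[Set[str]], itemset: FrozenSet[str], min_support: int) -> int:
    """Remove one item from supporting transactions until support < min_support.

    Assumption: the victim is the item of the itemset with the highest
    individual support (ties broken alphabetically). Real heuristics choose
    items and transactions more carefully to limit side effects.
    """
    victim = max(sorted(itemset), key=lambda item: support(db, {item}))
    changes = 0
    for transaction in db:
        if support(db, itemset) < min_support:
            break
        if itemset <= transaction:
            transaction.discard(victim)  # the "1 to 0" flip
            changes += 1
    return changes


def sanitize(
    db: Iterable[Set[str]],
    sensitive: Iterable[FrozenSet[str]],
    min_support: int,
) -> Tuple[List[Set[str]], int]:
    """Hide each sensitive itemset in turn on a copy of the database."""
    sanitized = [set(transaction) for transaction in db]
    # Assumption: process itemsets in descending order of support.
    ordered = sorted(sensitive, key=lambda s: support(sanitized, s), reverse=True)
    changes = sum(hide_itemset(sanitized, s, min_support) for s in ordered)
    return sanitized, changes


if __name__ == "__main__":
    database = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"b", "c"}, {"a", "c"}]
    sanitized, changes = sanitize(database, [frozenset({"a", "b"})], min_support=2)
    print("changes:", changes)
    print("support of {a, b} after hiding:", support(sanitized, {"a", "b"}))
    print("support of {a, c} after hiding:", support(sanitized, {"a", "c"}))
```

In this toy database, hiding {a, b} also lowers the support of the non-sensitive itemset {a, c}, which is the kind of hiding side effect that the evaluation metrics mentioned above are meant to count.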


