Chakraborty 2015
Fig. 1: APL of (α, k) anonymized ER graph
Fig. 2: APL of (α, l) anonymized graph
Fig. 3: APL of recursive (α, c, l) anonymized graph
proposed in [3]. An equivalence group G satisfies safety grouping if it meets the following three conditions: 1) |G| ≥ k, 2) f_1 < c (f_l + f_{l+1} + ... + f_m), 3) ((f_1 + 1) / (f_1 (m - l + 1))) < c. For every equivalence group in the (α, c, l) diverse degree sequence, the target degree for all the nodes belonging to that equivalence group is the average of the degrees of all the nodes present in that equivalence group.

B. Graph construction algorithms
After generating the different anonymous degree sequences, we applied the noise node addition technique proposed in [3] to construct the anonymized graph.

III. RESULT AND ANALYSIS

A. Synthetic Dataset
We generated an Erdős-Rényi (ER) model graph consisting of 1000 vertices and 5000 edges. This dataset is used for (α, k) anonymization only. We compared the results obtained from our approach with other existing k-anonymity techniques: the k-match algorithm [5], the generalization method [1], and the against 1-neighbor method [2]. Fig. 1 shows that, for all methods except ours, the APL difference between the raw graph and the anonymized graph is significant. As k increases, the APL of the anonymized graphs generated by those methods deviates further from the raw graph APL, whereas our proposed method produces anonymized graphs whose APL is almost equal to that of the raw graph.

B. Real Datasets
We used the co-authorship network data compiled by M. Newman, which consists of 1589 nodes and 2742 edges, i.e., 2742 distinct connections existed among those 1589 scientists. We took the first letter of each scientist's name as the sensitive label for our experiments. Fig. 2 shows that as k increases, our proposed method tends to generate an anonymized graph with almost the same APL as the raw graph. Fig. 3 shows the performance of our proposed recursive (α, c, l) diversity algorithm in generating the anonymized graph. The deviation from the APL of the raw graph becomes minimal as we increase k.
Since we used the noise node addition technique to generate the anonymized graph, we also analyzed the effect of noise nodes on the final anonymized graph. We observed that our proposed algorithms preserve the social status of the nodes present in the raw graph and add the noise nodes in such a way that the social importance of the noise nodes is always very low compared to the social status of the nodes already present in the raw graph. Our proposed algorithms also do not mix highly influential nodes with less influential nodes. As the method proposed in [3] also uses the noise node addition technique to generate the anonymized graph, we compared the results obtained from our method with those of the method proposed in [3], which we denote as the KDLD method. We tested our proposed algorithms for k = 5, 10, 15, and 20. For (α, 5) diversity and (α, 10) diversity, except for k = 20, our approach produces noise nodes with a lower average eigenvector centrality than the average eigenvector centrality of the nodes present in the raw graph. For (α, 3, 10) diversity, we find that for all values of k, our proposed method generates a recursive (c, l) diverse graph in which the average eigenvector centrality of the noise nodes is less than that of the nodes present in the raw graph. So, if preservation of the social status of the users in the anonymized graph is a criterion for the quality and performance of anonymization, then our proposed approach performs much better than the existing method, which uses the degree centrality concept to generate the anonymized graph.

IV. CONCLUSION
The proposed anonymization model performs better than the existing k-anonymity models in preserving the structural properties of the graphs. Our proposed algorithms also ensure that the noise nodes added for anonymization attain low social importance. Estimating the utility of the anonymized data, and the effect of the added noise nodes on that utility, can be studied further.

References
[1] M. Hay, G. Miklau, D. Jensen, D. Towsley, and P. Weis, “Resisting Structural Re-Identification in Anonymized Social Networks,” Proc. VLDB Endowment, vol. 1, pp. 102-114, 2008.
[2] K. Liu and E. Terzi, “Towards Identity Anonymization on Graphs,” SIGMOD ’08: Proc. ACM SIGMOD International Conference on Management of Data, pp. 93-106, 2008.
[3] M. Yuan, L. Chen, P. S. Yu, and T. Yu, “Protecting Sensitive Labels in Social Network Data Anonymization,” IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 3, pp. 633-647, March 2013.
[4] R. C. Wong, J. Li, A. W. Fu, and K. Wang, “(α, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing,” Proc. ACM SIGKDD, pp. 754-759, 2006.
[5] L. Zou, L. Chen, and M. T. Ozsu, “K-Automorphism: A General Framework for Privacy Preserving Network Publication,” Proc. VLDB Endowment, vol. 2, no. 1, pp. 946-957, 2009.
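
As an illustrative aside on the safety-grouping test stated in Section II, the following Python sketch checks the three conditions for a single equivalence group and computes the group's target degree as the average degree. It assumes, as in recursive (c, l)-diversity, that f_1 ≥ f_2 ≥ ... ≥ f_m are the frequencies of the distinct sensitive labels in the group sorted in descending order. That reading, the handling of groups with fewer than l distinct labels, the function names, and the sample data are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def is_safe_group(labels, k, c, l):
    """Check the three safety-grouping conditions for one equivalence group.

    Assumption: f_1 >= f_2 >= ... >= f_m are the counts of the distinct
    sensitive labels in the group, sorted in descending order.
    """
    freqs = sorted(Counter(labels).values(), reverse=True)
    m = len(freqs)
    if m < l:
        # Assumption: a group with fewer than l distinct labels is rejected.
        return False
    f1 = freqs[0]
    cond1 = len(labels) >= k                      # 1) |G| >= k
    cond2 = f1 < c * sum(freqs[l - 1:])           # 2) f_1 < c (f_l + ... + f_m)
    cond3 = (f1 + 1) / (f1 * (m - l + 1)) < c     # 3) as written in the text
    return cond1 and cond2 and cond3


def target_degree(degrees):
    """Target degree of a group: the average of its members' degrees."""
    return sum(degrees) / len(degrees)


if __name__ == "__main__":
    # Hypothetical group of five nodes with sensitive labels and degrees.
    group_labels = ["A", "A", "B", "C", "D"]
    group_degrees = [3, 4, 5, 4, 4]
    print(is_safe_group(group_labels, k=5, c=3, l=2))
    print(target_degree(group_degrees))
```

In this sketch a group dominated by one label (for example four "A" labels and one "B" with c = 1, l = 2) fails condition 2, which is the behaviour the diversity requirement is meant to enforce.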