Revision as of 18:01, 14 October 2019, Monkbot (talk | contribs): m →‎top: Task 16: replaced (0×) / removed (1×) deprecated |dead-url= and |deadurl= with |url-status=; Tag: AWB
Revision as of 23:25, 22 October 2019, 122.58.132.37 (talk): "Unicity" has a technical meaning in this context.
Line 108: Because ''k''-anonymization does not include any randomization, attackers can still draw inferences from a data set that may harm individuals. For example, if the 19-year-old John from Kerala is known to be in the database above, then it can be reliably said that he has either cancer, a heart-related disease, or a viral infection. ''K''-anonymization is also poorly suited to anonymizing high-dimensional datasets.<ref>{{cite conference|last = Aggarwal|first = Charu C.|title = On ''k''-Anonymity and the Curse of Dimensionality|year = 2005|location = Trondheim, Norway|isbn = 1-59593-154-6|book-title = VLDB '05 – Proceedings of the 31st International Conference on Very Large Data Bases|url = http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.60.3155&rep=rep1&type=pdf}}</ref> For example, researchers showed that, given 4 locations, the [[Unicity_(computer_science)|unicity]] of mobile phone timestamp-location datasets (<math>\mathcal{E}_4</math>, ''k''-anonymity when <math>k=1</math>) can be as high as 95%.<ref>{{cite journal|last=de Montjoye|first=Yves-Alexandre|author2=César A. Hidalgo |author3=Michel Verleysen |author4=Vincent D. Blondel |title=Unique in the Crowd: The privacy bounds of human mobility|journal=Scientific Reports|volume=3|pages=1376|date=March 25, 2013|doi=10.1038/srep01376|pmid=23524645|bibcode=2013NatSR...3E1376D|url=http://dspace.mit.edu/bitstream/1721.1/92263/1/Hidalgo_Unique%20in%20the%20crowd.pdf}}</ref> It has also been shown that ''k''-anonymity can skew the results of a data set if it disproportionately suppresses and generalizes data points with unrepresentative characteristics.<ref>{{cite web|last1=Angiuli|first1=Olivia|author2=Joe Blitzstein |author3=Jim Waldo |authorlink3=Jim Waldo|title=How to De-Identify Your Data|url=http://queue.acm.org/detail.cfm?id=2838930|website=ACM Queue|publisher=ACM}}</ref> The suppression and generalization algorithms used to ''k''-anonymize datasets can be altered, however, so that they do not have such a skewing effect.<ref>{{cite journal|last1=Angiuli|first1=Olivia|author2=Jim Waldo|authorlink2=Jim Waldo|title=Statistical Tradeoffs between Generalization and Suppression in the De-Identification of Large-Scale Data Sets|journal=IEEE Computer Society Intl Conference on Computers, Software, and Applications|date=June 2016|url=https://ieeexplore.ieee.org/abstract/document/7552278/}}</ref>
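
As an illustration of the unicity measure mentioned above, the following is a minimal Python sketch, not the procedure used in the cited study. It assumes each trace is a set of (hour, antenna) points and estimates <math>\mathcal{E}_p</math> as the fraction of traces that are the only match for ''p'' points drawn at random from themselves. The function name <code>estimate_unicity</code>, the point encoding, and the toy data are hypothetical and chosen only for this example.

```python
import random
from typing import Dict, Set, Tuple

# Hypothetical encoding of one observation: (hour of day, antenna id).
Point = Tuple[int, str]


def estimate_unicity(traces: Dict[str, Set[Point]], p: int, seed: int = 0) -> float:
    """Estimate unicity: the share of traces singled out by p of their own points.

    Illustrative sketch only; the sampling scheme is an assumption,
    not the method of any cited paper.
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    rng = random.Random(seed)
    # Only traces with at least p points can contribute a sample of size p.
    eligible = [user for user, trace in traces.items() if len(trace) >= p]
    if not eligible:
        raise ValueError("no trace has at least p points")

    unique = 0
    for user in eligible:
        # Draw p known points from this user's trace (sorted for reproducibility).
        sample = set(rng.sample(sorted(traces[user]), p))
        # Count every trace in the data set that contains all sampled points.
        matches = sum(1 for trace in traces.values() if sample <= trace)
        if matches == 1:  # only the user's own trace matches, i.e. k = 1
            unique += 1
    return unique / len(eligible)


# Toy data, invented for this example.
toy_traces: Dict[str, Set[Point]] = {
    "a": {(8, "A1"), (9, "A2"), (18, "A3")},
    "b": {(8, "A1"), (9, "A2"), (18, "A4")},
    "c": {(7, "B1"), (12, "B2"), (20, "B3")},
}

if __name__ == "__main__":
    for p in (1, 2, 3):
        print(p, estimate_unicity(toy_traces, p))
```

In the toy data, traces "a" and "b" share two of their three points, so a small ''p'' may fail to separate them, while knowing all three points singles out every trace. This mirrors why adding dimensions (more known points) makes ''k''-anonymity harder to maintain.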

K-anonymity: Difference between revisions