Beyond The Text: Analysis of Privacy Statements through Syntactic and Semantic Role Labeling

Shvartzshnaider, Yan; Balashankar, Ananth; Patidar, Vikas; Wies, Thomas; Subramanian, Lakshminarayanan

arXiv:2010.00678 (cs)

[Submitted on 1 Oct 2020]

Abstract: This paper formulates a new task: extracting privacy parameters from a privacy policy through the lens of Contextual Integrity, an established social theory framework for reasoning about privacy norms. Privacy policies, written by lawyers, are lengthy and often contain incomplete and vague statements. We show that traditional NLP tasks, including the recently proposed Question-Answering based solutions, are insufficient to address the privacy parameter extraction problem and yield poor precision and recall. We describe four types of conventional methods that can be partially adapted to the parameter extraction task with varying degrees of success: Hidden Markov Models, BERT fine-tuned models, Dependency Type Parsing (DP), and Semantic Role Labeling (SRL). Based on a detailed evaluation across 36 real-world privacy policies of major enterprises, we demonstrate that a solution combining syntactic DP with type-specific SRL tasks provides the highest accuracy for retrieving contextual privacy parameters from privacy statements. We also observe that incorporating domain-specific knowledge is critical to achieving high precision and recall, which motivates new NLP research on this important problem in the privacy domain.

Comments:	11 pages, 4 figures
Subjects:	Computation and Language (cs.CL); Computers and Society (cs.CY)
Cite as:	arXiv:2010.00678 [cs.CL]
	(or arXiv:2010.00678v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2010.00678

Submission history

From: Yan Shvartzshnaider
[v1] Thu, 1 Oct 2020 20:48:37 UTC (508 KB)
