Anaïs Ollagnier

Also published as: Anais Ollagnier


2024

CyberAgressionAdo-v2: Leveraging Pragmatic-Level Information to Decipher Online Hate in French Multiparty Chats
Anais Ollagnier
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

As a part of the release of the CyberAgressionAdo-V2 dataset, this paper introduces a new tagset that includes tags marking pragmatic-level information occurring in cyberbullying situations. The previous version of this dataset, CyberAgressionAdo-V1, consists of aggressive multiparty chats in French annotated using a hierarchical tagset developed to describe bullying narrative events, including the participant roles, the presence of hate speech, and the type of verbal abuse, among others. In contrast, CyberAgressionAdo-V2 uses a multi-label, fine-grained tagset marking the discursive role of exchanged messages as well as the context in which they occur, for instance attack (ATK), defend (DFN), counterspeech (CNS), abet/instigate (AIN), and gaslight (GSL). This paper provides a comprehensive overview of the annotation tagset and presents statistical insights derived from its application. Additionally, we address the challenges encountered when annotating pragmatic-level information in this context, conducting a thorough analysis of annotator disagreements. The resulting dataset comprises 19 conversations that have been manually annotated and is now available to facilitate further research in the field.

2022

CyberAgressionAdo-v1: a Dataset of Annotated Online Aggressions in French Collected through a Role-playing Game
Anaïs Ollagnier | Elena Cabrio | Serena Villata | Catherine Blaya
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Over the past decades, the number of episodes of cyber aggression occurring online has grown substantially, especially among teens. Most solutions investigated by the NLP community to curb such online abusive behaviors consist of supervised approaches relying on annotated data extracted from social media. However, recent studies have highlighted that private instant messaging platforms are major mediums of cyber aggression among teens. As such interactions remain invisible due to app privacy policies, very few datasets collecting aggressive conversations are available for the computational analysis of language. In order to overcome this limitation, in this paper we present the CyberAgressionAdo-V1 dataset, containing aggressive multiparty chats in French collected through a role-playing game in high schools, and annotated at different layers. We describe the data collection and annotation phases, carried out in the context of an EU and a national research project, and provide an insightful analysis of the different types of aggression and verbal abuse depending on the targeted victims (individuals or communities) emerging from the collected data.

2015

Analyse en dépendance et classification de requêtes en langue naturelle, application à la recommandation de livres [Dependency parsing and classification of natural language queries: application to book recommendation]
Anaïs Ollagnier | Sébastien Fournier | Patrice Bellot
Traitement Automatique des Langues, Volume 56, Numéro 3 : Recherche d'Information [Information Retrieval]

2014

Impact of the nature and size of the training set on performance in the automatic detection of named entities (Impact de la nature et de la taille des corpus d’apprentissage sur les performances dans la détection automatique des entités nommées) [in French]
Anaïs Ollagnier | Sébastien Fournier | Patrice Bellot | Frédéric Béchet
Proceedings of TALN 2014 (Volume 2: Short Papers)