Inna Vogel

Senior Consultant at Advisori - passionate about NLP and Generative AI

Langen (Hessen), Hesse, Germany
10,152 followers, 500+ connections

About

Passionate about natural language processing (NLP) and generative AI, with 7 years of experience in machine learning research. Currently dedicated to helping companies implement AI into their businesses.

Experience

  • ADVISORI FTC GmbH

    Frankfurt am Main, Hesse, Germany

  • -

    Frankfurt am Main, Hesse, Germany

  • -

    Darmstadt area, Germany

  • -

    Frankfurt am Main area, Germany

  • -

    Frankfurt am Main area, Germany

  • -

    Darmstadt area, Germany

Education

Publications

  • Adapter fusion for check-worthiness detection - combining a task adapter with a NER adapter

    ECIR 2024

    Detecting check-worthy statements aims to facilitate manual fact-checking by identifying the claims that fact-checkers should address first. It can also be considered the first step of a fact-checking system. In this paper, we present an adapter fusion model that combines a task adapter with a NER adapter, achieving state-of-the-art results on two challenging check-worthiness benchmarks. Adapters are a resource-efficient alternative to fully fine-tuning transformer models. Our best performing model obtains an F1 score of 0.92 on the CheckThat! Lab 2023 dataset. Additionally, we interpret the fusion attentions, demonstrating the effectiveness of our approach. The quantitative analysis of the fusion attentions shows that named entities contribute significantly to the prediction of the adapter fusion model.

  • Datenerfassung und -analyse von radikalen Online-Inhalten

    MOTRA-Monitor 2021

    Coping with the flood of radical and extremist content on the internet requires supporting automated solutions for investigators and researchers. To support the collection and evaluation of relevant text content, Fraunhofer SIT developed a tool that automatically collects relevant content on social media platforms such as Facebook, Twitter, and YouTube and assesses how radical it is.

  • Profiling Hate Speech Spreaders on Twitter: SVM vs. Bi-LSTM

    CLEF 2021 – Conference and Labs of the Evaluation Forum, September 21–24, 2021, Bucharest, Romania

    Hate speech is a crime that has been growing in recent years, especially in online communication. It can harm an individual or a group of people by targeting their conscious or unconscious intrinsic characteristics. Additionally, the psychological burden of manual moderation has created a need for automated hate speech detection methods. In this notebook, we describe our profiling system for the PAN at CLEF 2021 lab “Profiling Hate Speech Spreaders on Twitter”. The aim of the task is to determine whether it is possible to identify hate speech spreaders on Twitter automatically. Our final submitted system uses character n-grams as features in combination with an SVM and achieves an overall average accuracy of 69.5% for the English and Spanish datasets. Additionally, we experimented with a Bi-LSTM model and trained it with Sentence-BERT, achieving slightly worse results. The experiments show that it is difficult to reliably detect hate speech spreaders on Twitter, because hate speech is more than the use of profanity.

  • Automatisierung beim Auffinden radikaler Inhalte im Internet

    Verlag für Polizeiwissenschaft

    Hamachers, Annika: Extremistische Dynamiken im Social Web : Befunde zu den digitalen Katalysatoren politisch und religiös motivierter Gewalt. Frankfurt am Main: Verlag für Polizeiwissenschaft, 2020
    ISBN: 978-3-86676-671-6

  • Detecting Fake News Spreaders on Twitter from a Multilingual Perspective

    2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA)

    The creators of fake news often use facts from verified news sources and layer them with misinformation to confuse the reader, either intentionally or unintentionally. Fake news is increasingly seen as a threat to democracy, public order, and free debate that can cause confusion and provoke unrest. Several websites have taken on the mission of fact-checking rumors and claims, particularly those that get thousands of views and likes before being debunked and dismissed by expert sources. To prevent fake news from being spread among online users, a near real-time reaction is crucial. Fact-checking websites are often not fast enough to verify the content of all the news being spread. Fake news detection is a challenging task that aims to reduce the human time and effort needed to check the truthfulness of news. In this paper, we propose an approach that is able to identify possible fake news spreaders on social media as a first step towards preventing fake news from being propagated among online users. To this end, we conduct different learning experiments from a multilingual perspective, covering English and Spanish. We evaluate different textual features that are primarily not tied to a specific language and compare different machine learning algorithms. Our results indicate that language-independent features can be used to distinguish between possible fake news spreaders and users who share credible information, with an average detection accuracy of 78% for the English and 87% for the Spanish corpus.

  • Fake News Spreader Detection on Twitter using Character N-Grams

    Conference and Labs of the Evaluation Forum (CLEF)

    The authors of fake news often use facts from verified news sources and mix them with misinformation to create confusion and provoke unrest among readers. The spread of fake news can thereby have serious implications for our society. It can sway political elections, push down stock prices, or crush the reputations of corporations or public figures. Several websites have taken on the mission of checking rumors and allegations, but are often not fast enough to check the content of all the news being disseminated. Social media websites in particular offer an easy platform for the fast propagation of information. To limit the propagation of fake news among social media users, the PAN 2020 challenge focuses on fake news spreaders. The aim of the task is to determine whether it is possible to discriminate authors who have shared fake news in the past from those who have never done so. In this notebook, we describe our profiling system for the fake news detection task on Twitter. For this, we apply different feature extraction techniques and conduct learning experiments from a multilingual perspective, namely English and Spanish. Our final submitted systems use character n-grams as features in combination with a linear SVM for English and Logistic Regression for Spanish. Our submitted models achieve an overall accuracy of 73% and 79% on the English and Spanish official test sets, respectively. Our experiments show that it is difficult to reliably differentiate fake news spreaders on Twitter from users who share credible information, leaving room for further investigation. Our model ranked 3rd out of 72 competitors.

  • Analyzing Linguistic Features of German Fake News: Characterization, Detection, and Discussion.

    Sicherheitslagen und Sicherheitstechnologien. LIT Verlag

    The contribution of this paper is to provide an analytic study on the style and language of German fake news and show whether they differ systematically from true news. Therefore, we compare the language of reliable news with that of propaganda fake news to find differences in the genres.

  • Desinformation aufdecken und bekämpfen

    Nomos Verlagsgesellschaft mbH & Co. KG

    Digitalization raises lying to a new level. In this volume, distinguished researchers present the comprehensive results of their interdisciplinary analysis of digital disinformation: What characterizes disinformation on the German-language internet? What effects does disinformation have? How can it be detected by technical means? What can and could be done against disinformation through regulatory and legal measures? From the findings of journalism studies, media psychology, computer science, and law, the authors derive recommendations for action for the relevant addressees: legislators, the German Press Council, media professionals, operators of social networks, research funding institutions, and, not least, media users. The volume does not stop at analysis but shows how the spread of disinformation over the internet can be effectively contained.

  • FraunhoferSIT at GermEval 2019: Can Machines Distinguish Between Offensive Language and Hate Speech? Towards a Fine-Grained Classification

    Preliminary Proceedings of the 15th Conference on Natural Language Processing (KONVENS 2019)

    In this paper, we describe the FraunhoferSIT submission for the “GermEval 2019 – Shared Task on the Identification of Offensive Language”. We participated in two subtasks: Task 1 is a binary classification of German tweets to identify offensive language. Task 2 is a fine-grained classification that distinguishes between three subcategories of offensive language. Our best model is an SVM classifier based on TF-IDF character n-gram features. Our submitted runs in the shared task are: FraunhoferSIT coarse [1-3].txt for task 1 and FraunhoferSIT fine [1-3].txt for task 2. Our final system reaches a macro-average F1-score of 0.70 for the binary classification and an F1-score of 0.46 for the fine-grained classification. The achieved results show that the problem of automatically distinguishing between offensive language and “Hate Speech” is far from being solved.

  • Automatisierte Analyse Radikaler Inhalte im Internet

    INFORMATIK 2019: 50 Jahre Gesellschaft für Informatik – Informatik für Gesellschaft

    Racism, antisemitism, sexism, and other forms of discrimination and radicalization appear on the internet in different ways. They may be packaged as satire or as inhuman slogans. So-called hate speech is a problem for communication culture, and the affected people or groups are exposed to it. Although there is the statute on incitement to hatred (Section 130 of the German Criminal Code, StGB), hate speech often falls outside what can be prosecuted. Hateful statements are nevertheless problematic, because they radicalize groups with false facts and violate the dignity of those affected. In 2017, the German federal government introduced the Network Enforcement Act (Netzwerkdurchsetzungsgesetz), which obliges social networks to remove hate speech consistently. Without automated detection, this is difficult to achieve. In our work, we present an approach for detecting such content using machine learning. First, the terms radicalization and hate speech are classified linguistically. In this context, we discuss how text data is cleaned and structured. We then use the k-nearest-neighbor algorithm to detect and classify hate speech in tweets. With our approach, we achieved an accuracy of 0.82, which demonstrates the effectiveness of the KNN classification approach.

  • Fake News Detection with the New German Dataset “GermanFakeNC”

    SpringerLink

    23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019, Oslo, Norway, September 9-12, 2019, Proceedings

  • Bot and Gender Identification in Twitter using Word and Character N-Grams

    Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum

    https://pan.webis.de/clef19/pan19-web/author-profiling.html

    Automated social media accounts, called bots, have gained considerable importance worldwide in recent years. Social bots can have serious implications for our society by swaying political elections or spreading disinformation, which gives rationale to social bot detection as an emerging research area. Hence, tools and techniques to automatically detect and classify manipulative bots are needed. In this notebook, we describe our system for the author profiling task at PAN 2019 on bot and gender identification on Twitter. The submitted system uses word unigrams and bigrams as well as character n-grams as features. Tweet preprocessing and feature construction were conducted to train a linear Support Vector Machine (SVM) classifier. Our model shows that it is possible to differentiate bots from humans with a (fairly) high accuracy. Additionally, the accuracy shows that our SVM architecture can reliably determine the gender of the author (male or female). Our submitted model achieved an overall accuracy of 0.92 for bot detection on the English dataset and an accuracy of 0.91 for Spanish tweets. Gender can be determined with an accuracy of 0.82 and 0.78 on the English and Spanish corpus, respectively. Our simple model ranked 8th out of 55 competitors.

  • Authorship Verification in the Absence of Explicit Features and Thresholds

    Springer

    Enhancing information retrieval systems with the ability to take the writing style of people into account opens the door for a number of applications. For example, one can link articles by authorship, which can help identify authors who generate hoaxes and deliberate misinformation in news stories distributed across different platforms. Authorship verification (AV) is a technique that can be used for this purpose. AV deals with the task of judging whether two or more documents stem from the same author. The majority of existing AV approaches rely on machine learning concepts based on explicitly defined stylistic features and complex models that involve a fair number of parameters. Moreover, many existing AV methods are based on explicit thresholds (needed to accept or reject a stated authorship), which are determined on training corpora. We propose a novel parameter-free AV approach, which derives its thresholds for each verification case individually and enables AV in the absence of explicit features and training corpora. In an experimental setup based on eight evaluation corpora (each one from a different language), we show that our approach yields competitive results against the current state of the art and other noteworthy AV baselines.

Courses

  • T.I.S.P. - TeleTrusT Information Security Professional

    -

Projects

  • Athene

    ATHENE: application-oriented cybersecurity research for business, society, and the state

  • DORIAN - Fake News Detection

    The interdisciplinary research project directed by the Fraunhofer Institute for Secure Information Technology SIT aims to uncover and comprehensively fight disinformation. The main challenges of the project are to be able to recognize fake news reliably and quickly and to design and evaluate approaches for effective prevention.

  • X-Sonar

    The joint project 'X-SONAR: Extremist Engagement in Social Media Networks: Identifying, Analyzing and Preventing Processes of Radicalization' conducts applied and fundamental interdisciplinary research on the dynamics and escalation of extremist interactions in social online networks. More specifically, X-SONAR analyzes mechanisms of individual and collective dynamics of violence as well as self-regulation of radicalism within these networks. Online radicalization and the escalation of violence on the internet are relevant not only in regard to prosecution but also require new pathways for early warning and the development of appropriate prevention measures.

Test scores

  • Scrum Master TÜV

    Score: Certificate

    This certificate is based on the Scrum Guide of November 2020.

  • Udacity Natural Language Processing Nanodegree

    Score: Completed

    1. Build a part-of-speech tagging model with a Hidden Markov Model (HMM).
    2. Build a machine translation model using different recurrent neural networks.
    3. Build a speech recognition model that turns speech into text and vice versa using a combination of deep neural networks such as CNNs and RNNs.

  • Udemy "NLP - Natural Language Processing with Python"

    Score: Completed

    - Using the NLTK and spaCy libraries for Python (tokenization, parsing, POS tagging, entity recognition, lemmatization)
    - Machine learning with scikit-learn (sentiment analysis, spam versus legitimate email)
    - Unsupervised learning (topic modeling)
    - Deep learning to build a chatbot

Languages

  • English

    Full professional proficiency

  • Russian

    Native or bilingual proficiency

  • German

    Native or bilingual proficiency

  • Spanish

    Limited working proficiency
