Showing 1–9 of 9 results for author: Ferritto, A

  1. arXiv:2211.16634  [pdf, other]

    cs.CL cs.AI cs.LG

    SPARTAN: Sparse Hierarchical Memory for Parameter-Efficient Transformers

    Authors: Ameet Deshpande, Md Arafat Sultan, Anthony Ferritto, Ashwin Kalyan, Karthik Narasimhan, Avirup Sil

    Abstract: Fine-tuning pre-trained language models (PLMs) achieves impressive performance on a range of downstream tasks, and their sizes have consequently been getting bigger. Since a different copy of the model is required for each task, this paradigm is infeasible for storage-constrained edge devices like mobile phones. In this paper, we propose SPARTAN, a parameter efficient (PE) and computationally fast…

    Submitted 29 November, 2022; originally announced November 2022.

  2. arXiv:2206.08441  [pdf, other]

    cs.CL

    GAAMA 2.0: An Integrated System that Answers Boolean and Extractive Questions

    Authors: Scott McCarley, Mihaela Bornea, Sara Rosenthal, Anthony Ferritto, Md Arafat Sultan, Avirup Sil, Radu Florian

    Abstract: Recent machine reading comprehension datasets include extractive and boolean questions but current approaches do not offer integrated support for answering both question types. We present a multilingual machine reading comprehension system and front-end demo that handles boolean questions by providing both a YES/NO answer and highlighting supporting evidence, and handles extractive questions by hi…

    Submitted 21 June, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

  3. arXiv:2105.03229  [pdf, other]

    cs.CL

    VAULT: VAriable Unified Long Text Representation for Machine Reading Comprehension

    Authors: Haoyang Wen, Anthony Ferritto, Heng Ji, Radu Florian, Avirup Sil

    Abstract: Existing models on Machine Reading Comprehension (MRC) require complex model architecture for effectively modeling long texts with paragraph representation and classification, thereby making inference computationally inefficient for production use. In this work, we propose VAULT: a light-weight and parallel-efficient paragraph representation for MRC based on contextualized representation from long…

    Submitted 2 June, 2021; v1 submitted 7 May, 2021; originally announced May 2021.

    Comments: Accepted at ACL 2021

  4. arXiv:2010.05904  [pdf, other]

    cs.CL

    Multi-Stage Pre-training for Low-Resource Domain Adaptation

    Authors: Rong Zhang, Revanth Gangi Reddy, Md Arafat Sultan, Vittorio Castelli, Anthony Ferritto, Radu Florian, Efsun Sarioglu Kayi, Salim Roukos, Avirup Sil, Todd Ward

    Abstract: Transfer learning techniques are particularly useful in NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain. Current approaches directly adapt a pre-trained language model (LM) on in-domain text before fine-tuning to downstream tasks. We show that extending the vocabulary of the LM with domain-specific terms leads to further gains. To a bigger effect, we utilize…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: Accepted at EMNLP 2020

  5. arXiv:1911.02984  [pdf, other]

    cs.CL cs.IR

    The TechQA Dataset

    Authors: Vittorio Castelli, Rishav Chakravarti, Saswati Dana, Anthony Ferritto, Radu Florian, Martin Franz, Dinesh Garg, Dinesh Khandelwal, Scott McCarley, Mike McCawley, Mohamed Nasr, Lin Pan, Cezar Pendus, John Pitrelli, Saurabh Pujar, Salim Roukos, Andrzej Sakrajda, Avirup Sil, Rosario Uceda-Sosa, Todd Ward, Rong Zhang

    Abstract: We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain. The TechQA corpus highlights two real-world issues from the automated customer support domain. First, it contains actual questions posed by users on a technical forum, rather than questions generated specifically for a competition or a task. Second, it has a real-world size -- 600 training, 310 de…

    Submitted 7 November, 2019; originally announced November 2019.

    Comments: Long version of conference paper to be submitted

  6. arXiv:1911.00337  [pdf, other]

    cs.CL

    Ensembling Strategies for Answering Natural Questions

    Authors: Anthony Ferritto, Lin Pan, Rishav Chakravarti, Salim Roukos, Radu Florian, J. William Murdock, Avirup Sil

    Abstract: Many of the top question answering systems today utilize ensembling to improve their performance on tasks such as the Stanford Question Answering Dataset (SQuAD) and Natural Questions (NQ) challenges. Unfortunately most of these systems do not publish their ensembling strategies used in their leaderboard submissions. In this work, we investigate a number of ensembling techniques and demonstrate a…

    Submitted 6 November, 2019; v1 submitted 30 October, 2019; originally announced November 2019.

    Comments: arXiv admin note: text overlap with arXiv:1909.05286

  7. arXiv:1909.05286  [pdf, other]

    cs.CL

    Frustratingly Easy Natural Question Answering

    Authors: Lin Pan, Rishav Chakravarti, Anthony Ferritto, Michael Glass, Alfio Gliozzo, Salim Roukos, Radu Florian, Avirup Sil

    Abstract: Existing literature on Question Answering (QA) mostly focuses on algorithmic novelty, data augmentation, or increasingly large pre-trained language models like XLNet and RoBERTa. Additionally, a lot of systems on the QA leaderboards do not have associated research documentation in order to successfully replicate their experiments. In this paper, we outline these algorithmic components such as Atte…

    Submitted 11 September, 2019; originally announced September 2019.

  8. arXiv:1909.04120  [pdf, other]

    cs.CL cs.AI cs.LG

    Span Selection Pre-training for Question Answering

    Authors: Michael Glass, Alfio Gliozzo, Rishav Chakravarti, Anthony Ferritto, Lin Pan, G P Shrivatsa Bhargav, Dinesh Garg, Avirup Sil

    Abstract: BERT (Bidirectional Encoder Representations from Transformers) and related pre-trained Transformers have provided large gains across many language understanding tasks, achieving a new state-of-the-art (SOTA). BERT is pre-trained on two auxiliary tasks: Masked Language Model and Next Sentence Prediction. In this paper we introduce a new pre-training task inspired by reading comprehension to better…

    Submitted 18 June, 2020; v1 submitted 9 September, 2019; originally announced September 2019.

    Comments: Accepted at ACL 2020

  9. CFO: A Framework for Building Production NLP Systems

    Authors: Rishav Chakravarti, Cezar Pendus, Andrzej Sakrajda, Anthony Ferritto, Lin Pan, Michael Glass, Vittorio Castelli, J. William Murdock, Radu Florian, Salim Roukos, Avirup Sil

    Abstract: This paper introduces a novel orchestration framework, called CFO (COMPUTATION FLOW ORCHESTRATOR), for building, experimenting with, and deploying interactive NLP (Natural Language Processing) and IR (Information Retrieval) systems to production environments. We then demonstrate a question answering system built using this framework which incorporates state-of-the-art BERT based MRC (Machine Readi…

    Submitted 19 June, 2020; v1 submitted 16 August, 2019; originally announced August 2019.

    Comments: http://ibm.biz/cfo_framework

    Report number: D19-3006

    Journal ref: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations