Showing 1–25 of 25 results for author: Castelli, V

Searching in archive cs.
  1. arXiv:2407.13998  [pdf, other]

    cs.CL cs.AI

    RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering

    Authors: Rujun Han, Yuhao Zhang, Peng Qi, Yumo Xu, Jenyuan Wang, Lan Liu, William Yang Wang, Bonan Min, Vittorio Castelli

    Abstract: Question answering based on retrieval augmented generation (RAG-QA) is an important research topic in NLP and has a wide range of real-world applications. However, most existing datasets for this task are either constructed using a single source corpus or consist of short extractive answers, which fall short of evaluating large language model (LLM) based RAG-QA systems on cross-domain generalizati…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2403.06326  [pdf, other]

    cs.CL cs.AI cs.LG

    From Instructions to Constraints: Language Model Alignment with Automatic Constraint Verification

    Authors: Fei Wang, Chao Shang, Sarthak Jain, Shuai Wang, Qiang Ning, Bonan Min, Vittorio Castelli, Yassine Benajiba, Dan Roth

    Abstract: User alignment is crucial for adapting general-purpose language models (LMs) to downstream tasks, but human annotations are often not available for all types of instructions, especially those with customized constraints. We observe that user instructions typically contain constraints. While assessing response quality in terms of the whole instruction is often costly, efficiently evaluating the sat…

    Submitted 10 March, 2024; originally announced March 2024.

  3. arXiv:2402.18479  [pdf, other]

    cs.CL

    NewsQs: Multi-Source Question Generation for the Inquiring Mind

    Authors: Alyssa Hwang, Kalpit Dixit, Miguel Ballesteros, Yassine Benajiba, Vittorio Castelli, Markus Dreyer, Mohit Bansal, Kathleen McKeown

    Abstract: We present NewsQs (news-cues), a dataset that provides question-answer pairs for multiple news documents. To create NewsQs, we augment a traditional multi-document summarization dataset with questions automatically generated by a T5-Large model fine-tuned on FAQ-style news articles from the News On the Web corpus. We show that fine-tuning a model with control codes produces questions that are judg…

    Submitted 15 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: minor wording change

  4. arXiv:2310.17060  [pdf]

    cs.RO

    Aplicacion de Robots Humanoides como Guias Interactivos en Museos: Una Simulacion con el Robot NAO

    Authors: Hiago Sodre, Pablo Moraes, Monica Rodriguez, Victor Castelli, Pamela Barboza, Martin Mattos, Guillermo Vivas, Bruna de Vargas, Tobias Dörnbach, Ricardo Grando

    Abstract: This article presents an application that evaluates the feasibility of humanoid robots as interactive guides in art museums. The application entails programming a NAO robot and a chatbot to provide information about art pieces in a simulated museum environment. In this controlled scenario, the learning employees interact with the robot and the chatbot. The result is a skilled participation in the…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: Febitec 2023, in Spanish language

  5. arXiv:2308.05317  [pdf, other]

    cs.CL

    Few-Shot Data-to-Text Generation via Unified Representation and Multi-Source Learning

    Authors: Alexander Hanbo Li, Mingyue Shang, Evangelia Spiliopoulou, Jie Ma, Patrick Ng, Zhiguo Wang, Bonan Min, William Wang, Kathleen McKeown, Vittorio Castelli, Dan Roth, Bing Xiang

    Abstract: We present a novel approach for structured data-to-text generation that addresses the limitations of existing methods that primarily focus on specific types of structured data. Our proposed method aims to improve performance in multi-task training, zero-shot and few-shot scenarios by providing a unified representation that can handle various forms of structured data such as tables, knowledge graph…

    Submitted 9 August, 2023; originally announced August 2023.

  6. arXiv:2305.18842  [pdf, other]

    cs.CL cs.AI cs.CV

    Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge

    Authors: Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henghui Zhu, Yuhao Zhang, Alexander Hanbo Li, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Dan Roth, Bing Xiang

    Abstract: The open-ended Visual Question Answering (VQA) task requires AI models to jointly reason over visual and natural language inputs using world knowledge. Recently, pre-trained Language Models (PLM) such as GPT-3 have been applied to the task and shown to be powerful world knowledge sources. However, these methods suffer from low knowledge coverage caused by PLM bias -- the tendency to generate certa…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023 Findings

  7. arXiv:2305.17337  [pdf, other]

    cs.CL cs.AI

    Benchmarking Diverse-Modal Entity Linking with Generative Models

    Authors: Sijia Wang, Alexander Hanbo Li, Henry Zhu, Sheng Zhang, Chung-Wei Hang, Pramuditha Perera, Jie Ma, William Wang, Zhiguo Wang, Vittorio Castelli, Bing Xiang, Patrick Ng

    Abstract: Entities can be expressed in diverse formats, such as texts, images, or column names and cell values in tables. While existing entity linking (EL) models work well on per modality configuration, such as text-only EL, visual grounding, or schema linking, it is more challenging to design a unified model for diverse modality configurations. To bring various modality configurations together, we constr…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 15 pages. ACL 2023

  8. arXiv:2305.16265  [pdf, other]

    cs.CL

    UNITE: A Unified Benchmark for Text-to-SQL Evaluation

    Authors: Wuwei Lan, Zhiguo Wang, Anuj Chauhan, Henghui Zhu, Alexander Li, Jiang Guo, Sheng Zhang, Chung-Wei Hang, Joseph Lilien, Yiqun Hu, Lin Pan, Mingwen Dong, Jun Wang, Jiarong Jiang, Stephen Ash, Vittorio Castelli, Patrick Ng, Bing Xiang

    Abstract: A practical text-to-SQL system should generalize well on a wide variety of natural language questions, unseen database schemas, and novel SQL query structures. To comprehensively evaluate text-to-SQL systems, we introduce a UNIfied benchmark for Text-to-SQL Evaluation (UNITE). It is composed of publicly available text-to-SQL datasets, containing natural language questions from more than 12 domains…

    Submitted 14 July, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: 5 pages

  9. arXiv:2305.14827  [pdf, other]

    cs.CL

    Pre-training Intent-Aware Encoders for Zero- and Few-Shot Intent Classification

    Authors: Mujeen Sung, James Gung, Elman Mansimov, Nikolaos Pappas, Raphael Shu, Salvatore Romeo, Yi Zhang, Vittorio Castelli

    Abstract: Intent classification (IC) plays an important role in task-oriented dialogue systems. However, IC models often generalize poorly when training without sufficient annotated examples for each user intent. We propose a novel pre-training method for text encoders that uses contrastive learning with intent pseudo-labels to produce embeddings that are well-suited for IC tasks, reducing the need for manu…

    Submitted 13 November, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023

  10. arXiv:2305.13191  [pdf, other]

    cs.CL cs.AI cs.LG

    Taxonomy Expansion for Named Entity Recognition

    Authors: Karthikeyan K, Yogarshi Vyas, Jie Ma, Giovanni Paolini, Neha Anna John, Shuai Wang, Yassine Benajiba, Vittorio Castelli, Dan Roth, Miguel Ballesteros

    Abstract: Training a Named Entity Recognition (NER) model often involves fixing a taxonomy of entity types. However, requirements evolve and we might need the NER model to recognize additional entity types. A simple approach is to re-annotate the entire dataset with both existing and additional entity types and then train the model on the re-annotated dataset. However, this is an extremely laborious task. To re…

    Submitted 22 May, 2023; originally announced May 2023.

  11. arXiv:2305.11242  [pdf, other]

    cs.CL

    Comparing Biases and the Impact of Multilingual Training across Multiple Languages

    Authors: Sharon Levy, Neha Anna John, Ling Liu, Yogarshi Vyas, Jie Ma, Yoshinari Fujinuma, Miguel Ballesteros, Vittorio Castelli, Dan Roth

    Abstract: Studies in bias and fairness in natural language processing have primarily examined social biases within a single language and/or across few attributes (e.g. gender, race). However, biases can manifest differently across various languages for individual attributes. As a result, it is critical to examine biases within each language and attribute. Of equal importance is to study how these biases com…

    Submitted 18 May, 2023; originally announced May 2023.

  12. arXiv:2301.08881  [pdf, other]

    cs.CL

    Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness

    Authors: Shuaichen Chang, Jun Wang, Mingwen Dong, Lin Pan, Henghui Zhu, Alexander Hanbo Li, Wuwei Lan, Sheng Zhang, Jiarong Jiang, Joseph Lilien, Steve Ash, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Bing Xiang

    Abstract: Neural text-to-SQL models have achieved remarkable performance in translating natural language questions into SQL queries. However, recent studies reveal that text-to-SQL models are vulnerable to task-specific perturbations. Previous curated robustness test sets usually focus on individual phenomena. In this paper, we propose a comprehensive robustness benchmark based on Spider, a cross-domain tex…

    Submitted 28 January, 2023; v1 submitted 20 January, 2023; originally announced January 2023.

    Comments: ICLR 2023

  13. arXiv:2212.08785  [pdf, other]

    cs.CL

    Importance of Synthesizing High-quality Data for Text-to-SQL Parsing

    Authors: Yiyun Zhao, Jiarong Jiang, Yiqun Hu, Wuwei Lan, Henry Zhu, Anuj Chauhan, Alexander Li, Lin Pan, Jun Wang, Chung-Wei Hang, Sheng Zhang, Marvin Dong, Joe Lilien, Patrick Ng, Zhiguo Wang, Vittorio Castelli, Bing Xiang

    Abstract: Recently, there has been increasing interest in synthesizing data to improve downstream text-to-SQL tasks. In this paper, we first examined the existing synthesized datasets and discovered that state-of-the-art text-to-SQL algorithms did not further improve on popular benchmarks when trained with augmented synthetic data. We observed two shortcomings: illogical synthetic SQL queries from independe…

    Submitted 16 December, 2022; originally announced December 2022.

  14. arXiv:2211.04903  [pdf, other]

    cs.CL

    Novel Chapter Abstractive Summarization using Spinal Tree Aware Sub-Sentential Content Selection

    Authors: Hardy Hardy, Miguel Ballesteros, Faisal Ladhak, Muhammad Khalifa, Vittorio Castelli, Kathleen McKeown

    Abstract: Summarizing novel chapters is a difficult task due to the input length and the fact that sentences that appear in the desired summaries draw content from multiple places throughout the chapter. We present a pipelined extractive-abstractive approach where the extractive step filters the content that is passed to the abstractive component. Extremely lengthy input also results in a highly skewed data…

    Submitted 9 November, 2022; originally announced November 2022.

  15. arXiv:2204.09248  [pdf, ps, other]

    cs.CL cs.IR

    Synthetic Target Domain Supervision for Open Retrieval QA

    Authors: Revanth Gangi Reddy, Bhavani Iyer, Md Arafat Sultan, Rong Zhang, Avirup Sil, Vittorio Castelli, Radu Florian, Salim Roukos

    Abstract: Neural passage retrieval is a new and promising approach in open retrieval question answering. In this work, we stress-test the Dense Passage Retriever (DPR) -- a state-of-the-art (SOTA) open domain neural retrieval model -- on closed and specialized target domains such as COVID-19, and find that it lags behind standard BM25 in this important real-world setting. To make DPR more robust under domai…

    Submitted 20 April, 2022; originally announced April 2022.

    Comments: Published at SIGIR 2021

  16. arXiv:2109.02329  [pdf, other]

    cs.RO

    Predicting Performance of SLAM Algorithms

    Authors: Matteo Luperto, Valerio Castelli, Francesco Amigoni

    Abstract: Among the abilities that autonomous mobile robots should exhibit, map building and localization are definitely recognized as fundamental. Consequently, countless algorithms for solving the Simultaneous Localization And Mapping (SLAM) problem have been proposed. Currently, their evaluation is performed ex-post, according to outcomes obtained when running the algorithms on data collected by robots i…

    Submitted 6 September, 2021; originally announced September 2021.

    Comments: Working preprint draft. To be polished and submitted for peer review

  17. arXiv:2104.07800  [pdf, other]

    cs.CL cs.AI cs.IR

    Towards Robust Neural Retrieval Models with Synthetic Pre-Training

    Authors: Revanth Gangi Reddy, Vikas Yadav, Md Arafat Sultan, Martin Franz, Vittorio Castelli, Heng Ji, Avirup Sil

    Abstract: Recent work has shown that commonly available machine reading comprehension (MRC) datasets can be used to train high-performance neural information retrieval (IR) systems. However, the evaluation of neural IR has so far been limited to standard supervised learning settings, where they have outperformed traditional term matching baselines. We conduct in-domain and out-of-domain evaluations of neura…

    Submitted 15 April, 2021; originally announced April 2021.

  18. arXiv:2012.01414  [pdf, other]

    cs.CL cs.AI cs.IR

    End-to-End QA on COVID-19: Domain Adaptation with Synthetic Training

    Authors: Revanth Gangi Reddy, Bhavani Iyer, Md Arafat Sultan, Rong Zhang, Avi Sil, Vittorio Castelli, Radu Florian, Salim Roukos

    Abstract: End-to-end question answering (QA) requires both information retrieval (IR) over a large document collection and machine reading comprehension (MRC) on the retrieved passages. Recent work has successfully trained neural IR systems using only supervised question answering (QA) examples from open-domain datasets. However, despite impressive performance on Wikipedia, neural IR lags behind traditional…

    Submitted 2 December, 2020; originally announced December 2020.

    Comments: Preprint

  19. arXiv:2011.03435  [pdf, other]

    cs.CL cs.AI cs.LG

    Answer Span Correction in Machine Reading Comprehension

    Authors: Revanth Gangi Reddy, Md Arafat Sultan, Efsun Sarioglu Kayi, Rong Zhang, Vittorio Castelli, Avirup Sil

    Abstract: Answer validation in machine reading comprehension (MRC) consists of verifying an extracted answer against an input context and question pair. Previous work has looked at re-assessing the "answerability" of the question given the extracted answer. Here we address a different problem: the tendency of existing MRC systems to produce partially correct answers when presented with answerable questions.…

    Submitted 6 November, 2020; originally announced November 2020.

    Comments: Accepted in Findings of EMNLP 2020

  20. arXiv:2010.12776  [pdf, other]

    cs.CL

    Improved Synthetic Training for Reading Comprehension

    Authors: Yanda Chen, Md Arafat Sultan, Vittorio Castelli

    Abstract: Automatically generated synthetic training examples have been shown to improve performance in machine reading comprehension (MRC). Compared to human annotated gold standard data, synthetic training data has unique properties, such as high availability at the possible expense of quality. In view of such differences, in this paper, we explore novel applications of synthetic examples to MRC. Our prop…

    Submitted 24 October, 2020; originally announced October 2020.

    Comments: 11 pages, 2 figures

  21. arXiv:2010.05904  [pdf, other]

    cs.CL

    Multi-Stage Pre-training for Low-Resource Domain Adaptation

    Authors: Rong Zhang, Revanth Gangi Reddy, Md Arafat Sultan, Vittorio Castelli, Anthony Ferritto, Radu Florian, Efsun Sarioglu Kayi, Salim Roukos, Avirup Sil, Todd Ward

    Abstract: Transfer learning techniques are particularly useful in NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain. Current approaches directly adapt a pre-trained language model (LM) on in-domain text before fine-tuning to downstream tasks. We show that extending the vocabulary of the LM with domain-specific terms leads to further gains. To a bigger effect, we utilize…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: Accepted at EMNLP 2020

  22. arXiv:1911.02984  [pdf, other]

    cs.CL cs.IR

    The TechQA Dataset

    Authors: Vittorio Castelli, Rishav Chakravarti, Saswati Dana, Anthony Ferritto, Radu Florian, Martin Franz, Dinesh Garg, Dinesh Khandelwal, Scott McCarley, Mike McCawley, Mohamed Nasr, Lin Pan, Cezar Pendus, John Pitrelli, Saurabh Pujar, Salim Roukos, Andrzej Sakrajda, Avirup Sil, Rosario Uceda-Sosa, Todd Ward, Rong Zhang

    Abstract: We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain. The TechQA corpus highlights two real-world issues from the automated customer support domain. First, it contains actual questions posed by users on a technical forum, rather than questions generated specifically for a competition or a task. Second, it has a real-world size -- 600 training, 310 de…

    Submitted 7 November, 2019; originally announced November 2019.

    Comments: Long version of conference paper to be submitted

  23. CFO: A Framework for Building Production NLP Systems

    Authors: Rishav Chakravarti, Cezar Pendus, Andrzej Sakrajda, Anthony Ferritto, Lin Pan, Michael Glass, Vittorio Castelli, J. William Murdock, Radu Florian, Salim Roukos, Avirup Sil

    Abstract: This paper introduces a novel orchestration framework, called CFO (COMPUTATION FLOW ORCHESTRATOR), for building, experimenting with, and deploying interactive NLP (Natural Language Processing) and IR (Information Retrieval) systems to production environments. We then demonstrate a question answering system built using this framework which incorporates state-of-the-art BERT based MRC (Machine Readi…

    Submitted 19 June, 2020; v1 submitted 16 August, 2019; originally announced August 2019.

    Comments: http://ibm.biz/cfo_framework

    Report number: D19-3006

    Journal ref: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

  24. arXiv:1606.07548  [pdf, other]

    cs.CL

    A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization

    Authors: Lu Wang, Hema Raghavan, Vittorio Castelli, Radu Florian, Claire Cardie

    Abstract: We consider the problem of using sentence compression techniques to facilitate query-focused multi-document summarization. We present a sentence-compression-based framework for the task, and design a series of learning-based compression models built on parse trees. An innovative beam search decoder is proposed to efficiently find highly probable compressions. Under this framework, we show how to i…

    Submitted 23 June, 2016; originally announced June 2016.

    Comments: ACL 2013

  25. arXiv:1606.05702  [pdf, ps, other]

    cs.CL

    Query-Focused Opinion Summarization for User-Generated Content

    Authors: Lu Wang, Hema Raghavan, Claire Cardie, Vittorio Castelli

    Abstract: We present a submodular function-based framework for query-focused opinion summarization. Within our framework, relevance ordering produced by a statistical ranker, and information coverage with respect to topic distribution and diverse viewpoints are both encoded as submodular functions. Dispersion functions are utilized to minimize the redundancy. We are the first to evaluate different metrics o…

    Submitted 17 June, 2016; originally announced June 2016.

    Comments: COLING 2014