Showing 1–50 of 101 results for author: Yao, B

Searching in archive cs.
  1. arXiv:2408.16465 [pdf, other]

    cs.HC

    Human and LLM-Based Voice Assistant Interaction: An Analytical Framework for User Verbal and Nonverbal Behaviors

    Authors: Szeyi Chan, Shihan Fu, Jiachen Li, Bingsheng Yao, Smit Desai, Mirjana Prpa, Dakuo Wang

    Abstract: Recent progress in large language model (LLM) technology has significantly enhanced the interaction experience between humans and voice assistants (VAs). This project aims to explore a user's continuous interaction with LLM-based VA (LLM-VA) during a complex task. We recruited 12 participants to interact with an LLM-VA during a cooking task, selected for its complexity and the requirement for cont…

    Submitted 3 September, 2024; v1 submitted 29 August, 2024; originally announced August 2024.

  2. arXiv:2408.03586 [pdf, other]

    cs.HC

    Clinical Challenges and AI Opportunities in Decision-Making for Cancer Treatment-Induced Cardiotoxicity

    Authors: Siyi Wu, Weidan Cao, Shihan Fu, Bingsheng Yao, Ziqi Yang, Changchang Yin, Varun Mishra, Daniel Addison, Ping Zhang, Dakuo Wang

    Abstract: Cardiotoxicity induced by cancer treatment has become a major clinical concern, affecting the long-term survival and quality of life of cancer patients. Effective clinical decision-making, including the detection of cancer treatment-induced cardiotoxicity and the monitoring of associated symptoms, remains a challenging task for clinicians. This study investigates the current practices and needs of…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: In Submission

  3. arXiv:2408.02456 [pdf, other]

    cs.LG cs.AI

    Enhancing Heterogeneous Knowledge Graph Completion with a Novel GAT-based Approach

    Authors: Wanxu Wei, Yitong Song, Bin Yao

    Abstract: Knowledge graphs (KGs) play a vital role in enhancing search results and recommendation systems. With the rapid increase in the size of the KGs, they are becoming inaccurate and incomplete. This problem can be solved by knowledge graph completion methods, of which graph attention network (GAT)-based methods stand out owing to their superior performance. However, existing GAT-based knowledge graph…

    Submitted 5 August, 2024; originally announced August 2024.

    Journal ref: ACM Transactions on Knowledge Discovery from Data, Volume 18, Issue 4, 2024

  4. arXiv:2407.18271 [pdf, other]

    cs.AR cs.AI

    Large Language Model for Verilog Generation with Golden Code Feedback

    Authors: Ning Wang, Bingkun Yao, Jie Zhou, Xi Wang, Zhe Jiang, Nan Guan

    Abstract: Recent advancements in large language models (LLMs) have catalyzed significant interest in the automatic generation of Register-Transfer Level (RTL) code, particularly Verilog, from natural language instructions. While commercial LLMs like ChatGPT have dominated this domain, open-source alternatives have lagged considerably in performance, limiting the flexibility and data privacy of this emerging…

    Submitted 5 August, 2024; v1 submitted 21 July, 2024; originally announced July 2024.

  5. arXiv:2407.17571 [pdf, other]

    cs.CV

    Diffusion Models for Multi-Task Generative Modeling

    Authors: Changyou Chen, Han Ding, Bunyamin Sisman, Yi Xu, Ouye Xie, Benjamin Z. Yao, Son Dinh Tran, Belinda Zeng

    Abstract: Diffusion-based generative modeling has been achieving state-of-the-art results on various generation tasks. Most diffusion models, however, are limited to a single-generation modeling. Can we generalize diffusion models with the ability of multi-modal generative training for more generalizable modeling? In this paper, we propose a principled way to define a diffusion model by constructing a unifi…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: Published as a conference paper at ICLR 2024

  6. arXiv:2407.16999 [pdf, other]

    cs.LG cs.AI cs.HC

    SepsisLab: Early Sepsis Prediction with Uncertainty Quantification and Active Sensing

    Authors: Changchang Yin, Pin-Yu Chen, Bingsheng Yao, Dakuo Wang, Jeffrey Caterino, Ping Zhang

    Abstract: Sepsis is the leading cause of in-hospital mortality in the USA. Early sepsis onset prediction and diagnosis could significantly improve the survival of sepsis patients. Existing predictive models are usually trained on high-quality data with little missing information, while missing values widely exist in real-world clinical scenarios (especially in the first hours of admissions to the hospital), wh…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: To be published in KDD 2024

    MSC Class: 68T07 (primary) 92C50 (secondary) ACM Class: H.2.8; I.2.1; J.3

  7. arXiv:2407.13851 [pdf, other]

    cs.CV cs.LG cs.MM

    X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs

    Authors: Sirnam Swetha, Jinyu Yang, Tal Neiman, Mamshad Nayeem Rizve, Son Tran, Benjamin Yao, Trishul Chilimbi, Mubarak Shah

    Abstract: Recent advancements in Multimodal Large Language Models (MLLMs) have revolutionized the field of vision-language understanding by integrating visual perception capabilities into Large Language Models (LLMs). The prevailing trend in this field involves the utilization of a vision encoder derived from vision-language contrastive learning (CL), showing expertise in capturing overall representations w…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: Accepted at ECCV2024

  8. arXiv:2407.12725 [pdf, other]

    cs.CL

    Is Sarcasm Detection A Step-by-Step Reasoning Process in Large Language Models?

    Authors: Ben Yao, Yazhou Zhang, Qiuchi Li, Jing Qin

    Abstract: Elaborating a series of intermediate reasoning steps significantly improves the ability of large language models (LLMs) to solve complex problems, as such steps would evoke LLMs to think sequentially. However, human sarcasm understanding is often considered an intuitive and holistic cognitive process, in which various linguistic, contextual, and emotional cues are integrated to form a comprehensiv…

    Submitted 24 August, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

    Comments: 9 pages, 5 figures

  9. arXiv:2407.09073 [pdf, other]

    cs.CV

    Open Vocabulary Multi-Label Video Classification

    Authors: Rohit Gupta, Mamshad Nayeem Rizve, Jayakrishnan Unnikrishnan, Ashish Tawari, Son Tran, Mubarak Shah, Benjamin Yao, Trishul Chilimbi

    Abstract: Pre-trained vision-language models (VLMs) have enabled significant progress in open vocabulary computer vision tasks such as image classification, object detection and image segmentation. Some recent works have focused on extending VLMs to open vocabulary single label action classification in videos. However, previous methods fall short in holistic video understanding which requires the ability to…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Accepted at ECCV 2024

  10. arXiv:2407.03772 [pdf, other]

    eess.IV cs.CV q-bio.QM

    CS3: Cascade SAM for Sperm Segmentation

    Authors: Yi Shi, Xu-Peng Tian, Yun-Kai Wang, Tie-Yi Zhang, Bin Yao, Hui Wang, Yong Shao, Cen-Cen Wang, Rong Zeng, De-Chuan Zhan

    Abstract: Automated sperm morphology analysis plays a crucial role in the assessment of male fertility, yet its efficacy is often compromised by the challenges in accurately segmenting sperm images. Existing segmentation techniques, including the Segment Anything Model (SAM), are notably inadequate in addressing the complex issue of sperm overlap, a frequent occurrence in clinical samples. Our exploratory stu…

    Submitted 9 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: Early accepted by MICCAI2024

  11. arXiv:2407.03663 [pdf, other]

    cs.CV

    Limited-View Photoacoustic Imaging Reconstruction Via High-quality Self-supervised Neural Representation

    Authors: Youshen Xiao, Yuting Shen, Bowei Yao, Xiran Cai, Yuyao Zhang, Fei Gao

    Abstract: In practical applications within the human body, it is often challenging to fully encompass the target tissue or organ, necessitating the use of limited-view arrays, which can lead to the loss of crucial information. Addressing the reconstruction of photoacoustic sensor signals in limited-view detection spaces has become a focal point of current research. In this study, we introduce a self-supervi…

    Submitted 4 July, 2024; originally announced July 2024.

  12. arXiv:2407.00840 [pdf, other]

    cs.LG

    MUSE-Net: Missingness-aware mUlti-branching Self-attention Encoder for Irregular Longitudinal Electronic Health Records

    Authors: Zekai Wang, Tieming Liu, Bing Yao

    Abstract: The era of big data has made vast amounts of clinical data readily available, particularly in the form of electronic health records (EHRs), which provides unprecedented opportunities for developing data-driven diagnostic tools to enhance clinical decision making. However, the application of EHRs in data-driven modeling faces challenges such as irregularly spaced multi-variate time series, issues o…

    Submitted 30 June, 2024; originally announced July 2024.

  13. arXiv:2406.19749 [pdf, other]

    eess.IV cs.CV

    SPIRONet: Spatial-Frequency Learning and Topological Channel Interaction Network for Vessel Segmentation

    Authors: De-Xing Huang, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Shuang-Yi Wang, Zhen-Qiu Feng, Mei-Jiang Gui, Hao Li, Tian-Yu Xiang, Bo-Xian Yao, Zeng-Guang Hou

    Abstract: Automatic vessel segmentation is paramount for developing next-generation interventional navigation systems. However, current approaches suffer from suboptimal segmentation performances due to significant challenges in intraoperative images (i.e., low signal-to-noise ratio, small or slender vessels, and strong interference). In this paper, a novel spatial-frequency learning and topological channel…

    Submitted 28 June, 2024; originally announced June 2024.

  14. arXiv:2405.13803 [pdf, other]

    cs.HC cs.CL

    Sunnie: An Anthropomorphic LLM-Based Conversational Agent for Mental Well-Being Activity Recommendation

    Authors: Siyi Wu, Feixue Han, Bingsheng Yao, Tianyi Xie, Xuan Zhao, Dakuo Wang

    Abstract: A longstanding challenge in mental well-being support is the reluctance of people to adopt psychologically beneficial activities, often due to lack of motivation, low perceived trustworthiness, and limited personalization of recommendations. Chatbots have shown promise in promoting positive mental health practices, yet their rigid interaction flows and less human-like conversational experiences pr…

    Submitted 13 June, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: In Submission

  15. arXiv:2404.13409 [pdf, other]

    cs.HC

    "I Wish There Were an AI": Challenges and AI Potential in Cancer Patient-Provider Communication

    Authors: Ziqi Yang, Xuhai Xu, Bingsheng Yao, Jiachen Li, Jennifer Bagdasarian, Guodong Gao, Dakuo Wang

    Abstract: Patient-provider communication has been crucial to cancer patients' survival after their cancer treatments. However, the research community and patients themselves often overlook the communication challenges after cancer treatments as they are overshadowed by the severity of the patient's illness and the variety and rarity of the cancer disease itself. Meanwhile, the recent technical advances in A…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: 18 pages, 2 figures, submission to CSCW'24

  16. arXiv:2404.05012 [pdf, other]

    cs.AI cs.CL

    Towards Reliable and Empathetic Depression-Diagnosis-Oriented Chats

    Authors: Kunyao Lan, Cong Ming, Binwei Yao, Lu Chen, Mengyue Wu

    Abstract: Chatbots can serve as a viable tool for preliminary depression diagnosis via interactive conversations with potential patients. Nevertheless, the blend of task-oriented and chit-chat in diagnosis-related dialogues necessitates professional expertise and empathy. Such unique requirements challenge traditional dialogue frameworks geared towards single optimization goals. To address this, we propose…

    Submitted 7 April, 2024; originally announced April 2024.

  17. arXiv:2403.16398 [pdf, other]

    cs.LG cs.AI

    Rethinking the Representation in Federated Unsupervised Learning with Non-IID Data

    Authors: Xinting Liao, Weiming Liu, Chaochao Chen, Pengyang Zhou, Fengyuan Yu, Huabin Zhu, Binhui Yao, Tao Wang, Xiaolin Zheng, Yanchao Tan

    Abstract: Federated learning achieves effective performance in modeling decentralized data. In practice, client data are not well-labeled, which makes it potential for federated unsupervised learning (FUSL) with non-IID data. However, the performance of existing FUSL methods suffers from insufficient representations, i.e., (1) representation collapse entanglement among local and global models, and (2) incon…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  18. arXiv:2403.14870 [pdf, other]

    cs.CV cs.CL cs.LG

    VidLA: Video-Language Alignment at Scale

    Authors: Mamshad Nayeem Rizve, Fan Fei, Jayakrishnan Unnikrishnan, Son Tran, Benjamin Z. Yao, Belinda Zeng, Mubarak Shah, Trishul Chilimbi

    Abstract: In this paper, we propose VidLA, an approach for video-language alignment at scale. There are two major limitations of previous video-language alignment approaches. First, they do not capture both short-range and long-range temporal dependencies and typically employ complex hierarchical deep network architectures that are hard to integrate with existing pretrained image-text foundation models. To…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  19. arXiv:2403.08154 [pdf, other]

    cs.LG eess.SP

    The Effect of Different Optimization Strategies to Physics-Constrained Deep Learning for Soil Moisture Estimation

    Authors: Jianxin Xie, Bing Yao, Zheyu Jiang

    Abstract: Soil moisture is a key hydrological parameter that has significant importance to human society and the environment. Accurate modeling and monitoring of soil moisture in crop fields, especially in the root zone (top 100 cm of soil), is essential for improving agricultural production and crop yield with the help of precision irrigation and farming tools. Realizing the full sensor data potential depe…

    Submitted 12 March, 2024; originally announced March 2024.

  20. arXiv:2403.01273 [pdf, other]

    cs.LG cs.AI cs.CL

    NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention

    Authors: Tianyi Zhang, Jonah Wonkyu Yi, Bowen Yao, Zhaozhuo Xu, Anshumali Shrivastava

    Abstract: Large language model inference on Central Processing Units (CPU) is challenging due to the vast quantities of expensive Multiply-Add (MAD) matrix operations in the attention computations. In this paper, we argue that there is a rare gem in modern CPUs, Single-Instruction-Multiple-Data (SIMD) registers, which allow for ultra-low-latency lookups in batch. We leverage this unique capability of CPUs t…

    Submitted 2 March, 2024; originally announced March 2024.

  21. arXiv:2402.01994 [pdf, ps, other]

    cs.HC cs.AI cs.CR

    Human-Centered Privacy Research in the Age of Large Language Models

    Authors: Tianshi Li, Sauvik Das, Hao-Ping Lee, Dakuo Wang, Bingsheng Yao, Zhiping Zhang

    Abstract: The emergence of large language models (LLMs), and their increased use in user-facing systems, has led to substantial privacy concerns. To date, research on these privacy concerns has been model-centered: exploring how LLMs lead to privacy risks like memorization, or can be used to infer personal characteristics about people from their content. We argue that there is a need for more research focus…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: 4 pages, CHI EA'24

  22. arXiv:2401.13804 [pdf, other]

    cs.HC cs.CY

    Exploring Parent's Needs for Children-Centered AI to Support Preschoolers' Interactive Storytelling and Reading Activities

    Authors: Yuling Sun, Jiaju Chen, Bingsheng Yao, Jiali Liu, Dakuo Wang, Xiaojuan Ma, Yuxuan Lu, Ying Xu, Liang He

    Abstract: Interactive storytelling is vital for preschooler development. While children's interactive partners have traditionally been their parents and teachers, recent advances in artificial intelligence (AI) have sparked a surge of AI-based storytelling and reading technologies. As these technologies become increasingly ubiquitous in preschoolers' lives, questions arise regarding how they function in pra…

    Submitted 31 August, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: Accepted at ACM CSCW 2024

  23. arXiv:2401.13799 [pdf, other]

    cs.CY cs.HC

    Who Changed the Destiny of Rural Students, and How?: Unpacking ICT-Mediated Remote Education in Rural China

    Authors: Yuling Sun, Xiuqi Zhu, Xiaomu Zhou, Bingsheng Yao, Kai Zhang, Dakuo Wang, Jiaju Chen, Liang He

    Abstract: The proliferation of Information and Communication Technologies (ICTs) has shown great promise in addressing educational challenges facing rural areas. However, the complex rural context poses significant challenges to the effective utilization of these technologies. This paper examines the empirical integration of live-streaming-based remote classrooms (LSRC) through a qualitative study in rural…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: In submission

  24. arXiv:2401.11876 [pdf, other]

    cs.SE cs.RO

    First-principles Based 3D Virtual Simulation Testing for Discovering SOTIF Corner Cases of Autonomous Driving

    Authors: Lehang Li, Haokuan Wu, Botao Yao, Tianyu He, Shuohan Huang, Chuanyi Liu

    Abstract: 3D virtual simulation, which generates diversified test scenarios and tests full-stack of Autonomous Driving Systems (ADSes) modules dynamically as a whole, is a promising approach for Safety of The Intended Functionality (SOTIF) ADS testing. However, as different configurations of a test scenario will affect the sensor perceptions and environment interaction, e.g. light pulses emitted by the LiDA…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 11 pages, 10 figures

  25. arXiv:2312.14478 [pdf, other]

    cs.LG

    Federated Learning via Input-Output Collaborative Distillation

    Authors: Xuan Gong, Shanglin Li, Yuxiang Bao, Barry Yao, Yawen Huang, Ziyan Wu, Baochang Zhang, Yefeng Zheng, David Doermann

    Abstract: Federated learning (FL) is a machine learning paradigm in which distributed local nodes collaboratively train a central model without sharing individually held private data. Existing FL methods either iteratively share local model parameters or deploy co-distillation. However, the former is highly susceptible to private data leakage, and the latter design relies on the prerequisites of task-releva…

    Submitted 22 December, 2023; originally announced December 2023.

    Comments: Accepted at AAAI 2024

  26. arXiv:2312.00029 [pdf, other]

    cs.CR cs.AI cs.CL

    Bergeron: Combating Adversarial Attacks through a Conscience-Based Alignment Framework

    Authors: Matthew Pisano, Peter Ly, Abraham Sanders, Bingsheng Yao, Dakuo Wang, Tomek Strzalkowski, Mei Si

    Abstract: Research into AI alignment has grown considerably since the recent introduction of increasingly capable Large Language Models (LLMs). Unfortunately, modern methods of alignment still fail to fully prevent harmful responses when models are deliberately attacked. Such vulnerabilities can lead to LLMs being manipulated into generating hazardous content: from instructions for creating dangerous materi…

    Submitted 18 August, 2024; v1 submitted 16 November, 2023; originally announced December 2023.

  27. arXiv:2311.09825 [pdf, other]

    cs.CL

    Human Still Wins over LLM: An Empirical Study of Active Learning on Domain-Specific Annotation Tasks

    Authors: Yuxuan Lu, Bingsheng Yao, Shao Zhang, Yun Wang, Peng Zhang, Tun Lu, Toby Jia-Jun Li, Dakuo Wang

    Abstract: Large Language Models (LLMs) have demonstrated considerable advances, and several claims have been made about their exceeding human performance. However, in real-world tasks, domain knowledge is often required. Low-resource learning methods like Active Learning (AL) have been proposed to tackle the cost of domain expert annotation, raising this question: Can LLMs surpass compact models trained wit…

    Submitted 16 November, 2023; originally announced November 2023.

  28. arXiv:2311.09782 [pdf, other]

    cs.CL

    More Samples or More Prompts? Exploring Effective In-Context Sampling for LLM Few-Shot Prompt Engineering

    Authors: Bingsheng Yao, Guiming Chen, Ruishi Zou, Yuxuan Lu, Jiachen Li, Shao Zhang, Yisi Sang, Sijia Liu, James Hendler, Dakuo Wang

    Abstract: While most existing works on LLM prompting techniques focus only on how to select a better set of data samples inside one single prompt input (In-Context Learning or ICL), why can not we design and leverage multiple prompts together to further improve the LLM's performance? In this work, we propose In-Context Sampling (ICS), a low-resource LLM prompting technique to produce confident predictions b…

    Submitted 2 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted at NAACL 2024 Findings

  29. arXiv:2311.09756 [pdf, other]

    cs.CL

    FairytaleCQA: Integrating a Commonsense Knowledge Graph into Children's Storybook Narratives

    Authors: Jiaju Chen, Yuxuan Lu, Shao Zhang, Bingsheng Yao, Yuanzhe Dong, Ying Xu, Yunyao Li, Qianwen Wang, Dakuo Wang, Yuling Sun

    Abstract: AI models (including LLM) often rely on narrative question-answering (QA) datasets to provide customized QA functionalities to support downstream children education applications; however, existing datasets only include QA pairs that are grounded within the given storybook content, but children can learn more when teachers refer the storybook content to real-world knowledge (e.g., commonsense knowl…

    Submitted 16 November, 2023; originally announced November 2023.

  30. arXiv:2310.17245 [pdf, other]

    cs.LG cs.AI

    CROP: Conservative Reward for Model-based Offline Policy Optimization

    Authors: Hao Li, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Zhen-Qiu Feng, Xiao-Yin Liu, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Bo-Xian Yao, Zeng-Guang Hou

    Abstract: Offline reinforcement learning (RL) aims to optimize policy using collected data without online interactions. Model-based approaches are particularly appealing for addressing offline RL challenges due to their capability to mitigate the limitations of offline data through data generation using models. Prior research has demonstrated that introducing conservatism into the model or Q-function during…

    Submitted 26 October, 2023; originally announced October 2023.

  31. arXiv:2310.15077 [pdf, other]

    cs.CL

    'Don't Get Too Technical with Me': A Discourse Structure-Based Framework for Science Journalism

    Authors: Ronald Cardenas, Bingsheng Yao, Dakuo Wang, Yufang Hou

    Abstract: Science journalism refers to the task of reporting technical findings of a scientific paper as a less technical news article to the general public audience. We aim to design an automated system to support this real-world task (i.e., automatic science journalism) by 1) introducing a newly-constructed and real-world dataset (SciTechNews), with tuples of a publicly-available scientific paper, its cor…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023

  32. arXiv:2310.05853 [pdf, other]

    cs.HC

    "Mango Mango, How to Let The Lettuce Dry Without A Spinner?'': Exploring User Perceptions of Using An LLM-Based Conversational Assistant Toward Cooking Partner

    Authors: Szeyi Chan, Jiachen Li, Bingsheng Yao, Amama Mahmood, Chien-Ming Huang, Holly Jimison, Elizabeth D Mynatt, Dakuo Wang

    Abstract: The rapid advancement of the Large Language Model (LLM) has created numerous potentials for integration with conversational assistants (CAs) assisting people in their daily tasks, particularly due to their extensive flexibility. However, users' real-world experiences interacting with these assistants remain unexplored. In this research, we chose cooking, a complex daily task, as a scenario to inve…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: Under submission to CHI2024

  33. arXiv:2309.13879 [pdf, other]

    cs.HC

    LLM-Powered Conversational Voice Assistants: Interaction Patterns, Opportunities, Challenges, and Design Guidelines

    Authors: Amama Mahmood, Junxiang Wang, Bingsheng Yao, Dakuo Wang, Chien-Ming Huang

    Abstract: Conventional Voice Assistants (VAs) rely on traditional language models to discern user intent and respond to their queries, leading to interactions that often lack a broader contextual understanding, an area in which Large Language Models (LLMs) excel. However, current LLMs are largely designed for text-based interactions, thus making it unclear how user interactions will evolve if their modality…

    Submitted 25 September, 2023; originally announced September 2023.

  34. arXiv:2309.12368 [pdf, other]

    cs.HC cs.AI cs.LG

    Rethinking Human-AI Collaboration in Complex Medical Decision Making: A Case Study in Sepsis Diagnosis

    Authors: Shao Zhang, Jianing Yu, Xuhai Xu, Changchang Yin, Yuxuan Lu, Bingsheng Yao, Melanie Tory, Lace M. Padilla, Jeffrey Caterino, Ping Zhang, Dakuo Wang

    Abstract: Today's AI systems for medical decision support often succeed on benchmark datasets in research papers but fail in real-world deployment. This work focuses on the decision making of sepsis, an acute life-threatening systematic infection that requires an early diagnosis with high uncertainty from the clinician. Our aim is to explore the design requirements for AI systems that can support clinical e…

    Submitted 26 February, 2024; v1 submitted 17 September, 2023; originally announced September 2023.

    Comments: Accepted by CHI'24

    MSC Class: 68U35 ACM Class: H.5.2; I.2.1

  35. arXiv:2309.11653 [pdf, other]

    cs.HC cs.AI cs.CR

    "It's a Fair Game", or Is It? Examining How Users Navigate Disclosure Risks and Benefits When Using LLM-Based Conversational Agents

    Authors: Zhiping Zhang, Michelle Jia, Hao-Ping Lee, Bingsheng Yao, Sauvik Das, Ada Lerner, Dakuo Wang, Tianshi Li

    Abstract: The widespread use of Large Language Model (LLM)-based conversational agents (CAs), especially in high-stakes domains, raises many privacy concerns. Building ethical LLM-based CAs that respect user privacy requires an in-depth understanding of the privacy risks that concern users the most. However, existing research, primarily model-centered, does not provide insight into users' perspectives. To b…

    Submitted 1 April, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: 26 pages, 5 figures

  36. arXiv:2309.09357 [pdf, other]

    cs.CL cs.AI cs.HC

    Talk2Care: Facilitating Asynchronous Patient-Provider Communication with Large-Language-Model

    Authors: Ziqi Yang, Xuhai Xu, Bingsheng Yao, Shao Zhang, Ethan Rogers, Stephen Intille, Nawar Shara, Guodong Gordon Gao, Dakuo Wang

    Abstract: Despite the plethora of telehealth applications to assist home-based older adults and healthcare providers, basic messaging and phone calls are still the most common communication methods, which suffer from limited availability, information loss, and process inefficiencies. One promising solution to facilitate patient-provider communication is to leverage large language models (LLMs) with their po…

    Submitted 3 February, 2024; v1 submitted 17 September, 2023; originally announced September 2023.

    Comments: Under submission to IMWUT'23, 26 pages

    MSC Class: 68U35 ACM Class: H.5.2; I.2.7

  37. arXiv:2307.15868 [pdf, ps, other]

    math.OC cs.LG

    Faster Stochastic Algorithms for Minimax Optimization under Polyak--Łojasiewicz Conditions

    Authors: Lesi Chen, Boyuan Yao, Luo Luo

    Abstract: This paper considers stochastic first-order algorithms for minimax optimization under Polyak--Łojasiewicz (PL) conditions. We propose SPIDER-GDA for solving the finite-sum problem of the form $\min_x \max_y f(x,y)\triangleq \frac{1}{n} \sum_{i=1}^n f_i(x,y)$, where the objective function $f(x,y)$ is $μ_x$-PL in $x$ and $μ_y$-PL in $y$; and each $f_i(x,y)$ is $L$-smooth. We prove SPIDER-GDA could f…

    Submitted 28 July, 2023; originally announced July 2023.

    Comments: published in NeurIPS 2022; fix a mistake in the proof of Thm. 4.1 and polish the writing

  38. Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data

    Authors: Xuhai Xu, Bingsheng Yao, Yuanzhe Dong, Saadia Gabriel, Hong Yu, James Hendler, Marzyeh Ghassemi, Anind K. Dey, Dakuo Wang

    Abstract: Advances in large language models (LLMs) have empowered a variety of applications. However, there is still a significant gap in research when it comes to understanding and enhancing the capabilities of LLMs in the field of mental health. In this work, we present a comprehensive evaluation of multiple LLMs on various mental health prediction tasks via online text data, including Alpaca, Alpaca-LoRA…

    Submitted 28 January, 2024; v1 submitted 26 July, 2023; originally announced July 2023.

    Comments: Published at Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT) 2024

    MSC Class: 68U35 ACM Class: H.5.2; I.2.m

  39. arXiv:2306.08126 [pdf, other]

    cs.CL cs.AI

    PersonaPKT: Building Personalized Dialogue Agents via Parameter-efficient Knowledge Transfer

    Authors: Xu Han, Bin Guo, Yoon Jung, Benjamin Yao, Yu Zhang, Xiaohu Liu, Chenlei Guo

    Abstract: Personalized dialogue agents (DAs) powered by large pre-trained language models (PLMs) often rely on explicit persona descriptions to maintain personality consistency. However, such descriptions may not always be available or may pose privacy concerns. To tackle this bottleneck, we introduce PersonaPKT, a lightweight transfer learning approach that can build persona-consistent dialogue models with…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: 10 pages, 3 figures, accepted to SustaiNLP 2023

  40. Reducing Communication for Split Learning by Randomized Top-k Sparsification

    Authors: Fei Zheng, Chaochao Chen, Lingjuan Lyu, Binhui Yao

    Abstract: Split learning is a simple solution for Vertical Federated Learning (VFL), which has drawn substantial attention in both research and application due to its simplicity and efficiency. However, communication efficiency is still a crucial issue for split learning. In this paper, we investigate multiple communication reduction methods for split learning, including cut layer size reduction, top-k spar…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: Accepted by IJCAI 2023

    Journal ref: IJCAI 2023

  41. arXiv:2305.16163 [pdf, other]

    cs.IR cs.AI

    PPGenCDR: A Stable and Robust Framework for Privacy-Preserving Cross-Domain Recommendation

    Authors: Xinting Liao, Weiming Liu, Xiaolin Zheng, Binhui Yao, Chaochao Chen

    Abstract: Privacy-preserving cross-domain recommendation (PPCDR) refers to preserving the privacy of users when transferring the knowledge from source domain to target domain for better performance, which is vital for the long-term development of recommender systems. Existing work on cross-domain recommendation (CDR) reaches advanced and satisfying recommendation performance, but mostly neglects preserving…

    Submitted 11 May, 2023; originally announced May 2023.

    Comments: To appear in AAAI 2023

  42. arXiv:2305.14725  [pdf, other]

    cs.CL

    AMELI: Enhancing Multimodal Entity Linking with Fine-Grained Attributes

    Authors: Barry Menglong Yao, Yu Chen, Qifan Wang, Sijia Wang, Minqian Liu, Zhiyang Xu, Licheng Yu, Lifu Huang

    Abstract: We propose attribute-aware multimodal entity linking, where the input is a mention described with a text and image, and the goal is to predict the corresponding target entity from a multimodal knowledge base (KB) where each entity is also described with a text description, a visual image and a set of attributes and values. To support this research, we construct AMELI, a large-scale dataset consist…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 12 pages, 4 figures

    ACM Class: I.2.7

  43. arXiv:2305.14328  [pdf, other]

    cs.CL

    Benchmarking LLM-based Machine Translation on Cultural Awareness

    Authors: Binwei Yao, Ming Jiang, Diyi Yang, Junjie Hu

    Abstract: Translating cultural-specific content is crucial for effective cross-cultural communication. However, many MT systems still struggle to translate sentences containing cultural-specific entities accurately and understandably. Recent advancements in in-context learning utilize lightweight prompts to guide large language models (LLMs) in machine translation tasks. Nevertheless, the effectiveness of t…

    Submitted 22 March, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

  44. arXiv:2305.12710  [pdf, other]

    cs.CL

    Beyond Labels: Empowering Human Annotators with Natural Language Explanations through a Novel Active-Learning Architecture

    Authors: Bingsheng Yao, Ishan Jindal, Lucian Popa, Yannis Katsis, Sayan Ghosh, Lihong He, Yuxuan Lu, Shashank Srivastava, Yunyao Li, James Hendler, Dakuo Wang

    Abstract: Real-world domain experts (e.g., doctors) rarely annotate only a decision label in their day-to-day workflow without providing explanations. Yet, existing low-resource learning techniques, such as Active Learning (AL), that aim to support human annotators mostly focus on the label while neglecting the natural language explanation of a data point. This work proposes a novel AL architecture to suppo…

    Submitted 23 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted to EMNLP 2023 Findings

  45. arXiv:2305.03117  [pdf, other]

    cs.CL

    Are Human Explanations Always Helpful? Towards Objective Evaluation of Human Natural Language Explanations

    Authors: Bingsheng Yao, Prithviraj Sen, Lucian Popa, James Hendler, Dakuo Wang

    Abstract: Human-annotated labels and explanations are critical for training explainable NLP models. However, unlike human-annotated labels whose quality is easier to calibrate (e.g., with a majority vote), human-crafted free-form explanations can be quite subjective. Before blindly using them as ground truth to train ML models, a vital question needs to be asked: How do we evaluate a human-annotated explana…

    Submitted 22 May, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL2023

  46. arXiv:2305.01810  [pdf, other]

    cs.CL cs.AI

    KEPLET: Knowledge-Enhanced Pretrained Language Model with Topic Entity Awareness

    Authors: Yichuan Li, Jialong Han, Kyumin Lee, Chengyuan Ma, Benjamin Yao, Derek Liu

    Abstract: In recent years, Pre-trained Language Models (PLMs) have shown their superiority by pre-training on unstructured text corpus and then fine-tuning on downstream tasks. On entity-rich textual resources like Wikipedia, Knowledge-Enhanced PLMs (KEPLMs) incorporate the interactions between tokens and mentioned entities in pre-training, and are thus more effective on entity-centric tasks such as entity…

    Submitted 2 May, 2023; originally announced May 2023.

  47. Time series anomaly detection with reconstruction-based state-space models

    Authors: Fan Wang, Keli Wang, Boyu Yao

    Abstract: Recent advances in digitization have led to the availability of multivariate time series data in various domains, enabling real-time monitoring of operations. Identifying abnormal data patterns and detecting potential failures in these scenarios are important yet rather challenging. In this work, we propose a novel unsupervised anomaly detection method for time series data. The proposed framework…

    Submitted 9 October, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

  48. arXiv:2302.02599  [pdf, other]

    cs.LG cs.AI cs.DC

    Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models

    Authors: Yuliang Liu, Shenggui Li, Jiarui Fang, Yanjun Shao, Boyuan Yao, Yang You

    Abstract: In recent years, large-scale models have demonstrated state-of-the-art performance across various domains. However, training such models requires various techniques to address the problem of limited computing power and memory on devices such as GPUs. Some commonly used techniques include pipeline parallelism, tensor parallelism, and activation checkpointing. While existing works have focused on fi…

    Submitted 21 February, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

  49. arXiv:2210.11408  [pdf, other]

    eess.SP cs.LG

    Hierarchical Deep Learning with Generative Adversarial Network for Automatic Cardiac Diagnosis from ECG Signals

    Authors: Zekai Wang, Stavros Stavrakis, Bing Yao

    Abstract: Cardiac disease is the leading cause of death in the US. Accurate heart disease detection is of critical importance for timely medical treatment to save patients' lives. Routine use of electrocardiogram (ECG) is the most common method for physicians to assess the electrical activities of the heart and detect possible abnormal cardiac conditions. Fully utilizing the ECG data for reliable heart dise…

    Submitted 19 October, 2022; originally announced October 2022.

  50. arXiv:2209.15312  [pdf, other]

    cs.IT

    Strings And Colorings Of Topological Coding Towards Asymmetric Topology Cryptography

    Authors: Bing Yao, Chao Yang, Xia Liu, Fei Ma, Jing Su, Hui Sun, Xiaohui Zhang, Yarong Mu

    Abstract: We, for anti-quantum computing, will discuss various number-based strings, such as number-based super-strings, parameterized strings, set-based strings, graph-based strings, integer-partitioned and integer-decomposed strings, Hanzi-based strings, as well as algebraic operations based on number-based strings. Moreover, we introduce number-based string-colorings, magic-constraint colorings, and vect…

    Submitted 30 September, 2022; originally announced September 2022.

    Comments: Asymmetric topology encryption is a new topic of topological coding towards the certificateless public key cryptography