Showing 1–50 of 266 results for author: Su, Z

Searching in archive cs.
  1. arXiv:2408.12076  [pdf, other]

    cs.CL cs.AI

    ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM

    Authors: Zhaochen Su, Jun Zhang, Xiaoye Qu, Tong Zhu, Yanshu Li, Jiashuo Sun, Juntao Li, Min Zhang, Yu Cheng

    Abstract: Large language models (LLMs) have achieved impressive advancements across numerous disciplines, yet the critical issue of knowledge conflicts, a major source of hallucinations, has rarely been studied. Only a few research explored the conflicts between the inherent knowledge of LLMs and the retrieved contextual knowledge. However, a thorough assessment of knowledge conflict in LLMs is still missin…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: Under Review

  2. arXiv:2408.10613  [pdf, other]

    cs.IR

    Task-level Distributionally Robust Optimization for Large Language Model-based Dense Retrieval

    Authors: Guangyuan Ma, Yongliang Ma, Xing Wu, Zhenpeng Su, Ming Zhou, Songlin Hu

    Abstract: Large Language Model-based Dense Retrieval (LLM-DR) optimizes over numerous heterogeneous fine-tuning collections from different domains. However, the discussion about its training data distribution is still minimal. Previous studies rely on empirically assigned dataset choices or sampling ratios, which inevitably leads to sub-optimal retrieval performances. In this paper, we propose a new task-le…

    Submitted 20 August, 2024; originally announced August 2024.

  3. arXiv:2408.08343  [pdf, other]

    cs.SE cs.AI

    API-guided Dataset Synthesis to Finetune Large Code Models

    Authors: Zongjie Li, Daoyuan Wu, Shuai Wang, Zhendong Su

    Abstract: Large code models (LCMs), pre-trained on vast code corpora, have demonstrated remarkable performance across a wide array of code-related tasks. Supervised fine-tuning (SFT) plays a vital role in aligning these models with specific requirements and enhancing their performance in particular domains. However, synthesizing high-quality SFT datasets poses a significant challenge due to the uneven quali…

    Submitted 22 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

  4. arXiv:2408.07525  [pdf, other]

    cs.DB cs.SE

    Dinkel: Testing Graph Database Engines via State-Aware Query Generation

    Authors: Dominic Wüst, Zu-Ming Jiang, Zhendong Su

    Abstract: Graph database management systems (GDBMSs) store and manipulate graph data and form a core part of many data-driven applications. To ensure their reliability, several approaches have been proposed to test GDBMSs by generating queries in Cypher, the most popular graph query language. However, Cypher allows queries with complicated state changes and data dependencies, which existing approaches do no…

    Submitted 14 August, 2024; originally announced August 2024.

  5. arXiv:2408.06019  [pdf, other]

    cs.CV

    HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors

    Authors: Xiaozheng Zheng, Chao Wen, Zhaohu Li, Weiyi Zhang, Zhuo Su, Xu Chang, Yang Zhao, Zheng Lv, Xiaoyuan Zhang, Yongjie Zhang, Guidong Wang, Lan Xu

    Abstract: In this paper, we present a novel 3D head avatar creation approach capable of generalizing from few-shot in-the-wild data with high-fidelity and animatable robustness. Given the underconstrained nature of this problem, incorporating prior knowledge is essential. Therefore, we propose a framework comprising prior learning and avatar creation phases. The prior learning phase leverages 3D head priors…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: Project page: https://headgap.github.io/

  6. Mesh deformation-based single-view 3D reconstruction of thin eyeglasses frames with differentiable rendering

    Authors: Fan Zhang, Ziyue Ji, Weiguang Kang, Weiqing Li, Zhiyong Su

    Abstract: With the support of Virtual Reality (VR) and Augmented Reality (AR) technologies, the 3D virtual eyeglasses try-on application is well on its way to becoming a new trending solution that offers a "try on" option to select the perfect pair of eyeglasses at the comfort of your own home. Reconstructing eyeglasses frames from a single image with traditional depth and image-based methods is extremely d…

    Submitted 9 August, 2024; originally announced August 2024.

    Journal ref: Graphical Models, Volume 135, October 2024, 101225

  7. arXiv:2408.03865  [pdf, other]

    cs.LG

    PackMamba: Efficient Processing of Variable-Length Sequences in Mamba training

    Authors: Haoran Xu, Ziqian Liu, Rong Fu, Zhongling Su, Zerui Wang, Zheng Cai, Zhilin Pei, Xingcheng Zhang

    Abstract: With the evolution of large language models, traditional Transformer models become computationally demanding for lengthy sequences due to the quadratic growth in computation with respect to the sequence length. Mamba, emerging as a groundbreaking architecture in the field of generative AI, demonstrates remarkable proficiency in handling elongated sequences with reduced computational and memory com…

    Submitted 21 August, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

  8. arXiv:2408.00220  [pdf, other]

    math.DG cs.LG

    Persistent de Rham-Hodge Laplacians in the Eulerian representation

    Authors: Zhe Su, Yiying Tong, Guo-Wei Wei

    Abstract: Recently, topological data analysis (TDA) has become a trending topic in data science and engineering. However, the key technique of TDA, i.e., persistent homology, is defined on point cloud data, which restricts its scope. In this work, we propose persistent de Rham-Hodge Laplacian, or persistent Hodge Laplacian (PHL) for abbreviation, for the TDA on manifolds with boundaries, or volumetric data.…

    Submitted 31 July, 2024; originally announced August 2024.

  9. arXiv:2407.21289  [pdf, other]

    cs.CV cs.GR

    Fine-grained Metrics for Point Cloud Semantic Segmentation

    Authors: Zhuheng Lu, Ting Wu, Yuewei Dai, Weiqing Li, Zhiyong Su

    Abstract: Two forms of imbalances are commonly observed in point cloud semantic segmentation datasets: (1) category imbalances, where certain objects are more prevalent than others; and (2) size imbalances, where certain objects occupy more points than others. Because of this, the majority of categories and large objects are favored in the existing evaluation metrics. This paper suggests fine-grained mIoU a…

    Submitted 30 July, 2024; originally announced July 2024.

    Comments: PRCV 2024

  10. arXiv:2407.20189  [pdf, other]

    cs.IR cs.CL

    Aligning Query Representation with Rewritten Query and Relevance Judgments in Conversational Search

    Authors: Fengran Mo, Chen Qu, Kelong Mao, Yihong Wu, Zhan Su, Kaiyu Huang, Jian-Yun Nie

    Abstract: Conversational search supports multi-turn user-system interactions to solve complex information needs. Different from the traditional single-turn ad-hoc search, conversational search encounters a more challenging problem of context-dependent query understanding with the lengthy and long-tail conversational history context. While conversational query rewriting methods leverage explicit rewritten qu…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: Accepted by CIKM 2024

  11. arXiv:2407.17942  [pdf, other]

    cs.RO cs.IT

    A Novel Perception Entropy Metric for Optimizing Vehicle Perception with LiDAR Deployment

    Authors: Yongjiang He, Peng Cao, Zhongling Su, Xiaobo Liu

    Abstract: Developing an effective evaluation metric is crucial for accurately and swiftly measuring LiDAR perception performance. One major issue is the lack of metrics that can simultaneously generate fast and accurate evaluations based on either object detection or point cloud data. In this study, we propose a novel LiDAR perception entropy metric based on the probability of vehicle grid occupancy. This m…

    Submitted 25 July, 2024; originally announced July 2024.

  12. arXiv:2407.15070  [pdf, other]

    cs.CV

    3D Gaussian Parametric Head Model

    Authors: Yuelang Xu, Lizhen Wang, Zerong Zheng, Zhaoqi Su, Yebin Liu

    Abstract: Creating high-fidelity 3D human head avatars is crucial for applications in VR/AR, telepresence, digital human interfaces, and film production. Recent advances have leveraged morphable face models to generate animated head avatars from easily accessible data, representing varying identities and expressions within a low-dimensional parametric space. However, existing methods often struggle with mod…

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: project page: https://yuelangx.github.io/gphm/

  13. arXiv:2407.14006  [pdf, other]

    eess.AS cs.SD

    MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis

    Authors: Qian Yang, Jialong Zuo, Zhe Su, Ziyue Jiang, Mingze Li, Zhou Zhao, Feiyang Chen, Zhefeng Wang, Baoxing Huai

    Abstract: We introduce an open source high-quality Mandarin TTS dataset MSceneSpeech (Multiple Scene Speech Dataset), which is intended to provide resources for expressive speech synthesis. MSceneSpeech comprises numerous audio recordings and texts performed and recorded according to daily life scenarios. Each scenario includes multiple speakers and a diverse range of prosodic styles, making it suitable for…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: Accepted by INTERSPEECH 2024

  14. arXiv:2407.13161  [pdf, other]

    physics.soc-ph cs.CY physics.ed-ph

    How to quantify an examination? Evidence from physics examinations via complex networks

    Authors: Min Xia, Zhu Su, Weibing Deng, Xiumei Feng, Benwei Zhang

    Abstract: Given the untapped potential for continuous improvement of examinations, quantitative investigations of examinations could guide efforts to considerably improve learning efficiency and evaluation and thus greatly help both learners and educators. However, there is a general lack of quantitative methods for investigating examinations. To address this gap, we propose a new metric via complex network…

    Submitted 18 July, 2024; originally announced July 2024.

  15. arXiv:2407.13076  [pdf, other]

    cs.MA cs.NI eess.SP

    Matching-Driven Deep Reinforcement Learning for Energy-Efficient Transmission Parameter Allocation in Multi-Gateway LoRa Networks

    Authors: Ziqi Lin, Xu Zhang, Shimin Gong, Lanhua Li, Zhou Su, Bo Gu

    Abstract: Long-range (LoRa) communication technology, distinguished by its low power consumption and long communication range, is widely used in the Internet of Things. Nevertheless, the LoRa MAC layer adopts pure ALOHA for medium access control, which may suffer from severe packet collisions as the network scale expands, consequently reducing the system energy efficiency (EE). To address this issue, it is…

    Submitted 17 July, 2024; originally announced July 2024.

  16. arXiv:2407.10285  [pdf, other]

    cs.CV

    Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models

    Authors: Qinyu Yang, Haoxin Chen, Yong Zhang, Menghan Xia, Xiaodong Cun, Zhixun Su, Ying Shan

    Abstract: In order to improve the quality of synthesized videos, currently, one predominant method involves retraining an expert diffusion model and then implementing a noising-denoising process for refinement. Despite the significant training costs, maintaining consistency of content between the original and enhanced videos remains a major challenge. To tackle this challenge, we propose a novel formulation…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: ECCV 2024, Project Page: https://yangqy1110.github.io/NC-SDEdit/, Code Repo: https://github.com/yangqy1110/NC-SDEdit/

    ACM Class: I.2; I.4.3

  17. arXiv:2407.09816  [pdf, other]

    cs.CL

    MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts

    Authors: Zhenpeng Su, Zijia Lin, Xue Bai, Xing Wu, Yizhe Xiong, Haoran Lian, Guangyuan Ma, Hui Chen, Guiguang Ding, Wei Zhou, Songlin Hu

    Abstract: Scaling the size of a model enhances its capabilities but significantly increases computation complexity. Mixture-of-Experts models (MoE) address the issue by allowing model size to scale up without substantially increasing training or inference costs. In MoE, there is an important module called the router, which is used to distribute each token to the experts. Currently, the mainstream routing me…

    Submitted 29 August, 2024; v1 submitted 13 July, 2024; originally announced July 2024.

    Comments: Work in progress

  18. arXiv:2407.03804  [pdf, other]

    cs.LG cs.NI

    Multi-Time Scale Service Caching and Pricing in MEC Systems with Dynamic Program Popularity

    Authors: Yiming Chen, Xingyuan Hu, Bo Gu, Shimin Gong, Zhou Su

    Abstract: In mobile edge computing systems, base stations (BSs) equipped with edge servers can provide computing services to users to reduce their task execution time. However, there is always a conflict of interest between the BS and users. The BS prices the service programs based on user demand to maximize its own profit, while the users determine their offloading strategies based on the prices to minimiz…

    Submitted 4 July, 2024; originally announced July 2024.

  19. arXiv:2407.00769  [pdf, other]

    quant-ph cs.DC

    Achieving Energetic Superiority Through System-Level Quantum Circuit Simulation

    Authors: Rong Fu, Zhongling Su, Han-Sen Zhong, Xiti Zhao, Jianyang Zhang, Feng Pan, Pan Zhang, Xianhe Zhao, Ming-Cheng Chen, Chao-Yang Lu, Jian-Wei Pan, Zhiling Pei, Xingcheng Zhang, Wanli Ouyang

    Abstract: Quantum Computational Superiority boasts rapid computation and high energy efficiency. Despite recent advances in classical algorithms aimed at refuting the milestone claim of Google's sycamore, challenges remain in generating uncorrelated samples of random quantum circuits. In this paper, we present a groundbreaking large-scale system technology that leverages optimization on global, node, and de…

    Submitted 30 June, 2024; originally announced July 2024.

  20. UWBAD: Towards Effective and Imperceptible Jamming Attacks Against UWB Ranging Systems with COTS Chips

    Authors: Yuqiao Yang, Zhongjie Wu, Yongzhao Zhang, Ting Chen, Jun Li, Jie Yang, Wenhao Liu, Xiaosong Zhang, Ruicong Shi, Jingwei Li, Yu Jiang, Zhuo Su

    Abstract: UWB ranging systems have been adopted in many critical and security sensitive applications due to its precise positioning and secure ranging capabilities. We present a practical jamming attack, namely UWBAD, against commercial UWB ranging systems, which exploits the vulnerability of the adoption of the normalized cross-correlation process in UWB ranging and can selectively and quickly block rangin…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security

  21. arXiv:2406.17309  [pdf, other]

    cs.CV

    Zero-Shot Long-Form Video Understanding through Screenplay

    Authors: Yongliang Wu, Bozheng Li, Jiawang Cao, Wenbo Zhu, Yi Lu, Weiheng Chi, Chuyun Xie, Haolin Zheng, Ziyue Su, Jay Wu, Xu Yang

    Abstract: The Long-form Video Question-Answering task requires the comprehension and analysis of extended video content to respond accurately to questions by utilizing both temporal and contextual information. In this paper, we present MM-Screenplayer, an advanced video understanding system with multi-modal perception capabilities that can convert any video into textual screenplay representations. Unlike pr…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Highest Score Award to the CVPR'2024 LOVEU Track 1 Challenge

  22. arXiv:2406.14192  [pdf, other]

    cs.CL cs.AI

    Timo: Towards Better Temporal Reasoning for Language Models

    Authors: Zhaochen Su, Jun Zhang, Tong Zhu, Xiaoye Qu, Juntao Li, Min Zhang, Yu Cheng

    Abstract: Reasoning about time is essential for Large Language Models (LLMs) to understand the world. Previous works focus on solving specific tasks, primarily on time-sensitive question answering. While these methods have proven effective, they cannot generalize to a wider spectrum of temporal reasoning tasks. Therefore, we propose a crucial question: Can we build a universal framework to handle a variety…

    Submitted 18 August, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted to the COLM 2024 conference

  23. arXiv:2406.13607  [pdf, other]

    cs.CV

    Ultra-High-Definition Restoration: New Benchmarks and A Dual Interaction Prior-Driven Solution

    Authors: Liyan Wang, Cong Wang, Jinshan Pan, Weixiang Zhou, Xiaoran Sun, Wei Wang, Zhixun Su

    Abstract: Ultra-High-Definition (UHD) image restoration has acquired remarkable attention due to its practical demand. In this paper, we construct UHD snow and rain benchmarks, named UHD-Snow and UHD-Rain, to remedy the deficiency in this field. The UHD-Snow/UHD-Rain is established by simulating the physics process of rain/snow into consideration and each benchmark contains 3200 degraded/clear image pairs o…

    Submitted 22 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  24. arXiv:2406.12459  [pdf, other]

    cs.CV

    HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors

    Authors: Panwang Pan, Zhuo Su, Chenguo Lin, Zhen Fan, Yongjie Zhang, Zeming Li, Tingting Shen, Yadong Mu, Yebin Liu

    Abstract: Despite recent advancements in high-fidelity human reconstruction techniques, the requirements for densely captured images or time-consuming per-instance optimization significantly hinder their applications in broader scenarios. To tackle these issues, we present HumanSplat which predicts the 3D Gaussian Splatting properties of any human from a single input image in a generalizable manner. In part…

    Submitted 18 June, 2024; originally announced June 2024.

  25. arXiv:2406.10375  [pdf, other]

    cs.SE

    Mokav: Execution-driven Differential Testing with LLMs

    Authors: Khashayar Etemadi, Bardia Mohammadi, Zhendong Su, Martin Monperrus

    Abstract: It is essential to detect functional differences in various software engineering tasks, such as automated program repair, mutation testing, and code refactoring. The problem of detecting functional differences between two programs can be reduced to searching for a difference exposing test (DET): a test input that results in different outputs on the subject programs. In this paper, we propose Mokav…

    Submitted 14 June, 2024; originally announced June 2024.

  26. arXiv:2406.09072  [pdf, other]

    cs.CL

    Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?

    Authors: Zhaochen Su, Juntao Li, Jun Zhang, Tong Zhu, Xiaoye Qu, Pan Zhou, Yan Bowen, Yu Cheng, Min Zhang

    Abstract: Temporal reasoning is fundamental for large language models (LLMs) to comprehend the world. Current temporal reasoning datasets are limited to questions about single or isolated events, falling short in mirroring the realistic temporal characteristics involving concurrent nature and intricate temporal interconnections. In this paper, we introduce CoTempQA, a comprehensive co-temporal Question Answ…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted to the ACL 2024 main conference

  27. Practical, Automated Scenario-based Mobile App Testing

    Authors: Shengcheng Yu, Chunrong Fang, Mingzhe Du, Zimin Ding, Zhenyu Chen, Zhendong Su

    Abstract: The importance of mobile application (app) quality insurance is increasing with the rapid development of the mobile Internet. Automated test generation approaches, as a dominant direction of app quality insurance, follow specific models or strategies, targeting at optimizing the code coverage. Such approaches lead to a huge gap between testing execution and app business logic. Test scripts develop…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE Transactions on Software Engineering in 2024

  28. arXiv:2406.04778  [pdf, other]

    cs.PL cs.SE

    Compilation Quotient (CQ): A Metric for the Compilation Hardness of Programming Languages

    Authors: Vince Szabo, Dominik Winterer, Zhendong Su

    Abstract: Today's programmers can choose from an exceptional range of programming languages, each with its own traits, purpose, and complexity. A key aspect of a language's complexity is how hard it is to compile programs in the language. While most programmers have an intuition about compilation hardness for different programming languages, no metric exists to quantify it. We introduce the compilation quot…

    Submitted 7 June, 2024; originally announced June 2024.

  29. arXiv:2405.19665  [pdf]

    eess.SY cs.AI cs.LG

    A novel fault localization with data refinement for hydroelectric units

    Authors: Jialong Huang, Junlin Song, Penglong Lian, Mengjie Gan, Zhiheng Su, Benhao Wang, Wenji Zhu, Xiaomin Pu, Jianxiao Zou, Shicai Fan

    Abstract: Due to the scarcity of fault samples and the complexity of non-linear and non-smooth characteristics data in hydroelectric units, most of the traditional hydroelectric unit fault localization methods are difficult to carry out accurate localization. To address these problems, a sparse autoencoder (SAE)-generative adversarial network (GAN)-wavelet noise reduction (WNR)- manifold-boosted deep learni…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 6 pages, 4 figures, Conference on Decision and Control (CDC)

  30. arXiv:2405.19642  [pdf]

    cs.AI

    Few-shot fault diagnosis based on multi-scale graph convolution filtering for industry

    Authors: Mengjie Gan, Penglong Lian, Zhiheng Su, Jiyang Zhang, Jialong Huang, Benhao Wang, Jianxiao Zou, Shicai Fan

    Abstract: Industrial equipment fault diagnosis often encounter challenges such as the scarcity of fault data, complex operating conditions, and varied types of failures. Signal analysis, data statistical learning, and conventional deep learning techniques face constraints under these conditions due to their substantial data requirements and the necessity for transfer learning to accommodate new failure mode…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 6 pages, 2 figures, 2 tables, 63rd IEEE Conference on Decision and Control

  31. arXiv:2405.19581  [pdf, other]

    cs.SE cs.AI

    Source Code Foundation Models are Transferable Binary Analysis Knowledge Bases

    Authors: Zian Su, Xiangzhe Xu, Ziyang Huang, Kaiyuan Zhang, Xiangyu Zhang

    Abstract: Human-Oriented Binary Reverse Engineering (HOBRE) lies at the intersection of binary and source code, aiming to lift binary code to human-readable content relevant to source code, thereby bridging the binary-source semantic gap. Recent advancements in uni-modal code model pre-training, particularly in generative Source Code Foundation Models (SCFMs) and binary understanding models, have laid the g…

    Submitted 29 May, 2024; originally announced May 2024.

  32. arXiv:2405.18725  [pdf, other]

    cs.LG cs.MA

    Can We Enhance the Quality of Mobile Crowdsensing Data Without Ground Truth?

    Authors: Jiajie Li, Bo Gu, Shimin Gong, Zhou Su, Mohsen Guizani

    Abstract: Mobile crowdsensing (MCS) has emerged as a prominent trend across various domains. However, ensuring the quality of the sensing data submitted by mobile users (MUs) remains a complex and challenging problem. To address this challenge, an advanced method is required to detect low-quality sensing data and identify malicious MUs that may disrupt the normal operations of an MCS system. Therefore, this…

    Submitted 28 May, 2024; originally announced May 2024.

  33. arXiv:2405.16671  [pdf, other]

    cs.LG cs.AI

    Mixture of Experts Using Tensor Products

    Authors: Zhan Su, Fengran Mo, Prayag Tiwari, Benyou Wang, Jian-Yun Nie, Jakob Grue Simonsen

    Abstract: In multi-task learning, the conventional approach involves training a model on multiple tasks simultaneously. However, the training signals from different tasks can interfere with one another, potentially leading to negative transfer. To mitigate this, we investigate if modular language models can facilitate positive transfer and systematic generalization. Specifically, we propose a novel…

    Submitted 26 May, 2024; originally announced May 2024.

  34. arXiv:2405.14278  [pdf, other]

    cs.CV

    SCMix: Stochastic Compound Mixing for Open Compound Domain Adaptation in Semantic Segmentation

    Authors: Kai Yao, Zhaorui Tan, Zixian Su, Xi Yang, Jie Sun, Kaizhu Huang

    Abstract: Open compound domain adaptation (OCDA) aims to transfer knowledge from a labeled source domain to a mix of unlabeled homogeneous compound target domains while generalizing to open unseen domains. Existing OCDA methods solve the intra-domain gaps by a divide-and-conquer strategy, which divides the problem into several individual and parallel domain adaptation (DA) tasks. Such approaches often conta…

    Submitted 23 May, 2024; originally announced May 2024.

  35. arXiv:2405.14036  [pdf, other]

    cs.CR

    Remote Keylogging Attacks in Multi-user VR Applications

    Authors: Zihao Su, Kunlin Cai, Reuben Beeler, Lukas Dresel, Allan Garcia, Ilya Grishchenko, Yuan Tian, Christopher Kruegel, Giovanni Vigna

    Abstract: As Virtual Reality (VR) applications grow in popularity, they have bridged distances and brought users closer together. However, with this growth, there have been increasing concerns about security and privacy, especially related to the motion data used to create immersive experiences. In this study, we highlight a significant security threat in multi-user VR applications, which are applications t…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: Accepted for Usenix 2024

  36. arXiv:2405.13976  [pdf, ps, other]

    cs.NE cs.LG

    EchoSpike Predictive Plasticity: An Online Local Learning Rule for Spiking Neural Networks

    Authors: Lars Graf, Zhe Su, Giacomo Indiveri

    Abstract: The drive to develop artificial neural networks that efficiently utilize resources has generated significant interest in bio-inspired Spiking Neural Networks (SNNs). These networks are particularly attractive due to their potential in applications requiring low power and memory. This potential is further enhanced by the ability to perform online local learning, enabling them to adapt to dynamic en…

    Submitted 26 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: 11 pages, 6 figures, submitted to IEEE

  37. arXiv:2405.11157  [pdf, other]

    cs.LG cs.CL

    Towards Modular LLMs by Building and Reusing a Library of LoRAs

    Authors: Oleksiy Ostapenko, Zhan Su, Edoardo Maria Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Caccia, Alessandro Sordoni

    Abstract: The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks. We study how to best build a library of adapters given multi-task data and devise techniques for both zero-shot and supervised task generalization through routing in such library. We benchmark existing approac…

    Submitted 17 May, 2024; originally announced May 2024.

  38. arXiv:2405.07319  [pdf, other]

    cs.CV

    LayGA: Layered Gaussian Avatars for Animatable Clothing Transfer

    Authors: Siyou Lin, Zhe Li, Zhaoqi Su, Zerong Zheng, Hongwen Zhang, Yebin Liu

    Abstract: Animatable clothing transfer, aiming at dressing and animating garments across characters, is a challenging problem. Most human avatar works entangle the representations of the human body and clothing together, which leads to difficulties for virtual try-on across identities. What's worse, the entangled representations usually fail to exactly track the sliding motion of garments. To overcome these…

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: SIGGRAPH 2024 conference track

  39. arXiv:2405.04652  [pdf, ps, other]

    cs.HC

    AffirmativeAI: Towards LGBTQ+ Friendly Audit Frameworks for Large Language Models

    Authors: Yinru Long, Zilin Ma, Yiyang Mei, Zhaoyuan Su

    Abstract: LGBTQ+ community face disproportionate mental health challenges, including higher rates of depression, anxiety, and suicidal ideation. Research has shown that LGBTQ+ people have been using large language model-based chatbots, such as ChatGPT, for their mental health needs. Despite the potential for immediate support and anonymity these chatbots offer, concerns regarding their capacity to provide e…

    Submitted 7 May, 2024; originally announced May 2024.

  40. arXiv:2405.04590  [pdf, other]

    cs.CL cs.IR

    Language Modeling Using Tensor Trains

    Authors: Zhan Su, Yuqin Zhou, Fengran Mo, Jakob Grue Simonsen

    Abstract: We propose a novel tensor network language model based on the simplest tensor network (i.e., tensor trains), called 'Tensor Train Language Model' (TTLM). TTLM represents sentences in an exponential space constructed by the tensor product of words, but computing the probabilities of sentences in a low-dimensional fashion. We demonstrate that the architectures of Second-order RNNs, Recurrent Arithme…

    Submitted 7 May, 2024; originally announced May 2024.

  41. arXiv:2405.02935  [pdf, other]

    cs.CL

    Enabling Patient-side Disease Prediction via the Integration of Patient Narratives

    Authors: Zhixiang Su, Yinan Zhang, Jiazheng Jing, Jie Xiao, Zhiqi Shen

    Abstract: Disease prediction holds considerable significance in modern healthcare, because of its crucial role in facilitating early intervention and implementing effective prevention measures. However, most recent disease prediction approaches heavily rely on laboratory test outcomes (e.g., blood tests and medical imaging from X-rays). Gaining access to such data for precise disease prediction is often a c…

    Submitted 5 May, 2024; originally announced May 2024.

  42. arXiv:2405.01221  [pdf, other]

    cs.NI

    A Survey on Semantic Communication Networks: Architecture, Security, and Privacy

    Authors: Shaolong Guo, Yuntao Wang, Ning Zhang, Zhou Su, Tom H. Luan, Zhiyi Tian, Xuemin Shen

    Abstract: Semantic communication, emerging as a breakthrough beyond the classical Shannon paradigm, aims to convey the essential meaning of source data rather than merely focusing on precise yet content-agnostic bit transmission. By interconnecting diverse intelligent agents (e.g., autonomous vehicles and VR devices) via semantic communications, the semantic communication networks (SemComNet) supports seman…

    Submitted 2 May, 2024; originally announced May 2024.

  43. arXiv:2404.17808  [pdf, other]

    cs.CL

    Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal

    Authors: Haoran Lian, Yizhe Xiong, Jianwei Niu, Shasha Mo, Zhenpeng Su, Zijia Lin, Peng Liu, Hui Chen, Guiguang Ding

    Abstract: Byte Pair Encoding (BPE) serves as a foundation method for text tokenization in the Natural Language Processing (NLP) field. Despite its wide adoption, the original BPE algorithm harbors an inherent flaw: it inadvertently introduces a frequency imbalance for tokens in the text corpus. Since BPE iteratively merges the most frequent token pair in the text corpus while keeping all tokens that have be…

    Submitted 27 April, 2024; originally announced April 2024.

  44. arXiv:2404.17785  [pdf, other]

    cs.CL

    Temporal Scaling Law for Large Language Models

    Authors: Yizhe Xiong, Xiansheng Chen, Xin Ye, Hui Chen, Zijia Lin, Haoran Lian, Zhenpeng Su, Jianwei Niu, Guiguang Ding

    Abstract: Recently, Large Language Models (LLMs) have been widely adopted in a wide range of tasks, leading to increasing attention towards the research on how scaling LLMs affects their performance. Existing works, termed Scaling Laws, have discovered that the final test loss of LLMs scales as power-laws with model size, computational budget, and dataset size. However, the temporal change of the test loss…

    Submitted 16 June, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

    Comments: 8 pages, 3 figures; Under review

  45. arXiv:2403.17320  [pdf, other]

    cs.RO

    Leveraging Symmetry in RL-based Legged Locomotion Control

    Authors: Zhi Su, Xiaoyu Huang, Daniel Ordoñez-Apraez, Yunfei Li, Zhongyu Li, Qiayuan Liao, Giulio Turrisi, Massimiliano Pontil, Claudio Semini, Yi Wu, Koushil Sreenath

    Abstract: Model-free reinforcement learning is a promising approach for autonomously solving challenging robotics control problems, but faces exploration difficulty without information of the robot's kinematics and dynamics morphology. The under-exploration of multiple modalities with symmetric states leads to behaviors that are often unnatural and sub-optimal. This issue becomes particularly pronounced in…

    Submitted 26 March, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  46. arXiv:2403.06401  [pdf, other]

    cs.CV

    Refining Segmentation On-the-Fly: An Interactive Framework for Point Cloud Semantic Segmentation

    Authors: Peng Zhang, Ting Wu, Jinsheng Sun, Weiqing Li, Zhiyong Su

    Abstract: Existing interactive point cloud segmentation approaches primarily focus on the object segmentation, which aim to determine which points belong to the object of interest guided by user interactions. This paper concentrates on an unexplored yet meaningful task, i.e., interactive point cloud semantic segmentation, which assigns high-quality semantic labels to all points in a scene with user correcti…

    Submitted 10 March, 2024; originally announced March 2024.

  47. arXiv:2403.06070  [pdf, other]

    cs.CV cs.HC

    Reframe Anything: LLM Agent for Open World Video Reframing

    Authors: Jiawang Cao, Yongliang Wu, Weiheng Chi, Wenbo Zhu, Ziyue Su, Jay Wu

    Abstract: The proliferation of mobile devices and social media has revolutionized content dissemination, with short-form video becoming increasingly prevalent. This shift has introduced the challenge of video reframing to fit various screen aspect ratios, a process that highlights the most compelling parts of a video. Traditionally, video reframing is a manual, time-consuming task requiring professional exp…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: 14 pages, 6 figures

  48. arXiv:2403.05020  [pdf, other]

    cs.CL cs.AI

    Is this the real life? Is this just fantasy? The Misleading Success of Simulating Social Interactions With LLMs

    Authors: Xuhui Zhou, Zhe Su, Tiwalayo Eisape, Hyunwoo Kim, Maarten Sap

    Abstract: Recent advances in large language models (LLM) have enabled richer social simulations, allowing for the study of various social phenomena. However, most recent work has used a more omniscient perspective on these simulations (e.g., single LLM to generate all interlocutors), which is fundamentally at odds with the non-omniscient, information asymmetric interactions that involve humans and AI agents…

    Submitted 18 April, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  49. arXiv:2403.03561  [pdf, ps, other]

    cs.CV

    HMD-Poser: On-Device Real-time Human Motion Tracking from Scalable Sparse Observations

    Authors: Peng Dai, Yang Zhang, Tao Liu, Zhen Fan, Tianyuan Du, Zhuo Su, Xiaozheng Zheng, Zeming Li

    Abstract: It is especially challenging to achieve real-time human motion tracking on a standalone VR Head-Mounted Display (HMD) such as Meta Quest and PICO. In this paper, we propose HMD-Poser, the first unified approach to recover full-body motions using scalable sparse observations from HMD and body-worn IMUs. In particular, it can support a variety of input scenarios, such as HMD, HMD+2IMUs, HMD+3IMUs, e…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: CVPR2024 Accepted

  50. arXiv:2403.01966  [pdf, other]

    cs.CV

    Enhancing Information Maximization with Distance-Aware Contrastive Learning for Source-Free Cross-Domain Few-Shot Learning

    Authors: Huali Xu, Li Liu, Shuaifeng Zhi, Shaojing Fu, Zhuo Su, Ming-Ming Cheng, Yongxiang Liu

    Abstract: Existing Cross-Domain Few-Shot Learning (CDFSL) methods require access to source domain data to train a model in the pre-training phase. However, due to increasing concerns about data privacy and the desire to reduce data transmission and training costs, it is necessary to develop a CDFSL solution without accessing source data. For this reason, this paper explores a Source-Free CDFSL (SF-CDFSL) pr…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted by TIP, 16 pages, 11 figures, 8 tables