Showing 1–50 of 638 results for author: Liu, G

Searching in archive cs.

  1. arXiv:2408.15998  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

    Authors: Min Shi, Fuxiao Liu, Shihao Wang, Shijia Liao, Subhashree Radhakrishnan, De-An Huang, Hongxu Yin, Karan Sapra, Yaser Yacoob, Humphrey Shi, Bryan Catanzaro, Andrew Tao, Jan Kautz, Zhiding Yu, Guilin Liu

    Abstract: The ability to accurately interpret complex visual information is a crucial topic of multimodal large language models (MLLMs). Recent work indicates that enhanced visual perception significantly reduces hallucinations and improves performance on resolution-sensitive tasks, such as optical character recognition and document analysis. A number of recent MLLMs achieve this goal using a mixture of vis…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: Github: https://github.com/NVlabs/Eagle, HuggingFace: https://huggingface.co/NVEagle

  2. arXiv:2408.15696  [pdf, other]

    cs.CY

    Comparing diversity, negativity, and stereotypes in Chinese-language AI technologies: a case study on Baidu, Ernie and Qwen

    Authors: Geng Liu, Carlo Alberto Bono, Francesco Pierri

    Abstract: Large Language Models (LLMs) and search engines have the potential to perpetuate biases and stereotypes by amplifying existing prejudices in their training data and algorithmic processes, thereby influencing public perception and decision-making. While most work has focused on Western-centric AI technologies, we study Chinese-based tools by investigating social biases embedded in the major Chinese…

    Submitted 28 August, 2024; originally announced August 2024.

  3. arXiv:2408.15538  [pdf, other]

    cs.AI cs.MA

    TrafficGamer: Reliable and Flexible Traffic Simulation for Safety-Critical Scenarios with Game-Theoretic Oracles

    Authors: Guanren Qiao, Guorui Quan, Jiawei Yu, Shujun Jia, Guiliang Liu

    Abstract: While modern Autonomous Vehicle (AV) systems can develop reliable driving policies under regular traffic conditions, they frequently struggle with safety-critical traffic scenarios. This difficulty primarily arises from the rarity of such scenarios in driving datasets and the complexities associated with predictive modeling among multiple vehicles. To support the testing and refinement of AV polic…

    Submitted 28 August, 2024; originally announced August 2024.

  4. arXiv:2408.14023  [pdf, other]

    cs.CV cs.AI

    Video-CCAM: Enhancing Video-Language Understanding with Causal Cross-Attention Masks for Short and Long Videos

    Authors: Jiajun Fei, Dian Li, Zhidong Deng, Zekun Wang, Gang Liu, Hui Wang

    Abstract: Multi-modal large language models (MLLMs) have demonstrated considerable potential across various downstream tasks that require cross-domain knowledge. MLLMs capable of processing videos, known as Video-MLLMs, have attracted broad interest in video-language understanding. However, videos, especially long videos, contain more visual tokens than images, making them difficult for LLMs to process. Exi…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: 10 pages, 5 figures

  5. arXiv:2408.13727  [pdf, other]

    cs.SE cs.AI

    LogParser-LLM: Advancing Efficient Log Parsing with Large Language Models

    Authors: Aoxiao Zhong, Dengyao Mo, Guiyang Liu, Jinbu Liu, Qingda Lu, Qi Zhou, Jiesheng Wu, Quanzheng Li, Qingsong Wen

    Abstract: Logs are ubiquitous digital footprints, playing an indispensable role in system diagnostics, security analysis, and performance optimization. The extraction of actionable insights from logs is critically dependent on the log parsing process, which converts raw logs into structured formats for downstream analysis. Yet, the complexities of contemporary systems and the dynamic nature of logs pose sig…

    Submitted 25 August, 2024; originally announced August 2024.

    Comments: Accepted by ACM KDD 2024

  6. arXiv:2408.12161  [pdf, other]

    cs.CV

    Rebalancing Multi-Label Class-Incremental Learning

    Authors: Kaile Du, Yifan Zhou, Fan Lyu, Yuyang Li, Junzhou Xie, Yixi Shen, Fuyuan Hu, Guangcan Liu

    Abstract: Multi-label class-incremental learning (MLCIL) is essential for real-world multi-label applications, allowing models to learn new labels while retaining previously learned knowledge continuously. However, recent MLCIL approaches can only achieve suboptimal performance due to the oversight of the positive-negative imbalance problem, which manifests at both the label and loss levels because of the t…

    Submitted 22 August, 2024; originally announced August 2024.

  7. arXiv:2408.12152  [pdf, other]

    cs.IR

    Behavior Pattern Mining-based Multi-Behavior Recommendation

    Authors: Haojie Li, Zhiyong Cheng, Xu Yu, Jinhuan Liu, Guanfeng Liu, Junwei Du

    Abstract: Multi-behavior recommendation systems enhance effectiveness by leveraging auxiliary behaviors (such as page views and favorites) to address the limitations of traditional models that depend solely on sparse target behaviors like purchases. Existing approaches to multi-behavior recommendations typically follow one of two strategies: some derive initial node representations from individual behavior…

    Submitted 22 August, 2024; originally announced August 2024.

  8. arXiv:2408.12124  [pdf, other]

    cs.LG cs.HC eess.SP

    Recording Brain Activity While Listening to Music Using Wearable EEG Devices Combined with Bidirectional Long Short-Term Memory Networks

    Authors: Jingyi Wang, Zhiqun Wang, Guiran Liu

    Abstract: Electroencephalography (EEG) signals are crucial for investigating brain function and cognitive processes. This study aims to address the challenges of efficiently recording and analyzing high-dimensional EEG signals while listening to music to recognize emotional states. We propose a method combining Bidirectional Long Short-Term Memory (Bi-LSTM) networks with attention mechanisms for EEG signal…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: 15 pages

  9. arXiv:2408.11302  [pdf, other]

    cs.LG cs.CY

    Modeling Reference-dependent Choices with Graph Neural Networks

    Authors: Liang Zhang, Guannan Liu, Junjie Wu, Yong Tan

    Abstract: While the classic Prospect Theory has highlighted the reference-dependent and comparative nature of consumers' product evaluation processes, few models have successfully integrated this theoretical hypothesis into data-driven preference quantification, particularly in the realm of recommender systems development. To bridge this gap, we propose a new research problem of modeling reference-dependent…

    Submitted 20 August, 2024; originally announced August 2024.

  10. arXiv:2408.09878  [pdf, other]

    cs.CR

    Transferring Backdoors between Large Language Models by Knowledge Distillation

    Authors: Pengzhou Cheng, Zongru Wu, Tianjie Ju, Wei Du, Zhuosheng Zhang, Gongshen Liu

    Abstract: Backdoor Attacks have been a serious vulnerability against Large Language Models (LLMs). However, previous methods only reveal such risk in specific models, or present tasks transferability after attacking the pre-trained phase. So, how risky is the model transferability of a backdoor attack? In this paper, we focus on whether existing mini-LLMs may be unconsciously instructed in backdoor knowledg…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 13 pages, 16 figures, 5 tables

  11. arXiv:2408.08930  [pdf, other]

    cs.CR cs.AI cs.CL

    DePrompt: Desensitization and Evaluation of Personal Identifiable Information in Large Language Model Prompts

    Authors: Xiongtao Sun, Gan Liu, Zhipeng He, Hui Li, Xiaoguang Li

    Abstract: Prompt serves as a crucial link in interacting with large language models (LLMs), widely impacting the accuracy and interpretability of model outputs. However, acquiring accurate and high-quality responses necessitates precise prompts, which inevitably pose significant risks of personal identifiable information (PII) leakage. Therefore, this paper proposes DePrompt, a desensitization protection an…

    Submitted 15 August, 2024; originally announced August 2024.

  12. arXiv:2408.08661  [pdf, other]

    cs.CL cs.CR cs.LG

    MIA-Tuner: Adapting Large Language Models as Pre-training Text Detector

    Authors: Wenjie Fu, Huandong Wang, Chen Gao, Guanghua Liu, Yong Li, Tao Jiang

    Abstract: The increasing parameters and expansive dataset of large language models (LLMs) highlight the urgent demand for a technical solution to audit the underlying privacy risks and copyright issues associated with LLMs. Existing studies have partially addressed this need through an exploration of the pre-training data detection problem, which is an instance of a membership inference attack (MIA). This p…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: code and dataset: https://github.com/wjfu99/MIA-Tuner

  13. arXiv:2408.07410  [pdf, other]

    cs.CL

    Aquila2 Technical Report

    Authors: Bo-Wen Zhang, Liangdong Wang, Jijie Li, Shuhao Gu, Xinya Wu, Zhengduo Zhang, Boyan Gao, Yulong Ao, Guang Liu

    Abstract: This paper introduces the Aquila2 series, which comprises a wide range of bilingual models with parameter sizes of 7, 34, and 70 billion. These models are trained based on an innovative framework named HeuriMentor (HM), which offers real-time insights into model convergence and enhances the training process and data management. The HM System, comprising the Adaptive Training Engine (ATE), Training…

    Submitted 14 August, 2024; originally announced August 2024.

  14. InfinityMATH: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning

    Authors: Bo-Wen Zhang, Yan Yan, Lin Li, Guang Liu

    Abstract: Recent advancements in Chain-of-Thoughts (CoT) and Program-of-Thoughts (PoT) methods have greatly enhanced language models' mathematical reasoning capabilities, facilitating their integration into instruction tuning datasets with LLMs. However, existing methods for large-scale dataset creation require substantial seed data and high computational costs for data synthesis, posing significant challen…

    Submitted 9 August, 2024; originally announced August 2024.

    Comments: Accepted by CIKM 2024

    ACM Class: I.2.7

  15. arXiv:2408.06567  [pdf, other]

    cs.CL cs.AI

    AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies

    Authors: Bo-Wen Zhang, Liangdong Wang, Ye Yuan, Jijie Li, Shuhao Gu, Mengdi Zhao, Xinya Wu, Guang Liu, Chengwei Wu, Hanyu Zhao, Li Du, Yiming Ju, Quanyue Ma, Yulong Ao, Yingli Zhao, Songhe Zhu, Zhou Cao, Dong Liang, Yonghua Lin, Ming Zhang, Shunfei Wang, Yanxin Zhou, Min Ye, Xuekai Chen, Xinyang Yu, et al. (2 additional authors not shown)

    Abstract: In recent years, with the rapid application of large language models across various fields, the scale of these models has gradually increased, and the resources required for their pre-training have grown exponentially. Training an LLM from scratch will cost a lot of computation resources while scaling up from a smaller model is a more efficient approach and has thus attracted significant attention…

    Submitted 12 August, 2024; originally announced August 2024.

  16. arXiv:2408.06360  [pdf, other]

    cs.IR cs.CV

    Modality-Balanced Learning for Multimedia Recommendation

    Authors: Jinghao Zhang, Guofan Liu, Qiang Liu, Shu Wu, Liang Wang

    Abstract: Many recommender models have been proposed to investigate how to incorporate multimodal content information into traditional collaborative filtering framework effectively. The use of multimodal information is expected to provide more comprehensive information and lead to superior performance. However, the integration of multiple modalities often encounters the modal imbalance problem: since the in…

    Submitted 26 July, 2024; originally announced August 2024.

    Comments: ACM Multimedia 2024 (Oral)

  17. arXiv:2408.05804  [pdf, other]

    cs.LG cs.AI

    A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals

    Authors: Grace Liu, Michael Tang, Benjamin Eysenbach

    Abstract: In this paper, we present empirical evidence of skills and directed exploration emerging from a simple RL algorithm long before any successful trials are observed. For example, in a manipulation task, the agent is given a single observation of the goal state and learns skills, first for moving its end-effector, then for pushing the block, and finally for picking up and placing the block. These ski…

    Submitted 11 August, 2024; originally announced August 2024.

    Comments: Code and videos: https://graliuce.github.io/sgcrl/

  18. arXiv:2408.01435  [pdf, other]

    cs.CV cs.RO

    A New Clustering-based View Planning Method for Building Inspection with Drone

    Authors: Yongshuai Zheng, Guoliang Liu, Yan Ding, Guohui Tian

    Abstract: With the rapid development of drone technology, the application of drones equipped with visual sensors for building inspection and surveillance has attracted much attention. View planning aims to find a set of near-optimal viewpoints for vision-related tasks to achieve the vision coverage goal. This paper proposes a new clustering-based two-step computational method using spectral clustering, loca…

    Submitted 19 July, 2024; originally announced August 2024.

  19. arXiv:2407.17333  [pdf, other]

    cs.LG

    Global Confidence Degree Based Graph Neural Network for Financial Fraud Detection

    Authors: Jiaxun Liu, Yue Tian, Guanjun Liu

    Abstract: Graph Neural Networks (GNNs) are widely used in financial fraud detection due to their excellent ability on handling graph-structured financial data and modeling multilayer connections by aggregating information of neighbors. However, these GNN-based methods focus on extracting neighbor-level information but neglect a global perspective. This paper presents the concept and calculation formula of G…

    Submitted 18 August, 2024; v1 submitted 24 July, 2024; originally announced July 2024.

  20. arXiv:2407.15286  [pdf, other]

    cs.CL

    Intrinsic Self-correction for Enhanced Morality: An Analysis of Internal Mechanisms and the Superficial Hypothesis

    Authors: Guangliang Liu, Haitao Mao, Jiliang Tang, Kristen Marie Johnson

    Abstract: Large Language Models (LLMs) are capable of producing content that perpetuates stereotypes, discrimination, and toxicity. The recently proposed moral self-correction is a computationally efficient method for reducing harmful content in the responses of LLMs. However, the process of how injecting self-correction instructions can modify the behavior of LLMs remains under-explored. In this paper, we…

    Submitted 12 August, 2024; v1 submitted 21 July, 2024; originally announced July 2024.

  21. arXiv:2407.15233  [pdf, other]

    cs.CV

    CGB-DM: Content and Graphic Balance Layout Generation with Transformer-based Diffusion Model

    Authors: Yu Li, Yifan Chen, Gongye Liu, Jie Wu, Yujiu Yang

    Abstract: Layout generation is the foundation task of intelligent design, which requires the integration of visual aesthetics and harmonious expression of content delivery. However, existing methods still face challenges in generating precise and visually appealing layouts, including blocking, overlap, or spatial misalignment between layouts, which are closely related to the spatial structure of graphic lay…

    Submitted 22 July, 2024; v1 submitted 21 July, 2024; originally announced July 2024.

  22. arXiv:2407.14754  [pdf, other]

    eess.IV cs.CV

    Representing Topological Self-Similarity Using Fractal Feature Maps for Accurate Segmentation of Tubular Structures

    Authors: Jiaxing Huang, Yanfeng Zhou, Yaoru Luo, Guole Liu, Heng Guo, Ge Yang

    Abstract: Accurate segmentation of long and thin tubular structures is required in a wide variety of areas such as biology, medicine, and remote sensing. The complex topology and geometry of such structures often pose significant technical challenges. A fundamental property of such structures is their topological self-similarity, which can be quantified by fractal features such as fractal dimension (FD). In…

    Submitted 20 July, 2024; originally announced July 2024.

  23. arXiv:2407.14054  [pdf, other]

    cs.CV

    PointRegGPT: Boosting 3D Point Cloud Registration using Generative Point-Cloud Pairs for Training

    Authors: Suyi Chen, Hao Xu, Haipeng Li, Kunming Luo, Guanghui Liu, Chi-Wing Fu, Ping Tan, Shuaicheng Liu

    Abstract: Data plays a crucial role in training learning-based methods for 3D point cloud registration. However, the real-world dataset is expensive to build, while rendering-based synthetic data suffers from domain gaps. In this work, we present PointRegGPT, boosting 3D point cloud registration using generative point-cloud pairs for training. Given a single depth map, we first apply a random camera motion…

    Submitted 19 July, 2024; originally announced July 2024.

    Comments: To appear at the European Conference on Computer Vision (ECCV) 2024

    ACM Class: I.3.3; I.4.5

  24. arXiv:2407.13937  [pdf, other]

    cs.CV

    Boosting Online 3D Multi-Object Tracking through Camera-Radar Cross Check

    Authors: Sheng-Yao Kuan, Jen-Hao Cheng, Hsiang-Wei Huang, Wenhao Chai, Cheng-Yen Yang, Hugo Latapie, Gaowen Liu, Bing-Fei Wu, Jenq-Neng Hwang

    Abstract: In the domain of autonomous driving, the integration of multi-modal perception techniques based on data from diverse sensors has demonstrated substantial progress. Effectively surpassing the capabilities of state-of-the-art single-modality detectors through sensor fusion remains an active challenge. This work leverages the respective advantages of cameras in perspective view and radars in Bird's E…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 2024 IEEE Intelligent Vehicles Symposium (IV)

  25. arXiv:2407.13168  [pdf, other]

    cs.AI cs.CL

    SciCode: A Research Coding Benchmark Curated by Scientists

    Authors: Minyang Tian, Luyu Gao, Shizhuo Dylan Zhang, Xinan Chen, Cunwei Fan, Xuefei Guo, Roland Haas, Pan Ji, Kittithat Krongchon, Yao Li, Shengyan Liu, Di Luo, Yutao Ma, Hao Tong, Kha Trinh, Chenyu Tian, Zihan Wang, Bohao Wu, Yanyu Xiong, Shengzhu Yin, Minhui Zhu, Kilian Lieret, Yanxin Lu, Genglin Liu, Yufeng Du, et al. (5 additional authors not shown)

    Abstract: Since language models (LMs) now outperform average humans on many challenging tasks, it has become increasingly difficult to develop challenging, high-quality, and realistic evaluations. We address this issue by examining LMs' capabilities to generate code for solving real scientific research problems. Incorporating input from scientists and AI researchers in 16 diverse natural science sub-fields,…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 25 pages, 9 figures, 7 tables

  26. arXiv:2407.11772  [pdf, other]

    cs.SI

    User Behavior Analysis and Clustering in Peace Elite: Insights and Recommendations

    Authors: Yang Qiu, Yuxin Gong, Guanliang Liu

    Abstract: This study presents a comprehensive analysis of user behavior and clustering in Peace Elite, a popular mobile battle royale game, employing temporal and static data mining techniques to uncover distinct player segments. Our methodology encompasses time series K-means clustering, graph-based algorithms (DeepWalk and LINE), and static attribute clustering, visualized through innovative hybrid charts…

    Submitted 16 July, 2024; originally announced July 2024.

  27. arXiv:2407.10484  [pdf, other]

    cs.CV cs.LG

    Understanding Matrix Function Normalizations in Covariance Pooling through the Lens of Riemannian Geometry

    Authors: Ziheng Chen, Yue Song, Xiao-Jun Wu, Gaowen Liu, Nicu Sebe

    Abstract: Global Covariance Pooling (GCP) has been demonstrated to improve the performance of Deep Neural Networks (DNNs) by exploiting second-order statistics of high-level representations. GCP typically performs classification of the covariance matrices by applying matrix function normalization, such as matrix logarithm or power, followed by a Euclidean classifier. However, covariance matrices inherently…

    Submitted 20 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: 24 pages, 3 figures

  28. arXiv:2407.08214  [pdf, other]

    cs.LG cs.AI

    Towards stable training of parallel continual learning

    Authors: Li Yuepan, Fan Lyu, Yuyang Li, Wei Feng, Guangcan Liu, Fanhua Shang

    Abstract: Parallel Continual Learning (PCL) tasks investigate the training methods for continual learning with multi-source input, where data from different tasks are learned as they arrive. PCL offers high training efficiency and is well-suited for complex multi-source data systems, such as autonomous vehicles equipped with multiple sensors. However, at any time, multiple tasks need to be trained simultane…

    Submitted 11 July, 2024; originally announced July 2024.

  29. arXiv:2407.07791  [pdf, other]

    cs.CL

    Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities

    Authors: Tianjie Ju, Yiting Wang, Xinbei Ma, Pengzhou Cheng, Haodong Zhao, Yulong Wang, Lifeng Liu, Jian Xie, Zhuosheng Zhang, Gongshen Liu

    Abstract: The rapid adoption of large language models (LLMs) in multi-agent systems has highlighted their impressive capabilities in various applications, such as collaborative problem-solving and autonomous negotiation. However, the security implications of these LLM-based multi-agent systems have not been thoroughly investigated, particularly concerning the spread of manipulated knowledge. In this paper,…

    Submitted 22 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

    Comments: 18 Pages, working in progress

  30. arXiv:2407.07058  [pdf, other]

    cs.DS

    An efficient implementation for solving the all pairs minimax path problem in an undirected dense graph

    Authors: Gangli Liu

    Abstract: We provide an efficient $ O(n^2) $ implementation for solving the all pairs minimax path problem or widest path problem in an undirected dense graph. It is a code implementation of the Algorithm 4 (MMJ distance by Calculation and Copy) in a previous paper. The distance matrix is also called the all points path distance (APPD). We conducted experiments to test the implementation and algorithm, comp…

    Submitted 26 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

  31. arXiv:2407.05941  [pdf, other]

    cs.LG cs.CV

    Pruning One More Token is Enough: Leveraging Latency-Workload Non-Linearities for Vision Transformers on the Edge

    Authors: Nick John Eliopoulos, Purvish Jajal, James Davis, Gaowen Liu, George K. Thiravathukal, Yung-Hsiang Lu

    Abstract: This paper investigates how to efficiently deploy vision transformers on edge devices. Recent methods reduce the latency of transformer neural networks by removing or merging tokens, with small accuracy degradation. However, these methods are not designed with edge device deployment in mind: they do not leverage information about the latency vs. workload trends to improve efficiency. First, we sh…

    Submitted 18 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

  32. arXiv:2407.05609  [pdf, other]

    cs.CL

    Open-world Multi-label Text Classification with Extremely Weak Supervision

    Authors: Xintong Li, Jinya Jiang, Ria Dharmani, Jayanth Srinivasa, Gaowen Liu, Jingbo Shang

    Abstract: We study open-world multi-label text classification under extremely weak supervision (XWS), where the user only provides a brief description for classification objectives without any labels or ground-truth label space. Similar single-label XWS settings have been explored recently, however, these methods cannot be easily adapted for multi-label. We observe that (1) most documents have a dominant cl…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Preprint

  33. arXiv:2407.05365  [pdf, other]

    cs.AI

    ElecBench: a Power Dispatch Evaluation Benchmark for Large Language Models

    Authors: Xiyuan Zhou, Huan Zhao, Yuheng Cheng, Yuji Cao, Gaoqi Liang, Guolong Liu, Wenxuan Liu, Yan Xu, Junhua Zhao

    Abstract: In response to the urgent demand for grid stability and the complex challenges posed by renewable energy integration and electricity market dynamics, the power sector increasingly seeks innovative technological solutions. In this context, large language models (LLMs) have become a key technology to improve efficiency and promote intelligent progress in the power sector with their excellent natural…

    Submitted 11 August, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

  34. arXiv:2407.04305  [pdf, other]

    cs.CV

    Towards Stable 3D Object Detection

    Authors: Jiabao Wang, Qiang Meng, Guochao Liu, Liujiang Yan, Ke Wang, Ming-Ming Cheng, Qibin Hou

    Abstract: In autonomous driving, the temporal stability of 3D object detection greatly impacts the driving safety. However, the detection stability cannot be accessed by existing metrics such as mAP and MOTA, and consequently is less explored by the community. To bridge this gap, this work proposes Stability Index (SI), a new metric that can comprehensively evaluate the stability of 3D detectors in terms of…

    Submitted 5 July, 2024; originally announced July 2024.

  35. arXiv:2407.04213  [pdf]

    cs.CR cs.NI

    Pathfinder: Exploring Path Diversity for Assessing Internet Censorship Inconsistency

    Authors: Xiaoqin Liang, Guannan Liu, Lin Jin, Shuai Hao, Haining Wang

    Abstract: Internet censorship is typically enforced by authorities to achieve information control for a certain group of Internet users. So far existing censorship studies have primarily focused on country-level characterization because (1) in many cases, censorship is enabled by governments with nationwide policies and (2) it is usually hard to control how the probing packets are routed to trigger censorsh…

    Submitted 4 July, 2024; originally announced July 2024.

  36. arXiv:2407.03200  [pdf, other]

    cs.CV

    SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding

    Authors: Weitai Kang, Gaowen Liu, Mubarak Shah, Yan Yan

    Abstract: Different from Object Detection, Visual Grounding deals with detecting a bounding box for each text-image pair. This one box for each text-image data provides sparse supervision signals. Although previous works achieve impressive results, their passive utilization of annotation, i.e. the sole use of the box annotation as regression ground truth, results in a suboptimal performance. In this paper,…

    Submitted 6 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  37. arXiv:2407.00896  [pdf, other]

    eess.SP cs.AI

    Channel Modeling Aided Dataset Generation for AI-Enabled CSI Feedback: Advances, Challenges, and Solutions

    Authors: Yupeng Li, Gang Li, Zirui Wen, Shuangfeng Han, Shijian Gao, Guangyi Liu, Jiangzhou Wang

    Abstract: The AI-enabled autoencoder has demonstrated great potential in channel state information (CSI) feedback in frequency division duplex (FDD) multiple input multiple output (MIMO) systems. However, this method completely changes the existing feedback strategies, making it impractical to deploy in recent years. To address this issue, this paper proposes a channel modeling aided data augmentation metho…

    Submitted 30 June, 2024; originally announced July 2024.

  38. arXiv:2407.00331  [pdf, other]

    cs.CG cs.DS

    Unweighted Geometric Hitting Set for Line-Constrained Disks and Related Problems

    Authors: Gang Liu, Haitao Wang

    Abstract: Given a set $P$ of $n$ points and a set $S$ of $m$ disks in the plane, the disk hitting set problem asks for a smallest subset of $P$ such that every disk of $S$ contains at least one point in the subset. The problem is NP-hard. In this paper, we consider a line-constrained version in which all disks have their centers on a line. We present an $O(m\log^2n+(n+m)\log(n+m))$ time algorithm for the pr…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: To appear in MFCS 2024

  39. arXiv:2407.00329  [pdf, other]

    cs.CG cs.DS

    On Line-Separable Weighted Unit-Disk Coverage and Related Problems

    Authors: Gang Liu, Haitao Wang

    Abstract: Given a set $P$ of $n$ points and a set $S$ of $n$ weighted disks in the plane, the disk coverage problem is to compute a subset of disks of smallest total weight such that the union of the disks in the subset covers all points of $P$. The problem is NP-hard. In this paper, we consider a line-separable unit-disk version of the problem where all disks have the same radius and their centers are sepa…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: To appear in MFCS 2024

  40. arXiv:2407.00014  [pdf]

    cs.RO eess.SY

    Simplifying Kinematic Parameter Estimation in sEMG Prosthetic Hands: A Two-Point Approach

    Authors: Gang Liu, Zhenxiang Wang, Ziyang He, Shanshan Guo, Rui Zhang, Dezhong Yao

    Abstract: Regression-based sEMG prosthetic hands are widely used for their ability to provide continuous kinematic parameters. However, establishing these models traditionally requires complex kinematic sensor systems to collect corresponding kinematic data in synchronization with EMG, which is cumbersome and user-unfriendly. This paper presents a simplified approach utilizing only two data points to depict…

    Submitted 1 May, 2024; originally announced July 2024.

    Comments: 13 pages

  41. arXiv:2406.19922  [pdf, other]

    cs.CV

    Parallax-tolerant Image Stitching via Segmentation-guided Multi-homography Warping

    Authors: Tianli Liao, Ce Wang, Lei Li, Guangen Liu, Nan Li

    Abstract: Large parallax between images is an intractable issue in image stitching. Various warping-based methods are proposed to address it, yet the results are unsatisfactory. In this paper, we propose a novel image stitching method using multi-homography warping guided by image segmentation. Specifically, we leverage the Segment Anything Model to segment the target image into numerous contents and partit…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 11 pages, 9 figures

  42. arXiv:2406.19234  [pdf, other]

    cs.CR cs.AI

    Seeing Is Believing: Black-Box Membership Inference Attacks Against Retrieval Augmented Generation

    Authors: Yuying Li, Gaoyang Liu, Yang Yang, Chen Wang

    Abstract: Retrieval-Augmented Generation (RAG) is a state-of-the-art technique that enhances Large Language Models (LLMs) by retrieving relevant knowledge from an external, non-parametric database. This approach aims to mitigate common LLM issues such as hallucinations and outdated knowledge. Although existing research has demonstrated security and privacy vulnerabilities within RAG systems, making them sus…

    Submitted 27 June, 2024; originally announced June 2024.

  43. arXiv:2406.16782  [pdf, other]

    cs.LG

    Confidence Aware Inverse Constrained Reinforcement Learning

    Authors: Sriram Ganapathi Subramanian, Guiliang Liu, Mohammed Elmahgiubi, Kasra Rezaee, Pascal Poupart

    Abstract: In coming up with solutions to real-world problems, humans implicitly adhere to constraints that are too numerous and complex to be specified completely. However, reinforcement learning (RL) agents need these constraints to learn the correct optimal policy in these settings. The field of Inverse Constraint Reinforcement Learning (ICRL) deals with this problem and provides algorithms that aim to es…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Paper to appear in ICML 2024

  44. arXiv:2406.13964  [pdf, other]

    cs.NI

    Hierarchical Micro-Segmentations for Zero-Trust Services via Large Language Model (LLM)-enhanced Graph Diffusion

    Authors: Yinqiu Liu, Guangyuan Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Dong In Kim, Xuemin Shen

    Abstract: In the rapidly evolving Next-Generation Networking (NGN) era, the adoption of zero-trust architectures has become increasingly crucial to protect security. However, provisioning zero-trust services in NGNs poses significant challenges, primarily due to the environmental complexity and dynamics. Motivated by these challenges, this paper explores efficient zero-trust service provisioning using hiera…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 13 pages

  45. arXiv:2406.12200  [pdf, other]

    cs.LG cs.DC cs.ET cs.MM cs.NE

    SFedCA: Credit Assignment-Based Active Client Selection Strategy for Spiking Federated Learning

    Authors: Qiugang Zhan, Jinbo Cao, Xiurui Xie, Malu Zhang, Huajin Tang, Guisong Liu

    Abstract: Spiking federated learning is an emerging distributed learning paradigm that allows resource-constrained devices to train collaboratively at low power consumption without exchanging local data. It takes advantage of both the privacy computation property in federated learning (FL) and the energy efficiency in spiking neural networks (SNN). Thus, it is highly promising to revolutionize the efficient…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 9 pages

  46. arXiv:2406.12056  [pdf, other]

    cs.LG q-bio.QM

    Learning Molecular Representation in a Cell

    Authors: Gang Liu, Srijit Seal, John Arevalo, Zhenwen Liang, Anne E. Carpenter, Meng Jiang, Shantanu Singh

    Abstract: Predicting drug efficacy and safety in vivo requires information on biological responses (e.g., cell morphology and gene expression) to small molecule perturbations. However, current molecular representation learning methods do not provide a comprehensive view of cell states under these perturbations and struggle to remove noise, hindering model generalization. We introduce the Information Alignme…

    Submitted 22 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 21 pages, 8 tables, 7 figures

  47. arXiv:2406.10030  [pdf, other]

    cs.LG stat.ML

    Off-Policy Evaluation from Logged Human Feedback

    Authors: Aniruddha Bhargava, Lalit Jain, Branislav Kveton, Ge Liu, Subhojyoti Mukherjee

    Abstract: Learning from human feedback has been central to recent advances in artificial intelligence and machine learning. Since the collection of human feedback is costly, a natural question to ask is if the new feedback always needs to be collected. Or could we evaluate a new model with the human feedback on responses of another model? This motivates us to study off-policy evaluation from logged human feedb…

    Submitted 14 June, 2024; originally announced June 2024.

  48. arXiv:2406.09961  [pdf, other]

    cs.SE cs.CL cs.CV

    ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation

    Authors: Chufan Shi, Cheng Yang, Yaxin Liu, Bo Shui, Junjie Wang, Mohan Jing, Linran Xu, Xinyu Zhu, Siheng Li, Yuxiang Zhang, Gongye Liu, Xiaomei Nie, Deng Cai, Yujiu Yang

    Abstract: We introduce a new benchmark, ChartMimic, aimed at assessing the visually-grounded code generation capabilities of large multimodal models (LMMs). ChartMimic utilizes information-intensive visual charts and textual instructions as inputs, requiring LMMs to generate the corresponding code for chart rendering. ChartMimic includes 1,000 human-curated (figure, instruction, code) triplets, which repres…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Data and code are available at https://github.com/ChartMimic/ChartMimic

  49. arXiv:2406.09455  [pdf, other]

    cs.CV cs.AI cs.CL

    Pandora: Towards General World Model with Natural Language Actions and Video States

    Authors: Jiannan Xiang, Guangyi Liu, Yi Gu, Qiyue Gao, Yuting Ning, Yuheng Zha, Zeyu Feng, Tianhua Tao, Shibo Hao, Yemin Shi, Zhengzhong Liu, Eric P. Xing, Zhiting Hu

    Abstract: World models simulate future states of the world in response to different actions. They facilitate interactive content creation and provide a foundation for grounded, long-horizon reasoning. Current foundation models do not fully meet the capabilities of general world models: large language models (LLMs) are constrained by their reliance on language modality and their limited understanding of the…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Website: https://world-model.maitrix.org/

  50. arXiv:2406.08607  [pdf, other]

    cs.CL cs.AI

    Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference

    Authors: Jiabao Ji, Yujian Liu, Yang Zhang, Gaowen Liu, Ramana Rao Kompella, Sijia Liu, Shiyu Chang

    Abstract: As Large Language Models (LLMs) demonstrate extensive capability in learning from documents, LLM unlearning becomes an increasingly important research area to address concerns of LLMs in terms of privacy, copyright, etc. A conventional LLM unlearning task typically involves two goals: (1) The target LLM should forget the knowledge in the specified forget documents, and (2) it should retain the oth…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 21 pages, 11 figures