
Showing 1–50 of 258 results for author: Cui, J

Searching in archive cs.
  1. arXiv:2408.15583  [pdf, other]

    cs.CE

    PointEMRay: A Novel Efficient SBR Framework on Point Based Geometry

    Authors: Kaiqiao Yang, Che Liu, Wenming Yu, Tie Jun Cui

    Abstract: The rapid computation of electromagnetic (EM) fields across various scenarios has long been a challenge, primarily due to the need for precise geometric models. The emergence of point cloud data offers a potential solution to this issue. However, the lack of electromagnetic simulation algorithms optimized for point-based models remains a significant limitation. In this study, we propose PointEMRay…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: 14 pages, 13 figures, and 2 tables

  2. arXiv:2408.15069  [pdf]

    cs.CV eess.IV physics.ins-det

    Geometric Artifact Correction for Symmetric Multi-Linear Trajectory CT: Theory, Method, and Generalization

    Authors: Zhisheng Wang, Yanxu Sun, Shangyu Li, Legeng Lin, Shunli Wang, Junning Cui

    Abstract: For extending CT field-of-view to perform non-destructive testing, the Symmetric Multi-Linear trajectory Computed Tomography (SMLCT) has been developed as a successful example of non-standard CT scanning modes. However, inevitable geometric errors can cause severe artifacts in the reconstructed images. The existing calibration method for SMLCT is both crude and inefficient. It involves reconstruct…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: 15 pages, 10 figures

    MSC Class: 68U10 (Primary) 68V99; 68Q30 (Secondary)

  3. arXiv:2408.14047  [pdf]

    cs.CV

    Alleviating Class Imbalance in Semi-supervised Multi-organ Segmentation via Balanced Subclass Regularization

    Authors: Zhenghao Feng, Lu Wen, Binyu Yan, Jiaqi Cui, Yan Wang

    Abstract: Semi-supervised learning (SSL) has shown notable potential in relieving the heavy demand of dense prediction tasks on large-scale well-annotated datasets, especially for the challenging multi-organ segmentation (MoS). However, the prevailing class-imbalance problem in MoS, caused by the substantial variations in organ size, exacerbates the learning difficulty of the SSL network. To alleviate this…

    Submitted 26 August, 2024; originally announced August 2024.

  4. arXiv:2408.12614  [pdf, other]

    cs.CV cs.AI

    Image-Feature Weak-to-Strong Consistency: An Enhanced Paradigm for Semi-Supervised Learning

    Authors: Zhiyu Wu, Jinshi Cui

    Abstract: Image-level weak-to-strong consistency serves as the predominant paradigm in semi-supervised learning (SSL) due to its simplicity and impressive performance. Nonetheless, this approach confines all perturbations to the image level and suffers from the excessive presence of naive samples, thus necessitating further improvement. In this paper, we introduce feature-level perturbation with varying int…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: Accepted by ECCV 2024

  5. arXiv:2408.12354  [pdf, other]

    eess.AS cs.SD

    LCM-SVC: Latent Diffusion Model Based Singing Voice Conversion with Inference Acceleration via Latent Consistency Distillation

    Authors: Shihao Chen, Yu Gu, Jianwei Cui, Jie Zhang, Rilin Chen, Lirong Dai

    Abstract: Any-to-any singing voice conversion (SVC) aims to transfer a target singer's timbre to other songs using a short voice sample. However, many diffusion model based any-to-any SVC methods, which have achieved impressive results, usually suffer from low efficiency caused by a large number of inference steps. In this paper, we propose LCM-SVC, a latent consistency distillation (LCD) based latent diffusion mo…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: Accepted to ISCSLP 2024. arXiv admin note: text overlap with arXiv:2406.05325

  6. arXiv:2408.11611  [pdf, other]

    cs.IR cs.LG

    DTN: Deep Multiple Task-specific Feature Interactions Network for Multi-Task Recommendation

    Authors: Yaowen Bi, Yuteng Lian, Jie Cui, Jun Liu, Peijian Wang, Guanghui Li, Xuejun Chen, Jinglin Zhao, Hao Wen, Jing Zhang, Zhaoqi Zhang, Wenzhuo Song, Yang Sun, Weiwei Zhang, Mingchen Cai, Guanxing Zhang

    Abstract: Neural-based multi-task learning (MTL) has been successfully applied to many recommendation applications. However, these MTL models (e.g., MMoE, PLE) did not consider feature interaction during the optimization, which is crucial for capturing complex high-order features and has been widely used in ranking models for real-world recommender systems. Moreover, through feature importance analysis acro…

    Submitted 23 August, 2024; v1 submitted 21 August, 2024; originally announced August 2024.

  7. arXiv:2408.08312  [pdf, other]

    cs.RO cs.AI

    HyperTaxel: Hyper-Resolution for Taxel-Based Tactile Signals Through Contrastive Learning

    Authors: Hongyu Li, Snehal Dikhale, Jinda Cui, Soshi Iba, Nawid Jamali

    Abstract: To achieve dexterity comparable to that of humans, robots must intelligently process tactile sensor data. Taxel-based tactile signals often have low spatial-resolution, with non-standardized representations. In this paper, we propose a novel framework, HyperTaxel, for learning a geometrically-informed representation of taxel-based tactile signals to address challenges associated with their spatial…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: Accepted by IROS 2024

  8. arXiv:2408.05019  [pdf, other]

    cs.CV

    Instruction Tuning-free Visual Token Complement for Multimodal LLMs

    Authors: Dongsheng Wang, Jiequan Cui, Miaoge Li, Wang Lin, Bo Chen, Hanwang Zhang

    Abstract: As the open community of large language models (LLMs) matures, multimodal LLMs (MLLMs) have promised an elegant bridge between vision and language. However, current research is inherently constrained by challenges such as the need for high-quality instruction pairs and the loss of visual information in image-to-text training objectives. To this end, we propose a Visual Token Complement framework (…

    Submitted 9 August, 2024; originally announced August 2024.

    Comments: Accepted by ECCV 2024 (20 pages)

  9. arXiv:2408.01800  [pdf, other]

    cs.CV

    MiniCPM-V: A GPT-4V Level MLLM on Your Phone

    Authors: Yuan Yao, Tianyu Yu, Ao Zhang, Chongyi Wang, Junbo Cui, Hongji Zhu, Tianchi Cai, Haoyu Li, Weilin Zhao, Zhihui He, Qianyu Chen, Huarong Zhou, Zhensheng Zou, Haoye Zhang, Shengding Hu, Zhi Zheng, Jie Zhou, Jie Cai, Xu Han, Guoyang Zeng, Dahai Li, Zhiyuan Liu, Maosong Sun

    Abstract: The recent surge of Multimodal Large Language Models (MLLMs) has fundamentally reshaped the landscape of AI research and industry, shedding light on a promising path toward the next AI milestone. However, significant challenges remain preventing MLLMs from being practical in real-world applications. The most notable challenge comes from the huge cost of running an MLLM with a massive number of par…

    Submitted 3 August, 2024; originally announced August 2024.

    Comments: preprint

  10. arXiv:2407.20878  [pdf]

    eess.IV cs.CV

    S3PET: Semi-supervised Standard-dose PET Image Reconstruction via Dose-aware Token Swap

    Authors: Jiaqi Cui, Pinxian Zeng, Yuanyuan Xu, Xi Wu, Jiliu Zhou, Yan Wang

    Abstract: To acquire high-quality positron emission tomography (PET) images while reducing the radiation tracer dose, numerous efforts have been devoted to reconstructing standard-dose PET (SPET) images from low-dose PET (LPET). However, the success of current fully-supervised approaches relies on abundant paired LPET and SPET images, which are often unavailable in clinic. Moreover, these methods often mix…

    Submitted 30 July, 2024; originally announced July 2024.

  11. Exposure Completing for Temporally Consistent Neural High Dynamic Range Video Rendering

    Authors: Jiahao Cui, Wei Jiang, Zhan Peng, Zhiyu Pan, Zhiguo Cao

    Abstract: High dynamic range (HDR) video rendering from low dynamic range (LDR) videos where frames are of alternate exposure encounters significant challenges, due to the exposure change and absence at each time stamp. The exposure change and absence make existing methods generate flickering HDR results. In this paper, we propose a novel paradigm to render HDR frames via completing the absent exposure info…

    Submitted 4 August, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

    Comments: 9 pages, 6 figures, accepted by ACM-MM 2024 (poster)

  12. arXiv:2407.12274  [pdf, other]

    cs.CV

    MDPE: A Multimodal Deception Dataset with Personality and Emotional Characteristics

    Authors: Cong Cai, Shan Liang, Xuefei Liu, Kang Zhu, Zhengqi Wen, Jianhua Tao, Heng Xie, Jizhou Cui, Yiming Ma, Zhenhua Cheng, Hanzhe Xu, Ruibo Fu, Bin Liu, Yongwei Li

    Abstract: Deception detection has garnered increasing attention in recent years due to the significant growth of digital media and heightened ethical and security concerns. It has been extensively studied using multimodal methods, including video, audio, and text. In addition, individual differences in deception production and detection are believed to play a crucial role. Although some studies have utilized…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Code and data are available; Submitted to NeurIPS 2024 Datasets and Benchmarks Track

  13. arXiv:2407.07094  [pdf, other]

    cs.CL cs.AI

    AnyTaskTune: Advanced Domain-Specific Solutions through Task-Fine-Tuning

    Authors: Jiaxi Cui, Wentao Zhang, Jing Tang, Xudong Tong, Zhenwei Zhang, Amie, Jing Wen, Rongsheng Wang, Pengfei Wu

    Abstract: The pervasive deployment of Large Language Models (LLMs) in various sectors often neglects the nuanced requirements of individuals and small organizations, who benefit more from models precisely tailored to their specific business contexts rather than those with broadly superior general capabilities. This work introduces AnyTaskTune, a novel fine-tuning methodology coined as Task-Fi…

    Submitted 9 July, 2024; originally announced July 2024.

  14. arXiv:2407.06087  [pdf, other]

    cs.LG cs.CV

    Analytic Convolutional Layer: A Step to Analytic Neural Network

    Authors: Jingmao Cui, Donglai Tao, Linmi Tao, Ruiyang Liu, Yu Cheng

    Abstract: The prevailing approach to embedding prior knowledge within convolutional layers typically includes the design of steerable kernels or their modulation using designated kernel banks. In this study, we introduce the Analytic Convolutional Layer (ACL), an innovative model-driven convolutional layer, which is a mosaic of analytical convolution kernels (ACKs) and traditional convolution kernels. ACKs…

    Submitted 3 July, 2024; originally announced July 2024.

  15. arXiv:2407.01971  [pdf, other]

    cs.CV

    Pseudo-Labeling by Multi-Policy Viewfinder Network for Image Cropping

    Authors: Zhiyu Pan, Kewei Wang, Yizheng Wu, Liwen Xiao, Jiahao Cui, Zhicheng Wang, Zhiguo Cao

    Abstract: Automatic image cropping models predict reframing boxes to enhance image aesthetics. Yet, the scarcity of labeled data hinders the progress of this task. To overcome this limitation, we explore the possibility of utilizing both labeled and unlabeled data together to expand the scale of training data for image cropping models. This idea can be implemented in a pseudo-labeling way: producing pseudo…

    Submitted 4 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: 18 pages, 8 figures

  16. arXiv:2406.18379  [pdf, other]

    cs.CR cs.AI cs.SE

    MALSIGHT: Exploring Malicious Source Code and Benign Pseudocode for Iterative Binary Malware Summarization

    Authors: Haolang Lu, Hongrui Peng, Guoshun Nan, Jiaoyang Cui, Cheng Wang, Weifei Jin

    Abstract: Binary malware summarization aims to automatically generate human-readable descriptions of malware behaviors from executable files, facilitating tasks like malware cracking and detection. Previous methods based on Large Language Models (LLMs) have shown great promise. However, they still face significant issues, including poor usability, inaccurate explanations, and incomplete summaries, primarily…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 17 pages, 14 figures

  17. arXiv:2406.16776  [pdf, other]

    cs.CV

    Instance Consistency Regularization for Semi-Supervised 3D Instance Segmentation

    Authors: Yizheng Wu, Zhiyu Pan, Kewei Wang, Xingyi Li, Jiahao Cui, Liwen Xiao, Guosheng Lin, Zhiguo Cao

    Abstract: Large-scale datasets with point-wise semantic and instance labels are crucial to 3D instance segmentation but also expensive. To leverage unlabeled data, previous semi-supervised 3D instance segmentation approaches have explored self-training frameworks, which rely on high-quality pseudo labels for consistency regularization. They intuitively utilize both instance and semantic pseudo labels in a j…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 14 pages, 10 figures

  18. arXiv:2406.15763  [pdf, other]

    cs.LG cs.AI

    AllMatch: Exploiting All Unlabeled Data for Semi-Supervised Learning

    Authors: Zhiyu Wu, Jinshi Cui

    Abstract: Existing semi-supervised learning algorithms adopt pseudo-labeling and consistency regulation techniques to introduce supervision signals for unlabeled samples. To overcome the inherent limitation of threshold-based pseudo-labeling, prior studies have attempted to align the confidence threshold with the evolving learning status of the model, which is estimated through the predictions made on the u…

    Submitted 9 July, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

    Comments: Accepted by IJCAI 2024

  19. arXiv:2406.13150  [pdf]

    eess.IV cs.CV

    MCAD: Multi-modal Conditioned Adversarial Diffusion Model for High-Quality PET Image Reconstruction

    Authors: Jiaqi Cui, Xinyi Zeng, Pinxian Zeng, Bo Liu, Xi Wu, Jiliu Zhou, Yan Wang

    Abstract: Radiation hazards associated with standard-dose positron emission tomography (SPET) images remain a concern, whereas the quality of low-dose PET (LPET) images fails to meet clinical requirements. Therefore, there is great interest in reconstructing SPET images from LPET images. However, prior studies focus solely on image data, neglecting vital complementary information from other modalities, e.g.…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Early accepted by MICCAI 2024

  20. arXiv:2406.12896  [pdf, other]

    cs.AI cs.CY cs.LG

    Leveraging Pedagogical Theories to Understand Student Learning Process with Graph-based Reasonable Knowledge Tracing

    Authors: Jiajun Cui, Hong Qian, Bo Jiang, Wei Zhang

    Abstract: Knowledge tracing (KT) is a crucial task in intelligent education, focusing on predicting students' performance on given questions to trace their evolving knowledge. The advancement of deep learning in this field has led to deep-learning knowledge tracing (DLKT) models that prioritize high predictive accuracy. However, many existing DLKT methods overlook the fundamental goal of tracking students'…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Preprint, accepted to appear in SIGKDD 2024, 12 pages. The source code is available at https://github.com/JJCui96/GRKT. Keywords: interpretable knowledge tracing, student behavior modeling, intelligence education

  21. arXiv:2406.11317  [pdf, other]

    cs.AI cs.CL cs.CV cs.HC

    GUICourse: From General Vision Language Models to Versatile GUI Agents

    Authors: Wentong Chen, Junbo Cui, Jinyi Hu, Yujia Qin, Junjie Fang, Yue Zhao, Chongyi Wang, Jun Liu, Guirong Chen, Yupeng Huo, Yuan Yao, Yankai Lin, Zhiyuan Liu, Maosong Sun

    Abstract: Utilizing Graphic User Interface (GUI) for human-computer interaction is essential for accessing a wide range of digital tools. Recent advancements in Vision Language Models (VLMs) highlight the compelling potential to develop versatile agents to help humans finish GUI navigation tasks. However, current VLMs are challenged in terms of fundamental abilities (OCR and grounding) and GUI knowledge (th…

    Submitted 17 June, 2024; originally announced June 2024.

  22. arXiv:2406.08627  [pdf, other]

    cs.LG cs.CL

    Time-MMD: A New Multi-Domain Multimodal Dataset for Time Series Analysis

    Authors: Haoxin Liu, Shangqing Xu, Zhiyuan Zhao, Lingkai Kong, Harshavardhan Kamarthi, Aditya B. Sasanur, Megha Sharma, Jiaming Cui, Qingsong Wen, Chao Zhang, B. Aditya Prakash

    Abstract: Time series data are ubiquitous across a wide range of real-world domains. While real-world time series analysis (TSA) requires human experts to integrate numerical series data with multimodal domain-specific knowledge, most existing TSA models rely solely on numerical data, overlooking the significance of information beyond numerical series. This oversight is due to the untapped potential of text…

    Submitted 12 June, 2024; originally announced June 2024.

  23. arXiv:2406.06609  [pdf, other]

    cs.LG cs.AI cs.CV

    Mitigating Bias in Dataset Distillation

    Authors: Justin Cui, Ruochen Wang, Yuanhao Xiong, Cho-Jui Hsieh

    Abstract: Dataset Distillation has emerged as a technique for compressing large datasets into smaller synthetic counterparts, facilitating downstream training tasks. In this paper, we study the impact of bias inside the original dataset on the performance of dataset distillation. With a comprehensive empirical evaluation on canonical datasets with color, corruption and background biases, we found that color…

    Submitted 10 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: ICML

  24. arXiv:2406.06474  [pdf, other]

    cs.AI cs.CL

    Towards a Personal Health Large Language Model

    Authors: Justin Cosentino, Anastasiya Belyaeva, Xin Liu, Nicholas A. Furlotte, Zhun Yang, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Douglas Schneider, Robby Bryant, Ryan G. Gomes, Allen Jiang, Roy Lee, Yun Liu, Javier Perez, Jameson K. Rogers, Cathy Speed, Shyam Tailor, Megan Walker, Jeffrey Yu, Tim Althoff, Conor Heneghan, John Hernandez, Mark Malhotra, et al. (9 additional authors not shown)

    Abstract: In health, most large language model (LLM) research has focused on clinical tasks. However, mobile and wearable devices, which are rarely integrated into such tasks, provide rich, longitudinal data for personal health monitoring. Here we present Personal Health Large Language Model (PH-LLM), fine-tuned from Gemini for understanding and reasoning over numerical time-series personal health data. We…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 72 pages

  25. arXiv:2406.04721  [pdf, other]

    cs.IT eess.SP

    End-to-End Design of Polar Coded Integrated Data and Energy Networking

    Authors: Jie Hu, Jingwen Cui, Luping Xiang, Kun Yang

    Abstract: In order to transmit data and transfer energy to the low-power Internet of Things (IoT) devices, integrated data and energy networking (IDEN) system may be harnessed. In this context, we propose a bitwise end-to-end design for polar coded IDEN systems, where the conventional encoding/decoding, modulation/demodulation, and energy harvesting (EH) modules are replaced by the neural networks (NNs). In…

    Submitted 7 June, 2024; originally announced June 2024.

  26. arXiv:2406.01721  [pdf, other]

    cs.CL

    Rotation and Permutation for Advanced Outlier Management and Efficient Quantization of LLMs

    Authors: Haokun Lin, Haobo Xu, Yichen Wu, Jingzhi Cui, Yingtao Zhang, Linzhan Mou, Linqi Song, Zhenan Sun, Ying Wei

    Abstract: Quantizing large language models (LLMs) presents significant challenges, primarily due to outlier activations that compromise the efficiency of low-bit representation. Traditional approaches mainly focus on solving Normal Outliers: activations with consistently high magnitudes across all tokens. However, these techniques falter when dealing with Massive Outliers, which are significantly higher in v…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 26 pages, 13 figures

  27. arXiv:2406.01281  [pdf]

    physics.med-ph cs.HC

    Noninvasive Extraction of Maternal and Fetal ECG using Periodic Progressive FastICA Peel-off

    Authors: Yao Li, Xuanyu Luo, Haowen Zhao, Jiawen Cui, Yangfan She, Dongfang Li, Lai Jiang, Xu Zhang

    Abstract: The abdominal electrocardiogram (AECG) gives a safe and non-invasive way to monitor fetal well-being during pregnancy. However, due to the overlap with maternal ECG (MECG) as well as significant external noise, it is challenging to extract weak fetal ECG (FECG) using surface electrodes. In this study, we introduce a novel periodic progressive FastICA peel-off (PPFP) method for noninvasive extracti…

    Submitted 22 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  28. Retrieval-Augmented Conversational Recommendation with Prompt-based Semi-Structured Natural Language State Tracking

    Authors: Sara Kemper, Justin Cui, Kai Dicarlantonio, Kathy Lin, Danjie Tang, Anton Korikov, Scott Sanner

    Abstract: Conversational recommendation (ConvRec) systems must understand rich and diverse natural language (NL) expressions of user preferences and intents, often communicated in an indirect manner (e.g., "I'm watching my weight"). Such complex utterances make retrieving relevant items challenging, especially if only using often incomplete or out-of-date metadata. Fortunately, many domains feature rich ite…

    Submitted 25 May, 2024; originally announced June 2024.

  29. arXiv:2405.20947  [pdf, other]

    cs.CL cs.AI

    OR-Bench: An Over-Refusal Benchmark for Large Language Models

    Authors: Justin Cui, Wei-Lin Chiang, Ion Stoica, Cho-Jui Hsieh

    Abstract: Large Language Models (LLMs) require careful safety alignment to prevent malicious outputs. While significant research focuses on mitigating harmful content generation, the enhanced safety often comes with the side effect of over-refusal, where LLMs may reject innocuous prompts and become less helpful. Although the issue of over-refusal has been empirically observed, a systematic measurement is cha…

    Submitted 20 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

    Comments: version 2, 10 pages main, 22 pages total

  30. arXiv:2405.14743  [pdf, other]

    cs.LG cs.AI

    Iterative Causal Segmentation: Filling the Gap between Market Segmentation and Marketing Strategy

    Authors: Kaihua Ding, Jingsong Cui, Mohammad Soltani, Jing Jin

    Abstract: The field of causal Machine Learning (ML) has made significant strides in recent years. Notable breakthroughs include methods such as meta learners (arXiv:1706.03461v6) and heterogeneous doubly robust estimators (arXiv:2004.14497) introduced in the last five years. Despite these advancements, the field still faces challenges, particularly in managing tightly coupled systems where both the causal t…

    Submitted 23 May, 2024; originally announced May 2024.

  31. arXiv:2405.13910  [pdf, other]

    cs.LG cs.CV stat.ML

    Learning Latent Space Hierarchical EBM Diffusion Models

    Authors: Jiali Cui, Tian Han

    Abstract: This work studies the learning problem of the energy-based prior model and the multi-layer generator model. The multi-layer generator model, which contains multiple layers of latent variables organized in a top-down hierarchical structure, typically assumes the Gaussian prior model. Such a prior model can be limited in modelling expressivity, which results in a gap between the generator posterior…

    Submitted 27 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  32. arXiv:2405.10587  [pdf, other]

    cs.CL cs.IR

    RDRec: Rationale Distillation for LLM-based Recommendation

    Authors: Xinfeng Wang, Jin Cui, Yoshimi Suzuki, Fumiyo Fukumoto

    Abstract: Large language model (LLM)-based recommender models that bridge users and items through textual prompts for effective semantic reasoning have gained considerable attention. However, few methods consider the underlying rationales behind interactions, such as user preferences and item attributes, limiting the reasoning capability of LLMs for recommendations. This paper proposes a rationale distillat…

    Submitted 14 June, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: 10 pages. Accepted to ACL 2024 Main as a short paper

  33. arXiv:2404.09748  [pdf, other]

    cs.CV cs.GR

    LetsGo: Large-Scale Garage Modeling and Rendering via LiDAR-Assisted Gaussian Primitives

    Authors: Jiadi Cui, Junming Cao, Fuqiang Zhao, Zhipeng He, Yifan Chen, Yuhui Zhong, Lan Xu, Yujiao Shi, Yingliang Zhang, Jingyi Yu

    Abstract: Large garages are ubiquitous yet intricate scenes that present unique challenges due to their monotonous colors, repetitive patterns, reflective surfaces, and transparent vehicle glass. Conventional Structure from Motion (SfM) methods for camera pose estimation and 3D reconstruction often fail in these environments due to poor correspondence construction. To address these challenges, we introduce…

    Submitted 21 May, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: Project Page: https://zhaofuq.github.io/LetsGo/

  34. arXiv:2404.06900  [pdf, other]

    cs.IR

    NFARec: A Negative Feedback-Aware Recommender Model

    Authors: Xinfeng Wang, Fumiyo Fukumoto, Jin Cui, Yoshimi Suzuki, Dongjin Yu

    Abstract: Graph neural network (GNN)-based models have been extensively studied for recommendations, as they can extract high-order collaborative signals accurately which is required for high-quality recommender systems. However, they neglect the valuable information gained through negative feedback in two aspects: (1) different users might hold opposite feedback on the same item, which hampers optimal info…

    Submitted 28 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted to SIGIR 2024

  35. arXiv:2404.06895  [pdf, other]

    cs.IR

    CaDRec: Contextualized and Debiased Recommender Model

    Authors: Xinfeng Wang, Fumiyo Fukumoto, Jin Cui, Yoshimi Suzuki, Jiyi Li, Dongjin Yu

    Abstract: Recommender models aimed at mining users' behavioral patterns have raised great attention as one of the essential applications in daily life. Recent work on graph neural networks (GNNs) or debiasing methods has attained remarkable gains. However, they still suffer from (1) over-smoothing node embeddings caused by recursive convolutions with GNNs, and (2) the skewed distribution of interactions due…

    Submitted 28 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted to SIGIR 2024

  36. arXiv:2404.05019  [pdf, other]

    cs.LG cs.CL cs.DC

    Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts

    Authors: Weilin Cai, Juyong Jiang, Le Qin, Junwei Cui, Sunghun Kim, Jiayi Huang

    Abstract: Expert parallelism has been introduced as a strategy to distribute the computational workload of sparsely-gated mixture-of-experts (MoE) models across multiple computing devices, facilitating the execution of these increasingly large-scale models. However, the All-to-All communication intrinsic to expert parallelism constitutes a significant overhead, diminishing the MoE models' efficiency. Curren…

    Submitted 7 April, 2024; originally announced April 2024.

  37. arXiv:2404.03929  [pdf, other]

    cs.DB

    SLSM : An Efficient Strategy for Lazy Schema Migration on Shared-Nothing Databases

    Authors: Zhilin Zeng, Hui Li, Xiyue Gao, Hui Zhang, Huiquan Zhang, Jiangtao Cui

    Abstract: By introducing intermediate states for metadata changes and ensuring that at most two versions of metadata exist in the cluster at the same time, shared-nothing databases are capable of making online, asynchronous schema changes. However, this method leads to delays in the deployment of new schemas since it requires waiting for massive data backfill. To shorten the service vacuum period before the…

    Submitted 5 April, 2024; originally announced April 2024.

  38. arXiv:2404.01563  [pdf]

    eess.IV cs.CV

    Two-Phase Multi-Dose-Level PET Image Reconstruction with Dose Level Awareness

    Authors: Yuchen Fei, Yanmei Luo, Yan Wang, Jiaqi Cui, Yuanyuan Xu, Jiliu Zhou, Dinggang Shen

    Abstract: To obtain high-quality positron emission tomography (PET) while minimizing radiation exposure, a range of methods have been designed to reconstruct standard-dose PET (SPET) from corresponding low-dose PET (LPET) images. However, most current methods merely learn the mapping between single-dose-level LPET and SPET images, but omit the dose disparity of LPET images in clinical scenarios. In this pap…

    Submitted 10 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted by ISBI 2024

  39. arXiv:2404.00069  [pdf, other]

    cs.LG

    A Two-Phase Recall-and-Select Framework for Fast Model Selection

    Authors: Jianwei Cui, Wenhang Shi, Honglin Tao, Wei Lu, Xiaoyong Du

    Abstract: As the ubiquity of deep learning in various machine learning applications has amplified, a proliferation of neural network models has been trained and shared on public model repositories. In the context of a targeted machine learning assignment, utilizing an apt source model as a starting point typically outperforms the strategy of training from scratch, particularly with limited training data. De…

    Submitted 28 March, 2024; originally announced April 2024.

  40. arXiv:2403.19079  [pdf, other]

    cs.CV

    A Real-Time Framework for Domain-Adaptive Underwater Object Detection with Image Enhancement

    Authors: Junjie Wen, Jinqiang Cui, Benyun Zhao, Bingxin Han, Xuchen Liu, Zhi Gao, Ben M. Chen

    Abstract: In recent years, significant progress has been made in the field of underwater image enhancement (UIE). However, its practical utility for high-level vision tasks, such as underwater object detection (UOD) in Autonomous Underwater Vehicles (AUVs), remains relatively unexplored. It may be attributed to several factors: (1) Existing methods typically employ UIE as a pre-processing step, which inevit…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: accepted by ICRA24

  41. arXiv:2403.12835  [pdf, other]

    cs.CV cs.RO

    AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents

    Authors: Jieming Cui, Tengyu Liu, Nian Liu, Yaodong Yang, Yixin Zhu, Siyuan Huang

    Abstract: Traditional approaches in physics-based motion generation, centered around imitation learning and reward shaping, often struggle to adapt to new scenarios. To tackle this limitation, we propose AnySkill, a novel hierarchical method that learns physically plausible interactions following open-vocabulary instructions. Our approach begins by developing a set of atomic actions via a low-level controll…

    Submitted 19 March, 2024; originally announced March 2024.

  42. arXiv:2403.11703  [pdf, other]

    cs.CV cs.AI

    LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

    Authors: Ruyi Xu, Yuan Yao, Zonghao Guo, Junbo Cui, Zanlin Ni, Chunjiang Ge, Tat-Seng Chua, Zhiyuan Liu, Maosong Sun, Gao Huang

    Abstract: Visual encoding constitutes the basis of large multimodal models (LMMs) in understanding the visual world. Conventional LMMs process images in fixed sizes and limited resolutions, while recent explorations in this direction are limited in adaptivity, efficiency, and even correctness. In this work, we first take GPT-4V and LLaVA-1.5 as representative examples and expose systematic flaws rooted in t…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Preprint

  43. arXiv:2403.10576  [pdf, other]

    cs.CR cs.CL cs.LG

    Ignore Me But Don't Replace Me: Utilizing Non-Linguistic Elements for Pretraining on the Cybersecurity Domain

    Authors: Eugene Jang, Jian Cui, Dayeon Yim, Youngjin Jin, Jin-Woo Chung, Seungwon Shin, Yongjae Lee

    Abstract: Cybersecurity information is often technically complex and relayed through unstructured text, making automation of cyber threat intelligence highly challenging. For such text domains that involve high levels of expertise, pretraining on in-domain corpora has been a popular method for language models to obtain domain expertise. However, cybersecurity texts often contain non-linguistic elements (suc…

    Submitted 2 April, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: To appear in NAACL Findings 2024

    ACM Class: I.2.7

  44. arXiv:2403.10214  [pdf, other]

    cs.CL

    Enhanced Coherence-Aware Network with Hierarchical Disentanglement for Aspect-Category Sentiment Analysis

    Authors: Jin Cui, Fumiyo Fukumoto, Xinfeng Wang, Yoshimi Suzuki, Jiyi Li, Noriko Tomuro, Wanzeng Kong

    Abstract: Aspect-category-based sentiment analysis (ACSA), which aims to identify aspect categories and predict their sentiments, has been intensively studied due to its wide range of NLP applications. Most approaches mainly utilize intrasentential features. However, a review often includes multiple different aspect categories, and some of them do not explicitly appear in the review. Even in a sentence, ther…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Accepted by LREC-COLING 2024

  45. arXiv:2403.02981  [pdf, other]

    cs.CV

    Doubly Abductive Counterfactual Inference for Text-based Image Editing

    Authors: Xue Song, Jiequan Cui, Hanwang Zhang, Jingjing Chen, Richang Hong, Yu-Gang Jiang

    Abstract: We study text-based image editing (TBIE) of a single image by counterfactual inference because it is an elegant formulation to precisely address the requirement: the edited image should retain the fidelity of the original one. Through the lens of the formulation, we find that the crux of TBIE is that existing techniques hardly achieve a good trade-off between editability and fidelity, mainly due t…

    Submitted 25 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  46. arXiv:2403.01210  [pdf, other]

    cs.CV cs.AI

    SAR-AE-SFP: SAR Imagery Adversarial Example in Real Physics domain with Target Scattering Feature Parameters

    Authors: Jiahao Cui, Jiale Duan, Binyan Luo, Hang Cao, Wang Guo, Haifeng Li

    Abstract: Deep neural network-based Synthetic Aperture Radar (SAR) target recognition models are susceptible to adversarial examples. Current adversarial example generation methods for SAR imagery primarily operate in the 2D digital domain, known as image adversarial examples. Recent work, while considering SAR imaging scatter mechanisms, fails to account for the actual imaging process, rendering attacks in…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: 10 pages, 9 figures, 2 tables

  47. arXiv:2402.18879  [pdf]

    cs.CV

    Dose Prediction Driven Radiotherapy Paramters Regression via Intra- and Inter-Relation Modeling

    Authors: Jiaqi Cui, Yuanyuan Xu, Jianghong Xiao, Yuchen Fei, Jiliu Zhou, Xingcheng Peng, Yan Wang

    Abstract: Deep learning has facilitated the automation of radiotherapy by predicting accurate dose distribution maps. However, existing methods fail to derive the desirable radiotherapy parameters that can be directly input into the treatment planning system (TPS), impeding the full automation of radiotherapy. To enable more thorough automatic radiotherapy, in this paper, we propose a novel two-stage framew…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted by ISBI 2024

  48. arXiv:2402.18856  [pdf, other]

    eess.IV cs.CV

    Anatomy-guided fiber trajectory distribution estimation for cranial nerves tractography

    Authors: Lei Xie, Qingrun Zeng, Huajun Zhou, Guoqiang Xie, Mingchu Li, Jiahao Huang, Jianan Cui, Hao Chen, Yuanjing Feng

    Abstract: Diffusion MRI tractography is an important tool for identifying and analyzing the intracranial course of cranial nerves (CNs). However, the complex environment of the skull base leads to ambiguous spatial correspondence between diffusion directions and fiber geometry, and existing diffusion tractography methods of CNs identification are prone to producing erroneous trajectories and missing true po…

    Submitted 29 February, 2024; originally announced February 2024.

  49. arXiv:2402.18133  [pdf, other]

    cs.LG cs.CV

    Classes Are Not Equal: An Empirical Study on Image Recognition Fairness

    Authors: Jiequan Cui, Beier Zhu, Xin Wen, Xiaojuan Qi, Bei Yu, Hanwang Zhang

    Abstract: In this paper, we present an empirical study on image recognition fairness, i.e., extreme class accuracy disparity on balanced data like ImageNet. We experimentally demonstrate that classes are not equal and the fairness issue is prevalent for image classification models across various datasets, network architectures, and model capacities. Moreover, several intriguing properties of fairness are id…

    Submitted 12 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: CVPR 2024

  50. arXiv:2402.17159  [pdf, other]

    cs.CV

    NocPlace: Nocturnal Visual Place Recognition via Generative and Inherited Knowledge Transfer

    Authors: Bingxi Liu, Yiqun Wang, Huaqi Tao, Tingjun Huang, Fulin Tang, Yihong Wu, Jinqiang Cui, Hong Zhang

    Abstract: Visual Place Recognition (VPR) is crucial in computer vision, aiming to retrieve database images similar to a query image from an extensive collection of known images. However, like many vision tasks, VPR always degrades at night due to the scarcity of nighttime images. Moreover, VPR needs to address the cross-domain problem of night-to-day rather than just the issue of a single nighttime domain.…

    Submitted 21 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: 28 pages, 9 figures