Showing 1–50 of 383 results for author: Yin, Y

Searching in archive cs.
  1. arXiv:2407.06579  [pdf, other]

    cs.CL

    NoisyAG-News: A Benchmark for Addressing Instance-Dependent Noise in Text Classification

    Authors: Hongfei Huang, Tingting Liang, Xixi Sun, Zikang Jin, Yuyu Yin

    Abstract: Existing research on learning with noisy labels predominantly focuses on synthetic label noise. Although synthetic noise possesses well-defined structural properties, it often fails to accurately replicate real-world noise patterns. In recent years, there has been a concerted effort to construct generalizable and controllable instance-dependent noise datasets for image classification, significantl…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 20 pages, 13 figures

  2. arXiv:2407.03000  [pdf, other]

    cs.CL cs.CV

    VIVA: A Benchmark for Vision-Grounded Decision-Making with Human Values

    Authors: Zhe Hu, Yixiao Ren, Jing Li, Yu Yin

    Abstract: This paper introduces VIVA, a benchmark for VIsion-grounded decision-making driven by human VAlues. While most large vision-language models (VLMs) focus on physical-level skills, our work is the first to examine their multimodal capabilities in leveraging human values to make decisions under a vision-depicted situation. VIVA contains 1,062 images depicting diverse real-world situations and the man…

    Submitted 3 July, 2024; originally announced July 2024.

  3. arXiv:2407.02883  [pdf, other]

    cs.IR cs.CL

    CoIR: A Comprehensive Benchmark for Code Information Retrieval Models

    Authors: Xiangyang Li, Kuicai Dong, Yi Quan Lee, Wei Xia, Yichun Yin, Hao Zhang, Yong Liu, Yasheng Wang, Ruiming Tang

    Abstract: Despite the substantial success of Information Retrieval (IR) in various NLP tasks, most IR systems predominantly handle queries and corpora in natural language, neglecting the domain of code retrieval. Code retrieval is critically important yet remains under-explored, with existing methods and benchmarks inadequately representing the diversity of code in various domains and tasks. Addressing this…

    Submitted 3 July, 2024; originally announced July 2024.

  4. arXiv:2406.19643  [pdf, other]

    cs.CL cs.AI

    Unlocking Varied Perspectives: A Persona-Based Multi-Agent Framework with Debate-Driven Text Planning for Argument Generation

    Authors: Zhe Hu, Hou Pong Chan, Jing Li, Yu Yin

    Abstract: Writing persuasive arguments is a challenging task for both humans and machines. It entails incorporating high-level beliefs from various perspectives on the topic, along with deliberate reasoning and planning to construct a coherent narrative. Current language models often generate surface tokens autoregressively, lacking explicit integration of these underlying controls, resulting in limited out…

    Submitted 28 June, 2024; originally announced June 2024.

  5. arXiv:2406.18832  [pdf, other]

    cs.CL

    OutlierTune: Efficient Channel-Wise Quantization for Large Language Models

    Authors: Jinguang Wang, Yuexi Yin, Haifeng Sun, Qi Qi, Jingyu Wang, Zirui Zhuang, Tingting Yang, Jianxin Liao

    Abstract: Quantizing the activations of large language models (LLMs) has been a significant challenge due to the presence of structured outliers. Most existing methods focus on the per-token or per-tensor quantization of activations, making it difficult to achieve both accuracy and hardware efficiency. To address this problem, we propose OutlierTune, an efficient per-channel post-training quantization (PTQ)…

    Submitted 26 June, 2024; originally announced June 2024.

  6. arXiv:2406.18770  [pdf, other]

    cs.LG

    ADO-LLM: Analog Design Bayesian Optimization with In-Context Learning of Large Language Models

    Authors: Yuxuan Yin, Yu Wang, Boxun Xu, Peng Li

    Abstract: Analog circuit design requires substantial human expertise and involvement, which is a significant roadblock to design productivity. Bayesian Optimization (BO), a popular machine learning based optimization strategy, has been leveraged to automate analog design given its applicability across various circuit topologies and technologies. Traditional BO methods employ black box Gaussian Process surro…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 8 pages, 3 figures

  7. arXiv:2406.18536  [pdf, other]

    eess.SY cs.AI cs.AR

    Reliable Interval Prediction of Minimum Operating Voltage Based on On-chip Monitors via Conformalized Quantile Regression

    Authors: Yuxuan Yin, Xiaoxiao Wang, Rebecca Chen, Chen He, Peng Li

    Abstract: Predicting the minimum operating voltage ($V_{min}$) of chips is one of the important techniques for improving the manufacturing testing flow, as well as ensuring the long-term reliability and safety of in-field systems. Current $V_{min}$ prediction methods often provide only point estimates, necessitating additional techniques for constructing prediction confidence intervals to cover uncertaintie…

    Submitted 3 May, 2024; originally announced June 2024.

    Comments: Accepted by DATE 2024. Camera-ready version

  8. arXiv:2406.16505  [pdf, other]

    q-fin.CP cs.AI

    $\text{Alpha}^2$: Discovering Logical Formulaic Alphas using Deep Reinforcement Learning

    Authors: Feng Xu, Yan Yin, Xinyu Zhang, Tianyuan Liu, Shengyi Jiang, Zongzhang Zhang

    Abstract: Alphas are pivotal in providing signals for quantitative trading. The industry highly values the discovery of formulaic alphas for their interpretability and ease of analysis, compared with the expressive yet overfitting-prone black-box alphas. In this work, we focus on discovering formulaic alphas. Prior studies on automatically generating a collection of formulaic alphas were mostly based on gen…

    Submitted 26 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  9. arXiv:2406.16441  [pdf, other]

    cs.CL

    UniCoder: Scaling Code Large Language Model via Universal Code

    Authors: Tao Sun, Linzheng Chai, Jian Yang, Yuwei Yin, Hongcheng Guo, Jiaheng Liu, Bing Wang, Liqun Yang, Zhoujun Li

    Abstract: Intermediate reasoning or acting steps have successfully improved large language models (LLMs) for handling various downstream natural language processing (NLP) tasks. When applying LLMs for code generation, recent works mainly focus on directing the models to articulate intermediate natural-language reasoning steps, as in chain-of-thought (CoT) prompting, and then output code with the natural lan…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL 2024 (Main)

  10. PetalView: Fine-grained Location and Orientation Extraction of Street-view Images via Cross-view Local Search with Supplementary Materials

    Authors: Wenmiao Hu, Yichen Zhang, Yuxuan Liang, Xianjing Han, Yifang Yin, Hannes Kruppa, See-Kiong Ng, Roger Zimmermann

    Abstract: Satellite-based street-view information extraction by cross-view matching refers to a task that extracts the location and orientation information of a given street-view image query by using one or multiple geo-referenced satellite images. Recent work has initiated a new research direction to find accurate information within a local area covered by one satellite image centered at a location prior (…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted by ACM Multimedia 2023. This version contains additional supplementary materials

    Journal ref: Proceedings of the 31st ACM International Conference on Multimedia (2023) 56-66

  11. arXiv:2406.10671  [pdf]

    cs.CL

    Augmenting Biomedical Named Entity Recognition with General-domain Resources

    Authors: Yu Yin, Hyunjae Kim, Xiao Xiao, Chih Hsuan Wei, Jaewoo Kang, Zhiyong Lu, Hua Xu, Meng Fang, Qingyu Chen

    Abstract: Training a neural network-based biomedical named entity recognition (BioNER) model usually requires extensive and costly human annotations. While several studies have employed multi-task learning with multiple BioNER datasets to reduce human effort, this approach does not consistently yield performance improvements and may introduce label ambiguity in different biomedical corpora. We aim to tackle…

    Submitted 18 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: We make data, codes, and models publicly available via https://github.com/qingyu-qc/bioner_gerbera

  12. arXiv:2406.08756  [pdf, other]

    cs.DC cs.LG

    Optimizing Large Model Training through Overlapped Activation Recomputation

    Authors: Ping Chen, Wenjie Zhang, Shuibing He, Yingjie Gu, Zhuwei Peng, Kexin Huang, Xuan Zhan, Weijian Chen, Yi Zheng, Zhefeng Wang, Yanlong Yin, Gang Chen

    Abstract: Large model training has been using recomputation to alleviate the memory pressure and pipelining to exploit the parallelism of data, tensor, and devices. The existing recomputation approaches may incur up to 40% overhead when training real-world models, e.g., the GPT model with 22B parameters. This is because they are executed on demand in the critical training path. In this paper, we design a ne…

    Submitted 27 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 13 pages

  13. arXiv:2406.07436  [pdf, other]

    cs.PL

    McEval: Massively Multilingual Code Evaluation

    Authors: Linzheng Chai, Shukai Liu, Jian Yang, Yuwei Yin, Ke Jin, Jiaheng Liu, Tao Sun, Ge Zhang, Changyu Ren, Hongcheng Guo, Zekun Wang, Boyang Wang, Xianjie Wu, Bing Wang, Tongliang Li, Liqun Yang, Sufeng Duan, Zhoujun Li

    Abstract: Code large language models (LLMs) have shown remarkable advances in code understanding, completion, and generation tasks. Programming benchmarks, comprised of a selection of code challenges and corresponding test cases, serve as a standard to evaluate the capability of different LLMs in such tasks. However, most existing benchmarks primarily focus on Python and are still restricted to a limited nu…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 22 pages

  14. arXiv:2406.06382  [pdf, other]

    cs.CV cs.CL cs.LG

    Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization

    Authors: Yi Gu, Zhendong Wang, Yueqin Yin, Yujia Xie, Mingyuan Zhou

    Abstract: Aligning large language models with human preferences has emerged as a critical focus in language modeling research. Yet, integrating preference learning into Text-to-Image (T2I) generative models is still relatively uncharted territory. The Diffusion-DPO technique made initial strides by employing pairwise preference learning in diffusion models tailored for specific text prompts. We introduce Di…

    Submitted 10 June, 2024; originally announced June 2024.

  15. arXiv:2406.06279  [pdf, other]

    cs.CL

    Multi-Prompting Decoder Helps Better Language Understanding

    Authors: Zifeng Cheng, Zhaoling Chen, Zhiwei Jiang, Yafeng Yin, Shiping Ge, Yuliang Liu, Qing Gu

    Abstract: Recent Pre-trained Language Models (PLMs) usually only provide users with the inference APIs, namely the emerging Model-as-a-Service (MaaS) setting. To adapt MaaS PLMs to downstream tasks without accessing their parameters and gradients, some existing methods focus on the output-side adaptation of PLMs, viewing the PLM as an encoder and then optimizing a task-specific decoder for decoding the outp…

    Submitted 10 June, 2024; originally announced June 2024.

  16. arXiv:2406.01441  [pdf, other]

    cs.CL

    LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation

    Authors: Yongjing Yin, Jiali Zeng, Yafu Li, Fandong Meng, Yue Zhang

    Abstract: The fine-tuning of open-source large language models (LLMs) for machine translation has recently received considerable attention, marking a shift towards data-centric research from traditional neural machine translation. However, the area of data collection for instruction fine-tuning in machine translation remains relatively underexplored. In this paper, we present LexMatcher, a simple yet effect…

    Submitted 2 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  17. arXiv:2405.20830  [pdf, other]

    cs.CL cs.LG

    Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment

    Authors: Yueqin Yin, Zhendong Wang, Yujia Xie, Weizhu Chen, Mingyuan Zhou

    Abstract: Traditional language model alignment methods, such as Direct Preference Optimization (DPO), are limited by their dependence on static, pre-collected paired preference data, which hampers their adaptability and practical applicability. To overcome this limitation, we introduce Self-Augmented Preference Optimization (SAPO), an effective and scalable training paradigm that does not require existing p…

    Submitted 31 May, 2024; originally announced May 2024.

  18. arXiv:2405.19088  [pdf, other]

    cs.CL cs.CV

    Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions

    Authors: Zhe Hu, Tuo Liang, Jing Li, Yiren Lu, Yunlai Zhou, Yiran Qiao, Jing Ma, Yu Yin

    Abstract: Recent advancements in large multimodal language models have demonstrated remarkable proficiency across a wide range of tasks. Yet, these models still struggle with understanding the nuances of human humor through juxtaposition, particularly when it involves nonlinear narratives that underpin many jokes and humor cues. This paper investigates this challenge by focusing on comics with contradictory…

    Submitted 29 May, 2024; originally announced May 2024.

  19. arXiv:2405.17532  [pdf, other]

    cs.CV

    ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance

    Authors: Jiannan Huang, Jun Hao Liew, Hanshu Yan, Yuyang Yin, Yao Zhao, Yunchao Wei

    Abstract: Recent text-to-image customization works have been proven successful in generating images of given concepts by fine-tuning the diffusion models on a few examples. However, these methods tend to overfit the concepts, resulting in failure to create the concept under multiple conditions (e.g. headphone is missing when generating 'a <sks> dog wearing a headphone'). Interestingly, we notice that the bas…

    Submitted 27 May, 2024; originally announced May 2024.

  20. arXiv:2405.16645  [pdf, other]

    cs.CV

    Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models

    Authors: Hanwen Liang, Yuyang Yin, Dejia Xu, Hanxue Liang, Zhangyang Wang, Konstantinos N. Plataniotis, Yao Zhao, Yunchao Wei

    Abstract: The availability of large-scale multimodal datasets and advancements in diffusion models have significantly accelerated progress in 4D content generation. Most prior approaches rely on multiple image or video diffusion models, utilizing score distillation sampling for optimization or generating pseudo novel views for direct supervision. However, these methods are hindered by slow optimization spee…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: Project page: https://vita-group.github.io/Diffusion4D

  21. arXiv:2405.16093  [pdf, other]

    cs.CV

    Diverse Teacher-Students for Deep Safe Semi-Supervised Learning under Class Mismatch

    Authors: Qikai Wang, Rundong He, Yongshun Gong, Chunxiao Ren, Haoliang Sun, Xiaoshui Huang, Yilong Yin

    Abstract: Semi-supervised learning can significantly boost model performance by leveraging unlabeled data, particularly when labeled data is scarce. However, real-world unlabeled data often contain unseen-class samples, which can hinder the classification of seen classes. To address this issue, mainstream safe SSL methods suggest detecting and discarding unseen-class samples from unlabeled data. Nevertheles…

    Submitted 25 May, 2024; originally announced May 2024.

  22. arXiv:2405.12969  [pdf, other]

    cs.LG

    Can We Treat Noisy Labels as Accurate?

    Authors: Yuxiang Zheng, Zhongyi Han, Yilong Yin, Xin Gao, Tongliang Liu

    Abstract: Noisy labels significantly hinder the accuracy and generalization of machine learning models, particularly due to ambiguous instance features. Traditional techniques that attempt to correct noisy labels directly, such as those using transition matrices, often fail to address the inherent complexities of the problem sufficiently. In this paper, we introduce EchoAlign, a transformative paradigm shif…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 10 pages

  23. arXiv:2405.12788  [pdf, other]

    cs.CL

    What Have We Achieved on Non-autoregressive Translation?

    Authors: Yafu Li, Huajian Zhang, Jianhao Yan, Yongjing Yin, Yue Zhang

    Abstract: Recent advances have made non-autoregressive (NAT) translation comparable to autoregressive methods (AT). However, their evaluation using BLEU has been shown to weakly correlate with human annotations. Limited research compares non-autoregressive translation and autoregressive translation comprehensively, leaving uncertainty about the true proximity of NAT to AT. To address this gap, we systematic…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: ACL 2024 Findings

  24. arXiv:2405.12452  [pdf, other]

    cs.LG cs.AI

    Prompt-Enhanced Spatio-Temporal Graph Transfer Learning

    Authors: Junfeng Hu, Xu Liu, Zhencheng Fan, Yifang Yin, Shili Xiang, Savitha Ramasamy, Roger Zimmermann

    Abstract: Spatio-temporal graph neural networks have demonstrated efficacy in capturing complex dependencies for urban computing tasks such as forecasting and kriging. However, their performance is constrained by the reliance on extensive data for training on specific tasks, which limits their adaptability to new urban domains with varied demands. Although transfer learning has been proposed to address this…

    Submitted 20 May, 2024; originally announced May 2024.

  25. arXiv:2405.11895  [pdf, other]

    cs.LG eess.SY

    Sparse Attention-driven Quality Prediction for Production Process Optimization in Digital Twins

    Authors: Yanlei Yin, Lihua Wang, Wenbo Wang, Dinh Thai Hoang

    Abstract: In the process industry, optimizing production lines for long-term efficiency requires real-time monitoring and analysis of operation states to fine-tune production line parameters. However, the complexity in operational logic and the intricate coupling of production process parameters make it difficult to develop an accurate mathematical model for the entire process, thus hindering the deployment…

    Submitted 20 May, 2024; originally announced May 2024.

  26. arXiv:2405.03349  [pdf, other]

    cs.CV

    Retinexmamba: Retinex-based Mamba for Low-light Image Enhancement

    Authors: Jiesong Bai, Yuhao Yin, Qiyuan He, Yuanxian Li, Xiaofeng Zhang

    Abstract: In the field of low-light image enhancement, both traditional Retinex methods and advanced deep learning techniques such as Retinexformer have shown distinct advantages and limitations. Traditional Retinex methods, designed to mimic the human eye's perception of brightness and color, decompose images into illumination and reflection components but struggle with noise management and detail preserva…

    Submitted 19 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  27. arXiv:2405.02572  [pdf, other]

    cs.LG cs.AI

    Off-OAB: Off-Policy Policy Gradient Method with Optimal Action-Dependent Baseline

    Authors: Wenjia Meng, Qian Zheng, Long Yang, Yilong Yin, Gang Pan

    Abstract: Policy-based methods have achieved remarkable success in solving challenging reinforcement learning problems. Among these methods, off-policy policy gradient methods are particularly important due to that they can benefit from off-policy data. However, these methods suffer from the high variance of the off-policy policy gradient (OPPG) estimator, which results in poor sample efficiency during trai…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 12 pages, 3 figures

  28. arXiv:2404.18961  [pdf, other]

    cs.LG cs.AI cs.CV

    Unleashing the Power of Multi-Task Learning: A Comprehensive Survey Spanning Traditional, Deep, and Pretrained Foundation Model Eras

    Authors: Jun Yu, Yutong Dai, Xiaokang Liu, Jin Huang, Yishan Shen, Ke Zhang, Rong Zhou, Eashan Adhikarla, Wenxuan Ye, Yixin Liu, Zhaoming Kong, Kai Zhang, Yilong Yin, Vinod Namboodiri, Brian D. Davison, Jason H. Moore, Yong Chen

    Abstract: MTL is a learning paradigm that effectively leverages both task-specific and shared information to address multiple related tasks simultaneously. In contrast to STL, MTL offers a suite of benefits that enhance both the training process and the inference efficiency. MTL's key advantages encompass streamlined model architecture, performance enhancement, and cross-domain generalizability. Over the pa…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 60 figures, 116 pages, 500+ references

  29. arXiv:2404.18155  [pdf, other]

    cs.CV

    ShapeMoiré: Channel-Wise Shape-Guided Network for Image Demoiréing

    Authors: Jinming Cao, Sicheng Shen, Qiu Zhou, Yifang Yin, Yangyan Li, Roger Zimmermann

    Abstract: Photographing optoelectronic displays often introduces unwanted moiré patterns due to analog signal interference between the pixel grids of the display and the camera sensor arrays. This work identifies two problems that are largely ignored by existing image demoiréing approaches: 1) moiré patterns vary across different channels (RGB); 2) repetitive patterns are constantly observed. However, emplo…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 12 pages

  30. arXiv:2404.10096  [pdf, other]

    cs.CV cs.AI

    Vision Augmentation Prediction Autoencoder with Attention Design (VAPAAD)

    Authors: Yiqiao Yin

    Abstract: Recent advancements in sequence prediction have significantly improved the accuracy of video data interpretation; however, existing models often overlook the potential of attention-based mechanisms for next-frame prediction. This study introduces the Vision Augmentation Prediction Autoencoder with Attention Design (VAPAAD), an innovative approach that integrates attention mechanisms into sequence…

    Submitted 16 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: 12 pages, 4 figures

  31. arXiv:2404.04668  [pdf, ps, other]

    cs.DS math.PR

    Spectral Independence Beyond Total Influence on Trees and Related Graphs

    Authors: Xiaoyu Chen, Xiongxin Yang, Yitong Yin, Xinyuan Zhang

    Abstract: We study how to establish $\textit{spectral independence}$, a key concept in sampling, without relying on total influence bounds, by applying an $\textit{approximate inverse}$ of the influence matrix. Our method gives constant upper bounds on spectral independence for two foundational Gibbs distributions known to have unbounded total influences: $\bullet$ The monomer-dimer model on graphs with l…

    Submitted 6 April, 2024; originally announced April 2024.

  32. arXiv:2404.01157  [pdf, other]

    cs.CL cs.PF

    Green AI: Exploring Carbon Footprints, Mitigation Strategies, and Trade Offs in Large Language Model Training

    Authors: Vivian Liu, Yiqiao Yin

    Abstract: Prominent works in the field of Natural Language Processing have long attempted to create new innovative models by improving upon previous model training approaches, altering model architecture, and developing more in-depth datasets to better their performance. However, with the quickly advancing field of NLP comes increased greenhouse gas emissions, posing concerns over the environmental damage c…

    Submitted 1 April, 2024; originally announced April 2024.

  33. arXiv:2404.00323  [pdf, other]

    cs.CV cs.LG

    CLIP-driven Outliers Synthesis for few-shot OOD detection

    Authors: Hao Sun, Rundong He, Zhongyi Han, Zhicong Lin, Yongshun Gong, Yilong Yin

    Abstract: Few-shot OOD detection focuses on recognizing out-of-distribution (OOD) images that belong to classes unseen during training, with the use of only a small number of labeled in-distribution (ID) images. Up to now, a mainstream strategy is based on large-scale vision-language models, such as CLIP. However, these methods overlook a crucial issue: the lack of reliable OOD supervision information, whic…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 9 pages, 5 figures

  34. arXiv:2403.19899  [pdf, other]

    cs.IR

    Inclusive Design Insights from a Preliminary Image-Based Conversational Search Systems Evaluation

    Authors: Yue Zheng, Lei Yu, Junmian Chen, Tianyu Xia, Yuanyuan Yin, Shan Wang, Haiming Liu

    Abstract: The digital realm has witnessed the rise of various search modalities, among which the Image-Based Conversational Search System stands out. This research delves into the design, implementation, and evaluation of this specific system, juxtaposing it against its text-based and mixed counterparts. A diverse participant cohort ensures a broad evaluation spectrum. Advanced tools facilitate emotion anal…

    Submitted 28 March, 2024; originally announced March 2024.

  35. arXiv:2403.19470  [pdf, other]

    math.NA cs.LG eess.SP

    Deep decomposition method for the limited aperture inverse obstacle scattering problem

    Authors: Yunwen Yin, Liang Yan

    Abstract: In this paper, we consider a deep learning approach to the limited aperture inverse obstacle scattering problem. It is well known that traditional deep learning relies solely on data, which may limit its performance for the inverse problem when only indirect observation data and a physical model are available. A fundamental question arises in light of these limitations: is it possible to enable de…

    Submitted 28 March, 2024; originally announced March 2024.

  36. arXiv:2403.17556  [pdf, other]

    cs.CL cs.AI

    m3P: Towards Multimodal Multilingual Translation with Multimodal Prompt

    Authors: Jian Yang, Hongcheng Guo, Yuwei Yin, Jiaqi Bai, Bing Wang, Jiaheng Liu, Xinnian Liang, Linzheng Cahi, Liqun Yang, Zhoujun Li

    Abstract: Multilingual translation supports multiple translation directions by projecting all languages in a shared space, but the translation quality is undermined by the difference between languages in the text-only modality, especially when the number of languages is large. To bridge this gap, we introduce visual context as the universal language-independent representation to facilitate multilingual tran…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: COLING 2024

  37. arXiv:2403.17507  [pdf, other]

    cs.LG physics.chem-ph

    EL-MLFFs: Ensemble Learning of Machine Leaning Force Fields

    Authors: Bangchen Yin, Yue Yin, Yuda W. Tang, Hai Xiao

    Abstract: Machine learning force fields (MLFFs) have emerged as a promising approach to bridge the accuracy of quantum mechanical methods and the efficiency of classical force fields. However, the abundance of MLFF models and the challenge of accurately predicting atomic forces pose significant obstacles in their practical application. In this paper, we propose a novel ensemble learning framework, EL-MLFFs,…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: 12 pages, 3 figures

  38. arXiv:2403.10737  [pdf, other]

    cs.CV

    Leveraging Synthetic Data for Generalizable and Fair Facial Action Unit Detection

    Authors: Liupei Lu, Yufeng Yin, Yuming Gu, Yizhen Wu, Pratusha Prasad, Yajie Zhao, Mohammad Soleymani

    Abstract: Facial action unit (AU) detection is a fundamental block for objective facial expression analysis. Supervised learning approaches require a large amount of manual labeling which is costly. The limited labeled data are also not diverse in terms of gender which can affect model fairness. In this paper, we propose to use synthetically generated data and multi-source domain adaptation (MSDA) to addres…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: The work was done in 2021

  39. arXiv:2403.10301  [pdf, other]

    cs.CL cs.CV

    Uni-SMART: Universal Science Multimodal Analysis and Research Transformer

    Authors: Hengxing Cai, Xiaochen Cai, Shuwen Yang, Jiankun Wang, Lin Yao, Zhifeng Gao, Junhan Chang, Sihang Li, Mingjun Xu, Changxin Wang, Hongshuai Wang, Yongge Li, Mujie Lin, Yaqi Li, Yuqi Yin, Linfeng Zhang, Guolin Ke

    Abstract: In scientific research and its application, scientific literature analysis is crucial as it allows researchers to build on the work of others. However, the fast growth of scientific knowledge has led to a massive increase in scholarly articles, making in-depth literature analysis increasingly challenging and time-consuming. The emergence of Large Language Models (LLMs) has offered a new way to add…

    Submitted 15 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

  40. arXiv:2403.06041  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG cs.MA

    MATRIX: Multi-Agent Trajectory Generation with Diverse Contexts

    Authors: Zhuo Xu, Rui Zhou, Yida Yin, Huidong Gao, Masayoshi Tomizuka, Jiachen Li

    Abstract: Data-driven methods have great advantages in modeling complicated human behavioral dynamics and dealing with many human-robot interaction applications. However, collecting massive and annotated real-world human datasets has been a laborious task, especially for highly interactive scenarios. On the other hand, algorithmic data generation methods are usually limited by their model capacities, making…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: IEEE International Conference on Robotics and Automation (ICRA 2024)

  41. arXiv:2403.01976  [pdf, other]

    cs.CL

    SciAssess: Benchmarking LLM Proficiency in Scientific Literature Analysis

    Authors: Hengxing Cai, Xiaochen Cai, Junhan Chang, Sihang Li, Lin Yao, Changxin Wang, Zhifeng Gao, Hongshuai Wang, Yongge Li, Mujie Lin, Shuwen Yang, Jiankun Wang, Mingjun Xu, Jin Huang, Fang Xi, Jiaxi Zhuang, Yuqi Yin, Yaqi Li, Changhong Chen, Zheng Cheng, Zifeng Zhao, Linfeng Zhang, Guolin Ke

    Abstract: Recent breakthroughs in Large Language Models (LLMs) have revolutionized natural language understanding and generation, sparking significant interest in applying them to scientific literature analysis. However, existing benchmarks fail to adequately evaluate the proficiency of LLMs in this domain, particularly in scenarios requiring higher-level abilities beyond mere memorization and the handling…

    Submitted 18 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  42. arXiv:2402.17081  [pdf, other]

    cs.IR

    A Fine-tuning Enhanced RAG System with Quantized Influence Measure as AI Judge

    Authors: Keshav Rangan, Yiqiao Yin

    Abstract: This study presents an innovative enhancement to retrieval-augmented generation (RAG) systems by seamlessly integrating fine-tuned large language models (LLMs) with vector databases. This integration capitalizes on the combined strengths of structured data retrieval and the nuanced comprehension provided by advanced LLMs. Central to our approach are the LoRA and QLoRA methodologies, which stand at…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: 17 pages, 4 figures

  43. arXiv:2402.15969  [pdf, other]

    cs.NE

    Efficient Online Learning for Networks of Two-Compartment Spiking Neurons

    Authors: Yujia Yin, Xinyi Chen, Chenxiang Ma, Jibin Wu, Kay Chen Tan

    Abstract: The brain-inspired Spiking Neural Networks (SNNs) have garnered considerable research interest due to their superior performance and energy efficiency in processing temporal signals. Recently, a novel multi-compartment spiking neuron model, namely the Two-Compartment LIF (TC-LIF) model, has been proposed and exhibited a remarkable capacity for sequential modelling. However, training the TC-LIF mod…

    Submitted 24 February, 2024; originally announced February 2024.

  44. arXiv:2402.13435  [pdf, other]

    cs.IR cs.LG

    Learning to Retrieve for Job Matching

    Authors: Jianqiang Shen, Yuchin Juan, Shaobo Zhang, Ping Liu, Wen Pu, Sriram Vasudevan, Qingquan Song, Fedor Borisyuk, Kay Qianqi Shen, Haichao Wei, Yunxiang Ren, Yeou S. Chiou, Sicong Kuang, Yuan Yin, Ben Zheng, Muchen Wu, Shaghayegh Gharghabi, Xiaoqing Wang, Huichao Xue, Qi Guo, Daniel Hewlett, Luke Simon, Liangjie Hong, Wenjing Zhang

    Abstract: Web-scale search systems typically tackle the scalability challenge with a two-step paradigm: retrieval and ranking. The retrieval step, also known as candidate selection, often involves extracting standardized entities, creating an inverted index, and performing term matching for retrieval. Such traditional methods require manual and time-consuming development of query models. In this paper, we d…

    Submitted 20 February, 2024; originally announced February 2024.

  45. arXiv:2402.10958  [pdf, other]

    cs.CL cs.AI cs.LG

    Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts

    Authors: Yueqin Yin, Zhendong Wang, Yi Gu, Hai Huang, Weizhu Chen, Mingyuan Zhou

    Abstract: In the field of large language models (LLMs), aligning models with the diverse preferences of users is a critical challenge. Direct Preference Optimization (DPO) has played a key role in this area. It works by using pairs of preferences derived from the same prompts, and it functions without needing an additional reward model. However, DPO does not fully reflect the complex nature of human learnin…

    Submitted 27 May, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  46. EmoWear: Exploring Emotional Teasers for Voice Message Interaction on Smartwatches

    Authors: Pengcheng An, Jiawen Zhu, Zibo Zhang, Yifei Yin, Qingyuan Ma, Che Yan, Linghao Du, Jian Zhao

    Abstract: Voice messages, by nature, prevent users from gauging the emotional tone without fully diving into the audio content. This hinders the shared emotional experience at the pre-retrieval stage. Research has scarcely explored "Emotional Teasers": pre-retrieval cues offering a glimpse into an awaiting message's emotional tone without disclosing its content. We introduce EmoWear, a smartwatch voice messaging…

    Submitted 11 February, 2024; originally announced February 2024.

    Comments: To appear at ACM CHI '24

  47. arXiv:2402.06123  [pdf, other]

    cs.DC

    Decentralized Proactive Model Offloading and Resource Allocation for Split and Federated Learning

    Authors: Binbin Huang, Hailiang Zhao, Lingbin Wang, Wenzhuo Qian, Yuyu Yin, Shuiguang Deng

    Abstract: In the resource-constrained IoT-edge environment, Split Federated (SplitFed) learning is implemented to enhance training efficiency. This method involves each IoT device dividing its full DNN model at a designated layer into a device-side model and a server-side model, then offloading the latter to the edge server. However, existing research overlooks four critical issues as follows: (1) the heter…

    Submitted 8 February, 2024; originally announced February 2024.

  48. arXiv:2401.12915  [pdf, other]

    cs.AI cs.CL cs.CV

    Red Teaming Visual Language Models

    Authors: Mukai Li, Lei Li, Yuwei Yin, Masood Ahmed, Zhenguang Liu, Qi Liu

    Abstract: VLMs (Vision-Language Models) extend the capabilities of LLMs (Large Language Models) to accept multimodal inputs. Since it has been verified that LLMs can be induced to generate harmful or inaccurate content through specific test cases (termed as Red Teaming), how VLMs perform in similar scenarios, especially with their combination of textual and visual inputs, remains a question. To explore this…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Work in progress

  49. arXiv:2401.11140  [pdf, other]

    cs.CV cs.AI

    Stability Plasticity Decoupled Fine-tuning For Few-shot end-to-end Object Detection

    Authors: Yuantao Yin, Ping Yin

    Abstract: Few-shot object detection (FSOD) aims to design methods to adapt object detectors efficiently with only a few annotated samples. Fine-tuning has been shown to be an effective and practical approach. However, previous works often take the classical base-novel two-stage fine-tuning procedure but ignore the implicit stability-plasticity contradiction among different modules. Specifically, the random re-…

    Submitted 20 January, 2024; originally announced January 2024.

  50. arXiv:2401.09725  [pdf, ps, other]

    cs.IR cs.MM

    Enhancing Image-Text Matching with Adaptive Feature Aggregation

    Authors: Zuhui Wang, Yunting Yin, I. V. Ramakrishnan

    Abstract: Image-text matching aims to find matched cross-modal pairs accurately. While current methods often rely on projecting cross-modal features into a common embedding space, they frequently suffer from imbalanced feature representations across different modalities, leading to unreliable retrieval results. To address these limitations, we introduce a novel Feature Enhancement Module that adaptively agg…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024