Skip to main content

Showing 1–50 of 611 results for author: Yan, J

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.16015  [pdf, other

    cs.CR cs.CV cs.MM

    Wallcamera: Reinventing the Wheel?

    Authors: Aurélien Bourquard, Jeff Yan

    Abstract: Developed at MIT CSAIL, the Wallcamera has captivated the public's imagination. Here, we show that the key insight underlying the Wallcamera is the same one that underpins the concept and the prototype of differential imaging forensics (DIF), both of which were validated and reported several years prior to the Wallcamera's debut. Rather than being the first to extract and amplify invisible signals… ▽ More

    Submitted 22 July, 2024; originally announced July 2024.

  2. LinSATNet: The Positive Linear Satisfiability Neural Networks

    Authors: Runzhong Wang, Yunhao Zhang, Ziao Guo, Tianyi Chen, Xiaokang Yang, Junchi Yan

    Abstract: Encoding constraints into neural networks is attractive. This paper studies how to introduce the popular positive linear satisfiability to neural networks. We propose the first differentiable satisfiability layer based on an extension of the classic Sinkhorn algorithm for jointly encoding multiple sets of marginal distributions. We further theoretically characterize the convergence property of the… ▽ More

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: This is a revised version of our ICML'23 publication that fixes a minor issue in Eq (11). In Proceedings of the 40th International Conference on Machine Learning (ICML'23)

  3. arXiv:2407.12189  [pdf, other

    cs.RO

    Wheeled Humanoid Bilateral Teleoperation with Position-Force Control Modes for Dynamic Loco-Manipulation

    Authors: Amartya Purushottam, Jack Yan, Christopher Xu, Youngwoo Sim, Joao Ramos

    Abstract: Remote-controlled humanoid robots can revolutionize manufacturing, construction, and healthcare industries by performing complex or dangerous manual tasks traditionally done by humans. We refer to these behaviors as Dynamic Loco-Manipulation (DLM). To successfully complete these tasks, humans control the position of their bodies and contact forces at their hands. To enable similar whole-body contr… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

  4. arXiv:2407.11499  [pdf, other

    cs.CV

    Bridge Past and Future: Overcoming Information Asymmetry in Incremental Object Detection

    Authors: Qijie Mo, Yipeng Gao, Shenghao Fu, Junkai Yan, Ancong Wu, Wei-Shi Zheng

    Abstract: In incremental object detection, knowledge distillation has been proven to be an effective way to alleviate catastrophic forgetting. However, previous works focused on preserving the knowledge of old models, ignoring that images could simultaneously contain categories from past, present, and future stages. The co-occurrence of objects makes the optimization objectives inconsistent across different… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  5. GeoMix: Towards Geometry-Aware Data Augmentation

    Authors: Wentao Zhao, Qitian Wu, Chenxiao Yang, Junchi Yan

    Abstract: Mixup has shown considerable success in mitigating the challenges posed by limited labeled data in image classification. By synthesizing samples through the interpolation of features and labels, Mixup effectively addresses the issue of data scarcity. However, it has rarely been explored in graph learning tasks due to the irregularity and connectivity of graph data. Specifically, in node classifica… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Published as a conference paper at KDD 2024

  6. arXiv:2407.09790  [pdf, other

    cs.LG

    Team up GBDTs and DNNs: Advancing Efficient and Effective Tabular Prediction with Tree-hybrid MLPs

    Authors: Jiahuan Yan, Jintai Chen, Qianxing Wang, Danny Z. Chen, Jian Wu

    Abstract: Tabular datasets play a crucial role in various applications. Thus, developing efficient, effective, and widely compatible prediction algorithms for tabular data is important. Currently, two prominent model types, Gradient Boosted Decision Trees (GBDTs) and Deep Neural Networks (DNNs), have demonstrated performance advantages on distinct tabular prediction tasks. However, selecting an effective mo… ▽ More

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: Accepted at KDD 2024 Research Track, codes will be available at https://github.com/jyansir/tmlp

  7. arXiv:2407.06677  [pdf, other

    cs.CL

    Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules

    Authors: Zhuocheng Gong, Ang Lv, Jian Guan, Junxi Yan, Wei Wu, Huishuai Zhang, Minlie Huang, Dongyan Zhao, Rui Yan

    Abstract: Is it always necessary to compute tokens from shallow to deep layers in Transformers? The continued success of vanilla Transformers and their variants suggests an undoubted "yes". In this work, however, we attempt to break the depth-ordered convention by proposing a novel architecture dubbed mixture-of-modules (MoM), which is motivated by an intuition that any layer, regardless of its position, ca… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

  8. arXiv:2407.03658  [pdf, other

    cs.CL

    GPT-4 vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels

    Authors: Jianhao Yan, Pingchuan Yan, Yulong Chen, Judy Li, Xianchao Zhu, Yue Zhang

    Abstract: This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains. Through carefully designed annotation rounds, we find that GPT-4 performs comparably to junior translators in terms of total errors made but lags behind medium and senior translators. We a… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  9. arXiv:2407.03625  [pdf, other

    cs.SE

    Augmenting LLMs to Repair Obsolete Test Cases with Static Collector and Neural Reranker

    Authors: Jun Liu, Jiwei Yan, Yuanyuan Xie, Jun Yan, Jian Zhang

    Abstract: During software evolution, it is advocated that test code should co-evolve with production code. In real development scenarios, test updating may lag behind production code changing, which may cause the project to fail to compile or bring other troubles. Existing techniques based on pre-trained language models can be adopted to repair obsolete tests caused by such unsynchronized code changes, espe… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  10. arXiv:2407.02888  [pdf, ps, other

    cs.LG cs.AI

    Joint Optimization of Resource Allocation and Data Selection for Fast and Cost-Efficient Federated Edge Learning

    Authors: Yunjian Jia, Zhen Huang, Jiping Yan, Yulu Zhang, Kun Luo, Wanli Wen

    Abstract: Deploying federated learning at the wireless edge introduces federated edge learning (FEEL). Given FEEL's limited communication resources and potential mislabeled data on devices, improper resource allocation or data selection can hurt convergence speed and increase training costs. Thus, to realize an efficient FEEL system, this paper emphasizes jointly optimizing resource allocation and data sele… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  11. arXiv:2407.01891  [pdf, other

    cs.RO eess.SY

    Refined Motion Compensation with Soft Laser Manipulators using Data-Driven Surrogate Models

    Authors: Yongjun Yan, Qingpeng Ding, Mingwu Li, Junyan Yan, Shing Shin Cheng

    Abstract: Non-contact laser ablation, a precise thermal technique, simultaneously cuts and coagulates tissue without the insertion errors associated with rigid needles. Human organ motions, such as those in the liver, exhibit rhythmic components influenced by respiratory and cardiac cycles, making effective laser energy delivery to target lesions while compensating for tumor motion crucial. This research in… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  12. arXiv:2406.17470  [pdf, other

    cs.LG cs.AI cs.DC cs.IT

    Dynamic Scheduling for Vehicle-to-Vehicle Communications Enhanced Federated Learning

    Authors: Jintao Yan, Tan Chen, Yuxuan Sun, Zhaojun Nan, Sheng Zhou, Zhisheng Niu

    Abstract: Leveraging the computing and sensing capabilities of vehicles, vehicular federated learning (VFL) has been applied to edge training for connected vehicles. The dynamic and interconnected nature of vehicular networks presents unique opportunities to harness direct vehicle-to-vehicle (V2V) communications, enhancing VFL training efficiency. In this paper, we formulate a stochastic optimization proble… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE for possible publication

  13. arXiv:2406.16949  [pdf, other

    cs.LG

    Fair Differentiable Neural Network Architecture Search for Long-Tailed Data with Self-Supervised Learning

    Authors: Jiaming Yan

    Abstract: Recent advancements in artificial intelligence (AI) have positioned deep learning (DL) as a pivotal technology in fields like computer vision, data mining, and natural language processing. A critical factor in DL performance is the selection of neural network architecture. Traditional predefined architectures often fail to adapt to different data distributions, making it challenging to achieve opt… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  14. arXiv:2406.16367  [pdf, other

    cs.IR

    On the Role of Long-tail Knowledge in Retrieval Augmented Large Language Models

    Authors: Dongyang Li, Junbing Yan, Taolin Zhang, Chengyu Wang, Xiaofeng He, Longtao Huang, Hui Xue, Jun Huang

    Abstract: Retrieval augmented generation (RAG) exhibits outstanding performance in promoting the knowledge capabilities of large language models (LLMs) with retrieved documents related to user queries. However, RAG only focuses on improving the response quality of LLMs via enhancing queries indiscriminately with retrieved information, paying little attention to what type of knowledge LLMs really need to ans… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  15. arXiv:2406.15836  [pdf, other

    cs.LG cs.AI cs.MA

    Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models

    Authors: Yang Zhang, Chenjia Bai, Bin Zhao, Junchi Yan, Xiu Li, Xuelong Li

    Abstract: Learning a world model for model-free Reinforcement Learning (RL) agents can significantly improve the sample efficiency by learning policies in imagination. However, building a world model for Multi-Agent RL (MARL) can be particularly challenging due to the scalability issue in a centralized architecture arising from a large number of agents, and also the non-stationarity issue in a decentralized… ▽ More

    Submitted 22 June, 2024; originally announced June 2024.

  16. arXiv:2406.13945  [pdf, other

    cs.AI cs.CL cs.LG

    CityBench: Evaluating the Capabilities of Large Language Model as World Model

    Authors: Jie Feng, Jun Zhang, Junbo Yan, Xin Zhang, Tianjian Ouyang, Tianhui Liu, Yuwei Du, Siqi Guo, Yong Li

    Abstract: Large language models (LLMs) with powerful generalization ability has been widely used in many domains. A systematic and reliable evaluation of LLMs is a crucial step in their development and applications, especially for specific professional fields. In the urban domain, there have been some early explorations about the usability of LLMs, but a systematic and scalable evaluation benchmark is still… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  17. arXiv:2406.13358  [pdf, other

    cs.CV eess.IV

    Multi-scale Restoration of Missing Data in Optical Time-series Images with Masked Spatial-Temporal Attention Network

    Authors: Zaiyan Zhang, Jining Yan, Yuanqi Liang, Jiaxin Feng, Haixu He, Wei Han

    Abstract: Due to factors such as thick cloud cover and sensor limitations, remote sensing images often suffer from significant missing data, resulting in incomplete time-series information. Existing methods for imputing missing values in remote sensing images do not fully exploit spatio-temporal auxiliary information, leading to limited accuracy in restoration. Therefore, this paper proposes a novel deep le… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  18. arXiv:2406.11633  [pdf, other

    cs.CV

    DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models

    Authors: Renqiu Xia, Song Mao, Xiangchao Yan, Hongbin Zhou, Bo Zhang, Haoyang Peng, Jiahao Pi, Daocheng Fu, Wenjie Wu, Hancheng Ye, Shiyang Feng, Bin Wang, Chao Xu, Conghui He, Pinlong Cai, Min Dou, Botian Shi, Sheng Zhou, Yongwei Wang, Bin Wang, Junchi Yan, Fei Wu, Yu Qiao

    Abstract: Scientific documents record research findings and valuable human knowledge, comprising a vast corpus of high-quality data. Leveraging multi-modality data extracted from these documents and assessing large models' abilities to handle scientific document-oriented tasks is therefore meaningful. Despite promising advancements, large models still perform poorly on multi-page scientific document extract… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Homepage of DocGenome: https://unimodal4reasoning.github.io/DocGenome_page 22 pages, 11 figures

  19. arXiv:2406.10661  [pdf, other

    cs.AI cs.LG

    A GPU-accelerated Large-scale Simulator for Transportation System Optimization Benchmarking

    Authors: Jun Zhang, Wenxuan Ao, Junbo Yan, Depeng Jin, Yong Li

    Abstract: With the development of artificial intelligence techniques, transportation system optimization is evolving from traditional methods relying on expert experience to simulation and learning-based decision optimization methods. Learning-based optimization methods require extensive interaction with highly realistic microscopic traffic simulators for optimization. However, existing microscopic traffic… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS 2024 Datasets and Benchmarks Track

  20. arXiv:2406.09410  [pdf, other

    cs.CV cs.AI

    STAR: A First-Ever Dataset and A Large-Scale Benchmark for Scene Graph Generation in Large-Size Satellite Imagery

    Authors: Yansheng Li, Linlin Wang, Tingzhu Wang, Xue Yang, Junwei Luo, Qi Wang, Youming Deng, Wenbin Wang, Xian Sun, Haifeng Li, Bo Dang, Yongjun Zhang, Yi Yu, Junchi Yan

    Abstract: Scene graph generation (SGG) in satellite imagery (SAI) benefits promoting understanding of geospatial scenarios from perception to cognition. In SAI, objects exhibit great variations in scales and aspect ratios, and there exist rich relationships between objects (even between spatially disjoint objects), which makes it attractive to holistically conduct SGG in large-size very-high-resolution (VHR… ▽ More

    Submitted 3 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: 18 pages, 11 figures

  21. arXiv:2406.09385  [pdf, other

    cs.CV

    Towards Vision-Language Geo-Foundation Model: A Survey

    Authors: Yue Zhou, Litong Feng, Yiping Ke, Xue Jiang, Junchi Yan, Xue Yang, Wayne Zhang

    Abstract: Vision-Language Foundation Models (VLFMs) have made remarkable progress on various multimodal tasks, such as image captioning, image-text retrieval, visual question answering, and visual grounding. However, most methods rely on training with general image datasets, and the lack of geospatial data leads to poor performance on earth observation. Numerous geospatial image-text pair datasets and VLFMs… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 18 pages, 4 figures

  22. arXiv:2406.04963  [pdf, other

    cs.LG cs.AI

    Learning Divergence Fields for Shift-Robust Graph Representations

    Authors: Qitian Wu, Fan Nie, Chenxiao Yang, Junchi Yan

    Abstract: Real-world data generation often involves certain geometries (e.g., graphs) that induce instance-level interdependence. This characteristic makes the generalization of learning models more difficult due to the intricate interdependent patterns that impact data-generative distributions and can vary from training to testing. In this work, we propose a geometric diffusion model with learnable diverge… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2024. Source codes at https://github.com/fannie1208/GLIND

  23. arXiv:2406.03877  [pdf, other

    cs.RO cs.CV

    Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving

    Authors: Xiaosong Jia, Zhenjie Yang, Qifeng Li, Zhiyuan Zhang, Junchi Yan

    Abstract: In an era marked by the rapid scaling of foundation models, autonomous driving technologies are approaching a transformative threshold where end-to-end autonomous driving (E2E-AD) emerges due to its potential of scaling up in the data-driven manner. However, existing E2E-AD methods are mostly evaluated under the open-loop log-replay manner with L2 errors and collision rate as metrics (e.g., in nuS… ▽ More

    Submitted 11 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Fix typos in text and Table 4. More reference

  24. arXiv:2405.20583  [pdf, other

    cs.CG math.AT

    The Gestalt Computational Model

    Authors: Yu Chen, Hongwei Lin, Jiacong Yan

    Abstract: Widely employed in cognitive psychology, Gestalt theory elucidates basic principles in visual perception, but meanwhile presents significant challenges for computation. The advancement of artificial intelligence requires the emulation of human cognitive behavior, for which Gestalt theory serves as a fundamental framework describing human visual cognitive behavior. In this paper, we utilize persist… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  25. arXiv:2405.18132  [pdf, other

    cs.CV

    EG4D: Explicit Generation of 4D Object without Score Distillation

    Authors: Qi Sun, Zhiyang Guo, Ziyu Wan, Jing Nathan Yan, Shengming Yin, Wengang Zhou, Jing Liao, Houqiang Li

    Abstract: In recent years, the increasing demand for dynamic 3D assets in design and gaming applications has given rise to powerful generative pipelines capable of synthesizing high-quality 4D objects. Previous methods generally rely on score distillation sampling (SDS) algorithm to infer the unseen views and motion of 4D objects, thus leading to unsatisfactory results with defects like over-saturation and… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  26. arXiv:2405.16759  [pdf, other

    cs.CV cs.LG

    Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models

    Authors: Cristina N. Vasconcelos, Abdullah Rashwan, Austin Waters, Trevor Walker, Keyang Xu, Jimmy Yan, Rui Qian, Shixin Luo, Zarana Parekh, Andrew Bunner, Hongliang Fei, Roopal Garg, Mandy Guo, Ivana Kajic, Yeqing Li, Henna Nandwani, Jordi Pont-Tuset, Yasumasa Onoe, Sarah Rosston, Su Wang, Wenlei Zhou, Kevin Swersky, David J. Fleet, Jason M. Baldridge, Oliver Wang

    Abstract: We address the long-standing problem of how to learn effective pixel-based image diffusion models at scale, introducing a remarkably simple greedy growing method for stable training of large-scale, high-resolution models. without the needs for cascaded super-resolution components. The key insight stems from careful pre-training of core components, namely, those responsible for text-to-image alignm… ▽ More

    Submitted 26 May, 2024; originally announced May 2024.

  27. arXiv:2405.15908  [pdf, other

    cs.AI cs.CR cs.LG

    Knowledge-Informed Auto-Penetration Testing Based on Reinforcement Learning with Reward Machine

    Authors: Yuanliang Li, Hanzheng Dai, Jun Yan

    Abstract: Automated penetration testing (AutoPT) based on reinforcement learning (RL) has proven its ability to improve the efficiency of vulnerability identification in information systems. However, RL-based PT encounters several challenges, including poor sampling efficiency, intricate reward specification, and limited interpretability. To address these issues, we propose a knowledge-informed AutoPT frame… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

  28. arXiv:2405.14854  [pdf, other

    cs.CV cs.LG

    TerDiT: Ternary Diffusion Models with Transformers

    Authors: Xudong Lu, Aojun Zhou, Ziyi Lin, Qi Liu, Yuhui Xu, Renrui Zhang, Yafei Wen, Shuai Ren, Peng Gao, Junchi Yan, Hongsheng Li

    Abstract: Recent developments in large-scale pre-trained text-to-image diffusion models have significantly improved the generation of high-fidelity images, particularly with the emergence of diffusion models based on transformer architecture (DiTs). Among these diffusion models, diffusion transformers have demonstrated superior image generation capabilities, boosting lower FID scores and higher scalability.… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 18 pages, 13 figures

  29. arXiv:2405.12788  [pdf, other

    cs.CL

    What Have We Achieved on Non-autoregressive Translation?

    Authors: Yafu Li, Huajian Zhang, Jianhao Yan, Yongjing Yin, Yue Zhang

    Abstract: Recent advances have made non-autoregressive (NAT) translation comparable to autoregressive methods (AT). However, their evaluation using BLEU has been shown to weakly correlate with human annotations. Limited research compares non-autoregressive translation and autoregressive translation comprehensively, leaving uncertainty about the true proximity of NAT to AT. To address this gap, we systematic… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: ACL 2024 Findings

  30. arXiv:2405.12520  [pdf, other

    cs.DC

    MOSS: A Large-scale Open Microscopic Traffic Simulation System

    Authors: Jun Zhang, Wenxuan Ao, Junbo Yan, Can Rong, Depeng Jin, Wei Wu, Yong Li

    Abstract: In the research of Intelligent Transportation Systems (ITS), traffic simulation is a key procedure for the evaluation of new methods and optimization of strategies. However, existing traffic simulation systems face two challenges. First, how to balance simulation scale with realism is a dilemma. Second, it is hard to simulate realistic results, which requires realistic travel demand data and simul… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Submitted to IEEE ITSC 2024

  31. arXiv:2404.16952  [pdf, other

    cs.RO

    Simultaneous Estimation of Shape and Force along Highly Deformable Surgical Manipulators Using Sparse FBG Measurement

    Authors: Yiang Lu, Bin Li, Wei Chen, Junyan Yan, Shing Shin Cheng, Jiangliu Wang, Jianshu Zhou, Qi Dou, Yun-hui Liu

    Abstract: Recently, fiber optic sensors such as fiber Bragg gratings (FBGs) have been widely investigated for shape reconstruction and force estimation of flexible surgical robots. However, most existing approaches need precise model parameters of FBGs inside the fiber and their alignments with the flexible robots for accurate sensing results. Another challenge lies in online acquiring external forces at ar… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Accepted to ICRA 2024

  32. arXiv:2404.16687  [pdf, other

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng , et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte… ▽ More

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  33. arXiv:2404.13299  [pdf, other

    cs.CV

    PCQA: A Strong Baseline for AIGC Quality Assessment Based on Prompt Condition

    Authors: Xi Fang, Weigang Wang, Xiaoxin Lv, Jun Yan

    Abstract: The development of Large Language Models (LLM) and Diffusion Models brings the boom of Artificial Intelligence Generated Content (AIGC). It is essential to build an effective quality assessment framework to provide a quantifiable evaluation of different images or videos based on the AIGC technologies. The content generated by AIGC methods is driven by the crafted prompts. Therefore, it is intuitiv… ▽ More

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: Published in CVPR-2024's NTIRE: New Trends in Image Restoration and Enhancement workshop and challenges

  34. arXiv:2404.12612  [pdf, other

    cs.LG cs.CV

    SA-Attack: Speed-adaptive stealthy adversarial attack on trajectory prediction

    Authors: Huilin Yin, Jiaxiang Li, Pengju Zhen, Jun Yan

    Abstract: Trajectory prediction is critical for the safe planning and navigation of automated vehicles. The trajectory prediction models based on the neural networks are vulnerable to adversarial attacks. Previous attack methods have achieved high attack success rates but overlook the adaptability to realistic scenarios and the concealment of the deceits. To address this problem, we propose a speed-adaptive… ▽ More

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: This work is published in IEEE IV Symposium

  35. arXiv:2404.12097  [pdf, other

    eess.SY cs.LG

    MPC of Uncertain Nonlinear Systems with Meta-Learning for Fast Adaptation of Neural Predictive Models

    Authors: Jiaqi Yan, Ankush Chakrabarty, Alisa Rupenyan, John Lygeros

    Abstract: In this paper, we consider the problem of reference tracking in uncertain nonlinear systems. A neural State-Space Model (NSSM) is used to approximate the nonlinear system, where a deep encoder network learns the nonlinearity from data, and a state-space component captures the temporal relationship. This transforms the nonlinear system into a linear system in a latent space, enabling the applicatio… ▽ More

    Submitted 18 April, 2024; originally announced April 2024.

  36. arXiv:2404.11171  [pdf, other

    cs.LG cs.AI eess.SP

    Personalized Heart Disease Detection via ECG Digital Twin Generation

    Authors: Yaojun Hu, Jintai Chen, Lianting Hu, Dantong Li, Jiahuan Yan, Haochao Ying, Huiying Liang, Jian Wu

    Abstract: Heart diseases rank among the leading causes of global mortality, demonstrating a crucial need for early diagnosis and intervention. Most traditional electrocardiogram (ECG) based automated diagnosis methods are trained at population level, neglecting the customization of personalized ECGs to enhance individual healthcare management. A potential solution to address this limitation is to employ dig… ▽ More

    Submitted 11 May, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  37. A Phone-based Distributed Ambient Temperature Measurement System with An Efficient Label-free Automated Training Strategy

    Authors: Dayin Chen, Xiaodan Shi, Haoran Zhang, Xuan Song, Dongxiao Zhang, Yuntian Chen, Jinyue Yan

    Abstract: Enhancing the energy efficiency of buildings significantly relies on monitoring indoor ambient temperature. The potential limitations of conventional temperature measurement techniques, together with the omnipresence of smartphones, have redirected researchers'attention towards the exploration of phone-based ambient temperature estimation methods. However, existing phone-based methods face challen… ▽ More

    Submitted 17 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Journal ref: IEEE Transactions on Mobile Computing,13 May 2024, 1 - 13

  38. arXiv:2404.07882  [pdf, other

    cs.AR quant-ph

    On Reducing the Execution Latency of Superconducting Quantum Processors via Quantum Program Scheduling

    Authors: Wenjie Wu, Yiquan Wang, Ge Yan, Yuming Zhao, Junchi Yan

    Abstract: Quantum computing has gained considerable attention, especially after the arrival of the Noisy Intermediate-Scale Quantum (NISQ) era. Quantum processors and cloud services have been made world-wide increasingly available. Unfortunately, programs on existing quantum processors are often executed in series, and the workload could be heavy to the processor. Typically, one has to wait for hours or eve… ▽ More

    Submitted 11 April, 2024; originally announced April 2024.

  39. arXiv:2404.06119  [pdf, other

    cs.CV

    DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation

    Authors: Junkai Yan, Yipeng Gao, Qize Yang, Xihan Wei, Xuansong Xie, Ancong Wu, Wei-Shi Zheng

    Abstract: Text-to-3D generation, which synthesizes 3D assets according to an overall text description, has significantly progressed. However, a challenge arises when the specific appearances need customizing at designated viewpoints but referring solely to the overall description for generating 3D objects. For instance, ambiguity easily occurs when producing a T-shirt with distinct patterns on its front and… ▽ More

    Submitted 12 July, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: Accepted to ECCV 2024, camera ready version

  40. arXiv:2403.20002  [pdf, other

    cs.CV

    Grounding and Enhancing Grid-based Models for Neural Fields

    Authors: Zelin Zhao, Fenglei Fan, Wenlong Liao, Junchi Yan

    Abstract: Many contemporary studies utilize grid-based models for neural field representation, but a systematic analysis of grid-based models is still missing, hindering the improvement of those models. Therefore, this paper introduces a theoretical framework for grid-based models. This framework points out that these models' approximation and generalization behaviors are determined by grid tangent kernels… ▽ More

    Submitted 6 June, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: CVPR24 Oral & Best Paper Award Candidate. Pre-rebuttal scores: 555. Post-rebuttal scores: 555

  41. arXiv:2403.14660  [pdf

    cs.CY cs.AI

    Machina Economicus: A New Paradigm for Prosumers in the Energy Internet of Smart Cities

    Authors: Luyang Hou, Jun Yan, Yuankai Wu, Chun Wang, Tie Qiu

    Abstract: Energy Internet (EI) is emerging as new share economy platform for flexible local energy supplies in smart cities. Empowered by the Internet-of-Things (IoT) and Artificial Intelligence (AI), EI aims to unlock peer-to-peer energy trading and sharing among prosumers, who can adeptly switch roles between providers and consumers in localized energy markets with rooftop photovoltaic panels, vehicle-to-… ▽ More

    Submitted 27 February, 2024; originally announced March 2024.

    Comments: 25 pages, 1 figure

  42. arXiv:2403.13846  [pdf, other

    cs.LG cs.AI

    A Clustering Method with Graph Maximum Decoding Information

    Authors: Xinrun Xu, Manying Lv, Zhanbiao Lian, Yurong Wu, Jin Yan, Shan Jiang, Zhiming Ding

    Abstract: The clustering method based on graph models has garnered increased attention for its widespread applicability across various knowledge domains. Its adaptability to integrate seamlessly with other relevant applications endows the graph model-based clustering analysis with the ability to robustly extract "natural associations" or "graph structures" within datasets, facilitating the modelling of rela… ▽ More

    Submitted 18 April, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: 9 pages, 9 figures, IJCNN 2024

  43. arXiv:2403.13647  [pdf, other

    cs.CV

    Meta-Point Learning and Refining for Category-Agnostic Pose Estimation

    Authors: Junjie Chen, Jiebin Yan, Yuming Fang, Li Niu

    Abstract: Category-agnostic pose estimation (CAPE) aims to predict keypoints for arbitrary classes given a few support images annotated with keypoints. Existing methods only rely on the features extracted at support keypoints to predict or refine the keypoints on query image, but a few support feature vectors are local and inadequate for CAPE. Considering that human can quickly perceive potential keypoints… ▽ More

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Published in CVPR 2024

  44. arXiv:2403.13331  [pdf, other

    cs.CV cs.RO

    AMP: Autoregressive Motion Prediction Revisited with Next Token Prediction for Autonomous Driving

    Authors: Xiaosong Jia, Shaoshuai Shi, Zijun Chen, Li Jiang, Wenlong Liao, Tao He, Junchi Yan

    Abstract: As an essential task in autonomous driving (AD), motion prediction aims to predict the future states of surround objects for navigation. One natural solution is to estimate the position of other agents in a step-by-step manner where each predicted time-step is conditioned on both observed time-steps and previously predicted time-steps, i.e., autoregressive prediction. Pioneering works like SocialL… ▽ More

    Submitted 21 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  45. arXiv:2403.11380  [pdf, other

    cs.CV

    Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach

    Authors: Beichen Zhang, Xiaoxing Wang, Xiaohan Qin, Junchi Yan

    Abstract: Supernet is a core component in many recent Neural Architecture Search (NAS) methods. It not only helps embody the search space but also provides a (relative) estimation of the final performance of candidate architectures. Thus, it is critical that the top architectures ranked by a supernet should be consistent with those ranked by true performance, which is known as the order-preserving ability.… ▽ More

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  46. arXiv:2403.11203  [pdf, other

    cs.CL

    TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models

    Authors: Junbing Yan, Chengyu Wang, Taolin Zhang, Xiaofeng He, Jun Huang, Longtao Huang, Hui Xue, Wei Zhang

    Abstract: KEPLMs are pre-trained models that utilize external knowledge to enhance language understanding. Previous language models facilitated knowledge acquisition by incorporating knowledge-related pre-training tasks learned from relation triples in knowledge graphs. However, these models do not prioritize learning embeddings for entity-related tokens. Moreover, updating the entire set of parameters in K… ▽ More

    Submitted 17 March, 2024; originally announced March 2024.

  47. arXiv:2403.11121  [pdf, other

    cs.CV

    A Versatile Framework for Multi-scene Person Re-identification

    Authors: Wei-Shi Zheng, Junkai Yan, Yi-Xing Peng

    Abstract: Person Re-identification (ReID) has been extensively developed for a decade in order to learn the association of images of the same person across non-overlapping camera views. To overcome significant variations between images across camera views, mountains of variants of ReID models were developed for solving a number of challenges, such as resolution change, clothing change, occlusion, modality c… ▽ More

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: To appear in TPAMI

  48. arXiv:2403.11090  [pdf, other

    cs.NI cs.LG

    Brain-on-Switch: Towards Advanced Intelligent Network Data Plane via NN-Driven Traffic Analysis at Line-Speed

    Authors: Jinzhu Yan, Haotian Xu, Zhuotao Liu, Qi Li, Ke Xu, Mingwei Xu, Jianping Wu

    Abstract: The emerging programmable networks sparked significant research on Intelligent Network Data Plane (INDP), which achieves learning-based traffic analysis at line-speed. Prior art in INDP focus on deploying tree/forest models on the data plane. We observe a fundamental limitation in tree-based INDP approaches: although it is possible to represent even larger tree/forest tables on the data plane, the… ▽ More

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: 12 pages body, 22 pages total, 14 figures, accepted by the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI'24)

  49. arXiv:2403.10299  [pdf, other

    cs.AI

    A Multi-constraint and Multi-objective Allocation Model for Emergency Rescue in IoT Environment

    Authors: Xinrun Xu, Zhanbiao Lian, Yurong Wu, Manying Lv, Zhiming Ding, Jian Yan, Shang Jiang

    Abstract: Emergency relief operations are essential in disaster aftermaths, necessitating effective resource allocation to minimize negative impacts and maximize benefits. In prolonged crises or extensive disasters, a systematic, multi-cycle approach is key for timely and informed decision-making. Leveraging advancements in IoT and spatio-temporal data analytics, we've developed the Multi-Objective Shuffled… ▽ More

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: 5 pages, 5 figures, ISCAS 2024

  50. arXiv:2403.10069  [pdf, other

    cs.CV cs.AI

    Boundary Matters: A Bi-Level Active Finetuning Framework

    Authors: Han Lu, Yichen Xie, Xiaokang Yang, Junchi Yan

    Abstract: The pretraining-finetuning paradigm has gained widespread adoption in vision tasks and other fields, yet it faces the significant challenge of high sample annotation costs. To mitigate this, the concept of active finetuning has emerged, aiming to select the most appropriate samples for model finetuning within a limited budget. Traditional active learning methods often struggle in this setting due… ▽ More

    Submitted 15 March, 2024; originally announced March 2024.