Showing 1–50 of 99 results for author: Zeng, C

  1. arXiv:2407.14882  [pdf, other]

    cs.LG cs.AI math.NA

    Reduced Effectiveness of Kolmogorov-Arnold Networks on Functions with Noise

    Authors: Haoran Shen, Chen Zeng, Jiahui Wang, Qiao Wang

    Abstract: It has been observed that even a small amount of noise introduced into the dataset can significantly degrade the performance of KAN. In this brief note, we aim to quantitatively evaluate the performance when noise is added to the dataset. We propose an oversampling technique combined with denoising to alleviate the impact of noise. Specifically, we employ kernel filtering based on diffusion maps f…

    Submitted 20 July, 2024; originally announced July 2024.

    MSC Class: 68T07

  2. arXiv:2407.05608  [pdf, other]

    cs.SD cs.CL eess.AS

    A Benchmark for Multi-speaker Anonymization

    Authors: Xiaoxiao Miao, Ruijie Tao, Chang Zeng, Xin Wang

    Abstract: Privacy-preserving voice protection approaches primarily suppress privacy-related information derived from paralinguistic attributes while preserving the linguistic content. Existing solutions focus on single-speaker scenarios. However, they lack practicality for real-world applications, i.e., multi-speaker scenarios. In this paper, we present an initial attempt to provide a multi-speaker anonymiz…

    Submitted 8 July, 2024; originally announced July 2024.

  3. arXiv:2407.00928  [pdf, other]

    cs.LG cs.CL

    FoldGPT: Simple and Effective Large Language Model Compression Scheme

    Authors: Songwei Liu, Chao Zeng, Lianqiang Li, Chenqian Yan, Lean Fu, Xing Mei, Fangmin Chen

    Abstract: The demand for deploying large language models (LLMs) on mobile devices continues to increase, driven by escalating data security concerns and cloud costs. However, network bandwidth and memory limitations pose challenges for deploying billion-level models on mobile devices. In this study, we investigate the outputs of different layers across various scales of LLMs and find that the outputs of mos…

    Submitted 30 June, 2024; originally announced July 2024.

  4. arXiv:2406.05316  [pdf, other]

    cs.LG

    C-Mamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting

    Authors: Chaolv Zeng, Zhanyu Liu, Guanjie Zheng, Linghe Kong

    Abstract: In recent years, significant progress has been made in multivariate time series forecasting using Linear-based, Transformer-based, and Convolution-based models. However, these approaches face notable limitations: linear forecasters struggle with representation capacities, attention mechanisms suffer from quadratic complexity, and convolutional models have a restricted receptive field. These constr…

    Submitted 7 June, 2024; originally announced June 2024.

  5. arXiv:2406.05232  [pdf, other]

    cs.CL cs.LG

    Improving Logits-based Detector without Logits from Black-box LLMs

    Authors: Cong Zeng, Shengkun Tang, Xianjun Yang, Yuanzhou Chen, Yiyou Sun, Zhiqiang Xu, Yao Li, Haifeng Chen, Wei Cheng, Dongkuan Xu

    Abstract: The advent of Large Language Models (LLMs) has revolutionized text generation, producing outputs that closely mimic human writing. This blurring of lines between machine- and human-written text presents new challenges in distinguishing one from the other, a task further complicated by the frequent updates and closed nature of leading proprietary LLMs. Traditional logits-based detection methods leve…

    Submitted 11 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

  6. arXiv:2405.07068  [pdf, other]

    math.OC cs.LG

    Catastrophe Insurance: An Adaptive Robust Optimization Approach

    Authors: Dimitris Bertsimas, Cynthia Zeng

    Abstract: The escalating frequency and severity of natural disasters, exacerbated by climate change, underscore the critical role of insurance in facilitating recovery and promoting investments in risk reduction. This work introduces a novel Adaptive Robust Optimization (ARO) framework tailored for the calculation of catastrophe insurance premiums, with a case study applied to the United States National Flo…

    Submitted 11 May, 2024; originally announced May 2024.

  7. arXiv:2405.00362  [pdf, other]

    cs.RO cs.CG cs.GR

    Implicit Swept Volume SDF: Enabling Continuous Collision-Free Trajectory Generation for Arbitrary Shapes

    Authors: Jingping Wang, Tingrui Zhang, Qixuan Zhang, Chuxiao Zeng, Jingyi Yu, Chao Xu, Lan Xu, Fei Gao

    Abstract: In the field of trajectory generation for objects, ensuring continuous collision-free motion remains a huge challenge, especially for non-convex geometries and complex environments. Previous methods either oversimplify object shapes, which results in a sacrifice of feasible space or rely on discrete sampling, which suffers from the "tunnel effect". To address these limitations, we propose a novel…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted by SIGGRAPH 2024 & TOG. Joint first authors: Jingping Wang, Tingrui Zhang. Joint corresponding authors: Fei Gao, Lan Xu

  8. arXiv:2404.17739  [pdf, other]

    cs.SE

    How LLMs Aid in UML Modeling: An Exploratory Study with Novice Analysts

    Authors: Beian Wang, Chong Wang, Peng Liang, Bing Li, Cheng Zeng

    Abstract: Since the emergence of GPT-3, Large Language Models (LLMs) have caught the eyes of researchers, practitioners, and educators in the field of software engineering. However, there has been relatively little investigation regarding the performance of LLMs in assisting with requirements analysis and UML modeling. This paper explores how LLMs can assist novice analysts in creating three types of typica…

    Submitted 10 June, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

    Comments: The 21st IEEE International Conference on Software Services Engineering (SSE)

  9. arXiv:2404.13066  [pdf, other]

    cs.CL cs.AI

    Leveraging Large Language Model as Simulated Patients for Clinical Education

    Authors: Yanzeng Li, Cheng Zeng, Jialun Zhong, Ruoyu Zhang, Minhao Zhang, Lei Zou

    Abstract: Simulated Patients (SPs) play a crucial role in clinical medical education by providing realistic scenarios for student practice. However, the high cost of training and hiring qualified SPs, along with the heavy workload and potential risks they face in consistently portraying actual patients, limit students' access to this type of clinical training. Consequently, the integration of computer progr…

    Submitted 24 April, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

  10. arXiv:2404.06834  [pdf, other]

    math.NA cs.LG

    Solving Parametric PDEs with Radial Basis Functions and Deep Neural Networks

    Authors: Guanhang Lei, Zhen Lei, Lei Shi, Chenyu Zeng

    Abstract: We propose the POD-DNN, a novel algorithm leveraging deep neural networks (DNNs) along with radial basis functions (RBFs) in the context of the proper orthogonal decomposition (POD) reduced basis method (RBM), aimed at approximating the parametric mapping of parametric partial differential equations on irregular domains. The POD-DNN algorithm capitalizes on the low-dimensional characteristics of t…

    Submitted 12 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  11. arXiv:2403.07294  [pdf, other]

    cs.LG cs.AI cs.SI

    Graph Data Condensation via Self-expressive Graph Structure Reconstruction

    Authors: Zhanyu Liu, Chaolv Zeng, Guanjie Zheng

    Abstract: With the increasing demands of training graph neural networks (GNNs) on large-scale graphs, graph data condensation has emerged as a critical technique to relieve the storage and time costs during the training phase. It aims to condense the original large-scale graph to a much smaller synthetic graph while preserving the essential information necessary for efficiently training a downstream GNN. Ho…

    Submitted 7 June, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

  12. arXiv:2403.05989  [pdf, other]

    cs.SD eess.AS

    HAM-TTS: Hierarchical Acoustic Modeling for Token-Based Zero-Shot Text-to-Speech with Model and Data Scaling

    Authors: Chunhui Wang, Chang Zeng, Bowen Zhang, Ziyang Ma, Yefan Zhu, Zifeng Cai, Jian Zhao, Zhonglin Jiang, Yong Chen

    Abstract: Token-based text-to-speech (TTS) models have emerged as a promising avenue for generating natural and realistic speech, yet they grapple with low pronunciation accuracy, speaking style and timbre inconsistency, and a substantial need for diverse training data. In response, we introduce a novel hierarchical acoustic modeling approach complemented by a tailored data augmentation strategy and train i…

    Submitted 9 March, 2024; originally announced March 2024.

  13. arXiv:2403.01798  [pdf, other]

    cs.NI cs.LG

    Towards Fair and Efficient Learning-based Congestion Control

    Authors: Xudong Liao, Han Tian, Chaoliang Zeng, Xinchen Wan, Kai Chen

    Abstract: Recent years have witnessed a plethora of learning-based solutions for congestion control (CC) that demonstrate better performance over traditional TCP schemes. However, they fail to provide consistently good convergence properties, including fairness, fast convergence and stability, due to the mismatch between their objective functions and these properties. Despite being intuiti…

    Submitted 4 March, 2024; originally announced March 2024.

  14. DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation

    Authors: Chong Zeng, Yue Dong, Pieter Peers, Youkang Kong, Hongzhi Wu, Xin Tong

    Abstract: This paper presents a novel method for exerting fine-grained lighting control during text-driven diffusion-based image generation. While existing diffusion models already have the ability to generate images under any lighting condition, without additional guidance these models tend to correlate image content and lighting. Moreover, text prompts lack the necessary expressional power to describe det…

    Submitted 27 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Accepted to SIGGRAPH 2024. Project page: https://dilightnet.github.io/

    Journal ref: ACM SIGGRAPH 2024 Conference Proceedings

  15. arXiv:2402.08367  [pdf, other]

    cs.LG

    RBF-PINN: Non-Fourier Positional Embedding in Physics-Informed Neural Networks

    Authors: Chengxi Zeng, Tilo Burghardt, Alberto M Gambaruto

    Abstract: While many recent Physics-Informed Neural Networks (PINNs) variants have had considerable success in solving Partial Differential Equations, the empirical benefits of feature mapping drawn from the broader Neural Representations research have been largely overlooked. We highlight the limitations of widely used Fourier-based feature mapping in certain situations and suggest the use of the condition…

    Submitted 29 April, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2402.06955

  16. arXiv:2402.06955  [pdf, other]

    cs.LG cs.AI cs.CE

    Feature Mapping in Physics-Informed Neural Networks (PINNs)

    Authors: Chengxi Zeng, Tilo Burghardt, Alberto M Gambaruto

    Abstract: In this paper, the training dynamics of PINNs with a feature mapping layer are investigated via the limiting Conjugate Kernel and Neural Tangent Kernel, shedding light on the convergence of PINNs. Although the commonly used Fourier-based feature mapping has achieved great success, we show its inadequacy in some physics scenarios. Via these two scopes, we propose conditionally positive definite Radi…

    Submitted 22 May, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

  17. arXiv:2312.07271  [pdf]

    cs.LG cs.AI stat.ML

    Analyze the Robustness of Classifiers under Label Noise

    Authors: Cheng Zeng, Yixuan Xu, Jiaqi Tian

    Abstract: This study explores the robustness of label noise classifiers, aiming to enhance model resilience against noisy data in complex real-world scenarios. Label noise in supervised learning, characterized by erroneous or imprecise labels, significantly impairs model performance. This research focuses on the increasingly pertinent issue of label noise's impact on practical applications. Addressing the p…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: 21 pages, 11 figures

  18. arXiv:2312.01357  [pdf]

    cs.LG cs.AI

    Analyze the robustness of three NMF algorithms (Robust NMF with L1 norm, L2-1 norm NMF, L2 NMF)

    Authors: Cheng Zeng, Jiaqi Tian, Yixuan Xu

    Abstract: Non-negative matrix factorization (NMF) and its variants have been widely employed in clustering and classification tasks (Long & Jian, 2021). However, noise can seriously affect the results of our experiments. Our research is dedicated to investigating the noise robustness of non-negative matrix factorization (NMF) in the face of different types of noise. Specifically, we adopt three different…

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: 22 pages, 6 figures

  19. arXiv:2311.07885  [pdf, other]

    cs.CV cs.AI cs.GR

    One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion

    Authors: Minghua Liu, Ruoxi Shi, Linghao Chen, Zhuoyang Zhang, Chao Xu, Xinyue Wei, Hansheng Chen, Chong Zeng, Jiayuan Gu, Hao Su

    Abstract: Recent advancements in open-world 3D object generation have been remarkable, with image-to-3D methods offering superior fine-grained control over their text-to-3D counterparts. However, most existing models fall short in simultaneously providing rapid generation speeds and high fidelity to input images - two features essential for practical applications. In this paper, we present One-2-3-45++, an…

    Submitted 13 November, 2023; originally announced November 2023.

  20. arXiv:2310.15110  [pdf, other]

    cs.CV cs.GR

    Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model

    Authors: Ruoxi Shi, Hansheng Chen, Zhuoyang Zhang, Minghua Liu, Chao Xu, Xinyue Wei, Linghao Chen, Chong Zeng, Hao Su

    Abstract: We report Zero123++, an image-conditioned diffusion model for generating 3D-consistent multi-view images from a single input view. To take full advantage of pretrained 2D generative priors, we develop various conditioning and training schemes to minimize the effort of finetuning from off-the-shelf image diffusion models such as Stable Diffusion. Zero123++ excels in producing high-quality, consiste…

    Submitted 23 October, 2023; originally announced October 2023.

  21. arXiv:2309.13166  [pdf, other]

    cs.SD cs.CR cs.LG eess.AS

    Invisible Watermarking for Audio Generation Diffusion Models

    Authors: Xirong Cao, Xiang Li, Divyesh Jadav, Yanzhao Wu, Zhehui Chen, Chen Zeng, Wenqi Wei

    Abstract: Diffusion models have gained prominence in the image domain for their capabilities in data generation and transformation, achieving state-of-the-art performance in various tasks in both image and audio domains. In the rapidly evolving field of audio-based machine learning, safeguarding model integrity and establishing data copyright are of paramount importance. This paper presents the first waterm…

    Submitted 31 October, 2023; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: This is an invited paper for IEEE TPS, part of the IEEE CIC/CogMI/TPS 2023 conference

  22. arXiv:2309.12672  [pdf, other]

    cs.SD eess.AS

    CrossSinger: A Cross-Lingual Multi-Singer High-Fidelity Singing Voice Synthesizer Trained on Monolingual Singers

    Authors: Xintong Wang, Chang Zeng, Jun Chen, Chunhui Wang

    Abstract: It is challenging to build a multi-singer high-fidelity singing voice synthesis system with cross-lingual ability by only using monolingual singers in the training stage. In this paper, we propose CrossSinger, which is a cross-lingual singing voice synthesizer based on Xiaoicesing2. Specifically, we utilize International Phonetic Alphabet to unify the representation for all languages of the traini…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: Accepted by ASRU2023

  23. arXiv:2308.14533  [pdf, other]

    cs.CL

    A Multi-Task Semantic Decomposition Framework with Task-specific Pre-training for Few-Shot NER

    Authors: Guanting Dong, Zechen Wang, Jinxu Zhao, Gang Zhao, Daichi Guo, Dayuan Fu, Tingfeng Hui, Chen Zeng, Keqing He, Xuefeng Li, Liwen Wang, Xinyue Cui, Weiran Xu

    Abstract: The objective of few-shot named entity recognition is to identify named entities with limited labeled instances. Previous works have primarily focused on optimizing the traditional token-wise classification framework, while neglecting the exploration of information based on NER data characteristics. To address this issue, we propose a Multi-Task Semantic Decomposition Framework via Joint Task-spec…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: Accepted by CIKM 2023 (Oral Presentation)

  24. Relighting Neural Radiance Fields with Shadow and Highlight Hints

    Authors: Chong Zeng, Guojun Chen, Yue Dong, Pieter Peers, Hongzhi Wu, Xin Tong

    Abstract: This paper presents a novel neural implicit radiance representation for free viewpoint relighting from a small set of unstructured photographs of an object lit by a moving point light source different from the view position. We express the shape as a signed distance function modeled by a multi layer perceptron. In contrast to prior relightable implicit neural representations, we do not disentangle…

    Submitted 25 August, 2023; originally announced August 2023.

    Comments: Accepted to SIGGRAPH 2023. Author's version. Project page: https://nrhints.github.io/

    Journal ref: ACM SIGGRAPH 2023 Conference Proceedings

  25. arXiv:2308.09605  [pdf, other]

    math.NA cs.LG math.ST stat.ML

    Solving PDEs on Spheres with Physics-Informed Convolutional Neural Networks

    Authors: Guanhang Lei, Zhen Lei, Lei Shi, Chenyu Zeng, Ding-Xuan Zhou

    Abstract: Physics-informed neural networks (PINNs) have been demonstrated to be efficient in solving partial differential equations (PDEs) from a variety of experimental perspectives. Some recent studies have also proposed PINN algorithms for PDEs on surfaces, including spheres. However, theoretical understanding of the numerical performance of PINNs, especially PINNs on surfaces or manifolds, is still lack…

    Submitted 18 August, 2023; originally announced August 2023.

  26. arXiv:2308.00250  [pdf, other]

    cs.SE

    CONSTRUCT: A Program Synthesis Approach for Reconstructing Control Algorithms from Embedded System Binaries in Cyber-Physical Systems

    Authors: Ali Shokri, Alexandre Perez, Souma Chowdhury, Chen Zeng, Gerald Kaloor, Ion Matei, Peter-Patel Schneider, Akshith Gunasekaran, Shantanu Rane

    Abstract: We introduce a novel approach to automatically synthesize a mathematical representation of the control algorithms implemented in industrial cyber-physical systems (CPS), given the embedded system binary. The output model can be used by subject matter experts to assess the system's compliance with the expected behavior and for a variety of forensic applications. Our approach first performs static a…

    Submitted 31 July, 2023; originally announced August 2023.

  27. arXiv:2307.14074  [pdf, other]

    cs.NI

    Gleam: An RDMA-accelerated Multicast Protocol for Datacenter Networks

    Authors: Wenxue Li, Junyi Zhang, Gaoxiong Zeng, Yufei Liu, Zilong Wang, Chaoliang Zeng, Pengpeng Zhou, Qiaoling Wang, Kai Chen

    Abstract: RDMA has been widely adopted for high-speed datacenter networks. However, native RDMA merely supports one-to-one reliable connection, which mismatches various applications with group communication patterns (e.g., one-to-many). While there are some multicast enhancements to address it, they all fail to simultaneously achieve optimal multicast forwarding and fully unleash the distinguished RDMA capa…

    Submitted 29 July, 2023; v1 submitted 26 July, 2023; originally announced July 2023.

  28. arXiv:2307.04817  [pdf]

    physics.ao-ph cs.LG

    A physics-constrained machine learning method for mapping gapless land surface temperature

    Authors: Jun Ma, Huanfeng Shen, Menghui Jiang, Liupeng Lin, Chunlei Meng, Chao Zeng, Huifang Li, Penghai Wu

    Abstract: More accurate, spatio-temporally, and physically consistent LST estimation has been a main interest in Earth system research. Developing physics-driven mechanism models and data-driven machine learning (ML) models are two major paradigms for gapless LST estimation, which have their respective advantages and disadvantages. In this paper, a physics-constrained ML model, which combines the strengths…

    Submitted 2 July, 2023; originally announced July 2023.

  29. arXiv:2307.02751  [pdf, ps, other]

    cs.SD cs.CR eess.AS

    DSARSR: Deep Stacked Auto-encoders Enhanced Robust Speaker Recognition

    Authors: Zhifeng Wang, Chunyan Zeng, Surong Duan, Hongjie Ouyang, Hongmin Xu

    Abstract: Speaker recognition is a biometric modality that utilizes the speaker's speech segments to recognize the identity, determining whether the test speaker belongs to one of the enrolled speakers. In order to improve the robustness of the i-vector framework on cross-channel conditions and explore a novel method for applying deep learning to speaker recognition, the Stacked Auto-encoders are used to g…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: 12 pages, 3 figures

  30. arXiv:2306.10315  [pdf, other]

    cs.CL

    FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue

    Authors: Weihao Zeng, Keqing He, Yejie Wang, Chen Zeng, Jingang Wang, Yunsen Xian, Weiran Xu

    Abstract: Pre-trained language models based on general text enable huge success in the NLP scenario. But the intrinsic difference in linguistic patterns between general text and task-oriented dialogues makes existing pre-trained language models less useful in practice. Current dialogue pre-training methods rely on a contrastive framework and face the challenges of both selecting true positives and hard ne…

    Submitted 17 June, 2023; originally announced June 2023.

    Comments: ACL 2023 Main Conference

  31. arXiv:2305.17699  [pdf, other]

    cs.CL

    Decoupling Pseudo Label Disambiguation and Representation Learning for Generalized Intent Discovery

    Authors: Yutao Mou, Xiaoshuai Song, Keqing He, Chen Zeng, Pei Wang, Jingang Wang, Yunsen Xian, Weiran Xu

    Abstract: Generalized intent discovery aims to extend a closed-set in-domain intent classifier to an open-world intent set including in-domain and out-of-domain intents. The key challenges lie in pseudo label disambiguation and representation learning. Previous methods suffer from a coupling of pseudo label disambiguation and representation learning, that is, the reliability of pseudo labels relies on repre…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL2023 main conference

  32. arXiv:2305.11881  [pdf, other]

    cs.CV

    Self-Supervised Learning for Point Clouds Data: A Survey

    Authors: Changyu Zeng, Wei Wang, Anh Nguyen, Yutao Yue

    Abstract: 3D point clouds are a crucial type of data collected by LiDAR sensors and widely used in transportation applications due to their concise descriptions and accurate localization. Deep neural networks (DNNs) have achieved remarkable success in processing large amounts of disordered and sparse 3D point clouds, especially in various computer vision tasks, such as pedestrian detection and vehicle recognit…

    Submitted 24 May, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

  33. arXiv:2304.05170  [pdf, other]

    cs.CV

    SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes

    Authors: Yutao Cui, Chenkai Zeng, Xiaoyu Zhao, Yichun Yang, Gangshan Wu, Limin Wang

    Abstract: Multi-object tracking in sports scenes plays a critical role in gathering player statistics, supporting further analysis, such as automatic tactical analysis. Yet existing MOT benchmarks pay little attention to the domain, limiting its development. In this work, we present a new large-scale multi-object tracking dataset in diverse sports scenes, coined as SportsMOT, where all players on t…

    Submitted 13 April, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

  34. arXiv:2303.17088  [pdf, other]

    cs.CV

    Depth-NeuS: Neural Implicit Surfaces Learning for Multi-view Reconstruction Based on Depth Information Optimization

    Authors: Hanqi Jiang, Cheng Zeng, Runnan Chen, Shuai Liang, Yinhe Han, Yichao Gao, Conglin Wang

    Abstract: Recently, methods for neural surface representation and rendering, for example NeuS, have shown that learning neural implicit surfaces through volume rendering is becoming increasingly popular and making good progress. However, these methods still face some challenges. Existing methods lack a direct representation of depth information, which makes object reconstruction unrestricted by geometric fe…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: 9 pages

  35. arXiv:2303.13072  [pdf, other]

    cs.SD cs.CL eess.AS

    Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition

    Authors: Haoyu Tang, Zhaoyi Liu, Chang Zeng, Xinfeng Li

    Abstract: Transformer-based models have recently made significant achievements in the application of end-to-end (E2E) automatic speech recognition (ASR). It is possible to deploy the E2E ASR system on smart devices with the help of Transformer-based models. However, these models still have the disadvantage of requiring a large number of model parameters. To overcome the drawback of universal Transformer models…

    Submitted 5 April, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

  36. arXiv:2303.12285  [pdf, other]

    cs.LG cs.AI cs.CY

    Reducing Air Pollution through Machine Learning

    Authors: Dimitris Bertsimas, Leonard Boussioux, Cynthia Zeng

    Abstract: This paper presents a data-driven approach to mitigate the effects of air pollution from industrial plants on nearby cities by linking operational decisions with weather conditions. Our method combines predictive and prescriptive machine learning models to forecast short-term wind speed and direction and recommend operational decisions to reduce or pause the industrial plant's production. We exhib…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: Submitted to Manufacturing and Service Operations Management

  37. arXiv:2303.10916  [pdf, other]

    cs.CV

    Learning Behavior Recognition in Smart Classroom with Multiple Students Based on YOLOv5

    Authors: Zhifeng Wang, Jialong Yao, Chunyan Zeng, Wanxuan Wu, Hongmin Xu, Yang Yang

    Abstract: Deep learning-based computer vision technology has grown stronger in recent years, and cross-fertilization using computer vision technology has been a popular direction in recent years. The use of computer vision technology to identify students' learning behavior in the classroom can reduce the workload of traditional teachers in supervising students in the classroom, and ensure greater accuracy a…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: 8 pages, 10 figures

  38. arXiv:2303.04477  [pdf, other]

    cs.CR cs.LG

    Graph Neural Networks Enhanced Smart Contract Vulnerability Detection of Educational Blockchain

    Authors: Zhifeng Wang, Wanxuan Wu, Chunyan Zeng, Jialong Yao, Yang Yang, Hongmin Xu

    Abstract: With the development of blockchain technology, more and more attention has been paid to the intersection of blockchain and education, and various educational evaluation systems and E-learning systems are developed based on blockchain technology. Among them, Ethereum smart contract is favored by developers for its "event-triggered" mechanism for building education intelligent trading systems and i…

    Submitted 8 March, 2023; originally announced March 2023.

    Comments: 8 pages, 8 figures

  39. arXiv:2302.13610   

    cs.CL

    A Prototypical Semantic Decoupling Method via Joint Contrastive Learning for Few-Shot Name Entity Recognition

    Authors: Guanting Dong, Zechen Wang, Liwen Wang, Daichi Guo, Dayuan Fu, Yuxiang Wu, Chen Zeng, Xuefeng Li, Tingfeng Hui, Keqing He, Xinyue Cui, Qixiang Gao, Weiran Xu

    Abstract: Few-shot named entity recognition (NER) aims at identifying named entities based on only a few labeled instances. Most existing prototype-based sequence labeling models tend to memorize entity mentions which would be easily confused by close prototypes. In this paper, we propose a Prototypical Semantic Decoupling method via joint Contrastive learning (PSDC) for few-shot NER. Specifically, we decoup…

    Submitted 12 April, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: we want to revise our paper and upload this article in few days

  40. arXiv:2302.13584  [pdf, other]

    cs.CL

    Revisit Out-Of-Vocabulary Problem for Slot Filling: A Unified Contrastive Frameword with Multi-level Data Augmentations

    Authors: Daichi Guo, Guanting Dong, Dayuan Fu, Yuxiang Wu, Chen Zeng, Tingfeng Hui, Liwen Wang, Xuefeng Li, Zechen Wang, Keqing He, Xinyue Cui, Weiran Xu

    Abstract: In real dialogue scenarios, the existing slot filling model, which tends to memorize entity patterns, generalizes significantly worse when facing Out-of-Vocabulary (OOV) problems. To address this issue, we propose an OOV robust slot filling model based on multi-level data augmentations to solve the OOV problem from both word and slot perspectives. We present a unified contrastive learning fr…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: 5 pages, 3 figures, published to ICASSP 2023

  41. Video-SwinUNet: Spatio-temporal Deep Learning Framework for VFSS Instance Segmentation

    Authors: Chengxi Zeng, Xinyu Yang, David Smithard, Majid Mirmehdi, Alberto M Gambaruto, Tilo Burghardt

    Abstract: This paper presents a deep learning framework for medical video segmentation. Convolution neural network (CNN) and transformer-based methods have achieved great milestones in medical image segmentation tasks due to their incredible semantic feature encoding and global information comprehension abilities. However, most existing approaches ignore a salient aspect of medical video data - the temporal…

    Submitted 4 July, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

  42. arXiv:2302.11254  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS eess.IV

    Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification

    Authors: Meng Liu, Kong Aik Lee, Longbiao Wang, Hanyi Zhang, Chang Zeng, Jianwu Dang

    Abstract: Visual speech (i.e., lip motion) is highly related to auditory speech due to the co-occurrence and synchronization in speech production. This paper investigates this correlation and proposes a cross-modal speech co-learning paradigm. The primary motivation of our cross-modal co-learning method is modeling one modality aided by exploiting knowledge from another modality. Specifically, two cross-mod…

    Submitted 22 February, 2023; originally announced February 2023.

  43. arXiv:2302.06120  [pdf, other]

    q-bio.QM cs.LG

    Knowledge from Large-Scale Protein Contact Prediction Models Can Be Transferred to the Data-Scarce RNA Contact Prediction Task

    Authors: Yiren Jian, Chongyang Gao, Chen Zeng, Yunjie Zhao, Soroush Vosoughi

    Abstract: RNA, whose functionality is largely determined by its structure, plays an important role in many biological activities. The prediction of pairwise structural proximity between each nucleotide of an RNA sequence can characterize the structural information of the RNA. Historically, this problem has been tackled by machine learning models using expert-engineered features and trained on scarce labeled…

    Submitted 18 January, 2024; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: The code is available at https://github.com/yiren-jian/CoT-RNA-Transfer

  44. arXiv:2301.12548  [pdf, other]

    cs.LG cs.CY

    Global Flood Prediction: a Multimodal Machine Learning Approach

    Authors: Cynthia Zeng, Dimitris Bertsimas

    Abstract: Flooding is one of the most destructive and costly natural disasters, and climate changes would further increase risks globally. This work presents a novel multimodal machine learning approach for multi-year global flood risk prediction, combining geographical information and historical natural disaster dataset. Our multimodal framework employs state-of-the-art processing techniques to extract emb…

    Submitted 29 January, 2023; originally announced January 2023.

    Comments: 6 pages

  45. arXiv:2212.09518  [pdf, other]

    cs.LG

    FedTADBench: Federated Time-Series Anomaly Detection Benchmark

    Authors: Fanxing Liu, Cheng Zeng, Le Zhang, Yingjie Zhou, Qing Mu, Yanru Zhang, Ling Zhang, Ce Zhu

    Abstract: Time series anomaly detection strives to uncover potential abnormal behaviors and patterns from temporal data, and has fundamental significance in diverse application scenarios. Constructing an effective detection model usually requires adequate training data stored in a centralized manner, however, this requirement sometimes could not be satisfied in realistic scenarios. As a prevailing approach…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: 8 pages, 6 figures, published by IEEE HPCC 2022

  46. arXiv:2212.02084  [pdf, other]

    cs.SD eess.AS

    End-to-end Recording Device Identification Based on Deep Representation Learning

    Authors: Chunyan Zeng, Dongliang Zhu, Zhifeng Wang, Minghu Wu, Wei Xiong, Nan Zhao

    Abstract: Deep learning techniques have achieved specific results in recording device source identification. The recording device source features include spatial information and certain temporal information. However, most recording device source identification methods based on deep learning only use spatial representation learning from recording device source features, which cannot make full use of recordin…

    Submitted 5 December, 2022; originally announced December 2022.

    Comments: 20 pages, 5 figures, recording device identification

  47. arXiv:2211.05963  [pdf, other]

    cs.CV eess.IV

    JSRNN: Joint Sampling and Reconstruction Neural Networks for High Quality Image Compressed Sensing

    Authors: Chunyan Zeng, Jiaxiang Ye, Zhifeng Wang, Nan Zhao, Minghu Wu

    Abstract: Most Deep Learning (DL) based Compressed Sensing (DCS) algorithms adopt a single neural network for signal reconstruction, and fail to jointly consider the influences of the sampling operation for reconstruction. In this paper, we propose unified framework, which jointly considers the sampling and reconstruction process for image compressive sensing based on well-designed cascade neural networks.…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: 9 pages, 3 figures

  48. arXiv:2210.14666  [pdf, other]

    eess.AS cs.SD

    Xiaoicesing 2: A High-Fidelity Singing Voice Synthesizer Based on Generative Adversarial Network

    Authors: Chunhui Wang, Chang Zeng, Xing He

    Abstract: XiaoiceSing is a singing voice synthesis (SVS) system that aims at generating 48kHz singing voices. However, the mel-spectrogram generated by it is over-smoothing in middle- and high-frequency areas due to no special design for modeling the details of these parts. In this paper, we propose XiaoiceSing2, which can generate the details of middle- and high-frequency parts to better construct the full…

    Submitted 28 October, 2022; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: submitted to icassp2023

  49. arXiv:2210.12740  [pdf, other]

    eess.AS cs.SD

    HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation

    Authors: Chunhui Wang, Chang Zeng, Jun Chen, Xing He

    Abstract: Entertainment-oriented singing voice synthesis (SVS) requires a vocoder to generate high-fidelity (e.g. 48kHz) audio. However, most text-to-speech (TTS) vocoders cannot reconstruct the waveform well in this scenario. In this paper, we propose HiFi-WaveGAN to synthesize the 48kHz high-quality singing voices in real-time. Specifically, it consists of an Extended WaveNet served as a generator, a mult…

    Submitted 17 September, 2023; v1 submitted 23 October, 2022; originally announced October 2022.

  50. arXiv:2210.10506  [pdf, other]

    cs.SD eess.AS

    Audio Tampering Detection Based on Shallow and Deep Feature Representation Learning

    Authors: Zhifeng Wang, Yao Yang, Chunyan Zeng, Shuai Kong, Shixiong Feng, Nan Zhao

    Abstract: Digital audio tampering detection can be used to verify the authenticity of digital audio. However, most current methods use standard electronic network frequency (ENF) databases for visual comparison analysis of ENF continuity of digital audio or perform feature extraction for classification by machine learning methods. ENF databases are usually tricky to obtain, visual methods have weak feature…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Audio tampering detection, 21 pages, 4 figures