Showing 1–50 of 216 results for author: Li, M

Searching in archive stat.
  1. arXiv:2408.08493  [pdf, other]

    cs.LG stat.ML

    Fishers Harvest Parallel Unlearning in Inherited Model Networks

    Authors: Xiao Liu, Mingyuan Li, Xu Wang, Guangsheng Yu, Wei Ni, Lixiang Li, Haipeng Peng, Renping Liu

    Abstract: Unlearning in various learning frameworks remains challenging, with the continuous growth and updates of models exhibiting complex inheritance relationships. This paper presents a novel unlearning framework, which enables fully parallel unlearning among models exhibiting inheritance. A key enabler is the new Unified Model Inheritance Graph (UMIG), which captures the inheritance using a Directed Ac…

    Submitted 20 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

  2. arXiv:2408.06679  [pdf, other]

    cs.LG q-fin.ST stat.ML

    Case-based Explainability for Random Forest: Prototypes, Critics, Counter-factuals and Semi-factuals

    Authors: Gregory Yampolsky, Dhruv Desai, Mingshu Li, Stefano Pasquali, Dhagash Mehta

    Abstract: The explainability of black-box machine learning algorithms, commonly known as Explainable Artificial Intelligence (XAI), has become crucial for financial and other regulated industrial applications due to regulatory requirements and the need for transparency in business practices. Among the various paradigms of XAI, Explainable Case-Based Reasoning (XCBR) stands out as a pragmatic approach that e…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: 8 pages, 2 figures, 5 tables

  3. arXiv:2408.02355  [pdf, other]

    stat.ML cs.LG q-fin.ST q-fin.TR

    Quantile Regression using Random Forest Proximities

    Authors: Mingshu Li, Bhaskarjit Sarmah, Dhruv Desai, Joshua Rosaler, Snigdha Bhagat, Philip Sommer, Dhagash Mehta

    Abstract: Due to the dynamic nature of financial markets, maintaining models that produce precise predictions over time is difficult. Often the goal isn't just point prediction but determining uncertainty. Quantifying uncertainty, especially the aleatoric uncertainty due to the unpredictable nature of market drivers, helps investors understand varying risk levels. Recently, quantile regression forests (QRF)…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: 9 pages, 5 figures, 3 tables

  4. arXiv:2408.01017  [pdf, ps, other]

    math.DS econ.EM stat.AP

    Application of Superconducting Technology in the Electricity Industry: A Game-Theoretic Analysis of Government Subsidy Policies and Power Company Equipment Upgrade Decisions

    Authors: Mingyang Li, Maoqin Yuan, Han Pengsihua, Yuan Yuan, Zejun Wang

    Abstract: This study investigates the potential impact of "LK-99," a novel material developed by a Korean research team, on the power equipment industry. Using evolutionary game theory, the interactions between governmental subsidies and technology adoption by power companies are modeled. A key innovation of this research is the introduction of sensitivity analyses concerning time delays and initial subsidy…

    Submitted 2 August, 2024; originally announced August 2024.

  5. arXiv:2407.18698  [pdf, other]

    cs.CL cs.LG stat.ME stat.ML

    Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation

    Authors: Esteban Garces Arias, Julian Rodemann, Meimingwei Li, Christian Heumann, Matthias Aßenmacher

    Abstract: Decoding from the output distributions of large language models to produce high-quality text is a complex challenge in language modeling. Various approaches, such as beam search, sampling with temperature, $k$-sampling, nucleus $p$-sampling, typical decoding, contrastive decoding, and contrastive search, have been proposed to address this problem, aiming to improve coherence, diversity, as well as…

    Submitted 26 July, 2024; originally announced July 2024.

  6. arXiv:2407.05082  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    DMTG: One-Shot Differentiable Multi-Task Grouping

    Authors: Yuan Gao, Shuguo Jiang, Moran Li, Jin-Gang Yu, Gui-Song Xia

    Abstract: We aim to address Multi-Task Learning (MTL) with a large number of tasks by Multi-Task Grouping (MTG). Given N tasks, we propose to simultaneously identify the best task groups from 2^N candidates and train the model weights simultaneously in one-shot, with the high-order task-affinity fully exploited. This is distinct from the pioneering methods which sequentially identify the groups and train th…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: Accepted to ICML 2024

    Journal ref: International Conference on Machine Learning (ICML), 2024

  7. arXiv:2407.01316  [pdf, other]

    cs.LG cs.CY stat.ML

    Evaluating Model Performance Under Worst-case Subpopulations

    Authors: Mike Li, Hongseok Namkoong, Shangzhou Xia

    Abstract: The performance of ML models degrades when the training population is different from that seen under operation. Towards assessing distributional robustness, we study the worst-case performance of a model over all subpopulations of a given size, defined with respect to core attributes Z. This notion of robustness can consider arbitrary (continuous) attributes Z, and automatically accounts for compl…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Earlier version appeared in the proceedings of Advances in Neural Information Processing Systems 34 (NeurIPS 2021): https://proceedings.neurips.cc/paper_files/paper/2021/file/908075ea2c025c335f4865f7db427062-Paper.pdf

  8. arXiv:2406.13036  [pdf, other]

    stat.ML cs.LG math.PR math.ST stat.CO

    Sharp detection of low-dimensional structure in probability measures via dimensional logarithmic Sobolev inequalities

    Authors: Matthew T. C. Li, Tiangang Cui, Fengyi Li, Youssef Marzouk, Olivier Zahm

    Abstract: Identifying low-dimensional structure in high-dimensional probability measures is an essential pre-processing step for efficient sampling. We introduce a method for identifying and approximating a target measure $π$ as a perturbation of a given reference measure $μ$ along a few significant directions of $\mathbb{R}^{d}$. The reference measure can be a Gaussian or a nonlinear transformation of a Ga…

    Submitted 21 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  9. arXiv:2406.03707  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    What Should Embeddings Embed? Autoregressive Models Represent Latent Generating Distributions

    Authors: Liyi Zhang, Michael Y. Li, Thomas L. Griffiths

    Abstract: Autoregressive language models have demonstrated a remarkable ability to extract latent structure from text. The embeddings from large language models have been shown to capture aspects of the syntax and semantics of language. But what should embeddings represent? We connect the autoregressive prediction objective to the idea of constructing predictive sufficient statistics to summarize the…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 15 pages, 8 figures

    ACM Class: I.2; I.5

  10. arXiv:2404.19495  [pdf]

    stat.AP econ.EM stat.ME stat.OT

    Percentage Coefficient (bp) -- Effect Size Analysis (Theory Paper 1)

    Authors: Xinshu Zhao, Dianshi Moses Li, Ze Zack Lai, Piper Liping Liu, Song Harris Ao, Fei You

    Abstract: Percentage coefficient (bp) has emerged in recent publications as an additional and alternative estimator of effect size for regression analysis. This paper retraces the theory behind the estimator. It's posited that an estimator must first serve the fundamental function of enabling researchers and readers to comprehend an estimand, the target of estimation. It may then serve the instrumental func…

    Submitted 6 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

  11. arXiv:2404.17019  [pdf, other]

    stat.ME cs.LG stat.ML

    Neyman Meets Causal Machine Learning: Experimental Evaluation of Individualized Treatment Rules

    Authors: Michael Lingzhi Li, Kosuke Imai

    Abstract: A century ago, Neyman showed how to evaluate the efficacy of treatment using a randomized experiment under a minimal set of assumptions. This classical repeated sampling framework serves as a basis of routine experimental analyses conducted by today's scientists across disciplines. In this paper, we demonstrate that Neyman's methodology can also be used to experimentally evaluate the efficacy of i…

    Submitted 25 April, 2024; originally announced April 2024.

  12. arXiv:2404.13836  [pdf, other]

    stat.ME

    MultiFun-DAG: Multivariate Functional Directed Acyclic Graph

    Authors: Tian Lan, Ziyue Li, Junpeng Lin, Zhishuai Li, Lei Bai, Man Li, Fugee Tsung, Rui Zhao, Chen Zhang

    Abstract: Directed Acyclic Graphical (DAG) models efficiently formulate causal relationships in complex systems. Traditional DAGs assume nodes to be scalar variables, characterizing complex systems under a facile and oversimplified form. This paper considers that nodes can be multivariate functional data and thus proposes a multivariate functional DAG (MultiFun-DAG). It constructs a hidden bilinear multivar…

    Submitted 21 April, 2024; originally announced April 2024.

  13. arXiv:2403.11343  [pdf, other]

    cs.LG cs.CR math.ST stat.ME stat.ML

    Federated Transfer Learning with Differential Privacy

    Authors: Mengchu Li, Ye Tian, Yang Feng, Yi Yu

    Abstract: Federated learning is gaining increasing popularity, with data heterogeneity and privacy being two prominent challenges. In this paper, we address both issues within a federated transfer learning framework, aiming to enhance learning on a target data set by leveraging information from multiple heterogeneous source data sets while adhering to privacy constraints. We rigorously formulate the notion…

    Submitted 9 April, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: 78 pages, 3 figures

  14. arXiv:2403.07031  [pdf, other]

    cs.LG stat.CO stat.ME stat.ML

    The Cram Method for Efficient Simultaneous Learning and Evaluation

    Authors: Zeyang Jia, Kosuke Imai, Michael Lingzhi Li

    Abstract: We introduce the "cram" method, a general and efficient approach to simultaneous learning and evaluation using a generic machine learning (ML) algorithm. In a single pass of batched data, the proposed method repeatedly trains an ML algorithm and tests its empirical performance. Because it utilizes the entire sample for both learning and evaluation, cramming is significantly more data-efficient tha…

    Submitted 11 March, 2024; originally announced March 2024.

  15. arXiv:2403.01633  [pdf, other]

    cs.LG cs.CV stat.ML

    Critical windows: non-asymptotic theory for feature emergence in diffusion models

    Authors: Marvin Li, Sitan Chen

    Abstract: We develop theory to understand an intriguing property of diffusion models for image generation that we term critical windows. Empirically, it has been observed that there are narrow time intervals in sampling during which particular features of the final image emerge, e.g. the image class or background color (Ho et al., 2020b; Meng et al., 2022; Choi et al., 2022; Raya & Ambrogioni, 2023; Georgie…

    Submitted 24 May, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

  16. arXiv:2402.18800  [pdf, other]

    cs.LG stat.ML

    BlockEcho: Retaining Long-Range Dependencies for Imputing Block-Wise Missing Data

    Authors: Qiao Han, Mingqian Li, Yao Yang, Yiteng Zhai

    Abstract: Block-wise missing data poses significant challenges in real-world data imputation tasks. Compared to scattered missing data, block-wise gaps exacerbate adverse effects on subsequent analytic and machine learning tasks, as the lack of local neighboring elements significantly reduces the interpolation capability and predictive power. However, this issue has not received adequate attention. Most SOT…

    Submitted 28 February, 2024; originally announced February 2024.

  17. arXiv:2402.08539  [pdf]

    cs.LG stat.AP

    Intelligent Diagnosis of Alzheimer's Disease Based on Machine Learning

    Authors: Mingyang Li, Hongyu Liu, Yixuan Li, Zejun Wang, Yuan Yuan, Honglin Dai

    Abstract: This study is based on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset and aims to explore early detection and disease progression in Alzheimer's disease (AD). We employ innovative data preprocessing strategies, including the use of the random forest algorithm to fill missing data and the handling of outliers and invalid data, thereby fully mining and utilizing these limited data re…

    Submitted 13 February, 2024; originally announced February 2024.

  18. arXiv:2402.07355  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Sampling from the Mean-Field Stationary Distribution

    Authors: Yunbum Kook, Matthew S. Zhang, Sinho Chewi, Murat A. Erdogdu, Mufan Bill Li

    Abstract: We study the complexity of sampling from the stationary distribution of a mean-field SDE, or equivalently, the complexity of minimizing a functional over the space of probability measures which includes an interaction term. Our main insight is to decouple the two key aspects of this problem: (1) approximation of the mean-field SDE via a finite-particle system, via uniform-in-time propagation of ch…

    Submitted 5 July, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

  19. arXiv:2402.07227  [pdf, other]

    math.DS econ.GN stat.AP

    Time-Delayed Game Strategy Analysis Among Japan, Other Nations, and the International Atomic Energy Agency in the Context of Fukushima Nuclear Wastewater Discharge Decision

    Authors: Mingyang Li, Han Pengsihua, Fujiao Meng, Zejun Wang, Weian Liu

    Abstract: This academic paper examines the strategic interactions between Japan, other nations, and the International Atomic Energy Agency (IAEA) regarding Japan's decision to release treated nuclear wastewater from the Fukushima Daiichi Nuclear Power Plant into the sea. It introduces a payoff matrix and time-delay elements in replicator dynamic equations to mirror real-world decision-making delays. The pap…

    Submitted 11 February, 2024; originally announced February 2024.

  20. arXiv:2402.07210  [pdf, other]

    math.DS econ.GN physics.soc-ph stat.AP

    Fukushima Nuclear Wastewater Discharge: An Evolutionary Game Theory Approach to International and Domestic Interaction and Strategic Decision-Making

    Authors: Mingyang Li, Han Pengsihua, Songqing Zhao, Zejun Wang, Limin Yang, Weian Liu

    Abstract: On August 24, 2023, Japan controversially decided to discharge nuclear wastewater from the Fukushima Daiichi Nuclear Power Plant into the ocean, sparking intense domestic and global debates. This study uses evolutionary game theory to analyze the strategic dynamics between Japan, other countries, and the Japan Fisheries Association. By incorporating economic, legal, international aid, and environm…

    Submitted 11 February, 2024; originally announced February 2024.

  21. arXiv:2401.16320  [pdf, ps, other]

    quant-ph stat.ML

    A Strategy for Preparing Quantum Squeezed States Using Reinforcement Learning

    Authors: X. L. Zhao, Y. M. Zhao, M. Li, T. T. Li, Q. Liu, S. Guo, X. X. Yi

    Abstract: We propose a scheme leveraging reinforcement learning to engineer control fields for generating non-classical states. It is exemplified by the application to prepare spin-squeezed states for an open collective spin model where a linear control field is designed to govern the dynamics. The reinforcement learning agent determines the temporal sequence of control pulses, commencing from a coherent sp…

    Submitted 14 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  22. arXiv:2401.14343  [pdf, other]

    cs.LG cs.CY stat.ML

    Class-attribute Priors: Adapting Optimization to Heterogeneity and Fairness Objective

    Authors: Xuechen Zhang, Mingchen Li, Jiasi Chen, Christos Thrampoulidis, Samet Oymak

    Abstract: Modern classification problems exhibit heterogeneities across individual classes: Each class may have unique attributes, such as sample size, label quality, or predictability (easy vs difficult), and variable importance at test-time. Without care, these heterogeneities impede the learning process, most notably, when optimizing fairness objectives. Confirming this, under a Gaussian mixture setting,…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: 15 pages, 8 figures

  23. arXiv:2401.00104  [pdf, other]

    cs.LG cs.AI stat.ME

    Causal State Distillation for Explainable Reinforcement Learning

    Authors: Wenhao Lu, Xufeng Zhao, Thilo Fryen, Jae Hee Lee, Mengdi Li, Sven Magg, Stefan Wermter

    Abstract: Reinforcement learning (RL) is a powerful technique for training intelligent agents, but understanding why these agents make specific decisions can be quite challenging. This lack of transparency in RL models has been a long-standing problem, making it difficult for users to grasp the reasons behind an agent's behaviour. Various approaches have been explored to address this problem, with one promi…

    Submitted 1 April, 2024; v1 submitted 29 December, 2023; originally announced January 2024.

    Comments: https://lukaswill.github.io/; Accepted as oral by CLeaR 2024

  24. arXiv:2311.13768  [pdf, other]

    stat.ME

    Valid confidence intervals for regression with best subset selection

    Authors: Huiming Lin, Meng Li

    Abstract: Classical confidence intervals after best subset selection are widely implemented in statistical software and are routinely used to guide practitioners in scientific fields to conclude significance. However, there are increasing concerns in the recent literature about the validity of these confidence intervals in that the intended frequentist coverage is not attained. In the context of the Akaike…

    Submitted 22 November, 2023; originally announced November 2023.

  25. arXiv:2311.07411  [pdf, ps, other]

    math.OC stat.ML

    A Large Deviations Perspective on Policy Gradient Algorithms

    Authors: Wouter Jongeneel, Daniel Kuhn, Mengmeng Li

    Abstract: Motivated by policy gradient methods in the context of reinforcement learning, we identify a large deviation rate function for the iterates generated by stochastic gradient descent for possibly non-convex objectives satisfying a Polyak-Łojasiewicz condition. Leveraging the contraction principle from large deviations theory, we illustrate the potential of this result by showing how convergence prop…

    Submitted 3 June, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: v3; comments are welcome

    MSC Class: 60F10; 90C26

  26. arXiv:2311.03967  [pdf, other]

    cs.CV stat.ML

    CeCNN: Copula-enhanced convolutional neural networks in joint prediction of refraction error and axial length based on ultra-widefield fundus images

    Authors: Chong Zhong, Yang Li, Danjuan Yang, Meiyan Li, Xingyao Zhou, Bo Fu, Catherine C. Liu, A. H. Welsh

    Abstract: The ultra-widefield (UWF) fundus image is an attractive 3D biomarker in AI-aided myopia screening because it provides much richer myopia-related information. Though axial length (AL) has been acknowledged to be highly related to the two key targets of myopia screening, Spherical Equivalence (SE) measurement and high myopia diagnosis, its prediction based on the UWF fundus image is rarely considere…

    Submitted 16 August, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

  27. arXiv:2311.01287  [pdf, other]

    stat.ME

    Semiparametric Latent ANOVA Model for Event-Related Potentials

    Authors: Cheng-Han Yu, Meng Li, Marina Vannucci

    Abstract: Event-related potentials (ERPs) extracted from electroencephalography (EEG) data in response to stimuli are widely used in psychological and neuroscience experiments. A major goal is to link ERP characteristic components to subject-level covariates. Existing methods typically follow two-step approaches, first identifying ERP components using peak detection methods and then relating them to the cov…

    Submitted 2 November, 2023; originally announced November 2023.

    Journal ref: Data Science in Science, 2024, 3(1), article 2294204

  28. arXiv:2310.19053  [pdf, other]

    cs.LG physics.optics stat.ML

    Datasets and Benchmarks for Nanophotonic Structure and Parametric Design Simulations

    Authors: Jungtaek Kim, Mingxuan Li, Oliver Hinder, Paul W. Leu

    Abstract: Nanophotonic structures have versatile applications including solar cells, anti-reflective coatings, electromagnetic interference shielding, optical filters, and light emitting diodes. To design and understand these nanophotonic structures, electrodynamic simulations are essential. These simulations enable us to model electromagnetic fields over time and calculate optical properties. In this work,…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: 31 pages, 31 figures, 4 tables. Accepted at the 37th Conference on Neural Information Processing Systems (NeurIPS 2023), Datasets and Benchmarks Track

  29. arXiv:2310.18910  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    InstanT: Semi-supervised Learning with Instance-dependent Thresholds

    Authors: Muyang Li, Runze Wu, Haoyu Liu, Jun Yu, Xun Yang, Bo Han, Tongliang Liu

    Abstract: Semi-supervised learning (SSL) has been a fundamental challenge in machine learning for decades. The primary family of SSL algorithms, known as pseudo-labeling, involves assigning pseudo-labels to confident unlabeled instances and incorporating them into the training set. Therefore, the selection criteria of confident instances are crucial to the success of SSL. Recently, there has been growing in…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted as poster for NeurIPS 2023

  30. arXiv:2310.12079  [pdf, other]

    stat.ML cs.LG

    Differential Equation Scaling Limits of Shaped and Unshaped Neural Networks

    Authors: Mufan Bill Li, Mihai Nica

    Abstract: Recent analyses of neural networks with shaped activations (i.e. the activation function is scaled as the network size grows) have led to scaling limits described by differential equations. However, these results do not a priori tell us anything about "ordinary" unshaped networks, where the activation is unchanged as the network size grows. In this article, we find similar differential equation ba…

    Submitted 18 April, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

  31. arXiv:2310.07973  [pdf, other]

    stat.ME math.OC stat.AP stat.ML

    Statistical Performance Guarantee for Subgroup Identification with Generic Machine Learning

    Authors: Michael Lingzhi Li, Kosuke Imai

    Abstract: Across a wide array of disciplines, many researchers use machine learning (ML) algorithms to identify a subgroup of individuals who are likely to benefit from a treatment the most ("exceptional responders") or those who are harmed by it. A common approach to this subgroup identification problem consists of two steps. First, researchers estimate the conditional average treatment effect (CATE) usi…

    Submitted 20 December, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

  32. arXiv:2309.16620  [pdf, other]

    stat.ML cond-mat.dis-nn cs.AI cs.LG

    Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit

    Authors: Blake Bordelon, Lorenzo Noci, Mufan Bill Li, Boris Hanin, Cengiz Pehlevan

    Abstract: The cost of hyperparameter tuning in deep learning has been rising with model sizes, prompting practitioners to find new tuning methods using a proxy of smaller networks. One such proposal uses $μ$P parameterized networks, where the optimal hyperparameters for small width networks transfer to networks with arbitrarily large width. However, in this scheme, hyperparameters do not transfer across dep…

    Submitted 8 December, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

  33. arXiv:2309.11455  [pdf, other]

    stat.CO

    ddtlcm: An R package for overcoming weak separation in Bayesian latent class analysis via tree-regularization

    Authors: Mengbing Li, Bolin Wu, Briana Stephenson, Zhenke Wu

    Abstract: Traditional applications of latent class models (LCMs) often focus on scenarios where a set of unobserved classes are well-defined and easily distinguishable. However, in numerous real-world applications, these classes are weakly separated and difficult to distinguish, creating significant numerical challenges. To address these issues, we have developed an R package ddtlcm that provides comprehens…

    Submitted 20 September, 2023; originally announced September 2023.

  34. arXiv:2308.14864  [pdf, other]

    cs.LG cs.AI stat.ML

    NAS-X: Neural Adaptive Smoothing via Twisting

    Authors: Dieterich Lawson, Michael Li, Scott Linderman

    Abstract: Sequential latent variable models (SLVMs) are essential tools in statistics and machine learning, with applications ranging from healthcare to neuroscience. As their flexibility increases, analytic inference and model learning can become challenging, necessitating approximate methods. Here we introduce neural adaptive smoothing via twisting (NAS-X), a method that extends reweighted wake-sleep (RWS…

    Submitted 30 October, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

    Comments: Updating for clarity and adding new baselines

  35. arXiv:2308.13905  [pdf, ps, other]

    stat.ME math.ST

    Estimation and Hypothesis Testing of Derivatives in Smoothing Spline ANOVA Models

    Authors: Ruiqi Liu, Kexuan Li, Meng Li

    Abstract: This article studies the derivatives in models that flexibly characterize the relationship between a response variable and multiple predictors, with goals of providing both accurate estimation and inference procedures for hypothesis testing. In the setting of tensor product reproducing spaces for nonparametric multivariate functions, we propose a plug-in kernel ridge regression estimator to estima…

    Submitted 26 August, 2023; originally announced August 2023.

  36. arXiv:2307.02126  [pdf, other]

    cs.LG stat.ML

    Robust Graph Structure Learning with the Alignment of Features and Adjacency Matrix

    Authors: Shaogao Lv, Gang Wen, Shiyu Liu, Linsen Wei, Ming Li

    Abstract: To improve the robustness of graph neural networks (GNN), graph structure learning (GSL) has attracted great interest due to the pervasiveness of noise in graph data. Many approaches have been proposed for GSL to jointly learn a clean graph structure and corresponding representations. To extend the previous work, this paper proposes a novel regularized GSL approach, particularly with an alignment…

    Submitted 5 July, 2023; originally announced July 2023.

  37. arXiv:2307.01224  [pdf, other]

    stat.ML

    INGB: Informed Nonlinear Granular Ball Oversampling Framework for Noisy Imbalanced Classification

    Authors: Min Li, Hao Zhou, Qun Liu, Yabin Shao, Guoying Wang

    Abstract: In classification problems, the datasets are usually imbalanced, noisy or complex. Most sampling algorithms only make some improvements to the linear sampling mechanism of the synthetic minority oversampling technique (SMOTE). Nevertheless, linear oversampling has several unavoidable drawbacks. Linear oversampling is susceptible to overfitting, and the synthetic samples lack diversity and rarely a…

    Submitted 2 July, 2023; originally announced July 2023.

    Comments: 15 pages, 6 figures

  38. arXiv:2306.17759  [pdf, other]

    stat.ML cs.LG

    The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit

    Authors: Lorenzo Noci, Chuning Li, Mufan Bill Li, Bobby He, Thomas Hofmann, Chris Maddison, Daniel M. Roy

    Abstract: In deep learning theory, the covariance matrix of the representations serves as a proxy to examine the network's trainability. Motivated by the success of Transformers, we study the covariance matrix of a modified Softmax-based attention model with skip connections in the proportional limit of infinite-depth-and-width. We show that at initialization the limiting distribution can be described by a…

    Submitted 9 December, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

  39. arXiv:2306.06581  [pdf, other]

    stat.ML cs.DS cs.LG math.OC

    Importance Sparsification for Sinkhorn Algorithm

    Authors: Mengyu Li, Jun Yu, Tao Li, Cheng Meng

    Abstract: The Sinkhorn algorithm has been used pervasively to approximate the solution to optimal transport (OT) and unbalanced optimal transport (UOT) problems. However, its practical application is limited due to the high computational complexity. To alleviate the computational burden, we propose a novel importance sparsification method, called Spar-Sink, to efficiently approximate entropy-regularized OT and…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: Accepted by Journal of Machine Learning Research

  40. arXiv:2306.04700  [pdf, other]

    stat.ME

    Tree-Regularized Bayesian Latent Class Analysis for Improving Weakly Separated Dietary Pattern Subtyping in Small-Sized Subpopulations

    Authors: Mengbing Li, Briana Stephenson, Zhenke Wu

    Abstract: Dietary patterns synthesize multiple related diet components, which can be used by nutrition researchers to examine diet-disease relationships. Latent class models (LCMs) have been used to derive dietary patterns from dietary intake assessment, where each class profile represents the probabilities of exposure to a set of diet components. However, LCM-derived dietary patterns can exhibit strong sim…

    Submitted 7 June, 2023; originally announced June 2023.

  41. arXiv:2306.02831  [pdf, other]

    stat.ML cs.LG

    MM-DAG: Multi-task DAG Learning for Multi-modal Data -- with Application for Traffic Congestion Analysis

    Authors: Tian Lan, Ziyue Li, Zhishuai Li, Lei Bai, Man Li, Fugee Tsung, Wolfgang Ketter, Rui Zhao, Chen Zhang

    Abstract: This paper proposes to learn Multi-task, Multi-modal Directed Acyclic Graphs (MM-DAGs), which are commonly observed in complex systems, e.g., traffic, manufacturing, and weather systems, whose variables are multi-modal with scalars, vectors, and functions. This paper takes the traffic congestion analysis as a concrete case, where a traffic intersection is usually regarded as a DAG. In a road network…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted in SIGKDD 2023

  42. arXiv:2305.18987  [pdf, other]

    math.ST stat.ME

    Robust mean change point testing in high-dimensional data with heavy tails

    Authors: Mengchu Li, Yudong Chen, Tengyao Wang, Yi Yu

    Abstract: We study a mean change point testing problem for high-dimensional data, with exponentially- or polynomially-decaying tails. In each case, depending on the $\ell_0$-norm of the mean change vector, we separately consider dense and sparse regimes. We characterise the boundary between the dense and sparse regimes under the above two tail conditions for the first time in the change point literature and…

    Submitted 17 June, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: 51 pages, 1 figure

  43. arXiv:2305.12085  [pdf, other]

    cs.LG stat.ML

    Stability and Generalization of lp-Regularized Stochastic Learning for GCN

    Authors: Shiyu Liu, Linsen Wei, Shaogao Lv, Ming Li

    Abstract: Graph convolutional networks (GCN) are viewed as one of the most popular representations among the variants of graph neural networks over graph data and have shown powerful performance in empirical experiments. That $\ell_2$-based graph smoothing enforces the global smoothness of GCN, while (soft) $\ell_1$-based sparse graph learning tends to promote signal sparsity to trade for discontinuity. Thi…

    Submitted 19 June, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: Accepted to IJCAI 2023

  44. arXiv:2305.06172  [pdf, other]

    stat.CO math.PR math.ST

    Principal Feature Detection via $Φ$-Sobolev Inequalities

    Authors: Matthew T. C. Li, Youssef Marzouk, Olivier Zahm

    Abstract: We investigate the approximation of high-dimensional target measures as low-dimensional updates of a dominating reference measure. This approximation class replaces the associated density with the composition of: (i) a feature map that identifies the leading principal components or features of the target measure, relative to the reference, and (ii) a low-dimensional profile function. When the refe…

    Submitted 16 January, 2024; v1 submitted 10 May, 2023; originally announced May 2023.

    Comments: To appear in Bernoulli, but this version contains both the main file and the supplementary material

  45. arXiv:2305.01789  [pdf, other]

    stat.ME

    Multivariate Intrinsic Local Polynomial Regression on Isometric Riemannian Manifolds: Applications to Positive Definite Data

    Authors: Ronaldo García Reyes, Ying Wang, Min Li, Marlis Ontiviero Ortega, Deirel Paz-Linares, Lídice Galán García, Pedro Antonio Valdez Sosa

    Abstract: The paper introduces a novel non-parametric Riemannian regression method using Isometric Riemannian Manifolds (IRMs). The proposed technique, Intrinsic Local Polynomial Regression on IRMs (ILPR-IRMs), enables global data mapping between Riemannian manifolds while preserving underlying geometries. The ILPR method is generalized to handle multivariate covariates on any Riemannian manifold and isomet…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: 32 pages, 6 figures, 1 pseudocode

    MSC Class: 53B12; 62R30; 62G05; 62-08; 62-11

  46. arXiv:2304.11894  [pdf, other]

    stat.CO

    Estimating Failure Probability with Neural Operator Hybrid Approach

    Authors: Mujing Li, Yani Feng, Guanjie Wang

    Abstract: Evaluating failure probability for complex engineering systems is a computationally intensive task. While the Monte Carlo method is easy to implement, it converges slowly and, hence, requires numerous repeated simulations of a complex system to generate sufficient samples. To improve the efficiency, methods based on surrogate models are proposed to approximate the limit state function. In this wor…

    Submitted 25 June, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

  47. arXiv:2303.00288   

    stat.AP

    The Race of mRNA therapy: Evidence from Patent Landscape

    Authors: Jianxiong Ren, Xiaoming Zhang, Xingyong Si, Xiangjun Kong, Jinyu Cong, Pingping Wang, Xiang Li, Qianru Zhang, Peifen Yao, Mengyao Li, Yuanqi Cai, Zhaocai Sun, Kunmeng Liu, Benzheng Wei

    Abstract: mRNA therapy is gaining worldwide attention as an emerging therapeutic approach. The widespread use of mRNA vaccines during the COVID-19 outbreak has demonstrated the potential of mRNA therapy. As mRNA-based drugs have expanded and their indications have broadened, more patents for mRNA innovations have emerged. The global patent landscape for mRNA therapy has not yet been analyzed, indicating a r…

    Submitted 15 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Comments: I have received requests from co-authors and funding agencies to withdraw the manuscript

  48. arXiv:2302.08049  [pdf, ps, other]

    math.ST stat.ML

    Improved Discretization Analysis for Underdamped Langevin Monte Carlo

    Authors: Matthew Zhang, Sinho Chewi, Mufan Bill Li, Krishnakumar Balasubramanian, Murat A. Erdogdu

    Abstract: Underdamped Langevin Monte Carlo (ULMC) is an algorithm used to sample from unnormalized densities by leveraging the momentum of a particle moving in a potential well. We provide a novel analysis of ULMC, motivated by two central questions: (1) Can we obtain improved sampling guarantees beyond strong log-concavity? (2) Can we achieve acceleration for sampling? For (1), prior results for ULMC onl…

    Submitted 15 February, 2023; originally announced February 2023.

  49. arXiv:2302.04438  [pdf, other]

    stat.ML cs.LG

    An information-theoretic learning model based on importance sampling

    Authors: Jiangshe Zhang, Lizhen Ji, Fei Gao, Mengyao Li

    Abstract: A crucial assumption underlying the most current theory of machine learning is that the training distribution is identical to the test distribution. However, this assumption may not hold in some real-world applications. In this paper, we develop a learning model based on principles of information theory by minimizing the worst-case loss at prescribed levels of uncertainty. We reformulate the empir…

    Submitted 22 February, 2023; v1 submitted 8 February, 2023; originally announced February 2023.

    Comments: 7 pages, 4 figures

  50. arXiv:2302.02895  [pdf, other]

    cs.CG stat.AP

    Flexible and Probabilistic Topology Tracking with Partial Optimal Transport

    Authors: Mingzhe Li, Xinyuan Yan, Lin Yan, Tom Needham, Bei Wang

    Abstract: In this paper, we present a flexible and probabilistic framework for tracking topological features in time-varying scalar fields using merge trees and partial optimal transport. Merge trees are topological descriptors that record the evolution of connected components in the sublevel sets of scalar fields. We present a new technique for modeling and comparing merge trees using tools from partial op…

    Submitted 6 February, 2023; originally announced February 2023.