
Showing 1–50 of 1,210 results for author: Wang, Y

Searching in archive stat.
  1. arXiv:2408.16913  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Analyzing Inference Privacy Risks Through Gradients in Machine Learning

    Authors: Zhuohang Li, Andrew Lowy, Jing Liu, Toshiaki Koike-Akino, Kieran Parsons, Bradley Malin, Ye Wang

    Abstract: In distributed learning settings, models are iteratively updated with shared gradients computed from potentially sensitive user data. While previous work has studied various privacy risks of sharing gradients, our paper aims to provide a systematic approach to analyze private information leakage from gradients. We present a unified game-based framework that encompasses a broad range of attacks inc…

    Submitted 29 August, 2024; originally announced August 2024.

  2. arXiv:2408.16751  [pdf, other]

    cs.CL cs.LG stat.ML

    A Gradient Analysis Framework for Rewarding Good and Penalizing Bad Examples in Language Models

    Authors: Yi-Lin Tuan, William Yang Wang

    Abstract: Beyond maximum likelihood estimation (MLE), the standard objective of a language model (LM) that optimizes the probabilities of good examples, many studies have explored ways that also penalize bad examples for enhancing the quality of output distribution, including unlikelihood training, exponential maximizing average treatment effect (ExMATE), and direct preference optimization (DPO). To systematically…

    Submitted 29 August, 2024; originally announced August 2024.

  3. arXiv:2408.13062  [pdf, other]

    math.ST stat.AP

    Trimmed Mean for Partially Observed Functional Data

    Authors: Yixiao Wang

    Abstract: In practice, as opposed to a large set of finite-dimensional vectors approximated from discrete data, we often prefer to utilize functional data. In recent years, partially observable functional data have frequently appeared in practical applications and are the object of increasing interest in the literature. In this thesis, we learn the concept of data integration depth of partially observable fu…

    Submitted 23 August, 2024; originally announced August 2024.

  4. arXiv:2408.12933  [pdf, ps, other]

    stat.AP

    When is truncated stop loss optimal?

    Authors: Erik Bølviken, Yinzhi Wang

    Abstract: The paper examines how reinsurance can be used to strike a balance between expected profit and VaR/CVaR risk. Conditions making truncated stop loss contracts optimal are derived, and it is argued that those are usually satisfied in practice. One of the prerequisites is that reinsurance is not too cheap, and an argument resembling arbitrage suggests that it is not.

    Submitted 23 August, 2024; originally announced August 2024.

  5. arXiv:2408.12888  [pdf, other]

    cs.LG stat.ML

    Accelerated Markov Chain Monte Carlo Using Adaptive Weighting Scheme

    Authors: Yanbo Wang, Wenyu Chen, Shimin Shan

    Abstract: Gibbs sampling is one of the most commonly used Markov Chain Monte Carlo (MCMC) algorithms due to its simplicity and efficiency. It cycles through the latent variables, sampling each one from its distribution conditional on the current values of all the other variables. Conventional Gibbs sampling is based on the systematic scan (with a deterministic order of variables). In contrast, in recent yea…

    Submitted 23 August, 2024; originally announced August 2024.

  6. arXiv:2408.04441  [pdf, other]

    stat.AP cs.SI

    Causal Inference in Social Platforms Under Approximate Interference Networks

    Authors: Yiming Jiang, Lu Deng, Yong Wang, He Wang

    Abstract: Estimating the total treatment effect (TTE) of a new feature in social platforms is crucial for understanding its impact on user behavior. However, the presence of network interference, which arises from user interactions, often complicates this estimation process. Experimenters typically face challenges in fully capturing the intricate structure of this interference, leading to less reliable esti…

    Submitted 8 August, 2024; originally announced August 2024.

  7. arXiv:2408.00799  [pdf, other]

    cs.IR cs.LG stat.ML

    Deep Uncertainty-Based Explore for Index Construction and Retrieval in Recommendation System

    Authors: Xin Jiang, Kaiqiang Wang, Yinlong Wang, Fengchang Lv, Taiyang Peng, Shuai Yang, Xianteng Wu, Pengye Zhang, Shuo Yuan, Yifan Zeng

    Abstract: In recommendation systems, the relevance and novelty of the final results are selected through a cascade system of Matching -> Ranking -> Strategy. The matching model serves as the starting point of the pipeline and determines the upper bound of the subsequent stages. Balancing the relevance and novelty of matching results is a crucial step in the design and optimization of recommendation systems,…

    Submitted 5 August, 2024; v1 submitted 21 July, 2024; originally announced August 2024.

    Comments: Accepted by CIKM 2024

  8. arXiv:2407.19378  [pdf, other]

    stat.ME

    Penalized Principal Component Analysis for Large-dimension Factor Model with Group Pursuit

    Authors: Yong He, Dong Liu, Guangming Pan, Yiming Wang

    Abstract: This paper investigates the intrinsic group structures within the framework of large-dimensional approximate factor models, which portrays homogeneous effects of the common factors on the individuals that fall into the same group. To this end, we propose a fusion Penalized Principal Component Analysis (PPCA) method and derive a closed-form solution for the $\ell_2$-norm optimization problem. We al…

    Submitted 27 August, 2024; v1 submitted 27 July, 2024; originally announced July 2024.

  9. arXiv:2407.19054  [pdf, other]

    stat.ML cs.LG q-bio.PE stat.AP

    Flusion: Integrating multiple data sources for accurate influenza predictions

    Authors: Evan L. Ray, Yijin Wang, Russell D. Wolfinger, Nicholas G. Reich

    Abstract: Over the last ten years, the US Centers for Disease Control and Prevention (CDC) has organized an annual influenza forecasting challenge with the motivation that accurate probabilistic forecasts could improve situational awareness and yield more effective public health actions. Starting with the 2021/22 influenza season, the forecasting targets for this challenge have been based on hospital admiss…

    Submitted 26 July, 2024; originally announced July 2024.

  10. arXiv:2407.14335  [pdf, other]

    econ.GN cs.CE cs.CR q-fin.CP stat.CO

    Quantifying the Blockchain Trilemma: A Comparative Analysis of Algorand, Ethereum 2.0, and Beyond

    Authors: Yihang Fu, Mingwei Jing, Jiaolun Zhou, Peilin Wu, Ye Wang, Luyao Zhang, Chuang Hu

    Abstract: Blockchain technology is essential for the digital economy and metaverse, supporting applications from decentralized finance to virtual assets. However, its potential is constrained by the "Blockchain Trilemma," which necessitates balancing decentralization, security, and scalability. This study evaluates and compares two leading proof-of-stake (PoS) systems, Algorand and Ethereum 2.0, against the…

    Submitted 19 July, 2024; originally announced July 2024.

  11. arXiv:2407.14065  [pdf, other]

    cs.LG stat.ML

    MSCT: Addressing Time-Varying Confounding with Marginal Structural Causal Transformer for Counterfactual Post-Crash Traffic Prediction

    Authors: Shuang Li, Ziyuan Pu, Nan Zhang, Duxin Chen, Lu Dong, Daniel J. Graham, Yinhai Wang

    Abstract: Traffic crashes profoundly impede traffic efficiency and pose economic challenges. Accurate prediction of post-crash traffic status provides essential information for evaluating traffic perturbations and developing effective solutions. Previous studies have established a series of deep learning models to predict post-crash traffic conditions; however, these correlation-based methods cannot accommo…

    Submitted 19 July, 2024; originally announced July 2024.

    Comments: 13 pages, 9 figures

  12. arXiv:2407.14022  [pdf, other]

    stat.ME cs.LG

    Causal Inference with Complex Treatments: A Survey

    Authors: Yingrong Wang, Haoxuan Li, Minqin Zhu, Anpeng Wu, Ruoxuan Xiong, Fei Wu, Kun Kuang

    Abstract: Causal inference plays an important role in explanatory analysis and decision making across various fields like statistics, marketing, health care, and education. Its main task is to estimate treatment effects and make intervention policies. Traditionally, most previous works focus on the binary treatment setting, in which there is only one treatment for a unit to adopt or not. However…

    Submitted 19 July, 2024; originally announced July 2024.

  13. arXiv:2407.12177  [pdf, other]

    cs.LG stat.ML

    Are Linear Regression Models White Box and Interpretable?

    Authors: Ahmed M Salih, Yuhe Wang

    Abstract: Explainable artificial intelligence (XAI) is a set of tools and algorithms that are applied to or embedded in machine learning models to understand and interpret them. They are recommended especially for complex or advanced models, including deep neural networks, because such models are not interpretable from a human point of view. On the other hand, simple models including linear regression are easy to implem…

    Submitted 16 July, 2024; originally announced July 2024.

  14. arXiv:2407.11844  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Variational Randomized Smoothing for Sample-Wise Adversarial Robustness

    Authors: Ryo Hase, Ye Wang, Toshiaki Koike-Akino, Jing Liu, Kieran Parsons

    Abstract: Randomized smoothing is a defensive technique to achieve enhanced robustness against adversarial examples which are small input perturbations that degrade the performance of neural network models. Conventional randomized smoothing adds random noise with a fixed noise level for every input sample to smooth out adversarial perturbations. This paper proposes a new variational framework that uses a pe…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 20 pages, under preparation

  15. arXiv:2407.10013  [pdf, other]

    astro-ph.CO stat.AP

    AB$\mathbb{C}$MB: Deep Delensing Assisted Likelihood-Free Inference from CMB Polarization Maps

    Authors: Kai Yi, Yanan Fan, Jan Hamann, Pietro Liò, Yuguang Wang

    Abstract: The existence of a cosmic background of primordial gravitational waves (PGWB) is a robust prediction of inflationary cosmology, but it has so far evaded discovery. The most promising avenue of its detection is via measurements of Cosmic Microwave Background (CMB) $B$-polarization. However, this is not straightforward due to (a) the fact that CMB maps are distorted by gravitational lensing and (b)…

    Submitted 13 July, 2024; originally announced July 2024.

  16. arXiv:2407.09522  [pdf, other]

    cs.DB cs.AI cs.LG stat.ML

    UQE: A Query Engine for Unstructured Databases

    Authors: Hanjun Dai, Bethany Yixin Wang, Xingchen Wan, Bo Dai, Sherry Yang, Azade Nova, Pengcheng Yin, Phitchaya Mangpo Phothilimthana, Charles Sutton, Dale Schuurmans

    Abstract: Analytics on structured data is a mature field with many successful methods. However, most real world data exists in unstructured form, such as images and conversations. We investigate the potential of Large Language Models (LLMs) to enable unstructured data analytics. In particular, we propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data…

    Submitted 23 June, 2024; originally announced July 2024.

  17. arXiv:2407.07809  [pdf, other]

    stat.ME

    Direct estimation and inference of higher-level correlations from lower-level measurements with applications in gene-pathway and proteomics studies

    Authors: Yue Wang, Haoran Shi

    Abstract: This paper tackles the challenge of estimating correlations between higher-level biological variables (e.g., proteins and gene pathways) when only lower-level measurements are directly observed (e.g., peptides and individual genes). Existing methods typically aggregate lower-level data into higher-level variables and then estimate correlations based on the aggregated data. However, different data…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 16 pages, 4 figures

  18. arXiv:2407.05564  [pdf, ps, other]

    math.OC stat.ML

    A Re-solving Heuristic for Dynamic Assortment Optimization with Knapsack Constraints

    Authors: Xi Chen, Mo Liu, Yining Wang, Yuan Zhou

    Abstract: In this paper, we consider a multi-stage dynamic assortment optimization problem with multi-nomial choice modeling (MNL) under resource knapsack constraints. Given the current resource inventory levels, the retailer makes an assortment decision at each period, and the goal of the retailer is to maximize the total profit from purchases. With the exact optimal dynamic assortment solution being compu…

    Submitted 7 July, 2024; originally announced July 2024.

  19. arXiv:2407.02539  [pdf]

    cs.RO cs.AI cs.LG stat.ML

    Research on Autonomous Robots Navigation based on Reinforcement Learning

    Authors: Zixiang Wang, Hao Yan, Yining Wang, Zhengjia Xu, Zhuoyue Wang, Zhizhong Wu

    Abstract: Reinforcement learning continuously optimizes decision-making based on real-time feedback reward signals through continuous interaction with the environment, demonstrating strong adaptive and self-learning capabilities. In recent years, it has become one of the key methods to achieve autonomous navigation of robots. In this work, an autonomous robot navigation method based on reinforcement learnin…

    Submitted 14 August, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

  20. arXiv:2407.02501  [pdf, other]

    cs.LG cs.CE eess.SY stat.AP

    Data-driven Power Flow Linearization: Theory

    Authors: Mengshuo Jia, Gabriela Hug, Ning Zhang, Zhaojian Wang, Yi Wang, Chongqing Kang

    Abstract: This two-part tutorial dives into the field of data-driven power flow linearization (DPFL), a domain gaining increased attention. DPFL stands out for its higher approximation accuracy, wide adaptability, and better ability to implicitly incorporate the latest system attributes. This renders DPFL a potentially superior option for managing the significant fluctuations from renewable energy sources,…

    Submitted 10 June, 2024; originally announced July 2024.

    Comments: 20 pages

  21. arXiv:2407.00397  [pdf, other]

    cs.LG stat.ML

    Markovian Gaussian Process: A Universal State-Space Representation for Stationary Temporal Gaussian Process

    Authors: Weihan Li, Yule Wang, Chengrui Li, Anqi Wu

    Abstract: Gaussian Processes (GPs) and Linear Dynamical Systems (LDSs) are essential time series and dynamic system modeling tools. GPs can handle complex, nonlinear dynamics but are computationally demanding, while LDSs offer efficient computation but lack the expressive power of GPs. To combine their benefits, we introduce a universal method that allows an LDS to mirror stationary temporal GPs. This state…

    Submitted 29 June, 2024; originally announced July 2024.

  22. arXiv:2406.19049  [pdf, other]

    cs.LG cs.AI stat.ML

    Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation

    Authors: Amartya Sanyal, Yaxi Hu, Yaodong Yu, Yian Ma, Yixin Wang, Bernhard Schölkopf

    Abstract: "Accuracy-on-the-line" is a widely observed phenomenon in machine learning, where a model's accuracy on in-distribution (ID) and out-of-distribution (OOD) data is positively correlated across different hyperparameters and data configurations. But when does this useful relationship break down? In this work, we explore its robustness. The key observation is that noisy data and the presence of nuisan…

    Submitted 27 June, 2024; originally announced June 2024.

  23. arXiv:2406.17698  [pdf, other]

    stat.ML cs.LG

    Identifying Nonstationary Causal Structures with High-Order Markov Switching Models

    Authors: Carles Balsells-Rodas, Yixin Wang, Pedro A. M. Mediano, Yingzhen Li

    Abstract: Causal discovery in time series is a rapidly evolving field with a wide variety of applications in other areas such as climate science and neuroscience. Traditional approaches assume a stationary causal graph, which can be adapted to nonstationary time series with time-dependent effects or heterogeneous noise. In this work we address nonstationarity via regime-dependent causal structures. We first…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: CI4TS Workshop @UAI2024

  24. arXiv:2406.17096  [pdf, other]

    cs.LG cs.AI stat.ML

    Model-Free Robust Reinforcement Learning with Sample Complexity Analysis

    Authors: Yudan Wang, Shaofeng Zou, Yue Wang

    Abstract: Distributionally Robust Reinforcement Learning (DR-RL) aims to derive a policy optimizing the worst-case performance within a predefined uncertainty set. Despite extensive research, previous DR-RL algorithms have predominantly favored model-based approaches, with limited availability of model-free methods offering convergence guarantees or sample complexities. This paper proposes a model-free DR-R…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: UAI 2024

  25. arXiv:2406.16306  [pdf, other]

    cs.CL cs.LG stat.ML

    Cascade Reward Sampling for Efficient Decoding-Time Alignment

    Authors: Bolian Li, Yifan Wang, Ananth Grama, Ruqi Zhang

    Abstract: Aligning large language models (LLMs) with human preferences is critical for their deployment. Recently, decoding-time alignment has emerged as an effective plug-and-play technique that requires no fine-tuning of model parameters. However, generating text that achieves both high reward and high likelihood remains a significant challenge. Existing methods often fail to generate high-reward text or…

    Submitted 24 June, 2024; originally announced June 2024.

  26. arXiv:2406.15523  [pdf, other]

    cs.LG stat.ML

    Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark

    Authors: Yili Wang, Yixin Liu, Xu Shen, Chenyu Li, Kaize Ding, Rui Miao, Ying Wang, Shirui Pan, Xin Wang

    Abstract: To build safe and reliable graph machine learning systems, unsupervised graph-level anomaly detection (GLAD) and unsupervised graph-level out-of-distribution (OOD) detection (GLOD) have received significant attention in recent years. Though those two lines of research indeed share the same objective, they have been studied independently in the community due to distinct evaluation setups, creating…

    Submitted 21 June, 2024; originally announced June 2024.

  27. arXiv:2406.13635  [pdf, ps, other]

    stat.ME math.ST stat.AP

    Temporal label recovery from noisy dynamical data

    Authors: Yuehaw Khoo, Xin T. Tong, Wanjie Wang, Yuguan Wang

    Abstract: Analyzing dynamical data often requires information of the temporal labels, but such information is unavailable in many applications. Recovery of these temporal labels, closely related to the seriation or sequencing problem, becomes crucial in the study. However, challenges arise due to the nonlinear nature of the data and the complexity of the underlying dynamical system, which may be periodic or…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 20 pages, 4 figures

  28. arXiv:2406.12839  [pdf, other]

    cs.LG math.DS math.OC math.PR stat.ML

    Evaluating the design space of diffusion-based generative models

    Authors: Yuqing Wang, Ye He, Molei Tao

    Abstract: Most existing theoretical investigations of the accuracy of diffusion models, albeit significant, assume the score function has been approximated to a certain accuracy, and then use this a priori bound to control the error of generation. This article instead provides a first quantitative understanding of the whole generation process, i.e., both training and sampling. More precisely, it conducts a…

    Submitted 25 July, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: Comments are welcome. Out of admiration we titled our paper after EDM, and hoped theorists' humor is not too corny

  29. arXiv:2406.11675  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models

    Authors: Yibin Wang, Haizhou Shi, Ligong Han, Dimitris Metaxas, Hao Wang

    Abstract: Large Language Models (LLMs) often suffer from overconfidence during inference, particularly when adapted to downstream domain-specific tasks with limited data. Previous work addresses this issue by employing approximate Bayesian estimation after the LLMs are trained, enabling them to quantify uncertainty. However, such post-training approaches' performance is severely limited by the parameters le…

    Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 27 pages, 3 figures, 9 tables; preprint, work in progress

  30. arXiv:2406.10917  [pdf, other]

    cs.LG stat.ML

    Bayesian Intervention Optimization for Causal Discovery

    Authors: Yuxuan Wang, Mingzhou Liu, Xinwei Sun, Wei Wang, Yizhou Wang

    Abstract: Causal discovery is crucial for understanding complex systems and informing decisions. While observational data can uncover causal relationships under certain assumptions, it often falls short, making active interventions necessary. Current methods, such as Bayesian and graph-theoretical approaches, do not prioritize decision-making and often rely on ideal conditions or information gain, which is…

    Submitted 16 June, 2024; originally announced June 2024.

  31. arXiv:2406.06838  [pdf, other]

    cs.LG cs.AI stat.ML

    Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes

    Authors: Dan Qiao, Kaiqi Zhang, Esha Singh, Daniel Soudry, Yu-Xiang Wang

    Abstract: We study the generalization of two-layer ReLU neural networks in a univariate nonparametric regression problem with noisy labels. This is a problem where kernels (e.g., NTK) are provably sub-optimal and benign overfitting does not happen, thus disqualifying existing theory for interpolating (0-loss, global optimal) solutions. We present a new theory of generalization for local minima that gr…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 51 pages

  32. arXiv:2406.06833  [pdf, other]

    eess.SY stat.AP

    Data-driven Power Flow Linearization: Simulation

    Authors: Mengshuo Jia, Gabriela Hug, Ning Zhang, Zhaojian Wang, Yi Wang, Chongqing Kang

    Abstract: Building on the theoretical insights of Part I, this paper, as the second part of the tutorial, dives deeper into data-driven power flow linearization (DPFL), focusing on comprehensive numerical testing. The necessity of these simulations stems from the theoretical analysis's inherent limitations, particularly the challenge of identifying the differences in real-world performance among DPFL method…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 26 pages

  33. arXiv:2406.04201  [pdf, ps, other]

    cs.LG cs.MA math.OC stat.ML

    Towards Principled Superhuman AI for Multiplayer Symmetric Games

    Authors: Jiawei Ge, Yuanhao Wang, Wenzhe Li, Chi Jin

    Abstract: Multiplayer games, when the number of players exceeds two, present unique challenges that fundamentally distinguish them from the extensively studied two-player zero-sum games. These challenges arise from the non-uniqueness of equilibria and the risk of agents performing highly suboptimally when adopting equilibrium strategies. While a line of recent works developed learning systems successfully a…

    Submitted 6 June, 2024; originally announced June 2024.

  34. arXiv:2406.04150  [pdf, other]

    stat.ME stat.ML

    A novel robust meta-analysis model using the $t$ distribution for outlier accommodation and detection

    Authors: Yue Wang, Jianhua Zhao, Fen Jiang, Lei Shi, Jianxin Pan

    Abstract: The random effects meta-analysis model is an important tool for integrating results from multiple independent studies. However, the standard model is based on the assumption of normal distributions for both random effects and within-study errors, making it susceptible to outlying studies. Although robust modeling using the $t$ distribution is an appealing idea, the existing work, that explores the use…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 15 pages, 7 figures

    MSC Class: 62P10; ACM Class: I.2.6

  35. arXiv:2406.02806  [pdf, other]

    cs.LG math.OC stat.ML

    Randomized Geometric Algebra Methods for Convex Neural Networks

    Authors: Yifei Wang, Sungyoon Kim, Paul Chu, Indu Subramaniam, Mert Pilanci

    Abstract: We introduce randomized algorithms to Clifford's Geometric Algebra, generalizing randomized linear algebra to hypercomplex vector spaces. This novel approach has many implications in machine learning, including training neural networks to global optimality via convex optimization. Additionally, we consider fine-tuning large language model (LLM) embeddings as a key application area, exploring the i…

    Submitted 8 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  36. arXiv:2406.01762  [pdf, other]

    cs.LG cs.AI stat.ML

    Non-Asymptotic Analysis for Single-Loop (Natural) Actor-Critic with Compatible Function Approximation

    Authors: Yudan Wang, Yue Wang, Yi Zhou, Shaofeng Zou

    Abstract: Actor-critic (AC) is a powerful method for learning an optimal policy in reinforcement learning, where the critic uses algorithms, e.g., temporal difference (TD) learning with function approximation, to evaluate the current policy and the actor updates the policy along an approximate gradient direction using information from the critic. This paper provides the tightest non-asymptotic conv…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  37. arXiv:2406.01335  [pdf, other]

    quant-ph q-fin.ST stat.ML

    Statistics-Informed Parameterized Quantum Circuit via Maximum Entropy Principle for Data Science and Finance

    Authors: Xi-Ning Zhuang, Zhao-Yun Chen, Cheng Xue, Xiao-Fan Xu, Chao Wang, Huan-Yu Liu, Tai-Ping Sun, Yun-Jie Wang, Yu-Chun Wu, Guo-Ping Guo

    Abstract: Quantum machine learning has demonstrated significant potential in solving practical problems, particularly in statistics-focused areas such as data science and finance. However, challenges remain in preparing and learning statistical models on a quantum processor due to issues with trainability and interpretability. In this letter, we utilize the maximum entropy principle to design a statistics-i…

    Submitted 18 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 19 pages, 5 figures

  38. arXiv:2406.00131  [pdf, other]

    cs.LG cs.CL stat.ML

    How In-Context Learning Emerges from Training on Unstructured Data: On the Role of Co-Occurrence, Positional Information, and Noise Structures

    Authors: Kevin Christian Wibisono, Yixin Wang

    Abstract: Large language models (LLMs) like transformers have impressive in-context learning (ICL) capabilities; they can generate predictions for new queries based on input-output sequences in prompts without parameter updates. While many theories have attempted to explain ICL, they often focus on structured training data similar to ICL tasks, such as regression. In practice, however, these models are trai…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: 33 pages

  39. arXiv:2405.18781  [pdf, other]

    cs.LG stat.ML

    On the Role of Attention Masks and LayerNorm in Transformers

    Authors: Xinyi Wu, Amir Ajorlou, Yifei Wang, Stefanie Jegelka, Ali Jadbabaie

    Abstract: Self-attention is the key mechanism of transformers, which are the essential building blocks of modern foundation models. Recent studies have shown that pure self-attention suffers from an increasing degree of rank collapse as depth increases, limiting model expressivity and further utilization of model depth. The existing literature on rank collapse, however, has mostly overlooked other critical…

    Submitted 29 May, 2024; originally announced May 2024.

  40. arXiv:2405.18634  [pdf, other]

    cs.LG cs.CL stat.ML

    A Theoretical Understanding of Self-Correction through In-context Alignment

    Authors: Yifei Wang, Yuyang Wu, Zeming Wei, Stefanie Jegelka, Yisen Wang

    Abstract: Going beyond mimicking limited human experiences, recent studies show initial evidence that, like humans, large language models (LLMs) are capable of improving their abilities purely by self-correction, i.e., correcting previous responses through self-examination, in certain circumstances. Nevertheless, little is known about how such capabilities arise. In this work, based on a simplified setup ak…

    Submitted 28 May, 2024; originally announced May 2024.

  41. arXiv:2405.17478  [pdf, other]

    cs.LG stat.ML

    ROSE: Register Assisted General Time Series Forecasting with Decomposed Frequency Learning

    Authors: Yihang Wang, Yuying Qiu, Peng Chen, Kai Zhao, Yang Shu, Zhongwen Rao, Lujia Pan, Bin Yang, Chenjuan Guo

    Abstract: With the increasing collection of time series data from various domains, there arises a strong demand for general time series forecasting models pre-trained on a large number of time-series datasets to support a variety of downstream prediction tasks. Enabling general time series forecasting faces two challenges: how to obtain unified representations from multi-domain time series data, and how to…

    Submitted 24 May, 2024; originally announced May 2024.

  42. arXiv:2405.11451  [pdf, ps, other]

    math.NA cs.AI math.AP stat.ML

    Error Analysis of Three-Layer Neural Network Trained with PGD for Deep Ritz Method

    Authors: Yuling Jiao, Yanming Lai, Yang Wang

    Abstract: Machine learning is a rapidly advancing field with diverse applications across various domains. One prominent area of research is the utilization of deep learning techniques for solving partial differential equations (PDEs). In this work, we specifically focus on employing a three-layer tanh neural network within the framework of the deep Ritz method (DRM) to solve second-order elliptic equations wi…

    Submitted 19 May, 2024; originally announced May 2024.

    MSC Class: 65N12; 65N15; 68T07; 62G05; 35J25

  43. arXiv:2405.08920  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Neural Collapse Meets Differential Privacy: Curious Behaviors of NoisyGD with Near-perfect Representation Learning

    Authors: Chendi Wang, Yuqing Zhu, Weijie J. Su, Yu-Xiang Wang

    Abstract: A recent study by De et al. (2022) has reported that large-scale representation learning through pre-training on a public dataset significantly enhances differentially private (DP) learning in downstream tasks, despite the high dimensionality of the feature space. To theoretically explain this phenomenon, we consider the setting of a layer-peeled model in representation learning, which results in…

    Submitted 16 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: To appear in ICML 2024

  44. arXiv:2405.07138  [pdf, other]

    stat.ME

    Large-dimensional Robust Factor Analysis with Group Structure

    Authors: Yong He, Xiaoyang Ma, Xingheng Wang, Yalin Wang

    Abstract: In this paper, we focus on exploiting the group structure for large-dimensional factor models, which captures the homogeneous effects of common factors on individuals within the same group. In view of the fact that datasets in macroeconomics and finance are typically heavy-tailed, we propose to identify the unknown group structure using the agglomerative hierarchical clustering algorithm and an in…

    Submitted 11 May, 2024; originally announced May 2024.

  45. arXiv:2405.00742  [pdf, other]

    cs.CR cs.LG stat.ML

    Federated Graph Learning for EV Charging Demand Forecasting with Personalization Against Cyberattacks

    Authors: Yi Li, Renyou Xie, Chaojie Li, Yi Wang, Zhaoyang Dong

    Abstract: Mitigating cybersecurity risk in electric vehicle (EV) charging demand forecasting plays a crucial role in the safe operation of collective EV charging, the stability of the power grid, and cost-effective infrastructure expansion. However, existing methods either suffer from data privacy issues and susceptibility to cyberattacks or fail to consider the spatial correlation among differe…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 11 pages, 4 figures

  46. arXiv:2405.00669  [pdf, other]

    astro-ph.CO physics.data-an stat.CO

    Euclid preparation. LensMC, weak lensing cosmic shear measurement with forward modelling and Markov Chain Monte Carlo sampling

    Authors: Euclid Collaboration, G. Congedo, L. Miller, A. N. Taylor, N. Cross, C. A. J. Duncan, T. Kitching, N. Martinet, S. Matthew, T. Schrabback, M. Tewes, N. Welikala, N. Aghanim, A. Amara, S. Andreon, N. Auricchio, M. Baldi, S. Bardelli, R. Bender, C. Bodendorf, D. Bonino, E. Branchini, M. Brescia, J. Brinchmann, S. Camera, et al. (217 additional authors not shown)

    Abstract: LensMC is a weak lensing shear measurement method developed for Euclid and Stage-IV surveys. It is based on forward modelling to deal with convolution by a point spread function with comparable size to many galaxies; sampling the posterior distribution of galaxy parameters via Markov Chain Monte Carlo; and marginalisation over nuisance parameters for each of the 1.5 billion galaxies observed by Eu…

    Submitted 13 August, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

    Comments: 29 pages, 18 figures, 2 tables

  47. arXiv:2404.19756  [pdf, other]

    cs.LG cond-mat.dis-nn cs.AI stat.ML

    KAN: Kolmogorov-Arnold Networks

    Authors: Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljačić, Thomas Y. Hou, Max Tegmark

    Abstract: Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KANs have no linear weights at all -- every weight parameter is replaced by a univariate function parametriz…

    Submitted 16 June, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: 48 pages, 20 figures. Codes are available at https://github.com/KindXiaoming/pykan

  48. arXiv:2404.12589  [pdf, other]

    math.PR cs.IT math.OC stat.CO

    A rate-distortion framework for MCMC algorithms: geometry and factorization of multivariate Markov chains

    Authors: Michael C. H. Choi, Youjia Wang, Geoffrey Wolfer

    Abstract: We introduce a framework rooted in a rate distortion problem for Markov chains, and show how a suite of commonly used Markov Chain Monte Carlo (MCMC) algorithms are specific instances within it, where the target stationary distribution is controlled by the distortion function. Our approach offers a unified variational view on the optimality of algorithms such as Metropolis-Hastings, Glauber dynami…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 63 pages, 6 figures

    MSC Class: 60F10; 60J10; 60J22; 94A15; 94A17

  49. arXiv:2404.09729  [pdf]

    eess.SP cs.IT cs.LG stat.ME

    Amplitude-Phase Fusion for Enhanced Electrocardiogram Morphological Analysis

    Authors: Shuaicong Hu, Yanan Wang, Jian Liu, Jingyu Lin, Shengmei Qin, Zhenning Nie, Zhifeng Yao, Wenjie Cai, Cuiwei Yang

    Abstract: Considering the variability of amplitude and phase patterns in electrocardiogram (ECG) signals due to cardiac activity and individual differences, existing entropy-based studies have not fully utilized these two patterns and lack integration. To address this gap, this paper proposes, for the first time, a novel fusion entropy metric, morphological ECG entropy (MEE), specifically designed for ECG mor…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 16 pages, 12 figures

    ACM Class: I.5.2

  50. arXiv:2404.09413  [pdf, other]

    stat.ML cs.CR cs.LG

    On the Optimal Regret of Locally Private Linear Contextual Bandit

    Authors: Jiachun Li, David Simchi-Levi, Yining Wang

    Abstract: Contextual bandit with linear reward functions is one of the most extensively studied models in bandit and online learning research. Recently, there has been increasing interest in designing locally private linear contextual bandit algorithms, where sensitive information contained in contexts and rewards is protected against leakage to the general public. While the classical linear co…

    Submitted 14 April, 2024; originally announced April 2024.