
Showing 1–50 of 147 results for author: Chen, M

Searching in archive stat.
  1. arXiv:2408.03746  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Flexible Bayesian Last Layer Models Using Implicit Priors and Diffusion Posterior Sampling

    Authors: Jian Xu, Zhiqi Lin, Shigui Li, Min Chen, Junmei Yang, Delu Zeng, John Paisley

    Abstract: Bayesian Last Layer (BLL) models focus solely on uncertainty in the output layer of neural networks, demonstrating comparable performance to more complex Bayesian models. However, the use of Gaussian priors for last layer weights in Bayesian Last Layer (BLL) models limits their expressive capacity when faced with non-Gaussian, outlier-rich, or high-dimensional datasets. To address this shortfall,…

    Submitted 7 August, 2024; originally announced August 2024.

  2. arXiv:2407.16134  [pdf, other]

    cs.LG math.ST stat.ML

    Diffusion Transformer Captures Spatial-Temporal Dependencies: A Theory for Gaussian Process Data

    Authors: Hengyu Fu, Zehao Dou, Jiawei Guo, Mengdi Wang, Minshuo Chen

    Abstract: Diffusion Transformer, the backbone of Sora for video generation, successfully scales the capacity of diffusion models, pioneering new avenues for high-fidelity sequential data generation. Unlike static data such as images, sequential data consists of consecutive data frames indexed by time, exhibiting rich spatial and temporal dependencies. These dependencies represent the underlying dynamic mode…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: 52 pages, 8 figures

  3. arXiv:2407.15580  [pdf, other]

    cs.LG cs.SD eess.AS math.PR stat.ML

    Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing

    Authors: David Perera, Victor Letzelter, Théo Mariotte, Adrien Cortés, Mickael Chen, Slim Essid, Gaël Richard

    Abstract: We introduce Annealed Multiple Choice Learning (aMCL) which combines simulated annealing with MCL. MCL is a learning framework handling ambiguous tasks by predicting a small set of plausible hypotheses. These hypotheses are trained using the Winner-takes-all (WTA) scheme, which promotes the diversity of the predictions. However, this scheme may converge toward an arbitrarily suboptimal local minim…

    Submitted 22 July, 2024; originally announced July 2024.

  4. arXiv:2404.14743  [pdf, other]

    stat.ML cs.LG

    Gradient Guidance for Diffusion Models: An Optimization Perspective

    Authors: Yingqing Guo, Hui Yuan, Yukang Yang, Minshuo Chen, Mengdi Wang

    Abstract: Diffusion models have demonstrated empirical successes in various applications and can be adapted to task-specific needs via guidance. This paper introduces a form of gradient guidance for adapting or fine-tuning diffusion models towards user-specified optimization objectives. We study the theoretic aspects of a guided score-based sampling process, linking the gradient-guided diffusion model to fi…

    Submitted 23 April, 2024; originally announced April 2024.

  5. arXiv:2404.07771  [pdf, other]

    cs.LG math.ST stat.ML

    An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization

    Authors: Minshuo Chen, Song Mei, Jianqing Fan, Mengdi Wang

    Abstract: Diffusion models, a powerful and universal generative AI technology, have achieved tremendous success in computer vision, audio, reinforcement learning, and computational biology. In these applications, diffusion models provide flexible high-dimensional data modeling, and act as a sampler for generating new samples under active guidance towards task-desired properties. Despite the significant empi…

    Submitted 11 April, 2024; originally announced April 2024.

  6. arXiv:2403.11968  [pdf, other]

    cs.LG math.ST stat.ML

    Unveil Conditional Diffusion Models with Classifier-free Guidance: A Sharp Statistical Theory

    Authors: Hengyu Fu, Zhuoran Yang, Mengdi Wang, Minshuo Chen

    Abstract: Conditional diffusion models serve as the foundation of modern image synthesis and find extensive application in fields like computational biology and reinforcement learning. In these applications, conditional diffusion models incorporate various conditional information, such as prompt input, to guide the sample generation towards desired properties. Despite the empirical success, theory of condit…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 92 pages, 5 figures

  7. arXiv:2403.09877  [pdf, other]

    stat.ME math.ST

    Quantifying Distributional Input Uncertainty via Inflated Kolmogorov-Smirnov Confidence Band

    Authors: Motong Chen, Henry Lam, Zhenyuan Liu

    Abstract: In stochastic simulation, input uncertainty refers to the propagation of the statistical noise in calibrating input models to impact output accuracy, in addition to the Monte Carlo simulation noise. The vast majority of the input uncertainty literature focuses on estimating target output quantities that are real-valued. However, outputs of simulation models are random and real-valued targets essen…

    Submitted 14 March, 2024; originally announced March 2024.

  8. arXiv:2403.01639  [pdf, other]

    cs.LG stat.ML

    Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models

    Authors: Yuchen Wu, Minshuo Chen, Zihao Li, Mengdi Wang, Yuting Wei

    Abstract: Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties. Such information is coined as guidance. For example, in text-to-image synthesis, text input is encoded as guidance to generate semantically aligned images. Proper guidance inputs are closely tied to the performance of diffusion models. A common…

    Submitted 3 March, 2024; originally announced March 2024.

    Comments: 41 pages, 12 figures

  9. arXiv:2401.06687  [pdf, other]

    cs.CL cs.LG stat.ME

    Proximal Causal Inference With Text Data

    Authors: Jacob M. Chen, Rohit Bhattacharya, Katherine A. Keith

    Abstract: Recent text-based causal methods attempt to mitigate confounding bias by estimating proxies of confounding variables that are partially or imperfectly measured from unstructured text data. These approaches, however, assume analysts have supervised labels of the confounders given text for a subset of instances, a constraint that is sometimes infeasible due to data privacy or annotation costs. In th…

    Submitted 21 May, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: 26 pages

  10. arXiv:2311.01052  [pdf, other]

    stat.ML cs.LG

    Resilient Multiple Choice Learning: A learned scoring scheme with application to audio scene analysis

    Authors: Victor Letzelter, Mathieu Fontaine, Mickaël Chen, Patrick Pérez, Slim Essid, Gaël Richard

    Abstract: We introduce Resilient Multiple Choice Learning (rMCL), an extension of the MCL approach for conditional distribution estimation in regression settings where multiple targets may be sampled for each training input. Multiple Choice Learning is a simple framework to tackle multimodal density estimation, using the Winner-Takes-All (WTA) loss for a set of hypotheses. In regression settings, the existi…

    Submitted 16 November, 2023; v1 submitted 2 November, 2023; originally announced November 2023.

    Journal ref: Advances in neural information processing systems, Dec 2023, New Orleans, United States

  11. arXiv:2310.15497  [pdf]

    stat.ME

    Generalized Box-Cox method to estimate sample mean and standard deviation for Meta-analysis

    Authors: Olivia Xiao, Stacy Wang, Min Chen

    Abstract: Meta-analysis is the aggregation of data from multiple studies to find patterns across a broad range relating to a particular subject. It is becoming increasingly useful to apply meta-analysis to summarize these studies being done across various fields. In meta-analysis, it is common to use the mean and standard deviation from each study to compare for analysis. While many studies reported mean an…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 9 pages, 6 figures

  12. arXiv:2310.10556  [pdf, other]

    cs.LG stat.ML

    Sample Complexity of Preference-Based Nonparametric Off-Policy Evaluation with Deep Networks

    Authors: Zihao Li, Xiang Ji, Minshuo Chen, Mengdi Wang

    Abstract: A recently popular approach to solving reinforcement learning is with data from human preferences. In fact, human preference data are now used with classic reinforcement learning algorithms such as actor-critic methods, which involve evaluating an intermediate policy over a reward learned from human preference data with distribution shift, known as off-policy evaluation (OPE). Such algorithm inclu…

    Submitted 26 February, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

  13. arXiv:2310.10013  [pdf, other]

    stat.ML cs.LG

    Riemannian Residual Neural Networks

    Authors: Isay Katsman, Eric Ming Chen, Sidhanth Holalkere, Anna Asch, Aaron Lou, Ser-Nam Lim, Christopher De Sa

    Abstract: Recent methods in geometric deep learning have introduced various neural networks to operate over data that lie on Riemannian manifolds. Such networks are often necessary to learn well over graphs with a hierarchical structure or to learn over manifold-valued data encountered in the natural sciences. These networks are often inspired by and directly generalize standard Euclidean neural networks. H…

    Submitted 15 October, 2023; originally announced October 2023.

    Comments: Published at NeurIPS 2023

  14. arXiv:2310.01756  [pdf, other]

    stat.ML cs.AI cs.LG

    Improved Algorithms for Adversarial Bandits with Unbounded Losses

    Authors: Mingyu Chen, Xuezhou Zhang

    Abstract: We consider the Adversarial Multi-Armed Bandits (MAB) problem with unbounded losses, where the algorithms have no prior knowledge on the sizes of the losses. We present UMAB-NN and UMAB-G, two algorithms for non-negative and general unbounded loss respectively. For non-negative unbounded loss, UMAB-NN achieves the first adaptive and scale free regret bound without uniform exploration. Built up on…

    Submitted 2 October, 2023; originally announced October 2023.

  15. Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds

    Authors: Zhenghao Xu, Xiang Ji, Minshuo Chen, Mengdi Wang, Tuo Zhao

    Abstract: Policy gradient methods equipped with deep neural networks have achieved great success in solving high-dimensional reinforcement learning (RL) problems. However, current analyses cannot explain why they are resistant to the curse of dimensionality. In this work, we study the sample complexity of the neural policy mirror descent (NPMD) algorithm with deep convolutional neural networks (CNN). Motiva…

    Submitted 14 January, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

  16. arXiv:2307.12975  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems

    Authors: Xiang Ji, Huazheng Wang, Minshuo Chen, Tuo Zhao, Mengdi Wang

    Abstract: For a real-world decision-making problem, the reward function often needs to be engineered or learned. A popular approach is to utilize human feedback to learn a reward function for training. The most straightforward way to do so is to ask humans to provide ratings for state-action pairs on an absolute scale and take these ratings as reward samples directly. Another popular way is to ask humans to…

    Submitted 28 October, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

  17. arXiv:2307.02884  [pdf, ps, other]

    cs.LG stat.ML

    Sample-Efficient Learning of POMDPs with Multiple Observations In Hindsight

    Authors: Jiacheng Guo, Minshuo Chen, Huan Wang, Caiming Xiong, Mengdi Wang, Yu Bai

    Abstract: This paper studies the sample-efficiency of learning in Partially Observable Markov Decision Processes (POMDPs), a challenging problem in reinforcement learning that is known to be exponentially hard in the worst-case. Motivated by real-world settings such as loading in game playing, we propose an enhanced feedback model called ``multiple observations in hindsight'', where after each episode of in…

    Submitted 6 July, 2023; originally announced July 2023.

  18. arXiv:2306.14859  [pdf, other]

    cs.LG stat.ML

    Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories

    Authors: Zixuan Zhang, Minshuo Chen, Mengdi Wang, Wenjing Liao, Tuo Zhao

    Abstract: Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to the intrinsic data structures. In real world applications, such an assumption of data lying exactly on a low dimensional manifold is stringent. This paper introduces a relaxed assumption that the input data are concentrated around a subset of…

    Submitted 26 June, 2023; originally announced June 2023.

  19. arXiv:2306.05511  [pdf, other]

    stat.ME

    Causal Inference With Outcome-Dependent Missingness And Self-Censoring

    Authors: Jacob M Chen, Daniel Malinsky, Rohit Bhattacharya

    Abstract: We consider missingness in the context of causal inference when the outcome of interest may be missing. If the outcome directly affects its own missingness status, i.e., it is "self-censoring", this may lead to severely biased causal effect estimates. Miao et al. [2015] proposed the shadow variable method to correct for bias due to self-censoring; however, verifying the required model assumptions…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: 15 pages. In proceedings of the 39th Conference on Uncertainty in Artificial Intelligence

  20. arXiv:2305.16150  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Unifying GANs and Score-Based Diffusion as Generative Particle Models

    Authors: Jean-Yves Franceschi, Mike Gartrell, Ludovic Dos Santos, Thibaut Issenhuth, Emmanuel de Bézenac, Mickaël Chen, Alain Rakotomamonjy

    Abstract: Particle-based deep generative models, such as gradient flows and score-based diffusion models, have recently gained traction thanks to their striking performance. Their principle of displacing particle distributions using differential equations is conventionally seen as opposed to the previously widespread generative adversarial networks (GANs), which involve training a pushforward generator netw…

    Submitted 21 December, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Journal ref: Thirty-seventh Conference on Neural Information Processing Systems, Neural Information Processing Systems Foundation, Dec. 2023, New Orleans, LA, USA

  21. arXiv:2305.15742  [pdf, other]

    stat.ML cs.LG stat.ME

    Counterfactual Generative Models for Time-Varying Treatments

    Authors: Shenghao Wu, Wenbin Zhou, Minshuo Chen, Shixiang Zhu

    Abstract: Estimating the counterfactual outcome of treatment is essential for decision-making in public health and clinical science, among others. Often, treatments are administered in a sequential, time-varying manner, leading to an exponentially increased number of possible counterfactual outcomes. Furthermore, in modern applications, the outcomes are high-dimensional and conventional average treatment ef…

    Submitted 13 July, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: Published at KDD'24

  22. arXiv:2305.13444  [pdf, other]

    stat.ME

    Ordinal Outcome State-Space Models for Intensive Longitudinal Data

    Authors: Teague R. Henry, Lindley R. Slipetz, Ami Falk, Jiaxing Qiu, Meng Chen

    Abstract: Intensive longitudinal (IL) data are increasingly prevalent in psychological science, coinciding with technological advancements that make it simple to deploy study designs such as daily diary and ecological momentary assessments. IL data are characterized by a rapid rate of data collection (1+ collections per day), over a period of time, allowing for the capture of the dynamics that underlie psyc…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: 28 pages, 6 figures, 7 pages supplementary materials

  23. arXiv:2305.10373  [pdf, other]

    stat.ME q-bio.NC

    Functional Connectivity: Continuous-Time Latent Factor Models for Neural Spike Trains

    Authors: Meixi Chen, Martin Lysy, David Moorman, Reza Ramezan

    Abstract: Modelling the dynamics of interactions in a neuronal ensemble is an important problem in functional connectivity research. One popular framework is latent factor models (LFMs), which have achieved notable success in decoding neuronal population dynamics. However, most LFMs are specified in discrete time, where the choice of bin size significantly impacts inference results. In this work, we present…

    Submitted 17 May, 2023; originally announced May 2023.

  24. arXiv:2303.06560  [pdf, other]

    stat.ME

    Causal Mediation Analysis with a Three-Dimensional Image Mediator

    Authors: Minghao Chen, Yingchun Zhou

    Abstract: Causal mediation analysis is increasingly abundant in biology, psychology, and epidemiology studies, etc. In particular, with the advent of the big data era, the issue of high-dimensional mediators is becoming more prevalent. In neuroscience, with the widespread application of magnetic resonance technology in the field of brain imaging, studies on image being a mediator emerged. In this study, a n…

    Submitted 7 July, 2023; v1 submitted 11 March, 2023; originally announced March 2023.

    Comments: 35 pages, 9 figures

  25. arXiv:2303.01469  [pdf, other]

    cs.LG cs.CV stat.ML

    Consistency Models

    Authors: Yang Song, Prafulla Dhariwal, Mark Chen, Ilya Sutskever

    Abstract: Diffusion models have significantly advanced the fields of image, audio, and video generation, but they depend on an iterative sampling process that causes slow generation. To overcome this limitation, we propose consistency models, a new family of models that generate high quality samples by directly mapping noise to data. They support fast one-step generation by design, while still allowing mult…

    Submitted 31 May, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: ICML 2023

  26. arXiv:2302.13183  [pdf, other]

    stat.ML cs.LG

    On Deep Generative Models for Approximation and Estimation of Distributions on Manifolds

    Authors: Biraj Dahal, Alex Havrilla, Minshuo Chen, Tuo Zhao, Wenjing Liao

    Abstract: Generative networks have experienced great empirical successes in distribution learning. Many existing experiments have demonstrated that generative networks can generate high-dimensional complex data from a low-dimensional easy-to-sample distribution. However, this phenomenon can not be justified by existing theories. The widely held manifold hypothesis speculates that real-world data sets, such…

    Submitted 25 February, 2023; originally announced February 2023.

  27. arXiv:2302.07194  [pdf, other]

    cs.LG stat.ML

    Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data

    Authors: Minshuo Chen, Kaixuan Huang, Tuo Zhao, Mengdi Wang

    Abstract: Diffusion models achieve state-of-the-art performance in various generation tasks. However, their theoretical foundations fall far behind. This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace. Our result provides sample complexity bounds for distribution estimation using diffusion mo…

    Submitted 14 February, 2023; originally announced February 2023.

    Comments: 52 pages, 4 figures

  28. arXiv:2212.10579  [pdf, other]

    hep-ph cs.LG hep-ex stat.ML

    Resonant Anomaly Detection with Multiple Reference Datasets

    Authors: Mayee F. Chen, Benjamin Nachman, Frederic Sala

    Abstract: An important class of techniques for resonant anomaly detection in high energy physics builds models that can distinguish between reference and target datasets, where only the latter has appreciable signal. Such techniques, including Classification Without Labels (CWoLa) and Simulation Assisted Likelihood-free Anomaly Detection (SALAD) rely on a single reference dataset. They cannot take advantage…

    Submitted 20 December, 2022; originally announced December 2022.

  29. arXiv:2211.10926  [pdf, other]

    stat.AP physics.soc-ph q-bio.PE

    Unraveling implicit human behavioral effects on dynamic characteristics of Covid-19 daily infection rates in Taiwan

    Authors: Ting-Li Chen, Elizabeth P. Chou, Min-Yi Chen, Hsieh Fushing

    Abstract: We study Covid-19 spreading dynamics underlying 84 curves of daily Covid-19 infection rates pertaining to 84 districts belonging to the largest seven cities in Taiwan during her pristine surge period. Our computational developments begin with selecting and extracting 18 features from each smoothed district-specific curve. This step of computing effort allows unstructured data to be converted into…

    Submitted 20 November, 2022; originally announced November 2022.

  30. arXiv:2209.11805  [pdf]

    cs.CY stat.ME

    Tracking the State and Behavior of People in Response to COVID-19 Through the Fusion of Multiple Longitudinal Data Streams

    Authors: Mohamed Amine Bouzaghrane, Hassan Obeid, Drake Hayes, Minnie Chen, Meiqing Li, Madeleine Parker, Daniel A. Rodríguez, Daniel G. Chatman, Karen Trapenberg Frick, Raja Sengupta, Joan Walker

    Abstract: The changing nature of the COVID-19 pandemic has highlighted the importance of comprehensively considering its impacts and considering changes over time. Most COVID-19 related research addresses narrowly focused research questions and is therefore limited in addressing the complexities created by the interrelated impacts of the pandemic. Such research generally makes use of only one of either 1) a…

    Submitted 1 October, 2022; v1 submitted 23 September, 2022; originally announced September 2022.

  31. arXiv:2206.09903  [pdf, other]

    stat.ME q-bio.NC

    A Multivariate Point Process Model for Simultaneously Recorded Neural Spike Trains

    Authors: Reza Ramezan, Meixi Chen, Martin Lysy, Paul Marriott

    Abstract: The current state-of-the-art in neurophysiological data collection allows for simultaneous recording of tens to hundreds of neurons, for which point processes are an appropriate statistical modelling framework. However, existing point process models lack multivariate generalizations which are both flexible and computationally tractable. This paper introduces a multivariate generalization of the Sk…

    Submitted 20 June, 2022; originally announced June 2022.

    Comments: 6 pages, 1 figure

  32. arXiv:2206.04569  [pdf, other]

    stat.ML cs.LG

    Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint

    Authors: Hao Liu, Minshuo Chen, Siawpeng Er, Wenjing Liao, Tong Zhang, Tuo Zhao

    Abstract: Overparameterized neural networks enjoy great representation power on complex data, and more importantly yield sufficiently smooth output, which is crucial to their generalization and robustness. Most existing function approximation theories suggest that with sufficiently many parameters, neural networks can well approximate certain classes of functions in terms of the function value. The neural n…

    Submitted 9 June, 2022; originally announced June 2022.

  33. arXiv:2206.02887  [pdf, other]

    cs.LG stat.ML

    Sample Complexity of Nonparametric Off-Policy Evaluation on Low-Dimensional Manifolds using Deep Networks

    Authors: Xiang Ji, Minshuo Chen, Mengdi Wang, Tuo Zhao

    Abstract: We consider the off-policy evaluation problem of reinforcement learning using deep convolutional neural networks. We analyze the deep fitted Q-evaluation method for estimating the expected cumulative reward of a target policy, when the data are generated from an unknown behavior policy. We show that, by choosing network size appropriately, one can leverage any low-dimensional manifold structure in…

    Submitted 3 October, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: 52 pages, 2 figures

  34. arXiv:2205.10467  [pdf, other]

    stat.ME

    Understanding the Risks and Rewards of Combining Unbiased and Possibly Biased Estimators, with Applications to Causal Inference

    Authors: Michael Oberst, Alexander D'Amour, Minmin Chen, Yuyan Wang, David Sontag, Steve Yadlowsky

    Abstract: Several problems in statistics involve the combination of high-variance unbiased estimators with low-variance estimators that are only unbiased under strong assumptions. A notable example is the estimation of causal effects while combining small experimental datasets with larger observational datasets. There exist a series of recent proposals on how to perform such a combination, even when the bia…

    Submitted 24 May, 2023; v1 submitted 20 May, 2022; originally announced May 2022.

  35. arXiv:2205.02043  [pdf, other]

    stat.ML cs.LG math.ST

    A Manifold Two-Sample Test Study: Integral Probability Metric with Neural Networks

    Authors: Jie Wang, Minshuo Chen, Tuo Zhao, Wenjing Liao, Yao Xie

    Abstract: Two-sample tests are important areas aiming to determine whether two collections of observations follow the same distribution or not. We propose two-sample tests based on integral probability metric (IPM) for high-dimensional samples supported on a low-dimensional manifold. We characterize the properties of proposed tests with respect to the number of samples $n$ and the structure of the manifold…

    Submitted 19 April, 2023; v1 submitted 4 May, 2022; originally announced May 2022.

    Comments: 32 pages, 2 figures, 3 tables. Accepted by Information and Inference: A Journal of the IMA

  36. arXiv:2204.07596  [pdf, other]

    stat.ML cs.LG

    Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning

    Authors: Mayee F. Chen, Daniel Y. Fu, Avanika Narayan, Michael Zhang, Zhao Song, Kayvon Fatahalian, Christopher Ré

    Abstract: An ideal learned representation should display transferability and robustness. Supervised contrastive learning (SupCon) is a promising method for training accurate models, but produces representations that do not capture these properties due to class collapse -- when all points in a class map to the same representation. Recent work suggests that "spreading out" these representations improves them,…

    Submitted 13 July, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

    Comments: ICML 2022 Camera Ready

  37. arXiv:2204.06963  [pdf, other]

    cs.LG cs.CR stat.ML

    Finding MNEMON: Reviving Memories of Node Embeddings

    Authors: Yun Shen, Yufei Han, Zhikun Zhang, Min Chen, Ting Yu, Michael Backes, Yang Zhang, Gianluca Stringhini

    Abstract: Previous security research efforts orbiting around graphs have been exclusively focusing on either (de-)anonymizing the graphs or understanding the security and privacy issues of graph neural networks. Little attention has been paid to understand the privacy risks of integrating the output from graph embedding models (e.g., node embeddings) with complex downstream machine learning pipelines. In th…

    Submitted 29 April, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

    Comments: To Appear in the 29th ACM Conference on Computer and Communications Security (CCS), November 7-11, 2022

  38. arXiv:2203.13270  [pdf, other]

    stat.ML cs.LG

    Shoring Up the Foundations: Fusing Model Embeddings and Weak Supervision

    Authors: Mayee F. Chen, Daniel Y. Fu, Dyah Adila, Michael Zhang, Frederic Sala, Kayvon Fatahalian, Christopher Ré

    Abstract: Foundation models offer an exciting new paradigm for constructing models with out-of-the-box embeddings and a few labeled examples. However, it is not clear how to best apply foundation models without labeled data. A potential approach is to fuse foundation models with weak supervision frameworks, which use weak label sources -- pre-trained models, heuristics, crowd-workers -- to construct pseudol…

    Submitted 1 August, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

    Comments: UAI 2022 Camera Ready

  39. arXiv:2201.10617  [pdf, other]

    cs.HC stat.AP

    Inform Product Change through Experimentation with Data-Driven Behavioral Segmentation

    Authors: Zhenyu Zhao, Yan He, Miao Chen

    Abstract: Online controlled experimentation is widely adopted for evaluating new features in the rapid development cycle for web products and mobile applications. Measurement of the overall experiment sample is a common practice to quantify the overall treatment effect. In order to understand why the treatment effect occurs in a certain way, segmentation becomes a valuable approach to a finer analysis of ex…

    Submitted 25 January, 2022; originally announced January 2022.

    Comments: 2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA). IEEE, 2017

  40. arXiv:2201.06763  [pdf, other]

    cs.LG stat.ML

    Online Time Series Anomaly Detection with State Space Gaussian Processes

    Authors: Christian Bock, François-Xavier Aubet, Jan Gasthaus, Andrey Kan, Ming Chen, Laurent Callot

    Abstract: We propose r-ssGPFA, an unsupervised online anomaly detection model for uni- and multivariate time series building on the efficient state space formulation of Gaussian processes. For high-dimensional time series, we propose an extension of Gaussian process factor analysis to identify the common latent processes of the time series, allowing us to detect anomalies efficiently in an interpretable man…

    Submitted 18 January, 2022; originally announced January 2022.

  41. arXiv:2201.00217  [pdf, other]

    stat.ML cs.LG

    Deep Nonparametric Estimation of Operators between Infinite Dimensional Spaces

    Authors: Hao Liu, Haizhao Yang, Minshuo Chen, Tuo Zhao, Wenjing Liao

    Abstract: Learning operators between infinitely dimensional spaces is an important learning task arising in wide applications in machine learning, imaging science, mathematical modeling and simulations, etc. This paper studies the nonparametric estimation of Lipschitz operators using deep neural networks. Non-asymptotic upper bounds are derived for the generalization error of the empirical risk minimizer ov…

    Submitted 1 January, 2022; originally announced January 2022.

  42. arXiv:2110.09704  [pdf, other]

    stat.ME eess.SY

    Hybrid variable monitoring: An unsupervised process monitoring framework with binary and continuous variables

    Authors: Min Wang, Donghua Zhou, Maoyin Chen

    Abstract: Traditional process monitoring methods, such as PCA, PLS, ICA, MD et al., are strongly dependent on continuous variables because most of them inevitably involve Euclidean or Mahalanobis distance. With industrial processes becoming more and more complex and integrated, binary variables also appear in monitoring variables besides continuous variables, which makes process monitoring more challenging.…

    Submitted 10 March, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

    Comments: This paper has been submitted to Automatica for potential publication

  43. arXiv:2110.07051  [pdf, other]

    stat.ME stat.CO

    Fast and Scalable Inference for Spatial Extreme Value Models

    Authors: Meixi Chen, Reza Ramezan, Martin Lysy

    Abstract: The generalized extreme value (GEV) distribution is a popular model for analyzing and forecasting extreme weather data. To increase prediction accuracy, spatial information is often pooled via a latent Gaussian process (GP) on the GEV parameters. Inference for GEV-GP models is typically carried out using Markov chain Monte Carlo (MCMC) methods, or using approximate inference methods such as the in…

    Submitted 16 May, 2024; v1 submitted 13 October, 2021; originally announced October 2021.

  44. arXiv:2110.02631  [pdf, other]

    cs.CR cs.LG stat.ML

    Inference Attacks Against Graph Neural Networks

    Authors: Zhikun Zhang, Min Chen, Michael Backes, Yun Shen, Yang Zhang

    Abstract: Graph is an important data representation ubiquitously existing in the real world. However, analyzing the graph data is computationally difficult due to its non-Euclidean nature. Graph embedding is a powerful tool to solve the graph analytics problem by transforming the graph data into low-dimensional vectors. These vectors could also be shared with third parties to gain additional insights of wha…

    Submitted 6 October, 2021; originally announced October 2021.

    Comments: 19 pages, 18 figures. To Appear in the 31st USENIX Security Symposium

  45. arXiv:2109.02832  [pdf, other]

    stat.ML cs.LG

    Besov Function Approximation and Binary Classification on Low-Dimensional Manifolds Using Convolutional Residual Networks

    Authors: Hao Liu, Minshuo Chen, Tuo Zhao, Wenjing Liao

    Abstract: Most of existing statistical theories on deep neural networks have sample complexities cursed by the data dimension and therefore cannot well explain the empirical success of deep learning on high-dimensional data. To bridge this gap, we propose to exploit low-dimensional geometric structures of the real world data sets. We establish theoretical guarantees of convolutional residual networks (ConvR…

    Submitted 10 September, 2021; v1 submitted 6 September, 2021; originally announced September 2021.

    Journal ref: ICML2021, 6770-6780

  46. arXiv:2106.05566  [pdf, other]

    cs.LG cs.NE stat.ML

    A Neural Tangent Kernel Perspective of GANs

    Authors: Jean-Yves Franceschi, Emmanuel de Bézenac, Ibrahim Ayed, Mickaël Chen, Sylvain Lamprier, Patrick Gallinari

    Abstract: We propose a novel theoretical framework of analysis for Generative Adversarial Networks (GANs). We reveal a fundamental flaw of previous analyses which, by incorrectly modeling GANs' training scheme, are subject to ill-defined discriminator gradients. We overcome this issue which impedes a principled study of GAN training, solving it within our framework by taking into account the discriminator's…

    Submitted 7 November, 2022; v1 submitted 10 June, 2021; originally announced June 2021.

    Journal ref: 39th International Conference on Machine Learning, International Machine Learning Society, Jul 2022, Baltimore, MD, United States. pp.6660-6704

  47. arXiv:2106.04197  [pdf]

    stat.ML cs.LG physics.geo-ph

    Seismic Inverse Modeling Method based on Generative Adversarial Network

    Authors: Pengfei Xie, YanShu Yin, JiaGen Hou, Mei Chen, Lixin Wang

    Abstract: Seismic inverse modeling is a common method in reservoir prediction and it plays a vital role in the exploration and development of oil and gas. Conventional seismic inversion method is difficult to combine with complicated and abstract knowledge on geological mode and its uncertainty is difficult to be assessed. The paper proposes an inversion modeling method based on GAN consistent with geology,…

    Submitted 8 June, 2021; originally announced June 2021.

    Comments: 22 pages, 13 figures

    MSC Class: 86-08; ACM Class: I.1.5

  48. arXiv:2104.09368  [pdf, other]

    econ.EM econ.GN stat.ML

    Deep Reinforcement Learning in a Monetary Model

    Authors: Mingli Chen, Andreas Joseph, Michael Kumhof, Xinlei Pan, Xuan Zhou

    Abstract: We propose using deep reinforcement learning to solve dynamic stochastic general equilibrium models. Agents are represented by deep artificial neural networks and learn to solve their dynamic optimisation problem by interacting with the model environment, of which they have no a priori knowledge. Deep reinforcement learning offers a flexible yet principled way to model bounded rationality within t…

    Submitted 5 January, 2023; v1 submitted 19 April, 2021; originally announced April 2021.

  49. arXiv:2103.14991  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Graph Unlearning

    Authors: Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert, Yang Zhang

    Abstract: Machine unlearning is a process of removing the impact of some training data from the machine learning (ML) models upon receiving removal requests. While straightforward and legitimate, retraining the ML model from scratch incurs a high computational overhead. To address this issue, a number of approximate algorithms have been proposed in the domain of image and text data, among which SISA is the…

    Submitted 16 September, 2022; v1 submitted 27 March, 2021; originally announced March 2021.

    Comments: To Appear in 2022 ACM SIGSAC Conference on Computer and Communications Security, November 7-11, 2022. Please cite our CCS version

  50. arXiv:2103.02761  [pdf, other]

    cs.LG stat.ML

    Comparing the Value of Labeled and Unlabeled Data in Method-of-Moments Latent Variable Estimation

    Authors: Mayee F. Chen, Benjamin Cohen-Wang, Stephen Mussmann, Frederic Sala, Christopher Ré

    Abstract: Labeling data for modern machine learning is expensive and time-consuming. Latent variable models can be used to infer labels from weaker, easier-to-acquire sources operating on unlabeled data. Such models can also be trained using labeled data, presenting a key question: should a user invest in few labeled or many unlabeled points? We answer this via a framework centered on model misspecification…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: To appear in AISTATS 2021