Showing 1–32 of 32 results for author: Xie, W

  1. arXiv:2407.12993  [pdf, other]

    cs.LG stat.ML

    Improving SAM Requires Rethinking its Optimization Formulation

    Authors: Wanyun Xie, Fabian Latorre, Kimon Antonakopoulos, Thomas Pethick, Volkan Cevher

    Abstract: This paper rethinks Sharpness-Aware Minimization (SAM), which is originally formulated as a zero-sum game where the weights of a network and a bounded perturbation try to minimize/maximize, respectively, the same differentiable loss. To fundamentally improve this design, we argue that SAM should instead be reformulated using the 0-1 loss. As a continuous relaxation, we follow the simple convention…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: International Conference on Machine Learning (ICML), 2024

  2. arXiv:2405.04011  [pdf, other]

    q-bio.MN stat.ML

    Adjoint Sensitivity Analysis on Multi-Scale Bioprocess Stochastic Reaction Network

    Authors: Keilung Choy, Wei Xie

    Abstract: Motivated by the pressing challenges in the digital twin development for biomanufacturing systems, we introduce an adjoint sensitivity analysis (SA) approach to expedite the learning of mechanistic model parameters. In this paper, we consider enzymatic stochastic reaction networks representing a multi-scale bioprocess mechanistic model that allows us to integrate disparate data from diverse produc…

    Submitted 28 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: 11 pages, 2 figures

  3. arXiv:2405.03913  [pdf, other]

    q-bio.QM cs.LG stat.ML

    Digital Twin Calibration for Biological System-of-Systems: Cell Culture Manufacturing Process

    Authors: Fuqiang Cheng, Wei Xie, Hua Zheng

    Abstract: Biomanufacturing innovation relies on an efficient Design of Experiments (DoEs) to optimize processes and product quality. Traditional DoE methods, ignoring the underlying bioprocessing mechanisms, often suffer from a lack of interpretability and sample efficiency. This limitation motivates us to create a new optimal learning approach for digital twin model calibration. In this study, we consider…

    Submitted 28 June, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: 11 pages, 5 figures

  4. arXiv:2405.02783  [pdf, other]

    stat.ML cs.LG

    Linear Noise Approximation Assisted Bayesian Inference on Mechanistic Model of Partially Observed Stochastic Reaction Network

    Authors: Wandi Xu, Wei Xie

    Abstract: To support mechanism online learning and facilitate digital twin development for biomanufacturing processes, this paper develops an efficient Bayesian inference approach for partially observed enzymatic stochastic reaction network (SRN), a fundamental building block of multi-scale bioprocess mechanistic model. To tackle the critical challenges brought by the nonlinear stochastic differential equat…

    Submitted 28 June, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

    Comments: 11 pages, 2 figures

  5. arXiv:2310.18567  [pdf]

    stat.AP

    Reliability modeling and statistical inference of accelerated degradation data with memory effects and unit-to-unit variability

    Authors: Shi-Shun Chen, Xiao-Yang Li, Wenrui Xie

    Abstract: Accelerated degradation testing (ADT) is an effective way to evaluate the lifetime and reliability of highly reliable products. Markovian stochastic processes are usually applied to describe the degradation process. However, the degradation processes of some products are non-Markovian due to the interaction with environments. Besides, owing to the differences in materials and manufacturing process…

    Submitted 24 July, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

  6. arXiv:2305.07638  [pdf, ps, other]

    math.OC stat.ML

    On the Partial Convexification for Low-Rank Spectral Optimization: Rank Bounds and Algorithms

    Authors: Yongchun Li, Weijun Xie

    Abstract: A Low-rank Spectral Optimization Problem (LSOP) minimizes a linear objective subject to multiple two-sided linear matrix inequalities intersected with a low-rank and spectral constrained domain set. Although solving LSOP is, in general, NP-hard, its partial convexification (i.e., replacing the domain set by its convex hull) termed "LSOP-R," is often tractable and yields a high-quality solution. Th…

    Submitted 20 June, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

  7. arXiv:2210.16191  [pdf, other]

    math.OC stat.ML

    On the Exactness of Dantzig-Wolfe Relaxation for Rank Constrained Optimization Problems

    Authors: Yongchun Li, Weijun Xie

    Abstract: In the rank-constrained optimization problem (RCOP), it minimizes a linear objective function over a prespecified closed rank-constrained domain set and $m$ generic two-sided linear matrix inequalities. Motivated by the Dantzig-Wolfe (DW) decomposition, a popular approach of solving many nonconvex optimization problems, we investigate the strength of DW relaxation (DWR) of the RCOP, which admits t…

    Submitted 14 June, 2023; v1 submitted 28 October, 2022; originally announced October 2022.

  8. arXiv:2208.12341   

    stat.ML cs.LG

    Variance Reduction based Experience Replay for Policy Optimization

    Authors: Hua Zheng, Wei Xie, M. Ben Feng

    Abstract: For reinforcement learning on complex stochastic systems where many factors dynamically impact the output trajectories, it is desirable to effectively leverage the information from historical samples collected in previous iterations to accelerate policy optimization. Classical experience replay allows agents to remember by reusing historical observations. However, the uniform reuse strategy that t…

    Submitted 9 September, 2022; v1 submitted 25 August, 2022; originally announced August 2022.

    Comments: This work was intended as a replacement of arXiv:2110.08902 and any subsequent updates will appear there

  9. arXiv:2205.02410  [pdf, other]

    stat.ML cs.LG

    Sequential Importance Sampling for Hybrid Model Bayesian Inference to Support Bioprocess Mechanism Learning and Robust Control

    Authors: Wei Xie, Keqi Wang, Hua Zheng, Ben Feng

    Abstract: Driven by the critical needs of biomanufacturing 4.0, we introduce a probabilistic knowledge graph hybrid model characterizing the risk- and science-based understanding of bioprocess mechanisms. It can faithfully capture the important properties, including nonlinear reactions, partially observed state, and nonstationary dynamics. Given very limited real process observations, we derive a posterior…

    Submitted 29 September, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

    Comments: 11 pages, 2 figures

  10. arXiv:2203.16328  [pdf, other]

    cs.CV cs.LG stat.ML

    Smooth Robust Tensor Completion for Background/Foreground Separation with Missing Pixels: Novel Algorithm with Convergence Guarantee

    Authors: Bo Shen, Weijun Xie, Zhenyu Kong

    Abstract: The objective of this study is to address the problem of background/foreground separation with missing pixels by combining the video acquisition, video recovery, background/foreground separation into a single framework. To achieve this, a smooth robust tensor completion (SRTC) model is proposed to recover the data and decompose it into the static background and smooth foreground, respectively. Spe…

    Submitted 10 April, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

    Comments: 40 pages, 11 figures

  11. arXiv:2203.08980  [pdf, other]

    stat.ME eess.SY

    Stochastic Simulation Uncertainty Analysis to Accelerate Flexible Biomanufacturing Process Development

    Authors: Wei Xie, Russell R. Barton, Barry L. Nelson, Keqi Wang

    Abstract: Motivated by critical challenges and needs from biopharmaceuticals manufacturing, we propose a general metamodel-assisted stochastic simulation uncertainty analysis framework to accelerate the development of a simulation model with modular design for flexible production processes. There are often very limited process observations. Thus, there exist both simulation and model uncertainties in the sy…

    Submitted 3 September, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

    Comments: 32 pages, 3 figures. arXiv admin note: substantial text overlap with arXiv:2011.04207

  12. arXiv:2111.06968  [pdf, other]

    stat.ML cs.AI cs.LG

    Hierarchical clustering by aggregating representatives in sub-minimum-spanning-trees

    Authors: Wen-Bo Xie, Zhen Liu, Jaideep Srivastava

    Abstract: One of the main challenges for hierarchical clustering is how to appropriately identify the representative points in the lower level of the cluster tree, which are going to be utilized as the roots in the higher level of the cluster tree for further aggregation. However, conventional hierarchical clustering approaches have adopted some simple tricks to select the "representative" points which migh…

    Submitted 11 November, 2021; originally announced November 2021.

  13. arXiv:2104.01114  [pdf, other]

    stat.OT

    The general conformable fractional grey system model and its applications

    Authors: Wanli Xie, Mingyong Pang, Wen-Ze Wu, Chong Liu, Caixia Liu

    Abstract: Grey system theory is an important mathematical tool for describing uncertain information in the real world. It has been used to solve the uncertainty problems specially caused by lack of information. As a novel theory, the theory can deal with various fields and plays an important role in modeling the small sample problems. But many modeling mechanisms of grey system need to be answered, such as…

    Submitted 14 July, 2021; v1 submitted 28 March, 2021; originally announced April 2021.

  14. arXiv:2101.03735  [pdf, other]

    stat.ML cs.LG

    Biomanufacturing Harvest Optimization with Small Data

    Authors: Bo Wang, Wei Xie, Tugce Martagan, Alp Akcay, Bram van Ravenstein

    Abstract: In biopharmaceutical manufacturing, fermentation processes play a critical role in productivity and profit. A fermentation process uses living cells with complex biological mechanisms, leading to high variability in the process outputs, namely, the protein and impurity levels. By building on the biological mechanisms of protein and impurity growth, we introduce a stochastic model to characterize t…

    Submitted 6 July, 2024; v1 submitted 11 January, 2021; originally announced January 2021.

    Comments: 36 pages, 8 figures

  15. arXiv:2012.12356  [pdf, other]

    stat.ML cs.LG

    Unbiased Subdata Selection for Fair Classification: A Unified Framework and Scalable Algorithms

    Authors: Qing Ye, Weijun Xie

    Abstract: As an important problem in modern data analytics, classification has witnessed varieties of applications from different domains. Different from conventional classification approaches, fair classification concerns the issues of unintentional biases against the sensitive features (e.g., gender, race). Due to high nonconvexity of fairness measures, existing methods are often unable to model exact fai…

    Submitted 24 December, 2020; v1 submitted 22 December, 2020; originally announced December 2020.

    Comments: 42 pages, 4 Figures

  16. arXiv:2011.04207  [pdf, other]

    stat.ME

    Statistical Uncertainty Analysis for Stochastic Simulation

    Authors: Wei Xie, Barry L. Nelson, Russell R. Barton

    Abstract: When we use simulation to evaluate the performance of a stochastic system, the simulation often contains input distributions estimated from real-world data; therefore, there is both simulation and input uncertainty in the performance estimates. Ignoring either source of uncertainty underestimates the overall statistical error. Simulation uncertainty can be reduced by additional computation (e.g.,…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: 40 pages, 3 figures

  17. arXiv:2008.12438  [pdf, other]

    stat.ML cs.LG math.OC

    Exact and Approximation Algorithms for Sparse PCA

    Authors: Yongchun Li, Weijun Xie

    Abstract: Sparse PCA (SPCA) is a fundamental model in machine learning and data analytics, which has witnessed a variety of application areas such as finance, manufacturing, biology, healthcare. To select a prespecified-size principal submatrix from a covariance matrix to maximize its largest eigenvalue for the better interpretability purpose, SPCA advances the conventional PCA with both feature selection a…

    Submitted 27 August, 2020; originally announced August 2020.

    Comments: 49 pages, 1 figure

  18. arXiv:2006.09919  [pdf, other]

    cs.LG cs.AI stat.ML

    Green Simulation Assisted Reinforcement Learning with Model Risk for Biomanufacturing Learning and Control

    Authors: Hua Zheng, Wei Xie, Mingbin Ben Feng

    Abstract: Biopharmaceutical manufacturing faces critical challenges, including complexity, high variability, lengthy lead time, and limited historical data and knowledge of the underlying system stochastic process. To address these challenges, we propose a green simulation assisted model-based reinforcement learning to support process online learning and guide dynamic decision making. Basically, the process…

    Submitted 17 June, 2020; originally announced June 2020.

    Comments: 12 pages, 1 figure. To appear in the Proceedings of the 2020 Winter Simulation Conference (WSC)

  19. arXiv:2005.13607  [pdf, other]

    q-bio.QM cs.LG stat.ML

    Multi-View Graph Neural Networks for Molecular Property Prediction

    Authors: Hehuan Ma, Yatao Bian, Yu Rong, Wenbing Huang, Tingyang Xu, Weiyang Xie, Geyan Ye, Junzhou Huang

    Abstract: The crux of molecular property prediction is to generate meaningful representations of the molecules. One promising route is to exploit the molecular graph structure through Graph Neural Networks (GNNs). It is well known that both atoms and bonds significantly affect the chemical properties of a molecule, so an expressive model shall be able to exploit both node (atom) and edge (bond) information…

    Submitted 12 June, 2020; v1 submitted 17 May, 2020; originally announced May 2020.

  20. arXiv:2001.08537  [pdf, other]

    stat.ML cs.LG math.OC

    Best Principal Submatrix Selection for the Maximum Entropy Sampling Problem: Scalable Algorithms and Performance Guarantees

    Authors: Yongchun Li, Weijun Xie

    Abstract: This paper studies a classic maximum entropy sampling problem (MESP), which aims to select the most informative principal submatrix of a prespecified size from a covariance matrix. MESP has been widely applied to many areas, including healthcare, power system, manufacturing and data science. By investigating its Lagrangian dual and primal characterization, we derive a novel convex integer program…

    Submitted 1 May, 2023; v1 submitted 23 January, 2020; originally announced January 2020.

    Comments: 62 pages

  21. arXiv:1912.02522  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    VoxSRC 2019: The first VoxCeleb Speaker Recognition Challenge

    Authors: Joon Son Chung, Arsha Nagrani, Ernesto Coto, Weidi Xie, Mitchell McLaren, Douglas A Reynolds, Andrew Zisserman

    Abstract: The VoxCeleb Speaker Recognition Challenge 2019 aimed to assess how well current speaker recognition technology is able to identify speakers in unconstrained or `in the wild' data. It consisted of: (i) a publicly available speaker recognition dataset from YouTube videos together with ground truth annotation and standardised evaluation software; and (ii) a public challenge and workshop held at Inte…

    Submitted 5 December, 2019; originally announced December 2019.

    Comments: ISCA Archive

  22. arXiv:1910.05863  [pdf, other]

    math.OC cs.LG stat.ML

    Global-Local Metamodel Assisted Two-Stage Optimization via Simulation

    Authors: Wei Xie, Yuan Yi, Hua Zheng

    Abstract: To integrate strategic, tactical and operational decisions, the two-stage optimization has been widely used to guide dynamic decision making. In this paper, we study the two-stage stochastic programming for complex systems with unknown response estimated by simulation. We introduce the global-local metamodel assisted two-stage optimization via simulation that can efficiently employ the simulation…

    Submitted 13 October, 2019; originally announced October 2019.

  23. arXiv:1910.05845  [pdf, other]

    stat.ME

    A Pooled Quantile Estimator for Parallel Simulations

    Authors: Qiong Zhang, Bo Wang, Wei Xie

    Abstract: Quantile is an important risk measure quantifying the stochastic system random behaviors. This paper studies a pooled quantile estimator, which is the sample quantile of detailed simulation outputs after directly pooling independent sample paths together. We derive the asymptotic representation of the pooled quantile estimator and further prove its normality. By comparing with the classical quanti…

    Submitted 13 October, 2019; originally announced October 2019.

  24. arXiv:1910.03766  [pdf, other]

    stat.ME

    A Nonparametric Bayesian Framework for Uncertainty Quantification in Stochastic Simulation

    Authors: Wei Xie, Cheng Li, Yuefeng Wu, Pu Zhang

    Abstract: When we use simulation to assess the performance of stochastic systems, the input models used to drive simulation experiments are often estimated from finite real-world data. There exist both input model and simulation estimation uncertainties in the system performance estimates. Without strong prior information on the input models and the system mean response surface, in this paper, we propose a…

    Submitted 7 August, 2021; v1 submitted 8 October, 2019; originally announced October 2019.

    Comments: 54 pages, 1 figure

  25. arXiv:1909.04261  [pdf, other]

    stat.ML cs.LG eess.SY

    Interpretable Biomanufacturing Process Risk and Sensitivity Analyses for Quality-by-Design and Stability Control

    Authors: Wei Xie, Bo Wang, Cheng Li, Dongming Xie, Jared Auclair

    Abstract: While biomanufacturing plays a significant role in supporting the economy and ensuring public health, it faces critical challenges, including complexity, high variability, lengthy lead time, and very limited process data, especially for personalized new cell and gene biotherapeutics. Driven by these challenges, we propose an interpretable semantic bioprocess probabilistic knowledge graph and devel…

    Submitted 2 June, 2021; v1 submitted 9 September, 2019; originally announced September 2019.

    Comments: 41 pages, 8 figures

    Journal ref: Naval Research Logistics, 2021

  26. arXiv:1806.03756  [pdf, ps, other]

    stat.CO math.OC

    Scalable Algorithms for the Sparse Ridge Regression

    Authors: Weijun Xie, Xinwei Deng

    Abstract: Sparse regression and variable selection for large-scale data have been rapidly developed in the past decades. This work focuses on sparse ridge regression, which enforces the sparsity by use of the L0 norm. We first prove that the continuous relaxation of the mixed integer second order conic (MISOC) reformulation using perspective formulation is equivalent to that of the convex integer formulatio…

    Submitted 28 June, 2020; v1 submitted 10 June, 2018; originally announced June 2018.

    Comments: 31 pages

    MSC Class: 62J07; 90C10; 90C15

  27. arXiv:1802.08372  [pdf, ps, other]

    stat.ML cs.DS

    Approximation Algorithms for D-optimal Design

    Authors: Mohit Singh, Weijun Xie

    Abstract: Experimental design is a classical statistics problem and its aim is to estimate an unknown $m$-dimensional vector $β$ from linear measurements where a Gaussian noise is introduced in each measurement. For the combinatorial experimental design problem, the goal is to pick $k$ out of the given $n$ experiments so as to make the most accurate estimate of the unknown parameters, denoted as $\hatβ$. In…

    Submitted 26 September, 2019; v1 submitted 22 February, 2018; originally announced February 2018.

    Comments: 34 pages, accepted by Mathematics of Operations Research

  28. arXiv:1708.04741  [pdf]

    stat.AP

    A Novel Method of Subgroup Identification by Combining Virtual Twins with GUIDE (VG) for Development of Precision Medicines

    Authors: Jia Jia, Qi Tang, Wangang Xie, Richard Rode

    Abstract: A lack of understanding of human biology creates a hurdle for the development of precision medicines. To overcome this hurdle we need to better understand the potential synergy between a given investigational treatment (vs. placebo or active control) and various demographic or genetic factors, disease history and severity, etc., with the goal of identifying those patients at increased risk of exhi…

    Submitted 15 August, 2017; originally announced August 2017.

    Comments: 22 pages, 4 figures, 3 tables, all included in the main text

  29. arXiv:1706.03156  [pdf, other]

    stat.ME

    Functional principal variance component testing for a genetic association study of HIV progression

    Authors: Denis Agniel, Wen Xie, Myron Essex, Tianxi Cai

    Abstract: HIV-1C is the most prevalent subtype of HIV-1 and accounts for over half of HIV-1 infections worldwide. Host genetic influence of HIV infection has been previously studied in HIV-1B, but little attention has been paid to the more prevalent subtype C. To understand the role of host genetics in HIV-1C disease progression, we perform a study to assess the association between longitudinally collected…

    Submitted 9 June, 2017; originally announced June 2017.

    Comments: 20 pages, 6 figures

  30. A Geometric Approach to Visualization of Variability in Functional Data

    Authors: Weiyi Xie, Sebastian Kurtek, Karthik Bharath, Ying Sun

    Abstract: We propose a new method for the construction and visualization of boxplot-type displays for functional data. We use a recent functional data analysis framework, based on a representation of functions called square-root slope functions, to decompose observed variation in functional data into three main components: amplitude, phase, and vertical translation. We then construct separate displays for e…

    Submitted 3 February, 2017; originally announced February 2017.

    Comments: Journal of the American Statistical Association, 2016

  31. arXiv:1611.01170  [pdf, other]

    cs.LG cs.CR stat.ML

    PrivLogit: Efficient Privacy-preserving Logistic Regression by Tailoring Numerical Optimizers

    Authors: Wei Xie, Yang Wang, Steven M. Boker, Donald E. Brown

    Abstract: Safeguarding privacy in machine learning is highly desirable, especially in collaborative studies across many organizations. Privacy-preserving distributed machine learning (based on cryptography) is popular to solve the problem. However, existing cryptographic protocols still incur excess computational overhead. Here, we make a novel observation that this is partially due to naive adoption of mai…

    Submitted 3 November, 2016; originally announced November 2016.

    Comments: 24 pages, 4 figures. Work done and circulated since 2015

  32. arXiv:1608.04581  [pdf, ps, other]

    cs.LG stat.ML

    A novel transfer learning method based on common space mapping and weighted domain matching

    Authors: Ru-Ze Liang, Wei Xie, Weizhi Li, Hongqi Wang, Jim Jing-Yan Wang, Lisa Taylor

    Abstract: In this paper, we propose a novel learning framework for the problem of domain transfer learning. We map the data of two domains to one single common space, and learn a classifier in this common space. Then we adapt the common classifier to the two domains by adding two adaptive functions to it respectively. In the common space, the target domain data points are weighted and matched to the target…

    Submitted 16 August, 2016; originally announced August 2016.

    Comments: arXiv admin note: text overlap with arXiv:1605.06673