Showing 1–50 of 102 results for author: Zhu, S

  1. arXiv:2407.05625  [pdf, other]

    stat.ME cs.LG

    New User Event Prediction Through the Lens of Causal Inference

    Authors: Henry Shaowu Yuchi, Shixiang Zhu, Li Dong, Yigit M. Arisoy, Matthew C. Spencer

    Abstract: Modeling and analysis for event series generated by heterogeneous users of various behavioral patterns are closely involved in our daily lives, including credit card fraud detection, online platform user recommendation, and social network analysis. The most commonly adopted approach to this task is to classify users into behavior-based categories and analyze each of them separately. However, this…

    Submitted 10 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2403.17852  [pdf, other]

    cs.LG stat.ML

    Counterfactual Fairness through Transforming Data Orthogonal to Bias

    Authors: Shuyi Chen, Shixiang Zhu

    Abstract: Machine learning models have shown exceptional prowess in solving complex issues across various domains. However, these models can sometimes exhibit biased decision-making, resulting in unequal treatment of different groups. Despite substantial research on counterfactual fairness, methods to reduce the impact of multivariate and continuous sensitive variables on decision-making outcomes are still…

    Submitted 29 June, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

  3. arXiv:2402.03167  [pdf, other]

    math.OC cs.LG stat.ML

    Decentralized Bilevel Optimization over Graphs: Loopless Algorithmic Update and Transient Iteration Complexity

    Authors: Boao Kong, Shuchen Zhu, Songtao Lu, Xinmeng Huang, Kun Yuan

    Abstract: Stochastic bilevel optimization (SBO) is becoming increasingly essential in machine learning due to its versatility in handling nested structures. To address large-scale SBO, decentralized approaches have emerged as effective paradigms in which nodes communicate with immediate neighbors without a central server, thereby improving communication efficiency and enhancing algorithmic robustness. Howev…

    Submitted 26 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: 37 pages, 6 figures

  4. arXiv:2401.18017  [pdf, ps, other]

    stat.ML cs.LG

    Causal Discovery by Kernel Deviance Measures with Heterogeneous Transforms

    Authors: Tim Tse, Zhitang Chen, Shengyu Zhu, Yue Liu

    Abstract: The discovery of causal relationships in a set of random variables is a fundamental objective of science and has also recently been argued as being an essential component towards real machine intelligence. One class of causal discovery techniques are founded based on the argument that there are inherent structural asymmetries between the causal and anti-causal direction which could be leveraged in…

    Submitted 31 January, 2024; originally announced January 2024.

  5. arXiv:2310.18449  [pdf, other]

    stat.ML cs.CE cs.LG

    Black-Box Optimization with Implicit Constraints for Public Policy

    Authors: Wenqian Xing, Jungho Lee, Chong Liu, Shixiang Zhu

    Abstract: Black-box optimization (BBO) has become increasingly relevant for tackling complex decision-making problems, especially in public policy domains such as police redistricting. However, its broader application in public policymaking is hindered by the complexity of defining feasible regions and the high-dimensionality of decisions. This paper introduces a novel BBO framework, termed as the Condition…

    Submitted 16 August, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

  6. arXiv:2310.03258  [pdf, other]

    cs.LG stat.ME

    Assessing Electricity Service Unfairness with Transfer Counterfactual Learning

    Authors: Song Wei, Xiangrui Kong, Alinson Santos Xavier, Shixiang Zhu, Yao Xie, Feng Qiu

    Abstract: Energy justice is a growing area of interest in interdisciplinary energy research. However, identifying systematic biases in the energy sector remains challenging due to confounding variables, intricate heterogeneity in counterfactual effects, and limited data availability. First, this paper demonstrates how one can evaluate counterfactual unfairness in a power system by analyzing the average caus…

    Submitted 24 January, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: The preliminary version titled "Detecting Electricity Service Equity Issues with Transfer Counterfactual Learning on Large-Scale Outage Datasets" is presented at NeurIPS 2023 Workshops on Causal Representation Learning (CRL) and Algorithmic Fairness through the Lens of Time (AFT); See v1

  7. arXiv:2310.03218  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Energy-Based Prior Model with Diffusion-Amortized MCMC

    Authors: Peiyu Yu, Yaxuan Zhu, Sirui Xie, Xiaojian Ma, Ruiqi Gao, Song-Chun Zhu, Ying Nian Wu

    Abstract: Latent space Energy-Based Models (EBMs), also known as energy-based priors, have drawn growing interests in the field of generative modeling due to its flexibility in the formulation and strong modeling power of the latent space. However, the common practice of learning latent space EBMs with non-convergent short-run MCMC for prior and posterior sampling is hindering the model from further progres…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  8. arXiv:2309.01885  [pdf, other]

    stat.ML cs.CL cs.LG

    QuantEase: Optimization-based Quantization for Language Models

    Authors: Kayhan Behdin, Ayan Acharya, Aman Gupta, Qingquan Song, Siyu Zhu, Sathiya Keerthi, Rahul Mazumder

    Abstract: With the rising popularity of Large Language Models (LLMs), there has been an increasing interest in compression techniques that enable their efficient deployment. This study focuses on the Post-Training Quantization (PTQ) of LLMs. Drawing from recent advances, our work introduces QuantEase, a layer-wise quantization framework where individual layers undergo separate quantization. The problem is f…

    Submitted 1 December, 2023; v1 submitted 4 September, 2023; originally announced September 2023.

  9. arXiv:2305.15742  [pdf, other]

    stat.ML cs.LG stat.ME

    Counterfactual Generative Models for Time-Varying Treatments

    Authors: Shenghao Wu, Wenbin Zhou, Minshuo Chen, Shixiang Zhu

    Abstract: Estimating the counterfactual outcome of treatment is essential for decision-making in public health and clinical science, among others. Often, treatments are administered in a sequential, time-varying manner, leading to an exponentially increased number of possible counterfactual outcomes. Furthermore, in modern applications, the outcomes are high-dimensional and conventional average treatment ef…

    Submitted 13 July, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: Published at KDD'24

  10. arXiv:2305.12569  [pdf, other]

    stat.ML cs.LG

    Conditional Generative Modeling for High-dimensional Marked Temporal Point Processes

    Authors: Zheng Dong, Zekai Fan, Shixiang Zhu

    Abstract: Point processes offer a versatile framework for sequential event modeling. However, the computational challenges and constrained representational power of the existing point process models have impeded their potential for wider applications. This limitation becomes especially pronounced when dealing with event data that is associated with multi-dimensional or high-dimensional marks such as texts o…

    Submitted 14 February, 2024; v1 submitted 21 May, 2023; originally announced May 2023.

  11. Learning with linear mixed model for group recommendation systems

    Authors: Baode Gao, Guangpeng Zhan, Hanzhang Wang, Yiming Wang, Shengxin Zhu

    Abstract: Accurate prediction of users' responses to items is one of the main aims of many computational advising applications. Examples include recommending movies, news articles, songs, jobs, clothes, books and so forth. Accurate prediction of inactive users' responses still remains a challenging problem for many applications. In this paper, we explore the linear mixed model in recommendation system. The…

    Submitted 17 December, 2022; originally announced December 2022.

    Comments: 5 pages, 9 figures, published

    ACM Class: G.3

    Journal ref: In Proceedings of the 2019 11th International Conference on Machine Learning and Computing (pp. 81-85) (2019, February)

  12. arXiv:2210.16486  [pdf, other]

    cs.CV cs.LG stat.ML

    Learning Probabilistic Models from Generator Latent Spaces with Hat EBM

    Authors: Mitch Hill, Erik Nijkamp, Jonathan Mitchell, Bo Pang, Song-Chun Zhu

    Abstract: This work proposes a method for using any generator network as the foundation of an Energy-Based Model (EBM). Our formulation posits that observed images are the sum of unobserved latent variables passed through the generator network and a residual random variable that spans the gap between the generator output and the image manifold. One can then define an EBM that includes the generator as part…

    Submitted 12 January, 2023; v1 submitted 28 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022 camera ready

  13. arXiv:2206.08531  [pdf, ps, other]

    stat.ML cs.LG

    Reframed GES with a Neural Conditional Dependence Measure

    Authors: Xinwei Shen, Shengyu Zhu, Jiji Zhang, Shoubo Hu, Zhitang Chen

    Abstract: In a nonparametric setting, the causal structure is often identifiable only up to Markov equivalence, and for the purpose of causal inference, it is useful to learn a graphical representation of the Markov equivalence class (MEC). In this paper, we revisit the Greedy Equivalence Search (GES) algorithm, which is widely cited as a score-based algorithm for learning the MEC of the underlying causal s…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: Accepted to UAI 2022

  14. arXiv:2205.12243  [pdf, other]

    stat.ML cs.LG

    EBM Life Cycle: MCMC Strategies for Synthesis, Defense, and Density Modeling

    Authors: Mitch Hill, Jonathan Mitchell, Chu Chen, Yuan Du, Mubarak Shah, Song-Chun Zhu

    Abstract: This work presents strategies to learn an Energy-Based Model (EBM) according to the desired length of its MCMC sampling trajectories. MCMC trajectories of different lengths correspond to models with different purposes. Our experiments cover three different trajectory magnitudes and learning outcomes: 1) shortrun sampling for image generation; 2) midrun sampling for classifier-agnostic adversarial…

    Submitted 24 May, 2022; originally announced May 2022.

  15. arXiv:2203.11528  [pdf, other]

    stat.ML cs.LG

    Out-of-distribution Generalization with Causal Invariant Transformations

    Authors: Ruoyu Wang, Mingyang Yi, Zhitang Chen, Shengyu Zhu

    Abstract: In real-world applications, it is important and desirable to learn a model that performs well on out-of-distribution (OOD) data. Recently, causality has become a powerful tool to tackle the OOD generalization problem, with the idea resting on the causal mechanism that is invariant across domains of interest. To leverage the generally unknown causal mechanism, existing works assume a linear form of…

    Submitted 23 March, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

    Comments: Accepted by CVPR 2022

  16. arXiv:2111.15155  [pdf, other]

    cs.LG stat.ML

    gCastle: A Python Toolbox for Causal Discovery

    Authors: Keli Zhang, Shengyu Zhu, Marcus Kalander, Ignavier Ng, Junjian Ye, Zhitang Chen, Lujia Pan

    Abstract: $\texttt{gCastle}…

    Submitted 30 November, 2021; originally announced November 2021.

    Comments: Tech report describing the gCastle toolbox. More details can be found in the github repository https://github.com/huawei-noah/trustworthyAI/tree/master/gcastle

  17. arXiv:2111.05529  [pdf, other]

    cs.LG cs.CV stat.ML

    Understanding the Generalization Benefit of Model Invariance from a Data Perspective

    Authors: Sicheng Zhu, Bang An, Furong Huang

    Abstract: Machine learning models that are developed with invariance to certain types of data transformations have demonstrated superior generalization performance in practice. However, the underlying mechanism that explains why invariance leads to better generalization is not well-understood, limiting our ability to select appropriate data transformations for a given dataset. This paper studies the general…

    Submitted 23 February, 2023; v1 submitted 9 November, 2021; originally announced November 2021.

    Comments: Accepted to NeurIPS 2021. Version 2 includes several content clarifications and image format revisions

  18. arXiv:2109.09711  [pdf, other]

    stat.AP

    Quantifying Grid Resilience Against Extreme Weather Using Large-Scale Customer Power Outage Data

    Authors: Shixiang Zhu, Rui Yao, Yao Xie, Feng Qiu, Yueming Qiu, Xuan Wu

    Abstract: In recent decades, the weather around the world has become more irregular and extreme, often causing large-scale extended power outages. Resilience -- the capability of withstanding, adapting to, and recovering from a large-scale disruption -- has become a top priority for the power sector. However, the understanding of power grid resilience still stays on the conceptual level mostly or focuses on…

    Submitted 4 September, 2022; v1 submitted 20 September, 2021; originally announced September 2021.

  19. arXiv:2109.09029  [pdf, other]

    stat.AP

    Non-stationary spatio-temporal point process modeling for high-resolution COVID-19 data

    Authors: Zheng Dong, Shixiang Zhu, Yao Xie, Jorge Mateu, Francisco J. Rodríguez-Cortés

    Abstract: Most COVID-19 studies commonly report figures of the overall infection at a state- or county-level. This aggregation tends to miss out on fine details of virus propagation. In this paper, we analyze a high-resolution COVID-19 dataset in Cali, Colombia, that records the precise time and location of every confirmed case. We develop a non-stationary spatio-temporal point process equipped with a neura…

    Submitted 9 March, 2023; v1 submitted 18 September, 2021; originally announced September 2021.

  20. arXiv:2108.13285  [pdf, other]

    stat.AP

    Multi-resolution spatio-temporal prediction with application to wind power generation

    Authors: Zheng Dong, Hanyu Zhang, Shixiang Zhu, Yao Xie, Pascal Van Hentenryck

    Abstract: Wind energy is becoming an increasingly crucial component of a sustainable grid, but its inherent variability and limited predictability present challenges for grid operators. The energy sector needs novel forecasting techniques that can precisely predict the generation of renewable power and offer precise quantification of prediction uncertainty. This will facilitate well-informed decision-making…

    Submitted 2 December, 2023; v1 submitted 30 August, 2021; originally announced August 2021.

  21. arXiv:2106.10773  [pdf, other]

    cs.LG stat.ML

    Neural Spectral Marked Point Processes

    Authors: Shixiang Zhu, Haoyun Wang, Zheng Dong, Xiuyuan Cheng, Yao Xie

    Abstract: Self- and mutually-exciting point processes are popular models in machine learning and statistics for dependent discrete event data. To date, most existing models assume stationary kernels (including the classical Hawkes processes) and simple parametric models. Modern applications with complex event data require more general point process models that can incorporate contextual information of the e…

    Submitted 12 February, 2022; v1 submitted 20 June, 2021; originally announced June 2021.

  22. arXiv:2106.01694  [pdf, ps, other]

    stat.AP

    Analysis and Evaluation of the Inequality of the Spatial Distribution of Medical Resources in Jinan

    Authors: Shengkun Zhu

    Abstract: This article will analyze the inequality and evaluation of the spatial distribution of medical resources in Jinan. The research will be carried out from the following four aspects: analysis of existing medical resource allocation and distribution characteristics, medical resource accessibility analysis, inequality evaluation and optimization layout analysis. The article will use G2SFCA/M2SFCA Mode…

    Submitted 3 June, 2021; originally announced June 2021.

  23. arXiv:2106.00072  [pdf, other]

    stat.ML cs.LG stat.AP

    Early Detection of COVID-19 Hotspots Using Spatio-Temporal Data

    Authors: Shixiang Zhu, Alexander Bukharin, Liyan Xie, Khurram Yamin, Shihao Yang, Pinar Keskinocak, Yao Xie

    Abstract: Recently, the Centers for Disease Control and Prevention (CDC) has worked with other federal agencies to identify counties with increasing coronavirus disease 2019 (COVID-19) incidence (hotspots) and offers support to local health departments to limit the spread of the disease. Understanding the spatio-temporal dynamics of hotspot events is of great importance to support policy decisions and preve…

    Submitted 31 October, 2021; v1 submitted 31 May, 2021; originally announced June 2021.

  24. A Local Method for Identifying Causal Relations under Markov Equivalence

    Authors: Zhuangyan Fang, Yue Liu, Zhi Geng, Shengyu Zhu, Yangbo He

    Abstract: Causality is important for designing interpretable and robust methods in artificial intelligence research. We propose a local approach to identify whether a variable is a cause of a given target under the framework of causal graphical models of directed acyclic graphs (DAGs). In general, the causal relation between two variables may not be identifiable from observational data as many causal DAGs e…

    Submitted 5 March, 2022; v1 submitted 25 February, 2021; originally announced February 2021.

  25. arXiv:2009.07356  [pdf, other]

    stat.AP physics.soc-ph

    High-resolution Spatio-temporal Model for County-level COVID-19 Activity in the U.S

    Authors: Shixiang Zhu, Alexander Bukharin, Liyan Xie, Mauricio Santillana, Shihao Yang, Yao Xie

    Abstract: We present an interpretable high-resolution spatio-temporal model to estimate COVID-19 deaths together with confirmed cases one-week ahead of the current time, at the county-level and weekly aggregated, in the United States. A notable feature of our spatio-temporal model is that it considers the (a) temporal auto- and pairwise correlation of the two local time series (confirmed cases and death of…

    Submitted 20 August, 2021; v1 submitted 15 September, 2020; originally announced September 2020.

  26. arXiv:2006.10259  [pdf, other]

    q-bio.NC cs.LG stat.ML

    On Path Integration of Grid Cells: Group Representation and Isotropic Scaling

    Authors: Ruiqi Gao, Jianwen Xie, Xue-Xin Wei, Song-Chun Zhu, Ying Nian Wu

    Abstract: Understanding how grid cells perform path integration calculations remains a fundamental problem. In this paper, we conduct theoretical analysis of a general representation model of path integration by grid cells, where the 2D self-position is encoded as a higher dimensional vector, and the 2D self-motion is represented by a general transformation of the vector. We identify two conditions on the t…

    Submitted 3 November, 2021; v1 submitted 17 June, 2020; originally announced June 2020.

  27. arXiv:2006.09439  [pdf, other]

    math.ST cs.LG stat.ML

    Goodness-of-Fit Test for Mismatched Self-Exciting Processes

    Authors: Song Wei, Shixiang Zhu, Minghe Zhang, Yao Xie

    Abstract: Recently there have been many research efforts in developing generative models for self-exciting point processes, partly due to their broad applicability for real-world applications. However, rarely can we quantify how well the generative model captures the nature or ground-truth since it is usually unknown. The challenge typically lies in the fact that the generative models typically provide, at…

    Submitted 12 February, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: 28 pages, 11 figures, 3 tables. Accepted to AISTATS 2021. Camera-ready version

    MSC Class: 62G10 (Primary) 62L10; 62E20 (Secondary) ACM Class: G.3

  28. arXiv:2006.08205  [pdf, other]

    stat.ML cs.LG

    Learning Latent Space Energy-Based Prior Model

    Authors: Bo Pang, Tian Han, Erik Nijkamp, Song-Chun Zhu, Ying Nian Wu

    Abstract: We propose to learn energy-based model (EBM) in the latent space of a generator model, so that the EBM serves as a prior model that stands on the top-down network of the generator model. Both the latent space EBM and the top-down network can be learned jointly by maximum likelihood, which involves short-run MCMC sampling from both the prior and posterior distributions of the latent vector. Due to…

    Submitted 29 October, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020 Camera-Ready

  29. arXiv:2006.06897  [pdf, other]

    stat.ML cs.LG

    MCMC Should Mix: Learning Energy-Based Model with Neural Transport Latent Space MCMC

    Authors: Erik Nijkamp, Ruiqi Gao, Pavel Sountsov, Srinivas Vasudevan, Bo Pang, Song-Chun Zhu, Ying Nian Wu

    Abstract: Learning energy-based model (EBM) requires MCMC sampling of the learned model as an inner loop of the learning algorithm. However, MCMC sampling of EBMs in high-dimensional data space is generally not mixing, because the energy function, which is usually parametrized by a deep network, is highly multi-modal in the data space. This is a serious handicap for both theory and practice of EBMs. In this…

    Submitted 16 March, 2022; v1 submitted 11 June, 2020; originally announced June 2020.

  30. arXiv:2006.06649  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning

    Authors: Qing Li, Siyuan Huang, Yining Hong, Yixin Chen, Ying Nian Wu, Song-Chun Zhu

    Abstract: The goal of neural-symbolic computation is to integrate the connectionist and symbolist paradigms. Prior methods learn the neural-symbolic models using reinforcement learning (RL) approaches, which ignore the error propagation in the symbolic reasoning module and thus converge slowly with sparse rewards. In this paper, we address these issues and close the loop of neural-symbolic learning by (1) i…

    Submitted 27 July, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: ICML 2020. Project page: https://liqing-ustc.github.io/NGS

  31. arXiv:2006.05691  [pdf, other]

    cs.LG stat.ML

    On Low Rank Directed Acyclic Graphs and Causal Structure Learning

    Authors: Zhuangyan Fang, Shengyu Zhu, Jiji Zhang, Yue Liu, Zhitang Chen, Yangbo He

    Abstract: Despite several advances in recent years, learning causal structures represented by directed acyclic graphs (DAGs) remains a challenging task in high dimensional settings when the graphs to be learned are not sparse. In this paper, we propose to exploit a low rank assumption regarding the (weighted) adjacency matrix of a DAG causal model to help address this problem. We utilize existing low rank t…

    Submitted 15 May, 2023; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: This paper has been accepted by the IEEE Transactions on Neural Networks and Learning Systems

  32. arXiv:2006.04004  [pdf, other]

    stat.ML cs.LG

    Distributionally Robust Weighted $k$-Nearest Neighbors

    Authors: Shixiang Zhu, Liyan Xie, Minghe Zhang, Rui Gao, Yao Xie

    Abstract: Learning a robust classifier from a few samples remains a key challenge in machine learning. A major thrust of research has been focused on developing $k$-nearest neighbor ($k$-NN) based algorithms combined with metric learning that captures similarities between samples. When the samples are limited, robustness is especially crucial to ensure the generalization capability of the classifier. In thi…

    Submitted 16 February, 2022; v1 submitted 6 June, 2020; originally announced June 2020.

  33. arXiv:2005.13525  [pdf, other]

    stat.ML cs.CR cs.LG

    Stochastic Security: Adversarial Defense Using Long-Run Dynamics of Energy-Based Models

    Authors: Mitch Hill, Jonathan Mitchell, Song-Chun Zhu

    Abstract: The vulnerability of deep networks to adversarial attacks is a central problem for deep learning from the perspective of both cognition and security. The current most successful defense method is to train a classifier using adversarial images created during learning. Another defense approach involves transformation or purification of the original input to remove adversarial signals before the imag…

    Submitted 18 March, 2021; v1 submitted 27 May, 2020; originally announced May 2020.

    Comments: ICLR 2021

  34. arXiv:2005.08665  [pdf, other]

    cs.LG stat.AP stat.ML

    Spatio-Temporal Point Processes with Attention for Traffic Congestion Event Modeling

    Authors: Shixiang Zhu, Ruyi Ding, Minghe Zhang, Pascal Van Hentenryck, Yao Xie

    Abstract: We present a novel framework for modeling traffic congestion events over road networks. Using multi-modal data by combining count data from traffic sensors with police reports that report traffic incidents, we aim to capture two types of triggering effect for congestion events. Current traffic congestion at one location may cause future congestion over the road network, and traffic incidents may c…

    Submitted 31 May, 2021; v1 submitted 15 May, 2020; originally announced May 2020.

  35. arXiv:2005.04354  [pdf, ps, other]

    stat.ML cs.IT cs.LG

    Exact Asymptotics for Learning Tree-Structured Graphical Models with Side Information: Noiseless and Noisy Samples

    Authors: Anshoo Tandon, Vincent Y. F. Tan, Shiyao Zhu

    Abstract: Given side information that an Ising tree-structured graphical model is homogeneous and has no external field, we derive the exact asymptotics of learning its structure from independently drawn samples. Our results, which leverage the use of probabilistic tools from the theory of strong large deviations, refine the large deviation (error exponents) results of Tan, Anandkumar, Tong, and Willsky [IE…

    Submitted 8 May, 2020; originally announced May 2020.

  36. arXiv:2004.09660  [pdf, other]

    math.OC stat.AP

    Data-Driven Optimization for Police Beat Design in South Fulton, Georgia

    Authors: Shixiang Zhu, Alexander W. Bukharin, Le Lu, He Wang, Yao Xie

    Abstract: We redesign the police patrol beat in South Fulton, Georgia, in collaboration with the South Fulton Police Department (SFPD), using a predictive data-driven optimization approach. Due to rapid urban development and population growth, the existing police beat design done in the 1970s was far from efficient, which leads to low policing efficiency and long 911 call response time. We balance the polic…

    Submitted 23 August, 2021; v1 submitted 20 April, 2020; originally announced April 2020.

  37. arXiv:2002.11798  [pdf, other]

    cs.LG cs.CR cs.IT stat.ML

    Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization

    Authors: Sicheng Zhu, Xiao Zhang, David Evans

    Abstract: Training machine learning models that are robust against adversarial inputs poses seemingly insurmountable challenges. To better understand adversarial robustness, we consider the underlying problem of learning robust representations. We develop a notion of representation vulnerability that captures the maximum change of mutual information between the input and output distributions, under the wors…

    Submitted 5 July, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

    Comments: ICML 2020

  38. arXiv:2002.07281  [pdf, other]

    stat.ML cs.LG

    Deep Fourier Kernel for Self-Attentive Point Processes

    Authors: Shixiang Zhu, Minghe Zhang, Ruyi Ding, Yao Xie

    Abstract: We present a novel attention-based model for discrete event data to capture complex non-linear temporal dependence structures. We borrow the idea from the attention mechanism and incorporate it into the point processes' conditional intensity function. We further introduce a novel score function using Fourier kernel embedding, whose spectrum is represented using neural networks, which drastically d…

    Submitted 21 February, 2021; v1 submitted 17 February, 2020; originally announced February 2020.

  39. arXiv:2001.03311  [pdf, other]

    cs.LG stat.ML

    Guess First to Enable Better Compression and Adversarial Robustness

    Authors: Sicheng Zhu, Bang An, Shiyu Niu

    Abstract: Machine learning models are generally vulnerable to adversarial examples, which is in contrast to the robustness of humans. In this paper, we try to leverage one of the mechanisms in human recognition and propose a bio-inspired classification framework in which model inference is conditioned on label hypothesis. We provide a class of training objectives for this framework and an information bottle…

    Submitted 10 January, 2020; originally announced January 2020.

    Comments: Accepted by NeurIPS 2019 workshop on Information Theory and Machine Learning

  40. arXiv:1912.01909  [pdf, other]

    stat.ML cs.LG

    Learning Multi-layer Latent Variable Model via Variational Optimization of Short Run MCMC for Approximate Inference

    Authors: Erik Nijkamp, Bo Pang, Tian Han, Linqi Zhou, Song-Chun Zhu, Ying Nian Wu

    Abstract: This paper studies the fundamental problem of learning deep generative models that consist of multiple layers of latent variables organized in top-down architectures. Such models have high expressivity and allow for learning hierarchical representations. Learning such a generative model requires inferring the latent variables for each training example based on the posterior distribution of these l…

    Submitted 17 July, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

  41. arXiv:1911.11374  [pdf, other]

    stat.ML cs.LG

    Representation Learning: A Statistical Perspective

    Authors: Jianwen Xie, Ruiqi Gao, Erik Nijkamp, Song-Chun Zhu, Ying Nian Wu

    Abstract: Learning representations of data is an important problem in statistics and machine learning. While the origin of learning representations can be traced back to factor analysis and multidimensional scaling in statistics, it has become a central theme in deep learning with important applications in computer vision and computational neuroscience. In this article, we review recent advances in learning…

    Submitted 26 November, 2019; originally announced November 2019.

    Journal ref: Annual Review of Statistics and Its Application 2020

  42. arXiv:1911.11185  [pdf, other]

    cs.LG cs.AI stat.ML

    Theory-based Causal Transfer: Integrating Instance-level Induction and Abstract-level Structure Learning

    Authors: Mark Edmonds, Xiaojian Ma, Siyuan Qi, Yixin Zhu, Hongjing Lu, Song-Chun Zhu

    Abstract: Learning transferable knowledge across similar but different settings is a fundamental component of generalized intelligence. In this paper, we approach the transfer learning challenge from a causal theory perspective. Our agent is endowed with two basic yet general theories for transfer learning: (i) a task shares a common abstract structure that is invariant across domains, and (ii) the behavior…

    Submitted 25 November, 2019; originally announced November 2019.

    Comments: Accepted to AAAI 2020 as an oral

  43. arXiv:1911.07420  [pdf, other]

    cs.LG stat.ML

    A Graph Autoencoder Approach to Causal Structure Learning

    Authors: Ignavier Ng, Shengyu Zhu, Zhitang Chen, Zhuangyan Fang

    Abstract: Causal structure learning has been a challenging task in the past decades and several mainstream approaches such as constraint- and score-based methods have been studied with theoretical guarantees. Recently, a new approach has transformed the combinatorial structure learning problem into a continuous one and then solved it using gradient-based optimization methods. Following the recent state-of-t…

    Submitted 17 November, 2019; originally announced November 2019.

    Comments: NeurIPS 2019 Workshop "Do the right thing": machine learning and causal inference for improved decision making

  44. arXiv:1911.00685  [pdf]

    stat.CO math.NA stat.ML

    Sparse inversion for derivative of log determinant

    Authors: Shengxin Zhu, Andrew J Wathen

    Abstract: Algorithms for Gaussian process, marginal likelihood methods or restricted maximum likelihood methods often require derivatives of log determinant terms. These log determinants are usually parametric with variance parameters of the underlying statistical models. This paper demonstrates that, when the underlying matrix is sparse, how to take the advantage of sparse inversion---selected inversion wh…

    Submitted 2 November, 2019; originally announced November 2019.

    Comments: 15

    MSC Class: 65F05; 90C53

  45. arXiv:1910.09161  [pdf, other]

    stat.ML cs.LG

    Sequential Adversarial Anomaly Detection for One-Class Event Data

    Authors: Shixiang Zhu, Henry Shaowu Yuchi, Minghe Zhang, Yao Xie

    Abstract: We consider the sequential anomaly detection problem in the one-class setting when only the anomalous sequences are available and propose an adversarial sequential detector by solving a minimax problem to find an optimal detector against the worst-case sequences from a generator. The generator captures the dependence in sequential events using the marked point process model. The detector sequentia…

    Submitted 5 April, 2023; v1 submitted 21 October, 2019; originally announced October 2019.

  46. arXiv:1910.08527  [pdf, other]

    cs.LG stat.ME stat.ML

    Masked Gradient-Based Causal Structure Learning

    Authors: Ignavier Ng, Shengyu Zhu, Zhuangyan Fang, Haoyang Li, Zhitang Chen, Jun Wang

    Abstract: This paper studies the problem of learning causal structures from observational data. We reformulate the Structural Equation Model (SEM) with additive noises in a form parameterized by binary graph adjacency matrix and show that, if the original SEM is identifiable, then the binary adjacency matrix can be identified up to super-graphs of the true causal graph under mild conditions. We then utilize…

    Submitted 10 January, 2022; v1 submitted 18 October, 2019; originally announced October 2019.

    Comments: Accepted to SDM 2022

  47. arXiv:1909.04324  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Inducing Hierarchical Compositional Model by Sparsifying Generator Network

    Authors: Xianglei Xing, Tianfu Wu, Song-Chun Zhu, Ying Nian Wu

    Abstract: This paper proposes to learn hierarchical compositional AND-OR model for interpretable image synthesis by sparsifying the generator network. The proposed method adopts the scene-objects-parts-subparts-primitives hierarchy in image representation. A scene has different types (i.e., OR) each of which consists of a number of objects (i.e., AND). This can be recursively formulated across the scene-obj…

    Submitted 20 June, 2020; v1 submitted 10 September, 2019; originally announced September 2019.

    Comments: This is the CVPR version

  48. arXiv:1909.00513  [pdf, other]

    cs.LG cs.AI stat.ML

    Causal Discovery by Kernel Intrinsic Invariance Measure

    Authors: Zhitang Chen, Shengyu Zhu, Yue Liu, Tim Tse

    Abstract: Reasoning based on causality, instead of association has been considered as a key ingredient towards real machine intelligence. However, it is a challenging task to infer causal relationship/structure among variables. In recent years, an Independent Mechanism (IM) principle was proposed, stating that the mechanism generating the cause and the one mapping the cause to the effect are independent. As…

    Submitted 1 September, 2019; originally announced September 2019.

    Comments: 9 pages, preprint

  49. arXiv:1908.10037  [pdf, ps, other]

    cs.IT stat.ML

    Asymptotically Optimal One- and Two-Sample Testing with Kernels

    Authors: Shengyu Zhu, Biao Chen, Zhitang Chen, Pengfei Yang

    Abstract: We characterize the asymptotic performance of nonparametric one- and two-sample testing. The exponential decay rate or error exponent of the type-II error probability is used as the asymptotic performance metric, and an optimal test achieves the maximum rate subject to a constant level constraint on the type-I error probability. With Sanov's theorem, we derive a sufficient condition for one-sample…

    Submitted 5 February, 2021; v1 submitted 27 August, 2019; originally announced August 2019.

    Comments: Accepted to IEEE Transactions on Information Theory. This version may be further modified

  50. arXiv:1906.05467  [pdf, other]

    cs.LG stat.AP stat.ML

    Imitation Learning of Neural Spatio-Temporal Point Processes

    Authors: Shixiang Zhu, Shuang Li, Zhigang Peng, Yao Xie

    Abstract: We present a novel Neural Embedding Spatio-Temporal (NEST) point process model for spatio-temporal discrete event data and develop an efficient imitation learning (a type of reinforcement learning) based approach for model fitting. Despite the rapid development of one-dimensional temporal point processes for discrete event data, the study of spatial-temporal aspects of such data is relatively scar…

    Submitted 22 January, 2021; v1 submitted 12 June, 2019; originally announced June 2019.