Search | arXiv e-print repository

Sketchy Moment Matching: Toward Fast and Provable Data Selection for Finetuning

Authors: Yijun Dong, Hoang Phan, Xiang Pan, Qi Lei

Abstract: We revisit data selection in a modern context of finetuning from a fundamental perspective. Extending the classical wisdom of variance minimization in low dimensions to high-dimensional finetuning, our generalization analysis unveils the importance of additionally reducing bias induced by low-rank approximation. Inspired by the variance-bias tradeoff in high dimensions from the theory, we introduc… ▽ More We revisit data selection in a modern context of finetuning from a fundamental perspective. Extending the classical wisdom of variance minimization in low dimensions to high-dimensional finetuning, our generalization analysis unveils the importance of additionally reducing bias induced by low-rank approximation. Inspired by the variance-bias tradeoff in high dimensions from the theory, we introduce Sketchy Moment Matching (SkMM), a scalable data selection scheme with two stages. (i) First, the bias is controlled using gradient sketching that explores the finetuning parameter space for an informative low-dimensional subspace $\mathcal{S}$; (ii) then the variance is reduced over $\mathcal{S}$ via moment matching between the original and selected datasets. Theoretically, we show that gradient sketching is fast and provably accurate: selecting $n$ samples by reducing variance over $\mathcal{S}$ preserves the fast-rate generalization $O(\dim(\mathcal{S})/n)$, independent of the parameter dimension. Empirically, we concretize the variance-bias balance via synthetic experiments and demonstrate the effectiveness of SkMM for finetuning in real vision tasks. △ Less

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2403.11125 [pdf]

Machine learning-based system reliability analysis with Gaussian Process Regression

Authors: Lisang Zhou, Ziqian Luo, Xueting Pan

Abstract: Machine learning-based reliability analysis methods have shown great advancements for their computational efficiency and accuracy. Recently, many efficient learning strategies have been proposed to enhance the computational performance. However, few of them explores the theoretical optimal learning strategy. In this article, we propose several theorems that facilitates such exploration. Specifical… ▽ More Machine learning-based reliability analysis methods have shown great advancements for their computational efficiency and accuracy. Recently, many efficient learning strategies have been proposed to enhance the computational performance. However, few of them explores the theoretical optimal learning strategy. In this article, we propose several theorems that facilitates such exploration. Specifically, cases that considering and neglecting the correlations among the candidate design samples are well elaborated. Moreover, we prove that the well-known U learning function can be reformulated to the optimal learning function for the case neglecting the Kriging correlation. In addition, the theoretical optimal learning strategy for sequential multiple training samples enrichment is also mathematically explored through the Bayesian estimate with the corresponding lost functions. Simulation results show that the optimal learning strategy considering the Kriging correlation works better than that neglecting the Kriging correlation and other state-of-the art learning functions from the literatures in terms of the reduction of number of evaluations of performance function. However, the implementation needs to investigate very large computational resource. △ Less

Submitted 20 April, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.06424 [pdf, other]

Bridging Domains with Approximately Shared Features

Authors: Ziliang Samuel Zhong, Xiang Pan, Qi Lei

Abstract: Multi-source domain adaptation aims to reduce performance degradation when applying machine learning models to unseen domains. A fundamental challenge is devising the optimal strategy for feature selection. Existing literature is somewhat paradoxical: some advocate for learning invariant features from source domains, while others favor more diverse features. To address the challenge, we propose a… ▽ More Multi-source domain adaptation aims to reduce performance degradation when applying machine learning models to unseen domains. A fundamental challenge is devising the optimal strategy for feature selection. Existing literature is somewhat paradoxical: some advocate for learning invariant features from source domains, while others favor more diverse features. To address the challenge, we propose a statistical framework that distinguishes the utilities of features based on the variance of their correlation to label $y$ across domains. Under our framework, we design and analyze a learning procedure consisting of learning approximately shared feature representation from source tasks and fine-tuning it on the target task. Our theoretical analysis necessitates the importance of learning approximately shared features instead of only the strictly invariant features and yields an improved population risk compared to previous results on both source and target tasks, thus partly resolving the paradox mentioned above. Inspired by our theory, we proposed a more practical way to isolate the content (invariant+approximately shared) from environmental features and further consolidate our theoretical findings. △ Less

Submitted 11 March, 2024; originally announced March 2024.

arXiv:2207.08556 [pdf, other]

A Certifiable Security Patch for Object Tracking in Self-Driving Systems via Historical Deviation Modeling

Authors: Xudong Pan, Qifan Xiao, Mi Zhang, Min Yang

Abstract: Self-driving cars (SDC) commonly implement the perception pipeline to detect the surrounding obstacles and track their moving trajectories, which lays the ground for the subsequent driving decision making process. Although the security of obstacle detection in SDC is intensively studied, not until very recently the attackers start to exploit the vulnerability of the tracking module. Compared with… ▽ More Self-driving cars (SDC) commonly implement the perception pipeline to detect the surrounding obstacles and track their moving trajectories, which lays the ground for the subsequent driving decision making process. Although the security of obstacle detection in SDC is intensively studied, not until very recently the attackers start to exploit the vulnerability of the tracking module. Compared with solely attacking the object detectors, this new attack strategy influences the driving decision more effectively with less attack budgets. However, little is known on whether the revealed vulnerability remains effective in end-to-end self-driving systems and, if so, how to mitigate the threat. In this paper, we present the first systematic research on the security of object tracking in SDC. Through a comprehensive case study on the full perception pipeline of a popular open-sourced self-driving system, Baidu's Apollo, we prove the mainstream multi-object tracker (MOT) based on Kalman Filter (KF) is unsafe even with an enabled multi-sensor fusion mechanism. Our root cause analysis reveals, the vulnerability is innate to the design of KF-based MOT, which shall error-handle the prediction results from the object detectors yet the adopted KF algorithm is prone to trust the observation more when its deviation from the prediction is larger. To address this design flaw, we propose a simple yet effective security patch for KF-based MOT, the core of which is an adaptive strategy to balance the focus of KF on observations and predictions according to the anomaly index of the observation-prediction deviation, and has certified effectiveness against a generalized hijacking attack model. Extensive evaluation on $4$ KF-based existing MOT implementations (including 2D and 3D, academic and Apollo ones) validate the defense effectiveness and the trivial performance overhead of our approach. △ Less

Submitted 18 July, 2022; originally announced July 2022.

arXiv:2206.14371 [pdf, other]

Matryoshka: Stealing Functionality of Private ML Data by Hiding Models in Model

Authors: Xudong Pan, Yifan Yan, Shengyao Zhang, Mi Zhang, Min Yang

Abstract: In this paper, we present a novel insider attack called Matryoshka, which employs an irrelevant scheduled-to-publish DNN model as a carrier model for covert transmission of multiple secret models which memorize the functionality of private ML data stored in local data centers. Instead of treating the parameters of the carrier model as bit strings and applying conventional steganography, we devise… ▽ More In this paper, we present a novel insider attack called Matryoshka, which employs an irrelevant scheduled-to-publish DNN model as a carrier model for covert transmission of multiple secret models which memorize the functionality of private ML data stored in local data centers. Instead of treating the parameters of the carrier model as bit strings and applying conventional steganography, we devise a novel parameter sharing approach which exploits the learning capacity of the carrier model for information hiding. Matryoshka simultaneously achieves: (i) High Capacity -- With almost no utility loss of the carrier model, Matryoshka can hide a 26x larger secret model or 8 secret models of diverse architectures spanning different application domains in the carrier model, neither of which can be done with existing steganography techniques; (ii) Decoding Efficiency -- once downloading the published carrier model, an outside colluder can exclusively decode the hidden models from the carrier model with only several integer secrets and the knowledge of the hidden model architecture; (iii) Effectiveness -- Moreover, almost all the recovered models have similar performance as if it were trained independently on the private data; (iv) Robustness -- Information redundancy is naturally implemented to achieve resilience against common post-processing techniques on the carrier before its publishing; (v) Covertness -- A model inspector with different levels of prior knowledge could hardly differentiate a carrier model from a normal model. △ Less

Submitted 28 June, 2022; originally announced June 2022.

Comments: A preprint work

arXiv:2205.02432 [pdf, other]

A Unified Algorithm for Penalized Convolution Smoothed Quantile Regression

Authors: Rebeka Man, Xiaoou Pan, Kean Ming Tan, Wen-Xin Zhou

Abstract: Penalized quantile regression (QR) is widely used for studying the relationship between a response variable and a set of predictors under data heterogeneity in high-dimensional settings. Compared to penalized least squares, scalable algorithms for fitting penalized QR are lacking due to the non-differentiable piecewise linear loss function. To overcome the lack of smoothness, a recently proposed c… ▽ More Penalized quantile regression (QR) is widely used for studying the relationship between a response variable and a set of predictors under data heterogeneity in high-dimensional settings. Compared to penalized least squares, scalable algorithms for fitting penalized QR are lacking due to the non-differentiable piecewise linear loss function. To overcome the lack of smoothness, a recently proposed convolution-type smoothed method brings an interesting tradeoff between statistical accuracy and computational efficiency for both standard and penalized quantile regressions. In this paper, we propose a unified algorithm for fitting penalized convolution smoothed quantile regression with various commonly used convex penalties, accompanied by an R-language package conquer available from the Comprehensive R Archive Network. We perform extensive numerical studies to demonstrate the superior performance of the proposed algorithm over existing methods in both statistical and computational aspects. We further exemplify the proposed algorithm by fitting a fused lasso additive QR model on the world happiness data. △ Less

Submitted 5 May, 2022; originally announced May 2022.

arXiv:2104.09368 [pdf, other]

Deep Reinforcement Learning in a Monetary Model

Authors: Mingli Chen, Andreas Joseph, Michael Kumhof, Xinlei Pan, Xuan Zhou

Abstract: We propose using deep reinforcement learning to solve dynamic stochastic general equilibrium models. Agents are represented by deep artificial neural networks and learn to solve their dynamic optimisation problem by interacting with the model environment, of which they have no a priori knowledge. Deep reinforcement learning offers a flexible yet principled way to model bounded rationality within t… ▽ More We propose using deep reinforcement learning to solve dynamic stochastic general equilibrium models. Agents are represented by deep artificial neural networks and learn to solve their dynamic optimisation problem by interacting with the model environment, of which they have no a priori knowledge. Deep reinforcement learning offers a flexible yet principled way to model bounded rationality within this general class of models. We apply our proposed approach to a classical model from the adaptive learning literature in macroeconomics which looks at the interaction of monetary and fiscal policy. We find that, contrary to adaptive learning, the artificially intelligent household can solve the model in all policy regimes. △ Less

Submitted 5 January, 2023; v1 submitted 19 April, 2021; originally announced April 2021.

arXiv:2012.05187 [pdf, other]

Smoothed Quantile Regression with Large-Scale Inference

Authors: Xuming He, Xiaoou Pan, Kean Ming Tan, Wen-Xin Zhou

Abstract: Quantile regression is a powerful tool for learning the relationship between a response variable and a multivariate predictor while exploring heterogeneous effects. In this paper, we consider statistical inference for quantile regression with large-scale data in the "increasing dimension" regime. We provide a comprehensive and in-depth analysis of a convolution-type smoothing approach that achieve… ▽ More Quantile regression is a powerful tool for learning the relationship between a response variable and a multivariate predictor while exploring heterogeneous effects. In this paper, we consider statistical inference for quantile regression with large-scale data in the "increasing dimension" regime. We provide a comprehensive and in-depth analysis of a convolution-type smoothing approach that achieves adequate approximation to computation and inference for quantile regression. This method, which we refer to as {\it{conquer}}, turns the non-differentiable quantile loss function into a twice-differentiable, convex and locally strongly convex surrogate, which admits a fast and scalable Barzilai-Borwein gradient-based algorithm to perform optimization, and multiplier bootstrap for statistical inference. Theoretically, we establish explicit non-asymptotic bounds on both estimation and Bahadur-Kiefer linearization errors, from which we show that the asymptotic normality of the conquer estimator holds under a weaker requirement on the number of the regressors than needed for conventional quantile regression. Moreover, we prove the validity of multiplier bootstrap confidence constructions. Our numerical studies confirm the conquer estimator as a practical and reliable approach to large-scale inference for quantile regression. Software implementing the methodology is available in the \texttt{R} package \texttt{conquer}. △ Less

Submitted 17 May, 2021; v1 submitted 9 December, 2020; originally announced December 2020.

Comments: An R package conquer for fitting smoothed quantile regression is available in CRAN, https://cran.r-project.org/web/packages/conquer/index.html

arXiv:2010.13356 [pdf, other]

Exploring the Security Boundary of Data Reconstruction via Neuron Exclusivity Analysis

Authors: Xudong Pan, Mi Zhang, Yifan Yan, Jiaming Zhu, Min Yang

Abstract: Among existing privacy attacks on the gradient of neural networks, \emph{data reconstruction attack}, which reverse engineers the training batch from the gradient, poses a severe threat on the private training data. Despite its empirical success on large architectures and small training batches, unstable reconstruction accuracy is also observed when a smaller architecture or a larger batch is unde… ▽ More Among existing privacy attacks on the gradient of neural networks, \emph{data reconstruction attack}, which reverse engineers the training batch from the gradient, poses a severe threat on the private training data. Despite its empirical success on large architectures and small training batches, unstable reconstruction accuracy is also observed when a smaller architecture or a larger batch is under attack. Due to the weak interpretability of existing learning-based attacks, there is little known on why, when and how data reconstruction attack is feasible. In our work, we perform the first analytic study on the security boundary of data reconstruction from gradient via a microcosmic view on neural networks with rectified linear units (ReLUs), the most popular activation function in practice. For the first time, we characterize the insecure/secure boundary of data reconstruction attack in terms of the \emph{neuron exclusivity state} of a training batch, indexed by the number of \emph{\textbf{Ex}clusively \textbf{A}ctivated \textbf{N}eurons} (ExANs, i.e., a ReLU activated by only one sample in a batch). Intuitively, we show a training batch with more ExANs are more vulnerable to data reconstruction attack and vice versa. On the one hand, we construct a novel deterministic attack algorithm which substantially outperforms previous attacks for reconstructing training batches lying in the insecure boundary of a neural network. Meanwhile, for training batches lying in the secure boundary, we prove the impossibility of unique reconstruction, based on which an exclusivity reduction strategy is devised to enlarge the secure boundary for mitigation purposes. △ Less

Submitted 22 December, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

Comments: Accepted by USENIX Security'22; A preprint version

arXiv:2005.10831 [pdf]

Repurpose Open Data to Discover Therapeutics for COVID-19 using Deep Learning

Authors: Xiangxiang Zeng, Xiang Song, Tengfei Ma, Xiaoqin Pan, Yadi Zhou, Yuan Hou, Zheng Zhang, George Karypis, Feixiong Cheng

Abstract: There have been more than 850,000 confirmed cases and over 48,000 deaths from the human coronavirus disease 2019 (COVID-19) pandemic, caused by novel severe acute respiratory syndrome coronavirus (SARS-CoV-2), in the United States alone. However, there are currently no proven effective medications against COVID-19. Drug repurposing offers a promising way for the development of prevention and treat… ▽ More There have been more than 850,000 confirmed cases and over 48,000 deaths from the human coronavirus disease 2019 (COVID-19) pandemic, caused by novel severe acute respiratory syndrome coronavirus (SARS-CoV-2), in the United States alone. However, there are currently no proven effective medications against COVID-19. Drug repurposing offers a promising way for the development of prevention and treatment strategies for COVID-19. This study reports an integrative, network-based deep learning methodology to identify repurposable drugs for COVID-19 (termed CoV-KGE). Specifically, we built a comprehensive knowledge graph that includes 15 million edges across 39 types of relationships connecting drugs, diseases, genes, pathways, and expressions, from a large scientific corpus of 24 million PubMed publications. Using Amazon AWS computing resources, we identified 41 repurposable drugs (including indomethacin, toremifene and niclosamide) whose therapeutic association with COVID-19 were validated by transcriptomic and proteomic data in SARS-CoV-2 infected human cells and data from ongoing clinical trials. While this study, by no means recommends specific drugs, it demonstrates a powerful deep learning methodology to prioritize existing drugs for further investigation, which holds the potential of accelerating therapeutic development for COVID-19. △ Less

Submitted 21 May, 2020; originally announced May 2020.

MSC Class: I.2.1

arXiv:2004.12909 [pdf, other]

Evolutionary Stochastic Policy Distillation

Authors: Hao Sun, Xinyu Pan, Bo Dai, Dahua Lin, Bolei Zhou

Abstract: Solving the Goal-Conditioned Reward Sparse (GCRS) task is a challenging reinforcement learning problem due to the sparsity of reward signals. In this work, we propose a new formulation of GCRS tasks from the perspective of the drifted random walk on the state space, and design a novel method called Evolutionary Stochastic Policy Distillation (ESPD) to solve them based on the insight of reducing th… ▽ More Solving the Goal-Conditioned Reward Sparse (GCRS) task is a challenging reinforcement learning problem due to the sparsity of reward signals. In this work, we propose a new formulation of GCRS tasks from the perspective of the drifted random walk on the state space, and design a novel method called Evolutionary Stochastic Policy Distillation (ESPD) to solve them based on the insight of reducing the First Hitting Time of the stochastic process. As a self-imitate approach, ESPD enables a target policy to learn from a series of its stochastic variants through the technique of policy distillation (PD). The learning mechanism of ESPD can be considered as an Evolution Strategy (ES) that applies perturbations upon policy directly on the action space, with a SELECT function to check the superiority of stochastic variants and then use PD to update the policy. The experiments based on the MuJoCo robotics control suite show the high learning efficiency of the proposed method. △ Less

Submitted 30 April, 2020; v1 submitted 27 April, 2020; originally announced April 2020.

arXiv:1909.12220 [pdf, other]

Implicit Semantic Data Augmentation for Deep Networks

Authors: Yulin Wang, Xuran Pan, Shiji Song, Hong Zhang, Cheng Wu, Gao Huang

Abstract: In this paper, we propose a novel implicit semantic data augmentation (ISDA) approach to complement traditional augmentation techniques like flipping, translation or rotation. Our work is motivated by the intriguing property that deep networks are surprisingly good at linearizing features, such that certain directions in the deep feature space correspond to meaningful semantic transformations, e.g… ▽ More In this paper, we propose a novel implicit semantic data augmentation (ISDA) approach to complement traditional augmentation techniques like flipping, translation or rotation. Our work is motivated by the intriguing property that deep networks are surprisingly good at linearizing features, such that certain directions in the deep feature space correspond to meaningful semantic transformations, e.g., adding sunglasses or changing backgrounds. As a consequence, translating training samples along many semantic directions in the feature space can effectively augment the dataset to improve generalization. To implement this idea effectively and efficiently, we first perform an online estimate of the covariance matrix of deep features for each class, which captures the intra-class semantic variations. Then random vectors are drawn from a zero-mean normal distribution with the estimated covariance to augment the training data in that class. Importantly, instead of augmenting the samples explicitly, we can directly minimize an upper bound of the expected cross-entropy (CE) loss on the augmented training set, leading to a highly efficient algorithm. In fact, we show that the proposed ISDA amounts to minimizing a novel robust CE loss, which adds negligible extra computational cost to a normal training procedure. Although being simple, ISDA consistently improves the generalization performance of popular deep models (ResNets and DenseNets) on a variety of datasets, e.g., CIFAR-10, CIFAR-100 and ImageNet. Code for reproducing our results is available at https://github.com/blackfeather-wang/ISDA-for-Deep-Networks. △ Less

Submitted 24 April, 2020; v1 submitted 26 September, 2019; originally announced September 2019.

Comments: Accepted by NeurIPS 2019

arXiv:1909.03403 [pdf, other]

Open Compound Domain Adaptation

Authors: Ziwei Liu, Zhongqi Miao, Xingang Pan, Xiaohang Zhan, Dahua Lin, Stella X. Yu, Boqing Gong

Abstract: A typical domain adaptation approach is to adapt models trained on the annotated data in a source domain (e.g., sunny weather) for achieving high performance on the test data in a target domain (e.g., rainy weather). Whether the target contains a single homogeneous domain or multiple heterogeneous domains, existing works always assume that there exist clear distinctions between the domains, which… ▽ More A typical domain adaptation approach is to adapt models trained on the annotated data in a source domain (e.g., sunny weather) for achieving high performance on the test data in a target domain (e.g., rainy weather). Whether the target contains a single homogeneous domain or multiple heterogeneous domains, existing works always assume that there exist clear distinctions between the domains, which is often not true in practice (e.g., changes in weather). We study an open compound domain adaptation (OCDA) problem, in which the target is a compound of multiple homogeneous domains without domain labels, reflecting realistic data collection from mixed and novel situations. We propose a new approach based on two technical insights into OCDA: 1) a curriculum domain adaptation strategy to bootstrap generalization across domains in a data-driven self-organizing fashion and 2) a memory module to increase the model's agility towards novel domains. Our experiments on digit classification, facial expression recognition, semantic segmentation, and reinforcement learning demonstrate the effectiveness of our approach. △ Less

Submitted 29 March, 2020; v1 submitted 8 September, 2019; originally announced September 2019.

Comments: To appear in CVPR 2020 as an oral presentation. Code, datasets and models are available at: https://liuziwei7.github.io/projects/CompoundDomain.html

arXiv:1907.09470 [pdf, other]

Characterizing Attacks on Deep Reinforcement Learning

Authors: Xinlei Pan, Chaowei Xiao, Warren He, Shuang Yang, Jian Peng, Mingjie Sun, Jinfeng Yi, Zijiang Yang, Mingyan Liu, Bo Li, Dawn Song

Abstract: Recent studies show that Deep Reinforcement Learning (DRL) models are vulnerable to adversarial attacks, which attack DRL models by adding small perturbations to the observations. However, some attacks assume full availability of the victim model, and some require a huge amount of computation, making them less feasible for real world applications. In this work, we make further explorations of the… ▽ More Recent studies show that Deep Reinforcement Learning (DRL) models are vulnerable to adversarial attacks, which attack DRL models by adding small perturbations to the observations. However, some attacks assume full availability of the victim model, and some require a huge amount of computation, making them less feasible for real world applications. In this work, we make further explorations of the vulnerabilities of DRL by studying other aspects of attacks on DRL using realistic and efficient attacks. First, we adapt and propose efficient black-box attacks when we do not have access to DRL model parameters. Second, to address the high computational demands of existing attacks, we introduce efficient online sequential attacks that exploit temporal consistency across consecutive steps. Third, we explore the possibility of an attacker perturbing other aspects in the DRL setting, such as the environment dynamics. Finally, to account for imperfections in how an attacker would inject perturbations in the physical world, we devise a method for generating a robust physical perturbations to be printed. The attack is evaluated on a real-world robot under various conditions. We conduct extensive experiments both in simulation such as Atari games, robotics and autonomous driving, and on real-world robotics, to compare the effectiveness of the proposed attacks with baseline approaches. To the best of our knowledge, we are the first to apply adversarial attacks on DRL systems to physical robots. △ Less

Submitted 16 February, 2022; v1 submitted 21 July, 2019; originally announced July 2019.

Comments: AAMAS 2022, 13 pages, 6 figures

arXiv:1907.04027 [pdf, other]

Iteratively Reweighted $\ell_1$-Penalized Robust Regression

Authors: Xiaoou Pan, Qiang Sun, Wen-Xin Zhou

Abstract: This paper investigates tradeoffs among optimization errors, statistical rates of convergence and the effect of heavy-tailed errors for high-dimensional robust regression with nonconvex regularization. When the additive errors in linear models have only bounded second moment, we show that iteratively reweighted $\ell_1$-penalized adaptive Huber regression estimator satisfies exponential deviation… ▽ More This paper investigates tradeoffs among optimization errors, statistical rates of convergence and the effect of heavy-tailed errors for high-dimensional robust regression with nonconvex regularization. When the additive errors in linear models have only bounded second moment, we show that iteratively reweighted $\ell_1$-penalized adaptive Huber regression estimator satisfies exponential deviation bounds and oracle properties, including the oracle convergence rate and variable selection consistency, under a weak beta-min condition. Computationally, we need as many as $O(\log s + \log\log d)$ iterations to reach such an oracle estimator, where $s$ and $d$ denote the sparsity and ambient dimension, respectively. Extension to a general class of robust loss functions is also considered. Numerical studies lend strong support to our methodology and theory. △ Less

Submitted 29 December, 2020; v1 submitted 9 July, 2019; originally announced July 2019.

Comments: 62 pages

arXiv:1904.11082 [pdf, other]

How You Act Tells a Lot: Privacy-Leakage Attack on Deep Reinforcement Learning

Authors: Xinlei Pan, Weiyao Wang, Xiaoshuai Zhang, Bo Li, Jinfeng Yi, Dawn Song

Abstract: Machine learning has been widely applied to various applications, some of which involve training with privacy-sensitive data. A modest number of data breaches have been studied, including credit card information in natural language data and identities from face dataset. However, most of these studies focus on supervised learning models. As deep reinforcement learning (DRL) has been deployed in a n… ▽ More Machine learning has been widely applied to various applications, some of which involve training with privacy-sensitive data. A modest number of data breaches have been studied, including credit card information in natural language data and identities from face dataset. However, most of these studies focus on supervised learning models. As deep reinforcement learning (DRL) has been deployed in a number of real-world systems, such as indoor robot navigation, whether trained DRL policies can leak private information requires in-depth study. To explore such privacy breaches in general, we mainly propose two methods: environment dynamics search via genetic algorithm and candidate inference based on shadow policies. We conduct extensive experiments to demonstrate such privacy vulnerabilities in DRL under various settings. We leverage the proposed algorithms to infer floor plans from some trained Grid World navigation DRL agents with LiDAR perception. The proposed algorithm can correctly infer most of the floor plans and reaches an average recovery rate of 95.83% using policy gradient trained agents. In addition, we are able to recover the robot configuration in continuous control environments and an autonomous driving simulator with high accuracy. To the best of our knowledge, this is the first work to investigate privacy leakage in DRL settings and we show that DRL-based agents do potentially leak privacy-sensitive information from the trained policies. △ Less

Submitted 24 April, 2019; originally announced April 2019.

Comments: The first three authors contributed equally. Accepted by AAMAS 2019

arXiv:1806.07001 [pdf, ps, other]

Theoretical Analysis of Image-to-Image Translation with Adversarial Learning

Authors: Xudong Pan, Mi Zhang, Daizong Ding

Abstract: Recently, a unified model for image-to-image translation tasks within adversarial learning framework has aroused widespread research interests in computer vision practitioners. Their reported empirical success however lacks solid theoretical interpretations for its inherent mechanism. In this paper, we reformulate their model from a brand-new geometrical perspective and have eventually reached a f… ▽ More Recently, a unified model for image-to-image translation tasks within adversarial learning framework has aroused widespread research interests in computer vision practitioners. Their reported empirical success however lacks solid theoretical interpretations for its inherent mechanism. In this paper, we reformulate their model from a brand-new geometrical perspective and have eventually reached a full interpretation on some interesting but unclear empirical phenomenons from their experiments. Furthermore, by extending the definition of generalization for generative adversarial nets to a broader sense, we have derived a condition to control the generalization capability of their model. According to our derived condition, several practical suggestions have also been proposed on model design and dataset construction as a guidance for further empirical researches. △ Less

Submitted 18 June, 2018; originally announced June 2018.

Comments: will appear in ICML2018

arXiv:1806.04245 [pdf, other]

Learning to Speed Up Structured Output Prediction

Authors: Xingyuan Pan, Vivek Srikumar

Abstract: Predicting structured outputs can be computationally onerous due to the combinatorially large output spaces. In this paper, we focus on reducing the prediction time of a trained black-box structured classifier without losing accuracy. To do so, we train a speedup classifier that learns to mimic a black-box classifier under the learning-to-search approach. As the structured classifier predicts more… ▽ More Predicting structured outputs can be computationally onerous due to the combinatorially large output spaces. In this paper, we focus on reducing the prediction time of a trained black-box structured classifier without losing accuracy. To do so, we train a speedup classifier that learns to mimic a black-box classifier under the learning-to-search approach. As the structured classifier predicts more examples, the speedup classifier will operate as a learned heuristic to guide search to favorable regions of the output space. We present a mistake bound for the speedup classifier and identify inference situations where it can independently make correct judgments without input features. We evaluate our method on the task of entity and relation extraction and show that the speedup classifier outperforms even greedy search in terms of speed without loss of accuracy. △ Less

Submitted 11 June, 2018; originally announced June 2018.

Comments: International Conference on Machine Learning, Stockholm, Sweden, 2018

arXiv:1712.02270 [pdf, ps, other]

Attention based convolutional neural network for predicting RNA-protein binding sites

Authors: Xiaoyong Pan, Junchi Yan

Abstract: RNA-binding proteins (RBPs) play crucial roles in many biological processes, e.g. gene regulation. Computational identification of RBP binding sites on RNAs are urgently needed. In particular, RBPs bind to RNAs by recognizing sequence motifs. Thus, fast locating those motifs on RNA sequences is crucial and time-efficient for determining whether the RNAs interact with the RBPs or not. In this study… ▽ More RNA-binding proteins (RBPs) play crucial roles in many biological processes, e.g. gene regulation. Computational identification of RBP binding sites on RNAs are urgently needed. In particular, RBPs bind to RNAs by recognizing sequence motifs. Thus, fast locating those motifs on RNA sequences is crucial and time-efficient for determining whether the RNAs interact with the RBPs or not. In this study, we present an attention based convolutional neural network, iDeepA, to predict RNA-protein binding sites from raw RNA sequences. We first encode RNA sequences into one-hot encoding. Next, we design a deep learning model with a convolutional neural network (CNN) and an attention mechanism, which automatically search for important positions, e.g. binding motifs, to learn discriminant high-level features for predicting RBP binding sites. We evaluate iDeepA on publicly gold-standard RBP binding sites derived from CLIP-seq data. The results demonstrate iDeepA achieves comparable performance with other state-of-the-art methods. △ Less

Submitted 6 December, 2017; originally announced December 2017.

Journal ref: NIPS 2017 Computational Biology Workshop

arXiv:1706.04572 [pdf, other]

Deep Learning Methods for Efficient Large Scale Video Labeling

Authors: Miha Skalic, Marcin Pekalski, Xingguo E. Pan

Abstract: We present a solution to "Google Cloud and YouTube-8M Video Understanding Challenge" that ranked 5th place. The proposed model is an ensemble of three model families, two frame level and one video level. The training was performed on augmented dataset, with cross validation. We present a solution to "Google Cloud and YouTube-8M Video Understanding Challenge" that ranked 5th place. The proposed model is an ensemble of three model families, two frame level and one video level. The training was performed on augmented dataset, with cross validation. △ Less

Submitted 14 June, 2017; originally announced June 2017.

Comments: 7 pages, 5 tables, 1 figure

arXiv:1610.06848 [pdf, other]

An Efficient Minibatch Acceptance Test for Metropolis-Hastings

Authors: Daniel Seita, Xinlei Pan, Haoyu Chen, John Canny

Abstract: We present a novel Metropolis-Hastings method for large datasets that uses small expected-size minibatches of data. Previous work on reducing the cost of Metropolis-Hastings tests yield variable data consumed per sample, with only constant factor reductions versus using the full dataset for each sample. Here we present a method that can be tuned to provide arbitrarily small batch sizes, by adjusti… ▽ More We present a novel Metropolis-Hastings method for large datasets that uses small expected-size minibatches of data. Previous work on reducing the cost of Metropolis-Hastings tests yield variable data consumed per sample, with only constant factor reductions versus using the full dataset for each sample. Here we present a method that can be tuned to provide arbitrarily small batch sizes, by adjusting either proposal step size or temperature. Our test uses the noise-tolerant Barker acceptance test with a novel additive correction variable. The resulting test has similar cost to a normal SGD update. Our experiments demonstrate several order-of-magnitude speedups over previous work. △ Less

Submitted 9 July, 2017; v1 submitted 18 October, 2016; originally announced October 2016.

Comments: Final version for UAI 2017

arXiv:1605.09721 [pdf, other]

CYCLADES: Conflict-free Asynchronous Machine Learning

Authors: Xinghao Pan, Maximilian Lam, Stephen Tu, Dimitris Papailiopoulos, Ce Zhang, Michael I. Jordan, Kannan Ramchandran, Chris Re, Benjamin Recht

Abstract: We present CYCLADES, a general framework for parallelizing stochastic optimization algorithms in a shared memory setting. CYCLADES is asynchronous during shared model updates, and requires no memory locking mechanisms, similar to HOGWILD!-type algorithms. Unlike HOGWILD!, CYCLADES introduces no conflicts during the parallel execution, and offers a black-box analysis for provable speedups across a… ▽ More We present CYCLADES, a general framework for parallelizing stochastic optimization algorithms in a shared memory setting. CYCLADES is asynchronous during shared model updates, and requires no memory locking mechanisms, similar to HOGWILD!-type algorithms. Unlike HOGWILD!, CYCLADES introduces no conflicts during the parallel execution, and offers a black-box analysis for provable speedups across a large family of algorithms. Due to its inherent conflict-free nature and cache locality, our multi-core implementation of CYCLADES consistently outperforms HOGWILD!-type algorithms on sufficiently sparse datasets, leading to up to 40% speedup gains compared to the HOGWILD! implementation of SGD, and up to 5x gains over asynchronous implementations of variance reduction algorithms. △ Less

Submitted 31 May, 2016; originally announced May 2016.

arXiv:1507.06970 [pdf, ps, other]

Perturbed Iterate Analysis for Asynchronous Stochastic Optimization

Authors: Horia Mania, Xinghao Pan, Dimitris Papailiopoulos, Benjamin Recht, Kannan Ramchandran, Michael I. Jordan

Abstract: We introduce and analyze stochastic optimization methods where the input to each gradient update is perturbed by bounded noise. We show that this framework forms the basis of a unified approach to analyze asynchronous implementations of stochastic optimization algorithms.In this framework, asynchronous stochastic optimization algorithms can be thought of as serial methods operating on noisy inputs… ▽ More We introduce and analyze stochastic optimization methods where the input to each gradient update is perturbed by bounded noise. We show that this framework forms the basis of a unified approach to analyze asynchronous implementations of stochastic optimization algorithms.In this framework, asynchronous stochastic optimization algorithms can be thought of as serial methods operating on noisy inputs. Using our perturbed iterate framework, we provide new analyses of the Hogwild! algorithm and asynchronous stochastic coordinate descent, that are simpler than earlier analyses, remove many assumptions of previous models, and in some cases yield improved upper bounds on the convergence rates. We proceed to apply our framework to develop and analyze KroMagnon: a novel, parallel, sparse stochastic variance-reduced gradient (SVRG) algorithm. We demonstrate experimentally on a 16-core machine that the sparse and parallel version of SVRG is in some cases more than four orders of magnitude faster than the standard SVRG algorithm. △ Less

Submitted 25 March, 2016; v1 submitted 24 July, 2015; originally announced July 2015.

Comments: 30 pages

MSC Class: 65K10; 65Y05; 68W10; 68W20

arXiv:1507.05086 [pdf, other]

Parallel Correlation Clustering on Big Graphs

Authors: Xinghao Pan, Dimitris Papailiopoulos, Samet Oymak, Benjamin Recht, Kannan Ramchandran, Michael I. Jordan

Abstract: Given a similarity graph between items, correlation clustering (CC) groups similar items together and dissimilar ones apart. One of the most popular CC algorithms is KwikCluster: an algorithm that serially clusters neighborhoods of vertices, and obtains a 3-approximation ratio. Unfortunately, KwikCluster in practice requires a large number of clustering rounds, a potential bottleneck for large gra… ▽ More Given a similarity graph between items, correlation clustering (CC) groups similar items together and dissimilar ones apart. One of the most popular CC algorithms is KwikCluster: an algorithm that serially clusters neighborhoods of vertices, and obtains a 3-approximation ratio. Unfortunately, KwikCluster in practice requires a large number of clustering rounds, a potential bottleneck for large graphs. We present C4 and ClusterWild!, two algorithms for parallel correlation clustering that run in a polylogarithmic number of rounds and achieve nearly linear speedups, provably. C4 uses concurrency control to enforce serializability of a parallel clustering process, and guarantees a 3-approximation ratio. ClusterWild! is a coordination free algorithm that abandons consistency for the benefit of better scaling; this leads to a provably small loss in the 3-approximation ratio. We provide extensive experimental results for both algorithms, where we outperform the state of the art, both in terms of clustering accuracy and running time. We show that our algorithms can cluster billion-edge graphs in under 5 seconds on 32 cores, while achieving a 15x speedup. △ Less

Submitted 20 July, 2015; v1 submitted 17 July, 2015; originally announced July 2015.

arXiv:1310.5426 [pdf, other]

MLI: An API for Distributed Machine Learning

Authors: Evan R. Sparks, Ameet Talwalkar, Virginia Smith, Jey Kottalam, Xinghao Pan, Joseph Gonzalez, Michael J. Franklin, Michael I. Jordan, Tim Kraska

Abstract: MLI is an Application Programming Interface designed to address the challenges of building Machine Learn- ing algorithms in a distributed setting based on data-centric computing. Its primary goal is to simplify the development of high-performance, scalable, distributed algorithms. Our initial results show that, relative to existing systems, this interface can be used to build distributed implement… ▽ More MLI is an Application Programming Interface designed to address the challenges of building Machine Learn- ing algorithms in a distributed setting based on data-centric computing. Its primary goal is to simplify the development of high-performance, scalable, distributed algorithms. Our initial results show that, relative to existing systems, this interface can be used to build distributed implementations of a wide variety of common Machine Learning algorithms with minimal complexity and highly competitive performance and scalability. △ Less

Submitted 25 October, 2013; v1 submitted 21 October, 2013; originally announced October 2013.

arXiv:1308.1624 [pdf]

Poisson-type Multivariate Transfer Function Model Reveals Short-term Effects of Ambient Air Pollutants on Hospital Emergency room Visits for Cerebro-cardiovascular Diseases

Authors: Menghui Li, Dasheng Luo, Chenghua Cao, Xiaochuan Pan, Qixin Wang

Abstract: Laboratory experiments have shown that cardiovascular diseases are positively correlated to the concentration of ambient air pollutants, such as SO2, NO2, PM10, etc. It has also been repeatedly reported in many countries that increased concentration of ambient air pollutants leads to rise in hospital emergency room visitss for these diseases. These studies mainly adopt either regression analysis o… ▽ More Laboratory experiments have shown that cardiovascular diseases are positively correlated to the concentration of ambient air pollutants, such as SO2, NO2, PM10, etc. It has also been repeatedly reported in many countries that increased concentration of ambient air pollutants leads to rise in hospital emergency room visitss for these diseases. These studies mainly adopt either regression analysis or preliminary models in time series analysis, while the multivariable transfer function model, a relatively newly developed model, has multiple advantages over the conventional linear regression on analyzing time series. This study attempts to quantify the association between concentrations ambient air pollutants and hospital emergency room visitss for cerebro-cardiovascular diseases in Beijing using a Poisson-type multivariate transfer function model. The results show that the RR values of SO2, NO2 and PM10 for a 50 g/m3 increase are 1.129, 1.092 and 1.069 respectively. The lags for the three pollutants are estimated to be 2 days, 1 day and 1 day respectively. Compared with the ambient pollutants, daily average temperature and relative humidity do not influence the daily count of hospital emergency room visits significantly. △ Less

Submitted 7 August, 2013; originally announced August 2013.

Showing 1–26 of 26 results for author: Pan, X