Showing 1–45 of 45 results for author: Peng, Y

  1. arXiv:2405.04026  [pdf, other]

    stat.ML cs.LG

    Federated Control in Markov Decision Processes

    Authors: Hao Jin, Yang Peng, Liangyu Zhang, Zhihua Zhang

    Abstract: We study problems of federated control in Markov Decision Processes. To solve an MDP with large state space, multiple learning agents are introduced to collaboratively learn its optimal policy without communication of locally collected experience. In our settings, these agents have limited capabilities, which means they are restricted within different regions of the overall state space during the…

    Submitted 7 May, 2024; originally announced May 2024.

  2. arXiv:2403.05811  [pdf, ps, other]

    stat.ML cs.LG

    Near Minimax-Optimal Distributional Temporal Difference Algorithms and The Freedman Inequality in Hilbert Spaces

    Authors: Yang Peng, Liangyu Zhang, Zhihua Zhang

    Abstract: Distributional reinforcement learning (DRL) has achieved empirical success in various domains. One of the core tasks in the field of DRL is distributional policy evaluation, which involves estimating the return distribution $η^π$ for a given policy $π$. The distributional temporal difference (TD) algorithm has been accordingly proposed, which is an extension of the temporal difference algorithm in…

    Submitted 14 March, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  3. arXiv:2402.02196  [pdf, other]

    stat.ME cs.LG

    Sample-Efficient Clustering and Conquer Procedures for Parallel Large-Scale Ranking and Selection

    Authors: Zishi Zhang, Yijie Peng

    Abstract: We propose novel "clustering and conquer" procedures for the parallel large-scale ranking and selection (R&S) problem, which leverage correlation information for clustering to break the bottleneck of sample efficiency. In parallel computing environments, correlation-based clustering can achieve an $\mathcal{O}(p)$ sample complexity reduction rate, which is the optimal reduction rate theoretically…

    Submitted 12 February, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  4. arXiv:2402.00907  [pdf, other]

    cs.LG stat.ME

    AlphaRank: An Artificial Intelligence Approach for Ranking and Selection Problems

    Authors: Ruihan Zhou, L. Jeff Hong, Yijie Peng

    Abstract: We introduce AlphaRank, an artificial intelligence approach to address the fixed-budget ranking and selection (R&S) problems. We formulate the sequential sampling decision as a Markov decision process and propose a Monte Carlo simulation-based rollout policy that utilizes classic R&S procedures as base policies for efficiently learning the value function of stochastic dynamic programming. We accel…

    Submitted 31 January, 2024; originally announced February 2024.

  5. arXiv:2401.11646  [pdf, other]

    stat.ML cs.LG math.NA stat.ME

    Nonparametric Density Estimation via Variance-Reduced Sketching

    Authors: Yifan Peng, Yuehaw Khoo, Daren Wang

    Abstract: Nonparametric density models are of great interest in various scientific and engineering disciplines. Classical density kernel methods, while numerically robust and statistically sound in low-dimensional settings, become inadequate even in moderate higher-dimensional settings due to the curse of dimensionality. In this paper, we introduce a new framework called Variance-Reduced Sketching (VRS), sp…

    Submitted 7 July, 2024; v1 submitted 21 January, 2024; originally announced January 2024.

    Comments: 62 pages, 12 figures

  6. arXiv:2312.05590  [pdf, other]

    math.OC stat.ME

    Gradient Tracking for High Dimensional Federated Optimization

    Authors: Jiadong Liang, Yang Peng, Zhihua Zhang

    Abstract: In this paper, we study the (decentralized) distributed optimization problem with high-dimensional sparse structure. Building upon the FedDA algorithm, we propose a (Decentralized) FedDA-GT algorithm, which combines the gradient tracking technique. It is able to eliminate the heterogeneity among different clients' objective functions while ensuring a dimension-free convergence rate. Compa…

    Submitted 9 December, 2023; originally announced December 2023.

  7. arXiv:2309.17262  [pdf, other]

    stat.ML cs.LG

    Estimation and Inference in Distributional Reinforcement Learning

    Authors: Liangyu Zhang, Yang Peng, Jiadong Liang, Wenhao Yang, Zhihua Zhang

    Abstract: In this paper, we study distributional reinforcement learning from the perspective of statistical efficiency. We investigate distributional policy evaluation, aiming to estimate the complete distribution of the random return (denoted $η^π$) attained by a given policy $π$. We use the certainty-equivalence method to construct our estimator $\hat{η}^π$, given a generative model is available. We s…

    Submitted 29 September, 2023; originally announced September 2023.

  8. arXiv:2308.00904  [pdf, other]

    cs.LG cs.AI stat.ME

    VLUCI: Variational Learning of Unobserved Confounders for Counterfactual Inference

    Authors: Yonghe Zhao, Qiang Huang, Siwei Wu, Yun Peng, Huiyan Sun

    Abstract: Causal inference plays a vital role in diverse domains like epidemiology, healthcare, and economics. De-confounding and counterfactual prediction in observational data has emerged as a prominent concern in causal inference research. While existing models tackle observed confounders, the presence of unobserved confounders remains a significant challenge, distorting causal inference and impacting co…

    Submitted 7 September, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

    Comments: 15 pages, 8 figures

  9. arXiv:2306.17704  [pdf, other]

    stat.ME math.ST

    Top-Two Thompson Sampling for Contextual Top-mc Selection Problems

    Authors: Xinbo Shi, Yijie Peng, Gongbo Zhang

    Abstract: We aim to efficiently allocate a fixed simulation budget to identify the top-mc designs for each context among a finite number of contexts. The performance of each design under a context is measured by an identifiable statistical characteristic, possibly with the existence of nuisance parameters. Under a Bayesian framework, we extend the top-two Thompson sampling method designed for selecting the…

    Submitted 30 June, 2023; originally announced June 2023.

    MSC Class: 62F07 (Primary) 62C10; 62L10 (Secondary)

  10. arXiv:2305.04086  [pdf, other]

    stat.ML math.OC

    Efficient Learning for Selecting Top-m Context-Dependent Designs

    Authors: Gongbo Zhang, Sihua Chen, Kuihua Huang, Yijie Peng

    Abstract: We consider a simulation optimization problem for a context-dependent decision-making, which aims to determine the top-m designs for all contexts. Under a Bayesian framework, we formulate the optimal dynamic sampling decision as a stochastic dynamic programming problem, and develop a sequential sampling policy to efficiently learn the performance of each design under each context. The asymptotical…

    Submitted 9 June, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

  11. arXiv:2305.00254  [pdf, other]

    cs.LG stat.ML

    Semi-Infinitely Constrained Markov Decision Processes and Efficient Reinforcement Learning

    Authors: Liangyu Zhang, Yang Peng, Wenhao Yang, Zhihua Zhang

    Abstract: We propose a novel generalization of constrained Markov decision processes (CMDPs) that we call the semi-infinitely constrained Markov decision process (SICMDP). Particularly, we consider a continuum of constraints instead of a finite number of constraints as in the case of ordinary CMDPs. We also devise two reinforcement learning algorithms for SICMDPs that we call SI-CRL and SI-CPO. SI-CR…

    Submitted 29 April, 2023; originally announced May 2023.

    Comments: Shorter version accepted at NeurIPS 2022

  12. arXiv:2304.14618  [pdf, other]

    cs.LG stat.ML

    Recognizable Information Bottleneck

    Authors: Yilin Lyu, Xin Liu, Mingyang Song, Xinyue Wang, Yaxin Peng, Tieyong Zeng, Liping Jing

    Abstract: Information Bottlenecks (IBs) learn representations that generalize to unseen data by information compression. However, existing IBs are practically unable to guarantee generalization in real-world scenarios due to the vacuous generalization bound. The recent PAC-Bayes IB uses information complexity instead of information compression to establish a connection with the mutual information generaliza…

    Submitted 27 April, 2023; originally announced April 2023.

    Comments: 12 pages. To appear in IJCAI 2023

  13. arXiv:2304.05305  [pdf, other]

    math.NA cs.LG stat.ML

    Generative Modeling via Hierarchical Tensor Sketching

    Authors: Yifan Peng, Yian Chen, E. Miles Stoudenmire, Yuehaw Khoo

    Abstract: We propose a hierarchical tensor-network approach for approximating high-dimensional probability density via empirical distribution. This leverages randomized singular value decomposition (SVD) techniques and involves solving linear equations for tensor cores in this tensor network. The complexity of the resulting algorithm scales linearly in the dimension of the high-dimensional density. An analy…

    Submitted 11 April, 2023; originally announced April 2023.

    MSC Class: 15A69; 62Gxx

  14. arXiv:2303.14906  [pdf, ps, other]

    stat.ME

    Model-free screening procedure for ultrahigh-dimensional survival data based on Hilbert-Schmidt independence criterion

    Authors: Xuerui Li, Yanyan Liu, Yankai Peng, Jing Zhang

    Abstract: How to select the active variables which have significant impact on the event of interest is a very important and meaningful problem in the statistical analysis of ultrahigh-dimensional data. Sure independent screening procedure has been demonstrated to be an effective method to reduce the dimensionality of data from a large scale to a relatively moderate scale. For censored survival data, the exi…

    Submitted 26 March, 2023; originally announced March 2023.

  15. arXiv:2204.12043  [pdf, other]

    cs.AI math.OC stat.AP

    An Efficient Dynamic Sampling Policy For Monte Carlo Tree Search

    Authors: Gongbo Zhang, Yijie Peng, Yilong Xu

    Abstract: We consider the popular tree-based search strategy within the framework of reinforcement learning, the Monte Carlo Tree Search (MCTS), in the context of finite-horizon Markov decision process. We propose a dynamic sampling tree policy that efficiently allocates limited computational budget to maximize the probability of correct selection of the best action at the root node of the tree. Experimenta…

    Submitted 25 April, 2022; originally announced April 2022.

  16. arXiv:2204.02634  [pdf, other]

    cs.LG stat.ML

    Federated Reinforcement Learning with Environment Heterogeneity

    Authors: Hao Jin, Yang Peng, Wenhao Yang, Shusen Wang, Zhihua Zhang

    Abstract: We study a Federated Reinforcement Learning (FedRL) problem in which $n$ agents collaboratively learn a single policy without sharing the trajectories they collected during agent-environment interaction. We stress the constraint of environment heterogeneity, which means $n$ environments corresponding to these $n$ agents have different state transitions. To obtain a value function or a policy funct…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: Artificial Intelligence and Statistics 2022

  17. arXiv:2109.01098  [pdf, other]

    stat.ME stat.CO

    A Support Vector Machine Based Cure Rate Model For Interval Censored Data

    Authors: Suvra Pal, Yingwei Peng, Sandip Barui, Pei Wang

    Abstract: The mixture cure rate model is the most commonly used cure rate model in the literature. In the context of mixture cure rate model, the standard approach to model the effect of covariates on the cured or uncured probability is to use a logistic function. This readily implies that the boundary classifying the cured and uncured subjects is linear. In this paper, we propose a new mixture cure rate mo…

    Submitted 22 August, 2022; v1 submitted 2 September, 2021; originally announced September 2021.

  18. arXiv:2108.13637  [pdf, other]

    cs.LG cs.AI q-bio.NC stat.ML

    When are Deep Networks really better than Decision Forests at small sample sizes, and how?

    Authors: Haoyin Xu, Kaleab A. Kinfu, Will LeVine, Sambit Panda, Jayanta Dey, Michael Ainsworth, Yu-Chung Peng, Madi Kusmanov, Florian Engert, Christopher M. White, Joshua T. Vogelstein, Carey E. Priebe

    Abstract: Deep networks and decision forests (such as random forests and gradient boosted trees) are the leading machine learning methods for structured and tabular data, respectively. Many papers have empirically compared large numbers of classifiers on one or two different domains (e.g., on 100 different tabular data settings). However, a careful conceptual and empirical comparison of these two strategies…

    Submitted 2 November, 2021; v1 submitted 31 August, 2021; originally announced August 2021.

  19. arXiv:2012.05591  [pdf, other]

    stat.ME

    Efficient Learning for Clustering and Optimizing Context-Dependent Designs

    Authors: Haidong Li, Henry Lam, Yijie Peng

    Abstract: We consider a simulation optimization problem for a context-dependent decision-making. A Gaussian mixture model is proposed to capture the performance clustering phenomena of context-dependent designs. Under a Bayesian framework, we develop a dynamic sampling policy to efficiently learn both the global information of each cluster and local information of each design for selecting the best designs…

    Submitted 13 December, 2020; v1 submitted 10 December, 2020; originally announced December 2020.

  20. arXiv:2012.05577   

    stat.ME

    Context-dependent Ranking and Selection under a Bayesian Framework

    Authors: Haidong Li, Henry Lam, Zhe Liang, Yijie Peng

    Abstract: We consider a context-dependent ranking and selection problem. The best design is not universal but depends on the contexts. Under a Bayesian framework, we develop a dynamic sampling scheme for context-dependent optimization (DSCO) to efficiently learn and select the best designs in all contexts. The proposed sampling scheme is proved to be consistent. Numerical experiments show that the proposed…

    Submitted 18 December, 2020; v1 submitted 10 December, 2020; originally announced December 2020.

    Comments: The article was published without the co-Author's notice, and it is withdrawn due to his objection

  21. arXiv:2010.05295  [pdf, other]

    stat.ML cs.LG math.NA

    Efficient Long-Range Convolutions for Point Clouds

    Authors: Yifan Peng, Lin Lin, Lexing Ying, Leonardo Zepeda-Núñez

    Abstract: The efficient treatment of long-range interactions for point clouds is a challenging problem in many scientific machine learning applications. To extract global information, one usually needs a large window size, a large number of layers, and/or a large number of channels. This can often significantly increase the computational cost. In this work, we present a novel neural network layer that direc…

    Submitted 11 October, 2020; originally announced October 2020.

  22. arXiv:2008.12646  [pdf, other]

    cs.LG stat.ML

    Graph Learning for Combinatorial Optimization: A Survey of State-of-the-Art

    Authors: Yun Peng, Byron Choi, Jianliang Xu

    Abstract: Graphs have been widely used to represent complex data in many applications. Efficient and effective analysis of graphs is important for graph-based applications. However, most graph analysis tasks are combinatorial optimization (CO) problems, which are NP-hard. Recent studies have focused a lot on the potential of using machine learning (ML) to solve graph-based CO problems. Most recent methods f…

    Submitted 21 April, 2021; v1 submitted 26 August, 2020; originally announced August 2020.

    Comments: 40 pages

  23. arXiv:2007.08165  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Audio Tagging by Cross Filtering Noisy Labels

    Authors: Boqing Zhu, Kele Xu, Qiuqiang Kong, Huaimin Wang, Yuxing Peng

    Abstract: High quality labeled datasets have allowed deep learning to achieve impressive results on many sound analysis tasks. Yet, it is labor-intensive to accurately annotate large amounts of audio data, and the dataset may contain noisy labels in practical settings. Meanwhile, deep neural networks are susceptible to such incorrectly labeled data because of their outstanding memorization ability. In…

    Submitted 16 July, 2020; originally announced July 2020.

    Comments: Accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing

  24. arXiv:2001.01185  [pdf, other]

    cs.LG stat.ML

    CNNTOP: a CNN-based Trajectory Owner Prediction Method

    Authors: Xucheng Luo, Shengyang Li, Yuxiang Peng

    Abstract: Trajectory owner prediction is the basis for many applications such as personalized recommendation and urban planning. Although much effort has been put into this topic, the results achieved are still not good enough. Existing methods mainly employ RNNs to model trajectories semantically due to the inherent sequential attribute of trajectories. However, these approaches are weak at Point of Interest (P…

    Submitted 5 January, 2020; originally announced January 2020.

    Comments: 9 pages, 11 figures

  25. arXiv:1910.12895  [pdf]

    cs.CY stat.AP

    Added Value of Intraoperative Data for Predicting Postoperative Complications: Development and Validation of a MySurgeryRisk Extension

    Authors: Shounak Datta, Tyler J. Loftus, Matthew M. Ruppert, Chris Giordano, Lasith Adhikari, Ying-Chih Peng, Yuanfang Ren, Benjamin Shickel, Zheng Feng, Gloria Lipori, Gilbert R. Upchurch Jr., Xiaolin Li, Parisa Rashidi, Tezcan Ozrazgat-Baslanti, Azra Bihorac

    Abstract: To test the hypothesis that accuracy, discrimination, and precision in predicting postoperative complications improve when using both preoperative and intraoperative data input features versus preoperative data alone. Models that predict postoperative complications often ignore important intraoperative physiological changes. Incorporation of intraoperative physiological data may improve model perf…

    Submitted 8 November, 2019; v1 submitted 28 October, 2019; originally announced October 2019.

    Comments: 46 pages, 8 figures, 7 tables; version 2: corrected typos

  26. arXiv:1910.05862  [pdf, other]

    cs.LG stat.ML

    Constrained Non-Affine Alignment of Embeddings

    Authors: Yuwei Wang, Yan Zheng, Yanqing Peng, Chin-Chia Michael Yeh, Zhongfang Zhuang, Das Mahashweta, Bendre Mangesh, Feifei Li, Wei Zhang, Jeff M. Phillips

    Abstract: Embeddings are one of the fundamental building blocks for data analysis tasks. Embeddings are already essential tools for large language models and image analysis, and their use is being extended to many other research domains. The generation of these distributed representations is often a data- and computation-expensive process; yet the holistic analysis and adjustment of them after they have bee…

    Submitted 19 November, 2021; v1 submitted 13 October, 2019; originally announced October 2019.

  27. arXiv:1909.06137  [pdf, other]

    cs.LG stat.ML

    Defending Against Adversarial Attacks by Suppressing the Largest Eigenvalue of Fisher Information Matrix

    Authors: Chaomin Shen, Yaxin Peng, Guixu Zhang, Jinsong Fan

    Abstract: We propose a scheme for defending against adversarial attacks by suppressing the largest eigenvalue of the Fisher information matrix (FIM). Our starting point is one explanation on the rationale of adversarial examples. Based on the idea of the difference between a benign sample and its adversarial example is measured by the Euclidean norm, while the difference between their classification probabi…

    Submitted 13 September, 2019; originally announced September 2019.

    Comments: 11 pages, 5 figures

  28. arXiv:1909.06040  [pdf, ps, other]

    cs.LG cs.DC stat.ML

    DL2: A Deep Learning-driven Scheduler for Deep Learning Clusters

    Authors: Yanghua Peng, Yixin Bao, Yangrui Chen, Chuan Wu, Chen Meng, Wei Lin

    Abstract: More and more companies have deployed machine learning (ML) clusters, where deep learning (DL) models are trained for providing various AI-driven services. Efficient resource scheduling is essential for maximal utilization of expensive DL clusters. Existing cluster schedulers either are agnostic to ML workload characteristics, or use scheduling heuristics based on operators' understanding of parti…

    Submitted 13 September, 2019; originally announced September 2019.

  29. arXiv:1905.12916  [pdf, other]

    cs.LG stat.ML

    Effective Medical Test Suggestions Using Deep Reinforcement Learning

    Authors: Yang-En Chen, Kai-Fu Tang, Yu-Shao Peng, Edward Y. Chang

    Abstract: Effective medical test suggestions benefit both patients and physicians to conserve time and improve diagnosis accuracy. In this work, we show that an agent can learn to suggest effective medical tests. We formulate the problem as a stage-wise Markov decision process and propose a reinforcement learning method to train the agent. We introduce a new representation of multiple action policy along wi…

    Submitted 31 May, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

  30. arXiv:1903.12584  [pdf, other]

    stat.ML cs.LG

    The False Positive Control Lasso

    Authors: Erik Drysdale, Yingwei Peng, Timothy P. Hanna, Paul Nguyen, Anna Goldenberg

    Abstract: In high dimensional settings where a small number of regressors are expected to be important, the Lasso estimator can be used to obtain a sparse solution vector with the expectation that most of the non-zero coefficients are associated with true signals. While several approaches have been developed to control the inclusion of false predictors with the Lasso, these approaches are limited by relying…

    Submitted 29 March, 2019; originally announced March 2019.

  31. arXiv:1903.10679  [pdf]

    cs.LG stat.ML

    Short-term Load Forecasting at Different Aggregation Levels with Predictability Analysis

    Authors: Yayu Peng, Yishen Wang, Xiao Lu, Haifeng Li, Di Shi, Zhiwei Wang, Jie Li

    Abstract: Short-term load forecasting (STLF) is essential for the reliable and economic operation of power systems. Though many STLF methods were proposed over the past decades, most of them focused on loads at high aggregation levels only. Thus, low-aggregation load forecast still requires further research and development. Compared with the substation or city level loads, individual loads are typically mor…

    Submitted 26 March, 2019; originally announced March 2019.

    Comments: To appear in ISGT ASIA 2019

  32. arXiv:1902.05551  [pdf, other]

    cs.LG stat.ML

    Off-Policy Actor-Critic in an Ensemble: Achieving Maximum General Entropy and Effective Environment Exploration in Deep Reinforcement Learning

    Authors: Gang Chen, Yiming Peng

    Abstract: We propose a new policy iteration theory as an important extension of soft policy iteration and Soft Actor-Critic (SAC), one of the most efficient model free algorithms for deep reinforcement learning. Supported by the new theory, arbitrary entropy measures that generalize Shannon entropy, such as Tsallis entropy and Renyi entropy, can be utilized to properly randomize action selection while fulfi…

    Submitted 13 February, 2019; originally announced February 2019.

  33. arXiv:1902.00358  [pdf, other]

    cs.LG cs.AI stat.ML

    Training Artificial Neural Networks by Generalized Likelihood Ratio Method: Exploring Brain-like Learning to Improve Robustness

    Authors: Li Xiao, Yijie Peng, Jeff Hong, Zewu Ke, Shuhuai Yang

    Abstract: In this work, we propose a generalized likelihood ratio method capable of training artificial neural networks with some biological brain-like mechanisms, e.g., (a) learning by the loss value, (b) learning via neurons with discontinuous activation and loss functions. The traditional back propagation method cannot train artificial neural networks with the aforementioned brain-like learning mecha…

    Submitted 11 July, 2019; v1 submitted 31 January, 2019; originally announced February 2019.

    Comments: 12 pages, 7 figures, 2 tables

  34. arXiv:1901.03466  [pdf, other]

    stat.ME cs.SI

    Efficient Sampling for Selecting Important Nodes in Random Network

    Authors: Haidong Li, Xiaoyun Xu, Yijie Peng, Chun-Hung Chen

    Abstract: We consider the problem of selecting important nodes in a random network, where the nodes connect to each other randomly with certain transition probabilities. The node importance is characterized by the stationary probabilities of the corresponding nodes in a Markov chain defined over the network, as in Google's PageRank. Unlike deterministic network, the transition probabilities in random networ…

    Submitted 10 January, 2019; originally announced January 2019.

  35. arXiv:1811.05475  [pdf]

    cs.IR cs.CL cs.LG stat.ML

    ML-Net: multi-label classification of biomedical texts with deep neural networks

    Authors: Jingcheng Du, Qingyu Chen, Yifan Peng, Yang Xiang, Cui Tao, Zhiyong Lu

    Abstract: In multi-label text classification, each textual document can be assigned with one or more labels. Due to this nature, the multi-label text classification task is often considered to be more challenging compared to the binary or multi-class text classification problems. As an important task with broad applications in biomedicine such as assigning diagnosis codes, a number of different computationa…

    Submitted 15 November, 2018; v1 submitted 13 November, 2018; originally announced November 2018.

  36. arXiv:1810.03806  [pdf, other]

    cs.LG stat.ML

    The Adversarial Attack and Detection under the Fisher Information Metric

    Authors: Chenxiao Zhao, P. Thomas Fletcher, Mixue Yu, Yaxin Peng, Guixu Zhang, Chaomin Shen

    Abstract: Many deep learning models are vulnerable to the adversarial attack, i.e., imperceptible but intentionally-designed perturbations to the input can cause incorrect output of the networks. In this paper, using information geometry, we provide a reasonable explanation for the vulnerability of deep learning models. By considering the data space as a non-linear space with the Fisher information metric i…

    Submitted 8 February, 2019; v1 submitted 9 October, 2018; originally announced October 2018.

    Comments: Accepted as an AAAI-2019 oral paper

  37. arXiv:1809.00403  [pdf, other]

    cs.LG stat.ML

    Effective Exploration for Deep Reinforcement Learning via Bootstrapped Q-Ensembles under Tsallis Entropy Regularization

    Authors: Gang Chen, Yiming Peng, Mengjie Zhang

    Abstract: Recently, deep reinforcement learning (DRL) has achieved outstanding success in solving many difficult and large-scale RL problems. However, the high sample cost required for effective learning often makes DRL unaffordable in resource-limited applications. With the aim of improving sample efficiency and learning performance, we will develop a new DRL algorithm in this paper that seamlessly integrates…

    Submitted 4 September, 2018; v1 submitted 2 September, 2018; originally announced September 2018.

  38. arXiv:1805.10638  [pdf, ps, other]

    cs.LG math.NA stat.ML

    Fast K-Means Clustering with Anderson Acceleration

    Authors: Juyong Zhang, Yuxin Yao, Yue Peng, Hao Yu, Bailin Deng

    Abstract: We propose a novel method to accelerate Lloyd's algorithm for K-Means clustering. Unlike previous acceleration approaches that reduce computational cost per iterations or improve initialization, our approach is focused on reducing the number of iterations required for convergence. This is achieved by treating the assignment step and the update step of Lloyd's algorithm as a fixed-point iteration,…

    Submitted 27 May, 2018; originally announced May 2018.

  39. arXiv:1804.07757  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning More Robust Features with Adversarial Training

    Authors: Shuangtao Li, Yuanke Chen, Yanlin Peng, Lin Bai

    Abstract: In recent years, it has been found that neural networks can be easily fooled by adversarial examples, which is a potential safety hazard in some safety-critical applications. Many researchers have proposed various methods to make neural networks more robust to white-box adversarial attacks, but an effective method has not been found so far. In this short paper, we focus on the robustness of the fe…

    Submitted 20 April, 2018; originally announced April 2018.

  40. arXiv:1804.06461  [pdf, other]

    cs.LG cs.AI stat.ML

    An Adaptive Clipping Approach for Proximal Policy Optimization

    Authors: Gang Chen, Yiming Peng, Mengjie Zhang

    Abstract: Very recently proximal policy optimization (PPO) algorithms have been proposed as first-order optimization methods for effective reinforcement learning. While PPO is inspired by the same learning theory that justifies trust region policy optimization (TRPO), PPO substantially simplifies algorithm design and improves data efficiency by performing multiple epochs of clipped policy optimization…

    Submitted 17 April, 2018; originally announced April 2018.

  41. arXiv:1710.02619  [pdf, ps, other]

    cs.LG stat.ML

    Ranking and Selection as Stochastic Control

    Authors: Yijie Peng, Edwin K. P. Chong, Chun-Hung Chen, Michael C. Fu

    Abstract: Under a Bayesian framework, we formulate the fully sequential sampling and selection decision in statistical ranking and selection as a stochastic control problem, and derive the associated Bellman equation. Using value function approximation, we derive an approximately optimal allocation policy. We show that this policy is not only computationally efficient but also possesses both one-step-ahead…

    Submitted 6 October, 2017; originally announced October 2017.

    Comments: 15 pages, 8 figures, to appear in IEEE Transactions on Automatic Control

    MSC Class: IEEE

  42. arXiv:1704.04333  [pdf, other]

    cs.MM cs.LG stat.ML

    Cross-media Similarity Metric Learning with Unified Deep Networks

    Authors: Jinwei Qi, Xin Huang, Yuxin Peng

    Abstract: As a highlighting research topic in the multimedia area, cross-media retrieval aims to capture the complex correlations among multiple media types. Learning better shared representation and distance metric for multimedia data is important to boost the cross-media retrieval. Motivated by the strong ability of deep neural network in feature representation and comparison functions learning, we propos…

    Submitted 13 April, 2017; originally announced April 2017.

    Comments: 19 pages, submitted to Multimedia Tools and Applications

  43. arXiv:1703.07026  [pdf, other]

    cs.LG cs.CV stat.ML

    Cross-modal Deep Metric Learning with Multi-task Regularization

    Authors: Xin Huang, Yuxin Peng

    Abstract: DNN-based cross-modal retrieval has become a research hotspot, by which users can search results across various modalities like image and text. However, existing methods mainly focus on the pairwise correlation and reconstruction error of labeled data. They ignore the semantically similar and dissimilar constraints between different modalities, and cannot take advantage of unlabeled data. This pap…

    Submitted 5 April, 2017; v1 submitted 20 March, 2017; originally announced March 2017.

    Comments: Revision: Added reference [7]. 6 pages, 1 figure, to appear in the proceedings of the IEEE International Conference on Multimedia and Expo (ICME), Jul 10, 2017 - Jul 14, 2017, Hong Kong, Hong Kong

  44. arXiv:1608.08761  [pdf, other]

    cs.LG stat.ML

    hi-RF: Incremental Learning Random Forest for large-scale multi-class Data Classification

    Authors: Tingting Xie, Yuxing Peng, Changjian Wang

    Abstract: In recent years, dynamically growing data and an incrementally growing number of classes have posed new challenges to large-scale data classification research. Most traditional methods struggle to balance precision and computational burden when the data and its number of classes increase. However, some methods have weak precision, and the others are time-consuming. In this paper, we propose an increme…

    Submitted 31 October, 2016; v1 submitted 31 August, 2016; originally announced August 2016.

    Comments: Accepted by AIIE2016

  45. arXiv:1006.0054  [pdf]

    cs.IT math.NA stat.AP

    Anti-measurement Matrix Uncertainty Sparse Signal Recovery for Compressive Sensing

    Authors: Yipeng Liu, Qun Wan, Fei Wen, Jia Xu, Yingning Peng

    Abstract: Compressive sensing (CS) is a technique for estimating a sparse signal from random measurements and the measurement matrix. Traditional sparse signal recovery methods degrade seriously under measurement matrix uncertainty (MMU). Here the MMU is modeled as a bounded additive error. An anti-uncertainty constraint in the form of a mixed L2 and L1 norm is deduced from the sparse signa…

    Submitted 18 June, 2011; v1 submitted 1 June, 2010; originally announced June 2010.

    Comments: 13 pages, 3 figures; Accepted by International Journal of the Physical Sciences