
Showing 1–44 of 44 results for author: Hu, H

  1. arXiv:2405.17796  [pdf, ps, other]

    cs.LG stat.ML

    Offline Oracle-Efficient Learning for Contextual MDPs via Layerwise Exploration-Exploitation Tradeoff

    Authors: Jian Qian, Haichen Hu, David Simchi-Levi

    Abstract: Motivated by the recent discovery of a statistical and computational reduction from contextual bandits to offline regression (Simchi-Levi and Xu, 2021), we address the general (stochastic) Contextual Markov Decision Process (CMDP) problem with horizon H (also known as CMDP with H layers). In this paper, we introduce a reduction from CMDPs to offline density estimation under the realizability assumpt…

    Submitted 27 May, 2024; originally announced May 2024.

  2. arXiv:2403.08160  [pdf, other]

    stat.ML cs.LG math.ST

    Asymptotics of Random Feature Regression Beyond the Linear Scaling Regime

    Authors: Hong Hu, Yue M. Lu, Theodor Misiakiewicz

    Abstract: Recent advances in machine learning have been achieved by using overparametrized models trained until near interpolation of the training data. It was shown, e.g., through the double descent phenomenon, that the number of parameters is a poor proxy for the model complexity and generalization capabilities. This leaves open the question of understanding the impact of parametrization on the performanc…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 106 pages, 8 figures

  3. arXiv:2307.00667  [pdf, other]

    stat.ML cs.AI cs.LG

    Morse Neural Networks for Uncertainty Quantification

    Authors: Benoit Dherin, Huiyi Hu, Jie Ren, Michael W. Dusenberry, Balaji Lakshminarayanan

    Abstract: We introduce a new deep generative model useful for uncertainty quantification: the Morse neural network, which generalizes the unnormalized Gaussian densities to have modes of high-dimensional submanifolds instead of just discrete points. Fitting the Morse neural network via a KL-divergence loss yields 1) an (unnormalized) generative density, 2) an OOD detector, 3) a calibration temperature, 4) a…

    Submitted 2 July, 2023; originally announced July 2023.

    Comments: Accepted to ICML workshop on Structured Probabilistic Inference & Generative Modeling 2023

  4. arXiv:2305.18258  [pdf, other]

    cs.LG cs.AI cs.GT math.OC stat.ML

    Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration

    Authors: Zhihan Liu, Miao Lu, Wei Xiong, Han Zhong, Hao Hu, Shenao Zhang, Sirui Zheng, Zhuoran Yang, Zhaoran Wang

    Abstract: In online reinforcement learning (online RL), balancing exploration and exploitation is crucial for finding an optimal policy in a sample-efficient way. To achieve this, existing sample-efficient online RL algorithms typically consist of three components: estimation, planning, and exploration. However, in order to cope with general function approximators, most of them involve impractical algorithm…

    Submitted 25 October, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

  5. arXiv:2303.03254  [pdf, ps, other]

    math.OC cs.LG stat.ML

    An Online Algorithm for Chance Constrained Resource Allocation

    Authors: Yuwei Chen, Zengde Deng, Yinzhi Zhou, Zaiyi Chen, Yujie Chen, Haoyuan Hu

    Abstract: This paper studies the online stochastic resource allocation problem (RAP) with chance constraints. The online RAP is a 0-1 integer linear programming problem where the resource consumption coefficients are revealed column by column along with the corresponding revenue coefficients. When a column is revealed, the corresponding decision variables are determined instantaneously without future inform…

    Submitted 6 March, 2023; originally announced March 2023.

    Comments: 5 pages, 5 figures. Accepted to ICASSP 2023. arXiv admin note: substantial text overlap with arXiv:2203.16818

  6. arXiv:2212.13069  [pdf, other]

    cs.LG cond-mat.dis-nn stat.ML

    Homophily modulates double descent generalization in graph convolution networks

    Authors: Cheng Shi, Liming Pan, Hong Hu, Ivan Dokmanić

    Abstract: Graph neural networks (GNNs) excel in modeling relational data such as biological, social, and transportation networks, but the underpinnings of their success are not well understood. Traditional complexity measures from statistical learning theory fail to account for observed phenomena like the double descent or the impact of relational semantics on generalization error. Motivated by experimental…

    Submitted 23 January, 2024; v1 submitted 26 December, 2022; originally announced December 2022.

  7. arXiv:2210.11520  [pdf, other]

    stat.ME stat.CO

    A Semiparametric Approach to the Detection of Change-points in Volatility Dynamics of Financial Data

    Authors: Huaiyu Hu, Ashis Gangopadhyay

    Abstract: One of the most important features of financial time series data is volatility. There are often structural changes in volatility over time, and an accurate estimation of the volatility of financial time series requires careful identification of change-points. A common approach to modeling the volatility of time series data is the well-known GARCH model. Although the problem of change-point estimat…

    Submitted 20 October, 2022; originally announced October 2022.

  8. arXiv:2208.04508  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Training Overparametrized Neural Networks in Sublinear Time

    Authors: Yichuan Deng, Hang Hu, Zhao Song, Omri Weinstein, Danyang Zhuo

    Abstract: The success of deep learning comes at a tremendous computational and energy cost, and the scalability of training massively overparametrized neural networks is becoming a real barrier to the progress of artificial intelligence (AI). Despite the popularity and low cost-per-iteration of traditional backpropagation via gradient descent, stochastic gradient descent (SGD) has prohibitive convergence rat…

    Submitted 7 February, 2024; v1 submitted 8 August, 2022; originally announced August 2022.

  9. arXiv:2208.01169  [pdf, other]

    q-bio.PE cs.NE stat.ML

    A Modified PINN Approach for Identifiable Compartmental Models in Epidemiology with Applications to COVID-19

    Authors: Haoran Hu, Connor M Kennedy, Panayotis G. Kevrekidis, Hongkun Zhang

    Abstract: A variety of approaches using compartmental models have been used to study the COVID-19 pandemic and the usage of machine learning methods with these models has had particularly notable success. We present here an approach toward analyzing accessible data on COVID-19's U.S. development using a variation of the "Physics Informed Neural Networks" (PINN) which is capable of using the knowledge of the…

    Submitted 17 August, 2022; v1 submitted 1 August, 2022; originally announced August 2022.

    Comments: 22 pages, 8 figures, to be submitted to Viruses

  10. arXiv:2207.07411  [pdf, other]

    cs.LG stat.ML

    Plex: Towards Reliability using Pretrained Large Model Extensions

    Authors: Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, et al. (1 additional author not shown)

    Abstract: A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive per…

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: Code available at https://goo.gle/plex-code

  11. arXiv:2205.14846  [pdf, other]

    cs.LG stat.ML

    Precise Learning Curves and Higher-Order Scaling Limits for Dot Product Kernel Regression

    Authors: Lechao Xiao, Hong Hu, Theodor Misiakiewicz, Yue M. Lu, Jeffrey Pennington

    Abstract: As modern machine learning models continue to advance the computational frontier, it has become increasingly important to develop precise estimates for expected performance improvements under different model and data scaling regimes. Currently, theoretical understanding of the learning curves that characterize how the prediction error depends on the number of samples is restricted to either large-…

    Submitted 12 June, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: 42 pages; 5 + 6 figures

    MSC Class: 68T07

  12. arXiv:2204.04763  [pdf, other]

    cs.LG stat.ML

    Information-theoretic Online Memory Selection for Continual Learning

    Authors: Shengyang Sun, Daniele Calandriello, Huiyi Hu, Ang Li, Michalis Titsias

    Abstract: A challenging problem in task-free continual learning is the online selection of a representative replay memory from data streams. In this work, we investigate the online memory selection problem from an information-theoretic perspective. To gather the most information, we propose the surprise and the learnability criteria to pick informative points and to avoid outliers. We pres…

    Submitted 10 April, 2022; originally announced April 2022.

    Comments: ICLR 2022

  13. arXiv:2109.14419  [pdf, other]

    cs.LG cs.AI stat.ML

    On the Estimation Bias in Double Q-Learning

    Authors: Zhizhou Ren, Guangxiang Zhu, Hao Hu, Beining Han, Jianglun Chen, Chongjie Zhang

    Abstract: Double Q-learning is a classical method for reducing overestimation bias, which is caused by taking maximum estimated values in the Bellman operation. Its variants in the deep Q-learning paradigm have shown great promise in producing reliable value prediction and improving learning performance. However, as shown by prior work, double Q-learning is not fully unbiased and suffers from underestimatio…

    Submitted 14 January, 2022; v1 submitted 29 September, 2021; originally announced September 2021.

    Comments: Thirty-Fifth Conference on Neural Information Processing Systems (NeurIPS 2021)

  14. arXiv:2106.12772  [pdf, other]

    cs.LG stat.ML

    Task-agnostic Continual Learning with Hybrid Probabilistic Models

    Authors: Polina Kirichenko, Mehrdad Farajtabar, Dushyant Rao, Balaji Lakshminarayanan, Nir Levine, Ang Li, Huiyi Hu, Andrew Gordon Wilson, Razvan Pascanu

    Abstract: Learning new tasks continuously without forgetting on a constantly changing data distribution is essential for real-world problems but extremely challenging for modern deep learning. In this work we propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification. We model the distribution of each task and each class with a normalizing flow. The flow is used to lea…

    Submitted 24 June, 2021; originally announced June 2021.

  15. arXiv:2010.00029  [pdf, other]

    cs.LG cond-mat.dis-nn cs.AI cs.CV stat.ML

    RG-Flow: A hierarchical and explainable flow model based on renormalization group and sparse prior

    Authors: Hong-Ye Hu, Dian Wu, Yi-Zhuang You, Bruno Olshausen, Yubei Chen

    Abstract: Flow-based generative models have become an important class of unsupervised learning approaches. In this work, we incorporate the key ideas of renormalization group (RG) and sparse prior distribution to design a hierarchical flow-based generative model, RG-Flow, which can separate information at different scales of images and extract disentangled representations at each scale. We demonstrate our m…

    Submitted 15 August, 2022; v1 submitted 30 September, 2020; originally announced October 2020.

    Comments: 31 pages, 20 figures, 3 tables

    Journal ref: Mach. Learn.: Sci. Technol. 3 035009 (2022)

  16. arXiv:2007.09335  [pdf, other]

    cs.LG cs.CL stat.ML

    Drinking from a Firehose: Continual Learning with Web-scale Natural Language

    Authors: Hexiang Hu, Ozan Sener, Fei Sha, Vladlen Koltun

    Abstract: Continual learning systems will interact with humans, with each other, and with the physical world through time -- and continue to learn and adapt as they do. An important open problem for continual learning is a large-scale benchmark that enables realistic evaluation of algorithms. In this paper, we study a natural setting for continual learning on a massive scale. We introduce the problem of per…

    Submitted 1 November, 2020; v1 submitted 18 July, 2020; originally announced July 2020.

    Comments: Dataset Downloader: https://github.com/firehose-dataset/downloader Source Code: https://github.com/firehose-dataset/congrad

  17. arXiv:2004.13480  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD stat.ML

    L-Vector: Neural Label Embedding for Domain Adaptation

    Authors: Zhong Meng, Hu Hu, Jinyu Li, Changliang Liu, Yan Huang, Yifan Gong, Chin-Hui Lee

    Abstract: We propose a novel neural label embedding (NLE) scheme for the domain adaptation of a deep neural network (DNN) acoustic model with unpaired data samples from source and target domains. With NLE method, we distill the knowledge from a powerful source-domain DNN into a dictionary of label embeddings, or l-vectors, one for each senone class. Each l-vector is a representation of the senone-specific o…

    Submitted 25 April, 2020; originally announced April 2020.

    Comments: 5 pages, 2 figures, ICASSP 2020

    Journal ref: 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain

  18. arXiv:2004.00204  [pdf, other]

    cs.LG cs.AI stat.ML

    Ontology-based Interpretable Machine Learning for Textual Data

    Authors: Phung Lai, NhatHai Phan, Han Hu, Anuja Badeti, David Newman, Dejing Dou

    Abstract: In this paper, we introduce a novel interpreting framework that learns an interpretable model based on an ontology-based sampling technique to explain agnostic prediction models. Different from existing approaches, our algorithm considers contextual correlation among words, described in domain knowledge ontologies, to generate semantic explanations. To narrow down the search space for explanations…

    Submitted 31 March, 2020; originally announced April 2020.

    Comments: Accepted by IJCNN 2020

  19. arXiv:2003.12857  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    NPENAS: Neural Predictor Guided Evolution for Neural Architecture Search

    Authors: Chen Wei, Chuang Niu, Yiping Tang, Yue Wang, Haihong Hu, Jimin Liang

    Abstract: Neural architecture search (NAS) is a promising method for automatically designing neural architectures. NAS adopts a search strategy to explore the predefined search space to find outstanding performance architecture with the minimum searching costs. Bayesian optimization and evolutionary algorithms are two commonly used search strategies, but they suffer from computationally expensive, challenge to…

    Submitted 10 September, 2020; v1 submitted 28 March, 2020; originally announced March 2020.

  20. arXiv:2003.12060  [pdf, other]

    cs.CV cs.LG stat.ML

    Negative Margin Matters: Understanding Margin in Few-shot Classification

    Authors: Bin Liu, Yue Cao, Yutong Lin, Qi Li, Zheng Zhang, Mingsheng Long, Han Hu

    Abstract: This paper introduces a negative margin loss to metric learning based few-shot learning methods. The negative margin loss significantly outperforms regular softmax loss, and achieves state-of-the-art accuracy on three standard few-shot classification benchmarks with few bells and whistles. These results are contrary to the common practice in the metric learning field, that the margin is zero or po…

    Submitted 26 March, 2020; originally announced March 2020.

    Comments: Code is available at https://github.com/bl0/negative-margin.few-shot

  21. arXiv:2002.00097  [pdf, other]

    cs.LG eess.SP stat.ML

    Physics-Guided Deep Neural Networks for Power Flow Analysis

    Authors: Xinyue Hu, Haoji Hu, Saurabh Verma, Zhi-Li Zhang

    Abstract: Solving power flow (PF) equations is the basis of power flow analysis, which is important in determining the best operation of existing systems, performing security analysis, etc. However, PF equations can be out-of-date or even unavailable due to system dynamics and uncertainties, making traditional numerical approaches infeasible. To address these concerns, researchers have proposed data-driven…

    Submitted 5 January, 2021; v1 submitted 31 January, 2020; originally announced February 2020.

    Comments: 9 pages

  22. arXiv:2001.09832  [pdf, other]

    cs.LG stat.ML

    Polygames: Improved Zero Learning

    Authors: Tristan Cazenave, Yen-Chi Chen, Guan-Wei Chen, Shi-Yu Chen, Xian-Dong Chiu, Julien Dehos, Maria Elsa, Qucheng Gong, Hengyuan Hu, Vasil Khalidov, Cheng-Ling Li, Hsin-I Lin, Yu-Jin Lin, Xavier Martinet, Vegard Mella, Jeremy Rapin, Baptiste Roziere, Gabriel Synnaeve, Fabien Teytaud, Olivier Teytaud, Shi-Cheng Ye, Yi-Jun Ye, Shi-Jim Yen, Sergey Zagoruyko

    Abstract: Since DeepMind's AlphaZero, Zero learning quickly became the state-of-the-art method for many board games. It can be improved using a fully convolutional structure (no fully connected layer). Using such an architecture plus global pooling, we can create bots independent of the board size. The training can be made more robust by keeping track of the best checkpoints during the training and by train…

    Submitted 27 January, 2020; originally announced January 2020.

  23. arXiv:1912.02757  [pdf, other]

    stat.ML cs.LG

    Deep Ensembles: A Loss Landscape Perspective

    Authors: Stanislav Fort, Huiyi Hu, Balaji Lakshminarayanan

    Abstract: Deep ensembles have been empirically shown to be a promising approach for improving accuracy, uncertainty and out-of-distribution robustness of deep learning models. While deep ensembles were theoretically motivated by the bootstrap, non-bootstrap ensembles trained with just random initialization also perform well in practice, which suggests that there could be other explanations for why deep ense…

    Submitted 24 June, 2020; v1 submitted 5 December, 2019; originally announced December 2019.

  24. arXiv:1910.13616  [pdf, other]

    cs.LG cs.AI stat.ML

    Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

    Authors: Risto Vuorio, Shao-Hua Sun, Hexiang Hu, Joseph J. Lim

    Abstract: Model-agnostic meta-learners aim to acquire meta-learned parameters from similar tasks to adapt to novel tasks from the same distribution with few gradient updates. With the flexibility in the choice of models, those frameworks demonstrate appealing performance on a variety of domains such as few-shot image classification and reinforcement learning. However, one important limitation of such framew…

    Submitted 29 October, 2019; originally announced October 2019.

  25. arXiv:1909.08081  [pdf, other]

    cs.LG stat.ML

    A Distributed Fair Machine Learning Framework with Private Demographic Data Protection

    Authors: Hui Hu, Yijun Liu, Zhen Wang, Chao Lan

    Abstract: Fair machine learning has become a significant research topic with broad societal impact. However, most fair learning methods require direct access to personal demographic data, which is increasingly restricted to use for protecting user privacy (e.g. by the EU General Data Protection Regulation). In this paper, we propose a distributed fair learning framework for protecting the privacy of demogra…

    Submitted 17 September, 2019; originally announced September 2019.

    Comments: 9 pages, 4 figures, International Conference of Data Mining

  26. arXiv:1907.02242  [pdf, other]

    cs.LG stat.ML

    Fair Kernel Regression via Fair Feature Embedding in Kernel Space

    Authors: Austin Okray, Hui Hu, Chao Lan

    Abstract: In recent years, there have been significant efforts on mitigating unethical demographic biases in machine learning methods. However, very little is done for kernel methods. In this paper, we propose a new fair kernel regression method via fair feature embedding (FKR-F$^2$E) in kernel space. Motivated by prior works on feature selection in kernel space and feature processing for fair machine learn…

    Submitted 20 September, 2019; v1 submitted 4 July, 2019; originally announced July 2019.

    Comments: ICTAI 2019, fair machine learning, kernel regression, fair feature embedding, feature selection for kernel methods, mean discrepancy

  27. arXiv:1906.07920  [pdf, other]

    cs.LG cs.CR stat.ML

    Global Adversarial Attacks for Assessing Deep Learning Robustness

    Authors: Hanbin Hu, Mit Shah, Jianhua Z. Huang, Peng Li

    Abstract: It has been shown that deep neural networks (DNNs) may be vulnerable to adversarial attacks, raising the concern on their robustness particularly for safety-critical applications. Recognizing the local nature and limitations of existing adversarial attacks, we present a new type of global adversarial attacks for assessing global DNN robustness. More specifically, we propose a novel concept of glob…

    Submitted 19 June, 2019; originally announced June 2019.

    Comments: Submitted to NeurIPS 2019

  28. arXiv:1905.13360  [pdf, other]

    cs.LG stat.ML

    Efficient Forward Architecture Search

    Authors: Hanzhang Hu, John Langford, Rich Caruana, Saurajit Mukherjee, Eric Horvitz, Debadeepta Dey

    Abstract: We propose a neural architecture search (NAS) algorithm, Petridish, to iteratively add shortcut connections to existing network layers. The added shortcut connections effectively perform gradient boosting on the augmented layers. The proposed algorithm is motivated by the feature selection algorithm forward stage-wise linear regression, since we consider NAS as a generalization of feature selectio…

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: preprint

  29. arXiv:1904.03276  [pdf, other]

    cs.LG stat.ML

    Synthesized Policies for Transfer and Adaptation across Tasks and Environments

    Authors: Hexiang Hu, Liyu Chen, Boqing Gong, Fei Sha

    Abstract: The ability to transfer in reinforcement learning is key towards building an agent of general artificial intelligence. In this paper, we consider the problem of learning to simultaneously transfer across both environments (ENV) and tasks (TASK), probably more importantly, by learning from only sparse (ENV, TASK) pairs out of all the possible combinations. We propose a novel compositional neural ne…

    Submitted 26 May, 2021; v1 submitted 5 April, 2019; originally announced April 2019.

    Comments: presented at NeurIPS 2018 as a Spotlight

  30. arXiv:1902.05696  [pdf, other]

    cs.LG stat.ML

    Learning to Adaptively Scale Recurrent Neural Networks

    Authors: Hao Hu, Liqiang Wang, Guo-Jun Qi

    Abstract: Recent advancements in recurrent neural network (RNN) research have demonstrated the superiority of utilizing multiscale structures in learning temporal representations of time series. Currently, most multiscale RNNs use fixed scales, which do not comply with the nature of dynamical temporal patterns among sequences. In this paper, we propose Adaptively Scaled Recurrent Neural Networks (ASRNN),…

    Submitted 15 February, 2019; originally announced February 2019.

  31. arXiv:1812.07611  [pdf]

    cs.NE stat.ML

    GP-CNAS: Convolutional Neural Network Architecture Search with Genetic Programming

    Authors: Yiheng Zhu, Yichen Yao, Zili Wu, Yujie Chen, Guozheng Li, Haoyuan Hu, Yinghui Xu

    Abstract: Convolutional neural networks (CNNs) are effective at solving difficult problems like visual recognition, speech recognition and natural language processing. However, performance gain comes at the cost of laborious trial-and-error in designing deeper CNN architectures. In this paper, a genetic programming (GP) framework for convolutional neural network architecture search, abbreviated as GP-CNAS,…

    Submitted 26 November, 2018; originally announced December 2018.

    Comments: 8 pages, 7 figures

  32. arXiv:1812.07172  [pdf, other]

    cs.LG cs.AI stat.ML

    Toward Multimodal Model-Agnostic Meta-Learning

    Authors: Risto Vuorio, Shao-Hua Sun, Hexiang Hu, Joseph J. Lim

    Abstract: Gradient-based meta-learners such as MAML are able to learn a meta-prior from similar tasks to adapt to novel tasks from the same distribution with few gradient updates. One important limitation of such frameworks is that they seek a common initialization shared across the entire task distribution, substantially limiting the diversity of the task distributions that they are able to learn from. In…

    Submitted 18 December, 2018; originally announced December 2018.

  33. arXiv:1811.11124  [pdf, other]

    cs.LG stat.ML

    LEASGD: an Efficient and Privacy-Preserving Decentralized Algorithm for Distributed Learning

    Authors: Hsin-Pai Cheng, Patrick Yu, Haojing Hu, Feng Yan, Shiyu Li, Hai Li, Yiran Chen

    Abstract: Distributed learning systems have enabled training large-scale models over large amounts of data in significantly shorter time. In this paper, we focus on decentralized distributed deep learning systems and aim to achieve differential privacy with good convergence rate and low communication cost. To achieve this goal, we propose a new learning algorithm LEASGD (Leader-Follower Elastic Averaging Sto…

    Submitted 27 November, 2018; originally announced November 2018.

  34. arXiv:1811.07555  [pdf, other]

    cs.LG cs.CV stat.ML

    Three Dimensional Convolutional Neural Network Pruning with Regularization-Based Method

    Authors: Yuxin Zhang, Huan Wang, Yang Luo, Lu Yu, Haoji Hu, Hangguan Shan, Tony Q. S. Quek

    Abstract: Despite enjoying extensive applications in video analysis, three-dimensional convolutional neural networks (3D CNNs) are restricted by their massive computation and storage consumption. To solve this problem, we propose a three-dimensional regularization-based neural network pruning method to assign different regularization parameters to different weight groups based on their importance to the netwo…

    Submitted 19 May, 2019; v1 submitted 19 November, 2018; originally announced November 2018.

    Comments: ICIP 2019

    Journal ref: ICIP 2019

  35. arXiv:1811.04151  [pdf, ps, other]

    cs.LG stat.ML

    Design Rule Violation Hotspot Prediction Based on Neural Network Ensembles

    Authors: Wei Zeng, Azadeh Davoodi, Yu Hen Hu

    Abstract: Design rule check is a critical step in the physical design of integrated circuits to ensure manufacturability. However, it can be done only after a time-consuming detailed routing procedure, which adds drastically to the time of design iterations. With advanced technology nodes, the outcomes of global routing and detailed routing become less correlated, which adds to the difficulty of predicting…

    Submitted 9 November, 2018; originally announced November 2018.

  36. arXiv:1807.09387  [pdf, other]

    cs.LG stat.ML

    Learning from Delayed Outcomes via Proxies with Applications to Recommender Systems

    Authors: Timothy A. Mann, Sven Gowal, András György, Ray Jiang, Huiyi Hu, Balaji Lakshminarayanan, Prav Srinivasan

    Abstract: Predicting delayed outcomes is an important problem in recommender systems (e.g., if customers will finish reading an ebook). We formalize the problem as an adversarial, delayed online learning problem and consider how a proxy for the delayed outcome (e.g., if customers read a third of the book in 24 hours) can help minimize regret, even though the proxy is not available when making a prediction.…

    Submitted 15 October, 2019; v1 submitted 24 July, 2018; originally announced July 2018.

  37. arXiv:1805.08349  [pdf, other]

    cs.LG cond-mat.dis-nn cs.IT stat.ML

    A Solvable High-Dimensional Model of GAN

    Authors: Chuang Wang, Hong Hu, Yue M. Lu

    Abstract: We present a theoretical analysis of the training process for a single-layer GAN fed by high-dimensional input data. The training dynamics of the proposed model at both microscopic and macroscopic scales can be exactly analyzed in the high-dimensional limit. In particular, we prove that the macroscopic quantities measuring the quality of the training process converge to a deterministic process cha…

    Submitted 28 October, 2019; v1 submitted 21 May, 2018; originally announced May 2018.

    Comments: Accepted by 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

  38. arXiv:1804.09461  [pdf, other]

    cs.LG stat.ML

    Structured Pruning for Efficient ConvNets via Incremental Regularization

    Authors: Huan Wang, Qiming Zhang, Yuehai Wang, Yu Lu, Haoji Hu

    Abstract: Parameter pruning is a promising approach for CNN compression and acceleration by eliminating redundant model parameters with tolerable performance degradation. Despite its effectiveness, existing regularization-based parameter pruning methods usually drive weights towards zero with large and constant regularization factors, which neglects the fragility of the expressiveness of CNNs, and thus calls fo…

    Submitted 15 April, 2019; v1 submitted 25 April, 2018; originally announced April 2018.

    Comments: IJCNN 2019

  39. arXiv:1804.06896  [pdf, other]

    cs.LG cs.AI stat.ML

    A Multi-task Selected Learning Approach for Solving 3D Flexible Bin Packing Problem

    Authors: Lu Duan, Haoyuan Hu, Yu Qian, Yu Gong, Xiaodong Zhang, Yinghui Xu, Jiangwen Wei

    Abstract: A 3D flexible bin packing problem (3D-FBPP) arises from the process of warehouse packing in e-commerce. An online customer's order usually contains several items and needs to be packed as a whole before shipping. In particular, 5% of tens of millions of packages are using plastic wrapping as outer packaging every day, which brings pressure on the plastic surface minimization to save traditional lo…

    Submitted 15 February, 2019; v1 submitted 17 April, 2018; originally announced April 2018.

    Comments: 8 pages, 34 figures. arXiv admin note: text overlap with arXiv:1708.05930

  40. arXiv:1710.11176  [pdf, other]

    cs.LG cs.CV stat.ML

    CrescendoNet: A Simple Deep Convolutional Neural Network with Ensemble Behavior

    Authors: Xiang Zhang, Nishant Vishwamitra, Hongxin Hu, Feng Luo

    Abstract: We introduce a new deep convolutional neural network, CrescendoNet, by stacking simple building blocks without residual connections. Each Crescendo block contains independent convolution paths with increased depths. The numbers of convolution layers and parameters are only increased linearly in Crescendo blocks. In experiments, CrescendoNet with only 15 layers outperforms almost all networks witho…

    Submitted 4 January, 2018; v1 submitted 30 October, 2017; originally announced October 2017.

  41. arXiv:1709.06994  [pdf, other]

    cs.LG stat.ML

    Structured Probabilistic Pruning for Convolutional Neural Network Acceleration

    Authors: Huan Wang, Qiming Zhang, Yuehai Wang, Haoji Hu

    Abstract: In this paper, we propose a novel progressive parameter pruning method for Convolutional Neural Network acceleration, named Structured Probabilistic Pruning (SPP), which effectively prunes weights of convolutional layers in a probabilistic manner. Unlike existing deterministic pruning approaches, where unimportant weights are permanently eliminated, SPP introduces a pruning probability for each we…

    Submitted 9 September, 2018; v1 submitted 19 September, 2017; originally announced September 2017.

    Comments: CNN model acceleration, 13 pages, 6 figures, accepted by Proceedings of the British Machine Vision Conference (BMVC), 2018 oral

    Journal ref: Proceedings of the British Machine Vision Conference (BMVC), 2018

  42. arXiv:1709.05750  [pdf, other]

    cs.CR cs.LG stat.ML

    Adaptive Laplace Mechanism: Differential Privacy Preservation in Deep Learning

    Authors: NhatHai Phan, Xintao Wu, Han Hu, Dejing Dou

    Abstract: In this paper, we focus on developing a novel mechanism to preserve differential privacy in deep neural networks, such that: (1) The privacy budget consumption is totally independent of the number of training steps; (2) It has the ability to adaptively inject noise into features based on the contribution of each to the output; and (3) It could be applied in a variety of different deep neural netwo…

    Submitted 22 April, 2018; v1 submitted 17 September, 2017; originally announced September 2017.

    Comments: IEEE ICDM 2017 - regular paper

  43. arXiv:1406.3837  [pdf, other]

    stat.ML cs.LG

    An Incremental Reseeding Strategy for Clustering

    Authors: Xavier Bresson, Huiyi Hu, Thomas Laurent, Arthur Szlam, James von Brecht

    Abstract: In this work we propose a simple and easily parallelizable algorithm for multiway graph partitioning. The algorithm alternates between three basic components: diffusing seed vertices over the graph, thresholding the diffused seeds, and then randomly reseeding the thresholded clusters. We demonstrate experimentally that the proper combination of these ingredients leads to an algorithm that achieves…

    Submitted 15 June, 2014; originally announced June 2014.

  44. arXiv:1307.1864  [pdf]

    stat.ME q-bio.BM

    Combine Umbrella Sampling with Integrated Tempering Method for Efficient and Accurate Calculation of Free Energy Changes of Complex Energy Surface

    Authors: Mingjun Yang, Lijiang Yang, Yiqin Gao, Hao Hu

    Abstract: Umbrella sampling is an efficient method for the calculation of free energy changes of a system along well-defined reaction coordinates. However, when multiple parallel channels along the reaction coordinate or hidden barriers in directions perpendicular to the reaction coordinate exist, it is difficult for conventional umbrella sampling methods to generate sufficient sampling within limited simul…

    Submitted 7 July, 2013; originally announced July 2013.

    Comments: 28 pages and 6 figures