
Showing 1–45 of 45 results for author: Lu, H

Searching in archive stat.
  1. arXiv:2408.04081  [pdf]

    stat.AP cs.CY

    A Framework for Assessing Cumulative Exposure to Extreme Temperatures During Transit Trip

    Authors: Huiying Fan, Hongyu Lu, Geyu Lyu, Angshuman Guin, Randall Guensler

    Abstract: The combined influence of urban heat islands, climate change, and extreme temperature events is increasingly impacting transit travelers, especially vulnerable populations such as older adults, people with disabilities, and those with chronic diseases. Previous studies have generally attempted to address this issue at either the micro- or macro-level, but each approach presents different limitati…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: 44 pages, 1 table, 8 figures

  2. arXiv:2407.12996  [pdf, other]

    stat.ML cs.LG

    Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance

    Authors: Haiquan Lu, Xiaotian Liu, Yefan Zhou, Qunli Li, Kurt Keutzer, Michael W. Mahoney, Yujun Yan, Huanrui Yang, Yaoqing Yang

    Abstract: Recent studies on deep ensembles have identified the sharpness of the local minima of individual learners and the diversity of the ensemble members as key factors in improving test-time performance. Building on this, our study investigates the interplay between sharpness and diversity within deep ensembles, illustrating their crucial role in robust generalization to both in-distribution (ID) and o…

    Submitted 17 July, 2024; originally announced July 2024.

  3. arXiv:2401.08159  [pdf, other]

    stat.ME

    Reluctant Interaction Modeling in Generalized Linear Models

    Authors: Hai Lu, Guo Yu

    Abstract: While including pairwise interactions in a regression model can better approximate the response surface, fitting such an interaction model is a well-known difficult problem. In particular, analyzing contemporary high-dimensional datasets often leads to extremely large-scale interaction modeling problems, where the challenge is posed to identify important interactions among millions or even billions of…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 41 pages

  4. arXiv:2310.07999  [pdf, other]

    cs.LG stat.ML

    LEMON: Lossless model expansion

    Authors: Yite Wang, Jiahao Su, Hanlin Lu, Cong Xie, Tianyi Liu, Jianbo Yuan, Haibin Lin, Ruoyu Sun, Hongxia Yang

    Abstract: Scaling of deep neural networks, especially Transformers, is pivotal for their surging performance and has further led to the emergence of sophisticated reasoning capabilities in foundation models. Such scaling generally requires training large models from scratch with random initialization, failing to leverage the knowledge acquired by their smaller counterparts, which are already resource-intens…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: Preprint

  5. arXiv:2304.04692  [pdf, other]

    stat.ME stat.AP stat.ML

    Scalable Randomized Kernel Methods for Multiview Data Integration and Prediction

    Authors: Sandra E. Safo, Han Lu

    Abstract: We develop scalable randomized kernel methods for jointly associating data from multiple sources and simultaneously predicting an outcome or classifying a unit into one of two or more classes. The proposed methods model nonlinear relationships in multiview data together with predicting a clinical outcome and are capable of identifying variables or groups of variables that best contribute to the re…

    Submitted 10 April, 2023; originally announced April 2023.

    Comments: 24 pages, 5 figures, 4 tables

  6. arXiv:2303.05399  [pdf, ps, other]

    stat.ME cs.AI cs.CY

    Practical Statistical Considerations for the Clinical Validation of AI/ML-enabled Medical Diagnostic Devices

    Authors: Feiming Chen, Hong Laura Lu, Arianna Simonetti

    Abstract: Artificial Intelligence (AI) and Machine-Learning (ML) models have been increasingly used in medical products, such as medical device software. General considerations on the statistical aspects for the evaluation of AI/ML-enabled medical diagnostic devices are discussed in this paper. We also provide relevant academic references and note good practices in addressing various statistical challenges…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: 20 pages, 1 table

  7. arXiv:2302.07930  [pdf, other]

    cs.LG stat.ME stat.ML

    Interpretable Deep Learning Methods for Multiview Learning

    Authors: Hengkang Wang, Han Lu, Ju Sun, Sandra E Safo

    Abstract: Technological advances have enabled the generation of unique and complementary types of data or views (e.g. genomics, proteomics, metabolomics) and opened up a new era in multiview learning research with the potential to lead to new biomedical discoveries. We propose iDeepViewLearn (Interpretable Deep Learning Method for Multiview Learning) for learning nonlinear relationships in data from multipl…

    Submitted 15 February, 2024; v1 submitted 15 February, 2023; originally announced February 2023.

    Comments: Published in BMC Bioinformatics (https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-024-05679-9)

    Journal ref: BMC Bioinformatics 25, 69 (2024)

  8. arXiv:2211.16509  [pdf, other]

    q-bio.GN cs.AI cs.LG q-bio.BM stat.ML

    Multimodal Learning for Multi-Omics: A Survey

    Authors: Sina Tabakhi, Mohammod Naimul Islam Suvon, Pegah Ahadian, Haiping Lu

    Abstract: With advanced imaging, sequencing, and profiling technologies, multiple omics data become increasingly available and hold promises for many healthcare applications such as cancer diagnosis and treatment. Multimodal learning for integrative multi-omics analysis can help researchers and practitioners gain deep insights into human diseases and improve clinical decisions. However, several challenges a…

    Submitted 19 December, 2022; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: 52 pages, 3 figures; Revised matrix factorization fusion section

  9. arXiv:2112.02180  [pdf, other]

    stat.CO stat.ME

    Generalized Transitional Markov Chain Monte Carlo Sampling Technique for Bayesian Inversion

    Authors: Han Lu, Mohammad Khalil, Thomas Catanach, Jiefu Chen, Xuqing Wu, Xin Fu, Cosmin Safta, Yueqin Huang

    Abstract: In the context of Bayesian inversion for scientific and engineering modeling, Markov chain Monte Carlo sampling strategies are the benchmark due to their flexibility and robustness in dealing with arbitrary posterior probability density functions (PDFs). However, these algorithms have been shown to be inefficient when sampling from posterior distributions that are high-dimensional or exhibit multi-moda…

    Submitted 3 December, 2021; originally announced December 2021.

  10. arXiv:2106.09756  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    PyKale: Knowledge-Aware Machine Learning from Multiple Sources in Python

    Authors: Haiping Lu, Xianyuan Liu, Robert Turner, Peizhen Bai, Raivo E Koot, Shuo Zhou, Mustafa Chasmai, Lawrence Schobs

    Abstract: Machine learning is a general-purpose technology holding promises for many interdisciplinary research problems. However, significant barriers exist in crossing disciplinary boundaries when most machine learning tools are developed in different areas separately. We present PyKale - a Python library for knowledge-aware machine learning on graphs, images, texts, and videos to enable and accelerate in…

    Submitted 17 June, 2021; originally announced June 2021.

    Comments: This library is available at https://github.com/pykale/pykale

  11. arXiv:2103.11539  [pdf, other]

    stat.ME stat.AP

    Interpretable, predictive spatio-temporal models via enhanced Pairwise Directions Estimation

    Authors: Heng-Hui Lue, ShengLi Tzeng

    Abstract: This article concerns the predictive modeling for spatio-temporal data as well as model interpretation using data information in space and time. We develop a novel approach based on supervised dimension reduction for such data in order to capture nonlinear mean structures without requiring a prespecified parametric model. In addition to prediction as a common interest, this approach emphasizes the…

    Submitted 6 November, 2021; v1 submitted 21 March, 2021; originally announced March 2021.

    Comments: 18 pages, 4 figures

  12. arXiv:2102.03607  [pdf, other]

    stat.ML cs.LG math.ST

    Bootstrapping Fitted Q-Evaluation for Off-Policy Inference

    Authors: Botao Hao, Xiang Ji, Yaqi Duan, Hao Lu, Csaba Szepesvári, Mengdi Wang

    Abstract: Bootstrapping provides a flexible and effective approach for assessing the quality of batch reinforcement learning, yet its theoretical property is less understood. In this paper, we study the use of bootstrapping in off-policy evaluation (OPE), and in particular, we focus on the fitted Q-evaluation (FQE) that is known to be minimax-optimal in the tabular and linear-model cases. We propose a boots…

    Submitted 22 May, 2022; v1 submitted 6 February, 2021; originally announced February 2021.

    Comments: Accepted at ICML 2021

  13. arXiv:2009.08685  [pdf, other]

    cs.LG cs.AR stat.ML

    GrateTile: Efficient Sparse Tensor Tiling for CNN Processing

    Authors: Yu-Sheng Lin, Hung Chang Lu, Yang-Bin Tsao, Yi-Min Chih, Wei-Chao Chen, Shao-Yi Chien

    Abstract: We propose GrateTile, an efficient, hardware-friendly data storage scheme for sparse CNN feature maps (activations). It divides data into uneven-sized subtensors and, with small indexing overhead, stores them in a compressed yet randomly accessible format. This design enables modern CNN accelerators to fetch and decompress sub-tensors on-the-fly in a tiled processing manner. GrateTile is suitable…

    Submitted 18 September, 2020; originally announced September 2020.

    Comments: To be published at IEEE Workshop on Signal Processing Systems (SiPS 2020)

  14. arXiv:2007.12375  [pdf, other]

    cs.LG cs.AI stat.ML

    Impact of Medical Data Imprecision on Learning Results

    Authors: Mei Wang, Jianwen Su, Haiqin Lu

    Abstract: Test data measured by medical instruments often carry imprecise ranges that include the true values. The latter are not obtainable in virtually all cases. Most learning algorithms, however, carry out arithmetical calculations that are subject to uncertain influence in both the learning process to obtain models and applications of the learned models in, e.g. prediction. In this paper, we initiate a…

    Submitted 24 July, 2020; originally announced July 2020.

    Comments: 2020 KDD Workshop on Applied Data Science for Healthcare

  15. arXiv:2007.02977  [pdf, other]

    cs.LG stat.ML

    Sharing Models or Coresets: A Study based on Membership Inference Attack

    Authors: Hanlin Lu, Changchang Liu, Ting He, Shiqiang Wang, Kevin S. Chan

    Abstract: Distributed machine learning generally aims at training a global model based on distributed data without collecting all the data to a centralized location, where two different approaches have been proposed: collecting and aggregating local models (federated learning) and collecting and training over representative data summaries (coreset). While each approach preserves data privacy to some extent…

    Submitted 6 July, 2020; originally announced July 2020.

  16. arXiv:2006.09815  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    A Deep Neural Network for Audio Classification with a Classifier Attention Mechanism

    Authors: Haoye Lu, Haolong Zhang, Amit Nayak

    Abstract: Audio classification is considered as a challenging problem in pattern recognition. Recently, many algorithms have been proposed using deep neural networks. In this paper, we introduce a new attention-based neural network architecture called Classifier-Attention-Based Convolutional Neural Network (CAB-CNN). The algorithm uses a newly designed architecture consisting of a list of simple classifiers…

    Submitted 14 June, 2020; originally announced June 2020.

  17. arXiv:2006.08667  [pdf, other]

    math.OC cs.LG stat.ML

    The Landscape of the Proximal Point Method for Nonconvex-Nonconcave Minimax Optimization

    Authors: Benjamin Grimmer, Haihao Lu, Pratik Worah, Vahab Mirrokni

    Abstract: Minimax optimization has become a central tool in machine learning with applications in robust optimization, reinforcement learning, GANs, etc. These applications are often nonconvex-nonconcave, but the existing theory is unable to identify and deal with the fundamental difficulties this poses. In this paper, we study the classic proximal point method (PPM) applied to nonconvex-nonconcave minimax…

    Submitted 1 April, 2021; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: Notably updated version that connects our theory with that of Attouch and Wets from the 80s and notably expands on our first posting to apply to generic minimax problems (rather than requiring bilinear interaction)

    MSC Class: 65K05; 65K10; 90C26; 90C15; 90C30

  18. arXiv:2006.00038  [pdf, other]

    cs.LG stat.ML

    Quasi-orthonormal Encoding for Machine Learning Applications

    Authors: Haw-minn Lu

    Abstract: Most machine learning models, especially artificial neural networks, require numerical, not categorical data. We briefly describe the advantages and disadvantages of common encoding schemes. For example, one-hot encoding is commonly used for attributes with a few unrelated categories and word embeddings for attributes with many related categories (e.g., words). Neither is suitable for encoding att…

    Submitted 29 May, 2020; originally announced June 2020.

    Comments: Accepted and submitted to 19th Python in Science Conference (SciPy 2020)

  19. arXiv:2004.03104  [pdf, other]

    cs.LG cs.CV eess.IV stat.ML

    Generalized Label Enhancement with Sample Correlations

    Authors: Qinghai Zheng, Jihua Zhu, Haoyu Tang, Xinyuan Liu, Zhongyu Li, Huimin Lu

    Abstract: Recently, label distribution learning (LDL) has drawn much attention in machine learning, where an LDL model is learned from labeled instances. Different from single-label and multi-label annotations, label distributions describe the instance by multiple labels with different intensities and accommodate to more general scenes. Since most existing machine learning datasets merely provide logical label…

    Submitted 11 April, 2021; v1 submitted 6 April, 2020; originally announced April 2020.

  20. arXiv:2003.11723  [pdf, other]

    cs.LG stat.ML

    Learning transferable and discriminative features for unsupervised domain adaptation

    Authors: Yuntao Du, Ruiting Zhang, Xiaowen Zhang, Yirong Yao, Hengyang Lu, Chongjun Wang

    Abstract: Although achieving remarkable progress, it is very difficult to induce a supervised classifier without any labeled data. Unsupervised domain adaptation is able to overcome this challenge by transferring knowledge from a labeled source domain to an unlabeled target domain. Transferability and discriminability are two key criteria for characterizing the superiority of feature representations to enab…

    Submitted 25 June, 2021; v1 submitted 25 March, 2020; originally announced March 2020.

    Comments: Accepted by IDA journal

  21. arXiv:2002.08338  [pdf, ps, other]

    cs.LG stat.ME stat.ML

    Multiple Imputation with Denoising Autoencoder using Metamorphic Truth and Imputation Feedback

    Authors: Haw-minn Lu, Giancarlo Perrone, José Unpingco

    Abstract: Although data may be abundant, complete data is less so, due to missing columns or rows. This missingness undermines the performance of downstream data products that either omit incomplete cases or create derived completed data for subsequent processing. Appropriately managing missing data is required in order to fully exploit and correctly use data. We propose a Multiple Imputation model using De…

    Submitted 24 June, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

    Comments: Machine Learning and Data Mining in Pattern Recognition, 16th International Conference on Machine Learning and Data Mining, MLDM 2020, Amsterdam, The Netherlands, July 20-21, 2020, Proceedings, pp. 197-208

  22. arXiv:2001.10516  [pdf, other]

    cs.LG stat.ML

    Tri-graph Information Propagation for Polypharmacy Side Effect Prediction

    Authors: Hao Xu, Shengqi Sang, Haiping Lu

    Abstract: The use of drug combinations often leads to polypharmacy side effects (POSE). A recent method formulates POSE prediction as a link prediction problem on a graph of drugs and proteins, and solves it with Graph Convolutional Networks (GCNs). However, due to the complex relationships in POSE, this method has high computational cost and memory demand. This paper proposes a flexible Tri-graph Informati…

    Submitted 28 January, 2020; originally announced January 2020.

    Comments: Presented at NeurIPS 2019 Graph Representation Learning Workshop

  23. arXiv:1911.11185  [pdf, other]

    cs.LG cs.AI stat.ML

    Theory-based Causal Transfer: Integrating Instance-level Induction and Abstract-level Structure Learning

    Authors: Mark Edmonds, Xiaojian Ma, Siyuan Qi, Yixin Zhu, Hongjing Lu, Song-Chun Zhu

    Abstract: Learning transferable knowledge across similar but different settings is a fundamental component of generalized intelligence. In this paper, we approach the transfer learning challenge from a causal theory perspective. Our agent is endowed with two basic yet general theories for transfer learning: (i) a task shares a common abstract structure that is invariant across domains, and (ii) the behavior…

    Submitted 25 November, 2019; originally announced November 2019.

    Comments: Accepted to AAAI 2020 as an oral

  24. arXiv:1909.03835  [pdf, other]

    cs.LG stat.ML

    Data Sanity Check for Deep Learning Systems via Learnt Assertions

    Authors: Haochuan Lu, Huanlin Xu, Nana Liu, Yangfan Zhou, Xin Wang

    Abstract: Reliability is a critical consideration to DL-based systems. But the statistical nature of DL makes it quite vulnerable to invalid inputs, i.e., those cases that are not considered in the training phase of a DL model. This paper proposes to perform data sanity check to identify invalid inputs, so as to enhance the reliability of DL-based systems. We design and implement a tool to detect behavior d…

    Submitted 28 September, 2019; v1 submitted 6 September, 2019; originally announced September 2019.

  25. arXiv:1907.04371  [pdf, other]

    stat.ML cs.LG math.OC

    Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization

    Authors: Kenji Kawaguchi, Haihao Lu

    Abstract: We propose a new stochastic optimization framework for empirical risk minimization problems such as those that arise in machine learning. The traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), utilize an unbiased gradient estimator of the empirical average loss. In contrast, we develop a computationally efficient method to construct a gradient estimator that is purpose…

    Submitted 1 February, 2020; v1 submitted 9 July, 2019; originally announced July 2019.

    Comments: Accepted in AISTATS 2020. Code available at: https://github.com/k9k2/qSGD

  26. arXiv:1906.01407  [pdf, other]

    cs.LG cs.AI stat.ML

    RL4health: Crowdsourcing Reinforcement Learning for Knee Replacement Pathway Optimization

    Authors: Hao Lu, Mengdi Wang

    Abstract: Joint replacement is the most common inpatient surgical treatment in the US. We investigate the clinical pathway optimization for knee replacement, which is a sequential decision process from onset to recovery. Based on episodic claims from previous cases, we view the pathway optimization as an intelligence crowdsourcing problem and learn the optimal decision policy from data by imitating the best…

    Submitted 24 May, 2019; originally announced June 2019.

  27. arXiv:1905.05884  [pdf, other]

    stat.ME stat.CO stat.ML

    Approximate Bayesian computation via the energy statistic

    Authors: Hien D. Nguyen, Julyan Arbel, Hongliang Lü, Florence Forbes

    Abstract: Approximate Bayesian computation (ABC) has become an essential part of the Bayesian toolbox for addressing problems in which the likelihood is prohibitively expensive or entirely unknown, making it intractable. ABC defines a pseudo-posterior by comparing observed data with simulated data, traditionally based on some summary statistics, the elicitation of which is regarded as a key difficulty. Rece…

    Submitted 30 June, 2020; v1 submitted 14 May, 2019; originally announced May 2019.

    Comments: 25 pages, 6 figures, 5 tables

    Journal ref: IEEE Access (2020)

  28. arXiv:1904.05961  [pdf, other]

    cs.LG cs.DS stat.ML

    Robust Coreset Construction for Distributed Machine Learning

    Authors: Hanlin Lu, Ming-Ju Li, Ting He, Shiqiang Wang, Vijaykrishnan Narayanan, Kevin S Chan

    Abstract: Coreset, which is a summary of the original dataset in the form of a small weighted set in the same sample space, provides a promising approach to enable machine learning over distributed data. Although viewed as a proxy of the original dataset, each coreset is only designed to approximate the cost function of a specific machine learning problem, and thus different coresets are often required to s…

    Submitted 22 June, 2020; v1 submitted 11 April, 2019; originally announced April 2019.

  29. arXiv:1903.11020  [pdf, other]

    cs.LG cs.CV stat.ML

    Domain Independent SVM for Transfer Learning in Brain Decoding

    Authors: Shuo Zhou, Wenwen Li, Christopher R. Cox, Haiping Lu

    Abstract: Brain imaging data are important in brain sciences yet expensive to obtain, with big volume (i.e., large p) but small sample size (i.e., small n). To tackle this problem, transfer learning is a promising direction that leverages source data to improve performance on related, target data. Most transfer learning methods focus on minimizing data distribution mismatch. However, a big challenge in brai…

    Submitted 26 March, 2019; originally announced March 2019.

  30. arXiv:1903.08708  [pdf, other]

    cs.LG stat.ML

    Accelerating Gradient Boosting Machine

    Authors: Haihao Lu, Sai Praneeth Karimireddy, Natalia Ponomareva, Vahab Mirrokni

    Abstract: Gradient Boosting Machine (GBM) is an extremely powerful supervised learning algorithm that is widely used in practice. GBM routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup. In this work, we propose Accelerated Gradient Boosting Machine (AGBM) by incorporating Nesterov's acceleration techniques into the design of GBM. The difficulty in accele…

    Submitted 27 August, 2020; v1 submitted 20 March, 2019; originally announced March 2019.

  31. arXiv:1902.07903  [pdf, other]

    cs.NI cs.LG stat.ML

    Learning Deterministic Policy with Target for Power Control in Wireless Networks

    Authors: Yujiao Lu, Hancheng Lu, Liangliang Cao, Feng Wu, Daren Zhu

    Abstract: Inter-Cell Interference Coordination (ICIC) is a promising way to improve energy efficiency in wireless networks, especially where small base stations are densely deployed. However, traditional optimization based ICIC schemes suffer from severe performance degradation with complex interference pattern. To address this issue, we propose a Deep Reinforcement Learning with Deterministic Policy and Ta…

    Submitted 21 February, 2019; originally announced February 2019.

    Comments: 7 pages, 7 figures, GlobeCom2018

  32. arXiv:1812.10140  [pdf, other]

    cs.LG stat.ML

    Mixed-Order Spectral Clustering for Networks

    Authors: Yan Ge, Haiping Lu, Pan Peng

    Abstract: Clustering is fundamental for gaining insights from complex networks, and spectral clustering (SC) is a popular approach. Conventional SC focuses on second-order structures (e.g., edges connecting two nodes) without direct consideration of higher-order structures (e.g., triangles and cliques). This has motivated SC extensions that directly consider higher-order structures. However, both approaches…

    Submitted 25 December, 2018; originally announced December 2018.

    Comments: 12 pages

  33. arXiv:1812.00086  [pdf, other]

    cs.LG stat.ML

    Graph Node-Feature Convolution for Representation Learning

    Authors: Li Zhang, Heda Song, Nikolaos Aletras, Haiping Lu

    Abstract: Graph convolutional network (GCN) is an emerging neural network approach. It learns new representation of a node by aggregating feature vectors of all neighbors in the aggregation process without considering whether the neighbors or features are useful or not. Recent methods have improved solutions by sampling a fixed size set of neighbors, or assigning different weights to different neighbors in…

    Submitted 31 March, 2022; v1 submitted 30 November, 2018; originally announced December 2018.

  34. arXiv:1811.11017  [pdf, other]

    cs.CL cs.IR cs.LG stat.ML

    Latent Dirichlet Allocation with Residual Convolutional Neural Network Applied in Evaluating Credibility of Chinese Listed Companies

    Authors: Mohan Zhang, Zhichao Luo, Hai Lu

    Abstract: This project demonstrated a methodology for estimating corporate credibility with a Natural Language Processing approach. As corporate transparency impacts both the credibility and possible future earnings of the firm, it is an important factor to be considered by banks and investors on risk assessments of listed firms. This approach of estimating corporate credibility can bypass human bias and inc…

    Submitted 24 November, 2018; originally announced November 2018.

  35. arXiv:1810.10158  [pdf, other]

    cs.LG math.OC stat.ML

    Randomized Gradient Boosting Machine

    Authors: Haihao Lu, Rahul Mazumder

    Abstract: Gradient Boosting Machine (GBM) introduced by Friedman is a powerful supervised learning algorithm that is very widely used in practice; it routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup. In spite of the usefulness of GBM in practice, our current theoretical understanding of this method is rather limited. In this work, we propose Randomize…

    Submitted 15 September, 2020; v1 submitted 23 October, 2018; originally announced October 2018.

  36. arXiv:1810.09177  [pdf, other]

    cs.LG cs.IR stat.ML

    Compositional Coding Capsule Network with K-Means Routing for Text Classification

    Authors: Hao Ren, Hong Lu

    Abstract: Text classification is a challenging problem which aims to identify the category of texts. In the process of training, word embeddings occupy a large part of parameters. Under the limitation of limited computing resources, it indirectly limits the ability of subsequent network designs. In order to reduce the number of parameters, the compositional coding mechanism has been proposed recently. Based…

    Submitted 2 June, 2022; v1 submitted 22 October, 2018; originally announced October 2018.

    Comments: The paper is accepted by Pattern Recognition Letters; please refer to https://www.sciencedirect.com/science/article/pii/S016786552200188X for an updated version

  37. arXiv:1810.02716  [pdf, other]

    cs.LG stat.ML

    Approximate Leave-One-Out for High-Dimensional Non-Differentiable Learning Problems

    Authors: Shuaiwen Wang, Wenda Zhou, Arian Maleki, Haihao Lu, Vahab Mirrokni

    Abstract: Consider the following class of learning schemes: $$\hat{\boldsymbol{\beta}} := \underset{\boldsymbol{\beta} \in \mathcal{C}}{\arg\min} \; \sum_{j=1}^n \ell(\boldsymbol{x}_j^\top \boldsymbol{\beta}; y_j) + \lambda R(\boldsymbol{\beta}), \qquad (1)$$ where $\boldsymbol{x}_i \in \mathbb{R}^p$ and $y_i \in \mathbb{R}$ denote the $i^{\rm th}$ feature and response va…

    Submitted 4 October, 2018; originally announced October 2018.

    Comments: 63 pages, 7 figures. arXiv admin note: substantial text overlap with arXiv:1807.02694

  38. arXiv:1807.02694  [pdf, other]

    stat.ML cs.LG

    Approximate Leave-One-Out for Fast Parameter Tuning in High Dimensions

    Authors: Shuaiwen Wang, Wenda Zhou, Haihao Lu, Arian Maleki, Vahab Mirrokni

    Abstract: Consider the following class of learning schemes: $$\hat{\boldsymbol{\beta}} := \arg\min_{\boldsymbol{\beta}} \; \sum_{j=1}^n \ell(\boldsymbol{x}_j^\top \boldsymbol{\beta}; y_j) + \lambda R(\boldsymbol{\beta}), \qquad (1)$$ where $\boldsymbol{x}_i \in \mathbb{R}^p$ and $y_i \in \mathbb{R}$ denote the $i^{\text{th}}$ feature and response variable respectively. Let $\ell$ and $R$ be the loss function and regularizer,…

    Submitted 7 July, 2018; originally announced July 2018.

    Comments: The paper is published at ICML 2018

  39. arXiv:1712.00573  [pdf, other]

    cs.LG stat.ML

    Supervised Hashing based on Energy Minimization

    Authors: Zihao Hu, Xiyi Luo, Hongtao Lu, Yong Yu

    Abstract: Recently, supervised hashing methods have attracted much attention since they can optimize retrieval speed and storage cost while preserving semantic information. Because hashing codes learning is NP-hard, many methods resort to some form of relaxation technique. But the performance of these methods can easily deteriorate due to the relaxation. Luckily, many supervised hashing formulations can be…

    Submitted 2 December, 2017; originally announced December 2017.

  40. arXiv:1702.08580  [pdf, ps, other]

    cs.LG cs.NE math.OC stat.ML

    Depth Creates No Bad Local Minima

    Authors: Haihao Lu, Kenji Kawaguchi

    Abstract: In deep learning, \textit{depth}, as well as \textit{nonlinearity}, create non-convex loss surfaces. Then, does depth alone create bad local minima? In this paper, we prove that without nonlinearity, depth alone does not create bad local minima, although it induces non-convex loss surface. Using this insight, we greatly simplify a recently proposed proof to show that all of the local minima of fee…

    Submitted 23 May, 2017; v1 submitted 27 February, 2017; originally announced February 2017.

  41. arXiv:1604.02100  [pdf, other]

    stat.ML cs.IT math.NA math.SP physics.med-ph

    Hankel Matrix Nuclear Norm Regularized Tensor Completion for $N$-dimensional Exponential Signals

    Authors: Jiaxi Ying, Hengfa Lu, Qingtao Wei, Jian-Feng Cai, Di Guo, Jihui Wu, Zhong Chen, Xiaobo Qu

    Abstract: Signals are generally modeled as a superposition of exponential functions in spectroscopy of chemistry, biology and medical imaging. For fast data acquisition or other inevitable reasons, however, only a small amount of samples may be acquired and thus how to recover the full signal becomes an active research topic. But existing approaches cannot efficiently recover $N$-dimensional exponential si…

    Submitted 31 March, 2017; v1 submitted 6 April, 2016; originally announced April 2016.

    Comments: 15 pages, 12 figures

  42. arXiv:1504.08142  [pdf, other]

    stat.ML cs.CV cs.LG

    Semi-Orthogonal Multilinear PCA with Relaxed Start

    Authors: Qiquan Shi, Haiping Lu

    Abstract: Principal component analysis (PCA) is an unsupervised method for learning low-dimensional features with orthogonal projections. Multilinear PCA methods extend PCA to deal with multidimensional data (tensors) directly via tensor-to-tensor projection or tensor-to-vector projection (TVP). However, under the TVP setting, it is difficult to develop an effective multilinear PCA method with the orthogona…

    Submitted 6 May, 2015; v1 submitted 30 April, 2015; originally announced April 2015.

    Comments: 8 pages, 2 figures, to appear in Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI 2015)

    ACM Class: I.2.6

  43. arXiv:1410.3561  [pdf, ps, other]

    stat.ME

    Sufficient dimension reduction with additional information

    Authors: Hung Hung, Chih-Yen Liu, Henry Horng-Shing Lu

    Abstract: Sufficient dimension reduction is widely applied to help model building between the response $Y$ and covariate $X$. While the target of interest is the relationship between $(Y,X)$, in some applications we also collect additional variable $W$ that is strongly correlated with $Y$. From a statistical point of view, making inference about $(Y,X)$ without using $W$ will lose efficiency. However, it is…

    Submitted 13 October, 2014; originally announced October 2014.

    Comments: 26 pages, 4 figures, 1 table

  44. arXiv:1408.5352  [pdf, other]

    stat.ML cs.LG

    Nonconvex Statistical Optimization: Minimax-Optimal Sparse PCA in Polynomial Time

    Authors: Zhaoran Wang, Huanran Lu, Han Liu

    Abstract: Sparse principal component analysis (PCA) involves nonconvex optimization for which the global solution is hard to obtain. To address this issue, one popular approach is convex relaxation. However, such an approach may produce suboptimal estimators due to the relaxation effect. To optimally estimate sparse principal subspaces, we propose a two-stage computational framework named "tighten after rel…

    Submitted 22 August, 2014; originally announced August 2014.

    Comments: 64 pages, 8 figures

  45. arXiv:1307.0293  [pdf, other]

    stat.ML

    A Direct Estimation of High Dimensional Stationary Vector Autoregressions

    Authors: Fang Han, Huanran Lu, Han Liu

    Abstract: The vector autoregressive (VAR) model is a powerful tool in modeling complex time series and has been exploited in many fields. However, fitting high dimensional VAR model poses some unique challenges: On one hand, the dimensionality, caused by modeling a large number of time series and higher order autoregressive processes, is usually much higher than the time series length; On the other hand, th…

    Submitted 28 October, 2014; v1 submitted 1 July, 2013; originally announced July 2013.

    Comments: 36 pages, 3 figures