Showing 1–39 of 39 results for author: Needell, D

  1. arXiv:2405.03073  [pdf, other]

    math.OC stat.ML

    Convergence and Complexity Guarantee for Inexact First-order Riemannian Optimization Algorithms

    Authors: Yuchen Li, Laura Balzano, Deanna Needell, Hanbaek Lyu

    Abstract: We analyze inexact Riemannian gradient descent (RGD) where Riemannian gradients and retractions are inexactly (and cheaply) computed. Our focus is on understanding when inexact RGD converges and what is the complexity in the general nonconvex and constrained setting. We answer these questions in a general framework of tangential Block Majorization-Minimization (tBMM). We establish that tBMM conver…

    Submitted 9 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

    Comments: 23 pages, 5 figures. ICML 2024. Appendix revised

  2. arXiv:2403.06903  [pdf, ps, other]

    cs.LG stat.ML

    Benign overfitting in leaky ReLU networks with moderate input dimension

    Authors: Kedar Karhadkar, Erin George, Michael Murray, Guido Montúfar, Deanna Needell

    Abstract: The problem of benign overfitting asks whether it is possible for a model to perfectly fit noisy training data and still generalize well. We study benign overfitting in two-layer leaky ReLU networks trained with the hinge loss on a binary classification task. We consider input data that can be decomposed into the sum of a common signal and a random noise component, that lie on subspaces orthogonal…

    Submitted 9 July, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: 39 pages

  3. arXiv:2403.01204  [pdf, ps, other]

    cs.LG math.NA stat.ML

    Stochastic gradient descent for streaming linear and rectified linear systems with Massart noise

    Authors: Halyun Jeong, Deanna Needell, Elizaveta Rebrova

    Abstract: We propose SGD-exp, a stochastic gradient descent approach for linear and ReLU regressions under Massart noise (adversarial semi-random corruption model) for the fully streaming setting. We show novel nearly linear convergence guarantees of SGD-exp to the true parameter with up to $50\%$ Massart corruption rate, and with any corruption rate in the case of symmetric oblivious corruptions. This is t…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: Submitted to a journal

    MSC Class: 65F10; 60-XX

  4. arXiv:2312.10330  [pdf, other]

    math.OC stat.ML

    Convergence and complexity of block majorization-minimization for constrained block-Riemannian optimization

    Authors: Yuchen Li, Laura Balzano, Deanna Needell, Hanbaek Lyu

    Abstract: Block majorization-minimization (BMM) is a simple iterative algorithm for nonconvex optimization that sequentially minimizes a majorizing surrogate of the objective function in each block coordinate while the other block coordinates are held fixed. We consider a family of BMM algorithms for minimizing smooth nonconvex objectives, where each parameter block is constrained within a subset of a Riema…

    Submitted 6 August, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

    Comments: 54 pages, 8 figures. Related work updated

  5. arXiv:2307.04056  [pdf, other]

    stat.ML cs.LG eess.SP math.NA

    Manifold Filter-Combine Networks

    Authors: Joyce Chew, Edward De Brouwer, Smita Krishnaswamy, Deanna Needell, Michael Perlmutter

    Abstract: We introduce a class of manifold neural networks (MNNs) that we call Manifold Filter-Combine Networks (MFCNs), that aims to further our understanding of MNNs, analogous to how the aggregate-combine framework helps with the understanding of graph neural networks (GNNs). This class includes a wide variety of subclasses that can be thought of as the manifold analog of various popular GNNs. We then co…

    Submitted 5 September, 2023; v1 submitted 8 July, 2023; originally announced July 2023.

  6. arXiv:2306.04730  [pdf, other]

    eess.SP cs.LG math.NA math.OC stat.ML

    Stochastic Natural Thresholding Algorithms

    Authors: Rachel Grotheer, Shuang Li, Anna Ma, Deanna Needell, Jing Qin

    Abstract: Sparse signal recovery is one of the most fundamental problems in various applications, including medical imaging and remote sensing. Many greedy algorithms based on the family of hard thresholding operators have been developed to solve the sparse signal recovery problem. More recently, Natural Thresholding (NT) has been proposed with improved computational efficiency. This paper proposes and disc…

    Submitted 7 June, 2023; originally announced June 2023.

  7. arXiv:2304.10123  [pdf, other]

    stat.ML math.NA

    Linear Convergence of Reshuffling Kaczmarz Methods With Sparse Constraints

    Authors: Halyun Jeong, Deanna Needell

    Abstract: The Kaczmarz method (KZ) and its variants, which are types of stochastic gradient descent (SGD) methods, have been extensively studied due to their simplicity and efficiency in solving linear equation systems. The iterative thresholding (IHT) method has gained popularity in various research fields, including compressed sensing or sparse linear regression, machine learning with additional structure…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: Submitted to a journal

    MSC Class: 65F10; 65F22; 90C26

  8. arXiv:2303.00058  [pdf, other]

    cs.LG stat.ML

    Neural Nonnegative Matrix Factorization for Hierarchical Multilayer Topic Modeling

    Authors: Tyler Will, Runyu Zhang, Eli Sadovnik, Mengdi Gao, Joshua Vendrow, Jamie Haddock, Denali Molitor, Deanna Needell

    Abstract: We introduce a new method based on nonnegative matrix factorization, Neural NMF, for detecting latent hierarchical structure in data. Datasets with hierarchical structure arise in a wide variety of fields, such as document classification, image processing, and bioinformatics. Neural NMF recursively applies NMF in layers to discover overarching topics encompassing the lower-level features. We deriv…

    Submitted 28 February, 2023; originally announced March 2023.

  9. arXiv:2212.12606  [pdf, other]

    cs.LG eess.SP math.NA stat.ML

    A Convergence Rate for Manifold Neural Networks

    Authors: Joyce Chew, Deanna Needell, Michael Perlmutter

    Abstract: High-dimensional data arises in numerous applications, and the rapidly developing field of geometric deep learning seeks to develop neural network architectures to analyze such data in non-Euclidean domains, such as graphs and manifolds. Recent work by Z. Wang, L. Ruiz, and A. Ribeiro has introduced a method for constructing manifold neural networks using the spectral decomposition of the Laplace…

    Submitted 20 July, 2023; v1 submitted 23 December, 2022; originally announced December 2022.

  10. arXiv:2211.13496  [pdf, other]

    stat.CO

    Multi-scale Hybridized Topic Modeling: A Pipeline for Analyzing Unstructured Text Datasets via Topic Modeling

    Authors: Keyi Cheng, Stefan Inzer, Adrian Leung, Xiaoxian Shen, Michael Perlmutter, Michael Lindstrom, Joyce Chew, Todd Presner, Deanna Needell

    Abstract: We propose a multi-scale hybridized topic modeling method to find hidden topics from transcribed interviews more accurately and efficiently than traditional topic modeling methods. Our multi-scale hybridized topic modeling method (MSHTM) approaches data at different scales and performs topic modeling in a hierarchical way utilizing first a classical method, Nonnegative Matrix Factorization, and th…

    Submitted 24 November, 2022; originally announced November 2022.

  11. arXiv:2211.05749  [pdf, other]

    stat.CO stat.ML

    Sketched Gaussian Model Linear Discriminant Analysis via the Randomized Kaczmarz Method

    Authors: Jocelyn T. Chi, Deanna Needell

    Abstract: We present sketched linear discriminant analysis, an iterative randomized approach to binary-class Gaussian model linear discriminant analysis (LDA) for very large data. We harness a least squares formulation and mobilize the stochastic gradient descent framework. Therefore, we obtain a randomized classifier with performance that is very comparable to that of full data LDA while requiring access t…

    Submitted 10 November, 2022; originally announced November 2022.

  12. arXiv:2209.04968  [pdf, other]

    stat.CO

    Population-Based Hierarchical Non-negative Matrix Factorization for Survey Data

    Authors: Xiaofu Ding, Xinyu Dong, Olivia McGough, Chenxin Shen, Annie Ulichney, Ruiyao Xu, William Swartworth, Jocelyn T. Chi, Deanna Needell

    Abstract: Motivated by the problem of identifying potential hierarchical population structure on modern survey data containing a wide range of complex data types, we introduce population-based hierarchical non-negative matrix factorization (PHNMF). PHNMF is a variant of hierarchical non-negative matrix factorization based on feature similarity. As such, it enables an automatic and interpretable approach for…

    Submitted 11 September, 2022; originally announced September 2022.

  13. arXiv:2208.08561  [pdf, other]

    stat.ML cs.LG math.SP

    Geometric Scattering on Measure Spaces

    Authors: Joyce Chew, Matthew Hirn, Smita Krishnaswamy, Deanna Needell, Michael Perlmutter, Holly Steach, Siddharth Viswanath, Hau-Tieng Wu

    Abstract: The scattering transform is a multilayered, wavelet-based transform initially introduced as a model of convolutional neural networks (CNNs) that has played a foundational role in our understanding of these networks' stability and invariance properties. Subsequently, there has been widespread interest in extending the success of CNNs to data sets with non-Euclidean structure, such as graphs and man…

    Submitted 13 October, 2022; v1 submitted 17 August, 2022; originally announced August 2022.

    MSC Class: 68T07

  14. arXiv:2206.10078  [pdf, other]

    cs.LG eess.SP math.NA stat.ML

    The Manifold Scattering Transform for High-Dimensional Point Cloud Data

    Authors: Joyce Chew, Holly R. Steach, Siddharth Viswanath, Hau-Tieng Wu, Matthew Hirn, Deanna Needell, Smita Krishnaswamy, Michael Perlmutter

    Abstract: The manifold scattering transform is a deep feature extractor for data defined on a Riemannian manifold. It is one of the first examples of extending convolutional neural network-like operators to general manifolds. The initial work on this model focused primarily on its theoretical stability and invariance properties but did not provide methods for its numerical implementation except in the case…

    Submitted 21 January, 2024; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: Accepted for publication in the TAG in DS Workshop at ICML. For subsequent theoretical guarantees, please see Section 6 of arXiv:2208.08561

    MSC Class: 68T07; ACM Class: I.2.6

  15. arXiv:2201.13324  [pdf, other]

    cs.LG cs.IR stat.ML

    Guided Semi-Supervised Non-negative Matrix Factorization on Legal Documents

    Authors: Pengyu Li, Christine Tseng, Yaxuan Zheng, Joyce A. Chew, Longxiu Huang, Benjamin Jarman, Deanna Needell

    Abstract: Classification and topic modeling are popular techniques in machine learning that extract information from large-scale datasets. By incorporating a priori information such as labels or important features, methods have been developed to perform classification and topic modeling tasks; however, most methods that can perform both do not allow for guidance of the topics or features. In this paper, we…

    Submitted 31 January, 2022; originally announced January 2022.

    Comments: 14 pages, 4 figures

  16. arXiv:2109.14820  [pdf, other]

    cs.LG stat.ML

    A Generalized Hierarchical Nonnegative Tensor Decomposition

    Authors: Joshua Vendrow, Jamie Haddock, Deanna Needell

    Abstract: Nonnegative matrix factorization (NMF) has found many applications including topic modeling and document analysis. Hierarchical NMF (HNMF) variants are able to learn topics at various levels of granularity and illustrate their hierarchical relationship. Recently, nonnegative tensor factorization (NTF) methods have been applied in a similar fashion in order to handle data sets with complex, multi-m…

    Submitted 15 February, 2022; v1 submitted 29 September, 2021; originally announced September 2021.

    Comments: 6 pages, 2 figures, 3 tables

  17. arXiv:2109.14079  [pdf, other]

    cs.IT math.NA stat.CO

    Robust recovery of bandlimited graph signals via randomized dynamical sampling

    Authors: Longxiu Huang, Deanna Needell, Sui Tang

    Abstract: Heat diffusion processes have found wide applications in modelling dynamical systems over graphs. In this paper, we consider the recovery of a $k$-bandlimited graph signal that is an initial signal of a heat diffusion process from its space-time samples. We propose three random space-time sampling regimes, termed dynamical sampling techniques, that consist in selecting a small subset of space-time…

    Submitted 3 October, 2021; v1 submitted 28 September, 2021; originally announced September 2021.

    Comments: corrected mistakes in plotting. arXiv admin note: text overlap with arXiv:1511.05118 by other authors

    MSC Class: 94A20; 94A12

  18. arXiv:2105.09065  [pdf, other]

    stat.AP

    Statistical Learning for Best Practices in Tattoo Removal

    Authors: Richard Yim, Jamie Haddock, Deanna Needell

    Abstract: The causes behind complications in laser-assisted tattoo removal are currently not well understood, and in the literature relating to tattoo removal the emphasis on removal treatment is on removal technologies and tools, not best parameters involved in the treatment process. Additionally, the very challenge of determining best practices is difficult given the complexity of interactions between fac…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: 15 pages, 2 figures, 9 tables

  19. arXiv:2009.09087  [pdf, other]

    cs.CY cs.LG stat.ML

    Feature Selection on Lyme Disease Patient Survey Data

    Authors: Joshua Vendrow, Jamie Haddock, Deanna Needell, Lorraine Johnson

    Abstract: Lyme disease is a rapidly growing illness that remains poorly understood within the medical community. Critical questions about when and why patients respond to treatment or stay ill, what kinds of treatments are effective, and even how to properly diagnose the disease remain largely unanswered. We investigate these questions by applying machine learning techniques to a large scale Lyme disease pa…

    Submitted 24 August, 2020; originally announced September 2020.

    Comments: 9 pages, 8 figures, 6 tables

  20. arXiv:2009.09074  [pdf, other]

    cs.DL cs.IR cs.LG stat.ML

    COVID-19 Literature Topic-Based Search via Hierarchical NMF

    Authors: Rachel Grotheer, Yihuan Huang, Pengyu Li, Elizaveta Rebrova, Deanna Needell, Longxiu Huang, Alona Kryshchenko, Xia Li, Kyung Ha, Oleksandr Kryshchenko

    Abstract: A dataset of COVID-19-related scientific literature is compiled, combining the articles from several online libraries and selecting those with open access and full text available. Then, hierarchical nonnegative matrix factorization is used to organize literature related to the novel coronavirus into a tree structure that allows researchers to search for relevant literature based on detected topics…

    Submitted 7 September, 2020; originally announced September 2020.

  21. arXiv:2009.07612  [pdf, other]

    stat.ML cs.LG math.OC

    Online nonnegative CP-dictionary learning for Markovian data

    Authors: Hanbaek Lyu, Christopher Strohmeier, Deanna Needell

    Abstract: Online Tensor Factorization (OTF) is a fundamental tool in learning low-dimensional interpretable features from streaming multi-modal data. While various algorithmic and theoretical aspects of OTF have been investigated recently, a general convergence guarantee to stationary points of the objective function without any incoherence or sparsity assumptions is still lacking even for the i.i.d. case.…

    Submitted 2 April, 2022; v1 submitted 16 September, 2020; originally announced September 2020.

    Comments: 41 pages, 5 figures

  22. arXiv:2009.01279  [pdf, other]

    stat.ML cs.LG eess.SP

    Clustering of Nonnegative Data and an Application to Matrix Completion

    Authors: C. Strohmeier, D. Needell

    Abstract: In this paper, we propose a simple algorithm to cluster nonnegative data lying in disjoint subspaces. We analyze its performance in relation to a certain measure of correlation between said subspaces. We use our clustering algorithm to develop a matrix completion algorithm which can outperform standard matrix completion algorithms on data matrices satisfying certain natural conditions.

    Submitted 2 September, 2020; originally announced September 2020.

  23. arXiv:2007.15776  [pdf, other]

    stat.ML cs.IT cs.LG math.PR

    Random Vector Functional Link Networks for Function Approximation on Manifolds

    Authors: Deanna Needell, Aaron A. Nelson, Rayan Saab, Palina Salanevich, Olov Schavemaker

    Abstract: The learning speed of feed-forward neural networks is notoriously slow and has presented a bottleneck in deep learning applications for several decades. For instance, gradient-based learning algorithms, which are used extensively to train neural networks, tend to work slowly when all of the network parameters must be iteratively tuned. To counter this, both researchers and practitioners have tried…

    Submitted 26 August, 2024; v1 submitted 30 July, 2020; originally announced July 2020.

    Comments: 37 pages, 1 figure

    MSC Class: 62M45

    Journal ref: Frontiers in Applied Mathematics and Statistics 10 (2024), 2297-4687

  24. arXiv:2004.09112  [pdf, other]

    cs.LG math.OC stat.ML

    COVID-19 Time-series Prediction by Joint Dictionary Learning and Online NMF

    Authors: Hanbaek Lyu, Christopher Strohmeier, Georg Menz, Deanna Needell

    Abstract: Predicting the spread and containment of COVID-19 is a challenge of utmost importance that the broader scientific community is currently facing. One of the main sources of difficulty is that a very limited amount of daily COVID-19 case data is available, and with few exceptions, the majority of countries are currently in the "exponential spread stage," and thus there is scarce information availabl…

    Submitted 20 April, 2020; originally announced April 2020.

    Comments: 8 pages, 4 figures

  25. arXiv:2001.00631  [pdf, other]

    cs.LG stat.ML

    On Large-Scale Dynamic Topic Modeling with Nonnegative CP Tensor Decomposition

    Authors: Miju Ahn, Nicole Eikmeier, Jamie Haddock, Lara Kassab, Alona Kryshchenko, Kathryn Leonard, Deanna Needell, R. W. M. A. Madushani, Elena Sizikova, Chuntian Wang

    Abstract: There is currently an unprecedented demand for large-scale temporal data analysis due to the explosive growth of data. Dynamic topic modeling has been widely used in social and data sciences with the goal of learning latent topics that emerge, evolve, and fade over time. Previous work on dynamic topic modeling primarily employ the method of nonnegative matrix factorization (NMF), where slices of t…

    Submitted 14 October, 2020; v1 submitted 2 January, 2020; originally announced January 2020.

    Comments: 23 pages, 29 figures, submitted to Women in Data Science and Mathematics (WiSDM) Workshop Proceedings, "Advances in Data Science", AWM-Springer series

  26. arXiv:1912.08294  [pdf, other]

    math.NA stat.ML

    Lower Memory Oblivious (Tensor) Subspace Embeddings with Fewer Random Bits: Modewise Methods for Least Squares

    Authors: M. A. Iwen, D. Needell, E. Rebrova, A. Zare

    Abstract: In this paper new general modewise Johnson-Lindenstrauss (JL) subspace embeddings are proposed that are both considerably faster to generate and easier to store than traditional JL embeddings when working with extremely large vectors and/or tensors. Corresponding embedding results are then proven for two different types of low-dimensional (tensor) subspaces. The first of these new subspace embed…

    Submitted 16 December, 2020; v1 submitted 17 December, 2019; originally announced December 2019.

  27. arXiv:1912.00315  [pdf, other]

    cs.CL cs.LG math.OC stat.ML

    Topic-aware chatbot using Recurrent Neural Networks and Nonnegative Matrix Factorization

    Authors: Yuchen Guo, Nicholas Hanoian, Zhexiao Lin, Nicholas Liskij, Hanbaek Lyu, Deanna Needell, Jiahao Qu, Henry Sojico, Yuliang Wang, Zhe Xiong, Zhenhong Zou

    Abstract: We propose a novel model for a topic-aware chatbot by combining the traditional Recurrent Neural Network (RNN) encoder-decoder model with a topic attention layer based on Nonnegative Matrix Factorization (NMF). After learning topic vectors from an auxiliary text corpus via NMF, the decoder is trained so that it is more likely to sample response words from the most correlated topic vectors. One of…

    Submitted 4 December, 2019; v1 submitted 30 November, 2019; originally announced December 2019.

    Comments: 14 pages, 1 figure, 2 tables

  28. arXiv:1911.01931  [pdf, other]

    cs.LG cs.DS math.OC math.PR stat.ML

    Online matrix factorization for Markovian data and applications to Network Dictionary Learning

    Authors: Hanbaek Lyu, Deanna Needell, Laura Balzano

    Abstract: Online Matrix Factorization (OMF) is a fundamental tool for dictionary learning problems, giving an approximate representation of complex data sets in terms of a reduced number of extracted features. Convergence guarantees for most of the OMF algorithms in the literature assume independence between data matrices, and the case of dependent data streams remains largely unexplored. In this paper, we…

    Submitted 7 November, 2020; v1 submitted 5 November, 2019; originally announced November 2019.

    Comments: 39 pages, 13 figures

    Journal ref: Journal of Machine Learning Research 21 (2020)

  29. arXiv:1908.08479  [pdf, other]

    math.NA stat.ML

    Iterative Hard Thresholding for Low CP-rank Tensor Models

    Authors: Rachel Grotheer, Shuang Li, Anna Ma, Deanna Needell, Jing Qin

    Abstract: Recovery of low-rank matrices from a small number of linear measurements is now well-known to be possible under various model assumptions on the measurements. Such results demonstrate robustness and are backed with provable theoretical guarantees. However, extensions to tensor recovery have only recently begun to be studied and developed, despite an abundance of practical tensor applications. Rece…

    Submitted 22 August, 2019; originally announced August 2019.

  30. arXiv:1907.11746  [pdf, other]

    stat.ML cs.LG

    Bias of Homotopic Gradient Descent for the Hinge Loss

    Authors: Denali Molitor, Deanna Needell, Rachel Ward

    Abstract: Gradient descent is a simple and widely used optimization method for machine learning. For homogeneous linear classifiers applied to separable data, gradient descent has been shown to converge to the maximal margin (or equivalently, the minimal norm) solution for various smooth loss functions. The previous theory does not, however, apply to non-smooth functions such as the hinge loss which is wide…

    Submitted 26 July, 2019; originally announced July 2019.

  31. arXiv:1905.13404  [pdf, other]

    cs.LG math.OC stat.ML

    Data-driven Algorithm Selection and Parameter Tuning: Two Case studies in Optimization and Signal Processing

    Authors: Jesus A. De Loera, Jamie Haddock, Anna Ma, Deanna Needell

    Abstract: Machine learning algorithms typically rely on optimization subroutines and are well-known to provide very effective outcomes for many types of problems. Here, we flip the reliance and ask the reverse question: can machine learning algorithms lead to more effective outcomes for optimization problems? Our goal is to train machine learning methods to automatically improve the performance of optimizat…

    Submitted 26 July, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

  32. arXiv:1904.08540  [pdf, other]

    cs.LG stat.ML

    Matrix Completion With Selective Sampling

    Authors: Christian Parkinson, Kevin Huynh, Deanna Needell

    Abstract: Matrix completion is a classical problem in data science wherein one attempts to reconstruct a low-rank matrix while only observing some subset of the entries. Previous authors have phrased this problem as a nuclear norm minimization problem. Almost all previous work assumes no explicit structure of the matrix and uses uniform sampling to decide the observed entries. We suggest methods for selecti…

    Submitted 17 April, 2019; originally announced April 2019.

    Comments: 4 pages, 4 figures

  33. arXiv:1809.03041  [pdf, other]

    stat.ML cs.LG

    An iterative method for classification of binary data

    Authors: Denali Molitor, Deanna Needell

    Abstract: In today's data driven world, storing, processing, and gleaning insights from large-scale data are major challenges. Data compression is often required in order to store large amounts of high-dimensional data, and thus, efficient inference methods for analyzing compressed data are necessary. Building on a recently designed simple framework for classification using binary data, we demonstrate that…

    Submitted 9 September, 2018; originally announced September 2018.

    MSC Class: 68T05; 68P30; 68U10

  34. arXiv:1807.08825  [pdf, other]

    cs.LG stat.ML

    Hierarchical Classification using Binary Data

    Authors: Denali Molitor, Deanna Needell

    Abstract: In classification problems, especially those that categorize data into a large number of classes, the classes often naturally follow a hierarchical structure. That is, some classes are likely to share similar structures and features. Those characteristics can be captured by considering a hierarchical relationship among the class labels. Here, we extend a recent simple classification approach on bi…

    Submitted 23 July, 2018; originally announced July 2018.

    Comments: AAAI Magazine special Issue on Deep Models, Machine Learning and Artificial Intelligence Applications in National and International Security, June, 2018

  35. arXiv:1805.12529  [pdf, other]

    cs.LG stat.ML

    Analysis of Fast Structured Dictionary Learning

    Authors: Saiprasad Ravishankar, Anna Ma, Deanna Needell

    Abstract: Sparsity-based models and techniques have been exploited in many signal processing and imaging applications. Data-driven methods based on dictionary and sparsifying transform learning enable learning rich image features from data, and can outperform analytical models. In particular, alternating optimization algorithms have been popular for learning such models. In this work, we focus on alternatin…

    Submitted 23 September, 2019; v1 submitted 31 May, 2018; originally announced May 2018.

    Comments: This article has been accepted for publication in Information and Inference Published by Oxford University Press

  36. arXiv:1801.09657  [pdf, other]

    math.NA cs.LG stat.ME

    Matrix Completion for Structured Observations

    Authors: Denali Molitor, Deanna Needell

    Abstract: The need to predict or fill-in missing data, often referred to as matrix completion, is a common challenge in today's data-driven world. Previous strategies typically assume that no structural difference between observed and missing entries exists. Unfortunately, this assumption is woefully unrealistic in many applications. For example, in the classic Netflix challenge, in which one hopes to predi…

    Submitted 29 January, 2018; originally announced January 2018.

  37. arXiv:1707.01945  [pdf, other]

    cs.LG math.NA stat.ML

    Simple Classification using Binary Data

    Authors: Deanna Needell, Rayan Saab, Tina Woolf

    Abstract: Binary, or one-bit, representations of data arise naturally in many applications, and are appealing in both hardware implementations and algorithm design. In this work, we study the problem of data classification from binary data and propose a framework with low computation and resource costs. We illustrate the utility of the proposed approach through stylized and realistic numerical experiments,…

    Submitted 6 July, 2017; originally announced July 2017.

  38. arXiv:1310.5715  [pdf, ps, other]

    math.NA cs.CV cs.LG math.OC stat.ML

    Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm

    Authors: Deanna Needell, Nathan Srebro, Rachel Ward

    Abstract: We obtain an improved finite-sample guarantee on the linear convergence of stochastic gradient descent for smooth and strongly convex objectives, improving from a quadratic dependence on the conditioning $(L/μ)^2$ (where $L$ is a bound on the smoothness and $μ$ on the strong convexity) to a linear dependence on $L/μ$. Furthermore, we show how reweighting the sampling distribution (i.e. importance…

    Submitted 16 January, 2015; v1 submitted 21 October, 2013; originally announced October 2013.

    Comments: 22 pages, 6 figures

    MSC Class: 65B99; 52A99; 60G99; 62L20

  39. arXiv:1211.3444  [pdf, other]

    cs.LG math.NA stat.ML

    Spectral Clustering: An empirical study of Approximation Algorithms and its Application to the Attrition Problem

    Authors: B. Cung, T. Jin, J. Ramirez, A. Thompson, C. Boutsidis, D. Needell

    Abstract: Clustering is the problem of separating a set of objects into groups (called clusters) so that objects within the same cluster are more similar to each other than to those in different clusters. Spectral clustering is a now well-known method for clustering which utilizes the spectrum of the data similarity matrix to perform this separation. Since the method relies on solving an eigenvector problem…

    Submitted 14 November, 2012; originally announced November 2012.