Showing 1–22 of 22 results for author: Klivans, A

Searching in archive stat.
  1. arXiv:2307.01178  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Learning Mixtures of Gaussians Using the DDPM Objective

    Authors: Kulin Shah, Sitan Chen, Adam Klivans

    Abstract: Recent works have shown that diffusion models can learn essentially any distribution provided one can perform score estimation. Yet it remains poorly understood under what settings score estimation is possible, let alone when practical gradient-based algorithms for this task can provably succeed. In this work, we give the first provably efficient results along these lines for one of the most fun…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: 48 pages

  2. arXiv:2306.10615  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Agnostically Learning Single-Index Models using Omnipredictors

    Authors: Aravind Gollakota, Parikshit Gopalan, Adam R. Klivans, Konstantinos Stavropoulos

    Abstract: We give the first result for agnostically learning Single-Index Models (SIMs) with arbitrary monotone and Lipschitz activations. All prior work either held only in the realizable setting or required the activation to be known. Moreover, we only require the marginal to have bounded second moments, whereas all prior work required stronger distributional assumptions (such as anticoncentration or boun…

    Submitted 18 June, 2023; originally announced June 2023.

    Comments: 21 pages

  3. arXiv:2305.11765  [pdf, other]

    cs.LG cs.DS stat.ML

    Tester-Learners for Halfspaces: Universal Algorithms

    Authors: Aravind Gollakota, Adam R. Klivans, Konstantinos Stavropoulos, Arsen Vasilyan

    Abstract: We give the first tester-learner for halfspaces that succeeds universally over a wide class of structured distributions. Our universal tester-learner runs in fully polynomial time and has the following guarantee: the learner achieves error $O(\mathrm{opt}) + ε$ on any labeled distribution that the tester accepts, and moreover, the tester accepts whenever the marginal is any distribution that satis…

    Submitted 19 May, 2023; originally announced May 2023.

    Comments: 26 pages, 2 figures

  4. arXiv:2304.10524  [pdf, other]

    cs.LG cs.DS stat.ML

    Learning Narrow One-Hidden-Layer ReLU Networks

    Authors: Sitan Chen, Zehao Dou, Surbhi Goel, Adam R Klivans, Raghu Meka

    Abstract: We consider the well-studied problem of learning a linear combination of $k$ ReLU activations with respect to a Gaussian distribution on inputs in $d$ dimensions. We give the first polynomial-time algorithm that succeeds whenever $k$ is a constant. All prior polynomial-time learners require additional assumptions on the network, such as positive combining coefficients or the matrix of hidden weigh…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: 33 pages, comments welcome

  5. arXiv:2302.14853  [pdf, other]

    cs.LG stat.ML

    An Efficient Tester-Learner for Halfspaces

    Authors: Aravind Gollakota, Adam R. Klivans, Konstantinos Stavropoulos, Arsen Vasilyan

    Abstract: We give the first efficient algorithm for learning halfspaces in the testable learning model recently defined by Rubinfeld and Vasilyan (2023). In this model, a learner certifies that the accuracy of its output hypothesis is near optimal whenever the training set passes an associated test, and training sets drawn from some target distribution -- e.g., the Gaussian -- must pass the test. This model…

    Submitted 13 March, 2023; v1 submitted 28 February, 2023; originally announced February 2023.

    Comments: 26 pages, 3 figures, Version v2: strengthened the agnostic guarantee

  6. arXiv:2211.13312  [pdf, ps, other]

    cs.LG cs.CC stat.ML

    A Moment-Matching Approach to Testable Learning and a New Characterization of Rademacher Complexity

    Authors: Aravind Gollakota, Adam R. Klivans, Pravesh K. Kothari

    Abstract: A remarkable recent paper by Rubinfeld and Vasilyan (2022) initiated the study of testable learning, where the goal is to replace hard-to-verify distributional assumptions (such as Gaussianity) with efficiently testable ones and to require that the learner succeed whenever the unknown distribution passes the corresponding test. In this model, they gave an efficient algorithm for learning ha…

    Submitted 23 November, 2022; originally announced November 2022.

    Comments: 34 pages

  7. arXiv:2202.05258  [pdf, ps, other]

    cs.LG cs.CC stat.ML

    Hardness of Noise-Free Learning for Two-Hidden-Layer Neural Networks

    Authors: Sitan Chen, Aravind Gollakota, Adam R. Klivans, Raghu Meka

    Abstract: We give superpolynomial statistical query (SQ) lower bounds for learning two-hidden-layer ReLU networks with respect to Gaussian inputs in the standard (noise-free) model. No general SQ lower bounds were known for learning ReLU networks of any depth in this setting: previous SQ lower bounds held only for adversarial noise models (agnostic learning) or restricted models such as correlational SQ.…

    Submitted 13 November, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

    Comments: 35 pages, v3: refined exposition

  8. arXiv:2009.13512  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Learning Deep ReLU Networks Is Fixed-Parameter Tractable

    Authors: Sitan Chen, Adam R. Klivans, Raghu Meka

    Abstract: We consider the problem of learning an unknown ReLU network with respect to Gaussian inputs and obtain the first nontrivial results for networks of depth more than two. We give an algorithm whose running time is a fixed polynomial in the ambient dimension and some (exponentially large) function of only the network's parameters. Our bounds depend on the number of hidden units, depth, spectral nor…

    Submitted 28 September, 2020; originally announced September 2020.

    Comments: 39 pages

  9. arXiv:2007.12815  [pdf, other]

    cs.LG cs.DS stat.ML

    From Boltzmann Machines to Neural Networks and Back Again

    Authors: Surbhi Goel, Adam Klivans, Frederic Koehler

    Abstract: Graphical models are powerful tools for modeling high-dimensional data, but learning graphical models in the presence of latent variables is well-known to be difficult. In this work we give new results for learning Restricted Boltzmann Machines, probably the most well-studied class of latent variable models. Our results are based on new connections to learning two-layer neural networks under…

    Submitted 24 July, 2020; originally announced July 2020.

  10. arXiv:2006.15812  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Statistical-Query Lower Bounds via Functional Gradients

    Authors: Surbhi Goel, Aravind Gollakota, Adam Klivans

    Abstract: We give the first statistical-query lower bounds for agnostically learning any non-polynomial activation with respect to Gaussian marginals (e.g., ReLU, sigmoid, sign). For the specific problem of ReLU regression (equivalently, agnostically learning a ReLU), we show that any statistical-query algorithm with tolerance $n^{-(1/ε)^b}$ must use at least $2^{n^c} ε$ queries for some constant…

    Submitted 22 October, 2020; v1 submitted 29 June, 2020; originally announced June 2020.

    Comments: 34 pages, NeurIPS 2020

  11. arXiv:2006.12011  [pdf, other]

    cs.LG cs.DS stat.ML

    Superpolynomial Lower Bounds for Learning One-Layer Neural Networks using Gradient Descent

    Authors: Surbhi Goel, Aravind Gollakota, Zhihan Jin, Sushrut Karmalkar, Adam Klivans

    Abstract: We prove the first superpolynomial lower bounds for learning one-layer neural networks with respect to the Gaussian distribution using gradient descent. We show that any classifier trained using gradient descent with respect to square-loss will fail to achieve small test error in polynomial time given access to samples labeled by a one-layer neural network. For classification, we give a stronger r…

    Submitted 22 October, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

    Comments: 25 pages, ICML 2020

  12. arXiv:2005.12844  [pdf, other]

    cs.LG cs.DS stat.ML

    Approximation Schemes for ReLU Regression

    Authors: Ilias Diakonikolas, Surbhi Goel, Sushrut Karmalkar, Adam R. Klivans, Mahdi Soltanolkotabi

    Abstract: We consider the fundamental problem of ReLU regression, where the goal is to output the best fitting ReLU with respect to square loss given access to draws from some unknown distribution. We give the first efficient, constant-factor approximation algorithm for this problem assuming the underlying distribution satisfies some weak concentration and anti-concentration conditions (and includes, for ex…

    Submitted 28 September, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

  13. arXiv:2003.01794  [pdf, other]

    cs.LG stat.ML

    Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection

    Authors: Mao Ye, Chengyue Gong, Lizhen Nie, Denny Zhou, Adam Klivans, Qiang Liu

    Abstract: Recent empirical works show that large deep neural networks are often highly redundant and one can find much smaller subnetworks without a significant drop of accuracy. However, most existing methods of network pruning are empirical and heuristic, leaving it open whether good subnetworks provably exist, how to find them efficiently, and if network pruning can be provably better than direct trainin…

    Submitted 19 October, 2020; v1 submitted 3 March, 2020; originally announced March 2020.

    Comments: ICML 2020

  14. arXiv:1911.01462  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals

    Authors: Surbhi Goel, Sushrut Karmalkar, Adam Klivans

    Abstract: We consider the problem of computing the best-fitting ReLU with respect to square-loss on a training set when the examples have been drawn according to a spherical Gaussian distribution (the labels can be arbitrary). Let $\mathsf{opt} < 1$ be the population loss of the best-fitting ReLU. We prove: 1. Finding a ReLU with square-loss $\mathsf{opt} + ε$ is as hard as the problem of learning sparse…

    Submitted 4 November, 2019; originally announced November 2019.

    Comments: To appear in NeurIPS 2019 (Spotlight)

  15. arXiv:1905.05679  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    List-Decodable Linear Regression

    Authors: Sushrut Karmalkar, Adam R. Klivans, Pravesh K. Kothari

    Abstract: We give the first polynomial-time algorithm for robust regression in the list-decodable setting where an adversary can corrupt a greater than $1/2$ fraction of examples. For any $α< 1$, our algorithm takes as input a sample $\{(x_i,y_i)\}_{i \leq n}$ of $n$ linear equations where $αn$ of the equations satisfy $y_i = \langle x_i,\ell^*\rangle +ζ$ for some small noise $ζ$ and $(1-α)n$ of the equat…

    Submitted 30 May, 2019; v1 submitted 14 May, 2019; originally announced May 2019.

    Comments: 28 Pages

  16. arXiv:1902.04728  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Learning Ising Models with Independent Failures

    Authors: Surbhi Goel, Daniel M. Kane, Adam R. Klivans

    Abstract: We give the first efficient algorithm for learning the structure of an Ising model that tolerates independent failures; that is, each entry of the observed sample is missing with some unknown probability p. Our algorithm matches the essentially optimal runtime and sample complexity bounds of recent work for learning Ising models due to Klivans and Meka (2017). We devise a novel unbiased estimato…

    Submitted 12 February, 2019; originally announced February 2019.

  17. arXiv:1803.03241  [pdf, ps, other]

    cs.LG cs.AI cs.DS stat.ML

    Efficient Algorithms for Outlier-Robust Regression

    Authors: Adam Klivans, Pravesh K. Kothari, Raghu Meka

    Abstract: We give the first polynomial-time algorithm for performing linear or polynomial regression resilient to adversarial corruptions in both examples and labels. Given a sufficiently large (polynomial-size) training set drawn i.i.d. from distribution D and subsequently corrupted on some fraction of points, our algorithm outputs a linear function whose squared error is close to the squared error of th…

    Submitted 4 June, 2020; v1 submitted 8 March, 2018; originally announced March 2018.

    Comments: 27 pages. Appeared in COLT 2018. This update removes Lemma 6.2 that erroneously claimed an information-theoretic lower bound on error rate as a function of fraction of outliers

  18. arXiv:1802.02547  [pdf, other]

    cs.LG cs.DS stat.ML

    Learning One Convolutional Layer with Overlapping Patches

    Authors: Surbhi Goel, Adam Klivans, Raghu Meka

    Abstract: We give the first provably efficient algorithm for learning a one hidden layer convolutional network with respect to a general class of (potentially overlapping) patches. Additionally, our algorithm requires only mild conditions on the underlying distribution. We prove that our framework captures commonly used schemes from computer vision, including one-dimensional and two-dimensional "patch and s…

    Submitted 7 February, 2018; originally announced February 2018.

  19. arXiv:1709.06010  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Learning Neural Networks with Two Nonlinear Layers in Polynomial Time

    Authors: Surbhi Goel, Adam Klivans

    Abstract: We give a polynomial-time algorithm for learning neural networks with one layer of sigmoids feeding into any Lipschitz, monotone activation function (e.g., sigmoid or ReLU). We make no assumptions on the structure of the network, and the algorithm succeeds with respect to any distribution on the unit ball in $n$ dimensions (hidden weight vectors also have unit norm). This is the first assump…

    Submitted 20 April, 2018; v1 submitted 18 September, 2017; originally announced September 2017.

    Comments: Changed title, included new results

  20. arXiv:1706.00764  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    Hyperparameter Optimization: A Spectral Approach

    Authors: Elad Hazan, Adam Klivans, Yang Yuan

    Abstract: We give a simple, fast algorithm for hyperparameter optimization inspired by techniques from the analysis of Boolean functions. We focus on the high-dimensional regime where the canonical example is training a neural network with a large number of hyperparameters. The algorithm --- an iterative application of compressed sensing techniques for orthogonal polynomials --- requires only uniform sampli…

    Submitted 19 January, 2018; v1 submitted 2 June, 2017; originally announced June 2017.

  21. arXiv:1703.02689  [pdf, ps, other]

    stat.ML cs.DS cs.IT cs.LG

    Exact MAP Inference by Avoiding Fractional Vertices

    Authors: Erik M. Lindgren, Alexandros G. Dimakis, Adam Klivans

    Abstract: Given a graphical model, one essential problem is MAP inference, that is, finding the most likely configuration of states according to the model. Although this problem is NP-hard, large instances can be solved in practice. A major open question is to explain why this is true. We give a natural condition under which we can provably perform MAP inference in polynomial time. We require that the numbe…

    Submitted 7 March, 2017; originally announced March 2017.

  22. arXiv:1611.10258  [pdf, ps, other]

    cs.LG cs.CC stat.ML

    Reliably Learning the ReLU in Polynomial Time

    Authors: Surbhi Goel, Varun Kanade, Adam Klivans, Justin Thaler

    Abstract: We give the first dimension-efficient algorithms for learning Rectified Linear Units (ReLUs), which are functions of the form $\mathbf{x} \mapsto \max(0, \mathbf{w} \cdot \mathbf{x})$ with $\mathbf{w} \in \mathbb{S}^{n-1}$. Our algorithm works in the challenging Reliable Agnostic learning model of Kalai, Kanade, and Mansour (2009) where the learner is given access to a distribution $\mathcal{D}$ on la…

    Submitted 30 November, 2016; originally announced November 2016.