Showing 1–10 of 10 results for author: Haque, S

  1. arXiv:2301.07078

    stat.ML cs.CR cs.DS cs.LG

    A Fast Algorithm for Adaptive Private Mean Estimation

    Authors: John Duchi, Saminul Haque, Rohith Kuditipudi

    Abstract: We design an $(\varepsilon, \delta)$-differentially private algorithm to estimate the mean of a $d$-variate distribution, with unknown covariance $\Sigma$, that is adaptive to $\Sigma$. To within polylogarithmic factors, the estimator achieves optimal rates of convergence with respect to the induced Mahalanobis norm $\|\cdot\|_\Sigma$, takes time $\tilde{O}(n d^2)$ to compute, has near linear sample complexity for su…

    Submitted 17 January, 2023; originally announced January 2023.

    Comments: 38 pages, no figures

  2. arXiv:2206.03328

    cs.LG math.OC stat.ML

    Concentration bounds for SSP Q-learning for average cost MDPs

    Authors: Shaan Ul Haque, Vivek Borkar

    Abstract: We derive a concentration bound for a Q-learning algorithm for average cost Markov decision processes based on an equivalent shortest path problem, and compare it numerically with the alternative scheme based on relative value iteration.

    Submitted 12 June, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: 6 pages, 2 figures

  3. arXiv:2205.13094

    cs.LG cs.AI math.ST stat.ML

    Undersampling is a Minimax Optimal Robustness Intervention in Nonparametric Classification

    Authors: Niladri S. Chatterji, Saminul Haque, Tatsunori Hashimoto

    Abstract: While a broad range of techniques have been proposed to tackle distribution shift, the simple baseline of training on an $\textit{undersampled}$ balanced dataset often achieves close to state-of-the-art accuracy across several popular benchmarks. This is rather surprising, since undersampling algorithms discard excess majority group data. To understand this phenomenon, we ask if learning is fundam…

    Submitted 19 June, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

  4. arXiv:2203.01667

    cs.LG stat.ML

    Joint Probability Estimation Using Tensor Decomposition and Dictionaries

    Authors: Shaan ul Haque, Ajit Rajwade, Karthik S. Gurumoorthy

    Abstract: In this work, we study non-parametric estimation of joint probabilities of a given set of discrete and continuous random variables from their (empirically estimated) 2D marginals, under the assumption that the joint probability could be decomposed and approximated by a mixture of product densities/mass functions. The problem of estimating the joint probability density function (PDF) using semi-par…

    Submitted 3 March, 2022; originally announced March 2022.

  5. arXiv:2112.12986

    cs.LG stat.ML

    Is Importance Weighting Incompatible with Interpolating Classifiers?

    Authors: Ke Alexander Wang, Niladri S. Chatterji, Saminul Haque, Tatsunori Hashimoto

    Abstract: Importance weighting is a classic technique to handle distribution shifts. However, prior work has presented strong empirical and theoretical evidence demonstrating that importance weights can have little to no effect on overparameterized neural networks. Is importance weighting truly incompatible with the training of overparameterized neural networks? Our paper answers this in the negative. We sh…

    Submitted 4 March, 2022; v1 submitted 24 December, 2021; originally announced December 2021.

    Comments: International Conference on Learning Representations (ICLR), 2022

  6. arXiv:2003.06291

    stat.CO cs.DB stat.AP

    Improved assessment of the accuracy of record linkage via an extended MaCSim approach

    Authors: Shovanur Haque, Kerrie Mengersen

    Abstract: Record linkage is the process of bringing together the same entity from overlapping data sources while removing duplicates. Huge amounts of data are now being collected by public or private organizations as well as by researchers and individuals. Linking and analysing relevant information from this massive data reservoir can provide new insights into society. However, this increase in the amount o…

    Submitted 12 October, 2020; v1 submitted 12 March, 2020; originally announced March 2020.

    Comments: 32 pages, 4 figures. arXiv admin note: text overlap with arXiv:1901.04779

  7. arXiv:2003.05686

    stat.CO cs.DB stat.AP

    Assessing the accuracy of individual link with varying block sizes and cut-off values using MaCSim approach

    Authors: Shovanur Haque, Kerrie Mengersen

    Abstract: Record linkage is the process of matching together records from different data sources that belong to the same entity. Record linkage is increasingly being used by many organizations, including statistical, health, and government bodies, to link administrative, survey, and other files to create a robust file for more comprehensive analysis. Therefore, it becomes necessary to assess the ability of a linkin…

    Submitted 23 November, 2020; v1 submitted 12 March, 2020; originally announced March 2020.

    Comments: 24 pages, 6 figures. arXiv admin note: text overlap with arXiv:1901.04779

  8. arXiv:1911.00937

    cs.LG stat.ML

    Preventing Gradient Attenuation in Lipschitz Constrained Convolutional Networks

    Authors: Qiyang Li, Saminul Haque, Cem Anil, James Lucas, Roger Grosse, Jörn-Henrik Jacobsen

    Abstract: Lipschitz constraints under L2 norm on deep neural networks are useful for provable adversarial robustness bounds, stable training, and Wasserstein distance estimation. While heuristic approaches such as the gradient penalty have seen much practical success, it is challenging to achieve similar practical performance while provably enforcing a Lipschitz constraint. In principle, one can design Lips…

    Submitted 9 November, 2019; v1 submitted 3 November, 2019; originally announced November 2019.

    Comments: 9 main pages, 31 pages total, 3 figures. Accepted at 33rd Conference on Neural Information Processing Systems (NeurIPS 2019)

  9. arXiv:1901.04779

    stat.CO stat.AP

    Assessing the accuracy of record linkages with Markov chain based Monte Carlo simulation approach

    Authors: Shovanur Haque, Kerrie Mengersen, Steven Stern

    Abstract: Record linkage is the process of finding matches and linking records from different data sources so that the linked records belong to the same entity. There is an increasing number of applications of record linkage in statistical, health, government and business organisations to link administrative, survey, population census and other files to create a complete set of information for more complete…

    Submitted 29 September, 2020; v1 submitted 15 January, 2019; originally announced January 2019.

    Comments: 33 pages, 10 figures, 4 tables

  10. arXiv:1812.03632

    cs.CY stat.ML

    Statement networks: a power structure narrative as depicted by newspapers

    Authors: Shoumik Sharar Chowdhury, Nazmus Saquib, Niamat Zawad, Manash Kumar Mandal, Syed Haque

    Abstract: We report a data mining pipeline and subsequent analysis to understand the core-periphery power structure created in three national newspapers in Bangladesh, as depicted by statements made by people appearing in news. Statements made by one actor about another actor can be considered a form of public conversation. Named entity recognition techniques can be used to create a temporal actor network f…

    Submitted 10 December, 2018; originally announced December 2018.

    Comments: Presented at NeurIPS 2018 Workshop on Machine Learning for the Developing World