Showing 1–18 of 18 results for author: Jiang, K

  1. arXiv:2406.04107  [pdf]

    stat.AP

    A Practical Analysis Procedure on Generalizing Comparative Effectiveness in the Randomized Clinical Trial to the Real-world Trial-eligible Population

    Authors: Kuan Jiang, Xin-xing Lai, Shu Yang, Ying Gao, Xiao-Hua Zhou

    Abstract: When evaluating the effectiveness of a drug, a Randomized Controlled Trial (RCT) is often considered the gold standard due to its perfect randomization. While RCT assures strong internal validity, its restricted external validity poses challenges in extending treatment effects to the broader real-world population due to possible heterogeneity in covariates. In this paper, we introduce a procedure…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 21 pages, 3 figures, 3 tables

  2. arXiv:2403.16336  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Predictive Inference in Multi-environment Scenarios

    Authors: John C. Duchi, Suyash Gupta, Kuanhao Jiang, Pragya Sur

    Abstract: We address the challenge of constructing valid confidence intervals and sets in problems of prediction across multiple environments. We investigate two types of coverage suitable for these problems, extending the jackknife and split-conformal methods to show how to obtain distribution-free coverage in such non-traditional, hierarchical data-generating scenarios. Our contributions also include exte…

    Submitted 24 March, 2024; originally announced March 2024.

  3. arXiv:2402.05569  [pdf, other]

    cs.LG cs.AI eess.SP stat.ML

    Simplifying Hypergraph Neural Networks

    Authors: Bohan Tang, Zexi Liu, Keyue Jiang, Siheng Chen, Xiaowen Dong

    Abstract: Hypergraphs are crucial for modeling higher-order interactions in real-world data. Hypergraph neural networks (HNNs) effectively utilise these structures by message passing to generate informative node features for various downstream tasks like node classification. However, the message passing block in existing HNNs typically requires a computationally intensive training process, which limits thei…

    Submitted 22 May, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  4. arXiv:2306.10577  [pdf, other]

    cs.LG stat.ML

    OpenDataVal: a Unified Benchmark for Data Valuation

    Authors: Kevin Fu Jiang, Weixin Liang, James Zou, Yongchan Kwon

    Abstract: Assessing the quality and impact of individual data points is critical for improving model performance and mitigating undesirable biases within the training dataset. Several data valuation algorithms have been proposed to quantify data quality; however, a systematic and standardized benchmarking system for data valuation is lacking. In this paper, we introduce OpenDataVal, an easy-to-use and unifie…

    Submitted 13 October, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

    Comments: 25 pages, NeurIPS 2023 Track on Datasets and Benchmarks

  5. arXiv:2211.02254  [pdf, other]

    cs.LG math.OC stat.ML

    How Does Adaptive Optimization Impact Local Neural Network Geometry?

    Authors: Kaiqi Jiang, Dhruv Malik, Yuanzhi Li

    Abstract: Adaptive optimization methods are well known to achieve superior convergence relative to vanilla gradient methods. The traditional viewpoint in optimization, particularly in convex optimization, explains this improved performance by arguing that, unlike vanilla gradient schemes, adaptive algorithms mimic the behavior of a second-order method by adapting to the global geometry of the loss function.…

    Submitted 4 November, 2022; originally announced November 2022.

  6. arXiv:2207.14439  [pdf, other]

    stat.ME stat.ML

    Treatment Effect Estimation with Unobserved and Heterogeneous Confounding Variables

    Authors: Kevin Jiang, Yang Ning

    Abstract: The estimation of the treatment effect is often biased in the presence of unobserved confounding variables which are commonly referred to as hidden variables. Although a few methods have been recently proposed to handle the effect of hidden variables, these methods often overlook the possibility of any interaction between the observed treatment variable and the unobserved covariates. In this work,…

    Submitted 28 July, 2022; originally announced July 2022.

    Comments: 20 pages, 4 figures

  7. arXiv:2205.10198  [pdf, other]

    math.ST econ.EM stat.ME stat.ML

    A New Central Limit Theorem for the Augmented IPW Estimator: Variance Inflation, Cross-Fit Covariance and Beyond

    Authors: Kuanhao Jiang, Rajarshi Mukherjee, Subhabrata Sen, Pragya Sur

    Abstract: Estimation of the average treatment effect (ATE) is a central problem in causal inference. In recent times, inference for the ATE in the presence of high-dimensional covariates has been extensively studied. Among the diverse approaches that have been proposed, augmented inverse probability weighting (AIPW) with cross-fitting has emerged as a popular choice in practice. In this work, we study this cro…

    Submitted 28 October, 2022; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: 132 pages, 7 figures; In V2, we added extensive comparisons with the classical variance formula (cf. Sec 3, Fig 2, Fig 4) and elaborated on the non-trivial cross-fit covariance phenomenon further

  8. arXiv:2201.05672  [pdf, ps, other]

    econ.EM stat.AP

    Measuring Changes in Disparity Gaps: An Application to Health Insurance

    Authors: Paul Goldsmith-Pinkham, Karen Jiang, Zirui Song, Jacob Wallace

    Abstract: We propose a method for reporting how program evaluations reduce gaps between groups, such as the gender or Black-white gap. We first show that the reduction in disparities between groups can be written as the difference in conditional average treatment effects (CATE) for each group. Then, using a Kitagawa-Oaxaca-Blinder-style decomposition, we highlight how these CATE can be decomposed into unexp…

    Submitted 14 January, 2022; originally announced January 2022.

    Comments: AEA P&P accepted draft

  9. arXiv:2110.09697  [pdf, other]

    stat.ML cs.LG stat.CO

    abess: A Fast Best Subset Selection Library in Python and R

    Authors: Jin Zhu, Xueqin Wang, Liyuan Hu, Junhao Huang, Kangkang Jiang, Yanhang Zhang, Shiyun Lin, Junxian Zhu

    Abstract: We introduce a new library named abess that implements a unified framework of best-subset selection for solving diverse machine learning problems, e.g., linear regression, classification, and principal component analysis. In particular, abess certifiably obtains the optimal solution in polynomial time with high probability under the linear model. Our efficient implementation allows abess to a…

    Submitted 16 June, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

    Journal ref: Journal of Machine Learning Research (2022)

  10. arXiv:2001.00127  [pdf, other]

    cs.LG cs.AI stat.ML

    Reinforcement Learning with Goal-Distance Gradient

    Authors: Kai Jiang, XiaoLong Qin

    Abstract: Reinforcement learning usually uses feedback rewards from the environment to train agents. But rewards in real environments are sparse, and some environments provide no rewards at all. Most current methods struggle to perform well in sparse-reward or no-reward environments. Although using shaped rewards is effective when solving sparse reward tasks, it is limited to speci…

    Submitted 10 January, 2020; v1 submitted 31 December, 2019; originally announced January 2020.

  11. arXiv:1905.01991  [pdf, other]

    cs.IR cs.LG stat.ML

    A Content-Based Approach to Email Triage Action Prediction: Exploration and Evaluation

    Authors: Sudipto Mukherjee, Ke Jiang

    Abstract: Email has remained a principal form of communication among people, both in enterprise and social settings. With a deluge of emails crowding our mailboxes daily, there is a dire need of smart email systems that can recover important emails and make personalized recommendations. In this work, we study the problem of predicting user triage actions to incoming emails where we take the reply prediction…

    Submitted 29 April, 2019; originally announced May 2019.

    Comments: User representations, Personalization, Email response prediction, Similarity features

  12. arXiv:1812.01101  [pdf, other]

    physics.geo-ph cs.LG stat.ML

    Automatic Seismic Salt Interpretation with Deep Convolutional Neural Networks

    Authors: Yu Zeng, Kebei Jiang, Jie Chen

    Abstract: One of the most crucial tasks in seismic reflection imaging is to identify the salt bodies with high precision. Traditionally, this is accomplished by visually picking the salt/sediment boundaries, which requires a great amount of manual work and may introduce systematic bias. With recent progress in deep learning algorithms and growing computational power, a great deal of effort has been made to…

    Submitted 24 November, 2018; originally announced December 2018.

    Comments: 11 pages, 7 figures

    Journal ref: ICISDM 2019 - The 3rd International Conference on Information System and Data Mining

  13. arXiv:1810.12153  [pdf, other]

    cs.LG stat.ML

    Deep learning long-range information in undirected graphs with wave networks

    Authors: Matthew K. Matlock, Arghya Datta, Na Le Dang, Kevin Jiang, S. Joshua Swamidass

    Abstract: Graph algorithms are key tools in many fields of science and technology. Some of these algorithms depend on propagating information between distant nodes in a graph. Recently, there have been a number of deep learning architectures proposed to learn on undirected graphs. However, most of these architectures aggregate information in the local neighborhood of a node, and therefore they may not be ca…

    Submitted 29 October, 2018; originally announced October 2018.

  14. arXiv:1808.09940  [pdf, other]

    q-fin.PM cs.LG stat.ML

    Adversarial Deep Reinforcement Learning in Portfolio Management

    Authors: Zhipeng Liang, Hao Chen, Junhao Zhu, Kangkang Jiang, Yanran Li

    Abstract: In this paper, we implement three state-of-the-art continuous reinforcement learning algorithms, Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO) and Policy Gradient (PG), in portfolio management. All of them are widely used in game playing and robot control. Moreover, PPO has appealing theoretical properties, which make it promising for portfolio management. We present…

    Submitted 17 November, 2018; v1 submitted 29 August, 2018; originally announced August 2018.

  15. arXiv:1604.02027  [pdf, other]

    cs.LG cs.CL stat.ML

    Combinatorial Topic Models using Small-Variance Asymptotics

    Authors: Ke Jiang, Suvrit Sra, Brian Kulis

    Abstract: Topic models have emerged as fundamental tools in unsupervised machine learning. Most modern topic modeling algorithms take a probabilistic view and derive inference algorithms based on Latent Dirichlet Allocation (LDA) or its variants. In contrast, we study topic modeling as a combinatorial optimization problem, and propose a new objective function derived from LDA by passing to the small-varianc…

    Submitted 26 May, 2016; v1 submitted 7 April, 2016; originally announced April 2016.

    Comments: 19 pages

  16. arXiv:1411.4199  [pdf, ps, other]

    cs.CV cs.LG stat.ML

    Revisiting Kernelized Locality-Sensitive Hashing for Improved Large-Scale Image Retrieval

    Authors: Ke Jiang, Qichao Que, Brian Kulis

    Abstract: We present a simple but powerful reinterpretation of kernelized locality-sensitive hashing (KLSH), a general and popular method developed in the vision community for performing approximate nearest-neighbor searches in an arbitrary reproducing kernel Hilbert space (RKHS). Our new perspective is based on viewing the steps of the KLSH algorithm in an appropriately projected space, and has several key…

    Submitted 15 November, 2014; originally announced November 2014.

    Comments: 15 pages

  17. Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis

    Authors: Y. X. Rachel Wang, Keni Jiang, Lewis J. Feldman, Peter J. Bickel, Haiyan Huang

    Abstract: Networks pervade many disciplines of science for analyzing complex systems with interacting components. In particular, this concept is commonly used to model interactions between genes and identify closely associated genes forming functional modules. In this paper, we focus on gene group interactions and infer these interactions using appropriate partial correlations between genes, that is, the co…

    Submitted 1 June, 2015; v1 submitted 25 January, 2014; originally announced January 2014.

    Comments: Published at http://dx.doi.org/10.1214/14-AOAS792 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS792

    Journal ref: Annals of Applied Statistics 2015, Vol. 9, No. 1, 300-323

  18. arXiv:1401.2054  [pdf, other]

    stat.ME

    Bayesian meta-analysis of correlation coefficients through power prior

    Authors: Zhiyong Zhang, Kaifeng Jiang, Haiyan Liu, In-Sue Oh

    Abstract: To answer the call of introducing more Bayesian techniques to organizational research (e.g., Kruschke, Aguinis, & Joo, 2012; Zyphur & Oswald, 2013), we propose a Bayesian approach for meta-analysis with power prior in this article. The primary purpose of this method is to allow meta-analytic researchers to control the contribution of each individual study to an estimated overall effect size though…

    Submitted 29 July, 2014; v1 submitted 9 January, 2014; originally announced January 2014.