Showing 1–16 of 16 results for author: Cai, L

Searching in archive stat.
  1. arXiv:2406.00924  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Faster Diffusion-based Sampling with Randomized Midpoints: Sequential and Parallel

    Authors: Shivam Gupta, Linda Cai, Sitan Chen

    Abstract: In recent years, there has been a surge of interest in proving discretization bounds for diffusion models. These works show that for essentially any data distribution, one can approximately sample in polynomial time given a sufficiently accurate estimate of its score functions at different noise levels. In this work, we propose a new discretization scheme for diffusion models inspired by Shen and…

    Submitted 2 June, 2024; originally announced June 2024.

  2. arXiv:2405.14652  [pdf, ps, other]

    stat.ME

    Statistical inference for high-dimensional convoluted rank regression

    Authors: Leheng Cai, Xu Guo, Heng Lian, Liping Zhu

    Abstract: High-dimensional penalized rank regression is a powerful tool for modeling high-dimensional data due to its robustness and estimation efficiency. However, the non-smoothness of the rank loss brings great challenges to the computation. To solve this critical issue, high-dimensional convoluted rank regression is recently proposed, and penalized convoluted rank regression estimators are introduced. H…

    Submitted 23 May, 2024; originally announced May 2024.

  3. arXiv:2404.05991  [pdf, other]

    cs.DS stat.ML

    Polynomial-time derivation of optimal k-tree topology from Markov networks

    Authors: Fereshteh R. Dastjerdi, Liming Cai

    Abstract: Characterization of joint probability distribution for large networks of random variables remains a challenging task in data science. Probabilistic graph approximation with simple topologies has practically been resorted to; typically the tree topology makes joint probability computation much simpler and can be effective for statistical inference on insufficient data. However, to characterize netw…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 20 pages including references, 1 figure

  4. arXiv:2401.17646  [pdf, other]

    stat.ME

    From Sparse to Dense Functional Data: Phase Transitions from a Simultaneous Inference Perspective

    Authors: Leheng Cai, Qirui Hu

    Abstract: We aim to develop simultaneous inference tools for the mean function of functional data from sparse to dense. First, we derive a unified Gaussian approximation to construct simultaneous confidence bands of mean functions based on the B-spline estimator. Then, we investigate the conditions of phase transitions by decomposing the asymptotic variance of the approximated Gaussian process. As an extens…

    Submitted 31 January, 2024; originally announced January 2024.

  5. arXiv:2212.12874  [pdf, ps, other]

    stat.ME

    Test and Measure for Partial Mean Dependence Based on Machine Learning Methods

    Authors: Leheng Cai, Xu Guo, Wei Zhong

    Abstract: It is of importance to investigate the significance of a subset of covariates $W$ for the response $Y$ given covariates $Z$ in regression modeling. To this end, we propose a significance test for the partial mean independence problem based on machine learning methods and data splitting. The test statistic converges to the standard chi-squared distribution under the null hypothesis while it converg…

    Submitted 5 June, 2024; v1 submitted 25 December, 2022; originally announced December 2022.

  6. arXiv:2007.09334  [pdf, other]

    cs.LG q-bio.MN stat.ML

    Deep Learning of High-Order Interactions for Protein Interface Prediction

    Authors: Yi Liu, Hao Yuan, Lei Cai, Shuiwang Ji

    Abstract: Protein interactions are important in a broad range of biological processes. Traditionally, computational methods have been developed to automatically predict protein interface from hand-crafted features. Recent approaches employ deep neural networks and predict the interaction of each amino acid pair independently. However, these methods do not incorporate the important sequential information fro…

    Submitted 18 July, 2020; originally announced July 2020.

    Comments: 10 pages, 3 figures, 4 tables. KDD2020

  7. arXiv:2005.07427  [pdf, other]

    cs.LG cs.SI stat.ML

    Structural Temporal Graph Neural Networks for Anomaly Detection in Dynamic Graphs

    Authors: Lei Cai, Zhengzhang Chen, Chen Luo, Jiaping Gui, Jingchao Ni, Ding Li, Haifeng Chen

    Abstract: Detecting anomalies in dynamic graphs is a vital task, with numerous practical applications in areas such as security, finance, and social media. Previous network embedding based methods have been mostly focusing on learning good node representations, whereas largely ignoring the subgraph structural changes related to the target nodes in dynamic graphs. In this paper, we propose StrGNN, an end-to-…

    Submitted 25 May, 2020; v1 submitted 15 May, 2020; originally announced May 2020.

  8. arXiv:2005.03718  [pdf, other]

    cs.LG eess.SP math.OC stat.ML

    A Gradient-Aware Search Algorithm for Constrained Markov Decision Processes

    Authors: Sami Khairy, Prasanna Balaprakash, Lin X. Cai

    Abstract: The canonical solution methodology for finite constrained Markov decision processes (CMDPs), where the objective is to maximize the expected infinite-horizon discounted rewards subject to the expected infinite-horizon discounted costs constraints, is based on convex linear programming. In this brief, we first prove that the optimization objective in the dual linear program of a finite CMDP is a pi…

    Submitted 7 May, 2020; originally announced May 2020.

    Comments: Submitted as a brief paper to the IEEE TNNLS

  9. arXiv:2004.14171  [pdf, other]

    cs.DB cs.AI cs.CL cs.LG stat.ML

    SE-KGE: A Location-Aware Knowledge Graph Embedding Model for Geographic Question Answering and Spatial Semantic Lifting

    Authors: Gengchen Mai, Krzysztof Janowicz, Ling Cai, Rui Zhu, Blake Regalia, Bo Yan, Meilin Shi, Ni Lao

    Abstract: Learning knowledge graph (KG) embeddings is an emerging technique for a variety of downstream tasks such as summarization, link prediction, information retrieval, and question answering. However, most existing KG embedding models neglect space and, therefore, do not perform well when applied to (geo)spatial data and tasks. For those models that consider space, most of them primarily rely on some n…

    Submitted 25 April, 2020; originally announced April 2020.

    Comments: Accepted to Transactions in GIS

    ACM Class: I.2.4; I.1.3; I.2.2

    Journal ref: Transactions in GIS, 2020

  10. arXiv:2003.00824  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells

    Authors: Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, Ni Lao

    Abstract: Unsupervised text encoding models have recently fueled substantial progress in NLP. The key idea is to use neural networks to convert words in texts to vector space representations based on word positions in a sentence and their contexts, which are suitable for end-to-end training of downstream tasks. We see a strikingly similar situation in spatial analysis, which focuses on incorporating both ab…

    Submitted 15 February, 2020; originally announced March 2020.

    Comments: 15 pages; Accepted to ICLR 2020 as a spotlight paper

    ACM Class: I.2.0; I.2.6; I.5.1; J.2

    Journal ref: ICLR 2020, Apr. 26 - 30, 2020, Addis Ababa, ETHIOPIA

  11. arXiv:2003.00117  [pdf, ps, other]

    stat.ME

    Simultaneous confidence bands for nonparametric regression with partially missing covariates

    Authors: Li Cai, Lijie Gu, Qihua Wang, Suojin Wang

    Abstract: In this paper, we consider a weighted local linear estimator based on the inverse selection probability for nonparametric regression with missing covariates at random. The asymptotic distribution of the maximal deviation between the estimator and the true regression function is derived and an asymptotically accurate simultaneous confidence band is constructed. The estimator for the regression func…

    Submitted 28 February, 2020; originally announced March 2020.

  12. arXiv:1910.00702  [pdf, other]

    cs.LG cs.CL stat.ML

    TransGCN:Coupling Transformation Assumptions with Graph Convolutional Networks for Link Prediction

    Authors: Ling Cai, Bo Yan, Gengchen Mai, Krzysztof Janowicz, Rui Zhu

    Abstract: Link prediction is an important and frequently studied task that contributes to an understanding of the structure of knowledge graphs (KGs) in statistical relational learning. Inspired by the success of graph convolutional networks (GCN) in modeling graph data, we propose a unified GCN framework, named TransGCN, to address this task, in which relation and entity embeddings are learned simultaneous…

    Submitted 1 October, 2019; originally announced October 2019.

  13. arXiv:1910.00084  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Contextual Graph Attention for Answering Logical Queries over Incomplete Knowledge Graphs

    Authors: Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, Ni Lao

    Abstract: Recently, several studies have explored methods for using KG embedding to answer logical queries. These approaches either treat embedding learning and query answering as two separated learning tasks, or fail to deal with the variability of contributions from different query paths. We proposed to leverage a graph attention mechanism to handle the unequal contribution of different query paths. Howev…

    Submitted 30 September, 2019; originally announced October 2019.

    Comments: 8 pages, 3 figures, camera ready version of article accepted to K-CAP 2019, Marina del Rey, California, United States

    ACM Class: I.2.4; I.1.3

    Journal ref: K-CAP 2019, Nov. 19 - 21, 2019, Marina del Rey, CA, USA

  14. arXiv:1809.02804  [pdf, other]

    cs.LG cs.AI stat.ML

    Handling Concept Drift via Model Reuse

    Authors: Peng Zhao, Le-Wen Cai, Zhi-Hua Zhou

    Abstract: In many real-world applications, data are often collected in the form of stream, and thus the distribution usually changes in nature, which is referred as concept drift in literature. We propose a novel and effective approach to handle concept drift via model reuse, leveraging previous knowledge by reusing models. Each model is associated with a weight representing its reusability towards current…

    Submitted 8 September, 2018; originally announced September 2018.

    Journal ref: Machine Learning, 2020, 109(3): 533-568

  15. arXiv:1705.08881  [pdf, other]

    cs.CV cs.LG cs.NE stat.ML

    Dense Transformer Networks

    Authors: Jun Li, Yongjun Chen, Lei Cai, Ian Davidson, Shuiwang Ji

    Abstract: The key idea of current deep learning methods for dense prediction is to apply a model on a regular patch centered on each pixel to make pixel-wise predictions. These methods are limited in the sense that the patches are determined by network architecture instead of learned from data. In this work, we propose the dense transformer networks, which can learn the shapes and sizes of patches from data…

    Submitted 7 June, 2017; v1 submitted 24 May, 2017; originally announced May 2017.

  16. arXiv:1503.08445  [pdf, other]

    stat.AP

    A Random Matrix Theoretical Approach to Early Event Detection Using Experimental Data

    Authors: Y. Cao, L. Cai, C. Qiu, J. Gu, X. He, Q. Ai, Z. Jin

    Abstract: In this paper, High-dimensional data analysis methods are proposed to deal with random matrix which is composed by the real data from power network before and after the fault. The mean spectral radius (MSR) of non-Hermitian random matrices is defined as a statistic analytic for the fault detection. By analyzing the characteristics of random matrices and observing the changes of the spectral radius…

    Submitted 29 March, 2015; originally announced March 2015.

    Comments: 4 pages, 6 figures