Showing 1–50 of 158 results for author: Gao, A

  1. arXiv:2408.14853  [pdf, other]

    cs.CL cs.AI cs.CR

    Detecting AI Flaws: Target-Driven Attacks on Internal Faults in Language Models

    Authors: Yuhao Du, Zhuo Li, Pengyu Cheng, Xiang Wan, Anningzhe Gao

    Abstract: Large Language Models (LLMs) have become a focal point in the rapidly evolving field of artificial intelligence. However, a critical concern is the presence of toxic content within the pre-training corpus of these models, which can lead to the generation of inappropriate outputs. Investigating methods for detecting internal faults in LLMs can help us understand their limitations and improve their…

    Submitted 27 August, 2024; originally announced August 2024.

  2. Risks and NLP Design: A Case Study on Procedural Document QA

    Authors: Nikita Haduong, Alice Gao, Noah A. Smith

    Abstract: As NLP systems are increasingly deployed at scale, concerns about their potential negative impacts have attracted the attention of the research community, yet discussions of risk have mostly been at an abstract level and focused on generic AI or NLP applications. We argue that clearer assessments of risks and harms to users--and concrete strategies to mitigate them--will be possible when we specia…

    Submitted 16 August, 2024; originally announced August 2024.

    Journal ref: Findings of the Association for Computational Linguistics ACL (2023) 1248-1269

  3. arXiv:2407.13301  [pdf, other]

    cs.CL cs.AI cs.LG

    CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis

    Authors: Junying Chen, Chi Gui, Anningzhe Gao, Ke Ji, Xidong Wang, Xiang Wan, Benyou Wang

    Abstract: The field of medical diagnosis has undergone a significant transformation with the advent of large language models (LLMs), yet the challenges of interpretability within these models remain largely unaddressed. This study introduces Chain-of-Diagnosis (CoD) to enhance the interpretability of LLM-based medical diagnostics. CoD transforms the diagnostic process into a diagnostic chain that mirrors a…

    Submitted 18 July, 2024; originally announced July 2024.

  4. arXiv:2407.10666  [pdf, other]

    stat.ML cs.LG physics.chem-ph

    Flow Perturbation to Accelerate Unbiased Sampling of Boltzmann distribution

    Authors: Xin Peng, Ang Gao

    Abstract: Flow-based generative models have been employed for sampling the Boltzmann distribution, but their application to high-dimensional systems is hindered by the significant computational cost of obtaining the Jacobian of the flow. To overcome this challenge, we introduce the flow perturbation method, which incorporates optimized stochastic perturbations into the flow. By reweighting trajectories gene…

    Submitted 27 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

  5. arXiv:2407.05302  [pdf, other]

    cs.LG stat.ML

    Mamba Hawkes Process

    Authors: Anningzhe Gao, Shan Dai, Yan Hu

    Abstract: Irregular and asynchronous event sequences are prevalent in many domains, such as social media, finance, and healthcare. Traditional temporal point processes (TPPs), like Hawkes processes, often struggle to model mutual inhibition and nonlinearity effectively. While recent neural network models, including RNNs and Transformers, address some of these issues, they still face challenges with long-ter…

    Submitted 7 July, 2024; originally announced July 2024.

  6. arXiv:2406.19280  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale

    Authors: Junying Chen, Ruyi Ouyang, Anningzhe Gao, Shunian Chen, Guiming Hardy Chen, Xidong Wang, Ruifei Zhang, Zhenyang Cai, Ke Ji, Guangjun Yu, Xiang Wan, Benyou Wang

    Abstract: The rapid development of multimodal large language models (MLLMs), such as GPT-4V, has led to significant advancements. However, these models still face challenges in medical multimodal capabilities due to limitations in the quantity and quality of medical vision-text data, stemming from data privacy concerns and high annotation costs. While pioneering approaches utilize PubMed's large-scale, de-i…

    Submitted 27 June, 2024; originally announced June 2024.

  7. arXiv:2406.18034  [pdf, other]

    cs.CL

    LLMs for Doctors: Leveraging Medical LLMs to Assist Doctors, Not Replace Them

    Authors: Wenya Xie, Qingying Xiao, Yu Zheng, Xidong Wang, Junying Chen, Ke Ji, Anningzhe Gao, Xiang Wan, Feng Jiang, Benyou Wang

    Abstract: The recent success of Large Language Models (LLMs) has had a significant impact on the healthcare field, providing patients with medical advice, diagnostic information, and more. However, due to a lack of professional medical knowledge, patients are easily misled by erroneous information generated by LLMs, which may result in serious medical problems. To address this issue, we focus on tuning th…

    Submitted 25 June, 2024; originally announced June 2024.

  8. arXiv:2406.18027  [pdf, other]

    cs.CL cs.AI

    Automated Clinical Data Extraction with Knowledge Conditioned LLMs

    Authors: Diya Li, Asim Kadav, Aijing Gao, Rui Li, Richard Bourgon

    Abstract: The extraction of lung lesion information from clinical and medical imaging reports is crucial for research on and clinical care of lung-related diseases. Large language models (LLMs) can be effective at interpreting unstructured text in reports, but they often hallucinate due to a lack of domain-specific knowledge, leading to reduced accuracy and posing challenges for use in clinical settings. To…

    Submitted 25 June, 2024; originally announced June 2024.

  9. arXiv:2406.16771  [pdf, other]

    cond-mat.str-el

    An antiferromagnetic diode effect in even-layered MnBi2Te4

    Authors: Anyuan Gao, Shao-Wen Chen, Barun Ghosh, Jian-Xiang Qiu, Yu-Fei Liu, Yugo Onishi, Chaowei Hu, Tiema Qian, Damien Bérubé, Thao Dinh, Houchen Li, Christian Tzschaschel, Seunghyun Park, Tianye Huang, Shang-Wei Lien, Zhe Sun, Sheng-Chin Ho, Bahadur Singh, Kenji Watanabe, Takashi Taniguchi, David C. Bell, Arun Bansil, Hsin Lin, Tay-Rong Chang, Amir Yacoby, et al. (4 additional authors not shown)

    Abstract: In a PN junction, the separation between positive and negative charges leads to diode transport. In the past few years, the intrinsic diode transport in noncentrosymmetric polar conductors has attracted great interest, because it suggests novel nonlinear applications and provides a symmetry-sensitive probe of Fermi surface. Recently, such studies have been extended to noncentrosymmetric supercondu…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 33+8 pages, 14+2 figures

  10. arXiv:2406.09648  [pdf, other]

    cs.LG cs.CV

    An Intrinsic Vector Heat Network

    Authors: Alexander Gao, Maurice Chu, Mubbasir Kapadia, Ming C. Lin, Hsueh-Ti Derek Liu

    Abstract: Vector fields are widely used to represent and model flows for many science and engineering applications. This paper introduces a novel neural network architecture for learning tangent vector fields that are intrinsically defined on manifold surfaces embedded in 3D. Previous approaches to learning vector fields on surfaces treat vectors as multi-dimensional scalar fields, using traditional scalar-…

    Submitted 18 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  11. arXiv:2406.00606  [pdf, other]

    cs.CL

    LLMs Could Autonomously Learn Without External Supervision

    Authors: Ke Ji, Junying Chen, Anningzhe Gao, Wenya Xie, Xiang Wan, Benyou Wang

    Abstract: In the quest for super-human performance, Large Language Models (LLMs) have traditionally been tethered to human-annotated datasets and predefined training objectives-a process that is both labor-intensive and inherently limited. This paper presents a transformative approach: Autonomous Learning for LLMs, a self-sufficient learning paradigm that frees models from the constraints of human supervisi…

    Submitted 6 June, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

    Comments: 20 pages, 8 figures

  12. arXiv:2406.00073  [pdf, other]

    cs.LG cs.CR

    A Novel Review of Stability Techniques for Improved Privacy-Preserving Machine Learning

    Authors: Coleman DuPlessie, Aidan Gao

    Abstract: Machine learning models have recently enjoyed a significant increase in size and popularity. However, this growth has created concerns about dataset privacy. To counteract data leakage, various privacy frameworks guarantee that the output of machine learning models does not compromise their training data. However, this privatization comes at a cost by adding random noise to the training process, w…

    Submitted 30 May, 2024; originally announced June 2024.

  13. arXiv:2405.19799  [pdf, other]

    cs.CL

    Unsupervised Mutual Learning of Dialogue Discourse Parsing and Topic Segmentation

    Authors: Jiahui Xu, Feng Jiang, Anningzhe Gao, Haizhou Li

    Abstract: The advancement of large language models (LLMs) has propelled the development of dialogue systems. Unlike the popular ChatGPT-like assistant model, which only satisfies the user's preferences, task-oriented dialogue systems have also faced new requirements and challenges in the broader business field. They are expected to provide correct responses at each dialogue turn and, at the same time, achieve t…

    Submitted 3 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  14. arXiv:2405.14559  [pdf, other]

    eess.IV

    HemSeg-200: A Voxel-Annotated Dataset for Intracerebral Hemorrhages Segmentation in Brain CT Scans

    Authors: Changwei Song, Qing Zhao, Jianqiang Li, Xin Yue, Ruoyun Gao, Zhaoxuan Wang, An Gao, Guanghui Fu

    Abstract: Acute intracerebral hemorrhage is a life-threatening condition that demands immediate medical intervention. Intraparenchymal hemorrhage (IPH) and intraventricular hemorrhage (IVH) are critical subtypes of this condition. Clinically, when such hemorrhages are suspected, immediate CT scanning is essential to assess the extent of the bleeding and to facilitate the formulation of a targeted treatment…

    Submitted 23 May, 2024; originally announced May 2024.

  15. arXiv:2405.13144  [pdf, other]

    cs.AI cs.CL

    Mamo: a Mathematical Modeling Benchmark with Solvers

    Authors: Xuhan Huang, Qingning Shen, Yan Hu, Anningzhe Gao, Benyou Wang

    Abstract: Mathematical modeling involves representing real-world phenomena, systems, or problems using mathematical expressions and equations to analyze, understand, and predict their behavior. Given that this process typically requires experienced experts, there is an interest in exploring whether Large Language Models (LLMs) can undertake mathematical modeling to potentially decrease human labor. To evalu…

    Submitted 30 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

    Comments: Project: https://github.com/FreedomIntelligence/Mamo Updates: 1. include more models 2. minor modification of the metric with new results 3. fix some typos 4. add error analysis with examples

  16. arXiv:2405.06985  [pdf, other]

    cs.LG

    RoTHP: Rotary Position Embedding-based Transformer Hawkes Process

    Authors: Anningzhe Gao, Shan Dai

    Abstract: Temporal Point Processes (TPPs), especially Hawkes processes, are commonly used for modeling asynchronous event sequence data such as financial transactions and user behaviors in social networks. Due to the strong fitting ability of neural networks, various neural Temporal Point Processes are proposed, among which the Neural Hawkes Processes based on self-attention such as Transformer Hawkes Process…

    Submitted 11 May, 2024; originally announced May 2024.

  17. arXiv:2404.17104  [pdf, other]

    cs.HC cs.CV

    Don't Look at the Camera: Achieving Perceived Eye Contact

    Authors: Alice Gao, Samyukta Jayakumar, Marcello Maniglia, Brian Curless, Ira Kemelmacher-Shlizerman, Aaron R. Seitz, Steven M. Seitz

    Abstract: We consider the question of how to best achieve the perception of eye contact when a person is captured by camera and then rendered on a 2D display. For single subjects photographed by a camera, conventional wisdom tells us that looking directly into the camera achieves eye contact. Through empirical user studies, we show that it is instead preferable to look just below the camera lens. We q…

    Submitted 25 April, 2024; originally announced April 2024.

  18. arXiv:2404.12036  [pdf, other]

    physics.comp-ph cond-mat.soft

    Exploring the Premelting Transition through Molecular Simulations Powered by Neural Network Potentials

    Authors: Limin Zeng, Ang Gao

    Abstract: The system has addressed the error of "Bad character(s) in field Abstract" for no reason. Please refer to manuscript for the full abstract.

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 10 pages, 6 figures

  19. arXiv:2404.05236  [pdf, other]

    cs.CV cs.GR

    Stylizing Sparse-View 3D Scenes with Hierarchical Neural Representation

    Authors: Y. Wang, A. Gao, Y. Gong, Y. Zeng

    Abstract: Recently, a surge of 3D style transfer methods has been proposed that leverage the scene reconstruction power of a pre-trained neural radiance field (NeRF). To successfully stylize a scene this way, one must first reconstruct a photo-realistic radiance field from collected images of the scene. However, when only sparse input views are available, pre-trained few-shot NeRFs often suffer from high-fr…

    Submitted 8 April, 2024; originally announced April 2024.

  20. arXiv:2404.02986  [pdf, other]

    cs.LG stat.ML

    Universal Functional Regression with Neural Operator Flows

    Authors: Yaozhong Shi, Angela F. Gao, Zachary E. Ross, Kamyar Azizzadenesheli

    Abstract: Regression on function spaces is typically limited to models with Gaussian process priors. We introduce the notion of universal functional regression, in which we aim to learn a prior distribution over non-Gaussian function spaces that remains mathematically tractable for functional regression. To do this, we develop Neural Operator Flows (OpFlow), an infinite-dimensional extension of normalizing…

    Submitted 3 April, 2024; originally announced April 2024.

  21. arXiv:2403.15912  [pdf, other]

    cond-mat.mes-hall cond-mat.mtrl-sci cond-mat.str-el

    Observation of the dual quantum spin Hall insulator by density-tuned correlations in a van der Waals monolayer

    Authors: Jian Tang, Thomas Siyuan Ding, Hongyu Chen, Anyuan Gao, Tiema Qian, Zumeng Huang, Zhe Sun, Xin Han, Alex Strasser, Jiangxu Li, Michael Geiwitz, Mohamed Shehabeldin, Vsevolod Belosevich, Zihan Wang, Yiping Wang, Kenji Watanabe, Takashi Taniguchi, David C. Bell, Ziqiang Wang, Liang Fu, Yang Zhang, Xiaofeng Qian, Kenneth S. Burch, Youguo Shi, Ni Ni, et al. (3 additional authors not shown)

    Abstract: The convergence of topology and correlations represents a highly coveted realm in the pursuit of novel quantum states of matter. Introducing electron correlations to a quantum spin Hall (QSH) insulator can lead to the emergence of a fractional topological insulator and other exotic time-reversal-symmetric topological order, not possible in quantum Hall and Chern insulator systems. However, the QSH…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: 23 pages, 15 figures, submitted version

  22. arXiv:2403.03640  [pdf, other]

    cs.CL cs.AI

    Apollo: A Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People

    Authors: Xidong Wang, Nuo Chen, Junyin Chen, Yidong Wang, Guorui Zhen, Yan Hu, Xiangbo Wu, Anningzhe Gao, Xiang Wan, Haizhou Li, Benyou Wang

    Abstract: Despite the vast repository of global medical knowledge predominantly being in English, local languages are crucial for delivering tailored healthcare services, particularly in areas with limited medical resources. To extend the reach of medical AI advancements to a broader population, we aim to develop medical LLMs across the six most widely spoken languages, encompassing a global population of 6…

    Submitted 16 August, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

    Comments: Preprint

  23. arXiv:2401.10956  [pdf, other]

    cs.HC cs.AI cs.IR

    AI Revolution on Chat Bot: Evidence from a Randomized Controlled Experiment

    Authors: Sida Peng, Wojciech Swiatek, Allen Gao, Paul Cullivan, Haoge Chang

    Abstract: In recent years, generative AI has undergone major advancements, demonstrating significant promise in augmenting human productivity. Notably, large language models (LLM), with ChatGPT-4 as an example, have drawn considerable attention. Numerous articles have examined the impact of LLM-based tools on human productivity in lab settings and designed tasks or in observational studies. Despite recent a…

    Submitted 19 January, 2024; originally announced January 2024.

  24. arXiv:2401.00931  [pdf, other]

    hep-ph hep-th nucl-th

    A Collinear Perspective on the Regge Limit

    Authors: Anjie Gao, Ian Moult, Sanjay Raman, Gregory Ridgway, Iain W. Stewart

    Abstract: The high energy (Regge) limit provides a playground for understanding all loop structures of scattering amplitudes, and plays an important role in the description of many phenomenologically relevant cross-sections. While well understood in the planar limit, the structure of non-planar corrections introduces many fascinating complexities, for which a general organizing principle is still lacking. W…

    Submitted 29 August, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

    Comments: 48 pages + references, 10 figures; v2: JHEP version; v3: fix references

    Report number: MIT-CTP 5628

    Journal ref: JHEP 05 (2024) 328

  25. arXiv:2312.16408  [pdf, other]

    hep-ph nucl-th

    The Transverse Energy-Energy Correlator at Next-to-Next-to-Next-to-Leading Logarithm

    Authors: Anjie Gao, Hai Tao Li, Ian Moult, Hua Xing Zhu

    Abstract: We present an operator based factorization formula for the transverse energy-energy correlator in the back-to-back (dijet) region, and uncover its remarkable perturbative simplicity and relation to transverse momentum dynamics. This simplicity enables us to achieve next-to-next-to-next-to leading logarithmic (N$^3$LL) accuracy for a hadron collider dijet event shape for the first time. Our factori…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: 54 pages, 12 figures

    Report number: MIT-CTP 5662

  26. arXiv:2311.13951  [pdf, other]

    cs.CL

    MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria

    Authors: Wentao Ge, Shunian Chen, Guiming Hardy Chen, Zhihong Chen, Junying Chen, Shuo Yan, Chenghao Zhu, Ziyue Lin, Wenya Xie, Xinyi Zhang, Yichen Chai, Xiaoyu Liu, Dingjie Song, Xidong Wang, Anningzhe Gao, Zhiyi Zhang, Jianquan Li, Xiang Wan, Benyou Wang

    Abstract: Multimodal large language models (MLLMs) (e.g., GPT-4V, LLaVA, and Claude-3) have broadened the scope of AI applications. Yet, evaluating their performance presents a significant challenge owing to the inherently subjective nature of tasks that do not yield clear-cut solutions, especially for open-ended queries. Existing automatic evaluation methodologies are mainly limited in evaluating obje…

    Submitted 27 April, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

    Comments: 23 pages

  27. arXiv:2311.09774  [pdf, other]

    cs.CL cs.AI cs.LG

    HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs

    Authors: Junying Chen, Xidong Wang, Anningzhe Gao, Feng Jiang, Shunian Chen, Hongbo Zhang, Dingjie Song, Wenya Xie, Chuyi Kong, Jianquan Li, Xiang Wan, Haizhou Li, Benyou Wang

    Abstract: Adapting a language model into a specific domain, a.k.a. 'domain adaption', is a common practice when specialized knowledge, e.g. medicine, is not encapsulated in a general language model like Llama2. The challenge lies in the heterogeneity of data across the two training stages, as it varies in languages, genres, or formats. To tackle this and simplify the learning protocol, we propose to transfor…

    Submitted 16 November, 2023; originally announced November 2023.

  28. arXiv:2311.09724  [pdf, other]

    cs.AI cs.CL

    OVM, Outcome-supervised Value Models for Planning in Mathematical Reasoning

    Authors: Fei Yu, Anningzhe Gao, Benyou Wang

    Abstract: Large language models (LLMs) often struggle with maintaining accuracy throughout multiple reasoning steps, especially in mathematical reasoning where an error in earlier steps can propagate to subsequent ones, ultimately leading to an incorrect answer. To reduce error propagation, guided decoding is employed to direct the LM decoding on a step-by-step basis. We argue that in guided…

    Submitted 1 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted to NAACL findings. https://github.com/FreedomIntelligence/OVM

  29. arXiv:2311.06929  [pdf, ps, other]

    math.CO

    The combinatorics behind the leading Kazhdan-Lusztig coefficients of braid matroids

    Authors: Alice L. L. Gao, Nicholas Proudfoot, Arthur L. B. Yang, Zhong-Xue Zhang

    Abstract: Ferroni and Larson gave a combinatorial interpretation of the braid Kazhdan-Lusztig polynomials in terms of series-parallel matroids. As a consequence, they confirmed an explicit formula for the leading Kazhdan-Lusztig coefficients of braid matroids with odd rank, as conjectured by Elias, Proudfoot, and Wakefield. Based on Ferroni and Larson's work, we further explore the combinatorics behind the…

    Submitted 12 November, 2023; originally announced November 2023.

    MSC Class: 05B35; 05A15; 05A19

  30. arXiv:2309.12119  [pdf, other]

    stat.ME

    Pseudo-Bayesian unit level modeling for small area estimation under informative sampling

    Authors: Peter A. Gao, Jon Wakefield

    Abstract: When mapping subnational health and demographic indicators, direct weighted estimators of small area means based on household survey data can be unreliable when data are limited. If survey microdata are available, unit level models can relate individual survey responses to unit level auxiliary covariates and explicitly account for spatial dependence and between area variation using random effects.…

    Submitted 21 September, 2023; originally announced September 2023.

  31. arXiv:2309.04581  [pdf, other]

    cs.GR cs.CV cs.LG

    Dynamic Mesh-Aware Radiance Fields

    Authors: Yi-Ling Qiao, Alexander Gao, Yiran Xu, Yue Feng, Jia-Bin Huang, Ming C. Lin

    Abstract: Embedding polygonal mesh assets within photorealistic Neural Radiance Fields (NeRF) volumes, such that they can be rendered and their dynamics simulated in a physically consistent manner with the NeRF, is under-explored from the system perspective of integrating NeRF into the traditional graphics pipeline. This paper designs a two-way coupling between mesh and NeRF during rendering and simulation.…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: ICCV 2023

  32. arXiv:2307.15603  [pdf, other]

    cond-mat.mes-hall cond-mat.mtrl-sci physics.optics

    Nonlinear optical diode effect in a magnetic Weyl semimetal

    Authors: Christian Tzschaschel, Jian-Xiang Qiu, Xue-Jian Gao, Hou-Chen Li, Chunyu Guo, Hung-Yu Yang, Cheng-Ping Zhang, Ying-Ming Xie, Yu-Fei Liu, Anyuan Gao, Damien Bérubé, Thao Dinh, Sheng-Chin Ho, Yuqiang Fang, Fuqiang Huang, Johanna Nordlander, Qiong Ma, Fazel Tafti, Philip J. W. Moll, Kam Tuen Law, Su-Yang Xu

    Abstract: Diode effects are of great interest for both fundamental physics and modern technologies. Electrical diode effects (nonreciprocal transport) have been observed in Weyl systems. Optical diode effects arising from the Weyl fermions have been theoretically considered but not probed experimentally. Here, we report the observation of a nonlinear optical diode effect (NODE) in the magnetic Weyl semimeta…

    Submitted 8 April, 2024; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: 20 pages, 4 figures, SI included

    Journal ref: Nat. Commun 15, 3017 (2024)

  33. arXiv:2307.10539  [pdf, ps, other]

    math.CO math.RT

    Induced log-concavity of equivariant matroid invariants

    Authors: Alice L. L. Gao, Ethan Y. H. Li, Matthew H. Y. Xie, Arthur L. B. Yang, Zhong-Xue Zhang

    Abstract: Inspired by the notion of equivariant log-concavity, we introduce the concept of induced log-concavity for a sequence of representations of a finite group. For an equivariant matroid equipped with a symmetric group action or a finite general linear group action, we transform the problem of proving the induced log-concavity of matroid invariants to that of proving the Schur positivity of symmetric…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: 36 pages

    MSC Class: 05B35; 05E05; 20C30

  34. arXiv:2307.09793  [pdf]

    cs.DL cs.CL

    On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models

    Authors: Sarah Gao, Andrew Kean Gao

    Abstract: Since late 2022, Large Language Models (LLMs) have become very prominent with LLMs like ChatGPT and Bard receiving millions of users. Hundreds of new LLMs are announced each week, many of which are deposited to Hugging Face, a repository of machine learning models and datasets. To date, nearly 16,000 Text Generation models have been uploaded to the site. Given the huge influx of LLMs, it is of int…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: 14 pages, 6 figures, 1 table

    ACM Class: I.2.1; H.5.0

  35. arXiv:2307.05537  [pdf]

    cs.LG q-bio.BM

    NLP Meets RNA: Unsupervised Embedding Learning for Ribozymes with Word2Vec

    Authors: Andrew Kean Gao

    Abstract: Ribozymes, RNA molecules with distinct 3D structures and catalytic activity, have widespread applications in synthetic biology and therapeutics. However, relatively little research has focused on leveraging deep learning to enhance our understanding of ribozymes. This study implements Word2Vec, an unsupervised learning technique for natural language processing, to learn ribozyme embeddings. Ribo2V…

    Submitted 8 July, 2023; originally announced July 2023.

    ACM Class: I.2.7

  36. arXiv:2307.00213  [pdf]

    cs.CV cs.LG

    More for Less: Compact Convolutional Transformers Enable Robust Medical Image Classification with Limited Data

    Authors: Andrew Kean Gao

    Abstract: Transformers are very powerful tools for a variety of tasks across domains, from text generation to image captioning. However, transformers require substantial amounts of training data, which is often a challenge in biomedical settings, where high quality labeled data can be challenging or expensive to obtain. This study investigates the efficacy of Compact Convolutional Transformers (CCT) for rob…

    Submitted 30 June, 2023; originally announced July 2023.

    Comments: 9 pages, 4 figures, 2 tables

    MSC Class: I.4.9; I.2.10

  37. arXiv:2306.14461  [pdf, other]

    cs.RO

    Polynomial-based Online Planning for Autonomous Drone Racing in Dynamic Environments

    Authors: Qianhao Wang, Dong Wang, Chao Xu, Alan Gao, Fei Gao

    Abstract: In recent years, there has been noteworthy advancement in autonomous drone racing. However, the primary focus is on attaining execution times, while scant attention is given to the challenges of dynamic environments. The high-speed nature of racing scenarios, coupled with the potential for unforeseeable environmental alterations, presents stringent requirements for online replanning and its timeliness.…

    Submitted 26 June, 2023; originally announced June 2023.

  38. arXiv:2306.12689  [pdf]

    cs.CL cs.AI cs.IR cs.LG

    Vec2Vec: A Compact Neural Network Approach for Transforming Text Embeddings with High Fidelity

    Authors: Andrew Kean Gao

    Abstract: Vector embeddings have become ubiquitous tools for many language-related tasks. A leading embedding model is OpenAI's text-ada-002 which can embed approximately 6,000 words into a 1,536-dimensional vector. While powerful, text-ada-002 is not open source and is only available via API. We trained a simple neural network to convert open-source 768-dimensional MPNet embeddings into text-ada-002 embedd…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 14 pages, 6 figures, 5 tables

    ACM Class: I.2.7; D.2.12

  39. arXiv:2306.09575  [pdf, other]

    cond-mat.mes-hall cond-mat.str-el

    Quantum metric nonlinear Hall effect in a topological antiferromagnetic heterostructure

    Authors: Anyuan Gao, Yu-Fei Liu, Jian-Xiang Qiu, Barun Ghosh, Thaís V. Trevisan, Yugo Onishi, Chaowei Hu, Tiema Qian, Hung-Ju Tien, Shao-Wen Chen, Mengqi Huang, Damien Bérubé, Houchen Li, Christian Tzschaschel, Thao Dinh, Zhe Sun, Sheng-Chin Ho, Shang-Wei Lien, Bahadur Singh, Kenji Watanabe, Takashi Taniguchi, David C. Bell, Hsin Lin, Tay-Rong Chang, Chunhui Rita Du, et al. (6 additional authors not shown)

    Abstract: Quantum geometry - the geometry of electron Bloch wavefunctions - is central to modern condensed matter physics. Due to the quantum nature, quantum geometry has two parts, the real part quantum metric and the imaginary part Berry curvature. The studies of Berry curvature have led to countless breakthroughs, ranging from the quantum Hall effect in 2DEGs to the anomalous Hall effect (AHE) in ferroma…

    Submitted 23 July, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: 19 pages, 4 figures and a Supplementary Materials with 66 pages, 4 figures and 3 tables. Originally submitted to Science on Oct. 5, 2022

    Journal ref: Science 381, 181-186 (2023)

  40. arXiv:2306.03922  [pdf, other]

    cond-mat.mes-hall cond-mat.mtrl-sci cond-mat.str-el

    Electronic ratchet effect in a moiré system: signatures of excitonic ferroelectricity

    Authors: Zhiren Zheng, Xueqiao Wang, Ziyan Zhu, Stephen Carr, Trithep Devakul, Sergio de la Barrera, Nisarga Paul, Zumeng Huang, Anyuan Gao, Yang Zhang, Damien Bérubé, Kathryn Natasha Evancho, Kenji Watanabe, Takashi Taniguchi, Liang Fu, Yao Wang, Su-Yang Xu, Efthimios Kaxiras, Pablo Jarillo-Herrero, Qiong Ma

    Abstract: Electronic ferroelectricity represents a new paradigm where spontaneous symmetry breaking driven by electronic correlations, in contrast to traditional lattice-driven ferroelectricity, leads to the formation of electric dipoles. Despite the potential application advantages arising from its electronic nature, switchable electronic ferroelectricity remains exceedingly rare. Here, we report the disco…

    Submitted 6 June, 2023; originally announced June 2023.

  41. arXiv:2304.05589  [pdf, other]

    eess.IV cs.CV

    Discovering Structure From Corruption for Unsupervised Image Reconstruction

    Authors: Oscar Leong, Angela F. Gao, He Sun, Katherine L. Bouman

    Abstract: We consider solving ill-posed imaging inverse problems without access to an image prior or ground-truth examples. An overarching challenge in these inverse problems is that an infinite number of images, including many that are implausible, are consistent with the observed measurements. Thus, image priors are required to reduce the space of possible solutions to more desirable reconstructions. Howe…

    Submitted 1 November, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: Extended version of arXiv:2303.12217

  42. arXiv:2303.12217  [pdf, other]

    eess.IV cs.CV

    Image Reconstruction without Explicit Priors

    Authors: Angela F. Gao, Oscar Leong, He Sun, Katherine L. Bouman

    Abstract: We consider solving ill-posed imaging inverse problems without access to an explicit image prior or ground-truth examples. An overarching challenge in inverse problems is that there are many undesired images that fit to the observed measurements, thus requiring image priors to constrain the space of possible solutions to more plausible reconstructions. However, in many applications it is difficult…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: ICASSP 2023

  43. arXiv:2303.05451  [pdf, other]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Axion optical induction of antiferromagnetic order

    Authors: Jian-Xiang Qiu, Christian Tzschaschel, Junyeong Ahn, Anyuan Gao, Houchen Li, Xin-Yue Zhang, Barun Ghosh, Chaowei Hu, Yu-Xuan Wang, Yu-Fei Liu, Damien Bérubé, Thao Dinh, Zhenhao Gong, Shang-Wei Lien, Sheng-Chin Ho, Bahadur Singh, Kenji Watanabe, Takashi Taniguchi, David C. Bell, Hai-Zhou Lu, Arun Bansil, Hsin Lin, Tay-Rong Chang, Brian B. Zhou, Qiong Ma, et al. (3 additional authors not shown)

    Abstract: Using circularly-polarized light to control quantum matter is a highly intriguing topic in physics, chemistry and biology. Previous studies have demonstrated helicity-dependent optical control of spatial chirality and magnetization $M$. The former is central for asymmetric synthesis in chemistry and homochirality in bio-molecules, while the latter is of great interest for ferromagnetic spintronics…

    Submitted 9 March, 2023; originally announced March 2023.

    Journal ref: Nature Materials 22, 583-590 (2023)

  44. arXiv:2303.01330  [pdf, other]

    cs.RO

    Continuous Implicit SDF Based Any-shape Robot Trajectory Optimization

    Authors: Tingrui Zhang, Jingping Wang, Chao Xu, Alan Gao, Fei Gao

    Abstract: Optimization-based trajectory generation methods are widely used in whole-body planning for robots. However, existing work either oversimplifies the robot's geometry and environment representation, resulting in a conservative trajectory, or suffers from a huge overhead in maintaining additional information such as the Signed Distance Field (SDF). To bridge the gap, we consider the robot as an impl…

    Submitted 2 March, 2023; originally announced March 2023.

  45. arXiv:2301.07188  [pdf, other]

    physics.chem-ph cond-mat.stat-mech

    Dielectric Saturation in Water from a Long Range Machine Learning Model

    Authors: Harender S. Dhattarwal, Ang Gao, Richard C. Remsing

    Abstract: Machine learning-based neural network potentials have the ability to provide ab initio-level predictions while reaching large length and time scales often limited to empirical force fields. Traditionally, neural network potentials rely on a local description of atomic environments to achieve this scalability. These local descriptions result in short range models that neglect long range interaction…

    Submitted 17 January, 2023; originally announced January 2023.

    Comments: 10 pages, 7 figures

  46. arXiv:2212.05155  [pdf, other]

    cs.DC cs.LG

    Acela: Predictable Datacenter-level Maintenance Job Scheduling

    Authors: Yi Ding, Aijia Gao, Thibaud Ryden, Kaushik Mitra, Sukumar Kalmanje, Yanai Golany, Michael Carbin, Henry Hoffmann

    Abstract: Datacenter operators ensure fair and regular server maintenance by using automated processes to schedule maintenance jobs to complete within a strict time budget. Automating this scheduling problem is challenging because maintenance job duration varies based on both job type and hardware. While it is tempting to use prior machine learning techniques for predicting job duration, we find that the st…

    Submitted 9 December, 2022; originally announced December 2022.

  47. arXiv:2212.01468  [pdf, ps, other]

    math.HO

    The Struggles of Chessland

    Authors: Irene Choi, Shreyas Ekanathan, Aidan Gao, Tanya Khovanova, Sylvia Zia Lee, Rajarshi Mandal, Vaibhav Rastogi, Daniel Sheffield, Michael Yang, Angela Zhao, Corey Zhao

    Abstract: This is a fairy tale taking place in Chessland, located in the Bermuda triangle. The chess pieces survey their land and trap enemy pieces. Behind the story, there is fascinating mathematics on how to optimize surveying and trapping. The tale is written by the students in the PRIMES STEP junior group, who were in grades 6 through 9. The paper has a conclusion, written by the group's mentor, Tanya K…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: 31 pages, 32 figures, 5 tables

    MSC Class: 00A08; 05C99

  48. arXiv:2211.13329  [pdf, other]

    stat.AP

    Extent of Safety Database in Pediatric Drug Development: Types of Assessment, Analytical Precision, and Pathway for Extrapolation through On-Target Effects

    Authors: Margaret Gamalo, Yihua Zhao, Aijun Gao, Jingjing Ye, Ralph DeMasi, Eiji Eshida, YJ Choi, Robert Nelson

    Abstract: Pediatric patients should have access to medicines that have been appropriately evaluated for safety and efficacy. Given this goal of revised labelling, the adequacy of the pediatric clinical development plan and resulting safety database must inform a favorable benefit-risk assessment for the intended use of the medicinal product. While extrapolation from adults can be used to support efficacy of…

    Submitted 23 November, 2022; originally announced November 2022.

  49. arXiv:2211.06573  [pdf]

    physics.app-ph cond-mat.mes-hall

    Approaching intrinsic threshold breakdown voltage and ultra-high gain in graphite/InSe Schottky photodetector

    Authors: Zhiyi Zhang, Bin Cheng, Jeremy Lim, Anyuan Gao, Lingyuan Lyu, Tianju Cao, Shuang Wang, Zhu-An Li, Qingyun Wu, L. K. Ang, Yee Sin Ang, Shi-Jun Liang, Feng Miao

    Abstract: Realizing both ultra-low breakdown voltage and ultra-high gain has been one of the major challenges in the development of high-performance avalanche photodetector. Here, we report that an ultra-high avalanche gain of 3*10^5 can be realized in the graphite/InSe Schottky photodetector at a breakdown voltage down to 5.5 V. Remarkably, the threshold breakdown voltage can be further reduced down to 1.8…

    Submitted 11 November, 2022; originally announced November 2022.

  50. arXiv:2210.12352  [pdf, other]

    cs.CV cs.GR cs.LG

    NeuPhysics: Editable Neural Geometry and Physics from Monocular Videos

    Authors: Yi-Ling Qiao, Alexander Gao, Ming C. Lin

    Abstract: We present a method for learning 3D geometry and physics parameters of a dynamic scene from only a monocular RGB video input. To decouple the learning of underlying scene geometry from dynamic motion, we represent the scene as a time-invariant signed distance function (SDF) which serves as a reference frame, along with a time-conditioned deformation field. We further bridge this neural geometry re…

    Submitted 22 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022