-
Using Galaxy Evolution as Source of Physics-Based Ground Truth for Generative Models
Authors:
Yun Qi Li,
Tuan Do,
Evan Jones,
Bernie Boscoe,
Kevin Alfaro,
Zooey Nguyen
Abstract:
Generative models producing images have enormous potential to advance discoveries across scientific fields and require metrics capable of quantifying the high-dimensional output. We propose that astrophysics data, such as galaxy images, can test generative models with physics-motivated ground truths in addition to human judgment. For example, galaxies in the Universe form and change over billions of years, following physical laws and relationships that are both easy to characterize and difficult to encode in generative models. We build a conditional denoising diffusion probabilistic model (DDPM) and a conditional variational autoencoder (CVAE) and test their ability to generate realistic galaxies conditioned on their redshifts (galaxy ages). This is one of the first studies to probe these generative models using physically motivated metrics. We find that both models produce comparably realistic galaxies based on human evaluation, but our physics-based metrics are better able to discern the strengths and weaknesses of the generative models. Overall, the DDPM performs better than the CVAE on the majority of the physics-based metrics. Ultimately, if we can show that generative models can learn the physics of galaxy evolution, they have the potential to unlock new astrophysical discoveries.
Submitted 9 July, 2024;
originally announced July 2024.
-
Enhancing Q&A with Domain-Specific Fine-Tuning and Iterative Reasoning: A Comparative Study
Authors:
Zooey Nguyen,
Anthony Annunziata,
Vinh Luong,
Sang Dinh,
Quynh Le,
Anh Hai Ha,
Chanh Le,
Hong An Phan,
Shruti Raghavan,
Christopher Nguyen
Abstract:
This paper investigates the impact of domain-specific model fine-tuning and of reasoning mechanisms on the performance of question-answering (Q&A) systems powered by large language models (LLMs) and Retrieval-Augmented Generation (RAG). Using the FinanceBench SEC financial filings dataset, we observe that, for RAG, combining a fine-tuned embedding model with a fine-tuned LLM achieves better accuracy than generic models, with relatively greater gains attributable to fine-tuned embedding models. Additionally, employing reasoning iterations on top of RAG delivers an even bigger jump in performance, enabling the Q&A systems to get closer to human-expert quality. We discuss the implications of such findings, propose a structured technical design space capturing major technical components of Q&A AI, and provide recommendations for making high-impact technical choices for such components. We plan to follow up on this work with actionable guides for AI teams and further investigations into the impact of domain-specific augmentation in RAG and into agentic AI capabilities such as advanced planning and reasoning.
Submitted 19 April, 2024; v1 submitted 17 April, 2024;
originally announced April 2024.
-
Improving Photometric Redshift Estimation for Cosmology with LSST using Bayesian Neural Networks
Authors:
Evan Jones,
Tuan Do,
Bernie Boscoe,
Jack Singal,
Yujie Wan,
Zooey Nguyen
Abstract:
We present results exploring the role that probabilistic deep learning models can play in cosmology from large-scale astronomical surveys through photometric redshift (photo-z) estimation. Photo-z uncertainty estimates are critical for the science goals of upcoming large-scale surveys such as LSST; however, common machine learning methods typically provide only point estimates and lack uncertainties on predictions. We turn to Bayesian neural networks (BNNs) as a promising way to provide accurate predictions of redshift values with uncertainty estimates. We have compiled a galaxy data set from the Hyper Suprime-Cam Survey with grizy photometry, which is designed to be a smaller-scale version of large surveys like LSST. We use this data set to investigate the performance of a neural network (NN) and a probabilistic BNN for photo-z estimation and evaluate their performance with respect to LSST photo-z science requirements. We also examine the utility of photo-z uncertainties as a means to reduce catastrophic outlier estimates. The BNN outputs the estimate in the form of a Gaussian probability distribution. We use the mean and standard deviation as the redshift estimate and uncertainty. We find that the BNN can produce accurate uncertainties. Using a coverage test, we find excellent agreement with expectation: 67.2$\%$ of galaxies in the range $0 < z < 2.5$ have 1-$σ$ uncertainties that cover the spectroscopic value. We also include a comparison to alternative machine learning models using the same data. We find the BNN meets two out of three of the LSST photo-z science requirements in the range $0 < z < 2.5$.
Submitted 18 March, 2024; v1 submitted 22 June, 2023;
originally announced June 2023.
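The coverage test mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each galaxy has a Gaussian predictive mean and standard deviation and counts how often the spectroscopic redshift falls within one standard deviation of the mean; for well-calibrated Gaussian uncertainties that fraction should be close to 68.3%. The function name `coverage_fraction` and the synthetic data are hypothetical and are not taken from the paper.

```python
import random


def coverage_fraction(z_spec, z_mean, z_sigma, n_sigma=1.0):
    """Fraction of objects whose true redshift lies within n_sigma of the predicted mean."""
    covered = sum(
        abs(zs - m) <= n_sigma * s
        for zs, m, s in zip(z_spec, z_mean, z_sigma)
    )
    return covered / len(z_spec)


# Synthetic, perfectly calibrated predictions (an assumption for illustration only):
# the "true" redshift is drawn from the same Gaussian that the model predicts.
rng = random.Random(0)
n = 50_000
z_mean = [rng.uniform(0.0, 2.5) for _ in range(n)]
z_sigma = [rng.uniform(0.02, 0.10) for _ in range(n)]
z_spec = [rng.gauss(m, s) for m, s in zip(z_mean, z_sigma)]

frac = coverage_fraction(z_spec, z_mean, z_sigma)
print(f"1-sigma coverage: {frac:.3f}")
```

A coverage well below the Gaussian expectation would suggest overconfident uncertainties, and a coverage well above it would suggest overly conservative ones.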
-
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Authors:
BigScience Workshop,
Teven Le Scao,
Angela Fan,
Christopher Akiki,
Ellie Pavlick,
Suzana Ilić,
Daniel Hesslow,
Roman Castagné,
Alexandra Sasha Luccioni,
François Yvon,
Matthias Gallé,
Jonathan Tow,
Alexander M. Rush,
Stella Biderman,
Albert Webson,
Pawan Sasanka Ammanamanchi,
Thomas Wang,
Benoît Sagot,
Niklas Muennighoff,
Albert Villanova del Moral,
Olatunji Ruwase,
Rachel Bawden,
Stas Bekman,
Angelina McMillan-Major
, et al. (369 additional authors not shown)
Abstract:
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Submitted 27 June, 2023; v1 submitted 9 November, 2022;
originally announced November 2022.
-
Photometric Redshifts for Cosmology: Improving Accuracy and Uncertainty Estimates Using Bayesian Neural Networks
Authors:
Evan Jones,
Tuan Do,
Bernie Boscoe,
Yujie Wan,
Zooey Nguyen,
Jack Singal
Abstract:
We present results exploring the role that probabilistic deep learning models can play in cosmology from large-scale astronomical surveys through estimating the distances to galaxies (redshifts) from photometry. Due to the massive scale of data coming from these new and upcoming sky surveys, machine learning techniques using galaxy photometry are increasingly adopted to predict galactic redshifts, which are important for inferring cosmological parameters such as the nature of dark energy. Associated uncertainty estimates are also critical measurements; however, common machine learning methods typically provide only point estimates and lack uncertainty information as outputs. We turn to Bayesian neural networks (BNNs) as a promising way to provide accurate predictions of redshift values. We have compiled a new galaxy training dataset from the Hyper Suprime-Cam Survey, designed to mimic large surveys, but over a smaller portion of the sky. We evaluate the performance and accuracy of photometric redshift (photo-z) predictions from photometry using machine learning, astronomical and probabilistic metrics. We find that while the Bayesian neural network did not perform as well as non-Bayesian neural networks if evaluated solely by point estimate photo-z values, BNNs can provide uncertainty estimates that are necessary for cosmology.
Submitted 14 February, 2022;
originally announced February 2022.
-
Predicting the redshift of gamma-ray loud AGNs using supervised machine learning
Authors:
Maria Giovanna Dainotti,
Malgorzata Bogdan,
Aditya Narendra,
Spencer James Gibson,
Blazej Miasojedow,
Ioannis Liodakis,
Agnieszka Pollo,
Trevor Nelson,
Kamil Wozniak,
Zooey Nguyen,
Johan Larrson
Abstract:
AGNs are very powerful galaxies characterized by extremely bright emissions coming from their central massive black holes. Knowing the redshifts of AGNs provides us with an opportunity to determine their distance and to investigate important astrophysical problems such as the evolution and formation of the early stars, along with the structure of early galaxies. The redshift determination is challenging because it requires detailed follow-up of multi-wavelength observations, often involving various astronomical facilities. Here, we employ machine learning algorithms to estimate redshifts from the observed gamma-ray properties and photometric data of gamma-ray loud AGNs from the Fourth Fermi-LAT Catalog. The prediction is obtained with the Superlearner algorithm, using a LASSO-selected set of predictors. We obtain a tight correlation, with a Pearson Correlation Coefficient of 71.3% between the inferred and the observed redshifts, and an average $\Delta z_{\rm norm} = 11.6 \times 10^{-4}$. We stress that notwithstanding the small sample of gamma-ray loud AGNs, we obtain a reliable predictive model using Superlearner, which is an ensemble of several machine learning models.
Submitted 22 July, 2021;
originally announced July 2021.
-
Unified modelling of epidemics by coupled dynamics via Monte-Carlo Markov Chain algorithms
Authors:
Frédéric Protin,
Martel Jules,
Duc Thang Nguyen,
Hang T. T. Nguyen,
Charles Piffault,
Willy Rodríguez,
Susely Figueroa Iglesias,
Tat Dat Tô,
Wilderich Tuschmann,
Hông Vân Lê,
Tenan Yeo,
Tien Zung Nguyen
Abstract:
To forecast the time dynamics of an epidemic, we propose a discrete stochastic model that unifies and generalizes previous approaches to the subject. Viewing a given population of individuals or groups of individuals with given health state attributes as living in and moving between the nodes of a graph, we use Monte-Carlo Markov Chain techniques to simulate the movements and health state changes of the individuals according to given probabilities of stay that have been preassigned to each of the nodes. We utilize this model to either capture and predict the future geographic evolution of an epidemic in time, or the evolution of an epidemic inside a heterogeneous population which is divided into homogeneous sub-populations, or, more generally, its evolution in a combination or superposition of the previous two contexts. We also prove that when the size of the population increases and a natural hypothesis is satisfied, the stochastic process associated with our model converges to a deterministic process. Indeed, when the length of the time step used in the discrete model converges to zero, in the limit this deterministic process is driven by a differential equation yielding the evolution of the expectation value of the number of infected as a function of time. In the second part of the paper, we apply our model to study the evolution of the Covid-19 epidemic. We deduce a decomposition of the function yielding the number of infectious individuals into "wavelets", which allows us to trace in time the expectation value for the number of infections inside each sub-population. Within this framework, we also discuss possible causes for the occurrence of multiple epidemiological waves.
Submitted 25 June, 2021;
originally announced June 2021.
-
Coalescence of Islands in Freely-Suspended Smectic Films
Authors:
Z. H. Nguyen,
K. Harth,
A. M. Goldfain,
C. S. Park,
J. E. Maclennan,
M. A. Glaser,
N. A. Clark
Abstract:
Smectic liquid crystal films a few molecular layers thick that are freely suspended in air are used as a model system to study the coalescence of fluids in two dimensions. High-speed video microscopy is used to observe the coalescence of islands, which are thicker, disk-shaped regions of the film, in a process driven by the line tension associated with edge dislocations along the island boundaries and limited by viscous dissipation in the liquid crystal and in the surrounding air. The early time growth of the bridge connecting the merging islands reveals much slower dynamics than predicted by Hopper's classical hydrodynamic model of coalescence of two infinitely long, fluid cylinders in vacuum, a discrepancy proposed to be due to significant dissipation in the background film and in the air that is not included in Hopper's theory. At late times, the elliptical merged island relaxes exponentially to a circular shape, at rates that are described quantitatively by a model originally developed for the evolution of fluid domains in Langmuir films.
Submitted 2 July, 2021; v1 submitted 1 June, 2020;
originally announced June 2020.
-
Mutual diffusion of inclusions in freely-suspended smectic liquid crystal films
Authors:
Zhiyuan Qi,
Zoom Hoang Nguyen,
Cheol Soo Park,
Matthew A. Glaser,
Joseph E. Maclennan,
Noel A. Clark,
Tatiana Kuriabova,
Thomas R. Powers
Abstract:
We study experimentally and theoretically the hydrodynamic interaction of pairs of circular inclusions in two-dimensional, fluid smectic membranes suspended in air. By analyzing their Brownian motion, we find that the radial mutual mobilities of identical inclusions are independent of their size but that the angular coupling becomes strongly size-dependent when their radius exceeds a characteristic hydrodynamic length. The observed dependence of the mutual mobilities on inclusion size is described well for arbitrary separations by a model that generalizes the Levine/MacKintosh theory of point-force response functions and uses a boundary-element approach to calculate the mobility matrix.
Submitted 9 January, 2014;
originally announced January 2014.