RONIN: data lake exploration

P Ouellette, A Sciortino, F Nargesian… - Proceedings of the …, 2021 - par.nsf.gov
RONIN, a data lake exploration tool that enables integrated search and navigation. The main
component of RONIN is an algorithm for constructing an organization on a set of data sets. …

Data lakes: A survey of functions and systems

R Hai, C Koutras, C Quix… - … on Knowledge and Data …, 2023 - ieeexplore.ieee.org
… A more recent system, RONIN [110], combines navigation using the above DAG-based …
In what follows, we discuss the data lakes that explore datasets with diverse data models. …

Integrating data lake tables

A Khatiwada, R Shraga, W Gatterbauer… - Proceedings of the VLDB …, 2022 - dl.acm.org
… In this work, we explore the use of TURL to represent columns of data lake tables. Once the
embeddings for the columns are set, we need to define a similarity/distance measure to be …

R2D2: Reducing Redundancy and Duplication in Data Lakes

R Shah, K Mukherjee, A Tyagi, SK Karnam… - … Management of Data, 2023 - dl.acm.org
RONIN [23] enables user exploration of a data lake by navigation of a hierarchical structure.
Here, table similarities are computed by averaging word embeddings of tokens in the table …

Table discovery in data lakes: State-of-the-art and future directions

G Fan, J Wang, Y Li, RJ Miller - … Conference on Management of Data, 2023 - dl.acm.org
… to explore and gain insights from massive collections of data … studies about discovery from
other data formats, such as … from tabular data, which is a primary data format in data lakes. We …

A multi-start simulated annealing strategy for Data Lake Organization Problem

D Fernandes, GS Ramos, RGS Pinheiro… - Applied Soft …, 2024 - Elsevier
… In the latter, we evaluate the navigation quality by simulating a user’s exploration. For … In
RONIN, a user can perform a keyword or joinability search over a data lake, then browse the …

Demeter: An automatic framework for data migration in open data lakes

D Kim, J Han, S Son, MS Gil… - Software: Practice and …, 2024 - Wiley Online Library
data in tables rather than files for efficient data exploration and analysis. In this paper, we
investigate the data management of open data lakes … an open data lake, RONIN [41] proposed a …

Dataset discovery and exploration: A survey

NW Paton, J Chen, Z Wu - ACM Computing Surveys, 2023 - dl.acm.org
… However, suitable data is often not immediately to hand, and there may be many potentially
… in a data lake or in open data repositories. As a result, data discovery and exploration are …

UniDM: A Unified Framework for Data Manipulation with Large Language Models

Y Qian, Y He, R Zhu, J Huang, Z Ma… - Proceedings of …, 2024 - proceedings.mlsys.org
… We develop an automatic context retrieval to allow the LLMs to retrieve data from data lakes,
… Besides, it is very interesting to explore new integration methods except fine-tuning LLMs. …

Humboldt: Metadata-Driven Extensible Data Discovery

A Bäuerle, Ç Demiralp, M Stonebraker - arXiv preprint arXiv:2408.05439, 2024 - arxiv.org
data, users need contextual views (e.g., "which dashboards are my teammates working on?"),
exploration tools (e.g., "show me data … their exploration at one data artifact, explore data that …