The Implementation Solution for Automatic Visualization of Tabular Data in Relational Databases Based on Large Language Models (1)

This document discusses an implementation solution for automatic visualization of tabular data in relational databases using large language models. The process involves generating SQL queries based on user descriptions, determining chart types, and mapping data to visual channels, utilizing the Chain-of-Thought technique for improved reasoning. The study evaluates the effectiveness of this approach on the nvBench dataset, demonstrating its potential for enhancing data visualization accessibility.

The implementation solution for automatic visualization of tabular data in relational databases based on large language models
2024 International Conference on Asian Language Processing (IALP) | 979-8-3315-4085-2/24/$31.00 ©2024 IEEE | DOI: 10.1109/IALP63756.2024.10661162

Hao Yang
Beijing Advanced Innovation Center for Language Resources
Beijing Language and Culture University
Beijing, China
[email protected]

Zhaoyong Yang
Beijing Advanced Innovation Center for Language Resources
Beijing Language and Culture University
Beijing, China
yangzhaoyong [email protected]

Ruyang Zhao Xiaoran Li

Gaoqi Rao
Research Institute of International Chinese Language Education
Beijing Language and Culture University
Beijing, China
[email protected]

Abstract—In the data analysis process, visualized data can help users gain better insights. To make it easier and faster for users to obtain visual charts from data, natural language interfaces for data visualization have emerged. Users only need to provide the visualization model with the data to be visualized and a description of their visualization needs, and the model returns a visual chart (NL2VIS). In real-world scenarios, most data is stored in relational databases. To visualize this data, it is first necessary to generate a structured query statement based on the user’s visualization requirements (NL2SQL) and then proceed with the subsequent visualization operations. This study breaks down the task of automatic visualization of tabular data in relational databases into three main steps: generating SQL, determining the chart type, and mapping data to visual channels. We utilize the Chain-of-Thought (CoT) technique of generative large language models to address the task of automatic visualization of tabular data. Finally, we evaluate our approach on the nvBench dataset, and the results show that CoT-based automatic visualization of tabular data performs well.

Index Terms—NL2VIS, NL2SQL, Chain-of-Thought, Large Language Model

Supported by NSFC (No. 62076038), achievements of the Project of Intelligent International Chinese Education at Beijing Language and Culture University.

I. INTRODUCTION

The visualization of tabular data is extremely important in data analysis. A good chart not only clearly presents data characteristics but also helps users gain insights into data patterns. To assist users in visualizing tabular data, a series of commercial software solutions have emerged, such as Power BI [1], Redash [23], Tableau [24], etc. Users can utilize these software tools to visualize their data through interface selections, drag-and-drop operations, and other interactions.

Although these tools can help users visualize tabular data, they require users to have professional data analysis skills and visualization knowledge, which creates a relatively high barrier to entry. Therefore, the academic and industrial communities have begun researching natural language interfaces for tabular data visualization, aiming to achieve automatic visualization of tabular data. The goal is to automatically generate a chart based on the user’s natural language description.

Research on automatic visualization of tabular data has generally gone through three main stages: a rule-based stage [2], [5], a deep neural network-based stage [8], [9], and a large language model-based stage [12], [15]. The generalization ability and robustness of these methods have gradually increased as the technology has advanced. From the perspective of model output, previous research on automatic visualization of tabular data can mainly be divided into two categories. One outputs executable visualization language scripts, and the other outputs abstract expressions. The method of outputting visualization language scripts directly utilizes generative language models to generate target code, such as Vega-Lite [13], Python [11], etc. The method of outputting abstract expressions mainly involves the model generating a predefined visualization query statement, such as Vega-Zero (illustrated in Fig. 1) [9], DVQ (illustrated in Fig. 2) [21], etc., which is then converted into a visualization language program. Unlike
these two mainstream approaches, in this study, we need the model to output SQL.

Fig. 1: Vega-Zero abstract statement expression.

Fig. 2: Data Visualization Query (DVQ) abstract statement expression.

To visualize tabular data in relational databases, data acquisition is the first step. Retrieving data from relational databases relies on the execution of SQL statements by the database engine. Therefore, to visualize tabular data in relational databases, we must first generate SQL queries based on the user’s natural language description. Subsequently, we proceed with chart selection and mapping data to visual channels.

Large language models like GPT-3.5 have demonstrated powerful text generation and semantic understanding capabilities, achieving state-of-the-art performance in many downstream tasks. Prompt techniques using large language models, such as CoT, imitate human problem-solving approaches through step-by-step reasoning and have been widely researched by scholars. In this study, we utilize CoT prompt techniques to address the task of automatic visualization of tabular data in relational databases (CoT-VIS).

II. RELATED WORK

In order to facilitate data analysis and visualization, a series of tools such as Power BI [1], Tableau [24], and Redash [23] have emerged successively. Users can utilize these tools for data visualization, but they often require a significant amount of manual operation. Subsequently, both academia and industry began researching natural language interfaces for data visualization, making user operations more convenient. These natural language interfaces enable users, even those lacking data analysis and visualization experience, to generate charts through language descriptions. The research on natural language interfaces for visualization has gone through three stages: rule-based methods, such as NL4DV [2], Orko [3], Eviza [4], FlowSense [5], DataTone [6]; deep neural network-based methods, such as ADVisor [7], Seq2vis [8], ncNet [9], RGVisNet [10]; and generative large language model-based methods, such as Chat2Vis [11], Prompt4Vis [12], Mirror [13], ChartGPT [14], and LIDA [15]. At present, the development of automated data visualization has progressed to a stage where large-scale models are used to solve problems. Based on large language models, two mainstream solutions have emerged: directly generating target code and generating visualization abstract expressions. For example, Chat2Vis [11] directly generates Python programs, while Prompt4Vis [12] generates visualization abstract expressions. To further improve the accuracy of data visualization, researchers tend to decompose visualization tasks into several sub-tasks, with each sub-task responsible for solving a specific problem. The results of each sub-task are then fed into subsequent sub-tasks. For example, both ChartGPT [14] and Prompt4Vis [12] decompose the visualization task into sub-tasks and then solve each sub-task one by one, ultimately achieving good results in completing the visualization task.

Large language models possess powerful In-Context Learning abilities [16]. In order to further enhance their reasoning capabilities, researchers have investigated prompt techniques such as zero-shot [17] and few-shot [16]. Subsequently, researchers discovered the Chain-of-Thought [18] prompt approach, which mimics the human thinking process by solving problems step by step. Building on chain-of-thought, other variants have been studied, such as Contrastive Chain-of-Thought [19] and Least-to-Most [20].

III. DEFINITION OF PROBLEM

When solving problems using prompt-based techniques with generative large language models, users need to provide the model with a prompt for problem solving. The model then performs inference based on the prompt provided by the user and generates a response to the problem. We define the large language model as LLM, the prompt provided by the user as P, and the model’s response as A. This process can be formalized as follows:

    LLM(P) → A

The automatic visualization of tabular data is the process by which a model provides users with a chart based on natural language descriptions and database schema information. Generating a chart typically requires three types of information:
the chart type, the data, and the mapping between the data and visual channels. Therefore, it is necessary to assemble the user’s natural language descriptions and the database schema information into a prompt as input to the LLM. The data objects in this study are tabular data in relational databases. The model needs to provide structured query statements, chart types, and mappings between data and visual channels based on the user’s natural language descriptions and the database schema information. The process of automatic visualization of tabular data in relational databases is illustrated in Fig. 3. We define the user’s natural language descriptions as NL, the database schema information as D, structured query statements as SQL, chart types as CHART, and the mapping between data and visual channels as MAP. This process can be formalized as follows:

    LLM(P(NL, D)) → {SQL, CHART, MAP}

Fig. 3: The workflow of automatic visualization of tabular data in relational databases.

IV. SOLUTION OF COT-VIS

To enhance the performance of large language models on downstream tasks, there are typically two mainstream approaches: first, fine-tuning the model on a dataset specific to the downstream task; second, constructing prompts to leverage the model’s In-Context Learning (ICL) ability to solve problems. Fine-tuning a model demands significant computational power and high-quality downstream task datasets, making it highly resource-intensive. In contrast, prompt engineering is simple and easy to implement, as it does not require updating the model’s parameters. A well-crafted prompt can guide the model to generate highly accurate answers. Prompt engineering is currently a subject of extensive research in the academic community and has already achieved remarkable success in tasks such as commonsense reasoning, mathematical problem-solving, and symbolic reasoning [18].

Next, we elaborate on how to use prompt techniques in large language models to address the task of automatic visualization of tabular data in relational databases. First, we summarize the current prompt strategies employed to enhance the reasoning capabilities of large language models. Then, we select an appropriate solution specifically for the task of automatic visualization of tabular data.

A. Prompt Strategies for Large Language Models

Large language models can leverage their ICL abilities to respond to user queries based on the provided prompt text. Depending on the number of examples given by the user, prompt strategies can be categorized into two types: zero-shot and few-shot. Zero-shot prompts do not provide the model with question-answer examples, requiring the model to respond directly to the query. Few-shot prompts, on the other hand, provide the model with several question-answer examples, which can further enhance the accuracy of the model’s responses.

The user provides the model with question-answer pairs in the form <q, a>. Here, q represents the user’s question along with any additional information needed to solve the problem, and a represents the model’s response. Typically, prompts in the <q, a> format are referred to as standard prompts. Researchers have discovered that adding reasoning steps r, which prompt the model to derive the answer step by step, can significantly improve the accuracy of the model’s responses. Consequently, researchers proposed the <q, r, a> prompt format and named it the Chain-of-Thought (CoT) prompt. Usually, users need to manually create several CoT examples so that the model can mimic this format when generating answers. This type of prompt is known as few-shot CoT prompting. Additionally, there is a method that does not require manually crafted examples: by appending “Let’s think step by step.” to the prompt, the model is induced to produce CoT responses. This is referred to as zero-shot CoT prompting. Empirical evidence shows that manually written few-shot CoT prompts outperform zero-shot CoT prompts. CoT prompting breaks problems down and solves them step by step, which enhances problem-solving capabilities compared to standard prompts.

B. CoT-VIS

As previously mentioned, visualizing tabular data in a relational database involves three major steps: generating SQL, determining the chart type, and mapping data to visual channels. We now detail these three steps and then develop a CoT prompt based on this process.

1) Generating SQL: The model needs to generate the corresponding SQL based on the user’s natural language description to query data. Unlike traditional NL2SQL tasks, in a visualization task it is necessary not only to identify the correct data columns and tables but also to perform appropriate data transformations on the data columns, such as data binning.

a) Step 1: Determining Data Columns: After semantically understanding the user’s natural language description and the database schema, the language model determines the data columns that need to be queried. Assuming there are n data columns {column_1, column_2, ..., column_n} in the database, this process will ultimately yield an intermediate SQL expression in the following form:

    SELECT column_i | column_j | ...
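As a rough illustration of how the prompt P(NL, D) and the <q, r, a> few-shot CoT format described in Section IV-A might be assembled in practice, the following Python sketch builds a prompt string from a natural language description, a schema, and hand-written demonstrations. It is a minimal sketch under stated assumptions, not the prompt used in this study: the names `CoTExample`, `render_schema`, and `build_prompt`, the "Q / Reasoning / A" layout, and the sample schema are all hypothetical, and no model call is made.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CoTExample:
    """One hand-written <q, r, a> demonstration (hypothetical structure)."""
    question: str   # q: natural language description plus schema text
    reasoning: str  # r: step-by-step derivation
    answer: str     # a: final SQL, chart type, and channel mapping


def render_schema(schema: Dict[str, List[str]]) -> str:
    """Serialize the schema D as one 'table(col, ...)' line per table."""
    return "\n".join(f"{table}({', '.join(cols)})" for table, cols in schema.items())


def build_prompt(nl: str, schema: Dict[str, List[str]],
                 examples: List[CoTExample]) -> str:
    """Assemble P(NL, D): demonstrations first, then the new question."""
    parts = [
        f"Q: {ex.question}\nReasoning: {ex.reasoning}\nA: {ex.answer}"
        for ex in examples
    ]
    # Few-shot CoT: the demonstrations show the format, so end on "Reasoning:".
    # Zero-shot CoT: no demonstrations, so append the step-by-step cue instead.
    cue = "Reasoning:" if examples else "Reasoning: Let's think step by step."
    parts.append(f"Schema:\n{render_schema(schema)}\nQ: {nl}\n{cue}")
    return "\n\n".join(parts)


# Illustrative demonstration and query (made-up tables, not from nvBench).
demo = CoTExample(
    question="Schema: sales(region, amount). Show total amount per region as a bar chart.",
    reasoning="Columns: region, amount. Table: sales. Group by region and sum amount.",
    answer='SQL: SELECT region, SUM(amount) FROM sales GROUP BY region; '
           'CHART: Bar; MAP: {"x-axis": "region", "y-axis": "SUM(amount)"}',
)
prompt = build_prompt(
    "Plot the number of orders per city.",
    {"orders": ["id", "city", "created_at"]},
    [demo],
)
print(prompt)
```

Passing an empty list of demonstrations yields the zero-shot CoT variant, so the same helper covers both prompting styles compared later in the experiments.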

Fig. 4: Chain-of-thought deduction process for automatic visualization of tabular data in relational databases.

b) Step 2: Determining Data Tables: Based on Step 1, determine the data tables according to the selected data columns. If the data columns come from a single table T_i, then the target data T = T_i. If the selected columns come from multiple tables {T_1, T_2, ..., T_n}, it is necessary to perform multi-table joins based on foreign key information. This process will generate an intermediate SQL expression in the following form:

    SELECT column_i | column_j | ...
    FROM T

where T = T_i | JOIN(T_i, T_j, ... | Foreign Keys).

c) Step 3: Data Transformation: Determine whether transformation operations, primarily data binning, are needed for the selected columns. Many previous works relied on the data binning functionality inherent in frontend visualization frameworks (such as Vega-Lite). In this study, data is obtained by generating SQL queries, and SQL itself supports data transformation operations. Specifically, data binning operations are typically implemented using the CASE expression in SQL. If there is a field named timestamp, representing timestamps, and the user requires data to be binned into early morning, morning, afternoon, and evening time periods, the CASE expression can be used to achieve this:

    -- strftime('%H', ...) is SQLite syntax; it returns the hour as zero-padded text
    SELECT
        CASE
            WHEN strftime('%H', timestamp) BETWEEN '00' AND '05' THEN 'early morning'
            WHEN strftime('%H', timestamp) BETWEEN '06' AND '11' THEN 'morning'
            WHEN strftime('%H', timestamp) BETWEEN '12' AND '17' THEN 'afternoon'
            WHEN strftime('%H', timestamp) BETWEEN '18' AND '23' THEN 'evening'
            ELSE 'unknown'
        END AS bucket
    FROM
        T;

Almost all database engines support basic binning operations for time or numeric data. After the data has been binned, we will obtain the following intermediate SQL:

    SELECT column_i | column_j | ...
    FROM T

where column_i ∈ {column_i, BIN(column_i)}, column_j ∈ {column_j, BIN(column_j)}, ...

d) Step 4: Data Filtering: Based on the user’s natural language description, determine the data filtering conditions to obtain the data that meets the user’s requirements. In SQL, data filtering is done through the WHERE clause. After this step, we will obtain an intermediate SQL in the following form (Cond represents the condition filtering operation):

    SELECT column_i | column_j | ...
    FROM T
    WHERE Cond(column_i) | Cond(column_j) | ...

where column_i ∈ {column_i, BIN(column_i)}, column_j ∈ {column_j, BIN(column_j)}, ...

e) Step 5: Data Group By: Determine whether to perform a grouping operation (GROUP BY) based on certain data columns. After binning and grouping operations, the data will be divided into different groups based on values. After this step, we will obtain an intermediate SQL in the following form:

    SELECT column_i | column_j | ...
    FROM T
    GROUP BY column_i | column_j | ...

where column_i ∈ {column_i, BIN(column_i)}, column_j ∈ {column_j, BIN(column_j)}, ...

f) Step 6: Data Aggregation: After dividing the data into different groups, it is common to perform aggregation (AGG) operations on the data within the same group. Basic column aggregation operations include SUM(), COUNT(), AVG(), etc. After the data aggregation operation, we will obtain the following intermediate SQL:

    SELECT column_i | column_j | ...
    FROM T
    GROUP BY column_i | column_j | ...

where column_i ∈ {column_i, BIN(column_i), AGG(column_i)}, column_j ∈ {column_j, BIN(column_j), AGG(column_j)}, ...

g) Step 7: Data Order By: The final step in generating the target SQL is to determine whether certain columns need to be sorted in ascending or descending order. In visualization charts, it is very common, for example, to sort by the names on the x-axis. After this step, we will obtain the final SQL expression:

    SELECT column_i | column_j | ...
    FROM T
    GROUP BY column_i | column_j | ...
    ORDER BY column_i | column_j | ... ASC|DESC

where column_i ∈ {column_i, BIN(column_i), AGG(column_i)}, column_j ∈ {column_j, BIN(column_j), AGG(column_j)}, ...

2) Determining the Type of Chart: The language model determines the type of chart based on the user’s natural language description and the generated SQL. Our data visualization system supports seven types of charts: Scatter, Pie, Bar, Stacked Bar, Line, Grouping Scatter, and Grouping Line.

3) Mapping: This step maps the target fields of the SQL query to the visual channels of the chart. For example, a bar chart has two dimensions: the x-axis and the y-axis. If the SQL query fields are column_1 and column_2, then column_1 corresponds to the x-axis and column_2 corresponds to the y-axis of the bar chart. After reasoning based on contextual information, the language model will obtain: {"x-axis": "column_1", "y-axis": "column_2"}.

After generating SQL, determining the chart type, and mapping the visual channels, the model outputs all the information needed for visualization through chain-of-thought reasoning. The complete deductive process is illustrated in Fig. 4.

V. EXPERIMENT

a) Dataset: We chose nvBench [8], which is widely used in the field of data visualization, as the evaluation dataset. nvBench is a large dataset designed for complex and cross-domain NL2VIS tasks, covering 105 domains, supporting seven common types of visualizations (Bar, Line, Scatter, Pie, Stacked Bar, Grouping Line, Grouping Scatter), and containing 25,750 (NL, VIS) pairs. Following the experimental validation method outlined in the Prompt4Vis paper [12], we randomly selected 141 databases from nvBench and divided them into training, validation, and test sets in a ratio of 7:1:2. Specifically, the training set contains 98 databases, the validation set contains 14 databases, and the test set contains 29 databases.

b) Model and Methods: In this experiment, we selected the GPT-3.5-turbo model interface provided by OpenAI and set temperature=0. The compared methods are: zero-shot prompt, few-shot prompt, zero-shot-CoT prompt, and few-shot-CoT prompt. In the zero-shot prompt method, the model responds directly based on the provided database schema information and the user’s natural language description. The few-shot prompt method requires manually writing several examples to provide to the model before letting it respond. The zero-shot-CoT prompt method adds “Let’s think step by step.” to the zero-shot prompt to induce the model to generate chain-of-thought responses. The few-shot-CoT prompt method involves manually writing seven chain-of-thought examples (one for each type of chart) to provide to the model for its response.

c) Metrics: This study evaluates the data accuracy (data acc), axis accuracy (axis acc), chart accuracy (chart acc), and overall accuracy (overall acc) of the data visualization system. Data accuracy refers to the execution accuracy of the SQL predicted by the model, calculated as the proportion of correctly executed results to the total number of results. Axis accuracy is the accuracy of mapping data to visual channels, calculated similarly as a proportion. Chart accuracy compares the chart type predicted by the model with the gold chart type, also calculated as a proportion.

    Method          Data acc   Axis acc   Chart acc   Overall acc
    zero-shot       0.304      0.371      0.759       0.217
    few-shot        0.501      0.497      0.928       0.326
    zero-shot-CoT   0.490      0.494      0.939       0.274
    few-shot-CoT    0.559      0.810      0.975       0.490

TABLE I: Experiment result

d) Experiment Result: Experimental results show that with the zero-shot and zero-shot-CoT prompt methods, large language models struggle with the automatic visualization of tabular data, reaching overall accuracies of only 0.217 and 0.274. Providing the model with a few examples using few-shot prompts improves the overall accuracy to 0.326. Our designed few-shot-CoT prompts achieve the highest scores on all four metrics, with an overall accuracy of 0.490 and an axis accuracy of 0.810. This shows that the chain-of-thought prompts designed for the automatic visualization of tabular data in relational databases are effective. However, we must also note that there is still considerable room for improvement in accuracy, which
inspires us to design more sophisticated deduction algorithms to further enhance accuracy.

VI. CONCLUSION

This study divides the task of automatic visualization of tabular data in relational databases into three main steps: generating SQL, determining the chart type, and mapping data to visual channels. Using the Chain-of-Thought technique of large language models, we perform step-by-step reasoning for these three steps. Experimental validation demonstrates that the Chain-of-Thought technique can be effectively applied to the task of automatic data visualization, significantly improving its accuracy. However, we should also note that there is still considerable room for improvement in the accuracy of automatic data visualization tasks, and we will continue to explore ways to enhance this accuracy. Additionally, for the task of automatic visualization of tabular data, the only evaluation dataset available is nvBench. We look forward to the academic community producing more high-quality evaluation datasets.

REFERENCES

[1] Louis T Becker and Elyssa M Gould. 2019. Microsoft Power BI: Extending Excel to manipulate, analyze, and visualize diverse data. Serials Review 45, 3 (2019), 184–188.
[2] Arpit Narechania, Arjun Srinivasan, and John Stasko. 2020. NL4DV: A toolkit for generating analytic specifications for data visualization from natural language queries. IEEE Transactions on Visualization and Computer Graphics 27, 2 (2020), 369–379.
[3] Arjun Srinivasan and John Stasko. 2017. Orko: Facilitating multimodal interaction for visual exploration and analysis of networks. IEEE Transactions on Visualization and Computer Graphics 24, 1 (2017), 511–521.
[4] Vidya Setlur, Sarah E Battersby, Melanie Tory, Rich Gossweiler, and Angel X Chang. 2016. Eviza: A natural language interface for visual analysis. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology. 365–377.
[5] Bowen Yu and Cláudio T Silva. 2019. FlowSense: A natural language interface for visual data exploration within a dataflow system. IEEE Transactions on Visualization and Computer Graphics 26, 1 (2019), 1–11.
[6] Tong Gao, Mira Dontcheva, Eytan Adar, Zhicheng Liu, and Karrie G Karahalios. 2015. Datatone: Managing ambiguity in natural language interfaces for data visualization. In Proceedings of the 28th Annual ACM Symposium on User Interface Software & Technology. 489–500.
[7] Can Liu, Yun Han, Ruike Jiang, and Xiaoru Yuan. 2021. Advisor: Automatic visualization answer for natural-language question on tabular data. In 2021 IEEE 14th Pacific Visualization Symposium (PacificVis). IEEE, 11–20.
[8] Yuyu Luo, Nan Tang, Guoliang Li, Chengliang Chai, Wenbo Li, and Xuedi Qin. 2021. Synthesizing natural language to visualization (NL2VIS) benchmarks from NL2SQL benchmarks. In Proceedings of the 2021 International Conference on Management of Data. 1235–1247.
[9] Yuyu Luo, Nan Tang, Guoliang Li, Jiawei Tang, Chengliang Chai, and Xuedi Qin. 2021. Natural language to visualization by neural machine translation. IEEE Transactions on Visualization and Computer Graphics 28, 1 (2021), 217–226.
[10] Yuanfeng Song, Xuefang Zhao, Raymond Chi-Wing Wong, and Di Jiang. 2022. Rgvisnet: A hybrid retrieval-generation neural framework towards automatic data visualization generation. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 1646–1655.
[11] Paula Maddigan and Teo Susnjak. 2023. Chat2vis: Generating data visualisations via natural language using ChatGPT, Codex and GPT-3 large language models. IEEE Access (2023).
[12] Shuaimin Li, Xuanang Chen, Yuanfeng Song, Yunze Song, and Chen Zhang. 2024. Prompt4Vis: Prompting Large Language Models with Example Mining and Schema Filtering for Tabular Data Visualization. arXiv preprint arXiv:2402.07909 (2024).
[13] Canwen Xu, Julian McAuley, and Penghan Wang. 2023. Mirror: A Natural Language Interface for Data Querying, Summarization, and Visualization. In Companion Proceedings of the ACM Web Conference 2023. 49–52.
[14] Yuan Tian, Weiwei Cui, Dazhen Deng, Xinjing Yi, Yurun Yang, Haidong Zhang, and Yingcai Wu. 2024. Chartgpt: Leveraging LLMs to generate charts from abstract natural language. IEEE Transactions on Visualization and Computer Graphics (2024).
[15] Victor Dibia. 2023. LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), Danushka Bollegala, Ruihong Huang, and Alan Ritter (Eds.). Association for Computational Linguistics, Toronto, Canada, 113–126. https://doi.org/10.18653/v1/2023.acl-demo.11
[16] Brown T, Mann B, Ryder N, et al. 2020. Language models are few-shot learners. Advances in Neural Information Processing Systems 33 (2020), 1877–1901.
[17] Wei J, Bosma M, Zhao V Y, et al. 2021. Finetuned language models are zero-shot learners. arXiv preprint arXiv:2109.01652 (2021).
[18] Wei J, Wang X, Schuurmans D, et al. 2022. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35 (2022), 24824–24837.
[19] Chia Y K, Chen G, Tuan L A, et al. 2023. Contrastive chain-of-thought prompting. arXiv preprint arXiv:2311.09277 (2023).
[20] Zhou D, Schärli N, Hou L, et al. 2022. Least-to-most prompting enables complex reasoning in large language models. arXiv preprint arXiv:2205.10625 (2022).
[21] Luo Y, Qin X, Tang N, et al. 2018. Deepeye: Towards automatic data visualization. In 2018 IEEE 34th International Conference on Data Engineering (ICDE). IEEE, 101–112.
[22] Satyanarayan A, Moritz D, Wongsuphasawat K, et al. 2016. Vega-lite: A grammar of interactive graphics. IEEE Transactions on Visualization and Computer Graphics 23, 1 (2016), 341–350.
[23] Leibzon A, Leibzon Y. 2018. Redash V5 Quick Start Guide: Create and Share Interactive Dashboards Using Redash. Packt Publishing Ltd.
[24] Batt S, Grealis T, Harmon O, et al. 2020. Learning Tableau: A data visualization tool. The Journal of Economic Education 51, 3-4 (2020), 317–328.

2024 International Conference on Asian Language Processing (IALP) 180

Authorized licensed use limited to: Hong Kong University of Science and Technology. Downloaded on July 29,2025 at 08:29:39 UTC from IEEE Xplore. Restrictions apply.



