GenAI - Text To Charts
Automated Chart Creation with Generative AI
CASE STUDY
September 2023
Field of Study: GENERATIVE AI, DATA VISUALISATION
Written By: ABHILASH SHUKLA
THE CONTENT
Introduction
Overview of Text Analysis with AI
Generative AI Models for Text Understanding
From Text to Data Insights
Types of Charts and their Relevance
Generative AI for Chart Creation
Key Techniques & Technologies
Barriers and Shortcomings
Real-World Applications and Case Studies
Future Prospects of Generative AI for Chart Creation
The Closing Note
The power of merging text and visuals is evident, but to do so with AI presents a series of
challenges and opportunities.
In this journey, we'll delve deep into how AI understands text, how it discerns which chart
type is most relevant, and how it then creates that chart. Moreover, as we embark on this
exploration, we'll not shy away from the technical intricacies that make this process feasible,
from attention mechanisms to the mathematical underpinnings of generative models. So, let's
embark on this expedition together, weaving the story of how AI is revolutionizing our ability
to understand and present data.
Lemmatization: Converting a word to its base form considering the context. Example: "Better" → "Good"
Stop Word Removal: Eliminating common words with less semantic value. Example: "and", "the", "is"
By cleaning up the text data, we ensure that our AI models receive quality input, which is
essential for quality output.
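The cleanup steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the lemma lookup table and stop-word set below are toy stand-ins for what a library such as NLTK or spaCy would provide.

```python
# Minimal text-cleaning sketch: lemmatization via a toy lookup table plus
# stop-word removal. LEMMAS and STOP_WORDS are illustrative only; a real
# pipeline would use a trained lemmatizer and a curated stop-word list.

LEMMAS = {"better": "good", "running": "run", "cameras": "camera"}
STOP_WORDS = {"and", "the", "is", "a", "an", "of"}

def preprocess(text: str) -> list[str]:
    """Lowercase, lemmatize known word forms, and drop stop words."""
    tokens = text.lower().split()
    lemmatized = [LEMMAS.get(tok, tok) for tok in tokens]
    return [tok for tok in lemmatized if tok not in STOP_WORDS]

print(preprocess("The battery is better and the cameras better"))
# → ['battery', 'good', 'camera', 'good']
```

Even this crude version shows the effect: the noise words vanish and inflected forms collapse onto a shared base, so downstream models see a cleaner, denser signal.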
Transformers' key strength is the attention mechanism. Unlike previous models that processed words sequentially, transformers can attend to all words simultaneously, understanding their interplay and grasping context efficiently.

The formula for the attention mechanism is:

Attention(Q, K, V) = softmax(QKᵀ / √d_k) V

Where:
Q represents the Query
K represents the Key
V represents the Value
d_k is the dimensionality of the key vectors

[Diagram: Contextual Attention Mechanism Output]
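The formula above is short enough to implement directly. Here is a minimal NumPy sketch of scaled dot-product attention; the shapes and random inputs are arbitrary toy values chosen only to show the mechanics.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # similarity of every query to every key
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted mix of the value vectors

# Toy example: 3 tokens, key dimension d_k = 4.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = attention(Q, K, V)
print(out.shape)  # (3, 4)
```

Each output row is a context-aware blend of all value vectors, which is exactly how a token "attends" to every other token at once.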
GPT is fundamentally generative. If you've ever seen AI write an essay, poem, or even a
story, GPT is likely behind the curtains. It's pre-trained on vast corpora, learning
language patterns, and then fine-tuned for specific tasks.
On the other hand, BERT is designed to understand the bidirectional context of words. It's
akin to reading a sentence forwards and backwards, ensuring a deeper comprehension.
Both models, with their distinct characteristics, have spun off numerous variants. For
instance, GPT-3, a successor to GPT-2, offers a staggering 175 billion parameters, while BERT
has seen adaptations like RoBERTa and DistilBERT.
Thought of The Future: OpenAI's 'MuseNet' (contd.)

Imagine a future where you're an aspiring musician grappling with writer's block. You have the melody, but the words escape you. Enter a future iteration of MuseNet, an AI model deeply embedded with zero-shot learning capabilities.

You hum the melody into the system, and it instantly understands not just the musical notes but the emotion, the tempo, and the genre. But here's where it gets magical: you tell the AI your intended theme—let's say "a journey through a rainforest"—and it not only drafts lyrics for you but also suggests accompanying instruments. All this, without being specifically trained on songwriting or music composition!

The boundaries between different kinds of content—be it text, music, or visual elements—are blurring. This advanced MuseNet isn't just software; it becomes a holistic artist, a manager, and a marketer—your intellectual companion in the truest sense. It can read the room, the world, and perhaps even the cosmos, adding layers of complexity and depth to its generative capabilities.

This AI could then collaborate with text-analysis models to auto-generate music reviews, assess public sentiment about the song, and even devise marketing strategies—all based on the text generated and analyzed through advanced NLP and zero-shot learning.

[Diagram: Musician's Melody & Theme → Zero-Shot Learning AI → Lyrics Generation, Instrument Suggestions, Emotion & Genre Analysis → Full Song → Text Analysis Models → Music Reviews, Public Sentiment, Marketing Strategies]
"...this phone, but its battery life could be better."

This simple feedback houses multiple facets:
1. A positive sentiment towards the camera.
2. A constructive critique of the battery.

Named Entity Recognition: Identifying names, places, brands, and more. In our example, the entities would be "camera" and "battery life".

Topic Modeling: Extracting the main topics from a large volume of text. Techniques like Latent Dirichlet Allocation (LDA) are commonly employed.

To visualize such feedback across hundreds of reviews, our AI needs to not only understand each sentiment but also aggregate and categorize them.

[Diagram: User Reviews → Sentiment Analysis, Entity Recognition, Topic Modeling]
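To make the aggregation step concrete, here is a deliberately naive sketch of aspect-level tallying across reviews. The entity names, sentiment keyword lists, and the "split on 'but'" heuristic are all illustrative assumptions; real systems would use trained NER and sentiment models instead.

```python
from collections import Counter

# Toy stand-ins for real NER and sentiment models.
ENTITIES = {"camera", "battery"}
POSITIVE = {"great", "amazing", "good"}
NEGATIVE = {"poor", "weak", "bad"}

def aggregate(reviews):
    """Tally positive/negative mentions per entity across all reviews."""
    tally = Counter()
    for review in reviews:
        # Naively split on 'but' so contrasting clauses are scored separately
        # (a real system would use proper clause segmentation).
        for clause in review.lower().split("but"):
            words = set(clause.replace(",", "").replace(".", "").split())
            sentiment = ("pos" if words & POSITIVE
                         else "neg" if words & NEGATIVE else None)
            if sentiment:
                for entity in ENTITIES & words:
                    tally[entity, sentiment] += 1
    return tally

reviews = [
    "The camera is great, but the battery is weak.",
    "Amazing camera.",
    "Battery is bad.",
]
print(aggregate(reviews))
```

The resulting counts per (entity, sentiment) pair are exactly the kind of categorized aggregate a charting model can then visualize.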
Pie Charts: Part-to-Whole Relationships

Pie Charts are perfect for showing a part-to-whole relationship. Each slice represents a category, while its size shows its proportion.

Example: If we were to visualize the market share of different smartphone brands, a pie chart would aptly show which brand dominates and which ones have a niche presence.

Note: While pie charts are simple and intuitive, they aren't suitable for datasets with too many categories or those where precise differences between categories are crucial.

[Figure: Pie chart of market share: Brand A 45%, Brand B 25%, Brand C 15%, Brand D 10%, Others 5%]

Bar and Column Charts: Comparing Across Categories

Bar (horizontal) and Column (vertical) Charts are among the most versatile. They are perfect for comparing data across categories.

Example: If we analyze how many units of a particular book were sold each month, a column chart would effectively showcase the month-wise breakdown.

Consideration: While similar, bar charts can be more readable when dealing with longer category names or a larger number of categories.
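As a small worked example, the market-share pie chart described above can be produced with Matplotlib. The brand figures come from the example; the filename and styling choices are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display required
import matplotlib.pyplot as plt

# Market-share figures from the smartphone-brand example.
brands = ["Brand A", "Brand B", "Brand C", "Brand D", "Others"]
shares = [45, 25, 15, 10, 5]  # percentages; total 100

fig, ax = plt.subplots()
ax.pie(shares, labels=brands, autopct="%1.0f%%", startangle=90)
ax.set_title("Smartphone Market Share")
fig.savefig("market_share_pie.png")
```

Swapping `ax.pie(...)` for `ax.barh(brands, shares)` turns the same data into a horizontal bar chart, which, as noted above, reads better once category names grow long.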
a. Embedding Layers

In the realm of deep learning, particularly when dealing with text, embedding layers have proven to be a transformative tool. But what makes them so special, and how do they function?

a.1. What are Embedding Layers?

At its core, an embedding layer translates large sparse vectors (often one-hot encoded vectors representing words) into a smaller, more manageable dense vector. The dense vector captures semantic relationships between words. For instance, 'king' and 'queen' would have vectors closer in the embedding space than 'king' and 'apple'.

Mathematically speaking, the embedding layer operates as a function that maps discrete items (words, in the case of text) to a continuous vector space. Conceptually, given a word w, its embedded representation e(w) is selected as a row from an embedding matrix E that has dimensions V×d:

e(w) = E[w]

Here:
e(w): The dense vector representing word w.
E: The embedding matrix.
V: The size of the vocabulary.
d: The dimensionality of the embedding space (often much smaller than V).

a.2. The Magic Behind Embeddings

While embedding layers may seem deceptively simple—acting as mere lookup tables where each word or token gets its vector representation—their power extends far beyond this surface-level operation. During the training phase, embedding layers learn to map words to vectors in a way that captures subtle semantic and contextual details. The real magic lies in the adaptability and expressiveness of these dense vectors.

During training, the model tunes the vectors such that words appearing in similar contexts end up closer in the embedding space. So if the words "love" and "adore" frequently appear in similar sentences or near similar words, their vectors will move closer to each other in the multi-dimensional embedding space.
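The lookup e(w) = E[w] is literally a single row selection, which the sketch below makes explicit. The matrix values are random and the tiny word-to-index vocabulary is invented for illustration; they are not trained embeddings.

```python
import numpy as np

# Embedding lookup as row selection: e(w) = E[w].
V, d = 10_000, 64               # vocabulary size and embedding dimension
rng = np.random.default_rng(42)
E = rng.normal(size=(V, d))     # the (trainable) embedding matrix

word_to_id = {"king": 0, "queen": 1, "apple": 2}  # toy vocabulary mapping

def embed(word: str) -> np.ndarray:
    """Return the dense vector for `word`: the corresponding row of E."""
    return E[word_to_id[word]]

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity, the usual closeness measure in embedding space."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

vec = embed("king")
print(vec.shape)  # (64,)
# After training, cosine(embed("king"), embed("queen")) would exceed
# cosine(embed("king"), embed("apple")); with a random E it is meaningless.
```

Training frameworks simply treat the rows of E as parameters and update them by gradient descent, which is how the geometry described above emerges.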
While the horizon of AI-generated charting is vast and fascinating, we need to tread with
caution. Embracing the technology doesn't mean abandoning the tried-and-tested
methodologies and human judgment that have served us for so long. Rather, as we venture
further into this domain, a blend of human intuition and AI-powered automation will likely
yield the best results. After all, the true value of any technology lies not just in its capability
but in our wisdom to use it appropriately.
[Diagram: GAN architecture: Generator and Discriminator]
The journey from text to visual representation, once a painstakingly manual process, now
lies at the fingertips of algorithms that can understand, interpret, and depict. From the
intricacies of embedding layers capturing the essence of words, the focus of attention
mechanisms highlighting crucial segments, to the magic of Generative Adversarial Networks
sculpting relevant charts, we've traversed a landscape rich in potential.
[Diagram: Textual Data → AI Interpretation (Contextual Understanding) → Generative AI (Relevant Visual Representation)]
Yet, with great power comes great responsibility. As we've learned, the path is not without
its challenges. The biases inherent in datasets, the risk of over-relying on AI-generated
visuals without human interpretation, and the potential for misrepresentation are all pitfalls
that need to be navigated with care.
But these are not deterrents; rather, they are milestones in our quest. Each challenge met
and each limitation overcome brings us one step closer to realizing a vision where data is not
just seen but truly understood.
In the words of the renowned mathematician, Richard Hamming, "The purpose of computing
is insight, not numbers." And through the marriage of Generative AI and chart creation,
we're not just gaining insights but evolving the very mediums through which they're gleaned.
To the enthusiasts, experts, and every curious mind reading this — the future beckons. The
horizon is dotted with possibilities yet to be realized, stories yet to be told, and charts yet to
be crafted. Here's to a brighter, clearer, and more visually enriched tomorrow.
linkedin.com/in/abhilashshuklaa
@abhilashshuklaa
[email protected]
abhilashshukla.com