Analysis of Text-to-Image AI Generators

This document analyzes text-to-image AI generators, focusing primarily on DALL-E 2 and comparing it with earlier models. The analysis uses three metrics: aesthetic quality, comprehension and interpretation, and creativity, revealing that DALL-E 2 outperforms earlier models but has limitations in spelling and understanding complex prompts. The conclusion suggests improvements for future AI art generators based on these findings.


Ziyu Huang (Cheryl)

IPHS300 AI for the Humanities (Spring 2022) Prof Elkins and Chun, Kenyon College

Abstract

This project analyzes text-to-image artificial intelligence generators. The comparison focuses mostly on the newly released DALL-E 2 but also includes two AI art generators from earlier generations. Each generator is fed the same text prompt, and the resulting images are evaluated on three metrics: aesthetic, comprehension and interpretation, and creativity. Based on a comparison of the performance of several AI art generators across different text prompts, the project ends with a conclusion and a recommendation for improving future AI art generators.

Introduction

With the spring 2022 release of DALL-E 2, interest in the debate over AI-generated art has heightened. Compared with existing AI art generators that convert text to images, DALL-E 2 can generate more realistic and accurate images from the text input. It can also produce complex artworks from relatively brief text inputs, and it is capable of visually integrating distinct and unrelated objects. While earlier AI generators could only produce crude, low-quality images, DALL-E 2 has reached the state of the art (SOTA), since its products satisfy practically all artistic requirements.

Compared with models based on Generative Adversarial Networks (GANs), DALL-E 2 is a newer model that supplants and even outperforms them. Unlike earlier models that rely mostly on GANs, DALL-E 2 benefits from Contrastive Language-Image Pre-training (CLIP) and diffusion models. CLIP is trained on texts and images in parallel and functions like an encoder, while the diffusion models learn to generate images by noising and denoising the training set and function like a decoder. DALL-E 2's architecture first trains the CLIP model and then uses it to train the diffusion models. Finally, the diffusion models use CLIP to construct text embeddings and generate images corresponding to the text. The most notable benefit of this design is that it does not require a massive amount of manually labeled text-image paired data for training. In other words, the model is unsupervised, or "self-supervised." The self-supervised system can save a substantial amount of human labor. At the same time, the unsupervised construct maximizes creativity and novelty, as the AI may discover surprising outcomes that humans have never observed.

Material and Methodology

Material:

1. Twitter posts of DALL-E 2
As I currently have no access to DALL-E 2, the only available sources of DALL-E 2 output are Twitter posts. This project therefore collects artwork created by DALL-E 2 from Twitter. The selection is limited to art and excludes photographs. I then feed the identical text prompts to the other two AI art generators and evaluate the different generators by comparing their outputs.

2. Hotpot AI Art Maker and 3. Starryai AI Art Generator (Orion)
These are open-source AI art generators that offer fast generation (1-2 min) and better visual quality than other open-source AI art generators.

Methodology:

There is no access to the code underlying these models, so all evaluation is based on the text input and the output images. Every text prompt indicates a certain art style and includes at least one of the following: an identification of the subject or a description of an activity.

The three metrics developed for this project are aesthetic, comprehension and interpretation (C&I), and creativity. The aesthetic metric is a formal analysis of the generated images from the perspective of human art historians; composition, color palette, and lines and shapes are the primary factors. The C&I metric assesses how accurately the generator comprehends and interprets the text prompt in terms of artistic style, subject matter, and iconography. The creativity metric examines the originality with which the formal components of the particular art style are combined with the narrative and iconography.

Results

Comparison with AI Art Generators from Earlier Generation:

- post-modern style: “Remembrance of nostalgia, surrealist painting by Dalí.”

              Hotpot    Starryai    DALL-E 2
Aesthetic     6/10      8/10        9/10
C&I           3/10      6/10        9/10 (closest to Dalí)
Creativity    5/10      6/10        7/10

- pre-modern style: “a hot dog in the style of a renaissance painting.”

              Hotpot    Starryai    DALL-E 2
Aesthetic     2/10      6/10        9/10
C&I           2/10      6/10        7/10 (more Baroque)
Creativity    3/10      5/10        6/10

Comparison with Different Text Prompts Using DALL-E 2:

- in the style of Vermeer
- text prompts from left to right:
  1. “Ai generated 'Robot girl with a pearl earring' by Johannes Vermeer”
  2. "Mother, by Vermeer"
  3. "Good morning, in the style of Vermeer"

              Prompt 1    Prompt 2    Prompt 3
Aesthetic     8/10        9/10        8/10
C&I           8/10        7/10        9/10
Creativity    9/10        9/10        7/10

- DALL-E 2 generates art by combining the most distinctive and recognizable features of the subject and the style. These "features" may include facial characteristics, costumes, hairstyles, makeup, accessories, color palettes, brushstrokes, modeling of light and shadow, compositions, lines and shapes, etc. This raises the question of how DALL-E 2 chooses which feature(s) to combine. When a text prompt includes the name of the style (or the artist's last name, if the style is named after the artist), DALL-E 2 is more likely to select the formal stylistic features. In the case above, when "Vermeer" appears as a style, DALL-E 2 generates work with Vermeer's distinctive sketchy brushstrokes and bluish, cold-toned color palette. The first image, by contrast, does not incorporate Vermeer's painting style.

Conclusion and Recommendation

In this comparison, DALL-E 2 outperforms earlier generations of AI art generators on all three metrics. It can generate images with a high level of aesthetic quality, an accurate interpretation of the text prompt, and some creativity in blending information with style. Nonetheless, the results show that DALL-E 2 has several limitations. First, its spelling ability is relatively poor: when it is asked to generate graphics that contain text, typographical errors are quite probable. Several DALL-E 2 users are also aware of this shortcoming. Second, its comprehension varies by art style. It has a greater understanding of postmodern and contemporary art styles, especially digital art and some cartoon styles linked to popular animations. Third, according to one user report, DALL-E 2 has trouble assigning specific attributes to particular characters. This occurs when a text prompt involves two or more figures and specifies distinct characteristics for each figure. In addition to fundamental characteristics such as gender, DALL-E 2 can easily mix up age, hairstyle and hair color, and clothing. Finally, although DALL-E 2 is strong at analyzing and comprehending subjects, it cannot create satisfactory results when the text prompt contains a novel subject, as stated in the same user report.

The majority of these constraints can be overcome by modifying the parameters of the DALL-E 2 model. For example, the disparity in the amount of accessible digital data for works of art from different eras is the primary cause of the different degrees of art style comprehension. The majority of premodern artworks are easel paintings or sculptures. Their reliance on artistic expertise and lengthy production time restricts their quantity, and many of them have been damaged or destroyed. Postmodern artworks, in this case the digital arts, require less painting or sculpting expertise and less time to execute. There is therefore a disparity in the number of artworks created in different time periods, and it persists in the DALL-E 2 training data. This bias in the training data results in varying levels of art style comprehension. It could be reduced by altering the training parameters so that pre-modern works receive more iterations than post-modern works.

Currently, there are numerous critiques of the ethical issues posed by deepfakes created by AI art generators. However, as several users have pointed out, DALL-E 2 appears to be deliberately limited in its ability to generate photorealistic human faces. Some regard this as one of DALL-E 2's defects. However, DALL-E 2 is capable of producing photorealistic images of objects and non-human animals, so it is more plausible that the limitation is an intentional attempt to prevent the creation of deepfakes. A further worry is that AI art generators may lead to the unemployment of artists, particularly digital artists. DALL-E 2's exceptional creativity can occasionally surpass human intelligence, as it can produce combinations of style and content that humans have never observed. However, rather than eliminating employment, AI art generators are more likely to change it. For instance, AI art generators like DALL-E 2 require domain expertise to improve their performance.

Acknowledgement

Dickson, Ben. “Dall-e 2, the Future of AI Research, and OpenAI's Business Model.” TechTalks, April 11, 2022. https://bdtechtalks.com/2022/04/11/openai-dall-e-2/.

O'Connor, Ryan. “How Dall-e 2 Actually Works.” AssemblyAI Blog, April 22, 2022. https://www.assemblyai.com/blog/how-dall-e-2-actually-works/.

Ramesh, Aditya, Prafulla Dhariwal, Alex Nichol, Casey Chu and Mark Chen. “Hierarchical Text-Conditional Image Generation with CLIP Latents.” ArXiv abs/2204.06125 (2022): n. pag.

Swimmer963. “What Dall-e 2 Can and Cannot Do.” LessWrong, May 1, 2022. https://www.lesswrong.com/posts/uKp6tBFStnsvrot5t/what-dall-e-2-can-and-cannot-do.

Wang, Zihao, Wei Liu, Qian He, Xin-ru Wu and Zili Yi. “CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP.” ArXiv abs/2203.00386 (2022): n. pag.

https://twitter.com/Merzmensch/status/1522277446980091904
https://twitter.com/bakztfuture/status/1517373091034378241
https://twitter.com/Merzmensch/status/1523302450047893506
https://twitter.com/Dalle2Pics/status/1521217219488894977/photo/1
https://twitter.com/Merzmensch/status/1523550836281937921/photo/1
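
The scores reported under Results for the comparison with earlier generators can also be tallied programmatically. The following is a minimal sketch and not part of the original project: it transcribes the scores for the Dalí and renaissance prompts and, as an assumption, averages each metric across the two prompts with equal weights. The names SCORES, mean_scores, and rank_generators are hypothetical and chosen only for illustration.

```python
# Scores transcribed from the two generator-comparison results (each out of 10).
SCORES = {
    "Remembrance of nostalgia, surrealist painting by Dalí.": {
        "Hotpot": {"aesthetic": 6, "c_and_i": 3, "creativity": 5},
        "Starryai": {"aesthetic": 8, "c_and_i": 6, "creativity": 6},
        "DALL-E 2": {"aesthetic": 9, "c_and_i": 9, "creativity": 7},
    },
    "a hot dog in the style of a renaissance painting.": {
        "Hotpot": {"aesthetic": 2, "c_and_i": 2, "creativity": 3},
        "Starryai": {"aesthetic": 6, "c_and_i": 6, "creativity": 5},
        "DALL-E 2": {"aesthetic": 9, "c_and_i": 7, "creativity": 6},
    },
}


def mean_scores(scores):
    """Average each metric per generator across all prompts.

    Equal weighting of prompts is an assumption made for this sketch.
    """
    collected = {}
    for per_generator in scores.values():
        for generator, metrics in per_generator.items():
            for metric, value in metrics.items():
                collected.setdefault(generator, {}).setdefault(metric, []).append(value)
    return {
        generator: {metric: sum(values) / len(values) for metric, values in metrics.items()}
        for generator, metrics in collected.items()
    }


def rank_generators(scores):
    """Order generators by the unweighted mean of their per-metric averages, best first."""
    averages = mean_scores(scores)
    overall = {
        generator: sum(metrics.values()) / len(metrics)
        for generator, metrics in averages.items()
    }
    return sorted(overall, key=overall.get, reverse=True)


if __name__ == "__main__":
    for generator, metrics in mean_scores(SCORES).items():
        print(generator, metrics)
    print("Ranking:", rank_generators(SCORES))
```

Under this equal-weight assumption, DALL-E 2 ranks first on every metric, which is consistent with the conclusion. A different weighting of the three metrics would be an equally defensible choice and could change the margins between generators.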
