PROJECT REPORT
ON
SIGNIFYING IMMEDIATE IMAGE
GENERATION FROM TEXT
Submitted in partial fulfillment for the award of degree of
BACHELOR OF ENGINEERING
in
INFORMATION SCIENCE AND ENGINEERING
Submitted by
DARSHAN SHET 4CB21IS009
NITHISH 4CB21IS030
PRANAV SAVANT 4CB21IS035
SHANMUKHA MADDODI 4CB21IS045
CERTIFICATE
Certified that the project work entitled “SIGNIFYING IMMEDIATE IMAGE
GENERATION FROM TEXT” carried out by
External Viva:
1. . . . . . . . . . . . . . . . . . . . . . .....................
2. . . . . . . . . . . . . . . . . . . . . . .....................
CANARA ENGINEERING COLLEGE
(Affiliated to VTU Belagavi, Recognized by AICTE, Accredited by NBA)
Sudhindra Nagara, Benjanapadavu, Mangaluru - 574219,
Karnataka
DECLARATION
We hereby declare that the entire work embodied in this Project Report titled “SIGNIFYING IMMEDIATE IMAGE GENERATION FROM TEXT” has been carried out by us at CANARA ENGINEERING COLLEGE, Mangaluru under the supervision of Prof. Pradeep M, for the award of Bachelor of Engineering in Information Science And Engineering. This report has not been submitted to this or any other University for the award of any other degree.
Acknowledgement
We dedicate this page to acknowledge and thank those responsible for the shaping of the project. Without their guidance and help, the experience while constructing the dissertation would not have been so smooth and efficient.
We would like to thank all faculty and staff of the Department of Information Science And Engineering who have always been with us, extending their support, precious suggestions, guidance, and encouragement throughout the project. We also express our gratitude to our beloved friends and parents for their constant encouragement and support.
Darshan Shet
Nithish
Pranav Savant
Shanmukha Maddodi
Abstract
Table of Contents
Acknowledgement i
Abstract ii
List of Tables ix
1 Introduction 1
1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Motivation and Problem Statement . . . . . . . . . . . . . 1
1.3 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Scope and Limitations . . . . . . . . . . . . . . . . . . . . 2
1.5 Relevance and Type . . . . . . . . . . . . . . . . . . . . . 3
1.6 Organization of the Report . . . . . . . . . . . . . . . . . . 3
2 Literature Survey 4
2.1 Text-to-Image Synthesis With Generative Models: Methods, Datasets, Performance Metrics, Challenges, and Future Direction . . . 4
2.1.1 Brief Findings . . . . . . . . . . . . . . . . . . . . . 4
2.1.2 Design/Methodology/Techniques Adopted . . . . . 4
2.1.3 Results Achieved . . . . . . . . . . . . . . . . . . . 5
2.2 Recent Advances in Text-to-Image Synthesis: Approaches, Datasets, and Future Research Prospects . . . 5
2.2.1 Brief Findings . . . . . . . . . . . . . . . . . . . . . 5
2.2.2 Design/Methodology/Techniques . . . . . . . . . . 6
2.2.3 Results Achieved . . . . . . . . . . . . . . . . . . . 6
2.3 GACnet-Text-to-Image Synthesis With Generative Models Using Attention Mechanisms With Contrastive Learning . . . 6
2.3.1 Brief Findings . . . . . . . . . . . . . . . . . . . . . 6
2.3.2 Design/Methodology/Techniques Adopted in Article n . . . 7
2.3.3 Results Achieved . . . . . . . . . . . . . . . . . . . 7
2.4 Exploring Progress in Text-to-Image Synthesis: An In-Depth Survey on the Evolution of Generative Adversarial Networks . . . 7
2.4.1 Brief Findings . . . . . . . . . . . . . . . . . . . . 8
2.4.2 Design/Methodology/Techniques Adopted . . . . . 8
2.4.3 Results Achieved . . . . . . . . . . . . . . . . . . . 8
2.5 BigGan-based Bayesian reconstruction of natural images from human brain activity . . . 9
2.5.1 Brief Findings . . . . . . . . . . . . . . . . . . . . . 9
2.5.2 Design/Methodology/Techniques Adopted . . . . . 9
2.5.3 Results Achieved . . . . . . . . . . . . . . . . . . . 9
2.6 Use mean field theory to train a 200-layer vanilla GAN . . 10
2.6.1 Brief Findings . . . . . . . . . . . . . . . . . . . . 10
2.6.2 Design/Methodology/Techniques Adopted . . . . . 10
2.6.3 Results Achieved . . . . . . . . . . . . . . . . . . . 10
2.7 High-Resolution Image Synthesis with Latent Diffusion Models . . . 11
2.7.1 Brief Findings . . . . . . . . . . . . . . . . . . . . . 11
2.7.2 Design/Methodology/Techniques Adopted . . . . . 11
2.7.3 Results Achieved . . . . . . . . . . . . . . . . . . . 11
2.8 Antenna Design Using a GAN-Based Synthetic Data Generation Approach . . . 12
2.8.1 Brief Findings . . . . . . . . . . . . . . . . . . . . 12
2.8.2 Design/Methodology/Techniques Adopted . . . . . 12
2.8.3 Results Achieved . . . . . . . . . . . . . . . . . . . 12
2.9 Text-to-Image Generator using GANs . . . . . . . . . . . 13
2.9.1 Brief Findings . . . . . . . . . . . . . . . . . . . . . 13
2.9.2 Design/Methodology/Techniques Adopted . . . . . 13
2.9.3 Results Achieved . . . . . . . . . . . . . . . . . . . 13
2.10 A 28.6 mJ/iter Stable Diffusion Processor for Text-to-Image Generation with Patch Similarity-based Sparsity Augmentation and Text-based Mixed-Precision . . . 14
2.10.1 Brief Findings . . . . . . . . . . . . . . . . . . . . . 14
2.10.2 Design/Methodology/Techniques Adopted . . . . . 14
2.10.3 Results Achieved . . . . . . . . . . . . . . . . . . . 14
2.11 Comparison Table . . . . . . . . . . . . . . . . . . . . . . . 15
2.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4 System Design 21
4.1 Abstract Design . . . . . . . . . . . . . . . . . . . . . . . . 21
4.1.1 Architectural diagram . . . . . . . . . . . . . . . . 21
4.2 Proposed system . . . . . . . . . . . . . . . . . . . . . . . 23
4.3 Functional Design . . . . . . . . . . . . . . . . . . . . . . . 24
4.3.1 Modular design diagram . . . . . . . . . . . . . . . 24
4.3.2 Sequence diagram . . . . . . . . . . . . . . . . . . 25
4.3.3 Use case diagram . . . . . . . . . . . . . . . . . . . 26
4.4 Control Flow Design . . . . . . . . . . . . . . . . . . . . . 27
4.4.1 Activity diagram for use cases . . . . . . . . . . . 27
4.5 Data Flow Diagram . . . . . . . . . . . . . . . . . . . . . . 28
4.5.1 Zero-Level Data Flow Diagram . . . . . . . . . . . 28
4.5.2 First-Level Data Flow Diagram . . . . . . . . . . . 29
4.5.3 Second-Level Data Flow Diagram . . . . . . . . . . 30
4.5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . 31
5 Implementation 32
5.1 Software Used with Justification . . . . . . . . . . . . . . . 32
5.1.1 Frontend Development . . . . . . . . . . . . . . . . 32
5.1.2 Backend Development . . . . . . . . . . . . . . . . 33
5.1.3 Framework Used . . . . . . . . . . . . . . . . . . . 33
5.1.4 Coding Languages Used for Development . . . . . . 34
5.1.5 Operating System . . . . . . . . . . . . . . . . . . . 34
5.2 Hardware Used with Justification . . . . . . . . . . . . . . 35
5.3 Algorithm/Procedures Used in the Project in Different Modules . . . 35
5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
References 47
B Expo Details 49
List of Figures
List of Tables
Chapter 1
Introduction
1.1 Background
Interior design has traditionally been a creative and manual process re-
quiring expertise to conceptualize and create living spaces. With advance-
ments in artificial intelligence (AI), particularly Generative Adversarial
Networks (GANs), the interior design process has seen a transformation.
AI now enables automated visualizations of interior spaces based on text
descriptions, significantly reducing time and effort while enhancing cre-
ativity. This project focuses on developing a Text-To-Image Generator
that uses AI to convert textual descriptions into 2D interior design im-
ages. Users can describe the desired room style, color palette, and furni-
ture preferences, and the system generates a visual representation. This
approach aims to provide an easy-to-use solution for those who lack design
expertise, allowing them to visualize their ideal living spaces.
Signifying Immediate Image Generation from Text Chapter 1
1.2 Motivation and Problem Statement
The key problem this project addresses is the lack of accessible and affordable tools for visualizing interior designs quickly and efficiently, particularly for those with no design experience.
1.3 Objectives
• Use MongoDB to store and retrieve user data and design history.
1.4 Scope and Limitations
This project aims to generate 2D interior designs for a limited set of room types (living room, bedroom, kitchen, and office) and a select group of design styles (modern, minimalist, industrial, etc.). The scope is confined to generating visual concepts and providing basic cost estimates based on pre-set material and furniture prices. While this approach is designed to assist users in initial design visualization, the system does not aim to offer detailed architectural planning or 3D modeling. The limitations of the project include the potential lack of accuracy in the cost estimation, which is based on general pricing datasets that may not reflect individual market variations. Additionally, while the AI model can generate realistic designs, its ability to understand complex or highly specific descriptions may be limited.
1.6 Organization of the Report
The report is organized into structured sections that guide the reader through the various components of the project. First, the Literature Review examines existing studies on text-to-image synthesis and related generative models, providing context and identifying gaps in current research. The Methodology section outlines the machine learning models, data sources, and techniques employed to develop the generation system, while Implementation details the technical aspects of building and integrating these models. Following the methodology, the Results and Analysis section presents the performance of the system, including the accuracy and effectiveness of the generated designs. The Conclusion and Future Work section summarizes the project's findings and identifies areas for improvement, including plans for enhancing adaptability and computational efficiency. This structure is designed to ensure a comprehensive understanding of the project's objectives, processes, and potential impact, paving the way for continued development and application in real-world settings.
Chapter 2
Literature Survey
The study concludes that while significant progress has been made, current models often struggle with complex or abstract textual inputs. It identifies promising directions, such as integrating multimodal learning and advanced semantic embeddings, to improve synthesis accuracy. The findings suggest that achieving human-level realism and diversity in generated images requires addressing limitations in training data and computational efficiency.
Authors: Yong Xuan Tan, Chin Poo Lee, Mai Neo, Kian Ming Lim, Jit Yan Lim, Ali Alqahtani [2]
2.2.2 Design/Methodology/Techniques
The paper concludes that while the field has made significant progress in the realism and diversity of generated images, there is room for improvement in generating fine-grained details and achieving better alignment between text descriptions and visual outputs. The authors also propose future directions, such as leveraging hybrid architectures and improving dataset diversity, to address these challenges and enable broader applications of text-to-image synthesis.
Authors: Md. Ahsan Habib, Md. Anwar Hussen Wadud, Md. Fazlul Karim Patwary, Mohammad Motiur Rahman, M. F. Mridha [4]
The research highlights that incorporating MFT into the GAN framework allows for a more robust training process, enabling the model to learn higher-dimensional data representations. The study revealed that the proposed approach reduced gradient explosion and vanishing problems common in deep models, resulting in improved convergence and better generative performance compared to standard training approaches.
The authors applied mean field theory (MFT), a concept originating from statistical physics, to stabilize the training of vanilla GANs, which often struggle with deep architectures due to vanishing gradients and mode collapse. By leveraging MFT, they approximated the interactions among neural network units, effectively mitigating instability in backpropagation. The methodology included mathematical analysis to validate MFT's suitability and extensive implementation experiments with multiple datasets, demonstrating how deeper GANs can be trained without significant loss of stability.
Results from the study highlight the versatility of LDMs in multiple domains, including text-to-image synthesis, semantic scene generation, and image inpainting. These models achieve state-of-the-art performance while using fewer computational resources compared to pixel-based diffusion methods. The work emphasizes not only the effectiveness of latent diffusion but also its practical implications for resource-constrained training environments.
2.12 Summary
The project focuses on generating images from textual prompts using advanced deep learning techniques. The frontend is built with React, offering an intuitive interface with options for user registration and login. Once authenticated, users are redirected to a Streamlit-based interface where they can input prompts to generate images. The backend uses a GAN (Generative Adversarial Network) model to process the prompts, with Flask acting as the intermediary for communication. User details, such as login credentials, are securely stored in a MongoDB database. This workflow ensures a seamless experience for users while maintaining performance, modularity, and scalability.
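As a sketch of this workflow, a minimal Flask endpoint that receives a prompt from the frontend and returns a reference to a generated image might look as follows. The route name, the `generate_image` stub, and the response fields are illustrative assumptions, not the project's actual code; the stub stands in for the real GAN inference and MongoDB steps.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def generate_image(prompt):
    # Placeholder for the GAN inference step; the real system would run
    # the trained generator here and return the produced image's path.
    return f"generated/{abs(hash(prompt)) % 10000}.png"

@app.route("/generate", methods=["POST"])
def generate():
    # The React/Streamlit frontend would POST {"prompt": "..."} here.
    data = request.get_json(force=True)
    prompt = (data or {}).get("prompt", "").strip()
    if not prompt:
        return jsonify({"error": "prompt is required"}), 400
    return jsonify({"prompt": prompt, "image": generate_image(prompt)}), 200
```

In the full system, the result would also be stored alongside the user's design history in MongoDB before being returned to the interface.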
• User Input: The system must allow users to provide input via a description of the desired room design, including style, room type, and color palette.
• Image Output: The application must generate and display the design
concept based on user specifications and allow for image download.
• Secure User Access: Users should have a secure login system, ensuring
privacy and data protection.
• Security: The system must ensure secure access to user accounts and safeguard sensitive data, such as design preferences and personal details.
Safety requirements aim to protect user data and ensure the integrity of the
system. The platform should implement strong authentication protocols
to ensure that only authorized users can access their accounts. User data
(such as preferences and generated designs) should be securely stored and
transmitted using encryption. Additionally, regular security audits and
vulnerability scans should be conducted to identify and address potential
threats.
To meet the performance expectations of the users, the platform must ensure that design generation happens in a reasonable time frame (within 5 seconds per design) to maintain user satisfaction. The platform should be capable of supporting at least 500 concurrent users, ensuring that users don't experience delays or timeouts. Optimized server configurations, database queries, and efficient backend processes will contribute to achieving this requirement.
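One way to verify the 5-second target during testing is a simple latency guard around the generation call. The sketch below is an assumption about how such a check could be written; `fake_generate` is a stand-in for the real model call, not the project's code.

```python
import time

def fake_generate(prompt):
    # Stand-in for the real GAN inference call.
    time.sleep(0.01)
    return f"image-for-{prompt}"

def timed_generate(prompt, budget_seconds=5.0):
    """Run generation and report whether it met the latency budget."""
    start = time.perf_counter()
    image = fake_generate(prompt)
    elapsed = time.perf_counter() - start
    return image, elapsed, elapsed <= budget_seconds
```

A performance test can then assert the third return value for a batch of representative prompts.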
The user interface (UI) is crucial for providing an optimal user experience.
The platform will feature a simple, clean, and intuitive design. Users will
interact with the platform by entering design preferences through forms
or sliders, selecting room types, and receiving generated designs. The UI
will be responsive, ensuring that the platform is accessible across various
devices, including desktops, tablets, and smartphones.
The platform must handle high traffic during peak hours without performance degradation. Key performance metrics include:
• Load Time: The platform’s homepage and design input page should
load within 3-5 seconds.
3.7 Summary
Chapter 4
System Design
The functional or modular design of our project breaks down the system
into distinct modules, each responsible for a specific functionality. The user
interface module, built with React and Streamlit, handles user interaction
and input. The backend module, powered by Flask, processes the prompts
and integrates seamlessly with the GAN model for image generation. A
dedicated data management module uses MongoDB to store and retrieve
user data. This modular approach ensures clear separation of concerns and ease of maintenance, making the system efficient and user-friendly.
The use case diagram highlights the key interactions between the user and the system components. The user interacts with the system to perform various actions, such as signing in or signing up through the React frontend, submitting prompts via the Streamlit interface, and viewing the generated images. The backend, powered by Flask, processes these prompts and communicates with the GAN model to generate images while storing all relevant data in MongoDB. Each use case is linked to specific actions, ensuring clarity in the user's journey through the system. It provides a high-level overview of the system's functionality.
The first-level Data Flow Diagram (DFD) further decomposes the high-level system into distinct processes, each responsible for specific tasks. The User interacts with the system by first logging in or signing up, where their credentials are processed and authenticated. After successful authentication, the user submits a prompt to the Prompt Submission Process, which sends the prompt to the Processing System. This system interacts with the GAN Model to generate an image based on the user's input. The user details are stored in MongoDB through the Storage System. The first-level DFD highlights the core processes and their interdependencies in the system.
The Second-Level Data Flow Diagram (DFD) dives deeper into the individual processes, providing a more granular view of the system's operations. In this diagram, the Login/Signup Process is broken down into two sub-processes: Authentication and Account Creation. Once the user is authenticated, the Prompt Submission Process is split into Prompt Validation and Prompt Forwarding. The Processing System consists of two sub-processes: GAN Model Interaction and Image Generation Confirmation. The Storage System stores User Data.
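The Prompt Validation sub-process can be sketched as a small helper like the one below. The length limits and messages are illustrative assumptions, not values fixed by the design.

```python
def validate_prompt(prompt, min_len=5, max_len=300):
    """Return (is_valid, message) for a user-submitted design prompt."""
    text = (prompt or "").strip()
    if len(text) < min_len:
        return False, "Prompt is too short to describe a design."
    if len(text) > max_len:
        return False, "Prompt exceeds the maximum allowed length."
    return True, "OK"
```

Only prompts that pass this check would be handed to Prompt Forwarding and on to the GAN model.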
4.5.4 Summary
Chapter 5
Implementation
PyTorch: To develop and deploy deep learning models, PyTorch was utilized for its dynamic computation graph, which allows for greater flexibility during model development and debugging. Its intuitive, Pythonic interface makes it easy to experiment with various architectures and algorithms. PyTorch's extensive support for GPU acceleration and its seamless integration with other Python libraries, such as NumPy and pandas, ensured efficient computation and scalability. Additionally, its robust ecosystem for model training, optimization, and deployment streamlined the development of high-performance machine learning models.
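As an illustration only (the report does not list the actual network architecture), a minimal PyTorch generator that maps a text embedding plus a noise vector to an image tensor could be sketched as follows; the layer sizes and the 3×64×64 output resolution are assumptions.

```python
import torch
import torch.nn as nn

class TextToImageGenerator(nn.Module):
    """Toy GAN generator: concatenates a text embedding with noise
    and maps the result to a 3x64x64 image tensor."""

    def __init__(self, text_dim=128, noise_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(text_dim + noise_dim, 512),
            nn.ReLU(),
            nn.Linear(512, 3 * 64 * 64),
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, text_embedding, noise):
        x = torch.cat([text_embedding, noise], dim=1)
        return self.net(x).view(-1, 3, 64, 64)
```

A real conditional GAN would typically use transposed convolutions and a paired discriminator, but this linear sketch shows the text-plus-noise conditioning idea.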
5.4 Summary
This chapter outlines the key components and tools used in the development of the Signifying Immediate Image Generation from Text project, highlighting the rationale behind their selection. The software stack includes React for creating dynamic and responsive user interfaces, Streamlit for real-time interactive dashboards, Node.js for efficient server-side operations, and FastAPI for building high-performance APIs. PyTorch was utilized for developing and deploying deep learning models, while Flask served as the lightweight framework for backend integration. Python, with its extensive library support and simplicity, was the primary programming language, complemented by the Windows operating system for development and testing. From a hardware perspective, standard computing devices with multi-core processors and at least 8 GB of RAM sufficed, while the application was deployed on Vercel to eliminate the need for dedicated infrastructure. The project architecture is modular, incorporating text-to-image generation using GANs, user interaction via React and Streamlit, backend processing with Flask, and efficient data management using MongoDB. These components work together seamlessly to transform textual descriptions into high-quality, visually appealing 2D designs.
6.1 Results
Figure 6.1 shows the Home Page of the application, which acts as a welcoming gateway for users, offering a clean and minimalistic design. Its responsive layout ensures compatibility with all devices, providing an accessible and seamless user experience. During user testing, the interface received positive feedback for its simplicity and ease of navigation. Users could easily access options to sign up, log in, or learn more about the system.
Figure 6.2 shows the Sign-Up Page, which facilitates the secure creation of user accounts. It includes fields to input a username, email, and password, all of which are validated to ensure accuracy. Additionally, user credentials are encrypted, prioritizing data security. Once the sign-up process is complete, users are seamlessly redirected to the Login Page.
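Protecting credentials at rest usually means hashing rather than reversible encryption. As a hedged sketch using only the Python standard library (the iteration count and salt size are common defaults, not values taken from the project), signup could store passwords like this:

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=200_000):
    """Derive a PBKDF2-HMAC-SHA256 digest; returns (salt, digest)."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password, salt, digest, iterations=200_000):
    """Recompute the hash and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, digest)
```

The salt and digest, rather than the plaintext password, would then be stored in the MongoDB user record.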
Figure 6.3 shows the Login Page, which allows registered users to access their accounts securely. It features fields for email and password input, ensuring smooth authentication. Incorrect login attempts prompt error messages, guiding users to resolve the issue. Upon successful login, users are directed to the application's main functionality, where they can provide text prompts for interior designs.
Figure 6.4 shows how, once logged in, users interact with the system by providing textual descriptions of their desired interior designs. For instance, a user might enter the prompt, “A cozy bedroom with wooden flooring, natural lighting, and a queen-sized bed.” The system processes this input and generates a 2D design that aligns closely with the provided description. Feedback mechanisms are integrated to guide users in refining their prompts for optimal results.
Figure 6.5 shows the generated images, which are the highlight of the application. Based on the text input, the system leverages a custom GAN model to create detailed 2D designs. For example, a prompt like “A modern living room with a grey sofa, a glass coffee table, and indoor plants” produced a visually appealing and accurate representation of the description. Testing showed that the system efficiently translated textual inputs into designs, receiving praise for its intuitive output. The generated designs are downloadable, allowing users to save or integrate them into other design projects.
6.2 Testing
For our project, the testing process ensures the system's usability, accuracy, and functionality. It involves running tests to identify errors and validate the performance of each module by simulating various user inputs and scenarios. The primary goal of testing is to confirm the system's ability to interpret textual descriptions accurately and generate corresponding 2D interior designs.
The testing success depends on well-structured test cases that cover
diverse room types, material descriptions, and user interactions. Each test
case includes inputs (e.g., text descriptions of rooms or specific furniture),
• White Box Testing: This type of testing focuses on how the system works on the inside. It involves checking the code, algorithms, and overall logic to ensure everything runs smoothly and efficiently. For our project, we test how well the system processes text descriptions, how accurately the GAN model creates images, and whether the application handles different kinds of inputs effectively. For example, we check if the text processing part can understand complex sentences and if the GAN model generates designs that match the key features described. This helps ensure the system produces high-quality results.
Unit testing verified each component in isolation, particularly its handling of user inputs. For example, we verified whether the system could understand a description like “A modern living room with a grey sofa” and if the GAN generated a suitable design. This way, we identified and fixed any bugs in the individual parts before combining them into the full system.
After making sure each part worked individually, we tested how well they
worked together. Integration testing ensured that all components, like
the text analysis module, the GAN model, and the user interface, could
communicate and function as a cohesive system. For instance, we checked
if a user input was passed correctly from the frontend to the GAN model
and if the generated image was displayed back to the user without errors.
This testing step helped us catch any issues in the interaction between
different modules.
System testing involved testing the entire project as a whole, under real-life conditions. We made sure all parts of the system, including the text analysis, image generation, and user interface, worked seamlessly together. For example, we tested the system with a variety of text descriptions to see if it consistently generated accurate 2D designs and handled unexpected inputs gracefully. This stage ensured that the application could provide reliable results in a real-world environment. System testing consists of the following steps:
• End-to-End Testing: The entire system was tested to ensure all components worked together without any issues, from the user's input to the generated design.
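The unit tests described above can be sketched with Python's built-in `unittest` framework; `validate_prompt` here is a hypothetical stand-in for the project's prompt-handling module, not its actual code.

```python
import unittest

def validate_prompt(prompt):
    # Hypothetical stand-in for the project's prompt-handling logic:
    # accept only non-empty prompts of a minimal descriptive length.
    text = (prompt or "").strip()
    return len(text) >= 5

class PromptHandlingTests(unittest.TestCase):
    def test_accepts_descriptive_prompt(self):
        self.assertTrue(validate_prompt("A modern living room with a grey sofa"))

    def test_rejects_empty_prompt(self):
        self.assertFalse(validate_prompt(""))
```

Integration and end-to-end tests would extend this pattern by driving the Flask endpoint and checking that a generated image reference comes back for each valid prompt.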
6.6 Summary
References
Appendix A
Appendix B
Expo Details
Figure B.1 shows our team successfully presenting the project at “Nirmaan 2024”, as part of the “Innovations Showcase” college-level project exhibition and competition at Canara Engineering College on 10th December 2024.