
Proposal

The document presents a research project focused on Bengali text-to-image generation using deep learning techniques. It outlines the motivation for developing AI models in regional languages, the objectives of the research, and the challenges faced due to the lack of quality datasets. The literature review highlights previous studies and their limitations in relation to Bengali text generation.

Uploaded by

kanizmitu22

IUBAT

Bengali Text to Image Generation

Presented By:
Kaniz Fatema Mitu, ID: 22103111
Sabbir Hossain Hridoy, ID: 22103119

Supervised By:
Suhala Lamia
Assistant Professor
Department of Computer Science

TABLE OF CONTENTS

1 Introduction
2 Literature Review
3 Research Methodology
4 Work Plan
5 Conclusion
6 Reference

Introduction

Text-to-image generation is a technique in which a model generates an image from a given Bengali text by relating that text to examples in an existing dataset.
Bengali text-to-image generation is an advanced NLP and computer vision task that involves creating images from textual descriptions written in Bengali.
It uses deep learning techniques, such as transformers and diffusion models, to bridge the gap between language and vision.

Our research aim is to use machine learning techniques to generate images from a given Bengali text.

Motivation

• Bengali is a widely spoken language but has limited resources in AI-driven generative models.
• The growing demand for AI in creative fields highlights the need for text-to-image models in regional languages.
• Potential applications in content creation, accessibility tools, and automated illustration systems.

Objective

• Gain knowledge of NLP and machine learning.
• Conduct an extensive analysis of previous research papers.
• Develop a deep learning model capable of generating high-quality images from Bengali text descriptions.
• Build a dataset specific to Bengali text-to-image generation.
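
As a rough illustration of the dataset objective above, the sketch below shows one way caption-image pairs might be stored and screened before training. It is not the proposal's actual pipeline: the `CaptionImagePair` record, the `filter_pairs` helper, the file paths, and the 0.8 threshold are all hypothetical. The only external fact it relies on is that the Unicode Bengali block spans U+0980 to U+09FF.

```python
from dataclasses import dataclass

# Unicode range of the Bengali block (U+0980 to U+09FF).
BENGALI_START, BENGALI_END = 0x0980, 0x09FF


def bengali_ratio(text: str) -> float:
    """Return the fraction of non-space characters that fall in the Bengali block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    hits = sum(BENGALI_START <= ord(c) <= BENGALI_END for c in chars)
    return hits / len(chars)


@dataclass(frozen=True)
class CaptionImagePair:
    """One hypothetical training record: a Bengali caption and its image file."""
    caption: str
    image_path: str


def filter_pairs(pairs, min_ratio=0.8):
    """Keep pairs whose caption is mostly Bengali script and whose path is non-empty.

    min_ratio is an assumed threshold, not a value taken from the proposal.
    """
    return [
        p for p in pairs
        if p.image_path and bengali_ratio(p.caption) >= min_ratio
    ]


# Illustrative records only; the paths do not refer to real files.
pairs = [
    CaptionImagePair("একটি লাল ফুল", "images/0001.jpg"),   # "a red flower"
    CaptionImagePair("a red flower", "images/0002.jpg"),   # English caption, dropped
    CaptionImagePair("নদীর ধারে একটি নৌকা", ""),            # "a boat by the river", no image, dropped
]
clean = filter_pairs(pairs)
print(len(clean))
```

A script-ratio check like this only catches captions in the wrong script or with a missing image path; it says nothing about whether a caption actually describes its image, which would still need manual or model-based review.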

Problem Statement

• Generating realistic images from Bengali text is challenging because of the lack of high-quality paired datasets.
• Existing text-to-image models are primarily trained on English datasets, limiting their effectiveness for Bengali users.
• Bengali linguistic structures are complex to model.
• Image generation needs improved semantic understanding of Bengali text descriptions.

Literature Review
Title: A Deep Learning-Based Approach to Image Captioning in Bengali
Authors: Abdul Hady Akash et al.
Published year: 2022
Result: The study proposed a model for Bangla image captioning using ResNet as a feature extractor.
Limitation: The study's focus on image captioning, rather than direct text-to-image generation, may also limit its relevance to certain applications.

Title: Zero-Shot Text-to-Image Generation. Proceedings of the International Conference on Machine Learning (ICML).
Authors: Ramesh et al.
Published year: 2021
Result: Generates high-quality, diverse images from text prompts with zero-shot capabilities.
Limitation: Lacks fine control, exhibits biases, and is computationally expensive, with poor Bengali text support.

Literature Review
Title: Toward Multimodal Image-to-Image Translation. Advances in Neural Information Processing Systems (NeurIPS).
Authors: Zhu et al.
Published year: 2017
Result: Effectively translates images (e.g., sketches to photos) with multimodal outputs.
Limitation: GANs suffer from mode collapse, training instability, and require large datasets, making Bengali adaptation challenging.

Title: Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Authors: Saharia et al.
Published year: 2022
Result: Produces highly realistic and diverse images with better text-image alignment.
Limitation: Slow inference, high computational cost, and limited support for Bengali-language text.

Research Methodology

[Figure: methodology diagram with six boxes, each labeled "Data collection"]

Proposed Methods

Software & Hardware Requirements

Flowchart

Work Plan

Conclusion

Reference

ANY
QUESTIONS?
