
Code Wizards

The document outlines a project proposal for the Smart India Hackathon 2024, focusing on a Conversational Image Recognition Chatbot that utilizes deep learning and large language models for image analysis. The solution combines YOLO for real-time object detection with a generative AI model to create detailed image descriptions, aiming to improve accessibility and automate content management. Challenges include the model's limitations in text detection, with strategies proposed to enhance its capabilities through diverse training datasets and OCR integration.

Uploaded by

hema22050.ec

SMART INDIA HACKATHON 2024

• Problem Statement ID – SIH1604


• Problem Statement Title- Conversational Image Recognition Chatbot

• Theme- Smart Automation

• PS Category- Software

• Team ID-

• Team Name- Code Wizards



Proposed Solution

• A web application is developed that leverages deep learning and large language models for image analysis. YOLO performs real-time object detection and classification, and its output is fed into a generative AI model to produce detailed image descriptions.

• Combining YOLO's detections with a generative AI model addresses the problem by producing accurate, context-aware image descriptions grounded in the detected content.
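
The hand-off from detector to generative model described above can be sketched roughly as follows. This is a minimal illustration and not the team's implementation: the `Detection` class, the helper names, the prompt wording, the `yolov8n.pt` weights file, and the 0.25 confidence threshold are all assumptions, and the call to the generative model itself is left out because the proposal does not name one. The `ultralytics` package is imported lazily so the prompt helpers can be tried without it.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Detection:
    label: str         # class name reported by the detector (hypothetical container)
    confidence: float  # detector score in [0, 1]


def detect_objects(image_path, weights="yolov8n.pt", min_conf=0.25):
    """Run YOLOv8 on one image and return a list of Detection objects.

    Assumes the `ultralytics` package is installed; weights and threshold
    are illustrative defaults, not values taken from the proposal.
    """
    from ultralytics import YOLO  # lazy import: only needed for real inference

    result = YOLO(weights)(image_path)[0]  # one Results object per input image
    detections = []
    for box in result.boxes:
        conf = float(box.conf[0])
        if conf >= min_conf:
            label = result.names[int(box.cls[0])]
            detections.append(Detection(label, conf))
    return detections


def summarize(detections):
    """Collapse detections into a compact count list, sorted by label."""
    counts = Counter(d.label for d in detections)
    return ", ".join(f"{n} x {label}" for label, n in sorted(counts.items()))


def build_prompt(detections, question="Describe this image."):
    """Turn detector output into a text prompt for a generative model."""
    found = summarize(detections) if detections else "no known objects"
    return (
        f"Objects detected in the image: {found}.\n"
        f"User request: {question}\n"
        "Answer using only the detected objects as evidence."
    )
```

In a chatbot setting, `build_prompt(detect_objects(path), question=user_message)` would be rebuilt for each user turn, so follow-up questions stay grounded in the same detections.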

@SIH Idea submission- Template 2


TECHNICAL APPROACH

Tech Stack:

• Generative AI
• YOLO v8
• Streamlit
• COCO Dataset
• Google CoLab



FEASIBILITY AND VIABILITY

• Feasibility: The project is feasible with a model trained to identify 80 object classes, supporting accurate detection and relevant descriptions.

• Challenges: The model cannot detect text in images, which may limit the accuracy and completeness of the generated descriptions.

• Strategies: Train the model on a more diverse dataset and incorporate OCR capabilities to enhance text detection and overall accuracy.
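
One way the OCR strategy could be wired in is sketched below. It assumes Tesseract (via the `pytesseract` and Pillow packages) as the OCR engine, which the proposal does not specify, and the helper names and the noise filter are hypothetical. The idea is to clean the raw OCR output and append it to the object summary before it reaches the generative model.

```python
import re


def extract_text(image_path):
    """Read raw text from an image with Tesseract.

    Assumes `pytesseract`, Pillow, and the Tesseract binary are installed;
    the proposal does not name an OCR engine, so this choice is illustrative.
    """
    from PIL import Image   # lazy imports: only needed for real OCR
    import pytesseract

    return pytesseract.image_to_string(Image.open(image_path))


def clean_ocr_text(raw, min_len=2):
    """Keep alphanumeric tokens of at least `min_len` characters.

    Drops stray symbols and single characters, a common kind of OCR noise.
    """
    tokens = re.findall(r"[A-Za-z0-9]+", raw)
    return " ".join(t for t in tokens if len(t) >= min_len)


def add_text_context(object_summary, raw_ocr):
    """Append cleaned OCR text to an object summary, if any text survives."""
    text = clean_ocr_text(raw_ocr)
    if not text:
        return object_summary
    return f'{object_summary}; visible text: "{text}"'
```

For example, `add_text_context("1 x stop sign", extract_text(path))` would let the description mention what a sign says and not only that a sign is present.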

IMPACT AND BENEFITS

Potential Impact
• Supports education through automated content generation.
• Improves accessibility by identifying and describing images for visually impaired users through text-to-speech features.
• Assists in content management by automating image categorization and description.
• Supports automated reporting and documentation across various industries.



RESEARCH AND REFERENCES

Drive Link

