PU IntelliExtract CS - For Project Synopsis
-------------------------
CERTIFICATE
-------------------------
-------------------------
DECLARATION
-------------------------
To the best of our knowledge this project synopsis has not been submitted
to Veer Bahadur Singh Purvanchal University, Jaunpur (U.P.) or any
other University or Institute for the award of any other degree.
CSE, UNSIET, Veer Bahadur Singh Purvanchal University, Jaunpur Page iii
INTELLIEXTRACT: AI-Based Video Text Extraction System
ABSTRACT
This paper introduces IntelliExtract, a video text extraction system
designed to accurately capture and extract text from video frames in real
time. Using machine learning algorithms and computer vision techniques,
IntelliExtract processes diverse video formats and environments to detect
and extract both printed and handwritten text from video streams. The
system provides an intuitive user interface that lets users upload videos,
preview the text extraction process, and retrieve results efficiently.
Keywords - Key Frame, Frame Selection, Frame Extraction, Video Indexing,
Keyword Selection, Keyword Indexing, Content Retrieval, Text Extraction,
Text Recognition, Detection, Binarization, Edge, Connected Component.
ACKNOWLEDGEMENT
This work is not an individual contribution. We take this opportunity to
express our deep gratitude to our teachers for their excellent guidance,
encouragement, and inspiration throughout the training work; without their
invaluable guidance this work would never have been successful. We would
like to express our deepest appreciation to our Project Guide, Mr. Dileep
Kumar Yadav. Finally, we express our sincere and heartfelt gratitude to our
HOD, Dr. Vikrant Bhateja, and to all the teachers of the Computer Science &
Engineering Department who helped us directly or indirectly during the
course of this work.
TABLE OF CONTENTS
Certificate
Declaration
Abstract
Acknowledgement
Table of Contents
List of Figures
1. Introduction
1.1 Overview
1.2 Background
1.3 Stages of Text Extraction
1.4 Project Concept
2. Review of Related Work
2.1 Methods for Key Frame Selection
2.2 Methods for Text Extraction
2.3 Inferences
3. Problem Definition
3.1 Motivation
3.2 Aim of the Project
3.3 Project Objectives
4. Proposed Design Methodology
5. Hardware/Software Requirements & Specifications
6. Applications of Proposed Project
Appendix 'A': List of Abbreviations Used
Appendix 'B': List of Common Symbols Used
References & Bibliography
LIST OF FIGURES
CHAPTER 1
INTRODUCTION
Videos contain different types of text objects. These objects carry information
about the video: a university logo, for example, gives the university's name, and
other on-screen text describes the content of the video. This is why text extraction
is important for video indexing and information retrieval. In this report we do
exactly that and return the text present in the indexed video in the order of its
appearance.
1.1 OVERVIEW
This project discusses methods for extracting text from videos, the tools they use,
and the accuracy each method achieves. We are currently developing tools for
indexing video archives for later reuse: a system for content analysis of videos in
which the appearance of text varies. Such a system depends on efficient
computational support that combines indexed image and video analysis with
processing tools. Text extraction is developing rapidly: hundreds of researchers
are working on the problem and many research papers have been published. In this
project we concentrate mainly on the text extraction approaches for videos proposed
in the most recent 5 years and on how to obtain accurate text from videos, and we
summarize and discuss the recent progress in this research area.
1.2 BACKGROUND
In recent years the number of videos available on the internet, especially on
YouTube, has grown rapidly. Text extraction is used to search for important
information in video data sets. From the extracted text anybody can get an idea of
what a video is about. The extracted text plays an important role as a key cue for
categorizing videos and is also used to determine their content. Video text
extraction is recognized as one of the key components of a video analysis and
retrieval system. It can be used in many applications, such as multilingual video
information access, semantic video indexing, and video security and surveillance.
Text in a video usually persists for at least a few seconds so that human viewers
can read and understand it easily.
CHAPTER 2
REVIEW OF RELATED WORK
2.1 LITERATURE SURVEY
Extracting relevant information from the frames of an indexed video is an active
research topic on which many papers are being published, and the work continues.
Although the subject is tedious and complex, its wide usefulness has kept it a hot
topic for many years. The research papers published on the subject have been
thoroughly analysed and referred to for further understanding. The techniques
mentioned in those papers are explained in subsequent parts of this project
research. As we move ahead, we discuss the different phases of the project.
several video formats, such as f4v, flv and mp4. To improve the generality of the
video key frame extraction algorithm, the present method does not consider the
specific format or the structure of the video stream; the video is decoded first
and then decomposed into frames. The program that extracts key frames is divided
into two steps.
Figure 2.1
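The two steps are not spelled out in the text above. One plausible reading is
(1) decode the video into frames, regardless of container format, and (2) keep only
the frames that differ enough from the last key frame. The sketch below illustrates
step 2 with a simple frame-difference (FD) rule. It assumes frames are already
decoded into grayscale pixel grids; the function names, the nested-list frame
representation and the threshold value are illustrative assumptions, not part of
the project.

```python
from typing import List, Sequence

# Assumption: a frame is a grayscale grid, i.e. rows of 0-255 pixel values.
Frame = Sequence[Sequence[int]]


def mean_abs_difference(a: Frame, b: Frame) -> float:
    """Average absolute pixel difference between two same-sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count if count else 0.0


def select_key_frames(frames: Sequence[Frame], threshold: float = 20.0) -> List[int]:
    """Return indices of frames that differ enough from the last key frame.

    Step 1 (decoding) is assumed to have produced `frames` already.
    Step 2 keeps the first frame, then every frame whose mean absolute
    difference from the most recent key frame exceeds `threshold`.
    """
    key_indices: List[int] = []
    last_key = None
    for index, frame in enumerate(frames):
        if last_key is None or mean_abs_difference(frame, last_key) > threshold:
            key_indices.append(index)
            last_key = frame
    return key_indices


# Tiny synthetic "video": two dark frames followed by two bright frames.
dark = [[10, 10], [10, 10]]
dark_noisy = [[12, 9], [10, 11]]
bright = [[200, 200], [200, 200]]
demo_frames = [dark, dark_noisy, bright, bright]
demo_keys = select_key_frames(demo_frames, threshold=20.0)
print(demo_keys)  # [0, 2]
```

In a real pipeline the frames would come from a video decoder rather than
hand-written lists, and the threshold would need tuning per dataset.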
2.4 INFERENCES
Building the INTELLIEXTRACT model requires thoughtful selection of tools (like
OpenCV and Tesseract), effective preprocessing (e.g., adjusting contrast in video
frames), and a robust model architecture such as combining EAST for text detection
and CRNN for recognition. Key considerations include handling different text
orientations, optimizing processing speed by filtering frames without text, and
training on diverse datasets for fonts and languages. To achieve accuracy and
efficiency, advanced preprocessing techniques and custom datasets are essential,
particularly for domain-specific needs. By leveraging batch processing and cloud
resources, the model can be scaled for large video datasets, making IntelliExtract
adaptable for real-world applications.
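The detect-then-recognize structure and the "filter frames without text"
optimization can be sketched as follows. This is a minimal outline under stated
assumptions: `detect` stands in for a text detector such as EAST and `recognize`
for a recognizer such as CRNN or Tesseract, but both are passed in as plain
callables, and the stub implementations at the bottom are purely hypothetical
placeholders used to exercise the control flow.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Assumption: a text region is an (x, y, width, height) box.
Box = Tuple[int, int, int, int]


def extract_text(
    frames: Iterable[object],
    detect: Callable[[object], List[Box]],
    recognize: Callable[[object, Box], str],
) -> Iterator[Tuple[int, List[str]]]:
    """Yield (frame_index, texts) only for frames where text was detected."""
    for index, frame in enumerate(frames):
        boxes = detect(frame)
        if not boxes:
            # No text regions: skip the (usually more expensive) recognition step.
            continue
        yield index, [recognize(frame, box) for box in boxes]


# Hypothetical stubs: each "frame" is just the list of strings it displays.
def stub_detect(frame: List[str]) -> List[Box]:
    """Pretend every string occupies one 10-pixel-high line."""
    return [(0, line * 10, 100, 10) for line in range(len(frame))]


def stub_recognize(frame: List[str], box: Box) -> str:
    """Map a box back to the string on that line."""
    return frame[box[1] // 10]


demo_video = [[], ["Lecture 1"], [], ["Agenda", "Q&A"]]
demo_result = list(extract_text(demo_video, stub_detect, stub_recognize))
print(demo_result)  # [(1, ['Lecture 1']), (3, ['Agenda', 'Q&A'])]
```

Keeping the detector and recognizer as parameters means either stage can be
swapped (for example, to try a different OCR engine) without touching the loop.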
CHAPTER 3
PROBLEM DEFINITION
3.1 MOTIVATION
There are different methods for extracting text from videos. These methods target
specific applications, including page segmentation, license plate location and
content-based video indexing. A study of such methods shows that designing a
general text information extraction (TIE) system is not an easy task. Videos vary
in background complexity and in the font size, color, style, alignment and
brightness of the text, which makes a TIE system hard to design. These variations
play an important role in causing an automatic TIE system to fail. Studying
different text extraction methods and analyzing their evaluation results and
performance evaluation approaches raises questions such as: Which text extraction
method is better? Why does the performance of different methods vary across
different types of dataset? Which types of error occur at the time of indexing?
These questions help to develop new ideas for improving the extraction technology
and specific algorithms.
CHAPTER 4
PROPOSED DESIGN METHODOLOGY
The main goal of this methodology is an approach to automated video indexing and
video search in video lecture archives. The methodology further aims to apply
automatic video segmentation and key-frame detection to offer a visual guideline
for extracting the video content in the order of its appearance in the video.
Textual metadata is extracted by applying video Optical Character Recognition
(OCR) technology to the key-frames.
Figure 4.1
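As an illustration of the indexing step, the sketch below takes OCR output from
key-frames and builds two structures: a timeline of text in order of appearance,
and a keyword index that maps each word to the timestamps where it appears, which
is what a video search would query. The function name, the (timestamp, text) input
format and the rule of dropping text that repeats the previous key-frame are
assumptions made for this example only.

```python
from typing import Dict, Iterable, List, Tuple


def build_text_index(
    key_frame_texts: Iterable[Tuple[float, str]],
) -> Tuple[List[Tuple[float, str]], Dict[str, List[float]]]:
    """Build an appearance-ordered timeline and a keyword -> timestamps index.

    `key_frame_texts` holds (timestamp_seconds, recognized_text) pairs, one per
    key-frame. Text identical to the previous entry is dropped, so each timeline
    entry marks where a run of that text starts.
    """
    timeline: List[Tuple[float, str]] = []
    keyword_index: Dict[str, List[float]] = {}
    previous_text = None
    for timestamp, text in sorted(key_frame_texts, key=lambda pair: pair[0]):
        normalized = " ".join(text.split())  # collapse stray whitespace from OCR
        if not normalized or normalized == previous_text:
            continue
        previous_text = normalized
        timeline.append((timestamp, normalized))
        for word in sorted(set(normalized.lower().split())):
            keyword_index.setdefault(word, []).append(timestamp)
    return timeline, keyword_index


# Hypothetical OCR output, deliberately out of order and with a repeated slide.
ocr_output = [
    (12.0, "Sorting  Algorithms"),
    (0.0, "Lecture 3"),
    (15.0, "Sorting Algorithms"),
    (40.0, "Merge sorting"),
]
timeline, keyword_index = build_text_index(ocr_output)
print(timeline)
print(keyword_index["sorting"])  # [12.0, 40.0]
```

Looking up a keyword then returns the positions in the video to jump to, while the
timeline gives the text in the order it appears.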
In cases where classes are not linearly separable, SVM can use a kernel trick to
transform the data into a higher-dimensional space, where it becomes easier to draw a
separating hyperplane. Common kernel functions include linear, polynomial, and
radial basis function (RBF). SVM is known for its effectiveness in high-dimensional
spaces and its ability to work well even with a limited number of samples, making it
ideal for applications like image recognition, text classification, and bioinformatics.
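A small numeric check can make the kernel trick concrete. The sketch below defines
the three kernels named above and shows, for 2-D inputs, that the degree-2
polynomial kernel equals an ordinary dot product after an explicit mapping into a
3-D feature space. It then uses the XOR pattern, which no straight line separates
in 2-D, to show that one coordinate of the mapped points separates the classes.
The function names and the gamma default are illustrative choices, not values from
the project.

```python
import math
from typing import Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def linear_kernel(x: Sequence[float], z: Sequence[float]) -> float:
    return dot(x, z)


def polynomial_kernel(x: Sequence[float], z: Sequence[float],
                      degree: int = 2, coef0: float = 0.0) -> float:
    return (dot(x, z) + coef0) ** degree


def rbf_kernel(x: Sequence[float], z: Sequence[float], gamma: float = 0.5) -> float:
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def quadratic_feature_map(x: Sequence[float]) -> Tuple[float, float, float]:
    """Explicit feature map matching polynomial_kernel(degree=2, coef0=0) in 2-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


# The kernel computes the feature-space dot product without building the map.
x, z = (1.0, 2.0), (3.0, -1.0)
kernel_value = polynomial_kernel(x, z)
mapped_value = dot(quadratic_feature_map(x), quadratic_feature_map(z))
print(math.isclose(kernel_value, mapped_value))  # True

# XOR-style data: not linearly separable in 2-D.
xor_points = [((1.0, 1.0), +1), ((-1.0, -1.0), +1),
              ((1.0, -1.0), -1), ((-1.0, 1.0), -1)]
# After mapping, the sign of the middle coordinate alone matches the label,
# so a hyperplane on that coordinate separates the two classes.
separable = all(
    (quadratic_feature_map(point)[1] > 0) == (label > 0)
    for point, label in xor_points
)
print(separable)  # True
```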
CHAPTER 5
HARDWARE/SOFTWARE REQUIREMENTS &
SPECIFICATIONS
For developing IntelliExtract: AI-Based Video Text Extraction System, the following
hardware and software are recommended to ensure smooth development & operation.
CHAPTER 6
APPLICATION OF PROPOSED PROJECT
We design this model with a frontend and a backend. In the frontend we create a
GUI with a title bar and a canvas window in which the video plays continuously. To
the right of the canvas window is a result window that shows the accuracy of the
model for the particular text and the indexing of the text. Below that window are
the video controls: a start button, a stop button, a restart button, a volume
controller and a find button. The backend runs all other processes, such as frame
generation, key frame selection, video frame indexing and text extraction, and
shows the results in the GUI window. Please refer to Figure 6.1 and Figure 6.2.
Figure 6.1
Figure 6.2
Figure 6.3
Figure 6.4
APPENDIX 'A'
LIST OF ABBREVIATIONS USED
1. CC Connected Component
2. DR Detection Rate
3. ECR Edge Change Ratio
4. FAR False Alarm Rate
5. FD Frame Difference
6. OCR Optical Character Recognition
7. PDE Partial Differential Equation
8. PR Precision Rate
9. RR Recall Rate
10. SSD Sum of Squared Difference
11. SVM Support Vector Machine
12. TIE Text Information Extraction