Group 24 Report
Project Report
on
YouTube Transcript Summarizer
Affiliated to Dr. A.P.J. Abdul Kalam Technical University (Formerly Known as U.P.T.U.),
Lucknow
2022 - 2023
DECLARATION
I declare that the project work presented in this report entitled “YouTube
Transcript Summarizer”, submitted to the Computer Science and Engineering
Department, Raja Balwant Singh Engineering Technical Campus, for the award
of the Bachelor of Technology degree in Computer Science and Engineering, is
my original work. I have not plagiarized or submitted the same work for the
award of any other degree.
May, 2023
Agra
CERTIFICATE
This is to certify that the Project entitled “YouTube Transcript Summarizer” has
been submitted by Gourav Sharma in partial fulfilment of the degree of Bachelor
of Technology in Computer Science & Engineering of “Raja Balwant Singh
Engineering Technical Campus, affiliated to Dr. A.P.J. Abdul Kalam Technical
University (Formerly known as U.P.T.U), Lucknow” in academic session
2022-23.
May, 2023
ACKNOWLEDGEMENT
Apart from my own effort, the success of this project depends largely on the
encouragement and guidance of many others. I take this opportunity to
express my gratitude to the people who have been instrumental in the successful
completion of this project.
I would like to express my deep and sincere gratitude to my project guide, Er.
Aman Singh (Assistant Professor of CSE), who gave me his full support and
encouraged me to work on innovative and challenging projects for the educational
field.
I extend my gratitude to Dr. Brajesh Kumar Singh, Head of Department of
Computer Science and Engineering, for encouraging me to aim for the highest peak
and for providing the opportunity to prepare this project.
I am grateful to Dr. B.S. Kushwaha (Director Academics) and Dr. Pankaj
Gupta (Director Finance & Admin.), Raja Balwant Singh
Engineering Technical Campus, Bichpuri, Agra, for providing facilities and
constant encouragement. I am also grateful to all the faculty members of the
Department of Computer Science and Engineering for their deliberations and
honest concerns.
Finally, I am grateful to my parents and friends for their constant support
throughout this project work. Without them, this work would have remained a
distant reality.
I also place on record my indebtedness to those who have directly or indirectly
provided their helping hands in this endeavor.
ABSTRACT
In this project, I create a Chrome extension that makes a request to a backend
REST API, which performs NLP and responds with a summarized version of a
YouTube transcript. An enormous number of video recordings are created and
shared on the Internet every day. It has become difficult to spend time watching
such videos, which may run longer than expected, and the effort may be futile if
no relevant information can be found in them. Summarizing the transcripts of such
videos automatically allows us to quickly look for the important patterns in a
video and saves the time and effort of going through its whole content. This
project offers hands-on experience with a state-of-the-art NLP technique for
abstractive text summarization and implements an interesting idea that is
suitable for intermediates and is a refreshing hobby project for professionals.
CHAPTER 1
1.1 Introduction
YouTube is a video-sharing platform, the second most visited website, and the
second most used search engine, and it is stronger than ever after more than 17
years online. About 720,000 hours of fresh video content are uploaded to YouTube
per day, and the number of videos available on the platform is steadily growing.
It has become increasingly easy to find videos on YouTube for anything, from
cooking videos to dance videos to motivational videos and other bizarre content
as well. The content is available worldwide, primarily for educational purposes.
The biggest challenge in extracting information from a video is that the viewer
has to watch the entire video to understand the context, unlike images, where
data can be gathered from a single frame. A viewer with low network speed or some
other device limitation may have to watch the video at a low resolution, which
makes it blurry and tiring to watch. Advertisements in between are also
frustrating. So, removing the junk at the start and end of the video, skipping
advertisements, and getting its summary in order to jump directly to the part of
interest is valuable and time efficient. This project focuses on reducing the
length of the script of a video. Summarizing the transcripts of such videos
automatically allows one to quickly look for the important patterns in the video
and saves the time and effort of going through its whole content. The most
important part of this project is its ability to string together all the
necessary information and condense it into a small paragraph. Video summarization
is the process of identifying the significant segments of a video and producing
an output video whose content represents the entire input video. It has
advantages such as reducing the storage space used for the video. This project
offers hands-on experience with a state-of-the-art NLP technique for abstractive
text summarization and implements an interesting idea that is suitable for
intermediates and is a refreshing hobby project for professionals.
1.2 Objective
1.3 Scope
• Crash Course: - Students who want to watch YouTube videos for their studies can
quickly get a concise idea of the topic from a short read of the summary and can
easily check whether the video is relevant for them or not.
• Quick Notes: - Students who do not want to attend boring lectures, or who have
missed classes, can use this application to build notes from the summary of the
video. Most students browse YouTube a day before their exams and watch videos at
double speed; halving the watch time, however, doubles the confusion about a
totally new topic, making things worse than they originally were. So, removing
the junk at the start and end of the video, skipping advertisements, and getting
its summary in order to jump directly to the part of interest is valuable and
time efficient.
• Customer feedback: - Customers often give long feedback on a particular
product. This application helps to summarize such feedback, making it easier to
judge whether the feedback is positive or negative.
CHAPTER 2
REVIEW OF LITERATURE
Prof. SH Chaflekar et al. [1] observed that we spend a noticeable amount of our
weekly time watching YouTube videos, be it for entertainment, education, or
exploring our interests. In most cases, the overall intent is to obtain some form
of information from the video. They sought a solution to increase the efficiency
of this "information extraction" process, as YouTube's speed adjustment option is
the only relevant tool. Their summarizer is a Chrome extension that works with
YouTube to extract the key points of a video and make them accessible to the
user. The summary is customizable per the user's request, allowing varying
extents of summarization. Key points from the summarization process, together
with corresponding time-stamps, are then presented to the user through a small UI
next to the video feed. This allows the user to navigate to the more important
sections of the video and get to the key points more efficiently. The main idea
is to find a short subset of the most essential information from the entire set
and present it in a human-readable format. As online textual data grows,
automatic text summarization methods have the potential to become very helpful
because more useful information can be read in a short time.
A further work described Facial Recognition as the biggest breakthrough in
biometric identification and security since fingerprints; it uses an individual’s
facial features to identify and recognize them. A technology that once seemed too
far-fetched, taken straight from a science-fiction novel, is now available in
smartphones in the palm of our hands. Facial Recognition has gained traction as
the primary method of identification, whether in mobile phones, smart security
systems, ID verification, or something as simple as logging in to a website.
Recent strides in facial recognition technologies have made it possible to
design, build, and implement a facial recognition system oneself. Using computer
vision and machine learning libraries such as Facial Recognition and Dlib, people
can create a robust system that detects faces and then matches them against a
database of pre-loaded facial data to recognize them.
Hafiz Burhan Ul Haq et al. [2] noted that advancements in digital video
technology have empowered video surveillance to play a vital role in ensuring
security and safety. Public and private enterprises use surveillance systems to
monitor and analyze daily activities. Consequently, a massive volume of video
data is generated that requires further processing to meet security protocols.
Analyzing video content is a tedious and time-consuming task that also requires
high-speed computing hardware. The video summarization concept has emerged to
overcome these limitations. The paper presents a customized video summarization
framework based on deep learning. The proposed framework enables a user to
summarize videos according to an Object of Interest (OoI), for example, a person,
airplane, mobile phone, bike, or car. Various experiments were conducted to
evaluate the performance of the framework on the video summarization (VSUMM)
dataset, the title-based video summarization (TVSum) dataset, and the authors'
own dataset. The reported accuracy on VSUMM, TVSum, and the own dataset is 99.6%,
99.9%, and 99.2%, respectively. A desktop application was also developed to help
the user summarize a video based on the OoI.
Fady Bassel et al. [4] noted that the description and keywords of a video play an
important role in choosing the right video to watch. The main idea of the
proposed approach is to generate descriptions and timestamps for videos
automatically. The approach reduces the time consumed in searching for the proper
video: it saves users from watching wrong, unwanted videos, and its timestamps
help them find and watch only the desired part of a video. One of the main goals
of the approach is keyword extraction; the extracted keywords help in finding
videos by their significant keywords. The summarization of the video depends on
frames, emotions, and speech. Firstly, the video content that appears in the
frames is turned into a summarized text. Secondly, emotion and how it changes
during a specific period is merged with the summarization of the frames. Thirdly,
the audio is transcribed into text and an abstractive summarization of the audio
track is produced. Finally, all the summarizations (audio, video, emotion) are
fused using natural language processing techniques such as tokenization, sentence
segmentation, lemmatization and stemming, followed by abstractive summarization.
Video summarization is performed to get a meaningful, accurate description of the
video, which helps in finding the content that matches the description. The
reported experiment showed that on average 87% of the participants found that the
generated text represented the video well.
Shraddha Yadav et al. [5] proposed two different methods, extractive and
abstractive, to generate a summary and important keywords from a given YouTube
video. They built a simple user interface through which users can easily get
summaries produced by these methods. According to the authors, the project saves
time and effort by providing only the useful information about the topic of
interest, so that users do not have to watch long videos and can use the time
saved to gain more knowledge.
E. Apostolidis et al. [6] focus on the recent advances in the area and provide a
comprehensive survey of the existing deep-learning-based methods for generic
video summarization. After presenting the motivation behind the development of
technologies for video summarization, they formulate the video summarization task
and discuss the main characteristics of a typical deep-learning-based analysis
pipeline. They then suggest a taxonomy of the existing algorithms and provide a
systematic review of the relevant literature that shows the evolution of
deep-learning-based video summarization technologies and leads to suggestions for
future developments.
Yudong Jiang et al. [7] noted that previous methods mainly take the diversity and
representativeness of generated summaries as prior knowledge in algorithm design.
In their paper, they formulate video summarization as a content-based recommender
problem, which should distill the most useful content from a long video for users
who suffer from information overload. A scalable deep neural network is proposed
for predicting whether a video segment is useful for users by explicitly
modelling both the segment and the video. Moreover, they perform scene and action
recognition in untrimmed videos to find more correlations among different aspects
of video understanding tasks. The paper also discusses the effect of audio and
visual features on the summarization task.
Aniqa Dilawari and Muhammad Usman Ghani Khan [8] stated that a massive number of
videos are produced every day, which contain audio, visual, and textual data.
This constant increase is due to the ease of recording on portable devices such
as mobile phones, tablets, or cameras. The major challenge is to understand the
visual semantics and convert them into a condensed format, such as a caption or
summary, to save storage space, enable users to index and navigate, and help them
gain information in less time. They propose a joint end-to-end solution, ASoVS,
which uses deep neural networks to generate a natural language description and an
abstractive text summarization of an input video. This provides a text-based
video description and abstractive summary, enabling users to discriminate between
relevant and irrelevant information according to their needs. Furthermore, their
experiments show that the joint model can attain better results than the baseline
methods in the separate tasks, with informative, concise, and readable multi-line
video descriptions and summaries in a human evaluation.
P. Choudhary et al. [9] stated that automatic summarization techniques give the
user an easy way to look up the important content of a collection of media and to
browse media of their choice later. With the evolution of sophisticated capturing
devices, cloud-based summarization solutions, which have a long turnaround time,
are less preferred by end users. In this paper, the authors proposed a real-time
video summarization technique for mobile platforms which analyses the video
during live camera recording and generates a summary instantaneously. The
technique analyzes intrinsic video data, such as the contents of the video
stream, and the corresponding extrinsic metadata, such as external camera
information of the video stream. The proposed technique achieved an F-measure of
0.66 and 0.84 on the SumMe and SumLive datasets respectively, while limiting the
overall power consumption to 20 milliamps on an embedded system.
Justine Raju Thomas et al. [10] explained that summarization is the process of
reducing a text document to create a summary that retains the most important
points of the original document. Extractive summarizers work on the given text to
extract sentences that best convey the message of the text. Most extractive
summarization techniques revolve around the concept of finding keywords and
extracting sentences that have more keywords than the rest. Keyword extraction is
usually done by extracting relevant words that have a higher frequency than
others, with stress on important ones. Manual extraction or annotation of
keywords is a tedious, error-prone process involving a lot of manual effort and
time. In this paper, the authors proposed an algorithm to extract keywords
automatically for text summarization in e-newspaper datasets. The proposed
algorithm is evaluated by comparing its results on articles with similar titles
in four different e-newspapers to check the similarity and consistency of the
summarized results.
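The keyword-frequency idea described above can be sketched in a few lines of
Python. This is only an illustration under simple assumptions (regex
tokenization, a tiny hand-written stop-word list, and the hypothetical function
name `extractive_summary`); it is not the algorithm of Thomas et al. [10].

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "it", "on", "for"}


def split_sentences(text):
    """Split text into sentences on '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def keywords(text):
    """Lower-case the text and return its words, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def extractive_summary(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(keywords(text))

    def score(sentence):
        # A sentence scores the summed document frequency of its keywords.
        return sum(freq[w] for w in keywords(sentence))

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Sentences that repeat frequent keywords are ranked first, and the selected
sentences are emitted in their original order so that the summary stays readable.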
Bin Zhao and Eric P. Xing [11] proposed online video highlighting, a principled
way of generating a short video that summarizes the most important and
interesting contents of an unedited and unstructured video, which is costly in
both time and money to process manually. Specifically, their method learns a
dictionary from the given video using group sparse coding and updates atoms in
the dictionary on the fly. A summary video is then generated by combining
segments that cannot be sparsely reconstructed using the learned dictionary. The
online nature of the method enables it to process arbitrarily long videos and to
start generating summaries before seeing the end of the video. Moreover, the
processing time required by the method is close to the original video length,
achieving quasi real-time summarization speed.
Idham Widodo et al. [12] investigated the rhetorical structure, in terms of moves
and steps, of short lectures by the famous applied linguist Jack C. Richards
posted on YouTube. The data of the study were 22 video transcripts of short
lectures by Jack C. Richards. The results were: (1) three moves of rhetorical
structure, namely M1 – Introduction, M2 – Content of Short Lecture, and M3 –
Conclusion, occurred in 100% of the data analysed and were classified as
obligatory; (2) the steps most often found in the short lectures, occurring in
100% of the data and classified as obligatory, were M2SB – Argumentation of the
short lecture and M3SA – Summarizing the points, while the steps occurring in
60-99% of the data, classified as conventional, were M1SE – Announcing topic of
oral presentation, M1SA – Greeting the Audience, M2SC – Illustration of short
lecture, and M2SA – Description of short lecture. The newly proposed model of
spoken genre analysis, adapted from Ali and Singh (2019), the Sermon model by
Cheong cited in Safnil (2010), and Seliman (1996), was found effective enough to
capture the possible rhetorical moves and steps in the whole text of a short
lecture by a famous applied linguist posted on YouTube.
Sourav Biswas and Atul Kumar Patel [13] said that watching long YouTube videos is
very time-consuming and boring. Nowadays YouTube is an essential source of news
and information. It is also considered a second teacher to students; educational
videos are the most viewed videos on YouTube today. In their project, they
provide a quick, precise, and informative summary of a video. Many techniques
have already been developed, but they only provide text summarization. The
authors instead summarize a video, specifically a YouTube video. They used a
Hugging Face transformer to summarize the content of a YouTube video, along with
a Python API to get the subtitles of the given video. Their model then performs
text summarization on the subtitles and displays the summary to the user, so that
people can save their precious time by reading the summary.
Abdulwahid Albeer et al. [14] stated that automatic summarization is a technique
for quickly presenting key information by condensing large sections of material.
Summarization may be applied to text and to video, with different methods used to
present the abstract of the subject. In this research, natural language
processing is employed for automated text summarization, which is applied to
YouTube videos by transcribing them and then applying the summarization stages.
Based on the number of words and sentences in the text, the term
frequency-inverse document frequency (TF-IDF) method was used to extract the
important keywords for the summary. Some videos are long and boring, or take a
long time to present information that could be conveyed in a few minutes.
Therefore, the essence of the proposed system is to summarize a long video and
present the important information to the user as a text of only a few lines, to
benefit students or researchers who have no time to spend on long videos to
extract the useful data. The results were evaluated using the ROUGE method on the
CNN/DailyMail data set.
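To make the TF-IDF weighting mentioned above concrete, the following is a minimal
sketch, not the implementation of Albeer et al. [14]. It assumes each sentence is
treated as one "document", uses the plain formula tf-idf = (count / length) *
log(N / df), and the function names `tfidf_scores` and `top_keywords` are
hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_scores(documents):
    """Return one {term: tf-idf weight} dictionary per document."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(documents, doc_index, k=3):
    """Return the k highest-weighted terms of one document."""
    scores = tfidf_scores(documents)[doc_index]
    # Sort by weight (descending), then alphabetically to break ties.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ordered[:k]]
```

A term that occurs in every document receives a weight of zero, which is why
common words drop out of the keyword list without an explicit stop-word filter.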
Vaishali P. Kadam et al. [15] said that text summarization is one of the most
popular applications of natural language processing and a challenging task in it.
It is important for finding specific information in the input document within a
short time span. Quick access to information in the form of a summary is
presently in demand for drawing a conclusion about the document text. Such a
summary is always presented with a limited number of words and specific
information content for the search item. Summarizer systems are capable of
generating a short version of the overall text after analyzing it, while
retaining its original meaning and theme in the summary text. Many automated
summarizer systems have been developed for various Indian languages, but these
systems have not yet reached a mature stage. The paper proposed a methodology for
the development of an automated text summarization technique for the Marathi
language. The authors report 44.48% compression accuracy for the summaries
produced by their system.
S. Tharun et al. [16] concluded that thousands of video recordings are created
and shared on the internet every day. It is becoming increasingly difficult to
spend time watching such videos, which may take longer than anticipated, and our
efforts may go in vain if we are unable to extract meaningful information from
them. Summarizing transcripts of such videos helps us to quickly search for
relevant patterns in the video without having to go through the entire content.
An abstractive transcript summarization model is very useful for taking YouTube
video transcripts and generating a summarized version. An automatic summarizer's
purpose is to shorten the time of
reading, enable easier selection, be less prejudiced compared to humans, and portray
content that is compressed while preserving the important material of the actual
document. Extractive and abstractive approaches are the two most common ways to
summarise text. Extractive approaches choose phrases or sentences from input text,
whereas Abstractive methods generate new words from input text, making the task much
more difficult.
Amey Thakur and Mega Satish [17] described text summarization as the process of
making a synopsis from a given text document while keeping its important
information and meaning. Automatic summarization has become an essential method
for accurately locating significant information in vast amounts of text in a
short amount of time and with minimal effort. In their project, they propose to
implement a web application that can summarize a text or a Wikipedia link, and
they also compare different methods of summarization. Their problem statement is
that the tremendous abundance of material available on the internet has produced
an odd paradox: people are immersed in information, yet they are yearning for
wisdom. It is tough to keep up with the internet's daily production of billions
of articles, which raises the question of whether there is a method to absorb
information more effectively without increasing reading time. For this problem,
they propose a Text Summarizer web app using NLP and the NLTK library.
Shivani Patil et al. [18] proposed summarization of videos in regional languages.
The methodology uses NLP, LSA, and MoviePy. The paper aims to produce a short
video from a long video without missing any point. The technique first produces a
short version of a downloaded video. A web application takes the video and the
desired accuracy as input; the summarized video is then converted into text, and
this text is converted into a regional language. The paper presents this as an
NLP application that benefits students and teachers by saving time.
CHAPTER 3
template engine. Both are Pocoo projects.
JSON - JSON (JavaScript Object Notation) is a lightweight data-interchange
format. It is easy for humans to read and write. It is easy for machines to
parse and generate. It is based on a subset of the JavaScript Programming
Language Standard ECMA-262 3rd Edition - December 1999. JSON is a
text format that is completely language independent but uses conventions
that are familiar to programmers of the C-family of languages, including C,
C++, C#, Java, JavaScript, Perl, Python, and many others. These properties
make JSON an ideal data-interchange language.
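As a brief illustration of JSON as an interchange format, the sketch below
round-trips a summary payload through Python's standard `json` module. The field
names `video_id` and `summary` and their values are illustrative assumptions, not
the project's actual response format.

```python
import json

# Hypothetical response payload; the field names are illustrative assumptions.
response = {"video_id": "abc123XYZ_0", "summary": "A short summary of the video."}

# Serialize the Python dictionary to JSON text, as a backend would before sending.
encoded = json.dumps(response)

# Parse the JSON text back into a dictionary, as a client would after receiving.
decoded = json.loads(encoded)
```

Because the encoded value is plain text, the same payload can be parsed by the
JavaScript side of a browser extension without any Python-specific handling.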
JavaScript - JavaScript is a simple programming language. It is designed to
build web-centric applications. It complements and integrates with Java.
JavaScript is very easy to use as it integrates with HTML. It is open-source
and cross platform.
HTML - HTML stands for HyperText Markup Language. It is used to create
web pages and web applications. It is a very easy and simple language. It
can be easily understood and modified. It is a markup language, so it
provides a flexible way to design web pages along with the text.
CSS - Cascading Style Sheets (CSS) is a stylesheet language used to describe
the presentation of a document written in HTML or XML (including XML
dialects such as SVG, MathML or XHTML). CSS describes how elements
should be rendered on screen, on paper, in speech, or on other media.
3.2.2 Tools
complete tools, compilers and other features to make software
development easy.
CHAPTER 4
PROPOSED METHODOLOGY
4.2 System Architecture and Flowchart
B. Get Transcript
Using a Python API called YouTube Transcript API (the youtube-transcript-api
package), we can get the transcripts/subtitles for a given YouTube video. It
also works for videos with automatically generated subtitles.
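The step before and after the transcript call can be sketched with the standard
library alone. The helper names `extract_video_id` and `segments_to_text` are
hypothetical, and the segment shape (a list of dictionaries with a `text` key) is
an assumption modelled on the output of the transcript library; check the
installed version before relying on it.

```python
from urllib.parse import urlparse, parse_qs

# Hosts accepted for "watch" URLs in this sketch (an assumption, not exhaustive).
WATCH_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url):
    """Return the video ID from a watch or youtu.be URL, or None if absent."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host in WATCH_HOSTS and parsed.path == "/watch":
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


def segments_to_text(segments):
    """Join transcript segments (dicts with a 'text' key) into one string."""
    return " ".join(seg["text"].strip() for seg in segments if seg["text"].strip())
```

The video ID returned by `extract_video_id` is what a transcript library expects
as input, and `segments_to_text` produces the single block of text that the
summarization step consumes.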
C. Text Summarization
The process of condensing lengthier text into a concise summary while
maintaining the main ideas and general meaning is known as text
summarization.
There are two methods that are frequently employed for text
summarization:
1) Extractive Summarization: In this method, the model isolates the crucial
phrases and sentences from the source text and outputs only them.
2) Abstractive Summarization: The model generates new sentences in a
new form, resulting in an entirely distinct text that is shorter than the
original. Transformers are used in this project to implement this
strategy: abstractive text summarization is performed on the
transcript received in the previous phase using the Python Hugging Face
transformers module.
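A minimal sketch of the abstractive step is given below. It assumes the
`pipeline("summarization")` helper of the Hugging Face transformers library, as
available in releases current at the time of this report, and it adds a
word-based chunking step because transformer models accept only a limited input
length. The chunk size of 400 words, the length limits, and the function names
are illustrative assumptions, not the project's actual settings.

```python
def chunk_words(text, max_words=400):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_transcript(text, summarizer=None, max_words=400):
    """Summarize each chunk of the transcript and join the partial summaries."""
    if summarizer is None:
        # Imported lazily so chunk_words works without the library installed.
        from transformers import pipeline
        summarizer = pipeline("summarization")
    parts = []
    for chunk in chunk_words(text, max_words):
        # The pipeline returns a list with one dict per input text.
        result = summarizer(chunk, max_length=120, min_length=30, do_sample=False)
        parts.append(result[0]["summary_text"])
    return " ".join(parts)
```

Splitting by words is a simplification: the real limit is counted in model
tokens, so a production version would chunk with the model's tokenizer instead.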
D. User Interface
A user interface is needed to ensure that the user can interact with the system.
The user interface is built using HTML and CSS, with Flask as the framework. It
provides users with better interaction with the system.
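How a Flask backend might expose the summarizer to the extension can be sketched
as follows. The route `/api/summarize`, the query parameter `youtube_url`, and
the function names are assumptions made for illustration; `summarize_fn` stands
for whatever function performs the transcript fetch and summarization.

```python
def build_summary_response(youtube_url, summarize_fn):
    """Return a (payload, HTTP status) pair for one summarize request."""
    if not youtube_url:
        return {"error": "missing 'youtube_url' query parameter"}, 400
    return {"youtube_url": youtube_url, "summary": summarize_fn(youtube_url)}, 200


def create_app(summarize_fn):
    """Build a Flask app exposing the summarizer as a JSON endpoint."""
    # Imported lazily so build_summary_response is usable without Flask.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/api/summarize", methods=["GET"])
    def summarize():
        url = request.args.get("youtube_url")
        payload, status = build_summary_response(url, summarize_fn)
        return jsonify(payload), status

    return app
```

With Flask installed, `create_app(my_summarize_fn).run(port=5000)` would start a
local server that the extension can query with a GET request.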
4.2.2 Flowchart
CHAPTER 5
In this project, two types of testing are used: unit testing and system testing.
Each module is tested individually, and then the system is tested as a whole.
Several rules can serve as testing objectives:
System Testing Steps:
i. Integration of all modules in the system.
ii. Preparation of test cases.
iii. Preparation of possible test data with all validation checks.
iv. Actual testing done manually.
v. Recording of all reproduced errors.
vi. Modifications done for the errors found during testing.
vii. Prepared the test result script after rectification of errors.
Once unit testing is complete for all modules, they are integrated into the whole
system with all their dependencies. In the integration process, we consider each
module individually and test the system at every step. This helps reduce errors
during system testing.
This project does not use any special security measures, as it is a prototype
model and does not collect personal data from users. It is used only to produce
summaries, so no special security is required.
CHAPTER 6
inaccuracies, it can impact the quality and coherence of the generated
summaries. Additionally, the summarizer may struggle with summarizing
videos that have poor audio quality or unclear speech.
v. The YouTube Transcript Summarizer focuses solely on the textual content of
the video transcripts. It does not take into account any visual information, such
as images, graphs, or demonstrations present in the videos. As a result, the
summaries may not capture the full richness of the video content, particularly
when visual elements play a significant role.
vi. The YouTube Transcript Summarizer relies on the YouTube Data API to fetch
video information and transcripts. Any changes or restrictions imposed by
YouTube on their API may impact the functionality or availability of the
extension. Changes in API policies or limitations may require updates or
adjustments to ensure continued compatibility.
vii. The YouTube Transcript Summarizer is developed as a Chrome extension,
limiting its usage to the Chrome browser. Users on other browsers or platforms
may not have access to the extension's features. Additionally, future updates or
changes to the Chrome browser or its extension framework may require
modifications to maintain compatibility.
viii. The YouTube Transcript Summarizer project may have limited flexibility in
terms of user control over summarization parameters. Users may not have the
ability to customize the summarization process, such as adjusting the length of
the summary or specifying the level of detail required. This lack of
customization could limit the project's suitability for individual user preferences
and requirements.
ix. The project's user interface (UI) may have limited customization options. Users
may have minimal control over the appearance, layout, or visual aspects of the
extension's UI. The project may focus on providing a functional and intuitive UI
without extensive customization features, which could restrict users who prefer
more personalized or tailored UI experiences.
CHAPTER 7
CONCLUSION
This project has proposed a YouTube Transcript Summarizer. The system takes the
input YouTube video from the Chrome extension in the Google Chrome browser when
the user clicks the summary button on the extension's page and accesses the
transcript of that video using the Python API. The obtained transcript is then
summarized with the transformers package, and the user is presented with the
summary text on the extension's page. The project saves users' time and resources
by letting them get the gist of a video without watching it in full. It also
helps the user to identify unusual and unhealthy content so that it does not
interfere with their viewing experience. Because it is delivered as a Chrome
extension, the summary is shown in a convenient user interface.
CHAPTER 8
BIBLIOGRAPHY
8.1 REFERENCES
[1] Chaflekar, Prof & Bahadure, Achal & Bramhapurikar, Hosanna & Satpute,
Ruchika & Jumde, Rutuja & Bakhare, Sakshi & Bhirange, Shivani. (2022).
YouTube Transcript Summarizer using Natural Language Processing.
International Journal of Advanced Research in Science, Communication and
Technology. 108-113. 10.48175/IJARSCT-3034.
[2] Haq, Hafiz Burhan & Asif, Muhammad & Ahmad, Maaz & Ashraf, Rehan &
Mahmood, Toqeer. (2022). An Effective Video Summarization Framework
Based on the Object of Interest Using Deep Learning. Mathematical Problems
in Engineering. 2022. 1-25. 10.1155/2022/7453744.
[3] A. N. S. S. Vybhavi, L. V. Saroja, J. Duvvuru and J. Bayana, "Video
Transcript Summarizer," 2022 International Mobile and Embedded
Technology Conference (MECON), 2022, pp. 461-465, doi:
10.1109/MECON53876.2022.9751991.
[4] Bassel, Fady & Refaat, Mark & Abdelhamed, Mohamed & Shorim, Nada &
AbdelRaouf, Ashraf. (2021). Automatic Video summarization with
Timestamps using natural language processing fusion. 0060-0066.
10.1109/CCWC51732.2021.9376115.
[5] Shraddha Yadav, Arun Kumar Behra, Chandra Shekhar Sahu, Nilmani
Chandrakar, “SUMMARY AND KEYWORD EXTRACTION FROM
YOUTUBE VIDEO TRANSCRIPT”, International Research Journal of
Modernization in Engineering Technology and Science
Volume:03/Issue:06/June-2021 Impact Factor- 5.354.
[6] E. Apostolidis, E. Adamantidou, A. I. Metsai, V. Mezaris and I. Patras,
"Video Summarization Using Deep Neural Networks: A Survey," in
Proceedings of the IEEE, vol. 109, no. 11, pp. 1838-1863, Nov. 2021,
doi:10.1109/JPROC.2021.3117472.
[7] Yudong Jiang, Kaixu Cui, Bo Peng, Changliang Xu; “Comprehensive Video
Understanding: Video Summarization with Content-Based Video
Recommender Design”; Proceedings of the IEEE/CVF International
Conference on Computer Vision (ICCV), 2019, pp. 0-0.
[8] Dilawari, Aniqa & Khan, Muhammad Usman. (2019). ASoVS: Abstractive
Summarization of Video Sequences. IEEE Access. PP. 1-1.
10.1109/ACCESS.2019.2902507.
[9] P. Choudhary, S. P. Munukutla, K. S. Rajesh and A. S. Shukla, "Real time
video summarization on mobile platform," 2017 IEEE International
2
Conference on Multimedia and Expo (ICME), 2017, pp. 1045-1050, doi:
10.1109/ICME.2017.8019530.
[10] Thomas, Justine & Bharti, Drsantosh & Babu, Korra. (2016). Automatic
Keyword Detection for Text Summarization in e-
Newspapers.10.1145/2980258.2980442.
[11] Bin Zhao, Eric P. Xing; Quasi Real-Time Summarization for Consumer
Videos; Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 2014, pp. 2513-2520.
[12] Widodo, I. Diani, and S. Safnil, “The Rhetorical Structure of Short Lecture by
Famous Applied Linguists Jack C. Richards Posted on YouTube”, JADILA,
vol. 1, no. 2, pp. 128-138, Nov. 2020.
[13] Sourav Biswas, A. K. P. (2022) “YouTube Transcript Summarizer to
Summarize the content ofYouTube.” Zenodo.
doi:10.5281/ZENODO.6511886.
[14] Albeer, Rand & Alshahad, Huda & Aleqabie, Hiba J. & Al-Shakarchy, Noor.
(2022). Automatic summarization of YouTube video transcription text using
term frequency-inverse document frequency.
[15] Kadam, V. P., Alazani, S. A. and Namrata Mahender, C. (2022) “A text
summarization system for Marathi language.” Zenodo. doi:
10.5281/ZENODO.7073509.
[16] Tharun, S. & Kumar, R. & Sravanth, P. & Reddy, G. & Akshay, B. (2022).
Survey on Abstractive Transcript Summarization of YouTube Videos.
International Journal of Advanced Research in Science, Communication and
Technology. 231-238. 10.48175/IJARSCT-3181.
[17] Thakur, Amey & Satish, Mega. (2021). Text
Summarizer.10.13140/RG.2.2.17259.67360.
[18] Patil, Shivani & Yadav, Swati & Shinde, Shreya & Waghmare, Darshani &
Patil, Rutuja & Babar, Prof. (2022). Video Transcript Summarization in
Marathi. International Journal of Advanced Research in Science,
Communication and Technology. 82-86. 10.48175/IJARSCT-4983.
2
8.2 SNAPSHOTS
Snapshot 2. Interface of the extension.
Snapshot 3. Extension summarizes the transcript.
8.3 APPENDIX
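from transformers import pipeline
from youtube_transcript_api import YouTubeTranscriptApi


def get_transcript(video_id):
    # Fetch the caption segments and join their text into one string.
    transcript_list = YouTubeTranscriptApi.get_transcript(video_id)
    transcript = ' '.join([d['text'] for d in transcript_list])
    return transcript


def get_summary(transcript):
    # Summarise the transcript in 1000-character slices and join the results.
    summariser = pipeline('summarization')
    summary = ''
    # Stepping by 1000 avoids passing an empty final slice to the model.
    for start in range(0, len(transcript), 1000):
        summary_text = summariser(transcript[start:start + 1000])[0]['summary_text']
        summary = summary + summary_text + ' '
    return summary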
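The listing above slices the transcript every 1000 characters, which can cut a word in half at a slice boundary. As a hedged alternative, not taken from the project, the sketch below splits on whitespace so that each chunk ends on a word boundary. The helper name `split_into_chunks` and its `max_chars` parameter are hypothetical.

```python
def split_into_chunks(transcript, max_chars=1000):
    """Split text into chunks of at most max_chars without cutting words.

    A single word longer than max_chars is kept whole in its own chunk.
    """
    chunks, current = [], ""
    for word in transcript.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) > max_chars and current:
            # Adding this word would overflow: close the chunk, start a new one.
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be passed to the summarizer in place of the fixed-width slices. An empty transcript yields an empty list, so the summarizer is never called with empty input.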
<!DOCTYPE html>
<html>
<head>
  <title>Youtube Transcript Summariser</title>
  <style>
    h1 {
      text-align: center;
    }
    body {
      width: max-content;
      max-width: 800px;
    }
    button {
      background-color: red;
      color: white;
      border-radius: 8px;
      width: max-content;
      height: max-content;
      padding: 10px;
      font-size: large;
      margin: auto;
      display: block;
      border-color: coral;
    }
    button[disabled] {
      background-color: lightcoral;
      color: white;
      border-radius: 8px;
      width: max-content;
      height: max-content;
      padding: 10px;
      font-size: large;
      margin: auto;
      display: block;
      border-color: lightpink;
    }
    p {
      font-size: medium;
    }
  </style>
</head>
<body>
  <h1>Youtube Transcript Summariser</h1>
  <button id="summarise" type="button">Summarise</button>
  <br/>
  <p id="output"></p>
  <script src="popup.js"></script>
</body>
</html>
# Code for the JSON manifest used in the extension
{
  "manifest_version": 3,
  "name": "Youtube Summariser",
  "description": "An extension to summarize youtube videos using the transcript",
  "version": "1.0",
  "permissions": ["activeTab", "declarativeContent"],
  "host_permissions": ["http://127.0.0.1:5000/*"],
  "action": {
    "default_title": "Summarise this video",
    "default_icon": {
      "16": "images/icon.png",
      "32": "images/icon.png",
      "48": "images/icon.png",
      "128": "images/icon.png"
    },
    "default_popup": "popup.html"
  },
  "icons": {
    "16": "images/icon.png",
    "32": "images/icon.png",
    "48": "images/icon.png",
    "128": "images/icon.png"
  }
}
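The manifest grants the extension access to a local server on port 5000, but the report does not include that server's code. The sketch below shows, under stated assumptions, what a minimal backend endpoint could look like using only the Python standard library's WSGI interface. The `/summary` route, the `video_id` query parameter and the `make_app` factory are hypothetical names, and the project's actual server may be built differently.

```python
from urllib.parse import parse_qs

PLAIN_TEXT = [("Content-Type", "text/plain; charset=utf-8")]


def make_app(summarise_video):
    """Build a WSGI app. summarise_video(video_id) -> str is supplied by the caller."""

    def app(environ, start_response):
        query = parse_qs(environ.get("QUERY_STRING", ""))
        video_id = query.get("video_id", [""])[0]
        # Reject unknown paths and requests without a video ID.
        if environ.get("PATH_INFO") != "/summary" or not video_id:
            start_response("400 Bad Request", PLAIN_TEXT)
            return [b"expected /summary?video_id=..."]
        body = summarise_video(video_id).encode("utf-8")
        start_response("200 OK", PLAIN_TEXT)
        return [body]

    return app
```

Passing the summarizer in as a function keeps the endpoint independent of the model, so it can be exercised with a stand-in such as `make_app(lambda vid: "summary for " + vid)`. The resulting app could be served locally with `wsgiref.simple_server.make_server("127.0.0.1", 5000, app)`.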
CHAPTER 9
BIOGRAPHICAL SKETCH
Er. Aman Singh is currently serving as an Assistant Professor in the Post Graduate
Department of Computer Science & Engineering at Raja Balwant Singh Engineering
Technical Campus, Bichpuri, Agra. He obtained his B.Tech degree in Computer Science
and Engineering from U.P.T.U with First Division in 2011. He obtained the Master of
Technology (M.Tech) degree in Computer Science & Engineering from SRM University,
Chennai, with First Division in 2014. He has eight years of experience. He is
presently engaged in research and development activities in the areas of Data
Structure, Software Engineering, Computer Organization and Architecture, and DAA.
Academic Qualification: B. Tech and M. Tech, Ph.D. (Pursuing)
Designation with Department: Assistant Professor (Computer Science & Engineering)
Contact No: 9358656548
Email: [email protected]
Specialization: Data Structure, Software Engineering, Computer Organization and
Architecture and DAA.
Experience: 08 Years
Research Articles/Published/Membership:
Research Articles Published: 07
Papers published in International and National conferences: 07
• CBIR Algorithm for Image Feature Extraction Using Color, Texture and Shape Models.
• Different Approaches of Image Retrieval Techniques.
• A Review on Application of Digital Image Processing on Biotechnology & Bioscience.
• Video Based Face Recognition Biometric Security System.
• An Efficient Approach for Face Identification Using Neural Network.
• Applications of Computer in Agricultural Research - A Review.
• Big Data and Its Use in Smart Farming and Agricultural Data Analysis.
Journals/Academic Achievements:
• Participated in a two-week ISTE STTP on Technical Communication conducted by
IIT Bombay, 2015.
• Successfully completed FDP101x Foundation Program in ICT for Education by IIT
Bombay, 2017.
• Participated in Faculty Development Program on “Android Skilling” by Google,
AKTU, IEI Agra region, 2017.
• Participated in FDP on Natural Language Processing (WNLP-2017) sponsored by
Dr. A.P.J. Abdul Kalam University, Lucknow, UP.
No. of B.Tech. Students Guided: 20
Prof. (Dr.) Brajesh Kumar Singh (H.O.D)
Dr. Brajesh Kumar Singh was born in District Agra (U.P.) in 1978. He completed his
doctorate in Computer Science and Engineering at Motilal Nehru National Institute of
Technology, Allahabad (U.P.) in 2014. He joined R.B.S. Engineering Technical Campus,
Bichpuri, Agra as a Lecturer/Asstt. Prof. in 2001. In 2007, he was appointed
Reader/Assoc. Prof. in the same organization. In December 2017, he took over charge
as Head of the Department of Computer Science and Engineering. In October 2018, he
was promoted to the post of Professor. He has guided more than 50 B.Tech. and 9
M.Tech. projects of national and international repute. He is supervising 2 Ph.D.
candidates. He has 50 publications to his credit in national and international
journals and proceedings of high repute, with a large number of citations of his
research manuscripts. Dr. Singh has delivered several invited talks and keynote
addresses and chaired sessions at national and international conferences of high
repute in India and abroad. He runs collaborative training programs/workshops with
IIT Bombay. He has contributed significantly to enhancing the research standards in
the Department of CSE. He is a recipient of IBM best project awards. Dr. Singh has
successfully organized more than 45 international and national
conferences/seminars/workshops as organizing secretary or member of the
international program committee in India and abroad. He is the editor of highly
reputed national/international journals.
Present Area of work: Software Engineering, Software Project Management, Data
Mining, Soft Computing, Computer Vision, IoT, Cloud Computing.
Awards and Recognitions
• Best Project Award by IBM.
• Best Paper Awards
• Chaired Springer Sponsored International Conference at Ajmer, India in 2017.
• Chaired Springer Sponsored International Conference at Ajmer, India in 2018.
• Coordinator, Spoken Tutorial training programs in collaboration with IIT
Bombay under National Mission on Education through ICT, MHRD, Govt. of
India.
• Delivered a keynote speech and chaired a session in IC4S 2017 at Phuket,
Thailand.
• Delivered a keynote speech and chaired a session in IC4S 2018 at Bangkok,
Thailand.
• Delivered an invited talk at Campus of ITS, Sukolilo-Surabaya, Indonesia as
visiting professor in workshop on Software Testing for The Information
System International Conference (ISICO), held during July 22-25, 2019.
• Founder Developer of College Website: www.fetrbs.org
• Guiding 01 Ph. D. Scholars enrolled with AKTU, Lucknow.
• Invitation from IEEE international conference, China to Chair a session
• Member of IEEE SOCIETY and IEEE Communications Society, the largest
technical professional society in the world.
• Member of various International Associations/Societies of Artificial
Intelligence/Computer Science/ Scientific Computing.
• No. of M. Tech. Scholars Guided: 10
• Nominated as treasurer for the IEEE, UP section, SP/C (Signal
Processing/Computer) Joint Chapter in year 2014.
• One Book Published for Engineering and MCA students
• Organized 1 Springer sponsored Scopus indexed International Conferences
as Conference Chair.
• Organized 2 National Conferences as Joint secretary/ Secretary.
• Supervised 01 Ph. D. Scholars enrolled with AKTU, Lucknow.
• Visited China to present Research paper in IEEE conference.
Journals/Academic Achievements
Gourav Sharma
Gourav Sharma is a final-year student of Computer Science & Engineering at Raja
Balwant Singh Engineering Technical Campus, Agra. He passed his High School and
Intermediate examinations under CBSE in 2017 and 2019 with scores of 95% and 71%,
respectively. He is skilled in Python and Flask. He has won many medals in
kabaddi.
PLAGIARISM CHECK
Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8