Mini Project Report

Uploaded by

Abhinav Desai
Report On

YouTube Transcript Summarizer

Submitted in partial fulfillment of the requirements of the Mini Project in
Semester VI of Third Year Computer Engineering

by

Vipul Bhoir (Roll No. 07)
Mrudul Chaudhari (Roll No. 12)
Abhinav Desai (Roll No. 14)
Aditya Lawate (Roll No. 30)

Mentor
Prof. Sanket Patil

Vidyavardhini's College of Engineering & Technology

Department of Computer Engineering

(A.Y. 2023-24)
Vidyavardhini's College of Engineering & Technology

Department of Computer Engineering

CERTIFICATE

This is to certify that the Mini Project entitled “YouTube Transcript
Summarizer” is a bonafide work of Vipul Bhoir (Roll No. 07), Mrudul
Chaudhari (Roll No. 12), Abhinav Desai (Roll No. 14), Aditya Lawate
(Roll No. 30), submitted to the University of Mumbai in partial fulfillment of
the requirement for the award of the degree of “Bachelor of Engineering” in
Semester VI of Third Year “Computer Engineering”.

______________________
Prof. Sanket Patil
Mentor

________________
Dr. Megha Trivedi
Head of Department

_______________
Dr. H.V. Vankudre
Principal
Vidyavardhini's College of Engineering & Technology

Department of Computer Engineering

Mini Project Approval

This Mini Project entitled “YouTube Transcript Summarizer” by Vipul
Bhoir (Roll No. 07), Mrudul Chaudhari (Roll No. 12), Abhinav Desai
(Roll No. 14), Aditya Lawate (Roll No. 30), is approved for the degree of
Bachelor of Engineering in Semester VI of Third Year Computer
Engineering.

Examiners

1………………………………………
(Internal Examiner Name & Sign)

2…………………………………………
(External Examiner name & Sign)

Date :

Place :
Contents

Abstract
Acknowledgement

1 Introduction
1.1 Introduction
1.2 Problem Statement & Objectives

2 Literature Survey
2.1 Survey of Existing System
2.2 Limitations of Existing System
2.3 Mini Project Contribution

3 Proposed System
3.1 Introduction
3.2 Architecture/ Framework/Block diagram
3.3 Algorithm and Process Design
3.4 Details of Hardware & Software
3.5 Experiment and Results for Validation and Verification
3.6 Analysis
3.7 Conclusion and Future work.

References
Abstract

The Internet hosts a vast and growing number of video recordings, making it
difficult to sift through them all. Viewers often spend considerable time watching
lengthy videos, only to discover that the content is irrelevant or misleading. As the
number of YouTube users grows each year, the pressure to generate views gives
creators an incentive to distort the truth or misrepresent their content. This wastes
users' time and resources and erodes trust in online platforms.

To address this issue, an omni-platform tool has been developed to enhance user
interaction. The website features a "Summarize" button as a user-friendly solution.
When activated, it generates a concise summary of the YouTube video currently
being played in the Google Chrome web browser. The tool aims to streamline the
viewing experience, saving users time and giving them the relevant information
without falling prey to the misleading tactics sometimes employed by content
creators. The platform also lets users upload their own video and obtain its
transcript. Content creators can benefit from the tool as well by using it for their
own channels.

Acknowledgement

We would like to extend our heartfelt gratitude to all those who played a significant
role in the successful realization of our Mini Project. It is the result not only of our
own effort, but also of the encouragement and guidance of several individuals who
made this project possible.

First and foremost, we are deeply thankful to Prof. Sanket Patil, our Project Guide, for
his unwavering support and invaluable assistance. The guidance and encouragement
he provided were instrumental in bringing this project to fruition. Without his
contribution, this project would not have seen the light of day.

We also express our gratitude to Dr. Megha Trivedi (Head of Department) and
the Department of Computer Engineering for all their aid and support during the
project's conception, implementation, and presentation. We are grateful to
Dr. Harish Vankudre, our esteemed Principal, for providing all the facilities needed
for the growth of our project.

We would also like to acknowledge the vital role played by all the members who have
contributed and continue to contribute to this project. Their guidance and support
were essential in ensuring the project's success, and we are sincerely grateful for
their consistent encouragement and assistance.

This project's completion is the result of a collaborative effort, and we are indebted to
each one of you for your support and guidance throughout this journey. Thank you for
being an integral part of our project's success.

1. Introduction

1.1 Introduction

In today's rapidly evolving digital landscape, the exponential growth of online data
presents a significant challenge for individuals seeking to efficiently navigate through
vast amounts of unstructured information. This flood of data highlights the critical
necessity for reliable content summarization tools capable of extracting key insights
from lengthy articles. These tools empower users to evaluate the relevance of content
before delving into comprehensive examination. Automatic text summarization stands
out as a crucial solution to this challenge, with applications spanning various domains
such as news aggregation, blogging, and product descriptions. Over time, text
summarization techniques have evolved into sophisticated models that integrate
fundamental principles with advanced methodologies.

Extractive summarization techniques focus on extracting significant sections from the
original text, resulting in a condensed subset of sentences without introducing new
linguistic elements. While this approach aims to preserve the original text's structure
and wording, it often lacks coherence and contextual relevance due to its verbatim
nature. Conversely, abstractive summarization seeks to distill the essence of the
content by synthesizing information from the original text. This method may include
incorporating novel phrases or restructuring sentences to ensure coherence.
Abstractive summarization, leveraging natural language understanding and
generation, offers unparalleled flexibility and expressiveness, albeit facing challenges
in maintaining accuracy and coherence.
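
To make the extractive idea concrete, the following is a minimal sketch of a frequency-based extractive summarizer in JavaScript. It is a generic textbook technique shown for illustration only, not the method used by this project (which relies on a hosted pretrained model). The function names, the stop-word list, and the scoring rule are all assumptions chosen for brevity.

```javascript
// Illustrative extractive summarizer: sentences are scored by word frequency.
// Generic technique for illustration; NOT the summarization method of this project.

// Small, hypothetical stop-word list; a real system would use a fuller one.
const STOP_WORDS = new Set([
  "a", "an", "the", "and", "or", "but", "of", "to", "in", "on", "for",
  "is", "are", "was", "were", "it", "this", "that", "with", "as", "by",
]);

// Split text into sentences at '.', '!' or '?' followed by whitespace.
function splitSentences(text) {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Lowercase word tokens with stop words removed.
function tokenize(sentence) {
  return (sentence.toLowerCase().match(/[a-z0-9']+/g) || [])
    .filter((w) => !STOP_WORDS.has(w));
}

// Return the `maxSentences` highest-scoring sentences, in their original order.
function extractiveSummary(text, maxSentences = 2) {
  const sentences = splitSentences(text);

  // Count how often each non-stop word occurs in the whole text.
  const freq = new Map();
  for (const s of sentences) {
    for (const w of tokenize(s)) freq.set(w, (freq.get(w) || 0) + 1);
  }

  // Score = average frequency of the sentence's words, so that long
  // sentences are not favoured automatically.
  const scored = sentences.map((sentence, index) => {
    const words = tokenize(sentence);
    const total = words.reduce((sum, w) => sum + freq.get(w), 0);
    return { sentence, index, score: words.length ? total / words.length : 0 };
  });

  return scored
    .sort((a, b) => b.score - a.score || a.index - b.index) // best first
    .slice(0, maxSentences)
    .sort((a, b) => a.index - b.index)                      // restore order
    .map((x) => x.sentence);
}
```

Every sentence returned by `extractiveSummary` appears verbatim in the input, which is exactly the property that separates the extractive approach from the abstractive one described above.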

Given the dynamic nature of online content and the increasing demand for efficient
information retrieval mechanisms, the search for robust automatic text summarization
techniques becomes crucial. In this report, we present a framework for summarizing
YouTube transcripts.

1.2 Problem Statement & Objectives

Problem Statement:
In the digital era, the abundance of online video content, particularly on platforms like
YouTube, presents a challenge for users in efficiently extracting relevant information
from lengthy video transcripts. To address this issue, we propose the development of
an automated YouTube transcript summarizer tool. This tool aims to condense lengthy
transcripts into concise summaries, alleviating information overload and enhancing
user experience by providing key insights without the need for exhaustive manual
review. Challenges include processing the diverse content landscape of YouTube,
ensuring relevance and accuracy in generated summaries, and catering to user
preferences for customization and personalization. By tackling these challenges, the
proposed tool seeks to empower users with efficient navigation and value extraction
from the vast array of video content available online.

Objectives:

A YouTube transcript summarizer aims to automatically generate concise summaries
of video content. Here are its key objectives:

● Save Time: By summarizing the transcript, viewers can quickly grasp the
video's main points without having to watch the entire thing. This is especially
helpful for long videos.
● Improve Information Access: With summaries, users can efficiently find
relevant videos. Imagine searching for a specific topic - summaries help you
decide which videos hold the information you need.
● Enhance Understanding: Summarizers can highlight important concepts and
keywords, making it easier for viewers to understand the video's core content.

2. Literature Survey

2.1 Survey of Existing System

YouTube transcript summarizers have gained attention from researchers in recent
years because of the growing amount of video content available on the platform.
This section surveys some of the previous work on YouTube transcript summarizers.
In ‘Automated Video Summarization Using Speech Transcript’, Cuneyt M. Taskiran,
Arnon Amir, Dulce B. Ponceleon, and Edward J. Delp observe that compact
representations of video data enable efficient video browsing. They propose a
method that summarizes long videos automatically. Their representations give the
user relevant information about the content of the particular sequence being
examined while preserving the essentials of that content [1].

‘Video Summarization using NLP’ by Sanjana R., Sai Gagana V, Vedhavati K R, and
Kiran K N proposes automatic video summarization using Natural Language
Processing (NLP) based algorithms. The increasing popularity of YouTube has
produced a repository of millions of videos, and with it an increased demand for
summarization algorithms that can condense videos without losing essential
information. Their proposed system works from YouTube video transcripts, on the
basis of which a summarized video is generated [2].

Millions of videos are created and shared on platforms such as YouTube, Reddit,
and Instagram. Finding the time to watch them, especially the longer ones, is
becoming a challenge. The effort of watching a video may be wasted if it does not
yield the information we are looking for. Summarizing the transcript of such a video
helps extract the meaningful information without watching the video in full.

2.2 Limitations of Existing System

A YouTube summarization model extracts the transcript of a video and generates a
summarized version of it. The model automatically produces a summary that contains
the important sentences and covers the relevant information in the original
transcript. The abstractive approach generates new words and phrases that do not
appear in the input text, which makes the task more difficult, while the extractive
approach selects sentences and phrases directly from the input [3].

In 2017, Nallapati et al. proposed SummaRuNNer, a recurrent neural network based
sequence model for extractive summarization of documents. The authors reported
competitive results compared to other summarization methods [4].

In 2019, Nguyen et al. proposed a method for summarizing YouTube video transcripts
using a hybrid approach that combines rule-based and machine learning techniques.
The method used a set of rules to extract sentences from the transcript, which were
then used to train a machine learning model to generate a summary. The authors
evaluated their method on a dataset of TED Talks and reported competitive results
compared to other summarization methods [5].

In 2020, Zeng et al. proposed a method for summarizing YouTube video transcripts
using a graph-based approach. The method used a combination of the TF-IDF and
TextRank algorithms to extract key phrases from the transcript. The authors
evaluated their method on a dataset of TED Talks and reported competitive results
compared to other summarization methods [6].

In 2021, Huang et al. proposed a method for summarizing YouTube video transcripts
using a transformer-based language model. The method used a pre-trained
transformer model to extract relevant information from the transcript. The authors
evaluated their method on a dataset of educational videos and reported competitive
results compared to other summarization methods [7].

In conclusion, the literature survey suggests that YouTube transcript summarization
has been approached using various techniques, including deep learning models,
hybrid approaches, graph-based methods, and transformer-based language models.
These methods have been evaluated on different datasets, including TED Talks,
educational videos, and general YouTube videos, and have reported competitive
results compared to other summarization methods.

2.3 Mini Project Contribution

Table 2.3.1: Mini Project Individual Contribution Table.

Names Vipul Mrudul Abhinav Aditya

Planning ✔️ ✔️ ✔️ ✔️
Analysis ✔️ ✔️ ✔️
Research ✔️ ✔️
Design ✔️ ✔️
Implementation ✔️ ✔️ ✔️ ✔️
Draft ✔️ ✔️
Final Report ✔️ ✔️ ✔️

3. Proposed System

3.1 Introduction

In an era defined by rapid digitalization and information overload, the need for
efficient content consumption has never been more critical. Meetings, conferences,
and online tutorials often generate extensive video content, presenting a challenge for
individuals seeking to extract key insights without investing significant time.
Additionally, the vast array of videos available online makes it daunting for users to
locate relevant and high-quality content efficiently.

To address these challenges, we propose a video summarization system designed to
streamline content consumption and maximize productivity. Our system uses natural
language processing techniques to analyze video content and generate concise
summaries, enabling users to grasp essential information in a fraction of the time.

Getting the summary of a provided video or video link is simple with our system.
Users need only paste the video link into our interface, and within minutes they
receive a summary of the video's content. Whether catching up on missed meetings,
researching specific topics on online platforms like YouTube, or enhancing learning
experiences through concise tutorials, our solution helps users make the most of
their time.

Through this process, our video summarization system aims to deliver summaries
that are accurate and informative and that meet the diverse needs of users across
various domains. From business meetings to educational lectures, the system helps
users extract value from video content efficiently and effectively.

3.2 Architecture/ Framework/Block diagram

Fig. 3.2.1: Flowchart of overall system architecture

3.3 Algorithm and Process Design

3.3.1 Algorithm:

The algorithm used by the proposed system to generate a concise summary from a
given video or video link is as follows:

1. START.
2. Paste the video link or upload the video.
3. Generate the MP3 from the video.
4. Generate the transcript from the MP3.
5. Pass the transcript to the Deepgram API (which contains a pretrained model).
6. Get the concise summary.
7. Listen to, translate, and/or download the summary.
8. END.

3.3.2 Process Design:

1. We used the ffmpeg library to convert an MP4 file to an MP3 file, which is then
passed to the Deepgram API.
2. The response JSON is then sliced so that only the required data is kept.
3. The Deepgram API was chosen because it generates transcripts quickly.
4. The audio of a YouTube video is downloaded using the ytdl-core library.
5. The extracted summary is displayed to the user. Using the Google Translate API,
the summary is translated into the language preferred by the end user.
6. The user can choose the target language.
7. The audio play and pause feature is implemented using a React text-to-speech hook.
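
Steps 1 and 2 of the process design can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function names (`buildFfmpegArgs`, `convertToMp3`, `sliceResponse`) are hypothetical, the ffmpeg binary is assumed to be installed and on the PATH, and the response field paths (`results.channels[0].alternatives[0].transcript` and `results.summary.short`) are assumed from Deepgram's pre-recorded audio responses and should be checked against the current Deepgram documentation.

```javascript
const { spawn } = require("node:child_process");

// Build the ffmpeg arguments that turn a video file into an MP3 audio file.
function buildFfmpegArgs(inputPath, outputPath) {
  return [
    "-y",                      // overwrite the output file if it already exists
    "-i", inputPath,           // input video (for example an MP4 file)
    "-vn",                     // drop the video stream, keep audio only
    "-codec:a", "libmp3lame",  // encode the audio as MP3
    "-q:a", "2",               // variable-bitrate quality setting
    outputPath,
  ];
}

// Run ffmpeg as a child process; resolves with the output path on success.
// Assumes the `ffmpeg` executable is available on the PATH.
function convertToMp3(inputPath, outputPath) {
  return new Promise((resolve, reject) => {
    const proc = spawn("ffmpeg", buildFfmpegArgs(inputPath, outputPath));
    proc.on("error", reject);
    proc.on("close", (code) =>
      code === 0
        ? resolve(outputPath)
        : reject(new Error(`ffmpeg exited with code ${code}`))
    );
  });
}

// "Slice" the response JSON: keep only the fields the user interface needs.
// The field paths are assumptions about the response shape (see lead-in).
function sliceResponse(response) {
  const alternative = response?.results?.channels?.[0]?.alternatives?.[0];
  return {
    transcript: alternative?.transcript ?? "",
    summary: response?.results?.summary?.short ?? "",
  };
}
```

Keeping the argument builder and the response slicer as pure functions means they can be checked without ffmpeg or network access; only `convertToMp3` touches the outside world.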

3.4 Details of Hardware & Software

3.4.1 Hardware Requirements:

The minimum hardware requirements for developing this project are as follows:

1. Processor : Standard Processor with a speed of 1.6 GHz.
2. RAM : 256 MB RAM or more.
3. Hard Disk : 20 GB or more of storage space.
4. Monitor : Standard Color Monitor.
5. Internet Connection: Good internet connection with a speed of at least 5 Mbps.

3.4.2 Software Requirements:

1. Deepgram API: Active Deepgram account
2. HTML, CSS, JS
3. JavaScript runtime and frameworks:
- Node.js
- React JS
- Express JS for API calls and services
4. Bootstrap CSS
5. Ant Design
6. FFmpeg - to convert the MP4 file to an MP3 file.
7. Visual Studio Code: code editor used to write and run the code.

3.5 Experiment and Results for Validation and Verification

Fig. 3.5.1: Pasting the URL of Video and Generated Summary

Fig. 3.5.2: Generated Summary is translated into desired language

Fig. 3.5.3: Language translation is converted into speech for ease of user

Fig. 3.5.4: Upload video to get transcript of your video

3.6 Analysis

Our video summarization system employs a multi-stage analysis process to extract
key information from video content efficiently. Each stage is designed to support
accuracy and relevance in the generated summaries.

1. Video Parsing: The system begins by parsing the provided video or video link,
extracting the audiovisual content and associated metadata. This step lays the
foundation for subsequent analysis by providing the raw materials for content
extraction.
2. Speech Recognition: Using speech recognition algorithms, the system transcribes
the spoken dialogue from the video. This transcription captures the spoken content,
enabling the system to analyze and summarize verbal information effectively.
3. Text Processing: Once transcribed, the textual content undergoes natural language
processing (NLP) to identify key concepts, topics, and sentiments. By analyzing the
text at a semantic level, the system can extract meaningful information and discern
the primary themes addressed in the video.
4. Content Summarization: Building upon the insights gained from speech
recognition and text processing, the system generates a concise summary of the
video's content. This summary captures the most salient points, giving users an
overview of the video's key insights and takeaways.
5. Multilingual Support: By incorporating multilingual support, our system becomes
more accessible and inclusive, catering to users from diverse linguistic backgrounds.
Users worldwide can read the generated summaries in their preferred language.
6. Download Summary: The option to download the summarized content gives users
greater flexibility in accessing and sharing the generated summaries, whether for
offline viewing or for archiving for future reference.
7. Listen Summary: A "Listen Summary" feature enhances accessibility and usability,
particularly for users with visual impairments or those who prefer auditory learning
styles.
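
As an illustration of the multilingual support stage, the sketch below builds a translation request and reads the translated text from the response. The report states only that the Google Translate API is used; the sketch assumes the Google Cloud Translation Basic (v2) REST endpoint with an API key, and the function names (`buildTranslateRequest`, `parseTranslateResponse`, `translateSummary`) are hypothetical. It also assumes a runtime with a global `fetch` (a browser or a recent Node.js release).

```javascript
// Assumed endpoint: Google Cloud Translation Basic (v2). The project may use
// a different Google Translate client; treat this as an illustrative sketch.
const TRANSLATE_ENDPOINT =
  "https://translation.googleapis.com/language/translate/v2";

// Build the URL and fetch options for translating `summary` into the
// language code chosen by the user (for example "hi" or "mr").
function buildTranslateRequest(summary, targetLanguage, apiKey) {
  return {
    url: `${TRANSLATE_ENDPOINT}?key=${encodeURIComponent(apiKey)}`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ q: summary, target: targetLanguage, format: "text" }),
    },
  };
}

// Pull the translated text out of the response JSON; empty string if absent.
function parseTranslateResponse(json) {
  return json?.data?.translations?.[0]?.translatedText ?? "";
}

// Send the request and return the translated summary.
async function translateSummary(summary, targetLanguage, apiKey) {
  const { url, options } = buildTranslateRequest(summary, targetLanguage, apiKey);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Translation request failed: ${res.status}`);
  return parseTranslateResponse(await res.json());
}
```

In a deployed application the API key would normally stay on the Express server rather than in the browser, so `translateSummary` would be called from a server-side route.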

3.7 Conclusion and Future work.

3.7.1 Conclusion:

In conclusion, our proposed video summarization system improves the efficiency of
content consumption. Using natural language processing techniques, the system
extracts key insights from video content, enabling users to grasp essential
information in a fraction of the time.

Through speech recognition and text processing, the system generates concise
summaries that capture the most salient points of the original video. This lets users
quickly access relevant information, whether catching up on missed meetings,
conducting research, or enhancing learning experiences.

Furthermore, the system is designed to provide accurate and relevant summaries
that meet users' diverse needs across various domains. With our video
summarization system, users can reclaim valuable time and maximize productivity.

3.7.2 Future work:

Looking ahead, the future scope of our video summarization system is promising,
with several avenues for further enhancement and expansion:
1. Multimodal Analysis: Incorporating advanced techniques for multimodal analysis,
including the integration of audio, visual, and textual cues, to enhance the
comprehensiveness and accuracy of the generated summaries.
2. Real-time Summarization: Developing capabilities for real-time video
summarization, enabling users to access summarized content as videos are being
streamed or recorded, thereby enhancing efficiency in live settings.
3. Personalization: Implementing personalized summarization features that adapt to
individual user preferences and priorities, ensuring that summaries are tailored to
meet the specific needs and interests of each user.
4. Integration with Smart Devices: Integrating the video summarization system with
smart devices and virtual assistants to provide seamless access to summarized content
across various platforms and devices.

References

[1]. ‘Automated Video Summarization Using Speech Transcript’ by Cuneyt M.
Taskiran, Arnon Amir, Dulce B. Ponceleon, Edward J. Delp
[2]. “Digital video Summarization Techniques”, Ashenafi Workie, Rajesh Sharma,
Yun Koo Chun
[3]. S. Tharun, R. Kranthi Kumar, P. Sai Sravanth, G. Srujan Reddy, B. Akshay,
“Survey on Abstractive Transcript Summarization of YouTube Videos”, in
International Journal of Advanced Research in Science, Communication and
Technology (IJARSCT)
[4]. Nallapati, R., Zhai, F., & Zhou, B. (2017). SummaRuNNer: A
recurrent neural network based sequence model for extractive summarization of
documents. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol.
31, No. 1).
[5]. Nguyen, T. T., Nguyen, M. Q., Nguyen, L. T., & Nguyen, H. N. (2019). A hybrid
approach for summarizing youtube video transcripts. Information Processing &
Management, 56(6), 1444-1459.
[6]. Zeng, J., Wei, F., & Liu, S. (2020). Learning to summarize from human feedback
on summary prototypes. In Proceedings of the 58th Annual Meeting of the
Association for Computational Linguistics (pp. 5641-5647).
[7]. Huang, X., Shi, Y., Xiong, W., & Zhang, J. (2021). EduSum: A large-scale dataset
and neural model for automated educational video summarization. In Proceedings of
the 2021 Conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies (pp. 452-462).
[8]. https://atmamani.github.io/blog/building-restful-apis-with-flask-in-python/
[9]. https://pypi.org/project/youtube-transcript-api
