Technical Report 1.2

Semanto tube project technical report

Uploaded by

Talha Azeem

SemantoTube

TECHNICAL REPORT

Supervisor

Prof. Kareemullah

Submitted by

Talha Azeem
2019-ag-6072

Nabeel ur Rehman
2019-ag-6078

A TECHNICAL REPORT OF FYP SUBMITTED FOR THE DEGREE OF


BACHELOR OF SCIENCE
IN
COMPUTER SCIENCE
DEPARTMENT OF COMPUTER SCIENCE
FACULTY OF SCIENCES
UNIVERSITY OF AGRICULTURE FAISALABAD
DECLARATION

I hereby declare that the contents of the report SemantoTube are the product of my own research and
no part has been copied from any published source (except the references). I further declare that
this work has not been submitted for the award of any other diploma/degree. The university may
take action if the information provided is found to be false at any stage. In case of any default, the
scholar will be proceeded against as per UAF policy.

_________________

Talha Azeem

_________________

Nabeel ur Rehman
CERTIFICATE

To,
The Controller of Examinations,
University of Agriculture,
Faisalabad.

The supervisory committee certifies that TALHA AZEEM [2019-ag-6072] and NABEEL UR
REHMAN [2019-ag-6078] have successfully completed their FYP for the degree of B.S.
Computer Science under our guidance and supervision.

_____________________________________
Prof. Kareemullah
Supervisor

_____________________________________
Talha Azeem
Member
_____________________________________
Nabeel ur Rehman
Member

_______________________________________
Dr. Muhammad Ahsan Latif
Incharge,
Department of Computer Science
ACKNOWLEDGEMENT

I thank all who in one way or another contributed to the completion of this report. First, I thank
ALLAH ALMIGHTY, most magnificent and most merciful, for all His blessings. I am also
grateful to the Department of Computer Science for making it possible for me to study here.
My special and heartfelt thanks go to my supervisor, Prof. Kareemullah, who encouraged and
directed me. Her challenges brought this work to completion, and it is with her supervision
that this work came into existence. For any faults I take full responsibility. I am also deeply
thankful to my informants, and I want to acknowledge and appreciate their help and transparency
during my research. I am also thankful to my fellow students, whose challenges and constructive
criticism have brought new ideas to the work. Furthermore, I thank my family, who
encouraged me and prayed for me throughout the time of my research. May the Almighty God
richly bless all of you.
ABSTRACT

SemantoTube is a web-based video search engine that leverages natural language processing
(NLP) techniques to revolutionize the way users search for information within YouTube videos.
With the vast amount of video content available on the internet, finding specific information
within lengthy videos can be time-consuming and frustrating. SemantoTube aims to address this
challenge by understanding the context and meaning of the video's content through NLP
analysis.

The project enables users to enter a YouTube video link and query, which is then processed by
retrieving the video's transcript and tokenizing it. The tokenized transcript, along with the user's
query, is sent to the Cohere AI API to obtain the most relevant results. These results are
presented to the user in a clear and organized manner, allowing for efficient navigation through
the video content.

SemantoTube focuses on semantic search, going beyond keyword-based approaches to
understand the intent and context of the user's query. By providing a more accurate and
meaningful search experience, users can quickly locate specific topics, phrases, or quotes within
videos. The system also offers the ability to automatically play the video at the exact time the
relevant information appears, saving users valuable time and effort.

In addition to its search capabilities, SemantoTube prioritizes user-friendliness, scalability, and
multilingual support. The application is designed to handle large numbers of videos and users,
optimize performance across various devices, and integrate seamlessly with popular video
platforms like YouTube. With its wide range of functionalities and a commitment to enhancing
the video search experience, SemantoTube aims to empower users in their quest for knowledge
in an ever-expanding digital world.
Table of Contents

1. INTRODUCTION
1.1 Background
1.2 Description
1.3 Scope
1.4 Objectives
2. REQUIREMENTS
2.1 Functional Requirements
2.2 Non-Functional Requirements
2.3 Hardware Requirements
2.4 Software Requirements
3. METHODOLOGY
4. Timeline
5. Design
6. RESULTS & DISCUSSION

List of Figures

Figure 1: RAD Activities
Figure 2: Tentative Timeline of the Project activities

1. INTRODUCTION
1.1 Background
In today's world, the amount of information available on the internet is vast and growing every
day. One of the most common ways people consume this information is through videos,
especially on platforms like YouTube. However, searching for specific information within a
video can be time-consuming and frustrating. The proposed project, SemantoTube, aims to
address this problem by using natural language processing (NLP) to understand the contents of
videos and return results based on the user's query.
1.2 Description
SemantoTube is a web-based video search engine that utilizes NLP to understand the contents
of videos and return the most relevant results based on the user's query. The user can enter a link
to a YouTube video, and the system will retrieve the transcript of the video and tokenize it before
sending it to the Cohere AI API along with the user's query. The system will then receive the
most relevant results from the API and present them to the user. The user can then select a result,
and the video will automatically play at the exact time the relevant information appears. This will
save the user time when searching for specific information within lengthy YouTube videos.
1.3 Scope
The proposed project, "SemantoTube," is a web-based video search engine that utilizes natural
language processing (NLP) to understand the contents of YouTube videos and return results
based on the user's query. The project aims to provide users with a more efficient way of
searching for information in lengthy YouTube videos by utilizing the Cohere AI API to match
the user's query with the transcript of the video and return the most relevant results.
Manually scanning an entire YouTube video to find the answer to a question is difficult.
SemantoTube aims to perform this search in seconds and to return answers based on semantic
relevance rather than keyword matching.
1.4 Objectives
The primary goal of the project is to develop a web-based video application that utilizes natural
language processing (NLP) techniques to help users quickly locate specific information within
longer YouTube videos in multiple languages.
● The application should be able to understand the context and meaning of the text in the
video, rather than just searching for keywords.
● Users should be able to search for specific topics, phrases, or quotes within the video
semantically and have relevant results returned to them.
● The application should be user-friendly and easy to navigate.
● The application should be able to handle large numbers of videos and users.
● The application should be scalable to handle increasing numbers of videos and users.
● The application should have multilingual support, allowing users to search for videos in
different languages.
● The application should perform semantic search, understanding the intent and context of
the user's query and returning more accurate results.
● The application should be able to extract key information from the video and make it
easily accessible to the user.
● The application should be able to handle a wide range of video formats and sources.
● The application should be able to analyze the video's audio and transcript to improve the
search experience.
● The application should be able to handle large numbers of requests and provide results in
real time.
● The application should be integrated with popular video platforms such as YouTube.
● The application should be optimized for mobile and tablet devices.
● The application should be available in multiple languages.

2. REQUIREMENTS
2.1 Functional Requirements

FR01: Search within a video
FR01-01 Provide a search bar where users can input a query
FR01-02 Allow users to submit a YouTube video link along with the query
FR01-03 Retrieve the transcript of the video from the YouTube API
FR01-04 Allow users to search for specific information within the video by providing a
query

FR02: Retrieve the transcript
FR02-01 Retrieve the transcript of the video from the YouTube API
FR02-02 Tokenize the transcript by breaking it into individual words or phrases

FR03: Send the user's query to the Cohere AI API
FR03-01 Send the tokenized transcript and the user's query to the Cohere AI API
FR03-02 Receive the most relevant results from the API

FR04: Relevant results from the API
FR04-01 Receive the most relevant results from the API
FR04-02 Present the results to the user in a clear and organized manner

FR05: Automatically play the video at the exact time
FR05-01 Allow the user to select a specific result from the search results
FR05-02 Automatically play the video at the exact time the relevant information appears
in the video

FR06: Responsive Design
FR06-01 Ensure the system is compatible with different web browsers and devices
FR06-02 Ensure the system is responsive and adapts to different screen sizes
FR06-03 Optimize the system for mobile and tablet devices

2.2 Non-Functional Requirements


NFR01 System shall remain available 24/7 to its users.
NFR02 System shall provide tooltips for every option/button.
NFR03 System shall be mobile responsive and optimized for mobile devices.
NFR04 System shall be able to handle at least 100 concurrent users.
NFR05 System shall have an average response time of less than 2 seconds.
NFR06 System shall have at least 99% uptime.
NFR07 System shall be secure, with proper encryption.
NFR08 System shall be compliant with relevant regulations and standards.
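
NFR01 (availability 24/7) and NFR06 (at least 99% uptime) are easier to reason about once the uptime percentage is converted into a downtime budget. The short calculation below is only illustrative arithmetic; the function name and the 30-day and 365-day periods are assumptions chosen for the example, not figures from the requirements.

```python
def allowed_downtime_hours(uptime_fraction: float, period_days: int) -> float:
    """Hours of downtime permitted in a period for a given uptime target."""
    return (1.0 - uptime_fraction) * period_days * 24


# Downtime budget implied by a 99% uptime target (NFR06).
print(round(allowed_downtime_hours(0.99, 30), 2))   # 7.2 hours per 30 days
print(round(allowed_downtime_hours(0.99, 365), 2))  # 87.6 hours per year
```

A 99% target therefore still permits several hours of downtime per month, which is worth keeping in mind when reading it next to the 24/7 availability requirement.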

2.3 Hardware Requirements


Processor: A quad-core processor with a minimum clock speed of 2.4 GHz or higher
Memory: A minimum of 8 GB of RAM
Storage: A minimum of 200 GB of storage space for the application and media files
Operating System: Linux
Web Server: Nginx
Network: A high-speed internet connection with at least 1 Gbps upload and download speeds
Monitoring: A monitoring solution to track the performance and uptime of the system
Security: A firewall and intrusion detection/prevention system to protect the system from cyber-attacks

2.4 Software Requirements


Programming languages: Python, JavaScript, HTML, CSS
Web Framework: Flask
NLP: Cohere AI API
Cloud services: AWS, GCP or Azure
Web Server: Apache or Nginx
Operating System: Linux
Video player: YouTube player

3. METHODOLOGY

The methodology used for this project will be the Rapid Application Development (RAD)
methodology. RAD is an iterative development approach that emphasizes rapid prototyping
and incremental delivery of working software. This methodology is chosen for this project as it is
well-suited for projects where the requirements are not well-defined and are likely to change
over time.

The RAD methodology consists of four major activities: Requirements Planning, User Design,
Construction, and Cutover.
Requirements Planning: In this phase, the project team will gather and document the
requirements of the project. The team will conduct interviews with stakeholders and users to
gather requirements and feedback.

User Design: In this phase, the project team will create a prototype of the system based on the
gathered requirements. The prototype will be presented to the stakeholders and users for
feedback and testing.

Construction: In this phase, the project team will start development of the system based on the
approved prototype. The team will use agile development methods to ensure that the system is
developed in an incremental and iterative manner.

Cutover: In this phase, the project team will test and deploy the system. The team will conduct
user acceptance testing and training before deploying the system to production.

3.1 Tools & Technologies

● ReactJS for frontend development
● Python Flask for backend development
● Cohere AI API for natural language processing
● Git for version control
● Visual Studio Code or PyCharm as the development environment
● Postman for testing the API
● Vercel for hosting the web application

3.2 Models

There are a few different models that could be used for this project, depending on the
specific requirements and goals of the project. Below are a few options.

● A transcription model: This model would be responsible for transcribing the audio
from the YouTube videos into text. It could be trained on a large dataset of
transcribed videos to learn the patterns of speech and improve its accuracy.

● The system will transcribe the speech in the video so that it can be sent to the Cohere AI API.

● A natural language processing (NLP) model: This model would be responsible for
understanding the contents of the video transcriptions and returning results
based on the user's query. It could be trained on a large dataset of text
labeled with categories or topics to learn how to classify and understand the
text.

● A search algorithm: This component would be responsible for searching through the
transcriptions and returning the results most relevant to the user's query. It
could be trained on a dataset of text labeled with categories or topics to
learn how to match the user's query with the most relevant text.

● A time-stamp generator algorithm: This component would be responsible for generating
time-stamps for the relevant information in the video based on the user's query. It
could be trained on a dataset of transcripts and time-stamped videos to learn how
to match the text with the corresponding time in the video
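
When the transcript already carries per-segment start times, the time-stamp step does not strictly need a trained model: a position in the joined transcript text can be mapped back to a start time with a lookup. The sketch below illustrates that idea under stated assumptions. The segment shape (dictionaries with `text` and `start` keys) and both function names are hypothetical and are not taken from the project code.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple


def build_offset_index(segments: List[Dict]) -> Tuple[str, List[int], List[float]]:
    """Join segment texts with single spaces and record where each one begins."""
    parts: List[str] = []
    offsets: List[int] = []   # character offset of each segment in the joined text
    starts: List[float] = []  # start time (seconds) of each segment
    position = 0
    for seg in segments:
        offsets.append(position)
        starts.append(seg["start"])
        parts.append(seg["text"])
        position += len(seg["text"]) + 1  # +1 for the joining space
    return " ".join(parts), offsets, starts


def time_for_offset(char_offset: int, offsets: List[int], starts: List[float]) -> float:
    """Return the start time of the segment that contains char_offset."""
    index = bisect_right(offsets, char_offset) - 1
    return starts[max(index, 0)]


segments = [
    {"text": "hello world", "start": 0.0},
    {"text": "flask is a web framework", "start": 4.2},
    {"text": "thanks for watching", "start": 9.0},
]
full_text, offsets, starts = build_offset_index(segments)
hit = full_text.find("web")
print(time_for_offset(hit, offsets, starts))  # 4.2
```

The binary search keeps each lookup logarithmic in the number of segments, which matters little for a single video but keeps the helper cheap if many matches are resolved at once.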

4. Timeline
The project is expected to take approximately 2 months to complete. A tentative timeline of the
project activities is given in Figure 2 below.
Figure 2: Tentative Timeline of the Project activities

The project will start with the requirements planning phase, where the project team will gather
and document the requirements of the project. This phase is expected to take 2 months.

Next, the project team will move on to the user design phase, where a prototype of the system
will be created based on the gathered requirements. The prototype will be presented to the
stakeholders and users for feedback and testing. This phase is expected to take half a month.

In the construction phase, the project team will start development of the system based on the
approved prototype. The team will use agile development methods to ensure that the system is
developed in an incremental and iterative manner. This phase is expected to take 1 month.

Finally, in the cutover phase, the project team will test and deploy the system. The team will
conduct user acceptance testing and training before deploying the system to production. This
phase is expected to take half a month.

5. Design

5.1 Use Case Diagram

5.1.1 Use Case Scenarios

Use Case Title: Search Within a Video
Use Case Id: 01
Requirement Id: FR01

Description: This use case describes the process of searching for specific information within a video.

Pre-Conditions:
The user must have a YouTube video link.
The video must have a transcript available.

Task Sequence:
1. User enters a query in the search bar.
2. User submits a YouTube video link along with the query.
3. The system retrieves the transcript of the video from the YouTube API.
4. The system tokenizes the transcript by breaking it into individual words or
phrases.
5. The system sends the tokenized transcript and the user's query to the Cohere AI
API.
6. The system receives the most relevant results from the API.
7. The system presents the results to the user in a clear and organized manner.

Exceptions:

Post Conditions: The user is provided with relevant search results within the video.

Unresolved issues: No issue

Authority: User
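
Before the transcript can be requested, the submitted link has to be reduced to a video ID. The following is a minimal sketch of that step using only the Python standard library. The function name `extract_video_id` and the set of accepted link shapes are assumptions made for illustration; the report does not describe how the project parses links.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(link: str) -> Optional[str]:
    """Return the video ID from a common YouTube link shape, or None."""
    parsed = urlparse(link.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    candidate = ""
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in {"youtube.com", "m.youtube.com"}:
        if parsed.path == "/watch":
            # Standard watch links carry the ID in the "v" query parameter.
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        elif parsed.path.startswith(("/embed/", "/shorts/")):
            candidate = parsed.path.split("/")[2]
    return candidate or None


print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=10s"))
print(extract_video_id("https://youtu.be/dQw4w9WgXcQ"))
print(extract_video_id("https://example.com/watch?v=dQw4w9WgXcQ"))
```

Links without a scheme (for example `youtube.com/watch?v=...`) return `None` in this sketch, so a real form handler would probably normalize the input first or show a validation message.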

Use Case Title: Retrieve Video Transcript
Use Case Id: 02
Requirement Id: FR02

Description: This use case describes the process of retrieving the transcript of a video.

Pre-Conditions:
The user must have a YouTube video link.
The video must have a transcript available.

Task Sequence:
1. User submits a YouTube video link.
2. The system retrieves the transcript of the video from the YouTube
API.
3. The system tokenizes the transcript by breaking it into individual
words or phrases.

Exceptions:

Post Conditions: The transcript of the video is retrieved and tokenized for further processing.

Unresolved issues: No issue

Authority: User
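
The tokenization step above can be sketched as follows, assuming the transcript arrives as a list of caption segments with `text` and `start` (seconds) fields. That shape, the function name `chunk_transcript`, and the `max_words` threshold are assumptions for illustration only. The sketch splits each segment on whitespace and groups consecutive segments into passages that remember the start time of their first segment, so a later match can be tied back to a point in the video.

```python
from typing import Dict, List


def chunk_transcript(segments: List[Dict], max_words: int = 40) -> List[Dict]:
    """Group consecutive caption segments into passages of roughly max_words words."""
    passages: List[Dict] = []
    words: List[str] = []
    start = None
    for seg in segments:
        tokens = seg["text"].split()  # simple whitespace tokenization
        if not tokens:
            continue  # skip empty caption segments
        if start is None:
            start = seg["start"]  # passage starts where its first segment starts
        words.extend(tokens)
        if len(words) >= max_words:
            passages.append({"start": start, "text": " ".join(words)})
            words, start = [], None
    if words:
        # Flush whatever is left as a final, shorter passage.
        passages.append({"start": start, "text": " ".join(words)})
    return passages


segments = [
    {"text": "hello world", "start": 0.0},
    {"text": "this is a test", "start": 2.5},
    {"text": "", "start": 4.0},
    {"text": "final words here", "start": 5.0},
]
for passage in chunk_transcript(segments, max_words=5):
    print(passage)
```

Passage size is a trade-off: shorter passages give more precise timestamps, while longer ones give the ranking step more context per candidate.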

Use Case Title: Send Query to Cohere's API
Use Case Id: 03
Requirement Id: FR03

Description: This use case describes the process of sending the user's query to the Cohere AI API.

Pre-Conditions:
The transcript of the video must be available and tokenized.

Task Sequence:
1. The system sends the tokenized transcript and the user's query to the
Cohere AI API.
2. The system receives the most relevant results from the API.

Exceptions:

Post Conditions: The system receives the relevant results from the Cohere AI API.

Unresolved issues: No issue

Authority: User
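
The report does not state which Cohere endpoint performs the relevance scoring, so the sketch below does not call Cohere at all. It is a local stand-in that shows the shape of the exchange (a query plus a list of passages in, a ranked list out) using bag-of-words cosine similarity. This is a lexical measure, not semantic search, and every name in it (`rank_passages`, the `start`/`text`/`score` keys) is hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def _vector(text: str) -> Counter:
    """Lower-cased word counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_passages(query: str, passages: List[Dict], top_n: int = 3) -> List[Dict]:
    """Stand-in for the remote relevance call: score each passage, keep the best."""
    q = _vector(query)
    scored = [{**p, "score": _cosine(q, _vector(p["text"]))} for p in passages]
    scored.sort(key=lambda p: p["score"], reverse=True)
    return scored[:top_n]


passages = [
    {"start": 0.0, "text": "welcome to the channel"},
    {"start": 30.0, "text": "how to install flask on linux"},
    {"start": 90.0, "text": "thanks for watching"},
]
for result in rank_passages("install flask", passages, top_n=2):
    print(result["start"], round(result["score"], 3), result["text"])
```

In the system described here, the body of `rank_passages` would be replaced by the request to the Cohere AI API; keeping the same input and output shape means the result display and playback steps would not need to change.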

Use Case Title: Display Relevant Results
Use Case Id: 04
Requirement Id: FR04

Description: This use case describes the process of presenting the most relevant search results to the
user.

Pre-Conditions:
The relevant results from the Cohere AI API must be available.

Task Sequence:
1. The system presents the results to the user in a clear and organized
manner.

Exceptions:

Post Conditions: The user is provided with the most relevant search results.

Unresolved issues: No issue

Authority: User
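
One plausible reading of "clear and organized" is a ranked list in which each result shows a human-readable timestamp next to the matching text. The sketch below assumes each result is a dictionary with `start` (seconds) and `text` keys; that shape and both function names are illustrative assumptions, not the project's actual interface.

```python
from typing import Dict, List


def format_timestamp(seconds: float) -> str:
    """Format a position in seconds as M:SS, or H:MM:SS for long videos."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def render_results(results: List[Dict]) -> List[str]:
    """Turn ranked results into numbered display lines."""
    return [
        f"{rank}. [{format_timestamp(r['start'])}] {r['text']}"
        for rank, r in enumerate(results, start=1)
    ]


for line in render_results([
    {"start": 30.0, "text": "how to install flask"},
    {"start": 3725.0, "text": "deploying the app"},
]):
    print(line)
```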

Use Case Title: Automatically Play Video at Exact Time
Use Case Id: 05
Requirement Id: FR05

Description: This use case describes the process of automatically playing the video at the exact time the
relevant information appears.

Pre-Conditions:
The user has selected a specific result from the search results.

Task Sequence:
1. The user selects a specific result from the search results.
2. The system automatically plays the video at the exact time the relevant
information appears.

Exceptions:

Post Conditions: The video starts playing at the exact time the relevant information appears.

Unresolved issues: No issue

Authority: User
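
One way to start playback at a chosen moment is to load the YouTube embedded player with its `start` and `autoplay` URL parameters. The helper below is a sketch of that approach; the function name is hypothetical, and the report does not say whether the project uses URL parameters or the player's JavaScript interface for seeking.

```python
from urllib.parse import urlencode


def embed_url_at(video_id: str, start_seconds: float) -> str:
    """Build an embed URL that begins playback at the given offset."""
    params = urlencode({
        "start": max(0, int(start_seconds)),  # the parameter takes whole seconds
        "autoplay": 1,
    })
    return f"https://www.youtube.com/embed/{video_id}?{params}"


print(embed_url_at("dQw4w9WgXcQ", 93.7))
# https://www.youtube.com/embed/dQw4w9WgXcQ?start=93&autoplay=1
```

Because `start` is expressed in whole seconds, "exact time" in practice means the second in which the matching transcript segment begins. Browsers may also restrict autoplay depending on their media policies, so the player may need a user click before sound plays.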

5.2 Sequence Diagram

5.3 Class Diagram

6. RESULTS & DISCUSSION
6.1 Test Cases

Test Case ID: 1


Test Case Title: Test the functionality of searching for specific information within a video.
Test Case Priority: High
Requirement: FR01
Test Description: Test the functionality of searching for specific information within a video.
Test Date: 05/25/2023
Dependencies:
Test Steps: 1. Enter a query in the search bar.
2. Submit a YouTube video link along with the query.
3. Verify that the system retrieves the transcript of the video.
4. Verify that the system tokenizes the transcript correctly.
5. Verify that the system sends the tokenized transcript and the user's
query to the Cohere AI API.
6. Verify that the system receives the most relevant results from the API.
7. Verify that the system presents the results to the user in a clear and
organized manner.

Test Data
Expected Results: The user is provided with relevant search results within the video.

Actual Results: As above


Status: (Pass/Fail) Pass

Test Case ID: 2


Test Case Title: Test the functionality of retrieving the transcript of a video.
Test Case Priority: High
Requirement: FR02
Test Description: Test the functionality of retrieving the transcript of a video.

Test Date: 05/25/2023


Dependencies:
Test Steps: 1. Submit a YouTube video link.
2. Verify that the system retrieves the transcript of the video.
3. Verify that the system tokenizes the transcript correctly.

Test Data
Expected Results: The transcript of the video is retrieved and tokenized for further
processing.
Actual Results: As above
Status: (Pass/Fail) Pass

Test Case ID: 3
Test Case Title: Send Query to Cohere's API
Test Case Priority: High
Requirement: FR03
Test Description: Test the functionality of sending the user's query to the Cohere AI API.

Test Date: 05/25/2023


Dependencies:
Test Steps: 1. Send the tokenized transcript and the user's query to the Cohere AI
API.
2. Verify that the system receives the relevant results from the API.

Test Data
Expected Results: The system receives the relevant results from the Cohere AI API.

Actual Results: As above


Status: (Pass/Fail) Pass

Test Case ID: 4


Test Case Title: Display Relevant Results
Test Case Priority: High
Requirement: FR04
Test Description: Test the functionality of presenting the most relevant search results to the user.

Test Date: 05/25/2023


Dependencies:
Test Steps: 1. Verify that the system presents the results to the user in a clear and
organized manner.

Test Data
Expected Results: The user is provided with the most relevant search results.

Actual Results: As above


Status: (Pass/Fail) Pass

Test Case ID: 5

Test Case Title: Automatically Play Video at Exact Time
Test Case Priority: High
Requirement: FR05
Test Description: Test the functionality of automatically playing the video at the exact time the
relevant information appears.
Test Date: 05/25/2023
Dependencies:
Test Steps: 1. Select a specific result from the search results.
2. Verify that the video starts playing at the exact time the relevant
information appears.

Test Data
Expected Results: The video starts playing at the exact time the relevant information
appears.
Actual Results: As above
Status: (Pass/Fail) Pass

Test Case ID: 6


Test Case Title: Responsive Design
Test Case Priority: High
Requirement: FR06
Test Description: Test the responsiveness and compatibility of the system with different web
browsers and devices.
Test Date: 05/25/2023
Dependencies:
Test Steps: 1. Access the system using different web browsers (e.g., Chrome, Firefox,
Safari).
2. Access the system using different devices (e.g., desktop, mobile, tablet).

Test Data
Expected Results: The system is compatible with different web browsers and devices, and
it adapts to different screen sizes.
Actual Results: As above
Status: (Pass/Fail) Pass

6.2 Conclusion

Based on the test cases conducted for the SemantoTube project, it can be concluded that the
system performs well in retrieving video transcripts, searching within videos, and providing
relevant results to the users. The tokenization process is accurate, and the integration with the
Cohere AI API successfully retrieves relevant results based on user queries.

The system also demonstrates responsiveness and compatibility with different web browsers and
devices, ensuring a seamless user experience across various platforms. The automatic playback
feature functions as expected, playing the video at the exact time the relevant information
appears.

Overall, the test results indicate that the SemantoTube project meets the functional requirements
and performs its intended tasks effectively. The project shows promise in providing users with a
more efficient way of searching for specific information within lengthy YouTube videos,
leveraging natural language processing techniques and the Cohere AI API.

However, it is important to note that these test cases cover a limited scope of the project. To
ensure comprehensive testing, additional test cases should be designed to cover edge cases,
performance testing, security testing, and compatibility with different video formats and sources.

Continued testing and refinement will be crucial to further improve the system's accuracy,
usability, and scalability. Regular updates and bug fixes should be implemented based on user
feedback and additional requirements that may arise. With ongoing testing and improvements,
SemantoTube has the potential to become a valuable tool for users seeking efficient video search
capabilities.
