Project Report
Minor Project
Submitted By
June 2023
CERTIFICATE
Date :
PANKAJ KUMAR
Signature of Student
Date :
ABSTRACT
ACKNOWLEDGEMENT
CONTENTS
Declaration
Certificate
Acknowledgement
Contents
1. Introduction
   1.1 Overview
2. System Analysis
   2.1 Purpose
      2.1.1 Existing System
      2.1.2 Proposed System
   2.2 Problem Definition
   2.3 Feasibility Study
      2.3.1 Technical Feasibility
      2.3.2 Economic Feasibility
      2.3.3 Operational Feasibility
   2.4 Objective of the Study
3. System Requirements
4. Methodology
5. Design Document
   5.1 Purpose
   5.2 Scope
   5.3 Overview
   5.4 Data Design
   5.5 Tables
6. Development of the System
7. Conclusion
8. Future Scope
9. References
CHAPTER 1
INTRODUCTION
1.1 Overview
This project implements a real-time face detection system that uses the
MTCNN model to detect faces in video frames, draws rectangles around the
detected faces, and serves the processed frames as a video stream via a
Flask web server.
CHAPTER 2
SYSTEM ANALYSIS
2.1 Purpose
The purpose of this project is to implement a real-time face detection system using the Multi-task
Cascaded Convolutional Networks (MTCNN) model and Flask, a Python web framework.
This system differs from comparable systems in the following respects:
1. Real-Time Face Detection: Unlike some systems that process pre-recorded videos, this
system is designed to work with real-time video streams. This makes it suitable for
applications that require immediate face detection, such as surveillance or live events.
2. Use of MTCNN: The system uses the Multi-task Cascaded Convolutional Networks
(MTCNN) model for face detection. MTCNN is known for its high detection accuracy, even
with partial occlusion and small faces, conditions that many alternative face detection
models handle less well.
3. Web-Based Interface: The system uses Flask, a Python web framework, to serve the
processed video stream to a web page. This means that the face detection results can be
accessed from any device with a web browser, which may not be the case with other systems
that require specific software to view the results.
4. OpenCV Integration: The system uses OpenCV for video capture, processing, and encoding.
OpenCV is a popular library for computer vision tasks, and its integration into this system
allows for a wide range of additional functionality if needed, such as object detection and
motion tracking.
The proposed system is a real-time face detection application that processes video frames, detects
faces, draws bounding boxes around the detected faces, and streams the processed video to a web page.
This system could be used in a variety of applications, such as surveillance, live events, or any scenario
where real-time face detection is required.
2.2 Problem Definition
The problem is to design and implement a system that can perform real-time face
detection on video streams, process the video data, and serve the results to a web page.
2.4 Objective of the Study
The objective of this study is to design, implement, and evaluate a real-time face detection system that can
process video streams, detect faces in each frame, draw bounding boxes around the detected faces, and
serve the processed frames as a video stream via a web-based interface. Specifically, the study aims to:
a. Implement Real-Time Face Detection: Use the Multi-task Cascaded Convolutional Networks
(MTCNN) model to detect faces in real-time video frames.
b. Draw Bounding Boxes: For each detected face, draw a rectangle (or bounding box) around the
face on the frame to visually represent the results of the face detection process.
c. Process Video Frames: Write the processed frames, with the bounding boxes drawn, to an output
video file.
d. Serve Video Stream: Encode the processed frames as JPEG images and yield them in a format
that can be used for a video stream.
e. Web-Based Interface: Use the Flask web framework to serve the video stream to a web page,
allowing the face detection results to be accessed from any device with a web browser.
f. Evaluate System Performance: Evaluate the performance of the system in terms of face
detection accuracy and processing speed. This will involve testing the system with different
types of video streams and measuring its performance.
CHAPTER 3
SYSTEM REQUIREMENTS
The system requirements for this project can be categorized into hardware requirements and software
requirements:
Hardware Requirements:
Processor: A modern multi-core processor. Real-time video processing and face detection are computationally
intensive tasks that can benefit from parallel processing.
Memory: Sufficient RAM to handle the video processing and face detection tasks. The exact amount will depend
on the resolution and frame rate of the video streams.
Storage: Sufficient storage space to store the processed video files. The exact amount will depend on the length
and quality of the videos.
Software Requirements:
Operating System: Any operating system that supports Python and the required libraries (Windows, macOS,
Linux, etc.).
Python: Python 3 with the OpenCV (cv2), MTCNN, and Flask libraries installed.
Development Environment: A development environment that supports Python, such as Visual Studio Code.
CHAPTER 4
METHODOLOGY
The methodology for this project involves several steps, each corresponding to a part of the provided code snippet:
1. Face Detection: The system uses the Multi-task Cascaded Convolutional Networks (MTCNN) model to
detect faces in video frames. This is done in the line faces = detector.detect_faces(frame_rgb).
2. Drawing Bounding Boxes: For each detected face, the system draws a rectangle (or bounding box)
around the face on the frame. This is done in the loop for face in faces:.
3. Video Processing: The processed frames, with the bounding boxes drawn, are written to an output video
file. This is done in the line out.write(frame).
4. Video Streaming: The processed frames are also encoded as JPEG images and yielded in a format that
can be used for a video stream. This is done in the yield statement.
5. Web Server: The Flask web server serves the video stream to a web page. The @app.route('/video_feed')
decorator indicates that the stream can be accessed at the /video_feed URL of the server.
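The five steps above can be sketched end to end as follows. Names such as gen_frames and the capture source are illustrative rather than taken from the project code, and the OpenCV/MTCNN imports are kept inside the generator so the sketch can be loaded without those heavy dependencies installed:

```python
from flask import Flask, Response

app = Flask(__name__)

def gen_frames():
    # OpenCV and MTCNN are imported here only to keep this sketch
    # importable without the heavy dependencies; in the real code
    # they would sit at the top of the module.
    import cv2
    from mtcnn import MTCNN

    detector = MTCNN()
    cap = cv2.VideoCapture(0)  # 0 = default webcam (illustrative source)
    out = cv2.VideoWriter('output.mp4',
                          cv2.VideoWriter_fourcc(*'mp4v'), 20.0, (640, 480))
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # Step 1: MTCNN expects RGB, OpenCV captures BGR
            frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            faces = detector.detect_faces(frame_rgb)
            # Step 2: draw a green bounding box around each detected face
            for face in faces:
                x, y, w, h = face['box']
                cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
            # Step 3: write the processed frame to the output video file
            out.write(frame)
            # Step 4: encode as JPEG and yield one multipart chunk
            ok, buf = cv2.imencode('.jpg', frame)
            if not ok:
                break
            yield (b'--frame\r\n'
                   b'Content-Type: image/jpeg\r\n\r\n' + buf.tobytes() + b'\r\n')
    finally:
        cap.release()
        out.release()

# Step 5: the Flask route that serves the stream
@app.route('/video_feed')
def video_feed():
    return Response(gen_frames(),
                    mimetype='multipart/x-mixed-replace; boundary=frame')
```

The multipart/x-mixed-replace MIME type tells the browser to replace each received JPEG with the next one, which is what produces the live-video effect.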
CHAPTER 5
DESIGN DOCUMENT
5.1 Purpose
The purpose of this project is to develop a real-time face detection system that operates on
video streams. The system uses the Multi-task Cascaded Convolutional Networks (MTCNN)
model to detect faces in each frame of the video. Detected faces are highlighted with bounding
boxes for visual representation.
The processed frames are then encoded into JPEG format and served as a video stream via a
Flask web server. This allows users to view the face detection results in real-time from any
device with a web browser.
Initially, the project aimed to create a website where users could upload videos and photos for
face detection. However, due to technical challenges, such as issues with importing the dlib
library, the scope was narrowed down to real-time face detection in video streams.
Despite the change in scope, the project still provides valuable insights into the application of
computer vision techniques in real-time scenarios. It also serves as a foundation for future
enhancements, such as face recognition, emotion detection, and user-uploaded content
processing.
5.2 Scope
The scope of this project, as represented by the provided code snippet, is to implement a real-
time face detection system on video streams. The system uses the Multi-task Cascaded
Convolutional Networks (MTCNN) model to detect faces in each frame of the video. Detected
faces are highlighted with bounding boxes for visual representation.
The processed frames are then encoded into JPEG format and served as a video stream via a
Flask web server. This allows users to view the face detection results in real-time from any
device with a web browser.
The system is designed to handle real-time video streams, but it does not currently support
user-uploaded content. The scope of the project does not include face recognition or emotion
detection, although these could be potential areas for future enhancement.
The system is implemented in Python, using libraries such as OpenCV for video processing,
MTCNN for face detection, and Flask for web server setup and video streaming. The code is
designed to be run on a local machine, and it has not been optimized for deployment on a web
server or for handling large-scale traffic.
5.3 Overview
This Python code is part of a Flask web application that performs real-time face detection on
video streams. The application uses OpenCV for video processing and a face detection model
for identifying faces in each frame. The key components of the code are as follows:
Face Detection: The application uses a face detection model to identify faces in each frame of
the video. For each detected face, a rectangle is drawn on the frame for visual representation.
Frame Processing: Each processed frame is written to an output video file. The frame is then
encoded into a JPEG image. If the encoding fails, the processing loop breaks.
Frame Streaming: The encoded frame is yielded in a format suitable for video streaming.
This is a critical part of the Flask server's response to requests for the video feed.
Resource Management: After all frames have been processed, the video capture and video
writer objects are released, freeing up system resources.
Flask Routes: Two Flask routes are defined. The /video_feed route returns the video stream
as a response to requests. The / route is set up to handle both GET and POST requests,
although the handler function for this route is not shown in the provided code.
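The two routes can be sketched concretely as below. Since the index handler is not shown in the original code, the page here is a plausible stand-in, and gen_frames is reduced to a stub in place of the real frame generator:

```python
from flask import Flask, Response, render_template_string

app = Flask(__name__)

def gen_frames():
    # Stub standing in for the real frame generator described above.
    yield (b'--frame\r\n'
           b'Content-Type: image/jpeg\r\n\r\n'
           b'\r\n')

@app.route('/', methods=['GET', 'POST'])
def index():
    # The original handler is not shown; a minimal page that embeds
    # the stream might look like this.
    return render_template_string(
        '<h1>Real-Time Face Detection</h1>'
        '<img src="{{ url_for(\'video_feed\') }}" alt="live stream">')

@app.route('/video_feed')
def video_feed():
    # The browser's <img> tag requests this URL and renders the
    # multipart JPEG stream as live video.
    return Response(gen_frames(),
                    mimetype='multipart/x-mixed-replace; boundary=frame')
```

Embedding the stream through an img tag is what lets any browser display the results without extra client-side software.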
CHAPTER 6
DEVELOPMENT OF THE SYSTEM
The development of the system involves several key steps:
Face Detection: The system uses the Multi-task Cascaded Convolutional Networks (MTCNN)
model to detect faces in video frames. This involves converting the frame to RGB format and
passing it to the detector. The detector returns a list of faces, with each face represented as a
dictionary containing the bounding box and facial landmarks.
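Each entry in that list has the shape shown below. The numeric values here are made up for illustration, but the keys ('box', 'confidence', 'keypoints' and the five landmark names) are what the MTCNN package actually returns:

```python
# One entry from the list returned by detector.detect_faces(frame_rgb);
# the numbers are illustrative, the keys are MTCNN's real output keys.
face = {
    'box': [277, 90, 48, 63],          # x, y, width, height
    'confidence': 0.99,
    'keypoints': {
        'left_eye': (291, 117),
        'right_eye': (314, 114),
        'nose': (303, 131),
        'mouth_left': (296, 143),
        'mouth_right': (313, 141),
    },
}
x, y, w, h = face['box']               # unpack the bounding box
```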
Drawing Bounding Boxes: For each detected face, the system extracts the bounding box
coordinates and uses them to draw a rectangle around the face on the frame. This is done
using the cv2.rectangle() function, which takes the frame, the top-left and bottom-right
coordinates of the rectangle, the color of the rectangle, and the thickness of the lines.
Video Writing: The processed frames, with the bounding boxes drawn, are written to an output video file with
out.write(frame). This allows the face detection results to be saved for later viewing.
Frame Encoding: The processed frames are also encoded as JPEG images using the cv2.imencode('.jpg', frame)
function. This converts the frames into a format that can be used for a video stream.
Video Streaming: The encoded frames are yielded in a format that can be used for a video stream. This is done using
a yield statement, which produces a sequence of frames that can be sent to the client one at a time.
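The per-frame wire format can be isolated into a small helper. The helper name is ours, but the byte layout follows the multipart/x-mixed-replace convention with a boundary named "frame", matching the mimetype declared in the Flask response:

```python
def mjpeg_chunk(jpeg_bytes: bytes) -> bytes:
    """Wrap one JPEG-encoded frame for a multipart/x-mixed-replace
    stream with boundary 'frame', as yielded to the Flask Response."""
    return (b'--frame\r\n'
            b'Content-Type: image/jpeg\r\n\r\n'
            + jpeg_bytes +
            b'\r\n')

chunk = mjpeg_chunk(b'\xff\xd8...jpeg bytes...')
```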
Web Server Setup: The Flask web server is set up to serve the video stream to a web page. This involves defining
routes for the video feed and the home page, and returning appropriate responses for each route.
The development of the system involves implementing each of these steps, testing them to ensure they work
correctly, and then integrating them into the complete system.
CHAPTER 7
CONCLUSION
In conclusion, the system developed successfully implements real-time face detection in
video streams. It uses the Multi-task Cascaded Convolutional Networks (MTCNN) model to
detect faces in each frame, draws bounding boxes around the detected faces, and writes the
processed frames to an output video file.
The system also encodes the processed frames as JPEG images and yields them in a format
that can be used for a video stream. This stream is served to a web page via a Flask web
server, allowing the face detection results to be accessed from any device with a web browser.
The system demonstrates the effective use of computer vision techniques for face detection
and the use of web technologies for real-time video streaming. It has potential applications in
various fields, including surveillance, live event monitoring, and interactive installations.
Future work could involve improving the face detection accuracy, optimizing the system for
better performance, and adding additional features, such as face recognition or emotion
detection.
CHAPTER 8
FUTURE SCOPE
The current system provides a solid foundation for real-time face detection in video streams. However, there are
several directions in which this project could be expanded in the future:
Face Recognition: The system could be extended to not just detect faces, but also recognize them. This would
involve training a model on a dataset of known faces and then using this model to identify people in the video
stream.
Emotion Detection: Another possible extension is emotion detection, where the system could analyze the facial
expressions of the people in the video stream and infer their emotional state.
Object Detection: The system could be generalized to detect other types of objects in addition to faces. This would
involve using a more general object detection model, such as YOLO or SSD.
Performance Optimization: There may be opportunities to optimize the system for better performance. This could
involve using a more efficient face detection model, optimizing the video processing pipeline, or using hardware
acceleration.
Integration with Other Systems: The system could be integrated with other systems to create more complex
applications. For example, it could be integrated with a security system to provide real-time alerts when unknown
faces are detected.
User Interface Improvements: The web interface could be improved to provide more information about the
detected faces and to allow users to interact with the system in more ways.
These are just a few possibilities. The future scope of this project is vast, given the wide range of applications of
face detection and video processing technologies.
CHAPTER 9
REFERENCES
OpenCV: a library of programming functions aimed mainly at real-time computer
vision. See the official OpenCV documentation.
Flask: a micro web framework written in Python. See the official Flask
documentation.
Video Streaming with Flask: a tutorial on implementing video streaming with
Flask.