FACE MASK DETECTION USING PYTHON
1 – Introduction
A new strain of virus, known as the novel coronavirus (nCoV), was identified in humans and had never previously been observed in them. Coronaviruses (CoV) are a large family of viruses that cause illnesses ranging from the common cold to infections such as Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS). The first coronavirus patient was identified in December 2019. The habit of wearing face masks when stepping out has grown because of the COVID-19 epidemic; before COVID-19, people mainly wore masks to protect themselves from air pollution. Scientists have concluded that wearing face masks helps reduce COVID-19 transmission. In 2020, the rapid spread of the virus led the World Health Organization to declare COVID-19 a global pandemic. The virus spreads through close human contact and in crowded places. Among the recommended preventive measures, cleaning hands, maintaining a safe distance, wearing a mask, and refraining from touching the eyes, nose, and mouth are the main ones, and wearing a mask is the simplest.
Unfortunately, many people do not follow these rules properly, which accelerates the spread of the virus. One possible solution is to detect people who are not wearing masks and inform the relevant authorities.
Face mask detection is a technique for determining whether a person is wearing a mask or not. Deep learning techniques are widely used in medical applications because they allow researchers to study and evaluate large quantities of data, and deep learning models have played a major role in object detection. Such models and architectures can be used to detect a mask on a face.
Here we introduce a face mask detection model based on computer vision and deep learning. The proposed model can be integrated with computer or laptop cameras, allowing it to detect people who are wearing masks and people who are not. The model is built using deep learning and classical machine learning techniques with OpenCV, TensorFlow, and Keras. We also compare three machine learning algorithms to find the one that yields the highest accuracy.
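As a rough illustration of how such a system can be wired to a webcam, the following is a minimal sketch, assuming a trained Keras classifier saved as mask_detector.h5 (the file name, the 224x224 input size, and the two-class [mask, no mask] output are assumptions of this sketch) together with OpenCV's bundled Haar cascade face detector; the complete implementation is left to the Source Code section.

# Minimal webcam mask-detection sketch; model file, input size, and class order are assumptions.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

model = load_model("mask_detector.h5")  # hypothetical trained classifier
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)  # default laptop/desktop camera
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.1, 5):
        face = cv2.resize(frame[y:y + h, x:x + w], (224, 224))
        face = np.expand_dims(face.astype("float32") / 255.0, axis=0)
        mask_prob, no_mask_prob = model.predict(face, verbose=0)[0]  # assumed 2-class output
        label = "Mask" if mask_prob > no_mask_prob else "No Mask"
        color = (0, 255, 0) if label == "Mask" else (0, 0, 255)
        cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
        cv2.putText(frame, label, (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, color, 2)
    cv2.imshow("Face Mask Detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()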
2 – Flowchart
3 – Methodology
Data Collection and Preparation:
Data Source: Obtain the employee survey dataset from the HR department or another reliable source. Ensure that the dataset contains demographic information, job roles, performance ratings, tenure, and survey results on career satisfaction, growth possibilities, and so on.
Data Cleaning: Clean the dataset by removing missing values, outliers, and inconsistencies. Standardize formats and maintain data integrity.
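A minimal cleaning sketch with pandas is shown below; the file name employee_survey.csv and the column names are assumptions used only for illustration.

# Data-cleaning sketch; file and column names are hypothetical.
import pandas as pd

df = pd.read_csv("employee_survey.csv")          # hypothetical survey export
df = df.drop_duplicates()                        # remove duplicate responses
df = df.dropna(subset=["job_satisfaction"])      # drop rows missing the key rating
df["department"] = df["department"].str.strip().str.title()  # standardize text formats
# Clip an ordinal rating to its valid range to handle obvious outliers/typos.
df["job_satisfaction"] = df["job_satisfaction"].clip(lower=1, upper=5)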
Exploratory Data Analysis (EDA): Conduct a preliminary exploratory analysis to determine the distribution of variables, look for correlations, and visualize relevant trends in employee demographics and survey results.
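A short EDA sketch along the same lines, again with a hypothetical file and hypothetical column names:

# Exploratory data analysis sketch; file and column names are assumptions.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("employee_survey.csv")
print(df["department"].value_counts())                       # distribution of a categorical variable
print(df[["tenure_years", "job_satisfaction"]].describe())   # summary statistics
df.groupby("department")["job_satisfaction"].mean().plot(kind="bar")  # simple trend plot
plt.tight_layout()
plt.show()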
Statistical Analysis and Hypothesis Test:
Descriptive Statistics: Use summary statistics (mean, median, and standard deviation) to
summarize key variables such as work satisfaction, career growth perceptions, and so on.
Correlation Analysis: Look for correlations between variables (such as job satisfaction and tenure) to uncover possible relationships.
Hypothesis Testing: Use statistical tests (such as t-tests and ANOVA) to compare survey results across demographic categories (e.g., gender and department). A combined sketch of these three steps is given below.
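The following sketch covers the descriptive statistics, correlation, and hypothesis-testing steps using pandas and SciPy; the CSV file name, column names, and group labels are assumptions for illustration.

# Statistical analysis sketch; file, columns, and group labels are hypothetical.
import pandas as pd
from scipy import stats

df = pd.read_csv("employee_survey.csv")

# Descriptive statistics for key variables
print(df[["job_satisfaction", "growth_perception", "tenure_years"]].describe())

# Correlation between satisfaction and tenure
print(df["job_satisfaction"].corr(df["tenure_years"]))

# t-test: compare satisfaction between two groups (e.g., gender)
male = df.loc[df["gender"] == "Male", "job_satisfaction"]
female = df.loc[df["gender"] == "Female", "job_satisfaction"]
print(stats.ttest_ind(male, female, equal_var=False))

# One-way ANOVA across departments
groups = [g["job_satisfaction"].values for _, g in df.groupby("department")]
print(stats.f_oneway(*groups))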
Interpretation of Results:
Model Interpretation: Analyze model predictions and feature importance to better understand the aspects that influence career satisfaction, growth perceptions, and so on.
Business Insights: Turn model results into practical insights for HR decision-makers. Identify critical areas for improving employee satisfaction, retention strategies, and career development activities.
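One common way to obtain the feature importances mentioned above is a tree ensemble; the sketch below uses scikit-learn's RandomForestClassifier, where the feature columns, the target column, and the simple "satisfied vs. not satisfied" threshold are all assumptions.

# Feature-importance sketch with a random forest; columns and target are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv("employee_survey.csv")
features = ["age", "tenure_years", "growth_perception", "training_hours"]
X = df[features]
y = (df["job_satisfaction"] >= 4).astype(int)    # 1 = satisfied, 0 = not satisfied

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
for name, score in sorted(zip(features, model.feature_importances_),
                          key=lambda p: p[1], reverse=True):
    print(f"{name}: {score:.3f}")                # higher score = larger influence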
4 – Python
Python is a high-level, object-oriented, interpreted programming language. It was created by Guido van Rossum in the late 1980s and first released in 1991.
Python's syntax is close to the English language, allowing developers to write programs in fewer lines than many other programming languages. Because Python is interpreted, code can be executed as soon as it is written, so prototyping can be done quickly.
Characteristics of Python:
The important characteristics of Python programming are:
• Python is a dynamic, high-level, free open source and interpreted programming language.
• It supports object-oriented as well as procedure-oriented programming.
• It can be used as a scripting language or can be compiled to byte-code for building large
applications.
• It provides very high-level dynamic data types and supports dynamic type checking.
• It supports automatic garbage collection.
• It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.
Use of NumPy:
NumPy stands for 'Numerical Python'. It is a Python package that provides multidimensional array objects as well as routines for array processing.
Jim Hugunin created Numeric, the predecessor to NumPy. Another package, Numarray, was also created with some additional features. Travis Oliphant created NumPy in 2005 by merging Numarray's features into Numeric, and the open-source project now has numerous contributors.
Operations with NumPy: NumPy allows developers to perform logical and mathematical operations on arrays, as well as Fourier transforms and shape manipulation. It also has built-in functions for linear algebra and random number generation.
Simple program to create a matrix:
First we import the NumPy package, then we pass a list to a NumPy function to create a matrix. Many more operations can be performed with NumPy, such as taking the sine of a given value or printing a zero matrix, and an image can also be represented as an array.
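A small sketch of these operations is shown below; the image-loading line is commented out and assumes OpenCV with a hypothetical file name.

# Basic NumPy examples matching the description above.
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]])     # create a 2x3 matrix from a nested list
print(m)

print(np.sin(np.pi / 2))                  # sine of a given value
print(np.zeros((3, 3)))                   # a 3x3 zero matrix

print(np.linalg.det(np.array([[1, 2], [3, 4]])))  # built-in linear algebra (determinant)
print(np.random.rand(2, 2))                        # built-in random number generation

# An image can also be handled as a NumPy array, e.g. with OpenCV
# (the file name is hypothetical):
# img = cv2.imread("face.jpg")            # img would be an H x W x 3 array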
Use of Pandas:
Pandas is an open-source, BSD-licensed Python library that provides high-performance, easy-to-use data structures and data analysis tools. Python with Pandas is used in a variety of sectors, both academic and commercial, such as finance, economics, statistics, and analytics.
The name Pandas is derived from 'panel data', an econometrics term for multidimensional data.
Key Features of Pandas:
• A fast and efficient Data Frame object with default and configurable indexing.
• Tools for loading data into in-memory data objects from various file formats.
• Data alignment and seamless handling of missing data.
• Reshaping and pivoting data sets.
• Label-based slicing, indexing, and subsetting of big datasets.
• Columns in a data structure can be deleted or added.
• Group data for aggregation and transformation.
Pandas works with the following three data structures:
1 – Series
2 – DataFrame
3 – Panel
These data structures are built on top of NumPy arrays, which makes them fast.
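A short sketch of the first two structures is given below; the example values are hypothetical. Note that the Panel structure has been deprecated and removed in recent pandas versions, so Series and DataFrame are the ones used in practice.

# Series and DataFrame basics; the values are hypothetical.
import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "c"])  # 1-D labelled array
df = pd.DataFrame({
    "employee_id": [1, 2, 3],
    "department": ["HR", "IT", "Sales"],
    "tenure_years": [2, 5, 1],
})                                                   # 2-D labelled table

print(s)
print(df.head())
print(df["tenure_years"].mean())                     # column-wise computation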
Use of Plotly:
Plotly is another useful Python module for building interactive, publication-quality visuals. It supports a wide range of chart types and customization options, making it well suited to interactive dashboards, web apps, and presentations. Some of its main uses and features are listed below.
Plotly is an open-source Python module for data visualization that supports a variety of graph types such as line charts, scatter plots, bar charts, histograms, and area plots. It creates interactive graphs that can be embedded in websites and offers a wide range of advanced plotting options; a minimal usage sketch follows the feature list below.
1- Interactive Visualizations: Plotly specializes in creating interactive visualizations that allow users to explore data points, zoom in on specific areas, and reveal extra information on hover.
2- Scatter Plots: Plotly supports scatter plot customization with markers, colors, sizes, and tooltips for each data point.
3- Bar Charts: Plotly allows you to build interactive bar charts with options for stacked, grouped, and horizontal bars, as well as color and annotation customization.
4- Heatmaps: Plotly can create interactive heatmaps to visualize matrix or categorical
data.
5- Dashboard and Layout Customization: Plotly lets you customize dashboard layouts
with many subplots, annotations, and responsive designs.
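As a minimal usage sketch, the following builds an interactive scatter plot with plotly.express; the small DataFrame is hypothetical and only illustrates the API.

# Interactive scatter plot with Plotly Express; the data is hypothetical.
import pandas as pd
import plotly.express as px

df = pd.DataFrame({
    "tenure_years": [1, 3, 5, 7, 9],
    "job_satisfaction": [3, 4, 4, 5, 4],
    "department": ["HR", "IT", "IT", "Sales", "HR"],
})
fig = px.scatter(df, x="tenure_years", y="job_satisfaction",
                 color="department", hover_data=["department"],
                 title="Job satisfaction vs. tenure")
fig.show()   # opens an interactive chart with zoom and hover tooltips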
5 – Dataset
Creating a dataset for analyzing employee career surveys involves multiple steps. Here is an organized approach to generating such a dataset:
1. Define the variables and survey questions.
Employee Information: ID (anonymized if required), age, gender, department/division, and job title/position.
Career and Work Experience: years of experience, previous companies, previous positions, and education level.
Career Development: training and development opportunities, career growth within the firm, and career objectives (short and long term).
Feedback and Suggestions: ideas for improvement, feedback on company culture, and suggestions for career development programs.
2. Data Collection.
Conduct the survey using a platform that supports structured data export (e.g., Google Forms, SurveyMonkey). Maintain anonymity and confidentiality according to corporate policies.
3. Data Preparation.
Clean the data to eliminate inconsistencies and inaccuracies. Encode categorical variables correctly (for example, gender and department), as shown in the sketch after this list.
4. Data Structure.
Divide the data into columns (variables) and rows (individual survey replies).
Ensure that each variable has a clear definition and data type.
5. Data Analysis.
Analyze the dataset with statistical methods and visualizations. Look for trends and relationships between variables, such as job satisfaction versus age or tenure.
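A minimal sketch of the encoding mentioned in step 3, assuming the hypothetical employee_survey.csv file and column names used earlier:

# Encoding categorical variables; file and column names are hypothetical.
import pandas as pd

df = pd.read_csv("employee_survey.csv")
df["gender_code"] = df["gender"].map({"Male": 0, "Female": 1})   # label encoding
df = pd.get_dummies(df, columns=["department"], prefix="dept")   # one-hot encoding
print(df.head())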
Example Dataset Format (CSV)
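As an illustration, a CSV with the variables defined above might be laid out as in the sketch below; every column name and value here is hypothetical and serves only to show the format.

# Building and exporting an example CSV layout; all values are made up for illustration.
import pandas as pd

example = pd.DataFrame({
    "employee_id": [101, 102],
    "age": [29, 41],
    "gender": ["Female", "Male"],
    "department": ["IT", "HR"],
    "tenure_years": [3, 12],
    "job_satisfaction": [4, 3],
    "growth_perception": [5, 2],
})
example.to_csv("employee_survey_example.csv", index=False)
print(example.to_csv(index=False))   # shows the header row and comma-separated values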
6 – Source Code