Final Documentation
Ankita Ghosh
Registration No. – 211-1211-0406-21 Roll No. – 213211-11-0021
Labanya Thakur
Registration No. – 211-1211-1075-21 Roll No. – 213211-11-0039
Department of Computer Science
Maharaja Manindra Chandra College
DECLARATION
(Ankita Ghosh)
Date –
Place - (Labanya Thakur)
CERTIFICATE
Supervisor
Date: Mrs. Monali Poddar
Assistant Professor
Department of Computer science
Maharaja Manindra Chandra College
ACKNOWLEDGEMENT
At the outset, we would like to acknowledge that this period has been one of
intensive learning, on both a professional and a personal level.
First of all, we would like to thank Mrs. Monali Poddar, Assistant
Professor, Department of Computer Science, Maharaja Manindra Chandra
College, for helping us make this internal project a success.
We would also like to thank her for her continuous support throughout
the project tenure as our supervisor. She was always there to guide and
help us whenever we had any problem relating to our project, and she
constantly steered us in the right direction.
Further, we would like to express our sincere gratitude to our
Teacher-in-Charge, Dr. Amita Mazumdar, for her support, blessings and good
wishes. We also express our heartfelt gratitude to the other faculty members
of the Department of Computer Science, Maharaja Manindra Chandra College,
for encouraging us in this project work.
Finally, we would like to thank our families for constantly supporting us
and for always being a source of strength in making this project what it is
today.
ABSTRACT
In today's fast-paced world, mental health assessment and management
are increasingly vital. This project presents a software platform designed
for individuals to assess their mental well-being through structured
questionnaires. Users begin by signing up with basic demographic details:
name, age, and gender. Upon authentication, users complete two standard
assessments: the Generalized Anxiety Disorder 7-item scale (GAD-7) and
the Patient Health Questionnaire 9-item scale (PHQ-9). These widely used
questionnaires gauge the severity of anxiety and depression symptoms,
respectively. The collected responses undergo evaluation using the
Random Forest machine learning algorithm. This advanced analytical
approach is employed to generate personalized scores based on the
questionnaire answers. The scores reflect the severity levels of anxiety and
depression symptoms, providing users with actionable insights into their
mental health status. Privacy and confidentiality are prioritized throughout
the platform's design and implementation. Stringent data protection
measures ensure that user information remains secure and anonymous. By
leveraging machine learning, this project aims to enhance mental health
awareness and empower individuals to monitor their well-being
proactively. The platform's results are intended to guide users towards
appropriate interventions, whether through self-care strategies or
professional assistance, thereby promoting overall mental well-being in the
community. Future plans include expanding the platform to incorporate a
broader range of psychometric tests, catering to diverse mental health
needs globally. We also plan to make the project web-based, as this would
provide easier access to people all over the globe.
List of Figures
List of Abbreviations
Table of Contents
INTRODUCTION
Mental health is integral to daily life as it influences emotional stability,
cognitive functioning, and interpersonal relationships. It directly impacts one's
ability to manage stress, make decisions, and maintain productive routines. Good
mental health contributes to overall well-being, enhancing resilience, and
enabling individuals to navigate challenges effectively. It fosters positive social
interactions, supports physical health, and plays a crucial role in achieving
personal fulfilment and satisfaction. Prioritizing mental health promotes a
balanced and fulfilling life, ensuring individuals can thrive in their personal and
professional pursuits.
In the realm of mental health assessment, effective tools and methodologies play
a crucial role in understanding and managing individuals' well-being. This
project introduces a robust system designed to streamline the evaluation process
through structured questionnaires and advanced analytics; in its current form it is
not web-based. Users begin by signing up on the platform with personal details
such as name, age, and gender, ensuring personalized interactions tailored to
their demographics. Subsequently, users log in with their username and
password and are guided through standardized assessments: the Generalized Anxiety
Disorder 7-item scale (GAD-7) and the Patient Health Questionnaire 9-item
scale (PHQ-9). The PHQ-9 is a brief, self-administered questionnaire consisting
of nine items that correspond to the diagnostic criteria for major depressive
disorder in the DSM-IV (Diagnostic and Statistical Manual of Mental Disorders,
fourth edition). It assesses the frequency and severity of depressive symptoms
over the past two weeks. Each item is scored on a scale from 0 to 3, and the total
score ranges from 0 to 27. Higher scores indicate more severe depressive
symptoms. The PHQ-9 is used in clinical practice and research to screen,
diagnose, monitor, and measure the severity of depression. It helps healthcare
professionals make informed decisions about treatment options and
interventions. The GAD-7 is a brief, seven-item questionnaire designed to assess
the severity of generalized anxiety disorder (GAD) symptoms over the past two
weeks. It evaluates common symptoms such as feeling nervous, anxious, or on
edge, as well as physical symptoms associated with anxiety. Each item is scored
on a scale from 0 to 3, and the total score ranges from 0 to 21. Similar to the
PHQ-9, higher scores indicate more severe symptoms. The GAD-7 is useful in
clinical settings to screen for GAD, monitor symptom severity over time, and
evaluate the effectiveness of treatments. These assessments are pivotal in
gauging anxiety and depression symptoms, providing a quantitative basis for
evaluating mental health status. The heart of the project lies in employing the
Random Forest machine learning algorithm to process the questionnaire responses. Random Forest is a
versatile and powerful machine learning algorithm known for its effectiveness in
both classification and regression tasks. It operates by constructing multiple
decision trees during training, where each tree is trained on a random subset of
the data and a random subset of features. This randomness helps to ensure
diversity among the trees and reduces the risk of overfitting. The algorithm
aggregates predictions from individual trees to make final predictions, typically
using the mode for classification or the mean for regression. Random Forests are
robust against noise and outliers, handle high-dimensional data well, and provide
insights into feature importance, aiding in understanding the underlying
relationships in the data. These qualities make Random Forests widely used
across various domains for tasks requiring accurate and interpretable predictions.
We trained the model on a dataset obtained from Kaggle.com and provided the
necessary functions for evaluating user responses from the questionnaires. This
sophisticated analytical approach enables the generation of personalized scores
that reflect the severity and nature of symptoms reported by the user.
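The aggregation idea behind Random Forest described above can be illustrated with a short sketch. It is not the project's code: the data is synthetic (generated with scikit-learn's `make_classification`), the number of trees is arbitrary, and the forest is assembled by hand from `DecisionTreeClassifier` objects so that the bootstrap sampling and the majority vote (the mode) are visible. Note that scikit-learn's own `RandomForestClassifier` averages class probabilities rather than taking a hard vote.

```python
from collections import Counter

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption); the project's dataset is not reproduced here.
X, y = make_classification(n_samples=300, n_features=6, n_informative=4, random_state=0)

rng = np.random.default_rng(0)
trees = []
for i in range(25):
    # Bootstrap sample: draw row indices with replacement.
    rows = rng.integers(0, len(X), size=len(X))
    # max_features="sqrt" makes every split consider a random subset of features.
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=i)
    trees.append(tree.fit(X[rows], y[rows]))


def forest_predict(samples):
    """Return, for each sample, the mode of the individual tree predictions."""
    votes = np.array([tree.predict(samples) for tree in trees])  # (n_trees, n_samples)
    return np.array([Counter(column).most_common(1)[0][0] for column in votes.T])


predictions = forest_predict(X)
training_accuracy = float((predictions == y).mean())
print(f"Training accuracy of the hand-built forest: {training_accuracy:.2f}")
```

Using an odd number of trees avoids tied votes in this two-class example.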
Looking ahead, the project aims to expand its range by incorporating additional
validated psychometric tests, thereby broadening its applicability to various
mental health conditions and enhancing diagnostic accuracy. Future
developments will focus on integrating tools such as the Beck Depression
Inventory (BDI) and the Social Phobia Inventory (SPIN), among others, to
provide a more comprehensive assessment framework. By leveraging advanced
analytics and validated assessments, this project endeavours to empower
individuals to monitor their mental health proactively, facilitating early
intervention and informed decision-making in seeking appropriate support or
treatment.
[Figure: project workflow diagram with the stages Analysis, Dataset, Machine Learning Model (Random Forest), Training, Production, and Result]
LITERATURE REVIEW
Title: Predicting the Utilization of Mental Health Treatment with Various
Machine Learning Algorithms
Author: MEERA SHARMA, SONOK MAHAPATRA, ADEETHYIA SHANKAR
Year: 2020
In 2017, about 792 million people (more than 10% of the global population) lived
their lives with a mental disorder [24] – 78 million of which committed suicide
because of it. In these unprecedented times of COVID-19, mental health
challenges have been even further exacerbated as home environments have been
proven to be major sources of the creation and worsening of poor mental health.
Additionally, proper diagnosis and treatment for people with mental health
disorders remains underdeveloped in modern-day society due to the
ever-present public stigma attached to caring about mental health. Recently there
have been attempts in the data science world to predict if a person is suicidal (and
other diagnostic approaches) yet all face major setbacks. To begin, big data has
many ethical issues related to privacy and reusability without permission—
especially in regards to using feeds from social media. Additionally, people
diagnosed with specific mental health conditions may not actually seek
treatment, so data may be incorrect. In this research, we address both of these
problems by using anonymous datasets to predict the answer to a different
question—whether or not people are seeking mental health treatment. We also
use a large variety of machine learning and deep learning classifiers and
predictive models to predict with a high accuracy rate through statistical analysis.
From our research, we were able to conclude that machine learning can be used
to predict likelihood of individuals seeking treatment with a high degree of
accuracy (76.3% - 82.5%) by utilizing a self-reported questionnaire. Similarly,
through a simple questionnaire that asks enough questions relevant to mental
health, machine learning should also be able to determine if the person requires
treatment. Despite stigma surrounding mental illness, individuals would be able
to utilize machine learning to determine the correct course of action for their
mental illness. As a result, these individuals would be more productive, reducing
social and economic costs at the tech workplace.
Title: Prediction of Mental Disorder for employees in IT Industry
Author: Sandhya P, Mahek Kantesaria
Year: 2019
Mental health is nowadays a topic which is most frequently discussed when it
comes to research but least frequently discussed when it comes to personal
life. The wellbeing of a person is the measure of mental health. The increasing
use of technology leads to a lifestyle with less physical work. Also, the constant
pressure on an employee in any industry makes them more vulnerable to mental
disorders. These vulnerabilities include peer pressure, anxiety attacks,
depression, and many more. Here we have taken a dataset of questionnaire
responses collected from IT industry employees. Based on their answers the result
is derived. The output is whether or not the person needs attention. Different
machine learning techniques are used to get the results. This prediction also tells
us that it is very important for an IT employee to get regular mental health
check-ups to track their health. Employers should provide a medical service
in their company and should also give benefits to the affected
employees. There are many suggestions that employers and employees could keep
in mind. Employers need to keep track of the number of their employees having
a mental disorder. Employers should allow a flexible work environment with flexible
work scheduling and break timings. They should allow employees to work from
home or have a flexible place of work.
reviewed 28 studies of AI and mental health that used electronic health records
(EHRs), mood rating scales, brain imaging data, novel monitoring systems (e.g.,
smartphone, video), and social media platforms to predict, classify, or subgroup
mental health illnesses including depression, schizophrenia or other psychiatric
illnesses, and suicide ideation and attempts. Collectively, these studies revealed
high accuracies and provided excellent examples of AI’s potential in mental
healthcare, but most should be considered early proof-of-concept works
demonstrating the potential of using machine learning (ML) algorithms to
address mental health questions, and which types of algorithms yield the best
performance. Summary: As AI techniques continue to be refined and improved, it
will be possible to help mental health practitioners re-define mental illnesses
more objectively than currently done in the DSM-5, identify these illnesses at an
earlier or prodromal stage when interventions may be more effective, and
personalize treatments based on an individual’s unique characteristics. However,
caution is necessary in order to avoid overinterpreting preliminary results, and
more work is required to bridge the gap between AI in mental health research and
clinical care.
OBJECTIVE
The objective of this project is to develop a comprehensive system for assessing
and analyzing mental health using standardized questionnaires, namely the
Generalized Anxiety Disorder 7-item scale (GAD-7) and the Patient Health
Questionnaire 9-item scale (PHQ-9). The system aims to provide users with
personalized evaluations of their anxiety and depression symptoms based on their
questionnaire responses. Utilizing the Random Forest machine learning
algorithm, the project seeks to enhance the accuracy and reliability of these
assessments. Ultimately, the goal is to empower individuals to monitor their
mental well-being proactively, guiding them towards appropriate interventions or
support as needed. The project also aims to maintain user privacy and
confidentiality throughout the process, ensuring secure handling of sensitive
information.
METHODOLOGY
[Figure: methodology pipeline ending in Model Evaluation, Model Validation, and Deployment]
Problem definition: This is the very first step; it involves identifying and
understanding the problem. This step sets the stage for the entire project and
ensures that the subsequent steps are aligned with the identified problem. In our
case, we had to understand the impact of anxiety and depression, mainly on young
adults. We also had to learn about the various standardized tests and their scoring
schemes.
In summary, the problem definition phase involved a deep dive into the impact of
anxiety and depression on young adults. By comprehensively understanding the
nature of these conditions and the tools available for their assessment, we
established a solid groundwork for our project. This phase ensures that all
subsequent steps are directed towards addressing the identified issues, ultimately
contributing to the well-being of young adults.
Data Collection: This step involves searching for and collecting relevant datasets,
which are later used for training and testing the implemented model. In this
step, raw and unprocessed data is collected from various websites. We had to
look through many websites, as very few datasets of mental health surveys
relevant to our specific tests were available.
Overall, the data collection step was a meticulous and iterative process, involving
thorough research, evaluation, and collaboration. The resulting datasets formed
the foundation for the subsequent steps of pre-processing, training, and testing
the mental health model, ultimately contributing to the development of a robust
and reliable tool for mental health analysis.
Data Pre-processing: Data pre-processing deals with data cleaning, data
transformation and data splitting. Data cleaning refers to the step in
which we deal with missing or unexpected values (such as NaN values). Data
transformation refers to transforming and manipulating data according to
requirements, for example normalizing or standardizing the data, handling categorical
variables, and creating new features if necessary. Data splitting involves dividing
the processed dataset into a training set and a testing set. We replaced missing
values with the median value and encoded categorical columns (like gender) into
Boolean values so that the models can use them. We split the processed data in the ratio
80:20 (train:test).
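The three pre-processing steps described above (median imputation, Boolean encoding of gender, and the 80:20 split) could look roughly like the following sketch. The toy records and the column names `age`, `gender`, `gad_score` and `anxiety_level` are assumptions made for illustration; they are not the schema of the Kaggle dataset used in the project.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy records; column names are assumptions, not the project's schema.
df = pd.DataFrame({
    "age": [19, 22, np.nan, 25, 21, 23, 20, np.nan, 24, 22],
    "gender": ["female", "male", "female", "male", "female",
               "male", "female", "male", "female", "male"],
    "gad_score": [3, 12, 7, np.nan, 18, 5, 9, 14, 2, 11],
    "anxiety_level": ["minimal", "moderate", "mild", "mild", "severe",
                      "mild", "mild", "moderate", "minimal", "moderate"],
})

# Data cleaning: replace missing numeric values with the column median.
numeric_cols = ["age", "gad_score"]
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

# Data transformation: encode the categorical gender column as a Boolean.
df["is_male"] = df["gender"].eq("male")
df = df.drop(columns="gender")

# Data splitting: 80% of the rows for training, 20% for testing.
X = df.drop(columns="anxiety_level")
y = df["anxiety_level"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(len(X_train), len(X_test))
```

Fixing `random_state` keeps the split reproducible, which makes later accuracy comparisons between models easier to repeat.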
Feature Engineering: Feature engineering involves selecting features, encoding
categorical variables, scaling features, and creating new features or modifying
existing ones to improve the performance of the model. We selected relevant
features such as GAD score, PHQ score, age and gender, and encoded the gender
column for easy computation.
In conclusion, feature engineering is a crucial step in the data pre-processing
phase of machine learning. It involves transforming raw data into meaningful
features that better represent the underlying problem to predictive models,
resulting in improved model performance.
Model Selection: Choosing the appropriate machine learning algorithm(s) for
the task by comparing different algorithms on performance metrics,
considering linear models, decision trees, SVMs, etc. We implemented various
models such as linear regression, SVM and random forest, tested their accuracy,
and chose the model producing the highest accuracy (in our case, random forest).
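The comparison described above can be sketched as a loop that fits each candidate on the same training split and keeps the one with the highest test accuracy. The data below is synthetic and the three candidates (logistic regression, SVM, random forest) are an illustrative choice with default-like settings; the models and settings actually compared in the project may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the questionnaire dataset (an assumption for illustration).
X, y = make_classification(n_samples=500, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit every candidate on the same training split and score it on the same test split.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

best_name = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("Selected:", best_name)
```

Which model wins depends on the data, so the selection printed here says nothing about the project's own results.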
Model Training: Training the selected model(s) using the prepared data:
splitting the data into training and validation sets, fitting the model to the training
data, and tuning hyperparameters. We split the data into training and test sets in the
ratio 80:20.
Model Evaluation: Assessing the performance of the trained model(s) using
validation data, calculating metrics such as accuracy, precision, recall and F1-score,
and using cross-validation. Linear regression and
logistic regression gave low accuracy, whereas SVM and random
forest gave higher accuracy. Random forest achieved an accuracy
score of 1.0 for anxiety and 0.99 for depression.
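A minimal sketch of how the metrics named above can be computed with scikit-learn is given below. It uses synthetic data and macro-averaged precision, recall and F1; both are assumptions, and the numbers it prints are unrelated to the accuracy scores reported for the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data (an assumption); replace with the prepared dataset.
X, y = make_classification(n_samples=400, n_features=4, n_informative=3,
                           n_redundant=1, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=1)
model.fit(X_train, y_train)
y_pred = model.predict(X_val)

# Hold-out metrics on the validation split.
accuracy = accuracy_score(y_val, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_val, y_pred, average="macro", zero_division=0
)

# 5-fold cross-validation on the training split as a second estimate.
cv_scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=1),
    X_train, y_train, cv=5, scoring="accuracy",
)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
print(f"cross-validation accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```

Reporting the cross-validation spread alongside the single hold-out score helps to show whether a very high accuracy is stable across different splits.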
Validate the Model: Evaluation of the final model on the test set to get an
unbiased estimate of its performance. Report relevant performance metrics and
compare them to the baseline. Validating the model involves a thorough
evaluation on a test set, reporting comprehensive performance metrics, and
comparing these metrics to a baseline model. This process ensures that the final
model is reliable, robust, and ready for deployment.
Deploy the Model: Deploying the model into a production environment (e.g., a
web service or mobile app), continuously monitoring the model's performance in
the real world, and collecting new data for re-training if necessary.
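One simple way to move a trained model from the training script into an application is to save it to a file and load it again at prediction time. The sketch below uses Python's standard `pickle` module for this; the report does not state how the project's models are stored, so the mechanism and the file name `anxiety_model.pkl` are assumptions. Pickle files should only be loaded when they come from a trusted source.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Train a small model on synthetic data (an assumption for illustration).
X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                           n_redundant=1, random_state=2)
model = RandomForestClassifier(n_estimators=50, random_state=2).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "anxiety_model.pkl"  # hypothetical file name

    # Training side: serialise the fitted model to disk.
    with model_path.open("wb") as fh:
        pickle.dump(model, fh)

    # Application side: load the model back before serving predictions.
    with model_path.open("rb") as fh:
        deployed_model = pickle.load(fh)

# The reloaded model should reproduce the original model's predictions.
same_predictions = bool((deployed_model.predict(X) == model.predict(X)).all())
print("Reloaded model matches original:", same_predictions)
```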
PROJECT REQUIREMENT
Requirements are basic constraints that are required to develop a system.
Requirements are gathered while designing the system. The following are the
requirements that are to be discussed:
1. Functional Requirements
2. Non-Functional Requirements
3. Technical Requirements
a) Software Requirements
b) Hardware Requirements
FUNCTIONAL REQUIREMENTS: The software requirements specification
is a technical specification of requirements for the software product. It is the first
step in the requirements analysis process. It lists the requirements of a particular
software system. This project relies on specialised libraries such as
scikit-learn, pandas, matplotlib and seaborn.
NON-FUNCTIONAL REQUIREMENTS:
Process of functional steps:
I. Defining the problem
II. Preparing data
III. Evaluating algorithms
IV. Improving results
V. Predicting the result
Technical Requirements:
a) SOFTWARE REQUIREMENTS:
Operating System: Windows
Tools: Anaconda with VS Code, PyCharm
b) HARDWARE REQUIREMENTS:
Processor: Pentium IV/III
Hard disk: minimum 80 GB
RAM: minimum 2 GB
SOFTWARE DESCRIPTION
ANACONDA:
Anaconda is a free and open-source distribution of the Python and R
programming languages for scientific computing (data science, machine learning
applications, large-scale data processing, predictive analytics, etc.), that aims to
simplify package management and deployment. Package versions are managed
by the package management system "Conda". The Anaconda distribution comes
with more than 1,400 packages, the Conda package and virtual
environment manager, and a graphical interface called Anaconda Navigator; it eliminates the need to
learn to install each library independently. Custom packages can be made using
the conda build command, and can be shared with others by uploading them to
Anaconda Cloud, PyPI or other repositories. The default installation of
Anaconda2 includes Python 2.7 and Anaconda3 includes Python 3.7.
ANACONDA NAVIGATOR:
Anaconda Navigator is a desktop graphical user interface (GUI) included in
Anaconda® distribution that allows you to launch applications and easily manage
conda packages, environments, and channels without using command-line
commands. Navigator can search for packages on Anaconda.org or in a local
Anaconda Repository.
Data scientists often use multiple versions of many packages and use multiple
environments to separate these different versions. The command-line program
conda is both a package manager and an environment manager. This helps data
scientists ensure that each version of each package has all the dependencies it
requires and works correctly.
The following applications are available by default in Navigator:
⮚ JupyterLab
⮚ Jupyter Notebook
⮚ Spyder
⮚ PyCharm
⮚ VSCode
⮚ Glueviz
⮚ Orange 3 App
⮚ Rstudio
⮚ Anaconda Prompt (Windows only)
⮚ Anaconda PowerShell (Windows only)
Fig.3 ANACONDA NAVIGATOR
PyCharm:
4. Version Control: Seamless integration with version control systems like Git,
providing tools for committing, branching, merging, and viewing changes
directly from within the IDE.
6. Code Analysis: Built-in code analysis tools that provide warnings, suggestions,
and errors to help maintain code quality and adherence to coding standards.
VS CODE:
Visual Studio Code (VS Code) is a lightweight and versatile source-code editor
developed by Microsoft. It is renowned for its ease of use, extensibility, and
powerful features that cater to a wide range of programming languages and
frameworks.
6. Version Control Integration: Seamless integration with Git and other version
control systems, allowing for easy management of code repositories, including
viewing changes, committing, and branching.
PYTHON:
3. Extensive Standard Library: Python comes with a vast standard library that
provides modules and packages for tasks ranging from file I/O and networking to
web development and data manipulation.
6. Scalability and Performance: While Python is known for its ease of use and
rapid development, it also offers scalability and performance optimizations
through techniques like code optimization, parallel processing, and integration
with compiled languages.
IMPLEMENTATION
[Figure: system flow diagram. User signup details are stored in the user database, from which user login fetches details. A question selection window leads to the PHQ-9 and GAD-7 questionnaires (served from the questionnaire database). Scores are stored in the score database and passed to the trained depression and anxiety models for predictive analysis. Report generation delivers the result to the user as a PDF.]
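The user database and score database in the flow above could be modelled as two related tables. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the report does not say which database engine the project uses, so SQLite, the table names, the column names and the helper functions `sign_up`, `save_score` and `fetch_scores` are all hypothetical. Password handling is deliberately left out.

```python
import sqlite3

# In-memory database for illustration; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        username TEXT UNIQUE NOT NULL,
        name TEXT NOT NULL,
        age INTEGER NOT NULL,
        gender TEXT NOT NULL
    );
    CREATE TABLE scores (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        test_name TEXT NOT NULL CHECK (test_name IN ('PHQ-9', 'GAD-7')),
        total_score INTEGER NOT NULL
    );
""")


def sign_up(username, name, age, gender):
    """Store the signup details and return the new user's id."""
    cur = conn.execute(
        "INSERT INTO users (username, name, age, gender) VALUES (?, ?, ?, ?)",
        (username, name, age, gender),
    )
    return cur.lastrowid


def save_score(user_id, test_name, total_score):
    """Store one questionnaire total for a user."""
    conn.execute(
        "INSERT INTO scores (user_id, test_name, total_score) VALUES (?, ?, ?)",
        (user_id, test_name, total_score),
    )


def fetch_scores(username):
    """Return (test_name, total_score) pairs for a user, oldest first."""
    rows = conn.execute(
        "SELECT s.test_name, s.total_score FROM scores s "
        "JOIN users u ON u.id = s.user_id "
        "WHERE u.username = ? ORDER BY s.id",
        (username,),
    )
    return rows.fetchall()


user_id = sign_up("demo_user", "Demo User", 21, "female")
save_score(user_id, "PHQ-9", 11)
save_score(user_id, "GAD-7", 6)
stored = fetch_scores("demo_user")
print(stored)
```

Parameterised queries (the `?` placeholders) keep user-supplied values out of the SQL text itself.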
RESULT AND DISCUSSION
Sign in window
Sign up window
Successful login
GAD – 7 Question Window
Score Analysis Window
Stored Result
Generated Report
Questionnaire and scoring scheme for PHQ-9:
PHQ-9 item | Not at all | Several days | More than half the days | Nearly every day
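The scoring scheme can be sketched in a few lines: each response option maps to a score from 0 to 3, and the item scores are summed to a total of 0 to 27 for PHQ-9 or 0 to 21 for GAD-7. The severity bands in the sketch are the commonly published cut-offs for the two instruments; the report does not list the bands used in the project, so they should be treated as an assumption. The function name `score_questionnaire` is hypothetical.

```python
# Score assigned to each response option (0 to 3 per item).
RESPONSE_SCORES = {
    "Not at all": 0,
    "Several days": 1,
    "More than half the days": 2,
    "Nearly every day": 3,
}

ITEM_COUNTS = {"PHQ-9": 9, "GAD-7": 7}

# Commonly published cut-offs (upper bound, label); assumed, not taken from the report.
SEVERITY_BANDS = {
    "PHQ-9": [(4, "minimal"), (9, "mild"), (14, "moderate"),
              (19, "moderately severe"), (27, "severe")],
    "GAD-7": [(4, "minimal"), (9, "mild"), (14, "moderate"), (21, "severe")],
}


def score_questionnaire(test_name, answers):
    """Return (total score, severity label) for one completed questionnaire."""
    if len(answers) != ITEM_COUNTS[test_name]:
        raise ValueError(f"{test_name} expects {ITEM_COUNTS[test_name]} answers")
    total = sum(RESPONSE_SCORES[answer] for answer in answers)
    for upper_bound, label in SEVERITY_BANDS[test_name]:
        if total <= upper_bound:
            return total, label
    raise ValueError("total score outside the expected range")


phq_answers = ["Several days"] * 5 + ["More than half the days"] * 3 + ["Not at all"]
phq_total, phq_band = score_questionnaire("PHQ-9", phq_answers)
print(phq_total, phq_band)
```

Checking the number of answers before summing guards against a partially completed questionnaire being scored as if it were complete.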
CONCLUSION AND FUTURE WORK
Conclusion:
This project mainly focuses on determining levels of mental health disorders,
namely anxiety and depression, in young adults. Depression and anxiety are very
common among young adults, and early detection helps in providing early treatment.
Our model follows the standardized GAD-7 and PHQ-9 scoring schemes and
takes into consideration other parameters such as age and gender to determine the level
of anxiety or depression of an individual. Automated scoring reduces the risk of
errors that arise in manual scoring and saves time, as there is no need
to book an appointment or be physically present in a particular place. The report
generated can be given to a psychologist or psychiatrist so that treatment can be
started early.
In conclusion, our project leverages the power of automated scoring and
comprehensive assessment to address the critical issue of mental health disorders
among young adults. By ensuring early detection and facilitating timely
intervention, we hope to make a significant positive impact on the mental health
landscape. Our vision of a globally accessible application, enriched with a variety
of mental health tests and supported by a network of mental health professionals,
aims to guide individuals towards the appropriate treatment and support they
need.
Future Work:
To further enhance the impact of our model, we plan to integrate it with existing
health systems and electronic health records. This will streamline the process of
sharing assessment results with healthcare providers, ensuring seamless
continuity of care. We also intend to expand the range of mental health
assessments available within our application. Including tests for other mental
health conditions will provide a more holistic approach to mental health
evaluation. Partnering with mental health professionals will be crucial in refining
our model and ensuring its efficacy. Their expertise will guide the development
of interventions and support systems tailored to the needs of individuals
diagnosed through our application. Educating users about mental health and
providing resources for self-help and professional support will be integral to our
mission. We aim to empower individuals with knowledge and tools to manage
their mental health proactively.
REFERENCES
Books:
Modern Tkinter for Busy Python Developers (by Mark Roseman)
Tkinter GUI Programming by Example (by David Love)
Machine Learning For Absolute Beginners: A Plain English
Introduction (Second Edition) (by Oliver Theobald.)
Links:
https://www.google.co.in/
https://en.wikipedia.org/
https://www.javatpoint.com/python-tkinter
https://www.geeksforgeeks.org/python-programming-language/
https://www.kaggle.com/datasets/shahzadahmad0402/depression-and-anxiety-data
https://www.geeksforgeeks.org/machine-learning/