PCOS Detection and Monitoring Using Machine Learning

The document presents a study on detecting and monitoring Polycystic Ovarian Syndrome (PCOS) using various machine learning algorithms, highlighting the importance of early detection and lifestyle management. The proposed methodology includes a web application that offers personalized diet and exercise plans, a chatbot for patient inquiries, and a virtual doctor for emergencies. Logistic regression was identified as the most accurate model for predicting PCOS, and the developed web app aims to improve patient health management through continuous monitoring.

2024 5th International Conference on Image Processing and Capsule Networks (ICIPCN) | 979-8-3503-6717-1/24/$31.00 ©2024 IEEE | DOI: 10.1109/ICIPCN63822.2024.00046

PCOS Detection and Monitoring using Machine Learning

1st Bhargavi Nimmala
Dept. of Electronics and Communication Engineering
Gokaraju Rangaraju Institute of Engineering and Technology
Hyderabad, India

2nd Udaya Deepthi Nimmala
Dept. of Electronics and Communication Engineering
Gokaraju Rangaraju Institute of Engineering and Technology
Hyderabad, India

3rd Akhilesh Elangi
Dept. of Electronics and Communication Engineering
Gokaraju Rangaraju Institute of Engineering and Technology
Hyderabad, India

4th Shilpa Bagade
Assistant Professor, Dept. of Electronics and Communication Engineering
Gokaraju Rangaraju Institute of Engineering and Technology
Hyderabad, India

Abstract—Polycystic ovarian syndrome (PCOS) is a common endocrine condition affecting women of reproductive age. It is characterized by hormonal abnormalities, irregular menstrual cycles, and the development of ovarian cysts. PCOS affects health and well-being throughout life and is associated with major illnesses such as high blood pressure, heart and blood vessel problems, type 2 diabetes, uterine cancer, and infertility. A combination of genetics, hormone abnormalities, stress, environment, and lifestyle choices probably causes PCOS. Early detection and ongoing monitoring are necessary for better management and prompt intervention. The primary goal of this study is to use symptoms to determine whether a woman has PCOS. Machine Learning (ML) algorithms such as SVM, Random Forest, Decision Tree, Gaussian Naive Bayes, and Logistic Regression can be used to identify it. This study compares each model's output and applies the best model for predictions. Additionally, an online application was developed that gives patients exercise and food plans, both of which help improve their PCOS condition. Patients can also get answers to their questions from a chatbot and, in an emergency, contact the virtual doctor.

Index Terms—Polycystic Ovary Syndrome (PCOS), Machine Learning algorithms, Web App, Dietary plan, Physical Exercise plan, Chatbot, and Virtual Doctor.

I. INTRODUCTION
In patients with PCOS, multiple small fluid-filled sacs appear around the outer edge of the ovary. These sacs, called cysts, hold immature eggs and are referred to as follicles. The follicles do not release eggs regularly [1]. Polycystic Ovarian Syndrome (PCOS) is a common hormonal condition that affects women of reproductive age. Symptoms typically start in early life, though they may change over time [2]. A lack of ovulation can cause hormonal irregularities, irregular periods, elevated testosterone levels, and ovarian cysts. PCOS is a major contributor to infertility [2]. Because PCOS is one of the most common hormonal illnesses affecting women of reproductive age, it is a significant public health concern [2]. An estimated 8–13% of women of reproductive age are affected by this condition, and up to 70% of cases go untreated [2]. It can lead to long-term health problems that affect physical and emotional health. PCOS is brought on by the ovaries producing an excessive amount of androgens, which are typically present in small amounts in women [3]. The ovaries produce large amounts of androgens when ovulation is absent, which causes the ovaries to fill with fluid and form tiny cysts [4]. The most typical signs of PCOS include irregular menstruation, hyperpigmentation, acne, excessive hair growth, lethargy, severe cramps, depression, and difficulty conceiving [4]. Significant complications of PCOS include obesity, sleep apnea, gestational diabetes, cardiac arrest, metabolic syndrome, type 2 diabetes, and endometrial cancer [4]. The precise etiology of PCOS remains unclear; nevertheless, a number of factors, including genetics and family history, hormones released during pregnancy and the early postnatal period, and lifestyle or environment, are strongly linked to the condition. PCOS is an irreversible chronic illness. However, dietary adjustments, exercise regimens, and fertility therapies can all help to improve the condition [5].

Fig. 1. PCOS

Authorized licensed use limited to: Zhejiang University. Downloaded on December 18,2024 at 15:16:16 UTC from IEEE Xplore. Restrictions apply.
II. PROPOSED METHODOLOGY
The proposed model builds on current approaches and is designed to overcome their limitations. The system first determines whether a woman has PCOS and then suggests an appropriate course of action for monitoring the patient's health.

Limitations of Existing Methodology:

• It only identifies the presence of PCOS using machine learning algorithms; it does not monitor the condition.
• PCOS cannot be cured, but it can be improved through lifestyle changes, so monitoring is important. Existing methodology does not provide it.
• There is no proper communication between the patient and the system.
• Patients have no way to get their questions answered.
• No virtual doctor is available for emergency contact.
Advantages of the Proposed Methodology:
• The proposed system identifies PCOS and keeps track of the patient's health.
• A PCOS chatbot on the web app answers patients' questions, which improves communication between the patient and the system.
• The patient can manage PCOS with the diet and activity plans shown on the web app.
• A virtual doctor is also accessible for further communication in case of an emergency.

Fig. 2. Proposed System Architecture

A. System Architecture
I. Dataset Collection: Accurate output requires a large amount of data. The dataset used in this project was obtained from Kaggle. It has 541 entries (rows) and 45 features (columns) and has been used in numerous projects and studies.
II. Data Preprocessing: The data gathered from Kaggle contains noise, outliers, incorrect values, and missing values. Using the dataset as-is would produce erroneous predictions, so it must be cleaned before it is passed to the machine learning algorithms. The SimpleImputer class, part of the scikit-learn library's sklearn.impute module, offers basic strategies for imputing missing values: the blanks can be filled with the column's mean, median, most frequent value, or a constant. Here, SimpleImputer is used to fill each missing value with the mean of its column.
III. Feature Selection: Using every feature would reduce the model's accuracy and increase its complexity. Feature selection has three main objectives: to reduce overfitting, lower computational cost, and increase the interpretability of the model. In this project, scikit-learn's SelectKBest and f_classif are used to select the best features. Out of all the available features, the most relevant ones are chosen. The SelectKBest class is part of the scikit-learn library's sklearn.feature_selection module; it keeps the k features with the highest scores under a given statistical test. Reducing the dimensionality of the dataset is an essential step in the data preprocessing pipeline that can improve model performance and reduce overfitting.
IV. Train Data and Test Data: After preprocessing, the dataset must be split into training and testing sets. Here, 80% of the data is used for training and the remaining 20% for testing. The model is trained on the training set, and its performance is assessed on the testing set. Popular classification algorithms, including Random Forest, K-NN, SVM, Naive Bayes, and Decision Tree, were trained on the PCOS dataset for this study.
V. Modelling:
Algorithms
Logistic Regression: Logistic regression is a statistical technique for binary classification, such as yes/no, true/false, or 0/1, where the dependent variable is categorical and has only two possible outcomes. Its foundation is the logistic function, also called the sigmoid function, an S-shaped curve that maps every real number to a value between 0 and 1. This is why it is called "logistic."

Random Forest: A machine learning technique called
Random Forest makes predictions by employing several decision trees. It builds a "forest" of trees, each of which can make a prediction on its own. Combining the predictions from each tree in the forest yields the final estimate. Random Forest is well known for its robustness and its capacity to handle large, highly complex datasets. It is frequently used in a variety of industries, including marketing, banking, and healthcare, for both regression and classification. This project uses it for classification.
K-Nearest Neighbors: K-nearest neighbors (KNN) is a simple machine learning algorithm used for both regression and classification. In this project it is used for classification. For regression, it predicts the mean of the K nearest data points in the feature space; for classification, it takes their majority vote. It is easy to understand and apply, which makes it suitable for machine learning novices. Its performance depends heavily on the distance metric and the choice of K.
Support Vector Machine: SVM is a powerful supervised learning technique for classification and regression. It finds the hyperplane that separates the data points into classes while maximizing the margin between the classes. It is employed for classification in this project. SVM works effectively in high-dimensional spaces and is especially useful when there are more dimensions than samples.
Naïve Bayes: Naive Bayes is a popular machine learning technique for classification because it is straightforward and efficient. Its foundation is Bayes' theorem, which gives the probability of a hypothesis given the available data. Naive Bayes frequently performs well despite its simplicity and the "naive" assumption of feature independence, especially with large datasets. It is especially popular for text classification because of its effectiveness and its capacity to handle high-dimensional data.
Decision Tree: The decision tree is a well-known machine learning method for both regression and classification. It expresses decisions as a tree-like structure: each interior node represents a decision based on a feature and leads to several branches, and each leaf node represents the prediction or outcome. Decision trees are very helpful for understanding the reasoning behind predictions because they are simple to read and visualize. It is employed for classification in this project.
PREDICTION OF PCOS: The collected dataset first goes through data preprocessing: SimpleImputer fills the missing values with the mean of each column. Feature selection is then applied to reduce dimensionality, lower computational cost, and improve performance; f_classif and SelectKBest are used for this step. The dataset is then split into training and test sets. The Random Forest, Decision Tree, SVM, K-NN, Logistic Regression, and Gaussian Naïve Bayes algorithms are trained on the training set and then evaluated on the test set. Among them, logistic regression provides the highest accuracy and precision, so the final model is built with logistic regression. This model predicts PCOS from the input values entered in the web app, and the output is displayed on the web app.

III. WEB APPLICATION DEVELOPMENT
In this project, a web application is developed to monitor the patient's health. PCOS is a disorder best managed rather than treated. Patients can monitor and control their condition with the help of this website. Exercise and diet plans are crucial components of PCOS management. To help patients manage their lifestyles, a diet chart and a physical exercise plan are provided in the web app. In addition, a PCOS chatbot answers patient questions, and a virtual doctor is available for emergency contact.
Flask is used to build the web application. Flask is a lightweight WSGI web application framework that is robust enough to scale up to complex applications, yet still makes web development simple and quick to get started with. On first visiting the website, the user must create an account. After completing the sign-up process, the user can log in with their credentials. Once logged in, users can access four options within the application: doctor details, chatbot, recommendation, and prediction.

Fig. 3. Web App WorkFlow
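The Flask application described in Section III might be organized along the following lines. This is a minimal sketch, not the authors' implementation: the route names, the JSON field names in FEATURES, the DOCTOR record, and the StubModel class are all hypothetical, since the paper does not specify them. Sign-up, login, the chatbot, and the recommendation charts are left out for brevity.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical field names for the inputs the paper highlights
# (AMH, hair growth, left/right follicle counts).
FEATURES = ["amh", "hair_growth", "follicles_left", "follicles_right"]

# Hypothetical placeholder contact for the "virtual doctor" option.
DOCTOR = {"name": "Dr. Placeholder", "phone": "000-000-0000"}


class StubModel:
    """Stand-in with scikit-learn's predict() shape; always predicts class 0."""

    def predict(self, rows):
        return [0 for _ in rows]


# A real app would load the fitted logistic regression model here instead.
model = StubModel()


@app.route("/predict", methods=["POST"])
def predict():
    # Read the JSON body; treat a missing or invalid body as empty.
    data = request.get_json(silent=True) or {}
    missing = [name for name in FEATURES if name not in data]
    if missing:
        return jsonify(error="missing fields", fields=missing), 400
    try:
        row = [float(data[name]) for name in FEATURES]
    except (TypeError, ValueError):
        return jsonify(error="all fields must be numeric"), 400
    # The model receives one row and returns one class label (0 or 1).
    label = int(model.predict([row])[0])
    return jsonify(pcos=bool(label))


@app.route("/doctor", methods=["GET"])
def doctor():
    # Return the emergency contact details.
    return jsonify(DOCTOR)
```

Assuming the file is saved as app.py, it can be served locally with `flask --app app run`. Keeping the model behind a single predict() call means the stub can be replaced by a trained classifier without touching the routes.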
The options available after login are:
• Prediction: The user enters AMH, hair growth, follicle numbers (R and L), and other information to predict whether they have PCOS. Of all the attributes, these are the most significant for predicting PCOS.
• Recommendation: This area includes a food and exercise chart that helps users keep an eye on PCOS. The diet chart gives the user a comprehensive, effective plan for choosing foods that maintain and balance hormone levels and overall health. The workout chart lists exercises that are beneficial for maintaining good health and tracking the PCOS condition.
• PCOS ChatBot: It was built using natural language processing and is intended to answer patients' questions.
• Virtual Doctor: The contact information of a virtual doctor is provided so that the patient can reach the doctor in an emergency.
IV. RESULTS & DISCUSSION
Logistic Regression is the most accurate and precise classification approach among the algorithms evaluated for PCOS prediction on the chosen dataset. Thus, PCOS is classified and predicted using logistic regression.

Fig. 4. Result

MONITORING: A diet chart and an exercise chart are available on the web app to help users keep an eye on the patient's health.

Fig. 5. Dietary Chart

Dietary: The dietary chart in Fig. 5 shows which foods a PCOS patient can eat. This dietary plan is useful for monitoring their health condition.
• Exercise Chart: The exercise chart in Fig. 6 shows the different exercises that are useful for monitoring the patient's health condition.

Fig. 6. Exercises Chart

V. CONCLUSION AND FUTURE SCOPE
When all the classification techniques are applied to the provided dataset, logistic regression exhibits the highest accuracy, so it is used for predicting PCOS. This study has compared each algorithm's output. In addition, a website was developed that gives patients access to food guidelines and an exercise program, both of which help in tracking their PCOS condition. Patients can also get answers to their questions from a chatbot and, in an emergency, contact a virtual doctor. To improve this model, the web app could add an automated reminder feature. Additionally, diseases linked to PCOS could be identified from the patient's symptoms and overall health.
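The model comparison described in this paper (mean imputation, SelectKBest with f_classif, an 80/20 split, and six classifiers) can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the authors' code: the data is synthetic (541 rows, to match the dataset size, but with an assumed 40 numeric features), and k=10, the random seeds, stratified splitting, and max_iter are choices made here, not values given in the paper. Scores on synthetic data say nothing about the ranking reported above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the Kaggle PCOS table (assumption: 40 numeric features).
X, y = make_classification(n_samples=541, n_features=40, n_informative=8,
                           random_state=0)

# Blank out about 2% of the entries so the imputer has something to fill.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.02] = np.nan

# 80/20 split as in the paper; stratify and random_state are assumptions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
    "K-NN": KNeighborsClassifier(),
    "Gaussian Naive Bayes": GaussianNB(),
}


def build(model, k=10):
    """Mean imputation, then top-k ANOVA F-score features, then the classifier."""
    return make_pipeline(SimpleImputer(strategy="mean"),
                         SelectKBest(f_classif, k=k),
                         model)


results = {}
for name, model in models.items():
    # Fit imputer, selector, and classifier on the training set only.
    pipe = build(model).fit(X_train, y_train)
    pred = pipe.predict(X_test)
    results[name] = (accuracy_score(y_test, pred),
                     precision_score(y_test, pred, zero_division=0))
    print(f"{name}: accuracy={results[name][0]:.3f}, "
          f"precision={results[name][1]:.3f}")
```

Wrapping the imputer and selector in one pipeline means both are fitted on the training split only, so no statistics from the test rows leak into training. The paper does not say whether its preprocessing was done before or after the split.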
REFERENCES
[1] Nasim, S., Almutairi, M. S., Munir, K., Raza, A., Younas, F. (2022). A novel approach for polycystic ovary syndrome prediction using machine learning in bioinformatics. IEEE Access, 10, 97610-97624.
[2] Ahmed, S., Rahman, M. S., Jahan, I., Kaiser, M. S., Hosen, A. S., Ghirime, D., Kim, S. H. (2023). A Review on the Detection Techniques of Polycystic Ovary Syndrome Using Machine Learning. IEEE Access.
[3] Tiwari, S., Kane, L., Koundal, D., Jain, A., Alhudhaif, A., Polat, K., ... Althubiti, S. A. (2022). SPOSDS: A smart Polycystic Ovary Syndrome diagnostic system using machine learning. Expert Systems with Applications, 203, 117592.
[4] Kumar, D., Kumar, A. (2023). PCOS Prediction Using Machine Learning Techniques. NEU Journal for Artificial Intelligence and Internet of Things, 1(2).
[5] Rodriguez Paris, V., Solon-Biet, S. M., Senior, A. M., Edwards, M. C., Desai, R., Tedla, N., ... Walters, K. A. (2020). Defining the impact of dietary macronutrient balance on PCOS traits. Nature communications, 11(1), 5262.
[6] Reka, S., Elakkiya, R. (2022, October). Early diagnosis of polycystic ovary syndrome (PCOS) in young women: a machine learning approach. In 2022 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct) (pp. 286-288). IEEE.
[7] Ollila, M. M., Arffman, R. K., Korhonen, E., Morin-Papunen, L., Franks, S., Junttila, J., Piltonen, T. T. (2023). Women with PCOS have an increased risk for cardiovascular disease regardless of diagnostic criteria—a prospective population-based cohort study. European Journal of Endocrinology, 189(1), 96-105.
[8] Tanwani, Namrata. (2020). Detecting PCOS using Machine Learning. 10.13140/RG.2.2.10265.24169.
[9] Srinithi, V., Rekha, R. (2023, February). Machine learning for diagnosis of polycystic ovarian syndrome (PCOS/PCOD). In 2023 International Conference on Intelligent Systems for Communication, IoT and Security (ICISCoIS) (pp. 19-24). IEEE.
[10] Garad, R. M., Teede, H. J. (2020). Polycystic ovary syndrome: improving policies, awareness, and clinical care. Current Opinion in Endocrine and Metabolic Research, 12, 112-118.
[11] Adla, Y. A. A., Raydan, D. G., Charaf, M. Z. J., Saad, R. A., Nasreddine, J., Diab, M. O. (2021, October). Automated detection of polycystic ovary syndrome using machine learning techniques. In 2021 Sixth international conference on advances in biomedical engineering (ICABME) (pp. 208-212). IEEE.
[12] Graselin, S. O., Arunprasath, T., Rajasekaran, M. P., Kottaimalai, R. (2023, December). A Systematic Review based on the Detection of PCOS using Machine Learning Techniques. In 2023 2nd International Conference on Automation, Computing and Renewable Systems (ICACRS) (pp. 1855-1861). IEEE.
[13] Bharati, S., Podder, P., Mondal, M. R. H. (2020, June). Diagnosis of polycystic ovary syndrome using machine learning algorithms. In 2020 IEEE region 10 symposium (TENSYMP) (pp. 1486-1489). IEEE.
[14] Chauhan, P., Patil, P., Rane, N., Raundale, P., Kanakia, H. (2021, June). Comparative analysis of machine learning algorithms for prediction of pcos. In 2021 international conference on communication information and computing technology (ICCICT) (pp. 1-7). IEEE.
[15] Aggarwal, S., Pandey, K. (2021). An analysis of PCOS disease prediction model using machine learning classification algorithms. Recent Patents on Engineering, 15(6), 53-63.
[16] Sangeetha, D. P., Raj, P. N., Shurthika, R. Early Identification of PCOS using Machine Learning Techniques.