PCOS Detection and Monitoring Using Machine Learning
2024 5th International Conference on Image Processing and Capsule Networks (ICIPCN) | 979-8-3503-6717-1/24/$31.00 ©2024 IEEE | DOI: 10.1109/ICIPCN63822.2024.00046
Abstract—Polycystic ovarian syndrome (PCOS) is a common endocrine condition affecting women of reproductive age. It is characterized by hormonal abnormalities, irregular menstrual cycles, and the development of ovarian cysts. PCOS affects health and well-being throughout life, and its management includes preventing major illnesses such as high blood pressure, heart and blood vessel problems, type 2 diabetes, uterine cancer, and infertility. PCOS is probably caused by a combination of genetics, hormone abnormalities, stress, environment, and lifestyle choices. Early detection and ongoing monitoring are necessary to promote improved management and prompt intervention. The primary goal of this research study is to use symptoms to determine whether a woman has PCOS. Machine Learning (ML) algorithms such as SVM, Random Forest, Decision Tree, Gaussian Naive Bayes, and Logistic Regression can be used to identify it. This study compared each model's output and applied the best model for predictions. Additionally, an online application was developed that allows patients to obtain exercise and food plans, both of which are helpful in improving their PCOS condition. Patients can also get answers to their inquiries from a chatbot, and in case of any emergency, they can contact the virtual doctor.

Index Terms—Polycystic Ovary Syndrome (PCOS), Machine Learning algorithms, Web App, Dietary plan, Physical Exercise plan, Chatbot, and Virtual Doctor.

I. INTRODUCTION
In patients with PCOS, multiple small fluid-filled sacs appear around the outer edge of the ovary. These sacs, called cysts, hold immature eggs and are referred to as follicles. The follicles do not release eggs regularly [1]. PCOS, or polycystic ovarian syndrome, is a common hormonal condition that affects women of reproductive age. Symptoms typically start early in life, though they may change over time [2]. A lack of ovulation can cause hormonal irregularities, irregular periods, elevated testosterone levels, and ovarian cysts. PCOS is a major contributor to infertility [2].

Fig. 1. PCOS

PCOS is one of the most common hormonal illnesses affecting women of reproductive age and is a significant public health concern [2]. An estimated 8–13% of women of reproductive age are affected by this condition, and up to 70% of cases go untreated [2]. It can lead to long-term health problems that affect physical and emotional health. PCOS arises when the ovaries produce an excessive amount of androgens, hormones that are typically present in small amounts in women [3]. The ovaries create large amounts of androgens when ovulation is absent, which causes the ovaries to fill with fluid and form tiny cysts [4]. The most typical signs of PCOS include irregular menstruation, hyperpigmentation, acne, excessive hair growth, lethargy, severe cramps, depression, and difficulty conceiving [4]. Significant complications of PCOS include obesity, sleep apnea, gestational diabetes, cardiac arrest, metabolic syndrome, type 2 diabetes, and endometrial cancer [4]. The precise etiology of polycystic ovary syndrome (PCOS) remains unclear; nevertheless, a number of factors, including genetics and family history, hormones released during pregnancy and the early postnatal period, and lifestyle or environment, are strongly linked to PCOS. PCOS is an irreversible chronic illness. However, dietary adjustments, exercise regimens, and fertility therapies can all help to improve the condition [5].
Authorized licensed use limited to: Zhejiang University. Downloaded on December 18,2024 at 15:16:16 UTC from IEEE Xplore. Restrictions apply.
Random Forest makes predictions by employing several decision trees. It builds a "forest" of trees, each of which can make a prediction on its own. Combining the predictions from each tree in the forest yields the final estimate. Random Forest is well known for its robustness and its capacity to handle large, highly complex datasets. It is widely used in industries including marketing, banking, and healthcare, for both regression and classification. This project uses it for classification.

K-Nearest Neighbors: K-nearest neighbors (KNN) is a simple machine learning algorithm used for both regression and classification problems. In this project, it is used for classification. For regression, it predicts the mean of the K nearest data points in the feature space, and for classification, it uses their majority vote. It is easy to understand and apply, which makes it accessible to machine learning novices. Its performance depends heavily on the distance metric and the choice of K.

Support Vector Machine: SVM is a powerful supervised learning technique for classification and regression. It finds the optimal hyperplane that divides data points into classes while maximizing the margin between them. It is employed for classification in this project. SVM works effectively in high-dimensional spaces and is especially useful when there are more dimensions than samples.

Naïve Bayes: Naïve Bayes is a popular machine learning technique for classification because it is straightforward and efficient. Its foundation is Bayes' theorem, which determines the probability of a hypothesis given the available data. Naïve Bayes frequently performs well despite its simplicity and its "naive" assumption of feature independence, especially with large datasets. It is especially popular for text classification because of its efficiency and its capacity to handle high-dimensional data.

Decision Tree: The decision tree is a well-known machine learning method for both regression and classification. It expresses decisions as a tree-like structure, in which each interior node represents a decision based on a feature that leads to several branches, and each leaf node represents the prediction or outcome. Decision trees are very helpful for understanding the reasoning behind predictions because they are simple to read and visualize. It is employed for classification in this project.

PREDICTION OF PCOS: The collected dataset first goes through data preprocessing. This is done using SimpleImputer, which fills the missing values in the data with the mean of the corresponding column. Subsequently, feature selection is employed to reduce the high dimensionality, lower computational costs, and improve performance, among other benefits. For feature selection, f_classif and SelectKBest are employed. The dataset is then split into training and test sets. The Random Forest, Decision Tree, SVM, K-NN, Logistic Regression, and Gaussian Naïve Bayes algorithms are trained on the training set and then evaluated on the test set. Among them, logistic regression provides the highest accuracy and precision. Thus, a model is developed using logistic regression. This model is used to predict PCOS for the input values entered in the web app. Finally, the output is displayed on the web app.

III. WEB APPLICATION DEVELOPMENT
In this project, a web application is developed to monitor the patient's health. PCOS is a disorder that is best managed rather than treated. Patients can monitor and control their condition with the help of this website. Exercise and diet plans are crucial components of PCOS management. To help patients manage their lifestyles, a diet chart and a physical exercise plan are provided in the web app. In addition, the web app includes a PCOS chatbot that answers patient questions and a virtual doctor for emergency contact.

Flask is used to build the web application. Flask is a lightweight WSGI web application framework that makes web development simple and quick to get started with, yet is robust enough to scale up to complex applications. When first accessing the website, the user is required to create a new account. After completing the sign-up process, the user can log in to the website with their credentials. After logging in, users can access four options within the web application: doctor details, chatbot, recommendation, and prediction.

Fig. 3. Web App WorkFlow
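The preprocessing and evaluation steps described under PREDICTION OF PCOS can be illustrated with a minimal standard-library sketch. It is not the study's code: the paper names SimpleImputer, SelectKBest, and f_classif, whereas this sketch hand-rolls mean imputation, a K-nearest-neighbors majority vote, and the accuracy and precision metrics so that each step is visible. Feature selection is omitted, and the function names and toy rows are assumptions made purely for illustration.

```python
import math
from collections import Counter


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy_and_precision(y_true, y_pred, positive=1):
    """Accuracy = correct / total; precision = TP / (TP + FP)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return correct / len(y_true), precision


# Toy rows (invented values, NOT the study's dataset); None marks a missing entry.
X = impute_column_means([
    [1.0, 2.0], [1.5, None], [2.0, 2.5],   # class 0
    [8.0, 9.0], [None, 8.5], [9.0, 9.5],   # class 1
])
y = [0, 0, 0, 1, 1, 1]

# Hold out one row per class for testing and train on the rest.
test_idx = [1, 4]
train_X = [x for i, x in enumerate(X) if i not in test_idx]
train_y = [label for i, label in enumerate(y) if i not in test_idx]
test_X = [X[i] for i in test_idx]
test_y = [y[i] for i in test_idx]

predictions = [knn_predict(train_X, train_y, q, k=3) for q in test_X]
accuracy, precision = accuracy_and_precision(test_y, predictions)
```

With real data, the same two metrics would be computed for every candidate classifier on the same test split before one is chosen, which is the comparison the paper reports in favor of logistic regression.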
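Once logistic regression is selected, predicting from the values a user enters reduces to a weighted sum passed through the sigmoid function. The sketch below shows that computation in plain Python, as a function that a Flask view could call with the submitted form values. The feature names are loosely based on the attributes the paper lists for the prediction option (AMH, hair growth, left and right follicle counts), and the coefficients are placeholders invented for illustration, not values fitted in this study.

```python
import math


def pcos_probability(features, weights, bias):
    """Logistic regression: sigmoid of the weighted feature sum plus the bias."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def classify(features, weights, bias, threshold=0.5):
    """Return 1 (PCOS predicted) when the probability reaches the threshold."""
    return int(pcos_probability(features, weights, bias) >= threshold)


# Placeholder coefficients for illustration only; NOT fitted to any data.
WEIGHTS = {
    "amh": 0.2,
    "hair_growth": 1.0,
    "follicles_left": 0.3,
    "follicles_right": 0.3,
}
BIAS = -6.0

# Hypothetical values as they might arrive from the web form.
form_input = {
    "amh": 5.0,
    "hair_growth": 1,
    "follicles_left": 10,
    "follicles_right": 10,
}
probability = pcos_probability(form_input, WEIGHTS, BIAS)
label = classify(form_input, WEIGHTS, BIAS)
```

Keeping the prediction in a pure function like this separates the model from the web layer, so it can be checked without starting the server. In a deployed app the weights would come from the trained model rather than from constants.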
• Prediction: The user can enter AMH, hair growth, follicle numbers (R and L), and other information to predict whether they have PCOS. Of all the attributes, these are the most significant in predicting PCOS.
• Recommendation: This area includes a diet chart and an exercise chart that help users keep track of PCOS. The diet chart gives the user a comprehensive, effective plan for choosing foods and diets that maintain and balance hormone levels and overall health. The exercise chart lists exercises that are beneficial for maintaining good health and tracking the PCOS condition.
• PCOS ChatBot: It was created using natural language processing and is intended to answer patients' questions.
• Virtual Doctor: The contact information of a virtual doctor is provided so that the user can contact the doctor in case of an emergency.

IV. RESULTS & DISCUSSION
Logistic regression is the most accurate and precise classification approach among the algorithms evaluated for PCOS prediction on the chosen dataset. Thus, PCOS is classified and predicted using logistic regression.

Fig. 4. Result

MONITORING: A diet chart and an exercise chart are available on the web app to help users keep track of the patient's health.

• Exercise Chart: The exercise chart in Fig. 6 displays the different exercises that are useful for monitoring the patient's health condition.

Fig. 6. Exercises Chart

V. CONCLUSION AND FUTURE SCOPE
When all classification techniques are applied to the dataset, logistic regression exhibits the highest accuracy among the algorithms, so it is used to predict PCOS. This study has compared each algorithm's output. In addition, a website was developed that provides patients with access to food guidelines and an exercise program, both of which are helpful in tracking their PCOS condition. Patients can also get answers to their inquiries from a chatbot, and in case of an emergency, they can contact a virtual doctor. To improve this system, the web app could add an automated reminder feature. Additionally, a list of diseases linked to PCOS could be generated based on the patient's symptoms and overall health.
REFERENCES
[1] Nasim, S., Almutairi, M. S., Munir, K., Raza, A., Younas, F. (2022). A novel approach for polycystic ovary syndrome prediction using machine learning in bioinformatics. IEEE Access, 10, 97610-97624.
[2] Ahmed, S., Rahman, M. S., Jahan, I., Kaiser, M. S., Hosen, A. S., Ghirime, D., Kim, S. H. (2023). A Review on the Detection Techniques of Polycystic Ovary Syndrome Using Machine Learning. IEEE Access.
[3] Tiwari, S., Kane, L., Koundal, D., Jain, A., Alhudhaif, A., Polat, K., ... Althubiti, S. A. (2022). SPOSDS: A smart Polycystic Ovary Syndrome diagnostic system using machine learning. Expert Systems with Applications, 203, 117592.
[4] Kumar, D., Kumar, A. (2023). PCOS Prediction Using Machine Learning Techniques. NEU Journal for Artificial Intelligence and Internet of Things, 1(2).
[5] Rodriguez Paris, V., Solon-Biet, S. M., Senior, A. M., Edwards, M. C., Desai, R., Tedla, N., ... Walters, K. A. (2020). Defining the impact of dietary macronutrient balance on PCOS traits. Nature communications, 11(1), 5262.
[6] Reka, S., Elakkiya, R. (2022, October). Early diagnosis of polycystic ovary syndrome (PCOS) in young women: a machine learning approach. In 2022 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct) (pp. 286-288). IEEE.
[7] Ollila, M. M., Arffman, R. K., Korhonen, E., Morin-Papunen, L., Franks, S., Junttila, J., Piltonen, T. T. (2023). Women with PCOS have an increased risk for cardiovascular disease regardless of diagnostic criteria—a prospective population-based cohort study. European Journal of Endocrinology, 189(1), 96-105.
[8] Tanwani, Namrata. (2020). Detecting PCOS using Machine Learning. 10.13140/RG.2.2.10265.24169.
[9] Srinithi, V., Rekha, R. (2023, February). Machine learning for diagnosis of polycystic ovarian syndrome (PCOS/PCOD). In 2023 International Conference on Intelligent Systems for Communication, IoT and Security (ICISCoIS) (pp. 19-24). IEEE.
[10] Garad, R. M., Teede, H. J. (2020). Polycystic ovary syndrome: improving policies, awareness, and clinical care. Current Opinion in Endocrine and Metabolic Research, 12, 112-118.
[11] Adla, Y. A. A., Raydan, D. G., Charaf, M. Z. J., Saad, R. A., Nasreddine, J., Diab, M. O. (2021, October). Automated detection of polycystic ovary syndrome using machine learning techniques. In 2021 Sixth international conference on advances in biomedical engineering (ICABME) (pp. 208-212). IEEE.
[12] Graselin, S. O., Arunprasath, T., Rajasekaran, M. P., Kottaimalai, R. (2023, December). A Systematic Review based on the Detection of PCOS using Machine Learning Techniques. In 2023 2nd International Conference on Automation, Computing and Renewable Systems (ICACRS) (pp. 1855-1861). IEEE.
[13] Bharati, S., Podder, P., Mondal, M. R. H. (2020, June). Diagnosis of polycystic ovary syndrome using machine learning algorithms. In 2020 IEEE region 10 symposium (TENSYMP) (pp. 1486-1489). IEEE.
[14] Chauhan, P., Patil, P., Rane, N., Raundale, P., Kanakia, H. (2021, June). Comparative analysis of machine learning algorithms for prediction of pcos. In 2021 international conference on communication information and computing technology (ICCICT) (pp. 1-7). IEEE.
[15] Aggarwal, S., Pandey, K. (2021). An analysis of PCOS disease prediction model using machine learning classification algorithms. Recent Patents on Engineering, 15(6), 53-63.
[16] Sangeetha, D. P., Raj, P. N., Shurthika, R. Early Identification of PCOS using Machine Learning Techniques.