MADDA WALABU UNIVERSITY
DEPARTMENT OF COMPUTER SCIENCE
Course Title: Machine Learning & Artificial Intelligence
Title: Thyroid Disease Detection using Deep Learning
Submitted to: Wosenu.B (Asst. Prof.)
By
Mekedes Gezahegn
February, 2023
Bale Robe, Ethiopia
Contents
List of Figures
List of Acronyms
Abstract
1 Introduction
1.1 Background
1.2 Problem statement
2 Objective
2.1 General Objective
2.2 Specific objectives
2.3 Scope and Limitation
2.4 Methodology
3 Results and discussion
3.1 Accuracy of prediction
4 Conclusion
References
List of Figures
Figure 1 :- Model Accuracy Graphics
Figure 2 :- Loss function graphics.
List of Acronyms
SVM Support Vector Machine
XGBoost Extreme Gradient Boosting
ReLU Rectified Linear Unit
Abstract
Thyroid disease is a chronic and complex condition that can be brought on by abnormal TSH
(Thyroid-Stimulating Hormone) levels or by problems with the thyroid gland itself. Hypothyroidism,
often known as an underactive thyroid, occurs when the body does not produce enough thyroid
hormones; in some cases the body produces antibodies that attack the thyroid gland. The thyroid
affects energy use in almost every organ of the body and regulates processes such as heartbeat and
digestion, so the body's natural processes start to slow down when thyroid hormone levels are too
low. Hypothyroidism affects women more frequently than men. Although it can start at any age, it
frequently affects adults over the age of 60, and in most people the symptoms worsen gradually over
time. It is characterized by symptoms such as weight gain, fatigue, depression, irregular menstrual
cycles, hair loss, and impaired digestive function. The objective of the project is to use a machine
learning model and a deep learning model to detect thyroid disease. The SVM algorithm reached an
accuracy of 99% on the training data and 99% on the test data, and the deep learning model likewise
reached 99% on the training data and 99% on the test data.
1 Introduction
1.1 Background
The thyroid gland is a vital hormone gland. It plays a major role in the metabolism, growth, and
development of the human body, and it helps regulate many body functions by constantly releasing a
steady amount of thyroid hormones into the bloodstream [1]. Thyroid conditions include a variety of
disorders that can result in the gland producing too little thyroid hormone (hypothyroidism) or too
much (hyperthyroidism). Total serum thyroxine (T4) and total serum triiodothyronine (T3) are the two
active thyroid hormones produced by the thyroid gland to control the body's metabolism. These
hormones are necessary for the proper functioning of every cell, tissue, and organ, for overall
energy yield and regulation, for generating proteins, and for regulating body temperature [2]. When
the thyroid makes either too much or too little of these hormones, the condition is called thyroid
disease. There are several different types of thyroid disease, including hyperthyroidism,
hypothyroidism, thyroiditis, and Hashimoto's thyroiditis [3].
The treatment of disease is a constant concern for medical professionals, and an accurate diagnosis
given to a patient at the appropriate time is crucial. Diagnoses can be made using standard medical
reporting procedures and additional reports based on symptoms. Machine learning techniques may help
answer many questions, such as "What factors influence the thyroid?", "Which age groups are affected
by thyroid disease?", and "What is the applicable treatment for a disease?". Once processed with
suitable approaches, health care data can yield information that supports more efficient and
accurate diagnosis and treatment, with better decision making and a lower risk of death. This paper
covers the preparation, training, and testing of the data, a step-by-step description of each
strategy employed, and a comparison of the predictive accuracy of the various methods.
Hypothyroidism occurs when the body does not produce enough thyroid hormones. It affects women more
frequently than men [4]. It commonly affects people over the age of 60 but can begin at any age, and
it may be discovered through a routine blood test or after symptoms begin. In this project, a model
was developed using deep learning and machine learning.
1.2 Problem statement
Hospitals maintain all patient records, yet those records are not used efficiently for diagnosis.
The proposed system is introduced to make use of these records in an efficient, error-free manner.
Thyroid disease is a common subject of medical diagnosis and prediction, with an onset that is
difficult to forecast. The thyroid gland is one of the body's most vital organs, and the thyroid
hormones it releases are responsible for metabolic regulation. Hyperthyroidism and hypothyroidism
are the two common diseases of the thyroid, which releases hormones that regulate the rate of the
body's metabolism [5]. Given the shortage of medical resources and the serious effects of the
disease, this study aims to contribute to faster disease discovery and rapid diagnosis, and so
reduce the effects of the disease, by using a deep learning neural network to assist physicians in
the diagnosis and treatment of thyroid disorders.
2 Objective
2.1 General Objective
The overall objective of the proposed system is to develop a model for thyroid disease detection
using deep learning and machine learning.
2.2 Specific objectives
• To develop an algorithm for hypothyroid prediction using deep learning and machine learning
• To measure the performance of the algorithm
• To evaluate and validate the performance of the model
2.3 Scope and Limitation
This paper is concerned with predicting whether or not a patient is at risk of thyroid disease. A
disease prediction system can be developed in a variety of ways; this project focuses on using
deep learning classification algorithms to predict the disease. The research faced limitations at
different phases of the process. In particular, although there are several different types of
thyroid disease, this project detects only one type: hypothyroidism.
2.4 Methodology
This section describes the dataset and the techniques used to achieve the objectives.
Data Preprocessing
Collect the dataset
The dataset 'Hypothyroid.csv' was downloaded from the Kaggle dataset repository.
Link: Thyroid Disease Detection | Kaggle
• Import the libraries required to work with the dataset
• Load the dataset
• The dataset contains 3772 records and 29 attributes
• The pandas library was used to load the dataset
• The dataset was imported with the function pd.read_csv()
• Split the dataset
• The dataset was split into a train set and a test set with a split ratio of 0.25
• The train set was used to construct the model using machine learning algorithms
• The test set was used to verify model performance
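The loading and splitting steps above can be sketched as follows. This is a minimal illustration and not the project's actual code: it assumes that the ratio 0.25 is the fraction held out for testing, that scikit-learn's train_test_split is used for the split, and that the label lives in a single column. The helper split_dataset and the column names age, TSH, and label are hypothetical, and a small synthetic frame stands in for 'Hypothyroid.csv' so that the sketch runs on its own.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def split_dataset(df, label_column, test_size=0.25, seed=42):
    """Separate features from the label and hold out a test set.

    Hypothetical helper: the report does not show its splitting code.
    """
    features = df.drop(columns=[label_column])
    labels = df[label_column]
    # stratify keeps the positive/negative proportion similar in both subsets
    return train_test_split(
        features, labels, test_size=test_size, random_state=seed, stratify=labels
    )


# In the project the frame would come from the downloaded file, e.g.
#   df = pd.read_csv("Hypothyroid.csv")
# The synthetic frame below (assumed column names) replaces it for illustration.
df = pd.DataFrame(
    {
        "age": [23, 45, 61, 70, 35, 52, 67, 48],
        "TSH": [1.2, 6.5, 9.1, 0.8, 2.0, 7.7, 1.1, 8.3],
        "label": [0, 1, 1, 0, 0, 1, 0, 1],
    }
)

X_train, X_test, y_train, y_test = split_dataset(df, label_column="label")
print("train rows:", len(X_train), "test rows:", len(X_test))
```

Fixing the random seed makes the split reproducible, which matters when training and test accuracy are later compared across algorithms.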
Applied Algorithms
The model is developed after the dataset has been preprocessed and split into training and test
sets. The proposed models are SVM, XGBoost, and a deep learning model with ReLU and sigmoid
activations. The main purpose of the project is to develop a model to detect thyroid disease.
The rectified linear activation function, or ReLU, is a non-linear, piecewise linear function that
outputs its input directly if the input is positive and outputs zero otherwise. It is the most
commonly used activation function in neural networks, especially in Convolutional Neural Networks
(CNNs) and multilayer perceptrons. The sigmoid function, also known as the logistic function, is a
common choice of output activation because its output lies in the interval (0, 1). Since a
probability only takes values between 0 and 1, sigmoid is especially useful in models that must
predict a probability as their output.
Machine learning (ML) approaches have attracted a lot of attention, with interest expanding across
many fields and applications. The SVM model is implemented in the Python programming language. The
dataset is divided into a training part and a testing part, the SVM model is trained with a linear
kernel on the training part, and its prediction accuracy is calculated on the testing part. During
training, the support vector machine identifies a separating hyperplane that divides the feature
space into a positive side and a negative side. When it receives a new instance, it assigns the
instance to the positive or the negative side of the hyperplane and returns the corresponding
class as the predicted outcome.
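The SVM procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than the project's code: it assumes scikit-learn's SVC with a linear kernel, and it uses synthetic two-feature data in place of the thyroid attributes, so the accuracies it prints say nothing about the results reported in this paper.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic, well separated two-class data standing in for the thyroid features.
rng = np.random.default_rng(0)
negatives = rng.normal(loc=-2.0, scale=0.5, size=(100, 2))
positives = rng.normal(loc=2.0, scale=0.5, size=(100, 2))
X = np.vstack([negatives, positives])
y = np.array([0] * 100 + [1] * 100)

# Hold out 25% of the rows for testing (assumed meaning of the 0.25 ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Train a support vector machine with a linear kernel.
model = SVC(kernel="linear")
model.fit(X_train, y_train)

train_accuracy = accuracy_score(y_train, model.predict(X_train))
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print("train accuracy:", train_accuracy, "test accuracy:", test_accuracy)

# For a new instance, the sign of decision_function shows on which side of
# the hyperplane it falls; predict returns the corresponding class label.
new_instance = np.array([[1.5, 2.5]])
side = model.decision_function(new_instance)[0]
prediction = model.predict(new_instance)[0]
print("signed score:", side, "predicted class:", prediction)
```

Comparing the training and test accuracy side by side, as done here, is a simple way to check whether the model generalizes beyond the rows it was trained on.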
3 Results and discussion
The thyroid dataset was processed with a deep learning model that uses ReLU and sigmoid
activations.
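The two activations can be sketched in NumPy as follows. This is a minimal illustration, not the network used in the project: the layer sizes, the random weights, and the helper name forward are assumptions, and the sketch shows only a forward pass with a ReLU hidden layer and a sigmoid output, without any training.

```python
import numpy as np


def relu(x):
    """Return x where x is positive and zero elsewhere."""
    return np.maximum(0.0, x)


def sigmoid(x):
    """Map any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def forward(features, w_hidden, b_hidden, w_out, b_out):
    """One hidden ReLU layer followed by a single sigmoid output unit."""
    hidden = relu(features @ w_hidden + b_hidden)
    return sigmoid(hidden @ w_out + b_out)


# Hypothetical shapes: 4 patients, 5 input features, 8 hidden units.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 5))
w_hidden = rng.normal(size=(5, 8))
b_hidden = np.zeros(8)
w_out = rng.normal(size=(8, 1))
b_out = np.zeros(1)

probabilities = forward(features, w_hidden, b_hidden, w_out, b_out)
# Threshold the sigmoid output at 0.5 to obtain a positive/negative label.
labels = (probabilities >= 0.5).astype(int)
print(probabilities.ravel(), labels.ravel())
```

Because the sigmoid output lies between 0 and 1, it can be read as the predicted probability of the positive class, which is why it suits the output layer of a two-label problem.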
3.1 Accuracy of prediction
In this section, the prediction accuracy of each algorithm is measured as the percentage of
correctly predicted results out of the total number of results.
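The accuracy measure defined above can be written as a short function. This is a sketch for illustration only: the function name accuracy and the example labels are hypothetical and are not taken from the project's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("accuracy is undefined for an empty set of results")
    correct = sum(1 for truth, guess in zip(y_true, y_pred) if truth == guess)
    return correct / len(y_true)


# Hypothetical labels: 1 = positive (hypothyroid), 0 = negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]

score = accuracy(y_true, y_pred)
print(f"accuracy: {score:.0%}")  # 6 of 8 predictions are correct
```

The same function applies to both the training set and the test set, so the two scores can be compared directly.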
Figure 1 :- Model Accuracy Graphics
Figure 2 :- Loss function graphics.
4 Conclusion
Medical data is essential for treatments such as vaccine development and drug design. In medical
applications, a dataset is gathered by testing patients' responses to a certain drug or by
collecting the results of medical tests used to diagnose a specific disease. Thyroid disease is
often difficult to diagnose because its symptoms are easily confused with those of other diseases.
After early detection of thyroid disease, treatment can control the dysfunction. In this study I
used machine learning algorithms such as SVM and XGBoost, and a deep learning model with the ReLU
and sigmoid activation functions. There are two labels, positive and negative. With SVM, the
accuracy score was 99% on the training data and 99% on the test data. With deep learning, the
accuracy score was likewise 99% on the training data and 99% on the test data. In general, both
machine learning and deep learning achieved good accuracy on the provided data.
References
[1] InformedHealth.org, How does the thyroid gland work, Germany: Institute for Quality and
Efficiency in Health Care, 2016.
[2] T. F, "A comparative study on thyroid disease," p. Appl 36:944–949, 2009.
[3] Gyanendra Chaubey, Dhananjay Bisen, Vibhash Yadav, "Thyroid Disease Prediction Using
Machine Learning Approaches," 2020.
[4] Khalid Salman, Emrullah Sonuç, "Thyroid Disease Classification Using Machine Learning,"
in 2nd International Conference on Physics and Applied Sciences, 2021.
[5] Lerina Aversano, Mario Luca Bernardi, Marta Cimitile, Martina Iammarino, "Thyroid Disease
Treatment prediction with machine learning," in 25th International Conference on
Knowledge-Based and Intelligent Information & Engineering, 2021.
[6] C. M. et al., "Validation of an approach using only patient big data from clinical," 2020.