Project Final

Uploaded by

21ucs050
Copyright
© All Rights Reserved

BREAST CANCER CLASSIFICATION

USING MACHINE LEARNING

Under The Guidance Of


Prof. Dr. Geetha

1. SRINITHA S
2. SUJA G
3. SRI KARISHMA
CONTENTS

Abstract
Introduction
Problem definition and scope
Methodology
Modelling
Evaluation Metrics
Requirement specification
References
ABSTRACT

Breast cancer is one of the most common and deadly types of cancer in the world.

Based on machine learning algorithms such as XGBoost, random forest, logistic
regression, and K-nearest neighbor, this paper establishes different models to
classify and predict breast cancer, so as to provide a reference for the early
diagnosis of breast cancer.

Accuracy indicates the proportion of tumors that the model classifies correctly
as malignant or benign, which is of great significance for the classification of
breast cancer.
INTRODUCTION
According to the World Health Organization (WHO), about 627,000 women died from
breast cancer in 2018. Late discovery of the disease and complex procedures are
the main reasons for the low survival rate.

Therefore, early detection of breast cancer is vital to decrease the risk of
cancer developing in other tissue cells and to carry out proper treatment[2].
Cancer is the growth of abnormal cells that arises from genetic changes in those
cells and spreads through the body; a delay in diagnosis and treatment can lead
to death.

Breast tumors are of two types, malignant and benign. The former is harmful, can
spread to other organs, and is classified as cancerous. The latter is
non-invasive, not harmful, and does not spread to other organs.
PROBLEM DEFINITION AND SCOPE
Develop a machine learning model to classify breast cancer tumors as malignant or
benign based on features extracted from fine needle aspiration (FNA).

The goal is to assist healthcare professionals in diagnosing breast cancer more
accurately and efficiently, leading to timely interventions and improved patient
outcomes.

Identify relevant features such as tumor size, shape, texture, and margin, as well as
patient demographic information.

Split the dataset into training, validation, and test sets, ensuring that each set
contains a representative distribution of malignant and benign cases.
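
The split described above can be sketched in plain Python. This is only an illustration: the function name `stratified_split`, the 70/15/15 fractions, the seed, and the toy labels are assumptions made for the example, not values taken from the project.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that preserve class proportions.

    The fractions are assumed values chosen for illustration.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class separately
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # whatever remains goes to test
    return train, val, test


# Toy labels (assumed): 0 = malignant, 1 = benign, 40 and 60 cases.
labels = [0] * 40 + [1] * 60
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because each class is shuffled and cut separately, every subset keeps roughly the same malignant-to-benign ratio as the full dataset.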
METHODOLOGY
1. Dataset collection:

The dataset is collected from fine needle aspiration (FNA), which is a type of
biopsy procedure.

2. Data preprocessing:
Imputing missing values.
Standardizing the values.

3. Train and test split:

Train the model using the training data and then evaluate the model using
the test data.
4. Logistic regression model:
It is a strong choice when it comes to binary classification.

The logistic regression model fitted on the training data is used to
produce predictions for new data.

5. Result analysis:
Given new data, the model indicates whether the particular tumor is
malignant or benign.

0 indicates malignant.

1 indicates benign.
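
The five steps above can be sketched end to end in plain Python. This is a minimal sketch, not the project's actual code: the helper names, the two toy features, the measurement values, the learning rate, and the epoch count are all assumptions, and the logistic regression is a hand-written gradient descent rather than a library model.

```python
import math
from statistics import mean, pstdev


def fit_preprocessor(rows):
    """Learn per-column mean and standard deviation from the training rows,
    ignoring missing values (None)."""
    means, stds = [], []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        means.append(mean(observed))
        stds.append(pstdev(observed) or 1.0)  # guard against zero spread
    return means, stds


def transform(rows, means, stds):
    """Impute missing values with the column mean, then standardize."""
    return [
        [((m if v is None else v) - m) / s for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss.
    lr and epochs are assumed values for this toy example."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Return 1 (benign) when P(benign) >= 0.5, otherwise 0 (malignant)."""
    return [int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5)
            for xi in X]


# Toy rows (assumed values): two hypothetical features per tumor, with gaps.
train_rows = [[20.0, 25.0], [18.5, None], [19.0, 22.0], [21.0, 24.0],
              [11.0, 15.0], [12.5, 17.0], [None, 14.0], [10.5, 16.0]]
train_labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = malignant, 1 = benign
test_rows = [[19.5, 23.0], [11.5, 15.5]]

means, stds = fit_preprocessor(train_rows)        # statistics from training only
w, b = train_logistic_regression(transform(train_rows, means, stds), train_labels)
predictions = predict(transform(test_rows, means, stds), w, b)
print(predictions)
```

The preprocessing statistics are learned from the training rows only and then reused for the test rows, so no information from the test data leaks into training.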
MODELLING

1. Logistic Regression: Logistic Regression is a simple and interpretable
algorithm that works well for binary classification tasks like breast
cancer classification. It models the probability that a given instance
belongs to a particular class.

2. Random Forest: Random Forest is an ensemble learning algorithm that
constructs multiple decision trees during training and outputs the mode
of the classes (classification) or the mean prediction (regression) of
the individual trees.

3. K-Nearest Neighbors (KNN): KNN is a simple and intuitive algorithm that
classifies instances based on the majority class among their nearest
neighbors in the feature space.

4. Naive Bayes: Naive Bayes is a probabilistic classifier that applies Bayes'
theorem with the "naive" assumption of independence between features.
Despite its simplicity, Naive Bayes can perform well in many classification
tasks, especially with limited amounts of data.
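
Two of the models above, KNN and Naive Bayes, are small enough to sketch in plain Python. The sketch below is an illustration under assumptions: the function names, the choice of k = 3, the Gaussian form of Naive Bayes, and the toy measurements are not taken from the project. Random Forest is left out because a faithful version would not fit in a short example.

```python
import math
from collections import Counter
from statistics import mean, pvariance


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(zip(train_X, train_y),
                         key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def fit_gaussian_nb(train_X, train_y):
    """Estimate a prior plus a per-feature mean and variance for each class."""
    model = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        # A tiny constant keeps the variance strictly positive.
        stats = [(mean(col), pvariance(col) + 1e-9) for col in zip(*rows)]
        model[label] = (len(rows) / len(train_X), stats)
    return model


def gaussian_nb_predict(model, query):
    """Pick the class with the highest log posterior, treating the features
    as independent given the class (the "naive" assumption)."""
    def log_posterior(prior, stats):
        total = math.log(prior)
        for value, (mu, var) in zip(query, stats):
            total += -0.5 * math.log(2 * math.pi * var)
            total -= (value - mu) ** 2 / (2 * var)
        return total

    return max(model, key=lambda label: log_posterior(*model[label]))


# Toy data (assumed values): two hypothetical features per tumor.
X = [[20.0, 25.0], [19.0, 22.0], [21.0, 24.0],
     [11.0, 15.0], [12.5, 17.0], [10.5, 16.0]]
y = [0, 0, 0, 1, 1, 1]  # 0 = malignant, 1 = benign

nb_model = fit_gaussian_nb(X, y)
for query in ([19.5, 23.0], [11.5, 15.5]):
    print(query, knn_predict(X, y, query), gaussian_nb_predict(nb_model, query))
```

KNN stores the training data and defers all work to prediction time, while Naive Bayes summarizes each class with a few statistics, which is one reason it can cope with limited data.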
EVALUATION METRICS
In the formulas below, TP, TN, FP, and FN are the numbers of true positives,
true negatives, false positives, and false negatives, with a cancer case treated
as the positive class.

1. Accuracy:

This measures the proportion of correctly classified instances out of the total
instances.
Accuracy = (TP + TN) / (TP + TN + FP + FN)

2. Precision:
In the context of breast cancer classification, precision tells us the proportion of
correctly identified cancer cases among all cases predicted as positive. A higher
precision indicates fewer false positives.

Precision = TP / (TP + FP)

3. Recall:

Recall tells us the proportion of correctly identified cancer cases among all
actual cancer cases. A higher recall indicates fewer false negatives.

Recall = TP / (TP + FN)

4. F1 Score:

The F1 score is the harmonic mean of precision and recall, and it provides
a balance between the two metrics.

F1 Score = 2 * (precision * recall) / (precision + recall)
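
The four formulas can be checked with a short plain-Python sketch. The function names and the toy label lists are assumptions for illustration. The positive class is set to 0 because the methodology encodes malignant as 0.

```python
def confusion_counts(y_true, y_pred, positive=0):
    """Count TP, TN, FP, FN, treating `positive` as the cancer label."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, tn, fp, fn):
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# Toy labels (assumed): 0 = malignant (positive class), 1 = benign.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 0, 1]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(*counts)
print(counts, metrics)
```

In this toy case one malignant tumor is missed (a false negative) and one benign tumor is flagged (a false positive), so precision and recall are both 3/4 while accuracy is 8/10.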


REQUIREMENT SPECIFICATION

HARDWARE

System: Intel Core i3 processor
Hard Disk: 512 GB
Input Devices: Keyboard, Mouse
RAM: 8 GB

SOFTWARE

Operating system: Windows 11
Coding language: Python
IDE: Spyder
REFERENCES
https://www.hindawi.com/journals/cin/2023/6530719/

https://www.researchgate.net/publication/346617710_Breast_cancer_classification_using_machine_learning_techniques_a_comparative_study

https://youtu.be/bFh1umUDaGc?si=jrm4w7zQWbjDQO2r

https://www.cancer.org/cancer/types/breast-cancer/screening-tests-and-early-detection/breast-biopsy/fine-needle-aspiration-biopsy-of-the-breast.html

https://www.geeksforgeeks.org/understanding-logistic-regression/
THANK YOU.....
