0% found this document useful (0 votes)

21 views23 pages

Mini Project Report (1) .Final

This project report focuses on the development of a breast cancer prediction model using machine learning techniques, specifically comparing the performance of algorithms such as SVM, Logistic Regression, Random Forest, and KNN. The research aims to enhance prediction accuracy for breast cancer outcomes across three domains: pre-diagnosis, diagnosis and treatment prediction, and treatment outcome prediction. The methodology includes data processing, feature selection, and model training, utilizing various datasets to validate the effectiveness of the proposed techniques.

Uploaded by

Pratibha Mohan

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

21 views23 pages

Mini Project Report (1) .Final

Uploaded by

Pratibha Mohan

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 23

A PROJECT REPORT

ON
Breast Cancer Prediction
For the partial fulfillment for the award of the degree of

BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE AND ENGINEERING
(Artificial Intelligence and Machine Learning)

Submitted By

Akash Yadav (2001921530005)

Akhil kumar (2001921530006)
Anurag Yadav (2001921530015)

Under the Supervision of

Mrs. Anju Chandna

G.L. BAJAJ INSTITUTE OF TECHNOLOGY AND MANAGEMENT,

GREATER NOIDA

Affiliated to
DR. APJ ABDUL KALAM TECHNICAL UNIVERSITY, LUCKNOW
2022-23
TABLE OF CONTENT

1. Declaration................................................................................................ (1)
2. Certificate …............................................................................................. (2)
3. Acknowledgement .................................................................................... (3)
4. Abstract...................................................................................................... (4)

Chapters

1. Introduction..........................................................................................(5)
2. Related Works……………….….........................................................(6)
3. Fundamentals........................................................................................(7)
4. System Requirements and Specification…………………………….(11)
5. Proposed Methodology………………………………………………..(12)
6. Results……………………………………………………..…………(14)
7. Conclusions and Future Scope……………………………………….(18)
8. Bibliography………………………………………………………….(19)
Declaration

We hereby declare that the project work presented in this report “Breast Cancer
Prediction”, in partial fulfillment of the requirement for the award of thedegree
of Bachelor of Technology in Computer Science & Engineering, submittedto A.P.J.
Abdul Kalam Technical University, Lucknow, is based on my own workcarried
out at Department of Computer Science & Engineering, G.L. BajajInstitute of
Technology & Management, Greater Noida. The work contained in the report is
original and project work reported in this report has not been submitted by me/us
for award of any other degree or diploma.

Name: Akash yadav Name: Akhil kumar

Roll No.: 2001921530005 Roll No.: 2001921530006

Signature: Signature:

Name: Anurag Yadav

Roll No.: 2001921530015

Signature:

Date:

1
Certificate

This is to certify that the Project report “Breast Cancer Prediction” done by

Akash yadav (2001921530005), Akhil kumar (2001921530006) and Anurag

yadav (2001921530015) is an original work carried out by them in Department

of Computer Science & Engineering, G.L Bajaj Institute of Technology &

Management, Greater Noida under my guidance. The matter embodied in this

project work has not been submitted earlier for the award of any degree or diploma

to the best of my knowledge and belief.

Date:

Mrs. Anju Chandna Dr. Sansar Singh Chauhan

Signature of the Signature of

Supervisor Head of the Department

2
Acknowledgement

The merciful guidance bestowed to us by the almighty made us stick out this
project to a successful end. We humbly pray with sincere heart for his guidance
to continue forever.

We pay thanks to our project guide Mrs. Anju Chandna who has given guidance
and light to us during this project. Her versatile knowledge has caused us in the
critical times during the span of this project.

We pay special thanks to our Head of Department Dr. Sansar Singh Chauhan
who has been always present as a support and help us in all possible way during
this project.

We also take this opportunity to express our gratitude to all those people who have
been directly and indirectly with us during the completion of the project.

We want to thanks our friends who have always encouraged us during this project.

At the last but not least thanks to all the faculty of CSE Department who provided
valuable suggestions during the period of project.

3
Abstract

Women are seriously threatened by breast cancer with high morbidity and mortality. The

lack of robust prognosis models results in difficulty for doctors to prepare a treatment plan

that may prolong patient survival time. Hence, the requirement of time is to develop the

technique which gives minimum error to increase accuracy. Four algorithm SVM, Logistic

Regression, Random Forest and KNN which predict the breast cancer outcome have been

compared in the paper using different datasets. All experiments are executed within a

simulation environment and conducted in JUPYTER platform. Aim of research categorises

in three domains. First domain is prediction of cancer before diagnosis, second domain is

prediction of diagnosis and treatment and third domain focuses on outcome during treatment.

The proposed work can be used to predict the outcome of different technique and suitable

technique can be used depending upon requirement. This research is carried out to predict the

accuracy. The future research can be carried out to predict the other different parameters and

breast cancer research can be categorises on basis of other parameters.

Keywords — Breast Cancer, machine learning, feature selection, classification, prediction,

KNN , Random Forest, ROC.

4
Chapter 1
INTRODUCTION

Problem Definition

Fig. 1.1:Mine Image

The second major cause of women's death is breast cancer (after lung cancer). 246,660 of
women's new cases of invasive breast cancer are expected to be diagnosed in the US during
2016 and 40,450 of women’s death is estimated. Breast cancer is a type of cancer that starts
in the breast. Cancer starts when cells begin to grow out of control.Breast cancer cells usually
form a tumour that can often be seen on an x-ray or felt as a lump.

Breast cancer can spread when the cancer cells get into the blood or lymph system and are
carried to other parts of the body. The cause of Breast Cancer includes changes and mutations
in DNA. There are many different types of breast cancer and common ones include ductal
carcinoma in situ (DCIS) and invasive carcinoma. Others, like phyllodes tumours and
angiosarcoma are less common. There are many algorithms for classification of breast cancer
outcomes.

The side effects of Breast Cancer are – Fatigue, Headaches, Pain and numbness (peripheral
neuropathy), Bone loss and osteoporosis. There are many algorithms for classification and
prediction of breast cancer outcomes. The present paper gives a comparison between the
performance of four classifiers: SVM , Logistic Regression , Random Forest and kNN which
are among the most influential data mining algorithms. It can be medically detected early
during a screening examination through mammography or by portable cancer diagnostic tool.

5
Cancerous breast tissues change with the progression of the disease, which can be directly
linked to cancer staging. The stage of breast cancer (I–IV) describes how far a patient’s cancer
has proliferated. Statistical indicators such as tumour size, lymph node metastasis, and distant
metastasis and so on are used to determine stages. To prevent cancer from spreading, patients
have to undergo breast cancer surgery, chemotherapy, radiotherapy and endocrine. The goal
of the research is to identify and classify Malignant.

Typical cancer screening procedures are grounded on the "gold-standard", that consists of three
tests: clinical evaluation, radiological imaging, and pathology testing. [18]. This traditional
technique, which is based on regression, detects the existence of cancer, whereas new ML
techniques and algorithms are built on model creation.

In its training and testing stages, the model is meant to forecast unknown data and offers a
satisfactory predicted outcome [19]. Preprocessing, feature selection or extraction, and
classification are the three major methodologies used in machine learning [20]. The feature
extraction part of the machine learning method is crucial for cancer diagnosis and prediction.
This process may differentiate between benign and malignant tumours [21]

6
Chapter 2
RELATED WORK

The cause of Breast Cancer includes somechanges and mutations in DNA. Cancer starts
when cells begin to grow out of control. Breast cancer cells usually form a tumour that
can often be seen on an x-ray or felt as a lump. There are many different types of breast
cancer and common ones include some ductal carcinoma in situ (DCIS) and invasive
carcinoma. Others, like phyllodes tumours and angiosarcoma are less common. Wang,
D.; Zhang and Y.-H Huang (2018) et al.

[1] used Logistic Regression and achieved an Accuracy of 96.4 %. Akbugday et al., [2]
performed classification on Breast Cancer Dataset by using KNN, SVM and achieved
accuracy of 96.85%. KAYA KELES et al., [3] in the paper titled “Breast Cancer
Prediction and Detection Using Data Mining” used Random Forest and achieved
accuracy of 92.2%.Vikas Chaurasia and Saurabh Pal et al., [4] compare the performance
criterion of supervised learning classifiers; such as Naïve Bayes, SVM-RBF kernel, RBF
neural networks, Decision trees (J48) and simple CART; to find the best classifier in
breast cancer datasets. Dalen, D.Walker and G. Kadam et al. [5] used ADABOOST and
achieved accuracy of 97.5% better than Random Forest. Kavitha et al., [6] used ensemble
methods with Neural Networks and achieved accuracy of 96.3% lesser than previous
7
studies. According to Sinthia et al., [7] used backpropagation method with 94.2 %
accuracy.

The experimental result shows that SVM-RBF kernel is more accurate than other
classifiers; it scores accuracy of 96.84% in Wisconsin Breast Cancer (original) datasets .
We have used classification methods like SVM, KNN, Random Forest, Naïve Bayes,
ANN. Prediction and prognosis of cancer development are focused on three major
domains: risk assessment or prediction of cancer susceptibility, prediction of cancer
relapse, and prediction of cancer survival rate. The first domain comprises prediction of
the probability of developing certain cancer prior to the patient diagnostics.

The second issue is related to prediction of cancer recurrence in terms of diagnostics and
treatment, and the third case is aimed at prediction of several possible parameters
characterizing cancer development and treatment after the diagnosis of the disease:
survival time, life expectancy, progression, drug sensitivity, etc. The survivability rate
and the cancer relapse are dependent very much on the medical treatment and the quality
of the diagnosis.

Radiology professionals frequently struggle with mammography mass lesion labelling,

which can lead to unneeded and costly breast biopsies. The paper's implementation was
evaluated using three publicly available benchmark datasets: the DDMS, INbreast, and
BCDR databases for training and testing, and the MIAS dataset for testing only. The
results showed that when PCNN is paired with CNN, it outperforms other approache
for the same publicly available datasets .
As we know that data pre-processing is a data mining technique that used for filter data
in a usable format. Because the realworld dataset almost available in different format. It
is not available as per our requirement so it must be filtered in understandable format.
Data pre-processing is a proven method of resolving such issues. Data preprocessing
convert the dataset into usable format for pre-processing we have used standardization
method.

8
Chapter 3
Fundamentals
1. Logistic regression:

This type of statistical model (also known as logit model) is often used for classification
and predictive analytics. Logistic regression estimates the probability of an event
occurring, such as voted or didn’t vote, based on a given dataset of independent variables.
Since the outcome is a probability, the dependent variable is bounded between 0 and 1. In
logistic regression, a logit transformation is applied on the odds—that is, the probability of
success divided by the probability of failure. This is also commonly known as the log odds,
or the natural logarithm of odds, and this logistic function is represented by the following
formulas:

In this logistic regression equation, logit(pi) is the dependent or response variable and x is
the independent variable. The beta parameter, or coefficient, in this model is commonly
estimated via maximum likelihood estimation (MLE). This method tests different values
9
of beta through multiple iterations to optimize for the best fit of log odds.

All of these iterations produce the log likelihood function, and logistic regression seeks to
maximize this function to find the best parameter estimate. Once the optimal coefficient
(or coefficients if there is more than one independent variable) is found, the conditional
probabilities for each observation can be calculated, logged, and summed together to yield
a predicted probability. For binary classification, a probability less than 0.5 will predict 0
while a probability greater than 0 will predict 1.

After the model has been computed, it’s best practice to evaluate the how well the model
predicts the dependent variable, which is called goodness of fit. The Hosmer–Lemeshow
test is a popular method to assess model fit.

Graph for Logistic Regression

10
2. Random Forest Classifier:

Random forest, as the name implies, constitutes of many separate decision

trees which all works as an ensemble Each separate tree of the Random forest
[19] gives out a class forecast and the class with the most votes transform
into our model’s desire as shown in Fig 3.5.

Fig. 3.5: Random Forest Classification

The principal idea propelling random forest is a straightforward however an

amazing way — the knowledge of groups. In information science talk, the
clarification that the random forest model works so well is: A colossal number
of commonly uncorrelated models (trees) functioning as a council will outrun
any of the its fundamental models exclusively.

11
3. Support Vector Machine

Support Vector Machine is a supervised machine learning algorithm which is doing well in
pattern recognition problems and it is used as a training algorithm for studying classification
and regression rules from data. SVM is most precisely used when the number of features and
number of instances are high. A binary classifier is built by the SVM algorithm. In an SVM
model, each data item is represented as points in an n-dimensional space where n is the number
of features where each feature is represented as the value of a coordinate in the n-dimensional
space. Here's how a support vector machine algorithm model works: (1) First, it finds lines or
boundaries that correctly classify the training dataset. (2) Then, from those lines or boundaries,
it picks the one that has the maximum distance from the closest data points.

12
Chapter 4
System Requirements and Specifications

1. Hardware Requirements:

• Processor/CPU: Core i5 or above.

• RAM: Minimum 4 GB.
• Storage: Minimum 1 GB.

2. Software Requirements:

• Programming Language: Python.

• OS Version: Windows 8.1 or above.
• IDE: Visual Studio Code.

3. Supporting Python Modules:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

13
Chapter 5
PROPOSED METHODOLOGY

Data Processing
Data Processing Data preparation

Feature Feature Selection

Projection

Feature
Scaling

Model selecton Prediction

Phase 1- Pre-Processing Data

The first phase we do is to collect the data that we are interested in collecting for pre-processing
and to apply classification and Regression methods. Data pre-processing is a data mining
technique that involves transforming raw data into an understandable format. Real world data
is often incomplete, inconsistent, and lacking certain to contain many errors. Data pre-
processing is a proven method of resolving such issues. Data pre-processing prepares raw data
for further processing. For pre-processing we have used standardization method to pre-process
the UCI dataset. This step is very important because the quality and quantity of data that you
gather will directly determine how good your predictive model can be. In this case we collect
the Breast Cancer samples which are Benign and Malignant. This will be our training data.

14
Phase 2- DATA PREPRATION

Data Preparation, where we load our data into a suitable place and prepare it for use in our
machine learning training. We’ll first put all our data together, and then randomize the
ordering.

Phase 3- FEATURE SELECTION

In machine learning and statistics, feature selection, also known as variable selection, attribute
selection, is the process of selection a subset of relevant features for use in model construction.

Data File and Feature Selection Breast Cancer Wisconsin (Diagnostic):- Data Set from Kaggle
repository and out of 31 parameters we have selected about 8-9 parameters. Our target
parameter is breast cancer diagnosis – malignant or benign. We have used Wrapper Method
for Feature Selection. The important features found by the study are: Concave points worst,
Area worst, Area se, Texture worst, Texture mean, Smoothness worst, Smoothness mean,
Radius mean, Symmetry mean.

We have used Wrapper Method for Feature Selection. The important features found by the
study are: 1. Concave points worst 2. Area worst 3. Area se 4. Texture worst 5. Texture mean
6. Smoothness worst 7. Smoothness mean 8. Radius mean 9. Symmetry means.

Attribute Information:
ID number 2) Diagnosis (M = malignant, B = benign) 3–32)

Phase 4- FEATURE PROJECTION

Feature projection is transformation of high-dimensional space data to a lower dimensional

space (with few attributes). Both linear and nonlinear reduction techniques can be used in
accordance with the type of relationships among the features in the dataset.

Phase 5- FEATURE SCALING

Most of the times, your dataset will contain features highly varying in magnitudes, units and
range. But since, most of the machine learning algorithms use Euclidian distance between two
data points in their computations. We need to bring all features to the same level of magnitudes.
This can be achieved by scaling.

Phase 6- MODAL SELECTION

15
Supervised learning is the method in which the machine is trained on the data which the input
and output are well labelled. The model can learn on the training data and can process the
future data to predict outcome. They are grouped to Regression and Classification techniques.
A regression problem is when the result is a real or continuous value, such as “salary” or
“weight”. A classification problem is when the result is a category like filtering emails spam”
or “not spam”. Unsupervised Learning: Unsupervised learning is giving away information to
the machine that is neither classified nor labelled and allowing the algorithm to analyse the
given information without providing any directions.

In unsupervised learning algorithm the machine is trained from the data which is not labelled
or classified making the algorithm to work without proper instructions. In our dataset we have
the outcome variable or Dependent variable i.e. Y having only two set of values, either M
(Malign) or B (Benign). So Classification algorithm of supervised learning is applied on it. We
have chosen three different types of classification algorithms in Machine Learning. We can
use a small linear model, which is a simple.

Phase 7- PREDICTION

Machine learning is using data to answer questions. So Prediction, or inference, is the step
where we get to answer some questions. This is the point of all this work, where the value of
machine learning is real.

16
Chapter 6

RESULTS:-

The work was implemented on i3 processor with 2.30GHz speed, 2 GB RAM, 320 GB external
storage and all experiments on the classifiers described in this paper were conducted using
libraries from Anaconda machine learning environment. In Experimental studies we have
partition 70-30% for training & testing. JUPYTER contains a collection of machine learning
algorithms for data pre-processing, classification, regression, clustering and association rules.
Machine learning techniques implemented in JUPYTER are applied to a variety of real-world
problems. The results of the data analysis are reported. To apply our classifiers and evaluate
them, we apply the 10-fold cross validation test which is a technique used in evaluating
predictive models that split the original set into a training sample to train the model, and a test
set to evaluate it. After applying the pre-processing and preparation methods, we try to analyse
the data visually and figure out the distribution of values in terms of effectiveness and
efficiency.

We evaluate the effectiveness of all classifiers in terms of time to build the model, correctly
classified instances, incorrectly classified instances and accuracy.

Algorithms Accuracy Sensitivity Specificity Precision F1-Score ROC

Logistic 0.96244131 TP / (TP + F TN / (FP +

Regression 45539906 N) TN)

Decision 1.0
Tree

Random 0.99295774 TP/(TP + F TN/(FP + TN) (TP + TN ) /

Forest 64788732 N) (TP + FP +
TN + FN)

TP- True positive

FN- false Negative
TN- True negative
FN- False Negative

17
In order to better measure the performance of classifiers, simulation error is also considered in
this study. To do so, we evaluate the effectiveness of our classifier in terms of: x Kappa statistic
(KS) as a chance-corrected measure of agreement between the classifications and the true
classes, x Mean Absolute Error (MAE) as how close forecasts or predictions are to the eventual
outcomes, x Root Mean Squared Error (RMSE), x Relative Absolute Error (RAE), x Root
Relative Squared Error (RRSE).

In this section, the results of the data analysis are reported. To apply our classifiers and evaluate
them, we apply the 10-fold cross validation test which is a technique used in evaluating
predictive models that split the original set into a training sample to train the model, and a test
set to evaluate it. After applying the preprocessing and preparation methods, we try to analyse
the data visually and figure out the distribution of values in terms of effectiveness and
efficiency.

EFFECTIVENESS

In this section, we evaluate the effectiveness of all classifiers in terms of time to build the
model, correctly classified instances, incorrectly classified instances and accuracy.

The ROC space is defined with true positives and false positives as the x and y coordinates,
respectively. ROC curve summarizes the performance across all possible thresholds. The
diagonal of the ROC graph can be interpreted as random guessing, and classification models
that fall below the diagonal are considered worse than random guessing.

18
Chapter 7
CONCLUSION AND FUTURE WORK:-

We can notice that SVM takes about 0.07 s to build its model unlike k-NN that takes just 0.01
s. It can be explained by the fact that k-NN is a lazy learner and does not do much during
training process unlike others classifiers that build the models. In other hand, the accuracy
obtained by SVM (97.13%) is better than the accuracy obtained by C4.5, Naïve Bayes and k-
NN that have an accuracy that varies between 95.12 % and 95.28 %. It can also be easily seen
that SVM has the highest value of correctly classified instances and the lower value of
incorrectly classified instances than the other classifiers.

After creating the predicted model, we can now analyse results obtained in evaluating
efficiency of our algorithms. SVM and C4.5 got the highest value (97 %) of TP for benign
class but k-NN correctly predicts 97% of instance that belong to malignant class. The FP rate
is lower when using SVM classifiers (0.03 for benign class and 0.02 for malignant class), and
then other algorithms follow: k-NN, C4.5 and NB. From these results, we can understand why
SVM has outperformed other classifiers.

FUTURE WORK:-
The analysis of the results signifies that the integration of multidimensional data along with
different classification, feature selection and dimensionality reduction techniques can provide
auspicious tools for inference in this domain. Further research in this field should be carried
out for the better performance of the classification techniques so that it can predict on more
variables. We are intending how to parametrize our classification techniques hence to achieve
high accuracy. We are looking into many datasets and how further Machine Learning
algorithms can be used to characterize Breast Cancer. We want to reduce the error rates with
maximum accuracy.

19
Chapter 8
BIBLIOGRAPHY:-

[1] Mert A, Kilic N, Akan A.,”Breast cancer classification by using support vector machines
with reduced dimension.”,ELMAR, ,IEEE,2011 Sep 14 ,pp.37-40.

[2]Gou J, Du L, Zhang Y, Xiong T ,”A new distance-weighted k-nearest neighbor classifier”,

J.ournal of Information of Computer Science,June 2012 ,vol.9(6),pp.1429-36.

[3]Octaviani TL, Rustam Z,”Random forest for breast cancer prediction”,AIP Conference
Proceedings, AIP Publishing Nov 4 ,2019 ,vol. 2168.

[4 ] UCHealth 2015, “How Accurate are mammograms?”, UCHealth viewed 16 November

2019,

[5]Karabatak M, Ince MC,” An expert system for detection of breast cancer based on
association rules and neural network”, Expert systems with Applications”,2009 March,vol
36(2),pp.3465-3469

20
21

Project Report On Breast Cancer
67% (3)
Project Report On Breast Cancer
47 pages
PFISTER
No ratings yet
PFISTER
1,238 pages
Restful Api
No ratings yet
Restful Api
69 pages
CHAPTER ONE To 3-1
No ratings yet
CHAPTER ONE To 3-1
51 pages
Brest Cancer Tumor Detection
No ratings yet
Brest Cancer Tumor Detection
40 pages
Chapter One To Three
No ratings yet
Chapter One To Three
39 pages
Cancer Prediction Using Machine Learning
No ratings yet
Cancer Prediction Using Machine Learning
5 pages
Disease Presiction
No ratings yet
Disease Presiction
32 pages
Project Report
No ratings yet
Project Report
27 pages
Report of Breast Cancer
No ratings yet
Report of Breast Cancer
80 pages
Assignment On EWU
100% (1)
Assignment On EWU
6 pages
Breast Cancerr Main
100% (1)
Breast Cancerr Main
47 pages
Advanced Analytics of Image Datasets in Human Health
From Everand
Advanced Analytics of Image Datasets in Human Health
Dr. Zemelak Goraga
No ratings yet
Breast Cancer Prediction Model Assignment
No ratings yet
Breast Cancer Prediction Model Assignment
37 pages
Breast Cancer Diagnostiic Using Machine Learning
No ratings yet
Breast Cancer Diagnostiic Using Machine Learning
72 pages
Leveraging Naive Bayes For Enhanced Survival Analysis in Breast Cancer
No ratings yet
Leveraging Naive Bayes For Enhanced Survival Analysis in Breast Cancer
10 pages
+ Chelton: V/Uhf Receiver TYPE 707-I SERIAL Nos. 1 - 100
No ratings yet
+ Chelton: V/Uhf Receiver TYPE 707-I SERIAL Nos. 1 - 100
51 pages
Breast Cancer Aiml Project
No ratings yet
Breast Cancer Aiml Project
25 pages
Breast Cancer Detection Using Machine Learning Algorithm PDF
No ratings yet
Breast Cancer Detection Using Machine Learning Algorithm PDF
29 pages
Breast Cancer Prediction A Comparative S-1
No ratings yet
Breast Cancer Prediction A Comparative S-1
14 pages
Echocardiography Technician - The Comprehensive Guide: Vanguard Professionals
From Everand
Echocardiography Technician - The Comprehensive Guide: Vanguard Professionals
ANTILLIA TAURED
No ratings yet
Research Proposal UK
No ratings yet
Research Proposal UK
13 pages
T212033 - Prachi Ratilal Patil
No ratings yet
T212033 - Prachi Ratilal Patil
28 pages
ARMA RFM 6T6R B20 360W Datasheet
No ratings yet
ARMA RFM 6T6R B20 360W Datasheet
4 pages
Breast Cancer Modeling and Prediction Combining
No ratings yet
Breast Cancer Modeling and Prediction Combining
6 pages
XII - IP - 2013-14 - Guwahati Region
No ratings yet
XII - IP - 2013-14 - Guwahati Region
137 pages
Breast Cancer Prediction Using Deep Learning Technique RNN and GRU
No ratings yet
Breast Cancer Prediction Using Deep Learning Technique RNN and GRU
5 pages
(IJCST-V11I3P3) :DR M Narendra, A Nandini, T Kamal Raj, V Sai Sowmya, CH Brahma Reddy
No ratings yet
(IJCST-V11I3P3) :DR M Narendra, A Nandini, T Kamal Raj, V Sai Sowmya, CH Brahma Reddy
3 pages
Predictive Modeling For Breast Cancer Classification in The Context of Bangladeshi Patients by Use of Machine Learning Approach With Explainable AI
No ratings yet
Predictive Modeling For Breast Cancer Classification in The Context of Bangladeshi Patients by Use of Machine Learning Approach With Explainable AI
17 pages
2023 Ieeee
No ratings yet
2023 Ieeee
6 pages
Breast Cancer Classification Model Using Principal Component Analysis and Deep Neural Network
No ratings yet
Breast Cancer Classification Model Using Principal Component Analysis and Deep Neural Network
13 pages
Machine Learning Algorithms For Breast Cancer Prediction and Diagnosis Machine Learning Algorithms For Breast Cancer Prediction and Diagnosis
No ratings yet
Machine Learning Algorithms For Breast Cancer Prediction and Diagnosis Machine Learning Algorithms For Breast Cancer Prediction and Diagnosis
6 pages
Utilizing Cutting-Edge Machine Learning Methods Fo - 241221 - 101813 Paper
No ratings yet
Utilizing Cutting-Edge Machine Learning Methods Fo - 241221 - 101813 Paper
7 pages
Breast Cancer Detection Using ETC
No ratings yet
Breast Cancer Detection Using ETC
13 pages
Analysis of Machine Learning Algorithms On Cancer Dataset
No ratings yet
Analysis of Machine Learning Algorithms On Cancer Dataset
10 pages
Python Worksheet 1: Task Completed
No ratings yet
Python Worksheet 1: Task Completed
4 pages
Machine Learning Models For Breast Cancer Classifi
No ratings yet
Machine Learning Models For Breast Cancer Classifi
13 pages
Comparative Analysis of Breast Cancer Detection Using Cutting-Edge Machine Learning Algorithms (MLAs)
No ratings yet
Comparative Analysis of Breast Cancer Detection Using Cutting-Edge Machine Learning Algorithms (MLAs)
15 pages
A Novel SVM Kernel Classifier Technique Using Supp
No ratings yet
A Novel SVM Kernel Classifier Technique Using Supp
19 pages
Zeroth Review Minor P
No ratings yet
Zeroth Review Minor P
11 pages
BCPUML Breast Cancer Prediction Using Machine Learning Approach-A Performance Analysis
No ratings yet
BCPUML Breast Cancer Prediction Using Machine Learning Approach-A Performance Analysis
10 pages
User Manual: 42PFK7109 42PFS7109 47PFK7109 47PFS7109 55PFK7109 55PFS7109
No ratings yet
User Manual: 42PFK7109 42PFS7109 47PFK7109 47PFS7109 55PFK7109 55PFS7109
101 pages
Project Report: Bangladesh University of Business & Technology (BUBT)
No ratings yet
Project Report: Bangladesh University of Business & Technology (BUBT)
18 pages
150,000+ Free Fonts - Download Now - FFonts
No ratings yet
150,000+ Free Fonts - Download Now - FFonts
5 pages
Classification of Breast Cancer Using A Novel Neural Network-Based Architecture
No ratings yet
Classification of Breast Cancer Using A Novel Neural Network-Based Architecture
6 pages
Fast Facts: Early Breast Cancer
From Everand
Fast Facts: Early Breast Cancer
Jayant S. Vaidya
No ratings yet
Analog Testing 02
0% (1)
Analog Testing 02
39 pages
Internship Report Final
No ratings yet
Internship Report Final
38 pages
1599311465islam2020 Article BreastCancerPredictionACompara
No ratings yet
1599311465islam2020 Article BreastCancerPredictionACompara
14 pages
Breast Cancer Prediction Using Gated Attentive Multimodal Deep Learning
No ratings yet
Breast Cancer Prediction Using Gated Attentive Multimodal Deep Learning
11 pages
Journal-Breast Cancer Prediction
No ratings yet
Journal-Breast Cancer Prediction
10 pages
WMN Chapter 1
No ratings yet
WMN Chapter 1
23 pages
Breast Cancer Prediction Using Machine Learning: Article
No ratings yet
Breast Cancer Prediction Using Machine Learning: Article
13 pages
Comparative Study of Classification Techniques On Breast Cancer FNA Biopsy Data
No ratings yet
Comparative Study of Classification Techniques On Breast Cancer FNA Biopsy Data
8 pages
Nocd Just Replace The Original Oald8.exe in The Folder of Dictionary Installation
No ratings yet
Nocd Just Replace The Original Oald8.exe in The Folder of Dictionary Installation
1 page
Research Paper Final
No ratings yet
Research Paper Final
11 pages
Yuuy
No ratings yet
Yuuy
5 pages
Studi Kasus Airway
No ratings yet
Studi Kasus Airway
58 pages
Research Paper Diagnosis
No ratings yet
Research Paper Diagnosis
10 pages
Research Paper-1
No ratings yet
Research Paper-1
6 pages
2019-05 Machine Learning Techniques For Detecting and Predicting Breast Cancer
No ratings yet
2019-05 Machine Learning Techniques For Detecting and Predicting Breast Cancer
5 pages
Yousefi Arzyabiamalkard12
No ratings yet
Yousefi Arzyabiamalkard12
5 pages
Introduction To The Motherboard Meet 1
No ratings yet
Introduction To The Motherboard Meet 1
8 pages
A Review Paper On Breast Cancer Detection Using Deep Learning
No ratings yet
A Review Paper On Breast Cancer Detection Using Deep Learning
10 pages
Malignant and Benign Breast Cancer Classification Using Machine Learning Algorithms
No ratings yet
Malignant and Benign Breast Cancer Classification Using Machine Learning Algorithms
5 pages
Nessus Report: 21/mar/2012:16:20:52 GMT
No ratings yet
Nessus Report: 21/mar/2012:16:20:52 GMT
74 pages
Breast Cancer Prediction Using Machine Learning
No ratings yet
Breast Cancer Prediction Using Machine Learning
8 pages
Implementation of a Remote and Automated Quality Control Programme for Radiography and Mammography Equipment
From Everand
Implementation of a Remote and Automated Quality Control Programme for Radiography and Mammography Equipment
IAEA
No ratings yet
How Can Machine Learning Be Used To Classify Breast Cancer?
No ratings yet
How Can Machine Learning Be Used To Classify Breast Cancer?
6 pages
Neural Network
No ratings yet
Neural Network
15 pages
Breast Cancer Classification Using Machine Learning
No ratings yet
Breast Cancer Classification Using Machine Learning
9 pages
01 Linear Data Structures
No ratings yet
01 Linear Data Structures
56 pages
Breast Cancer Diagnosis Using Deep Learning Algorithm: Naresh Khuriwal DR Nidhi Mishra
No ratings yet
Breast Cancer Diagnosis Using Deep Learning Algorithm: Naresh Khuriwal DR Nidhi Mishra
6 pages
Prediction of Breast Cancer Using Supervised Machine Learning Techniques
No ratings yet
Prediction of Breast Cancer Using Supervised Machine Learning Techniques
5 pages
Chronicle Design
No ratings yet
Chronicle Design
12 pages
Data Struture and Alghorithem
No ratings yet
Data Struture and Alghorithem
46 pages
Grdjev06i010003 PDF
No ratings yet
Grdjev06i010003 PDF
4 pages
Implementation of NTRIP and Management System in NIGNET Network
No ratings yet
Implementation of NTRIP and Management System in NIGNET Network
78 pages
Feature Selection For Breast Cancer Detection Using Machine Learning Algorithms
No ratings yet
Feature Selection For Breast Cancer Detection Using Machine Learning Algorithms
4 pages
Panasonic Phone System KXT308
No ratings yet
Panasonic Phone System KXT308
6 pages
Openboxes-Docs Documentation: Release 0.7.18
No ratings yet
Openboxes-Docs Documentation: Release 0.7.18
42 pages
Varietal Discrimination of Guava Psidium Guajava Leaves Using Multi Features Analysis
No ratings yet
Varietal Discrimination of Guava Psidium Guajava Leaves Using Multi Features Analysis
19 pages
SPM Presentation
No ratings yet
SPM Presentation
12 pages
122 Penn ST LRev 613
No ratings yet
122 Penn ST LRev 613
33 pages
Assignment02 With Instructions
No ratings yet
Assignment02 With Instructions
7 pages
Day 1 of AWS Journey
No ratings yet
Day 1 of AWS Journey
3 pages
Chapter I
No ratings yet
Chapter I
18 pages
Endress-Hauser Liquiline CM442 EN
No ratings yet
Endress-Hauser Liquiline CM442 EN
11 pages
SWOT Analysis (Paragraph Form
No ratings yet
SWOT Analysis (Paragraph Form
1 page
F Module 5 Questions (P 3)
No ratings yet
F Module 5 Questions (P 3)
3 pages

Mini Project Report (1) .Final

Uploaded by

Mini Project Report (1) .Final

Uploaded by

A PROJECT REPORT

Akash Yadav (2001921530005)

Under the Supervision of

G.L. BAJAJ INSTITUTE OF TECHNOLOGY AND MANAGEMENT,

Name: Akash yadav Name: Akhil kumar

Roll No.: 2001921530005 Roll No.: 2001921530006

Name: Anurag Yadav

Roll No.: 2001921530015

Akash yadav (2001921530005), Akhil kumar (2001921530006) and Anurag

yadav (2001921530015) is an original work carried out by them in Department

of Computer Science & Engineering, G.L Bajaj Institute of Technology &

Management, Greater Noida under my guidance. The matter embodied in this

to the best of my knowledge and belief.

Mrs. Anju Chandna Dr. Sansar Singh Chauhan

Signature of the Signature of

simulation environment and conducted in JUPYTER platform. Aim of research categorises

breast cancer research can be categorises on basis of other parameters.

Keywords — Breast Cancer, machine learning, feature selection, classification, prediction,

Fig. 1.1:Mine Image

Radiology professionals frequently struggle with mammography mass lesion labelling,

Graph for Logistic Regression

Random forest, as the name implies, constitutes of many separate decision

Fig. 3.5: Random Forest Classification

The principal idea propelling random forest is a straightforward however an

• Processor/CPU: Core i5 or above.

• Programming Language: Python.

3. Supporting Python Modules:

Feature Feature Selection

Model selecton Prediction

Phase 1- Pre-Processing Data

Phase 3- FEATURE SELECTION

Phase 4- FEATURE PROJECTION

Feature projection is transformation of high-dimensional space data to a lower dimensional

Phase 5- FEATURE SCALING

Phase 6- MODAL SELECTION

Algorithms Accuracy Sensitivity Specificity Precision F1-Score ROC

Logistic 0.96244131 TP / (TP + F TN / (FP +

Random 0.99295774 TP/(TP + F TN/(FP + TN) (TP + TN ) /

TP- True positive

[2]Gou J, Du L, Zhang Y, Xiong T ,”A new distance-weighted k-nearest neighbor classifier”,

[4 ] UCHealth 2015, “How Accurate are mammograms?”, UCHealth viewed 16 November

You might also like