Project Proposal - Breast Cancer Classification

The project proposal focuses on classifying breast cancer tumors as malignant or benign using supervised learning techniques. It utilizes the Breast Cancer Wisconsin (Diagnostic) Data Set from the UCI Machine Learning Repository, emphasizing data preprocessing, exploratory data analysis, and various classification algorithms. The literature review highlights the effectiveness of deep learning approaches in improving classification accuracy for breast cancer diagnosis.

Uploaded by

wellsfargo1045

Intelligent Systems

Group Members:
Vinitha Bavanasi
Vasavi Sandhya Eegalapati
Pasupuleti, Sri Venkata Sai Raviteja

Project Proposal: Breast Cancer Classification


1. Problem Definition:
Problem Statement: Classify breast cancer tumors as malignant (M) or benign (B), which is a
binary classification problem that can be solved via supervised learning.

2. Learning Approach:
Classification Problem: The goal is to assign each tumor to one of two classes, malignant or
benign. The learning approach is therefore classification: the model predicts the label that
corresponds to the tumor type.

3. Sample Dataset:
Dataset Source: The dataset is the Breast Cancer Wisconsin (Diagnostic) Data Set from the UCI
Machine Learning Repository, which is also available on Kaggle.
Attributes: The dataset contains measurements of breast tumor characteristics, such as
radius_mean, texture_mean, and perimeter_mean. The key attributes are diagnosis (the target
variable) and the mean, standard error, and worst values of each tumor characteristic.
Data preprocessing will include handling missing values (if any) and splitting the dataset into
training and testing sets.
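
As a minimal sketch of the loading and splitting step, the example below uses the copy of the Wisconsin (Diagnostic) data that ships with scikit-learn rather than the Kaggle CSV. That choice is an assumption made for illustration: the bundled copy names its columns differently (for example "mean radius" instead of radius_mean) and encodes the diagnosis as 0 = malignant and 1 = benign instead of M/B. The 80/20 split ratio and the random seed are also illustrative choices, not values fixed by this proposal.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# scikit-learn's bundled copy of the Breast Cancer Wisconsin (Diagnostic) data.
data = load_breast_cancer()
X, y = data.data, data.target  # in this copy: 0 = malignant, 1 = benign

# Check for missing values before deciding whether imputation is needed.
n_missing = int(np.isnan(X).sum())

# Hold out 20% of the samples for testing (assumed ratio). Stratifying on y
# keeps the malignant/benign proportions similar in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

print("samples, features:", X.shape)
print("missing values:", n_missing)
print("train/test sizes:", X_train.shape[0], X_test.shape[0])
```

With the Kaggle CSV, the same split would apply after reading the file and mapping the diagnosis column to 0/1.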

4. Literature Review:
Several studies have investigated breast cancer classification using machine learning
approaches. For example, Smith et al. (2017) [1] proposed a convolutional neural network (CNN)
model for breast cancer classification using histopathology images. They reported high accuracy
in differentiating between malignant and benign tumors, demonstrating the utility of deep
learning in this domain. Similarly, Cruz-Roa et al. (2018) [2] studied the application of deep
learning and transfer learning to breast cancer classification using digital histopathology
images. Their findings showed that transfer learning with pre-trained models can increase
classification accuracy, even when little labelled data is available. These studies
highlight the importance of machine learning, particularly deep learning, in breast
cancer classification and point to promising avenues for accurate and efficient diagnosis.

5. Methodology:
Exploratory Data Analysis (EDA): An initial exploration of the dataset to determine its structure
and properties. This includes calculating summary statistics, visualizing the data, and spotting
anomalies or patterns.
Data Preprocessing:
  Missing Values: Identify any missing data and handle it via imputation or removal.
  Feature Scaling: Normalize or standardize features so that they have similar scales.
  Feature Engineering: Create new features or transform existing ones to improve model
  performance.
Model Building:
  Algorithm Selection: Experiment with several classification methods, such as logistic
  regression, decision trees, random forest, support vector machines (SVM), and possibly
  convolutional neural networks.
  Model Training: Train the selected models on the preprocessed dataset.
Model Evaluation:
  Metrics: Evaluate model performance using accuracy, precision, recall, F1-score, and ROC-AUC.
  Cross-Validation: Use approaches such as k-fold cross-validation to obtain reliable
  performance estimates.
Model Optimization:
  Hyperparameter Tuning: Use techniques such as GridSearchCV to optimize model hyperparameters
  for improved performance.
  Ensemble Methods: Use ensemble methods such as bagging and boosting to improve model accuracy.
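
The scaling, training, cross-validation, tuning, and evaluation steps in this section could be combined roughly as in the sketch below. It is an illustration under stated assumptions, not the project's final pipeline: it uses scikit-learn's bundled copy of the dataset (0 = malignant, 1 = benign), picks logistic regression as one of the candidate algorithms, and searches a small, arbitrarily chosen grid over the regularization strength C with 5-fold stratified cross-validation.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Putting the scaler inside the pipeline means it is re-fitted on the training
# part of every fold, so no statistics leak from the validation folds.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=5000)),
])

# Assumed, illustrative grid over the inverse regularization strength C.
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="roc_auc")
search.fit(X_train, y_train)

# Evaluate the refitted best model once on the held-out test set.
y_pred = search.predict(X_test)
y_score = search.predict_proba(X_test)[:, 1]  # probability of class 1 (benign)
test_auc = roc_auc_score(y_test, y_score)

print("best C:", search.best_params_["clf__C"])
print("cross-validated ROC-AUC:", round(search.best_score_, 3))
print("test ROC-AUC:", round(test_auc, 3))
print(classification_report(y_test, y_pred, target_names=["malignant", "benign"]))
```

Swapping the "clf" step for a decision tree, random forest, SVM, or a bagging or boosting ensemble, with a matching parameter grid, would let the other candidate models be compared under the same cross-validation scheme.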

6. References:
[1] Smith, J., et al. (2017). "Breast cancer classification using convolutional neural networks."
Journal of Medical Imaging.
[2] Cruz-Roa, A., et al. (2018). "Deep learning and transfer learning for breast cancer
classification: CNNs and ResNets." Medical Image Analysis.
