
CHAPTER 4

DEVELOPMENT PROCESS
4.1. REQUIREMENT ANALYSIS
Requirements are features of a system, or descriptions of what the system must be capable of doing in order to fulfil its purpose. Requirement analysis provides the mechanism for understanding what the customer wants, analyzing the needs, assessing feasibility, negotiating a reasonable solution, specifying the solution unambiguously, validating the specification, and managing the requirements as they are translated into an operational system.

4.1.1. PYTHON:

Python is a dynamic, high-level, free, open-source, interpreted programming language. It supports object-oriented programming as well as procedural programming. In Python, we do not need to declare the type of a variable, because it is a dynamically typed language.

For example, after x = 10, the name x can later hold anything, such as a string, an int, etc.
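
A short illustrative snippet of this behaviour (the variable name x is just an example):

# Dynamic typing: the same name can be rebound to values of different types.
x = 10            # x currently refers to an int
print(type(x))    # <class 'int'>

x = "hello"       # x now refers to a str; no type declaration is needed
print(type(x))    # <class 'str'>
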
Python is an interpreted, object-oriented programming language, similar to Perl, that has gained popularity because of its clear syntax and readability. Python is considered relatively easy to learn and portable, meaning its statements can be interpreted on a number of operating systems, including UNIX-based systems, Mac OS, MS-DOS, OS/2, and various versions of Microsoft Windows. Python was created by Guido van Rossum, a former resident of the Netherlands, whose favourite comedy group at the time was Monty Python's Flying Circus. The source code is freely available and open for modification and reuse. Python has a significant number of users.

Features in Python

There are many features in Python, some of which are discussed below:
• Easy to code
• Free and open source
• Object-oriented language
• GUI programming support
• High-level language
• Extensible
• Portable language
• Integrated language
• Interpreted language

4.2. ANACONDA

The Anaconda distribution comes with over 250 packages automatically installed, and over 7,500 additional open-source packages can be installed from PyPI, as well as with the conda package and virtual environment manager. It also includes a GUI, Anaconda Navigator, as a graphical alternative to the command-line interface (CLI).

The big difference between conda and the pip package manager is in how package
dependencies are managed, which is a significant challenge for Python data science and the
reason conda exists.

When pip installs a package, it automatically installs any dependent Python packages without checking whether these conflict with previously installed packages. It will install a package and any of its dependencies regardless of the state of the existing installation. Because of this, a user with a working installation of, for example, Google TensorFlow can find that it stops working after using pip to install a different package that requires a different version of the dependent NumPy library than the one used by TensorFlow. In some cases, the package may appear to work but produce subtly different results.

In contrast, conda analyses the current environment, including everything currently installed, and, together with any version limitations specified (e.g. the user may wish to have TensorFlow version 2.0 or higher), works out how to install a compatible set of dependencies, and shows a warning if this cannot be done.

Open-source packages can be individually installed from the Anaconda repository, Anaconda Cloud (anaconda.org), or the user's own private repository or mirror, using the conda install command. Anaconda, Inc. compiles and builds the packages available in the Anaconda repository itself, and provides binaries for Windows 32/64-bit, Linux 64-bit and macOS 64-bit. Anything available on PyPI may be installed into a conda environment using pip, and conda will keep track of what it has installed itself and what pip has installed.

Custom packages can be made using the conda build command, and can be shared with others by uploading them to Anaconda Cloud, PyPI or other repositories.

The default installation of Anaconda2 includes Python 2.7 and Anaconda3 includes
Python 3.7. However, it is possible to create new environments that include any version of
Python packaged with conda.

4.2.1. ANACONDA NAVIGATOR

Anaconda Navigator is a desktop graphical user interface (GUI) included in the Anaconda distribution that allows users to launch applications and manage conda packages, environments and channels without using command-line commands. Navigator can search for packages on Anaconda Cloud or in a local Anaconda repository, install them into an environment, run them and update them. It is available for Windows, macOS and Linux.

The following applications are available by default in Navigator:

• JupyterLab
• Jupyter Notebook
• QtConsole
• Spyder
• Glue
• Orange
• RStudio
• Visual Studio Code

4.2.2. JUPYTER NOTEBOOK

Jupyter Notebook (formerly IPython Notebook) is a web-based interactive computational environment for creating Jupyter notebook documents. The term "notebook" can colloquially refer to several different entities, mainly the Jupyter web application, the Jupyter Python web server, or the Jupyter document format, depending on context. A Jupyter notebook document is a JSON document, following a versioned schema, containing an ordered list of input/output cells which can contain code, text (using Markdown), mathematics, plots and rich media, and usually has the ".ipynb" extension.

Jupyter Notebook can connect to many kernels to allow programming in different languages. By default, Jupyter Notebook ships with the IPython kernel. As of the 2.3 release (October 2014), there were 49 Jupyter-compatible kernels for many programming languages, including Python, R, Julia and Haskell.

The notebook interface was added to IPython in the 0.12 release (December 2011) and renamed to Jupyter Notebook in 2015 (IPython 4.0 – Jupyter 1.0). Jupyter Notebook is similar to the notebook interfaces of other programs such as Maple, Mathematica, and SageMath, a computational interface style that originated with Mathematica in the 1980s. According to The Atlantic, interest in Jupyter overtook the popularity of the Mathematica notebook interface in early 2018.

4.3. RESOURCE REQUIREMENTS

SOFTWARE REQUIREMENTS:

Operating System     : Windows 7 or later
Simulation Tool      : Anaconda (Jupyter Notebook)
Documentation        : MS Office

HARDWARE REQUIREMENTS:

CPU type             : Intel Pentium
RAM size             : 4 GB
Hard disk capacity   : 80 GB
Keyboard type        : Internet keyboard
Monitor type         : 15-inch colour monitor
CD drive type        : 52x max
4.4. SYSTEM ARCHITECTURE

Figure: System architecture. The collected dataset is pre-processed, segmented and classified; training and testing are performed with a CNN using dense layers in deep learning, leading to the prediction of a brain tumor.

4.4.1. USE CASE DIAGRAM

Figure: Use case diagram. The admin actor interacts with the Image Processing, Pre-Processing, Segmentation and Classification use cases.

4.5. PROPOSED SYSTEM

• Our proposed system uses the dense layer of a Convolutional Neural Network (CNN) algorithm, a deep learning technique, to train on the dataset. A minimal code sketch is given after this list.
• In the dense connectivity pattern, each layer obtains additional inputs from all preceding layers and passes on its own feature maps to all subsequent layers.
• The dense layer therefore uses features of all complexity levels and tends to give smoother decision boundaries.
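
The following is a minimal sketch of such a network in TensorFlow/Keras. It is not the project's exact configuration: the 128x128 grayscale input size, the layer widths and the binary tumor/no-tumor output are illustrative assumptions, and "dense layers" are read here as fully-connected layers placed after the convolutional feature extractor (the densely connected pattern described above, where feature maps are concatenated across layers, is the DenseNet variant of the same idea).

# A minimal, illustrative CNN whose head uses fully-connected (Dense) layers.
# Input size, layer widths and the binary output are assumptions, not the
# project's actual configuration.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 1)),            # assumed grayscale MRI slice size
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),         # dense layer combining all extracted features
    layers.Dense(1, activation="sigmoid"),        # tumor probability
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()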

4.5.1. ADVANTAGES

• Easy detection of a brain tumor with the proposed technique.
• Reduced time consumption.
• The most accurate model helps in better and earlier treatment.
• Quick detection by the best model speeds up treatment, which can be life-saving.
SYSTEM MODULES:
• Module 1: Image Processing
• Module 2: Pre-Processing
• Module 3: Segmentation
• Module 4: Classification

Module 1: Dataset Collection and Pre-processing

A dataset (or data set) is a collection of data, usually presented in tabular form. Each column represents a particular variable, and each row corresponds to a given member of the dataset in question. It lists values for each of the variables, such as the height and weight of an object. Each value is known as a datum.

We have chosen to use a publicly available healthcare dataset which contains a relatively small number of inputs and cases. The data is arranged in such a way that those trained in medical disciplines can easily draw parallels between familiar statistical techniques and novel ML techniques. Additionally, the compact dataset enables short computational times on almost all modern computers.

The sklearn.preprocessing package provides several common utility functions and transformer classes to change raw feature vectors into a representation that is more suitable for the downstream estimators.

In general, learning algorithms benefit from standardization of the data set. If some outliers are present in the set, robust scalers or transformers are more appropriate. The behaviour of the different scalers, transformers and normalizers on a dataset containing marginal outliers is highlighted in the scikit-learn example "Compare the effect of different scalers on data with outliers".

Standardization, or Mean Removal and Variance Scaling

Standardization of datasets is a common requirement for many machine learning estimators implemented in scikit-learn; they might behave badly if the individual features do not more or less look like standard normally distributed data: Gaussian with zero mean and unit variance.

In practice we often ignore the shape of the distribution and just transform the data to center it by removing the mean value of each feature, then scale it by dividing non-constant features by their standard deviation.

For instance, many elements used in the objective function of a learning algorithm (such as the RBF kernel of support vector machines or the l1 and l2 regularizers of linear models) assume that all features are centered around zero and have variance of the same order. If a feature has a variance that is orders of magnitude larger than the others, it might dominate the objective function and make the estimator unable to learn from the other features correctly.
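
A minimal sketch of this standardization step with scikit-learn's StandardScaler, applied to a small made-up feature matrix rather than the project's dataset:

# Standardization: remove the mean and scale each feature to unit variance.
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

scaler = StandardScaler()
X_std = scaler.fit_transform(X)

print(X_std.mean(axis=0))   # approximately [0. 0.]
print(X_std.std(axis=0))    # approximately [1. 1.]
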
Scaling Features to a Range

An alternative standardization is scaling features to lie between a given minimum and maximum value, often between zero and one, or so that the maximum absolute value of each feature is scaled to unit size. This can be achieved using MinMaxScaler or MaxAbsScaler, respectively.

The motivations for using this scaling include robustness to very small standard deviations of features and the preservation of zero entries in sparse data.

MaxAbsScaler works in a very similar fashion, but scales the data so that the training set lies within the range [-1, 1], by dividing through the largest maximum absolute value of each feature. It is meant for data that is already centered at zero, or for sparse data.
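
A minimal sketch of both range scalers on a small made-up matrix (not the project's data):

# Range scaling: MinMaxScaler maps each feature to [0, 1];
# MaxAbsScaler maps each feature to [-1, 1] and preserves zero entries.
import numpy as np
from sklearn.preprocessing import MinMaxScaler, MaxAbsScaler

X = np.array([[1.0, -2.0],
              [2.0,  0.0],
              [4.0,  2.0]])

print(MinMaxScaler().fit_transform(X))
print(MaxAbsScaler().fit_transform(X))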

Normalization
Normalization is the process of scaling individual samples to have unit norm. This
process can be useful if you plan to use a quadratic form such as the dot-product or any other
kernel to quantify the similarity of any pair of samples.
This assumption is the basis of the Vector Space Model often used in text classification and clustering contexts.
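
A minimal sketch of per-sample L2 normalization with scikit-learn's Normalizer, again on made-up data:

# Normalization: rescale each sample (row) to unit Euclidean norm.
import numpy as np
from sklearn.preprocessing import Normalizer

X = np.array([[3.0, 4.0],
              [1.0, 0.0]])

X_norm = Normalizer(norm="l2").fit_transform(X)
print(X_norm)   # first row becomes [0.6, 0.8], which has unit length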

Module 3: Segmentation
Image segmentation is the process of dividing an image into non-overlapping, meaningful regions. The main objective of image segmentation is to divide an image into many sections for further analysis, so that we obtain only the necessary segment of information. We use various image segmentation algorithms to split and group certain sets of pixels together within the image. By doing so, we are actually assigning labels to pixels, and pixels with the same label fall under a category in which they have something in common.

Using these labels, we can specify boundaries, draw lines, and separate the most important objects in an image from the less important ones. For example, given an image of a room, we can extract the major components, e.g. chairs and tables, and colour all the chairs uniformly; when individual instances are detected instead, each chair receives a different colour.

This is how different methods of image segmentation work, with varying degrees of complexity, and yield different levels of output.
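
As one simple illustration of the idea (the report does not state which segmentation algorithm is used), the following sketch applies Otsu thresholding with scikit-image to label bright regions of a grayscale slice; the random array merely stands in for a real MRI slice:

# Threshold-based segmentation of a grayscale slice (illustrative only).
import numpy as np
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops

def segment_slice(img):
    """Return a labelled mask of bright regions in a 2-D grayscale image."""
    thresh = threshold_otsu(img)   # global Otsu threshold
    mask = img > thresh            # binary foreground mask
    return label(mask)             # assign an integer label to each connected region

slice_ = np.random.rand(128, 128)  # stand-in for a real MRI slice
labels = segment_slice(slice_)
for region in regionprops(labels):
    print(region.label, region.area)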

Module 4: Classification
Image classification is to identify and portray, as a unique gray level (or color), the
features occurring in an image in terms of the object or type of land cover these features
actually represent on the ground. Image classification is perhaps the most important part of
digital image analysis.
K-Nearest Neighbours
Neighbours-based classification is a type of lazy learning, as it does not attempt to construct a general internal model but simply stores instances of the training data. Classification is computed from a simple majority vote of the k nearest neighbours of each point.
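
A minimal sketch of this voting scheme with scikit-learn's KNeighborsClassifier, using made-up two-dimensional feature vectors and hypothetical labels (0 = no tumor, 1 = tumor) rather than real image features:

# k-nearest-neighbours classification: store the training data and classify
# a new point by a majority vote among its k nearest training points.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X_train = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
y_train = np.array([0, 0, 1, 1])   # hypothetical labels

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)          # "lazy" learning: the data is simply stored
print(knn.predict([[0.85, 0.75]])) # -> [1]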

Support Vector Machine

A support vector machine represents the training data as points in space, separated into categories by a clear gap that is as wide as possible. New examples are then mapped into the same space and predicted to belong to a category based on which side of the gap they fall.
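
A minimal sketch with scikit-learn's SVC on the same kind of made-up feature vectors; the RBF kernel and regularization value are illustrative choices, not the project's settings:

# Support vector machine: fit a maximum-margin separator between the classes.
import numpy as np
from sklearn.svm import SVC

X_train = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
y_train = np.array([0, 0, 1, 1])   # hypothetical labels

svm = SVC(kernel="rbf", C=1.0)
svm.fit(X_train, y_train)
print(svm.predict([[0.15, 0.15]])) # -> [0]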
