ML Full For Print New 1
Aim:
To implement linear regression on a real dataset and experiment with different features in building
the model.
Algorithm:
Program:
import numpy as np
import pandas as pd
# Display first few rows of the dataset
print(df.head())
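The idea of experimenting with different features can be sketched as follows. This is a minimal illustration, not the program used in the record: the data is synthetic, and the column names (rooms, income, price) and the helper fit_linear_regression are hypothetical. It fits an ordinary least squares model on several feature subsets and compares the training RMSE of each.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Fit ordinary least squares with an intercept; return (weights, rmse)."""
    X_b = np.column_stack([np.ones(len(X)), X])        # prepend a bias column
    weights, *_ = np.linalg.lstsq(X_b, y, rcond=None)  # least squares solution
    residuals = y - X_b @ weights
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    return weights, rmse

# Synthetic stand-in for a housing table (hypothetical columns, not real data).
rng = np.random.default_rng(0)
n = 500
rooms = rng.uniform(2, 8, n)
income = rng.uniform(1, 10, n)
price = 20.0 * income + 5.0 * rooms + rng.normal(0, 2.0, n)

# Each entry is one experiment: a different choice of input features.
feature_sets = {
    "rooms": np.column_stack([rooms]),
    "income": np.column_stack([income]),
    "rooms+income": np.column_stack([rooms, income]),
}
results = {name: fit_linear_regression(X, price)[1] for name, X in feature_sets.items()}
for name, rmse in results.items():
    print(f"{name:>14}: RMSE = {rmse:.2f}")
```

A lower RMSE indicates that the chosen features explain more of the target. On real data the comparison should be made on a held-out split rather than on the training data.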
Sample input output:
Result:
Thus, linear regression on a real dataset, experimenting with different features in building the
model, was successfully implemented.
Ex.no:2 Implement a binary classification model. That is, a model that answers a binary
question such as "Are houses in this neighbourhood above a certain
price?" (use data from exercise 1). Modify the classification threshold and
determine how that modification influences the model. Experiment with
different classification metrics to determine your model's effectiveness.
Date:
Aim:
To implement a binary classification model, modify the classification threshold, and determine
how that modification influences the model, using different classification metrics to measure the model's
effectiveness.
Algorithm:
Program:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, classification_report, roc_curve, roc_auc_score
# Initialize variables
neighbors = np.arange(1, 9)
train_accuracy = np.empty(len(neighbors))
test_accuracy = np.empty(len(neighbors))
plt.figure()
plt.plot([0, 1], [0, 1], 'k--')
plt.plot(fpr, tpr, label='k-NN (n_neighbors=7)')
plt.xlabel('False Positive Rate (FPR)')
plt.ylabel('True Positive Rate (TPR)')
plt.title('ROC Curve')
plt.legend()
plt.show()
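The effect of the classification threshold on the metrics can be sketched in plain Python. This is an illustration under stated assumptions: the labels and predicted probabilities below are made up, and confusion_counts and metrics_at are hypothetical helper names, not part of the program above.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, TN, FN when predicting 1 for score >= threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def metrics_at(y_true, scores, threshold):
    """Accuracy, precision and recall at one threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, scores, threshold)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Made-up labels and predicted probabilities for ten samples.
y_true = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
scores = [0.05, 0.20, 0.35, 0.45, 0.40, 0.60, 0.55, 0.70, 0.85, 0.95]

table = {t: metrics_at(y_true, scores, t) for t in (0.3, 0.5, 0.7)}
for t, m in table.items():
    print(t, m)
```

Lowering the threshold raises recall at the cost of precision, and raising it does the opposite, which is the trade-off the ROC curve summarises.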
Sample input output:
Result:
Thus, a binary classification model was successfully implemented, and the influence of modifying the
classification threshold on the model was evaluated with different classification metrics to determine the
model's effectiveness.
EX.NO:3 CLASSIFICATION WITH NEAREST NEIGHBORS. IN THIS QUESTION
YOU WILL USE SCIKIT-LEARN’S KNN CLASSIFIER TO CLASSIFY
REAL VS FAKE NEWS HEADLINES. THE AIM OF THIS QUESTION IS
FOR YOU TO READ THE SCIKIT-LEARN API AND GET
COMFORTABLE WITH TRAINING/VALIDATION SPLITS. USE
CALIFORNIA HOUSING DATASET.
DATE:
AIM:
To implement a program for Classification using Nearest Neighbors using Scikit-learn’s KNN
classifier and evaluate the model’s performance with training/validation splits and metrics.
ALGORITHM:
PROGRAM:
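# Import Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix
# Load Dataset
wine = load_wine()
X = wine.data
y = wine.target
# Split Data (split ratio and seed are assumed; the original values are not shown)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# K values to evaluate (range is assumed; the original values are not shown)
neighbors = range(1, 11)
train_accuracy = []
test_accuracy = []
for k in neighbors:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    train_accuracy.append(knn.score(X_train, y_train))
    test_accuracy.append(knn.score(X_test, y_test))
# Plot K vs Accuracy
plt.figure(figsize=(8,5))
plt.plot(neighbors, train_accuracy, label="Training Accuracy")
plt.plot(neighbors, test_accuracy, label="Testing Accuracy")
plt.xlabel("Number of Neighbors (K)")
plt.ylabel("Accuracy")
plt.title("KNN Accuracy for Different K Values")
plt.legend()
plt.grid(True)
plt.show()
# Train the final model with the chosen K
best_k = 5
knn = KNeighborsClassifier(n_neighbors=best_k)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
# Evaluation
cm = confusion_matrix(y_test, y_pred)
print(cm)
plt.figure(figsize=(6, 4))
plt.imshow(cm, cmap="Blues")  # heatmap call is assumed; it is not shown in the original
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()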
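Instead of fixing the value of K by hand, it can be picked from the accuracy list collected in the loop. The sketch below assumes made-up accuracy values and a hypothetical helper select_best_k. Ties are broken in favour of the larger K, which is one possible convention rather than a rule.

```python
def select_best_k(neighbors, validation_accuracy):
    """Return (k, accuracy) with the highest accuracy; ties go to the larger k."""
    best_k, best_acc = None, float("-inf")
    for k, acc in zip(neighbors, validation_accuracy):
        if acc > best_acc or (acc == best_acc and k > best_k):
            best_k, best_acc = k, acc
    return best_k, best_acc

# Made-up validation accuracies for five candidate K values.
candidate_k = [1, 3, 5, 7, 9]
validation_accuracy = [0.72, 0.78, 0.81, 0.81, 0.76]

chosen_k, chosen_acc = select_best_k(candidate_k, validation_accuracy)
print(f"best K = {chosen_k} (validation accuracy {chosen_acc:.2f})")
```

To keep the final test score honest, K is best chosen on a separate validation split, with the test split used only once at the end.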
SAMPLE INPUT OUTPUT:
RESULT:
Thus, a program for classification using nearest neighbors with Scikit-learn’s KNN classifier, evaluating
the model’s performance with training/validation splits and metrics, was successfully
executed.
EX NO: 4 ANALYZE DELTAS BETWEEN TRAINING SET
AND VALIDATION SET RESULTS TO DETERMINE WHETHER THE MODEL
IS OVERFITTING
DATE:
AIM:
To analyze the difference in performance between training and validation sets to determine if the model is
overfitting, and to visualize the results to detect and address this issue.
ALGORITHM:
PROGRAM:
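import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
# Synthetic binary classification dataset
X, y = make_classification(n_samples=10000, n_features=20,
                           n_informative=5, n_redundant=15, random_state=1)
# Train/validation split (ratio and seed are assumed; not shown in the original)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
train_scores = []
test_scores = []
# Depth range is assumed; the decision tree is inferred from the 'Tree Depth' axis label
depths = range(1, 21)
for depth in depths:
    model = DecisionTreeClassifier(max_depth=depth)
    model.fit(X_train, y_train)
    train_yhat = model.predict(X_train)
    test_yhat = model.predict(X_test)
    train_acc = accuracy_score(y_train, train_yhat)
    test_acc = accuracy_score(y_test, test_yhat)
    train_scores.append(train_acc)
    test_scores.append(test_acc)
# Plot training vs validation accuracy against tree depth
plt.figure(figsize=(10, 6))
plt.plot(depths, train_scores, '-o', label='Train')
plt.plot(depths, test_scores, '-o', label='Test')
plt.xlabel('Tree Depth')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True)
plt.show()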
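Besides reading the plot, the delta between training and validation accuracy can be checked numerically. The sketch below uses made-up scores for tree depths 1 to 8, and the 0.05 tolerance is an arbitrary assumption. It reports the first depth at which the gap exceeds the tolerance and the depth with the best validation accuracy.

```python
def generalization_gaps(train_scores, test_scores):
    """Per-setting delta: training accuracy minus validation accuracy."""
    return [train - test for train, test in zip(train_scores, test_scores)]

def first_overfit_index(gaps, tolerance):
    """Index of the first gap larger than the tolerance, or None."""
    for index, gap in enumerate(gaps):
        if gap > tolerance:
            return index
    return None

# Made-up accuracies for tree depths 1..8.
depths = list(range(1, 9))
train_scores = [0.76, 0.81, 0.86, 0.90, 0.93, 0.96, 0.98, 1.00]
test_scores = [0.75, 0.80, 0.85, 0.88, 0.89, 0.88, 0.87, 0.86]

gaps = generalization_gaps(train_scores, test_scores)
index = first_overfit_index(gaps, tolerance=0.05)
overfit_depth = depths[index] if index is not None else None
best_depth = depths[test_scores.index(max(test_scores))]

print("gaps:", [round(g, 2) for g in gaps])
print("gap first exceeds tolerance at depth:", overfit_depth)
print("best validation accuracy at depth:", best_depth)
```

A gap that keeps widening while the validation accuracy stalls or falls is the usual sign of overfitting.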
SAMPLE INPUT OUTPUT:
RESULT:
Thus, the difference in performance between the training and validation sets was analysed to determine
whether the model is overfitting, and the results were visualized to detect and address this issue
successfully.
EX NO: 5 Implement the K-Means algorithm using the given
dataset.
DATE:
Aim:
To implement the K-Means Clustering Algorithm on the given biological dataset and group the data points
(species) based on their codon usage frequencies.
1. Start
2. Import the required libraries (pandas, sklearn, matplotlib, etc.).
3. Load the dataset containing codon usage frequencies and other features.
4. Select the relevant numerical features (e.g., UUU, UUC, UUA, UUG).
5. Normalize the features using StandardScaler for better clustering results.
6. Choose the number of clusters (e.g., k = 3).
7. Apply the K-Means clustering algorithm:
■ Initialize centroids randomly.
■ Assign each point to the nearest centroid.
■ Update centroids as the mean of assigned points.
■ Repeat steps until centroids do not change or max iterations are reached.
8. Assign the cluster labels to the dataset.
9. Display the final clustered data with the cluster number.
10. Optionally, visualize clusters using scatter plots.
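Steps 5 to 8 above can be sketched directly in NumPy. This is a minimal illustration, not the record's program (which uses scikit-learn): the function names, the synthetic two-column data and the seeds are all assumptions made for the example.

```python
import numpy as np

def assign(X, centroids):
    """Label each point with the index of its nearest centroid."""
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

def kmeans(X, k, max_iter=100, seed=0):
    """Plain K-Means: random initial centroids, then assign/update until stable."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        labels = assign(X, centroids)
        # New centroid = mean of assigned points; an empty cluster keeps its old centroid.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(X, centroids), centroids

# Synthetic stand-in for two codon-frequency columns (made-up values).
rng = np.random.default_rng(42)
centres = np.array([[0.02, 0.03], [0.05, 0.01], [0.01, 0.06]])
X_raw = np.vstack([c + rng.normal(0, 0.002, size=(20, 2)) for c in centres])

# Standardise each feature to zero mean and unit variance (step 5).
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

labels, centroids = kmeans(X, k=3)
print(np.bincount(labels, minlength=3))
```

Because the initial centroids are random, different seeds can converge to different clusterings, which is why library implementations usually run several initialisations.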
PROGRAM:
import pandas as pd
import seaborn as sns
data = {
'DNAType': [0]*7,
'SpeciesName': [
'Bohle iridovirus',
'Homo sapiens'
],
df = pd.DataFrame(data)
features = df[['UUU', 'UUC', 'UUA', 'UUG']]
scaler = StandardScaler()
scaled_features = scaler.fit_transform(features)
df['Cluster'] = kmeans.fit_predict(scaled_features)
pd.set_option('display.max_columns', None)
plt.figure(figsize=(8, 6))
plt.xlabel("UUU Frequency")
plt.ylabel("UUC Frequency")
plt.grid(True)
plt.show()
SAMPLE INPUT OUTPUT:
RESULT:
Thus, the K-Means Clustering Algorithm was successfully implemented on the given biological dataset to
group the data points (species) based on their codon usage frequencies.
EX.NO:6 IMPLEMENT THE NAÏVE BAYES CLASSIFIER USING THE GIVEN
DATASET
DATE:
Aim:
To implement the Naïve Bayes classifier using the given dataset for predicting the results.
Algorithm:
STEP4: Then install the necessary packages and libraries in the created environment in the Jupyter
notebook.
Program:
import numpy as np
import pandas as pd
# Step 2: Create a sample dataset similar to what's used in the experiment
data = {
'Gender': ['Male', 'Male', 'Female', 'Female', 'Male', 'Male', 'Female', 'Female', 'Male', 'Female'],
'Age': [19, 35, 26, 27, 45, 46, 48, 50, 29, 31],
'EstimatedSalary': [19000, 20000, 43000, 57000, 26000, 28000, 30000, 87000, 80000, 150000],
'Purchased': [0, 0, 0, 1, 1, 1, 1, 1, 0, 1]
}
dataset = pd.DataFrame(data)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
classifier = GaussianNB()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
ac = accuracy_score(y_test, y_pred)
X1, X2 = np.meshgrid(
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
plt.title(title)
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
SAMPLE INPUT OUTPUT:
RESULT:
Thus, the Naïve Bayes classifier was successfully implemented using the given
dataset.
EX.NO: 7 PROJECT: BRAIN TUMOUR DETECTION USING MACHINE LEARNING
ALGORITHM
DATE:
Abstract
A brain tumour is the development of aberrant brain cells, some of which may turn cancerous.
Magnetic Resonance Imaging (MRI) scans are the most common technique for finding brain
tumours, since information about aberrant tissue growth in the brain is discernible from the scans.
Numerous research papers use machine learning and deep learning algorithms to detect brain
tumours. When these algorithms are applied to MRI images, a brain tumour can be predicted in very
little time, and the better accuracy makes it easier to treat patients. These predictions help the
radiologist make speedy decisions. In the proposed work, a self-defined Artificial Neural Network
(ANN) and a Convolutional Neural Network (CNN) are used to detect the presence of a brain
tumour, and their performance is analyzed.
Keywords: Convolution Neural Network, Machine Learning, Brain tumor, Algorithms
Introduction
The brain is one of the most crucial parts of the human body since it regulates the operation of all
other organs and aids in decision-making. It is essentially the command post of the central nervous
system and is in charge of carrying out the body's regular voluntary and involuntary functions. A
tumour is an uncontrolled proliferation of undesirable tissue that forms a fibrous mesh inside the
brain. This year, a brain tumour is identified in roughly 3,540 youngsters at the age of 15. To
effectively prevent and treat the condition, it is crucial to have a thorough grasp of brain tumours and
their stages. ANN and CNN are used to classify normal and tumour brain images.
An ANN (Artificial Neural Network) works like the nervous system of the human brain. On this
basis, a digital computer is built with a large number of interconnections, which allows the neural
network to be trained using simple processing units applied to the training set and to store
the experiential knowledge. It has different layers of neurons that are connected together. The neural
network acquires knowledge from the dataset used in the learning process. There is one input layer
and one output layer, whereas there may be any number of hidden layers. In the learning process,
weights and biases are assigned to the neurons of each layer depending on the input features and on
the previous layers (for hidden layers and output layers). A model is trained based on the activation
function applied to the input features and to the hidden layers, where more learning happens, to
achieve the expected output.
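The role of weights, biases and activation functions described above can be sketched as a forward pass through a tiny network. The weights, biases and inputs below are made-up numbers chosen so the result is easy to check by hand, and the helper dense is a hypothetical name, not part of the project code.

```python
import math

def relu(z):
    """Rectified linear unit: pass positive values, clamp negatives to zero."""
    return max(0.0, z)

def sigmoid(z):
    """Squash any real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for every neuron."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

# Two input features, one hidden layer with two neurons, one output neuron.
x = [1.0, 2.0]
hidden = dense(x, weights=[[0.5, -1.0], [1.0, 1.0]], biases=[0.0, -1.0], activation=relu)
output = dense(hidden, weights=[[1.0, 0.5]], biases=[-1.0], activation=sigmoid)
print(hidden, output)
```

Training adjusts the weights and biases so that the output moves towards the expected label. The forward pass itself is only this repeated weighted sum followed by an activation.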
Existing Methodology
Brain tumours are detected using image processing techniques. Various algorithms are used
to meet the requirements and arrive at the best results. Among the algorithms used,
the probabilistic neural network has been applied for greater productivity alongside the
SVM and KNN techniques. Segmentation plays a major role in detecting brain tumours.
Pre-Processing
Pre-processing techniques aim to enhance the image without changing its
information content. The first driver of image flaws is as Low.
Segmentation
Region growing is a simple region-based image segmentation method. It is
also classified as a pixel-based image segmentation method since it involves the
selection of initial seed points. This approach to segmentation examines the
neighbouring pixels of the initial seed points and determines whether the pixel neighbours
should be added to the region. The process is iterated in the same
manner as general data clustering algorithms. A general discussion
of the region growing algorithm is given below.
Convolutional Neural Networks (CNNs) are easier to train and less prone to overfitting.
As mentioned earlier in the report, we use a patch-based segmentation approach.
The convolutional architecture and implementation are carried out using CAFFE.
CNNs are an extension of the multi-layer perceptron (MLP). In the MLP, a unit performs
a simple computation by taking the weighted sum of all the other units that serve as
input to it. The network is organized into layers, with each unit receiving input from
the units in the previous layer. The essence of CNNs is the convolution.
The main trick by which a convolutional neural network avoids the problem of too many
parameters is sparse connections: each unit is not connected to every
unit in the previous layer.
Proposed Methodology
The two techniques ANN and CNN are applied on the brain tumor dataset and
their performance on classifying the image is analyzed. Steps followed in applying ANN
on the brain tumor dataset are
The ANN model used here has seven layers. The first layer is a flatten layer, which
converts the 256x256x3 images into a one-dimensional array. The next five layers are
dense layers with relu as the activation function and 128, 256, 512, 256 and 128
neurons respectively. These five layers act as the hidden
layers. The last dense layer, with sigmoid as the activation function, is the output
layer, with 1 neuron representing the two classes.
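The size of this network follows from the layer widths given above. The sketch below counts the trainable parameters of each dense layer as inputs x units + units (weights plus biases). The helper name dense_parameter_counts is hypothetical, and the figures are derived only from the stated layer sizes, not from the project's actual model summary.

```python
def dense_parameter_counts(input_size, layer_sizes):
    """Weights plus biases for each dense layer in a stack."""
    counts = []
    fan_in = input_size
    for units in layer_sizes:
        counts.append(fan_in * units + units)  # one weight per input, one bias per unit
        fan_in = units
    return counts

flattened = 256 * 256 * 3                    # output length of the flatten layer
layer_sizes = [128, 256, 512, 256, 128, 1]   # five hidden layers and the output layer

counts = dense_parameter_counts(flattened, layer_sizes)
total = sum(counts)
for units, count in zip(layer_sizes, counts):
    print(f"dense({units:>3}): {count:,} parameters")
print(f"total: {total:,}")
```

Almost all of the parameters sit in the first dense layer, because every one of the 196,608 flattened pixel values is connected to each of its 128 neurons.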
The model is compiled with the adam optimization technique and the binary
cross-entropy loss function. The model is generated and trained by providing the training
images and the validation images. Once the model is trained, it is tested using the test
image set. Next, the same dataset is given to the CNN technique. Steps followed
in applying CNN on the brain tumor dataset are
The CNN sequential model is generated by implementing different layers. The input
image is reshaped into 256x256. The convolution layers are applied to the input image with
relu as the activation function and padding set to same, which means the output image has
the same size as the input image; the numbers of filters are 32, 32, 64, 128 and 256 for
the different convolution layers. Max pooling is applied with a 2x2 window size, and the
dropout function is called with a dropout rate of 20%. The flatten method is applied to
convert the features into a one-dimensional array. The fully connected layer is created by
calling the dense method with 256 units and relu as the activation function. The
output layer has 1 unit to represent the two classes, with sigmoid as the activation
function. The architecture of the CNN model is shown in the Figure. The implementation
is done in Python and executed in Google Colab.
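The effect of same padding and 2x2 max pooling on the image size can be worked out with the standard output-size formula, floor((n + 2p - k) / s) + 1. The sketch below is an illustration only: the helper names are hypothetical, and the trace assumes three convolution plus pooling stages with stride-1 convolutions and stride-2 pooling, which the text above does not state explicitly.

```python
def output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window along one axis."""
    return (n + 2 * padding - kernel) // stride + 1

def same_padding(kernel):
    """Per-side padding that preserves the size for stride 1 and an odd kernel."""
    return (kernel - 1) // 2

size = 256
conv_same = output_size(size, kernel=3, padding=same_padding(3))  # padding "same"
conv_valid = output_size(size, kernel=3)                          # no padding
print("3x3 conv, same padding:", conv_same)
print("3x3 conv, no padding:  ", conv_valid)

# Hypothetical trace: three stages of (3x3 same conv, then 2x2 max pooling).
trace = []
n = size
for _ in range(3):
    n = output_size(n, kernel=3, padding=same_padding(3))  # size unchanged
    n = output_size(n, kernel=2, stride=2)                 # size halved
    trace.append(n)
print("size after each pooling stage:", trace)
```

Same padding keeps the convolution from shrinking the image, so only the pooling layers reduce the spatial size before the flatten step.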
The model is run for 200 epochs with the training and the validation dataset. The
history of execution is stored and plotted to understand the models generated.
[Figure: CNN layer diagram. Blocks shown: Convolve(n, 3x3, "relu", (256x256x3), padding=same) with n = 32, 32, 64, 128, 256; Max Pooling(2x2); Dropout(0.2); Flatten(); Output Layer.]
Figure : Architecture of CNN model
DataSet
The dataset is taken from the GitHub website. It contains MRI images of brain
tumours. The Figure shows sample normal and brain tumour images. Of the 1672 training
images, 877 are tumour images and 795 are non-tumour images. The 186 validation
images comprise 92 tumour and 94 non-tumour images. Among the 207 testing
images, 116 are tumour images and 91 are non-tumour images.
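The split sizes and class balance implied by these counts can be checked with a few lines of arithmetic. The counts are taken from the paragraph above; the dictionary layout and variable names are only illustrative.

```python
# Image counts per split, as stated in the dataset description.
splits = {
    "training": {"tumour": 877, "non_tumour": 795},
    "validation": {"tumour": 92, "non_tumour": 94},
    "testing": {"tumour": 116, "non_tumour": 91},
}

sizes = {name: sum(counts.values()) for name, counts in splits.items()}
total = sum(sizes.values())

for name, counts in splits.items():
    share = sizes[name] / total                      # fraction of all images in this split
    tumour_fraction = counts["tumour"] / sizes[name]  # class balance inside the split
    print(f"{name:>10}: {sizes[name]:4d} images "
          f"({share:.1%} of all), {tumour_fraction:.1%} tumour")
print("total images:", total)
```

The classes are close to balanced in every split, so plain accuracy is a reasonable first metric for this dataset.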
Experimental Result Analysis
Conclusion:
CNN is considered one of the best techniques for analyzing image datasets. The
CNN makes its prediction by reducing the size of the image without losing the information
needed for making predictions. The ANN model generated here produces a testing
accuracy of 65.21%, which can be increased by providing more image data. This can be done
by applying image augmentation techniques, after which the performance of the
ANN and CNN can be analyzed again. The model developed here was arrived at by trial and
error. In future, optimization techniques can be applied to decide the number
of layers and filters to be used in a model. As of now, for the given dataset, the CNN
proves to be the better technique for predicting the presence of a brain tumour.