
AML - LAB21 6 6 1.ipynb - Colab

This notebook walks through a machine learning exercise on a breast cancer diagnosis dataset: the data is loaded, cleaned, and prepared for training a neural network. The dataset contains 569 entries with 30 numeric features, and the diagnosis label is converted from categorical ("M"/"B") to binary form. An artificial neural network (ANN) is then built and trained on the processed data, with training and validation metrics reported per epoch.



import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import tensorflow as tf
from sklearn.metrics import classification_report, confusion_matrix

df = pd.read_csv("/content/data.csv")
df.head()

         id  diagnosis  radius_mean  texture_mean  perimeter_mean  area_mean  ...
0    842302          M        17.99         10.38          122.80     1001.0  ...
1    842517          M        20.57         17.77          132.90     1326.0  ...
2  84300903          M        19.69         21.25          130.00     1203.0  ...
3  84348301          M        11.42         20.38           77.58      386.1  ...
4  84358402          M        20.29         14.34          135.10     1297.0  ...

5 rows × 33 columns

# Drop the non-predictive ID column and the empty trailing column
df.drop(["id", "Unnamed: 32"], axis=1, inplace=True)

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 569 entries, 0 to 568
Data columns (total 31 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 diagnosis 569 non-null object
1 radius_mean 569 non-null float64
2 texture_mean 569 non-null float64
3 perimeter_mean 569 non-null float64
4 area_mean 569 non-null float64
5 smoothness_mean 569 non-null float64
6 compactness_mean 569 non-null float64
7 concavity_mean 569 non-null float64
8 concave points_mean 569 non-null float64
9 symmetry_mean 569 non-null float64
10 fractal_dimension_mean 569 non-null float64
11 radius_se 569 non-null float64
12 texture_se 569 non-null float64
13 perimeter_se 569 non-null float64
14 area_se 569 non-null float64
15 smoothness_se 569 non-null float64
16 compactness_se 569 non-null float64
17 concavity_se 569 non-null float64
18 concave points_se 569 non-null float64
19 symmetry_se 569 non-null float64
20 fractal_dimension_se 569 non-null float64
21 radius_worst 569 non-null float64
22 texture_worst 569 non-null float64
23 perimeter_worst 569 non-null float64
24 area_worst 569 non-null float64
25 smoothness_worst 569 non-null float64
26 compactness_worst 569 non-null float64
27 concavity_worst 569 non-null float64
28 concave points_worst 569 non-null float64
29 symmetry_worst 569 non-null float64
30 fractal_dimension_worst 569 non-null float64
dtypes: float64(30), object(1)
memory usage: 137.9+ KB

list(set(df.dtypes.tolist()))

[dtype('float64'), dtype('O')]

df.select_dtypes(include = ['object']).columns

Index(['diagnosis'], dtype='object')
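
Before encoding, the class balance can be inspected directly (a sketch; the counts themselves are not shown in the original capture):

df["diagnosis"].value_counts()  # counts of M (malignant) vs B (benign)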

# Encode the target: malignant ("M") -> 1, benign ("B") -> 0
df.diagnosis = [1 if each == "M" else 0 for each in df.diagnosis]
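
An equivalent, more idiomatic pandas version of the same encoding (a sketch, producing the same 0/1 column):

df["diagnosis"] = df["diagnosis"].map({"M": 1, "B": 0})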

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 569 entries, 0 to 568
Data columns (total 31 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 diagnosis 569 non-null int64
1 radius_mean 569 non-null float64
2 texture_mean 569 non-null float64
3 perimeter_mean 569 non-null float64
4 area_mean 569 non-null float64
5 smoothness_mean 569 non-null float64
6 compactness_mean 569 non-null float64
7 concavity_mean 569 non-null float64
8 concave points_mean 569 non-null float64
9 symmetry_mean 569 non-null float64
10 fractal_dimension_mean 569 non-null float64
11 radius_se 569 non-null float64
12 texture_se 569 non-null float64
13 perimeter_se 569 non-null float64
14 area_se 569 non-null float64
15 smoothness_se 569 non-null float64
16 compactness_se 569 non-null float64
17 concavity_se 569 non-null float64
18 concave points_se 569 non-null float64
19 symmetry_se 569 non-null float64
20 fractal_dimension_se 569 non-null float64
21 radius_worst 569 non-null float64
22 texture_worst 569 non-null float64
23 perimeter_worst 569 non-null float64
24 area_worst 569 non-null float64
25 smoothness_worst 569 non-null float64
26 compactness_worst 569 non-null float64
27 concavity_worst 569 non-null float64
28 concave points_worst 569 non-null float64
29 symmetry_worst 569 non-null float64
30 fractal_dimension_worst 569 non-null float64
dtypes: float64(30), int64(1)
memory usage: 137.9 KB

df.describe().T


                         count        mean         std         min         25%  ...
diagnosis                569.0    0.372583    0.483918    0.000000    0.000000  ...
radius_mean              569.0   14.127292    3.524049    6.981000   11.700000  ...
texture_mean             569.0   19.289649    4.301036    9.710000   16.170000  ...
perimeter_mean           569.0   91.969033   24.298981   43.790000   75.170000  ...
area_mean                569.0  654.889104  351.914129  143.500000  420.300000  ...
smoothness_mean          569.0    0.096360    0.014064    0.052630    0.086370  ...
compactness_mean         569.0    0.104341    0.052813    0.019380    0.064920  ...
concavity_mean           569.0    0.088799    0.079720    0.000000    0.029560  ...
concave points_mean      569.0    0.048919    0.038803    0.000000    0.020310  ...
symmetry_mean            569.0    0.181162    0.027414    0.106000    0.161900  ...
fractal_dimension_mean   569.0    0.062798    0.007060    0.049960    0.057700  ...
radius_se                569.0    0.405172    0.277313    0.111500    0.232400  ...
texture_se               569.0    1.216853    0.551648    0.360200    0.833900  ...
perimeter_se             569.0    2.866059    2.021855    0.757000    1.606000  ...
area_se                  569.0   40.337079   45.491006    6.802000   17.850000  ...
smoothness_se            569.0    0.007041    0.003003    0.001713    0.005169  ...
compactness_se           569.0    0.025478    0.017908    0.002252    0.013080  ...
concavity_se             569.0    0.031894    0.030186    0.000000    0.015090  ...
concave points_se        569.0    0.011796    0.006170    0.000000    0.007638  ...
symmetry_se              569.0    0.020542    0.008266    0.007882    0.015160  ...
fractal_dimension_se     569.0    0.003795    0.002646    0.000895    0.002248  ...
radius_worst             569.0   16.269190    4.833242    7.930000   13.010000  ...
texture_worst            569.0   25.677223    6.146258   12.020000   21.080000  ...
perimeter_worst          569.0  107.261213   33.602542   50.410000   84.110000  ...
area_worst               569.0  880.583128  569.356993  185.200000  515.300000  ...
smoothness_worst         569.0    0.132369    0.022832    0.071170    0.116600  ...
compactness_worst        569.0    0.254265    0.157336    0.027290    0.147200  ...
concavity_worst          569.0    0.272188    0.208624    0.000000    0.114500  ...
concave points_worst     569.0    0.114606    0.065732    0.000000    0.064930  ...
symmetry_worst           569.0    0.290076    0.061867    0.156500    0.250400  ...
...

# Separate the target from the 30 numeric features
y = df.diagnosis.values.reshape(-1, 1)
X = df.iloc[:, 1:].values

# Min-max normalization, scaling each feature column to the [0, 1] range
X = (X - np.min(X, axis=0)) / (np.max(X, axis=0) - np.min(X, axis=0))

# Hold out 20% of the data for testing (stratify=y would additionally preserve the ~37% malignant share in both splits)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print(f"X_train: {X_train.shape}")
print(f"X_test: {X_test.shape}")
print(f"y_train: {y_train.shape}")
print(f"y_test: {y_test.shape}")

X_train: (455, 30)
X_test: (114, 30)
y_train: (455, 1)
y_test: (114, 1)
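
Note that the manual min-max scaling above is computed on the full dataset before the split, so test-set statistics leak into the preprocessing. A stricter variant (a sketch, meant to replace the manual scaling and run after the split) uses scikit-learn's MinMaxScaler fit on the training rows only:

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()                   # per-feature min-max to [0, 1]
X_train = scaler.fit_transform(X_train)   # learn min/max from the training rows only
X_test = scaler.transform(X_test)         # apply the same transform to the test rows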

# Initializing the ANN
ann = tf.keras.models.Sequential()

# Adding the input layer and the first hidden layer
ann.add(tf.keras.layers.Dense(units=6, activation="relu"))

# Adding the second hidden layer
ann.add(tf.keras.layers.Dense(units=6, activation="relu"))

# Adding the output layer (sigmoid for a binary target)
ann.add(tf.keras.layers.Dense(units=1, activation="sigmoid"))

# Binary cross-entropy matches the single sigmoid output unit
ann.compile(optimizer="adam",
            loss="binary_crossentropy",
            metrics=["accuracy"])

model_train = ann.fit(X_train, y_train,
                      batch_size=32, epochs=100, verbose=1,
                      validation_data=(X_test, y_test))

Epoch 1/100
15/15 [==============================] - 1s 18ms/step - loss: 0.6901 - accuracy: 0.6571 - val_loss: 0.6878 - val_accuracy: 0.631
Epoch 2/100
15/15 [==============================] - 0s 4ms/step - loss: 0.6859 - accuracy: 0.6374 - val_loss: 0.6832 - val_accuracy: 0.6228
Epoch 3/100
15/15 [==============================] - 0s 4ms/step - loss: 0.6814 - accuracy: 0.6374 - val_loss: 0.6789 - val_accuracy: 0.6228
Epoch 4/100
15/15 [==============================] - 0s 5ms/step - loss: 0.6773 - accuracy: 0.6374 - val_loss: 0.6743 - val_accuracy: 0.6228
Epoch 5/100
15/15 [==============================] - 0s 5ms/step - loss: 0.6730 - accuracy: 0.6418 - val_loss: 0.6702 - val_accuracy: 0.6404
Epoch 6/100
15/15 [==============================] - 0s 5ms/step - loss: 0.6685 - accuracy: 0.6462 - val_loss: 0.6657 - val_accuracy: 0.6404
Epoch 7/100
15/15 [==============================] - 0s 5ms/step - loss: 0.6643 - accuracy: 0.6462 - val_loss: 0.6606 - val_accuracy: 0.6491
Epoch 8/100
15/15 [==============================] - 0s 6ms/step - loss: 0.6593 - accuracy: 0.6484 - val_loss: 0.6555 - val_accuracy: 0.6491
Epoch 9/100
15/15 [==============================] - 0s 6ms/step - loss: 0.6543 - accuracy: 0.6549 - val_loss: 0.6500 - val_accuracy: 0.6491
Epoch 10/100
15/15 [==============================] - 0s 15ms/step - loss: 0.6487 - accuracy: 0.6593 - val_loss: 0.6440 - val_accuracy: 0.666
Epoch 11/100
15/15 [==============================] - 0s 9ms/step - loss: 0.6430 - accuracy: 0.6593 - val_loss: 0.6376 - val_accuracy: 0.6754
Epoch 12/100
15/15 [==============================] - 0s 7ms/step - loss: 0.6369 - accuracy: 0.6703 - val_loss: 0.6312 - val_accuracy: 0.6842
Epoch 13/100
15/15 [==============================] - 0s 9ms/step - loss: 0.6304 - accuracy: 0.6813 - val_loss: 0.6238 - val_accuracy: 0.6842
Epoch 14/100
15/15 [==============================] - 0s 9ms/step - loss: 0.6234 - accuracy: 0.6857 - val_loss: 0.6161 - val_accuracy: 0.7018
Epoch 15/100
15/15 [==============================] - 0s 6ms/step - loss: 0.6159 - accuracy: 0.7033 - val_loss: 0.6079 - val_accuracy: 0.7281
Epoch 16/100
15/15 [==============================] - 0s 7ms/step - loss: 0.6087 - accuracy: 0.7429 - val_loss: 0.6000 - val_accuracy: 0.7456
Epoch 17/100
15/15 [==============================] - 0s 10ms/step - loss: 0.6007 - accuracy: 0.7714 - val_loss: 0.5913 - val_accuracy: 0.763
Epoch 18/100
15/15 [==============================] - 0s 6ms/step - loss: 0.5923 - accuracy: 0.7758 - val_loss: 0.5820 - val_accuracy: 0.7632
Epoch 19/100
15/15 [==============================] - 0s 8ms/step - loss: 0.5838 - accuracy: 0.7802 - val_loss: 0.5723 - val_accuracy: 0.7719
Epoch 20/100
15/15 [==============================] - 0s 7ms/step - loss: 0.5747 - accuracy: 0.7890 - val_loss: 0.5621 - val_accuracy: 0.7807
Epoch 21/100
15/15 [==============================] - 0s 7ms/step - loss: 0.5653 - accuracy: 0.7956 - val_loss: 0.5515 - val_accuracy: 0.7895
Epoch 22/100
15/15 [==============================] - 0s 10ms/step - loss: 0.5555 - accuracy: 0.8044 - val_loss: 0.5403 - val_accuracy: 0.815
Epoch 23/100
15/15 [==============================] - 0s 9ms/step - loss: 0.5455 - accuracy: 0.8198 - val_loss: 0.5292 - val_accuracy: 0.8158
Epoch 24/100
15/15 [==============================] - 0s 10ms/step - loss: 0.5349 - accuracy: 0.8220 - val_loss: 0.5173 - val_accuracy: 0.833
Epoch 25/100
15/15 [==============================] - 0s 9ms/step - loss: 0.5243 - accuracy: 0.8242 - val_loss: 0.5052 - val_accuracy: 0.8333
Epoch 26/100
15/15 [==============================] - 0s 13ms/step - loss: 0.5132 - accuracy: 0.8286 - val_loss: 0.4932 - val_accuracy: 0.842
Epoch 27/100
15/15 [==============================] - 0s 9ms/step - loss: 0.5025 - accuracy: 0.8308 - val_loss: 0.4810 - val_accuracy: 0.8421
Epoch 28/100
15/15 [==============================] - 0s 11ms/step - loss: 0.4915 - accuracy: 0.8308 - val_loss: 0.4691 - val_accuracy: 0.842
Epoch 29/100
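
Training for a fixed 100 epochs can overfit. A possible refinement (a sketch, not part of the original run, reusing the ann model and data above) is to stop once validation loss stops improving and keep the best weights, via Keras's EarlyStopping callback:

early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss",
                                              patience=10,  # epochs to wait without improvement
                                              restore_best_weights=True)

model_train = ann.fit(X_train, y_train,
                      batch_size=32, epochs=100, verbose=1,
                      validation_data=(X_test, y_test),
                      callbacks=[early_stop])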

# Predict test-set probabilities and threshold at 0.5 to get class labels
y_pred = ann.predict(X_test)
y_pred = (y_pred > 0.5)

# Predicted labels (left) next to true labels (right)
print(np.concatenate((y_pred.reshape(len(y_pred), 1), y_test.reshape(len(y_test), 1)), 1))


4/4 [==============================] - 0s 3ms/step


[[0 0]
[1 1]
[1 1]
[0 0]
[0 0]
[1 1]
[1 1]
[1 1]
[0 0]
[0 0]
[0 0]
[1 1]
[0 0]
[1 1]
[0 0]
[1 1]
[0 0]
[0 0]
[0 0]
[1 1]
[0 1]
[0 0]
[1 1]
[0 0]
[0 0]
[0 0]
[0 0]
[0 0]
[0 0]
[1 1]
[0 0]
[0 0]
[0 0]
[0 0]
[0 0]
[0 0]
[1 1]
[0 0]
[1 1]
[0 0]
[0 0]
[1 1]
[0 0]
[0 0]
[0 0]
[0 0]
[0 0]
[0 0]
[0 0]
[0 0]
[1 1]
[1 1]
[0 0]
[0 0]
[0 0]
[0 0]
[0 0]
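
classification_report and confusion_matrix are imported at the top but never called; a natural next step (a sketch reusing y_test and the thresholded y_pred above) is to summarize test performance with them:

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred, target_names=["benign", "malignant"]))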

# Retrieve the per-epoch metrics recorded by fit()
acc = model_train.history['accuracy']
val_acc = model_train.history['val_accuracy']
loss = model_train.history['loss']
val_loss = model_train.history['val_loss']
epochs = range(1, len(acc) + 1)

# Training vs. validation accuracy
plt.plot(epochs, acc, 'g', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

# Training vs. validation loss
plt.figure()
plt.plot(epochs, loss, 'g', label='Training loss')
plt.plot(epochs, val_loss, 'r', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()

