AI HW
1.
Choose a tabular classification dataset (preferably a csv file) from the Kaggle website. Write
the details of the selected dataset in the box below.
Dataset Details
Dataset name korean_lotto
Dataset URL https://www.kaggle.com/datasets/calebreigada/south-korean-lottery-numbers
Number of rows 1003
Number of columns 9
Size of the csv file (in kilobytes) 27.8 KB
Type of data of the first numerical input column (numerical or string?)
Type of data of the second numerical input column (numerical or string?)
Type of data of the output column (numerical or string?) Numerical
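The entries in the table above can be read off the file programmatically. The following is a minimal sketch, assuming pandas is installed. It writes a tiny stand-in CSV so that it runs anywhere; the column names and values are made up for illustration and are not taken from korean_lotto.csv. Pointing `csv_path` at the real file would report its rows, columns, size, and column types.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in data; replace csv_path with the real korean_lotto.csv path.
csv_text = "a,b,label\n1,2.5,0\n3,4.5,1\n5,6.5,0\n"
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as handle:
    handle.write(csv_text)
    csv_path = handle.name

df = pd.read_csv(csv_path)
n_rows, n_cols = df.shape
size_kb = os.path.getsize(csv_path) / 1024  # file size in kilobytes


def kind(series):
    """Return 'numerical' for numeric columns, otherwise 'string'."""
    return "numerical" if pd.api.types.is_numeric_dtype(series) else "string"


print("Number of rows:", n_rows)
print("Number of columns:", n_cols)
print("Size (KB): %.1f" % size_kb)
print("First input column:", kind(df.iloc[:, 0]))
print("Second input column:", kind(df.iloc[:, 1]))
print("Output column:", kind(df.iloc[:, -1]))

os.remove(csv_path)  # clean up the temporary stand-in file
```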
2.
Apply the train_test_split method (75% of rows for training and 25% for testing).
Apply two different classification algorithms over the selected dataset.
Find (for each algorithm) the accuracy, confusion matrix, precision, and recall. Fill the
output in the box below.
Algorithm#1
Algorithm name KNeighborsClassifier
Accuracy 0.027888446215139442
Confusion matrix ValueError: Found input variables with inconsistent numbers of samples: [752, 251]
Precision ValueError: Found input variables with inconsistent numbers of samples: [752, 251]
Recall ValueError: Found input variables with inconsistent numbers of samples: [752, 251]
Algorithm#2
Algorithm name DecisionTreeClassifier
Accuracy 0.035856573705179286
Confusion matrix ValueError: Found input variables with inconsistent numbers of samples: [752, 251]
Precision ValueError: Found input variables with inconsistent numbers of samples: [752, 251]
Recall ValueError: Found input variables with inconsistent numbers of samples: [752, 251]
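For reference, the four requested metrics are all derived from the same comparison of true labels against predicted labels. The sketch below works them out by hand for two short toy label lists (illustrative values, not results from the lottery dataset), assuming a binary problem where 1 is the positive class.

```python
# Toy labels for illustration only; not taken from the dataset.
actual = [1, 1, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 0, 1, 1]

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives

# Rows are actual classes, columns are predicted classes (scikit-learn's layout).
confusion = [[tn, fp], [fn, tp]]

accuracy = (tp + tn) / len(pairs)  # share of all predictions that are correct
precision = tp / (tp + fp)         # share of predicted positives that are correct
recall = tp / (tp + fn)            # share of actual positives that were found

print(confusion)
print(accuracy, precision, recall)
```

Every quantity here compares `actual` with `predicted` for the same rows, so both lists must have the same length.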
Copy and paste a screenshot of your Python code in the following box.
import pandas
from sklearn import model_selection
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
# Load the dataset and print its shape (rows, columns)
a = pandas.read_csv("C:\\Users\\Zaid Saeed\\Downloads\\korean_lotto.csv")
print(a.shape)
# The first 8 columns are the inputs, the last column is the output
myarray = a.values
x = myarray[:, 0:8]
y = myarray[:, 8]
validation_size = 0.25
seed = 3
# x1/y1 are the training rows, x2/y2 are the testing rows
x1, x2, y1, y2 = model_selection.train_test_split(x, y, test_size=validation_size, random_state=seed)
#first alg.
#from sklearn.neighbors import KNeighborsClassifier
#model = KNeighborsClassifier()
#model.fit(x1, y1)
#predictions = model.predict(x2)
#second alg.
from sklearn.tree import DecisionTreeClassifier
model = DecisionTreeClassifier()
model.fit(x1, y1)
predictions = model.predict(x2)
# Every metric compares the test labels (y2) with the predictions, never y1 with y2
print(accuracy_score(y2, predictions))
results = confusion_matrix(y2, predictions)
print('Confusion Matrix')
print(results)
print('Report :')
print(classification_report(y2, predictions))
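When the output column has more than two classes, scikit-learn's `precision_score` and `recall_score` need an `average` argument. The sketch below computes all four metrics for both classifiers in one loop. It uses synthetic stand-in data (an assumption, so that it runs without the CSV), and macro averaging is one reasonable choice here, not the only one.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the CSV: 400 rows, 8 numeric inputs, 3 classes.
rng = np.random.default_rng(3)
X = rng.normal(size=(400, 8))
y = (X[:, 0] > 0).astype(int) + (X[:, 1] > 0).astype(int)  # labels 0, 1, 2

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=3)

scores = {}
for model in (KNeighborsClassifier(), DecisionTreeClassifier(random_state=3)):
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    name = type(model).__name__
    # All metrics compare the test labels with the predictions for the same rows.
    scores[name] = {
        "accuracy": accuracy_score(y_test, predictions),
        "confusion_matrix": confusion_matrix(y_test, predictions),
        "precision": precision_score(y_test, predictions,
                                     average="macro", zero_division=0),
        "recall": recall_score(y_test, predictions,
                               average="macro", zero_division=0),
    }
    print(name)
    for metric, value in scores[name].items():
        print(metric, value, sep="\n" if metric == "confusion_matrix" else ": ")
```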
Copy and paste a screenshot of your output screen in the following box.
Without error:
runfile('C:/Users/Zaid Saeed/.spyder-py3/untitled2.py', wdir='C:/Users/Zaid Saeed/.spyder-py3')
(1003, 9)
0.0398406374501992
With error:
runfile('C:/Users/Zaid Saeed/.spyder-py3/untitled2.py', wdir='C:/Users/Zaid Saeed/.spyder-py3')
(1003, 9)
0.043824701195219126
Traceback (most recent call last):
ValueError: Found input variables with inconsistent numbers of samples: [752, 251]
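The numbers in the error message are the sizes of the two splits: with 1003 rows and test_size=0.25, train_test_split keeps 752 rows for training and 251 for testing. The error appears when a metric is given the training labels and the testing labels, which have different lengths. The sketch below reproduces this with placeholder arrays of the same shape as the dataset (the values are made up; only the row count matters).

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Placeholder data with the dataset's dimensions: 1003 rows, 8 input columns.
X = np.zeros((1003, 8))
y = np.arange(1003) % 45

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=3)
print(len(y_train), len(y_test))

# Passing training labels and testing labels to a metric mixes 752 and 251 rows.
try:
    confusion_matrix(y_train, y_test)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

The metric functions expect the testing labels and the model's predictions for the testing rows, which both have 251 entries.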
3.
Apply a standard neural network over the selected dataset. Write the details of your
neural networks in the box below.
Neural Network Details
The number of layers (excluding the input layer) 3
The number of epochs 150
The number of batches 10
The value of input_dim 8
The number of neurons (circles) in your first-hidden layer 12
Find (for your neural network) the accuracy, and the loss. Fill the output in the box
below.
Accuracy ModuleNotFoundError: No module named 'keras'
Loss ModuleNotFoundError: No module named 'keras'
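A ModuleNotFoundError means Python could not find the keras package in the environment that ran the script. A quick way to confirm this, sketched below with the standard library only, is to ask whether the module can be located without importing it. The helper name `is_installed` is introduced here for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


for name in ("keras", "tensorflow"):
    status = "installed" if is_installed(name) else "missing"
    print(name, status)
```

If both are reported as missing, installing one of them into the same environment that Spyder uses (for example with `pip install tensorflow`) is the usual fix, assuming that environment allows package installation.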
Copy and paste a screenshot of your Python code (for your neural network) in the
following box.
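import pandas
from sklearn import model_selection
from keras.models import Sequential
from keras.layers import Dense
# Load the dataset: the first 8 columns are the inputs, the last column is the output
d = pandas.read_csv("C:\\Users\\Zaid Saeed\\Downloads\\korean_lotto.csv")
array = d.values
X = array[:, 0:8]
y = array[:, 8]
X_train, X_validation, Y_train, Y_validation = model_selection.train_test_split(X, y, test_size=0.15, random_state=7)
# Three layers (excluding the input layer): 12 -> 8 -> 1 neurons
m = Sequential()
m.add(Dense(12, input_dim=8, activation='relu'))
m.add(Dense(8, activation='relu'))
m.add(Dense(1, activation='sigmoid'))
# binary_crossentropy with a sigmoid output expects a 0/1 output column
m.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
m.fit(X_train, Y_train, epochs=150, batch_size=10)
loss, accuracy = m.evaluate(X_validation, Y_validation)
print('Accuracy: %.2f' % (accuracy*100))
print(loss)
# predict_classes was removed from recent Keras versions; threshold the sigmoid output instead
p = (m.predict(X_validation) > 0.5).astype(int)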
Copy and paste a screenshot of your output screen (for your neural network) in the
following box.
runfile('C:/Users/Zaid Saeed/.spyder-py3/untitled0.py', wdir='C:/Users/Zaid Saeed/.spyder-py3')
Traceback (most recent call last):