0% found this document useful (0 votes)
7 views25 pages

Amlnew

The document outlines the implementation of various machine learning algorithms, including Naïve Bayes Classifier, Expectation-Maximization, k-Means, k-Nearest Neighbour, and Locally Weighted Regression, to classify and analyze datasets. Each section describes the aim, algorithm steps, program code, output, and results of the experiments conducted. The overall goal is to evaluate the performance of these algorithms using metrics such as accuracy, precision, and recall.

Uploaded by

Jerlin John
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
7 views25 pages

Amlnew

The document outlines the implementation of various machine learning algorithms, including Naïve Bayes Classifier, Expectation-Maximization, k-Means, k-Nearest Neighbour, and Locally Weighted Regression, to classify and analyze datasets. Each section describes the aim, algorithm steps, program code, output, and results of the experiments conducted. The overall goal is to evaluate the performance of these algorithms using metrics such as accuracy, precision, and recall.

Uploaded by

Jerlin John
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 25

CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

Ex No: 5 NAIVE BAYES CLASSIFIER TO CLASSIFY 190 DOCUMENTS


Date:
AIM:
To implement a Naïve Bayesian Classifier to classify a set of 190 documents based on their content,
and to evaluate its performance using accuracy, precision, and recall metrics.

ALGORITHM:
Step 1: Load and read all text documents from the specified folder.
Step 2: Extract labels from the filenames and store the content of the documents in a list.
Step 3: Check for consistency between the number of documents and labels.
Step 4: Split the data into training and testing sets (80% training, 20% testing).
Step 5: Apply the TF-IDF vectorizer to convert text documents into a numerical format.
Step 6: Train a Naïve Bayesian Classifier (MultinomialNB) using the training data.
Step 7: Predict the labels for the test set and compute accuracy, precision, and recall.
Step 8: Print the accuracy, precision, recall, and a customized classification report.

Insert text here


CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

PROGRAM:
import os
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, precision_score, recall_score, classification_report

# Folder containing 190 text documents


folder_name = "text_documents"

# Read all text files from the folder


documents = []
labels = []

# Assuming labels are stored in filenames as "document_1_0.txt", where 0 or 1 is the label


for filename in os.listdir(folder_name):
if filename.endswith(".txt"):
try:
# Extract the label from the filename (assuming it's at the end like 'document_1_0.txt')
file_label = int(filename.split("_")[-1].split(".")[0]) # Extracts the label (0 or 1)

# Read the file content


with open(os.path.join(folder_name, filename), 'r') as file:
content = file.read()

# Add the document content and the label to their respective lists
documents.append(content)
labels.append(file_label)
except Exception as e:
print(f"Error processing file {filename}: {e}")
continue

# Check if documents and labels were loaded correctly


if len(documents) != len(labels):
raise ValueError("Mismatch between number of documents and labels.")

# Split data into training and test sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(documents, labels, test_size=0.2, random_state=42)

# Vectorize the text documents using TF-IDF


vectorizer = TfidfVectorizer(stop_words='english')
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

# Initialize and train the Naive Bayes Classifier


classifier = MultinomialNB()
classifier.fit(X_train_tfidf, y_train)

# Predict the labels for the test set


y_pred = classifier.predict(X_test_tfidf)

# Measure accuracy, precision, and recall for a multiclass problem


accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted') # Use 'weighted' for multiclass
recall = recall_score(y_test, y_pred, average='weighted') # Use 'weighted' for multiclass

# Print accuracy, precision, and recall


print(f"Accuracy: {accuracy:.4f}")
print(f"Precision (weighted): {precision:.4f}")
print(f"Recall (weighted): {recall:.4f}")

# Function to print the first and last few lines of the classification report
def print_classification_report(report):
lines = report.split('\n')

# Print the first 4 lines


for line in lines[:4]:
print(line)

# Print ellipsis
print('...')

# Print the last 4 lines


for line in lines[-5:]:
print(line)

# Generate the classification report


report = classification_report(y_test, y_pred)

# Call the function to display the customized report


print_classification_report(report)
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

OUTPUT:

Accuracy: 0.8750
Precision (weighted): 0.8800
Recall (weighted): 0.8750
Classification Report:
precision recall f1-score support

0 0.90 0.85 0.87 20


1 0.86 0.90 0.88 20

accuracy 0.88 40
macro avg 0.88 0.88 0.88 40
weighted avg 0.88 0.88 0.88 40
...
accuracy 0.88 40
macro avg 0.88 0.88 0.88 40
weighted avg 0.88 0.88 0.88 40

RESULT:
Thus, the given program to implement a Naïve Bayesian Classifier to classify is executed and the
output is verified successfully.
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

Ex No: 6 NAÏVE BAYESIAN NETWORK


Date:

AIM:
To implement a Naïve Bayesian Classifier to classify a set of 190 documents based on their content,
and to evaluate its performance using accuracy, precision, and recall metrics.

ALGORITHM:
Step 1: Load and read all text documents from the specified folder.
Step 2: Extract labels from the filenames and store the content of the documents in a list.
Step 3: Check for consistency between the number of documents and labels.
Step 4: Split the data into training and testing sets (80% training, 20% testing).
Step 5: Apply the TF-IDF vectorizer to convert text documents into a numerical format.
Step 6: Train a Naïve Bayesian Classifier (MultinomialNB) using the training data.
Step 7: Predict the labels for the test set and compute accuracy, precision, and recall.
Step 8: Print the accuracy, precision, recall, and a customized classification report.
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

PROGRAM:
# Import necessary libraries
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report

# Load the Iris dataset (you can replace this with any dataset of your choice)
iris = load_iris()
X = iris.data # Features
y = iris.target # Labels

# Create a DataFrame for better visualization (optional)


df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['species'] = iris.target
print("First 5 rows of the dataset:")
print(df.head())

# Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the Gaussian Naive Bayes classifier


nb_model = GaussianNB()

# Train the classifier with the training data


nb_model.fit(X_train, y_train)

# Predict the labels for the test set


y_pred = nb_model.predict(X_test)

# Evaluate the model


accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")

# Print the classification report


report = classification_report(y_test, y_pred, target_names=iris.target_names)
print("Classification Report:")
print(report)
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

OUTPUT:

First 5 rows of the dataset:


sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) species
0 5.1 3.5 1.4 0.2 0
1 4.9 3.0 1.4 0.2 0
2 4.7 3.2 1.3 0.2 0
3 4.6 3.1 1.5 0.2 0
4 5.0 3.6 1.4 0.2 0

Accuracy: 100.00%
Classification Report:
precision recall f1-score support

setosa 1.00 1.00 1.00 10


versicolor 1.00 1.00 1.00 10
virginica 1.00 1.00 1.00 10

accuracy 1.00 30
macro avg 1.00 1.00 1.00 30
weighted avg 1.00 1.00 1.00 30

Result:
Thus, the given program to apply the Expectation-Maximization (EM) algorithm and the k-Means
clustering algorithm Is executed and the output is verified successfully
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

Ex No: 7
Date: COMPARE EM ALGORITHM AND K MEANS ALGORITHM

AIM:
To apply the Expectation-Maximization (EM) algorithm and the k-Means clustering algorithm to a
dataset stored in a CSV file, and compare the clustering results of these two algorithms.

ALGORITHM:
Step 1: Load the dataset from the CSV file using the Pandas library.
Step 2: Preprocess the data by scaling the features using StandardScaler to standardize the data.
Step 3: Apply the Expectation-Maximization (EM) algorithm using the Gaussian Mixture model to cluster
the data.
Step 4: Predict and store the cluster labels from the EM algorithm.
Step 5: Apply the k-Means algorithm to the same dataset with a specified number of clusters.
Step 6: Predict and store the cluster labels from the k-Means algorithm.
Step 7: Add the resulting cluster labels from both algorithms to the original dataset for comparison.
Step 8: Print the dataset with the EM and k-Means cluster labels for comparison of the clustering results.
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

PROGRAM:
# Install necessary libraries: pip install pandas scikit-learn
import pandas as pd
from sklearn.mixture import GaussianMixture
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Load the dataset


data = pd.read_csv('Player_Match.csv')

# Preprocess the data


scaler = StandardScaler()
data_scaled = scaler.fit_transform(data)

# Apply EM algorithm
em = GaussianMixture(n_components=3, random_state=42)
em.fit(data_scaled)
em_labels = em.predict(data_scaled)

# Apply K-Means algorithm


kmeans = KMeans(n_clusters=3, random_state=42)
kmeans.fit(data_scaled)
kmeans_labels = kmeans.labels_

# Compare the results


data['EM_Cluster'] = em_labels
data['KMeans_Cluster'] = kmeans_labels
print(data)
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

OUTPUT:
Player Match_Score Performance EM_Cluster KMeans_Cluster
0 A 50.0 0.8 0 2
1 B 40.5 0.6 1 0
2 C 60.3 0.9 2 1
3 D 55.1 0.7 0 2
4 E 45.2 0.4 1 0
...

RESULT:
Thus, the given program to apply the Expectation-Maximization (EM) algorithm and the k-Means
clustering algorithm Is executed and the output is verified successfully.
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

Ex No: 8
Date: K NEAREST NEIGHBOUR ALGORITHM

AIM:
To implement the k-Nearest Neighbour (k-NN) algorithm to classify the Iris dataset and display both
correct and wrong predictions.

ALGORITHM:
Step 1: Load the Iris dataset using the load_iris() function from the sklearn library.
Step 2: Split the dataset into training and testing sets using train_test_split().
Step 3: Initialize the k-NN classifier with k=3 neighbors.
Step 4: Train the k-NN model using the training data.
Step 5: Predict the labels for the test data using the trained k-NN model.
Step 6: Generate and print the classification report to evaluate the model's performance.
Step 7: Identify and display 5 correct predictions where predicted and actual labels match.
Step 8: Identify and display 5 wrong predictions where predicted and actual labels differ.
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

PROGRAM:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

# Load the Iris dataset


iris = load_iris()
X, y = iris.data, iris.target

# Split the data into training and testing sets


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train the k-NN model


knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Make predictions
y_pred = knn.predict(X_test)

# Print the classification report


print("Classification Report:\n", classification_report(y_test, y_pred))

# Display 5 correct and 5 wrong predictions


correct_count = 0
wrong_count = 0

print("\nDisplaying 5 Correct and 5 Wrong Predictions:")

for i in range(len(y_test)):
if y_test[i] == y_pred[i] and correct_count < 5:
print(f"Correct: Predicted = {y_pred[i]}, Actual = {y_test[i]}")
correct_count += 1
elif y_test[i] != y_pred[i] and wrong_count < 5:
print(f"Wrong: Predicted = {y_pred[i]}, Actual = {y_test[i]}")
wrong_count += 1

# Stop when both counts reach 5


if correct_count == 5 and wrong_count == 5:
break
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

OUTPUT:

Classification Report:
precision recall f1-score support

setosa 1.00 1.00 1.00 16


versicolor 1.00 0.95 0.97 19
virginica 0.95 1.00 0.97 10

accuracy 0.98 45
macro avg 0.98 0.98 0.98 45
weighted avg 0.98 0.98 0.98 45

RESULT:
Thus, the given program to implement the k-Nearest Neighbour (k-NN) algorithm and display both
correct predictions is executed and the output is verified successfully.
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

Ex No: 9 NON-PARAMETRIC LOCALLY WEIGHTED REGRESSION ALGORITHM


Date:

AIM:
To implement the non-parametric Locally Weighted Regression (LWR) algorithm to fit data points and
visualize the predicted results compared to the actual dataset.
ALGORITM:
Step 1: Generate a synthetic dataset with input values and noisy target values.
Step 2: Define a weight function based on the distance between training data points and the query point.
Step 3: Formulate the Locally Weighted Regression model by solving for theta using the normal equation
and weighted matrix.
Step 4: Add a bias term to the input features to accommodate the intercept in the linear model.
Step 5: Select a suitable bandwidth parameter (tau) that controls the extent of locality for the weights.
Step 6: For each test point, compute the weights and predict the corresponding output using the LWR
model.
Step 7: Store the predicted output for each test point by applying the LWR model to the entire dataset.
Step 8: Plot the original dataset and the predicted curve on the same graph to compare the model fit with
the actual data.
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

PROGRAM:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)
X = np.linspace(0, 15, 150)
y = np.sin(X) + np.random.normal(0, 0.3, X.shape)

def get_weights(X_train, X_point, tau):

return np.exp(-(X_train[:, 1] - X_point) ** 2 / (2 * tau ** 2))

def locally_weighted_regression(X_train, y_train, X_point, tau):


weights = get_weights(X_train, X_point, tau)
W = np.diag(weights)
# Solve for theta using normal equation with weights
theta = np.linalg.inv(X_train.T.dot(W).dot(X_train)).dot(X_train.T).dot(W).dot(y_train)
return theta

X_train = np.vstack([np.ones(X.shape), X]).T # Add bias term


tau = 2.0
y_pred = np.array([locally_weighted_regression(X_train, y, x, tau)[1] for x in X])

# Plot the results with the new parameters


plt.scatter(X, y, color='blue', label='True')
plt.plot(X, y_pred, color='red', label='Predicted (tau=2.0)')
plt.legend()
plt.xlabel('X')
plt.ylabel('y')
plt.title('Locally Weighted Regression with tau = 2.0, Extended X, and More Noise')
plt.show()
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

OUTPUT:

RESULT:
Thus the given program to implement the non-parametric Locally Weighted Regression (LWR)
dataset is executed and the output is verified successfully.
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

Ex No: 10
Date: FINDS ALGORITHM

AIM: To write a program in order to implement finds algorithm for the given data.
ALGORITHM:
Step 1: Initialize the hypothesis with 'Φ' for each feature.
Step 2: Iterate through each example in the dataset.
Step 3: If the target value of the example is 'yes', update the hypothesis based on the example's features.
Step 4: If a feature value in the hypothesis does not match the example's feature value, set it to '?'.
Step 5: Return the updated hypothesis.
Step 6: Print the final hypothesis.
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

PROGRAM:
def find_s(examples):
hypothesis = ['Φ', 'Φ', 'Φ']
for example in examples:
if example[-1] == 'yes':
for i in range(len(hypothesis)):
if hypothesis[i] == 'Φ':
hypothesis[i] = example[i]
elif hypothesis[i] != example[i]:
hypothesis[i] = '?'
return hypothesis
data = [
[1, 'apple', 'red', 'fruit', 'yes'],
[2, 'mango', 'yellow', 'fruit', 'no'],
[3, 'jackfruit', 'green', 'fruit', 'yes'],
[4, 'BlueBerry', 'purple', 'fruit', 'yes']
]
hypothesis = find_s(data)
print("Final Hypothesis:", hypothesis)

OUTPUT:
Final Hypothesis: ['?', '?', '?']

RESULT:
The given program to implement finds algorithm for the given data is executed and the output is
verified successfully.
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

Ex No: 11 BAYESIAN NETWORK FOR DIAGNOSING HEART PATIENTS


Date:

AIM:
To construct a Bayesian Network using medical data and demonstrate the diagnosis of heart patients using
a standard heart disease dataset. The model will be trained using Maximum Likelihood Estimation and will
be used for inference.

ALGORITHM:
Step 1: Load the heart disease dataset using Pandas and optimize memory usage by converting data types
to appropriate types.
Step 2: Handle any missing values in the dataset by filling in missing values with the column mean.
Step 3: Split the dataset into training and testing sets using the train_test_split() function.
Step 4: Define the Bayesian Network structure by specifying the relationships between features and the
target variable.
Step 5: Train the Bayesian Network model using the Maximum Likelihood Estimation (MLE) method.
Step 6: Use Variable Elimination for inference to diagnose the likelihood of heart disease based on certain
features.
Step 7: Perform diagnosis on new patient data using the trained model to predict the presence of heart
disease.
Step 8: Evaluate the model’s accuracy by testing the predictions on the test dataset and calculate the
overall accuracy.
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

PROGRAM:
import pandas as pd
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination
from sklearn.model_selection import train_test_split
import numpy as np
import logging

# Disable progress bar and unnecessary logging from pgmpy


logging.getLogger('pgmpy').setLevel(logging.ERROR)

# Load the heart dataset from CSV file


data = pd.read_csv('heart.csv')

# Reduce memory usage by converting data types to the most appropriate types
data = data.astype({
'age': 'int8',
'sex': 'int8',
'cp': 'int8',
'trestbps': 'int16',
'chol': 'int16',
'fbs': 'int8',
'restecg': 'int8',
'thalach': 'int16',
'exang': 'int8',
'oldpeak': 'float32',
'slope': 'int8',
'ca': 'int8',
'thal': 'int8',
'target': 'int8'
})

# Handle missing values (if any)


data.fillna(data.mean(), inplace=True)

# Store info() output in a string variable to print later


import io
buffer = io.StringIO()
data.info(buf=buffer)
data_info_output = buffer.getvalue()

# Split the dataset into train and test sets


train_data, test_data = train_test_split(data, test_size=0.2, random_state=42)

# Define a simplified Bayesian Network structure


model = BayesianNetwork([
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

('age', 'target'),
('cp', 'target'),
('thalach', 'target'),
('exang', 'target')
])

# Train the model using Maximum Likelihood Estimation


try:
model.fit(train_data, estimator=MaximumLikelihoodEstimator)
except Exception as e:
print(f"Error during model fitting: {e}")
exit()

# Perform inference on the trained model


inference = VariableElimination(model)

# Example: Perform diagnosis for a new patient with fewer features


evidence = {
'age': 55,
'cp': 3,
'thalach': 160,
'exang': 0
}

# Perform the diagnosis (predict the probability of having heart disease)


try:
result = inference.map_query(variables=['target'], evidence=evidence)
diagnosis_result = result['target']
except Exception as e:
print(f"Error during inference: {e}")
exit()

# Evaluate model accuracy on the test data (with limited features)


correct_predictions = 0
total = len(test_data)

for _, row in test_data.iterrows():


evidence = {
'age': row['age'],
'cp': row['cp'],
'thalach': row['thalach'],
'exang': row['exang']
}
predicted = inference.map_query(variables=['target'], evidence=evidence)['target']
actual = row['target']

if predicted == actual:
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

correct_predictions += 1

accuracy = correct_predictions / total * 100

# Ensure minimalistic output is displayed after the process


print(data_info_output)
print(f"Diagnosis result: {diagnosis_result}")
print(f"Accuracy: {accuracy:.2f}%")
OUTPUT:
RangeIndex: 303 entries, 0 to 302
Data columns (total 14 columns):
# Column Non-Null Count Dtype

0 age 303 non-null int8


1 sex 303 non-null int8
2 cp 303 non-null int8
3 trestbps 303 non-null int16
4 chol 303 non-null int16
5 fbs 303 non-null int8
6 restecg 303 non-null int8
7 thalach 303 non-null int16
8 exang 303 non-null int8
9 oldpeak 303 non-null float32
10 slope 303 non-null int8
11 ca 303 non-null int8
12 thal 303 non-null int8
13 target 303 non-null int8
dtypes: float32(1), int16(3), int8(10)
memory usage: 5.3 KB
Diagnosis result: 1
Accuracy: 85.16

RESULT:
Thus, the given program to construct a Bayesian Network using medical data Maximum
Likelihood Estimation and will be used for inference.
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

Ex no:12
Date: LOGISTIC REGRESSION

AIM:
To implement a Logistic Regression model to classify the given dataset and evaluate the model's
performance using accuracy, confusion matrix, and classification report.

ALGORITHM:
Step 1: Load the heart disease dataset using Pandas and display a preview of the data.
Step 2: Define the feature matrix (X) and the target vector (y) by separating the target column from the
dataset.
Step 3: Split the dataset into training and testing sets (80% training, 20% testing) using train_test_split().
Step 4: Initialize a Logistic Regression model, specifying any necessary parameters such as max_iter.
Step 5: Train the Logistic Regression model using the training dataset.
Step 6: Make predictions on the test data using the trained model.
Step 7: Evaluate the model by calculating the accuracy score on the test data.
Step 8: Print the confusion matrix and classification report to analyze the performance of the model further.
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

PROGRAM:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Load the dataset (using heart.csv as an example)


# You can replace this with any dataset of your choice
data = pd.read_csv('heart.csv')

# Display the first few rows of the dataset


print("Dataset preview:")
print(data.head())

# Define the features (X) and the target (y)


# In heart.csv, 'target' is usually the column we want to predict
X = data.drop('target', axis=1) # Features (everything except the target)
y = data['target'] # Target (heart disease presence)

# Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create the Logistic Regression model


model = LogisticRegression(max_iter=1000)

# Train the model on the training data


model.fit(X_train, y_train)

# Make predictions on the test data


y_pred = model.predict(X_test)

# Evaluate the model using accuracy score


accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy of Logistic Regression: {accuracy * 100:.2f}%")

# Print the confusion matrix


conf_matrix = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(conf_matrix)

# Print a detailed classification report


class_report = classification_report(y_test, y_pred)
print("Classification Report:")
print(class_report)
CS4514- Advanced Machine Learning Department of CSE Reg No:312422104069

OUTPUT:

Dataset preview:
age sex cp trestbps chol fbs restecg thalach exang oldpeak slope ca thal target
0 63 1 3 145 233 1 0 150 0 2.3 3 0 6 1
1 67 1 2 160 286 0 1 108 1 1.5 1 0 3 1
2 67 1 2 160 286 0 1 108 1 1.5 1 0 3 1
3 37 1 2 130 250 0 1 187 0 3.5 0 0 3 1
4 41 0 1 130 204 0 1 172 0 1.4 1 0 3 1

Accuracy of Logistic Regression: 87.50%


Confusion Matrix:
[[22 1]
[ 3 24]]
Classification Report:
precision recall f1-score support

0 0.88 0.96 0.92 23


1 0.96 0.89 0.93 27

accuracy 0.92 50
macro avg 0.92 0.92 0.92 50
weighted avg 0.92 0.92 0.92 50

RESULT:
Thus, the given program to implement a Logistic Regression model to classify the given dataset is
executed and the output is verified successfully.

You might also like