Assignment ML

The document discusses using machine learning models like SVM, Naive Bayes and Random Forest classifiers to predict diseases by analyzing medical data. It loads and preprocesses a disease dataset, trains and tests the models on it, and evaluates their accuracy on test data using metrics like confusion matrix.

Uploaded by

vtu19941

SCHOOL OF COMPUTING

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


ASSIGNMENT

Programme Name : B. Tech CSE


Course Code / Course Name : 10211CS120/ Machine Learning
Year / Semester : 2023-2024 / Summer
VTU Number : 19941
Register Number : 21UECM0046
Name : CHENNA ANAND
Slot : S6+L18
Faculty : Dr. T. Kujani

Use Case: DISEASE PREDICTION

Objective: Disease prediction aims to detect potential health risks early by analyzing
factors such as genetics and initial symptoms, and to improve diagnostic accuracy by drawing
on large volumes of medical data. Prediction models can also group individuals by their
susceptibility to specific diseases, which supports targeted interventions and personalized
healthcare plans. This enables preventive measures and helps healthcare systems allocate
resources, prepare for outbreaks, and improve patient outcomes. In the long term, disease
prediction supports personalized medicine, in which treatment plans are tailored to each
individual's health profile.

Program:
# Importing libraries
import numpy as np
import pandas as pd
from scipy.stats import mode
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

%matplotlib inline
# Reading Training.csv and dropping every column that contains
# missing values, which removes the empty last column
DATA_PATH = "dataset/Training.csv"
data = pd.read_csv(DATA_PATH).dropna(axis=1)

# Checking whether the dataset is balanced or not
disease_counts = data["prognosis"].value_counts()
temp_df = pd.DataFrame({
    "Disease": disease_counts.index,
    "Counts": disease_counts.values
})

plt.figure(figsize = (18,8))
sns.barplot(x = "Disease", y = "Counts", data = temp_df)
plt.xticks(rotation=90)
plt.show()
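
The bar plot above shows class balance visually. As a complementary check, the ratio of the largest to the smallest class count gives a single number to report. The sketch below is a minimal illustration: the helper name `imbalance_ratio` and the toy labels are assumptions for demonstration, not part of the original program. In the program itself, `data["prognosis"]` would be passed instead.

```python
import pandas as pd


def imbalance_ratio(labels: pd.Series) -> float:
    """Return the largest class count divided by the smallest class count.

    A value of 1.0 means every class has the same number of samples.
    """
    counts = labels.value_counts()
    return float(counts.max() / counts.min())


# Hypothetical toy labels standing in for data["prognosis"]
toy_labels = pd.Series(["Flu", "Flu", "Malaria", "Malaria", "Dengue", "Dengue"])
print(f"Imbalance ratio: {imbalance_ratio(toy_labels):.2f}")
```

A ratio close to 1 suggests that plain accuracy is a reasonable metric. A much larger ratio would be a reason to look at per-class metrics as well.
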

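# Splitting the data: every column except the last is a feature,
# the last column is the target; 20% of the rows are held out for testing
X = data.iloc[:, :-1]
y = data.iloc[:, -1]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=24)

print(f"Train: {X_train.shape}, {y_train.shape}")
print(f"Test: {X_test.shape}, {y_test.shape}")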
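
The program imports `cross_val_score` but never calls it. A single 80/20 split yields one accuracy figure per model, while k-fold cross-validation averages over several splits. The following is a minimal sketch of how the three classifiers could be compared that way. It assumes a synthetic dataset from `make_classification` as a stand-in for the symptom data (which is not available here), and the choice of 5 stratified folds is also an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic stand-in for the feature matrix X and labels y (assumption)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=10,
    n_classes=3, random_state=24)

models = {
    "SVC": SVC(),
    "Gaussian NB": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=18),
}

# Stratified folds keep the class proportions similar in every fold
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=24)

cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X_demo, y_demo, cv=cv, scoring="accuracy")
    cv_scores[name] = scores
    print(f"{name}: mean accuracy = {scores.mean():.3f} "
          f"(std {scores.std():.3f})")
```

With the real data, `X_demo` and `y_demo` would be replaced by `X` and `y`. The spread across folds indicates how much a single-split accuracy might vary.
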

# Training and testing SVM Classifier
svm_model = SVC()
svm_model.fit(X_train, y_train)
preds = svm_model.predict(X_test)

print("Accuracy on train data by SVM Classifier: "
      f"{accuracy_score(y_train, svm_model.predict(X_train))*100}")
print("Accuracy on test data by SVM Classifier: "
      f"{accuracy_score(y_test, preds)*100}")

cf_matrix = confusion_matrix(y_test, preds)
plt.figure(figsize=(12,8))
sns.heatmap(cf_matrix, annot=True)
plt.title("Confusion Matrix for SVM Classifier on Test Data")
plt.show()

# Training and testing Naive Bayes Classifier
nb_model = GaussianNB()
nb_model.fit(X_train, y_train)
preds = nb_model.predict(X_test)

print("Accuracy on train data by Naive Bayes Classifier: "
      f"{accuracy_score(y_train, nb_model.predict(X_train))*100}")
print("Accuracy on test data by Naive Bayes Classifier: "
      f"{accuracy_score(y_test, preds)*100}")

cf_matrix = confusion_matrix(y_test, preds)
plt.figure(figsize=(12,8))
sns.heatmap(cf_matrix, annot=True)
plt.title("Confusion Matrix for Naive Bayes Classifier on Test Data")
plt.show()

# Training and testing Random Forest Classifier
rf_model = RandomForestClassifier(random_state=18)
rf_model.fit(X_train, y_train)
preds = rf_model.predict(X_test)

print("Accuracy on train data by Random Forest Classifier: "
      f"{accuracy_score(y_train, rf_model.predict(X_train))*100}")
print("Accuracy on test data by Random Forest Classifier: "
      f"{accuracy_score(y_test, preds)*100}")

cf_matrix = confusion_matrix(y_test, preds)
plt.figure(figsize=(12,8))
sns.heatmap(cf_matrix, annot=True)
plt.title("Confusion Matrix for Random Forest Classifier on Test Data")
plt.show()
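
The program imports `mode` from `scipy.stats` without using it. One plausible use is to combine the three classifiers by majority vote, although the source does not say this was intended. The sketch below illustrates the idea with `collections.Counter` from the standard library, because the labels are strings and recent SciPy releases restrict `mode` to numeric arrays. The function name `majority_vote` and the prediction lists are hypothetical. In the program above, each model's test predictions would first need their own variable, since `preds` is overwritten for every model.

```python
from collections import Counter


def majority_vote(*prediction_lists):
    """Combine per-model predictions by taking the most common label per sample.

    On a tie, the label that appears first among the votes is returned.
    """
    combined = []
    for votes in zip(*prediction_lists):
        # most_common(1) returns a one-element list of (label, count)
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from the three models for four test samples
svm_preds = ["Flu", "Malaria", "Dengue", "Flu"]
nb_preds = ["Flu", "Dengue", "Dengue", "Malaria"]
rf_preds = ["Malaria", "Malaria", "Dengue", "Flu"]

final_preds = majority_vote(svm_preds, nb_preds, rf_preds)
print(final_preds)  # ['Flu', 'Malaria', 'Dengue', 'Flu']
```

The combined labels can be passed to `accuracy_score` and `confusion_matrix` in the same way as the individual predictions.
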

Output:
