ML Project Assignment

Uploaded by

Vinay Thakur


Machine Learning With Python

(KMBA352)

PROJECT FILE

Master of Business Administration, Business Analytics

Dr. A.P.J. Abdul Kalam Technical University, Lucknow

Vinay Kumar

2301921570138

Under the Guidance of

Ms. Rekha

S.No.  PROGRAMME                                                   PAGE NO.  DATE        SIGN

1      Write a Python program to load the iris data from a given   4         24-09-2024
       CSV file into a data frame and print the shape of the
       data, the type of the data and the first 3 rows.

2      Write a Python program using Scikit-learn to print the      5-6       26-09-2024
       keys, number of rows and columns, feature names and the
       description of the Iris data.

3      Write a Python program to split the iris dataset into its   7         30-09-2024
       attributes (X) and labels (y). The X variable contains
       the first four columns (i.e. attributes), and y contains
       the labels of the dataset.

4      Write a Python program to draw a scatterplot, then add a    8-9       03-10-2024
       joint density estimate to describe individual
       distributions on the same plot between Sepal length and
       Sepal width.

5      Write a Python program using Scikit-learn to split the      10-11     08-10-2024
       iris dataset into 70% train data and 30% test data. Out
       of total 150 records, the training set will contain 120
       records and the test set contains 30 of those records.
       Print both datasets.

6      Implement and demonstrate any suitable algorithm for        12-13     11-10-2024
       finding the most specific hypothesis based on a given
       set of training data samples. Read the training data
       from a .CSV file.

7      For a given set of training data examples stored in a       14-15     15-10-2024
       .CSV file, implement and demonstrate the
       Candidate-Elimination algorithm to output a description
       of the set of all hypotheses consistent with the
       training examples.

8      Write a program to demonstrate the working of the           16-17     18-10-2024
       decision tree using any suitable algorithm. Use an
       appropriate data set for building the decision tree and
       apply this knowledge to classify a new sample.

9      Build an Artificial Neural Network by implementing the      18-19     05-11-2024
       Backpropagation algorithm.

10     Write a program to implement the naïve Bayesian             20-21     07-11-2024
       classifier for a sample training data set stored as a
       .CSV file.

11     Write a program to construct a Bayesian network             22-23     12-11-2024
       considering medical data. Use this model to demonstrate
       the diagnosis of heart patients using standard Heart
       Disease Data Set.

12     Apply any suitable algorithm to cluster a set of data       24-26     14-11-2024
       stored in a .CSV file. Use the same data set for
       clustering using k-Means algorithm. Compare the results
       of these two algorithms and comment on the quality of
       clustering.

13     Write a program to implement k-Nearest Neighbor             27-28     18-11-2024
       algorithm to classify the iris data set.

14     Implement the non-parametric Regression algorithm in        29-30     25-11-2024
       order to fit data points. Select appropriate data set
       for your experiment and draw graphs.

15     Write a Python program to get the accuracy of the           31-32     27-11-2024
       Logistic Regression.

Q1. Write a Python program to load the iris data from a
given CSV file into a data frame and print the shape of
the data, the type of the data and the first 3 rows.

In [7]: import pandas as pd

# Load iris data into dataframe

df = pd.read_csv('iris.csv')

# Print shape, type and first 3 rows

print(df.shape)
print(type(df))
print(df.head(3))

(150, 5)
<class 'pandas.core.frame.DataFrame'>
   sepal.length  sepal.width  petal.length  petal.width variety
0           5.1          3.5           1.4          0.2  Setosa
1           4.9          3.0           1.4          0.2  Setosa
2           4.7          3.2           1.3          0.2  Setosa

Q2. Write a Python program using Scikit-learn to print
the keys, number of rows and columns, feature names and
the description of the Iris data.

In [8]: from sklearn.datasets import load_iris

iris = load_iris()

# Print keys of iris dict

print(iris.keys())

# Print number of rows and columns

print(iris.data.shape)

# Print feature names

print(iris.feature_names)

# Print data description

print(iris.DESCR)

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename', 'data_module'])
(150, 4)
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
.. _iris_dataset:

Iris plants dataset

Data Set Characteristics:

:Number of Instances: 150 (50 in each of three classes)
:Number of Attributes: 4 numeric, predictive attributes and the class
:Attribute Information:
    - sepal length in cm
    - sepal width in cm
    - petal length in cm
    - petal width in cm
    - class:
        - Iris-Setosa
        - Iris-Versicolour
        - Iris-Virginica

:Summary Statistics:

============== ==== ==== ======= ===== ====================
                Min  Max   Mean    SD   Class Correlation
============== ==== ==== ======= ===== ====================
sepal length:   4.3  7.9   5.84   0.83    0.7826
sepal width:    2.0  4.4   3.05   0.43   -0.4194
petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)
============== ==== ==== ======= ===== ====================

Q3. Write a Python program to split the iris dataset into
its attributes (X) and labels (y). The X variable contains
the first four columns (i.e. attributes) and y contains the
labels of the dataset

In [9]: from sklearn.datasets import load_iris

iris = load_iris()

# Get the Iris data attributes
X = iris.data

# Get the Iris labels (0, 1, 2)
y = iris.target

print("Attributes shape: ", X.shape)
print("Labels shape:", y.shape)

Attributes shape: (150, 4)
Labels shape: (150,)
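
When the data comes from a CSV file rather than from Scikit-learn, the same split can be made by column position. The following is a minimal sketch, assuming a DataFrame laid out like the one in Q1 (four measurement columns followed by one label column). The three rows are copied from the Q1 output purely for illustration.

```python
import pandas as pd

# Three rows copied from the Q1 output; column names follow that CSV.
df = pd.DataFrame(
    {
        "sepal.length": [5.1, 4.9, 4.7],
        "sepal.width": [3.5, 3.0, 3.2],
        "petal.length": [1.4, 1.4, 1.3],
        "petal.width": [0.2, 0.2, 0.2],
        "variety": ["Setosa", "Setosa", "Setosa"],
    }
)

# The first four columns are the attributes, the fifth is the label.
X = df.iloc[:, :4].values
y = df.iloc[:, 4].values

print("Attributes shape:", X.shape)
print("Labels shape:", y.shape)
```

Selecting by position with iloc keeps the code independent of the exact column names used in the file.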

Q4. Write a Python program to draw a scatterplot, then
add a joint density estimate to describe individual
distributions on the same plot between Sepal length and
Sepal width.

In [10]: import seaborn as sns
import matplotlib.pyplot as plt

# Load the Iris dataset from seaborn
iris = sns.load_dataset('iris')

# Create a scatterplot with a joint density estimate
sns.scatterplot(x='sepal_length', y='sepal_width', data=iris, color='blue')
sns.kdeplot(x='sepal_length', y='sepal_width', data=iris, color='red', fill=True)  # any further arguments are cut off in the source

# Show the plot
plt.title('Scatterplot with Joint Density Estimate')
plt.show()

Q5. Write a Python program using Scikit-learn to split
the iris dataset into 70% train data and 30% test data.
Out of total 150 records, the training set will contain
120 records and the test set contains 30 of those
records. Print both datasets.
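
The record counts quoted in the task do not match a 70/30 split: 70% of 150 records is 105, while 120 and 30 records correspond to an 80/20 split. The following is a minimal sketch that checks both cases. It uses a placeholder array of 150 row indices (an assumption for illustration) in place of the real data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rows = np.arange(150)  # placeholder for the 150 iris records

# 70% train / 30% test
train_70, test_30 = train_test_split(rows, test_size=0.3, random_state=42)

# 80% train / 20% test
train_80, test_20 = train_test_split(rows, test_size=0.2, random_state=42)

print(len(train_70), len(test_30))  # 105 45
print(len(train_80), len(test_20))  # 120 30
```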

In [11]: from sklearn.model_selection import train_test_split
import seaborn as sns

# Load the Iris dataset from seaborn
iris = sns.load_dataset('iris')

# Split the dataset into 70% train and 30% test
train_data, test_data = train_test_split(iris, test_size=0.3, random_state=42)

# Display the shapes of the training and test sets
print(f"Training Set Shape: {train_data.shape}")
print(f"Test Set Shape: {test_data.shape}")

# Display the training set
print("\nTraining Set:")
print(train_data)

# Display the test set
print("\nTest Set:")
print(test_data)

Training Set Shape: (105, 5)
Test Set Shape: (45, 5)

Training Set:
     sepal_length  sepal_width  petal_length  petal_width     species
81            5.5          2.4           3.7          1.0  versicolor
16            5.4          3.9           1.3          0.4      setosa
10            5.4          3.7           1.5          0.2      setosa
133           6.3          2.8           5.1          1.5   virginica
137           6.4          3.1           5.5          1.8   virginica
75            6.6          3.0           4.4          1.4  versicolor
109           7.2          3.6           6.1          2.5   virginica
..            ...          ...           ...          ...         ...
71            6.1          2.8           4.0          1.3  versicolor
106           4.9          2.5           4.5          1.7   virginica
14            5.8          4.0           1.2          0.2      setosa
92            5.8          2.6           4.0          1.2  versicolor
102           7.1          3.0           5.9          2.1   virginica

[105 rows x 5 columns]

Test Set:
     sepal_length  sepal_width  petal_length  petal_width     species
73            6.1          2.8           4.7          1.2  versicolor
18            5.7          3.8           1.7          0.3      setosa
118           7.7          2.6           6.9          2.3   virginica
78            6.0          2.9           4.5          1.5  versicolor
76            6.8          2.8           4.8          1.4  versicolor
31            5.4          3.4           1.5          0.4      setosa
64            5.6          2.9           3.6          1.3  versicolor
141           6.9          3.1           5.1          2.3   virginica
68            6.2          2.2           4.5          1.5  versicolor
82            5.8          2.7           3.9          1.2  versicolor
110           6.5          3.2           5.1          2.0   virginica
12            4.8          3.0           1.4          0.1      setosa
36            5.5          3.5           1.3          0.2      setosa
9             4.9          3.1           1.5          0.1      setosa
19            5.1          3.8           1.5          0.3      setosa
56            6.3          3.3           4.7          1.6  versicolor
104           6.5          3.0           5.8          2.2   virginica
69            5.6          2.5           3.9          1.1  versicolor
55            5.7          2.8           4.5          1.3  versicolor
132           6.4          2.8           5.6          2.2   virginica
29            4.7          3.2           1.6          0.2      setosa
127           6.1          3.0           4.9          1.8   virginica
26            5.0          3.4           1.6          0.4      setosa
128           6.4          2.8           5.6          2.1   virginica
131           7.9          3.8           6.4          2.0   virginica
145           6.7          3.0           5.2          2.3   virginica
108           6.7          2.5           5.8          1.8   virginica
143           6.8          3.2           5.9          2.3   virginica
45            4.8          3.0           1.4          0.3      setosa
30            4.8          3.1           1.6          0.2      setosa
22            4.6          3.6           1.0          0.2      setosa
15            5.7          4.4           1.5          0.4      setosa

Q6. Implement and demonstrate any suitable algorithm
for finding the most specific hypothesis based on a given
set of training data samples. Read the training data from
a .CSV file.

In [14]: import pandas as pd

def find_most_specific_hypothesis(training_data):
    # Check if there are positive examples
    positive_examples = training_data[training_data['label'] == 'Y']

    if positive_examples.empty:
        print("No positive examples in the training data. Setting the hypothesis to a default value.")
        # Set the hypothesis to a default value, such as all '?'
        return ['?'] * (len(training_data.columns) - 1)

    # Initialize the hypothesis with the first positive example
    hypothesis = positive_examples.iloc[0, :-1].tolist()

    # Refine the hypothesis based on positive examples
    for index, row in training_data.iterrows():
        if row['label'] == 'Y':
            for i in range(len(hypothesis)):
                # Compare by position; plain row[i] relies on deprecated integer fallback
                if hypothesis[i] != row.iloc[i]:
                    hypothesis[i] = '?'

    return hypothesis

# Use the Iris dataset as an example
# The rest of the URL and any further arguments are truncated in the source
iris = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/iris/...')
iris.columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'label']

# Display the training data
print("Training Data:")
print(iris)

# Apply the algorithm to find the most specific hypothesis
specific_hypothesis = find_most_specific_hypothesis(iris)

# Display the most specific hypothesis
print("\nMost Specific Hypothesis:")
print(specific_hypothesis)

Training Data:
     sepal_length  sepal_width  petal_length  petal_width           label
0             5.1          3.5           1.4          0.2     Iris-setosa
1             4.9          3.0           1.4          0.2     Iris-setosa
2             4.7          3.2           1.3          0.2     Iris-setosa
3             4.6          3.1           1.5          0.2     Iris-setosa
4             5.0          3.6           1.4          0.2     Iris-setosa
..            ...          ...           ...          ...             ...
145           6.7          3.0           5.2          2.3  Iris-virginica
146           6.3          2.5           5.0          1.9  Iris-virginica
147           6.5          3.0           5.2          2.0  Iris-virginica
148           6.2          3.4           5.4          2.3  Iris-virginica
149           5.9          3.0           5.1          1.8  Iris-virginica

[150 rows x 5 columns]

No positive examples in the training data. Setting the hypothesis to a default value.

Most Specific Hypothesis:
['?', '?', '?', '?']
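
The run above finds no positive examples because the function looks for the label 'Y', while the Iris labels are species names. The following is a minimal sketch of the same idea on data where it applies. It assumes a small hypothetical table of categorical attributes with a Y/N label column; the attribute names, the rows and the helper name find_s are illustrative and not taken from the source. With no positive example the sketch returns the all-'0' hypothesis (matches nothing), which is the most specific one, rather than all '?'.

```python
import pandas as pd

def find_s(training_data, label_column="label", positive="Y"):
    """Return the most specific hypothesis that covers every positive row."""
    attributes = [c for c in training_data.columns if c != label_column]
    positives = training_data[training_data[label_column] == positive]
    if positives.empty:
        # No positive example: the most specific hypothesis matches nothing.
        return ["0"] * len(attributes)
    # Start from the first positive example, then generalise attribute by attribute.
    hypothesis = positives.iloc[0][attributes].tolist()
    for _, row in positives.iterrows():
        for i, attribute in enumerate(attributes):
            if hypothesis[i] != row[attribute]:
                hypothesis[i] = "?"
    return hypothesis

# Hypothetical training data (not from the source)
toy = pd.DataFrame(
    [
        ["sunny", "warm", "normal", "strong", "Y"],
        ["sunny", "warm", "high", "strong", "Y"],
        ["rainy", "cold", "high", "strong", "N"],
        ["sunny", "warm", "high", "weak", "Y"],
    ],
    columns=["sky", "temp", "humidity", "wind", "label"],
)

print(find_s(toy))  # ['sunny', 'warm', '?', '?']
```

Negative rows are ignored by design: this procedure only generalises from positive examples.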

Q7. For a given set of training data examples stored in a
.CSV file, implement and demonstrate the Candidate-
Elimination algorithm to output a description of the set
of all hypotheses consistent with the training examples

In [15]:
def initialize_hypothesis(data):
    ...
    return hypothesis

def is_consistent(instance, hypothesis):
    ...

def candidate_elimination(training_data):
    hypothesis_space = initialize_hypothesis(training_data)
    ...
                hypothesis_space[0][i] = instance[i]
            elif hypothesis_space[0][i] != instance[i]:
                ...
                hypothesis_space[1][i] = instance[i]
    ...
                if instance[i] != hypothesis_space[1][i]:
                    ...
    return hypothesis_space


# The rest of the URL and any further arguments are truncated in the source
iris = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/iris/...')
iris.columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'label']

print("Training Data:")
print(iris)

hypotheses = candidate_elimination(iris)

Training Data:
     sepal_length  sepal_width  petal_length  petal_width           label
0             5.1          3.5           1.4          0.2     Iris-setosa
1             4.9          3.0           1.4          0.2     Iris-setosa
2             4.7          3.2           1.3          0.2     Iris-setosa
3             4.6          3.1           1.5          0.2     Iris-setosa
4             5.0          3.6           1.4          0.2     Iris-setosa
..            ...          ...           ...          ...             ...
145           6.7          3.0           5.2          2.3  Iris-virginica
146           6.3          2.5           5.0          1.9  Iris-virginica
147           6.5          3.0           5.2          2.0  Iris-virginica
148           6.2          3.4           5.4          2.3  Iris-virginica
149           5.9          3.0           5.1          1.8  Iris-virginica

[150 rows x 5 columns]

Set of Hypotheses Consistent with Training Examples:
['0', '0', '0', '0']
['?', '?', '?', '?']
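
The boundaries printed above are still the starting values: '0' for the most specific boundary and '?' for the most general one. The following is a simplified sketch of Candidate-Elimination on a small hypothetical table of categorical attributes, to show how the two boundaries move. It is not the code from this file. It assumes noise-free data, a first example that is positive, and a single conjunctive hypothesis on the specific boundary; the example rows and function names are illustrative.

```python
def covers(hypothesis, instance):
    """True if the hypothesis accepts the instance."""
    return all(h == "?" or h == x for h, x in zip(hypothesis, instance))

def candidate_elimination(examples):
    """examples: list of (attribute_tuple, is_positive). Returns (S, G)."""
    n = len(examples[0][0])
    S = ["0"] * n      # specific boundary, kept as one conjunctive hypothesis
    G = [["?"] * n]    # general boundary, a list of hypotheses
    for instance, is_positive in examples:
        if is_positive:
            # Drop general hypotheses that reject the positive example.
            G = [g for g in G if covers(g, instance)]
            # Minimally generalise S so that it accepts the example.
            S = [x if s == "0" else (s if s == x else "?")
                 for s, x in zip(S, instance)]
        else:
            new_G = []
            for g in G:
                if not covers(g, instance):
                    new_G.append(g)  # already rejects the negative example
                    continue
                # Minimally specialise g with values of S that differ from the instance.
                for i in range(n):
                    if g[i] == "?" and S[i] not in ("?", "0") and S[i] != instance[i]:
                        specialised = g.copy()
                        specialised[i] = S[i]
                        new_G.append(specialised)
            G = new_G
    return S, G

# Hypothetical training examples: (sky, temp, humidity, wind, water, forecast)
examples = [
    (("sunny", "warm", "normal", "strong", "warm", "same"), True),
    (("sunny", "warm", "high", "strong", "warm", "same"), True),
    (("rainy", "cold", "high", "strong", "warm", "change"), False),
    (("sunny", "warm", "high", "strong", "cool", "change"), True),
]

S, G = candidate_elimination(examples)
print("S:", S)
print("G:", G)
```

Every hypothesis between S and G (at least as general as S, at least as specific as some member of G) is consistent with the four examples.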

Q8. Write a program to demonstrate the working of the
decision tree using any suitable algorithm. Use an
appropriate data set for building the decision tree and
apply this knowledge to classify a new sample

In [16]:
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

clf = DecisionTreeClassifier(random_state=42)

y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

predicted_class = clf.predict(new_sample)
class_name = iris.target_names[predicted_class][0]

print(f"\nNew Sample Classification:")
print(f"Predicted Class: {class_name}")

Accuracy on Test Set: 100.00%

New Sample Classification:
Predicted Class: setosa
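
The cell above omits the data loading, the split, the fit and the definition of the new sample. The following is a minimal end-to-end sketch of the same workflow. The 80/20 split, the random seed and the measurements of the new sample are assumptions, not values from the source.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# Assumed split: 80% train, 20% test
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Accuracy on Test Set: {accuracy * 100:.2f}%")

# Hypothetical new sample: sepal length, sepal width, petal length, petal width (cm)
new_sample = [[5.1, 3.5, 1.4, 0.2]]
predicted_class = clf.predict(new_sample)
class_name = iris.target_names[predicted_class][0]
print(f"Predicted Class: {class_name}")
```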

Q9. Build an Artificial Neural Network by implementing the
Backpropagation algorithm

In [17]: import numpy as np

class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        # Initialize weights and biases
        self.weights_input_hidden = np.random.rand(input_size, hidden_size)
        self.bias_hidden = np.zeros((1, hidden_size))
        self.weights_hidden_output = np.random.rand(hidden_size, output_size)
        self.bias_output = np.zeros((1, output_size))

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(self, x):
        # x is already a sigmoid output, so the derivative is x * (1 - x)
        return x * (1 - x)

    def forward(self, inputs):
        # Forward pass
        self.hidden_layer_activation = np.dot(inputs, self.weights_input_hidden) + self.bias_hidden
        self.hidden_layer_output = self.sigmoid(self.hidden_layer_activation)
        self.output_activation = np.dot(self.hidden_layer_output, self.weights_hidden_output) + self.bias_output
        self.predicted_output = self.sigmoid(self.output_activation)
        return self.predicted_output

    def backward(self, inputs, targets, learning_rate):
        # Backward pass
        error = targets - self.predicted_output

        # Output layer
        output_delta = error * self.sigmoid_derivative(self.predicted_output)
        hidden_error = output_delta.dot(self.weights_hidden_output.T)

        # Update weights and biases
        self.weights_hidden_output += self.hidden_layer_output.T.dot(output_delta) * learning_rate
        self.bias_output += np.sum(output_delta, axis=0, keepdims=True) * learning_rate

        # Hidden layer
        hidden_delta = hidden_error * self.sigmoid_derivative(self.hidden_layer_output)
        self.weights_input_hidden += inputs.T.dot(hidden_delta) * learning_rate
        # Assumed: the hidden-bias update is missing in the source
        self.bias_hidden += np.sum(hidden_delta, axis=0, keepdims=True) * learning_rate

    def train(self, inputs, targets, epochs, learning_rate):
        # Loop headers reconstructed from the surviving lines and the call below
        for epoch in range(epochs):
            for i in range(len(inputs)):
                input_data = np.array([inputs[i]])
                target_data = np.array([targets[i]])
                self.forward(input_data)
                self.backward(input_data, target_data, learning_rate)

    def predict(self, inputs):
        return self.forward(inputs)

# XOR data, inferred from the printed inputs and outputs
inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
targets = np.array([[0], [1], [1], [0]])

nn = NeuralNetwork(input_size=2, hidden_size=2, output_size=1)
nn.train(inputs, targets, epochs=10000, learning_rate=0.1)

for i in range(len(inputs)):
    prediction = nn.predict(np.array([inputs[i]]))
    print(f"Input: {inputs[i]}, Predicted Output: {prediction}")

Input: [0 0], Predicted Output: [[0.05346176]]
Input: [0 1], Predicted Output: [[0.95140656]]
Input: [1 0], Predicted Output: [[0.95124283]]
Input: [1 1], Predicted Output: [[0.05207599]]

Q10. Write a program to implement the naïve Bayesian
classifier for a sample training data set stored as a
.CSV file.

In [18]:
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

data = pd.DataFrame(data=iris.data, columns=iris.feature_names)

nb_classifier = GaussianNB()
nb_classifier.fit(X_train, y_train)

y_pred = nb_classifier.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)

confusion_mat = confusion_matrix(y_test, y_pred)
classification_rep = classification_report(y_test, y_pred)

print(f'Accuracy: {accuracy}')
print(f'Confusion Matrix:\n{confusion_mat}')
print(f'Classification Report:\n{classification_rep}')

Accuracy: 1.0
Confusion Matrix:
[[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]
Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30
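
The cell above does not show how the data is read from a CSV file or how it is split. The following is a minimal sketch of that path. An in-memory buffer stands in for a CSV file on disk, and the 80/20 split and the seed are assumptions.

```python
import io

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Stand-in for a CSV file: write the iris frame to an in-memory buffer.
buffer = io.StringIO()
load_iris(as_frame=True).frame.to_csv(buffer, index=False)
buffer.seek(0)

# With a real file this would be pd.read_csv("<path to your file>.csv")
data = pd.read_csv(buffer)
X = data.drop(columns="target")
y = data["target"]

# Assumed split: 80% train, 20% test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

nb_classifier = GaussianNB()
nb_classifier.fit(X_train, y_train)

accuracy = accuracy_score(y_test, nb_classifier.predict(X_test))
print(f"Accuracy: {accuracy}")
```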

Q11. Write a program to construct a Bayesian network
considering medical data. Use this model to
demonstrate the diagnosis of heart patients using
standard Heart Disease Data Set.

In [1]:

    age  Gender  Family  diet  Lifestyle  cholestrol  heartdisease
0     0       0       1     1          3           0             1
1     0       1       1     1          3           0             1
2     1       0       0     0          2           1             1
3     4       0       1     1          3           2             0
4     3       1       1     0          0           2             0
5     2       0       1     1          1           0             1
6     4       0       1     0          2           0             1
7     0       0       1     1          3           0             1
8     3       1       1     0          0           2             0
9     1       1       0     0          0           2             1
10    4       1       0     1          2           0             1
11    4       0       1     1          3           2             0
12    2       1       0     0          0           0             0
13    2       0       1     1          1           0             1
14    3       1       1     0          0           1             0
15    0       0       1     0          0           2             1
16    1       1       0     1          2           1             1
17    3       1       1     1          0           1             0
18    4       0       1     1          3           2             0
For Age enter SuperSeniorCitizen:0, SeniorCitizen:1, MiddleAged:2, Youth:3, Teen:4
For Gender enter Male:0, Female:1
For Family History enter Yes:1, No:0
For Diet enter High:0, Medium:1
for LifeStyle enter Athlete:0, Active:1, Moderate:2, Sedentary:3
for Cholesterol enter High:0, BorderLine:1, Normal:2
Enter Age: 0
Enter Gender: 0
Enter Family History: 0
Enter Diet: 0
Enter Lifestyle: 3
Enter Cholestrol: 0
+-----------------+---------------------+
| heartdisease    |   phi(heartdisease) |
+=================+=====================+
| heartdisease(0) |              0.5000 |
+-----------------+---------------------+
| heartdisease(1) |              0.5000 |
+-----------------+---------------------+
Finding Elimination Order: : : 0it [00:00, ?it/s]
0it [00:00, ?it/s]

Q12. Apply any suitable algorithm to cluster a set of
data stored in a .CSV file. Use the same data set for
clustering using k-Means algorithm. Compare the results
of these two algorithms and comment on the quality of
clustering.

In [23]:
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import silhouette_score

data = pd.DataFrame(data=iris.data, columns=iris.feature_names)

scaled_data = scaler.fit_transform(data)

kmeans = KMeans(n_clusters=3, random_state=42)
kmeans_labels = kmeans.fit_predict(scaled_data)

hierarchical = AgglomerativeClustering(n_clusters=3)
hierarchical_labels = hierarchical.fit_predict(scaled_data)

data['KMeans_Cluster'] = kmeans_labels
data['Hierarchical_Cluster'] = hierarchical_labels

plt.figure(figsize=(12, 6))

# Any arguments after hue are cut off in the source
sns.scatterplot(data=data, x='sepal length (cm)', y='sepal width (cm)', hue='KMeans_Cluster')
plt.title('K-Means Clustering')

# Any arguments after hue are cut off in the source
sns.scatterplot(data=data, x='sepal length (cm)', y='sepal width (cm)', hue='Hierarchical_Cluster')
plt.title('Hierarchical Clustering')

kmeans_silhouette = silhouette_score(scaled_data, kmeans_labels)
hierarchical_silhouette = silhouette_score(scaled_data, hierarchical_labels)

print(f"Silhouette Score - K-Means: {kmeans_silhouette}")
print(f"Silhouette Score - Hierarchical: {hierarchical_silhouette}")

C:\ProgramData\anaconda3\Lib\site-packages\sklearn\cluster\_kmeans.py:1412: FutureWarning: The default value of `n_init` will change from 10 to 'auto' in 1.4. Set the value of `n_init` explicitly to suppress the warning
  super()._check_params_vs_input(X, default_n_init=10)
C:\ProgramData\anaconda3\Lib\site-packages\sklearn\cluster\_kmeans.py:1436: UserWarning: KMeans is known to have a memory leak on Windows with MKL, when there are less chunks than available threads. You can avoid it by setting the environment variable OMP_NUM_THREADS=1.
  warnings.warn(

Silhouette Score - K-Means: 0.45994823920518635
Silhouette Score - Hierarchical: 0.446689041028591
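
The silhouette score measures how compact and well separated the clusters are without using any reference labels. Because the Iris species are known, the adjusted Rand index (ARI) can serve as an additional, external check on clustering quality. The following is a minimal sketch under that assumption; the ARI comparison is an addition and not part of the original program. Passing n_init=10 explicitly is also one way to avoid the FutureWarning shown above.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

iris = load_iris()
scaled_data = StandardScaler().fit_transform(iris.data)

kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(scaled_data)
hierarchical_labels = AgglomerativeClustering(n_clusters=3).fit_predict(scaled_data)

# ARI is 1.0 for identical partitions and close to 0.0 for random ones.
ari_kmeans = adjusted_rand_score(iris.target, kmeans_labels)
ari_hierarchical = adjusted_rand_score(iris.target, hierarchical_labels)
ari_between = adjusted_rand_score(kmeans_labels, hierarchical_labels)

print(f"ARI - K-Means vs species: {ari_kmeans}")
print(f"ARI - Hierarchical vs species: {ari_hierarchical}")
print(f"ARI - K-Means vs Hierarchical: {ari_between}")
```

The third value shows how far the two algorithms agree with each other, independent of the species labels.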

Q13. Write a program to implement k-Nearest Neighbor
algorithm to classify the iris data set

In [29]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

data = pd.DataFrame(data=iris.data, columns=iris.feature_names)

X_train_scaled = scaler.fit_transform(X_train)

knn_classifier = KNeighborsClassifier(n_neighbors=k_value)
knn_classifier.fit(X_train_scaled, y_train)

y_pred = knn_classifier.predict(X_test_scaled)

accuracy = accuracy_score(y_test, y_pred)
confusion_mat = confusion_matrix(y_test, y_pred)
classification_rep = classification_report(y_test, y_pred)

print(f'Accuracy: {accuracy}')
print(f'Confusion Matrix:\n{confusion_mat}')
print(f'Classification Report:\n{classification_rep}')

Accuracy: 1.0
Confusion Matrix:
[[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]
Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30
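
The cell above does not show the split, the scaler, the test-set scaling or the value of k. The following is a minimal sketch of the full sequence. The 80/20 split, the seed and k = 3 are assumptions. The scaler is fitted on the training rows only and then reused for the test rows, so that no test-set statistics reach the model.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

iris = load_iris()

# Assumed split: 80% train, 20% test
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Fit the scaler on the training rows, then apply the same scaling to the test rows.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

k_value = 3  # assumed; the source does not show the value it used
knn_classifier = KNeighborsClassifier(n_neighbors=k_value)
knn_classifier.fit(X_train_scaled, y_train)

accuracy = accuracy_score(y_test, knn_classifier.predict(X_test_scaled))
print(f"Accuracy: {accuracy}")
```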

Q14. Implement the non-parametric Regression
algorithm in order to fit data points. Select appropriate
data set for your experiment and draw graphs

In [22]: import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsRegressor

# Generate synthetic data
np.random.seed(42)
X = np.sort(5 * np.random.rand(80, 1), axis=0)
y = np.sin(X).ravel() + np.random.normal(0, 0.1, X.shape[0])

# Fit k-NN regression model
k_value = 5
knn_regressor = KNeighborsRegressor(n_neighbors=k_value)
knn_regressor.fit(X, y)

# Predict using the model
X_test = np.arange(0.0, 5.0, 0.01)[:, np.newaxis]
y_pred = knn_regressor.predict(X_test)

# Plot the results
plt.scatter(X, y, color='darkorange', label='data')
plt.plot(X_test, y_pred, color='navy', label='prediction', linewidth=2)
plt.xlabel('data')
plt.ylabel('target')
plt.title('k-NN Regression')
plt.legend()
plt.show()

Q15. Write a Python program to get the accuracy of the
Logistic Regression.
In [21]:
from warnings import simplefilter

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.exceptions import ConvergenceWarning

simplefilter("ignore", category=ConvergenceWarning)

data = pd.DataFrame(data=iris.data, columns=iris.feature_names)

X_scaled = scaler.fit_transform(X)

logistic_regression_model = LogisticRegression(max_iter=1000)

logistic_regression_model.fit(X_train, y_train)

y_pred = logistic_regression_model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)

print(f'Accuracy: {accuracy}')

Accuracy: 1.0
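
The cell above does not show the data loading, the scaler or the split. The following is a minimal sketch of one way to put the pieces together. It uses a Scikit-learn pipeline so that scaling is learned from the training rows only. The pipeline, the 80/20 split and the seed are assumptions rather than the original code.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Assumed split: 80% train, 20% test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scaling and the classifier are fitted together on the training rows.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Accuracy: {accuracy}")
```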
