Machine Intelligence
Semester 2, Part 1
Year 2023-2024
By
Mr. RAJIGARE PRATHAMESH ARUN
Practical 8: Decision Tree
import numpy as np
import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

# Generate a synthetic dataset of n points with two features and a linear target
def generate_dataset(n):
    x = []
    y = []
    random_x1 = np.random.rand()
    random_x2 = np.random.rand()
    for i in range(n):
        x1 = i
        x2 = i/2 + np.random.rand()*n
        x.append([1, x1, x2])
        y.append(random_x1 * x1 + random_x2 * x2 + 1)
    return np.array(x), np.array(y)

x, y = generate_dataset(200)

# Plot the generated points in 3D
mpl.rcParams['legend.fontsize'] = 12
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(x[:, 1], x[:, 2], y, label='y', s=5)
ax.legend()
ax.view_init(45, 0)
plt.show()
Output:
# Standardize features
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
Output:
Confusion Matrix:
[[36 13]
 [11 29]]

Classification Report:
              precision    recall  f1-score   support

    accuracy                           0.73        89
   macro avg       0.73      0.73      0.73        89
weighted avg       0.73      0.73      0.73        89
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

irisData = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    irisData.data, irisData.target, test_size=0.2, random_state=42)

neighbors = np.arange(1, 9)
train_accuracy = np.empty(len(neighbors))
test_accuracy = np.empty(len(neighbors))

# Fit a k-NN classifier for each value of k and record train/test accuracy
for i, k in enumerate(neighbors):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    train_accuracy[i] = knn.score(X_train, y_train)
    test_accuracy[i] = knn.score(X_test, y_test)

# Generate plot
plt.plot(neighbors, test_accuracy, label='Testing dataset Accuracy')
plt.plot(neighbors, train_accuracy, label='Training dataset Accuracy')
plt.legend()
plt.xlabel('n_neighbors')
plt.ylabel('Accuracy')
plt.show()
Example of how bootstrap samples are created and used to estimate a statistic
of interest.
Let's say we have a small dataset of 5 observations:
Original Data: [3, 4, 5, 6, 7]
Create bootstrap samples by resampling with replacement:
We'll create 3 bootstrap samples of size 5 by randomly drawing observations from
the original data with replacement.
Each bootstrap sample will have the same size as the original dataset.
Bootstrap Sample 1: [5, 6, 3, 4, 7]
Bootstrap Sample 2: [4, 3, 6, 4, 6]
Bootstrap Sample 3: [7, 5, 7, 3, 4]
Calculate the statistic of interest (median) for each bootstrap sample:
Bootstrap Sample 1 median: 5
Bootstrap Sample 2 median: 4
Bootstrap Sample 3 median: 5
Repeat steps 1 and 2 many times (e.g., 10,000 times):
By repeating the process of creating bootstrap samples and calculating the median,
we can build an empirical sampling distribution of the median.
Use the empirical sampling distribution to calculate confidence intervals or perform
hypothesis tests:
For example, if we want to construct a 95% confidence interval for the median, we
can find the 2.5th and 97.5th percentiles of the empirical sampling distribution of the
median.
Let's say the 2.5th percentile is 4, and the 97.5th percentile is 6.
Then, the 95% confidence interval for the median would be [4, 6].
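The same procedure can be written directly in Python; the following is a minimal sketch with NumPy, using the small dataset above. Because resampling is random, the exact bootstrap samples and percentiles will vary from run to run.

import numpy as np

data = np.array([3, 4, 5, 6, 7])
n_bootstrap = 10000

# Draw bootstrap samples (resampling with replacement) and record each median
medians = np.empty(n_bootstrap)
for b in range(n_bootstrap):
    sample = np.random.choice(data, size=len(data), replace=True)
    medians[b] = np.median(sample)

# 95% confidence interval from the 2.5th and 97.5th percentiles
lower, upper = np.percentile(medians, [2.5, 97.5])
print("95% CI for the median: [{}, {}]".format(lower, upper))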
Let's say we have a small sample of data representing the heights (in inches) of 10
individuals:
Heights = [65.2, 67.1, 68.5, 69.3, 70.0, 71.2, 72.4, 73.1, 74.5, 75.8]
We want to estimate the 95% confidence interval for the mean height in the
population using bootstrapping.
Here are the steps we would follow:
Calculate the sample mean from the original data:
Sample mean = (65.2 + 67.1 + 68.5 + 69.3 + 70.0 + 71.2 + 72.4 + 73.1 + 74.5 +
75.8) / 10 = 70.71 inches
Create a large number of bootstrap samples from the original data by resampling
with replacement. For example, let's create 10,000 bootstrap samples, each of size
10.
For each bootstrap sample, calculate the mean height.
After computing the means for all 10,000 bootstrap samples, we now have an
empirical bootstrap sampling distribution of the mean.
From this empirical bootstrap sampling distribution, we can determine the 95%
confidence interval by finding the 2.5th and 97.5th percentiles of the distribution.
Let's say the 2.5th percentile is 69.8 inches, and the 97.5th percentile is 71.6 inches.
Then, the 95% confidence interval for the mean height is [69.8, 71.6] inches.
This confidence interval means that if we were to repeat the process of taking a
sample of size 10 and constructing a bootstrap confidence interval many times, 95%
of those intervals would contain the true population mean height.
The key advantage of bootstrapping in this example is that it does not require any
assumptions about the underlying distribution of heights in the population. It relies
solely on the information contained in the original sample data.
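A corresponding sketch for the height example, again assuming 10,000 bootstrap samples; the computed percentiles will differ slightly from run to run and need not match the illustrative values above exactly.

import numpy as np

heights = np.array([65.2, 67.1, 68.5, 69.3, 70.0, 71.2, 72.4, 73.1, 74.5, 75.8])
print("Sample mean:", heights.mean())  # 70.71 inches

# Bootstrap sampling distribution of the mean
n_bootstrap = 10000
boot_means = np.array([
    np.random.choice(heights, size=len(heights), replace=True).mean()
    for _ in range(n_bootstrap)
])

# 95% percentile confidence interval for the mean height
lower, upper = np.percentile(boot_means, [2.5, 97.5])
print("95% CI for the mean height: [{:.1f}, {:.1f}] inches".format(lower, upper))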
Practical 6
For a given data set, split the data into training and testing sets and fit the following on the training set: (i) Linear model using least squares (ii) Ridge regression model (iii) Lasso model (iv) PCR model (v) PLS model
i) Linear model using least squares
from sklearn.linear_model import LinearRegression

# Model
lr = LinearRegression()
# Fit model on the training set
lr.fit(X_train, y_train)
# Predict on the test set
prediction = lr.predict(X_test)
# Actual test values
actual = y_test
ii) Ridge regression model
from sklearn.linear_model import Ridge
ridgeReg = Ridge(alpha=0.05)  # alpha chosen for illustration
ridgeReg.fit(X_train, y_train)
train_score_ridge = ridgeReg.score(X_train, y_train)
test_score_ridge = ridgeReg.score(X_test, y_test)
print("\nRidge Model............................................\n")
print("The train score for ridge model is {}".format(train_score_ridge))
print("The test score for ridge model is {}".format(test_score_ridge))
Output:
iii) Lasso model
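The Lasso fit itself is not reproduced in this extract; below is a minimal sketch, assuming the same X_train/y_train split used above and an illustrative alpha value.

from sklearn.linear_model import Lasso

lassoReg = Lasso(alpha=0.05)  # alpha chosen for illustration
lassoReg.fit(X_train, y_train)

print("\nLasso Model............................................\n")
print("The train score for lasso model is {}".format(lassoReg.score(X_train, y_train)))
print("The test score for lasso model is {}".format(lassoReg.score(X_test, y_test)))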
Output:
%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import scale
from sklearn import model_selection
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.cross_decomposition import PLSRegression, PLSSVD
from sklearn.metrics import mean_squared_error
df = pd.read_csv('Hitters.csv').dropna().drop('Player', axis=1)
df.info()
dummies = pd.get_dummies(df[['League', 'Division', 'NewLeague']])
y = df.Salary
# Drop the response variable (Salary) and the columns for which we created dummy variables
X_ = df.drop(['Salary', 'League', 'Division', 'NewLeague'], axis=1).astype('float64')
# Define the feature set X.
X = pd.concat([X_, dummies[['League_N', 'Division_W', 'NewLeague_N']]], axis=1)
pca = PCA()
X_reduced = pca.fit_transform(scale(X))
pd.DataFrame(pca.components_.T).loc[:4,:5]
# 10-fold CV, with shuffle
n = len(X_reduced)
kf_10 = model_selection.KFold( n_splits=10, shuffle=True, random_state=1)
regr = LinearRegression()
mse = []
# Calculate MSE with only the intercept (no principal components in regression)
score = -1*model_selection.cross_val_score(regr, np.ones((n,1)), y.ravel(), cv=kf_10,
                                            scoring='neg_mean_squared_error').mean()
mse.append(score)

# Calculate MSE using CV for the 19 principal components, adding one component at a time
for i in np.arange(1, 20):
    score = -1*model_selection.cross_val_score(regr, X_reduced[:,:i], y.ravel(),
                                               cv=kf_10, scoring='neg_mean_squared_error').mean()
    mse.append(score)
# Plot results
plt.plot(mse, '-v')
plt.xlabel('Number of principal components in regression')
plt.ylabel('MSE')
plt.title('Salary')
plt.xlim(xmin=-1);
pca2 = PCA()

# Split into training and test sets (50/50 split assumed) and compute the principal
# components of the scaled training data, since the variables below are not defined elsewhere
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.5, random_state=1)
X_reduced_train = pca2.fit_transform(scale(X_train))
n = len(X_reduced_train)

mse = []

# Calculate MSE with only the intercept (no principal components in regression)
score = -1*model_selection.cross_val_score(regr, np.ones((n,1)), y_train.ravel(),
                                           cv=kf_10, scoring='neg_mean_squared_error').mean()
mse.append(score)

# Calculate MSE using CV for the 19 principal components, adding one component at a time
for i in np.arange(1, 20):
    score = -1*model_selection.cross_val_score(regr, X_reduced_train[:,:i],
                                               y_train.ravel(), cv=kf_10, scoring='neg_mean_squared_error').mean()
    mse.append(score)
plt.plot(np.array(mse), '-v')
plt.xlabel('Number of principal components in regression')
plt.ylabel('MSE')
plt.title('Salary')
plt.xlim(xmin=-1);
Output:
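The PLS model (v) is not shown above even though PLSRegression is imported; the following is a minimal sketch, assuming the same feature matrix X, response y, and 10-fold CV object kf_10 defined earlier.

# Cross-validated MSE for PLS with 1 to 19 components
mse_pls = []
for i in np.arange(1, 20):
    pls = PLSRegression(n_components=i)
    score = -1*model_selection.cross_val_score(pls, scale(X), y, cv=kf_10,
                                               scoring='neg_mean_squared_error').mean()
    mse_pls.append(score)

plt.plot(np.arange(1, 20), np.array(mse_pls), '-v')
plt.xlabel('Number of PLS components in regression')
plt.ylabel('MSE')
plt.title('Salary')
plt.show()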
Practical 7:
For a given data set, perform polynomial regression and make a plot of the resulting polynomial fit to the data.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
datas = pd.read_csv('data.csv')
datas
# Features and the target variable
X = datas.iloc[:, 1:2].values
y = datas.iloc[:, 2].values
# Fitting Linear Regression to the dataset
from sklearn.linear_model import LinearRegression
lin = LinearRegression()
lin.fit(X, y)
# Fitting Polynomial Regression to the dataset
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=4)
X_poly = poly.fit_transform(X)
poly.fit(X_poly, y)
lin2 = LinearRegression()
lin2.fit(X_poly, y)
# Visualising the Linear Regression results
plt.scatter(X, y, color='blue')
plt.plot(X, lin.predict(X), color='red')
plt.title('Linear Regression')
plt.xlabel('Temperature')
plt.ylabel('Pressure')
plt.show()
Output
Output:
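The practical also asks for a plot of the polynomial fit itself; below is a minimal sketch that reuses lin2 and poly from the code above.

# Visualising the Polynomial Regression results
plt.scatter(X, y, color='blue')
plt.plot(X, lin2.predict(poly.fit_transform(X)), color='red')
plt.title('Polynomial Regression')
plt.xlabel('Temperature')
plt.ylabel('Pressure')
plt.show()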
Practical 9
For a given data set, split the dataset into training and testing sets. Fit the following models on the training set and evaluate their performance on the test set: (i) Boosting and Bagging (ii) Random Forest
%matplotlib inline
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_style("whitegrid")
plt.style.use("fivethirtyeight")

# Load the Pima Indians Diabetes dataset and inspect it
df = pd.read_csv("../input/pima-indians-diabetes-database/diabetes.csv")
df.head()
df.info()
df.isnull().sum()
pd.set_option('display.float_format', '{:.2f}'.format)
df.describe()
categorical_val = []
continous_val = []
for column in df.columns:
    # print('==============================')
    # print(f"{column} : {df[column].unique()}")
    if len(df[column].unique()) <= 10:
        categorical_val.append(column)
    else:
        continous_val.append(column)
df.columns
feature_columns = [col for col in df.columns if col != 'Outcome']  # assumed: every column except the target
X = df[feature_columns]
y = df.Outcome
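The boosting and bagging fits for part (i) are not reproduced in this extract; below is a minimal sketch, assuming the X and y defined above.

from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.metrics import accuracy_score

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Boosting: AdaBoost over decision stumps (the default base estimator)
boost = AdaBoostClassifier(n_estimators=100, random_state=42)
boost.fit(X_train, y_train)
print("Boosting test accuracy:", accuracy_score(y_test, boost.predict(X_test)))

# Bagging: bootstrap aggregation of decision trees (the default base estimator)
bag = BaggingClassifier(n_estimators=100, random_state=42)
bag.fit(X_train, y_train)
print("Bagging test accuracy:", accuracy_score(y_test, bag.predict(X_test)))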
ii) Random Forest
# Data Processing
import pandas as pd
import numpy as np
# Modelling
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score, ConfusionMatrixDisplay
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from scipy.stats import randint
# Tree Visualisation
from sklearn.tree import export_graphviz
from IPython.display import Image
import graphviz
# Load the bank-marketing dataset (file name assumed) and encode the binary columns
bank_data = pd.read_csv("bank.csv")
bank_data['default'] = bank_data['default'].map({'no':0, 'yes':1, 'unknown':0})
bank_data['y'] = bank_data['y'].map({'no':0, 'yes':1})
# Split the data into features (X) and target (y)
X = bank_data.drop('y', axis=1)
y = bank_data['y']
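The split and random-forest fit are not shown in this extract, although the tree-visualisation loop below relies on them; the following is a minimal sketch using the imports above, assuming any remaining categorical columns in X have already been numerically encoded.

# Split, fit a random forest, and evaluate on the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall:", recall_score(y_test, y_pred))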
# Export and display the first three trees from the forest (limited to depth 2)
for i in range(3):
    tree = rf.estimators_[i]
    dot_data = export_graphviz(tree,
                               feature_names=X_train.columns,
                               filled=True,
                               max_depth=2,
                               impurity=False,
                               proportion=True)
    graph = graphviz.Source(dot_data)
    display(graph)
Output: