ML Lab34

The document describes a series of machine learning lab experiments: data preprocessing (imputation, anomaly detection, standardization, normalization, encoding) on Mumbai housing data, gradient descent for a two-feature regression, simple linear regression on height and weight data, logistic regression implemented from scratch, decision trees on the penguins dataset, an SVM and an ensemble of classifiers on the Iris dataset, and PCA via eigen decomposition.


Experiment 1

import pandas as pd
data = pd.read_csv('large_housing_data_mumbai.csv')
print("Original Data:")
print(data.head())

Original Data:
House_ID Bedrooms Size (sq ft) Price (INR) Location Year_Built
0 1 4.0 855.0 31356226.0 Juhu 2002.0
1 2 5.0 1847.0 27775439.0 Andheri 2004.0
2 3 NaN 2363.0 37325149.0 Bandra 2000.0
3 4 5.0 626.0 6147116.0 South Mumbai 2002.0
4 5 5.0 NaN 49899606.0 Worli NaN

#Imputation
#Handle missing values using the median for numerical columns and the most frequent value for categorical columns.
from sklearn.impute import SimpleImputer
num_features = ['Bedrooms', 'Size (sq ft)', 'Price (INR)', 'Year_Built']
cat_features = ['Location']
num_imputer = SimpleImputer(strategy='median')
data[num_features] = num_imputer.fit_transform(data[num_features])
cat_imputer = SimpleImputer(strategy='most_frequent')
data[cat_features] = cat_imputer.fit_transform(data[cat_features])
print("\nData After Imputation:")
print(data.head())

Data After Imputation:


House_ID Bedrooms Size (sq ft) Price (INR) Location Year_Built
0 1 4.0 855.0 31356226.0 Juhu 2002.0
1 2 5.0 1847.0 27775439.0 Andheri 2004.0
2 3 3.0 2363.0 37325149.0 Bandra 2000.0
3 4 5.0 626.0 6147116.0 South Mumbai 2002.0
4 5 5.0 1702.5 49899606.0 Worli 2012.0

#Anomaly Detection
#Detect anomalies in the dataset. Here, we use Z-scores and flag any row where a numerical column lies more than 3 standard deviations from its mean.
from scipy import stats
z_scores = stats.zscore(data[num_features])
data['Anomaly'] = (abs(z_scores) > 3).any(axis=1) # Mark anomalies
print("\nData After Anomaly Detection:")
print(data.head())
#Rule-Based Anomaly Detection
#simple rules where:
#A house with less than 1000 sq ft should have 1 to 2 bedrooms.
#A house with 1000-2000 sq ft should have 2 to 4 bedrooms.
#A house with more than 2000 sq ft should have 3 or more bedrooms.
def is_bedroom_size_reasonable(row):
    if row['Size (sq ft)'] < 1000:
        return 1 <= row['Bedrooms'] <= 2
    elif row['Size (sq ft)'] <= 2000:
        return 2 <= row['Bedrooms'] <= 4
    else:
        return row['Bedrooms'] >= 3
data['Bed_Size_Anomaly'] = ~data.apply(is_bedroom_size_reasonable, axis=1)
print("\nData After Rule-Based Anomaly Detection:")
print(data.head())

Data After Anomaly Detection:


House_ID Bedrooms Size (sq ft) Price (INR) Location Year_Built \
0 1 4.0 855.0 31356226.0 Juhu 2002.0
1 2 5.0 1847.0 27775439.0 Andheri 2004.0
2 3 3.0 2363.0 37325149.0 Bandra 2000.0
3 4 5.0 626.0 6147116.0 South Mumbai 2002.0
4 5 5.0 1702.5 49899606.0 Worli 2012.0

Anomaly
0 False
1 False
2 False
3 False
4 False

Data After Rule-Based Anomaly Detection:


House_ID Bedrooms Size (sq ft) Price (INR) Location Year_Built \
0 1 4.0 855.0 31356226.0 Juhu 2002.0
1 2 5.0 1847.0 27775439.0 Andheri 2004.0
2 3 3.0 2363.0 37325149.0 Bandra 2000.0
3 4 5.0 626.0 6147116.0 South Mumbai 2002.0
4 5 5.0 1702.5 49899606.0 Worli 2012.0

Anomaly Bed_Size_Anomaly
0 False True
1 False True
2 False False
3 False True
4 False True
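
For reference, the Z-score flag above can be reproduced without scipy, since a Z-score is just (x - mean) / std. A minimal cross-check, assuming data and num_features still hold the imputed (unscaled) values from the previous step:

# Manual Z-score cross-check (sketch); ddof=0 matches scipy.stats.zscore's default
z_manual = (data[num_features] - data[num_features].mean()) / data[num_features].std(ddof=0)
anomaly_manual = (z_manual.abs() > 3).any(axis=1)
print((anomaly_manual == data['Anomaly']).all())  # expected to print True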

#Standardization
#Standardize numerical features so they have a mean of 0 and a standard deviation of 1.
from sklearn.preprocessing import StandardScaler
# Standardize numericals
scaler = StandardScaler()
data[num_features] = scaler.fit_transform(data[num_features])
print("\nData After Standardization:")
print(data.head())

Data After Standardization:


House_ID Bedrooms Size (sq ft) Price (INR) Location Year_Built \
0 1 0.710719 -1.231366 -0.017529 Juhu -1.432248
1 2 1.432261 0.194650 -0.110953 Andheri -1.124866
2 3 -0.010823 0.936408 0.138203 Bandra -1.739631
3 4 1.432261 -1.560557 -0.675243 South Mumbai -1.432248
4 5 1.432261 -0.013071 0.466275 Worli 0.104664

Anomaly Bed_Size_Anomaly
0 False True
1 False True
2 False False
3 False True
4 False True
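
A quick sanity check on the result (a sketch, not part of the original run): after StandardScaler, each numerical column should have a mean of approximately 0 and a population standard deviation of approximately 1.

# Sanity check (sketch): StandardScaler uses the population std (ddof=0)
print(data[num_features].mean().round(6))
print(data[num_features].std(ddof=0).round(6))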

#Normalization
#Normalize numerical features to fit within the range [0, 1]
from sklearn.preprocessing import MinMaxScaler

normalizer = MinMaxScaler()
data[num_features] = normalizer.fit_transform(data[num_features])
print("\nData After Normalization:")
print(data.head())

Data After Normalization:


House_ID Bedrooms Size (sq ft) Price (INR) Location Year_Built \
0 1 0.75 0.142055 0.058226 Juhu 0.090909
1 2 1.00 0.540128 0.050309 Andheri 0.181818
2 3 0.50 0.747191 0.071422 Bandra 0.000000
3 4 1.00 0.050161 0.002493 South Mumbai 0.090909
4 5 1.00 0.482143 0.099222 Worli 0.545455

Anomaly Bed_Size_Anomaly
0 False True
1 False True
2 False False
3 False True
4 False True

#Encoding
#One-Hot Encode the categorical feature Location.
from sklearn.preprocessing import OneHotEncoder
# One-Hot Encoding for 'Location'
encoder = OneHotEncoder(sparse_output=False)  # `sparse` was renamed to `sparse_output` in scikit-learn 1.2
encoded_location = encoder.fit_transform(data[['Location']])
encoded_df = pd.DataFrame(encoded_location,
                          columns=encoder.get_feature_names_out(['Location']))

data_encoded = pd.concat([data, encoded_df], axis=1).drop('Location', axis=1)

print("\nData After Encoding:")


print(data_encoded.head())

Data After Encoding:


House_ID Bedrooms Size (sq ft) Price (INR) Year_Built Anomaly \
0 1 0.75 0.142055 0.058226 0.090909 False
1 2 1.00 0.540128 0.050309 0.181818 False
2 3 0.50 0.747191 0.071422 0.000000 False
3 4 1.00 0.050161 0.002493 0.090909 False
4 5 1.00 0.482143 0.099222 0.545455 False

Bed_Size_Anomaly Location_Andheri Location_Bandra Location_Borivali \


0 True 0.0 0.0 0.0
1 True 1.0 0.0 0.0
2 False 0.0 1.0 0.0
3 True 0.0 0.0 0.0
4 True 0.0 0.0 0.0

Location_Juhu Location_Malad Location_Pali Hill Location_South Mumbai


\
0 1.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0
2 0.0 0.0 0.0 0.0
3 0.0 0.0 0.0 1.0
4 0.0 0.0 0.0 0.0

Location_Worli
0 0.0
1 0.0
2 0.0
3 0.0
4 1.0

Experiment 2
import numpy as np

class GradientDescentMSE:
    def __init__(self, lr=0.01, n_iters=1000):
        self.lr = lr
        self.n_iters = n_iters
        self.x1 = None
        self.x2 = None

    def fit(self, X, y):
        # Initialize parameters
        self.x1 = np.random.randn()  # Initialize x1
        self.x2 = np.random.randn()  # Initialize x2

        for _ in range(self.n_iters):
            # Compute predictions
            y_pred = self.x1 * X[:, 0] + self.x2 * X[:, 1]

            # Compute gradients for MSE
            grad_x1 = (2 / len(y)) * np.sum((y_pred - y) * X[:, 0])
            grad_x2 = (2 / len(y)) * np.sum((y_pred - y) * X[:, 1])

            # Update parameters
            self.x1 = self.x1 - self.lr * grad_x1
            self.x2 = self.x2 - self.lr * grad_x2

        return self.x1, self.x2

    def objective_function(self, X):
        return self.x1 * X[:, 0] + self.x2 * X[:, 1]

# Example dataset with two features


X = np.array([[0.5, 1.0], [1.0, 2.0], [1.5, 2.5], [2.0, 3.5]]) # Features
y = np.array([1.5, 2.5, 3.0, 4.0]) # True values

# Initialize and run gradient descent


gd_mse = GradientDescentMSE(lr=0.01, n_iters=1000)
final_x1, final_x2 = gd_mse.fit(X, y)
final_predictions = gd_mse.objective_function(X)

print(f"Final x1: {final_x1}, Final x2: {final_x2}")


print(f"Final Predictions: {final_predictions}")

Final x1: -1.3189740745133114, Final x2: 1.9351908822144432


Final Predictions: [1.27570384 2.55140769 2.85951609 4.13521994]
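
Since this model has no intercept term and minimizes squared error, the same coefficients can also be obtained in closed form with ordinary least squares; the gradient-descent estimates should be close to this solution once the iterations have converged. A minimal cross-check, reusing X and y from above:

# Closed-form least-squares solution for y ≈ x1*X[:, 0] + x2*X[:, 1] (sketch)
coef, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print("Closed-form x1, x2:", coef)
print("Closed-form predictions:", X @ coef)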
Experiment 3
import pandas as pd
import numpy as np
file_path = '/ml-linear-reg.csv'
data = pd.read_csv(file_path)
#display data
print(data)

Height Weight
0 151 63
1 174 81
2 138 56
3 186 91
4 128 47
5 136 57
6 179 76
7 163 72
8 152 62
9 131 48

#mean of X (Height) and Y (Weight)


x_mean = np.mean(data['Height'])
y_mean = np.mean(data['Weight'])

#Display
print(f"Mean of Height (x_mean): {x_mean}")
print(f"Mean of Weight (y_mean): {y_mean}")

Mean of Height (x_mean): 153.8


Mean of Weight (y_mean): 65.3

# Calculate xi - x_bar and yi - y_bar


data['xi-xbar'] = data['Height'] - x_mean
data['yi-ybar'] = data['Weight'] - y_mean

#Display xi - x_bar and yi - y_bar


print(data[['Height', 'Weight', 'xi-xbar', 'yi-ybar']])

Height Weight xi-xbar yi-ybar


0 151 63 -2.8 -2.3
1 174 81 20.2 15.7
2 138 56 -15.8 -9.3
3 186 91 32.2 25.7
4 128 47 -25.8 -18.3
5 136 57 -17.8 -8.3
6 179 76 25.2 10.7
7 163 72 9.2 6.7
8 152 62 -1.8 -3.3
9 131 48 -22.8 -17.3
# Calculate product of (xi - x_bar) and (yi - y_bar)
data['(xi-xbar)*(yi-ybar)'] = data['xi-xbar'] * data['yi-ybar']

# Display product of (xi - x_bar) and (yi - y_bar)


print(data[['Height', 'Weight', 'xi-xbar', 'yi-ybar', '(xi-xbar)*(yi-ybar)']])

Height Weight xi-xbar yi-ybar (xi-xbar)*(yi-ybar)


0 151 63 -2.8 -2.3 6.44
1 174 81 20.2 15.7 317.14
2 138 56 -15.8 -9.3 146.94
3 186 91 32.2 25.7 827.54
4 128 47 -25.8 -18.3 472.14
5 136 57 -17.8 -8.3 147.74
6 179 76 25.2 10.7 269.64
7 163 72 9.2 6.7 61.64
8 152 62 -1.8 -3.3 5.94
9 131 48 -22.8 -17.3 394.44

# Calculate square of (xi - x_bar)


data['sq(xi-xbar)'] = data['xi-xbar'] ** 2

# Display square of (xi - x_bar)


print(data[['Height', 'Weight', 'xi-xbar', 'yi-ybar', '(xi-xbar)*(yi-ybar)', 'sq(xi-xbar)']])

Height Weight xi-xbar yi-ybar (xi-xbar)*(yi-ybar) sq(xi-xbar)


0 151 63 -2.8 -2.3 6.44 7.84
1 174 81 20.2 15.7 317.14 408.04
2 138 56 -15.8 -9.3 146.94 249.64
3 186 91 32.2 25.7 827.54 1036.84
4 128 47 -25.8 -18.3 472.14 665.64
5 136 57 -17.8 -8.3 147.74 316.84
6 179 76 25.2 10.7 269.64 635.04
7 163 72 9.2 6.7 61.64 84.64
8 152 62 -1.8 -3.3 5.94 3.24
9 131 48 -22.8 -17.3 394.44 519.84

# Calculate sum of square of (xi - x_bar)


sum_sq_xi_xbar = np.sum(data['sq(xi-xbar)'])

# Display sum of square of (xi - x_bar)


print(f"Sum of square of (xi - x_bar): {sum_sq_xi_xbar}")

Sum of square of (xi - x_bar): 3927.6000000000004

# Calculate sum of (xi - x_bar) * (yi - y_bar)


sum_xiyi_xbar_ybar = np.sum(data['(xi-xbar)*(yi-ybar)'])

# Display sum of (xi - x_bar) * (yi - y_bar)


print(f"Sum of (xi - x_bar) * (yi - y_bar): {sum_xiyi_xbar_ybar}")
Sum of (xi - x_bar) * (yi - y_bar): 2649.6

# Calculate b1 (slope)
b1 = sum_xiyi_xbar_ybar / sum_sq_xi_xbar

# Display b1 (slope)
print(f"Slope (b1): {b1}")

Slope (b1): 0.6746104491292392

# Calculate b0 (intercept)
b0 = y_mean - b1 * x_mean

# Display b0 (intercept)
print(f"Intercept (b0): {b0}")

Intercept (b0): -38.45508707607699

# Define a function to predict weight from height


def predict(height):
    return b1 * height + b0

# Example prediction
height_new = 160
weight_prediction = predict(height_new)
print(f'Predicted weight for height {height_new} cm is {weight_prediction:.2f} kg')

Predicted weight for height 160 cm is 69.48 kg
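
The hand-computed slope and intercept can be cross-checked against scipy's built-in least-squares fit; a minimal sketch reusing the Height and Weight columns:

# Cross-check b1 and b0 with scipy.stats.linregress (sketch)
from scipy import stats
lr_result = stats.linregress(data['Height'], data['Weight'])
print(f"scipy slope: {lr_result.slope:.6f}, intercept: {lr_result.intercept:.6f}")
# Expected to match b1 ≈ 0.6746 and b0 ≈ -38.4551 computed above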


Experiment 4
import numpy as np
The sigmoid function is defined as $\phi(z) = \frac{1}{1 + e^{-z}}$

$\therefore \phi(\hat{y}) = \frac{1}{1 + e^{-\hat{y}}}$

$\therefore \hat{y} = \frac{1}{1 + e^{-(wx + b)}}$
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

class LogisticRegression():

    def __init__(self, lr=0.01, n_iters=1000):
        self.lr = lr
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0

        for _ in range(self.n_iters):
            linear_pred = np.dot(X, self.weights) + self.bias
            y_pred = sigmoid(linear_pred)  # Logistic addition. Rest all is Linear Regression.

            dw = (1/n_samples) * np.dot(X.T, (y_pred - y))
            db = (1/n_samples) * np.sum(y_pred - y)

            self.weights = self.weights - self.lr*dw
            self.bias = self.bias - self.lr*db

    def predict(self, X):
        linear_pred = np.dot(X, self.weights) + self.bias
        y_pred = sigmoid(linear_pred)
        class_pred = [0 if y <= 0.5 else 1 for y in y_pred]
        return class_pred

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import datasets
import matplotlib.pyplot as plt

bc = datasets.load_breast_cancer()
X, y = bc.data, bc.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1234)

model = LogisticRegression(lr=0.01)
model.fit(X_train,y_train)
y_pred = model.predict(X_test)

def accuracy(y_pred, y_test):
    return np.sum(y_pred == y_test) / len(y_test)

acc = accuracy(y_pred, y_test)


print(acc)

0.9210526315789473

C:\Users\rohra\AppData\Local\Temp\ipykernel_19392\4033946986.py:2:
RuntimeWarning: overflow encountered in exp
return 1/(1+np.exp(-x))
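
The overflow warning comes from np.exp(-x) for large-magnitude inputs; the result still saturates to 0 or 1, so the accuracy is unaffected, but a numerically safer sigmoid can clip its argument (or use scipy.special.expit). A sketch of a drop-in replacement:

# Numerically safer sigmoid (sketch): clip the argument so np.exp cannot overflow
def sigmoid(x):
    return 1 / (1 + np.exp(-np.clip(x, -500, 500)))
# Equivalent library implementation: from scipy.special import expit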

import pandas as pd

results = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})

print(results)

Actual Predicted
0 1 1
1 1 1
2 1 1
3 1 1
4 1 1
.. ... ...
109 1 1
110 0 0
111 1 0
112 0 0
113 0 0

[114 rows x 2 columns]


Experiment 5
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv("penguins_size.csv")
df.head()

# EDA
# Missing Data
df.info()

df.isna().sum()

# What percentage are we dropping?


100*(10/344)
df = df.dropna()
df.info()

df.head()

df['sex'].unique()

df['island'].unique()

df = df[df['sex']!='.']
# Feature Engineering
pd.get_dummies(df)
pd.get_dummies(df.drop('species',axis=1),drop_first=True)

# Train and Test split


X = pd.get_dummies(df.drop('species',axis=1),drop_first=True)
y = df['species']
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Decision Tree Classifier
from sklearn.tree import DecisionTreeClassifier
model = DecisionTreeClassifier()
model.fit(X_train,y_train)
base_pred = model.predict(X_test)
from sklearn.metrics import confusion_matrix, classification_report, ConfusionMatrixDisplay
import matplotlib.pyplot as plt

# Generate confusion matrix


y_pred = model.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
# Display the confusion matrix
disp = ConfusionMatrixDisplay(confusion_matrix=cm)
disp.plot()

plt.show()

print(classification_report(y_test,base_pred))

model.feature_importances_
pd.DataFrame(index=X.columns, data=model.feature_importances_, columns=['Feature Importance'])

# Visualize the tree


from sklearn.tree import plot_tree
plt.figure(figsize=(12,8))
plot_tree(model);

from sklearn.tree import plot_tree


import matplotlib.pyplot as plt
# Convert X.columns to a list
plt.figure(figsize=(12, 8), dpi=150)
plot_tree(model, filled=True, feature_names=X.columns.tolist())

plt.show()

def report_model(model):
    model_preds = model.predict(X_test)
    print(classification_report(y_test, model_preds))
    print('\n')
    plt.figure(figsize=(12, 8), dpi=150)
    plot_tree(model, filled=True, feature_names=X.columns);
pruned_tree = DecisionTreeClassifier(max_depth=2)
pruned_tree.fit(X_train,y_train)
from sklearn.tree import plot_tree
import matplotlib.pyplot as plt

def report_model(model):
    # Print classification report if needed (e.g., precision, recall, etc.)
    print('\n')
    # Convert X.columns to a list before passing to plot_tree
    plt.figure(figsize=(12, 8), dpi=150)
    plot_tree(model, filled=True, feature_names=X.columns.tolist())
    plt.show()

pruned_tree = DecisionTreeClassifier(max_leaf_nodes=3)
pruned_tree.fit(X_train,y_train)
report_model(pruned_tree)

entropy_tree = DecisionTreeClassifier(criterion='entropy')
entropy_tree.fit(X_train,y_train)
report_model(entropy_tree)
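
Besides max_depth and max_leaf_nodes, DecisionTreeClassifier also supports cost-complexity pruning through the ccp_alpha parameter; a brief sketch of how it could be applied here (the alpha value is illustrative and would normally be chosen by cross-validation):

# Cost-complexity pruning (sketch): larger ccp_alpha prunes more aggressively
path = model.cost_complexity_pruning_path(X_train, y_train)
print(path.ccp_alphas)  # candidate alpha values computed from the training data
ccp_tree = DecisionTreeClassifier(ccp_alpha=0.01)  # illustrative value, not tuned
ccp_tree.fit(X_train, y_train)
report_model(ccp_tree)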
Experiment 6
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.preprocessing import LabelEncoder

# Load dataset from CSV


data = pd.read_csv('iris.csv')

print(data)

# Use only the first two features for training and visualization
X = data.iloc[:, :2].values # First two features
y = data.iloc[:, -1].values # Target variable (last column)

# Encode target labels (species) to numeric values


label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)

# Split the data into training and test sets


X_train, X_test, y_train, y_test = train_test_split(X, y_encoded, test_size=0.3, random_state=42)

# Create and train the SVM model with RBF kernel


svm_rbf = SVC(kernel='rbf', gamma='auto')
svm_rbf.fit(X_train, y_train)

# Make predictions
y_pred = svm_rbf.predict(X_test)

# Accuracy
accuracy = accuracy_score(y_test, y_pred) * 100
print(f"Accuracy: {accuracy:.4f}\n")

# Classification report (formatted as a DataFrame)


report_dict = classification_report(y_test, y_pred, target_names=label_encoder.classes_, output_dict=True)
report_df = pd.DataFrame(report_dict).transpose()
print("Classification Report:\n", report_df)

# Confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred)
print("\nConfusion Matrix:")
print(conf_matrix)
print()

# Visualize the Confusion Matrix as a heatmap


plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=label_encoder.classes_, yticklabels=label_encoder.classes_)
plt.ylabel('True Label')
plt.xlabel('Predicted Label')
plt.title('Confusion Matrix Heatmap')
plt.show()

print()

# Visualize the decision boundary (for 2D data)


def plot_decision_boundary(X, y, model):
    h = .02  # step size in the mesh
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)

    plt.contourf(xx, yy, Z, alpha=0.8, cmap=plt.cm.coolwarm)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', marker='o', cmap=plt.cm.coolwarm)
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')
    plt.title('SVM Decision Boundary with RBF Kernel')
    plt.show()

# Plot decision boundary using the test set


plot_decision_boundary(X_test, y_test, svm_rbf)
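
The RBF kernel's behaviour depends strongly on C and gamma; if tuning is wanted, a small grid search is a natural next step. A minimal sketch (the grid values are illustrative, not from the original experiment):

# Grid search over C and gamma for the RBF SVM (sketch)
from sklearn.model_selection import GridSearchCV
param_grid = {'C': [0.1, 1, 10, 100], 'gamma': ['scale', 'auto', 0.1, 1]}
grid = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5)
grid.fit(X_train, y_train)
print("Best parameters:", grid.best_params_)
print("Test accuracy with best model:", grid.best_estimator_.score(X_test, y_test))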
Experiment 7
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Load dataset from CSV


data = pd.read_csv('iris.csv')

# Use only the first two features for training and visualization
X = data.iloc[:, :2].values # First two features
y = data.iloc[:, -1].values # Target variable (last column)

# Encode target labels (species) to numeric values


label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)

# Split the data into training and test sets


X_train, X_test, y_train, y_test = train_test_split(X, y_encoded, test_size=0.3, random_state=42)

# 1. SVM Model
svm_rbf = SVC(kernel='rbf', gamma='auto', probability=True)
svm_rbf.fit(X_train, y_train)
y_pred_svm = svm_rbf.predict(X_test)

# 2. Random Forest Model (Bagging)


rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
y_pred_rf = rf.predict(X_test)

# 3. AdaBoost Model (Boosting)


ada = AdaBoostClassifier(base_estimator=SVC(kernel='linear', probability=True),
                         n_estimators=50, random_state=42)
ada.fit(X_train, y_train)
y_pred_ada = ada.predict(X_test)

# Combine predictions using majority voting


y_pred_ensemble = np.array([y_pred_svm, y_pred_rf, y_pred_ada])
y_pred_final = np.array([np.bincount(x).argmax() for x in y_pred_ensemble.T])

# Accuracy for each model


for model_name, y_pred in zip(['SVM', 'Random Forest', 'AdaBoost', 'Ensemble'],
[y_pred_svm, y_pred_rf, y_pred_ada, y_pred_final]):
accuracy = accuracy_score(y_test, y_pred)
print(f"{model_name} Accuracy: {accuracy:.4f}\n")

# Classification report for the ensemble model


report_dict = classification_report(y_test, y_pred_final, target_names=label_encoder.classes_, output_dict=True)
report_df = pd.DataFrame(report_dict).transpose()
print("Ensemble Classification Report:\n", report_df)

# Confusion matrix for the ensemble model


conf_matrix = confusion_matrix(y_test, y_pred_final)
print("\nEnsemble Confusion Matrix:")
print(conf_matrix)

# Visualize the Confusion Matrix as a heatmap for the ensemble model


plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=label_encoder.classes_, yticklabels=label_encoder.classes_)
plt.ylabel('True Label')
plt.xlabel('Predicted Label')
plt.title('Ensemble Confusion Matrix Heatmap')
plt.show()

# Visualize the decision boundary for the ensemble model


def plot_decision_boundary(X, y, model):
    h = .02  # step size in the mesh
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)

    plt.contourf(xx, yy, Z, alpha=0.8, cmap=plt.cm.coolwarm)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', marker='o', cmap=plt.cm.coolwarm)
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')
    plt.title('Ensemble Decision Boundary with Bagging and Boosting')
    plt.show()

# Since we can't train an ensemble model directly, we just plot the decision boundary using the SVM model
plot_decision_boundary(X_test, y_test, svm_rbf)
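
If a single fitted ensemble model is wanted (for example, to plot its own decision boundary), scikit-learn's VotingClassifier can wrap the same three estimators and perform the majority vote internally. A sketch of the equivalent setup, reusing the imports and data above:

# VotingClassifier as an equivalent to the manual majority vote (sketch)
from sklearn.ensemble import VotingClassifier
voting = VotingClassifier(
    estimators=[('svm', SVC(kernel='rbf', gamma='auto', probability=True)),
                ('rf', RandomForestClassifier(n_estimators=100, random_state=42)),
                ('ada', AdaBoostClassifier(base_estimator=SVC(kernel='linear', probability=True),
                                           n_estimators=50, random_state=42))],
    voting='hard')  # 'hard' = majority vote on predicted class labels
voting.fit(X_train, y_train)
print("VotingClassifier Accuracy:", accuracy_score(y_test, voting.predict(X_test)))
plot_decision_boundary(X_test, y_test, voting)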
Experiment 8
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

# Step 1: Load the Iris dataset from a CSV file


data = pd.read_csv('iris.csv')

# Assuming the last column is the target label and the rest are features
X = data.iloc[:, :-1].values # Features (all rows, all columns except the last)
y = data.iloc[:, -1].values # Target (all rows, last column)

# Map string labels to integers


label_mapping = {'setosa': 0, 'versicolor': 1, 'virginica': 2}
y_numeric = np.array([label_mapping[label] for label in y])

# Step 2: Standardize the data


scaler = StandardScaler()
X_std = scaler.fit_transform(X)

# Step 3: Calculate the covariance matrix


cov_matrix = np.cov(X_std.T)

# Step 4: Calculate the eigenvalues and eigenvectors


eigenvalues, eigenvectors = np.linalg.eig(cov_matrix)

# Step 5: Sort the eigenvalues and eigenvectors


sorted_indices = np.argsort(eigenvalues)[::-1]
eigenvalues_sorted = eigenvalues[sorted_indices]
eigenvectors_sorted = eigenvectors[:, sorted_indices]

# Step 6: Select the number of principal components


n_components = 2
eigenvectors_subset = eigenvectors_sorted[:, :n_components]

# Step 7: Transform the data


X_pca = X_std.dot(eigenvectors_subset)

# Step 8: Visualize the results


plt.figure(figsize=(8, 6))
plt.scatter(X_pca[:, 0], X_pca[:, 1], c=y_numeric, cmap='viridis', edgecolor='k', s=100)
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.title('PCA of Iris Dataset')
plt.grid()
plt.show()
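
The manual eigen decomposition can be cross-checked against scikit-learn's PCA; the projected coordinates should agree up to a sign flip of each component, since eigenvector signs are arbitrary. A minimal sketch:

# Cross-check with scikit-learn's PCA (sketch)
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
X_pca_sklearn = pca.fit_transform(X_std)
print("Explained variance ratio:", pca.explained_variance_ratio_)
# Compare the first few projected rows (each column may differ only by sign)
print(X_pca[:3])
print(X_pca_sklearn[:3])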
