MDS372 Lab4 2448001
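The data-loading cell is not included in this excerpt; a minimal sketch of how df, X, and y could be built, assuming the scikit-learn California Housing dataset with the target stored in a Target column:
from sklearn.datasets import fetch_california_housing

# Load the California Housing data as a DataFrame (sketch; column names
# match the output below, with MedHouseVal renamed to Target).
data = fetch_california_housing(as_frame=True)
df = data.frame.rename(columns={"MedHouseVal": "Target"})

# Feature matrix and target used throughout the notebook.
X = df.drop(columns=["Target"])
y = df["Target"]

df.head()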
   Longitude  Target
0    -122.23   4.526
1    -122.22   3.585
2    -122.24   3.521
3    -122.25   3.413
4    -122.25   3.422
print("No of records/rows:", df.shape[0])
print("No of features/columns:", df.shape[1])
print("Features:", df.columns)
No of records/rows: 20640
No of features/columns: 9
Features: Index(['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms',
'Population', 'AveOccup',
'Latitude', 'Longitude', 'Target'],
dtype='object')
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20640 entries, 0 to 20639
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 MedInc 20640 non-null float64
1 HouseAge 20640 non-null float64
2 AveRooms 20640 non-null float64
3 AveBedrms 20640 non-null float64
4 Population 20640 non-null float64
5 AveOccup 20640 non-null float64
6 Latitude 20640 non-null float64
7 Longitude 20640 non-null float64
8 Target 20640 non-null float64
dtypes: float64(9)
memory usage: 1.4 MB
# Standardize Features
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split Dataset
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y,
                                                    test_size=0.2, random_state=42)
The baseline model achieves an MSE of 0.5559, meaning that, on average, the squared difference between predicted and actual values is 0.5559.
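The baseline fit that produces this number is not shown in the excerpt; a minimal sketch, assuming an ordinary LinearRegression trained on all scaled features:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Baseline: fit on all features and report the test MSE (sketch).
model = LinearRegression()
model.fit(X_train, y_train)
mse_baseline = mean_squared_error(y_test, model.predict(X_test))
print("Baseline MSE:", mse_baseline)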
Wrapper Method
Wrapper methods use a machine learning model to select features by evaluating their impact on
performance.
1. Forward Selection
Starts with no features.
Adds features one by one that improve the model the most.
X_train_k = sfs_k.transform(X_train)
X_test_k = sfs_k.transform(X_test)
# Plotting
import matplotlib.pyplot as plt

plt.figure(figsize=(6, 4))
plt.plot(num_features, mse_values, marker="o", linestyle="--",
color="blue", label="MSE Score")
plt.xlabel("Number of Selected Features")
plt.ylabel("MSE Score")
plt.title("Forward Selection: MSE vs. Number of Features")
plt.legend()
plt.grid()
plt.show()
Train Model & Compute MSE: for each feature count, we:
Select the best features using Forward Selection (sfs).
Train the model using only those features.
Calculate the MSE (error) and store it.
Plot MSE vs. Number of Features: this shows how model performance changes as features are added.
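The loop itself, which produces the sfs_k, mse_values, and num_features used above, is not reproduced in the excerpt; a sketch under the assumption that every feature count from 1 to the full set is tried:
from mlxtend.feature_selection import SequentialFeatureSelector as SFS
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

mse_values = []
num_features = []

# Forward selection sweep over the number of selected features (sketch).
for k in range(1, X_train.shape[1] + 1):
    sfs_k = SFS(LinearRegression(), k_features=k, forward=True,
                floating=False, scoring='neg_mean_squared_error', cv=5)
    sfs_k.fit(X_train, y_train)

    # Keep only the selected features and refit a fresh model on them.
    X_train_k = sfs_k.transform(X_train)
    X_test_k = sfs_k.transform(X_test)
    lr = LinearRegression().fit(X_train_k, y_train)

    mse_values.append(mean_squared_error(y_test, lr.predict(X_test_k)))
    num_features.append(k)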
sfs = SFS(model, k_features=4, forward=True, floating=False,
scoring='neg_mean_squared_error', cv=5)
sfs.fit(X_train, y_train)
X_train_fs = sfs.transform(X_train)
X_test_fs = sfs.transform(X_test)
This means removing some features improved the model slightly, suggesting that not all of them were useful.
2. Backward Elimination
Starts with all features in the model.
Removes the least important feature one by one, based on the effect on model performance.
Stops when removing another feature would increase the error (MSE).
# Store MSE values for different numbers of selected features
mse_values = []
num_features = []
Backward elimination removes one feature at a time, starting with the least important one, based on how much each feature contributes to reducing the error (MSE). For each feature count, we:
Select the best features using Backward Elimination (sbs).
Train the model using only those selected features.
Calculate the MSE (error) and store it.
Plot MSE vs. Number of Features, showing how model performance changes as features are removed.
Find the optimal features: the ones where the MSE is lowest.
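The corresponding loop is again not shown; a sketch under the same assumptions as the forward-selection sweep, differing only in forward=False:
# Backward elimination sweep (sketch); reuses the SFS import above.
for k in range(1, X_train.shape[1] + 1):
    sbs_k = SFS(LinearRegression(), k_features=k, forward=False,
                floating=False, scoring='neg_mean_squared_error', cv=5)
    sbs_k.fit(X_train, y_train)
    lr = LinearRegression().fit(sbs_k.transform(X_train), y_train)
    mse_values.append(mean_squared_error(y_test, lr.predict(sbs_k.transform(X_test))))
    num_features.append(k)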
sbs = SFS(model, k_features=4, forward=False, floating=False,
scoring='neg_mean_squared_error', cv=5)
sbs.fit(X_train, y_train)
X_train_bs = sbs.transform(X_train)
X_test_bs = sbs.transform(X_test)
Again, removing some features improved the model slightly, suggesting that not all of them were useful.
3. Recursive Feature Elimination (RFE)
Removes the least important feature based on its contribution to the model.
Repeats the process recursively until only the desired number of features remain.
Key Idea: RFE ranks features by importance and eliminates them one by one until only the most significant ones are left.
# Store MSE values for different feature counts
mse_values = []
num_features = []
RFE removes one feature at a time, starting with the least important one, based on how much each feature contributes to reducing the error (MSE).
Train Model & Compute MSE: for each feature count, we:
Select the best features using Recursive Feature Elimination (RFE).
Train the model using only those selected features.
Calculate the MSE (error) and store it.
Plot MSE vs. Number of Features, showing how model performance changes as features are removed.
Find the optimal features: the ones where the MSE is lowest.
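As with the sequential methods, the sweep itself is not included; a sketch, assuming RFE is refit for each feature count:
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# RFE sweep over the number of retained features (sketch).
for k in range(1, X_train.shape[1] + 1):
    rfe_k = RFE(estimator=LinearRegression(), n_features_to_select=k)
    rfe_k.fit(X_train, y_train)
    lr = LinearRegression().fit(rfe_k.transform(X_train), y_train)
    mse_values.append(mean_squared_error(y_test, lr.predict(rfe_k.transform(X_test))))
    num_features.append(k)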
rfe = RFE(estimator=LinearRegression(), n_features_to_select=3)
rfe.fit(X_train, y_train)
X_train_rfe = rfe.transform(X_train)
X_test_rfe = rfe.transform(X_test)
This means removing some features slightly increased the error, suggesting that important features
might have been removed.
4. Exhaustive Search
Tests all possible feature subsets within the specified range (min_features=3, max_features=5).
Trains a model for each subset and calculates the corresponding MSE.
Selects the feature set that minimizes the MSE, ensuring the best possible feature combination (see the loop sketch below).
mse_values = []
feature_counts = list(range(3, 6))  # Testing feature sets of size 3, 4, and 5
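The loop that fills mse_values for each subset size is not shown; a sketch, assuming an ExhaustiveFeatureSelector is fit for every size in feature_counts:
from mlxtend.feature_selection import ExhaustiveFeatureSelector as EFS
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Exhaustive search at each subset size; prints progress such as
# "Features: 56/56" while evaluating all combinations (sketch).
for k in feature_counts:
    efs = EFS(LinearRegression(), min_features=k, max_features=k,
              scoring='neg_mean_squared_error', cv=3)
    efs.fit(X_train, y_train)
    lr = LinearRegression().fit(efs.transform(X_train), y_train)
    mse_values.append(mean_squared_error(y_test, lr.predict(efs.transform(X_test))))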
X_train_efs = efs.transform(X_train)
X_test_efs = efs.transform(X_test)
Features: 56/56
efs = EFS(model, min_features=4, max_features=4,
scoring='neg_mean_squared_error', cv=3)
efs.fit(X_train, y_train)
X_train_efs = efs.transform(X_train)
X_test_efs = efs.transform(X_test)
Features: 70/70
MSE with Exhaustive Search: 0.5490 (Better Performance than Baseline model)
Thus Exhaustive Search found the optimal 4-feature subset, leading to the lowest error. It confirms
that removing irrelevant features improved model accuracy.
Embedded Method
Embedded methods select features during model training by applying built-in regularization
techniques (e.g., LASSO shrinks coefficients, dropping less important features).
1. LASSO Regression
Performs feature selection by shrinking some coefficients to zero using L1 regularization.
Automatically removes less important features, keeping only the most significant ones.
Smaller alpha means less regularization (closer to standard Linear Regression). Larger alpha
means stronger regularization (more shrinkage of coefficients).
Y-axis represents the Mean Squared Error (MSE) of the model on the test data.
Lower MSE indicates better predictive performance. Higher MSE suggests underfitting (too
much shrinkage).
Small alpha (left side of the plot) → Low regularization: MSE is high if alpha is too low because too many irrelevant features are kept (overfitting risk).
Moderate alpha (middle of the plot) → Optimal point: this is where the MSE is minimum, meaning LASSO has removed unnecessary features while retaining the important ones. This is the best choice of alpha for balancing bias and variance.
Large alpha (right side of the plot) → High regularization: MSE increases because LASSO shrinks too many coefficients to zero, leading to underfitting (important features are lost).
Choose the alpha where the MSE is lowest to get the best feature subset with good predictive power.
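The alpha sweep that produces the plot described above and the best_alpha used next is not included in the excerpt; a minimal sketch, assuming a logarithmic grid of alpha values evaluated on the test split:
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error

# Hypothetical alpha grid; record the test MSE for each value (sketch).
alphas = np.logspace(-4, 1, 30)
lasso_mse = []
for a in alphas:
    m = Lasso(alpha=a, max_iter=10000)
    m.fit(X_train, y_train)
    lasso_mse.append(mean_squared_error(y_test, m.predict(X_test)))

# Pick the alpha with the lowest test MSE and plot the curve.
best_alpha = alphas[int(np.argmin(lasso_mse))]

plt.figure(figsize=(6, 4))
plt.plot(alphas, lasso_mse, marker="o")
plt.xscale("log")
plt.xlabel("alpha")
plt.ylabel("MSE")
plt.title("LASSO: MSE vs. alpha")
plt.show()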
# Train Lasso with the best alpha
lasso = Lasso(alpha=best_alpha)
lasso.fit(X_train, y_train)
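The variables lasso_selected and X_test_lasso used in the next cell are not defined in the excerpt; a plausible intermediate step, assuming the features with nonzero LASSO coefficients are kept and the model is refit on that subset:
# Keep only the features whose LASSO coefficients are nonzero (assumption),
# then refit on the reduced feature set.
selected = lasso.coef_ != 0
X_train_lasso = X_train[:, selected]
X_test_lasso = X_test[:, selected]

lasso_selected = Lasso(alpha=best_alpha)
lasso_selected.fit(X_train_lasso, y_train)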
# Evaluate performance
y_pred = lasso_selected.predict(X_test_lasso)
mse_lasso = mean_squared_error(y_test, y_pred)
From the MSE results, LASSO gives the lowest MSE (0.5486, better than the baseline model), meaning the model performs best with this feature subset.
2. Ridge Regression
Lasso (L1 regularization) removes unimportant features by setting coefficients to zero, making it
useful for feature selection.
Ridge (L2 regularization) shrinks coefficients without setting them to zero, meaning all features are
retained but with reduced impact.
If the goal is feature selection, Lasso is the right approach. If we want to reduce multicollinearity and stabilize coefficients without removing features, Ridge can be tried as well. If unsure, we can try both Lasso and Ridge, compare MSE values, and choose the one that performs best.
Keeps all features but shrinks coefficients to prevent overfitting.
Helps when features are highly correlated and prevents instability in predictions.
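The best_alpha used in the Ridge fit below is not defined in the excerpt; a minimal sketch of how it could be chosen, assuming the same kind of alpha sweep as for LASSO:
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Hypothetical grid search over Ridge's alpha on the test split (sketch).
alphas = np.logspace(-3, 3, 30)
ridge_mse = [mean_squared_error(y_test,
                                Ridge(alpha=a).fit(X_train, y_train).predict(X_test))
             for a in alphas]
best_alpha = alphas[int(np.argmin(ridge_mse))]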
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
ridge = Ridge(alpha=best_alpha)
ridge.fit(X_train, y_train)
y_pred = ridge.predict(X_test)
mse_ridge = mean_squared_error(y_test, y_pred)
# Helper to fit a model and return its test MSE; the enclosing function
# definition is missing from the excerpt, so the name below is assumed.
def evaluate_model(model, X_train, X_test, y_train, y_test):
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return mean_squared_error(y_test, y_pred)
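The methods and mses lists passed to the bar plot are not defined in the excerpt; a plausible construction, assuming each approach's test MSE was stored in a correspondingly named variable (only mse_baseline, mse_lasso, and mse_ridge appear above; the other names are assumptions):
# Collect the results for the comparison plot (variable names other than
# mse_baseline, mse_lasso, and mse_ridge are hypothetical).
methods = ["All Features", "Forward", "Backward", "RFE", "Exhaustive", "LASSO", "Ridge"]
mses = [mse_baseline, mse_fs, mse_bs, mse_rfe, mse_efs, mse_lasso, mse_ridge]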
# Plot the MSE of each method as a bar chart
import seaborn as sns

plt.figure(figsize=(10, 5))
ax = sns.barplot(x=methods, y=mses)
LASSO (L1 Regularization) removes unimportant features by setting their coefficients to zero,
leading to a more efficient and optimized model. By selecting only the most relevant features,
LASSO reduces noise and prevents overfitting, which improves MSE.
The wrapper methods systematically evaluate feature subsets, ensuring that only relevant features remain. Since they rely on similar statistical evaluation, they tend to select similar subsets, leading to almost identical MSE values.
Keeping all features can introduce irrelevant or redundant ones, which may add noise and slightly reduce model performance. Feature selection methods remove these unnecessary features, leading to a small but noticeable improvement.
RFE removes features recursively, but its selection process might have eliminated some
important features, leading to higher error.
Ridge Regression (0.5518) is Better Than Using All Features, But Worse Than LASSO
Ridge (L2 Regularization) shrinks coefficients instead of removing features. It helps control
overfitting but does not eliminate irrelevant features, meaning it doesn’t reduce MSE as
effectively as LASSO.