Data Science Record - 05

The document outlines experiments for data exploration, preprocessing, linear regression, logistic regression, and Naive Bayes classification using Python. It includes algorithms, code implementations, and results for each experiment, demonstrating data handling and model evaluation techniques. The experiments aim to provide practical applications of machine learning methods on datasets.

Uploaded by

Deepak Sathis

EXP.NO: 01

DATE: 23.01.2025 PERFORM DATA EXPLORATION AND PREPROCESSING

AIM:
To write Python code that performs data exploration and preprocessing on the
uploaded dataset.
ALGORITHM:
Step 1: Start the program
Step 2: Import the necessary python libraries
Step 3: Load the data set in the current file directory
Step 4: Perform data exploration and data preprocessing for the loaded dataset
Step 5: Display the output
Step 6: Stop the program
CODE:
import pandas as pd

# Print every row and column without truncation
pd.set_option("display.max_rows", None)
pd.set_option("display.max_columns", None)
pd.set_option("display.width", None)

file_path = '/content/traffic_accidects.csv'
df = pd.read_csv(file_path)

# Data exploration
print("First few rows of the dataset:")
print(df.head())
print("\nSummary Statistics:")
print(df.describe(include="all"))
print("\nMissing Values:")
print(df.isnull().sum())

# Impute missing numeric values
if 'Age' in df.columns:
    df['Age'] = df['Age'].fillna(df['Age'].median())
if 'Salary' in df.columns:
    df['Salary'] = df['Salary'].fillna(df['Salary'].mean())

# Parse dates; missing or unparseable values become NaT
if 'AccidentDate' in df.columns:
    df['AccidentDate'] = pd.to_datetime(df['AccidentDate'], errors='coerce')

# Encode gender as 0/1
if 'Gender' in df.columns:
    df['Gender'] = df['Gender'].map({'Male': 0, 'Female': 1})

# Drop rows that have no severity score
if 'SeverityScore' in df.columns:
    df = df.dropna(subset=['SeverityScore']).copy()

# Derive the number of years since each accident
if 'AccidentDate' in df.columns:
    current_year = pd.Timestamp.now().year
    df['YearsSinceAccident'] = current_year - df['AccidentDate'].dt.year

# Remove salary outliers using the 1.5 * IQR rule
if 'Salary' in df.columns:
    Q1 = df['Salary'].quantile(0.25)
    Q3 = df['Salary'].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    df = df[(df['Salary'] >= lower_bound) & (df['Salary'] <= upper_bound)]

print("\nCleaned Dataset:")
print(df.head().to_string())
OUTPUT:
Particulars             Marks Allotted    Marks Awarded
Program / Simulation    40
Program Execution       30
Result                  20
Viva Voce               10
Total                   100

RESULT:
Thus, a program for data exploration and preprocessing has been successfully
executed.
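
The median imputation and the 1.5 * IQR outlier rule used in this experiment can be checked by hand on a small table. The sketch below is only an illustration: the five rows are hypothetical values, not taken from the uploaded dataset, and only the column names `Age` and `Salary` follow the experiment's script.

```python
import pandas as pd

# Hypothetical toy data (not from the uploaded dataset).
toy = pd.DataFrame({
    "Age": [25, None, 40, 31, 58],
    "Salary": [30000, 32000, 31000, 29000, 250000],
})

# Median imputation: the observed ages are 25, 31, 40, 58, so the median is 35.5.
toy["Age"] = toy["Age"].fillna(toy["Age"].median())

# IQR rule: keep salaries inside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
q1 = toy["Salary"].quantile(0.25)   # 30000
q3 = toy["Salary"].quantile(0.75)   # 32000
iqr = q3 - q1                       # 2000
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr   # 27000 and 35000

# The 250000 salary falls outside the bounds and is dropped.
cleaned = toy[toy["Salary"].between(lower, upper)].copy()
print(cleaned)
```

Working through the numbers this way makes it easier to confirm that the filter removes only the extreme salary and leaves the other four rows untouched.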

EXP.NO: 02 (a)

DATE: 30.01.2025 Implement linear and logistic regression

1). Linear regression:

a). Single linear regression:
AIM:
To write Python code that implements single linear regression, which fits a
straight line through the data points.
ALGORITHM:
Step 1: Start the program
Step 2: Import the necessary Python libraries
Step 3: Load the dataset and implement single linear regression using the formula
Step 4: Display the output
Step 5: Stop the program
CODE:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge, Lasso
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import StandardScaler

file_path = "/content/datasets/house_prices.csv"
df = pd.read_csv(file_path)
df = df.drop(columns=['id', 'date'])

# Impute numeric columns with the median and category-like columns with the mode
for col in ['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot']:
    df[col] = df[col].fillna(df[col].median())
for col in ['waterfront', 'view', 'condition', 'grade']:
    df[col] = df[col].fillna(df[col].mode()[0])

# Feature engineering (2025 is the reference year used in this record)
df['age_of_house'] = 2025 - df['yr_built']
# Houses that were never renovated (yr_renovated == 0) get 0
df['time_since_renovation'] = (2025 - df['yr_renovated']).where(df['yr_renovated'] != 0, 0)
df['total_sqft'] = df['sqft_living'] + df['sqft_basement']
df['bedrooms_sqft'] = df['bedrooms'] * df['sqft_living']
df = pd.get_dummies(df, columns=['waterfront', 'view', 'condition', 'zipcode'],
                    drop_first=True)

X = df.drop(columns=['price'])
y = df['price']

# Split first, then fit the scaler on the training set only to avoid data leakage
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Fit, evaluate, and plot the Ridge and Lasso models in turn
models = {"Ridge": (Ridge(alpha=1.0), 'blue'), "Lasso": (Lasso(alpha=0.1), 'green')}
for name, (model, color) in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    print(f"\n{name} Regression Model Evaluation:")
    print(f"MSE: {mse:.2f}")
    print(f"RMSE: {np.sqrt(mse):.2f}")
    print(f"R-squared: {r2_score(y_test, y_pred):.2f}")

    # Actual vs predicted prices; the dashed line marks a perfect prediction
    plt.figure(figsize=(10, 6))
    plt.scatter(y_test, y_pred, color=color, alpha=0.6, label=name)
    plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--', linewidth=2)
    plt.title(f'Actual vs Predicted Housing Prices ({name} Regression)', fontsize=16)
    plt.xlabel('Actual Housing Price', fontsize=14)
    plt.ylabel('Predicted Housing Price', fontsize=14)
    plt.legend()
    plt.grid()
    plt.show()
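
The script above fits Ridge and Lasso models on many features at once. For the single-feature case named in the aim, the straight line can be obtained directly from the least-squares formulas: slope = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2) and intercept = mean(y) - slope * mean(x). The following is a minimal sketch under the assumption of a small hypothetical data set; the arrays `x` and `y` are illustrative and do not come from the house-price file.

```python
import numpy as np

# Hypothetical toy data (not from the record's dataset); y is roughly 2x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 9.0, 10.8])

# Closed-form least-squares estimates for one feature.
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# Goodness of fit: R^2 = 1 - SS_res / SS_tot.
y_hat = intercept + slope * x
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y_mean) ** 2)
r2 = 1 - ss_res / ss_tot

print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.4f}")
```

For these five points the sums work out to 19.5 and 10, giving a slope of 1.95 and an intercept of 1.15, which matches what `np.polyfit(x, y, 1)` returns.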

OUTPUT:

b). Multi linear regression:

AIM:
To write Python code that implements multi linear regression, which fits a
linear model to data points described by several features.
ALGORITHM:
Step 1: Start the program
Step 2: Import the necessary Python libraries
Step 3: Load the dataset and implement multi linear regression using the formula
Step 4: Display the output
Step 5: Stop the program
CODE:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

file_path = "/content/house_prices.csv"
df = pd.read_csv(file_path)

# Binary target: 1 if the price exceeds the threshold, else 0
price_threshold = 500000
df['price_above_threshold'] = (df['price'] > price_threshold).astype(int)

# One-hot encode the categorical columns
categorical_columns = ['waterfront', 'view', 'condition', 'grade', 'zipcode']
df = pd.get_dummies(df, columns=categorical_columns, drop_first=True)

# Numeric features plus every one-hot column created above
features = ['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot', 'floors', 'sqft_above',
            'sqft_basement', 'yr_built', 'yr_renovated', 'lat', 'long', 'sqft_living15',
            'sqft_lot15'] + [col for col in df.columns
                             if col.startswith(tuple(categorical_columns))]
X = df[features]
y = df['price_above_threshold']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training set only, then apply it to the test set
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Logistic regression classifier on the scaled features
model = LogisticRegression(max_iter=1000)
model.fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)

plt.figure(figsize=(6, 4))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=["Below", "Above"], yticklabels=["Below", "Above"])
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.title(f'Confusion Matrix (Accuracy: {accuracy:.2f})')
plt.show()

# Rank the features by their fitted coefficients
coefficients = model.coef_[0]
intercept = model.intercept_[0]
coeff_df = pd.DataFrame({'Feature': features, 'Coefficient':
                         coefficients}).sort_values(by='Coefficient', ascending=False)
print(f"Accuracy: {accuracy:.2f}")
print("Confusion matrix:")
print(conf_matrix)
print(coeff_df.head(10))
print(f"Intercept: {intercept:.4f}")
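
The script above thresholds the price and trains a classifier on that binary target. For the multi linear regression named in the aim, the coefficients can instead be estimated by ordinary least squares on a design matrix with a leading column of ones. The sketch below is a minimal illustration on synthetic data: the three features, the coefficient values, and the noise level are all assumptions made for the example, not properties of the house-price file.

```python
import numpy as np

# Synthetic data (an assumption for illustration): three features with known coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_intercept = 4.0
true_coef = np.array([2.0, -1.0, 0.5])
y = true_intercept + X @ true_coef + rng.normal(scale=0.1, size=200)

# Design matrix: a column of ones for the intercept, then the features.
X_design = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: minimise ||X_design @ beta - y||^2.
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)

print("intercept:", round(beta[0], 3))
print("coefficients:", np.round(beta[1:], 3))
```

Because the data were generated from known coefficients, the estimates should land close to 4.0, 2.0, -1.0, and 0.5, which gives a quick sanity check on the fitting step.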

OUTPUT:
EXP.NO: 02 (b)
DATE: 30.01.2025 Implement linear and logistic regression

2). Logistic regression:

AIM:
To write Python code that implements logistic regression, which fits a sigmoid
curve to the data points.
ALGORITHM:
Step 1: Start the program
Step 2: Import the necessary Python libraries
Step 3: Load the dataset and implement logistic regression using the formula
Step 4: Display the output
Step 5: Stop the program

CODE:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Use sepal length only; the target is 1 for setosa and 0 otherwise
iris = load_iris()
X = iris.data[:, 0].reshape(-1, 1)
y = (iris.target == 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Degree-2 polynomial terms (x, x^2), standardized so that gradient descent
# with a fixed learning rate is well conditioned; the bias column is added last
poly = PolynomialFeatures(degree=2, include_bias=False)
scaler = StandardScaler()

def add_bias(X):
    return np.column_stack([np.ones(len(X)), X])

X_train_poly = add_bias(scaler.fit_transform(poly.fit_transform(X_train)))
X_test_poly = add_bias(scaler.transform(poly.transform(X_test)))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradient_descent(X, y, theta, lr=0.01, iters=1000):
    # Batch gradient descent on the mean log loss
    for _ in range(iters):
        theta -= lr * (X.T @ (sigmoid(X @ theta) - y)) / len(y)
    return theta

theta = np.zeros(X_train_poly.shape[1])
theta_optimal = gradient_descent(X_train_poly, y_train, theta)

# Classify as setosa when the predicted probability is at least 0.5
predictions = sigmoid(X_test_poly @ theta_optimal) >= 0.5
accuracy = np.mean(predictions == y_test)
print(f"Accuracy: {accuracy * 100:.2f}%")

# Plot the fitted probability curve over the range of the training data
x_values = np.linspace(X_train.min(), X_train.max(), 100).reshape(-1, 1)
x_poly = add_bias(scaler.transform(poly.transform(x_values)))
y_values = sigmoid(x_poly @ theta_optimal)
plt.scatter(X_train, y_train, color='blue', label='Training data')
plt.plot(x_values, y_values, color='red', label='Predicted probability')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Setosa (1) vs Not Setosa (0)')
plt.title('Logistic Regression with Curved Decision Boundary')
plt.legend()
plt.grid(True)
plt.show()

OUTPUT:
Particulars             Marks Allotted    Marks Awarded
Program / Simulation    40
Program Execution       30
Result                  20
Viva Voce               10
Total                   100

RESULT:
Thus, a program for linear and logistic regression has been successfully executed.
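
The gradient-descent update used for logistic regression in this experiment is the gradient of the mean log loss, so one way to check that training is working is to track that loss across iterations. The sketch below is an illustration under stated assumptions: the data are synthetic draws meant to loosely resemble two groups of sepal lengths (they are not the Iris measurements), and the helper `log_loss`, the learning rate of 0.1, and the 2000 iterations are choices made for this example.

```python
import numpy as np

def sigmoid(z):
    # Clip the input so np.exp cannot overflow for extreme values.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))

def log_loss(X, y, theta, eps=1e-12):
    # Mean negative log-likelihood of the labels under the model.
    p = np.clip(sigmoid(X @ theta), eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Synthetic one-feature data (an assumption for illustration).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(5.0, 0.35, 50), rng.normal(6.3, 0.6, 100)])
y = np.concatenate([np.ones(50), np.zeros(100)])

# Standardize the feature and add a bias column.
x_std = (x - x.mean()) / x.std()
X = np.column_stack([np.ones_like(x_std), x_std])

theta = np.zeros(2)
losses = []
for _ in range(2000):
    grad = X.T @ (sigmoid(X @ theta) - y) / len(y)
    theta -= 0.1 * grad
    losses.append(log_loss(X, y, theta))

accuracy = np.mean((sigmoid(X @ theta) >= 0.5) == y)
print(f"first loss={losses[0]:.4f}, final loss={losses[-1]:.4f}, accuracy={accuracy:.2f}")
```

A loss that falls steadily from its starting value near ln 2 suggests the learning rate suits the feature scale; a loss that jumps around usually points to unscaled features or too large a step.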

EXP.NO: 03

DATE: 06.02.2025 Naive bayes classifier

AIM:
To write a Python code for the implementation of the Naive Bayes Classifier for classifying
data based on probability distributions.
ALGORITHM:
Step 1: Start the program
Step 2: Import the necessary Python libraries
Step 3: Load the dataset and preprocess the data
Step 4: Compute the prior probabilities and likelihood using Bayes' theorem
Step 5: Build the Naïve Bayes classifier and train it on the dataset
Step 6: Use the trained model to make predictions
Step 7: Display the output
Step 8: Stop the program
CODE:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split


class NaiveBayesClassifier:
    """Gaussian Naive Bayes classifier implemented from scratch."""

    def __init__(self):
        self.class_priors = {}
        self.means = {}
        self.variances = {}
        self.classes = None

    def fit(self, X, y):
        # Estimate the prior and the per-feature mean and variance of each class
        self.classes = np.unique(y)
        for c in self.classes:
            X_c = X[y == c]
            self.class_priors[c] = len(X_c) / len(X)
            self.means[c] = np.mean(X_c, axis=0)
            # The small constant avoids division by zero for constant features
            self.variances[c] = np.var(X_c, axis=0) + 1e-9

    def log_gaussian_pdf(self, x, mean, variance):
        # Log density computed directly, so tiny likelihoods never become log(0)
        return -0.5 * np.log(2 * np.pi * variance) - ((x - mean) ** 2) / (2 * variance)

    def predict(self, X):
        predictions = []
        for x in X:
            posteriors = {}
            for c in self.classes:
                prior = np.log(self.class_priors[c])
                likelihood = np.sum(self.log_gaussian_pdf(x, self.means[c], self.variances[c]))
                posteriors[c] = prior + likelihood
            # Choose the class with the largest log posterior
            predictions.append(max(posteriors, key=posteriors.get))
        return np.array(predictions)


df = pd.read_csv('/content/house_prices2.csv')

# Convert text columns to integer codes
for col in df.columns:
    if df[col].dtype == 'object':
        df[col] = pd.factorize(df[col])[0]

# The last column is the target; all other columns are features
X = df.iloc[:, :-1].values
y = df.iloc[:, -1].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

nb = NaiveBayesClassifier()
nb.fit(X_train, y_train)
y_pred = nb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

# Fix the label order so the matrix rows and columns match the tick labels
labels = np.unique(y)
conf_matrix = confusion_matrix(y_test, y_pred, labels=labels)
plt.figure(figsize=(6, 5))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues', xticklabels=labels,
            yticklabels=labels)
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.title("Confusion Matrix for Naïve Bayes Classifier")
plt.show()

# Accuracy for several train/test split ratios
test_sizes = np.linspace(0.1, 0.5, 5)
accuracies = []
for test_size in test_sizes:
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size,
                                                        random_state=42)
    nb.fit(X_train, y_train)
    y_pred = nb.predict(X_test)
    accuracies.append(accuracy_score(y_test, y_pred))

plt.figure(figsize=(7, 5))
plt.plot(test_sizes, accuracies, marker='o', linestyle='-', color='m',
         label="Naïve Bayes Accuracy")
plt.xlabel("Test Size")
plt.ylabel("Accuracy")
plt.title("Naïve Bayes Accuracy vs. Test Size")
plt.legend()
plt.grid()
plt.show()
OUTPUT:
Particulars             Marks Allotted    Marks Awarded
Program / Simulation    40
Program Execution       30
Result                  20
Viva Voce               10
Total                   100

RESULT:

The required Naïve Bayes model has been executed successfully.
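
Step 4 of the algorithm (computing the priors and likelihoods with Bayes' theorem) can be followed by hand on a tiny example. The sketch below assumes one hypothetical feature with six made-up training values and a made-up query point `x_new`; the helper `log_gaussian` is introduced only for this illustration. It works in log space and then normalizes the scores into posterior probabilities.

```python
import numpy as np

# Hypothetical one-feature training set with two classes (values are made up).
X = np.array([1.0, 1.2, 0.8, 3.0, 3.2, 2.8])
y = np.array([0, 0, 0, 1, 1, 1])

def log_gaussian(x, mean, var):
    # Log of the normal density N(x | mean, var).
    return -0.5 * np.log(2 * np.pi * var) - (x - mean) ** 2 / (2 * var)

x_new = 1.1  # hypothetical point to classify
log_post = {}
for c in np.unique(y):
    X_c = X[y == c]
    prior = len(X_c) / len(X)                  # P(class) = 3/6 for both classes
    mean, var = X_c.mean(), X_c.var() + 1e-9   # class-conditional Gaussian parameters
    # Unnormalised log posterior: log P(class) + log P(x_new | class)
    log_post[int(c)] = np.log(prior) + log_gaussian(x_new, mean, var)

# Normalise into probabilities; subtracting the maximum keeps exp() stable.
scores = np.array([log_post[c] for c in sorted(log_post)])
probs = np.exp(scores - scores.max())
probs /= probs.sum()

predicted = max(log_post, key=log_post.get)
print("posterior probabilities:", np.round(probs, 6))
print("predicted class:", predicted)
```

Since `x_new` sits next to the class 0 mean of 1.0 and far from the class 1 mean of 3.0, nearly all of the posterior mass goes to class 0.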

EXP.NO: 04

DATE: 13.03.2025 POWER BI

AIM:
To make an analytical dashboard for E-commerce.
ALGORITHM:
STEP 1: Load Data - Import dataset into Power BI using Get Data and load it into a table.
STEP 2: Preprocess Data - Handle missing values, encode Gender, compute Experience.
STEP 3: Define Variables - Create binary target variable SalaryAbove50K, set X and y.
STEP 4: Split Data - Divide features and target into training and testing sets.
STEP 5: Train Model - Train Naïve Bayes model using Power BI AI Insights.
STEP 6: Evaluate and Visualize - Compute accuracy, generate confusion matrix, plot ROC curve.

OUTPUT:
MARK ALLOCATION:

Particulars             Marks Allotted    Marks Awarded
Program / Simulation    40
Program Execution       30
Result                  20
Viva Voce               10
Total                   100

RESULT:
Thus, the Zomato sales dataset has been successfully visualized using a Power BI
dashboard.
