Data Analytics II

The document outlines a laboratory exercise for a Data Science and Big Data Analytics course, focusing on using Logistic Regression for predicting purchases based on age and estimated salary. It details the steps of loading a dataset, preprocessing data, training a model, making predictions, and evaluating performance using a confusion matrix. Key metrics such as accuracy, precision, and recall are computed to assess the model's effectiveness.

Uploaded by

Chirag Patekar

Third Year Engineering (2019 Pattern)

Course Code: 310256

Course Name: Data Science and Big Data Analytics Laboratory
Group A
4) Data Analytics II
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    confusion_matrix,
    accuracy_score,
    precision_score,
    recall_score,
)

# Step 1: Load the dataset

df = pd.read_csv("Social_Network_Ads.csv")
print("\nDataset Info:")
df.info()  # info() prints its summary directly and returns None
print("\nFirst 5 Rows:")
print(df.head())

# Step 2: Data Preprocessing

# Selecting relevant features and target variable
X = df[['Age', 'EstimatedSalary']]
y = df['Purchased'] # Target variable
# Splitting dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Feature Scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Step 3: Train Logistic Regression Model

model = LogisticRegression()
model.fit(X_train_scaled, y_train)

# Step 4: Make Predictions

y_pred = model.predict(X_test_scaled)

# Step 5: Compute Confusion Matrix

conf_matrix = confusion_matrix(y_test, y_pred)
tn, fp, fn, tp = conf_matrix.ravel()
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
error_rate = 1 - accuracy

# Step 6: Display Results

print("\nConfusion Matrix:")
print(conf_matrix)
print(f"\nTrue Positives (TP): {tp}")
print(f"False Positives (FP): {fp}")
print(f"True Negatives (TN): {tn}")
print(f"False Negatives (FN): {fn}")
print(f"Accuracy: {accuracy:.2f}")
print(f"Error Rate: {error_rate:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
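The feature scaling in Step 2 fits the scaler on the training split and only transforms the test split. The sketch below illustrates what that means, using a tiny made-up array in place of the Age and EstimatedSalary columns (the numbers are hypothetical, not taken from Social_Network_Ads.csv). It checks that StandardScaler's output for the test row matches a manual computation with the training mean and standard deviation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy values standing in for [Age, EstimatedSalary]
X_train_demo = np.array([[20.0, 20000.0], [30.0, 40000.0], [40.0, 60000.0]])
X_test_demo = np.array([[50.0, 80000.0]])

# Fit on the training rows only, then reuse the same statistics for the test row
demo_scaler = StandardScaler()
X_train_demo_scaled = demo_scaler.fit_transform(X_train_demo)
X_test_demo_scaled = demo_scaler.transform(X_test_demo)

# Manual equivalent: subtract the training mean, divide by the training
# standard deviation (population form, ddof=0, as StandardScaler uses)
train_mean = X_train_demo.mean(axis=0)
train_std = X_train_demo.std(axis=0)
manual_test_scaled = (X_test_demo - train_mean) / train_std

print("Scaled test row (StandardScaler):", X_test_demo_scaled)
print("Scaled test row (manual):        ", manual_test_scaled)
```

Because the test row never contributes to the mean or standard deviation, the evaluation in the later steps is not influenced by test-set statistics.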
Explanation of Each Step:
1. Loading the Dataset
o Read Social_Network_Ads.csv into a Pandas DataFrame.
2. Data Preprocessing
o Selected Age and EstimatedSalary as features.
o Used Purchased as the target variable.
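o Applied StandardScaler() for feature scaling; in the code the scaler is fit on the training split only and then reused to transform the test split.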
3. Splitting the Data
o Split into 80% training and 20% testing using train_test_split().
4. Training the Model
o Trained a Logistic Regression model using LogisticRegression().
5. Making Predictions
o Predicted labels for the test set using .predict().
6. Computing the Confusion Matrix
o Extracted True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN) from the confusion matrix.
o Calculated Accuracy, Error Rate, Precision, and Recall.
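
The metrics in step 6 can also be derived by hand from the four confusion-matrix counts, which may help when checking the values that scikit-learn reports. The following is a minimal sketch: the function name metrics_from_counts and the example counts are hypothetical, not output from this lab, and returning 0.0 when a denominator is zero is an assumption of this sketch.

```python
def metrics_from_counts(tp, fp, tn, fn):
    """Compute accuracy, error rate, precision, and recall from raw counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    error_rate = (fp + fn) / total  # same as 1 - accuracy
    # Guard against empty denominators (assumption: report 0.0 in that case)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": accuracy,
        "error_rate": error_rate,
        "precision": precision,
        "recall": recall,
    }


# Hypothetical counts for an 80-row test set (not the lab's actual result)
example = metrics_from_counts(tp=20, fp=5, tn=50, fn=5)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With these example counts, accuracy is (20 + 50) / 80 = 0.875, the error rate is 0.125, and precision and recall are both 20 / 25 = 0.8.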

OUTPUT-
