The document outlines a laboratory exercise for a Data Science and Big Data Analytics course, focusing on using Logistic Regression for predicting purchases based on age and estimated salary. It details the steps of loading a dataset, preprocessing data, training a model, making predictions, and evaluating performance using a confusion matrix. Key metrics such as accuracy, precision, and recall are computed to assess the model's effectiveness.
Data Analytics II
Third Year Engineering (2019 Pattern)
Course Code: 310256
Course Name: Data Science and Big Data Analytics Laboratory
Group A
4) Data Analytics II

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score
print("\nConfusion Matrix:")
print(conf_matrix)
print(f"\nTrue Positives (TP): {tp}")
print(f"False Positives (FP): {fp}")
print(f"True Negatives (TN): {tn}")
print(f"False Negatives (FN): {fn}")
print(f"Accuracy: {accuracy:.2f}")
print(f"Error Rate: {error_rate:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")

Explanation of Each Step:

1. Loading the Dataset
   o Read Social_Network_Ads.csv into a Pandas DataFrame.
2. Data Preprocessing
   o Selected Age and EstimatedSalary as features.
   o Used Purchased as the target variable.
   o Applied StandardScaler() for feature scaling.
3. Splitting the Data
   o Split into 80% training and 20% testing using train_test_split().
4. Training the Model
   o Trained a Logistic Regression model using LogisticRegression().
5. Making Predictions
   o Predicted labels for the test set using .predict().
6. Computing the Confusion Matrix
   o Extracted True Positives (TP), False Positives (FP), True Negatives (TN), False Negatives (FN).
   o Calculated Accuracy, Error Rate, Precision, and Recall.
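
The six steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the handout's own code: the helper names make_stand_in_data and run_pipeline are hypothetical, the synthetic table only mimics the three columns named above (Age, EstimatedSalary, Purchased) so the block runs without Social_Network_Ads.csv, and random_state=42 is an arbitrary choice. The sketch also fits StandardScaler on the training split only, which may differ from the order used in the original exercise, where scaling is listed before the split.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def make_stand_in_data(n_rows=400, seed=0):
    """Build a synthetic stand-in table (ASSUMPTION: not the real dataset)."""
    rng = np.random.default_rng(seed)
    age = rng.integers(18, 61, size=n_rows)
    salary = rng.integers(15_000, 150_001, size=n_rows)
    # Hypothetical rule: older, better-paid users purchase more often.
    score = (0.15 * (age - 40) + (salary - 80_000) / 30_000
             + rng.normal(0.0, 1.0, size=n_rows))
    return pd.DataFrame({
        "Age": age,
        "EstimatedSalary": salary,
        "Purchased": (score > 0).astype(int),
    })


def run_pipeline(df, test_size=0.2, random_state=42):
    """Steps 2 to 6: preprocess, split, train, predict, evaluate."""
    X = df[["Age", "EstimatedSalary"]]   # features
    y = df["Purchased"]                  # target

    # 80% training / 20% testing split.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state)

    # Fit the scaler on the training split only, then reuse it on the test split.
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    model = LogisticRegression()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # For binary labels [0, 1], ravel() yields TN, FP, FN, TP in that order.
    conf_matrix = confusion_matrix(y_test, y_pred, labels=[0, 1])
    tn, fp, fn, tp = conf_matrix.ravel()

    accuracy = accuracy_score(y_test, y_pred)
    return {
        "conf_matrix": conf_matrix,
        "tp": int(tp), "fp": int(fp), "tn": int(tn), "fn": int(fn),
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
    }


# With the real file, step 1 would be: df = pd.read_csv("Social_Network_Ads.csv")
results = run_pipeline(make_stand_in_data())
print(results["conf_matrix"])
print(f"Accuracy: {results['accuracy']:.2f}")
```

The returned counts tie directly to the reported metrics: accuracy is (TP + TN) / (TP + TN + FP + FN), error rate is 1 minus accuracy, precision is TP / (TP + FP), and recall is TP / (TP + FN).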