Objective
Objective
USE customer_db;
import mysql.connector
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
# Fetch data
data = fetch_data(connection)
print("Data Retrieved from Database:")
print(data)
# Train model
model = train_model(data)
if __name__ == "__main__":
main()
How It Works:
1. Database Interaction:
Data is retrieved from the MySQL database using mysql.connector.
2. Data Preprocessing:
The retrieved data is converted into a Pandas DataFrame for easy manipulation.
3. Machine Learning:
A Logistic Regression model is trained to predict whether a customer will make a purchase based on
features like age, income, and browsing time.
4. Evaluation:
The program splits the data into training and testing sets and evaluates the model's accuracy.
Output:
2. Model Accuracy:
Model Accuracy: 100.00%
Detailed Explanation of Advanced DBMS Program Integrating MySQL and Python with Machine Learning
1. Purpose of the Program
The program is designed to integrate MySQL, a relational database, with Python, a programming language, to perform data
analysis and prediction using a Machine Learning algorithm (Logistic Regression). The goal is to:
Fetch customer data stored in a MySQL database.
Use the data to train a Logistic Regression model to predict if a customer will make a purchase based on features
like:
i. Age
ii. Income
iii. Browsing Time
Evaluate the model and print its accuracy.
3. Python Program
SQL Query: The query `SELECT age, income, browsing_time, purchase_made FROM customer_data` retrieves
relevant columns for analysis.
Fetching Data: The cursor object executes the query and fetches all rows into Python. The data is converted into a
Pandas DataFrame for easy manipulation and analysis.
The DataFrame serves as the input dataset for the machine learning algorithm.
Algorithm Choice: Logistic Regression is chosen as the machine learning algorithm. This algorithm is ideal for
binary classification problems where the target variable has two possible outcomes (1 = Purchase, 0 = No
Purchase).
Training: The model is trained on the training dataset using `model.fit(X_train, y_train)`. Logistic Regression
learns the relationship between features and the target variable.
Prediction: Once trained, the model is used to predict outcomes for the test dataset (`model.predict(X_test)`).
Step 5: Evaluation
Accuracy Score: The `accuracy_score` function compares the model’s predictions (`y_pred`) with the actual test
labels (`y_test`) to calculate the proportion of correct predictions. This score is displayed as a percentage,
providing a measure of the model’s effectiveness.
Step 6: Integration
Scalability: The program can handle large datasets stored in a database, making it suitable for real-world
applications.
Reusability: New data can be added to the database, and the program can re-train the model to improve
predictions.
Integration: It demonstrates how to integrate a relational database (MySQL) with advanced analytics (Python and
Machine Learning).
5. Output: