
ML PR

Uploaded by

Rohit Majumder

BHARATI VIDYAPEETH (DEEMED TO BE UNIVERSITY)

COLLEGE OF ENGINEERING

DEPARTMENT OF ENGINEERING & TECHNOLOGY, OFF CAMPUS,

KHARGHAR, NAVI MUMBAI - 410210

Mini Project Report


On
Wine Quality Prediction
Subject: Machine Learning
Presented By

Roll No.   Name             PRN

10         Rohit Majumder   2043110144
13         Divyansh Jain    2043110158
16         Neha Shaikh      2143110208

Signature of Internal Examiner Signature of External Examiner


CERTIFICATE
This is to certify that the requirements for the project report entitled ‘Wine Quality Prediction’ have
been successfully completed by the following students:

Name             PRN No.

Rohit Majumder   2043110144
Divyansh Jain    2043110158
Neha Shaikh      2143110208

In partial fulfillment of B. Tech in the Department of CSE, BVDU DET, during the Academic Year
2023 – 2024.

Assistant Prof. Snehal Bode

Subject In charge


DECLARATION

We declare that this written submission for the B.Tech project entitled “Wine Quality Prediction” represents our ideas in our
own words, and where others' ideas or words have been included, we have adequately cited and referenced the original
sources. We also declare that we have adhered to all principles of academic honesty and integrity and have not
misrepresented, fabricated, or falsified any idea / data / fact / source in our submission. We understand that any
violation of the above will be cause for disciplinary action by the institute and may also evoke penal action from the
sources which have not been properly cited or from whom prior permission has not been taken when needed.

Project Group Members Signature

Rohit Majumder: __________


Neha Shaikh: __________
Divyansh Jain: __________

Abstract

The main goal of this project is to predict whether a wine is of good or bad quality. For centuries,
wine has been tasted and judged by humans relying on their senses. In recent times, industries have
been adopting newer technologies in all kinds of areas, yet some tasks, such as product quality
assurance, still depend on human expertise. As demand for the product grows over time, this manual
assessment has become an expensive process. This project therefore explores machine learning
techniques such as the MLP classifier, the Decision Tree classifier, and Support Vector Machines
(SVM) for product quality assurance. These techniques assess quality from the measurable
characteristics of the product and automate the process while minimizing human involvement. The
"Machine Learning-Based Wine Quality Prediction" project is a data-driven effort that uses machine
learning algorithms to predict the quality of wines from their physicochemical and sensory
attributes. The ability to predict wine quality accurately is valuable for winemakers and consumers
alike. The project uses a dataset of red and white wines, with attributes such as acidity, alcohol
content, and residual sugar, to build predictive models that estimate wine quality.

Index

Chapter No. Title

1 Introduction

2 Implementation

3 Results

4 Conclusion

5 References

Chapter 1

1.1 INTRODUCTION

The most defining period of human history will perhaps always be remembered as
the one in which computing moved from mainframes to PCs to the cloud, and now
to artificial intelligence. An important area of artificial intelligence that
has come into the limelight, called Machine Learning, allows computers to learn
on their own, without being explicitly programmed. With the concepts and ideas
of machine learning, we have been able to move from complex mathematical
calculations to iterating over big data, and at remarkable speed. This trend
has gained momentum over the last several years. Data mining, on the other
hand, involves discovering and sorting data within the large data sets
available in order to identify patterns and establish relationships, with the
aim of solving problems through data analysis. Machine learning and data mining
essentially use the same types of methods and processes; what differs is the
kind of data pre-processing and the end prediction. Both fields aim to predict
and present the most accurate results possible.

1.2 PROBLEM STATEMENT

The task is to predict wine quality on the test split of the Red Wine Quality
dataset and to measure the accuracy of a Logistic Regression model. The work
involves importing the dataset, checking the quality of the data (data
wrangling), and performing exploratory data analysis (univariate and bivariate)
using histograms, boxplots, and scatter plots. The dataset is then modelled
using various machine learning algorithms.
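
The abstract frames the task as good versus bad wine, and the problem statement names Logistic Regression. The following is a minimal sketch of that combination, not the project's own code. It runs on synthetic stand-in data rather than the wine dataset, and the cutoff of 7 for a "good" wine, the feature scaling step, and the name GOOD_THRESHOLD are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 11 physicochemical features (NOT the real wine data).
rng = np.random.default_rng(8)
X = rng.normal(size=(400, 11))
# Hypothetical quality score on a 3-8 scale, loosely driven by the last feature.
noise = rng.normal(scale=0.5, size=400)
quality = np.clip(np.round(5.5 + X[:, 10] + noise), 3, 8).astype(int)

# Assumed cutoff: scores of 7 or more count as "good" (1), the rest as "bad" (0).
GOOD_THRESHOLD = 7
y = (quality >= GOOD_THRESHOLD).astype(int)

# Stratify so both classes appear in the training and test splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=8, stratify=y
)

# Scale the features, then fit logistic regression.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(f"Test accuracy on synthetic data: {test_accuracy:.3f}")
```

With the real data, X and quality would come from the loaded CSV instead of the random generator. Because "good" wines are the minority class under such a cutoff, accuracy alone can look flattering, so a confusion matrix is worth checking as well.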

1.3 OBJECTIVE

o Build a Jupyter notebook in Anaconda, import the data, and view the data
loaded into the notebook.
o Use pandas to clean and prepare the data.
o Use scikit-learn to create the machine learning model.
o Use Matplotlib to visualize the model's performance.
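
As an illustration of the cleaning objective above, here is a small sketch of a data quality check with pandas. The frame is hand-made and its values are illustrative only. Dropping duplicates and rows with gaps is one possible policy and an assumption here, not something the report prescribes.

```python
import pandas as pd

# Tiny hand-made frame standing in for winequality-red.csv (illustrative values).
raw = pd.DataFrame(
    {
        "alcohol": [9.4, 9.8, 9.8, 10.5, None],
        "pH": [3.51, 3.20, 3.20, 3.30, 3.40],
        "quality": [5, 5, 5, 7, 6],
    }
)

# Quality check: count missing values per column and fully duplicated rows.
missing_per_column = raw.isna().sum()
duplicate_count = int(raw.duplicated().sum())

# One possible cleaning policy (an assumption): drop duplicates, then rows with gaps.
clean = raw.drop_duplicates().dropna().reset_index(drop=True)

print(missing_per_column)
print("duplicate rows:", duplicate_count)
print(clean)
```

Whether duplicates should be dropped depends on the data: two different wines can share identical measurements, so removing them is a judgment call.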

Chapter 2

Implementation

from tkinter import *
import numpy as np

def showQuality():
    # Read the 11 feature values from the entry fields, in training-column order
    entries = [e1, e2, e3, e4, e5, e6, e7, e8, e9, e10, e11]
    new = np.array([[float(e.get()) for e in entries]])
    Ans = RF_clf.predict(new)
    fin = str(Ans)[1:-1]  # strip the surrounding [ ] from the array's string form
    quality.delete(0, END)  # clear any previous prediction
    quality.insert(0, fin)

#------------------------------------------------------------------------
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# For this kernel, I am only using the red wine dataset
data = pd.read_csv('winequality-red.csv')
data.head()

#Summary statistics
data.describe()

# All columns have the same number of data points

# Check for duplicate rows
extra = data[data.duplicated()]
extra.shape

# Let's proceed to separate 'quality' as the target variable and the rest as features.
y = data.quality  # set 'quality' as target
X = data.drop('quality', axis=1)  # rest are features
print(y.shape, X.shape)

# Let's look at the correlation among the variables using a correlation heatmap
colormap = plt.cm.viridis
plt.figure(figsize=(12, 12))
plt.title('Correlation of Features', y=1.05, size=15)
sns.heatmap(data.astype(float).corr(), linewidths=0.1, vmax=1.0, square=True,
            cmap=colormap, linecolor='white', annot=True)

#Use Random Forest Classifier to train a prediction model

from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, log_loss
from sklearn.metrics import confusion_matrix

#Split data into training and test datasets


seed = 8 # set seed for reproducibility
X_train, X_test, y_train, y_test = train_test_split(X, y,
test_size=0.2,random_state=seed)

#Train and evaluate the Random Forest Classifier with Cross Validation
# Instantiate the Random Forest Classifier
RF_clf = RandomForestClassifier(random_state=seed)

# Compute k-fold cross validation on training dataset and see mean accuracy score
cv_scores = cross_val_score(RF_clf,X_train, y_train, cv=10, scoring='accuracy')

#Perform predictions
RF_clf.fit(X_train, y_train)
pred_RF = RF_clf.predict(X_test)

#------------------------------------------------------------------------
master = Tk()

Label(master, text="Fixed Acidity", anchor="nw", width=20).grid(row=0)
Label(master, text="Volatile Acidity", anchor="nw", width=20).grid(row=1)
Label(master, text="Citric Acid", anchor="nw", width=20).grid(row=2)
Label(master, text="Residual Sugar", anchor="nw", width=20).grid(row=3)
Label(master, text="Chlorides", anchor="nw", width=20).grid(row=4)
Label(master, text="Free Sulfur Dioxide", anchor="nw", width=20).grid(row=5)
Label(master, text="Total Sulfur Dioxide", anchor="nw", width=20).grid(row=6)
Label(master, text="Density", anchor="nw", width=20).grid(row=7)
Label(master, text="pH", anchor="nw", width=20).grid(row=8)
Label(master, text="Sulphates", anchor="nw", width=20).grid(row=9)
Label(master, text="Alcohol", anchor="nw", width=20).grid(row=10)
Label(master, text="Quality", anchor="nw", width=20).grid(row=13)

e1 = Entry(master)
e2 = Entry(master)
e3 = Entry(master)
e4 = Entry(master)

e5 = Entry(master)
e6 = Entry(master)
e7 = Entry(master)
e8 = Entry(master)
e9 = Entry(master)
e10 = Entry(master)
e11 = Entry(master)
quality = Entry(master)

e1.grid(row=0, column=1)
e2.grid(row=1, column=1)
e3.grid(row=2, column=1)
e4.grid(row=3, column=1)
e5.grid(row=4, column=1)
e6.grid(row=5, column=1)
e7.grid(row=6, column=1)
e8.grid(row=7, column=1)
e9.grid(row=8, column=1)
e10.grid(row=9, column=1)
e11.grid(row=10, column=1)
quality.grid(row=13, column=1)

Button(master, text='Quit', command=master.destroy, width=15).grid(
    row=11, column=0, sticky=W, pady=4)
Button(master, text='Find Quality', command=showQuality, width=17).grid(
    row=11, column=1, sticky=W, pady=4)
Button(master, text='Project By', width=15).grid(row=14, column=0, sticky=W, pady=4)
Button(master, text='Mayur S. Satav', width=17).grid(row=14, column=1, sticky=W, pady=4)

mainloop( )
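
The showQuality callback converts each entry with float(...), which raises ValueError when a field is empty or not numeric. One possible guard is sketched below using only the standard library. The names parse_features and FEATURE_NAMES are hypothetical and not part of the listing above; the field names follow the dataset's column order.

```python
FEATURE_NAMES = [
    "Fixed Acidity", "Volatile Acidity", "Citric Acid", "Residual Sugar",
    "Chlorides", "Free Sulfur Dioxide", "Total Sulfur Dioxide", "Density",
    "pH", "Sulphates", "Alcohol",
]


def parse_features(raw_values):
    """Convert 11 text-field strings to floats, naming the first bad field."""
    if len(raw_values) != len(FEATURE_NAMES):
        raise ValueError(
            f"expected {len(FEATURE_NAMES)} values, got {len(raw_values)}"
        )
    parsed = []
    for name, text in zip(FEATURE_NAMES, raw_values):
        try:
            parsed.append(float(text.strip()))
        except ValueError:
            # Re-raise with the field name so the user knows what to correct.
            raise ValueError(f"{name}: {text!r} is not a number") from None
    return parsed


# Usage: in the GUI, raw_values would be the .get() result of each entry field.
sample = ["7.4", "0.70", "0", "1.9", "0.076", "11", "34",
          "0.9978", "3.51", "0.56", "9.4"]
print(parse_features(sample))
```

In the GUI, the callback could catch this ValueError and show its message in the Quality field instead of failing silently in the console.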

Chapter 3
Results
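
The listing in Chapter 2 computes cv_scores and pred_RF without printing them. The sketch below shows how such scores could be reported, reusing the same variable names. It runs on synthetic stand-in data with three made-up classes, so its numbers say nothing about the wine dataset or the project's actual results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split

seed = 8
# Synthetic stand-in data (NOT the wine dataset): 11 features, labels 0/1/2.
rng = np.random.default_rng(seed)
X = rng.normal(size=(300, 11))
y = np.digitize(X[:, 0] + X[:, 1], bins=[-0.7, 0.7])  # three classes

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=seed
)

RF_clf = RandomForestClassifier(random_state=seed)

# Mean and spread of the 10-fold cross-validation accuracy on the training split.
cv_scores = cross_val_score(RF_clf, X_train, y_train, cv=10, scoring="accuracy")
print(f"CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")

# Hold-out accuracy and confusion matrix (rows: true class, columns: predicted).
RF_clf.fit(X_train, y_train)
pred_RF = RF_clf.predict(X_test)
test_accuracy = accuracy_score(y_test, pred_RF)
cm = confusion_matrix(y_test, pred_RF)
print(f"Test accuracy: {test_accuracy:.3f}")
print(cm)
```

The diagonal of the confusion matrix counts correct predictions per class, so its sum divided by the total equals the hold-out accuracy.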

Chapter 4

Conclusion
In conclusion, our wine quality prediction project demonstrated that wine quality scores can be
predicted from an analysis of key physicochemical attributes. The chosen machine learning model, a
Random Forest classifier, performed well and provided insight into the factors that influence wine
quality. Although the dataset has limitations, this project lays the foundation for practical
applications in the wine industry, helping winemakers and distributors make informed decisions
about production and quality control. Future work can explore enhancements and expanded datasets
to further improve the predictive accuracy of the model, contributing to the ongoing improvement
of wine quality assessment processes.

References

Dataset has been taken from the following link:

https://www.kaggle.com/uciml/red-wine-quality-cortez-et-al-2009

Research papers:

Wine Quality Prediction using Machine Learning Algorithms, by Devika Pawar, Aakansha Mahajan,
Sachin Bhoithe.

Links:

1. https://www.verzeo.in/

2. https://www.tutorialspoint.com/machine_learning/what_is_machine_learning.htm

3. https://towardsdatascience.com/exploratory-data-analysis-8fc1cb20fd15

4. https://en.wikipedia.org/wiki/Machine_learning
