Performance Evaluation of Multiple Machine Learning Models For Wine Quality Prediction
1,2 S2 Ilmu Komputer, Fakultas Teknologi Informasi, Universitas Nusa Mandiri, Indonesia
1* [email protected], [email protected]
*: Corresponding author
Abstract

Keywords: wine quality, voting classifier, model evaluation

This study evaluates the performance of nine machine learning models in predicting wine quality using a dataset from the UCI repository. The machine learning models used are Logistic Regression, K-Nearest Neighbor (KNN), Decision Tree, Support Vector Machine (SVM), Random Forest, XGBoost, LightGBM, CatBoost, and Gradient Boosting. The wine dataset used consists of 1,599 samples with 12 parameters.
Telematika: Jurnal Informatika dan Teknologi Informasi ISSN: 1829-667X / E-ISSN: 2460-9021
Vol. 21, No. 2, June 2024, pp. 209-223, DOI: 10.31515/telematika.v21i2.13007
1. Introduction
Wine is an alcoholic beverage made by fermenting grapes and other fruits. The production process involves yeast fermenting the natural sugars in the fruit, converting them into alcohol and carbon dioxide (CO2). Wine quality is influenced by various factors, including grape variety, fermentation techniques, storage conditions, and the wine's age [1]. Wine quality is crucial in the alcoholic beverage industry, directly impacting consumer satisfaction and market price. Experts typically perform quality assessments using sensory methods, which require experience and are subjective [2]. While these conventional methods have proven effective, they are time-consuming and costly. Therefore, a more efficient and objective approach to wine quality assessment is needed. Advances in technology,
particularly in Artificial Intelligence (AI) and machine learning, offer opportunities to develop
more efficient and objective approaches for assessing wine quality. Traditional approaches to
predicting wine quality use statistical methods such as linear regression or discriminant analysis.
However, with advancements in machine learning, particularly deep learning, there is an
opportunity to improve prediction accuracy. Deep learning models, such as Deep Neural
Networks (DNN) and Convolutional Neural Networks (CNN), have shown success in various
predictive applications [3]. These approaches aim to reduce the subjectivity associated with
human assessment and improve consistency in determining wine quality. Machine learning
algorithms, like deep learning, enable more in-depth and precise analysis of the physicochemical
data related to wine [4].
The primary journal referenced in this research discusses wine quality prediction using machine
learning algorithms. The data originates from a public dataset that includes various chemical
components in wine. Researchers conducted data analysis and visualization, applying several
machine learning algorithms, including Random Forest, XGBoost, and Decision Tree, to predict
wine quality. The findings indicate that the Random Forest model has the best predictive
accuracy at 66.8%, followed by XGBoost at 60.1% and Decision Tree at 59.5%. The researchers
concluded that these machine learning models perform well in predicting high-quality wine but
less so for low-quality wine [5].
Previous studies have explored the use of machine learning methods for wine quality prediction.
Research by Jeffrey A. Clarin compared the performance of several regression algorithms in
predicting white wine quality using a dataset from the UCI Machine Learning Repository and
implemented using WEKA. The study found that the Random Forest algorithm provided the
best performance with a correlation coefficient of r = 0.7459. Among the input variables, alcohol and acidity were significantly correlated with wine quality, with values of r = 0.44 and r = −0.391, respectively [6].
Another study used a machine learning approach to examine 1,599 wine samples, each containing 11 input parameters, to identify the factors with the most significant effect on overall wine quality. The linear regression models used in that study showed that alcohol and acidity were the primary factors influencing wine quality. Furthermore, heat maps were used to display the relationships among these factors. Further analysis used box plots and three-dimensional scatter plots to reinforce the conclusions drawn from the linear regression model, providing more specific insights into the factors that have the greatest influence on wine quality [4].
Other studies compared the performance of several regression models and of combined regression and ensemble models in predicting wine quality using the wine quality dataset from the UCI Machine Learning Repository. This dataset comprises white and red Vinho Verde wines from northern Portugal, with 6,497 samples. Before the models were trained, the dataset underwent appropriate preprocessing steps to ensure data quality and consistency. Five regression algorithms, Linear Regression (LR), Random Forest Regressor (RF), Support Vector Regression (SVR), Decision Tree Regressor (DT), and Multi-layer Perceptron Regressor (MLP), were trained and tested on the dataset. Additionally, predictions from these individual regression models were combined with four ensemble models: XGB Regressor (XGB), AdaBoost Regressor (ABR), Bagging Regressor (BR), and Gradient Boosting Regressor (GBR). The results showed that, among the individual models, Random Forest (RF) achieved the best performance, with the lowest MAE, MSE, and RMSE values and the highest R² score. This suggests that RF is better suited to the red wine quality dataset than the other regression models. However, combining Random Forest with Bagging Regressor (RF and BR) outperformed the individual models, yielding lower errors and generally higher R² scores [7].
2. Method/Design
This research employs quantitative methods with multiple machine-learning models. The nine
machine learning models used are Logistic Regression, K-Nearest Neighbor (KNN), Decision
Tree, Support Vector Machine (SVM), Random Forest, XGBoost, LightGBM, CatBoost, and
Gradient Boosting. The selection of these nine models provides a broad spectrum, allowing for
a comprehensive evaluation and precise accuracy comparison. This approach also helps identify
which model best fits the data characteristics. The overall research steps are shown in Figure
1.
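As an illustration of this model setup, the comparison could be sketched with scikit-learn as follows. Six of the nine models ship with scikit-learn; XGBoost, LightGBM, and CatBoost come from their own packages and are indicated as comments so the sketch runs with scikit-learn alone. The synthetic data and parameters are illustrative assumptions, not the exact configuration used in this study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Six of the nine models ship with scikit-learn; the remaining three come
# from their own packages, e.g.:
#   from xgboost import XGBClassifier
#   from lightgbm import LGBMClassifier
#   from catboost import CatBoostClassifier
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors (KNN)": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Support Vector Machine (SVM)": SVC(),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

# Synthetic stand-in for the wine data: 11 numeric features, binary label.
X, y = make_classification(n_samples=300, n_features=11, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

scores = {name: model.fit(X_tr, y_tr).score(X_te, y_te)
          for name, model in models.items()}
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

Each model is trained and scored with the same train/test split, which is the precondition for a fair accuracy comparison.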
Twelve chemical parameters were tested on 1,599 wine samples with varying values, as shown
in Table 2, which provides an example of the dataset used in this study.
Standardization rescales feature values to have zero mean and unit variance, approximating a standard normal distribution. This is very useful in algorithms such as Support Vector Machines (SVM) that assume the data is normally distributed. With standardization, features with different scales can be treated equally by the model, potentially improving overall model performance [13].
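A minimal sketch of the two preprocessing schemes discussed in this section, z-score standardization and min-max normalization, using illustrative values rather than the actual dataset:

```python
import numpy as np

def standardize(X):
    """Z-score standardization: each column rescaled to mean 0 and std 1."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def min_max_normalize(X):
    """Min-max normalization: each column rescaled to the [0, 1] range."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo)

# Two features on very different scales (illustrative values only,
# e.g. total sulfur dioxide vs. pH).
X = np.array([[34.0, 3.51],
              [67.0, 3.20],
              [54.0, 3.26],
              [60.0, 3.16]])
Z = standardize(X)
print(Z.mean(axis=0))  # approximately [0, 0]
print(Z.std(axis=0))   # approximately [1, 1]
```

After standardization both columns contribute on the same scale, regardless of their original units.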
2.6. Analysis
Subsequently, an analysis of the model performance evaluation results was conducted through
the confusion matrix. The confusion matrix provides an overview of the model's prediction
distribution and informs about the performance of the classification model by comparing
predicted values with actual values from the test data. The information includes the True
Positive (TP) value, which is the number of positive cases correctly predicted by the model,
meaning the model accurately identifies positive cases. True Negative (TN) is the number of
negative cases correctly predicted by the model, meaning the model accurately identifies
negative cases. False Positive (FP) is the number of negative cases incorrectly predicted as
positive by the model. False Negative (FN) is the number of positive cases incorrectly predicted
as negative by the model [14]. From these four values, further evaluation metrics such as
accuracy, precision, recall, and F1-score can be calculated [15].
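The four confusion-matrix counts described above can be computed directly from predicted and actual labels; the following sketch uses hypothetical labels for illustration:

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP and FN for a binary classification result."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)  # 3 3 1 1
```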
Accuracy is the total percentage of correct predictions out of all predictions made by the model. It is calculated as shown in Equation (1) and Equation (2).

Accuracy = (TP + TN) / (TP + FP + FN + TN)    (1)

Accuracy for more than one class:

Accuracy = (True Positive) / (Number of samples)    (2)

Precision shows the percentage of positive cases correctly predicted out of all positive predictions made by the model. It is calculated as shown in Equation (3).

Precision = TP / (TP + FP)    (3)

Recall (Sensitivity or True Positive Rate) is the percentage of positive cases correctly identified by the model out of all actual positive cases. It is calculated as shown in Equation (4).

Recall = TP / (TP + FN)    (4)

F1-score is the harmonic mean of precision and recall. The F1-score provides a balance between these two metrics and is useful when there is a class imbalance. It is calculated as shown in Equation (5).

F1-Score = 2 × (Recall × Precision) / (Recall + Precision)    (5)
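Equations (1) and (3)-(5) can be checked with a short sketch, using the hypothetical counts TP = 3, TN = 3, FP = 1, FN = 1 for illustration:

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1-score from the four counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)            # Equation (1)
    precision = tp / (tp + fp)                            # Equation (3)
    recall = tp / (tp + fn)                               # Equation (4)
    f1 = 2 * (recall * precision) / (recall + precision)  # Equation (5)
    return accuracy, precision, recall, f1

# Hypothetical counts for illustration.
acc, prec, rec, f1 = classification_metrics(tp=3, tn=3, fp=1, fn=1)
print(acc, prec, rec, f1)  # 0.75 0.75 0.75 0.75
```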
Table 3. Frequency distribution of wine quality

Quality | Frequency
5       | 681
6       | 638
7       | 199
4       | 53
8       | 18
3       | 10
From Table 3, it is evident that the wine quality with the highest number of samples is 5 with
681 samples, followed by 6 with 638 samples. Meanwhile, the qualities with the least number
of samples are 3 and 8, with 10 and 18 samples, respectively. The frequency distribution of wine
quality can be seen in Figure 2.
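The frequency counts in Table 3 can be obtained by tallying the dataset's quality column; the sketch below uses a short hypothetical excerpt of that column for illustration:

```python
from collections import Counter

# Hypothetical excerpt of the dataset's "quality" column; applying the same
# tally to the full 1,599-row column yields the counts in Table 3.
quality = [5, 6, 5, 7, 6, 5, 4, 8, 5, 6, 3, 5]
freq = Counter(quality)
for grade, count in freq.most_common():
    print(grade, count)  # quality 5 is the most frequent in this excerpt
```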
The heatmap displays correlation values ranging from -1 to 1. A value of 1 indicates a perfect positive correlation, meaning that as one attribute increases, the other attribute also increases proportionally. A value of -1 indicates a perfect negative correlation, meaning that as one attribute increases, the other attribute decreases proportionally. A value of 0 indicates no correlation between the two attributes. Based on the color interpretation, lighter colors (toward white) indicate stronger correlations (both positive and negative), while darker colors (toward black) indicate weaker or no correlation.
The correlation between variables shows a strong positive correlation between fixed acidity and
density (0.67) and between citric acid and fixed acidity (0.67). Total sulfur dioxide and free
sulfur dioxide also show a very strong positive correlation (0.67). The anticipated strong
correlation between alcohol and quality turned out to be moderately strong based on the
correlation heatmap, with a value of 0.48. Additionally, the correlations between volatile acidity
and quality, and between citric acid and pH, show moderately strong negative correlations, with
values of -0.39 and -0.54 respectively.
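Each cell of the heatmap is a Pearson correlation coefficient; as an illustration, the coefficient for a pair of attributes can be computed with NumPy (the sample values below are illustrative, not the study's data):

```python
import numpy as np

# Illustrative values for two attributes (not the study's data).
fixed_acidity = np.array([7.4, 7.8, 11.2, 7.9, 6.8, 8.5])
density = np.array([0.9978, 0.9968, 1.0010, 0.9970, 0.9955, 0.9990])

# Pearson correlation coefficient, the quantity shown in each heatmap cell.
r = np.corrcoef(fixed_acidity, density)[0, 1]
print(round(r, 3))
```

A positive r indicates the two attributes rise together, matching the positive fixed acidity / density relationship reported above.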
3.1. Modeling
The researcher compared the accuracy of models before and after normalization and
standardization, as shown in Table 5 and Figure 6.
Table 5. Model accuracy before and after normalization and standardization

Model                               | Before Normalization and Standardization | After Normalization | After Standardization
Logistic Regression                 | 56.7%  | 58.90% | 58.70%
K-Nearest Neighbors (KNN)           | 69.6%  | 75.10% | 76.70%
Support Vector Classifier (SVC)     | 42.7%  | 72.90% | 77.00%
Random Forest                       | 85.7%  | 85.30% | 85.60%
Decision Tree                       | 78.1%  | 78.70% | 79.00%
Extreme Gradient Boosting (XGBoost) | 77.5%  | 77.80% | 77.80%
LightGBM                            | 87.8%  | 87.80% | 86.40%
CatBoost                            | 86.4%  | 86.60% | 86.60%
Gradient Boosting                   | 81.8%  | 82.00% | 82.00%
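The large gain for SVC in Table 5 is consistent with SVM's sensitivity to feature scales. The effect can be reproduced in a small sketch with synthetic data (illustrative parameters, not the study's setup):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data where one feature is on a much larger scale than the rest.
X, y = make_classification(n_samples=400, n_features=11, random_state=0)
X = X * np.array([1, 1000, 1, 1, 1, 1, 1, 1, 1, 1, 1])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The same classifier with and without standardization in front of it.
raw = SVC().fit(X_tr, y_tr).score(X_te, y_te)
scaled = make_pipeline(StandardScaler(), SVC()).fit(X_tr, y_tr).score(X_te, y_te)
print(f"without scaling: {raw:.3f}, with scaling: {scaled:.3f}")
```

Putting the scaler inside a pipeline ensures it is fit only on the training fold, avoiding leakage into the test score.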
[Figure 6: bar chart comparing the accuracy (0-100%) of Logistic Regression, K-Nearest Neighbors (KNN), Support Vector Classifier (SVC), Random Forest, Decision Tree, Extreme Gradient Boosting (XGBoost), LightGBM, CatBoost, and Gradient Boosting.]
False Positive (FP): There are 1, 2, 3, 1, and 1 cases where the model incorrectly predicted "not
good" as "good".
False Negative (FN): There are 1, 2, 7, 8, 35, 12, and 4 cases where the model incorrectly
predicted "good" as "not good".
References
[1] R. Zhu, "Chemical Change and Quality Control in Winemaking," Scientific and Social Research, vol. 4, no. 7, pp. 62-67, 14 July 2022.
[2] M. H. Shahrajabian and W. Sun, "Assessment of Wine Quality, Traceability and Detection of Grapes Wine, Detection of Harmful Substances in Alcohol and Liquor Composition Analysis," Letters in Drug Design & Discovery, vol. 21, no. 8, pp. 1377-1399, doi: 10.2174/1570180820666230228115450, June 2024.
[3] L. Le, P. N. Hurtado, I. Lawrence, Q. Tian and B. Chen, "Applying Neural Networks in Wineinformatics with the New Computational Wine Wheel," Fermentation, vol. 9, no. 7, p. 629, doi: 10.3390/fermentation9070629, 2023.
[4] J. Dong, "Red Wine Quality Analysis based on Machine Learning Techniques," Highlights in Science, Engineering and Technology, vol. 49, pp. 208-213, doi: 10.54097/hset.v49i.8506, 2023.
[5] C. Zeng, J. Fang, Q. Yang, C. Xiang, Z. Zhao and Y. Lei, "Wine quality grade data analysis and prediction based on multiple machine learning algorithms," in Proceedings of the 2nd International Conference on Mechatronics and Smart Systems, 2024.
[6] J. A. Clarin, "Comparison of the Performance of Several Regression Algorithms in Predicting the Quality of White Wine in WEKA," International Journal of Emerging Technology and Advanced Engineering, vol. 12, no. 7, pp. 20-26, doi: 10.46338/ijetae0722_03, 3 July 2022.
[7] A. K., "Regression Modeling Approaches for Red Wine Quality Prediction: Individual and Ensemble," International Journal for Research in Applied Science & Engineering Technology (IJRASET), vol. 11, pp. 3621-3627, doi: 10.22214/ijraset.2023.54363, June 2023.
[8] N. Pourmoradi, "Red Wine Quality," Kaggle, 2023. [Online]. Available: https://www.kaggle.com/code/nimapourmoradi/red-wine-quality/input. [Accessed 21 June 2024].
[9] R. S. Jackson, Wine Science: Principles and Applications (3rd Edition), Burlington: Academic Press, 2008.
[10] A. Géron, Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (2nd Edition), N. Tache, Ed., Sebastopol: O'Reilly Media, Inc., 2019.
[11] Ridwan, E. H. Hermaliani and M. Ernawati, "Penerapan Metode SMOTE Untuk Mengatasi Imbalanced Data Pada," Computer Science (CO-SCIENCE), vol. 4, no. 1, e-ISSN: 2774-9711, pp. 80-88, January 2024.
[12] D. A. Nasution, H. H. Khotimah and N. Chamidah, "Perbandingan Normalisasi Data untuk Klasifikasi Wine Menggunakan Algoritma K-NN," CESS (Journal of Computer Engineering System and Science), vol. 4, no. 1, pp. 78-82, January 2019.
[13] S. Ioffe and C. Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift," in Proceedings of the 32nd International Conference on Machine Learning, PMLR, 2015.
[14] K. S. Nugroho, "Confusion Matrix untuk Evaluasi Model pada Supervised Learning," 13 November 2019. [Online]. Available: https://ksnugroho.medium.com/confusion-matrix-untuk-evaluasi-model-pada-unsupervised-machine-learning-bc4b1ae9ae3f. [Accessed 23 June 2024].
[15] S. Raschka and V. Mirjalili, Python Machine Learning, Birmingham: Packt Publishing Ltd., 2019.