Text Classification To Predict Skin Concerns Over Skincare Using Bidirectional Mechanism in Long Short-Term Memory
Corresponding Author:
Andre Hangga Wangsa
Department of Informatics, Faculty of Computer Science, Universitas Mercu Buana
Jalan Raya Meruya Selatan no. 1, Kembangan Jakarta Barat-16550, Indonesia
Email: [email protected]
1. INTRODUCTION
Skincare products are among the most mainstream cosmetics for maintaining skin integrity, appearance, and
condition. High market demand has made skincare products one of the most popular ways to deal with
skin concerns [1]. Moreover, skincare trends began to rise drastically in 2020, when the COVID-19
pandemic began [2].
Skincare has various types and benefits according to the active ingredients contained in it [3]. Active
ingredients here play an important role in the performance of every skincare product because these
ingredients are chemicals that actively work on a specific target skin concern [4]. For example, salicylic acid
can reduce sebum secretion so that it can control oily skin and acne, but on the other hand, it can also cause
inflammation in sensitive and dry skin [5]. This is what makes skincare products difficult to use and
unsuitable for beginners; the user must thoroughly understand what is contained in them in order to meet
their skin care concerns and expectations [6].
Most beauty stores already sort their skincare products manually by brand, skin type, and skin concern,
but this takes a long time and requires someone knowledgeable about skincare products. Instead, by collecting
all the information related to skincare products, such as how a product addresses certain skin concerns, we
might be able to build a model that automatically classifies and predicts the benefits of those skincare
products quickly.
The information used for classification and prediction is text data, namely the descriptions of
skincare products, so the task is a text classification problem. Text classification is one of the tasks in natural
language processing (NLP), which aims to assign labels or targets to textual units such as sentences,
queries, paragraphs, and documents [7]. Text classification problems fall into two types: binary and multi-class
classification. Binary classification involves only two labels, one of which is assigned to each instance in an
arbitrary feature space X [8], whereas multi-class classification has more than two labels [9].
There is a wide range of research on multi-class classification problems, spanning different domains
or topics, data types, and algorithms. Although there is currently no research related to skincare
products, several studies address the dermatology domain. The goal of Indriyani and Sudarma's
[3] study was to categorize facial skin types into four categories: normal, dry, oily, and
combination skin. They used an image dataset of sixty facial images captured manually with a digital
camera. Although this places the work in computer vision rather than NLP, it is still a multi-class
classification of facial skin types using a supervised learning algorithm, the support vector machine
(SVM). The result was an average accuracy score of 91.66% and an average running time of 31.571 seconds,
which is higher than in previous studies [10]. The following study made extensive use of deep neural network
algorithms such as the convolutional neural network (CNN), recurrent neural network (RNN), and long short-
term memory (LSTM) [11]. Although a few of its cases are binary classification because some datasets
only have two classes, the majority of the datasets have between five and ten classes. The research combines
several of those algorithms into a hybrid framework, and some algorithms are also modified with a
bidirectional mechanism. The proposed model achieved excellent performance on all tasks. The bidirectional
recurrent convolutional neural network attention-based model (BRCAN) gave accuracy scores on the four multi-
class classification tasks of 73.46%, 75.05%, 77.75%, and 97.86%; those results are higher than those of all
comparison algorithms.
In relation to the aforementioned studies, we propose a comparison of the unidirectional (regular) LSTM
and the bidirectional long short-term memory (Bi-LSTM) on our own dataset, collected from several online
skincare stores, to classify the skin concerns addressed by each skincare product. The main purpose of this
research is to determine the difference in performance between the two proposed algorithms. In other research,
bidirectional mechanisms, whose layers process a sequence both forward and backward, have been able to
outperform the unidirectional LSTM [12].
2. METHOD
This section presents the research methodology. The researcher must first obtain the data to be
studied. The collected data is still raw, so it must then be prepared into a dataset that can be processed.
After the data has gone through the processing stage, the final stage is to evaluate the research model or
instrument to understand its performance, as well as its strengths and weaknesses. The stages are shown in
more detail in Figure 1.
The data was collected through web scraping, a technique for extracting data from websites and storing it in a
local database or spreadsheet [13]. The data was collected from online beauty store websites, namely
lookfantastic.com, dermstore.com, allbeauty.com, sokoglam.com, and spacenk.com, which market products
such as skincare, makeup, and beauty tools.
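As an illustration only, the sketch below shows one way such product descriptions could be scraped and stored locally; the selectors, field names, and output file are hypothetical and not taken from the paper.

# Minimal web-scraping sketch (hypothetical selectors and fields); the actual
# scraper and each site's page structure are not specified in this paper.
import csv
import requests
from bs4 import BeautifulSoup

def scrape_product_page(url: str) -> dict:
    """Fetch one product page and extract its name and description text."""
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    # The CSS classes below are placeholders; real sites use their own markup.
    name = soup.select_one(".product-name")
    description = soup.select_one(".product-description")
    return {
        "name": name.get_text(strip=True) if name else "",
        "description": description.get_text(strip=True) if description else "",
    }

def save_rows(rows: list, path: str = "skincare_raw.csv") -> None:
    """Store the scraped rows in a local CSV file (the 'spreadsheet' form of raw data)."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "description"])
        writer.writeheader()
        writer.writerows(rows)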
Figure 4. The difference between (a) regular dropout, which drops neurons in the neural network
independently, and (b) spatial dropout 1D, which drops entire 1D feature maps
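To make the distinction concrete, the hedged sketch below applies both layers to a small random embedded batch; the use of TensorFlow/Keras and the shapes and rates are illustrative assumptions, not the paper's exact setup.

# Contrast regular Dropout with SpatialDropout1D on an embedded text batch.
import tensorflow as tf

# A batch of 2 embedded sequences: (batch, timesteps, embedding_dim).
x = tf.random.uniform((2, 5, 4))

# Regular dropout zeroes individual activations independently.
regular = tf.keras.layers.Dropout(0.3)(x, training=True)

# Spatial dropout 1D zeroes entire embedding channels (1D feature maps)
# across all timesteps, which suits correlated word-embedding features.
spatial = tf.keras.layers.SpatialDropout1D(0.3)(x, training=True)

print(regular.numpy().round(2))
print(spatial.numpy().round(2))  # whole columns (feature maps) become zero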
LSTM processes data in the form of a sequence. Compared with its predecessor, the vanilla RNN, which
struggles to retain information over long sequences, LSTM performs better thanks to its long-term memory.
LSTM transforms the memory structure of the cells in the RNN by reworking the tanh activation function layer
into a structure containing memory units and gate mechanisms, which decide how to use and update the data
stored in the memory cells [21]. A newer mechanism for these sequential neural networks is the bidirectional
mechanism, which makes the network work like a two-way mirror by training on the input data twice, from
past to future and from future to past. By implementing the bidirectional concept, a regular LSTM is able to
train on the input data not only forward but also backward. According to Figure 5 and Figures 6(a) and 6(b),
these models use the following formulas to calculate the predicted values,
Figure 6. The differences of the LSTM in (a) unidirectional and (b) bidirectional mechanisms
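As a rough illustration of the two-directional pass (not the paper's implementation), the following sketch runs a placeholder recurrent step forward and backward over a sequence and concatenates the resulting states; step() stands in for a full LSTM cell update.

# Conceptual sketch of the bidirectional idea: one pass over the sequence
# forward and one backward, then concatenate the two final hidden states.
import numpy as np

def step(h, x):
    return np.tanh(h + x)  # placeholder for a full LSTM cell update

def run(sequence, reverse=False):
    h = np.zeros_like(sequence[0])
    for x in (reversed(sequence) if reverse else sequence):
        h = step(h, x)
    return h

sequence = [np.random.randn(4) for _ in range(10)]
h_forward = run(sequence)                  # past -> future
h_backward = run(sequence, reverse=True)   # future -> past
h_bidirectional = np.concatenate([h_forward, h_backward])  # state size doubles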
The forget gate computes the measure that decides how much of the previous memory values to remove from
the cell state. Similarly, the input gate determines how much of the new input enters the cell state. Then, the
LSTM's cell state $C_t$ and the output $H_t$ at time $t$ are calculated as,
$C_t = f_t \odot C_{t-1} + i_t \odot g_t, \quad H_t = O_t \odot \sigma_c(C_t)$ (2)

$C_t = f_t \odot C_{t-1} + i_t \odot d(g_t)$ (3)
where $d$ denotes dropout. The next parameter is the usual dropout, which we apply in the same way as the
recurrent dropout in both the LSTM and Bi-LSTM layers. The last parameter is the L2 regularizer, a layer
weight regularizer that enforces penalties on layer parameters or layer activity during the optimization
process. These penalties are added to the loss function that the network optimizes, applied on a per-layer
basis; the regularizer can be applied in three ways: to the layer's kernel, bias, or output. The L2 regularizer
adds the squared weights to the loss function. L2 is often set to a value on a logarithmic scale between 0 and
0.1, such as 0.1, 0.001, and 0.0001.
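A minimal sketch of how these parameters might be wired together in a Keras-style model is shown below; the framework choice, unit counts, dropout rates, number of classes, and L2 strength are illustrative assumptions rather than the paper's exact configuration.

# Sketch of the two compared models: regular LSTM vs. Bi-LSTM, with the
# dropout, recurrent dropout, and L2 regularization parameters discussed above.
import tensorflow as tf
from tensorflow.keras import layers, regularizers

vocab_size, embed_dim, num_classes = 10000, 128, 5  # placeholder values

def build_model(bidirectional: bool) -> tf.keras.Model:
    lstm = layers.LSTM(
        64,
        dropout=0.2,                                  # usual (input) dropout
        recurrent_dropout=0.2,                        # dropout on recurrent connections
        kernel_regularizer=regularizers.l2(0.001),    # L2 penalty on layer weights
    )
    model = tf.keras.Sequential([
        layers.Embedding(vocab_size, embed_dim),
        layers.SpatialDropout1D(0.2),
        layers.Bidirectional(lstm) if bidirectional else lstm,
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

lstm_model = build_model(bidirectional=False)
bilstm_model = build_model(bidirectional=True)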
TP = True Positive: a skin concern that is in the actual label and appears in the prediction.
FP = False Positive: a skin concern that is not in the actual label but appears in the prediction.
FN = False Negative: a skin concern that is in the actual label but does not appear in the prediction.
TN = True Negative: a skin concern that is neither in the actual label nor in the prediction.
Precision is the percentage of predicted positive cases that are truly positive [23]. Precision is calculated as,

$\text{Precision} = \frac{TP}{TP + FP}$ (5)
Recall is the percentage of actual positive cases that were correctly predicted; it measures the coverage of
positive cases and how accurately they are reflected in the predictions [23]. Recall is calculated as,

$\text{Recall} = \frac{TP}{TP + FN}$ (6)
F1-Measure is a composite measure that captures the trade-off between precision and recall, calculated as,

$\text{F1-Measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$ (7)
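For illustration, a small sketch computing equations (5)-(7) from assumed TP, FP, and FN counts; the counts are made-up examples, not results from this study.

# Compute precision, recall, and F1 from raw counts, matching equations (5)-(7).
def precision_recall_f1(tp: int, fp: int, fn: int):
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # equation (5)
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # equation (6)
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)            # equation (7)
    return precision, recall, f1

print(precision_recall_f1(tp=40, fp=5, fn=10))  # (0.888..., 0.8, 0.842...)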
The loss function used is categorical cross-entropy. Categorical cross-entropy is specifically used for
multi-class classification and increases or decreases the relative penalty of a probabilistic false negative for
an individual class [24]. The categorical cross-entropy loss is calculated with the following formula,

$\text{Loss} = -\sum_{i=1}^{\text{output size}} y_i \cdot \log \hat{y}_i$ (8)
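A minimal numeric sketch of equation (8) for a single sample, using arbitrary example probabilities rather than values from the paper.

# Categorical cross-entropy for one sample: y is the one-hot true label,
# y_hat is the model's softmax output over the classes.
import numpy as np

y = np.array([0.0, 1.0, 0.0])        # true class is index 1
y_hat = np.array([0.2, 0.7, 0.1])    # predicted class probabilities

loss = -np.sum(y * np.log(y_hat))    # equation (8)
print(round(loss, 4))                # ~0.3567, i.e. -ln(0.7)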
4. CONCLUSION
The findings show satisfactory performance, with adequate scores obtained for both accuracy and loss.
The bidirectional LSTM model, which uses the bidirectional mechanism, outperforms the regular LSTM model,
which uses only the unidirectional mechanism: the LSTM model produces an accuracy score of 94.12% and a
loss value of 19.91%, while the bidirectional LSTM model produces an accuracy of 98.04% and a loss value
of 19.19%. The embedding layer, in which the data was previously transformed into tensor form, could be
adjusted by employing a popular word embedding such as word2vec or GloVe, which requires a large amount
of computing resources but can extract the semantic meaning of the features. Both of the proposed models were
able to train effectively on the dataset that we obtained from well-known websites specializing in the sale of
skincare items. As a result, the prediction successfully maps skin concerns to the description of each skincare
product, both for unseen or validation data and for descriptions that we manually enter into the models.
Additionally, given the dataset that we have, this research has the potential to be further expanded into a
recommendation system for online retailers that offer skincare goods, as well as a mobile application.
ACKNOWLEDGEMENTS
We would like to thank all colleagues at the Faculty of Computer Science, Universitas Mercu Buana
who were involved in this research, both for their knowledge assistance and for their other support.
REFERENCES
[1] J. E. Lee, M. L. Goh, and M. N. Bin Mohd Noor, “Understanding purchase intention of university students towards skin care
products,” PSU Research Review, vol. 3, no. 3, pp. 161–178, 2019, doi: 10.1108/prr-11-2018-0031.
[2] H. Symum, F. Islam, H. K. Hiya, and K. M. A. Sagor, “Assessment of the Impact of COVID-19 pandemic on population level
interest in Skincare: Evidence from a google trends-based Infodemiology study,” medRxiv, 2020, doi:
10.1101/2020.11.16.20232868.
[3] Indriyani and I. Made Sudarma, “Classification of facial skin type using discrete wavelet transform, contrast, local binary pattern
and support vector machine,” Journal of Theoretical and Applied Information Technology, vol. 98, no. 5, pp. 768–779, 2020.
[4] A. Borrego-Sánchez, C. I. Sainz-Díaz, L. Perioli, and C. Viseras, “Theoretical study of retinol, niacinamide and glycolic acid with
halloysite clay mineral as active ingredients for topical skin care formulations,” Molecules, vol. 26, no. 15, 2021, doi:
10.3390/molecules26154392.
[5] S. Khezri and K. Khezri, “The side effects of cosmetic consumption and personal care products,” Journal of Advanced Chemical
and Pharmaceutical Materials (JACPM), vol. 2, no. 3, pp. 152–156, 2019, [Online]. Available:
http://advchempharm.ir/journal/index.php/JACPM/article/view/121.
[6] S. Cho et al., “Knowledge and behavior regarding cosmetics in Koreans visiting dermatology clinics,” Annals of Dermatology,
vol. 29, no. 2, pp. 180–186, 2017, doi: 10.5021/ad.2017.29.2.180.
[7] S. Minaee, N. Kalchbrenner, E. Cambria, N. Nikzad, M. Chenaghlu, and J. Gao, “Deep Learning Based Text Classification: A
Comprehensive Review,” 2020, [Online]. Available: http://arxiv.org/abs/2004.03705.
[8] A. Arami, A. Poulakakis-Daktylidis, Y. F. Tai, and E. Burdet, “Prediction of Gait Freezing in Parkinsonian Patients: A Binary
Classification Augmented With Time Series Prediction,” IEEE transactions on neural systems and rehabilitation engineering : a
publication of the IEEE Engineering in Medicine and Biology Society, vol. 27, no. 9, pp. 1909–1919, 2019, doi:
10.1109/TNSRE.2019.2933626.
[9] L. Tang, Y. Tian, and P. M. Pardalos, “A novel perspective on multiclass classification: Regular simplex support vector
machine,” Information Sciences, vol. 480, pp. 324–338, 2019, doi: 10.1016/j.ins.2018.12.026.
[10] S. A. Wulandari, W. A. Prasetyanto, and M. D. Kurniatie, “Classification of Normal, Oily and Dry Skin Types Using a 4-
Connectivity and 8-Connectivity Region Properties Based on Average Characteristics of Bound,” Jurnal Transformatika, vol. 17,
no. 01, pp. 78–87, 2019, [Online]. Available: journals.usm.ac.id/index.php/transformatika.
[11] J. Zheng and L. Zheng, “A Hybrid Bidirectional Recurrent Convolutional Neural Network Attention-Based Model for Text
Classification,” IEEE Access, vol. 7, pp. 106673–106685, 2019, doi: 10.1109/ACCESS.2019.2932619.
[12] R. L. Abduljabbar, H. Dia, and P. W. Tsai, “Unidirectional and bidirectional LSTM models for short-term traffic prediction,”
Journal of Advanced Transportation, vol. 2021, 2021, doi: 10.1155/2021/5589075.
[13] R. Diouf, E. N. Sarr, O. Sall, B. Birregah, M. Bousso, and S. N. Mbaye, “Web Scraping: State-of-the-Art and Areas of
Application,” Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019, pp. 6040–6042, 2019, doi:
10.1109/BigData47090.2019.9005594.
[14] S. Hara, A. Nitanda, and T. Maehara, “Data cleansing for models trained with SGD,” Advances in Neural Information Processing
Systems, vol. 32, 2019.
[15] M. A. Rosid, A. S. Fitrani, I. R. I. Astutik, N. I. Mulloh, and H. A. Gozali, “Improving Text Preprocessing for Student Complaint
Document Classification Using Sastrawi,” IOP Conference Series: Materials Science and Engineering, vol. 874, no. 1, 2020, doi:
10.1088/1757-899X/874/1/012017.
[16] F. Mohammad, “Is preprocessing of text really worth your time for toxic comment classification?,” 2018 World Congress in
Computer Science, Computer Engineering and Applied Computing, CSCE 2018 - Proceedings of the 2018 International
Conference on Artificial Intelligence, ICAI 2018, pp. 447–453, 2018.
[17] I. Boban, A. Doko, and S. Gotovac, “Sentence retrieval using Stemming and Lemmatization with different length of the queries,”
Advances in Science, Technology and Engineering Systems, vol. 5, no. 3, pp. 349–354, 2020, doi: 10.25046/aj050345.
[18] X. Zhaok et al., “AutoEmb: Automated Embedding Dimensionality Search in Streaming Recommendations,” Proceedings - IEEE
International Conference on Data Mining, ICDM, vol. 2021-December, pp. 896–905, 2021, doi:
10.1109/ICDM51629.2021.00101.
[19] D. López-Sánchez, J. R. Herrero, A. G. Arrieta, and J. M. Corchado, “Hybridizing metric learning and case-based reasoning for
adaptable clickbait detection,” Applied Intelligence, vol. 48, no. 9, pp. 2967–2982, 2018, doi: 10.1007/s10489-017-1109-7.
[20] G. Cheng, V. Peddinti, D. Povey, V. Manohar, S. Khudanpur, and Y. Yan, “An exploration of dropout with LSTMs,”
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2017-
August, pp. 1586–1590, 2017, doi: 10.21437/Interspeech.2017-129.
[21] D. Fitrianah and R. N. Jauhari, “Extractive text summarization for scientific journal articles using long short-term memory and
gated recurrent units,” Bulletin of Electrical Engineering and Informatics, vol. 11, no. 1, pp. 150–157, 2022, doi:
10.11591/eei.v11i1.3278.
[22] A. Dutta, S. Kumar, and M. Basu, “A Gated Recurrent Unit Approach to Bitcoin Price Prediction,” Journal of Risk and Financial
Management, vol. 13, no. 2, p. 23, 2020, doi: 10.3390/jrfm13020023.
[23] F. Ramzan et al., “A Deep Learning Approach for Automated Diagnosis and Multi-Class Classification of Alzheimer’s Disease
Stages Using Resting-State fMRI and Residual Neural Networks,” Journal of Medical Systems, vol. 44, no. 2, 2020, doi:
10.1007/s10916-019-1475-2.
[24] Y. Ho and S. Wookey, “The Real-World-Weight Cross-Entropy Loss Function: Modeling the Costs of Mislabeling,” IEEE
Access, vol. 8, pp. 4806–4813, 2020, doi: 10.1109/ACCESS.2019.2962617.
[25] J. Xu, Y. Zhang, and D. Miao, “Three-way confusion matrix for classification: A measure driven view,” Information Sciences,
vol. 507, pp. 772–794, 2020, doi: 10.1016/j.ins.2019.06.064.
[26] S. Shin, Y. Lee, M. Kim, J. Park, S. Lee, and K. Min, “Deep neural network model with Bayesian hyperparameter optimization
for prediction of NOx at transient conditions in a diesel engine,” Engineering Applications of Artificial Intelligence, vol. 94, 2020,
doi: 10.1016/j.engappai.2020.103761.
BIOGRAPHIES OF AUTHORS
Andre Hangga Wangsa is a student at Universitas Mercu Buana pursuing a bachelor's degree in
Computer Science. His interests are in Data Science fields such as Natural Language Processing and
Computer Vision. He can be contacted at email: [email protected].