Clothing Recommendation System Using Visual Analytics
Information Analytics
Yun-Rou Lin, Wei-Hsiang Su, Chub-Hsien Lin, Bing-Fei Wu
Institute of Electrical and Control Engineering, National Chiao Tung University, Hsinchu, Taiwan
lilzoe@cssp.cn.nctu.edu.tw, [email protected], [email protected], [email protected]
Abstract—Due to the short life cycle of fashion styles, an ever greater variety of clothing designs appears. It is hard for consumers to find suitable clothes efficiently. To solve this problem, an automatic and reliable recommendation system is in great demand. In this paper, clothing attribute recognition, gender recognition, and body height are considered in designing the recommendation system. Based on the clothing style, gender, and body height, the system can recommend proper clothes of a suitable size. On-line texture modeling is proposed to produce variation in the clothing texture so that the recommendation system can give reasonable and diversified choices to consumers. Besides, data on consumers' wearing styles is also useful for building a better marketing strategy. For these reasons, the clothing recognition and recommendation system can create a win-win situation for consumers and the fashion industry.

Keywords—deep learning, clothing recognition, content-based filtering, recommendation system

I. INTRODUCTION

Nowadays, the rapid growth of the fashion industry contributes to a large and complex body of clothing designs. Too much information about clothing design leads to a dilemma of choice when consumers shop at an apparel store or on a website. Hence, it is necessary to apply an automatic and efficient way to filter information so that consumers can buy suitable clothes more easily. Recommendation systems are a solution to these problems.

According to related research on recommendation systems [1], [2], [3], recommendation systems can be categorized into three main types: collaborative filtering, the content-based approach, and the hybrid approach. Collaborative filtering [4], [5] is based on calculating the correlation, which refers to the relationships among the preferences and behaviors of all consumers. By computing the similarity of a consumer's subjective evaluations to those of other consumers to obtain the correlation, it can predict what consumers will prefer. A major advantage of collaborative filtering is that it can handle complex items from which appropriate features are not easy to extract. However, there are two problems when collaborative filtering is adopted. First, a new item will rarely be recommended. Second, it takes too much time to process the massive information of all consumers. The content-based approach [6], [7] can avoid the problems stated above. In a content-based system, the complex correlation among consumer preferences and past behaviors is not considered. Features of the item are abstracted to represent the item, and the system learns from the interests of consumers to recommend items. There is still a problem: potential interests of consumers will be ignored due to the recommendation strategy.

In this research, because of the difficulty in collecting enough profiles of preferences and activities from consumers, collaborative filtering is not the best way to establish a recommendation system. In addition, the features of the clothing can be easily represented by attributes, so a content-based system is utilized with several parts of visual information, which contains personal and clothing information. The main stages of our research are:

(1) Personal information prediction

Considering the gender and body height of the consumer can help implement a robust recommendation system. A gender recognition model is applied to predict the gender, which is used for determining the clothing category, and the body height, which is used for estimating the proper clothing size, is measured based on face detection.

(2) Clothing attributes recognition

Since content-based filtering is adopted, clothing attributes can be defined as features of clothing, so a clothing attribute recognition model is a necessary part of the system. The Convolutional Neural Network (CNN) is known as a leading approach to object classification, and it is easier to realize if the training data is ready; therefore, it is employed to recognize which attributes appear on the clothing of the consumer. These attributes will be the key factors for searching for similar clothes in the clothing gallery.

(3) Recommendation system

When the system gets personal information and clothing attributes in the two previous stages, it can figure out a similarity score which is computed by similarity calculation
Fig. 1. A schematic diagram for our body height prediction: true view (left),
image view (right)
where v is the vertical field of view, x is the horizontal distance between the consumer and the camera, and θ is the vertical angle of view of the camera. Moreover, the ratio r of the vertical field of view to the distance between the vertex of the consumer and the height of the camera can be defined as:

(2)

where h_i is the height of the image and h_v, which is called the vertex height in Fig. 1, is the height of the predicted face position in the image. Hence, combining the height of the camera, y, with (1) and (2), the body height, h, can be estimated as:

(3)

B. Clothing attributes recognition

To recognize the clothing attributes, the FashionAI dataset [17] is utilized to train a CNN model. The FashionAI dataset, which is built by Alibaba's team, contains five clothing

It takes less training time and the model is more robust if a pre-trained model is applied. After testing some state-of-the-art CNN models, InceptionV3 (GoogLeNet) performs the best and its number of parameters is relatively low.

However, the issue is that some of the attributes, like Collar Design, Neck Design, and Neckline Design, account for a really small percentage of the full image. It is hard for the model to learn to extract representative features of these attributes from an image. An effective method [18] is proposed to localize the area on which the model focuses by analyzing the confidence maps of the model, so the same idea is adopted to estimate the region of the clothing attributes. For the clothing attributes that occupy a small area, the recognition models are trained on both the whole image and the estimated region. In this way, the model can learn more information about these attributes, so the accuracy is improved from 83.80% to 87.59%.
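The region estimation described above can be illustrated with a minimal sketch. It assumes the confidence map is a small 2-D grid of non-negative scores and that the attribute region is the bounding box of all cells scoring at least a fraction of the peak value. The function names `attention_bbox` and `crop`, the `rel_threshold` parameter, and the thresholding rule itself are illustrative assumptions, not the procedure of [18].

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def attention_bbox(conf_map: Sequence[Sequence[float]],
                   rel_threshold: float = 0.5) -> Box:
    """Bounding box of cells whose score is >= rel_threshold * peak score.

    Assumption: scores are non-negative and the map is rectangular.
    """
    peak = max(max(row) for row in conf_map)
    cutoff = rel_threshold * peak
    # Rows and columns that contain at least one cell above the cutoff.
    rows = [r for r, row in enumerate(conf_map)
            if any(v >= cutoff for v in row)]
    cols = [c for c in range(len(conf_map[0]))
            if any(row[c] >= cutoff for row in conf_map)]
    return rows[0], cols[0], rows[-1], cols[-1]


def crop(image: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Cut the estimated region out of an image stored as nested lists."""
    top, left, bottom, right = box
    return [list(row[left:right + 1]) for row in image[top:bottom + 1]]
```

The cropped region and the whole image could then both be fed to the recognition model during training. Resizing the confidence map to the image resolution is omitted here and would be needed in practice.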
The texture of the clothing is another important factor for describing the clothing style. Related research on a color expansion method [19] shows a simple way to define a color by computing its value in HSV color space, which helps calculate similarity in the recommendation system. Consequently, texture extracting is proposed to extract the color feature from the clothing image. With the color feature, the color information of the clothing can be summarized.

There are two steps for texture extracting. The first step is to find the location of the clothing with the confidence map, and the second step is to crop the area of the clothing into several patches and then calculate the average HSV value of all patches. However, the texture of tops tends to be more complex than that of bottoms in general, so the average HSV value cannot accurately represent the true composition of the texture. To deal with tops, K-means is utilized to find the major components of the HSV values among these patches. All pixels captured from the patches are categorized into K clusters, and the centers of the clusters then become the color features.

TABLE I. CLOTHING ATTRIBUTES FROM FASHIONAI DATASET

Main types    Labels of attributes
Sleeve        Sleeveless, Cup, Short, Elbow, 3/4, Wrist, Long, Extra Long

C. Recommendation system

There are three steps in our recommendation system: building the clothing gallery, similarity calculation, and recommendation list generation.

First, it is necessary to build a clothing gallery for recommendation. Most of our clothing gallery is composed of data from the FashionAI dataset and the Large-scale Fashion dataset, which is well known as the DeepFashion dataset, and the other parts are obtained by a web crawler from well-known shopping websites. Gender, Design type attribute, Length type attribute, color features of tops, and color features of bottoms are labeled for the clothes in the clothing gallery, as shown in Table II. Each kind of label is defined by a different method. Keywords from the web crawler are used to label the gender. The labels of Design type and Length type are determined by the corresponding attribute recognition model. Labels of the clothing gallery are transformed into one-hot encoding vectors. The texture extracting method is applied to obtain the color features. The properties of the clothing gallery are shown in Fig. 3. Since the clothing gallery contains more than ten thousand images, sufficient variety in the clothing gallery can be achieved.

TABLE II. CLOTHING ATTRIBUTES OF RECOMMENDATION SYSTEM

Main types        Labels of attributes
Design: Collar    Shirt, Peter Pan, Puritan, Rib
Design: Lapel     Notched, Collarless, Shawl Collar, Plus Size Shawl
Design: Neck      Turtle
Skirt             Short, Knee, Ankle, Floor

Fig. 3. Clothing gallery properties: clothing type (left), gender (middle), Design type (right)

The second step is to combine the clothing feature vectors into the similarity between the clothing of the consumer and the clothing gallery. The similarity is defined as the sum of products among the clothing features:
s = \sum_i w_i \, (f_i^g \cdot f_i^c)    (4)

where s is the similarity, w_i is the adjustable weight for the i-th feature vector product, f_i^g is the feature vector of the clothes from the clothing gallery, and f_i^c is the feature vector of the consumer's clothes. Each w_i reflects the importance of the corresponding clothing feature vector. Since the consumer concentrates on the gender and the Design type of the clothing, the weight for gender is set to 10, the weight for Design type is set to 5, and the other weights are set to 1.

In the last step, a random generator selects clothes from the result of the second step; these clothes form the recommendation list that is given to the consumer as a reference. The Euclidean distance between the color features of clothes can be considered as the similarity of the clothing texture, so it can serve as a calculable factor of the random generator. The random generator is based on the probability density function of the normal distribution, which is defined as:

TABLE III. MEAN ABSOLUTE ERROR (MAE) AND ROOT MEAN SQUARED ERROR (RMSE) OF HEIGHT PREDICTION

MAE        RMSE
1.72 cm    2.22 cm

C. Clothing attributes recognition

TABLE IV. ACCURACY OF RECOGNITION SYSTEM

Main types    Attributes                        Accuracy (%)
Collar        Shirt, Peter Pan, Puritan, Rib    90.09
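As an illustration of the similarity calculation in the recommendation system (the weighted sum of feature vector products, with weight 10 for gender, 5 for Design type, and 1 otherwise), the following minimal sketch scores gallery items against a consumer's clothing. The dictionary layout, the feature names `gender`, `design`, and `length`, and the helper names `dot`, `similarity`, and `rank_gallery` are assumptions made for this example; only the weights come from the text.

```python
from typing import Dict, List, Sequence

Features = Dict[str, Sequence[float]]  # feature name -> one-hot vector

# Weights stated in the text; any feature not listed here gets weight 1.
WEIGHTS = {"gender": 10.0, "design": 5.0}
DEFAULT_WEIGHT = 1.0


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def similarity(gallery_item: Features, consumer: Features) -> float:
    """Weighted sum of per-feature products between gallery and consumer."""
    return sum(
        WEIGHTS.get(name, DEFAULT_WEIGHT) * dot(gallery_item[name], vec)
        for name, vec in consumer.items()
    )


def rank_gallery(gallery: List[Features], consumer: Features) -> List[int]:
    """Gallery indices ordered from most to least similar."""
    scores = [similarity(item, consumer) for item in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)
```

With one-hot vectors each product is 1 when the labels match and 0 otherwise, so a gender match alone outweighs matches on all remaining features with these weights. The color-distance factor of the random generator is not modeled here.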