
Grocery Customer Behavior Analysis Using RFID-based Shopping Paths Data

This document discusses analyzing grocery customer behavior using RFID-based shopping path data. It proposes a new approach called spatial pattern clustering based on the longest common subsequence to analyze customer movement patterns and determine hot spots and dead spots in the store. This could help retailers improve store layout and product placement to provide a better customer experience and increase sales. Previous studies used clustering techniques on shopping path data but had limitations due to spatial constraints in stores. The proposed method seeks to improve on this for more accurate analysis of customer behavior.


World Academy of Science, Engineering and Technology

International Journal of Economics and Management Engineering


Vol:5, No:11, 2011

Grocery Customer Behavior Analysis using RFID-based Shopping Paths Data
In-Chul Jung, Young S. Kwon

In-Chul Jung is with the Department of Industrial & Systems Engineering, Dongguk University, Seoul, Korea ([email protected]).
Young S. Kwon is with the Department of Industrial & Systems Engineering, Dongguk University, Seoul, Korea ([email protected]).

International Science Index, Economics and Management Engineering Vol:5, No:11, 2011 waset.org/Publication/13460
International Scholarly and Scientific Research & Innovation 5(11) 2011 1508 scholar.waset.org/1307-6892/13460

Abstract—Understanding customer behavior in a grocery store has been a long-standing issue in the retailing industry. The advent of RFID has made it easier to collect movement data on an individual shopper's behavior. Most previous studies used traditional statistical clustering techniques to find the major characteristics of customer behavior, especially the shopping path. However, because of the various spatial constraints in a store, standard clustering methods are not feasible: movement data such as the shopping path must be adjusted before the analysis, which is time-consuming and causes data distortion. To alleviate this problem, we propose a new approach to spatial pattern clustering based on the longest common subsequence. Experimental results using real data obtained from a grocery store confirm the good performance of the proposed method in finding the hot spots, dead spots and major path patterns of customer movements.

Keywords—customer path, shopping behavior, exploratory analysis, LCS, RFID

I. INTRODUCTION

THE goal of retailers (discount stores, department stores, convenience stores, supermarkets, etc.) is to increase the gross profit margin through sales and cost reduction. This requires improving the efficiency of operation and providing attractive services for customers. In particular, the market focus of large discount stores has been continuous low-price sales in tandem with the expansion of new branch stores. Recently, however, they have struggled with decreased consumer spending due to the economic recession, which has eroded the competitive position of a strategy based on low prices alone. This situation necessitates new marketing strategies such as aggressive promotions to customers. Traditional strategies include basket analysis or regional analysis based on customer purchase history and demographic information. Products of interest are identified from customer purchase history and recommended to customers through customer segmentation. The locations of future profitable stores are identified using demographic information and regional analysis. However, more aggressive promotional activities are needed, as these traditional analyses do not provide sufficient information to understand customer shopping patterns and behaviors in the physical store environment.

According to the research of Newman et al. [13] on customers’ consumption decision-making processes, the store environment (in-store layouts, product placement, etc.) closely affects customer consumption behavior. Therefore, improving the in-store environment is an important consequence of increased understanding of customer behavior. If we can determine the store areas where most sales activities occur and where customers tend to stay for a long time, we can decide where to display products and how to build an effective store environment. A more effective store environment can provide convenient services for customers and hence increase sales. Up to now, however, store managers have relied on their experience of the high-sales locations and of the locations where customers tend to stay for a long time. Based on this experience, they have decided where to display products and how to change the in-store layout.

Farley and Ring [3] recorded the movement of some customers by following them in order to analyze the shopping path, one aspect of customers' shopping behavior. However, because numerous customers visit stores daily, it is difficult to record individual consumption behavior with only a few researchers and a limited budget. The record is also not reliable due to the small sample size. Several researchers [4][9][11] have tried to solve these problems by using radio-frequency identification (RFID) and clustering techniques to analyze customers' shopping behavior, especially the shopping path. However, clustering customer shopping paths suffers from two problems. First, during clustering, clusters that should be divided at obstacles such as sales shelves can converge into the same cluster group. Because people cannot cross obstacles such as shelves and merchandise stands, the store's physical environment and obstructions (shelves, merchandise stands, etc.) should be considered as a constraint for shopping path clustering. Second, the length of a shopping path must be constant in order to apply a clustering algorithm to the shopping path data; however, this length is actually variable in a store. Insufficient consideration has been given to physical obstacles such as sales shelves and product displays. Furthermore, the input lengths of the objects have to be identical to apply clustering techniques, but customers’ traveling distances in stores vary. For example, customer “A” may finish shopping in 5 minutes and leave the store, but customer “B” may roam around the store for more than 30 minutes. Because shopping courses vary so widely between customers, we need to generalize the paths into a normalized shopping length based on time or space in order to apply the clustering algorithm. However, this normalizing process can introduce distortion or noise into the shopping path or main travel information.
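To make this distortion concrete, the following minimal Python sketch normalizes two location-ID paths of different lengths to a fixed length by index resampling. The resampling rule and the names `normalize_path` and `target_len` are illustrative assumptions, not the normalization used in the studies cited above. The point is only that a short path is padded with repeated locations while a long path loses visited locations.

```python
def normalize_path(path, target_len):
    """Resample a location-ID path to exactly target_len entries.

    Assumption: nearest-index resampling, chosen only for illustration.
    """
    if not path or target_len <= 0:
        return []
    if target_len == 1:
        return [path[0]]
    last = len(path) - 1
    # Map each output position to a proportional position in the input path.
    return [path[round(i * last / (target_len - 1))] for i in range(target_len)]


# Hypothetical paths: a quick 3-stop trip and a longer 8-stop trip.
short_path = ["A", "B", "F"]
long_path = ["A", "B", "C", "D", "E", "G", "H", "F"]

short_norm = normalize_path(short_path, 5)
long_norm = normalize_path(long_path, 5)

# Locations visited on the long trip that vanish after normalization.
lost = [loc for loc in long_path if loc not in long_norm]
```

Both normalized paths now have five entries, but the short one contains repeated locations and the long one no longer records every visited location, which is the kind of distortion the paragraph above describes.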

Therefore, in this paper we propose a new method of spatial pattern clustering that solves the problems of previous studies by converting the real shopping path into a sequence of path locations and by using a new similarity measure between different customers’ shopping paths. By adopting the longest common subsequence (LCS) method as the basic idea and extending it, we developed a main shopping path identification technique that is capable of identifying the hot spots, where most customer visits are made, and the dead spots, which receive few visits. Finally, we applied the newly developed method to real data from a large supermarket located in Seoul and analyzed its customer flow information.

II. RELATED LITERATURE

The retailers of large supermarkets have profited by supplying a massive number of products at low prices, and have recently begun to use various marketing strategies to gain a competitive advantage over other stores. They have tried to understand customers by analyzing demographic and transaction data. However, in order to understand customer shopping patterns inside a store, we need more data about customer behaviors. Because of limitations in the techniques for collecting customer behavior data, transaction data and personal information only provide basic information such as how many times and when customers have visited.

Some researchers have studied customer behavior using direct observation or questionnaires. Harris [5] and McClure and West [12] investigated whether store display and product brand changes affect sales figures. Cox [1] studied some factors in unplanned purchases. Dickson and Sawyer [2] studied information processing and decision making in a store environment, and Hoyer [8] studied whether time pressure affected unplanned purchases and changes to brands and products. However, these studies investigated only a sample of customers rather than all of them, as complete sampling would be very difficult and expensive. In recent years, technological advances such as inexpensive RFID and video cameras have been applied to the analysis of customer behavior [7][14][11]. These studies explained the behavioral properties of customers in the store and supported the formulation of marketing strategy and optimal operation. In particular, Larson et al. [11] and Hui et al. [10] tried to identify the major shopping paths using RFID technology and the K-medoids clustering technique with the Euclidean distance. Existing research has mainly used clustering, among data mining techniques, to detect the main shopping path patterns. However, clustering techniques based on Euclidean distance can lose accuracy because of obstacles such as sales stands and shelves in the actual store environment. For example, as shown in Fig. 1, if we measure the Euclidean distance from position “a” to positions “b” and “c”, position “c” is closer than position “b”. However, the customer’s actual travel distance in the store from position “a” to position “c” is longer than that from position “a” to position “b”, because a customer has to walk around the sales stands. This is one of the reasons why the Euclidean distance measurement in the existing clustering methods is inadequate for in-store shopping path pattern grouping.

Fig. 1 Euclidean distance and actual customer path in a store

Since each customer has a different travel distance for shopping, we need to convert the different distances into one identical number throughout the entire process in order to apply them to a clustering algorithm. For example, when a clustering method like K-medoids is used, the input lengths have to be identical because of the characteristics of the algorithm. Therefore, the different travel distances of different customers need to be converted into an identical number, and normalization is applied to equalize the size of the trace. As shown in Fig. 2, we can choose between temporal normalization and spatial normalization. During this process, since a value different from the original distance is used as the input, the original information can be either deformed or lost. Therefore, this research provides a technique for detecting the main customer shopping path patterns that takes all these path characteristics and store environment factors into account.

Fig. 2 Path normalization

III. SHOPPING PATH PATTERN ANALYSIS

A. Collection of shopping data using RFID

We installed an RFID sink node (reader) on the ceiling of the store, an RFID repeater inside the shelves and an RFID tag in each shopping cart to collect customer shopping data (Fig. 3). To identify individuals, we mapped the collected shopping data to personal information (name, age, etc.).
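As a rough illustration of how cart-level reads of this kind could be turned into per-cart location sequences, the sketch below groups time-stamped reads by tag and collapses consecutive repeats of the same reader. The record layout (`timestamp`, `tag_id`, `reader_id`), the function name `reads_to_paths` and the collapsing rule are assumptions made for this example; the paper does not specify its data format.

```python
from collections import defaultdict


def reads_to_paths(reads):
    """Convert (timestamp, tag_id, reader_id) reads into per-tag location sequences.

    Consecutive reads from the same reader are collapsed into one visit,
    so a cart standing still does not lengthen its path (an assumption).
    """
    by_tag = defaultdict(list)
    # Sorting the tuples orders the reads by timestamp first.
    for timestamp, tag_id, reader_id in sorted(reads):
        by_tag[tag_id].append(reader_id)

    paths = {}
    for tag_id, readers in by_tag.items():
        path = []
        for reader_id in readers:
            if not path or path[-1] != reader_id:
                path.append(reader_id)
        paths[tag_id] = path
    return paths


# Hypothetical reads: (seconds since store entry, cart tag, reader ID).
reads = [
    (0, "cart-1", "A"), (5, "cart-1", "A"), (10, "cart-1", "B"),
    (3, "cart-2", "A"), (15, "cart-1", "C"), (9, "cart-2", "B"),
    (20, "cart-1", "C"), (12, "cart-2", "B"), (18, "cart-2", "F"),
]
paths = reads_to_paths(reads)
```

Each resulting value is an ordered list of reader IDs, one list per cart tag.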


Fig. 3 RFID Device System

For more precise shopping path information, we developed an Ultra-low Power Wireless System One-Chip that uses the 2.4 GHz frequency band as an active tag type. The active tags were installed in the shopping carts, and a reader received data at a predefined interval. All collected data were sent to the storage server (Fig. 4).

Fig. 4 RFID System Composition

B. Proposed similarity method using LCS

The basic idea for shopping path pattern identification is to extend the LCS method [6] using the newly proposed similarity between individual customers’ paths.

The LCS method finds the longest common subsequence between two sequences. If X = (x1, x2, ..., xm) and Y = (y1, y2, ..., yn) are sequences, the length of their LCS, written LCSS(X, Y) = LCSS(X_m, Y_n), follows the standard recurrence over the prefixes X_i = (x1, ..., xi) and Y_j = (y1, ..., yj):

$$
LCSS(X_i, Y_j) =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0 \\
LCSS(X_{i-1}, Y_{j-1}) + 1 & \text{if } x_i = y_j \\
\max\bigl(LCSS(X_{i-1}, Y_j),\ LCSS(X_i, Y_{j-1})\bigr) & \text{otherwise}
\end{cases}
\tag{1}
$$

Because a shopping path can be described by the node ID locations (RFID reader IDs) that a customer has visited, it can be expressed as the sequence of location IDs in the order in which the shopper visited them. The greater the length of the common subsequence between the visited paths of two customers, the greater the similarity between the two customers’ shopping path patterns. If customer-1 has a shopping path of A-B-C-D-E-F, and customer-2, -3, and -4 have shopping paths of A-B-E-Z-F, A-B-C-F, and A-B-Y-Z-F, respectively, the LCS results of customer-1 and the others are A-B-E-F, A-B-C-F, and A-B-F, in that order (Fig. 5). The LCSs of customers 1-2 and 1-3 have the same common subsequence length of 4. In this case, LCS alone cannot determine which customer has the more similar shopping path pattern.

Fig. 5 Example of LCS

To solve this problem, we provide a new similarity method by extending the LCS using relative distance.

$$
\mathrm{new\_Similarity}(x, y) = \frac{LCSS(x, y)}{\mathrm{Length\_of\_}x + \mathrm{Length\_of\_}y} \tag{2}
$$

If we re-apply the previous example using (2), we can find the customer who has the closest shopping path pattern. Customer-1 has a shopping path length of 6, and customer-2 and -3 have shopping path lengths of 5 and 4, respectively. By comparing 4/(6+5) ≈ 0.36 and 4/(6+4) = 0.40 through (2), we can determine that customer-3 has the more similar shopping pattern to customer-1. Fig. 6 depicts this process. This means that the proposed similarity method can overcome the limitations both of the Euclidean distance measurement, which is not adequate for a store environment with many blocking obstacles, and of sequential path comparison.

Fig. 6 Proposed similarity method

C. Main-Shopping-Path-Pattern clustering method

We developed the Main-Shopping-Path-Pattern clustering method to determine the K main shopping paths in a store. K is the initial number of clusters, as in K-means clustering. K initial paths are selected at random from all customer paths, because each of the K clusters of paths is used for finding and including similar paths. The cluster loop is repeated until there is no change in any group or until a user-defined number of iterations is reached, and then the K final LCS sequences are generated (Fig. 7 and Fig. 8).

Fig. 7 Pseudo-code of Main-Shopping-Path-Pattern-Grouping
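The pseudo-code of Fig. 7 is not reproduced in this text, so the following Python sketch is only one possible reading of the method described in this section. It implements the LCS by dynamic programming, the similarity of (2), and a clustering loop that starts from K randomly chosen paths and stops when the groups no longer change or an iteration limit is reached. The function names, the `max_iter` and `seed` parameters, and in particular the rule that updates each representative to the LCS shared by all members of its cluster are assumptions, not the authors' stated procedure.

```python
import random
from functools import reduce


def lcs(x, y):
    """Return one longest common subsequence of sequences x and y as a list."""
    m, n = len(x), len(y)
    # table[i][j] holds the LCS length of x[:i] and y[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back through the table to recover the subsequence itself.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def new_similarity(x, y):
    """Equation (2): LCS length divided by the sum of the two path lengths."""
    total = len(x) + len(y)
    return len(lcs(x, y)) / total if total else 0.0


def cluster_paths(paths, k, max_iter=20, seed=0):
    """Group paths around k representative sequences (illustrative reading only)."""
    rng = random.Random(seed)
    reps = [list(p) for p in rng.sample(paths, k)]  # K random initial paths
    labels = None
    for _ in range(max_iter):
        # Assign every path to the representative it is most similar to.
        new_labels = [
            max(range(k), key=lambda c: new_similarity(p, reps[c])) for p in paths
        ]
        if new_labels == labels:  # no change in any group: stop
            break
        labels = new_labels
        # Assumed update rule: the LCS shared by all members of the cluster.
        for c in range(k):
            members = [p for p, lab in zip(paths, labels) if lab == c]
            if members:
                reps[c] = reduce(lcs, members)
    return reps, labels


# The four example customers from the text.
c1, c2, c3, c4 = (s.split("-") for s in
                  ["A-B-C-D-E-F", "A-B-E-Z-F", "A-B-C-F", "A-B-Y-Z-F"])
sim_12 = new_similarity(c1, c2)   # 4 / (6 + 5)
sim_13 = new_similarity(c1, c3)   # 4 / (6 + 4)
reps, labels = cluster_paths([c1, c2, c3, c4], k=2)
```

On the example above, `sim_13` is larger than `sim_12`, matching the comparison of 4/(6+5) and 4/(6+4) in the text. A medoid-style update, which keeps the member path most similar to the rest of its cluster, would be an equally plausible alternative to the LCS-based update assumed here.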


Fig. 8 Procedure of suggested method

The suggested Main-Shopping-Path-Pattern clustering method can not only detect the customers’ major shopping path sequences but also identify the Hot spot and Dead spot areas. The traditional Hot spot and Dead spot identification method uses statistical analysis to sum up and compare all the locations that shoppers visited. One of the main advantages of the suggested algorithm is its ability to identify both kinds of spots and the main shopping paths simultaneously. Because the LCS groups the main shopping path sequences in travel order, the nodes that appear most repeatedly among all sequence groups are regarded as Hot spots, and the most rarely visited areas as Dead spots.

IV. EXPERIMENTAL TEST AND RESULTS

To apply the proposed algorithm, we conducted a test and analyzed data for an actual large discount store located in Seoul, Korea. The store is a single-floor building with an average of 554 customers daily.

More than 200 RFID tags were installed on shopping carts and 200 readers on shelves to collect the customers’ shopping traces. The data were collected for a week in February 2011, and filtered shopping paths were obtained for Monday, Wednesday and Friday.

Similar results were generated as the parameter K, the number of groups, was changed from 3 to 5. Fig. 10 depicts the result when K was set to 5 and the analysis was run again. The result showed that most customers started their shopping from the entrance and mainly shopped in a counter-clockwise direction, because the entrance and the cashier’s counter are located in the lower left and the upper left of the picture, respectively. This layout made the customers tend to move in a counter-clockwise direction. Notably, products in the top 10 sales ranks were mostly located on the lower side of the store layout, whereas products displayed in the upper part of the layout (interior products, hobby products, car products) were rarely included in the shopping sequence and accordingly had very low actual sales.

Fig. 10 reveals that the circled areas were mostly located close to the entrance, and that the areas within 5 meters of the entrance were the first Hot Spot. Furthermore, the area that customers pass before moving to the cashier’s counter after completing their shopping had the most overlapping patterns and was determined to be the second Hot Spot. Although few purchases were made in this area, it is likely to be a highly effective area for demonstration and should be used to display promotional products and hot products in order to generate more purchases. The triangle area is a bridge area that connects the first Hot Spot and the second Hot Spot, and most seasoning product and kitchen product purchases were made in this area. However, few purchases were made in the area to the left side of the triangle, even though the customers’ shopping paths included this area. These results indicate that the store manager needs to change the product display and promote sales through product analysis.

Fig. 10 Results of finding shopping movement patterns
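The Hot spot and Dead spot rule described for the suggested method (the nodes that appear most often across the main path sequences versus the rarely visited ones) can be sketched as a simple frequency count. The node IDs, the example sequences and the `top_n` cut-off below are hypothetical; the paper does not state the thresholds it used.

```python
from collections import Counter


def hot_and_dead_spots(main_paths, all_nodes, top_n=2):
    """Rank store nodes by how many main path sequences contain them.

    Returns (hot, dead): the top_n most frequent nodes and the top_n least
    frequent ones, counting nodes that never appear as zero visits.
    """
    counts = Counter({node: 0 for node in all_nodes})
    for path in main_paths:
        counts.update(set(path))  # count each node once per sequence
    # Most frequent first; ties are broken alphabetically by node ID.
    ranked = sorted(counts, key=lambda node: (-counts[node], node))
    return ranked[:top_n], ranked[-top_n:]


# Hypothetical main shopping path sequences from K = 3 clusters.
main_paths = [
    ["A", "B", "C", "F"],
    ["A", "B", "E", "F"],
    ["A", "B", "F"],
]
all_nodes = ["A", "B", "C", "D", "E", "F", "G"]
hot, dead = hot_and_dead_spots(main_paths, all_nodes)
```

Passing the full list of reader locations as `all_nodes` matters here, because a Dead spot is typically a node that never appears in any main path and would otherwise be missing from the count.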

Fig. 9 Region of installed RFID devices and main products

V. CONCLUSION

Existing customer analysis in retail stores has relied on basket analysis or sales statistics, and has rarely included analysis of service efficiency or customer behavior patterns. However, our study provides a method to identify customers’ shopping paths or major sales areas by collecting and analyzing information on customers’ main travel paths, which was not provided by the existing customer analysis techniques. Existing customer analysis methods using Euclidean distance suffered from problems in measuring the distance between spots and in the data processing based on that measurement. We have extended the LCS technique and developed a new method to identify the customers’ main pattern. This new method provides the information necessary to decide about customers’ shopping sequence and to determine meaningful spots in stores.

Based on this analysis, the results will increase understanding of the customers’ consumption behavior and will assist in deciding whether product display and layout need to be changed. The proposed, more quantitative method improves on existing qualitative analysis, which mainly relied on store employees’ daily experience, and provides objective numbers in order to provide high-quality services to customers and increase revenues accordingly.

For future research, we suggest combining our analysis with legacy system information such as customer purchase history in order to develop an intelligent store analysis system capable of improving operational efficiency and expanding sales. More analysis models need to be developed for more detailed analysis of the shopping behavior of diverse customers, along with the development of various measurement indexes to analyze the store environment. The present multidimensional analysis facilitated the extraction of information not previously available from existing research.

We could quantify the information on both the customer and the store by expanding the existing one-dimensional analysis into multidimensional analysis. The results reveal the need for more objective indicators in future advanced stores. We are planning to conduct more varied analysis based on those indicators. We developed new optimization technology to support decision-making on point of sale and shelf locations that will reduce customer traffic congestion, automate some of the sales processes, expand automated services to improve customer service and maximize profit for both manufacturing and business distribution. This promises to be developed into a customer service knowledge technique.

REFERENCES
[1] Cox, K. (1964), The Responsiveness of Food Sales to Shelf Space Changes in Supermarkets, Journal of Marketing Research, 1(2), 63-67.
[2] Dickson, P. R. and Sawyer, A. G. (1986), Point-of-Purchase Behavior and Price Perceptions of Supermarket Shoppers, Working Paper No. 86-102, Marketing Science Institute, 1000 Massachusetts Ave., Cambridge, MA 02138.
[3] Farley, J. U. and Ring, L. W. (1966), A Stochastic Model of Supermarket Traffic Flow, Operations Research, 14(4), 555-567.
[4] Gil, J., Tobari, E., Lemlij, M., Rose, A. and Penn, A. (2009), The Differentiating Behaviour of Shoppers: Clustering of Individual Movement Traces in a Supermarket, Proceedings of the 7th International Space Syntax Symposium.
[5] Harris, D. H. (1958), The effect of display width in merchandising soap, Journal of Applied Psychology, 42(4), 283-284.
[6] Hirschberg, D. S. (1977), Algorithms for the longest common subsequence problem, Journal of the ACM, 24(4), 664-675.
[7] Hou, J-L. and Chen, T-G. (2011), An RFID-based Shopping Service System for retailers, Advanced Engineering Informatics, 25(1), 103-115.
[8] Hoyer, W. D. (1984), An Examination of Consumer Decision Making for a Common Repeat Purchase Product, Journal of Consumer Research, 11(3), 822-829.
[9] Hui, S. K., Bradlow, E. T. and Fader, P. S. (2009), Testing Behavioral Hypotheses Using an Integrated Model of Grocery Store Shopping Path and Purchase Behavior, Journal of Consumer Research, 36, 478-493.
[10] Hui, S. K., Fader, P. S. and Bradlow, E. T. (2009), Path Data in Marketing: An Integrative Framework and Prospectus for Model Building, Marketing Science, 28(2), 320-335.
[11] Larson, J. S., Bradlow, E. T. and Fader, P. S. (2005), An exploratory look at supermarket shopping paths, International Journal of Research in Marketing, 22(4), 395-414.
[12] McClure, P. J. and West, E. J. (1969), Sales Effects of a New Counter Display, Journal of Advertising Research, 9, 29-34.
[13] Newman, A. J., Yu, D. K. C. and Oulton, D. P. (2002), New insights into retail space and format planning from customer-tracking data, Journal of Retailing and Consumer Services, 9(5), 253-258.
[14] Uotila, V. and Skogster, P. (2007), Space management in a DIY store analyzing consumer shopping paths with data-tracking devices, Facilities, 25(9), 363-374.

In-Chul Jung received a BS in City liberal arts from Incheon University in 2002 and an MS in Industrial Systems Engineering from Dongguk University in 2005, and has been a Ph.D. student in Industrial Systems Engineering at Dongguk University since 2006. His interests include machine learning, data mining, agents, and intelligent information systems.

Young S. Kwon received a BS in Industrial Engineering from Seoul National University in 1978, an MS in Industrial Engineering from the Korea Advanced Institute of Science and Technology in 1981, and a Ph.D. from the Korea Advanced Institute of Science and Technology in 1996. He has worked as a professor of Industrial and Systems Engineering at Dongguk University since 1981. His interests include data mining and intelligent information systems.