Recommendation System For Localized Products in Vending Machines
Article info

Keywords: Vending machines; Localized products recommendation system; Set cover problem; Genetic algorithm; Machine learning

Abstract

This paper proposes a framework for a localized product recommendation system for automatic vending machine applications. The goal is to offer suitable recommendations of localized products to customers in distinct locations. We develop a hybrid technique that combines a meta-heuristic approach, a clustering technique, classification, and a statistical method. In this approach, an intelligent system is implemented to analyze product attributes and determine localized products based on the transaction data. To prove the feasibility and effectiveness of the proposed approach, we implemented the system in several automatic vending machines owned by an information service company in Taiwan. Nine machines were selected and compared from two locations: the living lab of the Institute for Information Industry of Taiwan in Song-shan District and a business office building in Nei-hu District in Taipei. The real-life experiments showed that the profit of the vending machines increased after applying our system.

© 2011 Elsevier Ltd. All rights reserved.
1. Introduction

Over the past few decades, the vending machine industry has developed dramatically in breadth. In the United States, it is a US$30 billion-a-year industry. Unlike most shops, automatic vending machines operate 24 h a day and give consumers the convenience of purchasing products by self-service. In numerous countries these machines provide more user-friendly services than traditional stores, and owners can reduce setup and labor costs. In response to customer requirements, vending machines offer a multiplicity of products, such as food, beverages, CDs, newspapers, sanitary utensils, entertainment, postage stamps, and toys, as studied by Anonymous (1953), Still (1953), Dun and Bradstreet (1992), Bessman (1994) and Guenette (1995).

Nowadays, vending-machine-related businesses are growing steadily in the USA and Japan. Fig. 1 shows statistics on the number of vending machines and the annual sales of the vending machine industry in Japan from 1997 to 2008, as reported by the Japan Vending Machine Manufacturers Association (2009). It also indicates that the vending business is developing stably. In practice, this means that you meet a vending machine about every 5 min when walking on a street in Japan.

Due to the development and advancement of technology, more and more information systems are applied to vending machines to supply personal and customized services. A host of technologies has emerged in recent years to provide vending machines with new benefits (Elliot, 2007). Machines offer services such as cashless purchasing, remote machine monitoring, electronic security, data-based menu planning, and more.

The Smart Store is a creative and innovative concept that offers customers value-added services through information technologies. Smart Stores are similar to retail or convenience stores; however, they are combined with innovative services, such as more convenient, avant-garde customer service and advanced electronic control or refrigeration monitoring of the store. Smart Store concepts offering advanced services on self-service vending machines can now be found throughout Japan, the USA and Europe (Lo & Yang, 2007). In 2006, the Institute for Information Industry (III) executed the project “An Experiment on Distribution-service Smart-Stores Project”, subsidized by the economic departments of Taiwan, in which III trialed a smart auto-service store. Although the setup cost of the store was about the same as that of opening a convenience store, it offered lower labor costs in business operation. Yang (2006) indicated that the market for convenience stores is almost saturated in Taiwan, but auto-service stores still have many potential business opportunities. More and more businesses will invest in this industry to offer customers value-added services and turn it into a new niche market.

To develop smart self-service stores, considerable concern has arisen over some management issues; for example, how to select appropriate products for a machine is an important issue. In recent years, the most common method has been for managers to withdraw products with bad sales and substitute ones that sell well in other machines. This intuitive approach takes only product sales into consideration and ignores the relationship and dependence between different demands and different locations. Product

⇑ Corresponding author. Tel.: +886 2 2713 9000x366; fax: +886 2 2717 6510.
E-mail address: [email protected] (F.-C. Lin).
0957-4174/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2011.01.051
9130 F.-C. Lin et al. / Expert Systems with Applications 38 (2011) 9129–9138
Cormen, Leiserson, Rivest, and Stein (2001) describe the set cover problem as a classical question in computer science and complexity theory. Given several sets that may have some elements in common, one must select a minimum number of these sets such that together they contain all the elements. It was one of Karp’s 21 problems shown to be NP-complete in 1972.

The set cover problem can be seen as a finite version of the notion of compactness in topology, where the elements of certain infinite families of sets can be covered by choosing only finitely many of them.

3.2. Genetic algorithm (GA)

A genetic algorithm (GA) is a search technique used in computing to find exact or approximate solutions to optimization and search problems. It was formally introduced by John Holland and is categorized as a global search heuristic.

The operational processes of the genetic algorithm are described as follows, and the related flowchart is shown in Fig. 2:

(1) Encode the parameters of the search problem as a character string; the bit combination in the string represents a chromosome, also called an individual.
(2) Randomly generate an initial population consisting of N individuals.
(3) Select the chromosomes with higher fitness to generate the next generation.
(4) Apply crossover and mutation, which completes one generation of the genetic algorithm.
(5) Repeat the algorithm until the stop criteria are met, in order to produce the best-adapted chromosome.

3.2.1. Selection operators

Selection plays a main role in a genetic algorithm. Each individual genome is given a fitness value by the fitness function, and the selection method chooses genomes from the population for later breeding (it rates the fitness of each genome and selects the best of them).

3.2.2. Crossover operators

The crossover process randomly selects two chromosomes from the mating pool to form a pair of parents; the parents then exchange part of their information to produce two new children. With a good selection method in the previous stage, crossover can be expected to breed a better next generation. Not all genes take part in crossover: the process is controlled by a parameter called the “crossover probability Pc”, which decides whether crossover occurs.

3.2.3. Mutation operators

The advantage of the mutation process is that it generates new species. Its contribution is to push the search away from local optima. Through mutation, a genetic algorithm helps ensure that all possible solutions in the search space can be reached.

In medical science, mutation caused by copying errors in the genetic material during cell division is usually harmful, so in a genetic algorithm the mutation rate is usually set very low. Such a minuscule change may be harmful to a single gene, but it is helpful to the evolving population as a whole. The flipped position is randomly generated, and whether mutation is activated is controlled by the parameter “mutation rate Pm”.

3.3. k-means clustering method

Forgy (1965) described the k-means clustering method, which is still a widely used clustering algorithm. The method initially selects k instances at random to be the cluster centers and then keeps updating the centers until the final result is obtained. The procedure is a simple way to group a given data set. The detailed algorithm is as follows.

Let k denote the number of clusters and D denote the data set to be clustered.

Step 1. Randomly select k instances from D to be the initial centers of the k clusters.
Step 2. According to the dissimilarity between each instance and the cluster centers, assign each instance to the cluster with the lowest dissimilarity.
Step 3. Update the cluster centers, and repeat Step 2 until the cluster centers no longer change.
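The three k-means steps can be sketched in Python as follows. This is a minimal illustration, not the system's implementation: it assumes squared Euclidean distance as the dissimilarity measure and the arithmetic mean as the center update, neither of which the text fixes. The names `kmeans`, `sq_dist` and `mean_point`, the `max_iter` safety cap, and the sample scores are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))


def kmeans(data, k, max_iter=100, seed=0):
    """Group `data` (a list of equal-length tuples) into k clusters."""
    rng = random.Random(seed)
    # Step 1: randomly select k instances as the initial centers.
    centers = rng.sample(data, k)
    clusters = []
    for _ in range(max_iter):  # assumed safety cap on the number of passes
        # Step 2: assign each instance to the center with the lowest dissimilarity.
        clusters = [[] for _ in range(k)]
        for point in data:
            nearest = min(range(k), key=lambda c: sq_dist(point, centers[c]))
            clusters[nearest].append(point)
        # Step 3: update the centers; an empty cluster keeps its old center.
        new_centers = [
            mean_point(members) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:  # centers no longer change
            break
        centers = new_centers
    return centers, clusters


# Hypothetical one-dimensional sales-performance scores.
scores = [(1.0,), (1.5,), (2.0,), (10.0,), (10.5,), (11.0,), (19.0,), (20.0,), (21.0,)]
centers, clusters = kmeans(scores, k=3)
for center, members in zip(centers, clusters):
    print(center, members)
```

Because the initial centers are chosen at random, different seeds can converge to different groupings, so in practice the procedure is often run several times and the most compact result kept.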
each Di.

The splitting criterion of C4.5 is the gain ratio, a measure based on information theory. The gain ratio of attribute A is calculated by

\[ \mathrm{GainRatio}(A) = \frac{\mathrm{Gain}(A)}{\mathrm{SplitInfo}(A)}. \tag{1} \]

SplitInfo(A), written in full as SplitInfo_A(D), is calculated by

\[ \mathrm{SplitInfo}_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|} \log_2\!\left(\frac{|D_j|}{|D|}\right). \tag{2} \]

Here D is the data set, A is the attribute, m is the number of classes in the data set, and p_i is the probability that an instance in D belongs to class i. Attribute A has v distinct values, and D_j is the subset of D obtained by splitting on the j-th value of A. Gain(A) is then calculated by

\[ \mathrm{Gain}(A) = \mathrm{Info}(D) - \mathrm{Info}_A(D), \tag{3} \]

where Info(D) is calculated by

\[ \mathrm{Info}(D) = -\sum_{i=1}^{m} p_i \log_2(p_i), \tag{4} \]

and Info_A(D) is calculated by

\[ \mathrm{Info}_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|}\, \mathrm{Info}(D_j). \tag{5} \]

In the proposed method, the products labeled by the clustering method and their attributes are the input data from which the decision tree model is built. We can then predict the sales performance of products that are not on a machine and extract the critical impact factors. We use C4.5, proposed by Quinlan (1993), to build the model from product data.

3.5. Bayesian Network

A Bayesian Network is a probabilistic graphical model that represents the dependencies between variables (Alpaydin, 2004). It contains nodes and edges: the nodes are the uncertain variables, and the edges are the causal or influential links between variables. Associated with each node is a set of conditional probability functions modeling the uncertain relationship between the node and its parents. We can pose arbitrary queries to the built network and obtain the results through Bayes’ theorem. The advantages of using a Bayesian Network to model uncertain domains are well known, particularly given the recent breakthroughs in the algorithms and tools involved in implementing them. Bayesian Networks have also proven useful in applications such as medical diagnosis and the diagnosis of mechanical failures. Fig. 3 gives an example of a Bayesian Network.

The node “Cloudy” indicates whether the weather is cloudy. The node “Rain” indicates whether it rains. The node “Wet grass” indicates whether the grass is wet. Each node takes the value “True” or “False”. The structure of the figure means that on a cloudy day it is more likely to rain and the sprinkler is more likely to be turned off. Cloudy weather affects both whether people use sprinklers and whether it rains, so arrows are drawn from Cloudy to Sprinkler and to Rain. In turn, both the sprinkler and rain make the grass wet, so Sprinkler and Rain have arrows to Wet grass.

Table 1 is the conditional probability table of the node “Rain” in Fig. 3. The parent of node “Rain” is node “Cloudy”, so the table gives the values of P(Rain | Cloudy).

Table 1
Conditional probability table of the node Rain in Fig. 3 (Alpaydin, 2004).

               Cloudy = True   Cloudy = False
Rain = True    0.8             0.1
Rain = False   0.2             0.9

In the proposed method, we compute the conditional probability of products given the key factors extracted from the decision tree, and we use these conditional probabilities to filter and rank products.

4. Localized products recommendation system

This section illustrates how the above-mentioned methodologies are integrated in the proposed framework to handle our problem, and depicts
what happens in each related process. In past years, when owners began to place vending machines in offices, factories, hospitals, schools, and army camps, they needed to decide which products to put on a machine's product list. The intuitive method is for managers to select products that sell well at other places at the time of operation. Consumers may not be satisfied with these products, because this method does not consider the different preferences of people at that location. This paper proposes a method that meets localized demand, described specifically as follows.

At the beginning of vending operations, we select a product set that covers as many attributes of the whole product range as possible, because at first we do not know customers’ preferences at the location. After the machine has operated for a period of time, sales data become available, and we induce the attributes of products with good sales. Finally, we replace products with bad sales by products that have the attributes associated with good sales. This selection method can promote customer satisfaction and total sales. Fig. 4 depicts this scenario.

The proposed method consists of two main steps: the initial products selection process and the products adjustment process. First, we outline the workflow of the initial products selection process in Section 4.1. Fig. 5 illustrates the initial products selection process. Next, we show the localized products recommendation process (products adjustment process) in Section 4.2. Fig. 6 shows the localized products recommendation process.

4.1. Initial selection process

4.1.1. Attribute setup

We configure the essential data attributes of all products that are ready for sale via the data management interface of our system. In this research, we use beverage products, and the attributes are name, price, cost, volume, brand, category and packing. These product attributes are used in both the initial selection process and the localized products recommendation process.

4.1.2. Set cover problem

In this step, we transform the initial products selection problem into the set cover problem. When initially setting up a vending machine, operators do not have any transaction log for the new place; moreover, they cannot slot many products because of the slot limitations of the vending machine. Because we do not know which kinds of products sell better in this place, we select a set of products that covers as many attributes of all products as possible, in order to meet general requirements. In later
operations, we can determine the products that fit this place more quickly via transaction records. Selecting a subset of products that contains the most attributes is just like the set cover problem. We use a genetic algorithm to find a feasible solution, because the set cover problem is an NP-complete problem.

4.1.3. Genetic algorithm (GA)

We apply a genetic algorithm to search for an approximately optimal solution of the set cover problem. For the chromosome of the proposed GA, we choose the 0–1 binary representation. We use n-bit binary strings as the chromosome structure, where n is the number of products we can choose from the warehouse. The value “1” in position i means that product i is chosen, and the value “0” means that it is not. The constraint is that the number of “1” values in a chromosome must equal the number of machine slots, m. The fitness function is the number of attributes covered by the products a chromosome selects, and we want to maximize it. In the crossover step, we use two-point crossover, and whenever the number of “1” values in a chromosome does not equal m, we repair the chromosome. In the mutation step, we choose two positions in the same chromosome and swap their values. Finally, we obtain a feasible set of products. These products are slotted into the vending machine for sale.

4.2. Localization product recommendation process

This section continues the description of the framework with the other process, the localized products recommendation process. It shows how the methodologies are used to tackle a problem that is otherwise difficult to solve. We briefly describe how the recommendation system combines several methods to handle our problem.

The inventory of a vending machine is one of the important problems affecting business benefit. Vending machine businesses care about how to increase the customer demand rate and reduce slow-moving items. This paper combines the k-means clustering method, decision tree, and Bayesian Network to generate a recommended list of products, with the aim of offering suitable merchandise to customers.

4.2.1. Data collection

Step 1. Collect transaction data: We collect transaction records, including machine ID, product ID, product name, slot number, amount of money, and transaction time, from the machines. Transaction data are segmented by machine or site for later processing. For the transaction records in the same segment, we accumulate the sales volume of each product. The preprocessed data are the input of the second part, “machine learning (cause and effect analysis)”.

Step 2. Calculate sales performance: Calculate the sales performance of each product slotted on the machines. Sales volume is an important factor in sales performance, but sales volume alone cannot reflect the managers’ considerations; higher profit is also needed. So we define the sales performance of each product as a1 × (sales volume of the product) + a2 × (total profit of the product), where 0 ≤ a1, a2 ≤ 1 and a1 + a2 = 1. The weights a1 and a2 are set according to the importance of these two variables. In our experiment, we set a1 = 0 and a2 = 1.

4.2.2. Machine learning (cause and effect analysis)

Step 3. Cluster sales performance: Divide products into groups according to their sales performance. We use the k-means clustering method and set k to 3. In the result, the upper bound of the worst sales performance cluster and the lower bound of the best sales performance cluster are shown to users for reference, and users can also set these two thresholds themselves. The system then uses these threshold values to label
products. Products whose sales performance is below the lower threshold are labeled “bad”. Products whose sales performance is above the upper threshold are labeled “good”. The remaining products are labeled “middle”. Products labeled “bad” will no longer be sold. Afterward, the labeled products and their attributes become the training data for the decision tree.

Step 4. Build predictive model and extract key attributes of popular products via decision tree: This step builds the sales model of products, predicts the sales performance of products not on the machine, and finds the key factors of good performance through the model. Product attributes and their labels are input into the decision tree to build the predictive model. Then we predict the sales performance of the products that are not on the machine
according to their attributes, using the built model. In theory, products predicted to sell well should be the substitutes for bad ones. However, because the predictions are only discrete class labels, instances predicted with the same label cannot be distinguished by degree. So we extract the key attributes from the decision tree model and compute the conditional probability of each product predicted as “good”, which serves as the criterion in Step 5.

Step 5. Filter recommendation products through Bayesian Network: We use product sales and product attributes to build a Bayesian Network, and we use the key factors on the paths from the root to the leaves labeled “good” in the built tree model to select, as substitutes for bad products, the products whose probability exceeds a threshold value. We build the cause-effect network from sales performance and the product attributes, which include brand, category, packing, pricing and volume. Through the built network, we estimate the conditional probability that a product is “good” given the key factors. These conditional probabilities are the basis for sorting the products predicted as “good”. Then we delete the products whose conditional probability is lower than the threshold value from the substitute
product set. Users can also set the probability threshold in our system. A higher threshold yields a smaller set of substitute products; conversely, a lower threshold yields a bigger one.

4.2.3. Recommendation of products

Step 6. Decision analysis: The analysis results, which include the key factors and the conditional probabilities computed through the Bayesian Network given the key factors, are shown in the system.

Step 7. Display the recommendation list: Finally, the detailed list of substitute products and the unsalable products are shown in the system. Besides viewing the listed substitute products, the manager can manually select one substitute product for each unsalable product.

5. System implementation

The localized products recommendation system analyzes past sales records to determine, by decision analysis technology, which products should be adjusted or replaced. The functions and displays of our system are described in the following:

(1) Product data configuration: Lets users set and modify product properties and profiles such as product name, product price, product volume, product brand, product category and packing. Figs. 7 and 8 display the user interfaces.
(2) Machine selection: Lists all machines that can be selected. These machines are already sited for business. On this page, we input the start time and end time of the transaction records and mark the machines we want. The graphic interface is shown in Fig. 9.
(3) Parameter configuration: This page shows the cluster analysis results, namely the threshold values of the best and worst clusters, for users’ reference. Users can configure these two values, which affect the size of the substitute product set. Users can also adjust the system parameters used for recommendation. Fig. 10 shows how to set the parameters.
(4) Recommendation result: This part shows the list of products to withdraw and the substitute products computed and suggested by the system, from which the user can select. The user can also decide whether the recommendation result should be accepted. The recommendation result is listed in Fig. 11.

6. Experimental results

To verify the effectiveness of the integrated system through real experiments, we set up a project in cooperation with A-men Technology Company (A-men Tech. Co.). Nine vending machines, divided into two groups, were installed and maintained for operation by A-men Tech. Co. in two office buildings in different regions, and the products were all kinds of drinks that can be slotted into a vending machine. The vending machine installations are shown in Figs. 12 and 13.

Ong et al. (1996) indicated the objectives of vending machines for stockholders and operators: (1) to generate revenue, (2) to generate interest and awareness, to promote new drinks, and to test the acceptance of new products before fixing prices, (3) to raise the image of the soft drink industry. We focus on generating revenue, that is, on a solution that increases sales revenue. We adjusted the products in vending machines at different locations using our localized products recommendation system.

The experiment ran from 20 December 2008 to 28 February 2009, and there were more than fifty thousand transaction records. The results in Figs. 14 and 15 show that sales grew by a factor of at least two (and at most four). This reveals that a system involving the proposed methodology is effective for increasing sales volumes.

7. Conclusion

With the increasing competition within the soft drink industry, company managers have to find ways to reduce operating costs to remain competitive. One way toward this goal is to apply appropriate information technologies to analyze transaction records and supply value-added services. This research proposed
an adjustment of the products offered and a localized product recommendation system based on k-means clustering, decision tree, and Bayesian Network to generate the substitute product list. The experimental results reveal that the proposed method and system are effective for increasing profit.

Acknowledgement

This study was conducted under the “Project Digital Convergence Service Open Platform” of the Institute for Information Industry, which is subsidized by the Ministry of Economic Affairs of the Republic of China.

References

Alpaydin, E. (2004). Introduction to machine learning. London: MIT Press [Chapter 3].
Anonymous (1953). Small coffee vending machine suitable for office use. Journal of Accountancy (pre-1986), 95(2), 151.
Bessman, J. (1994). CD vending machines go to market. Billboard, 106(28), 62–63.
Blough, C. G. (1956). Checking vending machine inventories. Journal of Accountancy (pre-1986), 102(2), 72–74.
Borin, N., Farris, P. W., & Freeland, J. R. (1994). A model for determining retail product category assortment and shelf space allocation. Decision Sciences, 25(3), 359–384.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2001). Introduction to algorithms (2nd ed.). London: The MIT Press [Chapter 35].
Dun and Bradstreet (1992). French fries from a vending machine. D&B Reports, 40(3), 14.
Elliot, M. (2007). Video screens give vending machines new capabilities. Automatic Merchandiser, 49(10), 92–98.
Forgy, E. W. (1965). Cluster analysis of multivariate data: Efficiency vs. interpretability of classifications. Biometrics, 21, 768–780.
Guenette, D. R. (1995). CD-ROM, the vending machine. CD-ROM Professional, 8(10), 12.
Han, J., & Kamber, M. (2006). Data mining: Concepts and techniques (2nd ed.). San Francisco: Morgan Kaufmann [Chapter 7].
Japan Vending Machine Manufacturers Association (2009). Universal numbers of vending machine and annual sale in 2008. Available at: <http://www.jvma.or.jp/information/fukyu2008.pdf> Accessed 07.01.09.
Kasavana, M. L. (2002). Vending machine technology in club operations. Club Management, 81(4), 75–76.
Lee, D. H. (2003). Consumers’ experiences, opinions, attitudes, satisfaction, dissatisfaction, and complaining behavior with vending machines. Journal of Consumer Satisfaction, Dissatisfaction and Complaining Behavior, 16, 178–197.
Lo, C. S., & Yang, H. W. (2007). Q-Shop experience firsthand without cashier (in Chinese). Intelligent Times.
Ong, H. L., Ang, B. W., Goh, T. N., & Deng, C. C. (1996). A model for vending machine services in the soft drink industry. Asia-Pacific Journal of Operational Research, 13(2), 209–224.
Quelch, J. A., & Takeuchi, H. (1981). Nonstore marketing: Fast track or slow. Harvard Business Review, 59(4), 75–84.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106.
Quinlan, J. R. (1993). C4.5: Programs for machine learning. San Francisco: Morgan Kaufmann [Chapter 2].
Simone, A., & Hannan, P. (1997). A pricing strategy to promote low-fat snack choices through vending machines. American Journal of Public Health, 87(5), 849–851.
Still, R. R. (1953). The effect of an automatic vending machine installation on cigarette sales. Journal of Marketing (pre-1986), 17(1), 61–63.
Yang, Y. M. (2006). Retailer war in scope – Big whales eat dried shrimps (in Chinese). The Liberty Times.