Volume 5, Issue 2, February – 2020 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

Brand Switching Analysis using Data Analytics to Derive Consumer Behaviour

Ayush Tripathi
Bachelors of Technology in Computer Science and Engineering, Raj Kumar Goel Institute of Technology, Ghaziabad (A.K.T.U.), Uttar Pradesh, India

Zatin Gupta
Assistant Professor at Computer Science and Engineering Department, Raj Kumar Goel Institute of Technology, Ghaziabad (A.K.T.U.), Uttar Pradesh, India

Abstract:- In this paper, we present Brand Switching Analysis using Data Analytics to derive various patterns of consumers’ actions within a retail store. The paper demonstrates the multiple steps involved in conducting this analysis: Data Cleansing, Data Visualization, Data Segregation, and Data Representation, using various technologies. The patterns and inferences established from the research can be harnessed by brands and retail stores for outlining their marketing strategy and targeting their potential customers.

Keywords:- Data Cleaning; Data Cleansing; Data Visualization; Data Segregation; Data Mining; Brand Switching; Data Analysis; Consumer Behaviour; Marketing Strategy; Sales.

I. INTRODUCTION

In our world, more and more brands and products are launched every day. Many retail stores observe the attrition of customers, and the reasons could be numerous: for example, increased choices and new brand launches by the competition, better marketing promotions offered by competitors, or a change in the placement of a brand in a store display. This phenomenon, where a customer moves from purchasing one brand of a product to buying a different brand of the same product, is known as Brand Switching.

Retailers have huge amounts of data collected over a period of time, which can be used to learn about customers’ purchasing habits. Data Mining (turning raw data into useful information) and Data Analytics (analyzing raw data to draw conclusions about that information) can be used effectively to analyze a large amount of business data and look for patterns in large batches of data. Through this process, businesses can learn more about their customers to develop more effective marketing strategies, increase sales, and decrease costs.

A retail store must predict its customers' switching behaviors to sustain and retain its loyal customers. Brand Switching Analysis can help brands strategize their marketing to retain and attract potential customers and to enhance sales.

Our goal in this paper is to process the available data* into a meaningful format and conduct Brand Switching Analysis on the processed data to derive patterns of customer purchasing and switching behavior for a chosen brand (Paper Chain) within a retail store.

*Data was downloaded from Kaggle.

Further, the paper demonstrates the step-by-step process of data mining and analytics and the inferences derived from it.

The paper also reveals the limitations and the steps that can be taken in the future to scale the research further.

II. RESEARCH PROCESS AND INFERENCES

The dataset used in this study was downloaded from Kaggle. The dataset comprises various information on customers' purchases within a retail store.

• Data Storage
The raw data* was in CSV format. The open-source technology HDFS was used to store this data.

• Data Cleansing
The stored data were cleaned to remove all the null values. Cleansing was essential for accurate results: it removed the null values present in columns such as customer id, invoice number, and brand id, as well as columns that were entirely null.

• Data Wrangling
The dates in the raw dataset were in string format. Timestamps and whitespace were removed from the column values, and the dates were converted to a usable date format (YYYY-MM-DD).

Further, for comparison, the data was needed in two comparable periods, for example YoY, Quarterly, Monthly, or Event vs. Event. In this case, the comparison was made YoY.

• Data Segregation
The data was segregated into two usable chunks: 2011 vs. 2010 (refer Figure 1 and Figure 2).

The process of Data Cleansing, Wrangling and Segregation was done in Hive using HiveQL (HQL).
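
The paper does not reproduce the Hive queries used for these steps. As a rough illustration only, the cleansing and wrangling logic described above can be sketched in Python. The column names (customer_id, invoice_no, brand_id, invoice_date) and the input date layout are assumptions made for this sketch, not details taken from the dataset.

```python
from datetime import datetime

# Hypothetical column names; the paper does not list the dataset's schema.
REQUIRED = ("customer_id", "invoice_no", "brand_id", "invoice_date")


def clean_rows(rows, date_format="%m/%d/%Y %H:%M"):
    """Drop rows with null required values and rewrite dates as YYYY-MM-DD.

    `rows` is an iterable of dicts with string (or None) values, such as the
    records produced by csv.DictReader. `date_format` is an assumed layout
    for the raw date strings.
    """
    cleaned = []
    for row in rows:
        # Strip surrounding whitespace; None becomes an empty string.
        record = {key: (value or "").strip() for key, value in row.items()}
        # Data cleansing: skip rows where any required column is null/empty.
        if any(not record.get(key) for key in REQUIRED):
            continue
        try:
            parsed = datetime.strptime(record["invoice_date"], date_format)
        except ValueError:
            # An unparseable date is treated like a null value in this sketch.
            continue
        # Data wrangling: drop the time part and keep an ISO date string.
        record["invoice_date"] = parsed.date().isoformat()
        cleaned.append(record)
    return cleaned
```

Passing the rows of the raw CSV file through clean_rows would yield records whose dates can be compared and grouped as plain YYYY-MM-DD strings.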

IJISRT20FEB125 www.ijisrt.com 16

Fig 1:- Data of 2010 (after Data Cleansing, Wrangling and Segregation)

Fig 2:- Data of 2011 (after Data Cleansing, Wrangling and Segregation)
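
The segregation into yearly chunks shown in Figures 1 and 2 can be sketched as follows. This is a minimal Python illustration rather than the Hive logic used in the study, and it assumes each record carries a hypothetical invoice_date field already in YYYY-MM-DD form.

```python
from collections import defaultdict


def segregate_by_year(rows, date_key="invoice_date"):
    """Group records by calendar year, read from an ISO date string.

    `date_key` names an assumed field holding dates such as "2010-12-01".
    """
    by_year = defaultdict(list)
    for row in rows:
        year = int(row[date_key][:4])  # first four characters are the year
        by_year[year].append(row)
    return dict(by_year)
```

The two chunks compared in the study would then correspond to the entries for 2010 and 2011 in the returned dictionary.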

• Data Analysis and Comparison
The data were further processed into a format that Data Visualization tools can use for meaningful representations. The data from the two periods (2010 vs. 2011) were compared to analyze the consumers in three groups: retained customers, lost customers, and new customers.

A connection was established with the Business Intelligence tool, and Data Representation was done using Bar Charts to display the consumer segregation of the chosen brand.

Further, an analysis was done on the brand Paper Chain (refer Figure 3). Significant switching between brands was observed between 2010 and 2011.
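
The paper does not state exact definitions for the three groups. A minimal Python sketch, assuming that a retained customer bought the brand in both years, a lost customer bought it only in 2010, and a new customer bought it only in 2011, could look like this. The tuple layout of the input is also an assumption.

```python
def classify_customers(purchases, brand, base_year=2010, compare_year=2011):
    """Split a brand's customers into retained, lost, and new groups.

    `purchases` is an iterable of (customer_id, brand, year) tuples; this
    layout is hypothetical. The group definitions are assumptions:
    retained = bought in both years, lost = base year only,
    new = comparison year only.
    """
    purchases = list(purchases)  # allow generators, since we scan twice
    base = {c for c, b, y in purchases if b == brand and y == base_year}
    compare = {c for c, b, y in purchases if b == brand and y == compare_year}
    return {
        "retained": base & compare,
        "lost": base - compare,
        "new": compare - base,
    }
```

The sizes of the three sets are the kind of counts a bar chart such as Figure 3 would display for the selected brand.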

Fig 3:- Data Analysis conducted for lost, new, and retained customers using the filter selection of the brand. This bar graph shows the customer movement pattern between 2010 and 2011 for the Paper Chain brand.

• Data Representation and Analysis
The data represented in Grid and Table formats (refer Figures 4 and 5) can further help analysts derive various meaningful patterns. These representations show that:

1. The YoY Sales decreased in 2011 as compared to 2010 when examined in the context of the lost and new customers.
2. The YoY Sales in the case of retained customers increased in 2011 vs. 2010.
3. Overall, YoY Sales marginally increased in 2011 due to the retained customers.
4. The number of lost customers is greater than the number of new customers in 2011.

Fig 4:- Sales YoY – Lost, new and retained customers (for Paper chain brand)

The analysis indicates that the brand should focus on retaining the customers it is currently losing in order to enhance sales.

Similarly, the Grid Analysis (refer Figure 5) for a particular brand shows how the brand (Paper Chain Kit Retrospot, in this context) is performing within the retail store.

Fig 5:- Grid Analysis corresponding to a Brand. In this context the brand is Paper Chain

A retail brand owner wants to know where the lost customers are going: to a different store, or to a different brand within the store. Similarly, for the new customers, the retailer wants to know where the customers are coming from and why.

A further study was conducted on the processed data to analyze the customer switching in detail, and Heat Maps (Figure 6, Figure 7) were used for Data Analysis and Representation.

The lost consumers were picked, and information on where the lost customers were buying in 2010 was derived using the Heat Map Analysis (refer Figure 6).

Similarly, Heat Map Analysis was conducted for new customers (refer Figure 7). The size of the Heat Map boxes for each brand is based on the number of consumers lost and gained, respectively. In addition, using the filter on the right side, various sample sizes can be compared in the Heat Map.

The Heat Map Analysis provides information such as where the consumer was previously buying and which brand they have switched to.
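
The YoY sales comparison by customer group summarised in this subsection (Figure 4) can be illustrated with a short Python sketch. It is not the Tableau calculation used in the study. The input layout and the group definitions (retained = bought the brand in both years, lost = 2010 only, new = 2011 only) are assumptions.

```python
from collections import defaultdict


def sales_by_segment(purchases, brand, base_year=2010, compare_year=2011):
    """Total a brand's sales per customer group and year.

    `purchases` is an iterable of (customer_id, brand, year, amount) tuples;
    this layout is hypothetical.
    """
    rows = [p for p in purchases
            if p[1] == brand and p[2] in (base_year, compare_year)]
    base = {c for c, _, y, _ in rows if y == base_year}
    compare = {c for c, _, y, _ in rows if y == compare_year}
    totals = defaultdict(lambda: {base_year: 0.0, compare_year: 0.0})
    for customer, _, year, amount in rows:
        if customer in base and customer in compare:
            segment = "retained"
        elif customer in base:
            segment = "lost"      # bought only in the base year
        else:
            segment = "new"       # bought only in the comparison year
        totals[segment][year] += amount
    return {segment: dict(years) for segment, years in totals.items()}
```

Under these definitions, lost customers contribute sales only in 2010 and new customers only in 2011, so comparing those two totals shows whether new customers made up for the sales that left with the lost ones.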

Fig 6:- Heat Map representing the Lost Consumer Analysis (sample brand – 50); the data in the white box shows the consumers who switched from one brand to another


Fig 7:- Heat Map representing the New Consumer Analysis (sample brand – 50); the data in the white box shows the consumers who switched from another brand to the Paper Chain Kit Retrospot
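
The box sizes in heat maps such as Figures 6 and 7 are based on customer counts per brand. A minimal Python sketch of one way to compute such counts is given below. The function name, the tuple layout, and the choice of which year to inspect are assumptions; the paper builds these views in Tableau and does not give the underlying calculation.

```python
from collections import Counter


def other_brand_counts(customers, purchases, year, exclude_brand):
    """Count how many of `customers` bought each other brand in `year`.

    `customers` is a set of customer ids (for example the lost or the new
    customers of a brand). `purchases` is an iterable of
    (customer_id, brand, year) tuples; this layout is hypothetical.
    Each customer is counted at most once per brand.
    """
    seen = {
        (customer, brand)
        for customer, brand, purchase_year in purchases
        if customer in customers
        and purchase_year == year
        and brand != exclude_brand
    }
    return Counter(brand for _, brand in seen)
```

Calling the function with the lost customers gives counts for a view like Figure 6, and calling it with the new customers gives counts for a view like Figure 7; the year argument decides which period's purchases are inspected.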

III. TOOLS USED FOR DATA ANALYSIS

Hadoop Distributed File System (HDFS): The Hadoop Distributed File System is the primary data storage system used by Hadoop applications. It employs a NameNode and DataNode architecture to implement a distributed file system that provides high-performance access to data across highly scalable Hadoop clusters.

HDFS is a key part of many Hadoop ecosystem technologies, as it provides a reliable means for managing pools of big data and supporting related big data analytics applications.

For this research paper, we worked on the Cloudera distribution running on VirtualBox.

Hive is an open-source data warehousing solution built on top of Hadoop. Hive supports queries expressed in a SQL-like declarative language, HiveQL, which are compiled into map-reduce jobs that are executed using Hadoop. In addition, HiveQL enables users to plug custom map-reduce scripts into queries.

The language includes a type system with support for tables containing primitive types, collections like arrays and maps, and nested compositions of the same.

Tableau is business intelligence software that helps people visualize and understand their data.

In this paper, the bar graphs and heat maps were created using Tableau. Heat maps are a visualization where marks on a chart are represented as colors. As the marks “heat up” due to their higher values or density of records, a more intense color is displayed. These colors can be displayed in a matrix / crosstab, which creates a highlight table, but can also be displayed on a geographical map or even a customized image, such as a webpage used to show where users are clicking.

IV. LIMITATIONS AND DIRECTIONS FOR THE FUTURE RESEARCH

This research was conducted on the available dataset, and hence the inferences from the Brand Switching Analysis depend entirely on the accuracy of this data. In addition, the available dataset covered only one and a half years for a single retail store. Hence the patterns derived were based on this period. As the available data was limited, conclusive predictions about the future could not be made.

Further, hierarchical data was not available. Therefore, patterns could not be derived to find switching between the categories within the store.

While this study has explored the lost and new customers' brand switching behaviors within the store, future studies could also track the sales generated and the impact of marketing promotions. Moreover, brand switching patterns can be studied on online platforms.

Moreover, a more extensive dataset with hierarchical data would allow brand switching patterns between categories to be derived, which can help a brand find the customers loyal to a particular category. Brands can then run targeted promotions to enhance sales.

Additionally, other comparisons based on periods, seasons, events, and festivals can be conducted to predict trends and help brands refine their approaches.
