Basic Data Analysis

This document discusses various topics related to data analysis basics, including: 1) Reading data tables to draw conclusions about sales patterns of instant noodle brands across 4 stores with different catchment profiles. 2) Analyzing car ownership data by population strata to understand higher ownership in large cities versus rural areas. 3) The concepts of correlation between variables, how a high correlation does not necessarily imply causation, and using correlation to study relationships in retail sales data over time. 4) Different options for collecting brand image data through ratings scales or simple association measures on various product attributes.

Uploaded by

chaitanya

Data Analysis – Basics

Topics of today’s discussion

• Reading Data Tables to make Conclusions
• Reading Central Tendency and Dispersion
• Correlation and Causality
• Row-Column Normalization of data tables
• Variable types and Statistical Analysis tools to build relationships
Reading Data Tables – Situation 1

• Let us assume a city has 4 modern-format stores (named Store 1, Store 2, etc.) run by a single retail player
• They are of more or less similar size and have similar total monthly Sales
• However, Sales by category differ: for example, one Store might have higher Sales of FMCG and another higher Sales of Staples
• In such a scenario, let us look at the buyers of “instant noodles” in these 4 stores in 2014
Reading Data Tables – Situation 1

Brands Purchased in each Store – Instant Noodles (2014)
(Among buyers of Instant Noodles in each Store)

                                               Store 1  Store 2  Store 3  Store 4
% Buying Maggi variants only                       70%      75%      55%      85%
% Buying Yippee or Top Ramen or others only        20%      20%      35%      10%
% Buying both                                      10%       5%      10%       5%
Reading Data Tables – Situation 1

% Contribution of Buyers of Instant Noodles Brands from each Store (2014)

                                               Store 1  Store 2  Store 3  Store 4
% Buying Maggi variants only                       18%      27%      42%      13%
% Buying Yippee or Top Ramen or others only        13%      18%      66%       4%
% Buying both                                      20%      14%      60%       6%
Reading Data Tables – Situation 1 – Assignment

• Reading the 2 tables, what will you conclude about Instant Noodles sales from the 4 stores?
• Can you make some guesses about the difference in the catchment profiles of these stores?
Reading Data Tables – Situation 1

SOME SIMPLE CONCLUSIONS

• Store 3 has by far the largest number of buyers of Instant Noodles as a category; Store 4 has the fewest
• Among their respective buyers, Stores 1, 2 and 4 have a high share (70%+) of “solus Maggi” buyers, especially Store 4 (85%)
• Store 3 has a lower share (55%) of “solus Maggi” buyers. But, being the largest seller of instant noodles, it contributes the most to Maggi sales, as well as to the other brands’ sales
Reading Data Tables – Situation 1

SOME SIMPLE CONCLUSIONS

• The Stores are of the same size, yet Store 3 has
  - higher Instant Noodles sales, and
  - a higher % of new brand (Yippee, Smoodles, etc.) Sales
  So, its catchment profile might be
  - younger, with more double-income households, bachelors, etc.
  - psychographically more open to trying new brands
  - more exposed to media, and hence more aware of new brands
• Similarly, the Store 4 catchment profile might be just the opposite
Reading Data Tables – Situation 1 – Hint (Column %s)

Brands Purchased in each Store – Instant Noodles

                                               Store 1  Store 2  Store 3  Store 4
Base: Buyers of Instant Noodles in each Store      500      700     1500      300
% Buying Maggi variants only                       70%      75%      55%      85%
% Buying Yippee or Top Ramen or others only        20%      20%      35%      10%
% Buying both                                      10%       5%      10%       5%
Reading Data Tables – Situation 1 – Hint (Row %s)

% Contribution of Buyers of Instant Noodles Brands from each Store

                                               Store 1  Store 2  Store 3  Store 4
% of ALL INSTANT NOODLES BUYERS                    17%      23%      50%      10%
% Buying Maggi variants only                       18%      27%      42%      13%
% Buying Yippee or Top Ramen or others only        13%      18%      66%       4%
% Buying both                                      20%      14%      60%       6%
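
The link between the two hint tables can be checked with a little arithmetic. The sketch below is only an illustration: it takes the bases and column percentages from the Column % table, converts them to buyer counts, and re-expresses each row as a share across stores. The variable and function names are made up for this example.

```python
# Bases and column percentages as given in the Column % hint table.
stores = ["Store 1", "Store 2", "Store 3", "Store 4"]
base = [500, 700, 1500, 300]  # buyers of instant noodles in each store

column_pct = {
    "Maggi variants only": [70, 75, 55, 85],
    "Yippee or Top Ramen or others only": [20, 20, 35, 10],
    "Both": [10, 5, 10, 5],
}


def row_percentages(counts):
    """Express each count as a rounded % of the row total."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]


# Share of all instant-noodle buyers contributed by each store.
store_share = row_percentages(base)

# Column % -> buyer counts -> each store's share of that row.
row_pct = {}
for segment, pcts in column_pct.items():
    counts = [b * p / 100 for b, p in zip(base, pcts)]
    row_pct[segment] = row_percentages(counts)

print("All buyers:", dict(zip(stores, store_share)))
for segment, pcts in row_pct.items():
    print(segment + ":", dict(zip(stores, pcts)))
```

The computed shares match the Row % table, which shows that the two tables are two views of the same buyer counts.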
Reading Data Tables – Situation 2

Car Ownership (Among higher Social Class) by Pop Strata (Column %)

                                        All India Large Metros  Urban 1–50L    Urban <1L   Rural 10K+   Rural <10K
Target household Population (‘000)          20000         2000         6000         5000         3000         4000
Sample Size                                  5000         1000         1000         1000         1000         1000
% Owning Cars                                 20%          51%          28%          19%           7%           2%
% Not Owning Cars                             80%          49%          72%          81%          93%          98%
Reading Data Tables – Situation 2

Car Ownership (Among higher Social Class) by Pop Strata (Row %)

                                        All India Large Metros  Urban 1–50L    Urban <1L   Rural 10K+   Rural <10K
Target household Population (‘000)          20000         2000         6000         5000         3000         4000
% Owning Cars                                100%          26%          43%          24%           5%           2%

From the given two tables (i.e. Column % and Row %), what do you conclude about car ownership in India?
Reading Data Tables – Situation 2

BROADLY, THERE ARE 3 CONCLUSIONS ON CAR OWNERSHIP:
1. Overall, 20% of households in the higher Social Class in India own cars
2. Ownership is highest in large metros (51%) and comes down step-wise as we go down the pop strata: lower in non-metro urban and lowest in rural
3. However, due to the large size of the Urban 1 – 50 lakh population stratum, the highest contribution to car ownership (43%) comes from this pop stratum
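
These three conclusions can be reproduced from the Column % table. The sketch below is an illustration only: it weights each stratum's ownership rate by its household population to get the All India figure and each stratum's contribution. The variable names are made up for this example.

```python
strata = ["Large Metros", "Urban 1-50L", "Urban <1L", "Rural 10K+", "Rural <10K"]
households_000 = [2000, 6000, 5000, 3000, 4000]  # target households ('000)
pct_owning = [51, 28, 19, 7, 2]                  # column %: ownership within each stratum

# Car-owning households ('000) in each stratum.
owners_000 = [h * p / 100 for h, p in zip(households_000, pct_owning)]

# All India ownership is the population-weighted average of the strata.
overall_pct = 100 * sum(owners_000) / sum(households_000)

# Row %: each stratum's share of all car-owning households.
contribution = [round(100 * o / sum(owners_000)) for o in owners_000]

print(f"All India ownership: {overall_pct:.1f}%")
for name, share in zip(strata, contribution):
    print(f"{name}: {share}% of car-owning households")
```

Because every stratum has the same sample size (1000), a simple average of the five ownership rates would give 21.4%; the 20% figure only appears once each stratum is weighted by its population.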

Reading Central Tendency and Dispersion

Assignment 2

In the catchment area of a store, 1000 people were asked some questions in a survey. Target group (TG): 21 – 45 years old, housewives or single earning members, who themselves shop for day-to-day household items.

They were asked to agree or disagree with the statement “I prefer buying day-to-day items from modern format outlets rather than going to traditional Kirana stores” on a five-point scale (Likert Scale).

Of the 1000 people responding to this question, the mean score obtained was 3.1 out of 5. What can you conclude from this?
Assignment 2

If you are now given the following distribution:

What would you conclude?

Can you make some hypotheses about the sub-groups of people giving this opinion?
Would you want the data to be analyzed by some other sub-groups?
Assignment 2

We may need to look at an output like:

… to check whether the polarization of findings is due to different attitudes among different age groups and different stages of life.
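
To see why a mean of 3.1 says little on its own, the sketch below compares two distributions of 1000 responses. Both are hypothetical and invented for illustration; neither is the distribution from the assignment. The coding of the scale (1 = strongly disagree, 5 = strongly agree) is also an assumption.

```python
from math import sqrt

SCALE = [1, 2, 3, 4, 5]  # assumed coding: 1 = strongly disagree ... 5 = strongly agree


def mean_and_sd(counts):
    """Mean and (population) standard deviation of scores given counts per scale point."""
    n = sum(counts)
    mean = sum(s * c for s, c in zip(SCALE, counts)) / n
    variance = sum(c * (s - mean) ** 2 for s, c in zip(SCALE, counts)) / n
    return mean, sqrt(variance)


# Hypothetical response counts (each sums to 1000), invented for illustration.
polarized = [250, 150, 100, 250, 250]    # many strong disagreers and strong agreers
concentrated = [50, 150, 500, 250, 50]   # most people sit in the middle

for label, counts in [("polarized", polarized), ("concentrated", concentrated)]:
    mean, sd = mean_and_sd(counts)
    print(f"{label}: mean = {mean:.2f}, sd = {sd:.2f}")
```

Both hypothetical samples have the same mean of 3.1, but the polarized one has a much larger standard deviation, which is why the distribution and the sub-group breakdown need to be read alongside the mean.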

Correlation and Causality

Correlation between two variables

When we look at data for two or more variables, we sometimes see that two of the variables move in the same direction. For example, if we record the heights and weights of a large number of people, we would observe that many taller people also weigh more and, similarly, many shorter people weigh less.

There can also be two variables that move mostly in opposite directions. For example, the Power of the engine and the Mileage of a car: mostly, cars with higher Power have lower Mileage.

We use the term ‘Correlation’ for the strength of linear association between two variables, and a coefficient ‘r’ is used to measure this strength.

The value of ‘r’ can range between –1 and +1.
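
Assuming ‘r’ here is the usual Pearson correlation coefficient (the slides do not name it), it can be written for n paired observations (x_i, y_i) with means x̄ and ȳ as:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when the two variables tend to be above or below their means together, and negative when one tends to be high while the other is low.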

Correlation between two variables

• r value closer to +1 → a strong positive association
• r value closer to –1 → a strong negative association
• r value closer to 0 → a weak association between the two variables

In Retail Sales data too, it will be interesting to observe the Correlation between Sales of certain categories over a period of time. Looking at long-term Sales data:

- Is there a high positive Correlation between Sales of Shampoos and Conditioners?
- Is there a negative Correlation between Sales of Shower Gels and Soaps?
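
A minimal sketch of how such questions could be checked, assuming ‘r’ is the Pearson coefficient. The monthly sales figures are hypothetical, invented purely to show one positive and one negative case, and `pearson_r` is a helper written for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly unit sales, invented for illustration only.
shampoo = [120, 135, 150, 160, 155, 170, 180, 175]
conditioner = [60, 66, 77, 80, 79, 88, 91, 90]
shower_gel = [30, 34, 41, 45, 44, 52, 58, 55]
soap = [200, 196, 185, 182, 184, 170, 165, 168]

r_hair = pearson_r(shampoo, conditioner)
r_bath = pearson_r(shower_gel, soap)

print(f"Shampoo vs Conditioner: r = {r_hair:+.2f}")
print(f"Shower Gel vs Soap:     r = {r_bath:+.2f}")
```

With real data, the same function would be applied to the long-term Sales series of the two categories being compared.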
Correlation does not imply Causality

However, one must note: “CORRELATION DOES NOT IMPLY CAUSALITY”

Meaning, a high positive Correlation between A and B does not mean that A causes B or that A leads to B.

e.g. Brand Imagery vs Brand Usage

→ There is generally a high positive correlation… but does it mean that an increase in Brand Imagery would lead to an increase in Brand Usage?

NO! The correlation alone cannot tell us which way the effect runs: usage may be shaping imagery, or both may be driven by something else.
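
One way to see this is a small simulation. The sketch below assumes a purely hypothetical model in which a third factor (labelled "brand size" here) drives both imagery and usage scores, while neither score influences the other. The model, the names and the noise levels are all assumptions made for illustration.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


random.seed(42)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical common driver; imagery and usage each depend only on it plus noise.
brand_size = [random.gauss(0, 1) for _ in range(n)]
imagery = [z + random.gauss(0, 0.5) for z in brand_size]
usage = [z + random.gauss(0, 0.5) for z in brand_size]

r = pearson_r(imagery, usage)
print(f"Imagery vs Usage: r = {r:.2f}")
```

In this simulated world the two scores are strongly correlated, yet changing imagery would do nothing to usage, because the code never lets one feed into the other.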

Row-Column Normalization of data tables

Let us look at a Brand Image Questionnaire situation

• You want to understand the imagery of 5 brands of shampoo on several attributes
• The attributes can be “Makes hair shiny”, “Cleans hair well”, …, “Has good packaging”, etc.
• You may collect this information as “Ratings” (OPTION 1) or as simple “Association” (OPTION 2)
OPTION 1

Rate the following brands in terms of your preference on each attribute on a 1 – 5 scale (5 = Excellent, …, 1 = Poor)

                           Dove   Tresemme   Head & S      Clear    Sunsilk
Gives shiny hair
Cleans hair well
Removes dandruff
…
Good vfm
Good packaging
OPTION 2

Which brands would you associate with each attribute, as per your preference? (TICK/CIRCLE CODE AS APPLICABLE)

                           Dove   Tresemme   Head & S      Clear    Sunsilk
Gives shiny hair              A          B          C          D          E
Cleans hair well              A          B          C          D          E
Removes dandruff              A          B          C          D          E
…                             A          B          C          D          E
Good vfm                      A          B          C          D          E
Good packaging                A          B          C          D          E
Obviously…

• OPTION 1 will be more detailed and more robust to analyze statistically
• However, OPTION 1 will be very time-consuming to administer
• So, we often go ahead with OPTION 2 when we have a large number of attributes and/or brands to work with
A Typical output of such Image Association

                           Dove   Tresemme   Head & S      Clear    Sunsilk
Gives shiny hair            72%        32%        54%        42%        73%
Cleans hair well            68%        22%        60%        45%        75%
Removes dandruff            45%        12%        88%        82%        70%
Good vfm                    55%        18%        60%        62%        82%
Good packaging              65%        24%        57%        51%        76%

• THE PROBLEM HERE IS… Large brands will tend to have high associations across all attributes, and small brands will have low associations across all attributes

SO, HOW DO WE GET THE RELATIVE STRENGTHS AND WEAKNESSES OF SMALL BRANDS?
ROW – COLUMN NORMALIZATION

• It brings all Brands and all Attributes onto the same platform
• Hence a fair comparison can be made in terms of relative Strengths and Weaknesses

(Refer to Excel File for computations)

Image Association

(Refer to Excel File for Row – Column normalization)
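
The Excel computations are not reproduced here, so the exact method used is unknown. The sketch below shows one common way of doing a row-column normalization, offered as an assumption rather than the method in the file: for each cell, compute the score expected from the attribute's row total and the brand's column total, then look at observed minus expected. The data are the percentages from the Image Association table above.

```python
brands = ["Dove", "Tresemme", "Head & S", "Clear", "Sunsilk"]
attributes = [
    "Gives shiny hair",
    "Cleans hair well",
    "Removes dandruff",
    "Good vfm",
    "Good packaging",
]

# % associating each brand (columns) with each attribute (rows).
observed = [
    [72, 32, 54, 42, 73],
    [68, 22, 60, 45, 75],
    [45, 12, 88, 82, 70],
    [55, 18, 60, 62, 82],
    [65, 24, 57, 51, 76],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected score if a brand's associations only reflected its overall size
# and the attribute's overall level; deviation = observed - expected.
deviation = [
    [obs - r * c / grand_total for obs, c in zip(row, col_totals)]
    for row, r in zip(observed, row_totals)
]

print(" " * 18 + "".join(f"{b:>10}" for b in brands))
for attr, row in zip(attributes, deviation):
    print(f"{attr:<18}" + "".join(f"{d:>+10.1f}" for d in row))
```

Under this assumed method, a positive deviation marks a relative strength and a negative one a relative weakness. Tresemme, despite low raw scores everywhere, comes out relatively strong on “Gives shiny hair” and relatively weak on “Removes dandruff”.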

After Normalization…

Tin plate Plastic Glass Tetrapack

Makes the product reasonably priced  -- ++ -
Increases longevity of product
Convenient for stocking in godown  ++ + --
Looks attractive on shelves for a long time as colours do not fade away  +
Destroyable and the material usable without damaging the environment  -- -- + -
Popular among consumers  ++ +
Convenient for transportation as it does not break / get tampered easily  ++ --
Not tampered easily by insects / rats  ++ -- ++ --
Gives protection against foreign odours  -- + +
Convenient for stocking on shelves  - ++ -
Convenient for transportation by requiring less space  --
Re-usable by the customer for some other purpose  -- ++

Variable types and Statistical Analysis tools to build relationships

Variable Types and Statistical tools

DEPENDENT VARIABLE
INDEPENDENT VARIABLE

Variable Types and Statistical tools

Types of tools Applicable:

- Chi-Square test
- ANOVA
- Multiple Regression
- Discriminant Analysis OR Logistic Regression
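
As a rough guide to choosing among these tools, the sketch below encodes the conventional textbook mapping from variable types to tool. The two-way split into "categorical" and "metric" variables, and the names used, are assumptions made for this illustration.

```python
# Keys are (dependent variable type, independent variable type).
# The categorical/metric split is an assumed, conventional simplification.
TOOL_BY_VARIABLE_TYPE = {
    ("categorical", "categorical"): "Chi-Square test",
    ("metric", "categorical"): "ANOVA",
    ("metric", "metric"): "Multiple Regression",
    ("categorical", "metric"): "Discriminant Analysis or Logistic Regression",
}


def suggest_tool(dependent, independent):
    """Return the conventionally suggested tool for a pair of variable types."""
    return TOOL_BY_VARIABLE_TYPE[(dependent, independent)]


# Example: a metric score explained by brand (categorical).
print(suggest_tool("metric", "categorical"))
```

This is only a starting point; the right tool also depends on sample size, the number of groups and the assumptions each method makes.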

Examples of Applications:

- Chi-Square test (Purchaser vs Non-Purchaser… Are there differences by demographic groups?)
- ANOVA (Preference levels among drinkers of different brands of vodka)
- Multiple Regression (“Ad likeability” by Uniqueness, Relevance, etc.)
- Discriminant Analysis OR Logistic Regression (Purchase / Non-Purchase by Ad Uniqueness, Relevance, etc.)
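
To make the first example concrete, the sketch below works through a Chi-Square test of independence on a 2 x 2 table. The counts and the age-group labels are hypothetical, invented for illustration; 3.841 is the standard 5% critical value for 1 degree of freedom.

```python
# Hypothetical counts, invented for illustration.
# Rows: demographic group; columns: [purchasers, non-purchasers].
observed = [
    [30, 70],  # hypothetical "younger" group
    [50, 50],  # hypothetical "older" group
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts if purchase behaviour were independent of the group.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, 5% level

print(f"Chi-square = {chi_square:.2f}")
print("Differs by group at the 5% level:", chi_square > CRITICAL_5PCT_DF1)
```

For these made-up counts the statistic exceeds the critical value, so the purchase rate would be judged to differ between the two groups.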

THANK YOU!
