Business Statistics – Session 1
Introduction to Statistical Analysis
Our Learning Journey
Statistical Analysis
• DCOVA lifecycle
• Statistical analysis classification
Understanding Data
• Reading a data set
• Exploratory data visualization
‘Describing’ Data
• Measures of descriptive statistics
• Positional measures – Quartiles
• 5-number summary
• Outliers
Data Distribution
• Normal distribution
• Standard normal distribution
• Bell curve
Confidence Interval
• Sampling distribution
• CI estimation for the population mean
Association and Causation
• Correlation analysis
• Simple linear regression
Hypothesis Testing
• Hypothesis formulation
• Hypothesis testing – Z test
Regarding Internal Assessments (IAs)
1. There are 9 IAs of 10 marks each, mapped to Sessions 1-9
2. Typically, there are three parts to every IA: 1) concept, 2)
calculation of the output, 3) interpretation/explanation of the
output. Marks are split over these three parts, and marks are
deducted if any of the three is inaccurate or incomplete
3. Brief and accurate answers are required for Parts 2 and 3
4. A submission is invalid if the plagiarism checker flags it as a
copy case, i.e. more than 40% of your answer matches someone
else’s
5. IAs where Excel is required won’t be remote-proctored, so you
can use MS Excel for them
From Pin code to Bar code
• There are ~20,000 Pin (Postal Index Number) codes in India
• Retail companies, both online and bricks-and-mortar, are penetrating deeper
into the country, targeting Tier 2, 3, 4 cities and beyond as the next frontiers
for growth
• As they scale up their operations and reach, Pin code serviceability becomes
the key differentiator for customer acquisition and retention
• Amazon claims to have 100% reach in terms of Pin code coverage
Business Question:
How do segment-specific retailers decide which Pin codes to target? E.g. V-Mart
Retail, Zara, Reliance Retail, Big Bazaar, Shoppers Stop, Pantaloons…
DCOVA: The decision-making life cycle in statistics
DEFINE COLLECT ORGANIZE VISUALIZE ANALYZE
• Define the variables that you want to study in order to solve a business problem or meet a business
objective
• Collect the data from appropriate sources
• Organize the data collected by developing tables
• Visualize the data by developing charts
• Analyze the data by examining the appropriate tables and charts, and by applying other statistical
methods, to reach conclusions
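The five DCOVA steps can be traced on a toy data set. The sketch below is only an illustration: the order records, Pin codes, and tier labels are made-up assumptions, not data from any retailer, and a text bar chart stands in for a real chart.

```python
from collections import Counter

# DEFINE: the variable of interest is the city tier of each order's Pin code.
# COLLECT: made-up order records (pin_code, city_tier); labels are illustrative only.
orders = [
    ("110001", "Tier 1"),
    ("226001", "Tier 2"),
    ("273001", "Tier 3"),
    ("110001", "Tier 1"),
    ("231001", "Tier 3"),
    ("226010", "Tier 2"),
    ("273005", "Tier 3"),
]

# ORGANIZE: build a frequency table of orders per tier.
tier_counts = Counter(tier for _, tier in orders)

# VISUALIZE: a minimal text bar chart, one '#' per order.
for tier in sorted(tier_counts):
    print(f"{tier:<7}{'#' * tier_counts[tier]}")

# ANALYZE: find the tier with the most orders and its share of all orders.
top_tier, top_count = tier_counts.most_common(1)[0]
top_share = top_count / len(orders)
print(f"Largest segment: {top_tier} ({top_share:.0%} of orders)")
```

In practice the Collect step would read from a sales database or survey, and the Visualize step would use a charting tool such as MS Excel.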
Decision Scenario 1
The Brand Manager of Horlicks wants to launch a new
flavor targeted at young, working adults
Decision Scenario 2
A pharma company wants to test the safety and efficacy of
a new vaccine
Decision Scenario 3
A telecom service provider wants to measure the customer
satisfaction of its B2B customers
Decision Scenario 4
An e-commerce site wants to test a new promotional
campaign
Statistical Decision-Making
Consider these scenarios
1. The brand manager of Horlicks has to launch a new
flavour targeted at adults
2. A pharma company wants to test the safety and
efficacy of a drug
3. A telecom service provider wants to measure
customer satisfaction of its B2B clients
4. An ecommerce site wants to test a new promotional
campaign
Put yourself in the shoes of the decision-maker in each of
these scenarios.
What will you do?
• Role of Statistics in Business
• Role of Statistics in Research
• Role of Statistics in Decision Making
Statistical Analysis Pyramid
APPLIED
DESCRIPTIVE ANALYTICAL INFERENTIAL INDUCTIVE
Descriptive Statistics
• When statistical methods are used, a problem is often formulated in terms of
‘population’ or ‘universe’, which is defined as all the elements about which conclusions
or decisions are to be made. However, population data is often not available for
decision-making due to time, resource, and accessibility constraints
o In research vocabulary, collecting data on the entire population is called a
‘census’
o Descriptive statistics includes methods for collection, collation, tabulation, and
summarization of the population or sample data. For example, the average, standard
deviation, and skewness help in summarizing and describing the main features of
the statistical data
o Visual techniques of describing data can include frequency distribution charts like
histogram and ogive, box plot, bar and column graph, pie chart, etc.
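As a minimal sketch of these summary measures, the example below computes the mean, median, standard deviation, and skewness of a small sample using Python's standard library. The `orders` values are hypothetical, and the `skewness` helper is an illustrative moment-based definition (mean of cubed z-scores); spreadsheet functions such as Excel's SKEW apply a sample-size adjustment and so give a slightly different number.

```python
import statistics

# Hypothetical sample: daily orders from one Pin code over 10 days.
orders = [12, 15, 11, 14, 13, 18, 12, 16, 40, 14]

def skewness(values):
    """Moment-based (population) skewness: the mean of cubed z-scores."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    return sum(((v - mu) / sigma) ** 3 for v in values) / len(values)

mean = statistics.fmean(orders)      # central tendency
median = statistics.median(orders)   # middle value, robust to the outlier 40
stdev = statistics.stdev(orders)     # sample standard deviation (n - 1)
skew = skewness(orders)              # positive here: long right tail

print(f"mean={mean}, median={median}, stdev={stdev:.2f}, skewness={skew:.2f}")
```

The mean sits above the median because the single large value (40) pulls it to the right, which is also what the positive skewness indicates.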
Inferential Statistics
• A frequent decision-making situation is one where only sample data is available. Based on
the sample evidence, inferences are drawn and generalized to the target population
• A common use case of inferential statistics is marketing research techniques like
cluster analysis, conjoint analysis, multidimensional scaling wherein data is collected
from a sample chosen from the target population, and analyzed to provide inferences
for the entire population
• An exit poll during elections is an example of a sample survey: results from the sampled
voters are generalized to the whole electorate. This method is referred to as
‘Statistical Inference’
• Hypothesis testing for the population mean using the sample mean forms an important
part of inferential statistics
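A rough sketch of inference about a population mean is given below: a z-based confidence interval followed by a two-sided Z test. All numbers are hypothetical, and the sketch assumes the population standard deviation is known (which is what justifies using the standard normal distribution here).

```python
import math
import statistics

# Hypothetical survey summary: 36 B2B customers rated satisfaction on a 1-10 scale.
sample_mean = 7.2
sample_size = 36
population_sd = 1.5       # assumed known, so a z-based method applies
confidence = 0.95
hypothesized_mean = 7.0   # H0: population mean = 7.0 (illustrative value)

std_normal = statistics.NormalDist()  # standard normal: mean 0, sd 1

# Confidence interval: sample mean +/- z * standard error
z_critical = std_normal.inv_cdf(1 - (1 - confidence) / 2)
standard_error = population_sd / math.sqrt(sample_size)
margin = z_critical * standard_error
ci_low, ci_high = sample_mean - margin, sample_mean + margin

# Two-sided Z test of H0 against H1: population mean != 7.0
z_stat = (sample_mean - hypothesized_mean) / standard_error
p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))

print(f"{confidence:.0%} CI: ({ci_low:.2f}, {ci_high:.2f})")
print(f"z = {z_stat:.2f}, p-value = {p_value:.3f}")
```

With these assumed inputs the interval contains the hypothesized mean and the p-value is well above 0.05, so the sample gives no evidence against H0 at the 5% level.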
Analytical Statistics
• This deals with validating and establishing relationships between two or
more variables. It includes methods like correlation and regression,
association of attributes, multivariate analysis, etc., which help establish
relationships between variables
• Analytical statistics is often considered a subset of inferential statistics;
however, we have segregated the two for the purpose of this course
• It supports comparison, interpolation, and extrapolation, as well as the study
of relationships. These cases require multiple samples, drawn from different
populations or from the same population, for example, sales of a product before
and after the launch of a promotion campaign
• Two very useful visualization charts used in analytical statistics are scatter
plot and line fit plot
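The two core techniques named above, correlation and simple linear regression, can be sketched from first principles. The `spend` and `sales` figures are hypothetical, and `pearson_r` and `fit_line` are illustrative helper names rather than functions from any library.

```python
import math

# Hypothetical data: monthly promotion spend and sales (same units throughout).
spend = [10, 12, 15, 18, 20, 25]
sales = [40, 44, 52, 58, 61, 75]

def _sums_of_squares(x, y):
    """Return (Sxx, Syy, Sxy) and the two means for paired data."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxx, syy, sxy, mean_x, mean_y

def pearson_r(x, y):
    """Pearson correlation coefficient: Sxy / sqrt(Sxx * Syy)."""
    sxx, syy, sxy, _, _ = _sums_of_squares(x, y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares line y = intercept + slope * x."""
    sxx, _, sxy, mean_x, mean_y = _sums_of_squares(x, y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

r = pearson_r(spend, sales)
slope, intercept = fit_line(spend, sales)
predicted_sales = intercept + slope * 22  # interpolation within the observed range

print(f"r = {r:.3f}, sales = {intercept:.2f} + {slope:.2f} * spend")
print(f"Predicted sales at spend = 22: {predicted_sales:.1f}")
```

A correlation close to 1 indicates a strong positive linear association in this sample; on its own it does not establish that spend causes sales.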
Inductive Statistics
• Decision-making in many business situations requires estimates about the future, such as
trends and forecasts, e.g. of sales, market share, etc.
• Inductive statistics includes methods that help in generalizing trends based on time
series data
• These methods estimate indirectly from partial data, or forecast on the basis of past
data. An example is technical analysis in equity research, where future share price
movement is predicted from the historical share price trend, such as the 50-day moving
average (DMA) or 200 DMA
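A simple moving average of the kind used for a 50 DMA or 200 DMA can be sketched as follows. The closing prices are hypothetical, a 3-day window is used only to keep the example short, and `simple_moving_average` is an illustrative helper name.

```python
def simple_moving_average(prices, window):
    """Average of each run of `window` consecutive prices, oldest first."""
    if window <= 0 or window > len(prices):
        raise ValueError("window must be between 1 and len(prices)")
    return [
        sum(prices[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(prices))
    ]

# Hypothetical daily closing prices of a share.
closing = [100, 102, 101, 105, 107, 110, 108]

sma3 = simple_moving_average(closing, 3)

# A naive trend signal: compare the latest price with the latest average.
latest_price, latest_sma = closing[-1], sma3[-1]
above_trend = latest_price > latest_sma

print([round(v, 2) for v in sma3])
print("Latest close is above its 3-day average:", above_trend)
```

A 50 DMA would use the same function with `window=50` on at least 50 days of prices; the smoothing removes day-to-day noise so the underlying trend is easier to read.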
Mapping of Statistical Analysis Techniques
Category | Root Word | Analysis Techniques (examples) | When to Use
Descriptive | Description | Mean, median, standard deviation, skewness; histogram, boxplot | Describe and visualize the data distribution
Inferential | Inference | Confidence interval estimation, hypothesis testing for the population mean; bell curve | Derive inferences for the population based on sample data
Analytical | Relationship | Correlation and regression analysis; scatter plot, line fit plot | To validate and establish relationships between variables
Inductive | Time series | Moving average, exponential moving average, autoregression; time series plot, e.g. line graph | To conduct trend analysis and forecasting