Research Methodology

Module-5
-Dr. Jay Prakash Verma
Ph.D., MBA, M.Com., B.Com(H), UGC-NET
Associate Dean & Associate Professor- Author
Descriptive Statistics
AGENDA:
• Measurement of central tendency: Mean, Median, Mode
• Dispersion: Range, Variance, Standard Deviation, Skewness and Kurtosis
• Measures of relationship: Correlation
• Sampling and non-sampling errors
• Degrees of freedom and standard error
• Univariate and bivariate analysis
Descriptive Statistics in Business Research
• Understand fundamental statistical concepts for business data analysis
• Apply appropriate statistical measures to describe business phenomena
• Interpret statistical results in business decision-making contexts
• Develop critical analytical skills for research and data-driven decision making
What is Descriptive Statistics?
• Definition: Methods used to summarize,
organize, and simplify data.
• Purpose: Transforms raw data into
meaningful information.
• Role in Business Research:
• Foundation for data analysis and
interpretation.
• Provides clear summaries of complex
business data.
• Enables pattern identification and trend
analysis.
• Forms basis for more advanced
statistical analysis.
Types of Descriptive Statistics
Measures of Central Tendency: Identify the center of a data
distribution.
Measures of Dispersion: Describe the spread or variability of
data.
Measures of Distribution Shape: Describe asymmetry and
peakedness.
Measures of Relationship: Quantify connections between
variables.
Introduction to Central Tendency
• Definition: Statistical measures that identify the center or middle of a data set.
• Importance in Business:
• Provides representative values for business metrics.
• Allows comparison between different data sets.
• Simplifies complex data for decision-making.
• Common Measures: Mean, Median, Mode
Mean
• Definition: Arithmetic average of all values in a data set
• Formula: x̄ = Σ fᵢxᵢ / Σ fᵢ, where fᵢ is the frequency of value xᵢ (for ungrouped data, x̄ = Σ xᵢ / n)
• Business Applications:
• Average sales figures
• Average customer spending
• Average production costs
• Strengths: Uses all data points, mathematically precise
• Limitations: Sensitive to outliers and extreme values
Mean - Examples
and Practice
• Example 1: Monthly sales figures for Q1
($000s): $120, $145, $130
• Mean = ($120 + $145 + $130)/3 = $131.67
• Example 2: Customer wait times (minutes): 3,
5, 8, 12, 7
• Mean = (3 + 5 + 8 + 12 + 7)/5 = 7 minutes
• Practice Problem: Calculate mean employee
productivity scores: 78, 82, 95, 67, 88, 91
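The two worked examples above can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the variable names are illustrative and not part of the original slides.

```python
from statistics import mean

# Example 1: monthly sales for Q1, in $000s
q1_sales = [120, 145, 130]
# Example 2: customer wait times, in minutes
wait_times = [3, 5, 8, 12, 7]

# The mean is the sum of the values divided by the number of values
mean_sales = mean(q1_sales)
mean_wait = mean(wait_times)

print(f"Mean Q1 sales: {mean_sales:.2f} ($000s)")
print(f"Mean wait time: {mean_wait:.1f} minutes")
```

The same call can be applied to the practice problem data.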
Median
• Definition: Middle value when data is arranged in ascending or descending order.
• Calculation:
• For odd number of observations: middle value
• For even number: average of two middle values
• Business Applications:
• Median income of target market
• Median product prices
• Median response times
• Strengths: Not affected by extreme values, suitable for ordinal data
• Limitations: Ignores actual values of most observations
Median - Examples and Practice

• Example 1: House prices ($000s): $350, $280, $420, $315, $550


• Arranged: $280, $315, $350, $420, $550
• Median = $350
• Example 2: Employee tenure (years): 2, 8, 5, 10, 12, 4
• Arranged: 2, 4, 5, 8, 10, 12
• Median = (5 + 8)/2 = 6.5
• Practice Problem: Find median customer satisfaction rating: 3, 5,
2, 4, 3, 5, 1
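As a quick check of the odd and even cases described above, the following sketch uses Python's standard library. The variable names are assumptions chosen for illustration.

```python
from statistics import median

# Example 1: house prices in $000s (odd number of observations)
house_prices = [350, 280, 420, 315, 550]
# Example 2: employee tenure in years (even number of observations)
tenure_years = [2, 8, 5, 10, 12, 4]

# median() sorts the data internally; for an even count it averages
# the two middle values
median_price = median(house_prices)
median_tenure = median(tenure_years)

print(f"Median house price: {median_price} ($000s)")
print(f"Median tenure: {median_tenure} years")
```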
Mode
• Definition: Most frequently occurring value in a data set
• Characteristics:
• May have multiple modes (bimodal, multimodal)
• May not exist if no values repeat
• Business Applications:
• Most common product size purchased
• Most frequent customer complaints
• Most common price point
• Strengths: Works with nominal data, identifies most common value
• Limitations: May not be representative of the data set
Mode - Examples and Practice
• Example 1: Product ratings (1-5 scale): 4, 3, 5, 4,
2, 4, 5
• Mode = 4 (occurs three times)
• Example 2: Customer age groups: 20s, 30s, 40s,
30s, 20s, 30s, 50s
• Mode = 30s (occurs three times)
• Practice Problem: Identify mode in marketing
channel conversions: Email, Social, Email, Direct,
Search, Social, Email
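The mode works on both numeric and nominal data, which the sketch below illustrates with the two examples above. The `shirt_sizes` list is hypothetical data added only to show a bimodal case; it does not come from the slides.

```python
from statistics import mode, multimode

# Example 1: product ratings on a 1-5 scale
ratings = [4, 3, 5, 4, 2, 4, 5]
# Example 2: customer age groups (nominal labels)
age_groups = ["20s", "30s", "40s", "30s", "20s", "30s", "50s"]

rating_mode = mode(ratings)
age_mode = mode(age_groups)

# Hypothetical bimodal data: two sizes tie for most frequent
shirt_sizes = ["S", "M", "M", "L", "L"]
size_modes = multimode(shirt_sizes)  # returns every value tied for the top count

print(rating_mode, age_mode, size_modes)
```

`multimode` is the safer choice when ties are possible, since it returns all tied values instead of just one.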
Comparing Measures of Central Tendency
When to Use Mean:
• Data is approximately normally distributed
• No significant outliers
• Interval or ratio data
When to Use Median:
• Data is skewed
• Outliers present
• Ordinal data
When to Use Mode:
• Nominal data
• Interest in most common category
• Multimodal distributions for segment identification
Introduction to Dispersion
• Definition: Measures how spread out or scattered the data values are.
• Importance in Business:
• Indicates data reliability and consistency
• Reveals variability in business metrics
• Helps assess risk and uncertainty
• Common Measures: Range, Variance, Standard Deviation
Range
• Definition: Difference between the maximum and minimum values.
• Formula: Range = Maximum value - Minimum value
• Business Applications:
• Price ranges in market analysis
• Production output variability
• Customer spending range
• Strengths: Simple to calculate and understand
• Limitations: Uses only two data points, ignores distribution
Range - Examples and Practice
• Example 1: Monthly revenue ($000s): $80, $95, $110, $88, $92
• Range = $110 - $80 = $30
• Example 2: Product defect rates (%): 2.1, 1.8, 3.2, 2.5, 1.9
• Range = 3.2 - 1.8 = 1.4
• Practice Problem: Calculate range of delivery times (days): 3, 7, 2, 5, 8, 4
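The range calculation is a one-liner in code. The sketch below reproduces the two examples; `data_range` is a hypothetical helper name, not a library function.

```python
def data_range(values):
    """Return the difference between the largest and smallest value."""
    return max(values) - min(values)

# Example 1: monthly revenue in $000s
monthly_revenue = [80, 95, 110, 88, 92]
# Example 2: product defect rates in percent
defect_rates = [2.1, 1.8, 3.2, 2.5, 1.9]

revenue_range = data_range(monthly_revenue)
defect_range = data_range(defect_rates)

print(f"Revenue range: {revenue_range} ($000s)")
# Rounded for display because of floating-point representation
print(f"Defect rate range: {defect_range:.1f} percentage points")
```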
Variance
• Definition: Average of squared deviations from the mean
• Formula: σ² = ∑ (xi - μ)² / N (population)
• Sample Variance: s² = ∑ (xi - x̄)² / (n - 1) (sample)
• Business Applications:
• Analyzing variability in financial returns
• Quality control measurements
• Assessing consistency in performance metrics
• Limitations: Not in same units as original data
Variance - Calculation Steps
• Calculate the mean of the data set
• Subtract the mean from each data point
• Square each deviation
• Sum the squared deviations
• Divide by N (population) or n - 1 (sample)
• Example: Customer wait times (minutes): 5, 8, 4, 10, 7
• Mean = 6.8 minutes
• Deviations: -1.8, 1.2, -2.8, 3.2, 0.2
• Squared deviations: 3.24, 1.44, 7.84, 10.24, 0.04
• Sum of squared deviations = 22.8
• Sample variance = 22.8/4 = 5.7
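The calculation steps above map directly onto code. The following is a minimal sketch that follows the steps one by one for the wait-time example and then compares the result with the standard library's `statistics.variance`; the intermediate variable names are illustrative.

```python
from statistics import variance

wait_times = [5, 8, 4, 10, 7]  # customer wait times in minutes
n = len(wait_times)

# Step 1: calculate the mean
mean_wait = sum(wait_times) / n
# Step 2: subtract the mean from each data point
deviations = [x - mean_wait for x in wait_times]
# Step 3: square each deviation
squared_deviations = [d ** 2 for d in deviations]
# Step 4: sum the squared deviations
sum_of_squares = sum(squared_deviations)
# Step 5: divide by N (population) or n - 1 (sample)
population_variance = sum_of_squares / n
sample_variance = sum_of_squares / (n - 1)

print(f"Mean: {mean_wait:.1f}")
print(f"Sum of squared deviations: {sum_of_squares:.1f}")
print(f"Population variance: {population_variance:.2f}")
print(f"Sample variance: {sample_variance:.1f}")
# Cross-check against the library implementation of the sample variance
print(f"statistics.variance: {variance(wait_times):.1f}")
```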
Standard Deviation
• Definition: Square root of the variance; a measure of the typical deviation from the mean
• Business Applications:
• Measuring volatility in stock prices
• Evaluating consistency in production
• Quantifying risk in business metrics
• Advantages: Same units as original data, widely used in analysis
Standard Deviation - Examples
• Example 1: From previous slide, wait times standard deviation = √5.7 = 2.39 minutes
• Example 2: Sales performance (units): 45, 52, 49, 38, 56
• Mean = 48
• Sample standard deviation = √(190/4) = √47.5 = 6.89 units
• Interpretation: For approximately normally distributed data, about 68% of values fall within ±1 standard deviation of the mean
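A short sketch for computing the sample standard deviation of both data sets with the standard library. It uses `statistics.stdev` (divides by n - 1) and, for contrast, `statistics.pstdev` (divides by N); the variable names are illustrative.

```python
from statistics import mean, stdev, pstdev

wait_times = [5, 8, 4, 10, 7]     # minutes
sales_units = [45, 52, 49, 38, 56]

# Sample standard deviation: square root of the sample variance (n - 1)
wait_sd = stdev(wait_times)
sales_sd = stdev(sales_units)
# Population standard deviation: square root of the population variance (N)
sales_population_sd = pstdev(sales_units)

print(f"Wait times: sample SD = {wait_sd:.2f} minutes")
print(f"Sales: mean = {mean(sales_units)}, sample SD = {sales_sd:.2f} units")
print(f"Sales: population SD = {sales_population_sd:.2f} units")
```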
Coefficient of Variation
• Definition: Standardized measure of dispersion relative to the mean
• Formula: CV = (s / x̄) × 100%
• Business Applications:
• Comparing variability between different data sets
• Assessing relative risk between investments
• Comparing consistency across different business units
• Advantage: Allows comparison of dispersion across different scales
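To illustrate how the coefficient of variation allows comparison across scales, the sketch below applies it to the wait-time and sales data used earlier in this module. The helper name `coefficient_of_variation` is an assumption for illustration.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

wait_times = [5, 8, 4, 10, 7]       # minutes
sales_units = [45, 52, 49, 38, 56]  # units

cv_wait = coefficient_of_variation(wait_times)
cv_sales = coefficient_of_variation(sales_units)

print(f"CV of wait times: {cv_wait:.1f}%")
print(f"CV of sales: {cv_sales:.1f}%")
```

Although the sales data has the larger standard deviation in absolute terms, the wait times vary more relative to their mean.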
Distribution Shape - Overview
Definition: Characteristics describing the form of a probability distribution
Key Measures:
• Skewness: Asymmetry of the distribution
• Kurtosis: Peakedness and tail heaviness
Business Importance:
• Informs appropriate statistical tests
• Reveals underlying data patterns
• Guides data transformation decisions
Skewness
Definition: Measure of asymmetry in a distribution
• Formula:
Types:
• Positive skew (right skew): longer tail to the right
• Negative skew (left skew): longer tail to the left
• Zero skew: symmetric distribution
Interpretation (typical for unimodal distributions):
• Positive: Mean > Median > Mode
• Negative: Mean < Median < Mode
• Zero: Mean = Median = Mode
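One common way to quantify skewness is the moment coefficient, the third central moment divided by the second central moment raised to the power 3/2. The slide's own formula is not available, so this choice is an assumption; the data sets below are hypothetical and exist only to show the sign of the result.

```python
def moment_skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(values)
    mean_value = sum(values) / n
    m2 = sum((x - mean_value) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean_value) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data sets chosen to illustrate each case
right_skewed = [30, 32, 35, 38, 40, 45, 120]  # one large value stretches the right tail
symmetric = [1, 2, 3, 4, 5]
left_skewed = [-x for x in right_skewed]      # mirror image of the right-skewed data

skew_right = moment_skewness(right_skewed)
skew_zero = moment_skewness(symmetric)
skew_left = moment_skewness(left_skewed)

print(f"Right-skewed: {skew_right:.2f}")
print(f"Symmetric: {skew_zero:.2f}")
print(f"Left-skewed: {skew_left:.2f}")
```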
Skewness - Visual Representation
• Positively Skewed
Distributions:
• Income distributions
• Property values
• Time to complete tasks
• Negatively Skewed
Distributions:
• Test scores with ceiling effects
• Age at retirement
• Product purity levels
Kurtosis
• Definition: Measure of "tailedness" or peakedness of a distribution
• Types (by excess kurtosis, i.e. kurtosis minus 3):
• Leptokurtic (positive): More peaked, heavier tails
• Mesokurtic (zero): Normal distribution
• Platykurtic (negative): Flatter, lighter tails
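The positive, zero and negative labels above correspond to excess kurtosis, which can be sketched as the fourth central moment divided by the squared second central moment, minus 3. This moment-based definition is an assumption, since the slides give no formula, and both data sets are hypothetical.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (population form)."""
    n = len(values)
    mean_value = sum(values) / n
    m2 = sum((x - mean_value) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean_value) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical data: most values near the center plus two extreme values
heavy_tailed = [-10, -1, 0, 0, 0, 0, 0, 0, 1, 10]
# Hypothetical data: values spread evenly, no pronounced tails
evenly_spread = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

kurt_heavy = excess_kurtosis(heavy_tailed)
kurt_flat = excess_kurtosis(evenly_spread)

print(f"Heavy-tailed data: {kurt_heavy:.2f} (leptokurtic if positive)")
print(f"Evenly spread data: {kurt_flat:.2f} (platykurtic if negative)")
```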
Kurtosis - Business Implications

• Leptokurtic Distributions:
• Financial returns during market volatility
• Customer response times with outliers
• Platykurtic Distributions:
• Evenly distributed sales across product
lines
• Uniform quality control measurements
• Business Impact:
• Risk assessment and management
• Identifying unusual patterns
• Validating statistical assumptions
Introduction to Correlation
• Definition: Statistical measure indicating direction and strength of relationship between variables
• Key Characteristics:
• Direction (positive/negative)
• Strength (weak/moderate/strong)
• Linear vs. nonlinear
• Business Applications:
• Marketing effectiveness analysis
• Financial variable relationships
• Operational performance factors
Pearson Correlation Coefficient
• Definition: Measures linear relationship between two continuous
variables.

• Range: -1 to +1
• +1: Perfect positive correlation
• 0: No linear correlation
• -1: Perfect negative correlation
Interpreting Correlation Coefficients
Strong Positive (0.7 to 1.0):
• As X increases, Y strongly increases
Moderate Positive (0.3 to 0.7):
• As X increases, Y moderately increases
Weak Positive (0 to 0.3):
• As X increases, Y slightly increases
• Same interpretations for negative correlations (-0.7 to -1.0, etc.)
• Example: r = 0.85 between advertising spend and sales indicates strong positive relationship
Correlation Examples
Example 1: Advertising Expenditure vs. Sales
• Data points: (20,40), (25,46), (30,52), (35,58), (40,65)
• Correlation coefficient: r ≈ 0.999
• Interpretation: Very strong positive relationship
Example 2: Price vs. Demand
• Data points: (10,50), (20,40), (30,35), (40,25), (50,15)
• Correlation coefficient: r ≈ -0.99
• Interpretation: Very strong negative relationship
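The Pearson coefficient for both examples can be computed from its definition: the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The sketch below is a plain-Python illustration; `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Example 1: advertising expenditure vs. sales
advertising = [20, 25, 30, 35, 40]
sales = [40, 46, 52, 58, 65]
# Example 2: price vs. demand
price = [10, 20, 30, 40, 50]
demand = [50, 40, 35, 25, 15]

r_ad_sales = pearson_r(advertising, sales)
r_price_demand = pearson_r(price, demand)

print(f"Advertising vs. sales: r = {r_ad_sales:.4f}")
print(f"Price vs. demand: r = {r_price_demand:.4f}")
```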
Correlation vs. Causation
Correlation: Statistical relationship between variables
Causation: One variable directly influences another
Important Distinction:
• Correlation does not imply causation.
• Third variables may create spurious correlations.
• Coincidental relationships can show strong correlation.
• Business Example: Ice cream sales and drowning deaths (both caused by summer weather)
Other Correlation Measures
Spearman's Rank Correlation:
• For ordinal data or monotonic (not necessarily linear) relationships
• Based on ranks rather than actual values
Point-Biserial Correlation:
• Between continuous and binary variables
Kendall's Tau:
• Non-parametric measure of relationship
• Useful for small sample sizes
• When to use each measure
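Because Spearman's coefficient is based on ranks, it can be sketched as the Pearson correlation of the ranked values, with tied values sharing their average rank. The helper names and the data below are hypothetical and serve only to show a relationship that is perfectly monotonic but not linear.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

def average_ranks(values):
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y grows with x, but along a curve (y = x cubed)
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(f"Spearman rho = {rho:.3f}, Pearson r = {r:.3f}")
```

Spearman's coefficient reaches 1 here because the ordering is preserved exactly, while Pearson's r stays below 1 because the relationship is curved.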
Introduction to Errors in Research
Definition:
• Deviations between sample statistics and population parameters
Types:
• Sampling errors: Due to sampling process
• Non-sampling errors: All other sources
Business Impact:
• Affects decision quality
• Influences research reliability
• Determines confidence in findings
Sampling Errors
Definition: Differences between sample and population due to random sampling
Characteristics:
• Naturally occurs in all samples
• Can be statistically estimated
• Decreases with larger sample sizes
• Business Example: Market research survey with ±3% margin of error
Factors Affecting Sampling Error
Sample Size:
• Larger samples → smaller sampling errors
Population Variability:
• More heterogeneous populations → larger sampling errors
Sampling Fraction:
• Higher percentage of population sampled → smaller errors
Sampling Design:
• Stratified sampling can reduce error compared to simple random sampling
Non-Sampling Errors
• Definition: Errors not attributable to sampling variation
• Types:
• Coverage errors (frame errors)
• Measurement errors
• Processing errors
• Non-response errors
• Response errors
• Characteristics: Often more problematic than sampling errors, harder to measure
Types of Non-Sampling Errors - Detailed
Coverage Errors:
• Target population vs. sampling frame mismatch
• Example: Online survey excluding non-internet users
Measurement Errors:
• Flawed measurement instruments
• Example: Ambiguous survey questions
Non-response Errors:
• Bias from systematic non-participation
• Example: Lower response rates from certain demographics
Minimizing Research Errors
Sampling Error Reduction:
• Increase sample size
• Use stratified sampling when appropriate
• Ensure random selection within strata
Non-sampling Error Reduction:
• Careful questionnaire design
• Thorough interviewer training
• Multiple contact attempts for non-respondents
• Data validation and cleaning procedures
Degrees of Freedom - Concept
• Definition: Number of values free to vary in the final calculation of a statistic
• Conceptual Understanding:
• Constraints reduce degrees of freedom
• Related to sample size and parameters estimated
• General Formula: df = n - k
• n = sample size
• k = number of parameters estimated
Degrees of Freedom - Examples
• One-Sample t-test: df = n - 1
• One parameter (mean) is estimated
• Independent Samples t-test: df = n₁ + n₂ - 2
• Two parameters (two means) are estimated
• Correlation: df = n - 2
• Two parameters (the means of X and Y) are estimated
• Business Context: Affects critical values in hypothesis testing

Standard Error
• Definition: Standard deviation of a sampling distribution
• Formulas:
• Standard Error of Mean: SE = σ / √n
• Estimated SE of Mean: SE = s / √n
• Importance:
• Measures precision of sample statistics
• Used in confidence interval construction
• Foundation for inferential statistics
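The estimated standard error of the mean, SE = s / √n, can be sketched in a few lines. The example reuses the customer wait-time data from the variance slides; the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

wait_times = [5, 8, 4, 10, 7]  # customer wait times in minutes

n = len(wait_times)
s = stdev(wait_times)          # sample standard deviation (n - 1 in the denominator)
se_mean = s / sqrt(n)          # estimated standard error of the mean

print(f"s = {s:.2f} minutes, n = {n}")
print(f"Estimated SE of the mean = {se_mean:.2f} minutes")
```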
Standard Error - Applications
• Confidence Intervals:
• 95% CI = Sample statistic ± 1.96 × Standard Error
• Hypothesis Testing:
• Test statistic = (Sample statistic - Hypothesized value) / Standard Error
• Business Example:
• SE of $2 in mean customer spending of $45
• Interpretation: About 95% confidence that the actual mean is within roughly $4 (1.96 × $2 = $3.92) of the estimate, i.e. between about $41 and $49
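The business example can be worked through numerically as follows. The sample mean ($45) and standard error ($2) come from the slide; the hypothesized mean of $40 is a hypothetical value added only to illustrate the test statistic formula.

```python
sample_mean = 45.0      # mean customer spending in dollars (from the example)
standard_error = 2.0    # standard error of the mean in dollars (from the example)
z_95 = 1.96             # critical value for a 95% confidence interval

# 95% CI = sample statistic ± 1.96 × standard error
margin_of_error = z_95 * standard_error
ci_lower = sample_mean - margin_of_error
ci_upper = sample_mean + margin_of_error

# Test statistic = (sample statistic - hypothesized value) / standard error
hypothesized_mean = 40.0  # hypothetical value, not from the slides
test_statistic = (sample_mean - hypothesized_mean) / standard_error

print(f"Margin of error: ${margin_of_error:.2f}")
print(f"95% CI: ${ci_lower:.2f} to ${ci_upper:.2f}")
print(f"Test statistic against ${hypothesized_mean:.0f}: {test_statistic:.2f}")
```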
Relationship Between Standard Error,
Sample Size and Variability
• Sample Size Effect:
• SE ∝ 1/√n (inversely proportional to square root of sample size)
• Doubling sample size reduces SE by factor of √2
• Population Variability Effect:
• SE ∝ σ (directly proportional to population standard deviation)
• Business Implication: Balancing precision needs with research costs
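The trade-off between precision and research cost becomes concrete when SE is computed for several sample sizes. In this sketch the population standard deviation of 10 and the sample sizes are hypothetical values chosen for illustration.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

sigma = 10.0  # hypothetical population standard deviation

se_by_sample_size = {n: standard_error(sigma, n) for n in (100, 200, 400)}

for n, se in se_by_sample_size.items():
    print(f"n = {n}: SE = {se:.3f}")

# Doubling n divides SE by sqrt(2); quadrupling n halves it
ratio_when_doubling = se_by_sample_size[100] / se_by_sample_size[200]
print(f"SE(n=100) / SE(n=200) = {ratio_when_doubling:.3f}")
```

Halving the standard error therefore requires four times as many observations, which is why precision gains become progressively more expensive.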
Univariate Analysis - Introduction
Definition: Statistical analysis of a single variable
Purpose:
• Understand distribution characteristics
• Identify central tendency and dispersion
• Examine data quality and patterns
Business Applications:
• Customer demographic analysis
• Product performance metrics
• Financial indicator assessment
Univariate Analysis - Methods
Frequency Distributions:
• Tables showing count/percentage of observations
Graphical Methods:
• Histograms, bar charts, pie charts
• Box plots, stem-and-leaf plots
Numerical Measures:
• All previously discussed central tendency and dispersion measures
• Example: Analysis of customer age distribution in market segment.
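A frequency distribution of customer ages can be sketched with `collections.Counter`. The ages below are hypothetical, as is the choice of ten-year bands; neither comes from the slides.

```python
from collections import Counter

# Hypothetical customer ages for one market segment
ages = [23, 27, 31, 34, 35, 38, 41, 42, 45, 52, 29, 33]

def age_band(age):
    """Map an age to a ten-year band label such as '30s'."""
    return f"{age // 10 * 10}s"

band_counts = Counter(age_band(age) for age in ages)
total = len(ages)

# Frequency table: count and percentage of observations per band
print("Band  Count  Percent")
for band in sorted(band_counts):
    count = band_counts[band]
    print(f"{band:<5} {count:>5}  {count / total * 100:>6.1f}%")
```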
Univariate Analysis - Example
• Data: Employee satisfaction scores (1-5 scale)
• Numerical Analysis:
• Mean: 3.8
• Median: 4
• Mode: 4
• Standard Deviation: 0.9
• Graphical Analysis: [Histogram showing distribution]
• Business Insights: Generally high satisfaction with some variation
Bivariate Analysis - Introduction
Definition: Statistical analysis examining relationship between two variables
Purpose:
• Determine association between variables
• Identify patterns and relationships
• Support predictive analysis
Business Applications:
• Price-demand relationships
• Marketing spend vs. sales
• Employee training and productivity
Bivariate Analysis - Methods
Cross-tabulations: For categorical variables
Scatter Plots: For continuous variables
Correlation Analysis: Pearson, Spearman, etc.
Simple Regression: Linear relationship modeling
Contingency Tables: Joint frequency distributions
Bivariate Analysis - Example
• Variables: Advertising Expenditure ($000s) and Sales ($000s)
• Correlation Analysis: r = 0.92
• Regression Equation: Sales = 120 + 4.5 × Advertising
• Scatter Plot: See fig
• Business Insight: Strong positive relationship; each additional $1,000 in advertising is associated with about $4,500 in additional sales
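The regression equation can be read as a simple prediction rule. The sketch below encodes the intercept (120) and slope (4.5) from the example; the advertising levels passed in are hypothetical inputs, and the function name is illustrative.

```python
INTERCEPT = 120.0  # predicted sales ($000s) at zero advertising, from the example
SLOPE = 4.5        # change in sales ($000s) per $000s of advertising, from the example

def predicted_sales(advertising):
    """Predicted sales in $000s for a given advertising spend in $000s."""
    return INTERCEPT + SLOPE * advertising

# Hypothetical advertising budgets, in $000s
sales_at_10 = predicted_sales(10)
sales_at_11 = predicted_sales(11)

print(f"Advertising $10k: predicted sales ${sales_at_10:.1f}k")
print(f"Advertising $11k: predicted sales ${sales_at_11:.1f}k")
# The difference equals the slope: one extra $1,000 of advertising
print(f"Increase per extra $1k: ${sales_at_11 - sales_at_10:.1f}k")
```

The slope describes an association in the fitted data and does not by itself show that advertising causes the additional sales.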
Moving Beyond Bivariate Analysis
Multivariate Analysis:
• Examining three or more variables simultaneously
• Multiple regression, factor analysis, cluster analysis
From Description to Inference:
• Hypothesis testing
• Confidence intervals
• Predictive modeling
• Business Value: More comprehensive understanding of complex business phenomena
Business Applications of Descriptive Statistics
Marketing:
• Customer segmentation
• Campaign effectiveness analysis
• Price sensitivity studies
Operations:
• Quality control monitoring
• Process capability analysis
• Productivity measurement
Finance:
• Risk assessment
• Investment return analysis
• Cost variance analysis
Summary of Key Concepts
Central Tendency: Mean, median, mode represent the center
Dispersion: Range, variance, standard deviation measure spread
Distribution Shape: Skewness and kurtosis describe asymmetry and peakedness
Relationships: Correlation quantifies associations between variables
Research Quality: Understanding sampling and non-sampling errors
Analysis Approach: Choosing appropriate univariate or bivariate methods
