2nd Unit - Statistics
Descriptive Statistics
Measures of central tendency are statistical indicators that represent the "typical"
value of a dataset. They provide a single summary of the data's center or middle
point, helping us condense large datasets into a more manageable form.
1. Mean:
● The sum of all values in the dataset divided by the number of values.
● Uses every observation, but is sensitive to outliers, since extreme values
pull it toward them.
2. Median:
● The middle value of a dataset when arranged in ascending or descending
order.
● If the dataset has an even number of values, the median is the average of the
two middle values.
● Less sensitive to outliers than the mean, making it a preferred choice for
skewed data distributions.
3. Mode:
● The value that appears most frequently in a dataset.
● Can be multimodal if there are multiple values with the highest frequency.
● Useful for identifying the most common value in a dataset, but doesn't
necessarily represent the center.
Choosing the right measure of central tendency depends on the nature of your data
and what you want to understand.
● Mean: Use it for normally distributed data where outliers are minimal, and you
want an overall "average" value.
● Median: Use it for skewed data or data with outliers, as it's less affected by
extreme values.
● Mode: Use it to identify the most frequent value in the data, especially for
categorical data.
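As a quick check of the guidelines above, Python's standard-library `statistics` module computes all three measures; the sample data here is hypothetical, chosen so that a single outlier (40) separates the mean from the median.

```python
import statistics

data = [2, 3, 3, 5, 7, 10, 40]  # hypothetical sample with one outlier (40)

print(statistics.mean(data))    # 10 — pulled upward by the outlier
print(statistics.median(data))  # 5  — middle value, robust to the outlier
print(statistics.mode(data))    # 3  — most frequent value
```

Note how the outlier drags the mean well above the median, which is exactly why the median is preferred for skewed data.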
Mean
The mean in statistics, often called the arithmetic mean or simply the average, is a
measure of central tendency that represents the sum of all values in a data set
divided by the number of values. It gives you a general sense of the "middle" of the
data.
Formula:
The most common formula for the mean (denoted by x̄) is:
x̄ = (Σxᵢ) / n
where Σxᵢ is the sum of all values and n is the number of values.
Advantages:
● Uses every value in the dataset, making it a comprehensive summary.
● Simple to calculate and widely understood.
Limitations:
● Sensitive to outliers: a single extreme value can pull the mean away from
the bulk of the data.
● Can be misleading for skewed distributions, where it no longer represents a
"typical" value.
Median
Definition:
● The median is the middle value in a set of data when arranged in order, from
smallest to largest.
● If the data set has an odd number of observations, the median is the middle
one.
● If the data set has an even number of observations, the median is the average
of the two middle values.
Formula:
For n ordered values, the median is the ((n + 1) / 2)-th value when n is odd,
and the average of the (n / 2)-th and (n / 2 + 1)-th values when n is even.
Advantages:
● Robust to outliers: extreme values do not shift the median.
● Well suited to skewed distributions and to ordinal data.
Limitations:
● Less precise than the mean: The median doesn't utilize all the information in
the data set, potentially leading to less precise estimates of central tendency
compared to the mean.
● Difficult to interpret with grouped data: When working with grouped data
(data presented in frequency tables), calculating the median can be more
complex and require additional manipulation.
● Limited information about the distribution: Unlike the standard deviation,
the median doesn't provide information about the spread of the data.
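The odd/even rule above can be verified directly with the standard library (the sample lists are hypothetical):

```python
import statistics

odd = [7, 1, 5]      # sorted: [1, 5, 7] -> single middle value
even = [1, 3, 5, 7]  # average of the two middle values: (3 + 5) / 2

print(statistics.median(odd))   # 5
print(statistics.median(even))  # 4.0
```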
Mode
Definition: The mode in statistics is the value that appears most frequently in a data
set. It is a measure of central tendency, along with the mean and median.
Formula:
For ungrouped data there is no algebraic formula; the mode is found by
inspection as the most frequent value. For grouped data, the standard formula
is:
Mode = L + ((f₁ − f₀) / (2f₁ − f₀ − f₂)) × h
where L is the lower boundary of the modal class, f₁ its frequency, f₀ and f₂
the frequencies of the preceding and following classes, and h the class width.
Advantages:
● The only measure of central tendency that works for nominal (categorical)
data.
● Unaffected by extreme values and easy to identify.
Limitations:
● May not be unique: It is possible for a data set to have multiple modes,
especially if there are several values that appear with the same high
frequency.
● Can be misleading for small data sets: For small data sets, random
fluctuations can lead to false modes.
● Not suitable for continuous data: in continuous measurements, values rarely
repeat exactly, so a meaningful mode may not exist without grouping the data
first.
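Python's `statistics.multimode` (available since Python 3.8) illustrates the multiple-modes case directly, including for categorical data:

```python
import statistics

data = [1, 1, 2, 2, 3]                # 1 and 2 tie for highest frequency
print(statistics.multimode(data))     # [1, 2] — a bimodal dataset
print(statistics.multimode("aabbc"))  # ['a', 'b'] — works for categories too
```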
Measures of Dispersion
Types of Measures:
● Range
● Quartile Deviation
● Mean Deviation
● Standard Deviation
● Coefficient of Variation
Range
The range is a simple yet powerful measure of dispersion in statistics. It tells you
how "spread out" your data is by simply calculating the difference between the
maximum and minimum values in your dataset.
Formula:
Range = Maximum value − Minimum value
Interpretation:
● The range gives you an absolute measure of dispersion, meaning it tells you
the exact distance covered by your data points.
● A higher range indicates greater spread, while a lower range indicates more
closely clustered data points.
Limitations:
● The range can be easily skewed by outliers, as a single extreme value can
significantly inflate the range.
● It doesn't take into account the distribution of your data. Two datasets with the
same range can have very different underlying structures.
Applications:
● The range is a quick and easy way to get a preliminary sense of how spread
out your data is.
● It can be useful for comparing the variation of small datasets where outliers
are less likely to distort the picture.
● It can be used in conjunction with other measures of dispersion, like standard
deviation and quartile deviation, to provide a more comprehensive
understanding of data variability.
Examples:
● Consider a dataset of the ages of students in a class: {12, 13, 14, 15, 16}. The
range would be 16 - 12 = 4.
● In a dataset of exam scores: {70, 80, 85, 90, 95}, the range would be 95 - 70
= 25.
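A minimal sketch of the range calculation, reproducing the two examples above (`value_range` is a hypothetical helper name):

```python
def value_range(data):
    """Range = maximum value - minimum value."""
    return max(data) - min(data)

print(value_range([12, 13, 14, 15, 16]))  # 4
print(value_range([70, 80, 85, 90, 95]))  # 25
```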
Quartile Deviation
What it is:
● Half of the Interquartile Range (IQR): The IQR is the difference between the
third quartile (Q3) and the first quartile (Q1). Quartile deviation takes this
range and divides it by two.
● Focuses on the middle 50%: Unlike measures like variance and standard
deviation that consider all data points, quartile deviation only focuses on the
central 50% of your data. This makes it less sensitive to outliers.
Formula:
Quartile Deviation = (Q3 − Q1) / 2
Interpretation:
● A higher quartile deviation indicates that the middle 50% of your data is more
spread out, with larger differences between values.
● A lower quartile deviation indicates that the middle 50% of your data is more
tightly clustered, with smaller differences between values.
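A short sketch of the quartile-deviation calculation using the standard library. Note that `statistics.quantiles` supports several interpolation methods and different methods can give slightly different quartiles; the `inclusive` method is used here.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]

# quantiles() with n=4 returns the three cut points [Q1, Q2, Q3]
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
qd = (q3 - q1) / 2  # quartile deviation = half the IQR

print(q1, q3, qd)  # 2.5 5.5 1.5
```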
Advantages:
● Robust to outliers: Because it only considers the middle 50% of the data,
quartile deviation is less affected by outliers than other measures of
dispersion.
● Simple to calculate: The formula is straightforward and easy to apply, even
with basic calculations.
● Interpretable: The units of quartile deviation are the same as the units of your
data, making it easier to understand.
Disadvantages:
● Ignores half of the data: values below Q1 and above Q3 have no effect on it.
● Not suitable for further algebraic treatment, unlike the standard deviation.
Mean Deviation
Mean deviation, simply put, is the average of the absolute deviations (distances) of
all data points from the mean (or sometimes, the median) of the dataset. This means
we calculate the difference between each individual value and the central value, take
the absolute value (making negative differences positive), and then average them all.
● Mean deviation from the mean: This is the most common type, where the
central value is the mean of the dataset. It gives an average absolute distance
from the "typical" value, indicating how spread out the data is.
● Mean deviation from the median: In this case, the median, another measure
of central tendency, acts as the reference point. This is useful when the data
has outliers that skew the mean, as the median is less sensitive to extreme
values.
Formula:
Mean Deviation = Σ|xᵢ − A| / n
where A is the chosen central value (the mean or the median) and n is the
number of values.
Applications:
● Comparing the variability of different datasets on the same scale, even if their
units are different.
● Identifying outliers that significantly deviate from the average.
● Assessing the "typical" spread of data around a central value.
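The definition above translates directly into code; `mean_deviation` is a hypothetical helper name, and the `center` parameter selects the mean or the median as the reference point.

```python
import statistics

def mean_deviation(data, center=None):
    """Average absolute distance of each point from a central value
    (the mean by default; pass the median for the robust variant)."""
    if center is None:
        center = statistics.mean(data)
    return sum(abs(x - center) for x in data) / len(data)

data = [2, 4, 6, 8, 10]                               # mean = median = 6
print(mean_deviation(data))                           # 2.4
print(mean_deviation(data, statistics.median(data)))  # 2.4 (same, as mean == median)
```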
Standard Deviation
Mean deviation and standard deviation are both measures of dispersion, but
they differ in terms of calculation and interpretation:
Standard deviation (SD) is the most common and widely used measure of
dispersion. It tells us, on average, how far individual data points deviate from the
mean. Essentially, it quantifies the "spread" of the data.
Formula:
Population SD: σ = √( Σ(xᵢ − μ)² / N )
Sample SD: s = √( Σ(xᵢ − x̄)² / (n − 1) )
where:
● xᵢ = each individual value
● μ (or x̄) = the mean of the data
● N (or n) = the number of values
Interpretation:
● A higher SD indicates that the data is more spread out and deviates more
from the mean.
● A lower SD indicates that the data is more clustered around the mean.
● In a normal distribution, approximately 68% of the data points will fall within 1
SD of the mean, 95% within 2 SDs, and 99.7% within 3 SDs.
Limitations:
● Sensitive to outliers.
● Cannot be directly compared across datasets with different units.
● Not a good measure for skewed distributions.
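A brief illustration with the standard library, which distinguishes the population SD (`pstdev`, divide by N) from the sample SD (`stdev`, divide by n − 1):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean = 5

print(statistics.pstdev(data))  # 2.0   — population SD (divide by N)
print(statistics.stdev(data))   # ~2.14 — sample SD (divide by n - 1)
```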
Coefficient of Variation
Calculating CV:
CV = (Standard Deviation / Mean) × 100%
Because the units cancel, the CV is a unit-free percentage, which makes it
suitable for comparing variability across datasets measured in different units.
Interpreting CV:
There are no universal thresholds for interpreting CV, but as a rule of thumb
a lower CV indicates data that is more consistent relative to its mean, while
a higher CV indicates greater relative variability.
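Assuming the usual formula CV = (SD / mean) × 100, a small sketch comparing two hypothetical samples measured in different units (`coefficient_of_variation` is a hypothetical helper name):

```python
import statistics

def coefficient_of_variation(data):
    """CV = (standard deviation / mean) * 100 — a unit-free percentage."""
    return statistics.pstdev(data) / statistics.mean(data) * 100

heights_cm = [160, 170, 180]  # hypothetical sample
weights_kg = [55, 70, 85]     # hypothetical sample in different units

# Despite the different units, the CVs are directly comparable:
print(coefficient_of_variation(heights_cm))  # ~4.8%  — relatively consistent
print(coefficient_of_variation(weights_kg))  # ~17.5% — relatively more variable
```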
Moments
Moments are quantitative measures that describe the shape of a distribution;
the first four correspond to the mean, variance, skewness, and kurtosis.
● First moment (mean): Measures the central tendency of the data. For a
symmetrical distribution, the mean coincides with the median and mode.
● Second moment (variance): Measures the spread of the data around the
mean. A higher variance indicates greater dispersion of the data points.
● Third moment (skewness): Measures the asymmetry of the distribution. A
positive skewness indicates a longer tail on the right side, while a negative
skewness indicates a longer tail on the left side. A zero skewness means the
distribution is symmetrical.
● Fourth moment (kurtosis): Measures the "tailedness" of the distribution. A
higher kurtosis indicates heavier tails than a normal distribution, while a lower
kurtosis indicates lighter tails. A kurtosis of 3 corresponds to a normal
distribution.
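The four moments above can be computed from their definitions with plain Python; `central_moment` is a hypothetical helper, and the sample is deliberately symmetric so the skewness comes out to zero.

```python
import statistics

def central_moment(data, k):
    """k-th central moment: average of (x - mean) ** k."""
    m = statistics.mean(data)
    return sum((x - m) ** k for x in data) / len(data)

data = [1, 2, 2, 3, 3, 3, 4, 4, 5]  # symmetric around 3

var = central_moment(data, 2)              # second moment: variance
skew = central_moment(data, 3) / var**1.5  # standardized third moment
kurt = central_moment(data, 4) / var**2    # standardized fourth moment

print(var)   # ~1.33
print(skew)  # 0.0  — symmetric distribution
print(kurt)  # 2.25 — lighter tails than normal (kurtosis < 3)
```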
Skewness
Types of Skewness:
● Positive Skewness: The tail of the distribution stretches out to the right, with
more data points clustered on the left side. Imagine a bunch of kids on a
seesaw – more on the left side makes the right side rise higher.
● Negative Skewness: The tail extends to the left, with more data points on the
right side. Picture the seesaw tipping the other way.
● No Skewness: The distribution is perfectly symmetrical, a bell-shaped curve
with the mean, median, and mode all coinciding. The seesaw is perfectly
balanced.
Understanding Skewness:
● Implications: Knowing the skewness helps interpret your data better. For
example, skewed income data might show more low earners than high
earners, highlighting income inequality.
● Impact on Statistics: Some statistical tests rely on normal distributions (zero
skewness). If your data is highly skewed, these tests might not be reliable.
● Measuring Skewness: Several methods exist, like Pearson's coefficient and
the moment skewness. These values indicate the direction and magnitude of
the asymmetry.
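One of the measures mentioned above, Pearson's second skewness coefficient, is 3 × (mean − median) / SD; a positive result indicates right (positive) skew. A minimal sketch with a hypothetical sample:

```python
import statistics

def pearson_skew(data):
    """Pearson's second skewness coefficient: 3 * (mean - median) / SD."""
    return 3 * (statistics.mean(data) - statistics.median(data)) / statistics.pstdev(data)

right_skewed = [1, 2, 2, 3, 3, 4, 5, 9]  # hypothetical sample, long right tail
print(pearson_skew(right_skewed))        # positive value -> right (positive) skew
```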
Kurtosis
Types of kurtosis:
● Leptokurtic: This type of distribution has heavy tails and a high kurtosis value
(greater than 3). This means that there are many more extreme values than in
a normal distribution.
● Mesokurtic: This type of distribution has medium tails and a kurtosis value of
3. This is the same as a normal distribution.
● Platykurtic: This type of distribution has light tails and a low kurtosis value
(less than 3). This means that there are fewer extreme values than in a
normal distribution.
How is kurtosis used?
Kurtosis is used to assess how prone a distribution is to producing extreme
values (outliers). High kurtosis signals heavy tails, which matters in fields
like finance, where it indicates a greater chance of extreme gains or losses;
it is also examined alongside skewness when checking whether data is
approximately normal.
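One common use is classifying a distribution's tail behavior against the normal benchmark of 3. A minimal sketch (helper names are hypothetical):

```python
def kurtosis(data):
    """Standardized fourth central moment (normal distribution has ~3)."""
    n = len(data)
    m = sum(data) / n
    var = sum((x - m) ** 2 for x in data) / n
    return sum((x - m) ** 4 for x in data) / n / var**2

def classify(k):
    """Label a kurtosis value relative to the normal benchmark of 3."""
    if k > 3:
        return "leptokurtic"
    if k < 3:
        return "platykurtic"
    return "mesokurtic"

print(classify(kurtosis([1, 2, 3, 4, 5])))   # platykurtic — flat, light tails
print(classify(kurtosis([0, 0, 0, 0, 10])))  # leptokurtic — heavy-tailed
```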