Lesson 3 4

This document discusses the importance of sample size in research, emphasizing its impact on statistical analysis, confidence intervals, and confidence levels. It also explains measures of central tendency, including mean, median, and mode, detailing their advantages and limitations in representing data. Additionally, it describes how the shape of a distribution affects these measures, particularly in symmetrical and skewed distributions.

Uploaded by

Joelyn Capa


Lesson 3: Sample size, Confidence Interval and Confidence Level

Sample size is a research term for the number of individuals included in a research study to represent a
population.
Determining the appropriate sample size is one of the most important factors in statistical analysis. If the sample
size is too small, it will not yield valid results or adequately represent the realities of the population being
studied. On the other hand, while larger sample sizes yield smaller margins of error and are more representative,
a sample size that is too large may significantly increase the cost and time taken to conduct the research.
When selecting a sample, multiple factors can affect the reliability and validity of results. The two measures
most closely tied to sample size are the confidence interval and the confidence level.

Confidence Interval (Margin of Error)

A confidence interval expresses how much uncertainty there is in a statistic estimated from a sample. In simple
terms, it tells you how closely the results from a study are likely to reflect what you would find if it were
possible to survey the entire population being studied. It is usually reported as a plus or minus (±) figure, the
margin of error, around the sample result. For example, if your margin of error is 6 percentage points and 60% of
your sample picks an answer, you can be confident (at your chosen confidence level) that if you had asked the
entire population, between 54% (60-6) and 66% (60+6) would have picked that answer.

Confidence Level
The confidence level is the probability, expressed as a percentage, that the confidence interval would contain
the true population parameter if you drew random samples many times. It represents how often the true
percentage of the population who would pick an answer lies within the confidence interval. For example, a 99%
confidence level means that if you repeated an experiment or survey over and over again, about 99 percent of the
resulting confidence intervals would contain the true population value.
The larger your sample size, the more confident you can be that the answers truly reflect the
population. In other words, for a given confidence level, the larger your sample, the smaller your confidence
interval.
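
The relationship between sample size and the width of the interval can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the common normal-approximation formula for a sample proportion (margin = z * sqrt(p(1-p)/n)), which the lesson itself does not give, and the function names and the table of z-scores are hypothetical helpers introduced here.

```python
import math

# Assumption: standard normal z-scores for three common confidence levels.
Z_SCORES = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}


def margin_of_error(p, n, confidence=0.95):
    """Approximate margin of error for a sample proportion p with sample size n.

    Uses the normal approximation, which is an assumption of this sketch.
    """
    z = Z_SCORES[confidence]
    return z * math.sqrt(p * (1 - p) / n)


def confidence_interval(p, n, confidence=0.95):
    """Return (lower, upper) bounds: the sample result plus or minus the margin."""
    moe = margin_of_error(p, n, confidence)
    return (p - moe, p + moe)


# 60% of the sample picks an answer; compare several sample sizes.
for n in (100, 400, 1600):
    moe = margin_of_error(0.60, n)
    low, high = confidence_interval(0.60, n)
    print(f"n={n}: margin ±{moe * 100:.1f} points, interval {low:.3f} to {high:.3f}")
```

Under this formula, quadrupling the sample size halves the margin of error, and asking for a higher confidence level with the same sample widens the interval.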
Lesson 4: Measures of Central Tendency
A measure of central tendency is a single value that attempts to describe a set of data by identifying the
central position within that set of data.

MEAN
- The mean (or average) is the most popular and well-known measure of central tendency. It can be used
with both discrete and continuous data, although it is most often used with continuous data.

- The mean is the sum of the value of each observation in a dataset divided by the number of
observations. This is also known as the arithmetic average.
Consider the retirement ages of 11 people, in whole years:
54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60
The mean is calculated by adding together all the values (54+54+54+55+56+57+57+58+58+60+60 = 623) and
dividing by the number of observations (11), which gives approximately 56.6 years.
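
The same calculation can be checked with a short Python sketch. The variable names are illustrative only; the data are the retirement ages listed above.

```python
# Retirement ages of the 11 people from the example.
ages = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60]

total = sum(ages)         # sum of all observations
count = len(ages)         # number of observations
mean_age = total / count  # arithmetic average

print(f"sum={total}, n={count}, mean={mean_age:.1f}")
```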

Advantage of the mean

- The mean can be used for both continuous and discrete numeric data.

Limitations of the mean

- The mean cannot be calculated for categorical data, as the values cannot be summed.
The mean has one main disadvantage: it is particularly susceptible to the influence of outliers. These are values
that are unusual compared to the rest of the data set by being especially small or large in numerical value. For
example, consider the wages of staff at a factory below:

Staff     1     2     3     4     5     6     7     8     9     10

Salary    15k   18k   16k   14k   15k   15k   12k   17k   90k   95k

The mean salary for these ten staff is $30.7k. However, inspecting the raw data suggests that this mean value
might not be the best way to accurately reflect the typical salary of a worker, as most workers have salaries in
the $12k to $18k range. The mean is being skewed by the two large salaries. In this situation, the median
would be a better measure of central tendency.
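
A brief sketch using Python's standard `statistics` module shows how strongly the two large salaries pull the mean. The variable names and the 50k cut-off used to separate the two large salaries are assumptions made for this illustration only.

```python
import statistics

# Salaries of the ten staff, in thousands of dollars.
salaries_k = [15, 18, 16, 14, 15, 15, 12, 17, 90, 95]

mean_salary = statistics.mean(salaries_k)
median_salary = statistics.median(salaries_k)

# Assumption for illustration: treat salaries of 50k or more as the outliers.
typical = [s for s in salaries_k if s < 50]
mean_without_outliers = statistics.mean(typical)

print(f"mean with all staff:       {mean_salary}k")
print(f"median with all staff:     {median_salary}k")
print(f"mean without two outliers: {mean_without_outliers}k")
```

The median and the mean without the two large salaries both fall inside the $12k to $18k range where most workers sit, while the mean of all ten values does not.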

Median
The median is the middle score for a set of data that has been arranged in order of magnitude. The median is
less affected by outliers and skewed data. In order to calculate the median, suppose we have the data below:

65 55 89 56 35 14 56 55 87 45 92
We first need to rearrange that data into order of magnitude (smallest first):

14 35 45 55 55 56 56 65 87 89 92

Our median mark is the middle mark, in this case 56 (the sixth value in the ordered list). It is the middle mark
because there are 5 scores before it and 5 scores after it. This works fine when you have an odd number of scores,
but what happens when you have an even number of scores? What if you had only 10 scores? In that case, you take
the middle two scores and average them. So, if we look at the example below:

65 55 89 56 35 14 56 55 87 45

We again rearrange that data into order of magnitude (smallest first):

14 35 45 55 55 56 56 65 87 89

This time we take the 5th and 6th scores in our data set (55 and 56) and average them to get a median of 55.5.
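
The two cases (odd and even number of scores) can be sketched as a small Python function. The function name `median` is a hand-written helper for illustration; Python's standard library also provides `statistics.median`, which gives the same results for these data.

```python
def median(values):
    """Return the middle value of the data after sorting.

    For an even number of values, average the two middle values.
    """
    ordered = sorted(values)  # arrange in order of magnitude, smallest first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd count: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of middle two


odd_scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]   # 11 scores
even_scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45]      # 10 scores

print(median(odd_scores))
print(median(even_scores))
```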

Advantage of the median

The median is less affected by outliers and skewed data than the mean and is usually the preferred measure of
central tendency when the distribution is not symmetrical.
Limitation of the median
The median cannot be identified for categorical nominal data, as such data cannot be logically ordered.

Mode
The mode is the most commonly occurring value in a distribution.
Consider this dataset showing the retirement age of 11 people, in whole years:
54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

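The table below shows a simple frequency distribution of the retirement age data.

Retirement age   54   55   56   57   58   60
Frequency         3    1    1    2    2    2

The most commonly occurring value is 54 years, so the mode is 54.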

Advantage of the mode

The mode has an advantage over the median and the mean as it can be found for both numerical and categorical
(non-numerical) data.
Limitations of the mode
There are some limitations to using the mode. In some distributions, the mode may not reflect the centre of the
distribution very well. When the distribution of retirement age is ordered from lowest to highest value, it is easy
to see that the centre of the distribution is 57 years, but the mode is lower, at 54 years.
54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60
It is also possible for a distribution to have more than one mode (bi-modal or multi-modal). The presence of more
than one mode limits the ability of the mode to describe the centre or typical value of the distribution, because
no single value can be identified to describe the centre.
In some cases, particularly where the data are continuous, the distribution may have no mode at all (i.e. if all
values are different).
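
The behaviour of the mode in these situations can be sketched in Python by counting how often each value occurs. The helper `modes` and the small bi-modal, no-mode and colour examples are made up for illustration; the convention of reporting no mode when every value occurs exactly once is an assumption of this sketch, matching the description above.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent value(s) in non-empty data.

    Returns an empty list when every value occurs exactly once (no mode).
    """
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


ages = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60]

print(sorted(Counter(ages).items()))    # frequency of each retirement age
print(modes(ages))                      # a single mode
print(modes([1, 1, 2, 2, 3]))           # hypothetical bi-modal data
print(modes([1.2, 3.4, 5.6]))           # hypothetical data with no repeats
print(modes(["red", "blue", "red"]))    # works for categorical data too
```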

Impact of shape of distribution on measures of central tendency

Symmetrical distributions
When a distribution is symmetrical, the mode, median and mean are all in the middle of the distribution. The
following graph shows a larger retirement age dataset with a distribution which is symmetrical. The mode,
median and mean all equal 58 years.

Skewed distributions
When a distribution is skewed, the mode remains the most commonly occurring value and the median remains the
middle value in the distribution, but the mean is generally ‘pulled’ in the direction of the tail. In a skewed
distribution, the median is often a preferred measure of central tendency, as the mean is not usually in the
middle of the distribution.
A distribution is said to be positively or right skewed when the tail on the right side of the distribution is longer
than the left side. In a positively skewed distribution it is common for the mean to be ‘pulled’ toward the right
tail of the distribution. Although there are exceptions to this rule, generally, most of the values, including the
median value, tend to be less than the mean value.
The following graph shows a larger retirement age data set with a distribution which is right skewed. The data
has been grouped into classes, as the variable being measured (retirement age) is continuous. The mode is 54
years, the modal class is 54-56 years, the median is 56 years, and the mean is 57.2 years.
Retirement age: Positive (right) skew

A distribution is said to be negatively or left skewed when the tail on the left side of the distribution is longer
than the right side. In a negatively skewed distribution, it is common for the mean to be ‘pulled’ toward the left
tail of the distribution. Although there are exceptions to this rule, generally, most of the values, including the
median value, tend to be greater than the mean value.
The following graph shows a larger retirement age dataset with a distribution which is left skewed. The mode is
65 years, the modal class is 63-65 years, the median is 63 years and the mean is 61.8 years.
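
How the three measures line up under different shapes can be sketched with Python's standard `statistics` module. The three small datasets below are hypothetical values invented for this illustration (they are not the retirement age data behind the graphs), and `describe` is an illustrative helper name.

```python
import statistics


def describe(values):
    """Return the mean, median and mode of a dataset with a single mode."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }


# Hypothetical datasets chosen only to show each shape.
symmetrical = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]          # long tail on the right
left_skewed = [1, 6, 10, 11, 12, 12, 13, 13, 13, 14]    # long tail on the left

for name, data in [("symmetrical", symmetrical),
                   ("right skewed", right_skewed),
                   ("left skewed", left_skewed)]:
    print(name, describe(data))
```

In these examples the symmetrical data give the same value for all three measures, the right-skewed data give mode < median < mean, and the left-skewed data give mean < median < mode. As noted above, real data can contain exceptions to this ordering.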
