
3rd Week


Name of Department : NURSING

Course Code and Name : ISF205E STATISTICS


Course Week : 3rd
Course Day and Time : MONDAYS 12:00-15:00
Course Credit/ECTS Information : 3-0-3 credits, 6 ECTS
Examination Type and Gradings : 50% Midterm and Final
Instructor’s Name & Surname : Asst.Prof. Eda Merve KURTULUŞ
E-mail & Phone: : emkurtulus@gelişim.edu.tr -411
Instructor’s Room : 014
Office Hours : Fridays 9-11:00
GBS Link : https://gbs.gelisim.edu.tr/ders-plani-5-51-1
ALMS Link : https://lms.gelisim.edu.tr/Account/LoginBefore
AVESIS Link : https://avesis.gelisim.edu.tr/emkurtulus
| 14-WEEK COURSE CONTENTS |

1. Introduction to statistics
2. Statistical concepts and the regulation of data
3. Descriptive Statistics
4. Measures of central tendencies
5. Measures of central distribution
6. Probability
7. Problem solving and Quiz
8. Probability Distributions, Normal Distribution
9. Confidence Intervals and Sampling Methods
10. Cut off and Interval Estimation
11. Comparison of means, Comparison of ratios
12. Hypothesis testing
13. Correlation and Regression
14. Chi-square Testing
| DAILY FLOW |

12.00-12.50 / 1st Hour
13.00-13.50 / 2nd Hour
14.00-14.50 / 3rd Hour

[Diagram of course topics: Causation, Data Presentation, Mean, Median etc., Std deviation, Probability, Normal Distribution, Cut off, Significance]
| ABOUT THE PREVIOUS COURSE |

Causation
Correlation
Types Of Data
Nominal Data
Ordinal Data
Interval Data
Continuous Data
| WEEKLY LEARNING OUTCOMES |
Students will
• define median, mean, harmonic mean, geometric mean, and mode.
• explain the differences between these measures of central tendency.
• calculate these measures from given datasets.
• compare the use of different means in symmetrical and skewed datasets.
• evaluate the appropriateness of different means in practical scenarios.
• create problems requiring the application of these measures.
1st Hour: DEFINITIONS
CLASS, UPPER BOUNDARY, LOWER BOUNDARY, INTERVALS, RANGE
• Class(es): a term used for grouping the data within definite boundaries.
• The desired number of classes can be selected. This is usually between 5 and 20.
• Classes usually have intervals of definite width.
• Frequency: the number of occurrences of a repeating event, that is, how often you observe a certain event.
• Cumulative Frequency: the number of values less than the upper class boundary of the current class. This is a running total of the frequencies.
• Upper (Class) Boundary: the upper limit that separates one class in a grouped frequency distribution from another.
• Lower (Class) Boundary: the lower limit that separates one class in a grouped frequency distribution from another.
• The range is a measure of variability
that represents the difference
between the highest and lowest
values in a data set. It provides a
simple way to understand the
spread of the data.
• Formula:
• Range = Maximum Value − Minimum Value

• Example: for values running from 51 to 70, Range = 70 − 51 = 19
• Class Interval Size or Class Width
• The difference between the upper
and lower boundaries of any class.
• The class width is also the
difference between the lower
limits of two consecutive classes
or the upper limits of two
consecutive classes.
• It is not the difference between
the upper and lower limits of the
same class.
Class Limits and the Midpoint
• The point halfway between adjacent intervals
• Upper and lower limits
– The distance between the upper and lower limit determines the size of the class interval
Size of the class interval (interval of class): $i = U - L$
i = size of a class interval
U = upper limit of a class interval
L = lower limit of a class interval
The Midpoint
• The middlemost score value in a class interval
– The sum of the lowest and highest value in a class interval divided by two
$m = \frac{\text{lowest score value} + \text{highest score value}}{2}$
Central Tendencies
• Central tendency is a property of the data that they tend to be clustered about a single center point.
• Numbers used to present the center or the middle of the data set
– mean (generally not part of the data set)
– median (may be part of the data set)
– mode (always part of the data set)
Dispersion
• Dispersion is a property of the data that they tend to be spread out.
• Refers to the spread of values about the mean
– Quartiles
– Range
– Variance
– Standard Deviation
1. Arithmetic Mean
• The mean, or arithmetic mean, is the "average" of the values.
• It is obtained by adding all the values in a sample or population and dividing the sum by the number of values.
Properties of the mean
1. Uniqueness -- For a given set of data there is one and
only one mean.
2. Simplicity -- The mean is easy to calculate.
3. Affected by extreme values -- The mean is
influenced by each value. Therefore, extreme values
can distort the mean.

Calculating Sample Mean
Add up all of the data points and divide by the number of data points.

Number of drinks/day: 2 8 3 4 1

Sample Mean = (2+8+3+4+1)/5 = 3.6
Group (Xi)   Frequency (fi)   Xi x fi
51           1                51
66           3                198
72           4                288
82           5                410
94           7                658
Total        n = 20           1605

Mean = 1605 / 20 = 80.25
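The frequency-table calculation above can be checked with a few lines of Python. This is only a minimal sketch; the variable names (`freq_table`, `n`, `total`, `mean`) are chosen here for illustration and do not come from the slides.

```python
# Frequency table from the slide: value -> how many times it occurs.
freq_table = {51: 1, 66: 3, 72: 4, 82: 5, 94: 7}

n = sum(freq_table.values())                        # total number of observations
total = sum(x * f for x, f in freq_table.items())   # sum of Xi x fi
mean = total / n                                    # arithmetic mean

print("n =", n)
print("sum of Xi x fi =", total)
print("mean =", mean)
```

Multiplying each value by its frequency is the same as writing every observation out individually and adding them all up.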
Mean of Grouped Data
Class     Freq   MidPoint   Freq x MidPoint
0--4      2      2          4
4--8      3      6          18
8--12     8      10         80
12--16    3      14         42
16--20    2      18         36
Total     18                180

Mean = 180 / 18 = 10
Groups    Xi             fi     fi x Xi
0-10      (0+10)/2 = 5   3      15
10-20     15             12     180
20-30     25             35     875
30-40     35             45     1575
40-50     45             110    4950
50-60     55             45     2475
60-70     65             35     2275
70-80     75             30     2250
80-90     85             15     1275
90-100    95             10     950
Total                    340    16820

$\sum_{i=1}^{10} f_i = 340$

$\bar{X} = \frac{\sum f_i X_i}{\sum f_i} = \frac{16820}{340} = 49.47$
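As a rough illustration of the grouped-data mean, the sketch below recomputes the table using class midpoints. The function name `grouped_mean` and the `(lower, upper, frequency)` layout are assumptions made for this example only.

```python
# Each class as (lower boundary, upper boundary, frequency), copied from the table.
classes = [
    (0, 10, 3), (10, 20, 12), (20, 30, 35), (30, 40, 45), (40, 50, 110),
    (50, 60, 45), (60, 70, 35), (70, 80, 30), (80, 90, 15), (90, 100, 10),
]

def grouped_mean(classes):
    """Approximate the mean of grouped data using class midpoints."""
    n = sum(f for _, _, f in classes)
    weighted_sum = sum((lower + upper) / 2 * f for lower, upper, f in classes)
    return weighted_sum / n

print(round(grouped_mean(classes), 2))
```

The result is an approximation, because every observation in a class is treated as if it sat exactly at the class midpoint.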


Weighted Mean

• Each item being averaged is multiplied by a number (weight) based on the item's relative importance.
• The results are summed and the total is divided by the sum of the weights.
• Weighted averages are used extensively in descriptive statistical analysis, such as index numbers.
• $w_i$ is the weight (ratio or frequency) of the data value $x_i$: $\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$
Lessons   Mathematics   Chemistry   Biology   Physics   Language
Credits   6             4           2         5         3
Grade     54            50          54        51        67

• a) Find the mean of the grades
• b) Find the weighted mean
• c) Decide which one should be used

Answer: c) Weighted
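The two means asked for in this exercise can be compared with a short sketch like the one below. It assumes the course credits act as the weights; the variable names are illustrative.

```python
# Credits (weights) and grades in the order: Mathematics, Chemistry, Biology, Physics, Language.
credits = [6, 4, 2, 5, 3]
grades = [54, 50, 54, 51, 67]

# Simple mean: every lesson counts equally.
simple_mean = sum(grades) / len(grades)

# Weighted mean: each grade is multiplied by its credit, then divided by total credits.
weighted_mean = sum(c * g for c, g in zip(credits, grades)) / sum(credits)

print("simple mean:", simple_mean)
print("weighted mean:", weighted_mean)
```

The weighted mean comes out lower than the simple mean here because the highest grade (Language) carries few credits.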
Geometric Mean
• Used when the growth in the number of something per area is examined over time (for example, contamination).
• When the number does not change by the same amount from one period to the next, but the change is proportional to the initial value, the geometric mean is used.
• Another way of saying this is that the growth is multiplicative, not additive.
• The geometric mean is the nth root of the product of the values for n observations:
• $G = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$
Example
• In a set:
• The increase on the 1st day is 0.8
• On the second day the increase is 0.10
• And on the 3rd day: 0.12
• What is the average increase rate?
• $G = \sqrt[3]{0.8 \times 0.10 \times 0.12} = \sqrt[3]{0.0096} \approx 0.212$
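The geometric mean for this example can be reproduced as follows. This is a minimal sketch that takes the three daily values exactly as printed on the slide (0.8, 0.10 and 0.12); the variable names are illustrative.

```python
import math

# Daily increase values as given in the example.
rates = [0.8, 0.10, 0.12]

product = math.prod(rates)               # multiply all values together
geo_mean = product ** (1 / len(rates))   # take the n-th root (here the cube root)

print(round(geo_mean, 4))
```

Python's standard library also offers `statistics.geometric_mean`, which should give the same value for positive inputs.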
Harmonic Mean
• Harmonic mean is another measure of central tendency.
• Like the arithmetic mean and geometric mean, the harmonic mean is also useful for quantitative data.
• The harmonic mean can be expressed as the reciprocal of the arithmetic mean of the reciprocals of the given set of observations ("reciprocal" just means 1/value).
• Or, equivalently, as the quotient of the "number of the given values" and the "sum of the reciprocals of the given values": $H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$
• The harmonic mean is ideal for rates and ratios, especially when the data points are inversely related to the quantities of interest (e.g., speed and time).
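Example
A driver traveled a 200 km road in 2 hours and returned in 4 hours. Calculate the average speed of the driver on this journey.
400 / 6 = 66.67 km/h
The speeds are 100 km/h out and 50 km/h back; their harmonic mean, 2 / (1/100 + 1/50), gives the same 66.67 km/h.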
• A company has three machines with production rates of 60 units/hour, 90 units/hour, and 120 units/hour.
• What is the average production rate of the machines?
• The reciprocal of the first machine's rate is 1/60
• The reciprocal of the 2nd machine's rate is 1/90
• The reciprocal of the 3rd machine's rate is 1/120
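Both harmonic-mean examples can be sketched in Python as below. The driver's speeds (100 km/h and 50 km/h) are derived from the distances and times given; applying the harmonic mean to the machines assumes each machine produces the same number of units, which the slide does not state explicitly.

```python
from statistics import harmonic_mean

# Driver: 200 km in 2 h (100 km/h) and 200 km in 4 h (50 km/h).
speeds = [200 / 2, 200 / 4]
avg_speed = harmonic_mean(speeds)   # equals total distance / total time = 400 / 6

# Machines: number of values divided by the sum of the reciprocals.
machine_rates = [60, 90, 120]
avg_rate = len(machine_rates) / sum(1 / r for r in machine_rates)

print(round(avg_speed, 2), "km/h")
print(round(avg_rate, 2), "units/hour")
```

For the driver, the plain arithmetic mean of the two speeds would be 75 km/h, which overstates the true average because more time is spent at the slower speed.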
Descriptive Statistics

Central Tendencies: Mean, Median, Mode
Dispersion: Quartiles, Percentage, Range, Variance, Standard Deviation
2. MEDIAN
• The median is a simple measure of central tendency.
• The MEDIAN is the number that is in the middle of a set of data.
• To find the median, we arrange the observations in order from smallest to largest value.
• 1. Arrange the numbers in the set in order from least to greatest.
• 2. Then find the number that is in the middle.
Positions: 1 2 3 4 5 6 7 8 9 10 11
With 11 ordered values, the median is the 6th value.

Positions: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
With 16 ordered values, the median lies between the 8th and 9th values in the row.

85 86 86 86 89 90

63 73 84 86 88 95 97 97 100
The median is 88.
Half the numbers are less than the median. Half the numbers are greater than the median.

63 73 84 88 95 97 97 100
88 + 95 = 183
183 ÷ 2 = 91.5
The median is 91.5.

Position: 1   2   3   4     5     6   7   8   9   10  11  12  13  14  15  16
Value:    160 160 165 170.1 170.1 175 178 178 178 180 181 183 185 186 187 190

Grouped Data

Class     Frequency
160-165   3
170-175   3
176-180   4
181-185   3
186-190   3
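The odd and even cases above can be checked with Python's standard `statistics.median`, which sorts the data and averages the two middle values when the count is even. This is a small sketch using the data sets from the slides; the variable names are illustrative.

```python
from statistics import median

odd_set = [63, 73, 84, 86, 88, 95, 97, 97, 100]   # 9 values: the 5th one is the median
even_set = [63, 73, 84, 88, 95, 97, 97, 100]      # 8 values: average of the 4th and 5th
heights = [160, 160, 165, 170.1, 170.1, 175, 178, 178,
           178, 180, 181, 183, 185, 186, 187, 190]  # 16 values: between the 8th and 9th

print(median(odd_set))
print(median(even_set))
print(median(heights))
```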
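Finding the Median in Grouped Data
• $\text{Median} = L + \frac{N/2 - F}{f} \cdot i$
• L = lower boundary of the median class
• N = total frequency
• F = cumulative frequency of the classes above the median class (the boxed row) in the table
• i = class interval size
• f = frequency of the median class (the boxed row)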
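Class     Freq   Cumulative Frequency
0--4      2      2
4--8      3      5 (F)
8--12     8      13 (median class, L = 8)
12--16    3      16
16--20    2      18

$\text{Median} = L + \frac{N/2 - F}{f} \cdot i = 8 + \frac{9 - 5}{8} \cdot 4 = 10$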
Group     Xi    Frequency (fi)   Cumulative Frequency
0-10      5     3                3
10-20     15    12               15
20-30     25    35               50
30-40     35    45               95 (F)
40-50     45    110 (f)          205 (median class, L = 40)
50-60     55    45               250
60-70     65    35               285
70-80     75    30               315
80-90     85    15               330
90-100    95    10               340
Total           340

• Median position: 340/2 = 170th value
• L = lower boundary of the median class
• N = total frequency
• F = cumulative frequency of the classes above the median class in the table
• i = class interval size
• f = frequency of the median class
• $\text{Median} = 40 + \frac{170 - 95}{110} \cdot 10 = 40 + 6.82 \approx 46.82$
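The grouped-median formula can be sketched as a small function. This is an illustrative implementation under the assumption that classes are given as `(lower, upper, frequency)` tuples sorted by lower boundary; the name `grouped_median` is not from the slides.

```python
def grouped_median(classes):
    """Median of grouped data: L + ((N/2 - F) / f) * i."""
    n = sum(f for _, _, f in classes)   # N: total frequency
    half = n / 2
    cumulative = 0                      # F: cumulative frequency before the current class
    for lower, upper, f in classes:
        if cumulative + f >= half:      # first class whose running total reaches N/2
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("frequency table is empty")

small_table = [(0, 4, 2), (4, 8, 3), (8, 12, 8), (12, 16, 3), (16, 20, 2)]
large_table = [
    (0, 10, 3), (10, 20, 12), (20, 30, 35), (30, 40, 45), (40, 50, 110),
    (50, 60, 45), (60, 70, 35), (70, 80, 30), (80, 90, 15), (90, 100, 10),
]

print(grouped_median(small_table))
print(round(grouped_median(large_table), 2))
```

The function walks down the cumulative frequencies, which mirrors how the median class is located by eye in the table.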
Descriptive Biostatistics

Central Tendencies: Mean, Median, Mode
Dispersion: Range, Variance, Standard Deviation, Quartiles, Percentage
Mode
• The mode of a set of data values is the value that appears most often. It is the value x at which its probability mass function takes its maximum value. In other words, it is the value that is most likely to be sampled.
• The mode is a statistical term that refers to the most frequently occurring number found in a set of numbers.
• The mode is found by collecting and organizing data in order to count the frequency of each result. The result with the highest number of occurrences is the mode of the set.
Measures of Central Tendencies - Mode

Example 1: 2, 3, 4, 5, 7, 7, 8. Mode = 7
Example 2: 2, 3, 4, 5, 7, 7, 8, 8. Mode = 7 and 8 (polymodal)
Example 3: 1, 2, 3, 1, 1, 6, 5, 4, 1, 4, 4, 3 -> 1 has the highest frequency (f = 4)
Mode is 1 (not 4)
The mode is not the frequency of the most frequent score but the value itself.
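The three examples can be reproduced with the standard library. The sketch below uses `statistics.multimode`, which returns every value that shares the highest frequency, and `collections.Counter` to show the frequency itself; the variable names are illustrative.

```python
from collections import Counter
from statistics import multimode

example_1 = [2, 3, 4, 5, 7, 7, 8]
example_2 = [2, 3, 4, 5, 7, 7, 8, 8]
example_3 = [1, 2, 3, 1, 1, 6, 5, 4, 1, 4, 4, 3]

print(multimode(example_1))   # a single mode
print(multimode(example_2))   # two modes
print(multimode(example_3))   # the mode is the value, not its frequency

# The frequency of the mode in Example 3 is a separate number.
frequency_of_mode = Counter(example_3)[multimode(example_3)[0]]
print(frequency_of_mode)
```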
Finding the Mode in Grouped Data
$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \cdot h$

L: lower boundary of the modal class (the boxed row)
$\Delta_1$ = modal class frequency − previous class's frequency
$\Delta_2$ = modal class frequency − next class's frequency
h: class interval size

Groups    Xi             fi
0-10      (0+10)/2 = 5   3
10-20     15             12
20-30     25             35
30-40     35             45
40-50     45             110
50-60     55             45
60-70     65             35
70-80     75             30
80-90     85             15
90-100    95             10
Total                    340

• Modal class: 40-50, so L = 40, $\Delta_1 = 110 - 45 = 65$, $\Delta_2 = 110 - 45 = 65$
• $\text{Mode} = 40 + \frac{65}{65 + 65} \cdot 10$
• Mode = 45
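A possible sketch of the grouped-mode formula is shown below. It assumes a single clear modal class and treats a missing neighbouring class as having frequency 0; both are assumptions of this example, and the name `grouped_mode` is not from the slides.

```python
def grouped_mode(classes):
    """Mode of grouped data: L + (d1 / (d1 + d2)) * h."""
    freqs = [f for _, _, f in classes]
    k = freqs.index(max(freqs))                 # position of the modal class
    lower, upper, f = classes[k]
    prev_f = freqs[k - 1] if k > 0 else 0       # assumption: no previous class counts as 0
    next_f = freqs[k + 1] if k < len(freqs) - 1 else 0
    d1 = f - prev_f                             # modal frequency minus previous frequency
    d2 = f - next_f                             # modal frequency minus next frequency
    return lower + d1 / (d1 + d2) * (upper - lower)

classes = [
    (0, 10, 3), (10, 20, 12), (20, 30, 35), (30, 40, 45), (40, 50, 110),
    (50, 60, 45), (60, 70, 35), (70, 80, 30), (80, 90, 15), (90, 100, 10),
]

print(grouped_mode(classes))
```

Because the classes on either side of 40-50 have the same frequency (45), the estimated mode lands exactly in the middle of the modal class.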
Hints

• If the number of repetitions of each observation is the same, there is no mode in that data set.
• Distributions with a single value that has the highest frequency are called unimodal.

[Bar chart "Chart Title": categories A, B, C; vertical axis 0 to 700]
Hints

• If there are 2 candidates for the mode, the distribution is called bimodal.
• If there is more than one mode in the examined series, the series needs to be checked.
• If the series is homogeneous, the class widths should be changed and the classification redone.
• Thus, the mode becomes a single value.

[Bar chart "Number of Workers in the Departments of Bimodial Company": departments A to F; vertical axis 0 to 700]
Descriptive Biostatistics
Central Tendencies: Mean, Median, Mode
Dispersion: Quartiles, Percentage, Variance, Standard Deviation
Quartiles
Measures of central tendency that divide a group of data into specific subgroups
• Q1: 25% of the data set is below the first quartile. Q1 is equal to the 25th percentile.
• Q2: 50% of the data set is below the second quartile. Q2 is located at the 50th percentile and equals the median.
• Q3: 75% of the data set is below the third quartile. Q3 is equal to the 75th percentile.
• Quartile values are not necessarily members of the data set.

25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
Variance vs Variety
• Variance is the distribution or spread around the mean
• Variety means having many different values
• Values in different data sets can have the same mean even though their values are totally different.

SET 1   SET 2
2       90
2       70
2       -20
2       -50
2       8
2       -10
2       28
2       -9
2       -89

[Two charts: SET 1 and SET 2 plotted against observation number, horizontal axis 0 to 10]
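The point of the two sets can be made concrete with a short check. This sketch uses `statistics.pvariance` (the population variance, dividing by n); the list names are illustrative.

```python
from statistics import mean, pvariance

set_1 = [2, 2, 2, 2, 2, 2, 2, 2, 2]
set_2 = [90, 70, -20, -50, 8, -10, 28, -9, -89]

# Both sets share the same mean...
print(mean(set_1), mean(set_2))

# ...but their spread around that mean is very different.
print(pvariance(set_1), pvariance(set_2))
```

SET 1 has no spread at all, while SET 2 has a very large variance even though its values average out to the same number.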
Variance
A measure of the spread of the recorded values on a variable. A measure of dispersion.

The larger the variance, the further the individual cases are from the mean.

The smaller the variance, the closer the individual scores are to the mean.
Shape of the Distribution
Symmetrical Distributions
• The mode, median, and mean have identical
values

Skewed Distributions
• The mode is the peak of the curve
• The mean is closer to the tail
• The median falls between the two

Bimodal Distributions
• Both modes should be used to describe the data

Variance
1. Find the mean of the data. Hint: the mean is the average, so add up the values and divide by the number of items.
2. Subtract the mean from each value. The result is called the deviation from the mean.
3. Square each deviation from the mean.
4. Find the sum of the squares.
5. Divide the total by the number of items.

In short:
1. Find the difference between each data point and the mean.
2. Square the differences, and add them up.
3. Divide by the number of data points.

Data   Frequency   Distance (x − mean)   Absolute Value
2      1           2 − 4 = −2            |−2| = 2
4      2           4 − 4 = 0             |0| = 0
5      2           5 − 4 = 1             |1| = 1
Variance Formula

The variance formula includes the Sigma notation, $\Sigma$, which represents the sum of all the items to the right of Sigma.

$\sigma^2 = \frac{\sum (x - \mu)^2}{n}$

The mean is represented by $\mu$ and n is the number of items.
If measuring the variance of a population, it is denoted by $\sigma^2$ ("sigma-squared").
If measuring the variance of a sample, it is denoted by $s^2$ ("s-squared").
Measures the average squared deviation of data points from their mean.
Highly affected by outliers. Best for symmetric data.
SAMPLE VARIANCE VS POPULATION VARIANCE

Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

Population variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{n}$

In the exam we will be using this formulation
MEASURES OF VARIABILITY
SAMPLE VARIANCE

• The sample variance is defined as follows:

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

• where $s^2$ stands for the sample variance
• $\bar{x}$ is the sample mean
• n is the total number of values in the sample
• $x_i$ is the value of the i-th observation
• $\Sigma$ represents a summation
The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and 300mm.
Find out the Mean, the Variance, and the Standard Deviation.

Mean = (600 + 470 + 170 + 430 + 300) / 5
= 1970 / 5
= 394

• Now we calculate each dog's difference from the Mean: 206, 76, −224, 36, −94
• To calculate the Variance, take each difference, square it, and then average the result:
Variance = (206² + 76² + (−224)² + 36² + (−94)²) / 5 = 108520 / 5
So the Variance is 21704
• And the Standard Deviation is just the square root of the Variance, so:
σ = √21704
= 147.32...
= 147 (to the nearest mm)

So, using the Standard Deviation we have a "standard" way of knowing what is normal, and what is extra large or extra small:
394 ± 147 mm, that is, from 247 mm to 541 mm
600 > 541 mm
170 < 247 mm
But it shouldn't create a
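The whole dog-height calculation can be followed step by step in code. The sketch below divides by n, as the worked example does, and also shows the n − 1 (sample) version for comparison; the variable names are illustrative.

```python
import math

heights = [600, 470, 170, 430, 300]   # shoulder heights in mm

mean = sum(heights) / len(heights)
deviations = [h - mean for h in heights]      # each dog's difference from the mean
squared = [d ** 2 for d in deviations]        # squared differences

variance = sum(squared) / len(heights)              # divide by n, as in the worked example
sample_variance = sum(squared) / (len(heights) - 1) # divide by n - 1 (sample formula)
std_dev = math.sqrt(variance)

print("mean:", mean)
print("deviations:", deviations)
print("variance:", variance)
print("standard deviation:", round(std_dev, 2))
print("sample variance:", sample_variance)

# Heights outside mean +/- one standard deviation.
unusual = [h for h in heights if abs(h - mean) > std_dev]
print("outside one standard deviation:", unusual)
```

The 600 mm and 170 mm dogs fall outside one standard deviation of the mean, which matches the "extra large or extra small" reading of the slide.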

Variance for simple data
Weight of Members: 50, 55, 58, 60, 64, 70, 71, 76

First find the mean:
x̄ = (50 + 55 + 58 + 60 + 64 + 70 + 71 + 76) / 8 = 504 / 8
x̄ = 63

Weight   Frequency   (Xi − x̄)   (Xi − x̄)   (Xi − x̄)²
50       1           50 − 63     −13         169
55       1           55 − 63     −8          64
58       1           58 − 63     −5          25
60       1           60 − 63     −3          9
64       1           64 − 63     1           1
70       1           70 − 63     7           49
71       1           71 − 63     8           64
76       1           76 − 63     13          169
Total    504                                 550

Variance: SD² = 550 / 8 = 68.75
For Grouped Data
Example:
• Find the variance

Classes   fi
0-10      3
10-20     12
20-30     35
30-40     45
40-50     110
50-60     45
60-70     35
70-80     30
80-90     15
90-100    10
Total     340
Classes   Xi             fi     fi x Xi
0-10      (0+10)/2 = 5   3      15
10-20     15             12     180
20-30     25             35     875
30-40     35             45     1575
40-50     45             110    4950
50-60     55             45     2475
60-70     65             35     2275
70-80     75             30     2250
80-90     85             15     1275
90-100    95             10     950
Total                    340    16820

$\sum_{i=1}^{10} f_i = 340$

$\bar{X} = \frac{\sum f_i X_i}{\sum f_i} = \frac{16820}{340} = 49.47$


Classes (less than)   fi     Xi    (Xi − x̄)      (Xi − x̄)   fi(Xi − x̄)²
0-10                  3      5     5 − 49.47     −44.47      5932.74
10-20                 12     15    15 − 49.47    −34.47      14258.1708
20-30                 35     25    25 − 49.47    −24.47      20957.3315
30-40                 45     35    35 − 49.47    −14.47      9422.1405
40-50                 110    45    45 − 49.47    −4.47       2197.8990
50-60                 45     55    55 − 49.47    5.53        1376.1405
60-70                 35     65    65 − 49.47    15.53       8441.3315
70-80                 30     75    75 − 49.47    25.53       19553.4270
80-90                 15     85    85 − 49.47    35.53       18935.7135
90-100                10     95    95 − 49.47    45.53       20729.8090
Total                 340                                    121804.7060


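• $S^2 = \frac{\sum f_i (X_i - \bar{x})^2}{\sum f_i}$

• $S^2 = \frac{121804.706}{340}$

• $\approx 358.25$
• $S = \sqrt{358.25}$
• $\approx 18.93$

• $\bar{x} \pm S = 49.47 \pm 18.93$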
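The grouped variance and standard deviation can be recomputed from the class table with a sketch like the following. It divides by the total frequency, as the worked example does, and uses class midpoints for Xi; the function name `grouped_variance` is an assumption of this example.

```python
import math

# (lower boundary, upper boundary, frequency) for each class in the table.
classes = [
    (0, 10, 3), (10, 20, 12), (20, 30, 35), (30, 40, 45), (40, 50, 110),
    (50, 60, 45), (60, 70, 35), (70, 80, 30), (80, 90, 15), (90, 100, 10),
]

def grouped_variance(classes):
    """Return (mean, variance) of grouped data, dividing by total frequency."""
    n = sum(f for _, _, f in classes)
    mids = [((lower + upper) / 2, f) for lower, upper, f in classes]
    mean = sum(x * f for x, f in mids) / n
    variance = sum(f * (x - mean) ** 2 for x, f in mids) / n
    return mean, variance

mean, variance = grouped_variance(classes)
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```

Small differences in the last decimal can appear against the hand calculation, because the table rounds the mean to 49.47 before squaring.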
Coefficient of variation
Coefficient of variation is a measure of the relative amount of variation
as opposed to the absolute variation.

C.V. is independent of the units of measure. It can be useful for comparing different results from people investigating the same variable.
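The slide does not print a formula, so the sketch below uses the standard definition, standard deviation divided by the mean and expressed as a percentage, and applies it to the grouped example (mean 49.47, standard deviation 18.93). The function name is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Relative variation as a percentage: (standard deviation / mean) * 100."""
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    return std_dev / mean * 100

cv = coefficient_of_variation(18.93, 49.47)
print(round(cv, 2), "%")
```

Because the standard deviation and the mean carry the same unit, the ratio is unit-free, which is what makes it usable for comparing variables measured on different scales.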
| QUESTIONS AND SUGGESTIONS |
| RECOMMENDED WEEKLY STUDIES |

READY FOR YOUR QUIZ


| WHAT TO TAKE HOME? |

• define median, mean, harmonic mean, geometric mean, and mode.
• explain the differences between these measures of central tendency.
• calculate these measures from given datasets.
• compare the use of different means in symmetrical and skewed datasets.
• evaluate the appropriateness of different means in practical scenarios.
• create problems requiring the application of these measures.
| ABOUT THE NEXT WEEK |

STANDARD DEVIATION
VARIANCE
“Peace at Home, Peace in the World.”

Mustafa Kemal ATATÜRK
