Biostatistics 1

PSM Biostatistics

Uploaded by

Ayaan Baig

Statistics

Statistics is the science of collecting, analyzing, interpreting, and presenting
data. It is a vast and complex field, with applications in many different
areas, including business, government, healthcare, and education.
Biostatistics
 Biostatistics is the use of statistics to collect, analyze, and interpret data
about living things.
 Biostatisticians use statistical methods to help scientists and doctors
understand how diseases work, develop new treatments, and improve
public health.

Uses of Biostatistics:
 To design clinical trials to test new drugs or treatments.
 To analyze data from surveys to track the spread of diseases.
 To develop statistical models to predict the risk of developing certain
diseases.
 To work with public health officials to develop programs to improve the
health of populations.

Data
 Data are measurements or observations that are collected as a source of
information.
There are two main types of Data:
 Qualitative Data and Quantitative Data
Qualitative or Categorical:
Qualitative data are generally described by words or letters. They are not as
widely used as quantitative data because many numerical techniques do not
apply to the qualitative data. For example, it does not make sense to find an
average hair color or blood type.
There are two subgroups of qualitative data: nominal and ordinal.

Nominal :-
 A nominal variable is a categorical variable whose two or more categories
have no natural order. Nominal variables have no numeric value; examples are
occupation and political party affiliation. Another way of thinking about
nominal variables is that they are named (nominal is from the Latin nominalis,
meaning "pertaining to names").
Ordinal :-
 The ordinal scale classifies according to rank, that is, the categories
follow a natural order. Examples include "first," "second," and "ninety-ninth."
Quantitative or Numerical
Quantitative data are always numbers and are the result of counting or
measuring attributes of a population.
Quantitative data can be separated into two subgroups:
 discrete, if it is the result of counting (the number of students of a given
ethnic group in a class, the number of books on a shelf, ...)
 continuous, if it is the result of measuring (distance traveled, weight of
luggage, ...)

Data type

Data
  Qualitative or Categorical
    Nominal
    Ordinal
  Quantitative or Numerical
    Discrete
    Continuous

Example:
Suppose you want to buy a shirt from Amazon. The listing shows:
Color: Red, blue, white
Pattern: plain, lines, checks
Size: M, L, XL
Rating: 5 star, 4 star
Price: 499, 999
Discount: 25%, 50%

Data
  Qualitative or Categorical
    Nominal: Color, Pattern
    Ordinal: Rating, Size
  Quantitative or Numerical
    Discrete: Quantity
    Continuous: Price, Discount
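The type of data decides which summaries make sense. The short Python sketch below illustrates this with the shirt attributes; the order lists are hypothetical values made up for illustration, and the size ranking M < L < XL is an assumption written into the code.

```python
from collections import Counter

# Hypothetical orders for the shirt example; values are illustrative only.
colors = ["Red", "Blue", "White", "Blue"]   # nominal: names only, no order
sizes = ["L", "M", "XL", "M"]               # ordinal: M < L < XL (assumed)
prices = [499, 999, 499, 999]               # quantitative: numbers

# Nominal data: counting categories is meaningful, averaging is not.
most_common_color = Counter(colors).most_common(1)[0][0]

# Ordinal data: categories can be ranked once the order is written down.
size_rank = {"M": 0, "L": 1, "XL": 2}
sizes_in_order = sorted(sizes, key=size_rank.get)

# Quantitative data: arithmetic such as the average is meaningful.
average_price = sum(prices) / len(prices)

print(most_common_color)   # Blue
print(sizes_in_order)      # ['M', 'M', 'L', 'XL']
print(average_price)       # 749.0
```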

Example:

Qualitative                                Quantitative
Profession (Doctor, Chemist)               Number of Patients
Hypertension Stage (I, II, III)            SpO2
Pain Intensity (Mild, moderate, severe)
Mathematical Statistics:
 Measure of central tendency /Measures of Center:
Measures of central tendency are statistical measures used to describe the center
or average value of a dataset.
A measure of center is a value at the center or middle of a data set.
There are several different ways to determine the center, so we have different
definitions of measures of center, including the mean, median, and mode. We
begin with the mean.

The arithmetic mean:

The arithmetic mean, or the mean, of a set of data is the measure of center found
by adding the data values and dividing the total by the number of data values.
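As a small illustration of this definition, the mean can be computed directly from the sum and the count. The function name and the data values below are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
from statistics import mean

def arithmetic_mean(values):
    """Add the data values and divide the total by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)

# Hypothetical data values, chosen only for illustration.
data = [2, 4, 6, 8, 10]

print(arithmetic_mean(data))          # 6.0
# The standard library gives the same value.
print(mean(data) == arithmetic_mean(data))   # True
```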

Median:
The median of a data set is the measure of center that is the middle value
when the original data values are arranged in order of increasing (or
decreasing) magnitude.
 To find the median, first sort the values (arrange them in order), then
follow one of these two procedures:
 1. If the number of data values is odd, the median is the number located in
the exact middle of the list.
 2. If the number of data values is even, the median is found by computing
the mean of the two middle numbers
Example 1:
Find the median for this sample of data values,
 25, 74, 36, 46, 17, 57, 62.
 First sort the data values, as shown below:
 17, 25, 36, 46, 57, 62, 74
 Because the number of data values is an odd number (7), the median is the
number located in the exact middle of the sorted list, that is 46
Example 2:
Find the median for this sample of data values,
 25, 74, 36, 46, 17, 57, 62, 32
 First sort the data values, as shown below:
 17, 25, 32, 36, 46, 57, 62, 74
 Because the number of data values is an even number (8),
 the median is found by computing the mean of the two middle numbers,
which are 36 and 46.
So the median is
\[ \text{Median} = \frac{36 + 46}{2} = 41 \]
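The two-case procedure above can be sketched in Python as follows. The function name find_median is hypothetical; the data are the samples from Example 1 and Example 2.

```python
from statistics import median

def find_median(values):
    """Sort the values, then take the middle one (odd count)
    or the mean of the two middle ones (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

example_1 = [25, 74, 36, 46, 17, 57, 62]
example_2 = [25, 74, 36, 46, 17, 57, 62, 32]

print(find_median(example_1))  # 46
print(find_median(example_2))  # 41.0
```

The standard library function statistics.median follows the same rule and can be used to check the results.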
Mode:
The mode of a data set is the value that occurs with the greatest
frequency.
 A data set can have one mode, more than one mode, or no mode.
 When two data values occur with the same greatest frequency, each one is
a mode and the data set is bimodal.
 When more than two data values occur with the same greatest frequency,
each is a mode and the data set is said to be multimodal.
 When no data value is repeated, we say that there is no mode.
Example:
Find the mode of the following data:
15, 24, 36, 41, 25, 24, 37, 36, 24, 68, 72
 The mode is 24, because it is the data value with the greatest frequency.

 Two modes: The values of 0, 0, 0, 1, 1, 2, 3, 5, 5, 5 have two modes: 0 and 5.

 No mode: The values of 0, 1, 2, 3, 5 have no mode because no value occurs
more than once.
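The three cases (one mode, several modes, no mode) can be checked with a frequency count. This is a minimal sketch; the function name find_modes is hypothetical, and returning an empty list for "no mode" is a choice made here to match the convention used in these notes.

```python
from collections import Counter

def find_modes(values):
    """Return every value that occurs with the greatest frequency.
    An empty list means no value is repeated, i.e. there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)

print(find_modes([15, 24, 36, 41, 25, 24, 37, 36, 24, 68, 72]))  # [24]
print(find_modes([0, 0, 0, 1, 1, 2, 3, 5, 5, 5]))                # [0, 5]
print(find_modes([0, 1, 2, 3, 5]))                               # []
```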
Frequency Distribution:
 A frequency distribution lists data values (either individually or by groups of
intervals), along with their corresponding frequencies (or counts).
 Example: Frequency Distribution of Marks
 Here, the frequency for a particular class is the number of original values
that fall into that class.

Marks     Frequency
1-10          5
11-20         8
21-30        13
31-40        23
41-50        11

That is, 8 students scored between 11 and 20 marks.

 Lower class limits are the smallest numbers that can belong to the different
classes (the example has lower class limits of 1, 11, 21, 31, 41).
 Upper class limits are the largest numbers that can belong to the different
classes (the example has upper class limits of 10, 20, 30, 40, 50).
 Class boundaries are the numbers used to separate classes, but without
the gaps created by class limits. They are obtained as follows: Find the size
of the gap between the upper class limit of one class and the lower class
limit of the next class. Add half of that amount to each upper class limit to
find the upper class boundaries; subtract half of that amount from each
lower class limit to find the lower class boundaries.
 Class midpoints are the midpoints of the classes.
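The definitions above can be tried on the marks table. The sketch below is illustrative only: the function names are hypothetical, and it assumes at least two classes with the same gap between every pair of neighbouring classes.

```python
def class_boundaries(class_limits):
    """Close the gap between neighbouring classes: subtract half the gap
    from each lower limit and add half the gap to each upper limit."""
    # Gap between the upper limit of the first class and the lower limit
    # of the second class (assumed to be the same for all classes).
    gap = class_limits[1][0] - class_limits[0][1]
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in class_limits]

def class_midpoints(class_limits):
    """Midpoint of each class: the average of its lower and upper limit."""
    return [(lower + upper) / 2 for lower, upper in class_limits]

# Class limits from the marks example.
marks_classes = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50)]

print(class_boundaries(marks_classes))
# [(0.5, 10.5), (10.5, 20.5), (20.5, 30.5), (30.5, 40.5), (40.5, 50.5)]
print(class_midpoints(marks_classes))
# [5.5, 15.5, 25.5, 35.5, 45.5]
```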
Mean from Grouped data/Frequency Distribution:

 Example: The following table shows the number of students and the time
they utilized daily for their studies. Find the mean time spent by students
for their studies.
Time (in hrs.)   0-2   2-4   4-6   6-8   8-10
Students           7    20    12     8      3

Answer:

Time     Frequency f   Midpoint x   f*x
0-2           7            1          7
2-4          20            3         60
4-6          12            5         60
6-8           8            7         56
8-10          3            9         27
Total        50                     210

\[ \text{Mean} = \frac{\sum f x}{\sum f} = \frac{210}{50} = 4.2 \]

Thus the mean time spent by students is 4.2 hours.
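The same calculation can be sketched in Python. The function name grouped_mean is hypothetical; each class is represented by its midpoint, exactly as in the table.

```python
def grouped_mean(classes, frequencies):
    """Mean of grouped data: sum of f*x divided by sum of f,
    where x is the midpoint of each class."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    total_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    return total_fx / sum(frequencies)

# Study-time data from the example (hours, number of students).
time_classes = [(0, 2), (2, 4), (4, 6), (6, 8), (8, 10)]
students = [7, 20, 12, 8, 3]

print(grouped_mean(time_classes, students))  # 4.2
```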
Median for Grouped data:

 In previous example:
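N = 50 and N/2 = 25, hence the 25th observation is taken as the approximate median.
First we write the cumulative frequency (cf):

Time     Frequency f   cf
0-2           7         7
2-4          20        27
4-6          12        39
6-8           8        47
8-10          3        50

 Here, the median class is 2-4 because the 25th observation (N/2 = 25) falls in
that class: its cumulative frequency, 27, is the first to reach 25.
L = 2 (lower boundary of the median class), N = 50, h = 2 (class length),
f = 20 (frequency of the median class), cf = 7 (cumulative frequency of the
class preceding the median class).

\[ \text{Median} = L + \left[ \frac{\frac{N}{2} - cf}{f} \right] \times h = 2 + \left[ \frac{25 - 7}{20} \right] \times 2 \]

\[ \text{Median} = 2 + 1.8 = 3.8 \text{ hours} \]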
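The steps above (accumulate frequencies, locate the median class, then interpolate) can be sketched as follows. The function name grouped_median is hypothetical, and the sketch assumes the classes are listed in increasing order.

```python
def grouped_median(classes, frequencies):
    """Median = L + ((N/2 - cf) / f) * h, where the median class is the
    first class whose cumulative frequency reaches N/2."""
    n = sum(frequencies)
    half = n / 2
    cf = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(classes, frequencies):
        if f > 0 and cf + f >= half:
            h = upper - lower  # class length
            return lower + (half - cf) / f * h
        cf += f
    raise ValueError("the frequency distribution is empty")

# Study-time data from the example.
time_classes = [(0, 2), (2, 4), (4, 6), (6, 8), (8, 10)]
students = [7, 20, 12, 8, 3]

print(round(grouped_median(time_classes, students), 4))  # 3.8
```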

Mode:
 We know that the value repeated the maximum number of times in a data set is
called the mode of the data. For grouped data, the mode is estimated from the
class with the highest frequency.
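Example: Calculate the mean, median, and mode from the following grouped data.

Class    Frequency
2-4          3
4-6          4
6-8          2
8-10         1

Solution:
The cumulative frequencies are 3, 7, 9, 10. Since N/2 = 10/2 = 5, the median
class is 4-6. Now,
L = lower boundary point of median class = 4
N = total frequency = 10
cf = cumulative frequency of the class preceding the median class = 3
f = frequency of the median class = 4
h = class length of median class = 2

\[ \text{Median} = L + \left[ \frac{\frac{N}{2} - cf}{f} \right] \times h = 4 + \left[ \frac{5 - 3}{4} \right] \times 2 \]

\[ \text{Median} = 4 + \frac{2}{4} \times 2 = 4 + 1 = 5 \]

Thus the median is 5.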
Now, to find the modal class:
Here, the maximum frequency is 4, so the mode class is 4-6.

L = lower boundary point of mode class = 4
f1 = frequency of the mode class = 4
f0 = frequency of the preceding class = 3
f2 = frequency of the succeeding class = 2
h = class length of mode class = 2

\[ \text{Mode} = L + \left( \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \right) \times h \]

\[ \text{Mode} = 4 + \left( \frac{4 - 3}{2 \times 4 - 3 - 2} \right) \times 2 = 4 + \left( \frac{1}{3} \right) \times 2 \approx 4.6667 \]
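A minimal Python sketch of the mode formula, applied to the same grouped data. The function names are hypothetical. Two choices below are assumptions of this sketch, not part of the notes: a missing neighbouring class (when the mode class is first or last) is treated as having frequency 0, and ties for the highest frequency are resolved by taking the first such class. Since the example also asks for the mean, the sketch computes it by the midpoint method shown earlier for grouped data.

```python
def grouped_mode(classes, frequencies):
    """Mode = L + ((f1 - f0) / (2*f1 - f0 - f2)) * h for the mode class."""
    i = frequencies.index(max(frequencies))  # first class with highest frequency
    lower, upper = classes[i]
    f1 = frequencies[i]
    # Assumption: a neighbour that does not exist counts as frequency 0.
    f0 = frequencies[i - 1] if i > 0 else 0
    f2 = frequencies[i + 1] if i + 1 < len(frequencies) else 0
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 - f0 - f2 is zero")
    return lower + (f1 - f0) / denominator * (upper - lower)

def grouped_mean(classes, frequencies):
    """Mean of grouped data using class midpoints."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    return sum(f * x for f, x in zip(frequencies, midpoints)) / sum(frequencies)

classes = [(2, 4), (4, 6), (6, 8), (8, 10)]
frequencies = [3, 4, 2, 1]

print(round(grouped_mode(classes, frequencies), 4))  # 4.6667
print(grouped_mean(classes, frequencies))            # 5.2
```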
Measure of Dispersion:
In the earlier classes we have learnt about the measures of central tendency
namely mean, median and mode. Such an average tells us only about the central
part of the data. But it does not give any information about the spread of the
data.
 Dispersion:
The degree to which numerical data tend to spread about an average value is
called the variation or dispersion of the data.
 Measures of Dispersion :
The following measures of dispersion are commonly used:
• Range
• Variance
• Standard deviation

Range:
 Range is the simplest measure of dispersion. It is defined as the difference
between the largest value and the smallest value in the data.
Thus,
Range = Largest Value – Smallest Value
Range = L – S
Where, L = Largest Value and S = Smallest Value.
 Example: Following data gives weights of 10 students (in kgs) in a certain
school. Find the range of the data.
70, 62, 38, 55, 43, 73, 36, 58, 65, 47
Solution: Smallest Value = S = 36, Largest Value = L = 73
Range = L – S = 73 – 36 = 37
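The same calculation as a short Python sketch (the function name data_range is hypothetical):

```python
def data_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)

# Weights of 10 students (in kg) from the example.
weights = [70, 62, 38, 55, 43, 73, 36, 58, 65, 47]

print(max(weights), min(weights))  # 73 36
print(data_range(weights))         # 37
```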
Variance:
 The variance of a variable X is defined as the arithmetic mean of the
squares of all deviations of X taken from its arithmetic mean.
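In other words, the variance is a measurement of how far each number in a data
set is from the mean, and thus from every other number in the set.
It is denoted by Var(X) or σ².

\[ \operatorname{Var}(X) = \sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2 \]

And for grouped data, where N = \sum f_i is the total frequency,

\[ \operatorname{Var}(X) = \sigma^2 = \frac{1}{N} \sum_{i=1}^{n} f_i (X_i - \bar{X})^2 = \frac{1}{N} \sum_{i=1}^{n} f_i X_i^2 - \bar{X}^2 \]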

Standard deviation:
 Standard Deviation is defined as the positive square root of the variance.
It is denoted by σ (sigma).
Example: Compute variance and standard deviation of the following data.
9, 12, 15, 18, 21, 24, 27.
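One way to work this example is sketched below in Python, using the definition of variance given above (division by n, often called the population variance). The function name population_variance is hypothetical.

```python
import math

def population_variance(values):
    """Arithmetic mean of the squared deviations from the mean (divides by n)."""
    n = len(values)
    if n == 0:
        raise ValueError("the variance of an empty data set is undefined")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

data = [9, 12, 15, 18, 21, 24, 27]

variance = population_variance(data)
std_dev = math.sqrt(variance)  # positive square root of the variance

print(sum(data) / len(data))  # 18.0  (mean)
print(variance)               # 36.0
print(std_dev)                # 6.0
```

In Python's statistics module, pvariance and pstdev divide by n as in this definition, whereas variance and stdev divide by n - 1 and would give different values.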
 Example: A die is rolled 30 times and the following distribution is obtained.
Find the variance and S.D.

Score        1   2   3   4   5    6
Frequency    2   6   2   5   10   5

Solution:
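One way to work this out is with the grouped-data formula Var(X) = (1/N) Σ f·X² − X̄². The Python sketch below follows that formula step by step; the function name grouped_variance is hypothetical.

```python
import math

def grouped_variance(values, frequencies):
    """Var(X) = (1/N) * sum(f * x^2) - mean^2, with N = sum(f)."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("the frequency distribution is empty")
    mean = sum(f * x for x, f in zip(values, frequencies)) / n
    mean_of_squares = sum(f * x * x for x, f in zip(values, frequencies)) / n
    return mean_of_squares - mean ** 2

# Die-rolling distribution from the example.
scores = [1, 2, 3, 4, 5, 6]
frequency = [2, 6, 2, 5, 10, 5]

variance = grouped_variance(scores, frequency)
std_dev = math.sqrt(variance)

print(sum(frequency))       # 30   (N)
print(round(variance, 4))   # 2.4667
print(round(std_dev, 4))    # 1.5706
```

Here Σ f·X = 120, so the mean is 4, and Σ f·X² = 554, so the variance is 554/30 − 16 ≈ 2.4667 and the standard deviation is about 1.57.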
