Reviewer in IE-SAN1
Lesson 1: Basic Concepts of Probability and Statistics

Statistics - Statistical knowledge helps you use the proper methods to collect the data, employ the correct analyses, and effectively present the results. Statistics is a crucial process behind how we make discoveries in science, make decisions based on data, and make predictions.

Probability - plays a vital role in day-to-day life: in weather forecasts, sports and gaming strategies, buying or selling insurance, online shopping and online games, determining blood groups, and analyzing political strategies.

Definition of Statistics

Statistics is the science of dealing with numbers.
It is used for the Collection, Summarization, Presentation and Analysis of data.
Statistics provides a way of organizing data to get information on a wider and more formal (objective) basis than relying on personal experience (subjective).

Types of data

Any aspect of an individual that is measured is called a variable. Variables are either:

1. Quantitative - It is numerical data.
- Discrete data: are usually whole numbers, such as the number of cases of a certain disease or the number of hospital beds (no decimal fraction).
- Continuous data: imply measurement on a continuous scale, e.g. height, weight, age (a decimal fraction can be present).
2. Qualitative - It is non-numerical data and is subdivided into two types:
- Categorical data: are purely descriptive and imply no ordering of any kind, such as sex or area of residence.
- Ordinal data: are those which imply some kind of ordering, like level of education, socio-economic status, and degree of severity of disease.

Presentation of Data

The first step in statistical analysis is to present data in a way that is easy to understand. The two basic ways of presenting data are:
1. Tabular presentation
2. Graphical presentation

Tabulation

Some rules for constructing tables:
1. The table must be self-explanatory.
2. Title: written at the top of the table to define precisely the content, the place and the time.
3. Clear headings for the columns and rows, and units of measurement.
4. The size of the table depends on the number of classes, usually between 2 and 10 rows or classes. Its selection depends on the form of the data and the requirement of the distribution. A table that is too small may obscure some information, and one that is too long will not differ from the raw data.

Types of tables

o For Qualitative data, draw a simple table, e.g. a List Table: count the number of observations (frequencies) in each category.
o For Quantitative data, we have to form a frequency distribution table.

List Table:

A table consisting of two columns, the first giving an identification of the observational unit and the second giving the value of the variable for that unit.
Example: the number of patients in each hospital department:

Department    Number of patients
Medicine      100
Surgery       80
ENT           28

3. Enumerate
- the individuals in each blood group, i.e. individuals with blood group A are 6, those with blood group B are 6, AB are 5 and blood group O are 3.
- Make sure that the total number of individuals in all blood groups is 20 (the number of the studied group).
4. Calculate the relative frequency
- (%) of each blood group by dividing the frequency of that group by the total number of individuals and multiplying by 100.
- The percentage of group A = 6/20 x 100, and the same for group AB = 5/20 x 100 and group O = 3/20 x 100.
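The counting and relative-frequency steps above can be sketched in Python. This is only an illustrative sketch: the list `observations` is reconstructed from the counts given in the text (the individual records are not listed in this reviewer), and the function name `relative_frequencies` is made up for this example.

```python
from collections import Counter

# Assumed raw data: rebuilt from the counts in the text (A=6, B=6, AB=5, O=3).
observations = ["A"] * 6 + ["B"] * 6 + ["AB"] * 5 + ["O"] * 3


def relative_frequencies(values):
    """Return {category: (frequency, relative frequency in %)}."""
    counts = Counter(values)       # frequency of each category
    total = sum(counts.values())   # should equal the size of the studied group
    return {group: (count, count * 100 / total) for group, count in counts.items()}


table = relative_frequencies(observations)
for group, (count, percent) in table.items():
    print(f"{group:<4}{count:>4}{percent:>8.1f}%")
```

Checking that the frequencies add up to 20 before computing percentages mirrors the "make sure that the total" step in the text.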
$\bar{x} = \frac{\sum x}{n}$

- Population parameter μ

$\mu = \frac{\sum x}{N}$

- Where,
n = number of data values in the sample
N = number of data values in the population

Trimmed Mean

A measure of center that is more resistant than the mean but still sensitive to specific data values is the trimmed mean. A trimmed mean is the mean of the data values left after "trimming" a specified percentage of the smallest and largest data values from the data set. Usually, a 5% trimmed mean is used.

the set is from the mean and thus from every other number in the set.

Steps in Calculating the Variance
1. Find the mean of the data set. Add all data values and divide by the sample size n.
2. Find the squared difference from the mean for each data value. Subtract the mean from each data value and square the result.
3. Find the sum of all the squared differences. The sum of squares is all the squared differences added together.
4. Calculate the variance. Variance is the sum of squares divided by the number of data points (for a sample, divide by n − 1 instead).

- Population
Variance $= \sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
- Sample
Variance $= s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
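As a rough illustration of the trimmed mean and the four variance steps above, here is a minimal Python sketch. The function names and the sample data are made up for this example, and two details are assumptions rather than something this reviewer states: the percentage is trimmed from each end of the sorted data, and the number of values trimmed is rounded down.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)


def trimmed_mean(values, percent=5):
    """Mean after dropping `percent`% of the values from EACH end.

    Assumption: the number of values cut from each end is rounded down.
    """
    if not 0 <= percent < 50:
        raise ValueError("percent must be in [0, 50)")
    ordered = sorted(values)
    k = int(len(ordered) * percent // 100)  # values to drop per end
    return mean(ordered[k:len(ordered) - k])


def variance(values, sample=True):
    """Variance following the four steps listed in the text."""
    m = mean(values)                                # step 1: mean
    squared_diffs = [(x - m) ** 2 for x in values]  # step 2: squared differences
    sum_of_squares = sum(squared_diffs)             # step 3: sum of squares
    # step 4: divide by n - 1 for a sample, by N for a population
    divisor = len(values) - 1 if sample else len(values)
    return sum_of_squares / divisor


# Made-up data, for illustration only.
scores = list(range(1, 20)) + [100]  # 20 values, one extreme value
print(mean(scores))                  # 14.5
print(trimmed_mean(scores, 5))       # 10.5 (drops the 1 and the 100)

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data, sample=False))  # 4.0
print(variance(data, sample=True))   # 32/7, about 4.57
```

In the first data set the single extreme value pulls the ordinary mean up to 14.5, while the 5% trimmed mean stays at 10.5, which is the sense in which the trimmed mean is more resistant.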
2. The variance for each data point is calculated by subtracting the mean from the value of the data point. Each of those

$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$

Solving for Sample Variance:
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
s = 3.47
Example:
Big Blossom greenhouse was commissioned to
develop an extra-large rose for the Rose Bowl
Parade. A random sample of blossoms from
Hybrid A bushes yielded the following
diameters (in inches) for mature peak blooms.
2, 3, 3, 8, 10, 10
Solution:
$\bar{x} = \frac{\sum x}{n}$
$\bar{x} = \frac{36}{6}$
$\bar{x} = 6$ inches
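The arithmetic for this sample can be checked with a short Python sketch. It uses only the six diameters given in the example; the variable names are illustrative, and the divisor n − 1 is the sample form of the variance.

```python
# Hybrid A bloom diameters (inches), as given in the example.
diameters = [2, 3, 3, 8, 10, 10]

n = len(diameters)
mean = sum(diameters) / n                  # 36 / 6 = 6.0

deviations = [x - mean for x in diameters]  # x minus the mean
squared = [d ** 2 for d in deviations]      # squared deviations
sum_of_squares = sum(squared)               # 70.0
sample_variance = sum_of_squares / (n - 1)  # 70 / 5 = 14.0

# One row per data value: x, deviation, squared deviation.
for x, d, sq in zip(diameters, deviations, squared):
    print(f"{x:>4}{d:>8.1f}{sq:>8.1f}")

print("mean =", mean)
print("sum of squares =", sum_of_squares)
print("sample variance =", sample_variance)
```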
$x$    $x - \bar{x}$    $(x - \bar{x})^2$
2      $2 - 6 = -4$     $(-4)^2$
3      $3 - 6 = -3$     $(-3)^2$
3      $3 - 6 = -3$     $(-3)^2$