
Reviewer Part 1

Uploaded by

Lindsay Castro
Copyright
© All Rights Reserved
REVIEWER: MATM111

I. STATISTICS, ORIGIN AND TERMS


• The word “statistics” comes from the Latin word “status,” which means state.
• Two Areas of Statistics
1. Descriptive Statistics – deals with the collection and presentation of data and the
computation of summary values that describe the characteristics of a group.
Ex: Measures of central tendency and variation.
2. Inferential Statistics – deals with predictions and inferences based on the analysis
and interpretation of the information gathered by the statistician.
Ex: t-test, z-test, analysis of variance, chi-square test, and Pearson r.
• Variable – a characteristic or attribute associated with the population being studied.
(Age, height, economic status)
Classification of variables:
1. Categorical or qualitative – yields a categorical or qualitative response (gender,
religion, 1-strongly disagree, 2-disagree, 3-agree, 4-strongly agree)
2. Numerical or quantitative – yields a numerical response representing an amount or
quantity (age, height, temperature)
Types of Quantitative Variables
1. Discrete variables – values obtained by counting; they take a finite or countable
number of values.
2. Continuous variables – values obtained by measuring; they cannot all be put into a
list because they can take any value in some interval of real numbers.
• Levels/Scales of Measurement
Scales of measurement are subdivided into four categories. The scale on which a variable
is measured determines which statistics and inferences are appropriate, so it must be
identified carefully before drawing inferences from a random sample.
1. Nominal – the lowest level of measurement. It classifies elements into two or more
categories or classes; any numbers used only indicate that the elements are different,
not their order or magnitude. Ex. Blood type
2. Ordinal – a scale that ranks individuals in terms of the degree to which they possess a
characteristic of interest.
3. Interval – in addition to ordering scores from high to low, it establishes a uniform
unit in the scale so that any equal distance between two scores is of equal magnitude.
There is no absolute zero in this scale.
4. Ratio – the highest level of measurement. In addition to being an interval scale, it
has an absolute zero.

• Population – the group of people, animals, places, things, or ideas to which any
conclusions based on the characteristics of a sample will be applied.
(Factors Contributing to Sleep Disturbance Among Patients Admitted in Care Unit of
the Philippines)
• Sample – a subgroup of the population.
(Factors Contributing to Sleep Disturbance Among Patients Admitted in Selected Care
Unit of the Philippines)
◦ Parameter – a numerical measure that describes a characteristic of the population. Ex:
average height of all the students
◦ Statistic – a numerical measure that describes a characteristic of a sample. Ex:
average height of 10 of the students

II. MEASURES OF CENTRAL TENDENCY


• Measures of central tendency are used to describe a whole set of data with a single
value.
• Three Measures of Central Tendency
1. Mean – the sum of all the values in a data set divided by the total number of
observations. It is influenced by outliers (values that are much higher or much lower
than the rest of the data set). It requires numerical data, so it cannot be used for
nominal data.

2. Median – the middle point of a distribution whose values are arranged in ascending or
descending order. The median is usually the preferred measure of central tendency when
the distribution is not symmetrical because it is less affected by outliers and skewed
data than the mean.

3. Mode – the most commonly occurring value in a distribution. A distribution may have no
mode, or it may be unimodal, bimodal, or multimodal. If the distribution has no mode at
all, it may be better to use the median or mean, or to group the data into appropriate
intervals and find the modal class. The mode is not affected by outliers.

III. MEASURES OF VARIATION


• Measures of variation or dispersion are used to describe how the data are distributed.
Are the data clustered in one area, or are they spread out?
• Different measures of variation:
1. Range – the simplest measure of variation to find. It is the highest value minus the
lowest value. The range is not a resistant measure: it is affected by outliers, and it
does not consider all the values in the data set. Thus it is not a very useful measure
of variability.
2. Mean Absolute Deviation – the average of the absolute differences between the data
values and the mean. A small M.A.D. value indicates clustered data values, while a
large M.A.D. value indicates spread-out data values.
3. Variance – the average of the squared deviations of the data values from their mean.
For a sample, the sum of the squared deviations is divided by n − 1.
4. Standard Deviation (s) – the square root of the variance. It quantifies the amount of
variation or dispersion of a set of data. A low SD means the data are more clustered;
a high SD means the data are spread out.
5. Coefficient of Variation – the ratio of the standard deviation to the mean, expressed
as a percentage. In finance, it is used to analyze the risk per unit of return of an
investment.
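
For reference, the measures above can be written as formulas. This is a summary sketch in the sample form (dividing by n − 1 for the variance), which is the form used in the worked problems of this reviewer. The symbols \bar{x} (mean) and n (number of observations) follow the usual conventions.

```latex
\[
\text{Range} = x_{\max} - x_{\min}, \qquad
\text{MAD} = \frac{\sum |x - \bar{x}|}{n}, \qquad
s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}, \qquad
s = \sqrt{s^2}, \qquad
CV = \frac{s}{\bar{x}} \times 100\%
\]
```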

IV. MEASURES OF POSITIONS


• A measure of position determines the position of a single value in relation to the other
values in a sample or population data set. These measures of position are commonly
referred to as quantiles or fractiles.
• Quantiles – values that divide an ordered score distribution into equal parts.
• There are three kinds of quantiles:
1. Quartile – a measure of position that divides the ordered observations or score
distribution into 4 equal parts.
2. Decile – a measure of position that divides the ordered observations or score
distribution into 10 equal parts.
3. Percentile – a measure of position that divides the ordered observations or score
distribution into 100 equal parts.

• Quartile Deviation – a way of determining the spread of a distribution in terms of
QUARTILES
• Inter Quartile Range (IQR) = Q3 − Q1
• Decile Deviation – a way of determining the spread of a distribution in terms of
DECILES
• Inter Decile Range (IDR) = D9 − D1
• Percentile Deviation – a way of determining the spread of a distribution in terms of
PERCENTILES
• Inter Percentile Range (IPR) = P99 − P1
• Median = Q2 = D5 = P50
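
The quantiles and ranges above can be sketched in Python with the standard `statistics` module. This is only an illustration: `statistics.quantiles` uses one particular interpolation rule (its default "exclusive" method), so values other than the median may differ slightly from hand computations that use another rule. With only 15 scores, the extreme percentiles are extrapolated.

```python
import statistics

# Scores from the first worked example in this reviewer.
scores = [76, 85, 92, 75, 88, 85, 91, 76, 84, 90, 87, 82, 79, 95, 84]

# quantiles(data, n=k) returns the k - 1 cut points that split the data into k parts.
quartiles = statistics.quantiles(scores, n=4)      # Q1, Q2, Q3
deciles = statistics.quantiles(scores, n=10)       # D1 ... D9
percentiles = statistics.quantiles(scores, n=100)  # P1 ... P99

q1, q2, q3 = quartiles
iqr = q3 - q1                            # Inter Quartile Range, Q3 - Q1
idr = deciles[8] - deciles[0]            # Inter Decile Range, D9 - D1
ipr = percentiles[98] - percentiles[0]   # Inter Percentile Range, P99 - P1

print("Q1, Q2, Q3:", q1, q2, q3)
print("IQR:", iqr)
# Median = Q2 = D5 = P50
print(statistics.median(scores), q2, deciles[4], percentiles[49])
```

The last line prints the same value four times, which is the identity Median = Q2 = D5 = P50.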

V. NORMAL DISTRIBUTION
• Normal Distribution – represents a hypothetical frequency distribution in which the
frequency of scores is greatest near the mean and progressively decreases toward the
extremes. It is the most important distribution for a continuous random variable and is
sometimes called the Gaussian distribution in honor of Carl Friedrich Gauss. The normal
distribution is a theoretical ideal: real-life empirical distributions never match this
model perfectly. However, many things in life do approximate the normal distribution and
are said to be “normally distributed.”
• Negatively Skewed – a distribution in which the tail on the left side of the histogram is
longer than the tail on the right side.
• Positively Skewed – a distribution in which most values are clustered toward the left
side while the right tail is longer.
• Properties of the Normal Distribution
1. The distribution curve is bell-shaped.
2. The curve is symmetrical about its mean.
3. The mean, median, and mode coincide at the center.
4. The curve is asymptotic to the base line.
5. It is characterized by 2 parameters: the mean (μ) and the standard deviation (σ). These
represent location and spread.
6. The area under the curve is 1. Thus, it represents the probability associated with
specific sets of measurement values.
7. The width of the curve is determined by the standard deviation of the distribution.
8. Along the horizontal line, the distance from one integral standard score to the next
integral standard score is measured by the standard deviation.
• Areas under the Normal Curve
In general, we can determine the area in any specified region under the normal curve and
associate it with a probability, proportion, or percentage.
The area within one standard deviation of the mean is about 68%; within two standard
deviations, about 95%; and within three standard deviations, about 99.7%.

• The Standard Normal Curve – a normal probability distribution that has a mean of zero
(μ = 0) and a standard deviation of one (σ = 1).
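
The 68%, 95%, and 99.7% areas can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module to model the standard normal curve. The helper name `area_within` is made up for this illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {area_within(k):.4f}")
# within 1 SD of the mean: 0.6827
# within 2 SD of the mean: 0.9545
# within 3 SD of the mean: 0.9973
```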

VI. CORRELATION AND REGRESSION


• Correlation is a statistical technique used to determine the degree to which two
quantitative variables (x and y) are related. It describes the relationship between the
variables but does not by itself establish a causal relationship.
• Correlation analysis is a statistical method used to determine whether a relationship
between two variables (bivariate data) exists. Its goal is to see the strength and the
direction of the relationship. Bivariate data involve two variables that are taken from
a sample or population.
• The pattern of the data indicates the type of relationship between the two variables:
positive relationship, negative relationship, or no relationship.
• Two variables are positively correlated if the values of one variable increase as the
values of the other increase (r is positive). Ex. The number of hours studied and test
scores
• Two variables are negatively correlated if the values of one variable increase while the
values of the other decrease (r is negative). Ex. Speed and travel time over a fixed
distance
• Two variables are not correlated, or have zero correlation, if one variable shows no
tendency to increase or decrease as the other increases (r is zero).
• The degree of correlation is given by the correlation coefficient r. Its value is
interpreted as shown in the table below.
r                  Verbal Interpretation
0.00               No Correlation
±0.01 to ±0.20     Slight Correlation
±0.21 to ±0.40     Low Correlation
±0.41 to ±0.70     Moderate Correlation
±0.71 to ±0.80     High Correlation
±0.81 to ±0.99     Very High Correlation
±1.0               Perfect Correlation
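
As a rough sketch, the Pearson correlation coefficient and the verbal interpretation table can be put into Python as follows. The function names `pearson_r` and `interpret` and the hours/scores data are hypothetical and serve only as an illustration. Rounding r to two decimals before looking it up is an assumption made here so that every value falls into one of the table's bands.

```python
from math import isclose, sqrt

def pearson_r(xs, ys):
    """Pearson r = sum((x - mean_x)(y - mean_y)) / sqrt(sum((x - mean_x)^2) * sum((y - mean_y)^2))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def interpret(r):
    """Verbal interpretation of r, following the table above (sign is ignored)."""
    size = abs(r)
    if size == 0:
        return "No Correlation"
    if isclose(size, 1.0):
        return "Perfect Correlation"
    size = round(size, 2)  # assumption: look up the two-decimal value
    if size <= 0.20:
        return "Slight Correlation"
    if size <= 0.40:
        return "Low Correlation"
    if size <= 0.70:
        return "Moderate Correlation"
    if size <= 0.80:
        return "High Correlation"
    return "Very High Correlation"

# Hypothetical data: hours studied and test scores for six students.
hours = [1, 2, 3, 4, 5, 6]
scores = [62, 70, 71, 80, 84, 91]

r = pearson_r(hours, scores)
print(round(r, 3), interpret(r))
```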

• Regression analysis is a very powerful tool in statistical analysis, especially for
predicting the value of one variable from a given value of another variable when the two
variables are related to each other.
• Equation of the straight line

y = a + bx

where a is the y-intercept and b is the slope of the line.
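
The reviewer does not state how a and b are obtained. A common choice, assumed here, is the least-squares line, where b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and a = ȳ − b·x̄. The sketch below applies it to hypothetical hours-studied and test-score data. The function name `fit_line` is made up for this example.

```python
def fit_line(xs, ys):
    """Return (a, b) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data: hours studied (x) and test scores (y).
hours = [1, 2, 3, 4, 5, 6]
scores = [62, 70, 71, 80, 84, 91]

a, b = fit_line(hours, scores)
predicted = a + b * 7  # predicted score for 7 hours of study

print(f"y = {a:.2f} + {b:.2f}x")
print(f"predicted score at x = 7: {predicted:.2f}")
```

A prediction like this is most trustworthy for x values near the observed data. Here x = 7 lies just outside the observed range of 1 to 6, so it is an extrapolation.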

VII. PROBLEM SOLVING

A. MEASURES OF CENTRAL TENDENCY (UNGROUPED DATA)


Example:
Given the following Scores: 76, 85, 92, 75, 88, 85, 91, 76, 84, 90, 87, 82, 79, 95, 84.

1. Find the Mean/ Average

\[
\bar{x} = \frac{\sum x}{n}
= \frac{76+85+92+75+88+85+91+76+84+90+87+82+79+95+84}{15}
= \frac{1269}{15} = 84.6
\]
2. Find the Median (Middle Value)
Arranged data: 75, 76, 76, 79, 82, 84, 84, 85, 85, 87, 88, 90, 91, 92, 95
Median= 85

3. Mode (most frequent value)
Mode = 76, 84, 85 (each of these values occurs twice)

4. Range of the data set


Range =highest value-lowest value
Range= 95-75=20
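
These four results can be checked with Python's standard `statistics` module. This is a minimal sketch: `statistics.multimode` reports every value tied for the highest frequency, which matters here because more than one score repeats.

```python
import statistics

scores = [76, 85, 92, 75, 88, 85, 91, 76, 84, 90, 87, 82, 79, 95, 84]

mean = statistics.mean(scores)                 # sum of the scores divided by n
median = statistics.median(scores)             # middle value of the sorted scores
modes = sorted(statistics.multimode(scores))   # all most frequent values
value_range = max(scores) - min(scores)        # highest value minus lowest value

print("Mean:", mean)           # 84.6
print("Median:", median)       # 85
print("Mode(s):", modes)       # [76, 84, 85]
print("Range:", value_range)   # 20
```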

Example:
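Given the following scores: 82, 76, 90, 88, 75, 94, 85, 78

\[
\bar{x} = \frac{\sum x}{n} = \frac{82+76+90+88+75+94+85+78}{8} = \frac{668}{8} = 83.5
\]

What should be the score of the 9th student to achieve an overall average of 85?

The nine scores must total 9 × 85 = 765, so the 9th score must be 765 − 668 = 97. Check:

\[
\bar{x} = \frac{\sum x}{n} = \frac{82+76+90+88+75+94+85+78+97}{9} = \frac{668+97}{9}
= \frac{765}{9} = 85
\]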
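
The same reasoning (required total minus current total) can be written as a short Python check. The variable names are chosen only for this illustration.

```python
scores = [82, 76, 90, 88, 75, 94, 85, 78]
target_average = 85

# With one more student there will be len(scores) + 1 scores in total.
required_total = target_average * (len(scores) + 1)
needed_score = required_total - sum(scores)

print("Required total:", required_total)   # 765
print("Score needed:", needed_score)       # 97
```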

B. MEASURES OF VARIATION (Ungrouped and Grouped Data)

Example 1:
The following are the sample ages of football players in Pampanga: 20, 21, 22, 23, 24, 24, 25,
26, 27, 28

Find the Variance and standard deviation


Player   x     x − x̄            (x − x̄)²
1        20    20 − 24 = −4     (−4)² = 16
2        21    21 − 24 = −3     (−3)² = 9
3        22    22 − 24 = −2     (−2)² = 4
4        23    23 − 24 = −1     (−1)² = 1
5        24    24 − 24 = 0      0
6        24    24 − 24 = 0      0
7        25    25 − 24 = 1      (1)² = 1
8        26    26 − 24 = 2      (2)² = 4
9        27    27 − 24 = 3      (3)² = 9
10       28    28 − 24 = 4      (4)² = 16
n = 10                          ∑(x − x̄)² = 60
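Mean:

\[
\bar{x} = \frac{\sum x}{n} = \frac{20+21+22+23+24+24+25+26+27+28}{10} = \frac{240}{10} = 24
\]

Variance:

\[
s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{60}{10 - 1} \approx 6.667
\]

Standard deviation:

\[
s = \sqrt{s^2} = \sqrt{6.667} \approx 2.58
\]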
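
As a quick check of this example, Python's standard `statistics` module computes the sample variance and sample standard deviation using the same n − 1 divisor.

```python
import statistics

ages = [20, 21, 22, 23, 24, 24, 25, 26, 27, 28]

mean_age = statistics.mean(ages)             # 24
sample_variance = statistics.variance(ages)  # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(ages)           # square root of the sample variance

print("Mean:", mean_age)
print("Variance:", round(sample_variance, 3))       # 6.667
print("Standard deviation:", round(sample_sd, 2))   # 2.58
```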

Example 2:
The data are grouped into intervals as follows (x is the class midpoint):

Length     No. of          x     fx     |x − x̄|          f|x − x̄|        (x − x̄)²    f(x − x̄)²
of Stay    Patients (f)
1-3        5               2     10     |2 − 7| = 5      5 × 5 = 25      25          125
4-6        10              5     50     |5 − 7| = 2      10 × 2 = 20     4           40
7-9        8               8     64     |8 − 7| = 1      8 × 1 = 8       1           8
10-12      4               11    44     |11 − 7| = 4     4 × 4 = 16      16          64
13-15      3               14    42     |14 − 7| = 7     3 × 7 = 21      49          147
N = 30     ∑fx = 210       ∑f|x − x̄| = 90       ∑f(x − x̄)² = 384
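1. Solve for the MAD

\[
\bar{x} = \frac{\sum fx}{N} = \frac{210}{30} = 7
\]

\[
\text{MAD} = \frac{\sum f\,|x - \bar{x}|}{N} = \frac{90}{30} = 3
\]

2. Find the variance

\[
s^2 = \frac{\sum f (x - \bar{x})^2}{N - 1} = \frac{384}{30 - 1} = \frac{384}{29} \approx 13.24
\]

3. Standard deviation

\[
s = \sqrt{s^2} = \sqrt{13.24} \approx 3.64
\]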
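
The grouped-data steps (midpoints, weighted mean, MAD, variance, standard deviation) can be sketched in Python as follows. The layout of the `classes` list is an assumption made for this example. The formulas are the ones used in the computation above.

```python
from math import sqrt

# (lower limit, upper limit, frequency) for each length-of-stay class.
classes = [(1, 3, 5), (4, 6, 10), (7, 9, 8), (10, 12, 4), (13, 15, 3)]

midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]  # x
freqs = [f for _, _, f in classes]                                # f
n = sum(freqs)                                                    # N = 30

mean = sum(f * x for f, x in zip(freqs, midpoints)) / n           # sum(fx) / N
mad = sum(f * abs(x - mean) for f, x in zip(freqs, midpoints)) / n
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / (n - 1)
sd = sqrt(variance)

print("Mean:", mean)                          # 7.0
print("MAD:", mad)                            # 3.0
print("Variance:", round(variance, 2))        # 13.24
print("Standard deviation:", round(sd, 2))    # 3.64
```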

C. MEASURES OF POSITIONS (Grouped Data)

Example 1.
Scores f <cf
1-20 3 3
21-40 7 10
41-60 12 22
61-80 10 32
81-100 8 40
N=40

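1. Solve Q1

\[
\frac{n}{4} = \frac{40}{4} = 10
\]

The Q1 class is 21-40, the first class whose <cf reaches 10.

LQ1 = 21 − 0.5 = 20.5 (lower boundary of the Q1 class)
<cf = 3 (cumulative frequency before the Q1 class)
f = 7
i = 20

\[
Q_1 = L_{Q_1} + \left( \frac{\frac{n}{4} - {<}cf}{f} \right) i
= 20.5 + \left( \frac{10 - 3}{7} \right) 20 = 20.5 + 20 = 40.5
\]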
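
The same interpolation works for any quartile, decile, or percentile of grouped data by replacing n/4 with k·n/parts. The sketch below is a hypothetical helper (`grouped_quantile` is not a library function) and assumes equal-width classes with integer class limits, so the lower boundary is the lower limit minus 0.5.

```python
def grouped_quantile(classes, k, parts):
    """k-th quantile when the data are split into `parts` equal parts.

    classes: list of (lower limit, upper limit, frequency) in ascending order.
    """
    n = sum(f for _, _, f in classes)
    target = k * n / parts                       # e.g. n/4 for Q1
    width = classes[0][1] - classes[0][0] + 1    # class size i
    cum_before = 0                               # <cf before the current class
    for lower, upper, f in classes:
        if cum_before + f >= target:             # quantile class found
            boundary = lower - 0.5               # lower class boundary
            return boundary + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("k must not exceed parts")

scores = [(1, 20, 3), (21, 40, 7), (41, 60, 12), (61, 80, 10), (81, 100, 8)]

q1 = grouped_quantile(scores, 1, 4)
print("Q1:", q1)   # 40.5
```

Calling `grouped_quantile(scores, 5, 10)` or `grouped_quantile(scores, 50, 100)` gives D5 and P50 in the same way, and both equal the median.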
