Review of Basic Stat
BASIC
STATISTICS
MEASUREMENT SCALES/LEVELS
CLASSIFICATION OF DATA ANALYTIC METHODS
Dependence methods
Dependence methods test for the presence or absence of a
relationship between two sets of variables: the dependent and the
independent variables. Common dependence methods are the t-test,
ANOVA, ANCOVA, regression analysis, the chi-square test, MANOVA,
discriminant analysis, and logistic regression.
Interdependence methods
For some data sets it is impossible to conceptually designate one set
of variables as dependent and another set as independent. For these
types of data sets the objective is to identify how and why the
variables are related among themselves. Common interdependence
methods are correlation analysis, principal component analysis, and
factor analysis.
RELATIONSHIPS OF VARIABLES
Dependency
Independent variables:
•Age
•Lifestyle
•BMI
•Family History
Dependent variable:
•Hypertension
RELATIONSHIPS OF VARIABLES
Interdependency
•Age
•Systolic Pressure
•Weight
•Blood Sugar Level
•Cholesterol level
INTERPRETING STATISTICAL RESULT
Important Terms
The test statistic is a value computed from the sample data; it is
used to decide whether or not to reject the null hypothesis.
The critical region (or rejection region) is the set of all values of
the test statistic that cause us to reject the null hypothesis. Its
boundary is set by the critical value.
The significance level (denoted by α) is the probability that the
test statistic will fall in the critical region when the null hypothesis
is actually true. Common choices for α are 0.05, 0.01, and 0.10.
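To make the link between the significance level and the critical region concrete, the sketch below computes two-tailed critical values at the three common significance levels. It assumes a test statistic that follows the standard normal distribution (a z-test), which is an assumption made for illustration; `two_tailed_critical_value` is a hypothetical helper name.

```python
from statistics import NormalDist


def two_tailed_critical_value(alpha):
    """Return the positive z* with P(|Z| > z*) = alpha for a standard normal Z.

    Assumption: the test statistic is a z statistic (standard normal under H0).
    """
    return NormalDist().inv_cdf(1 - alpha / 2)


# The common significance levels named above.
for alpha in (0.10, 0.05, 0.01):
    z_star = two_tailed_critical_value(alpha)
    print(f"alpha = {alpha:.2f}: critical region is |z| > {z_star:.3f}")
```

A smaller significance level pushes the critical value outward, so the critical region shrinks and rejecting the null hypothesis becomes harder.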
INTERPRETING STATISTICAL RESULT
The statement of the problem/hypothesis is the basis for
interpreting results.
The null hypothesis is either rejected or not rejected.
A result is significant when the null hypothesis is rejected.
It is not significant when the null hypothesis is not rejected.
INTERPRETING STATISTICAL RESULT
Significance can mean any of the following:
• There is a relationship.
• There is an association between or among variables.
• There is an effect.
• The treatment is effective.
• A variable is dependent on the other variable/s.
• There is a difference/different effect.
INTERPRETING STATISTICAL RESULT
Questions:
• When and how do you reject or fail to reject the null hypothesis?
• When do we say that the result is significant?
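TRADITIONAL METHOD
Reject H0 if the test statistic falls within the critical region.
Fail to reject H0 if the test statistic does not fall within the
critical region.
[Figure: distribution curve with a critical value marked in each tail]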
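The traditional method can be sketched as a comparison between the test statistic and the critical value. The example assumes a two-tailed z-test, and both the function name `traditional_decision` and the sample statistics are made up for illustration.

```python
from statistics import NormalDist


def traditional_decision(test_statistic, alpha=0.05):
    """Apply the traditional (critical value) method to a two-tailed z-test.

    Assumption: the test statistic is standard normal under H0.
    """
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    # The critical region is both tails beyond +/- critical_value.
    if abs(test_statistic) > critical_value:
        return "reject H0"
    return "fail to reject H0"


# Hypothetical test statistics, not taken from any data set in the slides.
print(traditional_decision(2.31))
print(traditional_decision(1.20))
```

The same statistic can lead to different decisions at different significance levels, because the critical value depends on α.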
P-VALUE METHOD
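A minimal sketch of the p-value method, under the usual convention that H0 is rejected when the p-value is less than or equal to α. It assumes a two-tailed z-test; the function names and the sample statistics are hypothetical.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """Return P(|Z| >= |z|) for a standard normal Z (assumed z-test)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def p_value_decision(z, alpha=0.05):
    """Compare the p-value with alpha and return (p_value, decision)."""
    p_value = two_tailed_p_value(z)
    decision = "reject H0" if p_value <= alpha else "fail to reject H0"
    return p_value, decision


# Hypothetical test statistics for illustration only.
for z in (2.31, 1.20):
    p_value, decision = p_value_decision(z)
    print(f"z = {z:.2f}, p-value = {p_value:.4f}: {decision}")
```

For the same test and the same α, this rule and the critical value rule give the same decision; the p-value simply reports how extreme the observed statistic is.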
DESCRIPTIVE STATISTICS
Interpretation: The majority of the respondents, 134 out of 202 (66%),
are 16-19 years of age.
[Figure: pie chart and bar chart of four categories: Envious, Optimistic, Pessimistic, Trusting]
DESCRIPTIVE STATISTICS
[Figure: chart titled "SALES IN MILLION", by year from 2001 to 2010; values range from 10.2 to 20.1]
DESCRIPTIVE STATISTICS
[Figure: scatter plot]
DESCRIPTIVE STATISTICS
1. MEASURES OF CENTRAL TENDENCY
DESCRIPTIVE STATISTICS
MEASURES OF DISPERSION
Measures of absolute dispersion are expressed in the units of
the original observations.
Measures of relative dispersion are unit-less and are used
when one wishes to compare the scatter of one distribution with
another distribution.
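The difference between absolute and relative dispersion can be illustrated with the standard deviation (absolute, in the units of the data) and the coefficient of variation (relative, unit-less). Both data sets below are made up for illustration, and `coefficient_of_variation` is a hypothetical helper name.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Relative dispersion: sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100


# Made-up measurements in different units.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

for label, data, unit in (("height", heights_cm, "cm"), ("weight", weights_kg, "kg")):
    print(
        f"{label}: standard deviation = {stdev(data):.2f} {unit}, "
        f"coefficient of variation = {coefficient_of_variation(data):.2f}%"
    )
```

The two samples have the same standard deviation in their own units, but the weights vary more relative to their mean, which only the unit-less measure reveals.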
T-TEST FOR INDEPENDENT SAMPLES
• Also called the two-sample t-test for independent samples
• The variances of the two groups may be assumed equal or unequal
• It tests whether there is a significant difference
between the means of two unrelated groups
• It is used to test the null hypothesis: H0: μ1 = μ2
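For the two-sample t-test for independent samples, the sketch below computes the t statistic and its degrees of freedom under both variance assumptions: the pooled-variance form for equal variances and the Welch form for unequal variances. The function name `independent_t` and the two small samples are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(sample_a, sample_b, equal_var=True):
    """Return (t, degrees_of_freedom) for two unrelated samples."""
    n1, n2 = len(sample_a), len(sample_b)
    v1, v2 = variance(sample_a), variance(sample_b)  # sample variances
    if equal_var:
        # Pooled-variance t-test: one common variance estimate.
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        standard_error = sqrt(pooled * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Welch's t-test: separate variances, Welch-Satterthwaite df.
        standard_error = sqrt(v1 / n1 + v2 / n2)
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )
    t = (mean(sample_a) - mean(sample_b)) / standard_error
    return t, df


# Made-up scores for two unrelated groups.
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]
print(independent_t(group_a, group_b, equal_var=True))
print(independent_t(group_a, group_b, equal_var=False))
```

Turning the statistic into a p-value needs the t distribution, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`, with its `equal_var` parameter) is one option.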
T-TEST FOR DEPENDENT SAMPLES
• Also called the paired t-test
• It tests whether there is a significant difference between two
means obtained from the same group
• Mostly used in comparing pre-test and post-test results
• It is used to test the null hypothesis: H0: μd = 0, where μd is
the mean of the paired differences
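The paired t-test works on the differences within each pair, as in a pre-test and post-test comparison. This is a minimal sketch; `paired_t` is a hypothetical helper name and the scores are made up.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, degrees_of_freedom) for paired observations from one group."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    # Under H0 the mean difference is zero, so t = mean(d) / (s_d / sqrt(n)).
    t = mean(differences) / (stdev(differences) / sqrt(n))
    return t, n - 1


# Made-up pre-test and post-test scores for the same five respondents.
pre_test = [10, 12, 9, 11, 13]
post_test = [12, 14, 10, 14, 14]
t, df = paired_t(pre_test, post_test)
print(f"t = {t:.3f}, df = {df}")
```

Because each respondent serves as their own control, only the spread of the differences enters the standard error, not the spread of the raw scores.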
ANOVA – ANALYSIS OF VARIANCE
Illustration (correlation coefficient r):
r = -1: Perfect Negative Correlation
r = 0: No/Zero Correlation
r = 1: Perfect Positive Correlation
Closer to 0 = weaker
Closer to ±1.0 = stronger
r = ±1.0 = perfect
r ≈ 0 could mean many things:
• No correlation at all between X & Y
• Non-linear relationship between X & Y
• Restricted range on X and/or Y
• Outliers may be distorting the result
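The point that r near 0 does not rule out a relationship can be checked directly. The sketch below computes Pearson's r from its definition and applies it to a made-up linear data set and a made-up curved (quadratic) one; `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


x = [-2, -1, 0, 1, 2]
linear_y = [2 * value + 1 for value in x]  # exact straight line
curved_y = [value ** 2 for value in x]     # exact but non-linear relationship

print(f"linear: r = {pearson_r(x, linear_y):.2f}")
print(f"curved: r = {pearson_r(x, curved_y):.2f}")
```

In the curved case Y is completely determined by X, yet r is zero, because r measures only the linear part of a relationship. A scatter plot is the usual way to catch this.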
ACTIVITY: INTERPRET THE FOLLOWING R COEFFICIENTS
1) r = 0.85
2) r = -0.69
3) r = -0.37
4) r = -0.11
5) r = 0.09
6) r = 0.32
7) r = -0.92
8) r = 0.75
ACTIVITY: INTERPRET THE FOLLOWING R COEFFICIENTS
1) r = 0.85 Ans.: Very Strong Positive
2) r = -0.69 Ans.: Moderate/Strong Negative
3) r = -0.37 Ans.: Weak Negative
4) r = -0.11 Ans.: No/Very weak
5) r = 0.09 Ans.: No/Very weak
6) r = 0.32 Ans.: Weak Positive
7) r = -0.92 Ans.: Very Strong Negative
8) r = 0.75 Ans.: Strong Positive
INTERPRETING R (Evans, 1996)
r Verbal Interpretation
-1 Perfect Negative Correlation
-0.8 to -0.99 Very Strong Negative Correlation
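The verbal scale can be sketched as a small lookup function. Only the "perfect" and "very strong" bands appear in the table above; the remaining cut-offs used here (0.60, 0.40, and 0.20) are the ones commonly attributed to Evans (1996) and should be treated as an assumption. `interpret_r` is a hypothetical helper name.

```python
def interpret_r(r):
    """Return a verbal interpretation of a correlation coefficient r."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "No/Zero Correlation"
    size = abs(r)
    if size == 1:
        strength = "Perfect"
    elif size >= 0.80:
        strength = "Very Strong"
    elif size >= 0.60:   # assumed cut-off
        strength = "Strong"
    elif size >= 0.40:   # assumed cut-off
        strength = "Moderate"
    elif size >= 0.20:   # assumed cut-off
        strength = "Weak"
    else:
        strength = "Very Weak"
    direction = "Positive" if r > 0 else "Negative"
    return f"{strength} {direction} Correlation"


for r in (0.85, -0.37, 0.09, -0.92):
    print(f"r = {r:+.2f}: {interpret_r(r)}")
```

Labels like these describe only the size and direction of r; they say nothing about statistical significance, which also depends on the sample size.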
Assumptions (chi-square test)
• Independent random sampling
• Nominal/ordinal level data
• No more than 20% of the cells have an expected frequency less than 5
• No empty cells
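The two cell-count assumptions can be checked before running the test. The sketch below computes expected frequencies for a contingency table as (row total × column total) / grand total, then applies the 20% rule. It reads "empty cell" as an observed count of zero, which is an assumption; the function names and both tables are made up for illustration.

```python
def expected_frequencies(observed):
    """Expected counts under independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def check_cell_assumptions(observed):
    """Check the 20% rule on expected counts and the no-empty-cells rule."""
    expected = [e for row in expected_frequencies(observed) for e in row]
    share_below_5 = sum(e < 5 for e in expected) / len(expected)
    # Assumption: an "empty cell" is a cell whose observed count is zero.
    has_empty_cell = any(count == 0 for row in observed for count in row)
    return {
        "share_expected_below_5": share_below_5,
        "has_empty_cell": has_empty_cell,
        "assumptions_met": share_below_5 <= 0.20 and not has_empty_cell,
    }


# Made-up 2x2 tables: the first is well filled, the second is sparse.
print(check_cell_assumptions([[10, 20], [30, 40]]))
print(check_cell_assumptions([[1, 9], [2, 88]]))
```

When the check fails, common remedies are to merge sparse categories or to use an exact test instead.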