SU15 - X590 - IHRM - Module 8 - Data Analysis - Significance
METHODS
SPH-X590 SUMMER 2015
[Figure: The Scientific Method cycle: Theories, Deductive Reasoning, Analysis, Hypotheses/Propositions, Concepts/Variables, Postulates/Measurement]
Data Analysis:
In the Big Picture of Methodology
[Figure: Question to Answer / Hypothesis to Test / Theory → Study Design: Data Collection Method & Analysis → Collect Data: Measurements, Observations → Inferential Statistics & Causal Inference: Test Hypothesis, Conclusions, Interpretation, & Identification of Relationships]
Note: Results of empirical scientific studies always begin with Descriptive Statistics; whether results conclude with Inferential Statistics depends on the Research Objectives/Aims.
• Descriptive Statistics
o Summarization & Organization of variable values/scores
for the sample
• Inferential Statistics
o Inferences made from the Sample Statistic to the
Population Parameter.
o Able to Estimate Causation or make Causal Inference
• Isolate the effect of the Experimental (Independent) Variable
on the Outcome (Dependent) Variable
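To make the distinction above concrete, here is a minimal Python sketch (assuming NumPy and SciPy are available; the scores are made-up illustration data): descriptive statistics summarize the sample itself, while an inferential test uses the sample statistic to draw a conclusion about the population parameter.

    # Descriptive vs. inferential statistics on the same (hypothetical) sample.
    import numpy as np
    from scipy import stats

    sample = np.array([132, 145, 151, 160, 128, 139, 155, 147, 162, 150])

    # Descriptive: summarize and organize the sample's scores.
    print("mean =", sample.mean(), "SD =", sample.std(ddof=1))

    # Inferential: infer from the sample statistic to the population parameter,
    # e.g., test H0: population mean = 140.
    t_stat, p_value = stats.ttest_1samp(sample, popmean=140)
    print("t =", t_stat, "p =", p_value)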
Data Analysis:
Descriptive Statistics
• Descriptive Statistics are procedures used for organizing and summarizing
scores in a sample so that the researchers can describe or communicate the
variables of interest.
• Note: Descriptive Statistics apply only to the sample: they say nothing about how
accurately the data may reflect reality in the population.
• Inferential Statistics, by contrast, attempt to rule out chance as an explanation for the
results: that the results reflect real relationships that exist in the population and are not
just random or due to chance alone.
• Before you can describe or evaluate a relationship using statistics, you must design
your study so that your research question can be addressed.
Data Analysis: Statistical Notation
Σ: sigma (capital), summation
X: Independent Variable, typically
Y: Dependent Variable, typically
N = Size of the Population
n = Size of the Sample
≤ ≥ ≠ = : Equalities or Inequalities
± × ÷ + − : Mathematical Operators
α: alpha, the constant/intercept; also used for the significance level
µ: mu, population mean
β: beta, standardized regression coefficient
σ: sigma, population standard deviation
σ²: sigma squared, population variance
Data Analysis:
Inferential Statistics & Types of Tests
Data Analysis: Frequency Distribution Tables
• The frequency column contains the tallies for each value X: how
often each X value occurs in the data set.
o These tallies are the frequencies for each X value.
• Class Intervals all have the same width: typically, a simple number such as 2, 5, 10,
and so on.
• Each Class Interval begins with a value that is a multiple of the Interval Width.
o The Interval Width is selected so that the distribution will have approximately 10 intervals.
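The interval rules above can be sketched in Python (NumPy assumed; the scores and the width of 15 are hypothetical illustration values): each interval has the same width, begins at a multiple of that width, and the width is chosen so roughly 10 intervals cover the data.

    import numpy as np

    scores = np.array([102, 118, 125, 131, 133, 140, 142, 147, 151, 158,
                       163, 166, 171, 178, 184, 193, 207, 221, 236, 249])

    width = 15                               # a "simple" interval width
    start = (scores.min() // width) * width  # first interval starts at a multiple of the width
    edges = np.arange(start, scores.max() + width, width)

    freq, _ = np.histogram(scores, bins=edges)
    for lo, f in zip(edges[:-1], freq):
        print(f"{lo} to <{lo + width}: frequency {f}, "
              f"relative frequency {f / len(scores):.3f}")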
Data Analysis: Grouped Frequency Distribution
• Choosing a Class Interval width of 15 produces the following Frequency Distribution.
• Age is typically displayed as a Grouped Frequency Distribution:
  o For example: 45 to 54 Years, 55 to 64 Years

  Class Interval    Frequency    Relative Frequency
  100 to <115            2            0.025
  115 to <130           10            0.127
  130 to <145           21            0.266
  145 to <160           15            0.190
  160 to <175           15            0.190
  175 to <190            8            0.101
  190 to <205            3            0.038
  205 to <220            1            0.013
  220 to <235            2            0.025
  235 to <250            2            0.025
  Total                 79            1.000

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc.
Data Analysis: Histograms
• The score categories (X values) are listed on the X axis and the frequencies (the
number of individuals with each X value) are listed on the Y axis.
[Figure: A frequency distribution histogram: the same set of quiz scores shown as a table and in a histogram. Also see the Age Distribution of Martians examples from the Sampling PowerPoint.]
• The Smooth Curve emphasizes the shape of the distribution: not the exact
frequency for each category
• Negatively Skewed: the scores tend to pile up on the right side and the
tail points to the left.
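As a quick check of the skew description above, a hypothetical set of quiz scores that piles up on the right with a tail to the left yields a negative skewness statistic (SciPy assumed):

    from scipy.stats import skew

    scores = [55, 70, 78, 82, 85, 86, 88, 89, 90, 91, 92, 93]
    print(skew(scores))   # negative value -> negatively skewed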
Data Analysis: Percentiles, Percentile Ranks, & Interpolation
Data Analysis: z-Scores
• The z-score definition can be written in mathematical notation to create a formula for
computing the z-score for any value of X:

  z = (X − μ) / σ
  X = μ + zσ
[Figure: The relationship between z-score values and locations in a population distribution.]
An entire population of scores is transformed into z-scores. The transformation does not
change the shape of the population, but the mean is transformed to a value of 0 and the
standard deviation is transformed to a value of 1.
[Figure: Following a z-score transformation, the X-axis is relabeled in z-score units.]
Why are z-scores important? Because if you know the distribution of your scores,
you can test hypotheses and make predictions.
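A minimal numeric sketch of the two formulas above (the population values μ = 100 and σ = 15 are assumed for illustration):

    mu, sigma = 100, 15
    X = 130
    z = (X - mu) / sigma        # z = 2.0: the score is 2 SDs above the mean
    X_back = mu + z * sigma     # inverse formula recovers the original score, 130
    print(z, X_back)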
Data Analysis: Characteristics of z Scores
• Z scores tell you the number of standard deviation units a score is above or
below the mean
• The mean of the z score distribution = 0
• The SD of the z score distribution = 1
• The shape of the z score distribution will be exactly the same as the shape of
the original distribution
• Σz = 0
• Σz² = SS = N
• σ² = 1 = Σz²/N
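The listed properties can be verified with a short sketch (NumPy assumed; the population of scores is hypothetical):

    import numpy as np

    X = np.array([2, 4, 4, 6, 8, 9, 9, 14], dtype=float)
    mu, sigma = X.mean(), X.std()              # population mean and SD

    z = (X - mu) / sigma
    print(np.isclose(z.sum(), 0))              # sum of z = 0
    print(np.isclose((z ** 2).sum(), len(X)))  # sum of z^2 = SS = N
    print(np.isclose(z.var(), 1))              # variance of z = 1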
Data Analysis:
Sources of Error in Probabilistic Reasoning
• Type II Errors
o Failure to reject a false null hypothesis
o Sometimes called a “Beta” Error.
Data Analysis: Statistical Power
How sensitive is a test at detecting real effects?
• A powerful test decreases the chances of making a Type II Error
• Ways of Increasing Power:
o Increase sample size
o Make alpha level less conservative
o Use one-tailed versus a two-tailed test
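One way to see these levers is a simulation sketch (NumPy and SciPy assumed; the effect size, alpha, and sample sizes are hypothetical): power is estimated as the proportion of simulated studies that correctly reject a false null hypothesis, and it rises as the sample size grows.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    effect, alpha, reps = 0.5, 0.05, 2000    # true difference of 0.5 SD

    for n in (20, 50, 100):                  # larger n -> higher power, fewer Type II errors
        rejections = 0
        for _ in range(reps):
            a = rng.normal(0.0, 1.0, n)
            b = rng.normal(effect, 1.0, n)
            if stats.ttest_ind(a, b).pvalue < alpha:
                rejections += 1
        print(n, rejections / reps)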
Data Analysis:
Assumptions of Parametric Hypothesis Tests
(z, t, ANOVA)
Evaluation of F Ratio
• Obtained F is compared with a critical value
• If you get a significant F, all it tells you is that at least one of
the means is different from one of the others
• To figure out exactly where the differences are, you must use
Multiple Comparison Tests
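A sketch of that comparison for a one-way ANOVA (SciPy assumed; the three groups are hypothetical data):

    from scipy import stats

    g1 = [4, 5, 6, 5, 7]
    g2 = [8, 9, 7, 10, 9]
    g3 = [5, 6, 6, 7, 5]

    f_obtained, p = stats.f_oneway(g1, g2, g3)

    df_between = 3 - 1                # k - 1 groups
    df_within = 15 - 3                # N - k scores
    f_critical = stats.f.ppf(1 - 0.05, df_between, df_within)

    # Significant if the obtained F exceeds the critical F (equivalently, p < .05);
    # a significant F only says that at least one mean differs from another.
    print(f_obtained, f_critical, p)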
Data Analysis: Multiple Comparison Tests
• The issue of “Experiment-wise Error”
o Results from an accumulation of “per comparison errors”
• Planned Comparisons
o Can be done with t tests (must be few in number)
• Unplanned Comparisons (Post Hoc tests)
o Protect against experiment-wise error
o Examples:
• Tukey’s HSD Test
• The Scheffe Test
• Fisher’s LSD Test
• Newman-Keuls Test
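A post hoc sketch using one of the tests listed above, Tukey's HSD (assuming SciPy 1.8+, which provides scipy.stats.tukey_hsd; the groups reuse the hypothetical data from the ANOVA sketch):

    from scipy import stats

    g1 = [4, 5, 6, 5, 7]
    g2 = [8, 9, 7, 10, 9]
    g3 = [5, 6, 6, 7, 5]

    # Pairwise comparisons that control experiment-wise error.
    result = stats.tukey_hsd(g1, g2, g3)
    print(result)   # pairwise mean differences, confidence intervals, and p-values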
Data Analysis:
Measuring Effect Size in ANOVA
• Most common technique is “r²”
  o Tells you what percent of the variance is due to the treatment
  o r² = SS between groups / SS total
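A direct computation of r² from group data (NumPy assumed; same hypothetical groups as above):

    import numpy as np

    groups = [np.array([4, 5, 6, 5, 7]),
              np.array([8, 9, 7, 10, 9]),
              np.array([5, 6, 6, 7, 5])]

    all_scores = np.concatenate(groups)
    grand_mean = all_scores.mean()

    ss_total = ((all_scores - grand_mean) ** 2).sum()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

    r_squared = ss_between / ss_total   # proportion of variance due to the treatment
    print(r_squared)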
Single Factor ANOVA: One-Way ANOVA
• Can be Independent Measures
• Can be Repeated Measures
[Flowchart: Choosing between t tests and ANOVA]
• Do you know the Standard Deviation of the Population? (Yes / No)
• Only 2 groups → use a t test:
  o Do you have independent data? Yes → use the Independent Samples t test; No → use the Paired Samples t test.
  o Is the t test significant? (Yes / No)
• More than 2 groups → use ANOVA:
  o If F is not significant, retain the Null; if F is significant, reject the Null and compare means with Multiple Comparison Tests.
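The last branch of the flowchart, in a minimal sketch (SciPy assumed; all scores are hypothetical): the choice between the two t tests depends on whether the scores are independent, and significance is judged from the p-value against the chosen alpha level.

    from scipy import stats

    group_a = [12, 15, 11, 14, 13, 16]
    group_b = [14, 17, 13, 15, 15, 18]

    # Independent data (two separate groups):
    print(stats.ttest_ind(group_a, group_b))

    # Dependent data (the same participants measured twice):
    print(stats.ttest_rel(group_a, group_b))

    # In both cases, reject the null hypothesis if the p-value is below alpha.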