Week 12
Social Research –
Methodological Thinking
Department of Psychology
Faculty of Humanities
• A measure of variability is
• a numerical index that provides information about how spread out or how
much variation is present in a variable
• If all of the data values for a variable are the same, then there is
no variability
• The more different your numbers, the more variability you have
• Data for group one: 44, 45, 45, 45, 46, 46, 47, 47, 48, 49
• Data for group two: 34, 37, 45, 51, 58, 60, 77, 88, 90, 98
• data for group two have more variability than group one
• when there is little variability in a group, we say that the scores are
homogeneous
• when the scores show a lot of variability, we say that the scores are
heterogeneous
Measures of variability
• Three types of measures of variability
• Range
• Variance
• Standard deviation
Range
• Simplest, but most crude, measure of variability
• the highest (i.e., largest) number minus the lowest (i.e., smallest) number
in a set of numbers
• Range = H - L
• H is the highest number, and L is the lowest number
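As a quick check of Range = H - L, the two groups listed earlier can be compared in a few lines of Python. This is only an illustrative sketch; the function name `value_range` is an assumption introduced here, not something from the lecture.

```python
# Data for the two groups shown on the earlier slide
group_one = [44, 45, 45, 45, 46, 46, 47, 47, 48, 49]
group_two = [34, 37, 45, 51, 58, 60, 77, 88, 90, 98]


def value_range(values):
    """Range = H - L: the highest value minus the lowest value."""
    return max(values) - min(values)


range_one = value_range(group_one)  # 49 - 44 = 5
range_two = value_range(group_two)  # 98 - 34 = 64
```

The much larger range for group two matches the earlier observation that its scores are more heterogeneous. Note that only two of the ten values in each group enter the calculation, which is why the range is a crude measure.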
Variance and standard deviation
• Two most popular measures of variability are
• the variance and
• the standard deviation
• They are superior to the range because they take into account
all of the data values for a variable
• provide information about the dispersion or variation around the mean
value of a variable
• Variance is
• the average of the squared deviations of the data values from their
mean, so it is expressed in “squared units”
• the variance is popular because it has nice mathematical properties
• To turn the variance into more meaningful units, you can obtain
the standard deviation
• standard deviation is the square root of the variance
• to calculate the standard deviation,
• you take the square root of the variance (i.e., you put the value of the
variance into your calculator and press the square root key)
• an approximate indicator of the average distance that your data values
are from their mean
• if you have a mean of 5 and a standard deviation of 2, then the data values
tend to be approximately 2 units above or below 5
• For the variance and the standard deviation,
• the larger the value, the greater the data are spread out;
• the smaller the value, the less the data are spread out
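The definitions above can be sketched in Python as follows. The slides do not say whether the sum of squared deviations is divided by n (all the data values, a population-style formula) or by n - 1 (the sample formula used in many textbooks and calculators), so the sketch offers both through a hypothetical `sample` flag. The function names are assumptions made for illustration.

```python
import math


def variance(values, sample=False):
    """Average squared deviation of the values from their mean.

    sample=False divides by n; sample=True divides by n - 1.
    Which divisor the course expects is an assumption to check.
    """
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    divisor = n - 1 if sample else n
    return sum(squared_deviations) / divisor


def standard_deviation(values, sample=False):
    """Square root of the variance, back in the original units."""
    return math.sqrt(variance(values, sample))


group_one = [44, 45, 45, 45, 46, 46, 47, 47, 48, 49]
group_two = [34, 37, 45, 51, 58, 60, 77, 88, 90, 98]

sd_one = standard_deviation(group_one)
sd_two = standard_deviation(group_two)
```

Whichever divisor is used, `sd_two` comes out far larger than `sd_one`, in line with the rule that a larger value means the data are more spread out.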
Calculate the range, variance and standard
deviation
• Data set
• 14; 72; 52; 15; 19; 36; 58; 25
• Calculate
• Range
• Variance
• Standard deviation
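One way to check a hand calculation for this exercise is with Python's built-in `statistics` module. The sketch below reports both the divide-by-n results (`pvariance`, `pstdev`) and the divide-by-(n - 1) results (`variance`, `stdev`), because the slides do not state which version is expected. The variable names are assumptions introduced here.

```python
import statistics

data = [14, 72, 52, 15, 19, 36, 58, 25]

data_range = max(data) - min(data)   # 72 - 14 = 58
mean = statistics.mean(data)         # 291 / 8 = 36.375

# Dividing the sum of squared deviations (3369.875) by n = 8
pop_variance = statistics.pvariance(data)  # 421.234375
pop_sd = statistics.pstdev(data)           # about 20.52

# Dividing the same sum by n - 1 = 7
sample_variance = statistics.variance(data)  # about 481.41
sample_sd = statistics.stdev(data)           # about 21.94
```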
• If a standardized data value (a z score) is +1.00, one can say that
this value falls one standard deviation above the mean, a value of
+2.00 means it falls two standard deviations above the mean, a value
of –1.5 means it falls one and a half standard deviations below the
mean, and so on
• “Standardized units” or “z scores” are used with the normal
curve
Calculate the z-score for the highest value
• Data set
• 14; 72; 52; 15; 19; 36; 58; 25
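A z score is a value's distance from the mean expressed in standard deviation units: z = (value - mean) / standard deviation. The sketch below applies this to the highest value in the data set. Since the slides do not specify the divisor for the standard deviation, both versions are shown; the helper name `z_score` is an assumption.

```python
import statistics

data = [14, 72, 52, 15, 19, 36, 58, 25]


def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd


highest = max(data)           # 72
mean = statistics.mean(data)  # 36.375

# Using the standard deviation that divides by n
z_population = z_score(highest, mean, statistics.pstdev(data))  # about 1.74

# Using the standard deviation that divides by n - 1
z_sample = z_score(highest, mean, statistics.stdev(data))       # about 1.62
```

Either way, the highest value falls a little more than one and a half standard deviations above the mean.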
• Using data from TikTok University’s graduate data set, the mean
(i.e., the average) starting salary for males is ZAR 34,791.67, and
the mean starting salary for females is ZAR 31,269.23. Therefore,
the unstandardized difference between these two means is ZAR
34,791.67 minus ZAR 31,269.23, which is ZAR 3,522.44
• What can we deduce from this?
• To assist in deciding how different the group means are,
• the difference between the means is often transformed into a
standardised measure
Cohen’s d is 0.88. This says that the mean starting salary for males is 0.88
standard deviations above the mean for females.
Using Cohen’s criteria for interpretation, one would consider this a “large”
difference between the means.
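Cohen's d is the difference between two means divided by a standard deviation (often a pooled standard deviation of the two groups). The slides report the means and d = 0.88 but not the standard deviation used, so the sketch below only back-calculates the standard deviation implied by those figures; that value is derived here and is not part of the data set. The cut-offs of 0.2, 0.5 and 0.8 are Cohen's conventional benchmarks for small, medium and large effects, and the function names are assumptions.

```python
def cohens_d(mean_1, mean_2, sd):
    """Standardized mean difference: (mean_1 - mean_2) / sd."""
    return (mean_1 - mean_2) / sd


def interpret_d(d):
    """Label the size of d using Cohen's conventional benchmarks."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


mean_males = 34791.67    # ZAR, from the slide
mean_females = 31269.23  # ZAR, from the slide
reported_d = 0.88        # from the slide

mean_difference = mean_males - mean_females  # ZAR 3,522.44

# Implied by the reported figures, NOT given in the slides:
implied_sd = mean_difference / reported_d    # roughly ZAR 4,000

label = interpret_d(reported_d)  # "large"
```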
Correlation coefficient
• Index indicating the strength and direction of linear relationship
between two quantitative variables
• value ranges from +1.0 to -1.0
• Negative correlation
• correlation in which values of two variables tend to move in opposite
directions
• e.g., the more hours students spend partying the night before an exam,
the lower their test grades tend to be
• Pearson correlation (r)
• used with two quantitative variables
• only appropriate if the two variables are related in a linear fashion
• Partial correlation
• a technique that involves examining correlation after controlling for one or
more variables
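The Pearson correlation can be computed from deviations around each variable's mean, and a first-order partial correlation can then be built from three pairwise correlations. The sketch below uses small made-up numbers for the partying example; the data and the function names are hypothetical and serve only to illustrate the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / math.sqrt(ss_x * ss_y)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between X and Y after controlling for one variable Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data: hours spent partying and the next day's test grade
hours_partying = [0, 1, 2, 3, 4, 5]
test_grades = [85, 80, 72, 70, 61, 55]

r = pearson_r(hours_partying, test_grades)  # negative: grades fall as hours rise
```

A value of `r` close to -1.0 indicates a strong negative linear relationship, and a value close to +1.0 a strong positive one.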
• Multiple regression
• involves two or more independent or predictor variables
• Prediction is made using the regression equation
• This equation defines the regression line that best fits the pattern
of observations in your data
• slope – how steep is the line
• y-intercept – point where regression line crosses y-axis
• Regression coefficient
• predicted change in the dependent variable (Y) given a one unit change in
the independent variable (X)
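For the simplest case of one independent variable, the regression line Y = intercept + slope × X can be fitted by least squares, as sketched below. Multiple regression extends the same idea to two or more predictors and is not shown. The data are made up and the function names are assumptions introduced for illustration.

```python
def fit_line(x, y):
    """Least-squares slope and y-intercept for one predictor variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    slope = cross_products / ss_x          # regression coefficient
    intercept = mean_y - slope * mean_x    # where the line crosses the y-axis
    return slope, intercept


def predict(x_value, slope, intercept):
    """Predicted Y from the regression equation."""
    return intercept + slope * x_value


# Hypothetical data: hours studied (X) and test grade (Y)
hours_studied = [1, 2, 3, 4, 5]
grades = [52, 58, 61, 67, 72]

slope, intercept = fit_line(hours_studied, grades)
predicted_grade = predict(6, slope, intercept)
```

Here `slope` is the predicted change in the grade for each additional hour studied, which is how the regression coefficient is read.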
• Examining relationships
• Cohen’s d
• Correlation
• Regression analysis
• Contingency tables
Thank You
Next Lecture
Unit 11
Inferential statistics (Chapter 15)
Student Evaluation:
Please keep an eye out for this email and complete it
Prof Eugene L Davids
Room 11-30 (Humanities Building)
Consultation: Tuesdays 9h00 – 11h00 (by prior email arrangement)
Exam Focus
Units 7 - 11