AS-level - Research Methods 4 - Correlation and Data Analysis
What is a correlation?
Correlation
•We investigate a link/relationship
between two variables
•Useful when it is possible to measure
variables but not to manipulate them,
i.e. when an experiment is not possible
Correlation
•Cannot manipulate because of
practical and ethical reasons
•Can you think of any?
•To have a correlation, we need
continuous data
Correlation
•Continuous data: numbers that
can take any value
•E.g. height = 193.3 cm, temperature =
28.1
Correlation – what can we
conclude?
•Let’s say we find a positive correlation –
can we say that one affects the other?
•We cannot assume a causal relationship
• Meaning, we cannot say that one variable
affected the other!
Correlation – what can we
conclude?
•So instead of independent and
dependent variables, we call them co-
variables
•To make judgements about cause
and effect, we need an experiment
Correlation – what can we
conclude?
•If there is no correlation, we can
conclude there is no causal
relationship either
Correlation – what can we
conclude?
•Positive correlation: the two
variables increase together
•Negative correlation: as one
variable increases, the other decreases
Hypothesis in correlational studies
•Hypothesis needs to operationalize
co-variables
•“There is a correlation between studying
and stress”: how would you
operationalize this hypothesis?
Hypothesis in correlational studies
•How would you make a non-directional
hypothesis?
•“There will be a correlation between…”
•Directional hypothesis?
•“There will be a negative/positive correlation
between…”
Evaluating correlations
•Only valid if measures of both variables test
real phenomena
•Variables must be clearly defined and relevant
to the relationship being investigated
•Reliable if measures are consistent
•Self-reports less reliable because they are less
objective than scientific measurements
Evaluating correlations
•Most important to know:
conclusions do not reflect a causal
relationship
Apply your knowledge
•Correlations are a good starting point
•Should we investigate further?
•Might be the best choice practically or
ethically: know when to choose one
•Know the difference: correlations vs.
experiments
Apply your knowledge
•Be able to justify how you measure your
variables
•Know how to use a correlation that
accomplishes your aim
• Create two co-variables that can be
measured on a scale, e.g. height, age, but also
results from questionnaires and tests
Data and data
analysis
Data
•Like all scientists, we often have
numerical results from investigations:
raw data
•This is often simplified mathematically
• So that we can make graphs, charts, etc.
Types of data
•Quantitative
• Numerical results, numbers
•Qualitative
• Detailed, descriptive
Quantitative
•Quantity of psychological measure
• Total, frequency
•Tend to be measured on scales e.g.
ratings
•You can get numerical values from
observations, questionnaires, and interviews
Quantitative
•Usually objective
•Fixed quantities, so reliable as well
Qualitative
•Quality of psychological characteristic
•More in-depth
•Observer notes, open-ended questions from
interviews, questionnaires, case studies
•Subjectivity risk!
•But can be representative
•Can be highly valid
Descriptive statistics
•There are many ways to analyze
data, most beyond this course
•We will focus on ‘descriptive
statistics’
Descriptive statistics
•Summary table
•Totals from:
• Tally chart, percentages, information
about averages, etc.
•Summary table has title, rows and
columns
Descriptive statistics
•Summary table
•DV and IV are in the table
•You should be able to compare data
in it
Measures of central tendency
•A set of quantitative results can be
summarized to a ‘middle’ score
•These are measures of central
tendency
•Do you know any?
The mode
•The most frequent score in a data set
•Can be used with numerical data (e.g.
scores) and data that can be counted
(e.g. favorite subject)
•It is the only measure of central tendency
that can be used on discrete categories
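A minimal Python sketch of finding the mode, for both numerical scores and named categories. The two lists are invented for illustration and are not data from this course; `multimode` from the standard library is used because it also copes with ties.

```python
from statistics import multimode

# Hypothetical data, invented for illustration only
scores = [3, 5, 5, 6, 7, 5, 6]
favorite_subjects = ["art", "maths", "art", "biology", "art", "maths"]

# multimode returns every value tied for most frequent, as a list
score_modes = multimode(scores)
subject_modes = multimode(favorite_subjects)

print(score_modes)    # [5]
print(subject_modes)  # ['art']
```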
The median
•Cannot be used with discrete categories, only
numerical data
•To find the median, put all data in order from
smallest to largest
•Median = middle score
• What if there is an even number of scores?
• Add the two middle scores and divide by 2
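The steps above can be sketched in Python as follows. The function name `median` and the two example lists are assumptions made for illustration.

```python
def median(scores):
    """Return the middle score; average the two middle scores if the count is even."""
    ordered = sorted(scores)  # smallest to largest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    # Even number of scores: add the two middle scores and divide by 2
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_example = median([7, 3, 9, 5, 1])  # ordered: 1 3 5 7 9
even_example = median([7, 3, 9, 5])    # ordered: 3 5 7 9

print(odd_example)   # 5
print(even_example)  # 6.0
```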
The mean
•In other words, the ‘average’
•Most used measure of central tendency
• Because it includes all data points
•Add all scores, divide by number of
data points
The mean
•Add all scores, divide by number of
data points
•What can be a problem with this?
• Extreme scores
•Still, most representative
The mean
•64+66+76+73+74+81+83+82+80+88 =
767
•10 data points so:
• 767 divided by 10 = 76.7
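The same calculation as a short Python sketch. The extra score of 5 is a hypothetical outlier, added only to show how one extreme score can pull the mean.

```python
scores = [64, 66, 76, 73, 74, 81, 83, 82, 80, 88]  # scores from the worked example

total = sum(scores)
mean = total / len(scores)
print(total, mean)  # 767 76.7

# Hypothetical extreme score, invented to show its effect on the mean
with_extreme = scores + [5]
mean_with_extreme = sum(with_extreme) / len(with_extreme)
print(round(mean_with_extreme, 1))  # 70.2
```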
Measures of spread
•Indicates how varied results are
within data set: clustered together or
widely dispersed?
•Range and standard deviation
The range
•Simplest measure of spread
•Step 1 – find largest and smallest
value in data set
•Step 2 – subtract smallest value
from largest value, then add 1
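A small Python sketch of the two steps, reusing the ten scores from the mean example in these slides. The variable names are assumptions for illustration.

```python
scores = [64, 66, 76, 73, 74, 81, 83, 82, 80, 88]

# Step 1: find the largest and smallest value
largest = max(scores)
smallest = min(scores)

# Step 2: subtract the smallest from the largest, then add 1 (as in the slide)
value_range = largest - smallest + 1

print(largest, smallest, value_range)  # 88 64 25
```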
The range
•It says something about the diversity
of the data
•One problem: it doesn’t accurately reflect
outliers
•Outliers: values at an abnormal distance from
the other values
Standard deviation
•Standard deviation tells us more
•Considers the difference between
each data point and the mean
•More spread in the data: higher standard
deviation; less spread: lower
standard deviation
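A hedged Python sketch of the idea. It assumes the sample formula (dividing by n - 1); the slides do not say which version is expected, and dividing by n gives the population version. The `clustered` list is hypothetical, invented to contrast with the more spread-out scores from the mean example.

```python
from math import sqrt

def standard_deviation(data):
    """Sample standard deviation (assumption: divide by n - 1)."""
    mean = sum(data) / len(data)
    # Difference between each data point and the mean, squared
    squared_differences = [(x - mean) ** 2 for x in data]
    return sqrt(sum(squared_differences) / (len(data) - 1))

scores = [64, 66, 76, 73, 74, 81, 83, 82, 80, 88]     # from the mean example
clustered = [75, 76, 77, 76, 75, 77, 76, 78, 76, 77]  # hypothetical, tightly clustered

sd_scores = standard_deviation(scores)
sd_clustered = standard_deviation(clustered)

print(round(sd_scores, 2))       # 7.62
print(sd_clustered < sd_scores)  # True: less spread, lower standard deviation
```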
Graphs
•Visual representation of data
•Which ones do you know?
•Bar charts, histograms, scatter
graphs
Bar chart
•Used when there are separate categories
•Used when there are totals of data
collected in named categories
•Y-axis: DV
•X-axis: groups
Histograms
•Histograms show the pattern in a whole
data set: continuous data, not
categories
•Frequency is on the y-axis
•DV is on the x-axis
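A rough Python sketch of what a histogram counts: how many scores fall in each interval of the DV. It reuses the ten scores from the mean example; the interval width of 10 is an assumption chosen for illustration, and the text output only stands in for a drawn graph.

```python
from collections import Counter

scores = [64, 66, 76, 73, 74, 81, 83, 82, 80, 88]
bin_width = 10  # assumed interval width, chosen for illustration

# Map each score to the lower edge of its interval, e.g. 76 -> 70
frequencies = Counter((score // bin_width) * bin_width for score in scores)

# One row per interval (the x-axis); the number of # marks is the frequency
for lower in sorted(frequencies):
    print(f"{lower}-{lower + bin_width - 1}: {'#' * frequencies[lower]}")
# 60-69: ##
# 70-79: ###
# 80-89: #####
```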
Scatter graphs
•Correlational study
•A dot is marked at the point where the scores
on each variable cross
•Line of best fit: a calculated line
which goes through the points on the graph
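One common way to calculate a line of best fit is least squares; the slides do not say which method is used, so treat this Python sketch as an assumption. The co-variable data (hours of study and a stress rating) are invented for illustration.

```python
# Hypothetical co-variables, invented for illustration only
hours_studied = [1, 2, 3, 4, 5]
stress_rating = [2, 4, 5, 4, 5]

n = len(hours_studied)
mean_x = sum(hours_studied) / n
mean_y = sum(stress_rating) / n

# Least-squares slope: how much y changes, on average, per unit of x
sum_xy = sum((x - mean_x) * (y - mean_y)
             for x, y in zip(hours_studied, stress_rating))
sum_xx = sum((x - mean_x) ** 2 for x in hours_studied)
slope = sum_xy / sum_xx

# The fitted line passes through the point (mean_x, mean_y)
intercept = mean_y - slope * mean_x

print(round(slope, 2), round(intercept, 2))  # 0.6 2.2
```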
Scatter graphs
•The r value shows the strength and
direction of a correlation
•Ranges from -1 to +1
•+1: perfect (strongest possible) positive correlation
•-1: perfect (strongest possible) negative correlation
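A minimal Python sketch of calculating an r value. It assumes Pearson's correlation coefficient, which the slides do not name explicitly; the function name `pearson_r` and both data sets are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

# Hypothetical co-variables, invented for illustration only
hours_studied = [1, 2, 3, 4, 5]
stress_rating = [2, 4, 5, 4, 5]    # tends to rise with hours studied
hours_of_sleep = [10, 8, 6, 4, 2]  # falls in a perfectly straight line

r_positive = pearson_r(hours_studied, stress_rating)
r_negative = pearson_r(hours_studied, hours_of_sleep)

print(round(r_positive, 2))  # 0.77
print(round(r_negative, 2))  # -1.0
```

Even an r value of -1 here would not show that studying causes less sleep; it only describes how closely the two co-variables move together.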