Correlation Analysis

Understanding Relationships Between Variables

Karim Allahverdiyev
Introduction to Correlation
Definition:
Correlation measures the strength and direction of a linear relationship between two
quantitative variables. The correlation coefficient, denoted r, ranges from -1 to +1,
where:
• r=+1 : Perfect positive correlation.
• r=−1 : Perfect negative correlation.
• r=0 : No linear correlation.
Example:
• Positive Correlation:
⚬ Study time and exam scores: More study time often results in higher scores.
• Negative Correlation:
⚬ TV hours and physical fitness: More TV hours tend to go with lower fitness levels.
Importance:
Correlation is essential for:
• Predictive Analysis: Helps forecast outcomes (e.g., sales and advertising).
• Identifying Relationships: Determines if variables are related for deeper analysis.
• Decision-Making: Guides decisions in fields like business, healthcare, and education.
Applications:
• Statistics: Explaining variance and dependencies.
• Economics: Studying market trends (e.g., inflation and unemployment rates).
Types of Correlation
Positive Correlation
• Definition: Both variables move in the same direction. If one increases, the other also
increases.
• Characteristics:
⚬ Direct relationship.
⚬ Correlation coefficient (r) is between 0 and +1.
• Example: As study hours increase, exam scores tend to improve.
• Graph: Points form an upward trend on a scatterplot.
Negative Correlation
• Definition: Variables move in opposite directions. If one increases, the other decreases.
• Characteristics:
⚬ Inverse relationship.
⚬ Correlation coefficient (r) is between -1 and 0.
• Example: As outdoor temperature increases, heating usage decreases.
• Graph: Points form a downward trend on a scatterplot.
No Correlation
• Definition: No clear relationship between the variables.
• Characteristics:
⚬ Correlation coefficient (r) is close to 0.
• Example: Shoe size and intelligence are unrelated.
• Graph: Points are scattered randomly on a scatterplot.
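The three types can be illustrated with a small Python sketch. The data sets below are hypothetical toy values chosen to echo the examples above, and pearson_r is an illustrative helper name, not something from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data (not from the slides).
study_hours = [1, 2, 3, 4, 5, 6]
exam_scores = [52, 58, 61, 70, 74, 83]      # rises with study hours

outdoor_temp = [0, 5, 10, 15, 20, 25]
heating_kwh = [40, 33, 27, 18, 11, 4]       # falls as temperature rises

shoe_size = [38, 39, 40, 41, 42, 43]
test_score = [100, 110, 95, 95, 110, 100]   # no trend with shoe size

r_positive = pearson_r(study_hours, exam_scores)
r_negative = pearson_r(outdoor_temp, heating_kwh)
r_none = pearson_r(shoe_size, test_score)

print("positive:", round(r_positive, 3))
print("negative:", round(r_negative, 3))
print("none:", round(r_none, 3))
```

The sign of r gives the direction (upward or downward trend) and its absolute value gives the strength.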
Correlation Coefficient
Definition:
The correlation coefficient, denoted r, is a numerical measure of the strength and
direction of the linear relationship between two variables X and Y. Its value ranges
from -1 to +1.

Formula:
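Assuming the coefficient meant here is the Pearson product-moment coefficient (the usual choice for a linear relationship between two quantitative variables), it can be written as follows. The symbols x_i, y_i, n and the sample means are standard notation and are an assumption about the slide's notation.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Here x_i and y_i are the paired observations of X and Y, n is the number of pairs, and \bar{x}, \bar{y} are the sample means.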
Scatter Plot
Scatter Plot Features:
• Axes:
⚬ X-axis (Independent Variable): Typically holds the variable you believe may be
influencing the other.
⚬ Y-axis (Dependent Variable): Holds the variable that may change in response to the
other.
• Data Points: Each dot represents one pair of values from the two variables. The overall
pattern of the dots shows the direction and strength of the correlation.
• Trend Lines (Optional): If there’s a strong linear relationship, you might draw a trend
line. A line of best fit helps to visualize the correlation:
⚬ Positive Correlation Trend Line: Slopes upward from left to right.
⚬ Negative Correlation Trend Line: Slopes downward from left to right.
⚬ No Correlation Trend Line: A roughly horizontal line, or no meaningful line at all.
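A line of best fit can be sketched with ordinary least squares. The slides do not say which fitting method they have in mind, so least squares is an assumption here. The function names and the ad/visit figures are hypothetical.

```python
def fit_trend_line(xs, ys):
    """Least-squares line y = intercept + slope * x for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def trend_direction(slope, tolerance=1e-9):
    """Describe the trend line the way the slide does."""
    if slope > tolerance:
        return "upward"
    if slope < -tolerance:
        return "downward"
    return "flat"

# Hypothetical data: number of online ads (x) and site visits in thousands (y).
ads = [1, 2, 3, 4, 5]
visits = [12, 15, 19, 22, 27]

slope, intercept = fit_trend_line(ads, visits)
direction = trend_direction(slope)
print(f"y = {intercept:.2f} + {slope:.2f} * x ({direction})")
```

The sign of the slope always matches the sign of r, but the slope depends on the units of the variables while r does not.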
Practical Applications of Scatter Plots:
In research: scatter plots are used to examine relationships between variables, predict
outcomes, and identify trends.
In business: they can help in identifying customer preferences or behaviors, understanding
market dynamics, and more.
In science: they’re used to study correlations between variables like temperature and
Functions of Correlation
• Predictive Analysis:
Correlation is a vital tool in forecasting because it establishes how two variables move together.
For example, a company might notice a positive correlation between the number of online ads
and website traffic. This relationship enables the company to predict future traffic based on
planned ad campaigns. Similarly, in climate studies, a strong correlation between atmospheric
CO2 levels and global temperatures allows scientists to forecast future climate changes.
• Decision-Making:
Businesses and researchers use correlation to make data-driven decisions. For instance:
In marketing, if there’s a strong correlation between social media engagement and sales, a
company might allocate more resources to social media campaigns.
In healthcare, understanding correlations between lifestyle factors (like diet and exercise) and
health outcomes can inform public health policies.
For researchers, correlation studies can guide hypotheses and determine the focus of deeper
investigations.
• Identifying Trends:
Trends in data often emerge from correlated variables. For example:
Retailers can observe seasonal trends by correlating sales data with months of the year,
enabling them to prepare for high-demand periods like holidays.
In finance, correlations between market indices and economic indicators help investors
understand and anticipate market trends.
In sociology, correlations between education levels and income patterns can provide insights
Assumptions of Correlation
1. Linearity
• What It Means: The relationship between the two variables should follow a
straight line. For non-linear patterns (e.g., curves), the correlation
coefficient understates or misses the relationship.
• Check: Look at a scatterplot to see if the points align roughly in a straight line.
• Fix: Use transformations (like log), or a rank-based method like Spearman’s
correlation when the relationship is non-linear but still monotonic.
2. Homoscedasticity
• What It Means: The spread of points (variance) should stay consistent across all
values of the variables.
• Check: A scatterplot should show an even spread of points, not a funnel shape
(narrow at one end, wide at the other).
• Fix: Transform the data (e.g., log or square root) or use robust methods.
3. No Outliers
• What It Means: Extreme data points can skew the results, inflating or reducing
the correlation.
• Check: Use scatterplots or boxplots to spot outliers.
• Fix: Remove or transform outliers, or use Spearman’s correlation, which is less
sensitive to them.
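The Spearman fix mentioned above can be sketched as Pearson's r computed on ranks. The helper names and both data sets are hypothetical: one is a curved but steadily increasing relationship, the other a straight line with a single outlier.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

x = list(range(1, 9))

# Hypothetical curved (exponential) but monotonic relationship.
curved_y = [2 ** v for v in x]
pearson_curved = pearson_r(x, curved_y)
spearman_curved = spearman_rho(x, curved_y)

# Hypothetical straight-line data with one outlier (the value 40).
outlier_y = [2, 4, 6, 40, 10, 12, 14, 16]
pearson_outlier = pearson_r(x, outlier_y)
spearman_outlier = spearman_rho(x, outlier_y)

print("curved: ", round(pearson_curved, 3), round(spearman_curved, 3))
print("outlier:", round(pearson_outlier, 3), round(spearman_outlier, 3))
```

In both cases the rank-based coefficient stays closer to the underlying increasing pattern than Pearson's r does, although a single outlier still lowers it somewhat.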
Applications of Correlation
1. Economics: Relationship between Income and Spending
Correlation plays a crucial role in economics, especially in understanding how various factors
influence consumer behavior. One notable example is the relationship between income and
spending.
• Positive Correlation: Generally, as income increases, spending also rises, particularly for
non-essential or luxury items. Conversely, a drop in income tends to reduce spending on
discretionary goods.
• Practical Application: This relationship helps governments and businesses forecast
demand, set economic policies, and design products for different income groups. For
instance, during economic downturns, businesses may focus on offering affordable
alternatives to retain consumers.
2. Finance: Stock Prices and Market Indices
In finance, correlation is critical for understanding the relationships between various market
factors. One key area is the correlation between individual stock prices and broader market
indices.
• Positive Correlation: Many stocks move in tandem with market indices, reflecting overall
market trends. For example, during a bull market, individual stock prices typically rise along
with the index.
• Negative Correlation: Some assets, like gold or other safe-haven investments, may show
an inverse correlation with stock indices, rising in value when markets decline.
Limitations of Correlation
• Correlation ≠ Causation
A correlation between two variables does not imply that one causes the other. There could
be other factors at play, or the relationship might be coincidental.
• Impact of Outliers
Outliers in the data can significantly influence the correlation coefficient, making it appear
stronger or weaker than it truly is.
• Confounding Variables
A hidden or unmeasured variable may affect both variables, creating a spurious relationship
that suggests a connection where none truly exists.
• Linear Relationship Assumption
Correlation measures the strength of a linear relationship. Nonlinear relationships may exist
but remain undetected by correlation.
• Limited Scope
Correlation is limited to pairs of variables and does not capture complex, multivariate
relationships.
• Magnitude Misinterpretation
A high correlation does not necessarily mean a meaningful relationship: in a small
sample, a large r can arise by chance, and a few extreme points can inflate it.
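The linear relationship limitation above can be made concrete with a minimal sketch. The data are hypothetical: y is completely determined by x (y = x squared), yet Pearson's r is zero because the relationship is symmetric rather than linear.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect U-shaped dependence of y on x.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_u_shape = pearson_r(xs, ys)
print("r for y = x^2:", r_u_shape)
```

A scatterplot would reveal the U shape immediately, which is why plotting the data is worth doing before relying on r.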
Thanks
For Your Attention
