Ruvan
MARGINAL DISTRIBUTION
Definition:
Mathematical Explanation:
For a joint probability distribution involving two random variables X and Y, the marginal
distribution of X is found by summing or integrating over all possible values of Y.
Mathematically:
For discrete random variables:
P(X = x) = Σ_y P(X = x, Y = y)
This means we sum the joint probability over all possible values of Y to get the
marginal probability for X.
For continuous random variables:
f_X(x) = ∫ f_{X,Y}(x, y) dy
In this case, we integrate the joint probability density function f_{X,Y}(x, y) over all
possible values of Y.
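As a worked illustration of the continuous case, assume a hypothetical joint density f_{X,Y}(x, y) = x + y on the unit square. This density is an assumption chosen because it integrates easily; it does not come from these notes. Integrating out y gives the marginal density of X, and integrating that marginal over x confirms it is a valid density:

```latex
f_{X,Y}(x, y) = x + y, \qquad 0 \le x \le 1,\ 0 \le y \le 1

f_X(x) = \int_0^1 (x + y)\, dy
       = \left[ xy + \tfrac{1}{2} y^2 \right]_{y=0}^{y=1}
       = x + \tfrac{1}{2}, \qquad 0 \le x \le 1

\int_0^1 f_X(x)\, dx = \tfrac{1}{2} + \tfrac{1}{2} = 1
```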
Example:
Consider a dataset where we have two variables: the number of hours studied (X) and the
score on a test (Y). If you are only interested in the distribution of hours studied, regardless of
the test score, you would look at the marginal distribution of X.
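A minimal sketch of this idea in Python, assuming a small made-up list of (hours studied, test score) records. The data values and the function name `empirical_marginal_x` are hypothetical and only serve the illustration. The marginal distribution of X is estimated by counting how often each value of hours studied occurs while ignoring the score:

```python
from collections import Counter

# Hypothetical (hours_studied, test_score) records; illustrative values only.
records = [
    (1, 55), (2, 60), (2, 70), (3, 80),
    (3, 90), (3, 85), (4, 95), (4, 90),
]


def empirical_marginal_x(pairs):
    """Estimate P(X = x) from (x, y) pairs by counting x and ignoring y."""
    counts = Counter(x for x, _ in pairs)
    total = len(pairs)
    return {x: counts[x] / total for x in sorted(counts)}


marginal_hours = empirical_marginal_x(records)
for hours, prob in marginal_hours.items():
    print(f"P(X = {hours}) = {prob:.3f}")
```

Because the test scores are discarded before counting, the result describes hours studied alone, which is exactly what "regardless of the test score" means.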
Key Points:
Applications:
CONDITIONAL DISTRIBUTION
Definition:
Conditional distributions are important for understanding the relationships between variables,
and they play a fundamental role in statistics, especially when analyzing how one variable
behaves under certain conditions.
Mathematical Explanation:
For discrete random variables:
P(Y = y | X = x) = P(X = x, Y = y) / P(X = x), provided P(X = x) > 0
Here, the conditional probability is computed by dividing the joint probability
P(X = x, Y = y) by the marginal probability of X, P(X = x).
For continuous random variables:
f_{Y|X}(y | x) = f_{X,Y}(x, y) / f_X(x), provided f_X(x) > 0
In this formula, the conditional probability density function f_{Y|X}(y | x) is the ratio of
the joint probability density function f_{X,Y}(x, y) to the marginal density of X, f_X(x).
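As a worked illustration of the continuous formula, assume a hypothetical joint density f_{X,Y}(x, y) = x + y on the unit square, whose marginal is f_X(x) = x + 1/2. This density is an assumption chosen for easy integration, not something taken from these notes. Dividing the joint density by the marginal gives the conditional density, and integrating over y shows that it is a valid density for every fixed x:

```latex
f_{X,Y}(x, y) = x + y, \qquad f_X(x) = \int_0^1 (x + y)\, dy = x + \tfrac{1}{2},
\qquad 0 \le x \le 1,\ 0 \le y \le 1

f_{Y \mid X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)} = \frac{x + y}{x + \tfrac{1}{2}},
\qquad 0 \le y \le 1

\int_0^1 f_{Y \mid X}(y \mid x)\, dy = \frac{x + \tfrac{1}{2}}{x + \tfrac{1}{2}} = 1
```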
Example:
Suppose you're analyzing the relationship between hours studied (X) and test
scores (Y), and you want to understand how the distribution of test scores changes given that
a student has studied for 3 hours. The conditional distribution of Y given X = 3 would tell
you the probabilities of different test scores for students who studied for exactly 3 hours.
If you know that the joint distribution of hours studied and test scores includes values like
P(X = 3, Y = 90), P(X = 3, Y = 80), etc., you could compute the conditional probabilities by
dividing the joint probabilities by the marginal probability of X = 3.
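The division described above can be sketched in Python. The joint probabilities below are hypothetical values invented for illustration (they sum to 1 but are not data from these notes), and `conditional_y_given_x` is a name chosen for this sketch:

```python
# Hypothetical joint probabilities P(X = hours, Y = score); illustrative only.
joint = {
    (2, 70): 0.10, (2, 80): 0.15, (2, 90): 0.05,
    (3, 70): 0.05, (3, 80): 0.20, (3, 90): 0.25,
    (4, 80): 0.05, (4, 90): 0.15,
}


def conditional_y_given_x(joint_probs, x):
    """Return P(Y = y | X = x) for every y that has a joint entry with this x."""
    # Marginal P(X = x): sum the joint probabilities over all y.
    marginal_x = sum(p for (xi, _), p in joint_probs.items() if xi == x)
    if marginal_x == 0:
        raise ValueError(f"P(X = {x}) is zero; the conditional is undefined")
    # Conditional: each joint probability divided by the marginal.
    return {y: p / marginal_x for (xi, y), p in joint_probs.items() if xi == x}


scores_given_3_hours = conditional_y_given_x(joint, 3)
for score, prob in sorted(scores_given_3_hours.items()):
    print(f"P(Y = {score} | X = 3) = {prob:.2f}")
```

Under these assumed values, the conditional probabilities for X = 3 add up to 1, as they must for any fixed value of X with positive probability.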
Key Points:
Conditional distributions reveal how one variable behaves when we have specific
information about another variable.
They allow us to model and interpret dependencies between variables in a dataset.
In contrast to marginal distributions, conditional distributions provide a more detailed
understanding of relationships between variables.
For any fixed value of the conditioning variable, the conditional probabilities over all
values of the other variable must sum (or integrate) to 1.
Applications:
Conditional distributions are widely used in areas such as regression analysis and
Bayesian statistics, where you model the dependency between variables.
In machine learning, conditional distributions help in building models such as
Bayesian networks or Markov chains, where the goal is to predict
one variable based on the value of another.
In medicine, for example, conditional distributions can model the likelihood of a
disease given certain symptoms or test results.
Illustrative Example:
Imagine you have a joint distribution for the number of hours studied (X) and the test score
(Y):
Conclusion: