Matlab Assignment
If $B \subset S_X$ then
$P_X(B) = P(X^{-1}(B))$.
Which Model is Better? New model or Old model?
If we know $P_X$, then we can answer most (if not all) questions that interest us. So
unless the question is very, very complicated, we can focus on the
new model and put the old one on the shelf.
$P_X$ is a Set Function
$P_X$ acts on sets of numbers. It gives us the information that we
are looking for if we can formulate the question in the language of
sets. We think of a point as a singleton set - a set consisting of
exactly one element.
Non-discrete RVs
When $S_X$ is not countable, it’s no longer useful to think of the
points in $S_X$ as being particulate. In this case two situations arise:
Continuous RVs: No point of $S_X$ has any mass; in other
words,
$P_X(x) = 0, \quad \forall x \in S_X$.
Mixed RVs: Some points in $S_X$ have mass and the remaining
points don’t have any mass. This situation will be discussed
later.
Discrete Random Variables
$P_X(x) = P_X(\{x\}), \quad \forall x \in S_X$.
When B is a finite subset of $S_X$, summing up the PMF over the values
in B is straightforward. When B is countably infinite, we use the
following technique:
We can sum up the PMF over B because we know how to add the
terms of a convergent series.
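The technique referred to here is presumably the usual one: enumerate B as $x_1, x_2, \ldots$ and take $P_X(B) = \sum_{i=1}^{\infty} P_X(x_i)$, the limit of the partial sums. A minimal Python sketch of this, assuming a hypothetical PMF $P_X(k) = (1/2)^k$ on $S_X = \{1, 2, 3, \ldots\}$ and taking B to be the even numbers (neither choice comes from the lecture):

```python
from fractions import Fraction

def pmf(k):
    """Hypothetical PMF on S_X = {1, 2, 3, ...}: P_X(k) = (1/2)**k."""
    return Fraction(1, 2) ** k

def partial_sum_over_evens(n_terms):
    """Sum the PMF over the first n_terms elements of B = {2, 4, 6, ...}."""
    return sum(pmf(2 * i) for i in range(1, n_terms + 1))

# The partial sums increase towards the limit of the geometric series,
# which is (1/4) / (1 - 1/4) = 1/3. That limit is P_X(B).
for n in (1, 2, 5, 30):
    print(n, float(partial_sum_over_evens(n)))
```

Exact fractions are used so that the gap between each partial sum and the limit 1/3 can be inspected without rounding error.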
Intuitively, a set of measure zero is a set which has zero length - for
example a finite or countable set of points. All conditions imposed
on a pdf need only be satisfied outside of a set of measure zero.
The idea is that a set of measure zero does not contribute to
the integral. A point has zero length, so the slice above the point
has zero area.
Paradigm Shift
The definite integral of a function is the area under the curve of
the function.
In high school we’re used to thinking of integrals as antiderivatives.
However, if we think of definite integrals as areas, it not only helps
intuitively, but it also helps approximate the integral when no
antiderivative is available.
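As an illustration of the area viewpoint, here is a minimal Python sketch that approximates an integral by adding up the areas of thin trapezoids. The integrand is the standard normal density, chosen here only because it has no elementary antiderivative; the function names and the number of slices are assumptions made for this example.

```python
import math

def standard_normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def area_under_curve(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with n trapezoids."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

approx = area_under_curve(standard_normal_pdf, -1.0, 1.0)
# Reference value of P(-1 < Z < 1), computed with the error function.
reference = math.erf(1.0 / math.sqrt(2.0))
print(approx, reference)
```

The two printed numbers should agree to several decimal places, even though no antiderivative was used.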
The relationship between the CDF and the PDF of a continuous
random variable X is given by
$f_X(x) = \frac{dF_X}{dx}$.
As the pdf is only defined up to a set of measure zero, $F_X$ is only
required to be differentiable at all points except on a set of
measure zero. This condition is usually met quite easily; the reasons
behind it will be explained in more advanced courses.
Caution
Geometrically, the PDF is the slope of the CDF. The slope
measures the rate at which cumulative probability accumulates
at a point and should not be confused with the actual probability at
that point, which is zero for a continuous RV!
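A small numerical sketch may make the caution concrete. It assumes an exponential random variable with rate 3; the rate, the evaluation point and the step sizes are all hypothetical choices for this example. The slope of the CDF matches the PDF and exceeds 1, so it cannot be a probability, whereas the probability of a short interval is roughly the slope times the interval length.

```python
import math

RATE = 3.0  # hypothetical rate parameter, chosen for illustration

def cdf(x):
    """CDF of an exponential random variable with the rate above."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0

def pdf(x):
    """PDF of the same random variable."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0

x, h = 0.1, 1e-5
# Central-difference estimate of the slope of the CDF at x.
slope = (cdf(x + h) - cdf(x - h)) / (2 * h)
# Actual probability that X falls in the short interval [x, x + 0.01].
interval_prob = cdf(x + 0.01) - cdf(x)
print(slope, pdf(x), interval_prob)
```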
Practical Applications
We’ve already discussed examples of discrete random variables at
length, and as far as I know, this concept is clear to most of you. I
did, however, get the impression that the same is not true for
continuous RVs. What follows is an attempt to answer the
following question.
Question
What practical situations are best modelled using the families of
continuous random variables that we’ve seen?
Uniform Distribution
This is typically used to model a situation in which all outcomes
are equally likely. Examples are:
A meteor strikes the earth. The location where it lands is
observed.
An office worker glances at his watch. He observes the minute
hand.
Questions: In the second example, if the worker observes the hour
hand, would all outcomes still be (approximately) equally likely?
Should separate models be used for digital and analog watches?
Question: Consider an experiment in which an arrow is shot at a
target and the position where it lands is observed. Under what
conditions would we model this using a uniform distribution?
A Thought Experiment
Consider the following thought experiment. We measure the amount of
time that elapses until the Riemann hypothesis is proved, starting from
1859, when it was first conjectured. Is the probability
that the problem remains open for a total of 200 years less or more
than the probability that it will remain open for 200 years starting
now? Are the probabilities equal? Mathematically, we can express
this question in the following manner.
Let X be the amount of time elapsed until the Riemann hypothesis
is proved, starting from 1859. The current year is 2020. So
approximately 161 years have elapsed so far. Is
$P(X > 200) = P(X > 361 \mid X > 161)$?
To tell you the truth, I have no idea what the answer to this
question is. But if the above probabilities were equal then we
would say that X is memoryless.
Exponential Distribution
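The exponential distribution is the standard continuous model for a memoryless waiting time. As a sketch, assume X is exponential with rate $\lambda > 0$ (a symbol introduced here for illustration), so that $P(X > x) = e^{-\lambda x}$ for $x \ge 0$. Then for any $s, t \ge 0$:

```latex
P(X > s + t \mid X > s)
  = \frac{P(X > s + t)}{P(X > s)}
  = \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}}
  = e^{-\lambda t}
  = P(X > t).
```

Taking s = 161 and t = 200 gives the kind of equality asked about in the thought experiment: the time already elapsed does not change the distribution of the remaining wait.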
There are a lot more distributions than the three just mentioned.
Students are encouraged to find out for themselves how to choose
between them, as further investigation is presently beyond our
scope.
Most of the practice problems in the text are based on pre-defined
models. It’s a great idea to solve all of them, to be well-acquainted
with the formulas and techniques and how to use them. A few
examples of how to do this have been presented in class. At the
same time it is important not to forget the bigger picture - the
question of how these models were arrived at.
Matlab Assignment - 10% of your grade, due 15th April
Experiment 1
Observe the time gaps between your next 30 WhatsApp
messages. (You may replace WhatsApp with any instant
messaging app of your choice.)
Plot a histogram of your data using Matlab.
Fit a density function to your histogram, using the
appropriate Matlab tools.
Based on your distribution, find the probability that the time
elapsed until your next message is less than the expected time
gap.
Assignment - continued.
Experiment 2
Repeat Experiment 1, but this time, record the time gaps between
messages from one person - a person you communicate with
sufficiently often.