How To Calculate P Value
A p value is a statistical measure that helps scientists judge whether their data are
consistent with a hypothesis. It indicates how likely results at least as extreme as the
ones observed would be if only chance variation were at work.
Usually, if the p value of a data set is below a certain pre-determined amount (like, for
instance, 0.05), scientists will reject the "null hypothesis" of their experiment - in other
words, they'll rule out the hypothesis that the variables of their experiment
had no meaningful effect on the results. For counted data like the example in this
article, p values can be found on a reference table by first calculating a chi square value.
Steps
1
Determine your experiment's expected results. Usually, when scientists conduct an
experiment and observe the results, they have an idea of what "normal" or "typical"
results will look like beforehand. This can be based on past experimental results, trusted
sets of observational data, scientific literature, and/or other sources. For your
experiment, determine your expected results and express them as a number.
Example: Let's say prior studies have shown that, nationally, speeding
tickets are given more often to red cars than they are to blue cars. Let's say the average
results nationally show a 2:1 preference for red cars. We want to find out whether or not
the police in our town also demonstrate this bias by analyzing speeding tickets given by
our town's police. If we take a random pool of 150 speeding tickets given to either red or
blue cars in our town, we would expect 100 to be for red cars and 50 to be for blue
cars if our town's police force gives tickets according to the national bias.
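The expected counts above can be checked with a short sketch. The helper name `expected_counts` and the dictionary layout are illustrative assumptions, not part of the original method; the only inputs taken from the example are the 150 tickets and the 2:1 ratio.

```python
def expected_counts(total, ratio):
    """Split `total` observations across categories in proportion to `ratio`."""
    ratio_sum = sum(ratio.values())
    return {category: total * weight / ratio_sum for category, weight in ratio.items()}

# 150 tickets split according to the national 2:1 red-to-blue ratio
expected = expected_counts(150, {"red": 2, "blue": 1})
print(expected)  # {'red': 100.0, 'blue': 50.0}
```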
2
Determine your experiment's observed results. Now that you've determined your
expected values, you can conduct your experiment and find your actual (or "observed")
values.
Example: Let's say that, in our town, we randomly selected 150 speeding
tickets which were given to either red or blue cars. We found that 90 tickets were for red
cars and 60 were for blue cars. These differ from our expected results
of 100 and 50, respectively. Did our experimental manipulation (in this case, changing
the source of our data from a national one to a local one) cause this change in results,
or are our town's police as biased as the national average suggests, and we're just
observing a chance variation? A p value will help us determine this.
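3
Determine your experiment's degrees of freedom. For this kind of test, the degrees of
freedom equal the number of categories of results minus one.
Example: Our experiment has two categories of results: one for red cars
and one for blue cars. Thus, in our experiment, we have 2 - 1 = 1 degree of freedom. If
we had compared red, blue, and green cars, we would have 2 degrees of freedom, and
so on.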
4
Compare expected results to observed results with chi square. Chi square (written
"χ²") is a numerical value that measures the difference between an
experiment's expected and observed values. The equation for chi square is:
χ² = Σ((o - e)²/e), where "o" is the observed value and "e" is the expected value. [1] Sum the results
of this equation for all possible outcomes (see below).
Note that this equation includes a Σ (sigma) operator. In other words, you'll
need to calculate ((o - e)²/e) for each possible outcome, then add the results to get
your chi square value. In our example, we have two outcomes - either the car that
received a ticket is red or blue. Thus, we would calculate ((o - e)²/e) twice - once for red
cars and once for blue cars.
Example: Let's plug our expected and observed values into the equation
χ² = Σ((o - e)²/e). Keep in mind that, because of the sigma operator, we'll need to perform
((o - e)²/e) twice - once for red cars and once for blue cars. Our work would go as follows:
χ² = ((90 - 100)²/100) + ((60 - 50)²/50)
χ² = ((-10)²/100) + ((10)²/50)
χ² = (100/100) + (100/50) = 1 + 2 = 3.
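The same arithmetic can be sketched in a few lines of Python. The function name `chi_square` and the list ordering (red first, then blue) are assumptions made for illustration; the numbers are the observed and expected counts from the example.

```python
def chi_square(observed, expected):
    """Sum (o - e)^2 / e over every outcome category."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [90, 60]    # red, blue tickets found in our town
expected = [100, 50]   # red, blue tickets predicted by the 2:1 national ratio

statistic = chi_square(observed, expected)
print(statistic)  # 3.0
```

Each term in the sum is one pass through ((o - e)²/e), so adding a third category such as green cars would only mean adding one more entry to each list.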
5
Choose a significance level. Now that we know our experiment's degrees of freedom
and our chi square value, there's just one last thing we need to do before we can find
our p value - we need to decide on a significance level. Basically, the significance level
is a measure of how strong we require the evidence to be before we reject the null
hypothesis - a low significance level means we accept only a low probability of rejecting
the null hypothesis when the difference was really due to chance, and
vice versa. Significance levels are written as a decimal (such as 0.01), which
corresponds to the chance we are willing to accept of making this mistake
(in this case, 1%).
Example: For our red and blue car example, let's follow scientific
convention and set our significance level at 0.05.
6
Use a chi square distribution table to approximate your p-value. Scientists and
statisticians use large tables of values to look up the p value for their experiment.
These tables are generally set up with the vertical axis on the left corresponding to
degrees of freedom and the horizontal axis on the top corresponding to p-value. Use
these tables by first finding your degrees of freedom, then reading that row across from
the left to the right until you find the first value bigger than your chi square value. Look at
the corresponding p value at the top of the column - your p value is between this value
and the next-largest value (the one immediately to the left of it).
If you don't have a chi square distribution table handy, use a free online table, like
the one provided by medcalc.org.
Example: Our chi square value was 3. So, let's use a chi square
distribution table to find an approximate p value. Since we know our
experiment has only 1 degree of freedom, we'll start in the top row. We'll go from left
to right along this row until we find a value higher than 3 - our chi square value. The first
one we encounter is 3.84. Looking to the top of this column, we see that the
corresponding p value is 0.05. This means that our p value is between 0.05 and
0.1 (the next-biggest p value on the table).
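The table lookup can be mimicked with a small sketch. The row below holds commonly tabulated critical values for 1 degree of freedom; the names `DF1_ROW` and `bracket_p_value` are hypothetical, and a printed table may round these values slightly differently (for instance 3.84 instead of 3.841). As a cross-check that goes beyond the table method, the last lines compute the p value directly, which for 1 degree of freedom only can be done with the standard library's complementary error function.

```python
import math

# (p value, critical chi square value) pairs for 1 degree of freedom,
# ordered left to right as they appear in a typical table row
DF1_ROW = [(0.10, 2.706), (0.05, 3.841), (0.025, 5.024), (0.01, 6.635), (0.001, 10.828)]

def bracket_p_value(chi_sq, row):
    """Return (lower, upper) bounds on p; None means the bound is off the table."""
    upper = None
    for p, critical in row:
        if critical > chi_sq:
            # First table value bigger than our statistic: p lies above this column's p
            return (p, upper)
        upper = p
    return (None, upper)

print(bracket_p_value(3.0, DF1_ROW))  # (0.05, 0.1)

# Direct calculation, valid for 1 degree of freedom only:
# P(chi square > x) = erfc(sqrt(x / 2))
p_exact = math.erfc(math.sqrt(3.0 / 2))
print(round(p_exact, 3))  # 0.083
```

The direct value falls inside the range read from the table, which is what we would expect if the lookup was done correctly.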
7
Decide whether to reject or keep your null hypothesis. Since you have found an
approximate p value for your experiment, you can decide whether or not to reject the null
hypothesis of your experiment (as a reminder, this is the hypothesis that the
experimental variables you manipulated did not affect the results you observed). If your
p value is lower than your significance level, congratulations - your results are
statistically significant, meaning a difference this large would be unlikely if only chance
were at work, so you can reject the null hypothesis. This is evidence of a relationship
between the variables you manipulated and the results you observed, not proof of one.
If your p value is higher than your significance level, you can't
say with confidence whether the results you observed were the result of pure chance or
of your experimental manipulation.
Example: Our p value is between 0.05 and 0.1. This means that it's
definitely not smaller than 0.05, so, unfortunately, we can't reject our null hypothesis.
This means that we didn't reach the minimum 95% threshold of certainty we decided
upon to be able to say that our town's police give tickets to red and blue cars at a rate
that's significantly different from the national average.
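The decision rule can be written as a brief sketch that works from the range read off the table. The function name `decide` and the three outcome labels are assumptions chosen for illustration; the third label covers the case where the table's range straddles the significance level and a finer table would be needed.

```python
def decide(p_low, p_high, alpha=0.05):
    """Compare a tabulated p value range (p_low < p < p_high) with the significance level."""
    if p_high <= alpha:
        return "reject"            # the whole range is below the significance level
    if p_low >= alpha:
        return "fail to reject"    # the whole range is above the significance level
    return "inconclusive from table"

# Our p value lies between 0.05 and 0.1, and our significance level is 0.05
decision = decide(0.05, 0.10, alpha=0.05)
print(decision)  # fail to reject
```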