
Hypothesis Test

1. Hypothesis testing involves formulating competing claims called the null hypothesis (H0) and alternative hypothesis (H1). Special consideration is given to the null hypothesis. 2. A hypothesis test results in either rejecting the null hypothesis in favor of the alternative, or not rejecting the null hypothesis. Type I and type II errors can occur. 3. Key aspects of hypothesis testing include the test statistic, critical value(s), critical region, significance level, and the two types of possible errors.

Uploaded by

Luis Valens

Hypothesis Test

Setting up and testing hypotheses is an essential part of statistical inference. In order to formulate such a test, usually some theory has been put forward, either because it is believed to be true or because it is to be used as a basis for argument, but has not been proved; for example, the claim that a new drug is better than the current drug for treatment of the same symptoms.

In each problem considered, the question of interest is simplified into two competing claims (hypotheses) between which we have a choice: the null hypothesis, denoted H0, against the alternative hypothesis, denoted H1. These two competing hypotheses are not, however, treated on an equal basis: special consideration is given to the null hypothesis.

We have two common situations:

1. The experiment has been carried out in an attempt to disprove or reject a particular hypothesis, the null hypothesis. We therefore give that one priority, so that it cannot be rejected unless the evidence against it is sufficiently strong. For example, H0: there is no difference in taste between Coke and Diet Coke, against H1: there is a difference.

2. If one of the two hypotheses is 'simpler', we give it priority, so that a more 'complicated' theory is not adopted unless there is sufficient evidence against the simpler one. For example, it is 'simpler' to claim that there is no difference in flavour between Coke and Diet Coke than it is to say that there is a difference.

The hypotheses are often statements about population parameters like expected value and variance; for example, H0 might be that the expected value of the height of ten-year-old boys in the Scottish population is not different from that of ten-year-old girls. A hypothesis might also be a statement about the distributional form of a characteristic of interest, for example that the height of ten-year-old boys is normally distributed within the Scottish population.

The outcome of a hypothesis test is "Reject H0 in favour of H1" or "Do not reject H0".

Null Hypothesis

The null hypothesis, H0, represents a theory that has been put forward, either because it is believed to be true or because it is to be used as a basis for argument, but has not been proved. For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug. We would write
H0: there is no difference between the two drugs on average.

We give special consideration to the null hypothesis. This is due to the fact that the null hypothesis relates to the statement being tested, whereas the alternative hypothesis relates to the statement to be accepted if and when the null is rejected.

The final conclusion once the test has been carried out is always given in terms of the null hypothesis. We either "Reject H0 in favour of H1" or "Do not reject H0"; we never conclude "Reject H1", or even "Accept H1".

If we conclude "Do not reject H0", this does not necessarily mean that the null hypothesis is true; it only suggests that there is not sufficient evidence against H0 in favour of H1. Rejecting the null hypothesis, then, suggests that the alternative hypothesis may be true.

See also hypothesis test.

Alternative Hypothesis

The alternative hypothesis, H1, is a statement of what a statistical hypothesis test is set up to establish. For example, in a clinical trial of a new drug, the alternative hypothesis might be that the new drug has a different effect, on average, compared to that of the current drug. We would write
H1: the two drugs have different effects, on average.

The alternative hypothesis might also be that the new drug is better, on average, than the current drug. In this case we would write
H1: the new drug is better than the current drug, on average.

The final conclusion once the test has been carried out is always given in terms of the null hypothesis. We either "Reject H0 in favour of H1" or "Do not reject H0". We never conclude "Reject H1", or even "Accept H1".

If we conclude "Do not reject H0", this does not necessarily mean that the null hypothesis is true; it only suggests that there is not sufficient evidence against H0 in favour of H1. Rejecting the null hypothesis, then, suggests that the alternative hypothesis may be true.

Simple Hypothesis

A simple hypothesis is a hypothesis which specifies the population distribution completely.

Examples
1. H0: X ~ Bi(100, 1/2), i.e. p is specified
2. H0: X ~ N(5, 20), i.e. μ and σ² are specified

See also composite hypothesis.

Composite Hypothesis

A composite hypothesis is a hypothesis which does not specify the population distribution completely.

Examples
1. X ~ Bi(100, p) and H1: p > 0.5
2. X ~ N(0, σ²) and H1: σ² unspecified

See also simple hypothesis.

Type I Error

In a hypothesis test, a type I error occurs when the null hypothesis is rejected when it is in fact true; that is, H0 is wrongly rejected.

For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug; i.e.
H0: there is no difference between the two drugs on average.
A type I error would occur if we concluded that the two drugs produced different effects when in fact there was no difference between them.

The following table gives a summary of possible results of any hypothesis test:

                          Decision
                   Reject H0         Don't reject H0
Truth    H0        Type I error      Right decision
         H1        Right decision    Type II error

A type I error is often considered to be more serious, and therefore more important to avoid, than a type II error. The hypothesis test procedure is therefore adjusted so that there is a guaranteed 'low' probability of rejecting the null hypothesis wrongly; this probability is never 0. This probability of a type I error can be precisely computed as
P(type I error) = significance level = α

The exact probability of a type II error is generally unknown.

If we do not reject the null hypothesis, it may still be false (a type II error), as the sample may not be big enough to identify the falseness of the null hypothesis (especially if the truth is very close to the hypothesis).

For any given set of data, type I and type II errors are inversely related; the smaller the risk of one, the higher the risk of the other.

A type I error can also be referred to as an error of the first kind.
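The inverse relationship between the two error probabilities can be illustrated with a small exact calculation. The sketch below is only an illustration under assumed values: it takes X ~ Bi(100, p) with H0: p = 0.5, supposes (hypothetically) that the true value is p = 0.6, and uses rejection rules of the form "reject H0 when X >= c". The helper names `upper_tail` and `error_rates` are made up for this example.

```python
from math import comb

def upper_tail(n, p, c):
    """Return P(X >= c) for X ~ Bi(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

def error_rates(n, p0, p1, c):
    """Error probabilities for the rule 'reject H0 when X >= c'."""
    alpha = upper_tail(n, p0, c)      # P(reject H0 | H0 true): type I error
    beta = 1 - upper_tail(n, p1, c)   # P(do not reject H0 | p = p1): type II error
    return alpha, beta

# Assumed setting: n = 100, H0: p = 0.5, hypothetical true value p = 0.6.
cutoffs = range(55, 66)
rates = [error_rates(100, 0.5, 0.6, c) for c in cutoffs]
for c, (alpha, beta) in zip(cutoffs, rates):
    print(f"reject when X >= {c}: alpha = {alpha:.4f}, beta = {beta:.4f}")
```

As the cut-off c is raised, the printed type I error probability falls while the type II error probability rises, for the same sample size.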

Type II Error

In a hypothesis test, a type II error occurs when the null hypothesis, H0, is not rejected when it is in fact false.

For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug; i.e.
H0: there is no difference between the two drugs on average.
A type II error would occur if it was concluded that the two drugs produced the same effect, i.e. there is no difference between the two drugs on average, when in fact they produced different ones.

A type II error is frequently due to sample sizes being too small.

The probability of a type II error is generally unknown, but is symbolised by β and written
P(type II error) = β

A type II error can also be referred to as an error of the second kind.

Compare type I error. See also power.

Test Statistic

A test statistic is a quantity calculated from our sample of data. Its value is used to decide whether or not the null hypothesis should be rejected in our hypothesis test. The choice of a test statistic will depend on the assumed probability model and the hypotheses under question.

Critical Value(s)

The critical value(s) for a hypothesis test is a threshold to which the value of the test statistic in a sample is compared to determine whether or not the null hypothesis is rejected. The critical value for any hypothesis test depends on the significance level at which the test is carried out, and whether the test is one-sided or two-sided.

See also critical region.
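As a rough sketch of how a critical value follows from the significance level, the code below finds the smallest cut-off c for a one-sided test of H0: p = 0.5 against H1: p > 0.5 when X ~ Bi(100, p), so that the probability of rejecting a true H0 does not exceed 5%. The function names and the observed count of 61 are assumptions made purely for illustration.

```python
from math import comb

def upper_tail(n, p, c):
    """Return P(X >= c) for X ~ Bi(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

def critical_value(n, p0, alpha):
    """Smallest c with P(X >= c | p = p0) <= alpha (upper-tail, one-sided test)."""
    for c in range(n + 2):  # c = n + 1 gives an empty tail, so a value is always found
        if upper_tail(n, p0, c) <= alpha:
            return c

n, p0, alpha = 100, 0.5, 0.05
c = critical_value(n, p0, alpha)
size = upper_tail(n, p0, c)   # attained type I error probability

observed = 61  # hypothetical observed value of the test statistic
decision = "Reject H0" if observed >= c else "Do not reject H0"
print(f"critical value = {c}, attained size = {size:.4f}, decision: {decision}")
```

Because the binomial distribution is discrete, the attained size is generally a little below the nominal significance level rather than exactly equal to it.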

Critical Region

The critical region CR, or rejection region RR, is a set of values of the test statistic for which the null hypothesis is rejected in a hypothesis test. That is, the sample space for the test statistic is partitioned into two regions; one region (the critical region) will lead us to reject the null hypothesis H0, the other will not. So, if the observed value of the test statistic is a member of the critical region, we conclude "Reject H0"; if it is not a member of the critical region then we conclude "Do not reject H0".

See also critical value. See also test statistic.

Significance Level

The significance level of a statistical hypothesis test is a fixed probability of wrongly rejecting the null hypothesis H0, if it is in fact true.

It is the probability of a type I error and is set by the investigator in relation to the consequences of such an error. That is, we want to make the significance level as small as possible in order to protect the null hypothesis and to prevent, as far as possible, the investigator from inadvertently making false claims.

The significance level is usually denoted by α:
Significance Level = P(type I error) = α

Usually, the significance level is chosen to be 0.05 (or equivalently, 5%).

P-Value

The probability value (p-value) of a statistical hypothesis test is the probability of getting a value of the test statistic as extreme as or more extreme than that observed by chance alone, if the null hypothesis H0 is true.

It is equal to the significance level of the test for which we would only just reject the null hypothesis. The p-value is compared with the actual significance level of our test and, if it is smaller, the result is significant. That is, if the null hypothesis were to be rejected at the 5% significance level, this would be reported as "p < 0.05".

Small p-values indicate that data as extreme as those observed would be unlikely if the null hypothesis were true, and so count as evidence against it. The smaller the p-value, the more convincing is the rejection of the null hypothesis. It indicates the strength of evidence for, say, rejecting the null hypothesis H0, rather than simply concluding "Reject H0" or "Do not reject H0".
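The definition can be made concrete with an exact calculation. The sketch below assumes a test statistic X ~ Bi(100, p), a null hypothesis H0: p = 0.5 and a one-sided alternative H1: p > 0.5, so that "as extreme as or more extreme than" means "at least as large as the observed count". The function name and the two observed counts are hypothetical.

```python
from math import comb

def p_value_upper(n, p0, observed):
    """P(X >= observed) when X ~ Bi(n, p0), i.e. computed under H0."""
    return sum(comb(n, k) * p0**k * (1 - p0)**(n - k)
               for k in range(observed, n + 1))

alpha = 0.05  # significance level chosen before looking at the data
results = {}
for observed in (55, 61):  # hypothetical observed counts
    p = p_value_upper(100, 0.5, observed)
    verdict = "Reject H0" if p < alpha else "Do not reject H0"
    results[observed] = (p, verdict)
    print(f"observed = {observed}: p-value = {p:.4f} -> {verdict}")
```

Reporting the p-value itself, rather than only the verdict, shows how strong the evidence against H0 is in each case.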

Power

The power of a statistical hypothesis test measures the test's ability to reject the null hypothesis when it is actually false; that is, to make a correct decision.

In other words, the power of a hypothesis test is the probability of not committing a type II error. It is calculated by subtracting the probability of a type II error from 1, usually expressed as:
Power = 1 - P(type II error) = 1 - β

The maximum power a test can have is 1, the minimum is 0. Ideally we want a test to have high power, close to 1.
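Power depends on how far the truth lies from the null hypothesis. The following sketch assumes X ~ Bi(100, p), H0: p = 0.5, and the rejection rule "reject H0 when X >= 59" (taken here as the cut-off of a one-sided test at roughly the 5% level). The candidate true values of p are hypothetical and chosen only to show the trend.

```python
from math import comb

def upper_tail(n, p, c):
    """Return P(X >= c) for X ~ Bi(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

n, c = 100, 59  # assumed rejection rule: reject H0 when X >= 59

powers = {}
for p_true in (0.5, 0.55, 0.6, 0.7):  # hypothetical true values of p
    power = upper_tail(n, p_true, c)  # P(reject H0 | p = p_true)
    beta = 1 - power                  # P(type II error) when p_true != 0.5
    powers[p_true] = power
    print(f"p = {p_true}: power = {power:.4f}, beta = {beta:.4f}")
```

At p = 0.5 the "power" is just the type I error probability; it rises towards 1 as the true p moves further above 0.5.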

One-sided Test

A one-sided test is a statistical hypothesis test in which the values for which we can reject the null hypothesis, H0, are located entirely in one tail of the probability distribution.

In other words, the critical region for a one-sided test is the set of values less than the critical value of the test, or the set of values greater than the critical value of the test.

A one-sided test is also referred to as a one-tailed test of significance.

The choice between a one-sided and a two-sided test is determined by the purpose of the investigation or prior reasons for using a one-sided test.

Example
Suppose we wanted to test a manufacturer's claim that there are, on average, 50 matches in a box. We could set up the following hypotheses:
H0: μ = 50, against
H1: μ < 50 or H1: μ > 50
Either of these two alternative hypotheses would lead to a one-sided test. Presumably, we would want to test the null hypothesis against the first alternative hypothesis, since it would be useful to know if there are likely to be fewer than 50 matches, on average, in a box (no one would complain if they get the correct number of matches in a box or more).

Yet another alternative hypothesis could be tested against the same null, leading this time to a two-sided test:
H0: μ = 50, against
H1: μ not equal to 50
Here, nothing specific can be said about the average number of matches in a box; only that, if we could reject the null hypothesis in our test, we would know that the average number of matches in a box is likely to be less than or greater than 50.

Two-Sided Test

A two-sided test is a statistical hypothesis test in which the values for which we can reject the null hypothesis, H0, are located in both tails of the probability distribution.

In other words, the critical region for a two-sided test is the set of values less than a first critical value of the test and the set of values greater than a second critical value of the test.

A two-sided test is also referred to as a two-tailed test of significance.

The choice between a one-sided test and a two-sided test is determined by the purpose of the investigation or prior reasons for using a one-sided test.

Example
Suppose we wanted to test a manufacturer's claim that there are, on average, 50 matches in a box. We could set up the following hypotheses:
H0: μ = 50, against
H1: μ < 50 or H1: μ > 50
Either of these two alternative hypotheses would lead to a one-sided test. Presumably, we would want to test the null hypothesis against the first alternative hypothesis, since it would be useful to know if there are likely to be fewer than 50 matches, on average, in a box (no one would complain if they get the correct number of matches in a box or more).

Yet another alternative hypothesis could be tested against the same null, leading this time to a two-sided test:
H0: μ = 50, against
H1: μ not equal to 50
Here, nothing specific can be said about the average number of matches in a box; only that, if we could reject the null hypothesis in our test, we would know that the average number of matches in a box is likely to be less than or greater than 50.
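The difference between the one-sided and two-sided alternatives in the matches example can be sketched numerically. The code below is an illustration only: it assumes, for simplicity, that the population standard deviation is known (so a z statistic can be used), and the sample mean, standard deviation and sample size are invented values, not data from the text.

```python
from statistics import NormalDist

def z_test_p_values(sample_mean, mu0, sigma, n):
    """z statistic and p-values for the three possible alternatives."""
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    std_normal = NormalDist()                # standard normal N(0, 1)
    p_lower = std_normal.cdf(z)              # H1: mu < mu0 (lower tail)
    p_upper = 1 - std_normal.cdf(z)          # H1: mu > mu0 (upper tail)
    p_two_sided = 2 * min(p_lower, p_upper)  # H1: mu not equal to mu0 (both tails)
    return z, p_lower, p_upper, p_two_sided

# Hypothetical data: 25 boxes, mean 49.2 matches, assumed known sigma = 2.0
z, p_lower, p_upper, p_two_sided = z_test_p_values(49.2, 50, 2.0, 25)
print(f"z = {z:.2f}")
print(f"H1: mu < 50   p = {p_lower:.4f}")
print(f"H1: mu > 50   p = {p_upper:.4f}")
print(f"H1: mu != 50  p = {p_two_sided:.4f}")
```

With these invented numbers z is -2.0; the lower-tail p-value is about 0.023 and the two-sided p-value is twice that, which is why the choice of alternative has to be made before the data are examined.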

One Sample t-test

A one sample t-test is a hypothesis test for answering questions about the mean where the data are a random sample of independent observations from an underlying normal distribution N(μ, σ²), where σ² is unknown.

The null hypothesis for the one sample t-test is:
H0: μ = μ0, where μ0 is known.
That is, the sample has been drawn from a population of given mean and unknown variance (which therefore has to be estimated from the sample).

This null hypothesis, H0, is tested against one of the following alternative hypotheses, depending on the question posed:
H1: μ is not equal to μ0
H1: μ > μ0
H1: μ < μ0
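A minimal sketch of the calculation, assuming the usual statistic t = (sample mean - μ0) / (s / √n) with n - 1 degrees of freedom, where s is the sample standard deviation. The match counts are invented for illustration, the function name is hypothetical, and the critical value 1.833 is the one-sided 5% point for 9 degrees of freedom as given in standard t tables.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return the t statistic and its degrees of freedom for H0: mu = mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev estimates the unknown sigma
    t = (mean(sample) - mu0) / standard_error
    return t, n - 1

# Hypothetical counts of matches in 10 boxes
sample = [49, 50, 48, 51, 47, 49, 50, 48, 49, 49]
t, df = one_sample_t(sample, mu0=50)

T_CRIT = 1.833  # one-sided 5% critical value for df = 9, from standard t tables
# H1: mu < 50, so the critical region is the lower tail: t < -T_CRIT
decision = "Reject H0" if t < -T_CRIT else "Do not reject H0"
print(f"t = {t:.3f} on {df} df -> {decision}")
```

For these made-up data t is about -2.74, which falls in the lower-tail critical region, so H0: μ = 50 would be rejected in favour of H1: μ < 50 at the 5% level.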

Two Sample t-test

A two sample t-test is a hypothesis test for answering questions about the mean where the data are collected from two random samples of independent observations, each from an underlying normal distribution:
N(μ1, σ1²) and N(μ2, σ2²)

When carrying out a two sample t-test, it is usual to assume that the variances for the two populations are equal, i.e.
σ1² = σ2² = σ²

The null hypothesis for the two sample t-test is:
H0: μ1 = μ2
That is, the two samples have both been drawn from the same population. This null hypothesis is tested against one of the following alternative hypotheses, depending on the question posed:
H1: μ1 is not equal to μ2
H1: μ1 > μ2
H1: μ1 < μ2
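Under the equal-variance assumption, the common variance is usually estimated by pooling the two sample variances. The sketch below assumes the standard pooled statistic with n1 + n2 - 2 degrees of freedom. Both samples are invented, the function name is hypothetical, and 2.228 is the two-sided 5% critical value for 10 degrees of freedom from standard t tables.

```python
from math import sqrt
from statistics import mean, variance

def pooled_two_sample_t(x, y):
    """Return the pooled-variance t statistic and degrees of freedom for H0: mu1 = mu2."""
    n1, n2 = len(x), len(y)
    # Pooled estimate of the common variance sigma^2
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical measurements under the current and the new treatment
current = [12, 14, 11, 13, 15, 13]
new = [15, 17, 14, 16, 18, 16]
t, df = pooled_two_sample_t(current, new)

T_CRIT = 2.228  # two-sided 5% critical value for df = 10, from standard t tables
# H1: mu1 not equal to mu2, so the critical region covers both tails: |t| > T_CRIT
decision = "Reject H0" if abs(t) > T_CRIT else "Do not reject H0"
print(f"t = {t:.3f} on {df} df -> {decision}")
```

For these made-up samples t is about -3.67 on 10 degrees of freedom, so H0: μ1 = μ2 would be rejected at the 5% level against the two-sided alternative.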
