Anova Notes

Inference 2: Analysis of Variance (ANOVA)

Aims:
- Appreciate the need for analysing data from more than two samples
- Understand the underlying model for analysis of variance
- Identify the correct analysis to apply to a given situation
- Carry out a one-way analysis of variance
- Carry out a two-way analysis of variance (without replications)

This method is used to test for a difference between population means when there are more than two populations. It is a parametric test, as it is based on the assumption that the samples come from underlying normally distributed populations.

Examples
- Compare the effects of three treatments
- Compare average weight loss of four diets
- Compare the productivity from five machines

What methods did we use when we had only two populations?
Two-sample t or z test.

Why can't we use this test repeatedly to compare each possible pair?
The tests would not be independent, so the significance level would not be valid: we would end up with a large probability of a Type I error.

Consider the following two sets of data.

Data Set 1
A sample of 5 is taken from each of three populations.

          1       2       3
          5.90    5.51    5.01
          5.92    5.50    5.00
          5.91    5.50    4.99
          5.89    5.49    4.98
          5.88    5.50    5.02
Mean      5.90    5.50    5.00

Data Set 2
A sample of 5 is taken from each of three populations.

          1       2       3
          5.90    6.31    4.52
          4.42    3.54    6.93
          7.51    4.73    4.48
          7.89    7.20    5.55
          3.78    5.72    3.52
Mean      5.90    5.50    5.00

Produce a dot plot for these two sets.

[Dot plots: Data Set 1 and Data Set 2]

Notation
$x_{ij}$ = the $j$th observation in the $i$th sample
$n_i$ = number of observations in the $i$th sample
$T_i$ = sum of the observations in the $i$th sample
$n$ = total number of observations
$T$ = sum of all $n$ observations = $\sum\sum x_{ij}$

Formulae
Total sum of squares: $SS_T = \sum\sum x_{ij}^2 - \frac{T^2}{n}$
Between samples sum of squares: $SS_B = \sum \frac{T_i^2}{n_i} - \frac{T^2}{n}$
Within samples sum of squares: $SS_W = SS_T - SS_B$

These results are then summarised in an ANOVA table.

Source of variation   Sum of squares   Degrees of freedom        Mean square (MS) = SS/df
Between samples       SS_B             (number of samples) - 1   SS_B / df
Within samples        SS_W             n - (number of samples)   SS_W / df
Total                 SS_T             n - 1

Example 1
In an investigation into the effect of study method on learning, college students were randomly assigned to one of three different methods: reading only, reading and underlining, reading and making notes.
One week after studying a particular article, the students were given a test on the article's content.

Study method   A: Reading   B: Reading and underlining   C: Reading and making notes
Test score     16           18                           24
               15           22                           33
               13           18                           24
               21           16                           28
               15           26                           26
Total T_i      80           100                          135     (T = 315)
n_i            5            5                            5       (n = 15)

Hypotheses
H0: $\mu_A = \mu_B = \mu_C$
H1: at least two means differ

Calculating the Sums of Squares
1. Enter all the data into stats mode in your calculator: Σx² = 7081, T = Σx = 315, n = 15.
2. Calculate the total for each sample and the number in each sample, and check T.
3. Calculate the total sum of squares: 7081 - 315²/15 = 466.
4. Calculate the between samples sum of squares: 80²/5 + 100²/5 + 135²/5 - 315²/15 = 310.

Source of variation   Sum of squares   Degrees of freedom   Mean square (MS)   F = Between MS / Within MS
Between samples       310              2                    155                11.92
Within samples        156              12                   13
Total                 466              14

Test Statistic: F = 11.92
Critical Value: df = 2, 12; F(5%) = 3.885
Conclusion: Reject H0. There is significant evidence to suggest a difference between the mean test scores for the three methods. At least two differ.

Example 2.
Four treatments for fever blisters, including a placebo, A, were randomly assigned to 20 patients. The data below show, for each treatment, the numbers of days from initial appearance of the blisters until healing is complete. Test the hypothesis, at the 5% level, that there is no difference between the four treatments with respect to mean healing time.

Treatment   Number of days         T_i   n_i
A           5   8   7   7   8      35    5
B           4   6   6   3   5      24    5
C           6   4   4   5   4      23    5
D           7   4   6   6   5      28    5

Hypotheses
H0: $\mu_A = \mu_B = \mu_C = \mu_D$
H1: at least two means differ

Calculating the Sums of Squares
1. Enter all the data into stats mode in your calculator: Σx² = 644, T = Σx = 110, n = 20.
2. Calculate the total for each sample and the number in each sample, and check T.
3. Calculate the total sum of squares: 644 - 110²/20 = 39.
4. Calculate the between samples sum of squares: (35² + 24² + 23² + 28²)/5 - 110²/20 = 17.8.

Example 3.
Eastside Health Authority has a policy whereby any patient admitted to a hospital with a suspected coronary heart attack is automatically placed in the intensive care unit. The table below gives the number of hours spent in intensive care by such patients at five hospitals in the area. Use a one-way analysis of variance to test at the 1% level for differences between hospitals.

Hospital                  A     B     C     D     E
Hours in intensive care   30    42    65    67    70
                          25    57    46    58    63
                          12    47    55    81    80
                          23    30    27
                          16
Total T_i                 106   176   193   206   213    (T = 894)
n_i                       5     4     4     3     3      (n = 19)

Hypotheses
H0: $\mu_A = \mu_B = \mu_C = \mu_D = \mu_E$
H1: at least two means differ

Calculating the Sums of Squares
1. Enter all the data into stats mode in your calculator: Σx² = 50354, T = Σx = 894, n = 19.
2. Calculate the total for each sample and the number in each sample, and check T.
3. Calculate the total sum of squares: 50354 - 894²/19 = 8288.9474.
4. Calculate the between samples sum of squares: 106²/5 + 176²/4 + 193²/4 + 206²/3 + 213²/3 - 894²/19 = 6506.7307.

ANOVA table for Example 2

Source of variation   Sum of squares   Degrees of freedom   Mean square (MS)   F = Between MS / Within MS
Between samples       17.8             3                    5.9333             4.48
Within samples        21.2             16                   1.325
Total                 39               19

Test Statistic: F = 4.48
Critical Value: df = 3, 16; F(5%) = 3.239
Conclusion: There is significant evidence to reject H0 and suggest a difference between the mean time to heal for the treatments. At least two differ.

ANOVA table for Example 3

Source of variation   Sum of squares   Degrees of freedom   Mean square (MS)   F
Between samples       6506.7307        4                    1626.6827          12.778
Within samples        1782.2167        14                   127.3012
Total                 8288.9474        18

Test Statistic: F = 12.778
Critical Value: df = 4, 14; F(1%) = 5.035
Conclusion: Reject H0. There is significant evidence to suggest that the mean time spent in intensive care differs between the hospitals. At least two differ.

Filling in ANOVA Tables

1.
Three brands of light bulb (hours in excess of 1000 hours)

1     2     3
16    18    26
15    2     31
3     20    24
21    16    30
18    24    24

Complete the following analysis of variance table and carry out a relevant hypothesis test.

Source of variation   SS     df    MS     F
Between               310    2     155    15.5
Within                120    12    10
Total                 430    14

2. Four teaching techniques (score)

A _B 77 76 80 79 76 83 7 80 76 73 78

Complete the following analysis of variance table and carry out a relevant hypothesis test.

Source of variation   SS      df    MS        F
Between               153.8   3     51.2667   4.56
Within                180     16    11.25
Total                 333.8   19

3. Five formulations (burning rate)

1     2     3     4     5
33    24    31    38    44
40    31    33    36    50
32    46    36    4     30
35    45    42    32    26
25    39    33    48    40

Complete the following analysis of variance table and carry out a relevant hypothesis test.

Source of variation   SS     df    MS     F
Between               80     4     20     0.357
Within                1122   20    56.1
Total                 1202   24

Assumptions for One-way ANOVA
1. The observations are obtained independently and randomly from the populations.
2. The observations are from normally distributed populations.
3. The populations have a common variance, $\sigma^2$.

Note: an estimate of the population variance is obtained from the ANOVA table. We use $s^2$ = the within MS as an estimate of $\sigma^2$. If, after carrying out the ANOVA, we wish to calculate a confidence interval for a particular mean, we use the within mean square as the best estimate of $\sigma^2$.

Back to Example 1
We concluded there was a difference. It is a good idea in practice to then calculate the sample means.

Study method                 Sample mean
Reading                      16
Reading and underlining      20
Reading and making notes     27

$s^2$ = 13 with 12 degrees of freedom.

The Model behind ANOVA
When we carry out a one-way ANOVA we are actually fitting a model

$x_{ij} = \mu + \alpha_i + \varepsilon_{ij}$

where
$\mu$ = overall mean
$\alpha_i$ = mean effect of the $i$th level of the factor
$\varepsilon_{ij}$ = random variation, $\varepsilon_{ij} \sim N(0, \sigma^2)$

The text book gives the following formulae for these. They are NOT in the formula book.
$\hat{\mu} = \frac{T}{n}$ (overall mean)
$\hat{\alpha}_i$ = factor mean - overall mean
$\hat{\varepsilon}_{ij}$ = observed value - factor mean

For Example 1:
$\hat{\mu}$ = 315/15 = 21
$\hat{\alpha}_1$ = 16 - 21 = -5
$\hat{\alpha}_2$ = 20 - 21 = -1
$\hat{\alpha}_3$ = 27 - 21 = 6

Estimates of the $\varepsilon_{ij}$

Study method   Reading         Reading and underlining   Reading and making notes
               16 - 16 = 0     18 - 20 = -2              24 - 27 = -3
               15 - 16 = -1    22 - 20 = 2               33 - 27 = 6
               13 - 16 = -3    18 - 20 = -2              24 - 27 = -3
               21 - 16 = 5     16 - 20 = -4              28 - 27 = 1
               15 - 16 = -1    26 - 20 = 6               26 - 27 = -1

Notice that the sum of the squares of these estimates is 156, which is the within SS in the ANOVA table.

Using Minitab
Enter the data into the worksheet as shown (columns: result, method).
Select Stat, ANOVA, one way ANOVA. Result is the response variable and method is the factor.

Method
Null hypothesis          All means are equal
Alternative hypothesis   Not all means are equal
Significance level       α = 0.05
Equal variances were assumed for the analysis.

Factor Information
Factor   Levels   Values
method   3        1, 2, 3

Analysis of Variance
Source   DF   Adj SS   Adj MS   F-Value   P-Value
method   2    310.0    155.00   11.92     0.001
Error    12   156.0    13.00
Total    14   466.0

Model Summary
S         R-sq     R-sq(adj)   R-sq(pred)
3.60555   66.52%   60.94%      47.69%

Means
method   N   Mean    StDev   95% CI
1        5   16.00   3.00    (12.49, 19.51)
2        5   20.00   4.00    (16.49, 23.51)
3        5   27.00   3.74    (23.49, 30.51)
Pooled StDev = 3.60555

If you use Stat, ANOVA, general linear model you get the above output and the equation

Regression Equation
score = 21.000 - 5.00 method_1 - 1.00 method_2 + 6.00 method_3

Two-way ANOVA
Let's go back to the first example.

Study method   Reading   Reading and underlining   Reading and making notes
Test score     16        18                        24
               15        22                        33
               13        18                        24
               21        16                        28
               15        26                        26

For this design the 15 students would need to be a homogeneous group of students. The above design is called a Fully or Completely Randomised Design.

Suppose we are told that each row of the table corresponds to students of a particular predicted grade.

Study method       Reading   Reading and underlining   Reading and making notes
Grade of student
D                  16        18                        24
A                  15        22                        33
E                  13        18                        24
B                  21        16                        28
C                  15        26                        26

There is clearly a source of variation that needs to be taken account of. This can be done using two-way ANOVA.
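To see how much the scores vary between grades in the table above, the row totals and row means can be computed. The following is a minimal Python sketch and is not part of the original notes; the names `scores_by_grade` and `row_summary` are illustrative assumptions.

```python
# Scores from the study-method table, one row per predicted grade.
# Column order: reading, reading and underlining, reading and making notes.
scores_by_grade = {
    "D": [16, 18, 24],
    "A": [15, 22, 33],
    "E": [13, 18, 24],
    "B": [21, 16, 28],
    "C": [15, 26, 26],
}


def row_summary(table):
    """Return {label: (total, mean)} for each row of the table."""
    return {label: (sum(row), sum(row) / len(row)) for label, row in table.items()}


summary = row_summary(scores_by_grade)
for grade, (total, mean) in summary.items():
    print(f"grade {grade}: total = {total}, mean = {mean:.2f}")
```

The row means run from roughly 18.3 (grade E) to 23.3 (grade A). Whether that spread is significant is what the test on the blocking factor decides.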
Two-way ANOVA is an extension of the one-factor situation to take account of a second source of variation. The levels of this second factor are often determined by groupings of subjects or units used in the investigation. As such it is often called a blocking factor, because it places subjects or units into homogeneous groups called blocks. The design itself is then called a randomised block design.

Exam Question SS6 June 2012
An investigation into the effect of a particular chemical on ripening times of fruit in cold storage is carried out by a company that stores apples of three varieties: Red Delicious, Golden Delicious and Pink Lady.
The chemical is applied to three apples, one of each variety, selected at random from those that are to be kept in cold storage. Three further apples, again one of each variety, are selected at random from those that are to be kept in cold storage. These apples are not treated with the chemical.
In addition to the chemical, it is believed that the variety of an apple might influence time to ripening.
The length of time to ripening is measured for the six apples in the investigation.
(a) Identify those apples that constitute the control group. (marks)
(b) Explain the purpose of selecting apples for treatment with the chemical at random from those to be kept in cold storage. (1 mark)
(c) Name the technique that you would use in order to analyse the data obtained from this investigation. (2 marks)
(d) Name the blocking factor. (1 mark)

Notation
$m$ = number of rows
$n$ = number of columns
$x_{ij}$ = observation in the $i$th row and $j$th column
$R_i$ = sum of the observations in the $i$th row = $\sum_j x_{ij}$
$C_j$ = sum of the observations in the $j$th column = $\sum_i x_{ij}$
$T$ = sum of all $mn$ observations = $\sum\sum x_{ij}$

Formulae
Total sum of squares: $SS_T = \sum\sum x_{ij}^2 - \frac{T^2}{mn}$
Between rows sum of squares: $SS_R = \sum \frac{R_i^2}{n} - \frac{T^2}{mn}$
Between columns sum of squares: $SS_C = \sum \frac{C_j^2}{m} - \frac{T^2}{mn}$
Within samples sum of squares: $SS_W = SS_T - SS_R - SS_C$
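The two-way formulae can be sketched in Python as follows. This is a minimal illustration rather than part of the original notes: the function name `two_way_sums_of_squares` and the variable names are assumptions, and the data are the study-method scores tabulated by grade of student earlier in these notes.

```python
def two_way_sums_of_squares(table):
    """Sums of squares for a two-way layout with one observation per cell.

    `table` is a list of m rows, each holding n observations.
    """
    m = len(table)       # number of rows
    n = len(table[0])    # number of columns
    grand_total = sum(sum(row) for row in table)      # T
    correction = grand_total ** 2 / (m * n)           # T^2 / mn

    ss_total = sum(x * x for row in table for x in row) - correction
    row_totals = [sum(row) for row in table]          # R_i
    col_totals = [sum(col) for col in zip(*table)]    # C_j
    ss_rows = sum(r * r for r in row_totals) / n - correction
    ss_cols = sum(c * c for c in col_totals) / m - correction
    ss_within = ss_total - ss_rows - ss_cols

    return {"total": ss_total, "rows": ss_rows,
            "columns": ss_cols, "within": ss_within}


# Rows are grades D, A, E, B, C; columns are the three study methods.
scores = [
    [16, 18, 24],
    [15, 22, 33],
    [13, 18, 24],
    [21, 16, 28],
    [15, 26, 26],
]

ss = two_way_sums_of_squares(scores)
for source, value in ss.items():
    print(f"{source:8s} {value:10.4f}")
```

Working from the row and column totals, as the hand method does, keeps the code close to the calculator steps; a statistics package would normally be used in practice.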
These results are then summarised in an ANOVA table.

Source of variation   Sum of squares   Degrees of freedom   Mean square
Between rows          SS_R             m - 1
Between columns       SS_C             n - 1
Within sample         SS_W             (m - 1)(n - 1)
Total                 SS_T             mn - 1

Example 4.
In an investigation into the effect of study method on learning, college students were assigned to one of three different methods: reading only, reading and underlining, reading and making notes. In the experiment grade of student was used as a blocking factor, i.e. each method of study was used with each grade of student. One week after studying a particular article, the students were given a test on the article's content.

Study method       Reading   Reading and underlining   Reading and making notes   Row total   No. in row
Grade of student
D                  16        18                        24                         58          3
A                  15        22                        33                         70          3
E                  13        18                        24                         55          3
B                  21        16                        28                         65          3
C                  15        26                        26                         67          3
Column total       80        100                       135                        315
No. in column      5         5                         5

Test for a difference in mean score for the three study methods.
Test for a difference in mean score for the grades of student.

Calculating the Sums of Squares
1. Enter all the data into stats mode in your calculator: ΣΣx² = 7081, T = 315, mn = 15.
2. Calculate the total for each row. Calculate the total in each column. Check T.
3. Calculate the total sum of squares: 7081 - 315²/15 = 466.
4. Calculate the between row sum of squares. Calculate the between column sum of squares.
   Between rows: (58² + 70² + 55² + 65² + 67²)/3 - 315²/15 = 52.6667
   Between columns: (80² + 100² + 135²)/5 - 315²/15 = 310

Source of variation   Sum of squares   Degrees of freedom   Mean square (MS) = SS/df   F
Grade (rows)          52.6667          4                    13.1667                    1.019
Method (columns)      310              2                    155                        12.0
Within sample         103.3333         8                    12.9167
Total                 466              14

Grades
H0: $\mu_A = \mu_B = \mu_C = \mu_D = \mu_E$
H1: at least two differ
Test Statistic: F = 1.019
Critical Value: df = 4, 8; F(5%) = 3.84
Conclusion: Accept H0.

Study methods
H0: the mean scores for reading, reading and underlining, and reading and making notes are equal
H1: at least two differ
Test Statistic: F = 12
Critical Value: df = 2, 8; F(5%) = 4.46
Conclusion: Reject H0. There is significant evidence to suggest a difference
in the mean test score for the three study methods. At least two differ.

Example 5
Prior to submitting a quotation for a construction project, companies prepare a detailed analysis of the estimated labour and materials costs required to complete the project. A company which employs three project cost assessors wished to compare the mean values of these cost assessors' cost estimates. This was done by requiring each assessor to estimate independently the costs of the same four projects. These costs, in £0000s, are shown below. Perform a two-way analysis of variance to test the hypothesis, at the 5% significance level, that there is no difference between the assessors' mean cost estimates.

                Assessor
Project         A      B      C      Row total   No. in row
1               46     49     44     139         3
2               62     63     59     184         3
3               50     54     54     158         3
4               66     68     63     197         3
Column total    224    234    220    678
No. in column   4      4      4

Calculating the Sums of Squares
1. Enter all the data into stats mode in your calculator: ΣΣx² = 39028, T = 678, mn = 12.
2. Calculate the total for each row. Calculate the total in each column.
3. Calculate the total sum of squares: 39028 - 678²/12 = 721.
4. Calculate the between row sum of squares. Calculate the between column sum of squares.
   Between rows: (139² + 184² + 158² + 197²)/3 - 678²/12 = 676.3333
   Between columns: (224² + 234² + 220²)/4 - 678²/12 = 26

Source of variation   Sum of squares   Degrees of freedom   Mean square (MS) = SS/df   F
Projects (rows)       676.3333         3                    225.4444                   72.46
Assessors (columns)   26               2                    13                         4.18
Within sample         18.6667          6                    3.1111
Total                 721              11

Assessors
H0: $\mu_A = \mu_B = \mu_C$
H1: at least two differ
Test Statistic: F = 4.18
Critical Value: df = 2, 6; F(5%) = 5.143
Conclusion: Accept H0.
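As a check on the arithmetic in Example 5, the whole two-way table can be rebuilt from the raw data. The sketch below is illustrative only and is not from the original notes: the function name `two_way_anova_table` is hypothetical, and the code stops at the F ratios, so critical values still have to be read from F tables.

```python
def two_way_anova_table(table):
    """Two-way ANOVA without replication.

    `table` is a list of m rows (blocks), each with n columns (treatments).
    Returns {source: (SS, df, MS, F)}, with None where a value is not defined.
    """
    m, n = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (m * n)

    ss_total = sum(x * x for row in table for x in row) - correction
    ss_rows = sum(sum(row) ** 2 for row in table) / n - correction
    ss_cols = sum(sum(col) ** 2 for col in zip(*table)) / m - correction
    ss_within = ss_total - ss_rows - ss_cols

    df_rows, df_cols = m - 1, n - 1
    df_within = df_rows * df_cols
    ms_rows = ss_rows / df_rows
    ms_cols = ss_cols / df_cols
    ms_within = ss_within / df_within

    return {
        "rows": (ss_rows, df_rows, ms_rows, ms_rows / ms_within),
        "columns": (ss_cols, df_cols, ms_cols, ms_cols / ms_within),
        "within": (ss_within, df_within, ms_within, None),
        "total": (ss_total, m * n - 1, None, None),
    }


# Example 5: rows are projects 1 to 4, columns are assessors A, B, C.
costs = [
    [46, 49, 44],
    [62, 63, 59],
    [50, 54, 54],
    [66, 68, 63],
]

anova = two_way_anova_table(costs)
for source, (ss, df, ms, f) in anova.items():
    ms_text = f"{ms:10.4f}" if ms is not None else " " * 10
    f_text = f"{f:8.3f}" if f is not None else ""
    print(f"{source:8s} {ss:10.4f} {df:3d} {ms_text} {f_text}")
```

The columns row corresponds to the assessors and the rows row to the projects, which act as the blocking factor here.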
