ANOVA2-Followup.txt

When we went over the ANOVA2 LISTING in class, which replicated and extended
the factorial analysis done in Howell, we noticed that the cell standard
deviations differed considerably. Here they are:

------------------------------------------------------------------------------
2-WAY, EQUAL NS, INDEPENDENT SAMPLES ANOVA
Howell, 4th edition, page 404

Level of    Level of       ------------ITEMS------------
AGE         CONDITON        N          Mean            SD

Old         Adjective      10    11.0000000    2.49443826
Old         Counting       10     7.0000000    1.82574186
Old         Imagery        10    13.4000000    4.50185147
Old         Intentional    10    12.0000000    3.74165739
Old         Rhyming        10     6.9000000    2.13177026
Young       Adjective      10    14.8000000    3.48966729
Young       Counting       10     6.5000000    1.43372088
Young       Imagery        10    17.6000000    2.59058123
Young       Intentional    10    19.3000000    2.66874919
Young       Rhyming        10     7.6000000    1.95505044

Note that the largest cell variance, 4.50**2 = 20.25, is almost ten times as
large as the smallest cell variance, 1.43**2 = 2.04.
------------------------------------------------------------------------------

I edited the ANOVA2 listing, cutting out everything except the output showing
the ten cell means and standard deviations. I then read these into Minitab to
investigate the relationship between the means and the standard deviations.
Here is what I found.

MTB > name c1 'mean' c2 'stdev' c3 'var'
MTB > let c3=c2*c2
MTB > corr c1 c2

Correlation of mean and stdev = 0.538

MTB > corr c1 c3

Correlation of mean and var = 0.455

MTB > plot c1 c2

[Character plot: mean (vertical axis, 5.0 to 20.0) against stdev (horizontal
axis, 1.20 to 4.20), one point for each of the ten cells.]

The cell means appear to be positively correlated with both the cell standard
deviations and the cell variances, suggesting a log or a sqrt transformation
(as suggested by Howell back in chapter 11).
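If you do not have Minitab handy, the same check can be sketched in Python. This is only an illustration, not what was run in class: the ten means and standard deviations are copied from the listing above, and the helper name pearson_r is my own.

```python
import math

# (AGE, CONDITON, cell mean, cell SD) copied from the ANOVA2 listing
cells = [
    ("Old", "Adjective", 11.0, 2.49443826),
    ("Old", "Counting", 7.0, 1.82574186),
    ("Old", "Imagery", 13.4, 4.50185147),
    ("Old", "Intentional", 12.0, 3.74165739),
    ("Old", "Rhyming", 6.9, 2.13177026),
    ("Young", "Adjective", 14.8, 3.48966729),
    ("Young", "Counting", 6.5, 1.43372088),
    ("Young", "Imagery", 17.6, 2.59058123),
    ("Young", "Intentional", 19.3, 2.66874919),
    ("Young", "Rhyming", 7.6, 1.95505044),
]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


means = [m for _, _, m, _ in cells]
sds = [s for _, _, _, s in cells]
variances = [s * s for s in sds]  # same as Minitab's "let c3=c2*c2"

ratio = max(variances) / min(variances)  # largest / smallest cell variance
r_sd = pearson_r(means, sds)
r_var = pearson_r(means, variances)

print(f"largest/smallest variance = {ratio:.2f}")
print(f"r(mean, stdev) = {r_sd:.3f}")
print(f"r(mean, var)   = {r_var:.3f}")
```

The two correlations should agree with the Minitab values above to three decimals, and the variance ratio comes out just under ten.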
I went back to the ANOVA2 SAS program and modified the data step to do the
log transformation:

    INPUT AGE CONDITON;
    DO I=1 TO 10;
      INPUT ITEMS @@;
      logitems=log10(items);
      OUTPUT;
    END;

and then checked the cell variances. The ratio of the largest to the smallest
was a bit over 6, so I changed the transformation to

    sqritems=sqrt(items);

That brought the ratio down to a hair over 4, which, given equal sample
sizes, seemed adequate. Of course, I should also check the effect of this
transformation on the shape of the within-cell distributions, but the n's are
small and these are not my research data, so I haven't.

If these were your research data, for your thesis or whatever, you should, of
course, carefully evaluate them for data entry errors, violations of
assumptions, outliers, etc. prior to starting an analysis such as ANOVA. I
probably have not put enough emphasis on such preliminary data checking,
perhaps because it is not really very exciting, but it is important. I'll try
to spend more time on it next semester in PSYC 6433.
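The comparison of variance ratios described above (raw scores versus log10 versus square root) can be sketched in Python as follows. The function name variance_ratio and the scores in toy_cells are hypothetical: the scores are made up only to exercise the function and are NOT Howell's data, so the ratios they produce will not match the "bit over 6" and "hair over 4" reported above, and even the ordering of the transformations may differ with real data.

```python
import math
from statistics import variance  # sample variance (n - 1 in the denominator)


def variance_ratio(cells, transform=lambda x: x):
    """Largest cell variance divided by the smallest cell variance,
    computed after applying `transform` to every score."""
    cell_vars = [
        variance([transform(score) for score in scores])
        for scores in cells.values()
    ]
    return max(cell_vars) / min(cell_vars)


# HYPOTHETICAL scores, invented for illustration only (not Howell's data).
# Keys mimic the (AGE, CONDITON) cells; all scores are > 0 so log10 is defined.
toy_cells = {
    ("Old", "Counting"): [4, 5, 6, 7, 8],
    ("Young", "Imagery"): [10, 14, 18, 22, 26],
}

raw_ratio = variance_ratio(toy_cells)
log_ratio = variance_ratio(toy_cells, math.log10)
sqrt_ratio = variance_ratio(toy_cells, math.sqrt)

print(f"raw scores:  {raw_ratio:.2f}")
print(f"log10:       {log_ratio:.2f}")
print(f"square root: {sqrt_ratio:.2f}")
```

With your own data you would replace toy_cells with one list of raw scores per cell and keep whichever transformation brings the ratio down to a level you can live with, remembering that scores of zero cannot be log transformed without first adding a constant.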