ANOVA2Followup.txt
When we went over the ANOVA2 LISTING in class, which replicated and
extended the factorial analysis done in Howell, we noticed that the cell
standard deviations differed considerably. Here they are:

2WAY, EQUAL NS, INDEPENDENT SAMPLES ANOVA
Howell, 4th edition, page 404
Level of  Level of                    ITEMS
AGE       CONDITON       N      Mean           SD
Old       Adjective     10    11.0000000    2.49443826
Old       Counting      10     7.0000000    1.82574186
Old       Imagery       10    13.4000000    4.50185147
Old       Intentional   10    12.0000000    3.74165739
Old       Rhyming       10     6.9000000    2.13177026
Young     Adjective     10    14.8000000    3.48966729
Young     Counting      10     6.5000000    1.43372088
Young     Imagery       10    17.6000000    2.59058123
Young     Intentional   10    19.3000000    2.66874919
Young     Rhyming       10     7.6000000    1.95505044
Note that the largest cell variance, 4.50**2 = 20.25, is almost
ten times as large as the smallest cell variance, 1.43**2 = 2.04.
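
As a rough cross-check outside SAS, the ratio of the largest to the smallest
cell variance can be computed directly from the standard deviations in the
listing above. This is only a sketch in Python; the names cell_sd and
variance_ratio are my own and are not part of the ANOVA2 program.

```python
# Cell standard deviations copied from the listing above,
# keyed by (AGE, CONDITON).
cell_sd = {
    ("Old", "Adjective"): 2.49443826,
    ("Old", "Counting"): 1.82574186,
    ("Old", "Imagery"): 4.50185147,
    ("Old", "Intentional"): 3.74165739,
    ("Old", "Rhyming"): 2.13177026,
    ("Young", "Adjective"): 3.48966729,
    ("Young", "Counting"): 1.43372088,
    ("Young", "Imagery"): 2.59058123,
    ("Young", "Intentional"): 2.66874919,
    ("Young", "Rhyming"): 1.95505044,
}


def variance_ratio(sds):
    """Return the largest cell variance divided by the smallest."""
    variances = [sd ** 2 for sd in sds]
    return max(variances) / min(variances)


ratio = variance_ratio(cell_sd.values())
print(f"largest / smallest cell variance = {ratio:.2f}")
```

The ratio comes out a little under 10, in line with the note above.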

I edited the ANOVA2 listing, cutting out everything except the output showing
the ten cell means and standard deviations. I then read these into Minitab to
investigate the relationship between the means and the standard deviations.
Here is what I found.
MTB > name c1 'mean' c2 'stdev' c3 'var'
MTB > let c3=c2*c2
MTB > corr c1 c2
Correlation of mean and stdev = 0.538
MTB > corr c1 c3
Correlation of mean and var = 0.455
MTB > plot c1 c2
[Character plot: mean (vertical axis, 5.0 to 20.0) versus stdev (horizontal
axis, 1.20 to 4.20), ten points.]
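
The two correlations from the Minitab session can be reproduced from the ten
cell means and standard deviations in the SAS listing. The following is a
sketch in plain Python rather than Minitab; pearson_r is a helper I wrote for
this purpose, not a Minitab or SAS function.

```python
import math

# Cell means and standard deviations, in the order of the SAS listing.
means = [11.0, 7.0, 13.4, 12.0, 6.9, 14.8, 6.5, 17.6, 19.3, 7.6]
sds = [2.49443826, 1.82574186, 4.50185147, 3.74165739, 2.13177026,
       3.48966729, 1.43372088, 2.59058123, 2.66874919, 1.95505044]
variances = [sd ** 2 for sd in sds]  # same as "let c3=c2*c2"


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r_sd = pearson_r(means, sds)
r_var = pearson_r(means, variances)
print(f"Correlation of mean and stdev = {r_sd:.3f}")
print(f"Correlation of mean and var   = {r_var:.3f}")
```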
The cell means are moderately and positively correlated with both the cell
standard deviations (r = .538) and the cell variances (r = .455), which
points toward a log or a sqrt transformation (as suggested by Howell back in
chapter 11). I went back to the ANOVA2 SAS program and modified the data step
to do the log transformation:
"INPUT AGE CONDITON; DO I=1 TO 10; INPUT ITEMS @@;
logitems=log10(items); OUTPUT; END;"
and then checked the cell variances. The ratio of the largest to the
smallest was a bit over 6, so I changed the transformation to
"sqritems=sqrt(items);"
That brought the ratio down to a hair over 4, which, given equal sample
sizes, seemed adequate. Of course, I should also check the effect of this
transformation on the shape of the within-cell distributions, but the n's
are small and these are not my research data, so I haven't.
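
The comparison described above, computing the largest-to-smallest variance
ratio on the raw scores and again after each transformation, can be sketched
in Python. The raw ITEMS scores are not reproduced in this handout, so the
scores below are made up purely for illustration, and the names cells and
max_min_variance_ratio are hypothetical. Which transformation works best
depends on the data, so the ordering seen here need not match the ANOVA2
results.

```python
import math
from statistics import variance  # sample variance (n - 1 denominator)


def max_min_variance_ratio(cells, transform=lambda x: x):
    """Largest cell variance over smallest, after applying transform."""
    cell_variances = [
        variance([transform(score) for score in scores])
        for scores in cells.values()
    ]
    return max(cell_variances) / min(cell_variances)


# Made-up scores for two hypothetical cells. These are NOT the Howell data.
# All scores are positive, which log10 requires.
cells = {
    "low": [2, 3, 4, 5, 6],
    "high": [10, 14, 18, 22, 26],
}

raw_ratio = max_min_variance_ratio(cells)
sqrt_ratio = max_min_variance_ratio(cells, math.sqrt)
log_ratio = max_min_variance_ratio(cells, math.log10)

print(f"raw:   {raw_ratio:.2f}")
print(f"sqrt:  {sqrt_ratio:.2f}")
print(f"log10: {log_ratio:.2f}")
```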
If these were your research data, for your thesis or whatever, you
should, of course, carefully evaluate them for data entry errors, violations
of assumptions, outliers, etc. prior to starting an analysis such as ANOVA.
I probably have not put enough emphasis on such preliminary data checking,
perhaps because it is not really very exciting, but it is important. I'll
try to spend more time on it next semester in PSYC 6433.