SS-Type.txt

Karl has hardcopy docs on types of sums-of-squares; see him if you would like them. There are some example programs online here, and below is a copy of some correspondence on the topic.

========================================================================
Sender: "SAS(r) Discussion"
From: Mark Lee

First I would like to thank Karl Wuensch, Wang Kuo-chang, Tim Dorcey, and Mik Bickis for their advice on Greenhouse-Geisser and Huynh-Feldt adjustments in repeated measures analysis. What I understand as the gist of those answers was that if a variable has only 2 levels, the covariance of the difference vector is of course equal to itself, and there is no adjustment needed. That makes perfect sense to me. What does not exactly make sense is that in looking at an interaction of 2 variables, one with 2 levels and another with, say, 3 or 4, there is no corrected probability given.

Unfortunately for those who answer me, I have another question. I am doing a GLM ANOVA - A * B * C * SUBJECTS (4 * 3 * 4 * 32 subjects) - and I have factors B and C as within-subjects variables. Also, unfortunately, I am missing data for 2 subjects, each at different levels of variable C. The repeated or multivariate mode is said to drop these cases altogether. I set up my data in split-plot (I hope that is the proper term) fashion, using the SUBJECTS(A) * B * C term as my largest interaction's error term. In other words, I specified the entire model and took up all the D.F. GLM gave me Type I and Type III sums of squares for my tests (each of which I specified) that were different from the results I got on the same analyses from PROC ANOVA.

My questions are: Since I have missing data, I want to know which type is most apropos for my problem. Further, I must say that I have read the section on the four types of estimable functions 4 times and am still confused as to what the real differences are.
Type I seem to be sequential sums of squares, and Type II seem to be unique, but I can't seem to make out what they are saying. Perhaps it's a function of how dense I am. Types III and IV are even more obscure. Also, what is the adjustment made for the missing values? I can't really tell from the documentation.

Any help would be greatly appreciated. Thank you in advance.

Truly,
Mark D. Lee

========================================================================
From: "Karl L. Wuensch"
To: Mark Lee
In-Reply-To: Your message of Thu, 18 Apr 91 14:00:14 EDT

Regarding the interaction between a two-level within-subjects factor and a between-subjects factor, sphericity can't be a problem because the interaction boils down to a between-subjects test on a single difference score -- this is why you get no corrected p's for such terms. Heterogeneity of variance (of the difference scores across levels of the between-subjects factor) could be a problem.

Regarding missing data in your design and Types of sums of squares, I recommend that you either delete the subjects who are missing data (as SAS would by default) or use a multiple regression procedure to predict, for each subject, what his/her score would be, based on a model using as predictors the variables on which you do have data for the subject. This is a lot of work for a little data, so you'd better want it bad to do this. A lazy person might just substitute the cell mean for such missing data. There is a type of SS that purports to be appropriate for designs with missing data -- Type IV -- but in my mind these Type IV SS are more in the domain of magic than statistics, and I don't recommend their use.

On the more general topic of SAS's Types of sums-of-squares, Type I are "sequential," "hierarchical," or "stepdown"; that is, each effect is partialled for preceding (to its left) effects.
You probably will not often want to use these, unless you are doing something like a covariate analysis or a trend analysis where you entered the powers of the quantitative predictor(s). SAS's Type I SS are the same as Overall & Spiegel's (1969, Psychol. Bull., 72, 311-322) Method III and SPSS' "METHOD=SSTYPE(SEQUENTIAL)" in MANOVA, but not the same as SPSS "METHOD=HIERARCHICAL" in ANOVA.

SAS' Type III SS partial each effect for every other effect in the model - they are the same as Overall and Spiegel's Method I, the "METHOD=UNIQUE" in SPSS ANOVA, and the default in SPSS MANOVA. They are the sums-of-squares that are approximated by the classic "unweighted means" analysis of nonorthogonal factorial designs. This is probably the type of SS you most often want. I recommend an article by Howell and McConaughy (1982, Educ. & Psychol. Measurement, 42, 9-24) on the topic.

Type II SS for ANOVA are strange -- I could tell you what they are, but since it is highly unlikely that you will ever want them in an ANOVA, I'll save us both time and not say more about them.

Hope this helped. Ciao,

Karl L. Wuensch, Psychology, East Carolina Univ.

========================================================================
Date: Sat, 06 Dec 1997 17:13:20 -0500
Sender: owner-edstat-l@eos.ncsu.edu
From: Donald Macnaughton
To: STAT-L@VM1.MCGILL.CA, edstat-l@jse.stat.ncsu.edu
Subject: Which Sums of Squares Are Best in Unbalanced ANOVA?

If you are not already confused enough by different types of sums of squares in ANOVA, check out the paper referenced below. You'll need the Adobe Acrobat reader to read the downloaded document.

----------------------------Original message----------------------------

Many readers will recall the recent controversy about methods for computing sums of squares in unbalanced analysis of variance.
Many statisticians believe this controversy has been settled, with the conclusion being that the sums of squares commonly known as "SAS Type III" or "SPSS ANOVA Unique" are appropriate in most cases.

I have written a paper that disagrees with this conclusion. The paper proposes that the sums of squares known as "SAS Type II" or "SPSS ANOVA Experimental" are appropriate in most cases. The reasoning is an extension of earlier writers' reasoning about the hypotheses being tested. Here is the abstract:

-----------------------------------------------------------------
Which Sums of Squares Are Best In Unbalanced Analysis of Variance?

ABSTRACT

Three fundamental concepts of science and statistics are entities, variables (which are formal representations of properties of entities), and relationships between variables. These concepts help to distinguish between two uses of the statistical tests in analysis of variance (ANOVA), namely

- to test for relationships between the response variable and the predictor variables in an experiment
- to test for relationships among the parameters of the model equation in an experiment.

Two methods of computing ANOVA sums of squares are

- Higher-level Terms are Omitted from the generating model equations (HTO = SPSS ANOVA EXPERIMENTAL -= SAS Type II -= BMDP4V with Weights are Sizes, where -= signifies "approximately equals")
- Higher-level Terms are Included in the generating model equations (HTI = SPSS ANOVA UNIQUE = SPSS MANOVA UNIQUE = SAS Type III = BMDP4V with Weights are Equal = BMDP2V = MINITAB GLM = SYSTAT MGLH = Data Desk Type 3).

This paper evaluates the HTO and HTI methods of computing ANOVA sums of squares for fulfilling the two uses of the ANOVA statistical tests. Evaluation is in terms of the hypotheses being tested and relative power.
It is concluded that (contrary to current practice) the HTO method is generally preferable when a researcher wishes to test the results of an experiment for evidence of relationships between variables.
-----------------------------------------------------------------

The paper contains 22,000 words and 105 references. It is available at

http://www.matstat.com/ss.htm

--------------------------------------------------------
Donald B. Macnaughton
MatStat Research Consulting Inc.
donmac@matstat.com
Toronto, Canada
--------------------------------------------------------
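========================================================================
Illustrative note (not part of the original correspondence)

The distinctions drawn in the messages above can be made concrete with a small sketch. It assumes a two-factor between-subjects design with +1/-1 effect coding and no empty cells, and it treats each sum of squares as the drop in residual SS when a term is added to a model: Type I adjusts each effect for the terms to its left, Type II adjusts each main effect for the other main effect only (higher-level terms omitted), and Type III adjusts each effect for every other effect. The data and the names solve, sse, and ss_types are invented for illustration; this is not SAS's algorithm and it does not cover the split-plot design in Mark Lee's question.

```python
from fractions import Fraction


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination on Fractions."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def sse(columns, y):
    """Residual sum of squares after regressing y on the given predictor columns."""
    xtx = [[Fraction(sum(u * v for u, v in zip(ci, cj))) for cj in columns]
           for ci in columns]
    xty = [Fraction(sum(u * v for u, v in zip(ci, y))) for ci in columns]
    beta = solve(xtx, xty)
    return sum(v * v for v in y) - sum(b * t for b, t in zip(beta, xty))


def ss_types(data):
    """data maps (a, b), with a and b coded +1 or -1, to a list of scores.

    Returns three dicts (Type I, Type II, Type III), each keyed by effect name.
    """
    a, b, y = [], [], []
    for (ai, bi), scores in data.items():
        for s in scores:
            a.append(ai)
            b.append(bi)
            y.append(s)
    one = [1] * len(y)                      # intercept column
    ab = [u * v for u, v in zip(a, b)]      # interaction column
    full = [one, a, b, ab]

    def gain(reduced, larger):
        # SS attributed to the terms present in `larger` but not in `reduced`.
        return sse(reduced, y) - sse(larger, y)

    interaction = gain([one, a, b], full)
    type1 = {"A": gain([one], [one, a]),            # A ignoring everything else
             "B": gain([one, a], [one, a, b]),      # B adjusted for A
             "A*B": interaction}
    type2 = {"A": gain([one, b], [one, a, b]),      # A adjusted for B only
             "B": gain([one, a], [one, a, b]),      # B adjusted for A only
             "A*B": interaction}
    type3 = {"A": gain([one, b, ab], full),         # A adjusted for B and A*B
             "B": gain([one, a, ab], full),         # B adjusted for A and A*B
             "A*B": interaction}
    return type1, type2, type3


# Invented scores: cell sizes 3, 1, 1, 2 make the design nonorthogonal.
unbalanced = {(1, 1): [3, 4, 5], (1, -1): [7], (-1, 1): [6], (-1, -1): [9, 11]}
# Invented scores with 2 per cell: the three types should coincide.
balanced = {(1, 1): [3, 5], (1, -1): [7, 8], (-1, 1): [6, 6], (-1, -1): [9, 11]}

t1, t2, t3 = ss_types(unbalanced)
b1, b2, b3 = ss_types(balanced)

for label, table in (("Type I", t1), ("Type II", t2), ("Type III", t3)):
    print(label, {k: round(float(v), 4) for k, v in table.items()})
```

With equal cell sizes the effect-coded columns are mutually orthogonal, so the order and extent of adjustment make no difference and the three tables agree. With unequal cell sizes the main-effect sums of squares depend on what each effect is adjusted for, which is the source of the disagreement between Karl's recommendation (Type III) and Macnaughton's (Type II). The interaction, being the last term entered, gets the same value under all three definitions.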