East Carolina University
Department of Psychology


The Reversal Paradox (Simpson's Paradox)


    Here is a log of some discussions among subscribers to stats listservs.

 

Date: Tue, 9 Jun 1992 19:34:44 GMT
Sender: "STATISTICAL CONSULTING" <STAT-L@MCGILL1.BITNET>
From: David Nichols <nichols@SPSS.COM>

Todd Dicker asks how two groups with positive slopes can combine to form a total group with a negative slope:

This can happen quite easily. There's even a name for it: Simpson's Paradox. To illustrate how this can easily happen, simply draw two clouds of points, each of them forming an ellipse tilted upward to the right, indicating a positive correlation, but have them located so that one cloud is above and to the left of the other. When you look at each group separately, they will each show a positive correlation, but if you check the overall relationship, it will be negative. The same thing can happen with categorical data as well, though it isn't as easy to illustrate graphically. Below is a rather crude attempt, within the limits of the available technology and my imagination, to illustrate the situation:


    ..
 ....
.....
....         ....
..        .......
         ......
         ...
 

This situation would obviously produce positive correlations within each cluster, but a negative one overall. This is essentially the basic reason why we do multiple regression instead of individual bivariate regressions.
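
The two-cloud picture can also be checked numerically. Below is a minimal sketch in Python, using made-up points (not data from this thread) and a hand-written `pearson_r` helper; the names and the numbers are illustrative assumptions only.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def r_of(points):
    """Correlation between the x and y coordinates of a list of (x, y) points."""
    xs, ys = zip(*points)
    return pearson_r(xs, ys)


# Made-up points (an assumption for illustration): cluster A sits above and
# to the left of cluster B, and each cluster tilts upward to the right.
cluster_a = [(1, 8), (2, 9), (3, 9), (4, 10)]
cluster_b = [(7, 2), (8, 3), (9, 3), (10, 4)]

r_a = r_of(cluster_a)                 # within cluster A
r_b = r_of(cluster_b)                 # within cluster B
r_all = r_of(cluster_a + cluster_b)   # both clusters pooled

print(f"r within A: {r_a:+.2f}")
print(f"r within B: {r_b:+.2f}")
print(f"r pooled:   {r_all:+.2f}")
```

With these particular points each cluster gives a positive r while the pooled r is negative, which is the pattern sketched in the diagram.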
--
David Nichols Statistical Support Specialist SPSS, Inc.
Internet: nichols@spss.com
*******************************************************************************
Any correlation between my views and those of SPSS is strictly due to chance.
========================================================================
Date: Wed, 10 Jun 92 13:50:46 EDT
From: "Karl L. Wuensch" <PSWUENSC@ECUVM1>
Subject: Simpson's paradox
To: edstat-l@ncsuvm

This is a cross-posting from STAT-L. I thought those who teach regression, factorial ANOVA, and multivariate contingency table analyses might find this useful and not see it on STAT-L.
------------------------------------------------------------------------------

David Nichols <nichols@SPSS.COM> noted:

>Todd Dicker asks how two groups with positive slopes can combine to form a total group with a negative slope:
>
>This can happen quite easily. There's even a name for it: Simpson's Paradox.
............rest deleted...........
Nice explanation, David. Those interested in this paradox might want to read Messick & van de Geer's article "A Reversal Paradox" in Psychological Bulletin, 1981, vol. 90, 582-593. They define a reversal paradox as "having two variables that are positively (negatively) related while at the same time having the variables negatively (positively) related within each level of a third variable." They refer to other articles on the topic, including Simpson's. They also have some nice examples, well worth reading.

One of my favorites is this: The correlation between the number of firemen sent to a fire and the amount of damage suffered is positive, which seems paradoxical, until you realize that more firefighters are sent to larger fires. If you control for the size of the fire, the sign of the correlation changes to negative. Such examples provide a good way to introduce students to some of the complexities of multiple regression analyses, partial correlations, and analyses of covariance, including suppression.

They also discuss a celebrated study of sex bias in grad school admissions (at UC Berkeley) as an example of a reversal paradox with categorical variables. Briefly, the percentage of female applicants accepted was lower than that of male applicants, indicating, in the aggregate data, "bias" against women. But women were applying to programs that were more competitive while men were applying to less competitive programs. It is not difficult to construct a 3-way contingency table that indicates "bias" in one direction in the aggregate, but the opposite direction at each level of the third variable. Here is a simple example I use (contrived), where the body of the table has acceptance ratios:
 

                     Sex

                  Female          Male
Program

Clinical          15/100          1/10
Psychology         15%             10%
---------------------------------------
Experimental       6/10         50/100
Psychology          60%           50%
---------------------------------------
                  21/110        51/110
Marginals          19%            46%
 

Note that in the marginals it looks like bias against women, but within each program it looks like bias against men! I add at this point that these data might still represent bias against women at a different level -- why is admission to the clinical program so competitive -- maybe because our society does not sufficiently fund programs in the helping professions, professions traditionally associated with women!
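
The arithmetic behind the table can be reproduced in a few lines. This is a minimal sketch in Python; the counts are copied from the contrived table above, while the variable and function names are my own illustrative choices.

```python
from fractions import Fraction

# (accepted, applied) counts copied from the contrived table above.
applications = {
    "Clinical":     {"Female": (15, 100), "Male": (1, 10)},
    "Experimental": {"Female": (6, 10),   "Male": (50, 100)},
}

# Acceptance rate within each program, kept exact with Fraction.
within = {
    program: {sex: Fraction(acc, app) for sex, (acc, app) in by_sex.items()}
    for program, by_sex in applications.items()
}


def pooled_rate(sex):
    """Acceptance rate for one sex after collapsing over programs."""
    accepted = sum(by_sex[sex][0] for by_sex in applications.values())
    applied = sum(by_sex[sex][1] for by_sex in applications.values())
    return Fraction(accepted, applied)


marginal = {sex: pooled_rate(sex) for sex in ("Female", "Male")}

for program, rates in within.items():
    print(program, {sex: f"{float(r):.0%}" for sex, r in rates.items()})
print("Marginal", {sex: f"{float(r):.0%}" for sex, r in marginal.items()})
```

Within each program the female rate is the higher one, yet the pooled female rate (21/110) is lower than the pooled male rate (51/110), because most women applied to the program with the low acceptance rate.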

This paradox can also surface in factorial ANOVA when the cell frequencies are not proportional. Consider the following:

                                              
                  Sex
Program     Male         Female         marginal means
         Mean   N       Mean   N     weighted unweighted
         ------------------------------------------------
A         150  60       110   40        134      130
         ------------------------------------------------
B         160  10       120   90        124      140
         ------------------------------------------------

 

Note that the weighted marginal means indicate that the mean for A is greater than the mean for B, but the unweighted marginal means indicate B > A (which is also the ordering within each sex).
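
The two kinds of marginal mean can be recomputed directly from the cell means and cell sizes. A minimal sketch in Python follows; the cell values are copied from the table above, and the function names are illustrative assumptions.

```python
# (cell mean, cell n) copied from the table above.
cells = {
    "A": {"Male": (150, 60), "Female": (110, 40)},
    "B": {"Male": (160, 10), "Female": (120, 90)},
}


def weighted_mean(row):
    """Marginal mean with each cell mean weighted by its cell size."""
    total_n = sum(n for _, n in row.values())
    return sum(mean * n for mean, n in row.values()) / total_n


def unweighted_mean(row):
    """Simple average of the cell means, ignoring cell sizes."""
    return sum(mean for mean, _ in row.values()) / len(row)


weighted = {program: weighted_mean(row) for program, row in cells.items()}
unweighted = {program: unweighted_mean(row) for program, row in cells.items()}

for program in cells:
    print(program, "weighted:", weighted[program],
          "unweighted:", unweighted[program])
```

This reproduces the marginal columns of the table (134 and 124 weighted, 130 and 140 unweighted). The reversal comes entirely from the disproportionate cell sizes: program A is mostly male and program B is mostly female.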

Karl L. Wuensch, Dept. of Psychology, East Carolina Univ.
Greenville, NC 27858-4353
========================================================================
From: Tim Dorcey <TCD@CORNELLC>
Subject: Re: Simpson's paradox

Nice posting. My first reaction to the original posting was something like "of course this can happen, and here's some numbers that show how." Not particularly interesting on a mathematical level.... Your real world illustrations, however, were quite fascinating and reminded me why this "paradox" deserves a name.

Tim Dorcey BITNET: TCD@CORNELLC
Statistical Software Consultant Internet: TCD@CORNELLC.CIT.CORNELL.EDU
Cornell Information Technologies Phone: (607) 255-5715
Cornell University
Ithaca, NY 14853
========================================================================
From: Dr D.D. Stretch <dds@uk.ac.le>

Karl Wuensch amplifies quite nicely a message about Simpson's Paradox and other reversal paradoxes. I would just like to add that I have found describing and working through Simpson's paradox to be a particularly good way of introducing people to log-linear modeling, particularly if they are sceptical about why they have to consider using this "radically new technique" (hah!) when they can just continue to abuse PROC FREQ in SAS, or CROSSTABS in SPSS, by doing loads of 2-way chi-squared tests between all combinations of pairs of variables they are interested in. I also do the same thing, but not immediately at the beginning, when teaching people about path analysis, by talking about suppressor variables (as related to the discussions about regression in this thread). David Kenny in his book Correlation and Causality (Wiley, 1979) talks about these on pages 62-65.

David Stretch

+-----------------------+---------------------------------------------+
| Dr. David D Stretch | Understanding atomic physics is child's |
| Dept. of Psychiatry | play compared to understanding child's play |
| dds@leicester.ac.uk | Anon. |
+-----------------------+---------------------------------------------+
========================================================================
From: vokey@HG.ULETH.CA

An example of the reversal paradox that works well with intro stats students concerns teenagers, sex, junk food and body fat. If body fat is plotted against junk food consumption in teenagers, a negative correlation is found at the level of male and female groups, but a positive correlation is found within groups. Incidentally, although it is rarely discussed in this way, Simpson's Paradox (with contingency tables) and the other reversal examples are just special case examples of the general inappropriateness of cross-level inferences (e.g., the ecological and individualistic fallacies).

Strictly speaking, it is the collapsing of multi-way contingency tables in the presence of non-additivity that results in what has come to be called Simpson's paradox (Blyth, 1972; Simpson, 1951) -- a reversal of relationship across different levels of stratification, resulting from heterogeneity in the dimension or dimensions collapsed over to produce the aggregate table. More recently (e.g., Hintzman, 1980), the term has been generalized to include changes in magnitude in addition to changes in sign of relationship following the collapsing of contingency tables. Probably the best known real-world example of Simpson's paradox is that provided by Bickel, Hammel, and O'Connell (1975) concerning an apparent sex bias in acceptances to graduate school at Berkeley in 1973. Whereas for the 101 departments as a whole, males were accepted proportionately more often than females, the data for the individual departments suggested that, if anything, there was a bias in favor of accepting female students. The explanation for the apparent paradox is that even though generally, within departments, female applicants were proportionately as likely as (if not more likely than) male applicants to be accepted, females had a proportionately lower overall rate of acceptance because they applied disproportionately more often to the departments with the lowest rates of acceptance (Freedman, Pisani, & Purves, 1978).

That Simpson's paradox can and does occur is well-established. Recognition of at least some aspects of the general problem associated with collapsing over heterogeneity dates back at least as far as Yule (1903), and real-world examples of the reversal paradox were noted as early as Cohen and Nagel (1934). More recent, real-world examples may be found in Knapp (1985), Mantel (1982), and Wagner (1982). Further discussion and conceptualizations of the issue may be found in Cohen (1986), Flexser (1981), Gardner (1976), Hintzman (1980), Martin (1981), Paik (1985), Shapiro (1982), and Sunder (1983).

As noted, the problem is not limited to the collapsing of contingency tables; Simpson's paradox is but a special case of the more general problem of cross-level inferences (e.g., Knapp, 1977; Lindquist, 1940; Robinson, 1950; Thorndike, 1939), although it is not often portrayed as such (but see Paik, 1985; Wermuth, 1989). The "paradox" in Simpson's paradox, or in reversal paradoxes more generally, arises from the typically unwarranted assumption or expectation that aggregates should evidence the same relationships as the categories, levels, classes, items or individuals over which the aggregate was formed.

I have a manuscript available outlining a general method of providing for the homogeneity needed to collapse over one or more effects in multi-way contingency tables that is conceptually similar to control via analysis of covariance for continuous data. Computer code is also available.

Some useful references:

Bickel, P. J., Hammel, E. A., and O'Connell, J. W. (1975). Sex bias in graduate admissions: Data from Berkeley. Science, 187, 398-404.

Bishop, Y. M. M., Fienberg, S. E., & Holland, P. W. (1975). Discrete multivariate analysis: Theory and practice. Cambridge, Massachusetts: The MIT Press.

Blyth, C. R. (1972). On Simpson's paradox and the sure thing principle. Journal of the American Statistical Association, 67, 364-381.

Cohen, J. E. (1986). An uncertainty principle in demography and the unisex issue. The American Statistician, 40, 32-39.

Deming, W. E. (1943). Statistical adjustment of data. New York: Dover Publications, Inc.

Deming, W. E. and Stephan, F. F. (1940). On a least squares adjustment of a sampled frequency table when the expected marginal totals are known. Annals of Mathematical Statistics, 11, 427-444.

Fienberg, S. E. (1977). The analysis of cross-classified categorical data. Cambridge, Massachusetts: The MIT Press.

Flexser, A. J. (1981). Homogenizing the 2 X 2 contingency table: A method for removing dependencies due to subject and item differences. Psychological Review, 88, 327-339.

Freedman, D., Pisani, R., and Purves, R. (1978). Statistics. W.W. Norton & Company, New York.

Gardner, M. (1976). On the fabric of inductive logic and some probability paradoxes. Scientific American, 234, 119-124.

Goodman, L. A. (1970). The multivariate analysis of qualitative data: interactions among multiple classifications. Journal of the American Statistical Association, 65, 226-256.

Hintzman, D. L. (1980). Simpson's paradox and the analysis of memory retrieval. Psychological Review, 87, 398-410.

Kennedy, J. J. (1983). Analyzing qualitative data. Introductory log-linear analysis for behavioral research. New York: Praeger Publishers.

Knapp, T. R. (1977). The unit-of-analysis problem in applications of simple correlation analysis to educational research. Journal of Educational Statistics, 2, 171-186.

Knapp, T. R. (1985). Instances of Simpson's paradox. College Mathematics Journal, 16, 209-211.

Lindquist, E. F. (1940). Statistical analysis in educational research. Boston: Houghton Mifflin.

Mantel, N. (1982). Simpson's paradox in reverse. The American Statistician, 36, 395.

Martin, E. (1981). Simpson's paradox resolved: A reply to Hintzman. Psychological Review, 88, 372-374.

Mosteller, F. (1968). Association and estimation in contingency tables. Journal of the American Statistical Association, 63, 1-28.

Paik, M. (1985). A graphic representation of a three-way contingency table: Simpson's paradox and correlation. The American Statistician, 39, 53-54.

Robinson, W. S. (1950). Ecological correlations and the behavior of individuals. American Sociological Review, 15, 351-357.

Shapiro, S. H. (1982). Collapsing contingency tables -- a geometric approach. The American Statistician, 36, 43-46.

Simpson, E. H. (1951). The interpretation of interaction in contingency tables. Journal of the Royal Statistical Society, Ser. B., 13, 238-241.

Sunder, S. (1983). Simpson's reversal paradox and cost allocation. Journal of Accounting Research, 21, 222-233.

Thorndike, E. L. (1939). On the fallacy of imputing the correlations found for groups to individuals or smaller groups composing them. American Journal of Psychology, 52, 122-124.

Wagner, C. H. (1982). Simpson's paradox in real life. The American Statistician, 36, 46-48.

Wermuth, N. (1989). Moderating effects of subgroups in linear models. Biometrika, 76, 81-92.

Whittemore, A. S. (1978). Collapsibility of multi-dimensional contingency tables. Journal of the Royal Statistical Society, Ser. B., 40, 328-340.

Yule, G. U. (1903). Notes on the theory of association of attributes in statistics. Biometrika, 2, 121-134.

John R. Vokey ph.: (403) 329-2409
Department of Psychology
University of Lethbridge email: vokey@hg.uleth.ca
Lethbridge, Alberta
CANADA T1K 3M4
========================================================================
Sender: edstat-l@jse.stat.ncsu.edu
From: "Karl L. Wuensch" <PSWUENSC@ecuvm.cis.ecu.edu>
Subject: Simpson's Paradox, Real Example


Recently there was a discussion on this list of Simpson's paradox and a request for real examples thereof. I just ran across what could be construed as a real example. The data (r's) are in the article "Salary Compression in Faculty Salaries: Identification of a Suppressor Effect," by W. L. McCulley & R. G. Downey, Educ. Psychol. Measurement, 53: 79-86. The criterion variable is faculty salary. Both academic rank and years of service are positively correlated with salary, but in a multiple regression the Beta weight for years of service becomes negative. That is, when you hold rank constant, your salary is lower the longer you have been at the university. I intend to use this as an interesting example of the paradox (and of suppressor effects) and of the interpretation of multiple regressions and analyses of covariance. I also am sending copies to our local administrators who control my salary (I've been here many years).

The authors' interpretation of the results is interesting -- they argue that the negative Beta reflects "the failure of the organization to recognize seniority (continuation at the institution) with adequate compensation increases while meeting current market values of lower ranked individuals hired into the institution." When hiring new faculty, the university has to offer a salary that is at least somewhat competitive with those paid by similar institutions. Once aboard, however, they can dole out only tiny pay increases and still retain most faculty. The authors suggest that to get a decent salary a faculty member must be prepared to jump from one university to another frequently. A more common strategy to relieve the cognitive dissonance produced by observing that one is being paid less than one is worth is, I opine, reducing how diligently one works -- if they aren't going to pay me what I am worth, well, then I'll just not work very hard for them.

Yes, other interpretations of the data are possible (maybe my senior colleagues are really less meritorious than I, maybe they got rank from seniority alone, etc.), which is what, I think, makes an example like this interesting for a class.
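
For classroom use, the sign change can be illustrated with the standard two-predictor formula for standardized regression weights, beta_1 = (r_y1 - r_y2 * r_12) / (1 - r_12^2). The sketch below is in Python and uses HYPOTHETICAL correlations chosen only to show the effect; they are not the values reported by McCulley & Downey, and the variable names are my own.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized weights for regressing y on two predictors,
    computed from the three pairwise correlations."""
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


# HYPOTHETICAL correlations, for illustration only (not from the article).
r_salary_rank = 0.70    # salary with academic rank
r_salary_years = 0.30   # salary with years of service
r_rank_years = 0.60     # rank with years of service

beta_rank, beta_years = standardized_betas(
    r_salary_rank, r_salary_years, r_rank_years
)

print(f"beta for rank:  {beta_rank:+.4f}")
print(f"beta for years: {beta_years:+.4f}")
```

With these made-up values, years of service has a positive zero-order correlation with salary but a negative Beta once rank is in the model, because r_salary_years is smaller than the product r_salary_rank * r_rank_years.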

Karl L. Wuensch, Dept. of Psychology, East Carolina Univ.
Greenville, NC 27858-4353

 

Contact Information for the Webmaster,
Dr. Karl L. Wuensch


This page most recently revised on 31. March 2009.