East Carolina University
Department of Psychology
The Reversal Paradox (Simpson's Paradox)
Here is a log of some discussions among subscribers to stats listservs.
Date: Tue, 9 Jun 1992 19:34:44 GMT
Sender: "STATISTICAL CONSULTING" <STAT-L@MCGILL1.BITNET>
From: David Nichols <nichols@SPSS.COM>
Todd Dicker asks how two groups with positive slopes can combine to form
a total group with a negative slope:
This can happen quite easily. There's even a name for it: Simpson's Paradox.
To illustrate how this can easily happen, simply draw two clouds of points,
each of them forming an ellipse tilted upward to the right, indicating
a positive correlation, but have them located so that one cloud is above
and to the left of the other. When you look at each group separately, they
will each show a positive correlation, but if you check the overall
relationship, it will be negative. The same thing can happen with
categorical data as well, though it isn't as easy to illustrate graphically.
Below is a rather crude attempt, within the limits of the available
technology and my imagination, to illustrate the situation:
This situation would obviously produce positive correlations within each
cluster, but a negative one overall. This is essentially the basic reason
why we do multiple regression instead of individual bivariate regressions.
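The two-cloud picture can also be checked numerically. What follows is only a minimal sketch in Python: the coordinates are invented for illustration (they are not from the original post), and `pearson_r` is a hand-written helper, not a library call.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: cluster 1 sits above and to the left of cluster 2,
# and each cluster on its own tilts upward to the right.
x1, y1 = [1, 2, 3, 4, 5], [11, 13, 12, 14, 15]
x2, y2 = [11, 12, 13, 14, 15], [1, 3, 2, 4, 5]

r1 = pearson_r(x1, y1)              # within cluster 1
r2 = pearson_r(x2, y2)              # within cluster 2
r_all = pearson_r(x1 + x2, y1 + y2) # both clusters pooled

print(f"cluster 1: r = {r1:+.3f}")
print(f"cluster 2: r = {r2:+.3f}")
print(f"pooled:    r = {r_all:+.3f}")
```

With these made-up points both within-cluster correlations are positive while the pooled correlation is negative, which is the pattern described above.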
--
David Nichols Statistical Support Specialist SPSS, Inc.
Internet: nichols@spss.com
*******************************************************************************
Any correlation between my views and those of SPSS is strictly due to chance.
========================================================================
Date: Wed, 10 Jun 92 13:50:46 EDT
From: "Karl L. Wuensch" <PSWUENSC@ECUVM1>
Subject: Simpson's paradox
To: edstat-l@ncsuvm
This is a cross-posting from STAT-L. I thought those who teach
regression, factorial ANOVA, and multivariate contingency table analyses
might find this useful and not see it on STAT-L.
------------------------------------------------------------------------------
David Nichols <nichols@SPSS.COM> noted:
>Todd Dicker asks how two groups with positive slopes can combine to form
>a total group with a negative slope:
>
>This can happen quite easily. There's even a name for it: Simpson's Paradox.
............rest deleted...........
Nice explanation, David. Those interested in this paradox might
want to read Messick & van de Geer's article "A Reversal Paradox" in
Psychological Bulletin, 1981, vol. 90, 582-593. They define a reversal
paradox as "having two variables that are positively (negatively) related
while at the same time having the variables negatively (positively) related
within each level of a third variable." They refer to other articles on the
topic, including Simpson's. They also have some nice examples, well worth
reading. One of my favorites is this: The correlation between the number of
firemen sent to a fire and the amount of damage suffered is positive, which
seems paradoxical, until you realize that more firefighters are sent to
larger fires. If you control for the size of the fire, the sign of the
correlation changes to negative. Such examples provide a good way to
introduce students to some of the complexities of multiple regression
analyses, partial correlations, and analyses of covariance, including
suppression. They also discuss a celebrated study of sex bias in
grad school admissions (at UC, Berkeley) as an example of a reversal paradox,
with categorical variables. Briefly, the percentage of female applicants
accepted was lower than that of male applicants, indicating, in the aggregate
data, "bias" against women. But women were applying to programs that were more
competitive while men were applying to less competitive programs. It is not
difficult to construct a 3-way contingency table that indicates "bias" in one
direction in the aggregate, but the opposite direction at each level of the
third variable. Here is a simple example I use (contrived), where the body of
the table has acceptance ratios:
Sex
Female Male
Note that in the marginals it looks like bias against women, but
within each program it looks like bias against men! I add at this point
that these data might still represent bias against women at a different
level -- why is admission to the clinical program so competitive -- maybe
because our society does not sufficiently fund programs in the helping
professions, professions traditionally associated with women!
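A reversal of this kind is easy to reproduce with a small three-way table. The sketch below is in Python and uses invented counts: they are hypothetical, not the acceptance ratios from the original post, and the second program name ("experimental") is an assumption added only to have something to contrast with the clinical program.

```python
# Hypothetical (applicants, accepted) counts; NOT the original table.
counts = {
    "clinical":     {"female": (80, 20), "male": (20, 4)},
    "experimental": {"female": (20, 12), "male": (80, 40)},
}

def rate(applied, accepted):
    """Proportion of applicants accepted."""
    return accepted / applied

# Acceptance rate for each sex within each program.
within = {
    prog: {sex: rate(*cell) for sex, cell in by_sex.items()}
    for prog, by_sex in counts.items()
}

# Acceptance rate for each sex after collapsing over program.
aggregate = {}
for sex in ("female", "male"):
    applied = sum(counts[prog][sex][0] for prog in counts)
    accepted = sum(counts[prog][sex][1] for prog in counts)
    aggregate[sex] = rate(applied, accepted)

for prog, rates in within.items():
    print(f"{prog:13s} female {rates['female']:.2f}  male {rates['male']:.2f}")
print(f"{'aggregate':13s} female {aggregate['female']:.2f}  male {aggregate['male']:.2f}")
```

In these made-up data women have the higher acceptance rate within each program (0.25 vs. 0.20 and 0.60 vs. 0.50), yet the lower rate in the marginals (0.32 vs. 0.44), because most of the women applied to the more competitive program.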
This paradox can also surface in factorial ANOVA when the cell
frequencies are not proportional. Consider the following:
                          Sex
              Male          Female         marginal means
Program     Mean    N     Mean    N     weighted  unweighted
------------------------------------------------------------
   A         150   60      110   40       134        130
------------------------------------------------------------
   B         160   10      120   90       124        140
------------------------------------------------------------
Note that the weighted means indicate that the mean for A > for B,
but the unweighted means indicate B > A (which is true within each sex).
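The marginal means in the table can be reproduced directly from the cell means and cell sizes. This is a minimal Python sketch; the cell values are those in the table, while the function and variable names are mine.

```python
# (cell mean, cell N) for each program and sex, as given in the table.
cells = {
    "A": {"male": (150, 60), "female": (110, 40)},
    "B": {"male": (160, 10), "female": (120, 90)},
}

def weighted_mean(row):
    """Marginal mean with each cell mean weighted by its cell N."""
    total_n = sum(n for _, n in row.values())
    return sum(mean * n for mean, n in row.values()) / total_n

def unweighted_mean(row):
    """Simple average of the cell means, ignoring cell sizes."""
    return sum(mean for mean, _ in row.values()) / len(row)

for program, row in cells.items():
    print(program, weighted_mean(row), unweighted_mean(row))
# prints:
# A 134.0 130.0
# B 124.0 140.0
```

The weighted means are pulled toward whichever sex supplies most of the cases in each program, which is why they order the programs differently from the unweighted means.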
Karl L. Wuensch, Dept. of Psychology, East Carolina Univ.
Greenville, NC 27858-4353
========================================================================
From: Tim Dorcey <TCD@CORNELLC>
Subject: Re: Simpson's paradox
Nice posting. My first reaction to the original posting was something
like "of course this can happen, and here's some numbers that show how."
Not particularly interesting on a mathematical level....
Your real world illustrations, however, were quite fascinating and
reminded me why this "paradox" deserves a name.
Tim Dorcey BITNET: TCD@CORNELLC
Statistical Software Consultant Internet: TCD@CORNELLC.CIT.CORNELL.EDU
Cornell Information Technologies Phone: (607) 255-5715
Cornell University
Ithaca, NY 14853
========================================================================
From: Dr D.D. Stretch <dds@uk.ac.le>
Karl Wuensch amplifies quite nicely a message about Simpson's Paradox and
other reversal paradoxes.
I would just like to add that I have found describing and working through
Simpson's paradox to be a particularly good way of introducing
people to log-linear modeling - particularly if they are sceptical about
why they have to consider using this "radically new technique" (hah!) when
they can just continue to abuse PROC FREQ in SAS, or CROSSTABS in SPSS by
doing loads of 2-way chi-squared tests between all combinations of pairs
of variables they are interested in.
I also do the same thing, but not immediately at the beginning, with
teaching people about path analysis by talking about suppressor variables
(as related to the discussions about regression in this thread). David
Kenny in his book Correlation and Causality (Wiley, 1979) talks about these
on pages 62-65.
David Stretch
+-----------------------+---------------------------------------------+
| Dr. David D Stretch | Understanding atomic physics is child's |
| Dept. of Psychiatry | play compared to understanding child's play |
| dds@leicester.ac.uk | Anon. |
+-----------------------+---------------------------------------------+
========================================================================
From: vokey@HG.ULETH.CA
An example of the reversal paradox that works well with
intro stats students concerns teenagers, sex, junk food and
body fat. If body fat is plotted against junk food
consumption in teenagers, a negative correlation is found at
the level of male and female groups, but a positive
correlation is found within groups. Incidentally, although
it is rarely discussed in this way, Simpson's Paradox (with
contingency tables) and the other reversal examples are just
special case examples of the general inappropriateness of
cross-level inferences (e.g., the ecological and
individualistic fallacies).
Strictly speaking, it is the collapsing of multi-way
contingency tables in the presence of non-additivity that
results in what has come to be called Simpson's paradox
(Blyth, 1972; Simpson, 1951) -- a reversal of relationship
across different levels of stratification, resulting from
heterogeneity in the dimension or dimensions collapsed over
to produce the aggregate table. More recently (e.g.,
Hintzman, 1980), the term has been generalized to include
changes in magnitude in addition to changes in sign of
relationship following the collapsing of contingency tables.
Probably the best known, real-world example of Simpson's
paradox is that provided by Bickel, Hammel, and O'Connell
(1975) concerning an apparent sex-bias in acceptances to
graduate school at Berkeley in 1973. Whereas for the 101
departments as a whole, males were accepted proportionately
more often than females, the data for the individual
departments suggested that, if anything, there was a bias in
favor of accepting female students. The explanation for the
apparent paradox is that even though generally, within
departments, female applicants were proportionately as
likely as (if not more likely than) male applicants to be
accepted, females had a proportionately lower overall rate
of acceptance because they applied disproportionately more
often to the departments with the lowest rates of acceptance
(Freedman, Pisani, & Purves, 1978).
That Simpson's paradox can and does occur is well established.
Recognition of at least some aspects of the general
problem associated with collapsing over heterogeneity dates
back at least as far as Yule (1903), and real-world examples
of the reversal paradox were noted as early as Cohen and
Nagel (1934). More recent, real-world examples may be found
in Knapp (1985), Mantel (1982), and Wagner (1982). Further
discussion and conceptualizations of the issue may be found
in Cohen (1986), Flexser (1981), Gardner (1976), Hintzman
(1980), Martin (1981), Paik (1985), Shapiro (1982), and
Sunder (1983).
As noted, the problem is not limited to the collapsing of
contingency tables; Simpson's paradox is but a special case
of the more general problem of cross-level inferences (e.g.,
Knapp, 1977; Lindquist, 1940; Robinson, 1950; Thorndike,
1939), although it is not often portrayed as such (but see
Paik, 1985; Wermuth, 1989). The "paradox" in Simpson's
paradox, or in reversal paradoxes more generally, arises from the typically
unwarranted assumption or expectation that aggregates should
evidence the same relationships as the categories, levels,
classes, items or individuals over which the aggregate was
formed.
I have a manuscript available outlining a general method of
providing for the homogeneity needed to collapse over one or
more effects in multi-way contingency tables that is
conceptually similar to control via analysis of covariance
for continuous data. Computer code is also available.
Some useful references:
Bickel, P. J., Hammel, E. A., and O'Connell, J. W. (1975).
Sex bias in graduate admissions: Data from Berkeley.
Science, 187, 398-404.
Bishop, Y. M. M., Fienberg, S. E., & Holland, P. W. (1975).
Discrete multivariate analysis: Theory and practice.
Cambridge, Massachusetts: The MIT Press.
Blyth, C. R. (1972). On Simpson's paradox and the sure
thing principle. Journal of the American Statistical
Association, 67, 364-381.
Cohen, J. E. (1986). An uncertainty principle in demography
and the unisex issue. The American Statistician, 40, 32-39.
Deming, W. E. (1943). Statistical adjustment of data. New
York: Dover Publications, Inc.
Deming, W. E. and Stephan, F. F. (1940). On a least squares
adjustment of a sampled frequency table when the expected
marginal totals are known. Annals of Mathematical
Statistics, 11, 427-444.
Fienberg, S. E. (1977). The analysis of cross-classified
categorical data. Cambridge, Massachusetts: The MIT Press.
Flexser, A. J. (1981). Homogenizing the 2 X 2 contingency
table: A method for removing dependencies due to subject
and item differences. Psychological Review, 88, 327-339.
Freedman, D., Pisani, R., and Purves, R. (1978).
Statistics. W.W. Norton & Company, New York.
Gardner, M. (1976). On the fabric of inductive logic and
some probability paradoxes. Scientific American, 234,
119-124.
Goodman, L. A. (1970). The multivariate analysis of
qualitative data: interactions among multiple
classifications. Journal of the American Statistical
Association, 65, 226-256.
Hintzman, D. L. (1980). Simpson's paradox and the analysis
of memory retrieval. Psychological Review, 87, 398-410.
Kennedy, J. J. (1983). Analyzing qualitative data.
Introductory log-linear analysis for behavioral research.
New York: Praeger Publishers.
Knapp, T. R. (1977). The unit-of-analysis problem in
applications of simple correlation analysis to educational
research. Journal of Educational Statistics, 2, 171-186.
Knapp, T. R. (1985). Instances of Simpson's paradox.
College Mathematics Journal, 16, 209-211.
Lindquist, E. F. (1940). Statistical analysis in
educational research. Boston: Houghton Mifflin.
Mantel, N. (1982). Simpson's paradox in reverse. The
American Statistician, 36, 395.
Martin, E. (1981). Simpson's paradox resolved: A reply to
Hintzman. Psychological Review, 88, 372-374.
Mosteller, F. (1968). Association and estimation in
contingency tables. Journal of the American Statistical
Association, 63, 1-28.
Paik, M. (1985). A graphic representation of a three-way
contingency table: Simpson's paradox and correlation. The
American Statistician, 39, 53-54.
Robinson, W. S. (1950). Ecological correlations and the
behavior of individuals. American Sociological Review, 15,
351-357.
Shapiro, S. H. (1982). Collapsing contingency tables -- a
geometric approach. The American Statistician, 36, 43-46.
Simpson, E. H. (1951). The interpretation of interaction in
contingency tables. Journal of the Royal Statistical
Society, Ser. B., 13, 238-241.
Sunder, S. (1983). Simpson's reversal paradox and cost
allocation. Journal of Accounting Research, 21, 222-233.
Thorndike, E. L. (1939). On the fallacy of imputing the
correlations found for groups to individuals or smaller
groups composing them. American Journal of Psychology, 52,
122-124.
Wagner, C. H. (1982). Simpson's paradox in real life. The
American Statistician, 36, 46-48.
Wermuth, N. (1989). Moderating effects of subgroups in
linear models. Biometrika, 76, 81-92.
Whittemore, A. S. (1978). Collapsibility of
multidimensional contingency tables. Journal of the Royal
Statistical Society, Ser. B., 40, 328-340.
Yule, G. U. (1903). Notes on the theory of association of
attributes in statistics. Biometrika, 2, 121-134.
John R. Vokey ph.: (403) 329-2409
Department of Psychology
University of Lethbridge email: vokey@hg.uleth.ca
Lethbridge, Alberta
CANADA T1K 3M4
========================================================================
Sender: edstat-l@jse.stat.ncsu.edu
From: "Karl L. Wuensch" <PSWUENSC@ecuvm.cis.ecu.edu>
Subject: Simpson's Paradox, Real Example
Recently there was a discussion on this list of Simpson's paradox and
a request for real examples thereof. I just ran across what could be
construed as a real example. The data (r's) are in the article "Salary
Compression in Faculty Salaries: Identification of a Suppressor Effect,"
by W. L. McCulley & R. G. Downey, Educ. Psychol. Measurement, 53: 79-86. The
criterion variable is faculty salary. Both academic rank and years of service
are positively correlated with salary, but in a multiple regression the Beta
weight for years of service becomes negative. That is, when you hold rank
constant your salary is lower the longer you have been at the university. I
intend to use this as an interesting example of the paradox (and of suppressor
effects) and of the interpretation of multiple regressions and analyses of
covariance. I also am sending copies to our local administrators who
control my salary (I've been here many years). The authors' interpretation of
the results is interesting -- they argue that the negative Beta reflects "the
failure of the organization to recognize seniority (continuation at the
institution) with adequate compensation increases while meeting current market
values of lower ranked individuals hired into the institution." When hiring
new faculty, the university has to offer a salary that is at least somewhat
competitive with those paid by similar institutions. Once aboard, however,
they can dole out only tiny pay increases and still retain most faculty.
The authors suggest that to get a decent salary a faculty member must be
prepared to jump from one university to another frequently. A more common
strategy to relieve the cognitive dissonance produced by observing that one is
being paid less than one is worth is, I opine, reducing how diligently one works
-- if they aren't going to pay me what I am worth, well, then I'll just not
work very hard for them.
Yes, other interpretations of the data are possible (maybe my senior
colleagues are really less meritorious than I, maybe they got rank from
seniority alone, etc.), which is what, I think, makes an example like this
interesting for a class.
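For a class, the sign change can be shown from the correlations alone, since with two predictors the standardized regression weights are simple functions of the three r's. The Python sketch below uses invented correlations; they are hypothetical and are NOT the values reported by McCulley & Downey.

```python
# Hypothetical zero-order correlations (assumed for illustration only).
r_salary_rank = 0.80   # salary with academic rank
r_salary_years = 0.50  # salary with years of service
r_rank_years = 0.70    # rank with years of service

def standardized_betas(r_y1, r_y2, r_12):
    """Beta weights for a criterion regressed on two predictors,
    computed from the three pairwise correlations."""
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2

beta_rank, beta_years = standardized_betas(
    r_salary_rank, r_salary_years, r_rank_years
)
print(f"beta for rank:             {beta_rank:+.3f}")
print(f"beta for years of service: {beta_years:+.3f}")
```

With these assumed values years of service correlates positively with salary (r = .50), yet its Beta weight is negative once rank is in the model, because the product of the other two correlations (.80 x .70 = .56) exceeds .50.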
Karl L. Wuensch, Dept. of Psychology, East Carolina Univ.
Greenville, NC 27858-4353
This page most recently revised on
31. March 2009.