KURTOSIS.txt
========================================================================
Date: Thu, 01 Apr 93 00:19:26 EST
From: "Karl L. Wuensch"
Subject: Re: What is "heavy tailed"?
To: edstat-l@jse.stat.ncsu.edu
In-Reply-To: Message of Wed, 31 Mar 93 21:00:49 EST from
David Nichols thinks of a uniform distribution as being "heavy tailed."
I think of the same distribution as being very low in "heavy-tailedness."
I think both of us can defend our use of the phrase "heavy tailed." I am
really thinking of kurtosis, with greater kurtosis being the same as greater
"heavy-tailedness". The uniform distribution has the lowest possible kurtosis,
I believe, -2, right? When I say "heavy-tailed," I mean the distribution has
more scores (higher densities) far from its mean than would a normal
distribution OF THE SAME VARIANCE. In my experience such "leptokurtic"
distributions are both "fatter" in their tails and more "peaked" in their
center than is a normal distribution (Student's t is one example). The
descriptive terms used in describing distributions that differ in kurtosis
confuse many people. I always seem to confuse my students when I address the
issue. One exercise that may help the student get a feel for kurtosis is to
take a distribution like this:
X Freq
5 20
10 0
15 20
and start moving frequencies from 5 and 15 to 10, measuring kurtosis of the
changed distributions. Consider how far (in standard deviations) the 5 and the
15 are from the mean in the distribution above (only one sd away). Now
move scores to the center:
X Freq
5 1
10 18
15 1
In this second distribution the 5 and the 15 are over 3 sd from the mean
-- 10% of the scores are more than 3 sd from the mean -- now that I
consider "fat-tailed," though many would look at these two distributions
and call the first fatter in its tails than the second. I think we all
agree that the second distribution is higher in kurtosis.
I don't think my students really suffer much from not understanding
kurtosis well, so I don't make a big deal out of it.
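One way to make the exercise concrete is a short Python sketch (my addition, not part of the original post) that computes the population kurtosis of the two frequency tables above, using the fourth-moment definition with 3 subtracted so that the normal sits at 0:

```python
# Population kurtosis, E(X - mu)^4 / sigma^4 - 3, computed from a frequency
# table (divide by N, not N - 1, when computing the moments).
def pop_kurtosis(values, freqs):
    n = sum(freqs)
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    m2 = sum(f * (v - mean) ** 2 for v, f in zip(values, freqs)) / n
    m4 = sum(f * (v - mean) ** 4 for v, f in zip(values, freqs)) / n
    return m4 / m2 ** 2 - 3

# First distribution: every score is exactly one sd from the mean.
print(pop_kurtosis([5, 10, 15], [20, 0, 20]))   # -2.0
# Second distribution: the 5 and the 15 are more than 3 sd from the mean.
print(pop_kurtosis([5, 10, 15], [1, 18, 1]))    # 7.0
```

Moving frequency from the 5 and 15 into the 10 raises the kurtosis even though, in raw units, the tails hold fewer scores.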
========================================================================
Date: Thu, 01 Apr 93 14:12:00 EST
From: "Karl L. Wuensch"
"Dennis Roberts" corrects my earlier posting:
>No, a uniform distribution does NOT have the lowest kurtosis ... a U shaped
>one is even lower (or more negative). Glass and Hopkins ... Statistical
>methods in psy and ed, sec edition, 1984 page 71 have one (and I am sure
>that these appear in many books) graph ... U shaped is at the top and has
>the biggest negative value. This is confusing about kurtosis ... the
>tendency of a distribution to have ONE high peak point ... that throws
>people ... and they think that FLAT is at the opposite end ... but
>actually ... FLAT is just one point on the move from extremely one peaked
>to multipeakedness.
Oops, you are right, Dennis, thanks. Despite the fact that low kurtosis
is sometimes associated with uniform distributions ("curves tending toward the
rectangular are called platykurtic" -- Guilford & Fruchter, Fundamental
statistics in psychology and education, 1973), kurtosis can drop below the
level a uniform distribution has (as I should have known, given the U-shaped
distribution I posted as an example of low kurtosis). Here are the simple
distributions I use to illustrate the point to my students:
X frequencies
-- -----------------------------------------------
5 20 20 20 10 5 3 1
10 0 10 20 20 20 20 20
15 20 20 20 10 5 3 1
------------------------------------------------------------------
Pop. Kurt. -2.0 -1.75 -1.5 -1.0 0.0 1.33 8.0
Pop. Var. 25 20 16.6 12.5 8.3 5.77 2.27
-------------------------------------------------------------------
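The two bottom rows of the table can be checked with a short Python sketch (my addition, not in the original posts), using the population formula with N rather than N-1 in the moments:

```python
# Reproduce the "Pop. Var." and "Pop. Kurt." rows of the table above.
def pop_moments(values, freqs):
    """Return (population variance, kurtosis - 3) for a frequency table;
    moments are divided by N, not N - 1."""
    n = sum(freqs)
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    m2 = sum(f * (v - mean) ** 2 for v, f in zip(values, freqs)) / n
    m4 = sum(f * (v - mean) ** 4 for v, f in zip(values, freqs)) / n
    return m2, m4 / m2 ** 2 - 3

# frequencies at X = 5, 10, 15 for each column of the table
columns = [(20, 0, 20), (20, 10, 20), (20, 20, 20), (10, 20, 10),
           (5, 20, 5), (3, 20, 3), (1, 20, 1)]
for freqs in columns:
    var, kurt = pop_moments([5, 10, 15], freqs)
    print(f"kurt = {kurt:6.2f}   var = {var:6.2f}")
```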
Peter flom@murray.fordham.edu comments:
>I think the problem is that kurtosis has a precise definition and
>heavy tailed does not. I think moving more stuff to the center
>makes the curve look lighter tailed, but I can't prove it because there is
>no definition to prove it with.
I asked my colleagues to rank the above distributions on "tail-heaviness,"
without giving them any definition thereof. Every one who answered ranked the
distribution that had the lowest kurtosis as being most "tail-heavy," the one
with the next lowest kurtosis as next most "tail-heavy" etc. through the one
with the highest kurtosis as being least "tail-heavy." Clearly their implicit
definition of "tail-heaviness" is at odds with statements that kurtosis is a
measure of "tail-heaviness." Some of them changed their minds after I told
them how many standard deviations the 5's and 15's are from the mean
(only one sd in the leftmost distribution, over 3 sd in the rightmost) and
asked them to think of "tail-heaviness" as having higher z-scores
in the tails than would be expected in a normal distribution. Many simply
refused to answer, probably the best choice. ;-)
========================================================================
Date: Thu, 1 Apr 93 14:39:41 EST
From: dick darlington
The meaning and nature of kurtosis is well explicated in
"The meaning of kurtosis..." by J. J. A. Moors, American
Statistician, Nov 1986. Moors gives the formula
k = Var(Z^2)+1 where k = E(X-mu)^4/sigma^4
From this several points follow immediately:
(1) k >= 1
(2) k achieves its minimum value of 1 only for a symmetric two-point
distribution, NOT a uniform distribution as might be assumed from
calling k a measure of "peakedness".
(3) k is best interpreted as a measure of dispersion of the values
of Z^2 around their expected value of 1, where as usual
Z = (X-mu)/sigma
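Moors' identity is easy to verify numerically; here is a small Python check (my addition, using an arbitrary three-point distribution):

```python
# Check k = Var(Z^2) + 1 for a discrete distribution with probabilities p,
# where k = E(X - mu)^4 / sigma^4 and Z = (X - mu)/sigma.
values = [5.0, 10.0, 15.0]
p = [0.05, 0.90, 0.05]

mu = sum(v * pi for v, pi in zip(values, p))
sigma2 = sum(pi * (v - mu) ** 2 for v, pi in zip(values, p))
sigma = sigma2 ** 0.5

k = sum(pi * (v - mu) ** 4 for v, pi in zip(values, p)) / sigma2 ** 2

z2 = [((v - mu) / sigma) ** 2 for v in values]
ez2 = sum(pi * z for pi, z in zip(p, z2))                 # always 1
var_z2 = sum(pi * (z - ez2) ** 2 for pi, z in zip(p, z2))

print(k, var_z2 + 1)   # the two sides of Moors' identity agree
```

Since E(Z^2) = 1 by construction, Var(Z^2) = E(Z^4) - 1, which is all the identity says; the check makes the dispersion-of-Z^2 interpretation tangible.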
Dick Darlington, Psychology, Cornell
========================================================================
Date: Thu, 01 Apr 93 14:43:35 EST
From: "Karl L. Wuensch"
To: dick darlington
Thanks for that posting, especially the "dispersion of z**2 about 1"
thought. There remains the confusion of whether we subtract 3 from
kurtosis, so that the normal has kurtosis of 0, or just leave it as

     E(X - mu)**4
     ------------
        sd**4
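Either convention is easy to check numerically; the sketch below (my addition) approximates E(Z^4) for a standard normal by brute-force integration, giving about 3, so the "subtract 3" convention puts the normal at 0:

```python
import math

# Approximate E(Z^4) for the standard normal by a simple Riemann sum;
# the exact value is 3, so "excess" kurtosis (raw minus 3) is 0.
def phi(z):
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

n, a, b = 20_000, -10.0, 10.0
h = (b - a) / n
raw = sum(((a + i * h) ** 4) * phi(a + i * h) for i in range(n + 1)) * h

print(raw)        # ~3.0: raw kurtosis of the normal
print(raw - 3)    # ~0.0: excess kurtosis of the normal
```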
========================================================================
Date: Thu, 01 Apr 93 21:40:00 EST
Subject: Re: What is "heavy tailed"?
To: David Nichols
From: "Karl L. Wuensch"
In-Reply-To: Your message of Thu, 1 Apr 93 17:41:02 CST
>Yes, I was wrong; I had it backwards. However, a -2 kurtosis is not the
>minimum possible value. A U-shaped distribution has a smaller (larger
>negative) kurtosis than does a uniform distribution. (I ran five 0s and
>five 1s through FREQUENCIES and got -2.571.)
I have relied on the information in SAS manuals for my statement that
-2 is the smallest possible kurtosis, but I am moderately confident that is
correct. Why? Well, I computed kurtosis on perfect U distributions (using the
fourth moment about the mean definition) and got -2. Also, Darlington cited
Moors' American Statistician article and noted that the minimum value of
kurtosis is 1 and is only achieved with a symmetric two-point distribution
(like your five 0's and five 1's, or my "perfect" U-distributions --
rectangular rather than curved U's at that). Why did Darlington say 1
and not -2? Well, many persons subtract 3 from the fourth moment statistic
so that 0 is the kurtosis of a normal distribution -- so Darlington's
minimum of 1 is the same as my minimum of -2. So how did you get from
SPSS a value lower than -2 with your perfect U-distribution? Not to worry,
no need for a bug report -- no doubt SPSS does either exactly what SAS does
or something similar, and that is to compute an estimate of population
kurtosis from your sample data. SAS also returns values of kurtosis less than
-2 with such U-distributions. The SAS manual has the formula they use for
such estimation. If you were to use the population formula on your perfect
U-distribution you would get exactly -2. The formula is:
     E(X - mu)**4
     ------------ - 3    (be sure to use N rather than N-1 when computing sd)
        sd**4
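The difference between the two computations is easy to demonstrate in a few lines of Python (my sketch; labeling the bias-corrected g2 formula below as the one SAS and SPSS use is my assumption, though it reproduces the -2.571 reported from FREQUENCIES):

```python
data = [0] * 5 + [1] * 5          # the five 0s and five 1s
n = len(data)
mean = sum(data) / n

# Population formula: divide by N when computing the moments.
m2 = sum((x - mean) ** 2 for x in data) / n
m4 = sum((x - mean) ** 4 for x in data) / n
pop_kurt = m4 / m2 ** 2 - 3
print(pop_kurt)                   # -2.0 exactly

# Sample estimator: n - 1 variance plus small-sample correction factors.
s2 = sum((x - mean) ** 2 for x in data) / (n - 1)
g2 = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
      * sum((x - mean) ** 4 for x in data) / s2 ** 2
      - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
print(round(g2, 3))               # -2.571
```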
>I got a note from Hayden and he was correct in saying that I had not seen
>enough of the previous thread to realize what was going on, and also I've
>seen that my view of kurtosis was incomplete and incorrect. I'll be
>posting a public apology after some more discussion with him of certain
>details.
From what I have seen among my colleagues, students, self, and experts on
this and other lists, there is nothing unusual about getting confused about
kurtosis! I felt rather stupid when Dennis Roberts pointed out to me that
minimum kurtosis comes with the perfect U-distribution, not a uniform
distribution. That I had presented exactly such a distribution in my post
made it especially embarrassing. I guess that is what I get for reading and
replying to my mail late at night! But I have learned a little more about
kurtosis as a result, and so have some others, so it has been a useful
exchange. EDSTAT-L and related lists have become a big part of my "continuing
education."
========================================================================
Date: Thu, 01 Apr 93 22:09:29 EST
From: "Karl L. Wuensch"
Subject: kurtosis, platypuses, and leaping kangaroos
jerry@hnrc.tufts.edu (Jerry Dallal) sent us this delightful quote:
>>>>>>
Student (1927), Biometrika, 19, 160: Platykurtic curves are short in the tails
like platypuses; leptokurtic curves are heavy in the tails like kangaroos,
noted for 'lepping'
>>>>>>
Any of you historians out there know of the origin of the terms
platykurtic and leptokurtic? Knowing the Greek doesn't resolve the recent
confusion -- leptokurtic curves may be heavy in the tails like leaping
kangaroos, but "lepto" in Greek means "thin, fine, or small." "Platy"
means "broad or flat," but the U-distribution is even lower in kurtosis than
the flat distribution. From where did the term "kurtosis" derive?
========================================================================
Date: Thu, 1 Apr 93 22:08:17 EST
From: "David Scott"
In all the contributions to the discussion of heavy tails and kurtosis, I don't
think anyone has mentioned the work of Balanda & MacGillivray, Kurtosis: a
critical review, Amer. Statist. (1988), Vol. 42, pp. 111-119. To quote from the
abstract---"it is best to define kurtosis vaguely as the location- and
scale-free movement of probability mass from the shoulders of a distribution
into its centre and tails".
The paper is worth looking at if you are interested in kurtosis, and has plenty
of references to earlier work.
David Scott
========================================================================
Date: Fri, 2 Apr 93 09:06:50 EST
From: Sune Karlsson
Interesting example. I do to a certain extent agree that the
distribution with the lowest kurtosis is the most 'tail-heavy' - in the
sense that all the mass is in the tails. The problem with the example
(and what in my opinion makes it misleading) is that there are two
things that vary: the kurtosis and the variance. But kurtosis measures
'tail-heaviness' relative to the variance (or the variance squared, to
be precise), hence we should really keep the variance constant when
comparing kurtosis. Doing that, it is clear that a higher kurtosis also
implies a more peaked distribution, as in the case with the highest
kurtosis.
In this modified example, which holds the variance approximately
constant, it is quite clear that a higher kurtosis implies that there
are more extreme observations (or that the extreme observations are more
extreme). Whether this is what is meant by 'tail-heaviness' is of course
a different story. It is also evident that a higher kurtosis implies
that the distribution is more 'single-peaked' (this would be even more
evident if the sum of the frequencies were constant).
X frequencies
---- ---------------------------------------------
-6.6 0 0 0 0 0 0 1
-0.4 0 0 0 0 0 3 0
1.3 0 0 0 0 5 0 0
2.9 0 0 0 10 0 0 0
3.9 0 0 20 0 0 0 0
4.4 0 20 0 0 0 0 0
5 20 0 0 0 0 0 0
10 0 10 20 20 20 20 20
15 20 0 0 0 0 0 0
15.6 0 20 0 0 0 0 0
16.1 0 0 20 0 0 0 0
17.1 0 0 0 10 0 0 0
18.7 0 0 0 0 5 0 0
20.4 0 0 0 0 0 3 0
26.6 0 0 0 0 0 0 1
------------------------------------------------------------------
Pop. Kurt. -2.0 -1.75 -1.5 -1.0 0.0 1.33 8.0
Pop. Var. 25 25.1 24.8 25.2 25.2 25.0 25.1
-------------------------------------------------------------------
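Two of the columns can be checked with a few lines of Python (my sketch, using the population formula and the "subtract 3" convention), confirming that the variance stays near 25 while the kurtosis moves:

```python
# Verify that the variance is held roughly constant while kurtosis changes.
def pop_var_kurt(values, freqs):
    n = sum(freqs)
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    m2 = sum(f * (v - mean) ** 2 for v, f in zip(values, freqs)) / n
    m4 = sum(f * (v - mean) ** 4 for v, f in zip(values, freqs)) / n
    return m2, m4 / m2 ** 2 - 3

print(pop_var_kurt([2.9, 10, 17.1], [10, 20, 10]))   # (~25.2, -1.0)
print(pop_var_kurt([-6.6, 10, 26.6], [1, 20, 1]))    # (~25.1, 8.0)
```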
Sune Karlsson
Stockholm School of Economics
========================================================================
Date: Fri, 02 Apr 93 14:02:45 EST
From: "Karl L. Wuensch"
To: Sune Karlsson
Many thanks for posting your example of distributions with approximately
equal variances but differing kurtosis. Yes, I agree that mine was
(deliberately) misleading by confounding variance and kurtosis. Together
I think your and my examples will help my students better appreciate the
topic. I am adding your example and discussion to the instructional
materials I keep online, with credit to you, of course.
========================================================================
Date: Fri, 2 Apr 93 14:02:31 BST
From: "M.F.Fuller"
In the second edition of Kendall & Buckland's *Dictionary of Statistical
Terms*, the coining of the term 'kurtosis' is attributed to Karl Pearson
in 1906. The context intended is a unimodal distribution and peakedness
is measured by the relative steepness of ascent in the neighbourhood of
the mode. K & B wisely state that they doubt 'whether any single ratio can
adequately measure (this) quality'.
This definition is intriguing on two counts: the focus on the centre of
the distribution rather than the tails, and the specific concern with
unimodal distributions. I wonder when the heavy/light tailed description
of kurtosis first emerged? And it is no wonder that the scale free moment
based estimator of kurtosis attains its minimum for a very U-shaped curve!
Mike Fuller
Canterbury Business School, University of Kent, Canterbury, Kent, CT2 7PE,
========================================================================
From: hayden@oz.plymouth.edu (Bob Hayden)
> From where did the term "kurtosis" derive?
I think it was your great, great grandfather, Kurt O. Wuensch.
========================================================================
Date: Fri, 2 Apr 93 04:29:30 EST
From: "Gerhard Luecke"
Karl L. Wuensch asked:
> From where did the term "Kurtosis" derive ?
I cite from Funk & Wagnalls 'Standard Dictionary of the English
Language' (International Edition 1966):
" kurtosis ........ [
Thanks for the derivation. It also appears in my Funk & Wagnall's
here at the office, but wasn't in my "unabridged" dictionary at home.
=========================================================================
Sender: edstat-l@jse.stat.ncsu.edu
From: nichols@spss.com (David Nichols)
Subject: Kurtosis and heavy-tailed distributions
For the sake of anyone else who jumped into the discussion in the same
way I did (without knowing all the background and not really understanding
all the issues involved), I felt I should clarify some things I've
learned since my posting to the effect that kurtosis is a measure of
the "heaviness" of the tails of a distribution and that a uniform
distribution is heavier in the tails than a normal distribution.
First, the point that I think Bob Hayden was trying to make is summed
up in the _Encyclopedia of Statistical Sciences_ entry for heavy-tailed
distributions, which remarks that it is now generally acknowledged
that using kurtosis as such a measure is a "gross oversimplification."
Second, I reversed the directionality of the kurtosis measure. A value
less than 0 (less than 3 if you don't subtract the three out in the
computations the way SPSS and most other software packages do), called
a negative kurtosis, means a platykurtic distribution, which is lower
in the middle AND lighter in the tails than a normal, while a peaked
or leptokurtic distribution is higher in the middle and has more area
outside a given standard deviation range than does a normal distribution.
Learn something new most days. I certainly did yesterday. Thanks to Bob
Hayden, Karl Wuensch and Dennis Roberts for their e-mail discussion of
the issues.
--
David Nichols Senior Statistical Support Specialist SPSS, Inc.
Phone: (312) 329-3684 Internet: nichols@spss.com Fax: (312) 329-3657
========================================================================
From: maj@waikato.ac.nz
Subject: Heavy tails and kurtosis
The introduction of kurtosis into this discussion is essentially
a red herring. The Cauchy distribution, one of the most heavy-tailed
among commonly known distributions, has an undefined kurtosis.
Heaviness of tails can only usefully be compared between two distributions
whose densities have the same interval of real numbers as support.
Then you would say that one distribution F has a heavier right (left)
tail than another distribution G if the limit of the ratio of
densities f(x)/g(x) is greater than 1, and possibly +infinity, as x
tends to the right (left) end of the interval.
Of course the interval of common support may be finite or infinite.
Looking back at my definition, perhaps it can be improved by writing
lim inf instead of limit.
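A quick numeric illustration (my addition) of this density-ratio definition: the ratio of the standard Cauchy density to the standard normal density grows without bound as x increases, so the Cauchy has the heavier right tail.

```python
import math

def cauchy(x):
    # standard Cauchy density
    return 1.0 / (math.pi * (1.0 + x * x))

def normal(x):
    # standard normal density
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

for x in [2.0, 4.0, 6.0, 8.0]:
    print(x, cauchy(x) / normal(x))   # the ratio increases without bound
```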
--
Murray A. Jorgensen [ maj@waikato.ac.nz ] University of Waikato
Department of Mathematics and Statistics Hamilton, New Zealand
========================================================================
From: bparks@wuecona.wustl.edu (Robert P. Parks)
However, one should use the cumulative distribution function rather than
the density functions, and define heavy-tailedness right or left by:
F(x)>G(x) for x<L => F is heavier left tailed than G
F(x)>G(x) for x>L => F is heavier right tailed than G
which keeps things fairly clear compared to other methods of defining heavy
tailedness. The problem of course is that for arbitrary distributions, this
may not happen (F and G can 'wiggle' in their tails).
This definition certainly shows that the Cauchy or t is heavy tailed (on
both sides) compared to the normal. It can also be used when the supports
are not the same, but then the definition becomes rather yucky because,
e.g., any distribution with real-line support has a heavier left tail than
any distribution with support of, say, [0, Inf) - and you do not have to
muck things up with division by 0.
However, I can not find any reference (albeit small sample) that has this
definition of heavy tailedness.
========================================================================
From: barnett@mummy.agsm.unsw.OZ.AU (Glen Barnett)
Both Murray's and Bob's definitions tell us that a N(0,4) is heavier
tailed than a N(0,1) (on both sides), yet all that is different is
the spread. Similarly, both these would say that a N(1,1) has a heavier
right tail, and a lighter left tail than a N(0,1).
Since these definitions are not location or scale free, they may
not be particularly useful in all situations.
========================================================================
From: barnett@mummy.agsm.unsw.OZ.AU (Glen Barnett)
In article nichols@spss.com (David Nichols) writes:
>a negative kurtosis, means a platykurtic distribution, which is lower
>in the middle AND lighter in the tails than a normal, while a peaked
>or leptokurtic distribution is higher in the middle and has more area
>outside a given standard deviation range than does a normal distribution.
If you are using kurtosis as a standardised fourth moment like this,
it is *not* necessarily true that height in the middle (let's stick with
symmetric distributions for now, so we know what 'middle' means) implies
anything about kurtosis, even when the spreads are equal (in some sense,
say s.d.'s). Kendall and Stuart (The Advanced Theory of Statistics)
give a nice example of this type of thing.
========================================================================
From: barnett@mummy.agsm.unsw.OZ.AU (Glen Barnett)
>
>Both Murray's and Bob's definitions tell us that a N(0,4) is heavier
>tailed than a N(0,1) (on both sides), yet all that is different is
>the spread. Similarly, both these would say that a N(1,1) has a heavier
>right tail, and a lighter left tail than a N(0,1).
>
>Since these definitions are not location or scale free, they may
>not be particularly useful in all situations.
Well, that is true, and a good point to bear in mind when using the
definition, but not a point against the definition.
While you might not like it, a N(0,4) is heavier tailed than a N(0,1) in
much the same way as a T(n) is heavier tailed than a N(0,1). You seem to want
T(n) heavy but not N(0,4) while I am willing to live with and in fact want a
N(0,4) heavier than a N(0,1).
If you think that you should transform somehow and then make the definition,
just how would you transform a T(10) and/or a Cauchy, and get the desired
result that N(0,4) and N(0,1) have 'similar' tailedness but the T(10) and
Cauchy are still heavier. I do not see a way.
I stand by my definition - N(0,1) is lighter than N(0,4) and T(n).
========================================================================
From: nichols@spss.com (David Nichols)
I'm not sure that what you're saying here contradicts what I said. I said
that something about kurtosis implies something about the middle of the
distribution, while you're saying that something about the middle of the
distribution doesn't necessarily imply something about kurtosis. These
may or may not be contradictory statements.
At any rate, here is where I got this statement (p. 426 of Vol. 4 of
Kotz and Johnson's _Encyclopedia of Statistical Sciences_): "Ratios
smaller than 3 [negative kurtosis in SPSS] correspond to curves with a
flatter top than the normal (e.g., rectangular)."
========================================================================
From: rlenth@stat.uiowa.edu (Russ Lenth)
Subject: Bad advice
> While you might not like it, a N(0,4) is heavier tailed than a N(0,1) ...
"Not liking it" is definitely an understatement. This is wrong. Period.
A big problem with these email discussion lists is that bad advice
can spread like wildfire; and the volume of discussion is so immense that
few people have the time to sort out the bad from the good.
This is a Statistics Education discussion list. Hey, educators, see the
challenge we face?
------------------------------------------------ --------------,
Russ Lenth rlenth@stat.uiowa.edu \ )
Department of Statistics | I O W A \_
and Actuarial Science Tel: (319)335-0814 \ \
The University of Iowa FAX: (319)335-3017 \ * _/
Iowa City, IA 52242 USA CW: *= * ====== *=* | Iowa City /
------------------------------------------------ ----------v
"Everything happens for a reason,
but that doesn't mean that there's a point."
--- Sue Grafton
========================================================================
From: David A. Evans
>From: barnett@mummy.agsm.unsw.OZ.AU (Glen Barnett)
>[stuff deleted]
>I stand by my definition - N(0,1) is lighter than N(0,4) and T(n).
>
>Bob
>
Then it follows that if you have gathered measurements in units of
gallons, when you change units to quarts the nature of the distribution
changes?????
========================================================================
From: hayden@oz.plymouth.edu (Bob Hayden)
Maybe I'll just get EVERYBODY mad at me, but ...
I believe the idea of "heavy-tailed" distribution came out of the
robustness studies of the 1960's. I have never seen a precise
mathematical definition -- it's usually an intuitive idea expressed
rather vaguely in words. Whatever it may mean, it is definitely NOT
the same as whatever it is that kurtosis measures. On the other hand,
usage suggests that all normal distributions have the same tail
weight.
I think the real question here is:
Can this intuitive notion be made precise?
If so, how?
Discussants to date can be broken into two groups: those who provide
a precise definition and those who do not. The former find that the
precise definitions do not match their intuitive concepts.
To make headway I think we need to try to list and refine the
intuitive notions and then try to see if they can be made more
precise. The goal should be a definition that agrees with the
intuitive notion or a demonstration that the intuitive notion is
internally inconsistent.
One of the papers cited in connection with the earlier
boxplot/quartile discussion (Freund was one of the authors) provides a
good model by listing some intuitive properties we expect of quartiles
and then evaluating various proposed definitions according to them.
So, to the intuitionists: try to tell us more precisely what you mean
by "tail-heavy" and what properties you want in a definition. To the
precisionists: provide a definition that matches the intuition or
show that it can not be done.
As an example, in the context in which "heavy-tailed" came up (30+
years ago) I think it is safe to say that the intent was that the
Cauchy has heavier tails than any t with d.f.>1 and the t's with
finite d.f. have heavier tails than any normal.
========================================================================
From: bparks@wuecona.wustl.edu (Robert P. Parks)
So the challenge to you is to provide a definition of heavy-tailedness
which is accepted by some if not all, and under which the N(0,4) is not
heavier tailed than a N(0,1) but a T(10) is heavier tailed than a N(0,1)
(or are you saying that a T(10) is not heavier tailed?).
By the definitions presented so far, it is NOT WRONG, and you are wrong
to say that it is.
>A big problem with these email discussion lists is that bad advice
>can spread like wildfire; and the volume of discussion is so immense that
>few people have the time to sort out the bad from the good.
You are correct. You have given bad advice. You called something WRONG
without any argument or proof.
As they say Put up (your definition) or SHUT UP. Bob
========================================================================
From: "David Scott"
Russ Lenth wrote:
> A big problem with these email discussion lists is that bad advice
> can spread like wildfire; and the volume of discussion is so immense that
> few people have the time to sort out the bad from the good.
I can certainly agree with that. I don't think he is correct however about the
comparison of the tails of the N(0,1) and N(0,4)---my calculations show that
N(0,4) is indeed heavier-tailed than N(0,1).
To see why, go back to Murray Jorgenson's definition of heavy-tailedness. This
is basically right, but has one flaw. He wrote:
>However, one should use the cumulative distribution function r.t. the density
>functions and define heavy tailedness right or left by:
>
> F(x)>G(x) for x<L => F is heavier left tailed than G
> F(x)>G(x) for x>L => F is heavier right tailed than G
>
For the right hand tail, the weight in the tail is given not by F(x) but by
1-F(x), so the definition of heavier right-tailed is 1-F(x)>1-G(x), or
F(x)<G(x), for x>L. The corresponding definition for densities is:
f(x)>g(x) for x<L => f is heavier left tailed than g
f(x)>g(x) for x>L => f is heavier right tailed than g
and then we will have f is heavier tailed than g iff F is heavier tailed than
G.
Having done all that we can assert that N(0,4) (f) is heavier-tailed than
N(0,1) (g). Just compare the densities as x->+ or - \infty. The ratio f/g goes
to infinity. I haven't bothered to check the t distribution, but given
that it lacks higher moments because of its heavier tails, I am willing
to bet that it has heavier tails than the normal as well.
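Numerically (my sketch), with f the N(0,4) density (sd = 2) and g the N(0,1) density, the ratio f/g equals (1/2)exp(3x^2/8), which indeed tends to infinity:

```python
import math

def norm_pdf(x, sd):
    # normal density with mean 0 and standard deviation sd
    return math.exp(-x * x / (2 * sd * sd)) / (sd * math.sqrt(2 * math.pi))

for x in [1.0, 2.0, 4.0, 6.0]:
    ratio = norm_pdf(x, 2.0) / norm_pdf(x, 1.0)
    print(x, ratio, 0.5 * math.exp(3 * x * x / 8))   # the two columns match
```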
========================================================================
From: rlenth@stat.uiowa.edu (Russ Lenth)
I would like to focus on a unifying concept that I believe underlies
all scale-free measures of tail weight:
tail weight = ratio of two scale measures
The one in the numerator should be more sensitive to the tails than the
one in the denominator.
As long as both the numerator and denominator are in the same units,
the ratio is scale-free. If the numerator and denominator are not in
the same units, then the measure is not scale-free; and, in my opinion,
it no longer can be considered a measure of tail weight.
The standardized kurtosis is such a ratio:
E(X - mu)^4
K = ---------------
{E(X - mu)^2}^2
If X is in inches, then both numerator and denominator are in
inches^4. Both numerator and denominator are scale measures. The
numerator, being based on the fourth power of the deviations, is
clearly more sensitive to the tails than is the denominator. If you
subtract 3 from K, as some prefer, I admit it is no longer a ratio of
scales; but it is still based on such a ratio.
The problem with the kurtosis is that it is not defined for many
distributions, especially those that are generally regarded as
heavy-tailed. Using the ratio-of-scales idea, it is easy to dream up
tailweight measures that are defined for all distributions. Here are
two examples:
Q-ratio = (x[.95] - x[.05]) / (x[.75] - x[.25])
where x[p] is the p-th quantile of the distribution of X.
S-ratio = sigma[.05] / sigma[.25]
where sigma[p] is the trimmed standard deviation (i.e., the SD of the
distribution after truncating it to the interval (x[p], x[1-p])). Both
of these ratios are invariant to linear transformations of the data;
and they both rank the t family such that, the fewer the degrees of
freedom, the higher the ratio.
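A Monte Carlo sketch of the Q-ratio (my addition; the post gives only the definitions) shows the claimed ranking of the t family, with the normal as the large-df limit:

```python
import random

# Q-ratio = (x[.95] - x[.05]) / (x[.75] - x[.25]), estimated from samples.
def q_ratio(xs):
    xs = sorted(xs)
    q = lambda p: xs[int(p * (len(xs) - 1))]   # crude empirical quantile
    return (q(0.95) - q(0.05)) / (q(0.75) - q(0.25))

def t_sample(df, size, rng):
    # t variate = Z / sqrt(chi-square(df) / df)
    out = []
    for _ in range(size):
        z = rng.gauss(0, 1)
        chi2 = sum(rng.gauss(0, 1) ** 2 for _ in range(df))
        out.append(z / (chi2 / df) ** 0.5)
    return out

rng = random.Random(42)
n = 50_000
q3 = q_ratio(t_sample(3, n, rng))
q10 = q_ratio(t_sample(10, n, rng))
qn = q_ratio([rng.gauss(0, 1) for _ in range(n)])
print(q3, q10, qn)   # Q-ratio falls as tails lighten: t(3) > t(10) > normal
```

Being a ratio of two ranges in the same units, the Q-ratio is unchanged by any linear transformation of the data, which is exactly the scale-free property argued for above.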
I believe, but am not sure, that this is a new contribution to the
discussion. I had not paid much attention to it until I saw postings
that I regard as clearly incorrect. I apologize if it is redundant.
========================================================================
From: BERGERD@CGSVAX.CLAREMONT.EDU
Hello fellow Edstatlers,
I would like to hear more discussion about how folks actually use the
kurtosis index in data analysis. While I appreciate the fact that
the index is sensitive to 'heavy' tails and 'peaked' middles, a
more practical use for the kurtosis index is that it can serve as
a warning that there may be one or more extreme outliers in the
data.
If I see a kurtosis value of 1 or greater, I take another look at
the plot to make sure I didn't miss something. With larger data
sets one can usually tolerate greater kurtosis, though even with a
large data set, kurtosis greater than 1 might alert the user to an
error in the data.
Of course, one should not rely on kurtosis to detect outliers.
Summary indices like kurtosis and skew are no substitute for an
inspection of the plot. Yet I offer the rule of thumb that
kurtosis greater than 1 indicates that a transformation should be
considered to reduce the influence of extreme scores. Is this a
reasonable rule of thumb?
Dale Berger
Psychology
The Claremont Graduate School
Claremont, California
========================================================================
April, 1993
Hi Dave, (this was a letter to David Howell)
I just started reading selected chapters of the new edition of your
Fundamentals book. Thanks for the footnote in chapter three noting that I
don't think it wise to make a big deal out of kurtosis in an introductory
course. Nevertheless, I am going to suggest a change in your presentation
of it. You wrote "Kurtosis ...... refers to the degree to which scores
congregate in the tails of the distribution. A distribution in which there are
large numbers of high and low scores far from the center of the distribution is
referred to as a heavy-tailed (or platykurtic) distribution. A distribution
that is relatively thin in the tails is called a leptokurtic distribution."
Well, those who associate heavy-tailedness with kurtosis usually describe the
leptokurtic distribution, not the platykurtic distribution, as "heavy tailed,"
but heavy in the tails is defined in units that are standardized. But at this
point in the text you have not yet introduced the variance and standard
deviation, so there is no easy way to get that across to the students. Perhaps
it would be best to avoid the term "heavy tailed" at this point. I suggest
something like this:
Kurtosis is related to the relative concentrations of the scores in the
tails (upper and lower ends), shoulders (between the tails and the center),
and center of a distribution. (This is more eloquently said in Balanda &
MacGillivray's article in Amer. Stat. 42: 111-119.) If you start with a
distribution of medium kurtosis (a mesokurtic distribution), like that with
the dotted curve in Figure 3.10, and move scores from the tails and from the
center into the shoulders, the distribution becomes low in kurtosis
(platykurtic). If you continue moving scores to the shoulders the curve
assumes first a rectangular shape and then a U shape. If you start with
a mesokurtic distribution and move scores from the shoulders to the tails and
to the center, then the distribution becomes high in kurtosis (leptokurtic),
like the Student's t-distribution we shall study later.
Dave, did you follow the edstat-l discussion of kurtosis and tail
heaviness beyond the post where I confessed that I don't make a big deal
out of it with my students? I loved the quote from Student's 1927 Biometrika
article (19: 160): "Platykurtic curves are short in the tails like platypuses;
leptokurtic curves are heavy in the tails like kangaroos, noted for lepping."
One contributor provided a nice set of simple distributions which varied in
kurtosis but were constant in variance. I think they would effectively
demonstrate to my graduate students that high kurtosis is heavy tailedness,
but I'm not about to show them to my undergraduate students.
It is too beautiful outside for me to be sitting here at my PC --
I'm going to head out to my garden now.
==============================================================================
For an excellent article on kurtosis, see DeCarlo, L. T. (1997). On
the meaning and use of kurtosis. Psychological Methods, 2, 292-307.