KURTOSIS.txt

========================================================================
Date:     Thu, 01 Apr 93 00:19:26 EST
From:     "Karl L. Wuensch"
Subject:  Re: What is "heavy tailed"?
To:       edstat-l@jse.stat.ncsu.edu
In-Reply-To:  Message of Wed, 31 Mar 93 21:00:49 EST from David Nichols

David Nichols thinks of a uniform distribution as being "heavy tailed."  I think of the same distribution as being very low in "heavy-tailedness."  I think both of us can defend our use of the phrase "heavy tailed."  I am really thinking of kurtosis, with greater kurtosis being the same as greater "heavy-tailedness".  The uniform distribution has the lowest possible kurtosis, I believe, -2, right?

When I say "heavy-tailed," I mean the distribution has more scores (higher densities) far from its mean than would a normal distribution OF THE SAME VARIANCE.  In my experience such "leptokurtic" distributions are both "fatter" in their tails and more "peaked" in their center than is a normal distribution (Student's t is one example).

The descriptive terms used in describing distributions that differ in kurtosis confuse many people.  I always seem to confuse my students when I address the issue.  One exercise that may help the student get a feel for kurtosis is to take a distribution like this:

      X    Freq
      5     20
     10      0
     15     20

and start moving frequencies from 5 and 15 to 10, measuring kurtosis of the changed distributions.  Consider how far (in standard deviations) the 5 and the 15 are from the mean in the distribution above (only one sd away).  Now move scores to the center:

      X    Freq
      5      1
     10     18
     15      1

In this second distribution the 5 and the 15 are over 3 sd from the mean -- 10% of the scores are more than 3 sd from the mean -- now that I consider "fat-tailed," though many would look at these two distributions and call the first fatter in its tails than the second.  I think we all agree that the second distribution is higher in kurtosis.  I don't think my students really suffer much from not understanding kurtosis well, so I don't make a big deal out of it.

========================================================================
Date:     Thu, 01 Apr 93 14:12:00 EST
From:     "Karl L. Wuensch"

"Dennis Roberts" corrects my earlier posting:

>No, a uniform distribution does NOT have the lowest kurtosis ... a U shaped
>one is even lower (or more negative).  Glass and Hopkins ... Statistical
>methods in psy and ed, sec edition, 1984 page 71 have one (and I am sure
>that these appear in many books) graph ... U shaped is at the top and has
>the biggest negative value.  This is confusing about kurtosis ... the
>tendency of a distribution to have ONE high peak point ... that throws
>people ... and they think that FLAT is at the opposite end ... but
>actually ... FLAT is just one point on the move from extremely one peaked
>to multipeakedness.

Oops, you are right, Dennis, thanks.  Despite the fact that low kurtosis is sometimes associated with uniform distributions ("curves tending toward the rectangular are called platykurtic" -- Guilford & Fruchter, Fundamental statistics in psychology and education, 1973), kurtosis can drop below the level a uniform distribution has (as I should have known, given the U-shaped distribution I posted as an example of low kurtosis).  Here are the simple distributions I use to illustrate to my students:

      X      frequencies
     ---     -----------------------------------------------
      5       20     20     20     10      5      3      1
     10        0     10     20     20     20     20     20
     15       20     20     20     10      5      3      1
     ------------------------------------------------------------------
     Pop. Kurt.     -2.0  -1.75   -1.5   -1.0    0.0   1.33    8.0
     Pop. Var.        25     20   16.6   12.5    8.3   5.77   2.27
     -------------------------------------------------------------------
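[A minimal Python sketch of the computation behind the table above -- the population variance and the excess kurtosis (fourth moment about the mean divided by the squared variance, minus 3) for each frequency column.  The function and variable names are illustrative only.]

    # Population variance and excess kurtosis from a frequency table.
    def pop_var_kurt(values, freqs):
        n = sum(freqs)
        mean = sum(x * f for x, f in zip(values, freqs)) / n
        m2 = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
        m4 = sum(f * (x - mean) ** 4 for x, f in zip(values, freqs)) / n
        return m2, m4 / m2 ** 2 - 3          # variance, excess kurtosis

    X = [5, 10, 15]
    columns = [(20, 0, 20), (20, 10, 20), (20, 20, 20), (10, 20, 10),
               (5, 20, 5), (3, 20, 3), (1, 20, 1)]
    for freqs in columns:
        var, kurt = pop_var_kurt(X, freqs)
        print(f"freqs={freqs}:  var={var:.2f}  kurtosis={kurt:.2f}")
    # Reproduces the "Pop. Var." and "Pop. Kurt." rows: kurtosis runs from
    # -2.0 for the two-point (U-shaped) column up to 8.0 for the most peaked one.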
Peter Flom (flom@murray.fordham.edu) comments:

>I think the problem is that kurtosis has a precise definition and
>heavy tailed does not.  I think moving more stuff to the center
>makes the curve look lighter tailed, but I can't prove it because there is
>no definition to prove it with.

I asked my colleagues to rank the above distributions on "tail-heaviness," without giving them any definition thereof.  Everyone who answered ranked the distribution that had the lowest kurtosis as being most "tail-heavy," the one with the next lowest kurtosis as next most "tail-heavy," etc., through the one with the highest kurtosis as being least "tail-heavy."  Clearly their implicit definition of "tail-heaviness" is at odds with statements that kurtosis is a measure of "tail-heaviness."  Some of them changed their minds after I told them how many standard deviations the 5's and 15's are from the mean (only one sd in the leftmost distribution, over 3 sd in the rightmost) and asked them to think of "tail-heaviness" as having higher z-scores in the tails than would be expected in a normal distribution.  Many simply refused to answer, probably the best choice.  ;-)

========================================================================
Date:     Thu, 1 Apr 93 14:39:41 EST
From:     dick darlington

The meaning and nature of kurtosis are well explicated in "The meaning of kurtosis..." by J. J. A. Moors, American Statistician, Nov 1986.  Moors gives the formula

     k = Var(Z^2) + 1,    where   k = E(X-mu)^4 / sigma^4

From this several points follow immediately:

(1)  k >= 1
(2)  k achieves its minimum value of 1 only for a symmetric two-point
     distribution, NOT a uniform distribution as might be assumed from
     calling k a measure of "peakedness".
(3)  k is best interpreted as a measure of dispersion of the values of
     Z^2 around their expected value of 1, where as usual Z = (X-mu)/sigma.

Dick Darlington, Psychology, Cornell

========================================================================
Date:     Thu, 01 Apr 93 14:43:35 EST
From:     "Karl L. Wuensch"
To:       dick darlington

Thanks for that posting, especially the "dispersion of z**2 about 1" thought.  There remains the confusion of "do we subtract 3 from kurtosis so that the normal has kurtosis of 0 or just let it be as

     E(x - mu)**4
     ------------ "
        sd**4
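[To make Moors' point concrete, here is a minimal Python sketch (names illustrative) checking numerically that E(Z^4) = Var(Z^2) + 1 for a discrete distribution, and that a symmetric two-point distribution sits at the minimum k = 1 -- that is, -2 once the 3 is subtracted.]

    # Check Moors' identity k = E(Z^4) = Var(Z^2) + 1 on a discrete distribution.
    def moors(values, probs):
        mean = sum(p * x for x, p in zip(values, probs))
        var = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))
        z2 = [((x - mean) ** 2) / var for x in values]                 # Z^2 values
        k = sum(p * z ** 2 for z, p in zip(z2, probs))                 # E(Z^4)
        var_z2 = sum(p * (z - 1) ** 2 for z, p in zip(z2, probs))      # Var(Z^2), since E(Z^2) = 1
        return k, var_z2

    # Symmetric two-point (U-shaped) distribution: k hits its minimum of 1.
    print(moors([0, 1], [0.5, 0.5]))                  # -> (1.0, 0.0), i.e. kurtosis - 3 = -2
    # A much more peaked distribution: k, and hence Var(Z^2), is far larger.
    print(moors([5, 10, 15], [0.05, 0.90, 0.05]))     # -> (10.0, 9.0), i.e. kurtosis - 3 = 7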
========================================================================
Date:     Thu, 01 Apr 93 21:40:00 EST
Subject:  Re: What is "heavy tailed"?
To:       David Nichols
From:     "Karl L. Wuensch"
In-Reply-To:  Your message of Thu, 1 Apr 93 17:41:02 CST

>Yes, I was wrong; I had it backwards.  However, a -2 kurtosis is not the
>minimum possible value.  A U-shaped distribution has a smaller (larger
>negative) kurtosis than does a uniform distribution.  (I ran five 0s and
>five 1s through FREQUENCIES and got -2.571.)

I have relied on the information in SAS manuals for my statement that -2 is the smallest possible kurtosis, but I am moderately confident that is correct.  Why?  Well, I computed kurtosis on perfect U distributions (using the fourth moment about the mean definition) and got -2.  Also, Darlington cited Moors' American Statistician article and noted that the minimum value of kurtosis is 1 and is only achieved with a symmetric two-point distribution (like your five 0's and five 1's, or my "perfect" U-distributions -- rectangular rather than curved U's at that).  Why did Darlington say 1 and not -2?  Well, many persons subtract 3 from the fourth moment statistic so that 0 is the kurtosis of a normal distribution -- so Darlington's minimum of 1 is the same as my minimum of -2.

So how did you get from SPSS a value lower than -2 with your perfect U-distribution?  Not to worry, no need for a bug report -- no doubt SPSS does either exactly what SAS does or something similar, and that is to compute an estimate of population kurtosis from your sample data.  SAS also returns values of kurtosis less than -2 with such U-distributions.  The SAS manual has the formula they use for such estimation.  If you were to use the population formula on your perfect U-distribution you would get exactly -2.  The formula is:

     E(X-mu)**4              be sure to use N rather than N-1
     ----------  -  3        when computing sd.
        sd**4

>I got a note from Hayden and he was correct in saying that I had not seen
>enough of the previous thread to realize what was going on, and also I've
>seen that my view of kurtosis was incomplete and incorrect.  I'll be
>posting a public apology after some more discussion with him of certain
>details.

From what I have seen among my colleagues, students, self, and experts on this and other lists, there is nothing unusual about getting confused about kurtosis!  I felt rather stupid when Dennis Roberts pointed out to me that minimum kurtosis comes with the perfect U-distribution, not a uniform distribution.  That I had in my post presented exactly such a distribution made it especially embarrassing.  I guess that is what I get for reading and replying to my mail late at night!  But I have learned a little more about kurtosis as a result, and so have some others, so it has been a useful exchange.  EDSTAT-L and related lists have become a big part of my "continuing education."
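[The population-versus-estimate distinction above is easy to verify.  A minimal sketch (illustrative names; the adjustment shown is the usual small-sample kurtosis estimator, which I believe is what SAS and SPSS report) gives exactly -2 for the population formula on five 0's and five 1's, and -2.571 for the adjusted sample estimate.]

    # Population excess kurtosis vs. the adjusted sample estimate, for five 0's and five 1's.
    data = [0] * 5 + [1] * 5
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n       # population variance (divide by N)
    m4 = sum((x - mean) ** 4 for x in data) / n
    g2 = m4 / m2 ** 2 - 3                             # population formula: exactly -2.0
    # Usual small-sample adjustment (believed to match the SAS/SPSS output):
    G2 = ((n + 1) * g2 + 6) * (n - 1) / ((n - 2) * (n - 3))
    print(g2, round(G2, 3))                           # -2.0  -2.571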
========================================================================
Date:     Thu, 01 Apr 93 22:09:29 EST
From:     "Karl L. Wuensch"
Subject:  kurtosis, platypuses, and leaping kangaroos

jerry@hnrc.tufts.edu (Jerry Dallal) sent us this delightful quote:

>>>>>>
Student (1927), Biometrika, 19, 160:  Platykurtic curves are short in the tails like platypuses; leptokurtic curves are heavy in the tails like kangaroos, noted for 'lepping'
>>>>>>

Any of you historians out there know of the origin of the terms platykurtic and leptokurtic?  Knowing the Greek doesn't resolve the recent confusion -- leptokurtic curves may be heavy in the tails like leaping kangaroos, but "lepto" in Greek means "thin, fine, or small."  "Platy" means "broad or flat," but the U-distribution is even lower in kurtosis than the flat distribution.  From where did the term "kurtosis" derive?

========================================================================
Date:     Thu, 1 Apr 93 22:08:17 EST
From:     "David Scott"

In all the contributions to the discussion of heavy tails and kurtosis, I don't think anyone has mentioned the work of Balanda & MacGillivray, Kurtosis: a critical review, Amer. Statist. (1988), Vol. 42, pp. 111-119.  To quote from the abstract---"it is best to define kurtosis vaguely as the location- and scale-free movement of probability mass from the shoulders of a distribution into its centre and tails".  The paper is worth looking at if you are interested in kurtosis, and has plenty of references to earlier work.

David Scott

========================================================================
Date:     Fri, 2 Apr 93 09:06:50 EST
From:     Sune Karlsson

Interesting example.  I do to a certain extent agree that the distribution with the lowest kurtosis is the most 'tail-heavy' -- in the sense that all the mass is in the tails.  The problem with the example (and what in my opinion makes it misleading) is that there are two things that vary, the kurtosis and the variance.  But the kurtosis measures the 'tail-heaviness' relative to the variance (or variance squared, to be precise), hence we should really keep the variance constant when comparing the kurtosis.  Doing that, it is clear that a higher kurtosis also implies a more peaked distribution, as in the case with the highest kurtosis.

In this modified example, which holds the variance approximately constant, it is quite clear that a higher kurtosis implies that there are more extreme observations (or that the extreme observations are more extreme).  Whether this is what is meant by 'tail-heaviness' is of course a different story.  It is also evident that a higher kurtosis implies that the distribution is more 'single-peaked' (this would be even more evident if the sum of the frequencies was constant).

      X       frequencies
     ----     ---------------------------------------------
     -6.6       0     0     0     0     0     0     1
     -0.4       0     0     0     0     0     3     0
      1.3       0     0     0     0     5     0     0
      2.9       0     0     0    10     0     0     0
      3.9       0     0    20     0     0     0     0
      4.4       0    20     0     0     0     0     0
      5        20     0     0     0     0     0     0
     10         0    10    20    20    20    20    20
     15        20     0     0     0     0     0     0
     15.6       0    20     0     0     0     0     0
     16.1       0     0    20     0     0     0     0
     17.1       0     0     0    10     0     0     0
     18.7       0     0     0     0     5     0     0
     20.4       0     0     0     0     0     3     0
     26.6       0     0     0     0     0     0     1
     ------------------------------------------------------------------
     Pop. Kurt.     -2.0  -1.75   -1.5   -1.0    0.0   1.33    8.0
     Pop. Var.        25   25.1   24.8   25.2   25.2   25.0   25.1
     -------------------------------------------------------------------

Sune Karlsson
Stockholm School of Economics

========================================================================
Date:     Fri, 02 Apr 93 14:02:45 EST
From:     "Karl L. Wuensch"
To:       Sune Karlsson

Many thanks for posting your example of distributions with approximately equal variances but differing kurtosis.  Yes, I agree that mine was (deliberately) misleading by confounding variance and kurtosis.  Together I think your and my examples will help my students better appreciate the topic.  I am adding your example and discussion to the instructional materials I keep online, with credit to you, of course.
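[The constant-variance table above can apparently be reproduced by rescaling the earlier three-point patterns so that every column has variance 25, i.e. replacing 5 and 15 by 10 +/- 5*sqrt(25/var).  A minimal sketch along those lines (an assumption about how the example was built, but it does reproduce the X values and both summary rows):]

    import math

    # Rebuild the constant-variance columns by stretching the three-point patterns.
    def pop_var_kurt(values, freqs):
        n = sum(freqs)
        mean = sum(x * f for x, f in zip(values, freqs)) / n
        m2 = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
        m4 = sum(f * (x - mean) ** 4 for x, f in zip(values, freqs)) / n
        return m2, m4 / m2 ** 2 - 3

    patterns = [(20, 0, 20), (20, 10, 20), (20, 20, 20), (10, 20, 10),
                (5, 20, 5), (3, 20, 3), (1, 20, 1)]
    for freqs in patterns:
        var, _ = pop_var_kurt([5, 10, 15], freqs)
        c = math.sqrt(25 / var)                                       # stretch factor for this column
        X = [round(10 + c * (x - 10), 1) for x in [5, 10, 15]]        # e.g. -6.6, 10, 26.6
        var2, kurt2 = pop_var_kurt(X, freqs)
        print(X, round(var2, 1), round(kurt2, 2))
    # Variance stays near 25 in every column while kurtosis still runs from -2.0 to 8.0.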
========================================================================
Date:     Fri, 2 Apr 93 14:02:31 BST
From:     "M.F.Fuller"

In the second edition of Kendall & Buckland's *Dictionary of Statistical Terms*, the coining of the term 'kurtosis' is attributed to Karl Pearson in 1906.  The context intended is a unimodal distribution, and peakedness is measured by the relative steepness of ascent in the neighbourhood of the mode.  K & B wisely state that they doubt 'whether any single ratio can adequately measure (this) quality'.

This definition is intriguing on two counts: the focus on the centre of the distribution rather than the tails, and the specific concern with unimodal distributions.  I wonder when the heavy/light tailed description of kurtosis first emerged?  And it is no wonder that the scale free moment based estimator of kurtosis attains its minimum for a very U-shaped curve!

Mike Fuller
Canterbury Business School, University of Kent, Canterbury, Kent, CT2 7PE,

========================================================================
From: hayden@oz.plymouth.edu (Bob Hayden)

> From where did the term "kurtosis" derive?

I think it was your great, great grandfather, Kurt O. Wuensch.

========================================================================
Date:     Fri, 2 Apr 93 04:29:30 EST
From:     "Gerhard Luecke"

Karl L. Wuensch asked:

> From where did the term "Kurtosis" derive ?

I cite from Funk & Wagnalls 'Standard Dictionary of the English Language' (International Edition 1966):  "kurtosis ........"

[Thanks for the derivation.  It also appears in my Funk & Wagnall's here at the office, but wasn't in my "unabridged" dictionary at home.]

=========================================================================
Sender:   edstat-l@jse.stat.ncsu.edu
From:     nichols@spss.com (David Nichols)
Subject:  Kurtosis and heavy-tailed distributions

For the sake of anyone else who jumped into the discussion in the same way I did (without knowing all the background and not really understanding all the issues involved), I felt I should clarify some things I've learned since my posting to the effect that kurtosis is a measure of the "heaviness" of the tails of a distribution and that a uniform distribution is heavier in the tails than a normal distribution.

First, the point that I think Bob Hayden was trying to make was summed up in the _Encyclopedia of Statistical Sciences_ entry for heavy-tailed distributions, where they remark that it is now generally acknowledged that to use kurtosis as a measure of such is a "gross oversimplification."  Second, I reversed the directionality of the kurtosis measure.  A value less than 0 (less than 3 if you don't subtract the three out in the computations the way SPSS and most other software packages do), called a negative kurtosis, means a platykurtic distribution, which is lower in the middle AND lighter in the tails than a normal, while a peaked or leptokurtic distribution is higher in the middle and has more area outside a given standard deviation range than does a normal distribution.

Learn something new most days.  I certainly did yesterday.  Thanks to Bob Hayden, Karl Wuensch and Dennis Roberts for their e-mail discussion of the issues.

--
David Nichols
Senior Statistical Support Specialist
SPSS, Inc.
Phone: (312) 329-3684      Fax: (312) 329-3657
Internet: nichols@spss.com

========================================================================
From:     maj@waikato.ac.nz
Subject:  Heavy tails and kurtosis

The introduction of kurtosis into this discussion is essentially a red herring.  The Cauchy distribution, one of the most heavy-tailed among commonly known distributions, has an undefined kurtosis.

Heaviness of tails can only usefully be compared between two distributions whose densities have the same interval of real numbers as support.  Then you would say that one distribution F has a heavier right (left) tail than another distribution G if the limit of the ratio of densities f(x)/g(x) is greater than 1, and possibly +infinity, as x tends to the right (left) end of the interval.  Of course the interval of common support may be finite or infinite.  Looking back at my definition, perhaps it can be improved by writing lim inf instead of limit.

--
Murray A. Jorgensen  [ maj@waikato.ac.nz ]
University of Waikato, Department of Mathematics and Statistics
Hamilton, New Zealand
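[Murray's density-ratio criterion is easy to try numerically.  A minimal sketch (scipy assumed available; the x values are arbitrary) evaluating f(x)/g(x) for a t with 3 df against a standard normal, and for a Cauchy against that t, as x moves out along the right tail:]

    from scipy.stats import norm, t, cauchy

    # Jorgensen's criterion: F has a heavier right tail than G when the density
    # ratio f(x)/g(x) stays above 1 (possibly -> infinity) as x -> the end of the support.
    for x in [2, 4, 6, 8, 10]:
        r1 = t.pdf(x, df=3) / norm.pdf(x)        # t(3) vs N(0,1): ratio blows up
        r2 = cauchy.pdf(x) / t.pdf(x, df=3)      # Cauchy vs t(3): also keeps growing
        print(f"x={x:2d}   t3/normal = {r1:.3g}   cauchy/t3 = {r2:.3g}")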
========================================================================
From: bparks@wuecona.wustl.edu (Robert P. Parks)

However, one should use the cumulative distribution function rather than the density functions and define heavy tailedness right or left by:

     F(x) > G(x) for x < L   =>   F is heavier left tailed than G
     F(x) > G(x) for x > L   =>   F is heavier right tailed than G

which keeps things fairly clear compared to other methods of defining heavy tailedness.  The problem of course is that for arbitrary distributions, this may not happen (F and G can 'wiggle' in their tails).  This definition certainly shows that the Cauchy or T is heavy tailed (both sides) compared to the normal.  It can also be used when the supports are not the same, but then the definition becomes rather yucky because, e.g., any distribution with real line support has a heavier left tail than any distribution with support of, say, [0,Inf) -- and you do not have to muck things up with division by 0.  However, I cannot find any reference (albeit small sample) that has this definition of heavy tailedness.

========================================================================
From: barnett@mummy.agsm.unsw.OZ.AU (Glen Barnett)

Both Murray's and Bob's definitions tell us that a N(0,4) is heavier tailed than a N(0,1) (on both sides), yet all that is different is the spread.  Similarly, both these would say that a N(1,1) has a heavier right tail, and a lighter left tail, than a N(0,1).  Since these definitions are not location or scale free, they may not be particularly useful in all situations.
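[Both points are easy to check numerically.  A minimal sketch (scipy assumed) comparing right-tail probabilities 1 - F(x): under the CDF criterion the t(10) and the Cauchy come out heavier tailed than N(0,1), but so does N(0,4), which is exactly Glen's objection that the definition is not scale free.]

    from scipy.stats import norm, t, cauchy

    # Right-tail weight 1 - F(x) under the CDF-based definition.
    for x in [2, 3, 4, 5]:
        print(f"x={x}:  N(0,1) {norm.sf(x):.2e}   t(10) {t.sf(x, df=10):.2e}   "
              f"Cauchy {cauchy.sf(x):.2e}   N(0,4) {norm.sf(x, scale=2):.2e}")
    # Beyond x of about 2, t(10), Cauchy and N(0,4) all put more probability in the
    # right tail than N(0,1) does -- the last one only because its spread is doubled.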
========================================================================
From: barnett@mummy.agsm.unsw.OZ.AU (Glen Barnett)

In article nichols@spss.com (David Nichols) writes:

>a negative kurtosis, means a platykurtic distribution, which is lower
>in the middle AND lighter in the tails than a normal, while a peaked
>or leptokurtic distribution is higher in the middle and has more area
>outside a given standard deviation range than does a normal distribution.

If you are using kurtosis as a standardised fourth moment like this, it is *not* necessarily true that height in the middle (let's stick with symmetric distributions for now, so we know what 'middle' means) implies anything about kurtosis, even when the spreads are equal (in some sense, say s.d.'s).  Kendall and Stuart (The Advanced Theory of Statistics) carries a nice example on this type of thing.

========================================================================
From: barnett@mummy.agsm.unsw.OZ.AU (Glen Barnett)

>Both Murray's and Bob's definitions tell us that a N(0,4) is heavier
>tailed than a N(0,1) (on both sides), yet all that is different is
>the spread.  Similarly, both these would say that a N(1,1) has a heavier
>right tail, and a lighter left tail than a N(0,1).
>
>Since these definitions are not location or scale free, they may
>not be particularly useful in all situations.

Well, that is true, and a good point to see in using the definition, but not a point against the definition.  While you might not like it, a N(0,4) is heavier tailed than a N(0,1) in much the same way as a T(n) is heavier tailed than a N(0,1).  You seem to want T(n) heavy but not N(0,4), while I am willing to live with, and in fact want, a N(0,4) heavier than a N(0,1).  If you think that you should transform somehow and then make the definition, just how would you transform a T(10) and/or a Cauchy, and get the desired result that N(0,4) and N(0,1) have 'similar' tailedness but the T(10) and Cauchy are still heavier?  I do not see a way.  I stand by my definition -- N(0,1) is lighter than N(0,4) and T(n).

Bob

========================================================================
From: nichols@spss.com (David Nichols)

I'm not sure that what you're saying here contradicts what I said.  I said that something about kurtosis implies something about the middle of the distribution, while you're saying that something about the middle of the distribution doesn't necessarily imply something about kurtosis.  These may or may not be contradictory statements.  At any rate, here is where I got this statement (p. 426 of Vol. 4 of Kotz and Johnson's _Encyclopedia of Statistical Sciences_): "Ratios smaller than 3 [negative kurtosis in SPSS] correspond to curves with a flatter top than the normal (e.g., rectangular)."

========================================================================
From:     rlenth@stat.uiowa.edu (Russ Lenth)
Subject:  Bad advice

> While you might not like it, a N(0,4) is heavier tailed than a N(0,1) ...

"Not liking it" is definitely an understatement.  This is wrong.  Period.

A big problem with these email discussion lists is that bad advice can spread like wildfire; and the volume of discussion is so immense that few people have the time to sort out the bad from the good.  This is a Statistics Education discussion list.  Hey, educators, see the challenge we face?

Russ Lenth                                        rlenth@stat.uiowa.edu
Department of Statistics and Actuarial Science    Tel: (319) 335-0814
The University of Iowa                            FAX: (319) 335-3017
Iowa City, IA 52242  USA

"Everything happens for a reason, but that doesn't mean that there's a point."  --- Sue Grafton

========================================================================
From: David A. Evans

>From: barnett@mummy.agsm.unsw.OZ.AU (Glen Barnett)
>[stuff deleted]
>I stand by my definition - N(0,1) is lighter than N(0,4) and T(n).
>
>Bob

Then it follows that if you have gathered measurements in units of gallons, when you change units to quarts the nature of the distribution changes?????
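[The units point is easy to demonstrate: kurtosis, being scale free, is unchanged when gallons are converted to quarts, whereas a tail criterion applied at a fixed cutoff is not.  A minimal sketch (simulated, made-up data; numpy assumed):]

    import numpy as np

    rng = np.random.default_rng(0)
    gallons = rng.normal(10.0, 2.0, size=100_000)    # some measurements in gallons
    quarts = 4.0 * gallons                           # the same measurements in quarts

    def excess_kurtosis(x):
        z = (x - x.mean()) / x.std()                 # standardising removes location and scale
        return (z ** 4).mean() - 3

    print(excess_kurtosis(gallons), excess_kurtosis(quarts))   # identical (about 0 here)
    # But a tail comparison at a fixed cutoff depends entirely on the units chosen:
    print((gallons > 20).mean(), (quarts > 20).mean())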
========================================================================
From: hayden@oz.plymouth.edu (Bob Hayden)

Maybe I'll just get EVERYBODY mad at me, but ...

I believe the idea of a "heavy-tailed" distribution came out of the robustness studies of the 1960's.  I have never seen a precise mathematical definition -- it's usually an intuitive idea expressed rather vaguely in words.  Whatever it may mean, it is definitely NOT the same as whatever it is that kurtosis measures.  On the other hand, usage suggests that all normal distributions have the same tail weight.

I think the real question here is:  Can this intuitive notion be made precise?  If so, how?  Discussants to date can be broken into two groups: those who provide a precise definition and those who do not.  The former find that the precise definitions do not match their intuitive concepts.  To make headway I think we need to try to list and refine the intuitive notions and then try to see if they can be made more precise.  The goal should be a definition that agrees with the intuitive notion or a demonstration that the intuitive notion is internally inconsistent.  One of the papers cited in connection with the earlier boxplot/quartile discussion (Freund was one of the authors) provides a good model by listing some intuitive properties we expect of quartiles and then evaluating various proposed definitions according to them.

So, to the intuitionists: try to tell us more precisely what you mean by "tail-heavy" and what properties you want in a definition.  To the precisionists: provide a definition that matches the intuition or show that it cannot be done.  As an example, in the context in which "heavy-tailed" came up (30+ years ago) I think it is safe to say that the intent was that the Cauchy has heavier tails than any t with d.f.>1, and the t's with finite d.f. have heavier tails than any normal.

========================================================================
From: bparks@wuecona.wustl.edu (Robert P. Parks)

So the challenge to you is to provide a definition of heavy tailedness which is accepted by some if not all, and under which the N(0,4) is not heavier tailed than a N(0,1) but a T(10) is heavier tailed than a N(0,1) (or are you saying that a T(10) is not heavier tailed?).  By the definitions presented so far, it is NOT WRONG, and you are wrong to say that it is.

>A big problem with these email discussion lists is that bad advice
>can spread like wildfire; and the volume of discussion is so immense that
>few people have the time to sort out the bad from the good.

You are correct.  You have given bad advice.  You called something WRONG without any argument or proof.  As they say, put up (your definition) or SHUT UP.

Bob

========================================================================
From: "David Scott"

Russ Lenth wrote:

>A big problem with these email discussion lists is that bad advice
>can spread like wildfire; and the volume of discussion is so immense that
>few people have the time to sort out the bad from the good.

I can certainly agree with that.  I don't think he is correct, however, about the comparison of the tails of the N(0,1) and N(0,4)---my calculations show that N(0,4) is indeed heavier-tailed than N(0,1).

To see why, go back to Murray Jorgensen's definition of heavy-tailedness.  This is basically right, but has one flaw.  He wrote:

>However, one should use the cumulative distribution function rather than the
>density functions and define heavy tailedness right or left by:
>
>     F(x) > G(x) for x < L   =>   F is heavier left tailed than G
>     F(x) > G(x) for x > L   =>   F is heavier right tailed than G

For the right hand tail, the weight in the tail is given not by F(x) but by 1-F(x), so the definition of heavier right-tailed is 1-F(x) > 1-G(x), or F(x) < G(x), for x > L.  The corresponding definitions in terms of the densities are

     f(x) > g(x) for x < L   =>   f is heavier left tailed than g
     f(x) > g(x) for x > L   =>   f is heavier right tailed than g

and then we will have f is heavier tailed than g iff F is heavier tailed than G.

Having done all that, we can assert that N(0,4) (f) is heavier-tailed than N(0,1) (g).  Just compare the densities as x -> + or - infinity: the ratio f/g goes to infinity.  I haven't bothered to check the t distribution, but given that it doesn't have all its moments because of its heavier tails, I am willing to bet that it has heavier tails.
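[One way to reconcile the two camps, in the spirit of Karl's earlier "higher z-scores in the tails" wording and Bob Hayden's remark that usage treats all normals as equally tailed, is to compare tail mass beyond a multiple of each distribution's own standard deviation.  A minimal sketch (scipy assumed):]

    from scipy.stats import norm, t

    t10 = t(10)
    sd10 = t10.std()                                       # sd of t(10) = sqrt(10/8)
    # Tail mass beyond k standard deviations of each distribution itself.
    for k in [2, 3, 4]:
        n01 = norm.sf(k)                                   # N(0,1), sd = 1
        n04 = norm.sf(k * 2, scale=2)                      # N(0,4), sd = 2: identical to N(0,1)
        print(f"k={k}:  N(0,1) {n01:.2e}   N(0,4) {n04:.2e}   t(10) {t10.sf(k * sd10):.2e}")
    # Measured this way every normal has the same tail weight, while the t(10)
    # keeps more probability beyond 2, 3, or 4 of its own sd's.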
========================================================================
From: rlenth@stat.uiowa.edu (Russ Lenth)

I would like to focus on a unifying concept that I believe underlies all scale-free measures of tail weight:

     tail weight  =  ratio of two scale measures

The one in the numerator should be more sensitive to the tails than the one in the denominator.  As long as both the numerator and denominator are in the same units, the ratio is scale-free.  If the numerator and denominator are not in the same units, then the measure is not scale-free; and, in my opinion, it no longer can be considered a measure of tail weight.

The standardized kurtosis is such a ratio:

              E(X - mu)^4
     K  =  -----------------
           {E(X - mu)^2}^2

If X is in inches, then both numerator and denominator are in inches^4.  Both numerator and denominator are scale measures.  The numerator, being based on the fourth power of the deviations, is clearly more sensitive to the tails than is the denominator.  If you subtract 3 from K, as some prefer, I admit it is no longer a ratio of scales; but it is still based on such a ratio.

The problem with the kurtosis is that it is not defined for many distributions, especially those that are generally regarded as heavy-tailed.  Using the ratio-of-scales idea, it is easy to dream up tailweight measures that are defined for all distributions.  Here are two examples:

     Q-ratio = (x[.95] - x[.05]) / (x[.75] - x[.25])

where x[p] is the p-th quantile of the distribution of X.

     S-ratio = sigma[.05] / sigma[.25]

where sigma[p] is the trimmed standard deviation (i.e., the SD of the distribution after truncating it to the interval (x[p], x[1-p])).

Both of these ratios are invariant to linear transformations of the data; and they both rank the t family such that, the fewer the degrees of freedom, the higher the ratio.  I believe, but am not sure, that this is a new contribution to the discussion.  I had not paid much attention to it until I saw postings that I regard as clearly incorrect.  I apologize if it is redundant.
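[The Q-ratio is simple to compute from a distribution's quantile function.  A minimal sketch (scipy assumed) showing that it is unchanged by rescaling a normal and that it orders the t family by degrees of freedom:]

    from scipy.stats import norm, t, cauchy

    def q_ratio(dist):
        # (x[.95] - x[.05]) / (x[.75] - x[.25]) from the quantile function
        q = dist.ppf
        return (q(0.95) - q(0.05)) / (q(0.75) - q(0.25))

    print(q_ratio(norm), q_ratio(norm(scale=2)))      # identical: the ratio is scale-free
    for df in [30, 10, 5, 2, 1]:                      # t family: heavier tails, bigger ratio
        print(df, round(q_ratio(t(df)), 3))
    print("Cauchy:", round(q_ratio(cauchy), 3))       # defined even though no moments exist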
========================================================================
From: BERGERD@CGSVAX.CLAREMONT.EDU

Hello fellow Edstatlers,

I would like to hear more discussion about how folks actually use the kurtosis index in data analysis.  While I appreciate the fact that the index is sensitive to 'heavy' tails and 'peaked' middles, a more practical use for the kurtosis index is that it can serve as a warning that there may be one or more extreme outliers in the data.  If I see a kurtosis value of 1 or greater, I take another look at the plot to make sure I didn't miss something.  With larger data sets one can usually tolerate greater kurtosis, though even with a large data set, kurtosis greater than 1 might alert the user to an error in the data.

Of course, one should not rely on kurtosis to detect outliers.  Summary indices like kurtosis and skew are no substitute for an inspection of the plot.  Yet I offer the rule of thumb that kurtosis greater than 1 indicates that a transformation should be considered to reduce the influence of extreme scores.  Is this a reasonable rule of thumb?

Dale Berger
Psychology, The Claremont Graduate School
Claremont, California

========================================================================
April, 1993

Hi Dave,  (this was a letter to David Howell)

I just started reading selected chapters of the new edition of your Fundamentals book.  Thanks for the footnote in chapter three noting that I don't think it wise to make a big deal out of kurtosis in an introductory course.  Nevertheless, I am going to suggest a change in your presentation of it.  You wrote:

"Kurtosis ...... refers to the degree to which scores congregate in the tails of the distribution.  A distribution in which there are large numbers of high and low scores far from the center of the distribution is referred to as a heavy-tailed (or platykurtic) distribution.  A distribution that is relatively thin in the tails is called a leptokurtic distribution."

Well, those who associate heavy-tailedness with kurtosis usually describe the leptokurtic distribution, not the platykurtic distribution, as "heavy tailed," but heavy in the tails is defined in units that are standardized.  But at this point in the text you have not yet introduced the variance and standard deviation, so there is no easy way to get that across to the students.  Perhaps it would be best to avoid the term "heavy tailed" at this point.  I suggest something like this:

Kurtosis is related to the relative concentrations of the scores in the tails (upper and lower ends), shoulders (between the tails and the center), and center of a distribution.  (This is more eloquently said in Balanda & MacGillivray's article in Amer. Stat. 42: 111-119.)  If you start with a distribution of medium kurtosis (a mesokurtic distribution), like that with the dotted curve in Figure 3.10, and move scores from the tails and from the center into the shoulders, the distribution becomes low in kurtosis (platykurtic).  If you continue moving scores to the shoulders the curve assumes first a rectangular shape and then a U shape.  If you start with a mesokurtic distribution and move scores from the shoulders to the tails and to the center, then the distribution becomes high in kurtosis (leptokurtic), like the Student's t-distribution we shall study later.

Dave, did you follow the edstat-l discussion of kurtosis and tail heaviness beyond the post where I confessed that I don't make a big deal out of it with my students?  I loved the quote from Student's 1927 Biometrika article (19: 160):  "Platykurtic curves are short in the tails like platypuses; leptokurtic curves are heavy in the tails like kangaroos, noted for 'lepping.'"  One contributor provided a nice set of simple distributions which varied in kurtosis but were constant in variance.  I think they would effectively demonstrate to my graduate students that high kurtosis is heavy tailedness, but I'm not about to show them to my undergraduate students.

It is too beautiful outside for me to be sitting here at my PC -- I'm going to head out to my garden now.

==============================================================================
For an excellent article on kurtosis, see:

DeCarlo, L. T. (1997).  On the meaning and use of kurtosis.  Psychological Methods, 2, 292-307.
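[A closing illustration of the tails/shoulders/center description in the letter above: a minimal Python sketch (the weights are made up for the illustration) that starts from a roughly mesokurtic five-point shape and moves mass either into the shoulders or into the center and tails.]

    # Moving probability mass between the center, shoulders, and tails of a
    # symmetric five-point distribution on -2, -1, 0, 1, 2.
    def excess_kurtosis(values, probs):
        mean = sum(p * x for x, p in zip(values, probs))
        m2 = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))
        m4 = sum(p * (x - mean) ** 4 for x, p in zip(values, probs))
        return m4 / m2 ** 2 - 3

    v = [-2, -1, 0, 1, 2]
    shapes = {
        "mesokurtic (start)":           [0.05, 0.21, 0.48, 0.21, 0.05],
        "mass moved to shoulders":      [0.01, 0.39, 0.20, 0.39, 0.01],
        "mass moved to center+tails":   [0.10, 0.05, 0.70, 0.05, 0.10],
    }
    for name, p in shapes.items():
        print(f"{name:30s} excess kurtosis = {excess_kurtosis(v, p):+.2f}")
    # Roughly 0 at the start, clearly negative (platykurtic) after moving mass to the
    # shoulders, and positive (leptokurtic) after moving it to the center and tails.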