Formulas in Statistics:
Definitions or Computational Recipes?


Here is a question I posted to the EDSTAT list and some of the replies I received. Many of those who replied indicated that they see no value in students learning "formulas," even when those formulas are the basic definitions of concepts essential to the understanding of statistics. I have not included many of those replies here, but do wish to acknowledge that many instructors of statistics are of that opinion.


My Question to the EDSTAT List

I am sure many of you have been asked a question like that posed today by one of my students, and I would be interested in hearing how you respond to it. I've included the question along with the response I gave this morning. It looks a bit long to me now; I must have been having an attack of mania <grin>

My Student's Question

I have been diligently studying for the exam, but I realized that there are a lot of formulas and sub-formulas that I am having trouble memorizing. I probably can memorize them, but I am not sure if that is what we need to do. Should we memorize all the formulas and sub-formulas or should we expend most of our energy on having a good understanding of the concepts that we have covered or both?

My Reply to the Student

IMHO, one cannot have a good understanding of the concepts without knowing some basic definitions. As a simple example, I opine that you would not have a good understanding of the concept of mean without knowing that it is the "balance point" which makes the sum of deviations about it zero, and that it is the quantity which minimizes the sum of squared deviations about it (the least squares criterion). Now, I can present that definition in what you might call a pair of formulae, but it is, nevertheless, a definition essential for understanding the concept. On the other hand, if you are going to compute a sample mean by hand, you will probably just add up the scores and divide by the number of scores, a useful "computational formula," but not a definition essential for understanding.
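A minimal sketch in Python of the two defining properties just described. The scores are made up for illustration, and the function names are my own, not part of any course material.

```python
# Illustrative scores (made up for this sketch).
scores = [2, 4, 4, 5, 10]

def mean(xs):
    """Computational recipe: add up the scores and divide by their number."""
    return sum(xs) / len(xs)

def sum_of_deviations(xs, c):
    """Sum of the deviations of the scores about a candidate value c."""
    return sum(x - c for x in xs)

def sum_of_squared_deviations(xs, c):
    """Sum of the squared deviations of the scores about a candidate value c."""
    return sum((x - c) ** 2 for x in xs)

m = mean(scores)

# Balance point: the deviations about the mean sum to zero.
balance = sum_of_deviations(scores, m)

# Least squares criterion: the sum of squared deviations is smaller
# about the mean than about any of these other candidate values.
ss_at_mean = sum_of_squared_deviations(scores, m)
ss_elsewhere = [sum_of_squared_deviations(scores, m + shift)
                for shift in (-1, -0.1, 0.1, 1)]

print(m, balance, ss_at_mean, ss_elsewhere)
```

Trying a handful of nearby values is only a spot check, of course, not a proof that the mean minimizes the sum of squares.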

Consider next the concept of variance (not just the more general concept of dispersion). To understand it, you need to know that it is defined as the mean squared deviation of scores from their mean. Yes, it is just another sort of mean. Again, I can present that definition in what you might call a formula, but it is really just a definition essential for understanding the concept. On the other hand, I would not think it essential that you know that you can get the corrected (for the mean) sum of squared deviations (numerator of the ratio we call variance) by taking the uncorrected sum of squares and subtracting the ratio of the square of the summed scores to the number of scores -- but that is the formula you would use if you were computing a variance by hand (but we have machines to do such tasks now, tasks done by one's graduate students back in the dark ages when I was a graduate student).
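Here is a rough sketch in Python contrasting the definitional and the computational routes to the variance. The scores are made up, the function names are hypothetical, and I am assuming the version that divides by N (a sample estimate of a population variance would divide by N - 1 instead).

```python
# Illustrative scores (made up for this sketch).
scores = [2, 4, 4, 5, 10]

def variance_definitional(xs):
    """Variance as a mean: the mean squared deviation of scores from their mean."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / n

def variance_computational(xs):
    """Hand-calculation shortcut: the uncorrected sum of squares minus
    (sum of scores) squared over N gives the corrected sum of squares."""
    n = len(xs)
    uncorrected_ss = sum(x * x for x in xs)
    corrected_ss = uncorrected_ss - sum(xs) ** 2 / n
    return corrected_ss / n

v_def = variance_definitional(scores)
v_comp = variance_computational(scores)
print(v_def, v_comp)
```

Both routes give the same number, but only the first one reads like the verbal definition.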

Another example, after we cover correlation and regression, I would expect you to know that the correlation coefficient is really just a mean -- the mean cross-product of standardized (z) scores, and it represents the slope of the standardized least squares linear regression line for predicting one variable from another. While I could present that definition of Pearson r in "formulas," those would not be the formulas you would use to compute r, but rather are definitions that would help you understand r. With that understanding, you would realize that r is the number of standard deviations by which predicted Y increases per one standard deviation change in known X. Building on that understanding of r, you would then recognize that the covariance is also just a mean, the mean cross product of deviations of X about its mean and deviations of Y about its mean, structurally the same as the univariate concept of variance, but in two dimensions rather than just one. The same least squares criterion used to define the mean is used to define the regression line -- it minimizes the (error) sum of squared deviations (in the Y dimension) about it. The univariate mean is really just our least squares predicted value for a score when the only information we have is that in the univariate distribution.
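The following Python sketch illustrates r as a mean cross-product of z scores. The paired scores are made up, the function names are my own, and I am assuming standard deviations computed with N (not N - 1) in the denominator, which is what makes the mean cross-product come out exactly equal to r.

```python
from math import sqrt

# Illustrative paired scores (made up for this sketch).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

def mean(v):
    return sum(v) / len(v)

def sd(v):
    """Standard deviation with N in the denominator."""
    m = mean(v)
    return sqrt(mean([(a - m) ** 2 for a in v]))

def z_scores(v):
    m, s = mean(v), sd(v)
    return [(a - m) / s for a in v]

zx, zy = z_scores(x), z_scores(y)

# Pearson r: the mean cross-product of the standardized scores.
r = mean([a * b for a, b in zip(zx, zy)])

# Covariance: the mean cross-product of the deviation scores.
mx, my = mean(x), mean(y)
covariance = mean([(a - mx) * (b - my) for a, b in zip(x, y)])

# Least squares slope for predicting zy from zx (both have mean zero,
# so the slope is sum of cross-products over sum of squared zx).
standardized_slope = sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)

print(r, covariance, standardized_slope)
```

The standardized slope and r agree, and dividing the covariance by the product of the two standard deviations returns r again.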

If our linear model is any good, it should account for some of the variance in the variables. The sum of the squared deviations of the predicted scores about the mean of Y is used to measure that portion of the total variance, and represents the reduction in error due to adding the X variable to the model used to predict Y. Divide that regression sum of squares by the total sum of squares for the predicted variable and you obtain r². So now you have another way to interpret r²: it is the proportion of the total variance in one variable "accounted for" by our model.
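A small Python sketch of that partitioning, taking the regression sum of squares to be the sum of squared deviations of the predicted scores about the mean of Y (the usual definition). The paired scores are made up and the variable names are my own.

```python
# Illustrative paired scores (made up for this sketch).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

def mean(v):
    return sum(v) / len(v)

mx, my = mean(x), mean(y)

# Least squares slope and intercept for predicting Y from X.
slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
intercept = my - slope * mx
predicted = [intercept + slope * a for a in x]

# Partition the total sum of squares in Y.
ss_total = sum((b - my) ** 2 for b in y)
ss_regression = sum((p - my) ** 2 for p in predicted)
ss_error = sum((b - p) ** 2 for b, p in zip(y, predicted))

# Proportion of the variance in Y accounted for by the linear model.
r_squared = ss_regression / ss_total

print(ss_total, ss_regression, ss_error, r_squared)
```

The regression and error sums of squares add up to the total, which is why the ratio can be read as a proportion.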

If you have read Edwin Abbott's "Flatland," you might recognize that the same concept (a mean) which looked like a point in one dimensional space now looks like a line in two dimensional space. Then you would be ready to leap into three dimensional space and even beyond, into hyperspace, but you might want to sit down and have a good beer first. I promise that we shall travel that space before the semester is out (as soon as we get started on multiple regression).

So, to recap, starting with what might seem like a useless task of memorizing a couple of formulas for the arithmetic mean, we come to an understanding of several useful extensions of that concept, ending up in hyperspace with a good beer. What more could you possibly expect from life than having a good beer in hyperspace?

Selected Replies

Message from Gordon Bear

A delightful reply, and not at all wordy - just richly detailed. With a chuckle at the end too. Your pedagogical philosophy seems to match mine: Foster deep understanding, including connections among concepts, and show the elegance of the mathematics as well as the utility. Don't require rote memorization.

Thanks for sharing your reply with your virtual colleagues here in cyberspace.

Message from Michael Granaas

I honestly believe that there is something to be learned from memorizing several of the basic formulas that are involved in defining statistics. I, less elegantly, tell my students that it is important to have this basic understanding so that it can 1) be utilized when we have the machines start doing the computations for us and 2) be drawn on for understanding when the mathematics is no longer so simple.

Reply from Bob Hayden

I doubt your students will gain ANY understanding from memorizing formulae. Once they have the understanding, then formulae MIGHT provide a summary or reminder -- but only for students who are VERY fluent at READING mathematics, as opposed to mindlessly manipulating formulae. I do not see any such students in the undergraduate introductory course that I often teach. I noted that Karl presented all the understandings he sought verbally on the list. Why not do the same in class?

Karl Writes Again

I think that Bob Hayden is on to something essential here ("I noted that Karl presented all the understandings he sought verbally on the list. Why not do the same in class?"). I think of the "definitional formulae" just as a convenient shorthand for the verbal definition of a construct. But it may be the case that most of our students assume that something that looks like a formula is just for use with mindless computations. They may have learned this in their first 12 years of schooling, where formulas may indeed be presented as nothing more than mindless recipes for getting some quantity not really well understood. How can we break our students of that bad habit? I do frequently verbalize the 'formula' after writing it on the board -- for variance, saying something like "look at this, we just take the sum of the squared deviations of scores from their mean, which measures how much the scores differ from one another, and then divide that sum by N, to get a measure of how much scores differ from one another, on average." The shorthand is really convenient; I don't know how I would get along without it.

Reply from ebohlman@omsdev.com

I think the problem here is that, at least in the US K-12 system, "mathematical aptitude" really means "computational proficiency." In "back to basics" math (and the overwhelming majority of K-12 math curricula *are* "back to basics"; it's only in a few elite schools with high-performing students that any of those educational innovations that right-wingers claim are destroying the minds of our students have actually been used, but I digress; see Alfie Kohn's writings), mathematical notation is taught as a series of Taylorized (as in Frederick Winslow) job instructions rather than as a language for precisely describing certain types of relationships. IOW, it's taught as a bunch of hoops to be jumped through without really understanding what you're doing; all that matters is going through the prescribed motions. A student who is "good at math" is one who can jump through those hoops quickly. The object of most K-12 math is to turn out human calculators (something that actually made economic sense in the Old Days when a machine to do calculations cost a lot more than an employee doing calculations longhand).

As John Allen Paulos pointed out in _Innumeracy_, the majority of American K-12 students know *how* to add, subtract, multiply, and divide, but a much smaller number of them know *when* to add, subtract, multiply, and divide. They do just fine when given a spelled-out list of calculations to perform, but they panic when confronted with word problems.

Over in comp.infosystems.www.authoring.html, there's a long-running and acrimonious debate between those who see HTML markup as a *description* of the structure of a document and those who see it as a bunch of commands interspersed with text. The notion that formal notation can be a description or explanation rather than a sequence of tasks to perform is alien to a lot of people. It may very well be that some people simply don't have the built-in cognitive ability to grasp the former, but I suspect that a much larger number of people *could* develop that ability if only they were taught how to think that way (and it is indeed a way of thinking rather than a set of "skills"). Right now I suspect that most who developed that ability picked it up through intellectual exploration outside the formal system of schooling. I'm pretty sure I did (I was a math major but am not involved in teaching).

Michael Writes Again

I do explain some of my reasoning in class, but usually I find it about as useful as explaining the long-term benefits of doing a share of the household chores to my 10 year old. All she understands is that her playtime is being shortened. Any arguments that I make will not be understood or appreciated for several years. So I don't explain.

I have always used a mix of problems, short answer, and multiple choice items on my exams. There was a time when I gave the exam in two parts so that notes, books, whatever, could be used on the problem portion of the exam. Like others I reasoned that in the real world nobody actually memorizes formulas, they look them up or trust the computer to carry out the computations correctly.

The split exam format became a problem as enrollments in my classes climbed, so I reluctantly abandoned it and went to a completely closed book exam format. Much to my surprise, student scores improved on both computational and conceptual items. (I fully expected scores on computation items to drop noticeably with the loss of book and notes.)

This forced me to reconsider my original stance that formulas should just be looked up.

I decided that the use of books or notes for formulas encourages a form of laziness that interferes with learning. The student who is paging madly through their text looking for the formula they need, or who is staring blankly at their card full of formulas, has not put the intellectual effort into learning what is going on. If you truly know what it is that variance describes, how can you not pick the definitional formula for it off of a page? And if you understand the formula, you have a much easier time selecting or generating an answer with words like "dispersion" correctly used.

A few folks in this discussion have argued that many students simply view formulas as no more than procedural directions otherwise void of meaning. I would agree. I see "cheat sheets" as continuing the misperception and exacerbating the problem. Sure, it would be a lot easier for me to let students use formulas solely as recipes to be followed blindly, but I spend a good deal of time explaining the message of the formula, trying to improve their understanding a bit more.

In addition, by using definitional formulas it is obvious that the mean squared error is just a general form of the variance, or that the variance is just a type of mean. The student who does not remember the formula for variance, however, can only see the mean squared error as a whole new thing to be learned. These links between related concepts can serve to strengthen learning of both concepts if students can see those links. But they can only see them if they have a certain amount of material, including formulas, committed to memory.
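A minimal Python sketch of that link, under some assumptions of my own: the scores and the straight-line predictions are made up for illustration, and the mean squared error here simply divides by N rather than by degrees of freedom.

```python
# Illustrative observed scores (made up for this sketch).
y = [2, 1, 4, 3, 5]

def mean(v):
    return sum(v) / len(v)

def variance(v):
    """Mean squared deviation of the scores from their mean."""
    m = mean(v)
    return mean([(a - m) ** 2 for a in v])

def mean_squared_error(observed, predicted):
    """Mean squared deviation of the scores from their predicted values."""
    return mean([(o - p) ** 2 for o, p in zip(observed, predicted)])

# If every score is predicted by the mean, the MSE is the variance.
mean_only_predictions = [mean(y)] * len(y)
mse_mean_only = mean_squared_error(y, mean_only_predictions)

# Hypothetical predictions from some straight-line model (made-up values).
line_predictions = [1.4, 2.2, 3.0, 3.8, 4.6]
mse_line = mean_squared_error(y, line_predictions)

print(variance(y), mse_mean_only, mse_line)
```

The two functions differ only in what the scores deviate from, which is the structural similarity a memorized definitional formula makes visible.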

I will also mention that I use exclusively definitional formulas in my classes. I won't even use a text that has computational formulas in it. I am not interested in computational accuracy so much as I am conceptual understanding. Clearly computational formulas are designed to simplify computation, not facilitate understanding. If your text uses computational formulas then I agree that there is no point in having folks memorize anything...let them look it up.

An Objection to my Cross-Dimensional Generalizations

Sorry? The natural generalization of a mean to a 2-dimensional space is surely the vector mean, which is a point.

-Robert Dawson

My Reply

Perhaps I should be more careful with my language, but you know how mania is <grin>. Would it help if I replaced "mean" with "least squares estimator of individual Y given the individual's value(s) on all remaining (k-1) variables in k-dimensional space"? Or might that confuse my students? Are there objections to the generalization in that case? In Cartesian space, the regression LINE is that which minimizes the sum of squared deviations (in the Y dimension) about it. If we drop the X dimension, our least squares estimator of Y becomes the mean, the POINT which minimizes the sum of squared deviations about it. If instead of dropping X we add X2, our regression surface is a plane in three dimensional space, .................

In Abbott's fanciful tale, Flatland is inhabited by men, who, according to their social status, are triangles, squares, or polygons, and women, who are straight lines. The women can make themselves almost invisible by rotating to the perpendicular, and in such an orientation present a threat to any man who might run into a woman's point. Under law, they must wiggle their ends so that the men can see them and avoid such danger. Silly, yes. Sexist, yes. Latest edition written in 1884. A Flatland resident has a vision of Lineland (a one-dimensional world). Here men are small lines and women points, but inhabitants see both as points, so mating can be tricky. The inhabitant of Flatland is visited by an inhabitant of (three-dimensional) Spaceland. As you can imagine, it is quite difficult for the Flatlander to understand the Spacelander's description of his world.




Contact Information for the Webmaster,
Dr. Karl L. Wuensch



This page most recently revised on 15. May 2001.