East Carolina University
Department of Psychology
Independent Variables and Dependent Variables
Message posted to EDSTAT-L@LISTS.PSU.EDU and sci.stat.edu on 6/14/2004 11:16 PM
Researchers frequently use the terms "independent variable" and "dependent variable" when describing variables studied in their research. I am of the opinion that these terms are frequently used inappropriately. I am very interested in learning how YOU define these terms to students in your classes.
Please respond off-list to wuenschk@ECU.edu. I shall post to the list a summary of responses received. Thanks a bunch.
Karl L. Wuensch, Department of Psychology,
East Carolina University, Greenville NC 27858-4353
Voice: 252-328-4102 Fax: 252-328-6283
Here are my own thoughts on this matter.
Over the years I have become increasingly uncomfortable with the way many people use the terms "independent variable" and "dependent variable." I now believe that these terms should be used only when referring to experimental research, that is, research where the independent variable(s) is(are) manipulated and the dependent variable(s) is(are) passively observed. Researchers commonly associate the term "independent variable" with "cause" and "dependent variable" with "effect." Using the terms "independent variable" and "dependent variable" with nonexperimentally gathered data may prod researchers into making causal attributions when they should not.
Let me give you an example. Several years ago I was assisting a doctoral student with his dissertation. He was investigating the correlates of a single outcome variable which was treated as continuous. Among his predictor variables were several categorical variables. All of the variables were derived from items on a survey given to students at Florida State. No variables were manipulated. I dummy coded the categorical variables and threw them, along with some continuous predictors, into a multiple regression. A week or so after I sent the results to the doctoral student, he wrote back and told me that the dissertation director wanted the categorical predictors analyzed by ANOVA, not by multiple regression (as if there were a difference). Why, I asked. Well, believe it or not, the response I got was that if we did the analysis as an ANOVA, then those predictors would be "independent variables" and we could conclude that they CAUSE changes in the "dependent variable." You can imagine my response. I ended up telling my client to simply change the phrase "multiple regression" to "ANOVA" to make the director happy.
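To see that there really is no difference between the two analyses, here is a minimal sketch with simulated data (not the dissertation data; the group means and sample sizes are made up for illustration). It runs a one-way ANOVA on three groups, then fits the same model as a multiple regression with two dummy-coded predictors, and shows that the omnibus F statistics agree:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Three hypothetical groups, e.g. three categories from a survey item
g1 = rng.normal(10, 2, 20)
g2 = rng.normal(12, 2, 20)
g3 = rng.normal(11, 2, 20)
y = np.concatenate([g1, g2, g3])

# One-way ANOVA
F_anova, p_anova = stats.f_oneway(g1, g2, g3)

# The same analysis as a dummy-coded multiple regression:
# two 0/1 dummy columns code the three groups; a column of ones is the intercept
n = 20
d1 = np.r_[np.ones(n), np.zeros(n), np.zeros(n)]
d2 = np.r_[np.zeros(n), np.ones(n), np.zeros(n)]
X = np.column_stack([np.ones(3 * n), d1, d2])

beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
ss_resid = resid @ resid
ss_total = ((y - y.mean()) ** 2).sum()
df_model, df_resid = 2, 3 * n - 3
F_reg = ((ss_total - ss_resid) / df_model) / (ss_resid / df_resid)

print(F_anova, F_reg)  # identical up to floating-point error
```

Relabeling the output "ANOVA" instead of "multiple regression" changes nothing about the arithmetic, which was the point.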
Is this incident an anomaly? Regrettably, no, in my experience. I can't count the number of times that Ph.D. holding researchers have told me that if they analyze their data with an independent samples t (where the grouping variable is called the "independent variable"), then "significant" results show that the "independent variable" caused changes in the "dependent variable," but that if they analyze their data with a correlation/regression analysis, then they cannot interpret a significant association as evidence of causation - you know, that old "correlation does not imply causation" bullshit (see http://core.ecu.edu/psyc/wuenschk/StatHelp/Correlation-Causation.htm ).
On a few occasions I have demonstrated to such persons that the pooled variances independent samples t test is absolutely equivalent to the t used to test the significance of the point biserial r. Their jaws really dropped. Of course I show my students that ANOVA is just a dummy coded multiple regression.
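That equivalence can be verified in a few lines. The sketch below (simulated data, hypothetical group sizes) runs a pooled-variances independent-samples t test, then correlates the scores with a 0/1 group code to get the point biserial r, and shows that the t testing that r against zero is the same statistic with the same p value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.normal(50, 10, 25)   # hypothetical group A scores
b = rng.normal(55, 10, 30)   # hypothetical group B scores

# Pooled-variances independent-samples t test
t_pooled, p_t = stats.ttest_ind(a, b, equal_var=True)

# Point biserial r: correlate the scores with a 0/1 group code
y = np.concatenate([a, b])
group = np.r_[np.zeros(25), np.ones(30)]
r, p_r = stats.pearsonr(group, y)

# The t that tests r against zero: t = r * sqrt(n - 2) / sqrt(1 - r^2)
n = len(y)
t_from_r = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)

print(abs(t_pooled), abs(t_from_r))  # same statistic
print(p_t, p_r)                      # same p value
```

Whether one calls the grouping variable an "independent variable" or a "predictor," the test is the same.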
I now advise my students that others will tend to use the term "independent variable" with variables that are manipulated rather than observed, or known rather than predicted, or measured earlier in time rather than later (if we predict high school grades from college grades, which variable is "independent"?), or are categorical rather than continuous, or are thought of as causes rather than effects. I explain that the use of the terms "independent variable" and "dependent variable" in nonexperimental research can cause confusion, but confess to doing it myself (but I am trying to stop this bad habit). I suggest alternative terms such as "predictor variable," "factor" (in ANOVA, but the American Psychological Association does not like that use), "grouping variable," "classification variable," "criterion variable," "outcome variable," and "response variable."
Here are the key points (in my mind) made by respondents to my query:
Here are the responses I received, unedited, except for my use of bold font.
Hi, Karl --
I'm not sure what you're looking for re incorrect definition of IVs and DVs, but here's what I say: In my intro stats courses, I begin with a discussion of kinds of research as the context in which statistics are used. I use different names for variables in observational research vs. variables in true experiments. I tell them that true experiments (or "experiments," so that they won't think there's a difference) are characterized by random assignment of participants to groups and manipulation of the different groups' experiences. I tell them the manipulated variable is an independent variable, and I give them a mnemonic for remembering its name: the IV comes first in time so it is independent of the results of the study. In this kind of research, the outcome variable is the dependent variable, which comes second in time and depends on what the people experienced.
Then I go on to talking about observational research as the other end of a spectrum, and I say this kind of research is characterized by not being an experiment. It lacks randomization and manipulation. Then I tell them I'm in the minority in the names I use for variables in observational research. The one that comes first in time is a predictor variable. It often is a grouping variable like an IV, but when there are groups, they usually are naturally occurring -- males vs females, alcoholics vs nonalcoholics, etc. I tell them the outcome variable is the criterion variable -- analogous to the DV, it is second in time. I also give the mnemonic that the order of the variables is not alphabetical: IV before DV, PV before CV. So a PV might not be a grouping variable; it could be a score like human babies' birthweight, which may predict developmental problems. Then I explain that although many journal articles will not make the distinctions that I make in variable names, my mentor at the University of Oklahoma (Larry Toothaker) and I share about 38 years of experience teaching statistics, and we think it helps students learn the difference between kinds of research studies.
Lise DeShea [firstname.lastname@example.org]
Generally, in an experiment and/or regression study .... IV is THE variable that YOU think does the causing ... or is the precursor to the DV ... which you think is the outcome.
Experiment: IV might be the method of instruction ... the experimenter manipulates this ... the DV might be scores on a test ... the outcome
(supposedly) due to the IV
Regression: IV might be HS grades ... that you think has an impact on the DV or College grades ... (usually, IV comes ahead of time to the DV)
Regression can be a bit ambiguous ... if you use Height to estimate Weight ... is H the IV??? Not so sure since the reverse is just as logical and BOTH are co-contiguous in time ... unless H was taken at age 5 and W taken at age 10
Best I can do at 8:30 AM. [Later Dennis added that he did not believe that being more careful with use of the terms "independent variable" and "dependent variable" would help readers remember whether the research was experimental or not.]
I use the term "independent variable" to indicate a variable that I can manipulate, or a variable that will have some effect on the "dependent variable."
Tanguma, Jesus [TANGUMAJ@COBA.PANAM.EDU]
I tell students that "dependent variables" are the measured variables that are tabulated, averaged, and so on. Every research question has at least one dependent variable.
"Independent variables" form the basis of a comparison between groups, conditions, etc, by defining the differences between the groups or conditions around which the comparison is made. Not every research question has an independent variable. I further distinguish between "manipulated independent variables" and "measured independent variables", with the obvious meanings.
Research question: What is the average income of CEOs of Fortune 500 companies?
Dependent variable: Income.
Independent variable: None.
Research question: What percentage of Fortune 500 CEOs are male?
Dependent variable: Gender.
Independent variable: None.
Research question: Do male or female Fortune 500 CEOs make more money?
Dependent variable: Income.
Independent variable: Gender. (measured)
Research question: Does a certain assertiveness training course help Fortune 500 CEOs make more money?
Dependent variable: Income.
Independent variable: Training course vs no-training control. (manipulated)
(Discussion of random assignment vs self-selection, etc).
Hope that helps,
Prof Jeff Miller
Dept of Psychology
Univ of Otago
Dunedin, New Zealand
My approach is descriptive rather than prescriptive. I prefer to get all the terms on the table -- independent variable, explanatory variable, predictor, regressor, covariate, concomitant variable, nuisance variable, control variable, dependent variable, response variable, criterion, etc -- and then discuss the similarities and differences of their meanings to different people.
Ray Koopman [email@example.com]
I like response and explanatory.
Edward J. Gilroy, PhD firstname.lastname@example.org
13453 W. Oregon Ct. phone:303-986-4944
Lakewood, CO. 80228 FAX: 413-702-4639
My understanding is that people are moving away from that terminology in favor of the clearer: predictor/response.
Fathom Dynamic Statistics Software
I teach very traditionally at first--manipulated vs measured variables, etc. But it gets more complicated when we get to regression. There I have to talk about predicted and predictors. And then when the linear model comes along (near the end of a one-semester course), we see that the two are fundamentally the same thing.
Good topic. I'll look forward to the results.
Richard S. Lehman email: email@example.com
Emeritus Professor of Psychology
Franklin & Marshall College
Lancaster, PA, 17604-3003
I am not sure "dependent" and "independent" are the best adjectives. In some cases they are outright misleading, for example when the "independent" variables are not really mutually independent. Why not look at a number of other expressions for the same concepts? For example, in neural networks, there are the input and output variables, illustrating the flow of information in and out of the model. In belief networks, the "evidence and query variables" convey the role of particular variables in inference, while they cannot truly be distinguished in modelling: before one poses the question, all variables are the same. The great variety of terms listed is really calling for a glossary and "standardization". Namely, there seem to be several meanings, not just two. I'd even argue that the meaning of dependent and independent is so rarely applicable (except as an assumption) that it should be eliminated.
Artificial Intelligence Laboratory,
Faculty of Computer and Information Science, University of Ljubljana.
I've tried to avoid the usage whenever possible. Textbooks don't usually permit one to avoid the issue utterly, but one can control one's own use of language. Generally, I've tried to refer to "predictor" or "design variable" or "design factor" more or less interchangeably (although "design factor" is more applicable to formal ANOVA designs than to multiple regression or GLM contexts); and "response variable" (or "response surface" when there are several) or "outcome" or perhaps "observed variable", also more or less interchangeably.
My view of "independent" and "dependent" as adjectives in this context is that one could hardly choose a pair more apt to engender error and misunderstanding, if one were deliberately trying to obfuscate.
Eventually I got tired of having to explain why one uses "independent" for describing a set of variables that usually _aren't_ independent (at least not in the sense of "independent from each other") except in the artificial context of designed, balanced factorial experiments.
If the textbook follows that usage, one pretty well has to pay some lip service to it (and therefore to go through such an explanation once), but it isn't necessary to subscribe to that labeling beyond such an acknowledgement.
Donald Burrill [firstname.lastname@example.org]
I tell my students that the independent variable is plotted on the horizontal axis and the dependent variable on the vertical axis.
For statistics I add explanatory variable = independent variable and response variable = dependent variable.
This is at the community college level.
Stan Brown, Oak Road Systems, Cortland County, New York, USA
I also posted the same query on Teaching in the Psychological Sciences. Here are the replies I received from members of that group.
In an experiment, the independent variable is the variable that is controlled (varied) by the experimenter and the dependent variable is the variable that we measure to determine if the manipulation of the independent variable has an effect.
Dr. Bob Wildlbood
Indiana University Kokomo
Yes, I agree these are often misused, and I go to great pains to mention all the possible legitimate and illegitimate uses. This is how I try to explain it to students:
Basically IV is reserved for an experiment--one variable is allowed to vary while all others are somehow held constant or equated for. That is the IV and it must have 'levels'.
DV is used more widely, but it is always a measured variable, something that is going to have some kind of 'units' attached.
It can be the outcome of an experiment and that is the preferred use, just to keep things from getting confusing. But it is also often used to refer to any measured variable, hence would be applicable for correlations--although I talk about variable 1 and variable 2 in that case, without getting into the whole DV concept because for a correlation we are more likely to have 'dummy' variables and then it's awkward to have talked about those as a DV.
Annette Kujawski Taylor, Ph. D., Department of Psychology
University of San Diego
1. Strictly speaking, I define the term independent variable to mean the variable that is manipulated in a true experiment. Also, I like to equate "factor" to independent variable.
2. Consequently the dependent variable is the response that may vary as a function of the independent variable.
3. However, I teach that if the study is a quasi experiment in which a variable is manipulated but not accompanied by control of extraneous variables (e.g., two intact groups given different treatments), then we may call the manipulated variable independent.
4. One implication of this is that a study that consists solely of subject variables is not a quasi experiment but rather is an ex post facto study.
5. I do not like to use the terms independent and dependent variables for nonexperimental research. In this type of research, the variables on the "input" side may be ex post facto or a personal characteristic of the subject (subject variable). Ideally, this term should be used to describe them. The response of the person can be called a response variable.
6. In a correlational study, a Pearson r may be calculated between, say, measured IQ and reaction time. I suggest that measured IQ be called a predictor variable and reaction time be a predicted variable.
7. Sometimes it is also appropriate to call subject variables predictor variables.
Stuart J. McKelvie, Ph.D., Chairperson, Department of
Psychology, Bishop's University
Some psychologists have argued that it isn't appropriate to use DV and IV unless the researcher has "experimental control" over the IV (i.e., manipulates it directly). Such psychologists tend to get miffed when the terms are used in correlational or quasi-experimental situations. I'm guessing this is the kind of "misuse" you're speaking of. Actually, such constrained use is pretty well restricted to psychology. In mathematics, IV and DV are used widely for pretty well any X and Y at all. Most of the time, the character of the variables is not specified at all, much less whether there is "experimental control," a "causal relationship" between them, and the like. Just another case of psychologists getting overdefensive about the "experimental" character of their "science." :-)
Christopher D. Green
Department of Psychology
Karl L. Wuensch wrote:
Thanks for your comments. As you will see later, I get miffed when psychologists refer to a classification variable as an independent variable in nonexperimental research and then argue that because they have an independent variable (cause) and a dependent variable (effect) they can make firm causal attributions.
That is, no doubt, a bad thing to do. I don't know that being fastidious about mere vocabulary can fix the problem, however. In any case, even the most fastidious experimental psychologists' views of what does, in fact, warrant causal inference are so naive and outdated as to not really be worth defending so vigorously. Hume and Mill are hardly the latest word in the theory of causation. :-)
Wuensch, Karl L wrote:
I've always adopted a very "common sense" approach to causality -- if I can usually make Y change by manipulating X, I am comfortable referring to X as a cause of Y, even if there are other events which can make Y change, even if the correlation between X and Y is not perfect, and even if X likely operates upon Y through a string of intervening variables. I also prefer to deal with observed rather than latent variables.
I don't know that this view is "common sense," but it is certainly close to the Humean position that psychology adopted in the early 20th century (except that your version is made virtually circular by your use of the term "make" in the definition). Hume simply said directional correlation. There are so many famous counter-examples that we're pretty certain that it's inadequate as well (e.g., day reliably comes before the night, but day doesn't cause night, as Thomas Reid pointed out in the 18th century). If you're interested in some of the work that's been done on cause in the intervening century, there's a good collection edited by Ernest Sosa and Michael Tooley entitled simply _Causation_ (Oxford, 1993).
Regards, Christopher D. Green
Perhaps I should have said the "everyday usage" meaning, rather than the "common sense" meaning of "cause." What I have in mind is essentially Collingwood's (An essay on metaphysics, Oxford, Clarendon, 1940, p. 285) second meaning of "cause," that is, "that which is 'caused' is an event in nature, and its 'cause' is an event or state of things by producing or preventing which we can produce or prevent that whose cause it is said to be." As an example, Collingwood used the flipping of a light switch (X) and the resulting change in illumination (Y). I know, from everyday experience, that flipping the light switch will be followed, usually, by a change of illumination, and the state of the light switch is something I can manipulate. I am quite comfortable saying that flipping the switch causes a change in level of illumination -- of course, if you pressed me, I would acknowledge that the relationship is not perfect (if the power is out or the conduit broken or the light bulbs broken, manipulating X will not influence Y) and that there are other causes of Y (manipulating the blinds, for example). As regards the relationship between day and night, well, I can't manipulate either of them. ;-)
Back to the Stat Help Page
Contact Information for the Webmaster,
Dr. Karl L. Wuensch
This page most recently revised on 28 January 2007.