Data-Help.txt

1. Assign ID numbers to your subjects and include those numbers in your data files. If you find bad data in your file, you can then go back to your raw data for that ID number and find the correct value.

2. Use column input rather than list input; it is more efficient and less troublesome (as detailed below). I strongly recommend that you NOT use list input (where the scores are matched with the variables by the order in which they appear, with a blank space between each score and the next and a single dot for missing data) for anything but short and simple data files. Use column input (where each variable's values must be found within a fixed field of columns) for large or complex data files.

I spent this afternoon struggling with a large, complex data file in list format. It took me hours to find that, for 43 subjects, one of the seven lines of data per subject had fewer than the 39 scores it was supposed to have. The default action of SAS in this case with list input is to go to the next data line to read the values it did not find on the current line. Typically this causes one to lose the data from that next subject as well as have bad data for the current subject.

One can use the MISSOVER option on the INFILE statement to stop SAS from looking for the missing scores on the next line, but it will then assign a missing value code to the scores it could not find. This may only hide the problem. For example, my subject was supposed to have 39 scores but had only 38. SAS assumes that the 38 it has are good and that the 39th is missing. It is, however, more likely that the omitted score is one of the first 38, and with list input its omission makes all of the scores after it on that line bad data.
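To make the difference between the two input styles concrete, here is a minimal sketch. The variable names (id, score1 to score3) and the data values are made up for illustration; they are not from the data file described above. In the list version a missing score must be marked with a dot, while in the column version the field is simply left blank and every other score stays in its own columns.

```sas
* List input: values are matched to variables by their order on the line;
data list_demo;
    input id score1 score2 score3;
    datalines;
101 12 . 15
102 14 11 13
;
run;

* Column input: each variable is read from a fixed range of columns;
data column_demo;
    input id 1-3 score1 5-6 score2 8-9 score3 11-12;
    datalines;
101 12    15
102 14 11 13
;
run;
```

If the dot on the first data line of the list version were left out, score2 would be read as 15 and SAS would go to the next line looking for score3. In the column version a blank field affects only that one variable.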
When a line has too few scores and you cannot go back and check the raw data archives (which I could not with the data I worked with today, because the persons who constructed the data set did not assign ID numbers to subjects), you should discard all of the data on that line (set the scores to missing). One can use the STOPOVER option on the INFILE statement to make SAS stop the data step as soon as it hits a data line with too few scores. That is fine if you have just a few such lines, but today my file had 43 such lines.

========================================================================  30
Date:    Mon, 12 Apr 93 14:16:32 PDT
From:    Melvin Klassen
Subject: Re: list directed data input woes
To:      PSWUENSC@ECUVM1

In article , you say:
>
>   I will not use list input with anything other than very simple and short
>data files. I have the displeasure of working with a long, complex, list
>directed data file of a colleague. There are seven lines of data for each
>subject, and for some lines there must be fewer data than there should be,
>because SAS tells me "SAS went to a new line when the INPUT statement reached
>past the end of a line." I would like to know which lines had too few data.
>How do I get SAS to tell me on which lines it reached past the end of a line?

Rewrite your INPUT statement as **seven** SAS statements, i.e.,

    INFILE CARDS MISSOVER;
    INPUT REC1A REC1B ... ;
    INPUT REC2A REC2B ... ;
    INPUT REC3A REC3B ... ;
    INPUT REC4A REC4B ... ;
    INPUT REC5A REC5B ... ;
    INPUT REC6A REC6B ... ;
    INPUT REC7A REC7B ... ;

Then, SAS will identify which of the seven statements caused the problem!
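
Another way to approach the question of which lines had too few data is to count the scores on every raw line before trying to read them into variables. The following is only a sketch under several assumptions: a SAS release that provides the COUNTW function, a hypothetical file name (scores.dat), blank-delimited scores, and exactly seven lines per subject with no lines missing altogether. The data set and variable names (linecheck, lineno, recno, nscores) are also invented for the example.

```sas
data linecheck;
    infile 'scores.dat' truncover;       /* hypothetical file name */
    input;                               /* load the raw line into _INFILE_, read no variables */
    lineno  = _n_;                       /* one line per iteration, so _N_ is the line number */
    recno   = mod(_n_ - 1, 7) + 1;       /* position of the line within its seven-line block */
    nscores = countw(_infile_, ' ');     /* blank is the only delimiter, so a lone dot counts */
    keep lineno recno nscores;
run;

/* Tabulate the number of scores found at each of the seven line positions */
proc freq data=linecheck;
    tables recno*nscores / norow nocol nopercent;
run;
```

Any line position that shows more than one value of nscores has lines with too few (or too many) scores, and lineno in the linecheck data set tells you where to look in the raw file.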