Keep in mind that we are using these inferential statistics to figure out whether our sample is representative of the population. In other words, we are trying to answer the question: Is our sample as representative as it should be? As we discussed Tuesday in class, how representative it should be is based in part on the sample size. Larger samples should give us better estimates of our population parameters.

*Example:*

Let’s assume we are using the data from the health or crime subsets
for this example. We want to know whether our sample includes the
“right” representation of people of different ages. Looking at the
population parameters for the State of Illinois (which are up on the website),
we see that the Census breaks up the Illinois adult population into two age groups:
18-65 and over 65. The Census reports that 85.4% of the Illinois population
is 18-65 and 14.6% is over 65.

Did our sample do as well as it should have in estimating the age breakdown of the Illinois population?

To find this out, we would run the frequencies for age using SPSS. This gives us the percentage of people at each age. If we go down to the row for age 65 and across to the cumulative percent column, we see that 84.2% of our sample is 18-65, which leaves 15.8% over 65.

Not too far off, but are they close enough?

To figure this out, we use the margin of error. We divide 1 by the square root of the sample size (1500) and multiply by 100. This gives us 2.58%. (This formula is a rough, slightly conservative approximation of the 95% margin of error for a percentage.) It means that there should be a 95% chance that our population parameter is within 2.58 percentage points of our sample statistic. If it isn’t within that range, then we have not represented the population as accurately as we should have on age.
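If you want to check the arithmetic outside of SPSS, here is a minimal sketch in Python. The function name `margin_of_error_pct` is just made up for this illustration; it applies the same rule of thumb described above (1 divided by the square root of the sample size, times 100) and nothing more.

```python
import math

def margin_of_error_pct(sample_size):
    """Rule-of-thumb 95% margin of error, in percentage points: 100 / sqrt(n)."""
    return 100 / math.sqrt(sample_size)

# Our sample of 1500 people
print(round(margin_of_error_pct(1500), 2))  # 2.58

# Larger samples give smaller margins of error (sample sizes chosen for illustration)
for n in (100, 500, 1500, 6000):
    print(n, round(margin_of_error_pct(n), 2))
```

Notice that quadrupling the sample size (1500 to 6000) only cuts the margin of error in half, because the sample size sits under a square root.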

84.2% - 2.58% = 81.62%

84.2% + 2.58% = 86.78%

15.8% - 2.58% = 13.22%

15.8% + 2.58% = 18.38%
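The same four calculations can be sketched in Python. The helper name `interval` is hypothetical, and the 2.58 margin of error is simply the rounded value from the step above.

```python
def interval(sample_pct, margin_pct):
    """Return (low, high): the sample percentage minus and plus the margin of error."""
    low = round(sample_pct - margin_pct, 2)
    high = round(sample_pct + margin_pct, 2)
    return (low, high)

margin = 2.58  # margin of error in percentage points, for a sample of 1500

print(interval(84.2, margin))  # (81.62, 86.78)  ages 18-65
print(interval(15.8, margin))  # (13.22, 18.38)  over 65
```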

So, if the population parameter for 18-65 year olds is not between 81.62%
and 86.78%, we have not done as good a job as we should have in representing
the age breakdown of the population.

Likewise, if the population parameter for people over 65 is not between 13.22%
and 18.38%, we have not done as good a job as we should have in representing
the age breakdown of the population.

These are called the 95% confidence intervals. Note: When using
the standard error of the mean, the SPSS program prints out the 95% confidence
interval, so you don’t have to do any math.

Are the population parameters within these confidence intervals? Yes, they are!
85.4% (the population parameter for 18-65) is between 81.62% and 86.78%, and 14.6%
(the population parameter for over 65) is between 13.22% and 18.38%. We have done
as good a job as we should have in estimating the population parameters.
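This final check can also be sketched in Python. The function name `within_interval` is made up for this illustration; the percentages are the Census figures and sample figures from the example above.

```python
def within_interval(population_pct, sample_pct, margin_pct):
    """True if the population percentage falls within sample_pct +/- margin_pct."""
    return sample_pct - margin_pct <= population_pct <= sample_pct + margin_pct

margin = 2.58  # margin of error in percentage points, for a sample of 1500

print(within_interval(85.4, 84.2, margin))  # True: 18-65 group
print(within_interval(14.6, 15.8, margin))  # True: over-65 group
```

If either line printed False, we would conclude that the sample had not represented that age group as accurately as it should have.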