Biserial and Polychoric Correlation Coefficients

You know that the point biserial correlation coefficient is used to measure the association between a dichotomous variable and a continuous variable.  Sometimes you may be willing to assume that your dichotomous measurements came from an underlying, normally distributed latent variable that was split at some cut point.  In such cases you can use the biserial correlation coefficient to estimate the value of the Pearson correlation between the continuous variable and that latent variable.  The program Biserial.sas on my SAS Programs page illustrates how to compute this statistic with SAS.
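As a rough sketch of the arithmetic involved (not necessarily the method used in Biserial.sas), the biserial coefficient can be obtained from the point biserial coefficient with the textbook conversion r_b = r_pb * sqrt(p*q) / y.  Here p and q = 1 - p are the proportions of cases in the two groups, and y is the height of the standard normal curve at the point that divides it into areas p and q.  The values of r_pb and p below are hypothetical, chosen only for illustration.

```sas
/* Sketch only: convert a point biserial r to a biserial r.       */
/* r_pb and p are hypothetical values, not results from any data. */
data _null_;
  r_pb = 0.40;               /* hypothetical point biserial correlation      */
  p    = 0.30;               /* hypothetical proportion of cases in one group */
  q    = 1 - p;              /* proportion of cases in the other group        */
  z    = probit(p);          /* normal deviate that cuts off area p           */
  y    = pdf('normal', z);   /* ordinate of the standard normal curve at z    */
  r_b  = r_pb * sqrt(p*q) / y;
  put r_pb= p= y= r_b=;      /* results are written to the SAS log            */
run;
```

Because sqrt(p*q)/y is always greater than 1, the biserial coefficient is always larger in absolute value than the point biserial coefficient computed from the same data.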

Those who employ a polychoric correlation coefficient assume that each of the two observed dichotomous or ordered categorical variables represents an underlying continuous variable, and that the two underlying variables have a bivariate normal distribution.  The polychoric correlation coefficient is a maximum likelihood estimate of the Pearson r for those underlying normally distributed variables.  When both variables are dichotomous, the polychoric correlation coefficient may be called the tetrachoric correlation coefficient.

The program Polychoric.sas on my SAS Programs page illustrates how to use Proc Freq to compute a polychoric correlation.  The data were contrived to illustrate a relationship between the weight of a person and the size of the meal eaten by that person.
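For readers who want to see the general shape of such a program, here is a minimal sketch.  It is not the contents of Polychoric.sas: the data set name, the variable names (bodywt, mealsize, count), and the cell counts are all hypothetical.  The PLCORR option on the TABLES statement is what asks Proc Freq for the polychoric correlation.

```sas
/* Sketch only: hypothetical cell counts, not the data in Polychoric.sas. */
/* bodywt:   1 = light, 2 = medium, 3 = heavy   (assumed coding)          */
/* mealsize: 1 = small, 2 = medium, 3 = large   (assumed coding)          */
data meals;
  input bodywt mealsize count;
  datalines;
1 1 20
1 2 10
1 3 5
2 1 10
2 2 20
2 3 10
3 1 5
3 2 10
3 3 20
;
run;

proc freq data=meals;
  weight count;                      /* each input row is a cell frequency    */
  tables bodywt*mealsize / plcorr;   /* PLCORR requests the polychoric r      */
run;
```

If both variables had only two levels, the same PLCORR request would yield the tetrachoric correlation.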
