Review of Hypothesis Testing, Confidence Intervals and Correlation

Review of the Concept of Correlation

In this section we briefly review the concept of correlation. Please read again the following chapters in the StatSoft electronic textbook: Elementary Concepts in Statistics, and in Basic Statistics the sections

The Pearson correlation coefficient is widely used. As noted in the above sections, it measures the strength of the linear relationship between two variables. Because it is computed from deviations about the means, a few outlying data points can distort it badly, especially when the sample size is small. Similarly, pooling non-homogeneous groups may suggest a relationship that does not hold within the groups: once the data are stratified (separated into homogeneous groups), the correlation between the variables may change completely. In addition, a strong non-linear (for example, polynomial) relationship may go undetected when relying on correlation alone.
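The three pitfalls above can be illustrated with a small sketch in Python. The data sets are made up purely for illustration, and the helper name pearson_r is our own, not taken from the textbook or from any library.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# 1) Strong non-linear relationship: y is exactly x squared,
#    yet the linear correlation vanishes because the curve is symmetric.
x_curve = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [x ** 2 for x in x_curve]
r_curve = pearson_r(x_curve, y_curve)

# 2) Outlier in a small sample: five weakly related points,
#    then the same points plus one extreme observation (made-up data).
x_small = [1, 2, 3, 4, 5]
y_small = [3, 1, 4, 2, 3]
r_without = pearson_r(x_small, y_small)
r_with = pearson_r(x_small + [20], y_small + [20])

# 3) Non-homogeneous groups: within each group y falls as x rises,
#    but pooling the groups gives a positive correlation.
x_a, y_a = [1, 2, 3], [3, 2, 1]
x_b, y_b = [6, 7, 8], [8, 7, 6]
r_a = pearson_r(x_a, y_a)
r_b = pearson_r(x_b, y_b)
r_pooled = pearson_r(x_a + x_b, y_a + y_b)

print("quadratic relationship:", round(r_curve, 3))
print("small sample without / with outlier:", round(r_without, 3), round(r_with, 3))
print("group A, group B, pooled:", round(r_a, 3), round(r_b, 3), round(r_pooled, 3))
```

In each case the coefficient alone would be misleading, which is why a scatter plot of the data is worth inspecting before interpreting r.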

If you tried to make sense of the Pearson correlation formula, you may have noticed that the correlation coefficient r is simply the ratio of the covariance of the two variables, say x and y (σx,y, estimated by Sx,y), to the product of their standard deviations (σx and σy, estimated by Sx and Sy respectively). Equivalently, the denominator is the square root of the product of the two variances. Here are a few more correlation examples done using Microsoft Excel.
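Before turning to those examples, the ratio described above can be checked by hand. The following is a minimal sketch in Python, assuming the usual sample estimates with n - 1 in the denominator. The five data pairs are invented for illustration, and the variable names are our own.

```python
from math import sqrt

# Made-up illustrative data (not from the textbook)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sample covariance Sx,y (assumes the n - 1 denominator)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

# Sample variances and standard deviations Sx and Sy
var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
var_y = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
s_x = sqrt(var_x)
s_y = sqrt(var_y)

# r as covariance divided by the product of the standard deviations
r = s_xy / (s_x * s_y)

# Same value written with the square root of the product of the variances
r_alt = s_xy / sqrt(var_x * var_y)

print("covariance:", s_xy)
print("standard deviations:", round(s_x, 4), round(s_y, 4))
print("r:", round(r, 4))
```

Note that the n - 1 factors cancel between numerator and denominator, so r is the same whether the covariance and variances are computed with n or with n - 1.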