Review of Hypothesis Testing, Confidence Intervals and Correlation
Review of the Concept of Correlation
In this section we briefly review the concept of correlation. Please read again the following chapters in the StatSoft electronic textbook: Elementary Concepts in Statistics, and in Basic Statistics the sections
The Pearson correlation coefficient is widely used. As noted in the sections above, it measures the strength of the linear relationship between two variables. Consequently, outlying data points can distort it badly, especially when the sample size is small. Similarly, non-homogeneous groups may lead one to believe that a relationship exists, yet once the data are stratified (separated into homogeneous groups) the correlation between the variables may change completely. In addition, a strong non-linear relationship, such as a polynomial one, may go undetected when relying on correlation alone.
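The three pitfalls just described can be illustrated with a small sketch. The data below are made up purely for illustration, and the helper name pearson_r is an assumption of this example rather than anything taken from the textbook.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# 1) Non-linear relationship: y is exactly x squared, yet r is zero
x_quad = [-3, -2, -1, 0, 1, 2, 3]
y_quad = [x ** 2 for x in x_quad]
r_quadratic = pearson_r(x_quad, y_quad)

# 2) Outlier in a small sample: no linear trend until one point is added
x_small = [1, 2, 3, 4, 5]
y_small = [2, 1, 3, 1, 2]
r_without_outlier = pearson_r(x_small, y_small)
r_with_outlier = pearson_r(x_small + [20], y_small + [20])

# 3) Non-homogeneous groups: each group trends downward, the pooled data trend upward
xa, ya = [1, 2, 3], [3, 2, 1]
xb, yb = [6, 7, 8], [8, 7, 6]
r_group_a = pearson_r(xa, ya)
r_group_b = pearson_r(xb, yb)
r_pooled = pearson_r(xa + xb, ya + yb)

print(r_quadratic, r_without_outlier, r_with_outlier)
print(r_group_a, r_group_b, r_pooled)
```

In this hypothetical data set, a single extreme point moves r from roughly zero to a value close to 1, and pooling two groups that each have a perfectly negative relationship yields a clearly positive r.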
If you tried to make some sense of the Pearson correlation formula, you may have noticed that the correlation coefficient r is simply the ratio of the covariance of two variables, say x and y (σx,y, estimated by Sx,y), to the product of the standard deviations of those variables (σx and σy, estimated by Sx and Sy respectively). Equivalently, the denominator is the square root of the product of the two variances. In symbols, r = Sx,y / (Sx Sy).
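As a rough check of this relationship, the sketch below computes the sample covariance and the two sample standard deviations separately, then divides the covariance by their product. The data and the function names are assumptions made for illustration only.

```python
from statistics import mean, stdev

def sample_covariance(xs, ys):
    """Sample covariance Sx,y, using the n - 1 denominator."""
    mean_x, mean_y = mean(xs), mean(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (len(xs) - 1)

def correlation_from_covariance(xs, ys):
    """r = Sx,y / (Sx * Sy), with Sx and Sy the sample standard deviations."""
    return sample_covariance(xs, ys) / (stdev(xs) * stdev(ys))

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

s_xy = sample_covariance(x, y)
r = correlation_from_covariance(x, y)
print(s_xy, r)
```

Because the covariance and both standard deviations use the same n - 1 denominator, that factor cancels in the ratio, so r does not depend on whether n or n - 1 is used as long as the choice is consistent.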
Here are a few more correlation examples done using Microsoft Excel.