Multiple Linear Regression

Did you take the Module 2 quiz? I hope you did. We now have the basics of regression covered, and you should have a good understanding of what regression is all about: modeling the relationship between variables, or line fitting. If the line does not really appear to fit the data, then possibly there is no relationship to model. Please remember that common sense rules here, as in all modeling.

So far we have studied the relationship between two variables only. Do things get more difficult when there are more variables? Not necessarily; it is actually going to be more of the same. The same principles continue to hold. However, because we cannot draw beyond three dimensions, we cannot study the relationships visually once there are more than three variables, i.e. one dependent and two independent variables. Still, we will be able to develop and analyze the models as we did earlier in simple linear regression. All concepts discussed in Module 2 remain valid, so it is very important that you go back to Module 2 to review any concept that remains unclear. Please review the summary tables for model and parameter testing, as well as the testing of OLS assumptions using residual plots.
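To make "more of the same" concrete, here is a minimal sketch of an ordinary least squares fit with one dependent and two independent variables. It is only an illustration, not the course's method or software: the data are made up and generated without noise, the helper names `solve_linear_system` and `fit_ols` are hypothetical, and the fit solves the normal equations in plain Python.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column into the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_ols(xs, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk.

    xs is a list of k columns, one list per independent variable.
    Returns [b0, b1, ..., bk].
    """
    n = len(y)
    # Design matrix columns: a column of ones for the intercept, then each x.
    cols = [[1.0] * n] + [[float(v) for v in col] for col in xs]
    # Normal equations: (X'X) b = X'y.
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    return solve_linear_system(xtx, xty)


# Made-up data generated exactly from y = 1 + 2*x1 - 3*x2 (no noise).
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [1 + 2 * a - 3 * b for a, b in zip(x1, x2)]

coefficients = fit_ols([x1, x2], y)
print([round(b, 6) for b in coefficients])
```

Because the data contain no noise, the fit should recover the intercept and the two slopes used to generate them, up to rounding. Adding a third independent variable only means passing one more column; the procedure itself does not change.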

Note: You should still continue to plot variables pairwise against each other to study and understand their relationships. Which variables? Ideally you should plot all possible variable pairs, and then use this information to judge, for example, whether the variables appear to be related and whether the parameter signs (+, -) appear to make sense. We will talk about the parameter signs later.
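As a rough sketch of what "all possible variable pairs" means in practice, the snippet below lists every pair and, optionally, draws one scatter plot per pair. The variable names are hypothetical, and the plotting part assumes matplotlib is installed; neither comes from the course material.

```python
from itertools import combinations


def variable_pairs(names):
    """Return every unordered pair of variable names."""
    return list(combinations(names, 2))


def plot_all_pairs(data):
    """Draw one scatter plot per variable pair.

    data maps a variable name to its list of observations.
    matplotlib is imported here so the pair listing works without it.
    """
    import matplotlib.pyplot as plt  # assumption: matplotlib is installed

    pairs = variable_pairs(list(data))
    fig, axes = plt.subplots(1, len(pairs), squeeze=False,
                             figsize=(4 * len(pairs), 4))
    for ax, (a, b) in zip(axes[0], pairs):
        ax.scatter(data[a], data[b])
        ax.set_xlabel(a)
        ax.set_ylabel(b)
    fig.tight_layout()
    plt.show()


# Hypothetical names: one dependent (y) and two independent (x1, x2) variables.
pairs = variable_pairs(["y", "x1", "x2"])
print(pairs)  # [('y', 'x1'), ('y', 'x2'), ('x1', 'x2')]
```

With k variables there are k(k-1)/2 pairs, so the number of plots grows quickly: three variables give 3 plots, four give 6, and ten give 45.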

Note: What about the relationship between the independent variables? Should that relationship also be linear? It is very important to note that in multiple linear regression the independent variables should be independent of each other. In other words, if you plot two independent variables against each other, the plot should ideally show little or no association between them. The correlation coefficient will help, but as you saw earlier, it can miss strong non-linear (e.g. polynomial) relationships. Therefore, it is still best to plot the variables against each other.
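A small sketch shows why the correlation coefficient alone is not enough. The data are made up: `x2_linear` is an exact linear function of `x1`, while `x2_quadratic` is an exact quadratic function of `x1`. The helper `pearson_r` is a hypothetical name for a plain-Python version of the usual Pearson correlation formula.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up values, symmetric around zero.
x1 = [-3, -2, -1, 0, 1, 2, 3]
x2_linear = [2 * v + 1 for v in x1]    # perfectly linear in x1
x2_quadratic = [v ** 2 for v in x1]    # perfectly determined by x1, but not linear

r_linear = pearson_r(x1, x2_linear)
r_quadratic = pearson_r(x1, x2_quadratic)
print(r_linear, r_quadratic)
```

Here the linear pair gives a correlation of essentially 1, while the quadratic pair gives essentially 0 even though `x2_quadratic` is completely determined by `x1`. A scatter plot of the second pair would show the U-shaped relationship immediately.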

Note: Please also note the key word in the title of this module: linear. We are still considering only straight-line (linear) relationships between any independent variable (x) and the dependent variable (y).
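For reference, the multiple linear regression model is commonly written as below. The notation is an assumption and may differ from the textbook's: k independent variables x_1 through x_k, an intercept β_0, one slope parameter β_j per independent variable, and an error term ε.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon
```

With k = 1 this reduces to the simple linear regression model of Module 2. Each independent variable enters through a single slope, which is what makes its relationship with y a straight line when the other variables are held fixed.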

The Electronic Textbook (by StatSoft) is again a good starting point. It has lots of material available, beyond what is covered in this entire course, for your reading and review. From that book, please read through the following chapters and sections:

Note: As before, extensive use of graphical tools will help you understand the data under consideration, as well as the relationships between variables. Even just for the fun of it, always plot variables against each other to learn about their relationship and its magnitude. Because graphical analysis is so important, please review the following again:

Reference will be made to these sections later in this module.

Also, please remember to turn to the book's very extensive Glossary when you need a short, more technical overview of a topic or concept, or are looking for a definition, and to the Statistical Tables if you need to check critical values for the t- and F-tests of your regression models.