Simple (univariate) Linear Regression

Case 4: Data Groupings - In practice, data groupings (non-homogeneous data) occur frequently. The graph below shows a linear regression model fitted to data that fall into two distinct groups, Group A and Group B. Taken together, the data form an increasing pattern, but within each group the pattern is not increasing. Within Group A the data show a negative (downward) trend, whereas within Group B there appears to be no trend at all (no regression). The pooled regression model is therefore of little use for describing either group.
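
The effect described above can be reproduced with a small numerical sketch. The data below are synthetic and chosen only for illustration (they are not the data behind the graph), and the helper fit_line is a hypothetical name for a plain least-squares fit. Assuming those made-up values, the pooled slope comes out positive, the Group A slope negative, and the Group B slope essentially zero.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, illustrative data (an assumption, not taken from the original graph).
# Group A sits at low x and low y and trends downward.
group_a_x = [1, 2, 3, 4, 5]
group_a_y = [5.0, 4.5, 4.0, 3.5, 3.0]
# Group B sits at high x and high y and shows no trend.
group_b_x = [8, 9, 10, 11, 12]
group_b_y = [9.0, 9.2, 8.8, 8.8, 9.2]

# One model for all data versus one model per group.
slope_all, _ = fit_line(group_a_x + group_b_x, group_a_y + group_b_y)
slope_a, _ = fit_line(group_a_x, group_a_y)
slope_b, _ = fit_line(group_b_x, group_b_y)

print(f"pooled slope:  {slope_all:+.3f}")
print(f"Group A slope: {slope_a:+.3f}")
print(f"Group B slope: {slope_b:+.3f}")
```

The positive pooled slope here is driven entirely by the offset between the two groups, not by any relationship between x and y inside either group.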

It is the analyst's responsibility to design the experiment (see DOE) so that such situations are avoided. For example, if you study the fuel consumption of vehicles, it makes sense to consider cars, trucks, RVs, and motorcycles separately. A further breakdown may also make sense, e.g. stratifying cars into compact, medium, and full-size, and potentially further still by powertrain configuration. For further information, please read the sections Data Reduction and Non-homogeneous groups in the StatSoft electronic textbook.