Elementary Statistics – Outliers and Influential Points in Regression Analysis

 

A regression analysis may turn up one or more outlying observations. We can’t simply delete outliers unless we have careful justification to do so, but we can, for example, run the regression both with and without the outliers and report both sets of findings to our client.
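As a minimal sketch of fitting both ways, the following Python snippet uses NumPy's `np.polyfit` on made-up data. The data values, the helper name `fit_line`, and the choice of the last point as the outlier are all assumptions for illustration only.

```python
import numpy as np

# Hypothetical data: the first seven points lie close to y = 2x + 1;
# the last point (6, 22) sits well above that pattern.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 6.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1, 13.0, 14.9, 22.0])

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

# Fit once with every observation.
slope_all, intercept_all = fit_line(x, y)

# Fit again with the suspected outlier (the last point) left out.
keep = np.arange(len(x)) != len(x) - 1
slope_wo, intercept_wo = fit_line(x[keep], y[keep])

print(f"With outlier:    y = {slope_all:.3f}x + {intercept_all:.3f}")
print(f"Without outlier: y = {slope_wo:.3f}x + {intercept_wo:.3f}")
```

Both fitted lines would go into the report, together with a note on which observation was set aside and why.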

 

Here’s the scoop on outliers and influential points:

 

Outliers

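· An outlier is an observation that lies outside the overall pattern of the other observations.

· Outliers in the y-direction often have large residuals, but not all outliers have large residuals. An outlier in the x-direction can pull the regression line toward itself and end up with a small residual.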
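The two cases can be checked numerically. The sketch below is only an illustration under assumed data: six made-up points near the line y = x, to which we append either an outlier in the y-direction or an outlier in the x-direction. The helper name `residuals` is ours.

```python
import numpy as np

def residuals(x, y):
    """Residuals (observed minus fitted) from the least-squares line."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# Hypothetical base data lying close to the line y = x.
base_x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
base_y = np.array([1.2, 1.9, 3.1, 3.9, 5.2, 5.8])

# Case A: append an outlier in the y-direction, far above the pattern.
res_a = residuals(np.append(base_x, 3.5), np.append(base_y, 9.0))

# Case B: append an outlier in the x-direction that does not follow y = x.
res_b = residuals(np.append(base_x, 20.0), np.append(base_y, 6.0))

# The appended point is the last entry in each residual array.
print("Case A residuals:", np.round(res_a, 2))
print("Case B residuals:", np.round(res_b, 2))
```

With these particular numbers, the appended point in case A has the largest residual in absolute value. In case B the line is pulled toward the appended point, so its residual is smaller in absolute value than those of several ordinary points.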

 

Example of an outlier in the y-direction with a large residual:

 

 

Example of an outlier that doesn’t have a large residual:

 

 

 

Influential Observations

· An observation is influential if its removal markedly changes the regression line.

· Outliers in the x-direction are often influential.
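One rough way to look for influential observations is to drop each point in turn, refit, and see how much the slope moves. The sketch below does this on made-up data. Using the change in slope as the measure of "markedly changes" is our assumption here, not a standard threshold.

```python
import numpy as np

# Hypothetical data: six points near y = x plus one outlier in the
# x-direction at (20, 6).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 20.0])
y = np.array([1.2, 1.9, 3.1, 3.9, 5.2, 5.8, 6.0])

# Slope of the least-squares line using all observations.
full_slope = np.polyfit(x, y, 1)[0]

# Refit with each observation removed and record the change in slope.
slope_change = np.empty(len(x))
for i in range(len(x)):
    loo_slope = np.polyfit(np.delete(x, i), np.delete(y, i), 1)[0]
    slope_change[i] = abs(loo_slope - full_slope)

most_influential = int(np.argmax(slope_change))
print("Slope with all points:", round(full_slope, 3))
print("Change in slope when each point is removed:", np.round(slope_change, 3))
print("Most influential point:", (x[most_influential], y[most_influential]))
```

For this data set, removing the point at x = 20 changes the slope far more than removing any other point, which is consistent with the second bullet above.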

 

Example of an influential observation: