Linear Regression Assumptions

Master the foundation of reliable statistical analysis. Understand, check, and interpret the five key pillars of regression.

Learning Goals

Understand

Grasp the key assumptions underlying linear regression models.

Check (SPSS)

Execute checks using plots, residuals, VIF, and Durbin–Watson stats.

Interpret

Analyze violations and decide on appropriate responses.

Why assumptions matter

Linear regression isn't magic; it rests on specific mathematical foundations. If these are not met, results can be misleading.

  • Coefficients may be biased
  • Significance tests may be unreliable
  • Predictions may be inaccurate

The 5 Key Assumptions

1. Linearity

The relationship between predictors and outcome must be linear.

How to Check: Scatterplot of standardized residuals vs. predicted values. Residuals should scatter randomly without a pattern.
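To make the two plotted quantities concrete, here is a minimal sketch of how standardized predicted values and standardized residuals could be computed by hand for a single predictor. The function names and the `hours`/`scores` data are hypothetical, invented for illustration, and the standardization used (residuals divided by the standard error of the estimate) is assumed to correspond to what SPSS labels ZRESID.

```python
import math

def fit_simple_regression(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def standardized_plot_values(x, y):
    """Return (zpred, zresid), the two axes of the residual scatterplot."""
    n = len(x)
    intercept, slope = fit_simple_regression(x, y)
    predicted = [intercept + slope * xi for xi in x]
    residuals = [yi - pi for yi, pi in zip(y, predicted)]
    # Standardize predictions by their own mean and sample SD.
    mean_pred = sum(predicted) / n
    sd_pred = math.sqrt(sum((p - mean_pred) ** 2 for p in predicted) / (n - 1))
    zpred = [(p - mean_pred) / sd_pred for p in predicted]
    # Standardize residuals by the standard error of the estimate
    # (n - 2 degrees of freedom: one intercept plus one slope).
    se_est = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    zresid = [e / se_est for e in residuals]
    return zpred, zresid

# Hypothetical data, made up for this example only.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 60, 68, 70, 75, 79]
zpred, zresid = standardized_plot_values(hours, scores)
for zp, zr in zip(zpred, zresid):
    print(f"{zp:6.2f} {zr:6.2f}")
```

Plotting the second column against the first gives the scatterplot described above; a curve or bend in the points would suggest non-linearity.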

2. Independence of errors

Residuals (errors) are independent of one another.

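How to Check: Durbin–Watson statistic. It ranges from 0 to 4, and values near 2 indicate uncorrelated residuals. Values between 1.5 and 2.5 are generally acceptable; values below 1 or above 3 are a clear cause for concern.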
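For readers who want to see where the number comes from, the Durbin–Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below computes it directly; the function name and the two residual series are hypothetical, chosen only to show values far from 2.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residual series, in the order the cases were observed.
alternating = [1, -1, 1, -1, 1, -1]  # sign flips every case: negative autocorrelation
trending = [3, 2, 1, -1, -2, -3]     # slow drift: positive autocorrelation

print(round(durbin_watson(alternating), 2))  # well above 2
print(round(durbin_watson(trending), 2))     # well below 2
```

The statistic depends on case order, so it is only meaningful when the rows have a natural sequence (for example, time).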

3. Normality of residuals

Residuals should follow a normal distribution.

How to Check: Normal P–P Plot of standardized residuals. Points should lie close to the diagonal line.
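A P–P plot compares two cumulative probabilities for each residual: the proportion observed in the sample and the proportion expected under a normal distribution. The following is a rough sketch of those coordinates, assuming Blom's rank formula for the observed proportions and ignoring ties; the function name and sample values are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def pp_plot_points(zresid):
    """Return (observed, expected) cumulative probabilities for a normal P-P plot."""
    n = len(zresid)
    ordered = sorted(zresid)
    normal = NormalDist(mean(ordered), stdev(ordered))
    # Observed cumulative proportion for each rank (Blom's formula, an assumption here).
    observed = [(rank - 0.375) / (n + 0.25) for rank in range(1, n + 1)]
    # Expected cumulative probability if the residuals were exactly normal.
    expected = [normal.cdf(z) for z in ordered]
    return observed, expected

# Hypothetical standardized residuals, made up for this example.
zresid_sample = [-1.6, -0.9, -0.4, -0.1, 0.2, 0.5, 1.0, 1.3]
observed, expected = pp_plot_points(zresid_sample)

# Largest vertical distance from the diagonal; small values mean the points hug the line.
max_gap = max(abs(o - e) for o, e in zip(observed, expected))
print(round(max_gap, 3))
```

Points on the diagonal correspond to `observed == expected`, so the size of `max_gap` is a simple numeric companion to the visual check.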

4. Homoscedasticity

Variance of residuals is constant across levels of predicted values.

How to Check: Plot standardized residuals vs. predicted values. The spread should look even (rectangular), not cone-shaped.
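The "even versus cone-shaped" judgment can be backed up with a quick number. The sketch below is an informal heuristic, not an SPSS procedure or a formal test: it splits the cases at the middle of the predicted values and compares the residual spread in the two halves. The function name and both data sets are hypothetical.

```python
def spread_ratio(zpred, zresid):
    """Residual mean square in the upper half of predictions over the lower half."""
    pairs = sorted(zip(zpred, zresid))  # order cases by predicted value
    half = len(pairs) // 2
    lower = [r for _, r in pairs[:half]]
    upper = [r for _, r in pairs[-half:]]

    def mean_square(values):
        # Residuals are centred on zero, so the mean square tracks their spread.
        return sum(v ** 2 for v in values) / len(values)

    return mean_square(upper) / mean_square(lower)

# Hypothetical standardized values, made up for this example.
zpred_demo = [-1.4, -1.0, -0.6, -0.2, 0.2, 0.6, 1.0, 1.4]
even_resid = [0.8, -0.9, 1.0, -0.9, 0.9, -1.0, 0.9, -0.8]    # rectangular band
funnel_resid = [0.1, -0.2, 0.3, -0.4, 0.9, -1.2, 1.6, -1.9]  # widening cone

ratio_even = spread_ratio(zpred_demo, even_resid)
ratio_funnel = spread_ratio(zpred_demo, funnel_resid)
print(round(ratio_even, 2), round(ratio_funnel, 2))
```

A ratio close to 1 is consistent with constant variance, while a ratio far from 1 in either direction points to a funnel shape worth inspecting on the plot.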

5. Multicollinearity

Predictors should not be too highly correlated with each other.

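How to Check: Collinearity diagnostics (Tolerance & VIF). A VIF above 10 indicates a serious problem, and some authors use a stricter cutoff of 5. Tolerance is 1/VIF, so the matching warning levels are below 0.1 (or 0.2).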
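Tolerance and VIF both come from how well one predictor is explained by the others: tolerance is 1 − R² from that regression, and VIF is its reciprocal. With exactly two predictors, R² is simply their squared correlation, which allows a short sketch. The function names and data below are hypothetical, and the two-predictor shortcut is a simplification of the general case.

```python
import math

def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)

def tolerance_and_vif(x1, x2):
    """Tolerance and VIF for a model with exactly two predictors."""
    r = pearson_r(x1, x2)
    tolerance = 1 - r ** 2  # share of variance NOT explained by the other predictor
    # Perfectly correlated predictors give tolerance 0 and would raise ZeroDivisionError.
    return tolerance, 1 / tolerance

# Hypothetical predictor scores, made up for this example.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
tolerance, vif = tolerance_and_vif(x1, x2)
print(round(tolerance, 2), round(vif, 2))
```

With more than two predictors, each VIF requires regressing that predictor on all the others, which is what the SPSS collinearity diagnostics report.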

SPSS Walkthrough

Step 1
Go to Analyze > Regression > Linear
Step 2
Click Statistics... and tick:
  • Durbin–Watson (under Residuals)
  • Collinearity diagnostics (among the checkboxes to the right of Regression Coefficients)
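Step 3
Click Plots... and set:
  • Y axis: *ZRESID
  • X axis: *ZPRED
  • Normal probability plot (under Standardized Residual Plots), which produces the P–P plot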

Example Interpretation

In a training dataset, the Durbin–Watson was 1.95 (✔ independence). The P–P Plot showed residuals close to the diagonal (✔ normality). VIF values ranged 1.2–2.3 (✔ no multicollinearity). Scatterplot showed no funnel pattern (✔ homoscedasticity).

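Conclusion: No violations were found in the checks reported. Linearity is judged from the same residual scatterplot, which should also show no curved pattern.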

Quick Check-in

1. Which plot checks linearity and homoscedasticity?

2. What is an acceptable range for Durbin–Watson?

3. If a predictor has VIF = 12, what problem does this indicate?

Resources