Problems with Regression
April 9, 2012
Browsing around on Google Scholar, I ran across this accessible paper (which seems excellent, to my eyes) on the use and misuse of regression analysis. It focuses on how the technique is used in criminology, but its claims apply more broadly.
The author, Berk, distinguishes three levels of regression analysis:
Level I: descriptive – simply identifying patterns in the data. No broader inferential or causal claims are made. This can always be justified.
Level II: inferential – estimating parameters of a population, hypothesis testing, use of confidence intervals, etc. This can be justified if the data has been generated by probability sampling. If the data has not been generated by probability sampling, level II analysis is “difficult to justify” (485).
[Berk gives several types of justification that could be offered in this scenario: 1) Treating the data as the population (i.e. falling back to descriptive statistics); 2) Making the case that the data can be treated as if it were a probability sample (“rarely credible in practice”); 3) Treating the data as a random sample from an imaginary ‘superpopulation’ (“even more difficult to justify than inferences from the as-if strategy”); 4) Making use of a model of how the data was generated (risky, because the model might be wrong).]
Level III: causal – estimating causal relationships between variables in the population. “Perhaps too simply put, before the data are analyzed, one must have a causal model that is nearly right” (481). But: “It is very difficult to find empirical research demonstrably based on nearly right models” (482).
Berk concludes that: “With rare exceptions, regression analyses of observational data are best undertaken at Level I. With proper sampling, a Level II analysis can be helpful.” Level III is very difficult to justify. Unfortunately: “The daunting part is getting the analysis past criminology gatekeepers. Reviewers and journal editors typically equate proper statistical practice with Level III.” (486)
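To make the Level I / Level II distinction concrete, here is a small sketch of my own (not from Berk's paper). Everything in it is an assumption for illustration: the population is synthetic, the relationship y = 2 + 0.5x + noise is made up, and `ols_slope_intercept` is just a hypothetical helper name. The first fit only describes the data in hand, which is a Level I summary. The repeated draws show why a standard error means something at Level II: it describes how the slope varies across simple random samples from a real, well-defined population. Without that sampling mechanism, there is nothing for the standard error to describe.

```python
import random
import statistics


def ols_slope_intercept(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Assumption: a synthetic finite population of 10,000 (x, y) units.
population = []
for _ in range(10_000):
    x = rng.uniform(0, 10)
    y = 2.0 + 0.5 * x + rng.gauss(0, 1)
    population.append((x, y))

# The population slope is the parameter a Level II analysis tries to estimate.
pop_slope, _ = ols_slope_intercept(*zip(*population))

# Level I: describe the data in hand. No claim beyond these 200 rows.
sample = rng.sample(population, 200)
xs, ys = zip(*sample)
slope, intercept = ols_slope_intercept(xs, ys)
print(f"Level I summary: y is roughly {intercept:.2f} + {slope:.2f} * x in this data set")

# Level II: because each sample is a simple random sample from the population,
# the spread of the slope across repeated samples is a meaningful standard error.
slopes = []
for _ in range(500):
    draw = rng.sample(population, 200)
    slopes.append(ols_slope_intercept(*zip(*draw))[0])

se = statistics.stdev(slopes)
print(f"Population slope: {pop_slope:.3f}")
print(f"Mean of sample slopes: {statistics.fmean(slopes):.3f}, empirical SE: {se:.3f}")
```

If the 200 rows had instead come from a convenience sample (say, whichever cases happened to be on file), the first print statement would still be a fair description, but the resampling loop would no longer correspond to anything that actually happened.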