“Econometrics texts devote many pages to the problem of multicollinearity in multiple regression, but they say little about the closely analogous problem of small sample size,” Arthur Goldberger noted many years ago, continuing, “Perhaps the imbalance is attributable to the lack of an exotic polysyllabic name for ‘small sample size.’ If so, we can remove that impediment by introducing the term ‘micronumerosity.’”

His point, of course, is the same as the one I made above. Collinearity is not an issue, even in finite samples, except in the same sense that a small sample is an issue: the data simply contain less information, and our measures of sampling variability correctly reflect that lack of information.
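To make the analogy concrete, here is a minimal sketch in Python with NumPy. It assumes a classical linear model with a known error variance `sigma2`; the simulated data, the correlation of 0.9, and the function names are all made up for illustration. It checks that the usual sampling variance of each slope, taken from `sigma2 * (Z'Z)^{-1}`, equals `sigma2 / (SST_j * (1 - R_j^2))`. Fewer observations shrink `SST_j`, collinearity shrinks `1 - R_j^2`, and both enter the reported variance in the same way.

```python
import numpy as np

def slope_variance_direct(X, sigma2):
    """Slope variances read off the diagonal of sigma2 * (Z'Z)^{-1}, Z = [1, X]."""
    Z = np.column_stack([np.ones(len(X)), X])
    return sigma2 * np.diag(np.linalg.inv(Z.T @ Z))[1:]

def slope_variance_vif(X, sigma2):
    """Same variances written as sigma2 / (SST_j * (1 - R_j^2))."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        xj = X[:, j]
        # Regress x_j on a constant and the remaining covariates.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef = np.linalg.lstsq(others, xj, rcond=None)[0]
        resid = xj - others @ coef
        # The residual sum of squares equals SST_j * (1 - R_j^2).
        out[j] = sigma2 / (resid @ resid)
    return out

# Hypothetical data: two covariates with population correlation 0.9.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + np.sqrt(1 - 0.9**2) * rng.normal(size=n)
X = np.column_stack([x1, x2])

direct = slope_variance_direct(X, sigma2=1.0)
vif = slope_variance_vif(X, sigma2=1.0)
print(direct, vif)  # the two vectors agree up to rounding error
```

The point of the second form is only that the variance formula already accounts for collinearity; nothing needs to be corrected after the fact.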

Another issue I do not mention in the blog post is that the variables used in the regression are arbitrarily chosen measures of an underlying risk concept, and the relation between the underlying concepts is not clearly defined.

I think what you have in mind in your opening paragraph is *perfect* collinearity, which means the covariates are linearly dependent and, in turn, that the model is not identified by the sample. Your software will indeed warn you about that (Stata, for example, will just drop variables until the model is identified).
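For what it is worth, the perfect-collinearity case is easy to see numerically. The sketch below uses Python with NumPy and made-up covariates `a`, `b`, and `c` (all hypothetical names), where `c` is an exact linear combination of the other two. The design matrix then has fewer independent columns than coefficients, which is what "not identified by the sample" means here. NumPy itself does not drop variables the way Stata does; the rank check only shows the condition a regression package would be detecting.

```python
import numpy as np

# Hypothetical covariates: c is an exact linear combination of a and b.
rng = np.random.default_rng(1)
n = 50
a = rng.normal(size=n)
b = rng.normal(size=n)
c = 2.0 * a - 3.0 * b

# Design matrix with a constant and all three covariates.
Z = np.column_stack([np.ones(n), a, b, c])

# With linearly dependent columns the rank falls short of the column count,
# so Z'Z is singular and the four coefficients are not separately identified.
rank = np.linalg.matrix_rank(Z)
print(rank, Z.shape[1])
```

Dropping any one of `a`, `b`, or `c` restores full column rank, which is in effect what dropping variables until the model is identified accomplishes.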
