### Saturday, March 11, 2006

## Re: st: Wald Chi-Square in Logistic with Cluster Option

At 05:25 PM 3/11/2006, you wrote:

> The good news is that, assuming your logistic model specifications are
> correct, then your Wald value is OK. It may be that some of your variables
> are highly collinear with each other, and it's that that's pushing it up a
> few notches: you can check this with Richard Williams' highly useful
> -collin- post-estimation command, downloadable from SSC.

Thanks to Clive for the kind words. Alas, much as I'd like to claim credit for -collin- (along with xtabond2 and several other programs!) the actual author is Phil Ender and you need to get it from UCLA, not SSC. Just use -findit collin- to get a copy.

> The bad news is that comparing two logistic regression models, even if
> they both have some independent variables in common, is _wrong_. For the
> full reasoning, you can check out a neat .pdf file from that man again
> Williams at
>
> http://www.nd.edu/%7Erwilliam/xsoc694/x04.pdf

Not quite. The problem comes in comparing coefficients across models, e.g. you have x1, x2 and x3 in a model, you then add x4, x5 and x6, and you observe that the coefficients for x1, x2 and x3 are quite a bit different in the two models. This is a fairly common thing to do with OLS regression models, but, for reasons explained in the handout, can be highly deceptive when doing things like logistic regression. But, that doesn't mean that you can't run a series of models, and see whether adding or deleting variables significantly affects the fit of the model.
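To make that last point concrete, here is a minimal Stata sketch of running nested models and testing whether the added variables improve the fit. The names y, x1-x6 and id are placeholders, not taken from the original question. Because the models use the cluster option, the sketch relies on a Wald test via -test- rather than -lrtest-, which as far as I know is not considered valid after clustered estimation.

```stata
* Hypothetical names: y is the binary outcome, x1-x6 are predictors,
* and id identifies the clusters. None of these come from the original post.

* Baseline model
logit y x1 x2 x3, cluster(id)

* Expanded model with the three additional variables
logit y x1 x2 x3 x4 x5 x6, cluster(id)

* Joint Wald test of the null that x4, x5 and x6 all have zero coefficients
test x4 x5 x6
```

This only addresses whether the added variables matter jointly; it says nothing about whether the coefficients for x1, x2 and x3 can be compared across the two models.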

In the case of the original problem, I am not sure what is going on. The behavior seems bizarre to me; you add a variable, and the chi-square plummets by 80,000??? You add a different variable, and it plummets by over 111,000? I suspect it has something to do with the use of clustering, but I really don't know. Besides -collin-, I might do some simple descriptive stats, e.g. crosstab Y with some of the Xs. 23 Xs is a lot; perhaps the data are being spread too thin. Maybe add variables in small groups and see if there is some point at which the chi-square goes wild, and then see if there is something odd about the variable that causes it. I'd also probably cheat and try running it without the cluster option, and see if that produces more sensible results; I believe that would suggest that clustering was somehow part of the problem.
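The checks suggested above might look roughly like the following in Stata. This is only a sketch: y, x1-x23 and id are assumed names, the range x1-x23 assumes the variables sit next to each other in the dataset, and the grouping of variables into blocks of five is arbitrary.

```stata
* Hypothetical names: y (binary outcome), x1-x23 (predictors), id (cluster).

* Simple descriptives: look for empty or nearly empty cells
* (the crosstab is only informative if x1 is categorical)
tabulate y x1
summarize x1-x23

* Collinearity diagnostics (user-written; get it with -findit collin-)
collin x1-x23

* Add variables in small groups and watch where the Wald chi-square goes wild
logit y x1-x5, cluster(id)
logit y x1-x10, cluster(id)
logit y x1-x15, cluster(id)

* Full model without the cluster option, for comparison
logit y x1-x23
```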

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
FAX: (574)288-4373
HOME: (574)289-5227
EMAIL: Richard.A.Williams.5@ND.Edu
WWW (personal): http://www.nd.edu/~rwilliam
WWW (department): http://www.nd.edu/~soc

