RE: st: Wald Chi-Square in Logistic with Cluster Option

Thanks to both Clive and Richard for clarifying the above subject. To identify the problem further, I did the following:

I ran -collin- and found several collinear variables. Although these are not the main variables of interest, I included them to control for their effects. Removing them and rerunning the regression made the Wald chi-square even higher.

I managed to identify one variable whose removal made the chi-square statistic look more reasonable. This is a year dummy variable, included as a control. Combining this variable with another year dummy produced more sensible chi-square results.

I also ran the regression without the cluster option, which gave me a reasonable chi-square. However, some of the variables that I controlled for were not significant.
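
In case it helps anyone reproduce the comparison, the runs described above look roughly like the sketch below. The variable names (y, x1, x2, yr2 to yr5, id) are placeholders rather than my actual variables, and -collin- is a user-written command that has to be installed first.

```stata
* Placeholder names: y = binary outcome, x1 x2 = controls,
* yr2 yr3 yr4 yr5 = year dummies, id = cluster identifier.

* Collinearity diagnostics (user-written; locate it with -findit collin-)
collin x1 x2 yr2 yr3 yr4 yr5

* Logistic regression with cluster-robust standard errors;
* the header reports a Wald chi2
logistic y x1 x2 yr2 yr3 yr4 yr5, cluster(id)

* Same model without clustering; the header reports an LR chi2 instead
logistic y x1 x2 yr2 yr3 yr4 yr5

* Count the clusters by flagging one observation per cluster
egen byte onepercluster = tag(id)
count if onepercluster

* Check how the suspect year dummy is distributed against the outcome
tabulate yr5 y
```

Note that the chi-square reported with the cluster option is a Wald statistic, while the one reported without it is a likelihood-ratio statistic, so the two figures are not directly comparable.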

So it seems to me that the problem could be due to clustering, to the inclusion of one particular year dummy variable, or to both. I have checked Section 8.3 of the Hosmer-Lemeshow book and exhausted the references to the cluster option on the Stata website, and I am wondering if someone could provide additional citations to help me learn more about this subject. Also, what is the relationship between the number of observations within a cluster and the number of independent variables? Is there any specific requirement that the former be larger than the latter, or is this irrelevant?

Richard Williams wrote:

