### Wednesday, March 01, 2006

## RE: st: fixed effects with clustering when the number of levels of variable to be absorbed exceeds number of clusters

Thanks a lot, Mark. Once again, extremely helpful. One last question: if I'm only interested in testing the significance of a handful of regressors, is this less of a concern? Thanks again for your thoughtful replies.

Daniel

At 04:49 PM 3/1/2006 +0000, you wrote:
> Daniel,
>
> This is a tricky question, at least for me, and I don't know the
> complete answer.
>
> The situation you describe is definitely a problem if you want to test
> lots of parameter restrictions. If you try, say, to test the joint
> significance of all your regressors, you will fail, because you have
> more (restrictions on) regressors than clusters. You will probably also
> see that the F statistic automatically reported by areg or xtreg is
> missing and highlighted in blue, and if you click on it you'll get a
> longish discussion that includes the following:
>
> "There is no mechanical problem with your model, but you need to
> consider carefully whether any of the reported standard errors mean
> anything. The theory that justifies the standard error calculation is
> asymptotic in the number of clusters, and we have just established that
> you are estimating at least as many parameters as you have clusters.
>
> Putting that concern aside, the model test statistic issue is that you
> cannot simultaneously test that all coefficients are zero because there
> is insufficient information. You could test a subset, but not all, and
> so Stata refuses to report the overall model test statistic."
>
> The full help message is available as -help j_robustsingular-.
>
> However ... there is some ambiguity in the statement above, since it
> implies that it's *possible* that none of the SEs mean anything. I used
> to think this was automatically the case if the cluster-robust var-cov
> matrix is not full rank, but now I'm not sure. It may be the case that,
> for example, you can still get valid tests of one or a few coefficients
> even if you can't test them all jointly. I've been meaning to go
> searching through the literature to find the references on this but
> haven't had the time....
>
> Cheers,
> Mark
>
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Daniel Simon
> > Sent: 01 March 2006 15:54
> > To: statalist@hsphsun2.harvard.edu
> > Subject: RE: st: fixed effects with clustering when the number of
> > levels of variable to be absorbed exceeds number of clusters
> >
> > Mark - thanks, this is very helpful, as usual. Now, I have a
> > follow-up. If, in addition to the set of fixed effects that I am
> > absorbing, I have another set of dummies that I am including manually
> > with i. and there are about as many of these i.fixed effects as there
> > are clusters, then this will pose a problem. Is that correct? For
> > example, if in my individual fixed effects model where I cluster on
> > state, I also want to include fixed effects for age (e.g. a separate
> > dummy for each value of age in years in my dataset), and I have forty
> > different age dummies, then the number of age dummies is close to the
> > number of clusters. In this situation, is there some way to assess
> > whether the estimates of the std errors are problematic? and, is there
> > some alternative way to proceed?
> >
> > Thanks again. Daniel
> >
> > At 03:19 PM 3/1/2006 +0000, you wrote:
> > > Daniel,
> > >
> > > What you need to be aware of is that the asymptotics justifying the
> > > cluster-robust estimator requires the number of clusters to go off
> > > to infinity. I don't think Austin's comment is quite right, at least
> > > in the context you've cited it. The number of fixed effects can be
> > > much bigger than the number of clusters, and that won't by itself
> > > cause a problem - after all, the fixed effects are not actually
> > > being estimated. What *will* cause problems is if you have very few
> > > clusters, esp. if compared to the number of parameters that you
> > > *are* estimating. In your example, you want to cluster by state.
> > > 50 is not very far on the way to infinity, but maybe it's enough for
> > > your purposes. But if you also have lots of parameters that you want
> > > to test, then you will start running into serious problems (nb: the
> > > rank of the cluster-robust var-cov matrix is equal to the number of
> > > clusters minus the number of estimated parameters).
> > >
> > > Hope this helps.
> > >
> > > --Mark
> > >
> > > > -----Original Message-----
> > > > From: owner-statalist@hsphsun2.harvard.edu
> > > > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Daniel Simon
> > > > Sent: 01 March 2006 15:02
> > > > To: statalist@hsphsun2.harvard.edu
> > > > Subject: Re: st: fixed effects with clustering when the number of
> > > > levels of variable to be absorbed exceeds number of clusters
> > > >
> > > > Sorry - I made a mistake in the subject line of my last message.
> > > > It is now correct. Daniel
> > > >
> > > > At 09:59 AM 3/1/2006 -0500, you wrote:
> > > > > Hi Austin - thanks for pointing out that "the number of levels
> > > > > of the absorb() variable should not exceed the number of
> > > > > clusters." I have two questions about this: (1) I assume that
> > > > > the same holds true for xtreg,fe with clustering (given that
> > > > > this yields identical std errors to areg with clustering). Is
> > > > > this assumption correct? (2) Does anyone have suggestions for
> > > > > the most efficient way to estimate fixed-effects models with
> > > > > clustering when there are thousands of fixed effects but
> > > > > clustering occurs on a variable with many fewer units? For
> > > > > example, if I have a panel dataset tracking thousands of
> > > > > individuals over time and I want to examine the impact of a
> > > > > state policy variable, then I would want to estimate a model
> > > > > with individual fixed effects but I would also want to cluster
> > > > > by state.
> > > > > What would be a sensible way to proceed in this situation?
> > > > >
> > > > > Thanks. Daniel
> > > > >
> > > > > At 02:06 PM 2/28/2006 -0500, you wrote:
> > > > > > Perhaps I should ignore this question in the same way you have
> > > > > > ignored the advice in the Statalist FAQ on how to write a
> > > > > > well-formed question (in particular, you give no indication
> > > > > > what command you used or what error message you got, much less
> > > > > > show us the output), but you should certainly read:
> > > > > > -help xtreg- -help xtdata- and -help areg- for starters. Note
> > > > > > also that you may want to cluster on id, assuming your fixed
> > > > > > effects are individual id and year effects, to allow for
> > > > > > arbitrary serial correlation within panel, and -cluster-
> > > > > > implies -robust-. But see the various FAQs on the subject, and
> > > > > > such advice as appears in the relevant help files, e.g.
> > > > > >
> > > > > >    Note: Exercise caution when using the cluster() option with
> > > > > >    areg. The effective number of degrees of freedom for the
> > > > > >    robust variance estimator is (n_g - 1), where n_g is the
> > > > > >    number of clusters. Thus the number of levels of the
> > > > > >    absorb() variable should not exceed the number of clusters.
> > > > > >
> > > > > > On 2/28/06, Yasmine Kent <yasmine_kent@yahoo.co.uk> wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > Apologies if this is a basic question...
> > > > > > >
> > > > > > > I would like to obtain ROBUST standard errors and
> > > > > > > t-statistics in a panel data regression that I am running
> > > > > > > (with 2-way fixed effects). The 'robust' command does not
> > > > > > > appear to work with panel data, it gives an error message.
> > > > > > > Theoretically, I thought that it should be possible to get
> > > > > > > these. Is there another command I should use instead? (I am
> > > > > > > using Stata 8).
> > > > > > >
> > > > > > > Thank you!
> > > > > > > Yasmine
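
The distinction discussed in the thread, testing one or a few coefficients versus testing a large set jointly when the number of clusters is small, can be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: the variable names (`y`, `policy`, `x1`, `age`, `id`, `state`) are hypothetical and do not come from the thread, and the syntax assumes Stata 11 or later (factor-variable `i.` notation and `vce(cluster ...)`). On Stata 8 or 9, as used at the time of the thread, the equivalent would be the `xi:` prefix with the `cluster()` option.

```stata
* Hypothetical variable names: y, policy, x1, age, id, state.
* Individual fixed effects are absorbed rather than estimated;
* standard errors are clustered by state (about 50 clusters).
areg y policy x1 i.age, absorb(id) vce(cluster state)

* Wald test of the single coefficient of interest.
test policy

* Joint test of all the age dummies. With roughly as many dummies
* as clusters, this is the kind of test the thread warns about.
testparm i.age

* Stata's own note on the missing overall F statistic, cited in the thread.
help j_robustsingular
```

Whether the single-coefficient test remains trustworthy in this setting is exactly the open question raised in the thread, so the output of `test policy` should be read with that caveat in mind.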

Daniel Simon
Assistant Professor
Department of Applied Economics and Management
Cornell University
(607) 255-1626

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

Tag: statalist