Tuesday, March 07, 2006

st: Re: tsappend with unbalanced panels

Dear David,

-tsappend- seems to work fine. I ran the procedure again, and it took two to three hours to complete. So, for large datasets (50000 observations, 15000 units and 200 variables), patience is required.

Thanks again.

Best,
Peter

David M. Drukker wrote:

> Dear Peter,
>
> I am glad to hear that -tsappend- is working well on a subset of your
> dataset.
>
> It is possible that -tsappend- will take a long time on a large dataset.
> -tsappend- is ado-code and with unbalanced data, it needs to loop
> through each panel in your dataset, which might take a while.
>
> I believe that Stata is not hanging, it's just taking a while to do
> the task that you gave it.
>
> By the way, if you also believe that Stata is not `hanging', but just
> taking a long time, it would be a good idea if you were to report this
> fact to statalist. Some people might be concerned by your original post.
>
> Best,
> David
>
> On Fri, 3 Mar 2006, Peter Willemé wrote:
>
>>
>> Dear David,
>> tsappend seems to work fine on a smaller dataset. In fact, I can't
>> say it doesn't work on the full set, it's just that Stata does not
>> seem to go on after the command. It seems to 'hang'. Or is it
>> possible that it just takes very long (more than half an hour)?
>> Thanks again for your help.
>> Best regards, and have a nice weekend,
>> Peter
>>
>> David M. Drukker wrote:
>>
>>> Dear Peter,
>>>
>>> Here is a simple way that you can verify that size is the issue with
>>> your own dataset.
>>>
>>> 1. Save off a copy of your dataset.
>>>
>>> 2. Keep the first 25 panels.
>>>
>>> 3. Use -tsappend- and verify that it produces the results that
>>> you expect.
>>>
>>> Given that it works on a small, random selection of your dataset,
>>> the issue does not lie with the command, but rather with the size of
>>> the dataset.
>>>
>>> Please keep me posted on the results of your experiment.
>>>
>>> Best,
>>> David
>>>
>>> On Fri, 3 Mar 2006, Peter Willemé wrote:
>>>
>>>>
>>>> Dear David,
>>>> I am actually working with a rather large dataset (about 50000
>>>> observations on some 200 variables). Maybe that's the problem.
>>>> Anyway, I don't think it's a good idea to send you a 21Mb file.
>>>> Maybe you could send me the small file that you used, together with
>>>> the commands? If that works here, there must be some other problem
>>>> with my dataset. I first thought it was a memory problem, but I
>>>> have increased that to 200m, which should be more than enough.
>>>> These are the commands that fail to work so far:
>>>>
>>>> set memory 200m
>>>> use "C:\Usr\Armoede\panel2.dta", clear
>>>> recast byte golf
>>>> tsset indid golf, yearly
>>>>
>>>> tsappend, add(3)
>>>>
>>>>
>>>> Thanks again,
>>>> Peter
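
For readers who want to try the three-step check quoted above, here is a minimal sketch on a toy dataset. It is not the file or the commands David used. The variable names indid and golf come from the commands in the original post; the toy data, the outcome y, the helper variables nwaves and panelno, and the use of a tempfile and -timer- are assumptions made for illustration.

```stata
* Toy unbalanced panel: 40 hypothetical units observed for 2 to 4 years.
* indid and golf are the identifiers from Peter's commands; y is made up.
clear
set obs 40
generate long indid = _n
generate byte nwaves = 2 + mod(_n, 3)
expand nwaves
bysort indid: generate int golf = 2000 + _n
generate double y = indid + golf/10000
drop nwaves

* 1. Save off a copy (a tempfile here; use a named file for real data).
tempfile full
save "`full'"

* 2. Keep the first 25 panels.
egen long panelno = group(indid)
keep if panelno <= 25
drop panelno

* 3. Run -tsappend- on the subset, timing it, and inspect the result.
tsset indid golf, yearly
timer clear 1
timer on 1
tsappend, add(3)
timer off 1
timer list 1

* Appended observations carry missing values for y.
count if missing(y)
list indid golf y if indid <= 2, sepby(indid)
```

Repeating step 2 with a larger cutoff (say 250, then 2500 panels) and comparing the -timer- output should give a rough idea of how run time grows with the number of panels before the command is launched on the full dataset.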

--
Dr Peter Willemé
Expert Social Security Research Group Health Economics
Federal Planning Bureau
Kunstlaan 47-49
B-1000 Brussels
Tel. +32 2 5077355

