Wednesday, March 15, 2006

st: -tsfill- with weekly data

Hi again,

I have panel data in which the time dimension is weekly, and each observation has month, day, and year variables. I want to -tsfill- it b/c the time range differs depending on the panelvar. After I execute -tsfill-, I want to set month, day, and year for the new obs. My code goes like this:

tempvar time;
ge `time' = mdy(month, day, year);
tempvar yearWeek;
ge `yearWeek' = yw(year(`time'), week(`time'));
tempvar panelvar;
egen `panelvar' = group(variable `geog');

/* this dupe business is b/c some years have 53 weeks, but Stata only allows 52 weeks */
tempvar dupe;
duplicates tag `panelvar' `yearWeek', ge(`dupe');
sort `panelvar' `yearWeek';
tempvar temp_value;
egen `temp_value' = mean(value) if `dupe'==1, by(`panelvar' `yearWeek');
replace value = `temp_value' if `dupe'==1;
duplicates drop `panelvar' `yearWeek', force;
tsset `panelvar' `yearWeek', weekly;
tsfill, full;
replace year = year(`yearWeek') if year==.;
replace month = month(`yearWeek') if month==.;
replace day = day(`yearWeek') if day==.;

The problem is that these last 3 lines don't work right. For example, the range for the year I fill is [1962, 1966] but the year range I start with is [1980, 2006]. I assume I am using the functions wrong.
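
A possible explanation for the symptom, offered as a sketch rather than a confirmed diagnosis: year(), month(), and day() expect a daily date (days since 01jan1960), while yw() returns a weekly date (weeks since 1960w1). Weekly values for 1980 onward start at 1040, and read as a count of days they fall in the early-to-mid 1960s. The standalone example below uses made-up variable names (wk, td, wrong_year) and the default line delimiter instead of `;`. It converts the weekly value with dofw() before extracting the components. Note that dofw() returns the first day of Stata's week, which may not be the same day of the month that the original observations carry.

```stata
* Minimal sketch with hypothetical data: three consecutive weekly dates
clear
set obs 3
generate wk = yw(1980, 1) + _n - 1    // 1980w1, 1980w2, 1980w3 (values 1040-1042)
format wk %tw

* Treating the weekly value as a daily date gives the wrong year (1962 here)
generate wrong_year = year(wk)

* Convert weekly -> daily first, then extract the components
generate td = dofw(wk)                // daily date of the first day of each week
format td %td
generate year  = year(td)
generate month = month(td)
generate day   = day(td)

list wk wrong_year td year month day
```

Applied to the code above, the same idea would mean using dofw(`yearWeek') inside year(), month(), and day() in the three -replace- statements.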

If I did instead:

tsset `panelvar' `time', weekly;

tsfill, full;
replace year = year(`time') if year==.;
replace month = month(`time') if month==.;
replace day = day(`time') if day==.;

Then the last 3 lines work out right. BUT the -tsfill- gives me too many days for each year/month. I.e., I get day = 1, 2, 3, 4, etc. for a given year/month, when there should only be 4 or 5 different days, separated by 7, in a given year/month.

How can I get the best of both worlds?

Danielle Ferry

National Bureau of Economic Research

