Tuesday, February 28, 2006
Re: st: too many duplicates with bsample, weight()?
Matissa Hollister <firstname.lastname@example.org>
> I've been experimenting a bit with the bootstrap commands and there seems to
> be something wrong with the bsample command when the weight option is used.
> As I understand it, ...
Matissa has found a problem in -bsample- when used with the -weight()- option and an expression that results in a resample size that is less than the sample size. While
. bsample, weight(w)
is returning the correct frequency weights for a simple random sample with replacement of the _N observations,
. bsample 10, weight(w)
is not when _N >> 10 (for example).
We have fixed the problem, and the updated -bsample- will be available in the next ado-file update.
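One way to check a given installation is to look at the total of the frequencies that -bsample- stores: they should add up to the requested resample size. The following is only a sketch; the seed and the variable name w are arbitrary choices, and the -assert- lines are expected to fail on a copy of -bsample- that still has the problem described above.

```stata
sysuse auto, clear
set seed 12345

* weight() needs an existing variable; -bsample- overwrites its contents
generate w = .

* full-size resample: frequencies should total _N
bsample, weight(w)
summarize w, meanonly
assert r(sum) == _N

* smaller resample: frequencies should total 10
bsample 10, weight(w)
summarize w, meanonly
assert r(sum) == 10
```

Because weight() is specified, the data in memory are left as they are and only w changes.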
> On a related note, is there the equivalent of the
> weight option for the bootstrap command? A way to
> leave the full dataset in memory? I saw the -nodrop-
> option but it's not completely clear to me what it
> does.
In short, no. The -nodrop- option prevents -bootstrap- from dropping out-of-sample observations specified in the -if- and -in- conditions. This option is mostly useful for something like
program myboot, rclass
        args y group
        * mean of `y' in group 0
        reg `y' if `group' == 0
        local m0 = _b[_cons]
        * mean of `y' in group 1
        reg `y' if `group' == 1
        return scalar diff = _b[_cons] - `m0'
end

. sysuse auto
. bootstrap diff=r(diff), nodrop reps(100) : myboot mpg foreign
Without the -nodrop- option, -bootstrap- would drop all the domestic cars. -bootstrap- assumes that the -e(sample)- left behind by its first call to the prefixed command identifies the estimation sample, and by default it drops every observation outside that sample. Here the last regression in -myboot- uses only the foreign cars, so -e(sample)- excludes the domestic ones.
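To see which observations would be treated as out-of-sample, one can run the last regression by hand and inspect -e(sample)-. This is a minimal sketch using the auto data shipped with Stata, not a description of how -bootstrap- works internally.

```stata
sysuse auto, clear

* same as the last regression in the example: foreign cars only
regress mpg if foreign == 1

* e(sample) marks only the observations used by that regression
count if e(sample)
tabulate foreign if e(sample)

* the domestic cars fall outside e(sample)
count if !e(sample)
```

With the shipped auto data, the first count should be the 22 foreign cars and the second the 52 domestic cars. The domestic cars are the ones that would be dropped without -nodrop-.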
--Jeff
email@example.com
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/