Tuesday, March 14, 2006

st: Re: Updating master dataset with a transaction dataset

There are a couple of ways you could do this. My preference would probably be to use reshape and merge:

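    use transaction
    reshape wide var4, i(var1 var2) j(var3) string
    * reshape names the new variables var4<name>; strip the prefix so they match the master
    renpfix var4
    sort var1 var2
    save transactionwide
    use master
    sort var1 var2
    merge var1 var2 using transactionwide, update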

This approach updates values in the master dataset from the transaction dataset only where the master value is currently missing. If you want to replace existing values, you would need to systematically rename the variables in transactionwide (e.g., you could -replace var3=var3+"X"- before the reshape), then perform the merge, giving you two versions of each variable, and then use a series of -replace- commands, one per variable, probably best done using a -foreach- loop.
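The overwrite variant described above might be sketched as follows. This is only an illustration on toy data, not a tested recipe for the real files: the variable names income and age, the tempfile names, and the "X" suffix are all assumptions made for the example. It also assumes no master variable already ends in "X", and that all values in var4 are numeric. Because -reshape wide- names the new variables var4<name>, the sketch strips that prefix before merging.

```stata
* Toy master dataset; income and age are hypothetical variable names.
clear
input var1 var2 income age
1 1 100 30
1 2 200 40
end
sort var1 var2
tempfile master
save `master'

* Toy transaction dataset: var3 names the variable to update, var4 holds the new value.
clear
input var1 var2 str8 var3 var4
1 1 "income" 150
1 2 "age"    41
end

* Suffix each target name so the wide variables do not collide with the master's.
replace var3 = var3 + "X"
reshape wide var4, i(var1 var2) j(var3) string
renpfix var4                  // var4incomeX -> incomeX (newer Stata: rename var4* *)
sort var1 var2
tempfile transwide
save `transwide'

* Old-style merge syntax, as in the post (newer Stata: merge 1:1 var1 var2 using ...).
use `master', clear
merge var1 var2 using `transwide'
drop _merge

* Overwrite each master variable wherever the transaction supplied a value.
foreach v of varlist *X {
    local target = substr("`v'", 1, length("`v'") - 1)
    replace `target' = `v' if !missing(`v')
    drop `v'
}
list
```

Since var4 is a single variable, every update it carries must share one storage type; a mix of string and numeric targets would need separate passes or a conversion step.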

Michael Blasnik michael.blasnik@verizon.net

----- Original Message -----
From: "Alice W Muehlhof" <muehlhof@Princeton.EDU>
To: <statalist@hsphsun2.harvard.edu>
Sent: Tuesday, March 14, 2006 7:57 AM
Subject: st: Updating master dataset with a transaction dataset

> Hi,
> I am relatively new to Stata, although I have programming experience in SAS and C.
> This is what I would like to do, but I cannot figure out how to do it in Stata:
>
> 1. My master dataset has hundreds of variables, two of which, var1 and var2, I combine to use as a unique identifier of each record.
>
> 2. My second dataset, the transaction dataset, has 4 variables: two of them are the same two as in the master dataset, var1 and var2. They are not unique identifiers on this dataset.
>    a. The third variable in the transaction dataset (var3) contains the name of a variable in the master dataset which is to be updated. This variable name is different from record to record on the transaction dataset.
>    b. The fourth variable, var4, contains the data that is to update var3 on the master dataset.
>
> Ex: I would match a record from the transaction dataset with the master dataset record on var1 and var2. Then I would look at the contents of var3 on the transaction dataset, which would tell me the name of the variable that needed to be updated on the master dataset. Var4 on the transaction dataset would tell me what the contents of this updated variable are to be.
>
> It is possible to have several records in the transaction dataset all with the same unique identifier, instructing the system to update different variables on the same record of the master dataset.
>
> Now I know that I can just create a series of replace varname with the contents of var4 statements and copy and paste this data into a do-file, and run it that way.
>
> But that is not very efficient, and I need to do this over and over again, so if there is a way I could read the transaction dataset, then retrieve the appropriate record from the master dataset, update the specified variable with its contents from var4, that would be better.
>
> Is there any way I can do this in Stata?
>
> Thank you so much for your help.
>
> Alice
>
> Alice Muehlhof
> Research Assistant
> Woodrow Wilson School
> Princeton University
