Someone asked how to count how many children are in a given household and how to assign this number to each individual in the same household.

What I usually do is:

sort hhid
egen no_kids=count(id) if age<19, by(hhid)
/*this counts kids in every hh and places the sum in a row where an individual is <=18yo, rows with adults will have a missing value because they did not meet the if condition*/
egen no_childr=max(no_kids), by(hhid)
/*this assigns the total number of children to each individual within the same hh*/
replace no_childr=0 if no_childr==.
drop no_kids
/*you do not need no_kids anymore, so drop it*/
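
If you prefer a single step, here is a minimal sketch of the same count using egen's total() on a logical expression. The toy households below are made up for illustration; only hhid, id and age come from the example above.

```stata
clear
input hhid id age
1 1 40
1 2 38
1 3 10
1 4 7
2 1 65
2 2 67
3 1 30
3 2 5
end

* (age < 19) evaluates to 1 for a child and 0 otherwise,
* so its total within a household is the number of children.
* A missing age compares as larger than any number, so it is not counted.
bysort hhid: egen no_childr = total(age < 19)

list, sepby(hhid)
```

Because total() returns 0 when nobody in the household qualifies, households without kids get 0 directly and no temporary variable or replace step is needed.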

The same trick can be used if you want to create variables with parents' education: let's say you need to create mom_ed and dad_ed, but your data only allows you to identify parents via a variable called relationship. Say, relationship 1=dad, 2=mom, 3=children and you have one variable called educ. Then:

sort hhid
gen mom_ed=educ if relationship==2
gen dad_ed=educ if relationship==1
/*use gen here, not egen, since no egen function is involved; at this point hhid does not matter, as the above variables will take a missing value if the "if" condition is not satisfied*/
egen mom_educ=max(mom_ed), by(hhid)
egen dad_educ=max(dad_ed), by(hhid)
/*this assigns mom's and dad's education levels to all members of the household*/
drop mom_ed dad_ed
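
The same construction can be tried on a small made-up dataset, here as a sketch that folds the condition into egen's max() with cond(), so no temporary variables are created. The values of hhid, relationship and educ below are invented for illustration.

```stata
clear
input hhid relationship educ
1 1 12
1 2 16
1 3 8
2 2 10
2 3 4
end

* cond() returns educ for the parent of interest and missing for everyone else;
* max() ignores missing values, so it picks up the parent's education
* and spreads it to every member of the household.
bysort hhid: egen mom_educ = max(cond(relationship == 2, educ, .))
bysort hhid: egen dad_educ = max(cond(relationship == 1, educ, .))

list, sepby(hhid)
```

A household with no dad (hhid 2 here) ends up with a missing dad_educ. If a household has more than one person coded as mom or dad, max() keeps the highest education among them, so check for that before relying on the result.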

You could also rectangularize the dataset to get the same results. If you are interested in this, let me know; I have some sample code somewhere.

Zamira

