RE: st: how to retain a complicated subset of data

This kind of problem is fairly common. Thus other solutions may also be
of interest.
(1)
egen number_children = total(number == 3), by(hhid)
keep if number_children
-keep if number_children > 0- is equivalent and may be more congenial.
(2)
egen any_children = max(number == 3), by(hhid)
keep if any_children
(3)
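egen no_children = min(number != 3), by(hhid)
drop if no_children
Here -number != 3- is 1 for everyone who is not a child, so its minimum
within a household is 1 only if the household has no children at all.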
In other contexts, the variables created by these -egen- operations can
be useful, especially if you want to keep families with no children
together with families with children.
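As a minimal sketch of that use, assuming the example data from the original question (with number == 3 marking a child), the indicator can be kept as a variable instead of being used to drop observations. The name hh_tag is illustrative only and does not appear in the thread.

```stata
clear
input hhid number
1 1
1 2
1 3
1 3
2 1
2 2
3 1
4 2
end

* 1 for every member of a household with at least one child, 0 otherwise
egen any_children = max(number == 3), by(hhid)

* flag exactly one observation per household, so each household counts once
egen hh_tag = tag(hhid)

* number of households without (0) and with (1) children
tabulate any_children if hh_tag

* later commands can be restricted to either group without dropping data
summarize number if any_children
```

All households stay in memory, so the same dataset can serve analyses of either group.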
There is a more systematic discussion in the FAQ
"How do I create a variable recording whether any members of a group (or
all members of a group) possess some characteristic?"
http://www.stata.com/support/faqs/data/anyall.html
(4) In this case, there is also a one-line solution:
bysort hhid (number) : keep if number[_N] == 3
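The one-liner in (4) works because 3 is the largest value of number, so any child sorts last within a household. It would fail if number could be missing (missing values sort last) or could take codes above 3. A sketch that avoids both assumptions sorts on an explicit indicator instead. The name is_child and the code 4 (say, another relative) in the data below are hypothetical, added only to illustrate the point.

```stata
clear
input hhid number
1 1
1 2
1 3
1 4
2 1
2 2
end

* 1 if the person is a child, 0 otherwise (also 0 when number is missing)
gen byte is_child = number == 3

* within each household the 1s sort last, so is_child[_N] is 1
* exactly when the household contains at least one child
bysort hhid (is_child) : keep if is_child[_N]

drop is_child
list
```

With these data, household 1 is kept in full and household 2 is dropped. Sorting on number itself would have dropped household 1, because its last value would be 4.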
Nick
n.j.cox@durham.ac.uk
Scott Merryman
clear
input hhid number
1 1
1 2
1 3
1 3
2 1
2 2
3 1
4 2
end
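* mark children; tag is missing for everyone else
gen tag = 1 if number == 3
* within each household, tag == 1 sorts before missing, so copy it down
bysort hhid (tag): replace tag = tag[_n-1] if tag == .
keep if tag == 1
drop tag
list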
Maghais SK
> I have a dataset on households of parents with kids and parents
> without kids. I only want to keep the data on parents and their kids.
> For instance, in the example below, I wish to keep only the first
> household (parents with 2 kids) and drop the other households. Is there
> any easy way of doing this? The keep and drop commands work on
> individual observations, but I need to keep all the household members if
> the household has kids. Thank you.
>
> HHID Number in household
> 001 1 (father)
> 001 2 (mother)
> 001 3 (child)
> 001 3 (child)
> 002 1 (father)
> 002 2 (mother)
> 003 1 (father)
> 004 2 (mother)