RE: st: RE: Data Management

Your notification of your earlier mistake removes my puzzlement.
In terms of your puzzlement:
by id : list prev city if prev != city & _n == 2
lists nothing under an -id- when no observation qualifies, but the
-by:- prefix still obliges -list- to print a heading for each group.
You can get more concise output in the way you specified.
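
A minimal sketch of that more concise form, assuming the variables are
named id, city and year as in the original question (year is used here
only to fix the order within each id, which is an assumption about the
data):

```stata
* Sketch only: assumes variables id, city and year exist
* Within each id, copy the first city (year 1) to every observation
bysort id (year) : gen prev = city[1]

* No by: prefix on -list-, so no per-id headings are printed
list id prev city if prev != city
```

The qualifier & _n == 2 is not needed here: the first observation in
each id always equals prev, so only later observations can be listed.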
Nick
n.j.cox@durham.ac.uk
Rijo John
That is interesting.
When I type the command as you originally wrote
by id : gen prev = city[1]
by id : list prev city if prev != city & _n == 2
the output is only green lines, with id=1, id=2, id=3 and so on under
each green line. Nothing else at all appears on the output page.
Whereas, when I gave the command
by id : gen prev = city[1]
list prev city if city!=prev
it listed what I want and was similar to the result I would get
using the -duplicates- command.
In my previous mail I wrongly said I only changed the "by id" portion
of your command. In fact I also omitted the "& _n == 2" portion from
your second command to get the result.
Thanks,
Rijo.
On Fri, Nov 21, 2008 at 11:48 AM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> I am confused. You changed my code, which you should do if it is
> wrong. But I don't think it is wrong.
>
> Also, if the second command is exactly what you typed, it should
> -list- at most one observation. _n == 2 is true for the whole dataset
> just once, and the compound statement will be true either never or
> once.
>
> There is some misinformation somewhere.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Rijo John
>
> Thanks Nick.
>
> When I wrote the same command
>
> by id : gen prev = city[1]
> list prev city if prev != city & _n == 2
>
> it gave me the solution.
>
> If I use "by id" again with the second command it would not list
> what I want.
>
> Thanks,
> Rijo.
>
> On Fri, Nov 21, 2008 at 11:35 AM, Rijo John <rmjohn@gmail.com> wrote:
>> Hi Nick,
>>
>> I will read more into the tip you gave. When I gave the command you
>> suggested
>>
>> by id : gen prev = city[1]
>> by id : list prev city if prev != city & _n == 2
>>
>> it just lists all the ids... one by one... Doesn't solve the problem.
>>
>> Thanks.
>> Rijo.
>>
>>
>>
>> On Fri, Nov 21, 2008 at 11:26 AM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
>>> This synthetic example shows that the command will list precisely
>>> those observations that differ from the previous observation. But
>>> this includes the first, as city[0] evaluates to string missing,
>>> i.e. "". More generally, varname[0] is regarded as missing in the
>>> sense of the variable's data type, i.e. numeric missing . or string
>>> missing "". So the first in each group will always be listed (unless
>>> its value is missing).
>>>
>>> . l
>>>
>>>      +------------+
>>>      |       city |
>>>      |------------|
>>>   1. | Durham, UK |
>>>   2. | Durham, UK |
>>>   3. | Durham, UK |
>>>   4. | Durham, NC |
>>>   5. | Durham, NC |
>>>      |------------|
>>>   6. | Durham, NH |
>>>   7. | Durham, NH |
>>>   8. | Durham, NH |
>>>   9. | Durham, NH |
>>>  10. | Durham, NH |
>>>      +------------+
>>>
>>> . list if city != city[_n-1]
>>>
>>>      +------------+
>>>      |       city |
>>>      |------------|
>>>   1. | Durham, UK |
>>>   4. | Durham, NC |
>>>   6. | Durham, NH |
>>>      +------------+
>>>
>>> You probably want
>>>
>>> by id : gen prev = city[1]
>>> by id : list prev city if prev != city & _n == 2
>>>
>>> There is no royal road to cleaning up string variables. The matter
>>> was discussed on the list earlier this year and written up as a Tip:
>>>
>>> SJ-8-3  dm0039 . . Stata tip 64: Cleaning up user-entered string variables
>>>         . . . . . . . . . . . . . . . . . . . . J. Herrin and E. Poen
>>>         Q3/08   SJ 8(3):444--445                        (no commands)
>>>         tip on how to clean up user-entered string variables
>>>
>>> Nick
>>> n.j.cox@durham.ac.uk
>>>
>>> Rijo John
>>>
>>> I have a data set as follows
>>>
>>> ID   City        Year
>>> 1    City name   1
>>> 1    City name   2
>>>
>>>
>>> The data are supposed to have the same city name for each ID in
>>> years 1 and 2, but there are many occasions where the city for year 1
>>> is spelt differently than that for year 2. I just want to list out or
>>> edit those cities where city names are different for year 1 and 2 for
>>> the same ID. When I issue the following command
>>>
>>> bysort ID : list if City!=City[_n-1]
>>>
>>> it lists all observations in the data whether or not the city is
>>> spelt differently in years one and two. That's strange to me. Can
>>> someone tell me what I am doing wrong here?