st: Re: cleaning panel data

. bysort firm (year) : gen prob1 = year[1] != year[_N]
. bysort id (region) : gen prob2 = region[1] != region[_N]
. list firm id year region if prob1 | prob2
Logic: for example, sort by -firm- and within each -firm- by -year-.
If the last value of -year- for each -firm- differs from the first,
you have a problem.

This is a more comprehensive solution than the one I proposed. I do think
there is still one kind of error that it will not catch, though: literal
duplicate observations with the same firmid|year|region. Such duplicates
agree on every one of those values, so they cannot by themselves trigger
either of the above conditions. It is quite possible that this kind of
error appears in these data. Judicious addition of -dups- to the above
will catch those as well.
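
For what it is worth, here is a minimal sketch of the duplicate check
using only built-in syntax, in case -dups- is not installed. It assumes
that -firm-, -year- and -region- together should identify an observation
uniquely, which is my reading of the data, and -prob3- is just a name I
made up to match -prob1- and -prob2-:

. * prob3 is 1 for every observation whose firm/year/region combination
. * occurs more than once, and 0 otherwise
. bysort firm year region : gen prob3 = _N > 1
. list firm id year region if prob3

If your Stata has the official -duplicates- command, something like
-duplicates report firm year region- should give a quick count of the
same thing before you list anything.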