st: RE: range of a stringvariable

Some simpler ways of approaching this have not quite come to the surface
in this thread.
Four key points:
1. You are not obliged to create lots of little variables.
2. You are not obliged to convert any bits and pieces to real unless you
genuinely want those results for other purposes.
3. Inequalities apply to strings as well as to numbers. The order
concerned is just alphanumeric order, precisely that used by Stata to
-sort- string variables.
4. -substr()- understands negative indexes as counted from the end of a
string.
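Points 3 and 4 are easy to check interactively. The lines below are only a small sketch using literal strings (the value "E343A" is made up for illustration); each -display- should show 1 for true and 0 for false, or the extracted substring.

```stata
* Point 3: string comparisons use the same order as -sort-
display ("E343" >= "E300" & "E343" <= "E499")
display inrange("E343", "E300", "E499")
display inrange("I460", "E300", "E499")

* Point 4: a negative start position in substr() counts from the end
display substr("E343A", -1, 1)
display substr("E343A", 1, 4)
```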
Thus
if inrange(substr(code, 1, 4), "E300", "E499") & substr(code, -1, 1) != "A"
is a complete answer to the first question. Similarly
if substr(code,-1,1) == "A"
is a complete answer to the second question.
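As a rough check, the two conditions above can be tried on the fake data from the original question. This is a sketch only: the sixth observation ("E343A") is an invented extra so that at least one code ends in A, and the variable names in_range and ends_A are hypothetical helpers, created here just to make the logic visible (point 1 stands: they are not required).

```stata
clear
input id str5 code
1 "I460"
2 "E343"
3 "I46"
4 "C764"
5 "E438"
6 "E343A"
end

* indicator: first four characters fall in E300 to E499 (string order)
generate byte in_range = inrange(substr(code, 1, 4), "E300", "E499")

* indicator: last character is "A"
generate byte ends_A = substr(code, -1, 1) == "A"

* first question: in range, excluding codes ending in A
list id code if in_range & !ends_A

* codes ending in A
list id code if ends_A
```

With these six observations the first -list- should show ids 2 and 5, and the second only id 6. Dropping the `& !ends_A` part keeps every code in the range whether or not it ends in A.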
It's the driest of dry reading but the functions section of the
documentation is an eye-opener in terms of the toolkit offered.
Nick
n.j.cox@durham.ac.uk
Tomas Lind
Choose individuals based on a string variable with a range of values
I am working with ICD-10 codes (codes for different types of diseases). The
codes start with a letter A-Z followed by 2 or 3 digits. In some cases
they might end with the letter A. Say that I have a dataset with 5 subjects
(id = 1 to 5) with these ICD-10 codes (fake data; in reality I have millions
of subjects):
I460 E343 I46 C764 E438
How can I choose individuals with ICD-10 codes in the range E300 to E499
(not including codes that end with A)? What if I want to include
codes that end with an A? (There is a convenient command for ICD-9 codes,
but not for ICD-10 codes.)