st: AW: RE: AW: RE: AW: Categorising dates

For a problem size that takes us straight from the Middle Ages to the not so
distant future, the difference is between 0.197 and 0.332 seconds, but let's
leave it at that...
*************
clear*
set mem 50m
//generate data
set obs 190000
gen mydates=date("1 Jan 1500", "DMY")+_n-1
format mydates %tdMon_dd,_CCYY
l if inlist(_n,1,2) | inlist(_n,_N)
timer clear
timer on 1
gen mymonth1 = string(mydates, "%tdMonth")
timer off 1
timer on 2
tostring mydates, gen(mymonth2) format(%tdMonth) force
timer off 2
qui timer list
di in r "string() timing: " r(t1) /*
*/ _n "tostring timing: " r(t2)
timer clear
*************
HTH
Martin
-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Tuesday, 24 August 2010 17:03
To: 'statalist@hsphsun2.harvard.edu'
Subject: st: RE: AW: RE: AW: Categorising dates
Just a mild protest, as signalled. I stand by my arguments. Sure, the
efficiency gain is not detectable at n = 50.
Nick
n.j.cox@durham.ac.uk
Martin Weiss
" gen mydays = string(mydates, "%tdMonth")
which replaces a call to an .ado which is dozens of lines long with a single
line of code with exactly the same effect."
Both calls are a single line long:
*************
gen mymonth1 = string(mydates, "%tdMonth")
tostring mydates, gen(mymonth2) format(%tdMonth) force
*************
And both work out at "0.00" seconds on my computer (-set rmsg on- to see for
yourself), so the benefit has got to be so slight not even Stata notices...
"Respecting the problem"
What is this heading supposed to mean? I gave Sara a solution that is
intelligible when -list-ed to the Results window. Most other solutions
require you to label afterwards using techniques as in your very own
http://www.stata-journal.com/sjpdf.html?articlenum=pr0013 (What is "3"
again, as in -di in r dow(date("23 Sep 09", "DM20Y"))- ? Solution: A
Wednesday...)
Generally, everything depends on what Sara wants to use the results for. In
the absence of this information, we can only guess...
Nick Cox
As the putative author of -tostring-, I must protest mildly at this use of
-tostring-, on two quite different grounds.
1. Style and efficiency
=======================
If you are working with a numeric variable, are inclined to allow force, and
wish only to generate a single string variable, you can and should get there
directly with e.g.
gen mydays = string(mydates, "%tdMonth")
which replaces a call to an .ado which is dozens of lines long with a single
line of code with exactly the same effect.
-tostring- is a convenience command which is, literally, convenient when (a)
you have two or more variables and/or (b) a desire to be prudent because you
are worried about loss of information in conversion. If neither applies,
calling up -tostring- is unnecessary.
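A minimal sketch of cases (a) and (b), assuming a toy dataset whose variable names (x, y, xs, ys) are purely illustrative and not from the thread. Without -force-, -tostring- should decline to convert a variable that cannot be converted reversibly, while still converting the others:

```stata
clear
set obs 3

* x holds thirds stored as double, so a default string conversion loses digits
gen double x = _n/3
* y holds small integers, which convert to strings without loss
gen y = _n

* no -force-: -tostring- checks each variable in turn and converts only
* those that survive the round trip back to numeric unchanged
tostring x y, generate(xs ys)

describe
```

Here only ys is expected to be created; adding -force- would create xs as well, at the cost of losing digits.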
2. Respecting the problem
=========================
For problems like Sara's the user is almost always better off with numeric
date variables assigned appropriate date formats.
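A minimal sketch of what that could look like for Sara's question, assuming the daily dates are already held in a numeric daily date variable mydates, as in the example quoted further down. The names weekstart and monthdate are illustrative and not from the thread:

```stata
clear
set obs 50
gen mydates = date("23 Sep 2009", "DMY") + _n - 26
format mydates %tdMon_dd,_CCYY

* dow() returns 0 (Sunday) through 6 (Saturday), so mod(dow() + 6, 7)
* is the number of days elapsed since the most recent Monday
gen weekstart = mydates - mod(dow(mydates) + 6, 7)
format weekstart %td

* mofd() converts a daily date to a monthly date; %tm displays it as such
gen monthdate = mofd(mydates)
format monthdate %tm

list mydates weekstart monthdate in 1/10, noobs
```

Both new variables stay numeric, so they sort chronologically and can be used directly in -collapse-, -egen- or -by:- without any labelling step.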
Nick
n.j.cox@durham.ac.uk
Martin Weiss
*************
clear*
//generate data
set obs 50
gen mydates=date("23 Sep 09", "DM20Y")+_n-26
format mydates %tdMon_dd,_CCYY
//Get day of week
tostring mydates, gen(mydays) format(%td_Dayname) force
//Get month
tostring mydates, gen(mymonth) format(%tdMonth) force
//see result
l, noo
*************
sara khan
I have a list of daily dates in the format, for example, 23 Sep 09, and
need to create two variables. One is to categorise the days into
weekly data (so week commencing on a Monday). The second is to create
a variable categorising the daily data into monthly data.
I would be grateful for advice on how to do this.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/