[Yaml-core] Date fields

Hi y'all. YAML newbie here. I'm the one Steve was talking about with
his noon-or-midnight posting. I'm trying to produce an event calendar
using PyYAML. Each event is a dictionary document. I thought I could
use YAML's timestamp type to automatically convert my dates into
mx.DateTime objects, but it looks like the timestamp type is a bit more
rigid than what I need. So, I'm looking for suggestions.
Each event has:
(1) a start date
(2) a start time, null if unknown
(3) an end date, ... but most events are 1-day so I'm skipping this for now
My first pass was to have a Date field combining (1) and (2):
Date: 2002-09-02 13:15
Date: 2002-09-02
Both forms failed because of bugs in PyYAML, but that's beside the
point. The point is that the timestamp type seems to require seconds.
This makes it inconvenient for hand-edited files that don't need that
precision.
I could separate the date and time into separate fields, but then what
do I do about the time field? It would come as a string (or worse, a
dictionary) rather than an mx.DateTime object, so I would have to convert
Time: and not Date:. This is doable but is, er, asymmetric.
The noon-or-midnight issue comes up with timeless dates. In other
libraries, null time is all zero (midnight), so you have to treat
midnight as a special case. This causes a problem if you actually do
have events that start at midnight: you'd have to encode them as 00:01
or something. So I may end up using a separate time field even though
I'd rather not. But the point here is that, while having the time
default to noon may make some operations easier (subtracting six hours
would still leave you in the same day, for instance), it makes other
operations harder: it means you have to treat a non-zero value as a
special case, which is pretty weird.
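The trade-off can be sketched in a few lines. This is only an illustration using Python's standard datetime module as a stand-in for mx.DateTime; the variable names are made up for the example.

```python
from datetime import datetime, timedelta

# A date-only value padded with a midnight default is indistinguishable
# from an event that really starts at midnight.
timeless = datetime(2002, 9, 2)            # time defaults to 00:00
at_midnight = datetime(2002, 9, 2, 0, 0)   # a genuine midnight event
same_value = timeless == at_midnight       # the distinction is lost

# With a noon default, subtracting six hours stays on the same day...
noon_default = datetime(2002, 9, 2, 12, 0)
stays_same_day = (noon_default - timedelta(hours=6)).date() == noon_default.date()

# ...but with a midnight default it lands on the previous day.
crosses_day = (timeless - timedelta(hours=6)).date() != timeless.date()
```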
Is the timestamp type supposed to be a general type for dates and/or
times? Or is it only for "timestamps" that go down to the second, and
my data model is outside its scope? If so, how do I tell YAML to parse
these fields as strings even though they look like dates?
--
-Mike (Iron) Orr, iron@... (if mail problems: mso@...)
http://iron.cx/ English * Esperanto * Russkiy * Deutsch * Espan~ol

Mike Orr (iron@...) wrote:
> Each event has:
> (1) a start date
> (2) a start time, null if unknown
> (3) an end date, ... but most events are 1-day so I'm skipping this for now
.. then later ..
>
> I could separate the date and time into separate fields, but then what
> do I do about the time field? It would come as a string (or worse, a
> dictionary) rather than an mx.DateTime object, so I would have to convert
> Time: and not Date:. This is doable but is, er, asymmetric.
I'd start with how you're storing the dates and times in Python. If I'm
storing something as an Array in my code, then I want it to be loaded
from my YAML document as an Array. It sounds like you're storing it
as an mx.DateTime object, so you could store it in YAML as an mx.DateTime
object.
This sounds like a good case for YAML's typing mechanism. I know in
PyYaml you could write a handler for a type family so that your YAML
would look like this:
Date: !!mxDate 2002-09-02 13:15
Date: !!mxDate 2002-09-02
..or..
Date: !!mxDate { date: 2002-09-02, time: 13:15 }
Date: !!mxDate { date: 2002-09-02 }
Your handler could then convert the data into an mx.DateTime object.
Yeah, it's a bit wordy. But it's a solution in the now.
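A minimal sketch of the conversion such a handler might perform, covering both the scalar and the mapping form shown above. It assumes the loader hands the handler plain strings, and it uses the standard datetime module in place of mx.DateTime. The function name to_datetime is hypothetical, and how the handler gets registered depends on the YAML library, so that part is left out.

```python
from datetime import datetime

def to_datetime(node):
    """Convert the value of a hypothetical !!mxDate node to a datetime.

    `node` is either a scalar string ("2002-09-02 13:15" or "2002-09-02")
    or a mapping with a "date" key and an optional "time" key, both strings.
    """
    if isinstance(node, dict):
        text = node["date"]
        if node.get("time"):
            text += " " + node["time"]
    else:
        text = node
    # Pick the format by whether a time part is present.
    fmt = "%Y-%m-%d %H:%M" if " " in text else "%Y-%m-%d"
    return datetime.strptime(text, fmt)
```

Note that a date without a time comes back as midnight here, so this sketch does not by itself preserve "timelessness".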
_why

----- Original Message -----
> Hi y'all. YAML newbie here. I'm the one Steve was talking about with
> his noon-or-midnight posting. I'm trying to produce an event calendar
> using PyYAML. Each event is a dictionary document. I thought I could
> use YAML's timestamp type to automatically convert my dates into
> mx.DateTime objects, but it looks like the timestamp type is a bit more
> rigid than what I need. So, I'm looking for suggestions.
>
> Each event has:
> (1) a start date
> (2) a start time, null if unknown
> (3) an end date, ... but most events are 1-day so I'm skipping this for now
>
> My first pass was to have a Date field combining (1) and (2):
> Date: 2002-09-02 13:15
> Date: 2002-09-02
>
> Both forms failed because of bugs in PyYAML, but that's beside the
> point. The point is that the timestamp type seems to require seconds.
> This makes it inconvenient for hand-edited files that don't need that
> precision.
>
Just to clarify, the first date should fail in PyYaml, as you mention, because
the YAML spec requires seconds. Let's change the YAML spec here. I see no
reason to require typing in seconds.
The second date failed for Mike due to another problem in his document, I
believe. Can you send me a repro, Mike? Lone dates are working fine for me.
> I could separate the date and time into separate fields, but then what
> do I do about the time field? It would come as a string (or worse, a
> dictionary) rather than an mx.DateTime object, so I would have to convert
> Time: and not Date:. This is doable but is, er, asymmetric.
>
The workaround for now is to just put the seconds in.
Date: 2002-09-02 13:15:00.00
I will fix PyYaml to not need the hundredths (the spec already allows this), and
hopefully to not need the seconds either (if the spec isn't changed, we can make
it a special PyYaml option).
> The noon-or-midnight issue comes up with timeless dates. In other
> libraries, null time is all zero (midnight), so you have to treat
> midnight as a special case. This causes a problem if you actually do
> have events that start at midnight, you'd have to encode them as 00:01
> or something. So I may end up using a separate time field even though
> I'd rather not. But the point here is, that while having the time
> default to noon may make some operations easier (subtracting six hours
> would still leave you in the same day, for instance), it makes other
> operations harder: it means you have to treat a non-zero value as a
> special case, which is pretty weird.
>
In the current implementation, the original "timelessness" of the date is lost
information. If you think it's important to keep this information, you probably
want to have your own wrapper object.
Mike's calendar:
    start_project: !!mikeDate '2002-09-12'
    end_project: !!mikeDate '2002-09-15'
    sunset: !!mikeDate '2002-09-15 05:12'
Using the strings, along with the private type, gives you the ultimate
flexibility to parse the YAML as you wish.
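One possible shape for such a wrapper object, sketched with the standard library only. The class name MikeDate simply mirrors the private type in the example; the field names and the "YYYY-MM-DD" / "YYYY-MM-DD HH:MM" input formats are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date, datetime, time
from typing import Optional

@dataclass(frozen=True)
class MikeDate:
    day: date
    at: Optional[time] = None  # None means "no time was given"

    @classmethod
    def parse(cls, text):
        """Build a MikeDate from 'YYYY-MM-DD' or 'YYYY-MM-DD HH:MM'."""
        day_part, _, time_part = text.strip().partition(" ")
        day = datetime.strptime(day_part, "%Y-%m-%d").date()
        at = datetime.strptime(time_part, "%H:%M").time() if time_part else None
        return cls(day, at)
```

Because the time is stored separately and may be None, a date with no time and a date at 00:00 stay distinguishable.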
This could get really annoying, though. If Mike's doing a calendar app, then
datetimes are the bread and butter of his application, so they deserve an
implicit type, but his implicit type might have requirements that go way beyond
the needs of cross-platform YAML. If that were the case, then I think we would
want PyYaml to give some hook to Mike to capture his own implicit types.
This would be very easy to implement an API for, but I wonder if it goes against
the cross-platform spirit of YAML.
> Is the timestamp type supposed to be a general type for dates and/or
> times?
Yes.
> Or is it only for "timestamps" that go down to the second, and
> my data model is outside its scope?
Hopefully this was just an error in the spec.
> If so, how do I tell YAML to parse
> these fields as strings even though they look like dates?
>
The best way to tell YAML that something's a string is to surround it in single
quotes.

On Tue, Sep 03, 2002 at 11:20:24AM -0700, Mike Orr wrote:
| Hi y'all. YAML newbie here. I'm the one Steve was talking about with
| his noon-or-midnight posting. I'm trying to produce an event calendar
| using PyYAML. Each event is a dictionary document. I thought I could
| use YAML's timestamp type to automatically convert my dates into
| mx.DateTime objects, but it looks like the timestamp type is a bit more
| rigid than what I need. So, I'm looking for suggestions.
Hello Mike. Yes, I have a similar requirement -- there are two
application model choices:
(a) time is mandatory
    In this case, you can use yaml's timestamp directly. After some use
    of this myself, I'd like a few changes:
    1) make a missing time mean "midnight" once again, noon was
       a bit too clever
    2) make seconds and timezone optional
    3) add a rule that you don't write out midnight, 0 seconds, or UTC
       timezone. In this way timestamp equality is string equality.
(b) time is optional
    In this case, I see three options:
    1) We modify YAML timestamp to have a flag which
       indicates if the time is provided (or if it is null).
       I'm not certain that this is such a great idea, but
       I'm willing to entertain it.
    2) We modify YAML timestamp to have a mandatory time
       part so that dates all by themselves are no longer reserved.
       This is ok, but it kinda defeats the purpose of a common
       date/time data type.
    3) We keep YAML as is, but model this situation using two
       fields: either a date and a time, where the date is a
       YAML timestamp and the time is a custom application type,
       or a timestamp and an application flag which denotes
       if the time portion of the timestamp is to be used.
Interestingly enough, I have this exact requirement for
Xcolla, and chose option #3. I also chose #3 since sometimes
a meeting has a time but not a day. However, I consider times
without a day as sort of dangerous.... in my application I don't
do this; instead I have recurrence rules which use a time-delta.
events:
  - name: past-event
    date: 2002-01-02
    time: 3:20
  - name: future-event
    date: 2002-01-02
    time: ~
Thus, in the example above, you can use the yaml timestamp
for your dates; but you'd need to roll your own time-without-date.
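A rough sketch of what rolling your own time-without-date could look like. It assumes the loader delivers the time field as a string and "~" as None; parse_time and the events list are hypothetical names mirroring the example above. Some YAML loaders may interpret an unquoted 3:20 as a base-60 integer, so quoting the value may be necessary in practice.

```python
from datetime import time

def parse_time(value):
    """Parse "H:MM" or "HH:MM" into a datetime.time; None stays None."""
    if value is None:
        return None
    hours, minutes = value.split(":")
    return time(int(hours), int(minutes))

# The two events from the example, as a loader might hand them over.
events = [
    {"name": "past-event", "date": "2002-01-02", "time": "3:20"},
    {"name": "future-event", "date": "2002-01-02", "time": None},
]
times = {e["name"]: parse_time(e["time"]) for e in events}
```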
| Each event has:
| (1) a start date
| (2) a start time, null if unknown
| (3) an end date, ... but most events are 1-day so I'm skipping this for now
Events are actually much more complicated if you really
want to model them. They are a collection of time segments. ;)
| My first pass was to have a Date field combining (1) and (2):
| Date: 2002-09-02 13:15
| Date: 2002-09-02
|
| Both forms failed because of bugs in PyYAML, but that's beside the
| point. The point is that the timestamp type seems to require seconds.
| This makes it inconvenient for hand-edited files that don't need that
| precision.
Agreed.
| I could separate the date and time into separate fields, but then what
| do I do about the time field? It would come as a string (or worse, a
| dictionary) rather than an mx.DateTime object, so I would have to convert
| Time: and not Date:. This is doable but is, er, asymmetric.
And only necessary if time is optional.
| The noon-or-midnight issue comes up with timeless dates. In other
| libraries, null time is all zero (midnight), so you have to treat
| midnight as a special case. This causes a problem if you actually do
| have events that start at midnight, you'd have to encode them as 00:01
| or something. So I may end up using a separate time field even though
| I'd rather not. But the point here is, that while having the time
| default to noon may make some operations easier (subtracting six hours
| would still leave you in the same day, for instance), it makes other
| operations harder: it means you have to treat a non-zero value as a
| special case, which is pretty weird.
|
| Is the timestamp type supposed to be a general type for dates and/or
| times? Or is it only for "timestamps" that go down to the second, and
| my data model is outside its scope? If so, how do I tell YAML to parse
| these fields as strings even though they look like dates?
I'd like timestamp to model general timestamps, this includes dates
as a subset (ignoring the time part) but not stand-alone times.
I hope the above is helpful. I'm not really happy about dropping
timestamp or completely re-thinking the whole type mechanism so
late in the game. I use timestamps everywhere in my stuff. About
10% of my data is user-level timestamps (dates, sometimes with
times) and about 25% of it is transaction level timestamps for
created/updated/deleted, and for message and logging dates/times.
Overall I have far more timestamps than I do floating, integer,
or boolean types combined.
Best,
Clark

On Thu, Sep 05, 2002 at 04:09:47AM +0000, Clark C . Evans wrote:
| I'd like a few changes:
|
| 1) make the time missing mean "midnight" once again, noon was
| a bit too clever
|
| 2) make seconds and timezone optional
Well, at this time I think the above two should be adequate
to fix-up timestamp. Both are simple changes.
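To illustrate how small these two changes are, here is a sketch of a parser in which only the date is required, a missing time means midnight, and seconds and timezone are optional. The regular expression is an assumption made for this example, not the grammar from the YAML spec; fractional seconds are not handled, and a value without a timezone is returned as a naive datetime.

```python
import re
from datetime import datetime, timedelta, timezone

_TIMESTAMP = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})"            # date, always required
    r"(?:[ T](\d{1,2}):(\d{2})"            # optional hours:minutes
    r"(?::(\d{2}))?"                       # optional seconds
    r"\s*(Z|[-+]\d{1,2}(?::\d{2})?)?)?$"   # optional timezone
)

def parse_timestamp(text):
    """Parse a timestamp; missing time parts default to zero (midnight)."""
    m = _TIMESTAMP.match(text)
    if m is None:
        raise ValueError(f"not a timestamp: {text!r}")
    year, month, day, hour, minute, second, zone = m.groups()
    tz = None
    if zone == "Z":
        tz = timezone.utc
    elif zone:
        sign = -1 if zone[0] == "-" else 1
        h, _, mnt = zone[1:].partition(":")
        tz = timezone(sign * timedelta(hours=int(h), minutes=int(mnt or 0)))
    return datetime(int(year), int(month), int(day),
                    int(hour or 0), int(minute or 0), int(second or 0),
                    tzinfo=tz)
```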
Time, TimePeriod, DateSpan, etc. are out-of-scope mostly because
there are far fewer use cases; but also because defining them
across legal and application boundaries is a mammoth chore.
Clark