On Wednesday 22 September 2004 03:01, Clark C. Evans wrote:
> Ok, production requirements:
> - allowing at most one BOM (current spec allows two)
How?
> - explicit stream does _not_ have a document "..." terminator,
> if you want one, use an explicit document
I don't understand... What's an explicit "stream"? Did you mean
"implicit document"? If so, I'd say that the existence of '...' is
independent of the existence of '---'.
> - implicit stream does allow for leading comments and
> the optional BOM
Sure.
> l-yaml-stream ::= (l-document-prefix        # BOM + comments
>                    l-implicit-document)?
>                   (l-document-prefix        # BOM + comments
>                    l-stream-declaration     # %TAGs + comments
>                    l-explicit-document      # --- (stuff)
>                    l-document-suffix?       # ...
>                   )*
Nope, it's possible to have a single prologue with multiple documents
following it.
There are two alternatives:
1. The directives are seen as an (optional) part of each document; if
missing, it reuses the previous document's directives (or, if first,
the default directives). So, there's one BOM that appears at the start
of the prologue and there is never a second BOM at the start of the
explicit document following it.
In this case, the simplest fix is to add l-directives just before
l-document-start in the l-explicit-document production. That's it.
2. The directives are seen as an independent syntactical entity. Since a
stream may start with directives, they must allow for an initial BOM.
In addition, each document must allow for a BOM as well (since a stream
may also start with a document without a directives section).
Therefore, it is possible that a stream would contain both a BOM before
the directives prologue and a BOM before the first explicit document
following them.
In this case, we'll need to do something like this:
l-yaml-stream      ::= l-throwaway-prefix
                     | ( l-without-prologue
                         l-with-prologue* )
l-without-prologue ::= l-implicit-document?
                       l-explicit-document*
l-with-prologue    ::= l-stream-prologue
                       l-explicit-document+
l-stream-prologue  ::= l-throwaway-prefix
                       ( l-directive
                         l-comment(any)* )+
Where l-implicit-document and l-explicit-document are as today (well,
minus the directives of course). In this case we rename
l-document-prefix to l-throwaway-prefix because it is used both for a
document and for a prologue.
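To make the option 2 grammar easier to poke at, here is a rough sketch in
Python that collapses each production to a single-letter token and checks a
token string against the grammar with a regular expression. The token
alphabet (B, D, I, E), the function name is_stream, and the simplification
that l-throwaway-prefix is just an optional BOM are all assumptions made for
illustration; none of them come from the spec.

```python
import re

# Hypothetical token alphabet, one letter per lexical unit:
#   B = BOM, D = directive line, I = implicit document, E = explicit document
PREFIX = "B?"                      # l-throwaway-prefix, simplified to an optional BOM
IMPLICIT = PREFIX + "I"            # l-implicit-document (carries its own prefix)
EXPLICIT = PREFIX + "E"            # l-explicit-document (carries its own prefix)
PROLOGUE = PREFIX + "D+"           # l-stream-prologue: prefix, then 1+ directives
WITHOUT_PROLOGUE = f"(?:{IMPLICIT})?(?:{EXPLICIT})*"
WITH_PROLOGUE = f"(?:{PROLOGUE})(?:{EXPLICIT})+"
STREAM = re.compile(f"(?:{PREFIX}|{WITHOUT_PROLOGUE}(?:{WITH_PROLOGUE})*)")


def is_stream(tokens: str) -> bool:
    """Return True if the token string is derivable from l-yaml-stream."""
    return STREAM.fullmatch(tokens) is not None
```

Under these assumptions is_stream("BDBE") is True, which is the two-BOM case
described above (one BOM before the prologue, one before the first explicit
document). is_stream("BDBDE") is False, because a prologue has to be followed
by at least one explicit document before another prologue can start.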
At first glance, option 2 seems cleaner. However, we can't have two
prologues following each other; there's no way to tell where one ends
and the other begins. So a prologue isn't "really" a separate
syntactical entity. If it *must* be followed by an explicit document,
perhaps it is cleanest to view it as part of its header after all (that
is, option 1).
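The inheritance rule in option 1 can be sketched as follows. This is only an
illustration in Python, not a parser: it works on whole lines, assumes that
any line starting with "%" is a directive and ends the current document,
ignores implicit documents and BOMs, and the name split_documents is made up
for this example.

```python
def split_documents(lines, default_directives=()):
    """Group lines into (directives, body) pairs, option 1 style.

    A document with no directives of its own reuses those of the previous
    document, or default_directives if it is the first one.
    """
    documents = []                        # finished (directives, body) pairs
    in_force = list(default_directives)   # what a directive-less document inherits
    pending = []                          # directives read since the last document
    current = None                        # the open document, if any

    for line in lines:
        if line.startswith("%"):          # assumption: "%" always opens a directive
            if current is not None:
                documents.append(current)
                current = None
            pending.append(line)
        elif line.startswith("---"):      # start of an explicit document
            if current is not None:
                documents.append(current)
            if pending:                   # new directives replace the inherited ones
                in_force, pending = pending, []
            rest = line[3:].strip()
            current = (list(in_force), [rest] if rest else [])
        elif line.startswith("..."):      # optional document terminator
            if current is not None:
                documents.append(current)
                current = None
        elif current is not None:
            current[1].append(line)       # ordinary content line

    if current is not None:
        documents.append(current)
    return documents
```

With the placeholder input ["%A", "---", "one", "---", "two", "%B", "---",
"three"], the first two documents both carry ["%A"] and the third carries
["%B"], so the directives behave as part of each document's header rather
than as a free-standing entity.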
Thoughts?
Have fun,
Oren Ben-Kiki
P.S. All the above (and the current spec's productions) are ambiguous
for an empty stream. In all cases it means "no document here, move
along", though. I'll try to see if I can defuse this anyway.
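One way to see the empty-stream ambiguity for the option 2 grammar is to test
each alternative of l-yaml-stream separately. The sketch below is in Python
and rests on assumptions: every production is collapsed to a one-letter token
(B = BOM, D = directive, I = implicit document, E = explicit document), and
l-throwaway-prefix is taken to be an optional BOM that may be empty. The
helper name matching_alternatives is hypothetical.

```python
import re

# Hypothetical regex encodings of the two alternatives of l-yaml-stream.
ALTERNATIVES = {
    "l-throwaway-prefix": "B?",
    "l-without-prologue l-with-prologue*": "(?:B?I)?(?:B?E)*(?:B?D+(?:B?E)+)*",
}


def matching_alternatives(tokens: str) -> list:
    """List the alternatives of l-yaml-stream that derive the token string."""
    return [
        name
        for name, pattern in ALTERNATIVES.items()
        if re.fullmatch(pattern, tokens)
    ]
```

Under these assumptions matching_alternatives("") returns both alternatives,
while a stream holding a single explicit document, "E", matches only the
second. Either derivation of the empty stream yields zero documents, which is
why the ambiguity is harmless in practice.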
Oren.

On Wed, Sep 22, 2004 at 07:19:51AM +0200, Oren Ben-Kiki wrote:
| On Wednesday 22 September 2004 03:01, Clark C. Evans wrote:
|
| > Ok, production requirements:
| > - allowing at most one BOM (current spec allows two)
|
| How?
ah, my bad -- I misread an alternation
| 1. The directives are seen as an (optional) part of each document; if
| missing, it reuses the previous document's directives (or, if first,
| the default directives). So, there's one BOM that appears at the start
| of the prologue and there is never a second BOM at the start of the
| explicit document following it.
*nod*
| In this case, the simplest fix is to add l-directives just before
| l-document-start in the l-explicit-document production. That's it.
Sounds like a plan.
| At first glance, option 2 seems cleaner. However, we can't have two
| prologues following each other; there's no way to tell where one ends
| and the other begins. So a prologue isn't "really" a separate
| syntactical entity. If it *must* be followed by an explicit document,
| perhaps it is cleanest to view it as part of its header after all (that
| is, option 1).
Yes; the new prologue logic is a parsing rewrite rule, not a lexing
production.
| P.S. All the above (and the current spec's productions) are ambiguous
| for an empty stream. In all cases it means "no document here, move
| along", though. I'll try to see if I can defuse this anyway.
I didn't have a problem with that.
Best,
Clark
P.S. The "pull" parser is coming along nicely. The productions
you have worked so hard on are damn flawless so far. I'm trying
to keep the implementation with a very close 1-1 relationship
with the specification.