Hi,
in the spirit of making small incremental steps, I'd like to propose
that we first fix the BNF so that it actually becomes parseable with a
parser written for the 822/2616 BNF format.
The current issues I'm aware of are:
1) missing whitespace, such as in
    Accept-Charset = "Accept-Charset" ":"
                     1#( ( charset | "*" )[ ";" "q" "=" qvalue ] )
which should be
    Accept-Charset = "Accept-Charset" ":"
                     1#( ( charset | "*" ) [ ";" "q" "=" qvalue ] )
2) multi-line prose values, such as
    field-content = <the OCTETs making up the field-value
                    and consisting of either *TEXT or combinations
                    of token, separators, and quoted-string>
3) prose values containing prose delimiters, such as
    qdtext = <any TEXT except <">>
4) illegal characters in rule names, such as in
    http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
5) duplicate rule names (which are case-insensitive), such as with
    trailer = *(entity-header CRLF)
    Trailer = "Trailer" ":" 1#field-name
6) attempts to do something in BNF that just does not work :-):
    chunk          = chunk-size [ chunk-extension ] CRLF
                     chunk-data CRLF
    chunk-size     = 1*HEX
    last-chunk     = 1*("0") [ chunk-extension ] CRLF
    chunk-extension= *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
    chunk-ext-name = token
    chunk-ext-val  = token | quoted-string
    chunk-data     = chunk-size(OCTET)
(note the chunk-data rule in the last line, which uses chunk-size as if
it were a repeat count).
7) editorial nit: Bill's ABNF parser (BAP) prefers all rule names to be
indented by the same amount; I think this is just a matter of editorial
quality, and we should simply make the indentation consistent.
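
To illustrate, issues 4 and 5 above can be detected mechanically. What
follows is only a minimal Python sketch, not BAP or any existing tool:
the name lint_bnf and the regular expressions are mine, and I assume
that a legal rule name is a letter followed by letters, digits, or "-"
(the ABNF convention) and that every rule starts on a line of the form
"name = ...".

```python
import re

# Assumption: a rule starts on a line shaped like "name = ...", and
# continuation lines never look like that.
RULE_START = re.compile(r"^\s*([^\s=]+)\s*=")
# Assumption: legal names follow the ABNF convention (letter, then
# letters, digits, or "-").
LEGAL_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9-]*$")


def lint_bnf(text):
    """Return (line_number, message) findings for issues 4 and 5."""
    findings = []
    seen = {}  # lower-cased rule name -> (line number, original spelling)
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = RULE_START.match(line)
        if not match:
            continue
        name = match.group(1)
        # Issue 4: characters such as "_" are not allowed in rule names.
        if not LEGAL_NAME.match(name):
            findings.append((lineno, f"illegal character in rule name {name!r}"))
        # Issue 5: rule names are case-insensitive, so compare lower-cased.
        key = name.lower()
        if key in seen:
            first_line, first_name = seen[key]
            findings.append(
                (lineno, f"{name!r} duplicates {first_name!r} from line {first_line}")
            )
        else:
            seen[key] = (lineno, name)
    return findings


sample = """\
http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
trailer = *(entity-header CRLF)
Trailer = "Trailer" ":" 1#field-name
"""

for lineno, message in lint_bnf(sample):
    print(lineno, message)
```

On the three sample rules this flags http_URL (line 1) and the second
Trailer (line 3). It does not look inside rule bodies, so a reference
such as abs_path on the right-hand side goes unnoticed.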
Here are my proposals to fix the individual issues:
1) just fix it.
2) try to get rid of prose values; where that is not possible, replace
the value with a shorter one and move the remaining text into a BNF
comment.
3) Use DQUOTE instead of <">
4) Replace "_" with "-". In some cases we currently import rules from
other, older specs; in those cases we can write:
    abs-path = <abs_path defined in ...>
5) Keep the canonical rule names for the header productions and rename
the other ones.
6) just fix it.
7) fix the indentation.
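
Proposals 3 and 4 are mechanical enough to script. The following Python
sketch is only an illustration under my own assumptions: fix_line and
the TOKEN pattern are hypothetical names, each rule is assumed to fit on
one line, and DQUOTE is assumed to be defined or imported elsewhere.
Quoted literals and prose values are left untouched, so an imported name
such as <abs_path defined in ...> keeps its original spelling.

```python
import re

# A quoted literal, a prose value, or a bare identifier (in that order).
TOKEN = re.compile(r'"[^"]*"|<[^<>]*>|[A-Za-z][A-Za-z0-9_-]*')


def fix_line(line):
    """Apply proposals 3 and 4 to a single-line rule."""
    # Proposal 3: replace the prose delimiter <"> with the rule name DQUOTE.
    line = line.replace('<">', "DQUOTE")

    # Proposal 4: "_" becomes "-" in rule names, but not inside quoted
    # literals or prose values.
    def rename(match):
        token = match.group(0)
        if token.startswith(('"', "<")):
            return token
        return token.replace("_", "-")

    return TOKEN.sub(rename, line)


for rule in (
    'qdtext = <any TEXT except <">>',
    'http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]',
    "abs-path = <abs_path defined in ...>",
):
    print(fix_line(rule))
```

The result would still need a manual review, since the sketch neither
renames one of two clashing rules (proposal 5) nor rewrites rules such
as chunk-data (proposal 6).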
(Would it make sense to open a separate issue for this collection of
problems?)
Best regards, Julian