Implementations that allowed the previously lax restrictions may need to keep supporting them for some use cases. For example, if an Avro Data File or old schema uses a name before defining it and currently works, there must be a way for future versions to keep reading it.

If we decide to accept this change, we should first provide an integration test for all languages that checks conformance and, for those languages that previously accepted use-before-define schemas, confirms that there is still a mechanism that can parse such schemas.

Scott Carey
added a comment - 08/Feb/12 07:03
There is a minor typo: "wdepth-first"

> if there is an Avro Data File or old schema that used a name before defining it that currently works

I don't think any implementation currently supports use-before-define, does it?

A "left-to-right traversal" of JSON only makes sense for array elements. The only schemas that include multiple types, and where traversal order therefore matters, are unions and records; both use JSON arrays, so left-to-right works. The types array in a protocol definition could come textually after the messages, but the types must be processed before the messages and in order. Should we clarify that too?

Doug Cutting
added a comment - 08/Feb/12 18:06

> I don't think any implementation currently supports use-before-define, does it?

It would sure be nice if there were a few shared folders that every language tested schemas and protocols against. For example, if there were a 'valid schemas' folder and an 'invalid schemas' folder that all languages were expected to pass or fail against, then someone could test every language by adding to these folders without needing much knowledge of any particular language. Then we could just add a use-before-define schema to the invalid schemas folder and find out whether any languages support it.
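A minimal sketch of what such a shared-folder check might look like for one language, here Python. The folder names `valid` and `invalid`, the `.avsc` extension, and the `run_conformance` helper are assumptions made for illustration, not an existing Avro test layout; `parse` stands for whatever schema-parsing function the implementation under test exposes.

```python
from pathlib import Path


def run_conformance(root, parse):
    """Run `parse` over every schema file in the shared folders.

    Files under <root>/valid must parse; files under <root>/invalid must be
    rejected (any exception counts as a rejection). Returns a list of
    (file name, problem) pairs; an empty list means the parser conforms.
    The folder layout is hypothetical.
    """
    failures = []
    for expected_ok, folder in ((True, "valid"), (False, "invalid")):
        for path in sorted(Path(root, folder).glob("*.avsc")):
            try:
                parse(path.read_text(encoding="utf-8"))
                accepted = True
            except Exception:
                accepted = False
            if accepted != expected_ok:
                problem = "rejected" if expected_ok else "accepted"
                failures.append((path.name, problem))
    return failures
```

With this shape, adding a use-before-define schema to the `invalid` folder would show up as an `accepted` entry for any implementation that still tolerates it.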

Scott Carey
added a comment - 08/Feb/12 18:26

> The types array in a protocol definition could come textually after the messages, but the types must be processed before the messages and in-order. Should we clarify that too?

Good catch. I looked at schemas and verified that types appear in arrays, but I did not check protocols. What if I changed my earlier text to the following: "A schema or protocol may not contain multiple definitions of a fullname. Further, a name must be defined before it is used ("before" in the depth-first, left-to-right traversal of the JSON parse tree, where the types attribute of a protocol is always deemed to come "before" the messages attribute)." A bit windy, but precise.
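To make the proposed wording concrete, here is a rough Python sketch of a checker that walks a schema depth-first and left to right, and that always processes a protocol's types before its messages. It is an illustration only, not the code of any Avro implementation: `check_schema`, `check_protocol`, and `NameOrderError` are hypothetical names, and the sketch assumes that a record's name counts as defined before its fields are visited, so that recursive records remain legal.

```python
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
NAMED = {"record", "error", "enum", "fixed"}


class NameOrderError(ValueError):
    """Raised on a duplicate definition or a use before definition."""


def _fullname(name, namespace):
    # A dotted name is already a fullname; otherwise qualify it if possible.
    if "." in name or not namespace:
        return name
    return f"{namespace}.{name}"


def check_schema(schema, defined, namespace=None):
    """Walk one schema depth-first, left to right, adding names to `defined`."""
    if isinstance(schema, str):  # primitive or reference to a named type
        if schema in PRIMITIVES:
            return
        full = _fullname(schema, namespace)
        if full not in defined:
            raise NameOrderError(f"{full} used before it is defined")
    elif isinstance(schema, list):  # union: branches in array order
        for branch in schema:
            check_schema(branch, defined, namespace)
    elif isinstance(schema, dict):
        kind = schema["type"]
        if kind in NAMED:
            name = schema["name"]
            if "." in name:
                ns = name.rsplit(".", 1)[0]
            else:
                ns = schema.get("namespace", namespace)
            full = _fullname(name, ns)
            if full in defined:
                raise NameOrderError(f"{full} defined more than once")
            defined.add(full)  # assumption: defined before its own fields
            for field in schema.get("fields", []):  # fields in array order
                check_schema(field["type"], defined, ns)
        elif kind == "array":
            check_schema(schema["items"], defined, namespace)
        elif kind == "map":
            check_schema(schema["values"], defined, namespace)
        else:  # e.g. {"type": "string"} or a nested schema
            check_schema(kind, defined, namespace)
    else:
        raise NameOrderError(f"not a schema: {schema!r}")


def check_protocol(protocol):
    """Types are deemed to come before messages, whatever the textual order."""
    defined = set()
    ns = protocol.get("namespace")
    for schema in protocol.get("types", []):
        check_schema(schema, defined, ns)
    for message in protocol.get("messages", {}).values():
        for param in message.get("request", []):
            check_schema(param["type"], defined, ns)
        check_schema(message.get("response", "null"), defined, ns)
        for err in message.get("errors", []):
            check_schema(err, defined, ns)
    return defined


# "messages" appears textually before "types", yet the protocol is accepted.
good = {
    "protocol": "Mail",
    "namespace": "example",
    "messages": {
        "send": {"request": [{"name": "m", "type": "Message"}], "response": "null"}
    },
    "types": [
        {"type": "record", "name": "Message",
         "fields": [{"name": "body", "type": "string"}]}
    ],
}

# Field "a" refers to Inner before field "b" defines it, so this is rejected.
bad = {
    "type": "record",
    "name": "Outer",
    "fields": [
        {"name": "a", "type": "Inner"},
        {"name": "b", "type": {"type": "fixed", "name": "Inner", "size": 4}},
    ],
}
```

Calling `check_protocol(good)` succeeds, while `check_schema(bad, set())` raises `NameOrderError`; swapping the two fields of `Outer` would make it pass.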

> It would sure be nice if there were a few shared folders that every language tested schemas and protocols against.

The test suite I'm writing for AVRO-1006 will include a file of test cases in src/test/resources that could be the basis of what you're talking about here. I'll get that posted soon; you can then look at it and see what more would need to be done.

Raymie Stata
added a comment - 08/Feb/12 21:17