Etan Wexler <ewexler@stickdog.com>, 2011-06-26 22:38 -0400:
> The W3C Markup Validation Service issues poor reports on such violations.
Agreed. Specifically, the HTML5 facet of the validation service.
(I don't think the non-HTML5 facets of the service do any of this kind of
checking at all. At least, I don't think the core DTD-based backend of the
service is even capable of performing this kind of check.)
Anyway, that said, this is a known issue for which we have had an open bug
for some time now:
http://bugzilla.validator.nu/show_bug.cgi?id=339
But improving those particular error messages is a challenge. I do plan to
do it (if Henri Sivonen doesn't get around to it first), but it's a
significant amount of work to code it up, so it's going to be a while yet
before it gets done.
In the meantime, if you care to know the details about why it's difficult,
here they are:
The HTML5 backend uses Jing as its core component, and relies on a Relax NG
schema that can be found here:
https://bitbucket.org/validator/syntax/src/tip/relaxng/
Jing is a Relax NG-based validation tool, and practically speaking, Jing on
its own is not capable of emitting a useful error message for this case. At
least not with the current schema. It's imaginable that the schema could
be (re)constructed in a way that enabled Jing to emit a useful error
message for this, but suffice it to say that trying to do that would not be
the right solution for this problem -- because this is one of many cases
of error reporting for which a grammar/schema-based general-purpose
validator is, on its own, not a good fit.
That said, it is something for which an assertions-based validator like
Schematron is a slightly better fit. And we have such an assertions-based
validator[1] built into the HTML5 facet already, and it's responsible for
emitting error messages for quite a lot of other cases.
So we could add it there, but doing that would also complicate that code
quite a lot. And the result would be that we'd then have *two* error
messages for each instance of this error case: The first generated by Jing
and the second generated by the assertions validator.
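To make that concrete, a hand-written assertion for this case might look
roughly like the following. This is only a sketch, not code from the
validator: the class name, the method, and the small set of attributes are
all made up for illustration, and the real checks would need to cover every
input type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an assertion-style check; not the validator's actual code.
class InputAssertionSketch {

    // Illustrative subset of attributes that do not apply to type=number.
    private static final Set<String> NOT_FOR_NUMBER =
            Set.of("size", "maxlength", "pattern");

    // Returns one message per attribute that is not allowed on a type=number input.
    static List<String> check(String element, Map<String, String> attributes) {
        List<String> errors = new ArrayList<>();
        if (!"input".equals(element)) {
            return errors; // this assertion only concerns input elements
        }
        String type = attributes.get("type");
        if (type == null || !"number".equalsIgnoreCase(type.trim())) {
            return errors; // other types would need their own assertions
        }
        for (String name : attributes.keySet()) {
            if (NOT_FOR_NUMBER.contains(name)) {
                errors.add("The \"" + name + "\" attribute is not allowed on an"
                        + " input element whose type is \"number\".");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // Prints the messages produced for <input type="number" size="4">.
        check("input", Map.of("type", "number", "size", "4"))
                .forEach(System.out::println);
    }
}
```

A check like this would fire in addition to the error Jing already reports
from the schema, which is where the duplicate messages come from.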
We could "fix" that by changing the Relax NG schema to allow the attributes
even in places where the spec says they are invalid. But one obvious
disadvantage of that is, if somebody uses the schema on its own, without
the additional assertions validation, then they would not get any error
message about this at all.
But there is one other place in the HTML5 backend where we can deal with the
content of the error message that gets emitted for this. That's in the Java
code that actually emits the message. Among the things that code currently
does is, it takes fragments from the HTML5 spec and includes them in error
messages where they are useful. In this case -- where the error is for an
invalid attribute -- it takes the fragment from the spec that (normally)
is a list of all the attributes that are allowed on a particular element,
and appends that to the error message from Jing.
For pretty much every other element, that works great. But the input
element is a special case due to it having certain attributes that are
allowed only for certain (sub)types. The complexity of describing which
attributes on input are allowed for which subtypes is way more than can
just be included in the simple format normally used in the spec to list out
the attributes. And that part of the current message-emitter code is
generic in the sense that it behaves the same way regardless of what
element it's emitting a message for.
But after talking with Henri about this, it seems that, under the
circumstances, the best way to address this is to add some special-casing
for the input element in that message-emitter code. So that's what I plan
to do when I can make the time.
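As a rough illustration of what that special-casing could look like (this
is not the actual message-emitter code; the class name, the method, and the
attribute table are all hypothetical, and the attribute list shown is an
illustrative subset rather than the full list from the spec):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of per-type "spec advice" for input; not the validator's code.
class InputAdviceSketch {

    // Illustrative, incomplete table: input type -> attributes allowed for that type.
    private static final Map<String, List<String>> ALLOWED_BY_TYPE = new HashMap<>();
    static {
        ALLOWED_BY_TYPE.put("number",
                List.of("max", "min", "step", "required", "readonly"));
    }

    // Returns type-specific advice for input when the type is known,
    // and otherwise falls back to the generic advice unchanged.
    static String elaborate(String element, String type, String genericAdvice) {
        if ("input".equals(element) && type != null) {
            String key = type.trim().toLowerCase(Locale.ROOT);
            List<String> allowed = ALLOWED_BY_TYPE.get(key);
            if (allowed != null) {
                return "Attributes that are allowed for type=" + key
                        + " input elements: " + String.join(", ", allowed);
            }
        }
        return genericAdvice;
    }

    public static void main(String[] args) {
        String generic = "[generic list of attributes for the element]";
        System.out.println(elaborate("input", "number", generic));
        System.out.println(elaborate("img", null, generic));
    }
}
```

The point of the fallback is that every other element keeps the generic
behavior, and only input gets the narrower, type-specific list.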
> The following can serve as the basis for an improvement to the reports of
> the W3C Markup Validation Service.
> “The ‘size’ attribute that appears on this ‘input’ element is invalid
> because the element is in the Number state.
That would be the ideal error message for this case, yeah. But that's not
likely to be the error we end up emitting. What we'll likely end up
with instead is this:
"Error: Attribute placeholder not allowed on element input at this point.
Attributes that are allowed for type=number input elements:
[list of only the attributes that are valid for the type=number case]"
That is, the error message itself would remain exactly the same (because
that's coming from Jing and is a general error for this case). But the part
that follows the error message (which in the code is called "elaboration"
and "spec advice") would be changed so that instead of being the long list
of all attributes that can be valid somewhere for the input element,
it would be a shorter list of just those attributes that are
valid for the particular subtype we're checking.
> The element is in the Number state because the element has a ‘type’
> attribute that has the value ‘number’.”
While that's more accurate in terms of using the actual language of the
spec, I'm inclined to just have it say "type=number input elements" instead.
--Mike
[1] Note: The assertions-based validator in the HTML5 facet is not actually
Schematron-based or even XPath-based; it's all custom Java code. But in
practice it's a Schematron workalike, and we do also maintain a set of
Schematron assertions that provide the same checks:
https://bitbucket.org/validator/syntax/src/tip/relaxng/assertions.sch
--
Michael[tm] Smith
http://people.w3.org/mike