On Oct 13, 2011, at 07:24 , Martin Hepp wrote:
> Hi Gregg,
> Thanks! I think this is getting better and better! Are there already implementations for this?
>
> In particular, does anybody know whether there is a parser module for rdflib (Python) that supports this?
Martin,
not yet... I have a microdata->rdf parser (on top of RDFLib) that implemented the mapping originally published by the WHATWG. I then experimented with some variations, but I have not checked it against Gregg's algorithm yet. I plan to do so at some point, but I have some trips coming up that make it difficult to give you a target date...
Ivan
>
> Second question: I am not entirely sure what the RDF with the collections will look like. It should match the same queries as the RDFa markup.
>
> And a suggestion: It would be good to specify a heuristic for turning property values into proper typed RDF literals if
>
> - the vocabulary is known or retrievable from the Web and
> - has a single rdfs:range statement to a known xsd:datatype or
> - if it can be determined from looking at the data.
>
> The rough algorithm could be
>
> - for local properties, check whether the itemtype URI is dereferenceable
> - for properties with full URIs, check whether the itemtype URI is dereferenceable
> - search for rdfs:range statements for the property
> - if the object of these statements is not a URI, break
> - if it is a URI, one could do a regex test against the xsd namespace, or check that it is not the subject of additional triples in the vocabulary representation (in order to catch complex cases with user-defined datatypes; I think there are some OWL 2 use cases that may use this). In principle, such user-defined datatypes would also be fine to work with, as long as they have a URI, but one may want to leave this out for the moment
> - attach the "^^<URI>" suffix to the literal.
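A minimal Python sketch of the range lookup quoted above, only to make the steps concrete. It assumes the vocabulary has already been retrieved and parsed into plain (subject, predicate, object) tuples, so the dereferencing step is left out. The function names `datatype_from_range` and `typed_literal` are made up for this illustration and are not part of RDFLib or any draft, and the single vocabulary triple is example data.

```python
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"
XSD_NS = "http://www.w3.org/2001/XMLSchema#"
GR = "http://purl.org/goodrelations/v1#"


def datatype_from_range(prop_uri, triples):
    """Return the xsd datatype URI if the property has exactly one
    rdfs:range statement pointing into the xsd namespace, else None."""
    ranges = {o for (s, p, o) in triples if s == prop_uri and p == RDFS_RANGE}
    if len(ranges) != 1:
        return None  # no range, or several ranges: do not guess
    (rng,) = ranges
    # Simple prefix test against the xsd namespace; user-defined
    # datatypes are deliberately left out, as suggested in the mail.
    if not isinstance(rng, str) or not rng.startswith(XSD_NS):
        return None
    return rng


def typed_literal(value, datatype_uri):
    """Attach the "^^<URI>" suffix; fall back to a plain literal.
    Escaping of quotes inside the value is not handled in this sketch."""
    if datatype_uri is None:
        return '"%s"' % value
    return '"%s"^^<%s>' % (value, datatype_uri)


# Illustrative vocabulary data, assumed to have been fetched beforehand.
vocab = [(GR + "hasValueFloat", RDFS_RANGE, XSD_NS + "float")]

print(typed_literal("50", datatype_from_range(GR + "hasValueFloat", vocab)))
# prints "50"^^<http://www.w3.org/2001/XMLSchema#float>
```

A property with no range statement, or with more than one, simply stays a plain literal here, which seems the safest default.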
>
> Another approach would be to try to determine the datatype from the data and attach the respective suffix, e.g. int, integer, decimal, float, double, or boolean. If you cover those with a simple heuristic, you would also support most usages. Some integers meant to be xsd:float or double would get the wrong datatype by this, but an RDF environment would still know how to handle the data for comparison etc.
>
> Both would make the data good to work with in an RDF/SPARQL environment.
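The second approach, guessing the datatype from the data itself, could look roughly like the sketch below. The patterns are an assumption of this example and follow the Turtle numeric shorthand (bare digits are xsd:integer, a decimal point gives xsd:decimal, an exponent gives xsd:double). The name `guess_datatype` is hypothetical.

```python
import re

XSD_NS = "http://www.w3.org/2001/XMLSchema#"

# Checked in order; the first matching pattern wins.
_PATTERNS = [
    ("boolean", re.compile(r"^(true|false)$")),
    ("integer", re.compile(r"^[+-]?\d+$")),
    ("decimal", re.compile(r"^[+-]?(\d+\.\d*|\.\d+)$")),
    ("double", re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)[eE][+-]?\d+$")),
]


def guess_datatype(text):
    """Return a guessed xsd datatype URI for the lexical form, or None
    if the value looks like ordinary text."""
    value = text.strip()
    for local_name, pattern in _PATTERNS:
        if pattern.match(value):
            return XSD_NS + local_name
    return None


for sample in ["50", "50.0", "5e1", "true", "ACME Electric Anvil"]:
    print(repr(sample), guess_datatype(sample))
```

As noted in the quoted text, "50" comes out as xsd:integer even where the vocabulary means xsd:float. "1" and "0" are also treated as integers rather than booleans.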
>
> So basically the following two examples should result in the same triples:
>
> a) GoodRelations in Microdata
> <div itemscope itemtype="http://purl.org/goodrelations/v1#ProductOrServiceModel" itemid="#model">
> <span itemprop="name">ACME Electric Anvil</span>
> ...
> Weight: <div itemprop="http://purl.org/goodrelations/v1#weight" itemscope
> itemtype="http://purl.org/goodrelations/v1#QuantitativeValue">
> <span itemprop="hasValueFloat">50</span> kg
> <meta itemprop="hasUnitOfMeasurement" content="KGM" >
> </div>
> </div>
>
> b) GoodRelations in RDFa
> <div typeof="gr:ProductOrServiceModel" about="#model">
> <span property="gr:name">ACME Electric Anvil</span>
> ...
> Weight: <div rel="http://purl.org/goodrelations/v1#weight">
> <div typeof="gr:QuantitativeValue">
> <span property="gr:hasValueFloat" datatype="xsd:float">50</span> kg
> <div property="gr:hasUnitOfMeasurement" content="KGM"></div>
> </div>
> </div>
> </div>
>
> Martin
>
>
> On Oct 13, 2011, at 7:34 AM, Gregg Kellogg wrote:
>
>> (Apologies if this shows up twice, the first from a separate account seems to have gone to a filter).
>>
>> Note that the just-released Microdata to RDF draft defines property URI generation using the same domain as the @itemtype, not relative to the type itself. Read about it at [1]; comments welcome, feedback to public-html-data-tf@w3.org.
>>
>> Gregg
>>
>> [1] http://lists.w3.org/Archives/Public/public-html-data-tf/2011Oct/0066.html
>>
>> On Oct 12, 2011, at 3:57 PM, Martin Hepp wrote:
>>
>>> FYI: GoodRelations will clearly define in its next service update that the URIs of properties should be formed by attaching the local name of the property to the base URI of the vocabulary, not to the URI of the itemtype that gives the context.
>>>
>>> I also think that this is the most useful pattern for most cases, but if that cannot be written in the standard, Microdata parsers must simply offer this as a heuristic.
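To make the two URI patterns under discussion concrete, here is a small Python sketch. It is not the algorithm from the Microdata to RDF draft. It only contrasts appending the local name to the vocabulary base with appending it to the itemtype URI. The helpers `vocabulary_base` and `property_uri` are hypothetical, and deriving the base by cutting at "#" or at the last "/" is an assumption of this example.

```python
from urllib.parse import urlparse


def vocabulary_base(itemtype):
    """Strip the type's local name: keep everything up to and including
    '#', or up to and including the last '/' when there is no fragment."""
    if "#" in itemtype:
        return itemtype.split("#", 1)[0] + "#"
    return itemtype.rsplit("/", 1)[0] + "/"


def property_uri(itemprop, itemtype, type_relative=False):
    """Expand an itemprop value to a full property URI."""
    if urlparse(itemprop).scheme:
        return itemprop  # already a full URI, use it as-is
    if type_relative:
        # Pattern criticised in the thread: one URI per class/property pair.
        return itemtype + "/" + itemprop
    return vocabulary_base(itemtype) + itemprop


GR_QV = "http://purl.org/goodrelations/v1#QuantitativeValue"

print(property_uri("hasValueFloat", GR_QV))
# prints http://purl.org/goodrelations/v1#hasValueFloat
print(property_uri("name", "http://schema.org/Person"))
# prints http://schema.org/name
print(property_uri("name", "http://schema.org/Person", type_relative=True))
# prints http://schema.org/Person/name
```

With the vocabulary-base pattern, duration on a Movie and on an Event would expand to the same property URI, which is the behaviour argued for in the quoted messages.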
>>>
>>>
>>> On Oct 12, 2011, at 10:44 AM, Bob Ferris wrote:
>>>
>>>> Hi,
>>>>
>>>> On 10/12/2011 9:45 AM, Bernard Vatant wrote:
>>>>> Thanks for the pointer to any23.org <http://any23.org>
>>>>>
>>>>> An issue I clearly see with URIs such as http://schema.org/Person/name
>>>>> is that some properties are used by more than one class. So we'll have
>>>>> for example http://schema.org/Movie/duration and
>>>>> http://schema.org/Event/duration potentially misleading to the idea that
>>>>> they are different properties with specific domains, although the
>>>>> definition found for "duration" is exactly the same at both
>>>>> http://schema.org/Movie and http://schema.org/Event : "The duration of
>>>>> the item (movie, audio recording, event, etc.) in ISO 8601 date format
>>>>> <http://en.wikipedia.org/wiki/ISO_8601>." So it's another argument for
>>>>> having this definition clearly published at a single place, under
>>>>> http://schema.org/duration - with expected range
>>>>> http://www.schema.org/Duration. (which BTW would lead to the side issue
>>>>> of having a property and its range just differing by one character case,
>>>>> not a good practice in my opinion).
>>>>
>>>> +1 for excluding the class domains from the URIs of properties that span multiple classes, i.e., a name is a name is a name. A human user and also a machine will get the relation (specific meaning) of name via its context, i.e., the types of that resource, e.g., schemaorg:Person => a person's name etc.
>>>>
>>>> Cheers,
>>>>
>>>>
>>>> Bo
>>>>
>>>>
>>>> PS: otherwise we would probably end up with something like the Freebase vocabulary ;)
>>>>
>>>>
>>>
>>>
>>
>
>
----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf