All comments are registered in an issue list that will be discussed by the Working Group in March. Resolutions of those issues will be shared on Joinup, and will lead to a final version of the specification that will be submitted to the European Commission for endorsement by the EU member states.

Nice work. A few questions and suggestions follow. They are offered only with the intention of arriving at a concrete and pragmatic use of the material.

These questions are put in a separate post so that they can be addressed easily.

Thank you in advance for taking these into consideration.

Concerning the scope of the legal entity class.

As stated in point 4.2 there are quite a lot of organisations not being considered for membership of the class.

Question: if those organisations cannot be part of the legal entity class, where would we register them, knowing they are important players for the European public.

Suggestion: allow organisations of all kinds to become members of the class and rename the class "organisation". The relation "legal identifier" to the class "Formal identifier" may then become optional.

Motivation: the following important organisations could then be identified: transnational bodies (like the European Commission), public entities (like the IRS in the USA or its counterparts in European countries), associations (like lawyers' associations,…), European Economic Interest Groups (they are not registered as such in all countries), …

Question: is the cardinality as shown in the UML diagram to be restricted to 0..1?

Suggestion: no, the cardinality could be 0..*

Motivation: if we only consider the official registration address, there are companies with different nationalities (e.g. Royal Dutch Shell). Moreover, companies can also operate in countries other than their country of origin without being registered there (General Motors in Belgium employed thousands of people without being registered as a Belgian company).

Suggestion: yes, the change class might be a solution if caution is applied

Motivation: every property may be subject to change (even the birthday if a material error occurred), properties may overlap, replace each other, … Information of which we do not know whether it is historical or current is not very relevant.

Eddy, thank you very much for taking the time to comment on the vocabulary. You raise many questions that I will try to answer.

Your Question: if those organisations cannot be part of the legal entity class, where would we register them, knowing they are important players for the European public.

Your suggestion of having a broader organisation vocabulary is well taken. The good news is that such a vocabulary exists, known as the Organization Ontology. Created by Dave Reynolds of Epimorphics for the UK government, it is now being taken through the W3C Recommendations process by the Government Linked Data Working Group. The current editor's draft is unchanged from Dave's original and I expect it to be formally published as a First Public Working Draft by the W3C working group imminently (along with other vocabularies it's producing or advancing). We had the Organization Ontology in mind when creating the business vocabulary. An organisation class can easily link to one or more Legal Entity classes. W3C is a case in point. It's not a legal entity although it is clearly an organisation (the three legal entities are MIT, ERCIM and Keio University).

So - I agree with you completely and I hope the org ontology meets your requirement.

Question: can a self-employed person be considered to have the legal status "self-employed", although it does not exist in all countries?

Clearly an individual, a 'sole trader,' does business and, as you say, there are different ways of handling this in different countries. We would like to develop the vocabulary to cover sole traders as that is clearly important. The time scale for this work didn't allow us to go down that path in this round, hence we restricted ourselves simply to the kind of legal entities one sees in business registers, but again, I know the working group agrees with you fully and we'd like to see an extension into this area.

Question: Should the relation remain 1..1 from Legal Entity to Legal Identifier?

Again, we're touching on areas that we'd all like to see developed. Yes, PwC is an example of a company with multiple legal entities with complex relationships between them in different countries. One quickly gets into mergers and acquisitions as well - who owns what and what is the legal status and so on. A search for PricewaterhouseCoopers on Open Corporates yields 209 results! Yes - you're right - we'd like to model this - which we see as an action for the next phase that can build on the basics we have here.

A direct answer to your question is that legal entity status is conferred on one body by one authority, hence the [1..1] cardinality. The relationship that a legal entity has with other legal entities is not (yet) described.

Question: Could there be an agreement to use ISIC4 at this core level?

I sympathise. Life would be easier if everyone used the same terms and ISIC4 is the obvious candidate. The problem is that many data sets already exist that use other codes. As a Core Vocabulary all we can do is to provide the properties and point people in the right direction. We can't actually force people to use particular codes, especially if doing so would require considerable investment for them. There won't be any legal obligation for public administrations to use the core vocabularies (unlike INSPIRE which carries such an obligation).

Question: Could the string be replaced to a reference (URI eventually) stored in the "Organisation c.q. Legal Entity" class?

No need to replace it. We provide a property of Issuing Authority URI (4.3.2) for that purpose. The text highlights the advantages of doing so.

Question: the relation heldBy between License and Legal Entity class is sufficient?

Again I agree with your suggestion. If we are able to extend the work to define the Licence class and the holdsLicence/heldBy relationships then I would agree with you and propose that they should apply to the most generic class of 'Agent' which covers organisations, people, groups, legal entities and anything else.

Question (re Registered Address): is the cardinality as shown in the UML diagram to be restricted to 0..1?

For legal entities, yes. For organisations clearly the answer is no. The organisation ontology supports the idea of multiple locations. So again I hope the org ontology satisfies your point (with which I fully agree).

Question: if any legal entity is entitled to many registered addresses, should there be a relation citizenship between Legal entity class and Location?

I think we've covered that above? We did consider having a "country of origin" property (see 5.4) but felt that the Formal Identifier class - i.e. the description of the authority that conferred legal entity status - would give this information.

Question: could temporal information be maintained for the properties

Oh yes - that's a biggie. Everyone in the 3 task forces is very aware of the need to time stamp or version data and to have a means of changing it and recording that change. It is a big area and is probably top of the priority list for future work. Of course many such systems exist - it's a question of identifying the best and most practical method. We're working on something very similar at W3C in the Provenance Working Group that I hope will aid this discussion in the near future.

I'm aware that I've answered all your points with different versions of "yes we agree - and here's why we don't feel able to change anything!" The three core vocabularies are all designed to provide baselines for public sector info exchange but they are just the start. We are well aware that there is a lot more to do.

These vocabularies will not be fixed and then abandoned. Discussions about the future of them beyond the current work programme are being actively pursued.

Thanks a lot for your answers. I'm well aware that this forum was only a public call for comments.

Therefore, from my point of view, I'm really very satisfied with the answers and don't expect the points to be tackled immediately. Another reason to be very satisfied is the understanding of our concerns that your answers show.

I was able to get a group of UK Government officials from various departments together to review the core vocabularies. Here is what we came up with.

4.2.3. (Company Type)
We would recommend that this is expanded to include a reference to the “Legal Entity” (country/jurisdiction) that issued the type.

4.2.4 (Status)
We would recommend that this is expanded to include a reference to the “Legal Entity” (country/jurisdiction) that issued the status.
Also, we note that it is named “Status” not “Company Status”, while there is a “Company Type” element.

4.2.5 (Activity)
We would recommend that this is expanded to include a reference to the “Legal Entity” (country/jurisdiction) that issued the activity.
[Typo?] The cardinality of [0…1] would be inappropriate for the UK SIC code system mentioned, and disagrees with the diagram which lists it as [0…*].

4.2.6 (Formal Identifier)
It is not stated who determines which identifier is the most relevant one, i.e. the “Formal Identifier”, as opposed to merely another “Identifier” for a given Legal Entity. For example, for a UK company that has registered charity status, is the most important aspect that it is a registered charity, or that it is also an incorporated company? In most applications this would depend on the context.

4.2.7 (Identifier)
[Typo?] Cardinality here disagrees with diagram – is it [0…1] or [0…*]? Suggest the latter as per diagram would make more sense.

4.3 (FORMAL IDENTIFIER CLASS)
The concept of the issuing authority is given as a String and a URI, but not as a Legal Entity – we recommend that this option is also given.
Also, this gives no concept of the type of identifier issued, and assumes that a given issuing authority only issues one form of credential (and whose identifiers cannot be mistaken for one another); this is not the case in the UK.

5.4 (COUNTRY OF ORIGIN)
We would recommend that this is expanded to include a reference to the “Legal Entity” (country/jurisdiction) that states this.
If this is not followed, we feel that the standard assumes use of RDF, which it claims that it does not. There is no mention of namespace-ing or similar in the prior texts for these attributes.

5.5 (HOLDS LICENCE & HELD BY)
[Typo] Use of “licenses” (line two) is American English in this context, whereas the rest of the document appears to be written in British English.

Thank you for sending the three above-mentioned core vocabularies for review.

We warmly welcome this standardisation initiative. We would appreciate being kept informed about future developments on this issue and, if possible, being more actively involved in the work of this working group.

Eurostat, in its central role in the European Statistical System (ESS) also faces the need for improved standardisation. This standardisation is a crucial part of the ESS strategy (ESS vision) which aims at more integration of production and dissemination processes through the creation and use of more statistical, technical and IT standards.

So that the ESS member parties speak the same language, Eurostat has launched or is taking part in various harmonisation initiatives, among others:

Statistical Data and Metadata Exchange (SDMX): Eurostat is progressively implementing this set of technical and statistical standards and guidelines for improving the exchange and sharing of statistical data and metadata.
Standard Code Lists (SCL): Until recently the reference database of Eurostat contained hundreds of vocabularies, of which the vast majority were ad hoc subsets of established standards like NACE. It was decided to harmonise these vocabularies based on simple rules. At present, 80 such harmonised vocabularies are made publicly available (via RAMON, Eurostat's metadata server: http://ec.europa.eu/eurostat/ramon/, see under section "STANDARD CODE LISTS"). In the near future, we plan to submit to the SDMX parties a proposal for a general harmonisation of the construction rules of these standard code lists, the ultimate objective being to give these rules the status of an SDMX standard.
International statistical classifications: Statistical classifications used in the ESS are fully harmonised and very closely derived from the world standards developed by the United Nations.
Harmonised business registers for statistical purposes: These registers serve as a tool for the preparation and coordination of surveys, as a source of information for the statistical analysis of the business population and its demography, for the use of administrative data, and for the identification and construction of statistical units.
EuroGroups Register: Network of registers, consisting of a central register kept at Eurostat and registers in each EU Member State and in EFTA countries. The central register contains information about multinational enterprises groups (MNEs), which have statistically relevant financial and non-financial transnational operations in at least one of the European countries.
And many more...

Here are some specific comments concerning the Core Person, Core Business and Core Location Vocabularies. These comments are provided mainly by Eurostat units dealing with metadata and business registers.

Point 4.1.6 Gender

May we suggest the following classification:

F = female
M = male
OTH = other
UNK = unknown
NAP = not applicable

Such a coding system is much more self-explanatory than mere numerical values. Furthermore, by adopting such a coding system, you no longer have to publish a long paragraph stating that the values chosen do not convey any meaning of importance, ranking or else, since the sorting is here purely alphabetical.

These codes are the ones used in our SCL "Sex". Please also note that, like in the computer programming world, we use "reserved words" which have the same meaning through all our SCL (for instance, NAP for "not applicable", UNK for "unknown", NRP for "no response", etc.).
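As a rough illustration of the coding system suggested above, the code list and the reserved codes could be held as simple lookups. This is only a sketch: the names `Sex`, `RESERVED_CODES`, `label_for` and `is_reserved` are invented for the example and are not taken from the SCL or from the Core Person Vocabulary.

```python
from enum import Enum


class Sex(Enum):
    """The five codes and labels from the classification suggested above."""
    F = "female"
    M = "male"
    OTH = "other"
    UNK = "unknown"
    NAP = "not applicable"


# Reserved codes mentioned in the comment; they keep the same meaning
# in every code list. The dictionary name is an assumption of this sketch.
RESERVED_CODES = {
    "NAP": "not applicable",
    "UNK": "unknown",
    "NRP": "no response",
}


def label_for(code: str) -> str:
    """Return the label for a code, or raise ValueError if it is not in the list."""
    try:
        return Sex[code].value
    except KeyError:
        raise ValueError(f"code not in list: {code!r}") from None


def is_reserved(code: str) -> bool:
    """True if the code is one of the reserved codes shared across code lists."""
    return code in RESERVED_CODES
```

With mnemonic codes like these, a consumer can read a value such as `OTH` without consulting a separate table of numbers.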

Point 4.1.11 Citizenship

You should mention which classification of countries should be used as standard (I guess ISO 3166-1 as for Country of Birth and Country of Death).

Point 4.2.5 Activity

We fully support your recommendation to use NACE as standard for this variable. At present, NACE Rev. 2 is the mandatory standard for all EU Member States for statistics relating to economic activity; it has also been adopted by many non-EU Member States. Member States do have the possibility to develop national classifications of activities, but these national versions must be linked to the European NACE standard (based on established rules defined in the legal act implementing NACE). Furthermore, countries which have developed such national versions have also constructed concordance tables between the two systems. It should also be mentioned that NACE is directly derived from the UN standard ISIC (NACE is simply more disaggregated than ISIC) and that a correspondence table between the two systems also exists (see under tab "Correspondence tables" in RAMON). Since NACE is more disaggregated than ISIC, the links from NACE to ISIC are generally of the many-to-one type (m:1), which is a simple concordance. The reverse is a complex concordance, since weights are needed to allocate the "one" codes to the "many" codes. It is therefore much easier to go from NACE to ISIC than from ISIC to NACE.
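The difference between the simple and the complex concordance can be sketched as follows. The codes (`A1`, `A2`, `B1`, `A`, `B`) and the weights are made up for illustration; they are not real NACE or ISIC entries, and the function names are assumptions of this sketch.

```python
from collections import defaultdict

# Hypothetical many-to-one links from a finer classification to a coarser one.
FINE_TO_COARSE = {"A1": "A", "A2": "A", "B1": "B"}

# Hypothetical weights needed for the reverse direction: each coarse code is
# split across its finer codes, and the weights for one coarse code sum to 1.
WEIGHTS = {"A": {"A1": 0.25, "A2": 0.75}, "B": {"B1": 1.0}}


def to_coarse(values_by_fine: dict[str, float]) -> dict[str, float]:
    """Simple concordance: aggregate values by following the m:1 links."""
    totals: dict[str, float] = defaultdict(float)
    for fine_code, value in values_by_fine.items():
        totals[FINE_TO_COARSE[fine_code]] += value
    return dict(totals)


def to_fine(values_by_coarse: dict[str, float]) -> dict[str, float]:
    """Complex concordance: allocate each coarse value using weights."""
    allocated: dict[str, float] = {}
    for coarse_code, value in values_by_coarse.items():
        for fine_code, weight in WEIGHTS[coarse_code].items():
            allocated[fine_code] = value * weight
    return allocated
```

Going from fine to coarse needs only the link table, whereas going back requires the extra weight table, which has to come from some external source.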

Points 4.1.10 Country of Birth, Country of Death, and 4.4.1 Geographic Name

For information, please note that on March 6th 2012, all "GB" and "GR" codes were replaced with "UK" and "EL" respectively in all Eurostat data sets published on its website.

Points 4.3.2 and 4.3.3 - Issuing Authority URI and Issuing Authority

We suggest introducing instead an ISO-compliant code scheme that can uniquely identify (preferably worldwide) the authorities issuing identifiers for Legal Entities (and "Person"). Because identifiers are assigned at the national level, legal entities can only be uniquely identified globally in combination with the country code and the name of the issuing authority.

DG Markt has launched an initiative to secure the interoperability between registers in the EU (Proposal for a Directive of the European Parliament and of the Council amending Directives 89/666/EEC, 2005/56/EC and 2009/101/EC as regards the interconnection of central, commercial and companies registers). This question plays a role in that context too. One of the proposed solutions is to create a unique identifier composed of the country code, a code for the issuing authority, and the (national) legal identifier.

Last, we would recommend, as a general comment, inserting many more examples (especially for family names).
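A three-part identifier of this kind could be composed and parsed along the following lines. This is only a sketch under stated assumptions: the separator, the function names and the authority code `CH` are invented for the example, and the actual format under discussion in the proposal may differ.

```python
from typing import NamedTuple

# Assumed separator; the real scheme may use a different one or fixed widths.
SEPARATOR = "."


class CompositeIdentifier(NamedTuple):
    country_code: str    # e.g. an ISO 3166-1 alpha-2 code
    authority_code: str  # hypothetical code for the issuing authority
    national_id: str     # the (national) legal identifier


def compose(country_code: str, authority_code: str, national_id: str) -> str:
    """Join the three parts, rejecting empty parts or parts containing the separator."""
    parts = (country_code, authority_code, national_id)
    for part in parts:
        if not part or SEPARATOR in part:
            raise ValueError(f"invalid identifier part: {part!r}")
    return SEPARATOR.join(parts)


def parse(value: str) -> CompositeIdentifier:
    """Split a composite identifier back into its three parts."""
    parts = value.split(SEPARATOR)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a three-part identifier: {value!r}")
    return CompositeIdentifier(*parts)
```

For instance, `compose("GB", "CH", "00890446")` would combine a country code, the made-up authority code `CH` and a national registration number into one string that `parse` can split again.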

Please find below a comment from our INSPIRE colleague working on data specifications on facility-related themes:

In the ongoing development of the INSPIRE Annex II+III data specifications, the concept "Activity Complex" has been proposed to represent a generic superclass for industrial, agricultural or utility-related facilities. It is defined as follows: "A single unit, both technically and economically, under the management control of the same legal entity (operator), covering activities as those listed in the Eurostat NACE classification, products and services. Activity Complex includes all infrastructure, equipment and materials. It must represent the whole area, at the same or different geographical location, managed by a "single unit" as it has been previously described."

While the focus of the proposed INSPIRE model is on the physical and spatial characteristics of an "activity complex", there is a clear link with the "Legal Entity" class defined in the Business CV, especially since in many cases, one facility is managed by a single legal entity (which is often set up for the single purpose of managing the facility).

We therefore suggest including an (optional) link from the "Legal entity" to an INSPIRE "Activity Complex" (or any other representation of a facility managed by the legal entity), e.g. through a URI identifier (as was also done for the ISA CV Address class). We would also like to include the possibility of referring to an ISA CV "Legal entity" from the INSPIRE "Activity Complex" - what would be the best way to do this? Through the "legal identifier"?

In this post I want to reply to each of the points raised for the Person vocabulary that have not yet been answered fully and explain how they have been handled.

In reply to Paul Davidson, 16/3/12

Company type, activity and status. Firstly, the names have all been harmonised so they are companyType, companyActivity and companyStatus. More importantly, the spec document now includes a complete example of a business. This includes a sample encoding of a controlled vocabulary of company activities. I used NACE for this but it could be any vocabulary, including SIC. It follows the method described for expressing any controlled vocabulary (for which the initial example was for genders) which in turn was based on Jeni Tennison's work at http://www.jenitennison.com/blog/files/codelists.ttl. In other words, I believe your comments have been implemented in full.

4.2.6 The properties are legal identifier and identifier (not formal identifier, that was the class, now just called Identifier). The legal identifier points to an Identifier class that represents the registration by an authority that has the power to confer legal status. In Britain, the only authority that can do this is Companies House. Other countries have analogous bodies although some, such as Germany, have more than one and in Spain it can be a tax office that confers legal entity status. Even where multiple options exist, a single legal entity will only have that status conferred by a single authority and that's the one that legal:legalIdentifier points to.

To take a UK charity as an example, the NSPCC would not be covered by the business vocab as such - that's a job for the organisation ontology which you know intimately and that is being put through the W3C Rec track already (NSPCC is an org:FormalOrganization but is not a legal:LegalEntity). What the business voc would cover would be NSPCC Trading Company Ltd (http://opencorporates.com/companies/gb/00890446). The Business vocab is designed to link in with the org ontology so that the Organisation NSPCC could link to a legal entity like NSPCC Trading Company. The org ontology could also use the legal:identifier property to link to an Identifier that represents its charitable status (recording its number, 216401, and that it was the Charity Commission that issued that number).

In this sense there is no 'preference' for one identifier over another. It's just that only one Identifier captures the authority that conferred the status of registered Legal Entity.
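The distinction between the single legal identifier and any further identifiers can be sketched as a small data model. The class and field names below are invented for the illustration and are not the vocabulary's own terms; the sketch also assumes that the number in the Open Corporates URL quoted above is the registration number issued by Companies House.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Identifier:
    """An identifier together with the authority that issued it."""
    notation: str
    issuing_authority: str


@dataclass
class LegalEntity:
    """A legal entity with exactly one legal identifier and zero or more others."""
    legal_name: str
    legal_identifier: Identifier  # cardinality 1..1
    identifiers: list[Identifier] = field(default_factory=list)  # cardinality 0..*

    def all_identifiers(self) -> list[Identifier]:
        """Return the legal identifier followed by any other identifiers."""
        return [self.legal_identifier, *self.identifiers]


# Example based on the company mentioned above (registration number assumed
# from the Open Corporates URL).
trading_company = LegalEntity(
    legal_name="NSPCC Trading Company Ltd",
    legal_identifier=Identifier("00890446", "Companies House"),
)
```

Only `legal_identifier` records which authority conferred legal entity status; anything added to `identifiers` is simply another identifier, with no ranking among them.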

Actually, working on this reply was very useful in confirming my original understanding of the Org Ontology so that we can say that legal:LegalEntity is a sub class of org:FormalOrganization.

On the Identifier class and country of origin issue, I hope the revised Identifier class meets your comments too. Based on the UN/CEFACT complex type of Identifier, it captures several details about the identifier so that it is unambiguous which agency issued which identifier and what type of identifier it was. The example in Appendix A is included for clarity on this. It doesn't give the country of origin explicitly but does explicitly say that legal entity status was conferred on Apple Binding by Companies House.

Thanks for spotting the American English - I hope all instances have been expunged from this document.

In reply to Michael Lutz 19/3/12

Thanks for this Michael. I believe that the vocabulary provides the hooks you're looking for:

legal:legalEntity can be used to link any resource to a Legal Entity Class. This is useful, for example, where an organisation includes one or more legal entities.

dcterms:isPartOf is a suitable inverse of this. The definition being "A related resource in which the described resource is physically or logically included."

In the context you mention though it's worth highlighting the Organisation Ontology. This is already mature and in use in the UK but, like the core vocs and ADMS, is being taken forward by the W3C GLD WG. A quick glance at the diagram shows you that it conflates legal entities with other formal organisations - that's why the business vocabulary is a) in scope for the GLD WG and b) an important adjunct to it - the two are designed to work together (legal:LegalEntity is a sub class of org:FormalOrganization). I am surprised that the vocabulary that your unit is working on talks about a single unit under the management of a legal entity. That would not, for example, describe W3C. My boss, like me, works for the European Research Consortium for Informatics and Mathematics in Sophia Antipolis. His boss works for the Massachusetts Institute of Technology in Boston - such is the structure of W3C which, legally, does not exist :-). The org ontology /does/ have the concept of multiple sites and, for now at least, just uses simple methods of identifying the geographical location. Coordination with our work in the location voc is an obvious step to take here.