Hello,
I have just finished reading the Draft Final Report [1], and my comments are below. Further comments may follow on Monday from other people in our research group, and I am also waiting for comments from the National Library of Finland.
[1] http://www.w3.org/2005/Incubator/lld/wiki/DraftReportWithTransclusion (version: 10th of June)
Please don't hesitate to ask for clarification on any of the points below. I'm open to comments and happy to discuss any counterarguments you may have. :)
Regards,
Kim (with a background in semantic web & linked open data / Semantic Computing Research Group, Aalto University; _NO_ background in library data systems.)
------------------------------------------------------------------------------------------------------------------------------
GENERAL COMMENTS
* Who is the document written for? People change the world; organizations give the resources and provide stability in the long run. The actors (organizations) could be listed, e.g. the national libraries of each country and other key actors such as companies, conferences, publications, ... ("Hey, YOU should read this and start acting.") This could also help in getting comments on the report: we could send it to each actor on the list and request comments. Who should provide the resources for creating LLD?
* The vision of Library Linked Data should be described more clearly: what is the ultimate goal of moving libraries to Linked Data for, e.g., the next 1000 years or so? What do we actually recommend and believe in? What are the "key selling points"? Something catchy with a good picture would be good... What are the key BIG reasons why LLD is needed and why it changes everything? Where is the world around the library field and the linked data field going, and how does that affect the recommendations presented in the report? (The separate Use Case document could be used for ideas on what to write in such a vision?)
* Why should the reader believe us and the W3C on this library-related issue? In addition to saying something about the working group, the W3C, etc., perhaps the report could include statements from selected key actors, such as national libraries and other organizations?
* A general architectural assumption of the document seems to be that libraries should use Linked Data as their native source data format, i.e. RDF databases etc. I think there are two possibilities that should be considered and spelled out in the document:
1. Native Library Linked Data content management system, directly connected to the Linked Data cloud
2. (Existing) Native Library Data content management, with a Linked Data bridge/API to connect the Library to the Linked Data cloud
Replacing existing old-fashioned library IT systems with new library linked data IT systems would cost (my estimate, perhaps I'm wrong) billions of euros globally?
* The document is written with a strong belief in Linked Data: can we be confident that Linked Data is THE technology of the future? (Libraries work with a perspective of centuries; how should this be addressed in the document?) Some descriptions of existing success stories of using Linked Data in the library domain would be good here (they are now in a separate document).
* The document is somewhat long and verbose. It should perhaps be edited into a tighter package, with an abstract in each section and more "bullet points" on key actions/key points...
* Graphics could make the document easier to read:
- e.g. general architecture of a global Library Linked Data network,
- key actors and their relations to each other
- SWOT analysis
- timescale of the recommended actions: what should be done in the next year? what in next ten years? what in next 100 years? 1000 years?
- Linked Data cloud
- map of the world (with potentially some statistics?)
* Although it is listed as a task in the recommendations, identifying the key content of the future LLD could make the report a more concrete call for action.
* Some statistics would be good: how much content do libraries actually contain globally? How much content would there be in LLD? How many libraries are there in the world? How many natural languages? Any estimate of the financial needs: how much would it cost to build a global LLD? (Millions or billions of euros?)
* Geographical considerations: what are the differences in the benefits of using LLD when comparing Western countries to developing countries?
* Perhaps the report should be slightly re-ordered: the existing problems listed in section 6, "Implementation challenges and barriers", could be brought closer to the beginning as a motivation for the rest of the report. This would, however, require rewriting some of the text in section 6.
* Even though I recommend using lots of examples, I think some of the examples, such as Wikipedia, GeoNames, MusicBrainz, Google, Facebook etc., are perhaps not services that will be available for the whole century. So whenever something is given as an example, I would recommend adding the word "currently". Also, is e.g. MusicBrainz a big and stable enough service to rely on from the perspective of the libraries?
* Generally, there are no references to literature or other material supporting the key claims of the report (e.g. are the libraries lacking emphasis on keeping pace with technological change?)
* Some acronyms are not expanded (e.g. IFLA)
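To make option 2 in the architecture comment above (an existing native system with a Linked Data bridge) a bit more concrete, here is a minimal sketch of what such a bridge could do. Everything in it is my own assumption for illustration, not something taken from the report: the record layout, the base URI http://example.org/library/record/, the function names, and the choice to map fields to Dublin Core terms. It only shows that the native record can stay where it is while a thin layer emits N-Triples.

```python
# Hypothetical sketch of a "Linked Data bridge": the native record stays in the
# existing library system, and this layer only serializes it as N-Triples.
# BASE, the record fields and the field-to-property mapping are assumptions.

BASE = "http://example.org/library/record/"   # assumed base URI for records
DCTERMS = "http://purl.org/dc/terms/"         # Dublin Core terms namespace

# Assumed mapping from native field names to RDF properties.
FIELD_TO_PROPERTY = {
    "title": DCTERMS + "title",
    "creator": DCTERMS + "creator",
    "issued": DCTERMS + "issued",
}


def escape_literal(value):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(record):
    """Return one N-Triples line per mapped field present in the record."""
    subject = "<" + BASE + record["id"] + ">"
    lines = []
    for field, prop in FIELD_TO_PROPERTY.items():
        if field in record:
            literal = escape_literal(str(record[field]))
            lines.append('%s <%s> "%s" .' % (subject, prop, literal))
    return lines


if __name__ == "__main__":
    # A made-up native record, as the existing system might hold it.
    native_record = {"id": "12345", "title": "Example Title", "creator": "Example, Author"}
    for line in record_to_ntriples(native_record):
        print(line)
```

Option 1 would instead store the triples natively. The point of the sketch is only that option 2 leaves the existing cataloguing system untouched, which matters for the cost question raised above.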
SECTION SPECIFIC COMMENTS
3.1 Benefits section: Scope of this report
* an example with a picture would be good (datasets, element sets, value vocabularies)
3.2 Benefits of the Linked Data approach
* what about URNs? Libraries tend to use them quite a lot...
* "Benefits to Researchers, Students and Patrons" heading => "Benefits for the library users such as ..."
* "Benefits to Researchers..." text:
- 1st paragraph: the text can be shortened; start from the benefits (not from the fact that Linked Data cannot be noticed...) ;)
- 2nd paragraph: "Library users should be comfortable" ... sounds like "must" to my ear, whereas the point here is that everybody (?) knows the WWW and is thus already comfortable with linked data...
- 3rd paragraph: "RDFa" ... what about the recently published schema.org and the recommendation from Google and others to use microdata rather than RDFa?
* "Benefits for Organizations" heading => "Benefits for Libraries and other Memory Organizations"
* "Benefits to Librarians, Archivists and Curators"
- 1st paragraph: "By using Linked Data, memory institutions will create an open global pool of shared data that can be used and re-used to describe resources, with a limited amount of redundant effort compared with current cataloguing processes." => why is OPENNESS a benefit?
6 Implementation challenges and barriers to adoption
* general comment: what are the claims in this section based on? The report paints quite a bad picture of the library sector at the moment; is this intended?
* "Library leaders" paragraph: what does IFLA stand for?
* OpenURL, METS and OAI should be described briefly.
7 Recommendations section
* To make the report more of a call for action, perhaps each task heading should begin with "Task [number]", e.g. "Task 1: Identify sets of data ...", "Task 2: For each set of data..."
* "Consider migration strategies": "A full migration to Linked Data for library ..." -- Is Linked Data the future (for this Century)?
* "A plan must be drawn up that stages activities" ... could such a plan be sketched in this document?
* "Increase library participation in Semantic Web standardization", two comments:
- should "Semantic Web" be replaced with "Linked Data" to avoid confusion?
- and vice versa: increase Semantic Web (or Linked Data) participation in library standardization?
* "Translate library data, and data standards, into forms appropriate for Linked Data": "translators of library standards should involve Semantic Web experts" => does this work? In my opinion, it sounds slightly bad if you must ask for help from experts in one specific technology -- why not just learn what is needed yourselves?
* subsection "Assign unique identifiers (URIs) for all significant things in library data" -- the paragraph could be longer with some more details and argumentation
* "Create URIs for the items in library dataset" -- this subsection could be shortened + how about URNs?
* minor comment: the "Prepare" and "Design" sections both contain the idea of "design patterns" with quite similar descriptions. Perhaps the idea of design patterns should be moved to a "General" section?
* typo: "Commit to best-practice policies...", 1st paragraph: "and efficiency. quality assurance..." (replace the dot with a comma?)
* "Identify tools that support the creation and use of LLD", 1st paragraph: "URI generator" -- what about URNs?
* This sounds very bad from a "sales point of view": "Much of the content in today's Linked Data cloud is of questionable quality" -- why should our library put our beloved content into this Linked Data dumpster? :)
* in "Preserve Linked Data vocabularies": "Linked Data will remain usable twenty years from now only if its URIs ..." -- how about the whole Century?
--
Kim Viljanen
Semantic Computing Research Group SeCo, Dep. of Media Technology, Aalto University
email: kim.viljanen@aalto.fi
snail: P.O. Box 15500, FI00076 Aalto, Finland
visit: Room 2541, Otaniementie 17, Espoo, Finland
mob: +358 40 5414654
web: http://www.seco.tkk.fi/u/kimvilja/