Wednesday, October 31, 2007

The article When Is Open Access Not Open Access? (CJ MacCallum, PLoS Biology) examines the slippery activities of publishers that try to fly the flag of Open Access (with varying degrees of capitalization) but offer only the free-as-in-beer definition of freedom. The Open Access definition, by contrast, includes both free-gratis freedom and extensive intellectual property rights permitting unrestricted derivative use. This issue and these distinctions were discussed earlier this year in "Free but not open?" at the PLoS blog. I have noticed that many journals use weasel words like "We conform to open access as defined by SHERPA". The SHERPA definition does not include the extensive IP rights described in the Open Access definition:

By "open access" to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited. -Budapest Open Access Initiative

This watering-down of freedom from "free-gratis and free-to-use-and-modify-and-distribute" to simply "free-gratis" (and maybe some IP freedom for the authors) and the general obfuscation/duplicity/ignorance by publishers parallels similar activities in the software world, where the freedom issue has also been confused and watered-down in various "open source" (note case) licenses. See Open Source vs. Free Software.

Saturday, October 27, 2007

I have been working with tag clouds and other Web 2.0 sorts of things quite a bit lately [see earlier post: Drill Clouds for Search Refinement] and couldn't help noticing that it might be useful to apply the tag cloud "size reflects frequency/importance" idiom to HTML select lists. So I did a little bit of experimenting (BTW, I did look for these on the Web but didn't find any, which doesn't mean they are not already out there...).

So I played with the styles of these elements, and was able to get something that looks like this:

I am not sure how the above HTML renders in your browser (Update: Daniel has some info on how/if this works in different browsers), but here is how it renders in mine (Firefox 2.0.0.4 on Linux, Suse 10.2):

It is interesting how the browser allocates space: it seems to use the largest (tallest) item in the list to set the height of the widget, which makes sense. But while this version of Firefox appropriately sizes the pull-down contents (above, left), a selected term is shown at the default text font size (above, right), even if its font size as defined and as displayed in the pull-down is larger. This appears to be a bug, although it is quite possible that there is some CSS I should be using to handle this but do not know about. I have not tested this behaviour in other browsers, but I have seen it in other versions of Firefox (1.06, 2.0.0.7).

Notwithstanding this behaviour, on experimenting with these select variations, I think that they work well and are useful in the appropriate situations.
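As a rough sketch of the "size reflects frequency" idiom applied to a select list, the following Python generates a select element whose options carry inline font sizes. The function name scaled_select, the 80% to 200% range, and the counts are assumptions made for illustration, not the markup used in the experiment above; whether a given browser honours the style on option elements varies, as noted.

```python
from html import escape


def scaled_select(name, counts, min_pct=80, max_pct=200):
    """Return HTML for a <select> whose option font sizes reflect counts.

    Hypothetical helper: sizes are scaled linearly between min_pct and max_pct.
    """
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    lines = [f'<select name="{escape(name, quote=True)}">']
    for term, count in sorted(counts.items()):
        pct = round(min_pct + (count - lo) / span * (max_pct - min_pct))
        lines.append(
            f'  <option value="{escape(term, quote=True)}" '
            f'style="font-size: {pct}%">{escape(term)}</option>'
        )
    lines.append("</select>")
    return "\n".join(lines)


# Made-up frequencies, for illustration only.
example_counts = {"biology": 40, "chemistry": 10, "physics": 25}
select_html = scaled_select("subject", example_counts)
print(select_html)
```

Linear scaling is the simplest choice; a logarithmic scale may work better when a few terms are far more frequent than the rest.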

Tuesday, October 23, 2007

The October issue of the Communications of the ACM has two complementary articles in the area of Intellectual Property. Complementary in that one is on copyright reform and the other is on (software) patents:

Peter Suber of Open Access News reports that a U.S. Senate labour bill has recently had an amendment added to it, putting the Open Access mandate of the NIH at risk:

The provision to mandate OA at the NIH is in trouble. Late Friday, just before the filing deadline, a Senator acting on behalf of the publishing lobby filed two harmful amendments, one to delete the provision and one to weaken it significantly.

Saturday, October 13, 2007

The International Journal on Digital Libraries has a special issue entitled "Connecting digital libraries to eScience". I haven't had a chance to read any of the articles, but they look very interesting, and include some discussion on various scientific data issues, collaboration, repositories, research infrastructure, etc:

The former is a summary of recent projects and policy, and introduced me to a number of projects and initiatives that I hadn't previously known about. The latter is a well thought-out view of the data sharing continuum, showing us where we have been (and perhaps for some of us, still are!) and a good idea of where we will/should be going. A good graphic to show to a manager trying to understand the big picture.

Drill clouds are what I call an extension to tag clouds that makes them a useful tool for search refinement: the tag cloud is used to refine an existing query by adding new elements to the query through interactions with the cloud. As this results in a kind of drill-down search behaviour, these new clouds have been named drill clouds. Some differences between traditional tag clouds and drill clouds:

Drill clouds are applied to search results and -- as search results can be very large and include many result items and many tags -- the cloud that is presented is created from a subset of the result set (usually the top N). This is done for both user interface and performance reasons. This is different from traditional tag clouds, which are usually applied to all items. In Ungava, the number of tags and the number of search result articles from which those tags were derived are displayed and can be manipulated by the user.

When a tag is used in the query refinement, this tag is excluded from the subsequent cloud, as it exists in every result item of the new search. This is perhaps the most distinguishing attribute of a drill cloud: the exclusion of the accumulating search-refinement tags from the clouds of subsequent queries.
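The two differences above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Ungava's implementation: the function name drill_cloud, the "keywords" field, and the sample records are all hypothetical.

```python
from collections import Counter


def drill_cloud(results, refinement_tags, top_n=100):
    """Count tags over the top_n results, skipping tags already in the query.

    Hypothetical helper: `results` is a ranked list of dicts with a
    "keywords" list; `refinement_tags` are the tags added to the query so far.
    """
    excluded = set(refinement_tags)
    counts = Counter()
    for item in results[:top_n]:  # only a subset of the result set is used
        # set() so a tag repeated within one item is counted once
        counts.update(tag for tag in set(item["keywords"]) if tag not in excluded)
    return counts


# Made-up result items, for illustration only.
results = [
    {"title": "A", "keywords": ["chromatin", "histone", "yeast"]},
    {"title": "B", "keywords": ["chromatin", "histone"]},
    {"title": "C", "keywords": ["chromatin", "methylation"]},
]
cloud = drill_cloud(results, refinement_tags=["chromatin"], top_n=2)
print(cloud)
```

Because the excluded set accumulates across refinements, each new cloud only shows tags that can still narrow the result set.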

The user now clicks on the chromatin keyword cloud entry, which adds keyword:chromatin to the original search query, resulting in a new set of results:

Note that this refined search results in 52 hits, down from the original 3461 hits.

Now when the user clicks on the Keyword cloud link, they get the keyword drill cloud for the new results, but with the keyword (tag) chromatin excluded from the cloud, removing its dominating influence on the cloud (as all articles would have chromatin as a keyword). Here is the resulting keyword drill cloud:

If chromatin were not excluded, its dominance would shrink the other cloud entries, reducing the discriminating power of the cloud and its overall usefulness. Here is what the cloud would look like in our example:

The user can continue to iteratively refine their search using the drill cloud from each search. Note that users are not constrained to using the same type of metadata tag cloud for refinement, i.e. they can follow a keyword drill cloud refined search with one from any of the other available drill clouds.
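To see why the refinement tag flattens the cloud, here is a small sketch that maps tag counts linearly to font sizes. The chromatin count of 52 matches the hit count in the example; the other counts, the pixel range, and the function name font_sizes are made up for illustration and do not reflect how Ungava actually sizes its entries.

```python
def font_sizes(counts, min_px=10, max_px=32):
    """Linearly map tag counts to font sizes in pixels (hypothetical scaling)."""
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {
        tag: round(min_px + (n - lo) / span * (max_px - min_px))
        for tag, n in counts.items()
    }


# chromatin appears in all 52 hits; the remaining counts are invented.
with_refinement_tag = {"chromatin": 52, "histone": 9, "methylation": 6, "yeast": 3}
without_refinement_tag = {
    tag: n for tag, n in with_refinement_tag.items() if tag != "chromatin"
}

sizes_with = font_sizes(with_refinement_tag)
sizes_without = font_sizes(without_refinement_tag)
print(sizes_with)     # the other tags are squeezed near the minimum size
print(sizes_without)  # the other tags spread over the full size range
```

With chromatin included, the remaining tags differ by only a few pixels; with it excluded, the same tags span the whole range, which is what makes the cloud useful for choosing the next refinement.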