Thursday, July 28, 2016

The group suggests
that the industry is approaching a point where economics, rather than physics, becomes the Moore's Law roadblock. The further below 10 nanometres transistors go, the harder it is to make them economically. That will put a post-2020 premium on stacking transistors in three dimensions without gathering too much heat for them to survive.

There are problems other than the difficulty of making transistors smaller:

The biggest is electricity. The world's computing infrastructure already uses a significant slice of the world's power, and the ITRS says the current trajectory is self-limiting: by 2040, ... computing will need more electricity than the world can produce.

So we're looking at limits on both the amount of data that can be affordably stored and the computations that can be affordably performed on it.

The ITRS points to the wide range of applications for which these computations will be needed, and the resulting:

research areas a confab of industry, government and academia see as critical: cyber-physical systems; intelligent storage; realtime communication; multi-level and scalable security; manufacturing; “insight” computing; and the Internet of Things.

We can see the end of the era of data and computation abundance. Dealing with an era of constrained resources will be very different. In particular, enthusiasm for blockchain technology as A Solution To Everything will need to be tempered by its voracious demand for energy. Estimates of the 2020 energy demand of the bitcoin blockchain alone range from, optimistically, the output of a major power station to, pessimistically, the electricity consumption of Denmark. Deploying technologies that, like blockchains, deliberately waste vast amounts of computation will no longer be economically feasible.
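The deliberate waste is easy to see in a toy proof-of-work loop (a simplified sketch of the mechanism, not bitcoin's actual protocol): each extra hex digit of difficulty multiplies the expected number of hash attempts, and hence the energy spent, by 16, while producing nothing of independent use.

```python
import hashlib

def mine(data: bytes, difficulty: int) -> int:
    """Toy proof-of-work: find a nonce whose SHA-256 digest of
    (data + nonce) starts with `difficulty` zero hex digits.
    Expected attempts grow as 16**difficulty."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(data + str(nonce).encode()).hexdigest()
        if digest.startswith(target):
            return nonce
        nonce += 1

# Each unit increase in difficulty multiplies the work (and energy) by ~16.
nonce = mine(b"block", 4)
print("found nonce after", nonce + 1, "attempts")
```

The security of the chain rests precisely on this work being expensive, which is why the energy demand scales with the value being secured rather than with any useful computation performed.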

However, the raw citation data used here are not publicly available but remain the property of Thomson Reuters. A logical step to facilitate scrutiny by independent researchers would therefore be for publishers to make the reference lists of their articles publicly available. Most publishers already provide these lists as part of the metadata they submit to the Crossref metadata database and can easily permit Crossref to make them public, though relatively few have opted to do so. If all Publisher and Society members of Crossref (over 5,300 organisations) were to grant this permission, it would enable more open research into citations in particular and into scholarly communication in general.

Larivière et al's painstaking research shows that journal publishers and others with access to these private databases (Web of Science and Scopus) can use them to graph the distribution of citations to the articles they publish. Doing so reveals that:

the shape of the distribution is highly skewed to the left, being dominated by papers with lower numbers of citations. Typically, 65-75% of the articles have fewer citations than indicated by the JIF. The distributions are also characterized by long rightward tails; for the set of journals analyzed here, only 15-25% of the articles account for 50% of the citations
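The quoted skew is straightforward to reproduce with a toy simulation (the lognormal shape and parameters here are illustrative assumptions, not fitted to the paper's data): in any long-tailed distribution the mean, which is what the JIF measures, sits well above the median, so most articles are cited less than the "average".

```python
import random

random.seed(42)
# Hypothetical journal: 1,000 articles with citation counts drawn from
# a long-tailed (lognormal) distribution, mimicking the skew that
# Lariviere et al observe in real citation data.
citations = [int(random.lognormvariate(1.0, 1.2)) for _ in range(1000)]

jif = sum(citations) / len(citations)        # the JIF is a mean
median = sorted(citations)[len(citations) // 2]
below = sum(1 for c in citations if c < jif)

print(f"JIF-like mean: {jif:.1f}, median: {median}")
print(f"{100 * below / len(citations):.0f}% of articles cited less than the mean")
```

With these illustrative parameters roughly 70% of the simulated articles fall below the mean, consistent with the 65-75% the paper reports for real journals.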

Thus, as has been shown many times before, the impact factor of a journal conveys no useful information about the quality of any individual paper it contains. Further, the data on which it is based are themselves suspect:

On a technical point, the many unmatched citations ... that were discovered in the data for eLife, Nature Communications, Proceedings of the Royal Society: Biological Sciences and Scientific Reports raises concerns about the general quality of the data provided by Thomson Reuters. Searches for citations to eLife papers, for example, have revealed that the data in the Web of ScienceTM are incomplete owing to technical problems that Thomson Reuters is currently working to resolve. ...

Because the citation graph data is not public, audits such as Larivière et al's are difficult and rare. Were the data to be public, both publishers and authors would be able to, and motivated to, improve it. It is perhaps a straw in the wind that Larivière's co-authors include senior figures from PLoS, AAAS, eLife, EMBO, Nature and the Royal Society.

Thursday, July 21, 2016

Last May in my talk at the Future of Storage workshop I discussed the question of whether flash would displace hard disk as the bulk storage medium. As the graph shows, flash is currently only a small proportion of the total exabytes shipped. How rapidly it could displace hard disk is determined by how rapidly flash manufacturers can increase capacity. Below the fold I revisit this question based on some more recent information about flash technology and the hard disk business.

Tuesday, July 19, 2016

When Jefferson Bailey & I finished writing My Web Browser's Terms of Service I thought I was done with the topic, but two recent articles brought it back into focus. Below the fold are links, extracts and comments.

In 2014, the database's fifth year, an estimated 70,000 people were using the website each day.

Australian Library and Information Association chief executive Sue McKarracher said Trove was a visionary move by the library and had turned into a world-class resource.
...
"If you look at things like the digital public libraries in the United States, really a lot of that came from looking at our Trove and seeing what a nation could do investing in a platform that would hold museum, gallery and library archives collections and make them accessible to the world."