Scholar Metrics provide an easy way for authors to quickly gauge the visibility and influence of recent articles in scholarly publications. Today, we are releasing the 2015 version of Scholar Metrics. This release is based on citations from all articles that were indexed in Google Scholar as of mid-June 2015 and covers articles published in 2010–2014.

Scholar Metrics include journal articles from websites that follow our inclusion guidelines, selected conference articles in Computer Science & Electrical Engineering and preprints from arXiv, SSRN, NBER, and RePEc. As in previous releases, publications with fewer than 100 articles in the covered period, or publications that received no citations are not included.

Scholar Metrics also includes a large number of publications beyond those listed on the per-category pages. You can find these by typing words from the title in the search box, e.g., [stem cells], [enfermagem], or [conservation].

Recently, I spent a few days organizing my uncle's papers. He was a graduate student in the 60s and a faculty member for the rest of his life. Going over his papers was like walking through the history of scholarly communication. One of the fascinating things I found were pre-printed postcards for requesting article reprints.

Each institution printed these postcards for its researchers. They included the institution address and a template request. To request a reprint, you would fill in the address of the author and some information about the paper you were interested in and drop it in the mail. And hope for a response in six to ten weeks. Here are a couple of requests that my uncle received.

Much has changed since those days. Journal archives have moved online and email zips across the world in seconds. It is hard to imagine today how researchers of the day moved the mountains that they did.

Next in the 10th anniversary series, we look at the impact of older articles, and at how it had changed over the last several decades. A significant increase in the rate of publication over this time period might lead one to expect a corresponding decrease in the fraction of citations to older articles. However, this trend is counteracted by increasingly broad availability of archival content, and by universal availability of comprehensive relevance-ranked search. Overall, we found that the impact of older articles had grown over 1990-2013, and that the growth had accelerated over the second half of this time period. -- Alex Verstak

On the Shoulders of Giants: The Growing Impact of Older Articles

In this paper, we examine the evolution of the impact of older scholarly articles. We attempt to answer four questions. First, how often are older articles cited and how has this changed over time. Second, how does the impact of older articles vary across different research fields. Third, is the change in the impact of older articles accelerating or slowing down. Fourth, are these trends different for much older articles.

To answer these questions, we studied citations from articles published in 1990-2013. We computed the fraction of citations to older articles from articles published each year as the measure of impact. We considered articles that were published at least 10 years before the citing article as older articles. We computed these numbers for 261 subject categories and 9 broad areas of research. Finally, we repeated the computation for two other definitions of older articles, 15 years and older and 20 years and older.
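The measure described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the study: the function name `older_citation_fraction`, the `(citing_year, cited_year)` input format, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def older_citation_fraction(citations, min_age=10):
    """Return {citing_year: fraction of citations to older articles}.

    `citations` is an iterable of (citing_year, cited_year) pairs.
    A cited article counts as "older" when it was published at least
    `min_age` years before the citing article.
    """
    total = defaultdict(int)
    older = defaultdict(int)
    for citing_year, cited_year in citations:
        total[citing_year] += 1
        if citing_year - cited_year >= min_age:
            older[citing_year] += 1
    return {year: older[year] / total[year] for year in sorted(total)}


# Made-up toy data, not real citation records.
toy_citations = [
    (2013, 2001), (2013, 2010), (2013, 1990), (2013, 2012),
    (1990, 1975), (1990, 1988), (1990, 1989), (1990, 1985),
]

fractions_10 = older_citation_fraction(toy_citations)              # 10 years and older
fractions_20 = older_citation_fraction(toy_citations, min_age=20)  # 20 years and older
```

Changing `min_age` to 15 or 20 gives the two alternative definitions of older articles; running the same function per subject category would give the per-category numbers.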

There are three conclusions from our study. First, the impact of older articles has grown substantially over 1990-2013. In 2013, 36% of citations were to articles that are at least 10 years old; this fraction has grown 28% since 1990. The fraction of older citations increased over 1990-2013 for 7 out of 9 broad areas and 231 out of 261 subject categories. Second, the increase over the second half (2002-2013) was double the increase in the first half (1990-2001). Third, the trend of a growing impact of older articles also holds for even older articles. In 2013, 21% of citations were to articles >= 15 years old with an increase of 30% since 1990 and 13% of citations were to articles >= 20 years old with an increase of 36%.

Now that finding and reading relevant older articles is about as easy as finding and reading recently published articles, significant advances aren't getting lost on the shelves and are influencing work worldwide for years after.

The next article in our 10th Anniversary Series is by Prof. Jonathan Eisen. He is at the University of California, Davis with appointments in the Department of Medical Microbiology and Immunology in the School of Medicine and the Department of Evolution and Ecology in the College of Biological Sciences. His research focuses on communities of microbes and how they provide new functions – to each other or to a host. He is coordinating the largest microbial sequencing project to date – a Genomic Encyclopedia – being done at the DOE Joint Genome Institute where he holds an Adjunct Appointment. He is also an active and award-winning blogger and microblogger. -- Anurag Acharya

Using Google Scholar in Scholarly Workflows

Jonathan Eisen
School of Medicine & College of Biological Sciences, UC Davis

When Anurag Acharya asked me recently if I would be interested in writing a guest post for the Google Scholar blog in relation to the 10th anniversary of Google Scholar, I immediately responded "Yes." Literally, that was the full content of my email response to his request. Why did I answer so enthusiastically? Well, I can put this down to three main reasons:

So - in thinking about what to write for this post, I came up with three main topics I thought would be good to cover - how I got interested in topics of searching for and sharing scholarly papers, how I use Google Scholar, and some ideas about future possible uses of Google Scholar.

Part 1: Some Background

One day, in ancient history, my wife came home from work (at a biotech startup focusing on bioinformatics) raving about this new search engine "Google" that people at her company were talking about. As someone who thought of himself as on the cutting edge of web technology, I was a bit dismayed that I had not somehow discovered this myself. But I got over that and tried it out. And, after searching for my name (and being impressed with how well this new search engine worked on such an important topic) I immediately started playing around with searching for scientific papers and data. I did this, I guess, because ever since I was in college, I had been becoming more and more interested in (or some would say obsessed with) issues relating to finding and sharing scientific knowledge.

Without going into too much detail, some of the factors that contributed to my obsession included:

Working as a shelver and then assistant in the Museum of Comparative Zoology library in college and seeing how people struggled to find papers of relevance to their work;

Building and sharing databases where I was trying to include a description of every paper that had been published about specific genes. I note - thanks to the Wayback machine my Stanford website from when I was a PhD student is still available - although alas the specific linked databases are not. I have reposted some of them for people to see what they were like (though many of the links in them are busted). See for example my sites on RecA, SNF2, MutS and more

Working on projects to catalog everything known about specific organisms in association with work I was doing to characterize the genomes of these organisms

In these and other projects, I had seen and experienced just how much time could be spent on searching for papers and data about particular topics. I am not sure I had a well-defined strategy in every case but I came to rely upon some preferred methods including:

"Citation walking" where one takes a paper of interest and then asks "how has this paper been cited?" and traverses across the literature via citations

Searching for keywords in abstracts and titles

Browsing through specific journals

Looking for papers by specific authors

For data, I mostly would look in specific centralized data repositories such as Genbank for DNA sequence information and PDB for three-dimensional structural data on proteins.

And of course many other approaches. Nothing really novel or brilliant here though I do think I got pretty good at how to carry out such searches. But one of the challenges was each approach had to be done in a different system and some of the systems were only available for a fee and some were not even online. And even with lots of time and pain, many things could be missed.
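For readers who like to see such things spelled out, "citation walking" as described above amounts to a breadth-first traversal of a "cited by" graph. The sketch below is only an illustration under stated assumptions: the `citation_walk` function, the dictionary format, and the paper names are all invented for the example, and no real citation database is queried.

```python
from collections import deque


def citation_walk(cited_by, start, max_hops=2):
    """Breadth-first walk over "cited by" links.

    `cited_by` maps a paper to the list of papers that cite it.
    Returns {paper: number of hops from `start`} for every paper
    reachable within `max_hops` citation steps.
    """
    hops = {start: 0}
    queue = deque([start])
    while queue:
        paper = queue.popleft()
        if hops[paper] == max_hops:
            continue  # do not walk further than max_hops from the start
        for citing in cited_by.get(paper, []):
            if citing not in hops:  # skip papers already visited
                hops[citing] = hops[paper] + 1
                queue.append(citing)
    return hops


# Hypothetical toy graph: paper_B and paper_C cite paper_A, and so on.
toy_cited_by = {
    "paper_A": ["paper_B", "paper_C"],
    "paper_B": ["paper_D"],
    "paper_C": ["paper_D", "paper_E"],
    "paper_D": ["paper_F"],
}

found = citation_walk(toy_cited_by, "paper_A", max_hops=2)
```

Limiting the number of hops matters in practice, since the set of citing papers can grow very quickly with each step.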

Thus when my wife introduced me to this newfangled Google thing my thoughts rapidly turned to - how can I use this new tool to help in finding and then sharing scientific papers or data about these genes and organisms I was studying? Did Google searches solve all my "issues" in this regard? Alas, no. But jump forward ~15 years to today and I am quite amazed in retrospect how much of my scholarly workflow flows through Google Scholar. But rather than try to recall and write about how my workflow changed with the advent of Google Scholar I thought I would just jump to the present time and discuss some ways that I use Google Scholar now.

Part 2: Using Google Scholar today

When working on this post I started to look around at how I use Google Scholar and I confess I was amazed at how many different ways I use it in my work. Here are some examples:

Tracking and using citations. One major general use of Google Scholar lies in tracking of citations to specific scholarly works. Here are some ways that I use such information:

Citations to individual works. A key aspect of scholarly work in many fields is examining how specific works are cited. Such information has many uses including discovering new works on a topic by seeing how specific papers from the past are cited, assessing impact of works, ego satisfying, and more. For many years, information on how a specific work was cited was nearly impossible to come by without paying for access to citation tracking databases. Now, with Google Scholar I (and others) can very rapidly gather such information.

Citation from diverse sources. One aspect of using Google Scholar to track citations to individual works is the way GS finds citations in diverse sources – not just in the peer reviewed scholarly literature. Now, in some ways this can be viewed as a limitation (some may not want to count or even know about citations from self published white papers, for example). But in other ways this is a wonderful thing as one can find citations to one’s work from very diverse sources outside of the “normal” mold.

Citation metrics. It is not a large conceptual leap to go from the ability to track citations to individual works to the ability to create summary statistics about citations across many works. There are many indices for such purposes – some useful and some not. But whatever you think of such indices – Google Scholar has opened up the ability for people to calculate such metrics for oneself or to offer services to calculate metrics for others. Such indices can be used in many ways but perhaps the most common is to summarize the citations for one individual researcher. Which leads into my next topic …

Google Scholar Pages. Perhaps my favorite development from Google Scholar in the last 10 years has been the introduction of Google Scholar Pages for individuals. I make use of my Google Scholar page and pages of others for dozens of things including these:

Citation metrics for myself. See above for a discussion of citation metrics in general. I use Google Scholar pages to examine citation metrics for myself and my papers all the time (right now GS shows two summary statistics: h-index and i10-index). And I use this information in many ways including putting it on my CV, including it in grant reports, and examining which of my scholarly works have had more “impact”.

As landing page for my publication list. Once one has a GS page, GS automatically adds new publications to one’s list and also updates citation counts and other information regularly. Thus I now include a link to my GS page on my blogs, my work web sites, and in my email signature.

To keep track of my coauthors. I have been blessed (and perhaps a bit cursed) to work in a field (genomics) where many projects involve large-scale collaborations across many institutions, involving many researchers. And I have found that a nice way to track these coauthors is via GS (although – note to GS folks – there used to be a way to show, publicly, all coauthors in a list but I cannot seem to figure out how to do this anymore).

Author disambiguation. For people like myself with a relatively unique name, when others search for my scholarly works, they are pretty easy to find (although I note the fact that there is another Jonathan Eisen out there who publishes some works with a bit of a conspiracy theory angle has been both good and bad for me at times). But for many others, their name is not a perfect way to find their work. This may be because they have a name that is relatively common, or it may be because they have changed their name (e.g., after marriage). For such people creating a GS page can be very useful because once one trains GS with a set of works, it can find new works by that same person quite well (I first found out about this author disambiguation by GS when Anurag gave a talk at a meeting I organized last year). GS is certainly not the only tool in author disambiguation and others – like author UIDs (e.g., ORCID) – are almost certainly better long term options. I note – author disambiguation may seem like an esoteric topic to many but it has major implications on important issues such as gender equity in academia, since women are much more likely to change their names during their career than men are.

Automated updates of new papers by specific authors. One option associated with GS author pages I use extensively is the ability to “follow” specific authors and get notified of new publications of theirs.

To keep track of a collection of people. Most researchers do not regularly update their individual publication pages on their websites. However, if those researchers have GS pages one can keep track of their new papers quite easily (either by the follow option mentioned above or just by browsing occasionally). For example, for my microBEnet project I curate a list of GS pages for researchers in the whole field with connections to studies of “microbiology of the built environment” and thus (hopefully) help others keep up with what is going on in the field.

Who is in a specific field? One feature of GS author pages that is not used a lot as far as I can tell, but which has some value is the “areas of interest” tag one can add to one’s profile. Though not everyone uses such tags, I have found they are a useful tool in finding researchers working on specific topics. For example, I list “symbiosis” as one of my areas of interest and if I click on the link for that on my page I get a list (sorted by citation counts – which is both useful and annoying) of others who have listed that same area of interest. And many of the people in this list I am not familiar with yet they do work on topics in which I am very interested.
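As an aside on the two summary statistics mentioned above under citation metrics: both are easy to compute from a list of per-paper citation counts. The sketch below uses the standard definitions (the h-index is the largest h such that h papers have at least h citations each; the i10-index is the number of papers with at least 10 citations). The function names and the toy counts are made up for illustration, and this is not how GS itself computes them.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break  # counts only decrease from here, so h cannot grow
    return h


def i10_index(citation_counts):
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citation_counts if count >= 10)


# Made-up citation counts for seven hypothetical papers.
toy_counts = [48, 33, 12, 10, 9, 4, 0]

toy_h = h_index(toy_counts)
toy_i10 = i10_index(toy_counts)
```

With the toy counts above, five papers have at least five citations each, and four papers have at least ten.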

Automated discovery of new papers by topic. Pretty much all scholars these days are drowning in information and in keeping up with scholarly works. There are many reasons for this of course, and there are also some solutions. I find, for example, that social media is a great way to keep up to date on what new papers are coming out or have come out recently. But social media does not find everything and as someone who is responsible for keeping others up to date on various fields (e.g., this is one of my jobs at microBEnet) I also rely on both manual and automated searches of the scholarly literature to find new papers or old papers I have missed. GS has two key ways to help in this regard. The first is relatively simple in concept but takes advantage of the power of Google indexing – which is just directly searching GS for papers on particular topics. And the advanced search options allow some customization of such searches. But as someone who is quite busy, I do not actually end up searching GS for new papers all that often. Instead I rely upon automated searches through various services including Pubmed, Pubchase, and GS. I use GS in two ways for such automated searches:

Create an alert. When one does a search on GS, in addition to results one is presented with an option to “Create an alert”. I now have dozens of such alerts in operation. To avoid getting drowned by the results I set them up to send only once a week and I filter them into a separate mail folder that I only look at when I have time. But I frequently find interesting new papers this way.

GS Updates. Another option now available, if one has a GS profile, is to use the GS Updates system (which I have written about before here and here for example). This system uses one’s publication list to scan for new papers that are related in some way to one’s prior work.

Many other uses of GS. I have gone on perhaps way too long here so I am only going to briefly mention a few other uses of GS.

Finding online versions of papers. Unquestionably one of the most valuable uses of GS is to find online versions of scholarly works. But since others have written extensively about this I will just say the following: if you publish any scholarly work I recommend you make it freely and openly available AND that you make sure that it gets indexed by GS.

Full text searches of the literature. Another critically important aspect of GS is that it facilitates full text searching of the scholarly literature which is important for many reasons.

Finding works outside of the “normal” places to publish. Another key feature of GS is that it indexes much more than just publishers’ sites. If one posts a preprint on one’s own web server, that paper may show up in GS (which I think is a good thing). GS also indexes many diverse sources of scholarly works and thus helps in finding works that may otherwise not see the light of day.

Part 3: Where do we go from here?

As an active user of Google Scholar I of course have many comments, complaints, ideas and thoughts about what it could do better and where it might go in the future. And there are SO many things that could be added or improved upon – things like better figure and table searching, better exporting of information, better abilities to curate and create collections and to then use such collections as training sets for automated searches, and more and more and more. I have written about some such issues and suggestions from time to time in my blog (see for example, this and this and this). There is certainly lots of work to be done.

But in thinking about this I realized that making a list of issues and suggestions is only of limited value. What I think GS really needs is a better public forum where GS can discuss what their plans are for the future and also where users and developers can discuss what they would find useful. And though I see some places for such discussions on the Google Scholar blog and in related sites, I don’t see a lot. So – I would like to end with a call for GS to create a better site for such discussions of the future of GS …

The next article in our 10th Anniversary Series is by Thomas Bruce. He is the director of the Legal Information Institute at Cornell. He co-founded the LII in 1992. Today, its legal collections are used widely and have inspired the Free Access to Law Movement which has helped citizens worldwide learn about the laws that govern them. Thomas is also the author of Cello, the first Web browser for Microsoft Windows. -- Anurag Acharya

Caselaw is Set Free, What Next?

Thomas Bruce, Director, Legal Information Institute, Cornell

A lawyer story

Google Scholar’s caselaw collection is a victory for open access to legal information and the democratization of law. It would be more than worthy of celebration from that standpoint alone. But caselaw is above all an obsession of lawyers, and I’d like to start by telling the tale from their point of view.

Five years ago, when Google Scholar added judicial opinions to its portfolio, it created an immediate sensation among lawyers. Small-office and solo practitioners were the most vocal about it; they had always had a difficult time affording the services of commercial publishers, even in print. And now there was access to a significant chunk of material that had previously been lodged firmly behind paywalls. It was linked and searchable, and still better, it offered a version of the citation-tracking and evaluation features that lawyers knew and loved in expensive commercial systems. It had first-class sorting and filtering features. It had Bluebook-form citations for each case (pretty much the epitome of something that nobody but lawyers knows or cares about, but a very thoughtful touch indeed). Nobody in the open-access arena had tried such a thing, and probably only Google could have. One commentator said that, “Google fired (arguably) the loudest...salvo in the battle for free access to caselaw… and it apparently came as a tweet”.

Scholar’s immediate impact on the legal profession was owed in large part to its technical virtuosity. It was an unusual display of ingenuity used to democratize services and features whose value had mostly been known only to lawyers. But, for the legal profession, it was happening in the middle of a long-brewing, near-perfect storm. Since at least the early 90’s, clients had complained about surcharges that law firms added to legal research costs. By 2000, there was growing refusal to reimburse legal-research fees at all; clients felt that the firm’s online charges were just a part of overhead, like water and electricity. That was not an isolated gripe; rather, it was a visible crack in a business model that we now know had been eroding for quite some time. By one estimate, the 2008 implosion of the financial-services industry destroyed over a third of the legal employment in New York. A lot of firms changed radically or disappeared altogether in the aftermath. You could talk, in dry academic terms, about downward price pressure on the industry. One suspects that the feeling was more like riding in an elevator whose cables had been cut.

There had been free offerings of caselaw online for some time, starting with a BBS system offered by the Cleveland Freenet in 1989; the first web-based effort started here at Cornell in 1992, and was followed with a full edition of all Federal statutes in 1994. Elsewhere -- notably in Canada and Australia -- open-access systems offered by third parties had evolved into the de facto national standard. And government was catching up, with many law creators publishing their materials online, for free.

Free services had never been the first choice of lawyers in the US. Some of the reasons were rational -- free services often lacked features that lawyers depend on, most provided very little in the way of commentary or annotation, and in any case they were highly distributed. There was no “one-stop shopping” in the world of open access to law, just a lot of websites offering different collections. The irrational reasons were, if anything, even more interesting and far more influential, though much more deeply buried in lawyer psyches. Lawyers are notoriously conservative in their work methods, and many law librarians even more so. Anything that was both new and noncommercial was inherently suspect. And the commercial services had had more than a century to reinforce the idea that size and comprehensiveness were the only measures of quality that mattered.

Even so, it’s hard to convey the degree to which lawyers mistrust distributed systems. As John Lederer once remarked, “Lawyers don’t buy books -- they buy systems of books”, and so it was with electronic products as well. It was easy for lawyers to dismiss what they saw as isolated pockets of legal information offered by volunteers at wildly different levels of added value, and marketers of commercial services had been quick to emphasize these qualities. That said, in the year prior to the addition of caselaw to Scholar, Cornell’s website had delivered well over 81 million pageviews to nearly 14 million unique visitors. 4.5 million of those pageviews went to the Federal Rules of Civil Procedure, a collection unlikely to be used by anyone but lawyers.

Comes now Google, a company with unparalleled capacity and legendary technical skills, offering a large collection of caselaw under one roof, with a workable citator and advanced search functionality. That was a big story, and it was often reported as “Google takes on commercial legal-research behemoths”. It was free access offered from a source that could not be dismissed as somehow beneath notice or unlikely to survive. Google’s offerings in Scholar thus became a validation of, and a capstone on, the things that open-access advocates had been doing for years. Apart from its inherent value -- which was, and is, huge -- it was a sign that freely accessible legal information was technically advanced and more than sufficient for many if not most professional needs. Most of all, it signaled that free legal information was something to be taken seriously. It sent that signal at a time when circumstances compelled the profession to pay far more attention than it otherwise might have. Scholar not only brought us a new and capable collection, it brought a new level and quality of attention to the entire open-access enterprise.

Everyone else

I began by telling a story about law and lawyers, but of course there’s an even more compelling story about law and everyone else. Laws -- and particularly statutes and regulations -- affect everybody. They describe what’s possible and permissible, what it costs to do business, what we can expect from government and what government can expect from us. On any given day, an open-access legal web site such as ours, or Scholar, is used by people who are helping veterans get the benefits to which they’re entitled, small businesses planning new courses of action, and students at all levels who are learning about the Constitution and our system of government. There are law-enforcement personnel learning about the limits and obligations of their position, hospital managers consulting public-benefits law, and people finding out what they have to do to sell new products in new markets. Those people need access to law. They need to be able to create starting points for themselves, using search to connect words and phrases that they already understand with concepts and explanations that at first they will not understand at all. They need to be able to follow their noses from those poorly-understood things to other pages that will explain them. Making all that possible is the next challenge.

What now?

Google Scholar’s caselaw collection offers features -- such as citators -- that are a step toward the “system of books” that would fully integrate primary legal sources and commentary into a practical resource for public understanding and professional practice. The legal-information ecosystem on the Web as a whole is moving in that direction. As that progresses, the benefits to everyone affected by law -- which is to say, everyone, period -- will be enormous. We will move beyond making law available on the Web to making it truly accessible on the Web -- not just discoverable, but understandable.

In 1992, starting with important caselaw collections, the open-access community began connecting law to itself. Hyperlinks gave readers a way to seamlessly follow citations -- at least if the cited thing was available online somewhere. And simply seeing to it that the things that ought to be online are online kept us all busy for a very long time (and is still a significant problem, in many places, some of them surprisingly close to home). We need to increase the density of connections between documents by making connections easier for machines (rather than human authors) to create. We need to hugely increase the amount of freely-available material that explains the law. And we need to -- in ways both trivial, and not -- make it possible for people to find the laws that affect them using things they already know.

Regulations provide a really good arena for thinking about such problems, for two reasons. First, they are harder for information systems to deal with. They are inconsistently drafted by a wide variety of people. For example, the Code of Federal Regulations is essentially a compilation of the work of perhaps 200 agencies (nobody really knows exactly how many). And, compared to caselaw, regulations have been relatively neglected by open-access publishers. Finally, and most importantly, they are the largest single contact surface between the public and the legal system. Yes, there are Supreme Court cases that are sweeping in their effect on daily life -- roughly half a dozen a year, compared to the thousands and thousands of cases in the Federal system that are just about two people suing two other people over something that only four people care about (and maybe a fifth if you count the judge). Regulations affect lots of people, and they change often. That makes them much more of a challenge for open-access publishers, both technically and economically. It also makes it that much more urgent to provide citizens with improved modes of access and value-added services such as notification of changes and anything and everything that would make compliance easier. Second, regulations are about things, and they are often based on science. And building things that bridge knowledge domains is what information scientists do.

A trivial example may help. Right now, a full-text search for “tylenol” in the US Code of Federal Regulations will find… nothing. Mind you, Tylenol is regulated, but it’s regulated as “acetaminophen”. But if we link up the data here in Cornell’s CFR collection with data in the DrugBank pharmaceutical collection, we can automatically determine that the user needs to know about acetaminophen -- and we can do that with any name-brand drug in which acetaminophen is a component. By classifying regulations using the same system that science librarians use to organize papers in agriculture, we can determine which scientific papers may form the rationale for particular regulations, and link the regulations to the papers that explain the underlying science. These techniques, informed by emerging approaches in natural-language processing and the Semantic Web, hold great promise.
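The brand-name lookup described above can be pictured as a simple form of query expansion. The sketch below is only an illustration, not a description of how the LII or DrugBank systems actually work: the `BRAND_TO_INGREDIENTS` table, the invented brand "examplebrand pm", the regulation snippets, and the function names are all hypothetical stand-ins.

```python
# Hypothetical brand-to-ingredient table standing in for a
# DrugBank-style dataset. "examplebrand pm" is an invented brand name.
BRAND_TO_INGREDIENTS = {
    "tylenol": ["acetaminophen"],
    "examplebrand pm": ["acetaminophen", "diphenhydramine"],
}

# Invented snippets standing in for regulation text.
TOY_REGULATIONS = {
    "reg-1": "Labeling requirements for products containing acetaminophen.",
    "reg-2": "Standards for fresh fruit grading.",
    "reg-3": "Warnings for diphenhydramine in over-the-counter sleep aids.",
}


def plain_search(term, documents):
    """Full-text search with no expansion: match the literal term only."""
    needle = term.strip().lower()
    return sorted(doc_id for doc_id, text in documents.items()
                  if needle in text.lower())


def expand_query(term):
    """Return the term plus any ingredient names known for that brand."""
    key = term.strip().lower()
    return [key] + BRAND_TO_INGREDIENTS.get(key, [])


def expanded_search(term, documents):
    """Search for the term and for every ingredient it maps to."""
    terms = expand_query(term)
    return sorted(doc_id for doc_id, text in documents.items()
                  if any(t in text.lower() for t in terms))


literal_hits = plain_search("Tylenol", TOY_REGULATIONS)
expanded_hits = expanded_search("Tylenol", TOY_REGULATIONS)
```

On the toy data, the literal search finds nothing while the expanded search finds the acetaminophen snippet, which mirrors the “tylenol” example in miniature.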

All successful information-seeking processes permit the searcher to exchange something she already knows for something she wants to know. By using technology to vastly expand the number of things that can meaningfully and precisely be submitted for search, we can dramatically improve results for a wide swath of users. In our shop, we refer to this as the process of “getting from barking dog to nuisance”, an in-joke that centers around mapping a problem expressed in real-world terms to a legal concept. Making those mappings on a wide scale is a great challenge. If we had those mappings, we could answer a lot of everyday questions for a lot of people.

As I hinted earlier, search is often just the start; it shows the way to the trailhead, but the information-seeker must then follow a path that leads to commentary and deeper explanation of what the search engine offers easily. Building that path is a problem that rests critically on integration across multiple websites and collections. Metadata-publishing standards and linked-data approaches are helping; we look forward, for example, to a set of specific legal extensions to schema.org that will make it easier for people and machines to follow their noses from what search provides to the understanding that they really need. It will be a long job.

But that is a tale for another day, perhaps another ten years in the
future. It’s exciting to see how far we’ve come. Scholar, and its
legal collection, are a tremendous gift to those who want to know
about the law, and a platform for those of us who want to go further.

The world of scholarly communication has changed quite a bit over the last decade and Scholar has been a part of the change. We are taking the opportunity of Scholar's 10th anniversary to explore the impact of these changes - looking at how scholarship and citation patterns have changed as publications and archives moved online and comprehensive relevance-ranked search became available to everyone.

As the next article in the 10th anniversary series, we have published a study on arXiv examining the evolution of the impact of non-elite journals. The idea that a small elite set of journals covers most of the key papers in a discipline has long been prevalent in the study of scholarly communication. We explore how this has changed over 1995-2013. - Anurag Acharya

Rise of the Rest: The Growing Impact of Non-Elite Journals

In this paper, we examine the evolution of the impact of non-elite journals. We attempt to answer two questions. First, what fraction of the top-cited articles are published in non-elite journals, and how has this changed over time? Second, what fraction of the total citations are to non-elite journals, and how has this changed over time?

To answer these questions, we studied citations to articles published in 1995-2013. We computed the 10 most-cited journals and the 1000 most-cited articles each year for all the 261 subject categories included in Scholar Metrics. We considered the 10 most-cited journals in a category as the elite journals for the category and all other journals in the category as non-elite.
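The two measurements can be sketched for a single category and year as follows. This is a minimal sketch, not the study's actual code: the function names, the record layout, and the toy data are hypothetical, and it assumes that journals are ranked by the total citations to their articles. The study uses the 10 most-cited journals and the 1000 most-cited articles; the toy example uses smaller numbers so it can be checked by hand.

```python
from collections import Counter


def elite_journals(articles, k):
    """Return the k journals with the most total citations (assumed ranking)."""
    totals = Counter()
    for article in articles:
        totals[article["journal"]] += article["citations"]
    return {journal for journal, _ in totals.most_common(k)}


def non_elite_share_of_top(articles, k_journals, n_articles):
    """Fraction of the n most-cited articles published in non-elite journals."""
    elite = elite_journals(articles, k_journals)
    top = sorted(articles, key=lambda a: a["citations"], reverse=True)[:n_articles]
    non_elite = sum(1 for a in top if a["journal"] not in elite)
    return non_elite / len(top)


def non_elite_citation_share(articles, k_journals):
    """Fraction of all citations that go to articles in non-elite journals."""
    elite = elite_journals(articles, k_journals)
    total = sum(a["citations"] for a in articles)
    non_elite = sum(a["citations"] for a in articles if a["journal"] not in elite)
    return non_elite / total


# Toy data for one category and one year (entirely made up).
articles = [
    {"journal": "A", "citations": 100},
    {"journal": "A", "citations": 80},
    {"journal": "B", "citations": 90},
    {"journal": "C", "citations": 95},
    {"journal": "C", "citations": 5},
    {"journal": "D", "citations": 10},
]

# With 2 elite journals (A and C), 1 of the 4 most-cited articles is non-elite.
print(non_elite_share_of_top(articles, k_journals=2, n_articles=4))  # 0.25
print(round(non_elite_citation_share(articles, k_journals=2), 3))    # 0.263
```

Repeating these calls per year and per category would give the two time series the questions ask about.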

There are two main conclusions from our study. First, the fraction of highly-cited articles published in non-elite journals increased steadily over 1995-2013. While the elite journals still publish a substantial fraction of high-impact articles, many more authors of well-regarded papers in a diverse array of research fields are choosing other venues.

Our analysis indicates that the number of top-1000 papers published in non-elite journals for the representative subject category went from 149 in 1995 to 245 in 2013, a growth of 64%. Looking at broad research areas, 4 out of 9 broad areas saw at least one-third of the top-cited articles published in non-elite journals in 2013. All broad areas of research saw a growth in the fraction of top-cited articles published in non-elite journals over 1995-2013. For 6 out of 9 broad areas, the fraction of top-cited papers published in non-elite journals for the representative subject category grew by 45% or more.
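The 64% figure follows directly from the two counts quoted above. As a quick check (the helper name `relative_growth` is just for illustration):

```python
def relative_growth(start, end):
    """Relative change from start to end, as a fraction of start."""
    return (end - start) / start


# Top-1000 papers in non-elite journals: 149 in 1995, 245 in 2013.
growth = relative_growth(149, 245)
print(f"{growth:.0%}")  # 64%
```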

Second, now that finding and reading relevant articles in non-elite journals is about as easy as finding and reading articles in elite journals, researchers are increasingly building on and citing work published everywhere. Considering citations to all articles, the percentage of citations to articles in non-elite journals went from 27% of all citations in 1995 to 47% in 2013. Six out of nine broad areas had at least 50% of total citations going to articles published in non-elite journals in 2013.

The second article in our 10th Anniversary Series is by Abel Packer. He is the director of the SciELO Program which has transformed scholarly publishing in Latin America. Given SciELO's multi-lingual reach, this post appears in English, Portuguese and Spanish. - Anurag Acharya.

SciELO, Google Scholar and Latin American Journals

Abel L. Packer
SciELO/FAPESP Program, Director

SciELO is 16 years old. Today, it publishes approximately one thousand selected peer-reviewed open access journals grouped into national collections. The SciELO Network currently comprises 16 national collections, 13 from Latin America and three from Portugal, Spain and South Africa.

The primary goal of SciELO is to provide growing visibility to the research published by national journals. When SciELO was launched these journals were print-only, usually with a small subscriber base. Only a few journals were indexed in citation indexes and there was no way of determining the real or potential impact that most of the journals had in their respective fields.

Today, based on COUNTER-compliant statistics, we estimate about one million downloads a day across the network, 500 thousand of them from SciELO Brazil. The total number of articles hosted across the SciELO Network is over 450 thousand.

This raises two important questions: how did SciELO succeed in setting up such a broad-based operation and achieving such impressive download numbers, and why have so many countries and journals joined the SciELO Network?

There are four major factors. First, the reputation and leadership of the driving organizations. The SciELO project was established and nurtured by the São Paulo Research Foundation (FAPESP), widely known in Brazil as the most efficient and advanced research agency in the country, and the Latin American and Caribbean Center in Health Sciences Information (BIREME), which is affiliated with the Pan American Health Organization and the WHO. The initial motivation for the partnership was to develop a citation index covering a more comprehensive collection of journals beyond the 17 which were then indexed in the Journal Citation Reports from ISI. Soon after launch, the Chilean National Commission for Scientific and Technological Research (CONICYT) joined the effort. From 2002 on, the Brazilian National Council for Scientific and Technological Development (CNPq) and other national research agencies also started to support SciELO.

Second, the selective acceptance criteria applied to journals for SciELO collections. Only open access peer-reviewed journals with an editorial board composed of well-known researchers, a reasonable rejection rate of manuscripts and standards-compliant publication processes were accepted. The best journals of Brazil were invited by FAPESP to join SciELO. CONICYT took a similar approach for SciELO Chile. This helped set the expectation of selective acceptance criteria for new national collections.

Third, the tremendous impact of Google Scholar, which was decisive in moving the program ahead. As soon as Google Scholar began indexing SciELO, the traffic to SciELO sites increased to such an extraordinary extent that the general panorama was completely changed. The dramatic growth contributed, in a major way, to overcoming the resistance that publishers had towards online publication. Google Scholar showed publishers, editors, authors and users that online publication was the new paradigm for the dissemination of journals and that SciELO could help them to achieve it. The processes put in place by SciELO to create structured versions of the articles and metadata, as well as the standardization of article formatting, were a key component in the rapid success of the indexing effort.

Fourth, the success and the increasing use of SciELO, together with quality control on the journals, led national research evaluation systems to include SciELO as an index in their evaluation criteria. This favored an increase in manuscript submissions to indexed journals, which provided an additional impetus to the program.

The other ongoing objective of SciELO is to increase the impact of the research communicated by its journals. A key requirement for this is to identify and count citations to SciELO journals and articles. SciELO computes bibliometric indicators covering the journals it hosts. To measure broader impact, SciELO initially relied on Web of Science (WoS) and Scopus. However, these indexes have incomplete coverage of SciELO journals. For example, in 2014 Scopus covers 70% of the journals in SciELO Brazil, and WoS only 36%. To partially address this gap in coverage, SciELO concluded an agreement with Thomson Reuters to include, as of 2014, the SciELO Citation Index in the WoS platform, which provides wider coverage, particularly in the physical and life sciences.

However, Google Scholar has much broader coverage worldwide, even more so in social sciences and humanities. As a result, Scholar Metrics offers more comprehensive citation numbers. These are now used by SciELO to evaluate the broader influence of its journals. Scholar Metrics are also a key part of the evaluation process for new journals that want to become part of SciELO. In this regard, what we would really like in Scholar Metrics is an annual series of indicators and an extension of the journal rankings beyond 100.

SciELO and Google Scholar have come a long way together. Together, we have helped to significantly increase the worldwide visibility of Latin American journals and journals of Portugal, Spain and South Africa. On its anniversary, we would like to congratulate the Scholar team for the impressive development of Google Scholar, a comprehensive search service that many gifted minds once only dreamed of. Long live Google Scholar!