Innovation and best practices for the Web

About this Blog

The blog is written by Brian Kelly. Brian is the Innovation Advocate based at CETIS, University of Bolton.

This blog functions as an open notebook which provides personal thoughts, reflections and observations on the role of the Web in higher and further education which I hope will inform readers and stimulate discussion and debate, both on this blog and elsewhere, including on Twitter.

Archive for October 24th, 2012

Background

The second in the series of guest blog posts, each of which summarises an SEO analysis of a repository hosted at a Russell Group university, is provided by Natalia Madjarevic, the LSE Research Online Manager. As described in the initial post, the aim of this work is to enable repository managers to openly share their experiences of using MajesticSEO, a freely available SEO analysis tool, to analyse their institutional repositories.

The London School of Economics and Political Science

Background

LSE is a specialist university with an international intake and a global reach. Its research and teaching span the full breadth of the social sciences, from economics, politics and law to sociology, anthropology, accounting and finance. Founded in 1895 by Beatrice and Sidney Webb, the School has a reputation for academic excellence. The School has around 9,300 full time students from 145 countries and a staff of just under 3,000, with about 45 per cent drawn from countries outside the UK. In 2008, the RAE found that LSE has the highest percentage of world-leading research of any university in the country, topping or coming close to the top of a number of rankings of research excellence. LSE came top nationally by grade point average in Economics, Law, Social Policy and European Studies and 68% of the submitted research outputs were ranked 3* or 4*.

LSE Research Online – a short history

LSE Research Online (LSERO) was set up in 2005 as part of the SHERPA-LEAP project. The aim of the project was to create EPrints repositories for each of the seven partner institutions, of which LSE was one, and to populate those repositories with full-text research papers. In June 2008 the LSE Academic Board agreed that records for all LSE research outputs would be entered into LSE Research Online. We have no full-text mandate but authors are encouraged to provide full-text deposits of journal articles in pre-publication form, clearly labelled as such, alongside references to publications. Research outputs included in LSE Research Online appear in LSE Experts profiles automatically, thereby reusing data collected by LSE Research Online.

LSE Research Online is to be the main source of bibliographic information for the Research Excellence Framework (REF) in 2014. This has served to further increase the impetus for deposit and visibility of the repository in the School and we have various repository champions throughout the School across departments.

LSE Research Online size and a brief look at usage statistics

As of September 2012, LSE Research Online contains 33,696 records, of which 7,050 are full-text items. We include a variety of item types such as articles, book chapters, working papers, data sets, blogs and conference proceedings. We most recently began collecting LSE blogs to create a permanent home for this important content. We began tracking LSERO site usage with Google Analytics in 2007 and the site has received 2,268,135 visits since this date. According to Google Analytics, 76.55% (1,748,725 total visits) of traffic to LSE Research Online comes from searches. Only 16.13% of traffic is from referrals and 7.14% is direct traffic. We also use analog server statistics to monitor downloads; total downloads between May 2007 and September 2012 were 5,266,871.
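To give a feel for what these percentages mean in absolute terms, here is a minimal sketch that converts the reported shares into approximate visit counts. The function and variable names are illustrative only, and the results are estimates derived from the rounded percentages rather than figures taken from Google Analytics. In particular, the search estimate will not exactly match the 1,748,725 visits quoted above; the sketch assumes this is down to rounding or a slightly different reporting window and does not try to reconcile the two.

```python
# Figures quoted in the post (Google Analytics, 2007 to September 2012).
TOTAL_VISITS = 2_268_135
REPORTED_SHARES = {"search": 76.55, "referral": 16.13, "direct": 7.14}  # percent


def estimated_visits(total: int, share_percent: float) -> int:
    """Convert a percentage share into an approximate visit count."""
    return round(total * share_percent / 100)


# Approximate visits per traffic source, derived from the rounded percentages.
estimates = {
    source: estimated_visits(TOTAL_VISITS, pct)
    for source, pct in REPORTED_SHARES.items()
}

# Share of traffic not covered by the three sources named in the post.
unattributed = round(100 - sum(REPORTED_SHARES.values()), 2)

for source, visits in estimates.items():
    print(f"{source:<9}{visits:>10,}")
print(f"Share not covered by the three sources: {unattributed}%")
```

On these figures, referrals work out at roughly 366,000 visits and direct traffic at roughly 162,000, which helps put the backlink counts discussed later into context.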

Expectations of the survey

Before running the Majestic SEO report, I expected we would see plenty of traffic from Google and backlinks (i.e. incoming links) from lse.ac.uk as, understandably, these are key sources of traffic to LSERO and are indicated as such on Google Analytics. Google Analytics also points to referrals from Wikipedia and Google Scholar, and most recently, our Summon implementation which includes LSERO content. However, I was intrigued as to how LSERO would fare in an SEO analysis.

Majestic SEO survey results

The data was generated from Majestic SEO using a free account on 24th September 2012 using the ‘fresh’ index option. A summary of the results is shown below: there are 1,285 referring domains and 8,856 external backlinks. Note that the current findings can be viewed if you have a MajesticSEO account (which is free to obtain).

Figure 1: Majestic SEO analysis summary for eprints.lse.ac.uk

This includes 408 educational referring backlinks. If we look at backlinks in more detail, patterns begin to emerge:

Figure 2: Top 5 Backlinks

This illustrates a distinct majority of Wikipedia pages linking to LSERO content and yet this is only ranked as the sixth most popular source of traffic in Google Analytics.

Top referring domains, sorted by matched links, can be found in the table shown below:

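| Referring domain | Matched links | Alexa rank | Citation flow | Trust flow |
|------------------|---------------|------------|---------------|------------|
| wordpress.com    | 14,502        | 21         | 95            | 93         |
| blogspot.com     | 11,239        | 5          | 97            | 94         |
| wikipedia.org    | 349           | 8          | 97            | 98         |
| flickr.com       | 272           | 33         | 98            | 96         |
| google.com       | 225           | 1          | 99            | 99         |

Citation flow and trust flow are the two Flow Metrics reported by Majestic SEO.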

Table 1: Top 5 Referring Domains

Flickr makes a surprise appearance, with WordPress and Blogger dominating the top of the table.

In Table 1, the Top 5 Referring Domains to LSE Research Online are WordPress, Blogspot, Wikipedia, Flickr and Google. We can see the dominance of international social platforms here, with WordPress (14,502 links) and Blogspot (11,239 links) followed by Wikipedia (349 links), Flickr (272 links) and, finally, a search engine, google.com (225 links).
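The scale of that dominance is easier to see as a proportion. The following sketch is illustrative only: it uses the matched-link counts from Table 1, and the grouping of WordPress and Blogspot as "blog platforms" is my own choice for the example rather than a Majestic SEO category. It works out what share of the links from the top five domains come from the two blogging platforms.

```python
# Rows from Table 1: (referring domain, matched links).
TABLE_1 = [
    ("wordpress.com", 14_502),
    ("blogspot.com", 11_239),
    ("wikipedia.org", 349),
    ("flickr.com", 272),
    ("google.com", 225),
]

# Grouping chosen for illustration; not a category used by Majestic SEO.
BLOG_PLATFORMS = {"wordpress.com", "blogspot.com"}

# Total matched links across the top five referring domains only.
total_links = sum(links for _, links in TABLE_1)

# Matched links contributed by the two blogging platforms.
blog_links = sum(links for domain, links in TABLE_1 if domain in BLOG_PLATFORMS)

# Percentage of top-five links that come from blogging platforms.
blog_share = round(100 * blog_links / total_links, 1)

print(f"Top-five matched links: {total_links:,}")
print(f"From blogging platforms: {blog_links:,} ({blog_share}%)")
```

Note that this share is relative to the top five domains only, not to all 1,285 referring domains in the report.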

In Figure 3, Top 5 Resources in Repository (sorted by flow metrics), we can see several links to LSERO information pages including the home page and the feed of latest additions. There are, however, several direct links to full-text papers including an Economic History Working Paper on A dreadful heritage: interpreting epidemic disease at Eyam, 1666-2000. Sorting this data by number of backlinks, as shown in Table 2, the top item is the LSERO homepage with 501 backlinks. The second item is the PDF of one of our most downloaded papers of all time: The Hartwell Paper.

Discussion

So what can I draw from the results of the Majestic SEO report of LSE Research Online? Analysing the top referring domains according to the Majestic report, it seems reasonable to suggest that adding links to repository content on blogging platforms such as WordPress and Blogspot may result in an increased SEO ranking. We often link to LSERO content in various LSE Library blogs hosted on Blogspot, including New Research Selected by LSE Library.

Flickr is also listed as a top referring domain according to Majestic SEO, but running a Google search for site:flickr.com “eprints.lse.ac.uk” retrieves zero results. It’s difficult to ascertain how MajesticSEO gets this result when Google does not confirm the findings – perhaps it uses very different algorithms to Google? The MajesticSEO top referring domains indicate that blogging platforms are the main referring domains to LSERO content. However, according to our Google Analytics stats, 76.55% of traffic to LSERO is from searches.

Furthermore, the Majestic report indicates that there are 349 matched links to LSERO content on Wikipedia. “Running the search site:wikipedia.org “eprints.lse.ac.uk” in http://www.google.co.uk/ you get (on 11 October 2012) “About 92 results”. From the last page of the results, by repeating the search to include omitted results, Google ends up with 80 hits.” Searching for eprints.lse.ac.uk in http://en.wikipedia.org/wiki/Main_Page retrieves 83 hits. How does MajesticSEO retrieve such varying results?
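For anyone wishing to repeat this comparison for their own repository, the sketch below builds the same kind of site: query and sets the Wikipedia counts quoted above side by side. The helper names are hypothetical, and the code only constructs the search URL; it does not fetch any results. One caveat, offered as an assumption rather than a finding: the tools may simply be counting different things (for example, individual links versus pages containing a link), so the ratios are a prompt for further investigation rather than a measure of error.

```python
from urllib.parse import urlencode


def site_query(site: str, target: str) -> str:
    """Build a query for pages on `site` that mention `target` as a phrase."""
    return f'site:{site} "{target}"'


def google_search_url(query: str) -> str:
    """Return a google.co.uk search URL for the query (built, not fetched)."""
    return "https://www.google.co.uk/search?" + urlencode({"q": query})


query = site_query("wikipedia.org", "eprints.lse.ac.uk")
print(query)
print(google_search_url(query))

# Counts quoted in the post for links from Wikipedia to LSERO.
MAJESTIC_MATCHED_LINKS = 349
OTHER_COUNTS = {
    "Google (estimated results)": 92,
    "Google (including omitted results)": 80,
    "Wikipedia site search": 83,
}

# How many times larger the Majestic SEO figure is than each other count.
ratios = {
    label: round(MAJESTIC_MATCHED_LINKS / count, 1)
    for label, count in OTHER_COUNTS.items()
}

for label, ratio in ratios.items():
    print(f"Majestic SEO reports {ratio}x the count from {label}")
```

Swapping in another repository's hostname in the call to site_query is all that is needed to generate the equivalent searches.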

Looking at backlinks, it’s important to note that the majority of top backlinks refer to papers that have the full-text attached and often link directly to the full-text PDF, of course resulting in a direct download. In addition, the Top 5 Resources in Repository (sorted by external backlinks) as seen in Table 2 tallies with our consistently popular papers according to Google Analytics and our analog statistics.

The inclusion of repository links on domains such as Wikipedia and blogging platforms appears to have a positive impact on the relevance ranking of LSERO content. This is in addition to direct hits on the links themselves, which add directly to the site’s visitors and thus to the dissemination of LSE research outputs. However, whether we can draw firm conclusions from the Majestic report remains to be seen, particularly when its results differ so much from those found on Google.

Thanks to my colleague Peter Spring for his advice when writing this post.

About the Author

Natalia is also the Academic Support Librarian for the Department of Economics and LSE Research Lab. She joined LSE in 2011, having previously worked at libraries including UCL, The Guardian and Queen Mary, University of London. Her professional interests include Open Access, research support, REF, bibliometrics and digital developments in libraries.