Innovation and best practices for the Web

About this Blog

The blog is written by Brian Kelly. Brian is the Innovation Advocate based at CETIS, University of Bolton.

This blog functions as an open notebook which provides personal thoughts, reflections and observations on the role of the Web in higher and further education which I hope will inform readers and stimulate discussion and debate, both on this blog and elsewhere, including on Twitter.

The importance of SEO rankings for surfacing content hosted in institutional repositories can be gauged from the responses to a query I posted on the JISC-Repositories JISCMail list: “Does anyone have any statistics on the proportion of traffic which arrives at institutional repositories from Google?”. I asked a similar question on Twitter and found that mature research repositories seem to get about 50-80% of their traffic from Google. This aligns with the findings reported by Les Carr for the University of Southampton back in 2006: “the majority of repository use, if I can equate eprint downloads with repository use, is due to external web search engines (64%)”. Indeed, since it has been reported that direct downloads of PDFs hosted in repositories may not be recorded unless Google Analytics has been configured appropriately, such figures may be an underestimate!

In light of the importance of Google in supporting repositories in their mission of making research papers easily accessible to others it will be useful to gain a better understanding of the factors which contribute to supporting the discoverability of the content hosted in institutional repositories.

The survey described in this post reports on summary SEO findings for the 24 Russell Group universities. The aims of the survey are to provide a benchmark for comparisons with surveys which may be carried out in the future, to attempt to identify any interesting usage patterns which may help to enhance the effectiveness of institutional repositories and to identify the highest ranked domains which provide links to institutional repositories.

Survey Using MajesticSEO

The data was collected on 27-28 August 2012 using the MajesticSEO service. Note that the current findings can be obtained by following the link in the final column. The findings can be viewed if you have signed up to the free service.

The eScholar repository at the University of Manchester is hosted at http://www.manchester.ac.uk/escholar/. Figures which are available for this home page are given. However, since the domains with incoming links may refer to any page hosted on the manchester.ac.uk domain, the figures for linking domains are not given in order to avoid skewing the findings.

The DCS repository at the University of Sheffield is hosted at http://www.shef.ac.uk/dcs/research/publications. Figures which are available for this home page are given. However, since the domains with incoming links may refer to any page hosted on the shef.ac.uk domain, the figures for linking domains are not given in order to avoid skewing the findings.

The YODL repository of the University of York is hosted at http://dlib.york.ac.uk/yodl/app/home/index. Figures which are available for this home page are given. However, since the domains with incoming links may refer to any page hosted on the dlib.york.ac.uk domain, the figures for linking domains are not given in order to avoid skewing the findings.

Table 2 gives the total number of links from the high-ranking domains which are listed in the survey, together with the Alexa ranking for these domains. Note that Google.com has the highest Alexa ranking and is listed at number 1. Figure 1 shows the significance of links from blog platforms compared with the other most highly-ranked domains.
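For anyone wishing to repeat this kind of tally, the following is a minimal sketch of how per-repository "top five linking domains" lists might be summed into a single total per domain. The function name, the data layout and the numbers are all my own assumptions for illustration; they are placeholders and not the survey data or anything exported by MajesticSEO.

```python
from collections import defaultdict


def total_links_by_domain(top_domains_per_repo):
    """Sum link counts per linking domain across all repositories.

    `top_domains_per_repo` maps a repository host to a list of
    (linking_domain, number_of_links) pairs. This layout is an
    assumption made for the sketch, not a MajesticSEO export format.
    """
    totals = defaultdict(int)
    for domains in top_domains_per_repo.values():
        for domain, links in domains:
            totals[domain.lower()] += links
    # Largest total first, so the most significant domains head the list.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Placeholder figures only: these are NOT the survey findings.
example = {
    "repo-a.example.ac.uk": [("wordpress.com", 120), ("blogspot.com", 80)],
    "repo-b.example.ac.uk": [("blogspot.com", 50), ("wikipedia.org", 30)],
}

for domain, links in total_links_by_domain(example):
    print(domain, links)
```

The Alexa ranking for each domain would then need to be looked up separately and added alongside each total.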

Discussion

In a previous post I suggested that since LinkedIn.com is so widely used across Russell Group Universities, encouraging researchers to provide links to their papers hosted in their institutional repository would enhance the visibility of papers to Google, especially since LinkedIn has such a high Alexa ranking (it currently is listed at number 13 in the global ranking order).

However, LinkedIn does not appear to have a significant presence according to the findings provided by MajesticSEO (although the free version only lists the top five domains).

Based on the information obtained in the survey it would appear that two blog platforms, WordPress.com and Blogspot.com, are primarily responsible for driving traffic to institutional repositories, as they have both high Alexa rankings and large numbers of links to the repositories.

Following these two platforms, but a long way behind, we find Wikipedia and the BBC and then, perhaps somewhat confusingly, Google itself (perhaps links from Google Scholar). The presence of media sites such as the BBC, CNN and the Guardian suggests that researchers (or their media advisers) are doing a good job in ensuring that these organisations provide links to original research papers when stories about university research are being covered in the media.

But perhaps the most noticeable finding is that only one university Web site (Oxford’s) is included in the list of the top 5 domains across all of the Russell Group universities. The low Alexa ranking (6,764) for the Oxford University Web site in comparison with the other sites listed (which have Alexa rankings ranging from 1 to 259) suggests that links from university Web sites, even those of prestigious universities such as Oxford, will not have a significant impact on Google search results. It should also be noted that links from the University of Oxford Web site will not provide SEO benefits to the University of Oxford’s repository, which is hosted in the same domain (ox.ac.uk).
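The same-domain point could be checked mechanically when looking through a list of linking domains. The sketch below is a deliberately naive illustration: it assumes hosts under a two-label suffix such as ac.uk and simply compares the last three labels of each host name. The function names and the repository host name used in the example are hypothetical; a more robust version would consult the Public Suffix List rather than counting labels.

```python
def registrable_domain(host, suffix_labels=2):
    """Return the suffix plus one further label, e.g. 'ox.ac.uk'.

    Naive assumption: the public suffix has `suffix_labels` labels
    ('ac.uk' has two). This does not consult the Public Suffix List,
    so it is only suitable for hosts known to share that suffix.
    """
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-(suffix_labels + 1):])


def is_internal_link(linking_host, repository_host):
    """True if both hosts fall under the same institutional domain."""
    return registrable_domain(linking_host) == registrable_domain(repository_host)


# 'repository.ox.ac.uk' is a hypothetical host name used for illustration.
print(is_internal_link("www.ox.ac.uk", "repository.ox.ac.uk"))   # same domain
print(is_internal_link("www.shef.ac.uk", "repository.ox.ac.uk"))  # different domain
```

Links flagged as internal in this way could then be set aside before comparing the remaining external domains.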

Limitations of this Survey

It should be noted that these conclusions are based on just one SEO tool and that only a small selection of the findings is available in the free version. A more comprehensive survey would make use of the licensed version of the service, and of other SEO tools in order to compare the findings.

In addition, Google do not publish the algorithms by which their search results are ranked, so there can be no guarantee that the findings provided by SEO tools will relate directly to users’ experiences of using Google.

In order to relate these findings to the ways users access resources hosted on a repository there will be a need to examine usage statistics for repositories. It would be interesting to see if the downloads for the most popular items show any correlation with links from the services listed above.
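One way such a comparison might be approached is with a rank correlation between download counts and inbound link counts for the same set of items. The following is a minimal sketch of Spearman's rank correlation in plain Python. Note that Spearman's coefficient is my own suggestion rather than a method used in this survey, and that the function names and sample figures are placeholders for illustration, not real repository statistics.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Placeholder figures only: NOT real download or link statistics.
downloads = [950, 400, 1200, 75, 310]
inbound_links = [12, 5, 20, 1, 9]
print(round(spearman(downloads, inbound_links), 3))
```

A value close to 1 would suggest that the most-linked items also tend to be the most downloaded, although a correlation on its own would not show that the links caused the downloads.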

Survey Paradata: The findings given in Table 1 were collected on 27-28 August 2012 using the free version of MajesticSEO. The Alexa rankings listed in Table 2 were obtained from the Alexa service and collected on 28 August 2012. Where the findings from MajesticSEO were incomplete, due to the repository not being hosted on the root of a repository sub-domain, this information was recorded and any data collected was not included in further analysis.

15 Responses to “MajesticSEO Analysis of Russell Group University Repositories”

Dixon Jones said

Hi Brian, I am a director here at MajesticSEO. Thanks for using our data for this research. If we can help with a follow up then I would be very happy to help. Track me down @Dixon_Jones on Twitter. In particular, looking at the comparative Trust Flow metrics would be an interesting comparison, since trust is a weighted metric that passes through multiple link iterations, meaning that large numbers of low quality links are less influential on the resulting 0-100 scores. By contrast, Citation Flow is a “Link heavy” metric… so urls or sites with high trust and low citation flow MIGHT indicate narrow centres of excellence.

Thanks for the comment. I’m now following you on Twitter (I’m @briankelly).

Funnily enough, I did initially include the Trust Flow and Citation Flow values in the table. However, since the FAQ didn’t really provide a great explanation of what these terms mean and how they are obtained, I decided to omit this information so that the post was focussed on the figures which are easily understood.

On further reflection I wonder if a significant number of links to the repositories are from link farms which are hosted on Blogspot or WordPress. I guess it would be useful to be able to filter out links which are felt to be from untrustworthy sources. Is that possible?

I’m working with a small number of repository managers who will be in a position to provide such contextual information. I prefer working in an open fashion, providing evidence as the work progresses, so that flaws in the methodology can be spotted at an early stage.

[…] Not about OERs but similar questions are of interest. Investigation of SEO Rankings of Institutional Repositories There is a need “to investigate whether links [from popular social media services] are responsible for enhancing SEO rankings of re… "The survey reports that two blogging platforms appear to be primarily responsible for providing the high ranking which may drive traffic to repositories." The two blogging platforms in question are WordPress.com and Blogger. One question, I suppose, is how much traffic would the publications get if they were exposed directly through these platforms compared to being in the repository. […]

Ann said

[…] what patterns of usage for searches for university Web sites do we find? In a recent survey of the search engine rankings, it was observed that only one institutional Web site (at the University of Oxford) was featured in […]

[…] post published in August 2012 on a MajesticSEO Analysis of Russell Group University Repositories highlighted the importance of search engine optimisation (SEO) for enhancing access to research […]

It would be very useful to benchmark the results and to analyse the new link flow metrics that have been added since then.
This would give another dimension to the outcome and give a little more weighting to the question of which link metrics added value.