Benchmarking Higher Ed Sites ...

I recently revisited an old pet project of mine that benchmarks and compares higher education sites. It is obviously a work in progress, but it is shaping up fairly well at this stage and I'm getting to a point where I'd like to secure some feedback. If ya get a chance, please take a look and let me know what you think. It's based on Drupal 7 (D7), represents around 3,200 US higher ed sites, and has a range of data to sift through.

Comments

I work at the University of Oregon and it is fun to see the page stats. I know a few of the developers who manage the homepage. I'll pass this along to them. Feedback: the colors you use on the server information graph make it difficult to distinguish between platforms.

Text analysis is just cool. I like the metrics chosen. Anyhow, great project. Keep up the nice work.

Greatly appreciated. Sorry I missed UC Merced. I got the data from IPEDS ... if I remember correctly, UC Merced is a newer institution, so that's probably why it isn't there. I'll work on getting that added soon.

I've always hoped to have longitudinal data collected. Just have to make sure I can afford to scale it ;) ... and need to invest in the work to schedule the snapshots (easy). My hope is just to use Drupal's node revision system ... then w/ diffs, Views, etc., there is potential for all kinds of interesting data. Also thinking about backfilling revisions with data from the Internet Archive ... that'll only get me basic DOM, HTML, usage (none of the performance/header fun), but it could be interesting nonetheless. I'd love to see a chart with adoption of Google Analytics over time.
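As a rough illustration of the Google Analytics adoption chart idea, here is a minimal Python sketch that flags archived HTML snapshots containing a known Google Analytics loader script and tallies adoption per year. The function names, the `(year, html)` input shape, and the list of script markers are all assumptions made for this example (and the markers are not exhaustive); it is not how the project itself detects analytics.

```python
import re

# Assumed, non-exhaustive markers of Google Analytics loader scripts.
GA_PATTERN = re.compile(
    r"google-analytics\.com/(?:urchin|ga|analytics)\.js"
    r"|googletagmanager\.com/gtag/js",
    re.IGNORECASE,
)


def uses_google_analytics(html):
    """Return True if the HTML references a known GA loader script."""
    return GA_PATTERN.search(html) is not None


def adoption_by_year(snapshots):
    """snapshots: iterable of (year, html) pairs (hypothetical shape).

    Returns {year: percent of that year's snapshots that use GA}.
    """
    totals, hits = {}, {}
    for year, html in snapshots:
        totals[year] = totals.get(year, 0) + 1
        hits[year] = hits.get(year, 0) + int(uses_google_analytics(html))
    return {year: 100.0 * hits[year] / totals[year] for year in sorted(totals)}
```

Feeding it one snapshot per site per year would give the data points for an adoption-over-time chart.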

I'm playing around with a DOM hashing technique too ... hoping I might be able to use that to give me a heads up when pages are redesigned.
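One possible way to approach DOM hashing, sketched in Python: reduce a page to its sequence of opening and closing tags (ignoring text and attributes) and hash that skeleton, so that content edits leave the hash unchanged while structural changes alter it. This is only an assumption about what such a technique might look like; the class and function names are hypothetical and the actual approach used in the project is not described.

```python
import hashlib
from html.parser import HTMLParser


class TagSkeleton(HTMLParser):
    """Collects open/close tag names, ignoring text and attributes."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        self.tokens.append("<" + tag + ">")

    def handle_endtag(self, tag):
        self.tokens.append("</" + tag + ">")


def dom_hash(html):
    """Return a SHA-256 hex digest of the page's tag skeleton."""
    parser = TagSkeleton()
    parser.feed(html)
    parser.close()
    skeleton = "".join(parser.tokens)
    return hashlib.sha256(skeleton.encode("utf-8")).hexdigest()
```

Comparing `dom_hash` values between two snapshots of the same page would flag a structural change. An exact hash is sensitive to even small markup tweaks, so a real redesign detector would likely need something more tolerant.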

This is very cool - love it! I started something a little similar for academic libraries about a year ago - I was focused on the IA, the layout of the page, types of navigation, common navigational elements, etc. Mine was pretty manual, but you've inspired me to figure out how to automate it. I agree that a history of results would be even cooler.
Cheers,

I have an interest in academic libraries as well. Was thinking about things like ...

1) Percentage of sites that mention the library on their home page
2) Creating a semi-structured series of taxonomy terms for "common" child pages (libraries, jobs, portal, department pages, etc).
3) Allowing anyone to add child pages to the index
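
Item 1 in the list above could be approximated with a simple text match. The Python sketch below assumes the home page text has already been fetched into a dict keyed by site; that input shape, the regular expression, and the function name are illustrative assumptions only.

```python
import re

# Matches "library" or "libraries" as whole words, case-insensitively.
LIBRARY_RE = re.compile(r"\blibrar(?:y|ies)\b", re.IGNORECASE)


def library_mention_rate(home_pages):
    """home_pages: dict mapping site -> home page text (hypothetical shape).

    Returns the percentage of sites whose home page mentions the library.
    """
    if not home_pages:
        return 0.0
    hits = sum(1 for text in home_pages.values() if LIBRARY_RE.search(text))
    return 100.0 * hits / len(home_pages)
```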

Just so many angles to consider here ... fun stuff, but it takes time.

See the comment above about longitudinal data. I'm all about going there, just have to find the time :)

Lots of metadata to sift through ... you might be surprised about DC, but not sure ... I'll be adding a full-text source search soon(ish), so people can delve a bit deeper and more creatively than using the ole canned reports that I have available today.

It would be interesting to pull in some data from builtwith.com about the sites since builtwith.com tries to do analysis of the technologies in use (e.g. CMS, programming language, javascript libraries) - see http://builtwith.com/arapahoe.edu for one of the sites in your index I'm familiar with.

Indeed. I'll have to take a deeper look at their API. Unfortunately it appears to be limited to 500 calls a month, which, for my needs, isn't much. Still, I'd rather use an API than try to reproduce parts of it from scratch. I already have parts of the JS library, analytics engine, etc. detection figured out ... just have to polish it up. Still, it would take a ton of time to do everything they're doing on my own ... hopefully they'll be willing to work with me on the API limit.
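
To put that limit in perspective, here is a quick back-of-the-envelope calculation in Python. It uses the roughly 3,200 sites mentioned in the post and assumes one API call per site, which is a guess rather than a documented property of the builtwith.com API; the function name is hypothetical.

```python
import math


def months_for_full_pass(sites, calls_per_month, calls_per_site=1):
    """Months needed to query every site once under a monthly call cap."""
    return math.ceil(sites * calls_per_site / calls_per_month)


# About 3,200 sites at 500 calls a month, assuming one call per site.
print(months_for_full_pass(3200, 500))  # 7
```

Under those assumptions a single pass over the index would take about seven months, which rules out regular snapshots.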

Thanks! I've always hoped to pursue something interesting enough for a DBUG prezo ... unfortunately, it looks like I'm moving outta state. I'll be happy to share some thoughts on building it ... it wasn't too bad ... just need to carve out the time. :)

Really enjoyed poking around your work to see how the .edu site I'm responsible for stacked up. The textual analysis was an interesting metric, but the server/developer visitors such as myself could use some explanation of their meaning & impact (maybe a short summary sentence & link to the corresponding wikipedia article)

Ya, I was running on a different host at one point and then ran into some threshold issues and lost the info. I'll look into that at some point. It was completely foreign territory for me too ... I'm not even sure how interesting it is at this point ... I think it has more potential when I (hopefully) expand later.

Hi,
Really interested in having you add in Wellesley College for benchmarking. We are newly up in Drupal 7 and using a module called Monster Menus to help us with departmental hierarchy/ownership issues.

Heh, well, I wound up taking it down as it was costing me a bit more than I wanted to pay. I'd love to bring it back ... had grand plans for it, but in the end, I just couldn't justify it. Sorry! Glad to hear that someone else found it interesting tho'