another round of hydra/samvera community dependency analysis

It’s time for another round of running my tool to see what community dependencies and versions Samvera community apps are using. (Last done in August 2017).

This time I’m adding any samvera community apps I can find, not limited to sufia/hyrax or even valkyrie. That brings the total to 43 apps analyzed, a significant increase over the 28 we had before, so numbers between the two reports are not directly comparable.

Still, the majority of apps analyzed use Sufia, Hyrax, or Valkyrie. Of the 43 apps analyzed, 17 (40%) use sufia, 11 (26%) use hyrax, and 2 (5%) use valkyrie. (Might have one or two blacklight-only apps that snuck into the corpus too).

Dates of Last Commit

As before, just because a public repo exists doesn’t necessarily mean it’s in production. It could be an old version no longer in production, an experiment that never went anywhere or was never meant for production, or an in-progress app intended to eventually be in production. While my “research question” is really about apps actually in production (or perhaps in progress to get there), I don’t know of any good way to limit to this set without lots and lots of out-of-band research.

But to provide a bit more context, I’ve added a feature to summarize the last time an app in a given dependency-version-use category was updated. Just because an app hasn’t been updated in years doesn’t necessarily mean it’s not in production — some people (for better or worse) may have apps in production they haven’t touched in years. But an app that has been touched recently we at least know is “current”, whether in production, in-development with a production goal, or an experiment.
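To make that summary concrete, here is a minimal sketch of the kind of aggregation I mean. It is not the actual code from my tool: the function name `summarize_last_commits`, the sample data, and the choice of January 1, 2018 as the "recent" cutoff are all assumptions for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample data: (version of some gem an app uses, date of that app's last commit).
apps = [
    ("6.6", date(2018, 1, 15)),
    ("6.6", date(2015, 11, 2)),
    ("7.4", date(2018, 2, 1)),
    ("7.4", date(2017, 6, 30)),
    ("7.4", date(2016, 3, 9)),
]

def summarize_last_commits(apps, recent_since):
    """Group apps by dependency version and summarize their last-commit dates."""
    by_version = defaultdict(list)
    for version, last_commit in apps:
        by_version[version].append(last_commit)

    summary = {}
    for version, dates in by_version.items():
        summary[version] = {
            "apps": len(dates),                                     # apps in this category
            "oldest": min(dates),                                   # stalest last commit
            "newest": max(dates),                                   # freshest last commit
            "recent": sum(1 for d in dates if d >= recent_since),  # apps touched since cutoff
        }
    return summary

summary = summarize_last_commits(apps, recent_since=date(2018, 1, 1))
for version in sorted(summary):
    print(version, summary[version])
```

Note that only aggregate figures come out the other end; no repo names or URLs are carried through the summary.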

It’s only giving summary statistics right now, but we can see that there are definitely apps that received commits in 2018 which are still using old dependencies, including:

Sufia 6.6

hydra-editor 1.x (2.0 was released two years ago)

hydra-head/hydra-core 6.4 (latest release 10.5, a 6.x release last made in 2014)

There are definitely apps out there currently being developed and using pretty old dependencies (not a surprise), but I’m not sure how many apps this is in total, and this makes me curious to learn more about them.

I could write more sophisticated aggregate analysis, but this isn’t the first time I’ve kind of wanted to see the list of apps using, say, active-fedora 7.x, so I could go investigate them and learn more about them — what are they, what other dependencies do they have, etc?

But for now, my tool still reports only aggregate info, never listing specific repo URLs (not even to me). I don’t want anyone to feel individually shamed for their old dependencies, so I’m avoiding any non-aggregate data for now. I may eventually add it, though, once I really want to dig into questions that per-repo data would make easier to answer.

Major Version Bumps

I’m really curious about how often community apps upgrade to a new major version of dependencies like Sufia, ActiveFedora, RSolr, Blacklight, or even Rails. Of the apps using, say, Sufia 7.x, how many were created with Sufia 7.x initially, and how many were created with a 6.x or previous version and then upgraded?

I started on tooling to answer this, which we can do by fetching every single commit that touched a Gemfile.lock and analyzing them, but it requires an awful lot of requests to Github api and some analysis code. I haven’t gotten the tool to the point it can answer exactly my questions yet, but I do have a raw count of how many apps have in their history at least one major-version upgrade of an “interesting” gem.
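As a rough sketch of the analysis half of that (not my tool's actual code), suppose the Gemfile.lock contents for an app have already been fetched, one string per commit, oldest first. The helper names `locked_versions`, `major`, and `had_major_bump` are hypothetical, and the regex assumes the usual Gemfile.lock layout where locked gems sit on four-space-indented lines like `    rails (5.1.4)`.

```python
import re

# Locked gem lines are indented exactly four spaces: "    name (version)".
# Dependency constraint lines are indented six spaces and are skipped.
SPEC_LINE = re.compile(r"^ {4}(\S+) \(([^)\s]+)\)$", re.MULTILINE)

def locked_versions(lockfile_text):
    """Return {gem name: locked version string} for one Gemfile.lock."""
    return dict(SPEC_LINE.findall(lockfile_text))

def major(version):
    """Major version number, e.g. '7.4.0' -> 7."""
    return int(version.split(".")[0])

def had_major_bump(lockfile_history, gem):
    """True if the gem's major version increased anywhere in the history (oldest first)."""
    majors = [
        major(versions[gem])
        for versions in map(locked_versions, lockfile_history)
        if gem in versions  # ignore commits where the gem was not locked
    ]
    return any(later > earlier for earlier, later in zip(majors, majors[1:]))

# Hypothetical two-commit history for one app.
old_lock = """GEM
  remote: https://rubygems.org/
  specs:
    rails (4.2.7)
    sufia (6.6.0)
      rails (>= 4.0)
"""
new_lock = """GEM
  remote: https://rubygems.org/
  specs:
    rails (4.2.10)
    sufia (7.4.0)
      rails (>= 4.2)
"""

print(had_major_bump([old_lock, new_lock], "sufia"))  # sufia went 6.x -> 7.x
print(had_major_bump([old_lock, new_lock], "rails"))  # rails stayed on 4.x
```

This only answers "did a major bump ever happen"; telling apart apps created on a version from apps upgraded to it would also need the first commit's versions.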

Number of apps that did a major version bump of the listed dependency at least once:

20 of the 42 apps that use active-fedora updated it at least once. 25 of the 42 apps that use a-f are on 11.x, so I’d suspect the 20 upgraders come largely from within these ranks.

About half, 8 of the 17 sufia-using apps, have done a major version bump at least once. Only 7 sufia-using apps are on the latest/last 7.x; I don’t have analysis of the cross-over, but we know at least one app has done a sufia major version bump in its history and still hasn’t made it to 7.x. (Of course, others could have gone on to hyrax). (Exploring this kind of thing is what tempts me to reveal the actual repo ids/urls, to make it easier to manually explore).

And, 15 of the apps have done a Rails major version bump. All 43 apps analyzed use Rails. This is actually a bit smaller than I might have guessed. I suspect many of the apps not upgraded are apps that were created on Rails 4.x and remain there. Rails 4.2 (33% of analyzed apps) is still receiving patches for “major security updates” (but not “minor security” or other bugs); I think this will remain true even after Rails 5.2 is released, up until Rails 6.0 is released. 26% of apps analyzed are on Rails 4.1 or earlier, which do not even receive updates for major security vulnerabilities. 46% of apps are on Rails 5.x, which appears to be up from the August analysis, although since we increased our corpus they aren’t directly comparable.

Now vs. August

The corpus is different so we can’t compare directly (we added more apps, which may have dependencies that aren’t like the ones we had before), but we can still do a bit of comparison as long as we keep those limitations in mind.

Sufia versions remain dispersed. In both sets, Sufia-using apps split roughly into 1/3rd on 7.x, 1/3rd on 6.x, and 1/3rd on earlier versions.

active-fedora use is still fairly dispersed, but the number of apps using the most recent 11.x has gone up to 60% from 48%. Because of the different corpora, that isn’t directly comparable, but it seems like a good sign. Still plenty of apps using earlier active-fedoras of course, including a substantial number using 7.x and earlier. Zero apps under analysis use the latest active-fedora 12.0.x.

The ldp gem, used by 74% of apps analyzed, still has a latest release of 0.7.0, no 1.0 release.

I don’t seem to have included rsolr in the August analysis for some reason, but have here. Rsolr usage is still predominantly (72%) 1.x, rather than 2.x (2.0.0 was released in May 2017). I think sufia may not be compatible with rsolr 2.x.

Amongst the corpus, there are now 11 apps using hyrax, and 17 using sufia. In August’s analysis, we had 8 apps using hyrax and 17 using sufia. So there doesn’t appear to have been anything like a massive migration to hyrax from sufia apps in the past 6 months.

The full results

Still in ugly ASCII format, and perhaps getting hard to interpret with so much data. What we really need is some fancy visualizations (with various cross-tabs), but I’m not sure when/if I’ll get there. I did try to make the output clearer about some things I think were misleading or confusing some people before.