Greenkeeper.io Talks about Node.js Dependency Tracking

Greenkeeper.io is a project that helps you keep your Node.js project up-to-date by keeping an eye on its dependencies and updating your code when they change (via a pull request). Greenkeeper is written by our friends at Neighbourhood.ie, who also make Hoodie, the offline-first application framework, and can usually be found building things with Apache CouchDB™ or contributing to the project itself.

This week I interviewed the Greenkeeper team to get some background on the project and to get a sneak peek at how it works and what technologies it uses.

GLYNN: Before we get to Greenkeeper.io itself, let's describe the problem that Greenkeeper.io is aiming to solve. Let's say I have a Node.js project that depends on a number of npm projects (listed in my project's package.json file). Would you please elaborate on the problems you can get into managing change in such a Node.js project?

Greenkeeper.io: Of course! The way dependencies are specified in npm (and elsewhere) relies on a practice called semantic versioning. You’ve seen this expressed in the usual three-part version numbers like 3.7.1, also called the major (3), minor (7) and patch (1) version. When a module fixes a bug that doesn’t otherwise affect its public API, we should increment the patch version. When a module introduces a new feature, like a new API call, but doesn’t change any of the existing ones, we should increment the minor version. When, however, a module’s public API changes in a way that a previously-working use-case stops working, or works in another way, we should increment the major version.

The reasoning is this: As long as only the minor or patch versions increase, we can keep depending on that module, because we know it doesn’t break our code. But when a new major version comes out, we need to see what changed and adapt the code. To make this even clearer, we like to call the three individual numbers the breaking, feature, and patch version, because it removes room for interpretation of what we see as major or minor.
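To make the breaking/feature/patch vocabulary concrete, here is a minimal sketch that labels the step from one version to another. The function names are hypothetical and are not part of Greenkeeper or npm. It assumes plain major.minor.patch strings, so pre-release tags are out of scope, and 0.x versions (where semantic versioning allows breaking changes at any time) get no special treatment.

```javascript
// Split a plain "major.minor.patch" string into three numbers.
function parseVersion(version) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) {
    throw new Error(`Not a plain major.minor.patch version: ${version}`);
  }
  // match[0] is the whole string; the three captures follow it.
  const [, major, minor, patch] = match.map(Number);
  return { major, minor, patch };
}

// Label the step from one version to a newer one, using the
// interview's vocabulary. Returns 'none' if `to` is the same or older.
function classifyUpdate(from, to) {
  const a = parseVersion(from);
  const b = parseVersion(to);
  if (b.major > a.major) return 'breaking';
  if (b.major === a.major && b.minor > a.minor) return 'feature';
  if (b.major === a.major && b.minor === a.minor && b.patch > a.patch) {
    return 'patch';
  }
  return 'none';
}

// By convention (and only by convention) feature and patch updates
// should not break code that depends on the module.
function isSafeByConvention(from, to) {
  const kind = classifyUpdate(from, to);
  return kind === 'feature' || kind === 'patch';
}
```

Note that `isSafeByConvention` only reflects what the version number promises. Whether the promise holds is exactly what still has to be tested.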

So, why do we need to go through so much trouble then? I’m glad you ask! :) The whole system relies on the fact that people who release their modules assign the right version numbers, and adhere to the semantic versioning convention. And that’s where things get messy. At this point, a human has to decide whether something is a bugfix, a feature, or a breaking change. And more often than not, a bugfix and breaking change look awfully similar. Greenkeeper will help you to verify and test these new versions in isolation, so your software keeps working.

GLYNN: Right. So my project relies on, say, 3 other npm projects (which in turn may depend on dozens of others, recursively). To ensure that I get all the latest changes that are published in my dependent projects, I am obliged to check each project periodically to see if there's an update.

Recently I discovered libraries.io which detects changes in packages I wish to follow and emails me when they change. This has helped me greatly in tracking changes to my dependencies. How does Greenkeeper improve on that approach?

Greenkeeper.io: First off, we are friends with Andrew who runs libraries.io and we love what he’s doing there. In case you don’t know, they handle the use-case you describe (among other things), but for many more programming languages than just JavaScript hosted on npm.

In the process of keeping dependencies up to date, discovering when a dependency did update is only a small part of the game. The way bigger part is installing the new dependency, running tests, committing the changes, and pushing them.

With Greenkeeper you:

don’t have to track dependencies or wait for notification emails

and get a pull request on your GitHub repo

If you have a continuous integration service like Travis CI set up, it runs your tests, and you can see whether it is safe to upgrade right there in the pull request. All that’s left is to hit the merge button.

For a typical open source project, this maintenance work is about the least exciting thing to do, and if you work on your project in your spare time, you might opt for solving harder problems rather than doing this mundane chore. With Greenkeeper, you get to do both:

keep your dependencies up to date

and work on the fun stuff

In a commercial project, dependency updates can easily take up many hours of engineer-time per month. Reducing this to a click of a merge button for most cases is a great time-saver.

And we can go one step further: say Node.js 5.0.0 comes out. Then we can add this to everybody’s .travis.yml file, send a pull request and you get to see whether your module or project is forward compatible or not, without having to fear that you might screw up your local development environment. We already have a long list of little things that will improve module developers’ lives.
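The .travis.yml idea can be illustrated with a small sketch. It assumes the file has already been parsed into a plain object (YAML parsing is left out) and that the project lists its runtimes under Travis CI's node_js key. The helper name addNodeVersion is hypothetical and says nothing about how Greenkeeper itself implements this.

```javascript
// Return a copy of a parsed .travis.yml with one more Node.js version
// in its test matrix. The input object is never modified.
function addNodeVersion(travisConfig, version) {
  const wanted = String(version);
  const current = travisConfig.node_js;
  // node_js may hold a single value or a list; normalise to a list of strings.
  const versions = current === undefined ? [] : [].concat(current).map(String);
  if (versions.includes(wanted)) {
    return travisConfig; // already tested against this version, nothing to propose
  }
  return { ...travisConfig, node_js: [...versions, wanted] };
}

// Usage: the changed object would be serialised back to YAML and
// proposed as a pull request, so CI runs against the new runtime.
const before = { language: 'node_js', node_js: ['0.12', '4'] };
const after = addNodeVersion(before, '5');
```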

GLYNN: So let's talk through the steps: first I install Greenkeeper (npm install -g greenkeeper) then I run greenkeeper login to authorise Greenkeeper to access my GitHub account. After that it's a case of running greenkeeper enable on each project I want to be assisted by Greenkeeper. Can you tell me what's going on in each of these steps? And can you explain why the project gets a pull-request to "pin the dependencies"?

Greenkeeper.io: Certainly! Your main interface with Greenkeeper is our open-source command line tool called greenkeeper. It is written in Node.js and can be installed via npm: npm install -g greenkeeper. The -g makes sure that we can type greenkeeper in our terminal after the installation. Running greenkeeper login opens your browser and redirects you to GitHub, so you can grant us the required access to your repositories. This way we can operate on them without having to know your GitHub password. After this one-time step is done, you can go back to your terminal. Now we're ready for the fun part, where the machines take the work out of your hands.

To enable your first project, navigate to its folder and run greenkeeper enable. From now on, we’ll monitor your module’s dependencies and send you a pull request as soon as an update is published. On a very basic level, this is now an elaborate notification system, where instead of manually doing all the chores described above, you've got everything ready to work on – in real-time. All you have to do, to keep your dependencies up to date, is check out a branch and test it, which saves you a ton of time already. With continuous integration configured though, you can make this a fully automated process, to the point where you just have to give your final go and hit the merge button when all lights are green.

Right after enabling a project for the first time, we'll also send you an update for all your existing dependencies, just to make sure we have a tidy playing field from the beginning. We're also pinning your dependencies down to a specific version, because it is a great way to keep them in a state you know about. That's only feasible because you can keep them up to date easily, with the power of Greenkeeper. One of our main goals is to make sure that you're immediately aware of when things break, and we have some great ideas on how to make that work with version ranges as well. Look out for what we're launching next.
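To show what "pinning" means for a package.json, here is a minimal sketch with hypothetical helper names. It simply strips a leading ^ or ~ from simple ranges and leaves everything else alone. That is a simplification: it pins to the version written in the range, which may be older than the latest release, whereas a real tool would more plausibly resolve the newest published version first. It is not taken from Greenkeeper's code.

```javascript
// Turn a simple range such as "^1.4.2" or "~3.10.1" into the exact
// version it starts from. More complex specifiers (">=1.0.0", "1.x",
// git URLs, tags) are returned unchanged.
function pinRange(range) {
  const match = /^[\^~](\d+\.\d+\.\d+)$/.exec(range);
  return match ? match[1] : range;
}

// Return a copy of a parsed package.json with its dependencies pinned.
// The input object is not modified.
function pinDependencies(pkg) {
  const pinned = { ...pkg };
  for (const field of ['dependencies', 'devDependencies']) {
    if (!pkg[field]) continue;
    pinned[field] = {};
    for (const [name, range] of Object.entries(pkg[field])) {
      pinned[field][name] = pinRange(range);
    }
  }
  return pinned;
}

// Usage with an illustrative package.json fragment.
const examplePkg = {
  name: 'my-project',
  dependencies: { async: '^1.4.2', lodash: '~3.10.1', request: '>=2.0.0' },
};
const pinnedPkg = pinDependencies(examplePkg);
```

With exact versions in place, every change to a dependency arrives as an explicit, testable change to package.json rather than happening silently at install time.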

GLYNN: Part of the trouble is that doing npm install --save async causes your package.json to get an "async": "^1.4.2" entry which means 'please install async, where the version is >=1.4.2 and <2.0.0', leaving your project open to regressions in future versions of the dependent package. Then I saw this tweet this morning:

GLYNN: Anyway, I wanted to switch the conversation to storage. I should declare my interest here! I work for IBM Cloud Data Services who, as you know, offer Cloudant, a multi-node Apache CouchDB fork, as a service. Are you able to enlighten us as to what Greenkeeper uses as a database and how you arrived at this decision?

Greenkeeper.io: We have to make sure that this doesn’t sound like a cheap set-up, but we are in fact using Cloudant for Greenkeeper. We are rabid fans of CouchDB and part of its development community even, but for this product we needed a reliable CouchDB host that we can trust. We know we’ll be able to grow with Cloudant as our business grows and we couldn’t be happier supporting our friends.

GLYNN: I know that npm uses CouchDB to store the metadata about the npm packages. Do you subscribe to their "changes feed" to get notifications of changes to packages that your customers' projects depend on?

Greenkeeper.io: That is exactly how it works. CouchDB’s changes feed is one of its most compelling features, yet so few people know about it. Check it out! :)
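For readers who have not seen a changes feed, the sketch below works on the JSON shape CouchDB returns from its _changes endpoint: a results array of rows, each with an id, a seq and a list of changes, plus a last_seq marker. It assumes each document id is a package name and that the set of watched packages is already known. The function name and the sample data are hypothetical, and this is not a description of Greenkeeper's internals.

```javascript
// Given one parsed _changes response and the set of package names we
// care about, report which watched packages changed and where to
// resume from on the next request.
function updatedPackages(changesResponse, watched) {
  const names = new Set();
  for (const row of changesResponse.results) {
    if (row.deleted) continue; // the document was removed, nothing to update to
    if (watched.has(row.id)) names.add(row.id);
  }
  // last_seq can be passed as the `since` parameter of the next
  // _changes request so that no change is processed twice.
  return { names: [...names], since: changesResponse.last_seq };
}

// Usage with made-up sample data in the _changes response shape.
const sampleResponse = {
  results: [
    { seq: 101, id: 'async', changes: [{ rev: '12-abc' }] },
    { seq: 102, id: 'left-out', changes: [{ rev: '3-def' }] },
    { seq: 103, id: 'lodash', changes: [{ rev: '40-123' }], deleted: true },
    { seq: 104, id: 'async', changes: [{ rev: '13-456' }] },
  ],
  last_seq: 104,
};
const update = updatedPackages(sampleResponse, new Set(['async', 'lodash']));
```

Each name in the result would then trigger the install, test and pull-request steps described earlier in the interview.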

GLYNN: Would you briefly detail Greenkeeper's business model and the difference between your free and paid tier?

Greenkeeper.io: We are greatly inspired by GitHub and Travis CI. Their model of allowing public repos to be free has transformed Open Source as we know it. We’d like to do our part in this and follow the same model. As soon as you want to use the same service for private code, we’re happy to take your money and support our open source infrastructure with it.

All operations in Greenkeeper are run on queues and there is a queue for open source projects that we work off as fast as we can, but paid clients get a dedicated queue with guaranteed response times. In addition to this simple separation, Open Source users that don’t want to potentially wait for their turn on the Open Source queue can pay a $5 per month supporter plan that makes their projects skip the queue.

GLYNN: As dedicated open-source contributors, are you keen to solicit bug reports and contributions from the developer community at large?

Greenkeeper.io: Always! Our CLI is Open Source and we are using its issue tracker for feature requests and bug reports. We also have a dedicated user-support setup for operational issues that you can access via greenkeeper support. We’d love to hear from you!

Before joining IBM Cloud Data Services, Glynn served as the Head of IT and Development for Central Index, creating a white-label frontend for a NoSQL business directory (using PHP, Node.js, MySQL, Redis, Cloudant, and Redshift). His experience includes writing CRM systems, "find my nearest" indexes, e-commerce platforms, and a phone tracking app. He also built a transport route-planning system in Java. Glynn got his start in Research and Development for the steel industry, creating control and instrumentation systems. Outside work, Glynn enjoys guitars, football, crosswords, and Victorian fiction.