Are node_modules in git still necessary with npm shrinkwrap?

I was in an internal discussion with some folks about managing npm dependencies and whether or not you should check them in. That led to a reference to @mikeal’s great post here. In that post, the recommendation is basically this:

1. If you are authoring a module, you should not check in your dependencies. You should have a package.json with a liberal versioning policy on dependencies. The main reason stated for this is to distribute integration testing and ensure that the latest versions are really being put to the test. Beyond that, I would say it also lets developers take advantage of the latest and greatest features modules have to offer.

2. If you are authoring an application, you SHOULD check in your dependencies. The main reason given for this is predictability in your production environment. package.json only lets you specify the versioning policy for your top-level dependencies: you can pin those to a specific version, but you cannot use package.json to lock down your dependencies’ dependencies. This means your environment is unpredictable, and you might find out at the last second that one of your child dependencies breaks your app! Not a good thing at all! Because of that, the recommendation is to check in all your dependencies, thus ensuring there are no surprises at deployment time.

And then came npm shrinkwrap.

In the past year an awesome new (though not necessarily widely known) feature was added to npm to solve the problem identified in 2: “npm shrinkwrap”. When you run “npm shrinkwrap” on your app, npm generates an npm-shrinkwrap.json that locks down the full transitive closure of all of your dependencies. It essentially locks things down so you get that predictability I was talking about above.

For example, here is the example used on the node.js site: an npm-shrinkwrap.json file generated from the package.json for module A, which depends on B, which in turn depends on C.
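As a rough sketch of what such a file holds, the snippet below embeds a shrinkwrap-style object for A, B and C and walks it to list every pinned version. The nesting follows the npm-shrinkwrap.json layout as I understand it; the version numbers and the listPinned helper are illustrative assumptions, not anything npm ships.

```javascript
// Shrinkwrap-style data for A -> B -> C.
// The version numbers are illustrative assumptions.
const shrinkwrap = {
  name: "A",
  version: "0.1.0",
  dependencies: {
    B: {
      version: "0.0.1",
      dependencies: {
        C: { version: "0.0.1" }
      }
    }
  }
};

// Hypothetical helper: walk the nested "dependencies" maps and collect
// one "name@version" line per package, indented by depth.
function listPinned(node, name, depth = 0, out = []) {
  out.push("  ".repeat(depth) + name + "@" + node.version);
  const deps = node.dependencies || {};
  for (const depName of Object.keys(deps)) {
    listPinned(deps[depName], depName, depth + 1, out);
  }
  return out;
}

const pinned = listPinned(shrinkwrap, shrinkwrap.name);
console.log(pinned.join("\n"));
```

The point to notice is that C has an exact version recorded even though A never mentions C in its own package.json.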

Works in Azure Websites and other cloud providers

In case you are wondering, if you push an npm-shrinkwrap.json to Windows Azure Websites or to other PaaS providers like Nodejitsu and Heroku, it should work. I verified this for Azure, and I am assuming the same holds for the others, since npm automatically recognizes the shrinkwrap file’s presence.

Is it still recommended to check in your dependencies for your app?

The introduction of shrinkwrap raises the question: should you still check in your dependencies? My personal leaning is that it is no longer necessary if you use shrinkwrap. So I follow these guidelines.

1. If you are publishing a module, you should not check in your dependencies. You should have a package.json with a liberal versioning policy on dependencies.

2. If you are publishing an application, you don’t necessarily need to check in your dependencies. You should use an npm-shrinkwrap.json, which locks down your dependencies.

Note: A clear exception to 2 is if you are using private forks of modules. In those cases you will need to check in the forked modules somewhere. Conceptually, you could still have your shrinkwrap file pull those forks from a private repo (npm supports pulling from GitHub repos) so that they are still managed via the shrinkwrap file. Or you could mix and match: check just the forked modules in to the main repo, and pull the ones that are not forked via the shrinkwrap file.
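To make the mix-and-match idea a bit more concrete, here is a minimal sketch. The package names, the GitHub URL, the commit hash and the isGitSpec helper are all hypothetical; the only thing taken from npm itself is that a dependency value may be a git URL instead of a version.

```javascript
// Hypothetical dependency map, shaped like the "dependencies" section
// of an app's package.json. Names, URL and commit hash are made up.
const dependencies = {
  express: "3.0.0",
  "my-forked-module": "git://github.com/example-org/my-forked-module.git#0123abc"
};

// Treat a spec as git-sourced when it starts with one of the git URL
// schemes npm accepts (git://, git+ssh://, git+http://, git+https://).
function isGitSpec(spec) {
  return /^git(\+ssh|\+https?)?:\/\//.test(spec);
}

const names = Object.keys(dependencies);
const fromGit = names.filter((name) => isGitSpec(dependencies[name]));
const fromRegistry = names.filter((name) => !isGitSpec(dependencies[name]));

console.log("pulled from git:", fromGit);
console.log("pulled from the registry:", fromRegistry);
```

Pinning the fork to a specific commit, rather than a branch, is what keeps it as predictable as the registry versions.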

1. You can still have local modules in your repo which are not published to npm. The same goes for privately forked modules. That then becomes an exception to the rule.

2. As to the issue of versions getting cleaned up, it is a valid question. I would hope that if such a purge were to occur (say, for performance), those versions would get moved to an npm archive where they would still be accessible.

3. On the security issue, one option would be to clone the repo privately (which I know Yammer did and then moved away from). It would be interesting (maybe npm supports this) to have a mini repo that did not have everything, but just the modules my company needs……

James Manning

Just to play devil’s advocate for a bit and give others a chance to respond to the obvious bits. My knee-jerk reaction is that, from a global application of DRY, shrinkwrap seems preferable, but I might as well put these out there.

not checking in dependencies has benefits that include:
– saves (in the best-case scenario) some disk space, but only in source control (AFAICT), since you still need the modules at runtime and they will still end up on the disks of machines doing dev/test/prod

checking in dependencies has benefits that include:
– works for all dependencies, whether they’re registered in npm or not
– works for any modules where you had to make local changes (bugfix, for instance, that hasn’t been accepted upstream)
– gives your source control a stronger ‘single source of truth’ (for systems like git where you have a hash that covers everything in the repo, such a hash means less when things could change in an external system since it wouldn’t change the hash)
– related, it means I can come back months/years from now and fetch/run the exact same bits if necessary (perhaps as part of a bisect) without having to worry about whether a very old version might have been cleaned up.
– probably a requirement in any kind of agency-regulated environment (FDA rules for pharma software, for instance)
– reduces/removes an external dependency
– reduces/removes an attack vector
– might be a benefit if you have any end-of-line issues with fetching from npm (not sure if this is a real issue or not)

Admittedly, the security issues aren’t ones I’ve thought through thoroughly, but the lack of signing / cryptographic hashes seems a little odd when I think about things like Linux distributions and rpm/dpkg files with hashes. (At least, as I browse around the registry, they don’t jump out at me, but I haven’t dug into how npm works.) Maybe those were just more important due to being mirrored on random machines around the world, not sure.

Anyway, just random stuff to help continue the conversation.

http://blogs.msdn.com/gblock Glenn Block

I was hoping for more than just Mikeal’s response. I am assuming he will see the trackback, as I linked to his post. However, for good measure I did ping him on Twitter.

Thoughts otherwise?

James Manning

Getting @mikeal’s response would seem to be of the most value.