NPM Lifecycle Stages: A Study in Stream Editors

A majority of my work, both business and pleasure, is done in Node. I support a fairly large codebase which always needs maintenance. As I get new ideas or learn new techniques, the codebase grows. No matter how organized I try to be, every project seems to spawn half a dozen new projects. New projects mean new config and packages and builds, and instead of doing things the right way, I've just been shuffling a few master files around.

This weekend I sat down to break off a chunk of that. I've been using way too much copypasta config recently. It hasn't broken yet, and it probably won't break in the future, but it bothers me that I have to copy and paste scripts between projects. I'd much rather find or write a library to coordinate the process. My goal was to streamline some of my npm lifecycle scripts.

Quick Note

I haven't taken the time to set up a solid AMP template yet, so if you're viewing this on an AMP CDN, the code won't be as pretty as it could be. You might notice a leading newline with some blocks. There are also some longer blocks in AMP that I shrank on my website via CSS. You win some, you lose some.

I assume a lot of things about your shell, mainly that it's not PowerShell or cmd.exe. There's a chance I've used some zshisms; if the code isn't working in your shell let me know and I'll find a fix. I think all of the tools I use here are installed by default, or at least easily accessible via your package manager.

NPM Scripts

NPM has created a solid set of discrete stages that describe every state a package might be in. Each stage typically has three components: prestage, stage, and poststage. The components are made up of shell commands passed to sh (or whatever the value of the script-shell option is). Stages are initiated by running

$ npm run stage
# e.g.
$ npm run lint

or, if the stage is a valid lifecycle stage with a command,

$ npm stage
# e.g.
$ npm install

The run command attempts to fire prestage. If successful, or if prestage doesn't have a command, it continues to stage. Again, if successful, or if stage is empty, it continues to poststage. Finally, if poststage is successful, or if it's empty, npm exits with code 0. npm will halt and throw failures if any of the stages exit with nonzero status.
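To make that flow concrete, here's a rough sketch of the pre/stage/post sequencing in Node. It is not npm's actual implementation; `runStage` and the shape of its return value are names I made up for illustration, and it assumes a POSIX-ish shell is available.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper (not part of npm): run pre<stage>, <stage>, post<stage>
// from a package.json-style scripts map, halting on the first failure.
function runStage(scripts, stage) {
  for (const name of [`pre${stage}`, stage, `post${stage}`]) {
    const command = scripts[name];
    if (!command) continue; // an empty component is skipped, not a failure

    // shell: true hands the command string to the system shell, like npm does with sh
    const result = spawnSync(command, { shell: true, stdio: 'inherit' });
    if (result.status !== 0) {
      // nonzero exit: stop here and report which component failed
      return { ok: false, failedAt: name, status: result.status };
    }
  }
  return { ok: true };
}
```

Calling `runStage({ prelint: 'true', lint: 'exit 3', postlint: 'true' }, 'lint')` stops at `lint` and never reaches `postlint`, which mirrors the halt-on-nonzero behavior described above.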

Code

Before we get too far, it might be a good idea to check out an actual package. I'll be poking around npm (latest) later, so it's a great place to start.

$ git clone https://github.com/npm/npm.git
$ cd npm
$ npm install

Distinguished Stages

NPM differentiates between lifecycle scripts and user-created scripts. You can see the difference with npm run:

not zero means something went wrong; throw an error and kill the process

Putting it all together, NPM won't increase the version unless
1. the tests pass,
2. the package builds successfully and is added to VCS, and
3. the repo + tags are pushed, followed by the removal of temp files.

Because the commands are fired as NPM moves through its own internal process, you're guaranteed execution at the proper time. You can even chain lifecycle events, making package maintenance easy to automate.

Programmatic Access

The first step in managing lifecycle scripts is validating the stage. The docs are great for human perusal, but they're not as nice for code. You can't expect every user to periodically curl NPM's website for updates.

npm_lifecycle_event

$ grep -R "npm_lifecycle_event"
doc/misc/npm-scripts.md:Lastly, the `npm_lifecycle_event` environment variable is set to
doc/misc/npm-scripts.md:be wise in this case to look at the `npm_lifecycle_event` environment
html/doc/misc/npm-scripts.html:<p>Lastly, the <code>npm_lifecycle_event</code> environment variable is set to
html/doc/misc/npm-scripts.html:be wise in this case to look at the <code>npm_lifecycle_event</code> environment
man/man7/npm-scripts.7:Lastly, the \fBnpm_lifecycle_event\fP environment variable is set to
man/man7/npm-scripts.7:be wise in this case to look at the \fBnpm_lifecycle_event\fP environment
node_modules/npm-lifecycle/index.js: env.npm_lifecycle_event = stage
node_modules/npm-lifecycle/index.js: var stage = env.npm_lifecycle_event
node_modules/npm-lifecycle/index.js: var stage = env.npm_lifecycle_event

Unfortunately, that leads us right back to npm-lifecycle. Because npm-lifecycle is a solid package, it's very DRY. stage is passed in and npm_lifecycle_event is only set in a single location. The NPM devs have done a great job trimming the fat. You can't really complain about that one.
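For context, this is roughly how a script can use the variable once npm has set it. It's a minimal sketch; `dispatch` and the handler map are hypothetical names of mine, not anything npm ships.

```javascript
// Hypothetical dispatcher: one script file shared by several package.json
// scripts, choosing what to do based on the stage npm says is running.
function dispatch(handlers, env = process.env) {
  // npm sets npm_lifecycle_event to the name of the executing stage
  const stage = env.npm_lifecycle_event;
  if (!stage) return 'not running under npm';

  // Only accept handlers defined directly on the map, not inherited keys
  const known = Object.prototype.hasOwnProperty.call(handlers, stage);
  return known ? handlers[stage]() : `no handler for ${stage}`;
}
```

Passing `env` explicitly is only there so the function can be exercised without going through npm; in a real script you'd just let it default to `process.env`.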

The important function, list, isn't accessible from the outside. We could modify the file itself, but that would involve messing with an API that wasn't exposed. It's more of a nothing-else-worked option than anything else. We might come back to it.

npm-lifecycle Usage

Because npm-lifecycle is the official package, npm itself has to use it somewhere. There's a chance those call sites will highlight the stages. To reduce extraneous results (e.g. the static docs), we can shrink the input to only the important directories. Based on some of my earlier grepping, it looks like the lib and node_modules directories are the only ones that contain active lifecycle code.

But lib, on the other hand, hit the jackpot. npm-lifecycle exports a lifecycle function that takes four parameters, in this order:

pkg: the calling package

stage: the lifecycle stage

wd: the working directory for the stage

opts: any passed-in options to apply to the stage

The odd [lifecycle, something, stage, something, something] array syntax comes from the slide package; chain is doing exactly what you think it is: sequentially calling lifecycle with the rest of the array bound as parameters.
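If the array syntax still looks strange, here's a tiny stand-in that behaves the way I've described. It is not slide's real code, and `fakeLifecycle` is a hypothetical placeholder that assumes a Node-style callback gets appended as the last argument.

```javascript
// Minimal stand-in for a chain helper: each step is [fn, ...args], and fn is
// called with those args plus a callback. Steps run one after another.
function chain(steps, done) {
  const results = [];
  (function next(i) {
    if (i === steps.length) return done(null, results);
    const [fn, ...args] = steps[i];
    fn(...args, (err, value) => {
      if (err) return done(err, results); // stop at the first failure
      results.push(value);
      next(i + 1);
    });
  })(0);
}

// Hypothetical placeholder for lifecycle; it only records what it was asked to run.
const calls = [];
function fakeLifecycle(pkg, stage, wd, opts, cb) {
  calls.push(stage);
  cb(null, `${pkg.name}:${stage}`);
}

const pkg = { name: 'demo' };
let finished = null;
chain(
  [
    [fakeLifecycle, pkg, 'preversion', '.', {}],
    [fakeLifecycle, pkg, 'version', '.', {}],
    [fakeLifecycle, pkg, 'postversion', '.', {}],
  ],
  (err, results) => { finished = { err, results }; }
);
```

The useful part for this hunt is that the stage name always sits at a predictable position in each array, which is what makes the call sites searchable.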

For the most part, lifecycle is called via the chain syntax. But not always. It wouldn't be an open-source project if tons of people didn't contribute, and that means different files do things different ways. The direct calls are pretty close to the chain calls, and the out-of-left-field calls match just enough of the other two that regex should work.

BEGIN{ RS="\n\n\n+"; }: change the Record Separator to multiple newlines

match($0, /## DESCRIPTION[^\#]*/, a): Only select the first block of text

while(match($0, /\n\*([^\:]*)/, b)): Pull out lines starting with *

split(b[1], c, ",");: split the string along commas

for (i in c): loop over the elements of the exploded string

gsub(/(pre|post)/, "", c[i]);: strip prefix

gsub(/pare/, "prepare", c[i]);: fix prepare

gsub(/publishOnly/, "prepublishOnly", c[i]);: fix prepublishOnly

gsub(/ /, "", c[i]);: strip spaces

print c[i];: print the cleaned stage

$0 = substr($0, RSTART + RLENGTH);: increment the loop

doc/misc/npm-scripts.md: Only doc location I could find with everything

sort: same as before; sorts for easy reading

uniq: same as before; strip duplicates
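For anyone who'd rather not read awk, the same steps can be sketched in Node. This is my own translation of the breakdown above, not the original command, and `sampleDoc` is a made-up excerpt shaped like the doc file rather than its real contents.

```javascript
// Hypothetical excerpt shaped like doc/misc/npm-scripts.md (not the real file).
const sampleDoc = [
  '## DESCRIPTION',
  '',
  'npm supports the "scripts" property of the package.json file:',
  '',
  '* prepublish:',
  '  Run BEFORE the package is packed.',
  '* prepare:',
  '  Run before the package is packed and published.',
  '* prepublishOnly:',
  '  Run BEFORE the package is prepared and packed.',
  '* publish, postpublish:',
  '  Run AFTER the package is published.',
  '* pretest, test, posttest:',
  '  Run by the npm test command.',
  '',
  '## EXAMPLES',
  '',
  '* notastage:',
  '  Outside the first block, so it should be ignored.',
].join('\n');

function extractStages(markdown) {
  // Only select the first block of text: everything up to the next '#'
  const block = markdown.match(/## DESCRIPTION[^#]*/);
  if (!block) return [];

  const stages = new Set(); // the Set plays the role of uniq
  // Pull out the names between a leading '*' and the first ':'
  for (const [, names] of block[0].matchAll(/\n\*([^:]*)/g)) {
    for (const raw of names.split(',')) {
      const stage = raw
        .replace(/(pre|post)/g, '')                // strip prefix
        .replace(/pare/g, 'prepare')               // fix prepare
        .replace(/publishOnly/g, 'prepublishOnly') // fix prepublishOnly
        .replace(/ /g, '');                        // strip spaces
      if (stage) stages.add(stage);
    }
  }
  return [...stages].sort(); // sort for easy reading
}
```

The two "fix" replacements exist for the same reason as in the awk version: stripping every `pre` also mangles `prepare` and `prepublishOnly`, so they have to be patched back up afterwards.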

It looks like that doc file does contain all the stages. That's awesome, because it means we walked away with two solutions: we can both guess the lifecycles and check our guess against another source.

Solutions Compiled

When I started down this path early this morning, I thought that finding the lifecycle stages would take an hour tops. I thought I'd read some docs, look at the repo, find a well-defined and accessible list, and go do other stuff. I've logged the better part of a day on this problem now (granted, plenty of that was spent playing with awk), and I'd be remiss if I didn't leave the environment better for the next person with a similar question.

I've compiled this stuff (specifically the find ... awk and awk ... doc solutions) into a super tiny package (on GitHub and NPM). It exposes all the lifecycle stages as both an array and an enum whose keys are initialized with themselves. It has no dependencies, its version should match npm, it's got fairly autonomous logic, and, eventually, I'll get around to building historical versions.
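To give a feel for that shape, here's a small sketch of an array plus a self-keyed enum. The stage list is an illustrative subset I picked for the example, and the names `stages`, `Stage`, and `isLifecycleScript` are hypothetical; the published package's actual exports may look different.

```javascript
// Illustrative subset only; not the package's real list.
const stages = ['install', 'prepare', 'prepublishOnly', 'publish', 'test', 'version'];

// Enum whose keys are initialized with themselves: { install: 'install', ... }
const Stage = Object.freeze(
  Object.fromEntries(stages.map((name) => [name, name]))
);

// Hypothetical validator: is `name` a known stage, with or without pre/post?
function isLifecycleScript(name) {
  if (stages.includes(name)) return true;
  const stripped = name.replace(/^(pre|post)/, '');
  return stripped !== name && stages.includes(stripped);
}
```

With something like this, validating a stage becomes a lookup: `isLifecycleScript('pretest')` is true, while a user-created script such as `lint` comes back false.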

Final Thoughts

If you know of a better way to access the lifecycle stages, I'd love to hear about it. As much fun as I had with awk, it requires so much extra setup that it's not a good, long-term solution.

Sometime later this week I'll be trying to add some older versions to the package to support older npm versions. I don't know how volatile the lifecycle stages have been, so I'm not sure how far I'll get.

If you end up using the package, I'd love to see what you do with it. If it's missing something, make a PR or fork it. My email is in the package.json (and below).