NPM Package Verification — Ep. 2 – Hacker Noon

Day Two was where the “fun” began. As it turns out, AWS Lambda is really good at running basic Node functions out of the box. But TBV has to exec both git and npm in order to work. git is used to fetch the package source from source control, and npm is used for installing dependencies and generating a package to compare to the published version.

My assumption was that running npm would be trivial, since the Lambda Node 8.10 runtime already includes it, and that running git would be difficult if not impossible. The beauty of racing to (in)validate assumptions is that I quickly found out I was totally wrong.

It turns out that at the end of last year, AWS launched Lambda Layers, which allow developers to “package and deploy libraries, custom runtimes, and other dependencies separately from your function code.” Ya know, other dependencies like git. And in the 50 or so days that the feature had been live, someone had already created just what I needed. Thanks, internet!

At this point, I thought I had really dodged a bullet. I started running live data through the system for packages like express that are pretty popular and that I knew would verify. And it worked. I’ve been doing this long enough to know that success this early is suspicious.

This is where I started going down the first path of yak shaving. The output of TBV was originally optimized for human readability. However, this meant that viewing the log output in AWS was very unhelpful. I ran the same version of TBV on my own machine as on Lambda and it Worked On My Machine™.

I added verbose logging to TBV so that I could see the raw output from the commands that were being run.
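To make that concrete, here is a minimal sketch of what logging raw command output can look like in Node. This is not TBV's actual code: `runVerbose` is a hypothetical helper name, and it only assumes Node's built-in `child_process.spawnSync`.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper (not from TBV): run a command synchronously and
// log its exit code, stdout, and stderr without any prettifying.
function runVerbose(command, args, options = {}) {
  const result = spawnSync(command, args, { encoding: 'utf8', ...options });

  // result.error is set when the process could not be started at all,
  // for example when the binary is not on PATH.
  if (result.error) {
    console.error(`[${command}] failed to start: ${result.error.message}`);
    return { status: null, stdout: '', stderr: result.error.message };
  }

  console.log(`[${command}] exit code: ${result.status}`);
  if (result.stdout) console.log(`[${command}] stdout:\n${result.stdout}`);
  if (result.stderr) console.log(`[${command}] stderr:\n${result.stderr}`);

  return { status: result.status, stdout: result.stdout, stderr: result.stderr };
}
```

A call such as `runVerbose('npm', ['ci'], { cwd: '/tmp/package' })` (the directory is made up for illustration) would then put the unfiltered npm output into the function's logs, which is the kind of detail that is easy to lose when output is tuned for humans.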

And so began a long series of commits that I am not proud of.

Once I had better logging visibility, I noticed that npm ci was failing because the command was not supported. Yep, Node 8.10 on AWS Lambda runs [email protected] and I need at least [email protected] to run npm ci. I was literally one minor version short of a working distributed system.

First I tried running npm install --global [email protected] on Lambda, but that failed because Lambda functions have read-only access to everything on the filesystem except /tmp. I expected this, but hey, no harm in trying.

Next, I tried installing npm as a production dependency of my function. Getting $PATH to include Node, MY version of NPM, but NOT the normally installed version of NPM proved difficult. Installing npm as a production dependency just seems wrong anyway.
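For illustration, the PATH juggling looks roughly like the sketch below. It is not what I actually ran, and I am not claiming it would have worked on Lambda: `prependToPath` is a hypothetical helper, and the only idea it shows is that whichever directory comes first on PATH wins the lookup.

```javascript
const path = require('path');

// Hypothetical helper: return a PATH string with binDir first, so that
// executables in binDir are found before same-named ones elsewhere.
function prependToPath(binDir, currentPath = process.env.PATH || '') {
  // Split on the platform delimiter (':' on Linux) and drop empty entries.
  const entries = currentPath.split(path.delimiter).filter(Boolean);

  // Remove any existing copy of binDir so it is not listed twice.
  const rest = entries.filter((entry) => entry !== binDir);

  return [binDir, ...rest].join(path.delimiter);
}
```

The result would be passed as `env.PATH` when spawning a child process, with `binDir` pointing at something like the function's own `node_modules/.bin`. Note that this only reorders the lookup; the system copy of npm is still on PATH further down.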

Next, I started digging into the source code for NPM to see how the ci command works. If it was trivial to implement, then maybe I could clone just that code. Open source, blah blah, MIT license, blah blah. I assumed it would be super convoluted and thus an exercise in futility. Nope! It just used another library called cipm. (As part of this process, I learned that NPM often introduces new functionality by incorporating external libraries.)

Partial success?

Next, I tried installing cipm as a production dependency. Honestly, I forget why this attempt didn’t work. I also felt dirty using a different method for installing and building than what would be used in the wild. I was bummed by the irony that I was getting beat by the library that I was trying to secure.

Next, I tried watching Netflix. But this didn’t work because watching TV isn’t a good way to write software that works.

Sunday had come and gone. I was frustrated. I was getting shoddy results from TBV anyway. I had gained some validated learning but hadn’t shipped the product I had wanted. Sigh.