Write Code to Rewrite Your Code: jscodeshift

Codemods with jscodeshift

How many times have you used the find-and-replace functionality across a directory to make changes to JavaScript source files? If you’re good, you’ve gotten fancy and used regular expressions with capturing groups, because it’s worth the effort if your code base is sizable. Regex has limits, though. For non-trivial changes you need a developer who understands the code in context and is also willing to take on the long, tedious, and error-prone process.

This is where “codemods” come in.

Codemods are scripts used to rewrite other scripts. Think of them as a find and replace functionality that can read and write code. You can use them to update source code to fit a team’s coding conventions, make widespread changes when an API is modified, or even auto-fix existing code when your public package makes a breaking change.


In this article, we’re going to explore a toolkit for codemods called “jscodeshift” while creating three codemods of increasing complexity. By the end you will have broad exposure to the important aspects of jscodeshift and will be ready to start writing your own codemods. The three exercises cover some basic, but awesome, uses of codemods, and you can view the source code for them in my GitHub project.

What Is jscodeshift?

The jscodeshift toolkit allows you to pump a bunch of source files through a transform and replace them with what comes out the other end. Inside the transform, you parse the source into an abstract syntax tree (AST), poke around to make your changes, then regenerate the source from the altered AST.

The interface that jscodeshift provides is a wrapper around the recast and ast-types packages. recast handles the conversion from source to AST and back, while ast-types handles the low-level interaction with the AST nodes.

Setup

To get started, install jscodeshift globally from npm.

npm i -g jscodeshift

There are runner options you can use and an opinionated test setup that makes running a suite of tests via Jest (an open source JavaScript testing framework) really easy, but we’re going to bypass that for now in favor of simplicity:

jscodeshift -t some-transform.js input-file.js -d -p

This will run input-file.js through the transform some-transform.js and print the results without altering the file.

Before jumping in, though, it is important to understand three main object types that the jscodeshift API deals with: nodes, node-paths, and collections.

Nodes

Nodes are the basic building blocks of the AST, often referred to as “AST nodes.” These are what you see when exploring your code with AST Explorer. They are simple objects and do not provide any methods.

Node-paths

Node-paths are wrappers around an AST node provided by ast-types as a way to traverse the abstract syntax tree (AST, remember?). In isolation, nodes do not have any information about their parent or scope, so node-paths take care of that. You can access the wrapped node via the node property, and there are several methods available to change the underlying node. Node-paths are often referred to as just “paths.”

Collections

Collections are groups of zero or more node-paths that the jscodeshift API returns when you query the AST. They have all sorts of useful methods, some of which we will explore.

Collections contain node-paths, node-paths contain nodes, and nodes are what the AST is made of. Keep that in mind and it will be easy to understand the jscodeshift query API.

It can be tough to keep track of the differences between these objects and their respective API capabilities, so there’s a nifty tool called jscodeshift-helper that logs the object type and provides other key information.


Exercise 1 - Remove Calls To console

To get our feet wet, let’s start with removing calls to all console methods in our code base. While you can do this with find and replace and a little regex, it starts to get tricky with multiline statements, template literals, and more complex calls, so it’s an ideal example to start with.
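The steps that follow assume two files: remove-consoles.js (the transform) and remove-consoles.input.js (a sample file to run it against). As a minimal sketch, and not necessarily the original listing, a transform that parses the source and writes it back unchanged could look like this. The function name transformer is an arbitrary choice.

```javascript
// remove-consoles.js
// A do-nothing transform: parse the file, then print it back out.
function transformer(fileInfo, api) {
  // api.jscodeshift is the library itself, conventionally aliased to `j`.
  const j = api.jscodeshift;

  // Wrapping the source string parses it and returns a collection
  // that holds a single node-path for the root of the AST.
  const root = j(fileInfo.source);

  // Nothing has been changed yet, so this regenerates the original text.
  return root.toSource();
}

module.exports = transformer;
```

Running it with jscodeshift -t remove-consoles.js remove-consoles.input.js -d -p should print the input file as-is and report it as unmodified.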

OK, that was a bit anticlimactic since our transform doesn’t actually do anything yet, but at least we know it’s all working. If it doesn’t run at all, make sure you installed jscodeshift globally. If the command to run the transform is incorrect, you’ll either see an “ERROR Transform file … does not exist” message or “TypeError: path must be a string or Buffer” if the input file cannot be found. If you’ve fat-fingered something, it should be easy to spot with the very descriptive transformation errors.

But how do we find the consoles and remove them? Unless you have some exceptional knowledge of the Mozilla Parser API, you’ll probably need a tool to help understand what the AST looks like. For that you can use the AST Explorer. Paste the contents of remove-consoles.input.js into it and you’ll see the AST. There is a lot of data even in the simplest code, so it helps to hide location data and methods. You can toggle the visibility of properties in AST Explorer with the checkboxes above the tree.

We can see that calls to console methods are referred to as CallExpressions, so how do we find them in our transform? We use jscodeshift’s queries, remembering our earlier discussion on the differences between Collections, node-paths and nodes themselves:

The line const root = j(fileInfo.source); returns a collection of one node-path, which wraps the root AST node. We can use the collection’s find method to search for descendant nodes of a certain type, like so:

const callExpressions = root.find(j.CallExpression);

This returns another collection of node-paths containing just the nodes that are CallExpressions. At first blush, this seems like what we want, but it is too broad. We might end up running hundreds or thousands of files through our transforms, so we have to be precise to have any confidence that it will work as intended. The naive find above would not just find the console CallExpressions; it would find every CallExpression in the source, including:

require('foo')
bar()
setTimeout(() => {}, 0)

To force greater specificity, we provide a second argument to .find: an object of additional properties that each node must match to be included in the results. We can look at the AST Explorer to see that our console.* calls have the form of:
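As a sketch of what that node shape and the matching filter can look like: the comment below abbreviates what AST Explorer shows for a call such as console.log('x'), and the filter keeps only the properties we care about. The names consoleCallFilter and findConsoleCalls are hypothetical helpers for illustration.

```javascript
// Abbreviated node for `console.log('x')` (location data omitted):
// {
//   type: 'CallExpression',
//   callee: {
//     type: 'MemberExpression',
//     object: { type: 'Identifier', name: 'console' },
//     property: { type: 'Identifier', name: 'log' }
//   },
//   arguments: [ ... ]
// }

// The filter is a partial node: every property listed here must match,
// and anything not listed (such as the method name) is ignored.
const consoleCallFilter = {
  callee: {
    type: 'MemberExpression',
    object: { type: 'Identifier', name: 'console' },
  },
};

// Hypothetical helper: `j` is api.jscodeshift and `root` is j(fileInfo.source).
function findConsoleCalls(j, root) {
  return root.find(j.CallExpression, consoleCallFilter);
}
```

Because the filter does not mention callee.property, it should match console.log, console.warn, console.error, and any other method called on console.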

Now that we’ve got an accurate collection of the call sites, let’s remove them from the AST. Conveniently, the collection object type has a remove method that will do just that. Our remove-consoles.js file will now look like this:
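A sketch of the transform at this stage, written as separate steps so each one is visible. It assumes the filter shape described above and is an illustration rather than the exact original listing.

```javascript
// remove-consoles.js
function transformer(fileInfo, api) {
  const j = api.jscodeshift;
  const root = j(fileInfo.source);

  // Find only the calls made on the `console` object.
  const callExpressions = root.find(j.CallExpression, {
    callee: {
      type: 'MemberExpression',
      object: { type: 'Identifier', name: 'console' },
    },
  });

  // Remove every matched node-path from the AST.
  callExpressions.remove();

  // Regenerate the source from the modified AST.
  return root.toSource();
}

module.exports = transformer;
```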

It looks good. Now that our transform alters the underlying AST, using .toSource() generates a different string from the original. The -p option from our command displays the result, and a tally of dispositions for each file processed is shown at the bottom. Removing the -d option from our command would replace the content of remove-consoles.input.js with the output from the transform.

Our first exercise is complete… almost. The code is bizarre looking and probably very offensive to any functional purists out there, and so to make transform code flow better, jscodeshift has made most things chainable. This allows us to rewrite our transform like so:
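A chained version could be sketched as follows. It relies on find() and remove() returning collections, and on toSource() printing the whole file even when called on a collection derived from the root.

```javascript
// remove-consoles.js, chained form
function transformer(fileInfo, api) {
  const j = api.jscodeshift;

  return j(fileInfo.source)
    // Select calls made on the `console` object...
    .find(j.CallExpression, {
      callee: {
        type: 'MemberExpression',
        object: { type: 'Identifier', name: 'console' },
      },
    })
    // ...drop them from the AST...
    .remove()
    // ...and print the modified file.
    .toSource();
}

module.exports = transformer;
```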

Much better. To recap exercise 1, we wrapped the source, queried for a collection of node-paths, changed the AST, and then regenerated the source. We’ve gotten our feet wet with a pretty simple example and touched on the most important aspects. Now, let’s do something more interesting.

Exercise 2 - Replacing Imported Method Calls

For this scenario, we’ve got a “geometry” module with a method named “circleArea” that we’ve deprecated in favor of “getCircleArea.” We could easily find and replace these with /geometry\.circleArea/g, but what if the user has imported the module and assigned it a different name? For example:

import g from 'geometry';
const area = g.circleArea(radius);

How would we know to replace g.circleArea instead of geometry.circleArea? We certainly cannot assume that all circleArea calls are the ones we’re looking for; we need some context. This is where codemods start showing their value. Let’s start by making two files, deprecated.js and deprecated.input.js.
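The first step in deprecated.js is to locate the import of the geometry module. One way to sketch that query is shown below. The helper name findGeometryImport is hypothetical, and the filter matches on the value of the import source only, since the type of that node (Literal or StringLiteral) depends on the parser in use.

```javascript
// Hypothetical helper: `j` is api.jscodeshift and `root` is j(fileInfo.source).
function findGeometryImport(j, root) {
  // Matches statements such as `import g from 'geometry';`
  // by comparing the string the module is imported from.
  return root.find(j.ImportDeclaration, {
    source: { value: 'geometry' },
  });
}
```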

This gets us the ImportDeclaration used to import “geometry”. From there, we dig down to find the local name used to hold the imported module. Since this is the first time we’ve done it, let’s call out a point that is both important and confusing when you are first starting.

Note: It’s important to know that root.find() returns a collection of node-paths. From there, the .get(n) method returns the node-path at index n in that collection, and to get the actual node, we use .node. The node is basically what we see in AST Explorer. Remember, the node-path is mostly information about the scope and relationships of the node, not the node itself.

// find the Identifiers
const identifierCollection = importDeclaration.find(j.Identifier);
// get the first NodePath from the Collection
const nodePath = identifierCollection.get(0);
// get the Node in the NodePath and grab its "name"
const localName = nodePath.node.name;

This allows us to figure out dynamically what our geometry module has been imported as. Next, we find the places it is being used and change them. By looking at AST Explorer, we can see that we need to find MemberExpressions that look like this:
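As a sketch, the filter for those MemberExpressions can be built from the local name we just discovered. The function name circleAreaFilter is hypothetical.

```javascript
// Build a filter for `<localName>.circleArea`, where localName is whatever
// the geometry module was imported as (for example `g`).
function circleAreaFilter(localName) {
  return {
    object: { type: 'Identifier', name: localName },
    property: { type: 'Identifier', name: 'circleArea' },
  };
}

// Intended use inside a transform:
//   root.find(j.MemberExpression, circleAreaFilter(localName))
```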

Now that we have a query, we can get a collection of all the call sites to our old method and then use the collection’s replaceWith() method to swap them out. The replaceWith() method iterates through the collection, passing each node-path to a callback function. The AST Node is then replaced with whatever Node you return from the callback.

Again, understanding the difference between collections, node-paths and nodes is necessary for this to make sense.

Once we’re done with the replacement, we generate the source as usual. Here’s our finished transform:
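A sketch of how the pieces fit together is below. It assumes a default import such as import g from 'geometry', and it adds an early return for files that never import the module, a guard that is my addition and not part of the walkthrough above.

```javascript
// deprecated.js
function transformer(fileInfo, api) {
  const j = api.jscodeshift;
  const root = j(fileInfo.source);

  // Find the import of the geometry module.
  const importDeclaration = root.find(j.ImportDeclaration, {
    source: { value: 'geometry' },
  });

  // Added guard: returning nothing tells jscodeshift to skip the file.
  if (importDeclaration.size() === 0) {
    return undefined;
  }

  // Collection -> first node-path -> node -> local name (e.g. `g`).
  // Assumes a default import, so the first Identifier is the local name.
  const localName = importDeclaration.find(j.Identifier).get(0).node.name;

  return root
    .find(j.MemberExpression, {
      object: { name: localName },
      property: { name: 'circleArea' },
    })
    .replaceWith((nodePath) => {
      // Rename the property in place and hand the same node back.
      const { node } = nodePath;
      node.property.name = 'getCircleArea';
      return node;
    })
    .toSource();
}

module.exports = transformer;
```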

Exercise 3 - Changing A Method Signature

In the previous exercises we covered querying collections for specific types of nodes, removing nodes, and altering nodes, but what about creating altogether new nodes? That’s what we’ll tackle in this exercise.

In this scenario, we’ve got a method signature that’s gotten out of control with individual arguments as the software has grown, and so it has been decided it would be better to accept an object containing those arguments instead.
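To make the scenario concrete, here is a sketch of the kind of change we are after. The car object, its method names, and the parameter list are hypothetical and chosen only for illustration.

```javascript
// Hypothetical module, for illustration only.
const car = {
  // Old signature: seven positional arguments, easy to get out of order.
  factoryOld(color, make, model, year, miles, bedliner, alarm) {
    return { color, make, model, year, miles, bedliner, alarm };
  },
  // New signature: a single object with named keys.
  factory({ color, make, model, year, miles, bedliner, alarm }) {
    return { color, make, model, year, miles, bedliner, alarm };
  },
};

// What call sites look like before the codemod:
const before = car.factoryOld('white', 'Kia', 'Sorento', 2010, 50000, null, true);

// What we want call sites to look like afterwards:
const after = car.factory({
  color: 'white',
  make: 'Kia',
  model: 'Sorento',
  year: 2010,
  miles: 50000,
  bedliner: null,
  alarm: true,
});
```

Both calls describe the same car; only the shape of the argument list differs, and that shape is what the codemod has to rewrite.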

To read all the arguments currently being passed in, we use the replaceWith() method on our collection of CallExpressions to swap each of the nodes. The new nodes will replace node.arguments with a new single argument, an object.


Let’s give it a try with a simple object to make sure we know how this works before we use the proper values:

Node Builders

Builders allow us to create new nodes properly; they are provided by ast-types and surfaced through jscodeshift. They rigidly check that the different types of nodes are created correctly, which can be frustrating when you’re hacking away on a roll, but ultimately, this is a good thing. To understand how to use builders, there are two things you should keep in mind:

All of the available AST node types are defined in the def folder of the ast-types GitHub project, mostly in core.js.
There are builders for all the AST node types, but they use a camel-cased version of the node type, not Pascal case. (This isn’t explicitly stated, but you can see this is the case in the ast-types source.)

If we use AST Explorer with an example of what we want the result to be, we can piece this together pretty easily. In our case, we want the new single argument to be an ObjectExpression with a bunch of properties. Looking at the type definitions mentioned above, we can see what this entails:
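A sketch of a transform that uses builders to do this is shown below. It assumes the call being rewritten is car.factory(...) and that ARG_KEYS lists the old positional parameters in order; both are hypothetical names for illustration. The builders used (j.objectExpression, j.property, j.identifier) are the camel-cased counterparts of the ObjectExpression, Property, and Identifier node types.

```javascript
// Hypothetical: parameter names of the old positional signature, in order.
const ARG_KEYS = ['color', 'make', 'model', 'year', 'miles', 'bedliner', 'alarm'];

function transformer(fileInfo, api) {
  const j = api.jscodeshift;

  return j(fileInfo.source)
    // Hypothetical call site: car.factory(...)
    .find(j.CallExpression, {
      callee: {
        type: 'MemberExpression',
        object: { name: 'car' },
        property: { name: 'factory' },
      },
    })
    .replaceWith((nodePath) => {
      const { node } = nodePath;
      const args = node.arguments;

      // Leave the call alone if it already takes a single object,
      // or if it has more arguments than we have names for.
      const alreadyObject = args.length === 1 && args[0].type === 'ObjectExpression';
      if (alreadyObject || args.length > ARG_KEYS.length) {
        return node;
      }

      // Pair each existing argument node with its key: `color: 'white'`, ...
      const properties = args.map((arg, i) =>
        j.property('init', j.identifier(ARG_KEYS[i]), arg)
      );

      // Replace the positional arguments with one object literal.
      node.arguments = [j.objectExpression(properties)];
      return node;
    })
    .toSource();
}

module.exports = transformer;
```

Reusing the original argument nodes as property values, instead of rebuilding them, keeps whatever expressions the caller passed (literals, variables, nested calls) intact.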

Codemods With jscodeshift Recap

It took a little time and effort to get to this point, but the benefits are huge when faced with mass refactoring. Distributing groups of files to different processes and running them in parallel is something jscodeshift excels at, allowing you to run complex transformations across a huge codebase in seconds. As you become more proficient with codemods, you’ll start repurposing existing scripts (such as those in the react-codemod GitHub repository) or writing your own for all sorts of tasks, and that will make you, your team, and your package users more efficient.

About the author

Jeremy is a senior software engineer with a passion for modern JavaScript--client and server-side--including React, Redux, Angular, and Express. He believes in clean code, testing, and reading the manual. Making cool software makes him giddy, and he is deeply moved every time he sees his work being used by others.
