Codemods: Effective, Automated Refactoring

January 29, 2019 8:11 am

Maintaining software is challenging. Stagnant software quickly becomes obsolete, and nowhere is this truer than in the JavaScript ecosystem. JavaScript firmly holds the reins as the language of the web, and because it is the convergence point for all web projects, the language and its ecosystem have a unique opportunity to learn and adopt best practices and paradigms from other languages and ecosystems. This means that everything changes, from the language syntax to the popular patterns used to write effective front-end code.

Frameworks and libraries are also in a perpetual state of flux, squashing bugs and improving their code. Trying to keep up with all of this can be tedious and can lead to JavaScript fatigue. But it doesn’t need to be bad! As JavaScript has grown in complexity, the ecosystem has continued to develop new and better tooling. One of the most powerful tools at our disposal is the codemod.

What’s in a codemod?

In general, a codemod is an automated change to source code. More specifically, a codemod usually performs the following steps:

  1. Generate an Abstract Syntax Tree (AST) from a source file
  2. Traverse the AST looking for matches
  3. Make changes to the AST where appropriate
  4. Regenerate the source file based on the new AST

A very common codemod example is Prettier, which formats code based on an opinionated set of rules. Codemods don’t need to be as general-purpose as Prettier! Their real power comes from how simple it is to write your own codemod to satisfy your project’s very specific needs. Let’s walk through each of the steps.

Generating the AST

There are many tools available to generate an AST, depending on the language or style. One such tool is @babel/parser which provides a simple parse method to generate an AST.

import { parse } from '@babel/parser';

const output = parse('let foo = true;', {});
console.log(output);

In this example, the source let foo = true; is passed to the parse method to generate the following tree.

{
  "type": "File",
  "start": 0,
  "end": 15,
  "program": {
    "type": "Program",
    "start": 0,
    "end": 15,
    "sourceType": "script",
    "interpreter": null,
    "body": [
      {
        "type": "VariableDeclaration",
        "start": 0,
        "end": 15,
        "declarations": [
          {
            "type": "VariableDeclarator",
            "start": 4,
            "end": 14,
            "id": {
              "type": "Identifier",
              "start": 4,
              "end": 7,
              "name": "foo"
            },
            "init": {
              "type": "BooleanLiteral",
              "start": 10,
              "end": 14,
              "value": true
            }
          }
        ],
        "kind": "let"
      }
    ]
  }
}

Following this structure, let foo = true; consists of a single VariableDeclaration. Its declarations property is an array of VariableDeclarator nodes, one per variable declared, and each declarator holds the left and right sides of the equal sign. On the left, in id, is an Identifier named foo, while on the right, in init, is the value being set, a BooleanLiteral.

ASTs may seem overwhelming at first, but they are actually quite simple. Each part of the source is transformed into a node, which can be made up of other nodes. Being able to check the type of node for everything in the source tree allows us to find very specific segments of our code to transform, with confidence that the matched code is not actually part of a comment or other less relevant node type. A great tool to better understand an AST is AST Explorer, where you can enter code and receive a real-time AST. This is an incredibly powerful way to inspect an AST, including the types of nodes and their properties.
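To make the idea of "nodes made up of other nodes" concrete, here is a minimal sketch that walks a hand-copied subset of the tree above and lists each node type in the order it is reached. The collectNodeTypes helper is hypothetical and written only for this illustration; it is not part of @babel/parser, and it treats any object with a string type property as a node.

```javascript
// Hand-copied subset of the AST shown above for `let foo = true;`.
// Position fields (start/end) are omitted for brevity.
const ast = {
  type: 'File',
  program: {
    type: 'Program',
    body: [
      {
        type: 'VariableDeclaration',
        kind: 'let',
        declarations: [
          {
            type: 'VariableDeclarator',
            id: { type: 'Identifier', name: 'foo' },
            init: { type: 'BooleanLiteral', value: true },
          },
        ],
      },
    ],
  },
};

// Hypothetical helper: depth-first walk that records the type of every
// object carrying a string `type` property.
function collectNodeTypes(node, types = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectNodeTypes(child, types));
  } else if (node !== null && typeof node === 'object') {
    if (typeof node.type === 'string') {
      types.push(node.type);
    }
    Object.values(node).forEach((child) => collectNodeTypes(child, types));
  }
  return types;
}

const nodeTypes = collectNodeTypes(ast);
console.log(nodeTypes.join(' > '));
// File > Program > VariableDeclaration > VariableDeclarator > Identifier > BooleanLiteral
```

Every piece of the statement shows up as a typed node, which is what lets a codemod target, say, only Identifier nodes.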

Traversing and modifying the AST

Once we have the AST, we need to traverse it by calling a function for each node, with the node as its input. The function can check the node’s type and then modify that node and return it, or return the original unmodified. To make things easier, traversal libraries like @babel/traverse detect each node’s type automatically and run a provided function against all nodes of that type.

import { parse } from '@babel/parser';
import traverse from '@babel/traverse';

const ast = parse('let foo = true;', {});

traverse(ast, {
    Identifier(path) {
        if (path.node.name === 'foo') {
            path.node.name = 'bar';
        }
    }
});

console.log(ast.program.body[0].declarations[0]);

In this example, the Identifier function allows us to provide code to specifically run on nodes of type Identifier. We can then check and make changes to its properties, resulting in an updated AST.

{
  "type": "VariableDeclarator",
  "start": 4,
  "end": 14,
  "id": {
    "type": "Identifier",
    "start": 4,
    "end": 7,
    "name": "bar"
  },
  "init": {
    "type": "BooleanLiteral",
    "start": 10,
    "end": 14,
    "value": true
  }
}
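For intuition about what a traversal helper does under the hood, the following is a rough, dependency-free sketch of type-based visitor dispatch. The simpleTraverse function is a hypothetical stand-in, not the @babel/traverse implementation: it passes the raw node to each visitor, whereas @babel/traverse passes a path object that wraps the node.

```javascript
// Hypothetical stand-in for a traversal helper: walks a plain-object AST and
// calls visitors[node.type](node) for every node whose type has a visitor.
function simpleTraverse(node, visitors) {
  if (Array.isArray(node)) {
    node.forEach((child) => simpleTraverse(child, visitors));
    return;
  }
  if (node === null || typeof node !== 'object') {
    return;
  }
  const hasVisitor =
    typeof node.type === 'string' &&
    Object.prototype.hasOwnProperty.call(visitors, node.type);
  if (hasVisitor) {
    visitors[node.type](node);
  }
  // Recurse into every property so nested nodes are reached as well.
  Object.values(node).forEach((child) => simpleTraverse(child, visitors));
}

// The declarator from the example above, without position fields.
const declarator = {
  type: 'VariableDeclarator',
  id: { type: 'Identifier', name: 'foo' },
  init: { type: 'BooleanLiteral', value: true },
};

const visited = [];
simpleTraverse(declarator, {
  Identifier(node) {
    visited.push(node.name);
    if (node.name === 'foo') {
      node.name = 'bar';
    }
  },
});

console.log(declarator.id.name); // bar
```

Only the Identifier visitor runs, so the BooleanLiteral is left untouched.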

Generating new source from the updated AST

Once we have traversed the AST and made our desired changes, we can then generate the new source code from the modified AST. To do this, we can use recast or @babel/generator. Our current preference is recast because it specifically tries to preserve the original code styles, such as quote types, indentation, and newlines.

import { parse } from '@babel/parser';
import traverse from '@babel/traverse';
import { print } from 'recast';

const ast = parse('let foo = true;', {});

traverse(ast, {
    Identifier(path) {
        if (path.node.name === 'foo') {
            path.node.name = 'bar';
        }
    }
});

const { code } = print(ast);

console.log(code); // -> let bar = true;
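To show what "generating source from an AST" means at its simplest, here is a toy printer for just the node types in this example. It is a hypothetical sketch and not how recast or @babel/generator work internally. Real generators support every node type, and recast additionally tries to reuse the original formatting.

```javascript
// Hypothetical toy printer: supports only the node types that appear in
// the `let foo = true;` example and throws on anything else.
function generate(node) {
  switch (node.type) {
    case 'File':
      return generate(node.program);
    case 'Program':
      return node.body.map(generate).join('\n');
    case 'VariableDeclaration':
      return `${node.kind} ${node.declarations.map(generate).join(', ')};`;
    case 'VariableDeclarator':
      return node.init
        ? `${generate(node.id)} = ${generate(node.init)}`
        : generate(node.id);
    case 'Identifier':
      return node.name;
    case 'BooleanLiteral':
      return String(node.value);
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

// The modified AST from above (identifier already renamed to `bar`).
const modifiedAst = {
  type: 'File',
  program: {
    type: 'Program',
    body: [
      {
        type: 'VariableDeclaration',
        kind: 'let',
        declarations: [
          {
            type: 'VariableDeclarator',
            id: { type: 'Identifier', name: 'bar' },
            init: { type: 'BooleanLiteral', value: true },
          },
        ],
      },
    ],
  },
};

const generated = generate(modifiedAst);
console.log(generated); // let bar = true;
```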

Putting it all together

Now that we have reviewed each of the steps needed to make an effective transform for our code, we can put them all together to create a codemod. The open source jscodeshift project combines these pieces behind a simplified interface. It also provides a command-line interface that runs a codemod against the source files you point it at. A jscodeshift codemod is created by defining a function that accepts information about a source file and an API object.

export default function (file, api) {
    const { source } = file;
    const { jscodeshift: j } = api;

    return j(source)
        // codemod goes here
        // ...
        .toSource();
}

Inside that function, jscodeshift provides a chainable API for traversing and modifying the AST, as well as identifiers for each type of node. A final .toSource call is used to then replace the contents of the source file with one generated from the modified AST, matching the styles of the original file.

A real-world example

Version 2 of Dojo was released on May 2nd, 2018. Since then, three additional major versions have been released, each with some pretty big changes. For example, version 3 of Dojo focused on reducing the repository/package complexity of Dojo by consolidating eight foundational packages into a single package, @dojo/framework. Version 4 of Dojo removed several modules that were no longer necessary. In both cases, users of the previous version had to update their code manually. This is where codemods can really shine and why we made them part of the CLI tool.

The following codemod example is derived from our work simplifying the upgrade story for Dojo by rewriting imports for the eight deprecated packages in version 2 to the new @dojo/framework package in version 3. One of Dojo’s main focuses is developer ergonomics, and automating changes via codemods moves us toward that goal. This kind of change could be tedious for Dojo users, as they would very likely need to change import statements in every file of their project. They could use regular expressions to find and replace the strings, but that approach risks unintended changes, such as rewriting a matching string in a comment, and the whole experience is not very friendly to the developer. Instead, this codemod modifies the AST directly to ensure that the change only affects import statements.

// Matches imports from the eight packages consolidated into @dojo/framework.
const match = /^@dojo\/(core|has|i18n|widget-core|routing|stores|shim|test-extras)/;

export default function({ source }, { jscodeshift: j }) {
    return j(source)
        .find(j.ImportDeclaration)
        .replaceWith((path) => {
            // Here `source` is the import's module specifier node, not the file contents.
            const { source } = path.node;
            const matches = match.exec(source.value);
            if (matches) {
                const [full, pkg] = matches;
                // test-extras maps to testing; every other package keeps its name.
                const replacement = pkg === 'test-extras' ? 'testing' : pkg;
                source.value = source.value.replace(full, `@dojo/framework/${replacement}`);
                return { ...path.node, source: { ...source } };
            }
            // Leave unrelated imports untouched.
            return path.node;
        })
        .toSource();
};

Following the expected shape of a jscodeshift codemod, this function traverses the AST looking specifically for nodes of type ImportDeclaration. For each matching node, the codemod checks whether the import’s source value matches the regular expression for the deprecated packages. If it matches, the codemod changes the source value to the new location of the import and then returns that node. Otherwise, the codemod returns the node unchanged. Finally, it calls toSource, which outputs the changes. This codemod can be run from the command line with jscodeshift -t ./codemod.js src/**/*.js. This will run the defined function once for each source file and overwrite the files with the relevant changes. Ideally, the files being changed exist in version control, so the accuracy of the codemod can be checked via a simple git diff.
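One way to gain confidence before running a codemod across a project is to pull the string rewriting into a pure function and exercise it on its own. The sketch below does that with the same regular expression. The rewriteImportPath name and the sample module paths are illustrative assumptions, not part of the Dojo codemod itself.

```javascript
// Same pattern as the codemod above.
const deprecated = /^@dojo\/(core|has|i18n|widget-core|routing|stores|shim|test-extras)/;

// Hypothetical helper: returns the rewritten module path, or the input
// unchanged when it does not start with a deprecated package name.
function rewriteImportPath(value) {
  const matches = deprecated.exec(value);
  if (!matches) {
    return value;
  }
  const [full, pkg] = matches;
  const replacement = pkg === 'test-extras' ? 'testing' : pkg;
  return value.replace(full, `@dojo/framework/${replacement}`);
}

// Sample paths are made up for illustration.
console.log(rewriteImportPath('@dojo/core/lang'));           // @dojo/framework/core/lang
console.log(rewriteImportPath('@dojo/test-extras/harness')); // @dojo/framework/testing/harness
console.log(rewriteImportPath('@dojo/framework/core/lang')); // unchanged
console.log(rewriteImportPath('lodash'));                    // unchanged
```

Because already-migrated paths do not match the pattern, running the rewrite a second time leaves them alone.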

Simplifying the future with codemods

Codemods are a simple yet effective tool to combat the inevitable churn of a codebase in a fast, safe, and programmatic way. Because of the low barrier to entry, they can be created by anyone with a little practice and are effective for both long-term conversions and one-time, throwaway scripts. The ability to create and maintain modifications to an AST is the primary driver behind @dojo/cli-upgrade-app. This tool wraps jscodeshift and provides a way for Dojo developers to organize and develop codemods for each new major version of Dojo going forward. It also helps manage npm dependencies by defining which dependencies are new or deprecated and automating their installation/upgrade and removal, respectively. Codemods play a significant part in Dojo’s developer ergonomics story, and they can in yours as well!

Getting help

If you’d like to know more about codemods or if you need help using codemods within your development initiatives, feel free to reach out to us, and we’ll be more than happy to help!
