Compilers as a Service – Microsoft’s Project Roslyn

Introduction

Ever since we’ve had compilers, they have been regarded as black boxes that process the code we write in one programming language or another and then magically create object code that can be executed by the computer processor. I’ve always been genuinely curious about how these black boxes actually work, which is why I took not one but two compiler courses during my bachelor studies. So when I heard about Microsoft’s Project Roslyn in one of the BUILD talks in September, presented by the man behind C# himself, Anders Hejlsberg, I immediately wanted to play with the idea of a “compiler as a service”. Project Roslyn aims to provide a way to access the information that the compiler gathers during parsing, as well as a way to add functionality to the IDE (refactoring, code fixes, etc.) or change the code at runtime. This opens up a variety of new possibilities in areas such as meta-programming, code generation and transformation, interactive use of the C# and VB languages, and embedding of C# and VB in domain-specific languages. Project Roslyn is still being developed, but a CTP is already available here. I will use this CTP to write a C# code refactoring tool as a proof of concept.

Compiler basics

Compiler pipeline

First, let us take a short look at the compiler pipeline, that is, the stages the code goes through when processed by the compiler. The picture on the left shows these stages: lexical analysis, syntactic analysis, semantic analysis, code generation and code optimization. During the first step, the code is broken into tokens. Next, the parser (or syntactic analyzer) combines the tokens into syntactic structures. During semantic analysis, types are checked and declarations of variables, functions and types are tracked, to make sure that the code is correctly constructed according to the language rules. Finally, the code generator takes the abstract syntax tree (AST) constructed during the previous phases and generates object code (often an assembly). Code optimization can be performed after code generation, and sometimes also before it, in order to make the program more efficient.
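To make these stages a little more concrete, here is a minimal sketch that pushes one tiny program through them using Roslyn itself. It assumes the released Microsoft.CodeAnalysis.CSharp NuGet package, whose namespaces and type names differ from the CTP used in this article; the class PipelineDemo and the sample source are made up for illustration.

```csharp
using System;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical demo class; assumes the Microsoft.CodeAnalysis.CSharp package.
public static class PipelineDemo
{
    // Made-up sample input; any small, valid C# source would do.
    public const string Source =
        "class C { static int Twice(int x) { return x * 2; } }";

    // Lexical and syntactic analysis: ParseText runs the lexer and the
    // parser together and returns the resulting syntax tree.
    public static SyntaxTree Parse(string source)
    {
        return CSharpSyntaxTree.ParseText(source);
    }

    // The tokens produced by the lexer remain reachable as the leaves of the tree.
    public static string[] Tokens(SyntaxTree tree)
    {
        return tree.GetRoot().DescendantTokens().Select(t => t.Text).ToArray();
    }

    // Semantic analysis: a compilation binds names and checks types against
    // the referenced assemblies (here only the core library).
    public static Compilation Bind(SyntaxTree tree)
    {
        return CSharpCompilation.Create(
            "PipelineDemo",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));
    }

    // Number of errors found while binding and type checking.
    public static int ErrorCount(Compilation compilation)
    {
        return compilation.GetDiagnostics()
                          .Count(d => d.Severity == DiagnosticSeverity.Error);
    }

    // Code generation: emit the assembly into memory and report success.
    public static bool Emit(Compilation compilation)
    {
        using (var stream = new MemoryStream())
        {
            return compilation.Emit(stream).Success;
        }
    }
}
```

Calling Parse, then Bind, then Emit on PipelineDemo.Source walks the sample through each stage in turn; for a source with a type error, ErrorCount would be expected to be non-zero and Emit to report failure.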

Starting up with Roslyn

As mentioned before, Roslyn gives us access to all the information the compiler has about our code. This means we can look at the syntax tree, we can access semantic information and we have APIs we can use to alter this information. As concrete examples, we could implement automatic code fixes or even unit tests that ensure certain coding guidelines are followed. If we work with both C# and VB, we could also build tools that let us copy code from C# and paste it directly as VB. Another useful feature that comes with Roslyn is the C# interactive window, which allows programmers to execute code snippets on the fly and see the results immediately, without needing to create a separate project for them. This is especially useful when trying out solutions to a specific problem.
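As an illustration of the coding-guideline idea, the following sketch queries the syntax tree for fields that break a naming rule. It assumes the released Microsoft.CodeAnalysis.CSharp NuGet package rather than the CTP, and both the class NamingGuideline and the rule itself (non-public fields start with an underscore) are invented for this example.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical guideline checker; assumes the Microsoft.CodeAnalysis.CSharp package.
public static class NamingGuideline
{
    // Returns the names of all non-public fields that do not start with '_'.
    public static List<string> BadlyNamedFields(string source)
    {
        SyntaxNode root = CSharpSyntaxTree.ParseText(source).GetRoot();

        return root.DescendantNodes()
                   .OfType<FieldDeclarationSyntax>()
                   // Skip fields that carry the 'public' modifier.
                   .Where(f => !f.Modifiers.Any(m => m.IsKind(SyntaxKind.PublicKeyword)))
                   // One declaration can introduce several variables: int a, b;
                   .SelectMany(f => f.Declaration.Variables)
                   .Select(v => v.Identifier.Text)
                   .Where(name => !name.StartsWith("_"))
                   .ToList();
    }
}
```

For the input "class C { int count; int _ok; public int Id; }" the method returns a list containing only "count". A unit test could run it over every source file in a project and assert that the result is empty.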

For the purpose of this article, I decided to implement a small C# application using the Roslyn APIs that offers a code refactoring tool integrated with the IDE. The tool will allow developers to move a field from a class to its parent class, together with the property associated with it, if there is one. This is just a proof of concept for working with Roslyn, so of course more complicated or more useful things can be done.

First, in order to get familiar with Roslyn, I recommend reading the Roslyn whitepaper and going through at least one of the walkthroughs. In addition, after installing the CTP, in your documents folder there is also a folder that contains some very useful code samples, which show how the Roslyn project types can be used. To anyone with some understanding of how a compiler works, the API is fairly easy to understand, at least after going through the whitepaper. Personally, I found the code samples from the CTP very useful and they helped me find the right way to implement my code refactoring tool.

Implementation

Disclaimer: There are different APIs that can be used in order to integrate your Roslyn tool with the Visual Studio IDE. Here I will only explain the approach for code refactoring.

For our code refactoring tool, we need 3 classes: CodeRefactoringProvider (which is generated by default with the Code Refactoring project type), CodeAction and FieldMover. The first one is triggered automatically by the IDE whenever the cursor changes position in the code file and checks whether our refactoring can be applied, i.e. whether we are positioned on a field declaration and the current class has a parent class. The second one shows a nice popup with a description if the refactoring is possible and checks whether there is a property associated with the field. Finally, the third class does the actual manipulation of the syntax tree, by moving the field and its property (if it exists). Below is a snippet which shows the method invoked to check if our refactoring can be applied:
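The following is not the method from the project itself, but a rough sketch of what such a check and the subsequent move might look like. It assumes the released Microsoft.CodeAnalysis.CSharp NuGet package, whose API differs from the CTP; the class MoveFieldSketch and its method names are hypothetical, and the associated property is left out to keep it short.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, not the FieldMover class from the project.
public static class MoveFieldSketch
{
    // Returns the field declaration at the given character position, but only
    // if it sits directly in a class that declares a base list; otherwise null.
    // Note: a base list may contain only interfaces. Telling a base class from
    // an interface would need semantic information, which this sketch skips.
    public static FieldDeclarationSyntax FindMovableField(SyntaxNode root, int position)
    {
        var field = root.FindToken(position).Parent?.AncestorsAndSelf()
                        .OfType<FieldDeclarationSyntax>()
                        .FirstOrDefault();
        var owner = field?.Parent as ClassDeclarationSyntax;
        return owner?.BaseList != null ? field : null;
    }

    // Looks up the parent class by the first name in the child's base list.
    // Assumption: the parent is declared in the same syntax tree.
    public static ClassDeclarationSyntax FindParentClass(SyntaxNode root, ClassDeclarationSyntax child)
    {
        var baseName = child.BaseList.Types.First().Type.ToString();
        return root.DescendantNodes()
                   .OfType<ClassDeclarationSyntax>()
                   .FirstOrDefault(c => c.Identifier.Text == baseName);
    }

    // Syntax trees are immutable, so "moving" means building a new root in
    // which the child lacks the field and the parent has gained it.
    // Assumption: neither class is nested inside the other.
    public static SyntaxNode MoveField(SyntaxNode root, FieldDeclarationSyntax field,
                                       ClassDeclarationSyntax child, ClassDeclarationSyntax parent)
    {
        var newChild = child.RemoveNode(field, SyntaxRemoveOptions.KeepNoTrivia);
        var newParent = parent.AddMembers(field);
        return root.ReplaceNodes(
            new[] { child, parent },
            (original, _) => original == child ? newChild : newParent);
    }
}
```

Given a tree parsed from "class Base { } class Derived : Base { int _count; }" and the position of _count, FindMovableField returns the field declaration, and MoveField produces a root in which the field belongs to Base instead of Derived. The sketch ignores accessibility (a private field would not be visible to the derived class after the move) as well as formatting of the result.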

For a deeper dive into the code, I invite you to visit GitHub, where you can also find the test project I used.

Using the refactoring extension

In order to use the refactoring there are 2 options: installing the Visual Studio extension that is produced when building the project, or simply running the project from within Visual Studio. To install the extension, you just need to double-click on the .vsix file in the Debug or Release folder, then run a special instance of Visual Studio that supports Roslyn like this: “devenv.exe /rootsuffix Roslyn” (from the command line). The other option just means running or debugging the Roslyn project from Visual Studio. This will launch a new instance of VS, where you need to open a solution to test the refactoring. Once you have the solution open and the cursor is on a field, a small rectangle will appear at the beginning of the field definition. If you click it, you will see a preview of the changes and can then apply them.

New roads opening

With Microsoft’s Project Roslyn, new roads are opening in a variety of areas, such as refactoring, automatic code fixes, meta-programming and so on. So the next time you feel you need a tool that makes your coding more efficient, you can go ahead and implement it yourself, so that it suits your needs perfectly.