Category: Uncategorized

Our first task is to coerce Roslyn to emit metadata and IL deltas between between two compilations. I say coerce because we’ll have to do quite a bit of work to get things working. The Compilation.EmitDifference() API is marked as public, but I’m fairly sure it’s yet to be actually used by the public. Getting everything to work requires reflection and manual copying of Roslyn code that doesn’t ship via NuGet.

The first order of business is to figure out what it takes to call Compilation.EmitDifference() in the first place. What parameters are we expected to provide? The signature:

EmitBaseline

An EmitBaseline represents a module created from a previous compilation. Modules live inside of assemblies and for our purposes it’s safe to assume that every module relates one-to-one with an assembly. (In reality multi-module assemblies can exist, but neither Visual Studio nor MSBuild support their creation). For more see this StackOverflow question.

We’ll look at the EmitBaseline as representing an assembly created from a previous compilation. We want to create a baseline to represent the initial compiled assembly before any changes are made to it. Roslyn can compare this baseline to new compilations we create.

Func<MethodDefinitionHandle, EditAndContinueMethodDebugInformation> proves much harder to work with. This function maps between methods and various debug information including a method’s local variable slots, lambdas and closures. This information can be generated by reading .pdb symbol files. Unfortunately there’s no public API for generating this function. What’s worse is that we’ll have to use test APIs that don’t even ship via NuGet so even Reflection is out of the question.

Instead we’ll have to piece together bits of code from Roslyn’s test utilities. Ultimately this requires that we copy code from the following files:

It’s a bit of a pain that we need to bring so much of Roslyn with us just for the sake of one file. It’s sort of like working with a ball of yarn; you pull on one string and the whole thing comes with it.

The SymReaderFactory coupled with the DiaSymReader packages can interpret debug information from Microsoft’s PDB format. Once we’ve copied these files to our project we can use the SymReaderFactory to create a debug information provider by feeding the PDB stream to SymReaderFactory.CreateReader().

IEnumerable<SemanticEdit>

SemanticEdits describe the differences between compilations at the symbol level. For example, modifying a method will introduce a SemanticEdit for the corresponding IMethodSymbol marking is as updated. Roslyn will end up converting these SemanticEdits into proper IL and metadata deltas.

It turns out SemanticEdit is a public class. The problem is that they’re difficult to generate properly. We have to diff Documents across different versions of a Solution which means we have to take into account changes in syntax, trivia and semantics. We also have to detect invalid changes which aren’t (to my knowledge) officially or completely documented anywhere. In this Roslyn issue, I propose three potential approaches to generating the edits, but we’ll only take a look at the one I’ve implemented myself: using the internal CSharpEditAndContinueAnalyzer.

Since these classes are internal we’ll have to use Reflection to get at them. We’ll also need to keep a copy of the Solution around with which we used to generate our EmitBaseline. I’ve put all of the code together into a complete sample. The reflection based approach for CSharpEditAndContinueAnalyzer is demonstrated in the GetSemanticEdits method below.

We can see that this is quite a bit of work just to build the edits. In the above sample we made a number of simplifying assumptions. We assumed there were no errors in the compilation, that there were no illegal edits and no active statements. It’s important to cover all cases if you plan to consume this API properly.

Our next step will be to apply these deltas to a running process using APIs exposed by the CLR.

When discussing the Emit API in my last post, I mentioned that Roslyn gives users the ability to emit deltas between compilations. As far as I know this API is only used by Visual Studio’s Edit and Continue (EnC) feature. When you edit a running program the compiler is smart enough to only emit the changes you’ve made to the previous compilation. The CLR is then smart enough to load these changes and preserve the state of the running program.

I’ve created a (large) sample on how to use Roslyn and the CLR to modify a running process that is available on GitHub. Over the next week we’ll take a look at what it takes to use both Roslyn and the CLR to achieve this.

I’ve had my eye on the Compilation.EmitDifference() API for almost a year now. I work on a Visual Studio extension called Alive that shows developers exactly what their source code does the moment they write it. This means that every time a user edits their code the extension re-compiles and re-emits the binary for their updated source code.

Re-emitting the compiled binary was a large bottleneck for us and created consistent GC pressure. When you emit a compilation you’re essentially dumping a big byte[] to memory. Worse still, if this byte[] contains over 85,000 elements then it goes straight to the large object heap. In our case these arrays weren’t long lived; the moment our users type we have to recompile and the previous binary becomes useless. Compilation.EmitDifference() allowed us to avoid emitting this giant array for every compilation and greatly reduce our extension’s memory footprint.

We can look at two approaches to consuming this API by comparing EnC and Alive. The primary difference between the two approaches is the preservation of state. EnC pauses execution of your program, lets you change it and resumes execution while retaining the previous program state. Alive has no need to preserve state between executions. It runs a given method and then waits for further instructions.

This difference means that EnC calculates the deltas between each compilation it creates, preserving state. Alive calculates deltas between the initial base compilation and the current state of the code.

How EnC builds deltas across compilations

How Alive builds deltas across compilations

The above deltas are simplified for the sake of explanation. In reality they exist as pairs of IL/Metadata deltas. Deltas also aren’t generated at the statement level, when you edit a method the CLR actually replaces the entire method with your new code.

There are also restrictions on what constitutes a valid edit. For detailed rules I’ll defer to Mike Stall’s post on valid edits but it’s possibly outdated. (One valid edit he doesn’t mention is the addition of new top-level types to a program) Programs that use these APIs should have fallback plans for invalid edits. Visual Studio’s EnC simply displays an error saying that it cannot continue while invalid edits are present. Alive falls back to its old approach and re-emits the compilation in its entirety.

In part two we’ll take a look at what it takes to get Roslyn to generate deltas between two compilations.

As of November, people outside of the Roslyn team have been able to build and dogfood changes they make to the compiler and language services. Now that the various feature branches have caught up, we can start playing around with some of the proposed features for C#.

The /future branch is where all these features end up once they’re close to complete and ready to be reviewed for more feedback. Today (Feburary 9, 2015) it’s home to binary literals, digit separators and local functions.

Today we’re going to look at the steps necessary to get the /future branch to build and let us test out the new features.

Up until now, we’ve mostly looked at how we can use Roslyn to analyze and manipulate source code. Now we’ll take a look at finishing the compilation process by emitting it disk or to memory. To start, we’ll just try emitting a simple compilation to disk and checking whether or not it succeeded.

After running this code we can see that our executable and .pdb have been emitted to Debug/bin/. We can double click output.exe and see that our program runs as expected. Keep in mind that the .pdb file is optional. I’ve only chosen to emit it here to show off the API. Writing the .pdb file to disk can take a fairly long time and it often pays to omit this argument unless you really need it.

Sometimes we might not want to emit to disk. We might just want to compile the code, emit it to memory and then execute it from memory. Keep in mind that for most cases where we’d want to do this, the scripting API probably makes more sense to use. Still, it pays to know our options.

Finally, what if we want to influence how our code is compiled? We might want to allow unsafe code, mark warnings as errors or delay sign the assembly. All of these options can be customized by passing a CSharpCompilationOptions object to CSharpCompilation.Create(). We’ll take a look at how we can interact with a few of these properties below.

In total there are about twenty-five different options available for customization. Basically any option you have within the Visual Studio’s project property page should be available here.

Advanced options

There are a few optional parameters available in Compilation.Emit() that are worth discussing. Some of them I’m familiar with, but others I’ve never used.

manifestResources – Allows you to manually embed resources such as strings and images within the emitted assembly. Batteries are not included with this API and it requires some heavy lifting if you want to embed .resx resources within your assembly. We’ll explore this overload in a future blog post.

win32ResourcesPath – Path of the file from which the compilation’s Win32 resources will be read (in RES format). Unfortunately I haven’t used this API yet and I’m not at all familiar with Win32 Resources.

There is also the option to EmitDifference between two compilations. I’m not familiar with this API, and I’m not familiar with how you can apply these deltas to existing assemblies on disk or in memory. I hope to learn more about this API in the coming months.

That just about wraps up the Emit API. If you have any questions, feel free to ask them in the comments below.

In previous posts we touched on the CSharpSyntaxWalker and the CSharpSyntaxRewriter. The SymbolVisitor is the analogue of SyntaxVisitor, but applies at the symbol level. Unfortunately unlike the SyntaxWalker and CSharpSyntaxRewriter, when using the SymbolVisitor we must construct the scaffolding code to visit all the nodes.

To simply list all the types available to a compilation we can use the following.

In order to visit all the methods available to a given compilation we can use the following:

It’s important to be aware of how you must structure your code in order to visit all the symbols you’re interested in. By now you may have noticed that using this API directly makes me a little sad. If I’m interested in visiting method symbols, I don’t want to have to write code that visits namespaces and types.

Hopefully at some point we’ll get a SymbolWalker class that we can use to separate out our implemenation from the traversal code. I’ve opened an issue on Roslyn requesting this feature. (It seems like it’s going to be challenging to implement and would require working with both syntax and symbols).

Finding All Named Type Symbols

Finally, you might be wondering how I answered my original question: How do we get a list of all of the types available to a compilation? My implementation is below:

I should note that after implementing this solution, I came to the conclusion that it was too slow for our purposes. We got a major performance boost by only visiting symbols within namespaces defined within source, but it was still about an order of magnitude slower than the simply searching for types via the SymbolFinder class.

Still, the SymbolVisitor class is probably appropriate for one-off uses during compilation or for visiting a subset of available symbols. At the very least, it’s worth being aware of.

The Scripting API is finally here! After being removed from Roslyn’s 1.0 release it’s now available (for C#) in pre-release format on NuGet. To install to your project just run:

Install-Package Microsoft.CodeAnalysis.Scripting -Pre

Note: You need to target .NET 4.6 or you’ll get the following exception when running your scripts:

Could not load file or assembly 'System.Runtime, Version=4.0.20.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. The system cannot find the file specified.

Note: Today (October 15, 2015) the Scripting APIs depend on the 1.1.0-beta1 release, so you’ll have to update your Microsoft.CodeAnalysis references to match if you want to use all of Roslyn with the scripting stuff.

There are a few different ways to use the Scripting API.

EvaluateAsync

CSharpScript.EvaluateAsync is probably the simplest way to get started evaluating expressions. Simple pass any expression that would return a single result to this method it will be evaluated for you.

RunAsync

Not every script returns a single value. For more complex scripts we may want to keep track of state or inspect different variables. CSharpScript.RunAsync creates and returns a ScriptState object that allows us to do exactly this. Take a look:

ScriptOptions

We can start to get into more interesting code by adding references to DLLs that we’d like to use. We use ScriptOptions to provide out script with the proper MetadataReferences.

This stuff is surprisingly broad. The Microsoft.CodeAnalysis.Scripting namespace is full of public types that I’m not at all familiar with and there’s a lot left to learn. I’m excited to see what people will build with this and how they might be able to incorporate scripting into their applications.

Kasey Uhlenhuth from the Roslyn team has compiled a list of code snippets to help get you off the ground with the Scripting API. Check them out on GitHub!

If you’ve got some cool plans for the scripting API, let me know if the comments below!

It can be tricky to keep track nodes when applying changes to syntax trees. Every time we “change” a tree, we’re really creating a copy of it with our changes applied to that new tree. The moment we do that, any pieces of syntax we had references to earlier become invalid in the context of the new tree.

What’s this mean in practice? It’s tough to keep track of syntax nodes when we change syntax trees.

A recent Stack Overflow question touched on this. How can we get the symbol for a class that we’ve just added to a document? We can create a new class declaration, but the moment we add it to the document, we lose track of the node. So how can we keep track of the class so we can get the symbol for it once we’ve added it to the document?

A SyntaxAnnotation is a basically piece of metadata we can attach to a piece of syntax. As we manipulate the tree, the annotation sticks with that piece of syntax making it easy to find.

There are a couple of overloads available when creating a SyntaxAnnotation. We can specify Kind and Data to be attached to pieces of syntax. Data is used to attach extra information to a piece of syntax that we’d like to retrieve later. Kind is a field we can use to search for Syntax Annotations.

So instead of looking for the exact instance of our annotation on each node, we could search for annotations based on their kind:

This is just one of a few different ways for dealing with Roslyn’s immutable trees. It’s probably not the easiest to use if you’re making multiple changes and need to track multiple syntax nodes. (If that’s the case, I’d recommend the DocumentEditor). That said, it’s good to be aware of it so you can use it when it makes sense.