Monthly Archives: July 2014

I don’t know if I’ll do this every week, but this week I hit two spots of .NET Compiler Platform API quicksand. I did not get out of either alone, so I wanted to share what I learned.

ToFullString()

I struggled fantastically with creating code for XML documentation. Run the Roslyn quoter against a simple comment and you’ll get the gist of it.

For my work with RoslynDom I need to go both ways after modifying the documentation:

– Code -> Compiler API (syntax tree) -> RoslynDom items

– RoslynDom items -> Compiler API (syntax tree) -> to code


The first works great. Grab the symbol and you can grab the documentation:

Symbol.GetDocumentationCommentXml()

This gives you the XML as a string. Just load it as an XDocument and run as much LINQ to XML as you like. All is good.
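A minimal sketch of that first direction, assuming the Microsoft.CodeAnalysis.CSharp package is referenced. The `DocCommentReader` class and `GetSummary` method are hypothetical names for illustration; only the Roslyn and LINQ to XML calls come from the real libraries. It also assumes the comment is well-formed XML, since a malformed comment will not load as an XDocument.

```csharp
using System.Linq;
using System.Xml.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, not part of RoslynDom.
public static class DocCommentReader
{
    // Returns the trimmed text of the <summary> element for the first class
    // in the source, or null when that class has no documentation comment.
    public static string GetSummary(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);
        CSharpCompilation compilation = CSharpCompilation.Create("Demo", new[] { tree });
        SemanticModel model = compilation.GetSemanticModel(tree);

        ClassDeclarationSyntax classNode = tree.GetRoot()
            .DescendantNodes()
            .OfType<ClassDeclarationSyntax>()
            .First();

        // Grab the symbol, then grab the documentation as an XML string.
        ISymbol symbol = model.GetDeclaredSymbol(classNode);
        string xml = symbol.GetDocumentationCommentXml();
        if (string.IsNullOrWhiteSpace(xml)) return null;

        // From here on it is ordinary LINQ to XML.
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants("summary").FirstOrDefault()?.Value.Trim();
    }
}
```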

But then… I needed to recreate the syntax tree. I really, really felt I should be able to build it up from primitives. After a few hours banging my head against that wall, I had to accept the core rule of …

The .NET Compiler Platform is a compiler. What it does really, really well is parse code.

So, even though it made me feel dirty, I wrote out the XML to a string, split the string into lines, iterated over the lines inserting the three slashes, and asked the SyntaxFactory to parse it. If you’re struggling to build something, see if you can parse into what you need.

In this particular case, it failed. I mean I had the output and it looked good, but the first three slashes were missing and the end of line at the end was missing. Specifically, I mean when I wrote it out in the immediate window these were missing. Crap.

Happily I have friends. Anthony D Green (ADG) on the team pointed out that I wasn’t using ToFullString(). At various points in working with the API, ToString() may do surprising things – working too hard or just getting nuts on your behalf. Perhaps someone somewhere needs the stripped version.

If you’re looking at a string output from the API, check it also with ToFullString().
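The parse-instead-of-build approach might look roughly like this. It is a sketch under assumptions, not the RoslynDom implementation: `DocCommentWriter`, `ToDocCommentText` and `ParseDocComment` are hypothetical names, and the only Roslyn call used is `SyntaxFactory.ParseLeadingTrivia`.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper, not part of RoslynDom.
public static class DocCommentWriter
{
    // Splits the XML into lines, prefixes each with the three slashes,
    // and ends each line with \r\n.
    public static string ToDocCommentText(string xml)
    {
        string[] lines = xml.Trim().Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
        return string.Concat(lines.Select(line => "/// " + line.TrimEnd() + "\r\n"));
    }

    // Lets the parser build the trivia instead of assembling it from primitives.
    public static SyntaxTriviaList ParseDocComment(string xml)
    {
        return SyntaxFactory.ParseLeadingTrivia(ToDocCommentText(xml));
    }
}
```

When checking the result in the debugger, inspect `DocCommentWriter.ParseDocComment(xml).ToFullString()` so that the slashes and the trailing end of line are included in what you see.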

The Formatter is picky, and EndOfLineTrivia requires \r\n

The .NET Compiler Platform is designed, and massively tested, with code that can happen in the real world from its own parsing. When you build trees, there is a large number of ways you can mess up that could never happen through parsing. I’d say infinite, but my son is an astrophysicist and doesn’t let me say things are infinite.

In my case, I naively thought that EndOfLineTrivia would understand that it was supposed to, well, you know, output an end of line. I did not anticipate that I would also need to pass a \r\n. I also did not anticipate that it would silently create an object that would later cause an exception – deep in the heart of the Formatter API. This time Balaji Soundrarajan did a little telepathic debugging and guessed that I’d failed to include \r\n. Thanks to him and all the folks that took a look at that one!
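For reference, a small sketch of two ways to get an end of line that actually carries its text. The `EndOfLineTriviaExample` class and its method names are hypothetical; `SyntaxFactory.EndOfLine`, `SyntaxFactory.CarriageReturnLineFeed`, `SyntaxFactory.Token` and `WithTrailingTrivia` are the Roslyn members being shown.

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical wrapper class for illustration only.
public static class EndOfLineTriviaExample
{
    // The text argument is what ends up in the output;
    // the trivia kind alone does not supply any characters.
    public static SyntaxTrivia Explicit()
    {
        return SyntaxFactory.EndOfLine("\r\n");
    }

    // Predefined trivia that already carries "\r\n".
    public static SyntaxTrivia Predefined()
    {
        return SyntaxFactory.CarriageReturnLineFeed;
    }

    // A semicolon token followed by an end of line, written out in full.
    public static string SemicolonWithNewline()
    {
        return SyntaxFactory.Token(SyntaxKind.SemicolonToken)
            .WithTrailingTrivia(SyntaxFactory.EndOfLine("\r\n"))
            .ToFullString();
    }
}
```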

The goal of RoslynDom is to present information about your code in the way you think about your code.

A note on VB: I’m building out the C# version first, but I know VB very well and am designing to support later VB creation. If something is at odds with good C# support, I’ll cross that bridge when I get there.

You can get RoslynDom on NuGet via the Package Manager in Visual Studio and here on GitHub. Keep in mind that it is an early experimental release.

RoslynDom celebrates the awesome .NET Compiler Platform, but also respects that the .NET Compiler Platform is built as a compiler, and you are not a compiler.

Introduction

I started from the outside, at the highest level of code in a single file, and am working inward – from structure to statements and eventually expressions. Support for multiple files is coming – but not until I’ve completed work on statements.

By structural, I mean artifacts that organize your code – namespaces, classes, structures, etc. This post shows how to use RoslynDom to query code. Changing code is a different post. You can rather easily change the RoslynDom, but outputting a new tree with your changes is currently buggy. In the meantime, most RoslynDom items expose the SyntaxNode they were created from, and where practical the corresponding ISymbol (SyntaxNode and ISymbol are part of the .NET Compiler Platform). You can use RoslynDom to get to the right location in your code, and then use .NET Compiler Platform techniques.

You can find more about the scenarios I wrote RoslynDom to support here. If you have a tool idea and want me to make RoslynDom friendly to what you’re doing, let’s talk.

This post has walk-throughs of how you can use RoslynDom today. RoslynDom is a library to build tools from – it is not itself a tool. One tool that has been built on top of it is Jim Christopher’s RoslynDom-Provider.

Retrieving Namespaces

A namespace is a logical container. It’s orthogonal to the structure of your running application and tools like ObjectBrowser offer alternate physical (assembly/module) and logical (namespace) trees.

RoslynDom sets out to give access to your code the way you think about it, and you might think about it differently at different times. Both of these statements are true:

A namespace is a dot delimited string attached to a class or other type to give it a more complete and hopefully unique name (in->out)

A namespace is an identifier that you put at the top of a file that groups the contained code with related code in different files (out->in)

A namespace can be nested – the namespace System contains the namespace System.Diagnostics

The nesting of namespaces in code is entirely arbitrary – these code fragments are logically identical:
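As an illustration of that point (the namespace and class names here are hypothetical), these two declarations put their types in the same namespace, `Outer.Inner`:

```csharp
// Form 1: nested namespace declarations.
namespace Outer
{
    namespace Inner
    {
        public class First { }
    }
}

// Form 2: a single dotted namespace declaration.
namespace Outer.Inner
{
    public class Second { }
}
```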

The .NET Compiler Platform manages namespaces differently in the two trees. RoslynDom is committed to expressing code the way you think of it, and you probably don’t think of your code in terms of different access mechanisms, each good for different things.

To access namespace information in RoslynDom, you first load your code. You can do this from a file, a source code string, a project document, or a SyntaxTree. For example:

These provide your namespaces as you wrote them and where the namespace is actually in use in this root. The following code would have two members in the Namespaces property and one member in the NonemptyNamespaces property:

Since namespaces and using directives can appear within other namespaces, both of these properties also appear on the RoslynDom INamespace interface.

Retrieving Classes

The next step down the structural hierarchy is classes, structures and other types. These may appear at the root or in a namespace. You probably have a single namespace in your file and probably do not perceive your file as a nested structure of namespace(s) containing types. RoslynDom supports both approaches:

Type attachable, or type members: methods, properties, fields, (soon) enum values and (soon) events (constructors are currently a special case of a method, but waiting for a final understanding of primary constructors)

Statements attachable to methods and property accessors

Expressions that can be attached to statements and to fields as initializers (and now properties)

Remembering how you access namespaces, you can probably predict the code to access type members in RoslynDom:

Except in the case of CLR types, you may not have access to the Reflection runtime type. To avoid taking a dependency on the .NET Compiler Platform, RoslynDom has its own type class. Alas, that’s another post.

Here is the set of features RoslynDom makes available for methods and parameters:

Attributes

Attributes are another area where the .NET Compiler Platform syntax tree keeps track of arbitrary differences in how code is written – differences that you don’t think about when reading code. These two fragments of code have the same intent.
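A pair along these lines shows the kind of difference meant. The attribute and class names are hypothetical: one class lists both attributes in a single bracket pair, the other gives each attribute its own brackets, and both end up with the same two attributes.

```csharp
using System;

// Hypothetical attributes for illustration only.
[AttributeUsage(AttributeTargets.Class)]
public sealed class AuditedAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Class)]
public sealed class CachedAttribute : Attribute { }

// Form 1: both attributes in one attribute list.
[Audited, Cached]
public class Combined { }

// Form 2: each attribute in its own list.
[Audited]
[Cached]
public class Separate { }
```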

A similar LINQ expression could return the matching, or non-matching classes.

Attributes may have values, which would be the parameters to the attributes. Since RoslynDom does not yet support multiple files, the attributes aren’t fully resolved and positional arguments are currently problematic.

Summary

RoslynDom is in a preliminary stage, and I’d be happy to hear your thoughts. The goal of RoslynDom is to enhance the .NET Compiler Platform to make humans like you and me happy accessing the fantastic information the compiler is exposing!

In any tree things can become, well, interesting, if nodes appear in more than one location. This is particularly damaging in a tree that takes characteristics from context – which happens with naming (namespaces and nested classes) in the .NET class model. Thus, by intent, no item may appear in more than one location in the tree.

When a member is cloned, its parent is not copied with it. Also, parent and parent properties are not used in determining same intent.

Real-time Namespace property

Previously, Namespace was stored from the symbol when the instance was created. Because Namespace is contextual, this was incorrect. Namespace is now calculated from the parent hierarchy when the namespace is requested for all classes except RDomReferencedType. This resulted in some changes in Namespace results, including the result from

Namespace testing.Foo

Which previously returned Foo and now returns testing.Foo.
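The idea of calculating the namespace from the parent hierarchy on request can be sketched as follows. The `Node` class is a hypothetical stand-in, not RoslynDom’s implementation; it only shows why a value computed from the parents stays correct when an item is moved.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in, not a RoslynDom type.
public class Node
{
    public string Name { get; }
    public bool IsNamespace { get; }
    public Node Parent { get; set; }

    public Node(string name, bool isNamespace)
    {
        Name = name;
        IsNamespace = isNamespace;
    }

    // Computed on every request by walking up the parents,
    // outermost namespace first, rather than stored at creation.
    public string Namespace
    {
        get
        {
            var parts = new List<string>();
            for (Node current = Parent; current != null; current = current.Parent)
            {
                if (current.IsNamespace) parts.Insert(0, current.Name);
            }
            return string.Join(".", parts);
        }
    }
}
```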

The Namespace in RDomReferencedType is the namespace of the type being referenced, so it is still retrieved from the symbol on load.

AddOrMoveMember and RemoveMember methods

Methods to add members to containers have been added to new IRDomStemContainer, IRDomTypeContainer and IRDomCodeContainer interfaces.

As discussed under the heading “New Parent property on all items,” IDom items may not appear in more than one location in the tree. The AddOrMove semantics reflect this. I actually think moving will be a rare task, but if you accidentally add an item to a new location in the tree, RoslynDom will remove it from the prior location, and I wanted the naming to make this clear.
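A sketch of what add-or-move semantics amount to, using hypothetical `Item` and `Container` classes rather than the RoslynDom interfaces: adding an item that already has a parent detaches it from that parent first.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins, not RoslynDom types.
public class Item
{
    public string Name { get; }
    public Container Parent { get; internal set; }

    public Item(string name)
    {
        Name = name;
    }
}

public class Container
{
    private readonly List<Item> members = new List<Item>();

    public IReadOnlyList<Item> Members
    {
        get { return members; }
    }

    // Detaches the item from its current parent before adding it here,
    // so an item never appears in two places in the tree.
    public void AddOrMoveMember(Item item)
    {
        if (item.Parent == this) return;
        if (item.Parent != null) item.Parent.RemoveMember(item);
        members.Add(item);
        item.Parent = this;
    }

    // Returns false when the item was not a member of this container.
    public bool RemoveMember(Item item)
    {
        if (!members.Remove(item)) return false;
        item.Parent = null;
        return true;
    }
}
```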

I may add an “AddCloneOfMember” to simplify the process of cloning a member and adding it to a new location after changes. This is the anticipated use case.

ICodeContainer and ICodeMember interfaces

There are new ICodeContainer and ICodeMember interfaces. Support for intra-member features (code) remains almost non-existent in this version.

RawItem and OriginalRawItem semantic changes

RawItem and the new OriginalRawItem on the IDom interface represent the underlying data in an agnostic way. IDom is agnostic on mutability so there may be future implementations where RawItem and OriginalRawItem are always the same. I want the semantics to be clear that RawItem is the best current capturing of the tree, and OriginalRawItem is the original unchanged item. This intentionally implies that the original must be maintained.

TypedSyntax and OriginalTypedSyntax are the RDom implementations of these generalized ideas.

AddMember method added to RDomStemContainer and RDomBaseType

To support mutability, AddMember methods were added to these two base classes. This makes the ability to add types and type members available to appropriate types, namespaces, and the root.

Changed return of PublicAnnotationList.GetValue(string key)

Previously this returned the default value, which blocked access to other values. It now returns the PublicAnnotation. The default value remains accessible by GetValue(name, name).

Changed PublicAnnotation to a Class

PublicAnnotation was a struct. This was the only struct in the system and I felt the value/reference semantic difference would be detrimental to maintenance. As part of this, I removed the equality testing and added a SameIntent method.

Added IHasSameIntentMethod interface

Another characteristic interface was added for the SameIntent methods. This is for consistency with other characteristic interface usage.

Moved SameIntent to a subsystem in RoslynDom.Common

This code may eventually run with a DI, but for now, if the interface data matches, they match.

Changed SameIntent method type parameter

Previously the SameIntent method appeared on the strongly typed IDom<T> interface and could only be called on items of the same type. This was overly restrictive, so the method was changed to have a local strongly typed parameter, constrained only to be a class. Comparing different IDom types of the current implementations will always return false, although it is possible that a derived class could be created that had different behavior, but the same intent, as one of the existing implementation classes, and could therefore return true as the same intent. This was also done to support scenarios where the type is not known, such as public annotations that might be IDom types.

Changed inheritance semantics of SameIntent() method

The previous inheritance semantics of the SameIntent method were to directly override the public SameIntent method. This method is no longer virtual. Instead override the CheckSameIntent protected method. Be sure to call the base CheckSameIntent method for correct behavior.
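The shape of this pattern, sketched with hypothetical `ItemBase` and `MethodItem` classes (the real RoslynDom signatures may differ): the public method is not virtual and takes a type parameter constrained to a class, while derived classes override the protected check and call the base.

```csharp
// Hypothetical stand-ins, not RoslynDom classes.
public class ItemBase
{
    public string Name { get; set; }

    // Public entry point: not virtual, so the shared checks always run.
    public bool SameIntent<T>(T other) where T : class
    {
        var otherItem = other as ItemBase;
        if (otherItem == null || otherItem.GetType() != GetType()) return false;
        return CheckSameIntent(otherItem);
    }

    // Derived classes override this rather than SameIntent.
    protected virtual bool CheckSameIntent(ItemBase other)
    {
        return Name == other.Name;
    }
}

public class MethodItem : ItemBase
{
    public string ReturnType { get; set; }

    protected override bool CheckSameIntent(ItemBase other)
    {
        // Call the base first so its checks are not skipped.
        if (!base.CheckSameIntent(other)) return false;
        return ReturnType == ((MethodItem)other).ReturnType;
    }
}
```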

SameIntent and names

Type members (fields, properties, methods and enum) do not include outer name when considering same intent.

Stem members (types, namespaces) do not include namespace/qualified name in same intent.

Added IHasLookupValues interface

Added this interface to reduce dependencies in an upcoming project.

Virtual Matches method added to IDom

Immediately this allows CheckSameIntentChild to better find the other child to compare to. It also provides a generalized way to find items in a list.

Changed name of RDomTypeParameter.HasReferenceTypeConstraint

This was previously HasReferenceConstraint and was changed for consistency. ITypeParameter was changed as well.

Changed name of MemberKind, StemMemberKind and LiteralKind

The suffix “type” is confusing, so I switched these enums and property names to “kind.”

BuildSyntax

Implementation of syntax recreation from changed nodes has begun but is not complete.

Thanks to Llewellyn Falco for his ongoing support and insight. He is encouraging me to release RoslynDom frequently, and to get a preliminary release of CodeFirstMetadata to NuGet as well as GitHub real soon.

You can get the bits here and download the NuGet package through the Visual Studio package manager or another NuGet client.

These are experimental releases, and as such are not signed.

SameIntent methods

For the work I am doing, I am more interested in the intent of the code than the details of it. There are a number of ways different code can result in identical behavior, including ordering of members, attribute syntax details, namespace nesting, and use of named parameters. The first version of the SameIntent methods is fairly conservative – not all code with identical results will be found, just the big, common issues.

Cloning as Copy methods

I added a feature to clone RoslynDom items. This is a precursor to adding mutability, but mutability is not yet available. This involved changing a number of items from direct access to the underlying trees to retrieving this information into local fields. All tests pass, but if you find a missing feature or anything funny, let me know.

PublicAnnotationList replaces IEnumerable<PublicAnnotation>

Previously RDomBase managed a list of PublicAnnotation. This was a bad refactoring of concerns, so I added a PublicAnnotationList class. This cleaned up the code in RDomBase and will make it easier to evolve the PublicAnnotationList.

Removed RDomSyntaxNodeBase from hierarchy

At one point this class seemed appropriate in the hierarchy. It wasn’t doing anything and was removed.

NonEmptyNamespaces renamed to NonemptyNamespaces

Cleanup issue found by FxCop.

Improved code analysis (FxCop) and test coverage

I may separately blog about how positive the code analysis exercise was – in spite of my deep dread of what I would find. The recommended rules had only one issue – which I thought was pretty cool. Switching to All Microsoft Rules for the non-testing libraries resulted in about 100 issues. I dropped this to under 25 and almost all the changes were things I was really happy to find – insufficient checks for nulls on method entry, a couple of naming fixes.