Refactoring

First of all, I’d like to point out that I was kindly given a license by the folks at NDepend (that sort of thing doesn’t happen very often, I can assure you!), and I’m under absolutely no obligation to write anything about it.

In the beginning…

The funny thing is that it was probably over a year ago that I first installed the product, without any specific requirement or expectation. I had a little play with it (on Visual Studio 2008, as I recall), then the work I ‘had’ to do overtook my will to learn this new product, and it lay gathering dust on my hard drive. This probably explains why I haven’t posted in all that time!

But then…

Recently, I picked up an existing project (on Visual Studio 2010) and wanted to have a good look inside to see what I was getting myself into. I dusted off NDepend and told myself I’d give it a good go this time…

First Impressions

The first thing I learned is that this is one significant addin, and you realistically need to ‘know you need it’ before you get it (see the ‘gathering dust’ comment above). This also means you need to know what it can do for you – which is plenty!

If you’re reading this and thinking of trialling NDepend, then you either have problems to solve or you want to invest in ongoing improvement to your code. Both are very good reasons, as it happens.

NDepend has few limitations in what it can do, as it has your entire codebase, Visual Studio extensibility and its own powerful rules engine at its disposal. It also employs its own Code Query Language (CQL) to let you find all sorts of patterns and complexity problems in your code.
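By way of a flavour, CQL reads like SQL over your codebase rather than your data. A typical ‘long methods’ rule looks something like this (the metric name is from memory, so treat it as illustrative):

```
SELECT METHODS WHERE NbLinesOfCode > 30 ORDER BY NbLinesOfCode DESC
```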

The biggest problem is knowing where to start, or discovering that first task you want to achieve with it. It’s easy to get overwhelmed by the information it bombards you with when you spin it up.

To be fair, there are plenty of links trying to lead you to ‘what you’re looking at is…’

Reasons to try/buy

If you’re interested in the quality of your code, I believe there really is no equal. This is the tool you need. You may already be using FxCop in your build process to check for certain syntactical rules, and ReSharper for sorting out your code as you go, but NDepend can do all sorts of ‘different’ funky stuff (through CQL) that digs deep into your code to enforce things that would otherwise be difficult to check. It can obviously do all the simple stuff too, like showing you the dependencies between methods, classes and projects, redundant code, etc.

Things to be aware of

It’s a technical tool, and it’s easy to get a little overwhelmed with what it can do and where to start.

Time is needed to understand some of the concepts and power of the product.

You’ll need a beefy machine to avoid things slowing down with the addin loaded (I had to disable it for a while when I was using a solution with 60 projects as I was starting to experience memory issues). If you don’t want to run it in Visual Studio, you can run it in the standalone ‘Visual NDepend’ application.

I’ll admit I haven’t spent a lot of time with the interactive reports, and I don’t find some of the graphical representations of the metrics that easy to use.

I think like most products, you get comfortable with what you see as valuable, and tend to only try other things when you have time.

Summary

Clearly NDepend’s a very impressive tool for any serious development team to be using. It will help you learn about reducing complexity and dependencies, and generally designing your code in an efficient way. It’s basically all about improving quality.

It’s also a big product that’s not for the faint-hearted. You basically get out what you put in, as far as effort in understanding what it’s trying to achieve for you.

I think the key is finding the right balance between all the technical information it presents, the time you have available, and the business benefit you’ll get from code improvements.

As I said at the start, it can basically take you as far as ‘you’ want to go.

OK – last one for today. I’m not going to go into too much detail, except to say that the offending piece of code loads all items in a ‘matrix’ database table and builds a business object collection hierarchy. This table has 4 fields, one or more of which may be NULL – with different meanings. I really don’t like this type of approach, but changing it right now would involve quite a bit of work.

The table looks something like:

Level1ID | Level2aID | Level2bID | Level2cID

This ultimately translates to a business object hierarchy of:

Rows with Level1ID (only) – all other fields null – are Level1 objects

Rows with Level1ID and Level2aID (only) are Level2a objects

etc…

The matrix table just contains the primary keys for each object so the code loads each one in the style below…
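The original code isn’t reproduced here, but the pattern was roughly as follows (all the class and method names are my own illustrative stand-ins, not the real code):

```csharp
// Roughly the original shape: one stored proc call per matrix row to load
// each object, plus further calls later to populate each child collection.
DataTable matrix = GetMatrixRows(); // hypothetical: selects the whole matrix table

foreach (DataRow row in matrix.Rows)
{
    if (row["Level2aID"] == DBNull.Value &&
        row["Level2bID"] == DBNull.Value &&
        row["Level2cID"] == DBNull.Value)
    {
        // Only Level1ID populated: a Level1 object (one proc call to load it)
        level1s.Add(Level1.Load((int)row["Level1ID"]));
    }
    else if (row["Level2aID"] != DBNull.Value)
    {
        // Level1ID + Level2aID: a Level2a object, attached to its parent
        level1s[(int)row["Level1ID"]].Level2aCollection
               .Add(Level2a.Load((int)row["Level2aID"]));
    }
    // ...and similarly for Level2b / Level2c
}
```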

This seems reasonable enough (logical anyway) given the way the data’s being retrieved.

This does, however, load several hundred more rows via further stored proc calls that load each new object into the child collections. The whole thing consistently takes around 8 seconds.

The Refactoring

A wise man once told me that if you can be sure the bulk of your operation is data access then lump it all together if you can, and do the loop or other grunt processing on the client.

I created a new Stored Procedure to return 4 result sets. These come back as 4 tables in the target DataSet. Each result set is qualified by the same master criteria, and just uses joins to get a set of data that can be loaded directly into the collections. The original stored proc is no longer required, and this is now the only data access call.

I changed my collection classes slightly to allow a LoadData method which takes in a dataset, a tablename and a parent key. This means I can add Level2a objects to the appropriate Level1 collection. The pseudo code now looks like…

if Level2a results present
    for each Level1 row
        LoadData on Level1[Level1ID].Level2aCollection

if Level2b results present
    for each Level1 row
        LoadData on Level1[Level1ID].Level2bCollection

if Level2c results present
    for each Level1 row
        LoadData on Level1[Level1ID].Level2cCollection
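In C# terms, that pseudo code comes out along these lines (again, the names are guesses at the shape of the real code, not the actual implementation):

```csharp
// One round-trip: the new stored proc returns four result sets,
// which land as four tables in the target DataSet.
DataSet ds = GetHierarchyData(masterCriteria); // hypothetical data access call

// The "Level1" table populates the top-level collection directly.
level1s.LoadData(ds, "Level1", null);

// Each child table is routed to the right parent collection by key.
if (ds.Tables.Contains("Level2a"))
    foreach (Level1 parent in level1s)
        parent.Level2aCollection.LoadData(ds, "Level2a", parent.Level1ID);

if (ds.Tables.Contains("Level2b"))
    foreach (Level1 parent in level1s)
        parent.Level2bCollection.LoadData(ds, "Level2b", parent.Level1ID);

// ...and the same again for Level2c
```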

As I said at the beginning, there are some definite improvements to be made from changing the data structure, and a lot of this code could look a lot nicer by using Typed Datasets with relationships defined.

The new approach actually completes in less than 100ms. I couldn’t quite believe it myself, and made sure I killed connections, cleared caches, etc. to make sure the database was coming in cold. Still the same result.

This basically proves that for data-heavy operations, things really start to hurt when you make repeated client round-trips, however small each call. That works out at roughly a 99% saving in load time for this business object.

The resulting page is also really snappy now and I’m sure the customer won’t even notice 🙂

I know many people have written about this one, but it’s cropped up yet again in the app I’m maintaining: the old chestnut of setting a string based on an enum value – when the enum names are identical to the string values (most often with one-word names).

There are other ways to do this if you want more of a ‘description’ (e.g. multiple words). The StringEnum class could do it for you, or you could provide your own implementation with a Description attribute on the values – e.g. [Description("My extended value name")].
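As a sketch of both options (the enum and its values here are made up for illustration):

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

public enum Status
{
    Active,                              // the name doubles as the string value
    [Description("No longer active")]    // extended, multi-word description
    Inactive
}

public static class EnumText
{
    // Simple case: the enum name *is* the string you want.
    public static string Name(Enum value)
    {
        return value.ToString();
    }

    // Extended case: read the Description attribute off the value, if present.
    public static string Description(Enum value)
    {
        FieldInfo field = value.GetType().GetField(value.ToString());
        var attr = (DescriptionAttribute)Attribute.GetCustomAttribute(
            field, typeof(DescriptionAttribute));
        return attr != null ? attr.Description : value.ToString();
    }
}
```

So EnumText.Name(Status.Active) gives "Active", and EnumText.Description(Status.Inactive) gives "No longer active" – no switch statement in sight.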

My ‘entry point’ into this refactoring was trying to eradicate dynamic SQL, and also the dreaded SELECT *, so this snippet popped up in my search. My immediate reaction was to refactor the dynamic SQL into a Stored Procedure, but then I took a step back and had a further look at the code.

The following has been changed slightly (naming) to protect the innocent.

What this code is actually doing is taking an already-populated collection of ‘Organisation’ objects and performing a nasty dynamic SQL call (using a constructed IN clause) back to the database to retrieve the same data again, just so it can bind against a datagrid.

It then goes back to using the original collection further on and does some more binding through the object hierarchy by linking to child controls in the grid.

It’s clear here that we’ve got more going on than the need to just refactor the SQL code (dodgy table names, etc). It turns out that I’d touched this code quite recently, refactoring what was a ‘pseudo-not-quite’ collection into a generic dictionary. The original poor collection implementation was probably confusing enough to put the developer off trying to use it for data binding. The grid can now, of course, be bound against the original collection, with no need to go back to the database.

This not only removed bad and unnecessary code that could have hurt performance, it also made me focus on ‘real’ refactoring opportunities. It turned out that 90% of the remaining dynamic SQL calls were actually duplicates of this code in other ASP.NET pages, and so they could be removed too.

Refactoring can be quite like peeling an onion. I’d been in this code a few days ago, and due to a small improvement made then (with the collection), I was able to come back and do some more. The moral here is ‘patience’. If you’re serious about improving all code you touch then you learn to make small, manageable changes that you can test before moving on to the next. There’s still work to do on this application but the code smells are starting to fade.

As it’s Friday I thought – let’s do some cleaning up. The following exercise is refactoring a class from an app I’m maintaining at the moment. The app’s fine (functional etc), and was originally a .NET 1.1 build. A combination of this, a previous, straight conversion to .NET 2.0 (with the minimum of mods) and a variety of historical developers (mostly junior) means that the code has a certain amount of ‘odour’ in numerous places.

Lack of consistency isn’t the main problem; it’s more about doing things smartly rather than taking the path of least resistance with your current knowledge. The latter generally means you don’t learn much, don’t get much satisfaction or pride from your work, and you may end up a tad confused as to why you always end up with the ‘less glamorous’ projects.

This is a simple wrapper implementation. In conjunction with the naming you may conclude this is an ex-VB programmer who hasn’t quite grasped object-oriented concepts yet (an observation – not a criticism). If the developer knew about inheritance then they would have hopefully just inherited from one of the many applicable classes in System.Collections. This would have also removed the need for implementing a custom Enumerator class. Note that neither class actually inherits from any useful base class. This is copy/paste pseudo-inheritance, and is rarely a wise idea.

The Exists and Item methods use a simple loop mechanism to find the item they’re looking for. This is potentially wasteful with many items, and goes back to the ‘path of least resistance’ thought. It’s also using the inefficient ‘ToLower’ approach to compare what should be a key. In practice the match doesn’t need to be case-insensitive, and if it did, then String.Compare should be used with the appropriate ‘ignoreCase’ argument. The developer clearly isn’t familiar (or maybe just not comfortable) with dictionaries or hashtables, where access to items via a key is supported and far more efficient.
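To make the contrast concrete (the class and member names here are simplified stand-ins for the real code):

```csharp
// Original style: O(n) scan, with ToLower() allocating two new strings
// on every single comparison.
public MenuItem Item(string title)
{
    foreach (MenuItem item in items)    // items: the internal ArrayList
    {
        if (item.Title.ToLower() == title.ToLower())
            return item;
    }
    return null;
}

// Keyed style: O(1) lookup. If case-insensitivity really were required,
// the dictionary could be built with StringComparer.OrdinalIgnoreCase.
private Dictionary<string, MenuItem> itemsByTitle =
    new Dictionary<string, MenuItem>();

public MenuItem Item(string title)
{
    MenuItem item;
    return itemsByTitle.TryGetValue(title, out item) ? item : null;
}
```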

The Count implementation is a method rather than a property (aargh!).

It could be argued that the menuitem class itself (not shown here) is redundant as the ASP.NET framework does define a similar MenuItem class. In reality the whole implementation could be replaced with a sitemap, a menu control, and some custom code, but I’ll leave that out of scope for now.

That gives some pointers as to what ‘could’ have been done in .NET 1.1. .NET 2.0, of course, introduced generics, so let’s follow that path.

The MenuItems class is used in a few places:

A menu control (that renders menu items)

Many pages that load items into the control (I’ve already refactored some of this into inherited menu controls, where all pages share the same ‘dynamic’ menu) – but there are plenty more improvements to be made.

The first thing is to say…

I don’t need a collections class…

The following is now in the Menu Control:

//private MenuItems menuItems = new MenuItems(); // Out with the old
private Dictionary<string, MenuItem> menuItems = new Dictionary<string, MenuItem>(); // In with the new

This simply uses a generic Dictionary with a string key (as we find items by title). Anywhere that references this dictionary will now need to supply a ‘key’ in certain places (e.g. the Add method), and the Item method will no longer work, as that’s gone. That needs to change to a class indexer.
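So the call sites change along these lines (the constructor arguments are invented for the example):

```csharp
// Before: keyless Add on the custom collection, lookup via Item().
menuItems.Add(new MenuItem("Home", "~/Default.aspx"));
MenuItem selected = menuItems.Item("Home");

// After: the Dictionary needs the title as a key on Add,
// and lookup becomes the standard indexer.
menuItems.Add("Home", new MenuItem("Home", "~/Default.aspx"));
MenuItem selected2 = menuItems["Home"];
```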

Fortunately Visual Studio’s find and replace can cope with all of this as the code is largely consistent…

A quick compile should show up all the patterns (from the errors) if there are any inconsistencies. There were probably about 200 references to patch here, with about 4 different variations of naming and scope – but it took about 5 minutes to patch (including writing this text :-))

We then have to patch all the ‘Item’ references. These are almost always setting a ‘selected’ property (Stopwatch on…)

As there are only a few references here (about 20), I could patch them manually, but this could be done entirely with regular expressions in find and replace. http://msdn.microsoft.com/en-us/library/2k3te2cs.aspx shows all the options available. I tend to only use regular expressions for large-volume stuff, where I get a return on the time invested in working out the expression!
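For what it’s worth, the substitution itself is easy to express as a .NET regular expression. Here it is via Regex.Replace rather than the VS dialog (the Visual Studio find/replace of this era used its own non-standard syntax, which the link above documents):

```csharp
using System;
using System.Text.RegularExpressions;

class Demo
{
    static void Main()
    {
        string before =
            "Menu1.MenuItems.Item(UserMenu.MyCommunicationItems).Selected = true;";

        // Turn ".Item( x )" into "[ x ]"; the lazy +? stops at the first ')'.
        string after = Regex.Replace(before, @"\.Item\((.+?)\)", "[$1]");

        Console.WriteLine(after);
        // Menu1.MenuItems[UserMenu.MyCommunicationItems].Selected = true;
    }
}
```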

I’m just doing it in two steps without regexp as follows:

The code

Menu1.MenuItems.Item(UserMenu.MyCommunicationItems).Selected = true;

becomes

Menu1.MenuItems[UserMenu.MyCommunicationItems].Selected = true;

For extra points, you then use ReSharper to rename all of the obj and pal references to something reasonable, e.g. objMenuItem becomes menuItem.

Oh – and use ReSharper’s ‘Find Usages’ on the MenuItems class to confirm it can be deleted.

And relax…..

In hindsight, I would have used ReSharper’s ‘Change Signature’ on the methods in MenuItems (to patch all the references first), then converted to the generic Dictionary. That could have been a bit quicker.

Hope this gives someone the courage to get in there and improve their code. I firmly believe you’ve just got to ‘want’ to do it – and then your biggest problem will be stopping yourself!