I originally didn’t realise how easy it would be to make the TweetViewModel pretend to temporarily be a Hash but it actually made it really easy for me to change the code and know that it was working the whole way.

For someone with more Ruby experience perhaps it wouldn’t be necessary to break out the refactoring like this because they could fairly confidently do it in one go.

We wanted to push the logic used to create the params in ‘LookUpKey’ into the ‘LookUpKey’ since it seemed that was a more natural place for that behaviour to live.

Unfortunately there were also a few other places that were use LookUpKey’s constructor so we couldn’t just go and change that to take in a ‘UserData’ otherwise we’d break other places in the code base.

My initial thought was that we could overload the constructor and temporarily have two constructors until we got rid of the original.

Dermot pointed out a better approach which was to add a static factory method to ‘LookUpKey’ and initially push the logic into that method.

I wrote a post a while ago about keeping consistency in the code base where I covered some of the reasons that you might want to rewrite parts of a code base and the potential impact of those changes but an interesting side to this discussion which I didn’t cover that much but which seems to play a big role is the role of incremental refactoring.

In our code base we recently realised that the naming of the fields in some parts of a form don’t really make sense and I wanted to start naming new fields with the new naming style and then go back and change the existing ones incrementally when it was a good time to do so.

Richard and Christian suggested that this wouldn’t be such a good approach because it would make the naming issue even more confusing until all the fields had been renamed.

In order to avoid this problem we had to either go and change every single field to follow the new naming approach immediately or settle on the old names even though they might not be as descriptive.

Since doing the former would involve changing the names of around 15-20 fields across several different objects, in Hibernate mapping code, probably in the database, on HTML forms and in Watin tests we decided not to do that – the project is only for a short amount of time so the investment probably wouldn’t be worthwhile.

Although in this case it makes sense not to make the improvement it doesn’t strike me as being entirely satisfactory that we would need to make this type of change in a big bang fashion.

From my experience there are often insights into the code or improvements in the ubiquitous language as time goes on and while consistency is of course an important thing in any code base it’s not the only thing.

When do we decide that actually gradually moving to a better approach is worth the temporary pain that having this inconsistency will cause?

I’ve previously written about some approaches that I’ve been taught with respect to taking small stepswhen refactoring code and another approach which a couple of colleagues have been using recently is the idea of overloading the constructor when refactoring objects.

On a couple of occasions we’ve been trying to completely change the way an object was designed and changing the current constructor would mean that we’d have to change all the tests against that object before checking if the new design was actually going to work or not.

Given a simple example where we want to change the following object to take in a completely different type and not use ‘field1’ or ‘field2’ anymore:

publicclass Foo
{public Foo(string field1, int field2){// and so on}}

One approach would be to change that constructor to take in the extra parameter and then gradually phase out the other parameters:

I quite like this approach as it allows us to keep the code compiling while we’re making changes to it to improve the design.

The main thing to remember with this approach is not to keep the old approach around for too long and to make sure we move our code across to use the new approach as quickly as possible otherwise it can become very confusing for other pairs which come across the code in this half way state.

At this stage in the past I’ve often then stopped and left the refactoring until I have more time to complete it but this hasn’t really worked and a lot of the time I end up only seeing the code change in my mind and not in the actual code.

I came across this problem again last week in an object which had 8 dependencies when I came across it.

Having drawn the effect sketch I realised that 3 of those could be pulled out into a new object which could then be injected into the original object and help to encapsulate those other 3 dependencies.

To do this I had to go and change the tests on ‘TheObject’ and move some of them to go directly against ‘FooFactory’.

The third stage of the refactoring was to change all the places which called ‘TheObject.FooCreation()’ to just call ‘FooFactory.Create()’ directly.
Some of those places were also using other methods on ‘TheObject’ so those objects now have an extra dependency although I think at least the code is more intention revealing than it was previously.

I’m sure there are some other patterns for this type of small step refactoring but this is just one that I’ve noticed so far.

I’ve had quite a few discussions with various different colleagues about coding consistency over the last year or so and PatKua and Frank Trindade have both written posts suggesting that we should look to have coding standards on projects in order to avoid the type of pain that having an inconsistent approach can lead to.

From what I’ve noticed there seem to be two reasons that we end up with inconsistent code on projects:

People’s personal preferences for how certain bits of code should be written vary.

The team is trying to incrementally move towards an improved solution to a problem encountered but isn’t completely there yet.

I think the first of these is the area which Frank and Pat are addressing in their posts.

However I think there is some overlap between the two. For example someone may rewrite a bit of code in their own style and while they would suggest that what they are doing is improving the code base their team mates may think that what they are doing is merely creating more inconsistency.

Potential coding inconsistency falls into three categories as I see it:

Personal Preferences

A simple example of this that we had on the last project I worked on was around how we should write simple conditional statements when used as guard clauses.

These were the 3 different ways that people preferred and we had all the different styles across the code base!

publicString SomeMethod(){if(SomeCondition())return"hello";}

publicString SomeMethod(){if(SomeCondition())return"hello";}

publicString SomeMethod(){if(SomeCondition()){return"hello";}}

We eventually agreed to use the last one because it was the least controversial choice and although people who preferred the other two approaches thought it was too verbose they were fine with using that as a compromise.

Pat covers a lot of the other types of code that I would classify as being linked to personal preference and I think that it would be beneficial to have coding standards/agreement in the team for the patterns we favour/names we’re going to use in this area.

It’s way easier to just go and write code in your own personal style and I do this way too frequently but it’s much better for the team if we follow a common style.

Improving the code but perhaps not significantly

I’ve written previously about the way that we’ve been making use of a ‘GetBuilderFor‘ class to hang our test builders off and while this worked quite well there is a better way to do this which requires less code being written overall.

We can make use of C#’s object initializer syntax to setup fields with values instead of having to create methods to do that and we don’t need to write the code to make the builder hang off ‘GetBuilderFor’.

As long as we put all the builders in the same namespace we can just type ‘new BuilderNameSpace.’ and then the IDE will show us the choices of builder that we have.

The problem we experienced was that that we already had 30-40 builders written using the ‘GetBuilderFor’ approach and there wasn’t a significant advantage to be gained by going and rewriting all that code to use the new approach.

In the interests of consistency we therefore decided to keep using the ‘GetBuilderFor’ approach even though the next time I come across this problem I’d definitely favour the other approach.

I think he describes the type of code changes that I would classify in this and the previous category although it does make me wonder where the line between refactoring and adding your own personal touch to a piece of code lies.

Improving the code significantly

These are the code changes that may create inconsistency but help us to drive a design improvement which will have a big impact on the way we work.

We did have tests checking the values being setup by the object initializer so I was already able to refactor with some degree of safety – it would probably have been possible to just create the builder and build the object from that and then delete the old code and replace it with the new but I’ve caused myself too many problems from doing that before that I decided to try a more incremental approach.

The idea was to have the builder and the object initializer both creating the object at the same time and as I built a property from the builder I would set it in the object initializer until eventually all of the properties were being set directly from the builder’s object and then I could just return that instead – this approach feels quite similar to what Kent Beck describes as having parallel implementations in his recent presentation on Responsive Design.

I worked together with one of the business analysts on our team who pointed out that there were actually some clear groupings in what I had just been considering ‘data’ and we were able to put those explicit domain concepts into the code as part of the builder.

My first step however was to remove the object initializer to avoid making any silly mistakes – an example of one I have made when using object initializers is to set a property using ‘Int16.Parse’ and then passing in a string with a value of “53700” which causes the method to throw an exception and the stack trace just directs you to the first line of the object initializer, making it quite difficult to work out which line has failed.

Having worked the code into a sequence of setters I gradually added methods to the builder to create the policy:

I created some duplication by doing this – I am now creating the ‘Foo’ twice – but I didn’t check any of the code into the main branch until I had totally replaced the original Foo creation with the builder.

I did about two of three properties at a time and then ran the tests which I thought might be too small a step but actually saved me on a couple of occasions so it probably actually saved me time.

Eventually when I had all the tests passing I got rid of the original Foo and replaced it with the one from the builder:

This code is still in a state of transition – it is still possible to create an object with half the fields not set by passing in nulls to the builder for example – and I’m trying to work out what the best step is to fix that.

I generally prefer to have everything setup in the constructor and then you know that the object is in a good state as soon as you create it, but in this case moving everything straight into the constructor will probably make the object even more difficult to deal with.

My current thinking is to maybe check the pre conditions for creating the object inside the builder and then refactor out value objects so that there are not so many properties overall but I’m open to other ideas.

Granted we have been trying to take some of the practices to the extreme but the basic idea of trying to keep the tests green for as much time as well as keeping our code in a state where it still compiles (in a static language) is very useful no matter what code we’re working on.

Recently a colleague and I were doing some work on our code base to try and remove some duplication – we had two classes which were doing pretty much the same thing but they were named just slightly differently.

The implementation was also slightly different – one was a list containing objects with Key and Value properties on and the other was a dictionary/map of key/value pairs.

We spent a bit of time checking that we hadn’t misunderstood what these two different classes were doing and having convinced ourselves that we knew what was going on decided to get rid of one of them – one was used in around 50% more places than the other so we decided to keep that one.

We now needed to replace the usages of the other one.

My pre-coding dojo refactoring approach would have been to just find the first place that the one we wanted to replace was being used, delete that and then let the compiler guide me to replacing all its usages and then do that with the second usage and so on.

The problem with this approach is that we would probably have had the code in a state where it didn’t compile for maybe half an hour, leading to a situation where we would be unable to run our tests for any of this time which would mean that we would lose confidence that the changes we were making actually worked.

The approach we therefore took was to add in the class we wanted to keep side by side with the one we wanted to get rid of and slowly move our tests across to setup data for that.

We therefore ended up with code looking a bit like this to start with:

When changing tests we took the approach of commenting out the old setup code to start with so that we could see exactly what was going on in the setup and then delete it once we had done so. I’ve written previously about my dislike for commented code but we were using it in this case as a mechanism to guide our refactoring and we didn’t ever check the code in with these comments in so I think it was a reasonable approach.

The intention was to try and reduce the number of places that OldType was being used until the point where there were none left which would allow us to safely remove it.

Once we had made that change to the test setup we needed to make the changes in the class using that data to get our green bar back.

On a couple of occasions we found methods in the production code which took in the OldType as a parameter. In order to refactor these areas we decided to take a copy of the method and renamed it slightly before re-implementing it using the NewType.

We then looked for the usages of ‘OldMethod’ and replaced those with calls to ‘OldMethod2’, also ensuring that we passed in NewType to the method instead of OldType.

I’m intrigued as to whether there is an even better way to perform this refactoring – when I chatted with Nick about this he suggested that it might have been even easier to create a temporary inheritance hierarchy with NewType extending OldType. We could then just change any calls which use OldType to use NewType before eventually getting rid of OldType.

I haven’t tried this approach out yet but if it makes our feedback cycle quicker and chances of failing less then I’m all for it.

One refactoring I was doing last week was to try and remove the use of some getters/setters on one of our objects so that it was better encapsulated and all the behaviour related to it happened in one place.

The change involved introducing a constructor to initialise the object rather than doing so using the new object initialiser syntax and initalising it using the properties.

My initial approach was to find all the usages of these properties and then remove each usage one by one, running our suite of tests against the code after each change to ensure that nothing had broken as a result of the change.

While we were doing this my colleague pointed out that it would probably be quicker to just comment out the properties code and then recompile the code – the list of errors would then point out the areas which relied on the properties and we could then fix the code that way.

It’s a technique I’ve used previously and it worked out reasonably well because there were only 10 or so usages, meaning we were able to make all those changes and run the tests in under ten minutes.

I think if there had been bigger changes to make – for example if the properties had been being used to expose the data all over the application (luckily we only had one case of this) then the smaller steps approach may have been preferable.

Getting the balance between taking small steps and getting rapid feedback was also a consideration here.

We were running our unit and javascript tests as we made the changes since we were uncertain exactly what impact the changes would have. It took around 70 seconds to run these tests, which I think encouraged slightly bigger steps so that we weren’t sitting around waiting too often.

Both of these approaches are useful and it seems to me that maybe a combination of them, at different stages of our refactoring, would prove to be most useful.