Daily Software Development

One analogy I like using when talking about interfaces is the car. I like it because cars have a standard "interface" and most people know how to drive cars. In a post I wrote recently about dependency injection, I mentioned the importance of programming against interfaces instead of concrete classes.

For my car analogy I will start by saying that you are the program. You have been programmed against an interface. This is a good thing. You know how to drive a car. In this case "Car" is the interface. (Remember that this doesn't have to mean a literal interface; it could just as well be an abstract class.) You learned to drive a car. It doesn't matter what brand, make, or model. Since you know how to drive a car you can drive anything that implements the car interface even if it is not a car.

When you need to get from one place to another you can use any car with the same interface you know. This is very useful. You may not know how to operate all the features (methods) of that specific car. For example, you might not know how to operate the extra parts of a tow truck, but you could still drive it.

I like that analogy for interfaces.

How does testing fit this analogy?

Notice that I said earlier that, "you can drive anything that implements the car interface even if it is not a car". So before you can get your license to drive in the United States you need to pass a test. That is kind of like testing. Well there are two ways we could test your driving.

Right now the current system used to test drivers is an integration test. Perhaps in the future we will have a nice unit test to test our drivers. I am obviously going somewhere with this, so I will get straight to the point.

The way we test drivers now is to have someone sit in the car and watch what you do. We have to have very predictable courses set up in advance. We also have trouble controlling external variables (other drivers on the road). We can't control the weather for these tests either, but it will be there when you're taking the test. Observations are made by an external entity (the person testing you), one who can't always capture the exact interactions being made. All of these external resources, even when they're controllable, make this an integration test. We are dependent on too many other things.

A unit test for testing how well someone drives would need to break the dependency on a car by using an interface. We've done that. We are saying that a person knows the "car interface". Now we need some fake car that we can use for testing purposes to remove that piece we don't control: one that records all the interactions that the driver makes with the car, so we're not dependent on external observations of what is going on. We also need to break the dependency on real roads, since traffic conditions and the environment are difficult to change, control, and standardize. What does that sound like to me? A video game or, in more professional terms, a simulator. The driver can sit down with this thing, which has the same interface as a car, behaves like a car, and works like a car. We can give the driver any scene, environment, or conditions we want. We can also accurately measure all interactions with our fake car.
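To make the simulator idea concrete, here is a minimal sketch in C#. Everything in it is hypothetical and invented for the analogy: the ICar interface, the Driver class, and the RecordingCar fake. The point is only that the fake implements the same interface as a real car and records every interaction, so the test needs no outside observer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical "car interface": the only thing the driver is programmed against.
public interface ICar
{
    void Steer(int degrees);
    void Accelerate(int percent);
    void Brake(int percent);
}

// The code under test. It knows ICar, never a concrete car.
public class Driver
{
    // Pulls away from the curb: straighten the wheel, then accelerate gently.
    public void PullAway(ICar car)
    {
        car.Steer(0);
        car.Accelerate(20);
    }
}

// The "simulator": a fake car that records each call made against it.
public class RecordingCar : ICar
{
    public List<string> Log { get; } = new List<string>();

    public void Steer(int degrees) => Log.Add($"Steer({degrees})");
    public void Accelerate(int percent) => Log.Add($"Accelerate({percent})");
    public void Brake(int percent) => Log.Add($"Brake({percent})");
}
```

A test would hand a RecordingCar to the Driver and then inspect Log to see exactly which calls were made and in what order.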

Maybe some day we will actually test drivers this way. It certainly would make it faster and easier to test. We could test more often. Considering that once someone has a license we don't test them again....

Would you do that with your code? I run unit tests every time I make a small change. Integration tests every time I check in.

Dependency injection is an important concept for anyone to understand before trying to get into test driven development. Recently I've noticed that lots of people are trying to get into agile practices of software development. They're writing tests, and plenty of them are writing integration tests instead of unit tests.

Integration Tests Versus Unit Tests

When testing there are two main types of tests which are created. The ones most people write are integration tests, and this I attribute to their not knowing about dependency injection. If your code contains too many dependencies you will not be able to write unit tests.

Unit tests are usually the tests you need the most of. Unit tests should be testing the logic of your code. The core of your application should be your domain objects and your business logic; as the name states, it is the core code required for your application to run. Unit tests should be fast. Because well-tested code will have hundreds of unit tests for even very small projects, it is important that unit tests run extremely quickly. This means that unit tests should never ever access external resources: your code in unit tests should never access a database or even use configuration files. Another reason not to access external resources is that they would need to be kept in a known state at all times for the test to work correctly.

Integration tests are also vitally important to a healthy and tested application. These tests serve a different role though. These tests aren't so much testing the pieces of your code like the unit tests are. The integration tests are here to make sure everything works together, including the external resources. Integration tests do exactly as their name states: they make sure that everything integrates well together.

Don't spend your time testing every case with integration tests. That is the responsibility of the unit tests. Unit tests are there to try to test all the different logical cases that your code handles. The unit tests should be examining a small portion of the code at a time. The integration test is there to make sure that each piece of the code is able to interact with each part with which it is supposed to interact. This allows you to make sure everything is linked together correctly. This means you should have some simple data access tests, as well as mapping tests to verify that the properties of your classes mesh with the columns in your database tables.

Testing With Dependencies

Using some very common methods of software development, people don't build software in a very testable way. This means that we need to make some adjustments to how a lot of people have learned to structure their applications. Most code is able to be integration tested. Why? Integration tests are easier to do because our code is too tightly coupled together to test the individual pieces we would test using unit tests. Most software has so many dependencies in it that even simple unit tests are impossible, because they'll become integration tests if we start touching a database, a file, a web service, or anything else external to our code. If you're not testing one single tiny piece of code, then you're integration testing. Integration testing makes sure the dependencies are all working. Unit tests are testing only a unit of the code at a time. The problem is that once you test an object and its dependency you're also testing the dependency. This is why we want unit tests: so we can control what we are testing.

We need to break these dependencies before we can write our unit tests correctly. Once we've broken these dependencies we should be able to test the code.

Programming Against Interfaces

As a general rule, developers should program against an interface. Be as generic as possible. By interface I mean that in the general sense, not the programming-language one. Interfaces, abstract classes, and anything similar are all acceptable. The point is that you're not programming using concrete classes. When the code executes it will run against concrete implementations of the interface, but most of the code should just use the interface. This allows us to substitute in fake implementations and manipulate their results.

Injecting the Dependency

In a previous post I wrote about how to begin unit testing, but I didn't explain the dependency injection very well. In that post I created an ICalendar interface. I programmed against this interface. Then in my tests I used a FakeCalendar class which implemented the interface, and I manipulated the values returned by that class so that I could test what I wanted to test. I have also created a concrete implementation of the ICalendar interface, and I use it as the default implementation. I used the simple dependency injection I refer to in this article.

I create two constructors for the class with the dependency; one that takes the dependencies and one that is the default constructor. The one with the extra parameters will be used by the test methods so that I can pass in the fake implementations, and the default one will be used by my production code. This lets me use different implementations between test and production code without having to muck up the production code.

This is an example of what a class might look like after the dependency has been removed. Pay attention to the two constructors and the fact that I am writing code against an interface and not a concrete class.

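public class TimeOfDay
{
    // The time source; always accessed through the interface.
    private ICalendar _calendar;

    // Production code uses this constructor, which supplies the real calendar.
    public TimeOfDay() : this(new Calendar())
    {
    }

    // Tests use this constructor to pass in a fake calendar.
    public TimeOfDay(ICalendar calendar)
    {
        _calendar = calendar;
    }

    public bool IsMorning()
    {
        return (_calendar.GetCurrentTime().Hour < 10);
    }

    public bool IsEvening()
    {
        return (_calendar.GetCurrentTime().Hour >= 18);
    }
}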
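As a rough illustration of how the second constructor gets used from a test, here is a sketch. The shape of ICalendar and the body of FakeCalendar are my assumptions, since only their names appear in this post. TimeOfDay is repeated in trimmed form (injecting constructor only) so the sketch compiles on its own, and plain exceptions stand in for a test framework's assertions.

```csharp
using System;

// Assumed shape of the interface that TimeOfDay is written against.
public interface ICalendar
{
    DateTime GetCurrentTime();
}

// Assumed fake: reports whatever time the test hands it.
public class FakeCalendar : ICalendar
{
    private readonly DateTime _now;
    public FakeCalendar(DateTime now) { _now = now; }
    public DateTime GetCurrentTime() { return _now; }
}

// Trimmed copy of the class above, kept here only for self-containment.
public class TimeOfDay
{
    private readonly ICalendar _calendar;
    public TimeOfDay(ICalendar calendar) { _calendar = calendar; }
    public bool IsMorning() { return _calendar.GetCurrentTime().Hour < 10; }
    public bool IsEvening() { return _calendar.GetCurrentTime().Hour >= 18; }
}

// Framework-free tests: each one throws if its expectation is not met.
public static class TimeOfDayTests
{
    public static void NineAmIsMorning()
    {
        var timeOfDay = new TimeOfDay(new FakeCalendar(new DateTime(2008, 1, 7, 9, 0, 0)));
        if (!timeOfDay.IsMorning()) throw new Exception("9:00 should be morning");
        if (timeOfDay.IsEvening()) throw new Exception("9:00 should not be evening");
    }

    public static void SixPmIsEvening()
    {
        var timeOfDay = new TimeOfDay(new FakeCalendar(new DateTime(2008, 1, 7, 18, 0, 0)));
        if (!timeOfDay.IsEvening()) throw new Exception("18:00 should be evening");
        if (timeOfDay.IsMorning()) throw new Exception("18:00 should not be morning");
    }
}
```

Because the test controls the clock, the result is the same no matter when the test actually runs.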

Update: It seems that I failed. Steve Smith mentioned in a comment below that I forgot to mention what this pattern is called. This is the strategy design pattern. It follows the principle of programming to an interface instead of to a concrete class. The point of the pattern is that you select the algorithm (in our case a class) which you will be using at runtime. Testing is just one use of the strategy pattern. It is also very useful in general purpose coding.

Sometimes there are methods kept private in a class. Some calculations are kept private because nothing should be calling those methods on this class. This is a good hint that the method belongs somewhere else. If the method is kept private because it doesn't make sense for a user of this class to use it, it belongs somewhere else.

Private methods are a common occurrence in classes. Sometimes they should be moved into another class, because they were only private because they didn't make sense in that class. Other times they are part of the internal workings of the class. At the end of the day it is always up to the developer how he is going to structure his code.

If you don't want to move the method and don't want to make it public you still have a couple of options to test it.

You can sometimes test a method through the public methods that call it. (This can be difficult because it is harder to control what is being passed to the method.)

You can write a public method which passes through to it, and prefix the name with "Test". (This is a bit hacky and should only ever be done with internal code that will not ever be in an API.)

You can change the method to protected and write a Test version of the class that inherits, and then exposes the method publicly on the test class. (This option works well because the test class can be kept with the tests so it doesn't dirty the production code. Only do this if you will not be subclassing this class already.)
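The third option might look something like the following sketch. The Invoice class, its 7% tax rate, and the TestableInvoice subclass are all made up for illustration; the only real point is that the subclass lives with the tests and simply forwards to the protected method.

```csharp
using System;

// Hypothetical production class. CalculateTax used to be private and has
// been relaxed to protected so a test subclass can reach it.
public class Invoice
{
    public decimal Total(decimal subtotal)
    {
        return subtotal + CalculateTax(subtotal);
    }

    protected decimal CalculateTax(decimal subtotal)
    {
        // Illustrative flat rate, rounded to cents.
        return Math.Round(subtotal * 0.07m, 2);
    }
}

// Kept in the test project, so production code never sees it.
public class TestableInvoice : Invoice
{
    // Public pass-through that exposes the protected method to tests.
    public decimal CalculateTaxForTest(decimal subtotal)
    {
        return CalculateTax(subtotal);
    }
}
```

A test can now call CalculateTaxForTest directly with whatever input it wants, instead of having to steer the value in through Total.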

Some people discourage testing private methods, because there really shouldn't be much logic in private methods that really needs to be tested. If it really needs tests it probably belongs in another class. My opinion is that if people are going to keep the code in the private method anyway, they might as well at least be testing it.

Structure your code how you like. Just don't let your classes get out of hand. If it becomes an issue then refactor it.

Friends don't let friends perform premature optimization.

As Donald Knuth said, "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

I think it is safe to say this should apply to refactoring as well. Don't go on refactoring binges. Only refactor code you're currently working with.

Testing and refactoring go hand in hand. When tests are in place we are safer to refactor because we have tests in place which will help us preserve the previous functionality. When we are trying to add tests into code, we refactor the code to make it more testable. We refactor so we can test and we test so we can refactor. This may seem a bit cyclic, but both the refactoring and the tests improve our code. This makes it quite important to do them both anyway.

One problem which I've seen recurring in lots of different projects is classes and methods which are quite large. When classes and methods get long, they become difficult to test and maintain. It is easy to make mistakes when working with large classes or methods. In order to be able to work with and test large methods and classes we need to try to break them down into bite-sized chunks that will be easy to handle.

The mistake people make when they do this is that they focus on testing the large method or class. We've already established that it isn't easy to test. Since it is not easy to test, we will test it last. How do we get the code where we can test it? We do exactly what I said in the first part of this post. We refactor the code so we can test it. Not the big piece. We pull out small pieces and test them.

We extract methods from our large method and test those pieces. We test them because they're small, bite sized, easy to test, pieces. Sure we don't get the whole method tested yet, but we can do it a piece at a time. As we extract these bits of code the method becomes much easier to read.

If there are now a few methods which have been extracted which seem to perform a similar responsibility we can extract them into a class with a name descriptive of that responsibility. The nice thing is that we now have a class which is easily testable and is tested since we tested each of its methods as we removed them from the large method in the other class.

Now if we're really feeling zealous we could extract an interface from the new class. As a general rule it is better to program against interfaces or abstract classes. The reason for this is that they are easy to test and easy to change when needed. We would then perform a task called dependency injection, which will allow us to use the concrete implementation in our production code and a fake object for our tests. This would then make testing the large method easier.
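Here is one way the end result of those steps could look. This is only a sketch under invented names: imagine the shipping calculation once sat inside a large OrderProcessor method, was extracted into ShippingCalculator, and then had IShippingCalculator extracted from it. The pricing rule is arbitrary.

```csharp
using System;

// Hypothetical interface extracted from the new class.
public interface IShippingCalculator
{
    decimal CostFor(decimal weightKg);
}

// The extracted class: small, single purpose, easy to test directly.
public class ShippingCalculator : IShippingCalculator
{
    public decimal CostFor(decimal weightKg)
    {
        if (weightKg < 0) throw new ArgumentOutOfRangeException(nameof(weightKg));
        return 5.00m + 1.50m * weightKg; // arbitrary flat fee plus per-kg rate
    }
}

// What remains of the "large" class, now depending only on the interface.
public class OrderProcessor
{
    private readonly IShippingCalculator _shipping;

    // Production code gets the real calculator by default.
    public OrderProcessor() : this(new ShippingCalculator()) { }

    // Tests inject a fake through this constructor.
    public OrderProcessor(IShippingCalculator shipping) { _shipping = shipping; }

    public decimal Total(decimal itemsTotal, decimal weightKg)
    {
        return itemsTotal + _shipping.CostFor(weightKg);
    }
}

// Fake for tests: always returns the cost it was constructed with.
public class FixedShipping : IShippingCalculator
{
    private readonly decimal _cost;
    public FixedShipping(decimal cost) { _cost = cost; }
    public decimal CostFor(decimal weightKg) { return _cost; }
}
```

With FixedShipping injected, a test of OrderProcessor.Total checks only the addition, while ShippingCalculator gets its own small tests.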

Even if you don't want to use the interface it is still easier to test the large method once it becomes a lot simpler and its individual pieces need less testing.

Getting started unit testing is a difficult task. I've seen plenty of people learning to unit test and it took some time for me and seems to take some time for everyone before unit testing really clicks. Part of the reason why unit testing is difficult for people to get started with is that the code most developers write is not testable. Yes, not only is it important to test, but it is important for your code to be testable. Most code is so difficult to test that anyone who doesn't know how to unit test will have a great deal of trouble testing anything.

Most testing is difficult because the code you're trying to test has too many dependencies. So we all know that we write code which depends on other code we write. There are a few important dependencies that a lot of people forget about.

Breaking the DateTime Dependency

One seemingly safe piece of code to test is the DateTime class. How is it often used? Well, people use DateTime.Now or DateTime.Today in their code. So what is the big deal? Isn't this easy to test? NO! Testing a method which uses either of those two values is not easy at all. If your application cares about the time of day at all, it is impossible to test it reliably if you're using DateTime.Now.

Two cases for time come to mind right now: checking for weekends and checking for morning, afternoon, and evening. A couple of interesting sites I enjoy change the theme of the site based on the time of day. How could you test that logic? Well, not easily if you're using DateTime directly. To get around this we need to break the dependency. To do this we need to work with an interface which gives us access to all the logic we used from the DateTime class.
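A minimal sketch of such an interface and its production wrapper might look like this. The single GetCurrentTime member matches what TimeOfDay calls; the bodies of Calendar and FakeCalendar, and the WeekendChecker consumer covering the weekend case, are assumptions of mine for illustration.

```csharp
using System;

// Assumed interface: only the members the application actually needs.
public interface ICalendar
{
    DateTime GetCurrentTime();
}

// Production implementation: a thin wrapper around DateTime.Now.
public class Calendar : ICalendar
{
    public DateTime GetCurrentTime()
    {
        return DateTime.Now;
    }
}

// Fake for tests: always reports the time it was constructed with.
public class FakeCalendar : ICalendar
{
    private readonly DateTime _now;
    public FakeCalendar(DateTime now) { _now = now; }
    public DateTime GetCurrentTime() { return _now; }
}

// Hypothetical consumer for the weekend case.
public class WeekendChecker
{
    private readonly ICalendar _calendar;
    public WeekendChecker(ICalendar calendar) { _calendar = calendar; }

    public bool IsWeekend()
    {
        DayOfWeek day = _calendar.GetCurrentTime().DayOfWeek;
        return day == DayOfWeek.Saturday || day == DayOfWeek.Sunday;
    }
}
```

A test can pin the date to a known Saturday or Monday with FakeCalendar, while production code passes in Calendar and gets the real clock.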

If you need any more attributes from the DateTime class add the methods for them to the interface and the class.

In order to use these methods in production with our TimeOfDay class, we will need to do a couple of things. As a general rule, it is important to program against an interface. We will start by creating a stub for the class and then testing the stub.

Now we go and write a couple of test methods which we want to have fail since we haven't implemented our methods yet. Then once we've got the tests in place we can fill in the code and have confirmation that we have written it correctly.

So now when we run our tests again we should see all green passing tests.

If we hadn't kept the DateTime class at arm's length we wouldn't have been able to easily test the class. The reason is that we are not able to set the value which will be returned from DateTime.Now. Anytime you use classes which magically give you access to something, they had better have an interface. If they do not have an interface then you will need to use a solution like this one to wrap the class inside of another one.

One thing that seems to get out of control quickly on a lot of projects I've seen is a tendency for classes to grow to enormous size. This is one problem that seems to turn up at least once in a lot of projects. Why does it happen? Because programmers are lazy. Yes, I said it. We are. I think it is a good thing in a lot of cases. We are "efficiency experts". We pay an up-front cost writing a program so we will have less work to do to solve a problem. By writing the code we save ourselves time. This is one example of this laziness.

When we get lazy we tend to group things together. Since the foo class already has access to types x, y, and z, we can just put this new code in foo. It makes sense, but we're really just making our classes larger. This is why the single responsibility principle is so important. This is one rule important enough that developers everywhere should know about it.

Classes should do one, and only one, task. That is their responsibility. If you need to generate some value, write a class to do that. Don't reuse an existing class. I've seen classes which have gotten so large people don't even know what is in them. I would be scared to make modifications to files like that.

Keep your classes small. It is easier to test individual parts. This is why unit tests are called "unit tests". They test a small piece of the code. If you keep classes small and separate responsibilities among classes it is easier to test. You can break dependencies on the other classes using interfaces, abstract classes, or other things.

Private Methods to Discourage Use

Sometimes there are methods kept private in a class. Some calculations are kept private because nothing should be calling those methods on this class. This is a good hint that the method belongs somewhere else. If the method is kept private because it doesn't make sense for a user of this class to use it, it belongs somewhere else.

Breaking Classes Apart

The best way to cut large classes down to size is to go through and find pieces which don't belong and are grouped together. Often looking for the collapsible regions of code helps. If you've grouped methods into a region there is a good chance those are related methods and probably belong in a different class.

When you create the new class where you will put these methods, the first thing you should do is test it. This new piece should be very testable since you're pulling it out of the beast. If you test this one piece you'll be able to keep things stable while you pull apart the large class. If you aren't testing each part as you break things up, there is a very good chance you will create bugs in the process.

In my opinion, tests serve two very important purposes. They test to make sure that things work as expected, and they help keep code stable. This stability clamps things down and allows changes to be made. Often times after breaking off separate pieces many refactorings which were hard to find in the large cluttered class become quite obvious. This is why it is so very important to break apart large classes.

In Working Effectively With Legacy Code, the author talks about where to place test classes in your application's structure. He mentions that you should place test classes alongside production code in the structure of your application. Why? Well, he says that it is good because it helps you navigate between the test classes and the production code easily.

I somewhat agree with him when he says this, because a lot of time is wasted moving between classes. I admit I am one of the people who tries not to mix too much test and production code. I like to keep anything test related at arm's length. Why do I do this? Because I believe that everything becomes cleaner and easier to observe when it is separated. I believe there is some reason to keep things aesthetically pleasing, and I certainly believe that keeping them separate will make the system appear a lot less daunting.

Navigating Between Test and Production Code

So I've explained what I like about keeping the two separate, but I still haven't made an argument for why I don't need the "navigation" benefits of keeping the two together. The way I handle this is through superior add-ins to my IDE. I use Resharper to give me better navigation. Lately I've become quite fond of this application, if for nothing else than the ability to navigate to classes and files in the solution based on the name alone.

This is incredibly powerful, and it is a quick keyboard shortcut away. It lets you navigate quickly and easily. I like it because it lets me organize how I want to, and I don't need to worry about moving from one assembly to another even. I keep my unit tests in a separate project from any production code. Since I can still navigate quickly, I get all the benefits of both methods.

I think there are a lot of great suggestions people can make for how to code, but at the end of the day you need to make your own choice. Consider people's suggestions, understand why they make them, and certainly adjust them to your own uses.

My opinion is that test classes should go in a separate location. It keeps the production app less cluttered. The application will not deploy test code. If they're separated there will never be any confusion. It would just be terrible if you had a class that actually needed to have the word "test" in the name.

Creating stories to work on is an integral task in agile software development. These stories are what the team of developers will be working on for the iteration. Where do the stories come from? Well, through discussion with the development team, the customer says what he wants. What types of stories does the customer ask for most of the time? The customer asks for epics. What makes them epics? They're just too large.

When writing software in an agile manner it is important to have very small pieces. You want something that can be completed quickly and delivered to the customer. Sure, it might be a small piece that isn't of much value yet, but that is good. The fact that it is small means that you can be constantly finishing pieces of the project. If your stories are too large you'll only be delivering at the end of the iteration. That's just not cool.

Steve Smith wrote a great blog post where he discusses very well how to break epics into vertically sliced stories. One major reason that doing this is of great advantage to you is that you will be able to get feedback from the customer. In Steve's example he talks about a registration page which is loaded with extra data including multiple addresses, contact info, and possibly other pieces. I can see a customer wanting to have something like this. So there are two ways we can approach splitting this up.

Splitting Horizontally

When you split horizontally, you will be delivering the entire solution at the end since each horizontal slice needs to be finished before anything is finished. So for the sake of argument why don't we say that the developer didn't create exactly what the customer wanted the first time. So now at the end of the iteration the developer finds out that the whole process should have been completely different. Now he goes and rebuilds everything in the next iteration because this one is out of time.

Splitting Vertically

Using this method we will quickly have the first small piece done. We can say that it is the ability just to create a user name and password at registration. Now when the customer sees this we get feedback. Maybe everything is fine for this part. We then go do the next part. Maybe we write the address section incorrectly. When the customer sees this he can let us know. Since we know early that there is a problem we can correct it. We may not have everything done at the end of the iteration, but at least we've got the most important parts done.

Since the customer can keep each of these vertical slices in ranking order at the end of the iteration he will have a bunch of working pieces. He will have stuff he cares about even if it isn't all there. If we had found out at the end of the iteration that we needed to make a change we wouldn't have anything done.

Conclusion

It is important to know how to split stories. Sometimes there are pieces of the story which can be vertical slices, but you should try not to have too many dependent tasks. It will make your software development work much faster and better.