Luke Francl Explains Why Testing Is Overrated

Bio: Luke Francl is a developer at Slantwise Design (http://slantwisedesign.com) and sometime tech writer. Luke is a frequent presenter at the Ruby Users of Minnesota, and has also presented at CodeCon, MinneBar, Ostrava on Rails, and acts_as_conference. He blogs at Rail Spikes (http://railspikes.com) and Just Looking (http://justlooking.recursion.org).

Sure, I am a Ruby and Rails developer from Minneapolis, Minnesota. I met Pete Ford, the organizer of this conference, a few years ago at a Rails conference. He asked me if I wanted to come up and talk and I said “Sure” and I was like “What about?” I had been doing some reading about testing and said “Well, how about I talk about how testing is overrated?” and since the guys at Unspace don’t really do that much developer testing, they thought it was great, so he invited me to present on that.

I am a developer. I was working at a small Ruby and Rails consultancy. Previously I came from the Java world, and now we have sort of split off into a startup, so I am the main developer on a website that’s targeted at parents with young children.

I got the idea for the talk when I was reading “Code Complete”. In that book Steve McConnell goes through various ways of finding defects in software. And I was surprised to see that testing, unit tests and other kinds of developer testing, was not actually that effective. And, for example, code reviews were much more effective at finding defects. And so I thought that was very interesting, and then reading further, he is not saying “Don’t do tests”, he is saying that they find different kinds of defects, and that doing developer testing isn’t enough. And so I felt like in the Ruby world there seems to sometimes be this attitude that we have tests, so we don’t need QA. And I disagree with that, and that is kind of the genesis of the idea for my talk.

I tend to write tests along with my code. Sometimes we call that test along, as opposed to test driven development. Probably the TDD or the BDD, Behavior Driven Development, people are going to lynch me for saying this but I just want to say that is an easier way to work with my style of coding, and so I write tests with my code but we also rely on manual testing and usability testing and we have started doing code reviews to try and catch other defects and make our software better.

I actually couldn’t, because I have never figured out what BDD actually means as opposed to TDD. I have read tons of blog posts and different things about it and I am like “What’s the difference?” But I think essentially the idea is that BDD is a more user-friendly syntax for describing what the software should do. In both cases you are writing a sort of specification, as a test or spec, first, and then implementing the software. It’s just a difference in the emphasis and in what you are calling things with BDD.

Sure, to start off, I absolutely believe that writing tests is important, especially in a language like Ruby where we don’t have a compiler to catch the really stupid errors that we as programmers make all the time. So we need some tests to help with that, and it also helps a lot with refactoring. So I think a good basis, a layer of tests, is important. The extent that I take it to is less than some other people because I feel like, as I learnt when doing my research, and this fits with my experience, about eighty percent of defects tend to happen in twenty percent of the code. And I think that is probably the most complicated area of the code, because people are more confused, people are not sure what’s going on, there are more possibilities for errors. And so that part of the code needs to be well tested by the developer and also by these other methods that I talked about in my talk, which are more manual, they can’t really be automated so much. We do that, we have a good layer of developer tests, and then we also have manual QA to do smoke tests across the whole system, just to flush out bugs, and then the QA person is also responsible, when a defect is found, no matter who finds it, for verifying that it has actually been fixed. Because a lot of the time programmers will get a bug report, they will fix the bug and then they will say it’s done, and it sort of disappears from view because they close out the bug report. We have another layer of checking to make sure that doesn’t happen. And then we also do code reviews, not only to try to find defects but also to learn about the system. I know a lot of people are advocates of pair programming. We don’t do that, but code reviews and pair programming provide some of the same benefits. And another thing that we have been doing, that I found really helpful and quite frankly have been really amazed at how big a difference it makes, is just usability testing.
You don’t really understand how obtuse your software is until you sit a user down in front of it and watch them try to figure out how to use it. It really blew me away and it’s quite humbling to see that, but for me it also gives me the motivation to want to fix those kinds of problems. One of the points I was making in my talk is that no level of developer testing is ever going to find usability defects because it’s testing the code, and having clean code is important but the user doesn’t care. The user sees the interface, and so the interface has to be clean and intuitive and respond to their needs. And developer testing won’t find those problems.

Yes, that is one of the interesting things too, I think, that people have this impression that usability testing is expensive and that they are going to hire consultants who have double-sided glass and cameras everywhere and it’s going to cost a fortune, but it really doesn’t have to. We base our usability testing philosophy on the book called “Don’t Make Me Think” by Steve Krug, which is a really fantastic book, and it shows you how to conduct usability testing on the cheap, he calls it “First law of usability testing”. And his approach is you have your participant and you have a person who guides the test, and then you have a camera, and then you film them doing this, and while they are doing the test you have the other developers just have that camera hooked up to a monitor, a TV or something, and they just watch. And that’s just ridiculously cheap. The way we do it, just because we don’t have a camera, is that we bought 20-dollar screen recording software for the Mac; there are free or cheap versions of this for just about every operating system out there. We just record the screen and audio with a USB microphone, and then we get together, maybe have a beer or something, and then watch the video of the participants. And we pay people about 50 dollars for a half hour to forty-five minutes of testing. And the reason we do that is because it motivates them and it makes them think that their time is valuable, that this is the real deal, it’s not just something like “Oh, can you come over and maybe look at our code”. I feel like we get better feedback because of that.

Yes, again we base this on what Steve Krug recommends and, as he points out, and Jakob Nielsen pointed out, you can find a lot of usability problems with just a few people; five people seems to be the most commonly cited number. And Steve Krug recommends doing that in stages. So the way we do it is that we bring in two people, we do simultaneous usability tests with them, then we watch the results, we take notes of course, and figure out what the big problems are that have occurred in this round of testing. We fix those problems to the best of our ability and then we bring in another round. And the way that we tend to do it is about two of these rounds a month. So we have two-week iterations, and at the end of the iteration we try to bring people in to do usability tests. That helps us continuously refine the product, take out the big errors and move on from there.

No, we bring in different people. For us it has mostly been friends and family, friends of friends, friends of friends of friends, that kind of thing, and we just bring in different people every time, so that they are not colored by their previous experience with the project. I think that works out better.

Despite the topic of my talk, I definitely would probably start with a unit test framework. If you are in Java, there is JUnit, and Microsoft .NET has NUnit and a Microsoft framework, I am not sure what it is called. Ruby has Test::Unit, just start with something like that. If you can’t convince your fellow programmers to write tests, well, first of all it’s time to look for a new job, but secondly you can just test your own stuff, and there is no reason why you can’t set up a process of running your own tests before checking in your own code. Ideally the whole team would do that. We also have a continuous integration server to run the tests every time the software changes. And that works pretty well because you’ll find problems when somebody forgot to run a test or they forgot to check in code. That happens a lot, because “Oh, I ran the tests, everything worked, pushed my changes, oh, I forgot to add this file”. That kind of stuff tends to happen and it can really mess up somebody else’s day. Another problem with Rails, more so prior to the current version of Rails, was migrations that had the same number; they would screw things up, and so continuous integration would find that problem. Having done that, I would branch out. I think code reviews are fantastic. I mean, you could even start with code reviews before doing tests, because they are quite effective and you’ll learn a lot about programming, I think, by doing that.
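
As a rough illustration of the "just start with something like that" advice, here is a minimal Test::Unit sketch. The `slugify` helper and the test names are hypothetical and not from the interview; it assumes the test-unit gem that is bundled with standard Ruby installs.

```ruby
require "test/unit"

# Hypothetical helper under test (an assumption for illustration only).
# Turns a title into a lowercase, hyphen-separated slug.
def slugify(title)
  title.downcase.strip.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
end

# Each test_* method is discovered and run automatically when the file exits.
class SlugifyTest < Test::Unit::TestCase
  def test_replaces_spaces_with_hyphens
    assert_equal "testing-is-overrated", slugify("Testing Is Overrated")
  end

  def test_strips_leading_and_trailing_punctuation
    assert_equal "hello-world", slugify("  Hello, World!  ")
  end
end
```

Running the file with the `ruby` command executes the tests, which is also the kind of command a continuous integration server could run on every change.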

I have gotten good feedback about the talk; a lot of people have come up to me and told me they enjoyed it. The title is a little hyperbolic, but I think my point is basically that testing isn’t enough, and most people accept that or are willing to listen to that. But I have to wonder, maybe the people who thought I am an idiot, instead of coming up to me and saying “You are an idiot”, are just ignoring me.

I think it really depends on your project. I didn’t get into this in my talk, but I think there is like a spectrum of test coverage needs. A library that people are depending on needs to be well covered by tests, and you need to apply as many quality techniques as you can to that code; it’s sort of like mission-critical code. Then, as you move to the other end of the spectrum, you end up writing little scripts that you can use in your day-to-day job. I mean, I am not going to test those, they are just throwaway things that you use and maybe never use again. And in between, your application is somewhere in that spectrum, so you have to decide for yourself. For me, my personal guideline is “Am I confident in the code?”, and some people aren’t confident unless they get to a hundred percent code coverage. We are at about ninety percent, but as I talked about in my talk, that number is meaningless I guess, not meaningless, but it means less than I think some people give it credit for. But if you can do a refactoring and the tests give you errors for the things that you missed, then I think you have enough tests. Unfortunately there is a downside: if you don’t have the tests, they won’t catch the errors and you won’t know.
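
To make the point about coverage numbers concrete, here is a small sketch under stated assumptions: the `average` method is hypothetical, not code from the interview. A single check executes every line of the method, so line coverage would read one hundred percent, yet two defects remain for other inputs.

```ruby
# Hypothetical method (an assumption for illustration only).
# Intended to return the arithmetic mean of a list of numbers.
def average(numbers)
  numbers.sum / numbers.size
end

# This one check runs every line of `average`, so a line coverage tool
# would report the method as fully covered.
raise "unexpected result" unless average([2, 4]) == 3

# Defects that full line coverage did not reveal:
# - integer division silently truncates: average([1, 2]) is 1, not 1.5
# - an empty list raises ZeroDivisionError instead of being handled
```

A code review or a few extra test cases would be more likely to surface these than the coverage figure itself.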

Well, first of all, I am not really arguing that you should cut down the number of tests. It’s more about choosing the appropriate number that you feel comfortable with. If you already have tests and you have a ton of tests, I am not saying go and delete half of them or anything like that. But I think that it’s just an awareness that code review is going to find different types of errors. If you have a good testing layer, you can make your code better than it already is by instituting code reviews. And I think that is true for all the other techniques that I am talking about. Tests are awesome but they are not enough. Code reviews find errors that are in the strings in your code, stuff that the user might see but the tests aren’t going to catch. One thing that is hard to test is error conditions, stuff that is deep in exception code that is hard to trigger in your tests, and if you go over that with code review you can trace it, and if you get three of your co-workers to look at it, it’s just applying more eyeballs to the problem.
