Towards Empirical Evaluation of Test-Driven Development in a University Environment

Problem
Pancur et al. believe that traditional approaches to software development are rigid and unsuccessful. Many developers have turned to agile processes in the hope of finding more promising approaches. However, the current evidence is “merely anecdotal”: there is very little empirical validation of the implications of the various agile techniques. This study compares test-driven development (TDD) with iterative test-last (ITL) development to see whether the choice of process influences the development process and the resulting code.

Study
Thirty-four (34) senior computer science undergraduates at the University of Ljubljana, Slovenia, participated in the study, which ran during the spring semester, from February to June 2003. The students were divided into two groups: a TDD group (19 students) and an ITL control group (15 students). The authors used past academic performance (average grades) and prior programming knowledge and experience to balance the groups, then ran a t-test to confirm that the groups did not differ significantly. Both groups developed in Java with JUnit and Eclipse 2.1. To measure how closely each student followed the assigned process, an Eclipse plug-in logged the time spent coding and the number of complete (test-code-refactor) and incomplete (test-code) cycles.
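
The group-balance check described above can be illustrated with a short sketch. The summary does not say which variant of the t-test the authors ran, so this example assumes a two-sample Student's t-test with pooled variance. The function name and the grade values are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Return (t, df) for a two-sample Student's t-test with pooled variance.

    Assumption: equal population variances. The study summary does not
    state which t-test variant was actually used.
    """
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    # Standard error of the difference between the two sample means.
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, df


# Hypothetical average grades, NOT data from the study.
tdd_grades = [7.8, 8.1, 7.5, 8.4, 7.9]
itl_grades = [7.7, 8.0, 7.6, 8.3, 8.0]

t, df = pooled_t_statistic(tdd_grades, itl_grades)
print(f"t = {t:.3f}, df = {df}")
```

A value of |t| below the critical value for the given degrees of freedom (looked up in a t-table at the chosen significance level) would be consistent with the groups not differing significantly, which is the condition the authors wanted to confirm before the experiment.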

Results
Code quality was measured by code coverage. Surprisingly, the TDD group's coverage was only 92.6%, while the ITL group's was 95.1%. External code quality was assessed with 120 test cases, but the difference between the two groups was not significant. Surveys conducted after the experiment showed that students were generally less favorable toward TDD. Only 26% of the TDD group (versus 40% of the ITL group) thought that their development process was noticeably effective. Yet 90% of the TDD students (versus 80% of the ITL students) would accept their development process as a primary process in industry.

Conclusions
Only a few conclusions could be drawn from this study. The students had never previously tested their code, and they wrongly believed that having to write tests caused them to write less new functionality. All in all, the differences between the TDD and ITL results were not statistically significant, so no firm conclusions about the effect of either process could be drawn.