In node.js testing, there are 2 helper modules that I use often: sinon and rewire. Sinon allows you to monkey-patch functions, while rewire allows you to… monkey-patch functions. Both modules have other uses as well, but in this post I’m focusing on monkey-patching.

So, in which situations should you use sinon and when should you use rewire for monkey-patching?

In short: if you are testing module A, and inside module A there is a call to a function of module B, you should use sinon. If you want to monkey-patch a call to a function inside module A itself, you should use rewire (and optionally, sinon as well).

Here is an example.

a.js

JavaScript

// a.js
var b = require('./b');

function testMe(num) {
  return {
    x: a1(num),
    y: b.b1(num)
  };
}

function a1(num) {
  return 2 * num;
}

module.exports = {
  testMe: testMe
};

b.js

JavaScript

// b.js
function b1(num) {
  return 5 * num;
}

module.exports = {
  b1: b1
};

In your test file, when you are testing testMe, you can mock b.b1 by doing this:

Mocking b.b1

var sinon = require('sinon');
var b = require('./b');
// sinon.stub replaces b.b1 until b1stub.restore() is called.
var b1stub = sinon.stub(b, 'b1');

This works because in node.js each require()’d module is cached and behaves like a singleton. So the b in your test file is the same object as the b in a.js. Therefore, sinon can overwrite the b.b1 function in the test file, which results in the call to b.b1 inside testMe ending up in sinon’s replacement.

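Unfortunately, this doesn’t work for the a1 call. This function is not exported by a.js, and even if it was, overwriting it with sinon would not have any effect. The reason is that testMe calls a1 “directly”: it uses the local a1 binding inside a.js, not a property of the exported object, so replacing a property on the exports changes nothing.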
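The difference can be reproduced without any test library. The sketch below is a dependency-free illustration, not sinon itself: the two “modules” are plain objects in a single file (an assumption made to keep it self-contained), and the patching is done by hand.

```javascript
// Stand-in for b.js: an exports object shared by everyone who requires it.
var b = {
  b1: function (num) { return 5 * num; }
};

// Stand-in for a.js, wrapped in a function to give it a private scope.
var a = (function () {
  function a1(num) { return 2 * num; }

  function testMe(num) {
    return {
      x: a1(num),   // resolved through the local binding
      y: b.b1(num)  // looked up on the shared b object at call time
    };
  }

  // a1 is exported here only to show that patching the export has no effect.
  return { testMe: testMe, a1: a1 };
})();

// Patch both functions from the "test" side.
var originalB1 = b.b1;
b.b1 = function () { return 'patched b1'; };
a.a1 = function () { return 'patched a1'; };

var patched = a.testMe(3);
console.log(patched); // x is still computed by the original a1

// Put b.b1 back by hand, which is what a restore() call would do.
b.b1 = originalB1;
var restored = a.testMe(3);
console.log(restored);
```

Patching b.b1 changes what testMe returns for y, while patching the exported a1 leaves x untouched. That is the gap rewire fills.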

One way to work around this is to modify the source code in the following way.

Modified a.js

JavaScript

// a.js - modified (ugly! don't do this!)
var b = require('./b');

function testMe(num) {
  return {
    x: module.exports.a1(num),
    y: b.b1(num)
  };
}

function a1(num) {
  return 2 * num;
}

module.exports = {
  a1: a1,
  testMe: testMe
};

In your test file, you can now require('./a') and use sinon to mock a.a1(). Ugh. This is not elegant at all.

Fortunately, rewire allows us to overwrite even private functions. All you need to do in your test file is the following:

Monkey-patching a.a1 using rewire and sinon

JavaScript

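var rewire = require('rewire');
var sinon = require('sinon');

// rewire loads a.js like require does, but adds __set__ and __get__.
var a = rewire('./a');
var a1mock = sinon.mock();

// Overwrite the private a1 function with the mock.
var restore = a.__set__('a1', a1mock);

// Do your test.

// Restore the original a1 function.
restore();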

This is a bit more work, but at least you don’t need to change the source code.
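To get a feel for what __set__ does, here is a rough imitation written by hand. It is not how rewire is implemented; the createModuleA factory and its hand-written __set__ are hypothetical names used only for this sketch, and the private function is kept on an internal object so that it can be swapped.

```javascript
// Hypothetical stand-in for a rewired a.js: a1 stays private, but the
// module exposes a __set__ hook that can reassign it.
function createModuleA() {
  var privates = {
    a1: function (num) { return 2 * num; }
  };

  function testMe(num) {
    return { x: privates.a1(num) };
  }

  // Replace a private by name and return a function that undoes the change.
  function __set__(name, value) {
    var original = privates[name];
    privates[name] = value;
    return function restore() {
      privates[name] = original;
    };
  }

  return { testMe: testMe, __set__: __set__ };
}

var a = createModuleA();
var calls = [];

// Overwrite the private a1 with a fake that records its arguments.
var restore = a.__set__('a1', function (num) {
  calls.push(num);
  return 'fake';
});

var during = a.testMe(4);

// Put the original a1 back.
restore();
var after = a.testMe(4);
```

The important part is the returned restore function: calling it at the end of a test keeps the replacement from leaking into other tests.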

In the previous post, we talked about spying. With a test spy, you can spy on method calls and see how they are called. This is a bit like listening in on a conversation passively.

The next step is to provide an answer to a method call, similar to interrupting someone who is about to answer a question, and giving your own – different – answer.

This is done via a test stub. Let’s look at an example. You have a website with users, and you want to show the current weather for the logged in user. To do that, you have created a simple function:

generateWeatherReport

JavaScript

function generateWeatherReport(user) {
  var city = user.location.city;
  var countryCode = user.location.countryCode;
  return weatherService.fetchLocalWeather(city, countryCode)
    .then(function (weatherData) {
      return sprintf('The weather in %s, %s is %s.',
        city, countryCode, weatherData.summary);
    });
}

In goes a user, out comes the one-line weather report. Your helper function fetchLocalWeather of the weatherService calls a weather website’s API to fetch the local weather.

How can we test this?

weather.spec.js first try

require('should');
var weather = require('./weather');

describe('generateWeatherReport', function () {
  it('should generate a weather report', function (done) {
    var user = {
      location: {
        city: 'Amsterdam',
        countryCode: 'NL'
      }
    };
    weather.generateWeatherReport(user)
      .then(function (report) {
        report.should.equal('The weather in Amsterdam, NL is rainy, cold and miserable.');
        done();
      })
      .catch(function (err) {
        done(err);
      });
  });
});

This code runs. However, the test only succeeds when it is actually rainy, cold and miserable in Amsterdam. Admittedly, this is a large part of the year, but it would be better if this test also succeeds when it’s sunny in Amsterdam.

Again, the sinon module comes to the rescue. With it, we can stub the fetchLocalWeather call, and make it return what we want for a given input. Like this:

        report.should.equal('The weather in Amsterdam, NL is sunny and great.');
        done();
      })
      .catch(function (err) {
        done(err);
      })
      .done(function () {
        fetchLocalWeatherStub.restore();
      });
  });
});

It’s necessary to import the weatherService in the test file, even though we are not using it directly. Because imported modules in node.js are only imported once, and then behave like a singleton, the weatherService in the spec file is the same as in the weather.js file.

Using sinon, we overwrite the fetchLocalWeather function. If it’s called with arguments Amsterdam, NL, we return a fixed value.

Note that it’s important to restore this stub when you’re done with this test, otherwise fetchLocalWeather for Amsterdam, NL will keep returning this value for other tests in your test suite.
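The mechanics of stubbing and restoring can be sketched without sinon. Everything below is an assumption made for illustration: the weatherService object is a stand-in for the real module, and stubMethod is a hypothetical helper that roughly mimics what a stub with a restore function provides.

```javascript
// Stand-in for the real weatherService; the real one would do an HTTP call.
var weatherService = {
  fetchLocalWeather: function (city, countryCode) {
    return Promise.resolve({ summary: 'whatever the API says' });
  }
};

// Hypothetical helper: replace obj[name] with fake and hand back a
// function that puts the original back.
function stubMethod(obj, name, fake) {
  var original = obj[name];
  obj[name] = fake;
  return function restore() {
    obj[name] = original;
  };
}

var original = weatherService.fetchLocalWeather;
var seenArgs = [];

var restore = stubMethod(weatherService, 'fetchLocalWeather',
  function (city, countryCode) {
    seenArgs.push([city, countryCode]);
    // Only Amsterdam, NL gets the canned answer.
    if (city === 'Amsterdam' && countryCode === 'NL') {
      return Promise.resolve({ summary: 'sunny and great' });
    }
    return original.call(weatherService, city, countryCode);
  });

var stubbedWhileActive = weatherService.fetchLocalWeather !== original;

weatherService.fetchLocalWeather('Amsterdam', 'NL')
  .then(function (weatherData) {
    console.log(weatherData.summary);
  })
  .then(restore, restore); // restore whether the call succeeded or failed
```

Passing restore as both the success and the failure handler is one way to make sure the original function comes back even when the test fails.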

Now if only we could improve the actual weather in Amsterdam with sinon!

I have to disappoint you. Unfortunately (or maybe not), spying in the world of unit testing is nowhere near as exciting as spying in real life. At least, that’s what I think. Maybe the view we non-spies have on spying, created by all these films and series, is completely incorrect. Who knows, maybe unit test spying is more exciting than real world spying.

Anyway, spying. A test spy is a way to verify that a specific call, method or function has been called correctly.

Sounds pretty vague, right? Let’s look at some code to make that more concrete.

shop.js

JavaScript

// shop.js
var myLogger = require('./mylogger');

function calculateTotal(items) {
  var totalPrice = 0;
  var totalItems = 0;
  items.forEach(function (item) {
    totalItems += item.count;
    totalPrice += item.count * item.price;
  });
  if (totalPrice > 1000) {
    myLogger.log('An order of ' + totalPrice + ' is being considered!');
  }
  return {
    totalPrice: totalPrice,
    totalItems: totalItems
  };
}

module.exports = {
  calculateTotal: calculateTotal
};

This is a module for your web shop, with a function in it to calculate the total price. Of course you are going to be very excited every time a large order is about to happen, so that’s why you decided to log every calculation where the total is over 1000.

For now, you’re just logging this to console, using mylogger.js:

mylogger.js

JavaScript

// mylogger.js
function log(message) {
  console.log(message);
}

module.exports = {
  log: log
};

How can we test whether the reporting mechanism works?

We could try to fiddle around with capturing STDOUT or overwriting console.log. That’s certainly an option, but it’s a bit crude. Besides, if we change the log function to send an email instead of writing to console, it won’t work anymore.

What are we really interested in? Well, we want to know whether the log function has been called by calculateTotal. We don’t need the actual log function to be called – this might even cause problems, if log would send an email or write to a database.

Here, a spy comes in handy. We replace the myLogger.log call by a spy. This has 2 advantages:

    logSpy.getCall(0).args[0].should.equal('An order of 1400 is being considered!');
  });
});
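For readers who want to see the whole idea in one runnable piece, here is a dependency-free sketch. It does not use sinon: createSpy is a hypothetical minimal spy, myLogger is a stand-in object, and calculateTotal is inlined with the log message taken from the assertion above.

```javascript
// Stand-in for mylogger.js.
var myLogger = {
  log: function (message) { console.log(message); }
};

// Same logic as calculateTotal in shop.js, inlined to keep this runnable.
function calculateTotal(items) {
  var totalPrice = 0;
  var totalItems = 0;
  items.forEach(function (item) {
    totalItems += item.count;
    totalPrice += item.count * item.price;
  });
  if (totalPrice > 1000) {
    myLogger.log('An order of ' + totalPrice + ' is being considered!');
  }
  return { totalPrice: totalPrice, totalItems: totalItems };
}

// Hypothetical minimal spy: records the arguments of every call and
// does nothing else.
function createSpy() {
  function spy() {
    spy.calls.push(Array.prototype.slice.call(arguments));
  }
  spy.calls = [];
  return spy;
}

var realLog = myLogger.log;
var logSpy = createSpy();
myLogger.log = logSpy;

// A small order should not be logged.
var small = calculateTotal([{ count: 2, price: 10 }]);
var callsAfterSmallOrder = logSpy.calls.length;

// A large order should be logged exactly once.
var large = calculateTotal([{ count: 7, price: 200 }]);

myLogger.log = realLog; // always put the real function back
```

Because the spy replaces the real log function, nothing is written to the console during the test, and the recorded calls tell you exactly what calculateTotal tried to log.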

Another situation where a spy could come in handy is when you are calling a function that expects a callback. In that case, you can just create a spy with sinon.spy() and pass that as callback. Afterwards, you can verify that the callback was called with the right arguments. You can find an example in the sinon.js documentation.
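The callback case can be sketched in the same hand-rolled way. Both createSpy and forEachEven are made-up names for this illustration; with sinon you would pass sinon.spy() instead of the hand-written spy.

```javascript
// Hypothetical minimal spy that records the arguments of each call.
function createSpy() {
  function spy() {
    spy.calls.push(Array.prototype.slice.call(arguments));
  }
  spy.calls = [];
  return spy;
}

// Made-up function under test: reports each even number, together with
// its index, to a callback.
function forEachEven(numbers, callback) {
  numbers.forEach(function (n, index) {
    if (n % 2 === 0) {
      callback(n, index);
    }
  });
}

var callback = createSpy();
forEachEven([1, 2, 3, 4], callback);

// callback.calls now holds one entry per invocation.
console.log(callback.calls);
```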

Today, you have been asked by your uncle to help him set up the new MegaBanana Slide in his Fun Park. The MegaBanana Slide is meant for children, but not too small children. Certainly not for adults. And you must take off your shoes before using it. Oh, and of course you are not allowed to go together with your friend – each child has to wait for its turn!

The MegaBanana Slide!

To prevent any accidents, the MegaBanana Slide has a sophisticated Child Measurement System, where any child who stands near the start of the Slide is measured. The results of this measurement are then fed into the Permission Granting System, which then opens the gate – or not.

Your uncle knows that you are a star programmer, so he has asked you to program the PGS according to his specifications:

- only 1 child at a time
- no shoes
- no children under 90 cm (that’s about 3 feet)
- absolutely nobody who weighs more than 125 kg (275 lbs)!

The PGS wants an object with a canUse field and a reason field. If canUse is false, the reason is shown to the poor child who is not allowed to go down the slide.

Writing this code takes you all of 3 minutes and 42 seconds.

slide.js

JavaScript

function canUseSlide(person) {
  if (Array.isArray(person) && person.length > 1) {
    // Sneaky! Trying to go with more than 1 person together!
    return {
      canUse: false,
      reason: 'Only 1 person at a time!'
    };
  }
  if (person.weight > 125) {
    // We don't want the slide to break!
    return {
      canUse: false,
      reason: 'This slide is for children only.'
    };
  }
  if (person.height < 90) {
    // Small children are not allowed on this dangerous slide.
    return {
      canUse: false,
      reason: 'You are not tall enough yet.'
    };
  }
  if (person.isWearing('shoes')) {
    // You're only allowed to go down the slide barefoot.
    return {
      canUse: false,
      reason: 'You must take off your shoes first!'
    };
  }
  return {
    canUse: true,
    reason: ''
  };
}

// Exported so that the spec file can call slide.canUseSlide.
module.exports = {
  canUseSlide: canUseSlide
};

Easy peasy. But you know that your uncle is a nitpick, so you decide to write some unit tests (even though he has no clue what that term even means) to verify that your code does the right thing.

slide.spec.js first version

JavaScript

require('should');
var slide = require('./slide');

describe('canUseSlide', function () {
  it('should verify if you can use the slide: child the slide is intended for', function () {
    var result = slide.canUseSlide({
      height: 105,
      weight: 15,
      isWearing: function () { return false; }
    });
    result.should.eql({
      canUse: true,
      reason: ''
    });
  });

  it('should verify if you can use the slide: large adult instead of child', function () {
    var result = slide.canUseSlide({
      height: 193,
      weight: 151,
      isWearing: function () { return false; }
    });
    result.should.eql({
      canUse: false,
      reason: 'This slide is for children only.'
    });
  });

  it('should verify if you can use the slide: too small child', function () {
    var result = slide.canUseSlide({
      height: 83,
      weight: 11,
      isWearing: function () { return false; }
    });
    result.should.eql({
      canUse: false,
      reason: 'You are not tall enough yet.'
    });
  });

  it('should verify if you can use the slide: wearing shoes', function () {
    var result = slide.canUseSlide({
      height: 105,
      weight: 15,
      isWearing: function () { return true; }
    });
    result.should.eql({
      canUse: false,
      reason: 'You must take off your shoes first!'
    });
  });

  it('should verify if you can use the slide: 2 children at the same time', function () {
    var result = slide.canUseSlide([{
      height: 105,
      weight: 15,
      isWearing: function () { return false; }
    }, {
      height: 115,
      weight: 17,
      isWearing: function () { return false; }
    }]);
    result.should.eql({
      canUse: false,
      reason: 'Only 1 person at a time!'
    });
  });
});

5 possible situations, so 5 tests. But this test code hurts your eyes. So much copy-pasting going on here, agh. You can do better.

Let’s make an array of test situations. We’ll give each one a name (to put in the it description), the input of canUseSlide, and the expected output.

slide.spec.js version 2

JavaScript

require('should');
var slide = require('./slide');

describe('canUseSlide 2', function () {
  var testCases = [
    {
      name: 'child the slide is intended for',
      person: {
        height: 105,
        weight: 15,
        isWearing: function () { return false; }
      },
      expectedResult: {
        canUse: true,
        reason: ''
      }
    },
    {
      name: 'large adult instead of child',
      person: {
        height: 193,
        weight: 151,
        isWearing: function () { return false; }
      },
      expectedResult: {
        canUse: false,
        reason: 'This slide is for children only.'
      }
    },
    {
      name: 'too small child',
      person: {
        height: 83,
        weight: 11,
        isWearing: function () { return false; }
      },
      expectedResult: {
        canUse: false,
        reason: 'You are not tall enough yet.'
      }
    },
    {
      name: 'wearing shoes',
      person: {
        height: 105,
        weight: 15,
        isWearing: function () { return true; }
      },
      expectedResult: {
        canUse: false,
        reason: 'You must take off your shoes first!'
      }
    },
    {
      name: '2 children at the same time',
      person: [{
        height: 105,
        weight: 15,
        isWearing: function () { return false; }
      }, {
        height: 115,
        weight: 17,
        isWearing: function () { return false; }
      }],
      expectedResult: {
        canUse: false,
        reason: 'Only 1 person at a time!'
      }
    }
  ];

  testCases.forEach(function (tc) {
    it('should verify if you can use the slide: ' + tc.name, function () {
      var result = slide.canUseSlide(tc.person);
      result.should.eql(tc.expectedResult);
    });
  });
});

Excellent. You have separated the data from the test execution. It’s now easy and clear how to add another test. It’s even possible to put the test data in a different file.

However, your test file just went up from 68 lines to 78 lines. And there is still a lot of duplication.

In the test data, the expected result can be reduced to just the reason. During test execution, you can then create the expected result object from the reason (after all, canUse is true if the reason is empty, and false otherwise).

Also, you decide to make a helper function to create the person objects.

This is the final version of your test file:

slide.spec.js version 3

JavaScript

require('should');
var slide = require('./slide');

describe('canUseSlide 3', function () {
  function createPerson(height, weight, isWearingShoes) {
    return {
      height: height,
      weight: weight,
      isWearing: function () { return isWearingShoes; }
    };
  }

  var testCases = [
    {
      name: 'child the slide is intended for',
      person: createPerson(105, 15, false),
      reason: ''
    },
    {
      name: 'large adult instead of child',
      person: createPerson(193, 151, false),
      reason: 'This slide is for children only.'
    },
    {
      name: 'too small child',
      person: createPerson(83, 11, false),
      reason: 'You are not tall enough yet.'
    },
    {
      name: 'wearing shoes',
      person: createPerson(105, 15, true),
      reason: 'You must take off your shoes first!'
    },
    {
      name: '2 children at the same time',
      person: [createPerson(105, 15, false), createPerson(115, 17, false)],
      reason: 'Only 1 person at a time!'
    }
  ];

  testCases.forEach(function (tc) {
    it('should verify if you can use the slide: ' + tc.name, function () {
      var result = slide.canUseSlide(tc.person);
      var expectedResult = {
        canUse: tc.reason.length === 0,
        reason: tc.reason
      };
      result.should.eql(expectedResult);
    });
  });
});

51 lines of lean and mean test code. The test data is very clear and compact. Not bad!

Your uncle installs your code on the PGS, and soon after, the MegaBanana Slide is fully operational. Everyone is happy!

Except that you still have this slight nagging feeling that it should be possible somehow to refactor the person field of the test data so you can move the createPerson calls from the test data to the test execution. But you can’t think of an elegant way…

Note: The code snippets contain a few things that should be done differently in production code. For example, using asynchronous fs calls (e.g. fs.readFile instead of fs.readFileSync); using "use strict"; and using modules like log4js and sinon.

Step 1

This is the code we shall write tests for:

Unit test server version 1

JavaScript

// server.js
// Express is a web framework for node.js.
var express = require('express');
// Moment is a date manipulation library.
var moment = require('moment');
// Fs is a core module of node.js to manipulate the file system.
var fs = require('fs');

var app = express();

// When a http request is done on /count/start, execute this code.
// req contains request information, which we are not using.
// res is the response object.
app.get('/count/start', function (req, res) {
  // Read the server log file
  var data = fs.readFileSync(__dirname + '/server.log').toString();
  // and count how many lines contain the words 'server started'.
  var numStartLines = data
    .split(/\r\n|\r|\n/)
    .filter(function (line) { return /server started/.test(line); })
    .length;
  // Send a JSON answer that shows how many times the server has been started.
  res.send({ numStart: numStartLines });
});

var server = app.listen(64001, function () {
  // When the server starts, log that it started.
  var msg = moment().format('YYYY-MM-DD HH:mm:ss') + ' - server started\n';
  fs.appendFileSync(__dirname + '/server.log', msg);
  var host = server.address().address;
  var port = server.address().port;
  console.log("Unit test 1 app v1 listening at http://%s:%s", host, port);
});

This code starts an HTTP server that listens on localhost:64001. It writes to a log file whenever it is started. On an HTTP request to http://localhost:64001/count/start, it returns a JSON structure that shows how many times the server has started. Like this:

Curling the server

Shell

czapka:~$ curl localhost:64001/count/start
{"numStart":5}

We want to test if the correct count is returned by our URL – so the code that is the second argument of app.get('/count/start', function). How can we do that?

If we load server.js in our test file, the server is actually started and the real log file is being used. This is not an option. Besides, running the server and testing the HTTP call is a high level test – let’s test it on a lower level. To do this, we need to separate the count start function from the server itself.

Step 2

We’ll create a count.js file to put the count start functionality in. It looks like this:

Step 2 count.js

JavaScript

// count.js
var fs = require('fs');

function countStart(req, res) {
  var data = fs.readFileSync(__dirname + '/server.log').toString();
  var numStartLines = data
    .split(/\r\n|\r|\n/)
    .filter(function (line) { return /server started/.test(line); })
    .length;
  res.send({ numStart: numStartLines });
}

module.exports = {
  countStart: countStart
};

The app.get() line in server.js needs to be changed as well:

Step 2 modifications to server.js

JavaScript

// Modifications to server.js
var count = require('./count');

app.get('/count/start', count.countStart);

Excellent. Now we can create count.spec.js to test the new countStart function. Let’s give it a shot. We’ll use the mocha framework for testing and the should module to check actual results against expected results.

Step 2 count.spec.js first attempt

JavaScript

// count.spec.js
require('should');
var count = require('./count');

describe('countStart', function () {
  it('should count the number of times the server was started', function () {
    var req = ..?
    var res = ..?
    count.countStart(req, res);
    something.should.equal(somethingElse);
  });
});

There are a few problems here.

1. This test will read the real server.log file. Therefore, the number of “server started” lines can be different each time you run this test. Additionally, the file may not exist at all.

2. The countStart function does not return a useful value; instead it calls res.send with the value to be tested. That makes it more difficult to test.

Let’s get it to work anyway, by creating a fake res object with a send method.

Step 2 count.spec.js

JavaScript

// count.spec.js
require('should');
var count = require('./count');

describe('countStart', function () {
  it('should count the number of times the server was started', function () {
    var dataSent;
    var fakeResObject = {
      send: function (data) {
        dataSent = data;
      }
    };
    count.countStart({}, fakeResObject);
    // We know that dataSent should be an object with the key
    // "numStart". However, we don't know the value.
    dataSent.should.have.keys(['numStart']);
    dataSent.numStart.should.be.type('number');
  });
});

Now we have something that works, most of the time. In my opinion, this is not acceptable test code yet. So let’s fix the first problem, that the test code uses the same server.log file as the server itself.

Step 3

The problem with the current approach is that the countStart function determines the log file by itself. It would be better if the log file was determined elsewhere, and then given to (or asked by) the countStart function. Ever since I saw Miško Hevery’s talk on Dependency Injection, I’ve been a fan of DI; and even if this is not exactly DI, it’s certainly very similar.

We’ll move the log file related code to a separate file, log.js, and we’ll make it possible to set and get the filename of the log file.

Note that we can now actually test whether the number of lines is 2, because we control the log file in the test. In step 2, we could not predict how many lines there would be. That means that this test tests our function better. In step 2, the countStart function could just as easily always have returned 42, and our test would not have noticed that.
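As an illustration of this step, here is a self-contained sketch. The setFile and getFile names come from the calls used later in this post; everything else is an assumption: log is written as an inline object instead of a separate log.js file, the default file name is a guess, and the test is a plain script using a temporary directory rather than a mocha spec.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Sketch of log.js: it only remembers which file is the log file.
var log = (function () {
  var filename = 'server.log'; // assumed default
  return {
    setFile: function (newFilename) { filename = newFilename; },
    getFile: function () { return filename; }
  };
})();

// countStart as in step 2, except that it asks log for the file name.
function countStart(req, res) {
  var data = fs.readFileSync(log.getFile()).toString();
  var numStartLines = data
    .split(/\r\n|\r|\n/)
    .filter(function (line) { return /server started/.test(line); })
    .length;
  res.send({ numStart: numStartLines });
}

// The test controls the log file, so it can predict the count.
var testDir = fs.mkdtempSync(path.join(os.tmpdir(), 'count-test-'));
var testFile = path.join(testDir, 'unit-test-file.log');
fs.writeFileSync(testFile,
  '2015-01-01 10:00:00 - server started\n' +
  'some unrelated line\n' +
  '2015-01-02 11:30:00 - server started\n');

log.setFile(testFile);

var dataSent;
countStart({}, { send: function (data) { dataSent = data; } });

// Clean up the temporary file and directory.
fs.unlinkSync(testFile);
fs.rmdirSync(testDir);
```

Because the test writes the file itself, it can assert on the exact count instead of only checking that a number came back.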

Regarding our DI approach: we’re not injecting the log file name into the countStart function yet, because this function is being called by the express framework. Let’s work on that now. This should also fix the second problem mentioned at the end of step 2.

Step 4

If we look at the countStart function, we see that it is currently doing 2 things: it is handling the HTTP request and sending a response, and it’s counting the “server started” lines in the log file. In general, when a function does 2 things that are easily separated, it’s good to separate the two. This makes the code cleaner, easier to reuse, and easier to test.

Here’s the new count.js:

Step 4 count.js

JavaScript

// count.js
var fs = require('fs');
var log = require('./log');

function countStart(req, res) {
  var numStartLines = _countStart(log.getFile());
  res.send({ numStart: numStartLines });
}

function _countStart(filename) {
  var data = fs.readFileSync(filename).toString();
  var numStartLines = data
    .split(/\r\n|\r|\n/)
    .filter(function (line) { return /server started/.test(line); })
    .length;
  return numStartLines;
}

module.exports = {
  countStart: countStart
};

The function countStart now only gets the number of “server started” lines from elsewhere, and sends it back in a data structure. The _countStart function (ok, the name could have been better) no longer has any knowledge about HTTP requests or responses, and just counts lines in a file.

This also means that we should test both functions. Both tests will be easier than the single test of step 3.

First, we’ll test _countStart. Currently, this function is not listed in the exports of count.js, because it’s a private function (maybe it should be, especially if we want to use this functionality in other parts of the code). There is a trick to get access to this function in our test suite, and that trick is the rewire module.

This is how we test _countStart.

Step 4 count.spec.js countStart part

JavaScript

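// count.spec.js first part
require('should');
var fs = require('fs');
var rewire = require('rewire');
var log = require('./log');
var count = rewire('./count');

describe('_countStart', function () {
  var logFileName = __dirname + '/unit-test-file.log';

  function cleanupTestFile() {
    if (fs.existsSync(logFileName)) {
      fs.unlinkSync(logFileName);
    }
  }

  beforeEach(cleanupTestFile);
  afterEach(cleanupTestFile);

  it('should count the number of times the server was started', function () {
    // Write a log file with exactly two "server started" lines.
    fs.writeFileSync(logFileName,
      '2015-01-01 10:00:00 - server started\n' +
      'an unrelated line\n' +
      '2015-01-02 11:00:00 - server started\n');
    // __get__ gives access to the private function.
    var _countStart = count.__get__('_countStart');
    _countStart(logFileName).should.equal(2);
  });
});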

This looks very similar to the test in step 3. The difference is that we don’t fiddle around with a fake response object anymore: _countStart simply returns the count.

The other test is of countStart. In this test, we use a very important principle: we don’t need to test what has already been tested elsewhere. Therefore, we are going to replace the _countStart function by a function of our own. This is called monkey patching.

Step 4 count.spec.js second part

JavaScript

// count.spec.js second part
describe('countStart', function () {
  it('should respond with the number of times the server was started', function () {
    var dataSent;
    var fakeResObject = {
      send: function (data) {
        dataSent = data;
      }
    };
    var logFileName = 'testfile.log';
    log.setFile(logFileName);
    var logFileArgument;

    // We overwrite the private _countStart function so it returns what we
    // want. After all, we are testing the countStart function here; the
    // _countStart function has already been tested above.
    // Also, we want to verify that countStart calls _countStart with the correct
    // filename argument.
    // The __set__() call returns a function that, when called, resets the
    // overwritten function back to its original.
    var reset = count.__set__('_countStart', function (filename) {
      logFileArgument = filename;
      return 2;
    });

    count.countStart({}, fakeResObject);
    dataSent.should.eql({ numStart: 2 });
    logFileArgument.should.equal(logFileName);
    reset();
  });
});

That’s a lot of preparation for a single test! It happens often that the preparation for a unit test is significantly more code than the call and verification. It’s just part of testing.

In the above test, we still need the fake response object, but we don’t need to write files anymore.

This is a reasonable final state for our code. There is one more refactoring improvement that can be done.

Step 5

The last thing that bothers me is the _countStart function. This function does too much: it reads stuff from a file, and it counts stuff. Especially the counting part is a reasonably complex operation, and we’d like to test that separately.

Let’s split it into two parts.

Step 5 count.js modification to _countStart

JavaScript

// count.js modifications
function _countStart(filename) {
  var data = _readFile(filename);
  return _countMatchingLinesInString(data, /server started/);
}

function _readFile(filename) {
  return fs.readFileSync(filename).toString();
}

function _countMatchingLinesInString(string, regexp) {
  return string.split(/\r\n|\r|\n/)
    .filter(function (line) { return regexp.test(line); })
    .length;
}

There. Now it’s much easier to test the matching function.

Step 5 count.spec.js testing matching function

JavaScript

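// count.spec.js part
describe('_countMatchingLinesInString', function () {
  it('should count the number of lines that match in a string', function () {
    // count is the rewired module, so __get__ can reach the private function.
    var _countMatchingLinesInString = count.__get__('_countMatchingLinesInString');
    var text = 'server started\nsomething else\r\nserver started';
    _countMatchingLinesInString(text, /server started/).should.equal(2);
  });
});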

Looking at these four functions, we can notice a difference between the two pairs. _readFile and _countMatchingLinesInString are functions that do actual work. Let’s call them worker functions. The functions countStart and _countStart don’t do any work, but they link worker functions together. Let’s call these linker functions.

To test a linker function, all you need to do is to verify that it calls the right worker functions in the right order, with the right arguments. This can be done by mocking the worker functions; there is no need to call the actual worker functions. For example, for _countStart:
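A minimal sketch of such a linker test follows. It makes one simplifying assumption: the worker functions live on a workers object so that they can be swapped without rewire, whereas in count.js they are private functions that you would replace with __set__. The callLog array is likewise just an illustration device.

```javascript
// Sketch only: the workers are held on an object so they can be replaced
// by hand. The real implementations must not run in this test.
var workers = {
  _readFile: function (filename) {
    throw new Error('the real _readFile should not be called in this test');
  },
  _countMatchingLinesInString: function (string, regexp) {
    throw new Error('the real counter should not be called in this test');
  }
};

// The linker function: it only wires the two workers together.
function _countStart(filename) {
  var data = workers._readFile(filename);
  return workers._countMatchingLinesInString(data, /server started/);
}

// Replace both workers with fakes that record how they were called.
var callLog = [];
workers._readFile = function (filename) {
  callLog.push({ name: '_readFile', args: [filename] });
  return 'fake file contents';
};
workers._countMatchingLinesInString = function (string, regexp) {
  callLog.push({
    name: '_countMatchingLinesInString',
    args: [string, String(regexp)]
  });
  return 42;
};

var result = _countStart('testfile.log');
```

The test then checks three things: the order of the entries in callLog, the arguments each worker received, and that the linker passes the last worker’s return value through unchanged.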

For this specific example, I’m not sure whether step 5 is overkill. It might be. I found it important to show you though, because for more complex situations, the separation into worker functions and linker functions can make your life much easier.

Conclusions

Even for a very simple server that only does one thing, writing unit tests will make you think about and improve the design of your code. If a function or module is hard to test, it is often an indication of a problem with the design.

When testing, keep in mind dependency injection and the difference between worker functions and linker functions. Also think about which parts of the code have already been tested elsewhere, so you know when it’s safe to mock or monkey patch that code. This will help you to restructure your code to make it easier to test – and hopefully also easier to reuse and maintain.

If you have done or read anything about unit testing, I’m sure you’ve encountered the standard example of a function that multiplies two numbers and returns the result. This function is pretty easy to test.

Once your functions become more complex than that, it can become a real challenge to figure out how to test them. Let’s look at a function that is slightly more complex than the multiplication one.

A function that filters items by profit

function filterItems(items, totalProfit) {
  return items.filter(function (item) {
    return item.amount * item.profit >= totalProfit;
  });
}

This function takes 2 input parameters: a list of items, and a total profit value. It returns a new list, with only the items that have a total profit of at least the given total profit value. The profit is determined by multiplying the amount of items and the profit per item.

You use it like this:

Using the filterItems function

var items = [
  { amount: 10, profit: 100 },
  { amount: 20, profit: 30 }
];
var filteredItems = filterItems(items, 700);
// filteredItems = [{ amount: 10, profit: 100 }];

There is your first test already. What else could you test here?

1. An empty list as input
2. An object with amount * profit exactly equal to the totalProfit argument
3. The same, but now with floating point numbers
4. An object with negative profit
5. An object with zero amount
6. A negative totalProfit argument

I’m not saying you need to test all these, but you could, and they could be meaningful tests. Which ones would you actually choose to implement? Let’s go over them, the normal scenarios first.

Test 1 is simple but also does not contribute very much. Filtering an empty list returns an empty list, no matter what the filter is.

Number 2 is a good one: the function uses the >= operator, not the > operator, and does so for a reason. It’s good to test that.

The third test is an interesting one. Should we care about floating point multiplication issues here? I would like to confirm that the object {amount: 40, profit: 17.5} will also be in the returned list when calling the function with 700 as second argument. So let’s do it, and see what happens.

The fourth one is not that relevant to me. After all, we know that a number below 0 is smaller than the given (positive) profit. We don’t need to test if the >= operator works correctly.

For the same reason, test 5 is not that useful to me. We know that multiplying a number by zero works.

Finally, test 6: a negative totalProfit argument. Here we come to a different issue: in the system that this function lives in, is it normal to have negative profits, to show losses? Is that an acceptable situation? Or is the profit always zero or higher, because if there is a loss, it’s administered in a different field?

If the profit can be negative, that means that a negative totalProfit argument makes sense. I would add a test with a few objects, some with positive and some with negative profit, and filter on a negative totalProfit. Yes, technically you are again testing the >= operator, but I think this is a valuable test, because this is conceptually different. This test is not for the computer, it is for the people who look at this code in the future.

So, of the 6 tests, I would do 2, 3 and 6.

Oh wait, what about the possible error scenarios?

1. Don’t pass a second argument

2. Pass a negative amount and a negative profit

3. Pass a string as first argument instead of an array of objects

4. Pass an array of strings as first argument

5. Pass a string as second argument

6. Trigger an “out of memory” error when the function is called

Let’s do the same exercise.

Test 1 exposes a problem in our code immediately. Apparently, in JavaScript, when you compare any number to undefined, you get false (I had to test that myself too). So instead of an error, you always get an empty array back. Personally I’d rather get an error – much easier to debug.
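
If you want to check this yourself, a few lines in the Node.js console are enough. This is only a small demonstration of the language behaviour, not part of the function: undefined is converted to NaN, and every comparison with NaN is false.

```javascript
// undefined becomes NaN when it is used in a numeric comparison.
console.log(Number(undefined));   // NaN

// Any comparison with NaN is false, in both directions.
console.log(1000 >= undefined);   // false
console.log(1000 < undefined);    // false
```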

The second test is funny. Multiplying two negative numbers results in a positive number. So when filtering for 500, and you pass {amount:-10,profit:-100}, this object will be in the return array. Now, the question is whether our filterItems function should care that this is possible at all. Maybe the array of items is created via some other function, and thus this scenario could never happen. So, it depends on the rest of the system whether you should make a test for 2.

Number 3 is typically a test that you don’t see often. Most developers, even when writing unit tests, don’t tend to make tests for input that is utterly incorrect. This is a reasonable approach in my opinion, because otherwise you’d be writing dozens of extra tests for each function, with marginal value.

If you’re familiar with JavaScript, you know what will happen if you call filterItems with a string as first argument: you’ll get an error on the items.filter statement. Good enough for me.

Test 4 is tricky. In other languages, you will get an error because a string does not have the field amount or profit. However, in JavaScript, doing "test".profit gives undefined. So our function will always return an empty array when you use an array of strings as input. Is that a problem? Probably not in this case. filterItems is a very low-level function, so most of the interactions with this function will be done by developers. They should know better.

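5 turns out to have mostly the same result as 1. Comparing a number to a string that does not look like a number returns false. (A numeric string such as "700" is converted to a number first, so in that case the filter behaves normally.) Again, I’d rather get an error than a silent conversion or an empty array as return value.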

The sixth test is quite drastic. An out of memory error? I don’t think I want to test this. First, it’s probably going to take some work to simulate this. And second, if this function hits an out of memory error, there will be bigger problems than just this function failing.

So, what do we do with these six error scenarios? With the code as it is, I don’t think that there are any meaningful error tests. I’d probably change the function: I’d add a check to see if the first argument is an array of objects with the fields amount (which should not be negative) and profit, and if not, throw an error. Then I would add a few tests for that: 2, 3, and 4.

Note that in a different language, you would have different problems. For example, if your object oriented language has static typing, and you have defined a ListOfItems class, you can put all these checks in that class and have the language verify that the first argument is a ListOfItems object. Anyway, this post is not about differences of languages, but how to think about testing.

Phew, that’s quite a bit of thinking, and that just for a 3-line function. So far, we have:

written a few tests for the normal scenarios;

decided not to test certain scenarios because they are already tested elsewhere;

written a test purely to help other developers in the future;

found the need to know how negative profits (losses) are handled in this system;

found the need to know how these lists of item objects are created;

modified the code to add a check for the first argument.

Let’s look at a function that is a bit more complex. This function is part of a system to book hotel rooms and it creates a booking.

function createReservation

function createReservation(reservationData) {
  // moment is a date library.
  var checkin = moment.utc(reservationData.checkin);
  if (!checkin.isValid()) {
    throw new Error('Check-in date ' + reservationData.checkin + ' is not a valid date.');
  }

  var checkout = moment.utc(reservationData.checkout);
  if (!checkout.isValid()) {
    throw new Error('Check-out date ' + reservationData.checkout + ' is not a valid date.');
  }

  // The reservation checker is a dependency from outside this function.
  var isAvailable = reservationChecker.isAvailable({
    roomId: reservationData.roomId,
    checkin: checkin,
    checkout: checkout
  });
  if (!isAvailable) {
    throw new Error('Cannot create reservation because room ' + reservationData.roomId + ' is not available.');
  }

  var reservationObj = new Reservation({
    checkin: checkin,
    checkout: checkout,
    numberOfGuests: reservationData.numberOfGuests,
    guestEmail: reservationData.guestEmail
  });
  reservationObj.calculateCost();

  // Return a promise that resolves to the saved reservation object.
  return reservationObj.save();
}

This might not be the best code ever, but it’s certainly a function that you could encounter in a custom made reservation system. An early version. Anyway, let’s look at what it does.

The only argument is an object with information about the reservation. This object should contain five fields: roomId, checkin, checkout, numberOfGuests, and guestEmail. There are checks done on the checkin and checkout fields, and the availability of the room is checked as well. Then a new Reservation object is created, the cost of the reservation is calculated, and the reservation is saved. A promise is returned that resolves to the saved reservation.

The first two tests can be used to verify if reservations can be short or long.

The third and fourth test make sure that your logic with checkin/checkout for consecutive reservations is correct. I think these are very important tests here: you don’t want to have days that are double booked, and you also don’t want to have gaps of 1 day between reservations.

The fifth test, testing the email address validity, could be useful – but it should not be tested here. This function is about creating reservations, not managing email addresses. So the validity of the email address should be checked elsewhere. Maybe the constructor of the Reservation class calls isValidEmail on the given email address. In that case, there should be tests for this isValidEmail function.

Test 6, different amounts of guests, could be useful. There could be different prices for different amounts of guests. So you could make 2 tests that create a reservation each, for the same room, checkin and checkout, but different numbers of guests. Then you can make sure that the two reservations each have the right price.

On the other hand, those are probably tests that should be done on a lower level: on reservation.calculateCost(). If those tests already exist, you don’t need to duplicate them here. All you need to do is test that calculateCost() is called on the reservation.

What about rainy day scenarios?

Where do we start? That’s going to be a long list.

1. Check-out the same day as check-in

2. Check-out before check-in

3. Room with given id does not exist

4. Room is already booked for the given check-in and check-out

5. Guest email is not a valid email address

6. Number of guests is too large

7. Number of guests is too small

8. The cost could not be calculated

9. Error saving reservation to database

10. Check-in is not a date

11. Check-out is not a date

12. Room id is in the wrong format

13. Check-in is a date time, e.g. “2016-02-15 17:31:44”

14. Guest email is empty

15. Guest email is not an email address

I’m sure you can think of more.

Note that the first 9 scenarios are of a different type than the last 6. The first 9 can happen with each input field being valid in itself. The last 6 are just invalid inputs.

Test 8 leads to some questions. Could this happen? How could this happen? Maybe the room doesn’t have rates defined for the given check-in and check-out dates. Should we test this?

Test 13 is also interesting. What should happen here? Should the function “round down” the date to midnight? What about timezones? Should the checkin and checkout arguments be date strings maybe?

Since this seems to be a very central core function of the system, I would write tests for most of these, except for:

5, because this should be tested elsewhere.

9, because if there is a database error, we have bigger problems than just this function failing.

As you can see, even just thinking about which tests you could write already leads to improvements of the code, extra error checks, finding bugs, and determining which tests should go on which level.

This is the first post in a series about automated testing of software.

A long time ago, before I started my professional software development career, I had no experience with unit testing, or any type of automated testing. When I wrote a piece of code, I ran it manually with different inputs to verify that it worked. The blessings of a physics education, where many scientists were just fiddling around with code!

I’m sure many people did it like that back then, and many people still do nowadays.

Then, at my first job, one of my colleagues insisted that I wrote unit tests for my code. At first, I didn’t see much use of it, but because he was the senior guy and I was the newbie, I did what he said.

That was a massive step forward in my software development skills. Over the years, I gained more experience, and my respect for automated testing kept growing.

In this series, I want to share with you the fine points of automated testing. There are many introductory posts to be found, where a function to multiply 2 numbers is defined and then unit tested. This series will go much further than that. It will be an advanced series, where you will learn how to write tests for a large system properly.

But first, let me try to whet your appetite by listing some advantages of automated testing. I have experienced all these advantages first hand.

You will find more bugs earlier.

This is a no-brainer of course. When you write tests for your code, you will find bugs. And you will find them early – long before your code is deployed.

You will prevent fixed bugs from reappearing.

When you fix a bug, it is good practice to create a test that demonstrates that the bug has been fixed. If you have such a test, and you modify your code later on, you can be confident that this bug will not reappear.

You will prevent bugs in part A when you modify part B.

This is a common, and very frustrating, occurrence. You change a part of your code, and somehow a different part of your code breaks. If you have unit tests for the other part, you will be able to detect the breakage while you are coding – and fix it immediately. This is called regression testing.

You will protect yourself against silly mistakes.

Sometimes your code looks good. And it compiles. And the server starts up. But you made a typo somewhere, maybe a + instead of a -, or 8640 seconds in a day instead of 86400. Or you forgot to include a necessary module. Having a test will find these issues before they become problems.

You can demonstrate how the code is meant to be used.

Tests can be great complements to documentation: they show how your code is meant to be used. I usually start by looking at the tests, when learning new modules or projects. They are often better at showing how the code works than the documentation.

You will refactor your code.

Often, when you are writing tests for a function, you conclude that it’s going to be a lot of work and/or difficult to test this function. The logical solution is to refactor such a function into smaller parts, to make each part easier to test.

It’s not always about function size, but about having a function that does too much. For example, a function that reads a file, parses the content, and sends an email will be hard to test. Split it into 3 parts, and testing will be much easier. The resulting code will be better as well.

Your design will improve.

This is an extension of the previous point. Sometimes you get new insights during refactoring, which leads you to improve your overall architecture.

You will deploy broken code less often.

The more tests you have, the less the risk that you deploy code that doesn’t work correctly. It is of course not a guarantee of faultless deployment, but it will certainly improve the odds.

You will have more confidence in your code.

This is the vaguest advantage, but for me personally a very strong one. When I’m working on code that has proper automated testing, it just feels better. I am more confident about making changes or expanding the code. I work faster because I’m less scared of breaking something important.

These are the reasons why almost all of the code I work on has automated tests in some form. Both my professional and my personal code. There are some exceptions, like one-time scripts and throw-away proof of concept code (does that actually exist?). I hope this list of advantages will nudge you to writing more unit tests.

In your life, you have to make a lot of decisions. Most of the time, there are uncertain factors in your decisions, because they have to do with the future. For example:

Shall I buy insurance for my new, expensive bicycle?

At what time do I leave to be on time for my appointment?

Do we go left or right here?

Later on, you will sometimes look back at such decisions. It is very human to look at the result to determine whether the decision was good.

My bicycle has been stolen! I should have bought insurance.

It’s good that I left early, because the train I took had a huge delay.

We went left as usual, but there was an accident on that route! We should have gone right instead.

This is called results-oriented thinking. This term is used a lot in gaming, for example in poker and in Magic: the Gathering.

Let’s look at this a bit closer. Is this a good way to judge your decisions?

The bicycle insurance. Let’s say your new bicycle costs 1,500 euros, half of your monthly salary. That is a lot of money. An insurance against theft and damage sounds like a good idea.

Of course, if you know beforehand that your bicycle is going to be stolen, the correct decision is to take the insurance. But you cannot know this. The best you can do, is to estimate the probability of theft. Maybe the local police has statistics which indicate that the risk of theft of expensive bicycles is 10% in the first year. Let’s use that number: 10% chance of theft.

If the insurance costs 1 euro for 1 year, most people would take this insurance. And if it costs 1000 euros for 1 year, most people wouldn’t. There is no need for fancy calculations here – people instinctively know that the former is a good deal and the latter is a bad deal.

Somewhere between 1 and 1000 is the right price for this insurance. In this case, it’s a simple calculation: the risk of theft multiplied by the price of the bicycle. So 10% x 1500 euros = 150 euros. *

The right way of judging your decision, is to compare the actual price to the right price. The wrong way is to use the result of the decision: has my bicycle been stolen?

Assuming a 10% chance of theft, buying insurance for 1 euro is a good decision, even if your bicycle is never stolen. And buying insurance for 1000 euros is a bad decision, even if your bicycle is actually stolen.
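
For the programmers among you, here is the same reasoning as a small sketch. The function names are made up for this example. Note what is missing on purpose: there is no "was the bicycle stolen?" parameter, because the result is not an input to the decision.

```javascript
// The right price: the probability of theft times the value of the bicycle.
function fairPrice(theftProbability, bicycleValue) {
  return theftProbability * bicycleValue;
}

// Judge the decision only on what was known when it was made.
function isGoodDecision(insurancePrice, theftProbability, bicycleValue) {
  return insurancePrice <= fairPrice(theftProbability, bicycleValue);
}

console.log(isGoodDecision(1, 0.10, 1500));     // true
console.log(isGoodDecision(1000, 0.10, 1500));  // false
```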

So, next time you are wondering whether a decision you made in the past was a good decision, remember the information you had at the time you made the decision instead of looking at the result, and judge based on that.

Imagine you are a developer, working on a backend system with reservations. A new requirement comes up: if a reservation is not paid within 24 hours, it needs to be expired.

Your reservation table already has a field creationDate. So all you need to do is create a periodic job that looks at all active and unpaid reservations that were created more than 24 hours ago, and sets their status to expired.

Simple, right? You feel a bit queasy about using the creationDate field in this manner, but decide to go for it anyway.

A few weeks later, someone from customer support comes to you. “We have this reservation that has expired. But we really want to give the customer the option to pay, even though he was too late. Could you un-expire it?”

It seems pretty important, so you change the creationDate value of this reservation. Yes, this will screw up the statistics, but who’ll notice that? Let’s hope this won’t happen too often.

Some time later, you receive a feature request. “Is it possible for all reservations made by premium users to have an expiry time of 48 hours instead of 24?”

Sure, no problem. You expand your expiry code. Instead of only looking in the reservation table, you JOIN the users table to check whether a user is a premium user. This does work, but it has a small bug in it: the user’s premium status should actually be checked when the reservation was created, not when it’s being expired.

So you ponder whether you should add a history log to the user’s premium status, but in the end you and the business decide to accept this small defect instead of spending time on fixing it.

The next feature request arrives. “Could we have a way to increase the expiry time for individual reservations via the reservation overview page?”

Hmm, tricky. You tell them that it’s possible but it will screw up the creation statistics. They accept it, and you program it.

And another one. “We are A/B testing our reservation site. Can we change the expiry time to 6 hours for reservations created on the B version of the website?”

This one has you puzzled. How can you do this? The reservation in the database is not aware of any A/B testing going on in the website, and neither is the expiry script.

So you consider adding metadata to the reservation to keep track of how it was created. Based on that, the expiry script can calculate what the exact expiry time needs to be. So for a reservation by a premium user on the B version of the website, should that be 6 hours or 48 or maybe 1/4th of 48…

Suddenly, everything becomes clear.

… and suddenly you realise that it’s actually much smarter to store the expiration time in the reservation and also to add an “expiry time” argument to the API call that creates the reservation.

After all, whatever system creates the reservation in the backend knows exactly in which circumstances the reservation is being created, and thus can decide what the expiration time needs to be. Thinking about this a bit more, you realise that this solves all previous feature requests as well.
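
A minimal sketch of that design, using an in-memory array as a stand-in for the reservation table. All names here (expiresAt, expiryHours, expireReservations) are made up for this example. The point is that the caller decides the expiry time, and the periodic job only compares expiresAt with the current time.

```javascript
var HOUR = 60 * 60 * 1000;

// Hypothetical in-memory stand-in for the reservation table.
var reservations = [];

// The caller knows the circumstances, so it passes the expiry time.
function createReservation(data, expiryHours, now) {
  var reservation = {
    id: reservations.length + 1,
    userId: data.userId,
    creationDate: now,
    expiresAt: new Date(now.getTime() + expiryHours * HOUR),
    status: 'active',
    paid: false
  };
  reservations.push(reservation);
  return reservation;
}

// The periodic job no longer needs to know about premium users or A/B tests.
function expireReservations(now) {
  reservations.forEach(function (reservation) {
    var isDue = reservation.expiresAt.getTime() <= now.getTime();
    if (reservation.status === 'active' && !reservation.paid && isDue) {
      reservation.status = 'expired';
    }
  });
}

var start = new Date('2016-02-15T12:00:00Z');
var regular = createReservation({ userId: 1 }, 24, start);
var premium = createReservation({ userId: 2 }, 48, start);
var variantB = createReservation({ userId: 3 }, 6, start);

// Run the job 25 hours after creation.
expireReservations(new Date('2016-02-16T13:00:00Z'));
console.log(regular.status, premium.status, variantB.status);
// expired active expired
```

Un-expiring or extending a single reservation then becomes a change to expiresAt (and status), and creationDate keeps its original meaning.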

At least, you have learned a few valuable lessons from this experience.

It’s really hard to predict the future.

Trust your queasy feelings. They are there for a reason.

Don’t put too much meaning in a single field. You will regret it later on.

Paybox is a French payment provider. Getting their integration to work can be quite a hassle.

The simplest way to get payments to work, is to generate a form and post it to the paybox servers. This form must contain an HMAC signature based on your private HMAC key and the form data. It is rather tricky to get this right.

Unfortunately, if you do anything wrong with your form, you get a generic – and totally unhelpful – error message.

This does not help you debug your code.

Fortunately, there is a permanent test user on the paybox test environment. This can help you get it up and running. The test user has this HMAC key: