Functional Java by Example | Part 4 – Prefer Immutability

In the previous part we talked a bit about side effects. In this part I’d like to elaborate on how we can prevent our data from being manipulated in unexpected ways by introducing immutability into our code.

If this is your first visit, it’s best to start reading from the beginning. It helps to understand where we started and how we moved forward throughout the series.

Pure functions

A short summary of what we discussed before.

Functional Programming encourages side-effect free methods (or: functions), to make the code more understandable and easier to reason about. If a method returns the same output for the same input every time and changes nothing outside itself – which makes it a pure function – all kinds of optimizations become possible under the hood, e.g. by the compiler, or through caching, parallelisation etc.
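
To make the difference concrete, here is a tiny sketch. The Counter class and the add method are made up for illustration and are not part of our FeedHandler:

```groovy
// Hypothetical example, not part of the FeedHandler code.

// Impure: the result depends on, and changes, state outside the arguments.
class Counter {
    int total = 0

    int addImpure(int amount) {
        total += amount   // side effect: mutates the field
        total
    }
}

// Pure: the result depends only on the arguments; nothing else is touched.
int add(int total, int amount) {
    total + amount
}

def counter = new Counter()
def first = counter.addImpure(5)    // 5
def second = counter.addImpure(5)   // 10: same input, different output

def third = add(0, 5)    // 5
def fourth = add(0, 5)   // 5: same input, same output, every time
```

The second call to addImpure gives a different answer than the first, even though the argument is identical. That is exactly what makes such methods harder to cache, parallelise or reason about.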

Here’s what we currently have after the refactoring from the previous part:

    class FeedHandler {

        Webservice webservice
        DocumentDb documentDb

        void handle(List<Doc> changes) {

            changes
                .findAll { doc -> isImportant(doc) }
                .each { doc ->
                    createResource(doc)
                        .thenAccept { resource ->
                            updateToProcessed(doc, resource)
                        }
                        .exceptionally { e ->
                            updateToFailed(doc, e)
                        }
                }
        }

        private CompletableFuture<Resource> createResource(doc) {
            webservice.create(doc)
        }

        private boolean isImportant(doc) {
            doc.type == 'important'
        }

        private void updateToProcessed(doc, resource) {
            doc.apiId = resource.id
            doc.status = 'processed'
            documentDb.update(doc)
        }

        private void updateToFailed(doc, e) {
            doc.status = 'failed'
            doc.error = e.message
            documentDb.update(doc)
        }

    }

Our updateToProcessed and updateToFailed methods are “impure”: they both update the existing document that is passed in. As you can see by their return type, void, nothing comes out. A sink-hole.

    private void updateToProcessed(doc, resource) {
        doc.apiId = resource.id
        doc.status = 'processed'
        documentDb.update(doc)
    }

    private void updateToFailed(doc, e) {
        doc.status = 'failed'
        doc.error = e.message
        documentDb.update(doc)
    }

These kinds of methods are all around your typical code base. Consequently, as one’s code base grows it tends to get harder to reason about the state of the data after you’ve passed it to one of these methods.

Consider the following scenario:

    def newDocs = [
        new Doc(title: 'Groovy', status: 'new'),
        new Doc(title: 'Ruby', status: 'new')
    ]

    feedHandler.handle(newDocs)

    println "My new docs: " + newDocs
    // My new docs:
    // [Doc(title: Groovy, status: processed),
    //  Doc(title: Ruby, status: processed)]
    // WHAT? My new documents aren't that 'new' anymore

Some culprit has been mangling the status of my documents; first they’re “new” and a second later they aren’t; that’s NOT ok! It must be that darn FeedHandler. Who authored that thing? Why is it touching my data?! 🙂

Consider another scenario, where there’s more than one player handling your business.

    def favoriteDocs = [
        new Doc(title: 'Haskell'),
        new Doc(title: 'OCaml'),
        new Doc(title: 'Scala')
    ]

    archiver.backup(favoriteDocs)

    feedHandler.handle(favoriteDocs)

    mangleService.update(favoriteDocs)

    userDao.merge(favoriteDocs, true)

    println "My favorites: " + favoriteDocs
    // My favorites: []
    // WHAT? Empty collection? Where are my favorites????

We start with a collection of items, and 4 methods later we find that our data is gone.

In a world where everyone can mutate anything, it’s hard to reason about any state at any given time.

It’s not even “global state” per se – a collection passed into a method can be cleared and variables can be changed by anyone who gets a hold of (a reference to) your data.

Prefer Immutability

There are plenty of resources out there about how to go about this in your particular language. Java, for instance, does not favor immutability by default; I have to do some work.

If there’s a 3rd party which is causing problems by changing data along the way (such as clearing my collection), I can quickly flush out the troublemaker by passing my collection in an unmodifiable wrapper, e.g.

    def data = [
        ...
    ]

    // somewhere inside 3rd-party code
    data.clear()

    // back in my code:
    // data is empty *snif*

Preventing trouble:

    def data = Collections
        .unmodifiableCollection([])

    // somewhere inside 3rd-party code
    data.clear() // HAHAA, throws UnsupportedOperationException
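
One thing to keep in mind: the wrappers from java.util.Collections are read-only views, not copies. Whoever still holds the original list can keep changing it, and the view will show those changes. A small sketch with made-up variable names, using unmodifiableList so we can compare sizes:

```groovy
def original = ['Haskell', 'OCaml', 'Scala']

// A read-only *view*: it rejects writes, but still reflects the backing list.
def view = Collections.unmodifiableList(original)

// A read-only view over a private copy: later changes to 'original' don't show.
def snapshot = Collections.unmodifiableList(new ArrayList(original))

def rejected = false
try {
    view.clear()
} catch (UnsupportedOperationException e) {
    rejected = true   // the wrapper refused the write
}

original << 'Clojure'   // whoever still holds the original can change it

println view.size()       // 4, the view sees the change
println snapshot.size()   // 3, the copy does not
```

So if the goal is to protect my data rather than just catch the culprit, wrapping a copy is the safer of the two.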

Inside our own code base we can prevent unintended side effects (such as my data being changed somewhere) by minimizing mutable data structures.

Read-only first

Using the principles we’ve learned so far, and driven by the wish to prevent unintended side effects, we want to make sure a Doc cannot be changed by anything after it has been instantiated – not even by our updateToProcessed/updateToFailed methods.

This is our current class:

    class Doc {
        String title, type, apiId, status, error
    }

Instead of doing all the manual labor of making a Java class immutable, Groovy comes to the rescue with the @Immutable annotation.

When put on the class, the Groovy compiler puts some enhancements in place, so NO ONE can update its state anymore after creation.

    @Immutable
    class Doc {
        String title, type, apiId, status, error
    }

The object becomes effectively “read-only” — and any attempt to update a property will result in the aptly-named ReadOnlyPropertyException 🙂

    private void updateToProcessed(doc, resource) {
        doc.apiId = resource.id // BOOM!
        // throws groovy.lang.ReadOnlyPropertyException:
        //  Cannot set readonly property: apiId
        ...
    }

    private void updateToFailed(doc, e) {
        doc.status = 'failed' // BOOM!
        // throws groovy.lang.ReadOnlyPropertyException:
        //  Cannot set readonly property: status
        ...
    }
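
If you want to see this for yourself outside the FeedHandler, a minimal standalone script along these lines should do. The variable names are only for illustration:

```groovy
import groovy.transform.Immutable

@Immutable
class Doc {
    String title, type, apiId, status, error
}

def doc = new Doc(title: 'Groovy', status: 'new')

def caught = null
try {
    doc.status = 'processed'   // attempt to change a read-only property
} catch (ReadOnlyPropertyException e) {
    caught = e                 // the write was refused
}

println doc.status   // new
```

The document keeps the status it was created with; the failed write leaves it untouched.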

But wait, doesn’t this mean that the updateToProcessed/updateToFailed methods will actually fail updating a document’s status to “processed” or “failed”?

Yep, that’s what immutability brings us. So how do we repair the logic?

Copy second

The Haskell guide on “Immutable data” gives us advice on how to proceed:

Purely functional programs typically operate on immutable data. Instead of altering existing values, altered copies are created and the original is preserved. Since the unchanged parts of the structure cannot be modified, they can often be shared between the old and new copies, which saves memory.

Answer: we clone it!

We do not have to update the original data; instead we make a copy of it. The original is not ours and should be left untouched. The @Immutable annotation supports this with a parameter called copyWith.

    @Immutable(copyWith = true)
    class Doc {
        String title, type, apiId, status, error
    }
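
Here is a quick standalone sketch of what copyWith gives us. The sample values (such as the api id '42') are made up:

```groovy
import groovy.transform.Immutable

@Immutable(copyWith = true)
class Doc {
    String title, type, apiId, status, error
}

def original = new Doc(title: 'Groovy', type: 'important', status: 'new')

// copyWith takes the properties to change and returns a new Doc.
def processed = original.copyWith(status: 'processed', apiId: '42')

println original.status    // new
println processed.status   // processed
println processed.title    // Groovy, carried over unchanged
```

Only the properties we name are different in the copy; everything else is taken over from the original, which itself stays as it was.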

Consequently, we’ll change our methods to make a copy of the original with the altered status (and api id and error message) — and return this copy.

(The value of the last expression in a Groovy method is returned implicitly, so no explicit return keyword is needed.)

    private Doc setToProcessed(doc, resource) {
        doc.copyWith(
            status: 'processed',
            apiId: resource.id
        )
    }

    private Doc setToFailed(doc, e) {
        doc.copyWith(
            status: 'failed',
            error: e.message
        )
    }

The database logic has also been moved up one level, taking the returned copy to store it.
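
Put together, the handler could look roughly like the sketch below. This is my guess at the wiring, not necessarily the exact code of the series: Resource, Webservice and DocumentDb are hypothetical in-memory stand-ins so that the script runs on its own.

```groovy
import groovy.transform.Immutable
import java.util.concurrent.CompletableFuture

@Immutable(copyWith = true)
class Doc {
    String title, type, apiId, status, error
}

// Hypothetical stand-ins for the real collaborators, so the sketch runs.
class Resource { String id }

class Webservice {
    CompletableFuture<Resource> create(Doc doc) {
        CompletableFuture.completedFuture(new Resource(id: 'res-' + doc.title))
    }
}

class DocumentDb {
    List<Doc> saved = []
    void update(Doc doc) { saved << doc }
}

class FeedHandler {
    Webservice webservice
    DocumentDb documentDb

    void handle(List<Doc> changes) {
        changes
            .findAll { doc -> isImportant(doc) }
            .each { doc ->
                webservice.create(doc)
                    .thenAccept { resource ->
                        // the database call now lives here and stores the copy
                        documentDb.update(setToProcessed(doc, resource))
                    }
                    .exceptionally { e ->
                        documentDb.update(setToFailed(doc, e))
                    }
            }
    }

    private boolean isImportant(doc) {
        doc.type == 'important'
    }

    private Doc setToProcessed(doc, resource) {
        doc.copyWith(status: 'processed', apiId: resource.id)
    }

    private Doc setToFailed(doc, e) {
        doc.copyWith(status: 'failed', error: e.message)
    }
}

def newDocs = [new Doc(title: 'Groovy', type: 'important', status: 'new')]
def db = new DocumentDb()
new FeedHandler(webservice: new Webservice(), documentDb: db).handle(newDocs)

println newDocs[0].status    // new: the caller's document is untouched
println db.saved[0].status   // processed: the stored copy carries the change
```

The caller's list still holds documents with status 'new', while the database receives the altered copies. Nobody's data gets changed behind their back anymore.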

We’ve gained control of our state!

This is it for now 🙂

If you, as a Java programmer, worry about the performance implications of excessive object instantiation, there’s a nice reassuring post here.