Decoding Reference Types in Swift 4

Apple’s Swift 4 recently introduced some amazing new features to support archiving/unarchiving as part of Foundation. This post explores some techniques for unarchiving/decoding while ensuring that instances of reference types can be shared within the object graph based on an arbitrary identifier.

Problem Description

Stepping back a little, imagine you had a simple construct representing a Car and its owner/driver. Using value types, this might look like:
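A minimal sketch of such a model, assuming a Person with hypothetical identifier and name fields and a made-up JSON payload in which the owner and driver are the same person:

```swift
import Foundation

// Hypothetical value-type models; the field names are assumptions.
struct Person: Codable {
    let identifier: String
    let name: String
}

struct Car: Codable {
    let owner: Person
    let driver: Person
}

// Sample payload: the owner and the driver describe the same person.
let valueTypeJSON = Data("""
{
    "owner":  { "identifier": "p1", "name": "Alice" },
    "driver": { "identifier": "p1", "name": "Alice" }
}
""".utf8)

// Each Person is decoded into its own independent struct value.
let valueTypeCar = try JSONDecoder().decode(Car.self, from: valueTypeJSON)
```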

This works well for value types. When the JSON gets decoded, fresh new copies of the Person and Car structs are constructed. Lovely.

However, imagine that Person is actually a reference type instead of a value type. Decoding in this case creates two separate instances of Person, one assigned to owner and one to driver. The decoded object graph then no longer matches the original object graph, in which a single Person instance was shared between the owner and driver properties.
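The mismatch is easy to reproduce. The sketch below assumes the same hypothetical identifier and name fields, switches Person to a class, and compares the two decoded references with the identity operator:

```swift
import Foundation

// Hypothetical reference-type version of Person.
final class Person: Codable {
    let identifier: String
    let name: String
}

struct Car: Codable {
    let owner: Person
    let driver: Person
}

let duplicatedJSON = Data("""
{
    "owner":  { "identifier": "p1", "name": "Alice" },
    "driver": { "identifier": "p1", "name": "Alice" }
}
""".utf8)

let duplicatedCar = try JSONDecoder().decode(Car.self, from: duplicatedJSON)

// The identifiers match, but the synthesized decoding built two objects.
let sameIdentifier = duplicatedCar.owner.identifier == duplicatedCar.driver.identifier
let sameInstance = duplicatedCar.owner === duplicatedCar.driver
```

Here sameIdentifier is true while sameInstance is false, which is exactly the problem described above.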

This article explores some ways we can achieve the desired behaviour based on a modified Person type that now looks like:
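As a rough sketch only, and assuming that Person carries a hypothetical identifier field, the simplest local approach is a hand-written init(from:) on Car that compares identifiers and reuses the owner instance for the driver when they match:

```swift
import Foundation

// Hypothetical reference type; `identifier` is the arbitrary key used
// to recognise the same person.
final class Person: Decodable {
    let identifier: String
    let name: String
}

struct Car: Decodable {
    let owner: Person
    let driver: Person

    enum CodingKeys: String, CodingKey {
        case owner
        case driver
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let decodedOwner = try container.decode(Person.self, forKey: .owner)
        let decodedDriver = try container.decode(Person.self, forKey: .driver)
        owner = decodedOwner
        // Share the owner instance when both entries carry the same identifier.
        driver = decodedDriver.identifier == decodedOwner.identifier
            ? decodedOwner
            : decodedDriver
    }
}

let localJSON = Data("""
{
    "owner":  { "identifier": "p1", "name": "Alice" },
    "driver": { "identifier": "p1", "name": "Alice" }
}
""".utf8)

let localCar = try JSONDecoder().decode(Car.self, from: localJSON)
```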

For models as simple as the above, this technique is workable. However, one drawback is that it only scales to object graphs where all of the reference types sit within the same Codable element. For example, imagine a JSON structure that had many different Person objects scattered throughout the tree. The above code relies on the caching occurring at a local level only, so it cannot share instances across those elements.

Caching Objects

Expanding on the previous example, the next logical step would be to create a cache that contains the previously found Person objects. Then, as the data is getting decoded, we can pull out the identifier field of the Person sub-structure and use that to read from the cache.
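One way this could look is sketched below. PersonCache, the personCache userInfo key and the field names are all hypothetical names chosen for illustration; the cache is handed to the decoder through userInfo and read back inside Car:

```swift
import Foundation

final class Person: Decodable {
    let identifier: String
    let name: String

    // Declared explicitly so that Car can read the identifier key.
    enum CodingKeys: String, CodingKey {
        case identifier
        case name
    }
}

// Hypothetical cache of the Person objects decoded so far, keyed by identifier.
final class PersonCache {
    var people: [String: Person] = [:]
}

extension CodingUserInfoKey {
    // Hypothetical userInfo key under which the caller stores the cache.
    static let personCache = CodingUserInfoKey(rawValue: "personCache")!
}

struct Car: Decodable {
    let owner: Person
    let driver: Person

    enum CodingKeys: String, CodingKey {
        case owner
        case driver
    }

    init(from decoder: Decoder) throws {
        // Force cast: this crashes if the caller did not install the cache.
        let cache = decoder.userInfo[.personCache] as! PersonCache
        let container = try decoder.container(keyedBy: CodingKeys.self)

        func person(forKey key: CodingKeys) throws -> Person {
            // Peek at the identifier without decoding the whole Person.
            let nested = try container.nestedContainer(keyedBy: Person.CodingKeys.self, forKey: key)
            let identifier = try nested.decode(String.self, forKey: .identifier)
            if let cached = cache.people[identifier] {
                return cached
            }
            let decoded = try container.decode(Person.self, forKey: key)
            cache.people[identifier] = decoded
            return decoded
        }

        owner = try person(forKey: .owner)
        driver = try person(forKey: .driver)
    }
}

let fleetJSON = Data("""
[
    { "owner":  { "identifier": "p1", "name": "Alice" },
      "driver": { "identifier": "p2", "name": "Bob" } },
    { "owner":  { "identifier": "p2", "name": "Bob" },
      "driver": { "identifier": "p1", "name": "Alice" } }
]
""".utf8)

let fleetCache = PersonCache()
let fleetDecoder = JSONDecoder()
fleetDecoder.userInfo[.personCache] = fleetCache
let fleet = try fleetDecoder.decode([Car].self, from: fleetJSON)
```

Because the cache lives outside any single Car, the two cars in this sample end up pointing at the same two Person objects.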

This is a little better, as our Person cache is constrained to a single JSON decoding session, yet we still have the opportunity of sharing the same PersonCache across decoding sessions (if required) by passing it in when setting up the userInfo.

Having said that, though, I still don’t like that:

The force cast when extracting the cache from the userInfo is horrible.

It relies on the caller setting up the userInfo object in the JSONDecoder which is very brittle.

Note: In order to mitigate the second point, I initially attempted to lazily initialise the cache when fetching it from the userInfo. However, that doesn’t work because at the point where we consume userInfo, it is referenced from the base Decoder type, where it is declared as read-only. We are able to set it up initially, though, because JSONDecoder redeclares it as writable.

Refactoring our Cache

Broadly speaking, though, I think we’re on the right track. Let’s do a little bit of refactoring to start to tidy things up. The first thing we should do is encapsulate the cache into something a little nicer.
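A possible shape for that wrapper is sketched below. The DecodableCache name comes from this post, but the method names and the keying by type plus identifier are assumptions made for illustration:

```swift
// Hypothetical encapsulated cache. Objects are keyed by their type and an
// identifier, so different reference types can share one cache.
final class DecodableCache {
    private struct Key: Hashable {
        let type: ObjectIdentifier
        let identifier: String
    }

    private var storage: [Key: AnyObject] = [:]

    // Returns the cached object of the requested type, if one exists.
    func object<T: AnyObject>(ofType type: T.Type, identifier: String) -> T? {
        let key = Key(type: ObjectIdentifier(type), identifier: identifier)
        return storage[key] as? T
    }

    // Stores an object so that later lookups return the same instance.
    func store<T: AnyObject>(_ object: T, identifier: String) {
        let key = Key(type: ObjectIdentifier(T.self), identifier: identifier)
        storage[key] = object
    }
}
```

Hiding the dictionary behind typed accessors keeps the casting in one place instead of at every call site.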

Now, let’s extend the Decoder protocol to provide a type-safe function that returns an instance of our DecodableCache object. As mentioned earlier, I tried to use the userInfo object on Decoder; however, for some reason Apple has marked Decoder.userInfo as read-only, whereas JSONDecoder.userInfo is read-write. Because we want the cache to be available to all decoders (not just JSON), we need to look at alternatives.

One such alternative is associated objects. We can extend Decoder in the following way:
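A rough sketch of the idea, under some loud assumptions: it needs the Objective-C runtime (so Apple platforms only), it assumes the decoder handed to init(from:) is a class instance, and the decodableCache() name and the bare-bones DecodableCache are placeholders. Whether one decoder instance spans a whole decoding session is an implementation detail of the decoder, not something this sketch guarantees.

```swift
#if canImport(ObjectiveC)
import Foundation
import ObjectiveC

// Placeholder cache type for this sketch.
final class DecodableCache {
    var objects: [String: AnyObject] = [:]
}

// The address of this variable serves as the association key.
private enum AssociatedKeys {
    static var cache: UInt8 = 0
}

extension Decoder {
    // Returns the cache attached to this decoder, creating it on first use.
    // If the decoder is not a class instance, the association will not
    // persist between calls.
    func decodableCache() -> DecodableCache {
        let host = self as AnyObject
        if let existing = objc_getAssociatedObject(host, &AssociatedKeys.cache) as? DecodableCache {
            return existing
        }
        let cache = DecodableCache()
        objc_setAssociatedObject(host, &AssociatedKeys.cache, cache, .OBJC_ASSOCIATION_RETAIN_NONATOMIC)
        return cache
    }
}
#endif
```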

The above fetches a nested “person” container and extracts the identifier field to use as the key into the cache. If the object is found, then it will be returned. If it is not found, the object will be decoded using the standard decoding process, added to the cache, and then returned. This simplifies the decoding process for Car greatly, which now looks like:
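The lookup-or-decode steps just described can be sketched as a helper on the keyed container. To stay self-contained and portable, this version takes the cache from userInfo rather than from associated objects, and it is specific to Person rather than generic; PersonCache, the personCache key and the field names are hypothetical:

```swift
import Foundation

final class Person: Decodable {
    let identifier: String
    let name: String

    enum CodingKeys: String, CodingKey {
        case identifier
        case name
    }
}

final class PersonCache {
    var people: [String: Person] = [:]
}

extension CodingUserInfoKey {
    static let personCache = CodingUserInfoKey(rawValue: "personCache")!
}

extension KeyedDecodingContainer {
    // Returns the cached Person for the nested identifier, or decodes,
    // caches and returns a new one.
    func nestedPerson(forKey key: Key, cache: PersonCache) throws -> Person {
        let nested = try nestedContainer(keyedBy: Person.CodingKeys.self, forKey: key)
        let identifier = try nested.decode(String.self, forKey: .identifier)
        if let cached = cache.people[identifier] {
            return cached
        }
        let person = try decode(Person.self, forKey: key)
        cache.people[identifier] = person
        return person
    }
}

struct Car: Decodable {
    let owner: Person
    let driver: Person

    enum CodingKeys: String, CodingKey {
        case owner
        case driver
    }

    init(from decoder: Decoder) throws {
        // Throw instead of force casting when the cache is missing.
        guard let cache = decoder.userInfo[.personCache] as? PersonCache else {
            throw DecodingError.dataCorrupted(DecodingError.Context(
                codingPath: decoder.codingPath,
                debugDescription: "No PersonCache found in userInfo"))
        }
        let container = try decoder.container(keyedBy: CodingKeys.self)
        owner = try container.nestedPerson(forKey: .owner, cache: cache)
        driver = try container.nestedPerson(forKey: .driver, cache: cache)
    }
}

let helperJSON = Data("""
{
    "owner":  { "identifier": "p1", "name": "Alice" },
    "driver": { "identifier": "p1", "name": "Alice" }
}
""".utf8)

let helperDecoder = JSONDecoder()
helperDecoder.userInfo[.personCache] = PersonCache()
let helperCar = try helperDecoder.decode(Car.self, from: helperJSON)
```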

One thing I did notice when writing the nestedPerson function is that when referencing Car.CodingKeys in a generic function signature, the compiler forces me to explicitly declare the enum. Oddly, though, I can reference the (generated) enum with no problems inside a function body. I have raised a Swift 4 defect, but for now, to work around it you just have to explicitly declare the CodingKeys enum:

enum CodingKeys: String, CodingKey {
    case owner
    case driver
}

Wrapping up

As you can see from the above, it was a bit of a journey to arrive at the eventual solution. I think there is still room to improve it (I’d love to be able to use userInfo instead of associated objects), but I think the general technique is sound.

If you have any comments, thoughts or suggestions, feel free to comment here or hit me up on Twitter.

Full source code that can be run in an Xcode 9 playground is available in this gist.