I’ve been doing more work in Jupyter Notebook recently for Python projects and figured it’d be helpful to have a really simple, compressed cheat sheet that I could embed into a markdown block as a reference. The ones I found out there had too many explanations and I wanted it all to fit on a page. Here’s a copy to paste directly into a Jupyter Notebook markdown block.

TL;DR – In this post I share 3 patterns for dealing with Go’s lack of method overloading – a functional way, an object oriented way, and a “JavaScript” way – only the first two of which should ever be used.

One of the things I miss in Go from C# is method overloading. I realise that method overloading can be pretty badly abused, but it’s perfect for default values (where optional arguments are insufficient) and for when the method signature needs to be slightly different, for feature reasons, even though the core purpose of the function is ultimately the same.

In this post I’ll share 3 patterns to deal with this missing language feature: two that I really like – a functional method and an object oriented method – and one that I seriously hate, which I refer to as the “JavaScript” method.

The Functional Way

This method takes advantage of two really cool features in Go: its ability to return multiple values from a function, and the ability to immediately accept those multiple values as the argument list of another function. Here’s an example for optional arguments –
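(The original sample isn’t reproduced here, so below is a minimal sketch of the pattern; Brew is borrowed from later in this post, and the settings functions and their values are illustrative.)

```go
package main

import "fmt"

// Brew takes the full set of parameters explicitly.
func Brew(temp, strength int) string {
	return fmt.Sprintf("brewed at %dC with strength %d", temp, strength)
}

// DefaultSettings returns the standard parameter list…
func DefaultSettings() (int, int) {
	return 90, 5
}

// …and StrongSettings an alternative one. Each returns exactly
// the argument list that Brew expects.
func StrongSettings() (int, int) {
	return 96, 9
}

func main() {
	// The multiple return values are passed straight through as arguments.
	fmt.Println(Brew(DefaultSettings()))
	fmt.Println(Brew(StrongSettings()))
	fmt.Println(Brew(85, 3)) // or skip the defaults entirely
}
```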

As you can see in the calls in the main() func above, the code is really readable as we have nice names for each default list of settings. In addition, we can use this method as a way to separate responsibilities. Here’s an example –

In my opinion, this is the superior pattern to use in the absence of method overloading in Go. The first reason is that it immediately makes your code more readable. In the example above we can see immediately on each line –

t initially holds a template filled from the contents of myReader

t then holds a template filled from the contents of the given file name

t then holds a template filled from the contents of the template URL with the JSON from the service URL

It also forces you to separate your logic and elegantly structure your code. In our example, the logic for parsing and filling the template is kept to the FillTemplate function. We’ve separated out all the necessary IO tasks to get the data to work with into the FromReader, FromFile, and FromURLWithJSON functions.

A third reason, which is more personal/emotional, is that I feel that it makes use of a powerful Go language construct and thus feels more “Go”-ish. Unless others feel the same way and this becomes an idiomatic alternative for method overloading within the Go community, this isn’t really a legitimate reason.

The obvious con to this method is that if we have a lot of parameters, the method signature can become unwieldy. This is where being a little object oriented could be a superior method.

The Object Oriented Way

If you’ve come from a language like C# or Java, I imagine this would be the go-to alternative to method overloading. Here is an example for default parameters –
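(A minimal sketch of the idea; the struct and its field names are illustrative stand-ins for the original sample.)

```go
package main

import "fmt"

// BrewConfig bundles the parameters into a struct.
type BrewConfig struct {
	Temp     int
	Strength int
}

// DefaultBrewConfig plays the role of the "no arguments" overload.
func DefaultBrewConfig() BrewConfig {
	return BrewConfig{Temp: 90, Strength: 5}
}

func Brew(c BrewConfig) string {
	return fmt.Sprintf("brewed at %dC with strength %d", c.Temp, c.Strength)
}

func main() {
	fmt.Println(Brew(DefaultBrewConfig()))
	fmt.Println(Brew(BrewConfig{Temp: 96, Strength: 9}))
}
```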

Comparing the two, for the examples I’ve offered above I’d personally go for the functional method. In the first OO example above, in the main() func, the difference in usage is virtually nil, but now we have an additional struct that I feel is really unnecessary.

In the second example above, it’s definitely more OO, but we can’t always take this extra step in all scenarios. If the Brew() func was a method on a CoffeeMachine struct, we would have to use the method outlined in the first example. Ultimately the choice to use this OO method or the functional method is probably down to personal preference.
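(For reference, a sketch of that “more OO” second form, where the struct itself owns the behaviour; names are illustrative.)

```go
package main

import "fmt"

// Coffee carries both the options and the behaviour.
type Coffee struct {
	Temp     int
	Strength int
}

// NewDefaultCoffee stands in for the "default" overload.
func NewDefaultCoffee() Coffee {
	return Coffee{Temp: 90, Strength: 5}
}

// Brew is a method on the struct, so the options travel with the value.
func (c Coffee) Brew() string {
	return fmt.Sprintf("brewed at %dC with strength %d", c.Temp, c.Strength)
}

func main() {
	fmt.Println(NewDefaultCoffee().Brew())
	fmt.Println(Coffee{Temp: 96, Strength: 9}.Brew())
}
```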

Where the OO method is probably superior is when we have a lot more options. Take the following functional example and compare it with the OO example after it.
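(A side-by-side sketch of the two, with made-up parameters, showing how the call sites compare once the parameter count grows.)

```go
package main

import "fmt"

// Functional form: with this many parameters the signature is unwieldy
// and the call site gives no hint as to which value is which.
func BrewFunc(temp, strength, milkMl, sugarTsp, cups int) string {
	return fmt.Sprintf("%d cup(s) at %dC", cups, temp)
}

// OO form: the named fields make the same call self-documenting.
type BrewOptions struct {
	Temp, Strength, MilkMl, SugarTsp, Cups int
}

func BrewOO(o BrewOptions) string {
	return fmt.Sprintf("%d cup(s) at %dC", o.Cups, o.Temp)
}

func main() {
	// Quick – which of these is the sugar?
	fmt.Println(BrewFunc(90, 5, 30, 2, 1))

	// No ambiguity here.
	fmt.Println(BrewOO(BrewOptions{
		Temp:     90,
		Strength: 5,
		MilkMl:   30,
		SugarTsp: 2,
		Cups:     1,
	}))
}
```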

I think it goes without saying that the OO way is vastly more readable in the above examples, and for that reason I would definitely choose the OO method in such situations.

The “JavaScript” Way

So this method should NEVER be used, but surprisingly (and very annoyingly) I’ve seen it quite a few times in various Go projects on GitHub and in blogs. In fact, witnessing this method in use was a big motivation for writing this post in the first place. I list it here mainly so I can bag it out and point out how horrid it is. Here is an example, and I hope you feel as horrified by it as I was the first time I saw it in use –
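(The original horror isn’t reproduced here, but the shape of it – with illustrative names – is a single variadic ...interface{} function that inspects what it was given.)

```go
package main

import "fmt"

// Brew fakes overloading with a variadic ...interface{} parameter.
// DON'T do this: the compiler can no longer check anything, and every
// caller has to know the secret positional contract.
func Brew(args ...interface{}) string {
	temp, strength := 90, 5 // defaults
	if len(args) > 0 {
		temp = args[0].(int) // panics at runtime on a wrong type
	}
	if len(args) > 1 {
		strength = args[1].(int)
	}
	return fmt.Sprintf("brewed at %dC with strength %d", temp, strength)
}

func main() {
	fmt.Println(Brew())      // the "default" overload
	fmt.Println(Brew(96))    // the "one argument" overload
	fmt.Println(Brew(96, 9)) // the "two argument" overload
	// Brew("hot") compiles happily and then panics at runtime.
}
```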

The fact that I have seen methods like this is mind-boggling to me. The only sane reason I can deduce for choosing to do this is that you have only ever known JavaScript prior to using Go. If this is you then I, on behalf of all Go programmers, forgive you for your past sins. But please, for the love of all programmers that work with your code after you, use one of the two methods I’ve mentioned previously. Variadic functions have their place as a solution to some problems, and that place is not in the same universe as the “solves lack of method overloading in Go” universe.

To make it clear, the biggest issues I take with this “solution” are –

Loss of implicit type safety (why are you even using Go if you don’t see the benefits of static typing)

Misuse of variadic functions – in Go they are intended to be used for an arbitrary number of arguments, not an arbitrary number of signatures!

I really can’t stress enough: NEVER USE THIS METHOD. If after seeing the rubbish code above you still feel it’s a good solution to function/method overloading in Go and feel like arguing your point in the comments below, don’t bother, as I have no intention of responding to or approving your idiotic view.

Conclusion

Although I clearly favour the functional method above, I do think you need to consider your particular scenario. In fact you’ll probably end up using a hybrid of the functional and object oriented methods. The key takeaway though, beyond two great alternatives for the lack of overloading in Go, is that you should NEVER USE VARIADIC FUNCTIONS FOR OVERLOADING!

I thought I’d write this article after a friend mentioned that he hadn’t dealt with runes before in Go. After a quick search on string manipulation in Go, I noticed that a few tutorials and forum answers were operating on strings as []byte. It was at this point I realised that Go strings and their relationship to runes and bytes aren’t very intuitive, so I thought I’d make an effort to explain it as compactly as possible (for a proper exploration I recommend reading https://blog.golang.org/strings). This post assumes that at the very minimum you’ve used string literals (e.g. "this is a string") and string variables (e.g. firstName := "Chris").

Fundamentals of Strings

In my mind, there are two really important fundamentals that need to be understood to master strings in Go –

The components of a string.

The slice behaviour of a string.

The Components of a String

The first important rule of strings is that strings are made up of runes (not bytes) and, as such, can be converted to a []rune. A rune is literally just a character, like “A”, “b”, and “*”, but can also be “我” or “私”. It’s important to understand that a rune represents a single character and that different languages have different definitions of what constitutes a “character”.

In English a character is a letter from the alphabet but, in the Chinese and Japanese examples I’ve provided, those characters each represent an entire word (in those examples the words mean “me”). As you can imagine, languages like Chinese and Japanese have literally thousands of glyphs to represent the thousands of words in the language. Obviously thousands of glyphs are not going to fit in a single byte, so these characters are stored as multiple bytes. This leads to the second important rule: characters/runes in a string are of variable length!

What I’ve noticed in a number of tutorials and forum answers online is that people have been incorrectly casting strings to []byte slices and then performing formatting operations etc. on the slice of bytes. If you did that with something that contained characters from many different languages, you would actually break the meaning of the sentence! Below is an example where you will break a sentence in another language while it turns out fine in English (you can run it yourself at https://play.golang.org/p/gDejvQbEUL) –
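(The playground sample isn’t included inline here; this sketch, with strings of my own choosing, shows the same breakage.)

```go
package main

import "fmt"

func main() {
	english := "Hello"
	japanese := "私は"

	// Taking the first two bytes of ASCII text works by coincidence,
	// because every ASCII character is exactly one byte.
	fmt.Println(string([]byte(english)[:2])) // He

	// The same operation on Japanese text splits the 3-byte rune 私
	// in half and produces invalid UTF-8 garbage.
	fmt.Println(string([]byte(japanese)[:2]))

	// Slicing runes keeps whole characters intact.
	fmt.Println(string([]rune(japanese)[:1])) // 私
}
```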

At this point you may be thinking to yourself “meh, not a big deal, I’m only going to cater to English speakers anyway”. The problem is, the character standards upon which runes are defined also provide codes for emoji, so unless you’re happy to ignore the 92% of online consumers that use emoji daily, you may want to consider using []rune instead of []byte.

Slice Behaviour of a String

Now that you know (roughly) what a rune is and understand (hopefully) why it’s better to use runes rather than bytes, let’s talk about the relationship between strings and []rune slices.

Given that a string is broken down into individual runes and can be converted to a []rune slice, you may be fooled into thinking that a string IS a []rune slice. It’s important to know that this is NOT the case: strings are immutable while a []rune slice is mutable. The consequence of this is that you CAN’T build string manipulation operations like the following (play with this example at https://play.golang.org/p/P9sd21DbAv) –
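(The playground example isn’t shown inline; the essence of it is that in-place assignment to a string won’t compile, while a []rune slice can be modified freely.)

```go
package main

import "fmt"

func main() {
	s := "hello"

	// Strings are immutable, so this line would not compile:
	// s[0] = 'H' // error: cannot assign to s[0]

	// A []rune slice is mutable: convert, modify, convert back.
	r := []rune(s)
	r[0] = 'H'
	s = string(r)

	fmt.Println(s) // Hello
}
```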

Update: A Slight Confusion with len(string)

To make the confusion worse, when you use len() on a string it actually returns the number of bytes. With behaviour like this, it’s really no wonder that a lot of people think a []byte slice is the natural type for a string. To help, Go does have a number of stdlib functions in the unicode/utf8 package for finding the correct length, such as RuneCountInString, which avoids the allocation of len([]rune(string)).

Summary

Just to recap what I feel are the 4 important rules to properly understanding strings in Go –

strings are made up of runes (not bytes)

characters/runes in a string are of variable length

ranging over a string with a for loop returns runes

strings are immutable while a []rune slice is mutable
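Rule 3 hasn’t been demonstrated above, so here’s a quick example: ranging over a string yields each rune along with its starting byte index –

```go
package main

import "fmt"

func main() {
	// The index steps by each rune's byte length, not by 1.
	for i, r := range "私はGo" {
		fmt.Printf("%d: %c\n", i, r)
	}
	// Output:
	// 0: 私
	// 3: は
	// 6: G
	// 7: o
}
```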

I hope this quick tutorial has given you enough reason (and knowledge) to ditch []byte("my string") in favour of []rune("my string") in the future and to embrace the magic of runes!

I know I’m going to forget how to do this, and it’s something I’ll probably re-use to generate 32-byte keys in the future.

To generate the random key, use the following command from Bash –

date +%s | sha256sum | head -c 64 ; echo

Then take the resulting string (we’ll call it secret), which will be 64 hexadecimal characters long, and use it in the following Haskell list comprehension (I evaluate it with GHCi) to turn it into an array of properly formatted hexadecimal literals.
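(The original Haskell one-liner isn’t reproduced here; as a stand-in, the same transformation in Go, with an example secret baked in, looks like this.)

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// secret stands in for the 64-hex-character output of the
	// sha256sum pipeline above.
	secret := "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08"

	// Take each pair of hex characters and prefix it with 0x,
	// giving 32 byte-sized literals.
	lits := make([]string, 0, 32)
	for i := 0; i < len(secret); i += 2 {
		lits = append(lits, "0x"+secret[i:i+2])
	}

	fmt.Println("[" + strings.Join(lits, ", ") + "]")
}
```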

The work I’m currently doing on my migration of Oystr from Node.JS to Golang requires me to revisit a feature in which I take a graph data set and perform a number of transformations to achieve the desired result. When I built this the first time in Node.JS, I attempted the transforms by mutating an array of values. After a pretty frustrating day it became obvious that I needed to move to an immutable mindset.

Although the data sets aren’t huge, a 12×12 matrix and a 200×8 matrix, they are large enough (for my mind) that it was difficult to follow the ~10 mutating transformations being applied to each data set. Once I had refactored my transformations into an immutable pipeline, the simplicity of debugging and maintenance was immediately clear and well worth any performance hit that may have been induced by the increased memory usage… or was it?

That last bit sort of bugged me, as I couldn’t say with any real certainty whether the performance hit was big or small. With Node.JS, unlike Go, testing the performance isn’t as simple as writing a BenchmarkXXX function and, at the time, I didn’t have a working mutable version of my algorithm anyway, so I decided to drop it. Fast-forward to now: I’m re-doing it in Go, so I figured I’d do some quick tests to see how much less (or maybe more) efficient it is to do an immutable transform.

This post presents what I found and some suggestions for you to consider when you find yourself in the same predicament in the future.

Use Case

In my specific scenario I’m building 10 different transforms over matrix-like data sets. I didn’t want to build the full thing twice, so I chose a simpler task: a basic matrix with implementations of the addition, subtraction, scalar multiplication, and matrix multiplication operators. I hoped these 4 operations would provide enough variance to observe different performance scenarios.

In addition to operator variation, I also built two different forms of the matrices. With one set of the matrices the underlying data structure is a slice. With the other set of matrices the underlying data structure is an array. My reasoning for this is that I wanted to see what impact a reduced memory allocation scenario would have on the different matrix types.
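To make the mutable/immutable distinction concrete, here’s a cut-down sketch of what the two styles look like for the slice-backed addition case (field names are illustrative; the real code covers more operations plus the array-backed variants) –

```go
package main

import "fmt"

// Matrix is the slice-backed form.
type Matrix struct {
	n    int
	data []float64
}

func NewMatrix(n int) Matrix {
	return Matrix{n: n, data: make([]float64, n*n)}
}

// AddMut is the mutable style: it writes into the receiver's
// backing slice, so no allocation happens per operation.
func (m Matrix) AddMut(o Matrix) {
	for i := range m.data {
		m.data[i] += o.data[i]
	}
}

// Add is the immutable style: the inputs are untouched and a freshly
// allocated matrix is returned.
func (m Matrix) Add(o Matrix) Matrix {
	out := NewMatrix(m.n)
	for i := range m.data {
		out.data[i] = m.data[i] + o.data[i]
	}
	return out
}

func main() {
	a, b := NewMatrix(2), NewMatrix(2)
	a.data[0], b.data[0] = 1, 2

	c := a.Add(b) // a and b are unchanged
	fmt.Println(c.data[0], a.data[0])

	a.AddMut(b) // a is modified in place
	fmt.Println(a.data[0])
}
```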

Results

N.B. The following benchmarks were only performed on the operations themselves (this is what I’m interested in) and ignore the initial allocation, which is why there can be zero allocations for some of the tests.

The setup I used is a Dell XPS 15 with an i7-4702HQ, 16GB RAM, and a 1TB mSATA SSD.

Slices

I first performed the tests with slices (as this is what I’ll be using). I ran each benchmark 10 times to try and smooth out any potential external factors (background processes etc.). Additionally, I ran the benchmarks with various matrix sizes to see how size would impact performance. The graphs below summarise these tests. The y-axis represents the percentage increase in time of using an immutable structure over a mutable structure, and the x-axis represents the length of a side of a square matrix.
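(For reference, the shape of the comparison can be reproduced with the stdlib’s testing.Benchmark helper; this is a simplified stand-in for my actual benchmark suite, using flat slices in place of the matrix type.)

```go
package main

import (
	"fmt"
	"testing"
)

func main() {
	n := 90
	a := make([]float64, n*n)
	b := make([]float64, n*n)

	// Mutable: write the result into the existing slice.
	mut := testing.Benchmark(func(bm *testing.B) {
		for i := 0; i < bm.N; i++ {
			for j := range a {
				a[j] += b[j]
			}
		}
	})

	// Immutable: allocate a fresh result slice every iteration.
	imm := testing.Benchmark(func(bm *testing.B) {
		for i := 0; i < bm.N; i++ {
			out := make([]float64, n*n)
			for j := range a {
				out[j] = a[j] + b[j]
			}
			_ = out
		}
	})

	fmt.Println("mutable:  ", mut)
	fmt.Println("immutable:", imm)
}
```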

Looking at the graphs above, I noticed that as the matrix gets larger, the performance hit for the immutable structures appears to reach a floor for addition, subtraction, and scalar multiplication. Taking into consideration these results and how each operation is constructed, I imagine this is because the time required for memory allocation doesn’t change much even as the matrix grows.

The obvious exception is the matrix multiplication operation. With this operation the difference is minimal, a ~5% – ~10% decrease in processing performance. The reason here is pretty obvious: matrix multiplication requires a third matrix to be created even in the mutable implementation, as the values within the original matrices are required for the full duration of processing and the result could even be a different-sized matrix. The interesting thing to note is that this doesn’t hold true when the size of the matrix starts to get really large. I’m not sure if this is related to the amount of memory in my machine, memory paging, addressing etc., so it probably needs to be explored further if you’re dealing with really large data sets.

Arrays

After doing the slices I thought to myself “I wonder if I can improve the performance by using an array instead”. You know… the whole “zero allocs bro” line of thinking. Of course, using arrays is impractical for most scenarios, but I figured there may be some cases where I could use a statically sized array, so I built array-based data structures and ran the benchmarks again. I used the same sizes as the slice-based version and got the following (surprising) results.

This is where the surprises really came. The mutable structures performed slightly better than their slice counterparts. I expected that, but what I didn’t expect was that the immutable versions performed worse!!! And a lot worse in some cases, like matrix multiplication, which is on average ~650x worse!!! Even though I’ve been able to reduce the number of memory allocations by a factor of 52 using an array, and reduce memory usage ever so slightly, my processing performance is considerably worse.

The bad performance is particularly pronounced for large data sets. I imagine (pure speculation here) that this is related to how much contiguous memory can be allocated for the array. What makes me suspect this is that you’ll notice the graphs above only provide benchmarks for 10×10, 30×30, and 90×90. This is because the 270×270 and 810×810 arrays took in excess of 10 minutes to complete, and by default go test waits a maximum of 10 minutes for a benchmark to complete.

Conclusion

For me personally, I’ve got quite a few important lessons from this exercise. Depending on how you want to interpret the results above, you may agree or disagree with them.

The first lesson is not to make any assumptions about what the compiler is doing in terms of optimisation and performance. There are clearly some shortcuts the compiler is able to take in the slice-based immutable implementations that it can’t take in the array-based immutable implementations, leading to the horrible performance.

Second, and this is probably based more on a personal gripe I have with mindless and overly simple metrics: fewer allocations does not mean better. A common “selling point” I see in a lot of Go libraries is “zero allocations”. It’s clear from these results that this doesn’t automatically mean better in every scenario.

The third lesson is compound, as it directly answers the original intent of this post. In some scenarios an immutable algorithm is the only solution anyway, so programming immutable data structures has little effect on performance (e.g. the matrix multiplication operation). In the cases where you observe a statistically significant hit on performance for operations that can be done mutably, you need to really consider whether this statistical significance is practically significant.

A ~40% increase in processing time may seem significant, but in the context in which it occurs for my use case it’s actually very insignificant compared with what I gain in ease of debugging and maintenance. When we make decisions for the “greater good”, the “greater” part should include non-performance-related gains.

In terms of moving forward I’ll definitely be using immutable transforms in Oystr. In a future post I’ll explore how the immutable data structures here could be easily extended with lazy evaluation which should improve the performance for certain scenarios and shouldn’t (fingers crossed) be conceptually difficult to implement.

This is a quick post about a quirk in Go I only just noticed. The operation described in the title of this article is actually meant to be impossible. If you attempt to compile the following code, the compiler will cry at you –
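(The snippet isn’t included here; below is my best reconstruction of the failing shape, using identifiers from this post – the interface’s method and the pointer receiver are my guesses.)

```go
package main

import "fmt"

type incrementable interface {
	Increment()
}

type MyTestStruct struct {
	count int
}

// With a pointer receiver, only *MyTestStruct satisfies incrementable.
func (m *MyTestStruct) Increment() { m.count++ }

func main() {
	// This is the kind of line the compiler cries about:
	//
	//   var i incrementable = MyTestStruct{}
	//
	// error: MyTestStruct does not implement incrementable
	// (Increment method has pointer receiver)

	m := &MyTestStruct{}
	var i incrementable = m // the pointer form compiles fine
	i.Increment()
	fmt.Println(m.count) // 1
}
```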

This is of course exactly how interfaces are meant to work, but the fact that it does work still feels a bit odd. I investigated further by reflecting the type of both “incrementable” and “aStruct”, and they both come back as “MyTestStruct”.

Conclusion

I’ve only just noticed this quirk, as assigning a non-pointer type to an interface variable isn’t a common operation for me. As such, I haven’t had enough time to think about or test the implications of this on broader architecture, but my “developer spidey-sense” tells me there could be either opportunities or, more likely, potential caveats to this behaviour.