Elm orders its output based on how strongly connected different components are. In other words, a function that has more dependencies is more likely to appear later, and a function that has fewer dependencies is more likely to appear earlier. In our case, that makes the most likely order leafDecoder -> branchDecoder -> decoder. If we lazily refer to branchDecoder from decoder, this order doesn’t change; branchDecoder will still eagerly refer to decoder, whose definition appears only later in the compiled code.

https://medium.com/media/6445861be604d4f845618466a9bf2c27/href

Let’s have a look at what that would compile to. Sidenote: this is not quite what the compiled code looks like. I’ve stripped out the module prefixes and the partial-application calls through the A* functions, and an Elm List really isn’t a JS array. However, it illustrates the core issue.

https://medium.com/media/fec4f89b491d69bf97b3df0f63c82647/href

Notice that there’s only a single function definition in this whole thing; everything else is eagerly evaluated and has all of its arguments available. After all, decoders are values! Note, also, that branchDecoder is defined before decoder, yet it references decoder. Since only function declarations are hoisted to the top in JavaScript, the above code can’t actually work: decoder will be undefined when branchDecoder is used.

A second attempt

Our second option is moving the lazy call so branchDecoder lazily refers to decoder instead:

https://medium.com/media/338c3e9ab3b910826e01b25f62f63d54/href

Another look at the pseudo-compiled code shows that, this time, we have reached our goal:

https://medium.com/media/3d7d32f4ca4c3b17457f938b6c2a5c22/href

Now for the correct solution

The order in which compiled results appear in the output isn’t something you, while writing Elm code, should worry about.
Figuring out the strongly connected components to determine where the lazy should go is not what you want to be thinking about. Worse yet, slight changes to your decoders might change the order in the compiled code!

The only real option when dealing with this type of recursion is to introduce lazy in both places:

https://medium.com/media/88a11f4ba8d1a760644b5e9177702c9d/href

This post was extracted from a document originally part of my Demystifying Decoders project. Learn more about that here: Demystifying Elm JSON Decoders

Help, my recursive Decoder caused a runtime exception! was originally published in Ilias Van Peer on Medium, where people are continuing the conversation by highlighting and responding to this story.
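To see why the eager reference fails, here is a plain JavaScript sketch of the hoisting behaviour described above. The names are hypothetical stand-ins, not actual compiler output: only `var` declarations are hoisted, not their assignments, so an eager reference captures undefined, while a lazy thunk defers the lookup until after assignment.

```javascript
// Eager reference: the current value of `decoder` is captured immediately.
// The `var decoder` declaration is hoisted, but its assignment is not,
// so `decoder` is still undefined here. This mirrors the broken output.
var branchDecoder = { child: decoder };
var decoder = "decoder";

// Lazy reference: a thunk defers the lookup until it is actually called,
// by which point `lazyDecoder` has been assigned. This is what lazy buys us.
var lazyBranchDecoder = { child: function () { return lazyDecoder; } };
var lazyDecoder = "decoder";

console.log(branchDecoder.child);       // undefined
console.log(lazyBranchDecoder.child()); // "decoder"
```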

The State of Elm 2017 results are here!
I first presented these as two talks, one to Elm Europe and one to the Oslo Elm Day.
I’ve embedded the Elm Europe talk below (it’s shorter), but the longer version is also on YouTube.

elm-reactor is an underrated tool. Not only does it do on-demand recompilation of Elm source code, but it can serve up other assets, too. But did you know you can serve your own HTML with live-compiled Elm code, too? This is useful if you need JS interop or want to start your program with flags.

The trick here is that elm-reactor exposes a “magical” /_compile directory: any Elm file prefixed with that path will be pulled in and compiled on page-load.

For example, start with a folder structure like this:

myProject/
|- elm-package.json
|- index.html
`- src/
   `- Main.elm

Placing your index.html at the same level as your elm-package.json means that running elm-reactor from your myProject folder will allow you to point your browser to http://localhost:8000/index.html

As for the contents of your index.html, start with something like this:

<html>
<head>
  <style>
    /* custom styles? Sure! */
  </style>
</head>
<body>
  <!-- Relative to index.html, Main.elm lives in `src/Main.elm`. -->
  <!-- Prefixing that with `/_compile/` gives us magic! -->
  <script src="/_compile/src/Main.elm"></script>
  <script>
    var app = Elm.Main.fullscreen()
    // You could also pass flags, or set up some ports, ...
  </script>
</body>
</html>

There, all set!

Note that elm-reactor has also learned how to serve quite a few other file types with the correct content-type headers, so pulling in some CSS, images or JSON should work, too.

Shout-out to @ohanhi for the tip!

Elm reactor and custom HTML was originally published in Ilias Van Peer on Medium, where people are continuing the conversation by highlighting and responding to this story.

JSON Decoders be what?

Coming from JavaScript, where JSON is the most natural thing ever, having to write decoders to work with JSON in Elm is a mystifying experience. On some level, you understand that you need some way to convert these foreign objects into statically typed structures for safe use in Elm. And yet, it’s… weird.

Some people learn best by reading about how decoders work and looking at examples. Other people learn best by doing: writing decoders, from “very simple” to “I didn’t know you could do that”, progressively cranking up the complexity level, and knowing that what you wrote is correct.

To that end, I’ve put together a series of exercises that aim to walk you through writing JSON decoders. Each exercise aims to be a little more difficult than the one before, and to introduce new concepts at a fairly reasonable pace.

zwilias/elm-demystify-decoders

The code is there and all contributions are welcome; if you get stuck on something, create an issue and someone will help you out sooner or later.

Now go and write some decoders!

Demystifying Elm JSON Decoders was originally published in Ilias Van Peer on Medium, where people are continuing the conversation by highlighting and responding to this story.

Or perhaps you did. Reminders can’t hurt, though.

1. Changing types in record update syntax

Record update syntax { a | b = c } is restricted in various ways: you cannot add or remove fields, but only change the values of fields. In that regard, records are very much like labeled tuples: the general shape can’t change. However, here’s something you can do that may come as a bit of a surprise: changing datatypes.

Naturally, if you have a type alias that says some field needs to be a certain type, and you change that type somewhere, the resulting record will no longer conform to that type alias. Here are a few examples illustrating that.

https://medium.com/media/0475bc2cca1916585141bdb86dd525a4/href

When is this useful? As an example, imagine we use some browser-detection library in JS like platform.js, which can potentially result in null for every value. Let’s say we’re also taking a seed for a random number generator (because we don’t trust using the current time). We can do that as follows:

https://medium.com/media/9461e6ecfd01b472d2028de6792e1ec5/href

2. Ports and flags don’t require automatic conversion

Both ports and flags do automatic conversion of raw JS values to Elm types, but that comes with limitations; custom union types, for example, can’t be converted automatically.

Json.Encode.Value to the rescue! A Value type encapsulates a raw JS value, so annotating your ports or init function to accept a Value basically means that Elm won’t do its automatic conversion for those values. You will, however, need to handle the decoding and conversion yourself.

The same goes for outgoing ports, where you can handle the encoding yourself before passing the value off to the port.

Automatic conversion is sort of a relic from times when there wasn’t a streamlined Json.Decode library and it was basically the only way to handle foreign values.

When is this useful?

Decoupling your Elm types from your JS types

Handling union types

Control over what happens when you receive bad values — rather than an error, you get to gracefully handle it in Elm.
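As a loose JavaScript analogue of that trade-off (the helper names here are hypothetical, not the Elm runtime): automatic conversion behaves like a converter that throws on bad input, while accepting a raw Value and decoding it yourself behaves like a function returning a Result you can handle gracefully.

```javascript
// Automatic-conversion analogue: a bad value blows up before your code runs.
function autoConvert(raw) {
  if (typeof raw.seed !== "number") { throw new Error("Expecting a number at flags.seed"); }
  return { seed: raw.seed };
}

// Value-plus-decoder analogue: a bad value becomes a Result you handle yourself.
function decodeFlags(raw) {
  return (raw && typeof raw.seed === "number")
    ? { ok: true, flags: { seed: raw.seed } }
    : { ok: false, error: "expected a number at .seed" };
}

console.log(decodeFlags({ seed: 42 })); // { ok: true, flags: { seed: 42 } }
console.log(decodeFlags({}).ok);        // false
```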

3. Unsubscribe from outgoing ports

Let’s say you have an outgoing port in your Elm app and subscribed to it from the JS side. How do you stop listening?

You unsubscribe, of course! Note that you need to pass the same callback to unsubscribe as you originally passed to subscribe, similar to working with event handlers. There’s even some magic so you can call unsubscribe from within your subscription handler, meaning you can pass this and unsubscribe without clutter!

As an illustration, have a look at the following Ellie. After 5 clicks, they will no longer be printed to the console.

https://medium.com/media/493dfd0689f41341b713b936f0ea4fa8/href

4. Pipelines are free!

Or, more correctly, the right and left function application operators |> and <| are special-cased in the compiler so they don’t cause any overhead. Thanks to Richard Feldman for this little gem.

Naively, a |> b would compile into a function calling the |> operator with a and b as arguments, after which the |> operator would call b a. Under that assumption, simply writing it as b a instead would be more performant. However, that’s not actually what happens: the compiler knows about <| and |> and gives them special treatment, basically inlining their operation.

This means that pipelines, which improve readability, don’t come at some sneaky cost; they’re free!

5. Phantom types are a thing

Phantom types are types in which a type variable is declared on the left-hand side of the definition without being used on the right-hand side. As an example, consider type MyType a = My String, where it’s clear that a isn’t used.
This somewhat weird construct gives us the ability to express certain constraints about values without directly affecting the values themselves. Thanks to Martin Janiczek for bringing this up in the Elm-lang Slack earlier!

For example, it becomes possible to create a FormData type that is Unsanitized by default, but allows sanitizing the input using a user-provided Sanitizer and restricts submitting to Sanitized data, as follows.

https://medium.com/media/312b0f320740353f5b0438995166a140/href

Now, on its own, this isn’t particularly useful; the same could be accomplished by splitting the data into Unsanitized and Sanitized types and throwing away the whole FormData type. However, it becomes useful when dealing with functions that don’t care about whether or not the data is sanitized, but just need a FormData a and can work on that type. In order to do that with two distinct types, you’d have to write different functions, give them different names, and double the maintenance effort.

Googling around for phantom types will probably offer you a whole host of other use cases, definitely worth checking out.

So there you have it: some neat things you might not have known about Elm. If you enjoyed this post, feel free to check out my other posts, and don’t hesitate to click that green little heart!

5 Things you didn’t know about Elm was originally published in Ilias Van Peer on Medium, where people are continuing the conversation by highlighting and responding to this story.
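To make the unsubscribe pattern from tip 3 concrete, here is a minimal JavaScript stand-in for an outgoing port. This is a hypothetical sketch — the real Elm runtime wires the subscribe/unsubscribe pair up for you — but it shows how unsubscribing from inside the handler works.

```javascript
// Minimal stand-in for an Elm outgoing port: a list of handlers,
// subscribe/unsubscribe to manage them, and send to notify them all.
function makePort() {
  var handlers = [];
  return {
    subscribe: function (f) { handlers.push(f); },
    unsubscribe: function (f) {
      var i = handlers.indexOf(f);
      if (i >= 0) { handlers.splice(i, 1); }
    },
    // Copy the list before iterating so handlers may unsubscribe mid-send.
    send: function (v) { handlers.slice().forEach(function (f) { f(v); }); }
  };
}

var clicks = makePort();
var count = 0;
function onClick(v) {
  count += 1;
  if (count >= 5) { clicks.unsubscribe(onClick); } // unsubscribing from inside the handler
}
clicks.subscribe(onClick);

for (var i = 0; i < 10; i++) { clicks.send(i); }
console.log(count); // 5: later sends no longer reach the handler
```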

How do you keep your JSON encoders, decoders, and model in sync?
You can skip fields in your encoder, right?
But should you?
And what about when you add new fields?
Decoders are a little easier, but you have to sync them up with your encoders or you’ll lose data.
And the worst part is that we can’t rely on the compiler to catch these classes of errors… argh!

This is a perfect situation for property tests (fuzz tests, in elm-test lingo).
The test system will keep us honest by giving us random values to test with.
You can assert that encoders and decoders mirror each other, and add a bit more safety to your app.
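In elm-test you would express this property with fuzzers and Expect.equal. As a language-agnostic sketch, here is the same round-trip property hand-rolled in JavaScript, with a hypothetical encoder/decoder pair and a crude generator standing in for a real fuzzer:

```javascript
// A hypothetical encoder/decoder pair for a { name, age } record.
function encodeUser(user) { return JSON.stringify({ name: user.name, age: user.age }); }
function decodeUser(json) { var o = JSON.parse(json); return { name: o.name, age: o.age }; }

// Crude pseudo-random generator standing in for a fuzzer.
function randomUser(seed) { return { name: "user-" + (seed % 97), age: seed % 120 }; }

// The property: decoding an encoded value gives back the original.
function roundtripHolds(samples) {
  for (var i = 0; i < samples; i++) {
    var user = randomUser((i * 2654435761) % 4093);
    var back = decodeUser(encodeUser(user));
    if (back.name !== user.name || back.age !== user.age) { return false; }
  }
  return true;
}

console.log(roundtripHolds(100)); // true
```

If the encoder skips a field the decoder expects, the random inputs will flush the mismatch out, which is exactly the honesty the fuzzer buys you.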

Pitfalls you may run into

When optimising for performance, there are a few things to watch out for.

Micro-benchmarks and JIT compilers

When you’re writing micro-benchmarks and getting funky results, you might not want to trust those results. Let’s start with an example:

https://medium.com/media/fb28c6407a91a7991670b7336cccc5bb/href

Wait, what?

If you’re on a recent version of Chrome, you’ll notice that Tuple.first outperforms a simple inline pattern match. That seems… odd. Well, it is! Notice the commented-out “optimisation buster”? Let’s toggle it and see what happens.

https://medium.com/media/40d08f7ab305a2a5092517d47e8fb9c8/href

Urhm.

Woah! Suddenly, our pattern match outperforms Tuple.first. Finding out what exactly is going on here would require inspecting the intermediate code the JIT uses. However, that’s pretty complicated, so the gist of it is this: the function using Tuple.first is optimised by the JIT, possibly to the point of just inlining the first result, most likely during the initial “figuring out how many times to call this function for each sample” step.

See, the properties that make Elm so nice to work with (i.e. pure functions, type safety, etc.) make it easy for JavaScript compilers to optimise. While that is certainly a very good thing, it does make it exceedingly hard to write micro-benchmarks with any degree of certainty.

One thing we’re doing in our micro-benchmarks is calling the same function over and over again. Given the simplicity of the function, the JIT is perfectly capable of noticing that it will always return the exact same result, which means it doesn’t actually have to do any computation at all. The more advanced JIT compilers become, the harder it will become to write (micro)benchmarks that represent reality.

The benchmarking library for Elm currently doesn’t keep any reference to the results of function calls made as part of the benchmark.
This makes things slightly less predictable, as the JIT may just as well decide to replace the entire function by a no-op: a side-effect-free function whose result is discarded is essentially a no-op.

For the time being, there’s only one thing to do: don’t trust wonky numbers. Perhaps, at some point in the future, we may be able to do benchmarks with varying or cycling inputs, and keep track of results. The major difficulties with such an approach will likely be figuring out a workable API and trying to keep the overhead of doing these things to the bare minimum, optimally not tracking time during the “input generation” and “output storing” phases of the process.

An interesting resource that talks more about micro-benchmarks and compilers can be found here: http://mrale.ph/blog/2012/12/15/microbenchmarks-fairy-tale.html

Testing the intended behaviour

In a very early version of my Dict optimisation adventure, my benchmark was very crude. Essentially, it boiled down to this:

-- Insert key 1000 with value `()` into a dictionary that has 1000 entries
benchmark3 "Size 1000" Dict.insert 1000 () (dictOfSize 1000)

There are a few things that make this into a pretty bad benchmark for testing insertion behaviour:

Not all inserts are equal — some may require rebalancing, some may leave the tree balanced. Doing one single insertion, over and over, will not give us any information about the amortised cost, but only how this particular insert behaves.

It’s a no-op! The third argument, dictOfSize 1000, creates a dictionary with keys 1 through 1000, all initialised with value (). Inserting key 1000 with value () again leaves the dictionary effectively unchanged.

A better solution, for the time being, is to replace that with something like the following.

benchmark1 "Size 1000" Dict.fromList listOf1000KeyValuePairs

Now, this does come with a downside: Dict.fromList has some overhead, namely folding over that large list. On the other hand, folding over a list is a fairly small cost, especially compared to the inserting itself. This becomes especially valuable when comparing different Dict implementations, as the underlying fromList implementation will most likely be exactly the same, while the insert implementation will most likely be entirely different. Since both branches of the compare have to deal with the same overhead, this makes the result a fair comparison, at the cost of benchmarking more than just the function we’re trying to benchmark.

Takeaways

Optimising a toHex function

The other day, someone on the Elm-lang Slack asked for an (efficient) toHex : Int -> String implementation. Of course, efficiency shouldn’t be the primary concern when doing something like that: fredcy/elm-parseint already provides some very nice functions that cover this, as well as a plethora of other int <-> string conversions with different radixes.

Nevertheless, @unsoundscapes jumped to the rescue and tried a couple of different implementations:

an initial, “naive” implementation

a tail call optimised variation

a second TCO variation using Bitwise operators

So, let’s have a look at these three and compare their performance with a micro-benchmark:

https://medium.com/media/343a8d9455797ebe7a468e62cd447448/href

Note that this is just a group of benchmarks, since there’s no sane way to do three-way compares right now.

On my machine, the naive implementation is the fastest (by a fairly minor margin), with the TCO implementation being the slowest one and the TCO+Bitwise version hovering in between. Note that the number of recursions required is proportional to the input size. On my machine, though, even with large inputs, the TCO version would only be about on par with the non-TCO version.

This does offer us some useful information, though: we could combine the naive approach with the Bitwise operators.

There’s another avenue of improvement: the digit function is doing a String.slice call, which is backed by native JavaScript String.slice but still comes at a cost. We could approach the issue in a slightly different way: if n < 10, we can simply toString it. If n >= 10, we could simply have a case..of expression for the remaining cases, in which each branch does an equality check against a constant number.

Finally, we’re computing n % 16 even when there is nothing left to do. This seems minor, but moving that piece of modulo math into the digit function does buy us some extra performance.

Let’s stack our performance-optimised version against the original (much more legible) version by unsoundscapes, for a couple of different input sizes:

https://medium.com/media/d8be944480872455f0f56cc87fb79312/href

Yay, numbers!

Takeaways:

Some optimisations work better for smaller inputs, some optimisations work better for larger inputs. Try to benchmark different cases and keep the real-world use cases in mind.

Be lazy: don’t calculate what you don’t need.

Benchmark, benchmark, benchmark. You can’t know with absolute certainty if something will be faster until you have benchmarks to back it up.

If a function has a limited domain of possible inputs, it might be a candidate for optimisation.
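The combined approach discussed above (naive recursion plus bitwise operators, with the digit work deferred until it is needed) can be sketched in JavaScript. This is an illustration of the idea only, not the Elm code from the post or its compiled output:

```javascript
// Each recursion strips 4 bits off the input; the per-digit work happens
// only inside `digit`, mirroring the "move the modulo math into the digit
// function" optimisation from the post.
function toHex(n) {
  function digit(d) { return "0123456789abcdef".charAt(d); }
  return n < 16 ? digit(n) : toHex(n >>> 4) + digit(n & 0xf);
}

console.log(toHex(255));  // "ff"
console.log(toHex(4096)); // "1000"
```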

Optimising the rebalancing operation in a self-balancing binary search tree

One of the core requirements for a self-balancing binary search tree is that it, well, self-balances. Balancing strategies are what set different binary search tree variants apart. In this example, let’s have a look at the self-balancing operations as I had originally implemented them for a Dict based on AVL trees. If you’re curious about how rebalancing for AVL trees works, do have a look at Brian Hicks’ series on Functional Sets, where I first learned about them, too.

First, a recap of what the code for the core balance function looked like in its first iteration:

https://medium.com/media/a242caad42ef839cb1030a27e8536821/href

There are a few points we can improve upon straight away, based on what we saw in part 2: there are a few places where we’re piping the new tree with the balanced inner node into our rotation functions.

However, there is something much more important here: we’re calculating heightDiff dict in 4 different places. In the worst-case scenario, i.e. an already balanced tree, that means we’re evaluating that function 4 times too many.

So, let’s use a let..in block to memoise that value. Our benchmark, in this case, will be testing the creation of trees of a certain size (10 and 100) with sequential keys, since this is a pathological case for insertion in self-balancing binary search trees.

https://medium.com/media/bce0dc30dbc516c625d9515a0eabd509/href

A modest improvement

That already helps some. We can push this a little further, though. One thing we actually know for sure is that our tree will never become unbalanced by more than an absolute heightDiff of 2. So there is really no need to compare using > and <: we already know we should only ever enter those branches if the balance factor is off by exactly 2.
So, let’s replace those inequalities by ==.

This reveals something else: our first branch is a special case of our second branch, and our fourth branch is a special case of our third branch. In other words, we could group our cases. Doing this yields the following.

https://medium.com/media/e38155f98726f20230d8c32b05dcb2da/href

Another modest improvement!

Now we’re getting somewhere. There’s one last thing we can do, though. We’re comparing diff to 2 constant values and introducing a default branch. There’s no reason we couldn’t do a straight case..of there, which would additionally buy us the ability to inline the heightDiff dict call, scrapping a let..in block. Let’s see if that gains us any performance:

https://medium.com/media/881de7245eb719513593391ce40d1db5/href

Alright!

Alright! Not bad. Let’s have one last look at the before and after, and then sum up the takeaways.

https://medium.com/media/22aa30537633bb342ffe4fba700decee/href

Final score!

Takeaways:

Don’t recalculate things if you don’t need to.

Prefer equality checks with constants over inequalities. This isn’t apparent here, but this is especially important when dealing with other types than integers.
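The end state described above — the balance factor computed exactly once and matched against constants in a single case expression — looks roughly like this in JavaScript. The tree shape and rotation helpers are simplified stand-ins, not the real AVL Dict implementation:

```javascript
// Smart constructor: recomputes the node's height from its children.
function node(value, left, right) {
  var lh = left ? left.height : 0, rh = right ? right.height : 0;
  return { value: value, left: left, right: right, height: 1 + Math.max(lh, rh) };
}

// The balance factor: right height minus left height.
function heightDiff(t) {
  return (t.right ? t.right.height : 0) - (t.left ? t.left.height : 0);
}

function rotateLeft(t)  { return node(t.right.value, node(t.value, t.left, t.right.left), t.right.right); }
function rotateRight(t) { return node(t.left.value, t.left.left, node(t.value, t.left.right, t.right)); }

function balance(t) {
  // Computed once, compared against the only two constants that matter.
  switch (heightDiff(t)) {
    case 2:  // right-heavy; double rotation when the right child leans left
      return heightDiff(t.right) < 0
        ? rotateLeft(node(t.value, t.left, rotateRight(t.right)))
        : rotateLeft(t);
    case -2: // left-heavy; mirror image of the case above
      return heightDiff(t.left) > 0
        ? rotateRight(node(t.value, rotateLeft(t.left), t.right))
        : rotateRight(t);
    default:
      return t; // already balanced: nothing to do, and nothing recomputed
  }
}
```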

Optimising the foldl operation in a recursive data structure

Let’s start with some code I plucked from an earlier implementation of an alternative Dict.

https://medium.com/media/4c29a6295d5b63a4ae8f60262c55221c/href

So, that seems like a fairly straightforward foldl implementation: recursively execute the operation on the left side, then execute the operation on your own data using the accumulated value from before, then feed that to folding over the right-hand side.

There’s something here that’s going to make this fairly expensive, though. That last lambda in the pipeline comes at a cost. How much? Well, let’s come up with an alternative and see what difference it makes.

That lambda could be eliminated if the last line could take the accumulator as its last value. So let’s do some argument shuffling and make a function to flip the second and third arguments of whatever function you pass in.

flip2nd : (a -> b -> c -> d) -> a -> c -> b -> d
flip2nd f a b c = f a c b

Using this, we can turn that lambda into the following:

|> flip2nd foldl op right

Alright, now that we’ve got that, let’s do a quick micro-benchmark to verify this actually helps.

https://medium.com/media/2b68f5f5c4713c9f20f18da4f1a300de/href

Results on my machine.

Alright, that’s a fairly large improvement already. Having anonymous functions in a pipeline is probably not a great idea, as far as performance is concerned. It felt more readable with a lambda, though; there has to be a better option.

Wait, let’s think about this for a bit. What we’re doing with pipelines is saying “first compute the left-hand side, then feed that into the right-hand side”. We could just as well do that by using a let..in block, which would at least allow us to remove the flip2nd call.

I previously mentioned here that using a let..in expression rather than pipelining would save us the overhead of the right function application operator |>.
Turns out that both <| and |> are special-cased in the Elm compiler, and a |> b compiles down to b(a)! Big thanks to Richard for pointing that out.

Let’s refactor a bit, and compare the result to our previous, optimised result.

https://medium.com/media/a13aca4a413b6cb133175781807dd63e/href

In case you don’t feel like running them yourself.

Alright, that’s another slight performance improvement. Let’s take a step back and look at an alternative way to desugar the function application operator. a |> b is just sugar for b(a). No need to store the intermediaries; we can just inline the different parts of the pipeline straight into the final result.

https://medium.com/media/1b110e4a45c50d37cffb8c2b85e54d1d/href

That yields us another decent little performance improvement. I think that’s all we can do here for now. Let’s stack up the original “pipeline+lambda” version against this “nested/inline” version. Note, also, that readability suffered some, but not too much; I feel we’ve found a pretty decent balance between legibility and performance for library code.

https://medium.com/media/848a79f24e188f1501ec3ff0cd9f3137/href

The final results

Takeaways:

Avoid anonymous functions in pipelines

Avoid pipelines, preferring inline function application if and only if the result is still legible!
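The final nested/inline shape can be sketched in JavaScript. This is a simplified tree without keys (the real Dict folds over key/value pairs), purely to illustrate the call structure:

```javascript
// foldl over a binary tree: fold the left subtree, apply `op` to the node's
// own value, then fold the right subtree. Each intermediate result is
// inlined directly as an argument instead of being piped through closures.
function foldl(op, acc, t) {
  return t === null
    ? acc
    : foldl(op, op(t.value, foldl(op, acc, t.left)), t.right);
}

var tree = {
  value: 2,
  left:  { value: 1, left: null, right: null },
  right: { value: 3, left: null, right: null }
};

console.log(foldl(function (v, acc) { return acc + v; }, 0, tree)); // 6
console.log(foldl(function (v, acc) { return acc.concat([v]); }, [], tree)); // [1, 2, 3]
```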

Part 1: Introduction

First, let’s get the obvious disclaimers out of the way.

The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.

1. Don’t do it.
2. (For experts only.) Don’t do it yet — that is, not until you have a perfectly clear and unoptimized solution.

— Principles of Program Design (1975), Michael A. Jackson

There, that’s better.

Who is this for

I’ve written this to document how I went about optimising an alternative Dict type. This type of core functionality needs to be fast, without compromising maintainability. The code described here would typically be library code rather than application code.

If you find yourself doing such a thing and need some pointers on how to increase performance, this document may be useful to you.

Don’t write this type of code in regular application code.

Optimise for readability, even to the point of sacrificing performance.

Methodology

Optimising for performance generally demands a lot of work and attention to detail. Multiple steps are involved, which I’ll attempt to explain here.

1. Identify slow code

This means using Chrome DevTools (or equivalent) to identify where any slowness is coming from, as well as relying on logic and micro-benchmarks to identify potentially slow code. Beware: identifying slow (Elm) code using Chrome DevTools is not easy, and micro-benchmarks come with a series of pitfalls. The process of using Chrome DevTools to actually identify bottlenecks probably warrants its own series of blogposts.

Knowing what goes on behind the scenes is helpful in realising why certain constructs exhibit certain performance characteristics. For this, I recommend Noah’s “Elm compiled documentation” as well as taking a good look at core/Utils.js, which holds the code for comparisons, equality and some other interesting little helpers.

2. Make it faster

This is something I’ll try to explain more in-depth in the rest of this series.

3. Verify invariants still hold

This is really important. An invariant is something that must hold true about your code. It is a basic assertion you should be able to make before and after calling your function.

To explain this with an example, let’s imagine we’re writing a binary search tree (BST). A tree can be modelled by saying that each tree is either the empty node, or a node with a value, a left-hand side and a right-hand side.

type Tree v = Empty | Node v (Tree v) (Tree v)

Now, in order for this tree to be a valid binary search tree, the following invariant must hold:

For each node in the tree, all values on the left-hand side must be strictly smaller, and all values on the right-hand side strictly larger, than the value of the node itself.
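Translated into executable form, that invariant looks roughly like this. This is a JavaScript sketch with a simplified node shape (the original is, of course, Elm), passing the allowed bounds down the recursion:

```javascript
// A tree is valid iff every value in the left subtree is strictly smaller,
// and every value in the right subtree strictly larger, than the node's own
// value. `lo` and `hi` carry the bounds inherited from ancestors.
function isValidBst(t, lo, hi) {
  if (t === null) { return true; } // Empty trees are trivially valid
  if (lo !== undefined && t.value <= lo) { return false; }
  if (hi !== undefined && t.value >= hi) { return false; }
  return isValidBst(t.left, lo, t.value) && isValidBst(t.right, t.value, hi);
}

var good = { value: 2, left: { value: 1, left: null, right: null }, right: { value: 3, left: null, right: null } };
var bad  = { value: 2, left: { value: 5, left: null, right: null }, right: null };

console.log(isValidBst(good)); // true
console.log(isValidBst(bad));  // false: 5 sits in the left subtree of 2
```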

The first step, before attempting to optimise any code that touches the tree — or even before simply writing any code that touches the tree — should be to turn your invariants into executable code. For a binary search tree, that would look roughly like this:

https://medium.com/media/bfe9d8252ffcd1197144ef18d7b1805f/href

Now I can easily set up a number of test cases, preferably fuzz tests so I can be sure that it works for random cases rather than just the convoluted ones I can come up with myself, to test that my implementation of, say, an insertion algorithm does not break that invariant. Running those tests before and after each change ensures I don’t slip up and break any guarantees my code is making.

4. Benchmark, benchmark and benchmark again

Changes that improve performance for some input on one platform may regress performance for a different type of input, or on a different platform. Setting yourself up with a whole stack of different platforms, browsers and types to test with is an important part of ensuring that your performance improvement really is, in fact, an improvement at all.

The performance of a > b, for example, depends heavily on the type. Some changes will work better for smaller datasets, others on larger datasets. Some changes might result in JavaScript that initially performs better, but can’t be handled efficiently by an optimising JIT. Optimising for performance isn’t that clear-cut.

Brian Hicks’ excellent elm-benchmark provides you with the tools to run benchmarks, and setting up a series of comparisons with different types and sizes of input is fairly easy. If you’re working on an OSS project, you may wish to apply to BrowserStack or Sauce Labs for a free OSS license so you can run your performance tests against hundreds of different platform/browser combinations.

5. If there are no regressions, go back to step 1

By which I really mean: optimise one step at a time.
Quantifying the impact of a change can only be done with that change regarded in isolation. If you optimise everything at once and then notice a performance regression in a benchmark, or an invariant being broken, good luck figuring out what to revert.

So, how do I go about step 2 and actually make things faster? Showing patterns that are good candidates for improving performance is, of course, the main point of this series. Now that you’ve wrestled your way through this long introduction, it’s time to get to the good bits. In part 2.