If you watch pro pool players, most of the time the game is super boring. If you don’t believe me, watch this video of someone potting 152 balls in a row. What’s interesting is that if you look at each shot individually, most of them are easy. I could likely make the 152 pots he did if I didn’t have to care about positioning myself for the next ball.

The real talent of pro pool players is being able to not only pot the ball but put the white ball in a good position for shooting the next ball. When they play well, they “make the game easy” by having the white ball always in a good position for the next shot.

What this means is that if you see a pro player doing some crazy shot, they “got out of position” on the previous ball. And in practice, at this level, they usually made a mistake a few shots earlier, haven’t been able to correct their position since, and the error gradually amplified.

They made a mistake in an earlier shot, so got out of position and have to come up with a creative way to hit the ball

This is a really bad property when watching the game, so most tournaments introduce a 30-second limit per shot. With less time to think each shot through, players are more likely to make mistakes and then have to come up with interesting shots.

Land in a place where there are multiple shots available next rather than a single one

Come into the line of the next shot so that you don’t need to be super precise with your speed

Remove all the balls next to each other so you don’t have to go up and down the table with longer shots

Now is probably the point where you’re asking yourself: that’s interesting, but what does it have to do with software engineering? Well, I think that there are a lot of parallels with building software.

When I see people doing very visible and consequential actions, I find myself thinking that they are doing a “hero shot”. It must mean that they have been “out of position” for the past few shots, and now the only option they have left is an unsatisfying one.

On the other hand, I see people appearing to somehow always be in easy projects where everything just works out fine and they deliver a lot of impact. I used to think that they were lucky, now I think that they are pro players and are able to plan multiple shots in advance and able to execute on their strategy.

On January 1st I started building a little tool that lets you create diagrams that look like they are hand-drawn. That whole project exploded and in two weeks it got 12k unique active users, 1.5k stars on GitHub and 26 contributors on GitHub (who produced real code, we don't have any docs). If you want to play with it, go to Excalidraw.com.

Many people have asked me how I got so many people to contribute to Excalidraw in such a short amount of time. While this is still fresh in my mind, let me post about what I was thinking about during the process.

S curve

Before we get started with the actual content, here's an interesting concept that was in my mind throughout the project. I discovered the concept of an S curve through Kent Beck's video series. There are three rough phases:

the first phase is when you do R&D and develop the product, there's a lot of work done but no real visible impact

the second phase is the exponential part where everything is growing tremendously

the third phase is when the growth flattens and you're doing smaller improvements (which can still be huge if the baseline is huge)

The S curve is usually used to describe bigger projects, but it turns out Excalidraw just went through an S curve, as seen in this chart that plots the number of stars over the past two weeks.

The most important part for me was to capitalize on the growth phase so that the project doesn't die when it hits the stabilization phase.

Proven Value Proposition

Excalidraw didn't come out of nowhere. I've been using a tool called Zwibbler for probably 10 years to build hand-drawn-like diagrams to illustrate my blog posts. I've always had this feeling that this tool was underrated. I seemingly was the only one to use it even though it felt like it could be used much more broadly.

Example of image drawn with Zwibbler

So when Excalidraw came out, there was a clear value proposition and I knew it was going to be somewhat successful. These days I don't have that much free time so I tend to spend my time on things that I believe have a high likelihood of being successful, especially side projects.

Make Some Noise

The first thing was to get people excited! I'm fortunate to have a sizable audience on Twitter so I used it by posting a bunch of videos of the progress of building the first version of the tool.

Convert Attention to Action

I got more attention than I anticipated so I felt like I could convert it into actual action. For this, the best way I've found is to create a bunch of issues about all the things that need to be done. I've been thinking about rebuilding a Zwibbler equivalent for a long time so I had a pretty good sense of what needed to be done.

People that wanted to contribute could just skim through the list of things to be done and start hacking. That worked really well!

Who is Contributing?

When I open sourced React Native, I was convinced that the same people that contributed to React would contribute to React Native. It turns out I was plain wrong: a new set of people started contributing. This same pattern applied to all the subsequent projects I've worked on since then.

This is a very broad generalization but most people that tend to contribute significantly to early projects like this are unknown (if they were well known, they'd likely have better opportunities to spend their time) but experienced (they are able to jump in on a random codebase and contribute).

Keeping People Engaged

The name of the game is to get as much from people that are interested in contributing as possible. Your initial buzz is only going to last so long (a few days), so you want to capitalize on that time. Everyone (myself included) is likely going to have to go back to their real job soon.

For this, I usually try to be very responsive on the pull requests coming in. If you can get turnaround in less than 10 minutes, then you can have real-time work and people will stay engaged as long as you are.

I've tried something new this time and gave commit access to everyone that got a PR merged in. In the past I would do it after I've seen sustained work. This worked really well: it gave people extra motivation to contribute, and they also started to review each other's code, which was awesome! I am not worried about people abusing their power; people that spend energy getting something of quality in tend to be considerate.

A trick I've also been using is to merge pull requests even if they're not exactly the way I want and then push all the follow-ups I had in mind myself. This way the person gets their feature shipped and is likely to come back, without an expensive back and forth (we never know when / if they're going to apply suggestions).

Be Decisive

People are going to try and steer the project in all sorts of directions with their ideas and pull requests. It's pretty tricky to think in advance what kind of suggestions you're going to get because people tend to get very creative (in both good and bad ways...).

If you want something to happen, you need to give a very clear "yes" with concrete things that need to be done. If you're not sure or change your mind multiple times or answer days/weeks later, people are either not going to invest their time making it happen, or will lose interest and not push it to conclusion.

On the flip side, you're likely going to see a lot of pull requests or suggestions that you don't think are a good idea. I've found that it's usually not a good idea to give a clear "no" as it's a hard message to give to a stranger over text. Instead, what I found tends to work better is to space out replies and ask for more information. The other party will naturally lose interest and move on. You should use this technique very sparingly as it is not a nice approach.

Keeper of Quality

With so many simultaneous contributions, the product can easily start losing quality. I view myself as the keeper of quality. I've been pretty obsessed about all the small details and things that feel off.

Every time I see a problem, I open an issue with a small repro case. In many cases, those issues are easy to fix and someone will get to it. I also make sure to clear the backlog so that we're always in a good enough shape.

I've also made sure that some core values were being maintained. I want minimal friction to get started drawing. In particular, this means that what you see first should be the shapes. I had to actively prevent people from adding title selection and login to keep this property.

Celebrate Success

Posting about all the good things that happen, be it a cool new feature, an interesting usage or thoughts on the topic, will increase the size of that channel as those posts attract an audience.

The other interesting thing that will happen is that you will provide an audience to a lot of the people that are contributing. As I mentioned earlier, they're unlikely to have a big one of their own that cares about this topic.

This is a win-win situation! It takes time to actually post all those things but I've seen it being valuable time and time again.

Empty Canvas

What I found fascinating with this project is that many people were able to project their dreams and ideas onto it. I've been told by at least three people that I should quit my job and build a startup around this project as they saw a lot of growth potential in different areas. (Sorry, I'm not, but if you want to, the business is up for grabs!)

I'm not exactly sure what to make of that but it led to great conversations! That's more than I hoped for with this project.

Things That Went My Way

I wish anyone could read this and reproduce it but that's not completely true. I had a lot of things that went my way. I found it to be useful to know what advantages people behind success stories have to see how they affect their abilities to deliver.

I have more than 10 years of experience building front-end and it turns out that I learned very little on the technical front during this project. I've done all the pieces many times one way or another. So when it was time to architect the project, split up the work, review code or suggestions, do the work, manage contributors, evangelize... All of this was pretty much mechanical and didn't require much thinking. This helped speed up everything so that a lot more than usual would fit within one buzz cycle.

I have a large audience on Twitter and I've worked closely in the past with other people with large audiences (hi Dan Abramov and Jordan Walke!) who were willing to evangelize the project. Without that, I wouldn't have been able to get the project in front of so many people so quickly.

Excalidraw was built with other projects such as CodeSandbox, Zeit, Rough. They've been fantastic to use and were part of the reason why the project got off the ground so quickly. I encountered some small issues with those dependencies, which likely would have ended up somewhere on an issue tracker and eventually got fixed. But because I personally knew the owners of the first two projects and was visible enough for the third, I was able to get those issues resolved extremely quickly, which is not everyone's experience.

Conclusion

This was a fun project to work on while procrastinating on writing performance reviews. I'm not exactly sure what the future holds for Excalidraw but I'm happy that it is now at a point where I can finally use it to illustrate the blog post I wanted to write that started this whole project (hello rabbit hole!).

I'm now [in July 2018] in a group full of compiler engineers at Facebook and learning a lot. Yesterday, I read a post by David Detlefs (summarizing a collaborative idea involving several members of his team) about how to efficiently encode strings for concatenation and since it's very clever I figured I would share it.

Problem

A lot of programs are taking a string as input and building a string as output. You can imagine the following code. Note: I'm going to use JavaScript as an example but it applies to almost all the languages out there.
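As a minimal sketch of the kind of code meant here (the function name `escapeNewlines` is hypothetical, chosen only for illustration), a program might build its output one small piece at a time:

```javascript
// Hypothetical example: escape newlines by appending to a result string
// one character at a time.
function escapeNewlines(input) {
  let result = '';
  for (const ch of input) {
    // Conceptually, each += produces a brand-new string that contains
    // everything built so far plus the new piece.
    result += ch === '\n' ? '\\n' : ch;
  }
  return result;
}

console.log(escapeNewlines('line1\nline2')); // prints line1\nline2
```

Real JavaScript engines may optimize this pattern internally, so the sketch only illustrates the naive model discussed in this post.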

Because strings are immutable, we need to do a full copy of the string for every small concatenation. In practice this turns an O(n) algorithm into O(n²).

Solution 1: Change the code

If this becomes a bottleneck, instead of using a string all the way through, you can use an array and push all the string pieces to it. Once you are done building the result, you can join all the pieces together into the final string. Since at this point you know all the strings, the join operation can sum all the sizes and allocate exactly the right size.
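The array-and-join pattern could look like the following sketch. The function name `escapeNewlinesWithArray` is hypothetical and the newline-escaping task is just a stand-in for any string-building loop:

```javascript
// Hypothetical example: collect the pieces in an array and join once at the end.
function escapeNewlinesWithArray(input) {
  const pieces = [];
  for (const ch of input) {
    // Pushing to the array never copies the text accumulated so far.
    pieces.push(ch === '\n' ? '\\n' : ch);
  }
  // A single join builds the final string from all the pieces.
  return pieces.join('');
}

console.log(escapeNewlinesWithArray('line1\nline2')); // prints line1\nline2
```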

This pattern works to solve the problem but requires the programmer to know about it and the performance to be bad enough that it is worth writing code in a different way. In practice, a lot of code is not written that way and it's unclear that any amount of education will change this fact.

Note that a compiler to a bytecode format could, in many cases, make the transformation of the original code to the explicit StringBuffer code. But not in all cases, since compilers have to be conservative: if the string being concatenated is passed as an argument, all bets are off.

Broken Solution 2: Mutate the original string

The solution that comes to mind is: can normal strings act as a buffer?

The idea is that you allocate a buffer of characters and whenever you do a concatenation, you keep writing at the end of the buffer. If it isn't big enough, you allocate a bigger one, do a single copy and keep going.
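Here is a rough sketch of such a growable buffer. The class name `CharBuffer`, the use of a `Uint16Array`, and the doubling growth policy are all assumptions made for illustration; the text above only says that a bigger buffer gets allocated.

```javascript
// Hypothetical growable character buffer (names and growth policy are assumptions).
class CharBuffer {
  constructor(capacity = 16) {
    this.chars = new Uint16Array(capacity); // storage for UTF-16 code units
    this.length = 0;                        // number of code units in use
  }

  append(str) {
    const needed = this.length + str.length;
    if (needed > this.chars.length) {
      // Not big enough: allocate a bigger buffer and do a single copy.
      let capacity = Math.max(1, this.chars.length * 2);
      while (capacity < needed) capacity *= 2;
      const bigger = new Uint16Array(capacity);
      bigger.set(this.chars.subarray(0, this.length));
      this.chars = bigger;
    }
    // Keep writing at the end of the buffer.
    for (let i = 0; i < str.length; i++) {
      this.chars[this.length + i] = str.charCodeAt(i);
    }
    this.length = needed;
    return this;
  }

  toString() {
    const units = this.chars.subarray(0, this.length);
    return Array.from(units, (unit) => String.fromCharCode(unit)).join('');
  }
}

const buffer = new CharBuffer(2).append('\n').append(' * ');
console.log(JSON.stringify(buffer.toString())); // prints "\n * "
```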

If you are curious, this is how the Java StringBuilder class is implemented. Performance-wise, this is what we want, but there's one problem...

Aliasing

You can assign a string to a variable and then assign that variable to another one. For example:

var str = '\n';
var str2 = str; // here we make an alias

In this case, both str and str2 are pointing to the same '\n' string. In the compiler literature this is called aliasing. The big question is what happens if you try to update one of the variables:

str += ' * ';

If you look at the JavaScript specification, strings are immutable, meaning that you expect str2 to be unchanged and only str to hold the new value:

str2 == '\n'
str == '\n * '

Unfortunately, if you mutate the string like in the above solution, then both of them would be '\n * ' because they both point to the same underlying storage.
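Real JavaScript strings cannot be mutated, so the following sketch simulates the broken behavior with a plain mutable object. The `chars` field and the function name `demoSharedMutation` are hypothetical stand-ins for the underlying storage of a string:

```javascript
// Simulation only: a mutable object stands in for a string's storage.
function demoSharedMutation() {
  const str = { chars: ['\n'] };
  const str2 = str; // alias: both names refer to the same storage

  // "Appending" by mutating the shared storage in place.
  str.chars.push(' ', '*', ' ');

  return { str: str.chars.join(''), str2: str2.chars.join('') };
}

const result = demoSharedMutation();
console.log(JSON.stringify(result.str));  // prints "\n * "
console.log(JSON.stringify(result.str2)); // prints "\n * " as well, which is the bug
```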

Solution 3: Linear Types

If you've not been living under a rock, you probably have heard about Rust and linear types. This is a fancy name to say that you cannot have aliasing: there's only a single variable that can point to a value at any given time.

What this means in this case is that the line var str2 = str; would be illegal. If you want to do that, you need to do a full copy of the value so it's effectively a different one.

In practice, aliasing happens all the time in normal programs, for example calling a function with a string as argument is a form of aliasing. We wouldn't want to do full copies every time aliasing is happening.

Rust is getting away with it using a concept called "borrowing" where you can create an alias if the compiler can guarantee that the previous variable cannot be accessed during the lifetime that the alias exists.

In my understanding, you need a strong type system in order to properly enforce those guarantees and in dynamic languages like JavaScript you would have to be too pessimistic and do way more copies than necessary when you are just passing the variable around, ruining the wins you get from building the string in the first place.

Solution 4: Size in the variable

Aliasing is usually a dealbreaker because you can mutate the underlying storage that another variable could observe. But in this particular case we can exploit the fact that the only mutation we care about is appending something at the end.

So, in the variable we not only keep a pointer to the buffer but also the size we care about. If someone else appends something at the end, it will not affect us because what's in the buffer for that size didn't change.

At this point, str2 points to buffer1 with '\n * ' but because it has size = 1 then we know it really is '\n' as intended.

The only edge case to consider is if you are trying to also concatenate str2. If the size of the variable is not equal to the size of the underlying buffer, this means that someone else clobbered the buffer. In this case, our only option is to do a full copy.
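The scheme can be sketched as follows. This is only an illustration under stated assumptions: the helper names `makeString`, `concat` and `read` are hypothetical, a plain array stands in for the character buffer, and a "variable" is modeled as a small handle holding a buffer reference plus a size.

```javascript
// Hypothetical model: a string variable is a handle { buffer, size }.
function makeString(text) {
  const chars = text.split(''); // one entry per UTF-16 code unit
  return { buffer: { chars }, size: chars.length };
}

function concat(handle, suffix) {
  let buffer = handle.buffer;
  if (handle.size !== buffer.chars.length) {
    // Someone else already appended to this buffer: fall back to a full copy
    // of the part we own.
    buffer = { chars: buffer.chars.slice(0, handle.size) };
  }
  // Append in place at the end of the buffer.
  for (const unit of suffix.split('')) buffer.chars.push(unit);
  return { buffer, size: buffer.chars.length };
}

function read(handle) {
  // Only the first `size` entries belong to this variable.
  return handle.buffer.chars.slice(0, handle.size).join('');
}

let str = makeString('\n');
const str2 = str;         // alias: same buffer, size = 1
str = concat(str, ' * '); // appends in place, no copy

console.log(JSON.stringify(read(str)));  // prints "\n * "
console.log(JSON.stringify(read(str2))); // prints "\n"
```

Concatenating to `str2` afterwards takes the copy path, because its size (1) no longer matches the size of the shared buffer (4).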

Conclusion

Before joining the team, I knew about the string builder pattern but I had no idea that there was so much theory behind this particular problem like aliasing, linear types... I hope that explaining those concepts in terms of JavaScript is helpful to get some insights into what's happening inside of compilers.

What it does is turn all the occurrences of \, :, /, \0 and z into zB, zC, zS, z0 and zZ. This way, none of the \, :, / and \0 characters, which are probably invalid in the context where that string is transported, remain in the escaped string. But you still have a way to get them back by transforming all the z-sequences back to their original form.
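A minimal sketch of this scheme, using the mapping described above (the function names `zEscape` and `zUnescape` are hypothetical):

```javascript
// Mapping from the text: \ -> zB, : -> zC, / -> zS, \0 -> z0, z -> zZ.
const ESCAPES = { '\\': 'zB', ':': 'zC', '/': 'zS', '\0': 'z0', z: 'zZ' };
const UNESCAPES = { B: '\\', C: ':', S: '/', 0: '\0', Z: 'z' };

function zEscape(input) {
  // Replace every special character, including z itself, with its z-sequence.
  return input.replace(/[\\:\/\x00z]/g, (ch) => ESCAPES[ch]);
}

function zUnescape(input) {
  // In escaped text every z starts a two-character sequence, so a single
  // left-to-right pass restores the original characters.
  return input.replace(/z([BCS0Z])/g, (match, code) => UNESCAPES[code]);
}

const escaped = zEscape('a/b:c\\z');
console.log(escaped);            // prints azSbzCczBzZ
console.log(zUnescape(escaped)); // prints a/b:c\z
```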

Why is it useful?

The first interesting aspect about it is that it's using z as an escape character instead of the usual \. In practice, a string is less likely to contain a z than a \, so we have to escape less often.

But the big win comes when escaping multiple times. With the \ escape sequence, it looks something like this:

\ -> \\ -> \\\\ -> \\\\\\\\ -> \\\\\\\\\\\\\\\\

whereas with the z escape sequence:

z -> zZ -> zZZ -> zZZZ -> zZZZZ

The fact that escaping a second time doubles the number of escape characters is problematic in practice. I was working on a project once where we found out that the \ character represented 70% of the payload!
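The difference in growth can be checked with a small sketch. The helper names are hypothetical, and each escaper is reduced to handling only its own escape character:

```javascript
// Reduced escapers: each one only escapes its own escape character.
function backslashEscape(s) {
  return s.replace(/\\/g, '\\\\'); // every \ becomes \\
}

function zOnlyEscape(s) {
  return s.replace(/z/g, 'zZ'); // every z becomes zZ
}

// Apply `escape` repeatedly and record the string length after each round.
function lengthsAfterRepeatedEscaping(escape, start, rounds) {
  const lengths = [start.length];
  let current = start;
  for (let i = 0; i < rounds; i++) {
    current = escape(current);
    lengths.push(current.length);
  }
  return lengths;
}

console.log(lengthsAfterRepeatedEscaping(backslashEscape, '\\', 4)); // [ 1, 2, 4, 8, 16 ]
console.log(lengthsAfterRepeatedEscaping(zOnlyEscape, 'z', 4));      // [ 1, 2, 3, 4, 5 ]
```

With \ every existing escape character has to be escaped again, so the length doubles each round. With z only the single leading z is re-escaped, because the Z characters it introduces are not special, so the length grows by one per round.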

Conclusion

It's way too late to change all the existing programming languages to use a different way to escape characters, but if you have the opportunity to design an escape sequence, know that the \ escape sequence is not always the best 🙂