Collaborative Editing in JavaScript: An Intro to Operational Transformation

I've set out to build a robust collaborative code editor for the web. It's called Codr, and it lets developers work together in real time - like Google Docs for code. For web developers, Codr doubles as a shared reactive work surface where every change is instantly rendered for all viewers. Check out Codr's newly launched Kickstarter campaign to learn more.

A collaborative editor allows multiple people to edit the same document simultaneously and to see each other's edits and selection changes as they occur. Concurrent text editing allows for engaging and efficient collaboration that would otherwise be impossible. Building Codr has enabled me to better understand and (I hope) convey how to build a fast and reliable collaborative application.

The Challenge

If you've built a collaborative editor or have talked with someone who has, then you know that gracefully handling concurrent edits in a multi-user environment is challenging. It turns out, however, that a few relatively simple concepts greatly simplify this problem. Below I'll share what I've learned in this regard through building Codr.

The primary challenge associated with collaborative editing is concurrency control. Codr uses a concurrency control mechanism based on Operational Transformation (OT). If you'd like to read up about the history and theory of OT, then check out the wikipedia page. I'll introduce some of the theory below, but this post is intended as an implementor's guide and is hands-on rather than abstract.

Codr is built in JavaScript and code examples are in JavaScript. Significant logic needs to be shared between the server and client to support collaborative editing, so a node/iojs backend is an excellent choice. In the interest of readability, code examples are in ES6.

A Naive Approach to Collaborative Editing

In a zero-latency environment, you might write a collaborative editor like this:

Every action is conceptualized as either an insert or delete operation. Each operation is:

Locally applied in the editing component

Sent to the server

Applied to a server-side copy of the document

Broadcast to other remote editors

Locally applied to each remote editor's copy of the document

Latency Breaks Things

When you introduce latency between the client and server, however, you run into problems. As you've likely foreseen, latency in a collaborative editor introduces the possibility of version conflicts. For example:

Oops! 'abced' != 'abcde' - the shared document is now in an inconsistent state.

The Easy Fix is Too Slow

The above conflict occurs because each user is "optimistically" applying edits locally without first ensuring that no one else is making edits. Since User 1 changed the document out from under User 2, a conflict occurred. User 2's edit operation presupposes a document state that no longer exists by the time it's applied to User 1's document.

A simple fix is to switch to a pessimistic concurrency control model where each client requests an exclusive write lock from the server before applying updates locally. This avoids conflicts altogether. Unfortunately, the lag resulting from such an approach over an average internet connection would render the editor unusable.

Operational Transformation to the Rescue

Operational Transformation (OT) is a technique to support concurrent editing without compromising performance. Using OT, each client optimistically updates its own document locally and the OT implementation figures out how to automatically resolve conflicts.

OT dictates that when we apply a remote operation we first "transform" the operation to compensate for conflicting edits from other users. The goals are two-fold:

Ensure that all clients end up with consistent document states

Ensure that the intent of each edit operation is preserved

In my original example, we'd want to transform User 2's insert operation to insert at character offset 4 instead of offset 3 when we apply it to User 1's document. This way, we respect User 2's intention to insert e after d and ensure that both users end up with the same document state.

To do the merge, the server updates (transforms) operation A so that it still make sense in light of preceding operations E and F, then applies the transformed operation (G) to master. The transformed operation is directly analogous to a git merge commit.

Rebasing Onto Master (Client-Side)
After an operation is transformed and applied server-side, it is broadcasted to the other clients. When a client receives the change, it does the equivalent of a git rebase:

Reverts all "pending" (non-merged) local operations

Applies remote operation

Re-applies pending operations, transforming each operation against the new operation from the server

By rebasing the client rather than merging the remote operation as is done server-side, Codr ensure that edits are applied in the same order across all clients.

Establishing a Canonical Order of Edit Operations

The order in which edit operations are applied is important. Imagine that two users type the characters a and b simultaneously at the same document offset. The order in which the operations occur will determine if ab or ba is shown. Since latency is variable, we can't know with certainty what order the events actually occurred in, but it's nonetheless important that all clients agree on the same ordering of events. Codr treats the order in which events arrive at the server as the canonical order.

The server stores a version number for the document which is incremented whenever an operation is applied. When the server receives an operation, it tags the operation with the current version number before broadcasting it to the other clients. The server also sends a message to the client initiating the operation indicating the new version. This way every client knows what its "server version" is.

Whenever a client sends an operation to the server, it also sends the client's current server version. This tells the server where the client "branched", so the server knows what previous operations the new change needs to be transformed against.

If op1 inserted text beforeop2 on the same line, increase op2's character offset accordingly.

If op1 occurred entirely afterop2, then don't do anything.

If op1 inserts text into a range that op2 deletes, then grow op2's deletion range to include the inserted text and add the inserted text. Note: Another approach would be to split op2 into two deletion actions, one on either side of op1's insertion, thus preserving the inserted text.

If op1 and op2 are both range deletion operations and the ranges overlap, then shrink op2's deletion range to only include text NOT deleted by op1.

Syncing Cursor Position and Selection

A user selection is simply a text range. If the start and end points of the range are equal, then the range is a collapsed cursor. When the user selection changes, the client sends the new selection to the server and the server broadcasts the selection to the other clients. As with editing operations, Codr transforms the selection against conflicting operations from other users. The transform logic for a selection is simply a subset of the logic needed to transform an insert or delete operation.

Undo/Redo

Codr gives each user their own undo stack. This is important for a good editing experience: otherwise hitting CMD+Z could undo someone else's edit in a different part of the document.

Giving each user their own undo stack also requires OT. In fact, this is one case where OT would be necessary even in a zero-latency environment. Imagine the following scenario:

Implementing undo/redo correctly is one of the more challenging aspects of building a collaborative editor. The full solution is somewhat more involved than what I've described above because you need to undo contiguous inserts and deletes as a unit. Since operations that were contiguous may become non-contiguous due to edits made by other collaborators, this is non-trivial. What's cool though, is that we can reuse the same OT used for syncing edits to achieve per-user undo histories.

Conclusion

OT is a powerful tool that allows us to build high-performance collaborative apps with support for non-blocking concurrent editing. I hope that this summary of Codr's collaborative implementation provides a helpful starting point for understanding OT. A huge thanks to David for his invitation to let me share this piece on his blog.

About Alden Daniels

Alden is the lead front-end engineer at OneSpot. He likes using web technologies to build high-performance apps that humans will enjoy using. He lives and works in the beautiful city of Austin, TX. When he's not hacking on code, he enjoys all things outdoors, travel, and good times with family and friends. He's also an insufferable punster.

Google Plus provides loads of inspiration for front-end developers, especially when it comes to the CSS and JavaScript wonders they create. Last year I duplicated their incredible PhotoStack effect with both MooTools and pure CSS; this time I'm going to duplicate...

One of the worst kept secrets about AJAX on the web is that the underlying API for it, XMLHttpRequest, wasn't really made for what we've been using it for. We've done well to create elegant APIs around XHR but we know we can do better. Our effort to...

I was recently scoping out the horrid source code of the Google homepage when I noticed the "Google Search" and "I'm Feeling Lucky" buttons had a style definition I hadn't seen before: -webkit-appearance. The value assigned to the style was "push-button." They are buttons so that...

One of the most awesome parts of the Dojo / Dijit / DojoX family is the amazing GFX library. GFX lives within the dojox.gfx namespace and provides the foundation of Dojo's charting, drawing, and sketch libraries. GFX allows you to create vector graphics (SVG, VML...

Great idea! Collaborative editing topic is complicated and the solution are also hard to test and implement.

Couple questions about your implementation:

– Have you looked at the ShareJS project which solves also similar problem ?
– Do you plan to separate the core implementation of OT, so it would be possible to use in other projects ?
– How would you scale your solution beyond single instance, what are the limits ?
– Is there going to be offline support and how do you solve it using the OT ?

Great post.

Risto Novik

Forgot one more question, any plans to support other data structures (Sets, Vectors, key-value stores) besides the text sequence editing ?

> Have you looked at the ShareJS project which solves also similar problem?

I’m familiar with ShareJS, but haven’t used it. One of my goals when creating Codr was to become more conversant with OT. This motivated me to build my own solution. Also, I don’t think ShareJS provides built-in support for separate undo stacks.

> Do you plan to separate the core implementation of OT, so it would be possible to use in other projects ?

I’d like to. This will take some time, however, so whether or not I get to this depends on whether the KickStarter campaign is funded.

> How would you scale your solution beyond single instance, what are the limits?

Scaling out to multiple machines doesn’t introduce too much complexity so long as actions affecting a specific document are all routed to the same machine. Some intelligent load balancing / routing would accomplish this.

> Is there going to be offline support and how do you solve it using the OT

OT makes this entirely possible. However, the result may be unhelpful if you and someone else spend a lot of time editing the same document and don’t see the merged results till later. Since Codr’s primary purpose is to facilitate realtime collaboration, this is something I’m unlikely to add in the near future.

> Forgot one more question, any plans to support other data structures (Sets, Vectors, key-value stores) besides the text sequence editing?

No, although a lot of the logic could be re-used for these. It looks like ShareJS supports pluggable transform types, which is a good approach.

Shanmu

Hi Alden,

Great post, I understand the core concept of collaborative editing.

I am stuck with the OT algorithm, how it actually transforms the position of operation?

I am wondering the actual algorithm will be? How does it work behind the scenes?

Can I get any sudo view of the algorithm? Or any reference link for OT algorithm.

Continue this conversation via emailGet only replies to your comment, the best of the rest, as well as a daily recap of all comments on this post. No more than a few emails daily, which you can reply to/unsubscribe from directly from your inbox.