
A Glance Into Web Tech.-Based Text Editors' Text Management

It's been nearly two years since I wrote "A brief glance at various text editors" which was fun to write. A lot of people wanted to see how modern web-based editors manage their text. I've found the time and motivation to write another article about these wonderful tools. This does not cover rich text web-based editors - only source code editors.

As with the last article, this is in no way 100% accurate. There isn't as much source this time, because the majority use plain JavaScript arrays... I've left out the "physically, virtually, and space efficient" triangle graphics. Sure, they were nice to see, but they offered little and were a lot of work. To make up for it, I have a little surprise at the end that you hardcore text editor fans will love. Corrections normally follow when writing this type of material. If you have any suggestions, I'll do my best to find you in the various Reddit and Hacker News threads that sprout from this...or I'll reply to your comment here right away! Anyone can comment. No registration required.

Alright, grab a drink, sit back, and read on!

Snazzy

GitHub's Atom

Atom has been infamous for being slow. With other JavaScript based editors like Visual Studio Code, there is no excuse for this bad experience (and we'll take a look at VSC soon too). I began by going to their GitHub repository page and searching the src/ directory for anything to do with a buffer or text editor component. One thing that's apparent right away is the amount of abstraction and decoupling. Atom truly does lend itself to being very malleable. After 10 minutes I realized it was importing a "text-buffer" package that it downloads from npm. It's still by the Atom developers, they've just separated the buffer so it's easier to test and maintain. Personally I like how this further demonstrates their excellent separation of concerns.

All the text insertion is "funneled" through a single function called setTextInRange through various events. In theory the asynchronous event system should leave Atom always responsive. The real meat is when applyChange is called. Below is a copy of the code:

I'm not sure if key presses are buffered by Atom. Worst case, this roughly 80-line function is called every time you press a key. In reality, though, I think the key presses are buffered and the routine runs when the event system is given time to handle the event.

Every time text is inserted, it's inserted as one "chunk", then split up by its line endings. This is done by invoking a regular expression engine. Personally I think this is overkill, but it certainly lets Atom continue to be easily modifiable. I can imagine the same thought is running through a few people reading this. It pushes all the new lines to a stack (or more technically: a regular JavaScript array). Already I don't want to find myself opening a large file. It then uses "spliceArray" to replace a range of lines.
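To make that flow concrete, here is a minimal sketch of a line-array buffer. It is not Atom's code: the LineBuffer class, its getText method and the {row, column} position shape are made up for illustration, and only the name setTextInRange is borrowed from the article. It assumes the range is valid and that lines are joined back with "\n".

```javascript
// Hypothetical sketch of a line-array text buffer (not Atom's actual code).
class LineBuffer {
  constructor(text = "") {
    // One array entry per line, split on any common line ending.
    this.lines = text.split(/\r\n|\r|\n/);
  }

  // Replace the text between two {row, column} positions with newText.
  setTextInRange(start, end, newText) {
    const prefix = this.lines[start.row].slice(0, start.column);
    const suffix = this.lines[end.row].slice(end.column);
    // The inserted chunk is split on line endings with a regular expression.
    const newLines = newText.split(/\r\n|\r|\n/);
    newLines[0] = prefix + newLines[0];
    newLines[newLines.length - 1] += suffix;
    // Swap the affected rows for the new ones in a single splice.
    this.lines.splice(start.row, end.row - start.row + 1, ...newLines);
  }

  getText() {
    return this.lines.join("\n");
  }
}

const buffer = new LineBuffer("hello\nworld");
buffer.setTextInRange({ row: 0, column: 5 }, { row: 1, column: 0 }, ", big\nwide ");
console.log(buffer.getText());
```

Every insert that adds or removes lines makes splice shift everything after the edit, which is why a huge file is the uncomfortable case. Spreading a very large array into splice can also run into engine argument limits, so a real implementation would likely splice in chunks.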

Atom will have no problem growing. Its speed can definitely be on-par with Visual Studio Code if someone truly wanted to see it happen. Currently it looks like the Atom developers are leaning on the built-in arrays for performance. There is a huge opportunity here to contribute to Atom and make it way better than it already is. Write a piece chain text buffer implementation. Write a gap buffer implementation. Anything but using a regular array like this.
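For anyone tempted to try, here is a rough sketch of what a gap buffer could look like in JavaScript. Everything in it (the GapBuffer class, its method names, the default gap size) is hypothetical and has nothing to do with Atom's code base. It stores UTF-16 code units in a plain array and assumes positions are within bounds.

```javascript
// Hypothetical gap buffer sketch; all names here are illustrative.
class GapBuffer {
  constructor(text = "", gapSize = 16) {
    // Text sits before the gap; the gap starts out at the very end.
    this.buf = new Array(text.length + gapSize);
    for (let i = 0; i < text.length; i++) this.buf[i] = text[i];
    this.gapStart = text.length;
    this.gapEnd = this.buf.length;
  }

  get length() {
    return this.buf.length - (this.gapEnd - this.gapStart);
  }

  // Slide the gap so that it begins at logical position pos.
  moveGap(pos) {
    while (this.gapStart > pos) {
      this.buf[--this.gapEnd] = this.buf[--this.gapStart];
    }
    while (this.gapStart < pos) {
      this.buf[this.gapStart++] = this.buf[this.gapEnd++];
    }
  }

  // Make sure the gap can hold at least `needed` more characters.
  ensureGap(needed) {
    const gap = this.gapEnd - this.gapStart;
    if (gap >= needed) return;
    const extra = Math.max(needed - gap, this.buf.length);
    const tail = this.buf.slice(this.gapEnd);
    this.buf.length = this.gapStart;               // drop old gap and tail
    this.buf.length = this.gapStart + gap + extra; // reserve the larger gap
    this.gapEnd = this.buf.length;
    for (const ch of tail) this.buf.push(ch);      // put the tail back
  }

  insert(pos, text) {
    this.ensureGap(text.length);
    this.moveGap(pos);
    for (let i = 0; i < text.length; i++) this.buf[this.gapStart++] = text[i];
  }

  // Delete `count` characters starting at pos by widening the gap.
  delete(pos, count) {
    this.moveGap(pos);
    this.gapEnd = Math.min(this.gapEnd + count, this.buf.length);
  }

  toString() {
    return this.buf.slice(0, this.gapStart).join("") +
           this.buf.slice(this.gapEnd).join("");
  }
}

const gap = new GapBuffer("hello world", 4);
gap.insert(5, ",");
console.log(gap.toString());
```

The appeal is that typing at the cursor only writes into the gap, so nothing after the cursor has to move until the cursor itself jumps somewhere else.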

Oh this is snazzy too

Microsoft's Visual Studio Code

Visual Studio Code is the new player on the block. With its 1.0 release, Visual Studio Code has reached a stable API to allow its developers to devote efforts to new plugins. It's a direct competitor to GitHub's Atom. One major advantage VSC has over Atom is its responsiveness. In my experience, and the experience of many others, we've observed that VSC is just plain overall faster than Atom. I'm not sure about resource usage, but I've heard stories of Atom eating 8 GB of RAM vs no stories of VSC doing this.

It was very difficult to find out how VSC manages its text. There are several applyEdits methods peppered all over the source. Instead of copying the code here, I'll link to these applyEdits methods for you to read and interpret yourself.

If we look at api.js, it appears VSC doesn't do anything fancy either. It uses plain JavaScript arrays and .splice() calls to insert new text. One difference is that VSC appears to buffer many edits. But again, like Atom, it is event based and uses a very, very similar architecture. The code looks less organized and more complex. I guess VSC uses these different text models when appropriate and it pays off overall. Other than that I have no idea. Maybe that one difference is all it takes vs Atom's simple approach. I still think VSC would choke on large files just like Atom would.
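As a rough illustration of what buffering edits on top of a plain array might look like, here is a small sketch. It is a guess at the general idea, not VSC's implementation: the applyBatchedEdits function and the edit shape ({startLine, endLine, lines}, replacing whole lines in a half-open range) are invented for this example, and it assumes the edits in a batch do not overlap.

```javascript
// Hypothetical sketch: apply a batch of whole-line edits with splice().
// Each edit replaces lines [startLine, endLine) with edit.lines.
// Assumes the edits in one batch do not overlap.
function applyBatchedEdits(lines, edits) {
  // Work bottom-up so earlier line numbers stay valid after each splice.
  const sorted = [...edits].sort((a, b) => b.startLine - a.startLine);
  for (const edit of sorted) {
    lines.splice(edit.startLine, edit.endLine - edit.startLine, ...edit.lines);
  }
  return lines;
}

const doc = ["a", "b", "c", "d"];
applyBatchedEdits(doc, [
  { startLine: 0, endLine: 1, lines: ["A1", "A2"] }, // replace "a" with two lines
  { startLine: 2, endLine: 4, lines: [] },           // delete "c" and "d"
]);
console.log(doc);
```

Collecting edits first means every position in the batch refers to the document as it was before the batch started, so the caller never has to adjust line numbers by hand.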

Eclipse's Orion

And that leaves us with just Orion, an editor I haven't heard much about. It's used in Eclipse Che, a web-based Eclipse. Their mission statement is noble:

The goal of Orion is to build developer tooling that works in the browser, at web scale. The vision behind Orion is to move software development to the web as a web experience, by enabling open tool integration through HTTP and REST, JSON, OAuth, OpenID, and others. The idea is to exploit internet design principles throughout, instead of trying to bring existing desktop IDE concepts to the browser.

I browsed the GitHub repository for a bit but I couldn't really find anything. A lot of request-based code. If anyone finds out I'll add your inspection here.

So those are all the current popular web technology based source code editors out there. Did you notice something? They are all backed by some company. This leads me to ask: why are these companies competing in this market? What is there to gain? What is there to...lose? Well money, time and energy.

One thing I'm taking away from this is if I ever need a web-based source code editor, hands down I'm going with CodeMirror. I also expect some big changes in Atom that will make its efficiency on-par with Visual Studio Code. That will be exciting. I welcome the JavaScript Emacs of tomorrow.

And now, for our special feature presentation...!

Microsoft's Word 1.1a

Holy this tool looks serious...I think I like it.

Yep. And boy is the source not pretty. After about an hour and a half I've finally found the code that manages the text. I have to thank the people who worked on this project. Without the comments I'd be sitting here all night trying to figure out what it does.

And that's not even the best part. The FcAppendRgchToFn function is in fricking assembly.

I've got to say though the assembly is nicely written and commented. It's how I write my Z80 actually - writing out pseudo code in a comment and then translating it below.

Word uses a "scratch buffer" to hold all changes. At first I took it for a gap buffer, because the string is always appended to and then the end pointers are fixed up, but it is actually a piece table, which is great to see. While looking at the other parts of the source code, it's apparent there is a ton of optimization all over. Maybe I should start writing all my blog posts in Word 1.1, and use an RTF -> HTML converter...
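For readers who haven't met a piece table before, here is a minimal sketch of the idea in JavaScript. It is not a translation of Word's source: the PieceTable class and its field names are hypothetical, and it only handles inserts. The original text is never modified, new text is only ever appended to a scratch string, and the document is described by a list of pieces pointing into one or the other.

```javascript
// Hypothetical piece table sketch; not Word's actual data layout.
class PieceTable {
  constructor(original = "") {
    this.original = original; // never modified after load
    this.scratch = "";        // append-only store for inserted text
    this.pieces = original.length
      ? [{ source: "original", start: 0, length: original.length }]
      : [];
  }

  insert(pos, text) {
    if (text.length === 0) return;
    // New text is always appended to the scratch buffer.
    const piece = { source: "scratch", start: this.scratch.length, length: text.length };
    this.scratch += text;

    let offset = 0;
    for (let i = 0; i < this.pieces.length; i++) {
      const p = this.pieces[i];
      if (pos <= offset + p.length) {
        // Split the piece containing pos and put the new piece in between.
        const left = pos - offset;
        const parts = [];
        if (left > 0) parts.push({ source: p.source, start: p.start, length: left });
        parts.push(piece);
        if (left < p.length) {
          parts.push({ source: p.source, start: p.start + left, length: p.length - left });
        }
        this.pieces.splice(i, 1, ...parts);
        return;
      }
      offset += p.length;
    }
    this.pieces.push(piece); // empty table, or pos past the end
  }

  toString() {
    return this.pieces
      .map((p) => this[p.source].slice(p.start, p.start + p.length))
      .join("");
  }
}

const table = new PieceTable("hello world");
table.insert(5, ",");
console.log(table.toString());
```

A delete would work the same way, by trimming or splitting pieces without touching either buffer. That append-then-fix-up-the-pointers pattern is what makes the scratch buffer look like a gap buffer at first glance.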

And that's all folks! Have a nice week. I hope you enjoyed that. I certainly did. Want to help me out? Email me about JavaScript work (Angular or Node! ES6! Waaa!!). Alright, I'm going to play Age of Empires 2. Good night.