The other day a friend had this React question for me: “Composition through components, one-way data binding; I understand all that, but why Virtual DOM?”

I gave him the usual answer: “Because direct DOM manipulation is inefficient and slow.”

“There’s always news about how JavaScript engines are getting more performant; what makes adding something directly to the DOM slow?”

…

That is a great question. Surprisingly, I haven’t found an article that pieces it all together and makes a rock-solid case for the Virtual DOM.

It’s not just the direct DOM manipulation that makes the whole process inefficient. It is what happens after.

To understand the need for a Virtual DOM, let’s take a quick detour: a 30,000-foot view of a browser’s workflow, and of what exactly happens after a DOM change.

A Browser’s Workflow

NOTE: The following diagram and the corresponding explanation use the WebKit engine’s terminology. The workflow is similar across all browsers, save for a couple of nuances.

Creation of the DOM tree

Once the browser receives an HTML file, the rendering engine parses it and creates a DOM tree of nodes, which have a one-to-one relation with the HTML elements.

Creation of the Render tree

Meanwhile, the styles, both from external CSS files and from inline styles on the elements, are parsed. The style information, along with the nodes in the DOM tree, is used to create another tree, called the render tree.

Creation of the Render Tree — Behind the scenes

In WebKit, the process of resolving the style of a node is called “attachment”. All nodes in the DOM tree have an "attach" method, which takes in the calculated style information and returns a render object (a.k.a. renderer).

Attachment is synchronous: inserting a node into the DOM tree calls the new node's "attach" method.

Building a render tree out of these render objects requires calculating the visual properties of each render object, which is done using the calculated style properties of each element.
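As a rough illustration of the attachment step described above, here is a minimal sketch that walks a tiny DOM-like tree and produces render objects. It is not WebKit's code: the plain-object node shape, the standalone `attach` function and the pre-resolved `computedStyle` field are all assumptions made for the example. The one rule it models is that nodes with `display: none` get no renderer.

```javascript
// Hypothetical, simplified model: plain objects stand in for DOM nodes,
// and `computedStyle` is assumed to be already resolved for each node.
function attach(node) {
  // Nodes with display: none produce no render object at all.
  if (node.computedStyle.display === 'none') return null;

  const renderer = {
    tag: node.tag,
    style: node.computedStyle,
    children: [],
  };

  // Attach children recursively; skipped children leave no trace
  // in the render tree.
  for (const child of node.children || []) {
    const childRenderer = attach(child);
    if (childRenderer) renderer.children.push(childRenderer);
  }
  return renderer;
}

const domTree = {
  tag: 'body',
  computedStyle: { display: 'block' },
  children: [
    { tag: 'h1', computedStyle: { display: 'block' }, children: [] },
    { tag: 'script', computedStyle: { display: 'none' }, children: [] },
    { tag: 'p', computedStyle: { display: 'block' }, children: [] },
  ],
};

const renderTree = attach(domTree);
console.log(renderTree.children.map((r) => r.tag)); // [ 'h1', 'p' ]
```

The render tree ends up with fewer nodes than the DOM tree, which is why the two are kept as separate structures.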

The Layout (also referred to as reflow)

After the render tree is constructed, it goes through a “layout” process. Every node in the render tree is given its screen coordinates: the exact position where it should appear on the screen.
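To make the layout step concrete, here is a minimal sketch of a block-only layout pass. It assumes a made-up box shape (`height`, `children`) and a single rule: boxes take their parent's width and stack vertically. Real engines handle floats, inline content, positioning and much more, so treat this only as an illustration of "assigning coordinates".

```javascript
// Hypothetical block-only layout: every box is as wide as its parent and
// boxes stack vertically. Real engines handle far more cases.
function layout(box, x, y, width) {
  box.x = x;
  box.y = y;
  box.width = width;

  let cursorY = y;
  for (const child of box.children) {
    layout(child, x, cursorY, width);
    cursorY += child.height;
  }

  // A leaf keeps its declared height; a container grows to fit its children.
  if (box.children.length > 0) box.height = cursorY - y;
  return box;
}

const root = {
  children: [
    { height: 40, children: [] },
    { height: 100, children: [] },
    { height: 20, children: [] },
  ],
};

layout(root, 0, 0, 800);
console.log(root.children.map((b) => b.y)); // [ 0, 40, 140 ]
console.log(root.height); // 160
```

Note that changing the height of the first box would shift the `y` of every box after it, which is why a small change to one node can force layout work for many others.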

The Painting

The next stage is to paint the render objects: the render tree is traversed and each node’s “paint()” method is called (using the browser’s platform-agnostic UI backend API), ultimately displaying the content on the screen.

Enter the Virtual DOM

So, as you can see from the steps above, whenever you make a DOM change, the steps that follow may all have to be redone: updating the render tree (which requires recalculating the style properties of the affected elements), then the layout, then the painting. Browsers try to limit this work to the parts of the tree that changed, but a single change can still affect many nodes.

In a complex SPA, which often involves a large number of DOM manipulations, this means many avoidable computational steps, which makes the whole process inefficient.

This is where the Virtual DOM abstraction truly shines: when there’s a change in your view, all the changes that are to be made on the real DOM are first made on the Virtual DOM and then sent on to the real DOM together, thus reducing the number of subsequent computational steps.

Update: The following comment from redditor ugwe43to874nf4 does more justice to the importance of the Virtual DOM 👏🏼

The real problem with DOM manipulation is that each manipulation can trigger layout changes, tree modifications and rendering. Each of them. So, say you modified 30 nodes, one by one. That would mean 30 (potential) re-calculations of the layout, 30 (potential) re-renderings, etc.

Virtual DOM is actually nothing new, but the application of "double buffering" to the DOM. You do each of those changes in a separate, offline DOM tree. This does not get rendered at all, so changes to it are cheap. Then, you dump those changes to the "real" DOM. You do that once, with all the changes grouped into 1. Layout calculation and re-rendering will be bigger, but will be done only once. That, grouping all the changes into one is what reduces calculations.

But actually, this particular behaviour can be achieved without a virtual DOM. You can manually group all the DOM modifications in a DOM fragment yourself and then dump it into the DOM.
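The manual grouping described here can be sketched with the standard `document.createDocumentFragment()` API. The helper name `appendItems` and its parameters are made up for this example, and `doc` is passed in (rather than using the global `document`) only so the function does not depend on a browser global.

```javascript
// `doc` is expected to be the browser's `document`; `list` is an element
// that is already in the page. Both names are assumptions of this sketch.
function appendItems(doc, list, labels) {
  const fragment = doc.createDocumentFragment();
  for (const label of labels) {
    const item = doc.createElement('li');
    item.textContent = label;
    // The fragment is not attached to the document, so nothing is laid
    // out or painted while it is being filled.
    fragment.appendChild(item);
  }
  // A single insertion into the live tree, however many items there are.
  list.appendChild(fragment);
  return list;
}

// In a browser:
// appendItems(document, document.querySelector('ul'), ['one', 'two', 'three']);
```

This covers the "add many nodes at once" case; it does not by itself tell you which existing nodes need to change.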

So, again, what does a Virtual DOM solve? It automates and abstracts the management of that DOM fragment so you don't have to do it manually. Not only that, but when doing it manually you have to keep track of which parts have changed and which ones haven't (because if you don't you'd end up refreshing huge pieces of the DOM tree that may not need to be refreshed). So a Virtual DOM (if implemented correctly) also automates this for you, knowing which parts need to be refreshed and which parts don't.
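The "knowing which parts need to be refreshed" idea can be sketched as a diff between two lightweight trees. This is a deliberately naive illustration, not React's reconciliation algorithm: the `{ tag, props, children }` node shape, the `h` helper and the patch names are all invented for the example, and children are compared by position only.

```javascript
// Hypothetical virtual node shape: { tag, props, children }, where children
// are virtual nodes or strings. Returns a list of patch descriptions.
function diff(oldNode, newNode, path = 'root') {
  if (oldNode === undefined) return [{ type: 'CREATE', path, node: newNode }];
  if (newNode === undefined) return [{ type: 'REMOVE', path }];

  const isText = (n) => typeof n === 'string';
  if (isText(oldNode) || isText(newNode)) {
    return oldNode === newNode ? [] : [{ type: 'REPLACE', path, node: newNode }];
  }
  if (oldNode.tag !== newNode.tag) return [{ type: 'REPLACE', path, node: newNode }];

  const patches = [];

  // Compare props on the same element.
  const keys = new Set([...Object.keys(oldNode.props), ...Object.keys(newNode.props)]);
  for (const key of keys) {
    if (oldNode.props[key] !== newNode.props[key]) {
      patches.push({ type: 'SET_PROP', path, key, value: newNode.props[key] });
    }
  }

  // Compare children position by position.
  const count = Math.max(oldNode.children.length, newNode.children.length);
  for (let i = 0; i < count; i++) {
    patches.push(...diff(oldNode.children[i], newNode.children[i], `${path}/${i}`));
  }
  return patches;
}

// Small helper to build virtual nodes.
const h = (tag, props, ...children) => ({ tag, props, children });

const before = h('ul', { class: 'list' }, h('li', {}, 'one'), h('li', {}, 'two'));
const after = h('ul', { class: 'list' },
  h('li', {}, 'one'),
  h('li', { class: 'done' }, 'two!'),
  h('li', {}, 'three'));

// Three patches: a prop change and a text change on the second item,
// plus one new item. The unchanged first item produces nothing.
console.log(diff(before, after));
```

Only the resulting patches would then be applied to the real DOM, so the untouched first list item is never visited there.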

Finally, by relinquishing DOM manipulation for itself, it allows for different components or pieces of your code to request DOM modifications without having to interact among themselves, without having to go around sharing the fact that they've modified or want to modify the DOM. This means that it provides a way to avoid having to do synchronization between all those parts that modify the DOM while still grouping all the modifications into one.
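The last point, independent pieces of code requesting changes while one owner applies them together, can be sketched as a small queue. Everything here (`createUpdateQueue`, `request`, `flush`, the shape of an update) is a made-up illustration rather than any library's API. In a browser, `flush` would typically be called once per frame, for example from `requestAnimationFrame`.

```javascript
// Hypothetical queue: components call `request`, and whoever owns the DOM
// calls `flush` (for example once per animation frame).
function createUpdateQueue(applyBatch) {
  const pending = [];
  return {
    request(update) {
      pending.push(update);
    },
    flush() {
      if (pending.length === 0) return 0;
      const batch = pending.splice(0, pending.length); // empties the queue
      applyBatch(batch); // one pass over the real DOM for all requests
      return batch.length;
    },
  };
}

const applied = [];
const queue = createUpdateQueue((batch) => applied.push(batch));

// Two unrelated "components" request changes without knowing about each other.
queue.request({ target: 'header', text: 'Hello' });
queue.request({ target: 'sidebar', text: '3 new' });

console.log(queue.flush()); // 2
console.log(applied.length); // 1
```

Neither caller needs to know the other exists; both requests still land in the same batch.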

Further Reading

The browser workflow above has been excerpted from this document on the internals of browser operations. It delves deep under a browser engine’s hood, explaining everything in detail; it is definitely worth your time to read it from end to end. It helped me a great deal in understanding the “why” and in justifying the need for a Virtual DOM abstraction.

Hope this was of help. Let me know if you have any questions in the comments.

So, does the vDOM cover the WHOLE DOM, or just a specified single element? I assume vDOM is based on something like MutationSummary (https://github.com/rafaelw/mutation-summary) for picking up changes to it? Furthermore, is the vDOM essentially just a JSON/similar representation of the real DOM? Is it possible to view/access the vDOM in React?

@saiki Thanks for the pointer! Unfortunately I did actually already create a DOM serialiser! Thanks for the interest too :). I just thought I might be barking up the wrong tree (see what I did there). Thanks again

Have seen this "explanation" and diagram many times. But still, useless and actually doesn't explain everything.

When a Virtual DOM wants to do what you call "rerender", it needs to use the native DOM API itself, because otherwise it is just impossible to communicate with the browser. Why do I need a lot of extra layers here when I can do just A -> E instead of A -> B -> C -> D -> E, with document.createDocumentFragment() for example?

When exactly does a browser do "rerendering"? What about "forced reflow" and "read first and write second"? Why do you think modern browsers are not doing a lot of optimization already?

Any real business code examples where everyone can compare good vanilla JS DOM manipulation and the same code with a Virtual DOM? Please no more useless innerHTML in a loop. I've been loading and rerendering 10000+ comments with plain JS easily.

Any benchmarks?

In normal situations a developer can answer simple questions like these quickly; however, for this "virtual" problem even a core author of React couldn't answer any of them.

I understand the idea of vDOM. Any app today has changes in the UI, but for that you don't need to remove the original body/section and put a modified body/section in its place every time. With vDOM you have a bit less UI computation, but you now have new computations: you need to keep a huge document in memory and search it for changes each time. After that, when the vDOM is rendered into the DOM, the browser still has to do its job; you can't avoid it.

Can you provide a specification for the sorting use case: what do you have, what do you want to do, and what results do you expect? I will write a vanilla JS sorting example and then we can compare. It's hard for me to browse so many .ts files to find what exactly it is doing :) In any case, 1) in the real world you will never have so many operations at the same time, and 2) screen size is very limited: you don't need to do anything with the 100 posts above that I have already scrolled past. Only when I can see the changes (plus or minus some space) does the UI need to change.

As a developer I always know what exactly must be changed, when and where. Yes, createDocumentFragment() does the job when you add nodes, but when you modify... Again, I need real examples of these complicated SPAs to be able to answer this. For each use case there is a simple solution. What exactly must be done? ...and it's not sorting a huge table every 10ms on my screen. Average human reaction time is about 300ms. No more than one click per 300ms, no more than one action per 300ms.

What you call React's idea of separating an app into components is in no way related to or invented by React. It is a common and general approach in software architecture and has been for many, many years. It's hard to say who was first, but we can be sure that this idea is at least 25 years old, since it is one of the core concepts of what is called the UNIX Philosophy. In frontend we have actually had these components for at least 10 years of jQuery, where each jQuery plugin, or a subset of the jQuery UI library, is a component.

Talking about the data, every app today inherits from 3-tier architecture (MVC or MVwhatever), and again it is a very old principle and not connected to React. Every piece of software has a separation of business logic and presentation logic. The user clicks a button, calls for an action, an XHR is made, the server reads the DB and returns data, the client puts this data in the DOM. It was always unidirectional, and Angular's 2-way data binding just blew up the Internet with another buzzword and a useless technique.

In my vanilla apps I have a very small window.Api object; after that I have a models directory with small ajax/API-speaking objects composed with window.Api, and my usual vanilla update flow looks like:

// liking a comment
btn.addEventListener('click', () => {
  const commentId = btn.dataset.id;
  // Wait for the AJAX response. If there are errors, PostComment
  // (based on the Api object) will show a small alert.
  PostComment.like(commentId).then(() => {
    // If everything is OK, do the rerendering here. How to do it is up to
    // the PostCommentUI object; in this case it could be just:
    // 1) incrementing .textContent (and NEVER innerHTML) in the <span>
    //    where the like count is stored
    // 2) marking the like button as active, e.g. btn.classList.add('active')
    PostCommentUI.like(commentId);
  });
});