What is Node.js? The JavaScript runtime explained

Scalability, latency, and throughput are key performance indicators for web servers. Keeping the latency low and the throughput high while scaling up and out is not easy. Node.js is a JavaScript runtime environment that achieves low latency and high throughput by taking a “non-blocking” approach to serving requests. In other words, Node.js wastes no time or resources on waiting for I/O requests to return.

Let me explain…

In the traditional approach to creating web servers, for each incoming request or connection the server spawns a new thread of execution or even forks a new process to handle the request and send a response. Conceptually, this makes perfect sense, but in practice it incurs a great deal of overhead.

While spawning threads incurs less memory and CPU overhead than forking processes, it can still be inefficient. The presence of a large number of threads can cause a heavily loaded system to spend precious cycles on thread scheduling and context switching, which adds latency and imposes limits on scalability and throughput.

Node.js takes a different approach. It runs a single-threaded event loop registered with the system to handle connections, and each new connection causes a JavaScript callback function to fire. The callback function can handle requests with non-blocking I/O calls, and if necessary can spawn threads from a pool to execute blocking or CPU-intensive operations and to load-balance across CPU cores. Node’s approach to scaling with callback functions requires less memory to handle more connections than most competing architectures that scale with threads, including Apache HTTP Server, the various Java application servers, IIS and ASP.NET, and Ruby on Rails.
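To make the non-blocking idea concrete, here is a minimal sketch, assuming a reasonably recent Node.js release. It is not taken from any particular server; the temporary file name and the events array are invented for illustration. The script requests a file read, carries on immediately, and only later has its callback fired by the event loop.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file, created only so the sketch has something to read.
const file = path.join(os.tmpdir(), `nonblocking-demo-${process.pid}.txt`);
fs.writeFileSync(file, 'some data\n');

// Records the order in which things happen.
const events = [];

// Non-blocking read: Node.js hands the work off and returns immediately.
fs.readFile(file, 'utf8', (err, text) => {
  if (err) throw err;
  events.push(`callback fired: read ${text.length} characters`);
  console.log(events.join('\n'));
  fs.unlinkSync(file); // clean up the scratch file
});

// This line runs before the callback above, because the read has not finished yet.
events.push('main script continues while the read is in flight');
```

The callback always runs after the main script has finished its current pass, which is why the “continues” entry is recorded first.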

Node.js turns out to be quite useful for desktop applications in addition to servers. Also note that Node applications aren’t limited to pure JavaScript. You can use any language that transpiles to JavaScript, for example TypeScript and CoffeeScript. Node.js incorporates the Google Chrome V8 JavaScript engine, which supports ECMAScript 2015 (ES6) syntax without any need for an ES6-to-ES5 transpiler such as Babel.

Much of Node’s utility comes from its large package library, which is accessible from the npm command. NPM, the Node package manager, is part of the standard Node.js installation, although it has its own website.

Some JavaScript history

In 1995 Brendan Eich, then a contractor to Netscape, created the JavaScript language to run in Web browsers—in 10 days, as the story goes. JavaScript was initially intended to enable animations and other manipulations of the browser document object model (DOM). A version of JavaScript for the Netscape Enterprise Server was introduced shortly afterwards.

The name JavaScript was chosen for marketing purposes, as Sun’s Java language was widely hyped at the time. In fact, the JavaScript language was based primarily on the Scheme and Self languages, with superficially Java-like syntax.

Initially, many programmers dismissed JavaScript as useless for “real work” because its interpreter ran an order of magnitude more slowly than compiled languages. That changed as several research efforts aimed at making JavaScript faster began to bear fruit. Most prominently, the open-source Google Chrome V8 JavaScript engine, which does just-in-time compilation, inlining, and dynamic code optimization, can actually outperform C++ code for some loads, and outperforms Python for most use cases.

The JavaScript-based Node.js platform was introduced by Ryan Dahl in 2009 for Linux and MacOS as a more scalable alternative to the Apache HTTP Server. NPM, written by Isaac Schlueter, launched in 2010. A native Windows version of Node.js debuted in 2011.

Joyent owned, governed, and supported the Node.js development effort for many years. Since 2015, the Node.js project has belonged to the Node.js Foundation, governed by the foundation’s technical steering committee. Node.js has also been embraced as a Linux Foundation Collaborative Project.

The beginning of the code loads the HTTP module, sets the server hostname variable to localhost (127.0.0.1), and sets the port variable to 3000. Then it creates a server and a callback function, in this case a fat arrow function that always returns the same response to any request: statusCode 200 (success), content type plain text, and a text response of “Hello World\n”. Finally, it tells the server to listen on localhost port 3000 (via a socket) and defines a callback to print a log message on the console when the server has started listening. If you run this code in a terminal or console using the node command and then browse to localhost:3000 using any Web browser on the same machine, you’ll see “Hello World” in your browser. To stop the server, press Control-C in the terminal window.
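A server matching that description might look like the sketch below. It follows the well-known Node.js “Hello World” pattern rather than reproducing any specific listing, so treat the exact variable names and the log message as assumptions.

```javascript
// Load the HTTP module from the Node.js standard library.
const http = require('http');

// Listen only on the local machine, port 3000.
const hostname = '127.0.0.1';
const port = 3000;

// The arrow function is the request callback; it sends the same reply every time.
const server = http.createServer((req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Hello World\n');
});

// The second arrow function fires once the server has started listening.
server.listen(port, hostname, () => {
  console.log(`Server running at http://${hostname}:${port}/`);
});
```

Saved under a name such as hello.js (a hypothetical file name), it would be started with node hello.js.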

Note that every call made in this example is asynchronous and non-blocking. The callback functions are invoked in response to events. The createServer callback handles a client request event and returns a response. The listen callback handles the listening event.

The Node.js library

As you can see at the left side of the figure below, Node.js has a large range of functionality in its library. The HTTP module we used in the sample code earlier contains both client and server classes, as you can see at the right side of the figure. The HTTPS server functionality using TLS or SSL lives in a separate module.
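As a rough illustration of the HTTP module’s two halves, the sketch below starts a small server and then calls it with the module’s own client. The /status path, the JSON reply, and the received variable are assumptions made up for this example.

```javascript
const http = require('http');

// Holds the body the client gets back (illustrative variable).
let received = null;

// Server side: http.createServer() returns an http.Server.
const server = http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ method: req.method, url: req.url }));
});

// Port 0 asks the operating system for any free port.
server.listen(0, '127.0.0.1', () => {
  const { port } = server.address();

  // Client side: http.get() issues a GET request and returns an http.ClientRequest.
  http.get({ host: '127.0.0.1', port, path: '/status', agent: false }, (res) => {
    let body = '';
    res.setEncoding('utf8');
    res.on('data', (chunk) => { body += chunk; });
    res.on('end', () => {
      received = body;
      console.log(res.statusCode, body);
      server.close(); // stop listening so the process can exit
    });
  });
});
```

Both the request handler and the response handler are ordinary callbacks, so the same event loop serves the client and the server in this one process.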


One inherent problem with a single-threaded event loop is a lack of vertical scaling, since the event loop thread will only use a single CPU core. Meanwhile, modern CPU chips often expose eight cores, and modern server racks often have multiple CPU chips. A single-threaded application won’t take full advantage of the 24 cores in a robust server rack.

You can fix that, although it does take some additional programming. To begin with, Node.js can spawn child processes and maintain pipes between the parent and children, similarly to the way the system popen(3) call (http://man7.org/linux/man-pages/man3/popen.3.html) works, using child_process.spawn() (https://nodejs.org/api/child_process.html#child_process_child_process_spawn_command_args_options) and related methods.
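Here is a minimal sketch of that parent-and-child arrangement. To stay portable it launches a second copy of Node.js itself rather than a system command; the one-line child script and the output variable are assumptions for illustration.

```javascript
const { spawn } = require('child_process');

// process.execPath is the path of the node binary that is running this script.
// The child simply prints one line and exits.
const child = spawn(process.execPath, ['-e', 'console.log("hello from child")']);

// The child's stdout arrives in the parent through a pipe, as a stream of chunks.
let output = '';
child.stdout.setEncoding('utf8');
child.stdout.on('data', (chunk) => { output += chunk; });

// 'close' fires after the child has exited and its pipes have been closed.
child.on('close', (code) => {
  console.log(`child exited with code ${code} and wrote: ${output.trim()}`);
});
```

The parent never blocks while the child runs; it just reacts to the data and close events when they arrive.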

The cluster module is even more interesting than the child process module for creating scalable servers. The cluster.fork() method spawns worker processes that share the parent’s server ports, using child_process.fork() (itself a special case of child_process.spawn()) underneath the covers. The cluster master distributes incoming connections among its workers using, by default, a round-robin algorithm that is sensitive to worker process loads.
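A clustered server can be sketched roughly as follows, assuming Node.js 16 or later (where cluster.isPrimary exists) and a script run from a file so that workers can re-execute it. Port 8000, the cap of two workers, and the reply text are arbitrary choices for illustration.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

const port = 8000; // arbitrary port for this sketch

if (cluster.isPrimary) {
  // Primary process: fork a few workers (capped at two here to keep the demo small).
  const workerCount = Math.min(2, os.cpus().length);
  for (let i = 0; i < workerCount; i++) {
    cluster.fork();
  }

  // Log when a worker goes away; a production server might fork a replacement.
  cluster.on('exit', (worker, code) => {
    console.log(`worker ${worker.process.pid} exited with code ${code}`);
  });
} else {
  // Worker process: every worker listens on the same port, which the primary shares.
  http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(`handled by worker ${process.pid}\n`);
  }).listen(port);
}
```

Requesting the port several times should show replies coming from different worker process IDs, although the exact distribution depends on the platform and load.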

Note that Node.js does not provide routing logic. If you want to maintain state across connections in a cluster, you’ll need to keep your session and login objects someplace other than worker RAM.

The Node.js package ecosystem

The NPM registry hosts almost half a million packages of free, reusable Node.js code, which makes it the largest software registry in the world. Note that most NPM packages (essentially folders or NPM registry items containing a program described by a package.json file) contain multiple modules (programs that you load with require statements). It’s easy to confuse the two terms, but in this context they have specific meanings and shouldn’t be interchanged.

NPM can manage packages that are local dependencies of a particular project, as well as globally installed JavaScript tools. When used as a dependency manager for a local project, NPM can install, in one command, all the dependencies of a project through the package.json file. When used for global installations, NPM often requires system (sudo) privileges.

You don’t have to use the NPM command line to access the public NPM registry. Other package managers such as Facebook’s Yarn offer alternative client-side experiences. You can also search and browse for packages using the NPM website.

Why would you want to use an NPM package? In many cases, installing a package via the NPM command line is the fastest and most convenient way to get the latest stable version of a module running in your environment, and is typically less work than cloning the source repository and building an installation from the repository. If you don’t want the latest version you can specify a version number to NPM, which is especially useful when one package depends on another package and might break with a newer version of the dependency.

For example, the Express framework, a minimal and flexible Node.js web application framework, provides a robust set of features for building single-page, multi-page, and hybrid web applications. While the easily clone-able Express code repository resides at https://github.com/expressjs/express and the Express documentation is at https://expressjs.com/, a quick way to start using Express is to install it into an already initialized local working development directory with the npm command, for example:

$ npm install express --save

The --save option, which is actually on by default in NPM 5.0 and later, tells the package manager to add the Express module to the dependencies list in the package.json file after installation.

Another quick way to start using Express is to install the executable generator express(1) (https://github.com/expressjs/generator) globally and then use it to create the application locally in a new working folder (myapp below is a placeholder folder name):

$ npm install -g express-generator
$ express myapp && cd myapp

With that accomplished, you can use NPM to install all of the necessary dependencies and start the server, based on the contents of the package.json file created by the generator:

$ npm install
$ npm start

It’s hard to pick highlights out of the half a million packages in the NPM registry, but a few categories stand out. Express is the oldest and most prominent example of Node.js frameworks, which I have recently discussed in these pages. Another large category in the NPM repository is JavaScript development utilities, including browserify, a module bundler; bower, the browser package manager; grunt, the JavaScript task runner; and gulp, the streaming build system. Finally, an important category for enterprise Node.js developers is database clients, of which there are more than 4,000, including popular modules such as redis, mongoose, firebase, and pg, the PostgreSQL client.

To summarize, Node.js is a cross-platform JavaScript runtime environment for servers and applications. It is built on a single-threaded, non-blocking event loop, the Google Chrome V8 JavaScript engine, and a low-level I/O API. Various techniques, including the cluster module, allow Node.js apps to scale beyond a single CPU core. Beyond its core functionality, Node.js has inspired an ecosystem of half a million packages that are registered and versioned in the NPM repository and can be installed using the NPM command line or an alternative such as Yarn.

Copyright 2018 IDG Communications. ABN 14 001 592 650. All rights reserved. Reproduction in whole or in part in any form or medium without express written permission of IDG Communications is prohibited.