Node.js Training And Node.js Hosting With Nodejitsu

Over the last couple of weeks, I've been playing around with Node.js - a server-side Javascript runtime environment (powered by Google's V8 engine). My tinkering has been all kinds of fun; but, I wanted to take my understanding to the next level. So, when I saw that Nodejitsu was holding a one-day intensive Node.js training course here in NYC, I jumped on the opportunity.

The training session was a total blast! The guys in the photo above (Paolo Fragomeni, Charlie Robbins, and Marak Squires) are the co-founders of Nodejitsu. They each took turns leading portions of the one-day course, which covered interactive demos, a "best practices" approach to Node.js application development, popular modules, and Q&A.

The size of the class was small - about 5 students - allowing the atmosphere to take on a very personal, hands-on, interactive feeling. Which was awesome because the guys running the class were clearly brilliant. I always enjoy being the dumbest guy in the room and I took full advantage of the situation, asking about a million questions. I was a sponge eager to absorb all of the information I could find.

In addition to evangelizing Node.js, Nodejitsu is a Node.js hosting company that provides a sort of "platform as a service" for Node.js applications. Coming from a ColdFusion / Apache / IIS world built on top of dedicated servers and Virtual Private Servers (VPS), the concept of hosting a Node.js application kind of confuses me. Luckily, Nodejitsu takes care of all the infrastructure for you. Deploying a Node.js application to Nodejitsu is as easy as running the following command line instruction:

jitsu deploy

This command creates a "snapshot" of the current directory, sends it off to the Nodejitsu servers, deploys it, boots up your Node.js instance, and then routes port 80 into your application. Dealing with ports in Node.js was definitely one of my biggest blind spots. When using Apache, any number of sites can be running on port 80; when creating a Node.js HTTP server, however, you have to tell it exactly what port to monitor. Connecting incoming HTTP requests to specific Node.js applications seemed to be something that would quickly present a huge challenge. Luckily, Nodejitsu handles all of this routing for you.
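To make the port problem a bit more concrete: one common way to let several applications share port 80 is to run a single front-end process on that port and have it choose a back-end port based on the Host header of each request. The sketch below shows only that lookup step. The routing table, the host names, and the resolveTargetPort() helper are all hypothetical, and this is not a description of how Nodejitsu's own router is implemented.

```javascript
// Hypothetical routing table: public host name -> local port of the
// Node.js application that should handle requests for it.
var routes = {
    "foo.example.com": 8080,
    "bar.example.com": 8081
};

// Given the raw Host header of an incoming request, return the local
// port to forward to, or null if no application is registered.
// NOTE: This simple version does not handle IPv6 literals.
function resolveTargetPort( hostHeader, routingTable ){

    if (!hostHeader){
        return( null );
    }

    // Strip an optional ":port" suffix and normalize the case, since
    // host names are case-insensitive.
    var hostName = hostHeader.split( ":" )[ 0 ].toLowerCase();

    if (Object.prototype.hasOwnProperty.call( routingTable, hostName )){
        return( routingTable[ hostName ] );
    }

    return( null );

}

console.log( resolveTargetPort( "foo.example.com", routes ) );     // 8080
console.log( resolveTargetPort( "BAR.example.com:80", routes ) );  // 8081
console.log( resolveTargetPort( "unknown.example.com", routes ) ); // null
```

A real front-end would then proxy the request to the chosen port; that part is left out here on purpose.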

Up until now, all of my Node.js experimentation has been done locally. But, with a Nodejitsu beta account (woot!), I was eager to try and actually deploy a Node.js application to the web. My application architecture still leaves something to be desired; but at least I am now in a position to see the fruits of my labor materialized and accessible.

To try out Nodejitsu deployment, I threw together a site called Gymspiration. The premise of the site is very simple: it uses YQL (Yahoo! Query Language) to grab the RSS feed from my Tumblr account; then, it takes the photos from the RSS feed and displays them on the homepage (one photo per page view). The site is supposed to "inspire" you to get to the gym and work out; hence the name, Gymspiration.

In addition to learning more about Node.js application deployment, I also took this as an opportunity to learn about Git source control and GitHub. In fact, the source code for Gymspiration is the very first thing that I have ever pushed to my GitHub account. I'm going to discuss the code below, but feel free to take a look at my repository.

There's a bit of CSS and HTML that go along with this site, but I'm not going to bother showing it here. The interesting stuff is the Node.js code. I will say, however, that I did use a module called "node-static" to serve up the static assets (CSS and Images) and manage 404 Not Found requests. This module implements best practices when it comes to streaming content and setting up HTTP cache headers.

To configure my Node.js application and initialize my HTTP server, I created a server.js file in my root directory. I originally tried to put this file in a "bin" directory, as I'm told is common practice. But, when doing that, I had trouble configuring the node-static module to serve from the proper paths.

My server.js file loads the tumblr.js module and uses it to construct the site's homepage, which displays a different image on each page view.

Server.js

// Include the necessary modules.
var sys = require( "sys" );
var http = require( "http" );
var fs = require( "fs" );

// This is the static file server module for our site.
var static = require( "node-static" );

// ---------------------------------------------------------- //
// ---------------------------------------------------------- //

// Load the Tumblr RSS proxy. When that is done, we have everything
// that we need to start running the server.
//
// NOTE: The RSS feed might not be available right away (requires an
// asynchronous HTTP request); however, as this happens only at the
// beginning of the application, I don't feel a need to wait for the
// available callback.
var tumblr = require( "./lib/tumblr" ).load(
    function(){

        // Log that the data has loaded.
        sys.puts( "Tumblr RSS data has loaded." );

    }
);

// Read in the output template. This site is only going to serve up
// this one page. Since this will only be read in once, just read it
// in SYNChronously - this way we can get the file contents back
// immediately and don't have to use a callback.
var htmlTemplate = fs.readFileSync(
    (__dirname + "/views/index.htm"),
    "utf8"
);

// Create our static file server. We can use this to facilitate the
// serving of static assets with proper mime-types and caching.
var staticFileServer = new static.Server( "." );

// Create an instance of the HTTP server. While we are using a static
// file server, we will need an HTTP server to handle the one dynamic
// page (the homepage) and to hand requests off to the static file
// server when necessary.
var server = http.createServer(
    function( request, response ){

        // Parse the requested URL.
        var url = require( "url" ).parse( request.url );

        // Get requested script.
        var scriptName = url.pathname;

        // Check to see if the homepage was requested. If so, then we
        // need to load it and serve it. If not, then we can just
        // pass the request off to the static file server (which will
        // handle any necessary 404 responses for us).
        if (scriptName.search( /^\/(index\.htm)?$/i ) == 0){

            // Get the next RSS feed item (might return NULL).
            var rssItem = tumblr.next();

            // Set the 200-OK header.
            response.writeHead(
                200,
                { "Content-Type": "text/html" }
            );

            // Write the index page to the response and then
            // close the output stream.
            response.end(
                htmlTemplate.replace(
                    new RegExp( "\\$\\{rssItem\\}", "" ),
                    (rssItem || "Oops - No data found.")
                )
            );

        } else {

            // If this isn't the index file, pass the control
            // off to the static file server.
            staticFileServer.serve( request, response );

        }

    }
);

// Point the server to listen to the given port for incoming
// requests.
server.listen( 8080 );

// ---------------------------------------------------------- //
// ---------------------------------------------------------- //

// Output fully loaded message.
sys.puts( "Server is running on 8080." );

The Tumblr module has to contact the YQL (Yahoo! Query Language) web service over HTTP. This means that the Tumblr module loads data asynchronously to the rest of the Server.js script. At first, I was tempted to wait until the Tumblr module was fully loaded before I configured the HTTP server. This sort of callback-based configuration chaining seems to be common practice in Node.js applications.

But, then I thought about it - would it really be so horrible if the HTTP server was ready before any RSS data was available? In this particular case, not at all. If no RSS data was available, my Tumblr module would simply return Null when asked for its next RSS item. In such a scenario, which would only occur while the application is booting up, requests for the homepage would simply display an "Oops - No data found." message.
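The fallback described here can be sketched as a small rendering function. Note that renderPage() is a hypothetical helper and not part of the Gymspiration source. It uses a replacer function rather than a replacement string so that any "$" sequences inside the RSS markup are inserted literally instead of being treated as replacement patterns.

```javascript
// Replace the ${rssItem} placeholder in the template with the given
// item, or with a fallback message while no RSS data is cached yet.
function renderPage( template, rssItem ){

    var content = ( rssItem || "Oops - No data found." );

    // A function replacer returns its value verbatim; a plain string
    // replacement would give patterns such as "$&" a special meaning.
    return template.replace(
        /\$\{rssItem\}/,
        function(){
            return( content );
        }
    );

}

// No data yet (for example, while the application is booting up).
console.log( renderPage( "<p>${rssItem}</p>", null ) );

// Data available.
console.log( renderPage( "<p>${rssItem}</p>", "<img src='a.jpg' />" ) );
```

The first call prints the "Oops" paragraph, and the second prints the paragraph with the image markup inside it.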

Living in an Asynchronous environment definitely requires a large mental shift. But, I see no reason why compromises can't be made in favor of more readable, less indented code. Especially when the alternative leaves the domain unavailable.

As I said before, the Tumblr module loads RSS feed data using a JSON-ized response from YQL. Once the module has been loaded, the exported load() method exposes a simple API consisting of a single method, next(). This method iterates over the cached RSS data, one item at a time.
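The iteration behind next() can be sketched as a small closure. The name createCycler() is hypothetical; and, unlike the module in this post (which advances its index before reading), this version starts with the first item in the collection.

```javascript
// Create an object whose next() method walks the given items one at
// a time, looping back around to the start when it reaches the end.
function createCycler( items ){

    var index = 0;

    return {
        next: function(){

            // With nothing cached, there is nothing to return.
            if (!items.length){
                return( null );
            }

            var item = items[ index ];

            // Move forward, wrapping back to zero at the end.
            index = ( ( index + 1 ) % items.length );

            return( item );

        }
    };

}

var cycler = createCycler( [ "a", "b", "c" ] );

// Prints: a b c a
console.log( cycler.next(), cycler.next(), cycler.next(), cycler.next() );
```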

In addition to loading the RSS feed, the tumblr module also sets up an interval - using the setInterval() function - that will reload the RSS feed data every 60 minutes.
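The reload schedule can be sketched on its own. Here, startRefreshing() is a hypothetical helper: it runs the loader once immediately, repeats it on a fixed interval, and returns the timer so that the schedule can later be cancelled with clearInterval().

```javascript
// Run the loader right away, then again every intervalMs milliseconds.
function startRefreshing( load, intervalMs ){

    load();

    return( setInterval( load, intervalMs ) );

}

// Count how many times the (stand-in) loader has been invoked.
var loadCount = 0;

var timer = startRefreshing(
    function(){
        loadCount++;
    },
    ( 1 * 60 * 60 * 1000 )
);

// Only the immediate load has happened so far.
console.log( loadCount ); // 1

// Cancel the schedule so that this example can exit.
clearInterval( timer );
```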

Tumblr.js Module

// Include the necessary modules.
var http = require( "http" );

// ---------------------------------------------------------- //
// ---------------------------------------------------------- //

// Define the Tumblr RSS URL that we are going to be feeding into
// this site.
var tumblrRSS = "http://bennadel.tumblr.com/rss";

// I am the locally cached RSS items.
var rssItems = [];

// I am the current index of the RSS items (each request will move
// this index forward one index).
var rssIndex = 0;

// ---------------------------------------------------------- //
// ---------------------------------------------------------- //

// I am a place-holder function that does nothing more than provide
// an invocation API (can be used to simplify callback logic).
function noop(){

    // Left intentionally blank.

}

// I am a helper function that simplifies the HTTP request,
// encapsulating the concatenation of response data packets into
// a single callback.
function httpGet( httpOptions, callback ){

    // Make the request with the given options.
    var request = http.request( httpOptions );

    // Flush the request (we aren't posting anything to the body).
    request.end();

    // Bind to the response event of the outgoing request - this will
    // only fire once when the response is first detected.
    request.on(
        "response",
        function( response ){

            // Now that we have a response, create a buffer for our
            // data response - we will use this to aggregate our
            // individual data chunks.
            var fullData = [ "" ];

            // Bind to the data event so we can gather the response
            // chunks that come back.
            response.on(
                "data",
                function( chunk ){

                    // Add this to our chunk buffer.
                    fullData.push( chunk );

                }
            );

            // When the response has finished coming back, we can
            // flatten our data buffer and pass the response off
            // to our callback.
            response.on(
                "end",
                function(){

                    // Compile our data and pass it off.
                    callback( fullData.join( "" ) );

                }
            );

        }
    );

}

// I load the RSS feed from TUMBLR using Yahoo! Query Language (YQL).
// When the RSS feed has been pulled down (as JSON), it is passed
// off to the callback.
function getRSS( callback ){

    // Define the YQL query. Since this is going to become a URL
    // component itself, be sure to escape all the special
    // characters.
    var yqlQuery = encodeURIComponent(
        "SELECT * FROM xml WHERE url = '" +
        tumblrRSS +
        "'"
    );

    // Make the request and pass the callback off as the callback
    // we are going to use with the RSS HTTP request.
    httpGet(
        {
            method: "get",
            host: "query.yahooapis.com",
            path: ("/v1/public/yql?q=" + yqlQuery + "&format=json"),
            port: 80
        },
        callback
    );

}

// I load the remote RSS feed locally.
function loadRSS( callback ){

    // Get the remote Tumblr RSS data. When it has been loaded,
    // store it locally.
    getRSS(
        function( rssData ){

            // Deserialize the RSS JSON.
            var rss = JSON.parse( rssData );

            // Make sure that there are RSS items.
            if (
                rss.query.results.rss.channel.item &&
                rss.query.results.rss.channel.item.length
                ){

                // Copy the RSS items reference to the local
                // collection.
                rssItems = rss.query.results.rss.channel.item;

            }

            // Whether or not we received any valid RSS items, we've
            // done as much as we can at this point. Invoke the
            // callback.
            (callback || noop)();

        }
    );

}

// ---------------------------------------------------------- //
// ---------------------------------------------------------- //

// I get the next item in the RSS collection (or return null).
function getNextItem(){

    // Check to make sure that we have at least one RSS item. If
    // not, then return null.
    if (!rssItems.length){

        return( null );

    }

    // Increment the index - each request further traverses the
    // RSS feed.
    rssIndex++;

    // Check the index bounds - if we've gone too far, simply loop
    // back around to zero.
    if (rssIndex >= rssItems.length){

        // Loop back around.
        rssIndex = 0;

    }

    // Return the description of the current item.
    return( rssItems[ rssIndex ].description );

}

// ---------------------------------------------------------- //
// ---------------------------------------------------------- //

// I initialize the component and return an API for data interaction.
exports.load = function( callback ){

    // Create a simple API for our Tumblr service.
    var localAPI = {
        next: getNextItem
    };

    // Load the RSS feed (grabs the remote RSS data and caches it
    // locally for serving). Once it has loaded, pass off the local
    // API to the callback. This way, the calling context can use
    // either the RETURN value or the CALLBACK value.
    loadRSS(
        function(){

            (callback || noop)( localAPI );

        }
    );

    // Set up an interval to check for a new RSS feed every hour.
    // This will completely replace the locally cached data.
    setInterval(
        loadRSS,
        (1 * 60 * 60 * 1000)
    );

    // Return a simple interface to our Tumblr object.
    return( localAPI );

};
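Two parts of the module above are a bit fragile: joining the chunks as strings can corrupt a multi-byte character that straddles two chunks, and the deep property lookup throws an error if the response does not have the expected shape. The sketch below is one way to harden both on Node.js versions that provide Buffer.concat() (which are newer than the 0.4.6 release pinned for this site). The parseFeedItems() helper is hypothetical; and the idea that a single-item feed may arrive as an object rather than an array is my assumption, not something confirmed in this post.

```javascript
// Turn the raw response chunks (Buffers) into an array of RSS items.
// Returns an empty array if the body is not valid JSON or does not
// contain query.results.rss.channel.item.
function parseFeedItems( chunks ){

    // Concatenate the raw bytes first, then decode once, so that a
    // character split across two chunks is decoded correctly.
    var json = Buffer.concat( chunks ).toString( "utf8" );
    var rss;

    try {
        rss = JSON.parse( json );
    } catch ( error ){
        return( [] );
    }

    // Walk down the expected path one level at a time.
    var node = rss;
    var path = [ "query", "results", "rss", "channel", "item" ];

    for ( var i = 0 ; i < path.length ; i++ ){

        if (!node || ( typeof node !== "object" )){
            return( [] );
        }

        node = node[ path[ i ] ];

    }

    if (!node){
        return( [] );
    }

    // ASSUMPTION: a lone item may be an object instead of an array.
    return( Array.isArray( node ) ? node : [ node ] );

}

// Build a sample body and split it in the middle of the two-byte
// character to simulate an unlucky chunk boundary.
var body = Buffer.from(
    JSON.stringify({
        query: { results: { rss: { channel: { item: [ { description: "caf\u00e9" } ] } } } }
    }),
    "utf8"
);
var splitAt = ( body.indexOf( Buffer.from( "\u00e9", "utf8" ) ) + 1 );
var chunks = [ body.subarray( 0, splitAt ), body.subarray( splitAt ) ];

console.log( parseFeedItems( chunks )[ 0 ].description );
console.log( parseFeedItems( [ Buffer.from( "{\"query\":{\"results\":null}}" ) ] ).length );
```

With this approach, a failed or empty feed refresh simply yields an empty array, which the calling code can ignore in order to keep serving the previously cached items.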

Putting this all together requires both the above code and the node-static module (for serving static files). Only, I didn't actually deploy the node-static module to Nodejitsu. Rather, I simply declared it as a dependency and the Nodejitsu deployment process automatically took care of installing all of the required dependencies. Nodejitsu knows how to do this thanks to the package.json file in my root application directory. The package.json file is a CommonJS standard for outlining Node.js modules and their dependencies.

Package.json

{
    "name": "gymspiration",
    "subdomain": "gymspiration",
    "description": "Pictures of hot, sexy women with intimidating muscle that will make you feel like you need to go workout ASAP!",
    "version": "0.5.0-1",
    "author": "Ben Nadel <ben@bennadel.com> (http://www.bennadel.com)",
    "main": "./server.js",
    "scripts": {
        "start": "server.js"
    },
    "engines": {
        "node": "0.4.6"
    },
    "dependencies": {
        "node-static": "0.5.4"
    }
}

As you can see, this JSON object has a "dependencies" member. During the deployment process, Nodejitsu looks at these dependencies and automatically installs any missing modules (using NPM - Node Package Manager - I believe).

Pretty cool stuff! Node.js application architecture still confuses me greatly and I admittedly have a load of blind spots; but, slowly and steadily this stuff is starting to come together. And, platform / hosting services like Nodejitsu are making the transition from development to production about as painless as anyone could ask for.

Reader Comments

As amazing as your blog posts are and as awesome as a developer you are, it is really refreshing to read about you taking the "student" approach to new endeavors. It's a great reminder that no matter how much you think you know, there is always much more you don't; very humbling and inspiring. Thanks for that.

Thanks my man! This stuff is really very new to me - but it's exciting. It's interesting to compare and contrast it to things like ColdFusion where there's an incredible amount of infrastructure built into the core application server. With Node.js, there's a lot there too - but, you're really left with a ton of freedom (aka. responsibility) to put things together.

Hopefully, at the end of the day, I can get some good cross-over of ideas between the various programming languages; and, if this can help me write better Javascript in general, well then that sort of rocks :)

I've been reading a lot about Node, and seeing even more people talking about how awesome it is. But I have one question.

What's the point?

Honestly, I'm not understanding the benefit of using Node over Apache (or does Node run on top of Apache?). It's all well and good to use Javascript for "server side" code, but why would someone make a choice to use Node over ColdFusion?

Excellent blog post by the way Ben. I've told you before, but I really appreciate the fact that you include not only your code, but your thought processes during development. I think it helps other people relate to their own personal approach.

I think that's a spot-on question. I think there are types of applications that work really well for Node.js environments; and then types that don't. My sample app is *definitely* an application that would probably have been way easier to put together in ColdFusion. In fact, I don't really see anything in my code that is especially geared towards asynchronous processing.

Really, I think the place where this will shine is when you involve a lot of "Realtime" interactions with an application. At least, that seems to be what many people are using it for.

And, when it comes to Apache, I also asked them that question in class on Saturday - can Apache sit above Node.js and provide a connector from port 80 to whatever port a Node.js app was running on.

They said that would be a decidedly bad idea. Node.js is "evented", Apache is "threaded". When you start to mix the two paradigms, things apparently get much more complicated.

At Nodejitsu, apparently they have developed a routing system that maps port 80 to the port of your application so you can get various Node.js applications running without having to worry about port conflicts. Or something... I didn't fully understand what they were saying.

Ultimately, I'm certainly *not* looking to give up ColdFusion - it's just too awesome. I'm just looking to get some good learning on :)

Think about when you assign a click event to a button with client side JavaScript. Does your code there after have to wait till the click event has fired and completed...? No, the callback is just thrown on the event loop within the browser and will fire when triggered and your application continues on.

Now think about taking that paradigm to the network layer. Http calls, database calls, file IO, etc. Everything just got a lot faster and scales a bit better than traditional threading. Ya know?

I wanted to chime in on the approach that Ben started to outline for running multiple node.js applications on a single machine. Nodejitsu wrote and maintains a high-performance reverse proxy written in node.js.

In a production environment you would have your reverse-proxy running on port 80. Then each of your applications would run on a separate port (8080, 8081, ...). We wrote node-http-proxy to be capable of routing based on the incoming HTTP `host` header, so:

- requests to foo.com can be forwarded to port 8080
- requests to bar.com can be forwarded to port 8081

And so on. The reasons for doing this are because there is a limit in the V8 Javascript Virtual Machine's heap size (i.e. total memory available to the node.js process). So as your application needs to scale, you need to break into separate processes.

Languages like ColdFusion and C don't have this limit so single process VHost approaches make more sense there.

Ah, cool stuff! Thanks for clarifying - I'm just barely wrapping my head around all of this stuff. And, when you get to server configurations, I'm a bit out of my comfort zone even when it comes to ColdFusion and the like.

ColdFusion also has size limitations, though not on a per-app basis. All of the sites running on a ColdFusion instance contribute to the same JVM heap, which I think on a 32-bit machine is limited to like 1.2 gigs or something..... of course, I'm basically pulling all of that out of the air :) I have no real understanding of anything at that level.