Web Served is nearly over! This is the last regular part in the series, and so I want to kick it up a notch—several notches, actually!—and get away from installing tired old PHP applications. Instead, we're going to go a little nutty and install Etherpad (formerly known, and still sometimes referred to as, "Etherpad Lite"). Some parts of this piece have previously appeared on my blog, but I'm aiming to expand a bit on that write-up and give you all something fun to play with.

Etherpad is a tool born out of the ashes of Google Wave, the abandoned and decommissioned real-time "e-mail replacement" that Google launched and then quickly killed. Etherpad is a real-time collaborative document editing application: it presents the user(s) with a blank canvas and lets them all make changes to it, and everyone's changes are visible to everyone else in nearly real time.

It's a lot like Google Docs, except that the changes are generally shown a whole lot faster, and everyone's changes are tracked and can be replayed like a VCR. Plus, it's yours—that's one of the main points of this entire series, after all! Rather than living in Google's ephemeral application cloud, Etherpad lives on your server and you always have access to it (at least, assuming you have access to your server).

Why would I even want this?

Here at Ars, we use Etherpad with some regularity for collaborative editing. It's really handy to be able to call up a coworker, create a new "pad," paste in a document, and start talking through it and making changes as you talk. You can rewind and fast-forward back and forth across the document in various changed states to watch how the piece evolves through editing.

It's also handy for quickly gathering a bunch of ideas from a bunch of different people. In this way, it fills some of the same functions as Google Drive applications do, except you're hosting it yourself. As someone who regularly uses Google Drive for work (we keep our story dashboard in a Google Drive document and use it to coordinate who's writing what throughout the day), it can occasionally be a frustrating tool to deal with—like when it loses the changes you just made and tells you that you must refresh the page to continue editing. Argh.

Another reason to get this set up is that it's a neat application that relies on a number of different technologies to function, including Node.js (a server-side JavaScript execution engine) and Redis (a fast in-memory NoSQL database), and setting up new, cool technologies is fun!

Going nuts with prerequisites

So, let's get started on them prereqs, because there are a lot! Etherpad is a Node.js application: it's written in JavaScript and relies on Node.js to run it. Node.js is gaining a lot of traction in Web application developer circles because it's fast, and it lets developers use their existing bits of JavaScript knowledge to create "real" (or at least executed on the server instead of the client) Web-based applications. There's a version available in Ubuntu's repositories, but we're going to add a custom PPA to ensure we're running the latest version.

We're also going to be installing Redis, which Etherpad uses as its database. Redis is a NoSQL database—specifically, it's a key-value store (like memcached, which we installed back in part 3). Rather than operating like a traditional database with data arranged in a table and addressed through some form of structured query, Redis just stores a big list of keys and their associated values (like, "color" for a key and "red" for a value, for example). The big draw for Redis over other NoSQL databases is that Redis is entirely RAM-based. There are drawbacks to this (what if you lose power?), but the big advantage is that it's blazing fast. Because Etherpad is pretty database-access-heavy, giving it a fast place to stash its data is a good idea. Plus, cool technology is cool!
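To make the key-value idea concrete, here's a tiny sketch that uses a Bash associative array as a stand-in for Redis. Nothing here talks to Redis itself; the kv_set and kv_get helper names are made up for illustration, and the comments note the redis-cli commands that would do roughly the same thing against a real server.

```shell
#!/usr/bin/env bash
# Illustration only: a Bash associative array standing in for a key-value store.
# kv_set and kv_get are hypothetical helper names, not part of Redis.
declare -A store

kv_set() {
  # Store a value under a key (roughly: redis-cli SET color red)
  store["$1"]="$2"
}

kv_get() {
  # Look up a value by its key (roughly: redis-cli GET color)
  printf '%s\n' "${store[$1]}"
}

kv_set color red
kv_get color
```

There's no table and no query language involved: you hand over a key, and you get back whatever value was stored under it.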

As with Node.js, we need to add a PPA to ensure that we're installing the latest version of Redis. Use the following commands to add the necessary PPAs to your server:

Once those two PPAs are in, update your package listing and install both Node.js and Redis:

sudo aptitude update
sudo aptitude install nodejs redis-server

Etherpad download

Etherpad itself is available from Github, and we're going to clone it down onto our local machine, just as we did with Vanilla in part 6.

Head to your Nginx directory at /usr/share/nginx, and use git clone to create a local copy of the Etherpad application from its git repository. To save time, we're going to first elevate our shell to root privilege, since our regular user account should only have read access to the directory we're in:

That will kick off a lot of activity, and when it's done, you'll have an etherpad-lite subdirectory with the git repo's contents.
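For reference, the clone step described above might look something like the sketch below. It assumes the project's repository lives at https://github.com/ether/etherpad-lite.git, and it wraps the command in a hypothetical clone_etherpad helper so that nothing touches the network until you call it from your root shell.

```shell
#!/usr/bin/env bash
# Assumption: Etherpad's public repository URL; adjust it if the project has moved.
ETHERPAD_REPO="https://github.com/ether/etherpad-lite.git"
# The Nginx directory named in the text above.
NGINX_DIR="/usr/share/nginx"

# Hypothetical helper: clone the repository into $NGINX_DIR/etherpad-lite.
# Call it from a root shell, because a regular user can't write to $NGINX_DIR.
clone_etherpad() {
  git clone "$ETHERPAD_REPO" "$NGINX_DIR/etherpad-lite"
}
```

Calling clone_etherpad as root should leave you with the etherpad-lite subdirectory.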

Note that we're not actually putting our Etherpad installation inside our Nginx Web root. This is because Nginx never actually needs to directly serve any files from inside the directory; the pages will all be generated by Node.js and proxied through Nginx...with one teeny, tiny exception, which we'll get to at the very end.

Make Etherpad run as a service

We've got our local copy of Etherpad, but it's not going to do us much good just sitting there—we need to set it up to run, by itself, as a service.

This is one of the advantages that prebuilt packages bring—when we installed Redis just now, for example, we didn't have to think about what to do to make Redis run in the background by itself, because the Redis package took care of that for us. It created a local account for the Redis processes to run as, it set up a place for logs to be stored, and it took care of the init tasks for us so that Redis gets started when the computer boots up. We want the same kind of stuff for Etherpad, but we're going to have to create it ourselves.

Create a user

At the root-elevated bash prompt, create an Etherpad user named "etherpadlite" by running the following command:

useradd -M -s /bin/false etherpadlite

This will create the user. The -M switch ensures that a home directory is not created (service accounts don't need home directories), and -s /bin/false tells the system to use /bin/false as the account's default shell—which is fine, because no one will ever log on as this particular user and no shell is needed.

Now that we have a user, there is one important step we need to make sure to do: we must change the ownership of the cloned Etherpad directory to that user. Otherwise, we won't be able to actually get Etherpad up and running! To do that, use the chown command, like this:

chown -R etherpadlite:etherpadlite /usr/share/nginx/etherpad-lite/
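If you'd like to double-check that the ownership change took, something like the following sketch can help. The check_owner function is a made-up helper, not a standard tool; it uses find to look for anything under a directory that isn't owned by the expected user and group. The demo runs against a throwaway directory owned by the current user, so it's safe to try anywhere.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if everything under DIR belongs to OWNER:GROUP.
check_owner() {
  local dir=$1 owner=$2 group=$3 strays
  # find prints any path whose user or group does NOT match
  strays=$(find "$dir" \( ! -user "$owner" -o ! -group "$group" \) -print)
  [ -z "$strays" ]
}

# Demo on a scratch directory, using the current user's numeric IDs.
demo=$(mktemp -d)
mkdir -p "$demo/src"
touch "$demo/src/example.js"
chown -R "$(id -u):$(id -g)" "$demo"

if check_owner "$demo" "$(id -u)" "$(id -g)"; then
  echo "ownership looks right"
else
  echo "some files have the wrong owner"
fi
rm -rf "$demo"
```

On the server itself, the call to make would be check_owner /usr/share/nginx/etherpad-lite etherpadlite etherpadlite.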

Create logs for Etherpad

We also need to create a log file location for Etherpad to use and set up a log rotation policy for those logs so the operating system automatically tends to them and keeps them from growing too large.

Still as root, run these commands to create a directory for your Etherpad logs and to make sure that location is owned by your Etherpad account so that the application has the permissions to actually write things there:

The chown command here sets ownership of the directory to the "etherpadlite" account, and also to the "adm" group (which needs to be there so Ubuntu can also change files in the directory).
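Here's a rough sketch of that kind of setup. The /var/log/etherpad-lite path is an assumption (use whatever location you've picked), and make_log_dir is a hypothetical helper name. The demo call at the bottom targets a scratch directory with the current user's IDs so that it runs without root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a log directory holding the two Etherpad log files,
# then hand the whole thing to the given owner and group.
make_log_dir() {
  local dir=$1 owner=$2 group=$3
  mkdir -p "$dir" &&
    touch "$dir/access.log" "$dir/error.log" &&
    chown -R "$owner:$group" "$dir"
}

# Demo in a scratch location so nothing under /var/log is touched.
demo_dir=$(mktemp -d)/etherpad-lite
make_log_dir "$demo_dir" "$(id -u)" "$(id -g)"
ls "$demo_dir"
```

As root on the server, the real call would be along the lines of make_log_dir /var/log/etherpad-lite etherpadlite adm.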

Next, we need to tell logrotate about this location. Logrotate is a utility, run on a schedule by cron, that tends all the log files on the system; it can be configured to start a new log file for an application every day, to compress and archive older log files, to delete the oldest log files, and to do a whole bunch of other things.

To configure logrotate to also watch over our new Etherpad directory and to do its magic on files that show up there, we need to create an etherpadlite file inside of the /etc/logrotate.d/ directory, and add the following to that file:

This tells logrotate that there will be two files it should watch (one named access.log and the other named error.log) and then tells it what to do with them. These options can be adjusted however you'd like; for a good explanation of what does what, try typing man logrotate at the bash prompt.
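A policy along those lines might look like the sketch below. The log path is an assumption, and the specific choices (daily rotation, seven archives kept, compression) are just illustrative defaults rather than required settings. The snippet writes the policy to a scratch file so you can review it before copying it to /etc/logrotate.d/etherpadlite.

```shell
#!/usr/bin/env bash
# Write an example logrotate policy to a scratch file for review.
# Assumption: the Etherpad logs live in /var/log/etherpad-lite.
conf=$(mktemp)
cat > "$conf" <<'EOF'
/var/log/etherpad-lite/access.log /var/log/etherpad-lite/error.log {
    # Rotate once a day and keep seven old copies
    daily
    rotate 7
    # Compress archives, but leave the most recent one uncompressed
    compress
    delaycompress
    # Don't complain about a missing log, and skip empty ones
    missingok
    notifempty
    # Copy and then truncate in place, so a process that keeps
    # the file open carries on writing to the right file
    copytruncate
}
EOF
cat "$conf"
```

Once the file is installed, logrotate -d /etc/logrotate.d/etherpadlite does a dry run that reports what would happen without changing anything.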

Setting up a service

We have a user and a log file location, and now we need to create the script to get Etherpad automatically started. Throughout this series, I've been using /etc/init.d/servicename start and stop when starting and stopping processes, but that's really the old way to do things. Ubuntu supports (and prefers you to use!) the new, fancy Upstart method. So, rather than creating an old-school init script for Etherpad, we'll make an Upstart job.

This is actually pretty easy: create a file named /etc/init/etherpad-lite.conf and paste in the following:
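What follows is only a sketch of such a job, not a verified listing. It assumes Upstart 1.4 or later (for the setuid and setgid stanzas), a /var/log/etherpad-lite log location, and that the cloned repository provides a bin/run.sh launcher. The snippet writes the job to a scratch file so you can look it over before copying it into place.

```shell
#!/usr/bin/env bash
# Write an example Upstart job to a scratch file for review.
# Assumptions: log directory path, bin/run.sh launcher, Upstart 1.4 or later.
job=$(mktemp)
cat > "$job" <<'EOF'
description "etherpad-lite"

start on runlevel [2345]
stop on runlevel [!2345]

# Restart the process if it dies unexpectedly
respawn

# Drop privileges to the service account created earlier
setuid etherpadlite
setgid etherpadlite

chdir /usr/share/nginx/etherpad-lite

script
    exec bin/run.sh >> /var/log/etherpad-lite/access.log 2>> /var/log/etherpad-lite/error.log
end script
EOF
cat "$job"
```

Once the file is in place as /etc/init/etherpad-lite.conf, running start etherpad-lite from the root shell asks Upstart to launch the job.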

This might take a minute or so to complete, since it's the first time Etherpad has been run. While this is going on, you can check the two Etherpad log files you'd previously created; if they're filling up with stuff, that's a good sign that the application is working.

If the job is able to start successfully, you'll see an acknowledgement at the command line:

But we're only about halfway done. We still need to set up Nginx so that it proxies traffic to Node.js, so that you don't have to have your bare instance of Node.js facing the Internet. We also need to actually do a bit of configuration on Etherpad, because right now it doesn't know anything about how we want it to use Redis. In fact, if you look at the Etherpad error log right now, it's probably throwing the same Error: EACCES, open 'var/dirty.db' error over and over again—don't worry, we'll fix it!

24 Reader Comments

Etherpad is super useful. The only real weakness is there's no directory structure; I keep a pad just to track our pads. There's also no easy way to nuke a pad, and since the history is always embedded in it, that could be awkward if the wrong person is invited to it and can scrub back.

One thing I found: Etherpad's use of port 9001 appears to conflict with Tor.

Also, the rewrite line needs a space between "rewrite" and "^".

Thanks! Fixed. Always hard to ferret out all the typos in a piece with this many code snippets and extra formatting bits.

Looks like you're right about the potential conflict. You can change the port Etherpad runs on by editing the "port" setting at the top of "settings.json". I don't think I'll update the piece, though—I figure most folks savvy enough to be running Tor probably can figure it out.

I have really, really enjoyed this series. It has been a really great jumping off point in creating my own personal website that holds notes and other things that I can host on my own machine without relying on outside cloud services.

Etherpad lite seems like a neat tool. I've never used it but I could definitely see the benefits. I do hope that you continue with intermittent tutorials on webhosting. In particular, I think it would be really cool to see a tutorial on setting up a dropbox alternative (such as owncloud) to really extend the functionality of a home server.

Thanks for a great series, Lee - you had a good run there. Personally I would have loved a primer on setting up postgres + ruby too, but I guess people should be able to figure that out on their own after completing the previous entries. Thumbs up!

@autodefenestrator You can change the port of Etherpad by hitting up /admin/settings or by editing settings.json.

@Aurich Lawson You can use the API or a plugin to see pads available and to delete existing pads.

Visit /admin/plugins to see the available plugins and one-click install them.

To get to the /admin/plugins page you need to set a username/pass in settings.json. It'd be great if Lee could do a quick write-up about how to install plugins on Etherpad and maybe an author's pick of his favorites!

Having tried out Etherpad about a year ago, we settled on Gobby, using an infinote server running on an Ubuntu install locally. The latest infinote server is rock solid (it's been running since November without a single outage).

And Gobby is, in my opinion, far superior, so long as you do not need document history. It has built-in chat, full document support (folders, many documents), secure communication, great performance, and so on. Etherpad felt very clunky, even when on a local network.

Gobby has replaced instant messaging and chatrooms for our development team. We have documents of the issues and talk right on the ticket itself, which allows for many simultaneous discussions, rather than the 1-2 you can have going in a chatroom.

I've currently got Subethaedit constantly running on my Mac server. I love 'collaborative note-taking' for brainstorming; sometimes I prefer it to face-to-face meetings, depending on the level of detail and time limitations.

When I get to the npm install redis step, I get a series of errors (starting with something along the lines of 'ueberDB is not a valid file name') and the ueberDB redis connector fails to install correctly in the /ueberDB directory.

I got around this by installing the redis connector elsewhere then moving it to the correct directory, as follows:

(be careful with that last command - if you install the redis connector in a directory other than /usr/share/nginx, and that directory already had a 'node_modules' sub-directory, then skip the last command or you'll delete whatever else was already in the sub-directory)

I'm not sure why this workaround was necessary, though. There was already a node-redis module in ueberDB/node_modules, so maybe etherpad/node have preinstalled a different connector and don't want you to use the other redis connector? Any ideas?

Having a wee bit of a problem and I've been trying to sort this out all week, but I'm a bit stuck at one part.

Etherpad works fine when I get it running through the server and port 9001 (e.g. http://192.168.1.2:9001), but when I load it through /etherpad/ it appears that the CSS isn't loading ("New Pad" icon not showing up and format is all over the place on the pad page), and when I create or open a pad I get a flash of an error about not having permission to access something and then "'require' is undefined in http://192.168.1.2/etherpad/p/Test (line 540)" and can't do anything.

Checked nginx access and error logs. Unfortunately, the error logs don't point to anything, and the access logs give me a 301 redirect on the CSS and JavaScript files.

Think I've found the problem.

"get /wiki/Main_Page HTTP/1.1" 200 5079 "http://192.168.1.2/etherpad/p/Test"

It wouldn't be the line from the wiki.conf file, would it?

Nice article, does a good job of explaining *why* you made various choices during the install, which I like. But a quick comment about the SSI of the text file and the cronjob. Wouldn't it be more efficient to just have nginx call the script when someone hits the page, and to cache the result for 60 seconds or so?

This way, you're not endlessly re-writing the output file, and you only put in updates when someone actually hits the etherpad web page. No sense doing work all night from 1am to 6am if no one is actually using the system, is there?

Change it to "location ~* /wiki/\*.(js|css|png|jpg|jpeg|gif|ico)$" and it seems to have worked.

I had the exact same behavior on my installation. I tried your changes and it appears to be working now. Still, I'm at a loss on 'why' would a wiki location have any effect on the pad's css... Let me know the reason if you have the time, and thanks a million for posting it...

I am again having a weird issue that pops up a 502 Bad Gateway when trying to access and run etherpad. Fortunately for me it's only affected etherpad and not the entire site this time. Here's a run of my /etc/nginx/sites-available/www file

I'm also in the same boat. I'm able to fix up etherpad with this fix but then the wiki loses images. Thanks for the fantastic series by the way. Had it saved a long time ago and recently had some time and came across my old links.

This triggered an idea I've been wondering about for a few days, but haven't found the right steps of instructions to come up with a viable solution.

How would you set up nginx as a reverse proxy for an apache web server I'm running on a Mac Mini Server as well as a web server for all the applications you've covered here? Do I need two instances of nginx or just a new location with the proxy_pass command? I'm at an impasse with the basic concept and the specific commands. Thanks.--a

Lee Hutchinson / Lee is the Senior Reviews Editor at Ars and is responsible for the product news and reviews section. He also knows stuff about enterprise storage, security, and manned space flight. Lee is based in Houston, TX.