April 16, 2011

I want to eat at your restaurant, but I need to find you first. You need a website. And not just any website. It has to work on a mobile device. If not, you might as well not exist anymore.

In the past few months, I have been out with my friends several times, spontaneously decided to get lunch, dinner, etc., and had a bad experience with a restaurant's website.

Today's typical dining experience starts with a bit of research on a phone, looking for a menu. I start with the maps app on my iPhone, search for something like "Indian" or "Italian" around my current location, find a restaurant, go to the website, and I'm greeted with a message like "you must install Flash version 9 or higher" - if I'm lucky enough to find your website at all.

You know what the cool kids call that? EPIC FAIL.

Even if I can see the website, I'm more likely to find a picture of the owner's family posed as if slaving over a hot stove, the history of the restaurant starting with the owner's mother immigrating in 1977, or a stock photo of a slice of pizza. If there is a menu at all, it's an out-of-date PDF file I'm struggling to view on my tiny screen.

If you care about your restaurant, you should want your customers to find you.

What information should be at their fingertips?

When visiting a restaurant's website from my phone, I want:

location (including phone number)

hours

menu

Anything else is just noise on a mobile phone... If I'm planning a catered event, I'll do that from my computer.

But I bought a package deal...

Yes, I'm aware that ground-up website design can be expensive, and restaurants often fall for the "Website for $29.95 a month" marketing ploy, where you get a "billboard-at-a-url" and a scanned menu. You are losing out. You need a real website.

Here's why:

Customers can find what they are looking for

You wouldn't imagine not having a sign outside your restaurant - a website is the same thing to the person actively looking for you. Restaurants think nothing of printing thousands of folded menus and sticking them on every car in a parking lot or every door in a neighborhood. Your website can have a far greater reach than that, and with immediate impact.

You can have a CMS to keep the menu up to date

For most restaurants, the menu on the website is like a time capsule. Dishes come and go... prices change... why shouldn't your website reflect that? With a content management system behind your menu, you can log in to update prices, add specials, etc. Why shouldn't your website be able to tell people what the soup of the day is?

You can tailor your SEO strategy

Try this: Put the name of your city and the type of food your restaurant serves into Google. (If you are in a big city, try your zip code). Is your restaurant's website on the first page? It should be. If you put in the name of your restaurant, are you the first result? If you're not, the first result is probably something like a restaurant review - and you have no control over that! Your worst customer could be the first thing people learn about your restaurant.

With your own website, you can easily influence how search engines view you. That is a topic larger than this blog entry, though - it's called 'Search Engine Optimization' or 'Search Engine Marketing'.

Further, by having your website as a real website (as opposed to Flash or PDF), all your dishes become keywords Google can index. Maybe I'll find you because I searched on "Vindaloo" or "Pad Thai". All those words are pure SEO gold.

You can target your market more effectively

Visitors to your website can be tracked more closely than cattle with tags in their ears. Are most of your visitors coming from a particular neighborhood? Maybe that's the place to target with your door hangers. Is there a neighborhood that's not visiting? Send them a coupon for a free cheesy bread.

Ok, so how do I do it?

I'm not trying to sell you anything here - in fact, our typical projects are much larger than restaurant websites. I just want a better dining experience, so I'm offering you some technical advice:

Use 'Open Technology' standards

Make sure your website firm designs with html/css/javascript. Anything that requires a browser plugin (Flash, Java, and most likely PDF) is a non-starter for your marketing purposes.

Make sure your website looks good on small devices

I'll spare you most of the technical details, but with the kinds of tools we have today, a good website design scales down to work on phones, and scales up to work on big screens. On big screens, sure - show me your family in the kitchen. On my cell phone, echo the design with logos, fonts, and colors, but just give me the menu, hours, and location.

April 12, 2011

We've all seen the online CAPTCHAs that make you type in a pair of garbled words to prove you're not a spam-bot. What you might not know is that by doing this, you are participating in crowd sourcing, a way to gather small amounts of data from large numbers of people. In this case, many of those words are used to digitize old books from the Library of Congress. When a computer can't read the scanned text, it "crowd sources" the unknown word. Once two or more people enter the same word for the garbled image, the computer knows what that scanned word is. It's pretty amazing to think a few seconds of time from millions of people can create public works of real value. This is the value of "crowd sourcing".

Can your business use crowd sourcing? It works well for things that are "hard" for a computer to figure out, but easy for humans - for example, "which of these pictures contain pasta dishes?" Amazon has a service called 'Mechanical Turk' which farms out work like this to interested people in exchange for micro-payments. Say you're willing to pay a penny a picture to have your images categorized or tagged. Amazon's service will match up your job with interested parties. Amazon gives you back the results just as if a computer had processed the images, but instead the work was done with human intelligence. The results can be integrated into your data, allowing your systems to act on the new information. In fact, we've used services like this to verify that profile photos on a social network were all 'G-rated', and to generate transcripts for video files.
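To make the workflow concrete, here is a minimal Ruby sketch of that photo-moderation loop. MturkClient is a hypothetical stand-in (a real app would talk to Amazon's API, perhaps through a gem); the point is the shape of the interaction, not the exact calls:

# A hypothetical wrapper - a real implementation would POST the task
# to Mechanical Turk and handle the micro-payment.
class MturkClient
  def create_hit(question, reward_cents)
    puts "Posted HIT: #{question} (reward: #{reward_cents} cent(s))"
  end
end

turk = MturkClient.new
photo_urls = ["http://example.com/photos/1.jpg",
              "http://example.com/photos/2.jpg"]

# One HIT ("Human Intelligence Task") per photo, at a penny each.
photo_urls.each do |url|
  turk.create_hit("Is the photo at #{url} G-rated?", 1)
end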

April 10, 2011

Fullcalendar is an impressive calendar widget for jQuery, showing you events in a month, week, or day view, and letting you drag them around to reschedule them.

It needed a good sample app showing how to use it in Rails 3, so I wrote one.

Rails3_fullcalendar is a rails application that tries to live up to the Rails 3 notion of RESTful ideals. It does this thanks to the power of Adam Lassek's jquery.rest plugin, which handles all of the wrapping of the jquery.ajax method into create, read, update, and delete methods, and also deals with the complexity of the authenticity tokens rails uses to prevent cross-site request forgery attacks.

Download it from github, run the migrations, and fire up the app. You'll see an empty calendar, and scaffolding for adding events. When looking at the calendar view, you can drag events around to change their date and time. Dig into the code to see how it works.

I'm going to need to prepare a longer rant regarding timezones, but fullcalendar does something nice - it respects timezones (as long as you send it the right time format in the json), and normalizes everything to the local timezone on display... Now if I schedule something for 12:00 noon in Washington DC during Eastern Daylight Time, it'll end up on the server as 16:00:00 UTC, and if viewed by someone in California will show up on their calendar at 9:00am.
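Here is roughly what that looks like on the Rails side - a sketch, not the exact code from the repo, and the starts_at/ends_at column names are my own. The key is serializing times as ISO 8601 in UTC and letting fullcalendar localize them:

# app/controllers/events_controller.rb (illustrative sketch)
class EventsController < ApplicationController
  def index
    @events = Event.all
    respond_to do |format|
      format.html
      format.json do
        # ISO 8601 times in UTC; fullcalendar normalizes them to the
        # viewer's local timezone on display.
        render :json => @events.map { |event|
          { :id    => event.id,
            :title => event.title,
            :start => event.starts_at.utc.iso8601,
            :end   => event.ends_at.utc.iso8601 }
        }
      end
    end
  end
end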

I have some future plans for this demo - including having multiple schedules, and then showing how to overlap them on the same calendar, with the events from each schedule showing up in their own color. With a view like this, someone can have a calendar 'command center' view, dragging around events from different people's schedules until conflicts are resolved.

April 07, 2011

Last night we capped off a great weekend at RubyNation 2011 with a wrap-up
dinner (thanks Dr. Nic!). I'm proud that so many CodeSherpas had a hand in
putting together this conference over the years. Anyways, during the dinner a
fellow attendee, Robert, asked me for my thoughts on RESTful search. "It's just
the query string on the index, right?" That is a very interesting question, and
as it turns out there's more than one way to do it...

We are all familiar with search and use Google every day. Perform a Google
search and it gives you results. But "search" itself is a verb - an activity
(bear with me if this seems a bit pedantic).

When using Google, we are really interested in web pages or sites related to our
search term - a set of resources related to that term.

We will use this as our guide to develop a RESTful search service. I will make
use of some Rails conventions in my examples below, but the concepts here apply
to any RESTful design.

Let's say I'm building a web site for a fantastic Mini dealership, and my site
includes search for all cars on the lot. I will need to answer questions like:

Do you have any red & white Mini Coopers with superchargers?

Option 1: The query string!

Yes, this is very obvious.

GET => /mini_cooper_inventory

Returns my entire inventory.

GET => /mini_cooper_inventory?color="red"&color="white"&supercharger="true"

Returns the portion of my inventory matching the given query parameters.

Option 2: A filter resource.

Aha! I have played a trick on you in Option 1. We were already using a filtered resource to see only those Coopers in our inventory. We excluded cars that have already been sold, or those on order or in transit. To think of filters RESTfully:

GET => /mini_coopers

Returns all Mini Coopers that have anything to do with my dealership: ones that I have sold, ones that are in inventory, and ones on order. Because these are such commonly used resources, instead of doing a query every time I wanted to look at my inventory:

GET => /mini_coopers?status="inventory"

I have dedicated another resource:

GET => /mini_cooper_inventory

In Rails you would perform the search in the #index method of this controller, and then narrow by any other query parameters. A note of interest: mini_cooper_inventory is not backed by a model of its own. REST is not CRUD, but that is a topic for another day.
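Here is a sketch of that #index (the model and column names are invented for the example):

# app/controllers/mini_cooper_inventory_controller.rb (sketch)
class MiniCooperInventoryController < ApplicationController
  def index
    # The resource itself is the filter: only cars still in inventory.
    coopers = MiniCooper.where(:status => "inventory")

    # Narrow further by any query parameters, a la Option 1.
    coopers = coopers.where(:color => params[:color]) if params[:color]
    coopers = coopers.where(:supercharger => true) if params[:supercharger]

    render :json => coopers
  end
end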

Option 3: A search-results resource

This is probably useful when you have a really complex search that spans several resources. Let's say the IRS (yikes!) is asking me for all details on all sales made by Duzzy Cheetham or his family in the past two years. RESTfully speaking, I'm potentially dealing with any of a number of resources in my system:

mini_coopers
customers
sales

I could use one of the previous 2 options, and search for sales matching certain criteria:

GET => /sales?customer_last_name=Cheetham&customer_last_name=Swindell&date_range=2009-04to2011-04&detailed_report=true

But at this point the query string (Option 1) does not seem to be a good fit. I could create a search object:

POST => /search

But again, this is not really what I am after, unless I am giving users the ability to save searches. What I really want is some sort of customer report:

GET => /customers/:duzzy_cheetham_id/customers_report

I want to see all purchases made by Duzzy Cheetham and his family:

GET => /all_purchases_made_by_duzzy_cheetham_and_his_family

Neither of those requests is really reusable or useful outside of this one search. Let's try again.

GET => /search_results/new

This is the form for doing a complex search (familiar to those of you using Rails). Not bad. What's next?

POST => /search_results

This creates a new search results report that matches my search terms, including date range, customers, and whatever else I want to throw in. And if I want 2 types of searches, one that returns a list of results, and another that summarizes results in some type of report:

POST => /search_results
POST => /search_results_reports

The name of the actual resource is up to you. The search submitted can even be given a name that allows users to save and re-run searches:

POST => /search_results/:saved_search_id
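If you are following along in Rails 3, the three options boil down to routes something like these (a sketch - the application name and the :only lists are my assumptions):

# config/routes.rb (sketch)
MiniDealership::Application.routes.draw do
  # Options 1 & 2: plain index actions that read the query string
  resources :mini_coopers, :only => [:index]
  resources :mini_cooper_inventory, :only => [:index]

  # Option 3: search results as a first-class resource
  resources :search_results, :only => [:new, :create]
end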

Anyways, there are many options when it comes to organizing your applications to
work RESTfully - for the way the web was designed. I hope this example gives some
food for thought.

April 06, 2011

Last night we performed some upgrades and restarted a server with nearly a year of uptime. This morning I was checking on the status of the server, and saw this graph. It is a beautiful example of the kinds of things you can see with Munin.

After the server restarted, the cache was empty. As the server traffic picked up this morning, more and more database calls and page fragments were generated by users hitting the site. They were tucked away in memcache so they could be served up faster to the next visitor. As we get more traffic, this cache fills up, and the website actually gets faster.

The performance gains aren't the subject of this post though - I thought this was a beautiful representation of a day in the life of a server process. This represents a one-day period, with time going from left to right, and the three lines representing the amount of data in bytes (green), the number of objects (whether that be objects from the database or pre-rendered pieces of html - red), and the number of connections our rails processes are making to the memcached server (blue).

What does this show me? I can see that memcache is doing its job - that the rails app is putting data in and getting it out. I can also see that the cache is filling up... based on the 'hockey stick' curve, I can see we have allocated enough memory to hold the data we are putting in there. If we hadn't allocated enough memory, I would be able to see it visually 'hit the ceiling'.

I can also tell things about the system overall by comparing this to other graphs from munin and other tools. Comparing it to graphs of mysql queries, I can see the impact of using memcached on mysql (by the reduction in database queries). By comparing it to processor utilization, I can see how much pre-rendering reduces processor load (since we don't spend as much time rendering).

April 04, 2011

It is a lot of hard work putting on a conference like RubyNation; three CodeSherpas are on the organizing committee, out of a total of seven. I'd like to thank Gray Herter for being a driving force behind this conference... it's his baby, and without him, there would be a gaping hole in the Ruby Community in the Northern Virginia/DC Metro area. I'd also like to thank all of our sponsors. Without them, this conference would still just be a dream.

Every year the conference gets better and better. With a live event, there are always issues that have to be solved in real-time (like the programs being delivered late), and each year we hold a retrospective after the conference to figure out what we need to improve. This year it seems like it was only minor things - nothing that couldn't be handled on the spot with our great staff of volunteers.

If you attended, I'd like to remind you to please rate the talks you attended. Nothing helps us improve the conference like real feedback. If you didn't attend, you missed a great event; don't miss it next year! Make sure to visit the RubyNation website early and often. We'll have a signup for next year's announcements available soon.

March 09, 2011

In the past two weeks, I have been called in for some near-emergency tuning of two rails applications that recently went into production. One was for a state-level government agency; the other was for a startup that found performance problems as their app grew in popularity. In both cases, the first place I looked was the innodb settings for mysql - and in both cases, I found things that could immediately help the application in question.

I'm going to walk you through that thought process now, potentially teach you something about your own rails app, and hopefully improve the performance of your rails application.

Innodb - the background

While this advice comes from two deployed rails applications, the advice I'm giving here applies to any application using mysql and the innodb storage engine. Rails uses innodb by default.

One of the coolest things about mysql is its ability to swap in different storage engines. This is also one of the reasons mysql gets a lot of grief about 'not supporting transactions', 'not recovering well after a crash', or other nasty rumors. The MyISAM storage engine, in fact, doesn't support transactions and does have issues recovering from a crash. But rails apps typically don't use that storage engine - by default, rails apps on mysql use Innodb. If I were so inclined, I could write a storage engine for mysql that stored all text in flat files, converted to pig latin... but that's not the point - the point is that Innodb gives us everything we expect from a real database, including more knobs and dials to turn than we could experiment with in a lifetime.

Innodb - the knobs and dials

Someplace, your mysql installation has a "my.cnf" file. Typically, this is under /etc, but the exact location can vary depending on your operating system installation. In this file, we can tweak the values of various parameters that mysql uses.

There is a long list of tunable innodb parameters; for this article I'm going to teach you about 4 of them - the 4 I found can most profoundly affect a rails app, and the 4 that most often appear in tribal lore about what the 'correct' values should be. I'm not going to tell you the correct values - I'm going to teach you how to figure out what the correct values should be, based on measurements of your running application.

Innodb parameters - the 4 "Usual Suspects"

There are 4 values I like to explicitly define in a server's my.cnf file. They are:

innodb_buffer_pool_size

innodb_log_buffer_size

innodb_thread_concurrency

innodb_flush_method

Google any of those, and you'll see trite advice like "Set this to be about 80% of your server's memory" or "set this to 2x the number of processor cores your server has". That advice is Not Even Wrong... because it might even be right, but it doesn't give you a clue to answer questions like "How much memory should we put in the server?" and "how many processors should we have on that new box we are going to set up?"... worse, it might be wrong for your particular setup, because the advice was free of any particular context.

There are certainly other things worth tuning to get the most out of your setup, but if you haven't tuned anything yet, those are the first 4 that will 'take the handcuffs' off of your mysql server.
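To make that concrete, here is the shape of those four settings in a my.cnf. The numbers are purely illustrative - they assume a hypothetical dedicated database box with 8 gigs of ram and 4 cores, and the rest of this article is about measuring your way to better ones:

[mysqld]
innodb_buffer_pool_size   = 6G        # the bulk of the box's ram
innodb_log_buffer_size    = 4M        # tiny by comparison - see below
innodb_thread_concurrency = 8         # 2x our hypothetical 4 cores
innodb_flush_method       = O_DIRECT  # if you aren't on a SAN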

Buffer Pool Size

This is the size of Innodb's buffer pool. By default, it is ridiculously low - something like 128 megs. Even if you were to add tons of real memory to your server, MySQL wouldn't be able to use it unless you tweak this parameter.

By increasing this value, you are telling mysql "keep the most frequently accessed stuff in memory, so you don't have to go to disk to get it when someone wants to look at it". Changing this number to something appropriate for your app will likely give you the single biggest database performance improvement you will ever see while tuning.

In a perfect world, every server would have terabytes of ram, and we'd be able to set this number to something incredibly high and never worry about it. But memory isn't free, so we have to figure out what to set this to, and whether getting more memory is 'worth it'.

Here in the first half of 2011, I typically see servers on one of two ends - on the 'low end' is a virtual server with something small, like 2 gigs of ram. On the 'high end' are dedicated mysql servers with 64 gigs of ram. You can certainly have servers with much more than that; I just don't typically see installs larger than that without a mysql guru already along for the consultant ride.

So what do we set this to? Well, it depends... Is this a dedicated mysql server, or is it also hosting the apache/nginx/passenger/ruby/rails part of the stack? Are you running memcached on the same server? Is this virtual hardware, or something real?

Advice for a stand-alone mysql server

If this is a stand-alone mysql server, the answer is "give it as much ram as you can afford". Assuming we can decide how much memory the box will ultimately have, let's start with as much as we can afford - something like 4, 8, or 16 gigs of ram. Subtract a reasonable amount for the operating system to run (perhaps a gigabyte), subtract a little bit more for any user-space monitoring you might run (perhaps another gigabyte), and then just a little bit more for some of the other values we are about to give mysql (maybe another 512 megs), and put the rest of your system memory towards this value.

Based on that logic, you can see where the rumor "50% to 80% of your system's memory" comes from. It also shows that less than 4 gigs is too constrained, and if you find yourself in front of a 64 gigabyte monster, even 80% is too low, leaving a chunk of memory unused. Once you are in production, you should be monitoring your server's performance - and if you become performance-bound based on memory usage, your best dollars spent will be on more ram.

Advice for a full-rails-stack server

This is going to be a harder tuning job, but perhaps even more "worth it". We need to give mysql enough to get started and do its job properly, but we can quickly get into an area where it might make more sense to give extra memory to Passenger so it can run more processes for rendering, rather than give it to mysql for caching... or perhaps give it to memcached so we can cache fragments and avoid database queries altogether.

Based on that logic, your server is going to need a minimum of 4 gigabytes, and you should give half of that to this parameter, tuning aggressively to make sure you are using most of your server's memory *someplace* when your app is in production. An 8 gig server will handle most rails apps we tend to see in production... but when your traffic or data makes you hit this memory wall, the solution is easy - throw money at the problem, double the machine's ram, and tune aggressively again. When your server monitoring shows that you are processor or I/O bound instead of memory-bound, then tuning mysql isn't your best performance answer anymore.

The longer answer here will also require knowledge of tuning apache/passenger, as well as your application's use of fragment caching in memcached, since all of that is mixed into the same ram profile.

Log Buffer Size

Every time we write data, mysql holds it in a buffer until it has a large enough data set to warrant an update to the innodb tables. If it were *always* writing to the disk, Mysql would be seriously write i/o constrained. We don't want the disk activity overwhelming the system, so we need to figure out a value for this log buffer size that writes frequently, but not so frequently that we are constantly writing to disk.

This value is going to change depending on your exact server setup... On a system with 5400rpm drives we might want a different value than for 7200rpm drives; the same goes for IDE/SATA/SCSI hardware. If we have a slow storage area network, this value can affect our mysql performance nearly as much as the buffer pool size. Recently, I've seen a trend to build big disk arrays with SSDs; that would warrant a tuning of this value as well. But in order to have any insight into it, we need to be able to answer the question "how often *is* this buffer being flushed to disk?"

Log into your mysql server, get to a mysql> prompt, and type this:

mysql> show innodb status\G

There is a lot of information there we can use to tune other stuff, but for now, just check out the value of "log i/o's/second" under the "LOG" section. This lets us know how often we are writing to disk.
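The LOG section looks something like this (the numbers here are made up for illustration):

---
LOG
---
Log sequence number 84 3000620880
Log flushed up to   84 3000611265
Last checkpoint at  84 2939889199
0 pending log writes, 0 pending chkp writes
4341842 log i/o's done, 12.00 log i/o's/second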

Is that value a problem? It depends... we need more context to know for sure.

Based on those values, we can get a sense of what this setting should be. Surprisingly, it is going to look minuscule compared to the buffer pool size we set above. I start with 4 megabytes and adjust up from there. We reach a point where making it bigger gains us nothing, especially depending on how we have things set for flushing buffers for things like ACID compliance and replication (more on that below).

Tuning mythology says "don't let your server be writing out this cache more than 10 times a second". Obviously, the mythology will be too low for incredibly fast SSD setups, and too high for resource-constrained virtual machines.

Thread Concurrency

By default, mysql has this value set to 8. So if you have a dual core box that's also handling apache for you, this value is set way too high - and if you have a 16-core dedicated box, most of your cores will sit idle. Tuning mythology says "set this value to 2x the number of cpu cores your server has". I don't think that's bad advice in itself, but there are a few places where it can steer you wrong.

Several times I have actually had to play around with manually setting processor affinity on a multi-core box, to ensure that several cores handle apache/passenger, and several cores handle mysql, and then allowing other cores to 'float', depending on the exact demand. In this case, I might want to adjust this number to be 2x the number of *possible* cores.

Flush Method

The innodb_flush_method parameter is a tricky one, and it has several settings that could be considered controversial. The fastest, safest option for this parameter is O_DIRECT, as long as you aren't running on a storage-area-network - and ideally, you are also using a battery-backed-up hardware raid card (we use a battery-backed-up raid card running RAID-0 for our mysql instances at CodeSherpas). Setting this to O_DIRECT will turn off double-buffering when flushing logs, significantly speeding up disk activity.

It's worth reading the documentation and doing some performance testing based on your individual server's configuration. This parameter only has a handful of values, so it's easy to test and decide what performance level you get for the various tradeoffs.

If you can, set this to O_DIRECT, otherwise leave it alone.

Bonus #0 - flushing logs at transaction commit

This isn't a parameter I normally change for real in production, but changing it when doing performance testing can lead to other insights and clues to bottlenecks. The variable

innodb_flush_log_at_trx_commit

has a default value of "1", and should remain at this level if you want the typical ACID guarantees that innodb provides... however, setting it to "0" or "2" changes that flush to happen either at a time interval, or at a time determined by conventional disk I/O. This does carry the risk of losing about 1 second of data updates in the event of a system crash, but while tuning, it can help you determine if disk I/O and log flushing is a bottleneck - and if the data you risk losing is equivalent to the comments people leave on YouTube, the performance gain might make sense for your application.

Bonus #1 - mysql logging

As long as we are looking at performance values in the my.cnf file, I suggest that you turn on the slow query log.

log-slow-queries=/var/log/mysql/slow-queries.log
long_query_time = 1

Whenever a query takes more than a second to run, it'll get logged in the file you specify (another nice thing to note about the Percona build of mysql - by default, mysql's slow query time resolution is 1 second increments... Percona changes that to milliseconds, giving you much more visibility into what you might consider 'slow').

And as long as we're logging things that are slow, add

log-queries-not-using-indexes

and that will give us visibility into things we can speed up by adding indexes to our tables.

Bonus #2 - linux 'swappiness'

As an application running under linux, mysql does its best to manage its own memory usage - and as you saw above, we are going to give it the bulk of available memory on pretty much any install we put it on. But at the same time, the linux kernel is going to do its best to look for memory that is going unused and swap it out to disk so it can be free for other things. And MySQL looks like a big target, sitting there using most of our memory.

As you can see, we can create a big mess if mysql thinks something is in memory, but linux has swapped it out - imagine the scenario where mysql simply wants to return the data it thinks is in memory - it *still* has to read it from disk, which is what we were hoping to avoid. Imagine mysql trying to clear its own memory usage - in order to free up memory it decides it isn't using, linux will have to swap something else out to disk just so the data can be brought in and freed. There should be something smarter we can do here.

First and foremost, I use munin to watch swap activity and make sure that memory is never being swapped. Swapping kills performance on servers, no matter what is happening. We can't fix it if we don't know it's happening in the first place.
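Second, you can tell the kernel itself how eager to be about swapping. Linux exposes this as the 'swappiness' setting, and turning it down tells the kernel to avoid swapping except as a last resort:

# See the current value - most distros default to 60.
cat /proc/sys/vm/swappiness

# Lower it on the running system...
sysctl -w vm.swappiness=0

# ...and add this line to /etc/sysctl.conf so it survives a reboot:
vm.swappiness = 0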

Bonus #3 - my "Virtualization Rant"

And finally, I'd like to rant a little bit about virtualization and 'cloud based' services with MySQL. "The Cloud" seems to be all the rage lately, especially with startups. At CodeSherpas, we host several small clients on services like WebbyNode and Linode, we use Amazon EC2 for testing and for some surge support, and we certainly use Google Apps and Gmail. I'm also a fan of Heroku for its ease-of-use in getting a rails app deployed with no fuss... But by the time I'm dealing with clients that have serious mysql tuning issues - the "100,000 visitors a day, why is our app so slow?" kind - I like to take virtualization out of the equation, at least on the database. There are several reasons why, which probably deserve another blog post - but the simple reason in this context: virtualization makes your server lie to you.

Seriously. The tuning we were doing above depended on us learning things about our disk I/O, number of cores, swappiness, and other various measurements from the system. When your disk I/O reports something on a virtual machine, is that the *real* answer, or is that just how long it took the VM to be happy (meanwhile your write is still in a cache on the host)? Do you *really* have that many cores to schedule something against? Might the stuff that your guest OS is reporting as 'in memory' actually be swapped out by the host OS, which you have no visibility into? Might your carefully tuned mysql server go completely pear-shaped because another client on the same physical hardware suddenly has a spike that uses more cpu than you do? I'll probably write more on this at another time, but in the meantime, check out this post by Mark Imbriaco of 37signals about the performance increase they saw moving away from virtualization. If you saw the movie Inception, you know what kinds of issues this can create.

Conclusion

This was a particularly long blog post, but for both clients I mentioned, this analysis and tuning took less than 20 minutes in the real world. This is just the tip of the iceberg for tuning a Rails application in production; if you are interested in learning more, or in seeing if we can tune your application further, you can always contact me. Some of this material is also covered in our Rails In Production training course.

I once knew a small company that was effectively shut down because their internet service provider (ISP) had gone out of business. In order to recover the 'intellectual assets' of their business, they had to go to the liquidation auction of their ISP and bid on (and win!) the physical machines that had been running their websites. Would you fare any better?

As a business owner, you want to make sure you have a complete reconstitution plan. Backups are a good start, but aren't enough by themselves. Make sure your reconstitution plan includes:

a tested backup and restore of the database, source code, deploy procedures, uploaded images/documents, and all of the configuration files specific to your servers. Make sure the backup contains all the files needed by your web app, in a format you can access and use for a restoration (a minimal backup sketch follows this list).

a map of all of the servers in use, including their names, IP addresses, and services they are providing. This includes things like your name server and mail server, which are probably handled by your service provider (and ignored by most plans that think just having a backup is enough).

The login credentials of your domain name registrar, in case you have to change your domain name servers.

The login credentials to your domain name servers, in case you have to change the ip address of a name (so your "www.your-domain.com" can move to another server when necessary).
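On that first point, even a nightly cron script as simple as this sketch beats nothing - the paths, database name, and credentials here are hypothetical, and remember that an untested backup doesn't count as a backup:

#!/bin/sh
# /etc/cron.daily/backup.sh - a minimal sketch; copy the results offsite!
STAMP=$(date +%F)

# Dump the database without locking innodb tables for the duration.
mysqldump --single-transaction -u backup_user -pSECRET myapp_production \
  | gzip > /backups/myapp-db-$STAMP.sql.gz

# Archive uploaded images/documents and the server config files.
tar czf /backups/myapp-files-$STAMP.tar.gz \
  /var/www/myapp/shared/uploads /etc/nginx /etc/my.cnf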

Spend a little time letting your imagination run through disaster recovery scenarios... What would happen if there were a fire in your data center? Would your backups be destroyed too? What would happen if your machines were hacked, data was corrupted, and you didn't discover it for a week? Would your only backups contain that corrupted data?

I hope you never need a reconstitution plan, but if you do, you'll be grateful for the time you spent preparing. As we all know, hope isn't a good business strategy. So start your plan now!

Build your list. Include an opt-in form where people interested in getting early access to your application can sign up. These folks are raising their hands now and will be gold to you when your app is ready to launch.

Split test. Not sure whether to use the image of the cute baby or the cute puppy? Don't guess... use A/B testing to let site visitors tell you which works better.

January 09, 2011

If you can't tell from my recent series of blog entries, I have spent a lot of time lately maintaining the CodeSherpas Server Farm and provisioning a few new machines for clients. I have several more entries pending, but they all rely on higher-order sysadmin techniques to keep track of all the 'noise' that a server can generate; so I thought I'd pull those out into a separate entry.

A running server generates a lot of system notifications. Tools like LogWatch, Monit, DenyHosts, LSM, etc. all send out emails when they find something. I have seen machines that send emails to the root user, to an email address set up specifically to receive them, and to the person who installed a specific tool - all at the same time. Without a unified view into the happenings of the system, an opportunity is lost and a hacker can slip through.

Technique #1 - Unify and Forward

The first part of this technique is obvious - unify all those email addresses into one destination. I typically prefer the root user on the machine itself, although I have also set up a user account specifically for this purpose. When done consistently, this is easy to maintain; the confusion over multiple email addresses only arises when the people installing tools think they need an original answer to this question. But all those great system notifications don't do much good if they never leave the machine... do they?

The second part of this technique involves a few levels of indirection (and don't all good solutions, really?), and it uses a little-known trick of the linux-sysadmin gurus - the ".forward" file.

In the root user's home directory, I create a file named ".forward". In this text file, I put an email address - your own email address would work - and now anything that gets sent to the root user simply gets forwarded to that address.
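In practice, /root/.forward ends up being exactly one line (a hypothetical address):

sysadmin@your-company.com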

At CodeSherpas, we use one more layer of indirection on top of that... the .forward file on our servers forwards to a special email address like "all-system-notifications@codesherpas.com" (and that isn't the real address, so don't bother spamming it). That address is simply an alias that gets rotated between the CodeSherpas - whoever is on system duty is responsible for getting those messages.

Of course, the ultra-paranoid will tell you that an intruder can delete the .forward file, and I'll stop getting notices. That's true - but if an intruder has gotten far enough to delete files in root's home directory, then I can't trust any output from the machine. My trip wires should have gone off long before they got that far.

Technique #2 - If it doesn't email, make it!

While many tools send emails as part of their normal operation, there are many that don't. That's OK - we can make them!

Take lynis, for instance. Lynis is an easy-to-install system checker that runs on just about every flavor of *nix out there. It generates a great report that makes system-hardening recommendations, checks to make sure keys haven't expired, verifies firewall rules are in effect, and all kinds of other things (worth a blog entry or more by itself). It is a command line tool that is easy to run, but I don't want to have to remember to run it... We should never send a human to do a computer's job.

I want to create a shell script that takes the output from lynis and mails it to root@localhost, and I want to set it to run once a week. This is trivial - the whole script is one pipeline. It starts with a line something like this (the exact flags vary a bit between lynis versions):

(lynis -c -Q --quiet) \

This tells lynis to do a complete system check, do it without stopping for human intervention, and only report problems.

| /bin/mail -s "Lynis Weekly Run for $HOSTNAME" root@localhost

And this line takes the output, creates a subject that includes the host name of the machine (useful when you are getting several of these a week), and emails it to root@localhost (which has the .forward file, as described above).

I put that shell script in /etc/cron.weekly/lynis.sh, and voila! Every week the system checks itself and emails the current CodeSherpa sysadmin watchdog any findings.

That shell script above can easily be modified; just keep in mind that the command you put in between the parentheses should not require any human interaction, and ideally should report only problems (humans learn to ignore emails they get every day that say "all is well").