I work for a small finance firm as a Web Application Developer. Our company has an internal website written in PHP and running on Apache. Recently our server went down and the website was unavailable for several days, causing severe problems.

I have been asked to set up two more servers to serve the website, so we want three Apache web/app servers running on three different machines. When a user logs into the website, they should be served by one of the three servers depending on load. Also, if one or two of the servers go down, the server(s) still up must handle the website requests.

I only know how to create a website in PHP and host it on an Apache server; I don't have any networking knowledge. Please tell me what I need to learn to build the system described above.

I am not expecting to be spoon-fed, just a pointer to what I have to learn to achieve my goal. I'm Googling at the same time, but I have asked the question here as I am in a hurry to implement this.

4 Answers

A common approach is to develop web applications that are cluster-aware. You'll probably need to rework the parts of your site that deal with the database, sessions, and shared/dynamic data. You may find this related question interesting: Cloud/cluster solutions for scalable web-services. To make a website 'scalable' you need a scalable design. Alas, there's no button with "Make it faster" written below :)

A simpler way is to replicate all data between these servers (you can use GlusterFS for files, and replicate your MySQL/whatever database between the servers) and make sure sessions are available from all of them. It's not the best approach, but you won't have to rework your code :)
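As a sketch of the MySQL side (server IDs, hostnames, and the replication user are placeholders; you would also need to create that user with the REPLICATION SLAVE privilege and seed each replica from a dump):

```ini
# my.cnf on the primary
[mysqld]
server-id = 1
log_bin   = mysql-bin

# my.cnf on each replica: a unique server-id (2, 3, ...),
# then point it at the primary from the mysql client:
#   CHANGE MASTER TO MASTER_HOST='192.0.2.11',
#       MASTER_USER='repl', MASTER_PASSWORD='...',
#       MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=0;
#   START SLAVE;
```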

Load balancing can easily be implemented with round-robin DNS: just add several 'A' records that point to the different servers, and clients will pick one of them more or less at random. Google, for instance, uses this technique.
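As a sketch, round-robin DNS is nothing more than several A records for the same name in your zone file (the IPs below are placeholders from the documentation range):

```
; BIND zone file fragment: one name, several A records.
; Clients and resolvers rotate through these addresses.
www   300   IN   A   192.0.2.11
www   300   IN   A   192.0.2.12
www   300   IN   A   192.0.2.13
```

Note the short TTL (300 seconds): the longer the TTL, the longer clients keep trying a dead server's address after you pull its record.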

While the easiest method of load balancing is something like round-robin DNS, as indicated in o_O Tync's answer, be aware that if one of those servers goes down and you remove its DNS record, a portion of your users will still be directed to the dead server until the TTL on their cached DNS records expires or you manually change IPs. Depending on how important uptime is to you, this may not be acceptable. In addition, any users who were in the middle of a session with the failed server would lose that session.

RRDNS is fine for load balancing, but isn't really the key to high availability.

The easiest way (and by easiest I mean simplest, not necessarily cheapest) to implement true high-availability load balancing is to use a hardware load-balancing network appliance that sits between the Internet connection and your web servers. Such a device can split the load between your systems, and also automatically (or manually) remove a server from the rotation if there is a problem. In addition, it manages the TCP connections, so a user can be transparently moved to another server if their original one goes down. Another advantage of this approach is that it generally requires little or no application modification. Note that a truly "high-availability" configuration will usually use two load balancers, to eliminate that single point of failure.

Another option is to use regular servers to achieve a high availability load balancing scenario. Here is some info on configuring a high-availability load-balanced Apache cluster. The Linux-HA site is a great source for Linux load-balancing information.
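To give a flavour of the software route, Apache itself can act as the balancer in front of your three web servers via mod_proxy_balancer. A minimal sketch (backend IPs are placeholders; you need mod_proxy, mod_proxy_http, mod_proxy_balancer, and on Apache 2.4 mod_lbmethod_byrequests loaded):

```apache
# httpd.conf fragment on the balancer node
<Proxy "balancer://webcluster">
    BalancerMember "http://192.0.2.11:80"
    BalancerMember "http://192.0.2.12:80"
    BalancerMember "http://192.0.2.13:80"
</Proxy>
ProxyPass        "/" "balancer://webcluster/"
ProxyPassReverse "/" "balancer://webcluster/"
```

A member that stops responding is automatically marked as failed and skipped until it recovers, which covers the "one or two servers go down" requirement at the HTTP level.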

Yet another option is something like the Linux Virtual Server project. LVS uses Linux for all components, both servers and load balancers, and generally provides a seamless solution (once configured).

To conclude: my general recommendation for a situation like yours, where an inexperienced admin has been asked to set up load balancing, is that a hardware load-balancer appliance is the most painless route. It obviously costs some money, but it can save a lot of time; where the trade-off point lies is, of course, an individual decision.

What I'd do is sit down and look at what's currently running on the server in question.

I'm going to assume for now that you're running Linux, although you didn't explicitly say so. What you want to look at is Varnish, a high-performance reverse proxy and load balancer. You can set it up by following the online examples here, and it should be straightforward to get working.
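To give a flavour of the configuration, a minimal VCL sketch (Varnish 4+ syntax; backend IPs are placeholders) that round-robins requests across three Apache backends might look like:

```vcl
vcl 4.0;
import directors;

# One backend definition per Apache web server
backend web1 { .host = "192.0.2.11"; .port = "80"; }
backend web2 { .host = "192.0.2.12"; .port = "80"; }
backend web3 { .host = "192.0.2.13"; .port = "80"; }

sub vcl_init {
    # Round-robin director cycling through the three backends
    new lb = directors.round_robin();
    lb.add_backend(web1);
    lb.add_backend(web2);
    lb.add_backend(web3);
}

sub vcl_recv {
    # Send each incoming request to the next backend in rotation
    set req.backend_hint = lb.backend();
}
```

Adding health probes to the backend definitions lets Varnish skip a backend that has gone down.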

If you run Varnish on two of your three servers, each Varnish node can distribute traffic across all three backends in round-robin fashion. Those two Varnish nodes need their own public IP addresses, and you can publish both in DNS as multiple A records to get round-robin DNS (RRDNS) across them.

If your service is critically important to the business, and your last outage cost serious money, you might want to argue for a more resilient and redundant network. If you currently have only one server, on one IP from one provider, and your previous outage was network-related, you might find that increasing the network resilience alone would improve your uptime.

Look hard at your monitoring as well: make sure you're watching the things that could bring the server(s) down, such as SMART data on the hard disks, free swap space, free memory, and free disk space on /. Get Nagios and Munin set up: Nagios to alert you when critical conditions are met, and Munin to graph those data over time.
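For example, a minimal Nagios service definition for the disk-space check might look like this (the host name and command names are assumptions; check_nrpe requires the NRPE add-on running on the monitored host):

```
# Hypothetical Nagios object-config fragment
define service {
    use                  generic-service
    host_name            web1
    service_description  Disk Space
    check_command        check_nrpe!check_disk
}
```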

You'll probably have to make some application-level changes too. Assuming your application is session-based, you'll need some way to handle a user's requests not always going to the same server: either store session state client-side, or keep it server-side and make sessions sticky. You'll probably find that memcached helps you greatly here.
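As a sketch of the memcached route in PHP (assuming the PECL memcached extension is installed; the IPs are placeholders), two php.ini lines on each web server make sessions shared rather than local:

```ini
; php.ini fragment: store PHP sessions in memcached so any of the
; three web servers can pick up a user's session
session.save_handler = memcached
session.save_path    = "192.0.2.11:11211,192.0.2.12:11211,192.0.2.13:11211"
```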
As for application level changes, you might want to ask on StackOverflow, as they're more code-y and less server-y.