I have been reading about all five of the tools above and have confused myself about which one is used for what. Could someone please explain, in layman's terms, what each tool is used for when they are used together, and which specific concern each one addresses?

4

I don't agree with this question being marked as Off Topic.
– ramn, Apr 16 '13 at 13:49

Instead of marking it as off-topic, move it to serverfault and redirect maybe?
– mmlac, Jul 4 '13 at 21:37

3 Answers
3

Let's say you plan to host a few websites on your new VPS. Let's look at the tools you might need for each site.

HTTP Servers

Website 'Alpha' consists of just some plain HTML, CSS and JavaScript. The content is static.

When someone visits website Alpha, their browser will issue an HTTP request. You have set up DNS so that the request is directed to the IP address of your VPS. Now you need your VPS to be able to accept that HTTP request, decide what to do with it, and issue a response that the visitor's browser can understand. You need an HTTP server, such as Apache httpd or NGINX, and let's say you do some research and eventually decide on NGINX.

Application Servers

Website 'Beta' is dynamic, written using the Django Web Framework.

WSGI is a specification that describes the interface between a Python application (the Django app) and an application server. So what you need now is a WSGI app server, which can understand web requests, make the appropriate calls to the application's various objects, and return the results. You have many options here, including Gunicorn and uWSGI. Let's say you do some research and eventually decide on uWSGI.
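To make the interface concrete, here is a minimal sketch of a WSGI application and of what an app server does with it. The names `application` and `call_app` are illustrative, not part of Django or uWSGI; `call_app` only imitates, in a very simplified way, the call a real server such as uWSGI or Gunicorn would make for each request.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI callable: the app server calls this once per request."""
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from {path}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]  # WSGI expects an iterable of bytes


def call_app(app, path):
    """Play the role of the app server (hypothetical helper for illustration)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys the WSGI spec requires
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

A Django project exposes a callable of this same shape (conventionally in its `wsgi.py`), which is why any WSGI server can run it.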

uWSGI can accept and handle HTTP requests for static content as well, so if you wanted to you could have website Alpha served entirely by NGINX and website Beta served entirely by uWSGI. And that would be that.

Reverse Proxy Servers

But uWSGI has poor performance in dealing with static content, so you would rather use NGINX for static content like images, even on website Beta. But then something would have to distinguish between requests and send them to the right place. Is that possible?

It turns out NGINX is not just an HTTP server but also a reverse proxy server: it is capable of forwarding incoming requests to another place, like your uWSGI application server, or many other places, collecting the response(s) and sending them back to the original requester. Awesome! So you configure all incoming requests to go to NGINX, which will serve up static content itself or, when required, forward the request to the app server.
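The decision the reverse proxy makes for each request can be sketched in a few lines of Python. This is only an illustration of the routing logic, not how NGINX is configured or implemented; the URL prefixes and the app server address below are assumptions made up for the example.

```python
# Hypothetical values: neither the prefixes nor the address come from the answer.
STATIC_PREFIXES = ("/static/", "/media/")
APP_SERVER = "127.0.0.1:8001"  # where the uWSGI app server is assumed to listen


def route(path):
    """Return who should handle a request path: the proxy itself or the app server."""
    if path.startswith(STATIC_PREFIXES):
        # Static content: the front server reads the file and answers directly.
        return ("static", path)
    # Everything else is forwarded to the application server.
    return ("proxy", APP_SERVER)
```

For example, `route("/static/logo.png")` stays on the front server, while `route("/blog/1/")` is handed to the app server.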

Load Balancing with multiple web servers

You are also hosting Website Gamma, which is a blog that is popular internationally and receives a ton of traffic.

For Gamma you decide to set up multiple web servers. All incoming requests go to your original VPS running NGINX, and you configure NGINX to forward each request to one of several other web servers in round-robin fashion and return the response to the original requester.
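Round-robin simply means handing requests to the backends in turn and starting over after the last one. A minimal sketch, assuming three made-up backend addresses (the class name `RoundRobinBalancer` is also hypothetical):

```python
from itertools import cycle

# Hypothetical backend addresses, for illustration only.
BACKENDS = ["10.0.0.11:80", "10.0.0.12:80", "10.0.0.13:80"]


class RoundRobinBalancer:
    """Pick backends in a fixed rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._pool = cycle(list(backends))

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._pool)
```

Real load balancers add health checks and weighting on top of this, which the sketch leaves out.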

HAProxy is a proxy server that specializes in load balancing for high-traffic sites. In this case, you were able to use NGINX to handle traffic for site Gamma. In other scenarios, one may choose to set up a high-availability cluster: e.g., send all requests to a server like HAProxy, which intelligently distributes traffic across a cluster of NGINX servers similar to your original VPS.

Cache Server

Website Gamma exceeded the capacity of your VPS due to the sheer volume of traffic. Let's say you instead hosted website Delta, and the reason your web server is unable to handle Delta is due to a popular feature that is very content-heavy.

A cache server is able to recognize which media content is requested frequently and store this content differently, so that it can be served more quickly. This is achieved by reducing disk I/O operations; the popular content can be stored in memory or virtual memory instead. You might decide to combine your existing NGINX stack with a technology like Varnish or Memcached to achieve this type of optimization and serve website Delta more effectively.
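The idea of keeping popular content in memory can be sketched as a small least-recently-used cache placed in front of a slow fetch. This is an illustration of the concept only, not how Varnish works internally; `ResponseCache` and the `fetch` callback are hypothetical names.

```python
from collections import OrderedDict


class ResponseCache:
    """Tiny in-memory LRU cache keyed by URL path."""

    def __init__(self, fetch, max_entries=128):
        self._fetch = fetch          # slow path, e.g. reading from disk or a backend
        self._max = max_entries
        self._store = OrderedDict()  # path -> response body, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self._store:
            self._store.move_to_end(path)  # mark as most recently used
            self.hits += 1
            return self._store[path]
        self.misses += 1
        body = self._fetch(path)
        self._store[path] = body
        if len(self._store) > self._max:
            self._store.popitem(last=False)  # evict the least recently used entry
        return body
```

Repeated requests for the same path are answered from memory, so the slow fetch runs only on the first request or after the entry has been evicted.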

pls revise: "[...] uWSGI has poor performance in dealing with static content [...]". i realize this is nitpicky, and the example is only illustrative, but IMO it should still be accurate (at the time of question, uWSGI already had high-perf ReverseProxy/static-files support for a LONG time)... aside, uWSGI has advanced support for every category touched -- far beyond the listed competitors -- such as dynamic subscription routing (vs. round robin/etc)... the only place i DON'T use it is "Cache Server", where Varnish suits better. beyond that, for newcomers, this is a nice overview :)
– anthonyrisinger, Apr 13 '13 at 8:24

I will put a very concise (very informal) description for each one, in the order they would be hit when you make a request from your web browser:

HAProxy balances your traffic load: if your webpage is receiving 5000 hits per second and you can't handle that with only one web server, HAProxy will balance the hits among the web servers you have behind it.

Varnish is a cache server. It sits in front of your web servers and behind HAProxy, so if a resource is already cached, Varnish will serve the request itself instead of passing it to the web servers behind.

nginx, gunicorn and uwsgi are web servers that sit behind Varnish and get the requests that Varnish lets pass through. These web servers use optimized designs to handle high loads (requests per second).

Fine, understood. So where does memcache fit in the above architecture? Varnish apparently does the same thing, so why do we need memcache?
– whatf, Nov 3 '12 at 17:53

2

Memcache works at the level of your programming language: if you use PHP, you could cache the result of a MySQL query in memcache so that the same query is not repeated the next time your PHP code is executed. Varnish operates at a higher level, caching for example the CSS files your webpage refers to, so it serves the CSS files from its in-memory cache instead of letting the web server read them from disk.
– Nelson, Nov 3 '12 at 18:01

2

So memcache can operate inside your PHP script, caching variables and function outputs, wherever you want. Varnish works at the file level: it will cache the entire output of your PHP page, the same as it does for other file resources like CSS, JavaScript or image files.
– Nelson, Nov 3 '12 at 18:03
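The application-level caching described in these comments might look roughly like the sketch below. It is written in Python rather than PHP, and it uses a plain in-process dictionary as a stand-in for a Memcached client, so `TTLCache` and `get_or_compute` are hypothetical names, not a real Memcached API.

```python
import time


class TTLCache:
    """In-process stand-in for a Memcached-style key/value cache with expiry."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() only when needed."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive query
        value = compute()    # e.g. run the database query
        self._store[key] = (now + self._ttl, value)
        return value
```

The application code decides what to cache and under which key, which is the difference from Varnish caching whole responses without the application being involved.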

Thanks, understood! I would have marked your answer as correct, but I found @Aman's answer the most detailed and easy to understand. Thanks for your time.
– whatf, Nov 3 '12 at 18:08