Directly examining your CPU/IO/RAM usage might be more revealing. Consider looking at the output of iostat -x. If you have available memory, your load average should be determined by CPU and I/O. If your I/O wait time is not high, then you are likely CPU bound. (Also look at htop or similar.)
– cyberx86, Nov 20 '12 at 18:51
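
A rough way to check the I/O wait point above without iostat is to take two samples of the aggregate cpu line in /proc/stat and compare them. This is only a sketch, assuming a Linux host; the function names are made up for illustration, and the field order (user, nice, system, idle, iowait, irq, softirq, steal) is the one documented in proc(5).

```python
import time


def read_cpu_times(stat_text):
    """Return the counters of the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Keep user..steal only; guest time is already counted in user.
            return [int(v) for v in fields[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def cpu_breakdown(before, after):
    """Percent of elapsed CPU time spent busy, in iowait and idle."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    idle, iowait = deltas[3], deltas[4]
    busy = total - idle - iowait
    return {
        "busy": 100.0 * busy / total,
        "iowait": 100.0 * iowait / total,
        "idle": 100.0 * idle / total,
    }


def sample_proc_stat(interval=5.0):
    """Sample /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open("/proc/stat") as f:
        first = read_cpu_times(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        second = read_cpu_times(f.read())
    return cpu_breakdown(first, second)
```

Running print(sample_proc_stat()) on the instance gives the three percentages. A high "busy" share with a low "iowait" share would point to a CPU-bound box, as described above.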

Hi, iostat -x only shows CPU and disk usage, but I think my bottleneck is the network.
– Howard, Nov 23 '12 at 17:31

An m1.small should easily handle 2Mbps; and a network bottleneck shouldn't increase your load average to such a level (the m1.small has only 1 CPU).
– cyberx86, Nov 23 '12 at 18:59

6 Answers

Configure the nginx status module and install collectd to collect system performance data. It's a very lightweight daemon in terms of the system resources it needs. There's a plugin for nginx monitoring (Plugin:nginx), and of course collectd can monitor the rest of the system's performance data as well.

Since collectd is just a collector of performance data (it stores it in RRD databases), a tool for displaying the data is required. I'm pretty comfortable with CGP; the git version is OK. CGP is a PHP app, so it will use your CPU only when you look at the graphs.
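
To see what the nginx status page actually reports before wiring it into collectd, it can be read by hand. The sketch below is an illustration only: it assumes the status page is exposed at a hypothetical local URL (http://127.0.0.1/nginx_status) and that it has the usual four-line stub_status layout; the function names are invented for this example.

```python
import re
from urllib.request import urlopen


def parse_stub_status(text):
    """Parse the plain-text stub_status page into a dict of integers."""
    active = re.search(r"Active connections:\s*(\d+)", text)
    counters = re.search(
        r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE
    )
    states = re.search(
        r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text
    )
    if not (active and counters and states):
        raise ValueError("unexpected stub_status format")
    accepts, handled, requests = (int(v) for v in counters.groups())
    reading, writing, waiting = (int(v) for v in states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


def fetch_stub_status(url="http://127.0.0.1/nginx_status"):
    """Fetch and parse the status page; the default URL is an assumption."""
    with urlopen(url, timeout=5) as response:
        return parse_stub_status(response.read().decode("ascii", "replace"))
```

Calling fetch_stub_status() twice a few seconds apart and subtracting the "requests" counters gives a rough requests-per-second figure to put next to the CPU numbers.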

It requires i/o accounting support inside the kernel, which is present in Ubuntu 10.04 or greater.

If you find out nginx is I/O bound, check whether you actually need access logging (which may be a bottleneck with such a high number of requests). Disabling the access log is as easy as:

access_log off;

FYI, the error log has no such switch:

error_log off;

won't do (nginx will write to a file named off). To silence the error log, send it to /dev/null at the highest threshold instead:

error_log /dev/null crit;

If you need logging, implement a shipping policy (such as rotating the logs once a day with logrotate and shipping the rotated file to a remote location via rsync, scp or similar) and try writing to the instance store (by default mounted at /mnt). The instance store is backed by the server's local disks, which may be faster (though this is not guaranteed), but its data is lost when the instance shuts down, hence the need for a log shipping policy.

Check the system metrics to see if you're hitting a performance bottleneck and, if so, identify where it is.

2 Mbps is definitely slow for an m1.small. I've gotten significantly more than that from my t1.micro instances. Check iotop and htop to see what your system is doing. It sounds like there is a nasty bottleneck in your process somewhere. The CloudWatch metrics for this instance and its volumes could also help.

If you're running dynamic pages (PHP, Perl, Ruby) there could be some unoptimized code that is causing the slowdown.

If you're not seeing CPU or IO bottlenecks on the host then the issue may be in another tier if you have any other systems in the stack.

One thing to consider is using an ELB for the SSL termination (and load balancing too) to spread the load. They're not supremely expensive and it might offload enough (assuming SSL is to blame) to help the performance for cheaper than moving up in instance size. Hanging your site off the ELB also would give you more flexibility on how to scale and manage the site.

Since you are using an EC2 small instance, which has only 1 compute unit, a load average of 2~3 is too high. System metrics will provide much more useful information than the ones you provided from nginx. In fact, those nginx metrics really don't help at all...

You may also benefit from reading other questions about load averages in general, to better understand their meaning and importance.

The problem with EC2's small instances is that you don't have 100% CPU time available, only bursts. Once your instance starts keeping the CPU busy consistently, it gets throttled.

Ideally, you shouldn't be hitting loads > 2.0. Since small instances have only one CPU, any load above 1.0 means processes are already waiting for available CPU slices, and at 2.0 roughly half of your runnable processes are waiting. A medium instance should be enough.
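
The arithmetic behind that rule of thumb can be sketched as follows. This is a simplification, not a precise model: it treats the load average as the number of runnable processes (on Linux it also counts processes in uninterruptible I/O sleep), and the function names are made up for this example.

```python
import os


def waiting_fraction(load, cpus):
    """Rough share of runnable processes queuing for a CPU.

    At most `cpus` processes can run at once, so anything above
    that in the load figure is assumed to be waiting.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    if load <= cpus:
        return 0.0
    return (load - cpus) / load


def current_waiting_fraction():
    """Apply the estimate to this host's 1-minute load average (Unix only)."""
    one_minute, _, _ = os.getloadavg()
    return waiting_fraction(one_minute, os.cpu_count() or 1)
```

With one CPU, waiting_fraction(2.0, 1) is 0.5 and waiting_fraction(3.0, 1) is about 0.67, which is why a sustained load of 2~3 on a single-CPU instance is a problem.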

You're correct, Andrei, but m1.small instances have a poor CPU on their own. In the past I had trouble running jobs on small instances that didn't even hit 10% usage on medium instances. Please read: axibase.com/cloud/2010/07/22/…
– hcalves, Nov 29 '12 at 20:41

If top shows that nginx is the only process eating CPU and your instance type is m1.small, that surely means nginx is in a bad state.
Be sure to turn off gzip compression. I also suggest adding the SPDY patch http://nginx.org/patches/spdy/README.txt (Firefox and Chrome support SPDY), which will greatly reduce the load time of SSL pages.