Further, the only ports that are open on the system are 80, 443, 12345 (ssh). I do not know where to find the actual ssh log, but I did a logwatch dump, and SSH showed nothing.

These are the monitoring graphs

@James Little

I have checked /var/log/btmp; the file was last changed on 1-1-2012 and is 0 bytes.

ifconfig shows everything as 0, so I assume there are no errors and everything is OK. I don't really have the knowledge to work with ifconfig and ethtool as you suggested. I tried some Google searches but failed to find any solid methods that would give me useful information.

I think I will send an email to Amazon now, maybe they have some answers.

You say the nginx log looks normal; do you have access logging enabled, i.e. can you see accesses over the internet during this time? Also, did you notice what the load average was when you first reconnected to the instance?
– James Little, Jan 4 '12 at 18:06

I have access logging enabled, and there was nothing during that time period. Just before the downtime there was only normal behavior, no attacks or anything. When the server started again at 15:46:00, CPU usage was extremely high for 6 minutes, the highest it has ever been. Network utilization during these 6 minutes was also just above average. But in that 6-minute period there is absolutely no activity in the nginx access logs. I'll edit in php-fpm's log; it went down, up and rebooted, I guess.
– Saif Bechan, Jan 4 '12 at 18:15

Have you checked your bad login attempt log (usually /var/log/btmp)? I think you will have to ask Amazon; maybe they detected an attack (e.g. a SYN flood) and took action higher up the chain to cut off the traffic. Also use ifconfig and ethtool to check for interface errors.
– James Little, Jan 4 '12 at 18:37
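For readers unsure how to read the interface error counters mentioned above: on Linux, the same per-interface counters that ifconfig prints are exposed in /proc/net/dev. The following is a minimal sketch, assuming the standard 16-column layout of that file; the function name `interface_errors` is made up for illustration.

```python
def interface_errors(proc_net_dev: str) -> dict:
    """Extract error and drop counters per interface from /proc/net/dev text."""
    counters = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in proc_net_dev.splitlines()[2:]:
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = [int(x) for x in data.split()]
        # Columns 0-7 are receive counters, 8-15 are transmit counters.
        counters[name.strip()] = {
            "rx_errs": fields[2],
            "rx_drop": fields[3],
            "tx_errs": fields[10],
            "tx_drop": fields[11],
        }
    return counters


if __name__ == "__main__":
    try:
        with open("/proc/net/dev") as f:
            for iface, stats in interface_errors(f.read()).items():
                print(iface, stats)
    except OSError:
        print("/proc/net/dev is not available on this system")
```

Counters that stay at zero suggest the interface itself saw no errors; non-zero values that keep growing would be worth mentioning to Amazon.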

Also, how long has the nginx process been running? You can check the start-time using ps aux or similar.
– James Little, Jan 4 '12 at 18:43

I will check out the log and try to check for interface errors. I am new to all this, which is why I only run a small website on the server right now; I knew stuff like this would happen. I hope I can find some answers in the logs. I will also send an email to Amazon.
– Saif Bechan, Jan 4 '12 at 18:43

2 Answers

You don't specify whether it actually rebooted. In case you did not check, use uptime to see when it last rebooted, or go through syslog or dmesg (from your php-fpm log I guess it did reboot). Since it was unavailable for some 30 minutes, it doesn't look like a planned upgrade (unless they decided to "update" all the datacenter instances at once ;) ).
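As a rough illustration of the uptime check, the boot time can also be derived from /proc/uptime on Linux, whose first field is the number of seconds since boot. This is only a sketch; the helper name `boot_time` is made up for the example.

```python
from datetime import datetime, timedelta


def boot_time(uptime_text: str, now: datetime) -> datetime:
    """Return the boot time implied by the contents of /proc/uptime."""
    # First field: seconds since boot. Second field (idle time) is ignored.
    uptime_seconds = float(uptime_text.split()[0])
    return now - timedelta(seconds=uptime_seconds)


if __name__ == "__main__":
    try:
        with open("/proc/uptime") as f:
            print("Last boot:", boot_time(f.read(), datetime.now()))
    except OSError:
        print("/proc/uptime is not available on this system")
```

If the computed boot time falls inside the outage window, the instance really did reboot; if it is much older, the machine stayed up and the problem was elsewhere.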

If it was a reboot, it's either some failure inside your instance or a failure at Amazon; again, look at syslog/dmesg.

If it wasn't a reboot, it could also be some issue that affected just the monitoring.

Amazon has a status page for their datacenter issues, with history (somewhere on your EC2 dashboard). For planned reboots there is a history in EC2 too (under EC2 it's just above "instances", if I remember well).

Single instance unavailability is a normal (I did not say common) issue though. It's not feasible to totally prevent it.

No, it was today actually, just a few hours ago. I live in the Netherlands; the times that I gave are from today, here in the Netherlands, about 5 hours ago. I have also checked if there was any downtime, but I could not find any information. My instance is located in the region west-europe, Ireland.
– Saif Bechan, Jan 4 '12 at 17:51