Server Uptime Statistics - New Solution


I made this over the weekend. For a while now I have wanted to pull true uptime statistics into Automate, i.e. to show, as a percentage, how much uptime a server had this month.

To do this, a piece of embedded PowerShell runs in an Automate script and populates EDFs with this information. There are several useful data points here, some of which can have monitors running against them:

1) Trigger when more than x crashes are detected in the last 30-day period
2) Include the uptime percentage in your reports
3) Trigger when more than x reboots are detected in the last 30-day period
4) Show value to customers who have required SLAs for server uptime
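
For anyone wondering how these figures relate to each other, here is a minimal sketch of the arithmetic in Python. It is not the embedded PowerShell from the script; the function name `uptime_stats`, the tuple layout, and the rule that every outage counts as one reboot are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def uptime_stats(outages, window_end, window_days=30):
    """Summarise outages over a reporting window.

    outages: list of (went_down, came_up, was_crash) tuples (hypothetical layout).
    Assumption: every outage counts as one reboot; crashes are a subset.
    """
    window = timedelta(days=window_days)
    window_start = window_end - window
    downtime = timedelta()
    reboots = 0
    crashes = 0
    for went_down, came_up, was_crash in outages:
        # Clip each outage to the reporting window.
        start = max(went_down, window_start)
        end = min(came_up, window_end)
        if end <= start:
            continue  # outage lies entirely outside the window
        downtime += end - start
        reboots += 1
        if was_crash:
            crashes += 1
    # Dividing one timedelta by another gives a plain float ratio.
    uptime_pct = 100.0 * (1 - downtime / window)
    return {
        "uptime_pct": round(uptime_pct, 3),
        "downtime_minutes": downtime.total_seconds() / 60,
        "reboots": reboots,
        "crashes": crashes,
    }


if __name__ == "__main__":
    example = [
        (datetime(2024, 1, 10, 2, 0), datetime(2024, 1, 10, 2, 10), False),
        (datetime(2024, 1, 20, 14, 0), datetime(2024, 1, 20, 14, 8), True),
    ]
    print(uptime_stats(example, window_end=datetime(2024, 1, 31)))
```

The reboot and crash counts are the values a monitor threshold ("more than x") would compare against, while the percentage is what would go into a report or SLA summary.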


This works great. We had a client wanting uptime reports, so we've implemented this and extract the data into reports via BrightGauge, which looks really good.

Just have one question Gav. I've got one server that's showing 22 reboots in a 30-day period, as it has a scheduled task to reboot each night (previous IT's solution to a problem, don't ask :p), but its downtime only shows 18 mins, and this server takes longer than that for a single reboot. Can you explain how the downtime figure is calculated? My PowerShell knowledge is unfortunately not good enough to work it out from your script :)

cheers


I've got an issue with the crash logging on this, related to how the date formats come out in the crash section. I'm in Australia and we use the dd/mm/yyyy date format in most cases, but the script seems to show most dates in mm/dd/yyyy format, EXCEPT for the one that comes out of the replacementstrings command, which comes out in our local format. As a result, the downtime for crashes gets calculated completely wrong.
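
To illustrate why mixed formats break the calculation, here is a small Python sketch. The timestamps and the names `LOCAL_FMT`, `US_FMT` and `parse_stamp` are made up for the example, and it is not taken from the script itself. It only shows that the same string gives a different downtime depending on which format the parser assumes.

```python
from datetime import datetime

# Assumed formats for illustration: local (dd/mm/yyyy) versus US (mm/dd/yyyy).
LOCAL_FMT = "%d/%m/%Y %H:%M:%S"
US_FMT = "%m/%d/%Y %H:%M:%S"


def parse_stamp(text, fmt):
    """Parse a timestamp with an explicit format instead of guessing."""
    return datetime.strptime(text, fmt)


# Hypothetical values: both strings are meant to be 3 April 2024.
crash_text = "03/04/2024 22:15:00"  # written in the local dd/mm/yyyy format
boot_text = "04/03/2024 22:27:00"   # written in mm/dd/yyyy format

# Each string parsed with the format it was actually written in.
right = parse_stamp(boot_text, US_FMT) - parse_stamp(crash_text, LOCAL_FMT)

# Crash string wrongly read as mm/dd/yyyy, i.e. as 4 March.
wrong = parse_stamp(boot_text, US_FMT) - parse_stamp(crash_text, US_FMT)

print("correct downtime:", right)
print("downtime with the wrong format assumed:", wrong)
```

In PowerShell the usual way to avoid this kind of ambiguity is to parse with an explicit format, for example `[datetime]::ParseExact` with a format string and culture, rather than relying on the machine's regional settings. Whether that is the right fix for this particular script is a guess on my part.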