I work at a medium-sized company (100+ employees). An issue that keeps cropping up is network performance, internet access in particular.

We have 70 or more computers, a mix of Mac OS X and Windows XP and 7 machines. We have several servers (Exchange server, PC file servers, MS SQL, BlackBerry, FTP, Mac server, etc.). There are four main switches, a SonicWall firewall, and probably a couple of routers in the server room, with a dozen or so more scattered around the building.

The network structure has grown organically over a number of years and, as far as I know, there really isn't a monitoring solution in place. When we experience network issues (slow connections, dropped packets, and so on), our general solution is to power cycle some hardware or go around to each employee and ask whether they are uploading or downloading any large files.

This is really inefficient and time-consuming, and it does not let us monitor the network or tackle potential problems proactively.
I would like to find a solution that would allow me to monitor network usage company-wide in real time, with detail going down to the individual computer, ideally.

Given the hodgepodge of equipment and operating systems, what would be the best way to set up some kind of monitoring solution? Hardware, software, restructuring our network architecture?

5 Answers

The first step is to monitor everything. I suggest using Cacti or Zabbix to collect SNMP information from your devices so you can find out exactly who is using how much.
After that you can set up alerts based on usage or problems (Zabbix can do this itself; with Cacti you would install an extra tool such as Nagios).
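As a rough illustration of what these tools do with SNMP data: they poll an interface's octet counter (for example `ifInOctets` from the standard interface MIB) at a fixed interval and turn the difference into a rate. The sketch below assumes you already have two counter samples; the function names are made up for this example and are not part of Cacti, Zabbix, or Nagios.

```python
def rate_bps(prev_octets, curr_octets, interval_s, counter_bits=32):
    """Average bits per second between two SNMP octet-counter samples.

    counter_bits is 32 for ifInOctets/ifOutOctets and 64 for the
    high-capacity ifHCInOctets/ifHCOutOctets counters.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    modulus = 2 ** counter_bits
    # Counters wrap back to zero at 2**counter_bits. Taking the difference
    # modulo that value recovers the delta, assuming at most one wrap
    # happened between the two samples.
    delta_octets = (curr_octets - prev_octets) % modulus
    return delta_octets * 8 / interval_s


def utilization_pct(bps, link_speed_bps):
    """Share of the link's nominal speed in use, as a percentage."""
    return 100.0 * bps / link_speed_bps


if __name__ == "__main__":
    # Hypothetical samples taken 300 seconds apart on a 100 Mbit/s port.
    bps = rate_bps(1_200_000_000, 1_950_000_000, 300)
    print(f"{bps / 1e6:.1f} Mbit/s, {utilization_pct(bps, 100_000_000):.1f}% used")
```

On a fast link a 32-bit counter can wrap more than once between polls, which is why the 64-bit counters are usually preferred where the device supports them.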

Once all that is in place, you can think about ways to segment your network, optimize it, buy new hardware, and so on.

I recommend the time-tested and proven combination of Nagios and MRTG: Nagios for monitoring and alerting, and MRTG for longer-term monitoring, which often shows trends that might otherwise go unnoticed. There are alternatives, but I've found these two can do everything the others can while being easier to set up and configure, although that might just be personal preference.

The SonicWall already in place can also provide very basic internet usage reporting immediately: top bandwidth usage by IP and top bandwidth usage by port/protocol. ViewPoint is an add-on for more in-depth reporting. This won't solve the bigger issue of end-to-end network performance, but it can provide some information on the internet performance issue in the meantime.

If you don't want to spend the time to install, configure and maintain a premise-based solution, you might consider LogicMonitor's hosted solution.

It provides remote monitoring and alerting for web/mail/app servers, switches, routers, firewalls, storage, VMs...out of the box. It's pre-configured, so you don't have to know what metrics to monitor or how to write/edit the configurations.

The single-pane view makes it easier to troubleshoot versus bouncing between point solutions that don't talk to each other.

LogicMonitor takes literally minutes to set up and lets you see useful data from your devices in as little as 10 minutes.

One small agent (20 MB) needs to be installed on any Windows or Linux machine on your network (it does not need to be a dedicated server).

Configuration is then a matter of providing the IP address of each device you want monitored; the agent will automatically start collecting various performance metrics using SNMP, WMI, JDBC, JMX, the NetApp/Amazon/VMware APIs, and other data collection methods.

Data is stored for up to a year, and is accessible from any browser.

Alert thresholds come pre-configured for most critical parameters based on years of experience running datacenters.

LogicMonitor is delivered as Software-as-a-Service, so there is no hardware, backup, patching, or maintenance to handle.

It costs about $12/month per device (the price goes down with volume). If you're managing a lot of gear, the time saved through automation and the intelligence gained make it well worth the investment.

+1 for LogicMonitor. We're only using the trial currently but I can only describe it as "newrelic for sysadmins". Sure, many tools monitor and graph your appliance stats, but few if any do it without a bunch of manual configs and tinkering. We had graphs up in literally minutes and on top of all that you have the ability to create dashboards on the fly. The ease of implementation and maintenance is key here especially for a small team like ours (and yours Kyle Lowry).
– Jim, Dec 21 '11 at 16:15

If you are looking for interface-level traffic, then a simple SNMP solution like Nagios will fit the bill. Just monitor your router or SonicWall interfaces and look for busy ones.

If you also want to keep an eye on user activity in your network (who is downloading that big file or running a torrent), you need more than SNMP. You need to monitor flows or even raw packets.

You can try ntop, which is a good package if you can get it working well with MySQL or RRD files.

You can also try Trisul Network Metering and Forensics, a software tool that provides all of this information. (Disclaimer: I work there.) It is completely free if you only need real-time data plus the most recent 3-day window, which, from your question, should suit you fine. You can monitor both long-term and current usage per user.

With any solution, you need to identify vantage points in your network, such as the firewall or the router/switch in front of your server rack. If your equipment supports NetFlow or sFlow, use that; otherwise, port mirror your choke points and feed the data to ntop or Trisul.
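To make the difference from SNMP concrete: a flow record tells you which hosts were talking, not just how busy a port was. Here is a minimal sketch of the "top talkers" calculation that flow tools perform, assuming the records have already been exported and parsed into `(source IP, destination IP, bytes)` tuples. The function name, the sample records, and the `192.168.0.0/16` LAN range are all assumptions for illustration, not the ntop or Trisul API.

```python
import ipaddress
from collections import Counter


def top_talkers(flows, local_net="192.168.0.0/16", n=5):
    """Return the n local hosts with the most bytes sent plus received.

    flows is an iterable of (src_ip, dst_ip, byte_count) tuples.
    """
    net = ipaddress.ip_network(local_net)
    totals = Counter()
    for src, dst, nbytes in flows:
        # Credit the bytes to whichever endpoint(s) sit inside the LAN,
        # so uploads and downloads both count against the local machine.
        for addr in (src, dst):
            if ipaddress.ip_address(addr) in net:
                totals[addr] += nbytes
    return totals.most_common(n)


if __name__ == "__main__":
    # Hypothetical flow records for illustration only.
    sample = [
        ("192.168.1.10", "8.8.8.8", 1_000),
        ("8.8.8.8", "192.168.1.10", 5_000),
        ("192.168.1.20", "1.1.1.1", 2_000),
    ]
    for host, nbytes in top_talkers(sample):
        print(host, nbytes)
```

Traffic between two local hosts is credited to both of them, which is usually what you want when hunting for the machine saturating a link.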