I was taking a personal day to work on the honey-do list when it happened: I received an urgent call from my boss, the CFO. This is typical of critical issues in my enterprise; they usually occur when I'm on vacation or otherwise away from the office.

“We have no phones and no Internet! We can’t get to anything!” he exclaimed. OK, easy enough. The ISP must have been totally down. (We run all our T1s and PRI through one carrier.) My boss continued, “The accounting system won’t respond and SharePoint won’t open. Oh, and it turns out some people can make phone calls after all.” This was not an ISP issue. It was much worse…

I was able to connect remotely to our firewall just fine, so our WAN connection wasn't down. At this point I began thinking about the servers. Since there was network connectivity but no phones (mostly), no Internet, and no servers, there had to be a DNS component, which could mean both of my DNS servers were down. I pinged one of the DNS servers from the firewall: spotty at best. The server wasn't down; it was just intermittently responsive. I then pinged another server with similar results. I proceeded to ping a phone switch and my workstation with even worse results: almost no replies.

My usual IT support fill-in, with the unofficial title of "Jr. IT," was out of the office as well (of course he was!), so I had the CFO take a look in the Server/Network Room. As this was the first time he had ever been in the room, it was mostly an exercise in futility. Servers were on, phone switches were on, network switches were on, and yet I couldn't make it past our firewall.

I told the CFO that I believed we were under attack from inside our own office. Somebody or something was carrying out a DoS attack that had successfully brought down our entire network. This also explained the phones: most of ours were VoIP, but there were still a few analog lines, which is why some people could make calls after all.

I started questioning the CFO about nefarious people roaming the office in black coats and sunglasses, but eventually accepted that I would have to drive into the office. By the time I made it in, it was after 5 p.m., so most people had left for the day, including my boss (not that there was much anyone could do anyway).

The network switches confirmed my suspicion. Every connected port showed a solid LED: the links were completely saturated with network traffic. I proceeded to disconnect one switch at a time until I found the source switch, then worked my way through each active port on it. Gotcha! The conference room.

With caution, I headed to the conference room, half expecting some type of black-hat hacker device set up in the corner. And sure enough, there it was: an Ethernet cable plugged into a port on the conference room table, with the other end plugged into... another Ethernet port. Underneath the table, all of the table's Ethernet ports connected to a standard hub. Plugging the hub into itself created a network loop, and because a hub blindly repeats every frame out of every port, each broadcast frame re-entered the loop and was retransmitted endlessly. The resulting broadcast storm flooded the entire network.

The senior manager was nice enough to come clean (there were witnesses). Obviously, a smarter switch would likely have averted this particular situation, but the incident opened my eyes to just how easily an "open" network segment can be brought down. Afterward, I configured my switches to rate-limit broadcast traffic, and I have learned that deploying 802.1x and VLANs would be good steps toward avoiding this and other, even more serious, situations altogether.
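For anyone wondering what that rate limiting looks like in practice, here is a minimal sketch using Cisco IOS storm control (chosen only because Catalyst gear comes up in the comments below; the interface range and the 5% threshold are illustrative assumptions you would tune to your own traffic baseline):

    ! Example access ports; adjust the range for your switch
    interface range GigabitEthernet1/0/1 - 48
     ! Suppress broadcast traffic above 5% of port bandwidth (placeholder value)
     storm-control broadcast level 5.00
     ! Optionally err-disable the offending port instead of just dropping the excess
     storm-control action shutdown

Without the shutdown action, the switch simply drops broadcast traffic above the threshold, which keeps a loop from saturating the rest of the network while you track down the offending port.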

This is the 154th article in the Spotlight on IT series. If you'd be interested in writing an article for the series, PM Lee to get started.

We have an issue somewhat like that, but caused by a single PC: once it locks up, every PC on the switch immediately loses its network connection. The PCs don't recognize any Ethernet connection until they're completely restarted. We're still trying to figure out whether it's being initiated by software, a glitch in the driver, or a hiccup on the switch. These PCs are used constantly, so we don't get much time to troubleshoot them.

My outlook: we'd rather enjoy ease of use than reliability. Soon everything important will be connected to the Internet one way or another, creating endless targets for those who would cripple our devices from a distance. *CoughCoughSuperBowlLightingCoughCough*


Amen to that! It should be published. I have seen this happen more than a couple of times in my nearly 30 years in IT. A bad NIC can also mess up a network, but I haven't had that happen for several years now.

When I worked for a university's IT department, we had one of the Computer Science lab admins accidentally do something similar. Unfortunately for us, he was too arrogant to admit that his lab created the problem, even after we told him exactly which port was at fault. So we ended up marching over there and yanking out the offending patch cable ourselves.


I had the same loop problem too, and recently a faulty switch with two printers on it at the far end of the network took down the whole network. It's a pain in the rear to troubleshoot when there are multiple buildings with daisy-chained switches and hubs. Got to love not-for-profit private K-12 education.

As an NPO with low budgets in our smaller offices, and therefore less feature-rich network equipment, this situation isn't at all uncommon for us. Even worse, the VoIP phones we deploy have two network ports: one to patch out to a computer and one for network connectivity (essentially a 2-port switch). About once a month, someone plugs both ports into the network, causing a loop and a subsequent broadcast storm.
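If any of those offices have switches that support BPDU guard, one way to have a looped phone port shut itself down and then come back on its own looks roughly like this in Cisco IOS (a sketch only, assuming the phone's internal switch forwards BPDUs; the interface range and the 300-second recovery interval are placeholders):

    ! Treat the phone/PC access ports as edge ports and guard them
    interface range GigabitEthernet0/1 - 24
     spanning-tree portfast
     ! Err-disable the port if any BPDU (including our own, looped back) arrives
     spanning-tree bpduguard enable
    !
    ! Re-enable err-disabled ports automatically after 5 minutes
    errdisable recovery cause bpduguard
    errdisable recovery interval 300

The automatic recovery saves a trip to the wiring closet each time, since the port comes back by itself once someone unplugs the offending cable.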

This happened to one of our offices. We weren't able to locate the source, but we eventually resolved the issue. To this day it's a mystery, though after replacing the switches we did notice that a specific model of machine was generating a high number of collisions while asleep.

Heh, I have to ask... what brand of switches, and how old are they? It's damn near impossible to make the newer models of switches take a broadcast storm. I know, because I recently tried to recreate one on my home network. I had to disable the failsafes in order for it to take. :)

I have been fearing this for some time. To be safe, I have been disabling inactive ports in the hope of preventing a loop, but I fear that one day I will miss one because my process is entirely manual. I have two questions:
1. Will my 3750-Xs automatically prevent such loops out of the box, or do I have to enable a setting?
2. What if someone loops ports on two different VLANs? Will the switch detect and prevent that as well? 802.1x is my next project.


Surprised nobody has mentioned it so far, but spanning tree and its associated commands are your friend. If you're not using spanning tree, you need to get on that ASAP; it would have stopped this before it happened. If you are using spanning tree with portfast, I suggest you add BPDU guard. Basically, when a BPDU sent out of one port is received on another, the switch error-disables the port.
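To make that concrete (and to answer the 3750-X question above), here is a minimal sketch of the global form of portfast plus BPDU guard in Cisco IOS; the exact command spelling varies a bit between IOS releases, so verify it on your platform:

    ! Treat all non-trunking access ports as edge (portfast) ports...
    spanning-tree portfast default
    ! ...and err-disable any of them that ever receives a BPDU
    spanning-tree portfast bpduguard default

Even a hub looped into itself behind a single port gets caught this way, because the switch's own BPDU comes straight back in on the port that sent it.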