Hosting customers stranded as generators in NY data centers run out of fuel.

Flooding and power outages caused by Hurricane Sandy have forced several New York data centers to switch to generator power. But those generators are quickly running out of fuel, so data center companies are telling their customers to shut down their servers and move workloads elsewhere.

One of the worst situations is at 75 Broad Street in Manhattan, where both Internap and Peer1 Hosting are shutting down operations "after basement-level flooding disabled critical diesel fuel pumps," Data Center Knowledge reports. 75 Broad Street is part of the "Zone A" portion of the city that is under emergency evacuation orders, as is another data center operated by Datagram at 33 Whitehall Street. The Datagram outage led to downtime for popular websites Gawker, Huffington Post, and BuzzFeed.

Peer1's official network status page reported last night that it was running on emergency generator power. This morning, the company said "we have an estimate of 4 hours for the fuel left on our generators. Our techs and facility are continuously working to get emergency fuel delivery on time and was looking to set-up a temporary tank and pump since the basement is still flooded. In the event of not receiving the fuel on time, worst case scenario is we will have to gracefully shutdown the facility."

The worst case scenario has apparently occurred, as the latest update says, "We are going to implement a controlled shutdown of NY Data Center at 10:45 ET." (UPDATE: Peer1 reported good news just before 12:30pm ET. "The New York facility is still on generator power, sustaining longer than initially estimated," Peer1 said. "We will have the latest update on the remaining fuel available, along with the arrival of fuel replacement shortly.")

Internap is reporting much the same scenario. In a note to customers made public on Pastebin, Internap said, "The flooding has submerged and destroyed the site's diesel pumps and is preventing fuel from being pumped to the generators on the mezzanine level. The available fuel reserves on the mezzanine level are estimated to support customer loads for approximately 5-7 hours. Once this fuel supply has been exhausted the generator will no longer be able to sustain operation and critical customer power loads will be lost. The building itself is being evacuated and no remote hands support will be available to assist in any equipment shutdown."

Internap advised its self-service customers to shut down their servers immediately and is having its customer support team execute a "graceful" shutdown of servers for managed customers. Internap's cloud services are also being shut down, the company said. We've asked Internap for an update and will report back if we get one. But according to iT News, Internap sent customers a follow-up e-mail this morning that said, "Available fuel reserves on the mezzanine level are estimated to be nearly depleted and able to support customer loads for less than 2 hours. Once this fuel supply has been exhausted the generator will no longer be able to sustain operation and critical customer power loads will be lost."

UPDATE: Internap is reportedly out of fuel and offline, but it is trying to get more fuel to the building. As of 12:55pm ET, the Internap network operations center hotline was playing a recorded message that said the facility is "currently without power due to flooding" and that co-location and IP customers can expect "widespread outages." Later in the day, the company posted a blog entry saying the 75 Broad Street facility was still without power but that it was trying to get its generator farm back up and running. "It is unclear how long it will take ConEd to restore utility power to the site, but we are preparing for the possibility of remaining on generator power for many days," Internap said. In addition to running out of fuel in secondary tanks, Internap said the flooding damaged "both our redundant fuel pumps and our generator fuel tank." Internap is coordinating fuel deliveries and pumps, and said it will have engineers "fabricate pipe to bring the fuel directly to the generators on the mezzanine level."

In addition to damage caused by flooding, New York power company Con Edison said it preemptively shut off electricity in part of Lower Manhattan last night, and reported today that substation damage and downed wires cut off power to many customers. Con Edison called it "the largest storm-related outage in our history."

As mentioned, Gawker, Huffington Post, and BuzzFeed have suffered downtime as a result of flooding at Datagram's data center at 33 Whitehall Street. As of this writing, the main Gawker sites are still offline (stripped-down versions are reachable at live.gawker.com), while the Huffington Post and BuzzFeed have gotten themselves back online.

Further outages included Steadfast's hosting facility at 121 Varick Street in New York City, due to "an auxiliary electrical failure," and Init7 at 111 8th Avenue, due to a data center power outage. Init7 operates IP backbone services, and as a result of the outage the company said to expect "possible routing issues from/to the United States." As it turns out, Internap also has servers at the 8th Avenue address, but they are running on generator power and have enough fuel for several days.

It's not as if data centers didn't take any precautions. In advance of the storm, Data Center Knowledge reported that "data center providers in New York, Philadelphia and the Washington, DC area said they are testing and fueling up their emergency backup generators, preparing to maintain services during any utility power outages caused by the hurricane." The cloud storage provider Nirvanix allowed customers to move data out of its data center in New Jersey for free. And cloud service providers such as Amazon are closely monitoring data centers on the East Coast, stocking up on generator fuel, and having extra staff on hand.

As various non-storm-related outages at Amazon have shown, customers relying on hosting providers and cloud services may want to build systems that can fail over across multiple regions. Ultimately, when a data center is in the wrong spot at the wrong time, even the most extensive preparations may not be enough to stay online in the face of a storm like Hurricane Sandy.
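
As a rough illustration of that kind of cross-region failover, here is a minimal client-side sketch in Python. The region endpoints, health-check paths, and timeout are hypothetical placeholders, not any provider's actual API:

```python
# Minimal client-side failover sketch: try a primary region first, then fall
# back to a secondary one if the primary is unreachable. URLs are hypothetical.
import urllib.request
import urllib.error

REGION_ENDPOINTS = [
    "https://us-east.example.com/health",     # primary region (hypothetical)
    "https://us-central.example.com/health",  # secondary region (hypothetical)
]

def first_healthy_endpoint(endpoints, timeout=3):
    """Return the first endpoint that answers its health check, or None."""
    for url in endpoints:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:
                    return url
        except (urllib.error.URLError, OSError):
            continue  # region unreachable, e.g. the data center is offline
    return None

if __name__ == "__main__":
    active = first_healthy_endpoint(REGION_ENDPOINTS)
    print("Routing traffic to:", active or "no region available")
```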

Wasn't generator location identified as a problem during the Fukushima disaster? Why put the generators (or their fuel) in a location that will be compromised by one of the potential problems you need the generators for? It seems like they could have moved fuel/generators to a better location. Now, there may have been building codes that prevented moving fuel out of the basement, but shouldn't that be part of the planning decisions on running a data center?

Fuel isn't so easy to store above ground, especially in crowded urban environments. But there are still precautions they could have taken to waterproof their fuel storage and pumps so that they could still retrieve fuel when flooded.

"At approximately 6am EST today, our LGA6, LGA8 and LGA9 facilities successfully restored power to our core infrastructure. At that time all services came back on line and are operating normally. Please note that at this time the facility is continuing to run on generators until such time as Con Ed can restore power. We have an ample supply of diesel onsite with contracts to refuel as needed. As of this time we have no reason to believe that stability going forward will be effected. If you have issues or questions regarding your infrastructure... We appreciate your patience and understanding as we work through this crisis situation"

Having less than 24 hours of diesel onsite is extremely poor planning.

Yes and no. You plan for the 99%, not the 1%. Storing large quantities of fuel for an industrial-sized generator is neither easy nor cheap. In almost all situations, you're able to replenish supplies during an extended outage.

But massive flooding and damage from a 100-year storm are not the sort of things that it's reasonable to plan for. It's just not cost-effective (well, beyond the obvious: as others have said, a basement may be the best place to store fuel, but it's also going to be the first place hit by flood water).

That being said, the *customers* of these data centers have their own choices to make in terms of peering and redundancy. I think any business that relies on its internet presence should have a contingency in place to swap between distributed data locations, or be partnered with a provider that can do it for them.

Wasn't generator location identified as a problem during the Fukushima disaster? Why put the generators (or their fuel) in a location that will be compromised by one of the potential problems you need the generators for? It seems like they could have moved fuel/generators to a better location. Now, there may have been building codes that prevented moving fuel out of the basement, but shouldn't that be part of the planning decisions on running a data center?

Lessons hard-learned over and over again. It took unprecedented flooding in Houston — a city as flood-prone as any — to discover vulnerable basement facilities at data centers, utilities, public safety facilities, and hospitals. In many cases, only one piece was affected… the fuel pump at street level, the fuel tank in the basement, the cutover switch in a basement utility closet, a load panel on the first floor… or a DMARC in the alley, whatever. Or in some cases, everything was high and dry, but personnel access was obstructed by flooding. Or the building was inaccessible altogether.

It seems to be a combination of factors and a sequence of decisions. Building code says this, delivery company says that, installation contractor does this, IT department does that… and of course, budget budget budget. Unfortunately, it takes a large-scale disaster to bring all the interested parties to the table and design to avoid such points of failure. In the meantime, it might seem obvious that the fuel tanks shouldn't go in the basement, but for most of the people in the decision chain, "it's not my problem." (Until of course, it is.)

"At approximately 6am EST today, our LGA6, LGA8 and LGA9 facilities successfully restored power to our core infrastructure. At that time all services came back on line and are operating normally. Please note that at this time the facility is continuing to run on generators until such time as Con Ed can restore power. We have an ample supply of diesel onsite with contracts to refuel as needed. As of this time we have no reason to believe that stability going forward will be effected. If you have issues or questions regarding your infrastructure... We appreciate your patience and understanding as we work through this crisis situation"

Having less than 24 hours of diesel onsite is extremely poor planning.

I dunno… in facilities where I've worked with diesel backup, the approach varies according to a matrix of likely disasters, likely fuel-service disruptions, minimum runtime needed (with load shedding in stages), and maximum storage lifetime. The chief limiting factor is that you can't store diesel indefinitely, so you balance the cost of writing off fuel every six to ten months against the requirement to be at full readiness at all times. You ramp up reserves as a storm approaches, which takes time and is subject to provider supply and carrier availability.

Sure, your point is valid in a hand-wavy, dismissive sort of way. But the planning is pretty complicated, and getting it right is expensive. That means only the most critical infrastructure can afford to do it right.

Which means mye-blogserviceonline.com or whatever probably isn't going to spend the time and money it takes to guarantee local DC uptime in the event of a >24h outage, and they're probably spending more resources on developing the product than on distributing the service among multiple regional DCs.
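
The fuel-planning trade-off described in the comment above can be made concrete with a back-of-the-envelope sketch. The burn rate, tank size, and safety margin below are illustrative assumptions, not figures from any facility mentioned in the article:

```python
# Back-of-the-envelope generator fuel planning. All numbers are illustrative.
def runtime_hours(usable_gallons, burn_rate_gph):
    """Hours of generator runtime given usable fuel and burn rate at load."""
    return usable_gallons / burn_rate_gph

def gallons_to_stock(target_hours, burn_rate_gph, reserve_margin=0.2):
    """Fuel to keep on hand for a target runtime, padded by a safety margin."""
    return target_hours * burn_rate_gph * (1 + reserve_margin)

# Example: a facility whose generators burn roughly 140 gallons per hour at load.
print(runtime_hours(3000, 140))    # ~21 hours from a 3,000-gallon tank
print(gallons_to_stock(72, 140))   # ~12,100 gallons to ride out three days
```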

Having less than 24 hours of diesel onsite is extremely poor planning.

It's not the quantity of fuel, it's getting the fuel to the generators that's the problem. Internap's facility at 111 8th has enough fuel to last five days, and it's still online. The problem at 75 Broad Street is that the fuel pumps have been flooded, so they have plenty of fuel in the basement but no way to get it up to the generators.

The poor planning started with putting a data center hosting critical services in an urban area adjacent to the ocean. It's a big country, not to mention planet, and there are a plethora of locations that are inherently 'safer' on any scale than NYC, Philly, Atlantic City, Miami, etc.

The poor planning started with putting a data center hosting critical services in an urban area adjacent to the ocean. It's a big country, not to mention planet, and there are a plethora of locations that are inherently 'safer' on any scale than NYC, Philly, Atlantic City, Miami, etc.

The latest big data centers from the likes of Apple and Amazon have gone up along the East or West Coast. Other than tax incentives, I imagine they want to be closer to international links. I am sure there are plenty of locations in the Midwest where the worst that can happen is flooding (most bigger cities are next to rivers). They would not have the earthquakes of the West Coast or the hurricanes of the East.

Having less than 24 hours of diesel onsite is extremely poor planning.

Yes and no. You plan for the 99%, not the 1%. Storing large quantities of fuel for an industrial-sized generator is neither easy nor cheap. In almost all situations, you're able to replenish supplies during an extended outage.

But massive flooding and damage from a 100-year storm are not the sort of things that it's reasonable to plan for. It's just not cost-effective (well, beyond the obvious: as others have said, a basement may be the best place to store fuel, but it's also going to be the first place hit by flood water).

That being said, the *customers* of these data centers have their own choices to make in terms of peering and redundancy. I think any business that relies on its internet presence should have a contingency in place to swap between distributed data locations, or be partnered with a provider that can do it for them.

Isn't this the third straight year that a 100-year storm has shut down power for extended periods of time across large areas of the East Coast?

I'm surprised at some of the sites listed not having remote backups for catastrophic events.

As a general rule (outside of truly massive, integrated providers like Amazon), this is best left to customers. It is really worth emphasizing what cateye said:

cateye wrote:

C Boy wrote:

Having less than 24 hours of diesel onsite is extremely poor planning.

Yes and no. You plan for the 99%, not the 1%. Storing large quantities of fuel for an industrial-sized generator is neither easy nor cheap. In almost all situations, you're able to replenish supplies during an extended outage.[...]That being said, the *customers* of these data centers have their own choices to make in terms of peering and redundancy. I think any business that relies on its internet presence should have a contingency in place to swap between distributed data locations, or be partnered with a provider that can do it for them.

That has always been the real answer if uptime is critical: distributed geographic redundancy. It gets exponentially harder and more expensive to hit additional 9s of uptime at any single data center. Emergent failure from rarer and rarer situations is harder to test for, unpredictable extreme events get harder to plan for, etc. The "unknown unknowns" become more numerous. However, while any single data center might face trouble, the odds of two entirely different data centers from different providers hundreds or thousands of miles apart suffering the exact same failure at the exact same time are immensely lower. Even though it's a bit more than twice as expensive (mirroring plus some extra management overhead), that can still be simultaneously cheaper and more effective than trying to get from 99.99 to 99.999. It's really up to customers to determine their own needs and then design appropriately.
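
The redundancy arithmetic behind that argument can be sketched quickly. The availability figures below are illustrative and assume the two sites fail independently, which is exactly what geographic separation is meant to buy:

```python
# Rough availability arithmetic for two independent sites. Figures are illustrative.
HOURS_PER_YEAR = 8766

def combined_availability(a1, a2):
    """Probability that at least one of two independently failing sites is up."""
    return 1 - (1 - a1) * (1 - a2)

single = 0.999                                # "three nines" per site
pair = combined_availability(single, single)  # ~0.999999, "six nines"
print(f"one site:  {(1 - single) * HOURS_PER_YEAR:.2f} hours of downtime/year")
print(f"two sites: {(1 - pair) * HOURS_PER_YEAR:.4f} hours of downtime/year")
```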

The poor planning started with putting a data center hosting critical services in an urban area adjacent to the ocean. It's a big country, not to mention planet, and there are a plethora of locations that are inherently 'safer' on any scale than NYC, Philly, Atlantic City, Miami, etc.

The latest big data centers from the likes of Apple and Amazon have gone up along the East or West Coast. Other than tax incentives, I imagine they want to be closer to international links. I am sure there are plenty of locations in the Midwest where the worst that can happen is flooding (most bigger cities are next to rivers). They would not have the earthquakes of the West Coast or the hurricanes of the East.

In the case of NYC, it was probably also important to be close to the customer base.

The poor planning started with putting a data center hosting critical services in an urban area adjacent to the ocean. It's a big country, not to mention planet, and there are a plethora of locations that are inherently 'safer' on any scale than NYC, Philly, Atlantic City, Miami, etc.

The latest big data centers from the likes of Apple and Amazon have gone up along the East or West Coast. Other than tax incentives, I imagine they want to be closer to international links. I am sure there are plenty of locations in the Midwest where the worst that can happen is flooding (most bigger cities are next to rivers). They would not have the earthquakes of the West Coast or the hurricanes of the East.

Obviously Indianapolis and the surrounding areas need to become a major new hub of secondary data centers.

Wasn't generator location identified as a problem during the Fukushima disaster? Why put the generators (or their fuel) in a location that will be compromised by one of the potential problems you need the generators for? It seems like they could have moved fuel/generators to a better location. Now, there may have been building codes that prevented moving fuel out of the basement, but shouldn't that be part of the planning decisions on running a data center?

Umm, because even if a blog, web app, or corporate cloud goes offline for hours or days, nobody dies (or loses billions, for the cynics)?

It's a bit like asking why, despite all the IT literature, the desktop you use to browse the web and play Civ doesn't have a redundant power supply and hot-swappable RAID 10.

I'm pretty sure you can find far more resilient data center options on the market. If customers who are suffering an outage didn't opt for them, it's either because their services don't need that availability or because they are morons. Either way it's not the data center's fault, unless it was falsely advertising itself.

BTW, requirements for availability escalate fast. The article mentions that some of the affected providers have had their personnel evacuated. Now, they could've invested in nuke-proof emergency generators, but if nobody can be on-site in case of an emergency, that's still a pretty risky scenario. So they should've invested in plans to keep staff on-site even in a major disaster, and on and on...

I'm surprised at some of the sites listed not having remote backups for catastrophic events.

As a general rule (outside of truly massive, integrated providers like Amazon), this is best left to customers. It is really worth emphasizing what cateye said:

cateye wrote:

C Boy wrote:

Having less than 24 hours of diesel onsite is extremely poor planning.

Yes and no. You plan for the 99%, not the 1%. Storing large quantities of fuel for an industrial-sized generator is neither easy nor cheap. In almost all situations, you're able to replenish supplies during an extended outage.[...]That being said, the *customers* of these data centers have their own choices to make in terms of peering and redundancy. I think any business that relies on its internet presence should have a contingency in place to swap between distributed data locations, or be partnered with a provider that can do it for them.

That has always been the real answer if uptime is critical: distributed geographic redundancy. It gets exponentially harder and more expensive to hit additional 9s of uptime at any single data center. Emergent failure from rarer and rarer situations is harder to test for, unpredictable extreme events get harder to plan for, etc. The "unknown unknowns" become more numerous. However, while any single data center might face trouble, the odds of two entirely different data centers from different providers hundreds or thousands of miles apart suffering the exact same failure at the exact same time are immensely lower. Even though it's a bit more than twice as expensive (mirroring plus some extra management overhead), that can still be simultaneously cheaper and more effective than trying to get from 99.99 to 99.999. It's really up to customers to determine their own needs and then design appropriately.

That's why I said "sites," not "data center." A data center is responsible for that data center. Sites hosted there are responsible for ensuring backup plans in case their primary data center goes down.

Yea... except those are all on top of the Ring of Fire, which should be disqualified from the start. Yellowstone may take everything out, but when that happens, keeping data hosting up will be the least of our concerns. Good places to put data centers: most of Arizona, New Mexico, and Utah; also, eastern Nevada.

I would argue that it is undesirable to try to 'Sandy-proof' everything. What happens if a random tornado or earthquake strikes? It quickly becomes a 'xyz should have planned better' fest of improbabilities. It doesn't mean you should totally ignore the possibility of something like this happening, but instead of attempting to disaster-proof, I believe it's more logical to have resolution plans and procedures for when problems do occur. It sounds like all of the services listed here do (rapidly contracting for new fuel, for example).

City infrastructures are robust against localized failures, but fragile when compared to 200+ year natural-disaster outliers. And unless there's a history of flooding in the area, I don't blame people for not taking the time and money to plan for a 10-15 ft storm surge in NYC/NJ.

In the case of NYC, it was probably also important to be close to the customer base.

It's also significantly easier to hire large numbers of skilled workers in large cities than it is in the middle of nowhere.

If you start making data center jobs in other places, people will move to those places. Besides, I'm not saying you shouldn't have a presence in major seaboard metros: just that you shouldn't be dependent on those locations as your primary site.