Combating Thermal Issues with Next-Gen Tools

Todd Schneider leads the new product team at Electrorack, where he has focused on thermal issues in the data center, including rack and aisle-level cooling technologies. He has 15 years of engineering experience in the design and development of enclosure solutions for data centers.

TODD SCHNEIDER
Electrorack

Today’s facilities are dealing with increased heat loads and power densities like never before, a result of faster processor speeds in smaller form factors and the accelerated switch to blade servers. Typical rack loads have gone from 1-2kW per rack just a few years ago to 6-10kW per rack, with high-density areas seeing 15-25kW per rack.

The industry is simply outgrowing the hot/cold aisle approach, and while there is some new construction going on, not every facility can add more power or cooling infrastructure. This means that data center managers must get better efficiencies out of their existing facilities.

At the crux of the issue is heat. Advances in technology may bring increased processing capacities, but those same advances are now driving IT professionals to rethink their strategies when it comes to combating thermal issues in the data center. According to the Uptime Institute, server compute performance has been increasing by a factor of three every two years, while energy efficiency is only doubling in the same period.

The resulting problem: increased thermal issues as data centers continue to play catch-up. These issues have not gone unnoticed by industry leaders and those in government concerned with the current capacity of our existing utilities. Organizations and agencies focused on creating new solutions include The Green Grid (an industry group focused on data center energy efficiency), the EPA (Environmental Protection Agency), 7x24 Exchange, Uptime Institute, AFCOM and ASHRAE TC9.9. The latter is working with server manufacturers to improve hardware energy efficiencies and raise the thresholds of the acceptable operating environments.

Metrics Emerge
Additionally, the industry has put together metrics to determine the efficiency of facilities. Power Usage Effectiveness (PUE), total facility power divided by IT equipment power, is a ratio now being monitored at even the least sophisticated of facilities. Another metric is Data Center Infrastructure Efficiency (DCiE), calculated by dividing IT equipment power by total facility power. It is the inverse of PUE and is expressed as a percentage.
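The two ratios can be sketched in a few lines of Python. This is a minimal illustration of the definitions given here, not a metering tool; the function names and the 1,000 kW / 500 kW facility figures are hypothetical.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def dcie_percent(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Data Center Infrastructure Efficiency, as a percentage (inverse of PUE)."""
    if total_facility_kw <= 0:
        raise ValueError("Total facility power must be positive")
    return 100.0 * it_equipment_kw / total_facility_kw


# Hypothetical facility: 1,000 kW drawn in total, 500 kW reaching IT equipment.
example_pue = pue(1000.0, 500.0)            # 2.0
example_dcie = dcie_percent(1000.0, 500.0)  # 50.0 (percent)
```

A lower PUE and a higher DCiE describe the same improvement: more of the incoming power is reaching the IT equipment.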

Lastly, the government is working on an EPA Energy Star rating for data centers based on total source energy divided by UPS energy. The total source energy covers all energy coming into the building, including electricity, natural gas, diesel, etc. While the EPA was originally going to create its own metric, Energy Usage Effectiveness (EUE), it has decided to stay (for now) with the industry-preferred PUE, which it states can be used for power or energy.

With the life of equipment at stake, and the cost of downtime exponential - not to mention the billions of dollars of equipment deployed across thousands of data centers in the United States alone - the drive towards greater efficiencies has never been more critical.

The Future Is Green
The push to go green is being driven not only at the board level, to show corporate responsibility to the environment, but also by proposed legislation such as cap and trade, which makes it an economic necessity to run data centers more efficiently. Green initiatives can actually serve data centers with a fast payback and a substantial ROI as they work to maximize their existing cooling infrastructure, invest in more efficient equipment, and implement the tools discussed in this article.

The complexity of the issues facing today’s data center managers has driven the need for rack suppliers to offer tools and solutions that will effectively integrate and improve the overall efficiency of the space. Moving beyond the traditional scope of just storing equipment, next-generation rack vendors offer solutions concerning power, cabling, and thermal management. This has moved the rack supplier to the front end of the design process vs. being brought in after all the other systems are designed.

Driving down costs in data centers through a focused effort on increasing efficiencies is centered on dealing with thermal issues. Whether IT professionals are looking at new build-outs, or maximizing efficiencies within existing facilities, new, innovative solutions have surfaced in the last few years to help data centers combat heat. What follows is by no means a comprehensive list of solutions, but rather represents some of the more popular solutions data centers are turning to in their efforts to achieve greater efficiencies.

Air-side Solutions
Traditional data centers utilize an air-side solution for cooling, with perimeter CRAC units in either a raised-floor or a slab environment. This set-up is familiar to most data center managers, and is often preferred as they have worked for many years to try to keep water out of the data center. Therefore, it is necessary to have solutions that can be added to existing systems to improve efficiencies. Recent advances in air-side technologies have increased the amount of cooling that can be done with air to upwards of 30kW per rack, which has also improved their suitability for new greenfield designs.

The separation of cold air supply from hot exhaust is key. ASHRAE TC9.9 allows supply air temperatures at the face of the rack of up to around 80 degrees Fahrenheit. Delivering air at elevated temperatures and returning air to the units at higher temperatures improves the efficiency of CRAC units, resulting in increased capacities. It also greatly increases the number of free-cooling days that can be achieved in a given climate, reducing the overall energy consumption required to operate a data center.

The mixing of hot and cold air within a data center exposes IT equipment to higher inlet temperatures, especially at the top of the rack, which can lead to equipment failures. In many uncontained data centers where there is no hot and cold air separation, CRAC units and other in-house cooling systems operate inefficiently as they over-supply an aisle with cool air to keep equipment from failing. This over-supply also results in cooler air temperatures going back to the CRAC, which then causes inefficient CRAC operation.

There is debate about whether to contain the hot or cold aisle. According to a recent searchdatacenter.com survey, cold aisle containment was slightly more common. Either way, the things to look for in aisle containment are: flexibility to retrofit to existing installed racks; ease of installation; translucent panels that allow light to pass through; sliding vs. swinging doors to minimize required footprint (no door swing); and ceiling requirements – always consult local codes to determine fire suppression requirements before installing a containment system.

Passive Ducted Containment
Passive ducted cooling systems consist of a duct placed on top of each rack that extends up to either a ceiling plenum or a ducted return. The system is designed to remove hot exhaust air from each rack and route it directly back to the CRAC unit, eliminating recirculation and hot spots in the room.

These systems also allow for higher Delta Ts between the data center cooling air and hot exhaust air, and they have no moving parts and need no additional power or plumbing. It should be pointed out, however, that studies have shown that passive ducted solutions can cause increased pressure build-up in the rack. This can cause leakage of hot air into the space, which can lead to recirculation, causing CRAC units to oversupply to maintain suitable temperatures. Since the system relies on the server fans to push the air through the duct, the server fans must work harder to overcome the resistance, which can increase server power consumption well above normal published bench-rated power numbers.

This increase in server fan speed and CRAC fan speed for additional supply air volume can be significant, as fan power is proportional to the cube of the fan speed. The hidden energy consumed by this type of system should be taken into account alongside initial investment costs.
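The cube relationship is easy to underestimate, so a short worked sketch may help. It assumes the idealized fan affinity law (power scales with the cube of speed); real fans deviate somewhat, and the function name and speed ratios are hypothetical.

```python
def relative_fan_power(speed_ratio: float) -> float:
    """Fan power relative to baseline for a given speed ratio (new / baseline).

    Assumes the idealized affinity law: power is proportional to speed cubed.
    """
    if speed_ratio < 0:
        raise ValueError("Speed ratio cannot be negative")
    return speed_ratio ** 3


# A fan forced to run 10% faster draws roughly 33% more power.
power_at_110_pct = round(relative_fan_power(1.10), 3)  # 1.331

# A fan allowed to run 20% slower draws roughly half the power.
power_at_80_pct = round(relative_fan_power(0.80), 3)   # 0.512
```

The same arithmetic explains why modest reductions in fan speed, discussed in the sections that follow, can yield large energy savings.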

Active Ducted Containment
Active systems are similar in design to the passive systems except instead of relying on the server fans to move the air, these systems utilize variable-speed fans placed in the overhead ductwork. Of the various solutions on the market, some rely on temperature to control the fan speed while others rely on pressure, which is the best way to determine the volume of air required through the rack. These pressure-based systems keep the rack at a “zero pressure” state, or slightly negative pressure, to match the volume of the server fans. This prevents leakage and automatically adjusts based on equipment adds or changes.
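As a rough illustration of pressure-based control, the sketch below nudges duct fan speed up when rack pressure is positive (exhaust backing up) and down when it is negative. It is a simple proportional step under assumed values, not any vendor's control algorithm; the gain, speed limits, and pressure readings are all hypothetical.

```python
def next_fan_speed(
    current_speed_pct: float,
    rack_pressure_pa: float,
    target_pa: float = 0.0,        # "zero pressure" set point (assumed)
    gain_pct_per_pa: float = 2.0,  # hypothetical proportional gain
    min_pct: float = 20.0,         # hypothetical lower speed limit
    max_pct: float = 100.0,
) -> float:
    """One proportional control step for a duct fan.

    Positive rack pressure means the duct fan is moving less air than the
    servers exhaust, so speed rises; negative pressure lowers it.
    """
    error = rack_pressure_pa - target_pa
    proposed = current_speed_pct + gain_pct_per_pa * error
    # Clamp to the fan's allowed operating range.
    return max(min_pct, min(max_pct, proposed))


# Hypothetical readings in pascals relative to the room.
speed_after_buildup = next_fan_speed(50.0, 5.0)    # 60.0
speed_after_overdraw = next_fan_speed(50.0, -3.0)  # 44.0
```

Because the loop reacts to pressure rather than temperature, adding or removing servers changes the measured pressure and the fan speed follows without reconfiguration.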

These rack-level, high-density heat containment (HDHC) systems are becoming a popular solution for data center managers. Scalable to 30kW, this type of system gives users the ability to determine cooling capacity at the rack level and empowers IT professionals to make informed decisions about equipment placement in the rack.

A networking module is typically included with these systems to provide IP access to data center managers. Embedded software includes a graphical user interface to easily monitor cooling capacity, temperature, humidity and service life of each fan module. User-defined SNMP traps and alarm thresholds trigger email alerts when levels fall outside of the desired parameters.

HDHC systems also allow for unity cooling, aggregating the total airflow requirement of each rack into a total and then only delivering the volume of air (CFM) from the CRAC units that is required to cool the overall load. This can prevent over-cooling and reduce fan energy significantly.
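The aggregation step can be sketched as follows. The conversion from heat load to airflow uses the common sensible-heat approximation for air near sea level (CFM ≈ 3.16 × watts / ΔT in °F), which is an assumption brought in for illustration and not a figure from this article; the rack loads and the 20°F temperature rise are hypothetical.

```python
def required_cfm(load_watts: float, delta_t_f: float) -> float:
    """Approximate airflow needed to carry a heat load at a given air
    temperature rise, using CFM ~= 3.16 * W / dT(F) for standard air."""
    if delta_t_f <= 0:
        raise ValueError("Temperature rise must be positive")
    return 3.16 * load_watts / delta_t_f


def total_airflow_cfm(rack_loads_kw: list[float], delta_t_f: float = 20.0) -> float:
    """Sum the airflow requirement of every rack into one supply target."""
    return sum(required_cfm(kw * 1000.0, delta_t_f) for kw in rack_loads_kw)


# Hypothetical row of three racks at 5, 10 and 15 kW with a 20 F rise:
# about 4,740 CFM in total.
row_cfm = total_airflow_cfm([5.0, 10.0, 15.0])
```

Supplying only this aggregate volume, rather than a fixed worst-case volume per aisle, is what allows the CRAC fans to slow down.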

Water-side or Refrigerant Cooling
Water has re-emerged from the mainframe days and is slowly becoming more accepted back into the data center because of its tremendous capacity as a cooling medium. There are several solutions on the market today that utilize water. Most of these are considered close-coupled (cooling next to the source of the heat) solutions, and many are also available using refrigerant as the medium. The most significant advantage of refrigerant is that, unlike water, it becomes a harmless gas if it leaks. However, it can be expensive.

Rear-door Heat Exchangers
Another rack-level solution that has gained popularity recently is the Rear-door Heat Exchanger (RDX). These cabinet-based cooling solutions come in both active and passive models, neutralize heat at the source, and can cool equipment at the rack level up to 30kW. The concept with these doors is to remove the heat and return the air to the room at the supply air temperature. This prevents recirculation of the hot exhaust air. And, because these units affix directly to the back of a rack, no rearrangement of the enclosures is necessary when they are incorporated into the data center.

With RDX solutions, Coolant Distribution Units (CDUs) create a secondary chilled water loop and adjust the flows with redundant pumps so the units continually operate above the dew point, preventing condensation. The water is distributed from the CDU to the coils either via special leak-proof hoses or a fabricated manifold.
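To make the dew point constraint concrete, the sketch below estimates dew point from room temperature and relative humidity using the Magnus approximation, then checks whether a given loop water temperature stays above it. The Magnus formula and its coefficients are a standard approximation introduced here for illustration; the 2°C safety margin, the function names, and the room conditions are hypothetical.

```python
import math


def dew_point_c(air_temp_c: float, relative_humidity_pct: float) -> float:
    """Estimate dew point (C) with the Magnus approximation."""
    if not 0 < relative_humidity_pct <= 100:
        raise ValueError("Relative humidity must be in (0, 100]")
    a, b = 17.62, 243.12  # Magnus coefficients for water vapor over liquid
    gamma = math.log(relative_humidity_pct / 100.0) + a * air_temp_c / (b + air_temp_c)
    return b * gamma / (a - gamma)


def loop_is_condensation_safe(
    water_temp_c: float,
    air_temp_c: float,
    relative_humidity_pct: float,
    margin_c: float = 2.0,  # hypothetical safety margin
) -> bool:
    """True if the loop water is at least margin_c above the room dew point."""
    return water_temp_c >= dew_point_c(air_temp_c, relative_humidity_pct) + margin_c


# Hypothetical room at 25 C and 50% RH: dew point is a little under 14 C.
room_dew_point = dew_point_c(25.0, 50.0)
safe_at_16c = loop_is_condensation_safe(16.0, 25.0, 50.0)  # True
safe_at_12c = loop_is_condensation_safe(12.0, 25.0, 50.0)  # False
```

In this example a 16°C secondary loop clears the dew point with margin, while 12°C water would risk condensation on the coil.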

Passive systems utilize a low-resistance coil and rely on the servers to move the air across the coil. In active systems, the airflow is controlled via variable-speed fans driven by a thermal sensor network that continually monitors the temperature of the cabinet and automatically adjusts the airflow as needed to deliver optimal thermal performance. These systems are generally effective if there is building chilled water present in the facility and the load at the rack is greater than 5kW.

In-row Cooling
In-row cooling units are essentially mini CRAC units that reside in the row of racks, between equipment cabinets. These close-coupled units are designed to draw hot exhaust air directly from the hot aisle, cool it and distribute it back into the cold aisle at the desired supply air temperature.

One popular configuration utilizes a manifold that ties directly into the building chilled water loop and distributes the chilled water out to each in-row unit. Each in-row unit captures heat directly from the hot aisle and distributes cool air, ensuring that equipment temperatures are constantly held to set point conditions. End users can monitor IT equipment inlet temperatures to modulate capacity based on the demand for cooling. Fan speed can be controlled based on the demand to reduce energy consumption during off-peak cooling periods.

In-row cooling solutions often still require aisle containment or containment at the back of the rack to prevent mixing of the hot air back into the space, and can become expensive based on redundancy requirements. Because of the amount of air moving through these units, it is often very windy to walk down the cold aisle in these systems.

Additionally, make sure the system you install operates above the dew point temperature and doesn’t require condensate drains at every unit. In-row units can achieve up to 37kW of cooling.

Although data center managers have a lot to consider in their quest to improve their PUE rating and overall efficiency, one thing is certain: improving efficiencies on a massive scale is no longer an option; it’s a requirement. And selecting vendors that understand the available options, and that offer a portfolio of solutions to meet your specific needs, is a critical component of any successful data center operation.