Review of Cooling Technologies For Computer Products
IEEE TRANSACTIONS ON DEVICE AND MATERIALS RELIABILITY, VOL. 4, NO. 4, DECEMBER 2004
Invited Paper
Abstract—This paper provides a broad review of the cooling technologies for computer products from desktop computers to large servers. For many years cooling technology has played a key role in enabling and facilitating the packaging and performance improvements in each new generation of computers. The role of internal and external thermal resistance in module level cooling is discussed in terms of heat removal from chips and modules, and examples are cited. The use of air-cooled heat sinks and liquid-cooled cold plates to improve module cooling is addressed. Immersion cooling as a scheme to accommodate high heat flux at the chip level is also discussed. Cooling at the system level is discussed in terms of air, hybrid, liquid, and refrigeration-cooled systems. The growing problem of data center thermal management is also considered. The paper concludes with a discussion of future challenges related to computer cooling technology.
Index Terms—Air cooling, data center cooling, flow boiling, heat sink, immersion cooling, impingement cooling, liquid cooling, pool boiling, refrigeration cooling, system cooling, thermal, thermal management, water cooling.
I. INTRODUCTION
Electronic devices and equipment now permeate virtually every aspect of our daily life. Among the most ubiquitous of these is the electronic computer, varying in size from the handheld personal digital assistant to large scale mainframes or servers. In many instances a computer is embedded within some other device controlling its function and is not even recognizable as such. The applications of computers vary from games for entertainment to highly complex systems supporting vital health, economic, scientific, and military activities. In a growing number of applications computer failure results in a major disruption of vital services and can even have life-threatening consequences. As a result, efforts to improve the reliability of electronic computers are as important as efforts to improve their speed and storage capacity.
Since the development of the first electronic digital computers in the 1940s, the effective removal of heat has played a key role in ensuring the reliable operation of successive generations of computers. The Electronic Numerical Integrator and Computer (ENIAC), dedicated in 1946, has been described as a “30 ton, boxcar-sized machine requiring an array of industrial cooling fans to remove the 140 kW dissipated from its 18 000 vacuum tubes” [1]. Following ENIAC, most early digital computers used vacuum-tube electronics and were cooled with forced air.
The invention of the transistor by Bardeen, Brattain, and Shockley at Bell Laboratories in 1947 [2] foreshadowed the development of generations of computers yet to come. As a replacement for vacuum tubes, the miniature transistor generated less heat, was much more reliable, and promised lower production costs. For a while it was thought that the use of transistors would greatly reduce if not totally eliminate cooling concerns. This thought was short-lived as packaging engineers worked to improve computer speed and storage capacity by packaging more and more transistors on printed circuit boards, and then on ceramic substrates.
The trend toward higher packaging densities dramatically gained momentum with the invention of the integrated circuit separately by Kilby at Texas Instruments and Noyce at Fairchild Semiconductor in 1959 [2]. During the 1960s, small scale and then medium scale integration (SSI and MSI) led from one device per chip to hundreds of devices per chip. The trend continued through the 1970s with the development of large scale integration (LSI) technologies offering hundreds to thousands of devices per chip, and then through the 1980s with the development of very large scale integration (VLSI) technologies offering thousands to tens of thousands of devices per chip. This trend continued with the introduction of the microprocessor and continues to this day with chip makers projecting that a microprocessor chip with a billion or more transistors will be a reality before 2010.
In many instances the trend toward higher circuit packaging density has been accompanied by increased power dissipation per circuit to provide reductions in circuit delay (i.e., increased speed). The need to further increase packaging density and reduce signal delay between communicating circuits led to the development of multichip modules, which began in the late 1970s and continues today. An example of the effect that these trends have had on module heat flux in high-end computers is shown in Fig. 1. As can be seen, heat flux associated with bipolar circuit technologies steadily increased from the very beginning and rose sharply in the 1980s. There was a brief respite with the transition to CMOS circuit technologies in the 1990s, but the demand for increased packaging density and performance reasserted itself and heat flux is again increasing at a challenging rate.
Manuscript received August 30, 2004.
The authors are with the IBM Corporation, Poughkeepsie, NY 12601 USA (e-mail: [email protected]).
Digital Object Identifier 10.1109/TDMR.2004.840855
1530-4388/04$20.00 © 2004 IEEE
CHU et al.: REVIEW OF COOLING TECHNOLOGIES FOR COMPUTER PRODUCTS 569
[5]. The last generation TCM incorporated a copper piston (the original piston was aluminum) with a cylindrical center section and a slight taper on each end to minimize the gap between piston and cap while retaining intimate contact between the piston face and the chip [6]. Additionally, the volume inside the module was filled with a PAO (polyalphaolefin) oil instead of helium to reduce the piston-to-cap and chip-to-piston thermal resistances. Hitachi packaged a similar conduction scheme in their M-880 [7] and MP5800 [8] processors. Instead of a cylindrical piston Hitachi utilized an interdigitated microfin structure (Fig. 5).
In the 1990s when IBM made the switch from bipolar to CMOS circuit technology [10] the conduction cooling approach was simplified and reduced in cost by adopting a “flat plate” conduction approach as shown in Fig. 6. The thermal path from chip to cap is provided by a controlled thickness (e.g., 0.10 mm to 0.18 mm) of a thermally conductive paste. This was possible largely due to improved planarity of the substrate, better control of dimensional tolerances and enhanced thermal conductivity of the paste.
Fig. 6. Cross-sectional view of central processor module package with thermal paste path to module cap [9].
As time went on, chip power levels continued to increase. In addition, concentrated areas of high heat flux 2 to 3 times the average chip heat flux, referred to as hot spots, emerged. To meet internal thermal resistance requirements, in 2001 IBM chose to attach a high-grade silicon carbide (SiC) spreader to the chip with an adhesive thermal interface (ATI) and then use a more conventional thermal paste between the spreader and the cap [10]. This configuration is shown in Fig. 7.
Fig. 7. MCM cross-section showing heat spreader adhesively attached to chip (adapted from [10]).
The adhesive thermal interface (ATI), while not as thermally conductive as the thermal paste, could be applied much thinner resulting in a lower thermal resistance. SiC was chosen for the spreader material for its unique combination of high thermal conductivity and low coefficient of thermal expansion (CTE). The CTE of the SiC closely matches that of the silicon chip thus avoiding stress fracturing the interface when the module heats up during use. The thermal resistance of this package arrangement is lower than just using thermal paste between chip and cap because of the use of the lower thermal resistance ATI on the smaller chip area. The thermal paste thermal resistance is mitigated by applying it over a much larger area.
B. External Module Cooling
Cooling external to the module serves as the primary means to effectively transfer the heat generated within the module to the system environment. This is accomplished primarily by attaching a heat sink to the module. Traditionally, and preferably, the system environment of choice has been air because of its ease of implementation, low cost, and transparency to the end user or customer. This section, therefore, will focus on air-cooled heat sinks. Liquid-cooled heat sinks, typically referred to as cold plates, will also be discussed.
1) Air-Cooled Heat Sinks: A typical air-cooled heat sink is shown in Fig. 8. The heat sink is constructed of a base region that is in contact with the module to be cooled. Fins protruding from the base serve to extend surface area for heat transfer to the air. Heat is conducted through the base, up into the fins and then transferred to the air flowing in the spaces between the fins by convection. The spacing between fins can run continuously in one direction in the case of a straight fin heat sink or they can run in two directions in the case of a pin fin heat sink (Fig. 9). Air flow can either be through the heat sink laterally (in cross flow) or can impinge from the top as seen in Fig. 10.
The thermal performance of the heat sink is a function of many variables. Geometric variables include the thickness and plan area of the base plus the fin thickness, height, and spacing. The principal material variable is thermal conductivity. Also factored in is volumetric air flow and pressure drop. Many optimization studies have been conducted to minimize the external thermal resistance for a particular set of application conditions [11]–[13]. However, over time, as greater and greater thermal performance has been required, fin heights and fin number have increased while fin spacing has been decreased. Additionally, heat sinks have migrated in construction from all aluminum
C. Immersion Cooling
Immersion cooling has been of interest as a possible method
to cool high heat flux components for many years. Unlike the
water-cooled cold plate approaches which utilize physical walls
to separate the coolant from the chips, immersion cooling brings
the coolant in direct physical contact with the chips. As a result,
most of the contributors to internal thermal resistance are elim-
inated, except for the thermal conduction resistance from the
device junctions to the surface of the chip in contact with the
liquid.
Direct liquid immersion cooling offers a high heat transfer co-
efficient which reduces the temperature rise of the heated chip
surface above the liquid coolant temperature. The magnitude
of the heat transfer coefficient depends upon the thermophys-
ical properties of the coolant and the mode of convective heat
transfer employed. The modes of heat transfer associated with
liquid immersion cooling are generally classified as natural con-
vection, forced convection, and boiling. Forced convection in-
cludes liquid jet impingement in the single phase regime and
boiling (including pool boiling, flow boiling, and spray cooling)
in the two-phase regime. An example of the broad range of heat
flux that can be accommodated with the different modes and
forms of direct liquid immersion cooling is shown in Fig. 11
[22].
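The relationship between heat flux, heat transfer coefficient, and surface temperature rise described above follows Newton's law of cooling, q'' = h (T_surface - T_liquid). The short Python sketch below illustrates it. The function name, the chip heat flux, and the three heat transfer coefficients are hypothetical placeholders chosen only to show orders of magnitude; they are not values taken from Fig. 11 or from [22].

```python
def surface_temperature_rise(heat_flux_w_per_cm2, h_w_per_cm2_k):
    """Temperature rise (K) of the chip surface above the liquid coolant.

    Newton's law of cooling: q'' = h * (T_surface - T_liquid),
    so the rise is q'' / h.
    """
    if h_w_per_cm2_k <= 0:
        raise ValueError("heat transfer coefficient must be positive")
    return heat_flux_w_per_cm2 / h_w_per_cm2_k


# Assumed, illustrative heat transfer coefficients in W/(cm^2*K).
# They are placeholders, not data from the paper.
ASSUMED_H = {
    "natural convection": 0.01,
    "forced convection": 0.1,
    "boiling": 1.0,
}

if __name__ == "__main__":
    heat_flux = 10.0  # W/cm^2, hypothetical chip heat flux
    for mode, h in ASSUMED_H.items():
        rise = surface_temperature_rise(heat_flux, h)
        print(f"{mode}: {rise:.0f} K above the liquid temperature")
```

Under these assumed values, a tenfold increase in the heat transfer coefficient gives a tenfold reduction in surface temperature rise for the same heat flux, which is why the mode of heat transfer largely sets the heat flux a given coolant can accommodate.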
Fig. 11. Heat flux ranges for direct liquid immersion cooling of microelectronic chips [22].
Selection of a liquid for direct immersion cooling cannot be made on the basis of heat transfer characteristics alone. Chemical compatibility of the coolant with the chips and other packaging materials exposed to the liquid is an essential consideration. There may be several coolants that can provide adequate cooling, but only a few will be chemically compatible. Water is an example of a liquid which has very desirable heat transfer properties, but which is generally undesirable for direct immersion cooling because of its chemical and electrical characteristics. Alternatively, fluorocarbon liquids (e.g., FC-72, FC-86, FC-77, etc.) are generally considered to be the most suitable liquids for direct immersion cooling, in spite of their poorer thermophysical properties [22], [23].
1) Natural and Forced Liquid Convection: As in the case of
air cooling, liquid natural convection is a heat transfer process
in which mixing and fluid motion is induced by differences in
coolant density caused by heat transferred to the coolant. As
shown in Fig. 11, this mode of heat transfer offers the lowest
heat flux or cooling capability for a given wall superheat or
surface-to-liquid temperature difference. Nonetheless, the heat
transfer rates attainable with liquid natural convection can ex-
ceed those attainable with forced convection of air.
Higher heat transfer rates may be attained by utilizing a pump to provide forced circulation of the liquid coolant over the chip or module surfaces. This process is termed forced convection and the allowable heat flux for a given surface-to-liquid temperature difference can be increased by increasing the velocity of the liquid over the heated surface. The price to be paid for the increased cooling performance will be a higher pressure drop. This can mean a larger pump and higher system operating pressures. Although forced convection requires the use of a pump and the associated piping, it offers the opportunity to remove heat from high power chips and modules in a confined space. The liquid coolant may then be used to transport the heat to a remote heat exchanger to reject the heat to air or water.
Fig. 12. Forced convection thermal resistance results for simulated 12.7 mm × 12.7 mm microelectronic chips (adapted from [24]).
Experimental studies were conducted by Incropera and Ramadhyani [24] to study liquid forced convection heat transfer from simulated microelectronic chips. Tests were performed with water and dielectric liquids (FC-77 and FC-72) flowing over bare heat sources and heat sources with pin-fin and finned pin extended surface enhancement. It can be seen in Fig. 12 that, depending upon surface and flow conditions (i.e., Reynolds number), thermal resistance values obtained for the fluorocarbon liquids ranged from 0.4 to 20 °C/W. It may be noted that a thermal resistance on the order of 0.5 °C/W could
C. Liquid-Cooling Systems
Both the air-to-water heat exchangers in a hybrid air–water-cooled system and the water-cooled cold plates in a conduction-cooled system rely upon a controlled source of water in terms of pressure, flow rate, temperature, and chemistry. In order to ensure the physical integrity, performance, and long-term reliability of the cooling system, customer water is usually not run directly through the water-carrying components in electronic frames. This is because of the great variability that can exist in the quality of water available at computer installations throughout the world. Instead, a pumping and heat exchange unit, sometimes called a coolant distribution unit (CDU), is used to control and distribute system cooling water to computer electronics frames as shown in Fig. 20. The primary closed loop (i.e., system) is used to circulate cooling water to and from the electronics frames. The system heat load is transferred to the secondary loop (i.e., customer water) via a water-to-water heat exchanger in the CDU. Within an electronics frame a combination of parallel-series flow networks is used to distribute water flow to individual cold plates and heat exchangers. An example of the piping configuration used to distribute water to cold plates mounted on multichip modules in the IBM 3081 processor is shown in Fig. 21.
Fig. 20. Large scale computer configuration of the 1980s with coolant distribution unit (CDU).
Fig. 21. Modular cold plate subsystem and water distribution loops in the IBM 3081 processor frame.
As shown in Fig. 22, the basic flow and heat exchange components within a CDU consist of a heat exchanger, flow mixing valve, pumps, expansion tank, and water supply/return manifolds. Water flow in the primary loop is provided at a fixed flow rate by a single operating pump, with a stand-by pump to provide uninterrupted operation if the operating pump fails. The temperature of the water in the primary loop is controlled by using a mixing valve to regulate the fraction of the flow allowed to pass through the water-to-water heat exchanger and forcing the remainder to bypass the heat exchanger.
Fig. 22. Flow schematic of a typical IBM coolant distribution unit (CDU).
A CDU is also required for direct immersion cooling systems such as used in the CRAY-2 discussed earlier. In this application the CDU performs a similar role to that in water-cooled systems and segregates the chemical coolant (e.g., FC-77) from the customer water as shown in Fig. 23. Of course, all the materials within the CDU, as well as the piping distribution system, must be chemically compatible with the coolant. In addition, because of the relatively high vapor pressure of the coolants suitable for direct immersion applications (e.g., fluorocarbons), the cooling system must be both “vapor-tight” and “liquid-tight” to ensure against any loss of the relatively expensive coolant.
D. Refrigeration Cooled Systems
The potential for enhancement of computer performance by operating at lower temperatures was recognized as long
Fig. 23. Cray-2 liquid immersion cooling system.
Fig. 24. Relative performance factors (with respect to a 100 °C value) of 1.5-V CMOS circuits as a function of temperature. Threshold voltages are adjusted differently with temperature in each of the three scenarios shown (adapted from [43]).
IBM’s reliability and life expectancy specifications and handle a cooling load of 250 W at 77 K. A Stirling cycle type refrigerator was chosen as the only practical refrigeration method for obtaining liquid nitrogen temperatures. Prototype models were built with cooling capacities of 500 and 250 W at 77 K. In addition, a packaging scheme had to be developed that would withstand cycling from room temperature down to 77 K and provide thermal insulation to reduce the parasitic heat losses. A low-temperature conduction module (LTCM) was built to package the chip and module. The LTCM, or cryostat, consisted of a stainless steel housing with a vacuum to minimize heat losses. This hardware was used to measure chip performance at 77 K. As a result of this effort, prototype Stirling cycle cryocoolers in a form factor compatible with overall system packaging constraints were built and successfully tested and key elements of the packaging concept were demonstrated.
IBM’s most recent interest in refrigeration-cooling focused on the application of conventional vapor compression refrigeration technology to operate below room temperature conditions, but well above cryogenic temperatures. In 1997, IBM developed, built and shipped its first refrigeration-cooled server (the S/390 G4 system) [50], [51]. This cooling scheme provided an average processor temperature of 40 °C, which represented a temperature decrease of 35 °C below that of a comparable air-cooled system. The system packaging layout is shown in Fig. 26. Below the bulk power compartment is the central electronic complex (CEC) where the MCM housing 12 processors is located. Two modular refrigeration units (MRUs) located near the middle of the frame provide cooling via the evaporator attached to the back of the processor module. Only one MRU is operated at a time during normal operation. The evaporator mounted on the processor module is fully redundant with two independent refrigerated passages. Refrigerant passing through one passage is adequate to cool the MCM, which dissipates a maximum power of 1050 W. Following the success of this machine IBM has continued to exploit the advantages of sub-ambient cooling at the high-end of its zSeries product line.
In 1999, Fujitsu released its Global Server GS8900 that utilized a refrigeration unit to chill a secondary coolant and then supply the coolant to liquid-cooled Central Processor Unit (CPU) MCMs [52]. A schematic of the liquid-cooled system is shown in Fig. 27. The refrigeration unit, which is called the chilled coolant supply unit (CCSU), contains three air-cooled refrigeration modules and two liquid circulating pumps. The refrigeration modules chill the coolant to near 0 °C. The system board assembly housing the CPU modules is accommodated in a closed box in which the dew point is controlled in order to prevent condensation from forming on the electrical equipment. In comparison to an air-cooled version of this system, circuit junction temperatures are reduced by more than 50 °C.
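The need for dew point control around sub-ambient hardware can be illustrated with a short calculation. The Python sketch below estimates the dew point of the surrounding air with the Magnus approximation and checks whether a cold surface would collect condensate. The Magnus constants are a commonly quoted parameter set and the room conditions are assumed for illustration; neither is taken from the paper or from [52], and the function names are hypothetical.

```python
import math

# Magnus approximation constants for water vapor over liquid water.
# A commonly quoted parameter set, assumed here (not from the paper).
MAGNUS_A = 17.62
MAGNUS_B = 243.12  # degrees Celsius


def dew_point_c(air_temp_c, relative_humidity_pct):
    """Approximate dew point (deg C) from air temperature and relative humidity."""
    if not 0.0 < relative_humidity_pct <= 100.0:
        raise ValueError("relative humidity must be in (0, 100]")
    gamma = (math.log(relative_humidity_pct / 100.0)
             + MAGNUS_A * air_temp_c / (MAGNUS_B + air_temp_c))
    return MAGNUS_B * gamma / (MAGNUS_A - gamma)


def condensation_risk(surface_temp_c, air_temp_c, relative_humidity_pct):
    """True if the surface is at or below the dew point of the surrounding air."""
    return surface_temp_c <= dew_point_c(air_temp_c, relative_humidity_pct)


if __name__ == "__main__":
    # Hypothetical room air at 25 deg C and 50% RH around a 0 deg C surface.
    print(f"dew point: {dew_point_c(25.0, 50.0):.1f} deg C")
    print("condensation expected:", condensation_risk(0.0, 25.0, 50.0))
```

Under these assumed room conditions the dew point is well above 0 °C, so a surface chilled to near 0 °C would condense moisture unless the air around it is dried, which is consistent with housing the chilled assembly in an enclosure with a controlled dew point.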
Fig. 26. IBM S390 G4 server with refrigeration-cooled processor module and redundant modular refrigeration units (MRUs).
Fig. 27. Configuration of Fujitsu’s GS8900 low-temperature liquid cooling system (adapted from [52]).
IV. DATA CENTER THERMAL MANAGEMENT
Due to technology compaction, the information technology (IT) industry has seen a large decrease in the floor space required to achieve a constant quantity of computing and storage capability. However, the energy efficiency of the equipment has not improved at the same rate. This has resulted in a significant increase in power density and heat dissipation within the footprint of computer and telecommunications hardware. The heat dissipated in these systems is exhausted to the room and the room has to be maintained at acceptable temperatures for reliable operation of the equipment. Cooling computer and telecommunications equipment rooms is becoming a major challenge.
The increasing heat load of datacom equipment has been documented by a thermal management consortium of 17 companies and published in collaboration with the Uptime Institute [53] as shown in Fig. 28. Also shown in this figure are measured heat fluxes (based on product footprint) of some recent product announcements. The most recent shows a rack dissipating 28 500 W, resulting in a heat flux based on the footprint of the rack of 20 900 W/m². With these heat loads the focus for customers of such equipment is on providing adequate air flow at a temperature that meets the manufacturer’s requirements. Of course, this is a very complex problem considering the dynamics of a data center and one that is only starting to be addressed [54]–[61]. There are many opportunities for improving the thermal environment of data centers and the
580 IEEE TRANSACTIONS ON DEVICE AND MATERIALS RELIABILITY, VOL. 4, NO. 4, DECEMBER 2004
need to deploy equipment quickly in order to get maximum use of a large financial asset. This may mean that minimal time is spent on site preparation, thereby potentially resulting in thermal issues once the equipment is installed.
The construction cost of a data center is now exceeding $1000 per square foot in some metropolitan areas and the annual operating cost is $50 to $150 per square foot. For these reasons, IT and facilities managers want to obtain the most out of their data center space and maximize the utilization of their infrastructure. Unfortunately, the current situation in many data centers does not permit this optimization. The equipment installed into a data center can be from many different manufacturers, each having a different environmental specification. With these requirements the IT facilities manager is required to overcool the data center to compensate for the equipment with the tightest requirements.
C. Need for Thermal Guidelines
Since many of the data center thermal management issues are industry-wide, a number of equipment manufacturers decided to form a consortium in 1998 to address common issues related to thermal management of data centers and telecommunications rooms. Initial interest was expressed from the following companies: Amdahl, Cisco Systems, Compaq, Cray, Inc., Dell Computer, EMC, HP, IBM, Intel, Lucent Technologies, Motorola, Nokia, Nortel Networks, Sun Microsystems, and Unisys. As a result the Thermal Management Consortium for Data Centers and Telecommunications Rooms was formed. Since the industry was facing increasing power trends, it was decided that the first priority was to develop and then publish (in collaboration with Uptime Institute) a trend chart on power density of the industry’s equipment that would aid customers in planning data centers for the future (see Fig. 28).
In January 2002, the American Society of Heating, Refrigerating and Air Conditioning Engineers (ASHRAE) was approached with a proposal to create an independent committee to specifically address high-density electronic heat loads. The proposal was accepted by ASHRAE and eventually a technical committee, TC9.9 Mission Critical Facilities, Technology Spaces, and Electronic Equipment, was formed. The first priority of TC9.9 was to create a Thermal Guidelines document that would help to align the designs of equipment manufacturers and help data center facility designers to create efficient and fault tolerant operation within the data center. The resulting document, Thermal Guidelines for Data Processing Environments, was published in January 2004 [73]. Some of the key issues of that document will now be described.
For data centers, the primary thermal management focus is on assuring that the housed equipment’s temperature and humidity requirements are met. Each manufacturer has its own environmental specification and a customer of many types of electronic equipment is faced with a wide variety of environmental specifications. In an effort to standardize, the ASHRAE TC9.9 committee first surveyed the environmental specifications of a number of data processing equipment manufacturers. From this survey, four classes were identified that would encompass most of the specifications. Also included within the guidelines was a comparison to the NEBS (Network Equipment Building Systems) specifications for the telecommunications industry to show both the differences and also aid in possible convergence of the specifications in the future. The four data processing classes cover the entire environmental range from air conditioned, server and storage environments of classes 1 and 2 to the lesser controlled environments like class 3 for workstations, PCs and portables or class 4 for point of sales equipment with virtually no environmental control.
In order for seamless integration between the server and the data center to occur, certain protocols need to be developed, especially in the area of airflow. This section provides airflow guidelines for both the IT/Facility managers and the equipment manufacturers to design systems that are compatible and minimize inefficiencies. Currently, manufacturers design their equipment exhaust and inlets wherever it is convenient from an architectural standpoint. As a result, there have been many cases where the inlet of one server is directly next to the exhaust of adjacent equipment, resulting in the ingestion of hot air. This has direct consequences to the reliability of that machine. This guideline attempts to steer manufacturers toward a common airflow scheme to prevent this hot air ingestion by specifying regions for inlets and exhausts. The guideline recommends one of the three airflow configurations: front-to-rear, front-to-top and front-to-top-and-rear.
Once manufacturers start implementing the equipment protocol, it will become easier for facility managers to optimize their layouts to provide maximum possible density by following the hot-aisle/cold-aisle concept as shown in Fig. 30. In other words, the front face of all equipment is always facing the cold aisle.
The ASHRAE guideline’s heat and airflow reporting section defines what information is to be reported by the information technology equipment manufacturer to assist the data center planner in the thermal management of the data center. The equipment heat release value is the key parameter that is reported. In addition, several other pieces of information are required if the heat release values are to be meaningful, such as total system air flow rate, typical configurations of system, air flow direction of system, and class environment, just to mention a few.
Other publications will follow on data center thermal management with one planned for January 2005 that will update the initial trend chart and will discuss air cooling and water cooling in the context of the data center. However, to aid in the advancement of data center thermal management it is of utmost importance to understand the current situation in high density data centers in order to build on this understanding to further enhance the thermal environment in data centers. In this effort Schmidt [74] published the first paper of its kind to completely thermally profile a high density data center. The motivation for the paper was twofold. First, the paper provided some basic information on the thermal/flow data collected from a high density data center. Second, it provided a methodology which others can follow in collecting thermal and air flow data from data centers so that data can be assimilated to make comparisons. This database can then provide the basis for future data center air cooling design and aid in the understanding of deployment of racks of higher heat loads in the future. This data needs to be further expanded so that data center design and optimization from an air-cooled viewpoint can occur.
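To show why the heat release value and the total system air flow rate are reported together, the following Python sketch relates the two through a simple energy balance on the cooling air, P = rho * cp * V * dT. The air properties, the 15 K inlet-to-exhaust temperature rise, and the function names are assumptions made for illustration. Only the 28 500 W rack heat load is taken from the text.

```python
# Assumed air properties near room temperature at sea level (not from the paper).
AIR_DENSITY_KG_M3 = 1.2
AIR_CP_J_PER_KG_K = 1005.0
M3_PER_S_TO_CFM = 2118.88  # cubic metres per second to cubic feet per minute


def required_airflow_m3_s(heat_load_w, air_temp_rise_k):
    """Volumetric airflow needed to remove heat_load_w with a given air temperature rise.

    Energy balance on the air stream: P = rho * cp * V * dT.
    """
    if air_temp_rise_k <= 0:
        raise ValueError("air temperature rise must be positive")
    return heat_load_w / (AIR_DENSITY_KG_M3 * AIR_CP_J_PER_KG_K * air_temp_rise_k)


def footprint_heat_flux_w_m2(heat_load_w, footprint_m2):
    """Heat flux based on product footprint, as used in the trend chart."""
    if footprint_m2 <= 0:
        raise ValueError("footprint must be positive")
    return heat_load_w / footprint_m2


if __name__ == "__main__":
    rack_heat_w = 28_500.0  # rack heat load quoted in the text
    assumed_rise_k = 15.0   # hypothetical inlet-to-exhaust air temperature rise
    flow = required_airflow_m3_s(rack_heat_w, assumed_rise_k)
    print(f"airflow: {flow:.2f} m^3/s ({flow * M3_PER_S_TO_CFM:.0f} CFM)")
```

For the same heat release, halving the allowable air temperature rise doubles the airflow the room must deliver, so a heat release value alone does not tell the facility planner how much air a rack needs.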
Data centers do have limitations and each data center is unique such that some data centers have much lower power density limitations than others. To resolve these environmental issues in some data centers today, manufacturers of HVAC equipment have begun to offer liquid cooling solutions to aid in data center thermal management. The objective of these new approaches is to move the liquid cooling closer to the source of the problem, which is the electronic equipment that is producing the heat. Placing the cooling near the source of heat shortens the distance that air must be moved and results in minimal static pressure. This increases the capacity, flexibility, efficiency, and scalability of the cooling solutions. Several viable options based on this strategy have been developed: 1) rear-mounted fin and tube heat exchangers; 2) internal fin and tube heat exchangers either at the bottom of a rack of electronic equipment or mounted to the side of a rack; and 3) overhead fin and tube heat exchangers. Although each one of these is a liquid-cooled solution adjacent to the air-cooled rack, the liquid can be either water based or refrigerant based. These solutions and others will continue to be promoted with the increased power densities being shipped and the projections of the increased heat loads by the manufacturers of datacom equipment.
V. FUTURE CHALLENGES
For many years the major challenge facing thermal engineers has been how to limit chip operating temperatures in the face of increases in heat flux with each new generation of chip design. This challenge may be expected to continue through the remainder of this decade. As the size of semiconductor devices is reduced further, leakage power dissipation may become comparable to or even greater than the active device power dissipation, further compounding the thermal challenge.
In the previous sections the cooling technologies and designs developed to respond to increased powers were discussed with no mention of cost. Although controlling and reducing cost has always been an objective, the overriding consideration was to provide the necessary cooling even if the cost was higher than
blower or pump failure while continuing to provide the required cooling function. It also means that provisions must be incorporated in the cooling design to allow replacement of the failed unit while the machine continues to operate. All of these considerations clearly represent an increased level of challenges for thermal engineers. It also means that thermal engineers must be an integral part of the design process from the very beginning and work very closely with electrical and packaging engineers to achieve a truly holistic design.
In addition, as identified in the thermal management section of the 2002 National Electronics Management Technology Roadmap [75], there are several major cooling areas requiring further development and innovation. In order to diffuse high heat flux from chip heat sources and reduce thermal resistance at the chip-to-sink interface, there is a need to develop low cost, higher thermal conductivity packaging materials such as adhesives, thermal pastes and thermal spreaders. Advanced cooling technology in the form of heat pipes and vapor chambers is already widely used. Further advances in these technologies as well as thermoelectric cooling technology, direct liquid cooling technology, high-performance air-cooled heat sinks and air movers are also needed. As discussed earlier in the paper, cooling at the data center level is also becoming a very challenging problem. High performance cooling systems that will minimize the impact to the environment within the customer’s facility are needed to answer this challenge. Finally, to achieve the holistic design referred to above, it will be necessary to develop advanced modeling tools to integrate the electrical, thermal, and mechanical aspects of package and product function, while providing enhanced usability and minimizing interface incompatibilities.
It is clear that thermal management for high-performance computers will continue to be an area offering engineers many challenges and opportunities for meaningful contributions and innovations.
REFERENCES
[1] A. E. Bergles, “The evolution of cooling technology for electrical, electronic, and microelectronic equipment,” ASME HTD, vol. 57, pp. 1–9,
desired. Today things are considerably different with intense 1986.
competition demanding increased performance at reduced cost. [2] D. Hanson, The New Alchemists. New York: Avon Books, 1982.
[3] R. C. Chu, U. P. Hwang, and R. E. Simons, “Conduction cooling for an
While the focus remains on providing the necessary cooling, it LSI package: A one-dimensional approach,” IBM J. Res. Develop., vol.
is no longer acceptable to do so at any cost. The cost of cooling 26, no. 1, pp. 45–54, Jan. 1982.
must be commensurate with the overall manufacturing cost of [4] R. C. Chu, O. R. Gupta, U. P. Hwang, and R. E. Simons, “Gas encapsu-
lated cooling module,” U.S. Patent 3,741,292, 1976.
the computer and indeed be a relatively small fraction of the [5] R. C. Chu and R. E. Simons, “Cooling technology for high performance
total cost! computers: Design applications,” in Cooling of Electronic Systems, S.
Although air cooling may be expected to continue to be the Kakac, H. Yuncu, and K. Hijikata, Eds. Boston, MA: Kluwer, 1994,
pp. 97–122.
most pervasive method of cooling, in many instances the chips [6] G. F. Goth, M. L. Zumbrunnen, and K. P. Moran, “Dual-Tapered piston
and packages that require cooling are at or will soon exceed (DTP) module cooling for IBM enterprise system/9000 systems,” IBM
the limits of air cooling. As this happens it will be necessary to J. Res. Develop., vol. 36, no. 4, pp. 805–816, July 1992.
[7] F. Kobayashi, Y. Watanabe, M. Yamamoto, A. Anzai, A. Takahashi, T.
once again introduce water or some other form of liquid cooling. Daikoku, and T. Fujita, “Hardware technology for HITACHI M-880 pro-
This represents a real challenge as it does not mean simply res- cessor group,” in Proc. 41st Electronics Components and Technology
urrecting the water-cooled designs of the past. Machines today Conf., Atlanta, GA, May 1991, pp. 693–703.
[8] F. Kobayashi, Y. Watanabe, K. Kasai, K. Koide, K. Nakanishi, and R.
are packaged much more densely than in the past making the job Sato, “Hardware technology for the Hitachi MP5800 series (HDS Sky-
of introducing water or any other form of liquid cooling much line Series),” IEEE Trans. Adv. Packag., vol. 23, no. 3, pp. 504–514, Aug.
more challenging. In addition, today many machines must virtu- 2000.
[9] P. Singh, D. Becker, V. Cozzolino, M. Ellsworth, R. Schmidt, and E.
ally operate continuously without interruption. This means that Seminaro, “System packaging for a CMOS mainframe,” Advancing Mi-
the cooling design must incorporate redundancy to allow for a croelectron., vol. 25, no. 7, pp. 12–17, 1998.
[10] J. U. Knickerbocker, “An advanced multichip module (MCM) for high-performance UNIX servers,” IBM J. Res. Develop., vol. 46, no. 6, pp. 779–804, Nov. 2002.
[11] D. J. De Kock and J. A. Visser, “Optimal heat sink design using mathematical optimization,” Adv. Electron. Packag., vol. 1, pp. 337–347, 2001.
[12] J. R. Culham and Y. S. Muzychka, “Optimization of plate fin heat sinks using entropy generation minimization,” IEEE Trans. Compon. Packag. Technol., vol. 24, no. 2, pp. 159–165, Jun. 2001.
[13] M. F. Holahan, “Fins, fans, and form: Volumetric limits to air-side heat sink performance,” in Proc. 9th Intersociety Conf. Thermal and Thermomechanical Phenomena in Electronic Systems, Las Vegas, NV, Jun. 2004, pp. 564–570.
[14] F. Roknaldin and R. A. Sahan, “Cooling solution for next generation high-power processor boards in 1U computer servers,” Adv. Electron. Packag., vol. 2, pp. 629–634, 2003.
[15] M. Gao and Y. Cao, “Flat and U-shaped heat spreaders for high-power electronics,” Heat Transfer Eng., vol. 24, no. 3, pp. 57–65, May/Jun. 2003.
[16] Z. Z. Yu and T. Harvey, “Precision-engineered heat pipe for cooling Pentium II in compact PCI design,” in Proc. 7th Intersociety Conf. Thermal and Thermomechanical Phenomena in Electronic Systems, Las Vegas, NV, May 2000, pp. 102–105.
[17] V. W. Antonetti, S. Oktay, and R. E. Simons, “Heat transfer in electronic packages,” in Microelectronics Packaging Handbook, R. R. Tummala and E. J. Rymaszewski, Eds. New York: Van Nostrand Reinhold, 1989, pp. 189–190.
[18] R. S. Prasher, C. Simmons, and G. Solbrekken, “Thermal contact resistance of phase change and grease type polymeric materials,” Amer. Soc. Mechanical Engineers, Manufacturing Engineering Division (MED), vol. 11, pp. 461–466, 2000.
[19] D. J. Delia, T. C. Gilgert, N. H. Graham, U. P. Hwang, P. W. Ing, J. C. Kan, R. G. Kemink, G. C. Maling, R. F. Martin, K. P. Moran, J. R. Reyes, R. R. Schmidt, and R. A. Steinbrecher, “System cooling design for the water-cooled IBM Enterprise System/9000 processors,” IBM J. Res. Develop., vol. 36, no. 4, pp. 791–803, Jul. 1992.
[20] D. B. Tuckerman and R. F. Pease, “High performance heat sinking for VLSI,” IEEE Electron. Device Lett., vol. EDL-2, no. 5, pp. 126–129, May 1981.
[21] R. Hahn, A. Kamp, A. Ginolas, M. Schmidt, J. Wolf, V. Glaw, M. Topper, O. Ehrmann, and H. Reichl, “High power multichip modules employing the planar embedding technique and microchannel water heat sinks,” IEEE Trans. Compon., Packag., Manufact. Technol.–Part A, vol. 20, no. 4, pp. 432–441, Dec. 1997.
[22] A. E. Bergles and A. Bar-Cohen, “Direct liquid cooling of microelectronic components,” in Advances in Thermal Modeling of Electronic Components and Systems, A. Bar-Cohen and A. D. Kraus, Eds. New York: ASME Press, 1990, vol. 2, pp. 233–342.
[23] R. E. Simons, “Direct liquid immersion cooling for high power density microelectronics,” Electron. Cooling, vol. 2, no. 2, 1996.
[24] F. P. Incropera, “Liquid immersion cooling of electronic components,” in Heat Transfer in Electronic and Microelectronic Equipment, A. E. Bergles, Ed. New York: Hemisphere, 1990.
[25] R. D. Danielson, N. Krajewski, and J. Brost, “Cooling a superfast computer,” Electron. Packag. Produc., pp. 44–45, Jul. 1986.
[26] L. Jiji and Z. Dagan, “Experimental investigation of single phase multi jet impingement cooling of an array of microelectronic heat sources,” in Modern Developments in Cooling Technology for Electronic Equipment, W. Aung, Ed. New York: Hemisphere, 1988, pp. 265–283.
[27] P. F. Sullivan, S. Ramadhyani, and F. P. Incropera, “Extended surfaces to enhance impingement cooling with single circular jets,” Adv. Electron. Packag., vol. ASME EEP-1, pp. 207–215, Apr. 1992.
[28] G. M. Chrysler, R. C. Chu, and R. E. Simons, “Jet impingement boiling of a dielectric coolant in narrow gaps,” IEEE Trans. CPMT-A, vol. 18, no. 3, pp. 527–533, 1995.
[29] A. E. Bergles and A. Bar-Cohen, “Immersion cooling of digital computers,” in Cooling of Electronic Systems, S. Kakac, H. Yuncu, and K. Hijikata, Eds. Boston, MA: Kluwer, 1994, pp. 539–621.
[30] I. Mudawar and D. E. Maddox, “Critical heat flux in subcooled flow boiling of fluorocarbon liquid on a simulated chip in a vertical rectangular channel,” Int. J. Heat Mass Transfer, vol. 32, 1989.
[31] R. C. Chu and R. E. Simons, “Review of boiling heat transfer for cooling of high-power density integrated circuit chips,” in Process, Enhanced, and Multiphase Heat Transfer, A. E. Bergles, R. M. Manglik, and A. D. Kraus, Eds. New York: Begell House, 1996.
[32] R. E. Simons, “The evolution of IBM high performance cooling technology,” IEEE Trans. CPMT-Part A, vol. 18, no. 4, pp. 805–811, 1995.
[33] S. C. Yao, S. Deb, and N. Hammouda, “Impacting spray boiling for thermal control of electronic systems,” Heat Transfer Electron., vol. ASME HTD-111, pp. 129–134, 1989.
[34] G. Pautsch and A. Bar-Cohen, “Thermal management of multichip modules with evaporative spray cooling,” Adv. Electron. Packag., vol. ASME EEP-26-2, pp. 1453–1461, 1999.
[35] G. Pautsch, “An overview on the system packaging of the Cray SV2 supercomputer,” presented at the IPACK 2001 Conf., Kauai, HI, 2001.
[36] T. Cader and D. Tilton, “Implementing spray cooling thermal management in high heat flux applications,” in Proc. 2004 Intersociety Conf. Thermal Performance, 2004, pp. 699–701.
[37] G. Lin and R. Ponnappan, “Heat transfer characteristics of spray cooling in a closed loop,” Int. J. Heat Mass Transfer, vol. 46, pp. 3737–3746, 2003.
[38] C. Hilbert, S. Sommerfeldt, O. Gupta, and D. J. Herrell, “High performance air cooled heat sinks for integrated circuits,” IEEE Trans. CHMT, vol. 13, no. 4, pp. 1022–1031, 1990.
[39] R. C. Chu, R. E. Simons, and K. P. Moran, “System cooling design considerations for large mainframe computers,” in Cooling Techniques for Computers, W. Aung, Ed. New York: Hemisphere, 1991.
[40] V. W. Antonetti, R. C. Chu, and J. H. Seely, “Thermal design for IBM System/360 Model 91,” presented at the 8th Int. Electronic Circuit Packaging Symp., San Francisco, CA, 1967.
[41] R. C. Chu, M. J. Ellsworth, E. Furey, R. R. Schmidt, and R. E. Simons, “Method and apparatus for combined air and liquid cooling of stacked electronic components,” U.S. Patent 6,775,137 B2, Aug. 10, 2004.
[42] H. Bray, “Computer makers sweat over cooling,” The Boston Globe, 2004.
[43] Y. Taur and J. Nowak, “CMOS devices below 0.1 μm: How high will performance go?,” in Int. Electron Devices Meeting Tech. Dig., 1997, pp. 215–218.
[44] K. Rose, R. Mangaser, C. Mark, and E. Sayre, “Cryogenically cooled CMOS,” Critical Rev. Solid State Materials Sci., vol. 4, no. 1, pp. 63–99, 1999.
[45] W. F. Clark, E. Badih, and R. G. Pires, “Low temperature CMOS—A brief review,” IEEE Trans. Compon., Hybrids, Manufact. Technol., vol. 15, no. 3, pp. 397–404, Jun. 1992.
[46] R. F. Barron, Cryogenic Systems, 2nd ed. New York: Oxford Univ. Press, 1985.
[47] J. S. Kolodzey, “Cray-1 computer technology,” IEEE Trans. Compon., Hybrids, Manufact. Technol., vol. CHMT-4, no. 2, pp. 181–186, Jun. 1981.
[48] D. M. Carlson, D. C. Sullivan, R. E. Bach, and D. R. Resnick, “The ETA-10 liquid-nitrogen-cooled supercomputer system,” IEEE Trans. Electron. Devices, vol. 36, no. 8, pp. 1404–1413, Aug. 1989.
[49] R. E. Schwall and W. S. Harris, “Packaging and cooling of low temperature electronics,” in Advances in Cryogenic Engineering. New York: Plenum Press, 1991, pp. 587–596.
[50] R. R. Schmidt, “Low temperature electronics cooling,” Electronics Cooling, vol. 6, no. 3, Sep. 2000.
[51] R. R. Schmidt and B. Notohardjono, “High-end server low temperature cooling,” IBM J. Res. Develop., vol. 46, no. 2, pp. 739–751, 2002.
[52] A. Fujisaki, M. Suzuki, and H. Yamamoto, “Packaging technology for high performance CMOS server Fujitsu GS8900,” IEEE Trans. Adv. Packag., vol. 24, pp. 464–469, Nov. 2001.
[53] Heat Density Trends in Data Processing, Computer Systems and Telecommunication Equipment. Santa Fe, NM: Uptime Institute, 2000.
[54] R. Schmidt, “Effect of data center characteristics on data processing equipment inlet temperatures,” in Proc. IPACK ’01, Advances in Electronic Packaging 2001, vol. 2, Kauai, HI, Jul. 2001, pp. 1097–1106.
[55] R. Schmidt and E. Cruz, “Raised floor computer data center: Effect on rack inlet temperatures of chilled air exiting both the hot and cold aisles,” in Proc. ITHERM, San Diego, CA, Jun. 2002, pp. 580–594.
[56] R. Schmidt and E. Cruz, “Raised floor computer data center: Effect on rack inlet temperatures when rack flow rates are reduced,” presented at the Int. Electronic Packaging Conf. and Exhibition, Maui, HI, Jul. 2003.
[57] R. Schmidt and E. Cruz, “Raised floor computer data center: Effect on rack inlet temperatures when adjacent racks are removed,” presented at the Int. Electronic Packaging Conf. and Exhibition, Maui, HI, Jul. 2003.
[58] R. Schmidt and E. Cruz, “Raised floor computer data center: Effect on rack inlet temperatures when high powered racks are situated amongst lower powered racks,” presented at the ASME IMECE Conf., New Orleans, LA, Nov. 2002.
[59] R. Schmidt and E. Cruz, “Clusters of high powered racks within a raised floor computer data center: Effect of perforated tile flow distribution on rack inlet air temperatures,” presented at the ASME IMECE Conf., Washington, DC, Nov. 2003.
[60] C. Patel, C. Bash, C. Belady, L. Stahl, and D. Sullivan, “Computational fluid dynamics modeling of high compute density data centers to assure system inlet air specifications,” in Proc. IPACK ’01, Advances in Electronic Packaging 2001, vol. 2, Kauai, HI, Jul. 2001, pp. 821–829.
[61] C. Patel, R. Sharma, C. Bash, and A. Beitelmal, “Thermal considerations in cooling large scale compute density data centers,” in Proc. ITHERM, San Diego, CA, Jun. 2002, pp. 767–776.
[62] C. Patel, C. Bash, R. Sharma, M. Beitelmal, and R. Friedrich, “Smart cooling of data centers,” in Proc. IPACK ’03, Advances in Electronic Packaging 2003, Maui, HI, Jul. 2003, pp. 129–137.
[63] C. Bash, C. Patel, and R. Sharma, “Efficient thermal management of data centers—Immediate and long term research needs,” HVAC&R Res. J., vol. 9, no. 2, pp. 137–152, Apr. 2003.
[64] H. Obler, “Energy efficient computer cooling,” Heating/Piping/Air Conditioning, vol. 54, no. 1, pp. 107–111, Jan. 1982.
[65] J. M. Ayres, “Air conditioning needs of computers pose problems for new office building,” Heating, Piping and Air Conditioning, vol. 34, no. 8, pp. 107–112, Aug. 1962.
[66] H. F. Levy, “Computer room air conditioning: How to prevent a catastrophe,” Building Syst. Des., vol. 69, no. 11, pp. 18–22, Nov. 1972.
[67] R. W. Goes, “Design electronic data processing installations for reliability,” Heating, Piping Air Cond., vol. 31, no. 9, pp. 118–120, Sep. 1959.
[68] W. A. Di Giacomo, “Computer room environmental systems,” Heating, Piping Air Cond., vol. 45, no. 11, pp. 76–80, Oct. 1973.
[69] F. J. Grande, “Application of a new concept in computer room air conditioning,” Western Electric Eng., vol. 4, no. 1, pp. 32–34, Jan. 1960.
[70] F. Green, “Computer room air distribution,” ASHRAE J., vol. 9, no. 2, pp. 63–64, Feb. 1967.
[71] M. N. Birken, “Cooling computers,” Heating, Piping Air Cond., vol. 39, no. 6, pp. 125–128, Jun. 1967.
[72] H. F. Levy, “Air distribution through computer room floors,” Building Syst. Des., vol. 70, no. 7, pp. 16–16, Oct./Nov. 1973.
[73] Thermal Guidelines for Data Processing Environments. Atlanta, GA: ASHRAE, 2004.
[74] R. Schmidt, “Thermal profile of a high density data center-methodology to thermally characterize a data center,” presented at the ASHRAE Nashville Conf., Nashville, TN, May 2004.
[75] R. C. Chu and Y. Joshi, Eds., “Thermal Management,” in National Electronics Manufacturing Technology Roadmaps. Herndon, VA: National Electronic Manufacturing Initiative, Inc., 2002.

Robert E. Simons received the B.S. degree in mechanical engineering from Widener University, Chester, PA, and the M.S. degree in operations research and applied statistics from Union College, Schenectady, NY.
Prior to retiring from IBM in 1995, he was a Senior Technical Staff Member and manager in the Advanced Thermal Laboratory at the IBM Development Laboratory, Poughkeepsie, NY. He joined IBM in 1966 working in the thermal area as an engineer and manager, and was a key participant in the thermal design and development of cooling technologies for the IBM 3033, 3081, and 3090 computer systems, as well as the development of direct liquid immersion cooling techniques. As a co-inventor of the cooling scheme for the IBM Thermal Conduction Module (TCM), he received an IBM Outstanding Innovation Award and a Corporate Award. While at IBM, he was a member of the IBM Academy of Technology. He is an inventor on over 50 issued U.S. patents and 75 invention publications. He has published over 50 papers and book chapters related to cooling electronic packages and systems, and developed a short course on electronics cooling that he taught in the U.S. and Europe.
Mr. Simons is a recipient of the Semi-Therm Significant Contributor Award and has been active in the conference since its inception, serving in the capacities of session, program, and general chairman. He is also a past chairman of the ASME Heat Transfer Division K-16 Committee on Heat Transfer in Electronic Equipment.

Michael J. Ellsworth received the B.E.M.E. degree in 1984 and the M.E.M.E. degree in 1988 from Manhattan College, Riverdale, NY.
He is a Senior Technical Staff Member working in the Advanced Thermal Laboratory in Poughkeepsie, NY, and has been with IBM since 1988. While at IBM he has explored improved cooling for applications ranging from laptops to high-end servers and has investigated cooling technologies encompassing air, water, and refrigeration. From 1992 to 1996 he was a ceramic/thin film package applications engineer and technical program manager in the Interconnect Products Group, East Fishkill, NY. He is a member of IEEE and of ASME, where he serves on the Electronics and Photonics Packaging Division Executive Committee and on the K-16 Committee on Heat Transfer in Electronic Equipment. He has published 15 technical papers and holds 33 U.S. patents.