
Coolest jobs in tech (literally): running a South Pole data center

You know it's cold when you have to heat the air used to cool your data center.

Sean Gallagher
Credit: Photograph by National Science Foundation/F. Descamps

Steve Barnet is hiring, but not for an ordinary IT job. His ideal candidate "will be willing to travel to Polar and high altitude sites."

Barnet, interim Computing Facilities Manager for the Wisconsin IceCube Particle Astrophysics Center (WIPAC) at the University of Wisconsin, is looking to fill what may be the coolest Unix administrator job opening in the world—literally. Plenty of IT jobs exist in extreme and exotic locales, but the WIPAC IT team runs what is indisputably the world’s most remote data center: a high-performance computing cluster sitting atop a two-mile thick glacier at the South Pole.

The data center has over 1,200 computing cores and three petabytes of storage, and it's tethered to the IceCube Observatory, a neutrino detector with strings of optical sensors buried a kilometer deep in the Antarctic ice. IceCube watches for bursts of neutrinos from cataclysmic astronomical events, data that helps researchers study both "dark matter" and the physics of neutrinos themselves.

That mission demands a level of reliability that many less remote data centers cannot provide. Raytheon Polar Services held the National Science Foundation's Antarctic programs support contract until April, and as Dennis Gitt, a former director of IT and communications services for the company, puts it, a failure anywhere in the Antarctic systems could lose data from events in space that may not be seen again for millennia.

Running that kind of IT operation at one of the most hostile and remote locations in the world creates a whole set of challenges few in IT have ever experienced.

The few, the proud, the cold

Ralf Auer, left, and Steve Barnet, right, with James Roth of the IceCube team, dressed for a walk to the IceCube data center. Credit: Wisconsin IceCube Particle Astrophysics Center

A trip to the Amundsen-Scott South Pole Station is as close to visiting another planet as you can get on Earth, with “a palette of whites, blues, greys and blacks,” Gitt says. In summer, it feels like two o’clock in the afternoon 24 hours a day, which “can do interesting things to your diurnal cycle,” says Barnet. In winter, the outside world is lit only by moonlight and the Aurora Australis.

With a maximum population of 150 at the base during the Austral summer, South Pole IT professionals-in-residence are limited to a select few. And they don’t get to stay long—most of the WIPAC IT team only stays for a few months in the summer, during which they have to complete all planned IT infrastructure projects.

The rest of the year, WIPAC's US-based employees remotely walk the wintering members of the IceCube team—usually chosen for their physical science credentials, not for their IT skills—through tasks over satellite calls. Systems are monitored over a multiplexed Iridium satellite connection. “You just try to collect as much information as you can and do what you can remotely,” says Ralf Auer, WIPAC’s South Pole systems administrator.
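The article doesn't describe WIPAC's monitoring stack in detail, but the constraint it works under is easy to illustrate: every byte of telemetry has to ride a narrow, shared satellite link. Here's a minimal sketch of the idea, gathering only a few vital signs and keeping the payload tiny; the fields are hypothetical, not WIPAC's actual tooling.

```python
# Hypothetical sketch of low-bandwidth health reporting; not WIPAC's
# actual monitoring code. The point is to collect only the essentials
# and keep the payload small enough for an expensive satellite link.
import json
import os
import shutil
import time

def health_report():
    """Collect a handful of vital signs as a compact JSON object."""
    load1, load5, load15 = os.getloadavg()          # Unix-only
    disk = shutil.disk_usage("/")
    return {
        "ts": int(time.time()),
        "load": [round(load1, 2), round(load5, 2), round(load15, 2)],
        "disk_used_pct": round(100 * disk.used / disk.total, 1),
    }

payload = json.dumps(health_report(), separators=(",", ":")).encode()
print(f"{len(payload)} bytes to transmit")  # typically well under 100 bytes
```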

The wintering-over team “can do a lot of the physical maintenance,” says Barnet, but routine IT tasks can sometimes feel foreign to physical scientists. The result is that Auer, Barnet, and the others back home have to visualize what the over-winter team is seeing in order to walk them through fixes. (Barnet compares it to being an air traffic controller talking down an airliner flown by a flight attendant.) So part of the team’s job is to make these hands-on tasks as simple as possible for the scientists, and to handle as much as possible remotely over whatever bandwidth they can get.

South Pole Station does have other IT support, but it’s not on WIPAC’s payroll. The general IT and communications support team for Amundsen-Scott peaks at 5-8 people during the summer and shrinks to 4-5 during the winter, according to Gitt. More stay at McMurdo Station, the main logistics support base in Antarctica: 35-40 at the peak of the research season in the summer, “depending on the projects,” he says. Smaller stations run by NSF may have only two or three IT and communications people total. (Lockheed-Martin just took over the NSF contract, but the award is currently under protest.)

The IceCube Lab at Amundsen-Scott South Pole Station—home of the world's southernmost data center. Credit: National Science Foundation/F. Descamps

There’s plenty of work to go around. During the summer, Gitt says, “You may have close to 1,000 scientists rolling into different stations getting their science done.” During the peak of the research season, McMurdo Station can surge to a population of around 1,100 people, some staying for only a few weeks based on their projects and funding. Every one of them wants to make sure they can get the IT they need when they need it—often all at the same time, since they have to reconfigure schedules around what the weather allows.

The low headcount and the urgency of just about every project—where the millions spent on a grant for a particular project may be lost if data can’t be collected or transmitted—demands that the few IT pros on hand in Antarctica have not just depth of knowledge, but skills across other fields as well. “Typically, they’re at an engineer level as opposed to a technician,” Gitt says. “They could fix almost anything.” One technician had to diagnose a problem with the Windows software package needed for a project on a researcher’s laptop—and the program was written and documented in French.

The IT teams in Antarctica don't operate in complete isolation; while Raytheon held the support contract, for instance, the Antarctic equivalent of a help desk was in Centennial, Colorado. But communications back to the US from the South Pole are akin to communications from space—in fact, South Pole stations use the same satellite network for broadband communications as the International Space Station does.

The most reliable form of communication available is the Iridium satellite network. Individual Iridium connections aren't exactly blazing—they support a data rate of only about 2,400 bits per second. But according to Gitt, Raytheon did a lot to coax as much bandwidth as possible out of Iridium, including multiplexing Iridium connections, compressing e-mails to shrink their size—even adding a wireless server with file-sharing and e-mail services to containerized Iridium ground stations to support some of the smaller field stations.

“We even dropped the package size down further in size and put it on some of the traverse tractors," Gitt says—providing a data lifeline to expeditions crossing the frozen continent.
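A little arithmetic shows why the multiplexing and compression were worth the effort. The sketch below is illustrative only; the channel count, message size, and compression ratio are assumptions, not details of Raytheon's actual setup.

```python
# Rough arithmetic for a single Iridium data channel at ~2,400 bits per
# second. The message size, channel count, and compression ratio are
# assumptions chosen for illustration.

def transfer_seconds(size_bytes, bps=2400, channels=1):
    """Time to move size_bytes over `channels` multiplexed links."""
    return size_bytes * 8 / (bps * channels)

email = 100 * 1024  # a 100 KB e-mail with attachments (assumed size)
print(f"1 channel:              {transfer_seconds(email):6.0f} s")                     # ~341 s
print(f"4 channels:             {transfer_seconds(email, channels=4):6.0f} s")         # ~85 s
print(f"4 channels, compressed: {transfer_seconds(email * 0.4, channels=4):6.0f} s")   # ~34 s
```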

The higher-bandwidth access is limited to about 10 hours a day of broadband coverage from NASA's Tracking and Data Relay Satellite System (TDRSS) and the GOES-3 satellite—a weather satellite launched in 1978 that has lost its imaging capability and now provides 1-megabit-per-second data transmission for eight hours a day. TDRSS provides the most bandwidth, with transmission speeds of up to 150 megabits per second.
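Those figures make for a quick sanity check on what the broadband windows can actually carry each day. This back-of-the-envelope sketch treats the TDRSS contact time as an assumption, since the article mentions only four or five contacts a day, not their length.

```python
# Back-of-the-envelope daily capacity from the stated link speeds. The
# TDRSS contact hours are an assumption; the GOES-3 window is the eight
# hours mentioned in the article.

def gb_per_day(mbps, hours):
    """Decimal gigabytes deliverable at a given rate over a daily window."""
    return mbps * 1e6 * hours * 3600 / 8 / 1e9

goes3 = gb_per_day(1, 8)      # ~3.6 GB over GOES-3's eight-hour window
tdrss = gb_per_day(150, 2)    # ~135 GB if TDRSS peaks at 150 Mbps for ~2 hours
print(f"GOES-3: {goes3:.1f} GB/day; TDRSS (assumed 2 h at peak): {tdrss:.0f} GB/day")
# IceCube records roughly a terabyte a day (see the tape discussion later
# in the article), far more than even these links can move.
```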

But catching a signal from those satellites at 90 degrees south latitude requires some serious work. Two members of the South Pole Station's IT support team are a satellite engineer and a technician, because keeping communications up often requires more than one set of hands. “With TDRSS,” says Gitt, “they are actually working four or five contacts a day, where [the] ground station is swinging over the dish to catch the next satellite as it comes over the horizon.”

And they don’t get very high above the horizon; ground station dishes are almost always pointed nearly parallel to the ground.

“Oxygen is precious”

Some people might think of IT as a less-than-athletic profession. But in Antarctica, any job can take a physical toll. “Some people forget that the South Pole sits on top of two miles of ice,” says Gitt. “The physical stresses on the individual are very high, because of the low atmospheric pressure and temperature."

“The elevation at the Pole is 9,300 feet,” Barnet says, but because of the cold, it’s “the biological equivalent of 11,000 feet.” That, and the low temperatures—even in the relatively warm peak of Austral summer, temperatures only reach about -10 degrees Fahrenheit, with a wind chill of -40 F—turn tasks that would seem routine in a safe, warm corporate setting into exhausting physical activities.

“We’re planning on replacing or supplementing our data center’s uninterruptible power supply infrastructure this year,” Barnet says. “We’re sending 18 UPSes south, which means at some point they’ll end up outside the building and we’ll have to carry them upstairs. You get tired quickly, and oxygen is precious.”

Everyone who goes to the South Pole station has to go through an extensive physical qualification process. That’s partially because of the physical demands and partially because of the limited medical resources available—especially for those who winter over, when a medevac flight is next to impossible. “There’s a doctor and nurse practitioner onsite,” Barnet says, “but the facilities are really limited, and anyone with a serious medical condition is really at risk.”

That leads to some interesting medical prerequisites even for the relatively healthy. Before he was cleared for travel to the Pole, Barnet was told he had to get his wisdom teeth extracted. “And if you’re staying for the entire winter,” says systems admin Auer, “they prefer to have your appendix removed. They had a case in 2001 when they had to medevac someone because of appendicitis.”

Wintering over also requires a certain personality type—one that isn’t prone to claustrophobia or the sudden desire to run out into temperatures that can kill. “The best analogy that I can think of is that it’s the same sort of conditions that nuclear sub guys go through,” says Gitt. Amundsen-Scott station is “a very confined area with a maximum winter population between 60 and 80,” he says, so it requires someone who can work well with others under stress. But those who do winter over get some remarkable views.

The physical challenges of the job mean that people working at the station need to be regularly rotated off for three to four months before they can return. Since the station is only accessible for a little more than half the year, this actually means that twelve months often pass between trips to the Pole.

That limited travel window also means that when there’s a changing of the support team, there’s not much time for turnover. “If you’ve got two or three weeks overlap, you’re doing good,” Gitt says. Because the National Science Foundation wants as many scientists as possible to get into the South Pole station during the months it’s accessible, “bed space is at a premium,” so there’s little time to walk the new guy through the handoff.

A long journey

Getting people and gear to the Pole in the first place is a Herculean effort. “Since the station is closed and completely inaccessible from the beginning of March until October,” Barnet says, “any work you're going to do [on the infrastructure] has to happen from November to January. So it poses a significant logistical challenge.” And because everything is dependent on weather, “You can draw up the most beautiful plan, and then you may end up spending time you planned to do IT work doing dishes or lugging food around instead, because your cargo’s not in.”

The Amundsen-Scott South Pole Station is at the end of a 9,000-mile logistics chain. Anything or anyone bound for the Pole has to get from the US to Christchurch, New Zealand, before being loaded onto an Air Force C-17 Globemaster bound for the ice runway at McMurdo Station. Then ski-equipped LC-130 transport planes from the New York State Air National Guard handle the final 800 miles or so of the trip—with a little rocket-assisted takeoff help.

The New York State Air National Guard's LC-130s are the only ride into and out of Amundsen-Scott Station. Credit: US Air Force

“You get a very strong appreciation for the sound dampening in commercial aircraft,” says Barnet, describing the trip. “But the food is better these days on the National Guard flights than it is on most airlines.”

That long supply chain poses a support challenge for the IceCube data center team. Getting vendor support isn’t nearly as simple as it is back home. “We’re not exactly in the established vendor support dialogue path," Barnet notes, "where if we say we’ve got a failed hard drive, they’ll say, ‘We’ll overnight you another one.’ We have to keep sufficient spares in order to be able to operate through winter.”

And 30-day warranty replacement doesn’t exactly work when you’re at 90 degrees south latitude. “If it’s the right 30 days, we’re fine," he says. "But that almost never happens.”

The back of the racks in the IceCube data center. Credit: WIPAC

Cooling on the ice

You might not think that cooling would be a problem in one of the coldest places on Earth. But ironically, it’s a big concern. “You had to be careful about when you could do maintenance,” Gitt says. “At the South Pole Station, there are times we couldn’t crack open equipment bays because the equipment would start to crack from the cold.” At other times, electronic equipment would actually overheat because it had been designed for cold weather.

“150 machines can produce a lot of heat,” adds Auer. “You can’t just open a door because the temperature would drop too quickly, and we’d lose hardware because hard drives would die.”

The IceCube Neutrino Observatory is an in-ice array of 5,160 Digital Optical Modules. Sixty optical modules are deployed on a string, with 86 strings making up the array. Credit: National Science Foundation/F. Descamps
The cables from IceCube's sensor strings plug into custom-built single-board computers with Celeron processors and IDE disk drives, called Digital Optical Module (DOM) hubs. The DOM hubs gather the sensor data and forward it to the data center's "compute farm" for initial processing. Credit: National Science Foundation/ F. Descamps

Then there’s the matter of figuring out what temperature, exactly, the equipment should be cooled to. Barnet says it’s difficult to get “reasonable engineering estimates” for the equipment from vendors because of the altitude. “You do start worrying about things like the machine sizes,” he says. IceCube uses 2U servers, but when the team considered smaller servers, they worried whether the smaller machines could move enough of the thinner air through them for proper cooling.
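The thin-air worry comes down to mass flow: a fan removes heat in proportion to the mass of air it moves, and at the Pole's elevation each cubic foot of air carries roughly 30 percent less mass than at sea level. Here's a rough sketch of the math, treating the per-server heat load and allowable temperature rise as assumed figures.

```python
# Why thin air matters for cooling: heat removed scales with the mass of
# air moved, not its volume. Standard-atmosphere pressure model; the
# per-server heat load and temperature rise below are assumptions.

def pressure_ratio(alt_m):
    """ISA pressure relative to sea level."""
    return (1 - 2.25577e-5 * alt_m) ** 5.25588

alt_m = 9300 * 0.3048                    # ~2,835 m elevation at the Pole
density_ratio = pressure_ratio(alt_m)    # ~0.71 for a room at sea-level temperature
print(f"Air density vs. sea level: {density_ratio:.2f}")
print(f"Fans must move ~{1 / density_ratio:.1f}x the volume for the same cooling")

# Airflow needed to absorb an assumed 300 W per 2U server with a 15 K
# inlet-to-outlet temperature rise: Q = m_dot * c_p * dT
rho = 1.225 * density_ratio              # kg/m^3 in the server room at altitude
cp, dT, watts = 1005.0, 15.0, 300.0
m3_per_s = watts / (rho * cp * dT)
print(f"~{m3_per_s * 2118.88:.0f} CFM per server at altitude")
```

Under those assumptions, a server that needs about 35 cubic feet of air per minute at sea level needs closer to 50 at the Pole, which helps explain the hesitation over chassis smaller than 2U.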

Getting the server room to 65 degrees Fahrenheit when the outside temperature can be -40 F, or -100 F in winter, meant putting some thought into how to get cooling air into the servers. The IceCube team runs an HVAC system—without air conditioning—that uses vents to bring in outside air in a controlled fashion. But the environment doesn’t always cooperate: Barnet says the vents for the HVAC system often freeze in position.

Working at extremes

The view of the IceCube data center from the lab's compute farm—four racks of 2U servers, networked to the DOM hubs by a collection of Cisco switches. Credit: National Science Foundation/F. Descamps

There are “other fun and games” driven by the extremes of the climate. One of them is that there’s nearly zero humidity at the Pole. When working in the server space, Barnet says, “We have to be very careful, wear anti-static jackets everywhere we go and make sure we’re always properly grounded.”

The lack of humidity played havoc for a while with tape storage. For a number of reasons, all of the data that comes out of the IceCube Observatory—about a terabyte a day—is written to tape. “That’s a lot of tape," Barnet says. "We found that tape gets really, really cranky down there, and we’ve wrestled with this from day one.” After trying a “wide variety” of fixes, the team determined that the problem was related to low humidity: the tape drives can develop internal static, and humidifying the tapes made them perform better.

At one point, the team put the tapes in a greenhouse at the station before use; they also considered running a humidifier inside the server room, but concerns about condensation drew protests from others in the station. “It got other people a little upset,” Barnet says. Auer adds that right now they’re not doing anything to the tapes—the current batch seems to work fine without being pre-steamed.
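For a sense of scale, a terabyte a day adds up quickly. A quick sketch, assuming LTO-4-class cartridges with roughly 800 GB of native capacity; the article doesn't say which tape generation IceCube actually uses.

```python
# Rough scale of "a lot of tape." Cartridge capacity is an assumption
# (LTO-4-class, ~800 GB native); the article doesn't specify the format.
tb_per_day = 1.0
cartridge_tb = 0.8
cartridges_per_year = tb_per_day * 365 / cartridge_tb
print(f"~{cartridges_per_year:.0f} cartridges a year at {tb_per_day:.0f} TB/day")  # ~456
```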

Another major issue: power availability. “In the Northern hemisphere, we have a good power grid,” Auer says, “but down there, everything is run off two generators for the entire station, one of them active. They provide the power for the station and all of the outbuildings, and they’re basically the only source of power for our experiments. If something goes wrong in the power plant down there, every minor problem is immediately visible in IceCube, and nobody ever knows how long it will take to restore power.”

But perhaps the biggest challenge is simply getting to the servers themselves. Even during the summer, there’s limited access to the server room—it’s in an out-building separate from the station. During the winter, the weather could keep the onsite team from getting access at all for days at a time.

Despite all the challenges—or perhaps because of them—Auer and Barnet think their jobs are pretty cool, both literally and figuratively. “When you can tell people, we’re going to the South Pole, and we run a data center that has about 150 servers and provides 99.5-plus percent uptime, that’s just cool,” Auer says.

Barnet agrees. “It’s pretty cool. It’s a lot nicer for me than, say, something like working at an insurance company.”


Sean Gallagher IT Editor Emeritus
Sean was previously Ars Technica's IT and National Security Editor, and is now a Principal Threat Researcher at SophosLabs. A former Navy officer, systems administrator, and network systems integrator with 20 years of IT journalism experience, he lives and works in Baltimore, Maryland.