
AI Datacenter Energy Dilemma - Race for AI Datacenter Space
Gigawatt Dreams and Matrioshka Brains Limited By Datacenters, Not Chips
DYLAN PATEL, DANIEL NISHBALL, AND JEREMIE ELIAHOU ONTIVEROS
MAR 13


The boom in demand for AI clusters has led to a surge in focus on datacenter capacity, with extreme stress on electricity grids, generation capacity, and the environment. The AI buildouts are heavily limited by the lack of datacenter capacity, especially for training, as GPUs generally need to be co-located for high-speed chip-to-chip networking. The deployment of inference is heavily limited by aggregate capacity in various regions as well as better models coming to market.
There is plenty of discussion on where the bottlenecks will be. How large are the additional power needs? Where are GPUs being deployed? How is datacenter construction progressing by region, including North America, Japan, Taiwan, Singapore, Malaysia, South Korea, China, Indonesia, Qatar, Saudi Arabia, and Kuwait? When will the accelerator ramp be constrained by physical infrastructure? Will it be transformers, generators, grid capacity, or one of the other 15 datacenter component categories we track? How much capex is required? Which hyperscalers and large firms are racing to secure enough capacity, and which will be heavily constrained because they were caught flat-footed without datacenter capacity? Where will the Gigawatt and larger training clusters be built over the next few years? What is the mix of power generation types such as natural gas, solar, and wind? Is this even sustainable, or will the AI buildout destroy the environment?
Today let’s answer these questions, with the first half of the report free and the second half available to subscribers below.
Many are opining with ridiculous assumptions about datacenter buildout pace. Even Elon Musk has chimed in, but his assessment is not exactly accurate.
The artificial intelligence compute coming online appears to be
increasing by a factor of 10 every six months… Then, it was very easy to
predict that the next shortage will be voltage step-down transformers.
You've got to feed the power to these things. If you've got 100-300
kilovolts coming out of a utility and it's got to step down all the way to
six volts, that's a lot of stepping down. My not-that-funny joke is that
you need transformers to run transformers... Then, the next shortage will
be electricity. They won't be able to find enough electricity to run all the
chips. I think next year, you'll see they just can't find enough electricity
to run all the chips.
– Bosch Connected World Conference
To be clear, he is mostly right about these physical infrastructure limitations, but compute is not up 10x every six months. We track the CoWoS, HBM, and server supply chains of all major hyperscale and merchant silicon firms and see that total AI compute capacity, measured in peak theoretical FP8 FLOPS, has been growing at a still-rapid 50-60% quarter-on-quarter pace since 1Q23, i.e. nowhere close to 10x in six months. CoWoS and HBM are simply not growing fast enough.
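As a quick sanity check, here is the compounding arithmetic behind that comparison (a minimal sketch in Python):

```python
# Compound a quarter-on-quarter growth rate over two quarters ("six months")
# to compare against the claimed 10x per six months.
def six_month_multiple(qoq_growth: float) -> float:
    return (1 + qoq_growth) ** 2

for qoq in (0.50, 0.60):
    print(f"{qoq:.0%} QoQ -> {six_month_multiple(qoq):.2f}x per six months")
# 50% QoQ -> 2.25x; 60% QoQ -> 2.56x -- nowhere near 10x
```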


(Chart: SemiAnalysis Estimates)
The boom in generative AI, powered by transformers, will indeed need a lot
of transformers, generators and a myriad of other electrical and cooling
widgets.
A lot of back-of-the-envelope guesstimates or straight-up alarmist narratives are based on outdated research. The IEA’s recent Electricity 2024 report suggests 90 terawatt-hours (TWh) of power demand from AI Datacenters by 2026, which is equivalent to about 10 Gigawatts (GW) of Datacenter Critical IT Power Capacity, or the equivalent of 7.3M H100s. We estimate that Nvidia alone will have shipped accelerators with the power needs of 5M+ H100s (mostly shipments of H100s, in fact) from 2021 through the end of 2024, and we see AI Datacenter capacity demand crossing above 10 GW by early 2025.
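For reference, the conversion behind that comparison is straightforward; a sketch in Python, using the ~1,389 W all-in power per H100 derived in the Datacenter Math section below:

```python
# Convert the IEA's 90 TWh/year of AI datacenter demand into average power
# and H100 equivalents. 1,389 W per H100 is the all-in figure derived below.
HOURS_PER_YEAR = 8760
twh_per_year = 90
avg_gw = twh_per_year * 1e12 / HOURS_PER_YEAR / 1e9   # TWh/yr -> average GW
h100_equivalents = avg_gw * 1e9 / 1389                # W / (W per H100)
print(f"{avg_gw:.1f} GW average, ~{h100_equivalents / 1e6:.1f}M H100 equivalents")
# ~10.3 GW average, ~7.4M H100 equivalents
```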


(Chart: IEA Electricity 2024)


The above report underestimates datacenter power demand, but there are plenty of overestimates too – some from the alarmist camp have recycled old papers, written before the widespread adoption of accelerated compute, that point to a worst-case scenario with datacenters consuming a whopping 7,933 TWh, or 24% of global electricity generation, by 2030!
Here come the datacenter locusts, Dyson spheres, Matrioshka brains!


(Chart: On Global Electricity Usage of Communications Technology: Trends to 2030)
Many of these back-of-the-envelope estimates are a function of growth estimates for global internet protocol traffic and estimates for power used per unit of traffic, dampened by efficiency gains – all figures that are extremely hard to estimate – while others use top-down datacenter power consumption estimates created in the pre-AI era. McKinsey also has laughably bad estimates, which pretty much amount to picking a random CAGR and repeating it with fancy graphics.
Let’s correct the narrative here and quantify the datacenter power
crunch with empirical data.
Our approach forecasts AI Datacenter demand and supply through our analysis of over 1,100 datacenters in North America, spanning existing colocation and hyperscale datacenters and including construction progress forecasts for datacenters under development. For the first time ever for a study of its type, we combine this database with AI Accelerator power demand derived from our AI Accelerator Model to estimate AI and non-AI Datacenter Critical IT Power demand and supply. We also combine this analysis with regional aggregate estimates for geographies outside North America (Asia Pacific, China, EMEA, Latin America) collated by Structure Research to provide a holistic global view of datacenter trends. We supplement the regional estimates by tracking individual clusters and buildouts of note with satellite imagery and construction progress – for example, the up to 1,000 MW development pipeline in Johor Bahru, Malaysia (primarily by Chinese firms), just a few miles north of Singapore.
This tracking is done by hyperscaler, and it’s clear some of the largest
players in AI will lag behind others in deployable AI compute over the
medium term.
The AI boom will indeed rapidly accelerate datacenter power consumption
growth, but global datacenter power usage will remain well below the
doomsday scenario of 24% of total energy generation in the near term. We
believe AI will propel datacenters to use 4.5% of global energy generation
by 2030.


(Chart: SemiAnalysis Estimates)

The Real AI Superpowers


Datacenter power capacity growth will accelerate from a 12-15% CAGR to a 25% CAGR over the next few years. Global Datacenter Critical IT Power demand will surge from 49 Gigawatts (GW) in 2023 to 96 GW by 2026, of which AI will consume ~40 GW. In reality, the buildout is not this smooth, and there is a real power crunch coming soon.

(Chart: SemiAnalysis Estimates)


The need for abundant, inexpensive power, and to quickly add electrical grid capacity while still meeting hyperscalers’ carbon emissions commitments, coupled with chip export restrictions, will limit the regions and countries that can meet the surge in demand from AI Datacenters. Some countries and regions, such as the US, will be able to respond flexibly, with a low electrical grid carbon intensity and low-cost fuel sources with supply stability, while others, such as Europe, will be effectively handcuffed by geopolitical realities and structural regulatory constraints on power. Others will simply grow capacity without care for environmental impact.
Key Needs of Training and Inference
AI Training workloads have unique requirements that are very dissimilar to
those of typical hardware deployed in existing datacenters.
First, models train for weeks or months, with network connectivity requirements relatively limited to training data ingress. Training is latency insensitive and does not need to be near any major population centers. AI Training clusters can be deployed essentially anywhere in the world that makes economic sense, subject to data residency and compliance regulations.
The second major difference to keep in mind is also somewhat obvious – AI Training workloads are extremely power hungry and tend to run AI hardware at power levels closer to their Thermal Design Power (TDP) than a traditional non-accelerated hyperscale or enterprise workload would. Additionally, while CPU and storage servers consume on the order of 1 kW, each AI server now eclipses 10 kW. Coupled with the insensitivity to latency and the decreased importance of proximity to population centers, this means that the availability of abundant quantities of inexpensive electricity (and in the future – access to any grid supply at all) is of much higher relative importance for AI Training workloads than for traditional workloads. Incidentally, some of these requirements are shared by useless crypto mining operations, sans the scaling benefits of >100 megawatts in singular sites.
Inference, on the other hand, will eventually be a larger workload than training, but it can also be quite distributed. The chips don’t need to be centrally located, but the sheer volume will be staggering.
Datacenter Math

AI Accelerators achieve relatively high utilization rates (in terms of power usage, not MFU). The expected average power (EAP) from normal operation per DGX H100 server is ~10,200 W, which works out to 1,275 W for each of the 8 GPUs per server. This incorporates the 700 W Thermal Design Power (TDP) of the H100 itself, along with about 575 W (allocated per GPU) for the dual Intel Xeon Platinum 8480C processors, 2 TB of DDR5 memory, NVSwitches, NVLink, NICs, retimers, network transceivers, etc. Adding the power needs for storage and management servers as well as various networking switches for an entire SuperPOD gets us to an effective power requirement of 11,112 W per DGX server, or 1,389 W per H100 GPU. The DGX H100 configuration is somewhat overprovisioned with respect to storage and other items when compared to the HGX H100, which we account for. Companies like Meta have released enough information about their full configurations to estimate system-level power consumption.
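The arithmetic behind those figures, as a minimal sketch (the SuperPOD overhead per server is simply backed out of the 11,112 W figure):

```python
# Rebuilding the per-GPU power budget described above (figures from this report).
GPUS_PER_SERVER = 8
H100_TDP_W = 700                      # H100 GPU TDP
HOST_OVERHEAD_PER_GPU_W = 575         # CPUs, DDR5, NVSwitch/NVLink, NICs, optics, per GPU
server_eap_w = GPUS_PER_SERVER * (H100_TDP_W + HOST_OVERHEAD_PER_GPU_W)
print(server_eap_w)                   # 10200 W expected average power per DGX H100

# Storage, management servers, and switches across a SuperPOD add roughly
# 912 W per server (backed out of the 11,112 W figure above).
SUPERPOD_OVERHEAD_PER_SERVER_W = 11_112 - server_eap_w
per_gpu_all_in_w = (server_eap_w + SUPERPOD_OVERHEAD_PER_SERVER_W) / GPUS_PER_SERVER
print(per_gpu_all_in_w)               # 1389.0 W per H100, all-in
```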

(Image: NVIDIA DGX SuperPOD Datacenter Design)


Critical IT Power is defined as the usable electrical capacity at the datacenter floor available to the compute, server, and networking equipment housed within the server racks. It excludes the power needed to run cooling, power delivery, and other facility-related systems in the datacenter. To calculate the Critical IT Power capacity that needs to be built or purchased in this example, add up the total expected power load of the IT equipment deployed. In our example below, 20,480 GPUs at 1,389 W per GPU equates to 28.4 MW of Critical IT Power Required.
To get to the total power that the IT equipment is expected to consume
(Critical IT Power Consumed), we need to apply a likely utilization rate
relative to Critical IT Power Required. This factor accounts for the fact that
the IT equipment typically does not run at 100% of its design capability and
may not be utilized to the same degree over a 24-hour period. This ratio is
set to 80% in the example.
On top of the Critical IT Power Consumed, operators must also supply power for cooling, power distribution losses, lighting, and other non-IT facility equipment. The industry uses Power Usage Effectiveness (PUE) to measure the energy efficiency of datacenters: the total power entering the datacenter divided by the power used to run the IT equipment within it. It is of course a flawed metric, because cooling within the server is considered “IT equipment”. We account for this by multiplying the Critical IT Power Consumed by the PUE. A lower PUE indicates a more power-efficient datacenter, with a PUE of 1.0 representing a perfectly efficient datacenter that consumes no power for cooling or any non-IT equipment. A typical enterprise colocation PUE is around 1.5-1.6, while most hyperscale datacenters are below 1.4, with some purpose-built facilities (such as Google’s) claiming PUEs below 1.10. Most AI Datacenter specs aim for a PUE lower than 1.3. The decline in industry-wide average PUE, from 2.20 in 2010 to an estimated 1.55 by 2022, has been one of the largest drivers of power savings and has helped avoid runaway growth in datacenter power consumption.
For example, at an 80% utilization rate and a PUE of 1.25, the theoretical datacenter with a cluster of 20,480 GPUs would on average draw 28-29 MW of power from the grid, adding up to 249,185 Megawatt-hours per year, which would cost $20.7M USD per year in electricity based on average US power tariffs of $0.083 per kilowatt-hour.
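Putting the whole chain together for the 20,480-GPU example (a sketch using the figures above):

```python
# End-to-end datacenter math for the 20,480-GPU example cluster.
GPUS = 20_480
W_PER_GPU = 1_389            # all-in W per H100 from the SuperPOD example
UTILIZATION = 0.80           # average draw vs design capability
PUE = 1.25                   # total facility power / IT power
USD_PER_KWH = 0.083          # average US tariff cited above
HOURS_PER_YEAR = 8760

critical_it_mw = GPUS * W_PER_GPU / 1e6                 # ~28.4 MW required
grid_draw_mw = critical_it_mw * UTILIZATION * PUE       # ~28.4 MW (0.8 * 1.25 = 1.0)
annual_mwh = grid_draw_mw * HOURS_PER_YEAR              # ~249,000 MWh
annual_cost_musd = annual_mwh * 1_000 * USD_PER_KWH / 1e6
print(f"{critical_it_mw:.1f} MW IT, {grid_draw_mw:.1f} MW grid, "
      f"{annual_mwh:,.0f} MWh/yr, ${annual_cost_musd:.1f}M/yr")
```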


Datacenter Layouts and Constraints


While the DGX H100 server requires 10.2 kilowatts (kW) of IT power, most colocation datacenters can still only support ~12 kW per rack, though a typical hyperscale datacenter can deliver higher power capacity.

(Image: NVIDIA DGX SuperPOD Datacenter Design)



Server deployments will therefore vary depending on the power supply and cooling capacity available, with only 2-3 DGX H100 servers deployed per rack where power/cooling constrained, and entire rows of rack space sitting empty to double the power delivery density from 12 kW to 24 kW in colocation datacenters. This spacing is also implemented to resolve cooling oversubscription.

(Image: NVIDIA DGX SuperPOD Datacenter Design)


As datacenters are increasingly designed with AI workloads in mind, racks will be able to achieve power densities of 30-40 kW+ using air cooling with specialized equipment to increase airflow. The future use of direct-to-chip liquid cooling opens the door to even higher power density, potentially reducing per-rack power usage by 10% by eliminating fan power and lowering PUE by 0.2-0.3 by reducing or eliminating the need for ambient air cooling, though with PUEs already at 1.25 or so, this will be the last wave of meaningful PUE gains to be had.


(Image: Supermicro Liquid Cooling Whitepaper)


Another important consideration that many operators raise is that individual GPU server nodes are best positioned near each other to achieve acceptable cost and latency. A rule of thumb is that racks from the same cluster should be at most 30 meters from the network core. The short reach enables lower-cost multimode optical transceivers, as opposed to expensive single-mode transceivers that can span multiple kilometers. The specific multimode optical transceiver typically used by Nvidia to connect GPUs to leaf switches has a short range of up to 50 m. Using longer optical cables and longer-reach transceivers to accommodate more distant GPU racks will add cost, as much more expensive transceivers are needed. Future GPU clusters utilizing other scale-up network technology will also require very short cable runs to work properly. For instance, in Nvidia’s yet-to-be-deployed NVLink scale-up network for H100 clusters, which supports clusters of up to 256 GPUs across 32 nodes and can deliver 57.6 TB/s of all-to-all bandwidth, the maximum switch-to-switch cable length is 20 meters.


(Image: NVIDIA H100 Architectural White Paper)


The trend towards higher power density per rack is driven more by networking, compute efficiency, and cost-per-compute considerations – with regards to datacenter planning, the cost of floor space and data hall space efficiency is generally an afterthought. Roughly 90% of colocation datacenter costs are from power and 10% from physical space.
The data hall where IT equipment is installed is typically only about 30-40% of a datacenter’s total gross floor area, so designing a data hall that is 30% larger will only require ~10% more gross floor area for the entire datacenter. Considering that 80% of the GPU cost of ownership is capital costs, with 20% related to hosting (which bakes in the colocation datacenter costs), the cost of additional space is a mere 2-3% of total cost of ownership for an AI Cluster.
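The implied arithmetic, as a rough sketch (shares are the approximate figures quoted above, not a full TCO model):

```python
# Rough share-of-TCO arithmetic for floor space (illustrative, not a full model).
capital_share = 0.80                # GPU capital cost share of AI cluster TCO
hosting_share = 1 - capital_share   # ~20% hosting, which bakes in colo costs
space_share_of_colo = 0.10          # ~10% of colo costs are space; ~90% are power
space_share_of_tco = hosting_share * space_share_of_colo
print(f"Floor space ≈ {space_share_of_tco:.0%} of total cost of ownership")  # ≈ 2%
```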
Most existing colocation datacenters are not ready for rack densities above 20 kW per rack. Chip production constraints will meaningfully improve in 2024, but certain hyperscalers and colos will run straight into a datacenter capacity bottleneck because they were caught flat-footed on AI – most notably within colocation datacenters – as well as a power density mismatch, where the 12-15 kW power limits of traditional colocation will be an obstacle to achieving the ideal physical density of AI super clusters.
Rear door heat exchangers and direct-to-chip liquid cooling solutions can be deployed in newly built datacenters to solve the power density problem. However, it is much easier to design a new facility from the ground up incorporating these solutions than it is to retrofit existing facilities – realizing this, Meta has halted development of planned datacenter projects to rescope them into datacenters catering specifically to AI workloads. Meta had the worst datacenter design in terms of power density of all the hyperscalers, but they woke up and shifted very quickly. Retrofitting an existing datacenter is costly, time consuming, and in some cases may not even be possible – there may not be the physical space to install additional 2-3 MW generators, Uninterruptible Power Supplies (UPSs), switchgear, or additional transformers, and redoing plumbing to accommodate the Cooling Distribution Units (CDUs) needed for direct-to-chip liquid cooling is hardly ideal.

(Image: NVIDIA DGX SuperPOD Datacenter Design)


AI Demand vs Current Datacenter Capacity
Using line-by-line unit shipment forecasts by accelerator chip, based on our AI Accelerator Model, together with our estimated chip specifications and modeled ancillary equipment power requirements, we calculate total AI Datacenter Critical IT Power needs for the next few years.


(Chart: SemiAnalysis Estimates)
As mentioned above, total Datacenter Critical IT Power demand will double
from about 49 GW in 2023 to 96 GW by 2026, with 90% of the growth
coming from AI-related demand. This is purely from chip demand, but
physical datacenters tell a different story.
Nowhere will the impact be felt more than in the United States, where our
satellite data shows the majority of AI Clusters are being deployed and
planned, meaning Datacenter Critical IT Capacity in the US will need to
triple from 2023 to 2027.

(Chart: SemiAnalysis Estimates)
Aggressive plans by major AI Clouds to roll out accelerator chips highlight this point. OpenAI plans to deploy hundreds of thousands of GPUs in their largest multi-site training cluster, which requires hundreds of megawatts of Critical IT Power. We can track their cluster size quite accurately by looking at the buildout of the physical infrastructure, generators, and evaporation towers. Meta discusses an installed base of 650,000 H100 equivalents by the end of the year. GPU Cloud provider CoreWeave plans to invest $1.6B in a Plano, Texas facility, implying construction spend for up to 50 MW of Critical IT Power and the installation of 30,000-40,000 GPUs in that facility alone, with a clear pathway to a company-wide 250 MW datacenter footprint (equivalent to ~180k H100s), and they have multiple sites of hundreds of MW each in planning.
Microsoft had the largest pipeline of datacenter buildouts pre-AI era (see January 2023 data below), and our data shows it has skyrocketed since. They have been gobbling up any and all colocation space they can, as well as aggressively increasing their datacenter buildouts. AI laggards like Amazon have made press releases about nuclear-powered datacenters totaling 1,000 MW, but to be clear, they are lagging materially on real near-term buildouts, as they were the last of the hyperscalers to wake up to AI. Google and Microsoft/OpenAI both have plans for larger-than-Gigawatt-class training clusters in the works.

(Chart: Structure Research)
From a supply perspective, sell-side consensus estimates of 3M+ GPUs shipped by Nvidia in calendar year 2024 would correspond to over 4,200 MW of datacenter needs – nearly 10% of current global datacenter capacity – just for one year’s GPU shipments. The consensus estimates for Nvidia’s shipments are also very wrong, of course. Ignoring that, AI is only going to grow in subsequent years, and Nvidia’s GPUs are slated to get even more power hungry, with 1,000 W, 1,200 W, and 1,500 W GPUs on the roadmap. Nvidia is not the only company producing accelerators, with Google ramping custom accelerator production rapidly. Going forward, Meta and Amazon will also ramp their in-house accelerators.
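The shipment-to-power conversion is simple; a sketch using the ~1,389 W all-in per-GPU figure from the Datacenter Math section:

```python
# Convert consensus Nvidia CY2024 shipments into Critical IT Power needs.
gpus_shipped = 3_000_000      # sell-side consensus, "3M+"
w_per_gpu_all_in = 1_389      # all-in W per H100-class GPU, derived earlier
mw_required = gpus_shipped * w_per_gpu_all_in / 1e6
print(f"{mw_required:,.0f} MW")   # ~4,167 MW at exactly 3M; "3M+" pushes past 4,200 MW
```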
This reality is not lost on the top global hyperscalers, who are rapidly ramping up datacenter construction and colocation leasing. AWS literally bought a 1,000 MW nuclear-powered datacenter campus for $650M USD. Though only the very first building, with 48 MW of capacity, is likely to be online in the near term, this provides a valuable pipeline of datacenter capacity for AWS without being held up waiting for power generation or grid transmission capacity. We think a campus of such mammoth proportions will take many years to fully ramp to the promised 1,000 MW of Critical IT Power.

(Chart: Datacenter Dynamics)
The Carbon and Power Cost of AI Training and Inference
Understanding the power requirements for training popular models can help gauge power needs as well as the carbon emissions generated by the AI industry. Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model examines the power usage of training the BLOOM model on the Jean Zay compute cluster at IDRIS, a part of France’s CNRS. The paper provides empirical observations of the relationship of an AI chip’s TDP to total cluster power usage, including storage, networking, and other IT equipment, all the way through to the actual power draw from the grid.

Another paper, Carbon Emissions and Large Neural Network Training, reports the training time, configuration, and power consumption of training for a few other models. Power needs for training can vary depending on the efficiency of models and training algorithms (optimizing for Model FLOPs Utilization – MFU) as well as overall networking and server power efficiency and usage, but the results as reproduced below are a helpful yardstick.

(Table: Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model; Carbon Emissions and Large Neural Network Training)
The papers estimate the carbon emissions from training these models by multiplying the total power consumption in kWh by the carbon intensity of the power grid the datacenter runs on. Eagle-eyed readers will note the very low carbon intensity of 0.057 kg CO2e/kWh for training the BLOOM model in France, which sources 60% of its electricity from nuclear power – far lower than the 0.387 kg CO2e/kWh average for the US. We provide an additional set of calculations assuming the training jobs are run in datacenters connected to a power grid in Arizona, currently one of the leading states for datacenter buildouts.
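The papers’ emissions arithmetic is a single multiplication; a minimal sketch with a hypothetical energy figure (not the papers’ exact numbers):

```python
# Operational emissions = energy consumed x grid carbon intensity.
def training_emissions_tco2e(energy_mwh: float, kg_co2e_per_kwh: float) -> float:
    return energy_mwh * 1_000 * kg_co2e_per_kwh / 1_000   # kWh x kg/kWh -> metric tons

energy_mwh = 1_000   # hypothetical training run, for illustration only
for grid, intensity in [("France", 0.057), ("US average", 0.387)]:
    print(f"{grid}: {training_emissions_tco2e(energy_mwh, intensity):,.0f} t CO2e")
# France: 57 t; US average: 387 t -- the same run emits ~7x more on the US grid.
```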
The last piece of the emissions puzzle to consider is embodied emissions,
defined as the total carbon emissions involved in manufacturing and
transporting a given device, in this case the accelerator chip and related IT
equipment. Solid data on embodied emissions for AI Accelerator Chips is
scarce, but some have roughly estimated the figure at 150kg of CO2e per
A100 GPU and 2,500kg of CO2e for a server hosting 8 GPUs. Embodied
emissions work out to be about 8-10% of total emissions for a training run.


(Table: Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model; Carbon Emissions and Large Neural Network Training; EPA eGRID; SemiAnalysis Estimates)
The carbon emissions from these training runs are significant, with one GPT-3 training run generating 588.9 metric tons of CO2e, equivalent to the annual emissions of 128 passenger vehicles. Complaining about GPT-3 training emissions is like recycling plastic water bottles but then taking a flight every few months – literally irrelevant virtue signaling.
On the flip side, it is a safe bet that there were many iterations of training runs before settling on the final model. In 2022, Google emitted a total of 8,045,800 metric tons of CO2e from its facilities, including datacenters, before factoring in any offsets from renewable energy projects. All this means is that GPT-3 is not affecting the carbon output of the world, but with the FLOPS of GPT-4 multiple orders of magnitude higher, and the current OpenAI training run more than a magnitude above that, the carbon emissions of training are going to become sizeable in a few years.
For inference, we have detailed the economics of AI Cloud hosting in our posts on GPU Cloud Economics and on Groq Inference Tokenomics. A typical H100 server with 8 GPUs will emit about 2,450 kg of CO2e per month and require 10,200 W of IT power – a cost of $648 per month assuming $0.087 per kilowatt-hour (kWh).
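The monthly cost works out as follows (a sketch at full rated draw, with no PUE uplift applied):

```python
# Monthly electricity cost for an 8-GPU H100 server at full rated IT power,
# with no PUE uplift or utilization discount applied.
server_w = 10_200
usd_per_kwh = 0.087
hours_per_month = 8760 / 12
monthly_kwh = server_w / 1_000 * hours_per_month
print(f"{monthly_kwh:,.0f} kWh/month -> ${monthly_kwh * usd_per_kwh:,.0f}/month")
# ~7,446 kWh/month -> ~$648/month
```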


(Chart: SemiAnalysis Estimates)
Building out AI Infrastructure at Scale – What Makes a Real AI Superpower?
The AI Datacenter industry is going to need the following:
- Inexpensive electricity, given the immense amount of power to be consumed on an ongoing basis, particularly since inference needs will only compound over time.
- Stability and robustness of the energy supply chain against geopolitical and weather disturbances, to decrease the likelihood of energy price volatility, as well as the ability to quickly ramp up fuel production and thus rapidly provision power generation at great scale.
- Power generation with a low carbon intensity power mix overall, and one suited to standing up massive quantities of renewable energy at reasonable economics.
Countries that can step up to the plate and tick off those boxes are
contenders to be Real AI Superpowers.
Electricity Tariffs, Power Mix, and Carbon Intensity

(Chart: US EIA, Various National and Regional Electrical Distribution Organizations)


Comparing global electricity tariffs, the US has among the lowest power prices in the world at $0.083 USD/kWh on average. Natural gas production in the US is abundant and has surged since the shale gas revolution of the early 2000s, which made the US the world’s largest producer of natural gas. Almost 40% of US electricity generation is fueled by natural gas, with low power generation prices chiefly driven by the abundance of dry natural gas production from shale formations. Natural gas prices will remain depressed in the US due to fracking for oil and the increasing percentage of gas coming from oil wells as a byproduct, so the natural gas bulls that ping us every few weeks about datacenter power consumption should probably calm down. The fact that the US is energy independent in natural gas adds geopolitical stability to prices, and the widespread distribution of gas fields across the US adds supply chain robustness, while proven reserves of 20 years of consumption add longevity to energy supply. These reserve estimates have also increased over the years, doubling since 2015 and rising 32% in 2021 alone.


(Chart: US Energy Information Administration)


In addition, the US has a far greener energy mix than most of the other contenders, having reduced coal’s share of power generation from 37% in 2012 to 20% by 2022, with the coal mix forecast to reach 8% by 2030 as natural gas and renewables step in to fill the gap. This compares to India at a 75% coal mix, China at 61%, and even Japan still at 34% in 2022. This difference is very impactful, as coal power plants have a carbon intensity of 1.025 kg CO2e/kWh, over double that of natural gas plants at 0.443 kg CO2e/kWh. Datacenters built in the US will therefore rely on a far cleaner fuel mix for necessary baseload and overnight power generation than in many other countries.


(Chart: US Energy Information Administration)


The energy supply situation in the US stands in stark contrast to East Asia
and Western Europe, which host about 15% and 18% of global datacenter
capacity respectively. While the US is self-sufficient in natural gas,
countries such as Japan, Taiwan, Singapore, and Korea import well over
90% of their gas and coal needs.
Japan’s power mix is tilted towards these imported fuel types, with 35% natural gas, 34% coal, 7% hydro, and 5% nuclear, resulting in average industrial electricity tariffs of $0.152 USD/kWh in 2022, 82% higher than the US at $0.083 USD/kWh. Taiwan and Korea have similar power mixes dominated by natural gas imports and have electricity tariffs of about $0.10 to $0.12 USD/kWh, though this is after $0.03 to $0.04 of effective subsidies, given the state-owned electric companies have been running massive losses, with Korea’s KEPCO losing $24B in 2022 and Taiwan’s Taipower losing $0.04 USD per kWh sold.
In ASEAN, Singapore is another datacenter hub with a heavy reliance on imported natural gas, at 90% of its power generation mix, resulting in a high electricity tariff of $0.23 USD/kWh in 2022. The 900 MW of Critical IT Power that Singapore hosts is large relative to its power generation capacity and consumes over 10% of Singapore’s national power generation. For this reason, Singapore had placed a four-year moratorium on new datacenter builds, which was only lifted in July 2023 with the approval of a mere 80 MW of new capacity. This constraint has spawned an enormous development pipeline of up to 1,000 MW of capacity in Johor Bahru, Malaysia, just a few miles north of Singapore, with much of it driven by Chinese companies masquerading as Malaysian companies. Indonesia also has a significant pipeline.
China’s industrial electricity tariff of $0.092 USD/kWh is on the low end of the range, but like many other emerging markets, China has a very dirty power generation mix, with 61% of generation from coal. This is a significant disadvantage from an emissions perspective, and new coal power plants are still being approved despite China significantly leading the world in renewable power installation. Any hyperscale or AI company with a net-zero emissions commitment will be fighting an uphill battle given coal’s carbon intensity of 1.025 kg CO2e/kWh vs natural gas at 0.443 kg CO2e/kWh.

(Chart: Ember Electricity)
China is largely self-reliant on coal for power generation, but it imports the vast majority of its other energy needs, with over 70% of its petroleum and LNG imports shipped through the Strait of Malacca and therefore subject to the so-called “Malacca Dilemma”, meaning that for strategic reasons China cannot pivot towards natural gas and will have to rely on adding coal and nuclear for baseload generation. China does lead the world in adding renewable capacity; however, the huge existing base of fossil-fuel power plants and continued reliance on adding coal power to grow overall capacity means that in 2022 only 13.5% of total power generation was from renewables.
To be clear, China is the best country in the world at building new power generation, and it would likely lead in the construction of gigawatt-scale datacenters if allowed to, but it cannot, so the US is dominating here.

(Chart: US Energy Information Administration)


And this is all before looking the elephant in the room squarely in the eyes: the ongoing AI semiconductor export controls put in place by the US’s Bureau of Industry and Security, which have the intent of almost entirely denying China any form of AI chips. In this respect, the export controls whack-a-mole game is ongoing, with Nvidia tweaking its chips to comply with the latest changes to the controls. The H20 has a huge ramp in Q2 and onwards for China, but this is still nowhere close to the 35% to 40% of AI chips China would import if it were allowed to.
In Western Europe, electricity generation has been slowly declining, with a 5% cumulative drop over the past five years. One reason is that nuclear power has become a political non-starter, causing nuclear generation to decline massively – for example, by 75% in Germany from 2007 to 2021. A strong focus on the “environment” has led to dirty fuel sources such as coal also declining dramatically over the same time, although nuclear, the cleanest power in the world, has been replaced with coal and natural gas in some instances. Renewable energy is increasing within Europe’s power mix, but not fast enough, leaving many European countries scrambling to pivot more towards natural gas, which now stands at 35-45% of the power generation mix for major Western European countries.

(Chart: Ember Electricity)
Given Europe’s energy situation, the EU average industrial tariff reached $0.18 USD/kWh in 2022, with the UK at $0.235 USD/kWh and datacenter heavyweight Ireland at $0.211 USD/kWh – nearly triple the electricity cost in the US. Like Asia, Europe imports over 90% of its gas, mainly sourced from the Middle East in the form of LNG (and also still from Russia, despite the ongoing war), so its entire industrial base, not just datacenters, is subject to geopolitical risk, as most readers will vividly remember from the onset of the war in Ukraine. Given the political and geopolitical realities, adding a massive amount of power generation capacity to host the AI Datacenter boom in Europe would be very challenging.
Furthermore, Europe is allergic to building, as proven by the many regulations and restrictions already in place on the datacenter and manufacturing industries. While small projects and pipelines for datacenters are in progress, especially in France, which has at least somewhat realized the geopolitical necessity, no one is planning to build Gigawatt-class clusters in Europe. Europe has less than 4% of globally deployed AI Accelerator FLOPS based on our estimates.
As discussed, electricity pricing will matter considerably given the scale of the AI clusters to be deployed, amounting to hundreds of millions of dollars of cost difference depending on where the clusters are deployed. Locating AI Datacenters in Europe or Asia would easily double or triple the power costs vs building datacenters in the US. Furthermore, construction costs are also higher due to the lack of skilled labor.
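Applying the tariffs above to the 20,480-GPU example cluster from the Datacenter Math section illustrates the regional spread (a sketch; tariffs are the 2022 figures quoted above):

```python
# Annual electricity cost for the 20,480-GPU example cluster (~249,185 MWh/yr)
# under the regional tariffs cited above.
annual_mwh = 249_185
tariffs_usd_per_kwh = {"US": 0.083, "Ireland": 0.211, "UK": 0.235}
for region, tariff in tariffs_usd_per_kwh.items():
    cost_musd = annual_mwh * 1_000 * tariff / 1e6
    print(f"{region}: ${cost_musd:.1f}M/year")
# US: $20.7M, Ireland: $52.6M, UK: $58.6M -- a 2.5-3x spread per cluster.
```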
The Middle East is another location racing to start datacenter construction, and it scores very high on some of the Real AI Superpower criteria, with some of the lowest electricity tariffs globally and very high viability for the use of solar power. Indeed, the Middle East has a very strong pipeline, with the UAE expected to nearly triple Datacenter Critical IT Power from 115 MW in 2022 to 330 MW by 2026.
Saudi Arabia has already gotten involved with the purchase of a measly 3,000 H100s so far for its research institution, with plans to build its own LLM. Microsoft also announced plans to establish a datacenter in Saudi Arabia, following the launch of its Qatar datacenter in 2022. Saudi Arabia currently trails at 67 MW of Critical IT Power, but plans to leapfrog the UAE and reach 530 MW in the next few years.
Meanwhile, AI startup Omniva, which has only just exited stealth mode, purportedly aims to build low-cost AI Datacenter facilities in the Middle East with significant backing from a Kuwaiti royal family member. It boasts ex-AWS, Meta, and Microsoft staff among its key personnel. They are the only ones with real on-the-ground traction and have the most impressive talent and pedigree, but they are also in a legal battle, with Meta having sued an employee who allegedly stole documents and recruited 8 former employees to join.
Below we are going to quantify the power price differences, transformer infrastructure, and power generation capabilities, plus a breakdown of global datacenter capex requirements by UPS, generator, switchgear, power distribution, CRAH/CRAC, chillers, cooling towers, valves/pipes/pumps, project management, facility engineering, lighting, management, security, IT enclosures and containment, raised floors/dropped ceilings, and fire protection.

We will also dive deeper into Meta’s buildouts specifically, discuss the merits of solar versus wind on the renewables side and regional differences for deploying this type of power, and touch on power storage capabilities and carbon emissions.