SemiAnalysis - AI DC
https://outlook.office.com/mail/id/AAQkADg4M2FlMjA0LWVkMTEtNDgwNi04ODg2LTdiZTE5ZDBjOTQwNQAQAAnRBm1aiJVIh1wdSWUAw4E%3D 2/29
15/03/2024, 21:53 Mail - Jason Wong - Outlook
SemiAnalysis Estimates
The boom in generative AI, powered by transformers, will indeed need a lot
of transformers, generators and a myriad of other electrical and cooling
widgets.
Many back-of-the-envelope guesstimates and straight-up alarmist narratives
are based on outdated research. The IEA’s recent Electricity 2024 report
suggests 90 terawatt-hours (TWh) of power demand from AI Datacenters
by 2026, which is equivalent to about 10 Gigawatts (GW) of Datacenter
Critical IT Power Capacity, or the equivalent of 7.3M H100s. We estimate
that Nvidia alone will have shipped accelerators with the power needs of
5M+ H100s (mostly shipments of H100s, in fact) from 2021 through the end
of 2024, and we see AI Datacenter capacity demand crossing above 10 GW
by early 2025.
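As a sanity check, the conversion between annual energy and average power is straightforward. A minimal sketch, assuming a flat load over 8,760 hours per year and ~1,389W of all-in power per H100 (the per-GPU figure used in the Datacenter Math section below):

```python
HOURS_PER_YEAR = 8_760  # 24 * 365

def twh_per_year_to_gw(twh: float) -> float:
    """Convert annual energy consumption (TWh/yr) to average power draw (GW)."""
    return twh * 1_000 / HOURS_PER_YEAR  # 1 TWh = 1,000 GWh

gw = twh_per_year_to_gw(90)    # IEA's 2026 AI datacenter demand estimate
h100_equiv = gw * 1e9 / 1_389  # ~1,389 W all-in per H100 (assumption)
print(f"{gw:.1f} GW, roughly {h100_equiv / 1e6:.1f}M H100s")
```

The small gap versus the article's 10 GW / 7.3M figures comes down to rounding.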
SemiAnalysis Estimates
The need for abundant, inexpensive power, and to quickly add electrical
grid capacity while still meeting hyperscalers’ carbon emissions
commitments, coupled with chip export restrictions, will limit the regions
and countries that can meet the surge in demand from AI Datacenters.
Some countries and regions, such as the US, will be able to respond flexibly,
with low electrical grid carbon intensity and low-cost fuel sources with stable
supply, while others, such as Europe, will be effectively handcuffed by
geopolitical realities and structural regulatory constraints on power. Others
will simply grow capacity without regard for environmental impact.
Key Needs of Training and Inference
AI Training workloads have unique requirements that are very dissimilar to
those of typical hardware deployed in existing datacenters.
First, models train for weeks or months, with network connectivity
requirements largely limited to training data ingress. Training is
latency-insensitive and does not need to be near any major population
centers. AI Training clusters can be deployed essentially anywhere in the
world that makes economic sense, subject to data residency and
compliance regulations.
The second major difference to keep in mind is also somewhat obvious – AI
Training workloads are extremely power hungry and tend to run AI hardware
at power levels closer to their Thermal Design Power (TDP) than would a
traditional non-accelerated hyperscale or enterprise workload. Additionally,
while CPU and storage servers consume on the order of 1kW, each AI server
is now eclipsing 10kW. Coupled with the insensitivity towards latency and
decreased importance of proximity to population centers, this means that
the availability of abundant quantities of inexpensive electricity (and in the
future – access to any grid supply at all) is of much higher relative
importance for AI Training workloads vs traditional workloads. Incidentally,
some of these requirements are shared by useless crypto mining
operations, sans the scaling benefits of >100 megawatts at singular sites.
Inference, on the other hand, will eventually be a larger workload than
training, but it can also be quite distributed. The chips don’t need to be
centrally located, but the sheer volume will be staggering.
Datacenter Math
To arrive at the Critical IT Power Required for the GPUs purchased in this
example, add up the total expected power load of the IT equipment deployed. In
our example below, 20,480 GPUs at 1,389W per GPU equates to 28.4 MW of
Critical IT Power Required.
To get to the total power that the IT equipment is expected to consume
(Critical IT Power Consumed), we need to apply a likely utilization rate
relative to Critical IT Power Required. This factor accounts for the fact that
the IT equipment typically does not run at 100% of its design capability and
may not be utilized to the same degree over a 24-hour period. This ratio is
set to 80% in the example.
On top of the Critical IT Power Consumed, operators must also supply
power for cooling, power distribution losses, lighting and other non-IT
facility equipment. The industry uses Power Usage Effectiveness (PUE) to
measure the energy efficiency of datacenters. It is calculated by dividing
the total amount of power entering a datacenter by the power used to run
the IT equipment within it. It is of course a flawed metric, because cooling
within the server is counted as “IT equipment”. We account for this
overhead by multiplying the Critical IT Power Consumed by the PUE. A
lower PUE indicates a more power-efficient datacenter, with a PUE of 1.0
representing a perfectly efficient datacenter that spends no power on
cooling or any non-IT equipment. A typical enterprise colocation PUE is
around 1.5-1.6, while most hyperscale datacenters are below 1.4 PUE, with
some purpose-built facilities (such as Google’s) claiming to achieve PUEs
below 1.10. Most AI Datacenter specs aim for a PUE lower than 1.3. The
decline in industry-wide average PUE over the last decade, from 2.20 in
2010 to an estimated 1.55 by 2022, has been one of the largest drivers of
power savings and has helped avoid runaway growth in datacenter power
consumption.
For example, at an 80% utilization rate and a PUE of 1.25, the theoretical
datacenter with a cluster of 20,480 GPUs would on average draw 28-29 MW
of power from the grid, adding up to 249,185 megawatt-hours per year,
which would cost $20.7M per year in electricity based on an average US
power tariff of $0.083 per kilowatt-hour.
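The arithmetic above can be sketched end to end. A minimal model using the figures from this example (small rounding differences versus the quoted 249,185 MWh are expected):

```python
GPUS = 20_480
WATTS_PER_GPU = 1_389        # all-in per-GPU power in this example
UTILIZATION = 0.80           # likely draw vs design capability
PUE = 1.25                   # facility overhead multiplier
HOURS_PER_YEAR = 8_760
TARIFF_USD_PER_KWH = 0.083   # average US power tariff

critical_it_required_mw = GPUS * WATTS_PER_GPU / 1e6
critical_it_consumed_mw = critical_it_required_mw * UTILIZATION
grid_draw_mw = critical_it_consumed_mw * PUE
annual_mwh = grid_draw_mw * HOURS_PER_YEAR
annual_cost_usd = annual_mwh * 1_000 * TARIFF_USD_PER_KWH

print(f"Critical IT Power Required: {critical_it_required_mw:.1f} MW")
print(f"Average grid draw:          {grid_draw_mw:.1f} MW")
print(f"Annual energy:              {annual_mwh:,.0f} MWh")
print(f"Annual electricity cost:    ${annual_cost_usd / 1e6:.1f}M")
```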
Server deployments will therefore vary depending on the power supply and
cooling capacity available, with only 2-3 DGX H100 servers deployed per rack
where power/cooling constrained, and entire rows of rack space sitting empty
to double the effective power delivery density from 12 kW to 24 kW in
colocation datacenters. This spacing is also implemented to resolve cooling
oversubscription.
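The rack arithmetic can be sketched as follows; the ~10.2 kW per DGX H100 system is an assumed nameplate figure, and the rack budgets are illustrative:

```python
DGX_H100_KW = 10.2  # assumed nameplate power per DGX H100 system

def servers_per_rack(rack_budget_kw: float) -> int:
    """Whole servers that fit within a rack's power budget."""
    return int(rack_budget_kw // DGX_H100_KW)

for budget_kw in (12, 24, 40):  # typical colo, doubled density, purpose-built
    print(f"{budget_kw} kW rack -> {servers_per_rack(budget_kw)} server(s)")
```

At a 12 kW colo budget a single ~10 kW AI server fills the rack, which is why operators leave adjacent racks empty to concentrate power delivery.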
SemiAnalysis Estimates
As mentioned above, total Datacenter Critical IT Power demand will double
from about 49 GW in 2023 to 96 GW by 2026, with 90% of the growth
coming from AI-related demand. This is purely from chip demand, but
physical datacenters tell a different story.
Nowhere will the impact be felt more than in the United States, where our
satellite data shows the majority of AI Clusters are being deployed and
planned, meaning Datacenter Critical IT Capacity in the US will need to
triple from 2023 to 2027.
SemiAnalysis Estimates
Aggressive plans by major AI Clouds to roll out accelerator chips highlight
this point. OpenAI has plans to deploy hundreds of thousands of GPUs in
their largest multi-site training cluster, which requires hundreds of
megawatts of Critical IT Power. We can track their cluster size quite
accurately by looking at the buildout of the physical infrastructure,
Structure Research
From a supply perspective, sell-side consensus estimates of 3M+ GPUs
shipped by Nvidia in calendar year 2024 would correspond to over 4,200
MW of datacenter needs, nearly 10% of current global datacenter capacity,
just for one year’s GPU shipments. The consensus estimates for Nvidia’s
shipments are also very wrong of course. Ignoring that, AI is only going to
grow in subsequent years, and Nvidia’s GPUs are slated to get even more
power hungry, with 1,000W, 1,200W, and 1,500W GPUs on the roadmap.
Nvidia is not the only company producing accelerators, with Google
Datacenter Dynamics
The Carbon and Power Cost of AI Training
and Inference
Understanding the power requirements for training popular models can help
gauge power needs as well as the carbon emissions generated by the AI
industry. “Estimating the Carbon Footprint of BLOOM, a 175B Parameter
Language Model” examines the power usage of training the BLOOM model on
the Jean Zay supercomputer at IDRIS, a part of France’s CNRS. The paper
provides empirical observations of the relationship between an AI chip’s TDP
and total cluster power usage, including storage, networking and other IT
equipment, all the way through to the actual power draw from the grid.
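The chain the paper traces can be parameterized as a back-of-the-envelope model. The overhead factors below are illustrative assumptions, not the paper's measured values:

```python
def grid_draw_w(chip_tdp_w: float, n_chips: int,
                server_overhead: float = 1.5,  # CPUs, fans, NICs, storage (assumed)
                utilization: float = 0.8,      # average draw as fraction of TDP (assumed)
                pue: float = 1.2) -> float:    # facility overhead (assumed)
    """Illustrative chain from chip TDP to facility-level grid draw."""
    return n_chips * chip_tdp_w * server_overhead * utilization * pue

# e.g. a hypothetical cluster of 1,000 chips at 400W TDP
print(f"{grid_draw_w(400, 1_000) / 1e6:.2f} MW from the grid")
```

The value of the paper is that it pins down each of these multipliers empirically for one real cluster rather than leaving them as guesses.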
SemiAnalysis Estimates
Building out AI Infrastructure at Scale – What
makes a Real AI Superpower?
The AI Datacenter industry is going to need the following:
Inexpensive electricity costs, given the immense amount of power to be
consumed on an ongoing basis, particularly since inference needs will
only compound over time.
Stability and robustness of the energy supply chain against geopolitical
and weather disturbances, to decrease the likelihood of energy price
volatility, as well as the ability to quickly ramp up fuel production and
thus rapidly provision power generation at great scale.
Power generation with a low carbon intensity power mix overall, and
conditions suitable for standing up massive quantities of renewable
energy that can produce at reasonable economics.
Countries that can step up to the plate and tick off those boxes are
contenders to be Real AI Superpowers.
Electricity Tariffs, Power Mix, and Carbon
Intensity
Ember Electricity
China is largely self-reliant on coal for power generation, but it imports
the vast majority of its other energy needs, with over 70% of its petroleum
and LNG imports shipped through the Strait of Malacca and therefore
subject to the so-called “Malacca Dilemma”. For strategic reasons, then,
China cannot pivot towards natural gas and will have to rely on adding
coal and nuclear for baseload generation. China does lead the world in
adding renewable capacity; however, the huge existing base of fossil
fuel power plants and continued reliance on adding coal power to grow
overall capacity meant that in 2022 only 13.5% of total power generation
was from renewables.
To be clear, China is the best country in the world at building new power
generation, and it would likely lead in the construction of gigawatt-scale
datacenters if it were allowed to, but it cannot, so the US is dominating here.
Nuclear, the cleanest power in the world, has in some instances been
replaced with coal and natural gas. Renewable energy is increasing within
Europe’s power mix, but not fast enough, leaving many European countries
scrambling to pivot further towards natural gas, which now stands at
35-45% of the power generation mix for major Western European
countries.
Ember Electricity
Given Europe’s energy situation, the EU average industrial tariff reached
$0.18 USD/kWh in 2022, with the UK at $0.235 USD/kWh and datacenter
heavyweight Ireland at $0.211 USD/kWh, nearly triple the electricity cost in
the US. Like Asia, Europe imports over 90% of its gas in the form of LNG,
mainly sourced from the Middle East (and also still from Russia, despite the
ongoing war), so their entire industrial base, not just Datacenters, is subject
to geopolitical risk, as most readers will vividly remember from the onset of
the war in Ukraine. Given the political and geopolitical realities, adding a
massive amount of power generation capacity to host the AI Datacenter
boom in Europe would be very challenging.
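Applying the tariffs quoted above to the example 20,480-GPU cluster from the Datacenter Math section (249,185 MWh per year) illustrates the gap:

```python
ANNUAL_MWH = 249_185  # example 20,480-GPU cluster from the Datacenter Math section
TARIFFS_USD_PER_KWH = {  # 2022 industrial tariffs quoted above
    "US": 0.083,
    "EU average": 0.18,
    "UK": 0.235,
    "Ireland": 0.211,
}

costs_musd = {region: ANNUAL_MWH * 1_000 * tariff / 1e6
              for region, tariff in TARIFFS_USD_PER_KWH.items()}
for region, cost in costs_musd.items():
    print(f"{region:10s} ${cost:5.1f}M per year")
```

At Irish tariffs the same cluster's annual power bill is roughly 2.5x the US figure, before considering whether the grid capacity exists at all.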
Furthermore, Europe is allergic to building, as evidenced by the many
regulations and restrictions already in place on the datacenter and
manufacturing industries. While small projects and pipelines for datacenters
are in progress, especially in France, which has at least somewhat realized
the geopolitical necessity, no one is planning to build gigawatt-class
clusters in Europe.
We will also dive deeper into Meta’s buildouts specifically, discuss the
merits of solar versus wind on the renewable side and the regional
differences for deploying this type of power, and touch on power storage
capabilities and carbon emissions.