A Systematic Review of Green AI
Roberto Verdecchia, June Sallou, Luís Cruz
arXiv:2301.11047v3 [cs.AI] 5 May 2023

[Overview figure: systematic literature review of the primary studies. Type of studies: position papers (11), observational studies (35), solution papers (52). Green AI topics: footprint monitoring, hyperparameter tuning, benchmarking, deployment, precision/energy trade-off, algorithm design, other. Further panels: most considered AI phase, most considered artefacts.]

footprint of AI is no longer negligible. AI researchers and practitioners are therefore urged to hold themselves accountable for the carbon emissions of the AI models they design and use. This led in recent years to the appearance of research tackling AI environmental sustainability, a field referred to as Green AI. Despite the rapid growth of interest in the topic, a comprehensive overview of Green AI research is to date still missing. To address this gap, in this paper, we present a systematic review of the Green AI literature. From the analysis of 98 primary studies, different patterns emerge. The topic experienced a considerable growth from 2020 onward. Most studies consider monitoring AI model footprint, tuning hy-
Still, most of the existing work focuses on the training stage of the AI model. Moreover, we observe that there is little involvement of the industry (23%) and that most studies revolve around laboratory experiments. We argue that the field is growing to a level of maturity in which involvement of the industry is quintessential to enable the overarching goal of Green AI: harness the full potential of AI without a negative impact on our planet.

To encourage open science and the reproducibility of this study, we provide all data and scripts in a replication package available online with an open-source license1.

1 Replication package: https://github.com/luiscruz/slr-green-ai

The remainder of this paper is structured as follows. In Section 2, we describe the methodology used to collect and analyze Green AI literature. In Section 3, we present all the results yielded by our methodology. Section 4 discusses findings and reflects on the impact of our results in the research community. In Section 5, we reflect on the potential threats to the validity of this study. Section 6 then describes related work and pinpoints the differences with our study. The main conclusions and future work are presented in Section 7.

2 Methodology
In this section, we document the research design, which was rigorously adhered to during the planning and execution of the study. We primarily followed the guidelines for conducting SLRs in software engineering research presented by Kitchenham [6].

2.1 Research Objective and Question
The goal of this review is to understand the characteristics of existing Green AI research. By utilizing the Goal-Question-Metric method [1], this objective can be described more formally as follows:

Analyze Green AI literature
For the purpose of knowledge collection and categorization
With respect to AI
From the viewpoint of researchers and practitioners
In the context of environmental sustainability.

The goal of this research can be directly translated into a research question (RQ), which reads as follows:

RQ: What are the characteristics of Green AI state-of-the-art research?

By answering our research question, we aim at gaining a systematic overview of the Green AI body of knowledge, starting from an outline of the general publication trends, to a detailed analysis of the past and current Green AI research activities and their characteristics.

2.2 Research Process
An overview of the research process followed is depicted in Figure 1. The process starts with the execution of a conservative automated search query via the digital libraries and indexing platforms Google Scholar, Scopus, and Web of Science, complemented by a subsequent iterative bidirectional snowballing process, which is conducted until the achievement of theoretical saturation. Including multiple literature indexing platforms to execute the automated search allows us to conduct an encompassing search of the literature based on multiple sources, hence allowing us to mitigate potential threats to external validity, as further documented in Section 5. In the following, each step of our research process is documented in detail.

2.2.1 Automated Initial Search. To identify a preliminary set of potentially relevant research works, we design an encompassing automated query to be executed on three different literature indexers, namely Google Scholar, Scopus, and Web of Science. The automated query targeting publication titles reads as follows:

Listing 1: Automated search query
1 INTITLE("green" OR "sustainab*") AND
2 INTITLE("AI" OR "ML" OR "artificial intelligence"
3 OR "machine learning" OR "deep learning")

The query is designed to retrieve literature with titles containing keywords related to sustainability, identified by the keywords green or sustainability and its variations, e.g., “sustainable” (Listing 1, Line 1). The second part of the query instead is used to retrieve literature concerning AI, or related synonyms and acronyms (Listing 1, Lines 2-3). The query was executed on the three aforementioned literature libraries and indexes on the 18th of July 2022, and led to the identification of 190 potentially relevant studies. In order to be as comprehensive as possible, and avoid potential threats to external validity, the year of publication is left unbounded in the automated search.

2.2.2 Application of Selection Criteria. Subsequent to the identification of the initial potentially relevant studies, we execute the manual selection of the studies via a set of selection criteria defined a priori. A paper is confirmed as a primary study if it adheres to all inclusion criteria, and none of the exclusion ones. The following inclusion (I) and exclusion (E) criteria are used:

I1- The study regards AI
I2- The study regards environmental sustainability
I3- The study regards the environmental sustainability of AI
I4- The study regards the software level
E1- The study is not written in English
E2- The study is not available
E3- The study is a duplicate or extension of an already included study
E4- The study is a secondary or tertiary study
E5- The study is in the form of editorials, tutorials, books, extended abstracts, etc.
E6- The study is a non-scientific publication or grey literature

With the first three inclusion criteria (I1-I3), we ensure that the primary studies focus on Green AI (I1, I2), and that the studies regard the environmental sustainability of AI, rather than the improvement of environmental sustainability through AI (I3). With the fourth inclusion criterion instead (I4), we ensure that the primary studies focus on software-centric Green AI. This latter criterion is used to exclude studies focusing on hardware-specific Green AI
techniques, e.g., the use of ad hoc implemented hardware components, which we consider out of reach for most researchers/practitioners interested in Green AI and only marginal to the definition of Green AI itself [92].

The exclusion criteria are designed to ensure that data can be extracted from the papers (E1, E2), that the papers do not represent duplication or redundancy with respect to other primary studies (E3, E4), and that they are provided in the form of scientific studies (E5, E6).

To ease the primary study selection process, adaptive reading depth [13] is used to efficiently assess potentially relevant studies. In order to mitigate subjective biases and interpretations, the three authors independently utilized the selection criteria to scrutinize 63-64 candidate studies. Weekly meetings are held during the selection process to jointly discuss examples and doubts, and to align the selection process between the three researchers.

The application of the selection criteria concludes with the identification of 16 primary studies, which constitute the starting set for the subsequent snowballing process.

2.2.3 Snowballing. In order to enrich the set of selected primary studies, and ensure that this set comprehensively represents the Green AI body of literature, the automated search results are complemented with a recursive bidirectional snowballing process [21]. This step entails the scrutiny of all studies either citing or cited by the already included primary studies. As for the application of selection criteria, three researchers are involved in the snowballing. During each snowballing round, the researchers independently snowball different primary studies, and propose new primary studies to be included, i.e., the newly identified studies which adhere to the selection criteria. At the end of each round, examples, doubts, and divergences are jointly revisited and resolved, and the next snowballing iteration is started. A total of two rounds of backward and forward snowballing are executed before no new studies are identified, i.e., when theoretical saturation is reached. The snowballing process terminates with the inclusion of 82 new primary studies, leading to a total of 98 primary studies which are considered in the literature review reported in this research.

2.2.4 Data Extraction. In order to achieve the intended goal of this study and answer our RQ (see Section 2.1), we proceed to systematically extract data from the primary studies. The data extraction process consisted of two subsequent phases.

The first phase consists of a data exploration process, which terminates with the establishment of the data extraction framework of this study. Specifically, during this first phase, the three authors of this review independently scan the identified primary studies, and annotate the characteristics of the studies which are relevant to answer our RQ. The identified characteristics are then jointly discussed and refined, leading to the consolidation of the fields constituting the data extraction framework of this review.

In the second data extraction phase, the primary studies are thoroughly analyzed, and the data is extracted from the studies according to the data extraction framework.

The fields of the data extraction framework utilized for this literature review on Green AI are the following.

• Green AI Definition: The level of abstraction used in the paper to quantify the impact of AI on the surrounding environment: energy efficiency [18], carbon footprint [20], or ecological footprint [9];
• Study type: The overarching type of study, which could be either presenting a position on Green AI, a Green AI solution, or an observational study on Green AI;
• Topic: The Green AI topic considered in the study, e.g., hyperparameter tuning to achieve energy efficiency of an AI algorithm;
• Domain: The domain considered in the study, e.g., edge or mobile computing;
• Type of data: The type of data utilized by AI in the study, e.g., text or images;
• Artifact considered: The AI artifact considered in the study, e.g., the data used by AI models, the AI models themselves, or the AI deployment pipeline;
• Considered phase: Whether the study focuses on the AI training phase, the AI inference phase, or both;
• Research strategy: The research strategy, as defined in [16], used to support the claims reported in the study;
• Dataset size: The size of the dataset, in number of data points, considered in the study (if any);
• Energy Savings: The reported percentage energy savings achieved by solutions reported in the study (if any is documented);
• Industry involvement: Industry involvement in the authorship of the study, which could be either academic-only authorship, industrial-only authorship, or mixed authorship;
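The fields listed above can be read as the columns of an extraction sheet with one row per primary study. The following is a minimal sketch of such a record, not the authors' actual tooling: all class, attribute, and function names are illustrative assumptions, and only the allowed values for the Green AI definition, study type, phase, and authorship come from the field descriptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional

# Allowed values follow the field descriptions above; every name in this
# sketch is a hypothetical choice, not taken from the replication package.
GREEN_AI_DEFINITIONS = {"energy efficiency", "carbon footprint", "ecological footprint"}


class StudyType(Enum):
    POSITION = "position"
    SOLUTION = "solution"
    OBSERVATIONAL = "observational"


class Phase(Enum):
    TRAINING = "training"
    INFERENCE = "inference"
    ALL = "all"


class Authorship(Enum):
    ACADEMIC_ONLY = "academic-only"
    INDUSTRIAL_ONLY = "industrial-only"
    MIXED = "mixed"


@dataclass
class ExtractionRecord:
    """One row of a hypothetical extraction sheet for a primary study."""
    study_id: str
    definitions: List[str]                 # a study may use more than one definition
    study_type: StudyType                  # exactly one type per study
    topics: List[str]                      # a study may address several topics
    domain: Optional[str] = None
    data_types: List[str] = field(default_factory=list)
    artifact: Optional[str] = None
    phase: Optional[Phase] = None
    research_strategy: Optional[str] = None
    dataset_size: Optional[int] = None     # number of data points, if reported
    energy_savings_pct: Optional[float] = None  # reported savings, if documented
    authorship: Optional[Authorship] = None

    def problems(self) -> List[str]:
        """Return a list of consistency problems; empty if the record looks sane."""
        found = []
        unknown = set(self.definitions) - GREEN_AI_DEFINITIONS
        if unknown:
            found.append(f"unknown Green AI definition(s): {sorted(unknown)}")
        if not self.topics:
            found.append("at least one topic is expected")
        if self.dataset_size is not None and self.dataset_size <= 0:
            found.append("dataset size must be a positive number of data points")
        return found
```

Keeping optional fields as `None` rather than guessing a value mirrors the "(if any)" qualifiers in the field descriptions, so unreported dataset sizes or energy savings stay distinguishable from zero.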
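The study identification steps of Sections 2.2.1 and 2.2.3 can be sketched in a few lines of code. This is an illustrative approximation only: the regular expressions emulate the title query of Listing 1 under the assumption of case-insensitive whole-word matching (real indexers may behave differently), and the citation graph, the `is_relevant` callback standing in for the manual selection criteria, and all function names are hypothetical.

```python
import re
from typing import Callable, Dict, Set, Tuple

# Listing 1, line 1: "green" OR "sustainab*" (the * acts as a prefix wildcard).
SUSTAINABILITY = re.compile(r"\b(green|sustainab\w*)\b", re.IGNORECASE)
# Listing 1, lines 2-3: AI, ML, and the spelled-out synonyms.
AI_TERMS = re.compile(
    r"\b(ai|ml|artificial intelligence|machine learning|deep learning)\b",
    re.IGNORECASE,
)


def title_matches(title: str) -> bool:
    """True if the title satisfies both INTITLE clauses of the query."""
    return bool(SUSTAINABILITY.search(title)) and bool(AI_TERMS.search(title))


def snowball(
    seed: Set[str],
    references: Dict[str, Set[str]],   # study -> studies it cites (backward)
    cited_by: Dict[str, Set[str]],     # study -> studies citing it (forward)
    is_relevant: Callable[[str], bool],
) -> Tuple[Set[str], int]:
    """Bidirectional snowballing until no new study is found (saturation).

    Returns the final set of studies and the number of rounds that
    contributed at least one new study.
    """
    included = set(seed)
    frontier = set(seed)
    productive_rounds = 0
    while frontier:
        candidates: Set[str] = set()
        for study in frontier:
            candidates |= references.get(study, set())
            candidates |= cited_by.get(study, set())
        # Only studies not yet included and passing the selection criteria
        # are added; they form the starting set of the next round.
        new = {c for c in candidates - included if is_relevant(c)}
        if new:
            productive_rounds += 1
        included |= new
        frontier = new
    return included, productive_rounds
```

In the review itself, the starting set consists of the 16 studies selected from the automated search, and the process stops after two rounds with 82 additional studies; the sketch only mirrors the control flow, since relevance is judged manually by the researchers.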
3 Results
In this section, we present the results collected with our SLR on Green AI.

3.1 Publication Years
The literature spans from 2015, with the first publication on the topic, to the present year (i.e., 2022). Figure 2 presents the distribution of the papers by publication year. We observe an overall increase over the years. Furthermore, a spike in the number of publications is seen in 2020, going from 7 publications in 2019 to 20 in 2020. As the automated initial search was launched in 2022, the publication trends reported in this review might not be representative of the actual research output of 2022 (see also Section 2.2.1).

3.2 Venue Types
Publications are particularly concentrated in conferences (⊲ 47 out of 98 papers) and journals (⊲ 39 out of 98 papers). Only 12 out of the 98 publications are associated with a workshop. Conferences being treated as a publishing venue equal to journals follows the trends observed in the computer science research field [5, 19].

Green AI publication trends
The topic of Green AI is experiencing an increasing trend of popularity, with a considerable growth in publications from 2020 onward. Most studies are published in conferences and journals, while only a minor portion in workshops.

3.3 Green AI Definition
The distribution of publications across different Green AI definitions is presented in Figure 3. Most literature addresses Green AI at the level of energy efficiency (81 papers). Higher-level definitions, namely carbon and ecological footprint, are only addressed in 20 and 9 publications respectively. Note that a primary study might be mapped to more than one definition, if more than one is used in the paper at hand.

3.4 Study Types
Existing literature on Green AI spans across three types of studies, namely observational, solution, and position papers (see also Section 2.2.4). As shown in Figure 4, from the 98 papers covered in this review, the most common are solution papers, with 51 entries, followed by observational with 35, and position papers with 12. Note that study types are mutually exclusive, i.e., a single paper has only one study type.

3.5 Green AI Topics
From our analysis we identify 13 main topics being addressed by the Green AI literature. Figure 5 depicts the distribution of publications across the different topics. The most popular topic is Monitoring, addressed by 28 papers, followed by Hyperparameter Tuning (18), Model Benchmarking (17), and Deployment (17). Since papers are not exclusive to a single topic, these top-4
Data-centric approaches for Green AI show that feature selection and subsampling techniques can significantly reduce the energy consumption of training machine learning models [102]. Subsampling strategies can be made more sophisticated by removing data points that are expected to be redundant in terms of knowledge acquisition [39].

Network Architecture ⊲ 6 out of 98 papers. The impact of a distributed network on the energy efficiency of AI. AI models are often deployed in a distributed context (e.g., IoT, edge computing). Hence, the design and architecture of the network play an important role in leveraging sustainable models.
For example, Kim and Wu [68] propose an adaptive execution engine that selects the inference strategy according to the signal strength of the network in different devices, as it is known to affect the energy efficiency of the edge mobile system.

Estimation ⊲ 5 out of 98 papers. Collecting and making sense of energy or climate data is far from trivial: many different factors contribute to the final estimation [48]. This topic revolves around understanding ways of estimating the energy consumption or carbon footprint of models.
Existing solutions to estimate energy consumption for software fail to provide meaningful insight about energy consumption that can be mapped to a machine learning model’s structure. IrEne creates a graph that breaks down NLP models into low-level machine learning primitives and provides energy estimations at the primitive level [37].

Emissions ⊲ 4 out of 98 papers. Papers that focus on understanding the carbon impact of creating and/or consuming AI systems.
Dhar [40] flags the importance of being able to quantify carbon impact and the lack of tools and data available. Fraga-Lamas et al. [44] go beyond reporting the energy consumption of an AI-enabled IoT scenario and present how much carbon would be emitted in different countries and with different energy sources.

Policy ⊲ 3 out of 98 papers. Studies within this topic address and discuss strategies on how we should handle the carbon footprint of AI as a society.
Perucica and Andjelkovic [85] reflect on the environmental policies implemented by the European Union, discussing whether they fit the AI era or new regulations are needed. Rohde et al. [89] call out the unclear dilemma between the impact of existing/upcoming AI technologies and the commitment to achieve the 1.5℃ climate change goal as expressed in the UNFCCC Paris Declaration.

Ethics ⊲ 3 out of 98 papers. Papers that focus on the ethical implications of the growing carbon footprint of AI. Tamburrini [100] discusses the responsibilities of AI scientists, AI infrastructure providers, and other stakeholders in enabling Green AI. The paper questions whether it is ethically justified to create massive AI pipelines to improve accuracy.

Green AI topics by study type
There are 13 main topics on Green AI. The majority (61%) of the publications focus on Monitoring, Hyperparameter-tuning, Model Benchmarking, and Deployment. Despite being important, topics such as Data-Centric, Estimation, and Emissions are underrepresented in the scientific literature.

3.6 Green AI Topics by Study Type
We further investigate the distribution of papers across different topics per study type. Figure 6 presents a bubble plot that draws a bubble for each pair of topic (x-axis) and study type (y-axis). The size of the bubble is proportional to the number of papers published in each pair. The plot enables a few observations.
Most topics adhere to the general pattern observed earlier in Section 3.4: the majority of papers consist of solution studies, followed by observational and then position. However, the topics of Model Benchmarking and Libraries do not follow this pattern, being mostly covered by observational papers. This is expected as these topics revolve around comparing different libraries and models to provide insight on the energy efficiency of different design decisions.
Moreover, papers from the least represented topics Ethics, Policy, and Emissions tend to be position papers. Of the ten studies in these three topics, only one is observational and none is a solution paper.
Also worth noticing is the fact that the majority of the position studies in Green AI only cover the smallest topics. Considering the top-10 topics (from Monitoring to Estimation), only 6 are position papers. In contrast, the bottom-4 topics (including Other) are covered by 10 position papers.

Green AI topics by study type
Most publications on Ethics, Policy, and Emissions are position studies calling for more research in these topics.

3.7 Domains
Figure 7 presents the distribution of the publications according to the domain they cover. The majority of the publications (i.e., ⊲ 58 out of 98 papers) do not devote their studies to a specific domain, but tackle the energy efficiency of AI in a general context. Regarding the domain-specific studies, the most covered domains are:

Edge: Regarding Internet of Things and Edge Computing, which are usually associated with distributed systems and networks. ⊲ 24 out of 98 papers.
Computer Vision: Regarding image recognition. ⊲ 6 out of 98 papers.
Cloud: ⊲ 5 out of 98 papers.
Mobile: ⊲ 4 out of 98 papers.

The Other category gathers publications about specific domains that are each covered only once, namely Health, Autonomous Driving, Smart cities, Human Activity, Wearables, and Embedded Systems.
Figure 5: Number of papers per Green AI topic. [Bar chart over the topics Monitoring, Hyperparameter-Tuning, Model Benchmarking, Deployment, Precision-Energy Trade-Off, Algorithm-Design, Libraries, Data-Centric, Network Architecture, Estimation, Emissions, Policy, Ethics, and Other.]

Figure 6: Number of publications by topic and study type. [Bubble plot; counts per topic and study type, where "-" marks a pair without a bubble.]

Topic                        Observational  Solution  Position
Monitoring                   11             12        5
Hyperparameter-Tuning        6              12        -
Model Benchmarking           14             3         -
Deployment                   5              12        -
Precision-Energy Trade-Off   3              8         -
Algorithm-Design             1              9         -
Libraries                    7              1         -
Data-Centric                 1              4         1
Network Architecture         1              5         -
Estimation                   1              4         -
Emissions                    1              -         3
Policy                       -              -         3
Ethics                       -              -         3
Other                        2              2         1

Figure 7: Number of publications per study domain. [Bar chart over the domains General, Edge, Computer Vision, Cloud, Mobile, and Other.]

3.8 AI Pipeline Phases
The AI pipeline is divided into two major phases: training, when the AI model is built, and inference, when the model is used to make predictions from new data. Thus, we classify the papers according to 3 categories: training, inference, and all. The all category reflects the fact that the paper does not consider a particular phase, but the whole pipeline.
As depicted in Figure 8, we find that most of the publications on the topics of Green AI focus on the training phase (⊲ 49 out of 98 papers). In comparison, fewer papers direct their studies at the inference phase (⊲ 17 out of 98 papers) or at the overall process (⊲ 32 out of 98 papers).

Figure 8: Number of publications per studied phase of AI. [Bar chart over the considered stage: Training 49, All 32, Inference 17.]
Green AI Pipeline Phase
Approximately half of Green AI studies focus on the training phase, while about a third considers the entire AI pipeline. Only a minor portion of the Green AI literature focuses on the inference phase.

3.9 Considered Artifacts
AI systems are based on several artifacts, and tackling the energy efficiency of such systems can thus involve several of those artifacts (e.g., data, model, pipeline) or different related artifacts (e.g., architecture, framework, CPU). A distribution of the artifacts considered in the primary studies is documented in Figure 9. The categories of artifacts are:

Model: The publications within this category focus on the model and/or associated algorithm to tackle the energy efficiency of AI. ⊲ 63 out of 98 papers.
Data: Papers that address energy efficiency through the study of the data used in the AI pipeline. ⊲ 8 out of 98 papers.
Pipeline: Studies looking at the whole AI pipeline. ⊲ 3 out of 98 papers.
Other: Publications dealing with CPU, architecture, and framework. ⊲ 4 out of 98 papers.
General: The papers do not specify a particular artifact and address AI systems as a whole. ⊲ 24 out of 98 papers.

Figure 9: Number of publications per studied artifact. [Bar chart over the artifacts: Algorithm 63, General 24, Data 8, Pipeline 3, Other 4.]

3.10 Algorithm Types
By considering the primary studies which focus on a specific algorithm (⊲ 51 out of 98 papers), we note that the vast majority focus on neural networks (⊲ 41 out of 98 papers). Only a much smaller fraction focuses on algorithms of different nature, such as decision trees (⊲ 5 out of 98 papers), genetic algorithms (⊲ 1 out of 98 papers), or logistic regression models (⊲ 1 out of 98 papers).
Regarding the deep neural network algorithms, we also note a further characterization of this field, with 8 studies focusing on convolutional neural networks, one on transformers, and one on spiking neural networks. We also observe three algorithms that appear only once in the Green AI literature (Other category, ⊲ 3 out of 98 papers), namely genetic algorithms, logistic regression algorithms, and stochastic gradient descent algorithms.

Green AI algorithm types
Most Green AI primary studies are algorithm-agnostic or focus on neural networks. A small fraction uses decision trees.

3.11 Data Types Used
Regarding the types of data used in the Green AI body of literature, an overview of their distribution is reported in Figure 10. From the figure, we can observe that the recurrence of data types across primary studies is:

Image data ⊲ 42 out of 98 papers.
Textual data ⊲ 22 out of 98 papers.
Numeric data ⊲ 10 out of 98 papers.
Video data ⊲ 4 out of 98 papers.
Audio data ⊲ 2 out of 98 papers.

From the distribution of data types, we notice that image data is by far the most used one, and is utilized by almost half of the studies in the body of literature. The second most utilized data type is textual data, which nevertheless appears approximately half as often as image data. Other types of data are less recurrent, with only few studies utilizing video data (e.g., Lenherr et al. present a metric to measure the sustainability of Green AI by considering as case study the Intel MovidiusX processor, an embedded video processor with a Neural Engine for video processing and object detection [74]).
A rather high number of primary studies does not specify any kind of data (Not specified category, ⊲ 32 out of 98 papers). This finding has to be primarily attributed to the position and theoretical papers included in the review (see also Section 3.4 and Section 3.5).

Figure 10: Occurrence of data types used in the Green AI literature. [Bar chart over the data types Image, Textual, Numeric, Video, Audio, and Not specified.]

Green AI data types
Image data is the most used data type in Green AI studies, followed by textual and numeric data.

3.12 Dataset sizes
Regarding the size of the datasets used in the papers, approximately half of the primary studies (⊲ 48 out of 98 papers) directly reference the number of data points used. By inspecting such numbers, we note that the number of data points used to study and to evaluate Green AI algorithms and approaches varies greatly, and ranges from
1k data points [51] to 40M data points [45]. Roughly half of the studies reporting the number of data points (⊲ 25 out of 48 papers) utilize data points in the order of thousands (1k ≤ #datapoints ≤ 70k), while the remaining (⊲ 23 out of 48 papers) use one million data points or more (1M ≤ #datapoints ≤ 40M).

Green AI dataset sizes
Dataset sizes range from 1k to 40M data points, with approximately half of the studies that report them utilizing 1M or more data points.

3.13 Research Strategies
By considering the research strategies [16] utilized in the Green AI literature, the distribution of the various strategies, according to the collected primary studies, is reported in Figure 11.
The majority of papers adopt laboratory experiments (⊲ 73 out of 98 papers), while only a fraction uses other research strategies, such as field experiments (⊲ 6 out of 98 papers), i.e., experiments conducted in pre-existing settings, and computer simulations (⊲ 5 out of 98 papers), i.e., “in silico” simulations conducted in a nonempirical setting. As examples, Liu et al. [77] use a field study to assess a green software stack for computer vision of autonomous robots, while Yosuf et al. [116] leverage computer simulations to study how virtualized cloud fog networks can be used to improve AI energy efficiency. The 12 papers not displaying any research strategy correspond to the position papers (cf. the “None” category in Figure 11).

[Figure 11: bar chart over the research strategy: Laboratory Experiment 73, Field Experiment 6, Computer Simulation 5, Judgement Study 2, None 12.]

Green AI Research Strategies
Most Green AI studies use laboratory experiments, while only a minority adopt other research strategies, such as field experiments and computer simulations.

3.14 Energy savings
By considering the energy savings reported as achievable via Green AI strategies, we note that fewer than a third of the primary studies explicitly document them (⊲ 27 out of 98 papers). Out of all Green AI strategies, among the ones which report concrete saving percentages, a technique based on structure simplification for deep neural networks saves the most energy, amounting to 115% energy savings [118]. The other techniques which optimize energy the most are based on quantizing the inputs of decision trees [23] (97% energy savings), using data-centric Green AI techniques [102] (92% energy savings), and leveraging efficient deployment of AI algorithms via virtualized cloud fog networks [116] (91% energy savings). Overall, more than half of the papers explicitly reporting energy saving percentages report a saving of at least 50% (⊲ 17 out of 27 papers), while a smaller number report savings between 13% and 49%.

Green AI energy savings
Studies report energy savings between 13% and 115%, with more than half of the papers reporting savings of at least 50%.

3.15 Industry involvement
Regarding the industry involvement in Green AI scientific publications (see also Section 2.2.4), an overview of the authorship of the Green AI primary papers is depicted in Figure 12.
From the figure, we can note that most Green AI studies are authored exclusively by academic researchers (⊲ 75 out of 98 papers), while a considerable portion, amounting to roughly a fifth of all primary studies, is authored by a mix of academic and industrial researchers (⊲ 20 out of 98 papers). Green AI studies written exclusively by industrial authors appear only in rare instances (⊲ 3 out of 98 papers).

Figure 12: Industry involvement. [Authorship of the primary studies: Academic 75, mixed 20, Industrial 3.]

Industry involvement
Most studies are written by academic authors, while a minor portion by a mix of academic and industrial authors. Green AI studies written exclusively by industrial authors are very rare.

3.16 Intended readers
By considering the intended readers of the Green AI scientific literature, the vast majority targets academic readers (⊲ 85 out of 98 papers), while a much smaller portion targets both academic and industrial readers (⊲ 8 out of 98 papers). Despite scientific papers intuitively targeting a specialized audience, among the Green AI literature, few studies are intended also for the general public (⊲ 5
out of 98 papers). For example, Dhar [40] presents an intuitive yet thoroughly positioned article on the systemic effect of AI on carbon emissions. Interestingly, among the primary studies, few are intended also for policymakers, i.e., aim to sensitize government stakeholders to consider issues related to Green AI. For example, Rohde et al. [89] discuss how opportunities and risks for the environment, economy, and society associated with AI can be governed.

Intended readers
The vast majority of Green AI studies target academic readers, while a much smaller portion targets both academic and industrial readers. A handful of studies, especially position papers, are intended for the general public.

3.17 Tool Provision
Among the primary studies collected for this literature review on Green AI, only a small fraction (⊲ 15 out of 98 papers) makes tools available to tackle Green AI. The tools provided are of heterogeneous nature, and range from tools to monitor the resource efficiency of AI algorithms [52], to tools optimizing the energy efficiency for stochastic edge inference [68], and implementations of convolutional neural networks optimized for energy efficiency [81].

Green AI Tool Provision
Albeit numerous studies provide solutions to tackle Green AI, only a fraction of them makes these solutions readily available online as implemented tools.

4 Discussion
The consolidated and still growing Green AI publication trend. From the analysis of the publication trends a clear picture emerges. The topic is gaining increasing traction in the academic community, especially if the latest years are considered (from 2020 onward). Despite being a quite new research topic (with the first paper on Green AI being published in 2015), the socio-environmental relevance of the topic seems to be reflected in its targeted publication

emissions alone. Based on these considerations, we define the field of Green AI as follows:

“Green AI regards practices aimed at utilizing AI to mitigate the impact that humans have on the natural environment in terms of natural resources utilized, and/or mitigating the impact that AI itself can have on the natural environment.”

On one hand, the definition above perfectly fits the studies focusing on the holistic impact that Green AI has on the natural environment. On the other hand, given its encompassing nature, the definition is also suited for studies focusing on lower abstraction levels of sustainability, such as Green AI CO2 emissions and energy consumption. In the latter case however, the definition also acts as a word of warning: while studying the lower levels of Green AI is paramount, only by considering the totality of the heterogeneous natural resources utilized by AI can we really understand the environmental impact of AI.

The transdisciplinary topics of Green AI (with gaps). The 13 different topics we discover in this review emphasize that Green AI is a broad field that needs to be tackled as a transdisciplinary field. Some topics are naturally tied to training strategies (e.g., monitoring, hyperparameter tuning, algorithm design). However, there are other topics that take Green AI outside the training realm.
This is the case, for example, of Deployment, Libraries, and Estimation, which promise to be relevant in enabling Green AI. We argue that other disciplines need to be involved. One example is Software Engineering, which has been dealing with these topics for traditional software systems. As highlighted by Cao et al. in their work on estimation [37], one cannot expect existing strategies for traditional software to address the new challenges of AI-based systems. Yet, only a few Green AI papers [50, 56, 80] come from software engineering venues.
Our analysis also shows that the topics Estimation and Emissions are under-represented, with five and four papers, respectively. We argue that more work is quintessential in these topics to help scientists and practitioners report the carbon footprint of their AI
venues. With conferences and journal being the most recurrent models in a seamless way.
Green AI publication venues, the Green AI research field seems We showcase that papers under the topic Policy are only covered
to have positioned and consolidated itself quite quickly within AI by position papers. We find this finding disconcerting: new policies
research communities. to encourage Green AI within both industry and academic contexts
A definition of Green AI. From the results regarding how the need to be backed up with reliable evidence. Hence, we need more
term “Green AI” is used in the literature a clear picture emerges. observational and solution papers that tackle this topic in the near
Most Green AI studies consider Green AI as exclusively related to future.
energy efficiency. Only fewer studies examine the influence of AI The same issue is present in Emissions – only one paper is
on greenhouse gas emissions (𝐶𝑂 2 ), and an even minor fraction ex- observational and the remaining are position. It might be the case
amines the holistic impact that AI has on the natural environment. that computing the climate impact of AI is far from trivial and it
By considering the different levels of abstraction (namely energy is easier said then done. Again, this is a call for the community to
efficiency, carbon footprint, and environmental footprint) the higher, take action. It is not enough to ask big companies to provide their
more encompassing level, of environmental footprint seems best data on carbon impact – we also need to provide strategies and
fitted to define the field of Green AI. In fact, as demonstrated in solutions to make it standard and straightforward.
recent literature, reducing the environmental impact of AI exclu- The fundamental Green AI research unbounded from ap-
sively to energy consumption has to be deemed as overly simplis- plication domains. From the collected results we deduce that,
tic process [8]. Similarly, as green resources are sustainable but in order to improve the environmental sustainability of AI, it is
not infinite [17], the field of Green AI has to account also for the often not necessary to focus on a specific domain. This implies
multifaceted environmental impact AI can have, other than 𝐶𝑂 2 that frequently fundamental aspects of Green AI are still open to
investigation, and results can then be ported from a generic setting to specific domains. However, from the obtained results, we also note that the increasing distribution of digital infrastructures to achieve environmental sustainability [17] might have played a role in Green AI research, with edge computing being the most considered specific domain.

The high emphasis on the AI training phase. The results regarding the AI pipeline phases considered in the literature unequivocally point to training as the most studied phase. Although the training phase is intuitively the most energy-greedy phase, this result calls for a word of caution. According to recent results (e.g., a study on data-centric Green AI [102]), the inference phase appears to consume only a negligible fraction of the energy consumed in the training phase. Nevertheless, given the high execution rate of the inference phase, how the energy consumed by the infrequent execution of the training phase compares to that of the frequently executed inference phase is still an open question. As a call for action, studies should consider the energy consumed throughout the whole life cycle of AI models, from training to inference, until their eventual deprecation.

Image datasets as primary Green AI data source. Considering the data types used in Green AI studies, we note that the vast majority of the literature uses image data. To the best of our knowledge, this choice is not guided by any specific research design rationale (e.g., AI models based on image data being the most used in practice, or being the most energy-greedy ones). For this reason, we conjecture that the popularity of image data in Green AI studies is mostly driven by convenience: because past work focused on such data by chance, because image datasets are more accessible/standardized than other ones, or because more off-the-shelf image AI models/libraries are currently available. Regardless of the cause, this result points to the need to utilize more heterogeneous data types, rather than focusing primarily on image data, in order to gain a holistic understanding of Green AI.

Laboratory experiments have guided Green AI so far. The most common research strategy adopted for Green AI studies clearly emerges from the literature as being laboratory experiments. Given the fast popularization and consolidation of the Green AI research field, this review suggests that the time is ripe to shift the focus to other research strategies, e.g., field experiments and case studies. This would not only change the considered context from an in vitro to an in vivo setting, but also help bridge potential gaps between academic research and industrial practice.

The highly promising energy savings of Green AI. From the results of this review, we deduce that the research field of Green AI is highly promising, with more than half of the papers reporting energy savings of 50% or more. This study focuses on the state of the art of Green AI, rather than on the state of practice. It would therefore be interesting to understand, as future work, the extent to which these encouraging results are transposed to industrial practice, and the potential impediments that hinder their adoption or full potential.

A noticeable industry involvement. Regarding industry involvement in Green AI studies, the results gathered from this review are promising. The authorship of Green AI literature turns out to be shared to a good extent between academic and industrial researchers/practitioners. This finding might highlight the sensitivity of industry towards Green AI concerns, and/or the importance of moving towards more environmentally sustainable AI practices. As a potential impediment to the industrial adoption of Green AI research, our results point to a low recurrence of studies targeted towards practitioners. While numerous journals are explicitly aimed at practitioners, e.g., IEEE Software², only a few of the Green AI studies included in our review target them. This result might indicate that Green AI interest is still primarily focused on academic activities, even though the authorship shows a rather high interest from industry. As a takeaway, similar to the considerations made for the Green AI research strategies, it might be the right moment to consider a higher involvement of industry in Green AI, which to date remains a research area still targeted primarily towards academic readers.

² https://www.computer.org/csdl/magazine/so. Accessed 22nd December 2022.

Green AI lacks tool support. Finally, from this review, we note that the current situation regarding the provisioning of Green AI tools is not bright. Although the majority of the studies present Green AI solutions, only a small fraction of them make the solutions available as a tool. We conjecture that this result might point towards either (i) a fast-paced nature of Green AI research, in which results are rapidly deprecated, and hence tools are not meaningful, or (ii) an immaturity of the research field, which still requires a solid empirical foundation on which tools can be built.

5 Threats to Validity

In this section, we discuss the threats to validity of our study. To ensure the quality of the results, we established a well-defined research protocol for the data collection. In addition, throughout our study, we followed the recommendations of the guidelines for conducting systematic literature reviews [6, 7, 10, 14, 21]. We designed and carried out the different reviewing processes according to the rigorous protocol that we established following these guidelines and described in Section 2. Nevertheless, some threats to validity can still exist despite our best efforts. In the following, we present the threats that could have influenced our study, together with the strategies we adopted to mitigate them.

External validity The main threat to external validity is that the literature collected and analysed in this study is not sufficiently representative. To avoid this situation, we surveyed three prominent literature indexers through an automatic query (i.e., Google Scholar, Scopus, and Web of Science), and left the year of publication unbounded, to reduce the probability of missing any relevant publication. In addition, the search query was designed to target relevant literature directly with specific keywords, while allowing for flexibility by considering similar, complementary, and variant keywords (e.g., the keywords green, sustainability, and sustainable). We also mitigated the threat of having an incomplete set of studies, as well as the threat associated with the specificity of the terms used in the search query, by performing a complementary iterative bidirectional snowballing process on the query results. This latter search strategy allowed us to include literature related to our query that did not directly reference any of the automated search keywords. We limited our review of the literature to peer-reviewed studies, to moderate the threat of low quality in the set of primary studies. We deem that such practice does not constitute
an additional threat, as peer review is a standard requirement of high-quality publications.

Internal validity To address potential threats to internal validity, we established a rigorous research protocol a priori, and we followed it to conduct all the research activities. Subjective biases and interpretations were mitigated by closely complying with the selection criteria to evaluate the studies. Moreover, weekly meetings were held during the selection process to jointly discuss examples and doubts, and to align the selection process among the three researchers.

Construct validity To ensure that the set of studies answered our research questions, we applied carefully constructed a priori inclusion and exclusion criteria to strictly control the manual selection of studies. We then used the bidirectional snowballing technique to expand the range of relevant primary studies to a more comprehensive set.

Conclusion validity Possible sources of bias arising from the data extraction and analysis phases were mitigated by strict compliance with an a priori defined protocol, explicitly tailored to collect the data needed to answer our research questions. In all, we followed the best practices of the standard guidelines for systematic literature reviews [6, 7, 10, 14, 21]. Lastly, we documented all the data throughout the whole review process and made them available for reproducibility and replicability purposes (see Section 1).

6 Related Work

Despite the growing interest around Green AI, the topic has been considered only marginally, in a handful of reviews. The related work mainly investigates the topic as an intersection of AI and environmental sustainability, or by defining it as a specific subdomain of software engineering. To the best of our knowledge, this review is the first aiming towards a comprehensive review of Green AI research and its characteristics.

In a recent publication, Natarajan et al. perform a systematic literature review on the topics of ‘AI for Environmental Sustainability’ as well as ‘Environmental Sustainability of AI’. The authors present the affordances of the use of AI for sustainability that they extracted from the literature [12]. ‘AI affordances’ are introduced as the possible actions offered by AI artifacts to an organizational actor whose goal is to achieve environmental sustainability. The authors point out the focus of previous research on the technical side, and they advocate for a further exploration of the concept of sustainable AI affordances from a socio-technical perspective. The literature is analyzed exclusively with respect to building the AI affordances, and other characteristics of the state of the art of Green AI are neither considered nor discussed in the study. In contrast, our review focuses on the sustainability of AI, and maps the entirety of the Green AI literature. In our review, we aim at providing a detailed and comprehensive overview of the characteristics of the Green AI state-of-the-art research (e.g., topic, domain, type of study, targeted artifact, overview of energy savings, tool provision, industrial involvement). Therefore, in contrast to the work of Natarajan et al. [12], we consider the different facets of Green AI, rather than exclusively AI affordances, leading to a more holistic review of Green AI, and a higher number of considered primary studies (98 versus 41 papers). This difference could be explained by the fact that their review only includes papers involving consumer products and services and excludes papers dealing with non-commercial applications, whereas we provide an overview of the whole field of Green AI.

Previous literature reviews consider Green AI research by focusing exclusively on specific subdomains of AI and application subdomains of Software Engineering, e.g., deep learning [22], information retrieval [15], or embedded systems [11]. In contrast, our research aims to review the entirety of the Green AI literature, regardless of the specific AI or software engineering subdomain it focuses on.

In the survey of Xu et al. [22], the authors provide an overview of the approaches aimed at improving the environmental sustainability of deep learning. The authors map the different approaches using a taxonomy of the deep learning life cycle stages and their related artifacts. In contrast to that study, in this review we target a higher number of Green AI characteristics (see Section 2.2.4), and target the entirety of the Green AI literature, rather than exclusively the literature on deep learning.

Scells et al. [15] provide a literature review on methods related to the domain of Green Information Retrieval. The authors explain that the domain of Information Retrieval (IR) produces relatively low emissions compared to other research domains, but they also warn that similar trends of costs and environmental impact may appear considering the growing development of new IR-focused deep learning models. Natural Language Processing and Machine Learning are also discussed, but only with respect to the Information Retrieval domain. Therefore, they do not address the whole field of AI, as done in this review.

Finally, the optimizations that can be made for the implementation of deep learning models on the specific platform of NVIDIA Jetson are reviewed with a focus on energy efficiency by Mittal [11]. The review covers studies at both the hardware and software level. Nevertheless, the review addresses only the Jetson platform³. We differentiate ourselves from this study by providing a holistic review of Green AI, rather than focusing exclusively on deep learning.

7 Conclusion

In this systematic literature review, we aimed at characterizing the existing body of research in Green AI. We identified 98 peer-reviewed publications that show a significant growth in this research field since 2020.

We provide an encompassing overview and characterization of the different topics being addressed by Green AI papers. We identified 13 different Green AI topics, showing that the spotlight falls on monitoring, hyperparameter tuning, model benchmarking, and deployment. Less frequent topics, such as data-centric, estimation, and emissions, show less obvious approaches that deserve further research in the upcoming years.

The potential of Green AI cannot be disregarded: the majority of publications show significant energy savings, up to 115%, at little or no cost in accuracy. However, we argue that most publications revolve around laboratory studies. More field experiments are essential to help AI practitioners embrace green strategies that
are effective, feasible, and measurable. This is also reflected in the small participation of the industry in these studies: only 23% of publications involve industry partners.

At the same time, we conclude that the field seems to be reaching a considerable level of maturity. Hence, it is necessary to encourage the porting of promising academic results to industrial practice. In other words, our study highlights the importance of having reproducible research. Only a small fraction of solution papers offers a tool or software package that can be used by the community. We argue that Green AI is an urgent and necessary line of research that needs to grow quickly and on solid foundations; non-replicable research can only slow us down.

This review also serves as a foundation for future research that ultimately aims to reduce the climate impact of AI. In this respect, we see potential in follow-up grey literature or interview studies to understand how AI professionals are currently addressing the issue.

//doi.org/10.1002/asi.23349
[20] Thomas Wiedmann and Jan Minx. 2008. A definition of ‘carbon footprint’. Ecological economics research trends 1, 2008 (2008), 1–11.
[21] Claes Wohlin. 2014. Guidelines for snowballing in systematic literature studies and a replication in software engineering. In International Conference on Evaluation and Assessment in Software Engineering. ACM Press, 1–10.
[22] Jingjing Xu, Wangchunshu Zhou, Zhiyi Fu, Hao Zhou, and Lei Li. 2021. A Survey on Green Deep Learning. arXiv (Nov. 2021). https://doi.org/10.48550/arXiv.2111.05193 arXiv:2111.05193

Primary Studies
[23] Brunno Abreu, Mateus Grellert, and Sergio Bampi. 2020. VLSI design of tree-based inference for low-power learning applications. In 2020 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 1–5.
[24] Brunno Abreu, Mateus Grellert, and Sergio Bampi. 2022. A framework for designing power-efficient inference accelerators in tree-based learning applications. Engineering Applications of Artificial Intelligence 109 (2022), 104638.
[25] Phyllis Ang, Bhuwan Dhingra, and Lisa Wu Wills. 2022. Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models. In Proceedings of NLP Power! The First Workshop on Efficient Benchmarking in NLP. Association for Computational Linguistics, Dublin, Ireland, 113–121. https://doi.org/10.18653/v1/2022.nlppower-1.12
[26] Lasse F. Wolff Anthony, Benjamin Kanding, and Raghavendra Selvan. 2020.
