LCA Book - Chapter 5
Chapter 5: Data Acquisition and Management for Life Cycle Inventory Analysis
Life Cycle Assessment: Quantitative Approaches for Decisions That Matter lcatextbook.com
Inventory analysis follows a straightforward and repeating workflow, which involves the
following steps (as taken from ISO 14044:2006) done as needed until the inventory analysis
matches the then-current goal and scope:
1. Preparation for data collection based on goal and scope
2. Data Collection
3. Data Validation
4. Data Allocation (if needed)
5. Translating Data to the Unit Process
6. Translating Data to the Functional Unit
7. Data Aggregation
As the inventory analysis process is iterated, the system boundary and/or goal and scope
may be changed (recall the two-way arrows in Figure 4-1). The procedure is as simple as
needed, and gets more complex as additional processes and flows are added. Each of the
inventory analysis steps is discussed in more detail below, with brief examples. Several more
detailed examples are shown later in the chapter.
Step 1 - Preparation for data collection based on goal and scope
The goal and scope definition guides which data need to be collected (noting that the goal
and scope may change iteratively during the course of your study and thus may cause
additional data collection effort or previously collected data to be discarded). A key
consideration is the product system diagram and the chosen system boundary. The boundary
shows which processes are in the study and which are not. For every unit process in the
system boundary, you will need to describe the unit process and collect quantitative data
representing its transformation of inputs to outputs. For the most fundamental unit
processes that interface at the system boundary, you will need to ensure that the inputs and
outputs are those elementary flows that pass through the system boundary. For other unit
processes (which may not be connected to those elementary flow inputs and outputs) you
will need to ensure they are connected to each other through non-elementary flows such as
intermediate products or co-products.
When planning your data collection activities, keep in mind that you are trying to represent
as many flows as possible in the unit process shown in Figure 5-2. Choosing which flows to
place at the top, bottom, left, or right of such a diagram is not relevant. The only relevant
part is ensuring inputs flow into and outputs flow out of the unit process box. You want to
quantitatively represent all inputs, either from nature or from the technosphere (defined as
the human altered environment, thus flows like products from other processes). By covering
all natural and human-affected inputs, you have covered all possible inputs. You want to
quantitatively represent outputs, either as products, wastes, emissions, or other releases.
Inputs from nature include resources from the ground, from water, or air (e.g., carbon
dioxide to be sequestered). Outputs to nature will be in the form of emissions or releases to
'compartments' in the ground, air, or water. Outputs may also be classified to 'direct human
uptake' for food products, medicines, etc.
As a tangible example, imagine a product system like the mobile phone example in Chapter 4
where we have decided that the study should track water use as an input. Any of the unit
processes within the system boundary that directly uses water will need a unit process
representation with a quantity of water as an input and some quantitative measure of output
of the process. For mobile phones, such processes that use water as a direct input from
nature may include plastic production, energy production, and semiconductor
manufacturing. Other unit processes within the boundary may not directly consume water,
but may tie to each other through flows of plastic parts or energy. They themselves will not
have water inputs, but by connecting them all together, in the end, the water use of those
relevant sectors will still be represented. The final overall accounting of inventory inputs
and/or outputs across the life cycle within the system boundary is called a life cycle
inventory result (or LCI result).
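As a rough sketch of how connecting unit processes yields an LCI result, the following Python fragment multiplies each process's direct water input by the amount of that process's output needed per phone, then sums the products. All process names and quantities here are hypothetical placeholders, not values from any study.

```python
import math

# Direct water input per unit of each process output (hypothetical, in litres).
direct_water = {
    "plastic (kg)": 10.0,
    "energy (kWh)": 2.0,
    "semiconductor (unit)": 30.0,
    "assembly (phone)": 0.0,  # uses no water directly
}

# Amount of each process output needed per phone (hypothetical).
needed_per_phone = {
    "plastic (kg)": 0.05,
    "energy (kWh)": 5.0,
    "semiconductor (unit)": 1.0,
    "assembly (phone)": 1.0,
}

# LCI result for water: sum of direct inputs scaled to one phone.
lci_water = sum(direct_water[p] * needed_per_phone[p] for p in direct_water)
print(f"Water use per phone: {lci_water:.1f} litres")
```

Note that the assembly process contributes no direct water, yet the phone's total still captures the water used upstream.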
The unit process focus of LCA drives the need for data to quantitatively describe the
processes. If data is unavailable or inaccessible, then the product system, system boundary,
or goal may need to be modified. Data may be available but found not to fit the study. For
example, an initial system boundary may include a waste management phase, but months of
effort could fail to find relevant disposition data for a specific product of the process. In this
case, the system boundary may need to be adjusted (made smaller) and other SDPs edited to
represent this lack of data in the study. On the other hand, data that is assumed to not be
available at first may later be found, which would allow an expansion of the system
boundary. In general, system boundaries are made smaller, not larger, over the course of a
study.
Step 2 - Data Collection
For each process within the system boundary, ISO requires you to "measure, calculate, or
estimate" data to quantitatively represent the process in your product system model. In LCA,
the "gold standard" is to collect your own data for the specific processes needed, called
primary data collection. This means directly measuring inputs and outputs of the process
on-site for the specific machinery use or transformation that occurs. For example, if you
required primary data for energy use of a process in an automobile assembly line that fastens
a component onto the vehicle with a screw, you might attach an electricity meter to the
piece of machinery that attaches the screw. If you were trying to determine the quantity of
fuel or material used in an injection molding process, you could measure those quantities as
they enter the machine. If you were trying to determine the quantity of emissions you could
place a sensor near the exhaust stack.
If you collect data with methods like this, intended to inventory per-unit use of inputs or
outputs, you need to use statistical sampling and other methods to ensure you generate
statistically sound results. That means not simply attaching the electricity meter one time, or
measuring fuel use or emissions during one production cycle (one unit produced). You
should repeat the same measurement multiple times, and perhaps on multiple pieces of
identical equipment, to ensure that you have a reasonable representation of the process and
to guard against the possibility that you happened to sample a production cycle that was
overly efficient or inefficient with respect to the inputs and outputs. The ISO Standard gives
no specific guidance or rules for how to conduct repeated samples or the number of samples
to find, but general statistical principles can be used for these purposes. Your data collection
summary should then report the mean, median, standard deviation, and other statistical
properties of your measurements. In your inventory analysis you can then choose whether to
use the mean, median, or a percentile range of values.
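A minimal sketch of such a summary, using Python's standard `statistics` module, is shown here. The readings are hypothetical values invented for illustration, and the 5th to 95th percentile range is just one possible choice of range.

```python
import statistics

def summarize(samples):
    """Return basic summary statistics for repeated measurements."""
    # n=20 gives 19 cut points; the first and last are the 5th and 95th percentiles.
    cuts = statistics.quantiles(samples, n=20, method="inclusive")
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples),  # sample standard deviation
        "p05": cuts[0],
        "p95": cuts[-1],
    }

# Hypothetical meter readings in kWh per unit produced (one reading per production cycle).
readings_kwh = [1.92, 2.05, 1.98, 2.40, 2.01, 1.95, 2.03, 1.99]
summary = summarize(readings_kwh)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this made-up sample, one unusually inefficient cycle (2.40 kWh) pulls the mean above the median, which is the kind of effect that a single measurement would hide.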
Note that many primary data collection activities cannot be completed as described above. It
may not be possible to gain access to the input lines of a machine to measure input use on a
per-item processed basis. You thus may need to collect data over the course of time and
then use total production during that time to normalize the unit process inventory. For the
examples in the previous paragraph, you might collect electricity use for a piece of machinery
over a month and then divide by the total number of vehicles that were assembled. Or you
may track the total amount of fuel and material used as input to the molding machine over
the course of a year. In either case, you would end up with an averaged set of inputs and/or
outputs as a function of the product(s) of the unit process. The same general principles
discussed above apply here with respect to finding multiple samples. In this case you could
find several monthly values or several yearly values to find an average, median, or range.
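The normalization described above can be sketched as follows. The monthly electricity and production figures are hypothetical; the point is only to show per-month values alongside a production-weighted average for the whole period.

```python
# Hypothetical monthly records: (electricity use in kWh, vehicles assembled).
monthly = [(5200.0, 1000), (4900.0, 980), (5600.0, 1050)]

# Electricity per vehicle for each month (several samples, as recommended above).
per_unit = [kwh / units for kwh, units in monthly]

# Production-weighted average over the whole period: total input / total output.
total_kwh = sum(kwh for kwh, _ in monthly)
total_units = sum(units for _, units in monthly)
overall = total_kwh / total_units

print("kWh per vehicle by month:", [round(x, 3) for x in per_unit])
print("kWh per vehicle overall:", round(overall, 3))
```

Dividing total input by total output weights each month by its production, which generally differs slightly from a simple average of the monthly ratios.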
The ISO Standard (14044:2006, Annex A) gives examples of "data collection sheets" that
can support your primary data collection activities. Note that these are only examples, and
that your sheets may look different. The examples are provided to ensure, among other
things, that you are recording quantities and units, dates and locations of record keeping, and
descriptions of sampling done. The most likely scenario is that you will create electronic data
collection sheets by recording all information in a spreadsheet. This is a fair choice because,
from our perspective, Microsoft Excel is the most widely used software tool in support of
LCA. Even practitioners using other advanced LCA software packages still typically use
Microsoft Excel for data management, intermediate analysis, and graphing.
Collecting primary data can be difficult or impossible if you do not own all the equipment or
do not have direct access to it either due to geospatial or organizational barriers. This is
often the case for an LCA consultant who may be tasked with performing a study for a client
but who is given no special privileges or access to company facilities. Further, you may need
to collect data from processes that are deemed proprietary or confidential by the owner. This
can happen in a comparative analysis of some long-established industry practice
versus a new technology being proposed by your client or employer. In these cases, the
underlying data collection sheets may be confidential. Your analysis may in these cases only
"internally use" the average data points without publicly stating the quantities found in any
subsequent reports. If the study is making comparative assertions, then it may be necessary
to grant to third-party reviewers (who have signed non-disclosure agreements) access to the
data collection sheets to appreciate the quality of the data and to assess the inventory analysis
done while maintaining overall confidentiality.
Beyond issues of access, while primary data is considered the "gold standard" there are
various reasons why the result may not be as good as expected in the context of an LCA
study. First, the data is only as good as the measurement device (see accuracy and precision
discussion in Chapter 2). Second, if you are not able to measure it yourself then you
outsource the measurement, verification, and validation to someone else and trust them to
do exactly as you require. Various problems may occur, including issues with translation
(e.g., when measuring quantities for foreign-owned or contracted production) or not finding
contacts with sufficient technical expertise to assist you. Third, you must collect data on
every input and output of the process relevant to your study. If you are using only an electric
meter to measure a process that also emits various volatile organic compounds, your
collected data will be incomplete with respect to the full litany of inputs and outputs of the
process. Your inventory for that process would undercount any other inputs or outputs.
This is important because if other processes in your system boundary track volatile organics
(or other inputs and outputs) your primary data will undercount the LCI results.
The alternative to primary data collection is to use secondary data (the "calculating and
estimating" referenced above). Broadly defined, secondary data comes from life cycle
databases, literature sources (e.g., from searches of results in published papers), and other
past work. It is possible you will find data closely, but not exactly, matching the required unit
process. Typical tradeoffs to accessibility are that the secondary data identified is for a
different country, a slightly different process, or averaged across similar machinery. That
does not mean you cannot use it; you just need to carefully document the differences
between the process data you are using and the specific process needed in your study. While
deemed inferior given the use of the word secondary, in some cases secondary data may be
of comparable or higher quality than primary data. Secondary data is typically discoverable
because it has been published by the original author who generated it as primary data for
their own study (and thus is typically of good quality). In short, one analyst's primary data
may be another's secondary data. Again, the "secondary" designation is simply recognition
that it is being "reused" from a previously existing source and not collected new in your own
study. Many credible and peer reviewed studies are constructed mostly or entirely of
secondary data. More detail on identifying and using secondary data sources like LCI
databases is below.
For secondary data, you should give details about the secondary source (including a full
reference), the timestamp of the data record, and when you accessed it. In both cases you
must quantitatively maintain the correct units for the inputs and outputs of the unit process.
While not required, it is convenient to make tables that neatly summarize all of this
information.
Regardless of whether your data for a particular process comes from a primary or secondary
source, the ISO Standard requires you to document the data collection process, give details
on when data have been collected, and other information about data quality. Data quality
requirements (DQRs) are required scope items that we did not discuss in Chapter 4 as part
of the SDP, but characterize the fundamental expectations of data that you will use in your
study. As specified by ISO 14044:2006, these include statements about your intentions with
respect to age of data, geospatial reach, completeness, sources, etc. Data quality indicators
are summary metrics used to assess the data quality requirements.
For example, you may have a data quality requirement that says that all data will be primary,
or at least secondary but from peer-reviewed sources. For each unit process, you can have a
data quality indicator noting whether it is primary or secondary, and whether it has been
peer-reviewed. Likewise, you may have a DQR that says all data will be from the same
geospatial region (e.g., a particular country like the US or a whole region like North
America). It is convenient to summarize the DQRs in a standardized tabular form. The first
two columns of Figure 5-3 show a hypothetical DQR table partly based on text from the
2010 Christmas tree study mentioned previously. The final column represents how the
requirements might be indicated as a summary in a completed study. The indicated values
are generally aligned with the requirements (as they should be!).
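Figure 5-3: Hypothetical data quality requirements (DQR) table. Columns: Data Quality Category (Temporal, Geospatial, Technological), Requirement, and a final column of indicated values.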
Beyond using primary or secondary data, you might need to estimate the parameters for
some or all of the inputs and outputs of a unit process using methods as introduced in
Chapter 2. Your estimates may be based on data for similar unit processes (but which you
deem to be too dissimilar to use directly), simple transformations based on rules of thumb,
or triangulated averages of several unit processes. From a third-party perspective, estimated
data is perceived as lower quality than primary or secondary sources. However, when those
sources cannot be found, estimating may be the only viable alternative.
Example 5-1: Estimating energy use for a service
Question:
Consider that you are trying to generate a unit process associated with an
internal corporate design function as part of the life cycle "overhead" of a particular product
and, given the scope of your study, need to quantify its electricity input. Your company is
located entirely in one building. There is no obvious output unit for such a process, so you could define
it to be per 1 product designed, per 1 square foot of design space, etc., as convenient for your
study.
Answer:
You could estimate the input electricity use for a design office over the course of
a year and then try to normalize the output. If you only had annual electricity use for the entire
building (10,000 kWh), and no special knowledge about the energy intensity of any particular
part of the building as subdivided into different functions, you could find the ratio of the total
design space in square feet (2,000 sf) as compared to the total square feet of the building
(50,000 sf), and use that ratio (2/50) to scale down the total consumption to an amount used
for design over the course of a year (400 kWh). If your output was per product, you could then
further normalize the electricity used for the design space by the unique number of products
designed by the staff in that space during the year.
You could add consideration of non-electricity use of energy (e.g., for heating or cooling)
with a similar method. Note that such ancillary support services like design, research and
development, etc., generally have been found to have negligible impacts, and thus many
studies exclude these services from their system boundaries.
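The arithmetic of Example 5-1 can be written out as a short Python sketch. The building and floor-area figures come from the example; the number of products designed is a hypothetical value added only to show the second normalization.

```python
def scale_by_floor_area(building_kwh, space_sf, building_sf):
    """Estimate a space's electricity use from its share of total floor area."""
    return building_kwh * space_sf / building_sf

# Values from Example 5-1: 10,000 kWh/yr for the building, 2,000 sf of 50,000 sf.
design_kwh = scale_by_floor_area(10_000, 2_000, 50_000)

# Hypothetical: number of unique products designed in that space during the year.
products_designed = 8
kwh_per_product = design_kwh / products_designed

print(f"Design space electricity: {design_kwh:.0f} kWh per year")
print(f"Electricity per product designed: {kwh_per_product:.1f} kWh")
```

This scaling assumes electricity use is uniform per square foot across the building, which is the same simplifying assumption made in the example.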
Step 3 - Data Validation
Chapter 2 provided some general guidance on validating research results. With respect to
validating LCI data, you generally need to consider the quantitative methods used and ensure
that the resulting inventories meet your stated DQRs. Data validation should be done after
data is collected but before you move on to the actual inventory modeling activities of your
LCA.
scope may sound like "cheating", but the purpose of the DQRs is as a single and convenient
summary of the study goals for data quality. It allows a reader to quickly get a sense of how
relevant the study results are given the final DQRs. While not required, you can state initial
goal DQRs alongside final DQRs upon completion of the study.
Step 4 - Data Allocation (if needed)
Allocation will be discussed more in Chapter 6, but in short, allocation is the quantitative
process done by the study analyst to assign specific quantities of inputs and outputs to the
various products of a process based on some mathematical relation between the products.
For example, you may have a process that produces multiple outputs, such as a petroleum
refinery process that produces gasoline, diesel, and other fuels and oils. Refineries use a
significant amount of energy. Allocation is needed to quantitatively connect the energy input
to each of the refined products. Without specified allocation procedures, the connections
between those inputs and the various products could be done haphazardly. The ISO
Standard suggests that the method you use to perform the allocation should be based on
underlying physical relationships (such as the share of mass or energy in the products) when
possible. For example, if your product of interest is gasoline, you will need to determine how
much of the total refinery energy was used to make the gasoline. For a mass allocation, you
could calculate it by using the ratio of the mass of the gasoline produced to the total mass of
all of the products. You may have to further research the energetics of the process to
determine what allocation method is most appropriate.
If physical relationships are not possible, then methods such as economic allocation (such
as by eventual sale price) could be used. ISO also says that you should consistently choose
allocation methods as much as possible across your product system, meaning that you
should try not to use a mass-based allocation most of the time and an energy-based
allocation some of the time. This is because mixing allocation methods could be viewed by
your audience or reviewers as a way of artificially biasing the results by picking allocations
that would provide low or high results. Allocation is conceptually similar to the design space
electricity estimate in Example 5-1. Most allocations are just linear transformations of effects.
When performing allocation, the most important considerations are to fully document the
allocation method chosen (including underlying allocation factors) and to ensure that total
inputs and outputs are equal to the sum of the allocated inputs and outputs. It is possible
that none of your unit processes have multiple products, and thus you do not need to
perform allocation. You might also be able to avoid allocation entirely, as we will see later.
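A minimal sketch of a mass-based allocation for the refinery illustration follows. The product masses and the total energy input are hypothetical numbers chosen for readability; the final check mirrors the requirement that the allocated amounts sum to the total.

```python
import math

def allocate(total_input, basis):
    """Split total_input across products in proportion to a basis (e.g., mass)."""
    total_basis = sum(basis.values())
    return {product: total_input * qty / total_basis
            for product, qty in basis.items()}

# Hypothetical refinery output masses (tonnes) and total energy input (GJ).
masses = {"gasoline": 45.0, "diesel": 30.0, "other fuels and oils": 25.0}
total_energy_gj = 1000.0

allocated = allocate(total_energy_gj, masses)

# Allocation factors should be documented, and allocated inputs must sum to the total.
factors = {p: q / sum(masses.values()) for p, q in masses.items()}
assert math.isclose(sum(allocated.values()), total_energy_gj)

for product, energy in allocated.items():
    print(f"{product}: factor {factors[product]:.2f}, {energy:.0f} GJ")
```

The same function would work for an energy-based or economic allocation by passing energy contents or sale values as the basis instead of masses.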
Step 5 - Translating Data to the Unit Process
In this step you convert the various collected data into a representation of the output of the
unit process. Regardless of how you have defined the study overall, this step requires you to
collect all of the inputs and outputs as needed for 1 unit output from that process. From
Example 5-1, you would ensure that the electricity input matched the unit basis of your
product flow (e.g., per 1 product designed). This result also needs to be validated.
Step 6 - Translating Data to the Functional Unit
The reason why this step is included in the ISO LCA Standard is to remind you that you are
doing an overall study on the basis of 1 functional unit of product output. Either during the
data collection phase, or in subsequent analysis, you will need to do a conversion so that the
relative amount of product or intermediate output of the unit process is related to the
amount needed per functional unit. Eventually, all of your unit process flows will need to be
converted to a per-functional unit basis. If all unit processes have been so modified, then
finding the total LCI results per functional unit is a trivial procedure. From Example 5-1, the
design may be used to eventually produce 1 million of the widgets. The electricity use for
one product design must be distributed to the 1 million widgets so that you will then have
the electricity use for a single widget in the design phase (a very small amount). This result
also needs to be validated.
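The conversion to a functional unit basis can be sketched as a simple scaling. The 1 million widgets figure is from the text; the electricity use per product design is a hypothetical value.

```python
import math

def per_functional_unit(flow_per_process_output, process_output_per_fu):
    """Scale a unit process flow to the amount needed per functional unit."""
    return flow_per_process_output * process_output_per_fu

# Hypothetical electricity use for one product design (kWh per design).
kwh_per_design = 50.0

# One design supports 1 million widgets, so each widget "needs" 1e-6 designs.
designs_per_widget = 1 / 1_000_000

kwh_per_widget = per_functional_unit(kwh_per_design, designs_per_widget)
print(f"Design-phase electricity per widget: {kwh_per_widget:.6f} kWh")
```

Applying the same scaling to every flow of every unit process puts the whole inventory on a per-functional-unit basis, after which totals can simply be summed.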
Step 7 - Data Aggregation
In this step, all unit process data in the product system diagram are combined into a single
result for the modeled life cycle of the system. What this typically means is summing all
quantities of all inputs and outputs into a single total result on a functional unit basis.
Aggregation occurs at multiple levels. Figure 4-4 showed the various life cycle stages within
the view of the product system diagram. A first level of aggregation may add all inputs and
outputs under each of the categories of raw material acquisition, use, etc. A second level of
aggregation may occur across all of these stages into a final total life cycle estimate of inputs
and outputs per functional unit. Aggregated results are often reported in a table showing
total inputs and outputs as per-process or per-stage values, and then a sum for the entire
product system. Example 5-2 shows aggregated results for a published study on wool from
sheep in New Zealand. The purpose of such tables is to emphasize category level results,
such as that half of the life cycle energy use occurs on farm. Results could also be graphed.
Example 5-2: Aggregation Table for Published LCA on Energy to Make Wool
(Source: The AgriBusiness Group, 2006)
Life Cycle Stage    Energy Use (GJ per tonne wool)
On Farm             22.6
Processing          21.7
Transportation      1.5
Total               45.7
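The aggregation in Example 5-2 can be reproduced with a few lines of Python, using the stage values from the table. Summing the rounded stage values gives 45.8 rather than the published 45.7, presumably because the source rounded each stage separately from the total.

```python
# Stage-level energy use from Example 5-2 (GJ per tonne wool).
stages = {"On Farm": 22.6, "Processing": 21.7, "Transportation": 1.5}

# Second-level aggregation: sum across all life cycle stages.
total = sum(stages.values())

# Share of the life cycle total contributed by each stage.
shares = {stage: value / total for stage, value in stages.items()}

for stage, value in stages.items():
    print(f"{stage}: {value} GJ ({shares[stage]:.0%})")
print(f"Total: {total:.1f} GJ per tonne wool")
```

The computed on-farm share is just under one half, consistent with the category-level observation made in the text.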
Beyond such tables, product system diagrams may be annotated with values for different
levels of aggregation by adding quantities per functional unit. Example 5-3 shows a diagram
for a published study on life cycle effects of bottled water and other beverage systems
performed for Nestle Waters. Such values can then be aggregated into summary results.
Example 5-3: Aggregation Diagram for Bottled Water (Source: Quantis, 2010)
We have above implied that aggregation of results occurs over a relatively small number of
subcomponents. However, a product system diagram may be decomposed into multiple sets
of tens or hundreds of constituent pieces that need to be aggregated. If all values for these
subcomponents are on a functional unit basis, the summation is not difficult, but the
bookkeeping of quantities per subcomponent remains an issue. If the underlying
subcomponent values are not consistently on a per functional unit basis, units of analysis
should be double checked to ensure they can be reliably aggregated.
for the sensitivity results to be stated so that it was clear that there is a variable that if
credibly changed by a specified amount, has the potential to alter the study conclusions.
While on the subject of assessing comparative differences, it is becoming common for
practitioners in LCA to use a "25% rule" when testing for significant differences. The 25%
rule means that two LCI results, such as for two competing products,
must differ by more than 25% to be deemed significantly different, and
thus for one to be declared as lower than the other. While there is not a large quantitative
framework behind the choice of 25% specifically, this heuristic is common because it
roughly expresses the fact that all data used in such studies is inherently uncertain; by
requiring 25% differences, relatively small differences are deemed too small to be
noted in study conclusions. We will talk more about modeling and assessing uncertainties in
Chapter 11 on uncertainty.
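One way to apply the 25% rule in code is sketched here. The text does not say which result the percentage difference is measured against, so this sketch assumes it is taken relative to the larger of the two results; that choice, the function name, and the sample values are all illustrative assumptions.

```python
def significantly_different(result_a, result_b, threshold=0.25):
    """Apply the 25% heuristic to two LCI results.

    Assumption: the difference is measured relative to the larger result.
    """
    larger = max(abs(result_a), abs(result_b))
    if larger == 0:
        return False  # two zero results are not different
    return abs(result_a - result_b) / larger > threshold

# Hypothetical LCI results for two competing products (same units).
print(significantly_different(100.0, 70.0))  # 30% apart
print(significantly_different(100.0, 80.0))  # 20% apart
```

Measuring against the smaller result or a baseline product instead would shift which pairs pass the threshold, so the basis used should be documented alongside the conclusion.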
Interpretation can also serve as an additional check on the goal and scope parameters. This is
where you could assess whether a system boundary is appropriate. As an example, while the
ISO Standard encourages full life cycle stage coverage within system boundaries, it does not
require that every LCA encompass all stages. One could try to defend the validity of a life
cycle study of an automobile that focused only on manufacturing, or only on the use stage.
The results of the interpretation phase could then internally weigh in on whether such a
decision was appropriate given the study goal. If a (qualified) conclusion can be drawn, the
study could be left as-is; if not, a broader system boundary could be chosen, with or without
preliminary LCI results.
Regardless, the real purpose of interpretation is to improve the quality of your study,
especially the quality of the written conclusions and recommendations that arise from your
quantitative work. As with other quantitative analysis methods, you will need to also improve
your qualitative skills, including documentation, to ensure that your interpretation efforts are
worthwhile.
One prominent source of secondary data is the thousands of peer-reviewed journal papers
done over time by the scientific community, also known as literature sources. Some of
these papers have been explicitly written to be a source of secondary data, while authors of
other papers developed useful data in the course of research (potentially on another topic)
and made the process-level details available as part of the paper or in its supporting
information. Sometimes the study authors are not just teams of researchers, but industry
associations or trade groups (e.g., those trying to disseminate the environmental benefits of
their products). Around the world, industry groups like Plastics Europe, the American
Chemistry Council, and the Portland Cement Association have sponsored projects to make
process-based data available via publications. It is common to see study authors citing
literature sources, and doing so requires you to simply use a standard referencing format like
you would for any source. Unfortunately, data from such sources is typically not available in
electronic form, and thus there are potentials for data entry or transcription errors as you try
to make use of the published data. It is due to issues like these that literature sources
constitute a relatively small share of secondary data used in LCA studies.
There is a substantial amount of secondary data available to support LCAs in various life
cycle databases. These databases are the main source of convenient and easy to access
secondary data. Some of the data represented in these databases are from the literature
sources mentioned above. Since the first studies mentioned in Chapter 1, various databases
comprised of life cycle inventory data have been developed. The original databases were sold
by Ecobilan and others in the mid-1990s. Nowadays the most popular and rigorously
constructed database is from ecoinvent, developed by teams of researchers in Switzerland
and available either by paying directly for access to their data website or by an add-on fee to
popular LCA system tools such as SimaPro and GaBi (which in turn have their own
databases). None of these databases are free, and a license must be obtained to use them. On
the other hand, there are a variety of globally available and publicly accessible (free) life cycle
databases. In the US, LCI data from the National Renewable Energy Laboratory (NREL)'s
LCI database and the USDA's LCA Digital Commons are popular and free [3]. Figure 5-4
summarizes the major free and paid life cycle databases (of secondary data) in the world that
provide data at the unit process level for use in life cycle studies. Beyond the individual
databases, there is also an "LCA Data Search Engine," managed by the United Nations
Environment Programme (UNEP), that can assist in finding available free and commercial
unit process data (LCA-DATA 2013). All of the databases have their own user's guides that
you should familiarize yourself with before searching or using the data in your own studies.
[3] Data from the NREL US LCI Database has been transferred over to the USDA LCA Digital Commons as of 2012. Both datasets can
now be accessed from that single web database.
Figure 5-4: Major free and paid life cycle databases providing unit process data
Database                   Approximate Cost                                         Number of processes   Notes
ecoinvent                  2,500 Euros ($3,000 USD)                                 4,000+
US NREL LCI Database       Free (companies, agencies pay to publish data)           600+
USDA LCA Digital Commons   Free (manufacturers and agencies pay to publish data)    300+
ELCD                       Free                                                     300+
BEES                       Free
GaBi                       $3,000 USD
These databases can be very comprehensive, with each containing data on hundreds to
thousands of unique processes, with each process comprised of details for potentially
hundreds of input or output flows. Collecting the various details of inputs and outputs for a
particular unit process (which we refer to as an LCI data module but which are referred to
as "datasets" or "processes" by various sources) requires a substantial amount of time and
effort. This embedded level of effort for unit process data is important because even though
it represents a secondary data source, to create a superior set of primary data for a study, you
might need to collect data for 100 or more input and output flows for the process. Of course
your study may have a significantly smaller scope that includes only 5 flows, and thus your
data collection activities would only need to measure those. The databases do highlight an
ongoing conundrum in the LCA community: the naive stated preference for primary data
when substantial high-quality secondary data is pervasive. Another benefit of these databases
is that subsets of the data modules are created and maintained consistently, so a common
set of assumptions or methods is associated with hundreds of processes. This is yet
another difference from primary data, which could have a set of ad hoc assumptions used in its
creation.
Now that the availability of vast secondary data sources has been introduced, we discuss the
data structures typical of these LCI data modules. As with many facets of LCA, there is a
global standard for storing information in LCI data modules, known as EcoSpold. The
EcoSpold format is a structured way of storing and exchanging LCI data, where details such
as flows and allocation methods are classified for each process. There is no requirement that
LCA tools use the EcoSpold format, but given its popularity and the trend that all of the
database sources in Figure 5-4 use this format, it is worth knowing. Rather than detailing
the format (which is fairly technical and generally only useful for personnel involved in
creating LCA software), we will demonstrate the way in which LCI data modules are
typically represented in the database and allow you to think about the necessary data
structures separately.
In the rest of this chapter we consider an LCI of the CO2 emitted to generate 1 kWh of
coal-fired electricity in the United States. Our system boundary for this example (as in Figure 5-5)
has only three unit processes: mining coal, transporting it by rail, and burning it at a power
plant. The refinery process that produces diesel fuel, an input for rail, is outside of our
boundary, but the effects of using diesel as a fuel are included. We can assume, beyond the
fact that this is an academic example, that such a tight boundary is realistic because these are
known to be significant parts of the supply chain of making coal-fired power. We will
discuss the use of screening methods to help us set such boundaries in Chapter 8.
Figure 5-5: Product System Diagram for Coal-Fired Electricity LCI Example
To achieve our goal of the CO2 emitted per kWh, we will need to find process-level data for
coal mining, rail transportation, and electricity generation. In the end, we will combine the
results from these three unit processes into a single estimate of total CO2 per kWh. This way
of performing process-level LCA is called the process flow diagram approach.
We will focus on the US NREL LCI database (2013) in support of this relatively simple
example. This database has a built-in search feature such that typing in a process name or
browsing amongst categories will show a list of available LCI data modules (see the
Advanced Material at the end of this chapter for brief tutorials on using the LCA Digital
Commons website, which hosts the US LCI data, as well as other databases and tools).
Searching for "electricity" yields a list of hundreds of processes, including these LCI data
modules:
The nomenclature used may be confusing, but is somewhat consistent across databases. The
constituents of the module name can be deciphered as representing (1) the product, (2) the
primary input, and (3) the boundary of the analysis. In each of the cases above, the unit
process is for making electricity. The inputs are various types of fuels. Finally, the boundary
is such that it represents electricity leaving the power plant (as opposed to at the grid, or at a
point of use like a building). Once you know this nomenclature, it is easier to browse the
databases to find what you are looking for specifically.
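The naming pattern described above can be sketched in a few lines of code. This is only an illustration, assuming names are comma-separated and that a boundary, when present, is the last part and begins with "at". The `ModuleName` type and `parse_module_name` helper are hypothetical (they are not part of any database tool), and real module names do not always follow the pattern.

```python
from typing import List, NamedTuple, Optional


class ModuleName(NamedTuple):
    """Hypothetical container for the parts of an LCI data module name."""
    product: str
    qualifiers: List[str]      # e.g., the primary input or technology
    boundary: Optional[str]    # e.g., "at power plant"; None if not stated


def parse_module_name(name: str) -> ModuleName:
    """Split a comma-separated module name into product, qualifiers, boundary."""
    parts = [p.strip() for p in name.split(",") if p.strip()]
    if not parts:
        raise ValueError("empty module name")
    product, rest = parts[0], parts[1:]
    # Assumption: a boundary is the final part and starts with "at ".
    has_boundary = bool(rest) and rest[-1].lower().startswith("at ")
    boundary = rest[-1] if has_boundary else None
    qualifiers = rest[:-1] if has_boundary else rest
    return ModuleName(product, qualifiers, boundary)


parsed = parse_module_name("Electricity, bituminous coal, at power plant")
print(parsed.product, "|", parsed.qualifiers, "|", parsed.boundary)
```

A name such as "Transport, train, diesel powered" has no "at ..." part, so the sketch reports no boundary and treats the remaining parts as qualifiers.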
Given the above choices, we want to use one of the three coal-fueled electricity generation
unit processes in our example. Lignite and anthracite represent small shares of the
generation mix, so we choose bituminous coal as the most likely representative process and
use the last data module in the list above (alternatively, we could develop a weighted-average
process across the three coal types). Using similar brief search methods on
the US NREL website, we would find the following unit processes as relevant for the other
two pieces of our system:
These two processes represent mining of bituminous coal and the transportation of generic
product by diesel-powered train.
Figure 5-6 shows an abridged excerpt of the US NREL LCI data module for Electricity,
bituminous coal, at power plant. The entire data module is available publicly4. Within the US
NREL LCI database website, such data is found by browsing or searching for the process
name and then viewing the "Exchanges". These data modules give valuable information
about the specific process chosen as well as other processes they are linked to. While here
we discuss viewing the data on the website, it can also be downloaded to a Microsoft Excel
spreadsheet or as XML.
It is noted that this is an abridged view of the LCI data module. The complete LCI data
module consists of quantitative data for 7 inputs and about 60 outputs. For the sake of the
example in this section, we assume the abridged inventory and ignore the rest of the details.
Most of the data modules in databases have far more inputs and outputs than in this
abridged module; it is not uncommon to find data modules with hundreds of outputs (e.g.,
for emissions of combustion processes). If you have a narrow scope that focuses on a few
air emission outputs, many of the other outputs can be ignored in your analysis. However if
you plan to do life cycle impact assessment, the data in the hundreds of inputs and/or
outputs may be useful in the impact assessment. If your study seeks to do a broad impact
assessment, collecting your own primary data can be problematic as your impact assessment
will all but require you to broadly consider the potential flows of your process. If you focus
instead on just a few flows you deem to be important, then the eventual impact assessment
could underestimate the impacts of your process. This is yet another danger of primary data
collection (undercounting flows).
4 Data from the NREL US LCI database in this chapter are as of July 20, 2014. Values may change in revisions to the database that cannot
be expressed here.
Flow                                          Category         Type            Unit  Amount    Comment
Inputs
Bituminous coal, at mine                      root/flows       ProductFlow     kg    4.42e-01
Transport, train, diesel powered              root/flows       ProductFlow     t*km  4.61e-01
Outputs
Electricity, bituminous coal, at power plant  root/flows       ProductFlow     kWh   1.00
Carbon dioxide, fossil                        air/unspecified  ElementaryFlow  kg    9.94e-01
Figure 5-6: Abridged LCI data module from US NREL LCI Database for bituminous coal-fired
electricity generation. Output for functional unit italicized. (Source: US LCI Database 2012)
Figure 5-6 is organized into sections of data for inputs and outputs. At the top, we see the
abridged input flows into the process for generating electric power via bituminous coal.
Recalling the discussion of direct and indirect effects from Chapter 4, the direct inputs listed
are bituminous coal and train transport. The direct outputs listed are fossil CO2 emissions
(which is what results when you burn a fossil fuel) and electricity. Before discussing all of the
inputs and outputs, we briefly focus on the output section to identify a critical component of
the data module: the electricity output is listed as a product flow, with units of 1 kWh.
Every LCI process will have one or more outputs, and potentially have one or more product
flows as outputs, but this module has only one. That means that the functional unit basis for
this unit process is per (1) kWh of electricity. All other inputs and outputs in Figure 5-6,
representing the US NREL LCI data module for Electricity, bituminous coal, at power plant, are
presented as normalized per 1 kWh. You could think of this module as providing energy
intensities or emissions factors per kWh. Thinking back to the discussion above on data
collection, it's unlikely that the study done to generate this LCI data module actually
measured the inputs and outputs needed to make just 1 kWh of electricity at a power plant,
because that is too small a value. In reality, it is likely that the inputs and outputs were measured over
the course of a month or year, and then normalized by the total electricity generation in kWh
to find these normalized values. It is the same process you would follow if you were making the
LCI data module yourself. We will discuss how to see the assumptions and boundaries for
the data modules later in this chapter.
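The normalization step just described can be sketched as follows. The annual totals below are hypothetical placeholders, chosen only so that the results match the per-kWh values in Figure 5-6. They are not measurements from any real plant, and `normalize_per_kwh` is an illustrative helper name.

```python
def normalize_per_kwh(annual_totals: dict, annual_kwh: float) -> dict:
    """Divide each measured annual flow by total generation to get per-kWh values."""
    if annual_kwh <= 0:
        raise ValueError("annual generation must be positive")
    return {flow: total / annual_kwh for flow, total in annual_totals.items()}


# Hypothetical plant-year (assumed values, for illustration only).
annual_kwh = 2.0e9                      # kWh generated over the year
annual_totals = {
    "bituminous coal (kg)": 8.84e8,     # coal burned over the year
    "CO2, fossil (kg)": 1.988e9,        # CO2 emitted over the year
}

factors = normalize_per_kwh(annual_totals, annual_kwh)
for flow, value in factors.items():
    print(f"{flow}: {value:.3f} per kWh")
```

The same division applies whether the measurement period is a month or a year, as long as the flows and the generation total cover the same period.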
We now consider the abridged data module in more detail. In Figure 5-6, each of the input
flows is a product flow from another process (namely, the product of bituminous coal
mining and the product of train transportation). The unit basis for those inputs
is also given: kg for the coal and ton-kilometers (t*km) for the transportation. A
ton-kilometer is a compound unit (like a kilowatt-hour) that expresses the movement of 1 ton of
material over the distance of 1 kilometer. Both are common metric units. Finally, the amount of
input required is presented in scientific notation and can be translated into 0.442 kg of coal
and 0.461 ton-km of train transport. Likewise, the output CO2 emissions to air are estimated
at 0.994 kg. All of these quantities are normalized on a per-kWh generated basis. The
comment column in Figure 5-6 (which also appears in many data modules) gives brief but
important notes about specific inputs and outputs. For example, the input of train
transportation is specified as being a potential transportation route from mine to power
plant, which reminds us that the unit process for generating electricity from coal is already
linked to a requirement of a train from the mine.5
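One way to think about the data structure behind such a module is sketched below. This is a minimal illustration, not the EcoSpold format or any tool's actual schema. The `Exchange` and `UnitProcess` classes are hypothetical names, and the field layout simply mirrors the columns of Figure 5-6.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Exchange:
    """One input or output row of an LCI data module (hypothetical layout)."""
    flow: str
    category: str
    flow_type: str      # "ProductFlow" or "ElementaryFlow"
    unit: str
    amount: float       # normalized to the module's functional unit
    is_input: bool
    comment: str = ""


@dataclass
class UnitProcess:
    name: str
    exchanges: List[Exchange] = field(default_factory=list)

    def reference_flow(self) -> Exchange:
        """Return the single output product flow (the functional unit basis)."""
        products = [e for e in self.exchanges
                    if not e.is_input and e.flow_type == "ProductFlow"]
        if len(products) != 1:
            raise ValueError("expected exactly one output product flow")
        return products[0]

    def elementary_outputs(self) -> List[Exchange]:
        """Return emissions and other elementary flows leaving the process."""
        return [e for e in self.exchanges
                if not e.is_input and e.flow_type == "ElementaryFlow"]


# Abridged module values from Figure 5-6.
electricity = UnitProcess(
    name="Electricity, bituminous coal, at power plant",
    exchanges=[
        Exchange("Bituminous coal, at mine", "root/flows", "ProductFlow", "kg", 4.42e-01, True),
        Exchange("Transport, train, diesel powered", "root/flows", "ProductFlow", "t*km", 4.61e-01, True),
        Exchange("Electricity, bituminous coal, at power plant", "root/flows", "ProductFlow", "kWh", 1.00, False),
        Exchange("Carbon dioxide, fossil", "air/unspecified", "ElementaryFlow", "kg", 9.94e-01, False),
    ],
)

ref = electricity.reference_flow()
print(f"Functional unit: {ref.amount} {ref.unit} of {ref.flow}")
```

Keeping the input/output direction and the flow type as separate fields makes it straightforward to pick out the functional unit or to list only the emissions.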
Now that we have seen our first example of a secondary source LCI data module, Figure 5-7
presents a graphical representation of the abridged unit process similar to the generic
diagram of Figure 5-2. The direct inputs, which are product flows from other man made
processes, are on the left side as inputs from the technosphere. The abridged unit process
has no direct inputs from nature. The direct CO2 emissions are at the top. The output
product, and functional unit basis of the process, of electricity is shown on the right. All
quantitative values are representative of the functional unit basis of the unit process.
Figure 5-7: Unit Process Diagram for abridged electricity generation unit process
5 The unabridged version of the module has several other averaged transport inputs in ton-km, such as truck, barge, etc. Overall, the
module gives a "weighted average" transport input to get the coal from the mine to the power plant. Since we are only using the abridged
(and unedited) version, we will undercount the upstream CO2 emissions from delivering coal since we are skipping the weighted
effects from those other modes.
Returning to our example LCA problem, we now have our first needed data point, that the
direct CO2 emissions are 0.994 kg / kWh generated. Given that we have only three unit
processes in our simple product system, we can work backwards from this initial point to get
estimated CO2 emissions values from mining and train transport. Again using the NREL
LCI database, Figure 5-8 shows abridged data for the data module bituminous coal, at mine. The
output and functional unit is 1 kg of bituminous coal as it leaves the mine. Two important
inputs are diesel fuel needed to run equipment, and coal. It may seem odd to see coal listed
as an input into a coal mining process, but note it is listed as a resource and as an elementary
flow. As discussed in Chapter 4, elementary flows are flows that have not been transformed
by humans. Coal trapped in the earth for millions of years certainly qualifies as an elementary
flow by that definition! Further, it reminds us that there is an elementary flow input within
our system boundary, not just many product flows. This particular resource is also specified
as being of a certain quality, i.e., with energy content of about 25 MJ per kg. Finally, we can
see from a mass balance perspective that there is some amount of loss in the process, i.e.,
that every 1.24 kg of coal in the ground leads to only 1 kg of coal leaving the mine.
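The mass balance observation above can be worked through numerically. This is a small sketch using only the values quoted in the text (1.24 kg of coal in the ground per 1 kg at the mine, and 0.442 kg of coal per kWh); the variable names are illustrative, and the "loss" is simply the difference between the two quantities, without assuming where that mass goes.

```python
# Values quoted in the text for the abridged coal mining module.
coal_in_ground_kg = 1.24     # elementary flow input per kg of product
coal_at_mine_kg = 1.00       # product output (functional unit)

loss_kg = coal_in_ground_kg - coal_at_mine_kg
loss_fraction = loss_kg / coal_in_ground_kg

# Coal resource drawn from the ground per kWh of electricity,
# using the 0.442 kg of mined coal needed per kWh.
coal_per_kwh_kg = 0.442
resource_per_kwh_kg = coal_per_kwh_kg * coal_in_ground_kg

print(f"Loss: {loss_kg:.2f} kg per kg of coal leaving the mine "
      f"({loss_fraction:.1%} of the coal in the ground)")
print(f"Coal resource per kWh: {resource_per_kwh_kg:.3f} kg")
```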
Flow                                     Category         Type            Unit  Amount   Comment
Inputs
Coal, in ground (about 25 MJ per kg)     resource/ground  ElementaryFlow  kg    1.24
Diesel, combusted in industrial boiler   root/flows       ProductFlow           8.8e-03
Outputs
Bituminous coal, at mine                 root/flows       ProductFlow     kg    1.00
Figure 5-8: Abridged LCI data module from US NREL LCI Database for bituminous coal mining.
Output for functional unit italicized. (Source: US LCI Database 2012)
Figure 5-9 shows the abridged NREL LCI data module for rail transport (transport, train, diesel
powered). The output / functional unit of the process is 1 ton-km of rail transportation service
provided. Providing that service requires 0.00648 liters of diesel fuel and emits 0.0189 kg of
CO2, both per ton-km.
Flow                    Category         Type            Unit  Amount    Comment
Inputs
Diesel, at refinery     root/flows       ProductFlow     l     6.48e-03
Outputs
Carbon dioxide, fossil  air/unspecified  ElementaryFlow  kg    1.89e-02
To then find the total CO2 emissions across these three processes, we can work backwards
from the initial process. We already know there are 0.994 kg/kWh of CO2 emissions at the
power plant. But we also need to mine the coal and deliver it by train for each final kWh of
electricity. The emissions for those activities are easy to associate, since Figure 5-6 provides
us with the needed connecting units to estimate the emissions per kWh: 0.442
kg of coal needs to be mined and 0.461 ton-km of rail transport needs to be used per kWh
of electricity generated. We can then just use those unit bases to estimate the CO2 emissions
from those previous processes. Figure 5-8 does not list direct CO2 emissions from coal
mining, although it does list an input of diesel used in a boiler6. If we want to assume that we
are only considering direct emissions from each process, we can assume the CO2 emissions
from coal mining to be zero7, or we could expand our boundary and acquire the LCI data
module for the diesel, combusted in industrial boiler process. Our discussion below follows the
assumption that direct emissions are zero.
Figure 5-9 notes that there are 0.0189 kg of CO2 emissions per ton-km of rail transported.
Equation 5-1 summarizes how to calculate CO2 emissions per kWh for our simplistic
product system. Other than managing the compound units, it is a simple solution: about 1
kg CO2 per kWh. If we were interpreting this result, we would note that the combustion of
coal at the power plant is about 99% of the total emissions.
0.994 kg CO2 / kWh + (0.442 kg coal / kWh)*(0 kg CO2 / kg coal) + (0.461 ton-km / kWh)*(0.0189 kg CO2 / ton-km) =
0.994 kg CO2 / kWh + 0.0087 kg CO2 / kWh = 1.003 kg CO2 / kWh
(5-1)
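The calculation in Equation 5-1 can be reproduced with a short script. This is a sketch of the process flow diagram approach for this three-process system only. The dictionary names are illustrative, and the zero emissions factor for coal mining is the assumption stated in the text, not a database value.

```python
# Direct CO2 at the power plant, kg CO2 per kWh (Figure 5-6).
plant_direct_co2 = 0.994

# Upstream requirements per kWh of electricity (Figure 5-6).
requirement_per_kwh = {
    "coal mining": 0.442,       # kg coal per kWh
    "rail transport": 0.461,    # ton-km per kWh
}

# Direct CO2 per unit of each upstream product.
co2_per_unit = {
    "coal mining": 0.0,         # kg CO2 per kg coal (assumed zero, as in the text)
    "rail transport": 0.0189,   # kg CO2 per ton-km (Figure 5-9)
}

# Scale each upstream emissions factor by the amount required per kWh.
contributions = {"power plant": plant_direct_co2}
for process, amount in requirement_per_kwh.items():
    contributions[process] = amount * co2_per_unit[process]

total = sum(contributions.values())
for process, value in contributions.items():
    print(f"{process}: {value:.4f} kg CO2/kWh ({value / total:.1%})")
print(f"total: {total:.3f} kg CO2/kWh")
```

Multiplying each requirement by its emissions factor is what cancels the compound units (kg coal or ton-km), leaving every term in kg CO2 per kWh.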
6 This particular input of "diesel, combusted in industrial boiler" may not be what you would expect to find in an LCI data module, since it is a
description of how an input of diesel is used. Such flows are fairly common though.
7 Also, the unabridged LCI data modules list emissions of methane to air, which could have been converted to equivalent CO2 emissions.
Doing so would only change the result above by about 10%.
The estimate of CO2 emissions for coal-fired electricity of 1 kg / kWh was obtained relatively
easily, requiring only three steps and queries to a single database (US NREL LCI). As always,
one of our first questions should be "is it right?" We can attempt to validate this value by
looking at external references. Whitaker et al. (2012) reviewed 100 LCA studies of coal-fired
electricity generation and found the median value to be 1 kg of CO2 per kWh, thus we
should have reasonable faith that the simple model we built leads to a useful result. Of
course we can add other processes to our system boundary (such as other potential
transportation modes) but we would not appreciably change our simple result of 1 kg/kWh.
Note that, anecdotally, experts often cite the emissions from coal-fired power plants as
2 pounds per kWh, which is a one-significant-digit equivalent to our 1 kg/kWh result.
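The equivalence between the "2 pounds per kWh" rule of thumb and our result can be checked with a quick unit conversion. This is a small sketch; `kg_to_lb` and `round_sig` are illustrative helper names.

```python
KG_PER_LB = 0.45359237  # definition of the avoirdupois pound in kilograms


def kg_to_lb(kg: float) -> float:
    """Convert a mass in kilograms to pounds."""
    return kg / KG_PER_LB


def round_sig(x: float, digits: int = 1) -> float:
    """Round x to the given number of significant digits."""
    return float(f"{x:.{digits}g}")


result_kg_per_kwh = 1.003                      # from Equation 5-1
result_lb_per_kwh = kg_to_lb(result_kg_per_kwh)

print(f"{result_kg_per_kwh} kg/kWh = {result_lb_per_kwh:.2f} lb/kWh")
print(f"One significant digit: {round_sig(result_lb_per_kwh)} lb/kWh")
```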
Process-based life cycle models are constructed in this way. For each unit process within the
system boundary, data (primary or secondary) is gathered and flows between unit processes
are modeled. Since you must find data for each process, such methods are often referred to
as "bottom up" studies because you are building them up from nothing, as you might
construct a building on empty land.
Beyond validating LCI results, you should also try to validate the values found in any unit
process you decide to use, even if sourced from a well-known database. That is because
errors can and do exist in these databases. It is easy to accidentally put a decimal in the
wrong place when creating a digital database. As an example, the US NREL LCI database
had an error in the CO2 emissions of its air transportation process, of 53 kg per 1000 ton-km
(0.053 kg per ton-km) for several years before it was fixed. This error was brought to their
attention because observant users noted that this value was less than the per-ton-km
emissions for truck transportation, which went against common sense. Major releases of
popular databases are also imperfect. It is common to have errors found and fixed, but this
may happen months after licenses have been purchased, or worse, after studies have been
completed. These are additional reasons why you need to validate your data sources, even
when they are of high quality.
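A simple plausibility check of the kind that caught the air transportation error can be sketched as follows. The rail and (erroneous) air values are those quoted in this chapter. The truck value is a hypothetical placeholder, not a database value, and the expected ordering (rail below truck below air, per ton-km) is an assumption used only for illustration.

```python
from typing import Dict, List, Tuple


def ordering_violations(factors: Dict[str, float],
                        expected_low_to_high: List[str]) -> List[Tuple[str, str]]:
    """Return adjacent pairs whose values contradict the expected ordering."""
    violations = []
    for lower, higher in zip(expected_low_to_high, expected_low_to_high[1:]):
        if factors[lower] >= factors[higher]:
            violations.append((lower, higher))
    return violations


# kg CO2 per ton-km
factors = {
    "rail": 0.0189,    # from the rail transport module discussed above
    "truck": 0.1,      # hypothetical placeholder, for illustration only
    "air": 0.053,      # the erroneous value described in the text
}

problems = ordering_violations(factors, ["rail", "truck", "air"])
for lower, higher in problems:
    print(f"Check data: {higher} ({factors[higher]}) is not above "
          f"{lower} ({factors[lower]})")
```

A check like this does not say which value is wrong, only that the pair deserves a closer look against external references.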
process is representative. Figure 5-10 summarizes some of the popular abbreviations used
for country basis within ecoinvent.
Country or Region  Abbreviation  Country or Region                Abbreviation
Norway             NO            Japan                            JP
Australia          AU            Canada                           CA
India              IN            Global                           GLO
China              CN            Europe                           RER
Germany            DE            Africa                           RAF
United States      US            Asia                             RAS
Netherlands        NL            Russian Federation               RU
Hong Kong          HK            Latin America and the Caribbean  RLA
France             FR            North America                    RNA
United Kingdom     GB            Middle East                      RME
Figure 5-10: Summary of abbreviations for countries and regions in ecoinvent
Ecoinvent has substantially more available metadata for its data modules, including primary
sources, representative years, and names of individuals who audited the datasets. While
ecoinvent data are not free, the metadata is freely accessible via the database website. Thus,
you could do a substantial amount of background work verifying that ecoinvent has the data
you want before deciding to purchase a license.
A particular feature of ecoinvent data is its availability at either the unit process or system
process level. Viewing and using ecoinvent system processes is like using already rolled-up
information (and computations would be faster), while using unit processes will be more
computationally intensive. This will be discussed more in Chapter 9.
Figure 5-11 shows metadata from the Activity metadata portion of the US NREL LCI
database for the Electricity, bituminous coal, at power plant process used above. This metadata
notes that the process falls into the Utilities subcategory (used for browsing on the website)
and that it has not yet been fully validated. It applies to the US, and thus it is most
appropriate for use in studies looking to estimate impacts of coal-fired electricity generation
done within the United States. Note that this does not mean that you can only use it for that
geographical region. A process like coal-fired generation is quite similar around the world,
although factors such as pollution controls may differ greatly by region. However, since
capture of carbon is basically non-existent, if we wanted to use this process to estimate CO2
emissions from coal-fired generation in other regions it might still be quite useful.
The metadata field for "infrastructure process" notes whether the process includes estimated
infrastructure effects. For example, one could imagine two parallel unit processes for
electricity generation, where one includes estimated flows from needing to build the power
plant and one does not (such as the one referenced above). In general, infrastructure
processes are fairly rare, and most LCA study scopes exclude consideration of infrastructure
for simplicity.
Name
Category
Description
Location                 US
Geography Comment        United States
Infrastructure Process   False
Quantitative Reference   Electricity, bituminous coal, at power plant
Figure 5-11: Activity metadata for Electricity, bituminous coal, at power plant process
Figure 5-12 shows the Modeling metadata for the coal-fired generation unit process. There is
no metadata provided for the first nine fields of this category, but there are ten
references provided to show the source data used to make the unit process. While a specific
"data year" is not dictated by the metadata, the underlying data sources indicate that the
source data came from the period 1998-2003. Thus, the unit process data would be most
useful for analyses done with other data from that time period. If we wanted to use this
process data for a more recent year, we would either have to look for an LCI data module
that was newer, or verify that the technologies have not changed much since the 1998-2003
period.
LCI Method
Modelling constants
Data completeness
Data selection
Data treatment
Sampling procedure
Data collection period
Reviewer
Other evaluation
Sources
  U.S. EPA 1998 Emis. Factor AP-42 Section 1.1, Bituminus and Subbituminus Utility Combustion
  U.S. Energy Information Administration 2000 Electric Power Annual 2000
  Energy Information Administration 2000 Cost and Quality of Fuels for Electric Utility Plants 2000
  Energy Information Administration 2000 Electric Power Annual 2000
  U.S. EPA 1998 Study of Haz Air Pol Emis from Elec Utility Steam Gen Units V1 EPA-453/R-98-004a
  U.S. EPA 1999 EPA 530-R-99-010
  unspecified 2002 Code of Federal Regulations. Title 40, Part 423
  Energy Information Administration 9999 Annual Steam-Electric Plant Operation and Design Data
  Franklin Associates 2003 Data Details for Bituminous Utility Combustion
Figure 5-12: Modeling metadata for Electricity, bituminous coal, at power plant process
Finally, Figure 5-13 shows the Administrative metadata for the Electricity, bituminous coal, at
power plant process. There are no explicitly-defined intended applications (or suggested
restrictions on such applications), suggesting that it is broadly useful in studies. The data are
not copyrighted, are publicly available, and were generated by Franklin Associates, a
subsidiary of ERG, one of the most respected life cycle consulting businesses in the US. The
"Data Generator" is a significant piece of information. You may opt to use or not use a data
source based on who created it. A reputable firm has a high level of credibility. A listed
individual with no obvious affiliation or reputation might be less credible. Finally, the
metadata notes that it was created and last updated in October 2011, meaning that perhaps it
was last checked for errors on this date, not that the data is confirmed to still be valid for the
technology as of this date.
Intended Applications
Copyright               false
Restrictions
Data Owner
Data Generator          Franklin Associates
Data Documentor         Franklin Associates
Project
Version
Created                 2011-10-24
Last Update             2011-10-24
Figure 5-13: Administrative metadata for Electricity, bituminous coal, at power plant process
Our metadata examples have focused on the publicly available US NREL LCI Database, but
other databases like ELCD and ecoinvent have similar metadata formats. These other
databases typically have more substantive detail, in terms of additional fields and more
consistent entries in these fields. Since these other data sources are not public, we have not
used examples here.
You should browse through the available metadata for any of the databases that you have
access to, so that you can better appreciate the kinds of information that may exist within
metadata records. Remember that the reason for better appreciating the
metadata is to help you decide which secondary data sources to use, and how
compatible they are with your intended goal and scope.
A final note about referencing is that the LCA databases are generally not primary sources;
they are secondary sources. Ideally, references would credit the original author, not the database
owner who is just providing access. If the LCI data module is taken wholesale from another
source (i.e., if a single source were listed in the metadata), it may make sense to also
reference the primary source, or to add the primary source to the database reference. In this
case the reference might look like one of the following:
RPPG Of The American Chemistry Council 2011. Life Cycle Inventory Of Plastic Fabrication Processes: Injection Molding And Thermoforming.
http://plastics.americanchemistry.com/Education-Resources/Publications/LCI-of-Plastic-Fabrication-Processes-Injection-Molding-and-Thermoforming.pdf. via U.S. Life
Cycle Inventory Database. Injection molding, rigid polypropylene part, at plant unit process
(2012). National Renewable Energy Laboratory, 2012. Accessed November 19, 2012:
https://www.lcacommons.gov/nrel/search
U.S. Life Cycle Inventory Database. Injection molding, rigid polypropylene part, at plant unit
process (2012). National Renewable Energy Laboratory, 2012. Accessed November 19,
2012: https://www.lcacommons.gov/nrel/search (Primary source: RPPG Of The
American Chemistry Council 2011. Life Cycle Inventory Of Plastic Fabrication Processes: Injection Molding And Thermoforming.
http://plastics.americanchemistry.com/Education-Resources/Publications/LCI-of-Plastic-Fabrication-Processes-Injection-Molding-and-Thermoforming.pdf)
As noted in Chapter 2, ideally you would identify multiple data sources (i.e., multiple LCI
data modules) for a given task. This is especially useful when using secondary data because
you are not collecting data from your own controlled processes. Since the data is secondary,
it is likely that there are slight differences in assumptions or boundaries from what you would
have used if collecting primary data. By using multiple sources, and finding averages and/or
standard deviations, you could build a more robust quantitative model of the LCI results. We
will discuss such uncertainty analysis for inventories in Chapter 10.
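Summarizing several secondary sources for the same flow can be sketched with Python's standard `statistics` module. Only the first value below comes from this chapter (the US NREL LCI result). The other three are hypothetical placeholders standing in for values you might find in other data modules, included only to show the mechanics.

```python
import statistics

# kg CO2 per kWh for coal-fired electricity from different sources.
co2_per_kwh = [
    0.994,   # US NREL LCI data module (Figure 5-6)
    0.95,    # hypothetical second source (placeholder)
    1.05,    # hypothetical third source (placeholder)
    1.02,    # hypothetical fourth source (placeholder)
]

mean = statistics.mean(co2_per_kwh)
stdev = statistics.stdev(co2_per_kwh)   # sample standard deviation

print(f"mean = {mean:.3f} kg CO2/kWh, std dev = {stdev:.3f}, "
      f"range = {min(co2_per_kwh)}-{max(co2_per_kwh)}")
```

With only a handful of sources the standard deviation is a rough indicator at best, so reporting the range alongside it is a reasonable habit.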
Temporal Issues
In creating temporal data quality requirements, you will set a target year (or years) for data
used in your study. For example, you might have a DQR of "2005 data" or "data from
2005-2007" or "data within 5 years of today". After setting target year(s) you then must do your
best to find and use data that most closely matches the target. It is likely that you will not be
able to match all data with the target year(s). When setting and evaluating temporal DQRs,
the following issues need to be understood.
You may need to do some additional work to guarantee you know the basis year of the data
you find, but this is time well spent to ensure compatibility of the models you will build. You
will need to distinguish between the year of data collection and year of publication. In our
CBECS example in Chapter 2, the data were collected in the year 2003 but the study was not
published by DOE until December 2006 (or, almost 2007). It is easy to accidentally consider
the data as being for 2006 because the publication year is shown throughout the reports. But
the data were representative of the year 2003. If your temporal DQR was set at "2005", you
might still be able to justify using the 2003 CBECS data, but would need to assess whether
the electricity intensity of buildings likely changed significantly between 2003 and 2005. The
same types of issues arise when using sources such as US EPA's AP-42 data, which are
compilations of (generally old) previously estimated emissions factors. Other aspects of your
DQRs may further help decide the appropriateness of data newer or older than your target
year.
The same is true of dates given in the metadata of LCI data modules. You don't care about
when you accessed the database, or when it was published in the database. You care about
the primary source's years of analysis. Figure 5-12 showed metadata on the coal-fired
electricity generation process where the underlying data was from 1998-2003, and which was
put in the US LCI database in 2011. An appropriate "timestamp" for this process would be
1998-2003.
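A temporal DQR check of this kind can be sketched in a few lines of Python. This is only an illustration, not a method prescribed by the LCA Standard: the function name meets_temporal_dqr and the two-year tolerance are assumptions made for the example, while the year ranges are the basis years discussed above (data collection years, not publication or database-entry years).

```python
def meets_temporal_dqr(data_years, target_years, tolerance=0):
    """Return True if the data's basis-year range lies within `tolerance`
    years of the target range. Both ranges are (first_year, last_year)."""
    data_start, data_end = data_years
    target_start, target_end = target_years
    return (data_start >= target_start - tolerance
            and data_end <= target_end + tolerance)


# Basis years of the underlying data, not publication or database dates.
cbecs = (2003, 2003)             # collected in 2003, published December 2006
coal_electricity = (1998, 2003)  # underlying data years; posted to US LCI in 2011

target = (2005, 2005)            # a DQR of "2005 data"
tolerance = 2                    # hypothetical allowance, chosen for illustration

cbecs_ok = meets_temporal_dqr(cbecs, target, tolerance)
coal_ok = meets_temporal_dqr(coal_electricity, target, tolerance)
print(cbecs_ok, coal_ok)
```

A check like this only flags candidates. Whether data outside the window can still be justified depends on the other DQRs and on how much the underlying technology has likely changed.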
While on the topic of temporal issues, we revisit the point about age of data in databases.
The US LCI database project started in the mid-2000s. Looking at the search function in
that database, you can find a distribution of the "basis year" of all of the posted data
modules. This is a date that is not visible within the metadata, but is available for
downloaded data modules and summarized in the web server. Figure 5-14 shows a graph of
the distribution of the years. In short, there is a substantial amount of relatively old data, and
a substantial amount of data where this basis year is not recorded (value given as '9999').
Half of the 200 data modules updated in 2010 are from an update to the freight
transportation datasets. These could be key considerations when considering the suitability
of data in a particular database.
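If you have downloaded the basis years for a set of data modules, a short script can summarize them in the same spirit as Figure 5-14. The sketch below is an assumption-laden illustration: the list of years is made up for the example (it is not taken from the US LCI database), and the cutoff year is arbitrary. The only detail taken from the text is that an unrecorded basis year appears as 9999.

```python
from collections import Counter

MISSING_YEAR = 9999  # value the database uses when no basis year is recorded


def summarize_basis_years(basis_years, oldest_acceptable):
    """Count modules per basis year and tally missing and old entries."""
    counts = Counter(basis_years)
    missing = counts.pop(MISSING_YEAR, 0)
    old = sum(n for year, n in counts.items() if year < oldest_acceptable)
    return {
        "by_year": dict(sorted(counts.items())),
        "missing": missing,
        "old": old,
    }


# Hypothetical basis years for a handful of modules (not real database values).
sample_years = [1998, 2003, 2003, 9999, 2010, 2010, 9999, 1995]
summary = summarize_basis_years(sample_years, oldest_acceptable=2005)
print(summary)
```

Separating the 9999 entries from the real years matters: left in place, they would distort any average or "newest data" calculation.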
Geospatial Issues
You must try to ensure that you are using data with the right geographical and spatial scope
to fit your needs. If you are doing a study where you want to consider the emissions
associated with producing an amount of electricity, then you will find many potential data
sources to use. The EIA has data that can give you the average emissions factors for
electricity generation across the US. E-GRID (a DOE-EPA partnership) can give you
emissions factors at fairly local levels, reflecting the types of power generation used within a
given region. The question is the context of your study. Are you doing a study that inevitably
deals with national average electricity? Then the EIA data is likely suitable. Or are you doing
a study that needs to know the impact of electricity from a particular factory's production?
In that case you likely want a fairly local data source, e.g., from E-GRID. An alternative is to
leverage the idea of ranges, presented in Chapter 2, to represent the whole realm of possible
values for electricity generation, including various local or regional averages all the way up to
the national average.
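The range approach can be sketched as follows. All numbers here are hypothetical placeholders chosen for illustration; they are not emissions factors from EIA or E-GRID, and the region names are invented.

```python
def emission_factor_range(factors):
    """Return (low, high) across all candidate emission factors."""
    values = list(factors.values())
    return min(values), max(values)


# Hypothetical CO2 emission factors in kg CO2 per kWh (placeholders only).
candidate_factors = {
    "national average": 0.60,
    "region A": 0.35,
    "region B": 0.85,
}

electricity_kwh = 1000.0  # hypothetical electricity use in the study

low, high = emission_factor_range(candidate_factors)
low_kg = low * electricity_kwh
high_kg = high * electricity_kwh
print(f"{low_kg:.0f} to {high_kg:.0f} kg CO2")
```

Reporting the low and high results together shows the reader how sensitive the answer is to the geographic choice, rather than hiding that choice inside a single number.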
Chapter Summary
Typically, the most time consuming aspect of an LCA (or LCI) study relates to the data
collection and management phase. While the LCA Standard encourages practitioners to
collect primary data for the product systems being studied, typically secondary data is used
from prior published studies and databases. Using secondary data requires being
knowledgeable and cognizant of issues relating to the sources of data presented and also
requires accurate referencing. Data quality requirements help to manage expectations of the
study team as well as external audiences pertaining to the goals of your data management
efforts. Utilization of effective LCI data management methods leads to excellent and well-received studies.
c. Truck transportation
Objective 2. Map information from LCI data modules into a unit process
framework AND
Objective 7. Generate an inventory result from LCI data sources
2. Redo the Figure 5-5 example shown in Equation 5-1, but include the diesel, combusted in
industrial boiler process (referenced as an input in the bituminous coal mining process) within
the system boundary. What is your revised estimate of CO2 emissions per kWh? How
different is your updated estimate?
3. Redo the Figure 5-5 example but include within the system boundary refining of the
diesel used in the coal mining and rail transportation processes. Assume you have LCI
data that there are 2.5 E-04 kg fossil CO2 emissions per liter of diesel fuel refined. How
is your revised estimate of fossil CO2 emissions per kWh compared to Equation 5-1?
Objective 3. Explain the difference between primary and secondary data, and when
each might be appropriate in a study
4. Explain the difference between primary and secondary data. Provide an example of
when each would be appropriate for a study.
Objective 4. Document the use of primary and secondary data in a study
5. The data identified in part 1c above would be secondary data if you were to use it in a
study. If you instead wanted primary data for a study on trucking, discuss what methods
you might use in order to get the data.
Objective 5. Create and assess data quality requirements for a study
6. If you had data quality requirements stating that you wanted data that was national (US)
in scope, and from within 5 years of today, how many of the LCI data modules from
Question 1 would be available? Which others might still be relevant? Justify your
answer.
Objective 6. Extract data and metadata from LCI data modules and use them in
support of a product system analysis
7. Using an LCI database available to you, search for one LCI data module in each of the
following broad categories - energy, agriculture, and transportation. For each of the
three, do the following:
a. List the name of the process.
Clicking on the + icon next to the categories generally reveals one or more additional subcategories. For example, under the Utilities category there are fossil-fired and other
generation types. Clicking on any of the dataset type, category/subcategory or year
checkboxes will filter the overall data available. The "order by" box will sort the resulting
modules. Filtering by (checking) Unit processes and the Fossil fuel electric power generation category
under Utilities, and ordering by description will display a subset of LCI data modules, as
shown in Figure 5-16. A resulting process module can be selected (see below for how to do
this and download the data).
Figure 5-16: Abridged View of LCA Digital Commons Browsing Example Results
8 The examples of the NREL US LCI Database in this section are as of July 2014, and may change in the future.
Figure 5-17: Keyword search entry on homepage of NREL engine of LCA Digital Commons Website
Figure 5-18 indicates that the search engine returns more than 100 LCI data modules
(records) that may be relevant to "electricity". Some were returned because electricity is in
the name of the process and others because they are in the Electric power distribution data
category. When searching, you can order results by relevance, description, or year. Once a set
of search results is obtained, results can be narrowed by filtering via the options on the left
side of the screen. For example, you could choose a subset of years to be included in the
search results, which can help ensure you use fairly recent instead of old data (as discussed
along with Figure 5-14). You can also filter based on the LCI data categories available, in this
case by clicking on the + icon next to the high-level category for Utilities, which brings up all
of the subcategories under utilities. Figure 5-19 shows the result of a keyword search for
'electricity', ordered by relevance, and filtered by the Utilities subcategory of Fossil fuel electric
power generation and by data for year 2003. The fifth search result listed is the same one
mentioned in the chapter that forms the basis of the process flow diagram example.
Figure 5-19: Abridged Results of electricity keyword search, ordered and filtered
Figure 5-20: Details for Electricity, bituminous coal process on LCA digital commons
The default result is a view of the Activity tab, which was shown in Figure 5-11. The
information available under the Modeling and Administrative tabs was presented in Figure 5-12
and Figure 5-13. Finally, an abridged view of the information available on the Exchanges tab
was also shown in Figure 5-6. Not previously mentioned is that the module can be
downloaded by first clicking on the shopping cart icon in the top right (adjacent to the
"Next item" tag). This adds it to your download cart. Once you have identified all of the data
you are interested in, you can view your current cart (menu option shown in Figure 5-21)
and request them all to be downloaded (Figure 5-22).
Figure 5-21: Selection of Current Cart Download Option on LCA Digital Commons
After clicking download, you will be sent a link via the e-mail in your account registration.
As noted, the format will be an Ecospold XML file. For novices, viewing XML files can be
cumbersome, especially if just trying to look at flow information. While less convenient, the
download menu (All LCI datasets submenu) will allow you to receive a link to a ZIP file
archive containing all of the NREL modules in Microsoft Excel spreadsheet format (or you
can receive all of the modules as Ecospold XML files). You can also download a list of all of
the flows and processes used across the entire set of about 600 modules.
A spreadsheet of all flows and unit processes in the US LCI database (and their
categories) is on the www.lcatextbook.com website in the Chapter 5 folder.
When uncompressed the Electricity, bituminous coal, at power plant module file has four
worksheets, providing the same information as seen in the tabs of the Digital
Commons/NREL website above. The benefit of the spreadsheet file, though, is the ability
to copy and paste the values into a model you may be building. We will discuss building
spreadsheet models with such data in Section 4 of this advanced material.
This tutorial also does not describe any of the initial steps needed to purchase a license for
or install SimaPro on your Windows computer or server. It will only briefly mention the
login and database selection steps, which are otherwise well covered in the SimaPro guides
provided with the software.
Note that SimaPro refers to the overall modeling environment of data available as a
"database" and individual LCI data sources (e.g., ecoinvent) as "libraries". After starting
SimaPro, selecting the database (typically called "Professional"), and opening or creating a
new project of your choice, you will be presented with the screen in Figure 5-23. On the left
side of the screen are various options used in creating an LCA in the tool. By default the
"processes" view is selected, showing the names and hierarchy of all processes in the
currently selected libraries of the database. This list shows thousands of processes (and many
of those will be from the ecoinvent database given its large size).
You can narrow the processes displayed by clicking on "Libraries" on the left hand side
menu, which will display Figure 5-24. Here you can select a subset of the available libraries
for use in browsing (or searching) for process data. To follow along with this tutorial,
choose "Deselect all" and then select just the "US LCI" database library in order to access
only the US NREL LCI data.
If you then click the "Processes" option on the left hand side, you return to the original
screen but now SimaPro filters and shows only processes from the selected libraries, as in
Figure 5-25. Many of the previously displayed processes are no longer displayed.
Figure 5-25: View of Processes and Data Hierarchy for US-LCI Library in SimaPro
Now that you have prepared SimaPro to look for the processes in a specific database library,
you can browse or search for data.
Browsing for LCI Data Modules in SimaPro
Looking more closely at Figure 5-25, the middle pane of the window shows the categorized
hierarchy of data modules (similar to the expandable hierarchy list in the Digital Commons
tool). However, these are not the same categories used on the NREL LCA Digital
Commons website. Instead, they are the standard categories used in SimaPro for processes
in any library. Clicking on the + icon next to any of the categories will expand it and show its
subcategories. To find the Electricity, bituminous coal process, expand the Energy category then
expand Electricity by fuel, then expand coal, resulting in a screen like Figure 5-26. Several of
the other processes burning coal to make electricity and mentioned in the chapter would also
be visible.
The bottom pane shows some of the metadata detail for the selected process. By browsing
throughout the categories (and collapsing or expanding as needed) and reading the metadata
you can find a suitable process for your model. The tutorial will demonstrate how to view or
download such data after briefly describing how to search for the same process.
Figure 5-28 shows the result of a narrowed search on the word "electricity" in the name of
processes only in "Current project and libraries" and sorted by the results column "Name".
Since we have already selected only the US LCI database in libraries, the results will not
include those from ecoinvent, etc. One of the results is the same Electricity, bituminous coal, at
power plant process previously discussed.
By clicking "Go to" in the upper right corner of the search results box, SimaPro "goes to"
the same place in the drill-down hierarchy as shown in Figure 5-26.
Viewing process data in SimaPro
To view process data, choose a process by clicking on it (e.g., as in Figure 5-26) and then
click the View button on the right hand side. This returns the process data and metadata
overview shown in Figure 5-29. Similar to the Digital Commons website, the default screen
shows high-level summary information for the process. Full information is found in the
documentation and system description tabs.
Clicking on the input-output tab displays the flow data in Figure 5-30, which for this process
is now quite familiar. If you need to download this data, you can do so by choosing
"Export" in the File menu, and choosing to export as a Microsoft Excel file.
Figure 5-30: View of Process Flow Data (Inputs and Outputs) in SimaPro
If you double click on the "Processes" folder it will display the same sub-hierarchy of
processes (not shown here) that we saw in the NREL/Digital Commons website in Section
1. All of the data for unit processes are contained under that folder. If you click on the
"Utilities" subcategory folder, then the "Fossil Fuel Electric Power Generation" folder, you
will see the Electricity, bituminous coal, at power plant seen above, as shown in Figure 5-33.
Several of the other processes burning coal to make electricity and mentioned in the chapter
would also be visible.
Figure 5-33: Expanded View of Electricity Processes in Fossil Fuel Generation Category
In the first search option, you may search in all databases or narrow the scope of your search
to only a single database (e.g., to the US-LCI database). In the second option, you may
search all object types, or narrow the scope of your search to just "Processes", etc. Finally,
you can enter a search term, such as "electricity". If you choose to search for "electricity"
only in your US LCI database (note you may have named it something different), and only in
processes, and click search you will be presented with the results as in Figure 5-35. Note that
these results have been manually scrolled down to show the same Electricity, bituminous coal, at
power plant process previously identified.
Unlike the other tools, there is no quick and easy way to skim metadata to confirm which
process you want to use.
Clicking on the Inputs/Outputs tab displays the flow data in Figure 5-37, which for this
process is now quite familiar.
Figure 5-37: View of Process Flow Data (Inputs and Outputs) in openLCA
If you need to download this data, you can do so by choosing "Export" in the File menu,
but you cannot export it as a Microsoft Excel file.
Depending on which tool you used to find the US LCI process data, it may be easy to export
the input and output flows for the functional unit of each process into Excel. If not, you
may need to either copy/paste, or manually enter, the data. Recall that accessing the US LCI
data directly from the LCA Digital Commons can yield Microsoft Excel spreadsheet files.
2) Organize the data into separate worksheets
A single Microsoft Excel spreadsheet file can contain many underlying worksheets, as shown
in the tabs at the bottom of the spreadsheet window. For each of the downloaded or
exported data modules, copy / paste the input/output flows into a separate Microsoft Excel
worksheet. If you downloaded the US LCI process data directly from the lcacommons.gov
website, the input/output flow information is on the "X-Exchange" worksheet of the
downloaded file (the US LCI data in other sources would be formatted in a similar way). The
Transport, train, diesel powered process has 1 input and 9 outputs (including the product output),
as shown in Figure 5-38.
Figure 5-38: Display of Extracted flows for Transport, train, diesel powered process from US LCI
This simple LCI model shows a minimal effort result, such that using a spreadsheet is
perhaps overkill. Tracking only CO2 emissions means that we only have to add three scaled
values, which could be accomplished by hand or on a calculator. However this spreadsheet
motivates the possibility that a slightly more complex spreadsheet could be created that
tracks all flows, not just emissions of CO2.
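The arithmetic behind the simple model, adding three scaled values, can be sketched outside a spreadsheet as well. The scale factors and emissions values below are hypothetical placeholders used only to show the structure of the calculation; they are not the US LCI values used in Equation 5-1.

```python
# Each entry: (scale factor relative to the functional unit,
#              kg CO2 per unit of that process's own product output).
# All numbers are hypothetical placeholders, not values from the US LCI data.
processes = {
    "Electricity, bituminous coal, at power plant": (1.0, 1.0),
    "Bituminous coal mining": (0.5, 0.02),
    "Transport, train, diesel powered": (0.5, 0.02),
}


def total_co2(process_table):
    """Sum scale_factor * emission over all processes in the system."""
    return sum(scale * co2 for scale, co2 in process_table.values())


co2_per_kwh = total_co2(processes)
print(f"{co2_per_kwh:.3f} kg CO2 per kWh")
```

Adding another process to the system boundary is then a matter of adding one more entry with its own scale factor and emissions value.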
Complex LCI Spreadsheet Example
Beyond the assumptions made in the simple model above, in LCA we often are concerned
with many (or all) potential flows through our product system. Using the same underlying
worksheets from the simple spreadsheet example, we can track flows of all of the outputs
listed in the various process LCI data modules (or across all potential environmental flows).
This not only allows us a more complete representation of flows, but better prepares us for
next steps such as impact assessment.
In this complex example, we use the same three underlying input/output flow worksheets,
but our Model worksheet more comprehensively organizes and calculates all tracked flows
from within a dataset. Instead of creating cell formulas that sum flows for each output (e.g.,
CO2) by clicking on individual cells in other worksheets, we can use some of Excel's other
built-in functions to pull data from all listed flows of the unit processes into the summary
Model worksheet. An example file is provided, but the remaining text in this section
describes in a bit more detail how to use Excel's SUMPRODUCT function for this task.
The SUMPRODUCT function in Microsoft Excel, named as such because it finds the sum
of a series of multiplied values, is typically used as a built-in way of finding a weighted
average. Each component of the function is multiplied together. For example, instead of the
method shown in the Simple LCI spreadsheet above, we could have copied the CO2
emissions values from the three underlying worksheets into the row of cells B8 through D8,
and then used the function =SUMPRODUCT(B4:D4*B8:D8) to generate the same result.
The "Simple and Complex LCI Models" file has a worksheet "Simple Model (with
SUMPRODUCT)" showing this example in cell E8, yielding the same result as above.
However the SUMPRODUCT function can be more generally useful, because of how Excel
manages TRUE and FALSE values and the fact that the "terms" of SUMPRODUCT are
multiplied together. In Excel, the Boolean values TRUE and FALSE behave as 1 and 0 when
they are multiplied. So if the "terms" in the SUMPRODUCT evaluate to 1 or 0, a row
contributes to the result only when all of its expressions are TRUE, and contributes 0
otherwise. This is like achieving the mathematical equivalent of if-then statements on a
range of cells.
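The same idea can be illustrated outside Excel. The following Python sketch mimics SUMPRODUCT with comparison terms; the sumproduct helper is written here for illustration (it is not a built-in function), and the three rows of flow data are hypothetical.

```python
def sumproduct(*columns):
    """Multiply the columns element by element, then sum the products."""
    total = 0.0
    for row in zip(*columns):
        product = 1.0
        for term in row:
            product *= term  # True multiplies as 1, False as 0
        total += product
    return total


# Hypothetical rows of process data.
flows = ["Carbon dioxide", "Methane", "Carbon dioxide"]
compartments = ["air", "air", "water"]
amounts = [2.0, 0.5, 7.0]

# Comparison columns play the role of Excel terms such as (range = value).
is_co2 = [name == "Carbon dioxide" for name in flows]
is_air = [comp == "air" for comp in compartments]

co2_to_air = sumproduct(is_co2, is_air, amounts)
print(co2_to_air)
```

Only the first row passes both tests, so only its amount survives the multiplication. The other rows are multiplied by zero and drop out of the sum.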
The magic of this SUMPRODUCT function for our LCI purposes is that if we have a
master list of all possible flows, compartments, and sub-compartments, we can find whether
flow values exist for any or all of them. On the US LCI Digital Commons website, a text file
can be downloaded with all of the nearly 3,000 unique compartment flows present in the US
LCI database. This master list of flows can be pasted into a Model worksheet and then used
to "look up" whether numerical quantities exist for any of them.
A representative cell value in the complex Model worksheet, which has similar cell formulas
in the 3,000 rows of unique flows, looks like this (where cells A9, B9, and C9 are the flow,
compartment, and subcompartment values we are trying to match in the process data):
=E$4*SUMPRODUCT((Electricity_Bitum_Coal_Short!$A$14:$A$65=A9)*(Electricity_Bitum_Coal_Short!$C$14:$C$65=B9)*(Electricity_Bitum_Coal_Short!$D$14:$D$65=C9)*Electricity_Bitum_Coal_Short!$G$14:$G$65)
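This cell formula multiplies the functional unit scale factor in cell E4 by the
SUMPRODUCT, taken over rows 14 through 65 of the coal-fired electricity process
worksheet, of:
whether the flow name, compartment, and subcompartment in each row of the process
data match the master-list entry in cells A9, B9, and C9 (each test yielding 1 or 0), and
the value in column G of that row (the flow quantity).
The result is the scaled quantity of the matching flow, or zero if the process has no such flow.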
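The structure of the complex Model worksheet, with one lookup per master-list flow and per process, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the short process labels, the flow names, and all quantities are hypothetical, and the real master list has about 3,000 entries rather than three.

```python
def flow_amount(process_rows, key):
    """Sum the amounts in rows whose (flow, compartment, subcompartment)
    equals key; returns 0 when the process has no such flow."""
    return sum(
        amount
        for flow, compartment, subcompartment, amount in process_rows
        if (flow, compartment, subcompartment) == key
    )


def build_model(master_flows, process_tables, scale_factors):
    """For each master-list flow, total the scaled amounts over all processes."""
    return {
        key: sum(
            scale_factors[name] * flow_amount(rows, key)
            for name, rows in process_tables.items()
        )
        for key in master_flows
    }


# Hypothetical master list and process data (placeholders, not US LCI values).
co2_key = ("Carbon dioxide, fossil", "air", "unspecified")
methane_key = ("Methane", "air", "unspecified")
mercury_key = ("Mercury", "water", "unspecified")
master_flows = [co2_key, methane_key, mercury_key]

process_tables = {
    "electricity": [co2_key + (1.0,), methane_key + (0.001,)],
    "rail": [co2_key + (0.02,)],
}
scale_factors = {"electricity": 1.0, "rail": 0.5}

model = build_model(master_flows, process_tables, scale_factors)
for key, value in model.items():
    print(key, value)
```

As in the spreadsheet, flows that appear in the master list but in none of the processes simply come out as zero, so the same model can track every possible flow without special cases.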
The Chapter 5 folder on the textbook website has spreadsheets with all of the flows and
processes in the US LCI database, as downloaded from the LCA Digital Commons website.
The 'Simple and Complex LCI Models' file has a worksheet 'Complex Model' that
shows how to use the SUMPRODUCT function to track all 3,000 flows present in the
US LCI database (from the flow file above). Of course the results are generally zero for each
flow due to data gaps, but this example model expresses how to broadly track all possible
flows. You should be able to follow how this spreadsheet was made and, if needed, add
additional processes to this spreadsheet model.
Homework Questions for this Section
1. Answer Question 2 from the end of Chapter 5 by using the 'Simple and Complex LCI
Models' spreadsheet introduced in this section.