Life 10201604004
DOCTOR OF PHILOSOPHY
of
HOMI BHABHA NATIONAL INSTITUTE
December 2021
Homi Bhabha National Institute
Recommendations of the Viva Voce Committee
As members of the Viva Voce Committee, we certify that we have read the dissertation
based exploration of diverse environmental chemical spaces” and recommend that it may
be accepted as fulfilling the dissertation requirement for the Degree of Doctor of Philoso-
phy.
Date: 22/06/2022
Chair - Prof. Rahul Siddharthan
Date: 22/06/2022
Supervisor/Convener - Prof. Areejit Samal
Date: 22/06/2022
Member 1 - Prof. Sitabhra Sinha
Date: 22/06/2022
Member 2 - Prof. Satyavani Vemparala
Date: 22/06/2022
Member 3 - Prof. Dhiraj Kumar
Date: 22/06/2022
External Examiner - Prof. Nagasuma Chandra
Final approval and acceptance of this dissertation is contingent upon the candidate’s
submission of the final copies of the dissertation to HBNI.
I hereby certify that I have read this dissertation prepared under my direction and recom-
mend that it may be accepted as fulfilling the dissertation requirement.
Date: 22/06/2022
This dissertation has been submitted in partial fulfillment of requirements for an advanced
degree at Homi Bhabha National Institute (HBNI) and is deposited in the Library to be
made available to borrowers under rules of the HBNI.
Brief quotations from this dissertation are allowable without special permission, pro-
vided that accurate acknowledgement of source is made. Requests for permission for
extended quotation from or reproduction of this manuscript in whole or in part may be
granted by the Competent Authority of HBNI when in his or her judgement the proposed
use of the material is in the interests of scholarship. In all other instances, however, per-
mission must be obtained from the author.
Janani R
Declaration
I, hereby declare that the investigation presented in this thesis has been carried out by me.
The work is original and has not been submitted earlier as a whole or in part for a degree
or diploma at this or any other Institution or University.
Janani R
List of Publications arising from the thesis
Journals
Published
Copyright
1. DEDuCT - Database of Endocrine Disrupting Chemicals and their Toxicity Profiles
Authors: A. Samal, J. Ravichandran, B.S. Karthikeyan, M. Karthikeyan and R.P.
Vivek Ananth.
Copyright granted to The Institute of Mathematical Sciences by the Copyright Office,
Government of India, with the Diary Number 16429/2018-CO/L.
Janani R
This thesis is dedicated
TO MY PARENTS
For their unconditional love and support
Acknowledgements
At the outset, I would like to thank my thesis supervisor, Prof. Areejit Samal, for his
continued guidance and support. I have greatly benefited from his guidance and he has
always inspired me to aim high in my career. His constructive criticism and suggestions
have helped me in moulding myself into a strong personality and a better researcher. His
hardworking nature and willingness to assist in all aspects of the research, regardless of
time, has truly inspired me, and is something I hope to carry forward throughout my
career. I am grateful for his efforts and support throughout the pandemic, during which
he ensured that the work was never hindered. Further, I am immensely thankful to him
for his assistance during all of my conference participation and research visits.
I thank all my doctoral committee members - Prof. Rahul Siddharthan, Prof. Sitabhra
Sinha, Prof. Satyavani Vemparala, and Prof. Dhiraj Kumar for their critical comments
during the doctoral committee meetings. I would also like to thank all the faculty members and
researchers - Prof. Gautam Menon, Prof. Satyavani Vemparala, Prof. Sitabhra Sinha,
Prof. Rahul Siddharthan, Prof. Areejit Samal, Prof. S Krishnaswamy, Dr. Vasudharani
Devanathan, Dr. Nivedita Chatterjee, Prof. Vijayalakshmi Mahadevan, Dr. Grace Chon-
gloi, Dr. P Varuni, who have taught various courses during the course-work period of my
PhD.
I would like to thank Mr. B Raveendra Reddy for his assistance in setting up the web
server described in the thesis chapters.
I am very grateful for the intellectual discussions I had with Prof. Oliver Ebenhöh,
Prof. Martin J. Lercher, and Dr. Tiago C. Alves during my academic visit to Germany.
My sincere thanks to Prof. Sanjay Jain, Prof. Amit Singh, Prof. Dhiraj Kumar,
Prof. Vinay K. Nandicoori for their thoughtful comments and suggestions on my research
projects. Further, I thank Dr. Kushi Anand and other lab members of Prof. Amit Singh
for sharing their research experiences and knowledge with me during my visit to IISc.
I thank IMSc for their funding and support during my PhD. A special thanks to Mrs.
R Indra, who coordinated and assisted me during my participation at both national and
international conferences. I would like to thank all administrative staff members and
computer committee members for doing the needful whenever necessary. A huge thanks
to everyone who works in the canteen, housekeeping, civil and electrical departments
for making IMSc a pleasant place to work. A special thanks to Mrs. Mahalakshmi, the
housekeeping staff, for her kindness and care, especially during my illness.
At this time, I would also like to thank all the editors and reviewers of all my publi-
cations, for their critical comments, which were really helpful in improving the quality of
the work.
I am extremely grateful to all of my friends from IMSc who have helped and sup-
ported me in a variety of circumstances - Garima, Semanti, Ajjath, Pooja, Sruthy, Rakesh,
Deepika, Vignesh, Savitha, Sourav, Ankita. Sincere gratitude to Semanti, one of the best
roommates in my hostel life, who has mostly bought me food and medicines while I
was sick. A heartfelt thanks to all the past and current lab members - Roshani, Van-
dana, Sreejith, Gayathri, Subathra, Sudharsan, Meena, Pavithra, Kavyaa, Tamil Maran,
Divya, Murugesan, Jyotsna, Sudharsan V, Nithin, Vishalini, Evanjalee, Ajaya, Ajay, and
Yasharth, for their support. Many thanks to Vandana who is currently a PhD student at
IISc, Bangalore, for hosting me during my academic visits. A very special thanks to
Garima, Gayathri, Subathra, Sudharsan, and Roshani, who have always been there for me
in times of difficulty and happiness. Thank you for the short outings, dinners, and leisure
walks, which will be treasured memories for the rest of my life.
I also thank all my friends from school (Divya, Dhana Laksmi, Mogana), BTech
(Vinothini, Deva Priya, Sreemol, Gayathri), and Masters (Aravind, Jeffy, Divya Selvaraju,
Nisha), for staying in touch and encouraging me throughout my PhD. A big thanks to
Divya Selvaraju for her care and support, who visited from Vienna during my academic
visit to Dresden. Furthermore, I thank Ashreya, a friend from IMSc, for her time and
support while we were in Heidelberg. Even though our time together was limited, it
was always memorable.
Lastly, I would like to express my deep gratitude to my dear parents Rani and
Ravichandran, and my husband Bala Kumar, who trusted and supported all my life
decisions. In particular, I thank my parents for tolerating and understanding my emotional
overload at times, as well as ensuring a happy and positive environment to do my research
peacefully. A big thanks to mom, dad, and Bala for believing in me and supporting me to
pursue my dreams. They are the pillars of strength in my life, without whom I would not
be the person I am today.
Janani R
Contents
List of Figures
List of Tables
Abstract
1 Introduction
1.1 Motivation
2.5 Lack of correlation between chemical structure and target genes of EDCs
2.7 Discussion
3.3 Exploration of potential EDCs across chemical lists that are a part of inventories, regulations and guidelines
3.5 Discussion
6.5 Comparison of ExHuMId with other resources on human milk exposome
6.6 Analysis of human milk contaminants with substances of concern or in use
6.8 Analysis of potential effects of contaminants on maternal and infant health
7.5 Linking fragrance chemicals in children’s products to their target genes
References
List of Figures
2.4 Classification of the 686 EDCs based on their source in the environment
2.7 Size of the largest connected component (LCC) of the chemical similarity network (CSN) of EDCs versus chemical similarity measures
2.8 High chemical similarity network (CSN) of 686 EDCs constructed based on Tanimoto coefficient with ECFP4 fingerprints
2.9 High chemical similarity network (CSN) of 686 EDCs constructed based on Tanimoto coefficient with MACCS keys fingerprints
2.10 Size of the largest connected component (LCC) of the target similarity network (TSN) of EDCs versus Jaccard index
2.11 Network visualization of high target similarity network (TSN) of 383 EDCs
2.12 Scatter plots of target similarity versus chemical structure similarity between pairs of EDCs
3.2 Detailed workflow for the compilation of potential EDCs and creation of the updated knowledgebase DEDuCT 2.0.
3.4 Classification of the 792 potential EDCs in DEDuCT 2.0 based on their source in the environment
3.5 Classification of the 792 EDCs in DEDuCT 2.0 based on their chemical structure
3.6 Sankey plot showing the classification of 36 chemical lists that are part of inventories, guidelines and regulations
4.2 Visualization of the ED-AOP network based on shared KEs among the 48 ED-AOPs
4.3 The directed network for LCC C1 in the ED-AOP network consisting of 44 KEs and 56 KERs
4.4 The directed network for LCC C2 in the ED-AOP network consisting of 48 KEs and 56 KERs
4.5 The directed network for LCC C1 wherein the KEs are colored based on their betweenness centrality values
4.6 The directed network for LCC C2 wherein the KEs are colored based on their betweenness centrality values
4.7 The directed network for LCC C1 wherein the KEs are colored based on their eccentricity values
4.8 The directed network for LCC C2 wherein the KEs are colored based on their eccentricity values
4.9 The directed network for LCC C1 in the ED-AOP network showing the categorization of their KEs into 4 systems-level perturbations
5.2 Venn diagram showing the occurrence of the 475 potential neurotoxicants compiled in NeurotoxKb 1.0
5.5 Sankey plot displays the 55 chemical lists considered for comparative analysis that are a part of chemical inventories, regulations and guidelines
6.4 Box plots displaying the distributions of 6 physicochemical properties
6.5 Sankey plots show the human milk contaminants in ExHuMId and their target genes or proteins involved in prolactin signalling and lactose synthesis
6.6 Sankey plots show the human milk contaminants in ExHuMId and their target genes or proteins involved in oxytocin signalling and xenobiotic transport
6.7 Sankey plots show the human milk contaminants in ExHuMId and their target genes or proteins involved in cytokine signalling
7.4 Sankey plot showing the presence of fragrance chemicals in FCCP across 21 chemical lists which reflect regulations or guidelines
7.6 Bipartite graph displaying the 20 fragrance chemicals in FCCP and their associated odor receptors
7.7 Schematic overview of the creation and analysis of the repository of Fragrance Chemicals in Children’s Products (FCCP).
8.2 Venn diagram shows the presence of 380 environmental chemicals compiled in TExAs across the three exposome resources
8.4 Bipartite network of 110 chemicals detected in the liver and 134 associated diseases
9.1 Summary of the research on compilation, curation and exploration of diverse groups of environmental chemicals
v
List of Tables
4.1 The curated subset of 48 ED-AOPs among the 161 high-confidence AOPs filtered from AOP-Wiki. The table also gives the fraction of ED-KEs, the cumulative WoE score, and the WoE score for human applicability (Human WoE) for each of the 48 ED-AOPs.
4.2 The list of AOs in the 7 connected components of the ED-AOP network and their categorization into 4 systems-level endocrine-mediated perturbations, namely, ‘hepatic’, ‘metabolic’, ‘neurological’ and ‘reproductive’, depending on the perturbed biological processes.
4.3 The table gives information on the starting MIE and the ending AO for each of the 4 new paths identified in the LCC C1 of the ED-AOP network.
8.1 List of 13 potential chemicals of concern in TExAs suggested for prioritization
Abstract
Humans are exposed to environmental chemicals in their everyday life and such exposure
can contribute to the incidence of several chronic diseases. Characterization, monitoring
and regulation of the ever-increasing space of environmental chemicals for their potential
adverse health effects is both necessary and challenging. In other words, characterization
of the chemical exposome from a health perspective is necessary for human well-being.
To this end, there has been growing interest in characterizing the human exposome along
with the genome to better understand the environmental factors crucial for human health
and disease.
In this thesis, we focus on environmental chemicals that have gained significant atten-
tion from scientists, regulatory authorities, and the general public, due to their potential
health concerns. In order to link chemical exposomes to health effects, we have under-
taken a systematic compilation, curation and exploration of the existing information con-
tained in published toxicological studies on diverse groups of environmental chemicals.
Specifically, we focus on five groups of chemicals with toxicological relevance, namely
endocrine disrupting chemicals (EDCs), environmental neurotoxicants, human milk con-
taminants, fragrance chemicals in children’s products, and exogenous chemicals detected
in human tissues.
Furthermore, there is recent recognition of the need to leverage network science and
systems biology approaches in characterizing the chemical exposome. Therefore, we ex-
tensively employ these approaches on the compiled toxicological information for the five
groups of environmental chemicals studied in this thesis. Specifically, we investigated
similarity networks of these environmental chemicals based on similarity in chemical
structures or similarity of target genes. Further, we constructed bipartite networks of
environmental chemicals and their target genes, and tripartite networks of environmen-
tal chemicals, their target genes and associated diseases, to reveal perturbed pathways
and potential disease comorbidities related to chemical exposure. Moreover, we derive a
comprehensive adverse outcome pathway (AOP) network for endocrine-mediated pertur-
bations, and thereafter, employ graph-theoretic measures to identify the critical biological
events associated with endocrine disruption upon chemical exposure.
To further demonstrate the utility of our research for chemical risk assessment, we
perform a comparative study using several chemical lists that are a part of inventories,
guidelines or regulations to assess the regulatory status and source of the diverse groups
of environmental chemicals considered in this thesis. These analyses reveal that several
environmental chemicals of concern are part of everyday exposures, and moreover, many
of these chemicals are found to be produced in high volume.
Chapter 1
Introduction
1.1 Motivation
Our state of health or disease is really a reflection of the environment
we all live in. And the environment we perceive.
- Darnell Houston
In the last century, industrial advances have resulted in the rapid synthesis and com-
mercialization of myriad chemicals. As of October 2021, more than 86000 such chem-
icals have been registered with the United States Environmental Protection Agency (US
EPA) under the Toxic Substances Control Act [1]. Further, based on an estimate from the
United States National Toxicology Program report of 2017 [2], around 2000 new commer-
cial chemicals are introduced into the market every year. However, only a small fraction
of these chemicals released into the environment have been tested for safety or toxicity
concerns to date [3, 4]. Humans are exposed to many of these environmental chemicals
in their daily life in the form of consumer products including personal care products,
pharmaceuticals, food additives, pesticides and insecticides [5–8]. Such exposure
to environmental chemicals contributes significantly to the incidence of several chronic
diseases [9–12]. In short, the ever-increasing rate of new chemicals released into the
environment and the subsequent global prevalence of chronic diseases underline the urgent
need for the characterization and prioritization of environmental chemicals of concern to
human health [9–11, 13–17].
To capture the diverse environmental factors influencing health and disease starting
from the prenatal period, Wild [18] introduced the concept of “exposome”. Subsequently,
others have both expanded and refined the definition of the exposome. Rappaport et
al. [19] included the body’s internal chemical environment in the definition of the ex-
posome. Miller et al. [20] expanded the definition of the exposome to include the behav-
ioral aspects of human beings, including social interactions and emotional stressors. In
sum, the human exposome captures a variety of environmental factors, both internal and
external, among which the assessment of external stressors in the form of environmental
contaminants or toxicants and the resulting impact on human health is gaining momentum
among researchers [13, 18–20].
To improve the risk assessment of environmental chemicals, there is a need for sys-
tematic characterization and better understanding of the human health impact of such
chemical exposures. Simply stated, there is immense interest in characterizing this chemical
exposome. In this direction, two approaches have been undertaken to characterize
the chemical exposome: “bottom-up” or “top-down” [21–23]. Using a “bottom-up” approach,
the different classes of chemicals present in the external environment such as food,
air, and water, can be evaluated and monitored for their potential health effects. This approach
also enables the identification of exogenous exposures along with their sources
in the environment. In contrast, a “top-down” approach involves the characterization of
both exogenous and endogenous chemicals within the biological samples such as blood,
urine, breast milk, and adipose tissue, of an individual. This approach does not provide
any information on the source of the exogenous chemicals identified in the biological
samples [21–23]. In short, the above-mentioned two approaches can be used to capture
an individual’s overall exposome. The characterization of an individual’s exposome over
their lifetime, however, remains a challenging task. Figure 1.1 is an illustration of the
various environmental exposure sources contributing to the chemical exposome of
humankind. In this thesis, we have employed both approaches to identify and characterize
certain groups of (exogenous) chemicals in the environment that have potential to
cause adverse health effects in various populations. In particular, we have studied prominent
groups of chemicals of concern such as endocrine disruptors and neurotoxicants, that
have received significant attention from scientists, regulatory agencies and the public due
to their potential health hazards.
Figure 1.1: An overview of the various environmental exposure sources contributing to the chemical exposome of humankind.
In recent times, several initiatives have been undertaken to establish large-scale ex-
posome resources using bottom-up or top-down approaches, and these resources enable
the regulatory authorities to prioritize environmental chemicals with potential to cause
adverse effects. The Exposome-Explorer database [24], which compiles biomarkers of
exposure to dietary and environmental risk factors for diseases, is one of the largest ex-
posome resources established to date. The Human Indoor Exposome Database [25] is
another manually curated exposome resource dedicated to risk factors identified in in-
door dust from human exposure studies. T3DB [26] is a toxic exposome database that
contains information about toxic compounds and their target interactions. The database
of intentionally added food contact chemicals (FCCdb) [27] compiles a list of chemicals
used in food contact materials or food contact articles. There have also been initiatives
to create exposome databases tailored to specific biological tissues or biospecimens, such
as the Blood Exposome Database [28] and Saliva Exposome [29]. Moreover, Compara-
tive Toxicogenomics Database (CTD) [30] also compiles information on environmental
chemicals detected in different biospecimens. Specific to potential health impact of en-
vironmental factors on both mothers and infants, there have been a few initiatives such
as the Human Early Life Exposome study in Europe [31] and the Drugs and Lactation
Database (LactMed) [32,33] of the US National Library of Medicine. Additionally, some
non-profit organizations also compile information on common chemicals and exposure
concerns to help mothers better understand their possible health effects on infants [34].
In this thesis, we focus on certain groups of environmental chemicals that have gained
significant attention from scientists, regulatory authorities, and the general public due to
their potential health concerns. Specifically, we aim to highlight the links between chem-
ical exposome and human health. For this purpose, a systematic compilation, curation
and exploration of the existing information derived from toxicological studies can aid in
assessing the biological response to environmental chemical exposure. As a first step toward
establishing a link between chemical exposome and human health, we identify and
compile five groups of chemicals with toxicological relevance from published experimental
studies, namely endocrine disrupting chemicals (EDCs) [35–37], environmental
neurotoxicants [38], human milk contaminants [39], fragrance chemicals in children’s
products [40], and exogenous chemicals detected in human tissues [41]. We have employed
both the bottom-up and top-down approaches to characterize the above-mentioned
groups of environmental chemicals that have a potential to cause adverse health effects in
humans. Furthermore, there is a growing interest in using network science and systems
biology approaches to characterize the chemical exposome in order to better understand
the links between environmental exposures and human biology [13, 42]. As a result, in
this thesis, we have extensively utilized network science and systems biology approaches
to shed light on biological perturbations associated with exposure to diverse groups of
environmental chemicals using the toxicological information in our compiled
resources. In addition, we have studied the exposure sources, regulatory status and the
nature of compiled chemical spaces using computational approaches.
The subsequent sections of this chapter will provide an overview of the different
groups of environmental chemicals studied here and a description of various analyses
presented in this thesis.
To begin, we consider the EDCs [35,36] present in the environment that are capable of
interfering with the normal functioning of the human endocrine system. Binding of EDCs
to the native hormonal receptors interferes with the normal endocrine signalling mecha-
nism leading to adverse health effects related to reproduction, development, metabolism,
immune system, neurological system, liver or hormone-related cancers [4, 8, 43, 44]. No-
tably, the estimated annual cost of disease burden and impact on healthcare due to EDCs
is $340 billion in the USA and €163 billion in the European Union (EU) [43, 45]. While
there have been previous attempts such as the World Health Organization (WHO) re-
port [8], The Endocrine Disruption Exchange (TEDX) [46], EDCs Databank [47, 48] and
Endocrine Disruptor Screening Program (EDSP) [49] of the United States Environmental
Protection Agency (US EPA), to compile the list of potential EDCs, the earlier efforts
have not assessed the weight of evidence of endocrine disruption from existing literature,
as highlighted by Solecki et al. [45] and the scientific statements from the Endocrine So-
ciety [43, 50, 51]. Further, none of the earlier resources on EDCs compiled the adverse
health effects associated with chemical exposure that can facilitate the mechanistic un-
derstanding of endocrine disruption. In Chapter 2, we present a systematic workflow for
identifying and compiling potential EDCs in the environment along with their adverse
effects, from published experimental studies. In Chapter 3, we explore the current regulations
and guidelines from the perspective of EDCs, which can aid in better risk
assessment. In Chapter 4, we build a comprehensive adverse outcome pathway (AOP)
network relevant to endocrine disruption which can aid in understanding the systems-
level endocrine-mediated perturbations resulting from exposure to EDCs.
Thereafter, we focus on the environmental chemicals that have potential to cause ad-
verse health effects in children from two different perspectives. First, we explore several
environmental contaminants that are capable of entering human milk [39] and can have
a potential impact on maternal health [63] and the early development of a child [64, 65].
These contaminants are mostly lipophilic, persistent and bioaccumulative in nature, and
have a tendency to deposit in adipose tissue of women or mothers who are exposed to
these chemicals [66, 67]. During lactation these chemicals can transfer to human milk primarily
via passive diffusion [68–72]. In Chapter 6, we investigate these human milk contaminants
and their potential health impact on infants and mothers. Second, we investigate
fragrance chemicals in children’s products to emphasize the importance of monitoring and
regulating them. Exposure to fragrance chemicals can lead to asthma, contact dermatitis
(irritant or allergic), dyschromia, photosensitivity, and migraine headaches [73–78].
Specifically, the exposure to hazardous chemicals is a significant health concern for children,
who have a high metabolic rate, immature organ systems, thin skin, and rapid growth and
development of organs and tissues [79–81]. Despite being a subset of chemicals utilized
in children’s products, fragrance chemicals are either self-controlled or weakly regulated
[75, 79, 81]. In Chapter 7, we present a knowledgebase on the fragrance chemicals
in children’s products and their potential health hazards.
formation associated with these environmental chemicals, which can facilitate chemical
risk assessment [35, 36, 38–41].
approach
The growing number of chemicals in commerce necessitates the use of computational and
high-throughput techniques to prioritise the subset of chemicals linked to serious health
consequences [13, 87]. Data-driven exploration using published toxicological studies can
facilitate the identification of biological consequences of environmental chemical exposures
[87]. To comprehend the environmental and biological components of the exposome,
however, a systems approach to the “paradigm of biological complexity” is necessary
[87]. Network-centric techniques can aid in understanding the organizing principles
of complex biological systems [88]. Furthermore, there is recent interest in leveraging network
science and systems biology approaches in characterizing the chemical exposome.
The use of networks, in particular, might provide a conceptual framework for capturing
the intricate relationship between the environment and human health [13, 42]. In this the-
sis, we leverage the compiled toxicological information associated with the five groups
of environmental chemicals to capture the different components of the biological system
such as perturbed genes, receptors or pathways, as well as disease outcomes as a result
of environmental chemical exposure (Figure 1.2). Specifically, we extensively apply net-
work science and systems biology approaches to investigate the links between chemical
exposome and human health.
Figure 1.2: A figure depicting the complex interplay of environmental chemical exposure and perturbed biological networks at various levels of organization, which can result in disease outcomes. Panel labels: 1. Environmental chemical exposure; 2. Bioaccumulation of chemicals in various human biospecimens; 3. Perturbed biological networks (genes, molecular networks, cellular networks, tissue/organ networks, disease outcomes).
The U.S. Environmental Protection Agency’s Toxicity Forecaster (ToxCast) [89] has
screened more than 9000 chemicals using high-throughput assay experiments to capture
the molecular or cellular level changes that occur as a result of individual chemical exposure.
This data can be leveraged to prioritize chemicals using computational toxicology
approaches. Apart from ToxCast, CTD [30] provides a manually curated list of chemical-
gene associations compiled from the existing literature. In a toxicological context, chem-
icals do not affect the function of a single gene or protein, but rather they affect multiple
genes or proteins at the same time. Thus, in order to better understand the aetiology of
several chronic diseases, it is necessary to gather information on multiple target genes that
are perturbed as a result of chemical exposure [90]. Studying the chemical-gene networks
can be further helpful in understanding the various receptor-mediated processes and the
potential pathways that get perturbed upon chemical exposure. Furthermore, informa-
tion on molecular interactions can throw light on network-level perturbations such as in
protein-protein interaction network, metabolic network, and gene regulatory network, en-
abling us to capture the cellular behavior at systems-scale in response to environmental
exposures [88,90]. In this thesis, we have studied bipartite networks of these environmen-
tal chemicals and their target genes wherein the interactions were identified based on the
in vitro human assays in ToxCast.
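As a minimal illustrative sketch of the network constructions described above (and not the analysis code used in this thesis), the following Python snippet builds a chemical-gene bipartite edge list and links pairs of chemicals by the Jaccard index of their target gene sets. The chemical names, the chemical-gene associations, the function names and the similarity threshold are hypothetical placeholders and are not drawn from ToxCast.

```python
from itertools import combinations

# Hypothetical chemical -> target gene associations (illustrative only;
# these are NOT ToxCast results and NOT data from the thesis).
chemical_targets = {
    "chemical_A": {"ESR1", "AR", "NR1I2"},
    "chemical_B": {"ESR1", "AR"},
    "chemical_C": {"PPARG"},
}


def bipartite_edges(targets):
    """Return the sorted (chemical, gene) edge list of the bipartite network."""
    return sorted(
        (chem, gene) for chem, genes in targets.items() for gene in genes
    )


def jaccard(a, b):
    """Jaccard index |A & B| / |A | B|; taken as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def target_similarity_edges(targets, threshold):
    """Link chemical pairs whose target sets have Jaccard index >= threshold."""
    edges = []
    for c1, c2 in combinations(sorted(targets), 2):
        score = jaccard(targets[c1], targets[c2])
        if score >= threshold:
            edges.append((c1, c2, score))
    return edges


def gene_degree(targets):
    """Count how many chemicals are associated with each gene."""
    degree = {}
    for genes in targets.values():
        for gene in genes:
            degree[gene] = degree.get(gene, 0) + 1
    return degree


if __name__ == "__main__":
    print(bipartite_edges(chemical_targets))
    print(target_similarity_edges(chemical_targets, threshold=0.5))
    print(gene_degree(chemical_targets))
```

In this toy example, genes with a high degree in the bipartite network are those associated with many chemicals, while the thresholded Jaccard edges give a simple target similarity network among the chemicals.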
In 2007, the U.S. National Research Council issued a vision report titled ‘Toxicity testing
in the twenty-first century: a vision and a strategy’ [91], which included several
recommendations to enhance and expedite chemical toxicity testing. The report [91] urged the
use of high-throughput screening technologies, such as in vitro toxicology and in silico
approaches, to accomplish rapid, efficient, and cost-effective screening of chemicals [92].
In addition, the report [91] emphasized the importance of the notion of ‘toxicity path-
ways’ for the purpose of chemical risk assessment. These toxicity pathways are described
as a set of cellular processes that were found to mediate toxicant-induced adverse
effects [93–98]. Ankley et al. [99] suggested a similar framework, “Adverse Outcome
Pathways (AOPs)”, to gather mechanistic information on documented adverse effects
in humans or wildlife following chemical exposure. AOPs can serve as a basis for
Integrated Approaches to Testing and Assessment (IATA), and they have the potential to
identify and fill knowledge gaps, prioritize chemicals, and support regulatory decision-
making [100, 101].
An AOP is defined as: “the conceptual construct that portrays existing knowledge
concerning the linkage between a direct molecular initiating event and an adverse
outcome at a biological level of organization relevant to risk assessment” [99] (Figure 1.3A).
The Organization for Economic Cooperation and Development (OECD) established an in-
ternational programme in 2012 to standardize the development and evaluation of AOPs.
Following that, several studies reported the development of specific AOPs [101–103] and
their applications in risk assessment, human- and eco-toxicology [97, 104–112]. Each
AOP consists of two components, namely, key events (KEs) and key event relationships
(KERs). A KE in an AOP is defined as: “a measurable change in biological state that is
essential, but not necessarily sufficient for the progression from a defined biological
perturbation toward a specific adverse outcome” [105] (Figure 1.3A). Among KEs, Molecular
Initiating Events (MIEs) capture the initial molecular level interactions between
chemicals or stressors and their target receptor(s), while Adverse Outcomes (AOs) capture
perturbations at the organ or higher levels of biological organization such as changes in
morphology or physiology [105] (Figure 1.3A). A KER is a directed interaction between
any two KEs in an AOP [97, 105, 106].
In 2014, the OECD initiated the AOP knowledge base (AOP-KB) [113] for the collaborative
development of AOPs. AOP-Wiki [114] is an actively maintained module within AOP-KB
that receives real-time updates and serves as a central repository for AOPs in various
stages of development. The sharing of KEs within AOP-Wiki can result in the development
of ‘AOP networks’. An AOP network is defined as: “an assembly of 2 or more AOPs
that share one or more KEs, including specialized KEs such as MIEs and AOs” [107]
(Figure 1.3B). Recent studies [107, 110, 115–118] have highlighted the potential
applicability of such AOP networks in exploring specific toxicology-related questions. The
use of graph-theoretic techniques [88] to analyze such derived AOP networks can
highlight important topological features, critical paths, and relationships among individual
AOPs [107, 110]. In Chapter 4 of this thesis, we develop and analyze a comprehensive
AOP network relevant to endocrine disruption based on the existing information available
in AOP-Wiki.
Figure 1.3: (A) Schematic representation of Adverse Outcome Pathways (AOPs) that comprise
Molecular Initiating Events (MIEs), Key Events (KEs) and Adverse Outcomes (AOs) spanning
different levels of biological organization. (B) Two AOPs can be assembled together based
on shared KEs to form an AOP network. (C) An illustration of an AOP network built from existing
information in AOP-Wiki, which can then be derived to study a specific research question.
[Figure labels: AOP1 with MIE1, KE1, KE2 and AO1; MIE2; legend entries MIE, KE and AO.]
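The assembly of an AOP network from shared KEs can be sketched in a few lines of Python. This is only a minimal illustration and not the procedure used in this thesis: it assumes that each AOP is a simple linear chain of events, the event names follow the labels of Figure 1.3, the composition of AOP2 is hypothetical, and the function name `build_aop_network` is introduced here purely for illustration.

```python
from collections import defaultdict


def build_aop_network(aops):
    """Merge AOPs into one directed network.

    `aops` maps an AOP name to an ordered list of event names
    (MIE first, AO last). Consecutive events are treated as a KER,
    i.e. a directed edge. Linear AOPs are an assumption of this sketch.
    """
    edges = set()
    membership = defaultdict(set)  # event -> names of AOPs containing it
    for aop_name, events in aops.items():
        for event in events:
            membership[event].add(aop_name)
        for source, target in zip(events, events[1:]):
            edges.add((source, target))
    # Events occurring in two or more AOPs are what link the AOPs together.
    shared_events = {e for e, names in membership.items() if len(names) > 1}
    return edges, shared_events


# Toy input: AOP1 follows Figure 1.3; AOP2 is a hypothetical second pathway.
toy_aops = {
    "AOP1": ["MIE1", "KE1", "KE2", "AO1"],
    "AOP2": ["MIE2", "KE1", "KE2", "AO1"],
}
edges, shared_events = build_aop_network(toy_aops)
print(sorted(edges))
print(sorted(shared_events))
```

Because the edges are stored as a set, a KER that occurs in both AOPs is counted once in the merged network, which is what allows the two pathways to converge on their shared events.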
Exposome-disease associations
1.4 Characterization of environmental chemical spaces
In silico or computational toxicology was originally developed for drug development.
In recent years, however, it has been employed for toxicological research and risk assessment
in the environmental chemical space [124]. In particular, in silico approaches are be-
ing employed to predict or model the toxicological mechanisms, adverse outcomes or
systems-level behaviour [124]. In silico approaches in this direction include databases,
data mining, read-across, different kinds of quantitative structure-activity relationship
(QSAR) methods, molecular modelling, and network-based approaches [124, 125]. Sev-
eral of these computational approaches are based on the similarity principle, which as-
sumes that structurally similar chemicals will have similar toxicological effects [126,127].
In particular, chemical categorization and read-across methods are widely used for risk
assessment of chemicals.
Structure-based similarity analysis can aid in the understanding of the diversity of the
investigated environmental chemical space. Any chemical space can be characterized by
a multi-dimensional space of descriptors such as hydrophobicity, chemical connectivity,
presence or absence of particular substructures, and these features can be measured ex-
perimentally or obtained computationally [128]. For this, each chemical structure is rep-
resented in the form of binary fingerprints that capture different aspects such as hydropho-
bicity, chemical connectivity, presence or absence of particular substructures [126, 128].
Similarity between any two chemicals is quantified using measures such as the Tanimoto
index, Dice index, Cosine coefficient and Soergel distance [126]. The similarity coefficients
typically take values in the range between 0 and 1, with 0 representing no resemblance
and 1 representing identical fingerprints. Some of the widely-used molecular fingerprints
for similarity quantification include the extended connectivity
fingerprints (ECFP4) [129], the MACCS keys fingerprints [130], and the Daylight-like
fingerprints. Visualisation and analysis of a particular environmental chemical space by
constructing chemical similarity networks (CSNs) can provide insight into the diversity
of the compiled chemical spaces [131]. In a CSN, the nodes are the chemicals, and there is
an edge between two nodes (chemicals) if they share a certain level of structural similarity.
To this end, we have constructed CSNs for various groups of environmental chemicals
studied in this thesis, and further, have evaluated the structural diversity of associated
chemical spaces.
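The construction of a CSN from binary fingerprints can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions and not the pipeline used in this thesis: fingerprints are represented as sets of on-bit indices rather than computed from chemical structures, the chemical names and bit sets are invented, and the similarity cutoff of 0.5 is a hypothetical value chosen only for the example.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention assumed here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similarity_network(fingerprints, threshold):
    """Return CSN edges (chem_1, chem_2, score) for pairs with score >= threshold."""
    edges = []
    for (name_a, fp_a), (name_b, fp_b) in combinations(sorted(fingerprints.items()), 2):
        score = tanimoto(fp_a, fp_b)
        if score >= threshold:
            edges.append((name_a, name_b, score))
    return edges


# Invented fingerprints; real ones would come from a cheminformatics toolkit.
toy_fingerprints = {
    "chem_A": {1, 2, 3, 4},
    "chem_B": {2, 3, 4, 5},
    "chem_C": {7, 8},
}
csn_edges = similarity_network(toy_fingerprints, threshold=0.5)
print(csn_edges)
```

In this toy input, chem_A and chem_B share 3 of 5 distinct on-bits and are therefore connected, while chem_C shares no bits with either and remains an isolated node.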
In addition to the chemical structure similarity, we have leveraged the predicted chem-
ical classification, predicted physicochemical properties, and predicted absorption, distri-
bution, metabolism, and excretion (ADME) properties to characterize the compiled envi-
ronmental chemical spaces studied in this thesis.
the publicly available scientific and regulatory sources of toxicity information [153]. The
presence of diverse groups of environmental chemicals in the existing chemical lists repre-
senting the current chemical regulations, guidelines or inventories can also reflect the gaps
in the current regulation across various exposure sources. To this end, comparative studies
for food, food additives and food contact compounds have been performed [154,155], and
these studies have revealed inadequacies in current regulation that lead to the inclusion of
substances of concern in food-related products.
In this direction, we have compiled the publicly available chemical lists representing
current regulations, guidelines or inventories in this thesis, and thereafter, classified the
chemical lists according to various exposome categories. Thereafter, we have explored
the presence of the five groups of environmental chemicals studied in this thesis, across
the chemical lists representing current regulations, guidelines or inventories, in order to
assess the current regulatory status of the different groups of environmental chemicals.
Chapter 3 presents an overview of the updated knowledgebase DEDuCT 2.0, and an
investigation of the current regulations and guidelines from the perspective of EDCs. In
this chapter, we sought to understand how scientific knowledge from academic research
could be used to improve chemical regulation, with an emphasis on EDCs. We expand
our comparative analysis to various chemical lists and classify them based on an
influential report commissioned by the European Parliament [156]. To understand the
scale of exposure and the related hazard potential, we analyze which of these potential
EDCs in human use are produced in large volumes. Lastly, we also demonstrate how the
compiled information in curated knowledgebases like DEDuCT 2.0 can aid in the risk
assessment of EDCs using an example. The work reported in this chapter is contained
in the published manuscript [36].
Chapter 4 presents the steps involved in the characterization, development and inves-
tigation of an adverse outcome pathway (AOP) network derived to capture the endocrine-
mediated perturbations resulting from environmental exposure. In this chapter, we as-
sess the quality and completeness of information of each AOP compiled in AOP-Wiki
[114], and thereafter, identify high-confidence AOPs relevant to endocrine disruption
(ED-AOPs). The identified ED-AOPs were used to construct an ED-AOP network by
assembling the information on shared KEs and KERs among them. We further utilize a
graph-theoretic approach to study the ED-AOP network and identify critical biological
events perturbed upon endocrine disruption. Besides, we also study the systems-level
perturbations caused by endocrine disruption, emergent paths, and stressor-event associ-
ations. The work reported in this chapter is contained in the manuscript [37].
compiled neurotoxicants in various chemical lists representing regulations, guidelines or
inventories. We also characterize the associated chemical space by constructing a chemi-
cal similarity network. The work reported in this chapter is contained in the published
manuscript [38].
human exposures to environmental chemicals detected in human tissues, as well as the
current status of their monitoring and regulation. Further, we propose a priority list of
potentially hazardous chemicals based on a comparative analysis of TExAs with SVHC
REACH regulation [157] and high production volume chemicals. The work reported in
this chapter is contained in the published manuscript [41].
Chapter 9 concludes this thesis with a brief summary of the research reported across
different chapters. The chapter also discusses the future prospects and the scope of our ef-
forts in identifying, compiling and characterizing different classes of environmental chem-
icals, and linking them to potential health hazards in humans.
Chapter 2
For the risk assessment of EDCs, an important limitation is the lack of availability of
validated test systems for their identification [43,45]. This has hampered the efforts of both
researchers and policymakers to reach a consensus on the identification of EDCs and the
characterization of their endocrine disruption mechanisms [43, 45]. In this direction, Solecki
et al. [45] have outlined a detailed consensus statement on the scientific principles that
can form a basis for the identification of EDCs and their disruption mechanism.
Furthermore, the scientific statements by the Endocrine Society [43, 50, 51] provide principles for
better understanding of disruption mechanisms by EDCs.
Given the potential risk from EDCs in our environment, there have been multiple
efforts towards their compilation, which include the World Health Organization (WHO)
report [8], The Endocrine Disruption Exchange (TEDX) [46], EDCs Databank [47, 48],
and the Endocrine Disruptor Screening Program (EDSP) [49] of the United States
Environmental Protection Agency (US EPA). However, these existing resources on potential EDCs
consider evidence for endocrine disruption upon exposure from disparate types of pub-
lished studies. Specifically, the WHO report and TEDX contain manually curated in-
formation on EDCs based on published literature evidence including in vivo, in vitro, in
silico, environmental monitoring and epidemiological studies while EDCs Databank com-
piles EDCs from the TEDX and the EU list of potential endocrine disruptors followed by
PubMed [158] search to associate literature evidence with EDCs. Another important lim-
itation of these existing resources on potential EDCs is the lack of systematic effort to
compile the observed adverse effects specific to endocrine disruption in supporting pub-
lished experiments.
[Figure 2.1 workflow labels: Literature mining from a PubMed query (16407 articles), the WHO
report (337 articles), TEDX (1087 articles) and EDCs Databank (456 articles). STAGE-1: manually
filtered for the presence of keywords related to EDCs, giving 14297 articles with likely
information on EDCs. STAGE-4: identification of EDCs with supporting evidence on systems-level
endocrine-mediated perturbations, giving a final list of 686 EDCs, their systems-level
endocrine-mediated perturbations, and supporting evidence from 1796 articles.]
Figure 2.1: Detailed workflow with four stages to identify potential EDCs from published
research articles containing supporting experimental evidence of systems-level endocrine-mediated
perturbations in humans or rodents.
Firstly, we mined PubMed [158] using the following keyword search:
The above query was designed to filter abstracts on EDCs from PubMed, and this keyword
search in February 2018 led to 16407 research articles. Secondly, we compiled research
articles from three existing resources on EDCs, namely, the WHO report [8], TEDX [46]
and EDCs Databank [47, 48]. Specifically, the WHO report, TEDX and EDCs Databank
captured information from 337, 1087 and 456 research articles, respectively.
Subsequently, we manually filtered the compiled abstracts from PubMed query, WHO
report, TEDX and EDCs Databank for the presence of keywords such as endocrine dis-
ruptors or endocrine disrupters or endocrine disrupting or endocrine disrupting chemicals
or EDC or EDCs. In particular, we checked that the acronym EDC in a filtered abstract
refers to endocrine disrupting chemicals. For example, we found that the acronym EDC
in certain abstracts may refer to irrelevant terms such as electric dynamic catathermometer
or expected delivery cesarean or endothelium-derived contracting. This manual filtration
of abstracts based on presence of keywords relevant to endocrine disruption studies led
to 14297 research articles at the end of the stage 1 (Supplementary Table S2.1). Of these
14297 research articles at the end of stage 1, 12879 are not captured in existing resources,
namely, WHO report, TEDX or EDCs Databank [35].
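The filtration in stage 1 was performed manually. Purely as an illustration of the keyword logic described above, the following Python sketch shows how an automated pre-screen could flag abstracts before manual inspection. The regular expressions, the three-way outcome and the function name `screen_abstract` are assumptions made for this example and were not part of the workflow.

```python
import re

# Spelled-out keywords: endocrine disruptor(s), disrupter(s) or disrupting.
EDC_PHRASE = re.compile(r"endocrine[\s-]+disrupt(?:ors?|ers?|ing)", re.IGNORECASE)
# The bare acronym is ambiguous, so it is matched separately and case-sensitively.
EDC_ACRONYM = re.compile(r"\bEDCs?\b")


def screen_abstract(text):
    """Return 'keep', 'review' or 'drop' for one abstract."""
    if EDC_PHRASE.search(text):
        return "keep"
    if EDC_ACRONYM.search(text):
        # Acronym without the spelled-out term: a curator must check what
        # EDC stands for in this abstract.
        return "review"
    return "drop"


examples = [
    "Bisphenol A is a known endocrine disrupting chemical in rodents.",
    "EDC release from the aorta was measured after stimulation.",
    "Caffeine intake and sleep quality in adults.",
]
labels = [screen_abstract(text) for text in examples]
print(labels)
```

The separate 'review' outcome reflects the point made in the text: an abstract that contains only the acronym cannot be accepted automatically, since EDC may stand for an unrelated term.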
In stage 2, we screened the 14297 research articles from stage 1 to select studies based on
in vivo or in vitro experiments in humans or rodents (Figure 2.1). Here, we have excluded
published studies where receptor-based binding assays or in silico methods are employed
to infer the potential endocrine disruption by a chemical using binding affinity or bioac-
tivity information. Such binding affinity or bioactivity values do not provide sufficient
information on whether chemical exposure can actually lead to adverse effects due to en-
docrine disruption [159]. We have also excluded human epidemiological studies due to
insufficient mechanistic evidence linking observed adverse effects to potential endocrine
disruption upon chemical exposure [160, 161]. The filtration based on study type and test
organism led to a subset of 3300 research articles at the end of stage 2 (Supplementary
Table S2.2). Of these 3300 research articles at the end of stage 2, 2394 are not captured
in existing resources, namely, WHO report, TEDX or EDCs Databank [35].
In this work, we do not include information from two existing resources on EDCs,
namely, the Endocrine Disruptor Knowledge Base (EDKB) [162] and Endocrine Disrup-
tor Screening Program (EDSP) of the United States Environmental Protection Agency
(US EPA). EDKB compiles EDCs based on multiple receptor binding assays and in silico
QSAR studies, and such evidence is ignored in our workflow to identify EDCs (Figure
2.1). EDSP screens chemicals based on several hormonal assays in test organisms such
as human, rat, fish and amphibians to determine their potential to interact with the human
endocrine system. EDSP identifies a chemical to be an EDC if the chemical displays
consistent evidence of endocrine disruption across all hormonal assays carried out by
them. As highlighted by Zoeller et al. [43], the weight of evidence used by EDSP to iden-
tify EDCs is too stringent which leads to omission of several chemicals with significant
endocrine-specific effects. Specifically, in the EDSP Tier 1 screening of 52 chemicals,
18 were determined to have conclusive evidence for endocrine disruption while 34 have
inconclusive evidence. However, a closer inspection of the 34 chemicals determined by
EDSP to have inconclusive evidence finds well-known EDCs such as Chlorpyrifos and
2,4-Dichlorophenoxyacetic acid highlighted by the WHO report and the Endocrine
Society [163]. Thus, we decided not to include information from EDSP in our resource.
2.1.3 Compilation of tested chemicals from the filtered research articles
In stage 3, we gathered the set of chemicals tested for potential endocrine disruption in any
of the 3300 research articles from stage 2. Moreover, we also gathered information on the
two-dimensional (2D) structure of each tested chemical using PubChem [86] and Chemi-
cal Abstracts Service (CAS) [164] databases (Figure 2.1). Note that we have omitted any
tested chemical in the 3300 research articles which could not be mapped to a chemical
identifier in standard chemical databases. At the end of stage 3, we compiled 1626 chem-
icals along with their 2D structures that were tested for endocrine disruption in humans or
rodents in at least one of the filtered research articles from stage 2 (Supplementary Table
S2.3) [35].
In stage 4, we identify potential EDCs among the 1626 chemicals compiled in stage 3 by
assessing the significance of observed effects for endocrine disruption upon exposure in
published experiments in humans or rodents (Figure 2.1).
Prior to this assessment of supporting evidence for endocrine disruption upon chem-
ical exposure, we excluded a tested chemical or its published experiment based on the
following criteria (Figure 2.1):
1. Chemical is a natural hormone.
2. Chemical was tested as part of a mixture in the published experiment. This criterion
reflects our choice to include chemicals which as single entities can cause endocrine dis-
ruption upon exposure.
3. Chemical was tested for therapeutic relevance in the published experiment.
Moreover, we excluded published experiments which contain evidence for endocrine
disruption upon chemical exposure in an in vitro rodent system. Since the observed effects in
an in vitro rodent system do not adequately reflect the complexities observed in humans,
the last criterion omits such evidence in the published literature (Figure 2.1). For the next
phase of the workflow, we filtered chemicals and their associated literature which pass the
above-mentioned criteria.
For each chemical which passed the above-mentioned criteria, we next evaluated the
level of supporting evidence for endocrine disruption in humans or rodents upon expo-
sure based on published experiments contained in the filtered research articles. For this
evaluation, we manually compiled the observed effects upon exposure of each chemical
in associated published experiments in humans or rodents. A published experiment in
humans or rodents is considered as strong supporting evidence for endocrine disruption
by a chemical if the chemical upon exposure leads to observed effects or endpoints related
to endocrine-specific perturbations such as changes in morphology, physiology, growth,
reproduction, development and lifespan [8]. Thereafter, if a chemical has at least one pub-
lished experiment with strong supporting evidence for endocrine disruption upon expo-
sure, then it is identified as a potential EDC in stage 4 of the workflow. At the end of stage
4, we identified 686 potential EDCs with supporting evidence of endocrine-mediated per-
turbations in published literature spanning 1796 research articles (Supplementary Table
S2.4) [35].
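The decision rule of stage 4 can be summarized in a short Python sketch. The curation itself was manual, so this is only a schematic restatement of the criteria above under simplifying assumptions: the record structure `Experiment`, its field names and the toy chemical names are all hypothetical, and a non-empty list of endocrine-specific endpoints is used as a stand-in for the curator's judgement of strong supporting evidence.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    chemical: str
    organism: str   # "human" or "rodent"
    system: str     # "in vivo" or "in vitro"
    tested_as_mixture: bool = False
    therapeutic_context: bool = False
    endocrine_endpoints: list = field(default_factory=list)


def passes_exclusion_criteria(exp, natural_hormones):
    """Apply the exclusion criteria listed in the text to one experiment."""
    if exp.chemical in natural_hormones:
        return False
    if exp.tested_as_mixture or exp.therapeutic_context:
        return False
    if exp.organism == "rodent" and exp.system == "in vitro":
        return False
    return True


def identify_potential_edcs(experiments, natural_hormones):
    """A chemical is retained if at least one eligible experiment reports
    an endocrine-specific endpoint."""
    return sorted({
        exp.chemical
        for exp in experiments
        if passes_exclusion_criteria(exp, natural_hormones) and exp.endocrine_endpoints
    })


toy_experiments = [
    Experiment("chem_A", "rodent", "in vivo", endocrine_endpoints=["Reduced sperm counts"]),
    Experiment("chem_B", "rodent", "in vitro", endocrine_endpoints=["Elevated insulin levels"]),
    Experiment("chem_C", "human", "in vitro", tested_as_mixture=True,
               endocrine_endpoints=["Decrease in T4 levels"]),
    Experiment("hormone_H", "human", "in vivo", endocrine_endpoints=["Thymus atrophy"]),
    Experiment("chem_D", "human", "in vitro"),
]
potential_edcs = identify_potential_edcs(toy_experiments, natural_hormones={"hormone_H"})
print(potential_edcs)
```

Only chem_A is retained in this toy input: chem_B is supported only by an in vitro rodent experiment, chem_C was tested as part of a mixture, hormone_H is a natural hormone, and chem_D has no reported endocrine-specific endpoint.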
For the identification of EDCs, we have manually compiled the observed effects or end-
points related to endocrine-specific perturbations reported in published experiments on
chemical exposure in humans or rodents (Figure 2.1). This compiled list of observed ef-
fects or endpoints was then used to assess the level of supporting evidence for endocrine
disruption upon chemical exposure. In order to standardize the reported evidence for
an EDC, we undertook an extensive manual effort to unify the biological terms used to
describe the observed effects or endpoints related to endocrine-specific perturbations in
published experiments upon chemical exposure.
EDCs perturb the normal functioning of the human endocrine system which consists
of several glands that secrete hormones which in turn regulate diverse biological func-
tions such as development, growth, reproduction, metabolism, immunity and behaviour
[165, 166]. Hence, exposure to EDCs can have adverse effects in several biological pro-
cesses regulated by the human endocrine system (Figure 2.2). In addition, the endocrine-
related processes perturbed by EDCs can also induce cancer in humans [8, 50, 51]. Mo-
tivated by the major biological processes controlled by the human endocrine system, we
have classified the 514 endocrine-mediated endpoints into 7 systems-level perturbations
which are:
1. Reproductive endocrine-mediated perturbations (RT)
2. Developmental endocrine-mediated perturbations (DT)
3. Metabolic endocrine-mediated perturbations (MT)
4. Immunological endocrine-mediated perturbations (IT)
5. Neurological endocrine-mediated perturbations (NT)
6. Hepatic endocrine-mediated perturbations (HT)
7. Endocrine-mediated cancer (CT)
In Supplementary Table S2.5, we list the 514 endocrine-mediated endpoints and their cat-
egorization into 7 systems-level endocrine-mediated perturbations in DEDuCT 1.0 [35].
Figure 2.3A shows the occurrence of these 7 systems-level perturbations in the supporting
published experiments for the 686 EDCs in DEDuCT 1.0 [35]. Among the 686 EDCs
in DEDuCT 1.0 [35], it is seen that 535 have supporting evidence for reproductive
perturbations and 315 for metabolic perturbations (Figure 2.3A). Thus, the majority of EDCs
in DEDuCT 1.0 have supporting evidence for adverse effects on the reproductive system
followed by metabolism [35].
We highlight that future studies and toxicological databases can leverage our compre-
hensive list of endocrine-mediated endpoints and their categorization into 7 systems-level
perturbations while reporting or documenting the adverse effects related to endocrine dis-
ruption from experiments related to chemical exposure. Hence, our work also contributes
towards development of a unified biological vocabulary to describe toxicity profiles of
chemicals.
mediated endpoints
In stage 4 of the workflow, we have also compiled the dosage values for each EDC at
which the endocrine-mediated endpoints are observed in the published experiments (Fig-
ure 2.1). Firstly, we have gathered the test dosage values for each EDC in appropriate
units from the published experiments. Secondly, we have identified the effective dosage
value among the test dosage values at which a particular endocrine-mediated endpoint is
observed upon EDC exposure in the published experiment. Thirdly, the published experi-
ments with supporting evidence for endocrine disruption by EDCs employ different units
to report the test and effective dosage values. Thus, we undertook a significant effort to
convert and express the test and effective dosage values taken from published experiments
on EDCs in a uniform format wherever possible.
Figure 2.2: Schematic figure depicting the classification of the 514 endocrine-mediated endpoints
into 7 systems-level perturbations in DEDuCT 1.0. Note that this classification of endpoints into
systems-level perturbations is overlapping, that is, a given endpoint may fall into more than one
systems-level perturbation.
[Figure labels, with the number of endpoints and example endpoints for each category:
1. Reproductive endocrine-mediated perturbations (RT), 273 endpoints: reduced sperm counts,
affects testicular morphology, affects germ cell differentiation.
2. Developmental endocrine-mediated perturbations (DT), 83 endpoints: affects embryonic
development, affects skeletal development in fetus, affects placental development.
3. Metabolic endocrine-mediated perturbations (MT), 125 endpoints: affects xenobiotic
metabolism, elevated insulin levels, decrease in T4 levels, lead to obesity.
4. Immunological endocrine-mediated perturbations (IT), 33 endpoints: atrophy of spleen,
thymus atrophy, alterations in immune responses.
5. Neurological endocrine-mediated perturbations (NT), 65 endpoints: affects neuronal density,
increase in corticosterone levels, decreased dopamine levels, affects social behavior.
6. Hepatic endocrine-mediated perturbations (HT), 29 endpoints: oxidative stress in liver,
affects hematopoiesis of liver, increased liver weights.
7. Endocrine-mediated cancers (CT), 20 endpoints: cancer phenotype, adenocarcinoma, induce
cancer metastasis.
Glands and organs depicted: hypothalamus, pituitary gland, thyroid gland, thymus gland, liver,
adrenal glands, pancreas, ovary, testis.]
Based on this effort, we realized that the different units used to report the test and
effective dosage values of EDCs in published experiments can be classified into two broad
categories:
1. Dose which gives the amount of chemical that is administered directly to the test
organism in the experiment.
2. Concentration which gives the amount of chemical present in another substance such
as food, soil or water that is administered to the test organism in the experiment.
Moreover, only a fraction of the published experiments on EDCs report dosage values
normalized by the body weight of the individual test organism and duration of exposure
[167]. For example, if a published experiment on EDC reports the dosage value in the
unit mg/kg/day then this gives the amount of chemical administered per kg of the body
weight of the test organism per day.
Due to the above-mentioned limitations, we were able to convert the different units
used in published experiments to report the dosage values of EDCs into 19 standardized
units. Supplementary Table S2.6 lists these 19 standardized units which were used to
compile the dosage values of EDCs specific to endocrine-mediated endpoints from pub-
lished experiments. For each EDC, we have compiled the test and effective dosage values
specific to endocrine-mediated endpoints in standardized units, and this information is
readily available via the DEDuCT webserver.
Natural hormones in the human body can carry out their physiological functions at very low
concentrations. EDCs are known to interfere with the endocrine system by mimicking the
natural hormones. Thus, it is important for risk assessment of EDCs to understand the
adverse effects caused by their low dose exposure [168–170]. In this direction, our
compilation of the test and effective dosage values for EDCs in DEDuCT 1.0 from published
experiments can be leveraged to elucidate such low dose effects. Specifically, we have
used the test and effective dosage values for EDCs in DEDuCT 1.0 to determine the fol-
lowing dose-response measures [51, 168]:
1. No Observed Adverse Effect Level (NOAEL) gives the highest dose of an EDC at
which no effects or endocrine-mediated endpoints are observed in the published
experiments.
2. Lowest Observed Adverse Effect Level (LOAEL) gives the lowest dose of an EDC at
which any one of the effects or endocrine-mediated endpoints is observed in the
published experiments.
Note that the supporting evidence for the EDCs in DEDuCT 1.0 has been compiled
from three different types of published experiments, namely, in vivo or in vitro experi-
ments in humans or in vivo experiments in rodents. In cases where the supporting evi-
dence for an EDC comes from more than one type of published experiment, we determine
the NOAEL and LOAEL values for the EDC separately for different types of published
experiments (Supplementary Table S2.7). Moreover, the supporting evidence for an EDC
in DEDuCT 1.0 may come from published experiments employing different units to spec-
ify test and effective dosage values. In such cases, we determine the NOAEL and LOAEL
values for the EDC separately for different standardized units across the published experi-
ments (Supplementary Table S2.7). Note that we did not compile information on the route
and duration of EDC exposure from published experiments in DEDuCT. Supplementary
Table S2.7 lists the NOAEL and LOAEL values for EDCs in DEDuCT 1.0.
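The determination of NOAEL and LOAEL values separately for each type of experiment and each standardized unit can be sketched as follows. This is a minimal illustration that applies the two definitions above literally to a list of dosage records for a single EDC. The record layout, the function name `dose_response_measures`, the dosage values and the unit label "uM" are assumptions made for the example; only mg/kg/day is a unit mentioned in the text.

```python
from collections import defaultdict


def dose_response_measures(records):
    """Compute NOAEL and LOAEL per (experiment type, unit) group.

    `records` is an iterable of (experiment_type, unit, dose, endpoint_observed)
    tuples for a single EDC. NOAEL is the highest dose without an observed
    endpoint, LOAEL the lowest dose with an observed endpoint; a value of None
    means the measure is not available for that group.
    """
    groups = defaultdict(lambda: {"effect": [], "no_effect": []})
    for experiment_type, unit, dose, endpoint_observed in records:
        key = (experiment_type, unit)
        groups[key]["effect" if endpoint_observed else "no_effect"].append(dose)
    return {
        key: {
            "NOAEL": max(doses["no_effect"], default=None),
            "LOAEL": min(doses["effect"], default=None),
        }
        for key, doses in groups.items()
    }


# Invented dosage records for one hypothetical EDC.
toy_records = [
    ("in vivo rodent", "mg/kg/day", 0.5, False),
    ("in vivo rodent", "mg/kg/day", 5.0, True),
    ("in vivo rodent", "mg/kg/day", 50.0, True),
    ("in vitro human", "uM", 1.0, True),
]
measures = dose_response_measures(toy_records)
print(measures)
```

Grouping by both experiment type and unit mirrors the choice described above: dosage values reported in different units or obtained from different types of experiments are not compared with one another.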
We have classified the 686 EDCs in DEDuCT 1.0 into 4 categories based on the type of
supporting evidence in published experiments. EDCs in category I have supporting evi-
dence from in vivo human experiments, category II from in vivo rodent and in vitro human
experiments but not from in vivo human experiments, category III from only in vivo ro-
dent experiments, and category IV from only in vitro human experiments (Supplementary
Table S2.8). Thus, potential EDCs in category I have the highest level of supporting ev-
idence in published experiments followed by category II, III and IV, respectively. Of the
686 EDCs in DEDuCT 1.0, 7, 142, 367 and 170 are in category I, II, III and IV,
respectively (Supplementary Table S2.8). These 142, 367 and 170 potential EDCs in categories
II, III and IV, respectively, in DEDuCT 1.0 require additional experimentation and further
risk assessment for their potential risk to humankind [35].
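The assignment of an EDC to one of the four evidence categories follows directly from the set of experiment types that support it, as the following Python sketch illustrates. The function name `evidence_category` and the string labels for the experiment types are hypothetical choices made for this example.

```python
def evidence_category(evidence_types):
    """Map the set of supporting experiment types of an EDC to category I-IV.

    Expected labels: "in vivo human", "in vivo rodent", "in vitro human".
    """
    evidence = set(evidence_types)
    if "in vivo human" in evidence:
        return "I"
    if {"in vivo rodent", "in vitro human"} <= evidence:
        return "II"
    if evidence == {"in vivo rodent"}:
        return "III"
    if evidence == {"in vitro human"}:
        return "IV"
    raise ValueError(f"unsupported evidence types: {sorted(evidence)}")


print(evidence_category(["in vivo rodent", "in vitro human"]))

# The reported category sizes add up to the 686 EDCs in DEDuCT 1.0.
category_sizes = {"I": 7, "II": 142, "III": 367, "IV": 170}
print(sum(category_sizes.values()))
```

The order of the checks matters: evidence from in vivo human experiments places an EDC in category I regardless of any additional rodent or in vitro evidence.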
We then compared potential EDCs in each category (I-IV) to the safer chemical in-
gredients list (SCIL) developed and released by the US EPA as part of its safer choice
program [171]. US EPA has identified 931 chemicals in SCIL to be ‘safe’ based on their
functional use categories. In SCIL, US EPA has labelled chemicals of low concern by
green circle, chemicals of low concern for which additional data is required by green
half-circle, chemicals satisfying safer choice criteria only for a particular functional use
while possibly displaying hazardous profile in other uses by yellow triangle, and chemi-
cals unsuitable for use in consumer products by grey square. We have compared the subset
of 930 SCIL chemicals labelled by green circle or green half-circle or yellow triangle with
the 686 potential EDCs in DEDuCT 1.0.
We find that 10 of the 686 potential EDCs in DEDuCT 1.0 are also in the
SCIL (Figure 2.3B). None of these 10 potential EDCs in SCIL are listed under category
I EDCs in DEDuCT 1.0 with supporting evidence for endocrine disruption from in vivo
human experiments. Of these 10 potential EDCs, 1, 7 and 2 are in category II, III and
IV, respectively. Benzyl salicylate is the only chemical in SCIL that is listed as category
II EDC in DEDuCT 1.0 with supporting evidence for endocrine disruption from in vivo
rodent and in vitro human experiments while lacking evidence from in vivo human experi-
ments. As Benzyl salicylate is labelled by yellow triangle in SCIL based on the functional
use category of fragrances, this suggests that this chemical may have potential to display
hazardous profile in other use categories. For improved risk assessment, there is a need to
further evaluate and gather additional evidence for potential EDCs listed in the SCIL [35].
We have also compared the list of 3312 inactive ingredients used in US Food and Drug
Administration (FDA) approved drug products from inactive ingredient database [172]
with 686 potential EDCs in DEDuCT 1.0 [35]. Inactive ingredients in a drug are the chem-
icals that do not have any pharmacological effect and these include colorants, drug preser-
vatives and flavouring agents. We find that 44 of the 686 potential EDCs in DEDuCT 1.0
are used as inactive ingredients in FDA approved drugs (Figure 2.3B). None of these 44
potential EDCs are listed under category I EDCs in DEDuCT 1.0. Of the 44 potential EDCs
in FDA inactive ingredients list, 7 chemicals (Caffeine, Trichloroethylene, Diethyl ph-
thalate, Butyl p-hydroxybenzoate, Methyl p-hydroxybenzoate, Ethyl p-hydroxybenzoate,
Butylated hydroxyanisole) are in category II, 30 in category III, and 7 in category IV of
DEDuCT 1.0. For better risk assessment, these 44 potential EDCs in FDA inactive in-
gredients list require additional evidence from in vivo human experiments considering the
effective dosage, route of exposure, and duration of exposure [35].
Based on the environmental source of EDCs, we have classified the 686 EDCs into 7
broad categories, namely, ‘Agricultural and farming’, ‘Consumer products’, ‘Industry’,
‘Intermediates’, ‘Medicine and health care’, ‘Natural sources’, and ‘Pollutant’ (Figure
2.4). The 7 broad categories of EDCs were further classified into 48 sub-categories
(Figure 2.4). Note that this environmental source-based classification of EDCs
is overlapping, that is, a given EDC may belong to multiple broad categories or sub-categories.
The majority of EDCs in DEDuCT 1.0 are used in ‘Consumer products’ (Figure 2.4).
We have employed the web-based application ClassyFire [173, 174] to obtain a chemical
classification of the 686 EDCs in DEDuCT 1.0. Note that ClassyFire [174] gives a non-
overlapping hierarchical chemical classification based on the structure and composition
of the molecule. Using ClassyFire, the 686 EDCs in DEDuCT 1.0 were classified into two
chemical kingdoms, namely, organic and inorganic compounds (Figure 2.5). Moreover,
the EDCs in the organic kingdom can be further classified into 19 super-classes while
those in the inorganic kingdom fall into 3 super-classes (Figure 2.5). Of the 686 EDCs
in DEDuCT 1.0, 646 are organic and 40 are inorganic (Figure 2.5A). Among the 646
organic EDCs in DEDuCT 1.0, the largest fraction belongs to the chemical super-class
Benzenoids (Figure 2.5A). In Figure 2.5B, we show the chemical structure of a repre-
sentative EDC in each chemical super-class with at least 10 potential EDCs in DEDuCT
1.0 [35].
Figure 2.3: (A) Histogram shows the occurrence of 7 systems-level perturbations in the support-
ing evidence compiled from published experiments for the 686 EDCs in DEDuCT 1.0. Majority
of EDCs in DEDuCT 1.0 have adverse effects on the reproductive system followed by metabolism.
(B) Comparison of the 686 EDCs in DEDuCT 1.0 with the US EPA SCIL and the FDA inactive
ingredients list. 10 EDCs are present in the SCIL while 44 EDCs are present in FDA inactive
ingredients list. (C) Comparison of the 686 EDCs in DEDuCT 1.0 with those in the WHO report,
TEDX and EDCs Databank. From the Venn diagram, it is seen that 198 EDCs in DEDuCT 1.0 are
not captured in the three other existing resources. (D) Scatter plot of target similarity versus chem-
ical structure similarity between pairs of EDCs. Here chemical structure similarity was computed
using Tanimoto coefficient with ECFP4 fingerprint. We find no significant correlation (Pearson
correlation coefficient R = 0.17) between the structural and target similarity of EDCs.
Figure 2.4: Classification of the 686 EDCs in DEDuCT 1.0 into 7 broad categories and 48 sub-
categories based on their source in the environment. In this figure, the number of EDCs in each
category or sub-category is reported within the parenthesis.
[Figure 2.4 category labels recovered from the figure: Agricultural and farming (299); Consumer products (338); Industry (301); Intermediates (119); Natural sources (39).]
For the 686 EDCs in DEDuCT 1.0, we obtained the 2D chemical structure from the PubChem
and CAS databases. Thereafter, Balloon [175, 176] and Open Babel [177, 178]
with Merck Molecular Force Field (MMFF94) were used to generate the lowest energy
three-dimensional (3D) structure of the EDCs. RDKit [179] and Open Babel [177, 178]
were used to compute the basic physicochemical properties of the EDCs. In addition, we
have also computed the one-dimensional (1D), 2D and 3D molecular descriptors using
PaDEL [180, 181], RDKit [179] and Pybel [182]. For each EDC, PaDEL, RDKit and Py-
bel gave 1875, 213 and 14 descriptors, respectively. For each EDC, we have made its 2D
and 3D chemical structure, physicochemical properties and molecular descriptors readily
available via the DEDuCT 1.0 webserver, and this information can aid future efforts to
develop computational toxicity models based on structure-activity relationships.
[Figure 2.5 panel B labels recovered from the figure: Mixed metal or non-metal compounds; Organic acids and derivatives; Organic oxygen compounds; Organohalogen compounds; Organoheterocyclic compounds; Phenylpropanoids and polyketides.]
Figure 2.5: Classification of the 686 EDCs in DEDuCT 1.0 into chemical kingdoms and chem-
ical super-classes using ClassyFire. (A) Of the 686 EDCs, 646 are organic and 40 are inorganic
compounds. The 646 organic EDCs can be further classified into 19 super-classes while the 40
inorganic EDCs fall into 3 super-classes. The number of EDCs in each super-class is reported
within the parenthesis. (B) The chemical structure of a representative EDC in each super-class
with more than 10 EDCs is shown here. For instance, the super-class Benzenoids contains 301
EDCs including Bisphenol A shown here.
Kp). Distribution properties of a chemical shed light on its availability in other parts of
the body after being absorbed into the bloodstream. The predicted distribution properties
for EDCs include blood-brain barrier (BBB), CNS permeability, fraction unbound in hu-
man, P-glycoprotein inhibitor, P-glycoprotein substrate, plasma protein binding, steady
state volume of distribution (VDss) and subcellular localization. Metabolism properties
of a chemical describe its conversion into metabolites through enzymatic breakdown prior
to elimination from the human body. The predicted metabolism properties for EDCs in-
clude assessment to act as a substrate or inhibitor of CYP450 enzymes, human bile salt
export pump (BSEP), human liver microsomal (HLM) stability assay, human multidrug
and toxin extrusion (MATE) transporter, organic anion-transporting polypeptides (OATP)
and UDP-glucuronosyltransferases (UGT) catalysis. The predicted excretion properties
for EDCs include total clearance rate and the ability to inhibit or act as a substrate for re-
nal organic cation transporter 2 (OCT2). The predicted toxicological properties for EDCs
include biodegradation capacity, carcinogenicity, Cramer’s rule, cytotoxicity, hepatotox-
icity, hERG inhibitors, maximum recommended tolerated dose (MRTD), mitochondrial
membrane potential (MMP), rat oral toxicity and skin sensitization. Supplementary Table
S2.9 lists the predicted ADMET properties by different tools used here.
https://cb.imsc.res.in/deduct/.
The web interface of DEDuCT 1.0 was created using PHP [189], HTML, CSS, Boot-
strap 4, and jQuery [190]. To facilitate interactive visualization, we have used Google
Charts [191], D3.js [192], Cytoscape.js [193] and JSmol [194] in the web interface. The
compiled database on EDCs is stored using MariaDB [195], and the information from the
database is retrieved using Structured Query Language (SQL). The DEDuCT 1.0 website is
hosted on an Apache [196] webserver running on the Debian 9.4 Linux operating system.
Using the Browse section in the web interface of DEDuCT, users can view the EDCs
based on their type of supporting evidence or environmental source or chemical classi-
fication or systems-level perturbations (Figure 2.6). Using the Simple search option in
DEDuCT, users can search for individual EDCs using chemical name or standard iden-
tifier (Figure 2.6). Using the Physicochemical filter option in DEDuCT, users can also
filter EDCs based on their physicochemical properties such as molecular weight, number
of hydrogen bond donors or acceptors, and number of rotatable bonds (Figure 2.6). By
clicking the chemical name of any EDC in DEDuCT, users can view the entire compiled
information including supporting evidence and dosage information.
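The behaviour of such a physicochemical filter can be sketched as follows. This is only an illustration and not the code behind the DEDuCT web interface, which retrieves records with SQL from MariaDB; the `Chemical` record, the `physchem_filter` helper and the property values are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chemical:
    """Hypothetical record holding a few physicochemical properties."""
    name: str
    mol_weight: float       # g/mol
    h_bond_donors: int
    h_bond_acceptors: int
    rotatable_bonds: int


def physchem_filter(chemicals, **ranges):
    """Keep chemicals whose properties all fall inside the given (low, high) ranges.

    Each keyword names an attribute of Chemical and maps to an inclusive range.
    """
    kept = []
    for chem in chemicals:
        if all(low <= getattr(chem, prop) <= high
               for prop, (low, high) in ranges.items()):
            kept.append(chem)
    return kept


# Placeholder entries with made-up values, used only to exercise the filter.
toy_chemicals = [
    Chemical("chem_A", 228.3, 2, 2, 2),
    Chemical("chem_B", 410.0, 0, 6, 9),
    Chemical("chem_C", 215.7, 2, 5, 4),
]

selected = physchem_filter(
    toy_chemicals,
    mol_weight=(200.0, 300.0),
    rotatable_bonds=(0, 3),
)
print([chem.name for chem in selected])
```

Expressing every criterion as an inclusive range keeps the helper independent of which properties are offered in the interface.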
To better illustrate the utility of DEDuCT, let us consider the well-known EDC,
Atrazine, as an example. Based on environmental source, DEDuCT 1.0 classifies Atrazine
into the broad categories ‘Agriculture and farming’ and ‘Pollutant’, and sub-categories
‘Environmental Pollutant’, ‘Fertilizer’, ‘Fungicide’, ‘Herbicide’ and ‘Pesticide’. Based
on chemical classification, Atrazine is an ‘Organic’ compound belonging to super-class
‘Organoheterocyclic compounds’ and class ‘Triazines’. In DEDuCT 1.0, Atrazine is a
potential EDC with supporting experimental evidence from 40 research articles and falls
into category II based on the type of supporting evidence. Based on compiled evidence in
DEDuCT 1.0, Atrazine exposure can lead to any of the 7 systems-level perturbations and
users can view the compiled dosage information corresponding to the observed endocrine-
mediated endpoints in the web interface.
Figure 2.6: The web interface of DEDuCT. (A) The screenshot shows the different search options
in our resource to obtain information on EDCs. Simple search option in DEDuCT can be used to
search for individual EDCs using the chemical name or standard identifier. Physicochemical filter
option in DEDuCT can also be used to filter EDCs based on their physicochemical properties such
as molecular weight, number of hydrogen bond donors or acceptors, and number of rotatable bonds.
Chemical similarity filter gives the top 10 structurally similar EDCs in DEDuCT in comparison
to the query molecule. (B) The Browse section in the web interface of DEDuCT can be used to
view the EDCs based on the type of supporting evidence or their environmental source or chemical
classification or systems-level perturbations and endocrine-mediated endpoints.
2.3 Comparison of DEDuCT 1.0 with existing resources
on EDCs
In addition to extensive PubMed mining to identify published experiments on EDCs,
DEDuCT integrates information from three existing resources, WHO report, TEDX and
EDCs Databank (Figure 2.1). We find that 198 out of the 686 potential EDCs (28.9%) and
1294 out of the 1796 associated published research articles (72.0%) containing supporting
experimental evidence in DEDuCT 1.0 are not captured in any of the three existing
resources (Figure 2.3C; Table 2.1). Unlike in DEDuCT, the supporting evidence for compiled
EDCs in the three existing resources is not limited to in vivo or in vitro studies in humans
and in vivo studies in rodents (Figure 2.1). Note that we were unable to find supporting
evidence for endocrine disruption upon exposure in published experiments on humans
or rodents for several chemicals listed as EDCs in the WHO report or TEDX or EDCs
Databank, and thus, such chemicals are not contained in DEDuCT 1.0 (Figure 2.3C). Im-
portantly, in contrast to the three existing resources, DEDuCT 1.0 compiles the observed
endocrine-mediated endpoints and systems-level perturbations upon EDC exposure from
published experiments (Table 2.1). Moreover, in contrast to the three existing resources,
DEDuCT compiles the dosage information at which endocrine-mediated endpoints were
observed upon EDC exposure from published experiments (Table 2.1).
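The percentages quoted above follow directly from the reported counts, and the underlying comparison is a set difference between DEDuCT and the union of the other resources. A minimal sketch is given below; the helper names and the toy identifiers are assumptions made for illustration, whereas the counts 198, 686, 1294 and 1796 are the ones reported in the text.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


def not_in_any(reference, *others):
    """Items of `reference` that appear in none of the `others` collections."""
    covered = set().union(*others) if others else set()
    return set(reference) - covered


# Counts reported in the text.
print(percent(198, 686))     # share of EDCs not captured elsewhere
print(percent(1294, 1796))   # share of articles not captured elsewhere

# Toy identifiers (placeholders) illustrating the set difference itself.
deduct = {"c1", "c2", "c3", "c4"}
who_report, tedx, edcs_databank = {"c1"}, {"c2", "c9"}, {"c1", "c2"}
print(sorted(not_in_any(deduct, who_report, tedx, edcs_databank)))
```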
Chemical similarity networks (CSNs) can offer insights into the extent of scaffold diversity
in the associated chemical space [197–199]. We constructed the CSN
of the 686 EDCs in DEDuCT 1.0 as follows. In the CSN, nodes are EDCs
and the edge weights reflect the extent of chemical similarity between pairs of EDCs.
Among the metrics for chemical similarity, Tanimoto [126, 200] and Dice [201] coeffi-
cients were determined to be the best choices [126]. In addition, while computing the
Tanimoto or Dice coefficient, there are several choices of molecular fingerprints such as
the extended connectivity fingerprints (ECFP4) [129], the MACCS keys fingerprints [130]
and the Daylight-like (DLL) fingerprints, and ECFP4 has been shown to outperform other
widely-used fingerprints [126, 202]. Thus, there are multiple choices based on similarity
metrics and molecular fingerprints to specify the edge weights in the CSN, and in this
work, we have explored six possible choices, namely, Tanimoto with ECFP4, Tanimoto
with MACCS, Tanimoto with DLL, Dice with ECFP4, Dice with MACCS, and Dice with
DLL which were computed using RDKit [179]. By exploring these six possible choices to
construct CSN, we show that the broad conclusions from the analysis of CSN are robust
to choices of similarity metrics and molecular fingerprints.
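For binary fingerprints, both coefficients depend only on the number of bits set in each fingerprint and the number of bits they share. The sketch below represents a fingerprint as a Python set of "on" bit positions, which is an assumption made for illustration; in this work the coefficients were computed with RDKit on ECFP4, MACCS and DLL fingerprints. Returning 0 for two empty fingerprints is a convention chosen here.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(bits_a, bits_b):
    """Dice coefficient: twice the shared bits divided by the total bits set."""
    a, b = set(bits_a), set(bits_b)
    total = len(a) + len(b)
    return 2.0 * len(a & b) / total if total else 0.0


# Toy fingerprints given as positions of bits that are set (placeholders).
fp_x = {1, 4, 7, 9}
fp_y = {4, 7, 9, 12, 15}

print(tanimoto(fp_x, fp_y))   # 3 shared bits out of 6 distinct bits
print(dice(fp_x, fp_y))       # 2 * 3 shared bits out of 4 + 5 set bits
```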
Since both the Tanimoto coefficient and the Dice coefficient for any pair of chemicals are in
the range 0 to 1, the edge weights in the six CSNs are in the same range. To visualize
the high similarity backbone of the CSN, we decided to omit edges with weights below
a chosen threshold value signifying poor chemical similarity. Rather than choosing an
arbitrary threshold value to construct this high CSN, we have investigated the size of the
largest connected component (LCC) of the CSN as a function of the increasing threshold
value for omitting edges (Figure 2.7). Note that the size of the LCC reflects the overall
connectivity of the network. By identifying the threshold value at which there is a sharp
decrease in the size of the LCC of the CSN, we have obtained the threshold value to
construct the high CSN (Figure 2.7).
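The threshold scan described above can be sketched with a union-find structure that recomputes the LCC size after omitting edges below each candidate threshold. This is a minimal illustration on a toy network, not the code used for Figure 2.7; the function names are hypothetical, and reporting the pair of consecutive thresholds with the largest drop is only one possible way to locate a "sharp decrease".

```python
def largest_component_size(n_nodes, weighted_edges, threshold):
    """Size of the largest connected component after omitting edges with weight < threshold."""
    parent = list(range(n_nodes))
    size = [1] * n_nodes

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for u, v, weight in weighted_edges:
        if weight < threshold:
            continue
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            if size[root_u] < size[root_v]:
                root_u, root_v = root_v, root_u
            parent[root_v] = root_u
            size[root_u] += size[root_v]

    return max((size[find(i)] for i in range(n_nodes)), default=0)


def lcc_curve(n_nodes, weighted_edges, thresholds):
    """LCC size for each threshold, as a list of (threshold, size) pairs."""
    return [(t, largest_component_size(n_nodes, weighted_edges, t)) for t in thresholds]


def sharpest_drop(curve):
    """Consecutive thresholds between which the LCC size decreases the most."""
    drops = [(curve[i][1] - curve[i + 1][1], i) for i in range(len(curve) - 1)]
    _, i = max(drops, key=lambda item: (item[0], -item[1]))
    return curve[i][0], curve[i + 1][0]


# Toy network with 6 nodes; edge weights are placeholder similarity values.
edges = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.5), (3, 4, 0.85), (4, 5, 0.3)]
curve = lcc_curve(6, edges, [0.0, 0.4, 0.6, 0.82, 0.95])
print(curve)
print(sharpest_drop(curve))
```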
We find that this threshold value to construct the high CSN differs based on the six
choices to assign edge weights, and it is found to be 0.45 for Tanimoto with ECFP4, 0.66
for Tanimoto with MACCS, 0.56 for Tanimoto with DLL, 0.62 for Dice with ECFP4, 0.80
for Dice with MACCS, and 0.72 for Dice with DLL (Figure 2.7A-F). Interestingly, we
find that the size and composition of the LCC of the high CSNs depend on the choice of
the molecular fingerprints rather than the similarity metric. That is, the LCCs of the high
CSNs constructed using Tanimoto with ECFP4 and Dice with ECFP4 are identical in size and
composition (255 EDCs), as are those for Tanimoto with MACCS and Dice with MACCS
(266 EDCs) and those for Tanimoto with DLL and Dice with DLL (258 EDCs).
Furthermore, we find more than 75% overlap between EDCs contained in LCCs corre-
sponding to any pair of the six high CSNs [35]. Thus, we have chosen to show only the
high CSNs constructed using Tanimoto with ECFP4, Tanimoto with MACCS and Tani-
moto with DLL (Figure 2.8; Figure 2.9). Moreover, we have chosen to report the detailed
analysis of the high CSN constructed using Tanimoto with ECFP4 (Figure 2.8; Supple-
mentary Table S2.10) as the combination of Tanimoto coefficient and ECFP4 fingerprints
was earlier found to be the best choice for chemical similarity computations [126, 202].
Figure 2.7: The size of the largest connected component (LCC) of the chemical similarity network
(CSN) of EDCs as a function of the increasing threshold for omitting edges. (A) Tanimoto with
ECFP4. (B) Tanimoto with MACCS. (C) Tanimoto with Daylight-like (DLL). (D) Dice with
ECFP4. (E) Dice with MACCS. (F) Dice with Daylight-like (DLL).
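One way to read the agreement between the Tanimoto-based and Dice-based LCCs is through the standard relation between the two coefficients. For binary fingerprints with a and b bits set and c bits in common, T = c/(a + b - c) and D = 2c/(a + b), so that D = 2T/(1 + T). Since this map is strictly increasing, omitting edges below a Tanimoto threshold T* retains the same edges as omitting edges below the Dice threshold 2T*/(1 + T*). The sketch below, which assumes binary fingerprints, applies this relation to the three Tanimoto thresholds reported above; the function name is hypothetical.

```python
def dice_from_tanimoto(t):
    """Dice coefficient implied by a Tanimoto coefficient t for binary fingerprints."""
    return 2.0 * t / (1.0 + t)


# Tanimoto thresholds reported in the text for the three fingerprints.
for fingerprint, t_star in [("ECFP4", 0.45), ("MACCS", 0.66), ("DLL", 0.56)]:
    print(fingerprint, round(dice_from_tanimoto(t_star), 2))
```

The converted values agree, to two decimal places, with the Dice thresholds of 0.62, 0.80 and 0.72 reported above.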
Since EDCs are believed to cause endocrine disruption by mimicking the hormones
in the human body [8, 50, 203], it is worthwhile to investigate the chemical properties shared
by EDCs. Based on the chemical classification of the 686 EDCs in DEDuCT 1.0, we find
that EDCs can be either organic or inorganic compounds, and moreover, are spread across
diverse chemical classes (Figure 2.8). Still, 301 of the 686 EDCs (43.9%) in DEDuCT 1.0
belong to a single chemical super-class Benzenoids (Figure 2.8). We further investigate
this chemical space by analyzing the CSN for the 686 EDCs in DEDuCT 1.0.
In Figure 2.8A, it is seen that the high CSN has an LCC of 255 EDCs, 8 small components
with 5 to 14 EDCs, 44 small components with 2 to 4 EDCs and many isolated
EDCs. In order to reveal the finer clustering of EDCs within the LCC, we have employed
Louvain modularity [204] as implemented in the network visualization tool Gephi [205]
to identify 14 modules within the LCC of the high CSN (Figure 2.8A). Moreover, a
closer inspection revealed that 210 out of the 255 EDCs in the LCC belong to the chem-
ical super-class Benzenoids. This observation inspired us to investigate the number of
benzene rings contained in each EDC (Figure 2.8A) [35].
Interestingly, we find that 254 out of the 255 EDCs in the LCC contain at least 1
benzene ring. Furthermore, 42 out of the 43 EDCs in the largest module of the LCC
have 2 benzene rings (Module 1 in Figure 2.8A). Similarly, 29 out of the 31 EDCs in the
second largest module of the LCC have 1 benzene ring (Module 2 in Figure 2.8A) and 24
out of the 29 EDCs in the third largest module of the LCC have 2 benzene rings (Module
3 in Figure 2.8A). These observations suggest a striking pattern within larger modules
of the LCC in terms of the number of constituent benzene rings of EDCs. In contrast to
Modules 1, 2 and 3 of the LCC, the fourth largest module contains 28 EDCs of which 16,
3, 6 and 4 EDCs have 2, 3, 4 and 5 benzene rings, respectively (Figure 2.8A). In Figure
2.8B, we also show the chemical structure of a representative EDC contained in the 4
largest modules of the LCC and 8 smaller components or clusters with 5 to 14 EDCs. For
example, Bisphenol A is a well-known EDC contained in Module 3 of the LCC (Figure
2.8B).
Figure 2.8: Network visualization of the high chemical similarity network (CSN)
of 686 EDCs in DEDuCT 1.0. (A) High CSN of 686 EDCs where nodes represent EDCs and edges
represent chemical similarity between pairs of EDCs quantified using Tanimoto coefficient with
ECFP4 fingerprints. Here, the edge thickness reflects the extent of chemical similarity between
two EDCs, and the node colour is based on the number of benzene rings in its chemical struc-
ture. Moreover, Louvain modularity within the network visualization tool Gephi was employed to
identify 14 modules within the LCC. The four largest modules in LCC and 8 smaller connected
components with 5 to 14 EDCs have been prominently labelled in this figure. (B) The chemical
structure of a representative EDC in each of the labelled modules or connected components in (A)
is shown here.
[Figure 2.8 legend and labels recovered from the figure: node colours denote no benzene ring and 1 to 6 benzene rings; representative EDCs named in panel B include 2,3',4,4',5-Pentachlorobiphenyl, Hexachlorobenzene, Bisphenol A, Genistein, Perfluorooctanoic acid and Cypermethrin.]
Figure 2.9: Network visualization of the high chemical similarity network (CSN)
of 686 EDCs in DEDuCT 1.0. (A) High CSN where chemical similarity is quantified by Tanimoto
coefficient with MACCS keys fingerprints. (B) High CSN where chemical similarity is quantified
by Tanimoto coefficient with Daylight-like (DLL) fingerprints. In this figure, the edge thickness
reflects the extent of chemical similarity between two EDCs, and the node colour is based on
the number of benzene rings in its chemical structure. Moreover, Louvain modularity within the
network visualization tool Gephi was employed to identify modules within the LCC.
more target genes. The assay activity information file (hitc_Matrix_180918.csv) provides
a list of active or inactive chemicals based on the potency of the chemical to produce
a significant biological effect captured via 1504 assay component endpoints of different
ToxCast assays. In this work, we restrict to ToxCast assays and their corresponding as-
say component endpoints that are specific to humans. If a tested chemical is active for a
particular assay component endpoint of a ToxCast assay, then the corresponding gene is
assigned to be the target of the chemical.
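The assignment rule stated above can be sketched as follows. This is an illustration on a toy table and not a parser for the actual ToxCast file: the column layout, the chemical identifiers, the endpoint names, the gene names and the coding of an active call as `1` are all assumptions made here. Restricting to human-specific endpoints is represented by including only those endpoints in the endpoint-to-gene map.

```python
import csv
import io


def targets_from_hit_calls(csv_text, endpoint_to_gene):
    """Map each chemical to the set of genes whose mapped endpoints show an active call."""
    targets = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        genes = targets.setdefault(row["chemical"], set())
        for endpoint, gene in endpoint_to_gene.items():
            # Only an explicit "1" counts as active; 0, NA or blanks are ignored.
            if (row.get(endpoint) or "").strip() == "1":
                genes.add(gene)
    return targets


# Toy hit-call table (placeholder identifiers and values).
TOY_CSV = """chemical,endpoint_1,endpoint_2,endpoint_3
chem_X,1,0,1
chem_Y,0,0,NA
chem_Z,1,1,0
"""

# Hypothetical map containing only the endpoints regarded as human-specific.
ENDPOINT_TO_GENE = {
    "endpoint_1": "GENE_A",
    "endpoint_2": "GENE_B",
    "endpoint_3": "GENE_A",
}

targets = targets_from_hit_calls(TOY_CSV, ENDPOINT_TO_GENE)
for chemical, genes in targets.items():
    print(chemical, sorted(genes))
```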
Of the 686 potential EDCs in DEDuCT 1.0, we found target genes for 383 EDCs
based on 1228 ToxCast assay component endpoints specific to humans. Supplementary
Table S2.11 gives the target genes of these 383 EDCs based on ToxCast assay component
endpoints specific to humans [35]. We remark that it is possible to expand this information
on target genes of EDCs using toxicological databases such as CTD [30]; however, CTD
compiles target information from both experiments and computational predictions.
To reveal the target similarity between EDCs, we next investigated the target similarity
network (TSN) of EDCs. For the 383 EDCs with information on target genes from Tox-
Cast assays, we have constructed a target similarity network (TSN) based on shared target
genes between pairs of EDCs. In the TSN, nodes are EDCs and edge weights signify the
target similarity between pairs of EDCs. To quantify the similarity between two sets of
target genes corresponding to a pair of EDCs, we use the standard measure, Jaccard in-
dex [207], given by the ratio of the number of elements in the intersection over the number
of elements in the union of the two sets of target genes. By construction, Jaccard index is
in the range 0 to 1. Jaccard index between two EDCs is 0 if they have no target genes in
common, and it is 1 if they have all target genes in common.
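The Jaccard index and the resulting TSN edge list can be sketched as below. The gene sets are toy placeholders and the function names are hypothetical; treating two empty target sets as having index 0 is a convention chosen here, since the definition above leaves that case open.

```python
from itertools import combinations


def jaccard_index(targets_a, targets_b):
    """Size of the intersection divided by the size of the union of two target sets."""
    a, b = set(targets_a), set(targets_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def tsn_edges(targets_by_chemical):
    """Weighted edges (chem_1, chem_2, Jaccard index) for all pairs with shared targets."""
    edges = []
    for chem_1, chem_2 in combinations(sorted(targets_by_chemical), 2):
        weight = jaccard_index(targets_by_chemical[chem_1], targets_by_chemical[chem_2])
        if weight > 0.0:
            edges.append((chem_1, chem_2, weight))
    return edges


# Toy target sets (placeholder chemical and gene names).
toy_targets = {
    "chem_X": {"G1", "G2", "G3"},
    "chem_Y": {"G2", "G3", "G4"},
    "chem_Z": {"G5"},
}

print(jaccard_index(toy_targets["chem_X"], toy_targets["chem_Y"]))
print(tsn_edges(toy_targets))
```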
To visualize the high similarity backbone of the TSN, we decided to omit edges with
weights below a chosen Jaccard index value signifying poor target similarity between
pairs of EDCs. Rather than choosing an arbitrary Jaccard index value to construct this
high TSN, we have investigated the size of the LCC of the TSN as a function of the
increasing Jaccard index value for omitting edges (Figure 2.10). Based on this investi-
gation, we find that there is a sharp decrease in the size of the LCC of the TSN obtained
after omitting edges below the Jaccard index of 0.517 (Figure 2.10). Subsequently, we
used this threshold Jaccard index of 0.517 to construct the high TSN of the 383 EDCs
(Figure 2.11; Supplementary Table S2.12).
Figure 2.10: The size of the largest connected component (LCC) of the target similarity network
(TSN) of EDCs as a function of the increasing Jaccard index for omitting edges.
In Figure 2.11, it is seen that the high TSN has an LCC of 199 EDCs, 13 smaller
components of 2 to 6 EDCs and 145 isolated EDCs. We have also employed Louvain
modularity [204] to partition the LCC of the high TSN into 6 modules (Figure 2.11). The
sizes of nodes in the high TSN reflect the weighted degree of EDCs, and the top 2 hubs are
well-known EDCs, o,p’-DDT (CID:13089) and 4-Octylphenol (CID:15730), that belong
to the largest module within the LCC of the high TSN (Figure 2.11). Based on the TSN
constructed using limited information on target genes from ToxCast assays, we conclude
that EDCs can have very different sets of target genes [35].
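The weighted degree used to size nodes and rank hubs is the sum of the weights of the edges incident on a node. A minimal sketch on a toy edge list is given below; the function names and the edge weights are placeholders, and breaking ties alphabetically is a choice made here.

```python
from collections import defaultdict


def weighted_degree(weighted_edges):
    """Sum of incident edge weights for every node appearing in the edge list."""
    degree = defaultdict(float)
    for u, v, weight in weighted_edges:
        degree[u] += weight
        degree[v] += weight
    return dict(degree)


def top_hubs(weighted_edges, k=2):
    """The k nodes with the highest weighted degree (ties broken alphabetically)."""
    degree = weighted_degree(weighted_edges)
    return sorted(degree, key=lambda node: (-degree[node], node))[:k]


# Toy network; weights stand in for Jaccard indices above the chosen threshold.
toy_edges = [("a", "b", 0.9), ("a", "c", 0.6), ("b", "c", 0.7), ("c", "d", 0.55)]
print(top_hubs(toy_edges, k=2))
```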
Figure 2.11: Network visualization of high target similarity network (TSN) of 383 EDCs. The
high TSN was constructed for 383 EDCs which have information on their target genes from Tox-
Cast assays. The legend at the bottom of this figure gives the colour code for nodes or EDCs
in TSN which is based on the 7 systems-level perturbations, namely, Reproductive (RT), De-
velopmental (DT), Metabolic (MT), Immunological (IT), Neurological (NT), Hepatic (HT) and
Endocrine-mediated cancer (CT), associated with EDCs in DEDuCT 1.0. Note that if an EDC
is associated with more than one systems-level perturbation then its colour is given by Multiple.
Moreover, the sizes of the nodes in the high TSN reflect their weighted degree in the network
and the thicknesses of the edges in the high TSN reflect their weights given by Jaccard index.
In addition, we have labelled the top 2 hubs, namely, o,p’-DDT (CID:13089) and 4-Octylphenol
(CID:15730), based on the weighted degree of nodes in this network.
2.5 Lack of correlation between chemical structure and
[Figure 2.12 panel annotations recovered from the figure: (A) R = 0.17; (B) R = 0.16; (C) R = 0.08; (D) R = 0.17; (E) R = 0.16; (F) R = 0.08.]
Figure 2.12: Scatter plots of target similarity versus chemical structure similarity between pairs
of EDCs. In this figure, we explore six combinations of two similarity metrics and three molecular
fingerprints to compute the chemical similarity between pairs of EDCs. (A) Tanimoto coefficient
with ECFP4 fingerprints. (B) Tanimoto coefficient with MACCS keys fingerprints. (C) Tanimoto
coefficient with Daylight-like (DLL) fingerprints. (D) Dice coefficient with ECFP4 fingerprints.
(E) Dice coefficient with MACCS keys fingerprints. (F) Dice coefficient with Daylight-like (DLL)
fingerprints. In each figure, we report the Pearson correlation coefficient R between structural and
target similarity of EDCs. Regardless of the choice of metric to compute the chemical similarity,
we find no significant correlation between the structural and target similarity of EDCs.
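The Pearson correlation coefficient R reported in each panel can be computed from paired lists of structural and target similarity values, one entry per pair of EDCs. The sketch below is a plain implementation of the standard formula with toy inputs; it is not the analysis code used for the figure, and the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


# Toy similarity values for four pairs of chemicals (placeholders).
structure_similarity = [0.1, 0.2, 0.3, 0.4]
target_similarity = [0.2, 0.4, 0.6, 0.8]
print(pearson_r(structure_similarity, target_similarity))
```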
2.6 Evaluation of the sensitivity of toxicity predictors us-
In DEDuCT 1.0, 157 EDCs have experimental evidence to cause hepatic endocrine-
mediated perturbations. Among the toxicity predictors, admetSAR 2.0, pkCSM and vNN
server can predict the hepatotoxicity of chemicals. Of these 157 EDCs, admetSAR 2.0,
pkCSM and vNN server gave correct prediction for 60, 23 and 41 EDCs, respectively.
Thus, the sensitivities of admetSAR 2.0, pkCSM and vNN server for predicting the hepatotoxicity
of EDCs are 0.382, 0.146 and 0.261, respectively, based on our dataset.
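Here sensitivity is the fraction of experimentally supported positives that a tool predicts as positive. The short sketch below reproduces the three values from the counts given above; the function name is hypothetical.

```python
def sensitivity(correctly_predicted, total_positives):
    """Fraction of known positives that were predicted as positive."""
    if total_positives <= 0:
        raise ValueError("total_positives must be positive")
    return correctly_predicted / total_positives


# Counts reported in the text: 157 EDCs with hepatic evidence.
HEPATIC_EDCS = 157
for tool, correct in [("admetSAR 2.0", 60), ("pkCSM", 23), ("vNN server", 41)]:
    print(tool, f"{sensitivity(correct, HEPATIC_EDCS):.3f}")
```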
admetSAR 2.0 predicted 127 out of the 185 EDCs with experimental evidence to
cause cancer in DEDuCT 1.0 to be non-carcinogens, and we have compared these 127
EDCs with the potential carcinogens released by the International Agency for Research
on Cancer (IARC) Monographs [208, 209] and the Report on Carcinogens (RoC) by the
National Toxicology Program [210]. Based on this comparison, we found 9 of the 127
EDCs predicted as non-carcinogens by admetSAR 2.0 were listed as potential carcinogens
in IARC Monographs and RoC. Notably, 3 of the 127 EDCs, namely, benzo[a]pyrene,
diethylstilbestrol and pentachlorophenol are categorized as group 1 potential carcinogens
for humans by IARC Monographs.
Overall, this evaluation of the computational toxicity tools for prediction of hepa-
totoxicity and carcinogenicity of EDCs based on the compiled experimental evidence in
DEDuCT 1.0 suggests a lack of significant predictive power. A possible interim solution
towards increasing the predictive power of the existing tools will be to update their positive
training dataset with experimental information on EDCs from DEDuCT [35].
2.7 Discussion
EDCs are a group of chemicals of emerging concern which are omnipresent in our
environment. Since endocrine disruption is a special form of toxicity, the
risk assessment and identification of EDCs remain challenging [8]. In this chapter,
we have developed a detailed workflow which was employed to identify 686 potential
EDCs from 1796 research articles with supporting evidence for endocrine disruption
from published experiments in humans or rodents. Further, we have compiled, unified
and standardized the observed adverse effects upon EDC exposure in published experi-
ments into 514 unique endocrine-mediated endpoints which were further classified into 7
systems-level perturbations. DEDuCT 1.0 compiles additional information including the
dosage information, environmental source classification, classification based on support-
ing evidence, chemical structure, physicochemical properties, predicted ADMET proper-
ties, and target genes for the 686 potential EDCs, and this information is accessible at:
https://cb.imsc.res.in/deduct/ (Figure 2.13).
[Figure: graphical overview of the DEDuCT workflow. Literature mining (PubMed query, WHO
report, TEDX and EDCs Databank) proceeds through four stages (STAGE-1 to STAGE-4) to a final
list of 686 EDCs with supporting evidence from 1796 articles. The overview also shows the
network-centric analysis (Chemical Similarity Network and Target Similarity Network) and a bar
chart of the number of EDCs per systems-level perturbation (IT, DT, HT, CT, NT, MT, RT).]
[Figure: overview of DEDuCT (https://cb.imsc.res.in/deduct/). For the curated list of EDCs,
the database provides classification based on the type of supporting evidence, on environmental
source (7 broad categories, 48 sub-categories) and on chemical properties, along with
additional information: 2D and 3D chemical structure, physicochemical properties, molecular
descriptors, predicted ADMET properties and experimentally inferred target genes.]
An important aspect of EDCs is their ability to exert adverse effects even at low
dosage values [168-170]. Our compilation of dosage information at which
endocrine-mediated endpoints were observed in published experiments upon individual EDC
exposure will further help researchers to understand the low dose exposure effects of EDCs.
Also, our large-scale compilation of the observed effects or endpoints along with the
systems-level perturbations upon EDC exposure can be visualized as a tripartite network
with nodes as EDCs, endocrine-mediated endpoints and systems-level perturbations. Fu-
ture exploration of this tripartite network will enhance systems-level understanding of
perturbed biological pathways upon EDC exposure.
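The tripartite structure described above can be sketched with two edge sets, one linking EDCs to endpoints and one linking endpoints to systems-level perturbations. This is a minimal illustration and not the DEDuCT implementation: the chemical names `EDC_A` and `EDC_B` are hypothetical placeholders, and the function names `build_tripartite` and `perturbations_of` are assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical records of the form (EDC, observed endpoint, systems-level perturbation).
# The chemical names are placeholders; the endpoint names are examples of
# reproductive (RT) and metabolic (MT) endocrine-mediated endpoints.
records = [
    ("EDC_A", "Reduced sperm counts", "RT"),
    ("EDC_A", "Decrease in T4 levels", "MT"),
    ("EDC_B", "Decrease in T4 levels", "MT"),
    ("EDC_B", "Elevated insulin levels", "MT"),
]

def build_tripartite(records):
    """Return the two edge sets of an EDC-endpoint-perturbation network."""
    edc_to_endpoint = defaultdict(set)
    endpoint_to_perturbation = defaultdict(set)
    for edc, endpoint, perturbation in records:
        edc_to_endpoint[edc].add(endpoint)
        endpoint_to_perturbation[endpoint].add(perturbation)
    return edc_to_endpoint, endpoint_to_perturbation

def perturbations_of(edc, edc_to_endpoint, endpoint_to_perturbation):
    """Walk EDC -> endpoint -> perturbation and collect the perturbations reached."""
    reached = set()
    for endpoint in edc_to_endpoint.get(edc, ()):
        reached |= endpoint_to_perturbation.get(endpoint, set())
    return reached

edc_to_endpoint, endpoint_to_perturbation = build_tripartite(records)
```

Calling `perturbations_of("EDC_A", edc_to_endpoint, endpoint_to_perturbation)` follows both layers of edges, which is the kind of traversal a systems-level exploration of the network would build on.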
After publication [35], DEDuCT has received coverage in national and international
media including India Science Wire, Chemical & Engineering News (C&EN) [211] of
the American Chemical Society, Hindustan Times, Chemical Watch, and the European Trade
Union Institute. Importantly, DEDuCT has been well received by scientific peers. Notably,
the French Agency for Food, Environmental and Occupational Health & Safety
(ANSES) has drawn up a list of substances to be included in its assessment
program as part of the Second French National Endocrine Disruptor Strategy (SNPE 2).
To draw up this list of priority substances, ANSES used DEDuCT 1.0 as one of its
primary resources after assessing 27 existing initiatives on EDCs worldwide. According
to the ANSES report [212], the robust approach followed in DEDuCT 1.0 to identify
EDCs meets the SNPE 2 criteria for the inclusion of priority substances. In sum, DEDuCT
is an important resource on EDCs that can support the development of safer consumer products.
Supplementary Information
Supplementary Tables S2.1-S2.12 associated with this chapter are available for download
from the GitHub repository:
https://github.com/asamallab/PhDThesis-Janani_R/blob/main/SI/ST_Chapter2.xlsx.
| Feature | DEDuCT 1.0 | EDCs Databank | TEDX | WHO report |
|---|---|---|---|---|
| Number of EDCs | 686 | 615 | 1428 | 184 |
| Web interface | Yes | Yes | Yes | No |
| Compilation of endocrine-mediated endpoints for EDCs from published experiments on endocrine disruption in humans or rodents | Yes | No | No | No |
| Dosage information specific to endocrine-mediated endpoints for EDCs from published experiments on endocrine disruption in humans or rodents | Yes | No | No | No |
| Systems-level perturbations for EDCs based on observed endocrine-mediated endpoints in published experiments on endocrine disruption in humans or rodents | Yes | No | No | No |
| Categorization of EDCs based on the type of supporting evidence | Yes | No | No | No |
| Categorization of EDCs based on environmental source | Yes | No | No | No |
| Categorization of EDCs based on their use | Yes | Yes | Yes | No |
| Chemical classification of EDCs | Yes | No | No | No |
| Availability of 2D structure for EDCs | Yes | Yes | No | No |
| Availability of 3D structure for EDCs | Yes | Yes | No | No |
| Downloadable formats for 2D and 3D structure of EDCs | SDF, MOL2, PDB, PDBQT | SDF, MOL2, PDB, PDBQT | No | No |
| Chemical identifiers of EDCs | PubChem or CAS | PubChem or CAS | CAS | No |
| Physicochemical properties of EDCs | Yes | Yes | No | No |
| Molecular descriptors for EDCs | Yes | No | No | No |
| Predicted ADMET properties of EDCs | Yes | No | No | No |
| Chemical-gene association based on experimental assays | Yes | No | No | No |
| Chemical similarity filter | Yes | Yes | No | No |

Table 2.1: Comparison of the information on EDCs in DEDuCT with three existing resources,
namely, EDCs Databank, TEDX and WHO report.
Chapter 3
Due to the hazardous potential of EDCs, their adverse health effects on humans and
wildlife have been studied for more than three decades, and this information is docu-
mented in scientific literature, including published research articles, toxicological reports,
and regulatory guidelines [8, 213]. Despite the increasing research interest, several limi-
tations and uncertainties challenge the risk assessment and regulation of EDCs [3, 213].
Importantly, a standard (consensus) definition for EDCs can dictate the evidence needed
for their identification among environmental chemicals [3, 43, 45].
In this direction, several definitions have been proposed and adopted by various
regulatory agencies. However, clarity and standardization are yet to be achieved in EDC
research [3]. This is also reflected in a recent comprehensive study commissioned by the
European Parliament on endocrine disruptors and the current EU regulations on the sub-
ject [156]. In particular, the report found gaps in the definition of EDCs, test requirements
and guidelines for authorization of products in a number of categories such as cosmetics,
drinking water and workers’ regulations [156]. Another challenge to the regulation of
EDCs is the wide range of factors to be considered in developing risk assessment criteria.
In addition to defining the adverse effects, factors such as source and dosage of exposure
need to be considered, all of which are aspects studied and documented in peer-reviewed
articles in scientific journals. However, it is unknown to what extent this scientific litera-
ture is consulted during the development of risk assessment criteria and testing standards
for EDCs. In fact, toxicity test guidelines have received criticism for having omitted
several relevant endpoints which are captured in academic research [214].
The above-mentioned two observations, namely, the growth in the volume of scientific
knowledge surrounding EDCs, and the perceived presence of gaps in the risk assessment
and regulation of EDCs, have prompted the comparative analysis reported in this
chapter. Here, we explore how academic research leading to curated knowledgebases
can inform current chemical regulations on EDCs. To this end, we present
an updated knowledgebase, DEDuCT 2.0, and thereafter study the distribution
of potential EDCs across several chemical lists that reflect guidelines for use
or regulations [36]. The work reported in this chapter is contained in the published
manuscript [36].
To create DEDuCT 1.0 [35], we mined and curated more than 16000 research articles
published until February 2018 to obtain a corpus of 1796 articles containing
supporting experimental evidence specific to humans or rodents for 686 potential EDCs.
An analysis of this corpus found that the number of articles with supporting evidence
on potential EDCs has significantly increased over the last three decades
(Figure 3.1A) [36]. The continuous growth of literature on EDCs (Figure 3.1A) and
community interest in DEDuCT 1.0 [211] motivated a substantial update of our
knowledgebase to include scientific literature published until January 2020.
Here, we have built an updated knowledgebase, DEDuCT version 2.0, with information
on 792 potential EDCs with supporting experimental evidence from 2218 published
research articles (Supplementary Tables S3.1-S3.2). To build DEDuCT 2.0, we mined
and curated an additional 3396 research articles on EDCs published until January 2020.
Essentially, we followed the four-stage workflow used to create DEDuCT 1.0 [35], as
described in Chapter 2 (Figure 3.2). The compiled information on the 792 potential
EDCs, including supporting literature, systems-level perturbations, observed
endocrine-mediated endpoints and corresponding dosage information, is
accessible via the DEDuCT 2.0 webserver at: https://cb.imsc.res.in/deduct [35, 36].
A chronological analysis of the corpus of 2218 published articles which form the
supporting evidence for 792 potential EDCs in DEDuCT 2.0 finds that there are 1181
articles published in the period 2011-2020, followed by 696 articles in the period 2001-
2010, followed by 192 articles in the period 1991-2000 (Figure 3.1A). We remark that
the corpus of 2218 research articles in DEDuCT 2.0 is likely to be a lower estimate of
the accumulated scientific knowledge to date on EDCs; nevertheless, it is evident from
Figure 3.1A that there has been significant growth in research on EDCs in the past three
decades.
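The chronological binning behind such an analysis can be sketched as follows. This is a minimal illustration under stated assumptions: the publication years in `example_years` are hypothetical and not taken from DEDuCT, and the helper names `period_label` and `count_by_period` are introduced here for illustration only.

```python
from collections import Counter

def period_label(year, width=10, origin=1991):
    """Map a publication year to a period label such as '2001-2010'."""
    start = origin + ((year - origin) // width) * width
    return f"{start}-{start + width - 1}"

def count_by_period(years):
    """Count articles per period from an iterable of publication years."""
    return Counter(period_label(y) for y in years)

# Hypothetical publication years, one per article (not taken from DEDuCT).
example_years = [1989, 1995, 2000, 2001, 2010, 2011, 2019, 2020]
counts = count_by_period(example_years)
```

With the counts reported above, the three periods account for 1181 + 696 + 192 = 2069 of the 2218 articles, so the remaining 149 articles fall outside these three periods.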
Figure 3.1: (A) A chronological analysis of the corpus of 2218 published articles
which form the supporting evidence for 792 potential EDCs in DEDuCT 2.0. (B) A plot of the
number of new EDCs identified in published literature per year based on information compiled
in DEDuCT 2.0. (C) Evidence for seven different systems-level perturbations from published
experiments across 792 potential EDCs compiled in DEDuCT 2.0. (D) Comparison of the list of
EDCs captured in DEDuCT 2.0 with three other resources. From the UpSetR plot, it is seen that
242 out of 792 potential EDCs in DEDuCT 2.0 are not captured in any other resource.
[Figure 3.1 axes: (A) Year against number of research articles on EDCs; (B) Year against
number of new EDCs in DEDuCT 2.0 per year; (C) systems-level endocrine-mediated perturbations
(IT, DT, HT, CT, NT, MT, RT) against number of EDCs in DEDuCT 2.0; (D) set size and
intersection size for DEDuCT 2.0, WHO report, TEDX (v.2015) and EDCs Databank.]
Figure 3.2: Detailed workflow for the compilation of potential EDCs and creation of the updated
knowledgebase DEDuCT 2.0.
In addition, we leverage the 792 potential EDCs, along with the associated supporting
literature of 2218 research articles, to study the identification of new EDCs in the past
decades. In Figure 3.1B, we show the number of new EDCs reported in published
literature over the last 70 years. For this analysis, we consider a potential EDC captured in
DEDuCT 2.0 to be identified for the first time in a particular year if the earliest supporting
experimental evidence for that EDC is from a research article published in that year. From
Figure 3.1B, it is seen that the number of new EDCs identified in the scientific literature
has, on average, increased gradually over the past decades. These observations also
align with the observed growth in scientific literature on EDCs [36].
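The first-identification rule used for Figure 3.1B can be sketched as follows. This is a minimal illustration, assuming the supporting literature is available as (EDC, publication year) pairs, one per article; the pairs in `evidence` are hypothetical, and the function names are introduced here for illustration.

```python
from collections import Counter

def first_report_year(evidence):
    """Return {EDC: earliest publication year} from (EDC, year) evidence pairs."""
    earliest = {}
    for edc, year in evidence:
        if edc not in earliest or year < earliest[edc]:
            earliest[edc] = year
    return earliest

def new_edcs_per_year(evidence):
    """Count the EDCs whose earliest supporting article appeared in each year."""
    return Counter(first_report_year(evidence).values())

# Hypothetical (EDC, publication year) pairs; one pair per supporting article.
evidence = [
    ("EDC_A", 2005), ("EDC_A", 1998), ("EDC_B", 1998),
    ("EDC_C", 2012), ("EDC_C", 2015),
]
per_year = new_edcs_per_year(evidence)
```

Each EDC contributes to exactly one year, so the yearly counts sum to the number of EDCs in the knowledgebase.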
| Systems-level perturbation | Number of endpoints | Example endpoints |
|---|---|---|
| Reproductive endocrine-mediated perturbations (RT) | 323 | Reduced sperm counts; Affects testicular morphology; Affects germ cell differentiation |
| Developmental endocrine-mediated perturbations (DT) | 166 | Affects embryonic development; Affects skeletal development in fetus; Affects placental development |
| Metabolic endocrine-mediated perturbations (MT) | 145 | Affects xenobiotic metabolism; Elevated insulin levels; Decrease in T4 levels; Lead to obesity |
| Neurological endocrine-mediated perturbations (NT) | 83 | Affects neuronal density; Increase in corticosterone levels; Decreased dopamine levels; Affects social behavior |
| Immunological endocrine-mediated perturbations (IT) | 36 | Atrophy of spleen; Thymus atrophy; Alterations in immune responses |
| Hepatic endocrine-mediated perturbations (HT) | 36 | Oxidative stress in liver; Affects hematopoiesis of liver; Increased liver weights |
| Endocrine-mediated cancers (CT) | 19 | Cancer phenotype; Adenocarcinoma; Induce cancer metastasis |

[Figure 3.3 organ labels: Hypothalamus, Pituitary gland, Thyroid gland, Thymus gland, Liver,
Adrenal glands, Pancreas, Ovary, Testis.]
Figure 3.3: Schematic figure depicting the classification of the 609 endocrine-mediated endpoints
into 7 systems-level perturbations in DEDuCT 2.0.
A unique feature of our resource on EDCs, DEDuCT 2.0, is the compilation of 609
unique observed endocrine-mediated endpoints and their classification into 7
systems-level perturbations from supporting literature (Figure 3.3) [35]. We have also
studied the available evidence for each of the 7 systems-level perturbations across the
792 potential EDCs in DEDuCT 2.0 (Figure 3.1C). Of the 792 potential EDCs in DEDuCT 2.0,
616 EDCs have evidence for reproductive endocrine-mediated perturbations, 369 EDCs
for metabolic perturbations and 251 EDCs for neurological perturbations (Figures 3.1C
and 3.3). This suggests that reproductive effects, followed by metabolic effects, have
been the main focus of the scientific investigations on EDCs [36].
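The counts above can be expressed as shares of the 792 potential EDCs. The sketch below is illustrative only: the mapping `edc_perturbations` is a hypothetical stand-in for the curated data, the function name `edcs_per_perturbation` is an assumption, and only the three counts quoted in the text are used for the percentages.

```python
from collections import Counter

def edcs_per_perturbation(edc_perturbations):
    """Count how many EDCs have evidence for each systems-level perturbation."""
    counts = Counter()
    for perturbations in edc_perturbations.values():
        counts.update(set(perturbations))  # each EDC counts once per perturbation
    return counts

# Hypothetical mapping from EDC to the perturbations it has evidence for.
edc_perturbations = {
    "EDC_A": ["RT", "MT"],
    "EDC_B": ["RT", "RT"],
    "EDC_C": ["NT"],
}
example_counts = edcs_per_perturbation(edc_perturbations)

# Counts quoted in the text, expressed as a percentage of the 792 potential EDCs.
TOTAL_EDCS = 792
reported = {"RT": 616, "MT": 369, "NT": 251}
shares = {code: round(100 * n / TOTAL_EDCS, 1) for code, n in reported.items()}
```

Because one EDC can have evidence for several perturbations, these shares (about 77.8%, 46.6% and 31.7%) are not expected to sum to 100%.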
Figure 3.4: Classification of the 792 potential EDCs in DEDuCT 2.0 into 7 broad categories and
48 sub-categories based on their source in the environment. In this figure, the number of EDCs in
DEDuCT 2.0 contained in each category or sub-category is reported within parentheses.
[Figure 3.4 category labels: Agricultural and farming (349), Consumer products (388),
Industry (366), Intermediates (140), Natural sources (39).]
Since DEDuCT compiles potential EDCs with supporting evidence specific to humans
or rodents [35], we also considered three other resources on EDCs, namely, the
WHO report [8], TEDX and the EDCs Databank [48], for the subsequent analysis. Figure
3.1D gives an overview of unique and overlapping EDCs across the four resources.
Specifically, 242 EDCs in DEDuCT 2.0 are not captured in any of the other three
resources. In subsequent sections, we compare chemical lists pertaining to guidelines or
regulations with the union of EDCs across these four resources, which adds up to 1856
potential EDCs (Figure 3.1D) [36].
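The comparison across resources reduces to set operations once every chemical is keyed by a single standardized identifier, which is an assumption of this sketch. The identifiers `c1` to `c7` below are hypothetical, and the function name `union_and_unique` is introduced here for illustration.

```python
def union_and_unique(resources):
    """Return (union of all sets, {name: members found in no other set})."""
    union = set().union(*resources.values())
    unique = {}
    for name, members in resources.items():
        others = set().union(*(m for n, m in resources.items() if n != name))
        unique[name] = members - others
    return union, unique

# Hypothetical chemical identifiers for each of the four resources.
resources = {
    "DEDuCT 2.0": {"c1", "c2", "c3", "c4"},
    "WHO report": {"c2", "c5"},
    "TEDX": {"c2", "c3", "c6"},
    "EDCs Databank": {"c3", "c7"},
}
union, unique = union_and_unique(resources)
```

In this toy example the union holds 7 chemicals and 2 of them are unique to the first resource; the same two quantities correspond to the 1856 and 242 EDCs reported in the text.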
Figure 3.5: Classification of the 792 EDCs in DEDuCT 2.0 into chemical kingdoms and chemical
super-classes using ClassyFire. Of the 792 EDCs, 746 are organic and 46 are inorganic
compounds. The 746 organic EDCs can be further classified into 19 super-classes while the 46
inorganic EDCs fall into 3 super-classes. The number of EDCs in each super-class is reported
within parentheses.
In addition to experimental evidence, DEDuCT 2.0 also compiles diverse information for
the 792 potential EDCs including 2D and 3D chemical structure, physicochemical
properties, predicted ADMET properties, molecular descriptors, and experimentally inferred
target genes from the ToxCast database (August 2019 version) [215]. We also provide a
classification of the potential EDCs based on their environmental source into 7 broad
categories and 48 sub-categories (Figure 3.4), and a hierarchical classification of the 792
potential EDCs based on their chemical structure information using ClassyFire [174] (Figure
3.5). Moreover, the final list of 792 potential EDCs was classified into 4 categories (I-IV)
based on the type of supporting evidence for endocrine disruption in published
experiments specific to humans or rodents (Supplementary Table S3.2). All the compiled
information in DEDuCT 2.0 is accessible at: https://cb.imsc.res.in/deduct/ [35, 36].
In sum, the expanded list of potential EDCs in DEDuCT 2.0 can assist academia, industry,
and regulatory agencies in developing safer consumer products.
3.2 Compilation of chemical lists that are a part of inven-
Apart from the broad classification into ‘Substances in use (SIU)’ or ‘Substances of
concern (SOC)’, we have also organized the 36 chemical lists into 9 categories based on
the recent report commissioned by the European Parliament [156]. These 9 categories
are Plant protection products, Cosmetics and household products, Food additives and
Food contact materials, Biocides, Medicines and Medical devices, REACH chemicals,
Environment and Water Quality, Workers’ regulations, and Miscellaneous (Figure 3.6;
Supplementary Table S3.3). Note that we were able to find both SIU and SOC lists in
public resources for only 3 of these 9 categories (Figure 3.6; Supplementary Table S3.3).
[Figure 3.6 labels: Substances in use (SIU); Substances of concern (SOC); categories Biocides,
Cosmetics and household products, Environment and Water Quality, Food additives and Food
contact materials, Medicines and Medical devices, Miscellaneous, Plant protection products,
REACH chemicals. The chemical lists are given below, with the number of chemicals in
parentheses where it is shown beside the list.]
L1 Active ingredients allowed in minimum risk pesticide products
L2 IFRA transparency list (3343)
L3 EU list of colorants allowed in cosmetic products (157)
L4 EU list of preservatives allowed in cosmetic products (120)
L5 EU list of UV filters allowed in cosmetic products (27)
L6 Consumer product ingredient database (2037)
L7 Substances added to food (EAFUS) (2612)
L8 FooDB (16341)
L9 The Joint FAO/WHO Expert Committee on Food Additives (JECFA) list (3049)
L10 EU food flavorings database (2446)
L11 EU plastic food packaging materials (683)
L12 Pew list of food additives (6800)
L13 ESCO list of non-plastic food contact materials (1527)
L14 ECHA biocidal products (230)
L15 US FDA inactive ingredient list (789)
L16 Production of major chemicals year-wise in India (77)
L17 US EPA safer chemical ingredients list (978)
L18 List of banned pesticides in India
L19 List of banned and restricted pesticide products in China
L20 EU list of substances prohibited in cosmetic products (1933)
L21 Restricted substances under REACH
L22 SVHC under REACH
L23 NPI Australia (83)
L24 Singapore list of controlled hazardous substances (297)
L25 Ozone-depleting substances in India (79)
L26 EWG tap water database (246)
L27 Human Indoor Exposome database (477)
L28 US OSHA list
L29 SIN List (927)
L30 Toxic chemicals restricted to be imported or exported in China (146)
L31 IARC monographs on carcinogens (869)
L32 Schedule 1 hazardous chemical list in India (386)
L33 Schedule 3 hazardous chemical list in India
L34 NZ EPA priority chemical list
L35 ECHA list of chemicals in Annex I
L36 PACSs list Japan
Figure 3.6: Sankey plot showing the classification of 36 chemical lists that are
part of inventories, guidelines and regulations obtained from public resources. The 36 chemical
lists were broadly classified into two categories, namely, ‘Substances in use (SIU)’ and
‘Substances of concern (SOC)’. Based on chemical use or environmental source, the 36 chemical
lists are further organized into 9 categories, namely, Plant protection products, Cosmetics and
household products, Food additives and Food contact materials, Biocides, Medicines and Medical
devices, REACH chemicals, Environment and Water Quality, Workers’ regulations, and
Miscellaneous. In this figure, the number of chemicals in each list is reported in parentheses
beside each list.
A list is considered an SIU list if it fulfills one of the following criteria: (a) It is an inventory
of substances generally found to be in use in a certain product category; (b) It is a part
of a guideline document, issued either by a government agency or an independent body,
for safer product formulation; (c) It is a list of substances permitted for use in a certain
product category by a regulatory authority. Note that although the inventories, regulations
and guidelines from which the 17 SIU lists were compiled may have followed their own
criteria to define the specific chemical lists, it is evident that the chemicals captured in
these 17 SIU lists are in use in various consumer and industrial products.
Further, the 17 SIU lists were classified into 6 categories, namely, Plant protection
products, Cosmetics and household products, Food additives and Food contact materials,
Biocides, Medicines and Medical devices, and Miscellaneous (Figure 3.6; Supplementary
Table S3.3). Of the 17 SIU lists, the category ‘Food additives and Food contact materials’
has the maximum number of chemical lists (L7-L13), while ‘Plant protection products’,
‘Biocides’, and ‘Medicines and Medical devices’ each contain only one chemical list.
Five SIU lists (L2-L6) fall under the ‘Cosmetics and household products’
category. Two lists, namely, ‘L16 - Production of major chemicals year-wise in India’ and
‘L17 - US EPA safer chemical ingredients list’, were categorized as ‘Miscellaneous’.
An example of an SIU list is ‘L7 - Substances added to food (EAFUS)’, which is an
inventory developed by the US Food and Drug Administration (FDA); this list was
previously known as Everything Added to Foods in the United States (EAFUS) (Figure
3.6; Supplementary Table S3.3). The L7 list contains 2612 unique chemicals which are
used as food additives, color additives and other substances approved for specific use in
food by the US FDA (Figure 3.6; Supplementary Table S3.3).
3.2.2 Substances of concern (SOC) lists
A list is considered a SOC list if it fulfills one of the following criteria: (a) It is an in-
ventory of substances considered toxic, published either by a government agency or an
independent body; (b) It is a list of substances monitored, restricted or banned for import,
export or manufacture by a regulatory authority, due to their hazard potential. Following
the above criteria, we have compiled 19 SOC lists that are a part of chemical inventories,
regulations or guidelines.
The SOC lists were further divided into 6 categories, namely, Plant protection
products, Cosmetics and household products, REACH chemicals, Environment and
Water Quality, Workers’ regulations, and Miscellaneous. Of these 6 categories, REACH
chemicals, Environment and Water Quality, and Workers’ regulations were specific to
SOC lists. The ‘Plant protection products’ category has two lists (L18-L19) specific to
banned or restricted pesticidal substances. The categories ‘Cosmetics and household
products’ and ‘Workers’ regulations’ each comprise only one chemical list, containing the
substances that are prohibited in cosmetic products (L20) and the substances with
potential occupational hazards (L28), respectively. Two lists, namely, ‘L21 - Restricted
substances under REACH’ and ‘L22 - SVHC under REACH’, were categorized as ‘REACH
chemicals’. The category ‘Environment and Water Quality’ includes five lists (L23-L27)
of substances that are monitored by environmental agencies across
different countries. Of the 19 SOC lists, 8 chemical lists that identify substances of
potential hazard were categorized as ‘Miscellaneous’.
An example of an SOC list is ‘L24 - Singapore list of controlled hazardous
substances’, which is a chemical regulatory list compiled under Schedule 2 of the
Environmental Protection and Management Act of Singapore (Figure 3.6; Supplementary
Table S3.3). The L24 list contains 297 hazardous substances (Figure 3.6; Supplementary
Table S3.3).
3.3 Exploration of potential EDCs across chemical lists
lines
Following the compilation of potential EDCs from four resources and 36 chemical lists,
we performed a three-step systematic analysis to understand how potential EDCs are
distributed across SIU and SOC lists.
First, we looked for chemical overlap between the SIU and SOC lists. Upon
finding a large chemical overlap between these two classes, we split the chemicals from
the SIU and SOC lists into 3 groups (I-III). Group I consists of chemicals that are present
only in the 17 SIU lists, and not in any of the 19 SOC lists. Group II consists of
chemicals that are present in both the 17 SIU and the 19 SOC lists. Group III consists
of chemicals that are present only in the 19 SOC lists, and not in any of the 17 SIU lists. We
found 23483, 1139 and 3223 chemicals in groups I, II and III, respectively (Figure 3.7A).
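The split into groups I-III is a three-way set partition, sketched below. This is a minimal illustration: the identifiers in `siu_lists` and `soc_lists` are hypothetical stand-ins for the 17 SIU and 19 SOC lists, the function name `split_into_groups` is an assumption, and chemicals are assumed to be keyed by one standardized identifier.

```python
def split_into_groups(siu_lists, soc_lists):
    """Partition chemicals into group I (SIU only), II (both) and III (SOC only)."""
    siu = set().union(*siu_lists)
    soc = set().union(*soc_lists)
    return siu - soc, siu & soc, soc - siu

# Hypothetical chemical identifiers standing in for the 17 SIU and 19 SOC lists.
siu_lists = [{"a", "b", "c"}, {"c", "d"}]
soc_lists = [{"d", "e"}, {"e", "f"}]
group_1, group_2, group_3 = split_into_groups(siu_lists, soc_lists)

# Consistency check on the reported group sizes.
GROUP_I, GROUP_II, GROUP_III = 23483, 1139, 3223
siu_total = GROUP_I + GROUP_II                   # chemicals in at least one SIU list
soc_total = GROUP_II + GROUP_III                 # chemicals in at least one SOC list
all_chemicals = GROUP_I + GROUP_II + GROUP_III   # chemicals in any of the 36 lists
```

With the reported sizes, the SIU lists together contain 24622 chemicals and the SOC lists 4362 chemicals, for 27845 distinct chemicals across all 36 lists.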
Second, we compared the list of potential EDCs compiled from 4 resources, namely,
DEDuCT 2.0, the WHO report, TEDX and EDCs Databank, with the group I chemicals.
We refer to the list of potential EDCs in group I chemicals as group I EDCs or ‘EDCs in
use (EIU)’ (Figure 3.7A). A similar comparison also led to group II EDCs and group III
EDCs (Figure 3.7A). Based on the comparison, we find 242, 356 and 278 potential EDCs
in groups I, II and III, respectively (Figure 3.7A; Supplementary Table S3.4) [36]. Note
that group II which is the intersection of chemicals present in SIU and SOC lists, contains
more EDCs than groups I or III.
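To put these counts in proportion, the share of potential EDCs within each group can be computed from the numbers reported in the text. The sketch below only performs this arithmetic; the function name `edc_share_per_group` is introduced here for illustration.

```python
def edc_share_per_group(group_sizes, edc_counts):
    """Return the percentage of chemicals in each group that are potential EDCs."""
    return {g: round(100 * edc_counts[g] / group_sizes[g], 1) for g in group_sizes}

# Group sizes and EDC counts as reported in the text.
group_sizes = {"I": 23483, "II": 1139, "III": 3223}
edc_counts = {"I": 242, "II": 356, "III": 278}

shares = edc_share_per_group(group_sizes, edc_counts)
edcs_in_any_list = sum(edc_counts.values())            # groups are disjoint: 876
coverage = round(100 * edcs_in_any_list / 1856, 1)     # share of the 1856 potential EDCs
```

Under these figures, roughly 1% of group I, 31% of group II and 9% of group III chemicals are potential EDCs, and 876 of the 1856 potential EDCs (about 47%) occur in at least one of the 36 chemical lists.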
Third, we compared the EIU list with the list of High Production Volume (HPV)
chemicals to identify the potential EDCs in use which are produced or manufactured in
high volume. For this analysis, we compiled HPV chemicals from the union of two
resources, namely, the United States High Production Volume (USHPV) database and
the Organisation for Economic Co-operation and Development (OECD) High Production
Volume (OECD HPV) list, last updated in 2004. The OECD HPV list contains 4712
chemicals that are produced at levels above 1000 tonnes per year in at least one OECD member
country or region. The USHPV database compiles 4297 chemicals that are produced or
imported in the United States in quantities of 1 million pounds or more per year. A similar
comparison of the group II EDCs and group III EDCs was also performed with the HPV
chemicals.
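The two HPV definitions use different units, so it can help to convert them to a common scale before taking the union. The sketch below is illustrative: the identifier sets are hypothetical, the function names are assumptions introduced here, and the only external fact used is the definition of the international pound (0.45359237 kg).

```python
KG_PER_POUND = 0.45359237  # exact, by definition of the international pound

def pounds_to_tonnes(pounds):
    """Convert a mass in pounds to metric tonnes."""
    return pounds * KG_PER_POUND / 1000.0

USHPV_THRESHOLD_TONNES = pounds_to_tonnes(1_000_000)  # about 453.6 tonnes per year
OECD_THRESHOLD_TONNES = 1000.0

def hpv_union(ushpv, oecd_hpv):
    """Combine the two HPV resources into one set of HPV chemicals."""
    return set(ushpv) | set(oecd_hpv)

def produced_in_high_volume(edcs, hpv):
    """Return the potential EDCs that are also HPV chemicals."""
    return set(edcs) & hpv

# Hypothetical identifiers for illustration only.
hpv = hpv_union({"h1", "h2"}, {"h2", "h3"})
eiu_hpv = produced_in_high_volume({"h1", "x9"}, hpv)
```

The USHPV threshold of 1 million pounds corresponds to roughly 454 tonnes per year, which is lower than the OECD threshold of 1000 tonnes per year, so the union is the more inclusive definition of an HPV chemical.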
We designate the 242 potential EDCs among group I chemicals as EDCs in use (EIU)
(Figure 3.7A; Supplementary Table S3.4). These 242 EIU are distributed across 5 of
the 9 categories of chemical lists, and thus pose a high risk of exposure (Figure 3.7A).
The majority of EIU are found in 2 categories of chemical lists, namely, ‘Food additives and
Food contact materials’ and ‘Cosmetics and household products’. A minority of EIU are
found in 3 categories of chemical lists, namely, ‘Biocides’, ‘Medicines and Medical
devices’ and ‘Miscellaneous’ (Figure 3.7B; Supplementary Table S3.4). Of the 242 EIU,
DEDuCT 2.0 captures 119 potential EDCs along with supporting experimental evidence
(Supplementary Table S3.4). Lastly, 6 EIU, namely, 2,4,5,2’,4’,5’-Hexabromobiphenyl,
Coumestrol, Daidzein, Genistein, Pendimethalin and Zearalenone, are captured in all four
resources on EDCs (Supplementary Table S3.4) [36].
Figure 3.7: Distribution of potential EDCs from four resources, namely,
DEDuCT 2.0, WHO report, TEDX and EDCs Databank, across 36 chemical lists that are part
of inventories, guidelines and regulations. (A) Venn diagram displaying the intersections of group
I, II and III chemicals with potential EDCs. (B) Sunburst plot showing the distribution of potential
EDCs across 9 categories of chemical lists. Within each category in this plot, the inner ring gives
the number of potential EDCs in group I, II and III, and the outer ring gives the number of potential
EDCs in group I, II and III that are also high production volume (HPV) chemicals.
[Figure 3.7 panel A labels: SIU (24622), SOC (4362).]
EIU produced in high volume can pose significant risk as humans are readily exposed
to them through use of commercial products. Figure 3.7B gives the distribution of 63
EIU produced in high volume across 5 different categories of chemical lists
(Supplementary Table S3.4). While none of the EIU produced in high volume is captured in
all four resources on EDCs, 7 EIU produced in high volume, namely, 4,4’-Dihydroxybiphenyl,
4-Hydroxybenzoic acid, 4-sec-Butylphenol, Chlorocresol, Monosodium glutamate,
N,N’-Diphenyl-4-phenylenediamine and Sodium fluoride, are captured in three of the four
resources on EDCs. These 7 EIU produced in high volume are found in 4 categories of
chemical lists, namely, ‘Biocides’, ‘Cosmetics and household products’, ‘Food additives
and Food contact materials’ and ‘Medicines and Medical devices’ (Figure 3.7B;
Supplementary Table S3.4). Finally, 31 of the 63 EIU produced in high volume are captured in
DEDuCT 2.0 (Supplementary Table S3.4) [36].
From this analysis, it is evident that several EDCs in commercial use are also pro-
duced in high volume. The risk of exposure and associated hazard potential warrant an
evaluation of these EIU produced in high volume, and framing appropriate risk assess-
ment criteria will help such efforts. Later in this chapter, we illustrate how our knowl-
edgebase, DEDuCT 2.0, on EDCs can aid in risk assessment.
There are 356 group II EDCs (Figure 3.7A) of which 211 are also HPV chemicals.
Among the 356 group II EDCs, 46 are captured in all four resources on EDCs (Sup-
plementary Table S3.4). Of these 46 group II EDCs, 28 are also produced in high volume.
These 28 group II EDCs produced in high volume are distributed across 6 categories of
chemical lists, namely, ‘Plant protection products’, ‘Cosmetics and household products’,
‘Food additives and Food contact materials’, ‘Environment and Water Quality’, ‘REACH
chemicals’ and ‘Miscellaneous’ (Supplementary Table S3.4) [36]. Given the volume of
production and their possible presence in commercial products, the risk of human expo-
sure to these potential EDCs is a concern.
We next analyzed group III chemicals which are only present in SOC lists and found
278 potential EDCs among them (Figure 3.7A). Of these 278 group III EDCs, 5 chemi-
cals, namely, Simazine, Linuron, Acetochlor, Vinclozolin, and Prochloraz, were found to
be produced in high volume and captured in all four resources on EDCs (Supplementary
Table S3.4). These 5 group III EDCs are distributed across 4 categories of SOC lists,
namely, ‘Plant protection products’, ‘Cosmetics and household products’, ‘Environment
and Water Quality’, and ‘Miscellaneous’ (Supplementary Table S3.4). These 5 potential
EDCs in SOC lists need better monitoring as they are produced in high volume in spite of
known concern [36].
3.4 A case study of DEDuCT 2.0 in risk assessment of EDCs
To better understand how diverse information in a curated knowledgebase such as
DEDuCT 2.0 [35, 36] can aid in chemical regulation, we present a case study for a poten-
tial EDC. We focused on 28 group II EDCs produced in high volume and captured in all
four resources on EDCs including DEDuCT 2.0. Of these 28 group II EDCs, ‘Dibutyl ph-
thalate (CAS: 84-74-2)’ is a potential EDC present in 6 SIU lists and 7 SOC lists which are
distributed across 5 categories, namely, ‘Cosmetics and household products’, ‘Food ad-
ditives and Food contact materials’, ‘REACH chemicals’, ‘Environment and Water Qual-
ity’, and ‘Miscellaneous’. We next discuss the utility of DEDuCT 2.0 in risk assessment
of chemicals using Dibutyl phthalate as an example.
According to the United States National Academy of Sciences, risk assessment in-
volves four steps, namely, Hazard identification, Dose-response assessment, Exposure
assessment, and Risk characterization [216]. Among the four resources on EDCs, no-
tably, DEDuCT has compiled the observed endocrine-mediated endpoints and the dosage
at which endpoints are observed, from published experiments specific to humans and
rodents [35, 36], and this information can aid the risk assessment process. DEDuCT 2.0
compiles supporting evidence on endocrine disruption upon Dibutyl phthalate exposure
from in vivo experiments in rodents and in vitro experiments in humans which were pub-
lished in 35 research articles.
For the first step in risk assessment, we used DEDuCT 2.0 to identify health hazards
posed by Dibutyl phthalate. For Dibutyl phthalate exposure, DEDuCT 2.0 has compiled
81 endocrine-mediated endpoints spanning 7 systems-level perturbations, namely,
reproductive, developmental, metabolic, immunological, neurological, hepatic, and
endocrine-mediated cancer (Figure 3.3). For the second step in risk assessment, one can use the
dosage information compiled in DEDuCT 2.0 for 81 endpoints observed upon Dibutyl
phthalate exposure. In particular, we have analyzed the dosage information for Dibutyl
phthalate compiled in DEDuCT 2.0 for endpoints observed in in vivo rodent studies
that report dosage in mg/kg/day (Supplementary Table S3.5). Across these published in
vivo rodent studies on Dibutyl phthalate compiled in DEDuCT 2.0, the test concentration
range for different endpoints is 0.01-1000 mg/kg/day, the lowest dose
at which an adverse effect is observed in any of these studies is 0.01 mg/kg/day, and
the highest dose at which no adverse effects are observed in any of the studies is 125
mg/kg/day (Supplementary Table S3.5). We remark that the compiled dosage informa-
tion for Dibutyl phthalate in DEDuCT 2.0 is compatible with previous reports suggesting
possible non-monotonic dose response for this chemical [217].
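The dose summary described above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout (`DoseRecord`), the function name `summarize_doses` and the toy values are hypothetical and are not taken from Supplementary Table S3.5 or from the DEDuCT 2.0 schema.

```python
from typing import NamedTuple, Optional


class DoseRecord(NamedTuple):
    """Hypothetical record: one tested dose for one endpoint in one study."""
    endpoint: str
    dose_mg_kg_day: float
    adverse_effect: bool  # True if an adverse effect was reported at this dose


def summarize_doses(records):
    """Return the tested range, lowest effect dose and highest no-effect dose."""
    doses = [r.dose_mg_kg_day for r in records]
    effect = [r.dose_mg_kg_day for r in records if r.adverse_effect]
    no_effect = [r.dose_mg_kg_day for r in records if not r.adverse_effect]
    lowest_effect: Optional[float] = min(effect) if effect else None
    highest_no_effect: Optional[float] = max(no_effect) if no_effect else None
    return {
        "tested_range": (min(doses), max(doses)),
        "lowest_effect_dose": lowest_effect,
        "highest_no_effect_dose": highest_no_effect,
    }


# Invented values for illustration only.
toy_records = [
    DoseRecord("endpoint A", 0.5, True),
    DoseRecord("endpoint A", 50.0, False),
    DoseRecord("endpoint B", 5.0, False),
    DoseRecord("endpoint B", 500.0, True),
]
summary = summarize_doses(toy_records)
```

In the toy data the highest no-effect dose lies above the lowest effect dose, which is the kind of pattern that motivates the remark on possible non-monotonic dose response; the sketch itself does not test for non-monotonicity.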
The third step of exposure assessment involves the identification of routes, frequency
and duration of exposure at the population level. Though DEDuCT 2.0 compiles infor-
mation on environmental sources of potential EDCs, it does not capture their duration and
routes of exposure. A possible expansion of the knowledgebase to include biomonitoring
and epidemiological information for EDCs from published literature will further aid in
exposure assessment and risk characterization; however, such an update of DEDuCT 2.0
requires significant effort beyond the current scope of our work.
3.5 Discussion
The number of chemicals introduced into the market for commercial purposes continues
to be high. Adequate risk assessment strategies are needed now, more than ever, to cope
with the increasing demand for safe product formulations. In general, regulatory stan-
dards and criteria differ across countries and this lack of standardization applies to the
regulation of EDCs as well [218, 219]. The regulatory assessment of EDCs is complex
as there are several challenges and limitations associated with these substances [3, 218].
In recent years there has been a rapid increase in endocrine disruption studies and the
accumulation of knowledge surrounding EDCs (Figure 3.1A,B). However, regulatory
assessments fall short due to the limitations and uncertainties in the risk assessment of
EDCs [3, 218, 220]. This may also be due to the lack of knowledge transfer from
academic research to the regulatory assessment of EDCs.
The presence of potential EDCs in the compiled chemical lists is a concern as hu-
mans are exposed to these potential EDCs via the use of industrial and consumer products.
Similar investigations have previously been conducted for food, food additives, and food
contact chemicals [154, 155], and these studies have revealed regulatory gaps that con-
tribute to the inclusion of substances of concern in food and associated products. How-
ever, these studies were not specific to EDCs, and were also limited to a single category
of substances. Hence, there is a need to incorporate endocrine disruption as a standard
criterion in chemical risk assessment. Despite scientific efforts to evaluate the risks that
EDCs pose, there is a gap in the transfer of knowledge to the policy planning level [214].
Focused systematic review of these lists by regulatory agencies and non-governmental
chemical advocacy groups, coupled with better incorporation of research data compiled
in academic resources, may help improve and strengthen chemical regulations and
guidelines and, consequently, improve the safety of our products as well.
Based on the extent and variety of information necessary for building regulatory stan-
dards, the utility of the WHO report, TEDX, and EDCs Databank in regulatory assessment
may be limited. These resources lack the systematic compilation of observed adverse
effects specific to endocrine disruption from published literature. The compilation of
endocrine-mediated adverse effects along with dosage information in DEDuCT 2.0 may
prove valuable in the risk assessment and regulation of EDCs as demonstrated using a
case study for Dibutyl phthalate in this chapter. Additional information including species,
strain, sex, route, and duration of exposure for the compiled EDCs from published lit-
erature will aid in better risk assessment of chemicals. Moreover, a possible update of
DEDuCT to include biomonitoring and epidemiological studies for the compiled EDCs
from published literature can also aid in exposure assessment and risk characterization.
However, such an update of DEDuCT will also require an intensive manual curation effort.
To this end, experimental evidence of endocrine disruption for potential EDCs compiled
in knowledgebases could help in the early identification of hazardous substances, so that
regulatory bodies can then streamline the process for safety testing, and in turn improve
chemical safety standards.
Supplementary Information
Supplementary Tables S3.1-S3.5 associated with this chapter are available for download
from the GitHub repository:
https://github.com/asamallab/PhDThesis-Janani_R/blob/main/SI/ST_Chapter3.xlsx.
Chapter 4
Chemical regulatory risk assessment is based on in vivo methods, which are time-consuming,
costly, and necessitate the use of a large number of animals for testing [221, 222].
To improve and accelerate chemical toxicity testing, the US National Research Council
published a vision report in 2007 titled ‘Toxicity testing in the 21st century: a vision and a
strategy’ recommending the implementation of high-throughput screening methods such
as in vitro toxicology or in silico approaches [93, 94, 96, 98]. In this context, ‘toxicity
pathways’ were proposed to capture the perturbed biological events that occur as a result of
chemical exposure and can be utilised to predict the observed adverse effects [93–96, 98].
Later, the concept of Adverse Outcome Pathways (AOPs) was suggested to organize avail-
able mechanistic knowledge on observed adverse effects in humans or wildlife following
chemical exposure [99]. Subsequently, several studies have reported the development of
specific AOPs and their applications in risk assessment [97, 104–106, 108, 111, 112, 223].
In 2012, the OECD launched an international program to formalize the development and
evaluation of AOPs. This has led to a series of OECD guidance documents [101–103] and
primary literature [97, 104–106, 108, 109, 111, 112, 223] for the development of AOPs and
their potential applications in human- and eco-toxicology. AOP-Wiki [114], an actively
maintained module within AOP-KB created by OECD, serves as a central repository of
AOPs at various stages of development [105, 106].
On similar lines, the AOP framework is ideal for organizing the existing knowl-
edge and providing a pathway perspective on diverse modes of endocrine disruption by
EDCs [233–235]. Moreover, the development and analysis of an AOP network relevant to
endocrine disruption has the potential to reveal key events, critical paths, and unexpected
links between individual AOPs capturing varied adverse effects [107, 110]. Previously,
there have been few efforts to construct AOP networks for disruption specific to a single
hormone, namely, androgen [116], thyroid, or thyroxine [107, 110]. Due to the focus on
specific hormones, the constructed AOP networks in these studies do not provide a com-
prehensive picture of all endocrine disruption mechanisms captured within AOP-Wiki. In
this chapter, we first aim to build a comprehensive derived AOP network for endocrine
disruption by curating and organizing existing toxicological information from AOP-Wiki.
Second, we aim to utilize this derived AOP network for endocrine disruption to better
understand the perturbed biological events involving multiple systems that occur upon
exposure to environmental chemicals. Finally, we use graph-theoretic measures to identify
critical biological events, emergent new paths, chemical stressors associated with the
events, and possible adverse outcomes following EDC exposure. Such information can
aid in the development of new endpoints or assays for better risk assessment of environ-
mental chemicals. The work reported in this chapter is contained in the published
manuscript [37].
tion
The aim of this study is to develop a derived AOP network relevant to endocrine disruption
based on information in AOP-Wiki. From the Project Downloads section
(https://aopwiki.org/downloads) of AOP-Wiki, we have downloaded the XML archive
as of 3 January 2021. This XML archive from AOP-Wiki was parsed using the xml2
package in R to obtain information on AOPs, Key Events (KEs), Key-Event Relationships
(KERs), and stressors. To construct this AOP network relevant to endocrine disruption or
‘ED-AOP network’, we have compiled detailed information on 316 AOPs, 1131 KEs and
1363 KERs from AOP-Wiki. Due to continuous development of AOP-Wiki, some AOPs
may have incomplete information at any particular time (Figure 4.1).
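As an illustration of this parsing step, the sketch below extracts AOPs, KEs and KERs from a small XML string. It uses Python's standard xml.etree.ElementTree module rather than the xml2 package in R used in this work, and the element and attribute names in the toy XML are invented for the example; the actual AOP-Wiki export follows its own schema, which should be inspected before adapting this code.

```python
import xml.etree.ElementTree as ET

# Toy XML with a made-up layout (hypothetical tags, not the AOP-Wiki schema).
TOY_XML = """
<data>
  <aop id="a1"><title>Toy AOP</title></aop>
  <key-event id="k1"><title>Toy MIE</title></key-event>
  <key-event id="k2"><title>Toy AO</title></key-event>
  <key-event-relationship id="r1">
    <upstream-id>k1</upstream-id>
    <downstream-id>k2</downstream-id>
  </key-event-relationship>
</data>
"""


def parse_toy_archive(xml_text):
    """Return dictionaries of AOPs, KEs and KERs keyed by identifier."""
    root = ET.fromstring(xml_text.strip())
    aops = {e.get("id"): e.findtext("title") for e in root.iter("aop")}
    kes = {e.get("id"): e.findtext("title") for e in root.iter("key-event")}
    kers = {
        e.get("id"): (e.findtext("upstream-id"), e.findtext("downstream-id"))
        for e in root.iter("key-event-relationship")
    }
    return aops, kes, kers


aops, kes, kers = parse_toy_archive(TOY_XML)
```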
For each AOP in AOP-Wiki, we have retrieved information including the AOP identi-
fier, AOP title, OECD status, and Society for the Advancement of AOPs (SAAOP) status.
For each KE in an AOP, we have gathered information including the KE identifier, KE
type, level of biological organization and taxonomy. The KE type can be either molecular
Figure 4.1: Detailed workflow for the development, characterization and analysis of an adverse
outcome pathway (AOP) network for endocrine disruption.
initiating event (MIE), key event (KE) or adverse outcome (AO). For each KER in an
AOP, we have gathered information including the KER identifier, upstream KE, down-
stream KE, the weight of evidence (WoE), adjacency information, and the quantitative
understanding score (OECD, 2018). Lastly, we have compiled the chemical stressors
linked to KEs in different AOPs along with their structure information such as the CAS
identifier [164], DSSTOX identifier [236] and InChIKey. Note that the AOP-Wiki also
contains information on non-chemical stressors such as genetic or environmental factors.
We remark that each AOP can be viewed as a directed graph or network wherein
the nodes are KEs and directed edges are KERs linking upstream KEs with downstream
KEs. In this directed graph representation of an AOP, it is straightforward to determine
the existence of a directed path between any pair of KEs.
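A minimal sketch of this directed graph view is given below, assuming each KER is available as an (upstream KE, downstream KE) pair. The function names and the toy identifiers are hypothetical; the path check is a plain breadth-first search.

```python
from collections import defaultdict, deque


def build_adjacency(kers):
    """Map each upstream KE to the set of its immediate downstream KEs."""
    adj = defaultdict(set)
    for upstream, downstream in kers:
        adj[upstream].add(downstream)
    return adj


def has_directed_path(adj, source, target):
    """Breadth-first search; a node is considered to reach itself."""
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Toy AOP: MIE1 -> KE1 -> AO1, plus KE2 -> AO1.
toy_kers = [("MIE1", "KE1"), ("KE1", "AO1"), ("KE2", "AO1")]
adj = build_adjacency(toy_kers)
```

Here `has_directed_path(adj, "MIE1", "AO1")` holds, whereas there is no directed path from "AO1" back to "MIE1" or from "MIE1" to "KE2".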
Since AOP-Wiki is under continuous development, some AOPs may have incomplete in-
formation [101]. Therefore, it is important to evaluate the quality and completeness of
information in each AOP before their selection for the derived AOP network construc-
tion [110]. We have assessed the quality and completeness of information in each AOP
obtained from AOP-Wiki as follows (Figure 4.1).
Firstly, we have removed the ‘archived AOPs’ based on SAAOP status as these are
no longer under active development. This led to the removal of 6 AOPs. Secondly, we
have removed ‘empty AOPs’, which are AOP pages created in AOP-Wiki but lack a KE
or a KER [228]. After removing ‘archived AOPs’ and ‘empty AOPs’, we have 218 AOPs
that remain under consideration. Thirdly, we have removed any AOP which does not
contain at least one MIE and at least one AO. After this step, we have 182 AOPs with
both MIE and AO that remain under consideration. Fourthly, we have computed the
number of (weakly) connected components in each AOP because the presence of more
than one component in an AOP may indicate AOPs in the early stages of development
[228]. This led to the identification of 3 disconnected AOPs that have more than one
connected component. After the removal of 3 disconnected AOPs, we have 179 AOPs
that remain under consideration.
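The second to fourth checks above can be sketched as follows. This is an illustrative reading of the steps rather than the code used in this work: the input layout (a dictionary from KE identifier to its type, and a list of KER pairs whose endpoints all appear in that dictionary) and the function names are assumptions.

```python
def weak_component_count(nodes, edges):
    """Count weakly connected components, i.e. ignoring edge direction."""
    neighbours = {n: set() for n in nodes}
    for upstream, downstream in edges:
        neighbours[upstream].add(downstream)
        neighbours[downstream].add(upstream)
    seen, count = set(), 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return count


def passes_basic_checks(ke_types, edges):
    """ke_types maps a KE identifier to 'MIE', 'KE' or 'AO'."""
    if not ke_types or not edges:            # 'empty AOP': lacks a KE or a KER
        return False
    kinds = set(ke_types.values())
    if "MIE" not in kinds or "AO" not in kinds:
        return False                         # needs at least one MIE and one AO
    return weak_component_count(list(ke_types), edges) == 1


connected_aop = {"m": "MIE", "k": "KE", "a": "AO"}
split_aop = {"m": "MIE", "k": "KE", "a": "AO", "x": "KE"}   # 'x' is unlinked
toy_edges = [("m", "k"), ("k", "a")]
```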
Fifthly, we have computed directed paths from different MIEs to different AOs in
each AOP to filter out incomplete AOPs. Since an AOP can have both multiple MIEs and
multiple AOs, we have computed the directed paths between each pair of MIE and AO
in an AOP to impose this path criterion. We have retained an AOP only if it satisfies the
following path criteria:
(a) Every MIE in an AOP has at least one (outgoing) path to at least one AO in the
same AOP.
(b) Every AO in an AOP has at least one (incoming) path from at least one MIE in the
same AOP.
(c) Every KE in an AOP (other than MIEs and AOs) has at least one incoming path
from at least one MIE in the same AOP and at least one outgoing path to at least
one AO in the same AOP.
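Criteria (a) to (c) can be checked with two reachability sweeps, one forward from all MIEs and one backward from all AOs. The sketch below is one possible implementation under the assumption that each KE has exactly one type within the AOP; the names `reachable` and `satisfies_path_criteria` are hypothetical.

```python
def reachable(adj, sources):
    """Return all nodes reachable from any source (sources included)."""
    seen, stack = set(sources), list(sources)
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def satisfies_path_criteria(ke_types, edges):
    """ke_types maps KE id -> 'MIE', 'KE' or 'AO'; edges are (up, down) pairs."""
    forward, backward = {}, {}
    for upstream, downstream in edges:
        forward.setdefault(upstream, set()).add(downstream)
        backward.setdefault(downstream, set()).add(upstream)
    mies = [k for k, t in ke_types.items() if t == "MIE"]
    aos = [k for k, t in ke_types.items() if t == "AO"]
    from_mie = reachable(forward, mies)   # downstream of some MIE
    to_ao = reachable(backward, aos)      # upstream of some AO
    for ke, kind in ke_types.items():
        if kind == "MIE" and ke not in to_ao:        # criterion (a)
            return False
        if kind == "AO" and ke not in from_mie:      # criterion (b)
            return False
        if kind == "KE" and not (ke in from_mie and ke in to_ao):  # criterion (c)
            return False
    return True


complete_aop = {"m": "MIE", "k": "KE", "a": "AO"}
complete_edges = [("m", "k"), ("k", "a")]
dangling_aop = {"m": "MIE", "k": "KE", "d": "KE", "a": "AO"}
dangling_edges = [("m", "k"), ("k", "a"), ("m", "d")]   # 'd' never reaches an AO
```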
After removing AOPs that do not satisfy the path criteria, we arrive at a high-confidence
set of 161 AOPs which are associated with 635 KEs and 810 KERs (Figure 4.1; Sup-
plementary Table S4.1). Next, these 161 high-confidence AOPs were considered for the
identification of AOPs relevant for endocrine disruption.
To build the AOP network specific to endocrine disruption, it is important to identify the
subset of endocrine-relevant AOPs (ED-AOPs) among the 161 high-confidence AOPs.
To identify ED-AOPs, we have manually curated the endocrine-relevant KEs (ED-KEs)
among the 635 KEs associated with the 161 high-confidence AOPs.
(a) List of endocrine glands, (b) List of endocrine hormones, (c) List of endocrine receptors
where hormones can bind, (d) List of endocrine disorders in MeSH [237], and (e) List
of endocrine-specific endpoints in DEDuCT [35, 36]. All of the data used for filtering
ED-KEs in the aforementioned criteria are specific to humans or rodents (which are com-
monly used animal models for human endocrine disruption) [238]. This process led to a
curated subset of 294 ED-KEs (Supplementary Table S4.2). Afterwards, we retained 151
AOPs among the 161 high-confidence AOPs that contain at least one ED-KE. Further-
more, we consider an AOP to be an ED-AOP if it contains at least one MIE which is an
ED-KE and at least one AO which is an ED-KE. This filtration led to a curated subset of
48 ED-AOPs which are associated with 232 KEs and 268 KERs (Table 4.1; Figure 4.1;
Supplementary Table S4.3). Due to the use of human- or rodent-specific data to filter the
ED-KEs, the majority of these ED-AOPs contain KEs relevant for humans or rodents.
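The ED-AOP selection rule stated above is simple to express in code. The sketch assumes a curated set of ED-KE identifiers is already available; the function name and the toy identifiers are hypothetical.

```python
def is_ed_aop(ke_types, ed_kes):
    """True if the AOP has at least one MIE and at least one AO that are ED-KEs.

    ke_types maps KE id -> 'MIE', 'KE' or 'AO'; ed_kes is the curated ED-KE set.
    """
    has_ed_mie = any(t == "MIE" and k in ed_kes for k, t in ke_types.items())
    has_ed_ao = any(t == "AO" and k in ed_kes for k, t in ke_types.items())
    return has_ed_mie and has_ed_ao


toy_ed_kes = {"m1", "a1", "k7"}
aop_x = {"m1": "MIE", "k2": "KE", "a1": "AO"}   # ED-KE as MIE and as AO
aop_y = {"m9": "MIE", "k7": "KE", "a1": "AO"}   # contains ED-KEs, but no ED-KE MIE
```

In this toy case `aop_y` would pass the weaker filter of containing at least one ED-KE, but it is not an ED-AOP under the stricter rule.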
For each ED-AOP, we compute the fraction of KERs with different values of the WoE
score namely, ‘high’, ‘moderate’, ‘low’ or ‘not specified’. For example, the fraction of
KERs in an ED-AOP with WoE score ‘high’ can be computed from the ratio of the num-
ber of KERs in the AOP with WoE score ‘high’ and the total number of KERs in the AOP,
and this quantity for an ED-AOP is denoted by F(‘high’). Similarly, it is straightforward
to compute the quantities F(‘moderate’), F(‘low’) and F(‘not specified’) for an ED-AOP.
For each of the 48 ED-AOPs, we have computed the quantities F(‘high’), F(‘moderate’),
F(‘low’) and F(‘not specified’) from the WoE scores of the associated KERs (Supple-
mentary Table S4.4). Subsequently, we have assigned the cumulative WoE score to each
ED-AOP as follows:
(i) If an ED-AOP has F(‘high’) ≥ 0.5, then the cumulative WoE score was assigned to
‘high’.
(ii) Else if an ED-AOP has F(‘high’) < 0.5 but has [F(‘high’) + F(‘moderate’)] ≥ 0.5,
then the cumulative WoE score was assigned to ‘moderate’.
(iii) Else if an ED-AOP has [F(‘high’) + F(‘moderate’)] < 0.5 but has [F(‘high’) +
F(‘moderate’) + F(‘low’)] ≥ 0.5, then the cumulative WoE score was assigned to
‘low’.
(iv) Else if an ED-AOP has [F(‘high’) + F(‘moderate’) + F(‘low’)] < 0.5, then the
cumulative WoE score was assigned to ‘not specified’.
Based on this definition, we find that 18, 12, 1 and 17 ED-AOPs were assigned cumula-
tive WoE score of ‘high’, ‘moderate’, ‘low’ and ‘not specified’, respectively (Table 4.1;
Supplementary Table S4.4).
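Rules (i) to (iv) translate directly into a cascade of threshold checks. The sketch below is a minimal illustration; the function name is hypothetical, and integer counts are compared (2 × count ≥ total) instead of fractions to avoid floating-point rounding at the 0.5 threshold. It assumes the ED-AOP has at least one KER.

```python
from collections import Counter


def cumulative_woe(ker_scores):
    """Assign the cumulative WoE score of an AOP from the WoE scores of its KERs."""
    counts = Counter(ker_scores)
    total = len(ker_scores)
    high, moderate, low = counts["high"], counts["moderate"], counts["low"]
    if 2 * high >= total:                      # F('high') >= 0.5
        return "high"
    if 2 * (high + moderate) >= total:         # F('high') + F('moderate') >= 0.5
        return "moderate"
    if 2 * (high + moderate + low) >= total:   # ... + F('low') >= 0.5
        return "low"
    return "not specified"


example = cumulative_woe(["high", "moderate", "low", "low"])
```

For the example, F(‘high’) = 0.25 and F(‘high’) + F(‘moderate’) = 0.5, so the cumulative WoE score is ‘moderate’.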
ity information for AOP:13 indicates that the AOP is relevant during brain development
(Supplementary Table S4.5). Similar to WoE scores for KERs in AOP-Wiki, the WoE
information for taxonomic, sex, or life stage applicability for each AOP in AOP-Wiki can
have one of the four values namely, ‘high’, ‘moderate’, ‘low’ or ‘not specified’. Lastly,
we have evaluated the information on taxonomic applicability of the 48 ED-AOPs from
the AOP-Wiki webpage (last accessed in April 2021) to assess the human applicability of
each ED-AOP. We find that 14 out of the 48 ED-AOPs have evidence for human
applicability in AOP-Wiki (Table 4.1; Supplementary Table S4.5). Of these 14 ED-AOPs with
evidence for human applicability, 4, 4 and 6 ED-AOPs have a WoE score for human
applicability of ‘high’, ‘moderate’ and ‘low’, respectively (Table 4.1; Supplementary Table
S4.5). Note that if the WoE score for taxonomic applicability of an ED-AOP for Homo
sapiens was ‘not specified’ in AOP-Wiki, we have assigned the WoE score for human
applicability of that ED-AOP in Table 4.1 to ‘low’.
Evidently, the cumulative WoE score and the WoE score for human applicability listed
in Table 4.1 can be used to qualitatively assess the level of evidence for an ED-AOP and
further filter the curated subset of 48 ED-AOPs. Nevertheless, we have not imposed any
filters based on taxonomic, sex, or life stage applicability information in AOP-Wiki during
the filtration of the 48 ED-AOPs for the subsequent construction of the derived AOP
network. Note that these WoE scores are qualitative indicators representing the strength of
evidence based on current knowledge compiled in AOP-Wiki, and they tend to vary over
time. Hence, it is worthwhile to manually evaluate the evidence while applying filters
based on these scores specific to the research question. In addition, these scores indicate the
knowledge gaps in the development of AOPs.
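[Figure 4.2 image: labels legible in the extraction are the component labels C1, C2, C3 and C4, and the node labels AOP:6, AOP:19, AOP:41, AOP:43, AOP:112, AOP:124, AOP:164, AOP:167, AOP:220, AOP:232, AOP:293 and AOP:306.]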
Figure 4.2: Visualization of the ED-AOP network based on shared KEs among the 48 ED-AOPs.
Here, each node corresponds to an ED-AOP and there exists an edge between any two ED-AOPs if
they have at least one shared KE. The network has 7 connected components (labeled C1-C7) with
≥ 2 ED-AOPs and 12 isolated ED-AOPs. The two largest connected components (LCCs) labeled
by C1 and C2 contain 12 ED-AOPs each.
ponents
After filtration of the curated subset of 48 ED-AOPs, we have constructed the AOP net-
work specific to endocrine disruption by assembling the information on shared KEs and
KERs among the 48 ED-AOPs. We refer to this derived AOP network as ‘ED-AOP
network’ (Figure 4.2). The ED-AOP network contains KEs and KERs across the 48
ED-AOPs, and thus, captures diverse biological perturbations related to the endocrine
system [107, 110]. The ED-AOP network can be visualized as an undirected graph of 48
nodes corresponding to the 48 ED-AOPs, and there exists an edge between any two nodes
in this undirected graph if the two ED-AOPs have at least one shared KE (Figure 4.2).
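The undirected view of the ED-AOP network can be sketched as below, assuming each ED-AOP is represented by the set of its KE identifiers. The function name and the toy identifiers are hypothetical.

```python
from itertools import combinations


def shared_ke_edges(aop_to_kes):
    """Return {(aop_a, aop_b): shared KEs} for every pair sharing at least one KE."""
    edges = {}
    for a, b in combinations(sorted(aop_to_kes), 2):
        shared = aop_to_kes[a] & aop_to_kes[b]
        if shared:
            edges[(a, b)] = shared
    return edges


toy_aops = {
    "AOP:A": {"k1", "k2"},
    "AOP:B": {"k2", "k3"},
    "AOP:C": {"k9"},          # shares no KE, so it stays an isolated node
}
aop_edges = shared_ke_edges(toy_aops)
```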
completely connected network has a single connected component comprising all nodes in
the graph. Based on this computation, we find that the ED-AOP network can be decom-
posed into 7 connected components with ≥ 2 ED-AOPs and 12 isolated ED-AOPs. These
7 connected components together comprise 36 ED-AOPs (Figure 4.2; Supplementary Ta-
ble S4.6). Among these 7 connected components, the two largest connected components
(LCCs) labeled by C1 and C2 in Figure 4.2 contain 12 ED-AOPs each, and the remaining
5 connected components contain ≤ 3 ED-AOPs each. The LCCs C1 and C2 comprise
44 and 48 KEs, respectively, of which 19 and 20 KEs are shared among 2 or more
ED-AOPs in C1 and C2, respectively (Figures 4.3 and 4.4).
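The decomposition into connected components and isolated nodes can be illustrated with a small depth-first search over an undirected graph. This is a generic sketch with hypothetical names and toy data, not the code used to obtain the numbers reported above.

```python
def connected_components(nodes, edges):
    """Return the connected components of an undirected graph, largest first."""
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, stack = {start}, [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    component.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return sorted(components, key=len, reverse=True)


toy_nodes = ["A", "B", "C", "D", "E", "F"]
toy_links = [("A", "B"), ("B", "C"), ("D", "E")]
components = connected_components(toy_nodes, toy_links)
multi_node = [c for c in components if len(c) >= 2]   # components with >= 2 nodes
isolated = [c for c in components if len(c) == 1]     # isolated nodes
```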
To better understand the systems-level effects of AOs in the 7 components of the ED-
AOP network, we have categorized AOs into 4 systems-level endocrine-mediated pertur-
bations, namely, ‘hepatic’, ‘metabolic’, ‘neurological’ and ‘reproductive’, and this classi-
fication depends on the perturbed biological process corresponding to an AO (Table 4.2).
For example, the AO titled ‘Increase, hepatocellular adenomas and carcinomas’ in AOP-
Wiki was classified as ‘hepatic’ while the AO titled ‘impaired, Fertility’ as ‘reproductive’
(Table 4.2). This categorization of AOs in ED-AOPs into 4 systems-level perturbations
follows a similar classification scheme for observed adverse effects upon exposure to en-
docrine disrupting chemicals (EDCs) in our previous work [35, 36] described in Chapter
2. We observe that the majority of AOs in the ED-AOP network affect the ‘reproductive’
system (Table 4.2). Moreover, the AOs in C1 can affect 4 different systems, while all AOs
in C2 affect solely the ‘reproductive’ system (Table 4.2).
ED-AOP network
Since the two LCCs dominate the ED-AOP network, we decided to next focus on them.
For a detailed analysis of each LCC in the ED-AOP network, we have constructed the
corresponding directed network wherein nodes are KEs and each directed edge represents
[Figure 4.3 image: AOP identifiers legible in the extraction are AOP:7, AOP:18, AOP:42, AOP:64, AOP:107, AOP:117, AOP:118, AOP:271 and AOP:300; KE labels are not reliably recoverable.]
Figure 4.3: The directed network for LCC C1 in the ED-AOP network consisting of 44 KEs and
56 KERs. The 44 KEs in C1 can be categorized into 9 MIEs, 28 KEs and 7 AOs. MIEs, KEs
and AOs are shown in distinct shapes namely, diamond, square and circle, respectively. The 19
shared KEs in C1 are marked in ‘red’. For each MIE and AO, the corresponding AOP identifier is
displayed in this figure.
[Figure 4.4 image: AOP identifiers legible in the extraction are AOP:28, AOP:63, AOP:100, AOP:101, AOP:102, AOP:103, AOP:216, AOP:299, AOP:336, AOP:337, AOP:340 and AOP:341; KE labels are not reliably recoverable.]
Figure 4.4: The directed network for LCC C2 in the ED-AOP network consisting of 48 KEs and
56 KERs. The 48 KEs in C2 can be categorized into 3 MIEs, 40 KEs and 5 AOs. MIEs, KEs
and AOs are shown in distinct shapes namely, diamond, square and circle, respectively. The 20
shared KEs in C2 are marked in ‘red’. For each MIE and AO, the corresponding AOP identifier is
displayed in this figure.
a KER linking its upstream KE with its downstream KE. The directed network for C1
(Figure 4.3) has 44 KEs and 56 KERs while that for C2 (Figure 4.4) has 48 KEs and 56
KERs. Subsequently, we have studied four standard network measures namely, in-degree,
out-degree, betweenness centrality and eccentricity, for KEs in the directed network
corresponding to each LCC, and these measures were computed using NetworkAnalyzer [240]
in Cytoscape [241]. In the directed network, in-degree (respectively, out-degree) of a
KE refers to the number of KEs immediately upstream (respectively, immediately down-
stream) of that KE [110]. Importantly, in-degree and out-degree of KEs can help iden-
tify points of convergence and divergence in the directed network. Further, betweenness
centrality can help identify KEs crucial for the spread of biological perturbations, while
eccentricity can help identify KEs which are farthest upstream or farthest downstream
in the directed network [110, 230]. By applying network measures, we have studied the
systems-level perturbations caused by endocrine-mediated events in the ED-AOP network
upon chemical exposure. We have also investigated the ED-AOP network for possible
emergence of new paths between pairs of MIE and AO that are both ED-KEs and belong
to different ED-AOPs.
Firstly, we identified convergent and divergent events within the directed networks for
C1 and C2 by assessing the in-degree and out-degree of each KE. A KE is considered to
be ‘convergent’ if the in-degree is greater than (>) out-degree for the particular KE, while
a KE is considered to be ‘divergent’ if the in-degree is less than (<) out-degree for the
particular KE [110]. In C1, there are 13 convergent KEs and 12 divergent KEs. Among
the 13 convergent KEs in C1, 2 KEs namely, ‘Increase, cell proliferation (hepatocytes)’
and ‘Increase, hepatocellular adenomas and carcinomas’, have the highest in-degree of 4.
Among the 12 divergent KEs in C1, 2 KEs namely, ‘Activation, PPARα’ and ‘Thyroxine
(T4) in serum, Decreased’, have the highest out-degree of 5; in other words, each of these
2 divergent events leads directly to 5 other events in C1 (Figure 4.3; Supplementary Table S4.7).
In C2, there are 6 convergent KEs and 7 divergent KEs. Among the 6 convergent KEs
in C2, 2 KEs namely, ‘Decrease, Oogenesis’ and ‘Reduced, Reproductive Success’, have
the highest in-degree of 4. Among the 7 divergent KEs in C2, the KE ‘Inhibition, Cy-
clooxygenase activity’ has the highest out-degree of 5 (Figure 4.4; Supplementary Table
S4.7).
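The convergent/divergent labelling can be sketched from the list of KERs alone. The code below applies the definition literally as stated in the text (in-degree compared with out-degree) and assumes each KER appears once; the function name and toy identifiers are hypothetical, and any additional conditions used in the published analysis are not reproduced here.

```python
from collections import Counter


def classify_kes(edges):
    """Return (labels, in_degree, out_degree) for a list of (up, down) KER pairs."""
    in_degree, out_degree = Counter(), Counter()
    nodes = set()
    for upstream, downstream in edges:
        out_degree[upstream] += 1
        in_degree[downstream] += 1
        nodes.update((upstream, downstream))
    labels = {}
    for node in nodes:
        if in_degree[node] > out_degree[node]:
            labels[node] = "convergent"
        elif in_degree[node] < out_degree[node]:
            labels[node] = "divergent"
        else:
            labels[node] = "neither"
    return labels, in_degree, out_degree


# Toy network: two MIEs feed one KE, which leads to three downstream events.
toy_kers = [("m1", "k"), ("m2", "k"), ("k", "a1"), ("k", "a2"), ("k", "a3")]
labels, in_degree, out_degree = classify_kes(toy_kers)
```

Under this literal reading, a node with no incoming edge and one outgoing edge (such as "m1") is labelled divergent.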
Secondly, we have assessed the betweenness centrality of KEs in the directed net-
works for C1 and C2. The shared KE ‘Reduction, Testosterone synthesis in Leydig cells’
has the maximum betweenness centrality of 0.4 in C1 (Figure 4.5; Supplementary Table
S4.7), while the shared KE ‘Reduced, Maturation inducing steroid receptor signalling,
oocyte’ has the maximum betweenness centrality of 0.43 in C2 (Figure 4.6; Supplemen-
tary Table S4.7). Since these KEs with the highest betweenness centrality are on the
shortest paths linking various nodes in C1 or C2, the events serve as significant control
points in the ED-AOP network [242].
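For readers who wish to reproduce this measure outside Cytoscape, the sketch below computes betweenness centrality for an unweighted directed graph using Brandes' algorithm. It returns raw (unnormalized) counts; NetworkAnalyzer reports values between 0 and 1, and its exact normalization is not restated here, so absolute values from this sketch are not expected to match Supplementary Table S4.7. The function name and toy graph are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Raw betweenness centrality for a directed, unweighted graph.

    adj maps each node to an iterable of its immediate downstream nodes.
    """
    nodes = set(adj) | {w for targets in adj.values() for w in targets}
    centrality = dict.fromkeys(nodes, 0.0)
    for source in nodes:
        order = []                                # nodes in order of discovery
        predecessors = {v: [] for v in nodes}
        n_paths = dict.fromkeys(nodes, 0)         # number of shortest paths
        n_paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj.get(v, ()):
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:        # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        dependency = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):                 # back-propagate dependencies
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    return centrality


# Toy network: every path from an MIE (m1, m2) to an AO (a1, a2) passes through k.
toy_adj = {"m1": ["k"], "m2": ["k"], "k": ["a1", "a2"]}
bc = betweenness(toy_adj)
```

In the toy network all four MIE-to-AO shortest paths pass through "k", so it receives a raw score of 4 while every other node scores 0.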
Thirdly, we have assessed the eccentricity of KEs in the directed networks for C1
and C2. The higher the eccentricity value for a node, the farther the node is located
from other nodes in the network, and thus, a low eccentricity value for a node indicates
its central location in the network [243]. In C1, the 2 shared KEs namely, ‘Activation,
PPARα’ and ‘Thyroperoxidase, Inhibition’, have the maximum eccentricity value of 6
(Figure 4.7; Supplementary Table S4.7). In C2, the shared KE ‘Reduced, Prostaglandin
E2 concentration, hypothalamus’ has the maximum eccentricity value of 8 (Figure 4.8;
Supplementary Table S4.7).
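Eccentricity can be illustrated with a breadth-first search from each node. The sketch assumes that eccentricity is the largest shortest-path length from a node to any node it can reach along edge direction, with unreachable nodes ignored; the treatment of direction and of unreachable nodes in NetworkAnalyzer may differ, so this is an assumption rather than a description of the tool. Names and toy data are hypothetical.

```python
from collections import deque


def eccentricity(adj, source):
    """Largest shortest-path length from source to any node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())


# Toy chain a -> b -> c -> d with a side branch b -> e.
toy_chain = {"a": ["b"], "b": ["c", "e"], "c": ["d"]}
ecc = {node: eccentricity(toy_chain, node) for node in ["a", "b", "c", "d", "e"]}
```

In this toy chain the most upstream node "a" has the largest eccentricity (3), while the terminal nodes "d" and "e" have eccentricity 0 under the stated assumption.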
Afterwards, we assessed the available information in AOP-Wiki for the two LCCs, C1
and C2. For C1, 21 out of the 44 KEs, i.e. nearly 50%, have evidence for human applica-
bility in AOP-Wiki. For C2, however, 46 out of the 48 KEs do not have taxonomic appli-
cability information in AOP-Wiki. Further, C2 contains two pairs of ED-AOPs namely, (i)
AOP:336 and AOP:337, and (ii) AOP:340 and AOP:341, such that the two ED-AOPs in each
pair contain an identical set of MIEs and AOs (Supplementary Table S4.6). Further, the two
ED-AOPs in each pair have most of their KEs in common, and thus,
it may be worthwhile to consider only one ED-AOP in each pair to avoid duplication of
information in the ED-AOP network. Moreover, we find that AOP:28 of C2 contains KEs
Figure 4.5: The directed network for LCC C1 wherein the KEs are colored based on their be-
tweenness centrality values. MIEs, KEs and AOs are shown in distinct shapes namely, diamond,
square and circle, respectively. The shared KEs are marked in ‘red’. For each MIE and AO, the
AOP identifier is displayed in this figure.
Figure 4.6: The directed network for LCC C2 wherein the KEs are colored based on their be-
tweenness centrality values. MIEs, KEs and AOs are shown in distinct shapes namely, diamond,
square and circle, respectively. The shared KEs are marked in ‘red’. For each MIE and AO, the
AOP identifier is displayed in this figure.
Figure 4.7: The directed network for LCC C1 wherein the KEs are colored based on their ec-
centricity values. MIEs, KEs and AOs are shown in distinct shapes namely, diamond, square and
circle, respectively. The shared KEs are marked in ‘red’. For each MIE and AO, the AOP identifier
is displayed in this figure.
Figure 4.8: The directed network for LCC C2 wherein the KEs are colored based on their ec-
centricity values. MIEs, KEs and AOs are shown in distinct shapes namely, diamond, square and
circle, respectively. The shared KEs are marked in ‘red’. For each MIE and AO, the AOP identifier
is displayed in this figure.
such as ‘N/A, Gap’ and ‘N/A, Reproductive failure’. Overall, this highlights disparity and
gaps in available information across AOPs in AOP-Wiki. In sum, the available informa-
tion is more comprehensive for the 12 ED-AOPs in C1 (in comparison to C2). As a result,
the LCC C1 was further investigated to reveal the systems-level perturbations caused by
endocrine-mediated events, the emergence of new paths linking MIEs and AOs, and the
chemical stressors associated with KEs.
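The check for near-duplicate ED-AOP pairs mentioned above (identical MIEs and AOs, and most KEs in common) can be sketched as follows. This is a minimal sketch under stated assumptions: the AOP records are hypothetical, the overlap is measured with the Jaccard index, and the threshold of 0.5 is an arbitrary illustrative choice rather than a criterion used in this chapter.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical AOP records; identifiers and events are placeholders.
toy_aops = {
    "AOP:A": {"mie": {"m1"}, "ao": {"a1"}, "kes": {"m1", "k1", "k2", "k3", "a1"}},
    "AOP:B": {"mie": {"m1"}, "ao": {"a1"}, "kes": {"m1", "k1", "k2", "k4", "a1"}},
    "AOP:C": {"mie": {"m2"}, "ao": {"a1"}, "kes": {"m2", "k5", "a1"}},
}


def near_duplicate_pairs(aops, threshold=0.5):
    """Pairs of AOPs with identical MIE and AO sets and KE overlap >= threshold."""
    pairs = []
    for first, second in combinations(sorted(aops), 2):
        same_ends = (
            aops[first]["mie"] == aops[second]["mie"]
            and aops[first]["ao"] == aops[second]["ao"]
        )
        overlap = jaccard(aops[first]["kes"], aops[second]["kes"])
        if same_ends and overlap >= threshold:
            pairs.append((first, second, round(overlap, 2)))
    return pairs


print(near_duplicate_pairs(toy_aops))
```

Only the pair sharing both end points and a large fraction of KEs is reported; an AOP that merely shares the AO is not flagged.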
ED-AOP network
Human exposure to EDCs can lead to endocrine disruption that in turn can affect various
biological systems. Of late, there is concern regarding an increase in the incidence of
endocrine-mediated disorders linked to reproduction, metabolism, development, nervous
system and immunity in humans and wildlife [43, 50, 51, 244]. To better understand the
systems-level perturbations upon EDC exposure, it is important to investigate the associ-
ated endocrine-mediated events leading to varied adverse outcomes. In this direction, we
have investigated the systems-level perturbations caused by endocrine-mediated events
captured in LCC C1 of the ED-AOP network.
In LCC C1, there are 44 KEs of which 9 are MIEs and 7 are AOs. Notably, 37 out
of these 44 KEs (84%) in C1 were found to be ED-KEs. Depending on the perturbed cell
types, organs or biological processes, we categorized the 44 KEs in C1 into 4 different
systems-level endocrine-mediated perturbations, namely, ‘hepatic’, ‘metabolic’, ‘neuro-
logical’ and ‘reproductive’ (Figure 4.9; Supplementary Table S4.8). This categorization
scheme for the 44 KEs in C1 is similar to the one used for AOs listed in Table 4.2. For
example, the KE titled ‘Increase, Phenotypic enzyme activity’ in AOP-Wiki is associ-
ated with the cellular term ‘hepatocyte’, and thus, the KE is categorized as ‘hepatic’ in
our scheme (Figure 4.9; Supplementary Table S4.8). However, the information on the
perturbed cell types, organs or biological processes is not available in AOP-Wiki for 3
MIEs in C1, namely, ‘Antagonism, Thyroid Receptor’, ‘Activation, Androgen receptor’,
and ‘Activation, Constitutive androstane receptor’, and this prevented the categorization
of these 3 MIEs into any of the 4 different systems-level perturbations (Figure 4.9; Sup-
plementary Table S4.8). In addition, the OECD recommends generalizing some KEs
in terms of their cell or tissue specificity so that they can be linked to different AOPs
(OECD, 2018). Of the remaining 41 KEs in C1, 9, 10, 5, and 17 KEs were categorized
as ‘hepatic’, ‘metabolic’, ‘neurological’ and ‘reproductive’ systems-level perturbations,
respectively (Figure 4.9; Supplementary Table S4.8).
Figure 4.9: The directed network for LCC C1 in the ED-AOP network consisting of 44 KEs
wherein the KEs are colored based on their categorization into 4 systems-level perturbations
namely, hepatic, metabolic, neurological and reproductive. MIEs, KEs and AOs are shown in
distinct shapes namely, diamond, square and circle, respectively. The 19 shared KEs in C1 are
marked in ‘red’. The ‘red’ edges highlight KERs that connect KEs categorized into different
systems-level perturbations. The ‘yellow rectangles’ highlight 3 divergent KEs which serve as
points of divergence from one system to another.
Furthermore, the divergent KE titled ‘Activation, PPARα’ is categorized as ‘hepatic’,
and this KE is immediately upstream of 2 KEs, namely, ‘Decrease, Steroidogenic
acute regulatory protein (STAR)’ and ‘Decrease, Translocator protein (TSPO)’, both
categorized as ‘reproductive’ in C1 (Figure 4.9). There is supporting evidence for these
particular associations between hepatic and reproductive events in the published
literature [249–251]. Finally, the divergent KE titled ‘Thyroid hormone production, Decreased’
is categorized as ‘metabolic’, and this KE is immediately upstream of a KE titled ‘Reduc-
tion, Plasma 17beta-estradiol concentrations’ categorized as ‘reproductive’ in C1 (Figure
4.9), and there is supporting evidence for this particular association on the influence of
thyroid levels on reproductive hormones [252]. Analysis of divergent KEs in the ED-AOP
network can offer insights into links between different systems affected by endocrine dis-
ruption. Furthermore, these points of divergence tend to branch out into multiple down-
stream occurrences, reflecting a strong predictive utility and thus suggesting novel end-
points or assays that might be designed for better chemical risk assessment.
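The identification of divergent KEs described above can be sketched as follows. This is a minimal sketch which assumes that a divergent KE is any KE with at least one outgoing KER to a KE categorized into a different systems-level perturbation. The first three KERs and their categories are those stated in the text; the fourth KER and its downstream KE are hypothetical and are included only to show a same-system edge.

```python
# KERs as (upstream KE, downstream KE). The last entry is hypothetical.
kers = [
    ("Activation, PPARα", "Decrease, Steroidogenic acute regulatory protein (STAR)"),
    ("Activation, PPARα", "Decrease, Translocator protein (TSPO)"),
    ("Thyroid hormone production, Decreased",
     "Reduction, Plasma 17beta-estradiol concentrations"),
    ("Decrease, Translocator protein (TSPO)", "Hypothetical reproductive KE"),
]

# Systems-level category of each KE (last entry is hypothetical).
category = {
    "Activation, PPARα": "hepatic",
    "Decrease, Steroidogenic acute regulatory protein (STAR)": "reproductive",
    "Decrease, Translocator protein (TSPO)": "reproductive",
    "Thyroid hormone production, Decreased": "metabolic",
    "Reduction, Plasma 17beta-estradiol concentrations": "reproductive",
    "Hypothetical reproductive KE": "reproductive",
}


def cross_system_kers(kers, category):
    """KERs whose upstream and downstream KEs fall in different categories."""
    return [(up, down) for up, down in kers if category[up] != category[down]]


def divergent_kes(kers, category):
    """KEs with at least one outgoing KER into a different category."""
    return sorted({up for up, _ in cross_system_kers(kers, category)})


for ke in divergent_kes(kers, category):
    print(ke, "->", category[ke])
```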
4.4 Emergent paths in the ED-AOP network
Since an AOP network contains multiple AOPs connected via shared KEs, new (directed)
paths, other than those in individual AOPs, can emerge between MIEs and AOs belonging
to different AOPs in the corresponding directed network of KEs and KERs. Such emer-
gent paths from MIEs to AOs in an AOP network can also lead to the development of
new stand-alone AOPs [110]. Here, we have investigated the possibility of such emergent
paths between MIEs and AOs in the LCC C1 of the ED-AOP network consisting of 12
ED-AOPs. We have found 4 new paths in the LCC C1 that connect an endocrine-relevant
MIE in one ED-AOP to an endocrine-relevant AO in another ED-AOP (Figure 4.3; Table
4.3).
Of the 4 new paths in C1 (Figure 4.3; Table 4.3), 2 new paths start from the shared
MIE ‘Thyroperoxidase, Inhibition’ (in AOP:42, AOP:119, and AOP:271) and end at the
2 AOs namely, ‘irregularities, ovarian cycle’ (in AOP:7) and ‘impaired, Fertility’ (in
AOP:7, AOP:18, and AOP:64). We find that previously published research supports the
above links, indicating an impact of thyroperoxidase on reproduction [253–256].
Another new path in C1 starts from the MIE ‘reduction in ovarian granulosa cells,
Aromatase (Cyp19a1)’ (in AOP:7) and ends at the AO ‘Reduction, Cumulative fecundity
and spawning’ (in AOP:271). The AO ‘Reduction, Cumulative fecundity and spawning’
in this path relates to the release of eggs or sperm by aquatic animals such as
fishes. Based on previous studies, aromatase appears to play a substantial role in egg
release in both humans [257, 258] and fishes [259, 260]. Lastly, there is
a new path in C1 starting from the MIE ‘Glucocorticoid Receptor Agonist, Activation’
(in AOP:64) and ending at the AO ‘Malformation, Male reproductive tract’ (in AOP:18).
Previous research has shown that the glucocorticoid receptor has an effect on male repro-
duction [261, 262]. These emergent paths identified in LCC C1 of the ED-AOP network
have potential to reveal unknown relationships between distant KEs and may represent
toxicity pathways specific to endocrine disruption. Further, a closer inspection of these
emergent paths may also lead to prediction of unknown adverse effects upon specific EDC
exposure, as well as guide future development of new AOPs.
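The search for emergent paths described in this section can be sketched as follows. This is a minimal sketch under simplifying assumptions: each AOP is modelled as a single linear chain from its MIE to its AO (real AOPs can branch), the AOP identifiers and event names are hypothetical, and an MIE-AO pair is called emergent if the AO is reachable from the MIE in the merged network while no single AOP contains both.

```python
from collections import deque

# Hypothetical AOPs, each given as an ordered chain MIE -> ... -> AO.
toy_aops = {
    "AOP:X": ["MIE_1", "KE_a", "KE_shared", "AO_1"],
    "AOP:Y": ["MIE_2", "KE_shared", "KE_b", "AO_2"],
}


def build_network(aops):
    """Merge all AOP chains into one directed adjacency map of KERs."""
    adjacency = {}
    for chain in aops.values():
        for upstream, downstream in zip(chain, chain[1:]):
            adjacency.setdefault(upstream, set()).add(downstream)
    return adjacency


def reachable(adjacency, source):
    """All nodes reachable from source along directed edges (including source)."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def emergent_pairs(aops):
    """MIE-AO pairs connected in the merged network but absent from any single AOP."""
    adjacency = build_network(aops)
    mies = {chain[0] for chain in aops.values()}
    aos = {chain[-1] for chain in aops.values()}
    pairs = set()
    for mie in mies:
        downstream = reachable(adjacency, mie)
        for ao in aos:
            in_same_aop = any(mie in chain and ao in chain for chain in aops.values())
            if ao in downstream and not in_same_aop:
                pairs.add((mie, ao))
    return sorted(pairs)


print(emergent_pairs(toy_aops))
```

In this toy example, the shared KE links the MIE of each AOP to the AO of the other, which is the same mechanism by which shared KEs give rise to new paths in an AOP network.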
Analyses of direct associations between chemical stressors and KEs in the ED-AOP
network can reveal the diversity of biological mechanisms via which EDCs can cause
different endocrine-mediated adverse effects. To aid ongoing efforts in risk assessment
of EDCs, it will be worthwhile to undertake a future effort to associate all known EDCs,
including 792 potential EDCs in DEDuCT 2.0, to different events in the ED-AOP network.
In sum, a stressor-ED-AOP network can serve as a predictive model for EDCs and their
adverse effects.
4.6 Discussion
An AOP is a systematic framework to encapsulate the existing toxicological information
as a toxicity pathway to aid in risk assessment and chemical regulation [96, 97, 99, 104,
223]. Within AOP-Wiki, the up-to-date central repository of individual AOPs, AOP
networks have emerged due to the sharing of KEs and KERs across individual AOPs. Since
AOP networks are expected to be the functional units for prediction in real-world scenar-
ios, there is notable interest in the derivation and analysis of AOP networks tailored to
address specific problems or applications [107, 110, 228].
The challenges in the risk assessment and regulation of EDCs partially stem from
the existing knowledge gaps in linking chemical exposure to diverse adverse outcomes
[3, 218, 244]. To address this challenge, a blueprint of the endocrine disruption mech-
anisms in the form of toxicity pathways spanning different levels of biological organi-
zation can be invaluable [234]. In this context, the development of a comprehensive
AOP network relevant to endocrine disruption (i.e., an ED-AOP network) can aid ongo-
ing research and policy framing surrounding EDCs. In this chapter, we have developed
a detailed workflow (Figure 4.1) to leverage information in AOP-Wiki and construct a
comprehensive ED-AOP network (Figure 4.2; Table 4.1). Ensuing graph-theoretic analy-
sis of this ED-AOP network of 48 ED-AOPs, and in particular, its largest components C1
and C2 of 12 ED-AOPs each, reveals several mechanistic insights on endocrine-mediated
perturbations upon chemical exposure.
Since AOP development is a continuous and iterative exercise, the ED-AOP
network constructed in this chapter is appreciably limited by the existing knowledge in
AOP-Wiki. As AOPs are living documents, it will be important to keep the ED-AOP
network up to date with any expansion in AOP-Wiki. Such updates could change the results of
the graph-theoretic analysis, which reflects the bias of the existing data. For example, the key
events with the highest betweenness values could reflect important control points in the
ED-AOP network, but they could also reflect the most frequently investigated events rather than a
biological reality. Another significant limitation is the choice of criteria for filtration of
ED-KEs, where we used endocrine-relevant keywords such as glands, hormones, hor-
monal receptors, endocrine disorders, and endpoints specific to humans or rodents. As a
result, the majority of ED-AOPs used to construct the ED-AOP network may be confined
to these organisms. We expect that the detailed workflow in Figure 4.1 can be used, with little or
no modification, for any future update of the ED-AOP network. Moreover,
the current information in AOP-Wiki on chemical stressors associated with events in the
ED-AOP network is a small fraction of the existing knowledge on potential EDCs in the
published literature [35, 36], and therefore, it will be important to invest future efforts to-
wards developing a comprehensive stressor-ED-AOP network wherein all known EDCs
are linked to different events in the ED-AOP network. In sum, the ED-AOP network
provides an overall landscape of potential adverse outcomes associated with EDC exposure,
allowing for the identification of important biological events that are relevant for better
risk assessment.
Supplementary Information
Supplementary Tables S4.1-S4.9 associated with this chapter are available for download
from the GitHub repository:
https://github.com/asamallab/PhDThesis-Janani_R/blob/main/SI/ST_Chapter4.xlsx.
S. No. | AOP identifier | AOP title | Fraction of ED-KEs (%) | Cumulative WoE | Human WoE
1 | 6 | Antagonist binding to PPARα leading to body-weight loss | 62.5 | High | High
2 | 7 | Aromatase (Cyp19a1) reduction leading to impaired fertility in adult female | 100 | High | Low
3 | 12 | Chronic binding of antagonist to N-methyl-D-aspartate receptors (NMDARs) during brain development leads to neurodegeneration with impairment in learning and memory in aging | 87.5 | Moderate | Low
4 | 13 | Chronic binding of antagonist to N-methyl-D-aspartate receptors (NMDARs) during brain development induces impairment of learning and memory abilities | 90 | Low | High
5 | 18 | PPARα activation in utero leading to impaired fertility in males | 87.5 | Moderate | Low
6 | 19 | Androgen receptor antagonism leading to adverse effects in the male foetus (mammals) | 100 | - | -
7 | 28 | Cyclooxygenase inhibition leading reproductive failure | 66.7 | Moderate | -
8 | 36 | Peroxisomal Fatty Acid Beta-Oxidation Inhibition Leading to Steatosis | 75 | High | -
9 | 37 | PPARα activation leading to hepatocellular adenomas and carcinomas in rodents | 80 | High | -
10 | 41 | Sustained AhR Activation leading to Rodent Liver Tumours | 100 | High | -
11 | 42 | Inhibition of Thyroperoxidase and Subsequent Adverse Neurodevelopmental Outcomes in Mammals | 62.5 | High | High
12 | 43 | Disruption of VEGFR Signaling Leading to Developmental Defects | 60 | High | Moderate
13 | 60 | NR1I2 (Pregnane X Receptor, PXR) activation leading to hepatic steatosis | 58.3 | High | -
14 | 62 | AKT2 activation leading to hepatic steatosis | 100 | - | -
15 | 63 | Cyclooxygenase inhibition leading to reproductive dysfunction | 80 | Moderate | Low
16 | 64 | Glucocorticoid Receptor (GR) Mediated Adult Leydig Cell Dysfunction Leading to Decreased Male Fertility | 100 | - | -
17 | 100 | Cyclooxygenase inhibition leading to reproductive dysfunction via inhibition of female spawning behavior | 57.1 | Moderate | -
18 | 101 | Cyclooxygenase inhibition leading to reproductive dysfunction via inhibition of pheromone release | 57.1 | High | -
19 | 102 | Cyclooxygenase inhibition leading to reproductive dysfunction via interference with meiotic prophase I/metaphase I transition | 80 | High | Low
20 | 103 | Cyclooxygenase inhibition leading to reproductive dysfunction via interference with spindle assembly checkpoint | 80 | High | Low
21 | 107 | Constitutive androstane receptor activation leading to hepatocellular adenomas and carcinomas in the mouse and the rat | 80 | High | -
22 | 110 | Inhibition of iodide pump activity leading to follicular cell adenomas and carcinomas (in rat and mouse) | 100 | - | -
23 | 111 | Decrease in androgen receptor activity leading to Leydig cell tumors (in rat) | 100 | - | -
24 | 112 | Increased dopaminergic activity leading to endometrial adenocarcinomas (in Wistar rat) | 100 | - | -
25 | 117 | Androgen receptor activation leading to hepatocellular adenomas and carcinomas (in mouse and rat) | 75 | - | -
26 | 118 | Chronic cytotoxicity leading to hepatocellular adenomas and carcinomas (in mouse and rat) | 75 | - | -
27 | 119 | Inhibition of thyroid peroxidase leading to follicular cell adenomas and carcinomas (in rat and mouse) | 100 | - | -
28 | 120 | Inhibition of 5α-reductase leading to Leydig cell tumors (in rat) | 100 | - | -
29 | 124 | HMG-CoA reductase inhibition leading to decreased fertility | 83.3 | - | -
30 | 164 | Beta-2 adrenergic agonist activity leading to mesovarian leiomyomas in the rat and mouse | 66.7 | High | -
31 | 167 | Early-life estrogen receptor activity leading to endometrial carcinoma in the mouse. | 71.4 | High | -
32 | 201 | Juvenile hormone receptor agonism leading to male offspring induction associated population decline | 50 | - | -
33 | 203 | 5-hydroxytryptamine transporter inhibition leading to decreased reproductive success and population decline | 37.5 | - | -
34 | 204 | 5-hydroxytryptamine transporter inhibition leading to increased reproductive success and population increase | 37.5 | - | -
35 | 205 | AOP from chemical insult to cell death | 50 | High | -
36 | 212 | Histone deacetylase inhibition leading to testicular atrophy | 66.7 | Moderate | Moderate
37 | 216 | Excessive reactive oxygen species production leading to population decline via follicular atresia | 85.7 | - | -
38 | 220 | Cyp2E1 Activation Leading to Liver Cancer | 80 | High | Moderate
39 | 232 | NFE2/Nrf2 repression to steatosis | 87.5 | - | -
40 | 271 | Inhibition of thyroid peroxidase leading to impaired fertility in fish | 80 | High | -
41 | 293 | Increased DNA damage leading to increased risk of breast cancer | 66.7 | Moderate | -
42 | 299 | Excessive reactive oxygen species production leading to population decline via reduced fatty acid beta-oxidation | 62.5 | - | -
43 | 300 | Thyroid Receptor Antagonism and Subsequent Adverse Neurodevelopmental Outcomes in Mammals | 40 | Moderate | High
44 | 306 | Androgen receptor (AR) antagonism leading to short anogenital distance (AGD) in male (mammalian) offspring | 100 | High | Moderate
45 | 336 | DNA methyltransferase inhibition leading to population decline (1) | 57.1 | Moderate | -
46 | 337 | DNA methyltransferase inhibition leading to population decline (2) | 62.5 | Moderate | -
47 | 340 | DNA methyltransferase inhibition leading to transgenerational effects (1) | 62.5 | Moderate | -
48 | 341 | DNA methyltransferase inhibition leading to transgenerational effects (2) | 66.7 | Moderate | -
Table 4.1: The curated subset of 48 ED-AOPs among the 161 high-confidence AOPs filtered from
AOP-Wiki. The table also gives the fraction of ED-KEs, the cumulative WoE score, and the WoE
score for human applicability (Human WoE) for each of the 48 ED-AOPs.
S. No. | Component identifier | AO | Systems-level perturbation
1 | C1 | Increase, hepatocellular adenomas and carcinomas | Hepatic
2 | C1 | Increase, Adenomas/carcinomas (follicular cell) | Metabolic
3 | C1 | Cognitive Function, Decreased | Neurological
4 | C1 | impaired, Fertility | Reproductive
5 | C1 | irregularities, ovarian cycle | Reproductive
6 | C1 | Reduction, Cumulative fecundity and spawning | Reproductive
7 | C1 | Malformation, Male reproductive tract | Reproductive
8 | C2 | Decrease, Population trajectory | Reproductive
9 | C2 | Decrease, Fecundity | Reproductive
10 | C2 | Decrease, Fecundity (F3) | Reproductive
11 | C2 | N/A, Reproductive failure | Reproductive
12 | C3 | Increased, Liver Steatosis | Hepatic
13 | C4 | Increased, Male offspring | Reproductive
14 | C4 | Decreased, Reproductive Success | Reproductive
15 | C4 | Increased, Reproductive Success | Reproductive
16 | C5 | Impairment, Learning and memory | Neurological
17 | C6 | Increase, Leydig cell tumors | Reproductive
18 | C7 | Apoptosis | -
19 | C7 | Testicular atrophy | Reproductive
Table 4.2: The list of AOs in the 7 connected components of the ED-AOP network and their cate-
gorization into 4 systems-level endocrine-mediated perturbations, namely, ‘hepatic’, ‘metabolic’,
‘neurological’ and ‘reproductive’, depending on the perturbed biological processes.
S. No. | MIE | AO
1 | Thyroperoxidase, Inhibition | irregularities, ovarian cycle
2 | Thyroperoxidase, Inhibition | impaired, Fertility
3 | reduction in ovarian granulosa cells, Aromatase (Cyp19a1) | Reduction, Cumulative fecundity and spawning
4 | Glucocorticoid Receptor Agonist, Activation | Malformation, Male reproductive tract
Table 4.3: The table gives information on the starting MIE and the ending AO for each of the 4
new paths identified in the LCC C1 of the ED-AOP network.
Chapter 5
Despite these limitations, there have been some efforts to compile potential
neurotoxicants with evidence specific to mammals from published literature [57, 58, 60–62].
Although the lists of potential neurotoxicants compiled by Grandjean and Landrigan [58],
Mundy et al. [61], and Aschner et al. [62] are available via the CompTox dashboard [265],
there is no dedicated online resource to date on environmental neurotoxicants. In this
chapter, we address this unmet need by building the first dedicated online knowledgebase,
namely, NeurotoxKb 1.0 [38], which compiles 475 potential non-biogenic neurotoxicants
with published evidence specific to mammals. The work reported in this chapter is
contained in the published manuscript [38].
Firstly, we considered the 802 potential neurotoxicants compiled in the US EPA report [60]
on neurotoxic chemicals published in 1976. From published literature, the US EPA
report had compiled 802 chemicals tested for neurotoxic effects upon exposure in various
living organisms including mammals and non-mammals [60]. Secondly, we have
considered 214 potential neurotoxicants compiled by Grandjean and Landrigan [57, 58] to
which humans are vulnerable upon exposure in early stages of development. For compil-
ing their list, Grandjean and Landrigan [57, 58] had employed PubMed literature mining
and toxicological resources such as TOXNET [266,267], TOXLINE [268] and Hazardous
Figure 5.1: Schematic workflow describing the compilation of 475 potential non-biogenic neuro-
toxicants along with published evidence of observed neurotoxic endpoints specific to mammals.
Substances Data Bank (HSDB) [269]. Note that these toxicological resources have been
integrated into other NLM databases since 2019 [267]. Grandjean and Landrigan
had first published a list of 201 potential human neurotoxicants in 2006 [58], which
they subsequently expanded to 214 potential human neurotoxicants in 2014 [57]. Thirdly,
we have considered the 97 potential neurotoxicants compiled by Mundy et al. [61, 270]
that have demonstrated effects on neurodevelopment. Fourthly, we have considered the
33 potential neurotoxicants compiled by Aschner et al. [62, 271] that have evidence of
triggering developmental neurotoxicity in vivo.
Next, we mapped the 802, 214, 97 and 33 potential neurotoxicants compiled from the
US EPA report [60], Grandjean and Landrigan [57], Mundy et al. [61] and Aschner et
al. [62], respectively, to chemical identifiers in standard databases such as PubChem [86],
CAS [164] and CTD [30]. While mapping the potential neurotoxicants to their chemical
structures, we have removed any potential neurotoxicant in the four lists that could not be
mapped to a chemical identifier or that represents a chemical mixture rather than an individual
chemical entity. This resulted in a non-redundant list of 742 potential neurotoxicants
compiled from the four above-mentioned resources (Figure 5.1).
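The mapping and de-duplication step described above can be sketched as follows. This is a minimal sketch with hypothetical records: the resource names, chemical names and identifiers are placeholders, and the field names `identifier` and `is_mixture` are assumptions made for illustration. It does not query PubChem, CAS or CTD.

```python
# Hypothetical records: names and identifiers are placeholders, not real chemicals.
source_lists = {
    "resource_1": [
        {"name": "chemical A", "identifier": "ID-001"},
        {"name": "chemical B", "identifier": "ID-002"},
        {"name": "unresolved name", "identifier": None},  # could not be mapped
    ],
    "resource_2": [
        {"name": "chemical A (synonym)", "identifier": "ID-001"},
        {"name": "technical mixture", "identifier": "ID-003", "is_mixture": True},
    ],
}


def build_nonredundant_list(source_lists):
    """Merge resource lists by identifier, dropping unmapped entries and mixtures.

    Returns a dict mapping each retained identifier to the set of resources
    in which it occurs.
    """
    merged = {}
    for source, records in source_lists.items():
        for record in records:
            identifier = record.get("identifier")
            if identifier is None or record.get("is_mixture", False):
                continue
            merged.setdefault(identifier, set()).add(source)
    return merged


merged = build_nonredundant_list(source_lists)
for identifier in sorted(merged):
    print(identifier, sorted(merged[identifier]))
```

Keying the merged list on the chemical identifier rather than the chemical name ensures that synonyms reported by different resources collapse into a single entry.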
Next, from the non-redundant list of 742 potential neurotoxicants compiled from
the four above-mentioned resources, we have removed all chemicals of biological
origin such as snake venoms, plant or microbial toxins, and hormones. This removal
of potential biogenic neurotoxins is motivated by our exclusive focus on human-made
environmental neurotoxicants. This resulted in a list of 610 potential non-biogenic neu-
rotoxicants compiled from the four above-mentioned resources (Figure 5.1).
In summary, we have compiled from four existing resources, a curated list of 610
potential non-biogenic neurotoxicants along with their two-dimensional (2D) and three-
dimensional (3D) chemical structure information via the above-mentioned steps in our
workflow (Figure 5.1).
Firstly, we have compiled from the US EPA report [60] the observed neurotoxic
endpoints for potential non-biogenic neurotoxicants along with the information on test
organisms including mammals and non-mammals in the published experimental studies.
Note that the US EPA report [60] also compiles observations of no neurotoxic effects
for potential neurotoxicants from published experimental studies.
Secondly, Mundy et al. [61] and Aschner et al. [62] have compiled potential develop-
mental neurotoxicants along with the information on their observed neurotoxic endpoints
from published experimental studies in rodents and primates. However, the compilation
of neurotoxic endpoints in Mundy et al. [61] and Aschner et al. [62] is much less
detailed in comparison to the US EPA report [60]. Specifically, Mundy et al. [61] have
reported the neurotoxic endpoints from published studies after their broad categorization
into 3 terms, namely, behaviour, morphology, and neurochemistry. Similarly, Aschner et
al. [62] have reported the neurotoxic endpoints from published studies after their broad
categorization into 40 terms. However, we believe that a detailed compilation of neuro-
toxic endpoints for potential neurotoxicants from published studies specific to mammals
can render a valuable toxicological resource that can aid in early identification and reg-
ulation of hazardous chemicals. Therefore, we have performed a manual curation of the
287 published studies compiled by Mundy et al. [61] and Aschner et al. [62] to collect
detailed neurotoxic endpoints for potential non-biogenic neurotoxicants covered by the
two resources.
Thirdly, Grandjean and Landrigan [57, 58] have compiled a list of chemicals poten-
tially toxic to the human nervous system from published literature. However, Grandjean
and Landrigan [57, 58] have not compiled the observed neurotoxic endpoints for the po-
tential neurotoxicants from associated published literature. Therefore, we have performed
an extensive manual curation effort to compile the observed neurotoxic effects specific to
humans from HSDB [269] for the potential neurotoxicants in the list by Grandjean and
Landrigan [57,58]. Note that HSDB [269] (which has been integrated into PubChem [86])
was used by Grandjean and Landrigan [57, 58] to compile their list of 214 potential hu-
man neurotoxicants. During this manual curation effort, we were unable to gather exper-
imental evidence specific to mammals from HSDB [269] for some of the 214 potential
human neurotoxicants in the list by Grandjean and Landrigan [57, 58]. For such potential
neurotoxicants in the list by Grandjean and Landrigan [57, 58] without any documented
evidence of neurotoxicity in HSDB [269], we performed additional literature searches to
gather any published evidence of neurotoxicity specific to mammals.
At the end of the above-mentioned steps to compile observed neurotoxic endpoints
specific to mammals for 610 potential non-biogenic neurotoxicants from existing
resources [57, 60–62], HSDB [269] and published literature, we were able to gather
published experimental evidence specific to mammals for only 475 out of 610 potential
non-biogenic neurotoxicants (Figure 5.1; Supplementary Table S5.1). These 475 potential
non-biogenic neurotoxicants with experimental evidence specific to mammals from 835
published articles have been compiled in our environmental Neurotoxicants Knowledgebase,
namely NeurotoxKb 1.0 [38], which is accessible at: http://cb.imsc.res.in/neurotoxkb.
Of these 475 identified potential neurotoxicants in NeurotoxKb 1.0 [38], the US EPA
report [60], Grandjean and Landrigan [57], Mundy et al. [61] and Aschner et al. [62]
capture 292, 178, 88 and 26 potential neurotoxicants, respectively, with published evi-
dence specific to mammals (Figure 5.2A). Notably, among the four existing resources,
the US EPA report [60] contributes a unique set of 231 out of the 475 potential neurotoxi-
cants (∼ 50%) compiled in NeurotoxKb 1.0 with published evidence specific to mammals
(Figure 5.2A). In other words, almost 50% of the potential neurotoxicants specific to
mammals in NeurotoxKb 1.0 were solely identified due to our extensive manual effort
to digitize, compile, curate and organize the vast information on potential neurotoxicants
captured in the US EPA report [60] published in 1976. Notably, the US EPA report [60]
contributes a unique set of 414 out of the 835 published articles (∼ 50%) compiled in
NeurotoxKb 1.0 that provide mammalian-specific evidence on potential neurotoxicants.
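The notion of a unique contribution used above (chemicals captured by one resource and by none of the others) can be sketched with set operations. This is a minimal sketch in which the resource names and chemical labels are hypothetical; only the counts 231 of 475 and 414 of 835 in the last two lines are taken from the text.

```python
def unique_contributions(resources):
    """For each resource, the items that occur in no other resource."""
    unique = {}
    for name, items in resources.items():
        others = set().union(
            *(other_items for other, other_items in resources.items() if other != name)
        )
        unique[name] = items - others
    return unique


# Hypothetical resources and chemical labels, for illustration only.
toy_resources = {
    "resource_1": {"c1", "c2", "c3"},
    "resource_2": {"c3", "c4"},
    "resource_3": {"c4"},
}

unique = unique_contributions(toy_resources)
for name in sorted(unique):
    print(name, sorted(unique[name]))

# Percentages corresponding to the counts reported in the text.
chemical_share = round(100 * 231 / 475, 1)
article_share = round(100 * 414 / 835, 1)
print(chemical_share, article_share)
```

Both shares evaluate to slightly under 50%, consistent with the approximate values quoted above.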
Information on the major sources of exposure is vital for chemical regulation and moni-
toring by agencies. Therefore, we have compiled the environmental sources for the 475
potential neurotoxicants in NeurotoxKb 1.0. Specifically, NeurotoxKb 1.0 has classified
the 475 potential neurotoxicants into 6 broad categories of environmental sources, namely,
‘Agriculture and Farming’, ‘Consumer Products’, ‘Industry’, ‘Intermediates’, ‘Medicine
and Healthcare’, and ‘Pollutant’, and 41 sub-categories (Figure 5.3). It can be seen that
the majority of the 475 potential neurotoxicants are in the category ‘Agriculture and Farming’,
which is followed by ‘Industry’ (Figure 5.3) [38].
Furthermore, we have also classified the 475 potential neurotoxicants in NeurotoxKb 1.0
based on their chemical structure. Specifically, we have employed ClassyFire [173, 174]
for a hierarchical chemical classification into kingdom, super-class, class and subclass. Of
the 475 potential neurotoxicants, 430 are organic while 45 are inorganic (Figure 5.2B).
Moreover, the largest group (100) of the 475 potential neurotoxicants belongs to the chemical
super-class ‘Benzenoids’ (Figure 5.2B) [38]. Note that information on the chemical class of
potential neurotoxicants can be used to draw inferences on their nature and behaviour.
[Figure 5.2, panels A to E; labels recovered from the figure follow.
(A) Venn diagram set sizes: US EPA report (1976), 292; Grandjean and Landrigan (2014), 178; Mundy et al. (2015), 88; Aschner et al. (2017), 26.
(B) Sunburst plot kingdoms: Organic compounds (430); Inorganic compounds (45).
(C) Venn diagram set sizes: High Production Volume Neurotoxicants (136); Neurotoxicants in use (194); Neurotoxicants of concern (279).
(D) Bar chart axes: Category of Exposome versus Number of neurotoxicants.
(E) Bar chart axes: Biospecimen versus Number of neurotoxicants. Values: Adipose tissue 10; Amniotic fluid 12; Blood 63; Bone 3; Brain 10; Breast milk 30; Cord blood 35; Follicular fluid 10; Hair 17; Hand 4; Heart 2; Kidney 1; Liver 2; Lung 2; Mouth 2; Muscle 1; Nail 11; Pancreas 1; Pituitary gland 1; Placenta 17; Saliva 11; Semen 4; Skin 3; Spinal cord 2; Sweat 2; Thyroid gland 1; Tooth 8; Umbilical cord 7; Urinary bladder 1; Urine 68; Urothelial cells 1.]
Figure 5.2: (A) Venn diagram showing the occurrence of the 475 potential neurotoxicants
compiled in NeurotoxKb 1.0 across four existing resources, namely, the US EPA report (1976),
Grandjean and Landrigan (2014), Mundy et al. (2015), and Aschner et al. (2017).
(B) Sunburst plot showing the hierarchical classification of the 475 potential neurotoxicants
into 2 chemical kingdoms and 20 chemical super-classes. The number of potential neurotoxicants
in each kingdom or super-class is indicated in parentheses. (C) Venn diagram showing the
overlap between the sets of potential neurotoxicants present in Substances in use (SIU) lists,
Substances of concern (SOC) lists, and High production volume (HPV) lists. Here, the potential
neurotoxicants present in SIU lists and SOC lists are labeled as ‘Neurotoxicants in use’ and
‘Neurotoxicants of concern’, respectively. (D) Presence of the 475 potential neurotoxicants
across chemical lists categorized into 8 exposome categories, namely, Children’s exposome,
Dietary exposome, External environmental exposome, Indoor-specific exposome, Miscellaneous
external exposome, Occupational exposome, Pesticide/biocide exposome, and Skin-specific
exposome. This plot displays two bars for each exposome category: one bar gives the number of
neurotoxicants present in that exposome, while the other bar gives the number of those
neurotoxicants that are produced in high volume. (E) Bar chart showing the occurrence of the
475 potential neurotoxicants in NeurotoxKb 1.0 across 31 different human biospecimens.
[Figure 5.3: diagram of environmental source categories. Category labels recovered from the figure: Agricultural and farming (198); Consumer products (163); Industry (196); Intermediates (61); Pollutant (49).]
Figure 5.3: Classification of the 475 potential neurotoxicants in NeurotoxKb 1.0 into 6 broad
categories and 41 sub-categories based on their environmental source. The number of potential
neurotoxicants in each category or sub-category is shown in parentheses beside it. Note that a
potential neurotoxicant can belong to more than one category or sub-category of environmental
sources.
tion of the compiled information in NeurotoxKb 1.0 is facilitated by Cytoscape.js [193],
Google Charts [191] and Plotly [276]. NeurotoxKb 1.0 is hosted on an Apache [196]
webserver running on Debian 9.4 Linux Operating System. Using the web interface of
NeurotoxKb 1.0, users can access detailed information on any of the potential neurotoxi-
cants via search or browse options (Figure 5.4).
sources on neurotoxicants
Table 5.1 presents a comparison of our resource, NeurotoxKb 1.0, with the four existing
resources, namely, the US EPA report [60], Grandjean and Landrigan [57], Mundy et
al. [61] and Aschner et al. [62] on potential neurotoxicants. From this table, it is evident
that NeurotoxKb 1.0 [38] will be a valuable resource for future research and monitoring
of neurotoxicants due to several additional features in comparison to existing resources.
Figure 5.4: The web interface of NeurotoxKb. (A) The screenshot displays the home page of
NeurotoxKb 1.0. NeurotoxKb 1.0 has options to search and retrieve information on potential neu-
rotoxicants. (B) Simple search to retrieve potential neurotoxicants using their chemical names or
identifiers. (C) Physicochemical filter to retrieve potential neurotoxicants based on their physic-
ochemical properties. (D) Chemical similarity filter to retrieve potential neurotoxicants that are
structurally similar to a query compound. NeurotoxKb 1.0 also has options to browse information
on potential neurotoxicants based on their (E) Environmental source classification, (F) Chemi-
cal classification, (G) Presence in chemical regulation or guideline, and (H) Presence in human
biospecimen.
classified into 8 categories of exposomes, namely, ‘Children’s exposome’, ‘Dietary expo-
some’, ‘External environmental exposome’, ‘Indoor-specific exposome’, ‘Occupational
exposome’, ‘Pesticide/biocide exposome’, ‘Skin-specific exposome’ and ‘Miscellaneous
external exposome’ (Figure 5.5; Supplementary Table S5.3), and these contribute to the
total external exposome of humans.
We find that 311 potential neurotoxicants in NeurotoxKb 1.0 are present in at least
one of the 55 chemical lists (Supplementary Table S5.4). Figure 5.2C shows the distribu-
tion of these 311 potential neurotoxicants across SIU, SOC and HPV lists. Notably, 162
potential neurotoxicants are present in both SIU and SOC lists, and further, 105 of these
162 potential neurotoxicants are also produced in high volume (Figure 5.2C). Among
the 311 potential neurotoxicants present in at least one of the 55 chemical lists, Ethylene
oxide is present in the maximum number (24) of lists which includes both SIU and SOC
lists (Supplementary Table S5.4) [38]. Published literature on Ethylene oxide has clearly
documented experimental evidence on its neurotoxicity, and humans are mainly exposed
to this neurotoxicant via occupational exposure [277, 278].
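The list-membership analysis above reduces to counting, for each chemical, how many lists contain it, and the total of 311 follows from inclusion-exclusion on the SIU and SOC set sizes. The following is a minimal sketch under stated assumptions: the function name `membership_counts`, the list names and the members are hypothetical toy data, while the set sizes 194 and 279 are the ones labelled in Figure 5.2C.

```python
from collections import Counter


def membership_counts(chemical_lists):
    """Count, for each chemical, the number of lists that contain it."""
    counts = Counter()
    for members in chemical_lists.values():
        counts.update(set(members))  # set() guards against duplicates in a list
    return counts


# Toy data: hypothetical list names and members, not the 55 lists of the thesis.
toy_lists = {
    "SIU list 1": {"chem_a", "chem_b"},
    "SIU list 2": {"chem_a"},
    "SOC list 1": {"chem_a", "chem_c"},
}
counts = membership_counts(toy_lists)
top_chemical, top_count = counts.most_common(1)[0]

# Inclusion-exclusion with the set sizes labelled in Figure 5.2C:
# 194 in SIU lists, 279 in SOC lists, 162 in both.
in_any_list = 194 + 279 - 162
```

With the real data, the chemical returned by `most_common(1)` would correspond to the role played by Ethylene oxide in the text.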
Investigation of the presence of the 475 potential neurotoxicants across chemical
lists categorized into 8 exposome categories revealed that 166 potential neurotoxicants
in NeurotoxKb 1.0 are present in the dietary exposome, specifically as food additives,
[Figure 5.5: Sankey plot linking each of the 55 chemical lists (L1 to L55) to Substances in use (SIU) or Substances of concern (SOC) and to the 8 exposome categories. Column headers in the figure: Number of chemicals; Number of neurotoxicants. The figure legend follows.]
Substances in use (SIU) Lists
L1 ESCO list of non-plastic food contact materials
L2 EU food flavorings database
L3 EU lists of Food Additives
L4 EU plastic food packaging materials
L5 FDA TOR Notices
L6 FooDB
L7 Pew list of food additives
L8 Substances added to food (EAFUS)
L9 The Joint FAO/WHO Expert Committee on Food Additives (JECFA) list
L10 US FDA Indirect Additives used in Food Contact Substances
L11 WHO Codex General Standards for Food Additives
L12 Consumer product ingredient database
L13 Active ingredients allowed in minimum risk pesticide products
L14 ECHA biocidal products
L15 EU list of colorants allowed in cosmetic products
L16 EU list of preservatives allowed in cosmetic products
L17 EU list of UV filters allowed in cosmetic products
L18 IFRA transparency list
L19 Production of major chemicals year-wise in India
L20 US EPA safer chemical ingredients list
L21 US FDA inactive ingredient list
Substances of concern (SOC) Lists
L22 Chemicals of concern in plastic toys
L23 Danish EPA Sensitizing Fragrances in Children’s Articles
L24 EU Toy Safety Directive
L25 Washington State Children’s Safe Product Act
L26 EU substances subject to POPs Regulation
L27 EU Union-wide Monitoring Watchlist
L28 EU Water Framework Priority Substances
L29 EWG tap water database
L30 Human Indoor Exposome database
L31 NPI Australia
L32 OSPAR List of Substances of Possible Concern
L33 Ozone-depleting substances in India
L34 Singapore list of controlled hazardous substances
L35 US OSHA list
L36 List of banned and restricted pesticide products in China
L37 List of banned pesticides in India
L38 Pesticide Action Network (PAN) International List of Highly Hazardous Pesticides
L39 EU list of substances prohibited in cosmetic products
L40 California Proposition 65 (CP65)
L41 ECHA list of chemicals in Annex I
L42 ECHA PBT assessment list
L43 IARC monographs on carcinogens
L44 NZ EPA priority chemical list
L45 PACSs list Japan
L46 Restricted substances under REACH
L47 Schedule 1 hazardous chemical list in India
L48 Schedule 3 hazardous chemical list in India
L49 SIN List
L50 SVHC under REACH
L51 Toxic chemicals restricted to be imported or exported in China
L52 US NTP Report on Carcinogens
L53 EU Community rolling action plan (CoRAP)
L54 European Trade Union Priority List
L55 EU ECHA public activities coordination tool (PACT) PBTs
Figure 5.5: Sankey plot displaying the 55 chemical lists considered for comparative
analysis, which are part of chemical inventories, regulations and guidelines. These lists were
broadly classified into two categories, namely, Substances in use (SIU) and Substances of concern
(SOC), based on the nature of substances. Further, these lists have also been classified into 8
categories of exposome, namely, Children’s exposome, Dietary exposome, External environmental
exposome, Indoor-specific exposome, Miscellaneous external exposome, Occupational exposome,
Pesticide/biocide exposome, and Skin-specific exposome, based on the route or source of
exposure. Beside each chemical list, the total number of chemicals and the number of potential
neurotoxicants present in that list are shown in parentheses.
food packaging materials and food contact substances (Figure 5.2D). For example, the
Pew list of food additives (L7) contains 88 potential neurotoxicants (Figure 5.5; Sup-
plementary Table S5.4). Further analysis of the SIU lists classified as Indoor-specific
exposome, Pesticide/biocide exposome, Skin-specific exposome or Miscellaneous exter-
nal exposome found the presence of several potential neurotoxicants compiled in Neu-
rotoxKb 1.0 (Supplementary Table S5.4). In other words, we find that several potential
neurotoxicants compiled in NeurotoxKb 1.0 are in regular use [38]. An analysis of the
SOC lists classified as Children’s exposome, Occupational exposome, Pesticide/biocide
exposome, Skin-specific exposome, External environmental exposome or Miscellaneous
external exposome found that several potential neurotoxicants compiled in NeurotoxKb
1.0 are also subject to chemical regulations worldwide [38].
To highlight the possible implications from this exploratory analysis of the presence
of potential neurotoxicants across 55 chemical lists including inventories, regulations and
guidelines, we next focus on chemical lists classified into a single category of external
exposome, namely, Children’s exposome. As neurotoxicants can cause permanent or ir-
reversible damage to neuronal systems [55, 56], it is important to monitor and regulate
their exposure to developing children. For this focused analysis, we considered 4 SOC
lists namely, Chemicals of concern in plastic toys (L22), Danish EPA Sensitizing Fra-
grances in Children’s Articles (L23), EU Toy Safety Directive (L24), and Washington
State Children’s Safe Product Act (L25), which contain chemicals prohibited or restricted
in children-related consumer products. We find that 34 potential neurotoxicants compiled
in NeurotoxKb 1.0 are present in the lists pertaining to the Children’s exposome, and of these,
30 potential neurotoxicants are also produced in high volume as they are present in HPV
lists (Supplementary Table S5.4). Our observations are indicative of the extent to which
these chemicals have been, or are currently being, used in children-related products. These
30 potential neurotoxicants warrant further attention and dedicated monitoring strategies
to prevent exposure of children (Supplementary Table S5.4) [38].
biospecimens
Exposome refers to the totality of exposures during the lifetime of an individual and their
associated health effects [13, 18–20]. Note that the presence of any potential neurotoxicant
in a human biospecimen presents conclusive proof of human exposure and is also
indicative of its potential to affect the nervous system. In this work, we have explored the
presence of 475 potential neurotoxicants in human biospecimens using compiled data in
two resources, namely, the Exposome-Explorer [24] and CTD [30].
We find that 91 potential neurotoxicants were detected in at least one of the 31 human
biospecimens (Figure 5.2E; Supplementary Table S5.5). Among the 91 potential neurotoxicants
detected in human biospecimens, Arsenic was detected in the maximum number
(16) of human biospecimens. Among the 31 human biospecimens, urine and blood contained the
largest numbers of potential neurotoxicants, 68 and 63, respectively (Figure 5.2E) [38].
cants
An exploration of the current chemical regulations and guidelines enabled us to better un-
derstand the route and likelihood of human exposure to potential neurotoxicants in their
lifetime. We next decided to explore the utility of our resource NeurotoxKb 1.0 in aiding
prioritization of potential neurotoxicants. For this purpose, we have analyzed the presence
of the 475 potential neurotoxicants compiled in NeurotoxKb 1.0 across the following lists:
1. Two lists of high production volume (HPV) chemicals, namely, the USHPV database
and the OECD HPV list. These lists enable us to identify potential neurotoxicants that are
extensively manufactured, and thus, have a high likelihood of human exposure.
2. List of substances of very high concern (SVHC) under Registration, Evaluation, Autho-
risation and Restriction of Chemicals (REACH) regulation of the European Union (EU).
SVHC includes chemicals based on their potential to be: (i) Carcinogenic, Mutagenic,
toxic to Reproduction (CMR), (ii) disruptive to the endocrine system, (iii) Persistent,
Bioaccumulative and Toxic (PBT), and (iv) very Persistent and very Bioaccumulative
(vPvB).
Table 5.2 gives the list of 18 potential neurotoxicants in NeurotoxKb 1.0 that are also
present in both HPV and SVHC lists. Being registered as SVHC, these 18 chemicals
are monitored and phased out where necessary, under stringent controls in the EU. These
18 chemicals are associated with multiple types of toxicity (Table 5.2). Overall, our
analysis suggests the need for dedicated monitoring and worldwide prioritization of these
18 potential neurotoxicants. We remark that our analysis of the potential neurotoxicants
produced in high volume is limited to HPV lists pertaining to EU and USA due to the lack
of publicly available HPV lists for other countries. Regulatory bodies in other countries
seeking to improve the prioritization of potential neurotoxicants can analyze NeurotoxKb
in conjunction with country-specific data on chemical production volume and scale of
use.
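The prioritization step described above amounts to intersecting the compiled neurotoxicants with the HPV lists and the SVHC list. The following is a minimal sketch, assuming that a chemical must appear in every supplied HPV list (as is the case for all entries of Table 5.2); the function name `prioritize` and all chemical names are hypothetical placeholders. A regulatory body could pass its own country-specific production-volume list in `hpv_lists`.

```python
def prioritize(neurotoxicants, hpv_lists, svhc):
    """Return neurotoxicants present in every HPV list and in the SVHC list.

    hpv_lists is a dict of list name -> collection of chemicals and must
    contain at least one list.
    """
    in_every_hpv = set.intersection(*(set(m) for m in hpv_lists.values()))
    return sorted(set(neurotoxicants) & in_every_hpv & set(svhc))


# Toy data: hypothetical chemicals, not the contents of the real lists.
toy_neurotoxicants = {"chem_a", "chem_b", "chem_c", "chem_d"}
toy_hpv_lists = {
    "USHPV": {"chem_a", "chem_b", "chem_c", "chem_x"},
    "OECD HPV": {"chem_a", "chem_b", "chem_x"},
}
toy_svhc = {"chem_a", "chem_c", "chem_x"}

priority = prioritize(toy_neurotoxicants, toy_hpv_lists, toy_svhc)
print(priority)
```

Here `chem_x` is excluded because it is not a compiled neurotoxicant, and `chem_c` is excluded because it is missing from one of the two HPV lists.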
[Figure 5.6: bipartite network diagram with two columns of nodes, labelled ‘Potential neurotoxicants’ and ‘Neuroreceptors’.]
Figure 5.6: The bipartite network of 38 potential neurotoxicants in NeurotoxKb 1.0 that target 27
human neuroreceptors. Beside each potential neurotoxicant, the number of target neuroreceptors
is indicated in parentheses. Beside each neuroreceptor, the number of potential neurotoxicants
targeting it is indicated in parentheses.
5.7 Interaction of environmental neurotoxicants with
neuroreceptors
Identification of target human genes or proteins of environmental neurotoxicants can shed
light on complex molecular mechanisms via which these chemicals cause neurotoxic-
ity. We have used ToxCast [89] to identify the target human genes or proteins of the
475 potential neurotoxicants in NeurotoxKb 1.0. To retrieve the list of target human
genes perturbed by potential neurotoxicants, we have used the ToxCast invitroDB version
3.2 dataset released in August 2019 [215]. We followed the method described in
Section 2.4.2 to extract from ToxCast the human target genes perturbed upon exposure
to the compiled neurotoxicants. Based on human-specific assays in ToxCast [89], we were
able to obtain 255 target human genes for 220 out of the 475 potential neurotoxicants
in NeurotoxKb 1.0 (Supplementary Table S5.6). Further investigation of the 255 target
human genes of the 220 potential neurotoxicants revealed that 27 target genes correspond
to neuroreceptors. We find that 38 potential neurotoxicants in NeurotoxKb 1.0 target
at least one of these 27 neuroreceptors (Figure 5.6; Supplementary Table S5.6) [38].
Among these 38 potential neurotoxicants, 4 neurotoxicants namely, Mercuric chloride,
Haloperidol, Triphenyltin hydroxide and Perfluorooctanesulfonic acid (PFOS), target 10
or more neuroreceptors (Figure 5.6). Among the 27 neuroreceptors which are targets
of at least one potential neurotoxicant, the neuroreceptor OPRM1 (Opioid Receptor Mu
1) for endogenous opioids such as β-endorphin and endomorphin, was found to interact
with 15 potential neurotoxicants. Other neuroreceptors which are targets of at least 10
potential neurotoxicants include the receptor DRD1 (Dopamine receptor D1) for neuro-
transmitter dopamine, and the receptors HTR6 (5-Hydroxytryptamine Receptor 6) and
HTR7 (5-Hydroxytryptamine Receptor 7) for the neurotransmitter serotonin (Figure 5.6;
Supplementary Table S5.6) [38]. In the future, an in-depth analysis of chemical-gene
interactions will provide new insights into the molecular mechanisms via which exposure to
the 475 potential neurotoxicants in NeurotoxKb 1.0 can lead to the documented neurotoxic
endpoints in mammals.
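The numbers attached to each node of the bipartite network in Figure 5.6 are node degrees, which can be derived from a table of chemical-gene pairs restricted to neuroreceptor genes. The following is a minimal sketch; the function name `bipartite_degrees`, the input layout and the toy identifiers are illustrative assumptions and do not reproduce the ToxCast processing of Section 2.4.2.

```python
from collections import defaultdict


def bipartite_degrees(pairs, receptor_genes):
    """Compute node degrees of the chemical-neuroreceptor bipartite network.

    pairs: iterable of (chemical, gene) tuples; duplicates are tolerated.
    receptor_genes: set of genes regarded as neuroreceptors.
    Returns (chemical -> number of receptors, receptor -> number of chemicals).
    """
    targets = defaultdict(set)
    hitters = defaultdict(set)
    for chemical, gene in pairs:
        if gene in receptor_genes:
            targets[chemical].add(gene)
            hitters[gene].add(chemical)
    chemical_degree = {c: len(genes) for c, genes in targets.items()}
    receptor_degree = {g: len(chems) for g, chems in hitters.items()}
    return chemical_degree, receptor_degree


# Toy data: hypothetical chemicals and genes.
toy_pairs = [
    ("chem_a", "GENE1"),
    ("chem_a", "GENE2"),
    ("chem_b", "GENE1"),
    ("chem_a", "GENE1"),  # duplicate record, counted once
    ("chem_c", "GENE3"),  # GENE3 is not a neuroreceptor in this toy example
]
chemical_degree, receptor_degree = bipartite_degrees(toy_pairs, {"GENE1", "GENE2"})
```

Chemicals whose targets include no neuroreceptor, such as `chem_c` here, do not appear in the network at all.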
rotoxicants
Chemical similarity approaches can aid in early identification of toxic chemicals [198,
199] including potential neurotoxicants. To construct the chemical similarity network (CSN)
of neurotoxicants, we have employed the Tanimoto coefficient [200] as the similarity metric.
For any pair of chemicals, the Tanimoto coefficient has a value in the range 0 to 1, wherein a
higher value indicates greater structural similarity between the two molecules. The computed
Tanimoto coefficient between a pair of chemicals depends on the choice of chemical
fingerprints used to represent the molecules. Here, we have chosen Extended Connectivity
Fingerprints (ECFP4) [129] while computing the Tanimoto coefficient between different pairs
of potential neurotoxicants.
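For binary fingerprints, the Tanimoto coefficient is the number of bits set in both fingerprints divided by the number of bits set in at least one of them. The following is a minimal sketch, assuming each fingerprint is given as the set of its "on" bit positions; the function name `tanimoto` and the toy fingerprints are illustrative, and the generation of ECFP4 fingerprints themselves is not shown.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / union


# Toy fingerprints: hypothetical bit positions, not real ECFP4 output.
fp_x = {1, 2, 3, 4}
fp_y = {3, 4, 5, 6}

similarity_xy = tanimoto(fp_x, fp_y)  # 2 shared bits out of 6 distinct bits
similarity_xx = tanimoto(fp_x, fp_x)  # identical fingerprints
```

The value is 1 only when the two fingerprints set exactly the same bits, and 0 when they share none.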
In the CSN of potential neurotoxicants in NeurotoxKb 1.0, there are 475 nodes cor-
responding to the 475 potential neurotoxicants, and there is an edge between any pair
of nodes if the corresponding Tanimoto coefficient value is ≥ 0.5. The chosen cutoff of
Tanimoto coefficient ≥ 0.5 to decide on significant structural similarity between pairs of
chemicals was motivated by a similar choice made in previous studies [282–284].
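The edge rule described above can be sketched as a pairwise comparison of all fingerprints with a cutoff of 0.5. This is a minimal sketch under stated assumptions: fingerprints are represented as sets of on-bit positions, and the names `similarity_edges`, `tanimoto` and the toy chemicals are hypothetical.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def similarity_edges(fingerprints, cutoff=0.5):
    """Return (chemical_1, chemical_2, similarity) for every pair at or above cutoff."""
    edges = []
    for name_1, name_2 in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[name_1], fingerprints[name_2])
        if score >= cutoff:
            edges.append((name_1, name_2, score))
    return edges


# Toy fingerprints: hypothetical bit positions for four hypothetical chemicals.
toy_fingerprints = {
    "A": {1, 2, 3, 4},
    "B": {2, 3, 4, 5},
    "C": {1, 2, 9, 10},
    "D": {9, 10},
}
toy_edges = similarity_edges(toy_fingerprints)
edge_pairs = [(u, v) for u, v, _ in toy_edges]
```

For 475 chemicals this loop evaluates 475 × 474 / 2 = 112,575 pairs, which is small enough for a direct all-pairs computation.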
We find that the CSN of 475 potential neurotoxicants is fragmented into 60 connected
components, each containing at least 2 neurotoxicants, and 286 isolated neurotoxicants
(Figure 5.7). Moreover, the largest connected component consists of only 13 potential
neurotoxicants (Figure 5.7). In Figure 5.7, we have coloured the nodes based on the number
of aromatic rings in the corresponding neurotoxicant. It can be seen that neurotoxicants
belonging to a connected component typically have the same number of aromatic rings.
Altogether, this preliminary analysis of the CSN of potential neurotoxicants reveals a
fragmented network, and thus, the associated toxicological space has high chemical di-
versity [38].
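The fragmentation statistics above (number of connected components with at least two nodes, number of isolated nodes, and size of the largest component) can be obtained with a graph traversal. The following is a minimal sketch using a depth-first search on toy data; the function name `connected_components` and the node labels are illustrative assumptions.

```python
def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited node.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Toy network: hypothetical nodes and edges.
toy_nodes = ["A", "B", "C", "D", "E", "F"]
toy_edges = [("A", "B"), ("B", "C"), ("D", "E")]

components = connected_components(toy_nodes, toy_edges)
multi_node = [c for c in components if len(c) >= 2]
isolated = [c for c in components if len(c) == 1]
largest = max(len(c) for c in components)
```

Applied to the figures reported in the text, the 60 multi-node components together hold 475 − 286 = 189 neurotoxicants, so most components are very small.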
[Figure 5.7: network diagram. Legend entries recovered from the figure: 0 aromatic ring; 3 aromatic rings; 4 aromatic rings.]
Figure 5.7: Chemical similarity network (CSN) of the 475 potential neurotoxicants in
NeurotoxKb 1.0. In this figure, there are 475 nodes corresponding to the 475 potential
neurotoxicants, and there is an edge between any pair of nodes if the corresponding Tanimoto
coefficient value is ≥ 0.5. Further, nodes are coloured based on the number of aromatic rings
present in the corresponding neurotoxicants, while the thickness of an edge indicates the
Tanimoto coefficient value between the corresponding neurotoxicants. Here, the connected
components of the CSN are displayed in decreasing order of the number of nodes in each component.
5.9 Discussion
The Swiss philosopher and poet, Henri-Frédéric Amiel (1821-1881), once stated that: “To
repair is twenty times more difficult than to prevent”. The quote is apt for the management
of hazardous chemicals including environmental neurotoxicants. Since neurotoxicants
can cause permanent or irreversible damage to the nervous system [52,55,56], early
screening of environmental chemicals with potential to cause neurotoxicity is important
for human well-being. In this direction, a comprehensive resource on potential neuro-
toxicants compiling published evidence specific to mammals, can aid in monitoring and
regulation of human neurotoxicants. Here, we present such a comprehensive resource,
NeurotoxKb 1.0, with compiled information on 475 potential non-biogenic neurotoxi-
cants curated from 835 published studies specific to mammals. The entire compiled in-
formation on the 475 potential neurotoxicants in NeurotoxKb 1.0 can be easily accessed
and retrieved via a user-friendly and interactive web interface (Figure 5.8).
[Figure 5.8: composite figure. Panel labels recovered from the figure: Compilation of potential neurotoxicants from existing resources; Mapping of neurotoxicants to their chemical identifiers using standard databases; 742 potential neurotoxicants with chemical structure information compiled from four existing resources; Filtration of biogenic chemicals; Compilation of potential non-biogenic neurotoxicants from published studies specific to mammals; Chemical classification; Exploration of potential neurotoxicants in external exposomes; Interaction of environmental neurotoxicants with neuroreceptors; Network based visualization of the chemical space of environmental neurotoxicants.]
Supplementary Information
Supplementary Tables S5.1-S5.6 associated with this chapter are available for download
from the GitHub repository:
https://github.com/asamallab/PhDThesis-Janani_R/blob/main/SI/ST_Chapter5.xlsx.
Feature | NeurotoxKb 1.0 | US EPA report (1976) | Grandjean and Landrigan (2014) | Mundy et al. (2015) | Aschner et al. (2017)
--------------------------------------------------------------------------------
Number of potential neurotoxicants | 475 | 802 | 214 | 97 | 33
Web interface | Yes | No | No | Yes via CompTox | Yes via CompTox
Compilation of neurotoxic endpoints | Yes | Yes | No | Yes | Yes
Standardization of neurotoxic endpoints | Yes | No | No | No | No
Classification based on environmental source | Yes | No | Yes | Yes | Yes
Classification based on chemical structure | Yes | No | No | No | No
Presence in chemical regulation or guideline | Yes | No | No | Yes | Yes
Information on external exposomes | Yes | No | No | No | No
Presence in human biospecimen | Yes | No | No | No | No
Chemical identifiers | PubChem or CAS or MeSH | CAS | CAS | DSSTox substance identifier or CAS | DSSTox substance identifier or CAS
Download of 2D structure | SDF, MOL, MOL2 | No | No | MOL | MOL
Download of 3D structure | SDF, MOL, MOL2, PDB, PDBQT | No | No | No | No
Physicochemical properties | Yes | No | No | Yes | Yes
Molecular descriptors | Yes | No | No | No | No
Predicted ADMET properties | Yes | No | No | Yes | Yes
Chemical-gene association | Yes | No | No | Yes | Yes
Chemical similarity filter | Yes | No | No | No | No
Table 5.1: Comparison of the features including compiled information captured in NeurotoxKb
1.0 for the potential neurotoxicants with respect to four existing resources.
Potential Neurotoxicant | Presence in USHPV | Presence in OECD HPV | Presence in SVHC | SVHC Criteria
--------------------------------------------------------------------------------
Tributyltin oxide | Yes | Yes | Yes | PBT (Article 57d)
Lead | Yes | Yes | Yes | Toxic for reproduction (Article 57c)
N,N-Dimethylformamide | Yes | Yes | Yes | Toxic for reproduction (Article 57c)
Tetraethyllead | Yes | Yes | Yes | Toxic for reproduction (Article 57c)
Trichloroethylene | Yes | Yes | Yes | Carcinogenic (Article 57a)
Dinoseb | Yes | Yes | Yes | Toxic for reproduction (Article 57c)
Nitrobenzene | Yes | Yes | Yes | Toxic for reproduction (Article 57c)
Boric acid | Yes | Yes | Yes | Toxic for reproduction (Article 57c)
1-Bromopropane | Yes | Yes | Yes | Toxic for reproduction (Article 57c)
2-Methoxyethanol | Yes | Yes | Yes | Toxic for reproduction (Article 57c)
2,4-Dinitrotoluene | Yes | Yes | Yes | Carcinogenic (Article 57a)
Hydrazine | Yes | Yes | Yes | Carcinogenic (Article 57a)
Cadmium | Yes | Yes | Yes | Carcinogenic (Article 57a); Specific target organ toxicity after repeated exposure (Article 57(f) - human health)
Dibutyl phthalate | Yes | Yes | Yes | Toxic for reproduction (Article 57c); Endocrine disrupting properties (Article 57(f) - human health)
Propylene oxide | Yes | Yes | Yes | Carcinogenic (Article 57a); Mutagenic (Article 57b)
Acrylamide | Yes | Yes | Yes | Carcinogenic (Article 57a); Mutagenic (Article 57b)
Bisphenol A | Yes | Yes | Yes | Toxic for reproduction (Article 57c); Endocrine disrupting properties (Article 57(f) - environment); Endocrine disrupting properties (Article 57(f) - human health)
Bis(2-ethylhexyl) phthalate | Yes | Yes | Yes | Toxic for reproduction (Article 57c); Endocrine disrupting properties (Article 57(f) - environment); Endocrine disrupting properties (Article 57(f) - human health)
Table 5.2: List of 18 potential neurotoxicants in NeurotoxKb 1.0 suggested for prioritization.
These 18 chemicals are considered to be substances of very high concern (SVHC) under the REACH
regulation and, moreover, are present in two lists of high production volume (HPV) chemicals,
namely, the United States High Production Volume (USHPV) database and the Organisation for
Economic Co-operation and Development High Production Volume (OECD HPV) list.
Chapter 6
detected in human milk across different geographical regions [24]. Some studies have also
compiled the list of chemicals detected in human milk, and these studies were published
as research articles or scientific reports. A prominent example is the work of Lehmann et
al. [286] that has compiled the human milk exposome from samples collected across the
United States through literature mining and manual curation.
India is home to a population of nearly 1.33 billion [288] with extensive growth in
agricultural and industrial sectors, contributing to the production and use of several com-
mercial chemicals in everyday life [289]. Several studies have detected the presence of
environmental contaminants in human milk and a few studies have also compiled the list
of chemicals detected in human milk across India [290–292]. However, so far there has
been no systematic effort towards the monitoring and compilation of these environmental
contaminants in India with the objective of aiding chemical risk management and informing
policy decisions [293]. For example, the reports by van den Berg et al. [294] and Sharma
et al. [293] compile only the chemical component of the exposome [13], but lack the sys-
tematic compilation of maternal factors such as age, body weight, diet, and other factors
which may affect the exposome.
6.1 Compilation of human milk contaminants specific to
India
We created the database Exposome of Human Milk across India (ExHuMId) with the
primary objective of bringing all the published knowledge on human milk contaminants
specific to India into a single knowledgebase [39]. In other words, ExHuMId
compiles the list of human milk contaminants detected in published scientific studies in-
volving samples collected across India.
This keyword search, last performed on 24 August 2020, led to 1704 research articles.
Subsequently, this set of 1704 articles was manually curated to obtain a subset of articles
relevant to the study of human milk contaminants in India (Figure 6.1). Specifically,
we retained only those articles pertaining to ‘human milk’ or ‘breast milk’, with samples
collected solely from India. During the manual curation process, we excluded studies
on samples collected from outside India, studies without specific geographical indication,
review articles or conference abstracts, studies specific to essential nutrients, and articles
promoting breastfeeding. This step resulted in a curated set of 36 research articles contain-
ing information about the environmental contaminants identified in human milk samples
across India, using analytical techniques (Figure 6.1; Supplementary Table S6.1) [39].
[Figure 6.1: workflow schematic with six numbered steps. (1) Literature mining: PubMed query search resulted in 1704 research articles likely to contain India-specific studies on human milk. (2) Geographical location filter: manual curation of 1704 research articles for India-specific studies on human milk contaminants. (3) Data compilation: list of 101 human milk contaminants compiled from 36 published studies with samples from India. (4) Unification of compiled data: compilation of the concentration of chemicals in human milk samples in standardized units, geographical location of samples, and maternal factors. (5) Classification of contaminants: environmental source (6 broad categories, 35 sub-categories) and chemical classification. (6) Data analysis: comparison with human milk contaminants from other geographies; comparison with substances of concern or in use; physicochemical properties of human milk contaminants; target genes of human milk contaminants and potential effects on maternal and infant health.]
Figure 6.1: Schematic workflow describing the compilation, curation and analysis of the resource
ExHuMId on Exposome of Human Milk across India.
From the curated list of 36 research articles, we have compiled the contaminants
including their concentrations detected in human milk samples, geographical location,
age, and other factors associated with the mothers from whom the milk samples were
collected (Figure 6.1). For an unambiguous analysis, the data compiled in ExHuMId has
been standardized and unified through the following steps.
The first step involved the standardization of the geographical locations from which
human milk samples were collected in our curated set of 36 studies. The geographical
locations of the study samples were mapped to their respective states in India (Figure
6.2A).
Our manually curated set of 36 studies also recorded a list of maternal factors that
influence the presence or transfer of environmental chemicals into mothers’ milk. The
second step involved the unification of maternal factors that were compiled from the 36
research articles. We have compiled 23 maternal conditions associated with the human
milk samples reported in the curated set of 36 published studies, and these maternal con-
ditions include the body weight, food habits, societal factors, and other antenatal and post-
natal conditions of the mothers. These maternal conditions were unified into 9 maternal
factors, namely, body weight, food, gestational age, number of pregnancies (Primipara,
Biparous and Multipara), occupation, phases of breast milk, residential area, social status,
and types of birth (Figure 6.2D). Among these 9 maternal factors, the number of
pregnancies is associated with the largest number of contaminants (48; Figure 6.2D).
Note that maternal factors are not available for all samples that have been compiled from
the curated set of 36 published articles.
Next, the environmental chemicals detected in human milk across the curated set
of 36 studies were mapped to standard chemical identifiers using PubChem [86], CAS,
ChEMBL [295], and CTD [30] to obtain a set of 101 unique chemicals. The final step
involved the manual unification of the units for the lowest concentration, highest con-
centration, mean, standard deviation and standard error associated with the measurement
of each chemical in human milk samples in different studies. This step resulted in the
unification of the compiled information in 12 different concentration units into 2 stan-
dardized concentration units, namely, µg/g lipid weight and µg/L lipid weight. Of 101
compiled human milk contaminants, we find 71 chemicals with concentration in stan-
dard unit µg/g lipid weight, 18 chemicals with concentration in standard unit µg/L lipid
weight, and 11 chemicals with concentrations in both the standard units [39]. Further-
more, we gathered information on their chemical structure including two-dimensional
(2D) and three-dimensional (3D) structure (in SDF, MOL and MOL2 formats), canonical
SMILES, InChI, and InChIKey.
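The unit unification step described above can be illustrated with a minimal Python sketch. The source units and conversion factors below are illustrative assumptions (the 12 units encountered during curation are not listed in this section), and the helpers `to_standard_unit` and `standardize_record`, as well as the record field names, are hypothetical and not part of ExHuMId.

```python
# The two standard units named in the text ("ug" stands for microgram).
UG_PER_G = "ug/g lipid weight"
UG_PER_L = "ug/L lipid weight"

# Hypothetical source units and factors (illustrative assumptions only;
# the 12 units actually encountered during curation are not listed here).
CONVERSIONS = {
    "ug/g": (UG_PER_G, 1.0),
    "ng/g": (UG_PER_G, 1e-3),
    "pg/g": (UG_PER_G, 1e-6),
    "mg/kg": (UG_PER_G, 1.0),
    "ug/L": (UG_PER_L, 1.0),
    "ng/mL": (UG_PER_L, 1.0),
    "mg/L": (UG_PER_L, 1e3),
}


def to_standard_unit(value, unit):
    """Convert one reported concentration to one of the two standard units."""
    if unit not in CONVERSIONS:
        raise ValueError(f"no conversion rule for unit {unit!r}")
    standard_unit, factor = CONVERSIONS[unit]
    return value * factor, standard_unit


def standardize_record(record):
    """Apply the same factor to every statistic reported for one measurement.

    `record` is a hypothetical dict with a "unit" key and optional
    "lowest", "highest", "mean", "sd" and "se" values.
    """
    out = {}
    for field in ("lowest", "highest", "mean", "sd", "se"):
        if record.get(field) is not None:
            out[field], out["unit"] = to_standard_unit(record[field], record["unit"])
    return out
```

Keeping the conversion rules in a single table makes it straightforward to audit which reported units map to which of the two standard units, and an unknown unit raises an error rather than being silently passed through.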
Following the compilation and standardization of the data on human milk contaminants,
we classified the human milk contaminants based on: (a) their environmental source, and
(b) their chemical features [39].
The human milk contaminants were structurally classified according to the taxonomy
from ClassyFire [173, 174], a web-based application (Figure 6.2E). Upon classifying the
101 contaminants in ExHuMId based on their chemical class, we find that 96 are organic
and 5 are inorganic (Figure 6.2E). Among the 96 organic chemicals in ExHuMId, the
largest number (46 contaminants) belong to the super-class benzenoids (Figure 6.2E).
[Figure 6.2: seven-panel figure. (A) Map of Indian states with the number of published studies per state, and a histogram of the number of contaminants per state. (B) Number of published studies per five-year period. (C) Cumulative number of contaminants per period. (D) Number of contaminants per maternal factor. (E) Sunburst plot of the chemical classification. (F) Number of contaminants per broad category of environmental source. (G) Comparison of ExHuMId (101), ExHuMUS (127) and ExHuM Explorer (183).]
Figure 6.2: (A) An India map displaying different states or geographical locations
from where samples were obtained in the curated set of 36 published research articles in
ExHuMId on human milk contaminants. The number beside each state in brackets gives the
number of published articles reporting human milk samples from that state. The histogram shows
the number of contaminants detected across samples obtained from each state. (B) A chronological
analysis of the curated set of 36 published studies in ExHuMId. (C) A chronological analysis
of the cumulative number of contaminants detected across published studies in different time
periods. (D) Evidence across 9 maternal factors compiled from published articles associated with the
human milk contaminants in ExHuMId. (E) Sunburst plot showing the chemical classification of
101 human milk contaminants in ExHuMId into 2 kingdoms and 8 super-classes as obtained from
ClassyFire. (F) Distribution of 101 human milk contaminants in ExHuMId across 6 broad
categories of environmental sources. (G) Comparison of 101 human milk contaminants in ExHuMId
with those in two other resources, namely, ExHuMUS and ExHuM Explorer.
ther research, inform industry directions and policy decisions, especially when it comes
to chemical usage and regulation. Knowledgebases make this possible, by serving as
a platform for researchers, industry and regulatory authorities to access a range of use-
ful information. This has motivated us to compile Exposome of Human Milk across India
(ExHuMId) version 1.0, a curated resource on human milk contaminants specific to India.
The web interface of ExHuMId was created using an approach similar to that
described in Section 2.2. Through the web interface (Figure 6.3), users can also access
the identifiers and structural information, including 2D and 3D structures, for each human milk
contaminant in ExHuMId. Users can navigate ExHuMId via either the simple search or
the browse options (Figure 6.3).
Figure 6.3: The web interface of ExHuMId. (A) A screenshot of the home page of
ExHuMId. In the Search section, there are three options available to search and obtain information on
human milk contaminants compiled in ExHuMId. (B) Firstly, the Simple search option can be used
to search the chemicals using either chemical name or standard identifiers (CAS or PubChem).
(C) Secondly, Physicochemical filter option can be used to filter the contaminants based on their
physicochemical properties such as molecular weight, Log P, TPSA, number of hydrogen bond
donors or number of hydrogen bond acceptors. (D) Thirdly, Chemical similarity filter can be used
to filter the contaminants based on the structural similarity with respect to a query compound.
(E) The screenshot shows the result page for an individual contaminant. For each contaminant,
we can obtain information on structure identifiers, environmental source, chemical classification,
experimental evidence, chemical-gene interaction, physicochemical properties, predicted ADMET
properties and molecular descriptors. The Browse option in ExHuMId can be used to obtain the
human milk contaminants based on: (F) Geographical location of samples, (G) Maternal factors
associated with samples, (H) Environmental source classification, and (I) Chemical classification.
6.4 Chronological analysis of published studies compiled
in ExHuMId
Within the curated set of 36 published studies compiled in ExHuMId, the earliest study
is from 1981 while the latest study is from 2018. Furthermore, Figure 6.2B presents a
chronological analysis of the 36 published studies in five-year intervals. The largest
number of published studies (8) is from the period 2011-2015, followed
by 7 published studies from the period 2006-2010 (Figure 6.2B). Figure 6.2C displays
a chronological analysis of the cumulative number of contaminants detected across
published studies in different time periods [39]. The cumulative number of contaminants
reported in published studies increases markedly after 2000 and again after 2010
(Figure 6.2C).
not reflective of the entire US and global populations, respectively. However, given that
they are the only compilations of human milk contaminants for geographies outside India,
we have considered them in this work. The union of the above-mentioned three datasets
gives us a list of environmental chemicals detected in human milk samples from various
parts of the world, and we refer to this chemical space as ‘Global ExHuM’ (Supplemen-
tary Table S6.2). The intersection of ExHuMId, ExHuMUS and ExHuM Explorer (Figure
6.2G; Supplementary Table S6.2) contains 44 chemicals that are of potential concern in
the Indian, USA and global scenarios, and we refer to this space of 44 chemicals as the
‘Common ExHuM’ (Figure 6.2G; Supplementary Table S6.2) [39].
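The construction of the two chemical spaces can be sketched with Python sets. This is a minimal illustration, assuming each resource has been reduced to a set of standardized chemical identifiers; the function name `global_and_common` and the placeholder identifiers are hypothetical.

```python
def global_and_common(*resources):
    """Return (union, intersection) of the chemical identifier sets.

    The union corresponds to the 'Global ExHuM' space and the intersection
    to the 'Common ExHuM' space described in the text.
    """
    sets = [set(r) for r in resources]
    if not sets:
        raise ValueError("at least one resource is required")
    return set().union(*sets), set.intersection(*sets)


# Placeholder identifiers, not real entries from any of the three resources.
exhumid = {"chem_A", "chem_B", "chem_C"}
exhumus = {"chem_B", "chem_C", "chem_D"}
exhum_explorer = {"chem_C", "chem_E"}

global_exhum, common_exhum = global_and_common(exhumid, exhumus, exhum_explorer)
```

The comparison is only meaningful if all three resources use the same identifier scheme, which is why the mapping to standard chemical identifiers precedes this step.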
Table 6.1 presents a detailed comparison of our resource ExHuMId with the other
two resources on human milk contaminants. Note that the three resources, ExHuMId,
ExHuMUS and ExHuM Explorer, do not have in common any published experimental
evidence or literature as the resources compile data on different geographies. Further, the
research article [286] on ExHuMUS provides the list of detected chemicals, their concen-
trations and the geographical location within USA from where the study samples were
collected. However, the ExHuMUS publication is not accompanied by an online resource
and the meta-analysis article offers limited information for the compiled list of human
milk contaminants [286]. In contrast, ExHuM Explorer [24] contains detailed informa-
tion on 183 contaminants which were detected in human milk samples collected across
several countries. Specifically, ExHuM Explorer gives information on the 2D and 3D
structures of the contaminants [24]. Notably, ExHuMId compiles the types of chemical
information available in ExHuMUS and ExHuM Explorer and, unlike these two resources,
also compiles the maternal factors that influence the transfer of the contaminants to
human milk, the physicochemical properties of the contaminants, and their target genes
(including visualization of the chemical-gene or chemical-protein interactions)
(Table 6.1). In sum, ExHuMId compiles information on human milk contaminants
in the specific context of India, and further, makes the compiled information easily
accessible to researchers via a user-friendly web interface.
6.6 Analysis of human milk contaminants with sub-
EDCs, carcinogenic substances, neurotoxins and prohibited substances have all been iden-
tified as hazards, and have been well-studied for their adverse effects. Mitigating the risk
posed by these substances will involve identifying their common sources, monitoring and
regulating them on a timely basis. Here, we focused on identifying substances in Ex-
HuMId that are endocrine disruptors, carcinogens or neurotoxins. These three categories
of chemicals are of particular concern due to their potential to affect development and
leave behind long-term effects.
Specifically, we have considered four substance lists in this category for analysis of
human milk contaminants. Firstly, to understand the presence of endocrine disruptors,
we used the list of 792 potential EDCs from DEDuCT 2.0 [35, 36] for this analysis. Sec-
ondly, we considered the list of carcinogens from IARC monographs [296]. Thirdly,
we considered two lists of neurotoxins from the CompTox chemistry dashboard [265] of
US EPA, which are: (a) chemicals demonstrating effects on neurodevelopment (DNTEF-
FECTS) [61] and (b) chemicals triggering developmental neurotoxicity in vivo (DNTIN-
VIVO) [62]. Fourthly, we have considered a chemical regulation, namely, the EU list
of substances prohibited in cosmetic products [141]. In addition, we have also consid-
ered two lists of chemicals which are known to be produced in high volume: (a) United
States High Production Volume (USHPV) database, and (b) Organisation for Economic
Co-operation and Development (OECD) High Production Volume (OECD HPV) list, last
updated in 2004.
Comparing ExHuMId with resources for the above chemical categories revealed the
following. We found that 43 potential EDCs are present in ExHuMId (Supplementary
Table S6.3). The web interface of ExHuMId provides detailed information on environmental
sources of these EDCs detected in human milk samples [62]. The IARC monographs
classify substances into: (a) class 1, which are carcinogenic to humans,
(b) class 2A, which are probably carcinogenic to humans, (c) class 2B, which are possibly
carcinogenic to humans, and (d) class 3, which are not classifiable as to their carcinogenicity to
humans [296]. Our comparative analysis revealed that 23 carcinogens were in ExHuMId
of which 7 carcinogens belong to class 1, 4 to class 2A, 5 to class 2B and 7 to class 3. Six
commonly found carcinogens listed by IARC were found in the Common ExHuM and
have been detected in human milk samples from India, USA, and other parts of the world.
Among these, there are 3 class 1 carcinogens, namely, 2,3,4,7,8-Pentachlorodibenzofuran,
3,4,5,3’,4’-Pentachlorobiphenyl (PCB-126) and Lindane (Supplementary Table S6.3).
Neurotoxins in human milk are a significant concern since they are capable of influencing
neurodevelopment during the prenatal and postnatal stages [64]. We found 14 potential
neurotoxins to be present in ExHuMId (Supplementary Table S6.3). Cosmetic products
are a significant source of exposure to various substances, due to their ubiquitous nature
and widespread use. On comparison, we found 16 prohibited cosmetic ingredients (under
EU regulations) to be present in ExHuMId (Supplementary Table S6.3). Among these, 3
prohibited cosmetic ingredients, namely, Hexachlorobenzene, Chlorophenothane (DDT)
and Lindane are also produced in high volume (Supplementary Table S6.3) [39].
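The comparison against a classified substance list, such as the breakdown by IARC class reported above, can be sketched as a lookup followed by a tally. The function `tally_by_class`, the placeholder identifiers and the class assignments below are hypothetical and do not reproduce the IARC list.

```python
from collections import Counter


def tally_by_class(contaminants, classification):
    """Count contaminants per hazard class.

    `classification` maps a chemical identifier to its class label;
    contaminants absent from the list are skipped.
    """
    return Counter(classification[c] for c in contaminants if c in classification)


# Placeholder data: identifiers and class labels are illustrative only.
classification = {"chem_A": "1", "chem_B": "2A", "chem_C": "1", "chem_Z": "3"}
contaminants = ["chem_A", "chem_B", "chem_C", "chem_D"]

counts = tally_by_class(contaminants, classification)
matched = sum(counts.values())  # number of contaminants found in the list
```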
6.6.2 Substances manufactured or regulated in India
We have built ExHuMId with the purpose of compiling and understanding the published
data on human milk contaminants from samples specific to India. To obtain a deeper un-
derstanding of the contaminants in ExHuMId, we have considered lists that reflect either
chemical regulation in India or chemical production scenario in India. Such an analysis is
in line with the main focus of this work, that is, Exposome of Human Milk across India.
Specifically, we have considered the following lists compiled by relevant departments of
Government of India: (a) Production of major chemicals year-wise in India [297], (b)
List of banned pesticides in India [298], (c) Schedule 1 hazardous chemicals list in In-
dia [299], and (d) Schedule 3 hazardous chemicals list in India [300]. A comparative
analysis of ExHuMId with lists of chemicals manufactured in India and lists from In-
dian chemical regulations, can further clarify the status of human milk contamination in
India [62].
6.6.3 Substances contaminating human milk through possible every-
day exposure
Humans come into contact with a variety of substances in daily life, particularly via
the usage or consumption of an increased number and variety of processed products in
today’s world. This is a significant factor in the case of a pregnant woman or breast-
feeding mother, since several of these substances may find their way into the mother’s
milk [66, 67, 285]. One aim of this study was to better understand
the scenario whereby chemicals encountered in everyday life make their way into human
milk. For this, we have considered two lists of substances found in food: (a) FooDB [301],
and (b) the Joint FAO/WHO Expert Committee on Food Additives (JECFA) list [140]. We
found that 12 food additives are present in ExHuMId (Supplementary Table S6.3) [39].
milk contaminants
Lipophilic chemicals can be transferred to human milk from maternal plasma via
passive diffusion [68–72]. The Milk to Plasma (M/P) concentration ratio is generally used
to identify the equilibrium concentration of chemicals in maternal plasma and breast
milk [68, 71, 72], and can indicate propensity of the environmental contaminants to enter
human milk. However, the M/P ratio, while easily available for drugs, is scarcely
available for environmental contaminants [70]. There is substantial evidence suggesting that
the transfer of xenobiotics into human milk is influenced by the physicochemical
properties of the chemicals [68–72]. The key physicochemical properties that influence the
transfer of environmental chemicals into human milk are the Log P, Topological Polar
Surface Area (TPSA), the number of hydrogen bond donors (HBD), the number of hydrogen
bond acceptors (HBA), the number of rotatable bonds, and molecular weight [68–70, 72].
Due to the unavailability of experimentally determined M/P ratio for the 101 chemicals
compiled in ExHuMId, we performed a comparative analysis of their physicochemical
properties with those of chemicals for which the M/P ratio is available. Specifically, we
considered the M/P ratios for a list of 375 chemicals compiled by Vasios et al. [72] from
published literature, and compared the computed physicochemical properties of chemi-
cals in ExHuM with those compiled by Vasios et al. The physicochemical properties of
the chemicals in ExHuM or Vasios et al. were computed using RDKit [179].
Following Vasios et al. [72], we have considered the chemicals with M/P ratio ≥ 1.0
as high risk and chemicals with M/P ratio < 1 as low risk for transfer to human milk
from maternal plasma. For a more detailed analysis, we have further divided the low risk
compounds in Vasios et al. based on their M/P ratios into < 1, ≤ 0.75, ≤ 0.5 and ≤ 0.25
resulting in 249, 213, 170 and 114 chemicals, respectively. Thereafter, a comparison of
the physicochemical properties was made across the sets of human milk contaminants in
ExHuMId, ExHuMUS and ExHuM Explorer, high risk compounds in Vasios et al. [72]
with M/P ratio ≥ 1, and low risk compounds in Vasios et al. [72] with M/P ratio < 1, ≤
0.75, ≤ 0.5, ≤ 0.25 (Figure 6.4; Supplementary Table S6.4).
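The partition of reference compounds by M/P ratio can be sketched as follows. This is a minimal illustration of the thresholds stated above; the function name `partition_by_mp_ratio`, the group labels and the example ratios are hypothetical. Note that the low risk groups are nested, so a compound with M/P ratio ≤ 0.25 also belongs to the ≤ 0.5, ≤ 0.75 and < 1 groups, which is consistent with the decreasing group sizes reported.

```python
LOW_RISK_CUTOFFS = (0.75, 0.5, 0.25)


def partition_by_mp_ratio(mp_ratios):
    """Group compounds by M/P ratio.

    `mp_ratios` maps a compound name to its M/P ratio. Compounds with
    ratio >= 1 are high risk (HR); all others are low risk (LR) and are
    additionally placed in every nested LR group whose cutoff they satisfy.
    """
    groups = {"HR >= 1": set(), "LR < 1": set()}
    groups.update({f"LR <= {cutoff}": set() for cutoff in LOW_RISK_CUTOFFS})
    for name, ratio in mp_ratios.items():
        if ratio >= 1.0:
            groups["HR >= 1"].add(name)
            continue
        groups["LR < 1"].add(name)
        for cutoff in LOW_RISK_CUTOFFS:
            if ratio <= cutoff:
                groups[f"LR <= {cutoff}"].add(name)
    return groups


# Illustrative ratios only, not values from Vasios et al.
example = {"c1": 1.2, "c2": 0.9, "c3": 0.75, "c4": 0.3, "c5": 0.1}
groups = partition_by_mp_ratio(example)
```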
Figure 6.4 shows the mean and standard deviation of the distributions of 6 physico-
chemical properties, namely, Log P, TPSA, number of rotatable bonds, number of HBD,
number of HBA and molecular weight, for chemicals in different sets. We report the
mean, standard deviation, minimum value and maximum value for the 6 physicochemical
properties for the sets of human milk contaminants in ExHuMId, ExHuMUS, ExHuM
Explorer, high risk compounds in Vasios et al. [72] with M/P ratio ≥ 1, and low risk com-
pounds in Vasios et al. [72] with M/P ratio < 1, ≤ 0.75, ≤ 0.5, and ≤ 0.25 (Supplementary
Table S6.5). We find that the mean and standard deviation of the distributions of 6 physic-
ochemical properties for human milk contaminants in ExHuMId are much closer to those
for high risk compounds in Vasios et al. [72] with M/P ratio ≥ 1 [39]. Note that the high
risk compounds in Vasios et al. [72] are capable of easily transferring to human milk if
they are present in the lactating mother’s body. Further, we observed the same trend for
chemicals in ExHuMUS and ExHuM Explorer (Figure 6.4). Figure 6.4 also shows a
clear difference between the mean and standard deviation of the distributions of the above
6 physicochemical properties for the low risk compounds in Vasios et al. [72] in compar-
ison to high risk compounds or human milk contaminants in ExHuMId, ExHuMUS and
ExHuM Explorer.
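The kind of comparison described above can be sketched for a single property. The following minimal example, using only the Python standard library, summarizes each set by its mean and sample standard deviation and reports which reference set has the closest mean. Comparing means alone is a simplifying assumption made for illustration and is not necessarily how the comparison in this work was carried out; the function names and all numerical values are hypothetical.

```python
from statistics import mean, stdev


def summarize(values):
    """Return (mean, sample standard deviation) of one property for one set."""
    return mean(values), stdev(values)


def closest_reference(query_values, reference_sets):
    """Name of the reference set whose mean is nearest to the query mean."""
    query_mean = mean(query_values)
    return min(
        reference_sets,
        key=lambda name: abs(mean(reference_sets[name]) - query_mean),
    )


# Illustrative numbers only, not data from this work.
query = [5.0, 6.0, 7.0]
references = {"HR": [4.0, 5.0, 6.0], "LR": [1.0, 2.0, 3.0]}

query_summary = summarize(query)
nearest = closest_reference(query, references)
```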
Overall, these results give insights into the effect that physicochemical properties can have
on the transfer of environmental chemicals into human milk, and further, can enable the
prediction of such chemicals. When predicting the possible transfer of environmental
chemicals into human milk based on physicochemical properties, it is important to bear
in mind the limitations of any such method that does not account for the influence of
maternal factors, frequency of exposures, varying pharmacokinetic properties of
contaminants, and the complexity of lactation pathways [66, 67, 286, 287, 302].
[Figure 6.4: six panels, one per property: (A) Log P, (B) TPSA, (C) Number of rotatable bonds, (D) Number of hydrogen bond donors, (E) Number of hydrogen bond acceptors, (F) Molecular weight. The horizontal axis of each panel lists the 8 chemical sets: ExHuMId, ExHuMUS, ExHuM Explorer, Vasios HR (≥1), Vasios LR (≤0.25), Vasios LR (≤0.5), Vasios LR (≤0.75) and Vasios LR (<1).]
Figure 6.4: Box plots displaying the distributions of 6 physicochemical properties: (A) Log P,
(B) TPSA, (C) number of rotatable bonds, (D) number of hydrogen bond donors (HBD), (E)
number of hydrogen bond acceptors (HBA), and (F) molecular weight, for chemicals in 8 different
sets, namely, human milk contaminants in ExHuMId, ExHuMUS, ExHuM Explorer, high risk
compounds in Vasios et al. with M/P ratio ≥ 1 (Vasios HR ≥ 1), and low risk compounds in
Vasios et al. with M/P ratio < 1 (Vasios LR < 1), M/P ratio ≤ 0.75 (Vasios LR ≤ 0.75), M/P ratio
≤ 0.5 (Vasios LR ≤ 0.5), and M/P ratio ≤ 0.25 (Vasios LR ≤ 0.25). Note that the distributions for
the number of HBD in subfigure (D) are not visible for chemicals in ExHuMId, ExHuMUS and
ExHuM Explorer as the mean and standard deviation for each of the three sets are very close to 0.
67,285]. Hence, we were motivated to perform the following analysis to explore the effect
of human milk contaminants on mother and child. Using a systems biology approach, we
provide another perspective from our analysis by predicting the effect of environmental
contaminants on lactation, cytokine signalling and production pathways, and xenobiotic
transporters with the help of existing large-scale toxicological resources such as ToxCast
and CTD [30, 89].
To identify the target human genes or proteins of the chemicals in Global ExHuM, we
have used two well-known toxicology resources, ToxCast [89] and CTD [30].
We have used the ToxCast invitroDB3 dataset released in August 2019 [215] to re-
trieve the list of target genes or proteins of human milk contaminants in the Global Ex-
HuM. We followed the method described in Section 2.4.2 to extract from ToxCast the
human target genes perturbed upon exposure to human milk contaminants in the Global
ExHuM. Thereafter, we also retrieved from CTD the list of target genes or proteins of
chemicals in the Global ExHuM using specific filters. In CTD, we have considered only
the chemical-gene or chemical-protein interactions specific to humans and those inter-
actions which have at least one evidence in published scientific literature. Moreover,
in CTD, we have considered only binary interactions involving one chemical and one
gene [30], and thus, have filtered out complex interactions. In CTD, we have also not
considered the interactions that contained the terms ‘Chemical abundance’ or ‘Response
to substance’ based on their ‘interaction actions’.
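The four CTD filters described above can be sketched as a single predicate. This is a minimal illustration under stated assumptions: the record layout (keys `organism`, `chemicals`, `genes`, `pubmed_ids`, `actions`) is hypothetical and does not reproduce the CTD file format, and the excluded terms are matched exactly as they are named in the text.

```python
# Interaction-action terms excluded in the text (compared case-insensitively).
EXCLUDED_ACTIONS = {"chemical abundance", "response to substance"}


def keep_interaction(record):
    """Return True if a (hypothetical) interaction record passes all filters."""
    is_human = record["organism"] == "Homo sapiens"
    # Binary interaction: exactly one chemical and one gene.
    is_binary = len(record["chemicals"]) == 1 and len(record["genes"]) == 1
    # At least one supporting publication.
    has_evidence = len(record["pubmed_ids"]) >= 1
    has_excluded_action = any(
        action.lower() in EXCLUDED_ACTIONS for action in record["actions"]
    )
    return is_human and is_binary and has_evidence and not has_excluded_action


def filter_interactions(records):
    """Keep only the records that satisfy every filter."""
    return [r for r in records if keep_interaction(r)]
```

Expressing the filters as one predicate keeps the inclusion criteria in a single place, so each criterion can be checked or relaxed independently.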
6.8.2 Identification of contaminants interacting with lactation rele-
vant genes
Prolactin [307] and oxytocin [308] are the major hormones responsible for lactation.
Therefore, we have considered the signalling pathways associated with these hormones
for this analysis. We compiled the set of genes involved in the prolactin and oxytocin
signalling pathways in humans from NetPath [309–311] and Kyoto Encyclopedia of Genes
and Genomes (KEGG) [312]. NetPath compiles a list of genes involved in prolactin and
oxytocin signalling pathways in mammals, while the genes retrieved from KEGG are spe-
cific to humans. Further, we mapped these genes to their respective human NCBI Entrez
identifiers. In this step, we obtained 181 and 237 genes involved in prolactin and oxytocin
signalling pathways, respectively, from the above two resources. In addition to these path-
ways, we have included a set of 14 differentially expressed genes from Lemay et al. [304]
that are involved in lactose synthesis pathways and important for milk production. Using
ToxCast and CTD, we then identified chemicals from ExHuMId that may interact with
these lactation relevant genes (Figures 6.5 and 6.6; Supplementary Table S6.6). More-
over, we have also performed the same analysis for chemicals in ExHuMUS and ExHuM
Explorer (Supplementary Table S6.6).
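The identification of contaminants that may interact with lactation relevant genes amounts to intersecting each chemical's target genes with a pathway gene set. A minimal sketch follows, assuming targets and pathway genes share one identifier scheme (for example, NCBI Entrez identifiers); the function name and the placeholder data are hypothetical.

```python
def chemicals_hitting_pathway(chemical_targets, pathway_genes):
    """Map each chemical to the pathway genes among its targets.

    Chemicals with no target in the pathway are omitted from the result.
    """
    pathway = set(pathway_genes)
    hits = {}
    for chemical, targets in chemical_targets.items():
        overlap = set(targets) & pathway
        if overlap:
            hits[chemical] = overlap
    return hits


# Placeholder identifiers, not real chemical-gene interactions.
chemical_targets = {
    "chem_A": ["gene_1", "gene_2", "gene_9"],
    "chem_B": ["gene_9"],
    "chem_C": ["gene_2"],
}
pathway_genes = ["gene_1", "gene_2", "gene_3"]

hits = chemicals_hitting_pathway(chemical_targets, pathway_genes)
# The number of pathway genes targeted by each chemical.
n_targets = {chemical: len(genes) for chemical, genes in hits.items()}
```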
[Figure 6.5: two Sankey plots linking contaminants to their target genes. (A) Genes involved in prolactin signalling pathway. (B) Genes involved in lactose synthesis pathway.]
Figure 6.5: Sankey plots showing the human milk contaminants in ExHuMId and their target genes
or proteins involved in the pathways affecting lactation: (A) Prolactin signalling pathway, and (B)
Lactose synthesis pathway. Beside each contaminant, the number of target genes is given in
parentheses.
It is known that human milk contains several immunological factors including
cytokines, chemokines, immunoglobulins, and other soluble receptors that can confer
immunity in the lactating infants [314, 315]. Among these immunological factors, cytokines
play a vital role in the regulation of specific and non-specific immune responses [302].
Cytokines bind to the cytokine receptors and trigger the production of cytokines or elicit
the immune response via the activation of cytokine signalling pathways [316]. Notably,
the presence of environmental contaminants in human milk can interfere with cytokine
signalling and production [302, 317], thereby influencing the effective immune response
in developing infants [64, 302, 313, 317]. Thus, we aimed to identify chemicals in the
Global ExHuM that could potentially disrupt cytokine signalling pathways.
To this end, we first compiled the list of cytokine receptor genes from Cameron et
al. [318], HGNC database [319, 320], KEGG BRITE database [312] and Guide to Phar-
macology database [321]. In total, we have compiled 116 cytokine receptors for which
the chemical-gene interactions were obtained from ToxCast and CTD. Finally, we have
gathered the list of cytokines specific to the cytokine receptors that are known to interact
with the human milk contaminants. This resulted in a tripartite network containing con-
taminants or chemicals, cytokine receptors, and cytokines (Figure 6.7; Supplementary
Table S6.7).
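The construction of the tripartite network described above can be sketched as a join of two edge lists on the shared cytokine receptor. The following is only a minimal illustration under stated assumptions: the chemical names are placeholders, the function name `build_tripartite` is hypothetical, and the toy edges are not taken from Supplementary Table S6.7 (only the receptor-cytokine pairs mirror pairs shown in Figure 6.7).

```python
from collections import defaultdict

# Hypothetical toy inputs; the curated interactions are in Supplementary Table S6.7.
chemical_receptor = [
    ("chemical_1", "CD40"),
    ("chemical_2", "CD40"),
    ("chemical_1", "IL4R"),
    ("chemical_3", "XCR1"),
]
receptor_cytokine = [
    ("CD40", "CD40LG"),
    ("IL4R", "IL4"),
    ("XCR1", "XCL1"),
    ("XCR1", "XCL2"),
    ("CCR8", "CCL1"),  # receptor without any chemical interaction in this toy set
]

def build_tripartite(chem_rec, rec_cyt):
    """Return sorted (chemical, receptor, cytokine) paths for receptors
    that have at least one chemical interaction."""
    cytokines_of = defaultdict(set)
    for receptor, cytokine in rec_cyt:
        cytokines_of[receptor].add(cytokine)
    paths = set()
    for chemical, receptor in chem_rec:
        for cytokine in cytokines_of.get(receptor, ()):
            paths.add((chemical, receptor, cytokine))
    return sorted(paths)

paths = build_tripartite(chemical_receptor, receptor_cytokine)
chemicals = {p[0] for p in paths}
receptors = {p[1] for p in paths}
cytokines = {p[2] for p in paths}
print(len(chemicals), len(receptors), len(cytokines))
```

In this sketch, receptors that have no chemical interaction (CCR8 in the toy data) drop out of the network, which mirrors the reduction from the full list of compiled receptors to the subset with known interactions.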
On analyzing the list of 116 cytokine receptors with chemical interactions obtained
from ToxCast and CTD, we found that 22 chemicals compiled in ExHuMId interact with
32 cytokine receptors, which in turn could interfere with signalling or production of 64
cytokines (Figure 6.7; Supplementary Table S6.7). These interactions are displayed in the
form of a tripartite network in Figure 6.7. Among the chemicals in ExHuMId, arsenic tar-
gets the highest number of cytokine receptors (24 genes) followed by Benzo[a]pyrene (9
genes). Among the cytokine receptors, CD40 is perturbed by 17 contaminants compiled
in ExHuMId, and the binding of these contaminants to the CD40 receptor could inter-
fere with the signalling and production of CD40LG, a cytokine specific to CD40 (Figure
6.7; Supplementary Table S6.7) [39]. Thus, human milk contaminants targeting cytokine
receptors could bind to these receptors and interfere with normal function of cytokines.
For the chemicals compiled in ExHuMUS and ExHuM Explorer, we have also performed
the same analysis, and found several contaminants in these resources to be capable of
influencing cytokine signalling and production (Supplementary Table S6.7).
Figure 6.6: Sankey plots show the human milk contaminants in ExHuMId and their target genes or
proteins involved in: (A) Oxytocin signalling pathway, and (B) Xenobiotic transporters. Besides
each contaminant, the number of target genes is mentioned in parenthesis.
transporters
Drug or xenobiotic transporters are membrane proteins that play a major role in the transfer
of xenobiotics into human milk [322, 323]. Some of these transporters have been found to
be expressed in the mammary gland during lactation [322–325]. From the study by Alcorn et
al. [326], we compiled the list of 19 (out of 30) transporters that are expressed in the mam-
mary gland during lactation based on their Real-Time Reverse Transcription-Polymerase
Chain Reaction (RT-PCR) analysis. Thereafter, we have explored any potential interac-
tions between the chemicals in Global ExHuM and these 19 transporters, using interaction
data obtained from ToxCast and CTD (Figure 6.6; Supplementary Table S6.8).
The analysis of this dataset with chemical-gene interactions obtained from ToxCast
and CTD revealed that 15 contaminants in ExHuMId target 9 transporters which are ex-
pressed during lactation (Figure 6.6B; Supplementary Table S6.8). Of these, there are two
prominent transporter protein genes, namely, SLC22A1 and SLC22A4, which were found
to be expressed 4-fold during lactation [326] (Figure 6.6B; Supplementary Table S6.8).
Among the contaminants in ExHuMId, Arsenic targets 7 transporter genes. The ABCB1
transporter protein gene appears to be targeted by the maximum number of contaminants
in ExHuMId (Figure 6.6B; Supplementary Table S6.8) [39]. We have also performed the
same analysis for the chemicals compiled in ExHuMUS and ExHuM Explorer, and these
results are included in Supplementary Table S6.8.
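Statements such as "a contaminant targets N transporter genes" or "a gene is targeted by the maximum number of contaminants" correspond to node degrees in a bipartite contaminant-gene network. The sketch below shows one way to compute them, assuming the interactions are pooled from several sources and may therefore contain duplicates. The contaminant names and the function `degree_summary` are hypothetical, and the toy edges are not the curated data of Supplementary Table S6.8.

```python
from collections import Counter

# Hypothetical toy interactions as (contaminant, transporter gene) pairs.
interactions = [
    ("contaminant_A", "ABCB1"),
    ("contaminant_A", "ABCC1"),
    ("contaminant_B", "ABCB1"),
    ("contaminant_C", "ABCB1"),
    ("contaminant_C", "SLC22A1"),
    ("contaminant_A", "ABCB1"),  # duplicate record reported by a second source
]

def degree_summary(edges):
    """Count distinct targets per contaminant and distinct contaminants per gene."""
    unique_edges = set(edges)  # collapse duplicates before counting
    targets_per_contaminant = Counter(c for c, _ in unique_edges)
    contaminants_per_gene = Counter(g for _, g in unique_edges)
    return targets_per_contaminant, contaminants_per_gene

per_contaminant, per_gene = degree_summary(interactions)
print(per_contaminant["contaminant_A"])
print(per_gene.most_common(1))
```

Collapsing duplicate pairs before counting keeps an interaction reported by two sources from being counted twice.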
From the analysis reported in this section, it is evident that the human milk contami-
nant Arsenic can target several genes or proteins in lactation pathway, cytokine signalling
and production pathway, and xenobiotic transporters (Figures 6.5, 6.6 and 6.7). Based
on the compilation of studies in ExHuMId, Arsenic was detected in human milk samples
collected from 3 states of India, namely, Chhattisgarh, Maharashtra and West Bengal.
Figure 6.7: Sankey plot shows the tripartite network of human milk contaminants in ExHuMId,
their target genes or proteins corresponding to cytokine receptors, and the cytokines regulated by
the specific cytokine receptors. Besides each contaminant, the number of target cytokine receptors
is mentioned in parenthesis, and similarly, besides each cytokine receptor, the number of cytokines
regulated is mentioned in parenthesis.
Arsenic was also found in a human milk sample from the United States, as reported in
Lehmann et al. [286]. According to the scientific literature, Arsenic has been found
in many biospecimens from across the world [327]. In particular, the primary
source of Arsenic is known to be ground water or drinking water [328, 329]. Moreover,
several studies have reported Arsenic contamination in ground water
and drinking water samples collected from several states in India [330–333]. Thus, it is
not surprising that Arsenic has been found to be a human milk contaminant.
6.9 Discussion
Human milk is the sole source of nourishment for infants for the first few months of their
lives, during which exposure to environmental contaminants is a concern. These con-
taminants may have an impact on maternal health and lactation as well. Understanding
the effects of these environmental contaminants on maternal and infant health remains
challenging [66, 67, 285]. In recent years, there has been increased interest in the development
of an integrated approach in toxicology known as the exposome, which captures
all the environmental exposures of humans during their lifetime, their associated biological
responses, and the implications of the exposures on their health [13, 18–20]. In this
work, we have developed a comprehensive resource on the Exposome of Human Milk across
India, ExHuMId version 1.0, through a systematic approach.
The development of a resource on human milk exposome specific to India is the first
step in covering the wide range of information related to detected human milk contam-
inants, their concentrations, maternal factors, and other information which are dispersed
across a large body of scientific literature. This chapter does not attempt to determine mean
concentrations of contaminants or established benchmarks such as Reference Dose (RfD), Tolerable
Daily Intake (TDI) or Average Daily Dose (ADD), as the data compiled in this work are diverse,
in consonance with the breadth of the Indian population. In the Indian context, it is important
to highlight the availability of guidelines provided by the US EPA
on child-specific exposure scenarios examples [334], which can help
to estimate the above benchmarks specific to India. During our literature mining, we also
found thousands of research articles available in the corpus of PubMed [158], on the de-
tection of environmental contaminants in human milk across the world. Thus, the expan-
sion of human milk exposome resources worldwide, and the availability of experimentally
determined M/P ratio for environmental contaminants can help in better risk assessment
and management of human milk contaminants. Importantly, further studies are necessary
to understand the influence of variable factors such as maternal factors [67, 71, 287], the
pharmacokinetics of environmental contaminants [71, 286], and the complexity of lac-
tation pathways and physiology [287, 313] in order to incorporate these variables in the
risk estimation of human milk contaminants. We also note that there are several studies
on detection of environmental contaminants in other specimens such as blood, plasma,
serum, placenta, urine, saliva across India, and substantial manual effort is required to
develop a comprehensive exposome resource specific to India which is beyond the scope
of this work. In future, we would like to contribute further towards mapping the external
exposomes specific to India.
Supplementary Information
Supplementary Tables S6.1-S6.8 associated with this chapter are available for download
from the GitHub repository: https://github.com/asamallab/PhDThesis-Janani_R/blob/main/SI/ST_Chapter6.xlsx.
Feature                                                       ExHuMId                     ExHuMUS  ExHuM Explorer
Number of human milk contaminants                             101                         127      183
Number of published research articles covered                 36                          44       31
Web interface                                                 Yes                         No       Yes
Compilation of concentration of human milk contaminants       Yes                         Yes      Yes
Compilation of maternal factors from the experimental data    Yes                         No       No
Categorization of contaminants based on environmental source  Yes                         No       No
Chemical classification of contaminants                       Yes                         No       Yes
Standard chemical identifiers of contaminants                 Yes                         No       Yes
Availability of 2D structure for contaminants                 Yes                         No       Yes
Availability of 3D structure for contaminants                 Yes                         No       Yes
Downloadable formats for 2D and 3D structure of contaminants  SDF, MOL, MOL2, PDB, PDBQT  No       MOL, SDF, PDB
Physicochemical properties of contaminants                    Yes                         No       No
Molecular descriptors for contaminants                        Yes                         No       No
Predicted ADMET properties of contaminants                    Yes                         No       No
Chemical-gene association network                             Yes                         No       No
Chemical similarity filter                                    Yes                         No       No
Table 6.1: Comparison of the features including meta-information captured in ExHuMId with
respect to two other resources, ExHuMUS and ExHuM Explorer, on human milk contaminants.
Chapter 7
Apart from breast milk, infants are also exposed to environmental chemicals in food,
indoor air, child care products and toys, which are part of the external exposome of children
[80, 81, 148, 335, 336]. Exposure to hazardous chemicals is a significant health concern
for children, who have a high metabolic rate, immature organ systems, thin skin, and rapidly
growing and developing organs and tissues [79–81]. Notably, children are exposed to
chemicals in toys and different child care products related to feeding, diapering, bathing
and clothing [81, 335, 337]. With respect to chemicals in children’s products, the toxic
effects of heavy metals, phthalates and brominated flame retardants have been well stud-
ied [80, 81, 148, 335, 336, 338]. There are also regulations in some parts of the world that
limit the use of hazardous chemicals in children’s products. However, fragrance chemicals
which are a subset of chemicals used in children’s products remain either self-regulated
or poorly regulated [75, 79, 81]. Moreover, there is a lack of an overarching international
approach for the global regulation of chemicals (including fragrances) in children’s prod-
ucts [336].
Fragrance chemicals in terms of their chemical origin are either natural or synthetic
compounds, and exposure to such chemicals can lead to asthma, contact dermatitis (irritant
or allergic), dyschromia, photosensitivity, and migraine headaches [73–76, 78, 86].
Further, certain fragrance chemicals used in cosmetics or personal care products were
found to be carcinogens or neurotoxicants, or to be linked to reproductive disorders [75, 78,
339–341]. Notably, fragrance chemicals have been detected in human samples of blood,
adipose tissue and breast milk [75, 339]. Exposure to these fragrance chemicals can occur
via direct skin contact, inhalation, or ingestion [342, 343]. For instance, when children
are exposed to fragrance chemicals found in skin care products like moisturizing lotions,
soaps, or baby diapers, such chemicals may penetrate the skin, be absorbed into the
bloodstream, and subsequently be distributed to various organs [339]. Given the potential
health risk posed by these fragrance chemicals in early childhood, there is a need to continuously
monitor and regulate such chemicals to ensure the safety of children’s products.
In the European Union (EU), the ‘EU Toy Safety Directive’ [145] and the ‘Danish EPA
Sensitizing Fragrances in Children’s Articles’ [146] are two regulations that limit the use
of certain fragrance chemicals in children’s products. Still, there is no dedicated online
repository to date that compiles the inventory of fragrance chemicals used in children’s
products. In this chapter, we present a comprehensive resource of fragrance chemicals de-
tected experimentally in children’s products and several analyses of the associated chem-
ical space to highlight the need and importance of monitoring and regulating the use of
such chemicals in children’s products. The work reported in this chapter is contained
in the published manuscript [40].
dren’s products
As a first step towards building the database, we performed literature mining to identify
published experimental studies which report or detect fragrance chemicals used in children’s
products. For this, we mined PubMed [158] using the following keyword search:
The above keyword search which was last performed on 23 March 2021 resulted in 306
research articles from PubMed. Further, we manually curated these 306 research arti-
cles to filter the relevant articles reporting the fragrance chemicals identified in children’s
products. Specifically, we retained experimental studies that reported fragrance or scented
compounds detected across children’s products. Moreover, studies that reported chemi-
cals other than fragrance chemicals, as well as the ones that did not include any children’s
products were excluded. Finally, this manual curation led to the identification of 21 re-
search articles that contain information on fragrance chemicals from children’s products
like toys, moisturizing creams, shampoos, infant milk formula, and baby diapers (Figure
7.1; Supplementary Table S7.1). Of these 21 research articles, 11 publications reported
fragrance chemicals identified in ‘toys’ [40]. The steps involved in the filtration of the 306
research articles to compile experimental studies that have detected fragrance chemicals
in children’s products are described in a flowchart based on the preferred reporting items
for systematic reviews and meta-analyses (PRISMA) [344] (Figure 7.1).
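The inclusion and exclusion criteria described above can be expressed as a simple screening function that also tallies the reasons for exclusion, as a PRISMA-style flowchart requires. This is a minimal sketch under stated assumptions: in the actual work the screening was done by manual curation, and the `Article` fields, the placeholder identifiers, the reason strings and the order in which criteria are checked are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Article:
    pmid: str                # placeholder identifier in this sketch
    experimental: bool       # reports an experimental detection study
    reports_fragrance: bool  # reports fragrance or scented compounds
    childrens_product: bool  # covers at least one children's product

def exclusion_reason(article):
    """Return None if the article passes all criteria, else the first failing one."""
    if not article.experimental:
        return "not an experimental study"
    if not article.reports_fragrance:
        return "no fragrance chemicals reported"
    if not article.childrens_product:
        return "no children's products included"
    return None

def screen(articles):
    """Split articles into the included set and a tally of exclusion reasons."""
    included, excluded = [], Counter()
    for article in articles:
        reason = exclusion_reason(article)
        if reason is None:
            included.append(article)
        else:
            excluded[reason] += 1
    return included, excluded

toy_articles = [
    Article("id-1", True, True, True),
    Article("id-2", True, False, True),
    Article("id-3", False, True, True),
    Article("id-4", True, True, False),
]
included, excluded = screen(toy_articles)
print(len(included), dict(excluded))
```

Recording one reason per excluded article makes the counts in each box of the flowchart add up to the number of articles screened.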
cals
From the filtered set of 21 research articles, we next compiled the list of detected fragrance
chemicals, along with the source or types of children’s products in which the chemicals
were identified. For unambiguous analysis of the fragrance chemicals compiled in this
dataset, we further mapped the chemicals to their standard chemical identifiers using
CAS [164] and PubChem [86]. This process led to the compilation of 153 unique fra-
grance chemicals from the filtered set of 21 research articles (Supplementary Table S7.2).
Figure 7.1: The flowchart depicting the steps involved in the selection of published research
articles that are used to compile the fragrance chemicals experimentally detected in children’s
products.
Thereafter, using PubChem [86] database, we gathered two-dimensional (2D) and three-
dimensional (3D) structure information, IUPAC name, canonical SMILES, InChI, and
InChIKey for the 153 fragrance chemicals compiled in this dataset [40].
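Mapping reported names to a standard identifier is what allows synonyms from different articles to be merged into one unique chemical. The sketch below illustrates the idea using CAS Registry Numbers, including the standard CAS check-digit test. It is an illustration under stated assumptions rather than the curation pipeline used here: the lookup table `name_to_cas` stands in for the manual mapping, the function names are hypothetical, and the product sources in the toy records are invented.

```python
import re
from collections import defaultdict

CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")

def is_valid_cas(cas):
    """Check the format and the check digit of a CAS Registry Number."""
    match = CAS_PATTERN.match(cas)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... starting from the rightmost digit.
    total = sum(w * int(d) for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))

def compile_unique(records, name_to_cas):
    """Merge (reported_name, product) records that map to the same CAS number."""
    merged = defaultdict(lambda: {"names": set(), "products": set()})
    unmapped = []
    for name, product in records:
        cas = name_to_cas.get(name.strip().lower())
        if cas is None or not is_valid_cas(cas):
            unmapped.append(name)  # left for manual inspection
            continue
        merged[cas]["names"].add(name)
        merged[cas]["products"].add(product)
    return dict(merged), unmapped

# Hypothetical lookup table and toy records (product sources are illustrative).
name_to_cas = {
    "linalool": "78-70-6",
    "3,7-dimethylocta-1,6-dien-3-ol": "78-70-6",  # systematic name of linalool
    "benzyl alcohol": "100-51-6",
}
records = [
    ("Linalool", "toy"),
    ("3,7-Dimethylocta-1,6-dien-3-ol", "shampoo"),
    ("Benzyl alcohol", "moisturizing cream"),
    ("unidentified scent", "toy"),
]
unique, unmapped = compile_unique(records, name_to_cas)
print(len(unique), unmapped)
```

Names that cannot be mapped are returned separately instead of being dropped silently, so that they can be resolved by hand.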
Subsequently, the 153 fragrance chemicals were classified based on: (a) chemical
structure, (b) children’s product source, and (c) chemical origin. Firstly, we used Classy-
Fire [173, 174] to classify the 153 fragrance chemicals based on their chemical structure
(Figure 7.2A). ClassyFire [174] based chemical classification of the 153 fragrance chem-
icals in FCCP revealed that all fragrance chemicals in this resource are ‘organic’. Further,
among the 153 fragrance chemicals in FCCP, 50 are ‘benzenoids’ and 40 are ‘organic
oxygen compounds’ according to ClassyFire (Figure 7.2A).
Secondly, we classified the children’s product source information for the fragrance
chemicals obtained from the associated literature, and this resulted in 8 broad categories
and 19 sub-categories (Figure 7.2B). The 8 broad categories include ‘Clothing and Ac-
cessories’, ‘Diapering’, ‘Diet and Feeding’, ‘Hair care’, ‘Miscellaneous products’, ‘Oral
care’, ‘Skin care’, and ‘Toys’. We find that 5 chemicals namely, ‘Benzyl alcohol’, ‘Ben-
zyl benzoate’, ‘Citronellol’, ‘Hexyl cinnamic aldehyde’, and ‘Linalool’ were present in
5 out of 8 broad categories of children’s product source. The 19 sub-categories represent the
standardized terms for children’s products studied in the published literature. For example,
sub-categories such as ‘clay toys’ and ‘plastic toys’ were grouped into the broad category
of ‘Toys’. Of the 153 fragrance chemicals in FCCP, 85 have their children’s product
source as ‘Toys’, and moreover, these chemicals belong to 9 different sub-categories of
toys (Figure 7.2B).
Thirdly, we classified the fragrance chemicals based on their origin into either ‘nat-
ural’ or ‘synthetic’ (Figure 7.2C). Based on literature search, we determined whether a
fragrance chemical is a natural product (i.e., produced by microbes, plants or animals)
or a synthetic chemical (i.e., man-made or artificial). Several natural chemicals are be-
ing synthesized due to increased demand. However, if there is evidence that a fragrance
chemical has a natural source (e.g., plants, animals, fungi, algae, bacteria), we label it as
[Panel C labels: Natural compounds (97), Synthetic compounds (46), Unclassified (10).]
Figure 7.2: (A) ClassyFire based classification of the 153 fragrance chemicals
into 7 superclasses. The number of fragrance chemicals in each superclass is indicated within the
parenthesis. (B) Histogram shows the distribution of the 153 fragrance chemicals across 8 broad
categories of children’s product source. (C) Classification of the 153 fragrance chemicals based
on their chemical origin. The number of fragrance chemicals in each category is indicated within
the parenthesis. (D) The column chart shows the distribution of the fragrance chemicals across 24
odor classes. (E) The graph shows the distribution of the 153 fragrance chemicals across different
categories of chemical lists reflecting guidelines or regulations, namely, ‘Guidelines specific to
children’s products’, ‘Hazardous substances’, ‘Regulations specific to cosmetics and fragrances’,
‘Safer chemicals’, ‘Skin sensitization’, ‘Substances of Very High Concern’, and ‘High Production
Volume (HPV)’ chemicals. This figure also gives the number of chemicals produced in high
volume in each category.
Furthermore, we compiled the odor information for the 153 fragrance chemicals from
various resources including Flavornet [345, 346], FlavorDB [68, 347], The Good Scents
Company Information System [348] and other published literature. Based on this com-
pilation of the odor information, 102 odor types were known to be associated with 140
fragrance chemicals compiled in this dataset. Similar to Flavornet [346], these 102 odor
types were further grouped into 24 odor classes (Figure 7.2D; Supplementary Table S7.3).
Moreover, the odor profiling of the fragrance chemicals in FCCP showed that each chem-
ical is associated with multiple odor classes (Supplementary Table S7.3). Of the 24 odor
classes associated with the fragrance chemicals in FCCP, ‘Aromatic’ odor is found to be
prevalent among 108 fragrance chemicals in FCCP, followed by the odor classes ‘Veg-
etable’ with 100 fragrance chemicals and ‘Herbs’ with 97 fragrance chemicals (Figure
7.2D; Supplementary Table S7.3).
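The grouping of odor types into odor classes, followed by counting the chemicals in each class, can be sketched as below. This is an assumption-laden illustration: the type-to-class mapping is a hypothetical fragment and not the grouping of Supplementary Table S7.3, the chemical names are placeholders, and `chemicals_per_class` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical fragment of an odor type -> odor class mapping.
odor_type_to_class = {
    "rose": "Floral",
    "jasmine": "Floral",
    "lemon": "Citrus",
    "orange": "Citrus",
    "mint": "Herbs",
}
# Hypothetical odor annotations per chemical.
chemical_odor_types = {
    "chemical_1": ["rose", "lemon"],
    "chemical_2": ["jasmine", "rose"],
    "chemical_3": ["mint", "orange"],
    "chemical_4": [],  # no odor information available
}

def chemicals_per_class(chem_odors, type_to_class):
    """Count distinct chemicals associated with each odor class."""
    members = defaultdict(set)
    for chemical, odor_types in chem_odors.items():
        for odor_type in odor_types:
            odor_class = type_to_class.get(odor_type, "Unknown")
            members[odor_class].add(chemical)
    return {odor_class: len(chems) for odor_class, chems in members.items()}

counts = chemicals_per_class(chemical_odor_types, odor_type_to_class)
# Rank classes by decreasing count, breaking ties alphabetically.
ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked)
```

Because a chemical can carry several odor types, the per-class counts can sum to more than the number of chemicals, and a chemical with two types from the same class is counted once for that class.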
https://cb.imsc.res.in/fccp [40].
The web interface of FCCP has been created using an approach similar to that described
in Section 2.2. FCCP contains detailed information on the 153 fragrance chemicals
and their chemical structures. In particular, users can readily download 2D and 3D
structures of the fragrance chemicals in different formats such as MOL, MOL2, SDF,
PDB, and PDBQT. In addition, we compiled physicochemical properties, molecular descriptors,
and predicted ADMET properties for the 153 fragrance chemicals compiled
in FCCP. To compute physicochemical properties and generate molecular descriptors of
chemicals, we have used RDKit [179], PaDEL [180, 181] and Pybel [182]. For predict-
ing ADMET properties of chemicals, we have used admetSAR 2.0 [183], pkCSM [184],
SwissADME [186], Toxtree 2.6.1 [187] and vNN server [188]. In FCCP, users can obtain
diverse information on a fragrance chemical, including 2D and 3D chemical structure, via
the search and browse option in the user-friendly web interface (Figure 7.3).
spective
To assess the current level of regulation of the fragrance chemicals compiled in FCCP,
we performed a comparative analysis with 21 publicly available chemical lists which re-
flect chemical guidelines or regulations (Figure 7.2E; Supplementary Table S7.4). These
chemical lists represent different categories including Guidelines specific to children’s
products, Regulations specific to cosmetics and fragrances, Substances of Very High Con-
cern, Hazardous substances, Skin sensitization, and Safer chemicals.
Figure 7.3: Screenshots from the web interface of FCCP. (A) Home page of FCCP. Users can
retrieve the compiled fragrance chemicals using the following search options, namely, (B) Simple
search, (C) Physicochemical filter, and (D) Chemical similarity filter. FCCP also provides a list of
options to browse the compiled fragrance chemicals, namely, (E) Children’s product source, (F)
Chemical classification, (G) Odor profile, and (H) Presence in chemical regulation or guideline.
database [151], and (iii) REACH High Production Volume (REACH HPV) chemicals
containing REACH registered substances as of 21 September 2021 with a tonnage range
≥ 1000 tonnes [152].
Based on comparison with the 6 chemical lists in the category ‘Guidelines specific to
children’s products’, we find that the ‘EU Toy Safety Directive’ list contains the highest
number (31) of fragrance chemicals in FCCP (Figure 7.4; Supplementary Table S7.5). Of
these 31 banned allergenic chemicals common to ‘EU Toy Safety Directive’ and FCCP,
3 fragrance chemicals namely, ‘Methylparaben’, ‘Propylparaben’, and ‘Phenol’ are also
contained in 4 other chemical lists in the category ‘Guidelines specific to children’s prod-
ucts’. Interestingly, we also find that 18 out of 31 fragrance chemicals common to ‘EU
Toy Safety Directive’ and FCCP are produced in high volume based on comparison with
the three chemical lists of HPV chemicals (Figure 7.4; Supplementary Table S7.5).
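The comparisons reported in this section reduce to set intersections: the overlap of FCCP with a regulatory list, and the part of that overlap which also appears in any of the HPV lists. A minimal sketch is given below, assuming every chemical is represented by one standard identifier. The identifiers, list names and contents, and the function `compare_with_lists` are hypothetical and do not reproduce Supplementary Table S7.5.

```python
def compare_with_lists(resource, regulatory_lists, hpv_lists):
    """For each regulatory list, count chemicals shared with the resource and
    how many of those shared chemicals occur in at least one HPV list."""
    hpv_union = set().union(*hpv_lists.values())
    summary = {}
    for name, members in regulatory_lists.items():
        shared = resource & members
        summary[name] = {
            "shared": len(shared),
            "shared_hpv": len(shared & hpv_union),
        }
    return summary

# Hypothetical identifiers standing in for standardized chemical identifiers.
resource = {"c1", "c2", "c3", "c4", "c5"}
regulatory_lists = {
    "list_X": {"c1", "c2", "c3", "c9"},
    "list_Y": {"c5", "c8"},
}
hpv_lists = {
    "hpv_1": {"c1", "c7"},
    "hpv_2": {"c2", "c5"},
    "hpv_3": {"c2"},
}
summary = compare_with_lists(resource, regulatory_lists, hpv_lists)
print(summary)
```

Taking the union of the HPV lists first means that a chemical present in more than one HPV list is still counted once as produced in high volume.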
Notably, 14 fragrance chemicals common to FCCP and the chemical prioritization list
‘Chemicals of concern in plastic toys’ were also found to be present in the majority of the
regulatory lists of concern investigated by Aurisano et al. [148]. Further, 13 out of these
14 fragrance chemicals are produced in high volume (Figure 7.4; Supplementary Table
S7.5).
To better comprehend the regulation of compiled fragrance chemicals for their use in
personal care products, we considered 2 publicly available lists that compile chemicals
which are restricted or prohibited for their use in cosmetics or fragrance products. These
2 lists are: (i) EU list of substances prohibited in cosmetic products [141], and (ii) IFRA
Standards Library - Prohibited, Restricted, Specification list [351].
Based on comparison with the 2 above chemical lists specific to cosmetics and
fragrances, the ‘IFRA Standards Library - Prohibited, Restricted, Specification’ list
contains 43 fragrance chemicals in FCCP, while the ‘EU list of substances prohibited
in cosmetic products’ contains 19 fragrance chemicals in FCCP. Further, 10 fragrance
chemicals in FCCP namely, ‘2-Heptenal’, ‘2,4-Dihydroxy-3-methylbenzaldehyde’,
‘4-Tert-Butylphenol’, ‘7-Ethoxy-4-methylcoumarin’, ‘7-Methoxycoumarin’, ‘7-
Methylcoumarin’, ‘Benzylideneacetone’, ‘Hexahydrocoumarin’, ‘Isophorone’, and
‘Lyral’ are present in both chemical lists in the category ‘Regulations specific to cosmet-
ics and fragrances’. Moreover, of these 10 fragrance chemicals, 3 are also produced in
high volume (Figure 7.4; Supplementary Table S7.5).
L1 Chemicals of concern in plastic toys
L2 Danish EPA Sensitizing Fragrances in Children’s Articles
L3 EU Toy Safety Directive
L4 Washington State Children’s Safe Product Act
L5 High Priority Chemicals of Concern for Children's Health - Oregon State
L6 Chemicals of High Concern to Children's products rule - Vermont State
L7 EU list of substances prohibited in cosmetic products
L8 IFRA Standards Library-Prohibited, Restricted, Specification
L9 SVHC under EU REACH
L10 IARC monographs on carcinogens
L11 Database of Endocrine Disrupting Chemicals and their Toxicity profiles (DEDuCT)
L12 List of mammalian neurotoxicants from NeurotoxKb
L13 Toxic plant-phytotoxins (TPPT)
L14 ICCVAM: Skin Corrosion 2004 collection from NIEHS
L15 ICCVAM: local lymph node assay (LLNA) 2009
L16 NIOSH: Skin Notation Profiles
L17 PubChem Compound TOC: Skin, Eye, and Respiratory Irritations
L18 US EPA safer chemical ingredients list
L19 Organisation for Economic Co-operation and Development High Production Volume (OECD HPV)
L20 United States High Production Volume (USHPV) database
L21 REACH High Production Volume (REACH HPV) chemicals
Figure 7.4: Sankey plot showing the presence of fragrance chemicals in FCCP across 21 chemical
lists which reflect regulations or guidelines. Further, the 21 chemical lists have been classified
into 7 categories which include Guidelines specific to children’s products, Regulations specific to
cosmetics and fragrances, Hazardous substances, Skin sensitization, Safer chemicals, Substances
of Very High Concern, and High Production Volume (HPV) chemicals.
high concern among the compiled fragrance chemicals in FCCP.
Based on comparison with the only chemical list in the category ‘Substances of Very
High Concern’, we find that 3 fragrance chemicals in FCCP are contained in ‘SVHC under
EU REACH’ list. These 3 fragrance chemicals are ‘4-Tert-Butylphenol’, ‘Butylparaben’,
and ‘Musk xylene’, of which 2 fragrance chemicals are also produced in high volume
(Figure 7.4; Supplementary Table S7.5).
To analyze the fragrance chemicals in FCCP for known chemical hazards, we considered
4 publicly available lists which include: (i) IARC monographs on carcinogens [208],
(ii) Database of Endocrine Disrupting Chemicals and their Toxicity profiles (DEDuCT)
[35, 36] (https://cb.imsc.res.in/deduct/), (iii) List of mammalian neurotoxicants
from NeurotoxKb [37] (https://cb.imsc.res.in/neurotoxkb/), and (iv) Toxic
plant-phytotoxins (TPPT) database [68, 352].
Based on comparison with the 4 chemical lists in the category ‘Hazardous sub-
stances’, 17, 15, 8, and 21 fragrance chemicals in FCCP are also carcinogens, endocrine
disruptors, neurotoxicants and phytotoxins, respectively (Figure 7.4). The presence of
these fragrance chemicals in consumer products for children increases the possibility of
exposure, which may lead to potential health impacts in children. Carcinogens reported
in IARC monographs have been categorized into one of the following groups: (i) Group
1 chemicals are human carcinogens, (ii) Group 2A chemicals are listed as ‘probable’ hu-
man carcinogens, (iii) Group 2B chemicals are possibly carcinogenic to humans, and (iv)
Group 3 chemicals are not classifiable as human carcinogens [296]. Of the 17 fragrance
chemicals in FCCP that are also carcinogens, 2, 1, 3 and 11 fragrance chemicals belong
to Group 1, Group 2A, Group 2B and Group 3, respectively, based on the IARC monographs
classification. Further, 12 out of these 17 carcinogens in FCCP are also produced in high volume
(Supplementary Table S7.5). A similar analysis revealed that 12, 8, and 10 fragrance
chemicals in FCCP which are endocrine disruptors, neurotoxicants and phytotoxins, re-
spectively, are also produced in high volume, indicating the potential for adverse health
effects in children when exposed to such chemicals (Supplementary Table S7.5). Notably,
two fragrance chemicals in FCCP namely, ‘Ethanol’ and ‘Acetaldehyde’ are contained in
3 out of the 4 chemical lists in the category ‘Hazardous substances’ (Figure 7.4).
Based on comparison with the 4 chemical lists in the category ‘Skin sensitization’,
we find that the chemical list ‘PubChem Compound TOC: Skin, Eye, and Respiratory
Irritations’ contains 62 out of the 153 fragrance chemicals in FCCP (Figure 7.4; Supple-
mentary Table S7.5). Further, 5 fragrance chemicals in FCCP namely, ‘2-Butoxyethanol’,
‘Citral’, ‘Eugenol’, ‘Lauric acid’, and ‘Phenol’, are present in at least 2 out of the 4 chem-
ical lists in the category ‘Skin sensitization’. Moreover, all of these 5 fragrance chemicals
are also produced in high volume (Supplementary Table S7.5).
The United States Environmental Protection Agency (US EPA) has released a list of
chemicals that are considered to be among the safest for their intended functional use
[171]. In other words, the chemicals in this list are safer alternatives for certain functional
uses including chelating agents, colorants, polymers, preservatives, enzyme stabilizers,
perfumes, solvents, and surfactants. The US EPA considers a chemical to be a safer alternative
for a specific functional-use category only if the chemical meets the Safer Choice
Program criteria, which include the assessment of a wide range of potential toxicological
effects such as carcinogenicity, mutagenicity, bioaccumulation, skin sensitization, aller-
genicity, and endocrine disruption. Further, US EPA gives the following classification of
chemicals that indicates their safety status in each functional category: (i) ‘Green circle’
indicates the chemicals that are verified to be of low concern, (ii) ‘Green half-circle’ indi-
cates the chemicals that are expected to be of low concern based on the available evidence,
(iii) ‘Yellow triangle’ indicates the chemicals for which there is some evidence of hazard
even though they are listed as safe for certain functional uses, and (iv) ‘Grey square’ indicates
the chemicals that are not acceptable for use in some products and must be
reformulated. We used this list to assess the fragrance chemicals in FCCP.
Based on this comparison, we find that 31 fragrance chemicals in FCCP are contained
in the ‘US EPA safer ingredients’ list (Supplementary Table S7.5). Since the ‘US EPA
safer ingredients’ list classifies the chemicals based on different use categories (like sol-
vents, fragrances), we analyzed these 31 chemicals based on these categories. Of these
31 fragrance chemicals, we find that 25 were labeled as ‘safer’ for use as fragrance in-
gredients in consumer products, while the remaining 6 were not labeled as ‘safer’ for use
as fragrance ingredients. Furthermore, analysis of these 25 (safer) fragrance chemicals
in the ‘US EPA safer ingredients’ list based on the type of evidence revealed that 2, 3,
and 20 fragrance chemicals belong to ‘Green circle’, ‘Green half-circle’, and ‘Yellow
triangle’ categories, respectively (Figure 7.4). Of these 25 (safer) fragrance chemicals,
we find that 4, 5, and 5 fragrance chemicals are present in 3 chemical lists that reflect
guidelines specific to children’s products namely, ‘Chemicals of concern in plastic toys’,
‘Danish EPA Sensitizing Fragrances in Children’s Articles’, and ‘EU Toy Safety Direc-
tive’, respectively. Interestingly, we find that 22 out of the 25 (safer) fragrance chemicals
are listed in ‘IFRA Standards Library - Prohibited, Restricted, Specification’. By ana-
lyzing these 25 (safer) fragrance chemicals with chemical lists grouped in ‘Hazardous
substances’ category, we find that the chemicals ‘Benzyl salicylate’ and ‘D-limonene’ are a
class 3 carcinogen and an endocrine disruptor, respectively. In addition, these two chemicals
are also produced in high volume (Figure 7.4; Supplementary Table S7.5). Although
these 25 chemicals were marked ‘safer’ for their use as fragrance ingredients by the US
EPA, some of them are present in other lists of chemicals that display hazard
profiles or that are suggested to be restricted or prohibited in cosmetics or children’s products.
Overall, these results highlight the disparities in the regulations or guidelines across
countries, necessitating prioritization and risk assessment of fragrance chemicals used in
children’s products, as many of them have the potential to cause health hazards in children
[40].
dren’s products
To better understand the space of fragrance chemicals in children’s products, we com-
pared the structural similarity of fragrance chemicals in our resource FCCP with the list
of allergenic fragrance chemicals restricted or banned for their use in children’s toys as
compiled in the ‘EU Toy Safety Directive’ [145]. For this purpose, we constructed two
chemical similarity networks (CSNs), one for the 153 fragrance chemicals in FCCP, and
another for the 58 allergenic fragrance chemicals in the ‘EU Toy Safety Directive’. Note
that only 58 out of the 66 allergenic fragrance chemicals in the ‘EU Toy Safety Directive’
have chemical structure information available.
To build the CSNs, the Tanimoto coefficient [200] was computed using Extended-Connectivity
Fingerprints (ECFP4) [129] for each pair of chemicals within each of the two
datasets. The Tanimoto coefficient for any pair of compounds ranges from 0 to 1, with 1 signifying
two compounds with identical fingerprints. This led to two CSNs, one with 153
nodes for fragrance chemicals in FCCP, and another comprising 58 nodes for banned
allergenic fragrance chemicals in the ‘EU Toy Safety Directive’. Based on previous stud-
ies [283,357], a Tanimoto coefficient cut-off of 0.5 was used to determine if an edge exists
between any pair of chemicals in the dataset, resulting in a high similarity network of fra-
grance chemicals. Moreover, we also computed the Tanimoto coefficient for each pair of
a fragrance chemical in FCCP and a banned allergenic fragrance chemical in the ‘EU Toy
Safety Directive’ (Supplementary Table S7.6). A detailed investigation of the two CSNs
can help reveal the extent of structural similarities between chemicals in our resource and
‘EU Toy Safety Directive’.
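The construction described above can be illustrated with a minimal Python sketch. It is not the code used in this work: fingerprints are represented simply as sets of 'on' bit indices, the chemical names and bit sets are hypothetical toy values rather than real ECFP4 fingerprints, and the function names `tanimoto` and `build_csn` are assumptions made for illustration.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def build_csn(fingerprints, cutoff=0.5):
    """Return the edge list of a chemical similarity network.

    An edge joins two chemicals whose Tanimoto coefficient is >= cutoff.
    """
    edges = []
    for a, b in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[a], fingerprints[b])
        if score >= cutoff:
            edges.append((a, b, score))
    return edges


# Hypothetical toy fingerprints (NOT real ECFP4 bits).
toy_fps = {
    "chem_A": {1, 2, 3, 4},
    "chem_B": {2, 3, 4, 5},
    "chem_C": {10, 11},
}

csn_edges = build_csn(toy_fps, cutoff=0.5)
for a, b, score in csn_edges:
    print(f"{a} -- {b}: {score:.2f}")
```

In this toy case, chem_A and chem_B share 3 of 5 distinct bits (coefficient 0.6) and are joined by an edge, while chem_C remains an isolated node.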
An analysis of the CSN of 153 fragrance chemicals in FCCP reveals that there are
16 connected components with ≥ 2 chemicals and 51 isolated nodes (chemicals), and this
suggests a high structural diversity in the space of fragrance chemicals used in children’s
products (Figure 7.5A). Notably, the largest connected component in the CSN of 153
fragrance chemicals in FCCP consists of 25 fragrance chemicals (Figure 7.5A). In Fig-
ure 7.5A, the 31 fragrance chemicals common to FCCP and ‘EU Toy Safety Directive’ of
banned allergenic chemicals are highlighted in green. We observed that the 31 banned al-
lergenic chemicals are dispersed across different connected components in the CSN of 153
fragrance chemicals in FCCP, implying that both chemical spaces are structurally diverse.
Furthermore, we computed the chemical similarity using the Tanimoto coefficient [200]
between each chemical in FCCP and each banned allergenic chemical in ‘EU Toy Safety
Directive’, and any fragrance chemical in FCCP with chemical similarity ≥ 0.7 to any of
the banned allergenic chemicals in the ‘EU Toy Safety Directive’ is also highlighted in
the CSN of 153 fragrance chemicals in FCCP (Figure 7.5A; Supplementary Table S7.6).
Finally, we also built and visualized the CSN for the 58 banned allergenic chemicals in
‘EU Toy Safety Directive’ (Figure 7.5B). The CSN of 58 banned allergenic
chemicals in ‘EU Toy Safety Directive’ has 11 connected components with ≥ 2 chemicals
and 26 isolated nodes (Figure 7.5B). Overall, an analysis of these CSNs reveals the
structural diversity of the fragrance chemical space [40].
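The component statistics reported above (number of connected components with ≥ 2 chemicals, number of isolated nodes, and size of the largest component) can be obtained from an edge list with a simple graph traversal. The following is a minimal sketch under stated assumptions: the node names and edges are hypothetical toy values, and `connected_components` is an illustrative helper, not the software used in this work.

```python
def connected_components(nodes, edges):
    """Group nodes into connected components using an iterative depth-first search."""
    neighbours = {node: set() for node in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical toy network: six chemicals, three edges.
toy_nodes = ["c1", "c2", "c3", "c4", "c5", "c6"]
toy_edges = [("c1", "c2"), ("c2", "c3"), ("c4", "c5")]

components = connected_components(toy_nodes, toy_edges)
multi_node = [c for c in components if len(c) >= 2]
isolated = [c for c in components if len(c) == 1]
largest = max(components, key=len)

print(len(multi_node), "components with >= 2 chemicals")
print(len(isolated), "isolated nodes")
print("largest component size:", len(largest))
```

A large number of small components and isolated nodes relative to the number of chemicals is what the text above interprets as structural diversity.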
Figure 7.5: Chemical similarity networks (CSNs) of fragrance chemicals. Here,
nodes represent fragrance chemicals, and two nodes are connected by an edge if the corresponding
chemicals have chemical similarity ≥ 0.5 based on Tanimoto coefficient. (A) CSN of the 153
fragrance chemicals in FCCP. Here, nodes corresponding to the 31 fragrance chemicals common
to both FCCP and ‘EU Toy Safety Directive’ (L3) are highlighted in ‘green’, while the other
nodes are colored based on their level of chemical similarity to the banned allergenic chemicals
in L3. (B) CSN of the 58 allergenic fragrance chemicals in ‘EU Toy Safety Directive’ (L3). Note
that only 58 out of the 66 allergenic fragrance chemicals in the ‘EU Toy Safety Directive’ have
chemical structure information available. Here, nodes corresponding to the allergenic fragrance
chemicals that are also present in FCCP have been highlighted.
fragrance chemicals in FCCP can bind, ORL2156, ORL2162, ORL1858, ORL1553 and
ORL1138 are found to be targeted by at least 5 fragrance chemicals in FCCP. Additional
information on the binding of fragrance chemicals in FCCP to different odor receptors
can help better understand the mechanisms of olfactory perception [40].
Besides compiling the odor receptors, we also identified the human target genes
of the fragrance chemicals in FCCP using ToxCast [89]. ToxCast provides information
on the genes perturbed upon exposure to chemicals, as identified
by high-throughput experimental assays. To identify the human target genes for
the fragrance chemicals in FCCP, we used ToxCast invitroDB3 dataset released in August
2019 [215]. We followed the method described in Section 2.4.2 to extract from ToxCast
the human target genes perturbed upon exposure to fragrance chemicals in FCCP (Supple-
mentary Table S7.8). Based on the ToxCast assays, we were able to compile 130 human
genes which are targets of at least one of 102 fragrance chemicals in FCCP (Supplemen-
tary Table S7.8). Of these 102 fragrance chemicals in FCCP, 18 fragrance chemicals can
target at least 20 human genes based on ToxCast assays. Specifically, 4 fragrance chemi-
cals namely, ‘Propylparaben’, ‘2-Benzylideneheptanal’, ‘Oxacyclohexadecan-2-one’, and
‘Hexyl cinnamic aldehyde’ can target more than 40 human genes based on ToxCast as-
says. Among the 130 human target genes of the 102 fragrance chemicals in FCCP, 14
human genes are targets of at least 20 fragrance chemicals in FCCP. An in-depth analysis
of these target genes can shed light on shared toxicological mechanisms associated with
fragrance chemicals in children’s products [40].
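The counts reported above follow from aggregating active assay results into a chemical-to-gene mapping and its inverse. The sketch below illustrates this aggregation only; the records, chemical names, gene names and the `at_least` helper are hypothetical, and the record layout is an assumption that does not reproduce the ToxCast invitroDB3 schema.

```python
from collections import defaultdict

# Hypothetical records: (chemical, target gene, hit call), where 1 means 'active'.
toy_hits = [
    ("chem_A", "GENE1", 1),
    ("chem_A", "GENE2", 1),
    ("chem_A", "GENE2", 1),  # second endpoint for the same gene
    ("chem_B", "GENE1", 1),
    ("chem_B", "GENE3", 0),  # inactive, ignored
    ("chem_C", "GENE3", 0),  # inactive, ignored
]

genes_by_chemical = defaultdict(set)
chemicals_by_gene = defaultdict(set)
for chemical, gene, hit_call in toy_hits:
    if hit_call == 1:
        genes_by_chemical[chemical].add(gene)
        chemicals_by_gene[gene].add(chemical)


def at_least(mapping, k):
    """Keys of `mapping` whose value set has at least k members."""
    return sorted(key for key, values in mapping.items() if len(values) >= k)


print(len(chemicals_by_gene), "target genes for", len(genes_by_chemical), "chemicals")
print("chemicals with >= 2 target genes:", at_least(genes_by_chemical, 2))
print("genes targeted by >= 2 chemicals:", at_least(chemicals_by_gene, 2))
```

Using sets ensures that a gene assessed by several active endpoints is counted once per chemical.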
Figure 7.6: (A) Bipartite graph displaying the 20 fragrance chemicals in FCCP and their asso-
ciated odor receptors identified using OdorDB. (B) Bipartite graph displaying the human target
genes of 7 fragrance chemicals in FCCP which were identified to have potential to cause skin
sensitization based on ToxCast in vitro human assays. Here, the number of odor receptors or tar-
get genes associated with each fragrance chemical is mentioned in parenthesis, and similarly, the
number of fragrance chemicals associated with each odor receptor or target gene is also mentioned
in parenthesis.
sensitization that can be used to select relevant ToxCast assays for skin sensitization.
Within AOP-Wiki, AOP:40 describes the key events (KEs) that lead to skin sensitization,
and these include chemical binding to skin proteins, activation of keratinocytes, dendritic
cells, and T-cells. Among the KEs of AOP:40 for skin sensitization, we identified ‘Ac-
tivation, Keratinocytes’ (KE:826) as a suitable endpoint for screening of skin sensitizing
fragrance chemicals. Previous studies have also revealed that keratinocytes are useful in
determining whether substances have the potential to cause skin sensitization [365, 366].
To select the list of relevant skin sensitization assays in ToxCast, we used the ToxCast
invitroDB3 dataset released in August 2019 [215]. Firstly, we imposed a tissue-specific
filter to select only the ToxCast assays for human skin tissue. The shortlisted skin-specific
ToxCast assays use two cell systems: foreskin fibroblasts
(hDFCGF) and a co-culture of keratinocytes and foreskin fibroblasts (KF3CT). Secondly,
we evaluated the ToxCast assays performed on KF3CT cell lines that have already been
used to screen compounds for skin sensitization [367]. Thirdly, we selected only the re-
porter assays that were designed to analyze the regulation of gene expression in ToxCast.
The above-mentioned filtration resulted in the identification of human-specific skin sensitization
assays from ToxCast, which can be further used to test if a chemical has the potential
to cause skin sensitization. Note that each ToxCast assay comprises multiple assay component
endpoints which are designed to assess one or more target genes. Finally, if a fragrance
chemical in FCCP has tested ‘active’ for the assay component endpoints specific to a se-
lected human skin sensitization ToxCast assay, the corresponding gene is assigned as a
target of that fragrance chemical in FCCP [35]. This process resulted in 16 assay compo-
nent endpoints that are associated with the filtered set of skin sensitization assays in Tox-
Cast [148]. Among the fragrance chemicals in FCCP, 7 fragrance chemicals have 10 out
of the 16 assay component endpoints as ‘active’ upon exposure in the filtered set of skin
sensitization assays in ToxCast (Supplementary Table S7.8). These 7 fragrance chemicals
in FCCP namely, ‘2-Benzylideneheptanal’, ‘Hexyl cinnamic aldehyde’, ‘Linalyl acetate’,
‘Lilial’, ‘Musk ketone’, ‘Musk xylene’, and ‘Oxacyclohexadecan-2-one’, have the poten-
tial to cause skin sensitization based on ToxCast assays, and moreover, the 7 fragrance
chemicals are associated with 8 human target genes (Figure 7.6B).
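The successive filters described above (human skin tissue, KF3CT cell system, reporter assays) followed by the assignment of target genes from 'active' endpoints can be sketched as follows. This is an illustration under stated assumptions: the endpoint annotations, the field names (`organism`, `tissue`, `cell_system`, `reporter`, `gene`) and the activity calls are hypothetical and do not reproduce the ToxCast invitroDB3 schema.

```python
# Hypothetical assay component endpoint annotations (field names are assumptions).
toy_endpoints = [
    {"endpoint": "ep1", "organism": "human", "tissue": "skin",
     "cell_system": "KF3CT", "reporter": True, "gene": "GENE1"},
    {"endpoint": "ep2", "organism": "human", "tissue": "skin",
     "cell_system": "hDFCGF", "reporter": True, "gene": "GENE2"},
    {"endpoint": "ep3", "organism": "human", "tissue": "liver",
     "cell_system": "other", "reporter": True, "gene": "GENE3"},
    {"endpoint": "ep4", "organism": "human", "tissue": "skin",
     "cell_system": "KF3CT", "reporter": False, "gene": "GENE4"},
]

# Hypothetical (chemical, endpoint) pairs that tested 'active'.
toy_active = {("chem_A", "ep1"), ("chem_A", "ep3"), ("chem_B", "ep4")}

# Step 1: tissue-specific filter (human skin).
skin = [e for e in toy_endpoints if e["organism"] == "human" and e["tissue"] == "skin"]
# Step 2: keep the keratinocyte and fibroblast co-culture system.
kf3ct = [e for e in skin if e["cell_system"] == "KF3CT"]
# Step 3: keep reporter assays only.
selected = [e for e in kf3ct if e["reporter"]]
gene_of = {e["endpoint"]: e["gene"] for e in selected}

# Final step: an active call on a selected endpoint assigns its gene to the chemical.
targets = {}
for chemical, endpoint in sorted(toy_active):
    if endpoint in gene_of:
        targets.setdefault(chemical, set()).add(gene_of[endpoint])

print(sorted(gene_of))  # selected endpoints
print(targets)          # chemicals flagged by the selected endpoints
```

In this toy case only chem_A is flagged, because its other active endpoint and the active endpoint of chem_B are removed by the filters.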
Interestingly, we find that 5 out of these 7 fragrance chemicals in FCCP with skin
sensitization potential based on ToxCast assays are present in at least one of the 4 chemical
lists in the category ‘Skin sensitization’. Further, 3 out of these 7 fragrance chemicals
are present in the 2 chemical lists namely, ‘Danish EPA Sensitizing Fragrances
in Children’s Articles’ and ‘EU Toy Safety Directive’. Moreover, one of these 7 fra-
grance chemicals identified to have skin sensitization potential based on ToxCast assays
namely, ‘Oxacyclohexadecan-2-one’, is not present in any of the chemical lists in the
categories ‘Skin sensitization’ or ‘Guidelines specific to children’s products’. However,
‘Oxacyclohexadecan-2-one’ is a prohibited or restricted substance in cosmetics and fra-
grances according to ‘IFRA Standards Library - Prohibited, Restricted, Specification’ list
(Supplementary Table S7.5).
7.7 Discussion
Exposure of children to hazardous chemicals via any route is a significant concern due
to the potential impact on the growth and development during early childhood [18, 39,
80, 81, 148, 286, 287, 335, 336, 342]. Fragrance chemicals, a subset of chemicals used in
children’s products, are either self-regulated or poorly regulated [75, 79, 81]. The absence
of a dedicated knowledgebase compiling the information on fragrance chemicals in
children’s products that is dispersed across the scientific literature may also hinder risk
assessment and regulatory decisions on such chemicals.
In this chapter [40], we present a manually curated knowledgebase FCCP that compiles
153 fragrance chemicals in children’s products from 21 published experimental studies
(Figure 7.7). The detailed information on fragrance chemicals in FCCP can be easily
accessed via a user-friendly web interface. Through a comparative analysis with 21
chemical lists reflecting current guidelines or regulations, we found that several fragrance
chemicals in FCCP are either banned allergenic chemicals, or are prohibited or restricted
in cosmetics and fragrances. Further, this analysis revealed that several fragrance chemicals
in FCCP are carcinogens, endocrine disruptors, neurotoxicants, phytotoxins and skin
sensitizers, raising concerns about the potential health hazards in children. Notably, several
fragrance chemicals in FCCP of potential concern are also produced in high volume.
Next, we performed a similarity network based analysis of the fragrance chemicals in
FCCP which revealed the structurally diverse nature of the associated chemical space.
Then, we compiled and analyzed the odor receptors and human target genes for fragrance
chemicals in FCCP. Lastly, we identified 7 skin sensitizing fragrance chemicals in FCCP
using ToxCast in vitro human assays. In sum, our multipronged analysis of the atlas of
fragrance chemicals in children’s products underscores the need to monitor and regulate
them (Figure 7.7).
Figure 7.7: Schematic overview of the creation and analysis of the repository of Fragrance Chemicals
in Children’s Products (FCCP).
Supplementary Information
Supplementary Tables S7.1-S7.8 associated with this chapter are available for download
from the GitHub repository: https://github.com/asamallab/PhDThesis-Janani_R/blob/main/SI/ST_Chapter7.xlsx.
Chapter 8
In this chapter, we aim to characterize the chemical component of the external expo-
some, specific to human tissues, and to explore ways to understand the health implications
of these chemicals. For this purpose, we consider three resources namely, CTD [30],
Exposome-Explorer [24] and PubChem [86], which have compiled chemicals detected
across human tissues, based on exposure studies from published research articles. Since
we have chosen to focus on human tissues excluding biological fluids, comprehensive re-
sources such as the Blood Exposome Database [28] pertaining to a biological fluid were
not included in this chapter. The three resources [24, 30, 86] considered in this chapter,
however, do not provide a cohesive picture of chemical exposure-disease relationships,
specific to human tissues. In this chapter, we have explored exposure-disease relation-
ships of the tissue-specific external exposome using network biology [13,88] approaches.
The work reported in this chapter is contained in the published manuscript [41].
CTD has compiled a list of 1146 chemicals detected across non-biological and biological
specimens from exposure studies published in scientific literature [30]. In CTD,
the non-biological and biological specimens are together referred to as ‘Mediums’ [30].
Exposome-Explorer is a comprehensive resource that compiles ‘biomarkers’
of dietary and environmental exposures that are risk factors for disease [24]. Although
Exposome-Explorer compiles information on more than 1200 chemical biomarkers, we
only considered the subset of 450 dietary and environmental chemicals in Exposome-
Explorer with chemical structure information, after excluding entries that lack struc-
ture information or occur as chemical mixtures. PubChem, a comprehensive chemical
database developed by the National Center for Biotechnology Information (NCBI), Na-
tional Institutes of Health (NIH) of the United States, annotates information including
[Figure 8.1 graphic. Recoverable workflow labels: compilation of chemical exposure data (CTD, Exposome-Explorer, PubChem ‘Body burden’); filtration of human tissue-specific exposure data; removal of endogenous chemicals detected in human tissues; link to chemical regulations and exposomes.]
Figure 8.1: Detailed workflow describing the creation of Human Tissue-specific Exposome Atlas
(TExAs) and downstream analysis of the compiled list of 380 environmental chemicals detected
across 27 human tissues.
toxicological and exposure information for the chemicals compiled in the resource [86].
A list of 844 chemicals is available separately through PubChem Classification Browser
under the hierarchy ‘Body Burden’. These 844 chemicals have been annotated as chemi-
cals detected across environmental samples and biological specimens in published scien-
tific studies. To standardize the exposure and biospecimen data compiled from the three
resources, we have manually unified the information on mediums and biospecimens to
a standard vocabulary. Note that the above-mentioned three resources also give the ref-
erences to the published literature evidence associated with exposure and biospecimen
data. To build the human tissue-specific exposome atlas, we perform the following two
steps [41].
First, the list of 467 mediums compiled from the three resources, CTD,
Exposome-Explorer and PubChem, was manually filtered to 199 biological mediums.
For example, non-biological mediums such as air, water or other environmental samples
have been removed at this stage. Next, we have filtered 61 human biospecimens
from the 199 biological mediums, which include both biological fluids, such as blood
and sweat, and biological non-fluids, such as adipose tissue. Finally, we have
filtered 27 human tissues from the list of 61 human biospecimens to develop a human
tissue-specific chemical exposome resource (Figure 8.1). In this work, we do not con-
sider environmental chemicals detected in human biospecimens corresponding to biolog-
ical fluids like blood, urine and saliva, and therefore, we have not gathered information
from comprehensive resources such as the Blood Exposome Database [28].
We have considered the chemicals detected across all 61 human biospecimens from the
three resources, CTD, Exposome-Explorer and PubChem. A set of 1510 chemicals has
been detected across the 61 human biospecimens (including biological fluids and biological
non-fluids). Endogenous chemicals do not constitute the external environmental
exposures of a human being. We therefore manually filtered and considered only non-
endogenous chemicals for further analysis. We then mapped the filtered chemicals to stan-
dard chemical identifiers such as Chemical Abstract Service (CAS) and PubChem [86] to
compile a unified list of environmental chemicals. Note that chemical classes and mixtures
were also removed in this step. At this stage, we obtained 380 unique environmental
chemicals which have been detected across 27 human tissues (excluding biological fluids),
from our initial compilation of 1510 chemicals (Figure 8.1; Supplementary Table
S8.1). Among the 27 human tissues in our compiled dataset, the largest number of
environmental chemicals (240) was detected in adipose tissue, followed by 120 chemicals in
placenta. Figure 8.2B shows the number of environmental chemicals detected across the
27 human tissues in our compiled dataset.
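The filtering described above amounts to restricting detection records to human tissues, removing endogenous chemicals, and then counting the chemicals per tissue. The following minimal sketch illustrates these operations on hypothetical toy records. The chemical names, the medium annotations and the sets `human_tissues` and `endogenous` are assumptions for illustration, whereas in this work the corresponding annotations were assigned manually.

```python
from collections import Counter

# Hypothetical detection records: (chemical, medium) pairs after vocabulary unification.
toy_records = [
    ("chem_A", "adipose tissue"),
    ("chem_A", "blood"),
    ("chem_B", "adipose tissue"),
    ("chem_B", "placenta"),
    ("chem_C", "air"),
    ("chem_D", "liver"),
]

# Hypothetical manual annotations.
human_tissues = {"adipose tissue", "placenta", "liver"}  # biological non-fluids
endogenous = {"chem_D"}                                   # excluded from the external exposome

# Keep records in human tissues for non-endogenous chemicals only.
tissue_records = {
    (chem, medium)
    for chem, medium in toy_records
    if medium in human_tissues and chem not in endogenous
}

chemicals = {chem for chem, _ in tissue_records}
per_tissue = Counter(medium for _, medium in tissue_records)

print(len(chemicals), "environmental chemicals in", len(per_tissue), "tissues")
for tissue, count in per_tissue.most_common():
    print(tissue, count)
```

Records in fluids (blood) and non-biological mediums (air) are dropped by the tissue filter, so chem_C does not enter the toy atlas.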
[Figure 8.2 graphic. Recoverable axis labels: (B) ‘Human tissues’ versus ‘Number of chemicals’; (C) ‘Kingdom’ and ‘Superclass’; (D) ‘External exposome categories’ versus ‘Number of chemicals’, with bars for ‘Total chemicals in each exposome category’ and ‘Chemicals that are produced in high volume’; (E) bars for ‘Chemicals’, ‘Genes’ and ‘Diseases’.]
Figure 8.2: (A) The Venn diagram shows the presence of 380 environmental
chemicals compiled in TExAs across the three resources, namely, CTD, Exposome-Explorer, and
PubChem database. (B) The histogram shows the distribution of 380 environmental chemicals
detected across 27 human tissues. (C) The Sankey plot shows the chemical classification of 380
environmental chemicals into 2 kingdoms and 16 super-classes based on ClassyFire. The number
of chemicals in each classification is indicated within the parenthesis. (D) The bar plot shows
the distribution of 300 environmental chemicals present in at least one of the 55 chemical lists
(corresponding to chemical inventories, regulations, and guidelines), across 8 external exposome
categories. For each external exposome category, one bar represents the total number of chemicals
and the other represents the number of chemicals produced in high volume. (E) The grouped bar
plot gives the number of environmental chemicals, target genes and diseases associated with each
of the 9 human tissues.
gories
The presence or detection of chemicals of concern in biological specimens is proof of
human exposure [82], and thus, warrants further attention from the monitoring and regu-
latory perspectives to avoid future human exposure. We, therefore, sought to understand
the source and nature of the environmental chemicals in TExAs through a comparative
Figure 8.3: The web interface of TExAs. (A) Screenshot of the TExAs home
page. (B) The search page facilitates search for chemicals in two ways: Chemical search and
Physicochemical filter. In the Chemical search option, a chemical can be searched using the chem-
ical name or standard identifiers (CAS or PubChem). Using Physicochemical filter, the chemicals
can be searched using physicochemical properties such as molecular weight, LogP, TPSA, num-
ber of rotatable bonds, number of hydrogen bond donors, or acceptors. (C) On the browse page,
the chemical(s) can be obtained using either chemical name or based on their presence in 27
human tissues. (D) Screenshot showing the result page for each chemical compiled in TExAs.
From the result page, chemical information including the structural identifiers, tissue-specific ex-
posome, chemical-gene interaction, chemical-disease association, presence in chemical regulation
or guideline, and presence of chemical in high production volume (HPV) lists can be obtained for
each chemical.
analysis with 55 publicly available chemical inventories, regulations, and guidelines (Sup-
plementary Table S8.2). Based on the nature of human exposure, these 55 chemical lists
were classified into 8 external exposome categories, namely ‘Children’s exposome’,
‘Dietary exposome’, ‘External environmental exposome’, ‘Indoor exposome’, ‘Occupational
exposome’, ‘Pesticide/biocide exposome’, ‘Skin exposome’ and ‘Miscellaneous external
exposome’ (Supplementary Table S8.2). We find that 300 out of the 380 environmental
chemicals in TExAs were also part of at least one of 55 chemical lists corresponding to
chemical inventories, regulations, and guidelines (Supplementary Table S8.3). Further,
based on the classification of these 55 chemical lists into various categories of the external
exposome, we find that the largest number of environmental chemicals in TExAs belong to
‘Dietary exposome’ (192 chemicals), followed by ‘External environmental exposome’ (189
chemicals) (Figure 8.2D; Supplementary Table S8.3). The smallest number of environmental
chemicals in TExAs belong to ‘Occupational exposome’ (4 chemicals), which may be
due to data being limited to only one chemical regulatory list within this category (Figure
8.2D) [41].
Further, to understand the scale at which humans are exposed to these chemicals, we
have also compared them against chemicals produced in high volume as compiled in the
Organisation for Economic Co-operation and Development High Production Volume (OECD
HPV) list, which was last updated in 2004, and the United States High Production Volume
(USHPV) database. We find that 109 of the 300 environmental chemicals detected in human
tissues and present in at least one of the 55 chemical lists are also produced in high
volume as per the OECD HPV list and USHPV database. Figure 8.2D shows the dis-
tribution of these 300 environmental chemicals across 8 exposome categories along with
the HPV chemicals in each exposome category. The above-mentioned 109 environmen-
tal chemicals produced in high volume have been detected in at least one of 27 human
tissues [41].
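The comparison described above reduces to set intersections between the chemicals in the atlas, the chemical lists grouped by exposome category, and the high production volume lists. The sketch below is an illustration with hypothetical toy data: the chemical names and list contents are invented, and only the category names are taken from the text.

```python
# Hypothetical chemical lists grouped by external exposome category.
toy_lists = {
    "Dietary exposome": [{"chem_A", "chem_B"}, {"chem_B", "chem_X"}],
    "Skin exposome": [{"chem_A"}],
    "Occupational exposome": [{"chem_Y"}],
}
toy_atlas = {"chem_A", "chem_B", "chem_C"}  # chemicals detected in human tissues
toy_hpv = {"chem_B", "chem_Y"}              # union of high production volume lists

# For each category: (chemicals from the atlas in the category, of which HPV).
per_category = {}
for category, lists in toy_lists.items():
    listed = set().union(*lists) & toy_atlas
    per_category[category] = (len(listed), len(listed & toy_hpv))

# Atlas chemicals present in at least one list, and the HPV subset.
all_listed = set().union(*(s for lists in toy_lists.values() for s in lists))
in_any_list = all_listed & toy_atlas
hpv_in_any_list = in_any_list & toy_hpv

for category, (total, hpv) in per_category.items():
    print(f"{category}: {total} chemicals, {hpv} produced in high volume")
print(len(in_any_list), "atlas chemicals in at least one list;", len(hpv_in_any_list), "are HPV")
```

Because a chemical can appear in lists from several categories, the per-category counts need not sum to the number of chemicals present in at least one list.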
The high production volume of these chemicals also indicates their potential to cause
severe or widespread exposure. We, therefore, sought to understand their hazard potential
by comparing them with the substances of very high concern (SVHC) list under Registra-
tion, Evaluation, Authorisation and Restriction of Chemicals (REACH) regulation of the
European Union (EU) [157]. The chemicals in SVHC have been identified as bioaccumu-
lative, carcinogenic, mutagenic, or linked to serious health effects. Table 8.1 gives the list
of 13 potentially hazardous chemicals in TExAs that have also been included in the SVHC
list along with the information about the human tissues in which they have been detected.
The table also provides the criteria for their inclusion under the SVHC candidate list.
These 13 potentially hazardous chemicals fall into 7 external exposome categories namely,
‘Children’s exposome’, ‘Dietary exposome’, ‘External environmental exposome’, ‘Indoor
exposome’, ‘Skin exposome’, ‘Pesticide/biocide exposome’ and ‘Miscellaneous external
exposome’ (Table 8.1). Of these 13 chemicals listed under SVHC, 3 are carcinogens, 4
are endocrine disruptors and 5 are known to cause reproductive toxicity (Table 8.1). Notably,
these 13 chemicals have been detected across 13 out of 27 human tissues in TExAs,
including the brain, breast, kidney, liver, lung, pancreas and placenta. These findings
highlight the various possible routes of human exposure, potential health concerns, and
the implications for global monitoring and regulation of these 13 hazardous chemicals in
the future.
8.4 Linking diseases to the tissue-specific external exposome
Previous studies have suggested linkages between exposures, genes and gene expression,
and disease origins [368]. Earlier studies have also shown tissue specificity in the ex-
pression and interaction of genes, corresponding to the tissue-specific manifestation of
diseases [119]. Network biology [88] approaches can help in identifying mechanistic
links between the chemical spaces and their biological outcomes upon exposure [13].
Such analysis may also shed light on the tissue-specificity of the targets of the chemicals,
which can further help in the risk assessment of potentially hazardous chemicals. Thus,
we construct a tripartite chemical-gene-disease network (considering only human tissue-
specific genes) to understand the effect of these environmental chemicals detected across
27 human tissues (Figure 8.1). We do so through the following steps.
data on whether a tested chemical is active or inactive for a particular assay component
endpoint, corresponding to specific target genes. If a tested chemical is active for a par-
ticular assay component endpoint, then the corresponding tissue-specific target gene is
assigned to the tested chemical. In total, the ToxCast invitroDB3 dataset [215] compiles
information based on various assays for 6623 tested chemicals that can target 138 genes
present across 13 human tissues. Importantly, we were able to map 9 out of the 27 human
tissues in TExAs to their equivalent tissue names among the 13 human tissues in ToxCast.
For subsequent analysis, we have considered the chemicals
in TExAs for which target gene information, across these 9 human tissues, is available in
ToxCast. The chemical-gene interaction network built as a result of this analysis shows
that 158 chemicals from TExAs interact with 121 gene targets, corresponding to 9 hu-
man tissues. Among these 9 tissues, only kidney, liver, lung, skin and vascular tissues
have chemical-gene interaction information for 10 or more targets (Supplementary Table
S8.4) [41].
To construct the tissue-specific gene-disease association network, we have used the cu-
rated gene-disease associations dataset in DisGeNET [370], which was compiled from
PsyGeNET [371], UniProt [372], OrphaNet [373], CGI [374], CTD (human data) [30],
ClinVar [375], and the Genomics England PanelApp [376]. DisGeNET also provides several
scores that can be used to rank the compiled associations, such as the gene-disease
association (GDA) score, the Disease Specificity Index (DSI), and the Evidence Index
(EI), each of which ranges from 0 to 1 [370]. In our study, we first filtered high-confidence
gene-disease associations from DisGeNET using a GDA score cut-off of > 0.5. Note that the
GDA score considers the level of curation, data source, test organisms and the number
of associated publications [370]. Next, we filtered the resulting data using the EI cut-off
of > 0.5, which implies that at least 50% of the publications supporting the gene-disease
associations are validated. Lastly, we chose only the gene-disease associations in which
disease types are classified as ‘disease’. After applying the above-mentioned filters in
DisGeNET, we have retrieved the list of gene-disease associations for the target genes
compiled in the previous step.
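The three filters described above can be sketched as a single pass over the association records. This is a minimal sketch under stated assumptions: the column names (`gda_score`, `ei`, `disease_type`), the placeholder rows and the helper `filter_associations` are hypothetical and are not the DisGeNET file format.

```python
# Hypothetical gene-disease association rows; column names are assumptions.
associations = [
    {"gene": "GENE_A", "disease": "Disease X", "gda_score": 0.7, "ei": 0.9,
     "disease_type": "disease"},
    {"gene": "GENE_A", "disease": "Disease Y", "gda_score": 0.4, "ei": 1.0,
     "disease_type": "disease"},
    {"gene": "GENE_B", "disease": "Phenotype Z", "gda_score": 0.8, "ei": 0.8,
     "disease_type": "phenotype"},
    {"gene": "GENE_C", "disease": "Disease X", "gda_score": 0.9, "ei": 0.5,
     "disease_type": "disease"},
]


def filter_associations(rows, target_genes, gda_cutoff=0.5, ei_cutoff=0.5):
    """Apply the GDA, EI and disease-type filters, then restrict to target genes."""
    kept = []
    for row in rows:
        if row["gda_score"] <= gda_cutoff:      # keep GDA score > cut-off
            continue
        if row["ei"] <= ei_cutoff:              # keep EI > cut-off
            continue
        if row["disease_type"] != "disease":    # keep disease type 'disease' only
            continue
        if row["gene"] not in target_genes:     # keep genes from the previous step
            continue
        kept.append(row)
    return kept


filtered = filter_associations(associations, {"GENE_A", "GENE_B", "GENE_C"})
```

Both cut-offs are strict inequalities here, matching the "> 0.5" wording above, so the row with an EI of exactly 0.5 is excluded.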
The liver is the human tissue with the largest number of linkages, consisting of 110 en-
vironmental chemicals targeting 35 genes which are associated with 134 diseases. Among
these chemicals, Tetrabromobisphenol A is predicted to be associated with the maximum
number (107) of diseases (Figure 8.4A; Supplementary Table S8.5). An inspection of
the external exposome categories of these 110 environmental chemicals shows that a ma-
jority of them (81 chemicals) fall under the ‘External environmental exposome’ category
(Supplementary Table S8.3). The ‘External environmental exposome’ category consists
of 9 chemical lists including substances which are labelled hazardous, regulated, or
restricted for human exposure, and present as water or environmental contaminants. This
result highlights the burden that human environmental exposures place on the liver.
We further discuss the health implications of this chemical burden on
the liver [41].
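The chemical-disease linkages reported above follow from chaining chemical-gene edges with gene-disease associations. A minimal sketch of that composition is given below; the edge sets use placeholder identifiers and the helper `link_chemicals_to_diseases` is hypothetical, so the counts are purely illustrative.

```python
from collections import defaultdict

# Hypothetical edges of a tripartite chemical-gene-disease network.
chemical_gene = {("chem_1", "GENE_A"), ("chem_1", "GENE_B"), ("chem_2", "GENE_A")}
gene_disease = {("GENE_A", "Disease X"), ("GENE_B", "Disease X"),
                ("GENE_B", "Disease Y")}


def link_chemicals_to_diseases(chemical_gene, gene_disease):
    """Associate each chemical with every disease linked to any of its target genes."""
    diseases_of_gene = defaultdict(set)
    for gene, disease in gene_disease:
        diseases_of_gene[gene].add(disease)

    diseases_of_chemical = defaultdict(set)
    for chemical, gene in chemical_gene:
        diseases_of_chemical[chemical] |= diseases_of_gene.get(gene, set())
    return dict(diseases_of_chemical)


chemical_disease = link_chemicals_to_diseases(chemical_gene, gene_disease)
# Number of associated diseases per chemical, as used to rank chemicals.
disease_counts = {chem: len(ds) for chem, ds in chemical_disease.items()}
```

Because diseases are stored as sets, a disease reached through two different target genes of the same chemical is counted only once.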
Among the 134 diseases linked to the liver via chemical exposure, obesity and di-
abetic nephropathy are found to be associated with the maximum number (84) of the
environmental chemicals detected in the liver (Figure 8.4A; Supplementary Table S8.5).
Due to the shared chemical linkages amongst the diseases associated with the liver, we
sought to understand possible connections and co-occurrences among them. We con-
struct a liver-specific disease-disease network based on these shared chemicals. Analysis
of such disease-disease networks could also give insights on commonalities in the biolog-
ical mechanisms of diseases associated with shared chemicals. To get the most significant
disease associations, we have computed the overlap score for each pair of diseases. The
overlap score is the ratio of the number of chemicals shared between two diseases and
the total number of chemicals detected in the tissue. Thus, the strength of the association
between a pair of diseases is proportional to the overlap score, which ranges from 0 to 1.
Here, we have used an overlap score ≥ 0.5 as the cut-off, to retrieve the most significant
disease associations based on the shared chemicals.
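The overlap score defined above can be computed for every pair of diseases as sketched below. The function name `disease_disease_edges`, the input mapping and the placeholder chemicals are hypothetical; only the definition of the score and the cut-off of ≥ 0.5 are taken from the text.

```python
from itertools import combinations


def disease_disease_edges(chemicals_of_disease, n_tissue_chemicals, cutoff=0.5):
    """Return disease pairs whose overlap score meets the cut-off.

    Overlap score = (number of chemicals shared by the two diseases)
                    / (total number of chemicals detected in the tissue).
    """
    edges = {}
    for d1, d2 in combinations(sorted(chemicals_of_disease), 2):
        shared = chemicals_of_disease[d1] & chemicals_of_disease[d2]
        score = len(shared) / n_tissue_chemicals
        if score >= cutoff:
            edges[(d1, d2)] = score
    return edges


# Toy tissue with 4 detected chemicals (placeholder identifiers).
chemicals_of_disease = {
    "Disease X": {"chem_1", "chem_2", "chem_3"},
    "Disease Y": {"chem_1", "chem_2"},
    "Disease Z": {"chem_4"},
}
edges = disease_disease_edges(chemicals_of_disease, n_tissue_chemicals=4)
```

Since the denominator is the total number of chemicals detected in the tissue rather than the union of the two chemical sets, the score is not a Jaccard index: two diseases with identical but small chemical sets still receive a low score.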
Figure 8.4: (A) The bipartite network of 110 chemicals detected in the liver and
134 associated diseases. In this network, the chemical nodes are colored in ‘red’ while the disease
nodes are colored in ‘grey’. The accompanying table, reproduced below, gives the list of chemicals
detected in liver with more than 70 disease associations and diseases associated with more than 60
chemicals detected in liver. (B) Liver-specific disease-disease network built using the most
significant disease-disease associations with an overlap score of ≥ 0.5. The overlap score is the
ratio of the number of chemicals shared between any two diseases and the total number of chemicals
detected in the tissue.

Number of diseases  Chemicals with >70 associated diseases
107  Tetrabromobisphenol A
105  Pentachlorophenol
105  Perfluoroundecanoic acid
103  Tris(1,3-dichloro-2-propyl)phosphate
91  Hexachlorophene
91  p,p'-DDD
89  2,4,6-Tribromophenol
88  Aspon-chlordane
87  PFDA
79  Linolenic acid
77  Bisphenol A
76  Heptachlor
76  PFOA
75  o,p'-DDT
75  Triphenyl phosphate
73  Heptachlor epoxide
71  Triclosan

Number of chemicals  Diseases associated with >60 chemicals
84  Diabetic Nephropathy
84  Obesity
74  Pulmonary Fibrosis
69  Liver Cirrhosis
69  Non-Small Cell Lung Carcinoma
67  Breast Carcinoma
67  Diabetes Mellitus, Non-Insulin-Dependent
67  Malignant Neoplasm Of Breast
67  Malignant Neoplasm Of Ovary
67  Ovarian Neoplasm
65  Malignant Neoplasm Of Liver
63  Endometrial Carcinoma
63  Estrogen Resistance
In summary, we present TExAs [41] that compiles a list of 380 environmental chem-
icals detected across 27 human tissues in published literature compiled in three existing
resources. TExAs provides detailed information regarding the structures, chemical clas-
sification, and exposome categories for these 380 environmental chemicals. For the envi-
ronmental chemicals in TExAs, we show the application of network biology approaches
to explore chemical exposure-disease relationships in understanding the health burden of
chemicals and the possibilities of disease comorbidities.
pressed within the particular tissue [119]. Although the Human Protein Atlas (HPA) gives
comprehensive information on the expression profiles of human genes in more than 50
tissue types [369], this presents only one side of the story, as it is not linked to
any chemical exposures.
This study is the first step towards the integration of data surrounding chemicals de-
tected across human tissues into a single resource, which will help future exposome re-
search. Systematic expansion of tissue-specific exposure data along with the integration
of large-scale gene expression data will enable a better understanding of tissue-specific
chemical-disease relationships and the impact of chemical combinations in tissues. From
the perspective of chemical regulations, this expansion in data could guide the priori-
tization and regulation of environmental chemicals in the future. From the perspective
of future research, several parallels and contrasts could be identified in chemical-disease
associations when a chemical is present in different tissues. We believe the continued
expansion, compilation, and standardization of exposure data, gene expression data, and
gene-disease linkages are essential to understand the full impact of the external exposome
on human health.
8.5 Discussion
We wish to note that our focus in this study has been to meaningfully integrate and ex-
plore the available data surrounding environmental chemicals and their tissue-specific
disease associations, rather than to expand on the isolated compilation of environmen-
tal chemicals [41]. We obtain two important insights via our network-centric analy-
sis. The first is the significant effect that environmental exposures can have on hu-
man health. The second is the interconnections and possible co-occurrence of diseases,
specific to tissues. Such linkages between diseases have also been discussed in other
studies [384]. This work could serve as a template for the development of similar net-
work biology approaches to understand other exposure-disease relationships, character-
ize the effect of chemicals, and study exposome-related comorbidities [13]. The data
integrations that led to these findings have been made available through a web interface
(https://cb.imsc.res.in/texas) for use by the scientific community and the public
alike.
Supplementary Information
Supplementary Tables S8.1-S8.6 associated with this chapter are available for download
from the GitHub repository:
https://github.com/asamallab/PhDThesis-Janani_R/blob/main/SI/ST_Chapter8.xlsx.
Chemical name | Presence in USHPV | Presence in OECD HPV | Presence in SVHC | SVHC Criteria
Decabromodiphenyl oxide | Yes | Yes | Yes | PBT (Article 57d); vPvB (Article 57e)
Bis(2-ethylhexyl)phthalate | Yes | Yes | Yes | Toxic for reproduction (Article 57c); Endocrine disrupting properties (Article 57(f) - environment); Endocrine disrupting properties (Article 57(f) - human health)
Anthracene | Yes | Yes | Yes | PBT (Article 57d)
Dechlorane plus | Yes | Yes | Yes | vPvB (Article 57e)
Octamethylcyclotetrasiloxane | Yes | Yes | Yes | PBT (Article 57d); vPvB (Article 57e)
Lead | Yes | Yes | Yes | Toxic for reproduction (Article 57c)
Cadmium | Yes | Yes | Yes | Carcinogenic (Article 57a); Specific target organ toxicity after repeated exposure (Article 57(f) - human health)
Arsenic acid | Yes | Yes | Yes | Carcinogenic (Article 57a)
Trichloroethylene | Yes | Yes | Yes | Carcinogenic (Article 57a)
Bisphenol A | Yes | Yes | Yes | Toxic for reproduction (Article 57c); Endocrine disrupting properties (Article 57(f) - environment); Endocrine disrupting properties (Article 57(f) - human health)
Musk xylene | Yes | Yes | Yes | vPvB (Article 57e)
Dibutyl phthalate | Yes | Yes | Yes | Toxic for reproduction (Article 57c); Endocrine disrupting properties (Article 57(f) - human health)
Benzyl butyl phthalate | Yes | Yes | Yes | Toxic for reproduction (Article 57c); Endocrine disrupting properties (Article 57(f) - human health)
Table 8.1: List of 13 chemicals detected in human tissues that are listed as high production
volume chemicals in both the OECD HPV list and the USHPV database, and are also listed as
‘substance of very high concern (SVHC)’ by the European Chemicals Agency (ECHA).
Chapter 9
9.1 Summary
Figure 9.1: Summary of the research on compilation, curation and exploration of diverse groups
of environmental chemicals reported in this thesis. The panels are labelled: compilation and
curation of diverse groups of environmental chemicals (manual curation based on experimental
evidence from published literature; creation of curated knowledgebases for five groups of
environmental chemicals); characterization of environmental chemical spaces; exposure-disease
associations; and regulatory assessment of compiled environmental chemicals (comparison with
chemical lists that are a part of regulations, guidelines or inventories; prioritizing the
chemicals of concern that are a part of everyday exposures).

EDCs are chemicals of emerging concern that have the potential to cause hormonal imbalance
by interfering with the normal functioning of the endocrine system [3, 4, 43]. In Chapter 2,
we developed a detailed workflow (Figure 2.1) to identify potential EDCs from
published research articles containing supporting experimental evidence for endocrine-
specific perturbations in humans or rodents. In the initial stage of the workflow, we
used extensive PubMed [158] literature mining and three existing resources, the WHO
report, TEDX and EDCs Databank, to compile more than 16000 published research ar-
ticles which are likely to contain information on EDCs. Subsequently, we process these
articles using our workflow to manually compile 686 potential EDCs from 1796 published
research articles containing supporting experimental evidence for endocrine-specific per-
turbations in humans or rodents. Of these 686 potential EDCs and 1796 research articles,
198 EDCs (28.9%) and 1294 articles (72.0%) are not captured in any of the three existing
resources integrated in our workflow. A unique feature of our work is the compilation
of the list of observed adverse effects or endocrine-specific perturbations from supporting
published experiments for the 686 EDCs, and these observed effects were manually cu-
rated, unified and standardized into a list of 514 endocrine-mediated endpoints spanning
7 systems-level perturbations. Another unique feature of our work is the compilation and
standardization of the dosage information at which endocrine-mediated effects were ob-
served upon individual EDC exposure in published experiments. Moreover, the 686 EDCs
were classified based on the type of supporting evidence in published experiments, their
environmental source and their chemical classification. Lastly, we have also compiled
additional detailed information for each EDC such as its two-dimensional (2D) and three-
dimensional (3D) structure, physicochemical properties, molecular descriptors, predicted
ADMET properties and experimentally inferred target genes. In order to widely share the
compiled information on 686 potential EDCs and enable basic research towards the eluci-
dation of systems-level perturbations caused by them, we have also created a webserver,
DEDuCT 1.0, which is accessible at: https://cb.imsc.res.in/deduct/.
We employed network biology approaches [88, 385, 386] to gain a better understand-
ing of the link between the underlying chemical space of EDCs and biological space of
target genes or perturbed pathways [387, 388]. Specifically, we have constructed two
networks of EDCs using our resource based on the similarity of chemical structures or
target genes. Based on the chemical similarity network, we find that EDCs are diverse
in their chemical structure and each module in the similarity network corresponds to dis-
tinct chemical features. Upon investigation of the target similarity network, we find that
EDCs can have very different sets of target genes. Subsequent analysis revealed a lack
of correlation between chemical structure and target genes of EDCs. These results high-
light potential challenges in developing predictive models for the identification of EDCs.
DEDuCT is a large-scale resource on potential EDCs compiling supporting evidence of
endocrine-mediated perturbations and dosage information from published experiments in
humans or rodents, and the compiled information will contribute to the future research in
the field of computational systems toxicology.
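The construction of a target similarity network can be illustrated with a short sketch. The similarity measure and threshold are not specified in this summary, so the Jaccard index, the threshold value, the placeholder identifiers and the function names below are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets (assumed similarity measure, not from the text)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_network(targets_of_chemical, threshold):
    """Connect two chemicals when their target gene sets are sufficiently similar."""
    edges = []
    for c1, c2 in combinations(sorted(targets_of_chemical), 2):
        score = jaccard(targets_of_chemical[c1], targets_of_chemical[c2])
        if score >= threshold:
            edges.append((c1, c2, score))
    return edges


# Hypothetical target gene sets for three chemicals.
targets_of_chemical = {
    "chem_1": {"GENE_A", "GENE_B", "GENE_C"},
    "chem_2": {"GENE_A", "GENE_B"},
    "chem_3": {"GENE_D"},
}
target_edges = similarity_network(targets_of_chemical, threshold=0.5)
```

The same pairwise scheme would apply to a chemical similarity network if the gene sets were replaced by structural fingerprints, although the fingerprint and similarity coefficient would again be choices to state explicitly.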
DEDuCT 2.0: An updated knowledgebase and an exploration of the current regula-
tions and guidelines from the perspective of endocrine disrupting chemicals
We next explored how knowledge on EDCs captured through academic research can help
in risk and regulatory assessment of EDCs. This analysis was carried out in three steps,
as described in Chapter 3. Firstly, we have analyzed the increase in research efforts
and knowledge on EDCs in past decades, and have captured newly available information
into our unique resource DEDuCT 2.0 (Figure 3.1). Thus, the updated knowledgebase,
DEDuCT 2.0, compiles 792 potential EDCs along with 609 unique endocrine-mediated
endpoints, spanning 7 systems-level perturbations. Secondly, we analyzed the distribu-
tions of 1856 potential EDCs compiled in DEDuCT 2.0 or three other resources, namely,
WHO report, TEDX and EDCs Databank, across 36 chemical lists which are part of
inventories, guidelines and regulations. Notably, we found several potential EDCs are
distributed across diverse chemical lists, and further, some of these chemical lists with
potential EDCs are in day-to-day product categories such as ‘Food additives and Food
contact materials’ and ‘Cosmetics and household products’. Moreover, we classified the
chemicals in SIU and SOC lists into groups I, II and III containing 23483, 1139 and 3223
chemicals, respectively, of which 242, 356 and 278, respectively, are potential EDCs.
Lastly, a comparison of the 242 group I EDCs with HPV chemicals identified 63 group I EDCs
in use that are also produced in high volume. Given the scale of exposure and the related
hazard potential, an evaluation of these EDCs produced in large quantities is warranted, and
developing adequate risk assessment criteria will aid in such efforts. We also described
an example to demonstrate how the compiled information in curated knowledgebases like
DEDuCT 2.0 can aid in the risk assessment of EDCs.
In sum, this chapter emphasizes the importance of bridging the gap between academic
and regulatory aspects of chemical safety, as a step towards the better management of
environmental and health hazards such as EDCs. As ongoing scientific research will lead to
new discoveries and a deeper understanding of the effects of chemical exposure, it will be
important to regularly monitor the substances permitted for use under various regulations,
and substances generally found in use in products, through the same lens of scientific risk
assessment, in order to restrict emerging substances of concern as early as possible. Inventories
and independent guidelines of hazardous or toxic substances also need to be evaluated
and brought under effective regulation. Information with a scientific basis is necessary
to standardize criteria for this evaluation and risk assessment, especially in the case of a
complex chemical class such as the EDCs.
degree, out-degree, betweenness centrality and eccentricity. These analyses lead to the
identification of important events including points of convergence or divergence in the
ED-AOP network. In particular, we focused on one of the LCCs of the ED-AOP network
to better understand the series of biological events that lead to systems-level perturbations
upon endocrine disruption. An in-depth analysis of the largest component in the ED-AOP
network sheds light on the systems-level perturbations caused by endocrine disruption,
emergent paths, and stressor-event associations. In sum, the derived ED-AOP network
can be used to address the current knowledge gaps in the existing regulatory framework
and aid in better risk assessment of environmental chemicals.
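As an illustration of how degree-based measures can flag points of convergence or divergence in a directed network, consider the sketch below. The edge list, the event labels and the degree threshold of 2 are hypothetical and are not taken from the ED-AOP network itself.

```python
from collections import Counter

# Hypothetical directed edges between events (upstream event, downstream event).
event_edges = [("KE1", "KE3"), ("KE2", "KE3"), ("KE3", "KE4"), ("KE3", "KE5")]


def degree_profile(edges):
    """Return {node: (in_degree, out_degree)} for a directed edge list."""
    in_deg, out_deg = Counter(), Counter()
    for source, target in edges:
        out_deg[source] += 1
        in_deg[target] += 1
    nodes = set(in_deg) | set(out_deg)
    return {node: (in_deg[node], out_deg[node]) for node in nodes}


profile = degree_profile(event_edges)
# Assumed rule of thumb: several incoming edges mark convergence,
# several outgoing edges mark divergence.
convergent = sorted(n for n, (i, _) in profile.items() if i >= 2)
divergent = sorted(n for n, (_, o) in profile.items() if o >= 2)
```

Betweenness centrality and eccentricity require shortest-path computations and are left out of this sketch.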
tion, environmental sources, physicochemical properties, predicted ADMET properties,
molecular descriptors and target human genes. The entire information compiled in Neu-
rotoxKb 1.0, on the 475 potential neurotoxicants specific to mammals, is accessible at:
https://cb.imsc.res.in/neurotoxkb.
Human milk is a significant biospecimen in the study of the mother exposome and a
vital factor in a newborn’s exposome. In this direction, we created Exposome of Human
Milk across India (ExHuMId) version 1.0, an India-specific repository containing 101
human milk contaminants detected in milk samples from 13 Indian states, compiled from
36 published experimental studies. The detailed steps involved in this compilation of
human milk contaminants are presented in Chapter 6. ExHuMId also compiles the detected
concentrations of the contaminants, structural and physicochemical properties, and factors
associated with the donor of the sample. In this chapter, we also considered human milk
contaminants studied by Lehmann et al. [286] that are specific to the USA (referred to as
‘ExHuMUS’), and the human milk contaminants compiled in Exposome-Explorer [24]
that are not specific to any geography (referred to as ‘ExHuM Explorer’).
We analyzed the human milk contaminants compiled in ExHuMId and two other
resources from three perspectives. We first compared ExHuMId with the well-known
chemical lists representing regulations and guidelines, to identify potential EDCs, car-
cinogens, neurotoxins or other hazardous chemicals. Of 101 human milk contaminants
in ExHuMId, 43, 23 and 14 were found to be potential EDCs, carcinogens, and
neurotoxicants, respectively. Similar analyses were performed on the human milk contaminants
compiled in ExHuMUS and ExHuM Explorer [62], and several chemicals of concern
produced in high volume were identified.
The second perspective of our analysis enables a better understanding of the structural
features and properties which influence the transfer of environmental contaminants into
human milk, and thus provides a way to predict the risk of a contaminant entering human
milk. Due to the lack of experimental data on M/P ratios of human milk contaminants in
ExHuMId, we considered the dataset reported by Vasios et al. [72] and performed a com-
parison of the physicochemical properties that have been widely reported to influence the
transfer of contaminants or drugs into human milk. Through our analysis we observed
that the distributions of physicochemical properties of contaminants in ExHuMId, Ex-
HuMUS and ExHuM Explorer are close to the distributions of physicochemical properties
of chemicals reported as highly likely to transfer to human milk in Vasios et al. [72].
The third aspect of our analysis predicts the effect of the human milk contaminants
on the lactation pathway and the cytokine signalling and production pathway, using a systems
biology approach. Based on the interaction data obtained from ToxCast and CTD, we
inferred that many of the human milk contaminants compiled in the above-mentioned 3
datasets can interact with genes associated with prolactin signalling, oxytocin signalling,
lactose synthesis, cytokine signalling and xenobiotic transport. These observations need
to be critically validated using experimental approaches, which should encompass various
disciplines, to understand the influence of environmental contaminants on maternal and
infant health [302]. In sum, from our systematic compilation and analysis of human
milk contaminants, we observed there is a need for better chemical regulation and policy
decisions to avoid these contaminants in human milk in India and globally.
Fragrance chemicals are either natural or synthetic compounds, and exposure to such
chemicals can lead to asthma, contact dermatitis (irritant or allergic), dyschromia,
photosensitivity, and migraine headaches [73–76, 78]. In Chapter 7, we present the
repository of Fragrance Chemicals in Children’s Products (FCCP) that compiles 153 fragrance
chemicals from 21 published experimental studies. The fragrance chemicals in FCCP are
classified based on their chemical structure, children’s product source, chemical origin,
and odor profile. Firstly, ClassyFire based classification revealed that all the compiled
fragrance chemicals were ‘Organic compounds’. Secondly, we find that 85 fragrance
chemicals have their children’s product source as ‘Toys’ based on the compiled infor-
mation on children’s product source for the fragrance chemicals. Thirdly, classification
based on environmental source showed that 97 fragrance chemicals in FCCP are natural
compounds. Fourthly, the odor profiling showed that ‘Aromatic’ odor is prevalent among
the compiled fragrance chemicals in FCCP.
Since the fragrance chemicals in children’s products are known to be poorly regu-
lated, we sought to explore the current regulatory status of these chemicals and the poten-
tial health effects in children upon exposure. We analyzed the presence of the compiled
fragrance chemicals in different chemical lists that are a part of regulations and guidelines
including the ones that are specific to children. We find that several fragrance chemicals
in FCCP are either banned allergenic chemicals, or are prohibited or restricted in cos-
metics and perfumes, based on a comparison with 21 chemical lists representing current
guidelines or regulations. Specifically, the analysis revealed that 17, 15, 8, and 21 fra-
grance chemicals in FCCP are also carcinogens, endocrine disruptors, neurotoxicants and
phytotoxins, respectively.
Further, we analyzed the structural diversity of the space of compiled fragrance chem-
icals and banned allergenic fragrance chemicals in EU Toy Safety Directive [145]. This
similarity network-based analysis of the fragrance chemicals in FCCP revealed the di-
versity of the associated chemical space. We then identified the potential skin sensitizers
among the compiled fragrance chemicals in children’s products by leveraging ToxCast
assays. The compiled information in FCCP can aid scientists, stakeholders and regulatory
agencies in risk assessment and in developing safer products for children. FCCP is accessible
at: https://cb.imsc.res.in/fccp/.
The presence of chemicals in human tissues suggests long-term exposure and bioaccu-
mulation of environmental contaminants [85]. In Chapter 8, we describe the steps in-
volved in the compilation of environmental chemicals detected across human tissues. In
this chapter, we explored the patterns in the associations between tissue-specific chemi-
cal exposures and human diseases using network biology approaches. For this purpose,
we compile, filter and unify environmental chemicals that are detected across human tis-
sues using information in CTD [30], Exposome-Explorer [24], and PubChem [86]. This
resulted in the compilation of 380 environmental chemicals detected across 27 human
tissues. In our compiled dataset, adipose tissue has the largest number of detected
environmental chemicals (240), followed by the placenta (120).
We also find that 300 out of the 380 environmental chemicals are present in at least
one of 55 chemical lists that are part of global chemical regulations, guidelines, or inven-
tories. Interestingly, we find that 109 of the 300 chemicals that are present in at least one
of the 55 chemical lists, are also produced in high volume. Based on the classification
of these 55 chemical lists into various external exposome categories, we find that 192
environmental chemicals belong to the ‘Dietary exposome’, followed by 189 chemicals
that belong to the ‘External environmental exposome’. Further, we propose a priority list
of 13 potentially hazardous chemicals based on a comparative analysis of the compiled
chemicals with SVHC REACH regulation [157] and high production volume chemicals.
This analysis helps in understanding the environmental sources and routes of human ex-
posure to environmental chemicals detected in human tissues, as well as the current status
of their monitoring and regulation.
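The comparative analysis behind such a priority list amounts to intersecting the compiled chemicals with each reference list. A minimal sketch is given below; the helper `priority_list` and the placeholder identifiers are hypothetical, and in practice the chemicals would first need to be matched on a common identifier.

```python
def priority_list(tissue_chemicals, *reference_lists):
    """Return chemicals present in the compiled set and in every reference list."""
    result = set(tissue_chemicals)
    for reference in reference_lists:
        result &= set(reference)
    return sorted(result)


# Placeholder identifiers standing in for the compiled and reference lists.
tissue_chemicals = {"chem_1", "chem_2", "chem_3", "chem_4"}
ushpv = {"chem_1", "chem_2", "chem_3"}
oecd_hpv = {"chem_1", "chem_3", "chem_5"}
svhc = {"chem_1", "chem_3", "chem_6"}

priority = priority_list(tissue_chemicals, ushpv, oecd_hpv, svhc)
```

Requiring membership in every list is the strictest choice; relaxing it to any one high production volume list would give a longer, less conservative list.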
Subsequently, the compiled environmental chemicals have been linked to their potential
gene targets using ToxCast assays, and to the associated diseases using DisGeNET [370].
This information was used to construct a tissue-specific chemical-gene-disease network.
Specifically, we considered the burden that human environmental exposures place on the
liver. An analysis of the liver-specific disease network reveals the possibilities of
disease comorbidities and demonstrates the application of network biology in unravelling
complex exposure-disease associations. The entire information is compiled in the Human
Tissue-specific Exposome Atlas (TExAs), which is accessible at
https://cb.imsc.res.in/texas.
The human exposome is a promising area of scientific research which aims to
address human health issues caused by environmental exposures [389]. Ongoing research
in exposome and toxicology is generating a large quantity of experimental data related to
various environmental chemical exposures [42]. It is critical to mine and curate existing
toxicological data in order to reveal significant and meaningful associations between en-
vironmental exposures and health impacts. In this direction, we present highly curated
resources on diverse groups of environmental chemicals in this thesis. These knowledgebases
will serve as one-stop resources for obtaining toxicological information and can aid
in fundamental research on different groups of environmental chemicals. Specifically, in
recent times, there has been a lot of interest in developing data-driven predictive models
to identify toxicological effects upon exposure to certain chemicals [390–392]. Such models can be
built using high-quality toxicological information compiled for a specific group of chem-
icals in the knowledgebases presented in this thesis. In future, the observed health effects
and/or structural information compiled for different environmental chemicals in our re-
sources can serve as a positive dataset for structure-activity relationship (SAR) studies,
which rely on the quality of chemical and toxicological data in both training and testing
datasets [390]. Further, chemical similarity networks or CSNs enable the visualization
and characterization of the diverse biologically-relevant environmental chemical spaces,
and can aid in analyzing the structural relationship between compounds having the same or
different biological activity.
The ever-increasing rate at which new chemicals are introduced into the market ne-
cessitates regular monitoring of their possible health consequences. The presence of the
different groups of environmental chemicals compiled in our resources across various
product categories reflects the gap in the current chemical regulation. These results also
highlight the need to bridge the gap between scientific research in academia and regu-
latory aspects of environmental chemicals of potential concern. Such analysis can aid
in the early identification of hazardous compounds and chemical prioritization, allowing
regulatory agencies to expedite the process of safety testing and, as a result, improving
chemical safety standards. Further investigation of experimentally derived dosage infor-
mation for observed endocrine-mediated health effects compiled in DEDuCT [35,36] can
enable identification of reference dose (RfD) or Tolerable Daily Intake (TDI) or Average
Daily Dose (ADD) that can aid in regulatory risk assessment of chemicals [393]. Moreover,
for risk assessment of chemicals of potential concern, it is worthwhile to consider
the compilation of other toxicological information such as species, sex, route of
administration, and duration of exposure along with the observed effects upon exposure to
environmental chemicals; the lack of this information is one of the limitations of our
compiled resources. In the case of EDCs [35,36] or neurotoxicants [38], the inclusion of
biomonitoring and epidemiological studies from published literature into our resources in
future will broaden the scope of exposure assessment and risk categorization.
caused by environmental chemical exposures [40]. We believe that the work detailed in
this thesis toward the characterization and compilation of environmental chemicals with
potential human health hazards will aid basic research and regulatory bodies in improved
risk assessment of such chemicals of concern. Overall, the work reported in this thesis is
a step towards a clean environment and a healthy humankind.
232
References
[3] Futran Fuhrman, V., Tal, A. & Arnon, S. Why endocrine disrupting chemicals
(EDCs) challenge traditional risk assessment and how to respond. Journal of Haz-
ardous Materials 286, 589±611 (2015).
[4] Schug, T. T. et al. Designing endocrine disruption out of the next generation of
chemicals. Green Chemistry 15, 181±198 (2013).
233
[9] Cui, Y. et al. The Exposome: Embracing the Complexity for Discovery in Envi-
ronmental Health. Environmental Health Perspectives 124, A137±A140 (2016).
[11] Shaffer, R. M. et al. Improving and Expanding Estimates of the Global Burden
of Disease Due to Environmental Health Risk Factors. Environmental Health Per-
spectives 127, 105001 (2019).
[12] Misra, B. B. The Chemical Exposome of Human Aging. Frontiers in Genetics 11,
1351 (2020).
[13] Vermeulen, R., Schymanski, E. L., Barabási, A.-L. & Miller, G. W. The exposome
and health: Where chemistry meets biology. Science 367, 392–396 (2020).
[14] Praveena, S. M. et al. Recent updates on phthalate exposure and human health: a
special focus on liver toxicity and stem cell regeneration. Environmental Science
and Pollution Research 25, 11333–11342 (2018).
[16] Sillé, F. C. M. et al. The exposome - a new approach for risk assessment. ALTEX
37, 3–23 (2020).
[17] Misra, B. B. & Misra, A. The chemical exposome of type 2 diabetes mellitus:
Opportunities and challenges in the omics era. Diabetes & Metabolic Syndrome:
Clinical Research & Reviews 14, 23–38 (2020).
[18] Wild, C. P. Complementing the Genome with an "Exposome": The Outstanding
Challenge of Environmental Exposure Measurement in Molecular Epidemiology.
Cancer Epidemiology Biomarkers & Prevention 14, 1847–1850 (2005).
[19] Rappaport, S. M. & Smith, M. T. Environment and Disease Risks. Science 330,
460–461 (2010).
[20] Miller, G. W. & Jones, D. P. The Nature of Nurture: Refining the Definition of the
Exposome. Toxicological Sciences 137, 1–2 (2014).
[22] Lioy, P. J. & Rappaport, S. M. Exposure Science and the Exposome: An Oppor-
tunity for Coherence in the Environmental Health Sciences. Environmental Health
Perspectives 119, a466–a467 (2011).
[25] Dong, T. et al. Human Indoor Exposome of Chemicals in Dust and Risk Prioriti-
zation Using EPA’s ToxCast Database. Environmental Science & Technology 53,
7045–7054 (2019).
[26] Wishart, D. et al. T3DB: the toxic exposome database. Nucleic Acids Research 43,
D928–D934 (2015).
[27] Groh, K. J., Geueke, B., Martin, O., Maffini, M. & Muncke, J. Overview of inten-
tionally used food contact chemicals and their hazards. Environment International
150, 106225 (2021).
[28] Barupal, D. K. & Fiehn, O. Generating the Blood Exposome Database Using
a Comprehensive Text Mining and Database Fusion Approach. Environmental
Health Perspectives 127, 97008 (2019).
[29] Bessonneau, V., Pawliszyn, J. & Rappaport, S. M. The Saliva Exposome for Moni-
toring of Individuals’ Health Trajectories. Environmental Health Perspectives 125,
077014 (2017).
[31] Vrijheid, M. et al. The Human Early-Life Exposome (HELIX): Project Rationale
and Design. Environmental Health Perspectives 122, 535–544 (2014).
[35] Karthikeyan, B. S., Ravichandran, J., Mohanraj, K., Vivek-Ananth, R. P. & Samal,
A. A curated knowledgebase on endocrine disrupting chemicals and their biological
systems-level perturbations. Science of the Total Environment 692, 281–296
(2019).
[36] Karthikeyan, B. S., Ravichandran, J., Aparna, S. R. & Samal, A. DEDuCT 2.0: An
updated knowledgebase and an exploration of the current regulations and guidelines
from the perspective of endocrine disrupting chemicals. Chemosphere 267,
128898 (2021).
[38] Ravichandran, J., Karthikeyan, B. S., Singla, P., Aparna, S. R. & Samal, A. Neuro-
toxKb 1.0: Compilation, curation and exploration of a knowledgebase of environ-
mental neurotoxicants specific to mammals. Chemosphere 278, 130387 (2021).
[40] Ravichandran, J., Karthikeyan, B. S., Jost, J. & Samal, A. An atlas of fragrance
chemicals in children’s products. Science of The Total Environment 818, 151682
(2022).
[41] Ravichandran, J., Karthikeyan, B. S., Aparna, S. R. & Samal, A. Network biology
approach to human tissue-specific chemical exposome. The Journal of Steroid
Biochemistry and Molecular Biology 214, 105998 (2021).
[42] Kalia, V., Jones, D. P. & Miller, G. W. Networks at the nexus of systems biology
and the exposome. Current Opinion in Toxicology 16, 25–31 (2019).
[44] Swedenborg, E., Rüegg, J., Mäkelä, S. & Pongratz, I. Endocrine disruptive chem-
icals: mechanisms of action and involvement in metabolic disorders. Journal of
Molecular Endocrinology 43, 1–10 (2009).
[45] Solecki, R. et al. Scientific principles for the identification of endocrine-disrupting
chemicals: a consensus statement. Archives of Toxicology 91, 1001–1006 (2017).
[51] Gore, A. C. et al. EDC-2: The Endocrine Society’s Second Scientific Statement on
Endocrine-Disrupting Chemicals. Endocrine Reviews 36, E1–E150 (2015).
[53] Bjørklund, G., Mutter, J. & Aaseth, J. Metal chelators and neurotoxicity: lead,
mercury, and arsenic. Archives of Toxicology 91, 3787–3797 (2017).
[54] Koch, C. Complexity and the Nervous System. Science 284, 96–98 (1999).
[55] Tshala-Katumbay, D., Mwanza, J.-C., Rohlman, D. S., Maestre, G. & Oriá, R. B.
A global perspective on the influence of environmental exposures on the nervous
system. Nature 527, S187–S192 (2015).
[57] Grandjean, P. & Landrigan, P. J. Neurobehavioural effects of developmental
toxicity. The Lancet Neurology 13, 330–338 (2014).
[60] U.S. EPA Office of Toxic Substances. Chemicals Which Have Been Tested for
Neurotoxic Effects. Tech. Rep. EPA-560/1-76-005, U.S. Environmental Protection
Agency, Washington, D.C. (1976).
[61] Mundy, W. R. et al. Expanding the test set: Chemicals with potential to disrupt
mammalian brain development. Neurotoxicology and Teratology 52, 25–35 (2015).
[62] Aschner, M. et al. Reference compounds for alternative test methods to indicate de-
velopmental neurotoxicity (DNT) potential of chemicals: example lists and criteria
for their selection and use. ALTEX 34, 49 (2017).
[63] Li, Z.-M., Albrecht, M., Fromme, H., Schramm, K.-W. & De Angelis, M. Per-
sistent Organic Pollutants in Human Breast Milk and Associations with Maternal
Thyroid Hormone Homeostasis. Environmental Science & Technology 54, 1111–1119
(2020).
[64] Leibson, T., Lala, P. & Ito, S. Chapter 24 - Drug and Chemical Contaminants
in Breast Milk: Effects on Neurodevelopment of the Nursing Infant. In Slikker,
W., Paule, M. G. & Wang, C. (eds.) Handbook of Developmental Neurotoxicology,
275–284 (Academic Press, 2018).
[65] National Research Council (ed.) Scientific frontiers in developmental toxicology and
risk assessment (National Academy Press, Washington, DC, 2000).
[66] Sonawane, B. R. Chemical contaminants in human milk: an overview.
Environmental Health Perspectives 103, 197–205 (1995).
[67] Mead, M. N. Contaminants in Human Milk: Weighing the Risks against the
Benefits of Breastfeeding. Environmental Health Perspectives 116, A426–A434 (2008).
[68] Agatonovic-Kustrin, S., Ling, L., Tham, S. & Alany, R. Molecular descriptors
that influence the amount of drugs transfer into human breast milk. Journal of
Pharmaceutical and Biomedical Analysis 29, 103–119 (2002).
[69] Zhao, C. et al. Prediction of Milk/Plasma Drug Concentration (M/P) Ratio Using
Support Vector Machine (SVM) Method. Pharmaceutical Research 23, 41–48
(2006).
[70] Heinzow, B. Endocrine disruptors in human milk and the health-related issues of
breastfeeding. In Endocrine-Disrupting Chemicals in Food, 322–355 (Woodhead
Publishing, 2009).
[72] Vasios, G. et al. Simple physicochemical properties related with lipophilicity, po-
larity, molecular size and ionization status exert significant impact on the transfer of
drugs and chemicals into human breast milk. Expert Opinion on Drug Metabolism
& Toxicology 12, 1273–1278 (2016).
[75] Klaschka, U. & Kolossa-Gehring, M. Fragrances in the Environment: Pleasant
odours for nature? (9 pp). Environmental Science and Pollution Research 14,
44–52 (2007).
[76] Nardelli, A., Drieghe, J., Claes, L., Boey, L. & Goossens, A. Fragrance allergens
in ‘specific’ cosmetic products. Contact Dermatitis 64, 212–219 (2011).
[77] Kim, J.-H. et al. Risk assessment to human health: Consumer exposure to
ingredients in air fresheners. Regulatory Toxicology and Pharmacology 98, 31–40
(2018).
[78] Pastor-Nieto, M.-A. & Gatica-Ortega, M.-E. Ubiquity, Hazardous Effects, and
Risk Assessment of Fragrances in Consumer Products. Current Treatment Options
in Allergy 8, 21–41 (2021).
[79] Fisher, B. E. Scents and sensitivity. Environmental Health Perspectives 106,
A594–A599 (1998).
[80] World Health Organization. Principles for evaluating health risks in children as-
sociated with exposure to chemicals (World Health Organization, 2006).
[81] Becker, M., Edwards, S. & Massey, R. I. Toxic Chemicals in Toys and Children’s
Products: Limitations of Current Responses and Recommendations for Government
and Industry. Environmental Science & Technology 44, 7986–7991 (2010).
[83] Kalia, V., Barouki, R. & Miller, G. W. The Exposome: Pursuing the Totality
of Exposure. In Jiang, G. & Li, X. (eds.) A New Paradigm for Environmental
Chemistry and Toxicology: From Concepts to Insights, 3–10 (Springer, Singapore,
2020).
[84] Barr, D. B. et al. The use of dried blood spots for characterizing children’s exposure
to organic environmental chemicals. Environmental Research 195, 110796 (2021).
[86] Kim, S. et al. PubChem in 2021: new data content and improved web interfaces.
Nucleic Acids Research 49, D1388–D1395 (2021).
[88] Barabási, A.-L. & Oltvai, Z. N. Network biology: understanding the cell’s
functional organization. Nature Reviews Genetics 5, 101–113 (2004).
[89] Dix, D. J. et al. The ToxCast Program for Prioritizing Toxicity Testing of
Environmental Chemicals. Toxicological Sciences 95, 5–12 (2007).
[91] National Research Council. Toxicity Testing in the 21st Century: A Vision and a Strategy (The
National Academies Press, Washington, DC, 2007).
[92] Hartung, T. On mapping the human toxome. ALTEX 28, 83–93 (2011).
[93] Hartung, T. Toxicology for the twenty-first century. Nature 460, 208–212 (2009).
[94] Krewski, D. et al. Toxicity Testing in the 21st Century: A Vision and a Strategy.
Journal of Toxicology and Environmental Health 13, 51–138 (2010).
[96] Edwards, S. W., Tan, Y.-M., Villeneuve, D. L., Meek, M. & McQueen, C. A. Adverse
Outcome Pathways–Organizing Toxicological Information to Improve Decision
Making. Journal of Pharmacology and Experimental Therapeutics 356, 170
(2016).
[97] Vinken, M. et al. Adverse outcome pathways: a concise introduction for
toxicologists. Archives of Toxicology 91, 3697–3707 (2017).
[98] Krewski, D. et al. Toxicity testing in the 21st century: progress in the past decade
and future perspectives. Archives of Toxicology 94, 1–58 (2020).
[101] The Organisation for Economic Co-operation and Development (OECD). Users’
Handbook Supplement to the Guidance Document for Developing and Assessing
Adverse Outcome Pathways. Tech. Rep. 233, OECD Environment, Health and
Safety Publications, Paris (2018).
[102] The Organisation for Economic Co-operation and Development (OECD). Revised
Guidance Document on Developing And Assessing Adverse Outcome Pathways.
Tech. Rep. 184, OECD Environment, Health and Safety Publications, Paris (2013).
[103] The Organisation for Economic Co-operation and Development (OECD). Guid-
ance Document for the Use of Adverse Outcome Pathways in Developing Inte-
grated Approaches to Testing and Assessment (IATA). Tech. Rep. 260, OECD
Environment, Health and Safety Publications, Paris (2017).
[104] Vinken, M. The adverse outcome pathway concept: A pragmatic tool in toxicology.
Toxicology 312, 158–165 (2013).
[106] Villeneuve, D. L. et al. Adverse Outcome Pathway Development II: Best Practices.
Toxicological Sciences 142, 321–330 (2014).
[107] Knapen, D. et al. Adverse outcome pathway networks I: Development and applica-
tions: Advancing adverse outcome pathway networks. Environmental Toxicology
and Chemistry 37, 1723–1733 (2018).
[108] Sewell, F. et al. The future trajectory of adverse outcome pathways: a commentary.
Archives of Toxicology 92, 1657–1661 (2018).
[109] Sakuratani, Y., Horie, M. & Leinala, E. Integrated Approaches to Testing and
Assessment: OECD Activities on the Development and Use of Adverse Outcome
Pathways and Case Studies. Basic & Clinical Pharmacology & Toxicology 123,
20–28 (2018).
[110] Villeneuve, D. L. et al. Adverse outcome pathway networks II: Network analytics.
Environmental Toxicology and Chemistry 37, 1734–1748 (2018).
[113] The Organisation for Economic Co-operation and Development (OECD). AOP
knowledge base (AOP-KB). https://aopkb.oecd.org/.
[115] Knapen, D., Vergauwen, L., Villeneuve, D. L. & Ankley, G. T. The potential
of AOP networks for reproductive and developmental toxicity assay development.
43rd Annual Conference of the European Teratology Society 56, 52–55 (2015).
[117] Coady, K. et al. When Are Adverse Outcome Pathways and Associated Assays
“Fit for Purpose” for Regulatory Decision-Making and Management of Chemicals?
Integrated Environmental Assessment and Management 15, 633–647 (2019).
[118] Hecker, M. & LaLone, C. A. Adverse Outcome Pathways: Moving from a Scien-
tific Concept to an Internationally Accepted Framework. Environmental Toxicology
and Chemistry 38, 1152–1163 (2019).
[119] Kitsak, M. et al. Tissue Specificity of Human Disease Module. Scientific Reports
6, 35241 (2016).
[120] Kim, P. et al. TissGDB: tissue-specific gene database in cancer. Nucleic Acids
Research 46, D1031–D1038 (2018).
[121] Maiorino, E. et al. Discovering the genes mediating the interactions between
chronic respiratory diseases in the human interactome. Nature Communications
11, 811 (2020).
[124] Raunio, H. In Silico Toxicology – Non-Testing Methods. Frontiers in
Pharmacology 2, 33 (2011).
[126] Bajusz, D., Rácz, A. & Héberger, K. Why is Tanimoto index an appropriate choice
for fingerprint-based similarity calculations? Journal of Cheminformatics 7, 20
(2015).
[130] Durant, J. L., Leland, B. A., Henry, D. R. & Nourse, J. G. Reoptimization of MDL
Keys for Use in Drug Discovery. Journal of Chemical Information and Computer
Sciences 42, 1273–1280 (2002).
[131] Lo, Y.-C. & Torres, J. Z. Chemical Similarity Networks for Drug Discovery. In
Chen, T. & Chai, S. C. (eds.) Special Topics in Drug Discovery (InTechOpen,
2016).
[133] Service, R. F. A New Wave of Chemical Regulations Just Ahead? Science 325,
692–693 (2009).
[134] European Union. Commission Regulation (EU) No 10/2011 on plastic materials
and articles intended to come into contact with food. https://eur-lex.europa.eu/eli/
reg/2011/10/oj (2011).
[138] U.S. FDA. US FDA Indirect Additives used in Food Contact Substances. https:
//www.cfsanappsexternal.fda.gov/scripts/fdcc/?set=IndirectAdditives.
[139] World Health Organization. WHO Codex General Standards for Food Additives.
http://www.fao.org/gsfaonline/additives/index.html (2019).
metic products. https://data.europa.eu/euodp/en/data/dataset/
cosmetic-ingredient-database-list-of-preservatives-allowed-in-cosmetic-products.
[146] Danish EPA. Danish EPA Sensitizing Fragrances in Children’s Articles. https:
//www2.mst.dk/udgiv/publications/2006/87-7052-018-6/pdf/87-7052-019-4.pdf
(2006).
[148] Aurisano, N., Huang, L., Canals, L. M. i., Jolliet, O. & Fantke, P. Chemicals of
concern in plastic toys. Environment International 146, 106194 (2021).
[150] The Organisation for Economic Co-operation and Development (OECD). OECD
High Production Volume (OECD HPV). https://www.oecd.org/chemicalsafety/
risk-assessment/33883530.pdf (2004).
[151] U.S. EPA. The United States High Production Volume (USHPV) database. https:
//comptox.epa.gov/dashboard/chemical_lists/EPAHPV (2004).
[152] European Chemicals Agency. REACH High Production Volume (HPV) chemicals.
https://echa.europa.eu/en/information-on-chemicals/registered-substances.
[153] Stone, A. & Delistraty, D. Sources of toxicity and exposure information for iden-
tifying chemicals of high concern to children. Environmental Impact Assessment
Review 30, 380–387 (2010).
[154] Neltner, T. G., Alger, H. M., Leonard, J. E. & Maffini, M. V. Data gaps in toxicity
testing of chemicals allowed in food in the United States. Reproductive Toxicology
42, 85–94 (2013).
[155] Geueke, B., Wagner, C. C. & Muncke, J. Food contact substances and chemicals
of concern: a comparison of inventories. Food Additives & Contaminants: Part A
31, 1438–1450 (2014).
[156] Demeneix, B. & Salma, R. Endocrine disruptors: from scientific evidence to hu-
man health protection policy. Tech. Rep., Policy Department for Citizen’s Rights
and Constitutional Affairs, European Parliament (2019).
[157] European Union. Candidate List of Substances of Very High Concern (SVHC) for
Authorisation. https://echa.europa.eu/candidate-list-table.
[159] Baker, V. Endocrine disrupters – testing strategies to assess human hazard.
Toxicology in Vitro 15, 413–419 (2001).
[160] Bliatka, D., Lymperi, S., Mastorakos, G. & Goulis, D. G. Effect of endocrine
disruptors on male reproduction in humans: why the evidence is still lacking?
Andrology 5, 404–407 (2017).
[162] Ding, D. et al. The EDKB: an established knowledge base for endocrine disrupting
chemicals. BMC Bioinformatics 11, S5 (2010).
[167] Sharma, V. & McNeill, J. H. To scale or not to scale: the principles of dose
extrapolation. British Journal of Pharmacology 157, 907–921 (2009).
[170] Welshons, W. V. et al. Large effects from small exposures. I. Mechanisms for
endocrine-disrupting chemicals with estrogenic activity. Environmental Health
Perspectives 111, 994–1006 (2003).
[173] ClassyFire. http://classyfire.wishartlab.com/.
[178] O’Boyle, N. M. et al. Open Babel: An open chemical toolbox. Journal of Chem-
informatics 3, 33 (2011).
[182] O’Boyle, N. M., Morley, C. & Hutchison, G. R. Pybel: a Python wrapper for the
OpenBabel cheminformatics toolkit. Chemistry Central Journal 2, 5 (2008).
[183] Yang, H. et al. admetSAR 2.0: web-service for prediction and optimization of
chemical ADMET properties. Bioinformatics 35, 1067–1069 (2019).
[185] Banerjee, P., Eckert, A. O., Schrey, A. K. & Preissner, R. ProTox-II: a webserver
for the prediction of toxicity of chemicals. Nucleic Acids Research 46,
W257–W263 (2018).
[186] Daina, A., Michielin, O. & Zoete, V. SwissADME: a free web tool to evalu-
ate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small
molecules. Scientific Reports 7, 42717 (2017).
[187] Patlewicz, G., Jeliazkova, N., Safford, R., Worth, A. & Aleksiev, B. An evaluation
of the implementation of the Cramer classification scheme in the Toxtree software.
SAR and QSAR in Environmental Research 19, 495–524 (2008).
[188] Schyman, P., Liu, R., Desai, V. & Wallqvist, A. vNN Web Server for ADMET
Predictions. Frontiers in Pharmacology 8, 889 (2017).
[198] Wassenaar, P. N., Rorije, E., Janssen, N. M., Peijnenburg, W. J. & Vijver, M. G.
Chemical similarity to identify potential Substances of Very High Concern – An
effective screening method. Computational Toxicology 12, 100110 (2019).
[199] Wassenaar, P. N., Rorije, E., Vijver, M. G. & Peijnenburg, W. J. Evaluating chem-
ical similarity as a measure to identify potential substances of very high concern.
Regulatory Toxicology and Pharmacology 119, 104834 (2021).
[202] Bender, A. et al. How Similar Are Similarity Searching Methods? A Principal
Component Analysis of Molecular Descriptor Space. Journal of Chemical
Information and Modeling 49, 108–119 (2009).
[204] Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding
of communities in large networks. Journal of Statistical Mechanics: Theory and
Experiment 2008, P10008 (2008).
[205] Bastian, M., Heymann, S. & Jacomy, M. Gephi: an open source software for
exploring and manipulating networks. In Third international AAAI conference on
weblogs and social media, 1–2 (2009).
[207] Jaccard, P. The distribution of the flora in the Alpine zone.1. New Phytologist 11,
37–50 (1912).
[209] Loomis, D., Guha, N., Hall, A. L. & Straif, K. Identifying occupational carcino-
gens: an update from the IARC Monographs. Occupational and Environmental
Medicine 75, 593–603 (2018).
[212] The French Agency for Food, Environmental and Occupational Health & Safety
(ANSES). Elaboration of a list of
substances of interest as regards to a potential endocrine activity and prioritisation
strategy for assessment. Tech. Rep. 2019-SA-0179, ANSES, France (2021).
[216] National Research Council. Risk Assessment in the Federal Government: Managing the Process
(National Academies Press, Washington, D.C., 1983).
[219] Clahsen, S. C. S. et al. Why Do Countries Regulate Environmental Health Risks
Differently? A Theoretical Perspective. Risk Analysis 39, 439–461 (2019).
[221] Jeong, J. & Choi, J. Use of adverse outcome pathways in chemical toxicity testing:
potential advantages and limitations. Environmental Health and Toxicology 33,
e2018002 (2017).
[223] Vinken, M. Taking adverse outcome pathways to the next level. Toxicology in Vitro
50, A1–A2 (2018).
[225] Sturla, S. J. et al. Systems Toxicology: From Basic Research to Risk Assessment.
Chemical Research in Toxicology 27, 314–329 (2014).
[226] Hartung, T. et al. Systems Toxicology: Real World Applications and Opportunities.
Chemical Research in Toxicology 30, 870–882 (2017).
[227] Aguayo-Orozco, A., Taboureau, O. & Brunak, S. The use of systems biology in
chemical risk assessment. Current Opinion in Toxicology 15, 48–54 (2019).
[229] LaLone, C. A. et al. Weight of evidence evaluation of a network of adverse out-
come pathways linking activation of the nicotinic acetylcholine receptor in honey
bees to colony death. Science of the Total Environment 584–585, 751–775 (2017).
[230] Spinu, N. et al. Development and analysis of an adverse outcome pathway network
for human neurotoxicity. Archives of Toxicology 93, 2759–2772 (2019).
[233] Carvaillo, J.-C., Barouki, R., Coumoul, X. & Audouze, K. Linking Bisphenol S to
Adverse Outcome Pathways Using a Combined Text Mining and Systems Biology
Approach. Environmental Health Perspectives 127, 047005 (2019).
[234] Browne, P., Van Der Wal, L. & Gourmelon, A. OECD approaches and consider-
ations for regulatory evaluation of endocrine disruptors. Molecular and Cellular
Endocrinology 504, 110675 (2020).
[235] Rugard, M., Coumoul, X., Carvaillo, J.-C., Barouki, R. & Audouze, K. Decipher-
ing Adverse Outcome Pathway Network Linked to Bisphenol F Using Text Mining
and Systems Toxicology Approaches. Toxicological Sciences 173, 32–40 (2020).
[238] Patisaul, H. B., Fenton, S. E. & Aylor, D. Animal models of endocrine
disruption. Best Practice & Research Clinical Endocrinology & Metabolism 32, 283–297
(2018).
[239] NetworkX. https://networkx.org/.
[241] Assenov, Y., Ramírez, F., Schelhorn, S.-E., Lengauer, T. & Albrecht, M.
Computing topological parameters of biological networks. Bioinformatics 24, 282–284
(2008).
[243] Takes, F. W. & Kosters, W. A. Determining the diameter of small world networks.
In CIKM ’11: Proceedings of the 20th ACM international conference on
Information and knowledge management, 1191–1196 (ACM Press, Glasgow, Scotland,
UK, 2011).
[245] Bernal, J. Thyroid hormones in brain development and function. Endotext [Inter-
net] (2000).
[246] Volpato, S. et al. Serum thyroxine level and cognitive decline in euthyroid older
women. Neurology 58, 1055–1061 (2002).
[247] Tunc-Ozcan, E., Ullmann, T. M., Shukla, P. K. & Redei, E. E. Low-Dose Thyrox-
ine Attenuates Autism-Associated Adverse Effects of Fetal Alcohol in Male Off-
spring’s Social Behavior and Hippocampal Gene Expression. Alcoholism: Clinical
and Experimental Research 37, 1986–1995 (2013).
[248] Cooke, G. E., Mullally, S., Correia, N., O’Mara, S. M. & Gibney, J. Hippocampal
Volume Is Decreased in Adults with Hypothyroidism. Thyroid 24, 433–440 (2014).
[249] Corton, J. C. & Lapinskas, P. J. Peroxisome Proliferator-Activated Receptors: Me-
diators of Phthalate Ester-Induced Effects in the Male Reproductive Tract?
Toxicological Sciences 83, 4–17 (2005).
[250] Latini, G., Scoditti, E., Verrotti, A., De Felice, C. & Massaro, M. Peroxisome
Proliferator-Activated Receptors as Mediators of Phthalate-Induced Effects in the
Male and Female Reproductive Tract: Epidemiological and Experimental Evi-
dence. PPAR Research 2008, 359267 (2008).
[253] Jahnke, G. D., Choksi, N. Y., Moore, J. A. & Shelby, M. D. Thyroid toxicants:
assessing reproductive health effects. Environmental Health Perspectives 112,
363–368 (2004).
[255] Chen, C.-W., Huang, Y.-L., Tzeng, C.-R., Huang, R.-L. & Chen, C.-H. Idiopathic
Low Ovarian Reserve Is Associated with More Frequent Positive Thyroid
Peroxidase Antibodies. Thyroid 27, 1194–1200 (2017).
[256] Wang, X., Ding, X., Xiao, X., Xiong, F. & Fang, R. An exploration on the influence
of positive simple thyroid peroxidase antibody on female infertility. Experimental
and Therapeutic Medicine 16, 3077–3081 (2018).
[257] Erickson, G. F., Hsueh, A., Quigley, M., Rebar, R. & Yen, S. Functional Studies
of Aromatase Activity in Human Granulosa Cells from Normal and Polycystic
Ovaries. The Journal of Clinical Endocrinology & Metabolism 49, 514–519
(1979).
[258] Garzo, V. & Dorrington, J. Aromatase activity in human granulosa cells during
follicular development and the modulation by follicle-stimulating hormone and
insulin. American Journal of Obstetrics and Gynecology 148, 657–662 (1984).
[260] Sun, L., Zha, J., Spear, P. A. & Wang, Z. Toxicity of the aromatase inhibitor
letrozole to Japanese medaka (Oryzias latipes) eggs, larvae and breeding adults.
Comparative Biochemistry and Physiology Part C: Toxicology & Pharmacology
145, 533–541 (2007).
[261] Hazra, R. et al. In Vivo Actions of the Sertoli Cell Glucocorticoid Receptor.
Endocrinology 155, 1120–1130 (2014).
[262] Silva, E. J., Queiróz, D. B., Honda, L. & Avellar, M. C. W. Glucocorticoid receptor
in the rat epididymis: Expression, cellular distribution and regulation by steroid
hormones. Molecular and Cellular Endocrinology 325, 64–77 (2010).
[266] Fonger, G. C., Stroup, D., Thomas, P. L. & Wexler, P. Toxnet: A computerized
collection of toxicological and environmental health information. Toxicology and
Industrial Health 16, 4–6 (2000).
[267] U.S. National Library of Medicine. TOXNET Update: New Locations for TOXNET
Content. Tech. Rep. 431, NLM Tech Bulletin (2019).
[269] Fonger, G. C., Hakkinen, P., Jordan, S. & Publicker, S. The National Library of
Medicine’s (NLM) Hazardous Substances Data Bank (HSDB): Background, recent
enhancements and future plans. Toxicology 325, 209–216 (2014).
[275] Rogers, F. B. Medical subject headings. Bulletin of the Medical Library
Association 51, 114–116 (1963).
[277] Estrin, W. J. et al. Evidence of Neurologic Dysfunction Related to Long-term
Ethylene Oxide Exposure. Archives of Neurology 44, 1283–1286 (1987).
[278] Estrin, W. J., Bowler, R. M., Lash, A. & Becker, C. E. Neurotoxicological evalua-
tion of hospital sterilizer workers exposed to ethylene oxide. Journal of Toxicology:
Clinical Toxicology 28, 1–20 (1990).
[279] Zheng, W., Aschner, M. & Ghersi-Egea, J.-F. Brain barrier systems: a new frontier
in metal neurotoxicological research. Toxicology and Applied Pharmacology 192,
1–11 (2003).
[280] Miodovnik, A., Edwards, A., Bellinger, D. C. & Hauser, R. Developmental neuro-
toxicity of ortho-phthalate diesters: Review of human and experimental evidence.
NeuroToxicology 41, 112–122 (2014).
[282] Jasial, S., Hu, Y., Vogt, M. & Bajorath, J. Activity-relevant similarity values for fin-
gerprints and implications for similarity searching. F1000Research 5, 591 (2016).
[283] Mohanraj, K. et al. IMPPAT: A curated database of Indian Medicinal Plants, Phy-
tochemistry And Therapeutics. Scientific Reports 8, 4329 (2018).
[284] Vivek-Ananth, R. P., Sahoo, A. K., Kumaravel, K., Mohanraj, K. & Samal, A.
MeFSAT: a curated natural product database specific to secondary metabolites of
medicinal fungi. RSC Advances 11, 2596–2607 (2021).
[285] Landrigan, P. J., Sonawane, B., Mattison, D., McCally, M. & Garg, A. Chemical
contaminants in breast milk and their impacts on children’s health: an overview.
Environmental Health Perspectives 110, A313–315 (2002).
[286] Lehmann, G. M. et al. Environmental Chemicals in Breast Milk and Formula:
Exposure and Risk Assessment Implications. Environmental Health Perspectives
126, 096001 (2018).
[287] LaKind, J. S. et al. Infant Dietary Exposures to Environmental Chemicals and In-
fant/Child Health: A Critical Assessment of the Literature. Environmental Health
Perspectives 126, 096002 (2018).
[290] Ramakrishnan, N., Kaphalia, B., Seth, T. & Roy, N. Organochlorine Pesticide
Residues in Mother’s Milk: a Source of Toxic Chemicals in Suckling Infants.
Human Toxicology 4, 7–12 (1985).
[291] Devanathan, G. et al. Persistent organochlorines in human breast milk from major
metropolitan cities in India. Environmental Pollution 157, 148–154 (2009).
[293] Sharma, B. M., Bharat, G. K., Tayal, S., Nizzetto, L. & Larssen, T. The legal
framework to manage chemical pollution in India and the lesson from the Persistent
Organic Pollutants (POPs). Science of the Total Environment 490, 733–747 (2014).
[294] van den Berg, M. et al. WHO/UNEP global surveys of PCDDs, PCDFs, PCBs
and DDTs in human milk and benefit–risk evaluation of breastfeeding. Archives of
Toxicology 91, 83–96 (2017).
[295] Gaulton, A. et al. The ChEMBL database in 2017. Nucleic Acids Research 45,
D945–D954 (2017).
[296] Samet, J. M. et al. The IARC Monographs: Updated Procedures for Modern and
Transparent Evidence Synthesis in Cancer Hazard Identification. JNCI: Journal of
the National Cancer Institute 112, 30–37 (2020).
[298] Indian Ministry of Agriculture & Farmers Welfare. List of Banned Pesticides in
India. http://ppqs.gov.in/divisions/cib-rc/registered-products.
[299] Indian Ministry of Environment & Forests. Schedule 1 hazardous chemical list in
India. http://moef.gov.in/wp-content/uploads/2019/08/SCHEDULE-I.html.
[300] Indian Ministry of Environment & Forests. Schedule 3 hazardous chemical list in
India. http://moef.gov.in/wp-content/uploads/2019/08/SCHEDULE-3.html.
[303] Neville, M. C. & Walsh, C. T. Effects of xenobiotics on milk secretion and
composition. The American Journal of Clinical Nutrition 61, 687S–694S (1995).
[304] Lemay, D. G. et al. RNA Sequencing of the Human Milk Fat Layer Transcriptome
Reveals Distinct Gene Expression Profiles at Three Stages of Lactation. PLoS ONE
8, e67531 (2013).
[305] Maningat, P. D. et al. Gene expression in the human mammary epithelium during
lactation: the milk fat globule transcriptome. Physiological Genomics 37, 12–22
(2009).
[307] Hill, P. D., Chatterton, R. T. & Aldag, J. C. Serum Prolactin in Breastfeeding: State
of the Science. Biological Research For Nursing 1, 65–75 (1999).
[312] Kanehisa, M. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids
Research 28, 27–30 (2000).
[313] Rebelo, F. M. & Caldas, E. D. Arsenic, lead, mercury and cadmium: Toxicity,
levels in breast milk and the risks for breastfed infants. Environmental Research
151, 671–688 (2016).
[314] Dawod, B. & Marshall, J. S. Cytokines and Soluble Receptors in Breast Milk as
Enhancers of Oral Tolerance Development. Frontiers in Immunology 10, 16 (2019).
[315] Jackson, K. M. & Nazar, A. M. Breastfeeding, the immune response, and long-term
health. Journal of Osteopathic Medicine 106, 203–207 (2006).
[316] Bagley, C. J., Woodcock, J. M., Stomski, F. C. & Lopez, A. F. The Structural and
Functional Basis of Cytokine Receptor Activation: Lessons From the Common β
Subunit of the Granulocyte-Macrophage Colony-Stimulating Factor, Interleukin-3
(IL-3), and IL-5 Receptors. Blood 89, 1471–1482 (1997).
[320] Braschi, B. et al. Genenames.org: the HGNC and VGNC resources in 2019.
Nucleic Acids Research 47, D786–D792 (2019).
[322] Ito, S. & Alcorn, J. Xenobiotic transporter expression and function in the human
mammary gland. Advanced Drug Delivery Reviews 55, 653–665 (2003).
[323] García-Lino, A. M., Álvarez Fernández, I., Blanco-Paniagua, E., Merino, G. &
Álvarez, A. I. Transporters in the Mammary Gland–Contribution to Presence of
Nutrients and Drugs into Milk. Nutrients 11, 2372 (2019).
[324] Montalbetti, N., Dalghi, M. G., Albrecht, C. & Hediger, M. A. Nutrient Transport
in the Mammary Gland: Calcium, Trace Minerals and Water Soluble Vitamins.
Journal of Mammary Gland Biology and Neoplasia 19, 73–90 (2014).
[325] Ventrella, D., Forni, M., Bacci, M. L. & Annaert, P. Non-clinical Models to Deter-
mine Drug Passage into Human Breast Milk. Current Pharmaceutical Design 25,
534–548 (2019).
[326] Alcorn, J., Lu, X., Moscow, J. A. & McNamara, P. J. Transporter Gene Expression
in Lactating and Nonlactating Human Mammary Epithelial Cells Using Real-Time
Reverse Transcription-Polymerase Chain Reaction. Journal of Pharmacology and
Experimental Therapeutics 303, 487–496 (2002).
[327] Mandal, B. & Suzuki, K. T. Arsenic round the world: a review. Talanta 58,
201–235 (2002).
[331] Borah, K. K., Bhuyan, B. & Sarma, H. P. Lead, arsenic, fluoride, and iron contam-
ination of drinking water in the tea garden belt of Darrang district, Assam, India.
Environmental Monitoring and Assessment 169, 347–352 (2010).
[332] Sharma, C., Mahajan, A. & Garg, U. K. Assessment of arsenic in drinking
water samples in south-western districts of Punjab–India. Desalination and Water
Treatment 51, 5701–5709 (2013).
[333] Kumar, M., Rahman, M. M., Ramanathan, A. & Naidu, R. Arsenic and other
elements in drinking water and dietary components from the middle Gangetic plain
of Bihar, India: Health risk index. Science of the Total Environment 539, 125–134
(2016).
[334] U.S. EPA. Child-Specific Exposure Scenarios Examples (Final Report). Tech.
Rep. EPA/600/R-14-217F, U.S. Environmental Protection Agency, Washington,
DC (2014).
[336] Negev, M. et al. Regulation of chemicals in children’s products: How U.S. and
EU regulation impacts small markets. Science of the Total Environment 616-617,
462–471 (2018).
[337] Brod, B. A., Treat, J. R., Rothe, M. J. & Jacob, S. E. Allergic contact dermatitis:
Kids are not just little people. Clinics in Dermatology 33, 605–612 (2015).
[338] Högberg, J. et al. Phthalate Diesters and Their Metabolites in Human Breast Milk,
Blood or Serum, and Urine as Biomarkers of Exposure in Vulnerable Populations.
Environmental Health Perspectives 116, 334–339 (2008).
[339] Bridges, B. Fragrances and health. Environmental Health Perspectives 107, A340
(1999).
[341] Krowech, G. et al. Identifying Chemical Groups for Biomonitoring. Environmental
Health Perspectives 124, A219–A226 (2016).
[342] Bridges, B. Fragrance: emerging health and environmental concerns. Flavour and
Fragrance Journal 17, 361–371 (2002).
[343] Aurisano, N., Fantke, P., Huang, L. & Jolliet, O. Estimating mouthing exposure
to chemicals in children’s products. Journal of Exposure Science & Environmental
Epidemiology (2021).
[344] Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA).
http://www.prisma-statement.org/.
[346] Arn, H. & Acree, T. Flavornet: a database of aroma compounds based on odor
potency in natural products. Developments in Food Science 40, 27–28 (1998).
[351] The International Fragrance Association (IFRA). IFRA Transparency List.
https://ifrafragrance.org/priorities/ingredients/ifra-transparency-list.
[352] NORMAN: Toxic Plant Phytotoxin (TPPT) Database.
https://comptox.epa.gov/dashboard/chemical_lists/PHYTOTOXINS.
[353] Drechsel, D. A. et al. Skin Sensitization Induction Potential From Daily Exposure
to Fragrances in Personal Care Products. Dermatitis 29, 324–331 (2018).
[354] U.S. National Toxicology Program. ICCVAM: Skin Corrosion 2004 collection
from NIEHS. https://comptox.epa.gov/dashboard/chemical_lists/ICCVAMSKIN
(2004).
[355] U.S. National Toxicology Program. ICCVAM: Local Lymph Node Assay (LLNA)
2009. https://comptox.epa.gov/dashboard/chemical_lists/ICCVAMLLNA (2009).
[356] National Institute for Occupational Safety and Health. NIOSH: Skin Notation
Profiles. https://comptox.epa.gov/dashboard/chemical_lists/NIOSHSKIN (2009).
[357] Baldi, P. & Nasr, R. When is Chemical Similarity Significant? The Statistical
Distribution of Chemical Similarity Scores and Its Extreme Values. Journal of
Chemical Information and Modeling 50, 1205–1222 (2010).
[358] Farbiszewski, R. & Kranc, R. Olfactory receptors and the mechanism of odor
perception. Polish Annals of Medicine 20, 51–55 (2013).
[359] Gaillard, I., Rouquier, S. & Giorgi, D. Olfactory receptors. Cellular and Molecular
Life Sciences CMLS 61, 456–469 (2004).
[360] Genva, M., Kenne Kemene, T., Deleu, M., Lins, L. & Fauconnier, M.-L. Is It Pos-
sible to Predict the Odor of a Molecule on the Basis of its Structure? International
Journal of Molecular Sciences 20, 3018 (2019).
[362] Crasto, C. J. The olfactory receptor database: web-based resources for the ge-
nomics, proteomics and function of olfactory receptors. Flavour 3, O8 (2014).
[363] Olfactory Receptor Database (ORDB). http://ycmi.med.yale.edu/senselab/ordb/.
[365] van de Sandt, J. et al. The Use of Human Keratinocytes and Human Skin Mod-
els for Predicting Skin Irritation: The Report and Recommendations of ECVAM
Workshop 38. Alternatives to Laboratory Animals 27, 723–743 (1999).
[368] Rappaport, S. M., Barupal, D. K., Wishart, D., Vineis, P. & Scalbert, A. The Blood
Exposome and Its Role in Discovering Causes of Disease. Environmental Health
Perspectives 122, 769–774 (2014).
[369] Uhlén, M. et al. Tissue-based map of the human proteome. Science 347, 1260419
(2015).
[370] Piñero, J. et al. The DisGeNET knowledge platform for disease genomics: 2019
update. Nucleic Acids Research 48, D845–D855 (2020).
[372] The UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids
Research 43, D204–D212 (2015).
[373] Rath, A. et al. Representation of rare diseases in health information systems: The
orphanet approach to serve a wide range of end users. Human Mutation 33,
803–808 (2012).
[374] Ma, X., Lee, H., Wang, L. & Sun, F. CGI: a new approach for prioritizing genes by
combining gene expression and protein–protein interaction data. Bioinformatics
23, 215–221 (2007).
[377] Goodman, Z. D. Neoplasms of the liver. Modern Pathology 20, S49–S60 (2007).
[378] Aleksandrova, K., Stelmach-Mardas, M. & Schlesinger, S. Obesity and Liver Can-
cer. In Pischon, T. & Nimptsch, K. (eds.) Obesity and Cancer. Recent Results
in Cancer Research, vol 208, 177–198 (Springer International Publishing, Cham,
2016).
[380] Marengo, A., Rosso, C. & Bugianesi, E. Liver Cancer: Connections with Obesity,
Fatty Liver, and Cirrhosis. Annual Review of Medicine 67, 103–117 (2016).
[383] Gupta, R. et al. Endocrine disruption and obesity: A current review on environmen-
tal obesogens. Current Research in Green and Sustainable Chemistry 3, 100009
(2020).
[384] Taboureau, O. & Audouze, K. Human Environmental Disease Network: A
computational model to assess toxicology of contaminants. ALTEX 34, 289–300 (2017).
[386] Zhou, X., Menche, J., Barabási, A.-L. & Sharma, A. Human symptoms–disease
network. Nature Communications 5, 4212 (2014).
[387] Dobson, C. M. Chemical space and biology. Nature 432, 824–828 (2004).
[388] Lipinski, C. & Hopkins, A. Navigating chemical space for biology and medicine.
Nature 432, 855–861 (2004).
[389] Rager, J. E. et al. Review of the environmental prenatal exposome and its
relationship to maternal and fetal health. Reproductive Toxicology 98, 1–12 (2020).
[390] Helma, C., Kramer, S., Pfahringer, B. & Gottmann, E. Data quality in predic-
tive toxicology: identification of chemical structures and calculation of chemical
properties. Environmental Health Perspectives 108, 1029–1033 (2000).
[391] Helma, C., Gottmann, E. & Kramer, S. Knowledge discovery and data mining in
toxicology. Statistical Methods in Medical Research 9, 329–358 (2000).
[394] Xue, J., Lai, Y., Liu, C.-W. & Ru, H. Towards Mass Spectrometry-Based Chemical
Exposome: Current Approaches, Challenges, and Future Directions. Toxics 7, 41
(2019).
[395] Leist, M. et al. Adverse outcome pathways: opportunities, limitations and open
questions. Archives of Toxicology 91, 3477–3505 (2017).