1 s2.0 S0022283620305945 Main
1 - National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894, USA
2 - Computational Biology and Bioinformatics, Combined Program in the Biological and Biomedical Sciences, Yale
University, New Haven, CT 06520, USA
3 - VantAI, New York, NY 10003, USA
4 - Department of Pathology and Molecular Medicine, School of Medicine, Queen’s University, ON K7L 3N6, Canada
Abstract
To elucidate the properties of human histone interactions on the large scale, we perform a comprehensive
mapping of human histone interaction networks by using data from structural, chemical cross-linking and
various high-throughput studies. Histone interactomes derived from different data sources show limited
overlap and complement each other. This motivates us to integrate these data into a combined global histone
interaction network, which includes 5308 proteins and 10,330 interactions. Analysis of the topological
properties of the human histone interactome reveals its scale-free behavior and high modularity. Our study of
histone binding interfaces uncovers a remarkably high number of residues involved in interactions between
histones and non-histone proteins: 80–90% of residues in histones H3 and H4 have at least one binding
partner. Two types of histone binding modes are detected: interfaces conserved in most histone variants
and variant-specific interfaces. Finally, different types of chromatin factors recognize histones in
nucleosomes via distinct binding modes, and many of these interfaces utilize acidic patches among other sites.
Interaction networks are available at https://github.com/Panchenko-Lab/Human-histone-interactome.
0022-2836/© 2020 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Journal of Molecular Biology (2020) xxx, xxx–xxx
Please cite this article as: Y. Peng, Y. Markov, A. Goncearenco, et al., Human Histone Interaction Networks: An Old Concept, New Trends, Journal of Molecular Biology, https://doi.org/10.1016/j.jmb.2020.10.018
Y. Peng, Y. Markov, A. Goncearenco, et al. Journal of Molecular Biology xxx (xxxx) xxx
of proteins interacting with different histone types [6–13] and histone post-translationally modified sites [14–19].
Although high-throughput approaches have been widely applied for mapping of protein interactomes, the identified PPIs still suffer from high false-positive and false-negative rates and from an inability to provide high-resolution data on physiological chromatin states. These issues can be addressed in part by using experimental data produced by X-ray crystallography, NMR spectroscopy and cryo-electron microscopy (cryo-EM), techniques which provide atomic or near-atomic resolution on specific biologically relevant interactions in chromatin [20,21]. These studies enable a systematic analysis of histone and nucleosome interactions and their binding interfaces to complement and interpret interactomes obtained from the high-throughput approaches. The number of histone and nucleosome complex structures in the Protein Data Bank (PDB) has been increasing exponentially in recent years [3] but is still very low due to the complexity of structural characterization. Moreover, most of the low-throughput approaches focus on individual interactions or sub-networks of histone or tail peptide interactions.
A comprehensive human protein interactome characterization remains a daunting challenge, which requires the integration of different experimental and computational approaches [22,23]. Herein, we perform a comprehensive mapping of human histone and nucleosome interactions by systematically analyzing the structural, chemical cross-linking and high-throughput (HTP) data. Our global human histone interactome combined from structural, cross-linking and high-throughput data has 5308 protein nodes and 10,330 interactions between them. Finally, we characterize binding interfaces and identify binding hotspots on individual histones, histone variants and on histones in the context of nucleosomes.
Methods
Construction of human histone/nucleosome interactomes
To improve our understanding of the biological processes mediated by nucleosome or histone interactions in human, we used three different sources of experimental data (Figure 1). We further explored histone interactions at different levels of granularity: protein, domain and residue levels. The details are outlined below.
Histone structural interactome. The histone structural interactome includes histone interactions collected from the available histone and nucleosome complex structures in PDB [20]. To build this network, we first performed a text search against PDB using a list of keywords associated with histones (Table SM1). The PDB identifiers of the obtained structures were further used to extract information on species, protein names, UniProt accessions and chain identifiers with the RCSB PDB RESTful Web Service interface (https://www.rcsb.org/pdb/software/rest.do). Second, three rounds of filtering were applied to extract the structures of human histones or nucleosomes in complex with other proteins: i) structures that did not contain any human histones were excluded; ii) ambiguous cases of synthetic constructs mimicking histone tails without relevant UniProtKB accessions that could not be mapped to known histone sequences in the HistoneDB 2.0 database were excluded [24,25]; iii) structures that contained histones without any protein binding partners were removed. It should be noted that a small fraction of collected PDB histone structures contained human histones in complex with proteins from other species. Such inter-species histone interactions were checked and included in the interaction network if binding partners were evolutionarily conserved based on the corresponding reference. Proteins were considered as unique histone-binding partners if they had different UniProtKB accessions. In total, we found 208 histone interactions with 164 different binding proteins from 345 structures of individual histones or histones in nucleosome complexes (Table SM2).
In order to identify binding interfaces we retrieved coordinates of biological assemblies from the PDB and analyzed their inter-protein contacts. Interfaces consisted of residues located within 5 Å distance between heavy atoms of histones and their binding partners. Residue locations were mapped to the sequences of the corresponding UniProtKB entries using SIFTS [26]. Next, using protein domain family annotations from the Conserved Domain Database (a collection of manually curated and annotated multiple sequence alignments for protein domain families and full-length proteins) [27], we identified domain families of histone-binding proteins. Proteins interacting with histones via the same domain family were grouped together in the “domain-level network” totaling 137 interactions from 113 unique domain families.
Histone cross-linking interactome. Cross-linking mass spectrometry (XL-MS) is a powerful experimental approach for identifying protein-protein interactions and probing PPI interfaces containing certain residues [28]. Recently, PPIs in human nuclei were studied using this technique and an interaction network was built from the inter-protein crosslinks between spatially closely located lysine residues (cross-linked using disuccinimidyl sulfoxide, DSSO) [10]. Using the interactions observed in both fractionated and unfractionated crosslinked nuclei from [10] (in total 1855 PPIs in human nuclei), we extracted the histone interactions where one human histone was cross-linked with another human non-histone protein. This so-called “cross-linking histone network” included 274
Figure 1. A workflow to construct human histone interactomes from different resources using PDB structures of
histone complexes, crosslinking mass spectrometry and high-throughput data from the APID database.
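The selection rule used for the cross-linking and high-throughput networks (keep an interaction when one human histone is paired with one human non-histone protein) can be illustrated with a minimal Python sketch. The accession labels, the toy pair list and the function name `histone_interactions` are hypothetical placeholders for illustration; they are not the authors' data or code.

```python
def histone_interactions(pairs, histone_accessions):
    """Keep pairs that link exactly one histone to one non-histone protein.

    pairs: iterable of (accession_a, accession_b) tuples.
    histone_accessions: set of accessions treated as histones (assumed input).
    Returns a sorted list of unique (histone, partner) tuples.
    """
    selected = set()
    for a, b in pairs:
        a_is_histone = a in histone_accessions
        b_is_histone = b in histone_accessions
        if a_is_histone and not b_is_histone:
            selected.add((a, b))
        elif b_is_histone and not a_is_histone:
            selected.add((b, a))
        # histone-histone pairs and pairs without any histone are skipped
    return sorted(selected)


# Hypothetical toy input; these labels are placeholders, not real accessions.
HISTONES = {"HIST_A", "HIST_B"}
PAIRS = [
    ("HIST_A", "PROT_1"),
    ("PROT_1", "HIST_A"),  # same interaction listed in reverse order
    ("HIST_A", "HIST_B"),  # histone-histone, excluded
    ("PROT_1", "PROT_2"),  # no histone involved, excluded
    ("PROT_3", "HIST_B"),
]

network = histone_interactions(PAIRS, HISTONES)
partners = {partner for _, partner in network}
print(len(network), "interactions with", len(partners), "histone-binding proteins")
```

Storing each edge as an ordered (histone, partner) tuple in a set removes duplicates that differ only in the order of the two proteins, so interactions and binding partners can be counted separately.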
interactions with 200 histone-binding proteins. In addition, we used domain annotations from the Conserved Domain Database [27] to construct the domain-level cross-linking network, which contained 107 interactions from 70 different domain families. Finally, binding interfaces were extracted by mapping specific lysine residues forming inter-protein crosslinks from XL-MS data [10].
Histone high-throughput interactome. The APID database integrates PPIs from several major databases of molecular interactions for more than 1100 organisms [29]. Herein, we extracted human histone interactions from the APID database supported by “binary” methods, which provide data on direct physical interactions between proteins; inter-species interactions were excluded. We used UniProtKB accessions to identify histone proteins. A histone interaction was identified if one human histone interacted with another human non-histone protein in the APID database. As a consequence, 220 interactions between histones and 163 histone-binding proteins were extracted from the APID database to build the human histone high-throughput interactome. For most APID entries no data on binding interfaces was available.
Interactome visualization and analysis
All constructed human histone interactomes were visualized using Cytoscape [30]. Nodes were annotated with the UniProtKB accession identifiers and the BioPAX_SIF style was used in the network visualization. Histone-binding proteins were classified by their functions using the PANTHER server (http://www.pantherdb.org) [31] and further categorized into different groups using the PANTHER protein class. All human histone interactome Cytoscape session files are available at https://github.com/Panchenko-Lab/Human-histone-interactome.
The topological properties of networks were analyzed as follows. We applied the cytoHubba program to identify hub nodes in the networks by calculating the Maximal Clique Centrality (MCC) score [32], which is defined as $\mathrm{MCC}(v) = \sum_{C \in S(v)} (|C| - 1)!$, where $S(v)$ is the collection of maximal cliques which contain node $v$ and $|C|$ is
the size of the maximal clique. A clique is defined as a subset of nodes in an undirected graph where every two distinct nodes are adjacent. A maximal clique is a clique that cannot be extended by including adjacent nodes. MCC is equal to the node degree if there is no edge between the neighbors of the node $v$. We used the DyNet program [33] to identify the overlapping nodes between different networks. Other topological properties of networks, including clustering coefficient, topological coefficient, betweenness centrality and node degree, were analyzed with the Network Analyzer module in Cytoscape [30]. The node degree of a node $n$ is the number of edges linked to this node. The local clustering coefficient $C_n$ of a node $n$ is defined as $C_n = 2e_n / (k_n(k_n - 1))$, where $k_n$ is the number of immediate (directly connected) neighbors of node $n$, and $e_n$ is the actual number of connections between all neighbors of node $n$. The local clustering coefficient was then averaged over all nodes. The topological coefficient $T_n$ of a node $n$ with $k_n$ neighbors is computed as $T_n = \mathrm{avg}(J(n, m)) / k_n$, where node $m$ is a node that shares at least one immediate neighbor with node $n$. $J(n, m)$ is calculated as the number of neighbors shared between nodes $n$ and $m$, plus one if there is a direct link between nodes $n$ and $m$; it is then averaged over all nodes $m$. The betweenness centrality $C_b(n)$ of a node $n$ is computed as $C_b(n) = \sum_{s \neq n \neq t} (\sigma_{st}(n) / \sigma_{st})$, where $s$ and $t$ denote nodes in the network different from $n$, $\sigma_{st}$ denotes the number of shortest paths from $s$ to $t$, and $\sigma_{st}(n)$ is the number of shortest paths from $s$ to $t$ that include node $n$.
Identification of binding hotspots
Using the interfacial residue locations from the structural and cross-linking interactomes, we counted the number of different binding proteins with unique UniProtKB accessions for each histone interfacial residue. For the cross-linking interactome, we also included one residue before and after the cross-linked lysine residue to define binding interfacial residues. Sequences of all histone variants present in structural and cross-linking networks were extracted from UniProt [34] and HistoneDB 2.0 [24,25]. Then, we performed multiple sequence alignments for each histone family using Clustal Omega 1.2.3 [35] and further mapped the number of binding proteins for each residue onto the consensus sequence.
Results
Different histone interaction networks complement each other
To elucidate the physicochemical properties of human histone/nucleosome interactions, we construct three human histone interaction networks including structural, cross-linking and high-throughput interactomes (Figure 2 and Table SM2). First, we observe that the numbers of interactions for each histone family differ dramatically in the structural interactome at all levels of granularity (Figure 2): H3 and H4 histones have the largest number of interactions while very few H1 interactions are present. Such observations could arise from the bias of the PDB database, which contains many structures of histone H3 and H4 tail peptides with reader domains, for example, JmjC and MBT domains. On the other hand, the long disordered N- and C-terminal regions of H1 histone are difficult to characterize structurally by experiment and therefore very few H1 interactions are present in the structural interactome. Next, we compare domain-level histone structural interactomes between Homo sapiens, Xenopus laevis and Saccharomyces cerevisiae. Histone structural interactomes of Xenopus laevis and Saccharomyces cerevisiae share about one third of their nodes (five domain families in Xenopus laevis and seven domain families in Saccharomyces cerevisiae) with the Homo sapiens domain-level network (Figure SM1, Table SM3). Protein domain families with conserved interactions between all three networks include the Bromodomain, which specifically interacts with acetylated lysine, and the ASF1_hist_chap family, which comprises histone chaperone proteins and participates in both the replication-dependent and replication-independent pathways.
In contrast to the human structural interactome and in concordance with the previous study [10], we observe that H1 and H2B histones harbor the majority of interactions in the cross-linking interactome, while H2A, H3 and H4 have the fewest interactions. Such observations could be explained by the different lysine content of different histone types, since XL-MS experiments are based on Lys-Lys chemical crosslinks. As we can see from the lysine content of human canonical histones (Table SM4), H1 and H2B indeed have the highest lysine content, which is about three and two times higher than that of H3, respectively. As a consequence, the number of H3 interactions is probably underestimated by XL-MS data due to the low lysine content. Lastly, we observe that the high-throughput histone interactome has smaller differences in the number of interactions for each histone family compared to the structural and cross-linking interactomes. Overall, 6% of nodes in the structural interactome overlap with the cross-linking interactome and 32% of nodes in the structural interactome overlap with the HTP interactome.
In order to analyze histone-associated pathways using our networks, we further identify an additional layer of partners which interact with histone-binding proteins and construct so-called “global histone interactomes” (Figure 3(A), Table SM5). We systematically compare three global histone interactomes (structural, cross-linking and high-throughput) by identifying the
Figure 2. Human histone interactions at different granularity levels excluding histone-histone interactions. The
protein-level and domain-level networks are shown in the preferred layout in Cytoscape; nodes of histones and
histone-binding proteins are colored green and pink, respectively, while hub nodes among histone-binding proteins
(nodes with high node degree) are highlighted in orange. The degree sorted circle layout in Cytoscape is applied to show
the residue-level interactions, where histone residues are colored purple, yellow, red, blue and green per histone colour
convention while binding proteins are shown as pink nodes.
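The node overlap between interactomes reported in this section (for example, the share of structural-interactome nodes that also occur in the cross-linking interactome) comes down to a set intersection. The study used the DyNet program for this comparison; the sketch below is only a plain-Python illustration. The node labels are hypothetical, and normalizing by the size of the first (reference) network is an assumption about how such percentages are expressed.

```python
def overlap_fraction(reference_nodes, other_nodes):
    """Fraction of nodes in the reference network that also occur in the other network."""
    reference = set(reference_nodes)
    if not reference:
        raise ValueError("reference network has no nodes")
    shared = reference & set(other_nodes)
    return len(shared) / len(reference)


# Hypothetical toy node sets; labels are placeholders, not real accessions.
structural_nodes = {"P1", "P2", "P3", "P4", "P5"}
crosslinking_nodes = {"P5", "P6", "P7"}

fraction = overlap_fraction(structural_nodes, crosslinking_nodes)
print(f"{fraction:.0%} of structural nodes are also in the cross-linking network")
```

Because the denominator is the reference network, the measure is not symmetric: swapping the two arguments generally gives a different fraction.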
overlapping nodes (Figure SM2 and SM3) [33]. A very low fraction of overlap (~4%) has been observed between global structural and cross-linking interactomes (Figure SM2), and only 49% of nodes in the structural interactome are present in the high-throughput interactome (Figure SM3).
Histone networks are scale-free and have high modularity
Next, we systematically analyze and compare topological properties of global structural, cross-linking and high-throughput networks (Figures 3 and SM5–7). The average clustering coefficient
Figure 3. Topological properties of human histone interactomes. (a) Partners which interact with histone-binding
proteins are identified (orange nodes) and added as one additional layer to the initial networks (green and pink nodes)
to construct histone global interactomes. (b) and (c) Comparison of the network topological properties between
structural, cross-linking and high-throughput global interactomes.
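The local clustering coefficient and the topological coefficient compared in Figure 3 follow the definitions given in Methods: $C_n = 2e_n / (k_n(k_n - 1))$ and $T_n = \mathrm{avg}(J(n, m)) / k_n$. The sketch below is a pure-Python illustration on a toy graph, not the Network Analyzer implementation used in the study. Assigning a value of 0 to nodes with fewer than two neighbors is an assumed convention, and the edge list is hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}, ignoring self-loops."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def clustering_coefficient(adjacency, n):
    """C_n = 2 * e_n / (k_n * (k_n - 1)); 0.0 for nodes with fewer than two neighbors."""
    neighbors = adjacency[n]
    k = len(neighbors)
    if k < 2:
        return 0.0
    # e_n: number of edges that connect two neighbors of n
    e = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
    return 2.0 * e / (k * (k - 1))


def topological_coefficient(adjacency, n):
    """T_n = avg(J(n, m)) / k_n over nodes m sharing at least one neighbor with n."""
    neighbors = adjacency[n]
    k = len(neighbors)
    if k < 2:
        return 0.0
    j_values = []
    for m in adjacency:
        if m == n:
            continue
        shared = len(neighbors & adjacency[m])
        if shared == 0:
            continue
        # J(n, m): shared neighbors, plus one if n and m are directly linked
        j_values.append(shared + (1 if m in neighbors else 0))
    if not j_values:
        return 0.0
    return (sum(j_values) / len(j_values)) / k


# Hypothetical toy network: a triangle A-B-C with a pendant node D attached to C.
adjacency = build_adjacency([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
average_clustering = sum(clustering_coefficient(adjacency, n) for n in adjacency) / len(adjacency)
print(round(average_clustering, 3), round(topological_coefficient(adjacency, "A"), 3))
```

In this toy graph the pendant node D contributes a clustering coefficient of 0 to the network average, which is how a large population of sparsely connected nodes lowers the average values of a network.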
quantifies how well the nodes in a graph are clustered together and the topological coefficient measures the extent to which a node in the network shares neighbors with other nodes. For all networks, the majority of nodes have low clustering and topological coefficients: more than 80% and 60% of nodes, respectively, have clustering and topological coefficients less than 0.1. These nodes generally have sparse connections with the neighboring nodes. Compared to the cross-linking interactome, structural and high-throughput interactomes show higher average clustering coefficients, pointing to denser connections and possible complex formation in these networks (0.01 compared to 0.13 and 0.1) (Figure 3(B)). Among all three networks, only very few nodes have high betweenness centrality values (a measure of the centrality of a node in a graph) (Figure SM5). These nodes generally comprise histones and other regulatory proteins playing ubiquitous functions in biological processes such as ubiquitin, transporter and defense/immunity proteins. As many biological networks are scale-free [23,36,37], we show that the node degree distributions for all three types of histone networks follow a power law, since they show a remarkably strong linear association between the fraction of nodes and the node degree on a log–log plot (Figure SM6). In scale-free networks, the majority of proteins represent low degree nodes and only a small number of proteins act as hubs which play critical roles in mediating chromatin signal transduction. This makes a network relatively tolerant to perturbations as it can maintain its integrity even if the vast majority of low degree nodes are damaged [38,39].
Furthermore, we observe a strong power-law decay of values of clustering coefficients with the increasing node degree (Figure SM7). Such an observation indicates a high modularity of these networks, which have denser connections between the nodes within the modules compared to connections between different modules. It suggests the existence of at least two levels of
organization, as was previously observed in human and yeast protein interactomes [23,36,37,40]. The first level represents the separate dense modules formed by proteins with low node degrees. The second level is composed of high degree nodes such as histone proteins and others which act as hubs to mediate the global connectivity between different modules [36]. We further analyze the correlation between topological coefficient and the node degree, where a power-law decay is also clearly identified (Figure 3(c)). It demonstrates that in all three types of networks, the high degree nodes generally do not have more common neighbors compared to the nodes with low node degree, which also points to high modularity of networks.
Different networks demonstrate similar functional types of histone binding proteins
Proteins in both structural and cross-linking networks have been categorized into different groups using the PANTHER protein class (Figure 4). Although structural and cross-linking interactomes share only a small portion of nodes, as shown previously, we observe relatively small differences in terms of their functional types. However, the number of proteins in each functional category varies between the two networks. These networks mostly contain nucleic acid binding proteins, especially transcription regulatory proteins, since the vast majority of proteins are nuclear proteins (Figure 4 and SM8). Moreover, we find fewer protein-modifying enzymes and chromatin-binding/regulatory proteins in the cross-linking interactome compared to the structural interactome.
Furthermore, we identify and rank nodes in all three global histone networks with respect to their numbers of interactions. Hub nodes with MCC score ≥ 4 are shown as squares in Figure 4 and a full list of the hub nodes is provided in Table SM 6 and 7. We observe two major functional classes of hub proteins: the first class encompasses proteins playing ubiquitous functions in biological
Figure 4. Functional classification of histone-binding proteins in the global histone interactomes at protein level.
Proteins were classified using the PANTHER classification system (ten top ranked protein types are shown) and
categorized by PANTHER protein classes. Histones are represented as large green rectangular nodes while binding
proteins are shown as circular nodes and colored by their functions. Proteins which cannot be clearly classified by
PANTHER protein classes are colored grey. The hub nodes of the networks (MCC score ≥ 4) are highlighted as
medium-sized squares.
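The MCC score used to select hub nodes, $\mathrm{MCC}(v) = \sum_{C \in S(v)} (|C| - 1)!$, can be reproduced on a small graph with a few lines of Python. The study computed MCC with cytoHubba; the sketch below is an independent illustration that enumerates maximal cliques with a basic Bron–Kerbosch recursion (an assumed choice of enumeration algorithm), and the toy edge lists are hypothetical.

```python
from math import factorial


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}, ignoring self-loops."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def maximal_cliques(adjacency):
    """Yield every maximal clique (basic Bron-Kerbosch recursion without pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adjacency[v], x & adjacency[v])
            p = p - {v}  # v has been fully explored
            x = x | {v}  # exclude v from later cliques at this level
    yield from expand(set(), set(adjacency), set())


def mcc_scores(adjacency):
    """MCC(v): sum of (|C| - 1)! over all maximal cliques C that contain v."""
    scores = {v: 0 for v in adjacency}
    for clique in maximal_cliques(adjacency):
        weight = factorial(len(clique) - 1)
        for v in clique:
            scores[v] += weight
    return scores


# Hypothetical toy network: a 4-clique {H, P1, P2, P3} plus two pendant partners of H.
edges = [("H", "P1"), ("H", "P2"), ("H", "P3"), ("P1", "P2"), ("P1", "P3"),
         ("P2", "P3"), ("H", "P4"), ("H", "P5")]
scores = mcc_scores(build_adjacency(edges))
hubs = sorted(v for v, s in scores.items() if s >= 4)  # threshold from the text above
print(scores["H"], hubs)

# A star graph has no edges between neighbors, so MCC equals the node degree.
star_scores = mcc_scores(build_adjacency([("S", "L1"), ("S", "L2"), ("S", "L3")]))
```

The factorial weighting means that membership in a single large clique counts far more than many isolated edges, which is why all four members of the 4-clique pass the threshold while the pendant partners do not.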
processes and includes defense/immunity proteins, transporter proteins, cytoskeletal proteins and ubiquitin (Table SM 8 and 9). These proteins usually interact with a large spectrum of partners and are not specific to chromatin. Consistent with recent XL-MS studies [10], interactions between these proteins may occur across subcellular compartments such as nucleus-cytosol/plasma membrane, nucleus-mitochondria and nucleus-endoplasmic reticulum. Another set of hubs is specific to chromatin signaling pathways (Table SM8 and 9) and includes polycomb protein EED, transcriptional repressor protein YY1, transcription initiation factor TFIID and others.
Binding hotspots in human histones and nucleosomes
Using data on binding interfaces extracted from structural and cross-linking interactomes, we systematically analyze binding sites of all human histone variants present in PDB complex structures and cross-linking data; interactions between histones were removed. We perform multiple sequence alignments of human histone variants for each histone type and map the protein binding sites onto histone sequences using binding interfaces extracted from structural and cross-linking interactomes (Figure 5, SM9, 13–17). As can be seen in these figures, histone tails of H3 and H4 have the largest number of interactions, followed by the acidic patches on H2A and H2B. Consistent with our observations, a recent XL-MS study highlighted that a large fraction of the identified cross-links mapped to the flexible histone tails [10]. Importantly, we identify binding sites which are aligned in most histone variants; many of them belong to sites of post-translationally modified residues in H3 histone tails (not shown). However, the vast majority of binding sites are not aligned between all histone variants, pointing to possible interactions characteristic for certain histone variants. For instance, we identify binding site residues 169–180 as a variant-specific interface on macro-H2A, and indeed this interface has been found to be specifically targeted by the speckle-type POZ protein and these interactions are involved in the X-chromosome inactivation pathway [41] (Figure SM9).
Next, to identify the histone binding hotspots, defined as sites interacting with more than five different binding partners, we map the number of unique binding proteins for each residue onto the consensus sequences of the alignment using binding interfaces extracted from the structural and cross-linking interactomes (Figure 5, Figure SM 10, 11). As shown in Figure 5, binding hotspots of H1 are mostly distributed over its globular domain and certain regions of the C-terminal tail. In the structural interactome, H2A has more or less uniformly distributed binding sites, including both tails and acidic patch (Figure SM10). In contrast, the H2A acidic patch is not identified as a binding hotspot in the cross-linking interactome (Figure SM11), which could be explained by the lack of solvent accessible lysine residues around this region. Binding hotspots of H2B in the structural interactome are distributed over both tails and globular domains, while the XL-MS study indicates that about 25% of the identified inter-protein cross-links in histone interactions are located at the α-helical C-terminal region of histone H2B (Figure SM11) [10]. Both H3 and H4 have the highest number of interaction partners; their binding hotspots are concentrated within their N-terminal tails rich in post-translational modifications and, in the case of H3, in certain regions of alpha-helices in globular domains.
Finally, we systematically analyze histone binding sites in the context of the full nucleosomes. We collect 26 human nucleosome complex structures from PDB, classify them by functions and map binding interfaces onto the molecular surface of nucleosomes (Figure 6). As can be seen in Figure 6, many binding proteins, with the exception of transcription regulatory proteins, recognize nucleosomes via the localized acidic patches. At the same time, in some cases histones participate in multivalent binding (in this study we only focus on histones while nucleosomal DNA could also be involved in the recognition). For example, binding of centromere proteins involves both H4 tails and acidic patch, whereas chromatin remodelers recognize the acidic patch, H2A-C tails and H3 core regions. Proteins involved in DNA repair mostly bind nucleosomes via acidic patches and H3 core regions. Our observations are consistent with a recent screening of nucleosome interactions in mouse embryonic stem cells [42], indicating that the acidic patch and its surrounding residues are the primary binding hotspots in the nucleosome.
Discussion
Even though proteome-wide experiments prove to be useful for mapping protein networks, it remains very challenging to map chromatin-related interaction networks with high resolution due to their dynamic transient nature and the involvement of both DNA and proteins via multivalent interactions. Here we make an attempt to characterize the histone-related interactions and show that networks derived from the three different experimental sources largely complement each other. Therefore, we integrate the data from these sources. Our combined global network from structural and cross-linking interactomes comprises 987 edges and 754 nodes, and the network combined from all three interactomes includes 5308 nodes and 10,330 edges (Table SM5 and Figure SM4). Even though the three networks do not show much overlap, their topological properties are relatively similar. Structural, cross-
Figure 5. Mapping of protein binding sites onto human histones using the data extracted from structural and cross-
linking interactomes. (a) The number of binding proteins per residue mapped onto the consensus sequence of the full
alignment of human histone sequences (see Figure SM 9, 13–17). Red asterisks denote acidic patches and globular
domains are indicated by purple, yellow, red, blue and green bars per histone colour convention. The full sequence
alignment of H2A variants contains residues 1–378 (Figure SM 14) and the plot only shows the region 1–200, which
has the vast majority of histone interactions. (b) Binding hotspots are highlighted on histone structures. A residue is
defined as a binding hotspot if it interacts with more than five different proteins. The H1 structure is taken from PDB: 4QLC
and structures of H2A, H2B, H3 and H4 are from PDB: 1KX5. We illustrate H1 C- and N-terminal tails by linearly
extending them since they are not resolved in the structures.
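The hotspot definition used in Figure 5 (a residue that interacts with more than five different proteins) and the Methods rule of adding one residue before and after each cross-linked lysine can be sketched as a simple counting step. This is a simplified illustration on a single sequence; the study maps counts onto the consensus of a multiple sequence alignment, which is omitted here. The positions, partner labels and function names are hypothetical.

```python
from collections import defaultdict


def partners_per_residue(structural_contacts, crosslinks, sequence_length):
    """Count unique binding proteins per histone residue position (1-based).

    structural_contacts: iterable of (position, partner_accession) pairs.
    crosslinks: iterable of (lysine_position, partner_accession) pairs; each
        cross-linked lysine is extended by one residue on either side.
    """
    partners = defaultdict(set)
    for position, accession in structural_contacts:
        partners[position].add(accession)
    for lysine_position, accession in crosslinks:
        for position in (lysine_position - 1, lysine_position, lysine_position + 1):
            if 1 <= position <= sequence_length:  # stay inside the sequence
                partners[position].add(accession)
    return {position: len(accessions) for position, accessions in partners.items()}


def binding_hotspots(counts, threshold=5):
    """Positions bound by more than `threshold` different proteins."""
    return sorted(position for position, n in counts.items() if n > threshold)


# Hypothetical toy data for a 10-residue sequence; partner labels are placeholders.
structural = [(4, f"Q{i}") for i in range(1, 6)]  # five partners contact residue 4
crosslinked = [(5, "Q6"), (10, "Q1")]             # lysines at positions 5 and 10
counts = partners_per_residue(structural, crosslinked, sequence_length=10)
print(counts[4], binding_hotspots(counts))
```

Keeping a set of accessions per position ensures that a partner seen in several structures or cross-links is counted only once for that residue.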
linking, high-throughput networks and combined networks (Figure 3 and SM12) exhibit scale-free behavior, pointing to their robustness to perturbation, and have high modularity where identified hubs are not preferentially clustered together. High modularity and scale-free networks were previously identified in H3 tail interactomes [17] and in human and yeast proteome-wide interactomes [23,36,37,40].
As a result of our analysis of binding interfaces of different human variants, we observe a remarkably high number of residues involved in interactions with non-histone proteins: 80–90% of all residues in histones H3 and H4 are involved in at least one interaction. We argue that this can explain their high degree of evolutionary conservation. Our systematic study of structures of histone binding interfaces and their alignments uncovers two
Figure 6. Mapping of protein binding sites on human histones within nucleosomes. The nucleosome-binding
proteins are classified by their functions using the PANTHER classification system (all PDB IDs are provided in
Table SM10). The nucleosome representation is generated from PDB 1KX5 and histones are colored yellow, pink,
light blue and light green per histone colour convention. The binding interfaces are colored according to the categories
in the pie chart. The acidic patch is marked with a red circle.
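The binding interfaces mapped in Figure 6 rest on the contact criterion stated in Methods: a histone residue belongs to the interface if one of its heavy atoms lies within 5 Å of a heavy atom of the binding partner. The sketch below illustrates that criterion with brute-force distance checks. The atom tuples, coordinates and function name are hypothetical, hydrogens are identified by the element symbol "H", and treating "within 5 Å" as less than or equal to the cutoff is an assumption.

```python
from math import dist  # available in Python 3.8+


def interface_residues(histone_atoms, partner_atoms, cutoff=5.0):
    """Return sorted histone residue ids with a heavy atom within `cutoff` of the partner.

    Each atom is a tuple (residue_id, element, (x, y, z)) with coordinates in angstroms.
    """
    # Only heavy (non-hydrogen) partner atoms are considered.
    partner_coords = [xyz for _, element, xyz in partner_atoms if element != "H"]
    interface = set()
    for residue_id, element, xyz in histone_atoms:
        if element == "H" or residue_id in interface:
            continue
        if any(dist(xyz, other) <= cutoff for other in partner_coords):
            interface.add(residue_id)
    return sorted(interface)


# Hypothetical toy coordinates, not taken from any PDB entry.
histone = [
    (1, "N", (0.0, 0.0, 0.0)),   # 4 A from the partner oxygen: interface residue
    (2, "C", (10.0, 0.0, 0.0)),  # close only to a partner hydrogen: not counted
    (3, "H", (0.0, 0.0, 1.0)),   # hydrogen atom, ignored
]
partner = [
    ("x", "O", (0.0, 0.0, 4.0)),
    ("y", "H", (10.0, 0.0, 1.0)),
]
interface = interface_residues(histone, partner)
print(interface)
```

The all-against-all comparison is adequate for a toy example; for full nucleosome complexes a spatial index such as a grid or k-d tree would be the usual way to avoid quadratic cost.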
types of histone binding modes. The first type includes interactions conserved in most histone variants, such as binding sites in H3 tail regions. The second type comprises variant-specific interactions characteristic for certain histone variants, extensively studied previously [7,43], such as interactions between speckle-type POZ protein and macro-H2A and interactions of CENP-A with CENP-B and -C [44,45]. In addition, we map histone interactions in the context of the full nucleosomes. In contrast to widely distributed binding interfaces on histones, histones within nucleosomes utilize localized specific regions with an abundance of acidic patch mediated interactions, consistent with the previous studies [2,42,46–49]. Moreover, we show that distinctive binding modes are used by chromatin factors with different functions. Although the scope of our analysis is influenced by the limited data on histone and nucleosome interactions, all evidence points to the abundance and high specificity of histone interactions in a large variety of cellular processes. The modulation of these interactions by mutations and post-translational modifications merits further study.
CRediT authorship contribution statement
Yunhui Peng: Conceptualization, Methodology, Software, Validation, Investigation, Writing - original draft, Data curation. Yaroslav Markov: Conceptualization, Methodology. Alexander Goncearenco: Methodology, Software. David Landsman: Conceptualization, Supervision, Funding acquisition. Anna R. Panchenko: Conceptualization, Writing - review & editing, Supervision, Funding acquisition.
molecular and systems biology. Nat. Struct. Mol. Biol., 25, 1000–1008.
29. Alonso-Lopez, D. et al, (2019). APID database: redefining protein-protein interaction experimental evidences and binary interactomes. Database (Oxford), 2019
30. Shannon, P. et al, (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res., 13, 2498–2504.
31. Thomas, P.D. et al, (2003). PANTHER: a library of protein families and subfamilies indexed by function. Genome Res., 13, 2129–2141.
32. Chin, C.H. et al, (2014). cytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst. Biol., 8 (Suppl 4), S11.
33. Goenawan, I.H., Bryan, K., Lynn, D.J., (2016). DyNet: visualization and analysis of dynamic molecular interaction networks. Bioinformatics, 32, 2713–2715.
34. UniProt Consortium, T., (2018). UniProt: the universal protein knowledgebase. Nucleic Acids Res., 46, 2699.
35. Sievers, F., Higgins, D.G., (2018). Clustal Omega for making accurate alignments of many protein sequences. Protein Sci., 27, 135–145.
36. Stelzl, U. et al, (2005). A human protein-protein interaction network: a resource for annotating the proteome. Cell, 122, 957–968.
37. Ito, T. et al, (2001). A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl. Acad. Sci. USA, 98 (8), 4569–4574.
38. Nacev, B.A. et al, (2019). The expanding landscape of ’oncohistone’ mutations in human cancers. Nature, 567, 473–478.
39. Lindroth, A.M. et al, (2020). The mechanistic GEMMs of oncogenic histones. Hum. Mol. Genet.
40. Han, J.D. et al, (2004). Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature, 430, 88–93.
41. Zhuang, M. et al, (2009). Structures of SPOP-substrate complexes: insights into molecular architectures of BTB-Cul3 ubiquitin ligases. Mol. Cell, 36, 39–50.
42. Skrajna, A. et al, (2020). Comprehensive nucleosome interactome screen establishes fundamental principles of nucleosome binding. Nucleic Acids Res., 48, 9415–9432.
43. Shaytan, A.K., Landsman, D., Panchenko, A.R., (2015). Nucleosome adaptability conferred by sequence and structural variations in histone H2A–H2B dimers. Curr. Opin. Struct. Biol., 32, 48–57.
44. Allu, P.K. et al, (2019). Structure of the human core centromeric nucleosome complex. Curr. Biol., 29, 2625–2639 e5.
45. Zhao, H. et al, (2016). Promiscuous histone mis-assembly is actively prevented by chaperones. J. Am. Chem. Soc., 138, 13207–13218.
46. McGinty, R.K., Tan, S., (2016). Recognition of the nucleosome by chromatin factors and enzymes. Curr. Opin. Struct. Biol., 37, 54–61.
47. Kalashnikova, A.A. et al, (2013). The role of the nucleosome acidic patch in modulating higher order chromatin structure. J. R. Soc. Interface, 10, 20121022.
48. Davey, G.E. et al, (2017). Nucleosome acidic patch-targeting binuclear ruthenium compounds induce aberrant chromatin condensation. Nat. Commun., 8 (1), 1575.
49. Yunhui Peng, et al, Binding of regulatory proteins to nucleosomes is modulated by dynamic histone tails, https://doi.org/10.1101/2020.10.30.360990.