ChAHP2 and ChAHP control diverse retrotransposons by complementary activities

Marc Bühler1,3

1 Friedrich Miescher Institute for Biomedical Research, Basel 4056, Switzerland
2 Swiss Institute of Bioinformatics, Basel 4056, Switzerland
3 University of Basel, Basel 4003, Switzerland
4 Present address: Deutsches Elektronen-Synchrotron (DESY), Hamburg 22607, Germany
5 These authors contributed equally to this work.

Corresponding author: marc.buehler{at}fmi.ch

Abstract

Retrotransposon control in mammals is an intricate process that is effectuated by a broad network of chromatin regulatory pathways. We previously discovered ChAHP, a protein complex with repressive activity against short interspersed element (SINE) retrotransposons that is composed of the transcription factor ADNP, chromatin remodeler CHD4, and HP1 proteins. Here we identify ChAHP2, a protein complex homologous to ChAHP, in which ADNP is replaced by ADNP2. ChAHP2 is predominantly targeted to endogenous retroviruses (ERVs) and long interspersed elements (LINEs) via HP1β-mediated binding of H3K9 trimethylated histones. We further demonstrate that ChAHP also binds these elements in a manner mechanistically equivalent to that of ChAHP2 and distinct from DNA sequence-specific recruitment at SINEs. Genetic ablation of ADNP2 alleviates ERV and LINE1 repression, which is synthetically exacerbated by additional depletion of ADNP. Together, our results reveal that the ChAHP and ChAHP2 complexes function to control both nonautonomous and autonomous retrotransposons by complementary activities, further adding to the complexity of mammalian transposon control.

Retrotransposons make up a substantial proportion of mammalian genomes and present a potential threat due to their ability to amplify and insert into new genomic loci. The complement of retrotransposons is diverse in terms of both sequence and origin, making their regulation mechanistically challenging (Kazazian 2004; Platt et al. 2018). To overcome this diversity, cells use a vast number of different specificity factors that guide a variety of interlinked chromatin-modulating activities. One of the best-described examples of this is the repression of endogenous retroviruses (ERVs). Here, the long terminal repeat (LTR) region of the ERV, which normally acts as a promoter, is recognized by sequence-specific transcription factors (TFs) from the Krüppel-associated box zinc finger (KRAB-ZFP) protein family (Wolf and Goff 2009; Ecco et al. 2017; Imbeault et al. 2017). These TFs then recruit the corepressor TRIM28, which in turn interacts with SETDB1, guiding the deposition of H3K9me3 and establishing a transcriptionally silent state (Schultz et al. 2002; Wolf and Goff 2009; Matsui et al. 2010; Rowe et al. 2010; Quenneville et al. 2012). This process is potentiated by HP1 proteins, which interact with both TRIM28 and H3K9me3 (Lechner et al. 2000; Ayyanathan et al. 2003) and assist with recruitment of additional chromatin regulators including histone deacetylases (Schultz et al. 2001) and demethylases (Macfarlan et al. 2011), DNA methyltransferases (Smallwood et al. 2007), and chromatin remodelers such as ATRX–DAXX, the NuRD complex, MORC3, and SMARCAD1 (Schultz et al. 2001; Sadic et al. 2015; Sachs et al. 2019; Groh et al. 2021). Ultimately, this leads to removal of activating histone acetylation and methylation marks, methylation of the underlying DNA, and remodeling of the local chromatin structure, which altogether confer transcriptional repression to the targeted loci (Rowe et al. 2010; Quenneville et al. 2012; Turelli et al. 2014; Wang et al. 2022).

Additional important repressive activity is provided by the HUSH complex, which is targeted to non-LTR-containing long interspersed elements (LINEs) such as LINE1 and a subset of ERVs (Tchasovnikarova et al. 2015; Liu et al. 2018; Robbez-Masson et al. 2018). This targeting occurs via a non-DNA sequence-specific mechanism based on the absence of introns, length of the transcript, and high adenine content (Seczynska et al. 2022). Similar to KRAB-ZFP-mediated repression, HUSH represses its targets via SETDB1-dependent deposition of H3K9me3 and chromatin remodeling (Tchasovnikarova et al. 2015; Liu et al. 2018; Robbez-Masson et al. 2018; Müller et al. 2021).

This multitude of transposon recognition and silencing pathways is essential because novel insertions of retrotransposons can induce mutations causing a variety of heritable and somatic diseases (Teugels et al. 2005; Hancks and Kazazian 2016; Gonçalves et al. 2017; Qian et al. 2017; Aneichyk et al. 2018). Both ERVs and LINE1 retrotransposons are autonomous, encoding all components necessary for their own retrotransposition (Wicker et al. 2007; Platt et al. 2018). Notably, the LINE1 retrotransposition machinery can additionally support the transposition of the nonautonomous short interspersed nuclear elements (SINEs) (Dewannieux et al. 2003; Dewannieux and Heidmann 2005; Raiz et al. 2012). These elements are often derived from tRNAs or 7SL RNAs, and their regulation is less well understood (Ullu and Tschudi 1984; Daniels and Deininger 1985; Dewannieux et al. 2003; Dewannieux and Heidmann 2005; Raiz et al. 2012; Varshney et al. 2015). We recently discovered that they are partly repressed by the ChAHP complex, which unites the transcription factor ADNP, chromatin remodeler CHD4, and HP1 proteins into a stably associated module (Ostapcuk et al. 2018; Kaaij et al. 2019). Genetic removal of ADNP allows increased expression of evolutionarily less divergent SINE B2 elements in mouse embryonic stem cells (mESCs), accompanied by an increase in chromatin accessibility and CTCF binding (Ostapcuk et al. 2018; Kaaij et al. 2019). Phenotypically, defects in in vitro differentiation and mouse development have been observed upon adnp knockout, resulting in a penetrant embryonic lethality phenotype by embryonic day 9.5 (E9.5) (Mandel et al. 2007; Ostapcuk et al. 2018). In humans, even heterozygous partial truncations of ADNP that abrogate interactions with HP1 proteins result in a severe autism spectrum syndrome typified by developmental defects, compromised function of multiple organ systems, and intellectual disability (Helsmoortel et al. 2014). 
Although their underlying molecular deficiency is unclear, these striking phenotypes highlight the importance of identifying hitherto unknown retrotransposon-associated chromatin regulators.

Here we report the discovery of ChAHP2, a protein complex with composition similar to that of ChAHP. ChAHP2 is defined by ADNP2, a paralog of ADNP widely present in vertebrates (Altenhoff et al. 2019). ChAHP2 chromatin binding specificity is distinct from ChAHP, predominantly associating with ERV and LINE1 retrotransposons via HP1β-mediated binding of H3K9 trimethylated histones. Through complementary activities, the ChAHP and ChAHP2 complexes control a wide variety of molecularly disparate retrotransposons, including SINEs, LINEs, and ERVs.

Results

ADNP2 interacts with HP1β and CHD4 to form a distinct ChAHP2 complex

The domain architecture of ADNP2 resembles that of ADNP, with nine zinc fingers distributed in two N-terminal clusters and one C-terminal homeodomain. These domains are the only regions showing overall high conservation, while the putatively poorly structured linker region between the zinc finger clusters as well as the C-terminal unstructured region of ADNP are not well conserved between the two paralogs (Fig. 1A). Notably, although the zinc-coordinating residues and overall predicted fold of the zinc fingers are conserved, the residues that normally determine sequence specificity of zinc fingers vary between ADNP and ADNP2 (Fig. 1B; Supplemental Fig. S1A). In contrast, the C-terminal HP1 interaction motif (PxVxL) (Thiru et al. 2004; Mosch et al. 2011) is well conserved. Moreover, ADNP2 is copurified in CHD4 immunoprecipitations (IPs) (Ostapcuk et al. 2018), suggesting that the regions critical for both CHD4 and HP1β/γ interactions are conserved between the two paralogs. These observations prompted us to hypothesize that ADNP2 can form an alternative ChAHP complex with potentially different properties. To test this, we introduced an affinity tag consisting of Avi-3xFLAG to the C terminus of ADNP2 in mouse ES cells (Supplemental Fig. S1B). We then used these cells to perform affinity purifications with streptavidin and analyzed the samples by liquid chromatography-mass spectrometry (LC-MS), with the parental cell line serving as a negative control. The bait protein (ADNP2) was strongly enriched in the ADNP2 IPs compared with control, copurifying CHD4 and HP1β as the top two significantly enriched interactors (Fig. 1C). Several other proteins were identified as significantly enriched, but their enrichment was overall lower (Supplemental Table S1). This group included abundant proteins that commonly copurify in IPs of DNA binding factors, such as PARP1, hinting that these may be experimental artifacts (Mellacheruvu et al. 2013). 
Overall, this experiment shows that ADNP2 interacts with both HP1β and CHD4 in mouse ES cells. Such an interaction pattern is reminiscent of the ChAHP complex, with the notable difference that ADNP2 preferentially binds HP1β over HP1γ (Fig. 1D; Ostapcuk et al. 2018). Importantly, no ADNP was copurified with ADNP2 and vice versa, suggesting that their interaction networks are separate. To further probe the biochemical independence of ADNP and ADNP2, we evaluated whether the composition of either complex is affected by removal of the other. To that end, we knocked out adnp2 from endogenously edited cells expressing ADNPFKBP-3xFLAG-Avi (Supplemental Fig. S1C) and knocked out adnp from cells expressing ADNP2Avi-3xFLAG. We screened for successful gene deletion by PCR and confirmed loss of mRNA by either exon-spanning RT-qPCR or RNA sequencing (Supplemental Fig. S1D,E). No significant difference in core subunit pull-down efficiencies was apparent between the WT and KO conditions for either ADNP or ADNP2 (Supplemental Fig. S2). This indicates that competition between ADNP and ADNP2 for subunit binding does not have a major effect on complex assembly.

Figure 1.

ADNP2 interacts with HP1β and CHD4 to form a ChAHP2 complex. (A) Protein architecture schematics of ADNP and ADNP2, drawn to scale. The amino acid conservation score (Clustal/JalView) is given for the highlighted regions. (B) Predicted ADNP or ADNP2 ortholog sequences from species representing different vertebrate classes were aligned using Clustal Ω (Madeira et al. 2022), ordered by species, and visualized with JalView (Waterhouse et al. 2009). The conservation scores were scaled and are used in A. Alignment excerpts for the regions containing zinc finger 4 (counting from the N terminus) and the PxVxL motif are displayed. The putative region responsible for sequence specificity of the zinc finger is highlighted. Alignment excerpts covering all other zinc fingers are shown in Supplemental Figure S1A. (C) Cells expressing endogenously edited ADNP2Avi-3xFLAG or parental untagged cells were subjected to immunoprecipitation with streptavidin and analyzed by proteomics (n = 3). The dashed line represents the significance cutoff at FDR = 0.1 and log2Enrichment > 1. (D) Abundance of HP1 proteins recovered in the experiment described in C, expressed as iBAQ values. (E,F) His-ADNP2 and Strep-HP1β were coexpressed and His-CHD4 was expressed separately using baculovirus transductions in insect cells. The two cell pools were then mixed and lysed, followed by pull-down against Strep and analysis by electrophoresis with Coomassie staining (E) and SEC-MALS (F).

Finally, to directly confirm that ADNP2, HP1β, and CHD4 form a stable complex, we performed in vitro reconstitutions. First, we coexpressed His-ADNP2 and Strep-HP1β in insect cells and expressed His-CHD4 in separately transduced cells before mixing the two cell pools for lysis and streptactin pull-down. We analyzed these pull-downs using size exclusion chromatography with multiangle light scattering (SEC-MALS) and Coomassie staining. All three proteins were detected in one elution peak, with an estimated molecular weight consistent with an ADNP2–CHD4–HP1β complex (Fig. 1E,F). Together, these data demonstrate that ADNP2, CHD4, and HP1β form an independent bona fide complex, which we refer to as ChAHP2.

ChAHP2 binds retrotransposons

We next performed ChIP sequencing (ChIP-seq) to characterize the chromatin localization of ADNP2. Peak calling revealed 6315 regions with significant ADNP2 enrichment over input chromatin (Supplemental Table S2). These regions also showed enriched binding of CHD4 and HP1β in previously published data sets, suggesting that ChAHP2 complexes occupy these sites (Fig. 2A). Intriguingly, the majority of ADNP2 peaks were found in repeat regions, while few overlapped transcription start sites (TSSs) (Fig. 2B). Retrotransposons belonging to LTR-containing families, including several classes of ERV elements, were particularly overrepresented when compared with a randomized peak set of equal properties (Fig. 2C,D; Supplemental Fig. S3A; Supplemental Table S3). In addition, one subfamily of LINE1 elements was overrepresented (Fig. 2D; Supplemental Fig. S3A). We further supplemented these analyses with a peak-agnostic, repeat family-based approach. This agreed well with the initial analysis, showing enrichment over input at the repeat classes identified by peak calling, with both the internal sequence of ERVs and their associated LTR sequences exhibiting enrichment in ADNP2 ChIP signal (Fig. 2E). Finally, we determined ADNP2 ChIP signal distribution along repeat elements using mapping to repeat consensus sequences, which showed results similar to those of the other analyses (Fig. 2F; Supplemental Fig. S3C). Together, these data indicate that ADNP2 binds several classes of retrotransposons both internally and at their terminal sequences.
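The peak-agnostic, repeat family-based approach can be illustrated with a minimal sketch. It assumes that reads have already been counted per annotated repeat element, sums them per family, scales them to library size (counts per million), and reports a log2 ratio of ChIP over input. The function name, the input layout, and the pseudocount are illustrative assumptions and are not taken from the analysis code used in this study.

```python
import math
from collections import defaultdict


def family_enrichment(chip_counts, input_counts,
                      chip_library_size, input_library_size,
                      pseudocount=1.0):
    """Return log2(ChIP cpm / input cpm) per repeat family.

    chip_counts and input_counts map (family, element_id) -> read count.
    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    def summed_cpm(counts, library_size):
        # Sum reads over all annotated elements of a family,
        # then scale to counts per million mapped reads.
        totals = defaultdict(float)
        for (family, _element_id), n_reads in counts.items():
            totals[family] += n_reads
        return {fam: total / library_size * 1e6
                for fam, total in totals.items()}

    chip_cpm = summed_cpm(chip_counts, chip_library_size)
    input_cpm = summed_cpm(input_counts, input_library_size)
    families = set(chip_cpm) | set(input_cpm)
    return {
        fam: math.log2((chip_cpm.get(fam, 0.0) + pseudocount)
                       / (input_cpm.get(fam, 0.0) + pseudocount))
        for fam in families
    }
```

Summing before normalizing treats each family as one unit, so multimapping reads that cannot be assigned to a single insertion can still contribute to the family-level signal, provided the upstream counting step retains them.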

Figure 2.

ChAHP2 predominantly binds to ERVs and LINEs. (A) Waterfall plots of ChIP-seq counts normalized to library size centered on summits of ADNP2 peaks (mean; n = 4). (B) Upset plot of overlaps between ADNP2 peaks and select genomic features (TSSs, exons, introns, and repeats). (C) Overlap numbers between repeat annotations and ADNP2 peaks or a randomized peak set of equal properties (bootstrapped 100 times; mean ± SD). (Simple sequences) Simple repeats and low-complexity regions. (D) Same as C but further split by repeat family. (E) Summed reads over repeat annotations normalized to library size for input and ADNP2 ChIP-seq (mean ± SD; n = 4). (F) Consensus mapping traces over repeat annotations normalized to library size for input and ADNP2 ChIP-seq (mean ± SD; n = 4). For IAPEz, the IAPEz-int consensus sequence was stitched with IAPLTR1a_Mm at either end, while for MMERVK10C, MMERVK10C-int was stitched with RLTR10C. The position of the stitched LTRs is highlighted. (G) IGV genome browser shots of select regions bound by ADNP2.
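The comparison against a randomized peak set in C and D can be sketched as follows. This is a simplified illustration on a single linear coordinate system. It assumes that "equal properties" means matching peak widths only, which the text does not specify, and all function names are hypothetical.

```python
import random
import statistics


def overlaps(peak, annotations):
    """True if the half-open interval peak=(start, end) overlaps any annotation."""
    start, end = peak
    return any(start < a_end and a_start < end for a_start, a_end in annotations)


def count_overlaps(peaks, annotations):
    """Number of peaks that overlap at least one annotated interval."""
    return sum(overlaps(p, annotations) for p in peaks)


def randomized_overlap(peaks, annotations, genome_length, n_iter=100, seed=0):
    """Mean and SD of overlap counts for width-matched, randomly placed peaks.

    Assumption of this sketch: only the peak width is preserved when shuffling.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_iter):
        shuffled = []
        for start, end in peaks:
            width = end - start
            new_start = rng.randrange(0, genome_length - width + 1)
            shuffled.append((new_start, new_start + width))
        counts.append(count_overlaps(shuffled, annotations))
    return statistics.mean(counts), statistics.stdev(counts)
```

The observed value, `count_overlaps(peaks, annotations)`, is then compared with the mean ± SD returned by `randomized_overlap`. A real analysis would additionally respect chromosome boundaries and unmappable regions.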

ChAHP2 is recruited to chromatin via HP1 binding to H3K9me3

Unlike many TFs that bind readily identifiable DNA sequence motifs, no motifs accounted for >15% of ADNP2-bound sites (Supplemental Fig. S3D). In addition, these motifs were not centrally enriched within ADNP2 peaks, hinting that they are not a direct specificity determinant (Supplemental Fig. S3D). Notably, a common feature of ADNP2-bound retrotransposons is the presence of H3K9 trimethylation (Fig. 2G; Matsui et al. 2010; Karimi et al. 2011; Castro-Diaz et al. 2014). Since HP1 proteins are known to bind this histone modification (Bannister et al. 2001; Lachner et al. 2001), we hypothesized that ChAHP2 is recruited to its targets by HP1β-mediated binding to H3K9me3 histone tails. Indeed, ADNP2 chromatin binding correlates well with H3K9me3 levels genome-wide (Fig. 3A). To directly test whether HP1 is required for ChAHP2 binding to chromatin, we specifically abrogated the ADNP2–HP1β interaction by introducing point mutations in the PxVxL motif of ADNP2. We verified successful editing and homozygosity by Sanger sequencing (Supplemental Fig. S4A). As expected, wild-type ADNP2 was able to pull down HP1β and CHD4, whereas ADNP2PxVxL mutants were unable to bind HP1β above background levels (Supplemental Fig. S4B). Importantly, CHD4 binding was unaffected, confirming that the mutations specifically disrupt the ADNP2–HP1β interaction (Supplemental Fig. S4B). Next, we performed ChIP-seq for ADNP2 and ADNP2PxVxL. Mutations in the PxVxL motif resulted in a near-complete loss of ChAHP2 binding to H3K9me3-modified target regions, while H3K9me3 levels remained unchanged (Fig. 3B). These regions almost exclusively overlap LTR or LINE1 elements. Statistical analysis of significantly changing peaks corroborated this initial impression (Supplemental Fig. S4C,D). In turn, the peaks with increased binding were promoter-associated, were devoid of H3K9me3, and exhibited higher chromatin accessibility (Supplemental Fig. S4E,F). 
This behavior would be consistent with a redistribution of ChAHP2 from heterochromatin to euchromatin when HP1 is removed from the complex. Finally, regions without a significant change exhibited an intermediate accessibility and H3K9me3 profile (Supplemental Fig. S4F). Despite not being identified as statistically significant, ADNP2 binding was still overall reduced in this group (Supplemental Fig. S4G). These observations provide evidence that chromatin binding at the majority of ChAHP2 targets is dependent on HP1.

Figure 3.

ChAHP2 binds heterochromatin in an HP1- and H3K9me3-dependent manner. (A) Comparison between ADNP2 and H3K9me3 ChIP-seq signal in 1 kb genomic bins across chromosome 11. (B) Cells expressing WT or PxVxL mutated ADNP2Avi-3xFLAG before or after SETDB1 depletion were analyzed by ChIP-seq. Average ChIP counts per million (cpm) is displayed (n = 2). (C) Differential binding analysis of summed reads over repeat annotations normalized to human spike-ins comparing ADNP2PxVxL ChIP and WT controls (n = 2). (D) Summed reads over repeat annotations normalized to human spike-ins for input and a series of ChIP samples as indicated (mean; n = 2; replicates are annotated). ADNP2 and H3K9me3 ChIPs were done simultaneously and from the same material.

We next sought to test whether ChAHP2 binding depends on the presence of H3K9me3. To do this, we introduced a 2xHA-FKBP12F36V degron tag at the N terminus of endogenous SETDB1, the methyltransferase responsible for depositing H3K9me3 at LTR elements (Supplemental Fig. S5A; Matsui et al. 2010; Nabet et al. 2018), and confirmed the efficacy of SETDB1 depletion by Western blotting (Supplemental Fig. S5B). We then generated ADNP2 and H3K9me3 ChIP-seq data in untreated and SETDB1-depleted conditions. Consistent with its described roles, SETDB1 depletion resulted in a reduction of H3K9me3 at LTR elements and a modest reduction at minor target regions such as LINE1 elements (Fig. 3B,D; Matsui et al. 2010; Rowe et al. 2010; Protasova et al. 2021). Tracking these changes, ADNP2 binding was reduced at LTRs and, to a lesser extent, at LINEs (Fig. 3B). This reduction was smaller than the effect of the PxVxL mutation, likely due to incomplete loss of H3K9me3 at these elements. Supporting this interpretation, the change in ADNP2 binding strongly correlated with the change in H3K9me3 levels across the entire data set (Supplemental Fig. S5C).
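The relationship between the change in ADNP2 binding and the change in H3K9me3 can be quantified as sketched below. The sketch computes a log2 fold change per region between SETDB1-depleted and untreated conditions for each ChIP and then a Pearson correlation between the two vectors. The pseudocount and the choice of Pearson correlation are assumptions made for illustration. The text does not state which correlation measure was used.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """Per-region log2((treated + pc) / (control + pc)) for paired signal values."""
    return [math.log2((t + pseudocount) / (c + pseudocount))
            for t, c in zip(treated, control)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

With per-region signal vectors for both ChIPs, `pearson(log2_fold_change(adnp2_depleted, adnp2_untreated), log2_fold_change(k9_depleted, k9_untreated))` gives one summary value. The four argument names here are placeholders.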

To further quantify the relationship between ChAHP2 and H3K9me3, we analyzed the HP1 decoupling and SETDB1 depletion data using repeat family-wide and consensus mapping approaches. First, we performed a differential binding analysis between ADNP2 and ADNP2PxVxL mutant ChIPs across all transposon families (Fig. 3C). This revealed a significant binding reduction in the PxVxL mutant for many LTRs and LINE1s. Similarly, binding to satellite repeats (GSAT_MM) was reduced, suggesting that binding to nontransposon heterochromatin is also HP1-dependent. Consistent with the peak-based analyses (Supplemental Fig. S4C–E), no transposon class exhibited significantly increased binding (Fig. 3C). A focused look at representative LTR elements and L1MdA confirmed that ADNP2 binding dropped to input levels at H3K9 trimethylated transposons in the HP1-deficient mutant (Fig. 3D; Supplemental Fig. S5D). A reduction was also apparent upon SETDB1 depletion, again smaller in scale. Finally, there was no additional reduction in ADNP2 ChIP signal when SETDB1 was depleted in the ADNP2PxVxL background (Fig. 3D), suggesting that the two perturbations exert their effect via a shared molecular axis. Overall, these data demonstrate that HP1β-mediated binding of H3K9me3 nucleosomes targets ChAHP2 to heterochromatinized retrotransposons and potentially to all other heterochromatic regions.

ChAHP partially colocalizes with ChAHP2 at retrotransposons

These chromatin binding characteristics appear overall distinct from ChAHP, which has been shown to predominantly bind SINEs (Ostapcuk et al. 2018; Kaaij et al. 2019). We saw little evidence of ADNP2 binding at ADNP-associated SINEs, with few ADNP2 peaks overlapping these transposons and very little ADNP2 signal at ADNP peaks (Figs. 2C,D, 4A,B,E). Conversely, some ADNP does colocalize with ADNP2, specifically at regions marked by H3K9me3, though this binding is less strong than on SINEs (Fig. 4A,B). Indeed, we observed ADNP peaks (Supplemental Table S4) at select heterochromatic repeat regions more frequently than a randomized peak set (Supplemental Fig. S6A). ADNP has been previously reported to bind to H3K9me3-modified chromatin and pericentromeric heterochromatin in an HP1-dependent manner (Mosch et al. 2011; Ostapcuk et al. 2018), prompting us to assess whether ADNP is targeted to ADNP2-bound sites via the same mechanism. To do this, we generated ADNPPxVxL motif point mutants (Supplemental Fig. S6B) and performed ChIP-seq. Abrogating the HP1 interaction in this way resulted in a loss of ADNP signal specifically at H3K9 trimethylated transposons, including LTR and LINE1 families, which are bound by ADNP2 (Fig. 4C,E). Binding to satellite repeats was also significantly reduced, orthogonally validating previously published imaging data (Fig. 4C; Mosch et al. 2011). A focused look at representative heterochromatinized repeats confirmed that ADNP binding at these sites drops to near-background levels after HP1 decoupling (Fig. 4D; Supplemental Fig. S6C). Conversely, binding to SINEs is increased under these conditions (Fig. 4B–D; Supplemental Fig. S6C). This striking observation indicates that ChAHP features two chromatin binding modes: one via HP1 and H3K9me3 to heterochromatin analogous to ChAHP2, and a second H3K9me3-independent mechanism through sequence-specific recruitment to euchromatic SINEs (Mosch et al. 2011; Ostapcuk et al. 2018). 
Thus, these analyses demonstrate that ChAHP and ChAHP2 have different combinations of chromatin binding specificities, with a notable overlap at heterochromatin.

Figure 4.

ChAHP and ChAHP2 colocalize at heterochromatin. (A) Metaplots of ChIP-seq counts normalized to library size over ADNP peaks or ADNP2 peaks as indicated (mean ± SD; n = 2). (B) Waterfall plot for ChIPs, centered on a concatenated set of ADNP and ADNP2 peak summits and split by overlap with select repeat annotations as indicated (mean; n = 2). (C) Differential binding analysis of summed reads over repeat annotations normalized to library size comparing ADNPPxVxL ChIP and WT controls (n = 2). (D) Summed reads over repeat annotations normalized to library size for input and a series of ChIP samples as indicated (mean; n = 2; replicates are represented as points). (E) IGV genome browser shots for select regions for ChIPs and input as indicated.

Repressive activities of ChAHP and ChAHP2 partially overlap at LINE1 and LTR retrotransposons

The chromatin binding characteristics of the two ChAHP complexes prompted us to assess their individual and combined contributions to transposon silencing. We attempted to generate adnp/adnp2 double-KO cells but were unable to obtain homozygous clones. To circumvent this issue, we generated cells that allow inducible degradation of ADNP in an adnp2−/− background. Approximating a constitutive KO situation, we induced ADNP degradation in adnp2−/− cells continuously over 14 days of passaging and confirmed the efficacy of degradation by Western blotting (Supplemental Fig. S7A,B). At the RNA level, consistent with previous data, 36 genes were either upregulated or downregulated upon removal of ADNP (Supplemental Fig. S7C; Supplemental Table S5; Ostapcuk et al. 2018). Genetic KO of adnp2 caused misregulation of several hundred genes, with a slight tendency for upregulation (Supplemental Fig. S7C; Supplemental Table S5). Depletion of ADNP in adnp2−/− cells resulted in misregulation of several hundred genes in addition to those already observed when only ADNP2 was removed, revealing a synthetic effect (Supplemental Fig. S7C; Supplemental Table S5). None of the changing gene categories showed strong and significant patterns in terms of either GO term enrichment (Supplemental Fig. S7D), distance between the promoter and the nearest ADNP/ADNP2 peaks (Supplemental Fig. S7E), or direct association between the peaks and promoters (hypergeometric test, P < 0.01). Therefore, ChAHP and ChAHP2 likely do not regulate these genes directly.
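For reference, a hypergeometric test of association between misregulated genes and peak-associated promoters can be sketched as below. The parameterization is an assumption of this sketch, with N as all genes considered, K as the genes with a peak at their promoter, n as the misregulated genes, and k as the misregulated genes with a peak at their promoter. It is not necessarily the exact setup used in this study.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N population, K successes, n draws)."""
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probability mass from the observed overlap k up to the maximum.
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, upper + 1)) / total
```

A small returned value indicates that the observed overlap is larger than expected if misregulated genes were drawn at random with respect to peak-associated promoters.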

We next explored whether any repetitive elements were differentially expressed under the same conditions (Fig. 5A; Supplemental Table S6). Consistent with previous findings, only SINE B2 elements were significantly upregulated upon ADNP depletion (Fig. 5B; Kaaij et al. 2019). Conversely, nine repeat families were significantly upregulated in adnp2−/− cells (FDR < 0.05, |log2FoldChange| > 0.9), while two were downregulated (Fig. 5A). All these differentially expressed repeats belonged to the LTR class of retrotransposons, including families identified as ADNP2-bound in ChIP-seq, such as those corresponding to components of IAP elements and MMERVK10C. Notably, both the internal sequences of the elements and their associated LTR sequences were upregulated, suggesting that all components of these elements increase in expression. We obtained nearly identical findings when intron-associated repeat elements were excluded from the analyses (data not shown), indicating that the observed effects are not covariates of changes in gene expression. Strikingly, combined removal of ADNP and ADNP2 resulted in magnified upregulation of LTR elements when compared with ADNP or ADNP2 removal in isolation. In addition, two LINE1 element subclasses exhibited significant upregulation, with the more prominent being L1MdA. Closer inspection revealed that L1MdA elements are already slightly upregulated upon ADNP2 KO, but the upregulation only reaches significance upon removal of both ADNP and ADNP2 (Fig. 5D). Synergistic regulation was also observed for IAP elements. In contrast, MMERVK10C derepression was exclusively ADNP2-sensitive despite also being weakly bound by ADNP (Fig. 5C). Overall, there was little to no difference in H3K9me3, ADNP, and ADNP2 levels or chromatin accessibility between ERVK insertions that were sensitive to either only ADNP2 or combined ADNP/ADNP2 loss (Supplemental Fig. S8A). 
Finally, elements that were already upregulated upon ADNP2 loss tended to be more divergent, though this trend was not particularly prominent (Supplemental Fig. S8A).

Figure 5.

ChAHP and ChAHP2 repress distinct and shared repeat classes. (A) Differential expression analysis for repeat families between ADNP/ADNP2 perturbation and the corresponding unperturbed controls (n = 3). Significant hits are highlighted with colors (FDR < 0.05; |log2FoldChange| > 0.9). ADNP depletion was performed using 250 nM dTAG13 for 14 days whereas ADNP2 was genetically removed using CRISPR–Cas9. (B–D) RNA expression normalized to library size for example repeat classes (mean ± SD; n = 3) from the same data set as in A, representing classes that are specifically ADNP-dependent (B), specifically ADNP2-dependent (C), or cooperatively regulated (D).
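The significance thresholds used in A (FDR < 0.05; |log2FoldChange| > 0.9) amount to a simple two-condition filter, sketched below. The function names and the example values are hypothetical and serve only to show how the cutoffs combine. They do not reproduce the statistical model used to obtain the FDR and fold-change estimates.

```python
def classify_family(fdr, log2_fold_change, fdr_cutoff=0.05, lfc_cutoff=0.9):
    """Label one repeat family as 'up', 'down', or 'ns' (not significant)."""
    if fdr < fdr_cutoff and abs(log2_fold_change) > lfc_cutoff:
        return "up" if log2_fold_change > 0 else "down"
    return "ns"


def significant_hits(results, **cutoffs):
    """results maps family name -> (fdr, log2_fold_change); returns significant ones."""
    labels = {fam: classify_family(fdr, lfc, **cutoffs)
              for fam, (fdr, lfc) in results.items()}
    return {fam: label for fam, label in labels.items() if label != "ns"}


# Hypothetical values for illustration only; not taken from the study.
example = {
    "family_A": (0.001, 2.3),   # passes both cutoffs, upregulated
    "family_B": (0.20, 1.5),    # large change but fails the FDR cutoff
    "family_C": (0.01, 0.4),    # significant but below the effect-size cutoff
    "family_D": (0.03, -1.1),   # passes both cutoffs, downregulated
}
hits = significant_hits(example)
```

Requiring both conditions excludes families with statistically significant but small changes as well as families with large but poorly supported changes.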

Derepression of retrotransposons could be caused by compromised H3K9 trimethylation upon loss of ADNP and ADNP2. To assess this, we performed H3K9me3 ChIP-seq under ADNP and ADNP2 perturbation conditions. H3K9me3 levels at ADNP and ADNP2 peaks were not reduced by either individual or combined removal of these factors (Supplemental Fig. S8B,C). We further compared the ADNP/ADNP2-dependent derepression with gross perturbation of H3K9me3 by degrading SETDB1 (Supplemental Fig. S8D). Expression of SINEs was unaffected by loss of SETDB1, whereas LTR elements were upregulated more than an order of magnitude. L1MdA expression was also slightly elevated upon SETDB1 depletion, consistent with previous reports suggesting that SETDB1 plays a small role in H3K9 trimethylation and repression of LINE1 elements (Matsui et al. 2010; Liu et al. 2018). Collectively, these data suggest that ChAHP and ChAHP2 contribute to repression of a variety of retrotransposons with distinct but partially overlapping specificities.

Discussion

Here we have shown that ADNP2, CHD4, and HP1β coalesce to form ChAHP2, a novel stable protein complex unifying different chromatin regulatory activities. The chromatin binding properties of ChAHP2 are largely distinct from ChAHP in that its predominant recruitment mode is via HP1 and H3K9me3. In contrast to ChAHP, ChAHP2 does not bind euchromatic SINEs but is instead targeted to H3K9me3-modified LTR and LINE1 elements. In line with this sequence-agnostic recruitment mechanism, we were unable to define a sequence motif that would broadly explain the distribution and specificity of the ADNP2 ChIP-seq signal. It is possible that multiple different sequences can be bound by distinct DNA binding domains within ADNP2 (zinc fingers or homeodomain) and contribute weak but important specificity toward certain repetitive elements. Thorough in vitro biochemistry and potentially structural analyses will be required to address this question in the future. Nevertheless, the observed chromatin binding appears consequential, as removal of ADNP2 results in specific upregulation of a subset of ChAHP2-bound retrotransposons.

We also found a supporting role of ChAHP in regulating these elements. Previous studies had reported weak and promiscuous binding of ChAHP to H3K9me3-modified heterochromatin without a clear functional implication (Mosch et al. 2011; Ostapcuk et al. 2018). Since ChAHP binding at non-H3K9me3-modified cognate targets is increased upon disrupting HP1 interaction and thus H3K9me3 binding, a substantial fraction of ChAHP is likely associated with heterochromatin. This observation indirectly supports a role for ChAHP in regulation of heterochromatic loci and clarifies the biological requirement for this chromatin binding modality within the complex. This was previously unappreciated because derepression of heterochromatic targets only becomes apparent when both ADNP and ADNP2 are removed. Notably, the magnitude of this derepression is comparable with the contribution of other factors with proposed roles in LTR regulation such as MORC3, ATRX–DAXX, SMARCAD1, m6A methylation, or the recently identified TNRC18 (Sadic et al. 2015; Sachs et al. 2019; Chelmicki et al. 2021; Groh et al. 2021; Liu et al. 2021; Zhao et al. 2023). Thus, we surmise that ChAHP and ChAHP2 constitute one part of a multicomponent retrotransposon repression machinery.

Given that loss of ChAHP/ChAHP2 does not result in compromised H3K9 trimethylation, the most likely mechanism that would confer repressive activities to ChAHP complexes is chromatin remodeling by CHD4. Previously, CHD4 has been shown to contribute to transcriptional silencing via chromatin remodeling activity promoting increased nucleosome densities in a nonpositioned manner at its target loci (Morris et al. 2014; de Dieuleveult et al. 2016; Bornelöv et al. 2018). Such a mechanism would conform to the general concept of transposon control, in which a diverse group of specificity factors guides the activity of less specific chromatin-modifying effectors. This model is typified by the canonical LTR repression system harnessing sequence-specific KRAB-ZFPs to guide the activity of H3K9 methyltransferases. There are two notable differences between this system and ADNP/ADNP2. First, chromatin remodeling activity is stably incorporated into the ChAHP complexes, while the KRAB-ZFPs transiently interact with chromatin modifiers and do not appear to form stable multisubunit complexes (Helleboid et al. 2019). Second, KRAB-ZFPs rapidly evolve and massively multiply to target newly invading TEs (Imbeault et al. 2017; de Tribolet-Hardy et al. 2023), whereas ADNP and ADNP2 are remarkably well conserved since their inception in agnathans. Instead, the recognition of potentially harmful genetic elements in a largely DNA sequence-agnostic manner, which is exerted by the H3K9me3 affinity of HP1 proteins within the ChAHP complexes, is conceptually reminiscent of the HUSH complex (Tchasovnikarova et al. 2015; Seczynska et al. 2022). HUSH recognizes nascent intronless transcripts (a feature typical for TEs) and chromatin modifications rather than the DNA sequence itself. However, like KRAB-ZFPs, HUSH interacts with chromatin modifiers only transiently. 
Thus, ChAHP and ChAHP2 possess a unique combination of chromatin binding modes and functional properties representing previously unknown components of the LTR, LINE1, and SINE repression machineries.

Materials and methods

Cell culture

Mouse embryonic stem cells (129 × C57BL/6 background) with BirA and Cre insertions in the Rosa26 locus (Flemr and Bühler 2015; Ostapcuk et al. 2018) were cultured on gelatin-coated dishes in ES medium containing DMEM (Gibco 21969-035) supplemented with 15% fetal bovine serum (FBS; Gibco), 1× nonessential amino acids (Gibco), 1 mM sodium pyruvate (Gibco), 2 mM L-glutamine (Gibco), 0.1 mM 2-mercaptoethanol (Sigma), 50 mg/mL penicillin, 80 mg/mL streptomycin, 3 μM glycogen synthase kinase (GSK) inhibitor (Calbiochem D00163483 or Sigma CHIR99021), 10 μM MEK inhibitor (Tocris PD0325901), and homemade LIF at 37°C in 5% CO2.

Genome editing

Cells were trypsinized, counted, seeded, and immediately transfected using Lipofectamine 3000 (Invitrogen) according to the manufacturer's instructions. Generally, 300,000 cells were seeded in 6 well plates and transfected with a total of 1–1.5 μg of DNA. For genetic knockout, plasmids encoding gRNAs targeting the N terminus and the C terminus, Cas9, and a puromycin resistance cassette were cotransfected for 24 h before selection with 2 μg/mL puromycin for a further 24–36 h. For endogenous tagging, a plasmid harboring the desired homology repair cassette was included. Combined CRISPR–Cas9/TALEN editing was done as described previously (Ostapcuk et al. 2018). The cells were then trypsinized and counted, and 15,000 cells were seeded on a 10 cm dish for colony formation without puromycin selection. When colonies were sufficiently large (4–8 days), they were manually picked and split into two 96 well plates for screening and expansion. Screening for both positive editing events and unedited loci was performed by PCR and Sanger sequencing of the products. Where applicable, further confirmation was done using Western blotting, RT-qPCR, and RNA sequencing.

In vitro protein purification and analysis

For cloning, cDNA encoding full-length human ADNP2 (amino acid residues 1–1131) was PCR-amplified and cloned into a pFastBac-derived vector (Invitrogen) in-frame with an N-terminal His6 tag. An expression construct encoding full-length human HP1β (amino acid residues 1–185) was generated by amplification of cDNA and cloning into a pFastBac-derived vector in-frame with an N-terminal Strep tag II. cDNA encoding full-length human CHD4 (amino acid residues 1–1912) was amplified and cloned into a pAC8-derived vector in-frame with an N-terminal His6 tag (Abdulrahman et al. 2009). Baculoviruses for protein expression were generated in Spodoptera frugiperda Sf9 cells using the Bac-to-Bac method for pFastBac-derived vectors or by cotransfection with viral DNA for pAC8-based vectors. After one round of virus amplification in Sf9 cells, Trichoplusia ni High5 cells were infected with the respective baculovirus (150 µL of virus per 10 mL of High5 cells at a density of 2 × 10⁶ cells/mL) and collected 48 h after infection. Cells were lysed by sonication in 50 mM Tris (pH 7.5), 300 mM NaCl, 5 mM β-mercaptoethanol, 0.1% Triton X-100, 1 mM PMSF, and 1× PIC (Sigma-Aldrich). The cleared lysate was passed over a Strep-Tactin sepharose (IBA) column. The bound complex was eluted in 50 mM Tris-HCl (pH 7.5), 100 mM NaCl, 5 mM β-mercaptoethanol, and 2.5 mM desthiobiotin.

Analytical size exclusion chromatography coupled to multiangle light scattering (SEC-MALS)

Thirty-eight microliters of sample at ∼4–5 mg/mL of protein was injected onto a Superose 6 Increase 10/300 GL column (Cytiva) in 50 mM HEPES-OH (pH 7.4), 150 mM NaCl, and 0.5 mM TCEP using an Agilent Infinity 1260 II HPLC system. In-line refractive index and light-scattering measurements were performed using a Wyatt Optilab T-rEX refractive index detector and a Wyatt miniDAWN Treos 3 light-scattering detector. System control and analysis were carried out using Wyatt Astra 7.3.1 software. System performance was checked with BSA.

Western blotting

Protein samples in lithium dodecyl sulfate (LDS) sample buffer were separated by standard SDS-PAGE on 4%–12% Bis-Tris gradient gels (Novex Bolt, Invitrogen). Separated proteins were transferred onto PVDF membranes (Millipore) in transfer buffer (Bjerrum–Schaeffer–Nielsen + 0.4% SDS) using a semidry transfer procedure (TransBlot Turbo, Bio-Rad) with transfer parameters 1.3 A, 25 V, and 12 min for one gel. The membranes were blocked in 3%–5% skim milk (Sigma) dissolved in Tris-buffered saline + 0.2% (v/v) Tween-20 (TBST). Antibody incubations were performed with antibodies diluted in blocking solution to empirically determined concentrations for a minimum of 1 h up to a maximum of 24 h. When reprobing, membranes were first treated with 0.01% NaN3 for 30–60 min to quench the HRP from previous staining rounds. Between each antibody incubation, the membranes were washed for a minimum of 45 min in TBST with at least four buffer exchanges. The horseradish peroxidase system (Immobilon, Millipore) coupled to camera-based detection (Agilent Technologies AI600) was used to visualize protein bands.

Immunoprecipitations

Cells from one 10 cm dish per condition were harvested by trypsinization before centrifugation at 200g for 5 min. Cells were washed once in room temperature PBS and snap-frozen or immediately taken forward for lysis. Pellets were lysed in NP-40 lysis buffer supplemented with protease inhibitors and benzonase (20 mM Tris-HCl at pH 7.4, 150 mM NaCl, 1% [v/v] NP-40, 0.1% [v/v] sodium deoxycholate, 1× HALT protease inhibitor cocktail, 200 U of Turbo benzonase [Millipore]) for 30 min at 12°C and cleared by centrifugation at 16,000g for 20 min at 4°C. Protein concentration in the lysates was determined by Bradford assay against BSA, and 1–3 mg of protein lysate equivalent was loaded onto 10 μL of bead slurry (Dynabeads, GE healthcare) prewashed twice with lysis buffer. For FLAG IPs, 2 μg of antibody was added per 1 mg of protein. Beads were incubated with lysate for 2 h at 4°C and washed twice with lysis buffer and twice with wash buffer (20 mM Tris-HCl at pH 7.4, 150 mM NaCl, 0.1% [v/v] NP-40) before addition of LDS. All bead separation steps were done using magnetic racks.

Proteomics

For IP-MS, immunoprecipitations were performed as normal with the addition of two washing steps without detergent (20 mM Tris at pH 7.5, 150 mM NaCl), followed by on-bead digestion. IP-MS in Figure 1B was exceptionally performed in 350 mM NaCl, and streptavidin M280 Dynabeads (Invitrogen) were used for pull-down, with all other steps remaining the same. Beads were resuspended by vortexing in 5 µL of digestion buffer (3 M GuaHCl, 20 mM EPPS at pH 8.5, 10 mM CAA, 5 mM TCEP), and 1 µL of 0.2 mg/mL LysC protease (Promega) in 50 mM HEPES (pH 8.5) was added. Proteins were digested for 2 h at room temperature with rotation. The samples were diluted with 17 µL of 50 mM HEPES (pH 8.5) and digested with 1 µL of 0.2 mg/mL trypsin (Promega) in 0.2 mM HCl at 37°C with interval mixing at 2000 rpm for 30 sec every 15 min.

For Figure 1B, digested peptides were acidified with 0.8% TFA (final) and analyzed by LC–MS/MS on an EASY-nLC 1000 (Thermo Scientific) with a two-column setup. Peptides were applied on an Acclaim PepMap 100 (Thermo Scientific) C18 trap column (75 µm ID × 2 cm, 3 µm) in 0.1% formic acid and 2% acetonitrile in H2O at a constant pressure of 80 MPa and separated by a linear gradient of 3 min of 2%–6% buffer B in buffer A; 40 min of 6%–22%, 9 min of 22%–28%, 8 min of 28%–36%, and 1 min of 36%–80% buffer B in buffer A; and 14 min of 80% buffer B in buffer A (buffer A: 0.1% formic acid; buffer B: 0.1% formic acid in acetonitrile) on an EASY-Spray column ES801 (50 µm ID × 15 cm, 2 µm) (Thermo Scientific) mounted on a DPV ion source (New Objective) connected to an Orbitrap Fusion (Thermo Scientific) at a flow rate of 150 nL/min. Data were acquired using 120,000 resolution for the peptide measurements in the Orbitrap and a top T (3 sec) method with HCD fragmentation for each precursor and fragment measurement in the ion trap following the manufacturer's guidelines (Thermo Scientific).

For Supplemental Figure S2, we used an Orbitrap Fusion LUMOS (Thermo Fisher Scientific) with VanquishNeo-nLC and an easy source with a 75 μm × 15 cm EasyC18 column. The samples were loaded onto a C18 0.3 mm × 5 mm trap, and backward flush was used for the analysis. The gradient was 0–3 min at 2%–4% buffer B in buffer A, 3–43 min at 4%–20%, 43–58 min at 20%–30%, 58–66 min at 30%–36%, 66–68 min at 36%–45%, 68–69 min at 45%–100%, and 69–75 min at 100% buffer B (buffer A: 0.1% FA in H2O; buffer B: 0.1% FA and 80% MeCN in H2O) at room temperature, and the flow rate during the gradient was 350 nL/min.

Peptides were identified with MaxQuant version 1.5.3.8 using the search engine Andromeda (Cox et al. 2011). The mouse subset of the UniProt version 2017_04 or 2021_05 combined with the contaminant DB from MaxQuant was searched, and the protein and peptide FDR values were set to 0.05. Statistical analysis was done in Perseus version 1.5.2.6 (Tyanova et al. 2016) or using limma within the einProt package (version 0.5.13) (Soneson et al. 2023). Results were filtered to remove reverse hits, contaminants, and peptides found in only one sample. Missing values were imputed, and potential interactors were visualized in volcano plots.
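The filtering step described above can be illustrated with a minimal Python sketch. It is not the Perseus or einProt implementation; the record keys (`reverse`, `contaminant`, `intensities`) and the function name are hypothetical and chosen only for illustration, and a missing measurement is assumed to be encoded as `None`.

```python
def filter_hits(rows, min_samples=2):
    """Drop reverse hits, contaminants, and entries detected in fewer than
    `min_samples` samples (hypothetical record layout, see lead-in)."""
    kept = []
    for row in rows:
        if row["reverse"] or row["contaminant"]:
            continue  # decoy or known contaminant entry
        detected = sum(value is not None for value in row["intensities"])
        if detected >= min_samples:
            kept.append(row)
    return kept
```

Imputation of the remaining missing values would follow this filter, so that entries seen in only one sample do not enter the statistics.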

ChIP-seq

ChIP experiments were performed with at least two different clones from endogenously tagged cell lines. For ADNP and ADNP2, the ChIPs were performed using antibodies against the affinity tag, and for H3K9me3, the ChIPs were performed using a specific antibody as detailed further below. Harvesting was performed by trypsinization, and cells were counted for each sample. For ADNP2 ChIPs, 2 × 10⁷ cells were collected, and for all other ChIPs, 1 × 10⁷ were collected. The cells were cross-linked for 8 min at room temperature in 10 mL of PBS supplemented with 1% formaldehyde (Sigma F8775). Cross-linking was quenched by adding glycine to a final concentration of 0.125 M and incubating for 1 min at room temperature and for 3 min on ice. Cells were pelleted by centrifugation at 500g for 3 min at 4°C, and the pellet was lysed in 10 mL of lysis buffer A (50 mM HEPES at pH 8.0, 140 mM NaCl, 1 mM EDTA, 10% glycerol, 0.5% NP40, 0.25% Triton X-100) for 10 min on ice. After centrifugation, the pellet was resuspended in 10 mL of buffer B (10 mM Tris at pH 8, 1 mM EDTA, 0.5 mM EGTA, 200 mM NaCl) and incubated for 5 min on ice. The samples were centrifuged at 500g for 3 min at 4°C, and the pellets were lysed in 180 μL of buffer C (50 mM Tris at pH 8, 5 mM EDTA, 1% SDS, 100 mM NaCl) for 2 min at room temperature and for 10 min on ice. The lysates were diluted in 1.6 mL of ice-cold TE buffer and sonicated in 15 mL tubes twice for 10 cycles of 30 sec on/30 sec off at 4°C (Bioruptor Pico, Diagenode). Next, 200 μL of 10× ChIP buffer (0.1% SDS, 10% Triton X-100, 12 mM EDTA, 167 mM Tris-HCl at pH 8, 1.67 M NaCl) was added, and the chromatin was transferred into 2 mL Eppendorf tubes before centrifugation at 16,000g for 10 min at 4°C. Five percent of the sheared chromatin was reserved for the input control, while the rest was transferred into fresh tubes. Generally, beads were prewashed in 1× ChIP buffer and added to the sonicated chromatin alongside different amounts of antibody, depending on the ChIP.
For ADNP2, we used 20 μL of Protein G Dynabeads and 2 μL of α-FLAG. For H3K9me3, we used 20 μL of Protein G Dynabeads and 2 μL of α-H3K9me3. For ADNP, 20 μL of Protein G Dynabeads and 20 μL of Protein A Dynabeads were premixed and washed twice with 1× ChIP buffer, and 3 μL of α-FLAG was used. Samples were incubated for 4 h at 4°C. ChIPs were washed for 1 min per wash: four times with RIPA (10 mM Tris-HCl at pH 8.0, 1 mM EDTA at pH 8.0, 140 mM NaCl, 1% Triton X-100, 0.1% SDS, 0.1% Na-deoxycholate), twice with RIPA500 (10 mM Tris-HCl at pH 8.0, 1 mM EDTA at pH 8.0, 500 mM NaCl, 1% Triton X-100, 0.1% SDS, 0.1% Na-deoxycholate), twice with Li wash buffer (10 mM Tris-HCl at pH 8.0, 1 mM EDTA at pH 8.0, 250 mM LiCl, 0.5% NP-40, 0.5% Na-deoxycholate), and once with TEplus (10 mM Tris-HCl at pH 8.0, 1 mM EDTA). Beads were transferred to a fresh tube during the last wash, and wash buffer was completely removed before adding 75 μL of elution buffer (10 mM Tris-HCl at pH 8.0, 1 mM EDTA at pH 8.0, 150 mM NaCl, 1% SDS) and incubating for 20 min at 65°C with constant shaking. Elution was repeated once more with 75 μL of elution buffer for 20 min, eluates were pooled, and 2 μL of 20 μg/μL RNase A was added and incubated for 1 h at 37°C. Next, 2 μL of 20 mg/mL Proteinase K was added, and samples were incubated for 2 h at 55°C followed by decross-linking for 6 h at 65°C. Input samples were adjusted to 150 μL total volume with elution buffer and processed equivalently to ChIP samples. DNA was purified by adding 30 μL of AMPure XP beads, 9 μL of 5 M NaCl, and 190 μL of isopropanol and incubating for 10 min at room temperature after thorough mixing. The beads were collected on a magnetic rack and washed twice with 80% EtOH, and DNA was eluted in 30 μL of 10 mM Tris (pH 8.0) for 5 min at 37°C. Twenty-five microliters of ChIP DNA or 10 ng of input DNA was used to generate libraries using the NEBNext Ultra II library preparation kit for Illumina (NEB).
Reactions were scaled down to half; otherwise, processing was according to the manufacturer's manual. Libraries were sequenced as 51 bp paired-end reads on a NovaSeq6000 instrument (Illumina), 75 bp paired-end reads on a NextSeq2000 device (Illumina), or 50 bp single-end reads on a HiSeq2500 device (Illumina).
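As a worked check of the chromatin dilution described in this section (180 μL of buffer C, 1.6 mL of TE, and 200 μL of 10× ChIP buffer), the following Python sketch estimates the final detergent and salt concentrations before immunoprecipitation. It is an approximation under stated assumptions: TE is taken to contribute no SDS, Triton X-100, or NaCl, and the pellet volume is ignored. The helper name `final_conc` is hypothetical.

```python
def final_conc(components, total_volume):
    """Final concentration from (volume, concentration) pairs mixed to total_volume."""
    return sum(volume * conc for volume, conc in components) / total_volume

# Volumes in microliters, as given in the text.
v_buffer_c, v_te, v_chip10x = 180, 1600, 200
total = v_buffer_c + v_te + v_chip10x  # 1980 microliters

# Percent values for detergents, mM for NaCl (assumes TE adds none of these).
sds_pct = final_conc([(v_buffer_c, 1.0), (v_chip10x, 0.1)], total)
triton_pct = final_conc([(v_chip10x, 10.0)], total)
nacl_mM = final_conc([(v_buffer_c, 100.0), (v_chip10x, 1670.0)], total)

print(f"SDS {sds_pct:.2f}%, Triton X-100 {triton_pct:.2f}%, NaCl {nacl_mM:.0f} mM")
```

Under these assumptions the mixture comes out at roughly 0.1% SDS, 1% Triton X-100, and about 178 mM NaCl, which is close to the detergent composition of the RIPA wash buffer.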

ATAC-seq

ATAC-seq was performed in biological triplicates as previously described (Ostapcuk et al. 2018).

RNA-seq

RNA was prepared as described for RT-qPCR. Libraries were prepared using the Illumina stranded total RNA library preparation, including a ribosomal RNA depletion step, and were sequenced on the Illumina NovaSeq 6000 (51 nt paired-end reads).

RT-qPCR

RNA was isolated from cells using the Absolutely RNA miniprep kit (Agilent) according to the manufacturer's instructions, including genomic DNA prefiltering and DNase I treatment. The concentration was determined with the RNA broad range reagents on a Qubit 2.0 system according to the manufacturer's instruction. Reverse transcription was performed by adding Primescript II master mix (Takara) to 1× concentration and incubating for 15 min at 37°C, followed by enzyme inactivation for 5 sec at 85°C. qPCR was performed with the SSO advanced Bio-Rad qPCR master mix using an amount of inactivated RT mixture corresponding to 40–200 ng of total RNA (depending on the experiment) with 0.4 μM primers in a CFX96 system (Bio-Rad). The cycling parameters were always 30 sec at 95°C and 40 cycles of 5 sec at 95°C and 15 sec at 60°C, and melt curve of 65°C–95°C.

Computational methods

Read mapping

Reads were mapped to the mouse mm10 genome or a combined mouse/human (mm10/GRCh38) genome (for the samples including spike-ins) using STAR version 2.7.3 (Dobin et al. 2013), allowing up to 10,000 multimapping reads, reporting one multimapper at a random location for ChIP-seq analyses.

The parameters used for ChIP were STAR --runMode alignReads --outSAMtype BAM SortedByCoordinate --readFilesIn R1.fastq.gz R2.fastq.gz --readFilesCommand zcat --genomeDir mm10_hg38Spike_refSTAR --runThreadN 10 --alignIntronMax 1 --alignEndsType EndToEnd --outFilterType Normal --seedSearchStartLmax 30 --outFilterMultimapNmax 10000 --outSAMattributes NH HI NM MD AS nM --outMultimapperOrder Random --outSAMmultNmax 1 --outSAMunmapped Within --outFileNamePrefix _ --clip3pAdapterSeq CTGTCTCTTATACACATCT AGATGTGTATAAGAGACAG --outBAMsortingBinsN 100.

The parameters used for RNA-seq were STAR --runMode alignReads --outSAMtype BAM SortedByCoordinate --readFilesIn R1.fastq.gz R2.fastq.gz --readFilesCommand zcat --genomeDir tar2_7_3a_GRCm38.primary_assembly_gencodeM23_spliced_sjdb50 --runThreadN 10 --outFilterType BySJout --outFilterMultimapNmax 10000 --outFilterMismatchNmax 3 --winAnchorMultimapNmax 20000 --alignMatesGapMax 350 --seedSearchStartLmax 30 --alignTranscriptsPerReadNmax 30000 --alignWindowsPerReadNmax 30000 --alignTranscriptsPerWindowNmax 300 --seedPerReadNmax 3000 --seedPerWindowNmax 300 --seedNoneLociPerWindow 1000 --outSAMattributes NH HI NM MD AS nM --outMultimapperOrder Random --outSAMmultNmax 10000 --outSAMunmapped Within.
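For scripted reruns, the ChIP alignment settings quoted above can be assembled as an argument list. The following is a minimal Python sketch rather than the authors' pipeline: the function name `star_chip_args` is hypothetical, only flags quoted in the text are used, and the command is built and printed but not executed.

```python
import shlex

def star_chip_args(read1, read2, genome_dir, threads=10, prefix="_"):
    """Build the STAR argument list for ChIP-seq reads (flags as quoted in the text)."""
    return [
        "STAR", "--runMode", "alignReads",
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--readFilesIn", read1, read2,
        "--readFilesCommand", "zcat",
        "--genomeDir", genome_dir,
        "--runThreadN", str(threads),
        "--alignIntronMax", "1",             # commonly used to suppress spliced alignments
        "--alignEndsType", "EndToEnd",
        "--outFilterType", "Normal",
        "--seedSearchStartLmax", "30",
        "--outFilterMultimapNmax", "10000",  # tolerate reads from highly repetitive loci
        "--outSAMattributes", "NH", "HI", "NM", "MD", "AS", "nM",
        "--outMultimapperOrder", "Random",
        "--outSAMmultNmax", "1",             # report one randomly chosen location per read
        "--outSAMunmapped", "Within",
        "--outFileNamePrefix", prefix,
        "--clip3pAdapterSeq", "CTGTCTCTTATACACATCT", "AGATGTGTATAAGAGACAG",
        "--outBAMsortingBinsN", "100",
    ]

args = star_chip_args("R1.fastq.gz", "R2.fastq.gz", "mm10_hg38Spike_refSTAR")
print(shlex.join(args))
```

Keeping the arguments as a list avoids shell quoting problems and makes the key difference from the RNA-seq settings explicit: the ChIP alignment reports a single random location per multimapping read.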

We used RepeatMasker (options: -species “Mus musculus”) (http://repeatmasker.org) to annotate repeats and extract consensus repeat sequences. Reads that overlapped repeat annotations were extracted and aligned to repeat consensus sequences using bowtie2 (version: 2.3.5.1) (Langmead and Salzberg 2012) with options: -q -D 20 -R 3 -N 1 -L 20 -i S,1,0.50 --local -p 20 --no-unal.

Identification and annotation of ADNP2/ADNP binding sites (peak finding)

To identify peaks in ADNP2 or ADNP ChIPs, we pooled all ChIP replicates and used the callpeak function of MACS2 (version 2.2.7.1) (Zhang et al. 2008). Peaks were resized to span 300 bp around the peak summit. To retain only significantly enriched peaks, we kept those peaks where the ChIP was enriched >1.2-fold over input in at least three replicates.
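The resizing and replicate filter described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' code: the function names are hypothetical, signals are assumed to be library size-normalized values per replicate with a matched input value for each, and the pseudocount that guards against division by zero is our addition.

```python
def resize_to_summit(summit, width=300):
    """Return a (start, end) window of `width` bp centered on the summit."""
    start = max(0, summit - width // 2)
    return start, start + width

def keep_peak(chip, inputs, min_fold=1.2, min_reps=3, pseudo=1.0):
    """True if ChIP over input exceeds `min_fold` in at least `min_reps` replicates."""
    passing = sum(
        (c + pseudo) / (i + pseudo) > min_fold
        for c, i in zip(chip, inputs)
    )
    return passing >= min_reps
```

For example, a peak with ChIP values of 10, 12, 9, and 2 against an input of 4 in every replicate passes in three replicates and is kept, whereas the same peak with only two enriched replicates is discarded.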

We used Gencode version M23 annotation and defined TSS regions as the 300 bp upstream of each annotated transcription start site (TSS), ending at the TSS itself. To find overlaps between peaks and TSSs or repeat elements, we used the GenomicRanges R package (Lawrence et al. 2013). We also generated 100 sets of randomly distributed regions matching our peak sets in number and size and calculated overlaps with TSSs and repeat annotations using these regions. Repeat annotation overlaps were summed up based on the repeat_name column of the RepeatMasker output (generated as described above).
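The random background comparison described above can be sketched as follows. This Python example is a simplified stand-in for the GenomicRanges-based analysis: it assumes a single chromosome of known length, equally sized peaks, and half-open intervals, and all function names are hypothetical.

```python
import random

def random_regions(n, width, chrom_len, rng):
    """Draw n regions of fixed width uniformly along one chromosome."""
    starts = [rng.randrange(0, chrom_len - width + 1) for _ in range(n)]
    return [(start, start + width) for start in starts]

def count_overlapping(regions, features):
    """Count regions that overlap at least one feature (half-open intervals)."""
    return sum(
        any(r_start < f_end and f_start < r_end for f_start, f_end in features)
        for r_start, r_end in regions
    )

def background_overlaps(peaks, features, chrom_len, n_sets=100, seed=0):
    """Overlap counts for n_sets random sets matching the peaks in number and size."""
    rng = random.Random(seed)
    width = peaks[0][1] - peaks[0][0]
    return [
        count_overlapping(random_regions(len(peaks), width, chrom_len, rng), features)
        for _ in range(n_sets)
    ]
```

Comparing `count_overlapping(peaks, features)` with the distribution returned by `background_overlaps` indicates whether an annotation class is overrepresented among the peaks relative to chance.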

ChIP-seq analysis

To determine differences in ADNP2 or ADNP ChIP signals in WT and mutant cells, we used QuasR (Gaidatzis et al. 2015) to count the number of reads in peaks or repeat elements, and edgeR (Robinson et al. 2010) for differential count analysis and normalization to library size (using, if available, the human spike-in total read counts as library size, TMM normalization, and a prior.count of 3). To display ChIP signal intensities (as counts per million [cpm]) in heat maps or metaplots or to calculate cpm in sliding windows across a chromosome, we used the MiniChip R package (https://github.com/fmi-basel/gbuehler-MiniChip). We used MEME-ChIP version 5.5.5 (Machanick and Bailey 2011) to search for motifs in all 6315 ADNP2 peaks using the sequence 500 bp around the peak summit and the option -maxw 25.
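The idea of using spike-in read totals as the library size can be illustrated with a short Python sketch. It is deliberately simplified and is not the edgeR implementation: edgeR scales the prior count by library size and applies TMM factors, neither of which is reproduced here, and the function names are hypothetical.

```python
import math

def cpm(count, library_size):
    """Counts per million for a given library size (e.g., total spike-in reads)."""
    return count / library_size * 1e6

def log2_fold_change(count_a, lib_a, count_b, lib_b, prior_count=3):
    """log2 ratio of cpm values after adding a prior count to each raw count."""
    a = cpm(count_a + prior_count, lib_a)
    b = cpm(count_b + prior_count, lib_b)
    return math.log2(a / b)
```

With equal library sizes, 97 versus 47 reads in a peak gives a log2 fold change of 1 after the prior count is added. If the first sample has twice the spike-in total, the same counts give 0, which is the intended effect of normalizing to an external reference.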

RNA-seq analysis

To count the number of uniquely mapping reads in genes, we used featureCounts from the Rsubread (Liao et al. 2019) package on Gencode version M24 gene annotation. We used TEtranscripts (Jin et al. 2015) to quantify read counts within transposable elements (TEs) with a TE annotation for mm10 downloaded from https://labshare.cshl.edu/shares/mhammelllab/www-data/TEtranscripts/TE_GTF/mm10_rmsk_TE.gtf.gz and the Gencode version M24 gene annotation with the options --format BAM --sortByPos --stranded reverse --mode multi. We then calculated differential expression and counts per million using DESeq2 (Love et al. 2014).

To find enriched gene ontology terms among regulated genes, we used the enrichGO function of the clusterProfiler (Yu et al. 2012) R package.

Data and code availability

All NGS data generated for this study have been deposited at NCBI GEO and are available under accession number GSE253069. Custom scripts for data analysis are available on GitHub (https://github.com/xxxmichixxx/ChAHP2). The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the Proteomics Identification Database (Perez-Riverol et al. 2021) partner repository with the data set identifiers PXD048314 and PXD048310. Other tools used are indicated in the respective Materials and Methods sections.

Competing interest statement

The Friedrich Miescher Institute (FMI) for Biomedical Research receives significant financial contributions from the Novartis Research Foundation. Published research reagents from the FMI are shared with the academic community under a material transfer agreement (MTA) having terms and conditions corresponding to those of the uniform biological material transfer agreement (UBMTA).

Acknowledgments

We thank the members of the Bühler laboratory for their constant support and discussions. Special thanks to Yukiko Shimada and Nathalie Laschet for technical support. We are also grateful to the Friedrich Miescher Institute Functional Genomics Facility for library construction and next-generation sequencing. This work was supported by the Novartis Research Foundation, the Boehringer Ingelheim Fonds, and the Swiss National Science Foundation (SNSF; grant 310030_188835).

Author contributions: J.A. and A.P. generated cell lines, designed and performed most experiments, and analyzed data. M.S. performed the bulk of bioinformatics analyses. J.A. and M.S. generated the figures. F.M. performed and advised experiments and generated cell lines. A.B. performed protein purifications and in vitro biochemistry analyses. G.K. performed SEC-MALS. D.H. acquired and analyzed the mass spectrometry data. A.A. and L.K. generated cell lines. M.B. conceived and supervised the study and secured funding. J.A., F.M., and M.B. wrote the manuscript. All authors discussed the results and commented on the manuscript.

Footnotes

  • Received May 14, 2024.
  • Accepted June 7, 2024.

This article, published in Genes & Development, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
