Gkac 1143
21 12131–12148
https://doi.org/10.1093/nar/gkac1143
Received May 18, 2022; Revised November 03, 2022; Editorial Decision November 11, 2022; Accepted November 17, 2022
* To whom correspondence should be addressed. Tel: +47 228 40 561; Email: [email protected]
© The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which
permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
12132 Nucleic Acids Research, 2022, Vol. 50, No. 21
target size of functional elements, technical shortcomings, and their composite effect with small individual effect size on multiple regulatory regions, e.g. slightly altering, but not obliterating, protein-DNA interactions (4,8). Furthermore, while high-impact driver mutations are typically found and reported, medium-impact putative passenger mutations can have an aggregated effect on tumorigenesis, beyond the already annotated driver events (9).
Gene expression is mainly regulated at the transcriptional level by the binding of transcription factors (TFs) to promoters (cis-regulatory regions surrounding genes’ transcription start sites, TSSs) and enhancers (cis-regulatory regions distal to genes) at TF binding sites (TFBSs) (10,11). Most of the studies that predict noncoding driver muta-
Despite active research on post-transcriptional regulation and the identification of miRNAs and their targets (35), the understanding of miRNA transcriptional regulation is currently limited (30). One obstacle was the lack of precise identification of pri-miRNA TSSs. The FANTOM5 consortium recently took advantage of the cap analysis of gene expression (CAGE) technology to identify pri-miRNA TSSs genome-wide from different cell types and tissues in human and mouse (36). Given their short size and the fact that they are not recurrently mutated (8), we hypothesize that the driver potential of miRNAs in cancer could be triggered by cis-regulatory mutations that alter their expression with a downstream cascading effect on the gene regulatory programs of the cancer cells.
(iii) small RNA-seq data were available with at least 30 patients per cohort. Data were downloaded from the International Cancer Genome Consortium (ICGC) portal (42) through the icgc-get client (Additional file 5). Altogether, we collected data for 349 samples from seven TCGA patient cohorts (35–89 donors per cohort; Additional file 1): BRCA-US (breast invasive carcinoma), HNSC-US (head and neck squamous cell carcinoma), LIHC-US (liver hepatocellular carcinoma), LUAD-US (lung adenocarcinoma), LUSC-US (lung squamous cell carcinoma), STAD-US (stomach adenocarcinoma), and UCEC-US (uterine corpus endometrial carcinoma).
We retrieved data from 256 samples collected by the ICGC Breast Cancer Working group (43,44) for which
Mutation rate analysis
For each sample, we calculated the mutation rates by dividing the number of mutated nucleotides within a set of regions (TFBSs, exons, and flanking regions) by the number of nucleotides covered by the given set of regions. TFBS genomic positions were obtained from UniBind (38) (see below). Protein-coding exon coordinates were retrieved from RefSeq Curated (51) (Additional file 5). Flanking regions were computed by (i) extending TFBS or exonic regions by 100, 500 and 1000 nucleotides on both sides using the flank bedtools subcommand and (ii) removing regions overlapping TFBSs and exonic regions using the subtract bedtools subcommand. Sets of regions were independently merged
[Figure 1 (image): one box-plot panel per cohort (panel titles recovered from the page include STAD-US, LUAD-US and LUSC-US). Y-axis: log10 mutation rate. X-axis: TFBS and exonic regions, their 100, 500 and 1000 nt flanking regions, and random controls.]
Figure 1. Comparison of mutation rates in TFBSs and exons versus their flanking regions and random mutation rates. Each panel corresponds to a specific
cancer cohort (see title boxes) and each point corresponds to a sample. On each panel, the two central boxplots (shadowed) represent mutation rates in
TFBS and exonic regions, the remaining box plots correspond to mutation rates in increasing-size flanking regions (100, 500 and 1000 nt) and mutation
rates expected by chance (150 randomly distributed sets of mutations in the genome; Materials and Methods).
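The per-sample mutation rate defined in the Mutation rate analysis section (mutated nucleotides within a set of regions divided by the nucleotides covered by that set) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which relies on bedtools; the function names, the BED-like half-open coordinates and the toy data are assumptions made for illustration only.

```python
from typing import Iterable, List, Set, Tuple

# (chromosome, start, end); 0-based, half-open, as in BED files (assumption)
Interval = Tuple[str, int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals so no nucleotide is counted twice."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            last = merged[-1]
            merged[-1] = (chrom, last[1], max(last[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def mutation_rate(regions: Iterable[Interval], mutations: Set[Tuple[str, int]]) -> float:
    """Number of mutated positions inside the regions divided by the bases covered."""
    merged = merge_intervals(regions)
    covered = sum(end - start for _, start, end in merged)
    if covered == 0:
        return 0.0
    mutated = sum(
        1
        for chrom, pos in mutations
        if any(c == chrom and s <= pos < e for c, s, e in merged)
    )
    return mutated / covered


# Toy example (hypothetical coordinates): 30 covered bases, 3 mutated positions inside them.
toy_regions = [("chr1", 100, 110), ("chr1", 105, 120), ("chr2", 0, 10)]
toy_mutations = {("chr1", 101), ("chr1", 119), ("chr1", 120), ("chr2", 5)}
toy_rate = mutation_rate(toy_regions, toy_mutations)
```

Merging the regions before counting mirrors the statement that sets of regions were merged, so overlapping TFBSs do not inflate the denominator. The position `("chr1", 120)` falls just outside the half-open interval and is therefore not counted.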
Cis-regulatory and loss-of-function mutations complementarily alter protein-coding gene networks
We then seek to predict the cis-regulatory mutations that lie in these TFBSs and that lead to cascading effects on gene network deregulation, a hallmark of carcinogenesis. We first focus on the mutations in TFBSs linked to protein-coding genes and compare their effect on gene regulation to that of mutations altering the function of the protein-coding genes. We consider a protein-coding gene to be mutated through either a loss-of-function (LoF) somatic mutation in one of its exons as in (26) or a somatic mutation overlapping a TFBS associated with the gene. TFBSs are linked to protein-coding or miRNA genes based on cis-regulatory element-to-gene associations from GeneHancer (56) or distances to TSSs (Materials and Methods; Supplementary Figure S8). We estimate the potential trans-effect of the mutations on expression disruption in protein-coding gene networks using the xseq tool, following approaches implemented in previous studies (26,27). Specifically, the method uses a hierarchical Bayesian approach to associate mutations with expression dysregulation in biological networks associated with the mutated protein-coding genes. In a nutshell, it assesses the posterior probability of the likely association between observing mutations in a set of patients and observed deviations from neutral expression in these samples for protein-coding genes in the same network. The likely trans-associations between mutations and gene network deregulation are first assessed in a sample-specific manner and then across samples from the same cohort (Supplementary Figure S9). Genes with low expression in a given cohort were filtered out; the distribution of the 90th percentile of expression for genes was decomposed into two Gaussian distributions corresponding to low and high expression values and only genes lying in the high expression distribution were retained (Materials and Methods). Furthermore, gene expression is corrected for copy number alterations (amplifications and deletions detected by GISTIC2 (49)) to compensate for copy number-related cis-effects on expression (Materials and Methods). LoF mutations and mutations that overlap TFBSs are analyzed independently. Finally, we consider predictions that satisfy a false discovery rate (FDR) <0.05, computed empirically for each cohort using random controls (Materials and Methods).
Out of the 7275 unique protein-coding genes linked to somatic mutations in the seven TCGA cohorts, 237 are associated with the deregulation of transcriptional networks in at least one cohort. Of these, 21 harbor LoF mutations (TP53 and RPL22 are predicted with LoF mutations in three and two cohorts, respectively; Figure 2A) and 219 are linked to cis-regulatory mutations associated with transcriptional deregulation (24 genes are found in more than one cohort; Figure 2A, Supplementary Figures S10 and S11, and Additional File 3). Three genes are linked to dysregulated networks in association with both LoF and cis-regulatory mutations but in different patients and/or cohorts: ACVR2A, ARID1A and GATA3 (Figure 2A). These three genes are already known cancer drivers that we predict to be impacted by alternative mutational mechanisms (LoF or cis-regulatory mutations). The remaining genes are associated with either LoF or cis-regulatory mutations across cohorts (TP53, RPL22 with LoF mutations; e.g. PIK3C3 and CHRM3 with cis-regulatory mutations; Figure 2).
From the combined list of 237 predicted protein-coding genes (Additional File 3), 81 are already annotated as cancer-associated genes (P-value = 9.3e–17; hypergeometric test) and 29 as TFs (P-value = 0.025; Supplementary Figures S10 and S11). We observe 28 genes to be predicted in at least two cohorts. These 28 genes are enriched for already known cancer-associated genes (P-value = 1.4e–6; hypergeometric test) but not for TFs (P-value = 0.21; hypergeometric test) (Figure 2A).
The genes predicted through cis-regulatory mutations rarely contain LoF mutations in the same tumors (Figure 2B and Supplementary Figure S12). We interpret this to mean that LoF and cis-regulatory mutations are possibly complementary mechanisms that alter the gene regulatory programs of cancer cells. We observe that multiple genes can be predicted through cis-regulatory mutations in the same sample. Furthermore, these genes tend to be interconnected in the dysregulated genes’ networks (Figure 3A). All these genes are predicted through mutations associated with cascading trans-effect in gene network dysregulation but the method cannot identify the specific main driver event or the combination of cis-regulatory mutations. When considering all the predicted genes per cohort, we detect a similar pattern with subnetworks of interconnected genes with a maximum of 12 subgraphs containing at least two nodes per cohort (mean = 4.13; median = 3; Figure 3B and Supplementary Figure S13). Altogether, these interconnected subnetworks suggest that the predicted genes are likely involved in similar biological pathways with altered expression associated with cis-regulatory somatic mutations.
Deregulation of transcriptional activity and cancer pathways are trans-effect signatures of the predicted cis-regulatory and loss-of-function mutations
To shed light on the functional role of the somatic mutations predicted to be associated with a cascading effect, we perform enrichment analyses on the altered gene expression profiles. One advantage of xseq is its capacity to highlight the specific genes in the biological networks that are dysregulated in the samples harboring the somatic mutations considered (Materials and Methods) (26). These genes are consistently found to be either up- or down-regulated
in the samples with predicted disrupted expression (see the blue and red colors in the upper and lower clusters in Figure 4A). These results highlight sets of genes up- or down-regulated across samples where cancer-associated genes are predicted.
We assess the biological relevance of the networks predicted to be dysregulated in association with either LoF or cis-regulatory mutations linked to the protein-coding genes. Functional enrichment analysis is performed using pathways from KEGG (58), WikiPathways (59) and Panther (74), and gene ontology biological processes (GO BP (75)) with the EnrichR tool (67). The dysregulated genes in the networks are enriched for transcriptional activity (‘regulation of transcription, DNA-templated’ from GO BP; Supplementary Figure S14). Combined with the enrichment of TFs in the complete list of predicted cancer-associated genes, this result emphasizes that the alteration of transcriptional regulation is likely a common feature of cancer cells throughout cancer types. Pathways already known to be associated with carcinogenesis (e.g. ‘Pathways in cancer’, ‘JAK-STAT signaling’, ‘PI3K-Akt signaling’, ‘p53 signaling pathway’, ‘Focal adhesion’ and ‘Apoptosis’; Figures 4B, C and Supplementary Figures S14–S17) are at the top of the enriched terms. The enrichment for cancer pathways confirms that our approach identifies somatic exonic and cis-regulatory mutations associated with potential protein-coding cancer-associated genes with cascading effect on regulatory alteration of key cancer-related pathways. Our results suggest that alteration of gene network expression could be achieved through cis-regulatory mutations associated with different genes in different patients but involved in the same pathways.
Combining transcriptional and post-transcriptional regulation highlights pan-cancer miRNAs associated with gene expression alteration in tumors
The analysis of mutations linked to protein-coding genes presented above demonstrates that our methodology pinpoints cis-regulatory mutations likely associated with carcinogenesis. We hypothesize that our method could highlight cis-regulatory mutations linked to miRNAs with downstream cascading effects on the gene regulatory programs of the cells because miRNAs are involved in post-transcriptional regulation of gene expression. This novel approach of functional analysis of mutations aims to combine transcriptional (through mutations in TFBSs) and post-transcriptional (through regulatory networks of miRNA–targets) regulation to predict miRNAs associated with a trans-effect on gene expression alteration through somatic mutations in cis-regulatory elements.
Specifically, we adapt the xseq framework to infer cis-regulatory somatic mutations linked to miRNAs and associated with a cascading effect on miRNA target networks dysregulation. Similar to the analysis of protein-coding genes, we estimate the posterior probability of the likely association between the presence of mutations in TFBSs linked to a miRNA with observed deviations from neutral expression of the miRNA’s target genes. We consider miRNAs from miRBase (53) and their corresponding TSSs, which were identified using CAGE (Materials and Methods) (36). To assess the cascading effect of mutations linked to miRNAs on their targets’ expression, we examined the protein-coding genes predicted by TargetScan (31) to be targets of each miRNA. We limited the set of miRNA–target
Figure 4. Dysregulated protein-coding gene networks and functional enrichment analysis. (A) Dysregulated gene network in samples where FUS
is predicted through cis-regulatory mutations in breast cancer (BRCA-US) (rows: dysregulated genes associated with FUS; columns: samples with
FUS-associated cis-regulatory mutations). The color scale represents the gene regulatory status posterior probability (red: up-regulation; blue: down-
regulation––posterior probability * (–1)). The top horizontal bar shows the sample-specific dysregulation posterior probability computed by xseq for the
samples harboring a cis-regulatory mutation in the FUS gene. The horizontal bar below shows the gene expression z-value of FUS (Materials and Meth-
ods). (B) KEGG 2021 most enriched terms computed from all the dysregulated genes associated with the predicted protein-coding genes (A is one example
for FUS) by xseq with LoF mutations and (C) cis-regulatory mutations in TCGA cohorts (columns). Terms (rows) are ordered by their mean rank across
all cohorts. Significance is provided as –log10 (P-value).
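The ordering used in panels (B) and (C), terms sorted by their mean rank across cohorts with significance shown as –log10 (P-value), can be sketched as follows. This is a minimal illustration rather than the authors' plotting code: the function names, the cohort labels, the P-values and the choice to treat a term missing from a cohort as P = 1 are all assumptions.

```python
import math
from typing import Dict, List


def order_terms_by_mean_rank(pvalues: Dict[str, Dict[str, float]]) -> List[str]:
    """pvalues[cohort][term] is an enrichment P-value.

    Terms are ranked within each cohort (1 = most significant) and then
    ordered by their mean rank across cohorts; ties are broken by name.
    """
    terms = sorted({term for per_cohort in pvalues.values() for term in per_cohort})
    ranks: Dict[str, List[int]] = {term: [] for term in terms}
    for per_cohort in pvalues.values():
        # Assumption: a term absent from a cohort is treated as P = 1.
        ordered = sorted(terms, key=lambda t: (per_cohort.get(t, 1.0), t))
        for rank, term in enumerate(ordered, start=1):
            ranks[term].append(rank)
    return sorted(terms, key=lambda t: (sum(ranks[t]) / len(ranks[t]), t))


def neg_log10(p: float) -> float:
    """Significance scale used for display: -log10(P-value)."""
    return -math.log10(p)


# Hypothetical P-values for two made-up cohorts; term names are taken from the text.
toy_pvalues = {
    "cohort_A": {"Pathways in cancer": 1e-8, "Apoptosis": 1e-3, "Focal adhesion": 1e-5},
    "cohort_B": {"Pathways in cancer": 1e-4, "Apoptosis": 1e-6, "Focal adhesion": 1e-2},
}
toy_order = order_terms_by_mean_rank(toy_pvalues)
```

In this toy input the mean ranks are 1.5, 2.0 and 2.5, so a term that is consistently near the top in every cohort is placed above a term that is first in one cohort but last in another.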
gene pairs to those where at least two target sites for the miRNA are predicted to reduce false positive predictions (63,64) (Materials and Methods). Note that we separately analyze miRNAs from both arms (5p and 3p) for each pre-miRNA sufficiently expressed in a TCGA cohort (Materials and Methods).
Applying this analysis to the seven TCGA cohorts, we predict 68 mature miRNAs, derived from 47 pre-miRNAs, as associated with mutations in TFBSs and deregulation of expression for their target genes (Figure 5A and Supplementary Figure S18; Additional File 3). From these 68 miRNAs, 54 are already annotated as cancer-associated miRNAs in the miRCancer database (71) (P-value = 5e–23; hypergeometric test), which is derived from text-mining of
5C), arguing for a potential link between viral infections and cancer initiation/progression, as previously suggested (89,90), via miRNAs.
Altogether, this study provides the first foray into the analysis of a combined effect of coherent transcriptional and post-transcriptional dysregulation downstream of somatic cis-regulatory mutations associated with miRNAs in cancer cells. It highlights a core set of miRNAs associated with cis-regulatory mutations that are linked to a cascading alteration of gene regulatory networks involved in cancer onset and progression.
Complementary analysis of an independent breast cancer co-
Figure 5. Overview of miRNA driver predictions and their dysregulated target networks. (A) Pre-miRNAs with mature miRNAs predicted as potential
drivers by xseq. Cell colors indicate the posterior probability computed over the corresponding cohort. Red stars indicate that the miRNA is annotated
as a cancer-associated miRNA in miRCancer (71). Blue stars indicate that the miRNA was reported as a cancer-associated miRNA in the specific cancer
type where it is predicted by xseq, according to miRCancer annotation. (B) Dysregulated network of target genes for miRNA hsa-mir-20a-5p predicted in
liver hepatocellular carcinoma (LIHC-US) (rows: dysregulated targets; columns: samples with cis-regulatory mutations associated with hsa-mir-20a-5p).
The top color scale represents the gene regulatory status posterior probability (red: up-regulation; blue: down-regulation - posterior probability * (-1)). The
horizontal bar below shows the miRNA expression z-value (Materials and Methods). (C) KEGG 2021 most enriched terms (rows) for all the dysregulated
genes associated with the identified miRNA drivers across TCGA cohorts (columns). Terms are ordered by their mean rank across all cohorts. Significance
is provided as –log10 (P-value).
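The enrichment P-values quoted in the text (for example, for the predicted miRNAs already annotated in miRCancer) come from hypergeometric tests. A minimal sketch of the upper-tail test is given below, using only the Python standard library. The background sizes used by the authors are not stated in this section, so the numbers in the example are toy values and the function name is hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population: int, annotated: int, drawn: int, observed: int) -> float:
    """P(X >= observed) for X ~ Hypergeometric(population, annotated, drawn).

    population: size of the background set (e.g. all miRNAs considered)
    annotated:  background items carrying the annotation (e.g. cancer-associated)
    drawn:      number of predicted items
    observed:   predicted items that carry the annotation
    """
    total = comb(population, drawn)
    upper = min(annotated, drawn)
    favorable = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(observed, upper + 1)
    )
    return favorable / total


# Toy numbers (not from the paper): 10 items, 4 annotated, 3 drawn, at least 2 annotated.
toy_p = hypergeom_upper_tail(population=10, annotated=4, drawn=3, observed=2)
```

With these toy values the tail probability is (6 x 6 + 4 x 1) / 120 = 1/3. Exact integer arithmetic with `math.comb` avoids underflow for very small P-values, at the cost of speed on large backgrounds.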
Genes are linked in the network if they are known biological partners in the original network (Figure 6). The constructed network comprises 87 genes, which are all connected in a single dense network, where the top three (hub) genes with the largest in-degree are JUN, RB1 and TP53. This observation highlights that the predicted genes across the cohorts are likely involved in similar biological pathways, which is supported by the functional enrichment results above. It suggests that the same pathways tend to be dysregulated through mutations associated with different genes.
We predict one miRNA (hsa-mir-378a-3p) associated with cis-regulatory mutations in the ICGC cohort when considering all samples (Supplementary Figure S22). We do not predict any driver miRNAs associated with cis-
ple cis-regulatory mutation information with gene expression data from the same samples to highlight direct evidence of the regulatory impact of the mutations. By integrating whole-genome somatic mutations, RNA-seq, small RNA-seq, and copy number aberrations (CNA) data with gene regulatory networks, we perform pan-cancer predictions of protein-coding and miRNA genes associated with somatic cis-regulatory mutations in patients from seven distinct cancer types. Our study provides a large-scale foray into predicting cancer-associated protein-coding and miRNA genes by combining both transcriptional and post-transcriptional information. Our results provide new insights into the potential impacts and causes of the alterations of gene regulatory programs observed in cancer cells along with the cas-
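[Figure 6 (image): gene network. Node labels recovered from the page: POLR1A, PDE4D, ADCY7, RGS2, ADCY9, FCGR3A, HGS, PTGER4, RGS1, PSMD3, SNRPE, PLCB3, LAMB1, ITGA2, CBLB, FCER1G, GRB2, FUS, ERBB4, CXCR4, SYK, PSMD9, HEXIM1, IL1B, FES, FBXO31, BAMBI, ZNF143. Node size legend: in-degree (10, 20, 30).]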
Figure 6. Predicted genes in breast cancer cohorts are connected in the biological network. Network representing the predicted protein-coding genes in
ICGC (all samples), ICGC ER+, ER– and BRCA-US cohorts. The names of the genes predicted in two or more cohorts are displayed several times with
different colors.
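Hub genes in such a network are those with the largest in-degree, i.e. the number of distinct incoming edges. The sketch below shows one way to rank nodes by in-degree from a directed edge list. The function name and the placeholder node names are hypothetical, and the edges are invented for illustration; they are not interactions from the paper's network.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_hubs(edges: Iterable[Tuple[str, str]], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n nodes with the largest in-degree.

    edges are (source, target) pairs; duplicate edges are counted once.
    Ties are broken alphabetically so the output is deterministic.
    """
    indegree = Counter(target for _, target in set(edges))
    return sorted(indegree.items(), key=lambda item: (-item[1], item[0]))[:n]


# Invented toy edges between placeholder nodes A-D (the last edge is a duplicate).
toy_edges = [("A", "D"), ("B", "D"), ("C", "D"), ("A", "C"), ("B", "C"), ("A", "B"), ("A", "D")]
toy_hubs = top_hubs(toy_edges, n=2)
```

Here node D receives three distinct incoming edges and node C two, so they are returned as the top two hubs.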
affinity poorly predict the effects on expression as reported by massively parallel assays (101). A previous method systematically assessed the potential impact of somatic mutations in genomic tiles near genes’ TSSs on gene expression (25). Here, we consider mutations lying within a specific set of pre-defined TFBSs without restrictions on distances to TSSs and evaluate the trans-association of the mutations with genes’ network deregulation. Our approach is somewhat similar to a genome-wide association study framework focused on TFBSs to reduce the search space. Moreover, our strategy is not directly assessing the effect on TF-DNA interactions, i.e. the gain/loss of TFBSs, but rather focuses on the association with gene expression deregulation. Although we focused on somatic mutations and small indels at cis-regulatory elements, we acknowledge that CNAs such as duplications or deletions are likely to contribute to gene expression alteration as well. Nevertheless, our analyses considered CNAs to ensure that the predicted deregulations were not confounded with CNAs. Further work and a complementary computational framework will be necessary to bring together single nucleotide variants, small indels, CNAs, and structural variations and assess their combined impact on gene expression deregulation in cancer.
The analysis of protein-coding genes predicts 28 genes in at least two (out of the seven) TCGA cohorts analyzed, with many already known cancer drivers (Figure 2A). We observe that the protein-coding genes predicted through the analysis of cis-regulatory mutations generally do not contain mutations in exonic regions for the same patients (Figure 2B and Supplementary Figure S12). This observation suggests complementary mechanisms acting upon gene expression dysregulation with cascading effects on regulatory network disruption. We hypothesize that either the final product of a gene may be altered due to LoF mutations or the expression of the gene is altered through cis-regulatory mutations, which, in both cases, alter the activity of biological networks.
Given that miRNAs cover a small portion of the human genome, they harbor a small number of somatic mutations (8), limiting the possibility to affect gene expression. The potential mechanism that we propose here is the alteration of their regulatory elements. Our study highlights cis-regulatory mutations linked to miRNAs that are associated with dysregulation of expression of the miRNA targets. In our pan-cancer analysis, we discover a core set of 12 mature miRNAs associated with the dysregulation of key pathways
involved in carcinogenesis. This core set of miRNAs represents a common feature for gene expression dysregulation associated with cancer onset or progression. We note that several of these miRNAs are established oncomiRs, which promote carcinogenesis. The Kaplan–Meier plots in Figure 7 for hsa-mir-29a-3p, hsa-mir-20a-3p, hsa-mir-20a-5p, and hsa-mir-145-3p show that higher expression correlates with poorer survival rates, which would indicate that these miRNAs act as oncomiRs in breast cancer, possibly targeting tumor suppressor genes or pathways.
The analysis of the dysregulated networks of the predicted cancer-associated genes (protein-coding and miRNAs) shows that many genes are dysregulated in a few samples but rarely across all the mutated samples (Figure 5B). However, the functional enrichment analysis of the dysregulated genes shows consistency across cohorts and the analyzed types of mutations (LoF and cis-regulatory) for both protein-coding and miRNA genes, even when there is a small intersection among the predicted genes in cohorts of the same cancer type (Supplementary Figures S20 and S21). Altogether, these observations suggest a phenotypic heterogeneity (i.e. alterations of different parts of the same network lead to the same phenotype), which may have originated because the dysregulated genes are connected in the biological network (Figure 6). Moreover, as originally described in Ding et al. (26), the xseq probabilistic framework highlights the specific samples where mutations are associated with an impact on gene expression (Figure 4A). This dichotomy can, in principle, be used to stratify samples and mutations but, in this study, is limited by the number of samples considered.
We apply our methodology to two cohorts of breast cancer samples (BRCA-US and ICGC). Given the large number of samples in ICGC (n = 256), we perform three analyses separately by considering (i) all samples, (ii) ER+ samples and (iii) ER– samples. Predictions vary depending on the samples’ histopathology. This is particularly important for methods relying on gene expression, which is influenced by the clinical composition of the cohorts. We acknowledge that methodological differences between the BRCA-US and ICGC cohorts (e.g. different somatic mutation calling algorithms, RNA-seq versus microarrays, and normalization of RNA-seq raw counts) can provide additional explanations for the variation in predictions, which is the case
with the BRCA-US and ICGC cohorts that were independently normalized. Although only a few of the predicted protein-coding genes are predicted in both the ICGC and the BRCA-US cohorts (Supplementary Figure S20), the functional enrichment analysis of the dysregulated gene networks is consistent (Supplementary Figure S21). This observation suggests common dysregulated pathways that act as attractors and that could originate from (non-recurrent) distinct cancer-associated events. It underlines the importance of addressing cancer as a disease with perturbations manifested at the gene network level. Our miRNA analyses target gene expression alteration recurrently altered across the BRCA-US and ICGC ER– breast cancer cohorts and highlight two miRNAs (hsa-mir-17-3p and hsa-mir-18-5p)
Our study represents, to our knowledge, the first large-scale analysis of cis-regulatory mutations that are linked to gene expression alteration in key cancer-associated pathways. Our results suggest that this process can be achieved flexibly because although we observe different genes in different patients, all are associated with deregulation of the same pathways. Combining transcriptional and post-transcriptional information, we identify a core set of 12 miRNAs linked to altered cancer pathways across cancer types. These pan-cancer results provide new insights into the impact and potential causes of miRNA-mediated gene expression dysregulation. This work extends our capacity to address the discovery gap of cancer-associated event identification through the analysis of noncoding mutations and miRNA genes.
Norway (NCMM) (to A.M. and J.A.C.M.); Norwegian Research Council [288404 to J.A.C.M. and Mathelier group]; Norwegian Cancer Society [197884 to Mathelier group]; M.R.A. was a postdoctoral fellow of the South Eastern Norway Health Authority [2014021 to A.L.B.D.]; a research fellow of the Norwegian Cancer Society [711164 to V.N.K.]. Funding for open access charge: Norwegian Research Council.
Conflict of interest statement. None declared.
REFERENCES
1. Bradner,J.E., Hnisz,D. and Young,R.A. (2017) Transcriptional addiction in cancer. Cell, 168, 629–643.
Interpretation Working Group, Gallinger,S. and Stein,L., PCAWG Consortium (2020) Combined burden and functional impact tests for cancer driver discovery using driverpower. Nat. Commun., 11, 734.
19. Kalender Atak,Z., Imrichova,H., Svetlichnyy,D., Hulselmans,G., Christiaens,V., Reumers,J., Ceulemans,H. and Aerts,S. (2017) Identification of cis-regulatory mutations generating de novo edges in personalized cancer gene regulatory networks. Genome Med., 9, 80.
20. Fu,Y., Liu,Z., Lou,S., Bedford,J., Mu,X.J., Yip,K.Y., Khurana,E. and Gerstein,M. (2014) FunSeq2: a framework for prioritizing noncoding regulatory variants in cancer. Genome Biol., 15, 480.
21. Ritchie,G.R.S., Dunham,I., Zeggini,E. and Flicek,P. (2014) Functional annotation of noncoding sequence variants. Nat. Methods, 11, 294–296.
22. Boyle,A.P., Hong,E.L., Hariharan,M., Cheng,Y., Schaub,M.A.,
(2006) The UCSC genome browser database: update 2006. Nucleic Acids Res., 34, D590–D598.
41. Karolchik,D. and James Kent,W. (2003) The UCSC genome browser. Curr. Protoc. Bioinformatics, https://doi.org/10.1002/0471250953.bi0104s00.
42. Zhang,J., Baran,J., Cros,A., Guberman,J.M., Haider,S., Hsu,J., Liang,Y., Rivkin,E., Wang,J., Whitty,B. et al. (2011) International cancer genome consortium data Portal––a one-stop shop for cancer genomics data. Database, 2011, bar026.
43. Nik-Zainal,S., Davies,H., Staaf,J., Ramakrishna,M., Glodzik,D., Zou,X., Martincorena,I., Alexandrov,L.B., Martin,S., Wedge,D.C. et al. (2016) Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature, 534, 47–54.
44. Smid,M., Rodríguez-González,F.G., Sieuwerts,A.M., Salgado,R., Prager-Van der Smissen,W.J.C., van der Vlugt-Daane,M., van Galen,A., Nik-Zainal,S., Staaf,J. et al. (2016) Breast cancer genome
58. Kanehisa,M. and Goto,S. (2000) KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res., 28, 27–30.
59. Pico,A.R., Kelder,T., van Iersel,M.P., Hanspers,K., Conklin,B.R. and Evelo,C. (2008) WikiPathways: pathway editing for the people. PLoS Biol., 6, e184.
60. Karp,P.D., Ouzounis,C.A., Moore-Kochlacs,C., Goldovsky,L., Kaipa,P., Ahrén,D., Tsoka,S., Darzentas,N., Kunin,V. and López-Bigas,N. (2005) Expansion of the biocyc collection of pathway/genome databases to 160 genomes. Nucleic Acids Res., 33, 6083–6089.
61. Zhou,H., Jin,J., Zhang,H., Yi,B., Wozniak,M. and Wong,L. (2012) IntPath–an integrated pathway gene relationship database for model organisms and important pathogens. BMC Syst. Biol., 6(Suppl. 2), S2.
62. Gerstein,M.B., Kundaje,A., Hariharan,M., Landt,S.G., Yan,K.-K., Cheng,C., Mu,X.J., Khurana,E., Rozowsky,J., Alexander,R. et al.
79. Nadiminty,N., Tummala,R., Lou,W., Zhu,Y., Shi,X.-B., Zou,J.X., Chen,H., Zhang,J., Chen,X., Luo,J. et al. (2012) MicroRNA let-7c is downregulated in prostate cancer and suppresses prostate cancer growth. PLoS One, 7, e32832.
80. Tehler,D., Høyland-Kroghsbo,N.M. and Lund,A.H. (2011) The miR-10 microRNA precursor family. RNA Biol., 8, 728–734.
81. Ke,K. and Lou,T. (2017) MicroRNA-10a suppresses breast cancer progression via PI3K/Akt/mTOR pathway. Oncol. Lett., 14, 5994–6000.
82. Mu,N., Gu,J., Huang,T., Zhang,C., Shu,Z., Li,M., Hao,Q., Li,W., Zhang,W., Zhao,J. et al. (2016) A novel NF-κB/YY1/microRNA-10a regulatory circuit in fibroblast-like synoviocytes regulates inflammation in rheumatoid arthritis. Sci. Rep., 6, 20059.
83. Hirschberger,S., Hinske,L.C. and Kreth,S. (2018) MiRNAs: dynamic regulators of immune cell functions in inflammation and
91. Lee,W., Jiang,Z., Liu,J., Haverty,P.M., Guan,Y., Stinson,J., Yue,P., Zhang,Y., Pant,K.P., Bhatt,D. et al. (2010) The mutation spectrum revealed by paired genome sequences from a lung cancer patient. Nature, 465, 473–477.
92. Vorontsov,I.E., Khimulya,G., Lukianova,E.N., Nikolaeva,D.D., Eliseeva,I.A., Kulakovskiy,I.V. and Makeev,V.J. (2016) Negative selection maintains transcription factor binding motifs in human cancer. BMC Genomics, 17, 395.
93. Martincorena,I., Raine,K.M., Gerstung,M., Dawson,K.J., Haase,K., Van Loo,P., Davies,H., Stratton,M.R. and Campbell,P.J. (2017) Universal patterns of selection in cancer and somatic tissues. Cell, 171, 1029–1041.
94. Frigola,J., Sabarinathan,R., Mularoni,L., Muiños,F., Gonzalez-Perez,A. and López-Bigas,N. (2017) Reduced mutation rate in exons due to differential mismatch repair. Nat. Genet., 49, 1684–1692.