Epigenomicslinc RNA
All content following this page was uploaded by Yao Li on 07 November 2018.
Aim: We aimed to identify previously unreported long intergenic noncoding RNAs (lincRNAs) in the
porcine liver, an important metabolic tissue, and further illustrate the epigenomic landscapes and the
evolution of lincRNAs. Materials & methods: We used porcine omics data and comprehensively analyzed
and identified lincRNAs and their methylation, expression and evolutionary patterns during pig domesti-
cation. Results: LincRNAs exhibit highly methylated promoter and downstream regions, as well as lower
expression levels and higher tissue specificity than protein-coding genes. We identified a batch of lincRNAs
with selection signals that are associated with pig domestication; among those expressed in a single tissue, more are
specific to the liver than to any other tissue (19 vs 10, 8, 6, 3, 2, 1 and 1). Interestingly, the lincRNA linc-sscg1779 and its target gene
C6, which is crucial in liver metabolism, are differentially expressed during pig domestication. Conclusion:
Although they may originate from noisy transcripts, lincRNAs may be subjected to artificial selection. This
phenomenon implies the functional importance of lincRNAs in pig domestication.
First draft submitted: 11 September 2017; Accepted for publication: 7 February 2018; Published online:
29 October 2018
Keywords: coding gene • domestication • evolution • expression • lincRNAs • liver • methylation • pig • selection
signal • wild
The pig (Sus scrofa) is an economically and medically important domesticated animal. It is used as a biomedical model
for human metabolism because of the extensive similarities between the metabolic features and organ proportions of
humans and pigs [1–3]. Approximately 10,000 years ago, Asian and European pigs were independently domesticated
from Asian and Eurasian wild boars, respectively [4–8]. Domestication is a continuous process that involves a series
of phenotypic and genetic changes, including changes in behavior, body composition, reproductive capacity and
coat color, in animals to meet human demands [9–12]. Domestic pigs have faster growth rates, higher fat content
and higher reproductive capacity than their ancestor, the wild boar [9,13]. Various Chinese and European local pig
breeds have been produced through domestication and breeding. Different breeds exhibit considerably different
physiological characteristics and morphology as a result of natural and artificial selection and different breeding
objectives [14]. Chinese domesticated pigs are short and broad with medium ears. They have high fat, fine bones
and lean and tender meat. In addition, Chinese domesticated pigs have stronger immune systems than European
domesticated pigs [9]. European domesticated pigs have higher fecundity, better feed conversion ratio and stronger
adaptability than Chinese domesticated pigs. The Asian domestic pig was first introgressed into the European pig
during the eighteenth and early nineteenth centuries.
The rapid expansion of genomic data resources has enabled the exploration of the genomic basis of pig domes-
tication, local adaptation and breeding. For example, Li et al. [9] used porcine 60K BeadChip genotyping data to
10.2217/epi-2017-0117
C 2018 Future Medicine Ltd Epigenomics (Epub ahead of print) ISSN 1750-1911
Research Article Li, Zou, Cui et al.
analyze the extended haplotype homozygosity values of Chinese and European domesticated pigs. They identi-
fied HNF4A, which regulates fatty acid, cholesterol and lipoprotein metabolism in the liver, as a candidate gene
with selection signature in European domesticated pigs. Other identified candidate genes with selection signatures
include MC1R [15] and KIT [11]. These genes are related to coat color. In addition, LCORL, NR6A1 and PLAG1
exhibit strong selection signals and are associated with increased body length and meat production in European
domesticated pigs [11]. Generally, these studies have mainly focused on the genetic mutations of coding genes that
affect traits under selection.
Long intergenic noncoding RNAs (lincRNAs), which do not code for proteins, have attracted considerable attention
because of their biological significance in growth, development and domestication, which is a microevolutionary
process of special interest [16]. Genomic studies have indicated that lincRNAs are ubiquitous in mammals [17,18].
LincRNAs participate in various biological processes by regulating target genes [19]. A few studies have reported that
specific lncRNAs, such as lincRNA HOTAIR, participate in epigenetic regulation. For example, HOTAIR specifies
the histone modification patterns of its target genes by acting as the scaffold of histone modification enzymes [20,21].
The influence of lincRNAs on phenotypes through target gene regulation is a new research hotspot, and a growing
number of case studies have shown the functional importance of lincRNAs in transcriptional regulation [22],
epigenetic regulation [23], induced pluripotent stem cell reprogramming [24,25] and dosage compensation [26,27].
Given the advances in knowledge on lincRNAs, lincRNA resources must be systematically mined at the whole
genome level to illuminate their potential impact on species evolution.
The pig, an important domestic animal, is an ideal model for addressing the above issues. Several
studies have identified the genomic distributions and characteristics of lincRNAs [1,17,28–30], and some studies
have preliminarily explored the possible roles of lincRNAs in pig domestication and differentiation [18,31]. Given that
lincRNAs usually exhibit a tissue-specific expression pattern [1], studies on the role of lincRNAs in pig domestication
or breed differentiation have been mainly based on transcriptomic data, thus limiting their findings and inferences
to the tested tissues. Currently, the most commonly tested porcine tissues include the muscles and the brain [18,31].
Therefore, the majority of pig lincRNAs with putative functions in pig domestication or differentiation are related
to muscle development or emotional behavior. Identifying the evolutionary pattern of lincRNAs on the genomic
level may help deepen the understanding of their impact on pig domestication and differentiation. In addition,
epigenetic regulation via DNA methylation is an important alternative to genetic mutation in animal phenotypic
evolution. Different epigenetic systems crosstalk and contribute to animal phenotypic evolution. We previously
found that H19, a long noncoding gene, showed epigenetic changes in the liver via DNA methylation during pig
domestication and breeding. These epigenetic changes are associated with the differential expression of H19 in
the liver. This result suggested that lincRNAs may be artificially selected through an epigenetic system during pig
domestication [16].
To broaden our understanding of the features of lincRNAs and their roles in pig domestication and differentiation
from the genomic and epigenomic perspectives, we comprehensively analyzed multiomics data, including RNA-seq data, liver
methylome (BS-seq) data and published genome-resequencing data [11,32]. First, we predicted a batch of
lincRNAs in pig liver. This batch included a large number of previously unreported lincRNAs [1]. Second, to
provide a systematic overview of pig lincRNAs and their functional characteristics, we investigated the association
between the methylation status and expression pattern of these lincRNAs and predicted their potential target
genes (PTGs). Finally, we identified lincRNAs that exhibit strong selection signals and differential methylation
status during domestication and breed differentiation. Our study provides substantial information, such as genome
landscape, on pig lincRNAs. Moreover, we provided a list of candidate lincRNAs under artificial selection in genetic
and epigenetic aspects. Our results are a valuable basis for future studies on lincRNA functions and pig breeding.
once the average quality within the window falls below 15 when performing sliding-window trimming with a window
size of four bases; discarding the read if it is shorter than 36 bp).
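The trimming rule described above (a four-base sliding window, a cut once the mean quality of the window drops below 15, and removal of reads shorter than 36 bp) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the tool that was actually run: the function name `sliding_window_trim` and the list-of-Phred-scores input are hypothetical, and the cut is placed at the start of the first failing window as a simplification.

```python
def sliding_window_trim(seq, quals, window=4, min_mean_q=15, min_len=36):
    """Trim a read at the first window whose mean quality is below min_mean_q.

    seq   -- read sequence (str)
    quals -- per-base Phred scores (list of int), same length as seq
    Returns the trimmed (seq, quals), or None when the trimmed read is
    shorter than min_len and should be discarded.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    cut = len(seq)
    for start in range(0, len(seq) - window + 1):
        mean_q = sum(quals[start:start + window]) / window
        if mean_q < min_mean_q:
            cut = start  # keep only the bases before the failing window
            break
    if cut < min_len:
        return None
    return seq[:cut], quals[:cut]
```

A read whose quality collapses near the 3' end is shortened, whereas a read that fails early is dropped entirely because the remaining fragment is shorter than 36 bp.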
The resequencing dataset, which includes 41 samples, including ten Chinese domesticated pigs (CND), 22 European
domesticated pigs (EUD) and five Chinese wild pigs (CNW), was downloaded from the NCBI SRA database
(ERP001813 [11], SRP018123 [32]); detailed information is shown in Supplementary Table 1. Raw
sequence reads were filtered out if they met any of the following criteria: more than 10% unidentified nucleotides (N);
more than 40 bp with a low quality value (quality score ≤5); more than 5 bp mismatches; or paired-end reads that were
completely identical, which are generated by PCR amplification during library construction.
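The read-level filters listed above can be illustrated with a short Python sketch. It is a hedged example rather than the pipeline that was used: the function names and the in-memory representation of reads are hypothetical, and the mismatch criterion is omitted because the text does not say what the mismatches are measured against.

```python
def passes_filters(seq, quals, max_n_frac=0.10, low_q=5, max_low_q_bases=40):
    """Return True when a read survives the N-content and low-quality filters.

    seq   -- read sequence (str)
    quals -- per-base Phred scores (list of int)
    A read is rejected when more than 10% of its bases are N or when more
    than 40 bases have a quality score of 5 or lower.
    """
    n_frac = seq.upper().count("N") / len(seq) if seq else 1.0
    low_q_bases = sum(1 for q in quals if q <= low_q)
    return n_frac <= max_n_frac and low_q_bases <= max_low_q_bases


def drop_duplicate_pairs(pairs):
    """Keep the first copy of each completely identical read pair.

    pairs -- iterable of (read1_sequence, read2_sequence) tuples
    """
    seen = set()
    unique = []
    for r1, r2 in pairs:
        if (r1, r2) not in seen:
            seen.add((r1, r2))
            unique.append((r1, r2))
    return unique
```

Treating a pair as a duplicate only when both mates are identical mirrors the "completely identical" wording of the criterion.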
The Sus scrofa genome (version 10.2) and the genome annotation file were downloaded from the Ensembl database
(www.ensembl.org/index.html). The published pig lincRNAs [18] were used to generate a lincRNA annotation file,
which was integrated into the genome annotation.
$$\tau_i = \frac{\sum_{j=1}^{N}\left(1 - \dfrac{\log_2 x(i,j)}{\log_2 x(i,\max)}\right)}{N-1}$$
where N is the number of tissues examined (N = 8 here), x(i, max) is the highest number of reads
mapped to gene i across the N tissues, and x(i, j) is the number of reads mapped to gene i in tissue j. The τ value
ranges from 0 to 1: a value of 1 indicates expression specific to a single tissue, and a value of 0 indicates uniform
expression across tissues, as expected for an internal reference gene.
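As a worked illustration of the τ index defined above, the following Python sketch computes τ for one gene from its read counts across tissues. It makes one assumption that is not stated in the text: a pseudocount of 1 is added before taking log2 so that tissues with zero reads are defined. The function name `tau` is hypothetical.

```python
import math


def tau(counts):
    """Tissue-specificity index for one gene.

    counts -- read counts of the gene in each of the N tissues
    Assumption: log2(count + 1) is used so that zero counts are defined.
    Returns NaN when the gene has no reads in any tissue or N < 2.
    """
    logs = [math.log2(c + 1) for c in counts]
    max_log = max(logs)
    if max_log == 0 or len(logs) < 2:
        return float("nan")
    return sum(1 - value / max_log for value in logs) / (len(logs) - 1)
```

A gene detected in only one of eight tissues yields τ = 1, and a gene with identical counts in all eight tissues yields τ = 0, matching the two ends of the range described above.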
Processing the BS-seq data to investigate the methylation status of the lincRNAs & coding genes
The clean BS-seq data were aligned to the bisulfite-converted Sus scrofa 10.2 genome using Bismark (version
0.10.1) [46]. Methylation calls were extracted using the bismark methylation extractor script of the Bismark bisulfite
mapper software (version 0.10.1) [46], and only uniquely aligned reads were included. For each cytosine in a CpG or
non-CpG (CHG and CHH, H representing A/C/T) context, we obtained the number of methylated and unmethylated
reads. Cytosine sites covered by at least five reads were retained for further methylation level analysis and differential
methylation analysis. The methylation level of each cytosine was calculated as the number of mapped reads supporting
methylation at that cytosine divided by the total number of mapped reads covering it. The promoter region was defined as
the genomic region from 2 kb upstream of the gene body to 500 bp into the gene body. The regions around the transcription
start site (TSS) and transcription termination site (TTS) were defined as the 2 kb upstream of the gene body and the
2 kb downstream of the gene body, respectively. These regions were divided into 100 bp windows, and the gene body region
was divided into 20 portions.
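The per-cytosine methylation level and the binning scheme described above can be sketched as follows. This is a minimal example under stated assumptions, not the authors' scripts: the function names are hypothetical, coordinates are assumed to be 0-based with half-open intervals, and strand handling is ignored.

```python
def methylation_level(methylated, unmethylated, min_depth=5):
    """Methylated reads divided by total reads at one cytosine.

    Returns None when the site is covered by fewer than min_depth reads,
    mirroring the five-read coverage filter.
    """
    depth = methylated + unmethylated
    if depth < min_depth:
        return None
    return methylated / depth


def flank_bin(pos, region_start, window=100):
    """Index of the 100 bp window containing pos within a 2 kb flank."""
    return (pos - region_start) // window


def body_bin(pos, gene_start, gene_end, n_bins=20):
    """Index (0 to n_bins - 1) of the gene-body portion containing pos."""
    length = gene_end - gene_start
    index = (pos - gene_start) * n_bins // length
    return min(index, n_bins - 1)
```

Dividing every gene body into the same number of portions lets genes of different lengths be averaged on a common axis, whereas the flanking regions have a fixed length and can use fixed-width windows.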
Processing the resequencing data to screen for lincRNAs with signal of artificial selection
The clean reads of the resequencing data were aligned to the Sus scrofa 10.2 genome sequence using bwa software
(version 0.7.7) with default parameters [47]. SNP calling was performed on a population scale for three groups (ten
domestic pigs from China, 22 domestic pigs from Europe and five wild pigs from China) using GATK software
(version 2.2). The allele frequency was obtained using the GATK package [48]. A selective sweep approach was
used to detect artificial selection signals, following Li's method [32]. Briefly, a sliding window approach (1000 bp
windows sliding in 100 bp steps) was used to calculate the polymorphism level (θπ, pairwise nucleotide variation
as a measure of variability) and genetic differentiation (Fst) between CND and CNW and between CND and EUD. Windows
with signatures of a selective sweep were identified using the cutoffs of the top 5% of Fst values and the lowest 5% of
θπ ratios across the whole genome. Adjacent windows with selection signals were then merged into larger regions with
selection signals using bedtools software (version 2.17.0). LincRNAs overlapping the regions with selection signals were
considered candidate domestication genes or breed differentiation genes. Because the sample size is
small, we sought to retain relatively reliable candidates: the intersection of selected genes between CND
and CNW was taken as the final set of domestication genes, and the selected genes identified between CND and EUD were
considered breed-differentiation-related genes.
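The window-based screen described above can be sketched in a few lines of Python. This is an illustrative sketch only, under several assumptions that go beyond the text: a window is flagged when it passes both the Fst cutoff and the θπ-ratio cutoff, quantiles use the nearest-rank rule, intervals are half-open, and all function names are hypothetical. The merge step imitates, but does not call, bedtools.

```python
import math


def quantile(values, q):
    """Nearest-rank quantile of a non-empty list, for 0 < q <= 1."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def sweep_windows(windows, fst_q=0.95, ratio_q=0.05):
    """Flag windows in the top 5% of Fst and the lowest 5% of the ratio.

    windows -- list of (chrom, start, end, fst, theta_pi_ratio)
    Assumption: both criteria must hold for a window to be flagged.
    """
    fst_cut = quantile([w[3] for w in windows], fst_q)
    ratio_cut = quantile([w[4] for w in windows], ratio_q)
    return [w[:3] for w in windows if w[3] >= fst_cut and w[4] <= ratio_cut]


def merge_intervals(intervals):
    """Merge overlapping or touching (chrom, start, end) intervals."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(m) for m in merged]


def overlaps(gene, regions):
    """True when a (chrom, start, end) gene overlaps any selected region."""
    chrom, start, end = gene
    return any(c == chrom and start < e and s < end for c, s, e in regions)
```

With 1000 bp windows sliding in 100 bp steps, neighboring flagged windows overlap heavily, so merging them first avoids counting the same lincRNA several times.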
Results
Identification of lincRNAs on the basis of RNA-seq datasets of liver tissue
By analyzing the liver RNA-seq data of three pig breeds (ES, a Chinese pig breed; WB, Chinese wild pig; and
LW, a European breed), we identified 861 lincRNA transcripts that are encoded by 713 lincRNA sequences
(Supplementary File 1). Out of these lincRNAs, 611 do not overlap with currently annotated coding or noncoding
transcripts. A considerably large portion (403, 56.5%) of these lincRNAs have not been previously identified [18].
Our results indicated that similar to that in other mammals and even insects, lincRNA expression in pigs is tissue
specific [50]. We combined the lincRNAs that have been identified in our study with those identified in previous
studies [1,18] for further analyses.
[Figure 1, panels A–C (image not reproduced). Panel A: the methylation level of different functional regions (promoter, exon, intron, downstream 2 kb) of lincRNAs (LW_lincRNA) and protein-coding genes (LW_protein) in LW. Panel B: methylation level across upstream 2 kb, TSS, gene body, TTS and downstream 2 kb for LW_lincRNA and LW_protein. Panel C: log (number of mapping reads) by tissue.]
Figure 1. Methylation and expression of long intergenic noncoding RNAs in the Large White-breed genome. (A)
Methylation level at different functional regions. (B) Methylation analysis around TSS and TTS between lincRNAs and
protein-coding genes. Two-kilobase regions upstream and downstream of each gene were divided into 100-base pair
intervals. Each gene was divided into 20 intervals. The results of the other two pig samples are shown in
Supplementary Figure 2, with similar pattern. (C) Expression difference between protein coding genes and lincRNAs in
eight tissues. (D) Tissue specificity analysis between lincRNAs and protein coding genes. (E) The relationship between
promoter methylation level and expression of lincRNAs. Methylation level among up 2 kb, gene body and down 2 kb
of lincRNAs divided by its expression level. Genes were classified into quintiles based on expression: lowest is the silent
lincRNAs, the rest of lincRNAs were equally categorized into the low, medium, high and highest expression groups.
Upstream and downstream region of each lincRNA were divided into 100-base pair intervals and gene body region of
each lincRNA was divided into 20 intervals. (F) Tissue specificity analysis of five promoter methylation groups.
lincRNA: Long intergenic noncoding RNA; LW: Large white; TSS: Transcription start site; TTS: Transcript termination
site.
[Figure 1, panels D–F (image not reproduced). Panel D: distribution of τ values (0.00 to 1.00). Panel E: methylation level across upstream 2 kb, gene body and downstream 2 kb in LW for the highest, high, medium, low and lowest expression groups. Panel F: frequency of τ values in LW for the lowest, lower, medium, higher and highest promoter methylation groups.]
[Figure 2 (image not reproduced). Panel title: the relation of expression of lincRNAs and their target genes in LW. Axes: the expression level of lincRNAs versus the expression level of potential target genes of lincRNAs, shown for all lincRNAs and for differentially expressed lincRNAs. A further panel shows the correlation analysis.]
Figure 2. The relationship between expression of long intergenic noncoding RNAs and their potential target genes in Large White. (A)
Scatter plot of the expression relationship between lincRNAs and their target genes. The expression level was calculated as log2 (the
number of mapping reads of the gene). (B) Correlation analysis of differentially expressed lincRNAs and their target genes. (C) The regulatory
relationship between differentially expressed lincRNAs and their target genes. The abscissa is log2 (fold change).
ES: Enshi black pig; lincRNA: Long intergenic noncoding RNA; LW: Large white; WB: Guizhou wild pig.
Table 1. Details of Fst and θπ ratio between domesticated pigs and wild pigs.
Groups     Fst (top 5%)    θπ ratio                                 Gene number
CND/CNW    0.87684         Right 5%: 0.30316                        501
CND/EUD    0.90745         Left 5%: 8.92453; Right 5%: 0.38204      401
CND: Domesticated pigs of China; CNW: Wild pigs of China; EUD: Domesticated pigs of Europe.
[Figure 3 (image not reproduced). Panel A: frequency of τ values (0.0 to 1.0). Panel B: expression heatmap of selected lincRNAs across liver, kidney, muscle, lung, lymph, spleen, fat and heart. Panel C: the θπ ratio (Dome_CN/Wild_CN) and Fst of linc-sscg1779 between Dome_CN and Wild_CN across upstream 2 kb, gene body and downstream 2 kb, annotated with Fst = 0.87684 and θπ ratio = 0.30316. Panel D: tissue specificity of linc-sscg1779, log (number of mapping reads) in heart, fat, kidney, liver, lung, lymph, muscle and spleen. Panel E: differential expression in liver tissue of the pig breeds ES and WB.]
Figure 3. Long intergenic noncoding RNAs under selection during domestication and breeding. (A) Tissue specificity analysis of selected
lincRNAs. (B) Expression heatmap of the selected lincRNAs that show tissue-specific expression. (C) The evolutionary pattern of
linc-sscg1779 during pig domestication. The region within the black frame represents the selected region. (D) Tissue specificity analysis of
linc-sscg1779. (E) Expression analysis of linc-sscg1779 and its target gene (ENSSSCG00000024079) in ES and WB.
lincRNA: Long intergenic noncoding RNA; ES: Enshi black pig; WB: Guizhou wild pig.
analyses of their PTGs. GO and KEGG enrichment analyses showed that the PTGs of lincRNAs with selection
signals are mainly enriched in developmental, regulatory, metabolic and disease-related pathways (Supplementary
Tables 3 & 4). Moreover, lincRNAs with selection signals showed higher tissue specificity than lincRNAs without
(Figure 3A). We found that the proportion of selected lincRNAs specifically expressed in only one tissue type (τ-value = 1)
is higher than that of background lincRNAs (Figure 3A). We further investigated the expression pattern of
single-tissue-expressed lincRNAs in eight different tissue types. Interestingly, among the 50 selected lincRNAs,
[Figure 4 (image not reproduced). Panel A: tissue specificity analysis of differentially methylated lincRNAs, frequency versus τ value for All_lincRNAs, ES_LW_DMlincRNAs and ES_WB_DMlincRNAs. Panel B: Venn diagrams of ES_WB_DM, ES_WB_DE and LincRNA_target, and of ES_LW_DM, ES_LW_DE and LincRNA_target. Panel C: log2 (number of mapping reads) of XLOC_032503 and ENSSSCG00000014825 in the pig breeds ES, LW and WB. Panel D: methylation level at positions 8515001 to 8517470 in the upstream 2 kb region.]
Figure 4. Long intergenic noncoding RNAs with differential methylation among domesticated and wild pigs. (A) Tissue specificity
analysis of differentially methylated lincRNAs. (B) Venn diagrams of differentially methylated and differentially expressed lincRNAs that
have potential target genes between ES and WB and between ES and LW. (C) Differential expression analysis of XLOC_032503 and its PTG
ENSSSCG00000014825 among ES, LW and WB. (D) Differential methylation analysis of XLOC_032503 between ES and LW.
ES: Enshi black pig; lincRNA: Long intergenic noncoding RNA; LW: Large white; WB: Guizhou wild pig.
19 are specifically expressed in the liver. By contrast, ten lincRNAs are kidney-specific, eight are heart-specific,
six are muscle-specific, three are lung-specific, two are fat-specific, one is lymph-specific and one is spleen-specific
(Figure 3B). We identified an interesting liver-specific selected lincRNA, linc-sscg1779,
which is differentially expressed in liver tissue during pig domestication. We found that its PTG, complement 6 (C6),
is also differentially expressed (Figure 3C–E). The expression levels of linc-sscg1779 and C6 exhibited opposing
trends in the liver tissues of ES and WB (Figure 3E).
The epigenetic system is also an important alternative to genetic mutations in species evolution. Previous studies
have suggested that epigenetic changes may have participated in lincRNA evolution during pig domestication and
differentiation [16]. Therefore, we further analyzed the epigenetic changes of porcine lincRNAs under artificial
selection. Since promoter methylation is of great interest in gene regulation, we focused our analyses on differential
methylation in this region. DMLs and DMRs between ES and WB exhibited epigenetic changes during pig domes-
tication, whereas DMLs and DMRs between ES and LW exhibited epigenetic changes during breed differentiation.
We identified 740 differentially methylated (DM) lincRNAs between ES and WB, 744 DM lincRNAs between ES
and LW, and 291 candidate lincRNAs that are DM in domestication and differentiation (Supplementary Files 4
& 5). GO and KEGG enrichment analysis showed that similar to lincRNAs with selection signals, DM lincRNAs
mainly participate in metabolic and regulatory processes and disease-related pathways (Supplementary Tables 5 &
6). Similar to those with selection signals, lincRNAs with DMRs (DM lincRNAs) showed high tissue specificity
(Figure 4A). In particular, the proportion of DM lincRNAs expressed in only one tissue (τ-value = 1) is higher than
that of background lincRNAs (Figure 4A). Among these DM lincRNAs, 69 (ES versus WB) and 74 (ES versus LW) are also
differentially expressed (Figure 4B). Of these, 32 and 37 lincRNAs, respectively, have PTGs (Figure 4B). Of
these lincRNAs, XLOC_032503 is of special interest. We observed a DMR in XLOC_032503, and its differential
methylation in the upstream 2 kb region is associated with its differential expression in the ES and LW breeds
(Figure 4C & D). In addition, the expression of XLOC_032503 and the methylation level of its upstream 2 kb region are
negatively correlated, suggesting that methylation of the upstream 2 kb region can influence its expression.
Furthermore, we found that the expression trend of RELT, the PTG of XLOC_032503, is similar to that of
XLOC_032503 in ES and LW (Figure 4C), suggesting that XLOC_032503 may affect RELT expression. Thus,
XLOC_032503 may be affected by epigenetics and influence the expression of RELT in pig breed differentiation.
RELT has an important role in cell death and can influence human reproductive traits [61,62]. Thus, the expression
of this gene warrants further exploration.
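The overlap analysis summarized above (lincRNAs that are both differentially methylated and differentially expressed, and that also have PTGs) amounts to a set intersection followed by a lookup. The sketch below is a hedged illustration: the function name, the input structures and the toy identifiers in the usage note are hypothetical and are not taken from the study's data files.

```python
def dm_de_candidates(dm, de, targets):
    """Return lincRNAs that are both DM and DE and have at least one PTG.

    dm, de  -- iterables of lincRNA identifiers
    targets -- dict mapping a lincRNA identifier to a list of PTG identifiers
    The result maps each retained lincRNA to its list of PTGs.
    """
    both = set(dm) & set(de)
    return {lnc: targets[lnc] for lnc in sorted(both) if targets.get(lnc)}
```

For example, with toy identifiers, `dm_de_candidates(["lnc_a", "lnc_b", "lnc_c"], ["lnc_b", "lnc_c", "lnc_d"], {"lnc_b": ["gene_1"], "lnc_c": []})` keeps only `lnc_b`, because `lnc_c` has no predicted target.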
Discussion
In this study, we comprehensively integrated multiomics data and identified the methylation and expression pattern
and PTGs of lincRNAs on the whole genome level. Most importantly, for the first time, we proposed a series of
lincRNAs that are under artificial selection on the genomic level during pig domestication and differentiation.
We contributed to future studies on pig lincRNAs, particularly those expressed in the liver, by adding a new batch
of liver lincRNAs to known pig lincRNA resources. We illustrated the landscapes of the two epigenomic systems in
the porcine liver by exploiting liver methylomic and transcriptomic data. Consistent with previous observations [1,52],
we observed that the methylation level of lincRNAs across the TSS and TTS is higher than that of protein-coding
genes. The expression pattern of lincRNAs in the porcine liver is similar to that in other mammals [63]. The expression
pattern of lincRNAs suggested that unlike protein-coding genes, a large number of lincRNAs with unestablished
functional stability may encounter degradation or are generally repressed by other regulatory mechanisms, such as
DNA methylation (Figures 1A & B). These classes of newly emerged genes participate in regulatory processes [64]
that may be important in phenotypic adaptation and evolution [16,65]. As shown by previous studies, lincRNAs show
obvious tissue-specific expression and low expression levels [66,67]. On the basis of the above results, we speculated that
lincRNAs are evolutionarily newer than coding genes and may have emerged from noisy transcripts. Similar to the
case of early gene copy stages, in which newly duplicated genes are methylated [68] and expressed in specific tissues,
lincRNA expression may be repressed through methylation. However, additional in-depth studies are needed to
investigate the mechanisms that underlie the regulation of lincRNA expression.
LincRNAs function as transcriptional regulators that regulate the expression of their nearby PTGs [19]. Here,
we identified 1858 PTGs of lincRNAs. Through differential expression analysis, we found that lincRNAs and
their PTGs are co-expressed. In addition, the expression of a PTG is regulated by its corresponding lincRNA
through diverse mechanisms, as previously observed in case studies [69–71]. Interestingly, many PTGs participate in
biological processes associated with metabolism, such as the positive regulation of metabolic processes, implying
that lincRNAs may participate in liver metabolism by regulating their PTGs. This inference, however, requires
further verification through detailed function studies.
Our most important contribution is a series of candidate lincRNAs that have been subjected to artificial selection.
Through the genomic screening of selection signals, we avoided the limitation imposed by biased tissue expression
pattern when inferring the functional impact of lincRNAs [18,31]. We identified 501 candidate lincRNAs associated
with domestication in Chinese pigs and 401 candidate lincRNAs associated with differentiation in Chinese and
European pig breeds. Interestingly, when we further deciphered the functions of these lincRNAs through their PTGs, we
found that these PTGs were enriched in developmental, regulatory, metabolic and disease-related pathways.
These findings are consistent with those of previous studies that analyzed protein-coding genes with crucial roles
in pig domestication [72]. Compared with domesticated pigs, the wild boar requires more energy to adapt to
a complex environment. In addition, immune system responses are other traits selected through domestication
and differentiation [73]. Our results therefore strongly suggested that artificial selection may have acted not only
on the protein-coding genes in these pathways but also on the epigenetic system; in other words, the regulation
of these pathways by lincRNAs strengthens the effects of artificial selection itself. From this perspective, the
candidate lincRNA list will provide novel avenues for the further exploration of the genetic and molecular bases of
economically important traits and has considerable practical significance in pig breeding. Interestingly, we observed
that lincRNAs under artificial selection tend to be highly and specifically expressed in single tissues, suggesting
that artificial selection exerts its effects on lincRNAs through a tissue-specific approach. In particular, we found
that a large proportion of the 50 single-tissue-expressed lincRNAs with selection signals are from the porcine liver.
This result strongly suggested that the regulation of liver metabolism by lincRNAs is crucial in domestication and
breeding. We further highlighted one lincRNA, linc-sscg1779, because it is differentially expressed in breeding and
exhibits high expression in ES and low expression in WB. By contrast, its target gene C6 exhibited low expression
in ES and high expression in WB. This gene affects liver metabolism, functions in liver damage [74–76] and
is associated with liver cancer and other diseases [77,78]. These results implied that linc-sscg1779 participated in
liver metabolism by regulating C6 expression. We suspect that during domestication, artificial selection may act on
linc-sscg1779 and further affect C6, thus changing liver metabolism during domestication and breeding.
We further deciphered epigenetic changes in liver lincRNAs during pig domestication and differentiation.
We identified a batch of lincRNAs with DM promoter regions. We then selected an interesting lincRNA,
XLOC_032503. The expression of this lincRNA is influenced by the methylation of its upstream 2 kb region.
The expression trend of its PTG, RELT, in ES and LW is similar to that of the lincRNA itself. This result suggested that
this lincRNA may affect the expression of its PTG. RELT participates in cell death and can influence reproductive
traits in humans [61,62], suggesting that XLOC_032503 may be affected by epigenetics and influence RELT expression
during pig breed differentiation. Our candidate list of lincRNAs provides clues for the effects of epigenetic
mechanisms on lincRNAs under artificial selection. Nevertheless, we need to investigate lincRNAs in additional
pig breeds to validate our conclusions.
We comprehensively and systematically integrated lincRNA data with transcriptome, methylome and large-scale
pig resequencing data. From this multiomics perspective, we were able to identify lincRNAs under artificial selection
during pig domestication. Our expression data and PTG information indicated that lincRNAs and their PTGs
constitute a novel epigenetic system that mediates liver modification during pig domestication and differentiation.
Our study and inferences provide new insights that are useful for further functional studies on pig biology and for
pig breeding.
Summary points
• We identified 713 liver long intergenic noncoding RNAs (lincRNAs) using RNA-seq data from three different pig
breeds and added new information to the porcine lincRNA dataset. We comprehensively analyzed DNA
methylation and lincRNA expression on the genome-wide level.
• We systematically analyzed the relationship between DNA methylation and expression of lincRNAs.
• We predicted the potential target genes of lincRNAs and the biological processes that may require the
involvement of lincRNAs.
• We systematically analyzed the expression relationship between lincRNAs and their potential target genes.
• We used resequencing data and BS-seq data resources to conduct the genome-wide detection of selection signals
of all candidate lincRNAs and to identify their epigenetic changes via DNA methylation during domestication and
breed differentiation.
• We identified two candidate lincRNAs that may have important roles in liver-specific processes during pig
domestication.
• Our findings demonstrated the evolutionary pattern of lincRNAs during pig domestication and breeding from
the genetic and epigenetic perspectives. The gene list and highlighted cases presented herein will shed new light
on the mechanism that underlies the artificial selection of swine.
Supplementary data
To view the supplementary data that accompany this paper please visit the journal website at:
www.futuremedicine.com/doi/suppl/10.2217/epi-2017-0117
Author contributions
C Li and H Xiang conceived and designed the experiments, interpreted the data and revised the manuscript. J Li provided assistance
with downloading the resequencing data. W Wang, C Zou, Y Cui, Y Fu, C Fang and Y Li assisted in writing the paper. C Zou provided
assistance in lincRNA identification. C Li performed the main data analysis and wrote the paper.
Acknowledgements
The authors especially thank all contributors of the BS-seq, resequencing and RNA-seq data.
References
Papers of special note have been highlighted as: • of interest; •• of considerable interest
1. Zhou ZY, Li A, Wang LG et al. DNA methylation signatures of long intergenic noncoding RNAs in porcine adipose and muscle
tissues. Sci. Rep. 5, 15435 (2015).
•• Systematically analyzes the methylation signature of long intergenic noncoding RNAs (lincRNAs) in porcine adipose and muscle
tissues using MeDIP data.
2. Spurlock ME, Gabler NK. The development of porcine models of obesity and the metabolic syndrome. J. Nutr. 138(2), 397–402 (2008).
3. Schachtschneider KM, Madsen O, Park C, Rund LA, Groenen MAM, Schook LB. Adult porcine genome-wide DNA methylation
patterns support pigs as a biomedical model. BMC Genomics 16, 743 (2015).
4. Yang S, Li X, Li K, Fan B, Tang Z. A genome-wide scan for signatures of selection in Chinese indigenous and commercial pig
breeds. BMC Genet. 15, 7 (2014).
5. Giuffra E, Kijas JM, Amarger V, Carlborg O, Jeon JT, Andersson L. The origin of the domestic pig: independent domestication and
subsequent introgression. Genetics 154(4), 1785–1791 (2000).
6. Groenen MAM, Archibald AL, Schook LB. Analyses of pig genomes provide insight into porcine demography and
evolution. Nature 491(7424), 393–398 (2012).
•• Provides a high-quality draft pig genome sequence and illuminates the evolution of the pig. This research provides a valuable
resource enabling effective use of pigs in agricultural production.
7. Watanabe T, Hayashi Y, Kimura J et al. Pig mitochondrial DNA: polymorphism, restriction map orientation, and sequence
data. Biochem. Genet. 24(5–6), 385–396 (1986).
8. Okumura N, Ishiguro N, Nakano M, Hirai K, Matsui A, Sahara M. Geographic population structure and sequence divergence in the
mitochondrial DNA control region of the Japanese wild boar (Sus scrofa leucomystax), with reference to those of domestic pigs. Biochem.
Genet. 34(5–6), 179–189 (1996).
9. Li XL, Yang S, Tang Z et al. Genome-wide scans to detect positive selection in Large White and Tongcheng pigs. Animal
Genet. 45(3), 329–339 (2014).
10. Ramos-Onsins SE, Burgos-Paz W, Manunza A, Amills M. Mining the pig genome to investigate the domestication process. Heredity
(Edinb.) 113(6), 471–484 (2014).
11. Rubin CJ, Megens HJ, Martinez Barrio A et al. Strong signatures of selection in the domestic pig genome. Proc. Natl Acad. Sci.
USA 109(48), 19529–19536 (2012).
12. Wilkinson S, Lu ZH, Megens HJ. Signatures of diversifying selection in European pig breeds. PLoS Genet. 9(4), e1003453 (2013).
13. Andersson L. Studying phenotypic evolution in domestic animals: a walk in the footsteps of Charles Darwin. Cold Spring Harb. Symp.
Quant. Biol. 74, 319–325 (2009).
14. Ai H, Fang X, Yang B et al. Adaptation and possible ancient interspecies introgression in pigs identified by whole-genome
sequencing. Nat. Genet. 47(3), 217–225 (2015).
•• Identifies a set of loci associated with regional adaptations to different latitude environments within China and provides new insights into
the evolutionary history of pigs and the role of introgression in adaptation.
15. Fang MY, Larson G, Ribeiro HS, Li N, Andersson L. Contrasting mode of evolution at a coat color locus in wild and domestic
pigs. PLoS Genet. 5(1), e1000341 (2009).
16. Li C, Wang X, Cai H et al. Molecular microevolution and epigenetic patterns of the long non-coding gene H19 show its potential
function in pig domestication and breed divergence. BMC Evol. Biol. 16, 87 (2016).
17. Zhao W, Mu Y, Ma L et al. Systematic identification and characterization of long intergenic non-coding RNAs in fetal porcine skeletal
muscle development. Sci. Rep. 5, 8957 (2015).
18. Zhou ZY, Li AM, Adeola AC et al. Genome-wide identification of long intergenic noncoding RNA genes and their potential association
with domestication in pigs. Genome Biol. Evol. 6(6), 1387–1392 (2014).
•• Systematically identifies lincRNAs and analyzes the potential association between lincRNAs and domestication of pigs.
19. Vance KW, Ponting CP. Transcriptional regulatory functions of nuclear long noncoding RNAs. Trends Genet. 30(8), 348–355 (2014).
•• Draws upon other studies to review the functions of nuclear localized intergenic lncRNAs in regulating gene transcription and
chromatin organization, their local and distal modes of action, their mechanisms of genomic targeting, and the nature of their
interactions with chromatin.
20. Tsai MC, Manor O, Wan Y et al. Long noncoding RNA as modular scaffold of histone modification
complexes. Science 329(5992), 689–693 (2010).
21. Gupta RA, Shah N, Wang KC et al. Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer
metastasis. Nature 464(7291), 1071–1076 (2010).
22. Orom UA, Derrien T, Beringer M et al. Long noncoding RNAs with enhancer-like function in human cells. Cell 143(1), 46–58 (2010).
23. Sleutels F, Zwart R, Barlow DP. The non-coding Air RNA is required for silencing autosomal imprinted genes. Nature 415(6873),
810–813 (2002).
24. Loewer S, Cabili MN, Guttman M et al. Large intergenic non-coding RNA-RoR modulates reprogramming of human induced
pluripotent stem cells. Nat. Genet. 42(12), 1113–1117 (2010).
25. Guttman M, Donaghey J, Carey BW et al. lincRNAs act in the circuitry controlling pluripotency and
differentiation. Nature 477(7364), 295–300 (2011).
26. Borsani G, Tonlorenzi R, Simmler MC et al. Characterization of a murine gene expressed from the inactive X
chromosome. Nature 351(6324), 325–329 (1991).
27. Lakhotia SC. Divergent actions of long noncoding RNAs on X-chromosome remodelling in mammals and Drosophila achieve the same
end result: dosage compensation. J. Genet. 94(4), 575–584 (2015).
28. Wang Y, Hu T, Wu L, Liu X, Xue S, Lei M. Identification of non-coding and coding RNAs in porcine
endometrium. Genomics 109(1), 43–50 (2017).
29. Yang Y, Zhou R, Zhu S et al. Systematic identification and molecular characteristics of long noncoding RNAs in pig tissues. Biomed. Res.
Int. 2017, 6152582 (2017).
30. Zhao P, Zheng X, Feng W et al. Profiling long noncoding RNA of multi-tissue transcriptome enhances porcine noncoding genome
annotation. Epigenomics 10(3), 301–320 (2018).
31. Zou C, Li S, Deng L et al. Transcriptome analysis reveals long intergenic noncoding RNAs contributed to growth and meat quality
differences between Yorkshire and Wannanhua pig. Genes (Basel) 8(8), 8080203 (2017).
32. Li M, Tian S, Jin L et al. Genomic analyses identify distinct patterns of selection in domesticated pigs and Tibetan wild boars. Nat.
Genet. 45(12), 1431–1438 (2013).
• Reports genetic adaptations in Tibetan wild boar that are associated with high altitudes and characterizes the genetic basis
of increased salivation in domestic pig.
33. Schachtschneider KM, Madsen O, Park C, Rund LA, Groenen MAM, Schook LB. Adult porcine genome-wide DNA methylation
patterns support pigs as a biomedical model. BMC Genomics 16(1), 743 (2015).
34. Pollier J, Rombauts S, Goossens A. Analysis of RNA-Seq data with TopHat and Cufflinks for genome-wide expression analysis of
jasmonate-treated plants and plant cultures. Methods Mol. Biol. 1011, 305–315 (2013).
35. Trapnell C, Williams BA, Pertea G et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform
switching during cell differentiation. Nat. Biotechnol. 28(5), 511–515 (2010).
36. Trapnell C, Roberts A, Goff L et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and
Cufflinks. Nat. Protoc. 7(3), 562–578 (2012).
37. Kong L, Zhang Y, Ye ZQ et al. CPC: assess the protein-coding potential of transcripts using sequence features and support vector
machine. Nucleic Acids Res. 35(Web Server issue), W345–W349 (2007).
38. Anders S, Pyl PT, Huber W. HTSeq – a Python framework to work with high-throughput sequencing data. Bioinformatics 31(2),
166–169 (2015).
39. McCarthy DJ, Chen Y, Smyth GK. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological
variation. Nucleic Acids Res. 40(10), 4288–4297 (2012).
40. Wang L, Feng Z, Wang X, Wang X, Zhang X. DEGseq: an R package for identifying differentially expressed genes from RNA-seq
data. Bioinformatics 26(1), 136–138 (2010).
41. Yanai I, Benjamin H, Shmoish M et al. Genome-wide midrange transcription profiles reveal expression level relationships in human
tissue specification. Bioinformatics 21(5), 650–659 (2005).
42. Liao BY, Zhang J. Low rates of expression profile divergence in highly expressed genes and tissue-specific genes during mammalian
evolution. Mol. Biol. Evol. 23(6), 1119–1128 (2006).
43. Xiang H, Zhu J, Chen Q et al. Single base-resolution methylome of the silkworm reveals a sparse epigenomic map. Nat.
Biotechnol. 28(5), 516–520 (2010).
• Systematically analyzes the methylation pattern of the silkworm at the genome-wide level using BS-seq data and demonstrates a
strategy for sequencing the epigenomes of organisms.
44. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26(6), 841–842 (2010).
45. Huang da W, Sherman BT, Zheng X et al. Extracting biological meaning from large gene lists with DAVID. Curr. Protoc.
Bioinformatics Chapter 13, Unit 13.11 (2009).
46. Krueger F, Andrews SR. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27(11),
1571–1572 (2011).
47. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25(14), 1754–1760 (2009).
48. McKenna A, Hanna M, Banks E et al. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA
sequencing data. Genome Res. 20(9), 1297–1303 (2010).
49. Wu H, Xu T, Feng H et al. Detection of differentially methylated regions from whole-genome bisulfite sequencing data without
replicates. Nucleic Acids Res. 43(21), e141 (2015).
50. Wu Y, Cheng T, Liu C et al. Systematic identification and characterization of long non-coding RNAs in the silkworm, Bombyx
mori. PLoS ONE 11(1), e0147147 (2016).
51. Cooper DN, Youssoufian H. The CpG dinucleotide and human genetic disease. Hum. Genet. 78(2), 151–155 (1988).
52. Sati S, Ghosh S, Jain V, Scaria V, Sengupta S. Genome-wide analysis reveals distinct patterns of epigenetic features in long non-coding
RNA loci. Nucleic Acids Res. 40(20), 10018–10031 (2012).
• Performs genome-wide analysis of the distribution of DNA methylation and histone modifications.
53. Cabili MN, Trapnell C, Goff L et al. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and
specific subclasses. Genes Dev. 25(18), 1915–1927 (2011).
54. Luo H, Sun S, Li P, Bu D, Cao H, Zhao Y. Comprehensive characterization of 10,571 mouse large intergenic noncoding RNAs from
whole transcriptome sequencing. PLoS ONE 8(8), e70835 (2013).
55. Chen X, Qi G, Qin M et al. DNA methylation directly downregulates human cathelicidin antimicrobial peptide gene (CAMP) promoter
activity. Oncotarget 8(17), 27943–27952 (2017).
56. Wei JW, Huang K, Yang C, Kang CS. Non-coding RNAs as regulators in epigenetics (review). Oncol. Rep. 37(1), 3–9 (2017).
57. Li X, Zhu J, Hu F et al. Single-base resolution maps of cultivated and wild rice methylomes and regulatory roles of DNA methylation in
plant gene expression. BMC Genomics 13, 300 (2012).
58. Guo H, Zhu P, Yan L et al. The DNA methylation landscape of human early embryos. Nature 511(7511), 606–610 (2014).
59. Wang KC, Yang YW, Liu B et al. A long noncoding RNA maintains active chromatin to coordinate homeotic gene
expression. Nature 472(7341), 120–124 (2011).
60. Cai L, Chang H, Fang Y, Li G. A comprehensive characterization of the function of LincRNAs in transcriptional regulation through
long-range chromatin interactions. Sci. Rep. 6, 36572 (2016).
61. Cusick JK, Mustian A, Goldberg K, Reyland ME. RELT induces cellular death in HEK 293 epithelial cells. Cell
Immunol. 261(1), 1–8 (2010).
62. Moua P, Checketts M, Xu LG, Shu HB, Reyland ME, Cusick JK. RELT family members activate p38 and induce apoptosis by a
mechanism distinct from TNFR1. Biochem. Biophys. Res. Commun. 491(1), 25–32 (2017).
63. He Z, Bammann H, Han D, Xie G, Khaitovich P. Conserved expression of lincRNA during human and macaque prefrontal cortex
development and maturation. RNA 20(7), 1103–1111 (2014).
64. Ward M, McEwan C, Mills JD, Janitz M. Conservation and tissue-specific transcription patterns of long noncoding RNAs. J. Hum.
Transcr. 1(1), 2–9 (2015).
65. Feschotte C. Transposable elements and the evolution of regulatory networks. Nat. Rev. Genet. 9(5), 397–405 (2008).
66. Necsulea A, Soumillon M, Warnefors M et al. The evolution of lncRNA repertoires and expression patterns in
tetrapods. Nature 505(7485), 635–640 (2014).
67. Li A, Zhou ZY, Hei X et al. Genome-wide discovery of long intergenic noncoding RNAs and their epigenetic signatures in the rat. Sci.
Rep. 7(1), 14817 (2017).
68. Keller TE, Yi SV. DNA methylation and evolution of duplicate genes. Proc. Natl Acad. Sci. USA 111(16), 5932–5937 (2014).
69. Nelson BR, Makarewich CA, Anderson DM et al. A peptide encoded by a transcript annotated as long noncoding RNA enhances
SERCA activity in muscle. Science 351(6270), 271–275 (2016).
70. Wilusz JE, Sunwoo H, Spector DL. Long noncoding RNAs: functional surprises from the RNA world. Genes Dev. 23(13), 1494–1504
(2009).
71. Song X, Cao G, Jing L et al. Analysing the relationship between lncRNA and protein-coding gene and the role of lncRNA as ceRNA in
pulmonary fibrosis. J. Cell Mol. Med. 18(6), 991–1003 (2014).
72. Yang Y, Zhou R, Mu Y, Hou X, Tang Z, Li K. Genome-wide analysis of DNA methylation in obese, lean, and miniature pig breeds. Sci.
Rep. 6, 30160 (2016).
73. Omenn GS. Evolution in health and medicine Sackler colloquium: evolution and public health. Proc. Natl Acad. Sci. USA 107(Suppl. 1),
1702–1709 (2010).
74. Bykov IL, Vakeva A, Jarvelainen HA, Meri S, Lindros KO. Protective function of complement against alcohol-induced rat liver
damage. Int. Immunopharmacol. 4(12), 1445–1454 (2004).
75. Liao JH, Li CC, Wu SH, Fan JW, Gu HT, Wang ZW. Gene variations of sixth complement component affecting tacrolimus
metabolism in patients with liver transplantation for hepatocellular carcinoma. Chin. Med. J. (Engl.) 130(14), 1670–1676 (2017).
76. Brauer RB, Baldwin WM 3rd, Wang D et al. Hepatic and extrahepatic biosynthesis of complement factor C6 in the rat. J.
Immunol. 153(7), 3168–3176 (1994).
77. Wang Z, Liao J, Wu S, Li C, Fan J, Peng Z. Recipient C6 rs9200 genotype is associated with hepatocellular carcinoma recurrence after
orthotopic liver transplantation in a Han Chinese population. Cancer Gene Ther. 23(6), 157–161 (2016).
78. Pasaje CF, Bae JS, Park BL et al. Association analysis of C6 genetic variations and aspirin hypersensitivity in Korean asthmatic
patients. Hum. Immunol. 72(10), 973–978 (2011).